Commit Graph

56 Commits

Author SHA1 Message Date
Zhengchao Shao 74a25e6e91
9p/rdma: unmap receive dma buffer in rdma_request()/post_recv()
When down_interruptible() or ib_post_send() failed in rdma_request(),
receive dma buffer is not unmapped. Add unmap action to error path.
Also if ib_post_recv() failed in post_recv(), dma buffer is not unmapped.
Add unmap action to error path.

Link: https://lkml.kernel.org/r/20230104020424.611926-1-shaozhengchao@huawei.com
Fixes: fc79d4b104 ("9p: rdma: RDMA Transport Support for 9P")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-02-24 13:42:28 +00:00
Dominique Martinet 1a4f69ef15 9p/client: fix data race on req->status
KCSAN reported a race between writing req->status in p9_client_cb and
accessing it in p9_client_rpc's wait_event.

Accesses to req itself is protected by the data barrier (writing req
fields, write barrier, writing status // reading status, read barrier,
reading other req fields), but status accesses themselves apparently
also must be annotated properly with WRITE_ONCE/READ_ONCE when we
access it without locks.

Follows:
 - error paths writing status in various threads all can notify
p9_client_rpc, so these all also need WRITE_ONCE
 - there's a similar read loop in trans_virtio for zc case that also
needs READ_ONCE
 - other reads in trans_fd should be protected by the trans_fd lock and
lists state machine, as corresponding writers all are within trans_fd
and should be under the same lock. If KCSAN complains on them we likely
will have something else to fix as well, so it's better to leave them
unmarked and look again if required.

Link: https://lkml.kernel.org/r/20221205124756.426350-1-asmadeus@codewreck.org
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Marco Elver <elver@google.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-12-13 13:02:15 +09:00
Christophe JAILLET 31fff92c9c 9p/net: Remove unneeded idr.h #include
The 9p net files don't use IDR or IDA functionalities. So there is no point
in including <linux/idr.h>.
Remove it.

Link: https://lkml.kernel.org/r/9e386018601d7e4a9e5d7da8fc3e9555ebb25c87.1669560387.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-12-02 23:59:20 +09:00
Christian Schoenebeck 01d205d936 net/9p: add 'pooled_rbuffers' flag to struct p9_trans_module
This is a preparatory change for the subsequent patch: the RDMA
transport pulls the buffers for its 9p response messages from a
shared pool. [1] So this case has to be considered when choosing
an appropriate response message size in the subsequent patch.

Link: https://lore.kernel.org/all/Ys3jjg52EIyITPua@codewreck.org/ [1]
Link: https://lkml.kernel.org/r/79d24310226bc4eb037892b5c097ec4ad4819a03.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-10-05 07:05:41 +09:00
Kent Overstreet 8b11ff098a 9p: Add client parameter to p9_req_put()
This is to aid in adding mempools, in the next patch.

Link: https://lkml.kernel.org/r/20220704014243.153050-2-kent.overstreet@gmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-07-09 14:38:35 +09:00
Dominique Martinet 024b7d6a43 9p: fix file headers
- add missing SPDX-License-Identifier
- remove (sometimes incorrect) file name from file header

Link: http://lkml.kernel.org/r/20211102134608.1588018-2-dominique.martinet@atmark-techno.com
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2021-11-03 17:45:04 +09:00
Thomas Weißschuh 4cd82a5bb0 net/9p: autoload transport modules
Automatically load transport modules based on the trans= parameter
passed to mount.
This removes the requirement for the user to know which module to use.

Link: http://lkml.kernel.org/r/20211017134611.4330-1-linux@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2021-11-03 09:50:35 +09:00
Andrew Lunn 760b3d61fb net: 9p: Fix kerneldoc warnings of missing parameters etc
net/9p/client.c:420: warning: Function parameter or member 'c' not described in 'p9_client_cb'
net/9p/client.c:420: warning: Function parameter or member 'req' not described in 'p9_client_cb'
net/9p/client.c:420: warning: Function parameter or member 'status' not described in 'p9_client_cb'
net/9p/client.c:568: warning: Function parameter or member 'uidata' not described in 'p9_check_zc_errors'
net/9p/trans_common.c:23: warning: Function parameter or member 'nr_pages' not described in 'p9_release_pages'
net/9p/trans_common.c:23: warning: Function parameter or member 'pages' not described in 'p9_release_pages'
net/9p/trans_fd.c:132: warning: Function parameter or member 'rreq' not described in 'p9_conn'
net/9p/trans_fd.c:132: warning: Function parameter or member 'wreq' not described in 'p9_conn'
net/9p/trans_fd.c:56: warning: Function parameter or member 'privport' not described in 'p9_fd_opts'
net/9p/trans_rdma.c:113: warning: Function parameter or member 'cqe' not described in 'p9_rdma_context'
net/9p/trans_rdma.c:129: warning: Function parameter or member 'privport' not described in 'p9_rdma_opts'
net/9p/trans_virtio.c:215: warning: Function parameter or member 'limit' not described in 'pack_sg_list_p'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'chan_list' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'p9_max_pages' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'ring_bufs_avail' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'tag' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'vc_wq' not described in 'virtio_chan'

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Dominique Martinet <asmadeus@codewreck.org>
Link: https://lore.kernel.org/r/20201031182655.1082065-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-02 12:25:52 -08:00
Andrew Lunn 15e522a7b1 net: 9p: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:20:39 -07:00
Chuck Lever 20cf4e0267 rdma: Enable ib_alloc_cq to spread work over a device's comp_vectors
Send and Receive completion is handled on a single CPU selected at
the time each Completion Queue is allocated. Typically this is when
an initiator instantiates an RDMA transport, or when a target
accepts an RDMA connection.

Some ULPs cannot open a connection per CPU to spread completion
workload across available CPUs and MSI vectors. For such ULPs,
provide an API that allows the RDMA core to select a completion
vector based on the device's complement of available comp_vecs.

ULPs that invoke ib_alloc_cq() with only comp_vector 0 are converted
to use the new API so that their completion workloads interfere less
with each other.

Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Cc: <linux-cifs@vger.kernel.org>
Cc: <v9fs-developer@lists.sourceforge.net>
Link: https://lore.kernel.org/r/20190729171923.13428.52555.stgit@manet.1015granger.net
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-05 11:50:32 -04:00
Thomas Gleixner 1f32761322 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to free software
  foundation 51 franklin street fifth floor boston ma 02111 1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 27 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528170026.981318839@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:21 -07:00
Dominique Martinet 473c7dd1d7 9p/rdma: remove useless check in cm_event_handler
the client c is always dereferenced to get the rdma struct, so c has to
be a valid pointer at this point.
Gcc would optimize that away but let's make coverity happy...

Link: http://lkml.kernel.org/r/1536339057-21974-3-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 102778 ("Dereference before null check")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-09-08 01:52:22 +09:00
Dominique Martinet 8b894adb2b 9p/rdma: do not disconnect on down_interruptible EAGAIN
9p/rdma would sometimes drop the connection and display errors in
recv_done when the user does ^C.
The errors were caused by recv buffers that were posted at the time
of disconnect, and we just do not want to disconnect when
down_interruptible is... interrupted.

Link: http://lkml.kernel.org/r/1535625307-18019-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-09-08 01:39:47 +09:00
Tomas Bortoli 728356dede 9p: Add refcount to p9_req_t
To avoid use-after-free(s), use a refcount to keep track of the
usable references to any instantiated struct p9_req_t.

This commit adds p9_req_put(), p9_req_get() and p9_req_try_get() as
wrappers to kref_put(), kref_get() and kref_get_unless_zero().
These are used by the client and the transports to keep track of
valid requests' references.

p9_free_req() is added back and used as callback by kref_put().

Add SLAB_TYPESAFE_BY_RCU as it ensures that the memory freed by
kmem_cache_free() will not be reused for another type until the rcu
synchronisation period is over, so an address gotten under rcu read
lock is safe to inc_ref() without corrupting random memory while
the lock is held.

Link: http://lkml.kernel.org/r/1535626341-20693-1-git-send-email-asmadeus@codewreck.org
Co-developed-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+467050c1ce275af2a5b8@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-09-08 01:39:47 +09:00
Dominique Martinet 523adb6cc1 9p: embed fcall in req to round down buffer allocs
'msize' is often a power of two, or at least page-aligned, so avoiding
an overhead of two dozen bytes for each allocation will help the
allocator do its work and reduce memory fragmentation.

Link: http://lkml.kernel.org/r/1533825236-22896-1-git-send-email-asmadeus@codewreck.org
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
2018-09-08 01:39:45 +09:00
Linus Torvalds 1f7a4c73a7 Pull request for inclusion in 4.19, take two
This tag is the same as 9p-for-4.19 without the two MAINTAINERS patches
 
 Contains mostly fixes (6 to be backported to stable) and a few changes,
 here is the breakdown:
  * Rework how fids are attributed by replacing some custom tracking in a
 list by an idr (f28cdf0430)
  * For packet-based transports (virtio/rdma) validate that the packet
 length matches what the header says (f984579a01)
  * A few race condition fixes found by syzkaller (9f476d7c54,
 430ac66eb4)
  * Missing argument check when NULL device is passed in sys_mount
 (10aa14527f)
  * A few virtio fixes (23cba9cbde, 31934da810, d28c756cae)
  * Some spelling and style fixes
 
 ----------------------------------------------------------------
 Chirantan Ekbote (1):
       9p/net: Fix zero-copy path in the 9p virtio transport
 
 Colin Ian King (1):
       fs/9p/v9fs.c: fix spelling mistake "Uknown" -> "Unknown"
 
 Jean-Philippe Brucker (1):
       net/9p: fix error path of p9_virtio_probe
 
 Matthew Wilcox (4):
       9p: Fix comment on smp_wmb
       9p: Change p9_fid_create calling convention
       9p: Replace the fidlist with an IDR
       9p: Embed wait_queue_head into p9_req_t
 
 Souptick Joarder (1):
       fs/9p/vfs_file.c: use new return type vm_fault_t
 
 Stephen Hemminger (1):
       9p: fix whitespace issues
 
 Tomas Bortoli (5):
       net/9p/client.c: version pointer uninitialized
       net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree()
       net/9p/trans_fd.c: fix race by holding the lock
       9p: validate PDU length
       9p: fix multiple NULL-pointer-dereferences
 
 jiangyiwen (2):
       net/9p/virtio: Fix hard lockup in req_done
       9p/virtio: fix off-by-one error in sg list bounds check
 
 piaojun (5):
       net/9p/client.c: add missing '\n' at the end of p9_debug()
       9p/net/protocol.c: return -ENOMEM when kmalloc() failed
       net/9p/trans_virtio.c: fix some spell mistakes in comments
       fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed
       net/9p/trans_virtio.c: add null terminal for mount tag
 
  fs/9p/v9fs.c            |   2 +-
  fs/9p/vfs_file.c        |   2 +-
  fs/9p/xattr.c           |   6 ++++--
  include/net/9p/client.h |  11 ++++-------
  net/9p/client.c         | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------
  net/9p/protocol.c       |   2 +-
  net/9p/trans_fd.c       |  22 +++++++++++++++-------
  net/9p/trans_rdma.c     |   4 ++++
  net/9p/trans_virtio.c   |  66 +++++++++++++++++++++++++++++++++++++---------------------------
  net/9p/trans_xen.c      |   3 +++
  net/9p/util.c           |   1 -
  12 files changed, 122 insertions(+), 116 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQ8idm2ZSicIMLgzKqoqIItDqvwPAUCW3ElNwAKCRCoqIItDqvw
 PMzfAKCkCYFyNC89vcpxcCNsK7rFQ1qKlwCgoaBpZDdegOu0jMB7cyKwAWrB0LM=
 =h3T0
 -----END PGP SIGNATURE-----

Merge tag '9p-for-4.19-2' of git://github.com/martinetd/linux

Pull 9p updates from Dominique Martinet:
 "This contains mostly fixes (6 to be backported to stable) and a few
  changes, here is the breakdown:

   - rework how fids are attributed by replacing some custom tracking in
     a list by an idr

   - for packet-based transports (virtio/rdma) validate that the packet
     length matches what the header says

   - a few race condition fixes found by syzkaller

   - missing argument check when NULL device is passed in sys_mount

   - a few virtio fixes

   - some spelling and style fixes"

* tag '9p-for-4.19-2' of git://github.com/martinetd/linux: (21 commits)
  net/9p/trans_virtio.c: add null terminal for mount tag
  9p/virtio: fix off-by-one error in sg list bounds check
  9p: fix whitespace issues
  9p: fix multiple NULL-pointer-dereferences
  fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed
  9p: validate PDU length
  net/9p/trans_fd.c: fix race by holding the lock
  net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree()
  net/9p/virtio: Fix hard lockup in req_done
  net/9p/trans_virtio.c: fix some spell mistakes in comments
  9p/net: Fix zero-copy path in the 9p virtio transport
  9p: Embed wait_queue_head into p9_req_t
  9p: Replace the fidlist with an IDR
  9p: Change p9_fid_create calling convention
  9p: Fix comment on smp_wmb
  net/9p/client.c: version pointer uninitialized
  fs/9p/v9fs.c: fix spelling mistake "Uknown" -> "Unknown"
  net/9p: fix error path of p9_virtio_probe
  9p/net/protocol.c: return -ENOMEM when kmalloc() failed
  net/9p/client.c: add missing '\n' at the end of p9_debug()
  ...
2018-08-17 17:27:58 -07:00
Tomas Bortoli 10aa14527f 9p: fix multiple NULL-pointer-dereferences
Added checks to prevent GPFs from raising.

Link: http://lkml.kernel.org/r/20180727110558.5479-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+1a262da37d3bead15c39@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-08-13 09:34:58 +09:00
Tomas Bortoli f984579a01 9p: validate PDU length
This commit adds length check for the PDU size.
The size contained in the header has to match the actual size,
except for TCP (trans_fd.c) where actual length is not known ahead
and the header's length will be checked only against the validity
range.

Link: http://lkml.kernel.org/r/20180723154404.2406-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+65c6b72f284a39d416b4@syzkaller.appspotmail.com
To: Eric Van Hensbergen <ericvh@gmail.com>
To: Ron Minnich <rminnich@sandia.gov>
To: Latchesar Ionkov <lucho@ionkov.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-08-13 09:34:58 +09:00
Bart Van Assche 72bc4d375d net/9p: Simplify ib_post_(send|recv|srq_recv)() calls
Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL
as third argument to ib_post_(send|recv|srq_recv)().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 16:06:36 -06:00
Sun Lianwen 4a026da91c net/9p: correct some comment errors in 9p file system code
There are follow comment errors:
1 The function name is wrong in p9_release_pages() comment.
2 The function name and variable name is wrong in p9_poll_workfn() comment.
3 There is no variable dm_mr and lkey in struct p9_trans_rdma.
4 The function name is wrong in rdma_create_trans() comment.
5 There is no variable initialized in struct virtio_chan.
6 The variable name is wrong in p9_virtio_zc_request() comment.

Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-10 08:21:53 -04:00
David Howells c4fac91004 9p: Implement show_options
Implement the show_options superblock op for 9p as part of a bid to get
rid of s_options and generic_show_options() to make it easier to implement
a context-based mount where the mount options can be passed individually
over a file descriptor.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Ron Minnich <rminnich@sandia.gov>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: v9fs-developer@lists.sourceforge.net
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-11 06:08:58 -04:00
Christoph Hellwig ed082d36a7 IB/core: add support to create a unsafe global rkey to ib_create_pd
Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or
less unchecked, this moves the capability of creating a global rkey into
the RDMA core, where it can be easily audited.  It also prints a warning
everytime this feature is used as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23 13:47:44 -04:00
Christoph Hellwig 7cf20fc624 net/9p: convert to new CQ API
Trivial conversion to the new RDMA CQ API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-10 20:54:09 -05:00
Guy Shapiro fa20105e09 IB/cma: Add support for network namespaces
Add support for network namespaces in the ib_cma module. This is
accomplished by:

1. Adding network namespace parameter for rdma_create_id. This parameter is
   used to populate the network namespace field in rdma_id_private.
   rdma_create_id keeps a reference on the network namespace.
2. Using the network namespace from the rdma_id instead of init_net inside
   of ib_cma, when listening on an ID and when looking for an ID for an
   incoming request.
3. Decrementing the reference count for the appropriate network namespace
   when calling rdma_destroy_id.

In order to preserve the current behavior init_net is passed when calling
from other modules.

Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:32:48 -04:00
Jason Gunthorpe 2f31fa881f net/9p: Remove ib_get_dma_mr calls
The pd now has a local_dma_lkey member which completely replaces
ib_get_dma_mr, use it instead.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:36 -04:00
Matan Barak 8e37210b38 IB/core: Change ib_create_cq to use struct ib_cq_init_attr
Currently, ib_create_cq uses cqe and comp_vecotr instead
of the extendible ib_cq_init_attr struct.

Earlier patches already changed the vendors to work with
ib_cq_init_attr. This patch changes the consumers too.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-12 14:49:10 -04:00
Dominique Martinet f569d3ef82 net/9p: add a privport option for RDMA transport.
RDMA can use the same kind of weak security as TCP by checking the
client can bind to a privileged port, which is better than nothing
if TAUTH isn't implemented.

Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2015-03-21 19:32:33 -07:00
Simon Derr 8a5daf1e2c 9pnet_rdma: check token type before int conversion
When parsing options, make sure we have found a proper token before
doing a numeric conversion.

Without this check, the current code will end up following random
pointers that just happened to be on the stack when this function was
called, because match_token() will not touch the 'args' list unless a
valid token is found.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:19 -05:00
Simon Derr 931700d26b 9pnet_rdma: add cancelled()
Take into account posted recv buffers that will never receive their
reply.

The RDMA code posts a recv buffer for each request that it sends.
When a request is flushed, it is possible that this request will
never receive a reply, and that one recv buffer will stay unused on
the recv queue.

It is then possible, if this scenario happens several times, to have the
recv queue full, and have the 9pnet_rmda module unable to send new requests.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:13 -05:00
Simon Derr 3f9d5b8dfd 9pnet_rdma: update request status during send
This will be needed by the flush logic.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:12 -05:00
Dominique Martinet 2b6e72ed74 9P: Add memory barriers to protect request fields over cb/rpc threads handoff
We need barriers to guarantee this pattern works as intended:
[w] req->rc, 1		[r] req->status, 1
wmb			rmb
[w] req->status, 1	[r] req->rc

Where the wmb ensures that rc gets written before status,
and the rmb ensures that if you observe status == 1, rc is the new value.

Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:37:59 -05:00
Andi Shyti 59ea52dc46 net: trans_rdma: remove unused function
This patch gets rid of the following warning:

net/9p/trans_rdma.c:594:12: warning: ‘rdma_cancelled’ defined but not used [-Wunused-function]
 static int rdma_cancelled(struct p9_client *client, struct p9_req_t *req)

The rdma_cancelled function is not called anywhere in the kernel

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-24 15:46:27 -07:00
Simon Derr 80b45261a0 9P: Add cancelled() to the transport functions.
RDMA needs to post a buffer for each incoming reply.
Hence it needs to keep count of these and needs to be
aware of whether a flushed request has received a reply
or not.

This patch adds the cancelled() callback to the transport modules.
It is called when RFLUSH has been received and that the corresponding
request will never receive a reply.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:18:18 -05:00
Simon Derr 1cff33069a 9P/RDMA: count posted buffers without a pending request
In rdma_request():

If an error occurs between posting the recv and the send,
there will be a reply context posted without a pending
request.
Since there is no way to "un-post" it, we remember it and
skip post_recv() for the next request.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:04:36 -05:00
Simon Derr 2f52d07cb7 9P/RDMA: Improve error handling in rdma_request
Most importantly:
- do not free the recv context (rpl_context) after a successful post_recv()
- but do free the send context (c) after a failed send.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:29 -05:00
Simon Derr b530e252e2 9P/RDMA: Do not free req->rc in error handling in rdma_request()
rdma_request() should never be in charge of freeing rc.

When an error occurs:
* Either the rc buffer has been recv_post()'ed.
  then kfree()'ing it certainly is a bad idea.
* Or is has not, and in that case req->rc still points to it,
  hence it needs not be freed.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:29 -05:00
Simon Derr fd453d0ed6 9P/RDMA: Use a semaphore to protect the RQ
The current code keeps track of the number of buffers posted in the RQ,
and will prevent it from overflowing. But it does so by simply dropping
post requests (And leaking memory in the process).
When this happens there will actually be too few buffers posted, and
soon the 9P server will complain about 'RNR retry counter exceeded'
errors.

Instead, use a semaphore, and block until the RQ is ready for another
buffer to be posted.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:29 -05:00
Simon Derr 47229ff85e 9P/RDMA: Protect against duplicate replies
A well-behaved server would not send twice the reply to a request.
But if it ever happens...
This additional check prevents the kernel from leaking memory
and possibly more nasty consequences in that unlikely event.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:28 -05:00
Simon Derr 3fcc62f4e8 9P/RDMA: increase P9_RDMA_MAXSIZE to 1MB
The current value is too low to get good performance.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:28 -05:00
Simon Derr 17b6fd9d6d 9P/RDMA: rdma_request() needs not allocate req->rc
p9_tag_alloc() takes care of that.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-07-07 22:02:27 -05:00
Joe Perches 5d3851530d 9p: Reduce object size with CONFIG_NET_9P_DEBUG
Reduce object size by deduplicating formats.

Use vsprintf extension %pV.
Rename P9_DPRINTK uses to p9_debug, align arguments.
Add function for _p9_debug and macro to add __func__.
Add missing "\n"s to p9_debug uses.
Remove embedded function names as p9_debug adds it.
Remove P9_EPRINTK macro and convert use to pr_<level>.
Add and use pr_fmt and pr_<level>.

$ size fs/9p/built-in.o*
   text	   data	    bss	    dec	    hex	filename
  62133	    984	  16000	  79117	  1350d	fs/9p/built-in.o.new
  67342	    984	  16928	  85254	  14d06	fs/9p/built-in.o.old
$ size net/9p/built-in.o*
   text	   data	    bss	    dec	    hex	filename
  88792	   4148	  22024	 114964	  1c114	net/9p/built-in.o.new
  94072	   4148	  23232	 121452	  1da6c	net/9p/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2012-01-05 10:51:44 -06:00
Linus Torvalds 4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Sean Hefty b26f9b9949 RDMA/cma: Pass QP type into rdma_create_id()
The RDMA CM currently infers the QP type from the port space selected
by the user.  In the future (eg with RDMA_PS_IB or XRC), there may not
be a 1-1 correspondence between port space and QP type.  For netlink
export of RDMA CM state, we want to export the QP type to userspace,
so it is cleaner to explicitly associate a QP type to an ID.

Modify rdma_create_id() to allow the user to specify the QP type, and
use it to make our selections of datagram versus connected mode.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
David S. Miller 5b5ed8afe4 9p: Kill unused variable 'ret' in trans_rdma.c:parse_opts()
Probably just cut and pasted from the other parse_opts() implementations
in the 9p sources.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-19 17:45:29 -04:00
Aneesh Kumar K.V eeff66ef6e net/9p: Convert the in the 9p rpc call path to GFP_NOFS
Without this we can cause reclaim allocation in writepage.

[ 3433.448430] =================================
[ 3433.449117] [ INFO: inconsistent lock state ]
[ 3433.449117] 2.6.38-rc5+ #84
[ 3433.449117] ---------------------------------
[ 3433.449117] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
[ 3433.449117] kswapd0/505 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 3433.449117]  (iprune_sem){+++++-}, at: [<ffffffff810ebbab>] shrink_icache_memory+0x45/0x2b1
[ 3433.449117] {RECLAIM_FS-ON-W} state was registered at:
[ 3433.449117]   [<ffffffff8107fe5f>] mark_held_locks+0x52/0x70
[ 3433.449117]   [<ffffffff8107ff02>] lockdep_trace_alloc+0x85/0x9f
[ 3433.449117]   [<ffffffff810d353d>] slab_pre_alloc_hook+0x18/0x3c
[ 3433.449117]   [<ffffffff810d3fd5>] kmem_cache_alloc+0x23/0xa2
[ 3433.449117]   [<ffffffff8127be77>] idr_pre_get+0x2d/0x6f
[ 3433.449117]   [<ffffffff815434eb>] p9_idpool_get+0x30/0xae
[ 3433.449117]   [<ffffffff81540123>] p9_client_rpc+0xd7/0x9b0
[ 3433.449117]   [<ffffffff815427b0>] p9_client_clunk+0x88/0xdb
[ 3433.449117]   [<ffffffff811d56e5>] v9fs_evict_inode+0x3c/0x48
[ 3433.449117]   [<ffffffff810eb511>] evict+0x1f/0x87
[ 3433.449117]   [<ffffffff810eb5c0>] dispose_list+0x47/0xe3
[ 3433.449117]   [<ffffffff810eb8da>] evict_inodes+0x138/0x14f
[ 3433.449117]   [<ffffffff810d90e2>] generic_shutdown_super+0x57/0xe8
[ 3433.449117]   [<ffffffff810d91e8>] kill_anon_super+0x11/0x50
[ 3433.449117]   [<ffffffff811d4951>] v9fs_kill_super+0x49/0xab
[ 3433.449117]   [<ffffffff810d926e>] deactivate_locked_super+0x21/0x46
[ 3433.449117]   [<ffffffff810d9e84>] deactivate_super+0x40/0x44
[ 3433.449117]   [<ffffffff810ef848>] mntput_no_expire+0x100/0x109
[ 3433.449117]   [<ffffffff810f0aeb>] sys_umount+0x2f1/0x31c
[ 3433.449117]   [<ffffffff8102c87b>] system_call_fastpath+0x16/0x1b
[ 3433.449117] irq event stamp: 192941
[ 3433.449117] hardirqs last  enabled at (192941): [<ffffffff81568dcf>] _raw_spin_unlock_irq+0x2b/0x30
[ 3433.449117] hardirqs last disabled at (192940): [<ffffffff810b5f97>] shrink_inactive_list+0x290/0x2f5
[ 3433.449117] softirqs last  enabled at (188470): [<ffffffff8105fd65>] __do_softirq+0x133/0x152
[ 3433.449117] softirqs last disabled at (188455): [<ffffffff8102d7cc>] call_softirq+0x1c/0x28
[ 3433.449117]
[ 3433.449117] other info that might help us debug this:
[ 3433.449117] 1 lock held by kswapd0/505:
[ 3433.449117]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810b52e2>] shrink_slab+0x38/0x15f
[ 3433.449117]
[ 3433.449117] stack backtrace:
[ 3433.449117] Pid: 505, comm: kswapd0 Not tainted 2.6.38-rc5+ #84
[ 3433.449117] Call Trace:
[ 3433.449117]  [<ffffffff8107fbce>] ? valid_state+0x17e/0x191
[ 3433.449117]  [<ffffffff81036896>] ? save_stack_trace+0x28/0x45
[ 3433.449117]  [<ffffffff81080426>] ? check_usage_forwards+0x0/0x87
[ 3433.449117]  [<ffffffff8107fcf4>] ? mark_lock+0x113/0x22c
[ 3433.449117]  [<ffffffff8108105f>] ? __lock_acquire+0x37a/0xcf7
[ 3433.449117]  [<ffffffff8107fc0e>] ? mark_lock+0x2d/0x22c
[ 3433.449117]  [<ffffffff81081077>] ? __lock_acquire+0x392/0xcf7
[ 3433.449117]  [<ffffffff810b14d2>] ? determine_dirtyable_memory+0x15/0x28
[ 3433.449117]  [<ffffffff81081a33>] ? lock_acquire+0x57/0x6d
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff81567d85>] ? down_read+0x47/0x5c
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810b5385>] ? shrink_slab+0xdb/0x15f
[ 3433.449117]  [<ffffffff810b69bc>] ? kswapd+0x574/0x96a
[ 3433.449117]  [<ffffffff810b6448>] ? kswapd+0x0/0x96a
[ 3433.449117]  [<ffffffff810714e2>] ? kthread+0x7d/0x85
[ 3433.449117]  [<ffffffff8102d6d4>] ? kernel_thread_helper+0x4/0x10
[ 3433.449117]  [<ffffffff81569200>] ? restore_args+0x0/0x30
[ 3433.449117]  [<ffffffff81071465>] ? kthread+0x0/0x85
[ 3433.449117]  [<ffffffff8102d6d0>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Shan Wei 441c793a56 net: cleanup unused macros in net directory
Clean up some unused macros in net/*.
1. be left for code change. e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE.
2. never be used since introduced to kernel.
   e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19 23:20:04 -08:00
Davidlohr Bueso 1d6400c7c9 net/9p: fix memory handling/allocation in rdma_request()
Return -ENOMEM when erroring on kmalloc and fix memory leaks when returning on error.

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2010-09-27 07:52:50 -05:00
Joe Perches 3fa21e07e6 net: Remove unnecessary returns from void function()s
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 23:23:14 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Eric Van Hensbergen d8c8a9e365 9p: fix option parsing
Options pointer is being moved before calling kfree() which seems
to cause problems.  This uses a separate pointer to track and free
original allocation.

Signed-off-by: Venkateswararao Jujjuri <jvrao@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>w
2010-02-08 16:23:23 -06:00