Commit Graph

1013621 Commits

Author SHA1 Message Date
Yangyang Li da43b7bebc RDMA/hns: Use IDA interface to manage xrcd index
Switch xrcd index allocation and release from hns own bitmap interface
to IDA interface.

Link: https://lore.kernel.org/r/1623325814-55737-7-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:54 -03:00
Yangyang Li 645f059346 RDMA/hns: Use IDA interface to manage pd index
Switch pd index allocation and release from hns own bitmap interface
to IDA interface.

Link: https://lore.kernel.org/r/1623325814-55737-6-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:54 -03:00
Yangyang Li d38936f010 RDMA/hns: Use IDA interface to manage mtpt index
Switch mtpt index allocation and release from hns own bitmap interface
to IDA interface.

Link: https://lore.kernel.org/r/1623325814-55737-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:53 -03:00
Yangyang Li 38e375b771 RDMA/hns: Remove unused RR mechanism
Round-robin (RR) is no longer used in the allocation of the bitmap table,
and all the function input parameters that use this mechanism are
BITMAP_NO_RR. The code that defines and uses the RR needs to be deleted.

Link: https://lore.kernel.org/r/1623325814-55737-4-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:53 -03:00
Yangyang Li 1bc530c79d RDMA/hns: Remove the unused hns_roce_bitmap_free_range function
hns_roce_bitmap_free_range() is only called inside hns_roce_bitmap_free(),
and the input parameter "cnt" is set to a constant 1. In addition, the
driver does not use alloc_range scenarios, so free_range does not need to
exist.

Link: https://lore.kernel.org/r/1623325814-55737-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:53 -03:00
Yangyang Li 24977edbb5 RDMA/hns: Remove the unused hns_roce_bitmap_alloc_range function
The function is no longer used.

Link: https://lore.kernel.org/r/1623325814-55737-2-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:42:53 -03:00
Wenpeng Liang 3cea7b4a7d RDMA/core: Fix incorrect print format specifier
There are some '%u' for 'int' and '%d' for 'unsigend int', they should be
fixed.

Link: https://lore.kernel.org/r/1623325232-30900-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:38:30 -03:00
Xi Wang 57dba89ad2 RDMA/hns: Clean SRQC structure definition
Remove unused members in srq context structure.

Link: https://lore.kernel.org/r/1624262443-24528-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:42 -03:00
Yixing Liu 2b035e7312 RDMA/hns: Use new interface to write DB related fields
Use hr_write_reg() instead of roce_set_field().

Link: https://lore.kernel.org/r/1624262443-24528-9-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:42 -03:00
Yixing Liu fd9e3679af RDMA/hns: Use new interface to write FRMR fields
Use "hr_reg_write" to replace "roce_set_filed".

Link: https://lore.kernel.org/r/1624262443-24528-8-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:42 -03:00
Lang Cheng f778bf1b8c RDMA/hns: Use new interface to get CQE fields
WQE_INDEX and OPCODE and QPN of CQE use redundant masks. Just remove them.

Link: https://lore.kernel.org/r/1624262443-24528-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:41 -03:00
Lang Cheng f0cb411aad RDMA/hns: Use new interface to modify QP context
Fill all QPC fileds with hr_reg_*() instead of roce_set_*(). SQPN is used
for HIP08 ES only, it should be removed.

Link: https://lore.kernel.org/r/1624262443-24528-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:41 -03:00
Yixing Liu f6fcd28d49 RDMA/hns: Use new interface to write CQ context.
Use hr_reg_*() to write CQ context, it's simpler than roce_set_*().

Link: https://lore.kernel.org/r/1624262443-24528-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:41 -03:00
Lang Cheng a762fe656b RDMA/hns: Add hr_reg_write_bool()
In order to avoid to do bitwise operations on a boolean value, add a new
register interface to avoid sparse comlaint about "dubious: x & !y" when
calling hr_reg_write(ctx, field, !!val).

Fixes: dc50477440 ("RDMA/hns: Use new interface to set MPT related fields")
Fixes: 495c24808c ("RDMA/hns: Add XRC subtype in QPC and XRC type in SRQC")
Link: https://lore.kernel.org/r/1624262443-24528-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:41 -03:00
Weihang Li fe331da0f2 RDMA/hns: Add a check to ensure integer mtu is positive
GCC may reports an running time assert error when a value calculated from
ib_mtu_enum_to_int() is using as 'val' in FIELD_PREDP:

include/linux/compiler_types.h:328:38: error: call to
'__compiletime_assert_1524' declared with attribute error: FIELD_PREP:
value too large for the field

So a check is added about whether integer mtu from ib_mtu_enum_to_int() is
negative to avoid this warning.

Link: https://lore.kernel.org/r/1624262443-24528-3-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:41 -03:00
Weihang Li 78c1da5270 RDMA/hns: Do not use !! for values that are already bool when calling hr_reg_write()
There is no need to use "!!" before "eq->eqe_size ==
HNS_ROCE_V3_EQE_SIZE", or sparse will complain about "dubious: x & !y".

Fixes: 782832f254 ("RDMA/hns: Simplify the function config_eqc()")
Link: https://lore.kernel.org/r/1624262443-24528-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 15:03:40 -03:00
Avihai Horon 1477d44ce4 RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPs
Relaxed Ordering is a capability that can only benefit users that support
it. All kernel ULPs should support Relaxed Ordering, as they are designed
to read data only after observing the CQE and use the DMA API correctly.

Hence, implicitly enable Relaxed Ordering by default for MR transfers in
kernel ULPs.

Link: https://lore.kernel.org/r/b7e820aab7402b8efa63605f4ea465831b3b1e5e.1623236426.git.leonro@nvidia.com
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 12:33:08 -03:00
Xi Wang 7e78dd816e RDMA/hns: Clear extended doorbell info before using
Both of HIP08 and HIP09 require the extended doorbell information to be
cleared before being used.

Fixes: 6b63597d35 ("RDMA/hns: Add TSQ link table support")
Link: https://lore.kernel.org/r/1623392089-35639-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:54:25 -03:00
Jack Wang a95fbe2aba RDMA/rtrs: Check device max_qp_wr limit when create QP
Currently we only check device max_qp_wr limit for IO connection, but not
for service connection. We should check for both.

So save the max_qp_wr device limit in wr_limit, and use it for both IO
connections and service connections.

While at it, also remove an outdated comments.

Link: https://lore.kernel.org/r/20210614090337.29557-6-jinpu.wang@ionos.com
Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Guoqing Jiang 354462eb7f RDMA/rtrs: Rename cq_size/queue_size to cq_num/queue_num
Those variables are passed to create_cq, create_qp, rtrs_iu_alloc and
rtrs_iu_free, so these *_size means the num of unit. And cq_size also
means number of cq element.

Also move the setting of cq_num to common path.

Link: https://lore.kernel.org/r/20210614090337.29557-5-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Md Haris Iqbal b012f0ad53 RDMA/rtrs: RDMA_RXE requires more number of WR
When using rdma_rxe, post_one_recv() returns ENOMEM error due to the full
recv queue.  This patch increase the number of WR for receive queue to
support all devices.

Link: https://lore.kernel.org/r/20210614090337.29557-4-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Jack Wang 0509ebfa33 RDMA/rtrs-clt: Use minimal max_send_sge when create qp
We use device limit max_send_sge, which is suboptimal for memory usage.
We don't need that much for User Con, 1 is enough. And for IO con,
sess->max_segments + 1 is enough

Link: https://lore.kernel.org/r/20210614090337.29557-3-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:12 -03:00
Jack Wang 5e91eabf66 RDMA/rtrs-srv: Set minimal max_send_wr and max_recv_wr
Currently rtrs when create_qp use a coarse numbers (bigger in general),
which leads to hardware create more resources which only waste memory with
no benefits.

For max_send_wr, we don't really need alway max_qp_wr size when creating
qp, reduce it to cq_size.

For max_recv_wr,  cq_size is enough.

With the patch when sess_queue_depth=128, per session (2 paths) memory
consumption reduced from 188 MB to 65MB

When always_invalidate is enabled, we need send more wr, so treat it
special.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210614090337.29557-2-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:12 -03:00
Jason Gunthorpe 915e4af59f RDMA: Remove rdma_set_device_sysfs_group()
The driver's device group can be specified as part of the ops structure
like the device's port group. No need for the complicated API.

Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:32 -03:00
Jason Gunthorpe 69d86a66bd RDMA/core: Allow port_groups to be used with namespaces
Now that the port_groups data is being destroyed and managed by the core
code this restriction is no longer needed. All the ib_port_attrs are
compatible with the core's sysfs lifecycle.

When the main device is destroyed and moved to another namespace the
driver's port sysfs can be created/destroyed as well due to it now being a
simple attribute list.

Link: https://lore.kernel.org/r/afd8b676eace2821692d44489ff71856277c48d1.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:31 -03:00
Jason Gunthorpe d7407d1669 RDMA: Change ops->init_port to ops->port_groups
init_port was only being used to register sysfs attributes against the
port kobject. Now that all users are creating static attribute_group's we
can simply set the attribute_group list in the ops and the core code can
just handle it directly.

This makes all the sysfs management quite straightforward and prevents any
driver from abusing the naked port kobject in future because no driver
code can access it.

Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:31 -03:00
Jason Gunthorpe 8f1708f19f RDMA/hfi1: Use attributes for the port sysfs
hfi1 should not be creating a mess of kobjects to attach to the port
kobject - this is all attributes. The proper API is to create an
attribute_group list and create it against the port's kobject.

Link: https://lore.kernel.org/r/cbe0ccb6175dd22274359b6ad803a37435a70e91.1623427137.git.leonro@nvidia.com
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:31 -03:00
Jason Gunthorpe 4a7aaf88c8 RDMA/qib: Use attributes for the port sysfs
qib should not be creating a mess of kobjects to attach to the port
kobject - this is all attributes. The proper API is to create an
attribute_group list and create it against the port's kobject.

Link: https://lore.kernel.org/r/911e0031e1ed495b0006e8a6efec7b67a702cd5e.1623427137.git.leonro@nvidia.com
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:31 -03:00
Jason Gunthorpe 526a12c8c5 RDMA/cm: Use an attribute_group on the ib_port_attribute intead of kobj's
This code is trying to attach a list of counters grouped into 4 groups to
the ib_port sysfs. Instead of creating a bunch of kobjects simply express
everything naturally as an ib_port_attribute and add a single
attribute_groups list.

Remove all the naked kobject manipulations.

Link: https://lore.kernel.org/r/0d5a7241ee0fe66622de04fcbaafaf6a791d5c7c.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:31 -03:00
Jason Gunthorpe 054239f45c RDMA/core: Expose the ib port sysfs attribute machinery
Other things outside the core code are creating attributes against the
port. This patch exposes the basic machinery to do this.

The ib_port_attribute type allows creating groups of attributes attatched
to the port and comes with the usual machinery to do this.

Link: https://lore.kernel.org/r/5c4aeae57f6fa7c59a1d6d1c5506069516ae9bbf.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:30 -03:00
Jason Gunthorpe d89eb509aa RDMA/core: Remove the kobject_uevent() NOP
This call does nothing because the ib_port kobj is nested under a struct
device kobject and the dev_uevent_filter() function of the struct device
blocks uevents for any children kobj's that are not also struct devices.

A uevent for the struct device will be triggered after
ib_setup_port_attrs() returns which causes udev to pick up all the deep
"attributes" which are implemented as kobjects nested under a struct
device and assign them to the udev object for the struct device:

 $ udevadm info -a /sys/class/infiniband/ibp0s9
     ATTR{ports/1/counters/excessive_buffer_overrun_errors}=="0"

Link: https://lore.kernel.org/r/49231c92c7d4c60686de18f7e20932d0c82160ee.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:30 -03:00
Jason Gunthorpe b7066b32a1 RDMA/core: Create the device hw_counters through the normal groups mechanism
Instead of calling device_add_groups() add the group to the existing
groups array which is managed through device_add().

This requires setting up the hw_counters before device_add(), so it gets
split up from the already split port sysfs flow.

Move all the memory freeing to the release function.

Link: https://lore.kernel.org/r/666250d937b64f6fdf45da9e2dc0b6e5e4f7abd8.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:30 -03:00
Jason Gunthorpe 2ca1cca435 RDMA/core: Simplify how the port sysfs is created
Use the same technique as gid_attrs now uses to manage the port
sysfs. Bundle everything into three allocations and use a single
sysfs_create_groups() to build everything in one shot.

All the memory is always freed in the kobj release function, removing most
of the error unwinding.

The gid_attr technique and the hw_counters are very similar, merge the two
together and combine the sysfs_create_group() call for hw_counters with
the single sysfs group setup.

Link: https://lore.kernel.org/r/b688f3340694c59f7b44b1bde40e25559ef43cf3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:30 -03:00
Jason Gunthorpe a4676388e2 RDMA/core: Simplify how the gid_attrs sysfs is created
Instead of having an whole bunch of different allocations to create the
gid_attr kobjects reduce it to three, one for the kobj struct plus the
attributes, and one for the attribute list for each of the two
groups.

Move the freeing of all allocations to the release function.

Reorder the operations so all the allocations happen first then the
kobject & sysfs operations are last.

This removes the majority of the complicated error unwind since the
release function will always undo all the memory allocations. Freeing the
memory is also much simpler since there is a lot less of it.

Consolidate creating the "group of array indexes" pattern into one helper
function. Ensure kobject_del is used.

Link: https://lore.kernel.org/r/f4149d379db7178d37d11d75e3026bf550f818a1.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Jason Gunthorpe a32f433522 RDMA/core: Split gid_attrs related sysfs from add_port()
The gid_attrs directory is a dedicated kobj nested under the port,
construct/destruct it with its own pair of functions for
understandability. This is much more readable than having it weirdly
inlined out of order into the add_port() function.

Link: https://lore.kernel.org/r/1c9434111b6770a7aef0e644a88a16eee7e325b8.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Jason Gunthorpe 467f432a52 RDMA/core: Split port and device counter sysfs attributes
This code creates a 'struct hw_stats_attribute' for each sysfs entry that
contains a naked 'struct attribute' inside.

It then proceeds to attach this same structure to a 'struct device' kobj
and a 'struct ib_port' kobj. However, this violates the typing
requirements.  'struct device' requires the attribute to be a 'struct
device_attribute' and 'struct ib_port' requires the attribute to be
'struct port_attribute'.

This happens to work because the show/store function pointers in all three
structures happen to be at the same offset and happen to be nearly the
same signature. This means when container_of() was used to go between the
wrong two types it still managed to work.

However clang CFI detection notices that the function pointers have a
slightly different signature. As with show/store this was only working
because the device and port struct layouts happened to have the kobj at
the front.

Correct this by have two independent sets of data structures for the port
and device case. The two different attributes correctly include the
port/device_attribute struct and everything from there up is kept
split. The show/store function call chains start with device/port unique
functions that invoke a common show/store function pointer.

Link: https://lore.kernel.org/r/a8b3864b4e722aed3657512af6aa47dc3c5033be.1623427137.git.leonro@nvidia.com
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Jason Gunthorpe d8a5883814 RDMA/core: Replace the ib_port_data hw_stats pointers with a ib_port pointer
It is much saner to store a pointer to the kobject structure that contains
the cannonical stats pointer than to copy the stats pointers into a public
structure.

Future patches will require the sysfs pointer for other purposes.

Link: https://lore.kernel.org/r/f90551dfd296cde1cb507bbef27cca9891d19871.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Jason Gunthorpe 4b5f4d3fb4 RDMA: Split the alloc_hw_stats() ops to port and device variants
This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.

Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.

Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Bob Pearson 570d2b99d0 RDMA/rxe: Disallow MR dereg and invalidate when bound
Check that an MR has no bound MWs before allowing a dereg or invalidate
operation.

Link: https://lore.kernel.org/r/20210608042552.33275-11-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:19 -03:00
Bob Pearson cdd0b85675 RDMA/rxe: Implement memory access through MWs
Add code to implement memory access through memory windows.

Link: https://lore.kernel.org/r/20210608042552.33275-10-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:18 -03:00
Bob Pearson 3902b429ca RDMA/rxe: Implement invalidate MW operations
Implement invalidate MW and cleaned up invalidate MR operations.

Added code to perform remote invalidate for send with invalidate.  Added
code to perform local invalidation. Deleted some blank lines in rxe_loc.h.

Link: https://lore.kernel.org/r/20210608042552.33275-9-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:18 -03:00
Bob Pearson 32a577b4c3 RDMA/rxe: Add support for bind MW work requests
Add support for bind MW work requests from user space.  Since rdma/core
does not support bind mw in ib_send_wr there is no way to support bind mw
in kernel space.

Added bind_mw local operation in rxe_req.c. Added bind_mw WR operation in
rxe_opcode.c. Added bind_mw WC in rxe_comp.c.  Added additional fields to
rxe_mw in rxe_verbs.h. Added rxe_do_dealloc_mw() subroutine to cleanup an
mw when rxe_dealloc_mw is called.  Added code to implement bind_mw
operation in rxe_mw.c

Link: https://lore.kernel.org/r/20210608042552.33275-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:18 -03:00
Bob Pearson c1a411268a RDMA/rxe: Move local ops to subroutine
Simplify rxe_requester() by moving the local operations to a subroutine.
Add an error return for illegal send WR opcode.  Moved next_index ahead of
rxe_run_task which fixed a small bug where work completions were delayed
until after the next wqe which was not the intended behavior.  Let errors
return their own WC status. Previously all errors were reported as
protection errors which was incorrect. Changed the return of errors from
rxe_do_local_ops() to err: which causes an immediate completion.  Without
this an error on a last WR may get lost. Changed fill_packet() to
finish_packet() which is more accurate.

Fixes: 8700e2e7c485 ("The software RoCE driver")
Link: https://lore.kernel.org/r/20210608042552.33275-7-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:18 -03:00
Bob Pearson 886441fb2e RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK
Rxe has two mask bits WR_LOCAL_MASK and WR_REG_MASK with WR_REG_MASK used
to indicate any local operation and WR_LOCAL_MASK unused. This patch
replaces both of these with one mask bit WR_LOCAL_OP_MASK which is
clearer.

Link: https://lore.kernel.org/r/20210608042552.33275-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:18 -03:00
Bob Pearson beec0239c3 RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs
Add ib_alloc_mw and ib_dealloc_mw verbs APIs.

Added new file rxe_mw.c focused on MWs. Changed the 8 bit random key
generator. Added a cleanup routine for MWs. Added verbs routines to
ib_device_ops.

Link: https://lore.kernel.org/r/20210608042552.33275-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:17 -03:00
Bob Pearson af732adfac RDMA/rxe: Enable MW object pool
Currently the rxe driver has a rxe_mw struct object but nothing about
memory windows is enabled. This patch turns on memory windows and some
minor cleanup.

Set device attribute in rxe.c so max_mw = MAX_MW.  Change parameters in
rxe_param.h so that MAX_MW is the same as MAX_MR.  Reduce the number of
MRs and MWs to 4K from 256K.  Add device capability bits for 2a and 2b
memory windows.  Removed RXE_MR_TYPE_MW from the rxe_mr_type enum.

Link: https://lore.kernel.org/r/20210608042552.33275-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:17 -03:00
Bob Pearson 08224016ab RDMA/rxe: Return errors for add index and key
Modify rxe_add_index() and rxe_add_key() to return an error if the index
or key is aleady present in the pool.  Currently they print a warning and
silently fail with bad consequences to the caller.

Link: https://lore.kernel.org/r/20210608042552.33275-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:17 -03:00
Bob Pearson 660a59369e RDMA/rxe: Add bind MW fields to rxe_send_wr
Add fields to struct rxe_send_wr in rdma_user_rxe.h to support bind MW
work requests

Link: https://lore.kernel.org/r/20210608042552.33275-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:51:17 -03:00
Bob Pearson 15ae1375ea RDMA/rxe: Fix qp reference counting for atomic ops
Currently the rdma_rxe driver attempts to protect atomic responder
resources by taking a reference to the qp which is only freed when the
resource is recycled for a new read or atomic operation. This means that
in normal circumstances there is almost always an extra qp reference once
an atomic operation has been executed which prevents cleaning up the qp
and associated pd and cqs when the qp is destroyed.

This patch removes the call to rxe_add_ref() in send_atomic_ack() and the
call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is
destroyed while a peer is retrying an atomic op it will cause the
operation to fail which is acceptable.

Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com
Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Fixes: 86af617641 ("IB/rxe: remove unnecessary skb_clone")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:20:23 -03:00
Xi Wang 61b460d100 RDMA/hns: Support getting max QP number from firmware
All functions of HIP09's ROCEE share on-chip resources for all QPs, the
driver needs configure the resource index and number for each function
during the init stage.

Link: https://lore.kernel.org/r/1622541427-42193-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 15:26:22 -03:00