5.2 Merge Window pull request

This has been a smaller cycle than normal. One new driver was accepted,
 which is unusual, and at least one more driver remains in review on the
 list.
 
 - Driver fixes for hns, hfi1, nes, rxe, i40iw, mlx5, cxgb4, vmw_pvrdma
 
 - Many patches from MatthewW converting radix tree and IDR users to use
   xarray
 
 - Introduction of tracepoints to the MAD layer
 
 - Build large SGLs at the start for DMA mapping and get the driver to
   split them
 
 - Generally clean SGL handling code throughout the subsystem
 
 - Support for restricting RDMA devices to net namespaces for containers
 
 - Progress to remove object allocation boilerplate code from drivers
 
 - Change in how the mlx5 driver shows representor ports linked to VFs
 
 - mlx5 uapi feature to access the on chip SW ICM memory
 
 - Add a new driver for 'EFA'. This is HW that supports user space packet
   processing through QPs in Amazon's cloud
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlzTIU0ACgkQOG33FX4g
 mxrGKQ/8CqpyvuCyZDW5ovO4DI4YlzYSPXehWlwxA4CWhU1AYTujutnNOdZdngnz
 atTthOlJpZWJV26orvvzwIOi4qX/5UjLXEY3HYdn07JP1Z4iT7E3P4W2sdU3vdl3
 j8bU7xM7ZWmnGxrBZ6yQlVRadEhB8+HJIZWMw+wx66cIPnvU+g9NgwouH67HEEQ3
 PU8OCtGBwNNR508WPiZhjqMDfi/3BED4BfCihFhMbZEgFgObjRgtCV0M33SSXKcR
 IO2FGNVuDAUBlND3vU9guW1+M77xE6p1GvzkIgdCp6qTc724NuO5F2ngrpHKRyZT
 CxvBhAJI6tAZmjBVnmgVJex7rA8p+y/8M/2WD6GE3XSO89XVOkzNBiO2iTMeoxXr
 +CX6VvP2BWwCArxsfKMgW3j0h/WVE9w8Ciej1628m1NvvKEV4AGIJC1g93lIJkRN
 i3RkJ5PkIrdBrTEdKwDu1FdXQHaO7kGgKvwzJ7wBFhso8BRMrMfdULiMbaXs2Bw1
 WdL5zoSe/bLUpPZxcT9IjXRxY5qR0FpIOoo6925OmvyYe/oZo1zbitS5GGbvV90g
 tkq6Jb+aq8ZKtozwCo+oMcg9QPLYNibQsnkL3QirtURXWCG467xdgkaJLdF6s5Oh
 cp+YBqbR/8HNMG/KQlCfnNQKp1ci8mG3EdthQPhvdcZ4jtbqnSI=
 =TS64
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a smaller cycle than normal. One new driver was
  accepted, which is unusual, and at least one more driver remains in
  review on the list.

  Summary:

   - Driver fixes for hns, hfi1, nes, rxe, i40iw, mlx5, cxgb4,
     vmw_pvrdma

   - Many patches from MatthewW converting radix tree and IDR users to
     use xarray

   - Introduction of tracepoints to the MAD layer

   - Build large SGLs at the start for DMA mapping and get the driver to
     split them

   - Generally clean SGL handling code throughout the subsystem

   - Support for restricting RDMA devices to net namespaces for
     containers

   - Progress to remove object allocation boilerplate code from drivers

   - Change in how the mlx5 driver shows representor ports linked to VFs

   - mlx5 uapi feature to access the on chip SW ICM memory

   - Add a new driver for 'EFA'. This is HW that supports user space
     packet processing through QPs in Amazon's cloud"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (186 commits)
  RDMA/ipoib: Allow user space differentiate between valid dev_port
  IB/core, ipoib: Do not overreact to SM LID change event
  RDMA/device: Don't fire uevent before device is fully initialized
  lib/scatterlist: Remove leftover from sg_page_iter comment
  RDMA/efa: Add driver to Kconfig/Makefile
  RDMA/efa: Add the efa module
  RDMA/efa: Add EFA verbs implementation
  RDMA/efa: Add common command handlers
  RDMA/efa: Implement functions that submit and complete admin commands
  RDMA/efa: Add the ABI definitions
  RDMA/efa: Add the com service API definitions
  RDMA/efa: Add the efa_com.h file
  RDMA/efa: Add the efa.h header file
  RDMA/efa: Add EFA device definitions
  RDMA: Add EFA related definitions
  RDMA/umem: Remove hugetlb flag
  RDMA/bnxt_re: Use core helpers to get aligned DMA address
  RDMA/i40iw: Use core helpers to get aligned DMA address within a supported page size
  RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks
  RDMA/umem: Add API to find best driver supported page size in an MR
  ...
This commit is contained in:
Linus Torvalds 2019-05-09 09:02:46 -07:00
commit dce45af5c2
251 changed files with 12549 additions and 4585 deletions

View File

@ -745,6 +745,15 @@ S: Supported
F: Documentation/networking/device_drivers/amazon/ena.txt
F: drivers/net/ethernet/amazon/
AMAZON RDMA EFA DRIVER
M: Gal Pressman <galpress@amazon.com>
R: Yossi Leybovich <sleybo@amazon.com>
L: linux-rdma@vger.kernel.org
Q: https://patchwork.kernel.org/project/linux-rdma/list/
S: Supported
F: drivers/infiniband/hw/efa/
F: include/uapi/rdma/efa-abi.h
AMD CRYPTOGRAPHIC COPROCESSOR (CCP) DRIVER
M: Tom Lendacky <thomas.lendacky@amd.com>
M: Gary Hook <gary.hook@amd.com>
@ -4279,7 +4288,7 @@ S: Supported
F: drivers/scsi/cxgbi/cxgb3i
CXGB3 IWARP RNIC DRIVER (IW_CXGB3)
M: Steve Wise <swise@chelsio.com>
M: Potnuri Bharat Teja <bharat@chelsio.com>
L: linux-rdma@vger.kernel.org
W: http://www.openfabrics.org
S: Supported
@ -4308,7 +4317,7 @@ S: Supported
F: drivers/scsi/cxgbi/cxgb4i
CXGB4 IWARP RNIC DRIVER (IW_CXGB4)
M: Steve Wise <swise@chelsio.com>
M: Potnuri Bharat Teja <bharat@chelsio.com>
L: linux-rdma@vger.kernel.org
W: http://www.openfabrics.org
S: Supported
@ -7727,6 +7736,10 @@ F: drivers/infiniband/
F: include/uapi/linux/if_infiniband.h
F: include/uapi/rdma/
F: include/rdma/
F: include/trace/events/ib_mad.h
F: include/trace/events/ib_umad.h
F: samples/bpf/ibumad_kern.c
F: samples/bpf/ibumad_user.c
INGENIC JZ4780 DMA Driver
M: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>

View File

@ -93,6 +93,7 @@ source "drivers/infiniband/hw/mthca/Kconfig"
source "drivers/infiniband/hw/qib/Kconfig"
source "drivers/infiniband/hw/cxgb3/Kconfig"
source "drivers/infiniband/hw/cxgb4/Kconfig"
source "drivers/infiniband/hw/efa/Kconfig"
source "drivers/infiniband/hw/i40iw/Kconfig"
source "drivers/infiniband/hw/mlx4/Kconfig"
source "drivers/infiniband/hw/mlx5/Kconfig"

View File

@ -45,6 +45,7 @@
#include <net/ipv6_stubs.h>
#include <net/ip6_route.h>
#include <rdma/ib_addr.h>
#include <rdma/ib_cache.h>
#include <rdma/ib_sa.h>
#include <rdma/ib.h>
#include <rdma/rdma_netlink.h>

View File

@ -78,11 +78,22 @@ enum gid_table_entry_state {
GID_TABLE_ENTRY_PENDING_DEL = 3,
};
struct roce_gid_ndev_storage {
struct rcu_head rcu_head;
struct net_device *ndev;
};
struct ib_gid_table_entry {
struct kref kref;
struct work_struct del_work;
struct ib_gid_attr attr;
void *context;
/* Store the ndev pointer to release reference later on in
* call_rcu context because by that time gid_table_entry
* and attr might be already freed. So keep a copy of it.
* ndev_storage is freed by rcu callback.
*/
struct roce_gid_ndev_storage *ndev_storage;
enum gid_table_entry_state state;
};
@ -206,6 +217,20 @@ static void schedule_free_gid(struct kref *kref)
queue_work(ib_wq, &entry->del_work);
}
static void put_gid_ndev(struct rcu_head *head)
{
struct roce_gid_ndev_storage *storage =
container_of(head, struct roce_gid_ndev_storage, rcu_head);
WARN_ON(!storage->ndev);
/* At this point its safe to release netdev reference,
* as all callers working on gid_attr->ndev are done
* using this netdev.
*/
dev_put(storage->ndev);
kfree(storage);
}
static void free_gid_entry_locked(struct ib_gid_table_entry *entry)
{
struct ib_device *device = entry->attr.device;
@ -228,8 +253,8 @@ static void free_gid_entry_locked(struct ib_gid_table_entry *entry)
/* Now this index is ready to be allocated */
write_unlock_irq(&table->rwlock);
if (entry->attr.ndev)
dev_put(entry->attr.ndev);
if (entry->ndev_storage)
call_rcu(&entry->ndev_storage->rcu_head, put_gid_ndev);
kfree(entry);
}
@ -266,14 +291,25 @@ static struct ib_gid_table_entry *
alloc_gid_entry(const struct ib_gid_attr *attr)
{
struct ib_gid_table_entry *entry;
struct net_device *ndev;
entry = kzalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
return NULL;
ndev = rcu_dereference_protected(attr->ndev, 1);
if (ndev) {
entry->ndev_storage = kzalloc(sizeof(*entry->ndev_storage),
GFP_KERNEL);
if (!entry->ndev_storage) {
kfree(entry);
return NULL;
}
dev_hold(ndev);
entry->ndev_storage->ndev = ndev;
}
kref_init(&entry->kref);
memcpy(&entry->attr, attr, sizeof(*attr));
if (entry->attr.ndev)
dev_hold(entry->attr.ndev);
INIT_WORK(&entry->del_work, free_gid_work);
entry->state = GID_TABLE_ENTRY_INVALID;
return entry;
@ -343,6 +379,7 @@ static int add_roce_gid(struct ib_gid_table_entry *entry)
static void del_gid(struct ib_device *ib_dev, u8 port,
struct ib_gid_table *table, int ix)
{
struct roce_gid_ndev_storage *ndev_storage;
struct ib_gid_table_entry *entry;
lockdep_assert_held(&table->lock);
@ -360,6 +397,13 @@ static void del_gid(struct ib_device *ib_dev, u8 port,
table->data_vec[ix] = NULL;
write_unlock_irq(&table->rwlock);
ndev_storage = entry->ndev_storage;
if (ndev_storage) {
entry->ndev_storage = NULL;
rcu_assign_pointer(entry->attr.ndev, NULL);
call_rcu(&ndev_storage->rcu_head, put_gid_ndev);
}
if (rdma_cap_roce_gid_table(ib_dev, port))
ib_dev->ops.del_gid(&entry->attr, &entry->context);
@ -543,30 +587,11 @@ out_unlock:
int ib_cache_gid_add(struct ib_device *ib_dev, u8 port,
union ib_gid *gid, struct ib_gid_attr *attr)
{
struct net_device *idev;
unsigned long mask;
int ret;
unsigned long mask = GID_ATTR_FIND_MASK_GID |
GID_ATTR_FIND_MASK_GID_TYPE |
GID_ATTR_FIND_MASK_NETDEV;
idev = ib_device_get_netdev(ib_dev, port);
if (idev && attr->ndev != idev) {
union ib_gid default_gid;
/* Adding default GIDs is not permitted */
make_default_gid(idev, &default_gid);
if (!memcmp(gid, &default_gid, sizeof(*gid))) {
dev_put(idev);
return -EPERM;
}
}
if (idev)
dev_put(idev);
mask = GID_ATTR_FIND_MASK_GID |
GID_ATTR_FIND_MASK_GID_TYPE |
GID_ATTR_FIND_MASK_NETDEV;
ret = __ib_cache_gid_add(ib_dev, port, gid, attr, mask, false);
return ret;
return __ib_cache_gid_add(ib_dev, port, gid, attr, mask, false);
}
static int
@ -1263,11 +1288,72 @@ struct net_device *rdma_read_gid_attr_ndev_rcu(const struct ib_gid_attr *attr)
read_lock_irqsave(&table->rwlock, flags);
valid = is_gid_entry_valid(table->data_vec[attr->index]);
if (valid && attr->ndev && (READ_ONCE(attr->ndev->flags) & IFF_UP))
ndev = attr->ndev;
if (valid) {
ndev = rcu_dereference(attr->ndev);
if (!ndev ||
(ndev && ((READ_ONCE(ndev->flags) & IFF_UP) == 0)))
ndev = ERR_PTR(-ENODEV);
}
read_unlock_irqrestore(&table->rwlock, flags);
return ndev;
}
EXPORT_SYMBOL(rdma_read_gid_attr_ndev_rcu);
static int get_lower_dev_vlan(struct net_device *lower_dev, void *data)
{
u16 *vlan_id = data;
if (is_vlan_dev(lower_dev))
*vlan_id = vlan_dev_vlan_id(lower_dev);
/* We are interested only in first level vlan device, so
* always return 1 to stop iterating over next level devices.
*/
return 1;
}
/**
* rdma_read_gid_l2_fields - Read the vlan ID and source MAC address
* of a GID entry.
*
* @attr: GID attribute pointer whose L2 fields to be read
* @vlan_id: Pointer to vlan id to fill up if the GID entry has
* vlan id. It is optional.
* @smac: Pointer to smac to fill up for a GID entry. It is optional.
*
* rdma_read_gid_l2_fields() returns 0 on success and returns vlan id
* (if gid entry has vlan) and source MAC, or returns error.
*/
int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
u16 *vlan_id, u8 *smac)
{
struct net_device *ndev;
rcu_read_lock();
ndev = rcu_dereference(attr->ndev);
if (!ndev) {
rcu_read_unlock();
return -ENODEV;
}
if (smac)
ether_addr_copy(smac, ndev->dev_addr);
if (vlan_id) {
*vlan_id = 0xffff;
if (is_vlan_dev(ndev)) {
*vlan_id = vlan_dev_vlan_id(ndev);
} else {
/* If the netdev is upper device and if it's lower
* device is vlan device, consider vlan id of the
* the lower vlan device for this gid entry.
*/
netdev_walk_all_lower_dev_rcu(attr->ndev,
get_lower_dev_vlan, vlan_id);
}
}
rcu_read_unlock();
return 0;
}
EXPORT_SYMBOL(rdma_read_gid_l2_fields);
static int config_non_roce_gid_cache(struct ib_device *device,
u8 port, int gid_tbl_len)
@ -1392,7 +1478,6 @@ static void ib_cache_event(struct ib_event_handler *handler,
event->event == IB_EVENT_PORT_ACTIVE ||
event->event == IB_EVENT_LID_CHANGE ||
event->event == IB_EVENT_PKEY_CHANGE ||
event->event == IB_EVENT_SM_CHANGE ||
event->event == IB_EVENT_CLIENT_REREGISTER ||
event->event == IB_EVENT_GID_CHANGE) {
work = kmalloc(sizeof *work, GFP_ATOMIC);

View File

@ -52,6 +52,7 @@
#include <rdma/ib_cache.h>
#include <rdma/ib_cm.h>
#include "cm_msgs.h"
#include "core_priv.h"
MODULE_AUTHOR("Sean Hefty");
MODULE_DESCRIPTION("InfiniBand CM");
@ -124,7 +125,8 @@ static struct ib_cm {
struct rb_root remote_qp_table;
struct rb_root remote_id_table;
struct rb_root remote_sidr_table;
struct idr local_id_table;
struct xarray local_id_table;
u32 local_id_next;
__be32 random_id_operand;
struct list_head timewait_list;
struct workqueue_struct *wq;
@ -219,7 +221,6 @@ struct cm_port {
struct cm_device {
struct list_head list;
struct ib_device *ib_device;
struct device *device;
u8 ack_delay;
int going_down;
struct cm_port *port[0];
@ -598,35 +599,31 @@ static int cm_init_av_by_path(struct sa_path_rec *path,
static int cm_alloc_id(struct cm_id_private *cm_id_priv)
{
unsigned long flags;
int id;
int err;
u32 id;
idr_preload(GFP_KERNEL);
spin_lock_irqsave(&cm.lock, flags);
id = idr_alloc_cyclic(&cm.local_id_table, cm_id_priv, 0, 0, GFP_NOWAIT);
spin_unlock_irqrestore(&cm.lock, flags);
idr_preload_end();
err = xa_alloc_cyclic_irq(&cm.local_id_table, &id, cm_id_priv,
xa_limit_32b, &cm.local_id_next, GFP_KERNEL);
cm_id_priv->id.local_id = (__force __be32)id ^ cm.random_id_operand;
return id < 0 ? id : 0;
return err;
}
static u32 cm_local_id(__be32 local_id)
{
return (__force u32) (local_id ^ cm.random_id_operand);
}
static void cm_free_id(__be32 local_id)
{
spin_lock_irq(&cm.lock);
idr_remove(&cm.local_id_table,
(__force int) (local_id ^ cm.random_id_operand));
spin_unlock_irq(&cm.lock);
xa_erase_irq(&cm.local_id_table, cm_local_id(local_id));
}
static struct cm_id_private * cm_get_id(__be32 local_id, __be32 remote_id)
{
struct cm_id_private *cm_id_priv;
cm_id_priv = idr_find(&cm.local_id_table,
(__force int) (local_id ^ cm.random_id_operand));
cm_id_priv = xa_load(&cm.local_id_table, cm_local_id(local_id));
if (cm_id_priv) {
if (cm_id_priv->id.remote_id == remote_id)
atomic_inc(&cm_id_priv->refcount);
@ -1988,11 +1985,12 @@ static int cm_req_handler(struct cm_work *work)
grh = rdma_ah_read_grh(&cm_id_priv->av.ah_attr);
gid_attr = grh->sgid_attr;
if (gid_attr && gid_attr->ndev) {
if (gid_attr &&
rdma_protocol_roce(work->port->cm_dev->ib_device,
work->port->port_num)) {
work->path[0].rec_type =
sa_conv_gid_to_pathrec_type(gid_attr->gid_type);
} else {
/* If no GID attribute or ndev is null, it is not RoCE. */
cm_path_set_rec_type(work->port->cm_dev->ib_device,
work->port->port_num,
&work->path[0],
@ -2824,9 +2822,8 @@ static struct cm_id_private * cm_acquire_rejected_id(struct cm_rej_msg *rej_msg)
spin_unlock_irq(&cm.lock);
return NULL;
}
cm_id_priv = idr_find(&cm.local_id_table, (__force int)
(timewait_info->work.local_id ^
cm.random_id_operand));
cm_id_priv = xa_load(&cm.local_id_table,
cm_local_id(timewait_info->work.local_id));
if (cm_id_priv) {
if (cm_id_priv->id.remote_id == remote_id)
atomic_inc(&cm_id_priv->refcount);
@ -4276,18 +4273,6 @@ static struct kobj_type cm_counter_obj_type = {
.default_attrs = cm_counter_default_attrs
};
static void cm_release_port_obj(struct kobject *obj)
{
struct cm_port *cm_port;
cm_port = container_of(obj, struct cm_port, port_obj);
kfree(cm_port);
}
static struct kobj_type cm_port_obj_type = {
.release = cm_release_port_obj
};
static char *cm_devnode(struct device *dev, umode_t *mode)
{
if (mode)
@ -4306,19 +4291,12 @@ static int cm_create_port_fs(struct cm_port *port)
{
int i, ret;
ret = kobject_init_and_add(&port->port_obj, &cm_port_obj_type,
&port->cm_dev->device->kobj,
"%d", port->port_num);
if (ret) {
kfree(port);
return ret;
}
for (i = 0; i < CM_COUNTER_GROUPS; i++) {
ret = kobject_init_and_add(&port->counter_group[i].obj,
&cm_counter_obj_type,
&port->port_obj,
"%s", counter_group_names[i]);
ret = ib_port_register_module_stat(port->cm_dev->ib_device,
port->port_num,
&port->counter_group[i].obj,
&cm_counter_obj_type,
counter_group_names[i]);
if (ret)
goto error;
}
@ -4327,8 +4305,7 @@ static int cm_create_port_fs(struct cm_port *port)
error:
while (i--)
kobject_put(&port->counter_group[i].obj);
kobject_put(&port->port_obj);
ib_port_unregister_module_stat(&port->counter_group[i].obj);
return ret;
}
@ -4338,9 +4315,8 @@ static void cm_remove_port_fs(struct cm_port *port)
int i;
for (i = 0; i < CM_COUNTER_GROUPS; i++)
kobject_put(&port->counter_group[i].obj);
ib_port_unregister_module_stat(&port->counter_group[i].obj);
kobject_put(&port->port_obj);
}
static void cm_add_one(struct ib_device *ib_device)
@ -4367,13 +4343,6 @@ static void cm_add_one(struct ib_device *ib_device)
cm_dev->ib_device = ib_device;
cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
cm_dev->going_down = 0;
cm_dev->device = device_create(&cm_class, &ib_device->dev,
MKDEV(0, 0), NULL,
"%s", dev_name(&ib_device->dev));
if (IS_ERR(cm_dev->device)) {
kfree(cm_dev);
return;
}
set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
@ -4440,7 +4409,6 @@ error1:
cm_remove_port_fs(port);
}
free:
device_unregister(cm_dev->device);
kfree(cm_dev);
}
@ -4494,7 +4462,6 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
cm_remove_port_fs(port);
}
device_unregister(cm_dev->device);
kfree(cm_dev);
}
@ -4502,7 +4469,6 @@ static int __init ib_cm_init(void)
{
int ret;
memset(&cm, 0, sizeof cm);
INIT_LIST_HEAD(&cm.device_list);
rwlock_init(&cm.device_lock);
spin_lock_init(&cm.lock);
@ -4512,7 +4478,7 @@ static int __init ib_cm_init(void)
cm.remote_id_table = RB_ROOT;
cm.remote_qp_table = RB_ROOT;
cm.remote_sidr_table = RB_ROOT;
idr_init(&cm.local_id_table);
xa_init_flags(&cm.local_id_table, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
get_random_bytes(&cm.random_id_operand, sizeof cm.random_id_operand);
INIT_LIST_HEAD(&cm.timewait_list);
@ -4538,7 +4504,6 @@ error3:
error2:
class_unregister(&cm_class);
error1:
idr_destroy(&cm.local_id_table);
return ret;
}
@ -4560,9 +4525,8 @@ static void __exit ib_cm_cleanup(void)
}
class_unregister(&cm_class);
idr_destroy(&cm.local_id_table);
WARN_ON(!xa_empty(&cm.local_id_table));
}
module_init(ib_cm_init);
module_exit(ib_cm_cleanup);

View File

@ -98,7 +98,7 @@ struct cm_req_msg {
u32 private_data[IB_CM_REQ_PRIVATE_DATA_SIZE / sizeof(u32)];
} __attribute__ ((packed));
} __packed;
static inline __be32 cm_req_get_local_qpn(struct cm_req_msg *req_msg)
{
@ -423,7 +423,7 @@ enum cm_msg_response {
u8 private_data[IB_CM_MRA_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline u8 cm_mra_get_msg_mraed(struct cm_mra_msg *mra_msg)
{
@ -461,7 +461,7 @@ struct cm_rej_msg {
u8 private_data[IB_CM_REJ_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline u8 cm_rej_get_msg_rejected(struct cm_rej_msg *rej_msg)
{
@ -506,7 +506,7 @@ struct cm_rep_msg {
u8 private_data[IB_CM_REP_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline __be32 cm_rep_get_local_qpn(struct cm_rep_msg *rep_msg)
{
@ -614,7 +614,7 @@ struct cm_rtu_msg {
u8 private_data[IB_CM_RTU_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
struct cm_dreq_msg {
struct ib_mad_hdr hdr;
@ -626,7 +626,7 @@ struct cm_dreq_msg {
u8 private_data[IB_CM_DREQ_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline __be32 cm_dreq_get_remote_qpn(struct cm_dreq_msg *dreq_msg)
{
@ -647,7 +647,7 @@ struct cm_drep_msg {
u8 private_data[IB_CM_DREP_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
struct cm_lap_msg {
struct ib_mad_hdr hdr;
@ -675,7 +675,7 @@ struct cm_lap_msg {
u8 offset63;
u8 private_data[IB_CM_LAP_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline __be32 cm_lap_get_remote_qpn(struct cm_lap_msg *lap_msg)
{
@ -784,7 +784,7 @@ struct cm_apr_msg {
u8 info[IB_CM_APR_INFO_LENGTH];
u8 private_data[IB_CM_APR_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
struct cm_sidr_req_msg {
struct ib_mad_hdr hdr;
@ -795,7 +795,7 @@ struct cm_sidr_req_msg {
__be64 service_id;
u32 private_data[IB_CM_SIDR_REQ_PRIVATE_DATA_SIZE / sizeof(u32)];
} __attribute__ ((packed));
} __packed;
struct cm_sidr_rep_msg {
struct ib_mad_hdr hdr;
@ -811,7 +811,7 @@ struct cm_sidr_rep_msg {
u8 info[IB_CM_SIDR_REP_INFO_LENGTH];
u8 private_data[IB_CM_SIDR_REP_PRIVATE_DATA_SIZE];
} __attribute__ ((packed));
} __packed;
static inline __be32 cm_sidr_rep_get_qpn(struct cm_sidr_rep_msg *sidr_rep_msg)
{

View File

@ -39,7 +39,7 @@
#include <linux/mutex.h>
#include <linux/random.h>
#include <linux/igmp.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/inetdevice.h>
#include <linux/slab.h>
#include <linux/module.h>
@ -191,10 +191,10 @@ static struct workqueue_struct *cma_wq;
static unsigned int cma_pernet_id;
struct cma_pernet {
struct idr tcp_ps;
struct idr udp_ps;
struct idr ipoib_ps;
struct idr ib_ps;
struct xarray tcp_ps;
struct xarray udp_ps;
struct xarray ipoib_ps;
struct xarray ib_ps;
};
static struct cma_pernet *cma_pernet(struct net *net)
@ -202,7 +202,8 @@ static struct cma_pernet *cma_pernet(struct net *net)
return net_generic(net, cma_pernet_id);
}
static struct idr *cma_pernet_idr(struct net *net, enum rdma_ucm_port_space ps)
static
struct xarray *cma_pernet_xa(struct net *net, enum rdma_ucm_port_space ps)
{
struct cma_pernet *pernet = cma_pernet(net);
@ -247,25 +248,25 @@ struct class_port_info_context {
static int cma_ps_alloc(struct net *net, enum rdma_ucm_port_space ps,
struct rdma_bind_list *bind_list, int snum)
{
struct idr *idr = cma_pernet_idr(net, ps);
struct xarray *xa = cma_pernet_xa(net, ps);
return idr_alloc(idr, bind_list, snum, snum + 1, GFP_KERNEL);
return xa_insert(xa, snum, bind_list, GFP_KERNEL);
}
static struct rdma_bind_list *cma_ps_find(struct net *net,
enum rdma_ucm_port_space ps, int snum)
{
struct idr *idr = cma_pernet_idr(net, ps);
struct xarray *xa = cma_pernet_xa(net, ps);
return idr_find(idr, snum);
return xa_load(xa, snum);
}
static void cma_ps_remove(struct net *net, enum rdma_ucm_port_space ps,
int snum)
{
struct idr *idr = cma_pernet_idr(net, ps);
struct xarray *xa = cma_pernet_xa(net, ps);
idr_remove(idr, snum);
xa_erase(xa, snum);
}
enum {
@ -615,6 +616,9 @@ cma_validate_port(struct ib_device *device, u8 port,
int dev_type = dev_addr->dev_type;
struct net_device *ndev = NULL;
if (!rdma_dev_access_netns(device, id_priv->id.route.addr.dev_addr.net))
return ERR_PTR(-ENODEV);
if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
return ERR_PTR(-ENODEV);
@ -1173,18 +1177,31 @@ static inline bool cma_any_addr(const struct sockaddr *addr)
return cma_zero_addr(addr) || cma_loopback_addr(addr);
}
static int cma_addr_cmp(struct sockaddr *src, struct sockaddr *dst)
static int cma_addr_cmp(const struct sockaddr *src, const struct sockaddr *dst)
{
if (src->sa_family != dst->sa_family)
return -1;
switch (src->sa_family) {
case AF_INET:
return ((struct sockaddr_in *) src)->sin_addr.s_addr !=
((struct sockaddr_in *) dst)->sin_addr.s_addr;
case AF_INET6:
return ipv6_addr_cmp(&((struct sockaddr_in6 *) src)->sin6_addr,
&((struct sockaddr_in6 *) dst)->sin6_addr);
return ((struct sockaddr_in *)src)->sin_addr.s_addr !=
((struct sockaddr_in *)dst)->sin_addr.s_addr;
case AF_INET6: {
struct sockaddr_in6 *src_addr6 = (struct sockaddr_in6 *)src;
struct sockaddr_in6 *dst_addr6 = (struct sockaddr_in6 *)dst;
bool link_local;
if (ipv6_addr_cmp(&src_addr6->sin6_addr,
&dst_addr6->sin6_addr))
return 1;
link_local = ipv6_addr_type(&dst_addr6->sin6_addr) &
IPV6_ADDR_LINKLOCAL;
/* Link local must match their scope_ids */
return link_local ? (src_addr6->sin6_scope_id !=
dst_addr6->sin6_scope_id) :
0;
}
default:
return ib_addr_cmp(&((struct sockaddr_ib *) src)->sib_addr,
&((struct sockaddr_ib *) dst)->sib_addr);
@ -1469,6 +1486,7 @@ static struct net_device *
roce_get_net_dev_by_cm_event(const struct ib_cm_event *ib_event)
{
const struct ib_gid_attr *sgid_attr = NULL;
struct net_device *ndev;
if (ib_event->event == IB_CM_REQ_RECEIVED)
sgid_attr = ib_event->param.req_rcvd.ppath_sgid_attr;
@ -1477,8 +1495,15 @@ roce_get_net_dev_by_cm_event(const struct ib_cm_event *ib_event)
if (!sgid_attr)
return NULL;
dev_hold(sgid_attr->ndev);
return sgid_attr->ndev;
rcu_read_lock();
ndev = rdma_read_gid_attr_ndev_rcu(sgid_attr);
if (IS_ERR(ndev))
ndev = NULL;
else
dev_hold(ndev);
rcu_read_unlock();
return ndev;
}
static struct net_device *cma_get_net_dev(const struct ib_cm_event *ib_event,
@ -3247,7 +3272,7 @@ static int cma_alloc_port(enum rdma_ucm_port_space ps,
goto err;
bind_list->ps = ps;
bind_list->port = (unsigned short)ret;
bind_list->port = snum;
cma_bind_port(bind_list, id_priv);
return 0;
err:
@ -4655,10 +4680,10 @@ static int cma_init_net(struct net *net)
{
struct cma_pernet *pernet = cma_pernet(net);
idr_init(&pernet->tcp_ps);
idr_init(&pernet->udp_ps);
idr_init(&pernet->ipoib_ps);
idr_init(&pernet->ib_ps);
xa_init(&pernet->tcp_ps);
xa_init(&pernet->udp_ps);
xa_init(&pernet->ipoib_ps);
xa_init(&pernet->ib_ps);
return 0;
}
@ -4667,10 +4692,10 @@ static void cma_exit_net(struct net *net)
{
struct cma_pernet *pernet = cma_pernet(net);
idr_destroy(&pernet->tcp_ps);
idr_destroy(&pernet->udp_ps);
idr_destroy(&pernet->ipoib_ps);
idr_destroy(&pernet->ib_ps);
WARN_ON(!xa_empty(&pernet->tcp_ps));
WARN_ON(!xa_empty(&pernet->udp_ps));
WARN_ON(!xa_empty(&pernet->ipoib_ps));
WARN_ON(!xa_empty(&pernet->ib_ps));
}
static struct pernet_operations cma_pernet_operations = {

View File

@ -55,6 +55,7 @@ struct pkey_index_qp_list {
};
extern const struct attribute_group ib_dev_attr_group;
extern bool ib_devices_shared_netns;
int ib_device_register_sysfs(struct ib_device *device);
void ib_device_unregister_sysfs(struct ib_device *device);
@ -279,7 +280,8 @@ static inline void ib_mad_agent_security_change(void)
}
#endif
struct ib_device *ib_device_get_by_index(u32 ifindex);
struct ib_device *ib_device_get_by_index(const struct net *net, u32 index);
/* RDMA device netlink */
void nldev_init(void);
void nldev_exit(void);
@ -302,6 +304,7 @@ static inline struct ib_qp *_ib_create_qp(struct ib_device *dev,
qp->device = dev;
qp->pd = pd;
qp->uobject = uobj;
qp->real_qp = qp;
/*
* We don't track XRC QPs for now, because they don't have PD
* and more importantly they are created internaly by driver,
@ -336,4 +339,17 @@ int roce_resolve_route_from_path(struct sa_path_rec *rec,
const struct ib_gid_attr *attr);
struct net_device *rdma_read_gid_attr_ndev_rcu(const struct ib_gid_attr *attr);
void ib_free_port_attrs(struct ib_core_device *coredev);
int ib_setup_port_attrs(struct ib_core_device *coredev);
int rdma_compatdev_set(u8 enable);
int ib_port_register_module_stat(struct ib_device *device, u8 port_num,
struct kobject *kobj, struct kobj_type *ktype,
const char *name);
void ib_port_unregister_module_stat(struct kobject *kobj);
int ib_device_set_netns_put(struct sk_buff *skb,
struct ib_device *dev, u32 ns_fd);
#endif /* _CORE_PRIV_H */

View File

@ -128,15 +128,17 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
* @comp_vector: HCA completion vectors for this CQ
* @poll_ctx: context to poll the CQ from.
* @caller: module owner name.
* @udata: Valid user data or NULL for kernel object
*
* This is the proper interface to allocate a CQ for in-kernel users. A
* CQ allocated with this interface will automatically be polled from the
* specified context. The ULP must use wr->wr_cqe instead of wr->wr_id
* to use this CQ abstraction.
*/
struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private,
int nr_cqe, int comp_vector,
enum ib_poll_context poll_ctx, const char *caller)
struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
int nr_cqe, int comp_vector,
enum ib_poll_context poll_ctx,
const char *caller, struct ib_udata *udata)
{
struct ib_cq_init_attr cq_attr = {
.cqe = nr_cqe,
@ -145,7 +147,7 @@ struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private,
struct ib_cq *cq;
int ret = -ENOMEM;
cq = dev->ops.create_cq(dev, &cq_attr, NULL, NULL);
cq = dev->ops.create_cq(dev, &cq_attr, NULL);
if (IS_ERR(cq))
return cq;
@ -193,16 +195,17 @@ out_free_wc:
kfree(cq->wc);
rdma_restrack_del(&cq->res);
out_destroy_cq:
cq->device->ops.destroy_cq(cq);
cq->device->ops.destroy_cq(cq, udata);
return ERR_PTR(ret);
}
EXPORT_SYMBOL(__ib_alloc_cq);
EXPORT_SYMBOL(__ib_alloc_cq_user);
/**
* ib_free_cq - free a completion queue
* @cq: completion queue to free.
* @udata: User data or NULL for kernel object
*/
void ib_free_cq(struct ib_cq *cq)
void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
{
int ret;
@ -225,7 +228,7 @@ void ib_free_cq(struct ib_cq *cq)
kfree(cq->wc);
rdma_restrack_del(&cq->res);
ret = cq->device->ops.destroy_cq(cq);
ret = cq->device->ops.destroy_cq(cq, udata);
WARN_ON_ONCE(ret);
}
EXPORT_SYMBOL(ib_free_cq);
EXPORT_SYMBOL(ib_free_cq_user);

View File

@ -38,6 +38,8 @@
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/netdevice.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
#include <linux/security.h>
#include <linux/notifier.h>
#include <linux/hashtable.h>
@ -101,6 +103,54 @@ static DECLARE_RWSEM(clients_rwsem);
* be registered.
*/
#define CLIENT_DATA_REGISTERED XA_MARK_1
/**
* struct rdma_dev_net - rdma net namespace metadata for a net
* @net: Pointer to owner net namespace
* @id: xarray id to identify the net namespace.
*/
struct rdma_dev_net {
possible_net_t net;
u32 id;
};
static unsigned int rdma_dev_net_id;
/*
* A list of net namespaces is maintained in an xarray. This is necessary
* because we can't get the locking right using the existing net ns list. We
* would require a init_net callback after the list is updated.
*/
static DEFINE_XARRAY_FLAGS(rdma_nets, XA_FLAGS_ALLOC);
/*
* rwsem to protect accessing the rdma_nets xarray entries.
*/
static DECLARE_RWSEM(rdma_nets_rwsem);
bool ib_devices_shared_netns = true;
module_param_named(netns_mode, ib_devices_shared_netns, bool, 0444);
MODULE_PARM_DESC(netns_mode,
"Share device among net namespaces; default=1 (shared)");
/**
* rdma_dev_access_netns() - Return whether a rdma device can be accessed
* from a specified net namespace or not.
* @device: Pointer to rdma device which needs to be checked
* @net: Pointer to net namesapce for which access to be checked
*
* rdma_dev_access_netns() - Return whether a rdma device can be accessed
* from a specified net namespace or not. When
* rdma device is in shared mode, it ignores the
* net namespace. When rdma device is exclusive
* to a net namespace, rdma device net namespace is
* checked against the specified one.
*/
bool rdma_dev_access_netns(const struct ib_device *dev, const struct net *net)
{
return (ib_devices_shared_netns ||
net_eq(read_pnet(&dev->coredev.rdma_net), net));
}
EXPORT_SYMBOL(rdma_dev_access_netns);
/*
* xarray has this behavior where it won't iterate over NULL values stored in
* allocated arrays. So we need our own iterator to see all values stored in
@ -147,10 +197,73 @@ static int ib_security_change(struct notifier_block *nb, unsigned long event,
static void ib_policy_change_task(struct work_struct *work);
static DECLARE_WORK(ib_policy_change_work, ib_policy_change_task);
static void __ibdev_printk(const char *level, const struct ib_device *ibdev,
struct va_format *vaf)
{
if (ibdev && ibdev->dev.parent)
dev_printk_emit(level[1] - '0',
ibdev->dev.parent,
"%s %s %s: %pV",
dev_driver_string(ibdev->dev.parent),
dev_name(ibdev->dev.parent),
dev_name(&ibdev->dev),
vaf);
else if (ibdev)
printk("%s%s: %pV",
level, dev_name(&ibdev->dev), vaf);
else
printk("%s(NULL ib_device): %pV", level, vaf);
}
void ibdev_printk(const char *level, const struct ib_device *ibdev,
const char *format, ...)
{
struct va_format vaf;
va_list args;
va_start(args, format);
vaf.fmt = format;
vaf.va = &args;
__ibdev_printk(level, ibdev, &vaf);
va_end(args);
}
EXPORT_SYMBOL(ibdev_printk);
#define define_ibdev_printk_level(func, level) \
void func(const struct ib_device *ibdev, const char *fmt, ...) \
{ \
struct va_format vaf; \
va_list args; \
\
va_start(args, fmt); \
\
vaf.fmt = fmt; \
vaf.va = &args; \
\
__ibdev_printk(level, ibdev, &vaf); \
\
va_end(args); \
} \
EXPORT_SYMBOL(func);
define_ibdev_printk_level(ibdev_emerg, KERN_EMERG);
define_ibdev_printk_level(ibdev_alert, KERN_ALERT);
define_ibdev_printk_level(ibdev_crit, KERN_CRIT);
define_ibdev_printk_level(ibdev_err, KERN_ERR);
define_ibdev_printk_level(ibdev_warn, KERN_WARNING);
define_ibdev_printk_level(ibdev_notice, KERN_NOTICE);
define_ibdev_printk_level(ibdev_info, KERN_INFO);
static struct notifier_block ibdev_lsm_nb = {
.notifier_call = ib_security_change,
};
static int rdma_dev_change_netns(struct ib_device *device, struct net *cur_net,
struct net *net);
/* Pointer to the RCU head at the start of the ib_port_data array */
struct ib_port_data_rcu {
struct rcu_head rcu_head;
@ -200,16 +313,22 @@ static int ib_device_check_mandatory(struct ib_device *device)
* Caller must perform ib_device_put() to return the device reference count
* when ib_device_get_by_index() returns valid device pointer.
*/
struct ib_device *ib_device_get_by_index(u32 index)
struct ib_device *ib_device_get_by_index(const struct net *net, u32 index)
{
struct ib_device *device;
down_read(&devices_rwsem);
device = xa_load(&devices, index);
if (device) {
if (!rdma_dev_access_netns(device, net)) {
device = NULL;
goto out;
}
if (!ib_device_try_get(device))
device = NULL;
}
out:
up_read(&devices_rwsem);
return device;
}
@ -268,6 +387,26 @@ struct ib_device *ib_device_get_by_name(const char *name,
}
EXPORT_SYMBOL(ib_device_get_by_name);
static int rename_compat_devs(struct ib_device *device)
{
struct ib_core_device *cdev;
unsigned long index;
int ret = 0;
mutex_lock(&device->compat_devs_mutex);
xa_for_each (&device->compat_devs, index, cdev) {
ret = device_rename(&cdev->dev, dev_name(&device->dev));
if (ret) {
dev_warn(&cdev->dev,
"Fail to rename compatdev to new name %s\n",
dev_name(&device->dev));
break;
}
}
mutex_unlock(&device->compat_devs_mutex);
return ret;
}
int ib_device_rename(struct ib_device *ibdev, const char *name)
{
int ret;
@ -287,6 +426,7 @@ int ib_device_rename(struct ib_device *ibdev, const char *name)
if (ret)
goto out;
strlcpy(ibdev->name, name, IB_DEVICE_NAME_MAX);
ret = rename_compat_devs(ibdev);
out:
up_write(&devices_rwsem);
return ret;
@ -336,6 +476,7 @@ static void ib_device_release(struct device *device)
WARN_ON(refcount_read(&dev->refcount));
ib_cache_release_one(dev);
ib_security_release_port_pkey_list(dev);
xa_destroy(&dev->compat_devs);
xa_destroy(&dev->client_data);
if (dev->port_data)
kfree_rcu(container_of(dev->port_data, struct ib_port_data_rcu,
@ -357,12 +498,42 @@ static int ib_device_uevent(struct device *device,
return 0;
}
static const void *net_namespace(struct device *d)
{
struct ib_core_device *coredev =
container_of(d, struct ib_core_device, dev);
return read_pnet(&coredev->rdma_net);
}
static struct class ib_class = {
.name = "infiniband",
.dev_release = ib_device_release,
.dev_uevent = ib_device_uevent,
.ns_type = &net_ns_type_operations,
.namespace = net_namespace,
};
static void rdma_init_coredev(struct ib_core_device *coredev,
struct ib_device *dev, struct net *net)
{
/* This BUILD_BUG_ON is intended to catch layout change
* of union of ib_core_device and device.
* dev must be the first element as ib_core and providers
* driver uses it. Adding anything in ib_core_device before
* device will break this assumption.
*/
BUILD_BUG_ON(offsetof(struct ib_device, coredev.dev) !=
offsetof(struct ib_device, dev));
coredev->dev.class = &ib_class;
coredev->dev.groups = dev->groups;
device_initialize(&coredev->dev);
coredev->owner = dev;
INIT_LIST_HEAD(&coredev->port_list);
write_pnet(&coredev->rdma_net, net);
}
/**
* _ib_alloc_device - allocate an IB device struct
* @size:size of structure to allocate
@ -389,10 +560,8 @@ struct ib_device *_ib_alloc_device(size_t size)
return NULL;
}
device->dev.class = &ib_class;
device->groups[0] = &ib_dev_attr_group;
device->dev.groups = device->groups;
device_initialize(&device->dev);
rdma_init_coredev(&device->coredev, device, &init_net);
INIT_LIST_HEAD(&device->event_handler_list);
spin_lock_init(&device->event_handler_lock);
@ -403,7 +572,8 @@ struct ib_device *_ib_alloc_device(size_t size)
*/
xa_init_flags(&device->client_data, XA_FLAGS_ALLOC);
init_rwsem(&device->client_data_rwsem);
INIT_LIST_HEAD(&device->port_list);
xa_init_flags(&device->compat_devs, XA_FLAGS_ALLOC);
mutex_init(&device->compat_devs_mutex);
init_completion(&device->unreg_completion);
INIT_WORK(&device->unregistration_work, ib_unregister_work);
@ -436,6 +606,7 @@ void ib_dealloc_device(struct ib_device *device)
/* Expedite releasing netdev references */
free_netdevs(device);
WARN_ON(!xa_empty(&device->compat_devs));
WARN_ON(!xa_empty(&device->client_data));
WARN_ON(refcount_read(&device->refcount));
rdma_restrack_clean(device);
@ -644,6 +815,283 @@ static int ib_security_change(struct notifier_block *nb, unsigned long event,
return NOTIFY_OK;
}
static void compatdev_release(struct device *dev)
{
struct ib_core_device *cdev =
container_of(dev, struct ib_core_device, dev);
kfree(cdev);
}
static int add_one_compat_dev(struct ib_device *device,
struct rdma_dev_net *rnet)
{
struct ib_core_device *cdev;
int ret;
lockdep_assert_held(&rdma_nets_rwsem);
if (!ib_devices_shared_netns)
return 0;
/*
* Create and add compat device in all namespaces other than where it
* is currently bound to.
*/
if (net_eq(read_pnet(&rnet->net),
read_pnet(&device->coredev.rdma_net)))
return 0;
/*
* The first of init_net() or ib_register_device() to take the
* compat_devs_mutex wins and gets to add the device. Others will wait
* for completion here.
*/
mutex_lock(&device->compat_devs_mutex);
cdev = xa_load(&device->compat_devs, rnet->id);
if (cdev) {
ret = 0;
goto done;
}
ret = xa_reserve(&device->compat_devs, rnet->id, GFP_KERNEL);
if (ret)
goto done;
cdev = kzalloc(sizeof(*cdev), GFP_KERNEL);
if (!cdev) {
ret = -ENOMEM;
goto cdev_err;
}
cdev->dev.parent = device->dev.parent;
rdma_init_coredev(cdev, device, read_pnet(&rnet->net));
cdev->dev.release = compatdev_release;
dev_set_name(&cdev->dev, "%s", dev_name(&device->dev));
ret = device_add(&cdev->dev);
if (ret)
goto add_err;
ret = ib_setup_port_attrs(cdev);
if (ret)
goto port_err;
ret = xa_err(xa_store(&device->compat_devs, rnet->id,
cdev, GFP_KERNEL));
if (ret)
goto insert_err;
mutex_unlock(&device->compat_devs_mutex);
return 0;
insert_err:
ib_free_port_attrs(cdev);
port_err:
device_del(&cdev->dev);
add_err:
put_device(&cdev->dev);
cdev_err:
xa_release(&device->compat_devs, rnet->id);
done:
mutex_unlock(&device->compat_devs_mutex);
return ret;
}
static void remove_one_compat_dev(struct ib_device *device, u32 id)
{
struct ib_core_device *cdev;
mutex_lock(&device->compat_devs_mutex);
cdev = xa_erase(&device->compat_devs, id);
mutex_unlock(&device->compat_devs_mutex);
if (cdev) {
ib_free_port_attrs(cdev);
device_del(&cdev->dev);
put_device(&cdev->dev);
}
}
static void remove_compat_devs(struct ib_device *device)
{
struct ib_core_device *cdev;
unsigned long index;
xa_for_each (&device->compat_devs, index, cdev)
remove_one_compat_dev(device, index);
}
static int add_compat_devs(struct ib_device *device)
{
struct rdma_dev_net *rnet;
unsigned long index;
int ret = 0;
lockdep_assert_held(&devices_rwsem);
down_read(&rdma_nets_rwsem);
xa_for_each (&rdma_nets, index, rnet) {
ret = add_one_compat_dev(device, rnet);
if (ret)
break;
}
up_read(&rdma_nets_rwsem);
return ret;
}
static void remove_all_compat_devs(void)
{
struct ib_compat_device *cdev;
struct ib_device *dev;
unsigned long index;
down_read(&devices_rwsem);
xa_for_each (&devices, index, dev) {
unsigned long c_index = 0;
/* Hold nets_rwsem so that any other thread modifying this
* system param can sync with this thread.
*/
down_read(&rdma_nets_rwsem);
xa_for_each (&dev->compat_devs, c_index, cdev)
remove_one_compat_dev(dev, c_index);
up_read(&rdma_nets_rwsem);
}
up_read(&devices_rwsem);
}
static int add_all_compat_devs(void)
{
struct rdma_dev_net *rnet;
struct ib_device *dev;
unsigned long index;
int ret = 0;
down_read(&devices_rwsem);
xa_for_each_marked (&devices, index, dev, DEVICE_REGISTERED) {
unsigned long net_index = 0;
/* Hold nets_rwsem so that any other thread modifying this
* system param can sync with this thread.
*/
down_read(&rdma_nets_rwsem);
xa_for_each (&rdma_nets, net_index, rnet) {
ret = add_one_compat_dev(dev, rnet);
if (ret)
break;
}
up_read(&rdma_nets_rwsem);
}
up_read(&devices_rwsem);
if (ret)
remove_all_compat_devs();
return ret;
}
int rdma_compatdev_set(u8 enable)
{
struct rdma_dev_net *rnet;
unsigned long index;
int ret = 0;
down_write(&rdma_nets_rwsem);
if (ib_devices_shared_netns == enable) {
up_write(&rdma_nets_rwsem);
return 0;
}
/* enable/disable of compat devices is not supported
* when more than default init_net exists.
*/
xa_for_each (&rdma_nets, index, rnet) {
ret++;
break;
}
if (!ret)
ib_devices_shared_netns = enable;
up_write(&rdma_nets_rwsem);
if (ret)
return -EBUSY;
if (enable)
ret = add_all_compat_devs();
else
remove_all_compat_devs();
return ret;
}
static void rdma_dev_exit_net(struct net *net)
{
struct rdma_dev_net *rnet = net_generic(net, rdma_dev_net_id);
struct ib_device *dev;
unsigned long index;
int ret;
down_write(&rdma_nets_rwsem);
/*
* Prevent the ID from being re-used and hide the id from xa_for_each.
*/
ret = xa_err(xa_store(&rdma_nets, rnet->id, NULL, GFP_KERNEL));
WARN_ON(ret);
up_write(&rdma_nets_rwsem);
down_read(&devices_rwsem);
xa_for_each (&devices, index, dev) {
get_device(&dev->dev);
/*
* Release the devices_rwsem so that pontentially blocking
* device_del, doesn't hold the devices_rwsem for too long.
*/
up_read(&devices_rwsem);
remove_one_compat_dev(dev, rnet->id);
/*
* If the real device is in the NS then move it back to init.
*/
rdma_dev_change_netns(dev, net, &init_net);
put_device(&dev->dev);
down_read(&devices_rwsem);
}
up_read(&devices_rwsem);
xa_erase(&rdma_nets, rnet->id);
}
static __net_init int rdma_dev_init_net(struct net *net)
{
struct rdma_dev_net *rnet = net_generic(net, rdma_dev_net_id);
unsigned long index;
struct ib_device *dev;
int ret;
/* No need to create any compat devices in default init_net. */
if (net_eq(net, &init_net))
return 0;
write_pnet(&rnet->net, net);
ret = xa_alloc(&rdma_nets, &rnet->id, rnet, xa_limit_32b, GFP_KERNEL);
if (ret)
return ret;
down_read(&devices_rwsem);
xa_for_each_marked (&devices, index, dev, DEVICE_REGISTERED) {
/* Hold nets_rwsem so that netlink command cannot change
* system configuration for device sharing mode.
*/
down_read(&rdma_nets_rwsem);
ret = add_one_compat_dev(dev, rnet);
up_read(&rdma_nets_rwsem);
if (ret)
break;
}
up_read(&devices_rwsem);
if (ret)
rdma_dev_exit_net(net);
return ret;
}
/*
* Assign the unique string device name and the unique device index. This is
* undone by ib_dealloc_device.
@ -711,6 +1159,9 @@ static void setup_dma_device(struct ib_device *device)
WARN_ON_ONCE(!parent);
device->dma_device = parent;
}
/* Setup default max segment size for all IB devices */
dma_set_max_seg_size(device->dma_device, SZ_2G);
}
/*
@ -765,8 +1216,12 @@ static void disable_device(struct ib_device *device)
ib_device_put(device);
wait_for_completion(&device->unreg_completion);
/* Expedite removing unregistered pointers from the hash table */
free_netdevs(device);
/*
* compat devices must be removed after device refcount drops to zero.
* Otherwise init_net() may add more compatdevs after removing compat
* devices and before device is disabled.
*/
remove_compat_devs(device);
}
/*
@ -807,7 +1262,8 @@ static int enable_device_and_get(struct ib_device *device)
break;
}
up_read(&clients_rwsem);
if (!ret)
ret = add_compat_devs(device);
out:
up_read(&devices_rwsem);
return ret;
@ -847,6 +1303,11 @@ int ib_register_device(struct ib_device *device, const char *name)
ib_device_register_rdmacg(device);
/*
* Ensure that ADD uevent is not fired because it
* is too early amd device is not initialized yet.
*/
dev_set_uevent_suppress(&device->dev, true);
ret = device_add(&device->dev);
if (ret)
goto cg_cleanup;
@ -859,6 +1320,9 @@ int ib_register_device(struct ib_device *device, const char *name)
}
ret = enable_device_and_get(device);
dev_set_uevent_suppress(&device->dev, false);
/* Mark for userspace that device is ready */
kobject_uevent(&device->dev.kobj, KOBJ_ADD);
if (ret) {
void (*dealloc_fn)(struct ib_device *);
@ -887,6 +1351,7 @@ int ib_register_device(struct ib_device *device, const char *name)
dev_cleanup:
device_del(&device->dev);
cg_cleanup:
dev_set_uevent_suppress(&device->dev, false);
ib_device_unregister_rdmacg(device);
ib_cache_cleanup_one(device);
return ret;
@ -908,6 +1373,10 @@ static void __ib_unregister_device(struct ib_device *ib_dev)
goto out;
disable_device(ib_dev);
/* Expedite removing unregistered pointers from the hash table */
free_netdevs(ib_dev);
ib_device_unregister_sysfs(ib_dev);
device_del(&ib_dev->dev);
ib_device_unregister_rdmacg(ib_dev);
@ -1038,6 +1507,126 @@ void ib_unregister_device_queued(struct ib_device *ib_dev)
}
EXPORT_SYMBOL(ib_unregister_device_queued);
/*
* The caller must pass in a device that has the kref held and the refcount
* released. If the device is in cur_net and still registered then it is moved
* into net.
*/
static int rdma_dev_change_netns(struct ib_device *device, struct net *cur_net,
struct net *net)
{
int ret2 = -EINVAL;
int ret;
mutex_lock(&device->unregistration_lock);
/*
* If a device not under ib_device_get() or if the unregistration_lock
* is not held, the namespace can be changed, or it can be unregistered.
* Check again under the lock.
*/
if (refcount_read(&device->refcount) == 0 ||
!net_eq(cur_net, read_pnet(&device->coredev.rdma_net))) {
ret = -ENODEV;
goto out;
}
kobject_uevent(&device->dev.kobj, KOBJ_REMOVE);
disable_device(device);
/*
* At this point no one can be using the device, so it is safe to
* change the namespace.
*/
write_pnet(&device->coredev.rdma_net, net);
down_read(&devices_rwsem);
/*
* Currently rdma devices are system wide unique. So the device name
* is guaranteed free in the new namespace. Publish the new namespace
* at the sysfs level.
*/
ret = device_rename(&device->dev, dev_name(&device->dev));
up_read(&devices_rwsem);
if (ret) {
dev_warn(&device->dev,
"%s: Couldn't rename device after namespace change\n",
__func__);
/* Try and put things back and re-enable the device */
write_pnet(&device->coredev.rdma_net, cur_net);
}
ret2 = enable_device_and_get(device);
if (ret2) {
/*
* This shouldn't really happen, but if it does, let the user
* retry at later point. So don't disable the device.
*/
dev_warn(&device->dev,
"%s: Couldn't re-enable device after namespace change\n",
__func__);
}
kobject_uevent(&device->dev.kobj, KOBJ_ADD);
ib_device_put(device);
out:
mutex_unlock(&device->unregistration_lock);
if (ret)
return ret;
return ret2;
}
int ib_device_set_netns_put(struct sk_buff *skb,
struct ib_device *dev, u32 ns_fd)
{
struct net *net;
int ret;
net = get_net_ns_by_fd(ns_fd);
if (IS_ERR(net)) {
ret = PTR_ERR(net);
goto net_err;
}
if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
ret = -EPERM;
goto ns_err;
}
/*
* Currently supported only for those providers which support
* disassociation and don't do port specific sysfs init. Once a
* port_cleanup infrastructure is implemented, this limitation will be
* removed.
*/
if (!dev->ops.disassociate_ucontext || dev->ops.init_port ||
ib_devices_shared_netns) {
ret = -EOPNOTSUPP;
goto ns_err;
}
get_device(&dev->dev);
ib_device_put(dev);
ret = rdma_dev_change_netns(dev, current->nsproxy->net_ns, net);
put_device(&dev->dev);
put_net(net);
return ret;
ns_err:
put_net(net);
net_err:
ib_device_put(dev);
return ret;
}
static struct pernet_operations rdma_dev_net_ops = {
.init = rdma_dev_init_net,
.exit = rdma_dev_exit_net,
.id = &rdma_dev_net_id,
.size = sizeof(struct rdma_dev_net),
};
static int assign_client_id(struct ib_client *client)
{
int ret;
@ -1515,6 +2104,9 @@ int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
down_read(&devices_rwsem);
xa_for_each_marked (&devices, index, dev, DEVICE_REGISTERED) {
if (!rdma_dev_access_netns(dev, sock_net(skb->sk)))
continue;
ret = nldev_cb(dev, skb, cb, idx);
if (ret)
break;
@ -1787,6 +2379,14 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, get_vf_config);
SET_DEVICE_OP(dev_ops, get_vf_stats);
SET_DEVICE_OP(dev_ops, init_port);
SET_DEVICE_OP(dev_ops, iw_accept);
SET_DEVICE_OP(dev_ops, iw_add_ref);
SET_DEVICE_OP(dev_ops, iw_connect);
SET_DEVICE_OP(dev_ops, iw_create_listen);
SET_DEVICE_OP(dev_ops, iw_destroy_listen);
SET_DEVICE_OP(dev_ops, iw_get_qp);
SET_DEVICE_OP(dev_ops, iw_reject);
SET_DEVICE_OP(dev_ops, iw_rem_ref);
SET_DEVICE_OP(dev_ops, map_mr_sg);
SET_DEVICE_OP(dev_ops, map_phys_fmr);
SET_DEVICE_OP(dev_ops, mmap);
@ -1823,7 +2423,9 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, set_vf_link_state);
SET_DEVICE_OP(dev_ops, unmap_fmr);
SET_OBJ_SIZE(dev_ops, ib_ah);
SET_OBJ_SIZE(dev_ops, ib_pd);
SET_OBJ_SIZE(dev_ops, ib_srq);
SET_OBJ_SIZE(dev_ops, ib_ucontext);
}
EXPORT_SYMBOL(ib_set_device_ops);
@ -1903,12 +2505,20 @@ static int __init ib_core_init(void)
goto err_sa;
}
ret = register_pernet_device(&rdma_dev_net_ops);
if (ret) {
pr_warn("Couldn't init compat dev. ret %d\n", ret);
goto err_compat;
}
nldev_init();
rdma_nl_register(RDMA_NL_LS, ibnl_ls_cb_table);
roce_gid_mgmt_init();
return 0;
err_compat:
unregister_lsm_notifier(&ibdev_lsm_nb);
err_sa:
ib_sa_cleanup();
err_mad:
@ -1933,6 +2543,7 @@ static void __exit ib_core_cleanup(void)
roce_gid_mgmt_cleanup();
nldev_exit();
rdma_nl_unregister(RDMA_NL_LS);
unregister_pernet_device(&rdma_dev_net_ops);
unregister_lsm_notifier(&ibdev_lsm_nb);
ib_sa_cleanup();
ib_mad_cleanup();
@ -1950,5 +2561,8 @@ static void __exit ib_core_cleanup(void)
MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_LS, 4);
subsys_initcall(ib_core_init);
/* ib core relies on netdev stack to first register net_ns_type_operations
* ns kobject type before ib_core initialization.
*/
fs_initcall(ib_core_init);
module_exit(ib_core_cleanup);

View File

@ -394,7 +394,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
cm_id_priv->state = IW_CM_STATE_DESTROYING;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
/* destroy the listening endpoint */
cm_id->device->iwcm->destroy_listen(cm_id);
cm_id->device->ops.iw_destroy_listen(cm_id);
spin_lock_irqsave(&cm_id_priv->lock, flags);
break;
case IW_CM_STATE_ESTABLISHED:
@ -417,7 +417,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
*/
cm_id_priv->state = IW_CM_STATE_DESTROYING;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
cm_id->device->iwcm->reject(cm_id, NULL, 0);
cm_id->device->ops.iw_reject(cm_id, NULL, 0);
spin_lock_irqsave(&cm_id_priv->lock, flags);
break;
case IW_CM_STATE_CONN_SENT:
@ -427,7 +427,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
break;
}
if (cm_id_priv->qp) {
cm_id_priv->id.device->iwcm->rem_ref(cm_id_priv->qp);
cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
cm_id_priv->qp = NULL;
}
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
@ -504,7 +504,7 @@ static void iw_cm_check_wildcard(struct sockaddr_storage *pm_addr,
static int iw_cm_map(struct iw_cm_id *cm_id, bool active)
{
const char *devname = dev_name(&cm_id->device->dev);
const char *ifname = cm_id->device->iwcm->ifname;
const char *ifname = cm_id->device->iw_ifname;
struct iwpm_dev_data pm_reg_msg = {};
struct iwpm_sa_data pm_msg;
int status;
@ -526,7 +526,7 @@ static int iw_cm_map(struct iw_cm_id *cm_id, bool active)
cm_id->mapped = true;
pm_msg.loc_addr = cm_id->local_addr;
pm_msg.rem_addr = cm_id->remote_addr;
pm_msg.flags = (cm_id->device->iwcm->driver_flags & IW_F_NO_PORT_MAP) ?
pm_msg.flags = (cm_id->device->iw_driver_flags & IW_F_NO_PORT_MAP) ?
IWPM_FLAGS_NO_PORT_MAP : 0;
if (active)
status = iwpm_add_and_query_mapping(&pm_msg,
@ -577,7 +577,8 @@ int iw_cm_listen(struct iw_cm_id *cm_id, int backlog)
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
ret = iw_cm_map(cm_id, false);
if (!ret)
ret = cm_id->device->iwcm->create_listen(cm_id, backlog);
ret = cm_id->device->ops.iw_create_listen(cm_id,
backlog);
if (ret)
cm_id_priv->state = IW_CM_STATE_IDLE;
spin_lock_irqsave(&cm_id_priv->lock, flags);
@ -617,7 +618,7 @@ int iw_cm_reject(struct iw_cm_id *cm_id,
cm_id_priv->state = IW_CM_STATE_IDLE;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
ret = cm_id->device->iwcm->reject(cm_id, private_data,
ret = cm_id->device->ops.iw_reject(cm_id, private_data,
private_data_len);
clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
@ -653,25 +654,25 @@ int iw_cm_accept(struct iw_cm_id *cm_id,
return -EINVAL;
}
/* Get the ib_qp given the QPN */
qp = cm_id->device->iwcm->get_qp(cm_id->device, iw_param->qpn);
qp = cm_id->device->ops.iw_get_qp(cm_id->device, iw_param->qpn);
if (!qp) {
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
wake_up_all(&cm_id_priv->connect_wait);
return -EINVAL;
}
cm_id->device->iwcm->add_ref(qp);
cm_id->device->ops.iw_add_ref(qp);
cm_id_priv->qp = qp;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
ret = cm_id->device->iwcm->accept(cm_id, iw_param);
ret = cm_id->device->ops.iw_accept(cm_id, iw_param);
if (ret) {
/* An error on accept precludes provider events */
BUG_ON(cm_id_priv->state != IW_CM_STATE_CONN_RECV);
cm_id_priv->state = IW_CM_STATE_IDLE;
spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id_priv->qp) {
cm_id->device->iwcm->rem_ref(qp);
cm_id->device->ops.iw_rem_ref(qp);
cm_id_priv->qp = NULL;
}
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
@ -712,25 +713,25 @@ int iw_cm_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *iw_param)
}
/* Get the ib_qp given the QPN */
qp = cm_id->device->iwcm->get_qp(cm_id->device, iw_param->qpn);
qp = cm_id->device->ops.iw_get_qp(cm_id->device, iw_param->qpn);
if (!qp) {
ret = -EINVAL;
goto err;
}
cm_id->device->iwcm->add_ref(qp);
cm_id->device->ops.iw_add_ref(qp);
cm_id_priv->qp = qp;
cm_id_priv->state = IW_CM_STATE_CONN_SENT;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
ret = iw_cm_map(cm_id, true);
if (!ret)
ret = cm_id->device->iwcm->connect(cm_id, iw_param);
ret = cm_id->device->ops.iw_connect(cm_id, iw_param);
if (!ret)
return 0; /* success */
spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id_priv->qp) {
cm_id->device->iwcm->rem_ref(qp);
cm_id->device->ops.iw_rem_ref(qp);
cm_id_priv->qp = NULL;
}
cm_id_priv->state = IW_CM_STATE_IDLE;
@ -895,7 +896,7 @@ static int cm_conn_rep_handler(struct iwcm_id_private *cm_id_priv,
cm_id_priv->state = IW_CM_STATE_ESTABLISHED;
} else {
/* REJECTED or RESET */
cm_id_priv->id.device->iwcm->rem_ref(cm_id_priv->qp);
cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
cm_id_priv->qp = NULL;
cm_id_priv->state = IW_CM_STATE_IDLE;
}
@ -946,7 +947,7 @@ static int cm_close_handler(struct iwcm_id_private *cm_id_priv,
spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id_priv->qp) {
cm_id_priv->id.device->iwcm->rem_ref(cm_id_priv->qp);
cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
cm_id_priv->qp = NULL;
}
switch (cm_id_priv->state) {

View File

@ -3,7 +3,7 @@
* Copyright (c) 2005 Intel Corporation. All rights reserved.
* Copyright (c) 2005 Mellanox Technologies Ltd. All rights reserved.
* Copyright (c) 2009 HNR Consulting. All rights reserved.
* Copyright (c) 2014 Intel Corporation. All rights reserved.
* Copyright (c) 2014,2018 Intel Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@ -38,10 +38,10 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/dma-mapping.h>
#include <linux/idr.h>
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/security.h>
#include <linux/xarray.h>
#include <rdma/ib_cache.h>
#include "mad_priv.h"
@ -51,6 +51,32 @@
#include "opa_smi.h"
#include "agent.h"
#define CREATE_TRACE_POINTS
#include <trace/events/ib_mad.h>
#ifdef CONFIG_TRACEPOINTS
static void create_mad_addr_info(struct ib_mad_send_wr_private *mad_send_wr,
struct ib_mad_qp_info *qp_info,
struct trace_event_raw_ib_mad_send_template *entry)
{
u16 pkey;
struct ib_device *dev = qp_info->port_priv->device;
u8 pnum = qp_info->port_priv->port_num;
struct ib_ud_wr *wr = &mad_send_wr->send_wr;
struct rdma_ah_attr attr = {};
rdma_query_ah(wr->ah, &attr);
/* These are common */
entry->sl = attr.sl;
ib_query_pkey(dev, pnum, wr->pkey_index, &pkey);
entry->pkey = pkey;
entry->rqpn = wr->remote_qpn;
entry->rqkey = wr->remote_qkey;
entry->dlid = rdma_ah_get_dlid(&attr);
}
#endif
static int mad_sendq_size = IB_MAD_QP_SEND_SIZE;
static int mad_recvq_size = IB_MAD_QP_RECV_SIZE;
@ -59,12 +85,9 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests
module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
/*
* The mlx4 driver uses the top byte to distinguish which virtual function
* generated the MAD, so we must avoid using it.
*/
#define AGENT_ID_LIMIT (1 << 24)
static DEFINE_IDR(ib_mad_clients);
/* Client ID 0 is used for snoop-only clients */
static DEFINE_XARRAY_ALLOC1(ib_mad_clients);
static u32 ib_mad_client_next;
static struct list_head ib_mad_port_list;
/* Port list lock */
@ -389,18 +412,17 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
goto error4;
}
idr_preload(GFP_KERNEL);
idr_lock(&ib_mad_clients);
ret2 = idr_alloc_cyclic(&ib_mad_clients, mad_agent_priv, 0,
AGENT_ID_LIMIT, GFP_ATOMIC);
idr_unlock(&ib_mad_clients);
idr_preload_end();
/*
* The mlx4 driver uses the top byte to distinguish which virtual
* function generated the MAD, so we must avoid using it.
*/
ret2 = xa_alloc_cyclic(&ib_mad_clients, &mad_agent_priv->agent.hi_tid,
mad_agent_priv, XA_LIMIT(0, (1 << 24) - 1),
&ib_mad_client_next, GFP_KERNEL);
if (ret2 < 0) {
ret = ERR_PTR(ret2);
goto error5;
}
mad_agent_priv->agent.hi_tid = ret2;
/*
* Make sure MAD registration (if supplied)
@ -445,12 +467,11 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
}
spin_unlock_irq(&port_priv->reg_lock);
trace_ib_mad_create_agent(mad_agent_priv);
return &mad_agent_priv->agent;
error6:
spin_unlock_irq(&port_priv->reg_lock);
idr_lock(&ib_mad_clients);
idr_remove(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
idr_unlock(&ib_mad_clients);
xa_erase(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
error5:
ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
error4:
@ -602,6 +623,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
struct ib_mad_port_private *port_priv;
/* Note that we could still be handling received MADs */
trace_ib_mad_unregister_agent(mad_agent_priv);
/*
* Canceling all sends results in dropping received response
@ -614,9 +636,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
spin_lock_irq(&port_priv->reg_lock);
remove_mad_reg_req(mad_agent_priv);
spin_unlock_irq(&port_priv->reg_lock);
idr_lock(&ib_mad_clients);
idr_remove(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
idr_unlock(&ib_mad_clients);
xa_erase(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
flush_workqueue(port_priv->wq);
ib_cancel_rmpp_recvs(mad_agent_priv);
@ -821,6 +841,8 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
if (opa && smp->class_version == OPA_SM_CLASS_VERSION) {
u32 opa_drslid;
trace_ib_mad_handle_out_opa_smi(opa_smp);
if ((opa_get_smp_direction(opa_smp)
? opa_smp->route.dr.dr_dlid : opa_smp->route.dr.dr_slid) ==
OPA_LID_PERMISSIVE &&
@ -846,6 +868,8 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
opa_smi_check_local_returning_smp(opa_smp, device) == IB_SMI_DISCARD)
goto out;
} else {
trace_ib_mad_handle_out_ib_smi(smp);
if ((ib_get_smp_direction(smp) ? smp->dr_dlid : smp->dr_slid) ==
IB_LID_PERMISSIVE &&
smi_handle_dr_smp_send(smp, rdma_cap_ib_switch(device), port_num) ==
@ -1223,6 +1247,7 @@ int ib_send_mad(struct ib_mad_send_wr_private *mad_send_wr)
spin_lock_irqsave(&qp_info->send_queue.lock, flags);
if (qp_info->send_queue.count < qp_info->send_queue.max_active) {
trace_ib_mad_ib_send_mad(mad_send_wr, qp_info);
ret = ib_post_send(mad_agent->qp, &mad_send_wr->send_wr.wr,
NULL);
list = &qp_info->send_queue.list;
@ -1756,7 +1781,7 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
*/
hi_tid = be64_to_cpu(mad_hdr->tid) >> 32;
rcu_read_lock();
mad_agent = idr_find(&ib_mad_clients, hi_tid);
mad_agent = xa_load(&ib_mad_clients, hi_tid);
if (mad_agent && !atomic_inc_not_zero(&mad_agent->refcount))
mad_agent = NULL;
rcu_read_unlock();
@ -2077,6 +2102,8 @@ static enum smi_action handle_ib_smi(const struct ib_mad_port_private *port_priv
enum smi_forward_action retsmi;
struct ib_smp *smp = (struct ib_smp *)recv->mad;
trace_ib_mad_handle_ib_smi(smp);
if (smi_handle_dr_smp_recv(smp,
rdma_cap_ib_switch(port_priv->device),
port_num,
@ -2162,6 +2189,8 @@ handle_opa_smi(struct ib_mad_port_private *port_priv,
enum smi_forward_action retsmi;
struct opa_smp *smp = (struct opa_smp *)recv->mad;
trace_ib_mad_handle_opa_smi(smp);
if (opa_smi_handle_dr_smp_recv(smp,
rdma_cap_ib_switch(port_priv->device),
port_num,
@ -2286,6 +2315,9 @@ static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc)
if (!validate_mad((const struct ib_mad_hdr *)recv->mad, qp_info, opa))
goto out;
trace_ib_mad_recv_done_handler(qp_info, wc,
(struct ib_mad_hdr *)recv->mad);
mad_size = recv->mad_size;
response = alloc_mad_private(mad_size, GFP_KERNEL);
if (!response)
@ -2332,6 +2364,7 @@ static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc)
mad_agent = find_mad_agent(port_priv, (const struct ib_mad_hdr *)recv->mad);
if (mad_agent) {
trace_ib_mad_recv_done_agent(mad_agent);
ib_mad_complete_recv(mad_agent, &recv->header.recv_wc);
/*
* recv is freed up in error cases in ib_mad_complete_recv
@ -2496,6 +2529,9 @@ static void ib_mad_send_done(struct ib_cq *cq, struct ib_wc *wc)
send_queue = mad_list->mad_queue;
qp_info = send_queue->qp_info;
trace_ib_mad_send_done_agent(mad_send_wr->mad_agent_priv);
trace_ib_mad_send_done_handler(mad_send_wr, wc);
retry:
ib_dma_unmap_single(mad_send_wr->send_buf.mad_agent->device,
mad_send_wr->header_mapping,
@ -2527,6 +2563,7 @@ retry:
ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc);
if (queued_send_wr) {
trace_ib_mad_send_done_resend(queued_send_wr, qp_info);
ret = ib_post_send(qp_info->qp, &queued_send_wr->send_wr.wr,
NULL);
if (ret) {
@ -2574,6 +2611,7 @@ static bool ib_mad_send_error(struct ib_mad_port_private *port_priv,
if (mad_send_wr->retry) {
/* Repost send */
mad_send_wr->retry = 0;
trace_ib_mad_error_handler(mad_send_wr, qp_info);
ret = ib_post_send(qp_info->qp, &mad_send_wr->send_wr.wr,
NULL);
if (!ret)
@ -3356,9 +3394,6 @@ int ib_mad_init(void)
INIT_LIST_HEAD(&ib_mad_port_list);
/* Client ID 0 is used for snoop-only clients */
idr_alloc(&ib_mad_clients, NULL, 0, 0, GFP_KERNEL);
if (ib_register_client(&mad_client)) {
pr_err("Couldn't register ib_mad client\n");
return -EINVAL;

View File

@ -73,14 +73,14 @@ struct ib_mad_private_header {
struct ib_mad_recv_wc recv_wc;
struct ib_wc wc;
u64 mapping;
} __attribute__ ((packed));
} __packed;
struct ib_mad_private {
struct ib_mad_private_header header;
size_t mad_size;
struct ib_grh grh;
u8 mad[0];
} __attribute__ ((packed));
} __packed;
struct ib_rmpp_segment {
struct list_head list;

View File

@ -804,7 +804,6 @@ static void mcast_event_handler(struct ib_event_handler *handler,
switch (event->event) {
case IB_EVENT_PORT_ERR:
case IB_EVENT_LID_CHANGE:
case IB_EVENT_SM_CHANGE:
case IB_EVENT_CLIENT_REREGISTER:
mcast_groups_event(&dev->port[index], MCAST_GROUP_ERROR);
break;

View File

@ -116,6 +116,10 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
[RDMA_NLDEV_ATTR_RES_CTXN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_LINK_TYPE] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_ENTRY_STRLEN },
[RDMA_NLDEV_SYS_ATTR_NETNS_MODE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_PROTOCOL] = { .type = NLA_NUL_STRING,
.len = RDMA_NLDEV_ATTR_ENTRY_STRLEN },
[RDMA_NLDEV_NET_NS_FD] = { .type = NLA_U32 },
};
static int put_driver_name_print_type(struct sk_buff *msg, const char *name,
@ -198,6 +202,8 @@ static int fill_nldev_handle(struct sk_buff *msg, struct ib_device *device)
static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
{
char fw[IB_FW_VERSION_NAME_MAX];
int ret = 0;
u8 port;
if (fill_nldev_handle(msg, device))
return -EMSGSIZE;
@ -226,7 +232,25 @@ static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
return -EMSGSIZE;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_NODE_TYPE, device->node_type))
return -EMSGSIZE;
return 0;
/*
* Link type is determined on first port and mlx4 device
* which can potentially have two different link type for the same
* IB device is considered as better to be avoided in the future,
*/
port = rdma_start_port(device);
if (rdma_cap_opa_mad(device, port))
ret = nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_PROTOCOL, "opa");
else if (rdma_protocol_ib(device, port))
ret = nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_PROTOCOL, "ib");
else if (rdma_protocol_iwarp(device, port))
ret = nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_PROTOCOL, "iw");
else if (rdma_protocol_roce(device, port))
ret = nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_PROTOCOL, "roce");
else if (rdma_protocol_usnic(device, port))
ret = nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_PROTOCOL,
"usnic");
return ret;
}
static int fill_port_info(struct sk_buff *msg,
@ -615,7 +639,7 @@ static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -659,7 +683,7 @@ static int nldev_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -669,9 +693,20 @@ static int nldev_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
nla_strlcpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
IB_DEVICE_NAME_MAX);
err = ib_device_rename(device, name);
goto done;
}
if (tb[RDMA_NLDEV_NET_NS_FD]) {
u32 ns_fd;
ns_fd = nla_get_u32(tb[RDMA_NLDEV_NET_NS_FD]);
err = ib_device_set_netns_put(skb, device, ns_fd);
goto put_done;
}
done:
ib_device_put(device);
put_done:
return err;
}
@ -707,7 +742,7 @@ static int nldev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
{
/*
* There is no need to take lock, because
* we are relying on ib_core's lists_rwsem
* we are relying on ib_core's locking.
*/
return ib_enum_all_devs(_nldev_get_dumpit, skb, cb);
}
@ -730,7 +765,7 @@ static int nldev_port_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -784,7 +819,7 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
return -EINVAL;
ifindex = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(ifindex);
device = ib_device_get_by_index(sock_net(skb->sk), ifindex);
if (!device)
return -EINVAL;
@ -839,7 +874,7 @@ static int nldev_res_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -887,7 +922,6 @@ static int _nldev_res_get_dumpit(struct ib_device *device,
nlmsg_cancel(skb, nlh);
goto out;
}
nlmsg_end(skb, nlh);
idx++;
@ -988,7 +1022,7 @@ static int res_get_common_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -1085,7 +1119,7 @@ static int res_get_common_dumpit(struct sk_buff *skb,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -1300,7 +1334,7 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
device = ib_device_get_by_index(sock_net(skb->sk), index);
if (!device)
return -EINVAL;
@ -1313,6 +1347,55 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
return 0;
}
static int nldev_get_sys_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct nlmsghdr *nlh;
int err;
err = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, NULL);
if (err)
return err;
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
RDMA_NLDEV_CMD_SYS_GET),
0, 0);
err = nla_put_u8(skb, RDMA_NLDEV_SYS_ATTR_NETNS_MODE,
(u8)ib_devices_shared_netns);
if (err) {
nlmsg_cancel(skb, nlh);
return err;
}
nlmsg_end(skb, nlh);
return skb->len;
}
static int nldev_set_sys_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
u8 enable;
int err;
err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (err || !tb[RDMA_NLDEV_SYS_ATTR_NETNS_MODE])
return -EINVAL;
enable = nla_get_u8(tb[RDMA_NLDEV_SYS_ATTR_NETNS_MODE]);
/* Only 0 and 1 are supported */
if (enable > 1)
return -EINVAL;
err = rdma_compatdev_set(enable);
return err;
}
static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
[RDMA_NLDEV_CMD_GET] = {
.doit = nldev_get_doit,
@ -1358,6 +1441,13 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
.doit = nldev_res_get_pd_doit,
.dump = nldev_res_get_pd_dumpit,
},
[RDMA_NLDEV_CMD_SYS_GET] = {
.dump = nldev_get_sys_get_dumpit,
},
[RDMA_NLDEV_CMD_SYS_SET] = {
.doit = nldev_set_sys_set_doit,
.flags = RDMA_NL_ADMIN_PERM,
},
};
void __init nldev_init(void)

View File

@ -125,9 +125,10 @@ static void assert_uverbs_usecnt(struct ib_uobject *uobj,
* and consumes the kref on the uobj.
*/
static int uverbs_destroy_uobject(struct ib_uobject *uobj,
enum rdma_remove_reason reason)
enum rdma_remove_reason reason,
struct uverbs_attr_bundle *attrs)
{
struct ib_uverbs_file *ufile = uobj->ufile;
struct ib_uverbs_file *ufile = attrs->ufile;
unsigned long flags;
int ret;
@ -135,7 +136,8 @@ static int uverbs_destroy_uobject(struct ib_uobject *uobj,
assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE);
if (uobj->object) {
ret = uobj->uapi_object->type_class->destroy_hw(uobj, reason);
ret = uobj->uapi_object->type_class->destroy_hw(uobj, reason,
attrs);
if (ret) {
if (ib_is_destroy_retryable(ret, reason, uobj))
return ret;
@ -196,9 +198,9 @@ static int uverbs_destroy_uobject(struct ib_uobject *uobj,
* version requires the caller to have already obtained an
* LOOKUP_DESTROY uobject kref.
*/
int uobj_destroy(struct ib_uobject *uobj)
int uobj_destroy(struct ib_uobject *uobj, struct uverbs_attr_bundle *attrs)
{
struct ib_uverbs_file *ufile = uobj->ufile;
struct ib_uverbs_file *ufile = attrs->ufile;
int ret;
down_read(&ufile->hw_destroy_rwsem);
@ -207,7 +209,7 @@ int uobj_destroy(struct ib_uobject *uobj)
if (ret)
goto out_unlock;
ret = uverbs_destroy_uobject(uobj, RDMA_REMOVE_DESTROY);
ret = uverbs_destroy_uobject(uobj, RDMA_REMOVE_DESTROY, attrs);
if (ret) {
atomic_set(&uobj->usecnt, 0);
goto out_unlock;
@ -224,18 +226,17 @@ out_unlock:
* uverbs_put_destroy.
*/
struct ib_uobject *__uobj_get_destroy(const struct uverbs_api_object *obj,
u32 id,
const struct uverbs_attr_bundle *attrs)
u32 id, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj;
int ret;
uobj = rdma_lookup_get_uobject(obj, attrs->ufile, id,
UVERBS_LOOKUP_DESTROY);
UVERBS_LOOKUP_DESTROY, attrs);
if (IS_ERR(uobj))
return uobj;
ret = uobj_destroy(uobj);
ret = uobj_destroy(uobj, attrs);
if (ret) {
rdma_lookup_put_uobject(uobj, UVERBS_LOOKUP_DESTROY);
return ERR_PTR(ret);
@ -249,7 +250,7 @@ struct ib_uobject *__uobj_get_destroy(const struct uverbs_api_object *obj,
* (negative errno on failure). For use by callers that do not need the uobj.
*/
int __uobj_perform_destroy(const struct uverbs_api_object *obj, u32 id,
const struct uverbs_attr_bundle *attrs)
struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj;
@ -296,25 +297,13 @@ static struct ib_uobject *alloc_uobj(struct ib_uverbs_file *ufile,
static int idr_add_uobj(struct ib_uobject *uobj)
{
int ret;
idr_preload(GFP_KERNEL);
spin_lock(&uobj->ufile->idr_lock);
/*
* We start with allocating an idr pointing to NULL. This represents an
* object which isn't initialized yet. We'll replace it later on with
* the real object once we commit.
*/
ret = idr_alloc(&uobj->ufile->idr, NULL, 0,
min_t(unsigned long, U32_MAX - 1, INT_MAX), GFP_NOWAIT);
if (ret >= 0)
uobj->id = ret;
spin_unlock(&uobj->ufile->idr_lock);
idr_preload_end();
return ret < 0 ? ret : 0;
/*
* We start with allocating an idr pointing to NULL. This represents an
* object which isn't initialized yet. We'll replace it later on with
* the real object once we commit.
*/
return xa_alloc(&uobj->ufile->idr, &uobj->id, NULL, xa_limit_32b,
GFP_KERNEL);
}
/* Returns the ib_uobject or an error. The caller should check for IS_ERR. */
@ -324,29 +313,20 @@ lookup_get_idr_uobject(const struct uverbs_api_object *obj,
enum rdma_lookup_mode mode)
{
struct ib_uobject *uobj;
unsigned long idrno = id;
if (id < 0 || id > ULONG_MAX)
return ERR_PTR(-EINVAL);
rcu_read_lock();
/* object won't be released as we're protected in rcu */
uobj = idr_find(&ufile->idr, idrno);
if (!uobj) {
uobj = ERR_PTR(-ENOENT);
goto free;
}
/*
* The idr_find is guaranteed to return a pointer to something that
* isn't freed yet, or NULL, as the free after idr_remove goes through
* kfree_rcu(). However the object may still have been released and
* kfree() could be called at any time.
*/
if (!kref_get_unless_zero(&uobj->ref))
uobj = xa_load(&ufile->idr, id);
if (!uobj || !kref_get_unless_zero(&uobj->ref))
uobj = ERR_PTR(-ENOENT);
free:
rcu_read_unlock();
return uobj;
}
@ -393,12 +373,13 @@ lookup_get_fd_uobject(const struct uverbs_api_object *obj,
struct ib_uobject *rdma_lookup_get_uobject(const struct uverbs_api_object *obj,
struct ib_uverbs_file *ufile, s64 id,
enum rdma_lookup_mode mode)
enum rdma_lookup_mode mode,
struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj;
int ret;
if (IS_ERR(obj) && PTR_ERR(obj) == -ENOMSG) {
if (obj == ERR_PTR(-ENOMSG)) {
/* must be UVERBS_IDR_ANY_OBJECT, see uapi_get_object() */
uobj = lookup_get_idr_uobject(NULL, ufile, id, mode);
if (IS_ERR(uobj))
@ -431,6 +412,8 @@ struct ib_uobject *rdma_lookup_get_uobject(const struct uverbs_api_object *obj,
ret = uverbs_try_lock_object(uobj, mode);
if (ret)
goto free;
if (attrs)
attrs->context = uobj->context;
return uobj;
free:
@ -438,38 +421,6 @@ free:
uverbs_uobject_put(uobj);
return ERR_PTR(ret);
}
struct ib_uobject *_uobj_get_read(enum uverbs_default_objects type,
u32 object_id,
struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj;
uobj = rdma_lookup_get_uobject(uobj_get_type(attrs, type), attrs->ufile,
object_id, UVERBS_LOOKUP_READ);
if (IS_ERR(uobj))
return uobj;
attrs->context = uobj->context;
return uobj;
}
struct ib_uobject *_uobj_get_write(enum uverbs_default_objects type,
u32 object_id,
struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj;
uobj = rdma_lookup_get_uobject(uobj_get_type(attrs, type), attrs->ufile,
object_id, UVERBS_LOOKUP_WRITE);
if (IS_ERR(uobj))
return uobj;
attrs->context = uobj->context;
return uobj;
}
static struct ib_uobject *
alloc_begin_idr_uobject(const struct uverbs_api_object *obj,
@ -489,14 +440,12 @@ alloc_begin_idr_uobject(const struct uverbs_api_object *obj,
ret = ib_rdmacg_try_charge(&uobj->cg_obj, uobj->context->device,
RDMACG_RESOURCE_HCA_OBJECT);
if (ret)
goto idr_remove;
goto remove;
return uobj;
idr_remove:
spin_lock(&ufile->idr_lock);
idr_remove(&ufile->idr, uobj->id);
spin_unlock(&ufile->idr_lock);
remove:
xa_erase(&ufile->idr, uobj->id);
uobj_put:
uverbs_uobject_put(uobj);
return ERR_PTR(ret);
@ -526,7 +475,8 @@ alloc_begin_fd_uobject(const struct uverbs_api_object *obj,
}
struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_api_object *obj,
struct ib_uverbs_file *ufile)
struct ib_uverbs_file *ufile,
struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *ret;
@ -546,6 +496,8 @@ struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_api_object *obj,
up_read(&ufile->hw_destroy_rwsem);
return ret;
}
if (attrs)
attrs->context = ret->context;
return ret;
}
@ -554,18 +506,17 @@ static void alloc_abort_idr_uobject(struct ib_uobject *uobj)
ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
RDMACG_RESOURCE_HCA_OBJECT);
spin_lock(&uobj->ufile->idr_lock);
idr_remove(&uobj->ufile->idr, uobj->id);
spin_unlock(&uobj->ufile->idr_lock);
xa_erase(&uobj->ufile->idr, uobj->id);
}
static int __must_check destroy_hw_idr_uobject(struct ib_uobject *uobj,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
const struct uverbs_obj_idr_type *idr_type =
container_of(uobj->uapi_object->type_attrs,
struct uverbs_obj_idr_type, type);
int ret = idr_type->destroy_object(uobj, why);
int ret = idr_type->destroy_object(uobj, why, attrs);
/*
* We can only fail gracefully if the user requested to destroy the
@ -586,9 +537,7 @@ static int __must_check destroy_hw_idr_uobject(struct ib_uobject *uobj,
static void remove_handle_idr_uobject(struct ib_uobject *uobj)
{
spin_lock(&uobj->ufile->idr_lock);
idr_remove(&uobj->ufile->idr, uobj->id);
spin_unlock(&uobj->ufile->idr_lock);
xa_erase(&uobj->ufile->idr, uobj->id);
/* Matches the kref in alloc_commit_idr_uobject */
uverbs_uobject_put(uobj);
}
@ -599,7 +548,8 @@ static void alloc_abort_fd_uobject(struct ib_uobject *uobj)
}
static int __must_check destroy_hw_fd_uobject(struct ib_uobject *uobj,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
const struct uverbs_obj_fd_type *fd_type = container_of(
uobj->uapi_object->type_attrs, struct uverbs_obj_fd_type, type);
@ -618,17 +568,17 @@ static void remove_handle_fd_uobject(struct ib_uobject *uobj)
static int alloc_commit_idr_uobject(struct ib_uobject *uobj)
{
struct ib_uverbs_file *ufile = uobj->ufile;
void *old;
spin_lock(&ufile->idr_lock);
/*
* We already allocated this IDR with a NULL object, so
* this shouldn't fail.
*
* NOTE: Once we set the IDR we loose ownership of our kref on uobj.
* NOTE: Storing the uobj transfers our kref on uobj to the XArray.
* It will be put by remove_commit_idr_uobject()
*/
WARN_ON(idr_replace(&ufile->idr, uobj, uobj->id));
spin_unlock(&ufile->idr_lock);
old = xa_store(&ufile->idr, uobj->id, uobj, GFP_KERNEL);
WARN_ON(old != NULL);
return 0;
}
@ -675,15 +625,16 @@ static int alloc_commit_fd_uobject(struct ib_uobject *uobj)
* caller can no longer assume uobj is valid. If this function fails it
* destroys the uboject, including the attached HW object.
*/
int __must_check rdma_alloc_commit_uobject(struct ib_uobject *uobj)
int __must_check rdma_alloc_commit_uobject(struct ib_uobject *uobj,
struct uverbs_attr_bundle *attrs)
{
struct ib_uverbs_file *ufile = uobj->ufile;
struct ib_uverbs_file *ufile = attrs->ufile;
int ret;
/* alloc_commit consumes the uobj kref */
ret = uobj->uapi_object->type_class->alloc_commit(uobj);
if (ret) {
uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT);
uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT, attrs);
up_read(&ufile->hw_destroy_rwsem);
return ret;
}
@ -707,12 +658,13 @@ int __must_check rdma_alloc_commit_uobject(struct ib_uobject *uobj)
* This consumes the kref for uobj. It is up to the caller to unwind the HW
* object and anything else connected to uobj before calling this.
*/
void rdma_alloc_abort_uobject(struct ib_uobject *uobj)
void rdma_alloc_abort_uobject(struct ib_uobject *uobj,
struct uverbs_attr_bundle *attrs)
{
struct ib_uverbs_file *ufile = uobj->ufile;
uobj->object = NULL;
uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT);
uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT, attrs);
/* Matches the down_read in rdma_alloc_begin_uobject */
up_read(&ufile->hw_destroy_rwsem);
@ -760,29 +712,28 @@ void rdma_lookup_put_uobject(struct ib_uobject *uobj,
void setup_ufile_idr_uobject(struct ib_uverbs_file *ufile)
{
spin_lock_init(&ufile->idr_lock);
idr_init(&ufile->idr);
xa_init_flags(&ufile->idr, XA_FLAGS_ALLOC);
}
void release_ufile_idr_uobject(struct ib_uverbs_file *ufile)
{
struct ib_uobject *entry;
int id;
unsigned long id;
/*
* At this point uverbs_cleanup_ufile() is guaranteed to have run, and
* there are no HW objects left, however the IDR is still populated
* there are no HW objects left, however the xarray is still populated
* with anything that has not been cleaned up by userspace. Since the
* kref on ufile is 0, nothing is allowed to call lookup_get.
*
* This is an optimized equivalent to remove_handle_idr_uobject
*/
idr_for_each_entry(&ufile->idr, entry, id) {
xa_for_each(&ufile->idr, id, entry) {
WARN_ON(entry->object);
uverbs_uobject_put(entry);
}
idr_destroy(&ufile->idr);
xa_destroy(&ufile->idr);
}
const struct uverbs_obj_type_class uverbs_idr_class = {
@ -814,6 +765,10 @@ void uverbs_close_fd(struct file *f)
{
struct ib_uobject *uobj = f->private_data;
struct ib_uverbs_file *ufile = uobj->ufile;
struct uverbs_attr_bundle attrs = {
.context = uobj->context,
.ufile = ufile,
};
if (down_read_trylock(&ufile->hw_destroy_rwsem)) {
/*
@ -823,7 +778,7 @@ void uverbs_close_fd(struct file *f)
* write lock here, or we have a kernel bug.
*/
WARN_ON(uverbs_try_lock_object(uobj, UVERBS_LOOKUP_WRITE));
uverbs_destroy_uobject(uobj, RDMA_REMOVE_CLOSE);
uverbs_destroy_uobject(uobj, RDMA_REMOVE_CLOSE, &attrs);
up_read(&ufile->hw_destroy_rwsem);
}
@ -872,6 +827,7 @@ static int __uverbs_cleanup_ufile(struct ib_uverbs_file *ufile,
{
struct ib_uobject *obj, *next_obj;
int ret = -EINVAL;
struct uverbs_attr_bundle attrs = { .ufile = ufile };
/*
* This shouldn't run while executing other commands on this
@ -883,12 +839,13 @@ static int __uverbs_cleanup_ufile(struct ib_uverbs_file *ufile,
* other threads (which might still use the FDs) chance to run.
*/
list_for_each_entry_safe(obj, next_obj, &ufile->uobjects, list) {
attrs.context = obj->context;
/*
* if we hit this WARN_ON, that means we are
* racing with a lookup_get.
*/
WARN_ON(uverbs_try_lock_object(obj, UVERBS_LOOKUP_WRITE));
if (!uverbs_destroy_uobject(obj, reason))
if (!uverbs_destroy_uobject(obj, reason, &attrs))
ret = 0;
else
atomic_set(&obj->usecnt, 0);
@ -967,26 +924,25 @@ const struct uverbs_obj_type_class uverbs_fd_class = {
EXPORT_SYMBOL(uverbs_fd_class);
struct ib_uobject *
uverbs_get_uobject_from_file(u16 object_id,
struct ib_uverbs_file *ufile,
enum uverbs_obj_access access, s64 id)
uverbs_get_uobject_from_file(u16 object_id, enum uverbs_obj_access access,
s64 id, struct uverbs_attr_bundle *attrs)
{
const struct uverbs_api_object *obj =
uapi_get_object(ufile->device->uapi, object_id);
uapi_get_object(attrs->ufile->device->uapi, object_id);
switch (access) {
case UVERBS_ACCESS_READ:
return rdma_lookup_get_uobject(obj, ufile, id,
UVERBS_LOOKUP_READ);
return rdma_lookup_get_uobject(obj, attrs->ufile, id,
UVERBS_LOOKUP_READ, attrs);
case UVERBS_ACCESS_DESTROY:
/* Actual destruction is done inside uverbs_handle_method */
return rdma_lookup_get_uobject(obj, ufile, id,
UVERBS_LOOKUP_DESTROY);
return rdma_lookup_get_uobject(obj, attrs->ufile, id,
UVERBS_LOOKUP_DESTROY, attrs);
case UVERBS_ACCESS_WRITE:
return rdma_lookup_get_uobject(obj, ufile, id,
UVERBS_LOOKUP_WRITE);
return rdma_lookup_get_uobject(obj, attrs->ufile, id,
UVERBS_LOOKUP_WRITE, attrs);
case UVERBS_ACCESS_NEW:
return rdma_alloc_begin_uobject(obj, ufile);
return rdma_alloc_begin_uobject(obj, attrs->ufile, attrs);
default:
WARN_ON(true);
return ERR_PTR(-EOPNOTSUPP);
@ -994,8 +950,8 @@ uverbs_get_uobject_from_file(u16 object_id,
}
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit)
enum uverbs_obj_access access, bool commit,
struct uverbs_attr_bundle *attrs)
{
int ret = 0;
@ -1018,9 +974,9 @@ int uverbs_finalize_object(struct ib_uobject *uobj,
break;
case UVERBS_ACCESS_NEW:
if (commit)
ret = rdma_alloc_commit_uobject(uobj);
ret = rdma_alloc_commit_uobject(uobj, attrs);
else
rdma_alloc_abort_uobject(uobj);
rdma_alloc_abort_uobject(uobj, attrs);
break;
default:
WARN_ON(true);

View File

@ -48,7 +48,7 @@ struct ib_uverbs_device;
void uverbs_destroy_ufile_hw(struct ib_uverbs_file *ufile,
enum rdma_remove_reason reason);
int uobj_destroy(struct ib_uobject *uobj);
int uobj_destroy(struct ib_uobject *uobj, struct uverbs_attr_bundle *attrs);
/*
* uverbs_uobject_get is called in order to increase the reference count on
@ -83,9 +83,8 @@ void uverbs_close_fd(struct file *f);
* uverbs_finalize_objects are called.
*/
struct ib_uobject *
uverbs_get_uobject_from_file(u16 object_id,
struct ib_uverbs_file *ufile,
enum uverbs_obj_access access, s64 id);
uverbs_get_uobject_from_file(u16 object_id, enum uverbs_obj_access access,
s64 id, struct uverbs_attr_bundle *attrs);
/*
* Note that certain finalize stages could return a status:
@ -103,8 +102,8 @@ uverbs_get_uobject_from_file(u16 object_id,
* object.
*/
int uverbs_finalize_object(struct ib_uobject *uobj,
enum uverbs_obj_access access,
bool commit);
enum uverbs_obj_access access, bool commit,
struct uverbs_attr_bundle *attrs);
int uverbs_output_written(const struct uverbs_attr_bundle *bundle, size_t idx);

View File

@ -40,7 +40,7 @@
#include <linux/slab.h>
#include <linux/dma-mapping.h>
#include <linux/kref.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/workqueue.h>
#include <uapi/linux/if_ether.h>
#include <rdma/ib_pack.h>
@ -183,8 +183,7 @@ static struct ib_client sa_client = {
.remove = ib_sa_remove_one
};
static DEFINE_SPINLOCK(idr_lock);
static DEFINE_IDR(query_idr);
static DEFINE_XARRAY_FLAGS(queries, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
static DEFINE_SPINLOCK(tid_lock);
static u32 tid;
@ -1180,14 +1179,14 @@ void ib_sa_cancel_query(int id, struct ib_sa_query *query)
struct ib_mad_agent *agent;
struct ib_mad_send_buf *mad_buf;
spin_lock_irqsave(&idr_lock, flags);
if (idr_find(&query_idr, id) != query) {
spin_unlock_irqrestore(&idr_lock, flags);
xa_lock_irqsave(&queries, flags);
if (xa_load(&queries, id) != query) {
xa_unlock_irqrestore(&queries, flags);
return;
}
agent = query->port->agent;
mad_buf = query->mad_buf;
spin_unlock_irqrestore(&idr_lock, flags);
xa_unlock_irqrestore(&queries, flags);
/*
* If the query is still on the netlink request list, schedule
@ -1363,21 +1362,14 @@ static void init_mad(struct ib_sa_query *query, struct ib_mad_agent *agent)
static int send_mad(struct ib_sa_query *query, unsigned long timeout_ms,
gfp_t gfp_mask)
{
bool preload = gfpflags_allow_blocking(gfp_mask);
unsigned long flags;
int ret, id;
if (preload)
idr_preload(gfp_mask);
spin_lock_irqsave(&idr_lock, flags);
id = idr_alloc(&query_idr, query, 0, 0, GFP_NOWAIT);
spin_unlock_irqrestore(&idr_lock, flags);
if (preload)
idr_preload_end();
if (id < 0)
return id;
xa_lock_irqsave(&queries, flags);
ret = __xa_alloc(&queries, &id, query, xa_limit_32b, gfp_mask);
xa_unlock_irqrestore(&queries, flags);
if (ret < 0)
return ret;
query->mad_buf->timeout_ms = timeout_ms;
query->mad_buf->context[0] = query;
@ -1394,9 +1386,9 @@ static int send_mad(struct ib_sa_query *query, unsigned long timeout_ms,
ret = ib_post_send_mad(query->mad_buf, NULL);
if (ret) {
spin_lock_irqsave(&idr_lock, flags);
idr_remove(&query_idr, id);
spin_unlock_irqrestore(&idr_lock, flags);
xa_lock_irqsave(&queries, flags);
__xa_erase(&queries, id);
xa_unlock_irqrestore(&queries, flags);
}
/*
@ -2188,9 +2180,9 @@ static void send_handler(struct ib_mad_agent *agent,
break;
}
spin_lock_irqsave(&idr_lock, flags);
idr_remove(&query_idr, query->id);
spin_unlock_irqrestore(&idr_lock, flags);
xa_lock_irqsave(&queries, flags);
__xa_erase(&queries, query->id);
xa_unlock_irqrestore(&queries, flags);
free_mad(query);
if (query->client)
@ -2475,5 +2467,5 @@ void ib_sa_cleanup(void)
destroy_workqueue(ib_nl_wq);
mcast_cleanup();
ib_unregister_client(&sa_client);
idr_destroy(&query_idr);
WARN_ON(!xa_empty(&queries));
}

View File

@ -349,10 +349,15 @@ static struct attribute *port_default_attrs[] = {
static size_t print_ndev(const struct ib_gid_attr *gid_attr, char *buf)
{
if (!gid_attr->ndev)
return -EINVAL;
struct net_device *ndev;
size_t ret = -EINVAL;
return sprintf(buf, "%s\n", gid_attr->ndev->name);
rcu_read_lock();
ndev = rcu_dereference(gid_attr->ndev);
if (ndev)
ret = sprintf(buf, "%s\n", ndev->name);
rcu_read_unlock();
return ret;
}
static size_t print_gid_type(const struct ib_gid_attr *gid_attr, char *buf)
@ -1015,8 +1020,10 @@ err_free_stats:
return;
}
static int add_port(struct ib_device *device, int port_num)
static int add_port(struct ib_core_device *coredev, int port_num)
{
struct ib_device *device = rdma_device_to_ibdev(&coredev->dev);
bool is_full_dev = &device->coredev == coredev;
struct ib_port *p;
struct ib_port_attr attr;
int i;
@ -1034,7 +1041,7 @@ static int add_port(struct ib_device *device, int port_num)
p->port_num = port_num;
ret = kobject_init_and_add(&p->kobj, &port_type,
device->ports_kobj,
coredev->ports_kobj,
"%d", port_num);
if (ret) {
kfree(p);
@ -1055,7 +1062,7 @@ static int add_port(struct ib_device *device, int port_num)
goto err_put;
}
if (device->ops.process_mad) {
if (device->ops.process_mad && is_full_dev) {
p->pma_table = get_counter_table(device, port_num);
ret = sysfs_create_group(&p->kobj, p->pma_table);
if (ret)
@ -1111,7 +1118,7 @@ static int add_port(struct ib_device *device, int port_num)
if (ret)
goto err_free_pkey;
if (device->ops.init_port) {
if (device->ops.init_port && is_full_dev) {
ret = device->ops.init_port(device, port_num, &p->kobj);
if (ret)
goto err_remove_pkey;
@ -1122,10 +1129,10 @@ static int add_port(struct ib_device *device, int port_num)
* port, so holder should be device. Therefore skip per port conunter
* initialization.
*/
if (device->ops.alloc_hw_stats && port_num)
if (device->ops.alloc_hw_stats && port_num && is_full_dev)
setup_hw_stats(device, p, port_num);
list_add_tail(&p->kobj.entry, &device->port_list);
list_add_tail(&p->kobj.entry, &coredev->port_list);
kobject_uevent(&p->kobj, KOBJ_ADD);
return 0;
@ -1194,6 +1201,7 @@ static ssize_t node_type_show(struct device *device,
case RDMA_NODE_RNIC: return sprintf(buf, "%d: RNIC\n", dev->node_type);
case RDMA_NODE_USNIC: return sprintf(buf, "%d: usNIC\n", dev->node_type);
case RDMA_NODE_USNIC_UDP: return sprintf(buf, "%d: usNIC UDP\n", dev->node_type);
case RDMA_NODE_UNSPECIFIED: return sprintf(buf, "%d: unspecified\n", dev->node_type);
case RDMA_NODE_IB_SWITCH: return sprintf(buf, "%d: switch\n", dev->node_type);
case RDMA_NODE_IB_ROUTER: return sprintf(buf, "%d: router\n", dev->node_type);
default: return sprintf(buf, "%d: <unknown>\n", dev->node_type);
@ -1279,11 +1287,11 @@ const struct attribute_group ib_dev_attr_group = {
.attrs = ib_dev_attrs,
};
static void ib_free_port_attrs(struct ib_device *device)
void ib_free_port_attrs(struct ib_core_device *coredev)
{
struct kobject *p, *t;
list_for_each_entry_safe(p, t, &device->port_list, entry) {
list_for_each_entry_safe(p, t, &coredev->port_list, entry) {
struct ib_port *port = container_of(p, struct ib_port, kobj);
list_del(&p->entry);
@ -1303,20 +1311,22 @@ static void ib_free_port_attrs(struct ib_device *device)
kobject_put(p);
}
kobject_put(device->ports_kobj);
kobject_put(coredev->ports_kobj);
}
static int ib_setup_port_attrs(struct ib_device *device)
int ib_setup_port_attrs(struct ib_core_device *coredev)
{
struct ib_device *device = rdma_device_to_ibdev(&coredev->dev);
unsigned int port;
int ret;
device->ports_kobj = kobject_create_and_add("ports", &device->dev.kobj);
if (!device->ports_kobj)
coredev->ports_kobj = kobject_create_and_add("ports",
&coredev->dev.kobj);
if (!coredev->ports_kobj)
return -ENOMEM;
rdma_for_each_port (device, port) {
ret = add_port(device, port);
ret = add_port(coredev, port);
if (ret)
goto err_put;
}
@ -1324,7 +1334,7 @@ static int ib_setup_port_attrs(struct ib_device *device)
return 0;
err_put:
ib_free_port_attrs(device);
ib_free_port_attrs(coredev);
return ret;
}
@ -1332,7 +1342,7 @@ int ib_device_register_sysfs(struct ib_device *device)
{
int ret;
ret = ib_setup_port_attrs(device);
ret = ib_setup_port_attrs(&device->coredev);
if (ret)
return ret;
@ -1348,5 +1358,48 @@ void ib_device_unregister_sysfs(struct ib_device *device)
free_hsag(&device->dev.kobj, device->hw_stats_ag);
kfree(device->hw_stats);
ib_free_port_attrs(device);
ib_free_port_attrs(&device->coredev);
}
/**
* ib_port_register_module_stat - add module counters under relevant port
* of IB device.
*
* @device: IB device to add counters
* @port_num: valid port number
* @kobj: pointer to the kobject to initialize
* @ktype: pointer to the ktype for this kobject.
* @name: the name of the kobject
*/
int ib_port_register_module_stat(struct ib_device *device, u8 port_num,
struct kobject *kobj, struct kobj_type *ktype,
const char *name)
{
struct kobject *p, *t;
int ret;
list_for_each_entry_safe(p, t, &device->coredev.port_list, entry) {
struct ib_port *port = container_of(p, struct ib_port, kobj);
if (port->port_num != port_num)
continue;
ret = kobject_init_and_add(kobj, ktype, &port->kobj, "%s",
name);
if (ret)
return ret;
}
return 0;
}
EXPORT_SYMBOL(ib_port_register_module_stat);
/**
* ib_port_unregister_module_stat - release module counters
* @kobj: pointer to the kobject to release
*/
void ib_port_unregister_module_stat(struct kobject *kobj)
{
kobject_put(kobj);
}
EXPORT_SYMBOL(ib_port_unregister_module_stat);

View File

@ -42,7 +42,7 @@
#include <linux/file.h>
#include <linux/mount.h>
#include <linux/cdev.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/mutex.h>
#include <linux/slab.h>
@ -125,23 +125,22 @@ static struct ib_client ucm_client = {
.remove = ib_ucm_remove_one
};
static DEFINE_MUTEX(ctx_id_mutex);
static DEFINE_IDR(ctx_id_table);
static DEFINE_XARRAY_ALLOC(ctx_id_table);
static DECLARE_BITMAP(dev_map, IB_UCM_MAX_DEVICES);
static struct ib_ucm_context *ib_ucm_ctx_get(struct ib_ucm_file *file, int id)
{
struct ib_ucm_context *ctx;
mutex_lock(&ctx_id_mutex);
ctx = idr_find(&ctx_id_table, id);
xa_lock(&ctx_id_table);
ctx = xa_load(&ctx_id_table, id);
if (!ctx)
ctx = ERR_PTR(-ENOENT);
else if (ctx->file != file)
ctx = ERR_PTR(-EINVAL);
else
atomic_inc(&ctx->ref);
mutex_unlock(&ctx_id_mutex);
xa_unlock(&ctx_id_table);
return ctx;
}
@ -194,10 +193,7 @@ static struct ib_ucm_context *ib_ucm_ctx_alloc(struct ib_ucm_file *file)
ctx->file = file;
INIT_LIST_HEAD(&ctx->events);
mutex_lock(&ctx_id_mutex);
ctx->id = idr_alloc(&ctx_id_table, ctx, 0, 0, GFP_KERNEL);
mutex_unlock(&ctx_id_mutex);
if (ctx->id < 0)
if (xa_alloc(&ctx_id_table, &ctx->id, ctx, xa_limit_32b, GFP_KERNEL))
goto error;
list_add_tail(&ctx->file_list, &file->ctxs);
@ -514,9 +510,7 @@ static ssize_t ib_ucm_create_id(struct ib_ucm_file *file,
err2:
ib_destroy_cm_id(ctx->cm_id);
err1:
mutex_lock(&ctx_id_mutex);
idr_remove(&ctx_id_table, ctx->id);
mutex_unlock(&ctx_id_mutex);
xa_erase(&ctx_id_table, ctx->id);
kfree(ctx);
return result;
}
@ -536,15 +530,15 @@ static ssize_t ib_ucm_destroy_id(struct ib_ucm_file *file,
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
return -EFAULT;
mutex_lock(&ctx_id_mutex);
ctx = idr_find(&ctx_id_table, cmd.id);
xa_lock(&ctx_id_table);
ctx = xa_load(&ctx_id_table, cmd.id);
if (!ctx)
ctx = ERR_PTR(-ENOENT);
else if (ctx->file != file)
ctx = ERR_PTR(-EINVAL);
else
idr_remove(&ctx_id_table, ctx->id);
mutex_unlock(&ctx_id_mutex);
__xa_erase(&ctx_id_table, ctx->id);
xa_unlock(&ctx_id_table);
if (IS_ERR(ctx))
return PTR_ERR(ctx);
@ -1189,10 +1183,7 @@ static int ib_ucm_close(struct inode *inode, struct file *filp)
struct ib_ucm_context, file_list);
mutex_unlock(&file->file_mutex);
mutex_lock(&ctx_id_mutex);
idr_remove(&ctx_id_table, ctx->id);
mutex_unlock(&ctx_id_mutex);
xa_erase(&ctx_id_table, ctx->id);
ib_destroy_cm_id(ctx->cm_id);
ib_ucm_cleanup_events(ctx);
kfree(ctx);
@ -1352,7 +1343,7 @@ static void __exit ib_ucm_cleanup(void)
class_remove_file(&cm_class, &class_attr_abi_version.attr);
unregister_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_NUM_FIXED_MINOR);
unregister_chrdev_region(dynamic_ucm_dev, IB_UCM_NUM_DYNAMIC_MINOR);
idr_destroy(&ctx_id_table);
WARN_ON(!xa_empty(&ctx_id_table));
}
module_init(ib_ucm_init);

View File

@ -37,27 +37,23 @@
#include <linux/sched/signal.h>
#include <linux/sched/mm.h>
#include <linux/export.h>
#include <linux/hugetlb.h>
#include <linux/slab.h>
#include <linux/pagemap.h>
#include <rdma/ib_umem_odp.h>
#include "uverbs.h"
static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
{
struct scatterlist *sg;
struct sg_page_iter sg_iter;
struct page *page;
int i;
if (umem->nmap > 0)
ib_dma_unmap_sg(dev, umem->sg_head.sgl,
umem->npages,
ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->sg_nents,
DMA_BIDIRECTIONAL);
for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {
page = sg_page(sg);
for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
page = sg_page_iter_page(&sg_iter);
if (!PageDirty(page) && umem->writable && dirty)
set_page_dirty_lock(page);
put_page(page);
@ -66,6 +62,124 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
sg_free_table(&umem->sg_head);
}
/* ib_umem_add_sg_table - Add N contiguous pages to scatter table
*
* sg: current scatterlist entry
* page_list: array of npage struct page pointers
* npages: number of pages in page_list
* max_seg_sz: maximum segment size in bytes
* nents: [out] number of entries in the scatterlist
*
* Return new end of scatterlist
*/
static struct scatterlist *ib_umem_add_sg_table(struct scatterlist *sg,
struct page **page_list,
unsigned long npages,
unsigned int max_seg_sz,
int *nents)
{
unsigned long first_pfn;
unsigned long i = 0;
bool update_cur_sg = false;
bool first = !sg_page(sg);
/* Check if new page_list is contiguous with end of previous page_list.
* sg->length here is a multiple of PAGE_SIZE and sg->offset is 0.
*/
if (!first && (page_to_pfn(sg_page(sg)) + (sg->length >> PAGE_SHIFT) ==
page_to_pfn(page_list[0])))
update_cur_sg = true;
while (i != npages) {
unsigned long len;
struct page *first_page = page_list[i];
first_pfn = page_to_pfn(first_page);
/* Compute the number of contiguous pages we have starting
* at i
*/
for (len = 0; i != npages &&
first_pfn + len == page_to_pfn(page_list[i]) &&
len < (max_seg_sz >> PAGE_SHIFT);
len++)
i++;
/* Squash N contiguous pages from page_list into current sge */
if (update_cur_sg) {
if ((max_seg_sz - sg->length) >= (len << PAGE_SHIFT)) {
sg_set_page(sg, sg_page(sg),
sg->length + (len << PAGE_SHIFT),
0);
update_cur_sg = false;
continue;
}
update_cur_sg = false;
}
/* Squash N contiguous pages into next sge or first sge */
if (!first)
sg = sg_next(sg);
(*nents)++;
sg_set_page(sg, first_page, len << PAGE_SHIFT, 0);
first = false;
}
return sg;
}
/**
* ib_umem_find_best_pgsz - Find best HW page size to use for this MR
*
* @umem: umem struct
* @pgsz_bitmap: bitmap of HW supported page sizes
* @virt: IOVA
*
* This helper is intended for HW that support multiple page
* sizes but can do only a single page size in an MR.
*
* Returns 0 if the umem requires page sizes not supported by
* the driver to be mapped. Drivers always supporting PAGE_SIZE
* or smaller will never see a 0 result.
*/
unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
unsigned long pgsz_bitmap,
unsigned long virt)
{
struct scatterlist *sg;
unsigned int best_pg_bit;
unsigned long va, pgoff;
dma_addr_t mask;
int i;
/* At minimum, drivers must support PAGE_SIZE or smaller */
if (WARN_ON(!(pgsz_bitmap & GENMASK(PAGE_SHIFT, 0))))
return 0;
va = virt;
/* max page size not to exceed MR length */
mask = roundup_pow_of_two(umem->length);
/* offset into first SGL */
pgoff = umem->address & ~PAGE_MASK;
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i) {
/* Walk SGL and reduce max page size if VA/PA bits differ
* for any address.
*/
mask |= (sg_dma_address(sg) + pgoff) ^ va;
if (i && i != (umem->nmap - 1))
/* restrict by length as well for interior SGEs */
mask |= sg_dma_len(sg);
va += sg_dma_len(sg) - pgoff;
pgoff = 0;
}
best_pg_bit = rdma_find_pg_bit(mask, pgsz_bitmap);
return BIT_ULL(best_pg_bit);
}
EXPORT_SYMBOL(ib_umem_find_best_pgsz);
/**
* ib_umem_get - Pin and DMA map userspace memory.
*
@ -84,16 +198,14 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
struct ib_ucontext *context;
struct ib_umem *umem;
struct page **page_list;
struct vm_area_struct **vma_list;
unsigned long lock_limit;
unsigned long new_pinned;
unsigned long cur_base;
struct mm_struct *mm;
unsigned long npages;
int ret;
int i;
unsigned long dma_attrs = 0;
struct scatterlist *sg, *sg_list_start;
struct scatterlist *sg;
unsigned int gup_flags = FOLL_WRITE;
if (!udata)
@ -138,29 +250,23 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
mmgrab(mm);
if (access & IB_ACCESS_ON_DEMAND) {
if (WARN_ON_ONCE(!context->invalidate_range)) {
ret = -EINVAL;
goto umem_kfree;
}
ret = ib_umem_odp_get(to_ib_umem_odp(umem), access);
if (ret)
goto umem_kfree;
return umem;
}
/* We assume the memory is from hugetlb until proved otherwise */
umem->hugetlb = 1;
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
ret = -ENOMEM;
goto umem_kfree;
}
/*
* if we can't alloc the vma_list, it's not so bad;
* just assume the memory is not hugetlb memory
*/
vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
if (!vma_list)
umem->hugetlb = 0;
npages = ib_umem_num_pages(umem);
if (npages == 0 || npages > UINT_MAX) {
ret = -EINVAL;
@ -185,41 +291,34 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
if (!umem->writable)
gup_flags |= FOLL_FORCE;
sg_list_start = umem->sg_head.sgl;
sg = umem->sg_head.sgl;
while (npages) {
down_read(&mm->mmap_sem);
ret = get_user_pages_longterm(cur_base,
min_t(unsigned long, npages,
PAGE_SIZE / sizeof (struct page *)),
gup_flags, page_list, vma_list);
gup_flags, page_list, NULL);
if (ret < 0) {
up_read(&mm->mmap_sem);
goto umem_release;
}
umem->npages += ret;
cur_base += ret * PAGE_SIZE;
npages -= ret;
/* Continue to hold the mmap_sem as vma_list access
* needs to be protected.
*/
for_each_sg(sg_list_start, sg, ret, i) {
if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
umem->hugetlb = 0;
sg = ib_umem_add_sg_table(sg, page_list, ret,
dma_get_max_seg_size(context->device->dma_device),
&umem->sg_nents);
sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
}
up_read(&mm->mmap_sem);
/* preparing for next loop */
sg_list_start = sg;
}
sg_mark_end(sg);
umem->nmap = ib_dma_map_sg_attrs(context->device,
umem->sg_head.sgl,
umem->npages,
umem->sg_nents,
DMA_BIDIRECTIONAL,
dma_attrs);
@ -236,8 +335,6 @@ umem_release:
vma:
atomic64_sub(ib_umem_num_pages(umem), &mm->pinned_vm);
out:
if (vma_list)
free_page((unsigned long) vma_list);
free_page((unsigned long) page_list);
umem_kfree:
if (ret) {
@ -315,7 +412,7 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
return -EINVAL;
}
ret = sg_pcopy_to_buffer(umem->sg_head.sgl, umem->npages, dst, length,
ret = sg_pcopy_to_buffer(umem->sg_head.sgl, umem->sg_nents, dst, length,
offset + ib_umem_offset(umem));
if (ret < 0)

View File

@ -241,7 +241,7 @@ static struct ib_ucontext_per_mm *alloc_per_mm(struct ib_ucontext *ctx,
per_mm->mm = mm;
per_mm->umem_tree = RB_ROOT_CACHED;
init_rwsem(&per_mm->umem_rwsem);
per_mm->active = ctx->invalidate_range;
per_mm->active = true;
rcu_read_lock();
per_mm->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
@ -417,9 +417,6 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
h = hstate_vma(vma);
umem->page_shift = huge_page_shift(h);
up_read(&mm->mmap_sem);
umem->hugetlb = 1;
} else {
umem->hugetlb = 0;
}
mutex_init(&umem_odp->umem_mutex);
@ -503,7 +500,6 @@ static int ib_umem_odp_map_dma_single_page(
struct ib_umem *umem = &umem_odp->umem;
struct ib_device *dev = umem->context->device;
dma_addr_t dma_addr;
int stored_page = 0;
int remove_existing_mapping = 0;
int ret = 0;
@ -527,8 +523,7 @@ static int ib_umem_odp_map_dma_single_page(
}
umem_odp->dma_list[page_index] = dma_addr | access_mask;
umem_odp->page_list[page_index] = page;
umem->npages++;
stored_page = 1;
umem_odp->npages++;
} else if (umem_odp->page_list[page_index] == page) {
umem_odp->dma_list[page_index] |= access_mask;
} else {
@ -540,11 +535,9 @@ static int ib_umem_odp_map_dma_single_page(
}
out:
/* On Demand Paging - avoid pinning the page */
if (umem->context->invalidate_range || !stored_page)
put_page(page);
put_page(page);
if (remove_existing_mapping && umem->context->invalidate_range) {
if (remove_existing_mapping) {
ib_umem_notifier_start_account(umem_odp);
umem->context->invalidate_range(
umem_odp,
@ -754,12 +747,9 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
*/
set_page_dirty(head_page);
}
/* on demand pinning support */
if (!umem->context->invalidate_range)
put_page(page);
umem_odp->page_list[idx] = NULL;
umem_odp->dma_list[idx] = 0;
umem->npages--;
umem_odp->npages--;
}
}
mutex_unlock(&umem_odp->umem_mutex);

View File

@ -129,6 +129,9 @@ struct ib_umad_packet {
struct ib_user_mad mad;
};
#define CREATE_TRACE_POINTS
#include <trace/events/ib_umad.h>
static const dev_t base_umad_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE);
static const dev_t base_issm_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE) +
IB_UMAD_NUM_FIXED_MINOR;
@ -334,6 +337,9 @@ static ssize_t copy_recv_mad(struct ib_umad_file *file, char __user *buf,
return -EFAULT;
}
}
trace_ib_umad_read_recv(file, &packet->mad.hdr, &recv_buf->mad->mad_hdr);
return hdr_size(file) + packet->length;
}
@ -353,6 +359,9 @@ static ssize_t copy_send_mad(struct ib_umad_file *file, char __user *buf,
if (copy_to_user(buf, packet->mad.data, packet->length))
return -EFAULT;
trace_ib_umad_read_send(file, &packet->mad.hdr,
(struct ib_mad_hdr *)&packet->mad.data);
return size;
}
@ -508,6 +517,9 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
mutex_lock(&file->mutex);
trace_ib_umad_write(file, &packet->mad.hdr,
(struct ib_mad_hdr *)&packet->mad.data);
agent = __get_agent(file, packet->mad.hdr.id);
if (!agent) {
ret = -EINVAL;
@ -968,6 +980,11 @@ static int ib_umad_open(struct inode *inode, struct file *filp)
goto out;
}
if (!rdma_dev_access_netns(port->ib_dev, current->nsproxy->net_ns)) {
ret = -EPERM;
goto out;
}
file = kzalloc(sizeof(*file), GFP_KERNEL);
if (!file) {
ret = -ENOMEM;
@ -1061,6 +1078,11 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
}
}
if (!rdma_dev_access_netns(port->ib_dev, current->nsproxy->net_ns)) {
ret = -EPERM;
goto err_up_sem;
}
ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
if (ret)
goto err_up_sem;

View File

@ -162,9 +162,7 @@ struct ib_uverbs_file {
struct list_head umaps;
struct page *disassociate_page;
struct idr idr;
/* spinlock protects write access to idr */
spinlock_t idr_lock;
struct xarray idr;
};
struct ib_uverbs_event {
@ -241,7 +239,8 @@ void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr);
void ib_uverbs_event_handler(struct ib_event_handler *handler,
struct ib_event *event);
int ib_uverbs_dealloc_xrcd(struct ib_uobject *uobject, struct ib_xrcd *xrcd,
enum rdma_remove_reason why);
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs);
int uverbs_dealloc_mw(struct ib_mw *mw);
void ib_uverbs_detach_umcast(struct ib_qp *qp,

View File

@ -162,7 +162,7 @@ static const void __user *uverbs_request_next_ptr(struct uverbs_req_iter *iter,
const void __user *res = iter->cur;
if (iter->cur + len > iter->end)
return ERR_PTR(-ENOSPC);
return (void __force __user *)ERR_PTR(-ENOSPC);
iter->cur += len;
return res;
}
@ -175,7 +175,7 @@ static int uverbs_request_finish(struct uverbs_req_iter *iter)
}
static struct ib_uverbs_completion_event_file *
_ib_uverbs_lookup_comp_file(s32 fd, const struct uverbs_attr_bundle *attrs)
_ib_uverbs_lookup_comp_file(s32 fd, struct uverbs_attr_bundle *attrs)
{
struct ib_uobject *uobj = ufd_get_read(UVERBS_OBJECT_COMP_CHANNEL,
fd, attrs);
@ -230,6 +230,8 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
goto err_alloc;
}
attrs->context = ucontext;
ucontext->res.type = RDMA_RESTRACK_CTX;
ucontext->device = ib_dev;
ucontext->cg_obj = cg_obj;
@ -423,7 +425,7 @@ static int ib_uverbs_alloc_pd(struct uverbs_attr_bundle *attrs)
atomic_set(&pd->usecnt, 0);
pd->res.type = RDMA_RESTRACK_PD;
ret = ib_dev->ops.alloc_pd(pd, uobj->context, &attrs->driver_udata);
ret = ib_dev->ops.alloc_pd(pd, &attrs->driver_udata);
if (ret)
goto err_alloc;
@ -436,15 +438,15 @@ static int ib_uverbs_alloc_pd(struct uverbs_attr_bundle *attrs)
if (ret)
goto err_copy;
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
ib_dealloc_pd(pd);
ib_dealloc_pd_user(pd, &attrs->driver_udata);
pd = NULL;
err_alloc:
kfree(pd);
err:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
return ret;
}
@ -594,8 +596,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
}
if (!xrcd) {
xrcd = ib_dev->ops.alloc_xrcd(ib_dev, obj->uobject.context,
&attrs->driver_udata);
xrcd = ib_dev->ops.alloc_xrcd(ib_dev, &attrs->driver_udata);
if (IS_ERR(xrcd)) {
ret = PTR_ERR(xrcd);
goto err;
@ -633,7 +634,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
mutex_unlock(&ibudev->xrcd_tree_mutex);
return uobj_alloc_commit(&obj->uobject);
return uobj_alloc_commit(&obj->uobject, attrs);
err_copy:
if (inode) {
@ -643,10 +644,10 @@ err_copy:
}
err_dealloc_xrcd:
ib_dealloc_xrcd(xrcd);
ib_dealloc_xrcd(xrcd, &attrs->driver_udata);
err:
uobj_alloc_abort(&obj->uobject);
uobj_alloc_abort(&obj->uobject, attrs);
err_tree_mutex_unlock:
if (f.file)
@ -669,19 +670,19 @@ static int ib_uverbs_close_xrcd(struct uverbs_attr_bundle *attrs)
return uobj_perform_destroy(UVERBS_OBJECT_XRCD, cmd.xrcd_handle, attrs);
}
int ib_uverbs_dealloc_xrcd(struct ib_uobject *uobject,
struct ib_xrcd *xrcd,
enum rdma_remove_reason why)
int ib_uverbs_dealloc_xrcd(struct ib_uobject *uobject, struct ib_xrcd *xrcd,
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct inode *inode;
int ret;
struct ib_uverbs_device *dev = uobject->context->ufile->device;
struct ib_uverbs_device *dev = attrs->ufile->device;
inode = xrcd->inode;
if (inode && !atomic_dec_and_test(&xrcd->usecnt))
return 0;
ret = ib_dealloc_xrcd(xrcd);
ret = ib_dealloc_xrcd(xrcd, &attrs->driver_udata);
if (ib_is_destroy_retryable(ret, why, uobject)) {
atomic_inc(&xrcd->usecnt);
@ -763,16 +764,16 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle *attrs)
uobj_put_obj_read(pd);
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
ib_dereg_mr(mr);
ib_dereg_mr_user(mr, &attrs->driver_udata);
err_put:
uobj_put_obj_read(pd);
err_free:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
return ret;
}
@ -917,14 +918,14 @@ static int ib_uverbs_alloc_mw(struct uverbs_attr_bundle *attrs)
goto err_copy;
uobj_put_obj_read(pd);
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
uverbs_dealloc_mw(mw);
err_put:
uobj_put_obj_read(pd);
err_free:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
return ret;
}
@ -965,11 +966,11 @@ static int ib_uverbs_create_comp_channel(struct uverbs_attr_bundle *attrs)
ret = uverbs_response(attrs, &resp, sizeof(resp));
if (ret) {
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
return ret;
}
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
}
static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
@ -1009,8 +1010,7 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
attr.comp_vector = cmd->comp_vector;
attr.flags = cmd->flags;
cq = ib_dev->ops.create_cq(ib_dev, &attr, obj->uobject.context,
&attrs->driver_udata);
cq = ib_dev->ops.create_cq(ib_dev, &attr, &attrs->driver_udata);
if (IS_ERR(cq)) {
ret = PTR_ERR(cq);
goto err_file;
@ -1036,7 +1036,7 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
if (ret)
goto err_cb;
ret = uobj_alloc_commit(&obj->uobject);
ret = uobj_alloc_commit(&obj->uobject, attrs);
if (ret)
return ERR_PTR(ret);
return obj;
@ -1049,7 +1049,7 @@ err_file:
ib_uverbs_release_ucq(attrs->ufile, ev_file, obj);
err:
uobj_alloc_abort(&obj->uobject);
uobj_alloc_abort(&obj->uobject, attrs);
return ERR_PTR(ret);
}
@ -1418,7 +1418,6 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
if (ret)
goto err_cb;
qp->real_qp = qp;
qp->pd = pd;
qp->send_cq = attr.send_cq;
qp->recv_cq = attr.recv_cq;
@ -1477,7 +1476,7 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
if (ind_tbl)
uobj_put_obj_read(ind_tbl);
return uobj_alloc_commit(&obj->uevent.uobject);
return uobj_alloc_commit(&obj->uevent.uobject, attrs);
err_cb:
ib_destroy_qp(qp);
@ -1495,7 +1494,7 @@ err_put:
if (ind_tbl)
uobj_put_obj_read(ind_tbl);
uobj_alloc_abort(&obj->uevent.uobject);
uobj_alloc_abort(&obj->uevent.uobject, attrs);
return ret;
}
@ -1609,14 +1608,14 @@ static int ib_uverbs_open_qp(struct uverbs_attr_bundle *attrs)
qp->uobject = &obj->uevent.uobject;
uobj_put_read(xrcd_uobj);
return uobj_alloc_commit(&obj->uevent.uobject);
return uobj_alloc_commit(&obj->uevent.uobject, attrs);
err_destroy:
ib_destroy_qp(qp);
err_xrcd:
uobj_put_read(xrcd_uobj);
err_put:
uobj_alloc_abort(&obj->uevent.uobject);
uobj_alloc_abort(&obj->uevent.uobject, attrs);
return ret;
}
@ -2451,7 +2450,7 @@ static int ib_uverbs_create_ah(struct uverbs_attr_bundle *attrs)
goto err_copy;
uobj_put_obj_read(pd);
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
rdma_destroy_ah(ah, RDMA_DESTROY_AH_SLEEPABLE);
@ -2460,7 +2459,7 @@ err_put:
uobj_put_obj_read(pd);
err:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
return ret;
}
@ -2962,16 +2961,16 @@ static int ib_uverbs_ex_create_wq(struct uverbs_attr_bundle *attrs)
uobj_put_obj_read(pd);
uobj_put_obj_read(cq);
return uobj_alloc_commit(&obj->uevent.uobject);
return uobj_alloc_commit(&obj->uevent.uobject, attrs);
err_copy:
ib_destroy_wq(wq);
ib_destroy_wq(wq, &attrs->driver_udata);
err_put_cq:
uobj_put_obj_read(cq);
err_put_pd:
uobj_put_obj_read(pd);
err_uobj:
uobj_alloc_abort(&obj->uevent.uobject);
uobj_alloc_abort(&obj->uevent.uobject, attrs);
return err;
}
@ -3136,12 +3135,12 @@ static int ib_uverbs_ex_create_rwq_ind_table(struct uverbs_attr_bundle *attrs)
for (j = 0; j < num_read_wqs; j++)
uobj_put_obj_read(wqs[j]);
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
ib_destroy_rwq_ind_table(rwq_ind_tbl);
err_uobj:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
put_wqs:
for (j = 0; j < num_read_wqs; j++)
uobj_put_obj_read(wqs[j]);
@ -3314,7 +3313,7 @@ static int ib_uverbs_ex_create_flow(struct uverbs_attr_bundle *attrs)
kfree(flow_attr);
if (cmd.flow_attr.num_of_specs)
kfree(kern_flow_attr);
return uobj_alloc_commit(uobj);
return uobj_alloc_commit(uobj, attrs);
err_copy:
if (!qp->device->ops.destroy_flow(flow_id))
atomic_dec(&qp->usecnt);
@ -3325,7 +3324,7 @@ err_free_flow_attr:
err_put:
uobj_put_obj_read(qp);
err_uobj:
uobj_alloc_abort(uobj);
uobj_alloc_abort(uobj, attrs);
err_free_attr:
if (cmd.flow_attr.num_of_specs)
kfree(kern_flow_attr);
@ -3411,9 +3410,9 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
obj->uevent.events_reported = 0;
INIT_LIST_HEAD(&obj->uevent.event_list);
srq = pd->device->ops.create_srq(pd, &attr, udata);
if (IS_ERR(srq)) {
ret = PTR_ERR(srq);
srq = rdma_zalloc_drv_obj(ib_dev, ib_srq);
if (!srq) {
ret = -ENOMEM;
goto err_put;
}
@ -3424,6 +3423,10 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
srq->event_handler = attr.event_handler;
srq->srq_context = attr.srq_context;
ret = pd->device->ops.create_srq(srq, &attr, udata);
if (ret)
goto err_free;
if (ib_srq_has_cq(cmd->srq_type)) {
srq->ext.cq = attr.ext.cq;
atomic_inc(&attr.ext.cq->usecnt);
@ -3458,11 +3461,13 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
uobj_put_obj_read(attr.ext.cq);
uobj_put_obj_read(pd);
return uobj_alloc_commit(&obj->uevent.uobject);
return uobj_alloc_commit(&obj->uevent.uobject, attrs);
err_copy:
ib_destroy_srq(srq);
ib_destroy_srq_user(srq, &attrs->driver_udata);
err_free:
kfree(srq);
err_put:
uobj_put_obj_read(pd);
@ -3477,7 +3482,7 @@ err_put_xrcd:
}
err:
uobj_alloc_abort(&obj->uevent.uobject);
uobj_alloc_abort(&obj->uevent.uobject, attrs);
return ret;
}

View File

@ -207,13 +207,12 @@ static int uverbs_process_idrs_array(struct bundle_priv *pbundle,
for (i = 0; i != array_len; i++) {
attr->uobjects[i] = uverbs_get_uobject_from_file(
spec->u2.objs_arr.obj_type, pbundle->bundle.ufile,
spec->u2.objs_arr.access, idr_vals[i]);
spec->u2.objs_arr.obj_type, spec->u2.objs_arr.access,
idr_vals[i], &pbundle->bundle);
if (IS_ERR(attr->uobjects[i])) {
ret = PTR_ERR(attr->uobjects[i]);
break;
}
pbundle->bundle.context = attr->uobjects[i]->context;
}
attr->len = i;
@ -223,7 +222,7 @@ static int uverbs_process_idrs_array(struct bundle_priv *pbundle,
static int uverbs_free_idrs_array(const struct uverbs_api_attr *attr_uapi,
struct uverbs_objs_arr_attr *attr,
bool commit)
bool commit, struct uverbs_attr_bundle *attrs)
{
const struct uverbs_attr_spec *spec = &attr_uapi->spec;
int current_ret;
@ -231,8 +230,9 @@ static int uverbs_free_idrs_array(const struct uverbs_api_attr *attr_uapi,
size_t i;
for (i = 0; i != attr->len; i++) {
current_ret = uverbs_finalize_object(
attr->uobjects[i], spec->u2.objs_arr.access, commit);
current_ret = uverbs_finalize_object(attr->uobjects[i],
spec->u2.objs_arr.access,
commit, attrs);
if (!ret)
ret = current_ret;
}
@ -325,13 +325,10 @@ static int uverbs_process_attr(struct bundle_priv *pbundle,
* IDR implementation today rejects negative IDs
*/
o_attr->uobject = uverbs_get_uobject_from_file(
spec->u.obj.obj_type,
pbundle->bundle.ufile,
spec->u.obj.access,
uattr->data_s64);
spec->u.obj.obj_type, spec->u.obj.access,
uattr->data_s64, &pbundle->bundle);
if (IS_ERR(o_attr->uobject))
return PTR_ERR(o_attr->uobject);
pbundle->bundle.context = o_attr->uobject->context;
__set_bit(attr_bkey, pbundle->uobj_finalize);
if (spec->u.obj.access == UVERBS_ACCESS_NEW) {
@ -456,12 +453,14 @@ static int ib_uverbs_run_method(struct bundle_priv *pbundle,
uverbs_fill_udata(&pbundle->bundle,
&pbundle->bundle.driver_udata,
UVERBS_ATTR_UHW_IN, UVERBS_ATTR_UHW_OUT);
else
pbundle->bundle.driver_udata = (struct ib_udata){};
if (destroy_bkey != UVERBS_API_ATTR_BKEY_LEN) {
struct uverbs_obj_attr *destroy_attr =
&pbundle->bundle.attrs[destroy_bkey].obj_attr;
ret = uobj_destroy(destroy_attr->uobject);
ret = uobj_destroy(destroy_attr->uobject, &pbundle->bundle);
if (ret)
return ret;
__clear_bit(destroy_bkey, pbundle->uobj_finalize);
@ -512,7 +511,8 @@ static int bundle_destroy(struct bundle_priv *pbundle, bool commit)
current_ret = uverbs_finalize_object(
attr->obj_attr.uobject,
attr->obj_attr.attr_elm->spec.u.obj.access, commit);
attr->obj_attr.attr_elm->spec.u.obj.access, commit,
&pbundle->bundle);
if (!ret)
ret = current_ret;
}
@ -535,7 +535,8 @@ static int bundle_destroy(struct bundle_priv *pbundle, bool commit)
if (attr_uapi->spec.type == UVERBS_ATTR_TYPE_IDRS_ARRAY) {
current_ret = uverbs_free_idrs_array(
attr_uapi, &attr->objs_arr_attr, commit);
attr_uapi, &attr->objs_arr_attr, commit,
&pbundle->bundle);
if (!ret)
ret = current_ret;
}

View File

@ -723,7 +723,7 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
* then the command request structure starts
* with a '__aligned u64 response' member.
*/
ret = get_user(response, (const u64 *)buf);
ret = get_user(response, (const u64 __user *)buf);
if (ret)
goto out_unlock;
@ -926,31 +926,6 @@ static const struct vm_operations_struct rdma_umap_ops = {
.fault = rdma_umap_fault,
};
static struct rdma_umap_priv *rdma_user_mmap_pre(struct ib_ucontext *ucontext,
struct vm_area_struct *vma,
unsigned long size)
{
struct ib_uverbs_file *ufile = ucontext->ufile;
struct rdma_umap_priv *priv;
if (!(vma->vm_flags & VM_SHARED))
return ERR_PTR(-EINVAL);
if (vma->vm_end - vma->vm_start != size)
return ERR_PTR(-EINVAL);
/* Driver is using this wrong, must be called by ib_uverbs_mmap */
if (WARN_ON(!vma->vm_file ||
vma->vm_file->private_data != ufile))
return ERR_PTR(-EINVAL);
lockdep_assert_held(&ufile->device->disassociate_srcu);
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return ERR_PTR(-ENOMEM);
return priv;
}
/*
* Map IO memory into a process. This is to be called by drivers as part of
* their mmap() functions if they wish to send something like PCI-E BAR memory
@ -959,10 +934,24 @@ static struct rdma_umap_priv *rdma_user_mmap_pre(struct ib_ucontext *ucontext,
int rdma_user_mmap_io(struct ib_ucontext *ucontext, struct vm_area_struct *vma,
unsigned long pfn, unsigned long size, pgprot_t prot)
{
struct rdma_umap_priv *priv = rdma_user_mmap_pre(ucontext, vma, size);
struct ib_uverbs_file *ufile = ucontext->ufile;
struct rdma_umap_priv *priv;
if (IS_ERR(priv))
return PTR_ERR(priv);
if (!(vma->vm_flags & VM_SHARED))
return -EINVAL;
if (vma->vm_end - vma->vm_start != size)
return -EINVAL;
/* Driver is using this wrong, must be called by ib_uverbs_mmap */
if (WARN_ON(!vma->vm_file ||
vma->vm_file->private_data != ufile))
return -EINVAL;
lockdep_assert_held(&ufile->device->disassociate_srcu);
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
vma->vm_page_prot = prot;
if (io_remap_pfn_range(vma, vma->vm_start, pfn, size, prot)) {
@ -975,35 +964,6 @@ int rdma_user_mmap_io(struct ib_ucontext *ucontext, struct vm_area_struct *vma,
}
EXPORT_SYMBOL(rdma_user_mmap_io);
/*
* The page case is here for a slightly different reason, the driver expects
* to be able to free the page it is sharing to user space when it destroys
* its ucontext, which means we need to zap the user space references.
*
* We could handle this differently by providing an API to allocate a shared
* page and then only freeing the shared page when the last ufile is
* destroyed.
*/
int rdma_user_mmap_page(struct ib_ucontext *ucontext,
struct vm_area_struct *vma, struct page *page,
unsigned long size)
{
struct rdma_umap_priv *priv = rdma_user_mmap_pre(ucontext, vma, size);
if (IS_ERR(priv))
return PTR_ERR(priv);
if (remap_pfn_range(vma, vma->vm_start, page_to_pfn(page), size,
vma->vm_page_prot)) {
kfree(priv);
return -EAGAIN;
}
rdma_umap_priv_init(priv, vma);
return 0;
}
EXPORT_SYMBOL(rdma_user_mmap_page);
void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
{
struct rdma_umap_priv *priv, *next_priv;
@ -1094,6 +1054,11 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
goto err;
}
if (!rdma_dev_access_netns(ib_dev, current->nsproxy->net_ns)) {
ret = -EPERM;
goto err;
}
/* In case IB device supports disassociate ucontext, there is no hard
* dependency between uverbs device and its low level device.
*/

View File

@ -40,14 +40,17 @@
#include "uverbs.h"
static int uverbs_free_ah(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
return rdma_destroy_ah((struct ib_ah *)uobject->object,
RDMA_DESTROY_AH_SLEEPABLE);
return rdma_destroy_ah_user((struct ib_ah *)uobject->object,
RDMA_DESTROY_AH_SLEEPABLE,
&attrs->driver_udata);
}
static int uverbs_free_flow(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_flow *flow = (struct ib_flow *)uobject->object;
struct ib_uflow_object *uflow =
@ -66,13 +69,15 @@ static int uverbs_free_flow(struct ib_uobject *uobject,
}
static int uverbs_free_mw(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
return uverbs_dealloc_mw((struct ib_mw *)uobject->object);
}
static int uverbs_free_qp(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_qp *qp = uobject->object;
struct ib_uqp_object *uqp =
@ -93,19 +98,20 @@ static int uverbs_free_qp(struct ib_uobject *uobject,
ib_uverbs_detach_umcast(qp, uqp);
}
ret = ib_destroy_qp(qp);
ret = ib_destroy_qp_user(qp, &attrs->driver_udata);
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
if (uqp->uxrcd)
atomic_dec(&uqp->uxrcd->refcnt);
ib_uverbs_release_uevent(uobject->context->ufile, &uqp->uevent);
ib_uverbs_release_uevent(attrs->ufile, &uqp->uevent);
return ret;
}
static int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_rwq_ind_table *rwq_ind_tbl = uobject->object;
struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
@ -120,23 +126,25 @@ static int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
}
static int uverbs_free_wq(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_wq *wq = uobject->object;
struct ib_uwq_object *uwq =
container_of(uobject, struct ib_uwq_object, uevent.uobject);
int ret;
ret = ib_destroy_wq(wq);
ret = ib_destroy_wq(wq, &attrs->driver_udata);
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
ib_uverbs_release_uevent(uobject->context->ufile, &uwq->uevent);
ib_uverbs_release_uevent(attrs->ufile, &uwq->uevent);
return ret;
}
static int uverbs_free_srq(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_srq *srq = uobject->object;
struct ib_uevent_object *uevent =
@ -144,7 +152,7 @@ static int uverbs_free_srq(struct ib_uobject *uobject,
enum ib_srq_type srq_type = srq->srq_type;
int ret;
ret = ib_destroy_srq(srq);
ret = ib_destroy_srq_user(srq, &attrs->driver_udata);
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
@ -155,12 +163,13 @@ static int uverbs_free_srq(struct ib_uobject *uobject,
atomic_dec(&us->uxrcd->refcnt);
}
ib_uverbs_release_uevent(uobject->context->ufile, uevent);
ib_uverbs_release_uevent(attrs->ufile, uevent);
return ret;
}
static int uverbs_free_xrcd(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_xrcd *xrcd = uobject->object;
struct ib_uxrcd_object *uxrcd =
@ -171,15 +180,16 @@ static int uverbs_free_xrcd(struct ib_uobject *uobject,
if (ret)
return ret;
mutex_lock(&uobject->context->ufile->device->xrcd_tree_mutex);
ret = ib_uverbs_dealloc_xrcd(uobject, xrcd, why);
mutex_unlock(&uobject->context->ufile->device->xrcd_tree_mutex);
mutex_lock(&attrs->ufile->device->xrcd_tree_mutex);
ret = ib_uverbs_dealloc_xrcd(uobject, xrcd, why, attrs);
mutex_unlock(&attrs->ufile->device->xrcd_tree_mutex);
return ret;
}
static int uverbs_free_pd(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_pd *pd = uobject->object;
int ret;
@ -188,7 +198,7 @@ static int uverbs_free_pd(struct ib_uobject *uobject,
if (ret)
return ret;
ib_dealloc_pd(pd);
ib_dealloc_pd_user(pd, &attrs->driver_udata);
return 0;
}

View File

@ -31,11 +31,13 @@
* SOFTWARE.
*/
#include "rdma_core.h"
#include "uverbs.h"
#include <rdma/uverbs_std_types.h>
static int uverbs_free_counters(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_counters *counters = uobject->object;
int ret;
@ -52,7 +54,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_CREATE)(
{
struct ib_uobject *uobj = uverbs_attr_get_uobject(
attrs, UVERBS_ATTR_CREATE_COUNTERS_HANDLE);
struct ib_device *ib_dev = uobj->context->device;
struct ib_device *ib_dev = attrs->context->device;
struct ib_counters *counters;
int ret;

View File

@ -35,7 +35,8 @@
#include "uverbs.h"
static int uverbs_free_cq(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_cq *cq = uobject->object;
struct ib_uverbs_event_queue *ev_queue = cq->cq_context;
@ -43,12 +44,12 @@ static int uverbs_free_cq(struct ib_uobject *uobject,
container_of(uobject, struct ib_ucq_object, uobject);
int ret;
ret = ib_destroy_cq(cq);
ret = ib_destroy_cq_user(cq, &attrs->driver_udata);
if (ib_is_destroy_retryable(ret, why, uobject))
return ret;
ib_uverbs_release_ucq(
uobject->context->ufile,
attrs->ufile,
ev_queue ? container_of(ev_queue,
struct ib_uverbs_completion_event_file,
ev_queue) :
@ -63,7 +64,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
struct ib_ucq_object *obj = container_of(
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_CQ_HANDLE),
typeof(*obj), uobject);
struct ib_device *ib_dev = obj->uobject.context->device;
struct ib_device *ib_dev = attrs->context->device;
int ret;
u64 user_handle;
struct ib_cq_init_attr attr = {};
@ -110,8 +111,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
INIT_LIST_HEAD(&obj->comp_list);
INIT_LIST_HEAD(&obj->async_list);
cq = ib_dev->ops.create_cq(ib_dev, &attr, obj->uobject.context,
&attrs->driver_udata);
cq = ib_dev->ops.create_cq(ib_dev, &attr, &attrs->driver_udata);
if (IS_ERR(cq)) {
ret = PTR_ERR(cq);
goto err_event_file;

View File

@ -30,11 +30,13 @@
* SOFTWARE.
*/
#include "rdma_core.h"
#include "uverbs.h"
#include <rdma/uverbs_std_types.h>
static int uverbs_free_dm(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_dm *dm = uobject->object;
int ret;
@ -43,7 +45,7 @@ static int uverbs_free_dm(struct ib_uobject *uobject,
if (ret)
return ret;
return dm->device->ops.dealloc_dm(dm);
return dm->device->ops.dealloc_dm(dm, attrs);
}
static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(
@ -53,7 +55,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(
struct ib_uobject *uobj =
uverbs_attr_get(attrs, UVERBS_ATTR_ALLOC_DM_HANDLE)
->obj_attr.uobject;
struct ib_device *ib_dev = uobj->context->device;
struct ib_device *ib_dev = attrs->context->device;
struct ib_dm *dm;
int ret;
@ -70,7 +72,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_ALLOC)(
if (ret)
return ret;
dm = ib_dev->ops.alloc_dm(ib_dev, uobj->context, &attr, attrs);
dm = ib_dev->ops.alloc_dm(ib_dev, attrs->context, &attr, attrs);
if (IS_ERR(dm))
return PTR_ERR(dm);

View File

@ -30,11 +30,13 @@
* SOFTWARE.
*/
#include "rdma_core.h"
#include "uverbs.h"
#include <rdma/uverbs_std_types.h>
static int uverbs_free_flow_action(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
struct ib_flow_action *action = uobject->object;
int ret;
@ -308,7 +310,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(
{
struct ib_uobject *uobj = uverbs_attr_get_uobject(
attrs, UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE);
struct ib_device *ib_dev = uobj->context->device;
struct ib_device *ib_dev = attrs->context->device;
int ret;
struct ib_flow_action *action;
struct ib_flow_action_esp_attr esp_attr = {};

View File

@ -30,13 +30,16 @@
* SOFTWARE.
*/
#include "rdma_core.h"
#include "uverbs.h"
#include <rdma/uverbs_std_types.h>
static int uverbs_free_mr(struct ib_uobject *uobject,
enum rdma_remove_reason why)
enum rdma_remove_reason why,
struct uverbs_attr_bundle *attrs)
{
return ib_dereg_mr((struct ib_mr *)uobject->object);
return ib_dereg_mr_user((struct ib_mr *)uobject->object,
&attrs->driver_udata);
}
static int UVERBS_HANDLER(UVERBS_METHOD_ADVISE_MR)(
@ -145,7 +148,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(
return 0;
err_dereg:
ib_dereg_mr(mr);
ib_dereg_mr_user(mr, &attrs->driver_udata);
return ret;
}

View File

@ -218,6 +218,8 @@ rdma_node_get_transport(enum rdma_node_type node_type)
return RDMA_TRANSPORT_USNIC_UDP;
if (node_type == RDMA_NODE_RNIC)
return RDMA_TRANSPORT_IWARP;
if (node_type == RDMA_NODE_UNSPECIFIED)
return RDMA_TRANSPORT_UNSPECIFIED;
return RDMA_TRANSPORT_IB;
}
@ -269,7 +271,7 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
pd->res.type = RDMA_RESTRACK_PD;
rdma_restrack_set_task(&pd->res, caller);
ret = device->ops.alloc_pd(pd, NULL, NULL);
ret = device->ops.alloc_pd(pd, NULL);
if (ret) {
kfree(pd);
return ERR_PTR(ret);
@ -316,17 +318,18 @@ EXPORT_SYMBOL(__ib_alloc_pd);
/**
* ib_dealloc_pd - Deallocates a protection domain.
* @pd: The protection domain to deallocate.
* @udata: Valid user data or NULL for kernel object
*
* It is an error to call this function while any resources in the pd still
* exist. The caller is responsible to synchronously destroy them and
* guarantee no new allocations will happen.
*/
void ib_dealloc_pd(struct ib_pd *pd)
void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
{
int ret;
if (pd->__internal_mr) {
ret = pd->device->ops.dereg_mr(pd->__internal_mr);
ret = pd->device->ops.dereg_mr(pd->__internal_mr, NULL);
WARN_ON(ret);
pd->__internal_mr = NULL;
}
@ -336,10 +339,10 @@ void ib_dealloc_pd(struct ib_pd *pd)
WARN_ON(atomic_read(&pd->usecnt));
rdma_restrack_del(&pd->res);
pd->device->ops.dealloc_pd(pd);
pd->device->ops.dealloc_pd(pd, udata);
kfree(pd);
}
EXPORT_SYMBOL(ib_dealloc_pd);
EXPORT_SYMBOL(ib_dealloc_pd_user);
/* Address handles */
@ -495,25 +498,33 @@ static struct ib_ah *_rdma_create_ah(struct ib_pd *pd,
u32 flags,
struct ib_udata *udata)
{
struct ib_device *device = pd->device;
struct ib_ah *ah;
int ret;
might_sleep_if(flags & RDMA_CREATE_AH_SLEEPABLE);
if (!pd->device->ops.create_ah)
if (!device->ops.create_ah)
return ERR_PTR(-EOPNOTSUPP);
ah = pd->device->ops.create_ah(pd, ah_attr, flags, udata);
ah = rdma_zalloc_drv_obj_gfp(
device, ib_ah,
(flags & RDMA_CREATE_AH_SLEEPABLE) ? GFP_KERNEL : GFP_ATOMIC);
if (!ah)
return ERR_PTR(-ENOMEM);
if (!IS_ERR(ah)) {
ah->device = pd->device;
ah->pd = pd;
ah->uobject = NULL;
ah->type = ah_attr->type;
ah->sgid_attr = rdma_update_sgid_attr(ah_attr, NULL);
ah->device = device;
ah->pd = pd;
ah->type = ah_attr->type;
ah->sgid_attr = rdma_update_sgid_attr(ah_attr, NULL);
atomic_inc(&pd->usecnt);
ret = device->ops.create_ah(ah, ah_attr, flags, udata);
if (ret) {
kfree(ah);
return ERR_PTR(ret);
}
atomic_inc(&pd->usecnt);
return ah;
}
@ -930,25 +941,24 @@ int rdma_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr)
}
EXPORT_SYMBOL(rdma_query_ah);
int rdma_destroy_ah(struct ib_ah *ah, u32 flags)
int rdma_destroy_ah_user(struct ib_ah *ah, u32 flags, struct ib_udata *udata)
{
const struct ib_gid_attr *sgid_attr = ah->sgid_attr;
struct ib_pd *pd;
int ret;
might_sleep_if(flags & RDMA_DESTROY_AH_SLEEPABLE);
pd = ah->pd;
ret = ah->device->ops.destroy_ah(ah, flags);
if (!ret) {
atomic_dec(&pd->usecnt);
if (sgid_attr)
rdma_put_gid_attr(sgid_attr);
}
return ret;
ah->device->ops.destroy_ah(ah, flags);
atomic_dec(&pd->usecnt);
if (sgid_attr)
rdma_put_gid_attr(sgid_attr);
kfree(ah);
return 0;
}
EXPORT_SYMBOL(rdma_destroy_ah);
EXPORT_SYMBOL(rdma_destroy_ah_user);
/* Shared receive queues */
@ -956,29 +966,40 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *srq_init_attr)
{
struct ib_srq *srq;
int ret;
if (!pd->device->ops.create_srq)
return ERR_PTR(-EOPNOTSUPP);
srq = pd->device->ops.create_srq(pd, srq_init_attr, NULL);
srq = rdma_zalloc_drv_obj(pd->device, ib_srq);
if (!srq)
return ERR_PTR(-ENOMEM);
if (!IS_ERR(srq)) {
srq->device = pd->device;
srq->pd = pd;
srq->uobject = NULL;
srq->event_handler = srq_init_attr->event_handler;
srq->srq_context = srq_init_attr->srq_context;
srq->srq_type = srq_init_attr->srq_type;
if (ib_srq_has_cq(srq->srq_type)) {
srq->ext.cq = srq_init_attr->ext.cq;
atomic_inc(&srq->ext.cq->usecnt);
}
if (srq->srq_type == IB_SRQT_XRC) {
srq->ext.xrc.xrcd = srq_init_attr->ext.xrc.xrcd;
atomic_inc(&srq->ext.xrc.xrcd->usecnt);
}
atomic_inc(&pd->usecnt);
atomic_set(&srq->usecnt, 0);
srq->device = pd->device;
srq->pd = pd;
srq->event_handler = srq_init_attr->event_handler;
srq->srq_context = srq_init_attr->srq_context;
srq->srq_type = srq_init_attr->srq_type;
if (ib_srq_has_cq(srq->srq_type)) {
srq->ext.cq = srq_init_attr->ext.cq;
atomic_inc(&srq->ext.cq->usecnt);
}
if (srq->srq_type == IB_SRQT_XRC) {
srq->ext.xrc.xrcd = srq_init_attr->ext.xrc.xrcd;
atomic_inc(&srq->ext.xrc.xrcd->usecnt);
}
atomic_inc(&pd->usecnt);
ret = pd->device->ops.create_srq(srq, srq_init_attr, NULL);
if (ret) {
atomic_dec(&srq->pd->usecnt);
if (srq->srq_type == IB_SRQT_XRC)
atomic_dec(&srq->ext.xrc.xrcd->usecnt);
if (ib_srq_has_cq(srq->srq_type))
atomic_dec(&srq->ext.cq->usecnt);
kfree(srq);
return ERR_PTR(ret);
}
return srq;
@ -1003,36 +1024,23 @@ int ib_query_srq(struct ib_srq *srq,
}
EXPORT_SYMBOL(ib_query_srq);
int ib_destroy_srq(struct ib_srq *srq)
int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata)
{
struct ib_pd *pd;
enum ib_srq_type srq_type;
struct ib_xrcd *uninitialized_var(xrcd);
struct ib_cq *uninitialized_var(cq);
int ret;
if (atomic_read(&srq->usecnt))
return -EBUSY;
pd = srq->pd;
srq_type = srq->srq_type;
if (ib_srq_has_cq(srq_type))
cq = srq->ext.cq;
if (srq_type == IB_SRQT_XRC)
xrcd = srq->ext.xrc.xrcd;
srq->device->ops.destroy_srq(srq, udata);
ret = srq->device->ops.destroy_srq(srq);
if (!ret) {
atomic_dec(&pd->usecnt);
if (srq_type == IB_SRQT_XRC)
atomic_dec(&xrcd->usecnt);
if (ib_srq_has_cq(srq_type))
atomic_dec(&cq->usecnt);
}
atomic_dec(&srq->pd->usecnt);
if (srq->srq_type == IB_SRQT_XRC)
atomic_dec(&srq->ext.xrc.xrcd->usecnt);
if (ib_srq_has_cq(srq->srq_type))
atomic_dec(&srq->ext.cq->usecnt);
kfree(srq);
return ret;
return 0;
}
EXPORT_SYMBOL(ib_destroy_srq);
EXPORT_SYMBOL(ib_destroy_srq_user);
/* Queue pairs */
@ -1111,8 +1119,9 @@ struct ib_qp *ib_open_qp(struct ib_xrcd *xrcd,
}
EXPORT_SYMBOL(ib_open_qp);
static struct ib_qp *create_xrc_qp(struct ib_qp *qp,
struct ib_qp_init_attr *qp_init_attr)
static struct ib_qp *create_xrc_qp_user(struct ib_qp *qp,
struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata)
{
struct ib_qp *real_qp = qp;
@ -1134,8 +1143,9 @@ static struct ib_qp *create_xrc_qp(struct ib_qp *qp,
return qp;
}
struct ib_qp *ib_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *qp_init_attr)
struct ib_qp *ib_create_qp_user(struct ib_pd *pd,
struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata)
{
struct ib_device *device = pd ? pd->device : qp_init_attr->xrcd->device;
struct ib_qp *qp;
@ -1164,7 +1174,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
if (ret)
goto err;
qp->real_qp = qp;
qp->qp_type = qp_init_attr->qp_type;
qp->rwq_ind_tbl = qp_init_attr->rwq_ind_tbl;
@ -1176,7 +1185,8 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
qp->port = 0;
if (qp_init_attr->qp_type == IB_QPT_XRC_TGT) {
struct ib_qp *xrc_qp = create_xrc_qp(qp, qp_init_attr);
struct ib_qp *xrc_qp =
create_xrc_qp_user(qp, qp_init_attr, udata);
if (IS_ERR(xrc_qp)) {
ret = PTR_ERR(xrc_qp);
@ -1230,7 +1240,7 @@ err:
return ERR_PTR(ret);
}
EXPORT_SYMBOL(ib_create_qp);
EXPORT_SYMBOL(ib_create_qp_user);
static const struct {
int valid;
@ -1837,7 +1847,7 @@ static int __ib_destroy_shared_qp(struct ib_qp *qp)
return 0;
}
int ib_destroy_qp(struct ib_qp *qp)
int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
{
const struct ib_gid_attr *alt_path_sgid_attr = qp->alt_path_sgid_attr;
const struct ib_gid_attr *av_sgid_attr = qp->av_sgid_attr;
@ -1869,7 +1879,7 @@ int ib_destroy_qp(struct ib_qp *qp)
rdma_rw_cleanup_mrs(qp);
rdma_restrack_del(&qp->res);
ret = qp->device->ops.destroy_qp(qp);
ret = qp->device->ops.destroy_qp(qp, udata);
if (!ret) {
if (alt_path_sgid_attr)
rdma_put_gid_attr(alt_path_sgid_attr);
@ -1894,7 +1904,7 @@ int ib_destroy_qp(struct ib_qp *qp)
return ret;
}
EXPORT_SYMBOL(ib_destroy_qp);
EXPORT_SYMBOL(ib_destroy_qp_user);
/* Completion queues */
@ -1907,7 +1917,7 @@ struct ib_cq *__ib_create_cq(struct ib_device *device,
{
struct ib_cq *cq;
cq = device->ops.create_cq(device, cq_attr, NULL, NULL);
cq = device->ops.create_cq(device, cq_attr, NULL);
if (!IS_ERR(cq)) {
cq->device = device;
@ -1933,15 +1943,15 @@ int rdma_set_cq_moderation(struct ib_cq *cq, u16 cq_count, u16 cq_period)
}
EXPORT_SYMBOL(rdma_set_cq_moderation);
int ib_destroy_cq(struct ib_cq *cq)
int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
{
if (atomic_read(&cq->usecnt))
return -EBUSY;
rdma_restrack_del(&cq->res);
return cq->device->ops.destroy_cq(cq);
return cq->device->ops.destroy_cq(cq, udata);
}
EXPORT_SYMBOL(ib_destroy_cq);
EXPORT_SYMBOL(ib_destroy_cq_user);
int ib_resize_cq(struct ib_cq *cq, int cqe)
{
@ -1952,14 +1962,14 @@ EXPORT_SYMBOL(ib_resize_cq);
/* Memory regions */
int ib_dereg_mr(struct ib_mr *mr)
int ib_dereg_mr_user(struct ib_mr *mr, struct ib_udata *udata)
{
struct ib_pd *pd = mr->pd;
struct ib_dm *dm = mr->dm;
int ret;
rdma_restrack_del(&mr->res);
ret = mr->device->ops.dereg_mr(mr);
ret = mr->device->ops.dereg_mr(mr, udata);
if (!ret) {
atomic_dec(&pd->usecnt);
if (dm)
@ -1968,13 +1978,14 @@ int ib_dereg_mr(struct ib_mr *mr)
return ret;
}
EXPORT_SYMBOL(ib_dereg_mr);
EXPORT_SYMBOL(ib_dereg_mr_user);
/**
* ib_alloc_mr() - Allocates a memory region
* @pd: protection domain associated with the region
* @mr_type: memory region type
* @max_num_sg: maximum sg entries available for registration.
* @udata: user data or null for kernel objects
*
* Notes:
* Memory registeration page/sg lists must not exceed max_num_sg.
@ -1982,16 +1993,15 @@ EXPORT_SYMBOL(ib_dereg_mr);
* max_num_sg * used_page_size.
*
*/
struct ib_mr *ib_alloc_mr(struct ib_pd *pd,
enum ib_mr_type mr_type,
u32 max_num_sg)
struct ib_mr *ib_alloc_mr_user(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg, struct ib_udata *udata)
{
struct ib_mr *mr;
if (!pd->device->ops.alloc_mr)
return ERR_PTR(-EOPNOTSUPP);
mr = pd->device->ops.alloc_mr(pd, mr_type, max_num_sg);
mr = pd->device->ops.alloc_mr(pd, mr_type, max_num_sg, udata);
if (!IS_ERR(mr)) {
mr->device = pd->device;
mr->pd = pd;
@ -2005,7 +2015,7 @@ struct ib_mr *ib_alloc_mr(struct ib_pd *pd,
return mr;
}
EXPORT_SYMBOL(ib_alloc_mr);
EXPORT_SYMBOL(ib_alloc_mr_user);
/* "Fast" memory regions */
@ -2138,7 +2148,7 @@ struct ib_xrcd *__ib_alloc_xrcd(struct ib_device *device, const char *caller)
if (!device->ops.alloc_xrcd)
return ERR_PTR(-EOPNOTSUPP);
xrcd = device->ops.alloc_xrcd(device, NULL, NULL);
xrcd = device->ops.alloc_xrcd(device, NULL);
if (!IS_ERR(xrcd)) {
xrcd->device = device;
xrcd->inode = NULL;
@ -2151,7 +2161,7 @@ struct ib_xrcd *__ib_alloc_xrcd(struct ib_device *device, const char *caller)
}
EXPORT_SYMBOL(__ib_alloc_xrcd);
int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
int ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
{
struct ib_qp *qp;
int ret;
@ -2166,7 +2176,7 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
return ret;
}
return xrcd->device->ops.dealloc_xrcd(xrcd);
return xrcd->device->ops.dealloc_xrcd(xrcd, udata);
}
EXPORT_SYMBOL(ib_dealloc_xrcd);
@ -2210,10 +2220,11 @@ struct ib_wq *ib_create_wq(struct ib_pd *pd,
EXPORT_SYMBOL(ib_create_wq);
/**
* ib_destroy_wq - Destroys the specified WQ.
* ib_destroy_wq - Destroys the specified user WQ.
* @wq: The WQ to destroy.
* @udata: Valid user data
*/
int ib_destroy_wq(struct ib_wq *wq)
int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
{
int err;
struct ib_cq *cq = wq->cq;
@ -2222,7 +2233,7 @@ int ib_destroy_wq(struct ib_wq *wq)
if (atomic_read(&wq->usecnt))
return -EBUSY;
err = wq->device->ops.destroy_wq(wq);
err = wq->device->ops.destroy_wq(wq, udata);
if (!err) {
atomic_dec(&pd->usecnt);
atomic_dec(&cq->usecnt);
@ -2701,3 +2712,37 @@ int rdma_init_netdev(struct ib_device *device, u8 port_num,
netdev, params.param);
}
EXPORT_SYMBOL(rdma_init_netdev);
void __rdma_block_iter_start(struct ib_block_iter *biter,
struct scatterlist *sglist, unsigned int nents,
unsigned long pgsz)
{
memset(biter, 0, sizeof(struct ib_block_iter));
biter->__sg = sglist;
biter->__sg_nents = nents;
/* Driver provides best block size to use */
biter->__pg_bit = __fls(pgsz);
}
EXPORT_SYMBOL(__rdma_block_iter_start);
bool __rdma_block_iter_next(struct ib_block_iter *biter)
{
unsigned int block_offset;
if (!biter->__sg_nents || !biter->__sg)
return false;
biter->__dma_addr = sg_dma_address(biter->__sg) + biter->__sg_advance;
block_offset = biter->__dma_addr & (BIT_ULL(biter->__pg_bit) - 1);
biter->__sg_advance += BIT_ULL(biter->__pg_bit) - block_offset;
if (biter->__sg_advance >= sg_dma_len(biter->__sg)) {
biter->__sg_advance = 0;
biter->__sg = sg_next(biter->__sg);
biter->__sg_nents--;
}
return true;
}
EXPORT_SYMBOL(__rdma_block_iter_next);

View File

@ -3,6 +3,7 @@ obj-$(CONFIG_INFINIBAND_MTHCA) += mthca/
obj-$(CONFIG_INFINIBAND_QIB) += qib/
obj-$(CONFIG_INFINIBAND_CXGB3) += cxgb3/
obj-$(CONFIG_INFINIBAND_CXGB4) += cxgb4/
obj-$(CONFIG_INFINIBAND_EFA) += efa/
obj-$(CONFIG_INFINIBAND_I40IW) += i40iw/
obj-$(CONFIG_MLX4_INFINIBAND) += mlx4/
obj-$(CONFIG_MLX5_INFINIBAND) += mlx5/

View File

@ -1,10 +1,10 @@
config INFINIBAND_BNXT_RE
tristate "Broadcom Netxtreme HCA support"
depends on 64BIT
depends on ETHERNET && NETDEVICES && PCI && INET && DCB
select NET_VENDOR_BROADCOM
select BNXT
---help---
tristate "Broadcom Netxtreme HCA support"
depends on 64BIT
depends on ETHERNET && NETDEVICES && PCI && INET && DCB
select NET_VENDOR_BROADCOM
select BNXT
---help---
This driver supports Broadcom NetXtreme-E 10/25/40/50 gigabit
RoCE HCAs. To compile this driver as a module, choose M here:
the module will be called bnxt_re.

View File

@ -119,21 +119,6 @@ static int bnxt_re_build_sgl(struct ib_sge *ib_sg_list,
}
/* Device */
struct net_device *bnxt_re_get_netdev(struct ib_device *ibdev, u8 port_num)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
struct net_device *netdev = NULL;
rcu_read_lock();
if (rdev)
netdev = rdev->netdev;
if (netdev)
dev_hold(netdev);
rcu_read_unlock();
return netdev;
}
int bnxt_re_query_device(struct ib_device *ibdev,
struct ib_device_attr *ib_attr,
struct ib_udata *udata)
@ -375,8 +360,9 @@ int bnxt_re_add_gid(const struct ib_gid_attr *attr, void **context)
struct bnxt_re_dev *rdev = to_bnxt_re_dev(attr->device, ibdev);
struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl;
if ((attr->ndev) && is_vlan_dev(attr->ndev))
vlan_id = vlan_dev_vlan_id(attr->ndev);
rc = rdma_read_gid_l2_fields(attr, &vlan_id, NULL);
if (rc)
return rc;
rc = bnxt_qplib_add_sgid(sgid_tbl, (struct bnxt_qplib_gid *)&attr->gid,
rdev->qplib_res.netdev->dev_addr,
@ -564,7 +550,7 @@ fail:
}
/* Protection Domains */
void bnxt_re_dealloc_pd(struct ib_pd *ib_pd)
void bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata)
{
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct bnxt_re_dev *rdev = pd->rdev;
@ -576,14 +562,12 @@ void bnxt_re_dealloc_pd(struct ib_pd *ib_pd)
&pd->qplib_pd);
}
int bnxt_re_alloc_pd(struct ib_pd *ibpd, struct ib_ucontext *ucontext,
struct ib_udata *udata)
int bnxt_re_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
{
struct ib_device *ibdev = ibpd->device;
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
struct bnxt_re_ucontext *ucntx = container_of(ucontext,
struct bnxt_re_ucontext,
ib_uctx);
struct bnxt_re_ucontext *ucntx = rdma_udata_to_drv_context(
udata, struct bnxt_re_ucontext, ib_uctx);
struct bnxt_re_pd *pd = container_of(ibpd, struct bnxt_re_pd, ib_pd);
int rc;
@ -635,20 +619,13 @@ fail:
}
/* Address Handles */
int bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags)
void bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags)
{
struct bnxt_re_ah *ah = container_of(ib_ah, struct bnxt_re_ah, ib_ah);
struct bnxt_re_dev *rdev = ah->rdev;
int rc;
rc = bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah,
!(flags & RDMA_DESTROY_AH_SLEEPABLE));
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to destroy HW AH");
return rc;
}
kfree(ah);
return 0;
bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah,
!(flags & RDMA_DESTROY_AH_SLEEPABLE));
}
static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype)
@ -669,26 +646,22 @@ static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype)
return nw_type;
}
struct ib_ah *bnxt_re_create_ah(struct ib_pd *ib_pd,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct ib_udata *udata)
int bnxt_re_create_ah(struct ib_ah *ib_ah, struct rdma_ah_attr *ah_attr,
u32 flags, struct ib_udata *udata)
{
struct ib_pd *ib_pd = ib_ah->pd;
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
struct bnxt_re_dev *rdev = pd->rdev;
const struct ib_gid_attr *sgid_attr;
struct bnxt_re_ah *ah;
struct bnxt_re_ah *ah = container_of(ib_ah, struct bnxt_re_ah, ib_ah);
u8 nw_type;
int rc;
if (!(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH)) {
dev_err(rdev_to_dev(rdev), "Failed to alloc AH: GRH not set");
return ERR_PTR(-EINVAL);
return -EINVAL;
}
ah = kzalloc(sizeof(*ah), GFP_ATOMIC);
if (!ah)
return ERR_PTR(-ENOMEM);
ah->rdev = rdev;
ah->qplib_ah.pd = &pd->qplib_pd;
@ -718,7 +691,7 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *ib_pd,
!(flags & RDMA_CREATE_AH_SLEEPABLE));
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to allocate HW AH");
goto fail;
return rc;
}
/* Write AVID to shared page. */
@ -735,11 +708,7 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *ib_pd,
spin_unlock_irqrestore(&uctx->sh_lock, flag);
}
return &ah->ib_ah;
fail:
kfree(ah);
return ERR_PTR(rc);
return 0;
}
int bnxt_re_modify_ah(struct ib_ah *ib_ah, struct rdma_ah_attr *ah_attr)
@ -789,7 +758,7 @@ void bnxt_re_unlock_cqs(struct bnxt_re_qp *qp,
}
/* Queue Pairs */
int bnxt_re_destroy_qp(struct ib_qp *ib_qp)
int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
{
struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
struct bnxt_re_dev *rdev = qp->rdev;
@ -812,13 +781,8 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp)
bnxt_qplib_free_qp_res(&rdev->qplib_res, &qp->qplib_qp);
if (ib_qp->qp_type == IB_QPT_GSI && rdev->qp1_sqp) {
rc = bnxt_qplib_destroy_ah(&rdev->qplib_res,
&rdev->sqp_ah->qplib_ah, false);
if (rc) {
dev_err(rdev_to_dev(rdev),
"Failed to destroy HW AH for shadow QP");
return rc;
}
bnxt_qplib_destroy_ah(&rdev->qplib_res, &rdev->sqp_ah->qplib_ah,
false);
bnxt_qplib_clean_qp(&qp->qplib_qp);
rc = bnxt_qplib_destroy_qp(&rdev->qplib_res,
@ -895,8 +859,9 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
return PTR_ERR(umem);
qp->sumem = umem;
qplib_qp->sq.sglist = umem->sg_head.sgl;
qplib_qp->sq.nmap = umem->nmap;
qplib_qp->sq.sg_info.sglist = umem->sg_head.sgl;
qplib_qp->sq.sg_info.npages = ib_umem_num_pages(umem);
qplib_qp->sq.sg_info.nmap = umem->nmap;
qplib_qp->qp_handle = ureq.qp_handle;
if (!qp->qplib_qp.srq) {
@ -907,8 +872,9 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
if (IS_ERR(umem))
goto rqfail;
qp->rumem = umem;
qplib_qp->rq.sglist = umem->sg_head.sgl;
qplib_qp->rq.nmap = umem->nmap;
qplib_qp->rq.sg_info.sglist = umem->sg_head.sgl;
qplib_qp->rq.sg_info.npages = ib_umem_num_pages(umem);
qplib_qp->rq.sg_info.nmap = umem->nmap;
}
qplib_qp->dpi = &cntx->dpi;
@ -916,8 +882,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
rqfail:
ib_umem_release(qp->sumem);
qp->sumem = NULL;
qplib_qp->sq.sglist = NULL;
qplib_qp->sq.nmap = 0;
memset(&qplib_qp->sq.sg_info, 0, sizeof(qplib_qp->sq.sg_info));
return PTR_ERR(umem);
}
@ -1326,30 +1291,22 @@ static enum ib_mtu __to_ib_mtu(u32 mtu)
}
/* Shared Receive Queues */
int bnxt_re_destroy_srq(struct ib_srq *ib_srq)
void bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
struct bnxt_re_dev *rdev = srq->rdev;
struct bnxt_qplib_srq *qplib_srq = &srq->qplib_srq;
struct bnxt_qplib_nq *nq = NULL;
int rc;
if (qplib_srq->cq)
nq = qplib_srq->cq->nq;
rc = bnxt_qplib_destroy_srq(&rdev->qplib_res, qplib_srq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Destroy HW SRQ failed!");
return rc;
}
bnxt_qplib_destroy_srq(&rdev->qplib_res, qplib_srq);
if (srq->umem)
ib_umem_release(srq->umem);
kfree(srq);
atomic_dec(&rdev->srq_count);
if (nq)
nq->budget--;
return 0;
}
static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev,
@ -1374,22 +1331,25 @@ static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev,
return PTR_ERR(umem);
srq->umem = umem;
qplib_srq->nmap = umem->nmap;
qplib_srq->sglist = umem->sg_head.sgl;
qplib_srq->sg_info.sglist = umem->sg_head.sgl;
qplib_srq->sg_info.npages = ib_umem_num_pages(umem);
qplib_srq->sg_info.nmap = umem->nmap;
qplib_srq->srq_handle = ureq.srq_handle;
qplib_srq->dpi = &cntx->dpi;
return 0;
}
struct ib_srq *bnxt_re_create_srq(struct ib_pd *ib_pd,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata)
int bnxt_re_create_srq(struct ib_srq *ib_srq,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata)
{
struct ib_pd *ib_pd = ib_srq->pd;
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct bnxt_re_dev *rdev = pd->rdev;
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_re_srq *srq;
struct bnxt_re_srq *srq =
container_of(ib_srq, struct bnxt_re_srq, ib_srq);
struct bnxt_qplib_nq *nq = NULL;
int rc, entries;
@ -1404,11 +1364,6 @@ struct ib_srq *bnxt_re_create_srq(struct ib_pd *ib_pd,
goto exit;
}
srq = kzalloc(sizeof(*srq), GFP_KERNEL);
if (!srq) {
rc = -ENOMEM;
goto exit;
}
srq->rdev = rdev;
srq->qplib_srq.pd = &pd->qplib_pd;
srq->qplib_srq.dpi = &rdev->dpi_privileged;
@ -1454,14 +1409,13 @@ struct ib_srq *bnxt_re_create_srq(struct ib_pd *ib_pd,
nq->budget++;
atomic_inc(&rdev->srq_count);
return &srq->ib_srq;
return 0;
fail:
if (srq->umem)
ib_umem_release(srq->umem);
kfree(srq);
exit:
return ERR_PTR(rc);
return rc;
}
int bnxt_re_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *srq_attr,
@ -1684,8 +1638,11 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
qp_attr->ah_attr.roce.dmac);
sgid_attr = qp_attr->ah_attr.grh.sgid_attr;
memcpy(qp->qplib_qp.smac, sgid_attr->ndev->dev_addr,
ETH_ALEN);
rc = rdma_read_gid_l2_fields(sgid_attr, NULL,
&qp->qplib_qp.smac[0]);
if (rc)
return rc;
nw_type = rdma_gid_attr_network_type(sgid_attr);
switch (nw_type) {
case RDMA_NETWORK_IPV4:
@ -1904,8 +1861,10 @@ static int bnxt_re_build_qp1_send_v2(struct bnxt_re_qp *qp,
memset(&qp->qp1_hdr, 0, sizeof(qp->qp1_hdr));
if (is_vlan_dev(sgid_attr->ndev))
vlan_id = vlan_dev_vlan_id(sgid_attr->ndev);
rc = rdma_read_gid_l2_fields(sgid_attr, &vlan_id, NULL);
if (rc)
return rc;
/* Get network header type for this GID */
nw_type = rdma_gid_attr_network_type(sgid_attr);
switch (nw_type) {
@ -2558,7 +2517,7 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
}
/* Completion Queues */
int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
int rc;
struct bnxt_re_cq *cq;
@ -2587,7 +2546,6 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
struct ib_udata *udata)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
@ -2614,12 +2572,10 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
if (entries > dev_attr->max_cq_wqes + 1)
entries = dev_attr->max_cq_wqes + 1;
if (context) {
if (udata) {
struct bnxt_re_cq_req req;
struct bnxt_re_ucontext *uctx = container_of
(context,
struct bnxt_re_ucontext,
ib_uctx);
struct bnxt_re_ucontext *uctx = rdma_udata_to_drv_context(
udata, struct bnxt_re_ucontext, ib_uctx);
if (ib_copy_from_udata(&req, udata, sizeof(req))) {
rc = -EFAULT;
goto fail;
@ -2632,8 +2588,9 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
rc = PTR_ERR(cq->umem);
goto fail;
}
cq->qplib_cq.sghead = cq->umem->sg_head.sgl;
cq->qplib_cq.nmap = cq->umem->nmap;
cq->qplib_cq.sg_info.sglist = cq->umem->sg_head.sgl;
cq->qplib_cq.sg_info.npages = ib_umem_num_pages(cq->umem);
cq->qplib_cq.sg_info.nmap = cq->umem->nmap;
cq->qplib_cq.dpi = &uctx->dpi;
} else {
cq->max_cql = min_t(u32, entries, MAX_CQL_PER_POLL);
@ -2645,8 +2602,6 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
}
cq->qplib_cq.dpi = &rdev->dpi_privileged;
cq->qplib_cq.sghead = NULL;
cq->qplib_cq.nmap = 0;
}
/*
* Allocating the NQ in a round robin fashion. nq_alloc_cnt is a
@ -2671,7 +2626,7 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
atomic_inc(&rdev->cq_count);
spin_lock_init(&cq->cq_lock);
if (context) {
if (udata) {
struct bnxt_re_cq_resp resp;
resp.cqid = cq->qplib_cq.id;
@ -2689,7 +2644,7 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
return &cq->ib_cq;
c2fail:
if (context)
if (udata)
ib_umem_release(cq->umem);
fail:
kfree(cq->cql);
@ -3381,7 +3336,7 @@ fail:
return ERR_PTR(rc);
}
int bnxt_re_dereg_mr(struct ib_mr *ib_mr)
int bnxt_re_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
{
struct bnxt_re_mr *mr = container_of(ib_mr, struct bnxt_re_mr, ib_mr);
struct bnxt_re_dev *rdev = mr->rdev;
@ -3427,7 +3382,7 @@ int bnxt_re_map_mr_sg(struct ib_mr *ib_mr, struct scatterlist *sg, int sg_nents,
}
struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type type,
u32 max_num_sg)
u32 max_num_sg, struct ib_udata *udata)
{
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct bnxt_re_dev *rdev = pd->rdev;
@ -3552,17 +3507,12 @@ static int fill_umem_pbl_tbl(struct ib_umem *umem, u64 *pbl_tbl_orig,
int page_shift)
{
u64 *pbl_tbl = pbl_tbl_orig;
u64 paddr;
u64 page_mask = (1ULL << page_shift) - 1;
struct sg_dma_page_iter sg_iter;
u64 page_size = BIT_ULL(page_shift);
struct ib_block_iter biter;
rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, page_size)
*pbl_tbl++ = rdma_block_iter_dma_address(&biter);
for_each_sg_dma_page (umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
paddr = sg_page_iter_dma_address(&sg_iter);
if (pbl_tbl == pbl_tbl_orig)
*pbl_tbl++ = paddr & ~page_mask;
else if ((paddr & page_mask) == 0)
*pbl_tbl++ = paddr;
}
return pbl_tbl - pbl_tbl_orig;
}
@ -3624,7 +3574,9 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
goto free_umem;
}
page_shift = PAGE_SHIFT;
page_shift = __ffs(ib_umem_find_best_pgsz(umem,
BNXT_RE_PAGE_SIZE_4K | BNXT_RE_PAGE_SIZE_2M,
virt_addr));
if (!bnxt_re_page_size_ok(page_shift)) {
dev_err(rdev_to_dev(rdev), "umem page size unsupported!");
@ -3632,17 +3584,13 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
goto fail;
}
if (!umem->hugetlb && length > BNXT_RE_MAX_MR_SIZE_LOW) {
if (page_shift == BNXT_RE_PAGE_SHIFT_4K &&
length > BNXT_RE_MAX_MR_SIZE_LOW) {
dev_err(rdev_to_dev(rdev), "Requested MR Sz:%llu Max sup:%llu",
length, (u64)BNXT_RE_MAX_MR_SIZE_LOW);
rc = -EINVAL;
goto fail;
}
if (umem->hugetlb && length > BNXT_RE_PAGE_SIZE_2M) {
page_shift = BNXT_RE_PAGE_SHIFT_2M;
dev_warn(rdev_to_dev(rdev), "umem hugetlb set page_size %x",
1 << page_shift);
}
/* Map umem buf ptrs to the PBL */
umem_pgs = fill_umem_pbl_tbl(umem, pbl_tbl, page_shift);
@ -3709,7 +3657,7 @@ int bnxt_re_alloc_ucontext(struct ib_ucontext *ctx, struct ib_udata *udata)
resp.chip_id0 = chip_met_rev_num;
/* Future extension of chip info */
resp.chip_id1 = 0;
/*Temp, Use idr_alloc instead */
/*Temp, Use xa_alloc instead */
resp.dev_id = rdev->en_dev->pdev->devfn;
resp.max_qp = rdev->qplib_ctx.qpc_count;
resp.pg_size = PAGE_SIZE;

View File

@ -63,15 +63,15 @@ struct bnxt_re_pd {
};
struct bnxt_re_ah {
struct bnxt_re_dev *rdev;
struct ib_ah ib_ah;
struct bnxt_re_dev *rdev;
struct bnxt_qplib_ah qplib_ah;
};
struct bnxt_re_srq {
struct ib_srq ib_srq;
struct bnxt_re_dev *rdev;
u32 srq_limit;
struct ib_srq ib_srq;
struct bnxt_qplib_srq qplib_srq;
struct ib_umem *umem;
spinlock_t lock; /* protect srq */
@ -142,8 +142,6 @@ struct bnxt_re_ucontext {
spinlock_t sh_lock; /* protect shpg */
};
struct net_device *bnxt_re_get_netdev(struct ib_device *ibdev, u8 port_num);
int bnxt_re_query_device(struct ib_device *ibdev,
struct ib_device_attr *ib_attr,
struct ib_udata *udata);
@ -163,24 +161,21 @@ int bnxt_re_query_gid(struct ib_device *ibdev, u8 port_num,
int index, union ib_gid *gid);
enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev,
u8 port_num);
int bnxt_re_alloc_pd(struct ib_pd *pd, struct ib_ucontext *context,
struct ib_udata *udata);
void bnxt_re_dealloc_pd(struct ib_pd *pd);
struct ib_ah *bnxt_re_create_ah(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct ib_udata *udata);
int bnxt_re_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
void bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr, u32 flags,
struct ib_udata *udata);
int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
int bnxt_re_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
int bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags);
struct ib_srq *bnxt_re_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
void bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags);
int bnxt_re_create_srq(struct ib_srq *srq,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
int bnxt_re_modify_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata);
int bnxt_re_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
int bnxt_re_destroy_srq(struct ib_srq *srq);
void bnxt_re_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
int bnxt_re_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd,
@ -190,16 +185,15 @@ int bnxt_re_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_udata *udata);
int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
int bnxt_re_destroy_qp(struct ib_qp *qp);
int bnxt_re_destroy_qp(struct ib_qp *qp, struct ib_udata *udata);
int bnxt_re_post_send(struct ib_qp *qp, const struct ib_send_wr *send_wr,
const struct ib_send_wr **bad_send_wr);
int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
struct ib_udata *udata);
int bnxt_re_destroy_cq(struct ib_cq *cq);
int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
int bnxt_re_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
@ -207,8 +201,8 @@ struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
int bnxt_re_map_mr_sg(struct ib_mr *ib_mr, struct scatterlist *sg, int sg_nents,
unsigned int *sg_offset);
struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type mr_type,
u32 max_num_sg);
int bnxt_re_dereg_mr(struct ib_mr *mr);
u32 max_num_sg, struct ib_udata *udata);
int bnxt_re_dereg_mr(struct ib_mr *mr, struct ib_udata *udata);
struct ib_mw *bnxt_re_alloc_mw(struct ib_pd *ib_pd, enum ib_mw_type type,
struct ib_udata *udata);
int bnxt_re_dealloc_mw(struct ib_mw *mw);

View File

@ -617,7 +617,6 @@ static const struct ib_device_ops bnxt_re_dev_ops = {
.get_dma_mr = bnxt_re_get_dma_mr,
.get_hw_stats = bnxt_re_ib_get_hw_stats,
.get_link_layer = bnxt_re_get_link_layer,
.get_netdev = bnxt_re_get_netdev,
.get_port_immutable = bnxt_re_get_port_immutable,
.map_mr_sg = bnxt_re_map_mr_sg,
.mmap = bnxt_re_mmap,
@ -637,13 +636,16 @@ static const struct ib_device_ops bnxt_re_dev_ops = {
.query_srq = bnxt_re_query_srq,
.reg_user_mr = bnxt_re_reg_user_mr,
.req_notify_cq = bnxt_re_req_notify_cq,
INIT_RDMA_OBJ_SIZE(ib_ah, bnxt_re_ah, ib_ah),
INIT_RDMA_OBJ_SIZE(ib_pd, bnxt_re_pd, ib_pd),
INIT_RDMA_OBJ_SIZE(ib_srq, bnxt_re_srq, ib_srq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, bnxt_re_ucontext, ib_uctx),
};
static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
{
struct ib_device *ibdev = &rdev->ibdev;
int ret;
/* ib device init */
ibdev->owner = THIS_MODULE;
@ -691,6 +693,10 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
rdma_set_device_sysfs_group(ibdev, &bnxt_re_dev_attr_group);
ibdev->driver_id = RDMA_DRIVER_BNXT_RE;
ib_set_device_ops(ibdev, &bnxt_re_dev_ops);
ret = ib_device_set_netdev(&rdev->ibdev, rdev->netdev, 1);
if (ret)
return ret;
return ib_register_device(ibdev, "bnxt_re%d");
}

View File

@ -478,7 +478,7 @@ int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq)
nq->hwq.max_elements > BNXT_QPLIB_NQE_MAX_CNT)
nq->hwq.max_elements = BNXT_QPLIB_NQE_MAX_CNT;
hwq_type = bnxt_qplib_get_hwq_type(nq->res);
if (bnxt_qplib_alloc_init_hwq(nq->pdev, &nq->hwq, NULL, 0,
if (bnxt_qplib_alloc_init_hwq(nq->pdev, &nq->hwq, NULL,
&nq->hwq.max_elements,
BNXT_QPLIB_MAX_NQE_ENTRY_SIZE, 0,
PAGE_SIZE, hwq_type))
@ -507,7 +507,7 @@ static void bnxt_qplib_arm_srq(struct bnxt_qplib_srq *srq, u32 arm_type)
writeq(val, db);
}
int bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
void bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq)
{
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
@ -521,14 +521,12 @@ int bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
/* Configure the request */
req.srq_cid = cpu_to_le32(srq->id);
rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
(void *)&resp, NULL, 0);
if (rc)
return rc;
bnxt_qplib_free_hwq(res->pdev, &srq->hwq);
rc = bnxt_qplib_rcfw_send_message(rcfw, (struct cmdq_base *)&req,
(struct creq_base *)&resp, NULL, 0);
kfree(srq->swq);
return 0;
if (rc)
return;
bnxt_qplib_free_hwq(res->pdev, &srq->hwq);
}
int bnxt_qplib_create_srq(struct bnxt_qplib_res *res,
@ -542,8 +540,8 @@ int bnxt_qplib_create_srq(struct bnxt_qplib_res *res,
int rc, idx;
srq->hwq.max_elements = srq->max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &srq->hwq, srq->sglist,
srq->nmap, &srq->hwq.max_elements,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &srq->hwq, &srq->sg_info,
&srq->hwq.max_elements,
BNXT_QPLIB_MAX_RQE_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_QUEUE);
if (rc)
@ -742,7 +740,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
/* SQ */
sq->hwq.max_elements = sq->max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, NULL,
&sq->hwq.max_elements,
BNXT_QPLIB_MAX_SQE_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_QUEUE);
@ -781,7 +779,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
/* RQ */
if (rq->max_wqe) {
rq->hwq.max_elements = qp->rq.max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq, NULL,
&rq->hwq.max_elements,
BNXT_QPLIB_MAX_RQE_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_QUEUE);
@ -890,8 +888,8 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
sizeof(struct sq_psn_search);
}
sq->hwq.max_elements = sq->max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, sq->sglist,
sq->nmap, &sq->hwq.max_elements,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &sq->hwq, &sq->sg_info,
&sq->hwq.max_elements,
BNXT_QPLIB_MAX_SQE_ENTRY_SIZE,
psn_sz,
PAGE_SIZE, HWQ_TYPE_QUEUE);
@ -959,8 +957,9 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
/* RQ */
if (rq->max_wqe) {
rq->hwq.max_elements = rq->max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq, rq->sglist,
rq->nmap, &rq->hwq.max_elements,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &rq->hwq,
&rq->sg_info,
&rq->hwq.max_elements,
BNXT_QPLIB_MAX_RQE_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_QUEUE);
if (rc)
@ -1030,7 +1029,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
req_size = xrrq->max_elements *
BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE + PAGE_SIZE - 1;
req_size &= ~(PAGE_SIZE - 1);
rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL,
&xrrq->max_elements,
BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE,
0, req_size, HWQ_TYPE_CTX);
@ -1046,7 +1045,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE + PAGE_SIZE - 1;
req_size &= ~(PAGE_SIZE - 1);
rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, xrrq, NULL,
&xrrq->max_elements,
BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE,
0, req_size, HWQ_TYPE_CTX);
@ -1935,8 +1934,8 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
int rc;
cq->hwq.max_elements = cq->max_wqe;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &cq->hwq, cq->sghead,
cq->nmap, &cq->hwq.max_elements,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &cq->hwq, &cq->sg_info,
&cq->hwq.max_elements,
BNXT_QPLIB_MAX_CQE_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_QUEUE);
if (rc)

View File

@ -52,10 +52,9 @@ struct bnxt_qplib_srq {
struct bnxt_qplib_cq *cq;
struct bnxt_qplib_hwq hwq;
struct bnxt_qplib_swq *swq;
struct scatterlist *sglist;
int start_idx;
int last_idx;
u32 nmap;
struct bnxt_qplib_sg_info sg_info;
u16 eventq_hw_ring_id;
spinlock_t lock; /* protect SRQE link list */
};
@ -237,8 +236,7 @@ struct bnxt_qplib_swqe {
struct bnxt_qplib_q {
struct bnxt_qplib_hwq hwq;
struct bnxt_qplib_swq *swq;
struct scatterlist *sglist;
u32 nmap;
struct bnxt_qplib_sg_info sg_info;
u32 max_wqe;
u16 q_full_delta;
u16 max_sge;
@ -381,8 +379,7 @@ struct bnxt_qplib_cq {
u32 cnq_hw_ring_id;
struct bnxt_qplib_nq *nq;
bool resize_in_progress;
struct scatterlist *sghead;
u32 nmap;
struct bnxt_qplib_sg_info sg_info;
u64 cq_handle;
#define CQ_RESIZE_WAIT_TIME_MS 500
@ -521,8 +518,8 @@ int bnxt_qplib_modify_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_query_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
void bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_post_srq_recv(struct bnxt_qplib_srq *srq,
struct bnxt_qplib_swqe *wqe);
int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);

View File

@ -569,7 +569,7 @@ int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
rcfw->pdev = pdev;
rcfw->creq.max_elements = BNXT_QPLIB_CREQE_MAX_CNT;
hwq_type = bnxt_qplib_get_hwq_type(rcfw->res);
if (bnxt_qplib_alloc_init_hwq(rcfw->pdev, &rcfw->creq, NULL, 0,
if (bnxt_qplib_alloc_init_hwq(rcfw->pdev, &rcfw->creq, NULL,
&rcfw->creq.max_elements,
BNXT_QPLIB_CREQE_UNITS,
0, PAGE_SIZE, hwq_type)) {
@ -584,7 +584,7 @@ int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
rcfw->cmdq.max_elements = rcfw->cmdq_depth;
if (bnxt_qplib_alloc_init_hwq
(rcfw->pdev, &rcfw->cmdq, NULL, 0,
(rcfw->pdev, &rcfw->cmdq, NULL,
&rcfw->cmdq.max_elements,
BNXT_QPLIB_CMDQE_UNITS, 0,
bnxt_qplib_cmdqe_page_size(rcfw->cmdq_depth),

View File

@ -83,7 +83,8 @@ static void __free_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
}
static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
struct scatterlist *sghead, u32 pages, u32 pg_size)
struct scatterlist *sghead, u32 pages,
u32 nmaps, u32 pg_size)
{
struct sg_dma_page_iter sg_iter;
bool is_umem = false;
@ -116,7 +117,7 @@ static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
} else {
i = 0;
is_umem = true;
for_each_sg_dma_page (sghead, &sg_iter, pages, 0) {
for_each_sg_dma_page(sghead, &sg_iter, nmaps, 0) {
pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter);
pbl->pg_arr[i] = NULL;
pbl->pg_count++;
@ -158,12 +159,13 @@ void bnxt_qplib_free_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq)
/* All HWQs are power of 2 in size */
int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
struct scatterlist *sghead, int nmap,
struct bnxt_qplib_sg_info *sg_info,
u32 *elements, u32 element_size, u32 aux,
u32 pg_size, enum bnxt_qplib_hwq_type hwq_type)
{
u32 pages, slots, size, aux_pages = 0, aux_size = 0;
u32 pages, maps, slots, size, aux_pages = 0, aux_size = 0;
dma_addr_t *src_phys_ptr, **dst_virt_ptr;
struct scatterlist *sghead = NULL;
int i, rc;
hwq->level = PBL_LVL_MAX;
@ -177,6 +179,9 @@ int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
}
size = roundup_pow_of_two(element_size);
if (sg_info)
sghead = sg_info->sglist;
if (!sghead) {
hwq->is_user = false;
pages = (slots * size) / pg_size + aux_pages;
@ -184,17 +189,20 @@ int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
pages++;
if (!pages)
return -EINVAL;
maps = 0;
} else {
hwq->is_user = true;
pages = nmap;
pages = sg_info->npages;
maps = sg_info->nmap;
}
/* Alloc the 1st memory block; can be a PDL/PTL/PBL */
if (sghead && (pages == MAX_PBL_LVL_0_PGS))
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_0], sghead,
pages, pg_size);
pages, maps, pg_size);
else
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_0], NULL, 1, pg_size);
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_0], NULL,
1, 0, pg_size);
if (rc)
goto fail;
@ -204,7 +212,8 @@ int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
if (pages > MAX_PBL_LVL_1_PGS) {
/* 2 levels of indirection */
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_1], NULL,
MAX_PBL_LVL_1_PGS_FOR_LVL_2, pg_size);
MAX_PBL_LVL_1_PGS_FOR_LVL_2,
0, pg_size);
if (rc)
goto fail;
/* Fill in lvl0 PBL */
@ -217,7 +226,7 @@ int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
hwq->level = PBL_LVL_1;
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_2], sghead,
pages, pg_size);
pages, maps, pg_size);
if (rc)
goto fail;
@ -246,7 +255,7 @@ int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
/* 1 level of indirection */
rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_1], sghead,
pages, pg_size);
pages, maps, pg_size);
if (rc)
goto fail;
/* Fill in lvl0 PBL */
@ -339,7 +348,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* QPC Tables */
ctx->qpc_tbl.max_elements = ctx->qpc_count;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->qpc_tbl, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->qpc_tbl, NULL,
&ctx->qpc_tbl.max_elements,
BNXT_QPLIB_MAX_QP_CTX_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_CTX);
@ -348,7 +357,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* MRW Tables */
ctx->mrw_tbl.max_elements = ctx->mrw_count;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->mrw_tbl, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->mrw_tbl, NULL,
&ctx->mrw_tbl.max_elements,
BNXT_QPLIB_MAX_MRW_CTX_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_CTX);
@ -357,7 +366,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* SRQ Tables */
ctx->srqc_tbl.max_elements = ctx->srqc_count;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->srqc_tbl, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->srqc_tbl, NULL,
&ctx->srqc_tbl.max_elements,
BNXT_QPLIB_MAX_SRQ_CTX_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_CTX);
@ -366,7 +375,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* CQ Tables */
ctx->cq_tbl.max_elements = ctx->cq_count;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->cq_tbl, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->cq_tbl, NULL,
&ctx->cq_tbl.max_elements,
BNXT_QPLIB_MAX_CQ_CTX_ENTRY_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_CTX);
@ -375,7 +384,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* TQM Buffer */
ctx->tqm_pde.max_elements = 512;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tqm_pde, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tqm_pde, NULL,
&ctx->tqm_pde.max_elements, sizeof(u64),
0, PAGE_SIZE, HWQ_TYPE_CTX);
if (rc)
@ -386,7 +395,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
continue;
ctx->tqm_tbl[i].max_elements = ctx->qpc_count *
ctx->tqm_count[i];
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tqm_tbl[i], NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tqm_tbl[i], NULL,
&ctx->tqm_tbl[i].max_elements, 1,
0, PAGE_SIZE, HWQ_TYPE_CTX);
if (rc)
@ -424,7 +433,7 @@ int bnxt_qplib_alloc_ctx(struct pci_dev *pdev,
/* TIM Buffer */
ctx->tim_tbl.max_elements = ctx->qpc_count * 16;
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tim_tbl, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(pdev, &ctx->tim_tbl, NULL,
&ctx->tim_tbl.max_elements, 1,
0, PAGE_SIZE, HWQ_TYPE_CTX);
if (rc)

View File

@ -219,6 +219,12 @@ static inline u8 bnxt_qplib_get_ring_type(struct bnxt_qplib_chip_ctx *cctx)
RING_ALLOC_REQ_RING_TYPE_ROCE_CMPL;
}
struct bnxt_qplib_sg_info {
struct scatterlist *sglist;
u32 nmap;
u32 npages;
};
#define to_bnxt_qplib(ptr, type, member) \
container_of(ptr, type, member)
@ -227,7 +233,7 @@ struct bnxt_qplib_dev_attr;
void bnxt_qplib_free_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq);
int bnxt_qplib_alloc_init_hwq(struct pci_dev *pdev, struct bnxt_qplib_hwq *hwq,
struct scatterlist *sl, int nmap, u32 *elements,
struct bnxt_qplib_sg_info *sg_info, u32 *elements,
u32 elements_per_page, u32 aux, u32 pg_size,
enum bnxt_qplib_hwq_type hwq_type);
void bnxt_qplib_get_guid(u8 *dev_addr, u8 *guid);

View File

@ -532,25 +532,21 @@ int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
return 0;
}
int bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
bool block)
void bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
bool block)
{
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
struct cmdq_destroy_ah req;
struct creq_destroy_ah_resp resp;
u16 cmd_flags = 0;
int rc;
/* Clean up the AH table in the device */
RCFW_CMD_PREP(req, DESTROY_AH, cmd_flags);
req.ah_cid = cpu_to_le32(ah->id);
rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
NULL, block);
if (rc)
return rc;
return 0;
bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp, NULL,
block);
}
/* MRW */
@ -684,7 +680,7 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
mr->hwq.max_elements = pages;
/* Use system PAGE_SIZE */
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &mr->hwq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &mr->hwq, NULL,
&mr->hwq.max_elements,
PAGE_SIZE, 0, PAGE_SIZE,
HWQ_TYPE_CTX);
@ -754,7 +750,7 @@ int bnxt_qplib_alloc_fast_reg_page_list(struct bnxt_qplib_res *res,
return -ENOMEM;
frpl->hwq.max_elements = pages;
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &frpl->hwq, NULL, 0,
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &frpl->hwq, NULL,
&frpl->hwq.max_elements, PAGE_SIZE, 0,
PAGE_SIZE, HWQ_TYPE_CTX);
if (!rc)

View File

@ -243,8 +243,8 @@ int bnxt_qplib_set_func_resources(struct bnxt_qplib_res *res,
struct bnxt_qplib_ctx *ctx);
int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
bool block);
int bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
bool block);
void bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
bool block);
int bnxt_qplib_alloc_mrw(struct bnxt_qplib_res *res,
struct bnxt_qplib_mrw *mrw);
int bnxt_qplib_dereg_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw,

View File

@ -64,7 +64,7 @@ enum t3_wr_flags {
T3_SOLICITED_EVENT_FLAG = 0x04,
T3_READ_FENCE_FLAG = 0x08,
T3_LOCAL_FENCE_FLAG = 0x10
} __attribute__ ((packed));
} __packed;
enum t3_wr_opcode {
T3_WR_BP = FW_WROPCODE_RI_BYPASS,
@ -77,7 +77,7 @@ enum t3_wr_opcode {
T3_WR_INIT = FW_WROPCODE_RI_RDMA_INIT,
T3_WR_QP_MOD = FW_WROPCODE_RI_MODIFY_QP,
T3_WR_FASTREG = FW_WROPCODE_RI_FASTREGISTER_MR
} __attribute__ ((packed));
} __packed;
enum t3_rdma_opcode {
T3_RDMA_WRITE, /* IETF RDMAP v1.0 ... */
@ -95,7 +95,7 @@ enum t3_rdma_opcode {
T3_QP_MOD,
T3_BYPASS,
T3_RDMA_READ_REQ_WITH_INV,
} __attribute__ ((packed));
} __packed;
static inline enum t3_rdma_opcode wr2opcode(enum t3_wr_opcode wrop)
{
@ -306,7 +306,7 @@ enum t3_mpa_attrs {
uP_RI_MPA_TX_MARKER_ENABLE = 0x2,
uP_RI_MPA_CRC_ENABLE = 0x4,
uP_RI_MPA_IETF_ENABLE = 0x8
} __attribute__ ((packed));
} __packed;
enum t3_qp_caps {
uP_RI_QP_RDMA_READ_ENABLE = 0x01,
@ -314,7 +314,7 @@ enum t3_qp_caps {
uP_RI_QP_BIND_ENABLE = 0x04,
uP_RI_QP_FAST_REGISTER_ENABLE = 0x08,
uP_RI_QP_STAG0_ENABLE = 0x10
} __attribute__ ((packed));
} __packed;
enum rdma_init_rtr_types {
RTR_READ = 1,

View File

@ -62,37 +62,30 @@ struct cxgb3_client t3c_client = {
static LIST_HEAD(dev_list);
static DEFINE_MUTEX(dev_mutex);
static int disable_qp_db(int id, void *p, void *data)
{
struct iwch_qp *qhp = p;
cxio_disable_wq_db(&qhp->wq);
return 0;
}
static int enable_qp_db(int id, void *p, void *data)
{
struct iwch_qp *qhp = p;
if (data)
ring_doorbell(qhp->rhp->rdev.ctrl_qp.doorbell, qhp->wq.qpid);
cxio_enable_wq_db(&qhp->wq);
return 0;
}
static void disable_dbs(struct iwch_dev *rnicp)
{
spin_lock_irq(&rnicp->lock);
idr_for_each(&rnicp->qpidr, disable_qp_db, NULL);
spin_unlock_irq(&rnicp->lock);
unsigned long index;
struct iwch_qp *qhp;
xa_lock_irq(&rnicp->qps);
xa_for_each(&rnicp->qps, index, qhp)
cxio_disable_wq_db(&qhp->wq);
xa_unlock_irq(&rnicp->qps);
}
static void enable_dbs(struct iwch_dev *rnicp, int ring_db)
{
spin_lock_irq(&rnicp->lock);
idr_for_each(&rnicp->qpidr, enable_qp_db,
(void *)(unsigned long)ring_db);
spin_unlock_irq(&rnicp->lock);
unsigned long index;
struct iwch_qp *qhp;
xa_lock_irq(&rnicp->qps);
xa_for_each(&rnicp->qps, index, qhp) {
if (ring_db)
ring_doorbell(qhp->rhp->rdev.ctrl_qp.doorbell,
qhp->wq.qpid);
cxio_enable_wq_db(&qhp->wq);
}
xa_unlock_irq(&rnicp->qps);
}
static void iwch_db_drop_task(struct work_struct *work)
@ -105,10 +98,9 @@ static void iwch_db_drop_task(struct work_struct *work)
static void rnic_init(struct iwch_dev *rnicp)
{
pr_debug("%s iwch_dev %p\n", __func__, rnicp);
idr_init(&rnicp->cqidr);
idr_init(&rnicp->qpidr);
idr_init(&rnicp->mmidr);
spin_lock_init(&rnicp->lock);
xa_init_flags(&rnicp->cqs, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&rnicp->qps, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&rnicp->mrs, XA_FLAGS_LOCK_IRQ);
INIT_DELAYED_WORK(&rnicp->db_drop_task, iwch_db_drop_task);
rnicp->attr.max_qps = T3_MAX_NUM_QP - 32;
@ -190,9 +182,9 @@ static void close_rnic_dev(struct t3cdev *tdev)
list_del(&dev->entry);
iwch_unregister_device(dev);
cxio_rdev_close(&dev->rdev);
idr_destroy(&dev->cqidr);
idr_destroy(&dev->qpidr);
idr_destroy(&dev->mmidr);
WARN_ON(!xa_empty(&dev->cqs));
WARN_ON(!xa_empty(&dev->qps));
WARN_ON(!xa_empty(&dev->mrs));
ib_dealloc_device(&dev->ibdev);
break;
}

View File

@ -35,7 +35,7 @@
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/workqueue.h>
#include <rdma/ib_verbs.h>
@ -106,10 +106,9 @@ struct iwch_dev {
struct cxio_rdev rdev;
u32 device_cap_flags;
struct iwch_rnic_attributes attr;
struct idr cqidr;
struct idr qpidr;
struct idr mmidr;
spinlock_t lock;
struct xarray cqs;
struct xarray qps;
struct xarray mrs;
struct list_head entry;
struct delayed_work db_drop_task;
};
@ -136,40 +135,17 @@ static inline int t3a_device(const struct iwch_dev *rhp)
static inline struct iwch_cq *get_chp(struct iwch_dev *rhp, u32 cqid)
{
return idr_find(&rhp->cqidr, cqid);
return xa_load(&rhp->cqs, cqid);
}
static inline struct iwch_qp *get_qhp(struct iwch_dev *rhp, u32 qpid)
{
return idr_find(&rhp->qpidr, qpid);
return xa_load(&rhp->qps, qpid);
}
static inline struct iwch_mr *get_mhp(struct iwch_dev *rhp, u32 mmid)
{
return idr_find(&rhp->mmidr, mmid);
}
static inline int insert_handle(struct iwch_dev *rhp, struct idr *idr,
void *handle, u32 id)
{
int ret;
idr_preload(GFP_KERNEL);
spin_lock_irq(&rhp->lock);
ret = idr_alloc(idr, handle, id, id + 1, GFP_NOWAIT);
spin_unlock_irq(&rhp->lock);
idr_preload_end();
return ret < 0 ? ret : 0;
}
static inline void remove_handle(struct iwch_dev *rhp, struct idr *idr, u32 id)
{
spin_lock_irq(&rhp->lock);
idr_remove(idr, id);
spin_unlock_irq(&rhp->lock);
return xa_load(&rhp->mrs, mmid);
}
extern struct cxgb3_client t3c_client;

View File

@ -48,14 +48,14 @@ static void post_qp_event(struct iwch_dev *rnicp, struct iwch_cq *chp,
struct iwch_qp *qhp;
unsigned long flag;
spin_lock(&rnicp->lock);
qhp = get_qhp(rnicp, CQE_QPID(rsp_msg->cqe));
xa_lock(&rnicp->qps);
qhp = xa_load(&rnicp->qps, CQE_QPID(rsp_msg->cqe));
if (!qhp) {
pr_err("%s unaffiliated error 0x%x qpid 0x%x\n",
__func__, CQE_STATUS(rsp_msg->cqe),
CQE_QPID(rsp_msg->cqe));
spin_unlock(&rnicp->lock);
xa_unlock(&rnicp->qps);
return;
}
@ -65,7 +65,7 @@ static void post_qp_event(struct iwch_dev *rnicp, struct iwch_cq *chp,
__func__,
qhp->attr.state, qhp->wq.qpid,
CQE_STATUS(rsp_msg->cqe));
spin_unlock(&rnicp->lock);
xa_unlock(&rnicp->qps);
return;
}
@ -76,7 +76,7 @@ static void post_qp_event(struct iwch_dev *rnicp, struct iwch_cq *chp,
CQE_WRID_HI(rsp_msg->cqe), CQE_WRID_LOW(rsp_msg->cqe));
atomic_inc(&qhp->refcnt);
spin_unlock(&rnicp->lock);
xa_unlock(&rnicp->qps);
if (qhp->attr.state == IWCH_QP_STATE_RTS) {
attrs.next_state = IWCH_QP_STATE_TERMINATE;
@ -114,21 +114,21 @@ void iwch_ev_dispatch(struct cxio_rdev *rdev_p, struct sk_buff *skb)
unsigned long flag;
rnicp = (struct iwch_dev *) rdev_p->ulp;
spin_lock(&rnicp->lock);
xa_lock(&rnicp->qps);
chp = get_chp(rnicp, cqid);
qhp = get_qhp(rnicp, CQE_QPID(rsp_msg->cqe));
qhp = xa_load(&rnicp->qps, CQE_QPID(rsp_msg->cqe));
if (!chp || !qhp) {
pr_err("BAD AE cqid 0x%x qpid 0x%x opcode %d status 0x%x type %d wrid.hi 0x%x wrid.lo 0x%x\n",
cqid, CQE_QPID(rsp_msg->cqe),
CQE_OPCODE(rsp_msg->cqe), CQE_STATUS(rsp_msg->cqe),
CQE_TYPE(rsp_msg->cqe), CQE_WRID_HI(rsp_msg->cqe),
CQE_WRID_LOW(rsp_msg->cqe));
spin_unlock(&rnicp->lock);
xa_unlock(&rnicp->qps);
goto out;
}
iwch_qp_add_ref(&qhp->ibqp);
atomic_inc(&chp->refcnt);
spin_unlock(&rnicp->lock);
xa_unlock(&rnicp->qps);
/*
* 1) completion of our sending a TERMINATE.

View File

@ -49,7 +49,7 @@ static int iwch_finish_mem_reg(struct iwch_mr *mhp, u32 stag)
mmid = stag >> 8;
mhp->ibmr.rkey = mhp->ibmr.lkey = stag;
pr_debug("%s mmid 0x%x mhp %p\n", __func__, mmid, mhp);
return insert_handle(mhp->rhp, &mhp->rhp->mmidr, mhp, mmid);
return xa_insert_irq(&mhp->rhp->mrs, mmid, mhp, GFP_KERNEL);
}
int iwch_register_mem(struct iwch_dev *rhp, struct iwch_pd *php,

View File

@ -88,14 +88,14 @@ static int iwch_alloc_ucontext(struct ib_ucontext *ucontext,
return 0;
}
static int iwch_destroy_cq(struct ib_cq *ib_cq)
static int iwch_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct iwch_cq *chp;
pr_debug("%s ib_cq %p\n", __func__, ib_cq);
chp = to_iwch_cq(ib_cq);
remove_handle(chp->rhp, &chp->rhp->cqidr, chp->cq.cqid);
xa_erase_irq(&chp->rhp->cqs, chp->cq.cqid);
atomic_dec(&chp->refcnt);
wait_event(chp->wait, !atomic_read(&chp->refcnt));
@ -106,7 +106,6 @@ static int iwch_destroy_cq(struct ib_cq *ib_cq)
static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *ib_context,
struct ib_udata *udata)
{
int entries = attr->cqe;
@ -114,7 +113,6 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
struct iwch_cq *chp;
struct iwch_create_cq_resp uresp;
struct iwch_create_cq_req ureq;
struct iwch_ucontext *ucontext = NULL;
static int warned;
size_t resplen;
@ -127,8 +125,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
if (!chp)
return ERR_PTR(-ENOMEM);
if (ib_context) {
ucontext = to_iwch_ucontext(ib_context);
if (udata) {
if (!t3a_device(rhp)) {
if (ib_copy_from_udata(&ureq, udata, sizeof (ureq))) {
kfree(chp);
@ -154,7 +151,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
entries = roundup_pow_of_two(entries);
chp->cq.size_log2 = ilog2(entries);
if (cxio_create_cq(&rhp->rdev, &chp->cq, !ucontext)) {
if (cxio_create_cq(&rhp->rdev, &chp->cq, !udata)) {
kfree(chp);
return ERR_PTR(-ENOMEM);
}
@ -164,18 +161,20 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
spin_lock_init(&chp->comp_handler_lock);
atomic_set(&chp->refcnt, 1);
init_waitqueue_head(&chp->wait);
if (insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid)) {
if (xa_store_irq(&rhp->cqs, chp->cq.cqid, chp, GFP_KERNEL)) {
cxio_destroy_cq(&chp->rhp->rdev, &chp->cq);
kfree(chp);
return ERR_PTR(-ENOMEM);
}
if (ucontext) {
if (udata) {
struct iwch_mm_entry *mm;
struct iwch_ucontext *ucontext = rdma_udata_to_drv_context(
udata, struct iwch_ucontext, ibucontext);
mm = kmalloc(sizeof *mm, GFP_KERNEL);
if (!mm) {
iwch_destroy_cq(&chp->ibcq);
iwch_destroy_cq(&chp->ibcq, udata);
return ERR_PTR(-ENOMEM);
}
uresp.cqid = chp->cq.cqid;
@ -201,7 +200,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev,
}
if (ib_copy_to_udata(udata, &uresp, resplen)) {
kfree(mm);
iwch_destroy_cq(&chp->ibcq);
iwch_destroy_cq(&chp->ibcq, udata);
return ERR_PTR(-EFAULT);
}
insert_mmap(ucontext, mm);
@ -367,7 +366,7 @@ static int iwch_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
return ret;
}
static void iwch_deallocate_pd(struct ib_pd *pd)
static void iwch_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata)
{
struct iwch_dev *rhp;
struct iwch_pd *php;
@ -378,8 +377,7 @@ static void iwch_deallocate_pd(struct ib_pd *pd)
cxio_hal_put_pdid(rhp->rdev.rscp, php->pdid);
}
static int iwch_allocate_pd(struct ib_pd *pd, struct ib_ucontext *context,
struct ib_udata *udata)
static int iwch_allocate_pd(struct ib_pd *pd, struct ib_udata *udata)
{
struct iwch_pd *php = to_iwch_pd(pd);
struct ib_device *ibdev = pd->device;
@ -394,11 +392,11 @@ static int iwch_allocate_pd(struct ib_pd *pd, struct ib_ucontext *context,
php->pdid = pdid;
php->rhp = rhp;
if (context) {
if (udata) {
struct iwch_alloc_pd_resp resp = {.pdid = php->pdid};
if (ib_copy_to_udata(udata, &resp, sizeof(resp))) {
iwch_deallocate_pd(&php->ibpd);
iwch_deallocate_pd(&php->ibpd, udata);
return -EFAULT;
}
}
@ -406,7 +404,7 @@ static int iwch_allocate_pd(struct ib_pd *pd, struct ib_ucontext *context,
return 0;
}
static int iwch_dereg_mr(struct ib_mr *ib_mr)
static int iwch_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
{
struct iwch_dev *rhp;
struct iwch_mr *mhp;
@ -421,7 +419,7 @@ static int iwch_dereg_mr(struct ib_mr *ib_mr)
cxio_dereg_mem(&rhp->rdev, mhp->attr.stag, mhp->attr.pbl_size,
mhp->attr.pbl_addr);
iwch_free_pbl(mhp);
remove_handle(rhp, &rhp->mmidr, mmid);
xa_erase_irq(&rhp->mrs, mmid);
if (mhp->kva)
kfree((void *) (unsigned long) mhp->kva);
if (mhp->umem)
@ -539,7 +537,7 @@ static struct ib_mr *iwch_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
shift = PAGE_SHIFT;
n = mhp->umem->nmap;
n = ib_umem_num_pages(mhp->umem);
err = iwch_alloc_pbl(mhp, n);
if (err)
@ -590,7 +588,7 @@ pbl_done:
uresp.pbl_addr);
if (ib_copy_to_udata(udata, &uresp, sizeof (uresp))) {
iwch_dereg_mr(&mhp->ibmr);
iwch_dereg_mr(&mhp->ibmr, udata);
err = -EFAULT;
goto err;
}
@ -636,7 +634,7 @@ static struct ib_mw *iwch_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
mhp->attr.stag = stag;
mmid = (stag) >> 8;
mhp->ibmw.rkey = stag;
if (insert_handle(rhp, &rhp->mmidr, mhp, mmid)) {
if (xa_insert_irq(&rhp->mrs, mmid, mhp, GFP_KERNEL)) {
cxio_deallocate_window(&rhp->rdev, mhp->attr.stag);
kfree(mhp);
return ERR_PTR(-ENOMEM);
@ -655,15 +653,14 @@ static int iwch_dealloc_mw(struct ib_mw *mw)
rhp = mhp->rhp;
mmid = (mw->rkey) >> 8;
cxio_deallocate_window(&rhp->rdev, mhp->attr.stag);
remove_handle(rhp, &rhp->mmidr, mmid);
xa_erase_irq(&rhp->mrs, mmid);
pr_debug("%s ib_mw %p mmid 0x%x ptr %p\n", __func__, mw, mmid, mhp);
kfree(mhp);
return 0;
}
static struct ib_mr *iwch_alloc_mr(struct ib_pd *pd,
enum ib_mr_type mr_type,
u32 max_num_sg)
static struct ib_mr *iwch_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg, struct ib_udata *udata)
{
struct iwch_dev *rhp;
struct iwch_pd *php;
@ -701,7 +698,7 @@ static struct ib_mr *iwch_alloc_mr(struct ib_pd *pd,
mhp->attr.state = 1;
mmid = (stag) >> 8;
mhp->ibmr.rkey = mhp->ibmr.lkey = stag;
ret = insert_handle(rhp, &rhp->mmidr, mhp, mmid);
ret = xa_insert_irq(&rhp->mrs, mmid, mhp, GFP_KERNEL);
if (ret)
goto err3;
@ -742,7 +739,7 @@ static int iwch_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
return ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, iwch_set_page);
}
static int iwch_destroy_qp(struct ib_qp *ib_qp)
static int iwch_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
{
struct iwch_dev *rhp;
struct iwch_qp *qhp;
@ -756,13 +753,13 @@ static int iwch_destroy_qp(struct ib_qp *ib_qp)
iwch_modify_qp(rhp, qhp, IWCH_QP_ATTR_NEXT_STATE, &attrs, 0);
wait_event(qhp->wait, !qhp->ep);
remove_handle(rhp, &rhp->qpidr, qhp->wq.qpid);
xa_erase_irq(&rhp->qps, qhp->wq.qpid);
atomic_dec(&qhp->refcnt);
wait_event(qhp->wait, !atomic_read(&qhp->refcnt));
ucontext = ib_qp->uobject ? to_iwch_ucontext(ib_qp->uobject->context)
: NULL;
ucontext = rdma_udata_to_drv_context(udata, struct iwch_ucontext,
ibucontext);
cxio_destroy_qp(&rhp->rdev, &qhp->wq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx);
@ -872,7 +869,7 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
init_waitqueue_head(&qhp->wait);
atomic_set(&qhp->refcnt, 1);
if (insert_handle(rhp, &rhp->qpidr, qhp, qhp->wq.qpid)) {
if (xa_store_irq(&rhp->qps, qhp->wq.qpid, qhp, GFP_KERNEL)) {
cxio_destroy_qp(&rhp->rdev, &qhp->wq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx);
kfree(qhp);
@ -885,14 +882,14 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
mm1 = kmalloc(sizeof *mm1, GFP_KERNEL);
if (!mm1) {
iwch_destroy_qp(&qhp->ibqp);
iwch_destroy_qp(&qhp->ibqp, udata);
return ERR_PTR(-ENOMEM);
}
mm2 = kmalloc(sizeof *mm2, GFP_KERNEL);
if (!mm2) {
kfree(mm1);
iwch_destroy_qp(&qhp->ibqp);
iwch_destroy_qp(&qhp->ibqp, udata);
return ERR_PTR(-ENOMEM);
}
@ -909,7 +906,7 @@ static struct ib_qp *iwch_create_qp(struct ib_pd *pd,
if (ib_copy_to_udata(udata, &uresp, sizeof (uresp))) {
kfree(mm1);
kfree(mm2);
iwch_destroy_qp(&qhp->ibqp);
iwch_destroy_qp(&qhp->ibqp, udata);
return ERR_PTR(-EFAULT);
}
mm1->key = uresp.key;
@ -1324,6 +1321,14 @@ static const struct ib_device_ops iwch_dev_ops = {
.get_dma_mr = iwch_get_dma_mr,
.get_hw_stats = iwch_get_mib,
.get_port_immutable = iwch_port_immutable,
.iw_accept = iwch_accept_cr,
.iw_add_ref = iwch_qp_add_ref,
.iw_connect = iwch_connect,
.iw_create_listen = iwch_create_listen,
.iw_destroy_listen = iwch_destroy_listen,
.iw_get_qp = iwch_get_qp,
.iw_reject = iwch_reject_cr,
.iw_rem_ref = iwch_qp_rem_ref,
.map_mr_sg = iwch_map_mr_sg,
.mmap = iwch_mmap,
.modify_qp = iwch_ib_modify_qp,
@ -1343,8 +1348,6 @@ static const struct ib_device_ops iwch_dev_ops = {
int iwch_register_device(struct iwch_dev *dev)
{
int ret;
pr_debug("%s iwch_dev %p\n", __func__, dev);
memset(&dev->ibdev.node_guid, 0, sizeof(dev->ibdev.node_guid));
memcpy(&dev->ibdev.node_guid, dev->rdev.t3cdev_p->lldev->dev_addr, 6);
@ -1382,34 +1385,18 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dev.parent = &dev->rdev.rnic_info.pdev->dev;
dev->ibdev.uverbs_abi_ver = IWCH_UVERBS_ABI_VERSION;
dev->ibdev.iwcm = kzalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL);
if (!dev->ibdev.iwcm)
return -ENOMEM;
dev->ibdev.iwcm->connect = iwch_connect;
dev->ibdev.iwcm->accept = iwch_accept_cr;
dev->ibdev.iwcm->reject = iwch_reject_cr;
dev->ibdev.iwcm->create_listen = iwch_create_listen;
dev->ibdev.iwcm->destroy_listen = iwch_destroy_listen;
dev->ibdev.iwcm->add_ref = iwch_qp_add_ref;
dev->ibdev.iwcm->rem_ref = iwch_qp_rem_ref;
dev->ibdev.iwcm->get_qp = iwch_get_qp;
memcpy(dev->ibdev.iwcm->ifname, dev->rdev.t3cdev_p->lldev->name,
sizeof(dev->ibdev.iwcm->ifname));
memcpy(dev->ibdev.iw_ifname, dev->rdev.t3cdev_p->lldev->name,
sizeof(dev->ibdev.iw_ifname));
dev->ibdev.driver_id = RDMA_DRIVER_CXGB3;
rdma_set_device_sysfs_group(&dev->ibdev, &iwch_attr_group);
ib_set_device_ops(&dev->ibdev, &iwch_dev_ops);
ret = ib_register_device(&dev->ibdev, "cxgb3_%d");
if (ret)
kfree(dev->ibdev.iwcm);
return ret;
return ib_register_device(&dev->ibdev, "cxgb3_%d");
}
void iwch_unregister_device(struct iwch_dev *dev)
{
pr_debug("%s iwch_dev %p\n", __func__, dev);
ib_unregister_device(&dev->ibdev);
kfree(dev->ibdev.iwcm);
return;
}

View File

@ -331,20 +331,23 @@ static void remove_ep_tid(struct c4iw_ep *ep)
{
unsigned long flags;
spin_lock_irqsave(&ep->com.dev->lock, flags);
_remove_handle(ep->com.dev, &ep->com.dev->hwtid_idr, ep->hwtid, 0);
if (idr_is_empty(&ep->com.dev->hwtid_idr))
xa_lock_irqsave(&ep->com.dev->hwtids, flags);
__xa_erase(&ep->com.dev->hwtids, ep->hwtid);
if (xa_empty(&ep->com.dev->hwtids))
wake_up(&ep->com.dev->wait);
spin_unlock_irqrestore(&ep->com.dev->lock, flags);
xa_unlock_irqrestore(&ep->com.dev->hwtids, flags);
}
static void insert_ep_tid(struct c4iw_ep *ep)
static int insert_ep_tid(struct c4iw_ep *ep)
{
unsigned long flags;
int err;
spin_lock_irqsave(&ep->com.dev->lock, flags);
_insert_handle(ep->com.dev, &ep->com.dev->hwtid_idr, ep, ep->hwtid, 0);
spin_unlock_irqrestore(&ep->com.dev->lock, flags);
xa_lock_irqsave(&ep->com.dev->hwtids, flags);
err = __xa_insert(&ep->com.dev->hwtids, ep->hwtid, ep, GFP_KERNEL);
xa_unlock_irqrestore(&ep->com.dev->hwtids, flags);
return err;
}
/*
@ -355,11 +358,11 @@ static struct c4iw_ep *get_ep_from_tid(struct c4iw_dev *dev, unsigned int tid)
struct c4iw_ep *ep;
unsigned long flags;
spin_lock_irqsave(&dev->lock, flags);
ep = idr_find(&dev->hwtid_idr, tid);
xa_lock_irqsave(&dev->hwtids, flags);
ep = xa_load(&dev->hwtids, tid);
if (ep)
c4iw_get_ep(&ep->com);
spin_unlock_irqrestore(&dev->lock, flags);
xa_unlock_irqrestore(&dev->hwtids, flags);
return ep;
}
@ -372,11 +375,11 @@ static struct c4iw_listen_ep *get_ep_from_stid(struct c4iw_dev *dev,
struct c4iw_listen_ep *ep;
unsigned long flags;
spin_lock_irqsave(&dev->lock, flags);
ep = idr_find(&dev->stid_idr, stid);
xa_lock_irqsave(&dev->stids, flags);
ep = xa_load(&dev->stids, stid);
if (ep)
c4iw_get_ep(&ep->com);
spin_unlock_irqrestore(&dev->lock, flags);
xa_unlock_irqrestore(&dev->stids, flags);
return ep;
}
@ -457,6 +460,8 @@ static struct sk_buff *get_skb(struct sk_buff *skb, int len, gfp_t gfp)
skb_reset_transport_header(skb);
} else {
skb = alloc_skb(len, gfp);
if (!skb)
return NULL;
}
t4_set_arp_err_handler(skb, NULL, NULL);
return skb;
@ -555,7 +560,7 @@ static void act_open_req_arp_failure(void *handle, struct sk_buff *skb)
cxgb4_clip_release(ep->com.dev->rdev.lldi.ports[0],
(const u32 *)&sin6->sin6_addr.s6_addr, 1);
}
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, ep->atid);
xa_erase_irq(&ep->com.dev->atids, ep->atid);
cxgb4_free_atid(ep->com.dev->rdev.lldi.tids, ep->atid);
queue_arp_failure_cpl(ep, skb, FAKE_CPL_PUT_EP_SAFE);
}
@ -1235,7 +1240,7 @@ static int act_establish(struct c4iw_dev *dev, struct sk_buff *skb)
set_emss(ep, tcp_opt);
/* dealloc the atid */
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, atid);
xa_erase_irq(&ep->com.dev->atids, atid);
cxgb4_free_atid(t, atid);
set_bit(ACT_ESTAB, &ep->com.history);
@ -2184,7 +2189,9 @@ static int c4iw_reconnect(struct c4iw_ep *ep)
err = -ENOMEM;
goto fail2;
}
insert_handle(ep->com.dev, &ep->com.dev->atid_idr, ep, ep->atid);
err = xa_insert_irq(&ep->com.dev->atids, ep->atid, ep, GFP_KERNEL);
if (err)
goto fail2a;
/* find a route */
if (ep->com.cm_id->m_local_addr.ss_family == AF_INET) {
@ -2236,7 +2243,8 @@ static int c4iw_reconnect(struct c4iw_ep *ep)
fail4:
dst_release(ep->dst);
fail3:
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, ep->atid);
xa_erase_irq(&ep->com.dev->atids, ep->atid);
fail2a:
cxgb4_free_atid(ep->com.dev->rdev.lldi.tids, ep->atid);
fail2:
/*
@ -2319,8 +2327,7 @@ static int act_open_rpl(struct c4iw_dev *dev, struct sk_buff *skb)
(const u32 *)
&sin6->sin6_addr.s6_addr, 1);
}
remove_handle(ep->com.dev, &ep->com.dev->atid_idr,
atid);
xa_erase_irq(&ep->com.dev->atids, atid);
cxgb4_free_atid(t, atid);
dst_release(ep->dst);
cxgb4_l2t_release(ep->l2t);
@ -2357,7 +2364,7 @@ fail:
cxgb4_remove_tid(ep->com.dev->rdev.lldi.tids, 0, GET_TID(rpl),
ep->com.local_addr.ss_family);
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, atid);
xa_erase_irq(&ep->com.dev->atids, atid);
cxgb4_free_atid(t, atid);
dst_release(ep->dst);
cxgb4_l2t_release(ep->l2t);
@ -2947,7 +2954,7 @@ out:
(const u32 *)&sin6->sin6_addr.s6_addr,
1);
}
remove_handle(ep->com.dev, &ep->com.dev->hwtid_idr, ep->hwtid);
xa_erase_irq(&ep->com.dev->hwtids, ep->hwtid);
cxgb4_remove_tid(ep->com.dev->rdev.lldi.tids, 0, ep->hwtid,
ep->com.local_addr.ss_family);
dst_release(ep->dst);
@ -3342,7 +3349,9 @@ int c4iw_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
err = -ENOMEM;
goto fail2;
}
insert_handle(dev, &dev->atid_idr, ep, ep->atid);
err = xa_insert_irq(&dev->atids, ep->atid, ep, GFP_KERNEL);
if (err)
goto fail5;
memcpy(&ep->com.local_addr, &cm_id->m_local_addr,
sizeof(ep->com.local_addr));
@ -3430,7 +3439,8 @@ int c4iw_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
fail4:
dst_release(ep->dst);
fail3:
remove_handle(ep->com.dev, &ep->com.dev->atid_idr, ep->atid);
xa_erase_irq(&ep->com.dev->atids, ep->atid);
fail5:
cxgb4_free_atid(ep->com.dev->rdev.lldi.tids, ep->atid);
fail2:
skb_queue_purge(&ep->com.ep_skb_list);
@ -3553,7 +3563,9 @@ int c4iw_create_listen(struct iw_cm_id *cm_id, int backlog)
err = -ENOMEM;
goto fail2;
}
insert_handle(dev, &dev->stid_idr, ep, ep->stid);
err = xa_insert_irq(&dev->stids, ep->stid, ep, GFP_KERNEL);
if (err)
goto fail3;
state_set(&ep->com, LISTEN);
if (ep->com.local_addr.ss_family == AF_INET)
@ -3564,7 +3576,8 @@ int c4iw_create_listen(struct iw_cm_id *cm_id, int backlog)
cm_id->provider_data = ep;
goto out;
}
remove_handle(ep->com.dev, &ep->com.dev->stid_idr, ep->stid);
xa_erase_irq(&ep->com.dev->stids, ep->stid);
fail3:
cxgb4_free_stid(ep->com.dev->rdev.lldi.tids, ep->stid,
ep->com.local_addr.ss_family);
fail2:
@ -3603,7 +3616,7 @@ int c4iw_destroy_listen(struct iw_cm_id *cm_id)
cxgb4_clip_release(ep->com.dev->rdev.lldi.ports[0],
(const u32 *)&sin6->sin6_addr.s6_addr, 1);
}
remove_handle(ep->com.dev, &ep->com.dev->stid_idr, ep->stid);
xa_erase_irq(&ep->com.dev->stids, ep->stid);
cxgb4_free_stid(ep->com.dev->rdev.lldi.tids, ep->stid,
ep->com.local_addr.ss_family);
done:
@ -3763,7 +3776,7 @@ static void active_ofld_conn_reply(struct c4iw_dev *dev, struct sk_buff *skb,
cxgb4_clip_release(ep->com.dev->rdev.lldi.ports[0],
(const u32 *)&sin6->sin6_addr.s6_addr, 1);
}
remove_handle(dev, &dev->atid_idr, atid);
xa_erase_irq(&dev->atids, atid);
cxgb4_free_atid(dev->rdev.lldi.tids, atid);
dst_release(ep->dst);
cxgb4_l2t_release(ep->l2t);

View File

@ -30,6 +30,8 @@
* SOFTWARE.
*/
#include <rdma/uverbs_ioctl.h>
#include "iw_cxgb4.h"
static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
@ -968,7 +970,7 @@ int c4iw_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
return !err || err == -ENODATA ? npolled : err;
}
int c4iw_destroy_cq(struct ib_cq *ib_cq)
int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct c4iw_cq *chp;
struct c4iw_ucontext *ucontext;
@ -976,12 +978,12 @@ int c4iw_destroy_cq(struct ib_cq *ib_cq)
pr_debug("ib_cq %p\n", ib_cq);
chp = to_c4iw_cq(ib_cq);
remove_handle(chp->rhp, &chp->rhp->cqidr, chp->cq.cqid);
xa_erase_irq(&chp->rhp->cqs, chp->cq.cqid);
atomic_dec(&chp->refcnt);
wait_event(chp->wait, !atomic_read(&chp->refcnt));
ucontext = ib_cq->uobject ? to_c4iw_ucontext(ib_cq->uobject->context)
: NULL;
ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext,
ibucontext);
destroy_cq(&chp->rhp->rdev, &chp->cq,
ucontext ? &ucontext->uctx : &chp->cq.rdev->uctx,
chp->destroy_skb, chp->wr_waitp);
@ -992,7 +994,6 @@ int c4iw_destroy_cq(struct ib_cq *ib_cq)
struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *ib_context,
struct ib_udata *udata)
{
int entries = attr->cqe;
@ -1001,10 +1002,11 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
struct c4iw_cq *chp;
struct c4iw_create_cq ucmd;
struct c4iw_create_cq_resp uresp;
struct c4iw_ucontext *ucontext = NULL;
int ret, wr_len;
size_t memsize, hwentries;
struct c4iw_mm_entry *mm, *mm2;
struct c4iw_ucontext *ucontext = rdma_udata_to_drv_context(
udata, struct c4iw_ucontext, ibucontext);
pr_debug("ib_dev %p entries %d\n", ibdev, entries);
if (attr->flags)
@ -1015,8 +1017,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
if (vector >= rhp->rdev.lldi.nciq)
return ERR_PTR(-EINVAL);
if (ib_context) {
ucontext = to_c4iw_ucontext(ib_context);
if (udata) {
if (udata->inlen < sizeof(ucmd))
ucontext->is_32b_cqe = 1;
}
@ -1068,7 +1069,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
/*
* memsize must be a multiple of the page size if its a user cq.
*/
if (ucontext)
if (udata)
memsize = roundup(memsize, PAGE_SIZE);
chp->cq.size = hwentries;
@ -1088,7 +1089,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
spin_lock_init(&chp->comp_handler_lock);
atomic_set(&chp->refcnt, 1);
init_waitqueue_head(&chp->wait);
ret = insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid);
ret = xa_insert_irq(&rhp->cqs, chp->cq.cqid, chp, GFP_KERNEL);
if (ret)
goto err_destroy_cq;
@ -1143,7 +1144,7 @@ err_free_mm2:
err_free_mm:
kfree(mm);
err_remove_handle:
remove_handle(rhp, &rhp->cqidr, chp->cq.cqid);
xa_erase_irq(&rhp->cqs, chp->cq.cqid);
err_destroy_cq:
destroy_cq(&chp->rhp->rdev, &chp->cq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx,

View File

@ -81,14 +81,6 @@ struct c4iw_debugfs_data {
int pos;
};
static int count_idrs(int id, void *p, void *data)
{
int *countp = data;
*countp = *countp + 1;
return 0;
}
static ssize_t debugfs_read(struct file *file, char __user *buf, size_t count,
loff_t *ppos)
{
@ -250,16 +242,11 @@ static void set_ep_sin6_addrs(struct c4iw_ep *ep,
}
}
static int dump_qp(int id, void *p, void *data)
static int dump_qp(struct c4iw_qp *qp, struct c4iw_debugfs_data *qpd)
{
struct c4iw_qp *qp = p;
struct c4iw_debugfs_data *qpd = data;
int space;
int cc;
if (id != qp->wq.sq.qid)
return 0;
space = qpd->bufsize - qpd->pos - 1;
if (space == 0)
return 1;
@ -335,7 +322,9 @@ static int qp_release(struct inode *inode, struct file *file)
static int qp_open(struct inode *inode, struct file *file)
{
struct c4iw_qp *qp;
struct c4iw_debugfs_data *qpd;
unsigned long index;
int count = 1;
qpd = kmalloc(sizeof *qpd, GFP_KERNEL);
@ -345,9 +334,12 @@ static int qp_open(struct inode *inode, struct file *file)
qpd->devp = inode->i_private;
qpd->pos = 0;
spin_lock_irq(&qpd->devp->lock);
idr_for_each(&qpd->devp->qpidr, count_idrs, &count);
spin_unlock_irq(&qpd->devp->lock);
/*
* No need to lock; we drop the lock to call vmalloc so it's racy
* anyway. Someone who cares should switch this over to seq_file
*/
xa_for_each(&qpd->devp->qps, index, qp)
count++;
qpd->bufsize = count * 180;
qpd->buf = vmalloc(qpd->bufsize);
@ -356,9 +348,10 @@ static int qp_open(struct inode *inode, struct file *file)
return -ENOMEM;
}
spin_lock_irq(&qpd->devp->lock);
idr_for_each(&qpd->devp->qpidr, dump_qp, qpd);
spin_unlock_irq(&qpd->devp->lock);
xa_lock_irq(&qpd->devp->qps);
xa_for_each(&qpd->devp->qps, index, qp)
dump_qp(qp, qpd);
xa_unlock_irq(&qpd->devp->qps);
qpd->buf[qpd->pos++] = 0;
file->private_data = qpd;
@ -373,9 +366,8 @@ static const struct file_operations qp_debugfs_fops = {
.llseek = default_llseek,
};
static int dump_stag(int id, void *p, void *data)
static int dump_stag(unsigned long id, struct c4iw_debugfs_data *stagd)
{
struct c4iw_debugfs_data *stagd = data;
int space;
int cc;
struct fw_ri_tpte tpte;
@ -424,6 +416,8 @@ static int stag_release(struct inode *inode, struct file *file)
static int stag_open(struct inode *inode, struct file *file)
{
struct c4iw_debugfs_data *stagd;
void *p;
unsigned long index;
int ret = 0;
int count = 1;
@ -435,9 +429,8 @@ static int stag_open(struct inode *inode, struct file *file)
stagd->devp = inode->i_private;
stagd->pos = 0;
spin_lock_irq(&stagd->devp->lock);
idr_for_each(&stagd->devp->mmidr, count_idrs, &count);
spin_unlock_irq(&stagd->devp->lock);
xa_for_each(&stagd->devp->mrs, index, p)
count++;
stagd->bufsize = count * 256;
stagd->buf = vmalloc(stagd->bufsize);
@ -446,9 +439,10 @@ static int stag_open(struct inode *inode, struct file *file)
goto err1;
}
spin_lock_irq(&stagd->devp->lock);
idr_for_each(&stagd->devp->mmidr, dump_stag, stagd);
spin_unlock_irq(&stagd->devp->lock);
xa_lock_irq(&stagd->devp->mrs);
xa_for_each(&stagd->devp->mrs, index, p)
dump_stag(index, stagd);
xa_unlock_irq(&stagd->devp->mrs);
stagd->buf[stagd->pos++] = 0;
file->private_data = stagd;
@ -558,10 +552,8 @@ static const struct file_operations stats_debugfs_fops = {
.write = stats_clear,
};
static int dump_ep(int id, void *p, void *data)
static int dump_ep(struct c4iw_ep *ep, struct c4iw_debugfs_data *epd)
{
struct c4iw_ep *ep = p;
struct c4iw_debugfs_data *epd = data;
int space;
int cc;
@ -617,10 +609,9 @@ static int dump_ep(int id, void *p, void *data)
return 0;
}
static int dump_listen_ep(int id, void *p, void *data)
static
int dump_listen_ep(struct c4iw_listen_ep *ep, struct c4iw_debugfs_data *epd)
{
struct c4iw_listen_ep *ep = p;
struct c4iw_debugfs_data *epd = data;
int space;
int cc;
@ -674,6 +665,9 @@ static int ep_release(struct inode *inode, struct file *file)
static int ep_open(struct inode *inode, struct file *file)
{
struct c4iw_ep *ep;
struct c4iw_listen_ep *lep;
unsigned long index;
struct c4iw_debugfs_data *epd;
int ret = 0;
int count = 1;
@ -686,11 +680,12 @@ static int ep_open(struct inode *inode, struct file *file)
epd->devp = inode->i_private;
epd->pos = 0;
spin_lock_irq(&epd->devp->lock);
idr_for_each(&epd->devp->hwtid_idr, count_idrs, &count);
idr_for_each(&epd->devp->atid_idr, count_idrs, &count);
idr_for_each(&epd->devp->stid_idr, count_idrs, &count);
spin_unlock_irq(&epd->devp->lock);
xa_for_each(&epd->devp->hwtids, index, ep)
count++;
xa_for_each(&epd->devp->atids, index, ep)
count++;
xa_for_each(&epd->devp->stids, index, lep)
count++;
epd->bufsize = count * 240;
epd->buf = vmalloc(epd->bufsize);
@ -699,11 +694,18 @@ static int ep_open(struct inode *inode, struct file *file)
goto err1;
}
spin_lock_irq(&epd->devp->lock);
idr_for_each(&epd->devp->hwtid_idr, dump_ep, epd);
idr_for_each(&epd->devp->atid_idr, dump_ep, epd);
idr_for_each(&epd->devp->stid_idr, dump_listen_ep, epd);
spin_unlock_irq(&epd->devp->lock);
xa_lock_irq(&epd->devp->hwtids);
xa_for_each(&epd->devp->hwtids, index, ep)
dump_ep(ep, epd);
xa_unlock_irq(&epd->devp->hwtids);
xa_lock_irq(&epd->devp->atids);
xa_for_each(&epd->devp->atids, index, ep)
dump_ep(ep, epd);
xa_unlock_irq(&epd->devp->atids);
xa_lock_irq(&epd->devp->stids);
xa_for_each(&epd->devp->stids, index, lep)
dump_listen_ep(lep, epd);
xa_unlock_irq(&epd->devp->stids);
file->private_data = epd;
goto out;
@ -931,16 +933,12 @@ static void c4iw_rdev_close(struct c4iw_rdev *rdev)
void c4iw_dealloc(struct uld_ctx *ctx)
{
c4iw_rdev_close(&ctx->dev->rdev);
WARN_ON_ONCE(!idr_is_empty(&ctx->dev->cqidr));
idr_destroy(&ctx->dev->cqidr);
WARN_ON_ONCE(!idr_is_empty(&ctx->dev->qpidr));
idr_destroy(&ctx->dev->qpidr);
WARN_ON_ONCE(!idr_is_empty(&ctx->dev->mmidr));
idr_destroy(&ctx->dev->mmidr);
wait_event(ctx->dev->wait, idr_is_empty(&ctx->dev->hwtid_idr));
idr_destroy(&ctx->dev->hwtid_idr);
idr_destroy(&ctx->dev->stid_idr);
idr_destroy(&ctx->dev->atid_idr);
WARN_ON(!xa_empty(&ctx->dev->cqs));
WARN_ON(!xa_empty(&ctx->dev->qps));
WARN_ON(!xa_empty(&ctx->dev->mrs));
wait_event(ctx->dev->wait, xa_empty(&ctx->dev->hwtids));
WARN_ON(!xa_empty(&ctx->dev->stids));
WARN_ON(!xa_empty(&ctx->dev->atids));
if (ctx->dev->rdev.bar2_kva)
iounmap(ctx->dev->rdev.bar2_kva);
if (ctx->dev->rdev.oc_mw_kva)
@ -1044,13 +1042,12 @@ static struct c4iw_dev *c4iw_alloc(const struct cxgb4_lld_info *infop)
return ERR_PTR(ret);
}
idr_init(&devp->cqidr);
idr_init(&devp->qpidr);
idr_init(&devp->mmidr);
idr_init(&devp->hwtid_idr);
idr_init(&devp->stid_idr);
idr_init(&devp->atid_idr);
spin_lock_init(&devp->lock);
xa_init_flags(&devp->cqs, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&devp->qps, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&devp->mrs, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&devp->hwtids, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&devp->atids, XA_FLAGS_LOCK_IRQ);
xa_init_flags(&devp->stids, XA_FLAGS_LOCK_IRQ);
mutex_init(&devp->rdev.stats.lock);
mutex_init(&devp->db_mutex);
INIT_LIST_HEAD(&devp->db_fc_list);
@ -1265,34 +1262,21 @@ static int c4iw_uld_state_change(void *handle, enum cxgb4_state new_state)
return 0;
}
static int disable_qp_db(int id, void *p, void *data)
{
struct c4iw_qp *qp = p;
t4_disable_wq_db(&qp->wq);
return 0;
}
static void stop_queues(struct uld_ctx *ctx)
{
unsigned long flags;
struct c4iw_qp *qp;
unsigned long index, flags;
spin_lock_irqsave(&ctx->dev->lock, flags);
xa_lock_irqsave(&ctx->dev->qps, flags);
ctx->dev->rdev.stats.db_state_transitions++;
ctx->dev->db_state = STOPPED;
if (ctx->dev->rdev.flags & T4_STATUS_PAGE_DISABLED)
idr_for_each(&ctx->dev->qpidr, disable_qp_db, NULL);
else
if (ctx->dev->rdev.flags & T4_STATUS_PAGE_DISABLED) {
xa_for_each(&ctx->dev->qps, index, qp)
t4_disable_wq_db(&qp->wq);
} else {
ctx->dev->rdev.status_page->db_off = 1;
spin_unlock_irqrestore(&ctx->dev->lock, flags);
}
static int enable_qp_db(int id, void *p, void *data)
{
struct c4iw_qp *qp = p;
t4_enable_wq_db(&qp->wq);
return 0;
}
xa_unlock_irqrestore(&ctx->dev->qps, flags);
}
static void resume_rc_qp(struct c4iw_qp *qp)
@ -1322,18 +1306,21 @@ static void resume_a_chunk(struct uld_ctx *ctx)
static void resume_queues(struct uld_ctx *ctx)
{
spin_lock_irq(&ctx->dev->lock);
xa_lock_irq(&ctx->dev->qps);
if (ctx->dev->db_state != STOPPED)
goto out;
ctx->dev->db_state = FLOW_CONTROL;
while (1) {
if (list_empty(&ctx->dev->db_fc_list)) {
struct c4iw_qp *qp;
unsigned long index;
WARN_ON(ctx->dev->db_state != FLOW_CONTROL);
ctx->dev->db_state = NORMAL;
ctx->dev->rdev.stats.db_state_transitions++;
if (ctx->dev->rdev.flags & T4_STATUS_PAGE_DISABLED) {
idr_for_each(&ctx->dev->qpidr, enable_qp_db,
NULL);
xa_for_each(&ctx->dev->qps, index, qp)
t4_enable_wq_db(&qp->wq);
} else {
ctx->dev->rdev.status_page->db_off = 0;
}
@ -1345,12 +1332,12 @@ static void resume_queues(struct uld_ctx *ctx)
resume_a_chunk(ctx);
}
if (!list_empty(&ctx->dev->db_fc_list)) {
spin_unlock_irq(&ctx->dev->lock);
xa_unlock_irq(&ctx->dev->qps);
if (DB_FC_RESUME_DELAY) {
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(DB_FC_RESUME_DELAY);
}
spin_lock_irq(&ctx->dev->lock);
xa_lock_irq(&ctx->dev->qps);
if (ctx->dev->db_state != FLOW_CONTROL)
break;
}
@ -1359,7 +1346,7 @@ static void resume_queues(struct uld_ctx *ctx)
out:
if (ctx->dev->db_state != NORMAL)
ctx->dev->rdev.stats.db_fc_interruptions++;
spin_unlock_irq(&ctx->dev->lock);
xa_unlock_irq(&ctx->dev->qps);
}
struct qp_list {
@ -1367,23 +1354,6 @@ struct qp_list {
struct c4iw_qp **qps;
};
static int add_and_ref_qp(int id, void *p, void *data)
{
struct qp_list *qp_listp = data;
struct c4iw_qp *qp = p;
c4iw_qp_add_ref(&qp->ibqp);
qp_listp->qps[qp_listp->idx++] = qp;
return 0;
}
static int count_qps(int id, void *p, void *data)
{
unsigned *countp = data;
(*countp)++;
return 0;
}
static void deref_qps(struct qp_list *qp_list)
{
int idx;
@ -1400,7 +1370,7 @@ static void recover_lost_dbs(struct uld_ctx *ctx, struct qp_list *qp_list)
for (idx = 0; idx < qp_list->idx; idx++) {
struct c4iw_qp *qp = qp_list->qps[idx];
spin_lock_irq(&qp->rhp->lock);
xa_lock_irq(&qp->rhp->qps);
spin_lock(&qp->lock);
ret = cxgb4_sync_txq_pidx(qp->rhp->rdev.lldi.ports[0],
qp->wq.sq.qid,
@ -1410,7 +1380,7 @@ static void recover_lost_dbs(struct uld_ctx *ctx, struct qp_list *qp_list)
pr_err("%s: Fatal error - DB overflow recovery failed - error syncing SQ qid %u\n",
pci_name(ctx->lldi.pdev), qp->wq.sq.qid);
spin_unlock(&qp->lock);
spin_unlock_irq(&qp->rhp->lock);
xa_unlock_irq(&qp->rhp->qps);
return;
}
qp->wq.sq.wq_pidx_inc = 0;
@ -1424,12 +1394,12 @@ static void recover_lost_dbs(struct uld_ctx *ctx, struct qp_list *qp_list)
pr_err("%s: Fatal error - DB overflow recovery failed - error syncing RQ qid %u\n",
pci_name(ctx->lldi.pdev), qp->wq.rq.qid);
spin_unlock(&qp->lock);
spin_unlock_irq(&qp->rhp->lock);
xa_unlock_irq(&qp->rhp->qps);
return;
}
qp->wq.rq.wq_pidx_inc = 0;
spin_unlock(&qp->lock);
spin_unlock_irq(&qp->rhp->lock);
xa_unlock_irq(&qp->rhp->qps);
/* Wait for the dbfifo to drain */
while (cxgb4_dbfifo_count(qp->rhp->rdev.lldi.ports[0], 1) > 0) {
@ -1441,6 +1411,8 @@ static void recover_lost_dbs(struct uld_ctx *ctx, struct qp_list *qp_list)
static void recover_queues(struct uld_ctx *ctx)
{
struct c4iw_qp *qp;
unsigned long index;
int count = 0;
struct qp_list qp_list;
int ret;
@ -1458,22 +1430,26 @@ static void recover_queues(struct uld_ctx *ctx)
}
/* Count active queues so we can build a list of queues to recover */
spin_lock_irq(&ctx->dev->lock);
xa_lock_irq(&ctx->dev->qps);
WARN_ON(ctx->dev->db_state != STOPPED);
ctx->dev->db_state = RECOVERY;
idr_for_each(&ctx->dev->qpidr, count_qps, &count);
xa_for_each(&ctx->dev->qps, index, qp)
count++;
qp_list.qps = kcalloc(count, sizeof(*qp_list.qps), GFP_ATOMIC);
if (!qp_list.qps) {
spin_unlock_irq(&ctx->dev->lock);
xa_unlock_irq(&ctx->dev->qps);
return;
}
qp_list.idx = 0;
/* add and ref each qp so it doesn't get freed */
idr_for_each(&ctx->dev->qpidr, add_and_ref_qp, &qp_list);
xa_for_each(&ctx->dev->qps, index, qp) {
c4iw_qp_add_ref(&qp->ibqp);
qp_list.qps[qp_list.idx++] = qp;
}
spin_unlock_irq(&ctx->dev->lock);
xa_unlock_irq(&ctx->dev->qps);
/* now traverse the list in a safe context to recover the db state*/
recover_lost_dbs(ctx, &qp_list);
@ -1482,10 +1458,10 @@ static void recover_queues(struct uld_ctx *ctx)
deref_qps(&qp_list);
kfree(qp_list.qps);
spin_lock_irq(&ctx->dev->lock);
xa_lock_irq(&ctx->dev->qps);
WARN_ON(ctx->dev->db_state != RECOVERY);
ctx->dev->db_state = STOPPED;
spin_unlock_irq(&ctx->dev->lock);
xa_unlock_irq(&ctx->dev->qps);
}
static int c4iw_uld_control(void *handle, enum cxgb4_control control, ...)

View File

@ -123,15 +123,15 @@ void c4iw_ev_dispatch(struct c4iw_dev *dev, struct t4_cqe *err_cqe)
struct c4iw_qp *qhp;
u32 cqid;
spin_lock_irq(&dev->lock);
qhp = get_qhp(dev, CQE_QPID(err_cqe));
xa_lock_irq(&dev->qps);
qhp = xa_load(&dev->qps, CQE_QPID(err_cqe));
if (!qhp) {
pr_err("BAD AE qpid 0x%x opcode %d status 0x%x type %d wrid.hi 0x%x wrid.lo 0x%x\n",
CQE_QPID(err_cqe),
CQE_OPCODE(err_cqe), CQE_STATUS(err_cqe),
CQE_TYPE(err_cqe), CQE_WRID_HI(err_cqe),
CQE_WRID_LOW(err_cqe));
spin_unlock_irq(&dev->lock);
xa_unlock_irq(&dev->qps);
goto out;
}
@ -146,13 +146,13 @@ void c4iw_ev_dispatch(struct c4iw_dev *dev, struct t4_cqe *err_cqe)
CQE_OPCODE(err_cqe), CQE_STATUS(err_cqe),
CQE_TYPE(err_cqe), CQE_WRID_HI(err_cqe),
CQE_WRID_LOW(err_cqe));
spin_unlock_irq(&dev->lock);
xa_unlock_irq(&dev->qps);
goto out;
}
c4iw_qp_add_ref(&qhp->ibqp);
atomic_inc(&chp->refcnt);
spin_unlock_irq(&dev->lock);
xa_unlock_irq(&dev->qps);
/* Bad incoming write */
if (RQ_TYPE(err_cqe) &&
@ -225,11 +225,11 @@ int c4iw_ev_handler(struct c4iw_dev *dev, u32 qid)
struct c4iw_cq *chp;
unsigned long flag;
spin_lock_irqsave(&dev->lock, flag);
chp = get_chp(dev, qid);
xa_lock_irqsave(&dev->cqs, flag);
chp = xa_load(&dev->cqs, qid);
if (chp) {
atomic_inc(&chp->refcnt);
spin_unlock_irqrestore(&dev->lock, flag);
xa_unlock_irqrestore(&dev->cqs, flag);
t4_clear_cq_armed(&chp->cq);
spin_lock_irqsave(&chp->comp_handler_lock, flag);
(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
@ -238,7 +238,7 @@ int c4iw_ev_handler(struct c4iw_dev *dev, u32 qid)
wake_up(&chp->wait);
} else {
pr_debug("unknown cqid 0x%x\n", qid);
spin_unlock_irqrestore(&dev->lock, flag);
xa_unlock_irqrestore(&dev->cqs, flag);
}
return 0;
}

View File

@ -34,7 +34,7 @@
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/completion.h>
#include <linux/netdevice.h>
#include <linux/sched/mm.h>
@ -315,16 +315,15 @@ struct c4iw_dev {
struct ib_device ibdev;
struct c4iw_rdev rdev;
u32 device_cap_flags;
struct idr cqidr;
struct idr qpidr;
struct idr mmidr;
spinlock_t lock;
struct xarray cqs;
struct xarray qps;
struct xarray mrs;
struct mutex db_mutex;
struct dentry *debugfs_root;
enum db_state db_state;
struct idr hwtid_idr;
struct idr atid_idr;
struct idr stid_idr;
struct xarray hwtids;
struct xarray atids;
struct xarray stids;
struct list_head db_fc_list;
u32 avail_ird;
wait_queue_head_t wait;
@ -349,70 +348,12 @@ static inline struct c4iw_dev *rdev_to_c4iw_dev(struct c4iw_rdev *rdev)
static inline struct c4iw_cq *get_chp(struct c4iw_dev *rhp, u32 cqid)
{
return idr_find(&rhp->cqidr, cqid);
return xa_load(&rhp->cqs, cqid);
}
static inline struct c4iw_qp *get_qhp(struct c4iw_dev *rhp, u32 qpid)
{
return idr_find(&rhp->qpidr, qpid);
}
static inline struct c4iw_mr *get_mhp(struct c4iw_dev *rhp, u32 mmid)
{
return idr_find(&rhp->mmidr, mmid);
}
static inline int _insert_handle(struct c4iw_dev *rhp, struct idr *idr,
void *handle, u32 id, int lock)
{
int ret;
if (lock) {
idr_preload(GFP_KERNEL);
spin_lock_irq(&rhp->lock);
}
ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC);
if (lock) {
spin_unlock_irq(&rhp->lock);
idr_preload_end();
}
return ret < 0 ? ret : 0;
}
static inline int insert_handle(struct c4iw_dev *rhp, struct idr *idr,
void *handle, u32 id)
{
return _insert_handle(rhp, idr, handle, id, 1);
}
static inline int insert_handle_nolock(struct c4iw_dev *rhp, struct idr *idr,
void *handle, u32 id)
{
return _insert_handle(rhp, idr, handle, id, 0);
}
static inline void _remove_handle(struct c4iw_dev *rhp, struct idr *idr,
u32 id, int lock)
{
if (lock)
spin_lock_irq(&rhp->lock);
idr_remove(idr, id);
if (lock)
spin_unlock_irq(&rhp->lock);
}
static inline void remove_handle(struct c4iw_dev *rhp, struct idr *idr, u32 id)
{
_remove_handle(rhp, idr, id, 1);
}
static inline void remove_handle_nolock(struct c4iw_dev *rhp,
struct idr *idr, u32 id)
{
_remove_handle(rhp, idr, id, 0);
return xa_load(&rhp->qps, qpid);
}
extern uint c4iw_max_read_depth;
@ -1038,9 +979,8 @@ int c4iw_accept_cr(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param);
int c4iw_reject_cr(struct iw_cm_id *cm_id, const void *pdata, u8 pdata_len);
void c4iw_qp_add_ref(struct ib_qp *qp);
void c4iw_qp_rem_ref(struct ib_qp *qp);
struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
enum ib_mr_type mr_type,
u32 max_num_sg);
struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg, struct ib_udata *udata);
int c4iw_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
unsigned int *sg_offset);
int c4iw_dealloc_mw(struct ib_mw *mw);
@ -1051,21 +991,19 @@ struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start,
u64 length, u64 virt, int acc,
struct ib_udata *udata);
struct ib_mr *c4iw_get_dma_mr(struct ib_pd *pd, int acc);
int c4iw_dereg_mr(struct ib_mr *ib_mr);
int c4iw_destroy_cq(struct ib_cq *ib_cq);
int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata);
int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
struct ib_cq *c4iw_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *ib_context,
struct ib_udata *udata);
int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata);
int c4iw_destroy_srq(struct ib_srq *ib_srq);
struct ib_srq *c4iw_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *attrs,
struct ib_udata *udata);
int c4iw_destroy_qp(struct ib_qp *ib_qp);
void c4iw_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata);
int c4iw_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *attrs,
struct ib_udata *udata);
int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata);
struct ib_qp *c4iw_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *attrs,
struct ib_udata *udata);

View File

@ -395,7 +395,7 @@ static int finish_mem_reg(struct c4iw_mr *mhp, u32 stag)
mhp->ibmr.iova = mhp->attr.va_fbo;
mhp->ibmr.page_size = 1U << (mhp->attr.page_size + 12);
pr_debug("mmid 0x%x mhp %p\n", mmid, mhp);
return insert_handle(mhp->rhp, &mhp->rhp->mmidr, mhp, mmid);
return xa_insert_irq(&mhp->rhp->mrs, mmid, mhp, GFP_KERNEL);
}
static int register_mem(struct c4iw_dev *rhp, struct c4iw_pd *php,
@ -542,7 +542,7 @@ struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
shift = PAGE_SHIFT;
n = mhp->umem->nmap;
n = ib_umem_num_pages(mhp->umem);
err = alloc_pbl(mhp, n);
if (err)
goto err_umem_release;
@ -645,7 +645,7 @@ struct ib_mw *c4iw_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
mhp->attr.stag = stag;
mmid = (stag) >> 8;
mhp->ibmw.rkey = stag;
if (insert_handle(rhp, &rhp->mmidr, mhp, mmid)) {
if (xa_insert_irq(&rhp->mrs, mmid, mhp, GFP_KERNEL)) {
ret = -ENOMEM;
goto dealloc_win;
}
@ -673,7 +673,7 @@ int c4iw_dealloc_mw(struct ib_mw *mw)
mhp = to_c4iw_mw(mw);
rhp = mhp->rhp;
mmid = (mw->rkey) >> 8;
remove_handle(rhp, &rhp->mmidr, mmid);
xa_erase_irq(&rhp->mrs, mmid);
deallocate_window(&rhp->rdev, mhp->attr.stag, mhp->dereg_skb,
mhp->wr_waitp);
kfree_skb(mhp->dereg_skb);
@ -683,9 +683,8 @@ int c4iw_dealloc_mw(struct ib_mw *mw)
return 0;
}
struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
enum ib_mr_type mr_type,
u32 max_num_sg)
struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_pd *php;
@ -740,7 +739,7 @@ struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
mhp->attr.state = 0;
mmid = (stag) >> 8;
mhp->ibmr.rkey = mhp->ibmr.lkey = stag;
if (insert_handle(rhp, &rhp->mmidr, mhp, mmid)) {
if (xa_insert_irq(&rhp->mrs, mmid, mhp, GFP_KERNEL)) {
ret = -ENOMEM;
goto err_dereg;
}
@ -786,7 +785,7 @@ int c4iw_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
return ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, c4iw_set_page);
}
int c4iw_dereg_mr(struct ib_mr *ib_mr)
int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_mr *mhp;
@ -797,7 +796,7 @@ int c4iw_dereg_mr(struct ib_mr *ib_mr)
mhp = to_c4iw_mr(ib_mr);
rhp = mhp->rhp;
mmid = mhp->attr.stag >> 8;
remove_handle(rhp, &rhp->mmidr, mmid);
xa_erase_irq(&rhp->mrs, mmid);
if (mhp->mpl)
dma_free_coherent(&mhp->rhp->rdev.lldi.pdev->dev,
mhp->max_mpl_len, mhp->mpl, mhp->mpl_addr);
@ -821,9 +820,9 @@ void c4iw_invalidate_mr(struct c4iw_dev *rhp, u32 rkey)
struct c4iw_mr *mhp;
unsigned long flags;
spin_lock_irqsave(&rhp->lock, flags);
mhp = get_mhp(rhp, rkey >> 8);
xa_lock_irqsave(&rhp->mrs, flags);
mhp = xa_load(&rhp->mrs, rkey >> 8);
if (mhp)
mhp->attr.state = 0;
spin_unlock_irqrestore(&rhp->lock, flags);
xa_unlock_irqrestore(&rhp->mrs, flags);
}

View File

@ -190,7 +190,7 @@ static int c4iw_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
return ret;
}
static void c4iw_deallocate_pd(struct ib_pd *pd)
static void c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_pd *php;
@ -204,8 +204,7 @@ static void c4iw_deallocate_pd(struct ib_pd *pd)
mutex_unlock(&rhp->rdev.stats.lock);
}
static int c4iw_allocate_pd(struct ib_pd *pd, struct ib_ucontext *context,
struct ib_udata *udata)
static int c4iw_allocate_pd(struct ib_pd *pd, struct ib_udata *udata)
{
struct c4iw_pd *php = to_c4iw_pd(pd);
struct ib_device *ibdev = pd->device;
@ -220,11 +219,11 @@ static int c4iw_allocate_pd(struct ib_pd *pd, struct ib_ucontext *context,
php->pdid = pdid;
php->rhp = rhp;
if (context) {
if (udata) {
struct c4iw_alloc_pd_resp uresp = {.pdid = php->pdid};
if (ib_copy_to_udata(udata, &uresp, sizeof(uresp))) {
c4iw_deallocate_pd(&php->ibpd);
c4iw_deallocate_pd(&php->ibpd, udata);
return -EFAULT;
}
}
@ -483,24 +482,6 @@ static void get_dev_fw_str(struct ib_device *dev, char *str)
FW_HDR_FW_VER_BUILD_G(c4iw_dev->rdev.lldi.fw_vers));
}
static struct net_device *get_netdev(struct ib_device *dev, u8 port)
{
struct c4iw_dev *c4iw_dev = container_of(dev, struct c4iw_dev, ibdev);
struct c4iw_rdev *rdev = &c4iw_dev->rdev;
struct net_device *ndev;
if (!port || port > rdev->lldi.nports)
return NULL;
rcu_read_lock();
ndev = rdev->lldi.ports[port - 1];
if (ndev)
dev_hold(ndev);
rcu_read_unlock();
return ndev;
}
static int fill_res_entry(struct sk_buff *msg, struct rdma_restrack_entry *res)
{
return (res->type < ARRAY_SIZE(c4iw_restrack_funcs) &&
@ -528,8 +509,15 @@ static const struct ib_device_ops c4iw_dev_ops = {
.get_dev_fw_str = get_dev_fw_str,
.get_dma_mr = c4iw_get_dma_mr,
.get_hw_stats = c4iw_get_mib,
.get_netdev = get_netdev,
.get_port_immutable = c4iw_port_immutable,
.iw_accept = c4iw_accept_cr,
.iw_add_ref = c4iw_qp_add_ref,
.iw_connect = c4iw_connect,
.iw_create_listen = c4iw_create_listen,
.iw_destroy_listen = c4iw_destroy_listen,
.iw_get_qp = c4iw_get_qp,
.iw_reject = c4iw_reject_cr,
.iw_rem_ref = c4iw_qp_rem_ref,
.map_mr_sg = c4iw_map_mr_sg,
.mmap = c4iw_mmap,
.modify_qp = c4iw_ib_modify_qp,
@ -546,9 +534,24 @@ static const struct ib_device_ops c4iw_dev_ops = {
.reg_user_mr = c4iw_reg_user_mr,
.req_notify_cq = c4iw_arm_cq,
INIT_RDMA_OBJ_SIZE(ib_pd, c4iw_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_srq, c4iw_srq, ibsrq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, c4iw_ucontext, ibucontext),
};
static int set_netdevs(struct ib_device *ib_dev, struct c4iw_rdev *rdev)
{
int ret;
int i;
for (i = 0; i < rdev->lldi.nports; i++) {
ret = ib_device_set_netdev(ib_dev, rdev->lldi.ports[i],
i + 1);
if (ret)
return ret;
}
return 0;
}
void c4iw_register_device(struct work_struct *work)
{
int ret;
@ -593,33 +596,20 @@ void c4iw_register_device(struct work_struct *work)
dev->ibdev.dev.parent = &dev->rdev.lldi.pdev->dev;
dev->ibdev.uverbs_abi_ver = C4IW_UVERBS_ABI_VERSION;
dev->ibdev.iwcm = kzalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL);
if (!dev->ibdev.iwcm) {
ret = -ENOMEM;
goto err_dealloc_ctx;
}
dev->ibdev.iwcm->connect = c4iw_connect;
dev->ibdev.iwcm->accept = c4iw_accept_cr;
dev->ibdev.iwcm->reject = c4iw_reject_cr;
dev->ibdev.iwcm->create_listen = c4iw_create_listen;
dev->ibdev.iwcm->destroy_listen = c4iw_destroy_listen;
dev->ibdev.iwcm->add_ref = c4iw_qp_add_ref;
dev->ibdev.iwcm->rem_ref = c4iw_qp_rem_ref;
dev->ibdev.iwcm->get_qp = c4iw_get_qp;
memcpy(dev->ibdev.iwcm->ifname, dev->rdev.lldi.ports[0]->name,
sizeof(dev->ibdev.iwcm->ifname));
memcpy(dev->ibdev.iw_ifname, dev->rdev.lldi.ports[0]->name,
sizeof(dev->ibdev.iw_ifname));
rdma_set_device_sysfs_group(&dev->ibdev, &c4iw_attr_group);
dev->ibdev.driver_id = RDMA_DRIVER_CXGB4;
ib_set_device_ops(&dev->ibdev, &c4iw_dev_ops);
ret = set_netdevs(&dev->ibdev, &dev->rdev);
if (ret)
goto err_dealloc_ctx;
ret = ib_register_device(&dev->ibdev, "cxgb4_%d");
if (ret)
goto err_kfree_iwcm;
goto err_dealloc_ctx;
return;
err_kfree_iwcm:
kfree(dev->ibdev.iwcm);
err_dealloc_ctx:
pr_err("%s - Failed registering iwarp device: %d\n",
pci_name(ctx->lldi.pdev), ret);
@ -631,6 +621,5 @@ void c4iw_unregister_device(struct c4iw_dev *dev)
{
pr_debug("c4iw_dev %p\n", dev);
ib_unregister_device(&dev->ibdev);
kfree(dev->ibdev.iwcm);
return;
}

View File

@ -57,18 +57,18 @@ MODULE_PARM_DESC(db_coalescing_threshold,
static int max_fr_immd = T4_MAX_FR_IMMD;
module_param(max_fr_immd, int, 0644);
MODULE_PARM_DESC(max_fr_immd, "fastreg threshold for using DSGL instead of immedate");
MODULE_PARM_DESC(max_fr_immd, "fastreg threshold for using DSGL instead of immediate");
static int alloc_ird(struct c4iw_dev *dev, u32 ird)
{
int ret = 0;
spin_lock_irq(&dev->lock);
xa_lock_irq(&dev->qps);
if (ird <= dev->avail_ird)
dev->avail_ird -= ird;
else
ret = -ENOMEM;
spin_unlock_irq(&dev->lock);
xa_unlock_irq(&dev->qps);
if (ret)
dev_warn(&dev->rdev.lldi.pdev->dev,
@ -79,9 +79,9 @@ static int alloc_ird(struct c4iw_dev *dev, u32 ird)
static void free_ird(struct c4iw_dev *dev, int ird)
{
spin_lock_irq(&dev->lock);
xa_lock_irq(&dev->qps);
dev->avail_ird += ird;
spin_unlock_irq(&dev->lock);
xa_unlock_irq(&dev->qps);
}
static void set_state(struct c4iw_qp *qhp, enum c4iw_qp_state state)
@ -939,7 +939,7 @@ static int ring_kernel_sq_db(struct c4iw_qp *qhp, u16 inc)
{
unsigned long flags;
spin_lock_irqsave(&qhp->rhp->lock, flags);
xa_lock_irqsave(&qhp->rhp->qps, flags);
spin_lock(&qhp->lock);
if (qhp->rhp->db_state == NORMAL)
t4_ring_sq_db(&qhp->wq, inc, NULL);
@ -948,7 +948,7 @@ static int ring_kernel_sq_db(struct c4iw_qp *qhp, u16 inc)
qhp->wq.sq.wq_pidx_inc += inc;
}
spin_unlock(&qhp->lock);
spin_unlock_irqrestore(&qhp->rhp->lock, flags);
xa_unlock_irqrestore(&qhp->rhp->qps, flags);
return 0;
}
@ -956,7 +956,7 @@ static int ring_kernel_rq_db(struct c4iw_qp *qhp, u16 inc)
{
unsigned long flags;
spin_lock_irqsave(&qhp->rhp->lock, flags);
xa_lock_irqsave(&qhp->rhp->qps, flags);
spin_lock(&qhp->lock);
if (qhp->rhp->db_state == NORMAL)
t4_ring_rq_db(&qhp->wq, inc, NULL);
@ -965,7 +965,7 @@ static int ring_kernel_rq_db(struct c4iw_qp *qhp, u16 inc)
qhp->wq.rq.wq_pidx_inc += inc;
}
spin_unlock(&qhp->lock);
spin_unlock_irqrestore(&qhp->rhp->lock, flags);
xa_unlock_irqrestore(&qhp->rhp->qps, flags);
return 0;
}
@ -1976,10 +1976,10 @@ int c4iw_modify_qp(struct c4iw_dev *rhp, struct c4iw_qp *qhp,
qhp->attr.layer_etype = attrs->layer_etype;
qhp->attr.ecode = attrs->ecode;
ep = qhp->ep;
c4iw_get_ep(&ep->com);
disconnect = 1;
if (!internal) {
c4iw_get_ep(&qhp->ep->com);
terminate = 1;
disconnect = 1;
} else {
terminate = qhp->attr.send_term;
ret = rdma_fini(rhp, qhp, ep);
@ -2095,7 +2095,7 @@ out:
return ret;
}
int c4iw_destroy_qp(struct ib_qp *ib_qp)
int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_qp *qhp;
@ -2111,12 +2111,11 @@ int c4iw_destroy_qp(struct ib_qp *ib_qp)
c4iw_modify_qp(rhp, qhp, C4IW_QP_ATTR_NEXT_STATE, &attrs, 0);
wait_event(qhp->wait, !qhp->ep);
remove_handle(rhp, &rhp->qpidr, qhp->wq.sq.qid);
spin_lock_irq(&rhp->lock);
xa_lock_irq(&rhp->qps);
__xa_erase(&rhp->qps, qhp->wq.sq.qid);
if (!list_empty(&qhp->db_fc_entry))
list_del_init(&qhp->db_fc_entry);
spin_unlock_irq(&rhp->lock);
xa_unlock_irq(&rhp->qps);
free_ird(rhp, qhp->attr.max_ird);
c4iw_qp_rem_ref(ib_qp);
@ -2234,7 +2233,7 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs,
kref_init(&qhp->kref);
INIT_WORK(&qhp->free_work, free_qp_work);
ret = insert_handle(rhp, &rhp->qpidr, qhp, qhp->wq.sq.qid);
ret = xa_insert_irq(&rhp->qps, qhp->wq.sq.qid, qhp, GFP_KERNEL);
if (ret)
goto err_destroy_qp;
@ -2370,7 +2369,7 @@ err_free_rq_key:
err_free_sq_key:
kfree(sq_key_mm);
err_remove_handle:
remove_handle(rhp, &rhp->qpidr, qhp->wq.sq.qid);
xa_erase_irq(&rhp->qps, qhp->wq.sq.qid);
err_destroy_qp:
destroy_qp(&rhp->rdev, &qhp->wq,
ucontext ? &ucontext->uctx : &rhp->rdev.uctx, !attrs->srq);
@ -2684,11 +2683,12 @@ void c4iw_copy_wr_to_srq(struct t4_srq *srq, union t4_recv_wr *wqe, u8 len16)
}
}
struct ib_srq *c4iw_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *attrs,
int c4iw_create_srq(struct ib_srq *ib_srq, struct ib_srq_init_attr *attrs,
struct ib_udata *udata)
{
struct ib_pd *pd = ib_srq->pd;
struct c4iw_dev *rhp;
struct c4iw_srq *srq;
struct c4iw_srq *srq = to_c4iw_srq(ib_srq);
struct c4iw_pd *php;
struct c4iw_create_srq_resp uresp;
struct c4iw_ucontext *ucontext;
@ -2703,11 +2703,11 @@ struct ib_srq *c4iw_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *attrs,
rhp = php->rhp;
if (!rhp->rdev.lldi.vr->srq.size)
return ERR_PTR(-EINVAL);
return -EINVAL;
if (attrs->attr.max_wr > rhp->rdev.hw_queue.t4_max_rq_size)
return ERR_PTR(-E2BIG);
return -E2BIG;
if (attrs->attr.max_sge > T4_MAX_RECV_SGE)
return ERR_PTR(-E2BIG);
return -E2BIG;
/*
* SRQ RQT and RQ must be a power of 2 and at least 16 deep.
@ -2718,15 +2718,9 @@ struct ib_srq *c4iw_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *attrs,
ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext,
ibucontext);
srq = kzalloc(sizeof(*srq), GFP_KERNEL);
if (!srq)
return ERR_PTR(-ENOMEM);
srq->wr_waitp = c4iw_alloc_wr_wait(GFP_KERNEL);
if (!srq->wr_waitp) {
ret = -ENOMEM;
goto err_free_srq;
}
if (!srq->wr_waitp)
return -ENOMEM;
srq->idx = c4iw_alloc_srq_idx(&rhp->rdev);
if (srq->idx < 0) {
@ -2760,7 +2754,7 @@ struct ib_srq *c4iw_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *attrs,
if (CHELSIO_CHIP_VERSION(rhp->rdev.lldi.adapter_type) > CHELSIO_T6)
srq->flags = T4_SRQ_LIMIT_SUPPORT;
ret = insert_handle(rhp, &rhp->qpidr, srq, srq->wq.qid);
ret = xa_insert_irq(&rhp->qps, srq->wq.qid, srq, GFP_KERNEL);
if (ret)
goto err_free_queue;
@ -2806,13 +2800,14 @@ struct ib_srq *c4iw_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *attrs,
(unsigned long)srq->wq.memsize, attrs->attr.max_wr);
spin_lock_init(&srq->lock);
return &srq->ibsrq;
return 0;
err_free_srq_db_key_mm:
kfree(srq_db_key_mm);
err_free_srq_key_mm:
kfree(srq_key_mm);
err_remove_handle:
remove_handle(rhp, &rhp->qpidr, srq->wq.qid);
xa_erase_irq(&rhp->qps, srq->wq.qid);
err_free_queue:
free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx,
srq->wr_waitp);
@ -2822,12 +2817,10 @@ err_free_srq_idx:
c4iw_free_srq_idx(&rhp->rdev, srq->idx);
err_free_wr_wait:
c4iw_put_wr_wait(srq->wr_waitp);
err_free_srq:
kfree(srq);
return ERR_PTR(ret);
return ret;
}
int c4iw_destroy_srq(struct ib_srq *ibsrq)
void c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
{
struct c4iw_dev *rhp;
struct c4iw_srq *srq;
@ -2838,13 +2831,11 @@ int c4iw_destroy_srq(struct ib_srq *ibsrq)
pr_debug("%s id %d\n", __func__, srq->wq.qid);
remove_handle(rhp, &rhp->qpidr, srq->wq.qid);
ucontext = ibsrq->uobject ?
to_c4iw_ucontext(ibsrq->uobject->context) : NULL;
xa_erase_irq(&rhp->qps, srq->wq.qid);
ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext,
ibucontext);
free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx,
srq->wr_waitp);
c4iw_free_srq_idx(&rhp->rdev, srq->idx);
c4iw_put_wr_wait(srq->wr_waitp);
kfree(srq);
return 0;
}

View File

@ -0,0 +1,15 @@
# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
# Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
#
# Amazon fabric device configuration
#
config INFINIBAND_EFA
tristate "Amazon Elastic Fabric Adapter (EFA) support"
depends on PCI_MSI && 64BIT && !CPU_BIG_ENDIAN
depends on INFINIBAND_USER_ACCESS
help
This driver supports Amazon Elastic Fabric Adapter (EFA).
To compile this driver as a module, choose M here.
The module will be called efa.

View File

@ -0,0 +1,9 @@
# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
# Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
#
# Makefile for Amazon Elastic Fabric Adapter (EFA) device driver.
#
obj-$(CONFIG_INFINIBAND_EFA) += efa.o
efa-y := efa_com_cmd.o efa_com.o efa_main.o efa_verbs.o

View File

@ -0,0 +1,163 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_H_
#define _EFA_H_
#include <linux/bitops.h>
#include <linux/idr.h>
#include <linux/interrupt.h>
#include <linux/pci.h>
#include <linux/sched.h>
#include <rdma/efa-abi.h>
#include <rdma/ib_verbs.h>
#include "efa_com_cmd.h"
#define DRV_MODULE_NAME "efa"
#define DEVICE_NAME "Elastic Fabric Adapter (EFA)"
#define EFA_IRQNAME_SIZE 40
/* 1 for AENQ + ADMIN */
#define EFA_NUM_MSIX_VEC 1
#define EFA_MGMNT_MSIX_VEC_IDX 0
struct efa_irq {
irq_handler_t handler;
void *data;
int cpu;
u32 vector;
cpumask_t affinity_hint_mask;
char name[EFA_IRQNAME_SIZE];
};
struct efa_sw_stats {
atomic64_t alloc_pd_err;
atomic64_t create_qp_err;
atomic64_t create_cq_err;
atomic64_t reg_mr_err;
atomic64_t alloc_ucontext_err;
atomic64_t create_ah_err;
};
/* Don't use anything other than atomic64 */
struct efa_stats {
struct efa_sw_stats sw_stats;
atomic64_t keep_alive_rcvd;
};
struct efa_dev {
struct ib_device ibdev;
struct efa_com_dev edev;
struct pci_dev *pdev;
struct efa_com_get_device_attr_result dev_attr;
u64 reg_bar_addr;
u64 reg_bar_len;
u64 mem_bar_addr;
u64 mem_bar_len;
u64 db_bar_addr;
u64 db_bar_len;
u8 addr[EFA_GID_SIZE];
u32 mtu;
int admin_msix_vector_idx;
struct efa_irq admin_irq;
struct efa_stats stats;
};
struct efa_ucontext {
struct ib_ucontext ibucontext;
struct xarray mmap_xa;
u32 mmap_xa_page;
u16 uarn;
};
struct efa_pd {
struct ib_pd ibpd;
u16 pdn;
};
struct efa_mr {
struct ib_mr ibmr;
struct ib_umem *umem;
};
struct efa_cq {
struct ib_cq ibcq;
struct efa_ucontext *ucontext;
dma_addr_t dma_addr;
void *cpu_addr;
size_t size;
u16 cq_idx;
};
struct efa_qp {
struct ib_qp ibqp;
dma_addr_t rq_dma_addr;
void *rq_cpu_addr;
size_t rq_size;
enum ib_qp_state state;
u32 qp_handle;
u32 max_send_wr;
u32 max_recv_wr;
u32 max_send_sge;
u32 max_recv_sge;
u32 max_inline_data;
};
struct efa_ah {
struct ib_ah ibah;
u16 ah;
/* dest_addr */
u8 id[EFA_GID_SIZE];
};
int efa_query_device(struct ib_device *ibdev,
struct ib_device_attr *props,
struct ib_udata *udata);
int efa_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);
int efa_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr);
int efa_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid);
int efa_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
u16 *pkey);
int efa_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
struct ib_qp *efa_create_qp(struct ib_pd *ibpd,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata);
int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
struct ib_cq *efa_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_udata *udata);
struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
u64 virt_addr, int access_flags,
struct ib_udata *udata);
int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
int efa_get_port_immutable(struct ib_device *ibdev, u8 port_num,
struct ib_port_immutable *immutable);
int efa_alloc_ucontext(struct ib_ucontext *ibucontext, struct ib_udata *udata);
void efa_dealloc_ucontext(struct ib_ucontext *ibucontext);
int efa_mmap(struct ib_ucontext *ibucontext,
struct vm_area_struct *vma);
int efa_create_ah(struct ib_ah *ibah,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct ib_udata *udata);
void efa_destroy_ah(struct ib_ah *ibah, u32 flags);
int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_udata *udata);
enum rdma_link_layer efa_port_link_layer(struct ib_device *ibdev,
u8 port_num);
#endif /* _EFA_H_ */

View File

@ -0,0 +1,794 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_ADMIN_CMDS_H_
#define _EFA_ADMIN_CMDS_H_
#define EFA_ADMIN_API_VERSION_MAJOR 0
#define EFA_ADMIN_API_VERSION_MINOR 1
/* EFA admin queue opcodes */
enum efa_admin_aq_opcode {
EFA_ADMIN_CREATE_QP = 1,
EFA_ADMIN_MODIFY_QP = 2,
EFA_ADMIN_QUERY_QP = 3,
EFA_ADMIN_DESTROY_QP = 4,
EFA_ADMIN_CREATE_AH = 5,
EFA_ADMIN_DESTROY_AH = 6,
EFA_ADMIN_REG_MR = 7,
EFA_ADMIN_DEREG_MR = 8,
EFA_ADMIN_CREATE_CQ = 9,
EFA_ADMIN_DESTROY_CQ = 10,
EFA_ADMIN_GET_FEATURE = 11,
EFA_ADMIN_SET_FEATURE = 12,
EFA_ADMIN_GET_STATS = 13,
EFA_ADMIN_ALLOC_PD = 14,
EFA_ADMIN_DEALLOC_PD = 15,
EFA_ADMIN_ALLOC_UAR = 16,
EFA_ADMIN_DEALLOC_UAR = 17,
EFA_ADMIN_MAX_OPCODE = 17,
};
enum efa_admin_aq_feature_id {
EFA_ADMIN_DEVICE_ATTR = 1,
EFA_ADMIN_AENQ_CONFIG = 2,
EFA_ADMIN_NETWORK_ATTR = 3,
EFA_ADMIN_QUEUE_ATTR = 4,
EFA_ADMIN_HW_HINTS = 5,
EFA_ADMIN_FEATURES_OPCODE_NUM = 8,
};
/* QP transport type */
enum efa_admin_qp_type {
/* Unreliable Datagram */
EFA_ADMIN_QP_TYPE_UD = 1,
/* Scalable Reliable Datagram */
EFA_ADMIN_QP_TYPE_SRD = 2,
};
/* QP state */
enum efa_admin_qp_state {
EFA_ADMIN_QP_STATE_RESET = 0,
EFA_ADMIN_QP_STATE_INIT = 1,
EFA_ADMIN_QP_STATE_RTR = 2,
EFA_ADMIN_QP_STATE_RTS = 3,
EFA_ADMIN_QP_STATE_SQD = 4,
EFA_ADMIN_QP_STATE_SQE = 5,
EFA_ADMIN_QP_STATE_ERR = 6,
};
enum efa_admin_get_stats_type {
EFA_ADMIN_GET_STATS_TYPE_BASIC = 0,
};
enum efa_admin_get_stats_scope {
EFA_ADMIN_GET_STATS_SCOPE_ALL = 0,
EFA_ADMIN_GET_STATS_SCOPE_QUEUE = 1,
};
enum efa_admin_modify_qp_mask_bits {
EFA_ADMIN_QP_STATE_BIT = 0,
EFA_ADMIN_CUR_QP_STATE_BIT = 1,
EFA_ADMIN_QKEY_BIT = 2,
EFA_ADMIN_SQ_PSN_BIT = 3,
EFA_ADMIN_SQ_DRAINED_ASYNC_NOTIFY_BIT = 4,
};
/*
* QP allocation sizes, converted by fabric QueuePair (QP) create command
* from QP capabilities.
*/
struct efa_admin_qp_alloc_size {
/* Send descriptor ring size in bytes */
u32 send_queue_ring_size;
/* Max number of WQEs that can be outstanding on send queue. */
u32 send_queue_depth;
/*
* Recv descriptor ring size in bytes, sufficient for user-provided
* number of WQEs
*/
u32 recv_queue_ring_size;
/* Max number of WQEs that can be outstanding on recv queue */
u32 recv_queue_depth;
};
struct efa_admin_create_qp_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* Protection Domain associated with this QP */
u16 pd;
/* QP type */
u8 qp_type;
/*
* 0 : sq_virt - If set, SQ ring base address is
* virtual (IOVA returned by MR registration)
* 1 : rq_virt - If set, RQ ring base address is
* virtual (IOVA returned by MR registration)
* 7:2 : reserved - MBZ
*/
u8 flags;
/*
* Send queue (SQ) ring base physical address. This field is not
* used if this is a Low Latency Queue(LLQ).
*/
u64 sq_base_addr;
/* Receive queue (RQ) ring base address. */
u64 rq_base_addr;
/* Index of CQ to be associated with Send Queue completions */
u32 send_cq_idx;
/* Index of CQ to be associated with Recv Queue completions */
u32 recv_cq_idx;
/*
* Memory registration key for the SQ ring, used only when not in
* LLQ mode and base address is virtual
*/
u32 sq_l_key;
/*
* Memory registration key for the RQ ring, used only when base
* address is virtual
*/
u32 rq_l_key;
/* Requested QP allocation sizes */
struct efa_admin_qp_alloc_size qp_alloc_size;
/* UAR number */
u16 uar;
/* MBZ */
u16 reserved;
/* MBZ */
u32 reserved2;
};
struct efa_admin_create_qp_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
/* Opaque handle to be used for consequent operations on the QP */
u32 qp_handle;
/* QP number in the given EFA virtual device */
u16 qp_num;
/* MBZ */
u16 reserved;
/* Index of sub-CQ for Send Queue completions */
u16 send_sub_cq_idx;
/* Index of sub-CQ for Receive Queue completions */
u16 recv_sub_cq_idx;
/* SQ doorbell address, as offset to PCIe DB BAR */
u32 sq_db_offset;
/* RQ doorbell address, as offset to PCIe DB BAR */
u32 rq_db_offset;
/*
* low latency send queue ring base address as an offset to PCIe
* MMIO LLQ_MEM BAR
*/
u32 llq_descriptors_offset;
};
struct efa_admin_modify_qp_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/*
* Mask indicating which fields should be updated see enum
* efa_admin_modify_qp_mask_bits
*/
u32 modify_mask;
/* QP handle returned by create_qp command */
u32 qp_handle;
/* QP state */
u32 qp_state;
/* Override current QP state (before applying the transition) */
u32 cur_qp_state;
/* QKey */
u32 qkey;
/* SQ PSN */
u32 sq_psn;
/* Enable async notification when SQ is drained */
u8 sq_drained_async_notify;
/* MBZ */
u8 reserved1;
/* MBZ */
u16 reserved2;
};
struct efa_admin_modify_qp_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
};
struct efa_admin_query_qp_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* QP handle returned by create_qp command */
u32 qp_handle;
};
struct efa_admin_query_qp_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
/* QP state */
u32 qp_state;
/* QKey */
u32 qkey;
/* SQ PSN */
u32 sq_psn;
/* Indicates that draining is in progress */
u8 sq_draining;
/* MBZ */
u8 reserved1;
/* MBZ */
u16 reserved2;
};
struct efa_admin_destroy_qp_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* QP handle returned by create_qp command */
u32 qp_handle;
};
struct efa_admin_destroy_qp_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
};
/*
* Create Address Handle command parameters. Must not be called more than
* once for the same destination
*/
struct efa_admin_create_ah_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* Destination address in network byte order */
u8 dest_addr[16];
/* PD number */
u16 pd;
u16 reserved;
};
struct efa_admin_create_ah_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
/* Target interface address handle (opaque) */
u16 ah;
u16 reserved;
};
struct efa_admin_destroy_ah_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* Target interface address handle (opaque) */
u16 ah;
/* PD number */
u16 pd;
};
struct efa_admin_destroy_ah_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
};
/*
* Registration of MemoryRegion, required for QP working with Virtual
* Addresses. In standard verbs semantics, region length is limited to 2GB
* space, but EFA offers larger MR support for large memory space, to ease
* on users working with very large datasets (i.e. full GPU memory mapping).
*/
struct efa_admin_reg_mr_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* Protection Domain */
u16 pd;
/* MBZ */
u16 reserved16_w1;
/* Physical Buffer List, each element is page-aligned. */
union {
/*
* Inline array of guest-physical page addresses of user
* memory pages (optimization for short region
* registrations)
*/
u64 inline_pbl_array[4];
/* points to PBL (direct or indirect, chained if needed) */
struct efa_admin_ctrl_buff_info pbl;
} pbl;
/* Memory region length, in bytes. */
u64 mr_length;
/*
* flags and page size
* 4:0 : phys_page_size_shift - page size is (1 <<
* phys_page_size_shift). Page size is used for
* building the Virtual to Physical address mapping
* 6:5 : reserved - MBZ
* 7 : mem_addr_phy_mode_en - Enable bit for physical
* memory registration (no translation), can be used
* only by privileged clients. If set, PBL must
* contain a single entry.
*/
u8 flags;
/*
* permissions
* 0 : local_write_enable - Write permissions: value
* of 1 needed for RQ buffers and for RDMA write
* 7:1 : reserved1 - remote access flags, etc
*/
u8 permissions;
u16 reserved16_w5;
/* number of pages in PBL (redundant, could be calculated) */
u32 page_num;
/*
* IO Virtual Address associated with this MR. If
* mem_addr_phy_mode_en is set, contains the physical address of
* the region.
*/
u64 iova;
};
struct efa_admin_reg_mr_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
/*
* L_Key, to be used in conjunction with local buffer references in
* SQ and RQ WQE, or with virtual RQ/CQ rings
*/
u32 l_key;
/*
* R_Key, to be used in RDMA messages to refer to remotely accessed
* memory region
*/
u32 r_key;
};
struct efa_admin_dereg_mr_cmd {
/* Common Admin Queue descriptor */
struct efa_admin_aq_common_desc aq_common_desc;
/* L_Key, memory region's l_key */
u32 l_key;
};
struct efa_admin_dereg_mr_resp {
/* Common Admin Queue completion descriptor */
struct efa_admin_acq_common_desc acq_common_desc;
};
struct efa_admin_create_cq_cmd {
struct efa_admin_aq_common_desc aq_common_desc;
/*
* 4:0 : reserved5
* 5 : interrupt_mode_enabled - if set, cq operates
* in interrupt mode (i.e. CQ events and MSI-X are
* generated), otherwise - polling
* 6 : virt - If set, ring base address is virtual
* (IOVA returned by MR registration)
* 7 : reserved6
*/
u8 cq_caps_1;
/*
* 4:0 : cq_entry_size_words - size of CQ entry in
* 32-bit words, valid values: 4, 8.
* 7:5 : reserved7
*/
u8 cq_caps_2;
/* completion queue depth in # of entries. must be power of 2 */
u16 cq_depth;
/* msix vector assigned to this cq */
u32 msix_vector_idx;
/*
* CQ ring base address, virtual or physical depending on 'virt'
* flag
*/
struct efa_common_mem_addr cq_ba;
/*
* Memory registration key for the ring, used only when base
* address is virtual
*/
u32 l_key;
/*
* number of sub cqs - must be equal to sub_cqs_per_cq of queue
* attributes.
*/
u16 num_sub_cqs;
/* UAR number */
u16 uar;
};
struct efa_admin_create_cq_resp {
struct efa_admin_acq_common_desc acq_common_desc;
u16 cq_idx;
/* actual cq depth in number of entries */
u16 cq_actual_depth;
};
struct efa_admin_destroy_cq_cmd {
struct efa_admin_aq_common_desc aq_common_desc;
u16 cq_idx;
u16 reserved1;
};
struct efa_admin_destroy_cq_resp {
struct efa_admin_acq_common_desc acq_common_desc;
};
/*
* EFA AQ Get Statistics command. Extended statistics are placed in control
* buffer pointed by AQ entry
*/
struct efa_admin_aq_get_stats_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
union {
/* command specific inline data */
u32 inline_data_w1[3];
struct efa_admin_ctrl_buff_info control_buffer;
} u;
/* stats type as defined in enum efa_admin_get_stats_type */
u8 type;
/* stats scope defined in enum efa_admin_get_stats_scope */
u8 scope;
u16 scope_modifier;
};
struct efa_admin_basic_stats {
u64 tx_bytes;
u64 tx_pkts;
u64 rx_bytes;
u64 rx_pkts;
u64 rx_drops;
};
struct efa_admin_acq_get_stats_resp {
struct efa_admin_acq_common_desc acq_common_desc;
struct efa_admin_basic_stats basic_stats;
};
struct efa_admin_get_set_feature_common_desc {
/*
* 1:0 : select - 0x1 - current value; 0x3 - default
* value
* 7:3 : reserved3
*/
u8 flags;
/* as appears in efa_admin_aq_feature_id */
u8 feature_id;
/* MBZ */
u16 reserved16;
};
struct efa_admin_feature_device_attr_desc {
/* Bitmap of efa_admin_aq_feature_id */
u64 supported_features;
/* Bitmap of supported page sizes in MR registrations */
u64 page_size_cap;
u32 fw_version;
u32 admin_api_version;
u32 device_version;
/* Bar used for SQ and RQ doorbells */
u16 db_bar;
/* Indicates how many bits are used physical address access */
u8 phys_addr_width;
/* Indicates how many bits are used virtual address access */
u8 virt_addr_width;
};
struct efa_admin_feature_queue_attr_desc {
/* The maximum number of queue pairs supported */
u32 max_qp;
u32 max_sq_depth;
/* max send wr used in inline-buf */
u32 inline_buf_size;
u32 max_rq_depth;
/* The maximum number of completion queues supported per VF */
u32 max_cq;
u32 max_cq_depth;
/* Number of sub-CQs to be created for each CQ */
u16 sub_cqs_per_cq;
u16 reserved;
/*
* Maximum number of SGEs (buffs) allowed for a single send work
* queue element (WQE)
*/
u16 max_wr_send_sges;
/* Maximum number of SGEs allowed for a single recv WQE */
u16 max_wr_recv_sges;
/* The maximum number of memory regions supported */
u32 max_mr;
/* The maximum number of pages can be registered */
u32 max_mr_pages;
/* The maximum number of protection domains supported */
u32 max_pd;
/* The maximum number of address handles supported */
u32 max_ah;
/* The maximum size of LLQ in bytes */
u32 max_llq_size;
};
struct efa_admin_feature_aenq_desc {
/* bitmask for AENQ groups the device can report */
u32 supported_groups;
/* bitmask for AENQ groups to report */
u32 enabled_groups;
};
struct efa_admin_feature_network_attr_desc {
/* Raw address data in network byte order */
u8 addr[16];
u32 mtu;
};
/*
* When hint value is 0, hints capabilities are not supported or driver
* should use its own predefined value
*/
struct efa_admin_hw_hints {
/* value in ms */
u16 mmio_read_timeout;
/* value in ms */
u16 driver_watchdog_timeout;
/* value in ms */
u16 admin_completion_timeout;
/* poll interval in ms */
u16 poll_interval;
};
struct efa_admin_get_feature_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
struct efa_admin_ctrl_buff_info control_buffer;
struct efa_admin_get_set_feature_common_desc feature_common;
u32 raw[11];
};
struct efa_admin_get_feature_resp {
struct efa_admin_acq_common_desc acq_common_desc;
union {
u32 raw[14];
struct efa_admin_feature_device_attr_desc device_attr;
struct efa_admin_feature_aenq_desc aenq;
struct efa_admin_feature_network_attr_desc network_attr;
struct efa_admin_feature_queue_attr_desc queue_attr;
struct efa_admin_hw_hints hw_hints;
} u;
};
struct efa_admin_set_feature_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
struct efa_admin_ctrl_buff_info control_buffer;
struct efa_admin_get_set_feature_common_desc feature_common;
union {
u32 raw[11];
/* AENQ configuration */
struct efa_admin_feature_aenq_desc aenq;
} u;
};
struct efa_admin_set_feature_resp {
struct efa_admin_acq_common_desc acq_common_desc;
union {
u32 raw[14];
} u;
};
struct efa_admin_alloc_pd_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
};
struct efa_admin_alloc_pd_resp {
struct efa_admin_acq_common_desc acq_common_desc;
/* PD number */
u16 pd;
/* MBZ */
u16 reserved;
};
struct efa_admin_dealloc_pd_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
/* PD number */
u16 pd;
/* MBZ */
u16 reserved;
};
struct efa_admin_dealloc_pd_resp {
struct efa_admin_acq_common_desc acq_common_desc;
};
struct efa_admin_alloc_uar_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
};
struct efa_admin_alloc_uar_resp {
struct efa_admin_acq_common_desc acq_common_desc;
/* UAR number */
u16 uar;
/* MBZ */
u16 reserved;
};
struct efa_admin_dealloc_uar_cmd {
struct efa_admin_aq_common_desc aq_common_descriptor;
/* UAR number */
u16 uar;
/* MBZ */
u16 reserved;
};
struct efa_admin_dealloc_uar_resp {
struct efa_admin_acq_common_desc acq_common_desc;
};
/* asynchronous event notification groups */
enum efa_admin_aenq_group {
EFA_ADMIN_FATAL_ERROR = 1,
EFA_ADMIN_WARNING = 2,
EFA_ADMIN_NOTIFICATION = 3,
EFA_ADMIN_KEEP_ALIVE = 4,
EFA_ADMIN_AENQ_GROUPS_NUM = 5,
};
enum efa_admin_aenq_notification_syndrom {
EFA_ADMIN_SUSPEND = 0,
EFA_ADMIN_RESUME = 1,
EFA_ADMIN_UPDATE_HINTS = 2,
};
struct efa_admin_mmio_req_read_less_resp {
u16 req_id;
u16 reg_off;
/* value is valid when poll is cleared */
u32 reg_val;
};
/* create_qp_cmd */
#define EFA_ADMIN_CREATE_QP_CMD_SQ_VIRT_MASK BIT(0)
#define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_SHIFT 1
#define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_MASK BIT(1)
/* reg_mr_cmd */
#define EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK GENMASK(4, 0)
#define EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_SHIFT 7
#define EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_MASK BIT(7)
#define EFA_ADMIN_REG_MR_CMD_LOCAL_WRITE_ENABLE_MASK BIT(0)
/* create_cq_cmd */
#define EFA_ADMIN_CREATE_CQ_CMD_INTERRUPT_MODE_ENABLED_SHIFT 5
#define EFA_ADMIN_CREATE_CQ_CMD_INTERRUPT_MODE_ENABLED_MASK BIT(5)
#define EFA_ADMIN_CREATE_CQ_CMD_VIRT_SHIFT 6
#define EFA_ADMIN_CREATE_CQ_CMD_VIRT_MASK BIT(6)
#define EFA_ADMIN_CREATE_CQ_CMD_CQ_ENTRY_SIZE_WORDS_MASK GENMASK(4, 0)
/* get_set_feature_common_desc */
#define EFA_ADMIN_GET_SET_FEATURE_COMMON_DESC_SELECT_MASK GENMASK(1, 0)
#endif /* _EFA_ADMIN_CMDS_H_ */

View File

@ -0,0 +1,136 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_ADMIN_H_
#define _EFA_ADMIN_H_
enum efa_admin_aq_completion_status {
EFA_ADMIN_SUCCESS = 0,
EFA_ADMIN_RESOURCE_ALLOCATION_FAILURE = 1,
EFA_ADMIN_BAD_OPCODE = 2,
EFA_ADMIN_UNSUPPORTED_OPCODE = 3,
EFA_ADMIN_MALFORMED_REQUEST = 4,
/* Additional status is provided in ACQ entry extended_status */
EFA_ADMIN_ILLEGAL_PARAMETER = 5,
EFA_ADMIN_UNKNOWN_ERROR = 6,
EFA_ADMIN_RESOURCE_BUSY = 7,
};
struct efa_admin_aq_common_desc {
/*
* 11:0 : command_id
* 15:12 : reserved12
*/
u16 command_id;
/* as appears in efa_admin_aq_opcode */
u8 opcode;
/*
* 0 : phase
* 1 : ctrl_data - control buffer address valid
* 2 : ctrl_data_indirect - control buffer address
* points to list of pages with addresses of control
* buffers
* 7:3 : reserved3
*/
u8 flags;
};
/*
* used in efa_admin_aq_entry. Can point directly to control data, or to a
* page list chunk. Used also at the end of indirect mode page list chunks,
* for chaining.
*/
struct efa_admin_ctrl_buff_info {
u32 length;
struct efa_common_mem_addr address;
};
struct efa_admin_aq_entry {
struct efa_admin_aq_common_desc aq_common_descriptor;
union {
u32 inline_data_w1[3];
struct efa_admin_ctrl_buff_info control_buffer;
} u;
u32 inline_data_w4[12];
};
struct efa_admin_acq_common_desc {
/*
* command identifier to associate it with the aq descriptor
* 11:0 : command_id
* 15:12 : reserved12
*/
u16 command;
u8 status;
/*
* 0 : phase
* 7:1 : reserved1
*/
u8 flags;
u16 extended_status;
/*
* indicates to the driver which AQ entry has been consumed by the
* device and could be reused
*/
u16 sq_head_indx;
};
struct efa_admin_acq_entry {
struct efa_admin_acq_common_desc acq_common_descriptor;
u32 response_specific_data[14];
};
struct efa_admin_aenq_common_desc {
u16 group;
u16 syndrom;
/*
* 0 : phase
* 7:1 : reserved - MBZ
*/
u8 flags;
u8 reserved1[3];
u32 timestamp_low;
u32 timestamp_high;
};
struct efa_admin_aenq_entry {
struct efa_admin_aenq_common_desc aenq_common_desc;
/* command specific inline data */
u32 inline_data_w4[12];
};
/* aq_common_desc */
#define EFA_ADMIN_AQ_COMMON_DESC_COMMAND_ID_MASK GENMASK(11, 0)
#define EFA_ADMIN_AQ_COMMON_DESC_PHASE_MASK BIT(0)
#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_SHIFT 1
#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_MASK BIT(1)
#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_SHIFT 2
#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK BIT(2)
/* acq_common_desc */
#define EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK GENMASK(11, 0)
#define EFA_ADMIN_ACQ_COMMON_DESC_PHASE_MASK BIT(0)
/* aenq_common_desc */
#define EFA_ADMIN_AENQ_COMMON_DESC_PHASE_MASK BIT(0)
#endif /* _EFA_ADMIN_H_ */

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,144 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_COM_H_
#define _EFA_COM_H_
#include <linux/delay.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/semaphore.h>
#include <linux/sched.h>
#include <rdma/ib_verbs.h>
#include "efa_common_defs.h"
#include "efa_admin_defs.h"
#include "efa_admin_cmds_defs.h"
#include "efa_regs_defs.h"
#define EFA_MAX_HANDLERS 256
struct efa_com_admin_cq {
struct efa_admin_acq_entry *entries;
dma_addr_t dma_addr;
spinlock_t lock; /* Protects ACQ */
u16 cc; /* consumer counter */
u8 phase;
};
struct efa_com_admin_sq {
struct efa_admin_aq_entry *entries;
dma_addr_t dma_addr;
spinlock_t lock; /* Protects ASQ */
u32 __iomem *db_addr;
u16 cc; /* consumer counter */
u16 pc; /* producer counter */
u8 phase;
};
/* Don't use anything other than atomic64 */
struct efa_com_stats_admin {
atomic64_t aborted_cmd;
atomic64_t submitted_cmd;
atomic64_t completed_cmd;
atomic64_t no_completion;
};
enum {
EFA_AQ_STATE_RUNNING_BIT = 0,
EFA_AQ_STATE_POLLING_BIT = 1,
};
struct efa_com_admin_queue {
void *dmadev;
void *efa_dev;
struct efa_comp_ctx *comp_ctx;
u32 completion_timeout; /* usecs */
u16 poll_interval; /* msecs */
u16 depth;
struct efa_com_admin_cq cq;
struct efa_com_admin_sq sq;
u16 msix_vector_idx;
unsigned long state;
/* Count the number of available admin commands */
struct semaphore avail_cmds;
struct efa_com_stats_admin stats;
spinlock_t comp_ctx_lock; /* Protects completion context pool */
u32 *comp_ctx_pool;
u16 comp_ctx_pool_next;
};
struct efa_aenq_handlers;
struct efa_com_aenq {
struct efa_admin_aenq_entry *entries;
struct efa_aenq_handlers *aenq_handlers;
dma_addr_t dma_addr;
u32 cc; /* consumer counter */
u16 msix_vector_idx;
u16 depth;
u8 phase;
};
struct efa_com_mmio_read {
struct efa_admin_mmio_req_read_less_resp *read_resp;
dma_addr_t read_resp_dma_addr;
u16 seq_num;
u16 mmio_read_timeout; /* usecs */
/* serializes mmio reads */
spinlock_t lock;
};
struct efa_com_dev {
struct efa_com_admin_queue aq;
struct efa_com_aenq aenq;
u8 __iomem *reg_bar;
void *dmadev;
void *efa_dev;
u32 supported_features;
u32 dma_addr_bits;
struct efa_com_mmio_read mmio_read;
};
typedef void (*efa_aenq_handler)(void *data,
struct efa_admin_aenq_entry *aenq_e);
/* Holds aenq handlers. Indexed by AENQ event group */
struct efa_aenq_handlers {
efa_aenq_handler handlers[EFA_MAX_HANDLERS];
efa_aenq_handler unimplemented_handler;
};
int efa_com_admin_init(struct efa_com_dev *edev,
struct efa_aenq_handlers *aenq_handlers);
void efa_com_admin_destroy(struct efa_com_dev *edev);
int efa_com_dev_reset(struct efa_com_dev *edev,
enum efa_regs_reset_reason_types reset_reason);
void efa_com_set_admin_polling_mode(struct efa_com_dev *edev, bool polling);
void efa_com_admin_q_comp_intr_handler(struct efa_com_dev *edev);
int efa_com_mmio_reg_read_init(struct efa_com_dev *edev);
void efa_com_mmio_reg_read_destroy(struct efa_com_dev *edev);
int efa_com_validate_version(struct efa_com_dev *edev);
int efa_com_get_dma_width(struct efa_com_dev *edev);
int efa_com_cmd_exec(struct efa_com_admin_queue *aq,
struct efa_admin_aq_entry *cmd,
size_t cmd_size,
struct efa_admin_acq_entry *comp,
size_t comp_size);
void efa_com_aenq_intr_handler(struct efa_com_dev *edev, void *data);
#endif /* _EFA_COM_H_ */

View File

@ -0,0 +1,692 @@
// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#include "efa.h"
#include "efa_com.h"
#include "efa_com_cmd.h"
void efa_com_set_dma_addr(dma_addr_t addr, u32 *addr_high, u32 *addr_low)
{
*addr_low = lower_32_bits(addr);
*addr_high = upper_32_bits(addr);
}
int efa_com_create_qp(struct efa_com_dev *edev,
struct efa_com_create_qp_params *params,
struct efa_com_create_qp_result *res)
{
struct efa_admin_create_qp_cmd create_qp_cmd = {};
struct efa_admin_create_qp_resp cmd_completion;
struct efa_com_admin_queue *aq = &edev->aq;
int err;
create_qp_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_QP;
create_qp_cmd.pd = params->pd;
create_qp_cmd.qp_type = params->qp_type;
create_qp_cmd.rq_base_addr = params->rq_base_addr;
create_qp_cmd.send_cq_idx = params->send_cq_idx;
create_qp_cmd.recv_cq_idx = params->recv_cq_idx;
create_qp_cmd.qp_alloc_size.send_queue_ring_size =
params->sq_ring_size_in_bytes;
create_qp_cmd.qp_alloc_size.send_queue_depth =
params->sq_depth;
create_qp_cmd.qp_alloc_size.recv_queue_ring_size =
params->rq_ring_size_in_bytes;
create_qp_cmd.qp_alloc_size.recv_queue_depth =
params->rq_depth;
create_qp_cmd.uar = params->uarn;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&create_qp_cmd,
sizeof(create_qp_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err) {
ibdev_err(edev->efa_dev, "Failed to create qp [%d]\n", err);
return err;
}
res->qp_handle = cmd_completion.qp_handle;
res->qp_num = cmd_completion.qp_num;
res->sq_db_offset = cmd_completion.sq_db_offset;
res->rq_db_offset = cmd_completion.rq_db_offset;
res->llq_descriptors_offset = cmd_completion.llq_descriptors_offset;
res->send_sub_cq_idx = cmd_completion.send_sub_cq_idx;
res->recv_sub_cq_idx = cmd_completion.recv_sub_cq_idx;
return err;
}
int efa_com_modify_qp(struct efa_com_dev *edev,
struct efa_com_modify_qp_params *params)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_modify_qp_cmd cmd = {};
struct efa_admin_modify_qp_resp resp;
int err;
cmd.aq_common_desc.opcode = EFA_ADMIN_MODIFY_QP;
cmd.modify_mask = params->modify_mask;
cmd.qp_handle = params->qp_handle;
cmd.qp_state = params->qp_state;
cmd.cur_qp_state = params->cur_qp_state;
cmd.qkey = params->qkey;
cmd.sq_psn = params->sq_psn;
cmd.sq_drained_async_notify = params->sq_drained_async_notify;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev,
"Failed to modify qp-%u modify_mask[%#x] [%d]\n",
cmd.qp_handle, cmd.modify_mask, err);
return err;
}
return 0;
}
int efa_com_query_qp(struct efa_com_dev *edev,
struct efa_com_query_qp_params *params,
struct efa_com_query_qp_result *result)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_query_qp_cmd cmd = {};
struct efa_admin_query_qp_resp resp;
int err;
cmd.aq_common_desc.opcode = EFA_ADMIN_QUERY_QP;
cmd.qp_handle = params->qp_handle;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev, "Failed to query qp-%u [%d]\n",
cmd.qp_handle, err);
return err;
}
result->qp_state = resp.qp_state;
result->qkey = resp.qkey;
result->sq_draining = resp.sq_draining;
result->sq_psn = resp.sq_psn;
return 0;
}
int efa_com_destroy_qp(struct efa_com_dev *edev,
struct efa_com_destroy_qp_params *params)
{
struct efa_admin_destroy_qp_resp cmd_completion;
struct efa_admin_destroy_qp_cmd qp_cmd = {};
struct efa_com_admin_queue *aq = &edev->aq;
int err;
qp_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_QP;
qp_cmd.qp_handle = params->qp_handle;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&qp_cmd,
sizeof(qp_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err)
ibdev_err(edev->efa_dev, "Failed to destroy qp-%u [%d]\n",
qp_cmd.qp_handle, err);
return 0;
}
int efa_com_create_cq(struct efa_com_dev *edev,
struct efa_com_create_cq_params *params,
struct efa_com_create_cq_result *result)
{
struct efa_admin_create_cq_resp cmd_completion;
struct efa_admin_create_cq_cmd create_cmd = {};
struct efa_com_admin_queue *aq = &edev->aq;
int err;
create_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_CQ;
create_cmd.cq_caps_2 = (params->entry_size_in_bytes / 4) &
EFA_ADMIN_CREATE_CQ_CMD_CQ_ENTRY_SIZE_WORDS_MASK;
create_cmd.cq_depth = params->cq_depth;
create_cmd.num_sub_cqs = params->num_sub_cqs;
create_cmd.uar = params->uarn;
efa_com_set_dma_addr(params->dma_addr,
&create_cmd.cq_ba.mem_addr_high,
&create_cmd.cq_ba.mem_addr_low);
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&create_cmd,
sizeof(create_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err) {
ibdev_err(edev->efa_dev, "Failed to create cq[%d]\n", err);
return err;
}
result->cq_idx = cmd_completion.cq_idx;
result->actual_depth = params->cq_depth;
return err;
}
int efa_com_destroy_cq(struct efa_com_dev *edev,
struct efa_com_destroy_cq_params *params)
{
struct efa_admin_destroy_cq_cmd destroy_cmd = {};
struct efa_admin_destroy_cq_resp destroy_resp;
struct efa_com_admin_queue *aq = &edev->aq;
int err;
destroy_cmd.cq_idx = params->cq_idx;
destroy_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_CQ;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&destroy_cmd,
sizeof(destroy_cmd),
(struct efa_admin_acq_entry *)&destroy_resp,
sizeof(destroy_resp));
if (err)
ibdev_err(edev->efa_dev, "Failed to destroy CQ-%u [%d]\n",
params->cq_idx, err);
return 0;
}
int efa_com_register_mr(struct efa_com_dev *edev,
struct efa_com_reg_mr_params *params,
struct efa_com_reg_mr_result *result)
{
struct efa_admin_reg_mr_resp cmd_completion;
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_reg_mr_cmd mr_cmd = {};
int err;
mr_cmd.aq_common_desc.opcode = EFA_ADMIN_REG_MR;
mr_cmd.pd = params->pd;
mr_cmd.mr_length = params->mr_length_in_bytes;
mr_cmd.flags |= params->page_shift &
EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK;
mr_cmd.iova = params->iova;
mr_cmd.permissions |= params->permissions &
EFA_ADMIN_REG_MR_CMD_LOCAL_WRITE_ENABLE_MASK;
if (params->inline_pbl) {
memcpy(mr_cmd.pbl.inline_pbl_array,
params->pbl.inline_pbl_array,
sizeof(mr_cmd.pbl.inline_pbl_array));
} else {
mr_cmd.pbl.pbl.length = params->pbl.pbl.length;
mr_cmd.pbl.pbl.address.mem_addr_low =
params->pbl.pbl.address.mem_addr_low;
mr_cmd.pbl.pbl.address.mem_addr_high =
params->pbl.pbl.address.mem_addr_high;
mr_cmd.aq_common_desc.flags |=
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_MASK;
if (params->indirect)
mr_cmd.aq_common_desc.flags |=
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK;
}
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&mr_cmd,
sizeof(mr_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err) {
ibdev_err(edev->efa_dev, "Failed to register mr [%d]\n", err);
return err;
}
result->l_key = cmd_completion.l_key;
result->r_key = cmd_completion.r_key;
return 0;
}
int efa_com_dereg_mr(struct efa_com_dev *edev,
struct efa_com_dereg_mr_params *params)
{
struct efa_admin_dereg_mr_resp cmd_completion;
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_dereg_mr_cmd mr_cmd = {};
int err;
mr_cmd.aq_common_desc.opcode = EFA_ADMIN_DEREG_MR;
mr_cmd.l_key = params->l_key;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&mr_cmd,
sizeof(mr_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err)
ibdev_err(edev->efa_dev,
"Failed to de-register mr(lkey-%u) [%d]\n",
mr_cmd.l_key, err);
return 0;
}
int efa_com_create_ah(struct efa_com_dev *edev,
struct efa_com_create_ah_params *params,
struct efa_com_create_ah_result *result)
{
struct efa_admin_create_ah_resp cmd_completion;
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_create_ah_cmd ah_cmd = {};
int err;
ah_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_AH;
memcpy(ah_cmd.dest_addr, params->dest_addr, sizeof(ah_cmd.dest_addr));
ah_cmd.pd = params->pdn;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&ah_cmd,
sizeof(ah_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err) {
ibdev_err(edev->efa_dev, "Failed to create ah [%d]\n", err);
return err;
}
result->ah = cmd_completion.ah;
return 0;
}
int efa_com_destroy_ah(struct efa_com_dev *edev,
struct efa_com_destroy_ah_params *params)
{
struct efa_admin_destroy_ah_resp cmd_completion;
struct efa_admin_destroy_ah_cmd ah_cmd = {};
struct efa_com_admin_queue *aq = &edev->aq;
int err;
ah_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_AH;
ah_cmd.ah = params->ah;
ah_cmd.pd = params->pdn;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&ah_cmd,
sizeof(ah_cmd),
(struct efa_admin_acq_entry *)&cmd_completion,
sizeof(cmd_completion));
if (err)
ibdev_err(edev->efa_dev, "Failed to destroy ah-%d pd-%d [%d]\n",
ah_cmd.ah, ah_cmd.pd, err);
return 0;
}
static bool
efa_com_check_supported_feature_id(struct efa_com_dev *edev,
enum efa_admin_aq_feature_id feature_id)
{
u32 feature_mask = 1 << feature_id;
/* Device attributes is always supported */
if (feature_id != EFA_ADMIN_DEVICE_ATTR &&
!(edev->supported_features & feature_mask))
return false;
return true;
}
static int efa_com_get_feature_ex(struct efa_com_dev *edev,
struct efa_admin_get_feature_resp *get_resp,
enum efa_admin_aq_feature_id feature_id,
dma_addr_t control_buf_dma_addr,
u32 control_buff_size)
{
struct efa_admin_get_feature_cmd get_cmd = {};
struct efa_com_admin_queue *aq;
int err;
if (!efa_com_check_supported_feature_id(edev, feature_id)) {
ibdev_err(edev->efa_dev, "Feature %d isn't supported\n",
feature_id);
return -EOPNOTSUPP;
}
aq = &edev->aq;
get_cmd.aq_common_descriptor.opcode = EFA_ADMIN_GET_FEATURE;
if (control_buff_size)
get_cmd.aq_common_descriptor.flags =
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK;
efa_com_set_dma_addr(control_buf_dma_addr,
&get_cmd.control_buffer.address.mem_addr_high,
&get_cmd.control_buffer.address.mem_addr_low);
get_cmd.control_buffer.length = control_buff_size;
get_cmd.feature_common.feature_id = feature_id;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)
&get_cmd,
sizeof(get_cmd),
(struct efa_admin_acq_entry *)
get_resp,
sizeof(*get_resp));
if (err)
ibdev_err(edev->efa_dev,
"Failed to submit get_feature command %d [%d]\n",
feature_id, err);
return 0;
}
static int efa_com_get_feature(struct efa_com_dev *edev,
struct efa_admin_get_feature_resp *get_resp,
enum efa_admin_aq_feature_id feature_id)
{
return efa_com_get_feature_ex(edev, get_resp, feature_id, 0, 0);
}
int efa_com_get_network_attr(struct efa_com_dev *edev,
struct efa_com_get_network_attr_result *result)
{
struct efa_admin_get_feature_resp resp;
int err;
err = efa_com_get_feature(edev, &resp,
EFA_ADMIN_NETWORK_ATTR);
if (err) {
ibdev_err(edev->efa_dev,
"Failed to get network attributes %d\n", err);
return err;
}
memcpy(result->addr, resp.u.network_attr.addr,
sizeof(resp.u.network_attr.addr));
result->mtu = resp.u.network_attr.mtu;
return 0;
}
int efa_com_get_device_attr(struct efa_com_dev *edev,
struct efa_com_get_device_attr_result *result)
{
struct efa_admin_get_feature_resp resp;
int err;
err = efa_com_get_feature(edev, &resp, EFA_ADMIN_DEVICE_ATTR);
if (err) {
ibdev_err(edev->efa_dev, "Failed to get device attributes %d\n",
err);
return err;
}
result->page_size_cap = resp.u.device_attr.page_size_cap;
result->fw_version = resp.u.device_attr.fw_version;
result->admin_api_version = resp.u.device_attr.admin_api_version;
result->device_version = resp.u.device_attr.device_version;
result->supported_features = resp.u.device_attr.supported_features;
result->phys_addr_width = resp.u.device_attr.phys_addr_width;
result->virt_addr_width = resp.u.device_attr.virt_addr_width;
result->db_bar = resp.u.device_attr.db_bar;
if (result->admin_api_version < 1) {
ibdev_err(edev->efa_dev,
"Failed to get device attr api version [%u < 1]\n",
result->admin_api_version);
return -EINVAL;
}
edev->supported_features = resp.u.device_attr.supported_features;
err = efa_com_get_feature(edev, &resp,
EFA_ADMIN_QUEUE_ATTR);
if (err) {
ibdev_err(edev->efa_dev,
"Failed to get network attributes %d\n", err);
return err;
}
result->max_qp = resp.u.queue_attr.max_qp;
result->max_sq_depth = resp.u.queue_attr.max_sq_depth;
result->max_rq_depth = resp.u.queue_attr.max_rq_depth;
result->max_cq = resp.u.queue_attr.max_cq;
result->max_cq_depth = resp.u.queue_attr.max_cq_depth;
result->inline_buf_size = resp.u.queue_attr.inline_buf_size;
result->max_sq_sge = resp.u.queue_attr.max_wr_send_sges;
result->max_rq_sge = resp.u.queue_attr.max_wr_recv_sges;
result->max_mr = resp.u.queue_attr.max_mr;
result->max_mr_pages = resp.u.queue_attr.max_mr_pages;
result->max_pd = resp.u.queue_attr.max_pd;
result->max_ah = resp.u.queue_attr.max_ah;
result->max_llq_size = resp.u.queue_attr.max_llq_size;
result->sub_cqs_per_cq = resp.u.queue_attr.sub_cqs_per_cq;
return 0;
}
int efa_com_get_hw_hints(struct efa_com_dev *edev,
struct efa_com_get_hw_hints_result *result)
{
struct efa_admin_get_feature_resp resp;
int err;
err = efa_com_get_feature(edev, &resp, EFA_ADMIN_HW_HINTS);
if (err) {
ibdev_err(edev->efa_dev, "Failed to get hw hints %d\n", err);
return err;
}
result->admin_completion_timeout = resp.u.hw_hints.admin_completion_timeout;
result->driver_watchdog_timeout = resp.u.hw_hints.driver_watchdog_timeout;
result->mmio_read_timeout = resp.u.hw_hints.mmio_read_timeout;
result->poll_interval = resp.u.hw_hints.poll_interval;
return 0;
}
static int efa_com_set_feature_ex(struct efa_com_dev *edev,
struct efa_admin_set_feature_resp *set_resp,
struct efa_admin_set_feature_cmd *set_cmd,
enum efa_admin_aq_feature_id feature_id,
dma_addr_t control_buf_dma_addr,
u32 control_buff_size)
{
struct efa_com_admin_queue *aq;
int err;
if (!efa_com_check_supported_feature_id(edev, feature_id)) {
ibdev_err(edev->efa_dev, "Feature %d isn't supported\n",
feature_id);
return -EOPNOTSUPP;
}
aq = &edev->aq;
set_cmd->aq_common_descriptor.opcode = EFA_ADMIN_SET_FEATURE;
if (control_buff_size) {
set_cmd->aq_common_descriptor.flags =
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK;
efa_com_set_dma_addr(control_buf_dma_addr,
&set_cmd->control_buffer.address.mem_addr_high,
&set_cmd->control_buffer.address.mem_addr_low);
}
set_cmd->control_buffer.length = control_buff_size;
set_cmd->feature_common.feature_id = feature_id;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)set_cmd,
sizeof(*set_cmd),
(struct efa_admin_acq_entry *)set_resp,
sizeof(*set_resp));
if (err)
ibdev_err(edev->efa_dev,
"Failed to submit set_feature command %d error: %d\n",
feature_id, err);
return 0;
}
static int efa_com_set_feature(struct efa_com_dev *edev,
struct efa_admin_set_feature_resp *set_resp,
struct efa_admin_set_feature_cmd *set_cmd,
enum efa_admin_aq_feature_id feature_id)
{
return efa_com_set_feature_ex(edev, set_resp, set_cmd, feature_id,
0, 0);
}
int efa_com_set_aenq_config(struct efa_com_dev *edev, u32 groups)
{
struct efa_admin_get_feature_resp get_resp;
struct efa_admin_set_feature_resp set_resp;
struct efa_admin_set_feature_cmd cmd = {};
int err;
ibdev_dbg(edev->efa_dev, "Configuring aenq with groups[%#x]\n", groups);
err = efa_com_get_feature(edev, &get_resp, EFA_ADMIN_AENQ_CONFIG);
if (err) {
ibdev_err(edev->efa_dev, "Failed to get aenq attributes: %d\n",
err);
return err;
}
ibdev_dbg(edev->efa_dev,
"Get aenq groups: supported[%#x] enabled[%#x]\n",
get_resp.u.aenq.supported_groups,
get_resp.u.aenq.enabled_groups);
if ((get_resp.u.aenq.supported_groups & groups) != groups) {
ibdev_err(edev->efa_dev,
"Trying to set unsupported aenq groups[%#x] supported[%#x]\n",
groups, get_resp.u.aenq.supported_groups);
return -EOPNOTSUPP;
}
cmd.u.aenq.enabled_groups = groups;
err = efa_com_set_feature(edev, &set_resp, &cmd,
EFA_ADMIN_AENQ_CONFIG);
if (err) {
ibdev_err(edev->efa_dev, "Failed to set aenq attributes: %d\n",
err);
return err;
}
return 0;
}
int efa_com_alloc_pd(struct efa_com_dev *edev,
struct efa_com_alloc_pd_result *result)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_alloc_pd_cmd cmd = {};
struct efa_admin_alloc_pd_resp resp;
int err;
cmd.aq_common_descriptor.opcode = EFA_ADMIN_ALLOC_PD;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev, "Failed to allocate pd[%d]\n", err);
return err;
}
result->pdn = resp.pd;
return 0;
}
int efa_com_dealloc_pd(struct efa_com_dev *edev,
struct efa_com_dealloc_pd_params *params)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_dealloc_pd_cmd cmd = {};
struct efa_admin_dealloc_pd_resp resp;
int err;
cmd.aq_common_descriptor.opcode = EFA_ADMIN_DEALLOC_PD;
cmd.pd = params->pdn;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev, "Failed to deallocate pd-%u [%d]\n",
cmd.pd, err);
return err;
}
return 0;
}
int efa_com_alloc_uar(struct efa_com_dev *edev,
struct efa_com_alloc_uar_result *result)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_alloc_uar_cmd cmd = {};
struct efa_admin_alloc_uar_resp resp;
int err;
cmd.aq_common_descriptor.opcode = EFA_ADMIN_ALLOC_UAR;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev, "Failed to allocate uar[%d]\n", err);
return err;
}
result->uarn = resp.uar;
return 0;
}
int efa_com_dealloc_uar(struct efa_com_dev *edev,
struct efa_com_dealloc_uar_params *params)
{
struct efa_com_admin_queue *aq = &edev->aq;
struct efa_admin_dealloc_uar_cmd cmd = {};
struct efa_admin_dealloc_uar_resp resp;
int err;
cmd.aq_common_descriptor.opcode = EFA_ADMIN_DEALLOC_UAR;
cmd.uar = params->uarn;
err = efa_com_cmd_exec(aq,
(struct efa_admin_aq_entry *)&cmd,
sizeof(cmd),
(struct efa_admin_acq_entry *)&resp,
sizeof(resp));
if (err) {
ibdev_err(edev->efa_dev, "Failed to deallocate uar-%u [%d]\n",
cmd.uar, err);
return err;
}
return 0;
}

View File

@ -0,0 +1,270 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_COM_CMD_H_
#define _EFA_COM_CMD_H_
#include "efa_com.h"
#define EFA_GID_SIZE 16
struct efa_com_create_qp_params {
u64 rq_base_addr;
u32 send_cq_idx;
u32 recv_cq_idx;
/*
* Send descriptor ring size in bytes,
* sufficient for user-provided number of WQEs and SGL size
*/
u32 sq_ring_size_in_bytes;
/* Max number of WQEs that will be posted on send queue */
u32 sq_depth;
/* Recv descriptor ring size in bytes */
u32 rq_ring_size_in_bytes;
u32 rq_depth;
u16 pd;
u16 uarn;
u8 qp_type;
};
struct efa_com_create_qp_result {
u32 qp_handle;
u32 qp_num;
u32 sq_db_offset;
u32 rq_db_offset;
u32 llq_descriptors_offset;
u16 send_sub_cq_idx;
u16 recv_sub_cq_idx;
};
struct efa_com_modify_qp_params {
u32 modify_mask;
u32 qp_handle;
u32 qp_state;
u32 cur_qp_state;
u32 qkey;
u32 sq_psn;
u8 sq_drained_async_notify;
};
struct efa_com_query_qp_params {
u32 qp_handle;
};
struct efa_com_query_qp_result {
u32 qp_state;
u32 qkey;
u32 sq_draining;
u32 sq_psn;
};
struct efa_com_destroy_qp_params {
u32 qp_handle;
};
struct efa_com_create_cq_params {
/* cq physical base address in OS memory */
dma_addr_t dma_addr;
/* completion queue depth in # of entries */
u16 cq_depth;
u16 num_sub_cqs;
u16 uarn;
u8 entry_size_in_bytes;
};
struct efa_com_create_cq_result {
/* cq identifier */
u16 cq_idx;
/* actual cq depth in # of entries */
u16 actual_depth;
};
struct efa_com_destroy_cq_params {
u16 cq_idx;
};
struct efa_com_create_ah_params {
u16 pdn;
/* Destination address in network byte order */
u8 dest_addr[EFA_GID_SIZE];
};
struct efa_com_create_ah_result {
u16 ah;
};
struct efa_com_destroy_ah_params {
u16 ah;
u16 pdn;
};
struct efa_com_get_network_attr_result {
u8 addr[EFA_GID_SIZE];
u32 mtu;
};
struct efa_com_get_device_attr_result {
u64 page_size_cap;
u64 max_mr_pages;
u32 fw_version;
u32 admin_api_version;
u32 device_version;
u32 supported_features;
u32 phys_addr_width;
u32 virt_addr_width;
u32 max_qp;
u32 max_sq_depth; /* wqes */
u32 max_rq_depth; /* wqes */
u32 max_cq;
u32 max_cq_depth; /* cqes */
u32 inline_buf_size;
u32 max_mr;
u32 max_pd;
u32 max_ah;
u32 max_llq_size;
u16 sub_cqs_per_cq;
u16 max_sq_sge;
u16 max_rq_sge;
u8 db_bar;
};
struct efa_com_get_hw_hints_result {
u16 mmio_read_timeout;
u16 driver_watchdog_timeout;
u16 admin_completion_timeout;
u16 poll_interval;
u32 reserved[4];
};
struct efa_com_mem_addr {
u32 mem_addr_low;
u32 mem_addr_high;
};
/* Used at indirect mode page list chunks for chaining */
struct efa_com_ctrl_buff_info {
/* indicates length of the buffer pointed by control_buffer_address. */
u32 length;
/* points to control buffer (direct or indirect) */
struct efa_com_mem_addr address;
};
struct efa_com_reg_mr_params {
/* Memory region length, in bytes. */
u64 mr_length_in_bytes;
/* IO Virtual Address associated with this MR. */
u64 iova;
/* words 8:15: Physical Buffer List, each element is page-aligned. */
union {
/*
* Inline array of physical addresses of app pages
* (optimization for short region reservations)
*/
u64 inline_pbl_array[4];
/*
* Describes the next physically contiguous chunk of indirect
* page list. A page list contains physical addresses of command
* data pages. Data pages are 4KB; page list chunks are
* variable-sized.
*/
struct efa_com_ctrl_buff_info pbl;
} pbl;
/* number of pages in PBL (redundant, could be calculated) */
u32 page_num;
/* Protection Domain */
u16 pd;
/*
* phys_page_size_shift - page size is (1 << phys_page_size_shift)
* Page size is used for building the Virtual to Physical
* address mapping
*/
u8 page_shift;
/*
* permissions
* 0: local_write_enable - Write permissions: value of 1 needed
* for RQ buffers and for RDMA write:1: reserved1 - remote
* access flags, etc
*/
u8 permissions;
u8 inline_pbl;
u8 indirect;
};
struct efa_com_reg_mr_result {
/*
* To be used in conjunction with local buffers references in SQ and
* RQ WQE
*/
u32 l_key;
/*
* To be used in incoming RDMA semantics messages to refer to remotely
* accessed memory region
*/
u32 r_key;
};
struct efa_com_dereg_mr_params {
u32 l_key;
};
struct efa_com_alloc_pd_result {
u16 pdn;
};
struct efa_com_dealloc_pd_params {
u16 pdn;
};
struct efa_com_alloc_uar_result {
u16 uarn;
};
struct efa_com_dealloc_uar_params {
u16 uarn;
};
void efa_com_set_dma_addr(dma_addr_t addr, u32 *addr_high, u32 *addr_low);
int efa_com_create_qp(struct efa_com_dev *edev,
struct efa_com_create_qp_params *params,
struct efa_com_create_qp_result *res);
int efa_com_modify_qp(struct efa_com_dev *edev,
struct efa_com_modify_qp_params *params);
int efa_com_query_qp(struct efa_com_dev *edev,
struct efa_com_query_qp_params *params,
struct efa_com_query_qp_result *result);
int efa_com_destroy_qp(struct efa_com_dev *edev,
struct efa_com_destroy_qp_params *params);
int efa_com_create_cq(struct efa_com_dev *edev,
struct efa_com_create_cq_params *params,
struct efa_com_create_cq_result *result);
int efa_com_destroy_cq(struct efa_com_dev *edev,
struct efa_com_destroy_cq_params *params);
int efa_com_register_mr(struct efa_com_dev *edev,
struct efa_com_reg_mr_params *params,
struct efa_com_reg_mr_result *result);
int efa_com_dereg_mr(struct efa_com_dev *edev,
struct efa_com_dereg_mr_params *params);
int efa_com_create_ah(struct efa_com_dev *edev,
struct efa_com_create_ah_params *params,
struct efa_com_create_ah_result *result);
int efa_com_destroy_ah(struct efa_com_dev *edev,
struct efa_com_destroy_ah_params *params);
int efa_com_get_network_attr(struct efa_com_dev *edev,
struct efa_com_get_network_attr_result *result);
int efa_com_get_device_attr(struct efa_com_dev *edev,
struct efa_com_get_device_attr_result *result);
int efa_com_get_hw_hints(struct efa_com_dev *edev,
struct efa_com_get_hw_hints_result *result);
int efa_com_set_aenq_config(struct efa_com_dev *edev, u32 groups);
int efa_com_alloc_pd(struct efa_com_dev *edev,
struct efa_com_alloc_pd_result *result);
int efa_com_dealloc_pd(struct efa_com_dev *edev,
struct efa_com_dealloc_pd_params *params);
int efa_com_alloc_uar(struct efa_com_dev *edev,
struct efa_com_alloc_uar_result *result);
int efa_com_dealloc_uar(struct efa_com_dev *edev,
struct efa_com_dealloc_uar_params *params);
#endif /* _EFA_COM_CMD_H_ */

View File

@ -0,0 +1,18 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_COMMON_H_
#define _EFA_COMMON_H_
#define EFA_COMMON_SPEC_VERSION_MAJOR 2
#define EFA_COMMON_SPEC_VERSION_MINOR 0
struct efa_common_mem_addr {
u32 mem_addr_low;
u32 mem_addr_high;
};
#endif /* _EFA_COMMON_H_ */

View File

@ -0,0 +1,533 @@
// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#include <linux/module.h>
#include <linux/pci.h>
#include <rdma/ib_user_verbs.h>
#include "efa.h"
#define PCI_DEV_ID_EFA_VF 0xefa0
static const struct pci_device_id efa_pci_tbl[] = {
{ PCI_VDEVICE(AMAZON, PCI_DEV_ID_EFA_VF) },
{ }
};
MODULE_AUTHOR("Amazon.com, Inc. or its affiliates");
MODULE_LICENSE("Dual BSD/GPL");
MODULE_DESCRIPTION(DEVICE_NAME);
MODULE_DEVICE_TABLE(pci, efa_pci_tbl);
#define EFA_REG_BAR 0
#define EFA_MEM_BAR 2
#define EFA_BASE_BAR_MASK (BIT(EFA_REG_BAR) | BIT(EFA_MEM_BAR))
#define EFA_AENQ_ENABLED_GROUPS \
(BIT(EFA_ADMIN_FATAL_ERROR) | BIT(EFA_ADMIN_WARNING) | \
BIT(EFA_ADMIN_NOTIFICATION) | BIT(EFA_ADMIN_KEEP_ALIVE))
static void efa_update_network_attr(struct efa_dev *dev,
struct efa_com_get_network_attr_result *network_attr)
{
memcpy(dev->addr, network_attr->addr, sizeof(network_attr->addr));
dev->mtu = network_attr->mtu;
dev_dbg(&dev->pdev->dev, "Full address %pI6\n", dev->addr);
}
/* This handler will called for unknown event group or unimplemented handlers */
static void unimplemented_aenq_handler(void *data,
struct efa_admin_aenq_entry *aenq_e)
{
struct efa_dev *dev = (struct efa_dev *)data;
ibdev_err(&dev->ibdev,
"Unknown event was received or event with unimplemented handler\n");
}
static void efa_keep_alive(void *data, struct efa_admin_aenq_entry *aenq_e)
{
struct efa_dev *dev = (struct efa_dev *)data;
atomic64_inc(&dev->stats.keep_alive_rcvd);
}
static struct efa_aenq_handlers aenq_handlers = {
.handlers = {
[EFA_ADMIN_KEEP_ALIVE] = efa_keep_alive,
},
.unimplemented_handler = unimplemented_aenq_handler
};
static void efa_release_bars(struct efa_dev *dev, int bars_mask)
{
struct pci_dev *pdev = dev->pdev;
int release_bars;
release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & bars_mask;
pci_release_selected_regions(pdev, release_bars);
}
static irqreturn_t efa_intr_msix_mgmnt(int irq, void *data)
{
struct efa_dev *dev = data;
efa_com_admin_q_comp_intr_handler(&dev->edev);
efa_com_aenq_intr_handler(&dev->edev, data);
return IRQ_HANDLED;
}
static int efa_request_mgmnt_irq(struct efa_dev *dev)
{
struct efa_irq *irq;
int err;
irq = &dev->admin_irq;
err = request_irq(irq->vector, irq->handler, 0, irq->name,
irq->data);
if (err) {
dev_err(&dev->pdev->dev, "Failed to request admin irq (%d)\n",
err);
return err;
}
dev_dbg(&dev->pdev->dev, "Set affinity hint of mgmnt irq to %*pbl (irq vector: %d)\n",
nr_cpumask_bits, &irq->affinity_hint_mask, irq->vector);
irq_set_affinity_hint(irq->vector, &irq->affinity_hint_mask);
return err;
}
static void efa_setup_mgmnt_irq(struct efa_dev *dev)
{
u32 cpu;
snprintf(dev->admin_irq.name, EFA_IRQNAME_SIZE,
"efa-mgmnt@pci:%s", pci_name(dev->pdev));
dev->admin_irq.handler = efa_intr_msix_mgmnt;
dev->admin_irq.data = dev;
dev->admin_irq.vector =
pci_irq_vector(dev->pdev, dev->admin_msix_vector_idx);
cpu = cpumask_first(cpu_online_mask);
dev->admin_irq.cpu = cpu;
cpumask_set_cpu(cpu,
&dev->admin_irq.affinity_hint_mask);
dev_info(&dev->pdev->dev, "Setup irq:0x%p vector:%d name:%s\n",
&dev->admin_irq,
dev->admin_irq.vector,
dev->admin_irq.name);
}
static void efa_free_mgmnt_irq(struct efa_dev *dev)
{
struct efa_irq *irq;
irq = &dev->admin_irq;
irq_set_affinity_hint(irq->vector, NULL);
free_irq(irq->vector, irq->data);
}
static int efa_set_mgmnt_irq(struct efa_dev *dev)
{
efa_setup_mgmnt_irq(dev);
return efa_request_mgmnt_irq(dev);
}
static int efa_request_doorbell_bar(struct efa_dev *dev)
{
u8 db_bar_idx = dev->dev_attr.db_bar;
struct pci_dev *pdev = dev->pdev;
int bars;
int err;
if (!(BIT(db_bar_idx) & EFA_BASE_BAR_MASK)) {
bars = pci_select_bars(pdev, IORESOURCE_MEM) & BIT(db_bar_idx);
err = pci_request_selected_regions(pdev, bars, DRV_MODULE_NAME);
if (err) {
dev_err(&dev->pdev->dev,
"pci_request_selected_regions for bar %d failed %d\n",
db_bar_idx, err);
return err;
}
}
dev->db_bar_addr = pci_resource_start(dev->pdev, db_bar_idx);
dev->db_bar_len = pci_resource_len(dev->pdev, db_bar_idx);
return 0;
}
static void efa_release_doorbell_bar(struct efa_dev *dev)
{
if (!(BIT(dev->dev_attr.db_bar) & EFA_BASE_BAR_MASK))
efa_release_bars(dev, BIT(dev->dev_attr.db_bar));
}
static void efa_update_hw_hints(struct efa_dev *dev,
struct efa_com_get_hw_hints_result *hw_hints)
{
struct efa_com_dev *edev = &dev->edev;
if (hw_hints->mmio_read_timeout)
edev->mmio_read.mmio_read_timeout =
hw_hints->mmio_read_timeout * 1000;
if (hw_hints->poll_interval)
edev->aq.poll_interval = hw_hints->poll_interval;
if (hw_hints->admin_completion_timeout)
edev->aq.completion_timeout =
hw_hints->admin_completion_timeout;
}
static void efa_stats_init(struct efa_dev *dev)
{
atomic64_t *s = (atomic64_t *)&dev->stats;
int i;
for (i = 0; i < sizeof(dev->stats) / sizeof(*s); i++, s++)
atomic64_set(s, 0);
}
static const struct ib_device_ops efa_dev_ops = {
.alloc_pd = efa_alloc_pd,
.alloc_ucontext = efa_alloc_ucontext,
.create_ah = efa_create_ah,
.create_cq = efa_create_cq,
.create_qp = efa_create_qp,
.dealloc_pd = efa_dealloc_pd,
.dealloc_ucontext = efa_dealloc_ucontext,
.dereg_mr = efa_dereg_mr,
.destroy_ah = efa_destroy_ah,
.destroy_cq = efa_destroy_cq,
.destroy_qp = efa_destroy_qp,
.get_link_layer = efa_port_link_layer,
.get_port_immutable = efa_get_port_immutable,
.mmap = efa_mmap,
.modify_qp = efa_modify_qp,
.query_device = efa_query_device,
.query_gid = efa_query_gid,
.query_pkey = efa_query_pkey,
.query_port = efa_query_port,
.query_qp = efa_query_qp,
.reg_user_mr = efa_reg_mr,
INIT_RDMA_OBJ_SIZE(ib_ah, efa_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_pd, efa_pd, ibpd),
INIT_RDMA_OBJ_SIZE(ib_ucontext, efa_ucontext, ibucontext),
};
static int efa_ib_device_add(struct efa_dev *dev)
{
struct efa_com_get_network_attr_result network_attr;
struct efa_com_get_hw_hints_result hw_hints;
struct pci_dev *pdev = dev->pdev;
int err;
efa_stats_init(dev);
err = efa_com_get_device_attr(&dev->edev, &dev->dev_attr);
if (err)
return err;
dev_dbg(&dev->pdev->dev, "Doorbells bar (%d)\n", dev->dev_attr.db_bar);
err = efa_request_doorbell_bar(dev);
if (err)
return err;
err = efa_com_get_network_attr(&dev->edev, &network_attr);
if (err)
goto err_release_doorbell_bar;
efa_update_network_attr(dev, &network_attr);
err = efa_com_get_hw_hints(&dev->edev, &hw_hints);
if (err)
goto err_release_doorbell_bar;
efa_update_hw_hints(dev, &hw_hints);
/* Try to enable all the available aenq groups */
err = efa_com_set_aenq_config(&dev->edev, EFA_AENQ_ENABLED_GROUPS);
if (err)
goto err_release_doorbell_bar;
dev->ibdev.owner = THIS_MODULE;
dev->ibdev.node_type = RDMA_NODE_UNSPECIFIED;
dev->ibdev.phys_port_cnt = 1;
dev->ibdev.num_comp_vectors = 1;
dev->ibdev.dev.parent = &pdev->dev;
dev->ibdev.uverbs_abi_ver = EFA_UVERBS_ABI_VERSION;
dev->ibdev.uverbs_cmd_mask =
(1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) |
(1ull << IB_USER_VERBS_CMD_QUERY_PORT) |
(1ull << IB_USER_VERBS_CMD_ALLOC_PD) |
(1ull << IB_USER_VERBS_CMD_DEALLOC_PD) |
(1ull << IB_USER_VERBS_CMD_REG_MR) |
(1ull << IB_USER_VERBS_CMD_DEREG_MR) |
(1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) |
(1ull << IB_USER_VERBS_CMD_CREATE_CQ) |
(1ull << IB_USER_VERBS_CMD_DESTROY_CQ) |
(1ull << IB_USER_VERBS_CMD_CREATE_QP) |
(1ull << IB_USER_VERBS_CMD_MODIFY_QP) |
(1ull << IB_USER_VERBS_CMD_QUERY_QP) |
(1ull << IB_USER_VERBS_CMD_DESTROY_QP) |
(1ull << IB_USER_VERBS_CMD_CREATE_AH) |
(1ull << IB_USER_VERBS_CMD_DESTROY_AH);
dev->ibdev.uverbs_ex_cmd_mask =
(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE);
dev->ibdev.driver_id = RDMA_DRIVER_EFA;
ib_set_device_ops(&dev->ibdev, &efa_dev_ops);
err = ib_register_device(&dev->ibdev, "efa_%d");
if (err)
goto err_release_doorbell_bar;
ibdev_info(&dev->ibdev, "IB device registered\n");
return 0;
err_release_doorbell_bar:
efa_release_doorbell_bar(dev);
return err;
}
static void efa_ib_device_remove(struct efa_dev *dev)
{
efa_com_dev_reset(&dev->edev, EFA_REGS_RESET_NORMAL);
ibdev_info(&dev->ibdev, "Unregister ib device\n");
ib_unregister_device(&dev->ibdev);
efa_release_doorbell_bar(dev);
}
static void efa_disable_msix(struct efa_dev *dev)
{
pci_free_irq_vectors(dev->pdev);
}
static int efa_enable_msix(struct efa_dev *dev)
{
int msix_vecs, irq_num;
/* Reserve the max msix vectors we might need */
msix_vecs = EFA_NUM_MSIX_VEC;
dev_dbg(&dev->pdev->dev, "Trying to enable MSI-X, vectors %d\n",
msix_vecs);
dev->admin_msix_vector_idx = EFA_MGMNT_MSIX_VEC_IDX;
irq_num = pci_alloc_irq_vectors(dev->pdev, msix_vecs,
msix_vecs, PCI_IRQ_MSIX);
if (irq_num < 0) {
dev_err(&dev->pdev->dev, "Failed to enable MSI-X. irq_num %d\n",
irq_num);
return -ENOSPC;
}
if (irq_num != msix_vecs) {
dev_err(&dev->pdev->dev,
"Allocated %d MSI-X (out of %d requested)\n",
irq_num, msix_vecs);
return -ENOSPC;
}
return 0;
}
static int efa_device_init(struct efa_com_dev *edev, struct pci_dev *pdev)
{
int dma_width;
int err;
err = efa_com_dev_reset(edev, EFA_REGS_RESET_NORMAL);
if (err)
return err;
err = efa_com_validate_version(edev);
if (err)
return err;
dma_width = efa_com_get_dma_width(edev);
if (dma_width < 0) {
err = dma_width;
return err;
}
err = pci_set_dma_mask(pdev, DMA_BIT_MASK(dma_width));
if (err) {
dev_err(&pdev->dev, "pci_set_dma_mask failed %d\n", err);
return err;
}
err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(dma_width));
if (err) {
dev_err(&pdev->dev,
"err_pci_set_consistent_dma_mask failed %d\n",
err);
return err;
}
return 0;
}
static struct efa_dev *efa_probe_device(struct pci_dev *pdev)
{
struct efa_com_dev *edev;
struct efa_dev *dev;
int bars;
int err;
err = pci_enable_device_mem(pdev);
if (err) {
dev_err(&pdev->dev, "pci_enable_device_mem() failed!\n");
return ERR_PTR(err);
}
pci_set_master(pdev);
dev = ib_alloc_device(efa_dev, ibdev);
if (!dev) {
dev_err(&pdev->dev, "Device alloc failed\n");
err = -ENOMEM;
goto err_disable_device;
}
pci_set_drvdata(pdev, dev);
edev = &dev->edev;
edev->efa_dev = dev;
edev->dmadev = &pdev->dev;
dev->pdev = pdev;
bars = pci_select_bars(pdev, IORESOURCE_MEM) & EFA_BASE_BAR_MASK;
err = pci_request_selected_regions(pdev, bars, DRV_MODULE_NAME);
if (err) {
dev_err(&pdev->dev, "pci_request_selected_regions failed %d\n",
err);
goto err_ibdev_destroy;
}
dev->reg_bar_addr = pci_resource_start(pdev, EFA_REG_BAR);
dev->reg_bar_len = pci_resource_len(pdev, EFA_REG_BAR);
dev->mem_bar_addr = pci_resource_start(pdev, EFA_MEM_BAR);
dev->mem_bar_len = pci_resource_len(pdev, EFA_MEM_BAR);
edev->reg_bar = devm_ioremap(&pdev->dev,
dev->reg_bar_addr,
dev->reg_bar_len);
if (!edev->reg_bar) {
dev_err(&pdev->dev, "Failed to remap register bar\n");
err = -EFAULT;
goto err_release_bars;
}
err = efa_com_mmio_reg_read_init(edev);
if (err) {
dev_err(&pdev->dev, "Failed to init readless MMIO\n");
goto err_iounmap;
}
err = efa_device_init(edev, pdev);
if (err) {
dev_err(&pdev->dev, "EFA device init failed\n");
if (err == -ETIME)
err = -EPROBE_DEFER;
goto err_reg_read_destroy;
}
err = efa_enable_msix(dev);
if (err)
goto err_reg_read_destroy;
edev->aq.msix_vector_idx = dev->admin_msix_vector_idx;
edev->aenq.msix_vector_idx = dev->admin_msix_vector_idx;
err = efa_set_mgmnt_irq(dev);
if (err)
goto err_disable_msix;
err = efa_com_admin_init(edev, &aenq_handlers);
if (err)
goto err_free_mgmnt_irq;
return dev;
err_free_mgmnt_irq:
efa_free_mgmnt_irq(dev);
err_disable_msix:
efa_disable_msix(dev);
err_reg_read_destroy:
efa_com_mmio_reg_read_destroy(edev);
err_iounmap:
devm_iounmap(&pdev->dev, edev->reg_bar);
err_release_bars:
efa_release_bars(dev, EFA_BASE_BAR_MASK);
err_ibdev_destroy:
ib_dealloc_device(&dev->ibdev);
err_disable_device:
pci_disable_device(pdev);
return ERR_PTR(err);
}
static void efa_remove_device(struct pci_dev *pdev)
{
struct efa_dev *dev = pci_get_drvdata(pdev);
struct efa_com_dev *edev;
edev = &dev->edev;
efa_com_admin_destroy(edev);
efa_free_mgmnt_irq(dev);
efa_disable_msix(dev);
efa_com_mmio_reg_read_destroy(edev);
devm_iounmap(&pdev->dev, edev->reg_bar);
efa_release_bars(dev, EFA_BASE_BAR_MASK);
ib_dealloc_device(&dev->ibdev);
pci_disable_device(pdev);
}
static int efa_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
struct efa_dev *dev;
int err;
dev = efa_probe_device(pdev);
if (IS_ERR(dev))
return PTR_ERR(dev);
err = efa_ib_device_add(dev);
if (err)
goto err_remove_device;
return 0;
err_remove_device:
efa_remove_device(pdev);
return err;
}
static void efa_remove(struct pci_dev *pdev)
{
struct efa_dev *dev = pci_get_drvdata(pdev);
efa_ib_device_remove(dev);
efa_remove_device(pdev);
}
static struct pci_driver efa_pci_driver = {
.name = DRV_MODULE_NAME,
.id_table = efa_pci_tbl,
.probe = efa_probe,
.remove = efa_remove,
};
module_pci_driver(efa_pci_driver);

View File

@ -0,0 +1,113 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
*/
#ifndef _EFA_REGS_H_
#define _EFA_REGS_H_
enum efa_regs_reset_reason_types {
EFA_REGS_RESET_NORMAL = 0,
/* Keep alive timeout */
EFA_REGS_RESET_KEEP_ALIVE_TO = 1,
EFA_REGS_RESET_ADMIN_TO = 2,
EFA_REGS_RESET_INIT_ERR = 3,
EFA_REGS_RESET_DRIVER_INVALID_STATE = 4,
EFA_REGS_RESET_OS_TRIGGER = 5,
EFA_REGS_RESET_SHUTDOWN = 6,
EFA_REGS_RESET_USER_TRIGGER = 7,
EFA_REGS_RESET_GENERIC = 8,
};
/* efa_registers offsets */
/* 0 base */
#define EFA_REGS_VERSION_OFF 0x0
#define EFA_REGS_CONTROLLER_VERSION_OFF 0x4
#define EFA_REGS_CAPS_OFF 0x8
#define EFA_REGS_AQ_BASE_LO_OFF 0x10
#define EFA_REGS_AQ_BASE_HI_OFF 0x14
#define EFA_REGS_AQ_CAPS_OFF 0x18
#define EFA_REGS_ACQ_BASE_LO_OFF 0x20
#define EFA_REGS_ACQ_BASE_HI_OFF 0x24
#define EFA_REGS_ACQ_CAPS_OFF 0x28
#define EFA_REGS_AQ_PROD_DB_OFF 0x2c
#define EFA_REGS_AENQ_CAPS_OFF 0x34
#define EFA_REGS_AENQ_BASE_LO_OFF 0x38
#define EFA_REGS_AENQ_BASE_HI_OFF 0x3c
#define EFA_REGS_AENQ_CONS_DB_OFF 0x40
#define EFA_REGS_INTR_MASK_OFF 0x4c
#define EFA_REGS_DEV_CTL_OFF 0x54
#define EFA_REGS_DEV_STS_OFF 0x58
#define EFA_REGS_MMIO_REG_READ_OFF 0x5c
#define EFA_REGS_MMIO_RESP_LO_OFF 0x60
#define EFA_REGS_MMIO_RESP_HI_OFF 0x64
/* version register */
#define EFA_REGS_VERSION_MINOR_VERSION_MASK 0xff
#define EFA_REGS_VERSION_MAJOR_VERSION_SHIFT 8
#define EFA_REGS_VERSION_MAJOR_VERSION_MASK 0xff00
/* controller_version register */
#define EFA_REGS_CONTROLLER_VERSION_SUBMINOR_VERSION_MASK 0xff
#define EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_SHIFT 8
#define EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_MASK 0xff00
#define EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT 16
#define EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_MASK 0xff0000
#define EFA_REGS_CONTROLLER_VERSION_IMPL_ID_SHIFT 24
#define EFA_REGS_CONTROLLER_VERSION_IMPL_ID_MASK 0xff000000
/* caps register */
#define EFA_REGS_CAPS_CONTIGUOUS_QUEUE_REQUIRED_MASK 0x1
#define EFA_REGS_CAPS_RESET_TIMEOUT_SHIFT 1
#define EFA_REGS_CAPS_RESET_TIMEOUT_MASK 0x3e
#define EFA_REGS_CAPS_DMA_ADDR_WIDTH_SHIFT 8
#define EFA_REGS_CAPS_DMA_ADDR_WIDTH_MASK 0xff00
#define EFA_REGS_CAPS_ADMIN_CMD_TO_SHIFT 16
#define EFA_REGS_CAPS_ADMIN_CMD_TO_MASK 0xf0000
/* aq_caps register */
#define EFA_REGS_AQ_CAPS_AQ_DEPTH_MASK 0xffff
#define EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_SHIFT 16
#define EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_MASK 0xffff0000
/* acq_caps register */
#define EFA_REGS_ACQ_CAPS_ACQ_DEPTH_MASK 0xffff
#define EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_SHIFT 16
#define EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_MASK 0xff0000
#define EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_SHIFT 24
#define EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_MASK 0xff000000
/* aenq_caps register */
#define EFA_REGS_AENQ_CAPS_AENQ_DEPTH_MASK 0xffff
#define EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_SHIFT 16
#define EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_MASK 0xff0000
#define EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_SHIFT 24
#define EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_MASK 0xff000000
/* dev_ctl register */
#define EFA_REGS_DEV_CTL_DEV_RESET_MASK 0x1
#define EFA_REGS_DEV_CTL_AQ_RESTART_SHIFT 1
#define EFA_REGS_DEV_CTL_AQ_RESTART_MASK 0x2
#define EFA_REGS_DEV_CTL_RESET_REASON_SHIFT 28
#define EFA_REGS_DEV_CTL_RESET_REASON_MASK 0xf0000000
/* dev_sts register */
#define EFA_REGS_DEV_STS_READY_MASK 0x1
#define EFA_REGS_DEV_STS_AQ_RESTART_IN_PROGRESS_SHIFT 1
#define EFA_REGS_DEV_STS_AQ_RESTART_IN_PROGRESS_MASK 0x2
#define EFA_REGS_DEV_STS_AQ_RESTART_FINISHED_SHIFT 2
#define EFA_REGS_DEV_STS_AQ_RESTART_FINISHED_MASK 0x4
#define EFA_REGS_DEV_STS_RESET_IN_PROGRESS_SHIFT 3
#define EFA_REGS_DEV_STS_RESET_IN_PROGRESS_MASK 0x8
#define EFA_REGS_DEV_STS_RESET_FINISHED_SHIFT 4
#define EFA_REGS_DEV_STS_RESET_FINISHED_MASK 0x10
#define EFA_REGS_DEV_STS_FATAL_ERROR_SHIFT 5
#define EFA_REGS_DEV_STS_FATAL_ERROR_MASK 0x20
/* mmio_reg_read register */
#define EFA_REGS_MMIO_REG_READ_REQ_ID_MASK 0xffff
#define EFA_REGS_MMIO_REG_READ_REG_OFF_SHIFT 16
#define EFA_REGS_MMIO_REG_READ_REG_OFF_MASK 0xffff0000
#endif /* _EFA_REGS_H_ */

File diff suppressed because it is too large Load Diff

View File

@ -4104,6 +4104,9 @@ def_access_ibp_counter(seq_naks);
static struct cntr_entry dev_cntrs[DEV_CNTR_LAST] = {
[C_RCV_OVF] = RXE32_DEV_CNTR_ELEM(RcvOverflow, RCV_BUF_OVFL_CNT, CNTR_SYNTH),
[C_RX_LEN_ERR] = RXE32_DEV_CNTR_ELEM(RxLenErr, RCV_LENGTH_ERR_CNT, CNTR_SYNTH),
[C_RX_ICRC_ERR] = RXE32_DEV_CNTR_ELEM(RxICrcErr, RCV_ICRC_ERR_CNT, CNTR_SYNTH),
[C_RX_EBP] = RXE32_DEV_CNTR_ELEM(RxEbpCnt, RCV_EBP_CNT, CNTR_SYNTH),
[C_RX_TID_FULL] = RXE32_DEV_CNTR_ELEM(RxTIDFullEr, RCV_TID_FULL_ERR_CNT,
CNTR_NORMAL),
[C_RX_TID_INVALID] = RXE32_DEV_CNTR_ELEM(RxTIDInvalid, RCV_TID_VALID_ERR_CNT,
@ -13294,15 +13297,18 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
/*
* The RMT entries are currently allocated as shown below:
* 1. QOS (0 to 128 entries);
* 2. FECN for PSM (num_user_contexts + num_vnic_contexts);
* 2. FECN (num_kernel_context - 1 + num_user_contexts +
* num_vnic_contexts);
* 3. VNIC (num_vnic_contexts).
* It should be noted that PSM FECN oversubscribe num_vnic_contexts
* It should be noted that FECN oversubscribe num_vnic_contexts
* entries of RMT because both VNIC and PSM could allocate any receive
* context between dd->first_dyn_alloc_text and dd->num_rcv_contexts,
* and PSM FECN must reserve an RMT entry for each possible PSM receive
* context.
*/
rmt_count = qos_rmt_entries(dd, NULL, NULL) + (num_vnic_contexts * 2);
if (HFI1_CAP_IS_KSET(TID_RDMA))
rmt_count += num_kernel_contexts - 1;
if (rmt_count + n_usr_ctxts > NUM_MAP_ENTRIES) {
user_rmt_reduced = NUM_MAP_ENTRIES - rmt_count;
dd_dev_err(dd,
@ -14285,37 +14291,43 @@ bail:
init_qpmap_table(dd, FIRST_KERNEL_KCTXT, dd->n_krcv_queues - 1);
}
static void init_user_fecn_handling(struct hfi1_devdata *dd,
struct rsm_map_table *rmt)
static void init_fecn_handling(struct hfi1_devdata *dd,
struct rsm_map_table *rmt)
{
struct rsm_rule_data rrd;
u64 reg;
int i, idx, regoff, regidx;
int i, idx, regoff, regidx, start;
u8 offset;
u32 total_cnt;
if (HFI1_CAP_IS_KSET(TID_RDMA))
/* Exclude context 0 */
start = 1;
else
start = dd->first_dyn_alloc_ctxt;
total_cnt = dd->num_rcv_contexts - start;
/* there needs to be enough room in the map table */
total_cnt = dd->num_rcv_contexts - dd->first_dyn_alloc_ctxt;
if (rmt->used + total_cnt >= NUM_MAP_ENTRIES) {
dd_dev_err(dd, "User FECN handling disabled - too many user contexts allocated\n");
dd_dev_err(dd, "FECN handling disabled - too many contexts allocated\n");
return;
}
/*
* RSM will extract the destination context as an index into the
* map table. The destination contexts are a sequential block
* in the range first_dyn_alloc_ctxt...num_rcv_contexts-1 (inclusive).
* in the range start...num_rcv_contexts-1 (inclusive).
* Map entries are accessed as offset + extracted value. Adjust
* the added offset so this sequence can be placed anywhere in
* the table - as long as the entries themselves do not wrap.
* There are only enough bits in offset for the table size, so
* start with that to allow for a "negative" offset.
*/
offset = (u8)(NUM_MAP_ENTRIES + (int)rmt->used -
(int)dd->first_dyn_alloc_ctxt);
offset = (u8)(NUM_MAP_ENTRIES + rmt->used - start);
for (i = dd->first_dyn_alloc_ctxt, idx = rmt->used;
i < dd->num_rcv_contexts; i++, idx++) {
for (i = start, idx = rmt->used; i < dd->num_rcv_contexts;
i++, idx++) {
/* replace with identity mapping */
regoff = (idx % 8) * 8;
regidx = idx / 8;
@ -14437,7 +14449,7 @@ static void init_rxe(struct hfi1_devdata *dd)
rmt = alloc_rsm_map_table(dd);
/* set up QOS, including the QPN map table */
init_qos(dd, rmt);
init_user_fecn_handling(dd, rmt);
init_fecn_handling(dd, rmt);
complete_rsm_map_table(dd, rmt);
/* record number of used rsm map entries for vnic */
dd->vnic.rmt_start = rmt->used;
@ -14663,8 +14675,8 @@ void hfi1_start_cleanup(struct hfi1_devdata *dd)
*/
static int init_asic_data(struct hfi1_devdata *dd)
{
unsigned long flags;
struct hfi1_devdata *tmp, *peer = NULL;
unsigned long index;
struct hfi1_devdata *peer;
struct hfi1_asic_data *asic_data;
int ret = 0;
@ -14673,14 +14685,12 @@ static int init_asic_data(struct hfi1_devdata *dd)
if (!asic_data)
return -ENOMEM;
spin_lock_irqsave(&hfi1_devs_lock, flags);
xa_lock_irq(&hfi1_dev_table);
/* Find our peer device */
list_for_each_entry(tmp, &hfi1_dev_list, list) {
if ((HFI_BASE_GUID(dd) == HFI_BASE_GUID(tmp)) &&
dd->unit != tmp->unit) {
peer = tmp;
xa_for_each(&hfi1_dev_table, index, peer) {
if ((HFI_BASE_GUID(dd) == HFI_BASE_GUID(peer)) &&
dd->unit != peer->unit)
break;
}
}
if (peer) {
@ -14692,7 +14702,7 @@ static int init_asic_data(struct hfi1_devdata *dd)
mutex_init(&dd->asic_data->asic_resource_mutex);
}
dd->asic_data->dds[dd->hfi1_id] = dd; /* self back-pointer */
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
xa_unlock_irq(&hfi1_dev_table);
/* first one through - set up i2c devices */
if (!peer)

View File

@ -858,6 +858,9 @@ static inline int idx_from_vl(int vl)
/* Per device counter indexes */
enum {
C_RCV_OVF = 0,
C_RX_LEN_ERR,
C_RX_ICRC_ERR,
C_RX_EBP,
C_RX_TID_FULL,
C_RX_TID_INVALID,
C_RX_TID_FLGMS,

View File

@ -380,6 +380,9 @@
#define DC_LCB_PRF_TX_FLIT_CNT (DC_LCB_CSRS + 0x000000000418)
#define DC_LCB_STS_LINK_TRANSFER_ACTIVE (DC_LCB_CSRS + 0x000000000468)
#define DC_LCB_STS_ROUND_TRIP_LTP_CNT (DC_LCB_CSRS + 0x0000000004B0)
#define RCV_LENGTH_ERR_CNT 0
#define RCV_ICRC_ERR_CNT 6
#define RCV_EBP_CNT 9
#define RCV_BUF_OVFL_CNT 10
#define RCV_CONTEXT_EGR_STALL 22
#define RCV_DATA_PKT_CNT 0

View File

@ -286,7 +286,7 @@ struct diag_pkt {
#define RHF_TID_ERR (0x1ull << 59)
#define RHF_LEN_ERR (0x1ull << 60)
#define RHF_ECC_ERR (0x1ull << 61)
#define RHF_VCRC_ERR (0x1ull << 62)
#define RHF_RESERVED (0x1ull << 62)
#define RHF_ICRC_ERR (0x1ull << 63)
#define RHF_ERROR_SMASK 0xffe0000000000000ull /* bits 63:53 */

View File

@ -1080,6 +1080,77 @@ static int qsfp2_debugfs_release(struct inode *in, struct file *fp)
return __qsfp_debugfs_release(in, fp, 1);
}
#define EXPROM_WRITE_ENABLE BIT_ULL(14)
static bool exprom_wp_disabled;
static int exprom_wp_set(struct hfi1_devdata *dd, bool disable)
{
u64 gpio_val = 0;
if (disable) {
gpio_val = EXPROM_WRITE_ENABLE;
exprom_wp_disabled = true;
dd_dev_info(dd, "Disable Expansion ROM Write Protection\n");
} else {
exprom_wp_disabled = false;
dd_dev_info(dd, "Enable Expansion ROM Write Protection\n");
}
write_csr(dd, ASIC_GPIO_OUT, gpio_val);
write_csr(dd, ASIC_GPIO_OE, gpio_val);
return 0;
}
static ssize_t exprom_wp_debugfs_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
return 0;
}
static ssize_t exprom_wp_debugfs_write(struct file *file,
const char __user *buf, size_t count,
loff_t *ppos)
{
struct hfi1_pportdata *ppd = private2ppd(file);
char cdata;
if (count != 1)
return -EINVAL;
if (get_user(cdata, buf))
return -EFAULT;
if (cdata == '0')
exprom_wp_set(ppd->dd, false);
else if (cdata == '1')
exprom_wp_set(ppd->dd, true);
else
return -EINVAL;
return 1;
}
static unsigned long exprom_in_use;
static int exprom_wp_debugfs_open(struct inode *in, struct file *fp)
{
if (test_and_set_bit(0, &exprom_in_use))
return -EBUSY;
return 0;
}
static int exprom_wp_debugfs_release(struct inode *in, struct file *fp)
{
struct hfi1_pportdata *ppd = private2ppd(fp);
if (exprom_wp_disabled)
exprom_wp_set(ppd->dd, false);
clear_bit(0, &exprom_in_use);
return 0;
}
#define DEBUGFS_OPS(nm, readroutine, writeroutine) \
{ \
.name = nm, \
@ -1119,6 +1190,9 @@ static const struct counter_info port_cntr_ops[] = {
qsfp1_debugfs_open, qsfp1_debugfs_release),
DEBUGFS_XOPS("qsfp2", qsfp2_debugfs_read, qsfp2_debugfs_write,
qsfp2_debugfs_open, qsfp2_debugfs_release),
DEBUGFS_XOPS("exprom_wp", exprom_wp_debugfs_read,
exprom_wp_debugfs_write, exprom_wp_debugfs_open,
exprom_wp_debugfs_release),
DEBUGFS_OPS("asic_flags", asic_flags_read, asic_flags_write),
DEBUGFS_OPS("dc8051_memory", dc8051_memory_read, NULL),
DEBUGFS_OPS("lcb", debugfs_lcb_read, debugfs_lcb_write),
@ -1302,15 +1376,15 @@ static void _driver_stats_seq_stop(struct seq_file *s, void *v)
static u64 hfi1_sps_ints(void)
{
unsigned long flags;
unsigned long index, flags;
struct hfi1_devdata *dd;
u64 sps_ints = 0;
spin_lock_irqsave(&hfi1_devs_lock, flags);
list_for_each_entry(dd, &hfi1_dev_list, list) {
xa_lock_irqsave(&hfi1_dev_table, flags);
xa_for_each(&hfi1_dev_table, index, dd) {
sps_ints += get_all_cpu_total(dd->int_counter);
}
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
xa_unlock_irqrestore(&hfi1_dev_table, flags);
return sps_ints;
}

View File

@ -72,8 +72,6 @@
*/
const char ib_hfi1_version[] = HFI1_DRIVER_VERSION "\n";
DEFINE_SPINLOCK(hfi1_devs_lock);
LIST_HEAD(hfi1_dev_list);
DEFINE_MUTEX(hfi1_mutex); /* general driver use */
unsigned int hfi1_max_mtu = HFI1_DEFAULT_MAX_MTU;
@ -175,11 +173,11 @@ int hfi1_count_active_units(void)
{
struct hfi1_devdata *dd;
struct hfi1_pportdata *ppd;
unsigned long flags;
unsigned long index, flags;
int pidx, nunits_active = 0;
spin_lock_irqsave(&hfi1_devs_lock, flags);
list_for_each_entry(dd, &hfi1_dev_list, list) {
xa_lock_irqsave(&hfi1_dev_table, flags);
xa_for_each(&hfi1_dev_table, index, dd) {
if (!(dd->flags & HFI1_PRESENT) || !dd->kregbase1)
continue;
for (pidx = 0; pidx < dd->num_pports; ++pidx) {
@ -190,7 +188,7 @@ int hfi1_count_active_units(void)
}
}
}
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
xa_unlock_irqrestore(&hfi1_dev_table, flags);
return nunits_active;
}
@ -264,7 +262,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
hfi1_dbg_fault_suppress_err(verbs_dev))
return;
if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
if (packet->rhf & RHF_ICRC_ERR)
return;
if (packet->etype == RHF_RCV_TYPE_BYPASS) {
@ -516,7 +514,9 @@ bool hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
*/
do_cnp = prescan ||
(opcode >= IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST &&
opcode <= IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE);
opcode <= IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE) ||
opcode == TID_OP(READ_RESP) ||
opcode == TID_OP(ACK);
/* Call appropriate CNP handler */
if (!ignore_fecn && do_cnp && fecn)
@ -1581,7 +1581,7 @@ static void show_eflags_errs(struct hfi1_packet *packet)
u32 rte = rhf_rcv_type_err(packet->rhf);
dd_dev_err(rcd->dd,
"receive context %d: rhf 0x%016llx, errs [ %s%s%s%s%s%s%s%s] rte 0x%x\n",
"receive context %d: rhf 0x%016llx, errs [ %s%s%s%s%s%s%s] rte 0x%x\n",
rcd->ctxt, packet->rhf,
packet->rhf & RHF_K_HDR_LEN_ERR ? "k_hdr_len " : "",
packet->rhf & RHF_DC_UNC_ERR ? "dc_unc " : "",
@ -1589,7 +1589,6 @@ static void show_eflags_errs(struct hfi1_packet *packet)
packet->rhf & RHF_TID_ERR ? "tid " : "",
packet->rhf & RHF_LEN_ERR ? "len " : "",
packet->rhf & RHF_ECC_ERR ? "ecc " : "",
packet->rhf & RHF_VCRC_ERR ? "vcrc " : "",
packet->rhf & RHF_ICRC_ERR ? "icrc " : "",
rte);
}

View File

@ -112,9 +112,6 @@ int hfi1_alloc_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd)
*/
void hfi1_free_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd)
{
WARN_ON(!EXP_TID_SET_EMPTY(rcd->tid_full_list));
WARN_ON(!EXP_TID_SET_EMPTY(rcd->tid_used_list));
kfree(rcd->groups);
rcd->groups = NULL;
hfi1_exp_tid_group_init(rcd);

View File

@ -54,7 +54,6 @@
#include <linux/list.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/idr.h>
#include <linux/io.h>
#include <linux/fs.h>
#include <linux/completion.h>
@ -65,6 +64,7 @@
#include <linux/kthread.h>
#include <linux/i2c.h>
#include <linux/i2c-algo-bit.h>
#include <linux/xarray.h>
#include <rdma/ib_hdrs.h>
#include <rdma/opa_addr.h>
#include <linux/rhashtable.h>
@ -1021,8 +1021,8 @@ struct hfi1_asic_data {
struct hfi1_vnic_data {
struct hfi1_ctxtdata *ctxt[HFI1_NUM_VNIC_CTXT];
struct kmem_cache *txreq_cache;
struct xarray vesws;
u8 num_vports;
struct idr vesw_idr;
u8 rmt_start;
u8 num_ctxt;
};
@ -1041,7 +1041,6 @@ struct sdma_vl_map;
typedef int (*send_routine)(struct rvt_qp *, struct hfi1_pkt_state *, u64);
struct hfi1_devdata {
struct hfi1_ibdev verbs_dev; /* must be first */
struct list_head list;
/* pointers to related structs for this device */
/* pci access data structure */
struct pci_dev *pcidev;
@ -1426,8 +1425,7 @@ struct hfi1_filedata {
struct mm_struct *mm;
};
extern struct list_head hfi1_dev_list;
extern spinlock_t hfi1_devs_lock;
extern struct xarray hfi1_dev_table;
struct hfi1_devdata *hfi1_lookup(int unit);
static inline unsigned long uctxt_offset(struct hfi1_ctxtdata *uctxt)

View File

@ -49,7 +49,7 @@
#include <linux/netdevice.h>
#include <linux/vmalloc.h>
#include <linux/delay.h>
#include <linux/idr.h>
#include <linux/xarray.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/hrtimer.h>
@ -124,7 +124,7 @@ MODULE_PARM_DESC(user_credit_return_threshold, "Credit return threshold for user
static inline u64 encode_rcv_header_entry_size(u16 size);
static struct idr hfi1_unit_table;
DEFINE_XARRAY_FLAGS(hfi1_dev_table, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
static int hfi1_create_kctxt(struct hfi1_devdata *dd,
struct hfi1_pportdata *ppd)
@ -469,7 +469,7 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
if (rcd->egrbufs.size < hfi1_max_mtu) {
rcd->egrbufs.size = __roundup_pow_of_two(hfi1_max_mtu);
hfi1_cdbg(PROC,
"ctxt%u: eager bufs size too small. Adjusting to %zu\n",
"ctxt%u: eager bufs size too small. Adjusting to %u\n",
rcd->ctxt, rcd->egrbufs.size);
}
rcd->egrbufs.rcvtid_size = HFI1_MAX_EAGER_BUFFER_SIZE;
@ -805,7 +805,8 @@ static int create_workqueues(struct hfi1_devdata *dd)
ppd->hfi1_wq =
alloc_workqueue(
"hfi%d_%d",
WQ_SYSFS | WQ_HIGHPRI | WQ_CPU_INTENSIVE,
WQ_SYSFS | WQ_HIGHPRI | WQ_CPU_INTENSIVE |
WQ_MEM_RECLAIM,
HFI1_MAX_ACTIVE_WORKQUEUE_ENTRIES,
dd->unit, pidx);
if (!ppd->hfi1_wq)
@ -1018,21 +1019,9 @@ done:
return ret;
}
static inline struct hfi1_devdata *__hfi1_lookup(int unit)
{
return idr_find(&hfi1_unit_table, unit);
}
struct hfi1_devdata *hfi1_lookup(int unit)
{
struct hfi1_devdata *dd;
unsigned long flags;
spin_lock_irqsave(&hfi1_devs_lock, flags);
dd = __hfi1_lookup(unit);
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
return dd;
return xa_load(&hfi1_dev_table, unit);
}
/*
@ -1200,7 +1189,7 @@ void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
/*
* Release our hold on the shared asic data. If we are the last one,
* return the structure to be finalized outside the lock. Must be
* holding hfi1_devs_lock.
* holding hfi1_dev_table lock.
*/
static struct hfi1_asic_data *release_asic_data(struct hfi1_devdata *dd)
{
@ -1236,13 +1225,10 @@ static void hfi1_clean_devdata(struct hfi1_devdata *dd)
struct hfi1_asic_data *ad;
unsigned long flags;
spin_lock_irqsave(&hfi1_devs_lock, flags);
if (!list_empty(&dd->list)) {
idr_remove(&hfi1_unit_table, dd->unit);
list_del_init(&dd->list);
}
xa_lock_irqsave(&hfi1_dev_table, flags);
__xa_erase(&hfi1_dev_table, dd->unit);
ad = release_asic_data(dd);
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
xa_unlock_irqrestore(&hfi1_dev_table, flags);
finalize_asic_data(dd, ad);
free_platform_config(dd);
@ -1286,13 +1272,10 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
* Must be done via verbs allocator, because the verbs cleanup process
* both does cleanup and free of the data structure.
* "extra" is for chip-specific data.
*
* Use the idr mechanism to get a unit number for this unit.
*/
static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
size_t extra)
{
unsigned long flags;
struct hfi1_devdata *dd;
int ret, nports;
@ -1307,21 +1290,10 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
dd->pport = (struct hfi1_pportdata *)(dd + 1);
dd->pcidev = pdev;
pci_set_drvdata(pdev, dd);
INIT_LIST_HEAD(&dd->list);
idr_preload(GFP_KERNEL);
spin_lock_irqsave(&hfi1_devs_lock, flags);
ret = idr_alloc(&hfi1_unit_table, dd, 0, 0, GFP_NOWAIT);
if (ret >= 0) {
dd->unit = ret;
list_add(&dd->list, &hfi1_dev_list);
}
dd->node = NUMA_NO_NODE;
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
idr_preload_end();
ret = xa_alloc_irq(&hfi1_dev_table, &dd->unit, dd, xa_limit_32b,
GFP_KERNEL);
if (ret < 0) {
dev_err(&pdev->dev,
"Could not allocate unit ID: error %d\n", -ret);
@ -1522,8 +1494,6 @@ static int __init hfi1_mod_init(void)
* These must be called before the driver is registered with
* the PCI subsystem.
*/
idr_init(&hfi1_unit_table);
hfi1_dbg_init();
ret = pci_register_driver(&hfi1_pci_driver);
if (ret < 0) {
@ -1534,7 +1504,6 @@ static int __init hfi1_mod_init(void)
bail_dev:
hfi1_dbg_exit();
idr_destroy(&hfi1_unit_table);
dev_cleanup();
bail:
return ret;
@ -1552,7 +1521,7 @@ static void __exit hfi1_mod_cleanup(void)
node_affinity_destroy_all();
hfi1_dbg_exit();
idr_destroy(&hfi1_unit_table);
WARN_ON(!xa_empty(&hfi1_dev_table));
dispose_firmware(); /* asymmetric with obtain_firmware() */
dev_cleanup();
}
@ -2071,7 +2040,7 @@ int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd)
rcd->egrbufs.size = alloced_bytes;
hfi1_cdbg(PROC,
"ctxt%u: Alloced %u rcv tid entries @ %uKB, total %zuKB\n",
"ctxt%u: Alloced %u rcv tid entries @ %uKB, total %uKB\n",
rcd->ctxt, rcd->egrbufs.alloced,
rcd->egrbufs.rcvtid_size / 1024, rcd->egrbufs.size / 1024);

View File

@ -47,12 +47,14 @@
* for future transactions
*/
#include <linux/workqueue.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdmavt_qp.h>
/* STL Verbs Extended */
#define IB_BTHE_E_SHIFT 24
#define HFI1_VERBS_E_ATOMIC_VADDR U64_MAX
struct ib_atomic_eth;
enum hfi1_opfn_codes {
STL_VERBS_EXTD_NONE = 0,
STL_VERBS_EXTD_TID_RDMA,

View File

@ -742,6 +742,8 @@ void *qp_priv_alloc(struct rvt_dev_info *rdi, struct rvt_qp *qp)
iowait_wakeup,
iowait_sdma_drained,
hfi1_init_priority);
/* Init to a value to start the running average correctly */
priv->s_running_pkt_size = piothreshold / 2;
return priv;
}

View File

@ -140,10 +140,7 @@ static int make_rc_ack(struct hfi1_ibdev *dev, struct rvt_qp *qp,
case OP(RDMA_READ_RESPONSE_LAST):
case OP(RDMA_READ_RESPONSE_ONLY):
e = &qp->s_ack_queue[qp->s_tail_ack_queue];
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
/* FALLTHROUGH */
case OP(ATOMIC_ACKNOWLEDGE):
/*
@ -343,7 +340,8 @@ write_resp:
break;
e->sent = 1;
qp->s_ack_state = OP(RDMA_READ_RESPONSE_LAST);
/* Do not free e->rdma_sge until all data are received */
qp->s_ack_state = OP(ATOMIC_ACKNOWLEDGE);
break;
case TID_OP(READ_RESP):
@ -1836,7 +1834,7 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
qp->s_last = s_last;
/* see post_send() */
barrier();
rvt_put_swqe(wqe);
rvt_put_qp_swqe(qp, wqe);
rvt_qp_swqe_complete(qp,
wqe,
ib_hfi1_wc_opcode[wqe->wr.opcode],
@ -1884,7 +1882,7 @@ struct rvt_swqe *do_rc_completion(struct rvt_qp *qp,
u32 s_last;
trdma_clean_swqe(qp, wqe);
rvt_put_swqe(wqe);
rvt_put_qp_swqe(qp, wqe);
rvt_qp_wqe_unreserve(qp, wqe);
s_last = qp->s_last;
trace_hfi1_qp_send_completion(qp, wqe, s_last);
@ -2643,10 +2641,7 @@ static noinline int rc_rcv_error(struct ib_other_headers *ohdr, void *data,
len = be32_to_cpu(reth->length);
if (unlikely(offset + len != e->rdma_sge.sge_length))
goto unlock_done;
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
if (len != 0) {
u32 rkey = be32_to_cpu(reth->rkey);
u64 vaddr = get_ib_reth_vaddr(reth);
@ -3088,10 +3083,7 @@ send_last:
update_ack_queue(qp, next);
}
e = &qp->s_ack_queue[qp->r_head_ack_queue];
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
reth = &ohdr->u.rc.reth;
len = be32_to_cpu(reth->length);
if (len) {
@ -3166,10 +3158,7 @@ send_last:
update_ack_queue(qp, next);
}
e = &qp->s_ack_queue[qp->r_head_ack_queue];
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
/* Process OPFN special virtual address */
if (opfn) {
opfn_conn_response(qp, e, ateth);

View File

@ -41,6 +41,14 @@ static inline u32 restart_sge(struct rvt_sge_state *ss, struct rvt_swqe *wqe,
return rvt_restart_sge(ss, wqe, len);
}
static inline void release_rdma_sge_mr(struct rvt_ack_entry *e)
{
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
}
struct rvt_ack_entry *find_prev_entry(struct rvt_qp *qp, u32 psn, u8 *prev,
u8 *prev_ack, bool *scheduled);
int do_rc_ack(struct rvt_qp *qp, u32 aeth, u32 psn, int opcode, u64 val,

View File

@ -524,7 +524,7 @@ void _hfi1_do_send(struct work_struct *work)
/**
* hfi1_do_send - perform a send on a QP
* @work: contains a pointer to the QP
* @qp: a pointer to the QP
* @in_thread: true if in a workqueue thread
*
* Process entries in the send work queue until credit or queue is

View File

@ -67,8 +67,6 @@ static u32 mask_generation(u32 a)
#define TID_RDMA_DESTQP_FLOW_SHIFT 11
#define TID_RDMA_DESTQP_FLOW_MASK 0x1f
#define TID_FLOW_SW_PSN BIT(0)
#define TID_OPFN_QP_CTXT_MASK 0xff
#define TID_OPFN_QP_CTXT_SHIFT 56
#define TID_OPFN_QP_KDETH_MASK 0xff
@ -128,6 +126,15 @@ static int make_tid_rdma_ack(struct rvt_qp *qp,
struct ib_other_headers *ohdr,
struct hfi1_pkt_state *ps);
static void hfi1_do_tid_send(struct rvt_qp *qp);
static u32 read_r_next_psn(struct hfi1_devdata *dd, u8 ctxt, u8 fidx);
static void tid_rdma_rcv_err(struct hfi1_packet *packet,
struct ib_other_headers *ohdr,
struct rvt_qp *qp, u32 psn, int diff, bool fecn);
static void update_r_next_psn_fecn(struct hfi1_packet *packet,
struct hfi1_qp_priv *priv,
struct hfi1_ctxtdata *rcd,
struct tid_rdma_flow *flow,
bool fecn);
static u64 tid_rdma_opfn_encode(struct tid_rdma_params *p)
{
@ -776,7 +783,6 @@ int hfi1_kern_setup_hw_flow(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp)
rcd->flows[fs->index].generation = fs->generation;
fs->generation = kern_setup_hw_flow(rcd, fs->index);
fs->psn = 0;
fs->flags = 0;
dequeue_tid_waiter(rcd, &rcd->flow_queue, qp);
/* get head before dropping lock */
fqp = first_qp(rcd, &rcd->flow_queue);
@ -1807,6 +1813,7 @@ sync_check:
goto done;
hfi1_kern_clear_hw_flow(req->rcd, qp);
qpriv->s_flags &= ~HFI1_R_TID_SW_PSN;
req->state = TID_REQUEST_ACTIVE;
}
@ -2036,10 +2043,7 @@ static int tid_rdma_rcv_error(struct hfi1_packet *packet,
if (psn != e->psn || len != req->total_len)
goto unlock;
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
rkey = be32_to_cpu(reth->rkey);
vaddr = get_ib_reth_vaddr(reth);
@ -2238,7 +2242,7 @@ void hfi1_rc_rcv_tid_rdma_read_req(struct hfi1_packet *packet)
struct ib_reth *reth;
struct hfi1_qp_priv *qpriv = qp->priv;
u32 bth0, psn, len, rkey;
bool is_fecn;
bool fecn;
u8 next;
u64 vaddr;
int diff;
@ -2248,7 +2252,7 @@ void hfi1_rc_rcv_tid_rdma_read_req(struct hfi1_packet *packet)
if (hfi1_ruc_check_hdr(ibp, packet))
return;
is_fecn = process_ecn(qp, packet);
fecn = process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
trace_hfi1_rsp_rcv_tid_read_req(qp, psn);
@ -2267,9 +2271,8 @@ void hfi1_rc_rcv_tid_rdma_read_req(struct hfi1_packet *packet)
diff = delta_psn(psn, qp->r_psn);
if (unlikely(diff)) {
if (tid_rdma_rcv_error(packet, ohdr, qp, psn, diff))
return;
goto send_ack;
tid_rdma_rcv_err(packet, ohdr, qp, psn, diff, fecn);
return;
}
/* We've verified the request, insert it into the ack queue. */
@ -2285,10 +2288,7 @@ void hfi1_rc_rcv_tid_rdma_read_req(struct hfi1_packet *packet)
update_ack_queue(qp, next);
}
e = &qp->s_ack_queue[qp->r_head_ack_queue];
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
rkey = be32_to_cpu(reth->rkey);
qp->r_len = len;
@ -2324,11 +2324,11 @@ void hfi1_rc_rcv_tid_rdma_read_req(struct hfi1_packet *packet)
/* Schedule the send tasklet. */
qp->s_flags |= RVT_S_RESP_PENDING;
if (fecn)
qp->s_flags |= RVT_S_ECN;
hfi1_schedule_send(qp);
spin_unlock_irqrestore(&qp->s_lock, flags);
if (is_fecn)
goto send_ack;
return;
nack_inv_unlock:
@ -2345,8 +2345,6 @@ nack_acc:
rvt_rc_error(qp, IB_WC_LOC_PROT_ERR);
qp->r_nak_state = IB_NAK_REMOTE_ACCESS_ERROR;
qp->r_ack_psn = qp->r_psn;
send_ack:
hfi1_send_rc_ack(packet, is_fecn);
}
u32 hfi1_build_tid_rdma_read_resp(struct rvt_qp *qp, struct rvt_ack_entry *e,
@ -2463,12 +2461,12 @@ void hfi1_rc_rcv_tid_rdma_read_resp(struct hfi1_packet *packet)
struct tid_rdma_request *req;
struct tid_rdma_flow *flow;
u32 opcode, aeth;
bool is_fecn;
bool fecn;
unsigned long flags;
u32 kpsn, ipsn;
trace_hfi1_sender_rcv_tid_read_resp(qp);
is_fecn = process_ecn(qp, packet);
fecn = process_ecn(qp, packet);
kpsn = mask_psn(be32_to_cpu(ohdr->bth[2]));
aeth = be32_to_cpu(ohdr->u.tid_rdma.r_rsp.aeth);
opcode = (be32_to_cpu(ohdr->bth[0]) >> 24) & 0xff;
@ -2481,8 +2479,43 @@ void hfi1_rc_rcv_tid_rdma_read_resp(struct hfi1_packet *packet)
flow = &req->flows[req->clear_tail];
/* When header suppression is disabled */
if (cmp_psn(ipsn, flow->flow_state.ib_lpsn))
if (cmp_psn(ipsn, flow->flow_state.ib_lpsn)) {
update_r_next_psn_fecn(packet, priv, rcd, flow, fecn);
if (cmp_psn(kpsn, flow->flow_state.r_next_psn))
goto ack_done;
flow->flow_state.r_next_psn = mask_psn(kpsn + 1);
/*
* Copy the payload to destination buffer if this packet is
* delivered as an eager packet due to RSM rule and FECN.
* The RSM rule selects FECN bit in BTH and SH bit in
* KDETH header and therefore will not match the last
* packet of each segment that has SH bit cleared.
*/
if (fecn && packet->etype == RHF_RCV_TYPE_EAGER) {
struct rvt_sge_state ss;
u32 len;
u32 tlen = packet->tlen;
u16 hdrsize = packet->hlen;
u8 pad = packet->pad;
u8 extra_bytes = pad + packet->extra_byte +
(SIZE_OF_CRC << 2);
u32 pmtu = qp->pmtu;
if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
goto ack_op_err;
len = restart_sge(&ss, req->e.swqe, ipsn, pmtu);
if (unlikely(len < pmtu))
goto ack_op_err;
rvt_copy_sge(qp, &ss, packet->payload, pmtu, false,
false);
/* Raise the sw sequence check flag for next packet */
priv->s_flags |= HFI1_R_TID_SW_PSN;
}
goto ack_done;
}
flow->flow_state.r_next_psn = mask_psn(kpsn + 1);
req->ack_pending--;
priv->pending_tid_r_segs--;
qp->s_num_rd_atomic--;
@ -2524,6 +2557,7 @@ void hfi1_rc_rcv_tid_rdma_read_resp(struct hfi1_packet *packet)
req->comp_seg == req->cur_seg) ||
priv->tid_r_comp == priv->tid_r_reqs) {
hfi1_kern_clear_hw_flow(priv->rcd, qp);
priv->s_flags &= ~HFI1_R_TID_SW_PSN;
if (req->state == TID_REQUEST_SYNC)
req->state = TID_REQUEST_ACTIVE;
}
@ -2545,8 +2579,6 @@ ack_op_err:
ack_done:
spin_unlock_irqrestore(&qp->s_lock, flags);
if (is_fecn)
hfi1_send_rc_ack(packet, is_fecn);
}
void hfi1_kern_read_tid_flow_free(struct rvt_qp *qp)
@ -2773,9 +2805,9 @@ static bool handle_read_kdeth_eflags(struct hfi1_ctxtdata *rcd,
rvt_rc_error(qp, IB_WC_LOC_QP_OP_ERR);
return ret;
}
if (priv->flow_state.flags & TID_FLOW_SW_PSN) {
if (priv->s_flags & HFI1_R_TID_SW_PSN) {
diff = cmp_psn(psn,
priv->flow_state.r_next_psn);
flow->flow_state.r_next_psn);
if (diff > 0) {
if (!(qp->r_flags & RVT_R_RDMAR_SEQ))
restart_tid_rdma_read_req(rcd,
@ -2811,22 +2843,15 @@ static bool handle_read_kdeth_eflags(struct hfi1_ctxtdata *rcd,
qp->r_flags &=
~RVT_R_RDMAR_SEQ;
}
priv->flow_state.r_next_psn++;
flow->flow_state.r_next_psn =
mask_psn(psn + 1);
} else {
u64 reg;
u32 last_psn;
/*
* The only sane way to get the amount of
* progress is to read the HW flow state.
*/
reg = read_uctxt_csr(dd, rcd->ctxt,
RCV_TID_FLOW_TABLE +
(8 * flow->idx));
last_psn = mask_psn(reg);
priv->flow_state.r_next_psn = last_psn;
priv->flow_state.flags |= TID_FLOW_SW_PSN;
last_psn = read_r_next_psn(dd, rcd->ctxt,
flow->idx);
flow->flow_state.r_next_psn = last_psn;
priv->s_flags |= HFI1_R_TID_SW_PSN;
/*
* If no request has been restarted yet,
* restart the current one.
@ -2891,10 +2916,11 @@ bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd,
struct rvt_ack_entry *e;
struct tid_rdma_request *req;
struct tid_rdma_flow *flow;
int diff = 0;
trace_hfi1_msg_handle_kdeth_eflags(NULL, "Kdeth error: rhf ",
packet->rhf);
if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
if (packet->rhf & RHF_ICRC_ERR)
return ret;
packet->ohdr = &hdr->u.oth;
@ -2974,17 +3000,10 @@ bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd,
switch (rte) {
case RHF_RTE_EXPECTED_FLOW_SEQ_ERR:
if (!(qpriv->s_flags & HFI1_R_TID_SW_PSN)) {
u64 reg;
qpriv->s_flags |= HFI1_R_TID_SW_PSN;
/*
* The only sane way to get the amount of
* progress is to read the HW flow state.
*/
reg = read_uctxt_csr(dd, rcd->ctxt,
RCV_TID_FLOW_TABLE +
(8 * flow->idx));
flow->flow_state.r_next_psn = mask_psn(reg);
flow->flow_state.r_next_psn =
read_r_next_psn(dd, rcd->ctxt,
flow->idx);
qpriv->r_next_psn_kdeth =
flow->flow_state.r_next_psn;
goto nak_psn;
@ -2997,10 +3016,12 @@ bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd,
* mismatch could be due to packets that were
* already in flight.
*/
if (psn != flow->flow_state.r_next_psn) {
psn = flow->flow_state.r_next_psn;
diff = cmp_psn(psn,
flow->flow_state.r_next_psn);
if (diff > 0)
goto nak_psn;
}
else if (diff < 0)
break;
qpriv->s_nak_state = 0;
/*
@ -3011,8 +3032,10 @@ bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd,
if (psn == full_flow_psn(flow,
flow->flow_state.lpsn))
ret = false;
flow->flow_state.r_next_psn =
mask_psn(psn + 1);
qpriv->r_next_psn_kdeth =
++flow->flow_state.r_next_psn;
flow->flow_state.r_next_psn;
}
break;
@ -3517,8 +3540,10 @@ static void hfi1_tid_write_alloc_resources(struct rvt_qp *qp, bool intr_ctx)
if (qpriv->r_tid_alloc == qpriv->r_tid_head) {
/* If all data has been received, clear the flow */
if (qpriv->flow_state.index < RXE_NUM_TID_FLOWS &&
!qpriv->alloc_w_segs)
!qpriv->alloc_w_segs) {
hfi1_kern_clear_hw_flow(rcd, qp);
qpriv->s_flags &= ~HFI1_R_TID_SW_PSN;
}
break;
}
@ -3544,8 +3569,7 @@ static void hfi1_tid_write_alloc_resources(struct rvt_qp *qp, bool intr_ctx)
if (qpriv->sync_pt && !qpriv->alloc_w_segs) {
hfi1_kern_clear_hw_flow(rcd, qp);
qpriv->sync_pt = false;
if (qpriv->s_flags & HFI1_R_TID_SW_PSN)
qpriv->s_flags &= ~HFI1_R_TID_SW_PSN;
qpriv->s_flags &= ~HFI1_R_TID_SW_PSN;
}
/* Allocate flow if we don't have one */
@ -3687,7 +3711,7 @@ void hfi1_rc_rcv_tid_rdma_write_req(struct hfi1_packet *packet)
struct hfi1_qp_priv *qpriv = qp->priv;
struct tid_rdma_request *req;
u32 bth0, psn, len, rkey, num_segs;
bool is_fecn;
bool fecn;
u8 next;
u64 vaddr;
int diff;
@ -3696,7 +3720,7 @@ void hfi1_rc_rcv_tid_rdma_write_req(struct hfi1_packet *packet)
if (hfi1_ruc_check_hdr(ibp, packet))
return;
is_fecn = process_ecn(qp, packet);
fecn = process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
trace_hfi1_rsp_rcv_tid_write_req(qp, psn);
@ -3713,9 +3737,8 @@ void hfi1_rc_rcv_tid_rdma_write_req(struct hfi1_packet *packet)
num_segs = DIV_ROUND_UP(len, qpriv->tid_rdma.local.max_len);
diff = delta_psn(psn, qp->r_psn);
if (unlikely(diff)) {
if (tid_rdma_rcv_error(packet, ohdr, qp, psn, diff))
return;
goto send_ack;
tid_rdma_rcv_err(packet, ohdr, qp, psn, diff, fecn);
return;
}
/*
@ -3751,10 +3774,7 @@ void hfi1_rc_rcv_tid_rdma_write_req(struct hfi1_packet *packet)
goto update_head;
}
if (e->rdma_sge.mr) {
rvt_put_mr(e->rdma_sge.mr);
e->rdma_sge.mr = NULL;
}
release_rdma_sge_mr(e);
/* The length needs to be in multiples of PAGE_SIZE */
if (!len || len & ~PAGE_MASK)
@ -3834,11 +3854,11 @@ update_head:
/* Schedule the send tasklet. */
qp->s_flags |= RVT_S_RESP_PENDING;
if (fecn)
qp->s_flags |= RVT_S_ECN;
hfi1_schedule_send(qp);
spin_unlock_irqrestore(&qp->s_lock, flags);
if (is_fecn)
goto send_ack;
return;
nack_inv_unlock:
@ -3855,8 +3875,6 @@ nack_acc:
rvt_rc_error(qp, IB_WC_LOC_PROT_ERR);
qp->r_nak_state = IB_NAK_REMOTE_ACCESS_ERROR;
qp->r_ack_psn = qp->r_psn;
send_ack:
hfi1_send_rc_ack(packet, is_fecn);
}
u32 hfi1_build_tid_rdma_write_resp(struct rvt_qp *qp, struct rvt_ack_entry *e,
@ -4073,10 +4091,10 @@ void hfi1_rc_rcv_tid_rdma_write_resp(struct hfi1_packet *packet)
struct tid_rdma_flow *flow;
enum ib_wc_status status;
u32 opcode, aeth, psn, flow_psn, i, tidlen = 0, pktlen;
bool is_fecn;
bool fecn;
unsigned long flags;
is_fecn = process_ecn(qp, packet);
fecn = process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
aeth = be32_to_cpu(ohdr->u.tid_rdma.w_rsp.aeth);
opcode = (be32_to_cpu(ohdr->bth[0]) >> 24) & 0xff;
@ -4216,7 +4234,6 @@ void hfi1_rc_rcv_tid_rdma_write_resp(struct hfi1_packet *packet)
qpriv->s_tid_cur = i;
}
qp->s_flags &= ~HFI1_S_WAIT_TID_RESP;
hfi1_schedule_tid_send(qp);
goto ack_done;
@ -4225,9 +4242,9 @@ ack_op_err:
ack_err:
rvt_error_qp(qp, status);
ack_done:
if (fecn)
qp->s_flags |= RVT_S_ECN;
spin_unlock_irqrestore(&qp->s_lock, flags);
if (is_fecn)
hfi1_send_rc_ack(packet, is_fecn);
}
bool hfi1_build_tid_rdma_packet(struct rvt_swqe *wqe,
@ -4307,7 +4324,9 @@ void hfi1_rc_rcv_tid_rdma_write_data(struct hfi1_packet *packet)
unsigned long flags;
u32 psn, next;
u8 opcode;
bool fecn;
fecn = process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
opcode = (be32_to_cpu(ohdr->bth[0]) >> 24) & 0xff;
@ -4320,9 +4339,53 @@ void hfi1_rc_rcv_tid_rdma_write_data(struct hfi1_packet *packet)
req = ack_to_tid_req(e);
flow = &req->flows[req->clear_tail];
if (cmp_psn(psn, full_flow_psn(flow, flow->flow_state.lpsn))) {
update_r_next_psn_fecn(packet, priv, rcd, flow, fecn);
if (cmp_psn(psn, flow->flow_state.r_next_psn))
goto send_nak;
flow->flow_state.r_next_psn++;
flow->flow_state.r_next_psn = mask_psn(psn + 1);
/*
* Copy the payload to destination buffer if this packet is
* delivered as an eager packet due to RSM rule and FECN.
* The RSM rule selects FECN bit in BTH and SH bit in
* KDETH header and therefore will not match the last
* packet of each segment that has SH bit cleared.
*/
if (fecn && packet->etype == RHF_RCV_TYPE_EAGER) {
struct rvt_sge_state ss;
u32 len;
u32 tlen = packet->tlen;
u16 hdrsize = packet->hlen;
u8 pad = packet->pad;
u8 extra_bytes = pad + packet->extra_byte +
(SIZE_OF_CRC << 2);
u32 pmtu = qp->pmtu;
if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
goto send_nak;
len = req->comp_seg * req->seg_len;
len += delta_psn(psn,
full_flow_psn(flow, flow->flow_state.spsn)) *
pmtu;
if (unlikely(req->total_len - len < pmtu))
goto send_nak;
/*
* The e->rdma_sge field is set when TID RDMA WRITE REQ
* is first received and is never modified thereafter.
*/
ss.sge = e->rdma_sge;
ss.sg_list = NULL;
ss.num_sge = 1;
ss.total_len = req->total_len;
rvt_skip_sge(&ss, len, false);
rvt_copy_sge(qp, &ss, packet->payload, pmtu, false,
false);
/* Raise the sw sequence check flag for next packet */
priv->r_next_psn_kdeth = mask_psn(psn + 1);
priv->s_flags |= HFI1_R_TID_SW_PSN;
}
goto exit;
}
flow->flow_state.r_next_psn = mask_psn(psn + 1);
@ -4347,6 +4410,7 @@ void hfi1_rc_rcv_tid_rdma_write_data(struct hfi1_packet *packet)
priv->r_tid_ack = priv->r_tid_tail;
if (opcode == TID_OP(WRITE_DATA_LAST)) {
release_rdma_sge_mr(e);
for (next = priv->r_tid_tail + 1; ; next++) {
if (next > rvt_size_atomic(&dev->rdi))
next = 0;
@ -4386,6 +4450,8 @@ done:
hfi1_schedule_tid_send(qp);
exit:
priv->r_next_psn_kdeth = flow->flow_state.r_next_psn;
if (fecn)
qp->s_flags |= RVT_S_ECN;
spin_unlock_irqrestore(&qp->s_lock, flags);
return;
@ -4487,12 +4553,11 @@ void hfi1_rc_rcv_tid_rdma_ack(struct hfi1_packet *packet)
struct tid_rdma_request *req;
struct tid_rdma_flow *flow;
u32 aeth, psn, req_psn, ack_psn, fspsn, resync_psn, ack_kpsn;
bool is_fecn;
unsigned long flags;
u16 fidx;
trace_hfi1_tid_write_sender_rcv_tid_ack(qp, 0);
is_fecn = process_ecn(qp, packet);
process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
aeth = be32_to_cpu(ohdr->u.tid_rdma.ack.aeth);
req_psn = mask_psn(be32_to_cpu(ohdr->u.tid_rdma.ack.verbs_psn));
@ -4846,10 +4911,10 @@ void hfi1_rc_rcv_tid_rdma_resync(struct hfi1_packet *packet)
struct tid_rdma_flow *flow;
struct tid_flow_state *fs = &qpriv->flow_state;
u32 psn, generation, idx, gen_next;
bool is_fecn;
bool fecn;
unsigned long flags;
is_fecn = process_ecn(qp, packet);
fecn = process_ecn(qp, packet);
psn = mask_psn(be32_to_cpu(ohdr->bth[2]));
generation = mask_psn(psn + 1) >> HFI1_KDETH_BTH_SEQ_SHIFT;
@ -4940,6 +5005,8 @@ void hfi1_rc_rcv_tid_rdma_resync(struct hfi1_packet *packet)
qpriv->s_flags |= RVT_S_ACK_PENDING;
hfi1_schedule_tid_send(qp);
bail:
if (fecn)
qp->s_flags |= RVT_S_ECN;
spin_unlock_irqrestore(&qp->s_lock, flags);
}
@ -5449,3 +5516,48 @@ bool hfi1_tid_rdma_ack_interlock(struct rvt_qp *qp, struct rvt_ack_entry *e)
}
return false;
}
static u32 read_r_next_psn(struct hfi1_devdata *dd, u8 ctxt, u8 fidx)
{
u64 reg;
/*
* The only sane way to get the amount of
* progress is to read the HW flow state.
*/
reg = read_uctxt_csr(dd, ctxt, RCV_TID_FLOW_TABLE + (8 * fidx));
return mask_psn(reg);
}
static void tid_rdma_rcv_err(struct hfi1_packet *packet,
struct ib_other_headers *ohdr,
struct rvt_qp *qp, u32 psn, int diff, bool fecn)
{
unsigned long flags;
tid_rdma_rcv_error(packet, ohdr, qp, psn, diff);
if (fecn) {
spin_lock_irqsave(&qp->s_lock, flags);
qp->s_flags |= RVT_S_ECN;
spin_unlock_irqrestore(&qp->s_lock, flags);
}
}
static void update_r_next_psn_fecn(struct hfi1_packet *packet,
struct hfi1_qp_priv *priv,
struct hfi1_ctxtdata *rcd,
struct tid_rdma_flow *flow,
bool fecn)
{
/*
* If a start/middle packet is delivered here due to
* RSM rule and FECN, we need to update the r_next_psn.
*/
if (fecn && packet->etype == RHF_RCV_TYPE_EAGER &&
!(priv->s_flags & HFI1_R_TID_SW_PSN)) {
struct hfi1_devdata *dd = rcd->dd;
flow->flow_state.r_next_psn =
read_r_next_psn(dd, rcd->ctxt, flow->idx);
}
}

View File

@ -76,10 +76,8 @@ struct tid_rdma_qp_params {
struct tid_flow_state {
u32 generation;
u32 psn;
u32 r_next_psn; /* next PSN to be received (in TID space) */
u8 index;
u8 last_index;
u8 flags;
};
enum tid_rdma_req_state {

View File

@ -86,14 +86,14 @@ DECLARE_EVENT_CLASS(hfi1_trace_template,
* actual function to work and can not be in a macro.
*/
#define __hfi1_trace_def(lvl) \
void __hfi1_trace_##lvl(const char *funct, char *fmt, ...); \
void __printf(2, 3) __hfi1_trace_##lvl(const char *funct, char *fmt, ...); \
\
DEFINE_EVENT(hfi1_trace_template, hfi1_ ##lvl, \
TP_PROTO(const char *function, struct va_format *vaf), \
TP_ARGS(function, vaf))
#define __hfi1_trace_fn(lvl) \
void __hfi1_trace_##lvl(const char *func, char *fmt, ...) \
void __printf(2, 3) __hfi1_trace_##lvl(const char *func, char *fmt, ...)\
{ \
struct va_format vaf = { \
.fmt = fmt, \

View File

@ -53,7 +53,7 @@ u16 hfi1_trace_get_tid_idx(u32 ent);
"tid_r_comp %u pending_tid_r_segs %u " \
"s_flags 0x%x ps_flags 0x%x iow_flags 0x%lx " \
"s_state 0x%x hw_flow_index %u generation 0x%x " \
"fpsn 0x%x flow_flags 0x%x"
"fpsn 0x%x"
#define TID_REQ_PRN "[%s] qpn 0x%x newreq %u opcode 0x%x psn 0x%x lpsn 0x%x " \
"cur_seg %u comp_seg %u ack_seg %u alloc_seg %u " \
@ -71,7 +71,7 @@ u16 hfi1_trace_get_tid_idx(u32 ent);
"pending_tid_w_segs %u sync_pt %s " \
"ps_nak_psn 0x%x ps_nak_state 0x%x " \
"prnr_nak_state 0x%x hw_flow_index %u generation "\
"0x%x fpsn 0x%x flow_flags 0x%x resync %s" \
"0x%x fpsn 0x%x resync %s" \
"r_next_psn_kdeth 0x%x"
#define TID_WRITE_SENDER_PRN "[%s] qpn 0x%x newreq %u s_tid_cur %u " \
@ -973,7 +973,6 @@ DECLARE_EVENT_CLASS(/* tid_read_sender */
__field(u32, hw_flow_index)
__field(u32, generation)
__field(u32, fpsn)
__field(u32, flow_flags)
),
TP_fast_assign(/* assign */
struct hfi1_qp_priv *priv = qp->priv;
@ -991,7 +990,6 @@ DECLARE_EVENT_CLASS(/* tid_read_sender */
__entry->hw_flow_index = priv->flow_state.index;
__entry->generation = priv->flow_state.generation;
__entry->fpsn = priv->flow_state.psn;
__entry->flow_flags = priv->flow_state.flags;
),
TP_printk(/* print */
TID_READ_SENDER_PRN,
@ -1007,8 +1005,7 @@ DECLARE_EVENT_CLASS(/* tid_read_sender */
__entry->s_state,
__entry->hw_flow_index,
__entry->generation,
__entry->fpsn,
__entry->flow_flags
__entry->fpsn
)
);
@ -1338,7 +1335,6 @@ DECLARE_EVENT_CLASS(/* tid_write_sp */
__field(u32, hw_flow_index)
__field(u32, generation)
__field(u32, fpsn)
__field(u32, flow_flags)
__field(bool, resync)
__field(u32, r_next_psn_kdeth)
),
@ -1360,7 +1356,6 @@ DECLARE_EVENT_CLASS(/* tid_write_sp */
__entry->hw_flow_index = priv->flow_state.index;
__entry->generation = priv->flow_state.generation;
__entry->fpsn = priv->flow_state.psn;
__entry->flow_flags = priv->flow_state.flags;
__entry->resync = priv->resync;
__entry->r_next_psn_kdeth = priv->r_next_psn_kdeth;
),
@ -1381,7 +1376,6 @@ DECLARE_EVENT_CLASS(/* tid_write_sp */
__entry->hw_flow_index,
__entry->generation,
__entry->fpsn,
__entry->flow_flags,
__entry->resync ? "yes" : "no",
__entry->r_next_psn_kdeth
)

View File

@ -1223,15 +1223,16 @@ static inline send_routine get_send_routine(struct rvt_qp *qp,
case IB_QPT_UD:
break;
case IB_QPT_UC:
case IB_QPT_RC: {
case IB_QPT_RC:
priv->s_running_pkt_size =
(tx->s_cur_size + priv->s_running_pkt_size) / 2;
if (piothreshold &&
tx->s_cur_size <= min(piothreshold, qp->pmtu) &&
priv->s_running_pkt_size <= min(piothreshold, qp->pmtu) &&
(BIT(ps->opcode & OPMASK) & pio_opmask[ps->opcode >> 5]) &&
iowait_sdma_pending(&priv->s_iowait) == 0 &&
!sdma_txreq_built(&tx->txreq))
return dd->process_pio_send;
break;
}
default:
break;
}
@ -1739,15 +1740,15 @@ static struct rdma_hw_stats *alloc_hw_stats(struct ib_device *ibdev,
static u64 hfi1_sps_ints(void)
{
unsigned long flags;
unsigned long index, flags;
struct hfi1_devdata *dd;
u64 sps_ints = 0;
spin_lock_irqsave(&hfi1_devs_lock, flags);
list_for_each_entry(dd, &hfi1_dev_list, list) {
xa_lock_irqsave(&hfi1_dev_table, flags);
xa_for_each(&hfi1_dev_table, index, dd) {
sps_ints += get_all_cpu_total(dd->int_counter);
}
spin_unlock_irqrestore(&hfi1_devs_lock, flags);
xa_unlock_irqrestore(&hfi1_dev_table, flags);
return sps_ints;
}

View File

@ -170,6 +170,7 @@ struct hfi1_qp_priv {
struct tid_flow_state flow_state;
struct tid_rdma_qp_params tid_rdma;
struct rvt_qp *owner;
u16 s_running_pkt_size;
u8 hdr_type; /* 9B or 16B */
struct rvt_sge_state tid_ss; /* SGE state pointer for 2nd leg */
atomic_t n_requests; /* # of TID RDMA requests in the */

View File

@ -162,12 +162,12 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
void hfi1_vnic_setup(struct hfi1_devdata *dd)
{
idr_init(&dd->vnic.vesw_idr);
xa_init(&dd->vnic.vesws);
}
void hfi1_vnic_cleanup(struct hfi1_devdata *dd)
{
idr_destroy(&dd->vnic.vesw_idr);
WARN_ON(!xa_empty(&dd->vnic.vesws));
}
#define SUM_GRP_COUNTERS(stats, qstats, x_grp) do { \
@ -533,7 +533,7 @@ void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet)
l4_type = hfi1_16B_get_l4(packet->ebuf);
if (likely(l4_type == OPA_16B_L4_ETHR)) {
vesw_id = HFI1_VNIC_GET_VESWID(packet->ebuf);
vinfo = idr_find(&dd->vnic.vesw_idr, vesw_id);
vinfo = xa_load(&dd->vnic.vesws, vesw_id);
/*
* In case of invalid vesw id, count the error on
@ -541,9 +541,10 @@ void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet)
*/
if (unlikely(!vinfo)) {
struct hfi1_vnic_vport_info *vinfo_tmp;
int id_tmp = 0;
unsigned long index = 0;
vinfo_tmp = idr_get_next(&dd->vnic.vesw_idr, &id_tmp);
vinfo_tmp = xa_find(&dd->vnic.vesws, &index, ULONG_MAX,
XA_PRESENT);
if (vinfo_tmp) {
spin_lock(&vport_cntr_lock);
vinfo_tmp->stats[0].netstats.rx_nohandler++;
@ -597,8 +598,7 @@ static int hfi1_vnic_up(struct hfi1_vnic_vport_info *vinfo)
if (!vinfo->vesw_id)
return -EINVAL;
rc = idr_alloc(&dd->vnic.vesw_idr, vinfo, vinfo->vesw_id,
vinfo->vesw_id + 1, GFP_NOWAIT);
rc = xa_insert(&dd->vnic.vesws, vinfo->vesw_id, vinfo, GFP_KERNEL);
if (rc < 0)
return rc;
@ -624,7 +624,7 @@ static void hfi1_vnic_down(struct hfi1_vnic_vport_info *vinfo)
clear_bit(HFI1_VNIC_UP, &vinfo->flags);
netif_carrier_off(vinfo->netdev);
netif_tx_disable(vinfo->netdev);
idr_remove(&dd->vnic.vesw_idr, vinfo->vesw_id);
xa_erase(&dd->vnic.vesws, vinfo->vesw_id);
/* ensure irqs see the change */
msix_vnic_synchronize_irq(dd);

View File

@ -7,8 +7,8 @@ ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
obj-$(CONFIG_INFINIBAND_HNS) += hns-roce.o
hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_pd.o \
hns_roce_ah.o hns_roce_hem.o hns_roce_mr.o hns_roce_qp.o \
hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o
hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o hns_roce_restrack.o
obj-$(CONFIG_INFINIBAND_HNS_HIP06) += hns-roce-hw-v1.o
hns-roce-hw-v1-objs := hns_roce_hw_v1.o
obj-$(CONFIG_INFINIBAND_HNS_HIP08) += hns-roce-hw-v2.o
hns-roce-hw-v2-objs := hns_roce_hw_v2.o
hns-roce-hw-v2-objs := hns_roce_hw_v2.o hns_roce_hw_v2_dfx.o

View File

@ -39,38 +39,34 @@
#define HNS_ROCE_VLAN_SL_BIT_MASK 7
#define HNS_ROCE_VLAN_SL_SHIFT 13
struct ib_ah *hns_roce_create_ah(struct ib_pd *ibpd,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct ib_udata *udata)
int hns_roce_create_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr,
u32 flags, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibpd->device);
struct hns_roce_dev *hr_dev = to_hr_dev(ibah->device);
const struct ib_gid_attr *gid_attr;
struct device *dev = hr_dev->dev;
struct hns_roce_ah *ah;
struct hns_roce_ah *ah = to_hr_ah(ibah);
u16 vlan_tag = 0xffff;
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
bool vlan_en = false;
int ret;
ah = kzalloc(sizeof(*ah), GFP_ATOMIC);
if (!ah)
return ERR_PTR(-ENOMEM);
gid_attr = ah_attr->grh.sgid_attr;
ret = rdma_read_gid_l2_fields(gid_attr, &vlan_tag, NULL);
if (ret)
return ret;
/* Get mac address */
memcpy(ah->av.mac, ah_attr->roce.dmac, ETH_ALEN);
gid_attr = ah_attr->grh.sgid_attr;
if (is_vlan_dev(gid_attr->ndev)) {
vlan_tag = vlan_dev_vlan_id(gid_attr->ndev);
if (vlan_tag < VLAN_CFI_MASK) {
vlan_en = true;
}
if (vlan_tag < 0x1000)
vlan_tag |= (rdma_ah_get_sl(ah_attr) &
HNS_ROCE_VLAN_SL_BIT_MASK) <<
HNS_ROCE_VLAN_SL_SHIFT;
}
ah->av.port_pd = cpu_to_be32(to_hr_pd(ibpd)->pdn |
ah->av.port_pd = cpu_to_le32(to_hr_pd(ibah->pd)->pdn |
(rdma_ah_get_port_num(ah_attr) <<
HNS_ROCE_PORT_NUM_SHIFT));
ah->av.gid_index = grh->sgid_index;
@ -86,7 +82,7 @@ struct ib_ah *hns_roce_create_ah(struct ib_pd *ibpd,
ah->av.sl_tclass_flowlabel = cpu_to_le32(rdma_ah_get_sl(ah_attr) <<
HNS_ROCE_SL_SHIFT);
return &ah->ibah;
return 0;
}
int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
@ -111,9 +107,7 @@ int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
return 0;
}
int hns_roce_destroy_ah(struct ib_ah *ah, u32 flags)
void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags)
{
kfree(to_hr_ah(ah));
return 0;
return;
}

View File

@ -53,6 +53,7 @@ enum {
HNS_ROCE_CMD_QUERY_QPC = 0x42,
HNS_ROCE_CMD_MODIFY_CQC = 0x52,
HNS_ROCE_CMD_QUERY_CQC = 0x53,
/* CQC BT commands */
HNS_ROCE_CMD_WRITE_CQC_BT0 = 0x10,
HNS_ROCE_CMD_WRITE_CQC_BT1 = 0x11,

View File

@ -57,32 +57,6 @@
#define roce_set_bit(origin, shift, val) \
roce_set_field((origin), (1ul << (shift)), (shift), (val))
/*
* roce_hw_index_cmp_lt - Compare two hardware index values in hisilicon
* SOC, check if a is less than b.
* @a: hardware index value
* @b: hardware index value
* @bits: the number of bits of a and b, range: 0~31.
*
* Hardware index increases continuously till max value, and then restart
* from zero, again and again. Because the bits of reg field is often
* limited, the reg field can only hold the low bits of the hardware index
* in hisilicon SOC.
* In some scenes we need to compare two values(a,b) getted from two reg
* fields in this driver, for example:
* If a equals 0xfffe, b equals 0x1 and bits equals 16, we think b has
* incresed from 0xffff to 0x1 and a is less than b.
* If a equals 0xfffe, b equals 0x0xf001 and bits equals 16, we think a
* is bigger than b.
*
* Return true on a less than b, otherwise false.
*/
#define roce_hw_index_mask(bits) ((1ul << (bits)) - 1)
#define roce_hw_index_shift(bits) (32 - (bits))
#define roce_hw_index_cmp_lt(a, b, bits) \
((int)((((a) - (b)) & roce_hw_index_mask(bits)) << \
roce_hw_index_shift(bits)) < 0)
#define ROCEE_GLB_CFG_ROCEE_DB_SQ_MODE_S 3
#define ROCEE_GLB_CFG_ROCEE_DB_OTH_MODE_S 4
@ -271,8 +245,6 @@
#define ROCEE_SDB_SEND_PTR_SDB_SEND_PTR_M \
(((1UL << 28) - 1) << ROCEE_SDB_SEND_PTR_SDB_SEND_PTR_S)
#define ROCEE_SDB_PTR_CMP_BITS 28
#define ROCEE_SDB_INV_CNT_SDB_INV_CNT_S 0
#define ROCEE_SDB_INV_CNT_SDB_INV_CNT_M \
(((1UL << 16) - 1) << ROCEE_SDB_INV_CNT_SDB_INV_CNT_S)
@ -353,13 +325,8 @@
#define ROCEE_CAEP_AE_MASK_REG 0x6C8
#define ROCEE_CAEP_AE_ST_REG 0x6CC
#define ROCEE_SDB_ISSUE_PTR_REG 0x758
#define ROCEE_SDB_SEND_PTR_REG 0x75C
#define ROCEE_CAEP_CQE_WCMD_EMPTY 0x850
#define ROCEE_SCAEP_WR_CQE_CNT 0x8D0
#define ROCEE_SDB_INV_CNT_REG 0x9A4
#define ROCEE_SDB_RETRY_CNT_REG 0x9AC
#define ROCEE_TSP_BP_ST_REG 0x9EC
#define ROCEE_ECC_UCERR_ALM0_REG 0xB34
#define ROCEE_ECC_CERR_ALM0_REG 0xB40

View File

@ -32,6 +32,7 @@
#include <linux/platform_device.h>
#include <rdma/ib_umem.h>
#include <rdma/uverbs_ioctl.h>
#include "hns_roce_device.h"
#include "hns_roce_cmd.h"
#include "hns_roce_hem.h"
@ -127,13 +128,9 @@ static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
goto err_out;
}
/* The cq insert radix tree */
spin_lock_irq(&cq_table->lock);
/* Radix_tree: The associated pointer and long integer key value like */
ret = radix_tree_insert(&cq_table->tree, hr_cq->cqn, hr_cq);
spin_unlock_irq(&cq_table->lock);
ret = xa_err(xa_store(&cq_table->array, hr_cq->cqn, hr_cq, GFP_KERNEL));
if (ret) {
dev_err(dev, "CQ alloc.Failed to radix_tree_insert.\n");
dev_err(dev, "CQ alloc failed xa_store.\n");
goto err_put;
}
@ -141,7 +138,7 @@ static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
mailbox = hns_roce_alloc_cmd_mailbox(hr_dev);
if (IS_ERR(mailbox)) {
ret = PTR_ERR(mailbox);
goto err_radix;
goto err_xa;
}
hr_dev->hw->write_cqc(hr_dev, hr_cq, mailbox->buf, mtts, dma_handle,
@ -152,7 +149,7 @@ static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
hns_roce_free_cmd_mailbox(hr_dev, mailbox);
if (ret) {
dev_err(dev, "CQ alloc.Failed to cmd mailbox.\n");
goto err_radix;
goto err_xa;
}
hr_cq->cons_index = 0;
@ -164,10 +161,8 @@ static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
return 0;
err_radix:
spin_lock_irq(&cq_table->lock);
radix_tree_delete(&cq_table->tree, hr_cq->cqn);
spin_unlock_irq(&cq_table->lock);
err_xa:
xa_erase(&cq_table->array, hr_cq->cqn);
err_put:
hns_roce_table_put(hr_dev, &cq_table->table, hr_cq->cqn);
@ -197,6 +192,8 @@ void hns_roce_free_cq(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq)
dev_err(dev, "HW2SW_CQ failed (%d) for CQN %06lx\n", ret,
hr_cq->cqn);
xa_erase(&cq_table->array, hr_cq->cqn);
/* Waiting interrupt process procedure carried out */
synchronize_irq(hr_dev->eq_table.eq[hr_cq->vector].irq);
@ -205,10 +202,6 @@ void hns_roce_free_cq(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq)
complete(&hr_cq->free);
wait_for_completion(&hr_cq->free);
spin_lock_irq(&cq_table->lock);
radix_tree_delete(&cq_table->tree, hr_cq->cqn);
spin_unlock_irq(&cq_table->lock);
hns_roce_table_put(hr_dev, &cq_table->table, hr_cq->cqn);
hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn, BITMAP_NO_RR);
}
@ -309,7 +302,6 @@ static void hns_roce_ib_free_cq_buf(struct hns_roce_dev *hr_dev,
struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
@ -321,6 +313,8 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
int vector = attr->comp_vector;
int cq_entries = attr->cqe;
int ret;
struct hns_roce_ucontext *context = rdma_udata_to_drv_context(
udata, struct hns_roce_ucontext, ibucontext);
if (cq_entries < 1 || cq_entries > hr_dev->caps.max_cqes) {
dev_err(dev, "Creat CQ failed. entries=%d, max=%d\n",
@ -339,7 +333,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
hr_cq->ib_cq.cqe = cq_entries - 1;
spin_lock_init(&hr_cq->lock);
if (context) {
if (udata) {
if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
dev_err(dev, "Failed to copy_from_udata.\n");
ret = -EFAULT;
@ -357,8 +351,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) &&
(udata->outlen >= sizeof(resp))) {
ret = hns_roce_db_map_user(to_hr_ucontext(context),
udata, ucmd.db_addr,
ret = hns_roce_db_map_user(context, udata, ucmd.db_addr,
&hr_cq->db);
if (ret) {
dev_err(dev, "cq record doorbell map failed!\n");
@ -369,7 +362,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
}
/* Get user space parameters */
uar = &to_hr_ucontext(context)->uar;
uar = &context->uar;
} else {
if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) {
ret = hns_roce_alloc_db(hr_dev, &hr_cq->db, 1);
@ -408,7 +401,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
* problems if tptr is set to zero here, so we initialze it in user
* space.
*/
if (!context && hr_cq->tptr_addr)
if (!udata && hr_cq->tptr_addr)
*hr_cq->tptr_addr = 0;
/* Get created cq handler and carry out event */
@ -416,7 +409,7 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
hr_cq->event = hns_roce_ib_cq_event;
hr_cq->cq_depth = cq_entries;
if (context) {
if (udata) {
resp.cqn = hr_cq->cqn;
ret = ib_copy_to_udata(udata, &resp, sizeof(resp));
if (ret)
@ -429,21 +422,20 @@ err_cqc:
hns_roce_free_cq(hr_dev, hr_cq);
err_dbmap:
if (context && (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) &&
if (udata && (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB) &&
(udata->outlen >= sizeof(resp)))
hns_roce_db_unmap_user(to_hr_ucontext(context),
&hr_cq->db);
hns_roce_db_unmap_user(context, &hr_cq->db);
err_mtt:
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
if (context)
if (udata)
ib_umem_release(hr_cq->umem);
else
hns_roce_ib_free_cq_buf(hr_dev, &hr_cq->hr_buf,
hr_cq->ib_cq.cqe);
err_db:
if (!context && (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB))
if (!udata && (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RECORD_DB))
hns_roce_free_db(hr_dev, &hr_cq->db);
err_cq:
@ -452,24 +444,27 @@ err_cq:
}
EXPORT_SYMBOL_GPL(hns_roce_ib_create_cq);
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq)
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device);
struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq);
int ret = 0;
if (hr_dev->hw->destroy_cq) {
ret = hr_dev->hw->destroy_cq(ib_cq);
ret = hr_dev->hw->destroy_cq(ib_cq, udata);
} else {
hns_roce_free_cq(hr_dev, hr_cq);
hns_roce_mtt_cleanup(hr_dev, &hr_cq->hr_buf.hr_mtt);
if (ib_cq->uobject) {
if (udata) {
ib_umem_release(hr_cq->umem);
if (hr_cq->db_en == 1)
hns_roce_db_unmap_user(
to_hr_ucontext(ib_cq->uobject->context),
rdma_udata_to_drv_context(
udata,
struct hns_roce_ucontext,
ibucontext),
&hr_cq->db);
} else {
/* Free the buff of stored cq */
@ -491,8 +486,7 @@ void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn)
struct device *dev = hr_dev->dev;
struct hns_roce_cq *cq;
cq = radix_tree_lookup(&hr_dev->cq_table.tree,
cqn & (hr_dev->caps.num_cqs - 1));
cq = xa_load(&hr_dev->cq_table.array, cqn & (hr_dev->caps.num_cqs - 1));
if (!cq) {
dev_warn(dev, "Completion event for bogus CQ 0x%08x\n", cqn);
return;
@ -509,8 +503,7 @@ void hns_roce_cq_event(struct hns_roce_dev *hr_dev, u32 cqn, int event_type)
struct device *dev = hr_dev->dev;
struct hns_roce_cq *cq;
cq = radix_tree_lookup(&cq_table->tree,
cqn & (hr_dev->caps.num_cqs - 1));
cq = xa_load(&cq_table->array, cqn & (hr_dev->caps.num_cqs - 1));
if (cq)
atomic_inc(&cq->refcount);
@ -530,8 +523,7 @@ int hns_roce_init_cq_table(struct hns_roce_dev *hr_dev)
{
struct hns_roce_cq_table *cq_table = &hr_dev->cq_table;
spin_lock_init(&cq_table->lock);
INIT_RADIX_TREE(&cq_table->tree, GFP_ATOMIC);
xa_init(&cq_table->array);
return hns_roce_bitmap_init(&cq_table->bitmap, hr_dev->caps.num_cqs,
hr_dev->caps.num_cqs - 1,

View File

@ -505,7 +505,6 @@ struct hns_roce_uar_table {
struct hns_roce_qp_table {
struct hns_roce_bitmap bitmap;
spinlock_t lock;
struct hns_roce_hem_table qp_table;
struct hns_roce_hem_table irrl_table;
struct hns_roce_hem_table trrl_table;
@ -515,8 +514,7 @@ struct hns_roce_qp_table {
struct hns_roce_cq_table {
struct hns_roce_bitmap bitmap;
spinlock_t lock;
struct radix_tree_root tree;
struct xarray array;
struct hns_roce_hem_table table;
};
@ -869,6 +867,11 @@ struct hns_roce_work {
int sub_type;
};
struct hns_roce_dfx_hw {
int (*query_cqc_info)(struct hns_roce_dev *hr_dev, u32 cqn,
int *buffer);
};
struct hns_roce_hw {
int (*reset)(struct hns_roce_dev *hr_dev, bool enable);
int (*cmq_init)(struct hns_roce_dev *hr_dev);
@ -907,7 +910,7 @@ struct hns_roce_hw {
int (*modify_qp)(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
int attr_mask, enum ib_qp_state cur_state,
enum ib_qp_state new_state);
int (*destroy_qp)(struct ib_qp *ibqp);
int (*destroy_qp)(struct ib_qp *ibqp, struct ib_udata *udata);
int (*qp_flow_control_init)(struct hns_roce_dev *hr_dev,
struct hns_roce_qp *hr_qp);
int (*post_send)(struct ib_qp *ibqp, const struct ib_send_wr *wr,
@ -916,8 +919,9 @@ struct hns_roce_hw {
const struct ib_recv_wr **bad_recv_wr);
int (*req_notify_cq)(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
int (*poll_cq)(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr);
int (*destroy_cq)(struct ib_cq *ibcq);
int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr,
struct ib_udata *udata);
int (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata);
int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period);
int (*init_eq)(struct hns_roce_dev *hr_dev);
void (*cleanup_eq)(struct hns_roce_dev *hr_dev);
@ -956,7 +960,7 @@ struct hns_roce_dev {
int irq[HNS_ROCE_MAX_IRQ_NUM];
u8 __iomem *reg_base;
struct hns_roce_caps caps;
struct radix_tree_root qp_table_tree;
struct xarray qp_table_xa;
unsigned char dev_addr[HNS_ROCE_MAX_PORTS][MAC_ADDR_OCTET_NUM];
u64 sys_image_guid;
@ -985,6 +989,7 @@ struct hns_roce_dev {
const struct hns_roce_hw *hw;
void *priv;
struct workqueue_struct *irq_workq;
const struct hns_roce_dfx_hw *dfx;
};
static inline struct hns_roce_dev *to_hr_dev(struct ib_device *ib_dev)
@ -1046,8 +1051,7 @@ static inline void hns_roce_write64_k(__le32 val[2], void __iomem *dest)
static inline struct hns_roce_qp
*__hns_roce_qp_lookup(struct hns_roce_dev *hr_dev, u32 qpn)
{
return radix_tree_lookup(&hr_dev->qp_table_tree,
qpn & (hr_dev->caps.num_qps - 1));
return xa_load(&hr_dev->qp_table_xa, qpn & (hr_dev->caps.num_qps - 1));
}
static inline void *hns_roce_buf_offset(struct hns_roce_buf *buf, int offset)
@ -1107,16 +1111,13 @@ void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
unsigned long obj, int cnt,
int rr);
struct ib_ah *hns_roce_create_ah(struct ib_pd *pd,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct ib_udata *udata);
int hns_roce_create_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr,
u32 flags, struct ib_udata *udata);
int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
int hns_roce_destroy_ah(struct ib_ah *ah, u32 flags);
void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags);
int hns_roce_alloc_pd(struct ib_pd *pd, struct ib_ucontext *context,
struct ib_udata *udata);
void hns_roce_dealloc_pd(struct ib_pd *pd);
int hns_roce_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
struct ib_mr *hns_roce_get_dma_mr(struct ib_pd *pd, int acc);
struct ib_mr *hns_roce_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
@ -1126,10 +1127,10 @@ int hns_roce_rereg_user_mr(struct ib_mr *mr, int flags, u64 start, u64 length,
u64 virt_addr, int mr_access_flags, struct ib_pd *pd,
struct ib_udata *udata);
struct ib_mr *hns_roce_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg);
u32 max_num_sg, struct ib_udata *udata);
int hns_roce_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
unsigned int *sg_offset);
int hns_roce_dereg_mr(struct ib_mr *ibmr);
int hns_roce_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
int hns_roce_hw2sw_mpt(struct hns_roce_dev *hr_dev,
struct hns_roce_cmd_mailbox *mailbox,
unsigned long mpt_index);
@ -1147,13 +1148,13 @@ int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
int hns_roce_ib_umem_write_mtt(struct hns_roce_dev *hr_dev,
struct hns_roce_mtt *mtt, struct ib_umem *umem);
struct ib_srq *hns_roce_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
int hns_roce_create_srq(struct ib_srq *srq,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
int hns_roce_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata);
int hns_roce_destroy_srq(struct ib_srq *ibsrq);
void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
struct ib_qp *hns_roce_create_qp(struct ib_pd *ib_pd,
struct ib_qp_init_attr *init_attr,
@ -1179,10 +1180,9 @@ int to_hr_qp_type(int qp_type);
struct ib_cq *hns_roce_ib_create_cq(struct ib_device *ib_dev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
struct ib_udata *udata);
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq);
int hns_roce_ib_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
void hns_roce_free_cq(struct hns_roce_dev *hr_dev, struct hns_roce_cq *hr_cq);
int hns_roce_db_map_user(struct hns_roce_ucontext *context,
@ -1202,4 +1202,6 @@ int hns_get_gid_index(struct hns_roce_dev *hr_dev, u8 port, int gid_index);
int hns_roce_init(struct hns_roce_dev *hr_dev);
void hns_roce_exit(struct hns_roce_dev *hr_dev);
int hns_roce_fill_res_entry(struct sk_buff *msg,
struct rdma_restrack_entry *res);
#endif /* _HNS_ROCE_DEVICE_H */

Some files were not shown because too many files have changed in this diff Show More