Commit graph

126 commits

Author SHA1 Message Date
Si-Wei Liu
2eacf4b5e3 vdpa/mlx5: implement .reset_map driver op
Since commit 6f5312f801 ("vdpa/mlx5: Add support for running with
virtio_vdpa"), mlx5_vdpa starts with preallocate 1:1 DMA MR at device
creation time. This 1:1 DMA MR will be implicitly destroyed while the
first .set_map call is invoked, in which case callers like vhost-vdpa
will start to set up custom mappings. When the .reset callback is
invoked, the custom mappings will be cleared and the 1:1 DMA MR will be
re-created.

In order to reduce excessive memory mapping cost in live migration, it
is desirable to decouple the vhost-vdpa IOTLB abstraction from the
virtio device life cycle, i.e. mappings can be kept around intact across
virtio device reset. Leverage the .reset_map callback, which is meant to
destroy the regular MR (including cvq mapping) on the given ASID and
recreate the initial DMA mapping. That way, the device .reset op runs
free from having to maintain and clean up memory mappings by itself.

Additionally, implement .compat_reset to cater for older userspace,
which may wish to see mapping to be cleared during reset.

Co-developed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1697880319-4937-7-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:20:00 -04:00
Eugenio Pérez
c695964474 mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
Offer this backend feature as mlx5 is compatible with it. It allows it
to do live migration with CVQ, dynamically switching between passthrough
and shadow virtqueue.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230703142514.363256-1-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-11-01 09:19:58 -04:00
Dragos Tatulea
5dc31bd245 vdpa/mlx5: Update cvq iotlb mapping on ASID change
For the following sequence:
- cvq group is in ASID 0
- .set_map(1, cvq_iotlb)
- .set_group_asid(cvq_group, 1)

... the cvq mapping from ASID 0 will be used. This is not always correct
behaviour.

This patch adds support for the above mentioned flow by saving the iotlb
on each .set_map and updating the cvq iotlb with it on a cvq group change.

Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-18-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:57 -04:00
Dragos Tatulea
03dd63c8fa vdpa/mlx5: Enable hw support for vq descriptor mapping
Vq descriptor mappings are supported in hardware by filling in an
additional mkey which contains the descriptor mappings to the hw vq.

A previous patch in this series added support for hw mkey (mr) creation
for ASID 1.

This patch fills in both the vq data and vq descriptor mkeys based on
group ASID mapping.

The feature is signaled to the vdpa core through the presence of the
.get_vq_desc_group op.

Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-16-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:57 -04:00
Dragos Tatulea
55229eab8c vdpa/mlx5: Introduce mr for vq descriptor
Introduce the vq descriptor group and mr per ASID. Until now
.set_map on ASID 1 was only updating the cvq iotlb. From now on it also
creates a mkey for it. The current patch doesn't use it but follow-up
patches will add hardware support for mapping the vq descriptors.

Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-15-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:57 -04:00
Dragos Tatulea
625e4b59a9 vdpa/mlx5: Improve mr update flow
The current flow for updating an mr works directly on mvdev->mr which
makes it cumbersome to handle multiple new mr structs.

This patch makes the flow more straightforward by having
mlx5_vdpa_create_mr return a new mr which will update the old mr (if
any). The old mr will be deleted and unlinked from mvdev. For the case
when the iotlb is empty (not NULL), the old mr will be cleared.

This change paves the way for adding mrs for different ASIDs.

The initialized bool is no longer needed as mr is now a pointer in the
mlx5_vdpa_dev struct which will be NULL when not initialized.

Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-14-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:57 -04:00
Dragos Tatulea
1b3ce9576f vdpa/mlx5: Allow creation/deletion of any given mr struct
This patch adapts the mr creation/deletion code to be able to work with
any given mr struct pointer. All the APIs are adapted to take an extra
parameter for the mr.

mlx5_vdpa_create/delete_mr doesn't need a ASID parameter anymore. The
check is done in the caller instead (mlx5_set_map).

This change is needed for a followup patch which will introduce an
additional mr for the vq descriptor data.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231018171456.1624030-12-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:56 -04:00
Dragos Tatulea
07a2da4024 vdpa/mlx5: Rename mr destroy functions
Make mlx5_destroy_mr symmetric to mlx5_create_mr.

Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-11-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:56 -04:00
Dragos Tatulea
512c0cdd80 vdpa/mlx5: Decouple cvq iotlb handling from hw mapping code
The handling of the cvq iotlb is currently coupled with the creation
and destruction of the hardware mkeys (mr).

This patch moves cvq iotlb handling into its own function and shifts it
to a scope that is not related to mr handling. As cvq handling is just a
prune_iotlb + dup_iotlb cycle, put it all in the same "update" function.
Finally, the destruction path is handled by directly pruning the iotlb.

After this move is done the ASID mr code can be collapsed into a single
function.

Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-8-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:56 -04:00
Dragos Tatulea
049cbeab86 vdpa/mlx5: Create helper function for dma mappings
Necessary for upcoming cvq separation from mr allocation.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-7-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-11-01 09:19:55 -04:00
Dragos Tatulea
abb0dcf993 vdpa/mlx5: Fix firmware error on creation of 1k VQs
A firmware error is triggered when configuring a 9k MTU on the PF after
switching to switchdev mode and then using a vdpa device with larger
(1k) rings:
mlx5_cmd_out_err: CREATE_GENERAL_OBJECT(0xa00) op_mod(0xd) failed, status bad resource(0x5), syndrome (0xf6db90), err(-22)

This is due to the fact that the hw VQ size parameters are computed
based on the umem_1/2/3_buffer_param_a/b capabilities and all
device capabilities are read only when the driver is moved to switchdev mode.

The problematic configuration flow looks like this:
1) Create VF
2) Unbind VF
3) Switch PF to switchdev mode.
4) Bind VF
5) Set PF MTU to 9k
6) create vDPA device
7) Start VM with vDPA device and 1K queue size

Note that setting the MTU before step 3) doesn't trigger this issue.

This patch reads the forementioned umem parameters at the latest point
possible before the VQs of the device are created.

v2:
- Allocate output with kmalloc to reduce stack frame size.
- Removed stable from cc.

Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20230831155702.1080754-1-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-10-18 11:29:41 -04:00
Dragos Tatulea
f8a3db47d9 vdpa/mlx5: Fix double release of debugfs entry
The error path in setup_driver deletes the debugfs entry but doesn't
clear the pointer. During .dev_del the invalid pointer will be released
again causing a crash.

This patch fixes the issue by always clearing the debugfs entry in
mlx5_vdpa_remove_debugfs. Also, stop removing the debugfs entry in
.dev_del op: the debugfs entry is already handled within the
setup_driver/teardown_driver scope.

Cc: stable@vger.kernel.org
Fixes: f0417e72ad ("vdpa/mlx5: Add and remove debugfs in setup/teardown driver")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Message-Id: <20230829174014.928189-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-10-18 11:29:29 -04:00
Jakub Kicinski
7ff57803d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/sfc/tc.c
  fa165e1949 ("sfc: don't unregister flow_indr if it was never registered")
  3bf969e88a ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18 12:44:56 -07:00
Dragos Tatulea
810b0cc1c2 vdpa/mlx5: Fix crash on shutdown for when no ndev exists
The ndev was accessed on shutdown without a check if it actually exists.
This triggered the crash pasted below.

Instead of doing the ndev check, delete the shutdown handler altogether.
The irqs will be released at the parent VF level (mlx5_core).

 BUG: kernel NULL pointer dereference, address: 0000000000000300
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP
 CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.5.0-rc2_for_upstream_min_debug_2023_07_17_15_05 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
 RSP: 0018:ffff8881003bfdc0 EFLAGS: 00010286
 RAX: ffff888103befba0 RBX: ffff888109d28008 RCX: 0000000000000017
 RDX: 0000000000000001 RSI: 0000000000000212 RDI: ffff888109d28000
 RBP: 0000000000000000 R08: 0000000d3a3a3882 R09: 0000000000000001
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff888109d28000
 R13: ffff888109d28080 R14: 00000000fee1dead R15: 0000000000000000
 FS:  00007f4969e0be40(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000300 CR3: 00000001051cd006 CR4: 0000000000370eb0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  ? __die+0x20/0x60
  ? page_fault_oops+0x14c/0x3c0
  ? exc_page_fault+0x75/0x140
  ? asm_exc_page_fault+0x22/0x30
  ? mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
  device_shutdown+0x13e/0x1e0
  kernel_restart+0x36/0x90
  __do_sys_reboot+0x141/0x210
  ? vfs_writev+0xcd/0x140
  ? handle_mm_fault+0x161/0x260
  ? do_writev+0x6b/0x110
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
 RIP: 0033:0x7f496990fb56
 RSP: 002b:00007fffc7bdde88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f496990fb56
 RDX: 0000000001234567 RSI: 0000000028121969 RDI: fffffffffee1dead
 RBP: 00007fffc7bde1d0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
 R13: 00007fffc7bddf10 R14: 0000000000000000 R15: 00007fffc7bde2b8
  </TASK>
 CR2: 0000000000000300
 ---[ end trace 0000000000000000 ]---

Fixes: bc9a2b3e68 ("vdpa/mlx5: Support interrupt bypassing")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20230803152648.199297-1-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-08-10 15:24:29 -04:00
Eugenio Pérez
ad03a0f44c vdpa/mlx5: Delete control vq iotlb in destroy_mr only when necessary
mlx5_vdpa_destroy_mr can be called from .set_map with data ASID after
the control virtqueue ASID iotlb has been populated. The control vq
iotlb must not be cleared, since it will not be populated again.

So call the ASID aware destroy function which makes sure that the
right vq resource is destroyed.

Fixes: 8fcd20c307 ("vdpa/mlx5: Support different address spaces for control and data")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Message-Id: <20230802171231.11001-5-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-08-10 15:24:29 -04:00
Dragos Tatulea
3fe0241933 vdpa/mlx5: Correct default number of queues when MQ is on
The standard specifies that the initial number of queues is the
default, which is 1 (1 tx, 1 rx).

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230727172354.68243-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-08-10 15:24:28 -04:00
Maher Sanalla
f14c1a14e6 net/mlx5: Allocate completion EQs dynamically
This commit enables the dynamic allocation of EQs at runtime, allowing
for more flexibility in managing completion EQs and reducing the memory
overhead of driver load. Whenever a CQ is created for a given vector
index, the driver will lookup to see if there is an already mapped
completion EQ for that vector, if so, utilize it. Otherwise, allocate a
new EQ on demand and then utilize it for the CQ completion events.

Add a protection lock to the EQ table to protect from concurrent EQ
creation attempts.

While at it, replace mlx5_vector2irqn()/mlx5_vector2eqn() with
mlx5_comp_eqn_get() and mlx5_comp_irqn_get() which will allocate an
EQ on demand if no EQ is found for the given vector.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Eli Cohen
bc9a2b3e68 vdpa/mlx5: Support interrupt bypassing
Add support for generation of interrupts from the device directly to the
VM to the VCPU thus avoiding the overhead on the host CPU.

When supported, the driver will attempt to allocate vectors for each
data virtqueue. If a vector for a virtqueue cannot be provided it will
use the QP mode where notifications go through the driver.

In addition, we add a shutdown callback to make sure allocated
interrupts are released in case of shutdown to allow clean shutdown.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Message-Id: <20230607190007.290505-1-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-27 10:47:09 -04:00
Dragos Tatulea
73790bdfba vdpa/mlx5: Fix hang when cvq commands are triggered during device unregister
Currently the vdpa device is unregistered after the workqueue that
processes vq commands is disabled. However, the device unregister
process can still send commands to the cvq (a vlan delete for example)
which leads to a hang because the handing workqueue has been disabled
and the command never finishes:

 [ 2263.095764] rcu: INFO: rcu_sched self-detected stall on CPU
 [ 2263.096307] rcu:        9-....: (5250 ticks this GP) idle=dac4/1/0x4000000000000000 softirq=111009/111009 fqs=2544
 [ 2263.097154] rcu:        (t=5251 jiffies g=393549 q=347 ncpus=10)
 [ 2263.097648] CPU: 9 PID: 94300 Comm: kworker/u20:2 Not tainted 6.3.0-rc6_for_upstream_min_debug_2023_04_14_00_02 #1
 [ 2263.098535] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [ 2263.099481] Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
 [ 2263.100143] RIP: 0010:virtnet_send_command+0x109/0x170
 [ 2263.100621] Code: 1d df f5 ff 85 c0 78 5c 48 8b 7b 08 e8 d0 c5 f5 ff 84 c0 75 11 eb 22 48 8b 7b 08 e8 01 b7 f5 ff 84 c0 75 15 f3 90 48 8b 7b 08 <48> 8d 74 24 04 e8 8d c5 f5 ff 48 85 c0 74 de 48 8b 83 f8 00 00 00
 [ 2263.102148] RSP: 0018:ffff888139cf36e8 EFLAGS: 00000246
 [ 2263.102624] RAX: 0000000000000000 RBX: ffff888166bea940 RCX: 0000000000000001
 [ 2263.103244] RDX: 0000000000000000 RSI: ffff888139cf36ec RDI: ffff888146763800
 [ 2263.103864] RBP: ffff888139cf3710 R08: ffff88810d201000 R09: 0000000000000000
 [ 2263.104473] R10: 0000000000000002 R11: 0000000000000003 R12: 0000000000000002
 [ 2263.105082] R13: 0000000000000002 R14: ffff888114528400 R15: ffff888166bea000
 [ 2263.105689] FS:  0000000000000000(0000) GS:ffff88852cc80000(0000) knlGS:0000000000000000
 [ 2263.106404] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 2263.106925] CR2: 00007f31f394b000 CR3: 000000010615b006 CR4: 0000000000370ea0
 [ 2263.107542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [ 2263.108163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [ 2263.108769] Call Trace:
 [ 2263.109059]  <TASK>
 [ 2263.109320]  ? check_preempt_wakeup+0x11f/0x230
 [ 2263.109750]  virtnet_vlan_rx_kill_vid+0x5a/0xa0
 [ 2263.110180]  vlan_vid_del+0x9c/0x170
 [ 2263.110546]  vlan_device_event+0x351/0x760 [8021q]
 [ 2263.111004]  raw_notifier_call_chain+0x41/0x60
 [ 2263.111426]  dev_close_many+0xcb/0x120
 [ 2263.111808]  unregister_netdevice_many_notify+0x130/0x770
 [ 2263.112297]  ? wq_worker_running+0xa/0x30
 [ 2263.112688]  unregister_netdevice_queue+0x89/0xc0
 [ 2263.113128]  unregister_netdev+0x18/0x20
 [ 2263.113512]  virtnet_remove+0x4f/0x230
 [ 2263.113885]  virtio_dev_remove+0x31/0x70
 [ 2263.114273]  device_release_driver_internal+0x18f/0x1f0
 [ 2263.114746]  bus_remove_device+0xc6/0x130
 [ 2263.115146]  device_del+0x173/0x3c0
 [ 2263.115502]  ? kernfs_find_ns+0x35/0xd0
 [ 2263.115895]  device_unregister+0x1a/0x60
 [ 2263.116279]  unregister_virtio_device+0x11/0x20
 [ 2263.116706]  device_release_driver_internal+0x18f/0x1f0
 [ 2263.117182]  bus_remove_device+0xc6/0x130
 [ 2263.117576]  device_del+0x173/0x3c0
 [ 2263.117929]  ? vdpa_dev_remove+0x20/0x20 [vdpa]
 [ 2263.118364]  device_unregister+0x1a/0x60
 [ 2263.118752]  mlx5_vdpa_dev_del+0x4c/0x80 [mlx5_vdpa]
 [ 2263.119232]  vdpa_match_remove+0x21/0x30 [vdpa]
 [ 2263.119663]  bus_for_each_dev+0x71/0xc0
 [ 2263.120054]  vdpa_mgmtdev_unregister+0x57/0x70 [vdpa]
 [ 2263.120520]  mlx5v_remove+0x12/0x20 [mlx5_vdpa]
 [ 2263.120953]  auxiliary_bus_remove+0x18/0x30
 [ 2263.121356]  device_release_driver_internal+0x18f/0x1f0
 [ 2263.121830]  bus_remove_device+0xc6/0x130
 [ 2263.122223]  device_del+0x173/0x3c0
 [ 2263.122581]  ? devl_param_driverinit_value_get+0x29/0x90
 [ 2263.123070]  mlx5_rescan_drivers_locked+0xc4/0x2d0 [mlx5_core]
 [ 2263.123633]  mlx5_unregister_device+0x54/0x80 [mlx5_core]
 [ 2263.124169]  mlx5_uninit_one+0x54/0x150 [mlx5_core]
 [ 2263.124656]  mlx5_sf_dev_remove+0x45/0x90 [mlx5_core]
 [ 2263.125153]  auxiliary_bus_remove+0x18/0x30
 [ 2263.125560]  device_release_driver_internal+0x18f/0x1f0
 [ 2263.126052]  bus_remove_device+0xc6/0x130
 [ 2263.126451]  device_del+0x173/0x3c0
 [ 2263.126815]  mlx5_sf_dev_remove+0x39/0xf0 [mlx5_core]
 [ 2263.127318]  mlx5_sf_dev_state_change_handler+0x178/0x270 [mlx5_core]
 [ 2263.127920]  blocking_notifier_call_chain+0x5a/0x80
 [ 2263.128379]  mlx5_vhca_state_work_handler+0x151/0x200 [mlx5_core]
 [ 2263.128951]  process_one_work+0x1bb/0x3c0
 [ 2263.129355]  ? process_one_work+0x3c0/0x3c0
 [ 2263.129766]  worker_thread+0x4d/0x3c0
 [ 2263.130140]  ? process_one_work+0x3c0/0x3c0
 [ 2263.130548]  kthread+0xb9/0xe0
 [ 2263.130895]  ? kthread_complete_and_exit+0x20/0x20
 [ 2263.131349]  ret_from_fork+0x1f/0x30
 [ 2263.131717]  </TASK>

The fix is to disable and destroy the workqueue after the device
unregister. It is expected that vhost will not trigger kicks after
the unregister. But even if it would, the wq is disabled already by
setting the pointer to NULL (done so in the referenced commit).

Fixes: ad6dc1daaf ("vdpa/mlx5: Avoid processing works if workqueue was destroyed")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20230516095800.3549932-1-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-06-08 15:43:08 -04:00
Eli Cohen
e9d67e59f1 vdpa/mlx5: Extend driver support for new features
Extend the possible list for features that can be supported by firmware.
Note that different versions of firmware may or may not support these
features. The driver is made aware of them by querying the firmware.

While doing this, improve the code so we use enum names instead of hard
coded numerical values.

The new features supported by the driver are the following:

VIRTIO_NET_F_MRG_RXBUF
VIRTIO_NET_F_HOST_ECN
VIRTIO_NET_F_GUEST_ECN
VIRTIO_NET_F_GUEST_TSO6
VIRTIO_NET_F_GUEST_TSO4

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230321112809.221432-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez Martin <eperezma@redhat.com>
2023-04-21 03:02:31 -04:00
Eli Cohen
791a1cb7b8 vdpa/mlx5: Make VIRTIO_NET_F_MRG_RXBUF off by default
Following patch adds driver support for VIRTIO_NET_F_MRG_RXBUF.

Current firmware versions show degradation in packet rate when using
MRG_RXBUF. Users who favor memory saving over packet rate could enable
this feature but we want to keep it off by default.

One can still enable it when creating the vdpa device using vdpa tool by
providing features that include it.

For example:
$ vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 device_features 0x300cb982b

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230321112809.221432-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
2023-04-21 03:02:31 -04:00
Eli Cohen
c384c2401e vdpa/mlx5: Avoid losing link state updates
Current code ignores link state updates if VIRTIO_NET_F_STATUS was not
negotiated. However, link state updates could be received before feature
negotiation was completed , therefore causing link state events to be
lost, possibly leaving the link state down.

Modify the code so link state notifier is registered after DRIVER_OK was
negotiated and carry the registration only if
VIRTIO_NET_F_STATUS was negotiated.  Unregister the notifier when the
device is reset.

Fixes: 033779a708 ("vdpa/mlx5: make MTU/STATUS presence conditional on feature bits")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230417110343.138319-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:02:29 -04:00
Eli Cohen
f0417e72ad vdpa/mlx5: Add and remove debugfs in setup/teardown driver
The right place to add the debugfs create is in
setup_driver() and remove it in teardown_driver().

Current code adds the debugfs when creating the device but resetting a
device will remove the debugfs subtree and subsequent set_driver will
not be able to create the files since the debugfs pointer is NULL.

Fixes: 2942210043 ("vdpa/mlx5: Add debugfs subtree")
Signed-off-by: Eli Cohen <elic@nvidia.com>

v3 -> v4:
Fix error flow in setup_driver()
Message-Id: <20230403114039.11102-1-elic@nvidia.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-04-04 11:08:30 -04:00
Si-Wei Liu
09e65ee905 vdpa/mlx5: should not activate virtq object when suspended
Otherwise the virtqueue object to instate could point to invalid address
that was unmapped from the MTT:

  mlx5_core 0000:41:04.2: mlx5_cmd_out_err:782:(pid 8321):
  CREATE_GENERAL_OBJECT(0xa00) op_mod(0xd) failed, status
  bad parameter(0x3), syndrome (0x5fa1c), err(-22)

Fixes: cae15c2ed8 ("vdpa/mlx5: Implement susupend virtqueue callback")
Cc: Eli Cohen <elic@nvidia.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>

Message-Id: <1676424640-11673-1-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-03-10 04:02:31 -05:00
Si-Wei Liu
deeacf35c9 vdpa/mlx5: support device features provisioning
This patch implements features provisioning for mlx5_vdpa.

1) Validate the provisioned features are a subset of the parent
    features.
2) Clearing features that are not wanted by userspace.

For example:

    # vdpa mgmtdev show
    pci/0000:41:04.2:
      supported_classes net
      max_supported_vqs 65
      dev_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

1) Provision vDPA device with all features derived from the parent

    # vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2
    # vdpa dev config show
    vdpa1: mac e4:11:c6:d3:45:f0 link up link_announce false max_vq_pairs 1 mtu 1500
      negotiated_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

2) Provision vDPA device with a subset of parent features

    # vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 device_features 0x300020000
    # vdpa dev config show
    vdpa1:
      negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Message-Id: <1675725124-7375-7-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-02-20 19:27:00 -05:00
Si-Wei Liu
033779a708 vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
The spec says:
    mtu only exists if VIRTIO_NET_F_MTU is set
    status only exists if VIRTIO_NET_F_STATUS is set

We should only present MTU and STATUS conditionally depending on
the feature bits.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Message-Id: <1675725124-7375-6-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-02-20 19:27:00 -05:00
Jason Wang
36871fb92b vdpa: mlx5: support per virtqueue dma device
This patch implements per virtqueue dma device for mlx5_vdpa. This is
needed for virtio_vdpa to work for CVQ which is backed by vringh but
not DMA. We simply advertise the vDPA device itself as the DMA device
for CVQ then DMA API can simply use PA so the identical mapping for
CVQ can still be used. Otherwise the identical (1:1) mapping won't
work when platform IOMMU is enabled since the IOVA is allocated on
demand which is not necessarily the PA.

This fixes the following crash when mlx5 vDPA device is bound to
virtio-vdpa with platform IOMMU enabled but not in passthrough mode:

BUG: unable to handle page fault for address: ff2fb3063deb1002
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1393001067 P4D 1393002067 PUD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 55 PID: 8923 Comm: kworker/u112:3 Kdump: loaded Not tainted 6.1.0+ #7
Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.5.4 12/17/2021
Workqueue: mlx5_vdpa_wq mlx5_cvq_kick_handler [mlx5_vdpa]
RIP: 0010:vringh_getdesc_iotlb+0x93/0x1d0 [vringh]
Code: 14 25 40 ef 01 00 83 82 c0 0a 00 00 01 48 2b 05 93 5a 1b ea 8b 4c 24 14 48 c1 f8 06 48 c1 e0 0c 48 03 05 90 5a 1b ea 48 01 c8 <0f> b7 00 83 aa c0 0a 00 00 01 65 ff 0d bc e4 41 3f 0f 84 05 01 00
RSP: 0018:ff46821ba664fdf8 EFLAGS: 00010282
RAX: ff2fb3063deb1002 RBX: 0000000000000a20 RCX: 0000000000000002
RDX: ff2fb318d2f94380 RSI: 0000000000000002 RDI: 0000000000000001
RBP: ff2fb3065e832410 R08: ff46821ba664fe00 R09: 0000000000000001
R10: 0000000000000000 R11: 000000000000000d R12: ff2fb3065e832488
R13: ff2fb3065e8324a8 R14: ff2fb3065e8324c8 R15: ff2fb3065e8324a8
FS:  0000000000000000(0000) GS:ff2fb3257fac0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ff2fb3063deb1002 CR3: 0000001392010006 CR4: 0000000000771ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
  mlx5_cvq_kick_handler+0x89/0x2b0 [mlx5_vdpa]
  process_one_work+0x1e2/0x3b0
  ? rescuer_thread+0x390/0x390
  worker_thread+0x50/0x3a0
  ? rescuer_thread+0x390/0x390
  kthread+0xd6/0x100
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Reviewed-by: Eli Cohen <elic@nvidia.com>
Tested-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230119061525.75068-6-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-02-20 19:26:58 -05:00
Eli Cohen
0a59975088 vdpa/mlx5: Add RX counters to debugfs
For each interface, either VLAN tagged or untagged, add two hardware
counters: one for unicast and another for multicast. The counters count
RX packets and bytes and can be read through debugfs:

$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/mcast/packets
$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/ucast/bytes

This feature is controlled via the config option
MLX5_VDPA_STEERING_DEBUG. It is off by default as it may have some
impact on performance.

includes a fixup By Yang Yingliang <yangyingliang@huawei.com>:

vdpa/mlx5: fix check wrong pointer in mlx5_vdpa_add_mac_vlan_rules()

The local variable 'rule' is not used anymore, fix return value
check after calling mlx5_add_flow_rules().

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-9-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Message-Id: <20230104074418.1737510-1-yangyingliang@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-02-20 19:26:54 -05:00
Eli Cohen
2942210043 vdpa/mlx5: Add debugfs subtree
Add debugfs subtree and expose flow table ID and TIR number. This
information can be used by external tools to do extended
troubleshooting.

The information can be retrieved like so:
$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/table_id
$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/tirn

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-8-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-02-20 19:26:54 -05:00
Eli Cohen
72c67e9b90 vdpa/mlx5: Move some definitions to a new header file
Move some definitions from mlx5_vnet.c to newly added header file
mlx5_vnet.h. We need these definitions for the following patches that
add debugfs tree to expose information vital for debug.

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-7-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-02-20 19:26:54 -05:00
Eli Cohen
38fc462f57 vdpa/mlx5: Avoid overwriting CVQ iotlb
When qemu uses different address spaces for data and control virtqueues,
the current code would overwrite the control virtqueue iotlb through the
dup_iotlb call. Fix this by referring to the address space identifier
and the group to asid mapping to determine which mapping needs to be
updated. We also move the address space logic from mlx5 net to core
directory.

Reported-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-6-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-12-28 05:28:09 -05:00
Eli Cohen
0dbc1b4ae0 vdpa/mlx5: Avoid using reslock in event_handler
event_handler runs under atomic context and may not acquire reslock. We
can still guarantee that the handler won't be called after suspend by
clearing nb_registered, unregistering the handler and flushing the
workqueue.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-5-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28 05:28:09 -05:00
Eli Cohen
1ab53760d3 vdpa/mlx5: Fix wrong mac address deletion
Delete the old MAC from the table and not the new one which is not there
yet.

Fixes: baf2ad3f6a ("vdpa/mlx5: Add RX MAC VLAN filter support")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-4-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28 05:28:09 -05:00
Eli Cohen
5aec804936 vdpa/mlx5: Return error on vlan ctrl commands if not supported
Check if VIRTIO_NET_F_CTRL_VLAN is negotiated and return error if
control VQ command is received.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-12-28 05:28:09 -05:00
Eli Cohen
a6ce72c0fb vdpa/mlx5: Fix rule forwarding VLAN to TIR
Set the VLAN id to the header values field instead of overwriting the
headers criteria field.

Before this fix, VLAN filtering would not really work and tagged packets
would be forwarded unfiltered to the TIR.

Fixes: baf2ad3f6a ("vdpa/mlx5: Add RX MAC VLAN filter support")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28 05:28:09 -05:00
Eli Cohen
a43ae8057c vdpa/mlx5: Fix MQ to support non power of two num queues
RQT objects require that a power of two value be configured for both
rqt_max_size and rqt_actual size.

For create_rqt, make sure to round up to the power of two the value of
given by the user who created the vdpa device and given by
ndev->rqt_size. The actual size is also rounded up to the power of two
using the current number of VQs given by ndev->cur_num_vqs.

Same goes with modify_rqt where we need to make sure act size is power
of two based on the new number of QPs.

Without this patch, attempt to create a device with non power of two QPs
would result in error from firmware.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220912125019.833708-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-09-27 18:32:45 -04:00
Eli Cohen
93e530d2a1 vdpa/mlx5: Fix possible uninitialized return value
Initialize err local variable to return -EAGAIN if the asid cannot be
found thus avoiding returning uninitialized value.

Fixes: 8fcd20c307 ("vdpa/mlx5: Support different address spaces for control and data")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220811134010.952291-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11 10:00:36 -04:00
Eli Cohen
8fcd20c307 vdpa/mlx5: Support different address spaces for control and data
Partition virtqueues to two different address spaces: one for control
virtqueue which is implemented in software, and one for data virtqueues.

Based-on: <20220526124338.36247-1-eperezma@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220714113927.85729-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11 04:26:08 -04:00
Eli Cohen
cae15c2ed8 vdpa/mlx5: Implement susupend virtqueue callback
Implement the suspend callback allowing to suspend the virtqueues so
they stop processing descriptors. This is required to allow to query a
consistent state of the virtqueue while live migration is taking place.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220714113927.85729-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11 04:26:08 -04:00
Xu Qiang
71aa95a69c vdpa/mlx5: Use eth_broadcast_addr() to assign broadcast address
Using eth_broadcast_addr() to assign broadcast address instead
of memset().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Message-Id: <20220704021405.64545-1-xuqiang36@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-08-11 04:26:07 -04:00
Eli Cohen
ace9252446 vdpa/mlx5: Initialize CVQ vringh only once
Currently, CVQ vringh is initialized inside setup_virtqueues() which is
called every time a memory update is done. This is undesirable since it
resets all the context of the vring, including the available and used
indices.

Move the initialization to mlx5_vdpa_set_status() when
VIRTIO_CONFIG_S_DRIVER_OK is set.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-06-24 02:49:47 -04:00
Eli Cohen
40f2f3e941 vdpa/mlx5: Update Control VQ callback information
The control VQ specific information is stored in the dedicated struct
mlx5_control_vq. When the callback is updated through
mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct.

Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com)
2022-06-24 02:49:47 -04:00
Dan Carpenter
f38b3c6a78 vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
These lines were supposed to be indented.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp71IYMP+QfuCJ8t@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08 08:56:03 -04:00
Dan Carpenter
f766c409fc vdpa/mlx5: fix error code for deleting vlan
Return success if we were able to delete a vlan.  The current code
always returns failure.

Fixes: baf2ad3f6a ("vdpa/mlx5: Add RX MAC VLAN filter support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp709f1g9NcMBCHg@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08 08:56:03 -04:00
Xiang wangx
2f72b2262d vdpa/mlx5: Fix syntax errors in comments
Delete the redundant word 'is'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Message-Id: <20220604143858.16073-1-wangxiang@cdjrlc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-08 08:56:03 -04:00
Eli Cohen
baf2ad3f6a vdpa/mlx5: Add RX MAC VLAN filter support
Support HW offloaded filtering of MAC/VLAN packets.
To allow that, we add a handler to handle VLAN configurations coming
through the control VQ. Two operations are supported.

1. Adding VLAN - in this case, an entry will be added to the RX flow
   table that will allow the combination of the MAC/VLAN to be
   forwarded to the TIR.
2. Removing VLAN - will remove the entry from the flow table,
   effectively blocking such packets from going through.

Currently the control VQ does not propagate changes to the MAC of the
VLAN device so we always use the MAC of the parent device.

Examples:
1. Create vlan device:
$ ip link add link ens1 name ens1.8 type vlan id 8

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220411122942.225717-4-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-01 02:16:38 -04:00
Eli Cohen
7becdd13b6 vdpa/mlx5: Remove flow counter from steering
The flow counter has been introduced in early versions of the driver to
aid in debugging. It is no longer needed and can harm performance.

Remove it.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220411122942.225717-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-01 02:16:38 -04:00
Gautam Dawar
db9adcbf42 vdpa: multiple address spaces support
This patches introduces the multiple address spaces support for vDPA
device. This idea is to identify a specific address space via an
dedicated identifier - ASID.

During vDPA device allocation, vDPA device driver needs to report the
number of address spaces supported by the device then the DMA mapping
ops of the vDPA device needs to be extended to support ASID.

This helps to isolate the environments for the virtqueue that will not
be assigned directly. E.g in the case of virtio-net, the control
virtqueue will not be assigned directly to guest.

As a start, simply claim 1 virtqueue groups and 1 address spaces for
all vDPA devices. And vhost-vDPA will simply reject the device with
more than 1 virtqueue groups or address spaces.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-7-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:27 -04:00
Gautam Dawar
d4821902e4 vdpa: introduce virtqueue groups
This patch introduces virtqueue groups to vDPA device. The virtqueue
group is the minimal set of virtqueues that must share an address
space. And the address space identifier could only be attached to
a specific virtqueue group.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-6-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:27 -04:00
Eli Cohen
759ae7f9bf vdpa/mlx5: Use readers/writers semaphore instead of mutex
Reading statistics could be done intensively and by several processes
concurrently. Reader's lock is sufficient in this case.

Change reslock from mutex to a rwsem.

Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-7-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:23 -04:00