Commit graph

329 commits

Author SHA1 Message Date
Stefano Garzarella
67f8f10c0b vdpa_sim: use max_iotlb_entries as a limit in vhost_iotlb_init
Commit bda324fd03 ("vdpasim: control virtqueue support") changed
the allocation of iotlbs calling vhost_iotlb_init() for each address
space, instead of vhost_iotlb_alloc().

With this change we forgot to use the limit we had introduced with
the `max_iotlb_entries` module parameter.

Fixes: bda324fd03 ("vdpasim: control virtqueue support")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220621151208.189959-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-08-11 04:26:07 -04:00
Stefano Garzarella
19cd4a5471 vdpa_sim_blk: set number of address spaces and virtqueue groups
Commit bda324fd03 ("vdpasim: control virtqueue support") added two
new fields (nas, ngroups) to vdpasim_dev_attr, but we forgot to
initialize them for vdpa_sim_blk.

When creating a new vdpa_sim_blk device this causes the kernel
to panic in this way:
    $ vdpa dev add mgmtdev vdpasim_blk name blk0
    BUG: kernel NULL pointer dereference, address: 0000000000000030
    ...
    RIP: 0010:vhost_iotlb_add_range_ctx+0x41/0x220 [vhost_iotlb]
    ...
    Call Trace:
     <TASK>
     vhost_iotlb_add_range+0x11/0x800 [vhost_iotlb]
     vdpasim_map_range+0x91/0xd0 [vdpa_sim]
     vdpasim_alloc_coherent+0x56/0x90 [vdpa_sim]
     ...

This happens because vdpasim->iommu[0] is not initialized when
dev_attr.nas is 0.

Let's fix this issue by initializing both (nas, ngroups) to 1 for
vdpa_sim_blk.

Fixes: bda324fd03 ("vdpasim: control virtqueue support")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220621151323.190431-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-08-11 04:26:07 -04:00
Stefano Garzarella
8472019eec vdpa_sim_blk: call vringh_complete_iotlb() also in the error path
Call vringh_complete_iotlb() even when we encounter a serious error
that prevents us from writing the status in the "in" header
(e.g. the header length is incorrect, etc.).

The guest is misbehaving, so maybe the ring is in a bad state, but
let's avoid making things worse.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220630153221.83371-4-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11 04:26:07 -04:00
Stefano Garzarella
9c4df0900f vdpa_sim_blk: limit the number of request handled per batch
Limit the number of requests (4 per queue as for vdpa_sim_net) handled
in a batch to prevent the worker from using the CPU for too long.

Suggested-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220630153221.83371-3-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-08-11 04:26:07 -04:00
Stefano Garzarella
42a84c09ef vdpa_sim_blk: use dev_dbg() to print errors
Use dev_dbg() instead of dev_err()/dev_warn() to avoid flooding the
host with prints, when the guest driver is misbehaving.
In this way, prints can be dynamically enabled when the vDPA block
simulator is used to validate a driver.

Suggested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220630153221.83371-2-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-11 04:26:07 -04:00
Parav Pandit
0e0348ac3f vduse: Tie vduse mgmtdev and its device
vduse devices are not backed by any real devices such as PCI. Hence it
doesn't have any parent device linked to it.

Kernel driver model in [1] suggests to avoid an empty device
release callback.

Hence tie the mgmtdevice object's life cycle to an allocate dummy struct
device instead of static one.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284

Signed-off-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20220613195223.473966-1-parav@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-24 02:49:48 -04:00
Eli Cohen
ace9252446 vdpa/mlx5: Initialize CVQ vringh only once
Currently, CVQ vringh is initialized inside setup_virtqueues() which is
called every time a memory update is done. This is undesirable since it
resets all the context of the vring, including the available and used
indices.

Move the initialization to mlx5_vdpa_set_status() when
VIRTIO_CONFIG_S_DRIVER_OK is set.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-06-24 02:49:47 -04:00
Eli Cohen
40f2f3e941 vdpa/mlx5: Update Control VQ callback information
The control VQ specific information is stored in the dedicated struct
mlx5_control_vq. When the callback is updated through
mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct.

Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com)
2022-06-24 02:49:47 -04:00
Xie Yongji
b27ee76c74 vduse: Fix NULL pointer dereference on sysfs access
The control device has no drvdata. So we will get a
NULL pointer dereference when accessing control
device's msg_timeout attribute via sysfs:

[ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 00000000000000f8
[ 132.850619][ T3644] RIP: 0010:msg_timeout_show (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
[ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
[ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
[ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088)
[ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
[ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
[ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
[ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
[ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1))
[ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
[ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
[ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 fs/kernel_read_file.c:186)
[ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
[ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189)
[ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
[ 132.881847][ T3644] entry_INT80_compat (arch/x86/entry/entry_64_compat.S:419)

To fix it, don't create the unneeded attribute for
control device anymore.

Fixes: c8a6153b6c ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220426073656.229-1-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-08 08:56:03 -04:00
Dan Carpenter
f38b3c6a78 vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
These lines were supposed to be indented.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp71IYMP+QfuCJ8t@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08 08:56:03 -04:00
Dan Carpenter
f766c409fc vdpa/mlx5: fix error code for deleting vlan
Return success if we were able to delete a vlan.  The current code
always returns failure.

Fixes: baf2ad3f6a ("vdpa/mlx5: Add RX MAC VLAN filter support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp709f1g9NcMBCHg@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08 08:56:03 -04:00
Xiang wangx
2f72b2262d vdpa/mlx5: Fix syntax errors in comments
Delete the redundant word 'is'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Message-Id: <20220604143858.16073-1-wangxiang@cdjrlc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-08 08:56:03 -04:00
Linus Torvalds
500a434fc5 Driver core changes for 5.19-rc1
Here is the set of driver core changes for 5.19-rc1.
 
 Note, I'm not really happy with this pull request as-is, see below for
 details, but overall this is all good for everything but a small set of
 systems, which we have a fix for already.
 
 Lots of tiny driver core changes and cleanups happened this cycle,
 but the two major things were:
 
 	- firmware_loader reorganization and additions including the
 	  ability to have XZ compressed firmware images and the ability
 	  for userspace to initiate the firmware load when it needs to,
 	  instead of being always initiated by the kernel. FPGA devices
 	  specifically want this ability to have their firmware changed
 	  over the lifetime of the system boot, and this allows them to
 	  work without having to come up with yet-another-custom-uapi
 	  interface for loading firmware for them.
 	- physical location support added to sysfs so that devices that
 	  know this information, can tell userspace where they are
 	  located in a common way.  Some ACPI devices already support
 	  this today, and more bus types should support this in the
 	  future.
 
 Smaller changes included:
 	- driver_override api cleanups and fixes
 	- error path cleanups and fixes
 	- get_abi script fixes
 	- deferred probe timeout changes.
 
 It's that last change that I'm the most worried about.  It has been
 reported to cause boot problems for a number of systems, and I have a
 tested patch series that resolves this issue.  But I didn't get it
 merged into my tree before 5.18-final came out, so it has not gotten any
 linux-next testing.
 
 I'll send the fixup patches (there are 2) as a follow-on series to this
 pull request if you want to take them directly, _OR_ I can just revert
 the probe timeout changes and they can wait for the next -rc1 merge
 cycle.  Given that the fixes are tested, and pretty simple, I'm leaning
 toward that choice.  Sorry this all came at the end of the merge window,
 I should have resolved this all 2 weeks ago, that's my fault as it was
 in the middle of some travel for me.
 
 All have been tested in linux-next for weeks, with no reported issues
 other than the above-mentioned boot time outs.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
 GkjLXBGNWOPBa5cU52qf
 =yEi/
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of driver core changes for 5.19-rc1.

  Lots of tiny driver core changes and cleanups happened this cycle, but
  the two major things are:

   - firmware_loader reorganization and additions including the ability
     to have XZ compressed firmware images and the ability for userspace
     to initiate the firmware load when it needs to, instead of being
     always initiated by the kernel. FPGA devices specifically want this
     ability to have their firmware changed over the lifetime of the
     system boot, and this allows them to work without having to come up
     with yet-another-custom-uapi interface for loading firmware for
     them.

   - physical location support added to sysfs so that devices that know
     this information, can tell userspace where they are located in a
     common way. Some ACPI devices already support this today, and more
     bus types should support this in the future.

  Smaller changes include:

   - driver_override api cleanups and fixes

   - error path cleanups and fixes

   - get_abi script fixes

   - deferred probe timeout changes.

  It's that last change that I'm the most worried about. It has been
  reported to cause boot problems for a number of systems, and I have a
  tested patch series that resolves this issue. But I didn't get it
  merged into my tree before 5.18-final came out, so it has not gotten
  any linux-next testing.

  I'll send the fixup patches (there are 2) as a follow-on series to this
  pull request.

  All have been tested in linux-next for weeks, with no reported issues
  other than the above-mentioned boot time-outs"

* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  driver core: fix deadlock in __device_attach
  kernfs: Separate kernfs_pr_cont_buf and rename_lock.
  topology: Remove unused cpu_cluster_mask()
  driver core: Extend deferred probe timeout on driver registration
  MAINTAINERS: add Russ Weight as a firmware loader maintainer
  driver: base: fix UAF when driver_attach failed
  test_firmware: fix end of loop test in upload_read_show()
  driver core: location: Add "back" as a possible output for panel
  driver core: location: Free struct acpi_pld_info *pld
  driver core: Add "*" wildcard support to driver_async_probe cmdline param
  driver core: location: Check for allocations failure
  arch_topology: Trace the update thermal pressure
  kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
  export: fix string handling of namespace in EXPORT_SYMBOL_NS
  rpmsg: use local 'dev' variable
  rpmsg: Fix calling device_lock() on non-initialized device
  firmware_loader: describe 'module' parameter of firmware_upload_register()
  firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
  firmware_loader: Fix configs for sysfs split
  selftests: firmware: Add firmware upload selftests
  ...
2022-06-03 11:48:47 -07:00
Jason Wang
bd8bb9aed5 vdpa: ifcvf: set pci driver data in probe
We should set the pci driver data in probe instead of the vdpa device
adding callback. Otherwise if no vDPA device is created we will lose
the pointer to the management device.

Fixes: 6b5df347c6 ("vDPA/ifcvf: implement management netlink framework for ifcvf")
Tested-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220524055557.1938-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-01 02:16:38 -04:00
Eli Cohen
baf2ad3f6a vdpa/mlx5: Add RX MAC VLAN filter support
Support HW offloaded filtering of MAC/VLAN packets.
To allow that, we add a handler to handle VLAN configurations coming
through the control VQ. Two operations are supported.

1. Adding VLAN - in this case, an entry will be added to the RX flow
   table that will allow the combination of the MAC/VLAN to be
   forwarded to the TIR.
2. Removing VLAN - will remove the entry from the flow table,
   effectively blocking such packets from going through.

Currently the control VQ does not propagate changes to the MAC of the
VLAN device so we always use the MAC of the parent device.

Examples:
1. Create vlan device:
$ ip link add link ens1 name ens1.8 type vlan id 8

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220411122942.225717-4-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-01 02:16:38 -04:00
Eli Cohen
7becdd13b6 vdpa/mlx5: Remove flow counter from steering
The flow counter has been introduced in early versions of the driver to
aid in debugging. It is no longer needed and can harm performance.

Remove it.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220411122942.225717-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-01 02:16:38 -04:00
Dan Carpenter
1f97b97850 vdpasim: Off by one in vdpasim_set_group_asid()
The > comparison needs to be >= to prevent an out of bounds access
of the vdpasim->iommu[] array.  The vdpasim->iommu[] is allocated in
vdpasim_create() and it has vdpasim->dev_attr.nas elements.

Fixes: 87e5afeac247 ("vdpasim: control virtqueue support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <YotGQU1q224RKZR8@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:10 -04:00
Eugenio Pérez
2424369738 vdpasim: allow to enable a vq repeatedly
Code must be resilient to enable a queue many times.

At the moment the queue is resetting so it's definitely not the expected
behavior.

v2: set vq->ready = 0 at disable.

Fixes: 2c53d0f64c ("vdpasim: vDPA device simulator")
Cc: stable@vger.kernel.org
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220519145919.772896-1-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31 12:45:09 -04:00
Zhu Lingshan
ac33f84ba5 vDPA/ifcvf: fix uninitialized config_vector warning
Static checkers are not informed that config_vector is controlled
by vf->msix_vector_status, which can only be
MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG
and MSIX_VECTOR_DEV_SHARED.

This commit uses an "if...elseif...else" code block to tell the
checkers that it is a complete set, and config_vector can be
initialized anyway

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20220424072806.1083189-1-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:09 -04:00
Cindy Lu
ffbda8e9df vdpa/vp_vdpa : add vdpa tool support in vp_vdpa
this patch is to add the support for vdpa tool in vp_vdpa
here is the example steps

modprobe vp_vdpa
modprobe vhost_vdpa
echo 0000:00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind
echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id

vdpa dev add name vdpa1 mgmtdev pci/0000:00:06.0

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20220429091030.547434-1-lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31 12:45:09 -04:00
Gautam Dawar
bda324fd03 vdpasim: control virtqueue support
This patch introduces the control virtqueue support for vDPA
simulator. This is a requirement for supporting advanced features like
multiqueue.

A requirement for control virtqueue is to isolate its memory access
from the rx/tx virtqueues. This is because when using vDPA device
for VM, the control virqueue is not directly assigned to VM. Userspace
(Qemu) will present a shadow control virtqueue to control for
recording the device states.

The isolation is done via the virtqueue groups and ASID support in
vDPA through vhost-vdpa. The simulator is extended to have:

1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue)
2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1
   contains CVQ
3) two address spaces and the simulator simply implements the address
   spaces by mapping it 1:1 to IOTLB.

For the VM use cases, userspace(Qemu) may set AS 0 to group 0 and AS 1
to group 1. So we have:

1) The IOTLB for virtqueue group 0 contains the mappings of guest, so
   RX and TX can be assigned to guest directly.
2) The IOTLB for virtqueue group 1 contains the mappings of CVQ which
   is the buffers that allocated and managed by VMM only. So CVQ of
   vhost-vdpa is visible to VMM only. And Guest can not access the CVQ
   of vhost-vdpa.

For the other use cases, since AS 0 is associated to all virtqueue
groups by default. All virtqueues share the same mapping by default.

To demonstrate the function, VIRITO_NET_F_CTRL_MACADDR is
implemented in the simulator for the driver to set mac address.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-20-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:08 -04:00
Gautam Dawar
cfe2268929 vdpa_sim: filter destination mac address
This patch implements a simple unicast filter for vDPA simulator.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-19-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:45:08 -04:00
Gautam Dawar
ec103d983b vdpa_sim: factor out buffer completion logic
Wrap up common buffer completion logic in to vdpasim_net_complete

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-18-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:33 -04:00
Gautam Dawar
05b6976212 vdpa_sim: advertise VIRTIO_NET_F_MTU
We've already reported maximum mtu via config space, so let's
advertise the feature.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-17-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:32 -04:00
Gautam Dawar
db9adcbf42 vdpa: multiple address spaces support
This patches introduces the multiple address spaces support for vDPA
device. This idea is to identify a specific address space via an
dedicated identifier - ASID.

During vDPA device allocation, vDPA device driver needs to report the
number of address spaces supported by the device then the DMA mapping
ops of the vDPA device needs to be extended to support ASID.

This helps to isolate the environments for the virtqueue that will not
be assigned directly. E.g in the case of virtio-net, the control
virtqueue will not be assigned directly to guest.

As a start, simply claim 1 virtqueue groups and 1 address spaces for
all vDPA devices. And vhost-vDPA will simply reject the device with
more than 1 virtqueue groups or address spaces.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-7-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:27 -04:00
Gautam Dawar
d4821902e4 vdpa: introduce virtqueue groups
This patch introduces virtqueue groups to vDPA device. The virtqueue
group is the minimal set of virtqueues that must share an address
space. And the address space identifier could only be attached to
a specific virtqueue group.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-6-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:27 -04:00
Eli Cohen
759ae7f9bf vdpa/mlx5: Use readers/writers semaphore instead of mutex
Reading statistics could be done intensively and by several processes
concurrently. Reader's lock is sufficient in this case.

Change reslock from mutex to a rwsem.

Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-7-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:23 -04:00
Eli Cohen
1892a3d425 vdpa/mlx5: Add support for reading descriptor statistics
Implement the get_vq_stats calback of vdpa_config_ops to return the
statistics for a virtqueue.

The statistics are provided as vendor specific statistics where the
driver provides a pair of attribute name and attribute value.

Currently supported are received descriptors and completed descriptors.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-6-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:22 -04:00
Eli Cohen
a6a51adc6e net/vdpa: Use readers/writers semaphore instead of cf_mutex
Replace cf_mutex with rw_semaphore to reflect the fact that some calls
could be called concurrently but can suffice with read lock.

Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-5-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:22 -04:00
Eli Cohen
0078ad905d net/vdpa: Use readers/writers semaphore instead of vdpa_dev_mutex
Use rw_semaphore instead of mutex to control access to vdpa devices.
This can be especially beneficial in case processes poll on statistics
information.

Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-4-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:21 -04:00
Eli Cohen
13b00b1356 vdpa: Add support for querying vendor statistics
Allows to read vendor statistics of a vdpa device. The specific
statistics data are received from the upstream driver in the form of an
(attribute name, attribute value) pairs.

An example of statistics for mlx5_vdpa device are:

received_desc - number of descriptors received by the virtqueue
completed_desc - number of descriptors completed by the virtqueue

A descriptor using indirect buffers is still counted as 1. In addition,
N chained descriptors are counted correctly N times as one would expect.

A new callback was added to vdpa_config_ops which provides the means for
the vdpa driver to return statistics results.

The interface allows for reading all the supported virtqueues, including
the control virtqueue if it exists.

Below are some examples taken from mlx5_vdpa which are introduced in the
following patch:

1. Read statistics for the virtqueue at index 1

$ vdpa dev vstats show vdpa-a qidx 1
vdpa-a:
queue_type tx queue_index 1 received_desc 3844836 completed_desc 3844836

2. Read statistics for the virtqueue at index 32
$ vdpa dev vstats show vdpa-a qidx 32
vdpa-a:
queue_type control_vq queue_index 32 received_desc 62 completed_desc 62

3. Read statisitics for the virtqueue at index 0 with json output
$ vdpa -j dev vstats show vdpa-a qidx 0
{"vstats":{"vdpa-a":{
"queue_type":"rx","queue_index":0,"name":"received_desc","value":417776,\
 "name":"completed_desc","value":417548}}}

4. Read statistics for the virtqueue at index 0 with preety json output
$ vdpa -jp dev vstats show vdpa-a qidx 0
{
    "vstats": {
        "vdpa-a": {

            "queue_type": "rx",
            "queue_index": 0,
            "name": "received_desc",
            "value": 417776,
            "name": "completed_desc",
            "value": 417548
        }
    }
}

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:20 -04:00
Eli Cohen
7a6691f1f8 vdpa: Fix error logic in vdpa_nl_cmd_dev_get_doit
In vdpa_nl_cmd_dev_get_doit(), if the call to genlmsg_reply() fails we
must not call nlmsg_free() since this is done inside genlmsg_reply().

Fix it.

Fixes: bc0d90ee02 ("vdpa: Enable user to query vdpa device info")
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31 12:44:20 -04:00
Eli Cohen
acde392949 vdpa/mlx5: Use consistent RQT size
The current code evaluates RQT size based on the configured number of
virtqueues. This can raise an issue in the following scenario:

Assume MQ was negotiated.
1. mlx5_vdpa_set_map() gets called.
2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
   than the configured max VQs.
3. A second set_map gets called, but now a smaller number of VQs is used
   to evaluate the size of the RQT.
4. handle_ctrl_mq() is called with a value larger than what the RQT can
   hold. This will emit errors and the driver state is compromised.

To fix this, we use a new field in struct mlx5_vdpa_net to hold the
required number of entries in the RQT. This value is evaluated in
mlx5_vdpa_set_driver_features() where we have the negotiated features
all set up.

In addition to that, we take into consideration the max capability of RQT
entries early when the device is added so we don't need to take consider
it when creating the RQT.

Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
max_vas / 2 and make the code clearer.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-18 12:31:31 -04:00
Krzysztof Kozlowski
240bf4e665 vdpa: Use helper for safer setting of driver_override
Use a helper to set driver_override to the reduce amount of duplicated
code.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-9-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:54 +02:00
Linus Torvalds
3e732ebf73 virtio: fixes, cleanups
A couple of mlx5 fixes related to cvq
 A couple of reverts dropping useless code (code that used it got reverted
 earlier)
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmJKyK4PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpE40IAIvsq1WDsXfg/649lp0wMO5MPEzAvCfgQpub
 hCti2lKY4K9Ykqh+uTipDTxpRmg3wt/4QlQ6NYZVmn8jQ1JsNAIFCkc7ptG1xwQI
 0RypofGsFQwbar2vKh8jEzYUsIbwLKeLNlib113fH9zyDRw7KjRppUgC2cjMoYv6
 JhlOFZAApzFknXC/I8fmJyzVYpX+v2GB8RX2wWiNXHQZ9hkdKd1IvJqRiuRuSY3L
 haubMBFG8hrdW8qQ40ngvt+GnrxKUustmPM6x9DCJYaMkKRkXC6NxueUcIk6qWiw
 UJz5IOXURdJ3i1qxmkMGSnJiYbFpmwVGlqRekN3qNVpXNovfbg0=
 =l15S
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Fixes and cleanups:

   - A couple of mlx5 fixes related to cvq

   - A couple of reverts dropping useless code (code that used it got
     reverted earlier)"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa: mlx5: synchronize driver status with CVQ
  vdpa: mlx5: prevent cvq work from hogging CPU
  Revert "virtio_config: introduce a new .enable_cbs method"
  Revert "virtio: use virtio_device_ready() in virtio_device_restore()"
2022-04-05 10:40:52 -07:00
Linus Torvalds
f4f5d7cfb2 virtio: features, fixes
vdpa generic device type support
 More virtio hardening for broken devices
 On the same theme, revert some virtio hotplug hardening patches -
 they were misusing some interrupt flags, will have to be reverted.
 RSS support in virtio-net
 max device MTU support in mlx5 vdpa
 akcipher support in virtio-crypto
 shared IRQ support in ifcvf vdpa
 a minor performance improvement in vhost
 Enable virtio mem for ARM64
 beginnings of advance dma support
 
 Cleanups, fixes all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmJEEk8PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpcpUH+wRIXrzveirsN4MYH0aAeF+SLYaA5pgtO4U7
 da22HYtwlMrDRMxwjepKBOTSu89uP5LEK7IKWPj9VRZg+GLz/Cdfc6BZl/fND3qt
 0yFpwG1ZLsBK1+WHbysWQneEbPjXqQdbh9eVkKVGcNkRuLJJwXbmF95dyQEJwzeh
 dPHssDcEC2tRgHAMrLyjLPKwMCRwcgtdPoB1ZC+lqTs3G6lktAfREEvqVfJOVe1b
 mQcgdAJ+aRM0J/w/PYTmxFOZPYAmQ6hmAQ8Hf7nkjfRWQ4EM91W0cKAoZPc/+7KN
 ZfFKVL28GEZLJqnx+3xijwCR2gwVHsRYZHaTjfGgQUWZPoB3Vrc=
 =ynRx
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - vdpa generic device type support

 - more virtio hardening for broken devices (but on the same theme,
   revert some virtio hotplug hardening patches - they were misusing
   some interrupt flags and had to be reverted)

 - RSS support in virtio-net

 - max device MTU support in mlx5 vdpa

 - akcipher support in virtio-crypto

 - shared IRQ support in ifcvf vdpa

 - a minor performance improvement in vhost

 - enable virtio mem for ARM64

 - beginnings of advance dma support

 - cleanups, fixes all over the place

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (33 commits)
  vdpa/mlx5: Avoid processing works if workqueue was destroyed
  vhost: handle error while adding split ranges to iotlb
  vdpa: support exposing the count of vqs to userspace
  vdpa: change the type of nvqs to u32
  vdpa: support exposing the config size to userspace
  vdpa/mlx5: re-create forwarding rules after mac modified
  virtio: pci: check bar values read from virtio config space
  Revert "virtio_pci: harden MSI-X interrupts"
  Revert "virtio-pci: harden INTX interrupts"
  drivers/net/virtio_net: Added RSS hash report control.
  drivers/net/virtio_net: Added RSS hash report.
  drivers/net/virtio_net: Added basic RSS support.
  drivers/net/virtio_net: Fixed padded vheader to use v1 with hash.
  virtio: use virtio_device_ready() in virtio_device_restore()
  tools/virtio: compile with -pthread
  tools/virtio: fix after premapped buf support
  virtio_ring: remove flags check for unmap packed indirect desc
  virtio_ring: remove flags check for unmap split indirect desc
  virtio_ring: rename vring_unmap_state_packed() to vring_unmap_extra_packed()
  net/mlx5: Add support for configuring max device MTU
  ...
2022-03-31 13:57:15 -07:00
Jason Wang
1c80cf031e vdpa: mlx5: synchronize driver status with CVQ
Currently, CVQ doesn't have any synchronization with the driver
status. Then CVQ emulation code run in the middle of:

1) device reset
2) device status changed
3) map updating

The will lead several unexpected issue like trying to execute CVQ
command after the driver has been teared down.

Fixing this by using reslock to synchronize CVQ emulation code with
the driver status changing:

- protect the whole device reset, status changing and set_map()
  updating with reslock
- protect the CVQ handler with the reslock and check
  VIRTIO_CONFIG_S_DRIVER_OK in the CVQ handler

This will guarantee that:

1) CVQ handler won't work if VIRTIO_CONFIG_S_DRIVER_OK is not set
2) CVQ handler will see a consistent state of the driver instead of
   the partial one when it is running in the middle of the
   teardown_driver() or setup_driver().

Cc: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220329042109.4029-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
2022-03-30 04:18:14 -04:00
Jason Wang
55ebf0d60e vdpa: mlx5: prevent cvq work from hogging CPU
A userspace triggerable infinite loop could happen in
mlx5_cvq_kick_handler() if userspace keeps sending a huge amount of
cvq requests.

Fixing this by introducing a quota and re-queue the work if we're out
of the budget (currently the implicit budget is one) . While at it,
using a per device work struct to avoid on demand memory allocation
for cvq.

Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220329042109.4029-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
2022-03-30 04:18:14 -04:00
Eli Cohen
ad6dc1daaf vdpa/mlx5: Avoid processing works if workqueue was destroyed
If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be
destroyed. However, vhost might still have reference to the kick
function and might attempt to push new works. This could lead to null
pointer dereference.

To fix this, set mvdev->wq to NULL just before destroying and verify
that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to
push a new work.

Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:54:30 -04:00
Longpeng
81d46d6931 vdpa: change the type of nvqs to u32
Change vdpa_device.nvqs and vhost_vdpa.nvqs to use u32

Signed-off-by: Longpeng <longpeng2@huawei.com>
Link: https://lore.kernel.org/r/20220315032553.455-3-longpeng2@huawei.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Longpeng &lt;<a href="mailto:longpeng2@huawei.com" target="_blank">longpeng2@huawei.com</a>&gt;<br></blockquote><div><br></div><div>Acked-by: Jason Wang &lt;<a href="mailto:jasowang@redhat.com">jasowang@redhat.com</a>&gt;</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-03-28 16:53:00 -04:00
Michael Qiu
f1781bedea vdpa/mlx5: re-create forwarding rules after mac modified
When MAC Address has been modified in guest, we only re-add the
Mac to mpfs, it is not enough, because the guest network will
not work correctly: the reply package from outside will go
straight away to the host VF net interface.

This patch recreate the flow rules, and make it work correctly.

Signed-off-by: Michael Qiu <qiudayu@archeros.com>
Link: https://lore.kernel.org/r/1648446492-17614-1-git-send-email-08005325@163.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
2022-03-28 16:52:59 -04:00
Eli Cohen
1e00e821e4 net/mlx5: Add support for configuring max device MTU
Allow an admin creating a vdpa device to specify the max MTU for the
net device.

For example, to create a device with max MTU of 1000, the following
command can be used:

$ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 mtu 1000

This configuration mechanism assumes that vdpa is the sole real user of
the function. mlx5_core could theoretically change the mtu of the
function using the ip command on the mlx5_core net device but this
should not be done.

Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220221121927.194728-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-28 16:52:58 -04:00
Zhu Lingshan
6f84622db3 vDPA/ifcvf: cacheline alignment for ifcvf_hw
This commit introduces a new cacheline aligned layout for
ifcvf_hw.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Link: https://lore.kernel.org/r/20220222115428.998334-6-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:57 -04:00
Zhu Lingshan
9b3e814834 vDPA/ifcvf: implement shared IRQ feature
On some platforms/devices, there may not be enough MSI vectors
allocated for the virtqueues and config changes. In such a case,
the interrupt sources(virtqueues, config changes) must share
an IRQ/vector, to avoid initialization failures, keep
the device functional.

This commit handles three cases:
(1) number of the allocated vectors == the number of virtqueues + 1
(config changes), every virtqueue and the config interrupt has
a separated vector/IRQ, the best and the most likely case.
(2) number of the allocated vectors is less than the best case, but
greater than 1. In this case, all virtqueues share a vector/IRQ,
the config interrupt has a separated vector/IRQ
(3) only one vector is allocated, in this case, the virtqueues and
the config interrupt share a vector/IRQ. The worst and most
unlikely case.

Otherwise, it needs to fail.

This commit introduces some helper functions:
ifcvf_set_vq_vector() and ifcvf_set_config_vector() sets virtqueue
vector and config vector in the device config space, so that
the device can send interrupt DMA.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Link: https://lore.kernel.org/r/20220222115428.998334-5-lingshan.zhu@intel.com
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220315124130.1710030-1-trix@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:57 -04:00
Zhu Lingshan
ad5c5690de vDPA/ifcvf: implement device MSIX vector allocator
This commit implements a MSIX vector allocation helper
for vqs and config interrupts.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Link: https://lore.kernel.org/r/20220222115428.998334-4-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:57 -04:00
Zhu Lingshan
8897d6d0fc vDPA/ifcvf: make use of virtio pci modern IO helpers in ifcvf
This commit discards ifcvf_ioreadX()/writeX(), use virtio pci
modern IO helpers instead

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Link: https://lore.kernel.org/r/20220222115428.998334-2-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:57 -04:00
Linus Torvalds
34af78c4e6 IOMMU Updates for Linux v5.18
Including:
 
 	- IOMMU Core changes:
 	  - Removal of aux domain related code as it is basically dead
 	    and will be replaced by iommu-fd framework
 	  - Split of iommu_ops to carry domain-specific call-backs
 	    separatly
 	  - Cleanup to remove useless ops->capable implementations
 	  - Improve 32-bit free space estimate in iova allocator
 
 	- Intel VT-d updates:
 	  - Various cleanups of the driver
 	  - Support for ATS of SoC-integrated devices listed in
 	    ACPI/SATC table
 
 	- ARM SMMU updates:
 	  - Fix SMMUv3 soft lockup during continuous stream of events
 	  - Fix error path for Qualcomm SMMU probe()
 	  - Rework SMMU IRQ setup to prepare the ground for PMU support
 	  - Minor cleanups and refactoring
 
 	- AMD IOMMU driver:
 	  - Some minor cleanups and error-handling fixes
 
 	- Rockchip IOMMU driver:
 	  - Use standard driver registration
 
 	- MSM IOMMU driver:
 	  - Minor cleanup and change to standard driver registration
 
 	- Mediatek IOMMU driver:
 	  - Fixes for IOTLB flushing logic
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmI8OUkACgkQK/BELZcB
 GuNz9xAAvlgEg3byMx1y6LY9IctVGyLsegweVGM4+m6XR7qvT5Llc1E2Yw4Gooe4
 EAceOihDKW2T9VnMlz9g/cG7Modrx60chcB22KKfxDXPl6yF3R89EMd7DE43T6n/
 KPrP9+EsBnI8QSXyYu9ZowioX4CYwWhWD0dKHKAwDvw0BWHHUJ4hTaoHqEoIqLdP
 vubeHziIok/g1sylSpJjTzV7r/Na8Q3TGcb/Mi5qC8uiyiyx40vtaduMGNW+/ToN
 EqOKszxPmHfHv/xf0IHo0eUZ2L/JAe0mAlZzOb09f5F2sXJrbC05jlmRaDmSjT+u
 iEc1r2By/0xo6iOuQC3wD6kTvwwO/ecpNYGhXYXdTbtLquYfL5PSXjRHEU9gf2BO
 i/llPDsnytPvm/hnmbi26ChNR6yrDPz5bkoCUl5mnB1jZcaZtIURN7cRlEPPZUWo
 62VDNdqWDB6AvALc1/SwYdJX/i5eaBf+niS7/BJ/IkLp2oJxFzrGsU8SRJFHNYsa
 zdFIUUoTw647Ul6derSpGzHow169/RwVKYPiXMsaA8/viPNjpBOtfg56abn1WLW6
 N4CtwNu6tt+sPfftFdFx2cDEMW2zpWg5doMddBfEx9FAk0HJ4WLZiTpaO2PxcLyd
 kCAsGHj+ViAZHINVKFV4nQN/V9yQtcIc4UPmSGJBtKCK3KUYujw=
 =bcqr
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - IOMMU Core changes:
      - Removal of aux domain related code as it is basically dead and
        will be replaced by iommu-fd framework
      - Split of iommu_ops to carry domain-specific call-backs separatly
      - Cleanup to remove useless ops->capable implementations
      - Improve 32-bit free space estimate in iova allocator

 - Intel VT-d updates:
      - Various cleanups of the driver
      - Support for ATS of SoC-integrated devices listed in ACPI/SATC
        table

 - ARM SMMU updates:
      - Fix SMMUv3 soft lockup during continuous stream of events
      - Fix error path for Qualcomm SMMU probe()
      - Rework SMMU IRQ setup to prepare the ground for PMU support
      - Minor cleanups and refactoring

 - AMD IOMMU driver:
      - Some minor cleanups and error-handling fixes

 - Rockchip IOMMU driver:
      - Use standard driver registration

 - MSM IOMMU driver:
      - Minor cleanup and change to standard driver registration

 - Mediatek IOMMU driver:
      - Fixes for IOTLB flushing logic

* tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits)
  iommu/amd: Improve amd_iommu_v2_exit()
  iommu/amd: Remove unused struct fault.devid
  iommu/amd: Clean up function declarations
  iommu/amd: Call memunmap in error path
  iommu/arm-smmu: Account for PMU interrupts
  iommu/vt-d: Enable ATS for the devices in SATC table
  iommu/vt-d: Remove unused function intel_svm_capable()
  iommu/vt-d: Add missing "__init" for rmrr_sanity_check()
  iommu/vt-d: Move intel_iommu_ops to header file
  iommu/vt-d: Fix indentation of goto labels
  iommu/vt-d: Remove unnecessary prototypes
  iommu/vt-d: Remove unnecessary includes
  iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
  iommu/vt-d: Remove domain and devinfo mempool
  iommu/vt-d: Remove iova_cache_get/put()
  iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
  iommu/vt-d: Remove intel_iommu::domains
  iommu/mediatek: Always tlb_flush_all when each PM resume
  iommu/mediatek: Add tlb_lock in tlb_flush_all
  iommu/mediatek: Remove the power status checking in tlb flush all
  ...
2022-03-24 19:48:57 -07:00
Zhang Min
eb057b44db vdpa: fix use-after-free on vp_vdpa_remove
When vp_vdpa driver is unbind, vp_vdpa is freed in vdpa_unregister_device
and then vp_vdpa->mdev.pci_dev is dereferenced in vp_modern_remove,
triggering use-after-free.

Call Trace of unbinding driver free vp_vdpa :
do_syscall_64
  vfs_write
    kernfs_fop_write_iter
      device_release_driver_internal
        pci_device_remove
          vp_vdpa_remove
            vdpa_unregister_device
              kobject_release
                device_release
                  kfree

Call Trace of dereference vp_vdpa->mdev.pci_dev:
vp_modern_remove
  pci_release_selected_regions
    pci_release_region
      pci_resource_len
        pci_resource_end
          (dev)->resource[(bar)].end

Signed-off-by: Zhang Min <zhang.min9@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Link: https://lore.kernel.org/r/20220301091059.46869-1-wang.yi59@zte.com.cn
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 64b9f64f80 ("vdpa: introduce virtio pci driver")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-03-06 06:06:50 -05:00
Xie Yongji
b9d102dafe vduse: Fix returning wrong type in vduse_domain_alloc_iova()
This fixes the following smatch warnings:

drivers/vdpa/vdpa_user/iova_domain.c:305 vduse_domain_alloc_iova() warn: should 'iova_pfn << shift' be a 64 bit type?

Fixes: 8c773d53fb ("vduse: Implement an MMU-based software IOTLB")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20220121083940.102-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:34 -05:00
Si-Wei Liu
ed0f849fc3 vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
When control vq receives a VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
request from the driver, presently there is no validation against the
number of queue pairs to configure, or even if multiqueue had been
negotiated or not is unverified. This may lead to kernel panic due to
uninitialized resource for the queues were there any bogus request
sent down by untrusted driver. Tie up the loose ends there.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1642206481-30721-4-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:34 -05:00
Si-Wei Liu
30c22f3816 vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit requirements:
"VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ".

There's assumption in the mlx5_vdpa multiqueue code that MQ must come
together with CTRL_VQ. However, there's nowhere in the upper layer to
guarantee this assumption would hold. Were there an untrusted driver
sending down MQ without CTRL_VQ, it would compromise various spots for
e.g. is_index_valid() and is_ctrl_vq_idx(). Although this doesn't end
up with immediate panic or security loophole as of today's code, the
chance for this to be taken advantage of due to future code change is
not zero.

Harden the crispy assumption by failing the set_driver_features() call
when seeing (MQ && !CTRL_VQ). For that end, verify_min_features() is
renamed to verify_driver_features() to reflect the fact that it now does
more than just validate the minimum features. verify_driver_features()
is now used to accommodate various checks against the driver features
for set_driver_features().

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1642206481-30721-3-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:33 -05:00
Si-Wei Liu
e0077cc13b vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
No functional change introduced. vdpa bus driver such as virtio_vdpa
or vhost_vdpa is not supposed to take care of the locking for core
by its own. The locked API vdpa_set_features should suffice the
bus driver's need.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/1642206481-30721-2-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 11:56:33 -05:00
John Garry
32e92d9f6f iommu/iova: Separate out rcache init
Currently the rcache structures are allocated for all IOVA domains, even if
they do not use "fast" alloc+free interface. This is wasteful of memory.

In addition, fails in init_iova_rcaches() are not handled safely, which is
less than ideal.

Make "fast" users call a separate rcache init explicitly, which includes
error checking.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1643882360-241739-1-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-14 15:43:15 +01:00
Linus Torvalds
3bf6a9e36e virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes
partial support for < MAX_ORDER - 1 granularity for virtio-mem
 driver_override for vdpa
 sysfs ABI documentation for vdpa
 multiqueue config support for mlx5 vdpa
 
 Misc fixes, cleanups.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHiDHkPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpVT4H/3Veixt3uYPOmuLU2tSx+8X+sFTtik81hyiE
 okz5fRJrxxA8SqS76FnmO10FS4hlPOGNk0Z5WVhr0yihwFvPLvpCM/xi2Lmrz9I7
 pB0sXOIocEL1xApsxukR9K1Twpb2hfYsflbJYUVlRfhS5G0izKJNZp5I7OPrzd80
 vVNNDWKW2iLDlfqsavumI4Kvm4nsFuCHG03jzMtcIa7YTXYV3DORD4ZGFFVUOIQN
 t5F74TznwHOeYgJeg7TzjFjfPWmXjLetvx10QX1A1uOvwppWW/QY6My0UafTXNXj
 VB3gOwJPf+gxXAXl/4bafq4NzM0xys6cpcPpjvhmU+erY4UuyAU=
 =Y1eO
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes.

   - partial support for < MAX_ORDER - 1 granularity for virtio-mem

   - driver_override for vdpa

   - sysfs ABI documentation for vdpa

   - multiqueue config support for mlx5 vdpa

   - and misc fixes, cleanups"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits)
  vdpa/mlx5: Fix tracking of current number of VQs
  vdpa/mlx5: Fix is_index_valid() to refer to features
  vdpa: Protect vdpa reset with cf_mutex
  vdpa: Avoid taking cf_mutex lock on get status
  vdpa/vdpa_sim_net: Report max device capabilities
  vdpa: Use BIT_ULL for bit operations
  vdpa/vdpa_sim: Configure max supported virtqueues
  vdpa/mlx5: Report max device capabilities
  vdpa: Support reporting max device capabilities
  vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
  vdpa: Add support for returning device configuration information
  vdpa/mlx5: Support configuring max data virtqueue
  vdpa/mlx5: Fix config_attr_mask assignment
  vdpa: Allow to configure max data virtqueues
  vdpa: Read device configuration only if FEATURES_OK
  vdpa: Sync calls set/get config/status with cf_mutex
  vdpa/mlx5: Distribute RX virtqueues in RQT object
  vdpa: Provide interface to read driver features
  vdpa: clean up get_config_size ret value handling
  virtio_ring: mark ring unused on error
  ...
2022-01-18 10:05:48 +02:00
Eli Cohen
b03fc43e73 vdpa/mlx5: Fix tracking of current number of VQs
Modify the code such that ndev->cur_num_vqs better reflects the actual
number of data virtqueues. The value can be accurately realized after
features have been negotiated.

This is to prevent possible failures when modifying the RQT object if
the cur_num_vqs bears invalid value.

No issue was actually encountered but this also makes the code more
readable.

Fixes: c5a5cd3d3217 ("vdpa/mlx5: Support configuring max data virtqueue")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220111183400.38418-5-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
f8ae3a489b vdpa/mlx5: Fix is_index_valid() to refer to features
Make sure the decision whether an index received through a callback is
valid or not consults the negotiated features.

The motivation for this was due to a case encountered where I shut down
the VM. After the reset operation was called features were already
clear, I got get_vq_state() call which caused out array bounds
access since is_index_valid() reported the index value.

So this is more of not hit a bug since the call shouldn't have been made
first place.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220111183400.38418-4-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
f6d955d808 vdpa: Avoid taking cf_mutex lock on get status
Avoid the wrapper holding cf_mutex since it is not protecting anything.
To avoid confusion and unnecessary overhead incurred by it, remove.

Fixes: f489f27bc0ab ("vdpa: Sync calls set/get config/status with cf_mutex")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220111183400.38418-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
b2ce6197c9 vdpa/vdpa_sim_net: Report max device capabilities
Configure max supported virtqueues features on the management device.
This info can be retrieved using:

$ vdpa mgmtdev show
vdpasim_net:
  supported_classes net
  max_supported_vqs 2
  dev_features MAC ANY_LAYOUT VERSION_1 ACCESS_PLATFORM

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-15-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
47a1401ac9 vdpa: Use BIT_ULL for bit operations
All masks in this file are 64 bits. Change BIT to BIT_ULL.

Other occurences use (1 << val) which yields a 32 bit value. Change them
to use BIT_ULL too.

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-14-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
cbe777e98b vdpa/vdpa_sim: Configure max supported virtqueues
Configure max supported virtqueues on the management device.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-13-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
79de65edf8 vdpa/mlx5: Report max device capabilities
Configure max supported virtqueues and features on the management
device.
This info can be retrieved using:

$ vdpa mgmtdev show
auxiliary/mlx5_core.sf.1:
  supported_classes net
  max_supported_vqs 257
  dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \
               CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-12-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
cd2629f6df vdpa: Support reporting max device capabilities
Add max_supported_vqs and supported_features fields to struct
vdpa_mgmt_dev. Upstream drivers need to feel these values according to
the device capabilities.

These values are reported back in a netlink message when showing management
devices.

Examples:

$ auxiliary/mlx5_core.sf.1:
  supported_classes net
  max_supported_vqs 257
  dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \
               CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

$ vdpa -j mgmtdev show
{"mgmtdev":{"auxiliary/mlx5_core.sf.1":{"supported_classes":["net"], \
  "max_supported_vqs":257,"dev_features":["CSUM","GUEST_CSUM","MTU", \
  "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR", \
  "VERSION_1","ACCESS_PLATFORM"]}}}

$ vdpa -jp mgmtdev show
{
    "mgmtdev": {
        "auxiliary/mlx5_core.sf.1": {
            "supported_classes": [ "net" ],
            "max_supported_vqs": 257,
            "dev_features": ["CSUM","GUEST_CSUM","MTU","HOST_TSO4", \
                             "HOST_TSO6","STATUS","CTRL_VQ","MQ", \
                             "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"]
        }
    }
}

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-11-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
37e07e7058 vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
Restore ndev->cur_num_vqs to the original value in case change_num_qps()
fails.

Fixes: 52893733f2 ("vdpa/mlx5: Add multiqueue support")
Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-10-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:54 -05:00
Eli Cohen
612f330ec5 vdpa: Add support for returning device configuration information
Add netlink attribute to store the negotiated features. This can be used
by userspace to get the current state of the vdpa instance.

Examples:

$ vdpa dev config show vdpa-a
vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 16 mtu 1500
  negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \
  CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

$ vdpa -j dev config show vdpa-a
{"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link ":"up","link_announce":false, \
 "max_vq_pairs":16,"mtu":1500,"negotiated_features":["CSUM","GUEST_CSUM","MTU","MAC", \
 "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR","VERSION_1", \
 "ACCESS_PLATFORM"]}}}

$ vdpa -jp dev config show vdpa-a
{
    "config": {
        "vdpa-a": {
            "mac": "00:00:00:00:88:88",
            "link ": "up",
            "link_announce ": false,
            "max_vq_pairs": 16,
            "mtu": 1500,
            "negotiated_features": [
"CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ", \
"CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"
]
        }
    }
}

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-9-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
75560522ea vdpa/mlx5: Support configuring max data virtqueue
Check whether the max number of data virtqueue pairs was provided when a
adding a new device and verify the new value does not exceed device
capabilities.

In addition, change the arrays holding virtqueue and callback contexts
to be dynamically allocated.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-8-elic@nvidia.com

Includes fixup:

vdpa/mlx5: fix error handling in mlx5_vdpa_dev_add()

Clang build fails with
mlx5_vnet.c:2574:6: error: variable 'mvdev' is used uninitialized whenever
  'if' condition is true
        if (!ndev->vqs || !ndev->event_cbs) {
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mlx5_vnet.c:2660:14: note: uninitialized use occurs here
        put_device(&mvdev->vdev.dev);
                    ^~~~~
This because mvdev is set after trying to allocate ndev->vqs,event_cbs.
So move the allocation to after mvdev is set but before the arrays
are used in init_mvqs()

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220107211352.3940570-1-trix@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Includes fixup:

vdpa/mlx5: fix endian-ness for max vqs

sparse warnings: (new ones prefixed by >>)
>> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast to restricted __le16
>> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast from restricted __virtio16

> 1247                  num = le16_to_cpu(ndev->config.max_virtqueue_pairs);

Address this using the appropriate wrapper.

Cc: "Eli Cohen" <elic@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
e3137056e6 vdpa/mlx5: Fix config_attr_mask assignment
Fix VDPA_ATTR_DEV_NET_CFG_MACADDR assignment to be explicit 64 bit
assignment.

No issue was seen since the value is well below 64 bit max value.
Nevertheless it needs to be fixed.

Fixes: a007d94004 ("vdpa/mlx5: Support configuration of MAC")
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-7-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
aba21aff77 vdpa: Allow to configure max data virtqueues
Add netlink support to configure the max virtqueue pairs for a device.
At least one pair is required. The maximum is dictated by the device.

Example:
$ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-6-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
30ef7a8ac8 vdpa: Read device configuration only if FEATURES_OK
Avoid reading device configuration during feature negotiation. Read
device status and verify that VIRTIO_CONFIG_S_FEATURES_OK is set.

Protect the entire operation, including configuration read with cf_mutex
to ensure integrity of the results.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-5-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
73bc0dbb59 vdpa: Sync calls set/get config/status with cf_mutex
Add wrappers to get/set status and protect these operations with
cf_mutex to serialize these operations with respect to get/set config
operations.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
a7f46ba424 vdpa/mlx5: Distribute RX virtqueues in RQT object
Distribute the available rx virtqueues amongst the available RQT
entries.

RQTs require to have a power of two entries. When creating or modifying
the RQT, use the lowest number of power of two entries that is not less
than the number of rx virtqueues. Distribute them in the available
entries such that some virtqueus may be referenced twice.

This allows to configure any number of virtqueue pairs when multiqueue
is used.

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-3-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
a64917bc2e vdpa: Provide interface to read driver features
Provide an interface to read the negotiated features. This is needed
when building the netlink message in vdpa_dev_net_config_fill().

Also fix the implementation of vdpa_dev_net_config_fill() to use the
negotiated features instead of the device features.

To make APIs clearer, make the following name changes to struct
vdpa_config_ops so they better describe their operations:

get_features -> get_device_features
set_features -> set_driver_features

Finally, add get_driver_features to return the negotiated features and
add implementation to all the upstream drivers.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:53 -05:00
Eli Cohen
97143b70aa vdpa/mlx5: Fix wrong configuration of virtio_version_1_0
Remove overriding of virtio_version_1_0 which forced the virtqueue
object to version 1.

Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20211230142024.142979-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-01-14 18:50:53 -05:00
Christophe JAILLET
10aa250b2f eni_vdpa: Simplify 'eni_vdpa_probe()'
When 'pcim_enable_device()' is used, some resources become automagically
managed.
There is no need to call 'pci_free_irq_vectors()' when the driver is
removed. The same will already be done by 'pcim_release()'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/02045bdcbbb25f79bae4827f66029cfcddc90381.1636301587.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14 18:50:52 -05:00
Eli Cohen
60af39c1f4 net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTU
Make sure to offer VIRTIO_NET_F_MTU since we configure the MTU based on
what was queried from the device.

This allows the virtio driver to allocate large enough buffers based on
the reported MTU.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20211124170949.51725-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-01-14 18:50:52 -05:00
Stefano Garzarella
539fec78ed vdpa: add driver_override support
`driver_override` allows to control which of the vDPA bus drivers
binds to a vDPA device.

If `driver_override` is not set, the previous behaviour is followed:
devices use the first vDPA bus driver loaded (unless auto binding
is disabled).

Tested on Fedora 34 with driverctl(8):
  $ modprobe virtio-vdpa
  $ modprobe vhost-vdpa
  $ modprobe vdpa-sim-net

  $ vdpa dev add mgmtdev vdpasim_net name dev1

  # dev1 is attached to the first vDPA bus driver loaded
  $ driverctl -b vdpa list-devices
    dev1 virtio_vdpa

  $ driverctl -b vdpa set-override dev1 vhost_vdpa

  $ driverctl -b vdpa list-devices
    dev1 vhost_vdpa [*]

  Note: driverctl(8) integrates with udev so the binding is
  preserved.

Suggested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211126164753.181829-3-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:52 -05:00
Zhu Lingshan
0f420c383a ifcvf/vDPA: fix misuse virtio-net device config size for blk dev
This commit fixes a misuse of virtio-net device config size issue
for virtio-block devices.

A new member config_size in struct ifcvf_hw is introduced and would
be initialized through vdpa_dev_add() to record correct device
config size.

To be more generic, rename ifcvf_hw.net_config to ifcvf_hw.dev_config,
the helpers ifcvf_read/write_net_config() to ifcvf_read/write_dev_config()

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Reported-and-suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: 6ad31d162a ("vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211201081255.60187-1-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:52 -05:00
Guanjun
b4d80c8dda vduse: moving kvfree into caller
This free action should be moved into caller 'vduse_ioctl' in
concert with the allocation.

No functional change.

Signed-off-by: Guanjun <guanjun@linux.alibaba.com>
Link: https://lore.kernel.org/r/1638780498-55571-1-git-send-email-guanjun@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14 18:50:52 -05:00
Linus Torvalds
13eaa5bda0 IOMMU Updates for Linux v5.17
Including:
 
 	- Identity domain support for virtio-iommu
 
 	- Move flush queue code into iommu-dma
 
 	- Some fixes for AMD IOMMU suspend/resume support when x2apic
 	  is used
 
 	- Arm SMMU Updates from Will Deacon:
 	  - Revert evtq and priq back to their former sizes
 	  - Return early on short-descriptor page-table allocation failure
 	  - Fix page fault reporting for Adreno GPU on SMMUv2
 	  - Make SMMUv3 MMU notifier ops 'const'
 	  - Numerous new compatible strings for Qualcomm SMMUv2 implementations
 
 	- Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmHfBvsACgkQK/BELZcB
 GuOURRAAqHBTUUgT/f2dJbVsWMOxSx0dhDnKuLRhAREMbmcn88tR8dxPRPYB1/6q
 yyFmkh13UzEr1gosEd34P5G8GS6fMD/G4yz8p9tK6Mqu7UeU+DoiDIi3s194WhYa
 iNPBlGgQ3XznTrbBnFV+n/LjvkxSguNfjj809Jiw7Ew3i50K4GFnzRA+IZVAT4+F
 zdNmwY12Y0v4SCfHKiVR1ZB9kcn2/Dx78dlBQ6ALFuejeZGa2OVr94byKnfz+AJh
 OssrgRcgUKclWbMk4Tljf4/FIdwkLMKD6gieBFb3sNKTUpzkG8elYmETO5d+hM+l
 27CKOCz1OOqVRzKe3r5n3c0wf9UAmi0Q91zW+UVZp2i0GjxpfIeIS9/NwRy+IXHS
 U9zybU47q10WF6cVO0n6wWHPRbjPii2OZpjqhSTq57qsnniCPLwkrry9H2fP71zz
 NDAZv5qvHCvRF7QoZfkBvCCJ12ZhNnhZqTfZR2wGGITMIk6dokG4NCsU93rSVKvZ
 4xQDPm45rECmunibdc9c1vrifKC7BIWCSU5DH3AEDBU/i9QfYpVPXfJlGdz3enIV
 /FA+kcvYrh21sokly/TqiZXGSaOFBqFEN13KJReXOgbENNq6kT/4lSNjK5Q1WSWp
 qDq12EQyv0RtTEcKVDpRVunI+/G5MquO8gbIrVmsRV0SU1Z0yoM=
 =Q8LV
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Identity domain support for virtio-iommu

 - Move flush queue code into iommu-dma

 - Some fixes for AMD IOMMU suspend/resume support when x2apic is used

 - Arm SMMU Updates from Will Deacon:
      - Revert evtq and priq back to their former sizes
      - Return early on short-descriptor page-table allocation failure
      - Fix page fault reporting for Adreno GPU on SMMUv2
      - Make SMMUv3 MMU notifier ops 'const'
      - Numerous new compatible strings for Qualcomm SMMUv2 implementations

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (38 commits)
  iommu/iova: Temporarily include dma-mapping.h from iova.h
  iommu: Move flush queue data into iommu_dma_cookie
  iommu/iova: Move flush queue code to iommu-dma
  iommu/iova: Consolidate flush queue code
  iommu/vt-d: Use put_pages_list
  iommu/amd: Use put_pages_list
  iommu/amd: Simplify pagetable freeing
  iommu/iova: Squash flush_cb abstraction
  iommu/iova: Squash entry_dtor abstraction
  iommu/iova: Fix race between FQ timeout and teardown
  iommu/amd: Fix typo in *glues … together* in comment
  iommu/vt-d: Remove unused dma_to_mm_pfn function
  iommu/vt-d: Drop duplicate check in dma_pte_free_pagetable()
  iommu/vt-d: Use bitmap_zalloc() when applicable
  iommu/amd: Remove useless irq affinity notifier
  iommu/amd: X2apic mode: mask/unmask interrupts on suspend/resume
  iommu/amd: X2apic mode: setup the INTX registers on mask/unmask
  iommu/amd: X2apic mode: re-enable after resume
  iommu/amd: Restore GA log/tail pointer on host resume
  iommu/iova: Move fast alloc size roundup into alloc_iova_fast()
  ...
2022-01-12 16:15:51 -08:00
Linus Torvalds
6dc69d3d0d driver core changes for 5.17-rc1
Here is the set of changes for the driver core for 5.17-rc1.
 
 Lots of little things here, including:
 	- kobj_type cleanups
 	- auxiliary_bus documentation updates
 	- auxiliary_device conversions for some drivers (relevant
 	  subsystems all have provided acks for these)
 	- kernfs lock contention reduction for some workloads
 	- other tiny cleanups and changes.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYd7deA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym8ngCgw0ANwrRPE5b1dthEmfU2f8Knk5kAn0pHQv6R
 VRZJypgNfU/Pt0ykstZD
 =CO9J
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of changes for the driver core for 5.17-rc1.

  Lots of little things here, including:

   - kobj_type cleanups

   - auxiliary_bus documentation updates

   - auxiliary_device conversions for some drivers (relevant subsystems
     all have provided acks for these)

   - kernfs lock contention reduction for some workloads

   - other tiny cleanups and changes.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (43 commits)
  kobject documentation: remove default_attrs information
  drivers/firmware: Add missing platform_device_put() in sysfb_create_simplefb
  debugfs: lockdown: Allow reading debugfs files that are not world readable
  driver core: Make bus notifiers in right order in really_probe()
  driver core: Move driver_sysfs_remove() after driver_sysfs_add()
  firmware: edd: remove empty default_attrs array
  firmware: dmi-sysfs: use default_groups in kobj_type
  qemu_fw_cfg: use default_groups in kobj_type
  firmware: memmap: use default_groups in kobj_type
  sh: sq: use default_groups in kobj_type
  headers/uninline: Uninline single-use function: kobject_has_children()
  devtmpfs: mount with noexec and nosuid
  driver core: Simplify async probe test code by using ktime_ms_delta()
  nilfs2: use default_groups in kobj_type
  kobject: remove kset from struct kset_uevent_ops callbacks
  driver core: make kobj_type constant.
  driver core: platform: document registration-failure requirement
  vdpa/mlx5: Use auxiliary_device driver data helpers
  net/mlx5e: Use auxiliary_device driver data helpers
  soundwire: intel: Use auxiliary_device driver data helpers
  ...
2022-01-12 11:11:34 -08:00
Joerg Roedel
66dc1b791c Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into next 2022-01-04 10:33:45 +01:00
David E. Box
45e3a27984 vdpa/mlx5: Use auxiliary_device driver data helpers
Use auxiliary_get_drvdata and auxiliary_set_drvdata helpers.

Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20211221235852.323752-5-david.e.box@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22 13:59:01 +01:00
John Garry via iommu
972bf252f8 iommu/iova: Move fast alloc size roundup into alloc_iova_fast()
It really is a property of the IOVA rcache code that we need to alloc a
power-of-2 size, so relocate the functionality to resize into
alloc_iova_fast(), rather than the callsites.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1638875846-23993-1-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:10:40 +01:00
Parav Pandit
bb47620be3 vdpa: Consider device id larger than 31
virtio device id value can be more than 31. Hence, use BIT_ULL in
assignment.

Fixes: 33b347503f ("vdpa: Define vdpa mgmt device, ops and a netlink interface")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211130042949.88958-1-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-12-08 15:41:50 -05:00
Dan Carpenter
dc1db0060c vduse: check that offset is within bounds in get_config()
This condition checks "len" but it does not check "offset" and that
could result in an out of bounds read if "offset > dev->config_size".
The problem is that since both variables are unsigned the
"dev->config_size - offset" subtraction would result in a very high
unsigned value.

I think these checks might not be necessary because "len" and "offset"
are supposed to already have been validated using the
vhost_vdpa_config_validate() function.  But I do not know the code
perfectly, and I like to be safe.

Fixes: c8a6153b6c ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211208150956.GA29160@kili
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
2021-12-08 14:53:15 -05:00
Dan Carpenter
ff9f9c6e74 vduse: fix memory corruption in vduse_dev_ioctl()
The "config.offset" comes from the user.  There needs to a check to
prevent it being out of bounds.  The "config.offset" and
"dev->config_size" variables are both type u32.  So if the offset if
out of bounds then the "dev->config_size - config.offset" subtraction
results in a very high u32 value.  The out of bounds offset can result
in memory corruption.

Fixes: c8a6153b6c ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211208103307.GA3778@kili
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
2021-12-08 14:53:15 -05:00
Longpeng
bb93ce4b15 vdpa_sim: avoid putting an uninitialized iova_domain
The system will crash if we put an uninitialized iova_domain, this
could happen when an error occurs before initializing the iova_domain
in vdpasim_create().

BUG: kernel NULL pointer dereference, address: 0000000000000000
...
RIP: 0010:__cpuhp_state_remove_instance+0x96/0x1c0
...
Call Trace:
 <TASK>
 put_iova_domain+0x29/0x220
 vdpasim_free+0xd1/0x120 [vdpa_sim]
 vdpa_release_dev+0x21/0x40 [vdpa]
 device_release+0x33/0x90
 kobject_release+0x63/0x160
 vdpasim_create+0x127/0x2a0 [vdpa_sim]
 vdpasim_net_dev_add+0x7d/0xfe [vdpa_sim_net]
 vdpa_nl_cmd_dev_add_set_doit+0xe1/0x1a0 [vdpa]
 genl_family_rcv_msg_doit+0x112/0x140
 genl_rcv_msg+0xdf/0x1d0
 ...

So we must make sure the iova_domain is already initialized before
put it.

In addition, we may get the following warning in this case:
WARNING: ... drivers/iommu/iova.c:344 iova_cache_put+0x58/0x70

So we must make sure the iova_cache_put() is invoked only if the
iova_cache_get() is already invoked. Let's fix it together.

Cc: stable@vger.kernel.org
Fixes: 4080fc1067 ("vdpa_sim: use iova module to allocate IOVA addresses")
Signed-off-by: Longpeng <longpeng2@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211124015215.119-1-longpeng2@huawei.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-24 19:00:29 -05:00
Linus Torvalds
43e1b12927 vhost,virtio,vhost: fixes,features
Hardening work by Jason
 vdpa driver for Alibaba ENI
 Performance tweaks for virtio blk
 virtio rng rework using an internal buffer
 mac/mtu programming for mlx5 vdpa
 Misc fixes, cleanups
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmF/su8PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpo+sIAJjBTvbET+d0KuIMt8YBGsU+NWgHaJxW06hm
 GHZzinueZYWaS20MPMYMPfcHUgigWj0kk7F0gMljG5rtDFzR85JkOi7+e5V6RVJW
 8SQc7JjA1Krde6EiPJdlv3mVkLz/5VbrGUgjAW9di/O04Xc/jAc9kmJ41wTYr0mL
 E5IsUi9QBmgHtKQ+2ofb9QEWuebJclGK+NVd3Kp08BtAQOrdn5UrIzY30/sjkZwE
 ul2ll6v/k7T3dCQbHD/wMgfFycPvoZVfxaVj+FWQnwdWkqZwzMRmpKPRes4KJfww
 Lfx1qox10mJzH7XJCs7xmOM8f7HgNNWi0gD5FxNlfkiz1AwHYOk=
 =DTXY
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "vhost and virtio fixes and features:

   - Hardening work by Jason

   - vdpa driver for Alibaba ENI

   - Performance tweaks for virtio blk

   - virtio rng rework using an internal buffer

   - mac/mtu programming for mlx5 vdpa

   - Misc fixes, cleanups"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits)
  vdpa/mlx5: Forward only packets with allowed MAC address
  vdpa/mlx5: Support configuration of MAC
  vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
  vdpa_sim_net: Enable user to set mac address and mtu
  vdpa: Enable user to set mac and mtu of vdpa device
  vdpa: Use kernel coding style for structure comments
  vdpa: Introduce query of device config layout
  vdpa: Introduce and use vdpa device get, set config helpers
  virtio-scsi: don't let virtio core to validate used buffer length
  virtio-blk: don't let virtio core to validate used length
  virtio-net: don't let virtio core to validate used length
  virtio_ring: validate used buffer length
  virtio_blk: correct types for status handling
  virtio_blk: allow 0 as num_request_queues
  i2c: virtio: Add support for zero-length requests
  virtio-blk: fixup coccinelle warnings
  virtio_ring: fix typos in vring_desc_extra
  virtio-pci: harden INTX interrupts
  virtio_pci: harden MSI-X interrupts
  virtio_config: introduce a new .enable_cbs method
  ...
2021-11-03 15:00:39 -07:00
Eli Cohen
540061ac79 vdpa/mlx5: Forward only packets with allowed MAC address
Add rules to forward packets to the net device's TIR only if the
destination MAC is equal to the configured MAC. This is required to
prevent the netdevice from receiving traffic not destined to its
configured MAC.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20211026175519.87795-9-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:49 -04:00
Eli Cohen
a007d94004 vdpa/mlx5: Support configuration of MAC
Add code to accept MAC configuration through vdpa tool. The MAC is
written into the config struct and later can be retrieved through
get_config().

Examples:
1. Configure MAC while adding a device:
$ vdpa dev add mgmtdev pci/0000:06:00.2 name vdpa0 mac 00:11:22:33:44:55

2. Show configured params:
$ vdpa dev config show
vdpa0: mac 00:11:22:33:44:55 link down link_announce false max_vq_pairs 8 mtu 1500

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-8-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
ef76eb83a1 vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
Cited patch in the fixes tag clears the features bit during reset.
mlx5 vdpa device feature bits are static decided by device capabilities.
These feature bits (including VIRTIO_NET_F_MAC) are initialized during
device addition time.

Clearing features bit in reset callback cleared the VIRTIO_NET_F_MAC. Due
to this, MAC address provided by the device is not honored.

Fix it by not clearing the static feature bits during reset.

Fixes: 0686082dbf ("vdpa: Add reset callback in vdpa_config_ops")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20211026175519.87795-7-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
1138b9818e vdpa_sim_net: Enable user to set mac address and mtu
Enable user to set the mac address and mtu so that each vdpa device
can have its own user specified mac address and mtu.

Now that user is enabled to set the mac address, remove the module
parameter for same.

And example of setting mac addr and mtu and view the configuration:
$ vdpa mgmtdev show
vdpasim_net:
  supported_classes net

$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000

$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-6-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
d8ca2fa5be vdpa: Enable user to set mac and mtu of vdpa device
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000

$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:11:22:33:44:55",
            "link ": "up",
            "link_announce ": false,
            "mtu": 9000,
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>

Link: https://lore.kernel.org/r/20211026175519.87795-5-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
ad69dd0bf2 vdpa: Introduce query of device config layout
Introduce a command to query a device config layout.

An example query of network vdpa device:

$ vdpa dev add name bar mgmtdev vdpasim_net

$ vdpa dev config show
bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:35:09:19:48:05",
            "link ": "up",
            "link_announce ": false,
            "mtu": 1500,
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-3-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
6dbb1f1687 vdpa: Introduce and use vdpa device get, set config helpers
Subsequent patches enable get and set configuration either
via management device or via vdpa device' config ops.

This requires synchronization between multiple callers to get and set
config callbacks. Features setting also influence the layout of the
configuration fields endianness.

To avoid exposing synchronization primitives to callers, introduce
helper for setting the configuration and use it.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Eli Cohen
edf747affc vdpa/mlx5: Propagate link status from device to vdpa driver
Add code to register to hardware asynchronous events. Use this
mechanism to track link status events coming from the device and update
the config struct.

After doing link status change, call the vdpa callback to notify of the
link status change.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210909123635.30884-4-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:47 -04:00
Eli Cohen
218bdd20e5 vdpa/mlx5: Rename control VQ workqueue to vdpa wq
A subesequent patch will use the same workqueue for executing other
work not related to control VQ. Rename the workqueue and the work queue
entry used to convey information to the workqueue.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210909123635.30884-3-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:47 -04:00
Eli Cohen
246fd1caf0 vdpa/mlx5: Remove mtu field from vdpa net device
No need to save the mtu int the net device struct. We can save it in the
config struct which cannot be modified.

Moreover, move the initialization to. mlx5_vdpa_set_features() callback
is not the right place to put it.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210909123635.30884-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:47 -04:00
Wu Zongyong
e85087beed eni_vdpa: add vDPA driver for Alibaba ENI
This patch adds a new vDPA driver for Alibaba ENI(Elastic Network
Interface) which is build upon virtio 0.9.5 specification.
And this driver is only enabled on X86 host currently.

Link: https://lore.kernel.org/r/6a9f32c00609af16bbb2ea32e633b3beb1cbf84b.1635493219.git.wuzongyong@linux.alibaba.com
Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211026083214.3375383-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de> # fix Kconfig typo
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:23:41 -04:00
Wu Zongyong
e47be840e8 vdpa: add new attribute VDPA_ATTR_DEV_MIN_VQ_SIZE
This attribute advertises the min value of virtqueue size. The value is
1 by default.

Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/2bbc417355c4d22298050b1ba887cecfbde3e85d.1635493219.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 04:30:35 -04:00
Wu Zongyong
c53e5d1b5e vdpa: min vq num of vdpa device cannot be greater than max vq num
Just failed to probe the vdpa device if the min virtqueue num returned
by get_vq_num_min is greater than the max virtqueue num returned by
get_vq_num_max.

Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/21199b62cc10b2a9f2cf90eeb63ad080645d881f.1635493219.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 04:30:35 -04:00