linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-26 20:38:12 +00:00

Author	SHA1	Message	Date
Chris Mi	2951b2e142	net/mlx5e: Always clear dest encap in neigh-update-del The cited commit introduced a bug for multiple encapsulations flow. If one dest encap becomes invalid, the flow is set slow path flag. But when other dests encap become invalid, they are not cleared due to slow path flag of the flow. When neigh-update-add is running, it will use invalid encap. Fix it by checking slow path flag after clearing dest encap. Fixes: `9a5f9cc794` ("net/mlx5e: Fix possible use-after-free deleting fdb rule") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Chris Mi	849190e3e4	net/mlx5e: CT: Fix ct debugfs folder name Need to use sprintf to build a string instead of sscanf. Otherwise dirname is null and both "ct_nic" and "ct_fdb" won't be created. But its redundant anyway as driver could be in switchdev mode but still add nic rules. So use "ct" as folder name. Fixes: `77422a8f6f` ("net/mlx5e: CT: Add ct driver counters") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Tariq Toukan	f8c18a5749	net/mlx5e: Fix RX reporter for XSK RQs RX reporter mistakenly reads from the regular (inactive) RQ when XSK RQ is active. Fix it here. Fixes: `3db4c85cde` ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Dragos Tatulea	b12d581e83	net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default mlx5e_build_nic_params will turn CQE compression on if the hardware capability is enabled and the slow_pci_heuristic condition is detected. As IPoIB doesn't support CQE compression, make sure to disable the feature in the IPoIB profile init. Please note that the feature is not exposed to the user for IPoIB interfaces, so it can't be subsequently turned on. Fixes: `b797a684b0` ("net/mlx5e: Enable CQE compression when PCI is slower than link") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Shay Drory	c4ad5f2bda	net/mlx5: Fix RoCE setting at HCA level mlx5 PF can disable RoCE for its VFs and SFs. In such case RoCE is marked as unsupported on those VFs/SFs. The cited patch added an option for disable (and enable) RoCE at HCA level. However, that commit didn't check whether RoCE is supported on the HCA and enabled user to try and set RoCE to on. Fix it by checking whether the HCA supports RoCE. Fixes: `fbfa97b4d7` ("net/mlx5: Disable roce at HCA level") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Shay Drory	9078e843ef	net/mlx5: Avoid recovery in probe flows Currently, recovery is done without considering whether the device is still in probe flow. This may lead to recovery before device have finished probed successfully. e.g.: while mlx5_init_one() is running. Recovery flow is using functionality that is loaded only by mlx5_init_one(), and there is no point in running recovery without mlx5_init_one() finished successfully. Fix it by waiting for probe flow to finish and checking whether the device is probed before trying to perform recovery. Fixes: `51d138c261` ("net/mlx5: Fix health error state handling") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:50 -08:00
Shay Drory	44aee8ea15	net/mlx5: Fix io_eq_size and event_eq_size params validation io_eq_size and event_eq_size params are of param type DEVLINK_PARAM_TYPE_U32. But, the validation callback is addressing them as DEVLINK_PARAM_TYPE_U16. This cause mismatch in validation in big-endian systems, in which values in range were rejected while 268500991 was accepted. Fix it by checking the U32 value in the validation callback. Fixes: `0844fa5f7b` ("net/mlx5: Let user configure io_eq_size param") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:49 -08:00
Jiri Pirko	2a35b2c2e6	net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path There are two cleanup calls missing in mlx5_init_once() error path. Add them making the error path flow to be the same as mlx5_cleanup_once(). Fixes: `52ec462eca` ("net/mlx5: Add reserved-gids support") Fixes: `7c39afb394` ("net/mlx5: PTP code migration to driver core section") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:49 -08:00
Moshe Shemesh	1f0ae22ab4	net/mlx5: E-Switch, properly handle ingress tagged packets on VST Fix SRIOV VST mode behavior to insert cvlan when a guest tag is already present in the frame. Previous VST mode behavior was to drop packets or override existing tag, depending on the device version. In this patch we fix this behavior by correctly building the HW steering rule with a push vlan action, or for older devices we ask the FW to stack the vlan when a vlan is already present. Fixes: `07bab95026` ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes") Fixes: `dfcb1ed3c3` ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-12-28 11:38:49 -08:00
Sagi Grimberg	76807fcd73	nvme-auth: fix smatch warning complaints When initializing auth context, there may be no secrets passed by the user. Make return code explicit when returning successfully. smatch warnings: drivers/nvme/host/auth.c:950 nvme_auth_init_ctrl() warn: missing error code? 'ret' Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2022-12-28 06:26:35 -10:00
Christoph Hellwig	6f99ac04c4	nvme: consult the CSE log page for unprivileged passthrough Commands like Write Zeros can change the contents of a namespaces without actually transferring data. To protect against this, check the Commands Supported and Effects log is supported by the controller for any unprivileg command passthrough and refuse unprivileged passthrough if the command has any effects that can change data or metadata. Note: While the Commands Support and Effects log page has only been mandatory since NVMe 2.0, it is widely supported because Windows requires it for any command passthrough from userspace. Fixes: `e4fbcf32c8` ("nvme: identify-namespace without CAP_SYS_ADMIN") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>	2022-12-28 06:26:31 -10:00
Christoph Hellwig	831ed60c2a	nvme: also return I/O command effects from nvme_command_effects To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>	2022-12-28 06:26:25 -10:00
Christoph Hellwig	2a459f6933	nvmet: don't defer passthrough commands with trivial effects to the workqueue Mask out the "Command Supported" and "Logical Block Content Change" bits and only defer execution of commands that have non-trivial effects to the workqueue for synchronous execution. This allows to execute admin commands asynchronously on controllers that provide a Command Supported and Effects log page, and will keep allowing to execute Write commands asynchronously once command effects on I/O commands are taken into account. Fixes: `c1fef73f79` ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>	2022-12-28 06:26:18 -10:00
Christoph Hellwig	f2d1421391	nvmet: set the LBCC bit for commands that modify data Write, Write Zeroes, Zone append and a Zone Reset through Zone Management Send modify the logical block content of a namespace, so make sure the LBCC bit is reported for them. Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>	2022-12-28 06:26:13 -10:00
Christoph Hellwig	61f37154c5	nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>	2022-12-28 06:26:08 -10:00
Christoph Hellwig	685e631163	nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition 3 << 16 does not generate the correct mask for bits 16, 17 and 18. Use the GENMASK macro to generate the correct mask instead. Fixes: `84fef62d13` ("nvme: check admin passthru command effects") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>	2022-12-28 06:25:55 -10:00
Christoph Hellwig	8ca4fc323d	docs, nvme: add a feature and quirk policy document This adds a document about what specification features are supported by the Linux NVMe driver, and what qualifies for a quirk if an implementation has problems following the specification. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jonathan Corbet <corbet@lwn.net>	2022-12-28 05:37:12 -10:00
Takashi Iwai	090ddad4c7	ALSA: hda/hdmi: Static PCM mapping again with AMD HDMI codecs The recent code refactoring for HD-audio HDMI codec driver caused a regression on AMD/ATI HDMI codecs; namely, PulseAudioand pipewire don't recognize HDMI outputs any longer while the direct output via ALSA raw access still works. The problem turned out that, after the code refactoring, the driver assumes only the dynamic PCM assignment, and when a PCM stream that still isn't assigned to any pin gets opened, the driver tries to assign any free converter to the PCM stream. This behavior is OK for Intel and other codecs, as they have arbitrary connections between pins and converters. OTOH, on AMD chips that have a 1:1 mapping between pins and converters, this may end up with blocking the open of the next PCM stream for the pin that is tied with the formerly taken converter. Also, with the code refactoring, more PCM streams are exposed than necessary as we assume all converters can be used, while this isn't true for AMD case. This may change the PCM stream assignment and confuse users as well. This patch fixes those problems by: - Introducing a flag spec->static_pcm_mapping, and if it's set, the driver applies the static mapping between pins and converters at the probe time - Limiting the number of PCM streams per pins, too; this avoids the superfluous PCM streams Fixes: `ef6f5494fa` ("ALSA: hda/hdmi: Use only dynamic PCM device allocation") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216836 Co-developed-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20221228125714.16329-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2022-12-28 14:05:06 +01:00
Paolo Bonzini	a5496886eb	Merge branch 'kvm-late-6.1-fixes' into HEAD x86: * several fixes to nested VMX execution controls * fixes and clarification to the documentation for Xen emulation * do not unnecessarily release a pmu event with zero period * MMU fixes * fix Coverity warning in kvm_hv_flush_tlb() selftests: * fixes for the ucall mechanism in selftests * other fixes mostly related to compilation with clang	2022-12-28 07:19:14 -05:00
Paolo Bonzini	129c48cde6	KVM: selftests: restore special vmmcall code layout needed by the harness Commit `8fda37cf3d` ("KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()", 2022-11-21) broke the svm_nested_soft_inject_test because it placed a "pop rbp" instruction after vmmcall. While this is correct and mimics what is done in the VMX case, this particular test expects a ud2 instruction right after the vmmcall, so that it can skip over it in the L1 part of the test. Inline a suitably-modified version of vmmcall() to restore the functionality of the test. Fixes: `8fda37cf3d` ("KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()" Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221130181147.9911-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-28 07:18:01 -05:00
Pedro Tammela	40cab44b90	net/sched: fix retpoline wrapper compilation on configs without tc filters Rudi reports a compilation failure on x86_64 when CONFIG_NET_CLS or CONFIG_NET_CLS_ACT is not set but CONFIG_RETPOLINE is set. A misplaced '#endif' was causing the issue. Fixes: `7f0e810220` ("net/sched: add retpoline wrapper for tc") Tested-by: Rudi Heitbaum <rudi@heitbaum.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 12:11:32 +00:00
Xuezhi Zhang	c2052189f1	s390/qeth: convert sysfs snprintf to sysfs_emit Follow the advice of the Documentation/filesystems/sysfs.rst and show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 12:10:42 +00:00
David S. Miller	0e3d18359a	Merge branch 'r8169-fixes' Chunhao Lin says: ==================== r8169: fix dmar pte write access is not set error This series fixes dmar pte write access is not set error. Chunhao Lin (2): r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down() r8169: fix dmar pte write access is not set error v2: -update commit message -adjust the code according to current kernel code v3: -update title and commit message -split the patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:58:08 +00:00
Chunhao Lin	bb41c13c05	r8169: fix dmar pte write access is not set error When close device, if wol is enabled, rx will be enabled. When open device it will cause rx packet to be dma to the wrong memory address after pci_set_master() and system log will show blow messages. DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write] Request device [02:00.0] PASID ffffffff fault addr ffdd4000 [fault reason 05] PTE Write access is not set In this patch, driver disable tx/rx when close device. If wol is enabled, only enable rx filter and disable rxdv_gate(if support) to let hardware only receive packet to fifo but not to dma it. Signed-off-by: Chunhao Lin <hau@realtek.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:58:08 +00:00
Chunhao Lin	ad425666a1	r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down() There is no functional change. Moving these two functions for following patch "r8169: fix dmar pte write access is not set error". Signed-off-by: Chunhao Lin <hau@realtek.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:58:07 +00:00
David S. Miller	e71460d446	Merge branch 'ethtool_gert_phy_stats-fixes' Daniil Tatianin says: ==================== net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers This series fixes a potential NULL dereference in ethtool_get_phy_stats while also attempting to refactor/split said function into multiple helpers so that it's easier to reason about what's going on. I've taken Andrew Lunn's suggestions on the previous version of this patch and added a bit of my own. Changes since v1: - Remove an extra newline in the first patch - Move WARN_ON_ONCE into the if check as it already returns the result of the comparison - Actually split ethtool_get_phy_stats instead of attempting to refactor it ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:55:24 +00:00
Daniil Tatianin	201ed315f9	net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers So that it's easier to follow and make sense of the branching and various conditions. Stats retrieval has been split into two separate functions ethtool_get_phy_stats_phydev & ethtool_get_phy_stats_ethtool. The former attempts to retrieve the stats using phydev & phy_ops, while the latter uses ethtool_ops. Actual n_stats validation & array allocation has been moved into a new ethtool_vzalloc_stats_array helper. This also fixes a potential NULL dereference of ops->get_ethtool_phy_stats where it was getting called in an else branch unconditionally without making sure it was actually present. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:55:24 +00:00
Daniil Tatianin	fd4778581d	net/ethtool/ioctl: remove if n_stats checks from ethtool_get_phy_stats Now that we always early return if we don't have any stats we can remove these checks as they're no longer necessary. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:55:24 +00:00
Daniil Tatianin	9deb1e9fb8	net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats It's not very useful to copy back an empty ethtool_stats struct and return 0 if we didn't actually have any stats. This also allows for further simplification of this function in the future commits. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-28 11:55:24 +00:00
Paolo Bonzini	02d9a04da4	Documentation: kvm: clarify SRCU locking order Currently only the locking order of SRCU vs kvm->slots_arch_lock and kvm->slots_lock is documented. Extend this to kvm->lock since Xen emulation got it terribly wrong. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-28 06:02:54 -05:00
Paolo Bonzini	a79b53aaaa	KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running, if that happened it could cause a deadlock. This is due to kvm_xen_eventfd_reset() doing a synchronize_srcu() inside a kvm->lock critical section. To avoid this, first collect all the evtchnfd objects in an array and free all of them once the kvm->lock critical section is over and th SRCU grace period has expired. Reported-by: Michal Luczaj <mhal@rbox.co> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-28 05:53:57 -05:00
Rafael Mendonca	a26116c1e7	virtio_blk: Fix signedness bug in virtblk_prep_rq() The virtblk_map_data() function returns negative error codes, however, the 'nents' field of vbr->sg_table is an unsigned int, which causes the error handling not to work correctly. Cc: stable@vger.kernel.org Fixes: `0e9911fa76` ("virtio-blk: support mq_ops->queue_rqs()") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Suwan Kim <suwan.kim027@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:11 -05:00
Cindy Lu	72455a1142	vdpa_sim_net: should not drop the multicast/broadcast packet In the receive_filter(), should not drop the packet with the broadcast/multicast address. Add the check for this Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221214054306.24145-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:11 -05:00
Jason Wang	0b7a04a30e	vdpasim: fix memory leak when freeing IOTLBs After commit `bda324fd03` ("vdpasim: control virtqueue support"), vdpasim->iommu became an array of IOTLB, so we should clean the mappings of each free one by one instead of just deleting the ranges in the first IOTLB which may leak maps. Fixes: `bda324fd03` ("vdpasim: control virtqueue support") Cc: Gautam Dawar <gautam.dawar@xilinx.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221213090717.61529-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gautam Dawar <gautam.dawar@amd.com>	2022-12-28 05:28:11 -05:00
Jason Wang	1c96d5457f	vdpa: conditionally fill max max queue pair for stats For the device without multiqueue feature, we will read 0 as max_virtqueue_pairs from the config. So if we fill VDPA_ATTR_DEV_NET_CFG_MAX_VQP with the value we read from the config we will confuse the user. Fixing this by only filling the value when multiqueue is offered by the device so userspace can assume 1 when the attr is not provided. Fixes: 13b00b135665c("vdpa: Add support for querying vendor statistics") Cc: Eli Cohen <elic@nvidia.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220907060110.4511-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>	2022-12-28 05:28:11 -05:00
Rong Wang	ed843d6ed7	vdpa/vp_vdpa: fix kfree a wrong pointer in vp_vdpa_remove In vp_vdpa_remove(), the code kfree(&vp_vdpa_mgtdev->mgtdev.id_table) uses a reference of pointer as the argument of kfree, which is the wrong pointer and then may hit crash like this: Unable to handle kernel paging request at virtual address 00ffff003363e30c Internal error: Oops: 96000004 [#1] SMP Call trace: rb_next+0x20/0x5c ext4_readdir+0x494/0x5c4 [ext4] iterate_dir+0x168/0x1b4 __se_sys_getdents64+0x68/0x170 __arm64_sys_getdents64+0x24/0x30 el0_svc_common.constprop.0+0x7c/0x1bc do_el0_svc+0x2c/0x94 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb4 el0_sync+0x160/0x180 Code: 54000220 f9400441 b4000161 aa0103e0 (f9400821) SMP: stopping secondary CPUs Starting crashdump kernel... Fixes: `ffbda8e9df` ("vdpa/vp_vdpa : add vdpa tool support in vp_vdpa") Signed-off-by: Rong Wang <wangrong68@huawei.com> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Message-Id: <20221207120813.2837529-1-sunnanyong@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cindy Lu <lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:11 -05:00
Harshit Mogalapalli	937c783aa3	vduse: Validate vq_num in vduse_validate_config() Add a limit to 'config->vq_num' which is user controlled data which comes from an vduse_ioctl to prevent large memory allocations. Micheal says - This limit is somewhat arbitrary. However, currently virtio pci and ccw are limited to a 16 bit vq number. While MMIO isn't it is also isn't used with lots of VQs due to current lack of support for per-vq interrupts. Thus, the 0xffff limit on number of VQs corresponding to a 16-bit VQ number seems sufficient for now. This is found using static analysis with smatch. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Message-Id: <20221128155717.2579992-1-harshit.m.mogalapalli@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:11 -05:00
Davidlohr Bueso	81931012bd	tools/virtio: remove smp_read_barrier_depends() This gets rid of the last references to smp_read_barrier_depends() which for the kernel side was removed in v5.9. The serialization required for Alpha is done inside READ_ONCE() instead of having users deal with it. Simply use a full barrier, the architecture does not have rmb in the first place. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Message-Id: <20221128034347.990-3-dave@stgolabs.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>	2022-12-28 05:28:11 -05:00
Davidlohr Bueso	8aeac42d60	tools/virtio: remove stray characters __read_once_size() is not a macro, remove those '/'s. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Message-Id: <20221128034347.990-2-dave@stgolabs.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>	2022-12-28 05:28:11 -05:00
Cindy Lu	e794070af2	vhost_vdpa: fix the crash in unmap a large memory While testing in vIOMMU, sometimes Guest will unmap very large memory, which will cause the crash. To fix this, add a new function vhost_vdpa_general_unmap(). This function will only unmap the memory that saved in iotlb. Call Trace: [ 647.820144] ------------[ cut here ]------------ [ 647.820848] kernel BUG at drivers/iommu/intel/iommu.c:1174! [ 647.821486] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 647.822082] CPU: 10 PID: 1181 Comm: qemu-system-x86 Not tainted 6.0.0-rc1home_lulu_2452_lulu7_vhost+ #62 [ 647.823139] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qem4 [ 647.824365] RIP: 0010:domain_unmap+0x48/0x110 [ 647.825424] Code: 48 89 fb 8d 4c f6 1e 39 c1 0f 4f c8 83 e9 0c 83 f9 3f 7f 18 48 89 e8 48 d3 e8 48 85 c0 75 59 [ 647.828064] RSP: 0018:ffffae5340c0bbf0 EFLAGS: 00010202 [ 647.828973] RAX: 0000000000000001 RBX: ffff921793d10540 RCX: 000000000000001b [ 647.830083] RDX: 00000000080000ff RSI: 0000000000000001 RDI: ffff921793d10540 [ 647.831214] RBP: 0000000007fc0100 R08: ffffae5340c0bcd0 R09: 0000000000000003 [ 647.832388] R10: 0000007fc0100000 R11: 0000000000100000 R12: 00000000080000ff [ 647.833668] R13: ffffae5340c0bcd0 R14: ffff921793d10590 R15: 0000008000100000 [ 647.834782] FS: 00007f772ec90640(0000) GS:ffff921ce7a80000(0000) knlGS:0000000000000000 [ 647.836004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 647.836990] CR2: 00007f02c27a3a20 CR3: 0000000101b0c006 CR4: 0000000000372ee0 [ 647.838107] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 647.839283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 647.840666] Call Trace: [ 647.841437] <TASK> [ 647.842107] intel_iommu_unmap_pages+0x93/0x140 [ 647.843112] __iommu_unmap+0x91/0x1b0 [ 647.844003] iommu_unmap+0x6a/0x95 [ 647.844885] vhost_vdpa_unmap+0x1de/0x1f0 [vhost_vdpa] [ 647.845985] vhost_vdpa_process_iotlb_msg+0xf0/0x90b [vhost_vdpa] [ 647.847235] ? _raw_spin_unlock+0x15/0x30 [ 647.848181] ? _copy_from_iter+0x8c/0x580 [ 647.849137] vhost_chr_write_iter+0xb3/0x430 [vhost] [ 647.850126] vfs_write+0x1e4/0x3a0 [ 647.850897] ksys_write+0x53/0xd0 [ 647.851688] do_syscall_64+0x3a/0x90 [ 647.852508] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 647.853457] RIP: 0033:0x7f7734ef9f4f [ 647.854408] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 76 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c8 [ 647.857217] RSP: 002b:00007f772ec8f040 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 647.858486] RAX: ffffffffffffffda RBX: 00000000fef00000 RCX: 00007f7734ef9f4f [ 647.859713] RDX: 0000000000000048 RSI: 00007f772ec8f090 RDI: 0000000000000010 [ 647.860942] RBP: 00007f772ec8f1a0 R08: 0000000000000000 R09: 0000000000000000 [ 647.862206] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000010 [ 647.863446] R13: 0000000000000002 R14: 0000000000000000 R15: ffffffff01100000 [ 647.864692] </TASK> [ 647.865458] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs v] [ 647.874688] ---[ end trace 0000000000000000 ]--- Cc: stable@vger.kernel.org Fixes: `4c8cf31885` ("vhost: introduce vDPA-based backend") Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221219073331.556140-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-28 05:28:11 -05:00
Dawei Li	c8e82e3877	virtio: Implementing attribute show with sysfs_emit Replace sprintf with sysfs_emit or its variants for their built-in PAGE_SIZE awareness. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Message-Id: <TYCP286MB23232A999FE7DBDF50BA0FAACA0F9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-28 05:28:11 -05:00
Wei Yongjun	b1d65f717c	virtio-crypto: fix memory leak in virtio_crypto_alg_skcipher_close_session() 'vc_ctrl_req' is alloced in virtio_crypto_alg_skcipher_close_session(), and should be freed in the invalid ctrl_status->status error handling case. Otherwise there is a memory leak. Fixes: `0756ad15b1` ("virtio-crypto: use private buffer for control request") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Message-Id: <20221114110740.537276-1-weiyongjun@huaweicloud.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Acked-by: zhenwei pi<pizhenwei@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:10 -05:00
wangjianli	a4722f64f9	tools/virtio: Variable type completion Replace "unsigned" with "unsigned int" Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Message-Id: <20221113070742.48271-1-wangjianli@cdjrlc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-28 05:28:10 -05:00
Stefano Garzarella	794ec498c9	vdpa_sim: fix vringh initialization in vdpasim_queue_ready() When we initialize vringh, we should pass the features and the number of elements in the virtqueue negotiated with the driver, otherwise operations with vringh may fail. This was discovered in a case where the driver sets a number of elements in the virtqueue different from the value returned by .get_vq_num_max(). In vdpasim_vq_reset() is safe to initialize the vringh with default values, since the virtqueue will not be used until vdpasim_queue_ready() is called again. Fixes: `2c53d0f64c` ("vdpasim: vDPA device simulator") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221110141335.62171-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2022-12-28 05:28:10 -05:00
Angus Chen	f4e468f708	virtio_blk: use UINT_MAX instead of -1U We use UINT_MAX to limit max_discard_sectors in virtblk_probe, we can use UINT_MAX to limit max_hw_sectors for consistencies. No functional change intended. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Message-Id: <20221110030124.1986-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-12-28 05:28:10 -05:00
Stefano Garzarella	c070c1912a	vhost-vdpa: fix an iotlb memory leak Before commit `3d56987938` ("vhost-vdpa: introduce asid based IOTLB") we called vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1) during release to free all the resources allocated when processing user IOTLB messages through vhost_vdpa_process_iotlb_update(). That commit changed the handling of IOTLB a bit, and we accidentally removed some code called during the release. We partially fixed this with commit `037d430556` ("vhost-vdpa: call vhost_vdpa_cleanup during the release") but a potential memory leak is still there as showed by kmemleak if the application does not send VHOST_IOTLB_INVALIDATE or crashes: unreferenced object 0xffff888007fbaa30 (size 16): comm "blkio-bench", pid 914, jiffies 4294993521 (age 885.500s) hex dump (first 16 bytes): 40 73 41 07 80 88 ff ff 00 00 00 00 00 00 00 00 @sA............. backtrace: [<0000000087736d2a>] kmem_cache_alloc_trace+0x142/0x1c0 [<0000000060740f50>] vhost_vdpa_process_iotlb_msg+0x68c/0x901 [vhost_vdpa] [<0000000083e8e205>] vhost_chr_write_iter+0xc0/0x4a0 [vhost] [<000000008f2f414a>] vhost_vdpa_chr_write_iter+0x18/0x20 [vhost_vdpa] [<00000000de1cd4a0>] vfs_write+0x216/0x4b0 [<00000000a2850200>] ksys_write+0x71/0xf0 [<00000000de8e720b>] __x64_sys_write+0x19/0x20 [<0000000018b12cbb>] do_syscall_64+0x3f/0x90 [<00000000986ec465>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Let's fix this calling vhost_vdpa_iotlb_unmap() on the whole range in vhost_vdpa_remove_as(). We move that call before vhost_dev_cleanup() since we need a valid v->vdev.mm in vhost_vdpa_pa_unmap(). vhost_iotlb_reset() call can be removed, since vhost_vdpa_iotlb_unmap() on the whole range removes all the entries. The kmemleak log reported was observed with a vDPA device that has `use_va` set to true (e.g. VDUSE). This patch has been tested with both types of devices. Fixes: `037d430556` ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Fixes: `3d56987938` ("vhost-vdpa: introduce asid based IOTLB") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109154213.146789-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:10 -05:00
Stefano Garzarella	98047313cd	vhost: fix range used in translate_desc() vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In translate_desc() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: `a9709d6874` ("vhost: convert pre sorted vhost memory array to interval tree") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109102503.18816-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-28 05:28:10 -05:00
Stefano Garzarella	f85efa9b0f	vringh: fix range used in iotlb_translate() vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In iotlb_translate() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: `9ad9c49cfe` ("vringh: IOTLB support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109102503.18816-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-28 05:28:10 -05:00
Yuan Can	7a4efe182c	vhost/vsock: Fix error handling in vhost_vsock_init() A problem about modprobe vhost_vsock failed is triggered with the following log given: modprobe: ERROR: could not insert 'vhost_vsock': Device or resource busy The reason is that vhost_vsock_init() returns misc_register() directly without checking its return value, if misc_register() failed, it returns without calling vsock_core_unregister() on vhost_transport, resulting the vhost_vsock can never be installed later. A simple call graph is shown as below: vhost_vsock_init() vsock_core_register() # register vhost_transport misc_register() device_create_with_groups() device_create_groups_vargs() dev = kzalloc(...) # OOM happened # return without unregister vhost_transport Fix by calling vsock_core_unregister() when misc_register() returns error. Fixes: `433fc58e6b` ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Yuan Can <yuancan@huawei.com> Message-Id: <20221108101705.45981-1-yuancan@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:10 -05:00
ruanjinjie	aeca7ff254	vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init() Inject fault while probing module, if device_register() fails in vdpasim_net_init() or vdpasim_blk_init(), but the refcount of kobject is not decreased to 0, the name allocated in dev_set_name() is leaked. Fix this by calling put_device(), so that name can be freed in callback function kobject_cleanup(). (vdpa_sim_net) unreferenced object 0xffff88807eebc370 (size 16): comm "modprobe", pid 3848, jiffies 4362982860 (age 18.153s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 6e 65 74 00 6b 6b 6b a5 vdpasim_net.kkk. backtrace: [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 [<ffffffff81731d53>] kstrdup+0x33/0x60 [<ffffffff83a5d421>] kobject_set_name_vargs+0x41/0x110 [<ffffffff82d87aab>] dev_set_name+0xab/0xe0 [<ffffffff82d91a23>] device_add+0xe3/0x1a80 [<ffffffffa0270013>] 0xffffffffa0270013 [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 [<ffffffff813739cb>] do_init_module+0x1ab/0x640 [<ffffffff81379d20>] load_module+0x5d00/0x77f0 [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 [<ffffffff83c4d505>] do_syscall_64+0x35/0x80 [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 (vdpa_sim_blk) unreferenced object 0xffff8881070c1250 (size 16): comm "modprobe", pid 6844, jiffies 4364069319 (age 17.572s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 62 6c 6b 00 6b 6b 6b a5 vdpasim_blk.kkk. backtrace: [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 [<ffffffff81731d53>] kstrdup+0x33/0x60 [<ffffffff83a5d421>] kobject_set_name_vargs+0x41/0x110 [<ffffffff82d87aab>] dev_set_name+0xab/0xe0 [<ffffffff82d91a23>] device_add+0xe3/0x1a80 [<ffffffffa0220013>] 0xffffffffa0220013 [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 [<ffffffff813739cb>] do_init_module+0x1ab/0x640 [<ffffffff81379d20>] load_module+0x5d00/0x77f0 [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 [<ffffffff83c4d505>] do_syscall_64+0x35/0x80 [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: `899c4d187f` ("vdpa_sim_blk: add support for vdpa management tool") Fixes: `a3c06ae158` ("vdpa_sim_net: Add support for user supported devices") Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221110082348.4105476-1-ruanjinjie@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-12-28 05:28:10 -05:00

... 2 3 4 5 6 ...

1153741 commits