linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-11-01 08:58:07 +00:00

Author	SHA1	Message	Date
Juan Vazquez	4eea5332d6	scsi: storvsc: Fix storvsc_queuecommand() memory leak Fix possible memory leak in error path of storvsc_queuecommand() when DMA mapping fails. Signed-off-by: Juan Vazquez <juvazq@linux.microsoft.com> Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lore.kernel.org/r/20220109001758.6401-1-juvazq@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-01-10 12:33:47 +00:00
Ingo Molnar	0422fe2666	Merge branch 'linus' into irq/core, to fix conflict Conflicts: drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2022-01-08 10:53:57 +01:00
Kees Cook	315d049ad1	scsi: megaraid: Avoid mismatched storage type sizes Remove needless use of mbox_t, replacing with just struct mbox_out. Silences compiler warnings under a -Warray-bounds build: drivers/scsi/megaraid.c: In function 'megaraid_probe_one': drivers/scsi/megaraid.c:3615:30: error: array subscript 'mbox_t[0]' is partly outside array bounds of 'unsigned char[15]' [-Werror=array-bounds] 3615 \| mbox->m_out.xferaddr = (u32)adapter->buf_dma_handle; \| ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/megaraid.c:3599:23: note: while referencing 'raw_mbox' 3599 \| unsigned char raw_mbox[sizeof(struct mbox_out)]; \| ^~~~~~~~ Link: https://lore.kernel.org/r/20220105173633.2421129-1-keescook@chromium.org Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: megaraidlinux.pdl@broadcom.com Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-07 09:19:51 -05:00
Xiang Chen	5d9224fb07	scsi: hisi_sas: Remove unused variable and check in hisi_sas_send_ata_reset_each_phy() In commit `29e2bac874` ("scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list"), we use asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list, and it is enough to use asd_sas_port->phy_mask to check the state of phy, so remove the unused check and variable. Link: https://lore.kernel.org/r/1641300126-53574-1-git-send-email-chenxiang66@hisilicon.com Fixes: `29e2bac874` ("scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list") Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Colin King <colin.i.king@gmail.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-07 09:09:39 -05:00
YueHaibing	0bd2fbee9d	scsi: storvsc: Fix unsigned comparison to zero The unsigned variable sg_count is being assigned a return value from the call to scsi_dma_map() that can return -ENOMEM. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20211227040311.54584-1-yuehaibing@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-01-05 12:26:27 +00:00
Minghao Chi	c3b48443ba	scsi: aic79xx: Remove redundant error variable Return the value from ahd_linux_queue_abort_cmd() directly instead of using a redundant variable. Link: https://lore.kernel.org/r/20220104112452.601899-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:55:38 -05:00
Ajish Koshy	ee05cb71f9	scsi: pm80xx: Port reset timeout error handling correction Error handling steps were not in sequence as per the programmers manual. Expected sequence: - PHY_DOWN (PORT_IN_RESET) - PORT_RESET_TIMER_TMO - Host aborts pending I/Os - Host deregister the device - Host sends HW_EVENT_PHY_DOWN ACK Previously we were sending HW_EVENT_PHY_DOWN ACK first and then deregister the device. Fix this to use the expected sequence. Link: https://lore.kernel.org/r/20211228111753.10802-1-Ajish.Koshy@microchip.com Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:52:48 -05:00
Yang Li	3bb3c24e26	scsi: mpi3mr: Fix formatting problems in some kernel-doc comments Remove some warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. drivers/scsi/mpi3mr/mpi3mr_fw.c:2188: warning: Function parameter or member 'reason_code' not described in 'mpi3mr_check_rh_fault_ioc' drivers/scsi/mpi3mr/mpi3mr_fw.c:3650: warning: Excess function parameter 'init_type' description in 'mpi3mr_init_ioc' drivers/scsi/mpi3mr/mpi3mr_fw.c:4177: warning: bad line Link: https://lore.kernel.org/r/20211231082350.19315-1-yang.lee@linux.alibaba.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:34:41 -05:00
Colin Ian King	5867b8569e	scsi: mpi3mr: Fix some spelling mistakes There are some spelling mistakes in some literal strings. Fix them. Link: https://lore.kernel.org/r/20211224175240.1348942-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:33:26 -05:00
Suganath Prabu S	9211faa39a	scsi: mpt3sas: Update persistent trigger pages from sysfs interface Store sysfs-provided trigger values into the corresponding persistent trigger pages. Otherwise trigger entries are not persistent across system reboots. Link: https://lore.kernel.org/r/20211227053055.289537-1-suganath-prabu.subramani@broadcom.com Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:31:39 -05:00
Damien Le Moal	81d3f500ee	scsi: core: Fix scsi_mode_select() interface The modepage argument is unused. Remove it. Link: https://lore.kernel.org/r/20210929091744.706003-3-damien.lemoal@wdc.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:26:08 -05:00
Randy Dunlap	4d516e4952	scsi: aacraid: Fix spelling of "its" Use the possessive "its" instead of the contraction "it's" in user messages. Link: https://lore.kernel.org/r/20211223061119.18304-1-rdunlap@infradead.org Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:23:47 -05:00
Jiasheng Jiang	aa7069d840	scsi: qedf: Fix potential dereference of NULL pointer The return value of dma_alloc_coherent() needs to be checked to avoid use of NULL pointer in case of an allocation failure. Link: https://lore.kernel.org/r/20211216101449.375953-1-jiasheng@iscas.ac.cn Fixes: `61d8658b4a` ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:19:28 -05:00
Linus Torvalds	e46227bf38	SCSI fixes on 20211231 Three fixes, all in drivers. The lpfc one doesn't look exploitable, but nasty things could happen in string operations if mybuf ends up with an on stack unterminated string. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYc8YLiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdUhAQCVmqLx GhEK15Y8etJwMoj03I6hO5gChhQz6kk7pxXAVwD/e5LHrVVeq/WxjUnyrC1gx6sm iYHYbZ0UHotwbRpwU9k= =WAIf -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes, all in drivers. The lpfc one doesn't look exploitable, but nasty things could happen in string operations if mybuf ends up with an on stack unterminated string" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: vmw_pvscsi: Set residual data length conditionally scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown() scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()	2021-12-31 09:22:25 -08:00
Sreekanth Reddy	c77b1f8a8f	scsi: mpi3mr: Bump driver version to 8.0.0.61.0 Update the driver version to newer version format i.e. 8.0.0.61.0. Link: https://lore.kernel.org/r/20211220141159.16117-26-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	243bcc8efd	scsi: mpi3mr: Fixes around reply request queues Set reply queue depth of 1K for B0 and 4K for A0. While freeing the segmented request queues use the actual queue depth that is used while creating them. Link: https://lore.kernel.org/r/20211220141159.16117-25-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	a91603a5d5	scsi: mpi3mr: Enhanced Task Management Support Reply handling Enhance driver to consider MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a success for TMs issued by it and check the pending I/Os to decide the success or failure of the task management requests instead of just considering the MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a failure of the task management request. Link: https://lore.kernel.org/r/20211220141159.16117-24-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	c86651345c	scsi: mpi3mr: Use TM response codes from MPI3 headers Remove locally defined TM response codes and use codes from MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-23-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	afd3a5793f	scsi: mpi3mr: Add io_uring interface support in I/O-polled mode Add support for the io_uring interface in I/O-polled mode. This feature is disabled in the driver by default. To enable the feature, a module parameter "poll_queues" has to be set with the desired number of polling queues. When the feature is enabled, the driver reserves a certain number of operational queue pairs for the poll_queues either from the available queue pairs or creates additional queue pairs based on the operational queue availability. The Polling queues will have corresponding IRQ and ISR functions as similar to default queues. However, the IRQ line is disabled by the driver for poll_queues. Link: https://lore.kernel.org/r/20211220141159.16117-22-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	95cca8d554	scsi: mpi3mr: Print cable mngnt and temp threshold events Print cable management & temperature threshold event data. Use vendor id & device id macro definitions from MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-21-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	78b76a0768	scsi: mpi3mr: Support Prepare for Reset event The IOC sends a Prepare for Reset Event to the host to prepare for a Soft Reset. This event data has two reason codes: 1. Start - The host is expected to gracefully quiesce all I/O within approximately 1 second. 2. Abort - The IOC is requesting to abort a previous Prepare for Reset Event request. Normal I/O may be resumed. Link: https://lore.kernel.org/r/20211220141159.16117-20-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c1af985d27	scsi: mpi3mr: Add Event acknowledgment logic Add Event acknowledgment logic. Link: https://lore.kernel.org/r/20211220141159.16117-19-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c5758fc72b	scsi: mpi3mr: Gracefully handle online FW update operation Enhance driver to gracefully handle discrepancies in certain key data sizes between firmware update operations as mentioned below: - The driver displays an error message and marks the controller as unrecoverable if the firmware reports ReplyFrameSize that is greater than the current ReplyFrameSize. - If the firmware reports ReplyFrameSize greater than the current ReplyFrameSize then the driver uses the current ReplyFrameSize while copying the reply messages. - The driver displays an error message and marks the controller as unrecoverable if the firmware reports MaxOperationalReplyQueues less than the currently allocated operational reply queues count. - If the firmware reports MaxOperationalReplyQueues that is greater than the currently allocated operational reply queue count then the driver ignores the new increased value and uses the previously allocated number of operational queues only. - If the firmware reports MaxDevHandle greater than the previously used MaxDevHandle value after a reset then the driver re-allocates the 'device remove pending bitmap' buffer with the newer size using krealloc(). Link: https://lore.kernel.org/r/20211220141159.16117-18-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	b64845a7d4	scsi: mpi3mr: Detect async reset that occurred in firmware Detect asynchronous reset that occurred in the firmware by polling for reset history bit of IOC status register is set and if that bit is set, then the driver waits for the controller to become ready and then re-initializes the controller. Also reduce the time driver is waiting for the controller to acknowledge the reset action after issuing a specific reset action to the controller. The wait time is reduced from 510 seconds to 30 seconds. If the controller didn't acknowledge a specific reset action within the time interval then the driver marks the controller as unrecoverable instead of retrying two more times prior to giving up. Link: https://lore.kernel.org/r/20211220141159.16117-17-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c0b00a931e	scsi: mpi3mr: Add IOC reinit function Add IOC reinitialization function. Link: https://lore.kernel.org/r/20211220141159.16117-16-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	fe6db61515	scsi: mpi3mr: Handle offline FW activation in graceful manner Currently the driver marks the controller as unrecoverable if there is an asynchronous reset or fault during the initialization, reinitialization post reset, and OS resume. Enhance driver to retry the initialization, re-initialization, and resume sequences for a maximum of 3 times if the controller became faulty or asynchronously reset due to a firmware activation during the initialization sequence. Link: https://lore.kernel.org/r/20211220141159.16117-15-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	59bd9cfe3f	scsi: mpi3mr: Code refactor of IOC init - part2 Move the IOC initialization's bring up logic to mpi3mr_bring_ioc_ready() routine. Link: https://lore.kernel.org/r/20211220141159.16117-14-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	e3605f65ef	scsi: mpi3mr: Code refactor of IOC init - part1 Separate out reply and sense buffer allocation and initialization into two routines and call only initialization routine while issuing the IOC Init request message. Also move out the event enable logic to a separate function. Link: https://lore.kernel.org/r/20211220141159.16117-13-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	a6856cc450	scsi: mpi3mr: Fault IOC when internal command gets timeout Save snapdump and fault the controller with the given reason code if it is already not in the fault or not in asynchronous reset. This ensures that soft reset is issued from the watchdog thread. This will also be used to handle initialization time faults/resets/timeout as in those cases immediate soft reset invocation is not required. Link: https://lore.kernel.org/r/20211220141159.16117-12-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	2ac794baae	scsi: mpi3mr: Display IOC firmware package version Display IOC firmware package version by reading component image upload data. Link: https://lore.kernel.org/r/20211220141159.16117-11-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	13fd7b1555	scsi: mpi3mr: Handle unaligned PLL in unmap cmnds The following special handling is needed for UNMAP commands issued to NVMe drives: - On B0 boards, if the parameter list length is greater than 24 and not a 16-byte multiple, then truncate the parameter list length to a 16-byte multiple. - On A0 boards, if the parameter list length is greater than block descriptor data length + 8, then truncate the parameter list length to block descriptor data length + 8 value. Link: https://lore.kernel.org/r/20211220141159.16117-10-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	4f08b9637f	scsi: mpi3mr: Increase internal cmnds timeout to 60s - Increase internal command timeout to 60 seconds. - Enable 16 device removal handshake processing in parallel in the device removal handshake infrastructure. Link: https://lore.kernel.org/r/20211220141159.16117-9-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	ba68779a51	scsi: mpi3mr: Do access status validation before adding devices Add validation for various access statuses prior to exposing attached target device to the operating system. Link: https://lore.kernel.org/r/20211220141159.16117-8-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	17d6b9cf89	scsi: mpi3mr: Add support for PCIe Managed Switch SES device The SAS4 Controller firmware exposes the SES devices in Managed PCIe Switch as a PCIe Device Type SCSI Device (MPI3_DEVICE0_PCIE_DEVICE_INFO_TYPE_SCSI_DEVICE). Driver is enhanced to handle this device type by: - Exposing the device to the upper layers and - Not updating any hardware sectors & virtual boundary settings as these settings are needed only for NVMe devices. Link: https://lore.kernel.org/r/20211220141159.16117-7-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	ec5ebd2c14	scsi: mpi3mr: Update MPI3 headers - part2 Continued updating MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-6-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	d00ff7c311	scsi: mpi3mr: Update MPI3 headers - part1 Update MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-5-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	fbaa9aa48b	scsi: mpi3mr: Don't reset IOC if cmnds flush with reset status Don't issue the soft reset if internal commands are flushed out with reset status. Soft reset needs to be issued only if commands are really timed out. Link: https://lore.kernel.org/r/20211220141159.16117-4-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Sreekanth Reddy	a83ec831b2	scsi: mpi3mr: Replace spin_lock() with spin_lock_irqsave() Use spin_lock_irqsave() instead of spin_lock() while acquiring reply_free_queue_lock & sbq_lock locks. Link: https://lore.kernel.org/r/20211220141159.16117-3-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Sreekanth Reddy	9cf0666f34	scsi: mpi3mr: Add debug APIs based on logging_level bits Add debug print functions which will print messages based on logging_level bits enabled. Link: https://lore.kernel.org/r/20211220141159.16117-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Christoph Hellwig	657b44d651	scsi: pmcraid: Don't use GFP_DMA in pmcraid_alloc_sglist() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222092247.928711-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:44:12 -05:00
Christoph Hellwig	1964777e10	scsi: snic: Don't use GFP_DMA in snic_queue_report_tgt_req() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222092048.925829-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:43:23 -05:00
Christoph Hellwig	0298b7daf8	scsi: myrs: Don't use GFP_DMA The myrs devices supports 64-bit addressing, so remove the spurious GFP_DMA allocations. Link: https://lore.kernel.org/r/20211222091935.925624-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:42:53 -05:00
Christoph Hellwig	27363ba89f	scsi: myrb: Don't use GFP_DMA in myrb_pdev_slave_alloc() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222091801.924745-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:42:23 -05:00
Christoph Hellwig	c981e9e0f8	scsi: initio: Don't use GFP_DMA in initio_probe_one() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222091630.922788-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:41:53 -05:00
Christoph Hellwig	d94d94969a	scsi: sr: Don't use GFP_DMA The allocated buffers are used as a command payload, for which the block layer and/or DMA API do the proper bounce buffering if needed. Link: https://lore.kernel.org/r/20211222090842.920724-1-hch@lst.de Reported-by: Baoquan He <bhe@redhat.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:41:13 -05:00
Christoph Hellwig	bc7806b395	scsi: ch: Don't use GFP_DMA The allocated buffers are used as a command payload, for which the block layer and/or DMA API do the proper bounce buffering if needed. Link: https://lore.kernel.org/r/20211222090311.916624-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:40:02 -05:00
Xiang Chen	b4cc094922	scsi: hisi_sas: Use autosuspend for the host controller The controller may frequently enter and exit suspend for each I/O which we need to deal with. This is inefficient and may cause too much suspend and resume activity for the controller. To avoid this, use a default 5s autosuspend for the controller to stop frequently suspending and resuming. This value may still be modified via sysfs interfaces. Link: https://lore.kernel.org/r/1639999298-244569-16-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	307d9f49cc	scsi: libsas: Keep host active while processing events Processing events such as PORTE_BROADCAST_RCVD may cause dependency issues for runtime power management support. Such a problem would be that handling a PORTE_BROADCAST_RCVD event requires that the host is resumed to send SMP commands. However, in resuming the host, the phyup events generated from re-enabling the phys are processed in the same workqueue as the original PORTE_BROADCAST_RCVD event. As such, the host will never finish resuming (as it waits for the phyup event processing), and then the PORTE_BROADCAST_RCVD event can't be processed as the SMP commands are blocked, and so we have a deadlock. Solve this problem by ensuring that libsas keeps the host active until completely finished phy or port events, such as PORTE_BYTES_DMAED. As such, we don't have to worry about resuming the host for processing individual SMP commands in this example. Link: https://lore.kernel.org/r/1639999298-244569-15-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	ae9b69e85e	scsi: hisi_sas: Keep controller active between ISR of phyup and the event being processed It is possible that controller may become suspended between processing a phyup interrupt and the event being processed by libsas. As such, we can't ensure the controller is active when processing the phyup event - this may cause the phyup event to be lost or other issues. To avoid any possible issues, add pm_runtime_get_noresume() in phyup interrupt handler and pm_runtime_put_sync() in the work handler exit to ensure that we stay always active. Since we only want to call pm_runtime_get_noresume() for v3 hw, signal this will a new event, HISI_PHYE_PHY_UP_PM. Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	bf19aea460	scsi: libsas: Defer works of new phys during suspend During the processing of event PORT_BYTES_DMAED, the driver queues work DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q. If a new phyup event occurs during resuming the controller, the work PORTE_BYTES_DMAED of new phy occurs before suspended phy's. The work DISCE_DISCOVER_DOMAIN of new phy requires an active SAS controller (it needs to resume SAS controller by function scsi_sysfs_add_sdev() and some other functions such as function add_device_link()). However, the activation of the SAS controller requires completion of work PORTE_BYTES_DMAED of suspended phys while it is blocked by new phy's work on ha->event_q. So there is a deadlock and it is released only after resume timeout. To solve the issue, defer works of new phys during suspend and queue those defer works after SAS controller becomes active. Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	1bc35475c6	scsi: libsas: Refactor sas_queue_deferred_work() In the second part of function __sas_drain_work(), deferred work is queued. This functionality is required other places so factor it out into the function sas_queue_deferred_work(). Link: https://lore.kernel.org/r/1639999298-244569-12-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	4ea775abbb	scsi: libsas: Add flag SAS_HA_RESUMING Add a flag SAS_HA_RESUMING and use it to indicate the state of resuming the host controller. Link: https://lore.kernel.org/r/1639999298-244569-11-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	0da7ca4c4f	scsi: libsas: Resume host while sending SMP I/Os When sending SMP I/Os to the host we need to ensure that the host is not suspended and can process the commands. This is a better approach than replying on the host to resume itself to handle such commands. Use pm_runtime_get_sync() and pm_runtime_put_sync() calls for the host when executing SMP I/Os. Link: https://lore.kernel.org/r/1639999298-244569-10-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	97f4100939	scsi: hisi_sas: Add more logs for runtime suspend/resume Add some logs at the beginning and end of suspend/resume. Link: https://lore.kernel.org/r/1639999298-244569-9-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	e31e18128e	scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host If a new disk is inserted through an expander when the host was suspended, it will not necessarily be detected as the topology is not re-scanned during resume. To detect possible changes in topology during suspension, insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a revalidation. Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	133b688b2d	scsi: mvsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list phy_list_lock is not held when using asd_sas_port->phy_list in the mvsas driver. Add spin_lock/unlock in those places. Link: https://lore.kernel.org/r/1639999298-244569-7-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Xiang Chen	29e2bac874	scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list Most places that use asd_sas_port->phy_list are protected by spinlock asd_sas_port->phy_list_lock, however there are still some places which miss grabbing the lock. Add it in function hisi_sas_refresh_port_id() when accessing asd_sas_port->phy_list. This carries a risk that list mutates while at the same time dropping the lock in function hisi_sas_send_ata_reset_each_phy(). Read asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list to avoid this risk. Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Xiang Chen	42159d3c8d	scsi: libsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list Most places that use asd_sas_port->phy_list in libsas are protected by spinlock asd_sas_port->phy_list_lock. However, there are still a few places which miss the lock. Add it in those places. Link: https://lore.kernel.org/r/1639999298-244569-5-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Alan Stern	6e1fcab00a	scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() John Garry reported a deadlock that occurs when trying to access a runtime-suspended SATA device. For obscure reasons, the rescan procedure causes the link to be hard-reset, which disconnects the device. The rescan tries to carry out a runtime resume when accessing the device. scsi_rescan_device() holds the SCSI device lock and won't release it until it can put commands onto the device's block queue. This can't happen until the queue is successfully runtime-resumed or the device is unregistered. But the runtime resume fails because the device is disconnected, and __scsi_remove_device() can't do the unregistration because it can't get the device lock. The best way to resolve this deadlock appears to be to allow the block queue to start running again even after an unsuccessful runtime resume. The idea is that the driver or the SCSI error handler will need to be able to use the queue to resolve the runtime resume failure. This patch removes the err argument to blk_post_runtime_resume() and makes the routine act as though the resume was successful always. This fixes the deadlock. Link: https://lore.kernel.org/r/1639999298-244569-4-git-send-email-chenxiang66@hisilicon.com Fixes: `e27829dc92` ("scsi: serialize ->rescan against ->remove") Reported-and-tested-by: John Garry <john.garry@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
John Garry	6cc7390877	scsi: Revert "scsi: hisi_sas: Filter out new PHY up events during suspend" This reverts commit `b14a37e011`. In that commit, we had to filter out phy-up events during suspend, as it work cause a deadlock between processing the phyup event and the resume HA function try to drain the HA event workqueue to complete the resume process. Now that we no longer try to drain the HA event queue during the HA resume processor, the deadlock would not occur, so remove the special handling for it. Link: https://lore.kernel.org/r/1639999298-244569-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
John Garry	fbefe22811	scsi: libsas: Don't always drain event workqueue for HA resume For the hisi_sas driver, if a directly attached disk is removed during suspend, a hang will occur in the resume process: The background is that in commit `16fd4a7c59` ("scsi: hisi_sas: Add device link between SCSI devices and hisi_hba"), it is ensured that the HBA device cannot be runtime suspended when any SCSI device associated is active. Other drivers which use libsas don't worry about this as none support runtime suspend. The mentioned hang occurs when an disk is removed during suspend. In the removal process - from PHYE_RESUME_TIMEOUT event processing - we call into scsi_remove_device(), which is being processed in the HA event workqueue. Here we wait for all suppliers of the SCSI device to resume, which includes the HBA device (from the above commit). However the HBA device cannot resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from calling sas_resume_ha() -> sas_drain_work()). This is the deadlock. There does not appear to be any need for the sas_drain_work() to be called at all in sas_resume_ha() as it is not syncing against anything, so allow LLDDs to avoid this by providing a variant of sas_resume_ha() which does "sync", i.e. doesn't drain the event workqueue. Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Alexey Makhalov	142c779d05	scsi: vmw_pvscsi: Set residual data length conditionally The PVSCSI implementation in the VMware hypervisor under specific configuration ("SCSI Bus Sharing" set to "Physical") returns zero dataLen in the completion descriptor for READ CAPACITY(16). As a result, the kernel can not detect proper disk geometry. This can be recognized by the kernel message: [ 0.776588] sd 1:0:0:0: [sdb] Sector size 0 reported, assuming 512. The PVSCSI implementation in QEMU does not set dataLen at all, keeping it zeroed. This leads to a boot hang as was reported by Shmulik Ladkani. It is likely that the controller returns the garbage at the end of the buffer. Residual length should be set by the driver in that case. The SCSI layer will erase corresponding data. See commit `bdb2b8cab4` ("[SCSI] erase invalid data returned by device") for details. Commit `e662502b3a` ("scsi: vmw_pvscsi: Set correct residual data length") introduced the issue by setting residual length unconditionally, causing the SCSI layer to erase the useful payload beyond dataLen when this value is returned as 0. As a result, considering existing issues in implementations of PVSCSI controllers, we do not want to call scsi_set_resid() when dataLen == 0. Calling scsi_set_resid() has no effect if dataLen equals buffer length. Link: https://lore.kernel.org/lkml/20210824120028.30d9c071@blondie/ Link: https://lore.kernel.org/r/20211220190514.55935-1-amakhalov@vmware.com Fixes: `e662502b3a` ("scsi: vmw_pvscsi: Set correct residual data length") Cc: Matt Wang <wwentao@vmware.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Vishal Bhakta <vbhakta@vmware.com> Cc: VMware PV-Drivers <pv-drivers@vmware.com> Cc: James E.J. Bottomley <jejb@linux.ibm.com> Cc: linux-scsi@vger.kernel.org Cc: stable@vger.kernel.org Reported-and-suggested-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Alexey Makhalov <amakhalov@vmware.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:17:27 -05:00
Lixiaokeng	1b8d0300a3	scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown() \|- iscsi_if_destroy_conn \|-dev_attr_show \|-iscsi_conn_teardown \|-spin_lock_bh \|-iscsi_sw_tcp_conn_get_param \|-kfree(conn->persistent_address) \|-iscsi_conn_get_param \|-kfree(conn->local_ipaddr) ==>\|-read persistent_address ==>\|-read local_ipaddr \|-spin_unlock_bh When iscsi_conn_teardown() and iscsi_conn_get_param() happen in parallel, a UAF may be triggered. Link: https://lore.kernel.org/r/046ec8a0-ce95-d3fc-3235-666a7c65b224@huawei.com Reported-by: Lu Tixiong <lutianxiong@huawei.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Lixiaokeng <lixiaokeng@huawei.com> Signed-off-by: Linfeilong <linfeilong@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:03:59 -05:00
Tianyu Lan	743b237c3a	scsi: storvsc: Add Isolation VM support for storvsc driver In Isolation VM, all shared memory with host needs to mark visible to host via hvcall. vmbus_establish_gpadl() has already done it for storvsc rx/tx ring buffer. The page buffer used by vmbus_sendpacket_ mpb_desc() still needs to be handled. Use DMA API(scsi_dma_map/unmap) to map these memory during sending/receiving packet and return swiotlb bounce buffer dma address. In Isolation VM, swiotlb bounce buffer is marked to be visible to host and the swiotlb force mode is enabled. Set device's dma min align mask to HV_HYP_PAGE_SIZE - 1 in order to keep the original data offset in the bounce buffer. Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20211213071407.314309-5-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2021-12-20 18:01:09 +00:00
Linus Torvalds	5d65f6f3df	SCSI fixes on 20211217 One driver fix: the pm8001 has never actually worked on a system with an IOMMU and this fixes that use case. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYb0E6yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSNgAPsFHTM+ 5uvAtK7W5pNQHnG3DaBPwED7LuklKmNfxH0c2AEA2BE7ijAda1DOrYC61BKQJtGm 8W+shJ/O0/mJSYqcCbQ= =J2/2 -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "One driver fix: the pm8001 has never actually worked on a system with an IOMMU and this fixes that use case" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: pm8001: Fix phys_to_virt() usage on dma_addr_t	2021-12-17 13:55:03 -08:00
John Garry	4be6181fea	scsi: libsas: Decode SAM status and host byte codes Value 0 is used for SAM status and libsas exec_status bytes codes in sas_end_task() - use defined macros instead. In addition, change to proper enum types. Also replace SAM_STAT_CHECK_CONDITION with SAS_SAM_STAT_CHECK_CONDITION, the former being a proper member of enum exec_status. Link: https://lore.kernel.org/r/1639579061-179473-9-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:58 -05:00
Qi Liu	37310bad7f	scsi: hisi_sas: Fix phyup timeout on FPGA The OOB interrupt and phyup interrupt handlers may run out-of-order in high CPU usage scenarios. Since the hisi_sas_phy.timer is added in hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this out-of-order execution will cause hisi_sas_phy.timer timeout to trigger. To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure that the timer won't be added after phyup handler completes. Link: https://lore.kernel.org/r/1639579061-179473-8-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:58 -05:00
Qi Liu	16775db613	scsi: hisi_sas: Prevent parallel FLR and controller reset If we issue a controller reset command during executing a FLR a hung task may be found: Call trace: __switch_to+0x158/0x1cc __schedule+0x2e8/0x85c schedule+0x7c/0x110 schedule_timeout+0x190/0x1cc __down+0x7c/0xd4 down+0x5c/0x7c hisi_sas_task_exec+0x510/0x680 [hisi_sas_main] hisi_sas_queue_command+0x24/0x30 [hisi_sas_main] smp_execute_task_sg+0xf4/0x23c [libsas] sas_smp_phy_control+0x110/0x1e0 [libsas] transport_sas_phy_reset+0xc8/0x190 [libsas] phy_reset_work+0x2c/0x40 [libsas] process_one_work+0x1dc/0x48c worker_thread+0x15c/0x464 kthread+0x160/0x170 ret_from_fork+0x10/0x18 This is a race condition which occurs when the FLR completes first. Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held, so now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem . Link: https://lore.kernel.org/r/1639579061-179473-7-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:58 -05:00
Qi Liu	20c634932a	scsi: hisi_sas: Prevent parallel controller reset and control phy command A user may issue a control phy command from sysfs at any time, even if the controller is resetting. If a phy is disabled by hardreset/linkreset command before calling get_phys_state() in the reset path, the saved phy state may be incorrect. To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure that the controller reset may not run at the same time as when the phy control function is running. Link: https://lore.kernel.org/r/1639579061-179473-6-git-send-email-john.garry@huawei.com Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:57 -05:00
John Garry	dc313f6b12	scsi: hisi_sas: Factor out task prep and delivery code The task prep code is the same between the normal path (in hisi_sas_task_prep()) and the internal abort path, so factor is out into a common function. Link: https://lore.kernel.org/r/1639579061-179473-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:57 -05:00
John Garry	08c61b5d90	scsi: hisi_sas: Pass abort structure for internal abort To help factor out code in future, it's useful to know if we're executing an internal abort, so pass a pointer to the structure. The idea is that a NULL pointer means not an internal abort. Link: https://lore.kernel.org/r/1639579061-179473-4-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:57 -05:00
John Garry	934385a4fd	scsi: hisi_sas: Make internal abort have no task proto For an internal abort, the task does not have a protocol, so set to none. This will make it easier to differentiate internal abort tasks in future. Link: https://lore.kernel.org/r/1639579061-179473-3-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:57 -05:00
John Garry	0e4620856b	scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly Currently we start delivery of commands to the DQ after returning from hisi_sas_task_exec() with success. Let's just start delivery directly in that function without having to check if some local variable is set. Link: https://lore.kernel.org/r/1639579061-179473-2-git-send-email-john.garry@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:57 -05:00
Christoph Hellwig	efac162a4e	scsi: efct: Don't pass GFP_DMA to dma_alloc_coherent() dma_alloc_coherent() ignores the zone specifiers so this is pointless and confusing. Link: https://lore.kernel.org/r/20211214163605.416288-1-hch@lst.de Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:52:29 -05:00
Bean Huo	99c66a8868	scsi: ufs: core: Fix deadlock issue in ufshcd_wait_for_doorbell_clr() Call shost_for_each_device() with holding host->host_lock will cause a deadlock situation, which will cause the system to stall (the log as follow). Fix this issue by using __shost_for_each_device() in ufshcd_pending_cmds(). stalls on CPUs/tasks: all trace: __switch_to+0x120/0x170 0xffff800011643998 ask dump for CPU 5: ask:kworker/u16:2 state:R running task stack: 0 pid: 80 ppid: 2 flags:0x0000000a orkqueue: events_unbound async_run_entry_fn all trace: __switch_to+0x120/0x170 0x0 ask dump for CPU 6: ask:kworker/u16:6 state:R running task stack: 0 pid: 164 ppid: 2 flags:0x0000000a orkqueue: events_unbound async_run_entry_fn all trace: __switch_to+0x120/0x170 0xffff54e7c4429f80 ask dump for CPU 7: ask:kworker/u16:4 state:R running task stack: 0 pid: 153 ppid: 2 flags:0x0000000a orkqueue: events_unbound async_run_entry_fn all trace: __switch_to+0x120/0x170 blk_mq_run_hw_queue+0x34/0x110 blk_mq_sched_insert_request+0xb0/0x120 blk_execute_rq_nowait+0x68/0x88 blk_execute_rq+0x4c/0xd8 __scsi_execute+0xec/0x1d0 scsi_vpd_inquiry+0x84/0xf0 scsi_get_vpd_buf+0x34/0xb8 scsi_attach_vpd+0x34/0x140 scsi_probe_and_add_lun+0xa6c/0xab8 __scsi_scan_target+0x438/0x4f8 scsi_scan_channel+0x6c/0xa8 scsi_scan_host_selected+0xf0/0x150 do_scsi_scan_host+0x88/0x90 scsi_scan_host+0x1b4/0x1d0 ufshcd_async_scan+0x248/0x310 async_run_entry_fn+0x30/0x178 process_one_work+0x1e8/0x368 worker_thread+0x40/0x478 kthread+0x174/0x180 ret_from_fork+0x10/0x20 Link: https://lore.kernel.org/r/20211214120537.531628-1-huobean@gmail.com Fixes: `8d077ede48` ("scsi: ufs: Optimize the command queueing code") Reported-by: YongQin Liu <yongqin.liu@linaro.org> Reported-by: Amit Pundir <amit.pundir@linaro.org> Tested-by: John Stultz <john.stultz@linaro.org> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:51:13 -05:00
Hannes Reinecke	baea0e833f	scsi: qla2xxx: Synchronize rport dev_loss_tmo setting Currently, the dev_loss_tmo setting is only ever used for SCSI devices. This patch reshuffles initialisation such that the SCSI remote ports are registered before the NVMe ones, allowing the dev_loss_tmo setting to be synchronized between SCSI and NVMe. Link: https://lore.kernel.org/r/20211214111139.52503-1-dwagner@suse.de Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:48:33 -05:00
Dan Carpenter	9020be114a	scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write() The "mybuf" string comes from the user, so we need to ensure that it is NUL terminated. Link: https://lore.kernel.org/r/20211214070527.GA27934@kili Fixes: `bd2cdd5e40` ("scsi: lpfc: NVME Initiator: Add debugfs support") Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:46:30 -05:00
Martin K. Petersen	87f77d37d3	Merge branch '5.16/scsi-fixes' into 5.17/scsi-staging Pull in the 5.16 fixes branch to resolve a conflict in the UFS driver core. Conflicts: drivers/scsi/ufs/ufshcd.c Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:38:19 -05:00
Christophe JAILLET	8c2d045515	scsi: hpsa: Remove an unused variable in hpsa_update_scsi_devices() 'lunzerobits' is unused. Remove it. This a left over of commit `2d62a33e05` ("hpsa: eliminate fake lun0 enclosures") Link: https://lore.kernel.org/r/9f80ea569867b5f7ae1e0f99d656e5a8bacad34e.1639084205.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-13 23:34:01 -05:00
Kees Cook	c167dd0b2a	scsi: lpfc: Use struct_group to isolate cast to larger object When building under -Warray-bounds, a warning is generated when casting a u32 into MAILBOX_t (which is larger). This warning is conservative, but it's not an unreasonable change to make to improve future robustness. Use a tagged struct_group that can refer to either the specific fields or the first u32 separately, silencing this warning: drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_reset_barrier': drivers/scsi/lpfc/lpfc_sli.c:4787:29: error: array subscript 'MAILBOX_t[0]' is partly outside array bounds of 'volatile uint32_t[1]' {aka 'volatile unsigned int[1]'} [-Werror=array-bounds] 4787 \| ((MAILBOX_t *)&mbox)->mbxCommand = MBX_KILL_BOARD; \| ^~ drivers/scsi/lpfc/lpfc_sli.c:4752:27: note: while referencing 'mbox' 4752 \| volatile uint32_t mbox; \| ^~~~ There is no change to the resulting executable instruction code. Link: https://lore.kernel.org/r/20211203223351.107323-1-keescook@chromium.org Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-13 23:31:40 -05:00
Kees Cook	532adda9f4	scsi: lpfc: Use struct_group() to initialize struct lpfc_cgn_info In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Add struct_group() to mark "stat" region of struct lpfc_cgn_info that should be initialized to zero, and refactor the "data" region memset() to wipe everything up to the cgn_stats region. Link: https://lore.kernel.org/r/20211208195957.1603092-1-keescook@chromium.org Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-13 23:30:30 -05:00
John Garry	2fe2434392	scsi: pm8001: Fix phys_to_virt() usage on dma_addr_t The driver supports a "direct" mode of operation, where the SMP req frame is directly copied into the command payload (and vice-versa for the SMP resp). To get at the SMP req frame data in the scatterlist the driver uses phys_to_virt() on the DMA mapped memory dma_addr_t . This is broken, and subsequently crashes as follows when an IOMMU is enabled: Unable to handle kernel paging request at virtual address ffff0000fcebfb00 ... pc : pm80xx_chip_smp_req+0x2d0/0x3d0 lr : pm80xx_chip_smp_req+0xac/0x3d0 pm80xx_chip_smp_req+0x2d0/0x3d0 pm8001_task_exec.constprop.0+0x368/0x520 pm8001_queue_command+0x1c/0x30 smp_execute_task_sg+0xdc/0x204 sas_discover_expander.part.0+0xac/0x6cc sas_discover_root_expander+0x8c/0x150 sas_discover_domain+0x3ac/0x6a0 process_one_work+0x1d0/0x354 worker_thread+0x13c/0x470 kthread+0x17c/0x190 ret_from_fork+0x10/0x20 Code: 371806e1 910006d6 6b16033f 54000249 (38766b05) ---[ end trace b91d59aaee98ea2d ]--- note: kworker/u192:0[7] exited with preempt_count 1 Instead use kmap_atomic(). -- Difference to v1: - use kmap_atomic() in both locations Difference to v2: - add whitespace around arithmetic (Damien) Link: https://lore.kernel.org/r/1639390248-213603-1-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-13 23:19:58 -05:00
Linus Torvalds	a763d5a5ab	SCSI fixes on 20211211 Four fixes, all in drivers. Three are small and obvious, the qedi one is a bit larger but also pretty obvious. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYbUihiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfqeAQDSVbQd T9SwcBLKWhElkw57urT2FLEx4fVFhsPHeCsImwD/ZLk3zq7Hu1jc/I5F1NQhb8Ux mRy0PZEMf8XKzaglYYk= =zqyN -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four fixes, all in drivers. Three are small and obvious, the qedi one is a bit larger but also pretty obvious" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Format log strings only if needed scsi: scsi_debug: Fix buffer size of REPORT ZONES command scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issue scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()	2021-12-11 16:28:27 -08:00
Nitesh Narayan Lal	ce5a58a96c	scsi: lpfc: Use irq_set_affinity() The driver uses irq_set_affinity_hint to set the affinity for the lpfc interrupts to a mask corresponding to the local NUMA node to avoid performance overhead on AMD architectures. However, irq_set_affinity_hint() setting the affinity is an undocumented side effect that this function also sets the affinity under the hood. To remove this side effect irq_set_affinity_hint() has been marked as deprecated and new interfaces have been introduced. Also, as per the commit `dcaa213679` ("scsi: lpfc: Change default IRQ model on AMD architectures"): "On AMD architecture, revert the irq allocation to the normal style (non-managed) and then use irq_set_affinity_hint() to set the cpu affinity and disable user-space rebalancing." we don't really need to set the affinity_hint as user-space rebalancing for the lpfc interrupts is not desired. Hence, replace the irq_set_affinity_hint() with irq_set_affinity() which only applies the affinity for the interrupts. Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: James Smart <jsmart2021@gmail.com> Link: https://lore.kernel.org/r/20210903152430.244937-12-nitesh@redhat.com	2021-12-10 20:47:39 +01:00
Nitesh Narayan Lal	fdb8ed13a7	scsi: mpt3sas: Use irq_set_affinity_and_hint() The driver uses irq_set_affinity_hint() specifically for the high IOPS queue interrupts for two purposes: - To set the affinity_hint which is consumed by the userspace for distributing the interrupts - To apply an affinity that it provides The driver enforces its own affinity to bind the high IOPS queue interrupts to the local NUMA node. However, irq_set_affinity_hint() applying the provided cpumask as an affinity (if not NULL) for the interrupt is an undocumented side effect. To remove this side effect irq_set_affinity_hint() has been marked as deprecated and new interfaces have been introduced. Hence, replace the irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint() where the provided mask needs to be applied as the affinity and affinity_hint pointer needs to be set and replace with irq_update_affinity_hint() where only affinity_hint needs to be updated. Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20210903152430.244937-6-nitesh@redhat.com	2021-12-10 20:47:38 +01:00
Nitesh Narayan Lal	8049da6f39	scsi: megaraid_sas: Use irq_set_affinity_and_hint() The driver uses irq_set_affinity_hint() specifically for the high IOPS queue interrupts for two purposes: - To set the affinity_hint which is consumed by the userspace for distributing the interrupts - To apply an affinity that it provides The driver enforces its own affinity to bind the high IOPS queue interrupts to the local NUMA node. However, irq_set_affinity_hint() applying the provided cpumask as an affinity for the interrupt is an undocumented side effect. To remove this side effect irq_set_affinity_hint() has been marked as deprecated and new interfaces have been introduced. Hence, replace the irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint() where the provided mask needs to be applied as the affinity and affinity_hint pointer needs to be set and replace with irq_update_affinity_hint() where only affinity_hint needs to be updated. Change the megasas_set_high_iops_queue_affinity_hint function name to megasas_set_high_iops_queue_affinity_and_hint to clearly indicate that the function is setting both affinity and affinity_hint. Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20210903152430.244937-5-nitesh@redhat.com	2021-12-10 20:47:38 +01:00
Roman Bolshakov	69002c8ce9	scsi: qla2xxx: Format log strings only if needed Commit `598a90f200` ("scsi: qla2xxx: add ring buffer for tracing debug logs") introduced unconditional log string formatting to ql_dbg() even if ql_dbg_log event is disabled. It harms performance because some strings are formatted in fastpath and/or interrupt context. Link: https://lore.kernel.org/r/20211112145446.51210-1-r.bolshakov@yadro.com Fixes: `598a90f200` ("scsi: qla2xxx: add ring buffer for tracing debug logs") Cc: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:38:17 -05:00
James Smart	4437503bfb	scsi: lpfc: Update lpfc version to 14.0.0.4 Update lpfc version to 14.0.0.4. Link: https://lore.kernel.org/r/20211204002644.116455-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:37 -05:00
James Smart	6014a2468f	scsi: lpfc: Add additional debugfs support for CMF Dump raw CMF parameter information in debugfs cgn_buffer. Link: https://lore.kernel.org/r/20211204002644.116455-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:37 -05:00
James Smart	05116ef9c4	scsi: lpfc: Cap CMF read bytes to MBPI Ensure read bytes data does not go over MBPI for CMF timer intervals that are purposely shortened. Link: https://lore.kernel.org/r/20211204002644.116455-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:37 -05:00
James Smart	a6269f8370	scsi: lpfc: Adjust CMF total bytes and rxmonitor Calculate any extra bytes needed to account for timer accuracy. If we are less than LPFC_CMF_INTERVAL, then calculate the adjustment needed for total to reflect a full LPFC_CMF_INTERVAL. Add additional info to rxmonitor, and adjust some log formatting. Link: https://lore.kernel.org/r/20211204002644.116455-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:37 -05:00
James Smart	7dd2e2a923	scsi: lpfc: Trigger SLI4 firmware dump before doing driver cleanup Extraneous teardown routines are present in the firmware dump path causing altered states in firmware captures. When a firmware dump is requested via sysfs, trigger the dump immediately without tearing down structures and changing adapter state. The driver shall rely on pre-existing firmware error state clean up handlers to restore the adapter. Link: https://lore.kernel.org/r/20211204002644.116455-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:36 -05:00
James Smart	8ed190a919	scsi: lpfc: Fix NPIV port deletion crash The driver is calling schedule_timeout after the DA_ID nameserver request and LOGO commands are issued to the fabric by the initiator virtual endport. These fixed delay functions are causing long delays in the driver's worker thread when processing discovery I/Os in a serialized fashion, which is then triggering mailbox timeout errors artificially. To fix this, don't wait on the DA_ID request to complete and call wait_event_timeout to allow the vport delete thread to make progress on an event driven basis rather than fixing the wait time. Link: https://lore.kernel.org/r/20211204002644.116455-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:36 -05:00
James Smart	7576d48c64	scsi: lpfc: Fix lpfc_force_rscn ndlp kref imbalance Issuing lpfc_force_rscn twice results in an ndlp kref use-after-free call trace. A prior patch reworked the get/put handling by ensuring nlp_get was done before WQE submission and a put was done in the completion path. Unfortunately, the issue_els_rscn path had a piece of legacy code that did a nlp_put, causing an imbalance on the ref counts. Fixed by removing the unnecessary legacy code snippet. Link: https://lore.kernel.org/r/20211204002644.116455-4-jsmart2021@gmail.com Fixes: `4430f7fd09` ("scsi: lpfc: Rework locations of ndlp reference taking") Cc: <stable@vger.kernel.org> # v5.11+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:36 -05:00
James Smart	2e81b1a374	scsi: lpfc: Change return code on I/Os received during link bounce During heavy I/O testing with issue_lip to bounce the link, occasionally I/O is terminated with status 3 result 9, which means the RPI is suspended. The I/O is completed and this type of error will result in immediate retry by the SCSI layer. The retry count expires and the I/O fails and returns error to the application. To avoid these quick retry/retries exhausted scenarios change the return code given to the midlayer to DID_REQUEUE rather than DID_ERROR. This gets them retried, and eventually succeed when the link recovers. Link: https://lore.kernel.org/r/20211204002644.116455-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:36 -05:00
James Smart	f0d3919697	scsi: lpfc: Fix leaked lpfc_dmabuf mbox allocations with NPIV During rmmod testing, messages appeared indicating lpfc_mbuf_pool entries were still busy. This situation was only seen doing rmmod after at least 1 vport (NPIV) instance was created and destroyed. The number of messages scaled with the number of vports created. When a vport is created, it can receive a PLOGI from another initiator Nport. When this happens, the driver prepares to ack the PLOGI and prepares an RPI for registration (via mbx cmd) which includes an mbuf allocation. During the unsolicited PLOGI processing and after the RPI preparation, the driver recognizes it is one of the vport instances and decides to reject the PLOGI. During the LS_RJT preparation for the PLOGI, the mailbox struct allocated for RPI registration is freed, but the mbuf that was also allocated is not released. Fix by freeing the mbuf with the mailbox struct in the LS_RJT path. As part of the code review to figure the issue out a couple of other areas where found that also would not have released the mbuf. Those are cleaned up as well. Link: https://lore.kernel.org/r/20211204002644.116455-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:35:36 -05:00
Bart Van Assche	eaab9b5730	scsi: ufs: Implement polling support The time spent in io_schedule() and also the interrupt latency are significant when submitting direct I/O to a UFS device. Hence this patch that implements polling support. User space software can enable polling by passing the RWF_HIPRI flag to the preadv2() system call or the IORING_SETUP_IOPOLL flag to the io_uring interface. Although the block layer supports to partition the tag space for interrupt-based completions (HCTX_TYPE_DEFAULT) purposes and polling (HCTX_TYPE_POLL), the choice has been made to use the same hardware queue for both hctx types because partitioning the tag space would negatively affect performance. On my test setup this patch increases IOPS from 2736 to 22000 (8x) for the following test: for hipri in 0 1; do fio --ioengine=io_uring --iodepth=1 --rw=randread \ --runtime=60 --time_based=1 --direct=1 --name=qd1 \ --filename=/dev/block/sda --ioscheduler=none --gtod_reduce=1 \ --norandommap --hipri=$hipri done Link: https://lore.kernel.org/r/20211203231950.193369-18-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:30:34 -05:00
Bart Van Assche	8d077ede48	scsi: ufs: Optimize the command queueing code Remove the clock scaling lock from ufshcd_queuecommand() since it is a performance bottleneck. Instead check the SCSI device budget bitmaps in the code that waits for ongoing ufshcd_queuecommand() calls. A bit is set in sdev->budget_map just before scsi_queue_rq() is called and a bit is cleared from that bitmap if scsi_queue_rq() does not submit the request or after the request has finished. See also the blk_mq_{get,put}_dispatch_budget() calls in the block layer. There is no risk for a livelock since the block layer delays queue reruns if queueing a request fails because the SCSI host has been blocked. Link: https://lore.kernel.org/r/20211203231950.193369-17-bvanassche@acm.org Cc: Asutosh Das (asd) <asutoshd@codeaurora.org> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:30:34 -05:00
Bart Van Assche	5675c381ea	scsi: ufs: Stop using the clock scaling lock in the error handler Instead of locking and unlocking the clock scaling lock, surround the command queueing code with an RCU reader lock and call synchronize_rcu(). This patch prepares for removal of the clock scaling lock. Link: https://lore.kernel.org/r/20211203231950.193369-16-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:30:34 -05:00
Bart Van Assche	3489c34bd0	scsi: ufs: Fix a kernel crash during shutdown Fix the following kernel crash: Unable to handle kernel paging request at virtual address ffffffc91e735000 Call trace: __queue_work+0x26c/0x624 queue_work_on+0x6c/0xf0 ufshcd_hold+0x12c/0x210 __ufshcd_wl_suspend+0xc0/0x400 ufshcd_wl_shutdown+0xb8/0xcc device_shutdown+0x184/0x224 kernel_restart+0x4c/0x124 __arm64_sys_reboot+0x194/0x264 el0_svc_common+0xc8/0x1d4 do_el0_svc+0x30/0x8c el0_svc+0x20/0x30 el0_sync_handler+0x84/0xe4 el0_sync+0x1bc/0x1c0 Fix this crash by ungating the clock before destroying the work queue on which clock gating work is queued. Link: https://lore.kernel.org/r/20211203231950.193369-15-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-06 22:30:34 -05:00

1 2 3 4 5 ...

22508 commits