linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-29 22:02:02 +00:00

Author	SHA1	Message	Date
Arun Easi	cb4dff4984	scsi: qla2xxx: Fix crash when I/O abort times out commit `68ad83188d` upstream. While performing CPU hotplug, a crash with the following stack was seen: Call Trace: qla24xx_process_response_queue+0x42a/0x970 [qla2xxx] qla2x00_start_nvme_mq+0x3a2/0x4b0 [qla2xxx] qla_nvme_post_cmd+0x166/0x240 [qla2xxx] nvme_fc_start_fcp_op.part.0+0x119/0x2e0 [nvme_fc] blk_mq_dispatch_rq_list+0x17b/0x610 __blk_mq_sched_dispatch_requests+0xb0/0x140 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x35/0x90 __blk_mq_delay_run_hw_queue+0x161/0x180 blk_execute_rq+0xbe/0x160 __nvme_submit_sync_cmd+0x16f/0x220 [nvme_core] nvmf_connect_admin_queue+0x11a/0x170 [nvme_fabrics] nvme_fc_create_association.cold+0x50/0x3dc [nvme_fc] nvme_fc_connect_ctrl_work+0x19/0x30 [nvme_fc] process_one_work+0x1e8/0x3c0 On abort timeout, completion was called without checking if the I/O was already completed. Verify that I/O and abort request are indeed outstanding before attempting completion. Fixes: `71c80b75ce` ("scsi: qla2xxx: Do command completion on abort timeout") Reported-by: Marco Patalano <mpatalan@redhat.com> Tested-by: Marco Patalano <mpatalan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20221129092634.15347-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-12-31 13:26:53 +01:00
Kumar Meiyappan	e8e9e0c289	scsi: smartpqi: Correct device removal for multi-actuator devices [ Upstream commit `cc9befcbbb` ] Correct device count for multi-actuator drives which can cause kernel panics. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike Mcgowan <mike.mcgowan@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Kumar Meiyappan <Kumar.Meiyappan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/166793531872.322537.9003385780343419275.stgit@brunhilda Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:48 +01:00
Mike McGowen	c38e48750c	scsi: smartpqi: Add new controller PCI IDs [ Upstream commit `0b93cf2a90` ] All PCI ID entries in Hex. Add PCI IDs for ByteDance controllers: VID / DID / SVID / SDID ---- ---- ---- ---- ByteHBA JGH43024-8 9005 / 028f / 1e93 / 1000 ByteHBA JGH43034-8 9005 / 028f / 1e93 / 1001 ByteHBA JGH44014-8 9005 / 028f / 1e93 / 1002 Add PCI IDs for new Inspur controllers: VID / DID / SVID / SDID ---- ---- ---- ---- INSPUR RT0800M7E 9005 / 028f / 1bd4 / 0086 INSPUR RT0800M7H 9005 / 028f / 1bd4 / 0087 INSPUR RT0804M7R 9005 / 028f / 1bd4 / 0088 INSPUR RT0808M7R 9005 / 028f / 1bd4 / 0089 Add PCI IDs for new FAB A controllers: VID / DID / SVID / SDID ---- ---- ---- ---- Adaptec SmartRAID 3254-16e /e 9005 / 028f / 9005 / 1475 Adaptec HBA 1200-16e 9005 / 028f / 9005 / 14c3 Adaptec HBA 1200-8e 9005 / 028f / 9005 / 14c4 Add H3C controller PCI IDs: VID / DID / SVID / SDID ---- ---- ---- ---- H3C H4508-Mf-8i 9005 / 028f / 193d / 110b Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/166793530327.322537.6056884426657539311.stgit@brunhilda Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:47 +01:00
Nathan Chancellor	0817b5a6e7	scsi: elx: libefc: Fix second parameter type in state callbacks [ Upstream commit `3d75e766b5` ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/scsi/elx/libefc/efc_node.c:811:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] ctx->current_state = state; ^ ~~~~~ drivers/scsi/elx/libefc/efc_node.c:878:21: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] node->nodedb_state = state; ^ ~~~~~ drivers/scsi/elx/libefc/efc_node.c:905:6: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' from 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') [-Werror,-Wincompatible-function-pointer-types-strict] pf = node->nodedb_state; ^ ~~~~~~~~~~~~~~~~~~ drivers/scsi/elx/libefc/efc_device.c:455:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void (struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] node->nodedb_state = __efc_d_init; ^ ~~~~~~~~~~~~ drivers/scsi/elx/libefc/efc_sm.c:41:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] ctx->current_state = state; ^ ~~~~~ The type of the second parameter in the prototypes of ->current_state() and ->nodedb_state() ('u32') does not match the implementations, which have a second parameter type of 'enum efc_sm_event'. Update the prototypes to have the correct second parameter type, clearing up all the warnings and CFI failures. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221102161906.2781508-1-nathan@kernel.org Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:47 +01:00
Justin Tee	cd542900ee	scsi: lpfc: Fix hard lockup when reading the rx_monitor from debugfs [ Upstream commit `c44e50f4a0` ] During I/O and simultaneous cat of /sys/kernel/debug/lpfc/fnX/rx_monitor, a hard lockup similar to the call trace below may occur. The spin_lock_bh in lpfc_rx_monitor_report is not protecting from timer interrupts as expected, so change the strength of the spin lock to _irq. Kernel panic - not syncing: Hard LOCKUP CPU: 3 PID: 110402 Comm: cat Kdump: loaded exception RIP: native_queued_spin_lock_slowpath+91 [IRQ stack] native_queued_spin_lock_slowpath at ffffffffb814e30b _raw_spin_lock at ffffffffb89a667a lpfc_rx_monitor_record at ffffffffc0a73a36 [lpfc] lpfc_cmf_timer at ffffffffc0abbc67 [lpfc] __hrtimer_run_queues at ffffffffb8184250 hrtimer_interrupt at ffffffffb8184ab0 smp_apic_timer_interrupt at ffffffffb8a026ba apic_timer_interrupt at ffffffffb8a01c4f [End of IRQ stack] apic_timer_interrupt at ffffffffb8a01c4f lpfc_rx_monitor_report at ffffffffc0a73c80 [lpfc] lpfc_rx_monitor_read at ffffffffc0addde1 [lpfc] full_proxy_read at ffffffffb83e7fc3 vfs_read at ffffffffb833fe71 ksys_read at ffffffffb83402af do_syscall_64 at ffffffffb800430b entry_SYSCALL_64_after_hwframe at ffffffffb8a000ad Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:47 +01:00
Jie Zhan	2331856600	scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset [ Upstream commit `3c2673a09c` ] SATA devices on an expander may be removed and not be found again when I_T nexus reset and revalidation are processed simultaneously. The issue comes from: - Revalidation can remove SATA devices in link reset, e.g. in hisi_sas_clear_nexus_ha(). - However, hisi_sas_debug_I_T_nexus_reset() polls the state of a SATA device on an expander after sending link_reset, where it calls: hisi_sas_debug_I_T_nexus_reset sas_ata_wait_after_reset ata_wait_after_reset ata_wait_ready smp_ata_check_ready sas_ex_phy_discover sas_ex_phy_discover_helper sas_set_ex_phy The ex_phy's change count is updated in sas_set_ex_phy(), so SATA devices after a link reset may not be found later through revalidation. A similar issue was reported in: commit `0f3fce5cc7` ("[SCSI] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready") commit `87c8331fcf` ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling"). To address this issue, in hisi_sas_debug_I_T_nexus_reset(), we now call smp_ata_check_ready_type() that only polls the device type while not updating the ex_phy's data of libsas. Fixes: `71453bd9d1` ("scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-5-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Jie Zhan	b059b4cd43	scsi: libsas: Add smp_ata_check_ready_type() [ Upstream commit `9181ce3cb5` ] Create function smp_ata_check_ready_type() for LLDDs to wait for SATA devices to come up after a link reset. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Link: https://lore.kernel.org/r/20221118083714.4034612-4-zhanjie9@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: `3c2673a09c` ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Gaosheng Cui	c7f0f8dab1	scsi: snic: Fix possible UAF in snic_tgt_create() [ Upstream commit `e118df4923` ] Smatch reports a warning as follows: drivers/scsi/snic/snic_disc.c:307 snic_tgt_create() warn: '&tgt->list' not removed from list If device_add() fails in snic_tgt_create(), tgt will be freed, but tgt->list will not be removed from snic->disc.tgt_list, then list traversal may cause UAF. Remove from snic->disc.tgt_list before free(). Fixes: `c8806b6c9e` ("snic: driver for Cisco SCSI HBA") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Link: https://lore.kernel.org/r/20221117035100.2944812-1-cuigaosheng1@huawei.com Acked-by: Narsimhulu Musini <nmusini@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Chen Zhongjin	1dc499c615	scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails [ Upstream commit `4155658cee` ] fcoe_init() calls fcoe_transport_attach(&fcoe_sw_transport), but when fcoe_if_init() fails, &fcoe_sw_transport is not detached and leaves freed &fcoe_sw_transport on fcoe_transports list. This causes panic when reinserting module. BUG: unable to handle page fault for address: fffffbfff82e2213 RIP: 0010:fcoe_transport_attach+0xe1/0x230 [libfcoe] Call Trace: <TASK> do_one_initcall+0xd0/0x4e0 load_module+0x5eee/0x7210 ... Fixes: `78a582463c` ("[SCSI] fcoe: convert fcoe.ko to become an fcoe transport provider driver") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221115092442.133088-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Shang XiaoJing	8c739021b2	scsi: ipr: Fix WARNING in ipr_init() [ Upstream commit `e6f108bffc` ] ipr_init() will not call unregister_reboot_notifier() when pci_register_driver() fails, which causes a WARNING. Call unregister_reboot_notifier() when pci_register_driver() fails. notifier callback ipr_halt [ipr] already registered WARNING: CPU: 3 PID: 299 at kernel/notifier.c:29 notifier_chain_register+0x16d/0x230 Modules linked in: ipr(+) xhci_pci_renesas xhci_hcd ehci_hcd usbcore led_class gpu_sched drm_buddy video wmi drm_ttm_helper ttm drm_display_helper drm_kms_helper drm drm_panel_orientation_quirks agpgart cfbft CPU: 3 PID: 299 Comm: modprobe Tainted: G W 6.1.0-rc1-00190-g39508d23b672-dirty #332 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:notifier_chain_register+0x16d/0x230 Call Trace: <TASK> __blocking_notifier_chain_register+0x73/0xb0 ipr_init+0x30/0x1000 [ipr] do_one_initcall+0xdb/0x480 do_init_module+0x1cf/0x680 load_module+0x6a50/0x70a0 __do_sys_finit_module+0x12f/0x1c0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: `f72919ec2b` ("[SCSI] ipr: implement shutdown changes and remove obsolete write cache parameter") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20221113064513.14028-1-shangxiaojing@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Yang Yingliang	22666c527d	scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper() [ Upstream commit `e6d773f93a` ] Afer commit `1fa5ae857b` ("driver core: get rid of struct device's bus_id string array"), the name of device is allocated dynamically, it needs be freed when device_register() returns error. As comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(), and sdbg_host is freed in sdebug_release_adapter(). When the device release is not set, it means the device is not initialized. We can not call put_device() in this case. Use kfree() to free memory. Fixes: `1fa5ae857b` ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112131010.3757845-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Yang Yingliang	29f2298c32	scsi: fcoe: Fix possible name leak when device_register() fails [ Upstream commit `47b6a122c7` ] If device_register() returns an error, the name allocated by dev_set_name() needs to be freed. As the comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(). The 'fcf' is freed in fcoe_fcf_device_release(), so the kfree() in the error path can be removed. The 'ctlr' is freed in fcoe_ctlr_device_release(), so don't use the error label, just return NULL after calling put_device(). Fixes: `9a74e884ee` ("[SCSI] libfcoe: Add fcoe_sysfs") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112094310.3633291-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Harshit Mogalapalli	869908496b	scsi: scsi_debug: Fix a warning in resp_report_zones() [ Upstream commit `07f2ca139d` ] As 'alloc_len' is user controlled data, if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `7db0e0c819` ("scsi: scsi_debug: Fix buffer size of REPORT ZONES command") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070612.2121535-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:21 +01:00
Harshit Mogalapalli	c82df36274	scsi: scsi_debug: Fix a warning in resp_verify() [ Upstream commit `ed0f17b748` ] As 'vnum' is controlled by user, so if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `c3e2fe9222` ("scsi: scsi_debug: Implement VERIFY(10), add VERIFY(16)") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070031.2121068-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Chen Zhongjin	0c6e6bb302	scsi: efct: Fix possible memleak in efct_device_init() [ Upstream commit `bb0cd225dd` ] In efct_device_init(), when efct_scsi_reg_fc_transport() fails, efct_scsi_tgt_driver_exit() is not called to release memory for efct_scsi_tgt_driver_init() and causes memleak: unreferenced object 0xffff8881020ce000 (size 2048): comm "modprobe", pid 465, jiffies 4294928222 (age 55.872s) backtrace: [<0000000021a1ef1b>] kmalloc_trace+0x27/0x110 [<000000004c3ed51c>] target_register_template+0x4fd/0x7b0 [target_core_mod] [<00000000f3393296>] efct_scsi_tgt_driver_init+0x18/0x50 [efct] [<00000000115de533>] 0xffffffffc0d90011 [<00000000d608f646>] do_one_initcall+0xd0/0x4e0 [<0000000067828cf1>] do_init_module+0x1cc/0x6a0 ... Fixes: `4df84e8466` ("scsi: elx: efct: Driver initialization routines") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221111074046.57061-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Yang Yingliang	7a5537cb46	scsi: hpsa: Fix possible memory leak in hpsa_add_sas_device() [ Upstream commit `fda34a5d30` ] If hpsa_sas_port_add_rphy() returns an error, the 'rphy' allocated in sas_end_device_alloc() needs to be freed. Address this by calling sas_rphy_free() in the error path. Fixes: `d04e62b9d6` ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221111043012.1074466-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Yang Yingliang	6617b4fee3	scsi: hpsa: Fix error handling in hpsa_add_sas_host() [ Upstream commit `4ef174a3ad` ] hpsa_sas_port_add_phy() does: ... sas_phy_add() -> may return error here sas_port_add_phy() ... Whereas hpsa_free_sas_phy() does: ... sas_port_delete_phy() sas_phy_delete() ... If hpsa_sas_port_add_phy() returns an error, hpsa_free_sas_phy() can not be called to free the memory because the port and the phy have not been added yet. Replace hpsa_free_sas_phy() with sas_phy_free() and kfree() to avoid kernel crash in this case. Fixes: `d04e62b9d6` ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221110151129.394389-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Yang Yingliang	6f6768e2fc	scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add() [ Upstream commit `78316e9dfc` ] In mpt3sas_transport_port_add(), if sas_rphy_add() returns error, sas_rphy_free() needs be called to free the resource allocated in sas_end_device_alloc(). Otherwise a kernel crash will happen: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 45 PID: 37020 Comm: bash Kdump: loaded Tainted: G W 6.1.0-rc1+ #189 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_rphy_remove+0x50/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_rphy_remove+0x38/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] scsih_remove+0xd8/0x420 [mpt3sas] Because transport_add_device() is not called when sas_rphy_add() fails, the device is not added. When sas_rphy_remove() is subsequently called to remove the device in the remove() path, a NULL pointer dereference happens. Fixes: `f92363d123` ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221109032403.1636422-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Yuan Can	fc998d0a7d	scsi: hpsa: Fix possible memory leak in hpsa_init_one() [ Upstream commit `9c9ff300e0` ] The hpda_alloc_ctlr_info() allocates h and its field reply_map. However, in hpsa_init_one(), if alloc_percpu() failed, the hpsa_init_one() jumps to clean1 directly, which frees h and leaks the h->reply_map. Fix by calling hpda_free_ctlr_info() to release h->replay_map and h instead free h directly. Fixes: `8b834bff1b` ("scsi: hpsa: fix selection of reply queue") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221122015751.87284-1-yuancan@huawei.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:20 +01:00
Harshit Mogalapalli	987b004942	scsi: scsi_debug: Fix a warning in resp_write_scat() [ Upstream commit `216e179724` ] As 'lbdof_blen' is coming from user, if the size in kzalloc() is >= MAX_ORDER then we hit a warning. Call trace: sg_ioctl sg_ioctl_common scsi_ioctl sg_scsi_ioctl blk_execute_rq blk_mq_sched_insert_request blk_mq_run_hw_queue __blk_mq_delay_run_hw_queue __blk_mq_run_hw_queue blk_mq_sched_dispatch_requests __blk_mq_sched_dispatch_requests blk_mq_dispatch_rq_list scsi_queue_rq scsi_dispatch_cmd scsi_debug_queuecommand schedule_resp resp_write_scat If you try to allocate a memory larger than(>=) MAX_ORDER, then kmalloc() will definitely fail. It creates a stack trace and messes up dmesg. The user controls the size here so if they specify a too large size it will fail. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `481b5e5c79` ("scsi: scsi_debug: add resp_write_scat function") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221111100526.1790533-1-harshit.m.mogalapalli@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:19 +01:00
Bart Van Assche	f8e164f8ec	scsi: qla2xxx: Fix set-but-not-used variable warnings [ Upstream commit `4fb2169d66` ] Fix the following two compiler warnings: drivers/scsi/qla2xxx/qla_init.c: In function ‘qla24xx_async_abort_cmd’: drivers/scsi/qla2xxx/qla_init.c:171:17: warning: variable ‘bail’ set but not used [-Wunused-but-set-variable] 171 \| uint8_t bail; \| ^~~~ drivers/scsi/qla2xxx/qla_init.c: In function ‘qla2x00_async_tm_cmd’: drivers/scsi/qla2xxx/qla_init.c:2023:17: warning: variable ‘bail’ set but not used [-Wunused-but-set-variable] 2023 \| uint8_t bail; \| ^~~~ Cc: Arun Easi <arun.easi@qlogic.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Fixes: `feafb7b171` ("[SCSI] qla2xxx: Fix vport delete issues") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221031224818.2607882-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:18 +01:00
Bart Van Assche	8dd35e1e83	scsi: core: Fix a race between scsi_done() and scsi_timeout() [ Upstream commit `978b7922d3` ] If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: `15f73f5b3e` ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:26:16 +01:00
Zhou Guanghui	f176b11469	scsi: iscsi: Fix possible memory leak when device_register() failed [ Upstream commit `f014165faa` ] If device_register() returns error, the name allocated by the dev_set_name() need be freed. As described in the comment of device_register(), we should use put_device() to give up the reference in the error path. Fix this by calling put_device(), the name will be freed in the kobject_cleanup(), and this patch modified resources will be released by calling the corresponding callback function in the device_release(). Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Link: https://lore.kernel.org/r/20221110033729.1555-1-zhouguanghui1@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:43:16 +01:00
Shin'ichiro Kawasaki	f51c3741c8	scsi: mpi3mr: Suppress command reply debug prints [ Upstream commit `7d21fcfb40` ] After it receives command reply, mpi3mr driver checks command result. If the result is not zero, it prints out command information. This debug information is confusing since they are printed even when the non-zero result is expected. "Power-on or device reset occurred" is printed for Test Unit Ready command at drive detection. Inquiry failure for unsupported VPD page header is also printed. They are harmless but look like failures. To avoid the confusion, print the command reply debug information only when the module parameter logging_level has value MPI3_DEBUG_SCSI_ERROR= 64, in same manner as mpt3sas driver. Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20221111014449.1649968-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:43:16 +01:00
Michael Kelley	8c2fcad34c	scsi: storvsc: Fix handling of srb_status and capacity change events [ Upstream commit `b8a5376c32` ] Current handling of the srb_status is incorrect. Commit `52e1b3b3da` ("scsi: storvsc: Correctly handle multiple flags in srb_status") is based on srb_status being a set of flags, when in fact only the 2 high order bits are flags and the remaining 6 bits are an integer status. Because the integer values of interest mostly look like flags, the code actually works when treated that way. But in the interest of correctness going forward, fix this by treating the low 6 bits of srb_status as an integer status code. Add handling for SRB_STATUS_INVALID_REQUEST, which was the original intent of commit `52e1b3b3da`. Furthermore, treat the ERROR, ABORTED, and INVALID_REQUEST srb status codes as essentially equivalent for the cases we care about. There's no harm in doing so, and it isn't always clear which status code current or older versions of Hyper-V report for particular conditions. Treating the srb status codes as equivalent has the additional benefit of ensuring that capacity change events result in an immediate rescan so that the new size is known to Linux. Existing code checks SCSI sense data for capacity change events when the srb status is ABORTED. But capacity change events are also being observed when Hyper-V reports the srb status as ERROR. Without the immediate rescan, the new size isn't known until something else causes a rescan (such as running fdisk to expand a partition), and in the meantime, tools such as "lsblk" continue to report the old size. Fixes: `52e1b3b3da` ("scsi: storvsc: Correctly handle multiple flags in srb_status") Reported-by: Juan Tian <juantian@microsoft.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1668019722-1983-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:43:02 +01:00
Bart Van Assche	aaf13ba71e	scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC [ Upstream commit `ecb8c2580d` ] From ZBC-1: - RC BASIS = 0: The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a contiguous range of zones that are not sequential write required zones starting with the first zone. - RC BASIS = 1: The RETURNED LOGICAL BLOCK ADDRESS field indicates the LBA of the last logical block on the logical unit. The current scsi_debug READ CAPACITY response does not comply with the above if there are one or more sequential write required zones. SCSI initiators need a way to retrieve the largest valid LBA from SCSI devices. Reporting the largest valid LBA if there are one or more sequential zones requires to set the RC BASIS field in the READ CAPACITY response to one. Hence this patch. Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221102193248.3177608-1-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:43:00 +01:00
Brian King	f98ad956a1	scsi: ibmvfc: Avoid path failures during live migration [ Upstream commit `62fa3ce05d` ] Fix an issue reported when performing a live migration when multipath is configured with a short fast fail timeout of 5 seconds and also to have no_path_retry set to fail. In this scenario, all paths would go into the devloss state while the ibmvfc driver went through discovery to log back in. On a loaded system, the discovery might take longer than 5 seconds, which was resulting in all paths being marked failed, which then resulted in a read only filesystem. This patch changes the migration code in ibmvfc to avoid deleting rports at all in this scenario, so we avoid losing all paths. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20221026181356.148517-1-brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:43:00 +01:00
Yuan Can	aa5b8170b2	scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper() [ Upstream commit `e208a1d795` ] If device_register() fails in sdebug_add_host_helper(), it will goto clean and sdbg_host will be freed, but sdbg_host->host_list will not be removed from sdebug_host_list, then list traversal may cause UAF. Fix it. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221117084421.58918-1-yuancan@huawei.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-26 09:27:53 +01:00
Yang Yingliang	c736876ee2	scsi: scsi_transport_sas: Fix error handling in sas_phy_add() [ Upstream commit `5d7bebf2df` ] If transport_add_device() fails in sas_phy_add(), the kernel will crash trying to delete the device in transport_remove_device() called from sas_remove_host(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 61 PID: 42829 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc1+ #173 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_phy_delete+0x30/0x60 [scsi_transport_sas] do_sas_phy_delete+0x6c/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x40/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] hisi_sas_remove+0x40/0x68 [hisi_sas_main] hisi_sas_v2_remove+0x20/0x30 [hisi_sas_v2_hw] platform_remove+0x2c/0x60 Fix this by checking and handling return value of transport_add_device() in sas_phy_add(). Fixes: `c7ebbbce36` ("[SCSI] SAS transport class") Suggested-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221107124828.115557-1-yangyingliang@huawei.com Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-26 09:27:28 +01:00
Uday Shankar	022577b941	scsi: core: Restrict legal sdev_state transitions via sysfs [ Upstream commit `2331ce6126` ] Userspace can currently write to sysfs to transition sdev_state to RUNNING or OFFLINE from any source state. This causes issues because proper transitioning out of some states involves steps besides just changing sdev_state, so allowing userspace to change sdev_state regardless of the source state can result in inconsistencies; e.g. with ISCSI we can end up with sdev_state == SDEV_RUNNING while the device queue is quiesced. Any task attempting I/O on the device will then hang, and in more recent kernels, iscsid will hang as well. More detail about this bug is provided in my first attempt: https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ Link: https://lore.kernel.org/r/20220924000241.2967323-1-ushankar@purestorage.com Signed-off-by: Uday Shankar <ushankar@purestorage.com> Suggested-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:17:25 +01:00
Manish Rangankar	2bc9b08130	scsi: qla2xxx: Use transport-defined speed mask for supported_speeds commit `0b863257c1` upstream. One of the sysfs values reported for supported_speeds was not valid (20Gb/s reported instead of 64Gb/s). Instead of driver internal speed mask definition, use speed mask defined in transport_fc for reporting host->supported_speeds. Link: https://lore.kernel.org/r/20220927115946.17559-1-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-04 00:00:21 +09:00
Rafael Mendonca	5ea1f195f5	scsi: lpfc: Fix memory leak in lpfc_create_port() [ Upstream commit `dc8e483f68` ] Commit `5e633302ac` ("scsi: lpfc: vmid: Add support for VMID in mailbox command") introduced allocations for the VMID resources in lpfc_create_port() after the call to scsi_host_alloc(). Upon failure on the VMID allocations, the new code would branch to the 'out' label, which returns NULL without unwinding anything, thus skipping the call to scsi_host_put(). Fix the problem by creating a separate label 'out_free_vmid' to unwind the VMID resources and make the 'out_put_shost' label call only scsi_host_put(), as was done before the introduction of allocations for VMID. Fixes: `5e633302ac` ("scsi: lpfc: vmid: Add support for VMID in mailbox command") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220916035908.712799-1-rafaelmendsr@gmail.com Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:08:33 +02:00
Letu Ren	a51d20e0ae	scsi: 3w-9xxx: Avoid disabling device if failing to enable it [ Upstream commit `7eff437b5e` ] The original code will "goto out_disable_device" and call pci_disable_device() if pci_enable_device() fails. The kernel will generate a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled device". We shouldn't disable a device that failed to be enabled. A simple return is fine. Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:39:21 +02:00
James Smart	82dc1fe432	scsi: lpfc: Fix null ndlp ptr dereference in abnormal exit path for GFT_ID [ Upstream commit `59b7e210a5` ] An error case exit from lpfc_cmpl_ct_cmd_gft_id() results in a call to lpfc_nlp_put() with a null pointer to a nodelist structure. Changed lpfc_cmpl_ct_cmd_gft_id() to initialize nodelist pointer upon entry. Link: https://lore.kernel.org/r/20220819011736.14141-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:39:21 +02:00
Mike Christie	0a0b861fce	scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() [ Upstream commit `57569c37f0` ] Fix a NULL pointer crash that occurs when we are freeing the socket at the same time we access it via sysfs. The problem is that: 1. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() take the frwd_lock and do sock_hold() then drop the frwd_lock. sock_hold() does a get on the "struct sock". 2. iscsi_sw_tcp_release_conn() does sockfd_put() which does the last put on the "struct socket" and that does __sock_release() which sets the sock->ops to NULL. 3. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() then call kernel_getpeername() which accesses the NULL sock->ops. Above we do a get on the "struct sock", but we needed a get on the "struct socket". Originally, we just held the frwd_lock the entire time but in commit `bcf3a2953d` ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") we switched to refcount based because the network layer changed and started taking a mutex in that path, so we could no longer hold the frwd_lock. Instead of trying to maintain multiple refcounts, this just has us use a mutex for accessing the socket in the interface code paths. Link: https://lore.kernel.org/r/20220907221700.10302-1-michael.christie@oracle.com Fixes: `bcf3a2953d` ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:38:50 +02:00
John Garry	a62b9fc977	scsi: pm8001: Fix running_req for internal abort commands [ Upstream commit `d8c22c4697` ] Disabling the remote phy for a SATA disk causes a hang: root@(none)$ more /sys/class/sas_phy/phy-0:0:8/target_port_protocols sata root@(none)$ echo 0 > sys/class/sas_phy/phy-0:0:8/enable root@(none)$ [ 67.855950] sas: ex 500e004aaaaaaa1f phy08 change count has changed [ 67.920585] sd 0:0:2:0: [sdc] Synchronizing SCSI cache [ 67.925780] sd 0:0:2:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=DRIVER_OK [ 67.935094] sd 0:0:2:0: [sdc] Stopping disk [ 67.939305] sd 0:0:2:0: [sdc] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=DRIVER_OK ... [ 123.998998] INFO: task kworker/u192:1:642 blocked for more than 30 seconds. [ 124.005960] Not tainted 6.0.0-rc1-205202-gf26f8f761e83 #218 [ 124.012049] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 124.019872] task:kworker/u192:1 state:D stack:0 pid: 642 ppid: 2 flags:0x00000008 [ 124.028223] Workqueue: 0000:04:00.0_event_q sas_port_event_worker [ 124.034319] Call trace: [ 124.036758] __switch_to+0x128/0x278 [ 124.040333] __schedule+0x434/0xa58 [ 124.043820] schedule+0x94/0x138 [ 124.047045] schedule_timeout+0x2fc/0x368 [ 124.051052] wait_for_completion+0xdc/0x200 [ 124.055234] __flush_workqueue+0x1a8/0x708 [ 124.059328] sas_porte_broadcast_rcvd+0xa8/0xc0 [ 124.063858] sas_port_event_worker+0x60/0x98 [ 124.068126] process_one_work+0x3f8/0x660 [ 124.072134] worker_thread+0x70/0x700 [ 124.075793] kthread+0x1a4/0x1b8 [ 124.079014] ret_from_fork+0x10/0x20 The issue is that the per-device running_req read in pm8001_dev_gone_notify() never goes to zero and we never make progress. This is caused by missing accounting for running_req for when an internal abort command completes. In commit `2cbbf48977` ("scsi: pm8001: Use libsas internal abort support") we started to send internal abort commands as a proper sas_task. In this when we deliver a sas_task to HW the per-device running_req is incremented in pm8001_queue_command(). However it is never decremented for internal abort commnds, so decrement in pm8001_mpi_task_abort_resp(). Link: https://lore.kernel.org/r/1663854664-76165-1-git-send-email-john.garry@huawei.com Fixes: `2cbbf48977` ("scsi: pm8001: Use libsas internal abort support") Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:38:50 +02:00
Duoming Zhou	2e12ce270f	scsi: libsas: Fix use-after-free bug in smp_execute_task_sg() [ Upstream commit `46ba53c306` ] When executing SMP task failed, the smp_execute_task_sg() calls del_timer() to delete "slow_task->timer". However, if the timer handler sas_task_internal_timedout() is running, the del_timer() in smp_execute_task_sg() will not stop it and a UAF will happen. The process is shown below: (thread 1) \| (thread 2) smp_execute_task_sg() \| sas_task_internal_timedout() ... \| del_timer() \| ... \| ... sas_free_task(task) \| kfree(task->slow_task) //FREE\| \| task->slow_task->... //USE Fix by calling del_timer_sync() in smp_execute_task_sg(), which makes sure the timer handler have finished before the "task->slow_task" is deallocated. Link: https://lore.kernel.org/r/20220920144213.10536-1-duoming@zju.edu.cn Fixes: `2908d778ab` ("[SCSI] aic94xx: new driver") Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:38:50 +02:00
James Smart	922d890fda	scsi: lpfc: Fix various issues reported by tools [ Upstream commit `a4de8356b6` ] This patch fixes below Smatch reported issues: 1. lpfc_hbadisc.c:3020 lpfc_mbx_cmpl_fcf_rr_read_fcf_rec() error: uninitialized symbol 'vlan_id'. 2. lpfc_hbadisc.c:3121 lpfc_mbx_cmpl_read_fcf_rec() error: uninitialized symbol 'vlan_id'. 3. lpfc_init.c:335 lpfc_dump_wakeup_param_cmpl() warn: always true condition '(prg->dist < 4) => (0-3 < 4)' 4. lpfc_init.c:2419 lpfc_parse_vpd() warn: inconsistent indenting. 5. lpfc_init.c:13248 lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' 2147483648 can't fit into 65535 'eqhdl->irq' 6. lpfc_debugfs.c:5300 lpfc_idiag_extacc_avail_get() error: uninitialized symbol 'ext_cnt' 7. lpfc_debugfs.c:5300 lpfc_idiag_extacc_avail_get() error: uninitialized symbol 'ext_size' 8. lpfc_vmid.c:248 lpfc_vmid_get_appid() warn: sleeping in atomic context. 9. lpfc_init.c:8342 lpfc_sli4_driver_resource_setup() warn: missing error code 'rc'. 10. lpfc_init.c:13573 lpfc_sli4_hba_unset() warn: variable dereferenced before check 'phba->pport' (see line 13546) 11. lpfc_auth.c:1923 lpfc_auth_handle_dhchap_reply() error: double free of 'hash_value' Fixes: 1. Initialize vlan_id to LPFC_FCOE_NULL_VID. 2. Initialize vlan_id to LPFC_FCOE_NULL_VID. 3. prg->dist is a 2 bit field. Its value can only be between 0-3. Remove redundent check 'if (prg->dist < 4)'. 4. Fix inconsistent indenting. Moved logic into helper function lpfc_fill_vpd(). 5. Define 'eqhdl->irq' as int value as pci_irq_vector() returns int. Also, check for return value of pci_irq_vector() and log message in case of failure. 6. Initialize 'ext_cnt' to 0. 7. Initialize 'ext_size' to 0. 8. Use alloc_percpu_gfp() with GFP_ATOMIC flag. 9. 'rc' was not updated when dma_pool_create() fails. Update 'rc = -ENOMEM' when dma_pool_create() fails before calling goto statement. 10. Add check for 'phba->pport' in lpfc_cpuhp_remove(). 11. Initialize 'hash_value' to NULL, same like 'aug_chal' variable. Link: https://lore.kernel.org/r/20220911221505.117655-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-21 12:38:44 +02:00
Saurav Kashyap	c56523c333	scsi: qedf: Populate sysfs attributes for vport commit `592642e6b1` upstream. Few vport parameters were displayed by systool as 'Unknown' or 'NULL'. Copy speed, supported_speed, frame_size and update port_type for NPIV port. Link: https://lore.kernel.org/r/20220919134434.3513-1-njavali@marvell.com Cc: stable@vger.kernel.org Tested-by: Guangwu Zhang <guazhang@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-21 12:37:45 +02:00
James Smart	2c9b5b8326	scsi: lpfc: Rework MIB Rx Monitor debug info logic commit `bd269188ea` upstream. The kernel test robot reported the following sparse warning: arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates bits from constant value (369 becomes 69) On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not work properly. Replace the Rx Table atomic_t indexing logic with a new lpfc_rx_info_monitor structure that holds a circular ring buffer. For locking semantics, a spinlock_t is used. Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com Fixes: `17b27ac592` ("scsi: lpfc: Add rx monitoring statistics") Cc: <stable@vger.kernel.org> # v5.15+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-21 12:37:45 +02:00
Linus Torvalds	b9b7369d89	scsi: stex: Properly zero out the passthrough command structure commit `6022f21046` upstream. The passthrough structure is declared off of the stack, so it needs to be set to zero before copied back to userspace to prevent any unintentional data leakage. Switch things to be statically allocated which will fill the unused fields with 0 automatically. Link: https://lore.kernel.org/r/YxrjN3OOw2HHl9tx@kroah.com Cc: stable@kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: hdthky <hdthky0@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-15 08:02:56 +02:00
Arun Easi	f22520a213	scsi: qla2xxx: Fix response queue handler reading stale packets commit `e4f8a29deb` upstream. On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer. Link: https://lore.kernel.org/r/20220826102559.17474-3-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-15 08:02:56 +02:00
Arun Easi	add6d15e3d	scsi: qla2xxx: Revert "scsi: qla2xxx: Fix response queue handler reading stale packets" commit `6dc45a7322` upstream. Reverting this commit so that a fixed up patch, without adding new module parameters, can be submitted. Link: https://lore.kernel.org/stable/166039743723771@kroah.com/ This reverts commit `b1f7071469`. Link: https://lore.kernel.org/r/20220826102559.17474-2-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-15 08:02:56 +02:00
Linus Torvalds	71f1875705	ATA fixes for 6.0-rc7 Three late patches to fix problems discovered recently. * Add a horkage to disable link power management by default for the Pioneer BDR-207M and BDR-205 DVD drives (from Niklas). * 2 patches to fix setting the maximum queue depth of libsas owned ATA devices (from me). -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYzUUjAAKCRDdoc3SxdoY du6zAQD5X5JGmXppn5alNTFmfvekLj3wDswGgFcYtopZ0XLCrwD+Pf7K+3VVzhsj IMj45WJe0LlnWeb1J5cCviSrm6lfIwo= =Udns -----END PGP SIGNATURE----- Merge tag 'ata-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ATA fixes from Damien Le Moal: "Three late patches to fix problems discovered recently: - Add a horkage to disable link power management by default for the Pioneer BDR-207M and BDR-205 DVD drives (from Niklas) - Two patches to fix setting the maximum queue depth of libsas owned ATA devices (from me)" * tag 'ata-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-sata: Fix device queue depth control ata: libata-scsi: Fix initialization of device queue depth libata: add ATA_HORKAGE_NOLPM for Pioneer BDR-207M and BDR-205	2022-09-29 05:40:59 -07:00
Damien Le Moal	141f3d6256	ata: libata-sata: Fix device queue depth control The function __ata_change_queue_depth() uses the helper ata_scsi_find_dev() to get the ata_device structure of a scsi device and set that device maximum queue depth. However, when the ata device is managed by libsas, ata_scsi_find_dev() returns NULL, turning __ata_change_queue_depth() into a nop, which prevents the user from setting the maximum queue depth of ATA devices used with libsas based HBAs. Fix this by renaming __ata_change_queue_depth() to ata_change_queue_depth() and adding a pointer to the ata_device structure of the target device as argument. This pointer is provided by ata_scsi_change_queue_depth() using ata_scsi_find_dev() in the case of a libata managed device and by sas_change_queue_depth() using sas_to_ata_dev() in the case of a libsas managed ata device. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: John Garry <john.garry@huawei.com>	2022-09-28 20:47:31 +09:00
Sreekanth Reddy	e0e0747de0	scsi: mpt3sas: Fix return value check of dma_get_required_mask() Fix the incorrect return value check of dma_get_required_mask(). Due to this incorrect check, the driver was always setting the DMA mask to 63 bit. Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com Fixes: `ba27c5cf28` ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-09-15 22:24:28 -04:00
Rafael Mendonca	601be20fc6	scsi: qla2xxx: Fix memory leak in __qlt_24xx_handle_abts() Commit `8f394da36a` ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") made the __qlt_24xx_handle_abts() function return early if tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean up the allocated memory for the management command. Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com Fixes: `8f394da36a` ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-09-15 21:58:03 -04:00
Letu Ren	fbfe96869b	scsi: qedf: Fix a UAF bug in __qedf_probe() In __qedf_probe(), if qedf->cdev is NULL which means qed_ops->common->probe() failed, then the program will goto label err1, and scsi_host_put() will free lport->host pointer. Because the memory qedf points to is allocated by libfc_host_alloc(), it will be freed by scsi_host_put(). However, the if statement below label err0 only checks whether qedf is NULL but doesn't check whether the memory has been freed. So a UAF bug can occur. There are two ways to reach the statements below err0. The first one is described as before, "qedf" should be set to NULL. The second one is goto "err0" directly. In the latter scenario qedf hasn't been changed and it has the initial value NULL. As a result the if statement is not reachable in any situation. The KASAN logs are as follows: [ 2.312969] BUG: KASAN: use-after-free in __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] [ 2.312969] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 2.312969] Call Trace: [ 2.312969] dump_stack_lvl+0x59/0x7b [ 2.312969] print_address_description+0x7c/0x3b0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] __kasan_report+0x160/0x1c0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] kasan_report+0x4b/0x70 [ 2.312969] ? kobject_put+0x25d/0x290 [ 2.312969] kasan_check_range+0x2ca/0x310 [ 2.312969] __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] ? selinux_kernfs_init_security+0xdc/0x5f0 [ 2.312969] ? trace_rpm_return_int_rcuidle+0x18/0x120 [ 2.312969] ? rpm_resume+0xa5c/0x16e0 [ 2.312969] ? qedf_get_generic_tlv_data+0x160/0x160 [ 2.312969] local_pci_probe+0x13c/0x1f0 [ 2.312969] pci_device_probe+0x37e/0x6c0 Link: https://lore.kernel.org/r/20211112120641.16073-1-fantasquex@gmail.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Co-developed-by: Wende Tan <twd2.me@gmail.com> Signed-off-by: Wende Tan <twd2.me@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-09-15 21:26:55 -04:00
Sreekanth Reddy	991df3dd51	scsi: mpt3sas: Fix use-after-free warning Fix the following use-after-free warning which is observed during controller reset: refcount_t: underflow; use-after-free. WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0 Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-09-06 22:31:05 -04:00
Bart Van Assche	8fe4ce5836	scsi: core: Fix a use-after-free There are two .exit_cmd_priv implementations. Both implementations use resources associated with the SCSI host. Make sure that these resources are still available when .exit_cmd_priv is called by waiting inside scsi_remove_host() until the tag set has been freed. This commit fixes the following use-after-free: ================================================================== BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] Read of size 8 at addr ffff888100337000 by task multipathd/16727 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x5e/0x5db kasan_report+0xab/0x120 srp_exit_cmd_priv+0x27/0xd0 [ib_srp] scsi_mq_exit_request+0x4d/0x70 blk_mq_free_rqs+0x143/0x410 __blk_mq_free_map_and_rqs+0x6e/0x100 blk_mq_free_tag_set+0x2b/0x160 scsi_host_dev_release+0xf3/0x1a0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_device_dev_release_usercontext+0x4c1/0x4e0 execute_in_process_context+0x23/0x90 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_disk_release+0x3f/0x50 device_release+0x54/0xe0 kobject_put+0xa5/0x120 disk_release+0x17f/0x1b0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 dm_put_table_device+0xa3/0x160 [dm_mod] dm_put_device+0xd0/0x140 [dm_mod] free_priority_group+0xd8/0x110 [dm_multipath] free_multipath+0x94/0xe0 [dm_multipath] dm_table_destroy+0xa2/0x1e0 [dm_mod] __dm_destroy+0x196/0x350 [dm_mod] dev_remove+0x10c/0x160 [dm_mod] ctl_ioctl+0x2c2/0x590 [dm_mod] dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Link: https://lore.kernel.org/r/20220826002635.919423-1-bvanassche@acm.org Fixes: `65ca846a53` ("scsi: core: Introduce {init,exit}_cmd_priv()") Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Li Zhijian <lizhijian@fujitsu.com> Reported-by: Li Zhijian <lizhijian@fujitsu.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-09-01 01:02:10 -04:00

1 2 3 4 5 ...

23310 commits