linux-stable/drivers
Sagi Grimberg f5eb7e12a7 nvme-mpath: replace direct_make_request with generic_make_request
The below patches caused a regression in a multipath setup:
Fixes: 9f98772ba3 ("nvme-rdma: fix controller reset hang during traffic")
Fixes: 2875b0aeca ("nvme-tcp: fix controller reset hang during traffic")

These patches on their own are correct because they fixed a controller reset
regression.

When we reset or tear down a controller, we must freeze and quiesce the
namespaces' request queues to make sure that we safely stop inflight I/O submissions.
Freeze is mandatory because if our hctx map changed between reconnects,
blk_mq_update_nr_hw_queues will immediately attempt to freeze the queue, and
if it still has pending submissions (that are still quiesced) it will hang.
This is what the above patches fixed.

However, by freezing the namespaces' request queues and only unfreezing them
when we successfully reconnect, inflight submissions that are running
concurrently can now block while holding the nshead srcu read lock until either
we successfully reconnect or ctrl_loss_tmo expires (or the user explicitly
disconnects).

This caused a deadlock [1] when a different controller (a different path on the
same subsystem) became live (i.e. optimized/non-optimized). This is because
nvme_mpath_set_live needs to synchronize the nshead srcu before requeueing I/O
in order to make sure that current_path is visible to future (re)submissions.
However, the srcu read lock is held by a submission blocked on a frozen request
queue, so synchronize_srcu never completes and we have a deadlock.

In recent kernels (v5.9+) direct_make_request was replaced by submit_bio_noacct,
which does not have this issue because the bio_list will be active when
nvme-mpath calls submit_bio_noacct on the bottom device (because it was
populated when submit_bio was triggered on it).

Hence, we need to fix all kernels that predate the introduction of
submit_bio_noacct.
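
The ordering argument above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not kernel code: every name in
it (srcu_readers, list_active, bottom_queue_enter, submit_to_head, and the
model_* helpers) is hypothetical. It assumes that a direct submission enters
the bottom queue immediately, inside the nshead srcu read-side section, while
a generic submission made with an active bio_list is only queued and enters
the bottom queue after the read-side section has ended.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8

static int srcu_readers;            /* models the nshead srcu read-side count */
static bool list_active;            /* models an active per-task bio_list */
static int pending[MAX_PENDING];    /* models bios deferred on that list */
static int npending;
static int entered_with_srcu_held;  /* bottom-queue entries made under srcu */

/* Models entering the bottom (per-path) queue, which blocks while frozen. */
static void bottom_queue_enter(int bio)
{
    (void)bio;
    if (srcu_readers > 0)
        entered_with_srcu_held++;   /* blocking here would stall synchronize_srcu */
}

/* Models the direct path: enter the bottom queue right away. */
static void model_direct_make_request(int bio)
{
    bottom_queue_enter(bio);
}

/* Models the generic path: defer the bio if a list is active. */
static void model_generic_make_request(int bio)
{
    if (list_active) {
        assert(npending < MAX_PENDING);
        pending[npending++] = bio;
        return;
    }
    bottom_queue_enter(bio);
}

/* Models the multipath head handler: pick a path under the srcu read lock. */
static void head_make_request(int bio, bool deferred)
{
    srcu_readers++;                 /* read lock */
    if (deferred)
        model_generic_make_request(bio);
    else
        model_direct_make_request(bio);
    srcu_readers--;                 /* read unlock */
}

/* Submits one bio to the head device; returns entries made under srcu. */
static int submit_to_head(int bio, bool deferred)
{
    entered_with_srcu_held = 0;
    npending = 0;
    list_active = true;             /* the outer submission owns the list */
    head_make_request(bio, deferred);
    for (int i = 0; i < npending; i++)
        bottom_queue_enter(pending[i]);  /* drained after srcu is released */
    list_active = false;
    return entered_with_srcu_held;
}
```

In this model, submit_to_head(1, false) reports one bottom-queue entry made
while the read lock is held, which is the situation that can block
synchronize_srcu; submit_to_head(1, true) reports none, because the deferred
bio is only handled once the read-side section is over.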

[1]:
Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
Call Trace:
 __schedule+0x293/0x730
 schedule+0x33/0xa0
 schedule_timeout+0x1d3/0x2f0
 wait_for_completion+0xba/0x140
 __synchronize_srcu.part.21+0x91/0xc0
 synchronize_srcu_expedited+0x27/0x30
 synchronize_srcu+0xce/0xe0
 nvme_mpath_set_live+0x64/0x130 [nvme_core]
 nvme_update_ns_ana_state+0x2c/0x30 [nvme_core]
 nvme_update_ana_state+0xcd/0xe0 [nvme_core]
 nvme_parse_ana_log+0xa1/0x180 [nvme_core]
 nvme_read_ana_log+0x76/0x100 [nvme_core]
 nvme_mpath_init+0x122/0x180 [nvme_core]
 nvme_init_identify+0x80e/0xe20 [nvme_core]
 nvme_tcp_setup_ctrl+0x359/0x660 [nvme_tcp]
 nvme_tcp_reconnect_ctrl_work+0x24/0x70 [nvme_tcp]

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10 13:34:31 +02:00
accessibility
acpi ACPI: scan: Use unique number for instance_no 2021-03-30 14:35:28 +02:00
amba amba: Fix resource leak for drivers without .remove 2021-03-04 10:26:32 +01:00
android binder: add flag to clear buffer on txn complete 2020-12-30 11:51:35 +01:00
ata ata: ahci_brcm: Add back regulators management 2021-03-04 10:26:23 +01:00
atm atm: idt77252: fix null-ptr-dereference 2021-03-30 14:35:21 +02:00
auxdisplay auxdisplay: ht16k33: Fix refresh rate handling 2021-03-04 10:26:30 +01:00
base PM: runtime: Fix ordering in pm_runtime_get_suppliers() 2021-04-07 14:47:42 +02:00
bcma
block xen-blkback: don't leak persistent grants from xen_blkbk_map() 2021-03-30 14:35:30 +02:00
bluetooth Bluetooth: hci_h5: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for btrtl 2021-03-07 12:20:44 +01:00
bus bus: ti-sysc: Fix warning on unbind if reset is not deasserted 2021-04-10 13:34:30 +02:00
cdrom
char tpm, tpm_tis: Decorate tpm_get_timeouts() with request_locality() 2021-03-09 11:09:36 +01:00
clk clk: aspeed: Fix APLL calculate formula from ast2600-A2 2021-03-04 10:26:34 +01:00
clocksource clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined 2021-03-04 10:26:29 +01:00
connector
counter counter: stm32-timer-cnt: fix ceiling write max value 2021-03-24 11:26:43 +01:00
cpufreq cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdev 2021-03-30 14:35:21 +02:00
cpuidle
crypto crypto: sun4i-ss - initialize need_fallback 2021-03-04 10:26:45 +01:00
dax device-dax/core: Fix memory leak when rmmod dax.ko 2020-12-30 11:51:46 +01:00
dca
devfreq
dio
dma dmaengine: hsu: disable spurious interrupt 2021-03-04 10:26:28 +01:00
dma-buf dmabuf: fix use-after-free of dmabuf's file->f_inode 2021-01-12 20:16:23 +01:00
edac EDAC/amd64: Fix PCI component registration 2020-12-30 11:51:36 +01:00
eisa
extcon extcon: Fix error handling in extcon_dev_register 2021-04-07 14:47:43 +02:00
firewire firewire: nosy: Fix a use-after-free bug in nosy_ioctl() 2021-04-07 14:47:43 +02:00
firmware firmware/efi: Fix a use after bug in efi_mem_reserve_persistent 2021-03-24 11:26:45 +01:00
fpga
fsi
gnss
gpio gpiolib: acpi: Add missing IRQF_ONESHOT 2021-03-30 14:35:21 +02:00
gpu drm/msm: Ratelimit invalid-fence message 2021-04-10 13:34:31 +02:00
greybus
hid HID: logitech-dj: add support for the new lightspeed connection iteration 2021-03-17 17:03:43 +01:00
hsi HSI: Fix PM usage counter unbalance in ssi_hw_init 2021-03-04 10:26:26 +01:00
hv Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind() 2021-03-04 10:26:24 +01:00
hwmon hwmon: (pwm-fan) Ensure that calculation doesn't discard big period values 2021-01-19 18:26:15 +01:00
hwspinlock
hwtracing stm class: Fix module init return on allocation failure 2021-01-27 11:47:50 +01:00
i2c i2c: rcar: optimize cacheline to minimize HW race condition 2021-03-17 17:03:41 +01:00
i3c i3c master: fix missing destroy_workqueue() on error in i3c_master_register 2021-01-06 14:48:40 +01:00
ide scsi: ide: Do not set the RQF_PREEMPT flag for sense requests 2021-01-12 20:16:09 +01:00
idle
iio iio: hid-sensor-temperature: Fix issues of timestamp channel 2021-03-24 11:26:43 +01:00
infiniband RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server 2021-03-30 14:35:28 +02:00
input Input: applespi - don't wait for responses to commands indefinitely. 2021-03-17 17:03:44 +01:00
interconnect interconnect: qcom: qcs404: Remove GPU and display RPM IDs 2020-12-16 10:56:56 +01:00
iommu iommu/amd: Fix performance counter initialization 2021-03-17 17:03:43 +01:00
ipack
irqchip irqchip/ingenic: Add support for the JZ4760 2021-03-30 14:35:22 +02:00
isdn mISDN: fix crash in fritzpci 2021-04-10 13:34:30 +02:00
leds leds: trigger: fix potential deadlock with libata 2021-02-03 23:25:58 +01:00
lightnvm lightnvm: fix memory leak when submit fails 2021-01-27 11:47:53 +01:00
macintosh
mailbox
mcb
md dm ioctl: fix out of bounds array access when no devices 2021-03-30 14:35:24 +02:00
media media: rc: compile rc-cec.c into rc-core 2021-03-17 17:03:40 +01:00
memory memory: ti-aemif: Drop child node when jumping out loop 2021-03-04 10:26:14 +01:00
memstick memstick: r592: Fix error return in r592_probe() 2020-12-30 11:51:18 +01:00
message scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove() 2020-11-05 11:43:25 +01:00
mfd mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq() 2021-03-04 10:26:33 +01:00
misc habanalabs: Call put_pid() when releasing control device 2021-03-30 14:35:22 +02:00
mmc mmc: cqhci: Fix random crash when remove mmc module/card 2021-03-17 17:03:48 +01:00
mtd mtd: spi-nor: hisi-sfc: Put child node np on error path 2021-03-04 10:26:48 +01:00
mux
net net: pxa168_eth: Fix a potential data race in pxa168_eth_remove 2021-04-10 13:34:30 +02:00
nfc nfc: s3fwrn5: Release the nfc firmware 2020-12-30 11:51:26 +01:00
ntb
nubus
nvdimm libnvdimm/dimm: Avoid race between probe and available_slots_show() 2021-02-10 09:25:30 +01:00
nvme nvme-mpath: replace direct_make_request with generic_make_request 2021-04-10 13:34:31 +02:00
nvmem nvmem: core: skip child nodes not matching binding 2021-03-04 10:26:37 +01:00
of of/fdt: Make sure no-map does not remove already reserved regions 2021-03-04 10:26:28 +01:00
opp opp: Reduce the size of critical section in _opp_table_kref_release() 2020-11-18 19:20:21 +01:00
oprofile
parisc
parport
pci PCI: rpadlpar: Fix potential drc_name corruption in store functions 2021-03-24 11:26:43 +01:00
pcmcia
perf
phy phy: rockchip-emmc: emmc_phy_init() always return 0 2021-03-04 10:26:36 +01:00
pinctrl pinctrl: rockchip: fix restore error in resume 2021-04-07 14:47:43 +02:00
platform platform/x86: thinkpad_acpi: Allow the FnLock LED to change state 2021-04-10 13:34:31 +02:00
pnp
power power: reset: at91-sama5d2_shdwc: fix wkupdbc mask 2021-03-04 10:26:28 +01:00
powercap powercap: restrict energy meter to root access 2020-11-10 21:13:20 +01:00
pps
ps3 powerpc/ps3: use dma_mapping_error() 2020-12-30 11:51:26 +01:00
ptp
pwm pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare() 2021-03-04 10:26:36 +01:00
rapidio
ras
regulator regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck 2021-03-30 14:35:22 +02:00
remoteproc remoteproc: qcom: Fix potential NULL dereference in adsp_init_mmio() 2020-12-30 11:51:24 +01:00
reset
rpmsg rpmsg: glink: Use complete_all for open states 2020-11-05 11:43:20 +01:00
rtc rtc: s5m: select REGMAP_I2C 2021-03-04 10:26:29 +01:00
s390 s390/dasd: fix hanging IO request during DASD driver unbind 2021-03-17 17:03:48 +01:00
sbus
scsi scsi: qla2xxx: Fix broken #endif placement 2021-04-07 14:47:40 +02:00
sfi
sh
siox
slimbus slimbus: qcom-ngd-ctrl: Avoid sending power requests without QMI 2020-12-30 11:51:13 +01:00
soc soc: aspeed: snoop: Add clock control logic 2021-03-04 10:26:16 +01:00
soundwire soundwire: cadence: fix ACK/NAK handling 2021-03-04 10:26:36 +01:00
spi spi: stm32: make spurious and overrun interrupts visible 2021-03-17 17:03:42 +01:00
spmi spmi: spmi-pmic-arb: Fix hw_irq overflow 2021-03-04 10:26:49 +01:00
ssb
staging staging: rtl8192e: Change state information from u16 to u8 2021-04-07 14:47:44 +02:00
target scsi: target: pscsi: Clean up after failure in pscsi_map_sg() 2021-04-10 13:34:31 +02:00
tc
tee tee: optee: replace might_sleep with cond_resched 2021-02-03 23:25:58 +01:00
thermal thermal/core: Add NULL pointer check before using cooling device stats 2021-04-07 14:47:40 +02:00
thunderbolt thunderbolt: Fix use-after-free in remove_unplugged_switch() 2020-12-11 13:23:29 +01:00
tty vt/consolemap: do font sum unsigned 2021-03-07 12:20:44 +01:00
uio uio: Fix use-after-free in uio_unregister_device() 2020-11-18 19:20:29 +01:00
usb usb: dwc2: Prevent core suspend when port connection flag is 0 2021-04-07 14:47:44 +02:00
vfio vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends 2021-04-07 14:47:43 +02:00
vhost vhost: Fix vhost_vq_reset() 2021-04-07 14:47:39 +02:00
video drivers: video: fbcon: fix NULL dereference in fbcon_cursor() 2021-04-07 14:47:44 +02:00
virt virt: vbox: Do not use wait_event_interruptible when called from kernel context 2021-03-04 10:26:10 +01:00
virtio virtio_ring: Fix two use after free bugs 2020-12-30 11:51:29 +01:00
visorbus
vlynq
vme
w1 w1: mxc_w1: Fix timeout resolution problem leading to bus error 2020-11-05 11:43:25 +01:00
watchdog watchdog: mei_wdt: request stop on unregister 2021-03-04 10:26:47 +01:00
xen xen/events: avoid handling the same event on two cpus at the same time 2021-03-17 17:03:58 +01:00
zorro
Kconfig
Makefile