linux-stable/drivers
Sukadev Bhattiprolu 4219196d1f ibmvnic: fix race between xmit and reset
There is a race between reset and the transmit paths that can lead to
ibmvnic_xmit() accessing an scrq after it has been freed in the reset
path. It can result in a crash like:

	Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
	BUG: Kernel NULL pointer dereference on read at 0x00000000
	Faulting instruction address: 0xc0080000016189f8
	Oops: Kernel access of bad area, sig: 11 [#1]
	...
	NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
	LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
	Call Trace:
	[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
	[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
	[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
	[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
	[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
	[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
	[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
	[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
	[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
	[c000000000cf97e0] neigh_hh_output+0xd0/0x180
	[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
	[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
	[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
	[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
	[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
	[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
	[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
	[c000000000d317bc] tcp_write_timer+0x5c/0x200
	[c000000000243270] call_timer_fn+0x50/0x1c0
	[c000000000243704] __run_timers.part.0+0x324/0x460
	[c000000000243894] run_timer_softirq+0x54/0xa0
	[c000000000ea713c] __do_softirq+0x15c/0x3e0
	[c000000000166258] __irq_exit_rcu+0x158/0x190
	[c000000000166420] irq_exit+0x20/0x40
	[c00000000002853c] timer_interrupt+0x14c/0x2b0
	[c000000000009a00] decrementer_common_virt+0x210/0x220
	--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c

The immediate cause of the crash is the access of tx_scrq in the following
snippet during a reset, where the tx_scrq can be either NULL or an address
that will soon be invalid:

	ibmvnic_xmit()
	{
		...
		tx_scrq = adapter->tx_scrq[queue_num];
		txq = netdev_get_tx_queue(netdev, queue_num);
		ind_bufp = &tx_scrq->ind_buf;

		if (test_bit(0, &adapter->resetting)) {
			...
		}
		...
	}

Beyond that, calling ibmvnic_xmit() at all is not safe during a reset.
The reset path tries to prevent such calls by stopping the queue in
ibmvnic_cleanup(). However, just after the queue is stopped, an in-flight
ibmvnic_complete_tx() can restart the queue while the reset is still in
progress.

Since the queue was restarted we could get a call to ibmvnic_xmit() which
can then access the bad tx_scrq (or other fields).

However, we cannot simply have ibmvnic_complete_tx() check the ->resetting
bit and skip starting the queue. That check can race at the "back-end" of a
good reset, which has just restarted the queue but has not yet cleared the
->resetting bit. If we skip restarting the queue because ->resetting is
still true, the queue would remain stopped indefinitely, potentially
leading to transmit timeouts.

In other words, ->resetting is too broad for this purpose. Instead, use a
new flag that indicates whether or not the queues are active. Only the
open/reset paths control when the queues are active. ibmvnic_complete_tx()
and others wake up the queue only if the queue is marked active.

So we will have:
	A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()

		->resetting = true
		->tx_queues_active = false
		disable tx queues
		...
		->tx_queues_active = true
		start tx queues

	B. Tx interrupt in ibmvnic_complete_tx():

		if (->tx_queues_active)
			netif_wake_subqueue();
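
The two interleavings discussed above can be walked through with a small
single-threaded model. This is only a sketch and not the driver's code:
struct model_adapter and every model_* function are hypothetical names, and
the "ring full" step is an assumed reason for the queue being stopped again
after the reset restarted it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-reduced stand-in for struct ibmvnic_adapter. */
struct model_adapter {
	bool resetting;		/* models the ->resetting bit */
	bool tx_queues_active;	/* models the new ->tx_queues_active flag */
	bool queue_stopped;	/* models the state of one tx queue */
};

/* Part A, front end: mark the queues inactive, then stop them. */
void model_reset_begin(struct model_adapter *a)
{
	a->resetting = true;
	a->tx_queues_active = false;
	a->queue_stopped = true;
}

/* Part A, back end: queues are restarted before ->resetting is cleared. */
void model_reset_restart_queues(struct model_adapter *a)
{
	a->tx_queues_active = true;
	a->queue_stopped = false;
}

void model_reset_end(struct model_adapter *a)
{
	a->resetting = false;
}

/* Assumption: the xmit path stops the queue when its ring fills up. */
void model_xmit_ring_full(struct model_adapter *a)
{
	a->queue_stopped = true;
}

/* Part B: wake the queue only if the queues are marked active. */
void model_complete_tx(struct model_adapter *a)
{
	if (a->tx_queues_active)
		a->queue_stopped = false;
}

/* The rejected alternative: skip the wake-up whenever ->resetting is set. */
void model_complete_tx_check_resetting(struct model_adapter *a)
{
	if (!a->resetting)
		a->queue_stopped = false;
}

/* Walks both interleavings; returns 0 if the model behaves as described. */
int model_demo(void)
{
	struct model_adapter a = { false, true, false };

	/* A completion arrives right after the reset stopped the queue. */
	model_reset_begin(&a);
	model_complete_tx(&a);
	assert(a.queue_stopped);	/* the queue is not restarted */

	/* Back end of the reset: queue restarted, ->resetting still set. */
	model_reset_restart_queues(&a);
	model_xmit_ring_full(&a);
	model_complete_tx_check_resetting(&a);
	assert(a.queue_stopped);	/* ->resetting check leaves it stuck */
	model_complete_tx(&a);
	assert(!a.queue_stopped);	/* the new flag lets it wake up */

	model_reset_end(&a);
	return 0;
}
```

The model is sequential on purpose: it shows only which condition is tested
in each step, not how the flag and the queue state are kept consistent
across CPUs.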

To ensure that ->tx_queues_active and state of the queues are consistent,
we need a lock which:

	- can be taken in the interrupt path (ibmvnic_complete_tx())
	- is shared across the multiple queues in the adapter without
	  serializing them

Use rcu_read_lock() and have the reset thread synchronize_rcu() after
updating the ->tx_queues_active state.
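
As a rough userspace illustration of that read-side/update-side split, the
sketch below swaps RCU for a plain C11 atomic reader count. It is not RCU
and not the driver's code: struct txq_gate and all gate_* names are
hypothetical, and gate_synchronize() waits for the reader count to drop to
zero, which is stricter than waiting only for pre-existing readers as
synchronize_rcu() does. What it keeps is the shape of the pattern: the
completion side never blocks and is shared by all queues, while the
reset/open side updates the flag and then waits before touching the queues.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the flag plus its protection. */
struct txq_gate {
	atomic_int readers;	/* read-side sections currently in flight */
	atomic_bool active;	/* models ->tx_queues_active */
	atomic_bool stopped;	/* models the state of one tx queue */
};

void gate_init(struct txq_gate *g)
{
	atomic_init(&g->readers, 0);
	atomic_init(&g->active, true);
	atomic_init(&g->stopped, false);
}

/* Stand-in for rcu_read_lock(): never blocks, never excludes readers. */
void gate_read_lock(struct txq_gate *g)
{
	atomic_fetch_add(&g->readers, 1);
}

/* Stand-in for rcu_read_unlock(). */
void gate_read_unlock(struct txq_gate *g)
{
	int prev = atomic_fetch_sub(&g->readers, 1);

	assert(prev > 0);	/* unlock without a matching lock is a bug */
	(void)prev;
}

/* Crude stand-in for synchronize_rcu(): spin until no reader is left. */
void gate_synchronize(struct txq_gate *g)
{
	while (atomic_load(&g->readers) != 0)
		;	/* busy-wait; acceptable only in a toy model */
}

/* Reset side: clear the flag, wait for readers, then stop the queue. */
void gate_stop_queues(struct txq_gate *g)
{
	atomic_store(&g->active, false);
	gate_synchronize(g);
	atomic_store(&g->stopped, true);
}

/* Open/reset side: set the flag, wait for readers, then start the queue. */
void gate_start_queues(struct txq_gate *g)
{
	atomic_store(&g->active, true);
	gate_synchronize(g);
	atomic_store(&g->stopped, false);
}

/* Completion side: wake the queue only while it is marked active.
 * Returns true if the queue was woken.
 */
bool gate_complete_tx(struct txq_gate *g)
{
	bool woke = false;

	gate_read_lock(g);
	if (atomic_load(&g->active)) {
		atomic_store(&g->stopped, false);
		woke = true;
	}
	gate_read_unlock(g);
	return woke;
}
```

Once gate_stop_queues() returns, any gate_complete_tx() that could have
seen the old flag value has finished, and any later one sees the flag
cleared, so the queue cannot be woken behind the reset's back.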

While here, consolidate a few boolean fields in ibmvnic_adapter for
better alignment.

Based on discussions with Brian King and Dany Madden.

Fixes: 7ed5b31f4a ("net/ibmvnic: prevent more than one thread from running in reset")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-18 13:22:22 +00:00
accessibility speakup-dectlk: Restore pitch setting 2022-02-08 12:15:04 +01:00
acpi Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid" 2022-03-16 11:23:05 +01:00
amba ARM: 9163/1: amba: Move of_amba_device_decode_irq() into amba_probe() 2021-12-17 11:34:35 +00:00
android Merge 5.16-rc8 into char-misc-next 2022-01-03 13:44:38 +01:00
ata ata: pata_hpt37x: disable primary channel on HPT371 2022-02-23 09:39:37 +09:00
atm atm: eni: Add check for dma_map_single 2022-03-15 11:01:52 +00:00
auxdisplay auxdisplay: lcd2s: Use proper API to free the instance of charlcd object 2022-03-03 00:30:31 +01:00
base regmap: Fix for v5.17 2022-02-25 12:30:01 -08:00
bcma
block xen: XSA-396 security patches for v5.17 2022-03-09 20:44:17 -08:00
bluetooth virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes 2022-01-18 10:05:48 +02:00
bus bus: mhi: pci_generic: Add mru_default for Cinterion MV31-W 2022-02-06 13:19:46 +01:00
cdrom cdrom: simplify subdirectory registration with register_sysctl() 2022-01-22 08:33:35 +02:00
char virtio_console: break out of buf poll on remove 2022-03-04 08:33:22 -05:00
clk clk: lan966x: Fix linking error 2022-02-24 16:53:24 -08:00
clocksource ARM: dts: Use 32KiHz oscillator on devkit8000 2022-02-18 10:08:45 +02:00
comedi
connector connector/cn_proc: Use task_is_in_init_pid_ns() 2022-01-26 18:57:09 -08:00
counter counter: fix an IS_ERR() vs NULL bug 2022-01-26 19:40:33 +01:00
cpufreq cpufreq: qcom-hw: Delay enabling throttle_irq 2022-02-09 13:18:49 +05:30
cpuidle cpuidle: use default_groups in kobj_type 2022-01-05 18:31:17 +01:00
crypto crypto: qcom-rng - ensure buffer for generate is completely filled 2022-03-14 14:41:04 +12:00
cxl cxl/core: Remove cxld_const_init in cxl_decoder_alloc() 2022-01-04 17:29:31 -08:00
dax Merge branch 'akpm' (patches from Andrew) 2022-01-15 20:37:06 +02:00
dca
devfreq
dio
dma dmaengine: shdma: Fix runtime PM imbalance on error 2022-02-15 11:04:16 +05:30
dma-buf dma-buf: heaps: Fix potential spectre v1 gadget 2022-02-01 13:18:09 +05:30
edac EDAC: Fix calculation of returned address and next offset in edac_align_ptr() 2022-02-15 15:54:46 +01:00
eisa
extcon extcon: Deduplicate code in extcon_set_state_sync() 2021-12-24 15:27:52 +09:00
firewire
firmware Final EFI fix for v5.17 2022-03-16 11:57:46 -07:00
fpga
fsi
gnss gnss: usb: add support for Sierra Wireless XM1210 2021-12-22 15:38:12 +01:00
gpio Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)" 2022-03-15 09:59:08 -07:00
gpu drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP 2022-03-12 17:41:30 +10:00
greybus greybus: es2: fix typo in a comment 2021-12-21 10:13:26 +01:00
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid 2022-03-09 13:47:12 -08:00
hsi
hv Drivers: hv: utils: Make use of the helper macro LIST_HEAD() 2022-02-09 14:33:21 +00:00
hwmon hwmon: (pmbus) Clear pmbus fault/warning bits after read 2022-02-22 08:15:39 -08:00
hwspinlock
hwtracing
i2c i2c: brcmstb: fix support for DSL and CM variants 2022-02-18 10:37:33 +01:00
i3c i3c: master: dw: check return of dw_i3c_master_get_free_pos() 2022-01-13 02:05:50 +01:00
idle
iio 1st set of IIO fixes for the 5.17 cycle. 2022-02-21 17:58:09 +01:00
infiniband RDMA/cma: Do not change route.addr.src_addr outside state checks 2022-02-25 16:46:51 -04:00
input Input: elan_i2c - fix regulator enable count imbalance after suspend/resume 2022-03-01 20:41:22 -08:00
interconnect
iommu iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find 2022-02-28 14:01:57 +01:00
ipack
irqchip irqchip/sifive-plic: Add missing thead,c900-plic match string 2022-02-02 10:49:29 +00:00
isdn isdn: hfcpci: check the return value of dma_set_mask() in setup_hw() 2022-03-07 11:27:12 +00:00
leds LED updates for 5.17. Nothing major is happening here. 2022-01-12 16:59:22 -08:00
macintosh macintosh/mac_hid.c: simplify subdirectory registration with register_sysctl() 2022-01-22 08:33:35 +02:00
mailbox - qcom: misc updates to qcom-ipcc driver 2022-01-13 11:19:07 -08:00
mcb
md block: fix surprise removal for drivers calling blk_set_queue_dying 2022-02-17 07:54:03 -07:00
media bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
memory MTD core changes: 2022-01-11 11:35:28 -08:00
memstick
message scsi: message: fusion: mptctl: Use dma_alloc_coherent() 2022-01-10 10:33:52 -05:00
mfd driver core changes for 5.17-rc1 2022-01-12 11:11:34 -08:00
misc eeprom: ee1004: limit i2c reads to I2C_SMBUS_BLOCK_MAX 2022-02-04 16:27:44 +01:00
mmc mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND 2022-03-07 11:47:39 +01:00
most
mtd mtd: rawnand: omap2: Actually prevent invalid configuration and build error 2022-03-07 17:46:54 +01:00
mux
net ibmvnic: fix race between xmit and reset 2022-03-18 13:22:22 +00:00
nfc NFC: port100: fix use-after-free in port100_send_complete 2022-03-09 19:59:34 -08:00
ntb ntb: intel: fix port config status offset for SPR 2022-01-28 10:19:16 -05:00
nubus proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
nvdimm virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes 2022-01-18 10:05:48 +02:00
nvme nvme-tcp: send H2CData PDUs based on MAXH2CDATA 2022-02-23 14:43:11 +01:00
nvmem nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property 2022-02-21 17:59:25 +01:00
of of/fdt: move elfcorehdr reservation early for crash dump kernel 2022-02-17 17:13:52 -06:00
opp
parisc parisc: Fix sglist access in ccio-dma.c 2022-01-28 10:15:34 +01:00
parport
pci A single fix for a regression caused by the recent PCI/MSI rework which 2022-02-27 13:07:40 -08:00
pcmcia pci-v5.17-changes 2022-01-16 08:08:11 +02:00
perf Rework of the MSI interrupt infrastructure: 2022-01-13 09:05:29 -08:00
phy phy: dphy: Correct clk_pre parameter 2022-02-02 10:33:04 +05:30
pinctrl pinctrl: sunxi: Use unique lockdep classes for IRQs 2022-02-28 23:53:19 +01:00
platform surface: surface3_power: Fix battery readings on batteries without a serial number 2022-02-24 13:48:39 +01:00
pnp proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
power power: supply: bq256xx: Handle OOM correctly 2022-02-11 21:19:51 +01:00
powercap Merge back earlier power capping changes for v5.17 2021-12-27 16:51:12 +01:00
pps
ps3
ptp ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments 2022-03-02 09:51:21 -08:00
pwm pwm: Changes for v5.17-rc1 2022-01-20 13:25:01 +02:00
rapidio rapidio: remove not used code about RIO_VID_TUNDRA 2021-12-21 10:22:19 +01:00
ras
regulator regulator: da9121: Remove surplus DA9141 parameters 2022-02-22 11:56:29 +00:00
remoteproc remoteproc: qcom: q6v5: fix service routines build errors 2022-01-17 16:44:26 -06:00
reset SoC: Add support for StarFive JH7100 RISC-V SoC 2022-01-10 08:32:37 -08:00
rpmsg rpmsg fixes for v5.17-rc1 2022-01-27 11:23:26 +02:00
rtc rtc: sunplus: fix return value in sp_rtc_probe() 2022-01-16 23:50:34 +01:00
s390 s390/cio: verify the driver availability for path_event call 2022-02-09 22:55:01 +01:00
sbus
scsi xen/scsifront: don't use gnttab_query_foreign_access() for mapped status 2022-03-07 09:48:54 +01:00
sh
siox
slimbus
soc ARM: SoC fixes for 5.17, part 3 2022-03-10 11:43:01 -08:00
soundwire Char/Misc and other driver changes for 5.17-rc1 2022-01-14 16:02:28 +01:00
spi spi: Fix for v5.17 2022-03-10 04:15:09 -08:00
spmi spmi: spmi-pmic-arb: fix irq_set_type race condition 2021-12-17 17:18:18 +01:00
ssb
staging staging: rtl8723bs: Improve the comment explaining the locking rules 2022-03-02 16:38:24 +01:00
target scsi: target: iscsi: Make sure the np under each tpg is unique 2022-01-24 23:30:24 -05:00
tc
tee OP-TEE fix error return code in probe functions 2022-02-18 17:30:01 +01:00
thermal thermal: core: Fix TZ_GET_TRIP NULL pointer dereference 2022-03-01 16:11:38 +01:00
thunderbolt thunderbolt: Add module parameter for CLx disabling 2021-12-28 10:43:56 +03:00
tty TTY/Serial driver fixes for 5.17-rc6 2022-02-25 11:45:29 -08:00
uio UIO: use default_groups in kobj_type 2021-12-29 10:54:50 +01:00
usb xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done() 2022-03-07 09:48:55 +01:00
vdpa vdpa: fix use-after-free on vp_vdpa_remove 2022-03-06 06:06:50 -05:00
vfio VFIO updates for v5.17-rc1 2022-01-20 13:31:46 +02:00
vhost Networking fixes for 5.17-final, including fixes from netfilter, ipsec, 2022-03-17 12:55:26 -07:00
video * drm/panel: simple: Fix assignments from panel_dpi_probe() 2022-02-11 12:06:15 +10:00
virt bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
virtio virtio: drop default for virtio-mem 2022-03-06 06:06:50 -05:00
visorbus
vlynq
vme
w1 w1: w1_therm: use swap() to make code cleaner 2021-12-21 10:38:13 +01:00
watchdog linux-watchdog 5.17-rc1 tag 2022-01-17 08:07:57 +02:00
xen xen/gnttab: fix gnttab_end_foreign_access() without page specified 2022-03-07 09:48:55 +01:00
zorro proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
Kconfig
Makefile