linux-stable/drivers
Shuai Xue 6ddbd411a0 ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
[ Upstream commit 415fed694f ]

If an error is detected as a result of a user-space process accessing a
corrupt memory location, the CPU may take an abort. The platform firmware
then reports the error to the kernel via NMI-like notifications, e.g.
NOTIFY_SEA, NOTIFY_SOFTWARE_DELEGATED, etc.

For NMI-like notifications, commit 7f17b4a121 ("ACPI: APEI: Kick the
memory_failure() queue for synchronous errors") keeps track of whether
memory_failure() work was queued, and makes task_work pending to flush out
the queue so that the work is processed before returning to user-space.

The code uses init_mm to check whether the error occurred in user space:

    if (current->mm != &init_mm)

The condition is always true, because _nobody_ ever has "init_mm" as a real
VM any more.

In addition to aborts, errors can also be signaled as asynchronous
exceptions, such as interrupts and SError. In such cases, the interrupted
current process could be any kind of thread. When a kernel thread is
interrupted, the work ghes_kick_task_work deferred to task_work will never
be processed because entry_handler returns to call ret_to_kernel() instead
of ret_to_user(). Consequently, the estatus_node allocated from
ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed.
After around 200 allocations on our platform, the ghes_estatus_pool will
run out of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a
result, the event fails to be processed.

    sdei: event 805 on CPU 113 failed with error: -2

Finally, a lot of unhandled events may cause platform firmware to exceed
some threshold and reboot.
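
The leak described above can be illustrated with a small user-space
simulation. This is only a sketch under stated assumptions, not the kernel
code: struct pool, pool_alloc(), pool_free() and handle_event() are
hypothetical names, and POOL_SLOTS is set to 200 only because the text cites
"around 200 allocations" on one platform.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool size; the commit message cites "around 200" on one platform. */
#define POOL_SLOTS 200

/* Hypothetical stand-in for a fixed-size pool such as ghes_estatus_pool. */
struct pool {
	int in_use;
};

/* Take one slot; returns 0 on success or -ENOMEM once the pool is exhausted. */
static int pool_alloc(struct pool *p)
{
	assert(p != NULL);
	if (p->in_use >= POOL_SLOTS)
		return -ENOMEM;
	p->in_use++;
	return 0;
}

/* Give one slot back to the pool. */
static void pool_free(struct pool *p)
{
	assert(p != NULL && p->in_use > 0);
	p->in_use--;
}

/*
 * Simulate one error event: a node is allocated up front and is freed only
 * if the deferred work actually runs. When the work is queued on a task
 * that never returns to user space, it never runs and the node leaks.
 */
static int handle_event(struct pool *p, bool deferred_work_runs)
{
	int ret = pool_alloc(p);

	if (ret)
		return ret;
	if (deferred_work_runs)
		pool_free(p);
	return 0;
}
```

In this model, calling handle_event() with deferred_work_runs set to false
succeeds POOL_SLOTS times and then fails with -ENOMEM, while calling it with
true can be repeated indefinitely.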

The condition should generally just do

    if (current->mm)

as described in active_mm.rst documentation.

Then, if an asynchronous error is detected while a kernel thread is running
(e.g. when detected by a background scrubber), do not add task_work to it,
as the original patch intended.
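
The difference between the two conditions can be shown with a user-space
sketch. It assumes, as the text states, that a kernel thread has no mm of
its own (current->mm is NULL). The struct definitions below are simplified
stand-ins for the kernel types, and old_check() and new_check() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's mm_struct and task_struct. */
struct mm_struct {
	int unused;
};

struct task_struct {
	struct mm_struct *mm;	/* NULL for a kernel thread (assumption from the text) */
};

/* Stand-in for the kernel's init_mm. */
static struct mm_struct init_mm;

/*
 * Old condition: current->mm != &init_mm.
 * NULL is also different from &init_mm, so kernel threads pass this check.
 */
static bool old_check(const struct task_struct *t)
{
	assert(t != NULL);
	return t->mm != &init_mm;
}

/*
 * New condition: current->mm.
 * True only for tasks that have a user address space.
 */
static bool new_check(const struct task_struct *t)
{
	assert(t != NULL);
	return t->mm != NULL;
}
```

For a task with mm set to NULL, old_check() returns true while new_check()
returns false, which is why only the new condition keeps task_work away from
kernel threads.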

Fixes: 7f17b4a121 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26 12:35:28 +02:00
accessibility tty: the rest, stop using tty_schedule_flip() 2022-07-29 17:25:32 +02:00
acpi ACPI: APEI: do not add task_work to kernel thread to avoid memory leak 2022-10-26 12:35:28 +02:00
amba
android binder: remove inaccurate mmap_assert_locked() 2022-09-23 14:15:49 +02:00
ata libata: add ATA_HORKAGE_NOLPM for Pioneer BDR-207M and BDR-205 2022-10-05 10:39:38 +02:00
atm atm: idt77252: fix use-after-free bugs caused by tst_timer 2022-08-25 11:40:15 +02:00
auxdisplay
base arm64: topology: move store_cpu_topology() to shared code 2022-10-26 12:34:22 +02:00
bcma
block xen-blkfront: Cache feature_persistent value before advertisement 2022-09-08 12:28:05 +02:00
bluetooth Bluetooth: hci_{ldisc,serdev}: check percpu_init_rwsem() failure 2022-10-26 12:34:44 +02:00
bus bus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe() 2022-08-17 14:23:10 +02:00
cdrom
char hwrng: imx-rngc - Moving IRQ handler registering after imx_rngc_irq_mask_clear() 2022-10-26 12:35:24 +02:00
clk clk: ast2600: BCLK comes from EPLL 2022-10-26 12:35:20 +02:00
clocksource clocksource/drivers/ixp4xx: remove EXPORT_SYMBOL_GPL from ixp4xx_timer_setup() 2022-07-07 17:53:32 +02:00
comedi comedi: vmk80xx: fix expression for tx buffer size 2022-06-22 14:22:03 +02:00
connector
counter
cpufreq cpufreq: check only freq_table in __resolve_freq() 2022-09-15 11:30:01 +02:00
cpuidle
crypto crypto: cavium - prevent integer overflow loading firmware 2022-10-26 12:35:28 +02:00
cxl cxl/port: Hold port reference until decoder release 2022-07-12 16:34:58 +02:00
dax devdax: Fix soft-reservation memory description 2022-09-28 11:11:57 +02:00
dca
devfreq PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events 2022-07-07 17:53:27 +02:00
dio
dma dmaengine: ioat: stop mod_timer from resurrecting deleted timer in __cleanup() 2022-10-26 12:35:18 +02:00
dma-buf udmabuf: Set the DMA mask for the udmabuf device (v2) 2022-09-05 10:30:06 +02:00
edac EDAC/ghes: Set the DIMM label unconditionally 2022-08-03 12:03:55 +02:00
eisa
extcon
firewire
firmware firmware: google: Test spinlock on panic path to avoid lockups 2022-10-26 12:35:15 +02:00
fpga fpga: prevent integer overflow in dfl_feature_ioctl_set_irq() 2022-10-26 12:35:07 +02:00
fsi fsi: core: Check error number after calling ida_simple_get 2022-10-26 12:35:17 +02:00
gnss
gpio gpio: rockchip: request GPIO mux to pinctrl when setting direction 2022-10-26 12:34:26 +02:00
gpu drm/vmwgfx: Fix memory leak in vmw_mksstat_add_ioctl() 2022-10-26 12:34:55 +02:00
greybus
hid HID: multitouch: Add memory barriers 2022-10-26 12:34:21 +02:00
hsi HSI: omap_ssi_port: Fix dma_map_sg error check 2022-10-26 12:35:05 +02:00
hv Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region 2022-09-28 11:11:55 +02:00
hwmon hwmon: (pmbus/mp2888) Fix sensors readouts for MPS Multi-phase mp2888 controller 2022-10-26 12:34:48 +02:00
hwspinlock
hwtracing coresight: etm4x: avoid build failure with unrolled loops 2022-08-25 11:40:35 +02:00
i2c i2c: mlxbf: support lock mechanism 2022-10-26 12:34:46 +02:00
i3c
idle intel_idle: Disable IBRS during long idle 2022-07-23 12:54:04 +02:00
iio iio: magnetometer: yas530: Change data type of hard_offsets to signed 2022-10-26 12:35:03 +02:00
infiniband RDMA/rxe: Fix resize_finish() in rxe_queue.c 2022-10-26 12:35:16 +02:00
input Input: xpad - fix wireless 360 controller breaking after suspend 2022-10-15 07:59:04 +02:00
interconnect interconnect: imx: fix max_node_id 2022-08-17 14:23:53 +02:00
iommu iommu/omap: Fix buffer overflow in debugfs 2022-10-26 12:35:25 +02:00
ipack
irqchip irqchip/tegra: Fix overflow implicit truncation warnings 2022-08-25 11:40:32 +02:00
isdn mISDN: fix use-after-free bugs in l1oip timer handlers 2022-10-26 12:34:47 +02:00
leds leds: lm3601x: Don't use mutex after it was destroyed 2022-10-26 12:34:39 +02:00
macintosh macintosh/adb: fix oob read in do_adb_query() function 2022-08-11 13:07:54 +02:00
mailbox mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg 2022-10-26 12:35:21 +02:00
mcb
md md/raid5: Remove unnecessary bio_put() in raid5_read_one_chunk() 2022-10-26 12:35:12 +02:00
media media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init 2022-10-26 12:35:06 +02:00
memory memory: of: Fix refcount leak bug in of_lpddr3_get_ddr_timings() 2022-10-26 12:34:58 +02:00
memstick memstick/ms_block: Fix a memory leak 2022-08-17 14:23:50 +02:00
message
mfd mfd: sm501: Add check for platform_driver_register() 2022-10-26 12:35:18 +02:00
misc misc: ocxl: fix possible refcount leak in afu_ioctl() 2022-10-26 12:35:07 +02:00
mmc mmc: wmt-sdmmc: Fix an error handling path in wmt_mci_probe() 2022-10-26 12:34:56 +02:00
most
mtd mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct() 2022-10-26 12:35:12 +02:00
mux
net net: mvpp2: fix mvpp2 debugfs leak 2022-10-26 12:34:50 +02:00
nfc nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout 2022-08-31 17:16:38 +02:00
ntb NTB: ntb_tool: uninitialized heap data in tool_fn_write() 2022-08-25 11:40:14 +02:00
nubus
nvdimm nvdimm: Fix badblocks clear off-by-one error 2022-07-07 17:53:24 +02:00
nvme nvme-pci: set min_align_mask before calculating max_hw_sectors 2022-10-26 12:34:23 +02:00
nvmem nvmem: core: Fix memleak in nvmem_register() 2022-10-26 12:34:23 +02:00
of of: fdt: fix off-by-one error in unflatten_dt_nodes() 2022-09-23 14:15:46 +02:00
opp opp: Fix error check in dev_pm_opp_attach_genpd() 2022-08-17 14:24:01 +02:00
parisc parisc: ccio-dma: Add missing iounmap in error path in ccio_probe() 2022-09-23 14:15:48 +02:00
parport
pci PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge 2022-10-26 12:34:24 +02:00
pcmcia
perf perf/arm_pmu_platform: fix tests for platform_get_irq() failure 2022-09-20 12:39:45 +02:00
phy phy: qualcomm: call clk_disable_unprepare in the error handling 2022-10-26 12:35:14 +02:00
pinctrl pinctrl: rockchip: add pinmux_ops.gpio_set_direction callback 2022-10-26 12:34:26 +02:00
platform platform/chrome: cros_ec_typec: Correct alt mode index 2022-10-26 12:34:53 +02:00
pnp
power power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe 2022-07-29 17:25:10 +02:00
powercap powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain 2022-10-26 12:34:25 +02:00
pps
ps3
ptp
pwm pwm: lpc18xx: Fix period handling 2022-08-17 14:23:16 +02:00
rapidio
ras
regulator regulator: qcom_rpm: Fix circular deferral regression 2022-10-26 12:34:21 +02:00
remoteproc remoteproc: sysmon: Wait for SSCTL service to come up 2022-08-17 14:24:09 +02:00
reset reset: imx7: Fix the iMX8MP PCIe PHY PERST support 2022-10-05 10:39:40 +02:00
rpmsg rpmsg: qcom: glink: replace strncpy() with strscpy_pad() 2022-10-12 09:53:28 +02:00
rtc rtc: rx8025: fix 12/24 hour mode detection on RX-8035 2022-08-17 14:22:53 +02:00
s390 s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup 2022-09-28 11:11:54 +02:00
sbus
scsi scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() 2022-10-26 12:35:16 +02:00
sh
siox
slimbus slimbus: qcom-ngd-ctrl: allow compile testing without QCOM_RPROC_COMMON 2022-10-26 12:35:14 +02:00
soc soc/tegra: fuse: Drop Kconfig dependency on TEGRA20_APB_DMA 2022-10-26 12:35:00 +02:00
soundwire soundwire: qcom: fix device status array range 2022-09-08 12:28:03 +02:00
spi spi: Ensure that sg_table won't be used after being freed 2022-10-26 12:34:48 +02:00
spmi spmi: pmic-arb: correct duplicate APID to PPID mapping logic 2022-10-26 12:35:19 +02:00
ssb
staging staging: vt6655: fix some erroneous memory clean-up loops 2022-10-26 12:35:14 +02:00
target
tc
tee tee: fix compiler warning in tee_shm_register() 2022-09-15 11:30:03 +02:00
thermal thermal/drivers/qcom/tsens-v0_1: Fix MSM8939 fourth sensor hw_id 2022-10-26 12:35:28 +02:00
thunderbolt thunderbolt: Explicitly enable lane adapter hotplug events at startup 2022-10-26 12:34:32 +02:00
tty serial: 8250: Fix restoring termios speed after suspend 2022-10-26 12:35:15 +02:00
uio
usb usb: mtu3: fix failed runtime suspend in host only mode 2022-10-26 12:35:19 +02:00
vdpa vdpa/ifcvf: fix the calculation of queuepair 2022-10-05 10:39:43 +02:00
vfio vfio/type1: Unpin zero pages 2022-09-15 11:30:02 +02:00
vhost vhost/vsock: Use kvmalloc/kvfree for larger packets. 2022-10-26 12:34:47 +02:00
video fbdev: smscufx: Fix use-after-free in ufx_ops_open() 2022-10-26 12:34:26 +02:00
virt vboxguest: Do not use devm for irq 2022-08-25 11:40:33 +02:00
virtio virtio_mmio: Restore guest page size on resume 2022-07-21 21:24:33 +02:00
visorbus
vlynq
vme
w1
watchdog watchdog: armada_37xx_wdt: check the return value of devm_ioremap() in armada_37xx_wdt_probe() 2022-08-17 14:24:11 +02:00
xen xen/gntdev: Accommodate VMA splitting 2022-10-26 12:34:24 +02:00
zorro
Kconfig
Makefile