linux-stable/Documentation
zhenwei pi fbbc999464 mm/memory-failure: disable unpoison once hw error happens
commit 67f22ba775 upstream.

Currently unpoison_memory(unsigned long pfn) is designed for soft
poison(hwpoison-inject) only.  Since 17fae1294a, the KPTE gets cleared
on a x86 platform once hardware memory corrupts.

Unpoisoning a hardware corrupted page puts page back buddy only, the
kernel has a chance to access the page with *NOT PRESENT* KPTE.  This
leads BUG during accessing on the corrupted KPTE.

Suggested by David&Naoya, disable unpoison mechanism when a real HW error
happens to avoid BUG like this:

 Unpoison: Software-unpoisoned page 0x61234
 BUG: unable to handle page fault for address: ffff888061234000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062
 Oops: 0002 [#1] PREEMPT SMP NOPTI
 CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G   M       OE     5.18.0.bm.1-amd64 #7
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...
 RIP: 0010:clear_page_erms+0x7/0x10
 Code: ...
 RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000
 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000
 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276
 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001
 FS:  00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  prep_new_page+0x151/0x170
  get_page_from_freelist+0xca0/0xe20
  ? sysvec_apic_timer_interrupt+0xab/0xc0
  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
  __alloc_pages+0x17e/0x340
  __folio_alloc+0x17/0x40
  vma_alloc_folio+0x84/0x280
  __handle_mm_fault+0x8d4/0xeb0
  handle_mm_fault+0xd5/0x2a0
  do_user_addr_fault+0x1d0/0x680
  ? kvm_read_and_reset_apf_flags+0x3b/0x50
  exc_page_fault+0x78/0x170
  asm_exc_page_fault+0x27/0x30

Link: https://lkml.kernel.org/r/20220615093209.259374-2-pizhenwei@bytedance.com
Fixes: 847ce401df ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294a ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-29 09:04:41 +02:00
..
ABI iio: adc: vf610: fix conversion mode sysfs node name 2022-06-29 09:04:35 +02:00
accounting sched/psi: report zeroes for CPU full at the system level 2022-06-09 10:30:00 +02:00
admin-guide x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data 2022-06-16 13:32:03 +02:00
arc
arm
arm64 arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs 2022-05-12 13:15:38 +01:00
block
bpf docs: netdev: move the netdev-FAQ to the process pages 2022-03-31 10:49:39 +02:00
cdrom
core-api XArray update for 5.18: 2022-04-01 13:40:44 -07:00
cpu-freq
crypto
dev-tools Documentation: kunit: fix path to .kunitconfig in start.rst 2022-04-04 12:02:44 -06:00
devicetree dt-bindings: usb: ehci: Increase the number of PHYs 2022-06-29 09:04:37 +02:00
doc-guide
driver-api docs: driver-api/thermal/intel_dptf: Use copyright symbol 2022-06-09 10:29:55 +02:00
fault-injection
fb
features nds32: Remove the architecture 2022-03-07 13:54:59 +01:00
filesystems netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context 2022-06-22 14:28:12 +02:00
firmware-guide Merge branch 'i2c/for-mergewindow' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2022-03-26 12:46:08 -07:00
firmware_class
fpga
gpu pci-v5.18-changes 2022-03-25 13:02:05 -07:00
hid
hwmon Char/Misc and other driver updates for 5.18-rc1 2022-03-28 12:27:35 -07:00
i2c
ia64
ide
iio
infiniband
input Input: docs: add more details on the use of BTN_TOOL 2022-03-01 15:46:03 +01:00
isdn
kbuild kbuild: Make $(LLVM) more flexible 2022-03-31 12:03:46 +09:00
kernel-hacking
leds
litmus-tests
livepatch
locking Documentation: Fix duplicate statement about raw_spinlock_t type 2022-03-25 13:30:08 -06:00
m68k
maintainer Some late-arriving documentation improvements. This is mostly build-system 2022-03-31 12:10:42 -07:00
mhi
mips
misc-devices
netlabel
networking doc/ip-sysctl: add bc_forwarding 2022-04-20 10:31:43 +01:00
nios2
nvdimm
openrisc
parisc
PCI PCI/doc: cleanup references to the legacy PCI DMA API 2022-03-30 16:54:24 +02:00
pcmcia
peci
power Documentation: EM: Describe new registration method using DT 2022-03-03 09:35:04 +05:30
powerpc
process docs: submitting-patches: Fix crossref to 'The canonical patch format' 2022-06-06 08:48:59 +02:00
RCU
riscv Documentation: riscv: remove non-existent directory from table of contents 2022-03-31 16:18:56 -07:00
s390
scheduler Changes in this cycle were: 2022-03-22 14:39:12 -07:00
scsi scsi: ufs: docs: UFS documentation corrections 2022-03-08 22:49:49 -05:00
security Documentation: siphash: disambiguate HalfSipHash algorithm from hsiphash functions 2022-04-25 17:26:40 +02:00
sh
sound ALSA: usb-audio: Add quirk bits for enabling/disabling generic implicit fb 2022-06-09 10:29:49 +02:00
sparc
sphinx docs: sphinx/requirements: Limit jinja2<3.1 2022-03-30 13:44:54 -06:00
sphinx-static
spi
staging remoteproc: Change rproc_shutdown() to return a status 2022-03-11 14:31:55 -06:00
target
timers
tools rtla/Makefile: Properly handle dependencies 2022-06-14 18:45:03 +02:00
trace Updates to Tracing: 2022-04-03 12:26:01 -07:00
translations Kbuild -std=gnu11 updates for v5.18 2022-03-25 11:48:01 -07:00
tty
usb
userspace-api media: lirc: add missing exceptions for lirc uapi header file 2022-06-09 10:30:55 +02:00
virt KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT 2022-04-29 12:38:22 -04:00
vm mm/memory-failure: disable unpoison once hw error happens 2022-06-29 09:04:41 +02:00
w1
watchdog
x86 - More noinstr fixes 2022-03-25 12:34:53 -07:00
xtensa
.gitignore
arch.rst
asm-annotations.rst
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0 2022-06-09 10:30:54 +02:00
COPYING-logo
docutils.conf
dontdiff
index.rst
Kconfig
logo.gif
Makefile
memory-barriers.txt
SubmittingPatches
watch_queue.rst