linux-stable/arch
Jue Wang 8ca97812c3 x86/mce: Work around an erratum on fast string copy instructions
A rare kernel panic scenario can happen when the following conditions
are met due to an erratum on fast string copy instructions:

1) An uncorrected error.
2) That error must be in the first cache line of a page.
3) The kernel must execute copy_page from the page immediately before
that page.

The fast string copy instructions ("REP; MOVS*") could consume an
uncorrectable memory error in the cache line _right after_ the desired
region to copy and raise an MCE.
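
As an illustration of the address arithmetic behind the three conditions
above, here is a minimal sketch. It assumes 4 KiB pages and 64-byte cache
lines. The helper name and constants are hypothetical and are not taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES  4096u /* assumption: 4 KiB pages */
#define CACHE_LINE_BYTES 64u   /* assumption: 64-byte cache lines */

/*
 * Hypothetical helper: returns true when poison_addr lies in the first
 * cache line of the page immediately after the page being copied, i.e.
 * the only region the erratum can touch beyond the intended copy.
 */
static bool poison_hits_erratum_window(uint64_t src_page, uint64_t poison_addr)
{
    /* The source of a whole-page copy is page-aligned. */
    assert((src_page % PAGE_SIZE_BYTES) == 0);

    uint64_t next_page = src_page + PAGE_SIZE_BYTES;

    return poison_addr >= next_page &&
           poison_addr < next_page + CACHE_LINE_BYTES;
}
```

With a source page at 0x1000, only poison in the range 0x2000 to 0x203f
falls inside the window. Poison at 0x2040 or later is never touched by the
copy, and poison at 0x1fff is a genuine error in the source page itself.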

Clearing bit 0 of MSR_IA32_MISC_ENABLE disables fast string copy and
avoids such spurious machine checks. However, that is less preferable
because of the permanent performance impact. Since memory poison is
rare, it is desirable to keep fast string copy enabled until an MCE is
seen.
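
The "disable only once an MCE is seen" idea can be sketched in user space
as follows. This is not the kernel code: struct fake_cpu stands in for a
CPU's IA32_MISC_ENABLE register, which real code would access with MSR
read/write helpers. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of IA32_MISC_ENABLE: fast string operations enabled. */
#define FAST_STRING_ENABLE_BIT (1ULL << 0)

/* Hypothetical stand-in for one logical CPU's IA32_MISC_ENABLE value. */
struct fake_cpu {
    uint64_t misc_enable;
};

/*
 * Clear bit 0 on the CPU that saw the spurious MCE. Returns true if
 * fast string copy was enabled and has now been turned off, false if
 * it was already off (so the quirk has nothing left to do).
 */
static bool disable_fast_string(struct fake_cpu *cpu)
{
    assert(cpu != NULL);

    if (!(cpu->misc_enable & FAST_STRING_ENABLE_BIT))
        return false;

    cpu->misc_enable &= ~FAST_STRING_ENABLE_BIT;
    return true;
}
```

Only the bit on the affected CPU changes, so other CPUs keep the faster
copy path until they hit the same condition themselves.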

Intel has confirmed the following:
1. The CPU erratum of fast string copy only applies to Skylake,
Cascade Lake and Cooper Lake generations.

Directly returning from the MCE handler:
2. Will result in complete execution of the "REP; MOVS*" with no data
loss or corruption.
3. Will not result in another MCE firing on the next poisoned cache line
due to "REP; MOVS*".
4. Will resume execution from a correct point in code.
5. Will result in the same instruction that triggered the MCE firing a
second MCE immediately for any other software recoverable data fetch
errors.
6. Is not safe without disabling the fast string copy, as the next fast
string copy of the same buffer on the same CPU would result in a PANIC
MCE.

This should mitigate the erratum completely, with the only caveat that
fast string copy is disabled on the affected hyperthread, which degrades
performance on that thread.

This is still better than the OS crashing on MCEs raised in an
unrelated process due to "REP; MOVS*" accesses in a kernel context,
e.g., copy_page.

Tested:

Injected errors on the first cache line of 8 anonymous pages of process
'proc1' and observed MCE consumption from 'proc2' with no panic
(directly returned).

Without the fix, the host panicked within a few minutes on a
random 'proc2' process due to kernel access from copy_page.

  [ bp: Fix comment style + touch ups, zap an unlikely(), improve the
    quirk function's readability. ]

Signed-off-by: Jue Wang <juew@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20220218013209.2436006-1-juew@google.com
2022-02-19 14:26:42 +01:00
..
alpha bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
arc bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
arm Amlogic fixes for v5.17-rc 2022-02-08 10:51:05 +01:00
arm64 ARM: SoC fixes for 5.17 2022-02-11 13:40:03 -08:00
csky bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
h8300 bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
hexagon bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
ia64 ia64: make IA64_MCA_RECOVERY bool instead of tristate 2022-01-30 09:56:58 +02:00
m68k bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
microblaze Kbuild updates for v5.17 2022-01-19 11:15:19 +02:00
mips MIPS: DTS: CI20: fix how ddc power is enabled 2022-02-09 13:58:26 +01:00
nds32 Kbuild updates for v5.17 2022-01-19 11:15:19 +02:00
nios2 Kbuild updates for v5.17 2022-01-19 11:15:19 +02:00
openrisc bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
parisc bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
powerpc powerpc/64s/interrupt: Fix decrementer storm 2022-01-25 16:50:10 +11:00
riscv RISC-V Fixes for 5.17-rc4 2022-02-11 12:02:09 -08:00
s390 s390 updates for 5.17-rc4 2022-02-12 09:12:44 -08:00
sh bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
sparc bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
um virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes 2022-01-18 10:05:48 +02:00
x86 x86/mce: Work around an erratum on fast string copy instructions 2022-02-19 14:26:42 +01:00
xtensa bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
.gitignore
Kconfig Merge branch 'akpm' (patches from Andrew) 2022-01-20 10:41:01 +02:00