linux-stable/arch/x86
Dave Hansen d39268ad24 x86/mm/tlb: Revert retpoline avoidance approach
0day reported a regression on a microbenchmark which is intended to
stress the TLB flushing path:

	https://lore.kernel.org/all/20220317090415.GE735@xsang-OptiPlex-9020/

It pointed at a commit from Nadav which intended to remove retpoline
overhead in the TLB flushing path by taking the 'cond'-ition in
on_each_cpu_cond_mask(), pre-calculating it, and incorporating it into
'cpumask'.  That allowed the code to use a bunch of earlier direct
calls instead of later indirect calls that need a retpoline.

But, in practice, threads can go idle (and into lazy TLB mode where
they don't need to flush their TLB) between the early and late calls.
It works in this direction and not in the other because TLB-flushing
threads tend to hold mmap_lock for write.  Contention on that lock
causes threads to _go_ idle right in this early/late window.

There was not any performance data in the original commit specific
to the retpoline overhead.  I did a few tests on a system with
retpolines:

	https://lore.kernel.org/all/dd8be93c-ded6-b962-50d4-96b1c3afb2b7@intel.com/

which showed a possible small win.  But, that small win pales in
comparison with the bigger loss induced on non-retpoline systems.

Revert the patch that removed the retpolines.  This was not a
clean revert, but it was self-contained enough not to be too painful.

Fixes: 6035152d8e ("x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Nadav Amit <namit@vmware.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/164874672286.389.7021457716635788197.tip-bot2@tip-bot2
2022-04-04 19:41:36 +02:00
..
boot memcpy updates for v5.18-rc1 2022-03-26 12:19:04 -07:00
coco x86/coco: Add API to handle encryption mask 2022-02-23 19:14:29 +01:00
configs x86/config: Make the x86 defconfigs a bit more usable 2022-03-27 20:58:35 +02:00
crypto This push fixes the following issues: 2022-03-31 11:17:39 -07:00
entry Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
events asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
hyperv hyperv-next for 5.17 2022-01-16 15:53:00 +02:00
ia32 audit/stable-5.16 PR 20211101 2021-11-01 21:17:39 -07:00
include * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr 2022-04-02 12:09:02 -07:00
kernel A set of x86 fixes and updates: 2022-04-03 12:15:47 -07:00
kvm * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr 2022-04-02 12:09:02 -07:00
lib A set of x86 fixes and updates: 2022-04-03 12:15:47 -07:00
math-emu x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
mm x86/mm/tlb: Revert retpoline avoidance approach 2022-04-04 19:41:36 +02:00
net Add support for Intel CET-IBT, available since Tigerlake (11th gen), which is a 2022-03-27 10:17:23 -07:00
pci PCI/sysfs: Find shadow ROM before static attribute initialization 2022-01-26 10:41:21 -06:00
platform objtool,efi: Update __efi64_thunk annotation 2022-03-15 10:32:32 +01:00
power x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
purgatory x86/purgatory: Remove -nostdlib compiler flag 2021-12-30 14:13:06 +01:00
ras
realmode - Flush *all* mappings from the TLB after switching to the trampoline 2022-01-10 09:51:38 -08:00
tools x86/build: Use the proper name CONFIG_FW_LOADER 2021-12-29 22:20:38 +01:00
um Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
video
xen xen: branch for v5.18-rc1 2022-03-28 14:32:39 -07:00
.gitignore
Kbuild x86/cc: Move arch/x86/{kernel/cc_platform.c => coco/core.c} 2022-02-23 18:25:58 +01:00
Kconfig Revert the RT related signal changes. They need to be reworked and 2022-04-03 12:08:26 -07:00
Kconfig.assembler
Kconfig.cpu x86/mmx_32: Remove X86_USE_3DNOW 2021-12-11 09:09:45 +01:00
Kconfig.debug
Makefile x86: Remove toolchain check for X32 ABI capability 2022-03-15 10:32:48 +01:00
Makefile.um
Makefile_32.cpu