linux-stable/arch/x86/mm
Dave Hansen d39268ad24 x86/mm/tlb: Revert retpoline avoidance approach
0day reported a regression on a microbenchmark which is intended to
stress the TLB flushing path:

	https://lore.kernel.org/all/20220317090415.GE735@xsang-OptiPlex-9020/

It pointed at a commit from Nadav which intended to remove retpoline
overhead in the TLB flushing path by taking the 'cond'-ition in
on_each_cpu_cond_mask(), pre-calculating it, and incorporating it into
'cpumask'.  That allowed the code to use a bunch of earlier direct
calls instead of later indirect calls that need a retpoline.

But, in practice, threads can go idle (and into lazy TLB mode where
they don't need to flush their TLB) between the early and late calls.
It works in this direction and not in the other because TLB-flushing
threads tend to hold mmap_lock for write.  Contention on that lock
causes threads to _go_ idle right in this early/late window.

There was not any performance data in the original commit specific
to the retpoline overhead.  I did a few tests on a system with
retpolines:

	https://lore.kernel.org/all/dd8be93c-ded6-b962-50d4-96b1c3afb2b7@intel.com/

which showed a possible small win.  But, that small win pales in
comparison with the bigger loss induced on non-retpoline systems.

Revert the patch that removed the retpolines.  This was not a
clean revert, but it was self-contained enough not to be too painful.

Fixes: 6035152d8e ("x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Nadav Amit <namit@vmware.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/164874672286.389.7021457716635788197.tip-bot2@tip-bot2
2022-04-04 19:41:36 +02:00
..
pat - Remove a misleading message and an unused function 2022-03-21 11:49:16 -07:00
Makefile x86/sev: Move common memory encryption code to mem_encrypt.c 2021-12-08 16:49:53 +01:00
amdtopology.c
cpu_entry_area.c x86/sev: Make the #VC exception stacks part of the default stacks storage 2021-10-06 21:48:27 +02:00
debug_pagetables.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
dump_pagetables.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
extable.c x86/entry_32: Fix segment exceptions 2022-01-12 16:38:25 +01:00
fault.c mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit 2022-01-15 16:30:27 +02:00
highmem_32.c x86/mm/highmem: Use generic kmap atomic implementation 2020-11-06 23:14:55 +01:00
hugetlbpage.c mm: remove unneeded includes of <asm/pgalloc.h> 2020-08-07 11:33:26 -07:00
ident_map.c x86/mm/ident_map: Check for errors from ident_pud_init() 2020-10-28 14:48:30 +01:00
init.c mm/migration: add trace events for base page and HugeTLB migrations 2022-03-24 19:06:45 -07:00
init_32.c x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text() 2021-11-09 10:02:51 -08:00
init_64.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
iomap_32.c io-mapping: Cleanup atomic iomap 2020-11-06 23:14:58 +01:00
ioremap.c x86/boot: Add setup_indirect support in early_memremap_is_setup_data() 2022-03-09 12:49:46 +01:00
kasan_init_64.c memblock: use memblock_free for freeing virtual pointers 2021-11-06 13:30:41 -07:00
kaslr.c x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00
kmmio.c x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00
maccess.c x86: Share definition of __is_canonical_address() 2022-02-02 13:11:42 +01:00
mem_encrypt.c x86/sev: Move common memory encryption code to mem_encrypt.c 2021-12-08 16:49:53 +01:00
mem_encrypt_amd.c x86/mm/cpa: Generalize __set_memory_enc_pgtable() 2022-02-23 19:14:29 +01:00
mem_encrypt_boot.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
mem_encrypt_identity.c x86/coco: Add API to handle encryption mask 2022-02-23 19:14:29 +01:00
mm_internal.h x86/mm: thread pgprot_t through init_memory_mapping() 2020-04-10 15:36:21 -07:00
mmap.c x86/mm/mmap: Fix -Wmissing-prototypes warnings 2020-04-22 20:19:48 +02:00
mmio-mod.c x86/mmiotrace: Replace deprecated CPU-hotplug functions. 2021-08-10 14:46:27 +02:00
numa.c arch/x86/mm/numa: Do not initialize nodes twice 2022-03-22 15:57:06 -07:00
numa_32.c x86/mm: Drop deprecated DISCONTIGMEM support for 32-bit 2020-05-28 18:34:30 +02:00
numa_64.c
numa_emulation.c memblock: use memblock_free for freeing virtual pointers 2021-11-06 13:30:41 -07:00
numa_internal.h
pf_in.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
pf_in.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
pgtable.c Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" 2021-07-21 11:28:09 +01:00
pgtable_32.c mm: remove unneeded includes of <asm/pgalloc.h> 2020-08-07 11:33:26 -07:00
physaddr.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
physaddr.h
pkeys.c Fixes and improvements for FPU handling on x86: 2021-07-07 11:12:01 -07:00
pti.c x86/process/64: Move cpu_current_top_of_stack out of TSS 2021-03-28 22:40:10 +02:00
setup_nx.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
srat.c
testmmiotrace.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
tlb.c x86/mm/tlb: Revert retpoline avoidance approach 2022-04-04 19:41:36 +02:00