linux-stable/tools
Peter Zijlstra 28ca351296 x86/retpoline: Simplify retpolines
commit 119251855f upstream.

Due to:

  c9c324dc22 ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg: the actual retpoline

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
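
As a rough sanity check of the cacheline claim, the Python sketch below tests
whether a 32-byte slot (both symbols together end at offset 0x20 in the
listing above) can straddle a cache line. The 64-byte line size and the
helper names are assumptions made for illustration; the commit message does
not state them.

```python
CACHELINE = 64  # assumption: 64-byte cache lines (not stated in the commit)
SLOT = 32       # thunk + retpoline span offsets 0x00..0x1f in the listing


def same_cacheline(start, length, line=CACHELINE):
    """Return True if [start, start + length) lies within one cache line."""
    return start // line == (start + length - 1) // line


def all_slots_ok(n_slots, slot=SLOT, line=CACHELINE):
    """Check n_slots consecutive slots, each aligned to its own size."""
    return all(same_cacheline(i * slot, slot, line) for i in range(n_slots))


# Lengths read off the listing: the thunk is padded up to offset 0x5, and the
# retpoline plus its trailing nopw runs up to offset 0x20.
thunk_len = 0x05 - 0x00
retpoline_len = 0x20 - 0x05
total_len = thunk_len + retpoline_len
```

Under these assumptions a slot aligned to 32 bytes never crosses a 64-byte
boundary, while the same 32 bytes placed at, say, offset 48 would.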

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)
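
The padding in the listing above can be reproduced with a small model: the
original instruction is padded with single-byte NOPs up to the length of the
longest replacement, and the rest of the slot is alignment fill. This is only
a sketch in Python. The function names are hypothetical, the 32-byte slot
size is read off the listing, and the second replacement of the alternative_2
is left out because its bytes are not shown in this message.

```python
JMP_RAX = bytes.fromhex("ffe0")  # jmpq *%rax, bytes taken from the listing
RETPOLINE_RAX = bytes.fromhex(
    "e807000000"  # callq  (rel32 = +7)
    "f390"        # pause
    "0faee8"      # lfence
    "ebf9"        # jmp    (rel8 = -7)
    "48890424"    # mov    %rax,(%rsp)
    "c3"          # retq
)
NOP = 0x90
SLOT = 32  # assumption: each thunk occupies a 32-byte aligned slot


def pad_to_longest(original, replacements):
    """Pad the original bytes with NOPs to the longest replacement length."""
    longest = max(len(r) for r in replacements)
    return original + bytes([NOP]) * max(0, longest - len(original))


def alignment_fill(length, slot=SLOT):
    """Number of fill bytes needed to reach the next slot boundary."""
    return (-length) % slot


thunk = pad_to_longest(JMP_RAX, [RETPOLINE_RAX])
print(len(thunk), thunk.count(NOP), alignment_fill(len(thunk)))  # 17 15 15
```

The 15 single-byte NOPs correspond to offsets 0x2 through 0x10 in the
listing, and the 15 fill bytes to the two long NOPs at 0x11 and 0x1c.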

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes of NOP padding at the end of our 32-byte slot. (IOW, if
we can shrink the retpoline by 1 byte, we can pack it more densely.)
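
The byte counts and branch targets quoted above can be checked mechanically.
The Python sketch below sums the instruction lengths from the listing,
decodes the two relative branches, and computes the slot size. The helper
names are hypothetical, and the power-of-two slot rule is an assumption that
is merely consistent with the 32-byte slot mentioned here.

```python
# (offset, encoding, mnemonic) copied from the replacement listing above.
INSNS = [
    (0x00, "e807000000", "callq"),
    (0x05, "f390", "pause"),
    (0x07, "0faee8", "lfence"),
    (0x0a, "ebf9", "jmp"),
    (0x0c, "48890424", "mov %rax,(%rsp)"),
    (0x10, "c3", "retq"),
]


def total_length(insns):
    """Sum of the encoded lengths of all instructions."""
    return sum(len(bytes.fromhex(enc)) for _, enc, _ in insns)


def branch_target(offset, enc):
    """Target of a relative branch with a one-byte opcode (e8 or eb here).

    The displacement is little-endian, signed, and relative to the address
    of the next instruction.
    """
    raw = bytes.fromhex(enc)
    disp = int.from_bytes(raw[1:], "little", signed=True)
    return offset + len(raw) + disp


def slot_size(length):
    """Smallest power of two >= length (assumed alignment rule)."""
    size = 1
    while size < length:
        size *= 2
    return size


length = total_length(INSNS)
padding = slot_size(length) - length
```

With these inputs the sequence is 17 bytes and needs a 32-byte slot with 15
bytes of padding, whereas a 16-byte sequence would fit a 16-byte slot with
none.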

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
[bwh: Backported to 5.10:
 - Use X86_FEATURE_RETPOLINE_LFENCE flag instead of
   X86_FEATURE_RETPOLINE_AMD, since the later renaming of this flag
   has already been applied
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-25 11:26:13 +02:00
accounting
arch x86: Add insn_decode_kernel() 2022-07-25 11:26:12 +02:00
bootconfig tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh 2021-09-26 14:08:59 +02:00
bpf tools/resolve_btfids: Do not print any commands when building silently 2022-02-08 18:30:39 +01:00
build tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts 2022-04-13 21:01:10 +02:00
cgroup tools/cgroup/slabinfo.py: updated to work on current kernel 2021-05-07 11:04:31 +02:00
debugging
edid
firewire
firmware
gpio tools: gpio: fix %llu warning in gpio-watch.c 2021-01-27 11:55:20 +01:00
hv
iio
include x86/insn: Add an insn_decode() API 2022-07-25 11:26:11 +02:00
io_uring
kvm/kvm_stat tools/kvm_stat: Add restart delay 2021-04-16 11:43:20 +02:00
laptop
leds
lib libbpf: Fix logic for finding matching program for CO-RE relocation 2022-06-09 10:21:03 +02:00
memory-model
objtool x86/retpoline: Simplify retpolines 2022-07-25 11:26:13 +02:00
pci
pcmcia
perf x86/insn: Add a __ignore_sync_check__ marker 2022-07-25 11:26:11 +02:00
power tools/power turbostat: fix ICX DRAM power numbers 2022-06-09 10:20:51 +02:00
scripts tools: Allow proper CC/CXX/... override with LLVM=1 in Makefile.include 2021-07-31 08:16:10 +02:00
spi
testing selftests: forwarding: fix error message in learning_test 2022-07-12 16:32:22 +02:00
thermal/tmon tools/thermal/tmon: Add cross compiling support 2021-09-18 13:40:07 +02:00
time
usb usb: testusb: Fix for showing the connection speed 2021-10-09 14:40:56 +02:00
virtio tools/virtio: compile with -pthread 2022-05-25 09:17:53 +02:00
vm tools/vm/page-types: remove dependency on opt_file for idle page tracking 2021-10-09 14:40:57 +02:00
wmi
Makefile