linux-stable/Documentation
Daniel Sneddon f2f41ef035 x86/speculation: Add RSB VM Exit protections
commit 2b12993220 upstream.

tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
documented for RET instructions after VM exits. Mitigate it with a new
one-entry RSB stuffing mechanism and a new LFENCE.

== Background ==

Indirect Branch Restricted Speculation (IBRS) was designed to help
mitigate Branch Target Injection and Speculative Store Bypass, i.e.
Spectre, attacks. IBRS prevents software run in less privileged modes
from affecting branch prediction in more privileged modes. IBRS requires
the MSR to be written on every privilege level change.

To overcome some of the performance issues of IBRS, Enhanced IBRS was
introduced.  eIBRS is an "always on" IBRS, in other words, just turn
it on once instead of writing the MSR on every privilege level change.
When eIBRS is enabled, more privileged modes should be protected from
less privileged modes, including protecting VMMs from guests.

== Problem ==

Here's a simplification of how guests are run on Linux' KVM:

void run_kvm_guest(void)
{
	// Prepare to run guest
	VMRESUME();
	// Clean up after guest runs
}

The execution flow for that would look something like this to the
processor:

1. Host-side: call run_kvm_guest()
2. Host-side: VMRESUME
3. Guest runs, does "CALL guest_function"
4. VM exit, host runs again
5. Host might make some "cleanup" function calls
6. Host-side: RET from run_kvm_guest()

Now, when back on the host, there are a couple of possible scenarios of
post-guest activity the host needs to do before executing host code:

* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
touched and Linux has to do a 32-entry stuffing.

* on eIBRS hardware, VM exit with IBRS enabled, or restoring the host
IBRS=1 shortly after VM exit, has a documented side effect of flushing
the RSB except in this PBRSB situation where the software needs to stuff
the last RSB entry "by hand".

IOW, with eIBRS supported, host RET instructions should no longer be
influenced by guest behavior after the host retires a single CALL
instruction.

However, if the RET instructions are "unbalanced" with CALLs after a VM
exit as is the RET in #6, it might speculatively use the address for the
instruction after the CALL in #3 as an RSB prediction. This is a problem
since the (untrusted) guest controls this address.

Balanced CALL/RET instruction pairs such as in step #5 are not affected.

== Solution ==

The PBRSB issue affects a wide variety of Intel processors which
support eIBRS. But not all of them need mitigation. Today,
X86_FEATURE_RETPOLINE triggers an RSB filling sequence that mitigates
PBRSB. Systems setting RETPOLINE need no further mitigation - i.e.,
eIBRS systems which enable retpoline explicitly.

However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RETPOLINE
and most of them need a new mitigation.

Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
which triggers a lighter-weight PBRSB mitigation versus RSB Filling at
vmexit.

The lighter-weight mitigation performs a CALL instruction which is
immediately followed by a speculative execution barrier (INT3). This
steers speculative execution to the barrier -- just like a retpoline
-- which ensures that speculation can never reach an unbalanced RET.
Then, ensure this CALL is retired before continuing execution with an
LFENCE.

In other words, the window of exposure is opened at VM exit where RET
behavior is troublesome. While the window is open, force RSB predictions
sampling for RET targets to a dead end at the INT3. Close the window
with the LFENCE.

There is a subset of eIBRS systems which are not vulnerable to PBRSB.
Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.

  [ bp: Massage, incorporate review comments from Andy Cooper. ]
  [ Pawan: Update commit message to replace RSB_VMEXIT with RETPOLINE ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-11 12:57:53 +02:00
..
ABI iio: adc: vf610: fix conversion mode sysfs node name 2022-06-29 08:58:47 +02:00
EDID docs: driver-api: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
PCI Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-27 14:23:31 -07:00
RCU Merge branches 'consolidate.2019.08.01b', 'fixes.2019.08.12a', 'lists.2019.08.13a' and 'torture.2019.08.01b' into HEAD 2019-08-13 14:30:30 -07:00
accounting psi: Fix uaf issue when psi trigger is destroyed while being polled 2022-02-05 12:35:36 +01:00
admin-guide x86/speculation: Add RSB VM Exit protections 2022-08-11 12:57:53 +02:00
arm ARM: 9012/1: move device tree mapping out of linear region 2021-05-19 10:08:32 +02:00
arm64 userfaultfd: do not untag user pointers 2021-07-28 13:31:01 +02:00
block docs: block: null_blk: enhance document style 2019-09-11 16:04:22 -06:00
bpf bpf/flow_dissector: document flags 2019-07-25 18:00:41 -07:00
cdrom docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
core-api XArray: add xas_split 2021-06-10 13:37:14 +02:00
cpu-freq Documentation: cpufreq: Update policy notifier documentation 2019-09-02 22:44:05 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-09-18 12:11:14 -07:00
dev-tools mm, page_owner: decouple freeing stack trace from debug_pagealloc 2019-10-14 15:04:00 -07:00
devicetree dt-bindings: dma: allwinner,sun50i-a64-dma: Fix min/max typo 2022-07-12 16:30:49 +02:00
doc-guide docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
driver-api Documentation: fix firewire.rst ABI file path error 2022-01-27 09:19:52 +01:00
fault-injection docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
fb fbdev: fix numbering of fbcon options 2020-02-24 08:36:42 +01:00
features It's a somewhat calmer cycle for docs this time, as the churn of the mass 2019-09-17 16:22:26 -07:00
filesystems ext4, doc: fix incorrect h_reserved size 2022-04-27 13:50:50 +02:00
firmware-guide Documentation: ACPI: Fix data node reference documentation 2022-01-27 09:19:52 +01:00
firmware_class
fpga Documentation: fpga: dfl: add descriptions for virtualization and new interfaces. 2019-09-03 19:35:42 -07:00
gpu Merge drm/drm-next into drm-intel-next-queued 2019-08-22 00:10:36 -07:00
hid docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
hwmon Revert "hwmon: Make chip parameter for with_info API mandatory" 2022-06-25 12:44:36 +02:00
i2c docs: i2c: convert to ReST and add to driver-api bookset 2019-07-31 13:25:27 -06:00
ia64 docs: add SPDX tags to new index files 2019-07-15 11:03:03 -03:00
ide docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
iio docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
infiniband Documentation/infiniband: update name of some functions 2019-09-13 16:55:55 -03:00
input Input: docs: fix spelling mistake "potocol" -> "protocol" 2019-08-06 11:24:49 -06:00
ioctl fs-verity: add UAPI header 2019-07-28 16:59:16 -07:00
isdn docs: isdn: convert to ReST and add to kAPI bookset 2019-07-31 13:30:25 -06:00
kbuild kbuild: support LLVM=1 to switch the default tools to Clang/LLVM 2020-08-26 10:40:47 +02:00
kernel-hacking docs: Add documentation for Symbol Namespaces 2019-09-10 10:30:49 +02:00
leds leds: core: Add support for composing LED class device names 2019-07-25 20:07:52 +02:00
livepatch docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
locking doc🔒 remove reference to clever use of read-write lock 2019-09-14 01:53:27 -06:00
m68k docs: README.buddha: convert to ReST and add to m68k book 2019-07-31 13:30:10 -06:00
maintainer docs: Fix typo on pull requests guide 2019-08-12 15:14:14 -06:00
media media: videodev2.h: RGB BT2020 and HSV are always full range 2020-11-05 11:43:15 +01:00
mic docs: driver-api: add remaining converted dirs to it 2019-07-15 11:03:03 -03:00
mips Main MIPS changes for v5.4: 2019-09-22 09:30:30 -07:00
misc-devices Docs: misc: xilinx_sdfec: Add documentation 2019-08-15 17:54:38 +02:00
netlabel docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
networking Documentation: fix sctp_wmem in ip-sysctl.rst 2022-08-03 11:59:40 +02:00
nios2 docs: nios2: add it to the main Documentation body 2019-07-31 13:31:51 -06:00
openrisc docs: openrisc: convert to ReST and add to documentation body 2019-07-31 13:30:20 -06:00
parisc docs: parisc: convert to ReST and add to documentation body 2019-07-31 13:30:15 -06:00
pcmcia docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
power Merge branches 'pm-opp', 'pm-qos', 'acpi-pm', 'pm-domains' and 'pm-tools' 2019-09-17 09:49:19 +02:00
powerpc docs: powerpc: Add missing documentation reference 2019-09-17 23:59:34 +10:00
process docs: submitting-patches: Fix crossref to 'The canonical patch format' 2022-06-06 08:33:51 +02:00
riscv It's a somewhat calmer cycle for docs this time, as the churn of the mass 2019-09-17 16:22:26 -07:00
s390 Documentation/s390: remove outdated debugging390 documentation 2019-08-21 12:41:43 +02:00
scheduler sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices 2019-08-08 09:09:30 +02:00
scsi scsi: smartpqi: Update attribute name to `driver_version` 2020-01-17 19:48:27 +01:00
security Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity 2019-09-27 19:37:27 -07:00
sh docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
sound ALSA: hda/realtek: Add alc256-samsung-headphone fixup 2022-04-15 14:18:25 +02:00
sparc docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
sphinx tweewide: Fix most Shebang lines 2021-05-22 11:38:30 +02:00
sphinx-static
spi spi: docs: convert to ReST and add it to the kABI bookset 2019-07-31 14:13:13 -06:00
target tweewide: Fix most Shebang lines 2021-05-22 11:38:30 +02:00
timers docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
trace tracing/histogram: Rename "cpu" to "common_cpu" 2021-07-28 13:31:00 +02:00
translations doc: arm64: fix grammar dtb placed in no attributes region 2019-09-06 08:44:34 -06:00
usb USB: rio500: Remove Rio 500 kernel driver 2019-10-04 10:53:36 +02:00
userspace-api Documentation: seccomp: Fix user notification documentation 2021-06-03 08:59:03 +02:00
virt KVM: X86: MMU: Use the correct inherited permissions to get shadow page 2021-08-15 13:08:04 +02:00
virtual cpuidle: add haltpoll governor 2019-07-30 17:27:37 +02:00
vm arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL 2022-05-15 19:54:46 +02:00
w1 docs: w1: convert to ReST and add to the kAPI group of docs 2019-07-31 14:16:17 -06:00
watchdog linux-watchdog 5.4-rc1 tag 2019-09-27 11:17:38 -07:00
x86 x86/CPU/AMD: Save AMD NodeId as cpu_die_id 2020-12-30 11:51:47 +01:00
xtensa xtensa: fix TLBTEMP area placement 2020-11-24 13:29:22 +01:00
.gitignore
COPYING-logo docs: logo.txt: rename it to COPYING-logo 2019-07-15 09:20:27 -03:00
Changes
CodingStyle
DMA-API-HOWTO.txt docs: DMA-API-HOWTO.txt: fix an unmarked code block 2019-07-15 09:20:24 -03:00
DMA-API.txt dma-mapping: remove dma_release_declared_memory 2019-09-04 11:13:19 +02:00
DMA-ISA-LPC.txt
DMA-attributes.txt Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE"" 2022-05-25 09:14:38 +02:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
Kconfig
Makefile
SubmittingPatches
asm-annotations.rst linkage: Introduce new macros for assembler symbols 2020-11-10 12:37:24 +01:00
atomic_bitops.txt
atomic_t.txt Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 16:12:03 -07:00
bus-virt-phys-mapping.txt
conf.py docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0 2022-06-14 18:11:49 +02:00
crc32.txt
debugging-modules.txt
debugging-via-ohci1394.txt
digsig.txt
docutils.conf doc-rst: Add missing newline at end of file 2019-06-20 14:16:56 -06:00
dontdiff kbuild: create *.mod with full directory path and remove MODVERDIR 2019-07-18 02:19:31 +09:00
futex-requeue-pi.txt
hwspinlock.txt hwspinlock: add the 'in_atomic' API 2019-06-29 21:08:14 -07:00
index.rst linkage: Introduce new macros for assembler symbols 2020-11-10 12:37:24 +01:00
io-mapping.txt
io_ordering.txt
irqflags-tracing.txt
kobject.txt
kprobes.txt
kref.txt
logo.gif
lzo.txt lib/lzo: fix ambiguous encoding bug in lzo-rle 2020-06-17 16:40:28 +02:00
mailbox.txt
memory-barriers.txt docs: fix broken doc references due to renames 2019-07-17 06:57:51 -03:00
nommu-mmap.txt
padata.txt padata: allocate workqueue internally 2019-09-13 21:15:39 +10:00
percpu-rw-semaphore.txt
pi-futex.txt docs: locking: convert docs to ReST and rename to *.rst 2019-07-15 08:53:27 -03:00
preempt-locking.txt
rbtree.txt docs: rbtree.txt: fix Sphinx build warnings 2019-07-15 09:20:24 -03:00
remoteproc.txt remoteproc: add vendor resources handling 2019-06-29 12:02:17 -07:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
speculation.txt
static-keys.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
xz.txt