linux-stable/Documentation
Lai Jiangshan cea9e8ee3b KVM: X86: MMU: Use the correct inherited permissions to get shadow page
commit b1bd5cba33 upstream.

When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logical AND of its parents'
permissions.  Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different.  KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries.  Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)

  pud1 and pud2 point to the same pmd table, so:
  - ptr1 and ptr3 point to the same page.
  - ptr2 and ptr4 point to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
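
The permissions shown next to each ptr in the diagram can be reproduced by
ANDing the permissions of every entry on the walk.  The following is only a
minimal sketch of that rule, not KVM code: the PERM_* bits and
effective_access() are hypothetical names invented for illustration, and the
levels above the pud entries are assumed to be fully permissive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical permission bits for this sketch; these are not KVM's own definitions. */
enum { PERM_U = 1 << 2, PERM_W = 1 << 1, PERM_X = 1 << 0 };

/*
 * AND together the permissions of every entry visited by a walk.
 * entry_perms[] is ordered from the topmost entry down to the leaf pte.
 */
static unsigned effective_access(const unsigned *entry_perms, size_t n)
{
	unsigned access = PERM_U | PERM_W | PERM_X;

	assert(entry_perms != NULL && n > 0);
	for (size_t i = 0; i < n; i++)
		access &= entry_perms[i];
	return access;
}
```

With the walk for ptr4 written as { pud2 (u--), pmd2 (uw-), pte2 (uw-) }, the
result is "u--": the write permission granted by pmd2 and pte2 is masked by
pud2, which is why a write through ptr4 must fault.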

- First, the guest reads from ptr1 and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt->access.  "u--" is used also to get
  the pagetable for pud1, instead of "uw-".

- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
  The hypervisor sets up a shadow page for ptr2 with pt->access "uw-"
  even though the pud1 pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- Finally, the guest writes to ptr4.  This causes no vmexit or page fault,
  because pud1's shadow page structures included a "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable can have a similar problem in a virtual
machine without TDP enabled, if the permissions inherited from different
ancestors differ.

In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
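
As a rough illustration of the per-level idea (a sketch under assumptions, not
the actual patch): table_access_at() and last_nonleaf_access() below are
hypothetical helpers, the PERM_* bits are invented for this example, and the
levels above the pud entries are assumed to be fully permissive.  The first
helper gives the access a table inherits from its ancestors only; the second
models the old behaviour of using the last non-leaf value for every level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical permission bits for this sketch; these are not KVM's own definitions. */
enum { PERM_U = 1 << 2, PERM_W = 1 << 1, PERM_X = 1 << 0 };

/*
 * Access inherited by the table that holds entry number 'level' of the walk:
 * the AND of the entries strictly above it, excluding that entry and all of
 * its children.  entry_perms[] is ordered from the topmost entry to the leaf.
 */
static unsigned table_access_at(const unsigned *entry_perms, size_t level)
{
	unsigned access = PERM_U | PERM_W | PERM_X;

	assert(entry_perms != NULL);
	for (size_t i = 0; i < level; i++)
		access &= entry_perms[i];
	return access;
}

/*
 * Model of the old behaviour: one value, the AND of all non-leaf entries
 * (indices 0 .. n-2), applied to the tables at every level.
 */
static unsigned last_nonleaf_access(const unsigned *entry_perms, size_t n)
{
	assert(entry_perms != NULL && n > 1);
	return table_access_at(entry_perms, n - 1);
}
```

For the walks { pud1 (uw-), pmd1 (u--), pte1 (uw-) } and
{ pud2 (u--), pmd1 (u--), pte1 (uw-) }, table_access_at(walk, 1) yields "uw-"
and "u--" respectively for the shared pmd table, so the two walks would look
up different shadow pages.  last_nonleaf_access(walk, 3) yields "u--" for
both, which is the collision described in the example above.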

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem had existed long before the commit 41074d07c7 ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[OP: - apply arch/x86/kvm/mmu/* changes to arch/x86/kvm
     - apply documentation changes to Documentation/virtual/kvm/mmu.txt
     - add vcpu parameter to gpte_access() call]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-03 09:56:26 +02:00
ABI iio: improve IIO_CONCENTRATION channel type description 2020-08-21 09:48:07 +02:00
accounting
acpi
admin-guide USB: UAS: introduce a quirk to set no_write_same 2020-12-29 13:46:46 +01:00
aoe
arm ARM: 8833/1: Ensure that NEON code always compiles with Clang 2019-04-05 22:31:34 +02:00
arm64 arm64: Expose Arm v8.4 features 2019-10-29 09:17:10 +01:00
auxdisplay
backlight
blackfin
block
blockdev SCSI misc on 20170907 2017-09-07 21:11:05 -07:00
bus-devices
cdrom
cgroup-v1
cma
connector
console
core-api doc: Fix RCU's docbook options 2017-10-19 22:26:11 -04:00
cpu-freq cpufreq: docs: Drop intel-pstate.txt from index.txt 2017-09-28 02:08:43 +02:00
cpuidle
cris
crypto
dev-tools kmemcheck: rip it out 2018-02-22 15:42:24 +01:00
device-mapper dm thin: fix documentation relative to low water mark threshold 2018-04-26 11:02:07 +02:00
devicetree dt-bindings: net: btusb: DT fix s/interrupt-name/interrupt-names/ 2021-03-07 11:27:43 +01:00
dmaengine Merge branch 'topic/dmatest' into for-linus 2017-09-06 21:55:10 +05:30
doc-guide
driver-api ata: make qc_prep return ata_completion_errors 2020-10-01 13:12:52 +02:00
driver-model driver core: remove DRIVER_ATTR 2017-09-19 09:20:33 +02:00
early-userspace
EDID
extcon
fault-injection
fb
features
filesystems locks: print a warning when mount fails due to lack of "mand" support 2021-08-26 08:37:10 -04:00
firmware_class
fmc
fpga
frv
gpio Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2017-09-07 13:39:21 -07:00
gpu
hid HID: doc: fix wrong data structure reference for UHID_OUTPUT 2019-12-05 15:37:31 +01:00
hwmon hwmon: (ina2xx) fix sysfs shunt resistor read access 2018-10-03 17:00:58 -07:00
i2c i2c: i801: Add support for Intel Cedar Fork 2017-10-05 14:44:56 +02:00
ia64
ide
iio
infiniband
input
ioctl
isdn
kbuild kbuild: delete INSTALL_FW_PATH from kbuild documentation 2018-07-17 11:39:30 +02:00
kdump
kernel-hacking
laptops
leds
lightnvm
livepatch
locking
m68k
md
media media: videodev2.h: RGB BT2020 and HSV are always full range 2020-11-05 11:06:54 +01:00
memory-devices
metag
mic
mips
misc-devices
mmc
mn10300
mtd
namespaces
netlabel
networking icmp: randomize the global rate limiter 2020-10-29 09:06:59 +01:00
nfc
nios2
nvdimm
nvmem
parisc
PCI
pcmcia
perf
phy
platform
power
powerpc
pps drivers/pps: aesthetic tweaks to PPS-related content 2017-09-08 18:26:51 -07:00
process kbuild: verify that $DEPMOD is installed 2018-08-17 21:01:10 +02:00
pti
ptp
rapidio
RCU
s390
scheduler sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices 2019-11-12 19:18:29 +01:00
scsi
security
serial
sh
sound ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256 2020-04-24 08:00:35 +02:00
sparc
sphinx tweewide: Fix most Shebang lines 2021-06-03 08:36:11 +02:00
sphinx-static
spi
sysctl bpf: add bpf_jit_limit knob to restrict unpriv allocations 2019-08-25 10:50:03 +02:00
target tweewide: Fix most Shebang lines 2021-06-03 08:36:11 +02:00
thermal
timers
trace tweewide: Fix most Shebang lines 2021-06-03 08:36:11 +02:00
translations kokr/memory-barriers.txt: Apply atomic_t.txt change 2017-09-08 10:10:53 -06:00
usb USB: rio500: Remove Rio 500 kernel driver 2019-10-17 13:43:20 -07:00
userspace-api Documentation: Add section about CPU vulnerabilities for Spectre 2019-07-21 09:04:31 +02:00
virtual KVM: X86: MMU: Use the correct inherited permissions to get shadow page 2021-09-03 09:56:26 +02:00
vm hmm: heterogeneous memory management documentation 2017-09-08 18:26:45 -07:00
w1
watchdog watchdog: Revert "iTCO_wdt: all versions count down twice" 2017-09-09 17:41:24 +02:00
wimax
x86 x86/speculation/taa: Add documentation for TSX Async Abort 2019-11-12 19:19:02 +01:00
xtensa
.gitignore
00-INDEX
atomic_bitops.txt
atomic_t.txt x86/atomic: Fix smp_mb__{before,after}_atomic() 2019-07-31 07:28:26 +02:00
bcache.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt
Changes
circular-buffers.txt
clk.txt
CodingStyle
conf.py docs: Fix conf.py for Sphinx 2.0 2019-06-09 09:18:17 +02:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff
efi-stub.txt
eisa.txt
errseq.rst
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst x86/speculation/mds: Add mds_clear_cpu_buffers() 2019-05-14 19:18:43 +02:00
Intel-IOMMU.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-doc-nano-HOWTO.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt
mailbox.txt
Makefile
memory-barriers.txt Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:52:29 -07:00
memory-hotplug.txt
men-chameleon-bus.txt
nommu-mmap.txt
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
printk-formats.txt lib/vsprintf: Remove atomic-unsafe support for %pCr 2018-07-03 11:24:48 +02:00
pwm.txt
rbtree.txt rbtree: cache leftmost node internally 2017-09-08 18:26:48 -07:00
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt futex: Update comments and docs about return values of arch futex code 2019-07-03 13:16:03 +02:00
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt Documentation: Document array_index_nospec 2018-02-07 11:12:22 -08:00
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-08-03 07:50:22 +02:00
vfio.txt
video-output.txt
xillybus.txt
xz.txt
zorro.txt