Go to file
Rick Edgecombe 048cc4a028 KVM: x86/mmu: x86: Don't overflow lpage_info when checking attributes
commit 992b54bd08 upstream.

Fix KVM_SET_MEMORY_ATTRIBUTES to not overflow lpage_info array and trigger
KASAN splat, as seen in the private_mem_conversions_test selftest.

When memory attributes are set on a GFN range, that range will have
specific properties applied to the TDP. A huge page cannot be used when
the attributes are inconsistent, so they are disabled for those the
specific huge pages. For internal KVM reasons, huge pages are also not
allowed to span adjacent memslots regardless of whether the backing memory
could be mapped as huge.

What GFNs support which huge page sizes is tracked by an array of arrays
'lpage_info' on the memslot, of ‘kvm_lpage_info’ structs. Each index of
lpage_info contains a vmalloc allocated array of these for a specific
supported page size. The kvm_lpage_info denotes whether a specific huge
page (GFN and page size) on the memslot is supported. These arrays include
indices for unaligned head and tail huge pages.

Preventing huge pages from spanning adjacent memslot is covered by
incrementing the count in head and tail kvm_lpage_info when the memslot is
allocated, but disallowing huge pages for memory that has mixed attributes
has to be done in a more complicated way. During the
KVM_SET_MEMORY_ATTRIBUTES ioctl KVM updates lpage_info for each memslot in
the range that has mismatched attributes. KVM does this a memslot at a
time, and marks a special bit, KVM_LPAGE_MIXED_FLAG, in the kvm_lpage_info
for any huge page. This bit is essentially a permanently elevated count.
So huge pages will not be mapped for the GFN at that page size if the
count is elevated in either case: a huge head or tail page unaligned to
the memslot or if KVM_LPAGE_MIXED_FLAG is set because it has mixed
attributes.

To determine whether a huge page has consistent attributes, the
KVM_SET_MEMORY_ATTRIBUTES operation checks an xarray to make sure it
consistently has the incoming attribute. Since level - 1 huge pages are
aligned to level huge pages, it employs an optimization. As long as the
level - 1 huge pages are checked first, it can just check these and assume
that if each level - 1 huge page contained within the level sized huge
page is not mixed, then the level size huge page is not mixed. This
optimization happens in the helper hugepage_has_attrs().

Unfortunately, although the kvm_lpage_info array representing page size
'level' will contain an entry for an unaligned tail page of size level,
the array for level - 1  will not contain an entry for each GFN at page
size level. The level - 1 array will only contain an index for any
unaligned region covered by level - 1 huge page size, which can be a
smaller region. So this causes the optimization to overflow the level - 1
kvm_lpage_info and perform a vmalloc out of bounds read.

In some cases of head and tail pages where an overflow could happen,
callers skip the operation completely as KVM_LPAGE_MIXED_FLAG is not
required to prevent huge pages as discussed earlier. But for memslots that
are smaller than the 1GB page size, it does call hugepage_has_attrs(). In
this case the huge page is both the head and tail page. The issue can be
observed simply by compiling the kernel with CONFIG_KASAN_VMALLOC and
running the selftest “private_mem_conversions_test”, which produces the
output like the following:

BUG: KASAN: vmalloc-out-of-bounds in hugepage_has_attrs+0x7e/0x110
Read of size 4 at addr ffffc900000a3008 by task private_mem_con/169
Call Trace:
  dump_stack_lvl
  print_report
  ? __virt_addr_valid
  ? hugepage_has_attrs
  ? hugepage_has_attrs
  kasan_report
  ? hugepage_has_attrs
  hugepage_has_attrs
  kvm_arch_post_set_memory_attributes
  kvm_vm_ioctl

It is a little ambiguous whether the unaligned head page (in the bug case
also the tail page) should be expected to have KVM_LPAGE_MIXED_FLAG set.
It is not functionally required, as the unaligned head/tail pages will
already have their kvm_lpage_info count incremented. The comments imply
not setting it on unaligned head pages is intentional, so fix the callers
to skip trying to set KVM_LPAGE_MIXED_FLAG in this case, and in doing so
not call hugepage_has_attrs().

Cc: stable@vger.kernel.org
Fixes: 90b4fe1798 ("KVM: x86: Disallow hugepages when memory attributes are mixed")
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Chao Peng <chao.p.peng@linux.intel.com>
Link: https://lore.kernel.org/r/20240314212902.2762507-1-rick.p.edgecombe@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27 17:13:01 +02:00
Documentation x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto 2024-04-17 11:23:41 +02:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
arch KVM: x86/mmu: x86: Don't overflow lpage_info when checking attributes 2024-04-27 17:13:01 +02:00
block block: propagate partition scanning errors to the BLKRRPART ioctl 2024-04-27 17:12:57 +02:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto Revert "crypto: pkcs7 - remove sha1 support" 2024-04-03 15:32:31 +02:00
drivers speakup: Avoid crash on very long word 2024-04-27 17:13:01 +02:00
fs fs: sysfs: Fix reference leak in sysfs_break_active_protection() 2024-04-27 17:13:01 +02:00
include sched: Add missing memory barrier in switch_mm_cid 2024-04-27 17:13:01 +02:00
init fs/proc: Skip bootloader comment if no embedded kernel parameters 2024-04-17 11:23:36 +02:00
io_uring io_uring: Fix io_cqring_wait() not restoring sigmask on get_timespec64() failure 2024-04-27 17:12:47 +02:00
ipc shm: Slim down dependencies 2023-12-20 19:26:31 -05:00
kernel sched: Add missing memory barrier in switch_mm_cid 2024-04-27 17:13:01 +02:00
lib lib: checksum: hide unused expected_csum_ipv6_magic[] 2024-04-17 11:23:28 +02:00
mm userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE 2024-04-27 17:12:54 +02:00
net net/sched: Fix mirred deadlock on device recursion 2024-04-27 17:12:53 +02:00
rust Rust changes for v6.8 2024-01-11 13:05:41 -08:00
samples work around gcc bugs with 'asm goto' with outputs 2024-02-09 15:57:48 -08:00
scripts gcc-plugins/stackleak: Avoid .head.text section 2024-04-13 13:10:11 +02:00
security selinux: avoid dereference of garbage after mount failure 2024-04-10 16:38:01 +02:00
sound ALSA: hda/realtek - Enable audio jacks of Haier Boyue G42 with ALC269VC 2024-04-27 17:12:57 +02:00
tools selftests/powerpc/papr-vpd: Fix missing variable initialization 2024-04-27 17:12:56 +02:00
usr Kbuild updates for v6.8 2024-01-18 17:57:07 -08:00
virt KVM: Always flush async #PF workqueue when vCPU is being destroyed 2024-04-03 15:32:03 +02:00
.clang-format clang-format: Update with v6.7-rc4's `for_each` macro list 2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.mailmap drm fixes for 6.8 final 2024-03-08 12:44:56 -08:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: supplement of zswap maintainers update 2024-01-25 23:52:21 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS drm fixes for 6.8 final 2024-03-08 12:44:56 -08:00
Makefile Linux 6.8.7 2024-04-17 11:23:43 +02:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.