Commit Graph

119 Commits

Author SHA1 Message Date
Linus Torvalds 4f712ee0cb S390:
* Changes to FPU handling came in via the main s390 pull request
 
 * Only deliver to the guest the SCLP events that userspace has
   requested.
 
 * More virtual vs physical address fixes (only a cleanup since
   virtual and physical address spaces are currently the same).
 
 * Fix selftests undefined behavior.
 
 x86:
 
 * Fix a restriction that the guest can't program a PMU event whose
   encoding matches an architectural event that isn't included in the
   guest CPUID.  The enumeration of an architectural event only says
   that if a CPU supports an architectural event, then the event can be
   programmed *using the architectural encoding*.  The enumeration does
   NOT say anything about the encoding when the CPU doesn't report support
   the event *in general*.  It might support it, and it might support it
   using the same encoding that made it into the architectural PMU spec.
 
 * Fix a variety of bugs in KVM's emulation of RDPMC (more details on
   individual commits) and add a selftest to verify KVM correctly emulates
   RDMPC, counter availability, and a variety of other PMC-related
   behaviors that depend on guest CPUID and therefore are easier to
   validate with selftests than with custom guests (aka kvm-unit-tests).
 
 * Zero out PMU state on AMD if the virtual PMU is disabled, it does not
   cause any bug but it wastes time in various cases where KVM would check
   if a PMC event needs to be synthesized.
 
 * Optimize triggering of emulated events, with a nice ~10% performance
   improvement in VM-Exit microbenchmarks when a vPMU is exposed to the
   guest.
 
 * Tighten the check for "PMI in guest" to reduce false positives if an NMI
   arrives in the host while KVM is handling an IRQ VM-Exit.
 
 * Fix a bug where KVM would report stale/bogus exit qualification information
   when exiting to userspace with an internal error exit code.
 
 * Add a VMX flag in /proc/cpuinfo to report 5-level EPT support.
 
 * Rework TDP MMU root unload, free, and alloc to run with mmu_lock held for
   read, e.g. to avoid serializing vCPUs when userspace deletes a memslot.
 
 * Tear down TDP MMU page tables at 4KiB granularity (used to be 1GiB).  KVM
   doesn't support yielding in the middle of processing a zap, and 1GiB
   granularity resulted in multi-millisecond lags that are quite impolite
   for CONFIG_PREEMPT kernels.
 
 * Allocate write-tracking metadata on-demand to avoid the memory overhead when
   a kernel is built with i915 virtualization support but the workloads use
   neither shadow paging nor i915 virtualization.
 
 * Explicitly initialize a variety of on-stack variables in the emulator that
   triggered KMSAN false positives.
 
 * Fix the debugregs ABI for 32-bit KVM.
 
 * Rework the "force immediate exit" code so that vendor code ultimately decides
   how and when to force the exit, which allowed some optimization for both
   Intel and AMD.
 
 * Fix a long-standing bug where kvm_has_noapic_vcpu could be left elevated if
   vCPU creation ultimately failed, causing extra unnecessary work.
 
 * Cleanup the logic for checking if the currently loaded vCPU is in-kernel.
 
 * Harden against underflowing the active mmu_notifier invalidation
   count, so that "bad" invalidations (usually due to bugs elsehwere in the
   kernel) are detected earlier and are less likely to hang the kernel.
 
 x86 Xen emulation:
 
 * Overlay pages can now be cached based on host virtual address,
   instead of guest physical addresses.  This removes the need to
   reconfigure and invalidate the cache if the guest changes the
   gpa but the underlying host virtual address remains the same.
 
 * When possible, use a single host TSC value when computing the deadline for
   Xen timers in order to improve the accuracy of the timer emulation.
 
 * Inject pending upcall events when the vCPU software-enables its APIC to fix
   a bug where an upcall can be lost (and to follow Xen's behavior).
 
 * Fall back to the slow path instead of warning if "fast" IRQ delivery of Xen
   events fails, e.g. if the guest has aliased xAPIC IDs.
 
 RISC-V:
 
 * Support exception and interrupt handling in selftests
 
 * New self test for RISC-V architectural timer (Sstc extension)
 
 * New extension support (Ztso, Zacas)
 
 * Support userspace emulation of random number seed CSRs.
 
 ARM:
 
 * Infrastructure for building KVM's trap configuration based on the
   architectural features (or lack thereof) advertised in the VM's ID
   registers
 
 * Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
   x86's WC) at stage-2, improving the performance of interacting with
   assigned devices that can tolerate it
 
 * Conversion of KVM's representation of LPIs to an xarray, utilized to
   address serialization some of the serialization on the LPI injection
   path
 
 * Support for _architectural_ VHE-only systems, advertised through the
   absence of FEAT_E2H0 in the CPU's ID register
 
 * Miscellaneous cleanups, fixes, and spelling corrections to KVM and
   selftests
 
 LoongArch:
 
 * Set reserved bits as zero in CPUCFG.
 
 * Start SW timer only when vcpu is blocking.
 
 * Do not restart SW timer when it is expired.
 
 * Remove unnecessary CSR register saving during enter guest.
 
 * Misc cleanups and fixes as usual.
 
 Generic:
 
 * cleanup Kconfig by removing CONFIG_HAVE_KVM, which was basically always
   true on all architectures except MIPS (where Kconfig determines the
   available depending on CPU capabilities).  It is replaced either by
   an architecture-dependent symbol for MIPS, and IS_ENABLED(CONFIG_KVM)
   everywhere else.
 
 * Factor common "select" statements in common code instead of requiring
   each architecture to specify it
 
 * Remove thoroughly obsolete APIs from the uapi headers.
 
 * Move architecture-dependent stuff to uapi/asm/kvm.h
 
 * Always flush the async page fault workqueue when a work item is being
   removed, especially during vCPU destruction, to ensure that there are no
   workers running in KVM code when all references to KVM-the-module are gone,
   i.e. to prevent a very unlikely use-after-free if kvm.ko is unloaded.
 
 * Grab a reference to the VM's mm_struct in the async #PF worker itself instead
   of gifting the worker a reference, so that there's no need to remember
   to *conditionally* clean up after the worker.
 
 Selftests:
 
 * Reduce boilerplate especially when utilize selftest TAP infrastructure.
 
 * Add basic smoke tests for SEV and SEV-ES, along with a pile of library
   support for handling private/encrypted/protected memory.
 
 * Fix benign bugs where tests neglect to close() guest_memfd files.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmX0iP8UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroND7wf+JZoNvwZ+bmwWe/4jn/YwNoYi/C5z
 eypn8M1gsWEccpCpqPBwznVm9T29rF4uOlcMvqLEkHfTpaL1EKUUjP1lXPz/ileP
 6a2RdOGxAhyTiFC9fjy+wkkjtLbn1kZf6YsS0hjphP9+w0chNbdn0w81dFVnXryd
 j7XYI8R/bFAthNsJOuZXSEjCfIHxvTTG74OrTf1B1FEBB+arPmrgUeJftMVhffQK
 Sowgg8L/Ii/x6fgV5NZQVSIyVf1rp8z7c6UaHT4Fwb0+RAMW8p9pYv9Qp1YkKp8y
 5j0V9UzOHP7FRaYimZ5BtwQoqiZXYylQ+VuU/Y2f4X85cvlLzSqxaEMAPA==
 =mqOV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "S390:

   - Changes to FPU handling came in via the main s390 pull request

   - Only deliver to the guest the SCLP events that userspace has
     requested

   - More virtual vs physical address fixes (only a cleanup since
     virtual and physical address spaces are currently the same)

   - Fix selftests undefined behavior

  x86:

   - Fix a restriction that the guest can't program a PMU event whose
     encoding matches an architectural event that isn't included in the
     guest CPUID. The enumeration of an architectural event only says
     that if a CPU supports an architectural event, then the event can
     be programmed *using the architectural encoding*. The enumeration
     does NOT say anything about the encoding when the CPU doesn't
     report support the event *in general*. It might support it, and it
     might support it using the same encoding that made it into the
     architectural PMU spec

   - Fix a variety of bugs in KVM's emulation of RDPMC (more details on
     individual commits) and add a selftest to verify KVM correctly
     emulates RDMPC, counter availability, and a variety of other
     PMC-related behaviors that depend on guest CPUID and therefore are
     easier to validate with selftests than with custom guests (aka
     kvm-unit-tests)

   - Zero out PMU state on AMD if the virtual PMU is disabled, it does
     not cause any bug but it wastes time in various cases where KVM
     would check if a PMC event needs to be synthesized

   - Optimize triggering of emulated events, with a nice ~10%
     performance improvement in VM-Exit microbenchmarks when a vPMU is
     exposed to the guest

   - Tighten the check for "PMI in guest" to reduce false positives if
     an NMI arrives in the host while KVM is handling an IRQ VM-Exit

   - Fix a bug where KVM would report stale/bogus exit qualification
     information when exiting to userspace with an internal error exit
     code

   - Add a VMX flag in /proc/cpuinfo to report 5-level EPT support

   - Rework TDP MMU root unload, free, and alloc to run with mmu_lock
     held for read, e.g. to avoid serializing vCPUs when userspace
     deletes a memslot

   - Tear down TDP MMU page tables at 4KiB granularity (used to be
     1GiB). KVM doesn't support yielding in the middle of processing a
     zap, and 1GiB granularity resulted in multi-millisecond lags that
     are quite impolite for CONFIG_PREEMPT kernels

   - Allocate write-tracking metadata on-demand to avoid the memory
     overhead when a kernel is built with i915 virtualization support
     but the workloads use neither shadow paging nor i915 virtualization

   - Explicitly initialize a variety of on-stack variables in the
     emulator that triggered KMSAN false positives

   - Fix the debugregs ABI for 32-bit KVM

   - Rework the "force immediate exit" code so that vendor code
     ultimately decides how and when to force the exit, which allowed
     some optimization for both Intel and AMD

   - Fix a long-standing bug where kvm_has_noapic_vcpu could be left
     elevated if vCPU creation ultimately failed, causing extra
     unnecessary work

   - Cleanup the logic for checking if the currently loaded vCPU is
     in-kernel

   - Harden against underflowing the active mmu_notifier invalidation
     count, so that "bad" invalidations (usually due to bugs elsehwere
     in the kernel) are detected earlier and are less likely to hang the
     kernel

  x86 Xen emulation:

   - Overlay pages can now be cached based on host virtual address,
     instead of guest physical addresses. This removes the need to
     reconfigure and invalidate the cache if the guest changes the gpa
     but the underlying host virtual address remains the same

   - When possible, use a single host TSC value when computing the
     deadline for Xen timers in order to improve the accuracy of the
     timer emulation

   - Inject pending upcall events when the vCPU software-enables its
     APIC to fix a bug where an upcall can be lost (and to follow Xen's
     behavior)

   - Fall back to the slow path instead of warning if "fast" IRQ
     delivery of Xen events fails, e.g. if the guest has aliased xAPIC
     IDs

  RISC-V:

   - Support exception and interrupt handling in selftests

   - New self test for RISC-V architectural timer (Sstc extension)

   - New extension support (Ztso, Zacas)

   - Support userspace emulation of random number seed CSRs

  ARM:

   - Infrastructure for building KVM's trap configuration based on the
     architectural features (or lack thereof) advertised in the VM's ID
     registers

   - Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
     x86's WC) at stage-2, improving the performance of interacting with
     assigned devices that can tolerate it

   - Conversion of KVM's representation of LPIs to an xarray, utilized
     to address serialization some of the serialization on the LPI
     injection path

   - Support for _architectural_ VHE-only systems, advertised through
     the absence of FEAT_E2H0 in the CPU's ID register

   - Miscellaneous cleanups, fixes, and spelling corrections to KVM and
     selftests

  LoongArch:

   - Set reserved bits as zero in CPUCFG

   - Start SW timer only when vcpu is blocking

   - Do not restart SW timer when it is expired

   - Remove unnecessary CSR register saving during enter guest

   - Misc cleanups and fixes as usual

  Generic:

   - Clean up Kconfig by removing CONFIG_HAVE_KVM, which was basically
     always true on all architectures except MIPS (where Kconfig
     determines the available depending on CPU capabilities). It is
     replaced either by an architecture-dependent symbol for MIPS, and
     IS_ENABLED(CONFIG_KVM) everywhere else

   - Factor common "select" statements in common code instead of
     requiring each architecture to specify it

   - Remove thoroughly obsolete APIs from the uapi headers

   - Move architecture-dependent stuff to uapi/asm/kvm.h

   - Always flush the async page fault workqueue when a work item is
     being removed, especially during vCPU destruction, to ensure that
     there are no workers running in KVM code when all references to
     KVM-the-module are gone, i.e. to prevent a very unlikely
     use-after-free if kvm.ko is unloaded

   - Grab a reference to the VM's mm_struct in the async #PF worker
     itself instead of gifting the worker a reference, so that there's
     no need to remember to *conditionally* clean up after the worker

  Selftests:

   - Reduce boilerplate especially when utilize selftest TAP
     infrastructure

   - Add basic smoke tests for SEV and SEV-ES, along with a pile of
     library support for handling private/encrypted/protected memory

   - Fix benign bugs where tests neglect to close() guest_memfd files"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
  selftests: kvm: remove meaningless assignments in Makefiles
  KVM: riscv: selftests: Add Zacas extension to get-reg-list test
  RISC-V: KVM: Allow Zacas extension for Guest/VM
  KVM: riscv: selftests: Add Ztso extension to get-reg-list test
  RISC-V: KVM: Allow Ztso extension for Guest/VM
  RISC-V: KVM: Forward SEED CSR access to user space
  KVM: riscv: selftests: Add sstc timer test
  KVM: riscv: selftests: Change vcpu_has_ext to a common function
  KVM: riscv: selftests: Add guest helper to get vcpu id
  KVM: riscv: selftests: Add exception handling support
  LoongArch: KVM: Remove unnecessary CSR register saving during enter guest
  LoongArch: KVM: Do not restart SW timer when it is expired
  LoongArch: KVM: Start SW timer only when vcpu is blocking
  LoongArch: KVM: Set reserved bits as zero in CPUCFG
  KVM: selftests: Explicitly close guest_memfd files in some gmem tests
  KVM: x86/xen: fix recursive deadlock in timer injection
  KVM: pfncache: simplify locking and make more self-contained
  KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
  KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
  KVM: x86/xen: improve accuracy of Xen timers
  ...
2024-03-15 13:03:13 -07:00
Mark Brown b6c0b424cb arm64/fpsimd: Enable host kernel access to FPMR
FEAT_FPMR provides a new generally accessible architectural register FPMR.
This is only accessible to EL0 and EL1 when HCRX_EL2.EnFPM is set to 1,
do this when the host is running. The guest part will be done along with
context switching the new register and exposing it via guest management.

Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-2-c568edc8ed7f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-03-07 17:14:52 +00:00
Marc Zyngier 84de212d73 KVM: arm64: Make FEAT_MOPS UNDEF if not advertised to the guest
We unconditionally enable FEAT_MOPS, which is obviously wrong.

So let's only do that when it is advertised to the guest.
Which means we need to rely on a per-vcpu HCRX_EL2 shadow register.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-25-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-02-19 17:13:02 +00:00
Marc Zyngier 53d5486114 Merge branch kvm-arm64/fgt-rework into kvmarm-master/next
* kvm-arm64/fgt-rework: (30 commits)
  : .
  : Fine Grain Trapping update, courtesy of Fuad Tabba.
  :
  : From the cover letter:
  :
  : "This patch series has fixes, updates, and code for validating
  : fine grain trap register masks, as well as some fixes to feature
  : trapping in pKVM.
  :
  : New fine grain trap (FGT) bits have been defined in the latest
  : Arm Architecture System Registers xml specification (DDI0601 and
  : DDI0602 2023-09) [1], so the code is updated to reflect them.
  : Moreover, some of the already-defined masks overlap with RES0,
  : which this series fixes.
  :
  : It also adds FGT register masks that weren't defined earlier,
  : handling of HAFGRTR_EL2 in nested virt, as well as build time
  : validation that the bits of the various masks are all accounted
  : for and without overlap."
  :
  : This branch also drags the arm64/for-next/sysregs branch,
  : which is a dependency on this work.
  : .
  KVM: arm64: Trap external trace for protected VMs
  KVM: arm64: Mark PAuth as a restricted feature for protected VMs
  KVM: arm64: Fix which features are marked as allowed for protected VMs
  KVM: arm64: Macros for setting/clearing FGT bits
  KVM: arm64: Define FGT nMASK bits relative to other fields
  KVM: arm64: Use generated FGT RES0 bits instead of specifying them
  KVM: arm64: Add build validation for FGT trap mask values
  KVM: arm64: Update and fix FGT register masks
  KVM: arm64: Handle HAFGRTR_EL2 trapping in nested virt
  KVM: arm64: Add bit masks for HAFGRTR_EL2
  KVM: arm64: Add missing HFGITR_EL2 FGT entries to nested virt
  KVM: arm64: Add missing HFGxTR_EL2 FGT entries to nested virt
  KVM: arm64: Explicitly trap unsupported HFGxTR_EL2 features
  arm64/sysreg: Add missing system instruction definitions for FGT
  arm64/sysreg: Add missing system register definitions for FGT
  arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
  arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
  arm64/sysreg: Add new system registers for GCS
  arm64/sysreg: Add definition for FPMR
  arm64/sysreg: Update HCRX_EL2 definition for DDI0601 2023-09
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-12-18 17:09:32 +00:00
Fuad Tabba 5f6bd3f3da KVM: arm64: Define FGT nMASK bits relative to other fields
Now that RES0 and MASK have full coverage, no need to manually
encode nMASK. Calculate it relative to the other fields.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231214100158.2305400-14-tabba@google.com
2023-12-18 11:25:51 +00:00
Fuad Tabba 9ff67dd26a KVM: arm64: Use generated FGT RES0 bits instead of specifying them
Now that all FGT fields are accounted for and represented, use
the generated value instead of manually specifying them.

For __HFGWTR_EL2_RES0, however, there is no generated value. Its
fields are subset of HFGRTR_EL2, with the remaining being RES0.
Therefore, add a mask that represents the HFGRTR_EL2 only bits
and define __HFGWTR_EL2_* using those and the __HFGRTR_EL2_*
fields.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231214100158.2305400-13-tabba@google.com
2023-12-18 11:25:51 +00:00
Fuad Tabba fc04838f9c KVM: arm64: Update and fix FGT register masks
New trap bits have been defined since the latest update to this
patch.  Moreover, the existing definitions of some of the mask
and the RES0 bits overlap, which could be wrong, confusing, or
both.

Update the bits based on DDI0601 2023-09, and ensure that the
existing bits are consistent.

Subsequent patches will use the generated RES0 fields instead of
specifying them manually. This patch keeps the manual encoding of
the bits to make it easier to review the series.

Fixes: 0fd7686500 ("KVM: arm64: Add nPIR{E0}_EL1 to HFG traps")
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231214100158.2305400-11-tabba@google.com
2023-12-18 11:25:51 +00:00
Fuad Tabba f9d6ed0213 KVM: arm64: Add bit masks for HAFGRTR_EL2
To support HAFGRTR_EL2 supported in nested virt in the following
patch, first add its bitmask definitions based on DDI0601 2023-09.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231214100158.2305400-9-tabba@google.com
2023-12-18 09:22:32 +00:00
Ryan Roberts d4fbbb26da KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2
As per Arm ARM (0487I.a), (V)TCR_EL2.DS fields control whether 52 bit
input and output addresses are supported on 4K and 16K page size
configurations when FEAT_LPA2 is known to have been implemented.

This adds these field definitions which will be used by KVM when
FEAT_LPA2 is enabled.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-7-ryan.roberts@arm.com
2023-11-27 15:03:50 +00:00
Paolo Bonzini 45b890f768 KVM/arm64 updates for 6.7
- Generalized infrastructure for 'writable' ID registers, effectively
    allowing userspace to opt-out of certain vCPU features for its guest
 
  - Optimization for vSGI injection, opportunistically compressing MPIDR
    to vCPU mapping into a table
 
  - Improvements to KVM's PMU emulation, allowing userspace to select
    the number of PMCs available to a VM
 
  - Guest support for memory operation instructions (FEAT_MOPS)
 
  - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
    bugs and getting rid of useless code
 
  - Changes to the way the SMCCC filter is constructed, avoiding wasted
    memory allocations when not in use
 
  - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
    the overhead of errata mitigations
 
  - Miscellaneous kernel and selftest fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd
 FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834
 fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw=
 =R4M7
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.7

 - Generalized infrastructure for 'writable' ID registers, effectively
   allowing userspace to opt-out of certain vCPU features for its guest

 - Optimization for vSGI injection, opportunistically compressing MPIDR
   to vCPU mapping into a table

 - Improvements to KVM's PMU emulation, allowing userspace to select
   the number of PMCs available to a VM

 - Guest support for memory operation instructions (FEAT_MOPS)

 - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
   bugs and getting rid of useless code

 - Changes to the way the SMCCC filter is constructed, avoiding wasted
   memory allocations when not in use

 - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
   the overhead of errata mitigations

 - Miscellaneous kernel and selftest fixes
2023-10-31 16:37:07 -04:00
Joey Gouly 0fd7686500 KVM: arm64: Add nPIR{E0}_EL1 to HFG traps
nPIR_EL1 and nPIREO_EL1 are part of the 'reverse polarity' set of bits, set
them so that we disable the traps for a guest. Unfortunately, these bits
are not yet described in the ARM ARM, but only live in the XML description.

Also add them to the NV FGT forwarding infrastructure.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Fixes: e930694e61 ("KVM: arm64: Restructure FGT register switching")
Cc: Oliver Upton <oliver.upton@linux.dev>
[maz: add entries to the NV FGT array, commit message update]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231012123459.2820835-2-joey.gouly@arm.com
2023-10-12 16:38:50 +01:00
Kristina Martsenko e0bb80c62c KVM: arm64: Expose MOPS instructions to guests
Expose the Armv8.8 FEAT_MOPS feature to guests in the ID register and
allow the MOPS instructions to be run in a guest. Only expose MOPS if
the whole system supports it.

Note, it is expected that guests do not use these instructions on MMIO,
similarly to other instructions where ESR_EL2.ISV==0 such as LDP/STP.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922112508.1774352-3-kristina.martsenko@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-09 19:54:25 +00:00
Marc Zyngier 1f66f1246b Merge branch kvm-arm64/6.6/misc into kvmarm-master/next
* kvm-arm64/6.6/misc:
  : .
  : Misc KVM/arm64 updates for 6.6:
  :
  : - Don't unnecessary align non-stack allocations in the EL2 VA space
  :
  : - Drop HCR_VIRT_EXCP_MASK, which was never used...
  :
  : - Don't use smp_processor_id() in kvm_arch_vcpu_load(),
  :   but the cpu parameter instead
  :
  : - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
  :
  : - Remove prototypes without implementations
  : .
  KVM: arm64: Remove size-order align in the nVHE hyp private VA range
  KVM: arm64: Remove unused declarations
  KVM: arm64: Remove redundant kvm_set_pfn_accessed() from user_mem_abort()
  KVM: arm64: Drop HCR_VIRT_EXCP_MASK
  KVM: arm64: Use the known cpu id instead of smp_processor_id()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-28 09:30:32 +01:00
Marc Zyngier 03fb54d0aa KVM: arm64: nv: Add support for HCRX_EL2
HCRX_EL2 has an interesting effect on HFGITR_EL2, as it conditions
the traps of TLBI*nXS.

Expand the FGT support to add a new Fine Grained Filter that will
get checked when the instruction gets trapped, allowing the shadow
register to override the trap as needed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-29-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier d0be0b2ede KVM: arm64: nv: Add trap forwarding for HDFGxTR_EL2
... and finally, the Debug version of FGT, with its *enormous*
list of trapped registers.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-23-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier 039f9f12de KVM: arm64: nv: Add trap forwarding for HFGITR_EL2
Similarly, implement the trap forwarding for instructions affected
by HFGITR_EL2.

Note that the TLBI*nXS instructions should be affected by HCRX_EL2,
which will be dealt with down the line. Also, ERET* and SVC traps
are handled separately.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-22-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier e930694e61 KVM: arm64: Restructure FGT register switching
As we're about to majorly extend the handling of FGT registers,
restructure the code to actually save/restore the registers
as required. This is made easy thanks to the previous addition
of the EL2 registers, allowing us to use the host context for
this purpose.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-14-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier 3ea84b4fe4 KVM: arm64: Add missing HCR_EL2 trap bits
We're still missing a handfull of HCR_EL2 trap bits. Add them.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-12-maz@kernel.org
2023-08-17 10:00:27 +01:00
Zenghui Yu e21f3905f9 KVM: arm64: Drop HCR_VIRT_EXCP_MASK
This was introduced in commit 0369f6a34b ("arm64: KVM: EL2 register
definitions") and for more than 10 years nobody used. Remove it.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230724142257.1551-1-yuzenghui@huawei.com
2023-07-28 09:50:38 +01:00
Linus Torvalds e8069f5a8e ARM64:
* Eager page splitting optimization for dirty logging, optionally
   allowing for a VM to avoid the cost of hugepage splitting in the stage-2
   fault path.
 
 * Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
   services that live in the Secure world. pKVM intervenes on FF-A calls
   to guarantee the host doesn't misuse memory donated to the hyp or a
   pKVM guest.
 
 * Support for running the split hypervisor with VHE enabled, known as
   'hVHE' mode. This is extremely useful for testing the split
   hypervisor on VHE-only systems, and paves the way for new use cases
   that depend on having two TTBRs available at EL2.
 
 * Generalized framework for configurable ID registers from userspace.
   KVM/arm64 currently prevents arbitrary CPU feature set configuration
   from userspace, but the intent is to relax this limitation and allow
   userspace to select a feature set consistent with the CPU.
 
 * Enable the use of Branch Target Identification (FEAT_BTI) in the
   hypervisor.
 
 * Use a separate set of pointer authentication keys for the hypervisor
   when running in protected mode, as the host is untrusted at runtime.
 
 * Ensure timer IRQs are consistently released in the init failure
   paths.
 
 * Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
   (FEAT_EVT), as it is a register commonly read from userspace.
 
 * Erratum workaround for the upcoming AmpereOne part, which has broken
   hardware A/D state management.
 
 RISC-V:
 
 * Redirect AMO load/store misaligned traps to KVM guest
 
 * Trap-n-emulate AIA in-kernel irqchip for KVM guest
 
 * Svnapot support for KVM Guest
 
 s390:
 
 * New uvdevice secret API
 
 * CMM selftest and fixes
 
 * fix racy access to target CPU for diag 9c
 
 x86:
 
 * Fix missing/incorrect #GP checks on ENCLS
 
 * Use standard mmu_notifier hooks for handling APIC access page
 
 * Drop now unnecessary TR/TSS load after VM-Exit on AMD
 
 * Print more descriptive information about the status of SEV and SEV-ES during
   module load
 
 * Add a test for splitting and reconstituting hugepages during and after
   dirty logging
 
 * Add support for CPU pinning in demand paging test
 
 * Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
   included along the way
 
 * Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
   recovery threads (because nx_huge_pages=off can be toggled at runtime)
 
 * Move handling of PAT out of MTRR code and dedup SVM+VMX code
 
 * Fix output of PIC poll command emulation when there's an interrupt
 
 * Add a maintainer's handbook to document KVM x86 processes, preferred coding
   style, testing expectations, etc.
 
 * Misc cleanups, fixes and comments
 
 Generic:
 
 * Miscellaneous bugfixes and cleanups
 
 Selftests:
 
 * Generate dependency files so that partial rebuilds work as expected
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
 6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
 F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
 FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
 LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
 m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
 =AMXp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Eager page splitting optimization for dirty logging, optionally
     allowing for a VM to avoid the cost of hugepage splitting in the
     stage-2 fault path.

   - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
     with services that live in the Secure world. pKVM intervenes on
     FF-A calls to guarantee the host doesn't misuse memory donated to
     the hyp or a pKVM guest.

   - Support for running the split hypervisor with VHE enabled, known as
     'hVHE' mode. This is extremely useful for testing the split
     hypervisor on VHE-only systems, and paves the way for new use cases
     that depend on having two TTBRs available at EL2.

   - Generalized framework for configurable ID registers from userspace.
     KVM/arm64 currently prevents arbitrary CPU feature set
     configuration from userspace, but the intent is to relax this
     limitation and allow userspace to select a feature set consistent
     with the CPU.

   - Enable the use of Branch Target Identification (FEAT_BTI) in the
     hypervisor.

   - Use a separate set of pointer authentication keys for the
     hypervisor when running in protected mode, as the host is untrusted
     at runtime.

   - Ensure timer IRQs are consistently released in the init failure
     paths.

   - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
     Traps (FEAT_EVT), as it is a register commonly read from userspace.

   - Erratum workaround for the upcoming AmpereOne part, which has
     broken hardware A/D state management.

  RISC-V:

   - Redirect AMO load/store misaligned traps to KVM guest

   - Trap-n-emulate AIA in-kernel irqchip for KVM guest

   - Svnapot support for KVM Guest

  s390:

   - New uvdevice secret API

   - CMM selftest and fixes

   - fix racy access to target CPU for diag 9c

  x86:

   - Fix missing/incorrect #GP checks on ENCLS

   - Use standard mmu_notifier hooks for handling APIC access page

   - Drop now unnecessary TR/TSS load after VM-Exit on AMD

   - Print more descriptive information about the status of SEV and
     SEV-ES during module load

   - Add a test for splitting and reconstituting hugepages during and
     after dirty logging

   - Add support for CPU pinning in demand paging test

   - Add support for AMD PerfMonV2, with a variety of cleanups and minor
     fixes included along the way

   - Add a "nx_huge_pages=never" option to effectively avoid creating NX
     hugepage recovery threads (because nx_huge_pages=off can be toggled
     at runtime)

   - Move handling of PAT out of MTRR code and dedup SVM+VMX code

   - Fix output of PIC poll command emulation when there's an interrupt

   - Add a maintainer's handbook to document KVM x86 processes,
     preferred coding style, testing expectations, etc.

   - Misc cleanups, fixes and comments

  Generic:

   - Miscellaneous bugfixes and cleanups

  Selftests:

   - Generate dependency files so that partial rebuilds work as
     expected"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
  Documentation/process: Add a maintainer handbook for KVM x86
  Documentation/process: Add a label for the tip tree handbook's coding style
  KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
  RISC-V: KVM: Remove unneeded semicolon
  RISC-V: KVM: Allow Svnapot extension for Guest/VM
  riscv: kvm: define vcpu_sbi_ext_pmu in header
  RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
  RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel emulation of AIA APLIC
  RISC-V: KVM: Implement device interface for AIA irqchip
  RISC-V: KVM: Skeletal in-kernel AIA irqchip support
  RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
  RISC-V: KVM: Add APLIC related defines
  RISC-V: KVM: Add IMSIC related defines
  RISC-V: KVM: Implement guest external interrupt line management
  KVM: x86: Remove PRIx* definitions as they are solely for user space
  s390/uv: Update query for secret-UVCs
  s390/uv: replace scnprintf with sysfs_emit
  s390/uvdevice: Add 'Lock Secret Store' UVC
  ...
2023-07-03 15:32:22 -07:00
Oliver Upton e1e315c4d5 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Avoid trapping CTR_EL0 on systems with FEAT_EVT, as the register is
  :    commonly read by userspace
  :
  :  - Make use of FEAT_BTI at hyp stage-1, setting the Guard Page bit to 1
  :    for executable mappings
  :
  :  - Use a separate set of pointer authentication keys for the hypervisor
  :    when running in protected mode (i.e. pKVM)
  :
  :  - Plug a few holes in timer initialization where KVM fails to free the
  :    timer IRQ(s)
  KVM: arm64: Use different pointer authentication keys for pKVM
  KVM: arm64: timers: Fix resource leaks in kvm_timer_hyp_init()
  KVM: arm64: Use BTI for nvhe
  KVM: arm64: Relax trapping of CTR_EL0 when FEAT_EVT is available

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 13:09:43 +00:00
Marc Zyngier 75c76ab5a6 KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration
Just like we repainted the early arm64 code, we need to update
the CPTR_EL2 accesses that are taking place in the nVHE code
when hVHE is used, making them look as if they were CPACR_EL1
accesses. Just like the VHE code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230609162200.2024064-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12 23:17:24 +00:00
Joey Gouly 7df7170965 arm64: disable EL2 traps for PIE
Disable trapping of TCR2_EL1 and PIRx_EL1 registers, so they can be
accessed from by EL1.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230606145859.697944-15-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 16:52:41 +01:00
Kristina Martsenko b1319c0e95 arm64: mops: don't disable host MOPS instructions from EL2
To allow nVHE host EL0 and EL1 to use FEAT_MOPS instructions, configure
EL2 to not cause these instructions to be treated as UNDEFINED. A VHE
host is unaffected by this control.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20230509142235.3284028-6-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05 17:05:31 +01:00
Kristina Martsenko 306b4c9f71 KVM: arm64: switch HCRX_EL2 between host and guest
Switch the HCRX_EL2 register between host and guest configurations, in
order to enable different features in the host and guest.

Now that there are separate guest flags, we can also remove SMPME from
the host flags, as SMPME is used for virtualizing SME priorities and has
no use in the host.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230509142235.3284028-4-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05 17:05:26 +01:00
Kristina Martsenko af94aad4c9 KVM: arm64: initialize HCRX_EL2
ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2.
Initialize the register to a safe value (all fields 0), to be robust
against firmware that has not initialized it. This is also needed to
ensure that the register is reinitialized after a kexec by a future
kernel.

In addition, move SMPME setup over to the new flags, as it would
otherwise get overridden. It is safe to set the bit even if SME is not
(uniformly) supported, as it will write to a RES0 bit (having no
effect), and SME will be disabled by the cpufeature framework.
(Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.)

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230509142235.3284028-2-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05 17:05:05 +01:00
Marc Zyngier c876c3f182 KVM: arm64: Relax trapping of CTR_EL0 when FEAT_EVT is available
CTR_EL0 can often be used in userspace, and it would be nice if
KVM didn't have to emulate it unnecessarily.

While it isn't possible to trap the cache configuration registers
independently from CTR_EL0 in the base ARMv8.0 architecture, FEAT_EVT
allows these cache configuration registers (CCSIDR_EL1, CCSIDR2_EL1,
CLIDR_EL1 and CSSELR_EL1) to be trapped independently by setting
HCR_EL2.TID4.

Switch to using TID4 instead of TID2 in the cases where FEAT_EVT
is available *and* that KVM doesn't need to sanitise CTR_EL0 to
paper over mismatched cache configurations.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230515170016.965378-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-05-21 19:09:44 +00:00
Oliver Upton 0d3b2b4d23 Merge branch kvm-arm64/nv-prefix into kvmarm/next
* kvm-arm64/nv-prefix:
  : Preamble to NV support, courtesy of Marc Zyngier.
  :
  : This brings in a set of prerequisite patches for supporting nested
  : virtualization in KVM/arm64. Of course, there is a long way to go until
  : NV is actually enabled in KVM.
  :
  :  - Introduce cpucap / vCPU feature flag to pivot the NV code on
  :
  :  - Add support for EL2 vCPU register state
  :
  :  - Basic nested exception handling
  :
  :  - Hide unsupported features from the ID registers for NV-capable VMs
  KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
  KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
  KVM: arm64: nv: Filter out unsupported features from ID regs
  KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
  KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
  KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
  KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
  KVM: arm64: nv: Handle SMCs taken from virtual EL2
  KVM: arm64: nv: Handle trapped ERET from virtual EL2
  KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
  KVM: arm64: nv: Support virtual EL2 exceptions
  KVM: arm64: nv: Handle HCR_EL2.NV system register traps
  KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
  KVM: arm64: nv: Add EL2 system registers to vcpu context
  KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
  KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
  KVM: arm64: nv: Introduce nested virtualization VCPU feature
  KVM: arm64: Use the S2 MMU context to iterate over S2 table
  arm64: Add ARM64_HAS_NESTED_VIRT cpufeature

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:41 +00:00
Oliver Upton 022d3f0800 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Convert CPACR_EL1_TTA to the new, generated system register
  :    definitions.
  :
  :  - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when
  :    accessing SVCR in the host.
  :
  :  - Avoid quiescing the guest if a vCPU accesses its own redistributor's
  :    SGIs/PPIs, eliminating the need to IPI. Largely an optimization for
  :    nested virtualization, as the L1 accesses the affected registers
  :    rather often.
  :
  :  - Conversion to kstrtobool()
  :
  :  - Common definition of INVALID_GPA across architectures
  :
  :  - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests
  KVM: arm64: Fix non-kerneldoc comments
  KVM: selftests: Enable USERFAULTFD
  KVM: selftests: Remove redundant setbuf()
  arm64/sysreg: clean up some inconsistent indenting
  KVM: MMU: Make the definition of 'INVALID_GPA' common
  KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool()
  KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0
  KVM: arm64: Synchronize SMEN on vcpu schedule out
  KVM: arm64: Kill CPACR_EL1_TTA definition

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:25 +00:00
Oliver Upton e8789ab704 Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next
* kvm-arm64/virtual-cache-geometry:
  : Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki.
  :
  : KVM/arm64 has always exposed the host cache geometry directly to the
  : guest, even though non-secure software should never perform CMOs by
  : Set/Way. This was slightly wrong, as the cache geometry was derived from
  : the PE on which the vCPU thread was running and not a sanitized value.
  :
  : All together this leads to issues migrating VMs on heterogeneous
  : systems, as the cache geometry saved/restored could be inconsistent.
  :
  : KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache
  : geometry is entirely controlled by userspace, such that migrations from
  : older kernels continue to work.
  KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
  KVM: arm64: Normalize cache configuration
  KVM: arm64: Mask FEAT_CCIDX
  KVM: arm64: Always set HCR_TID2
  arm64/cache: Move CLIDR macro definitions
  arm64/sysreg: Add CCSIDR2_EL1
  arm64/sysreg: Convert CCSIDR_EL1 to automatic generation
  arm64: Allow the definition of UNKNOWN system register fields

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:32:40 +00:00
Christoffer Dall 6898a55ce3 KVM: arm64: nv: Handle trapped ERET from virtual EL2
When a guest hypervisor running virtual EL2 in EL1 executes an ERET
instruction, we will have set HCR_EL2.NV which traps ERET to EL2, so
that we can emulate the exception return in software.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Jintack Lim 47f3a2fc76 KVM: arm64: nv: Support virtual EL2 exceptions
Support injecting exceptions and performing exception returns to and
from virtual EL2.  This must be done entirely in software except when
taking an exception from vEL0 to vEL2 when the virtual HCR_EL2.{E2H,TGE}
== {1,1}  (a VHE guest hypervisor).

[maz: switch to common exception injection framework, illegal exeption
 return handling]

Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Marc Zyngier 7a5e9c8f0b KVM: arm64: Kill CPACR_EL1_TTA definition
Since the One True Way is to use the new generated definition,
kill the KVM-specific definition of CPACR_EL1_TTA, and move
over to CPACR_ELx_TTA, hopefully for the same result.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230112154803.1808559-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-12 21:10:51 +00:00
Akihiko Odaki 8cc6dedaff KVM: arm64: Always set HCR_TID2
Always set HCR_TID2 to trap CTR_EL0, CCSIDR2_EL1, CLIDR_EL1, and
CSSELR_EL1. This saves a few lines of code and allows to employ their
access trap handlers for more purposes anticipated by the old
condition for setting HCR_TID2.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20230112023852.42012-6-akihiko.odaki@daynix.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-12 21:07:43 +00:00
Marc Zyngier b0803ba72b KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*
The former is an AArch32 legacy, so let's move over to the
verbose (and strictly identical) version.

This involves moving some of the #defines that were private
to KVM into the more generic esr.h.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-01-03 10:01:52 +00:00
Marc Zyngier cfa72993d1 Merge branch kvm-arm64/pkvm-vcpu-state into kvmarm-master/next
* kvm-arm64/pkvm-vcpu-state: (25 commits)
  : .
  : Large drop of pKVM patches from Will Deacon and co, adding
  : a private vm/vcpu state at EL2, managed independently from
  : the EL1 state. From the cover letter:
  :
  : "This is version six of the pKVM EL2 state series, extending the pKVM
  : hypervisor code so that it can dynamically instantiate and manage VM
  : data structures without the host being able to access them directly.
  : These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
  : page-table for the MMU. The pages used to hold the hypervisor structures
  : are returned to the host when the VM is destroyed."
  : .
  KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run()
  KVM: arm64: Don't unnecessarily map host kernel sections at EL2
  KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2
  KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
  KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
  KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache
  KVM: arm64: Instantiate guest stage-2 page-tables at EL2
  KVM: arm64: Consolidate stage-2 initialisation into a single function
  KVM: arm64: Add generic hyp_memcache helpers
  KVM: arm64: Provide I-cache invalidation by virtual address at EL2
  KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
  KVM: arm64: Add per-cpu fixmap infrastructure at EL2
  KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1
  KVM: arm64: Add infrastructure to create and track pKVM instances at EL2
  KVM: arm64: Rename 'host_kvm' to 'host_mmu'
  KVM: arm64: Add hyp_spinlock_t static initializer
  KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
  KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2
  KVM: arm64: Prevent the donation of no-map pages
  KVM: arm64: Implement do_donate() helper for donating memory
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:37:23 +00:00
Quentin Perret 315775ff7c KVM: arm64: Consolidate stage-2 initialisation into a single function
The initialisation of guest stage-2 page-tables is currently split
across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2().
That is presumably for historical reasons as kvm_arm_setup_stage2()
originates from the (now defunct) KVM port for 32-bit Arm.

Simplify this code path by merging both functions into one, taking care
to map the 'struct kvm' into the hypervisor stage-1 early on in order to
simplify the failure path.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-19-will@kernel.org
2022-11-11 17:16:25 +00:00
Ryan Roberts a0d37784bf KVM: arm64: Fix PAR_TO_HPFAR() to work independently of PA_BITS.
Kernel configs with PAGE_SIZE=64KB and PA_BITS=48 still advertise 52 bit
IPA space on HW that implements LPA. This is by design (admitedly this
is a very unlikely configuration in the real world).

However on such a config, attempting to create a vm with the guest
kernel placed above 48 bits in IPA space results in misbehaviour due to
the hypervisor incorrectly interpretting a faulting IPA.

Fix up PAR_TO_HPFAR() to always take 52 bits out of the PAR rather than
masking to CONFIG_ARM64_PA_BITS. If the system has a smaller implemented
PARange this should be safe because the bits are res0.

A more robust approach would be to discover the IPA size in use by the
page-table and mask based on that, to avoid relying on res0 reading back
as zero. But this information is difficult to access safely from the
code's location, so take the easy way out.

Fixes: bc1d7de8c5 ("kvm: arm64: Add 52bit support for PAR to HPFAR conversoin")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
[maz: commit message fixes]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221103150507.32948-3-ryan.roberts@arm.com
2022-11-10 19:22:58 +00:00
Marc Zyngier d25f30fe41 Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/next
* kvm-arm64/aarch32-idreg-trap:
  : .
  : Add trapping/sanitising infrastructure for AArch32 systen registers,
  : allowing more control over what we actually expose (such as the PMU).
  :
  : Patches courtesy of Oliver and Alexandru.
  : .
  KVM: arm64: Fix new instances of 32bit ESRs
  KVM: arm64: Hide AArch32 PMU registers when not available
  KVM: arm64: Start trapping ID registers for 32 bit guests
  KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler
  KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents
  KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds
  KVM: arm64: Return a bool from emulate_cp()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 09:42:45 +01:00
Oliver Upton fd1264c4ca KVM: arm64: Start trapping ID registers for 32 bit guests
To date KVM has not trapped ID register accesses from AArch32, meaning
that guests get an unconstrained view of what hardware supports. This
can be a serious problem because we try to base the guest's feature
registers on values that are safe system-wide. Furthermore, KVM does not
implement the latest ISA in the PMU and Debug architecture, so we
constrain these fields to supported values.

Since KVM now correctly handles CP15 and CP10 register traps, we no
longer need to clear HCR_EL2.TID3 for 32 bit guests and will instead
emulate reads with their safe values.

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-6-oupton@google.com
2022-05-03 11:14:34 +01:00
Mark Brown b4adc83b07 arm64/sme: System register and exception syndrome definitions
The arm64 Scalable Matrix Extension (SME) adds some new system registers,
fields in existing system registers and exception syndromes. This patch
adds definitions for these for use in future patches implementing support
for this extension.

Since SME will be the first user of FEAT_HCX in the kernel also include
the definitions for enumerating it and the HCRX system register it adds.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-6-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:50:43 +01:00
Mark Brown 3bb72d86d8 arm64: Always use individual bits in CPACR floating point enables
CPACR_EL1 has several bitfields for controlling traps for floating point
features to EL1, each of which has a separate bits for EL0 and EL1. Marc
Zyngier noted that we are not consistent in our use of defines to
manipulate these, sometimes using a define covering the whole field and
sometimes using defines for the individual bits. Make this consistent by
expanding the whole field defines where they are used (currently only in
the KVM code) and deleting them so that no further uses can be
introduced.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220207152109.197566-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-25 14:28:18 +00:00
Mark Brown 879358fc67 arm64: Define CPACR_EL1_FPEN similarly to other floating point controls
The base floating point, SVE and SME all have enable controls for EL0 and
EL1 in CPACR_EL1 which have a similar layout and function. Currently the
basic floating point enable FPEN is defined differently to the SVE control,
specified as a single define in kvm_arm.h rather than in sysreg.h. Move the
define to sysreg.h and provide separate EL0 and EL1 control bits so code
managing the different floating point enables can look consistent.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220207152109.197566-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-25 14:28:18 +00:00
Catalin Marinas 1f80d15020 KVM: arm64: Avoid setting the upper 32 bits of TCR_EL2 and CPTR_EL2 to 1
Having a signed (1 << 31) constant for TCR_EL2_RES1 and CPTR_EL2_TCPAC
causes the upper 32-bit to be set to 1 when assigning them to a 64-bit
variable. Bit 32 in TCR_EL2 is no longer RES0 in ARMv8.7: with FEAT_LPA2
it changes the meaning of bits 49:48 and 9:8 in the stage 1 EL2 page
table entries. As a result of the sign-extension, a non-VHE kernel can
no longer boot on a model with ARMv8.7 enabled.

CPTR_EL2 still has the top 32 bits RES0 but we should preempt any future
problems

Make these top bit constants unsigned as per commit df655b75c4
("arm64: KVM: Avoid setting the upper 32 bits of VTCR_EL2 to 1").

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Chris January <Chris.January@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211125152014.2806582-1-catalin.marinas@arm.com
2021-11-25 15:51:25 +00:00
Fuad Tabba 5386839077 KVM: arm64: Add missing field descriptor for MDCR_EL2
It's not currently used. Added for completeness.

No functional change intended.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211010145636.1950948-6-tabba@google.com
2021-10-11 14:57:28 +01:00
Fuad Tabba 2d701243b9 KVM: arm64: Add config register bit definitions
Add hardware configuration register bit definitions for HCR_EL2
and MDCR_EL2. Future patches toggle these hyp configuration
register bits to trap on certain accesses.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-11-tabba@google.com
2021-08-20 11:12:18 +01:00
Fuad Tabba dabb1667d8 KVM: arm64: Fix names of config register fields
Change the names of hcr_el2 register fields to match the Arm
Architecture Reference Manual. Easier for cross-referencing and
for grepping.

Also, change the name of CPTR_EL2_RES1 to CPTR_NVHE_EL2_RES1,
because res1 bits are different for VHE.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-5-tabba@google.com
2021-08-20 11:12:17 +01:00
Fuad Tabba d6c850dd6c KVM: arm64: MDCR_EL2 is a 64-bit register
Fix the places in KVM that treat MDCR_EL2 as a 32-bit register.
More recent features (e.g., FEAT_SPEv1p2) use bits above 31.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-4-tabba@google.com
2021-08-20 11:12:17 +01:00
Steven Price e1f358b504 KVM: arm64: Save/restore MTE registers
Define the new system registers that MTE introduces and context switch
them. The MTE feature is still hidden from the ID register as it isn't
supported in a VM yet.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210621111716.37157-4-steven.price@arm.com
2021-06-22 14:08:05 +01:00
Suzuki K Poulose a1319260bf arm64: KVM: Enable access to TRBE support for host
For a nvhe host, the EL2 must allow the EL1&0 translation
regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must
be saved/restored over a trip to the guest. Also, before
entering the guest, we must flush any trace data if the
TRBE was enabled. And we must prohibit the generation
of trace while we are in EL1 by clearing the TRFCR_EL1.

For vhe, the EL2 must prevent the EL1 access to the Trace
Buffer.

The MDCR_EL2 bit definitions for TRBE are available here :

  https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-8-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-06 16:05:28 -06:00