Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "RISCV:

   - Use common KVM implementation of MMU memory caches

   - SBI v0.2 support for Guest

   - Initial KVM selftests support

   - Fix to avoid spurious virtual interrupts after clearing hideleg CSR

   - Update email address for Anup and Atish

  ARM:

   - Simplification of the 'vcpu first run' by integrating it into KVM's
     'pid change' flow

   - Refactoring of the FP and SVE state tracking, also leading to a
     simpler state and less shared data between EL1 and EL2 in the nVHE
     case

   - Tidy up the header file usage for the nvhe hyp object

   - New HYP unsharing mechanism, finally allowing pages to be unmapped
     from the Stage-1 EL2 page-tables

   - Various pKVM cleanups around refcounting and sharing

   - A couple of vgic fixes for bugs that would trigger once the vcpu
     xarray rework is merged, but not sooner

   - Add minimal support for ARMv8.7's PMU extension

   - Rework kvm_pgtable initialisation ahead of the NV work

   - New selftest for IRQ injection

   - Teach selftests about the lack of default IPA space and page sizes

   - Expand sysreg selftest to deal with Pointer Authentication

   - The usual bunch of cleanups and doc update

  s390:

   - fix sigp sense/start/stop/inconsistency

   - cleanups

  x86:

   - Clean up some function prototypes more

   - improved gfn_to_pfn_cache with proper invalidation, used by Xen
     emulation

   - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery

   - completely remove potential TOC/TOU races in nested SVM consistency
     checks

   - update some PMCs on emulated instructions

   - Intel AMX support (joint work between Thomas and Intel)

   - large MMU cleanups

   - module parameter to disable PMU virtualization

   - cleanup register cache

   - first part of halt handling cleanups

   - Hyper-V enlightened MSR bitmap support for nested hypervisors

  Generic:

   - clean up Makefiles

   - introduce CONFIG_HAVE_KVM_DIRTY_RING

   - optimize memslot lookup using a tree

   - optimize vCPU array usage by converting to xarray"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits)
  x86/fpu: Fix inline prefix warnings
  selftest: kvm: Add amx selftest
  selftest: kvm: Move struct kvm_x86_state to header
  selftest: kvm: Reorder vcpu_load_state steps for AMX
  kvm: x86: Disable interception for IA32_XFD on demand
  x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state()
  kvm: selftests: Add support for KVM_CAP_XSAVE2
  kvm: x86: Add support for getting/setting expanded xstate buffer
  x86/fpu: Add uabi_size to guest_fpu
  kvm: x86: Add CPUID support for Intel AMX
  kvm: x86: Add XCR0 support for Intel AMX
  kvm: x86: Disable RDMSR interception of IA32_XFD_ERR
  kvm: x86: Emulate IA32_XFD_ERR for guest
  kvm: x86: Intercept #NM for saving IA32_XFD_ERR
  x86/fpu: Prepare xfd_err in struct fpu_guest
  kvm: x86: Add emulation for IA32_XFD
  x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation
  kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2
  x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM
  x86/fpu: Add guest support to xfd_enable_feature()
  ...
commit 79e06c4c49 by Linus Torvalds, 2022-01-16 16:15:14 +02:00
212 changed files with 9112 additions and 2978 deletions


@ -49,10 +49,12 @@ Andy Adamson <andros@citi.umich.edu>
Antoine Tenart <atenart@kernel.org> <antoine.tenart@bootlin.com>
Antoine Tenart <atenart@kernel.org> <antoine.tenart@free-electrons.com>
Antonio Ospite <ao2@ao2.it> <ao2@amarulasolutions.com>
Anup Patel <anup@brainfault.org> <anup.patel@wdc.com>
Archit Taneja <archit@ti.com>
Ard Biesheuvel <ardb@kernel.org> <ard.biesheuvel@linaro.org>
Arnaud Patard <arnaud.patard@rtp-net.org>
Arnd Bergmann <arnd@arndb.de>
Atish Patra <atishp@atishpatra.org> <atish.patra@wdc.com>
Axel Dyks <xl@xlsigned.net>
Axel Lin <axel.lin@gmail.com>
Bart Van Assche <bvanassche@acm.org> <bart.vanassche@sandisk.com>


@ -371,6 +371,9 @@ The bits in the dirty bitmap are cleared before the ioctl returns, unless
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled. For more information,
see the description of the capability.
Note that the Xen shared info page, if configured, shall always be assumed
to be dirty. KVM will not explicitly mark it such.
4.9 KVM_SET_MEMORY_ALIAS
------------------------
@ -1566,6 +1569,7 @@ otherwise it will return EBUSY error.
struct kvm_xsave {
__u32 region[1024];
__u32 extra[0];
};
This ioctl would copy current vcpu's xsave struct to the userspace.
@ -1574,7 +1578,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
4.43 KVM_SET_XSAVE
------------------
:Capability: KVM_CAP_XSAVE
:Capability: KVM_CAP_XSAVE and KVM_CAP_XSAVE2
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xsave (in)
@ -1585,9 +1589,18 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
struct kvm_xsave {
__u32 region[1024];
__u32 extra[0];
};
This ioctl would copy userspace's xsave struct to the kernel.
This ioctl would copy userspace's xsave struct to the kernel. It copies
as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2),
when invoked on the vm file descriptor. The size value returned by
KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
Currently, it is only greater than 4096 if a dynamic feature has been
enabled with ``arch_prctl()``, but this may change in the future.
The offsets of the state save areas in struct kvm_xsave follow the
contents of CPUID leaf 0xD on the host.
4.44 KVM_GET_XCRS
@ -1684,6 +1697,10 @@ userspace capabilities, and with user requirements (for example, the
user may wish to constrain cpuid to emulate older hardware, or for
feature consistency across a cluster).
Dynamically-enabled feature bits need to be requested with
``arch_prctl()`` before calling this ioctl. Feature bits that have not
been requested are excluded from the result.
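For illustration only, a VMM might request one of these dynamically-enabled
features before calling this ioctl roughly as follows. This is a sketch, not
part of the API text above: the ``ARCH_REQ_XCOMP_GUEST_PERM`` sub-command and
the ``XFEATURE_XTILEDATA`` bit number are assumptions taken from the AMX
enabling series and should be checked against the headers of the kernel in
use::

  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef ARCH_REQ_XCOMP_GUEST_PERM
  #define ARCH_REQ_XCOMP_GUEST_PERM 0x1025  /* assumed value; see <asm/prctl.h> */
  #endif
  #define XFEATURE_XTILEDATA 18             /* AMX tile data state component */

  /* Ask the kernel to permit AMX guest state for this process. */
  static int request_amx_guest_perm(void)
  {
          /* arch_prctl() has no glibc wrapper; use the raw syscall. */
          return syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM,
                         XFEATURE_XTILEDATA);
  }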
Note that certain capabilities, such as KVM_CAP_X86_DISABLE_EXITS, may
expose cpuid features (e.g. MONITOR) which are not supported by kvm in
its default configuration. If userspace enables such capabilities, it
@ -1796,6 +1813,7 @@ No flags are specified so far, the corresponding field must be set to zero.
struct kvm_irq_routing_msi msi;
struct kvm_irq_routing_s390_adapter adapter;
struct kvm_irq_routing_hv_sint hv_sint;
struct kvm_irq_routing_xen_evtchn xen_evtchn;
__u32 pad[8];
} u;
};
@ -1805,6 +1823,7 @@ No flags are specified so far, the corresponding field must be set to zero.
#define KVM_IRQ_ROUTING_MSI 2
#define KVM_IRQ_ROUTING_S390_ADAPTER 3
#define KVM_IRQ_ROUTING_HV_SINT 4
#define KVM_IRQ_ROUTING_XEN_EVTCHN 5
flags:
@ -1856,6 +1875,20 @@ address_hi must be zero.
__u32 sint;
};
struct kvm_irq_routing_xen_evtchn {
__u32 port;
__u32 vcpu;
__u32 priority;
};
When KVM_CAP_XEN_HVM includes the KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL bit
in its indication of supported features, routing to Xen event channels
is supported. Although the priority field is present, only the value
KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL is supported, which means delivery by
2 level event channels. FIFO event channel support may be added in
the future.
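As a non-authoritative sketch of how such an entry might be installed (error
handling and the rest of the routing table omitted; the priority field is
left at zero here on the assumption that zero selects 2 level delivery, so
check the constant exported by the uapi header in use)::

  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Route one GSI to a 2 level Xen event channel on the given vCPU. */
  static int route_gsi_to_evtchn(int vm_fd, __u32 gsi, __u32 port, __u32 vcpu)
  {
          struct kvm_irq_routing *table;
          int ret;

          table = calloc(1, sizeof(*table) + sizeof(table->entries[0]));
          if (!table)
                  return -1;

          table->nr = 1;
          table->entries[0].gsi = gsi;
          table->entries[0].type = KVM_IRQ_ROUTING_XEN_EVTCHN;
          table->entries[0].u.xen_evtchn.port = port;
          table->entries[0].u.xen_evtchn.vcpu = vcpu;

          ret = ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
          free(table);
          return ret;
  }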
4.55 KVM_SET_TSC_KHZ
--------------------
@ -3701,7 +3734,7 @@ KVM with the currently defined set of flags.
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_s390_skeys
:Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage
:Returns: 0 on success, KVM_S390_GET_SKEYS_NONE if guest is not using storage
keys, negative value on error
This ioctl is used to get guest storage key values on the s390
@ -3720,7 +3753,7 @@ you want to get.
The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
allowed value is defined as KVM_S390_SKEYS_MAX. Values outside this range
will cause the ioctl to return -EINVAL.
The skeydata_addr field is the address to a buffer large enough to hold count
@ -3744,7 +3777,7 @@ you want to set.
The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
allowed value is defined as KVM_S390_SKEYS_MAX. Values outside this range
will cause the ioctl to return -EINVAL.
The skeydata_addr field is the address to a buffer containing count bytes of
@ -5134,6 +5167,15 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
not aware of the Xen CPU id which is used as the index into the
vcpu_info[] array, so cannot know the correct default location.
Note that the shared info page may be constantly written to by KVM;
it contains the event channel bitmap used to deliver interrupts to
a Xen guest, amongst other things. It is exempt from dirty tracking
mechanisms — KVM will not explicitly mark the page as dirty each
time an event channel interrupt is delivered to the guest! Thus,
userspace should always assume that the designated GFN is dirty if
any vCPU has been running or any event channel interrupts can be
routed to the guest.
KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
Sets the exception vector used to deliver Xen event channel upcalls.
@ -5503,6 +5545,34 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header.
The Stats Data block contains an array of 64-bit values in the same order
as the descriptors in Descriptors block.
4.42 KVM_GET_XSAVE2
------------------
:Capability: KVM_CAP_XSAVE2
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xsave (out)
:Returns: 0 on success, -1 on error
::
struct kvm_xsave {
__u32 region[1024];
__u32 extra[0];
};
This ioctl would copy current vcpu's xsave struct to the userspace. It
copies as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
when invoked on the vm file descriptor. The size value returned by
KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
Currently, it is only greater than 4096 if a dynamic feature has been
enabled with ``arch_prctl()``, but this may change in the future.
The offsets of the state save areas in struct kvm_xsave follow the contents
of CPUID leaf 0xD on the host.
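An illustrative userspace sketch, assuming only the standard ``<linux/kvm.h>``
definitions and abbreviating error handling::

  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Fetch the vcpu xsave data into a buffer sized via KVM_CAP_XSAVE2. */
  static struct kvm_xsave *vcpu_get_xsave2(int vm_fd, int vcpu_fd)
  {
          struct kvm_xsave *xsave;
          int size;

          /* At least 4096; larger once dynamic features are enabled. */
          size = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_XSAVE2);
          if (size < (int)sizeof(*xsave))
                  size = sizeof(*xsave);

          xsave = calloc(1, size);
          if (xsave && ioctl(vcpu_fd, KVM_GET_XSAVE2, xsave) < 0) {
                  free(xsave);
                  xsave = NULL;
          }
          return xsave;
  }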
5. The kvm_run structure
========================
@ -7401,6 +7471,7 @@ PVHVM guests. Valid flags are::
#define KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL (1 << 1)
#define KVM_XEN_HVM_CONFIG_SHARED_INFO (1 << 2)
#define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 2)
#define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL (1 << 3)
The KVM_XEN_HVM_CONFIG_HYPERCALL_MSR flag indicates that the KVM_XEN_HVM_CONFIG
ioctl is available, for the guest to set its hypercall page.
@ -7420,6 +7491,10 @@ The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related
features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are
supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls.
The KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL flag indicates that IRQ routing entries
of the type KVM_IRQ_ROUTING_XEN_EVTCHN are supported, with the priority
field set to indicate 2 level event channel delivery.
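A minimal, non-normative probe for this flag might look like the following,
where ``kvm_fd`` is an open ``/dev/kvm`` descriptor::

  #include <stdbool.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* True if IRQ routing to 2 level Xen event channels is available. */
  static bool have_xen_evtchn_routing(int kvm_fd)
  {
          int feat = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_XEN_HVM);

          return feat > 0 && (feat & KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL);
  }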
8.31 KVM_CAP_PPC_MULTITCE
-------------------------


@ -161,7 +161,7 @@ Shadow pages contain the following information:
If clear, this page corresponds to a guest page table denoted by the gfn
field.
role.quadrant:
When role.gpte_is_8_bytes=0, the guest uses 32-bit gptes while the host uses 64-bit
When role.has_4_byte_gpte=1, the guest uses 32-bit gptes while the host uses 64-bit
sptes. That means a guest page table contains more ptes than the host,
so multiple shadow pages are needed to shadow one guest page.
For first-level shadow pages, role.quadrant can be 0 or 1 and denotes the
@ -177,9 +177,9 @@ Shadow pages contain the following information:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
unpinned it will be destroyed.
role.gpte_is_8_bytes:
Reflects the size of the guest PTE for which the page is valid, i.e. '1'
if 64-bit gptes are in use, '0' if 32-bit gptes are in use.
role.has_4_byte_gpte:
Reflects the size of the guest PTE for which the page is valid, i.e. '0'
if direct map or 64-bit gptes are in use, '1' if 32-bit gptes are in use.
role.efer_nx:
Contains the value of efer.nx for which the page is valid.
role.cr0_wp:


@ -10539,8 +10539,8 @@ F: arch/powerpc/kernel/kvm*
F: arch/powerpc/kvm/
KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)
M: Anup Patel <anup.patel@wdc.com>
R: Atish Patra <atish.patra@wdc.com>
M: Anup Patel <anup@brainfault.org>
R: Atish Patra <atishp@atishpatra.org>
L: kvm@vger.kernel.org
L: kvm-riscv@lists.infradead.org
L: linux-riscv@lists.infradead.org


@ -63,6 +63,7 @@ enum __kvm_host_smccc_func {
/* Hypercalls available after pKVM finalisation */
__KVM_HOST_SMCCC_FUNC___pkvm_host_share_hyp,
__KVM_HOST_SMCCC_FUNC___pkvm_host_unshare_hyp,
__KVM_HOST_SMCCC_FUNC___kvm_adjust_pc,
__KVM_HOST_SMCCC_FUNC___kvm_vcpu_run,
__KVM_HOST_SMCCC_FUNC___kvm_flush_vm_context,


@ -41,6 +41,8 @@ void kvm_inject_vabt(struct kvm_vcpu *vcpu);
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
{
return !(vcpu->arch.hcr_el2 & HCR_RW);
@ -386,7 +388,7 @@ static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
*vcpu_cpsr(vcpu) |= PSR_AA32_E_BIT;
} else {
u64 sctlr = vcpu_read_sys_reg(vcpu, SCTLR_EL1);
sctlr |= (1 << 25);
sctlr |= SCTLR_ELx_EE;
vcpu_write_sys_reg(vcpu, sctlr, SCTLR_EL1);
}
}


@ -26,7 +26,6 @@
#include <asm/fpsimd.h>
#include <asm/kvm.h>
#include <asm/kvm_asm.h>
#include <asm/thread_info.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
@ -298,9 +297,6 @@ struct kvm_vcpu_arch {
/* Exception Information */
struct kvm_vcpu_fault_info fault;
/* State of various workarounds, see kvm_asm.h for bit assignment */
u64 workaround_flags;
/* Miscellaneous vcpu state flags */
u64 flags;
@ -321,8 +317,8 @@ struct kvm_vcpu_arch {
struct kvm_guest_debug_arch vcpu_debug_state;
struct kvm_guest_debug_arch external_debug_state;
struct thread_info *host_thread_info; /* hyp VA */
struct user_fpsimd_state *host_fpsimd_state; /* hyp VA */
struct task_struct *parent_task;
struct {
/* {Break,watch}point registers */
@ -367,9 +363,6 @@ struct kvm_vcpu_arch {
int target;
DECLARE_BITMAP(features, KVM_VCPU_MAX_FEATURES);
/* Detect first run of a vcpu */
bool has_run_once;
/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
u64 vsesr_el2;
@ -411,20 +404,17 @@ struct kvm_vcpu_arch {
#define KVM_ARM64_DEBUG_DIRTY (1 << 0)
#define KVM_ARM64_FP_ENABLED (1 << 1) /* guest FP regs loaded */
#define KVM_ARM64_FP_HOST (1 << 2) /* host FP regs loaded */
#define KVM_ARM64_HOST_SVE_IN_USE (1 << 3) /* backup for host TIF_SVE */
#define KVM_ARM64_HOST_SVE_ENABLED (1 << 4) /* SVE enabled for EL0 */
#define KVM_ARM64_GUEST_HAS_SVE (1 << 5) /* SVE exposed to guest */
#define KVM_ARM64_VCPU_SVE_FINALIZED (1 << 6) /* SVE config completed */
#define KVM_ARM64_GUEST_HAS_PTRAUTH (1 << 7) /* PTRAUTH exposed to guest */
#define KVM_ARM64_PENDING_EXCEPTION (1 << 8) /* Exception pending */
/*
* Overlaps with KVM_ARM64_EXCEPT_MASK on purpose so that it can't be
* set together with an exception...
*/
#define KVM_ARM64_INCREMENT_PC (1 << 9) /* Increment PC */
#define KVM_ARM64_EXCEPT_MASK (7 << 9) /* Target EL/MODE */
#define KVM_ARM64_DEBUG_STATE_SAVE_SPE (1 << 12) /* Save SPE context if active */
#define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context if active */
#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
KVM_GUESTDBG_USE_SW_BP | \
KVM_GUESTDBG_USE_HW | \
KVM_GUESTDBG_SINGLESTEP)
/*
* When KVM_ARM64_PENDING_EXCEPTION is set, KVM_ARM64_EXCEPT_MASK can
* take the following values:
@ -442,11 +432,14 @@ struct kvm_vcpu_arch {
#define KVM_ARM64_EXCEPT_AA64_EL1 (0 << 11)
#define KVM_ARM64_EXCEPT_AA64_EL2 (1 << 11)
/*
* Overlaps with KVM_ARM64_EXCEPT_MASK on purpose so that it can't be
* set together with an exception...
*/
#define KVM_ARM64_INCREMENT_PC (1 << 9) /* Increment PC */
#define KVM_ARM64_DEBUG_STATE_SAVE_SPE (1 << 12) /* Save SPE context if active */
#define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context if active */
#define KVM_ARM64_FP_FOREIGN_FPSTATE (1 << 14)
#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
KVM_GUESTDBG_USE_SW_BP | \
KVM_GUESTDBG_USE_HW | \
KVM_GUESTDBG_SINGLESTEP)
#define vcpu_has_sve(vcpu) (system_supports_sve() && \
((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_SVE))
@ -606,6 +599,8 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
void kvm_arm_halt_guest(struct kvm *kvm);
void kvm_arm_resume_guest(struct kvm *kvm);
#define vcpu_has_run_once(vcpu) !!rcu_access_pointer((vcpu)->pid)
#ifndef __KVM_NVHE_HYPERVISOR__
#define kvm_call_hyp_nvhe(f, ...) \
({ \
@ -724,7 +719,6 @@ void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
void kvm_arm_init_debug(void);
void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
@ -744,8 +738,10 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
/* Guest/host FPSIMD coordination helpers */
int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
{
@ -756,12 +752,7 @@ static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
#ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
{
return kvm_arch_vcpu_run_map_fp(vcpu);
}
#ifdef CONFIG_KVM
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u32 clr);


@ -90,7 +90,6 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
void __sve_save_state(void *sve_pffr, u32 *fpsr);
void __sve_restore_state(void *sve_pffr, u32 *fpsr);
#ifndef __KVM_NVHE_HYPERVISOR__


@ -150,6 +150,8 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
#include <asm/kvm_pgtable.h>
#include <asm/stage2_pgtable.h>
int kvm_share_hyp(void *from, void *to);
void kvm_unshare_hyp(void *from, void *to);
int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
void __iomem **kaddr,


@ -251,6 +251,27 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt);
int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
enum kvm_pgtable_prot prot);
/**
* kvm_pgtable_hyp_unmap() - Remove a mapping from a hypervisor stage-1 page-table.
* @pgt: Page-table structure initialised by kvm_pgtable_hyp_init().
* @addr: Virtual address from which to remove the mapping.
* @size: Size of the mapping.
*
* The offset of @addr within a page is ignored, @size is rounded-up to
* the next page boundary and @phys is rounded-down to the previous page
* boundary.
*
* TLB invalidation is performed for each page-table entry cleared during the
* unmapping operation and the reference count for the page-table page
* containing the cleared entry is decremented, with unreferenced pages being
* freed. The unmapping operation will stop early if it encounters either an
* invalid page-table entry or a valid block mapping which maps beyond the range
* being unmapped.
*
* Return: Number of bytes unmapped, which may be 0.
*/
u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
/**
* kvm_get_vtcr() - Helper to construct VTCR_EL2
* @mmfr0: Sanitized value of SYS_ID_AA64MMFR0_EL1 register.
@ -270,8 +291,7 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
/**
* __kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
* @pgt: Uninitialised page-table structure to initialise.
* @arch: Arch-specific KVM structure representing the guest virtual
* machine.
* @mmu: S2 MMU context for this S2 translation
* @mm_ops: Memory management callbacks.
* @flags: Stage-2 configuration flags.
* @force_pte_cb: Function that returns true if page level mappings must
@ -279,13 +299,13 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
*
* Return: 0 on success, negative error code on failure.
*/
int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
struct kvm_pgtable_mm_ops *mm_ops,
enum kvm_pgtable_stage2_flags flags,
kvm_pgtable_force_pte_cb_t force_pte_cb);
#define kvm_pgtable_stage2_init(pgt, arch, mm_ops) \
__kvm_pgtable_stage2_init(pgt, arch, mm_ops, 0, NULL)
#define kvm_pgtable_stage2_init(pgt, mmu, mm_ops) \
__kvm_pgtable_stage2_init(pgt, mmu, mm_ops, 0, NULL)
/**
* kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.


@ -0,0 +1,71 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 2020 - Google LLC
* Author: Quentin Perret <qperret@google.com>
*/
#ifndef __ARM64_KVM_PKVM_H__
#define __ARM64_KVM_PKVM_H__
#include <linux/memblock.h>
#include <asm/kvm_pgtable.h>
#define HYP_MEMBLOCK_REGIONS 128
extern struct memblock_region kvm_nvhe_sym(hyp_memory)[];
extern unsigned int kvm_nvhe_sym(hyp_memblock_nr);
static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
{
unsigned long total = 0, i;
/* Provision the worst case scenario */
for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) {
nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE);
total += nr_pages;
}
return total;
}
static inline unsigned long __hyp_pgtable_total_pages(void)
{
unsigned long res = 0, i;
/* Cover all of memory with page-granularity */
for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++) {
struct memblock_region *reg = &kvm_nvhe_sym(hyp_memory)[i];
res += __hyp_pgtable_max_pages(reg->size >> PAGE_SHIFT);
}
return res;
}
static inline unsigned long hyp_s1_pgtable_pages(void)
{
unsigned long res;
res = __hyp_pgtable_total_pages();
/* Allow 1 GiB for private mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
return res;
}
static inline unsigned long host_s2_pgtable_pages(void)
{
unsigned long res;
/*
* Include an extra 16 pages to safely upper-bound the worst case of
* concatenated pgds.
*/
res = __hyp_pgtable_total_pages() + 16;
/* Allow 1 GiB for MMIO mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
return res;
}
#endif /* __ARM64_KVM_PKVM_H__ */


@ -15,6 +15,7 @@
#ifndef __ASSEMBLY__
#include <linux/refcount.h>
#include <asm/cpufeature.h>
typedef struct {
atomic64_t id;


@ -953,6 +953,7 @@
#define ID_AA64DFR0_PMUVER_8_1 0x4
#define ID_AA64DFR0_PMUVER_8_4 0x5
#define ID_AA64DFR0_PMUVER_8_5 0x6
#define ID_AA64DFR0_PMUVER_8_7 0x7
#define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
#define ID_AA64DFR0_PMSVER_8_2 0x1


@ -111,7 +111,6 @@ int main(void)
#ifdef CONFIG_KVM
DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt));
DEFINE(VCPU_FAULT_DISR, offsetof(struct kvm_vcpu, arch.fault.disr_el1));
DEFINE(VCPU_WORKAROUND_FLAGS, offsetof(struct kvm_vcpu, arch.workaround_flags));
DEFINE(VCPU_HCR_EL2, offsetof(struct kvm_vcpu, arch.hcr_el2));
DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_cpu_context, regs));
DEFINE(CPU_RGSR_EL1, offsetof(struct kvm_cpu_context, sys_regs[RGSR_EL1]));


@ -79,7 +79,11 @@
* indicate whether or not the userland FPSIMD state of the current task is
* present in the registers. The flag is set unless the FPSIMD registers of this
* CPU currently contain the most recent userland FPSIMD state of the current
* task.
* task. If the task is behaving as a VMM, then this will be managed by
* KVM which will clear it to indicate that the vcpu FPSIMD state is currently
* loaded on the CPU, allowing the state to be saved if a FPSIMD-aware
* softirq kicks in. Upon vcpu_put(), KVM will save the vcpu FP state and
* flag the register state as invalid.
*
* In order to allow softirq handlers to use FPSIMD, kernel_neon_begin() may
* save the task's FPSIMD context back to task_struct from softirq context.

arch/arm64/kvm/.gitignore (new file)

@ -0,0 +1,2 @@
# SPDX-License-Identifier: GPL-2.0-only
hyp_constants.h


@ -40,6 +40,7 @@ menuconfig KVM
select HAVE_KVM_VCPU_RUN_PID_CHANGE
select SCHED_INFO
select GUEST_PERF_EVENTS if PERF_EVENTS
select INTERVAL_TREE
help
Support hosting virtualized guest machines.


@ -5,17 +5,15 @@
ccflags-y += -I $(srctree)/$(src)
KVM=../../../virt/kvm
include $(srctree)/virt/kvm/Makefile.kvm
obj-$(CONFIG_KVM) += kvm.o
obj-$(CONFIG_KVM) += hyp/
kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
$(KVM)/vfio.o $(KVM)/irqchip.o $(KVM)/binary_stats.o \
arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
inject_fault.o va_layout.o handle_exit.o \
guest.o debug.o reset.o sys_regs.o \
vgic-sys-reg-v3.o fpsimd.o pmu.o \
vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
arch_timer.o trng.o\
vgic/vgic.o vgic/vgic-init.o \
vgic/vgic-irqfd.o vgic/vgic-v2.o \
@ -25,3 +23,19 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
vgic/vgic-its.o vgic/vgic-debug.o
kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o
always-y := hyp_constants.h hyp-constants.s
define rule_gen_hyp_constants
$(call filechk,offsets,__HYP_CONSTANTS_H__)
endef
CFLAGS_hyp-constants.o = -I $(srctree)/$(src)/hyp/include
$(obj)/hyp-constants.s: $(src)/hyp/hyp-constants.c FORCE
$(call if_changed_dep,cc_s_c)
$(obj)/hyp_constants.h: $(obj)/hyp-constants.s FORCE
$(call if_changed_rule,gen_hyp_constants)
obj-kvm := $(addprefix $(obj)/, $(kvm-y))
$(obj-kvm): $(obj)/hyp_constants.h


@ -467,7 +467,7 @@ out:
}
/*
* Schedule the background timer before calling kvm_vcpu_block, so that this
* Schedule the background timer before calling kvm_vcpu_halt, so that this
* thread is removed from its waitqueue and made runnable when there's a timer
* interrupt to handle.
*/
@ -649,7 +649,6 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = vcpu_timer(vcpu);
struct timer_map map;
struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
if (unlikely(!timer->enabled))
return;
@ -672,7 +671,7 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
if (map.emul_ptimer)
soft_timer_cancel(&map.emul_ptimer->hrtimer);
if (rcuwait_active(wait))
if (kvm_vcpu_is_blocking(vcpu))
kvm_timer_blocking(vcpu);
/*
@ -750,7 +749,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
/* Make the updates of cntvoff for all vtimer contexts atomic */
static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
{
int i;
unsigned long i;
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;
@ -1189,8 +1188,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
{
int vtimer_irq, ptimer_irq;
int i, ret;
int vtimer_irq, ptimer_irq, ret;
unsigned long i;
vtimer_irq = vcpu_vtimer(vcpu)->irq.irq;
ret = kvm_vgic_set_owner(vcpu, vtimer_irq, vcpu_vtimer(vcpu));
@ -1297,7 +1296,7 @@ void kvm_timer_init_vhe(void)
static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
{
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
vcpu_vtimer(vcpu)->irq.irq = vtimer_irq;


@ -146,7 +146,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (ret)
return ret;
ret = create_hyp_mappings(kvm, kvm + 1, PAGE_HYP);
ret = kvm_share_hyp(kvm, kvm + 1);
if (ret)
goto out_free_stage2_pgd;
@ -175,19 +175,13 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
*/
void kvm_arch_destroy_vm(struct kvm *kvm)
{
int i;
bitmap_free(kvm->arch.pmu_filter);
kvm_vgic_destroy(kvm);
for (i = 0; i < KVM_MAX_VCPUS; ++i) {
if (kvm->vcpus[i]) {
kvm_vcpu_destroy(kvm->vcpus[i]);
kvm->vcpus[i] = NULL;
}
}
atomic_set(&kvm->online_vcpus, 0);
kvm_destroy_vcpus(kvm);
kvm_unshare_hyp(kvm, kvm + 1);
}
int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
@ -342,7 +336,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
if (err)
return err;
return create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
return kvm_share_hyp(vcpu, vcpu + 1);
}
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
@ -351,7 +345,7 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
{
if (vcpu->arch.has_run_once && unlikely(!irqchip_in_kernel(vcpu->kvm)))
if (vcpu_has_run_once(vcpu) && unlikely(!irqchip_in_kernel(vcpu->kvm)))
static_branch_dec(&userspace_irqchip_in_use);
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
@ -368,27 +362,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
/*
* If we're about to block (most likely because we've just hit a
* WFI), we need to sync back the state of the GIC CPU interface
* so that we have the latest PMR and group enables. This ensures
* that kvm_arch_vcpu_runnable has up-to-date data to decide
* whether we have pending interrupts.
*
* For the same reason, we want to tell GICv4 that we need
* doorbells to be signalled, should an interrupt become pending.
*/
preempt_disable();
kvm_vgic_vmcr_sync(vcpu);
vgic_v4_put(vcpu, true);
preempt_enable();
}
void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
preempt_disable();
vgic_v4_load(vcpu);
preempt_enable();
}
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@ -591,18 +570,33 @@ static void update_vmid(struct kvm_vmid *vmid)
spin_unlock(&kvm_vmid_lock);
}
static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
{
return vcpu->arch.target >= 0;
}
/*
* Handle both the initialisation that is being done when the vcpu is
* run for the first time, as well as the updates that must be
* performed each time we get a new thread dealing with this vcpu.
*/
int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
int ret = 0;
int ret;
if (likely(vcpu->arch.has_run_once))
return 0;
if (!kvm_vcpu_initialized(vcpu))
return -ENOEXEC;
if (!kvm_arm_vcpu_is_finalized(vcpu))
return -EPERM;
vcpu->arch.has_run_once = true;
ret = kvm_arch_vcpu_run_map_fp(vcpu);
if (ret)
return ret;
if (likely(vcpu_has_run_once(vcpu)))
return 0;
kvm_arm_vcpu_init_debug(vcpu);
@ -614,12 +608,6 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
ret = kvm_vgic_map_resources(kvm);
if (ret)
return ret;
} else {
/*
* Tell the rest of the code that there are userspace irqchip
* VMs in the wild.
*/
static_branch_inc(&userspace_irqchip_in_use);
}
ret = kvm_timer_enable(vcpu);
@ -627,6 +615,16 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
return ret;
ret = kvm_arm_pmu_v3_enable(vcpu);
if (ret)
return ret;
if (!irqchip_in_kernel(kvm)) {
/*
* Tell the rest of the code that there are userspace irqchip
* VMs in the wild.
*/
static_branch_inc(&userspace_irqchip_in_use);
}
/*
* Initialize traps for protected VMs.
@ -646,7 +644,7 @@ bool kvm_arch_intc_initialized(struct kvm *kvm)
void kvm_arm_halt_guest(struct kvm *kvm)
{
int i;
unsigned long i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
@ -656,12 +654,12 @@ void kvm_arm_halt_guest(struct kvm *kvm)
void kvm_arm_resume_guest(struct kvm *kvm)
{
int i;
unsigned long i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm) {
vcpu->arch.pause = false;
rcuwait_wake_up(kvm_arch_vcpu_get_wait(vcpu));
__kvm_vcpu_wake_up(vcpu);
}
}
@ -686,9 +684,37 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
smp_rmb();
}
static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
/**
* kvm_vcpu_wfi - emulate Wait-For-Interrupt behavior
* @vcpu: The VCPU pointer
*
* Suspend execution of a vCPU until a valid wake event is detected, i.e. until
* the vCPU is runnable. The vCPU may or may not be scheduled out, depending
* on when a wake event arrives, e.g. there may already be a pending wake event.
*/
void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
{
return vcpu->arch.target >= 0;
/*
* Sync back the state of the GIC CPU interface so that we have
* the latest PMR and group enables. This ensures that
* kvm_arch_vcpu_runnable has up-to-date data to decide whether
* we have pending interrupts, e.g. when determining if the
* vCPU should block.
*
* For the same reason, we want to tell GICv4 that we need
* doorbells to be signalled, should an interrupt become pending.
*/
preempt_disable();
kvm_vgic_vmcr_sync(vcpu);
vgic_v4_put(vcpu, true);
preempt_enable();
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
preempt_disable();
vgic_v4_load(vcpu);
preempt_enable();
}
static void check_vcpu_requests(struct kvm_vcpu *vcpu)
@ -786,13 +812,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
struct kvm_run *run = vcpu->run;
int ret;
if (unlikely(!kvm_vcpu_initialized(vcpu)))
return -ENOEXEC;
ret = kvm_vcpu_first_run_init(vcpu);
if (ret)
return ret;
if (run->exit_reason == KVM_EXIT_MMIO) {
ret = kvm_handle_mmio_return(vcpu);
if (ret)
@ -856,6 +875,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
}
kvm_arm_setup_debug(vcpu);
kvm_arch_vcpu_ctxflush_fp(vcpu);
/**************************************************************
* Enter the guest
@ -1130,7 +1150,7 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
* need to invalidate the I-cache though, as FWB does *not*
* imply CTR_EL0.DIC.
*/
if (vcpu->arch.has_run_once) {
if (vcpu_has_run_once(vcpu)) {
if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
stage2_unmap_vm(vcpu->kvm);
else
@ -2043,7 +2063,7 @@ static int finalize_hyp_mode(void)
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
{
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
mpidr &= MPIDR_HWID_BITMASK;
kvm_for_each_vcpu(i, vcpu, kvm) {


@ -7,7 +7,6 @@
*/
#include <linux/irqflags.h>
#include <linux/sched.h>
#include <linux/thread_info.h>
#include <linux/kvm_host.h>
#include <asm/fpsimd.h>
#include <asm/kvm_asm.h>
@ -15,6 +14,19 @@
#include <asm/kvm_mmu.h>
#include <asm/sysreg.h>
void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
{
struct task_struct *p = vcpu->arch.parent_task;
struct user_fpsimd_state *fpsimd;
if (!is_protected_kvm_enabled() || !p)
return;
fpsimd = &p->thread.uw.fpsimd_state;
kvm_unshare_hyp(fpsimd, fpsimd + 1);
put_task_struct(p);
}
/*
* Called on entry to KVM_RUN unless this vcpu previously ran at least
* once and the most recent prior KVM_RUN for this vcpu was called from
@ -28,36 +40,29 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
{
int ret;
struct thread_info *ti = &current->thread_info;
struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
kvm_vcpu_unshare_task_fp(vcpu);
/* Make sure the host task fpsimd state is visible to hyp: */
ret = kvm_share_hyp(fpsimd, fpsimd + 1);
if (ret)
return ret;
vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
/*
* Make sure the host task thread flags and fpsimd state are
* visible to hyp:
* We need to keep current's task_struct pinned until its data has been
* unshared with the hypervisor to make sure it is not re-used by the
* kernel and donated to someone else while already shared -- see
* kvm_vcpu_unshare_task_fp() for the matching put_task_struct().
*/
ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
if (ret)
goto error;
ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
if (ret)
goto error;
if (vcpu->arch.sve_state) {
void *sve_end;
sve_end = vcpu->arch.sve_state + vcpu_sve_state_size(vcpu);
ret = create_hyp_mappings(vcpu->arch.sve_state, sve_end,
PAGE_HYP);
if (ret)
goto error;
if (is_protected_kvm_enabled()) {
get_task_struct(current);
vcpu->arch.parent_task = current;
}
vcpu->arch.host_thread_info = kern_hyp_va(ti);
vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
error:
return ret;
return 0;
}
/*
@ -66,26 +71,27 @@ error:
*
* Here, we just set the correct metadata to indicate that the FPSIMD
* state in the cpu regs (if any) belongs to current on the host.
*
* TIF_SVE is backed up here, since it may get clobbered with guest state.
* This flag is restored by kvm_arch_vcpu_put_fp(vcpu).
*/
void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
{
BUG_ON(!current->mm);
BUG_ON(test_thread_flag(TIF_SVE));
vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED |
KVM_ARM64_HOST_SVE_IN_USE |
KVM_ARM64_HOST_SVE_ENABLED);
vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
vcpu->arch.flags |= KVM_ARM64_FP_HOST;
if (test_thread_flag(TIF_SVE))
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_IN_USE;
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
}
void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
{
if (test_thread_flag(TIF_FOREIGN_FPSTATE))
vcpu->arch.flags |= KVM_ARM64_FP_FOREIGN_FPSTATE;
else
vcpu->arch.flags &= ~KVM_ARM64_FP_FOREIGN_FPSTATE;
}
/*
* If the guest FPSIMD state was loaded, update the host's context
* tracking data mark the CPU FPSIMD regs as dirty and belonging to vcpu
@ -115,13 +121,11 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
{
unsigned long flags;
bool host_has_sve = system_supports_sve();
bool guest_has_sve = vcpu_has_sve(vcpu);
local_irq_save(flags);
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
if (guest_has_sve) {
if (vcpu_has_sve(vcpu)) {
__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
/* Restore the VL that was saved when bound to the CPU */
@ -131,7 +135,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
}
fpsimd_save_and_flush_cpu_state();
} else if (has_vhe() && host_has_sve) {
} else if (has_vhe() && system_supports_sve()) {
/*
* The FPSIMD/SVE state in the CPU has not been touched, and we
* have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been
@ -145,8 +149,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);
}
update_thread_flag(TIF_SVE,
vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE);
update_thread_flag(TIF_SVE, 0);
local_irq_restore(flags);
}


@ -82,7 +82,7 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu)
*
* WFE: Yield the CPU and come back to this vcpu when the scheduler
* decides to.
* WFI: Simply call kvm_vcpu_block(), which will halt execution of
* WFI: Simply call kvm_vcpu_halt(), which will halt execution of
* world-switches and schedule other host processes until there is an
* incoming IRQ or FIQ to the VM.
*/
@ -95,8 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
} else {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
vcpu->stat.wfi_exit_stat++;
kvm_vcpu_block(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
kvm_vcpu_wfi(vcpu);
}
kvm_incr_pc(vcpu);


@ -10,4 +10,4 @@ subdir-ccflags-y := -I$(incdir) \
-DDISABLE_BRANCH_PROFILING \
$(DISABLE_STACKLEAK_PLUGIN)
obj-$(CONFIG_KVM) += vhe/ nvhe/ pgtable.o reserved_mem.o
obj-$(CONFIG_KVM) += vhe/ nvhe/ pgtable.o


@ -25,9 +25,3 @@ SYM_FUNC_START(__sve_restore_state)
sve_load 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_restore_state)
SYM_FUNC_START(__sve_save_state)
mov x2, #1
sve_save 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_save_state)


@ -0,0 +1,10 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/kbuild.h>
#include <nvhe/memory.h>
int main(void)
{
DEFINE(STRUCT_HYP_PAGE_SIZE, sizeof(struct hyp_page));
return 0;
}


@ -29,7 +29,6 @@
#include <asm/fpsimd.h>
#include <asm/debug-monitors.h>
#include <asm/processor.h>
#include <asm/thread_info.h>
struct kvm_exception_table_entry {
int insn, fixup;
@ -49,7 +48,7 @@ static inline bool update_fp_enabled(struct kvm_vcpu *vcpu)
* trap the accesses.
*/
if (!system_supports_fpsimd() ||
vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE)
vcpu->arch.flags & KVM_ARM64_FP_FOREIGN_FPSTATE)
vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED |
KVM_ARM64_FP_HOST);
@ -143,16 +142,6 @@ static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);
}
static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
{
struct thread_struct *thread;
thread = container_of(vcpu->arch.host_fpsimd_state, struct thread_struct,
uw.fpsimd_state);
__sve_save_state(sve_pffr(thread), &vcpu->arch.host_fpsimd_state->fpsr);
}
static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
{
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
@ -169,21 +158,14 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
*/
static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
{
bool sve_guest, sve_host;
bool sve_guest;
u8 esr_ec;
u64 reg;
if (!system_supports_fpsimd())
return false;
if (system_supports_sve()) {
sve_guest = vcpu_has_sve(vcpu);
sve_host = vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE;
} else {
sve_guest = false;
sve_host = false;
}
sve_guest = vcpu_has_sve(vcpu);
esr_ec = kvm_vcpu_trap_get_class(vcpu);
/* Don't handle SVE traps for non-SVE vcpus here: */
@ -207,11 +189,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
isb();
if (vcpu->arch.flags & KVM_ARM64_FP_HOST) {
if (sve_host)
__hyp_sve_save_host(vcpu);
else
__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
}


@ -24,6 +24,11 @@ enum pkvm_page_state {
PKVM_PAGE_OWNED = 0ULL,
PKVM_PAGE_SHARED_OWNED = KVM_PGTABLE_PROT_SW0,
PKVM_PAGE_SHARED_BORROWED = KVM_PGTABLE_PROT_SW1,
__PKVM_PAGE_RESERVED = KVM_PGTABLE_PROT_SW0 |
KVM_PGTABLE_PROT_SW1,
/* Meta-states which aren't encoded directly in the PTE's SW bits */
PKVM_NOPAGE,
};
#define PKVM_PAGE_STATE_PROT_MASK (KVM_PGTABLE_PROT_SW0 | KVM_PGTABLE_PROT_SW1)
@ -50,6 +55,7 @@ extern const u8 pkvm_hyp_id;
int __pkvm_prot_finalize(void);
int __pkvm_host_share_hyp(u64 pfn);
int __pkvm_host_unshare_hyp(u64 pfn);
bool addr_is_memory(phys_addr_t phys);
int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);


@ -10,13 +10,8 @@
#include <nvhe/memory.h>
#include <nvhe/spinlock.h>
#define HYP_MEMBLOCK_REGIONS 128
extern struct memblock_region kvm_nvhe_sym(hyp_memory)[];
extern unsigned int kvm_nvhe_sym(hyp_memblock_nr);
extern struct kvm_pgtable pkvm_pgtable;
extern hyp_spinlock_t pkvm_pgd_lock;
extern struct hyp_pool hpool;
extern u64 __io_map_base;
int hyp_create_idmap(u32 hyp_va_bits);
int hyp_map_vectors(void);
@ -39,58 +34,4 @@ static inline void hyp_vmemmap_range(phys_addr_t phys, unsigned long size,
*end = ALIGN(*end, PAGE_SIZE);
}
static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
{
unsigned long total = 0, i;
/* Provision the worst case scenario */
for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) {
nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE);
total += nr_pages;
}
return total;
}
static inline unsigned long __hyp_pgtable_total_pages(void)
{
unsigned long res = 0, i;
/* Cover all of memory with page-granularity */
for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++) {
struct memblock_region *reg = &kvm_nvhe_sym(hyp_memory)[i];
res += __hyp_pgtable_max_pages(reg->size >> PAGE_SHIFT);
}
return res;
}
static inline unsigned long hyp_s1_pgtable_pages(void)
{
unsigned long res;
res = __hyp_pgtable_total_pages();
/* Allow 1 GiB for private mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
return res;
}
static inline unsigned long host_s2_pgtable_pages(void)
{
unsigned long res;
/*
* Include an extra 16 pages to safely upper-bound the worst case of
* concatenated pgds.
*/
res = __hyp_pgtable_total_pages() + 16;
/* Allow 1 GiB for MMIO mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
return res;
}
#endif /* __KVM_HYP_MM_H */


@ -43,6 +43,9 @@ void *hyp_early_alloc_page(void *arg)
return hyp_early_alloc_contig(1);
}
static void hyp_early_alloc_get_page(void *addr) { }
static void hyp_early_alloc_put_page(void *addr) { }
void hyp_early_alloc_init(void *virt, unsigned long size)
{
base = cur = (unsigned long)virt;
@ -51,4 +54,6 @@ void hyp_early_alloc_init(void *virt, unsigned long size)
hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
hyp_early_alloc_mm_ops.get_page = hyp_early_alloc_get_page;
hyp_early_alloc_mm_ops.put_page = hyp_early_alloc_put_page;
}


@ -147,6 +147,13 @@ static void handle___pkvm_host_share_hyp(struct kvm_cpu_context *host_ctxt)
cpu_reg(host_ctxt, 1) = __pkvm_host_share_hyp(pfn);
}
static void handle___pkvm_host_unshare_hyp(struct kvm_cpu_context *host_ctxt)
{
DECLARE_REG(u64, pfn, host_ctxt, 1);
cpu_reg(host_ctxt, 1) = __pkvm_host_unshare_hyp(pfn);
}
static void handle___pkvm_create_private_mapping(struct kvm_cpu_context *host_ctxt)
{
DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
@ -184,6 +191,7 @@ static const hcall_t host_hcall[] = {
HANDLE_FUNC(__pkvm_prot_finalize),
HANDLE_FUNC(__pkvm_host_share_hyp),
HANDLE_FUNC(__pkvm_host_unshare_hyp),
HANDLE_FUNC(__kvm_adjust_pc),
HANDLE_FUNC(__kvm_vcpu_run),
HANDLE_FUNC(__kvm_flush_vm_context),


@ -9,6 +9,7 @@
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_pgtable.h>
#include <asm/kvm_pkvm.h>
#include <asm/stage2_pgtable.h>
#include <hyp/fault.h>
@ -27,6 +28,26 @@ static struct hyp_pool host_s2_pool;
const u8 pkvm_hyp_id = 1;
static void host_lock_component(void)
{
hyp_spin_lock(&host_kvm.lock);
}
static void host_unlock_component(void)
{
hyp_spin_unlock(&host_kvm.lock);
}
static void hyp_lock_component(void)
{
hyp_spin_lock(&pkvm_pgd_lock);
}
static void hyp_unlock_component(void)
{
hyp_spin_unlock(&pkvm_pgd_lock);
}
static void *host_s2_zalloc_pages_exact(size_t size)
{
void *addr = hyp_alloc_pages(&host_s2_pool, get_order(size));
@ -103,19 +124,19 @@ int kvm_host_prepare_stage2(void *pgt_pool_base)
prepare_host_vtcr();
hyp_spin_lock_init(&host_kvm.lock);
mmu->arch = &host_kvm.arch;
ret = prepare_s2_pool(pgt_pool_base);
if (ret)
return ret;
ret = __kvm_pgtable_stage2_init(&host_kvm.pgt, &host_kvm.arch,
ret = __kvm_pgtable_stage2_init(&host_kvm.pgt, mmu,
&host_kvm.mm_ops, KVM_HOST_S2_FLAGS,
host_stage2_force_pte_cb);
if (ret)
return ret;
mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd);
mmu->arch = &host_kvm.arch;
mmu->pgt = &host_kvm.pgt;
WRITE_ONCE(mmu->vmid.vmid_gen, 0);
WRITE_ONCE(mmu->vmid.vmid, 0);
@ -338,102 +359,14 @@ static int host_stage2_idmap(u64 addr)
prot = is_memory ? PKVM_HOST_MEM_PROT : PKVM_HOST_MMIO_PROT;
hyp_spin_lock(&host_kvm.lock);
host_lock_component();
ret = host_stage2_adjust_range(addr, &range);
if (ret)
goto unlock;
ret = host_stage2_idmap_locked(range.start, range.end - range.start, prot);
unlock:
hyp_spin_unlock(&host_kvm.lock);
return ret;
}
static inline bool check_prot(enum kvm_pgtable_prot prot,
enum kvm_pgtable_prot required,
enum kvm_pgtable_prot denied)
{
return (prot & (required | denied)) == required;
}
int __pkvm_host_share_hyp(u64 pfn)
{
phys_addr_t addr = hyp_pfn_to_phys(pfn);
enum kvm_pgtable_prot prot, cur;
void *virt = __hyp_va(addr);
enum pkvm_page_state state;
kvm_pte_t pte;
int ret;
if (!addr_is_memory(addr))
return -EINVAL;
hyp_spin_lock(&host_kvm.lock);
hyp_spin_lock(&pkvm_pgd_lock);
ret = kvm_pgtable_get_leaf(&host_kvm.pgt, addr, &pte, NULL);
if (ret)
goto unlock;
if (!pte)
goto map_shared;
/*
* Check attributes in the host stage-2 PTE. We need the page to be:
* - mapped RWX as we're sharing memory;
* - not borrowed, as that implies absence of ownership.
* Otherwise, we can't let it got through
*/
cur = kvm_pgtable_stage2_pte_prot(pte);
prot = pkvm_mkstate(0, PKVM_PAGE_SHARED_BORROWED);
if (!check_prot(cur, PKVM_HOST_MEM_PROT, prot)) {
ret = -EPERM;
goto unlock;
}
state = pkvm_getstate(cur);
if (state == PKVM_PAGE_OWNED)
goto map_shared;
/*
* Tolerate double-sharing the same page, but this requires
* cross-checking the hypervisor stage-1.
*/
if (state != PKVM_PAGE_SHARED_OWNED) {
ret = -EPERM;
goto unlock;
}
ret = kvm_pgtable_get_leaf(&pkvm_pgtable, (u64)virt, &pte, NULL);
if (ret)
goto unlock;
/*
* If the page has been shared with the hypervisor, it must be
* already mapped as SHARED_BORROWED in its stage-1.
*/
cur = kvm_pgtable_hyp_pte_prot(pte);
prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_SHARED_BORROWED);
if (!check_prot(cur, prot, ~prot))
ret = -EPERM;
goto unlock;
map_shared:
/*
* If the page is not yet shared, adjust mappings in both page-tables
* while both locks are held.
*/
prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_SHARED_BORROWED);
ret = pkvm_create_mappings_locked(virt, virt + PAGE_SIZE, prot);
BUG_ON(ret);
prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
ret = host_stage2_idmap_locked(addr, PAGE_SIZE, prot);
BUG_ON(ret);
unlock:
hyp_spin_unlock(&pkvm_pgd_lock);
hyp_spin_unlock(&host_kvm.lock);
host_unlock_component();
return ret;
}
@ -451,3 +384,421 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
ret = host_stage2_idmap(addr);
BUG_ON(ret && ret != -EAGAIN);
}
/* This corresponds to locking order */
enum pkvm_component_id {
PKVM_ID_HOST,
PKVM_ID_HYP,
};
struct pkvm_mem_transition {
u64 nr_pages;
struct {
enum pkvm_component_id id;
/* Address in the initiator's address space */
u64 addr;
union {
struct {
/* Address in the completer's address space */
u64 completer_addr;
} host;
};
} initiator;
struct {
enum pkvm_component_id id;
} completer;
};
struct pkvm_mem_share {
const struct pkvm_mem_transition tx;
const enum kvm_pgtable_prot completer_prot;
};
struct check_walk_data {
enum pkvm_page_state desired;
enum pkvm_page_state (*get_page_state)(kvm_pte_t pte);
};
static int __check_page_state_visitor(u64 addr, u64 end, u32 level,
kvm_pte_t *ptep,
enum kvm_pgtable_walk_flags flag,
void * const arg)
{
struct check_walk_data *d = arg;
kvm_pte_t pte = *ptep;
if (kvm_pte_valid(pte) && !addr_is_memory(kvm_pte_to_phys(pte)))
return -EINVAL;
return d->get_page_state(pte) == d->desired ? 0 : -EPERM;
}
static int check_page_state_range(struct kvm_pgtable *pgt, u64 addr, u64 size,
struct check_walk_data *data)
{
struct kvm_pgtable_walker walker = {
.cb = __check_page_state_visitor,
.arg = data,
.flags = KVM_PGTABLE_WALK_LEAF,
};
return kvm_pgtable_walk(pgt, addr, size, &walker);
}
static enum pkvm_page_state host_get_page_state(kvm_pte_t pte)
{
if (!kvm_pte_valid(pte) && pte)
return PKVM_NOPAGE;
return pkvm_getstate(kvm_pgtable_stage2_pte_prot(pte));
}
static int __host_check_page_state_range(u64 addr, u64 size,
enum pkvm_page_state state)
{
struct check_walk_data d = {
.desired = state,
.get_page_state = host_get_page_state,
};
hyp_assert_lock_held(&host_kvm.lock);
return check_page_state_range(&host_kvm.pgt, addr, size, &d);
}
static int __host_set_page_state_range(u64 addr, u64 size,
enum pkvm_page_state state)
{
enum kvm_pgtable_prot prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, state);
return host_stage2_idmap_locked(addr, size, prot);
}
static int host_request_owned_transition(u64 *completer_addr,
const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
u64 addr = tx->initiator.addr;
*completer_addr = tx->initiator.host.completer_addr;
return __host_check_page_state_range(addr, size, PKVM_PAGE_OWNED);
}
static int host_request_unshare(u64 *completer_addr,
const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
u64 addr = tx->initiator.addr;
*completer_addr = tx->initiator.host.completer_addr;
return __host_check_page_state_range(addr, size, PKVM_PAGE_SHARED_OWNED);
}
static int host_initiate_share(u64 *completer_addr,
const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
u64 addr = tx->initiator.addr;
*completer_addr = tx->initiator.host.completer_addr;
return __host_set_page_state_range(addr, size, PKVM_PAGE_SHARED_OWNED);
}
static int host_initiate_unshare(u64 *completer_addr,
const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
u64 addr = tx->initiator.addr;
*completer_addr = tx->initiator.host.completer_addr;
return __host_set_page_state_range(addr, size, PKVM_PAGE_OWNED);
}
static enum pkvm_page_state hyp_get_page_state(kvm_pte_t pte)
{
if (!kvm_pte_valid(pte))
return PKVM_NOPAGE;
return pkvm_getstate(kvm_pgtable_stage2_pte_prot(pte));
}
static int __hyp_check_page_state_range(u64 addr, u64 size,
enum pkvm_page_state state)
{
struct check_walk_data d = {
.desired = state,
.get_page_state = hyp_get_page_state,
};
hyp_assert_lock_held(&pkvm_pgd_lock);
return check_page_state_range(&pkvm_pgtable, addr, size, &d);
}
static bool __hyp_ack_skip_pgtable_check(const struct pkvm_mem_transition *tx)
{
return !(IS_ENABLED(CONFIG_NVHE_EL2_DEBUG) ||
tx->initiator.id != PKVM_ID_HOST);
}
static int hyp_ack_share(u64 addr, const struct pkvm_mem_transition *tx,
enum kvm_pgtable_prot perms)
{
u64 size = tx->nr_pages * PAGE_SIZE;
if (perms != PAGE_HYP)
return -EPERM;
if (__hyp_ack_skip_pgtable_check(tx))
return 0;
return __hyp_check_page_state_range(addr, size, PKVM_NOPAGE);
}
static int hyp_ack_unshare(u64 addr, const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
if (__hyp_ack_skip_pgtable_check(tx))
return 0;
return __hyp_check_page_state_range(addr, size,
PKVM_PAGE_SHARED_BORROWED);
}
static int hyp_complete_share(u64 addr, const struct pkvm_mem_transition *tx,
enum kvm_pgtable_prot perms)
{
void *start = (void *)addr, *end = start + (tx->nr_pages * PAGE_SIZE);
enum kvm_pgtable_prot prot;
prot = pkvm_mkstate(perms, PKVM_PAGE_SHARED_BORROWED);
return pkvm_create_mappings_locked(start, end, prot);
}
static int hyp_complete_unshare(u64 addr, const struct pkvm_mem_transition *tx)
{
u64 size = tx->nr_pages * PAGE_SIZE;
int ret = kvm_pgtable_hyp_unmap(&pkvm_pgtable, addr, size);
return (ret != size) ? -EFAULT : 0;
}
static int check_share(struct pkvm_mem_share *share)
{
const struct pkvm_mem_transition *tx = &share->tx;
u64 completer_addr;
int ret;
switch (tx->initiator.id) {
case PKVM_ID_HOST:
ret = host_request_owned_transition(&completer_addr, tx);
break;
default:
ret = -EINVAL;
}
if (ret)
return ret;
switch (tx->completer.id) {
case PKVM_ID_HYP:
ret = hyp_ack_share(completer_addr, tx, share->completer_prot);
break;
default:
ret = -EINVAL;
}
return ret;
}
static int __do_share(struct pkvm_mem_share *share)
{
const struct pkvm_mem_transition *tx = &share->tx;
u64 completer_addr;
int ret;
switch (tx->initiator.id) {
case PKVM_ID_HOST:
ret = host_initiate_share(&completer_addr, tx);
break;
default:
ret = -EINVAL;
}
if (ret)
return ret;
switch (tx->completer.id) {
case PKVM_ID_HYP:
ret = hyp_complete_share(completer_addr, tx, share->completer_prot);
break;
default:
ret = -EINVAL;
}
return ret;
}
/*
* do_share():
*
* The page owner grants access to another component with a given set
* of permissions.
*
* Initiator: OWNED => SHARED_OWNED
* Completer: NOPAGE => SHARED_BORROWED
*/
static int do_share(struct pkvm_mem_share *share)
{
int ret;
ret = check_share(share);
if (ret)
return ret;
return WARN_ON(__do_share(share));
}
static int check_unshare(struct pkvm_mem_share *share)
{
const struct pkvm_mem_transition *tx = &share->tx;
u64 completer_addr;
int ret;
switch (tx->initiator.id) {
case PKVM_ID_HOST:
ret = host_request_unshare(&completer_addr, tx);
break;
default:
ret = -EINVAL;
}
if (ret)
return ret;
switch (tx->completer.id) {
case PKVM_ID_HYP:
ret = hyp_ack_unshare(completer_addr, tx);
break;
default:
ret = -EINVAL;
}
return ret;
}
static int __do_unshare(struct pkvm_mem_share *share)
{
const struct pkvm_mem_transition *tx = &share->tx;
u64 completer_addr;
int ret;
switch (tx->initiator.id) {
case PKVM_ID_HOST:
ret = host_initiate_unshare(&completer_addr, tx);
break;
default:
ret = -EINVAL;
}
if (ret)
return ret;
switch (tx->completer.id) {
case PKVM_ID_HYP:
ret = hyp_complete_unshare(completer_addr, tx);
break;
default:
ret = -EINVAL;
}
return ret;
}
/*
* do_unshare():
*
* The page owner revokes access from another component for a range of
* pages which were previously shared using do_share().
*
* Initiator: SHARED_OWNED => OWNED
* Completer: SHARED_BORROWED => NOPAGE
*/
static int do_unshare(struct pkvm_mem_share *share)
{
int ret;
ret = check_unshare(share);
if (ret)
return ret;
return WARN_ON(__do_unshare(share));
}
int __pkvm_host_share_hyp(u64 pfn)
{
int ret;
u64 host_addr = hyp_pfn_to_phys(pfn);
u64 hyp_addr = (u64)__hyp_va(host_addr);
struct pkvm_mem_share share = {
.tx = {
.nr_pages = 1,
.initiator = {
.id = PKVM_ID_HOST,
.addr = host_addr,
.host = {
.completer_addr = hyp_addr,
},
},
.completer = {
.id = PKVM_ID_HYP,
},
},
.completer_prot = PAGE_HYP,
};
host_lock_component();
hyp_lock_component();
ret = do_share(&share);
hyp_unlock_component();
host_unlock_component();
return ret;
}
int __pkvm_host_unshare_hyp(u64 pfn)
{
int ret;
u64 host_addr = hyp_pfn_to_phys(pfn);
u64 hyp_addr = (u64)__hyp_va(host_addr);
struct pkvm_mem_share share = {
.tx = {
.nr_pages = 1,
.initiator = {
.id = PKVM_ID_HOST,
.addr = host_addr,
.host = {
.completer_addr = hyp_addr,
},
},
.completer = {
.id = PKVM_ID_HYP,
},
},
.completer_prot = PAGE_HYP,
};
host_lock_component();
hyp_lock_component();
ret = do_unshare(&share);
hyp_unlock_component();
host_unlock_component();
return ret;
}
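
To make the ownership protocol above easier to follow, here is a minimal, self-contained userspace model of the state transitions documented in do_share() and do_unshare(). The state names mirror the pkvm_page_state values, but struct page_view, model_share() and model_unshare() are illustrative only and are not part of the kernel code in this series.

/*
 * Illustrative userspace model only -- not kernel code. It encodes the
 * transitions documented for do_share()/do_unshare() above:
 *   share:   initiator OWNED -> SHARED_OWNED, completer NOPAGE -> SHARED_BORROWED
 *   unshare: the exact reverse
 */
#include <stdio.h>

enum page_state { NOPAGE, OWNED, SHARED_OWNED, SHARED_BORROWED };

struct page_view {
	enum page_state initiator;	/* host's view of the page */
	enum page_state completer;	/* hyp's view of the page */
};

static int model_share(struct page_view *p)
{
	/* check_share() equivalent: both sides must be in the expected state */
	if (p->initiator != OWNED || p->completer != NOPAGE)
		return -1;
	p->initiator = SHARED_OWNED;
	p->completer = SHARED_BORROWED;
	return 0;
}

static int model_unshare(struct page_view *p)
{
	/* check_unshare() equivalent */
	if (p->initiator != SHARED_OWNED || p->completer != SHARED_BORROWED)
		return -1;
	p->initiator = OWNED;
	p->completer = NOPAGE;
	return 0;
}

int main(void)
{
	struct page_view page = { .initiator = OWNED, .completer = NOPAGE };

	if (model_share(&page))		/* first share succeeds */
		return 1;
	if (!model_share(&page))	/* sharing twice must be refused */
		return 1;
	if (model_unshare(&page))	/* unshare restores exclusive ownership */
		return 1;
	if (!model_unshare(&page))	/* nothing left to unshare */
		return 1;
	printf("pKVM page-state model: OK\n");
	return 0;
}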


@ -8,6 +8,7 @@
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_pgtable.h>
#include <asm/kvm_pkvm.h>
#include <asm/spectre.h>
#include <nvhe/early_alloc.h>
@ -18,11 +19,12 @@
struct kvm_pgtable pkvm_pgtable;
hyp_spinlock_t pkvm_pgd_lock;
u64 __io_map_base;
struct memblock_region hyp_memory[HYP_MEMBLOCK_REGIONS];
unsigned int hyp_memblock_nr;
static u64 __io_map_base;
static int __pkvm_create_mappings(unsigned long start, unsigned long size,
unsigned long phys, enum kvm_pgtable_prot prot)
{


@ -241,7 +241,7 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
int i;
hyp_spin_lock_init(&pool->lock);
pool->max_order = min(MAX_ORDER, get_order(nr_pages << PAGE_SHIFT));
pool->max_order = min(MAX_ORDER, get_order((nr_pages + 1) << PAGE_SHIFT));
for (i = 0; i < pool->max_order; i++)
INIT_LIST_HEAD(&pool->free_area[i]);
pool->range_start = phys;


@ -8,6 +8,7 @@
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_pgtable.h>
#include <asm/kvm_pkvm.h>
#include <nvhe/early_alloc.h>
#include <nvhe/fixed_config.h>
@ -17,7 +18,6 @@
#include <nvhe/mm.h>
#include <nvhe/trap_handler.h>
struct hyp_pool hpool;
unsigned long hyp_nr_cpus;
#define hyp_percpu_size ((unsigned long)__per_cpu_end - \
@ -27,6 +27,7 @@ static void *vmemmap_base;
static void *hyp_pgt_base;
static void *host_s2_pgt_base;
static struct kvm_pgtable_mm_ops pkvm_pgtable_mm_ops;
static struct hyp_pool hpool;
static int divide_memory_pool(void *virt, unsigned long size)
{
@ -165,6 +166,7 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level,
enum kvm_pgtable_walk_flags flag,
void * const arg)
{
struct kvm_pgtable_mm_ops *mm_ops = arg;
enum kvm_pgtable_prot prot;
enum pkvm_page_state state;
kvm_pte_t pte = *ptep;
@ -173,6 +175,15 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level,
if (!kvm_pte_valid(pte))
return 0;
/*
* Fix-up the refcount for the page-table pages as the early allocator
* was unable to access the hyp_vmemmap and so the buddy allocator has
* initialised the refcount to '1'.
*/
mm_ops->get_page(ptep);
if (flag != KVM_PGTABLE_WALK_LEAF)
return 0;
if (level != (KVM_PGTABLE_MAX_LEVELS - 1))
return -EINVAL;
@ -205,7 +216,8 @@ static int finalize_host_mappings(void)
{
struct kvm_pgtable_walker walker = {
.cb = finalize_host_mappings_walker,
.flags = KVM_PGTABLE_WALK_LEAF,
.flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
.arg = pkvm_pgtable.mm_ops,
};
int i, ret;
@ -240,19 +252,20 @@ void __noreturn __pkvm_init_finalise(void)
if (ret)
goto out;
ret = finalize_host_mappings();
if (ret)
goto out;
pkvm_pgtable_mm_ops = (struct kvm_pgtable_mm_ops) {
.zalloc_page = hyp_zalloc_hyp_page,
.phys_to_virt = hyp_phys_to_virt,
.virt_to_phys = hyp_virt_to_phys,
.get_page = hpool_get_page,
.put_page = hpool_put_page,
.page_count = hyp_page_count,
};
pkvm_pgtable.mm_ops = &pkvm_pgtable_mm_ops;
ret = finalize_host_mappings();
if (ret)
goto out;
out:
/*
* We tail-called to here from handle___pkvm_init() and will not return,


@ -25,7 +25,6 @@
#include <asm/fpsimd.h>
#include <asm/debug-monitors.h>
#include <asm/processor.h>
#include <asm/thread_info.h>
#include <nvhe/fixed_config.h>
#include <nvhe/mem_protect.h>


@ -383,21 +383,6 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte)
return prot;
}
static bool hyp_pte_needs_update(kvm_pte_t old, kvm_pte_t new)
{
/*
* Tolerate KVM recreating the exact same mapping, or changing software
* bits if the existing mapping was valid.
*/
if (old == new)
return false;
if (!kvm_pte_valid(old))
return true;
return !WARN_ON((old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW);
}
static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level,
kvm_pte_t *ptep, struct hyp_map_data *data)
{
@ -407,11 +392,16 @@ static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level,
if (!kvm_block_mapping_supported(addr, end, phys, level))
return false;
new = kvm_init_valid_leaf_pte(phys, data->attr, level);
if (hyp_pte_needs_update(old, new))
smp_store_release(ptep, new);
data->phys += granule;
new = kvm_init_valid_leaf_pte(phys, data->attr, level);
if (old == new)
return true;
if (!kvm_pte_valid(old))
data->mm_ops->get_page(ptep);
else if (WARN_ON((old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW))
return false;
smp_store_release(ptep, new);
return true;
}
@ -433,6 +423,7 @@ static int hyp_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
return -ENOMEM;
kvm_set_table_pte(ptep, childp, mm_ops);
mm_ops->get_page(ptep);
return 0;
}
@ -460,6 +451,69 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
return ret;
}
struct hyp_unmap_data {
u64 unmapped;
struct kvm_pgtable_mm_ops *mm_ops;
};
static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
enum kvm_pgtable_walk_flags flag, void * const arg)
{
kvm_pte_t pte = *ptep, *childp = NULL;
u64 granule = kvm_granule_size(level);
struct hyp_unmap_data *data = arg;
struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
if (!kvm_pte_valid(pte))
return -EINVAL;
if (kvm_pte_table(pte, level)) {
childp = kvm_pte_follow(pte, mm_ops);
if (mm_ops->page_count(childp) != 1)
return 0;
kvm_clear_pte(ptep);
dsb(ishst);
__tlbi_level(vae2is, __TLBI_VADDR(addr, 0), level);
} else {
if (end - addr < granule)
return -EINVAL;
kvm_clear_pte(ptep);
dsb(ishst);
__tlbi_level(vale2is, __TLBI_VADDR(addr, 0), level);
data->unmapped += granule;
}
dsb(ish);
isb();
mm_ops->put_page(ptep);
if (childp)
mm_ops->put_page(childp);
return 0;
}
u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
{
struct hyp_unmap_data unmap_data = {
.mm_ops = pgt->mm_ops,
};
struct kvm_pgtable_walker walker = {
.cb = hyp_unmap_walker,
.arg = &unmap_data,
.flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
};
if (!pgt->mm_ops->page_count)
return 0;
kvm_pgtable_walk(pgt, addr, size, &walker);
return unmap_data.unmapped;
}
int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
struct kvm_pgtable_mm_ops *mm_ops)
{
@ -482,8 +536,16 @@ static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
enum kvm_pgtable_walk_flags flag, void * const arg)
{
struct kvm_pgtable_mm_ops *mm_ops = arg;
kvm_pte_t pte = *ptep;
if (!kvm_pte_valid(pte))
return 0;
mm_ops->put_page(ptep);
if (kvm_pte_table(pte, level))
mm_ops->put_page(kvm_pte_follow(pte, mm_ops));
mm_ops->put_page((void *)kvm_pte_follow(*ptep, mm_ops));
return 0;
}
@ -491,7 +553,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
{
struct kvm_pgtable_walker walker = {
.cb = hyp_free_walker,
.flags = KVM_PGTABLE_WALK_TABLE_POST,
.flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
.arg = pgt->mm_ops,
};
@ -1116,13 +1178,13 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
}
int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
struct kvm_pgtable_mm_ops *mm_ops,
enum kvm_pgtable_stage2_flags flags,
kvm_pgtable_force_pte_cb_t force_pte_cb)
{
size_t pgd_sz;
u64 vtcr = arch->vtcr;
u64 vtcr = mmu->arch->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
@ -1135,7 +1197,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
pgt->ia_bits = ia_bits;
pgt->start_level = start_level;
pgt->mm_ops = mm_ops;
pgt->mmu = &arch->mmu;
pgt->mmu = mmu;
pgt->flags = flags;
pgt->force_pte_cb = force_pte_cb;


@ -24,7 +24,6 @@
#include <asm/fpsimd.h>
#include <asm/debug-monitors.h>
#include <asm/processor.h>
#include <asm/thread_info.h>
/* VHE specific context */
DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);


@ -210,13 +210,13 @@ static void stage2_flush_vm(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int idx;
int idx, bkt;
idx = srcu_read_lock(&kvm->srcu);
spin_lock(&kvm->mmu_lock);
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots)
kvm_for_each_memslot(memslot, bkt, slots)
stage2_flush_memslot(kvm, memslot);
spin_unlock(&kvm->mmu_lock);
@ -239,6 +239,9 @@ void free_hyp_pgds(void)
static bool kvm_host_owns_hyp_mappings(void)
{
if (is_kernel_in_hyp_mode())
return false;
if (static_branch_likely(&kvm_protected_mode_initialized))
return false;
@ -281,14 +284,117 @@ static phys_addr_t kvm_kaddr_to_phys(void *kaddr)
}
}
static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
struct hyp_shared_pfn {
u64 pfn;
int count;
struct rb_node node;
};
static DEFINE_MUTEX(hyp_shared_pfns_lock);
static struct rb_root hyp_shared_pfns = RB_ROOT;
static struct hyp_shared_pfn *find_shared_pfn(u64 pfn, struct rb_node ***node,
struct rb_node **parent)
{
phys_addr_t addr;
struct hyp_shared_pfn *this;
*node = &hyp_shared_pfns.rb_node;
*parent = NULL;
while (**node) {
this = container_of(**node, struct hyp_shared_pfn, node);
*parent = **node;
if (this->pfn < pfn)
*node = &((**node)->rb_left);
else if (this->pfn > pfn)
*node = &((**node)->rb_right);
else
return this;
}
return NULL;
}
static int share_pfn_hyp(u64 pfn)
{
struct rb_node **node, *parent;
struct hyp_shared_pfn *this;
int ret = 0;
mutex_lock(&hyp_shared_pfns_lock);
this = find_shared_pfn(pfn, &node, &parent);
if (this) {
this->count++;
goto unlock;
}
this = kzalloc(sizeof(*this), GFP_KERNEL);
if (!this) {
ret = -ENOMEM;
goto unlock;
}
this->pfn = pfn;
this->count = 1;
rb_link_node(&this->node, parent, node);
rb_insert_color(&this->node, &hyp_shared_pfns);
ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp, pfn, 1);
unlock:
mutex_unlock(&hyp_shared_pfns_lock);
return ret;
}
static int unshare_pfn_hyp(u64 pfn)
{
struct rb_node **node, *parent;
struct hyp_shared_pfn *this;
int ret = 0;
mutex_lock(&hyp_shared_pfns_lock);
this = find_shared_pfn(pfn, &node, &parent);
if (WARN_ON(!this)) {
ret = -ENOENT;
goto unlock;
}
this->count--;
if (this->count)
goto unlock;
rb_erase(&this->node, &hyp_shared_pfns);
kfree(this);
ret = kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp, pfn, 1);
unlock:
mutex_unlock(&hyp_shared_pfns_lock);
return ret;
}
int kvm_share_hyp(void *from, void *to)
{
phys_addr_t start, end, cur;
u64 pfn;
int ret;
for (addr = ALIGN_DOWN(start, PAGE_SIZE); addr < end; addr += PAGE_SIZE) {
ret = kvm_call_hyp_nvhe(__pkvm_host_share_hyp,
__phys_to_pfn(addr));
if (is_kernel_in_hyp_mode())
return 0;
/*
* The share hcall maps things in the 'fixed-offset' region of the hyp
* VA space, so we can only share physically contiguous data-structures
* for now.
*/
if (is_vmalloc_or_module_addr(from) || is_vmalloc_or_module_addr(to))
return -EINVAL;
if (kvm_host_owns_hyp_mappings())
return create_hyp_mappings(from, to, PAGE_HYP);
start = ALIGN_DOWN(__pa(from), PAGE_SIZE);
end = PAGE_ALIGN(__pa(to));
for (cur = start; cur < end; cur += PAGE_SIZE) {
pfn = __phys_to_pfn(cur);
ret = share_pfn_hyp(pfn);
if (ret)
return ret;
}
@ -296,6 +402,22 @@ static int pkvm_share_hyp(phys_addr_t start, phys_addr_t end)
return 0;
}
void kvm_unshare_hyp(void *from, void *to)
{
phys_addr_t start, end, cur;
u64 pfn;
if (is_kernel_in_hyp_mode() || kvm_host_owns_hyp_mappings() || !from)
return;
start = ALIGN_DOWN(__pa(from), PAGE_SIZE);
end = PAGE_ALIGN(__pa(to));
for (cur = start; cur < end; cur += PAGE_SIZE) {
pfn = __phys_to_pfn(cur);
WARN_ON(unshare_pfn_hyp(pfn));
}
}
/**
* create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
* @from: The virtual kernel start address of the range
@ -316,12 +438,8 @@ int create_hyp_mappings(void *from, void *to, enum kvm_pgtable_prot prot)
if (is_kernel_in_hyp_mode())
return 0;
if (!kvm_host_owns_hyp_mappings()) {
if (WARN_ON(prot != PAGE_HYP))
return -EPERM;
return pkvm_share_hyp(kvm_kaddr_to_phys(from),
kvm_kaddr_to_phys(to));
}
if (!kvm_host_owns_hyp_mappings())
return -EPERM;
start = start & PAGE_MASK;
end = PAGE_ALIGN(end);
@ -407,6 +525,9 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
unsigned long addr;
int ret;
if (is_protected_kvm_enabled())
return -EPERM;
*kaddr = ioremap(phys_addr, size);
if (!*kaddr)
return -ENOMEM;
@ -516,7 +637,8 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
if (!pgt)
return -ENOMEM;
err = kvm_pgtable_stage2_init(pgt, &kvm->arch, &kvm_s2_mm_ops);
mmu->arch = &kvm->arch;
err = kvm_pgtable_stage2_init(pgt, mmu, &kvm_s2_mm_ops);
if (err)
goto out_free_pgtable;
@ -529,7 +651,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
for_each_possible_cpu(cpu)
*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
mmu->arch = &kvm->arch;
mmu->pgt = pgt;
mmu->pgd_phys = __pa(pgt->pgd);
WRITE_ONCE(mmu->vmid.vmid_gen, 0);
@ -595,14 +716,14 @@ void stage2_unmap_vm(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int idx;
int idx, bkt;
idx = srcu_read_lock(&kvm->srcu);
mmap_read_lock(current->mm);
spin_lock(&kvm->mmu_lock);
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots)
kvm_for_each_memslot(memslot, bkt, slots)
stage2_unmap_memslot(kvm, memslot);
spin_unlock(&kvm->mmu_lock);
@ -650,6 +771,9 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
KVM_PGTABLE_PROT_R |
(writable ? KVM_PGTABLE_PROT_W : 0);
if (is_protected_kvm_enabled())
return -EPERM;
size += offset_in_page(guest_ipa);
guest_ipa &= PAGE_MASK;
@ -1463,7 +1587,6 @@ out:
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
@ -1473,25 +1596,24 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
* allocated dirty_bitmap[], dirty pages will be tracked while the
* memory slot is write protected.
*/
if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
if (change != KVM_MR_DELETE && new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
/*
* If we're with initial-all-set, we don't need to write
* protect any pages because they're all reported as dirty.
* Huge pages and normal pages will be write protect gradually.
*/
if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) {
kvm_mmu_wp_memory_region(kvm, mem->slot);
kvm_mmu_wp_memory_region(kvm, new->id);
}
}
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
hva_t hva = mem->userspace_addr;
hva_t reg_end = hva + mem->memory_size;
hva_t hva, reg_end;
int ret = 0;
if (change != KVM_MR_CREATE && change != KVM_MR_MOVE &&
@ -1502,9 +1624,12 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
* Prevent userspace from creating a memory region outside of the IPA
* space addressable by the KVM guest IPA space.
*/
if ((memslot->base_gfn + memslot->npages) > (kvm_phys_size(kvm) >> PAGE_SHIFT))
if ((new->base_gfn + new->npages) > (kvm_phys_size(kvm) >> PAGE_SHIFT))
return -EFAULT;
hva = new->userspace_addr;
reg_end = hva + (new->npages << PAGE_SHIFT);
mmap_read_lock(current->mm);
/*
* A memory region could potentially cover multiple VMAs, and any holes
@ -1536,7 +1661,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
if (vma->vm_flags & VM_PFNMAP) {
/* IO region dirty page logging not allowed */
if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
ret = -EINVAL;
break;
}
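
For readers tracking the new host-side kvm_share_hyp()/kvm_unshare_hyp() interface introduced in the hunks above, a rough sketch of how a caller is expected to pair the two helpers is shown below. The function names, buffer and error handling are made up for illustration; the real in-tree user is the sve_state handling in the reset.c hunk further down.

/* Illustrative pairing of the new helpers; hypothetical caller, not from this series. */
static void *example_alloc_shared_with_hyp(size_t size)
{
	void *buf = kzalloc(size, GFP_KERNEL_ACCOUNT);

	if (!buf)
		return NULL;

	/* The buffer must be physically contiguous, hence kzalloc(), not vmalloc(). */
	if (kvm_share_hyp(buf, buf + size)) {
		kfree(buf);
		return NULL;
	}
	return buf;
}

static void example_free_shared_with_hyp(void *buf, size_t size)
{
	if (!buf)
		return;

	/* Undo the share before handing the memory back to the allocator. */
	kvm_unshare_hyp(buf, buf + size);
	kfree(buf);
}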


@ -8,10 +8,9 @@
#include <linux/memblock.h>
#include <linux/sort.h>
#include <asm/kvm_host.h>
#include <asm/kvm_pkvm.h>
#include <nvhe/memory.h>
#include <nvhe/mm.h>
#include "hyp_constants.h"
static struct memblock_region *hyp_memory = kvm_nvhe_sym(hyp_memory);
static unsigned int *hyp_memblock_nr_ptr = &kvm_nvhe_sym(hyp_memblock_nr);
@ -82,7 +81,8 @@ void __init kvm_hyp_reserve(void)
do {
prev = nr_pages;
nr_pages = hyp_mem_pages + prev;
nr_pages = DIV_ROUND_UP(nr_pages * sizeof(struct hyp_page), PAGE_SIZE);
nr_pages = DIV_ROUND_UP(nr_pages * STRUCT_HYP_PAGE_SIZE,
PAGE_SIZE);
nr_pages += __hyp_pgtable_max_pages(nr_pages);
} while (nr_pages != prev);
hyp_mem_pages += nr_pages;


@ -30,6 +30,7 @@ static u32 kvm_pmu_event_mask(struct kvm *kvm)
case ID_AA64DFR0_PMUVER_8_1:
case ID_AA64DFR0_PMUVER_8_4:
case ID_AA64DFR0_PMUVER_8_5:
case ID_AA64DFR0_PMUVER_8_7:
return GENMASK(15, 0);
default: /* Shouldn't be here, just for sanity */
WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
@ -902,7 +903,7 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
*/
static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
{
int i;
unsigned long i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm) {


@ -46,7 +46,7 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
* specification (ARM DEN 0022A). This means all suspend states
* for KVM will preserve the register state.
*/
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
return PSCI_RET_SUCCESS;
@ -109,7 +109,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
/*
* Make sure the reset request is observed if the change to
* power_state is observed.
* power_off is observed.
*/
smp_wmb();
@ -121,8 +121,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
{
int i, matching_cpus = 0;
unsigned long mpidr;
int matching_cpus = 0;
unsigned long i, mpidr;
unsigned long target_affinity;
unsigned long target_affinity_mask;
unsigned long lowest_affinity_level;
@ -164,7 +164,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
{
int i;
unsigned long i;
struct kvm_vcpu *tmp;
/*


@ -94,22 +94,31 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
{
void *buf;
unsigned int vl;
size_t reg_sz;
int ret;
vl = vcpu->arch.sve_max_vl;
/*
* Responsibility for these properties is shared between
* kvm_arm_init_arch_resources(), kvm_vcpu_enable_sve() and
* kvm_arm_init_sve(), kvm_vcpu_enable_sve() and
* set_sve_vls(). Double-check here just to be sure:
*/
if (WARN_ON(!sve_vl_valid(vl) || vl > sve_max_virtualisable_vl() ||
vl > VL_ARCH_MAX))
return -EIO;
buf = kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL_ACCOUNT);
reg_sz = vcpu_sve_state_size(vcpu);
buf = kzalloc(reg_sz, GFP_KERNEL_ACCOUNT);
if (!buf)
return -ENOMEM;
ret = kvm_share_hyp(buf, buf + reg_sz);
if (ret) {
kfree(buf);
return ret;
}
vcpu->arch.sve_state = buf;
vcpu->arch.flags |= KVM_ARM64_VCPU_SVE_FINALIZED;
return 0;
@ -141,7 +150,13 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
{
kfree(vcpu->arch.sve_state);
void *sve_state = vcpu->arch.sve_state;
kvm_vcpu_unshare_task_fp(vcpu);
kvm_unshare_hyp(vcpu, vcpu + 1);
if (sve_state)
kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
kfree(sve_state);
}
static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
@ -170,7 +185,7 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
{
struct kvm_vcpu *tmp;
bool is32bit;
int i;
unsigned long i;
is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1) && is32bit)
@ -193,10 +208,9 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
* kvm_reset_vcpu - sets core registers and sys_regs to reset value
* @vcpu: The VCPU pointer
*
* This function finds the right table above and sets the registers on
* the virtual CPU struct to their architecturally defined reset
* values, except for registers whose reset is deferred until
* kvm_arm_vcpu_finalize().
* This function sets the registers on the virtual CPU struct to their
* architecturally defined reset values, except for registers whose reset is
* deferred until kvm_arm_vcpu_finalize().
*
* Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
* ioctl or as part of handling a request issued by another VCPU in the PSCI


@ -70,8 +70,9 @@ void kvm_vgic_early_init(struct kvm *kvm)
*/
int kvm_vgic_create(struct kvm *kvm, u32 type)
{
int i, ret;
struct kvm_vcpu *vcpu;
unsigned long i;
int ret;
if (irqchip_in_kernel(kvm))
return -EEXIST;
@ -91,7 +92,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
return ret;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.has_run_once)
if (vcpu_has_run_once(vcpu))
goto out_unlock;
}
ret = 0;
@ -255,7 +256,8 @@ int vgic_init(struct kvm *kvm)
{
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
int ret = 0, i, idx;
int ret = 0, i;
unsigned long idx;
if (vgic_initialized(kvm))
return 0;
@ -308,7 +310,7 @@ int vgic_init(struct kvm *kvm)
goto out;
}
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_for_each_vcpu(idx, vcpu, kvm)
kvm_vgic_vcpu_enable(vcpu);
ret = kvm_vgic_setup_default_irq_routing(kvm);
@ -370,7 +372,7 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
static void __kvm_vgic_destroy(struct kvm *kvm)
{
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
vgic_debug_destroy(kvm);


@ -325,7 +325,7 @@ void unlock_all_vcpus(struct kvm *kvm)
bool lock_all_vcpus(struct kvm *kvm)
{
struct kvm_vcpu *tmp_vcpu;
int c;
unsigned long c;
/*
* Any time a vcpu is run, vcpu_load is called which tries to grab the


@ -113,9 +113,8 @@ static void vgic_mmio_write_sgir(struct kvm_vcpu *source_vcpu,
int intid = val & 0xf;
int targets = (val >> 16) & 0xff;
int mode = (val >> 24) & 0x03;
int c;
struct kvm_vcpu *vcpu;
unsigned long flags;
unsigned long flags, c;
switch (mode) {
case 0x0: /* as specified by targets */


@ -754,7 +754,8 @@ static void vgic_unregister_redist_iodev(struct kvm_vcpu *vcpu)
static int vgic_register_all_redist_iodevs(struct kvm *kvm)
{
struct kvm_vcpu *vcpu;
int c, ret = 0;
unsigned long c;
int ret = 0;
kvm_for_each_vcpu(c, vcpu, kvm) {
ret = vgic_register_redist_iodev(vcpu);
@ -763,10 +764,12 @@ static int vgic_register_all_redist_iodevs(struct kvm *kvm)
}
if (ret) {
/* The current c failed, so we start with the previous one. */
/* The current c failed, so iterate over the previous ones. */
int i;
mutex_lock(&kvm->slots_lock);
for (c--; c >= 0; c--) {
vcpu = kvm_get_vcpu(kvm, c);
for (i = 0; i < c; i++) {
vcpu = kvm_get_vcpu(kvm, i);
vgic_unregister_redist_iodev(vcpu);
}
mutex_unlock(&kvm->slots_lock);
@ -995,10 +998,10 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, bool allow_group1)
struct kvm_vcpu *c_vcpu;
u16 target_cpus;
u64 mpidr;
int sgi, c;
int sgi;
int vcpu_id = vcpu->vcpu_id;
bool broadcast;
unsigned long flags;
unsigned long c, flags;
sgi = (reg & ICC_SGI1R_SGI_ID_MASK) >> ICC_SGI1R_SGI_ID_SHIFT;
broadcast = reg & BIT_ULL(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);


@ -1050,7 +1050,7 @@ static int dispatch_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
return 0;
}
struct kvm_io_device_ops kvm_io_gic_ops = {
const struct kvm_io_device_ops kvm_io_gic_ops = {
.read = dispatch_mmio_read,
.write = dispatch_mmio_write,
};


@ -34,7 +34,7 @@ struct vgic_register_region {
};
};
extern struct kvm_io_device_ops kvm_io_gic_ops;
extern const struct kvm_io_device_ops kvm_io_gic_ops;
#define VGIC_ACCESS_8bit 1
#define VGIC_ACCESS_32bit 2


@ -293,12 +293,12 @@ int vgic_v2_map_resources(struct kvm *kvm)
if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
kvm_err("Need to set vgic cpu and dist addresses first\n");
kvm_debug("Need to set vgic cpu and dist addresses first\n");
return -ENXIO;
}
if (!vgic_v2_check_base(dist->vgic_dist_base, dist->vgic_cpu_base)) {
kvm_err("VGIC CPU and dist frames overlap\n");
kvm_debug("VGIC CPU and dist frames overlap\n");
return -EINVAL;
}
@ -345,6 +345,11 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
int ret;
u32 vtr;
if (is_protected_kvm_enabled()) {
kvm_err("GICv2 not supported in protected mode\n");
return -ENXIO;
}
if (!info->vctrl.start) {
kvm_err("GICH not present in the firmware table\n");
return -ENXIO;


@ -542,24 +542,24 @@ int vgic_v3_map_resources(struct kvm *kvm)
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
int ret = 0;
int c;
unsigned long c;
kvm_for_each_vcpu(c, vcpu, kvm) {
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
if (IS_VGIC_ADDR_UNDEF(vgic_cpu->rd_iodev.base_addr)) {
kvm_debug("vcpu %d redistributor base not set\n", c);
kvm_debug("vcpu %ld redistributor base not set\n", c);
return -ENXIO;
}
}
if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base)) {
kvm_err("Need to set vgic distributor addresses first\n");
kvm_debug("Need to set vgic distributor addresses first\n");
return -ENXIO;
}
if (!vgic_v3_check_base(kvm)) {
kvm_err("VGIC redist and dist frames overlap\n");
kvm_debug("VGIC redist and dist frames overlap\n");
return -EINVAL;
}
@ -651,7 +651,7 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
} else if (!PAGE_ALIGNED(info->vcpu.start)) {
pr_warn("GICV physical address 0x%llx not page aligned\n",
(unsigned long long)info->vcpu.start);
} else {
} else if (kvm_get_mode() != KVM_MODE_PROTECTED) {
kvm_vgic_global_state.vcpu_base = info->vcpu.start;
kvm_vgic_global_state.can_emulate_gicv2 = true;
ret = kvm_register_vgic_device(KVM_DEV_TYPE_ARM_VGIC_V2);


@ -189,7 +189,7 @@ void vgic_v4_configure_vsgis(struct kvm *kvm)
{
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
kvm_arm_halt_guest(kvm);
@ -235,7 +235,8 @@ int vgic_v4_init(struct kvm *kvm)
{
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
int i, nr_vcpus, ret;
int nr_vcpus, ret;
unsigned long i;
if (!kvm_vgic_global_state.has_gicv4)
return 0; /* Nothing to see here... move along. */


@ -990,7 +990,7 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
void vgic_kick_vcpus(struct kvm *kvm)
{
struct kvm_vcpu *vcpu;
int c;
unsigned long c;
/*
* We've injected an interrupt, time to find out who deserves


@ -898,7 +898,6 @@ static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
#define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLB
int kvm_arch_flush_remote_tlb(struct kvm *kvm);


@ -27,6 +27,7 @@ config KVM
select KVM_MMIO
select MMU_NOTIFIER
select SRCU
select INTERVAL_TREE
help
Support for hosting Guest kernels.


@ -2,9 +2,10 @@
# Makefile for KVM support for MIPS
#
include $(srctree)/virt/kvm/Makefile.kvm
ccflags-y += -Ivirt/kvm -Iarch/mips/kvm
kvm-y := $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o eventfd.o binary_stats.o)
kvm-$(CONFIG_CPU_HAS_MSA) += msa.o
kvm-y += mips.o emulate.o entry.o \


@ -952,7 +952,7 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu)
if (!vcpu->arch.pending_exceptions) {
kvm_vz_lose_htimer(vcpu);
vcpu->arch.wait = 1;
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
/*
* We we are runnable, then definitely go off to user space to


@ -120,7 +120,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
s->status |= data;
irq.cpu = id;
irq.irq = 6;
kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
break;
case CORE0_CLEAR_OFF:
@ -128,7 +128,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
if (!s->status) {
irq.cpu = id;
irq.irq = -6;
kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
}
break;


@ -171,25 +171,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
return 0;
}
void kvm_mips_free_vcpus(struct kvm *kvm)
{
unsigned int i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm) {
kvm_vcpu_destroy(vcpu);
}
mutex_lock(&kvm->lock);
for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
kvm->vcpus[i] = NULL;
atomic_set(&kvm->online_vcpus, 0);
mutex_unlock(&kvm->lock);
}
static void kvm_mips_free_gpa_pt(struct kvm *kvm)
{
/* It should always be safe to remove after flushing the whole range */
@ -199,7 +180,7 @@ static void kvm_mips_free_gpa_pt(struct kvm *kvm)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
kvm_mips_free_vcpus(kvm);
kvm_destroy_vcpus(kvm);
kvm_mips_free_gpa_pt(kvm);
}
@ -233,25 +214,20 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
return 0;
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
int needs_flush;
kvm_debug("%s: kvm: %p slot: %d, GPA: %llx, size: %llx, QVA: %llx\n",
__func__, kvm, mem->slot, mem->guest_phys_addr,
mem->memory_size, mem->userspace_addr);
/*
* If dirty page logging is enabled, write protect all pages in the slot
* ready for dirty logging.
@ -498,7 +474,7 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
if (irq->cpu == -1)
dvcpu = vcpu;
else
dvcpu = vcpu->kvm->vcpus[irq->cpu];
dvcpu = kvm_get_vcpu(vcpu->kvm, irq->cpu);
if (intr == 2 || intr == 3 || intr == 4 || intr == 6) {
kvm_mips_callbacks->queue_io_int(dvcpu, irq);


@ -752,6 +752,7 @@ struct kvm_vcpu_arch {
u8 irq_pending; /* Used by XIVE to signal pending guest irqs */
u32 last_inst;
struct rcuwait wait;
struct rcuwait *waitp;
struct kvmppc_vcore *vcore;
int ret;
@ -867,6 +868,5 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_exit(void) {}
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
#endif /* __POWERPC_KVM_HOST_H__ */


@ -200,12 +200,11 @@ extern void kvmppc_core_destroy_vm(struct kvm *kvm);
extern void kvmppc_core_free_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot);
extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change);
extern void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change);
extern int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm,
@ -274,12 +273,11 @@ struct kvmppc_ops {
int (*get_dirty_log)(struct kvm *kvm, struct kvm_dirty_log *log);
void (*flush_memslot)(struct kvm *kvm, struct kvm_memory_slot *memslot);
int (*prepare_memory_region)(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change);
void (*commit_memory_region)(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change);
bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);


@ -26,6 +26,7 @@ config KVM
select KVM_VFIO
select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_BYPASS
select INTERVAL_TREE
config KVM_BOOK3S_HANDLER
bool


@ -4,11 +4,8 @@
#
ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm
KVM := ../../../virt/kvm
common-objs-y = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
common-objs-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o
common-objs-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o
include $(srctree)/virt/kvm/Makefile.kvm
common-objs-y += powerpc.o emulate_loadstore.o
obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
@ -125,9 +122,8 @@ kvm-book3s_32-objs := \
kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
kvm-objs-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(KVM)/irqchip.o
kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
kvm-y += $(kvm-objs-m) $(kvm-objs-y)
obj-$(CONFIG_KVM_E500V2) += kvm.o
obj-$(CONFIG_KVM_E500MC) += kvm.o


@ -847,21 +847,19 @@ void kvmppc_core_flush_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot)
}
int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
return kvm->arch.kvm_ops->prepare_memory_region(kvm, memslot, mem,
change);
return kvm->arch.kvm_ops->prepare_memory_region(kvm, old, new, change);
}
void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
kvm->arch.kvm_ops->commit_memory_region(kvm, mem, old, new, change);
kvm->arch.kvm_ops->commit_memory_region(kvm, old, new, change);
}
bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)


@ -337,7 +337,7 @@ static void kvmppc_mmu_book3s_32_mtsrin(struct kvm_vcpu *vcpu, u32 srnum,
static void kvmppc_mmu_book3s_32_tlbie(struct kvm_vcpu *vcpu, ulong ea, bool large)
{
int i;
unsigned long i;
struct kvm_vcpu *v;
/* flush this VA on all cpus */


@ -530,7 +530,7 @@ static void kvmppc_mmu_book3s_64_tlbie(struct kvm_vcpu *vcpu, ulong va,
bool large)
{
u64 mask = 0xFFFFFFFFFULL;
long i;
unsigned long i;
struct kvm_vcpu *v;
dprintk("KVM MMU: tlbie(0x%lx)\n", va);


@ -734,11 +734,11 @@ void kvmppc_rmap_reset(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int srcu_idx;
int srcu_idx, bkt;
srcu_idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
kvm_for_each_memslot(memslot, bkt, slots) {
/* Mutual exclusion with kvm_unmap_hva_range etc. */
spin_lock(&kvm->mmu_lock);
/*


@ -2084,7 +2084,7 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
*/
if ((new_lpcr & LPCR_ILE) != (vc->lpcr & LPCR_ILE)) {
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.vcore != vc)
@ -4798,8 +4798,8 @@ static int kvm_vm_ioctl_get_dirty_log_hv(struct kvm *kvm,
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int i, r;
unsigned long n;
int r;
unsigned long n, i;
unsigned long *buf, *p;
struct kvm_vcpu *vcpu;
@ -4866,41 +4866,38 @@ static void kvmppc_core_free_memslot_hv(struct kvm_memory_slot *slot)
}
static int kvmppc_core_prepare_memory_region_hv(struct kvm *kvm,
struct kvm_memory_slot *slot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
unsigned long npages = mem->memory_size >> PAGE_SHIFT;
if (change == KVM_MR_CREATE) {
unsigned long size = array_size(npages, sizeof(*slot->arch.rmap));
unsigned long size = array_size(new->npages, sizeof(*new->arch.rmap));
if ((size >> PAGE_SHIFT) > totalram_pages())
return -ENOMEM;
slot->arch.rmap = vzalloc(size);
if (!slot->arch.rmap)
new->arch.rmap = vzalloc(size);
if (!new->arch.rmap)
return -ENOMEM;
} else if (change != KVM_MR_DELETE) {
new->arch.rmap = old->arch.rmap;
}
return 0;
}
static void kvmppc_core_commit_memory_region_hv(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
unsigned long npages = mem->memory_size >> PAGE_SHIFT;
/*
* If we are making a new memslot, it might make
* If we are creating or modifying a memslot, it might make
* some address that was previously cached as emulated
* MMIO be no longer emulated MMIO, so invalidate
* all the caches of emulated MMIO translations.
*/
if (npages)
if (change != KVM_MR_DELETE)
atomic64_inc(&kvm->arch.mmio_update);
/*
@ -5901,7 +5898,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
int mmu_was_ready;
int srcu_idx;
int ret = 0;
int i;
unsigned long i;
if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
return ret;
@ -5923,11 +5920,12 @@ static int kvmhv_svm_off(struct kvm *kvm)
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
struct kvm_memory_slot *memslot;
struct kvm_memslots *slots = __kvm_memslots(kvm, i);
int bkt;
if (!slots)
continue;
kvm_for_each_memslot(memslot, slots) {
kvm_for_each_memslot(memslot, bkt, slots) {
kvmppc_uvmem_drop_pages(memslot, kvm, true);
uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
}


@ -747,7 +747,7 @@ void kvmhv_release_all_nested(struct kvm *kvm)
struct kvm_nested_guest *gp;
struct kvm_nested_guest *freelist = NULL;
struct kvm_memory_slot *memslot;
int srcu_idx;
int srcu_idx, bkt;
spin_lock(&kvm->mmu_lock);
for (i = 0; i <= kvm->arch.max_nested_lpid; i++) {
@ -768,7 +768,7 @@ void kvmhv_release_all_nested(struct kvm *kvm)
}
srcu_idx = srcu_read_lock(&kvm->srcu);
kvm_for_each_memslot(memslot, kvm_memslots(kvm))
kvm_for_each_memslot(memslot, bkt, kvm_memslots(kvm))
kvmhv_free_memslot_nest_rmap(memslot);
srcu_read_unlock(&kvm->srcu, srcu_idx);
}


@ -459,7 +459,7 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot, *m;
int ret = H_SUCCESS;
int srcu_idx;
int srcu_idx, bkt;
kvm->arch.secure_guest = KVMPPC_SECURE_INIT_START;
@ -478,7 +478,7 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
/* register the memslot */
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
kvm_for_each_memslot(memslot, bkt, slots) {
ret = __kvmppc_uvmem_memslot_create(kvm, memslot);
if (ret)
break;
@ -486,7 +486,7 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
if (ret) {
slots = kvm_memslots(kvm);
kvm_for_each_memslot(m, slots) {
kvm_for_each_memslot(m, bkt, slots) {
if (m == memslot)
break;
__kvmppc_uvmem_memslot_delete(kvm, memslot);
@ -647,7 +647,7 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
{
int srcu_idx;
int srcu_idx, bkt;
struct kvm_memory_slot *memslot;
/*
@ -662,7 +662,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
srcu_idx = srcu_read_lock(&kvm->srcu);
kvm_for_each_memslot(memslot, kvm_memslots(kvm))
kvm_for_each_memslot(memslot, bkt, kvm_memslots(kvm))
kvmppc_uvmem_drop_pages(memslot, kvm, false);
srcu_read_unlock(&kvm->srcu, srcu_idx);
@ -821,7 +821,7 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
int srcu_idx;
int srcu_idx, bkt;
long ret = H_SUCCESS;
if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
@ -830,7 +830,7 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
/* migrate any unmoved normal pfn to device pfns*/
srcu_idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
kvm_for_each_memslot(memslot, bkt, slots) {
ret = kvmppc_uv_migrate_mem_slot(kvm, memslot);
if (ret) {
/*


@ -428,7 +428,7 @@ static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
/************* MMU Notifiers *************/
static bool do_kvm_unmap_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
{
long i;
unsigned long i;
struct kvm_vcpu *vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
@ -492,7 +492,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
if (msr & MSR_POW) {
if (!vcpu->arch.pending_exceptions) {
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.generic.halt_wakeup++;
@ -1899,16 +1899,15 @@ static void kvmppc_core_flush_memslot_pr(struct kvm *kvm,
}
static int kvmppc_core_prepare_memory_region_pr(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
return 0;
}
static void kvmppc_core_commit_memory_region_pr(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{


@ -376,7 +376,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
return kvmppc_h_pr_stuff_tce(vcpu);
case H_CEDE:
kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.generic.halt_wakeup++;
return EMULATE_DONE;


@ -942,8 +942,8 @@ static int xics_debug_show(struct seq_file *m, void *private)
struct kvmppc_xics *xics = m->private;
struct kvm *kvm = xics->kvm;
struct kvm_vcpu *vcpu;
int icsid, i;
unsigned long flags;
int icsid;
unsigned long flags, i;
unsigned long t_rm_kick_vcpu, t_rm_check_resend;
unsigned long t_rm_notify_eoi;
unsigned long t_reject, t_check_resend;
@ -1340,7 +1340,7 @@ static int xics_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
static void kvmppc_xics_release(struct kvm_device *dev)
{
struct kvmppc_xics *xics = dev->private;
int i;
unsigned long i;
struct kvm *kvm = xics->kvm;
struct kvm_vcpu *vcpu;


@ -116,7 +116,7 @@ static inline struct kvmppc_icp *kvmppc_xics_find_server(struct kvm *kvm,
u32 nr)
{
struct kvm_vcpu *vcpu = NULL;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.icp && nr == vcpu->arch.icp->server_num)


@ -368,7 +368,8 @@ static int xive_check_provisioning(struct kvm *kvm, u8 prio)
{
struct kvmppc_xive *xive = kvm->arch.xive;
struct kvm_vcpu *vcpu;
int i, rc;
unsigned long i;
int rc;
lockdep_assert_held(&xive->lock);
@ -439,7 +440,8 @@ static int xive_try_pick_queue(struct kvm_vcpu *vcpu, u8 prio)
int kvmppc_xive_select_target(struct kvm *kvm, u32 *server, u8 prio)
{
struct kvm_vcpu *vcpu;
int i, rc;
unsigned long i;
int rc;
/* Locate target server */
vcpu = kvmppc_xive_find_server(kvm, *server);
@ -1519,7 +1521,8 @@ static void xive_pre_save_queue(struct kvmppc_xive *xive, struct xive_q *q)
static void xive_pre_save_scan(struct kvmppc_xive *xive)
{
struct kvm_vcpu *vcpu = NULL;
int i, j;
unsigned long i;
int j;
/*
* See comment in xive_get_source() about how this
@ -1700,7 +1703,7 @@ static bool xive_check_delayed_irq(struct kvmppc_xive *xive, u32 irq)
{
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu = NULL;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@ -2037,7 +2040,7 @@ static void kvmppc_xive_release(struct kvm_device *dev)
struct kvmppc_xive *xive = dev->private;
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
pr_devel("Releasing xive device\n");
@ -2291,7 +2294,7 @@ static int xive_debug_show(struct seq_file *m, void *private)
u64 t_vm_h_cppr = 0;
u64 t_vm_h_eoi = 0;
u64 t_vm_h_ipi = 0;
unsigned int i;
unsigned long i;
if (!kvm)
return 0;


@ -199,7 +199,7 @@ struct kvmppc_xive_vcpu {
static inline struct kvm_vcpu *kvmppc_xive_find_server(struct kvm *kvm, u32 nr)
{
struct kvm_vcpu *vcpu = NULL;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.xive_vcpu && nr == vcpu->arch.xive_vcpu->server_num)
@ -240,7 +240,7 @@ static inline u32 kvmppc_xive_vp(struct kvmppc_xive *xive, u32 server)
static inline bool kvmppc_xive_vp_in_use(struct kvm *kvm, u32 vp_id)
{
struct kvm_vcpu *vcpu = NULL;
int i;
unsigned long i;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.xive_vcpu && vp_id == vcpu->arch.xive_vcpu->vp_id)


@ -807,7 +807,7 @@ static int kvmppc_xive_reset(struct kvmppc_xive *xive)
{
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu;
unsigned int i;
unsigned long i;
pr_devel("%s\n", __func__);
@ -916,7 +916,7 @@ static int kvmppc_xive_native_eq_sync(struct kvmppc_xive *xive)
{
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu;
unsigned int i;
unsigned long i;
pr_devel("%s\n", __func__);
@ -1017,7 +1017,7 @@ static void kvmppc_xive_native_release(struct kvm_device *dev)
struct kvmppc_xive *xive = dev->private;
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu;
int i;
unsigned long i;
pr_devel("Releasing xive native device\n");
@ -1214,7 +1214,7 @@ static int xive_native_debug_show(struct seq_file *m, void *private)
struct kvmppc_xive *xive = m->private;
struct kvm *kvm = xive->kvm;
struct kvm_vcpu *vcpu;
unsigned int i;
unsigned long i;
if (!kvm)
return 0;


@ -718,7 +718,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
if (vcpu->arch.shared->msr & MSR_WE) {
local_irq_enable();
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
hard_irq_disable();
@ -1821,16 +1821,15 @@ void kvmppc_core_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
}
int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
return 0;
}
void kvmppc_core_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{


@ -65,7 +65,7 @@ static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, int rb)
ulong param = vcpu->arch.regs.gpr[rb];
int prio = dbell2prio(rb);
int pir = param & PPC_DBELL_PIR_MASK;
int i;
unsigned long i;
struct kvm_vcpu *cvcpu;
if (prio < 0)


@ -236,7 +236,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
break;
case EV_HCALL_TOKEN(EV_IDLE):
r = EV_SUCCESS;
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
break;
default:
@ -463,9 +463,6 @@ err_out:
void kvm_arch_destroy_vm(struct kvm *kvm)
{
unsigned int i;
struct kvm_vcpu *vcpu;
#ifdef CONFIG_KVM_XICS
/*
* We call kick_all_cpus_sync() to ensure that all
@ -476,14 +473,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kick_all_cpus_sync();
#endif
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_vcpu_destroy(vcpu);
kvm_destroy_vcpus(kvm);
mutex_lock(&kvm->lock);
for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
kvm->vcpus[i] = NULL;
atomic_set(&kvm->online_vcpus, 0);
kvmppc_core_destroy_vm(kvm);
@ -706,20 +698,19 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
return kvmppc_core_prepare_memory_region(kvm, memslot, mem, change);
return kvmppc_core_prepare_memory_region(kvm, old, new, change);
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
kvmppc_core_commit_memory_region(kvm, mem, old, new, change);
kvmppc_core_commit_memory_region(kvm, old, new, change);
}
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
@ -762,7 +753,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
if (err)
goto out_vcpu_uninit;
vcpu->arch.waitp = &vcpu->wait;
rcuwait_init(&vcpu->arch.wait);
vcpu->arch.waitp = &vcpu->arch.wait;
kvmppc_create_vcpu_debugfs(vcpu, vcpu->vcpu_id);
return 0;


@ -77,13 +77,6 @@ struct kvm_sbi_context {
int return_handled;
};
#define KVM_MMU_PAGE_CACHE_NR_OBJS 32
struct kvm_mmu_page_cache {
int nobjs;
void *objects[KVM_MMU_PAGE_CACHE_NR_OBJS];
};
struct kvm_cpu_trap {
unsigned long sepc;
unsigned long scause;
@ -193,7 +186,7 @@ struct kvm_vcpu_arch {
struct kvm_sbi_context sbi_context;
/* Cache pages needed to program page tables with spinlock held */
struct kvm_mmu_page_cache mmu_page_cache;
struct kvm_mmu_memory_cache mmu_page_cache;
/* VCPU power-off state */
bool power_off;
@ -208,7 +201,6 @@ struct kvm_vcpu_arch {
static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
#define KVM_ARCH_WANT_MMU_NOTIFIER
@ -221,12 +213,12 @@ void __kvm_riscv_hfence_gvma_all(void);
int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu,
struct kvm_memory_slot *memslot,
gpa_t gpa, unsigned long hva, bool is_write);
void kvm_riscv_stage2_flush_cache(struct kvm_vcpu *vcpu);
int kvm_riscv_stage2_alloc_pgd(struct kvm *kvm);
void kvm_riscv_stage2_free_pgd(struct kvm *kvm);
void kvm_riscv_stage2_update_hgatp(struct kvm_vcpu *vcpu);
void kvm_riscv_stage2_mode_detect(void);
unsigned long kvm_riscv_stage2_mode(void);
int kvm_riscv_stage2_gpa_bits(void);
void kvm_riscv_stage2_vmid_detect(void);
unsigned long kvm_riscv_stage2_vmid_bits(void);


@ -2,6 +2,6 @@
#ifndef _ASM_RISCV_KVM_TYPES_H
#define _ASM_RISCV_KVM_TYPES_H
#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40
#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 32
#endif /* _ASM_RISCV_KVM_TYPES_H */


@ -0,0 +1,33 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/**
* Copyright (c) 2021 Western Digital Corporation or its affiliates.
*
* Authors:
* Atish Patra <atish.patra@wdc.com>
*/
#ifndef __RISCV_KVM_VCPU_SBI_H__
#define __RISCV_KVM_VCPU_SBI_H__
#define KVM_SBI_IMPID 3
#define KVM_SBI_VERSION_MAJOR 0
#define KVM_SBI_VERSION_MINOR 2
struct kvm_vcpu_sbi_extension {
unsigned long extid_start;
unsigned long extid_end;
/**
* SBI extension handler. It can be defined for a given extension or group of
* extension. But it should always return linux error codes rather than SBI
* specific error codes.
*/
int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val, struct kvm_cpu_trap *utrap,
bool *exit);
};
void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(unsigned long extid);
#endif /* __RISCV_KVM_VCPU_SBI_H__ */
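
To illustrate how this new interface is meant to be consumed, here is a hedged sketch of declaring a single extension. The handler name kvm_sbi_ext_demo_handler and the chosen extension ID range are hypothetical and not among the extensions added by this series (the real implementations live in the vcpu_sbi_*.c objects added to the Makefile below); only the struct layout and handler signature come from the header above.

/* Hypothetical example only; not an extension added by this series. */
static int kvm_sbi_ext_demo_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
				    unsigned long *out_val,
				    struct kvm_cpu_trap *utrap, bool *exit)
{
	/* Hand a value back to the guest and report success with a Linux errno. */
	*out_val = 0;
	return 0;
}

static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_demo = {
	.extid_start = SBI_EXT_EXPERIMENTAL_START,
	.extid_end = SBI_EXT_EXPERIMENTAL_START,
	.handler = kvm_sbi_ext_demo_handler,
};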


@ -27,6 +27,14 @@ enum sbi_ext_id {
SBI_EXT_IPI = 0x735049,
SBI_EXT_RFENCE = 0x52464E43,
SBI_EXT_HSM = 0x48534D,
/* Experimentals extensions must lie within this range */
SBI_EXT_EXPERIMENTAL_START = 0x08000000,
SBI_EXT_EXPERIMENTAL_END = 0x08FFFFFF,
/* Vendor extensions must lie within this range */
SBI_EXT_VENDOR_START = 0x09000000,
SBI_EXT_VENDOR_END = 0x09FFFFFF,
};
enum sbi_ext_base_fid {
@ -82,6 +90,7 @@ enum sbi_hsm_hart_status {
#define SBI_ERR_INVALID_PARAM -3
#define SBI_ERR_DENIED -4
#define SBI_ERR_INVALID_ADDRESS -5
#define SBI_ERR_ALREADY_AVAILABLE -6
extern unsigned long sbi_spec_version;
struct sbiret {


@ -5,14 +5,10 @@
ccflags-y += -I $(srctree)/$(src)
KVM := ../../../virt/kvm
include $(srctree)/virt/kvm/Makefile.kvm
obj-$(CONFIG_KVM) += kvm.o
kvm-y += $(KVM)/kvm_main.o
kvm-y += $(KVM)/coalesced_mmio.o
kvm-y += $(KVM)/binary_stats.o
kvm-y += $(KVM)/eventfd.o
kvm-y += main.o
kvm-y += vm.o
kvm-y += vmid.o
@ -23,4 +19,8 @@ kvm-y += vcpu_exit.o
kvm-y += vcpu_fp.o
kvm-y += vcpu_switch.o
kvm-y += vcpu_sbi.o
kvm-$(CONFIG_RISCV_SBI_V01) += vcpu_sbi_v01.o
kvm-y += vcpu_sbi_base.o
kvm-y += vcpu_sbi_replace.o
kvm-y += vcpu_sbi_hsm.o
kvm-y += vcpu_timer.o


@ -58,6 +58,14 @@ int kvm_arch_hardware_enable(void)
void kvm_arch_hardware_disable(void)
{
/*
* After clearing the hideleg CSR, the host kernel will receive
* spurious interrupts if hvip CSR has pending interrupts and the
* corresponding enable bits in vsie CSR are asserted. To avoid it,
* hvip CSR and vsie CSR must be cleared before clearing hideleg CSR.
*/
csr_write(CSR_VSIE, 0);
csr_write(CSR_HVIP, 0);
csr_write(CSR_HEDELEG, 0);
csr_write(CSR_HIDELEG, 0);
}


@ -83,43 +83,6 @@ static int stage2_level_to_page_size(u32 level, unsigned long *out_pgsize)
return 0;
}
static int stage2_cache_topup(struct kvm_mmu_page_cache *pcache,
int min, int max)
{
void *page;
BUG_ON(max > KVM_MMU_PAGE_CACHE_NR_OBJS);
if (pcache->nobjs >= min)
return 0;
while (pcache->nobjs < max) {
page = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
if (!page)
return -ENOMEM;
pcache->objects[pcache->nobjs++] = page;
}
return 0;
}
static void stage2_cache_flush(struct kvm_mmu_page_cache *pcache)
{
while (pcache && pcache->nobjs)
free_page((unsigned long)pcache->objects[--pcache->nobjs]);
}
static void *stage2_cache_alloc(struct kvm_mmu_page_cache *pcache)
{
void *p;
if (!pcache)
return NULL;
BUG_ON(!pcache->nobjs);
p = pcache->objects[--pcache->nobjs];
return p;
}
static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
pte_t **ptepp, u32 *ptep_level)
{
@ -171,7 +134,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
}
static int stage2_set_pte(struct kvm *kvm, u32 level,
struct kvm_mmu_page_cache *pcache,
struct kvm_mmu_memory_cache *pcache,
gpa_t addr, const pte_t *new_pte)
{
u32 current_level = stage2_pgd_levels - 1;
@ -186,7 +149,9 @@ static int stage2_set_pte(struct kvm *kvm, u32 level,
return -EEXIST;
if (!pte_val(*ptep)) {
next_ptep = stage2_cache_alloc(pcache);
if (!pcache)
return -ENOMEM;
next_ptep = kvm_mmu_memory_cache_alloc(pcache);
if (!next_ptep)
return -ENOMEM;
*ptep = pfn_pte(PFN_DOWN(__pa(next_ptep)),
@ -209,7 +174,7 @@ static int stage2_set_pte(struct kvm *kvm, u32 level,
}
static int stage2_map_page(struct kvm *kvm,
struct kvm_mmu_page_cache *pcache,
struct kvm_mmu_memory_cache *pcache,
gpa_t gpa, phys_addr_t hpa,
unsigned long page_size,
bool page_rdonly, bool page_exec)
@ -384,7 +349,10 @@ static int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
int ret = 0;
unsigned long pfn;
phys_addr_t addr, end;
struct kvm_mmu_page_cache pcache = { 0, };
struct kvm_mmu_memory_cache pcache;
memset(&pcache, 0, sizeof(pcache));
pcache.gfp_zero = __GFP_ZERO;
end = (gpa + size + PAGE_SIZE - 1) & PAGE_MASK;
pfn = __phys_to_pfn(hpa);
@@ -395,9 +363,7 @@ static int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
if (!writable)
pte = pte_wrprotect(pte);
ret = stage2_cache_topup(&pcache,
stage2_pgd_levels,
KVM_MMU_PAGE_CACHE_NR_OBJS);
ret = kvm_mmu_topup_memory_cache(&pcache, stage2_pgd_levels);
if (ret)
goto out;
@@ -411,7 +377,7 @@ static int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
}
out:
stage2_cache_flush(&pcache);
kvm_mmu_free_memory_cache(&pcache);
return ret;
}
@@ -462,7 +428,6 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
@@ -472,18 +437,18 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
* allocated dirty_bitmap[], dirty pages will be tracked while
* the memory slot is write protected.
*/
if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
stage2_wp_memory_region(kvm, mem->slot);
if (change != KVM_MR_DELETE && new->flags & KVM_MEM_LOG_DIRTY_PAGES)
stage2_wp_memory_region(kvm, new->id);
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
const struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
hva_t hva = mem->userspace_addr;
hva_t reg_end = hva + mem->memory_size;
bool writable = !(mem->flags & KVM_MEM_READONLY);
hva_t hva, reg_end, size;
gpa_t base_gpa;
bool writable;
int ret = 0;
if (change != KVM_MR_CREATE && change != KVM_MR_MOVE &&
@@ -494,10 +459,16 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
* Prevent userspace from creating a memory region outside of the GPA
* space addressable by the KVM guest GPA space.
*/
if ((memslot->base_gfn + memslot->npages) >=
if ((new->base_gfn + new->npages) >=
(stage2_gpa_size >> PAGE_SHIFT))
return -EFAULT;
hva = new->userspace_addr;
size = new->npages << PAGE_SHIFT;
reg_end = hva + size;
base_gpa = new->base_gfn << PAGE_SHIFT;
writable = !(new->flags & KVM_MEM_READONLY);
mmap_read_lock(current->mm);
/*
@@ -533,15 +504,14 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
vm_end = min(reg_end, vma->vm_end);
if (vma->vm_flags & VM_PFNMAP) {
gpa_t gpa = mem->guest_phys_addr +
(vm_start - mem->userspace_addr);
gpa_t gpa = base_gpa + (vm_start - hva);
phys_addr_t pa;
pa = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
pa += vm_start - vma->vm_start;
/* IO region dirty page logging not allowed */
if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
ret = -EINVAL;
goto out;
}
@@ -559,8 +529,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
spin_lock(&kvm->mmu_lock);
if (ret)
stage2_unmap_range(kvm, mem->guest_phys_addr,
mem->memory_size, false);
stage2_unmap_range(kvm, base_gpa, size, false);
spin_unlock(&kvm->mmu_lock);
out:
@@ -646,7 +615,7 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu,
gfn_t gfn = gpa >> PAGE_SHIFT;
struct vm_area_struct *vma;
struct kvm *kvm = vcpu->kvm;
struct kvm_mmu_page_cache *pcache = &vcpu->arch.mmu_page_cache;
struct kvm_mmu_memory_cache *pcache = &vcpu->arch.mmu_page_cache;
bool logging = (memslot->dirty_bitmap &&
!(memslot->flags & KVM_MEM_READONLY)) ? true : false;
unsigned long vma_pagesize, mmu_seq;
@@ -681,8 +650,7 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu,
}
/* We need minimum second+third level pages */
ret = stage2_cache_topup(pcache, stage2_pgd_levels,
KVM_MMU_PAGE_CACHE_NR_OBJS);
ret = kvm_mmu_topup_memory_cache(pcache, stage2_pgd_levels);
if (ret) {
kvm_err("Failed to topup stage2 cache\n");
return ret;
@@ -731,11 +699,6 @@ out_unlock:
return ret;
}
void kvm_riscv_stage2_flush_cache(struct kvm_vcpu *vcpu)
{
stage2_cache_flush(&vcpu->arch.mmu_page_cache);
}
int kvm_riscv_stage2_alloc_pgd(struct kvm *kvm)
{
struct page *pgd_page;
@@ -806,3 +769,8 @@ unsigned long kvm_riscv_stage2_mode(void)
{
return stage2_mode >> HGATP_MODE_SHIFT;
}
int kvm_riscv_stage2_gpa_bits(void)
{
return stage2_gpa_bits;
}
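
The mmu.c hunks above drop the bespoke stage2 page cache in favour of the
common kvm_mmu_memory_cache helpers. A minimal usage sketch of that shared
API, following the same pattern the patch uses (the function below is
illustrative, not taken from the diff):

#include <linux/kvm_host.h>

/* Illustrative only: pre-fill outside the lock, consume under it. */
static int example_topup_and_alloc(struct kvm *kvm, int min_pages)
{
    struct kvm_mmu_memory_cache pcache;
    void *new_table;
    int ret;

    memset(&pcache, 0, sizeof(pcache));
    pcache.gfp_zero = __GFP_ZERO;        /* hand out zeroed pages */

    /* May sleep, so it runs before kvm->mmu_lock is taken. */
    ret = kvm_mmu_topup_memory_cache(&pcache, min_pages);
    if (ret)
        return ret;

    spin_lock(&kvm->mmu_lock);
    /* Cannot fail as long as the earlier topup succeeded. */
    new_table = kvm_mmu_memory_cache_alloc(&pcache);
    spin_unlock(&kvm->mmu_lock);

    /* Any unused pre-allocated pages are handed back here. */
    kvm_mmu_free_memory_cache(&pcache);

    return new_table ? 0 : -ENOMEM;
}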

@@ -53,6 +53,17 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
struct kvm_vcpu_csr *reset_csr = &vcpu->arch.guest_reset_csr;
struct kvm_cpu_context *cntx = &vcpu->arch.guest_context;
struct kvm_cpu_context *reset_cntx = &vcpu->arch.guest_reset_context;
bool loaded;
/**
* The preemption should be disabled here because it races with
* kvm_sched_out/kvm_sched_in(called from preempt notifiers) which
* also calls vcpu_load/put.
*/
get_cpu();
loaded = (vcpu->cpu != -1);
if (loaded)
kvm_arch_vcpu_put(vcpu);
memcpy(csr, reset_csr, sizeof(*csr));
@@ -64,6 +75,11 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
WRITE_ONCE(vcpu->arch.irqs_pending, 0);
WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
/* Reset the guest CSRs for hotplug usecase */
if (loaded)
kvm_arch_vcpu_load(vcpu, smp_processor_id());
put_cpu();
}
int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
@@ -77,6 +93,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
/* Mark this VCPU never ran */
vcpu->arch.ran_atleast_once = false;
vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
/* Setup ISA features available to VCPU */
vcpu->arch.isa = riscv_isa_extension_base(NULL) & KVM_RISCV_ISA_ALLOWED;
@@ -100,6 +117,13 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
/**
* vcpu with id 0 is the designated boot cpu.
* Keep all vcpus with non-zero id in power-off state so that
* they can be brought up using SBI HSM extension.
*/
if (vcpu->vcpu_idx != 0)
kvm_riscv_vcpu_power_off(vcpu);
}
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
@@ -107,8 +131,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
/* Cleanup VCPU timer */
kvm_riscv_vcpu_timer_deinit(vcpu);
/* Flush the pages pre-allocated for Stage2 page table mappings */
kvm_riscv_stage2_flush_cache(vcpu);
/* Free unused pages pre-allocated for Stage2 page table mappings */
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
}
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)

@@ -146,7 +146,7 @@ static int system_opcode_insn(struct kvm_vcpu *vcpu,
vcpu->stat.wfi_exit_stat++;
if (!kvm_arch_vcpu_runnable(vcpu)) {
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->arch.srcu_idx);
kvm_vcpu_block(vcpu);
kvm_vcpu_halt(vcpu);
vcpu->arch.srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
kvm_clear_request(KVM_REQ_UNHALT, vcpu);
}

@@ -26,7 +26,7 @@ void kvm_riscv_vcpu_fp_reset(struct kvm_vcpu *vcpu)
cntx->sstatus |= SR_FS_OFF;
}
void kvm_riscv_vcpu_fp_clean(struct kvm_cpu_context *cntx)
static void kvm_riscv_vcpu_fp_clean(struct kvm_cpu_context *cntx)
{
cntx->sstatus &= ~SR_FS;
cntx->sstatus |= SR_FS_CLEAN;

@@ -9,15 +9,58 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <asm/csr.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_timer.h>
#include <asm/kvm_vcpu_sbi.h>
#define SBI_VERSION_MAJOR 0
#define SBI_VERSION_MINOR 1
static int kvm_linux_err_map_sbi(int err)
{
switch (err) {
case 0:
return SBI_SUCCESS;
case -EPERM:
return SBI_ERR_DENIED;
case -EINVAL:
return SBI_ERR_INVALID_PARAM;
case -EFAULT:
return SBI_ERR_INVALID_ADDRESS;
case -EOPNOTSUPP:
return SBI_ERR_NOT_SUPPORTED;
case -EALREADY:
return SBI_ERR_ALREADY_AVAILABLE;
default:
return SBI_ERR_FAILURE;
};
}
static void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu,
struct kvm_run *run)
#ifdef CONFIG_RISCV_SBI_V01
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01;
#else
static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
.extid_start = -1UL,
.extid_end = -1UL,
.handler = NULL,
};
#endif
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
&vcpu_sbi_ext_v01,
&vcpu_sbi_ext_base,
&vcpu_sbi_ext_time,
&vcpu_sbi_ext_ipi,
&vcpu_sbi_ext_rfence,
&vcpu_sbi_ext_hsm,
&vcpu_sbi_ext_experimental,
&vcpu_sbi_ext_vendor,
};
void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
@@ -55,131 +98,73 @@ int kvm_riscv_vcpu_sbi_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
return 0;
}
#ifdef CONFIG_RISCV_SBI_V01
static void kvm_sbi_system_shutdown(struct kvm_vcpu *vcpu,
struct kvm_run *run, u32 type)
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(unsigned long extid)
{
int i;
struct kvm_vcpu *tmp;
int i = 0;
kvm_for_each_vcpu(i, tmp, vcpu->kvm)
tmp->arch.power_off = true;
kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) {
if (sbi_ext[i]->extid_start <= extid &&
sbi_ext[i]->extid_end >= extid)
return sbi_ext[i];
}
memset(&run->system_event, 0, sizeof(run->system_event));
run->system_event.type = type;
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
return NULL;
}
int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
ulong hmask;
int i, ret = 1;
u64 next_cycle;
struct kvm_vcpu *rvcpu;
int ret = 1;
bool next_sepc = true;
struct cpumask cm, hm;
struct kvm *kvm = vcpu->kvm;
struct kvm_cpu_trap utrap = { 0 };
bool userspace_exit = false;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
const struct kvm_vcpu_sbi_extension *sbi_ext;
struct kvm_cpu_trap utrap = { 0 };
unsigned long out_val = 0;
bool ext_is_v01 = false;
if (!cp)
return -EINVAL;
switch (cp->a7) {
case SBI_EXT_0_1_CONSOLE_GETCHAR:
case SBI_EXT_0_1_CONSOLE_PUTCHAR:
/*
* The CONSOLE_GETCHAR/CONSOLE_PUTCHAR SBI calls cannot be
* handled in kernel so we forward these to user-space
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
next_sepc = false;
ret = 0;
break;
case SBI_EXT_0_1_SET_TIMER:
#if __riscv_xlen == 32
next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
#else
next_cycle = (u64)cp->a0;
sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
if (sbi_ext && sbi_ext->handler) {
#ifdef CONFIG_RISCV_SBI_V01
if (cp->a7 >= SBI_EXT_0_1_SET_TIMER &&
cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
ext_is_v01 = true;
#endif
kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
break;
case SBI_EXT_0_1_CLEAR_IPI:
kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_SOFT);
break;
case SBI_EXT_0_1_SEND_IPI:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
&utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
if (utrap.scause) {
utrap.sepc = cp->sepc;
kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
next_sepc = false;
break;
}
for_each_set_bit(i, &hmask, BITS_PER_LONG) {
rvcpu = kvm_get_vcpu_by_id(vcpu->kvm, i);
kvm_riscv_vcpu_set_interrupt(rvcpu, IRQ_VS_SOFT);
}
break;
case SBI_EXT_0_1_SHUTDOWN:
kvm_sbi_system_shutdown(vcpu, run, KVM_SYSTEM_EVENT_SHUTDOWN);
next_sepc = false;
ret = 0;
break;
case SBI_EXT_0_1_REMOTE_FENCE_I:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
&utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
if (utrap.scause) {
utrap.sepc = cp->sepc;
kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
next_sepc = false;
break;
}
cpumask_clear(&cm);
for_each_set_bit(i, &hmask, BITS_PER_LONG) {
rvcpu = kvm_get_vcpu_by_id(vcpu->kvm, i);
if (rvcpu->cpu < 0)
continue;
cpumask_set_cpu(rvcpu->cpu, &cm);
}
riscv_cpuid_to_hartid_mask(&cm, &hm);
if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
sbi_remote_fence_i(cpumask_bits(&hm));
else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
sbi_remote_hfence_vvma(cpumask_bits(&hm),
cp->a1, cp->a2);
else
sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
cp->a1, cp->a2, cp->a3);
break;
default:
ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
} else {
/* Return error for unsupported SBI calls */
cp->a0 = SBI_ERR_NOT_SUPPORTED;
break;
goto ecall_done;
}
/* Handle special error cases i.e trap, exit or userspace forward */
if (utrap.scause) {
/* No need to increment sepc or exit ioctl loop */
ret = 1;
utrap.sepc = cp->sepc;
kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
next_sepc = false;
goto ecall_done;
}
/* Exit ioctl loop or Propagate the error code the guest */
if (userspace_exit) {
next_sepc = false;
ret = 0;
} else {
/**
* SBI extension handler always returns an Linux error code. Convert
* it to the SBI specific error code that can be propagated the SBI
* caller.
*/
ret = kvm_linux_err_map_sbi(ret);
cp->a0 = ret;
ret = 1;
}
ecall_done:
if (next_sepc)
cp->sepc += 4;
if (!ext_is_v01)
cp->a1 = out_val;
return ret;
}
#else
int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
kvm_riscv_vcpu_sbi_forward(vcpu, run);
return 0;
}
#endif

@@ -0,0 +1,99 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2021 Western Digital Corporation or its affiliates.
*
* Authors:
* Atish Patra <atish.patra@wdc.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <asm/csr.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_timer.h>
#include <asm/kvm_vcpu_sbi.h>
static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *trap, bool *exit)
{
int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
struct sbiret ecall_ret;
switch (cp->a6) {
case SBI_EXT_BASE_GET_SPEC_VERSION:
*out_val = (KVM_SBI_VERSION_MAJOR <<
SBI_SPEC_VERSION_MAJOR_SHIFT) |
KVM_SBI_VERSION_MINOR;
break;
case SBI_EXT_BASE_GET_IMP_ID:
*out_val = KVM_SBI_IMPID;
break;
case SBI_EXT_BASE_GET_IMP_VERSION:
*out_val = 0;
break;
case SBI_EXT_BASE_PROBE_EXT:
if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
cp->a0 <= SBI_EXT_EXPERIMENTAL_END) ||
(cp->a0 >= SBI_EXT_VENDOR_START &&
cp->a0 <= SBI_EXT_VENDOR_END)) {
/*
* For experimental/vendor extensions
* forward it to the userspace
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
*exit = true;
} else
*out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
break;
case SBI_EXT_BASE_GET_MVENDORID:
case SBI_EXT_BASE_GET_MARCHID:
case SBI_EXT_BASE_GET_MIMPID:
ecall_ret = sbi_ecall(SBI_EXT_BASE, cp->a6, 0, 0, 0, 0, 0, 0);
if (!ecall_ret.error)
*out_val = ecall_ret.value;
/*TODO: We are unnecessarily converting the error twice */
ret = sbi_err_map_linux_errno(ecall_ret.error);
break;
default:
ret = -EOPNOTSUPP;
break;
}
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
.extid_start = SBI_EXT_BASE,
.extid_end = SBI_EXT_BASE,
.handler = kvm_sbi_ext_base_handler,
};
static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap,
bool *exit)
{
/*
* Both SBI experimental and vendor extensions are
* unconditionally forwarded to userspace.
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
*exit = true;
return 0;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental = {
.extid_start = SBI_EXT_EXPERIMENTAL_START,
.extid_end = SBI_EXT_EXPERIMENTAL_END,
.handler = kvm_sbi_ext_forward_handler,
};
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor = {
.extid_start = SBI_EXT_VENDOR_START,
.extid_end = SBI_EXT_VENDOR_END,
.handler = kvm_sbi_ext_forward_handler,
};

@@ -0,0 +1,105 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2021 Western Digital Corporation or its affiliates.
*
* Authors:
* Atish Patra <atish.patra@wdc.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <asm/csr.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_sbi.h>
static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *reset_cntx;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
struct kvm_vcpu *target_vcpu;
unsigned long target_vcpuid = cp->a0;
target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
if (!target_vcpu)
return -EINVAL;
if (!target_vcpu->arch.power_off)
return -EALREADY;
reset_cntx = &target_vcpu->arch.guest_reset_context;
/* start address */
reset_cntx->sepc = cp->a1;
/* target vcpu id to start */
reset_cntx->a0 = target_vcpuid;
/* private data passed from kernel */
reset_cntx->a1 = cp->a2;
kvm_make_request(KVM_REQ_VCPU_RESET, target_vcpu);
kvm_riscv_vcpu_power_on(target_vcpu);
return 0;
}
static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
{
if (vcpu->arch.power_off)
return -EINVAL;
kvm_riscv_vcpu_power_off(vcpu);
return 0;
}
static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
{
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long target_vcpuid = cp->a0;
struct kvm_vcpu *target_vcpu;
target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
if (!target_vcpu)
return -EINVAL;
if (!target_vcpu->arch.power_off)
return SBI_HSM_HART_STATUS_STARTED;
else
return SBI_HSM_HART_STATUS_STOPPED;
}
static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap,
bool *exit)
{
int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
struct kvm *kvm = vcpu->kvm;
unsigned long funcid = cp->a6;
switch (funcid) {
case SBI_EXT_HSM_HART_START:
mutex_lock(&kvm->lock);
ret = kvm_sbi_hsm_vcpu_start(vcpu);
mutex_unlock(&kvm->lock);
break;
case SBI_EXT_HSM_HART_STOP:
ret = kvm_sbi_hsm_vcpu_stop(vcpu);
break;
case SBI_EXT_HSM_HART_STATUS:
ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
if (ret >= 0) {
*out_val = ret;
ret = 0;
}
break;
default:
ret = -EOPNOTSUPP;
}
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
.extid_start = SBI_EXT_HSM,
.extid_end = SBI_EXT_HSM,
.handler = kvm_sbi_ext_hsm_handler,
};
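
For context, the HSM handler above services calls that a guest issues through
the regular SBI ecall interface. A hedged guest-side sketch of starting a
secondary hart (the helper name and its parameters are illustrative, not part
of the patch):

#include <asm/sbi.h>

static int example_hsm_hart_start(unsigned long hartid,
                                  unsigned long start_addr,
                                  unsigned long opaque)
{
    struct sbiret ret;

    ret = sbi_ecall(SBI_EXT_HSM, SBI_EXT_HSM_HART_START,
                    hartid, start_addr, opaque, 0, 0, 0);

    return ret.error ? sbi_err_map_linux_errno(ret.error) : 0;
}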

@@ -0,0 +1,135 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2021 Western Digital Corporation or its affiliates.
*
* Authors:
* Atish Patra <atish.patra@wdc.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <asm/csr.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_timer.h>
#include <asm/kvm_vcpu_sbi.h>
static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap, bool *exit)
{
int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
u64 next_cycle;
if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
return -EINVAL;
#if __riscv_xlen == 32
next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
#else
next_cycle = (u64)cp->a0;
#endif
kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
.extid_start = SBI_EXT_TIME,
.extid_end = SBI_EXT_TIME,
.handler = kvm_sbi_ext_time_handler,
};
static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap, bool *exit)
{
int ret = 0;
unsigned long i;
struct kvm_vcpu *tmp;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long hmask = cp->a0;
unsigned long hbase = cp->a1;
if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
return -EINVAL;
kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
if (hbase != -1UL) {
if (tmp->vcpu_id < hbase)
continue;
if (!(hmask & (1UL << (tmp->vcpu_id - hbase))))
continue;
}
ret = kvm_riscv_vcpu_set_interrupt(tmp, IRQ_VS_SOFT);
if (ret < 0)
break;
}
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
.extid_start = SBI_EXT_IPI,
.extid_end = SBI_EXT_IPI,
.handler = kvm_sbi_ext_ipi_handler,
};
static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap, bool *exit)
{
int ret = 0;
unsigned long i;
struct cpumask cm, hm;
struct kvm_vcpu *tmp;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long hmask = cp->a0;
unsigned long hbase = cp->a1;
unsigned long funcid = cp->a6;
cpumask_clear(&cm);
cpumask_clear(&hm);
kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
if (hbase != -1UL) {
if (tmp->vcpu_id < hbase)
continue;
if (!(hmask & (1UL << (tmp->vcpu_id - hbase))))
continue;
}
if (tmp->cpu < 0)
continue;
cpumask_set_cpu(tmp->cpu, &cm);
}
riscv_cpuid_to_hartid_mask(&cm, &hm);
switch (funcid) {
case SBI_EXT_RFENCE_REMOTE_FENCE_I:
ret = sbi_remote_fence_i(cpumask_bits(&hm));
break;
case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
break;
case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
cp->a3, cp->a4);
break;
case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
/* TODO: implement for nested hypervisor case */
default:
ret = -EOPNOTSUPP;
}
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
.extid_start = SBI_EXT_RFENCE,
.extid_end = SBI_EXT_RFENCE,
.handler = kvm_sbi_ext_rfence_handler,
};
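
Both the IPI and RFENCE handlers above decode the SBI v0.2 hart-mask
convention: hbase == -1UL selects every hart, otherwise bit (hart_id - hbase)
of hmask must be set. A stand-alone restatement of that test, using a
hypothetical helper name:

#include <linux/bits.h>

static bool example_hart_is_target(unsigned long hart_id,
                                   unsigned long hbase, unsigned long hmask)
{
    if (hbase == -1UL)
        return true;                /* all harts selected */
    if (hart_id < hbase || hart_id - hbase >= BITS_PER_LONG)
        return false;               /* outside the representable window */
    return hmask & (1UL << (hart_id - hbase));
}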

@@ -0,0 +1,126 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2021 Western Digital Corporation or its affiliates.
*
* Authors:
* Atish Patra <atish.patra@wdc.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <asm/csr.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_timer.h>
#include <asm/kvm_vcpu_sbi.h>
static void kvm_sbi_system_shutdown(struct kvm_vcpu *vcpu,
struct kvm_run *run, u32 type)
{
unsigned long i;
struct kvm_vcpu *tmp;
kvm_for_each_vcpu(i, tmp, vcpu->kvm)
tmp->arch.power_off = true;
kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
memset(&run->system_event, 0, sizeof(run->system_event));
run->system_event.type = type;
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
}
static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val,
struct kvm_cpu_trap *utrap,
bool *exit)
{
ulong hmask;
int i, ret = 0;
u64 next_cycle;
struct kvm_vcpu *rvcpu;
struct cpumask cm, hm;
struct kvm *kvm = vcpu->kvm;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
switch (cp->a7) {
case SBI_EXT_0_1_CONSOLE_GETCHAR:
case SBI_EXT_0_1_CONSOLE_PUTCHAR:
/*
* The CONSOLE_GETCHAR/CONSOLE_PUTCHAR SBI calls cannot be
* handled in kernel so we forward these to user-space
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
*exit = true;
break;
case SBI_EXT_0_1_SET_TIMER:
#if __riscv_xlen == 32
next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
#else
next_cycle = (u64)cp->a0;
#endif
ret = kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
break;
case SBI_EXT_0_1_CLEAR_IPI:
ret = kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_SOFT);
break;
case SBI_EXT_0_1_SEND_IPI:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
if (utrap->scause)
break;
for_each_set_bit(i, &hmask, BITS_PER_LONG) {
rvcpu = kvm_get_vcpu_by_id(vcpu->kvm, i);
ret = kvm_riscv_vcpu_set_interrupt(rvcpu, IRQ_VS_SOFT);
if (ret < 0)
break;
}
break;
case SBI_EXT_0_1_SHUTDOWN:
kvm_sbi_system_shutdown(vcpu, run, KVM_SYSTEM_EVENT_SHUTDOWN);
*exit = true;
break;
case SBI_EXT_0_1_REMOTE_FENCE_I:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
if (utrap->scause)
break;
cpumask_clear(&cm);
for_each_set_bit(i, &hmask, BITS_PER_LONG) {
rvcpu = kvm_get_vcpu_by_id(vcpu->kvm, i);
if (rvcpu->cpu < 0)
continue;
cpumask_set_cpu(rvcpu->cpu, &cm);
}
riscv_cpuid_to_hartid_mask(&cm, &hm);
if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
ret = sbi_remote_fence_i(cpumask_bits(&hm));
else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
cp->a1, cp->a2);
else
ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
cp->a1, cp->a2, cp->a3);
break;
default:
ret = -EINVAL;
break;
};
return ret;
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
.extid_start = SBI_EXT_0_1_SET_TIMER,
.extid_end = SBI_EXT_0_1_SHUTDOWN,
.handler = kvm_sbi_ext_v01_handler,
};

@@ -46,15 +46,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
int i;
for (i = 0; i < KVM_MAX_VCPUS; ++i) {
if (kvm->vcpus[i]) {
kvm_vcpu_destroy(kvm->vcpus[i]);
kvm->vcpus[i] = NULL;
}
}
atomic_set(&kvm->online_vcpus, 0);
kvm_destroy_vcpus(kvm);
}
int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
@@ -82,6 +74,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_NR_MEMSLOTS:
r = KVM_USER_MEM_SLOTS;
break;
case KVM_CAP_VM_GPA_BITS:
r = kvm_riscv_stage2_gpa_bits();
break;
default:
r = 0;
break;

@@ -65,7 +65,7 @@ bool kvm_riscv_stage2_vmid_ver_changed(struct kvm_vmid *vmid)
void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
{
int i;
unsigned long i;
struct kvm_vcpu *v;
struct cpumask hmask;
struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;

@@ -1010,6 +1010,4 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
#endif

@@ -91,23 +91,23 @@ struct uv_cb_header {
/* Query Ultravisor Information */
struct uv_cb_qui {
struct uv_cb_header header;
u64 reserved08;
u64 inst_calls_list[4];
u64 reserved30[2];
u64 uv_base_stor_len;
u64 reserved48;
u64 conf_base_phys_stor_len;
u64 conf_base_virt_stor_len;
u64 conf_virt_var_stor_len;
u64 cpu_stor_len;
u32 reserved70[3];
u32 max_num_sec_conf;
u64 max_guest_stor_addr;
u8 reserved88[158 - 136];
u16 max_guest_cpu_id;
u64 uv_feature_indications;
u8 reserveda0[200 - 168];
struct uv_cb_header header; /* 0x0000 */
u64 reserved08; /* 0x0008 */
u64 inst_calls_list[4]; /* 0x0010 */
u64 reserved30[2]; /* 0x0030 */
u64 uv_base_stor_len; /* 0x0040 */
u64 reserved48; /* 0x0048 */
u64 conf_base_phys_stor_len; /* 0x0050 */
u64 conf_base_virt_stor_len; /* 0x0058 */
u64 conf_virt_var_stor_len; /* 0x0060 */
u64 cpu_stor_len; /* 0x0068 */
u32 reserved70[3]; /* 0x0070 */
u32 max_num_sec_conf; /* 0x007c */
u64 max_guest_stor_addr; /* 0x0080 */
u8 reserved88[158 - 136]; /* 0x0088 */
u16 max_guest_cpu_id; /* 0x009e */
u64 uv_feature_indications; /* 0x00a0 */
u8 reserveda8[200 - 168]; /* 0x00a8 */
} __packed __aligned(8);
/* Initialize Ultravisor */

@@ -33,6 +33,7 @@ config KVM
select HAVE_KVM_NO_POLL
select SRCU
select KVM_VFIO
select INTERVAL_TREE
help
Support hosting paravirtualized guest machines using the SIE
virtualization capability on the mainframe. This should work

@@ -3,13 +3,11 @@
#
# Copyright IBM Corp. 2008
KVM := ../../../virt/kvm
common-objs = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/async_pf.o \
$(KVM)/irqchip.o $(KVM)/vfio.o $(KVM)/binary_stats.o
include $(srctree)/virt/kvm/Makefile.kvm
ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
kvm-y += kvm-s390.o intercept.o interrupt.o priv.o sigp.o
kvm-y += diag.o gaccess.o guestdbg.o vsie.o pv.o
obj-$(CONFIG_KVM) += kvm.o

Some files were not shown because too many files have changed in this diff Show More