Commit graph

950342 commits

Author SHA1 Message Date
Vitaly Kuznetsov
3f4e3eb417 KVM: x86: bump KVM_MAX_CPUID_ENTRIES
As vcpu->arch.cpuid_entries is now allocated dynamically, the only
remaining use for KVM_MAX_CPUID_ENTRIES is to check KVM_SET_CPUID/
KVM_SET_CPUID2 input for sanity. Since it was reported that the
current limit (80) is insufficient for some CPUs, bump
KVM_MAX_CPUID_ENTRIES and use an arbitrary value '256' as the new
limit.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201001130541.1398392-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:33 -04:00
Vitaly Kuznetsov
255cbecfe0 KVM: x86: allocate vcpu->arch.cpuid_entries dynamically
The current limit for guest CPUID leaves (KVM_MAX_CPUID_ENTRIES, 80)
is reported to be insufficient but before we bump it let's switch to
allocating vcpu->arch.cpuid_entries[] array dynamically. Currently,
'struct kvm_cpuid_entry2' is 40 bytes so vcpu->arch.cpuid_entries is
3200 bytes which accounts for 1/4 of the whole 'struct kvm_vcpu_arch'
but having it pre-allocated (for all vCPUs which we also pre-allocate)
gives us no real benefits.

Another plus of the dynamic allocation is that we now do kvm_check_cpuid()
check before we assign anything to vcpu->arch.cpuid_nent/cpuid_entries so
no changes are made in case the check fails.

Opportunistically remove unneeded 'out' labels from
kvm_vcpu_ioctl_set_cpuid()/kvm_vcpu_ioctl_set_cpuid2() and return
directly whenever possible.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201001130541.1398392-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-10-21 17:36:33 -04:00
Vitaly Kuznetsov
f69858fcc7 KVM: x86: disconnect kvm_check_cpuid() from vcpu->arch.cpuid_entries
As a preparatory step to allocating vcpu->arch.cpuid_entries dynamically
make kvm_check_cpuid() check work with an arbitrary 'struct kvm_cpuid_entry2'
array.

Currently, when kvm_check_cpuid() fails we reset vcpu->arch.cpuid_nent to
0 and this is kind of weird, i.e. one would expect CPUIDs to remain
unchanged when KVM_SET_CPUID[2] call fails.

No functional change intended. It would've been possible to move the updated
kvm_check_cpuid() in kvm_vcpu_ioctl_set_cpuid2() and check the supplied
input before we start updating vcpu->arch.cpuid_entries/nent but we
can't do the same in kvm_vcpu_ioctl_set_cpuid() as we'll have to copy
'struct kvm_cpuid_entry' entries first. The change will be made when
vcpu->arch.cpuid_entries[] array becomes allocated dynamically.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201001130541.1398392-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:32 -04:00
Oliver Upton
3ee6fb4949 Documentation: kvm: fix some typos in cpuid.rst
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Change-Id: I0c6355b09fedf8f9cc4cc5f51be418e2c1c82b7b
Message-Id: <20200818152429.1923996-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:32 -04:00
Oliver Upton
66570e966d kvm: x86: only provide PV features if enabled in guest's CPUID
KVM unconditionally provides PV features to the guest, regardless of the
configured CPUID. An unwitting guest that doesn't check
KVM_CPUID_FEATURES before use could access paravirt features that
userspace did not intend to provide. Fix this by checking the guest's
CPUID before performing any paravirtual operations.

Introduce a capability, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, to gate the
aforementioned enforcement. Migrating a VM from a host w/o this patch to
a host with this patch could silently change the ABI exposed to the
guest, warranting that we default to the old behavior and opt-in for
the new one.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Change-Id: I202a0926f65035b872bfe8ad15307c026de59a98
Message-Id: <20200818152429.1923996-4-oupton@google.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:32 -04:00
Oliver Upton
210dfd93ea kvm: x86: set wall_clock in kvm_write_wall_clock()
Small change to avoid meaningless duplication in the subsequent patch.
No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Change-Id: I77ab9cdad239790766b7a49d5cbae5e57a3005ea
Message-Id: <20200818152429.1923996-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:31 -04:00
Oliver Upton
5b9bb0ebbc kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn
No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Change-Id: I7cbe71069db98d1ded612fd2ef088b70e7618426
Message-Id: <20200818152429.1923996-2-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:31 -04:00
Vitaly Kuznetsov
66af4f5cb1 x86/kvm: Update the comment about asynchronous page fault in exc_page_fault()
KVM was switched to interrupt-based mechanism for 'page ready' event
delivery in Linux-5.8 (see commit 2635b5c4a0 ("KVM: x86: interrupt based
APF 'page ready' event delivery")) and #PF (ab)use for 'page ready' event
delivery was removed. Linux guest switched to this new mechanism
exclusively in 5.9 (see commit b1d405751c ("KVM: x86: Switch KVM guest to
using interrupts for page ready APF delivery")) so it is not possible to
get #PF for a 'page ready' event even when the guest is running on top
of an older KVM (APF mechanism won't be enabled). Update the comment in
exc_page_fault() to reflect the new reality.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201002154313.1505327-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:31 -04:00
Matteo Croce
8f116a6c73 x86/kvm: hide KVM options from menuconfig when KVM is not compiled
Let KVM_WERROR depend on KVM, so it doesn't show in menuconfig alone.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Message-Id: <20201001112014.9561-1-mcroce@linux.microsoft.com>
Fixes: 4f337faf1c ("KVM: allow disabling -Werror")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:30 -04:00
Li Qiang
10f79ccaf3 Documentation: kvm: fix a typo
Fixes: e287d6de62 ("Documentation: kvm: Convert cpuid.txt to .rst")
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20201001095333.7611-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:30 -04:00
Paolo Bonzini
043248b328 KVM: VMX: Forbid userspace MSR filters for x2APIC
Allowing userspace to intercept reads to x2APIC MSRs when APICV is
fully enabled for the guest simply can't work.   But more in general,
the LAPIC could be set to in-kernel after the MSR filter is setup
and allowing accesses by userspace would be very confusing.

We could in principle allow userspace to intercept reads and writes to TPR,
and writes to EOI and SELF_IPI, but while that could be made it work, it
would still be silly.

Cc: Alexander Graf <graf@amazon.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:24 -04:00
Sean Christopherson
9389b9d5d3 KVM: VMX: Ignore userspace MSR filters for x2APIC
Rework the resetting of the MSR bitmap for x2APIC MSRs to ignore userspace
filtering.  Allowing userspace to intercept reads to x2APIC MSRs when
APICV is fully enabled for the guest simply can't work; the LAPIC and thus
virtual APIC is in-kernel and cannot be directly accessed by userspace.
To keep things simple we will in fact forbid intercepting x2APIC MSRs
altogether, independent of the default_allow setting.

Cc: Alexander Graf <graf@amazon.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201005195532.8674-3-sean.j.christopherson@intel.com>
[Modified to operate even if APICv is disabled, adjust documentation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:36:19 -04:00
Paolo Bonzini
1b21c8db0e KVM/arm64 updates for Linux 5.10
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+B2vwPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDI20P/3zj+DT2INf6VrQr9nKg4uxwRk3AQYevRUhs
 id6s6OA9CCXNrN4OuboH3XXcPbI9bYJrp9xuxALAIPzVtLpQs4CLU0tgNclM3zw2
 Fe6u8wEIEFUxuNKlezd1J0gUqskTOloCsscMZ4s72ayDp56VR9fs6Ll/48rvtKXO
 x0FDkouIOAK8Cp/LVhFc0R4ZqaLvFiRr8OlJIqybM5JATl7CEq60jVlZqtoJcjOp
 M/Q/uPtoPZ+yaNizvzhNX9SSX5q7bW2hYqfjveCvmJXLoRZnUzmSiCGHk0DWSw86
 YO0X9cmK/E4feu7EwDppCbVi+FD/+Wcvhs87wAAa6jG2QgHab+OjGOg88gyNAQdL
 APMGALROiTPyC5s9SmrS8eMYZHJku9NRvBjXz5kdrXYhGhMSEmv9RfoBfwJKC4gz
 iqaQ9ejDm+uo6n9p9O8Fng34SqpIzCD1NwRW7AowM2WZH7wkpZ7d0RnbtToNzXsH
 TRKWrSZDydsy/rfL/e6k9CHdoIYCZcrRo/tfc7WGPoO6bwpUZP8O2K/dNz39fUTY
 eQls1fpEFJa8xRJFrxLHl3LuLcL4MuiTfN7TzbJpiwojXCFfUYcBJ3MGniONXIdQ
 2rtl6cx/FhUFekaC8S/k3UDK4i7AqjWsBm9nxRNBwSdwD/ecnqXMljq/DmOfjEmt
 3kOh0WS3
 =mjUF
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.10

- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
2020-10-20 08:14:25 -04:00
Peter Xu
628ade2d08 KVM: VMX: Fix x2APIC MSR intercept handling on !APICV platforms
Fix an inverted flag for intercepting x2APIC MSRs and intercept writes
by default, even when APICV is enabled.

Fixes: 3eb900173c ("KVM: x86: VMX: Prevent MSR passthrough when MSR access is denied")
Co-developed-by: Peter Xu <peterx@redhat.com>
[sean: added changelog]
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201005195532.8674-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-19 12:22:52 -04:00
Marc Zyngier
4e5dc64c43 Merge branches 'kvm-arm64/pt-new' and 'kvm-arm64/pmu-5.9' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-10-02 09:25:55 +01:00
Will Deacon
ffd1b63a58 KVM: arm64: Ensure user_mem_abort() return value is initialised
If a change in the MMU notifier sequence number forces user_mem_abort()
to return early when attempting to handle a stage-2 fault, we return
uninitialised stack to kvm_handle_guest_abort(), which could potentially
result in the injection of an external abort into the guest or a spurious
return to userspace. Neither or these are what we want to do.

Initialise 'ret' to 0 in user_mem_abort() so that bailing due to a
change in the MMU notrifier sequence number is treated as though the
fault was handled.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200930102442.16142-1-will@kernel.org
2020-10-02 09:25:25 +01:00
Will Deacon
b259d137e9 KVM: arm64: Pass level hint to TLBI during stage-2 permission fault
Alex pointed out that we don't pass a level hint to the TLBI instruction
when handling a stage-2 permission fault, even though the walker does
at some point have the level information in its hands.

Rework stage2_update_leaf_attrs() so that it can optionally return the
level of the updated pte to its caller, which can in turn be used to
provide the correct TLBI level hint.

Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/595cc73e-636e-8b3a-f93a-b4e9fb218db8@arm.com
Link: https://lore.kernel.org/r/20200930131801.16889-1-will@kernel.org
2020-10-02 09:16:07 +01:00
Mauro Carvalho Chehab
030bdf3698 KVM: arm64: Fix some documentation build warnings
As warned with make htmldocs:

	.../Documentation/virt/kvm/devices/vcpu.rst:70: WARNING: Malformed table.
	Text in column margin in table line 2.

	=======  ======================================================
	-ENODEV: PMUv3 not supported or GIC not initialized
	-ENXIO:  PMUv3 not properly configured or in-kernel irqchip not
	         configured as required prior to calling this attribute
	-EBUSY:  PMUv3 already initialized
	-EINVAL: Invalid filter range
	=======  ======================================================

The ':' character for two lines are above the size of the column.
Besides that, other tables at the file doesn't use ':', so
just drop them.

While here, also fix this warning also introduced at the same patch:

	.../Documentation/virt/kvm/devices/vcpu.rst:88: WARNING: Block quote ends without a blank line; unexpected unindent.

By marking the C code as a literal block.

Fixes: 8be86a5eec ("KVM: arm64: Document PMU filtering API")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/b5385dd0213f1f070667925bf7a807bf5270ba78.1601616399.git.mchehab+huawei@kernel.org
2020-10-02 09:12:25 +01:00
Marc Zyngier
14ef9d0492 Merge branch 'kvm-arm64/hyp-pcpu' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-30 14:05:35 +01:00
Marc Zyngier
816c347f3a Merge remote-tracking branch 'arm64/for-next/ghostbusters' into kvm-arm64/hyp-pcpu
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-30 09:48:30 +01:00
David Brazdil
a3bb9c3a00 kvm: arm64: Remove unnecessary hyp mappings
With all nVHE per-CPU variables being part of the hyp per-CPU region,
mapping them individual is not necessary any longer. They are mapped to hyp
as part of the overall per-CPU region.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-11-dbrazdil@google.com
2020-09-30 08:37:57 +01:00
David Brazdil
30c953911c kvm: arm64: Set up hyp percpu data for nVHE
Add hyp percpu section to linker script and rename the corresponding ELF
sections of hyp/nvhe object files. This moves all nVHE-specific percpu
variables to the new hyp percpu section.

Allocate sufficient amount of memory for all percpu hyp regions at global KVM
init time and create corresponding hyp mappings.

The base addresses of hyp percpu regions are kept in a dynamically allocated
array in the kernel.

Add NULL checks in PMU event-reset code as it may run before KVM memory is
initialized.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
2020-09-30 08:37:14 +01:00
David Brazdil
2a1198c9b4 kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE
Host CPU context is stored in a global per-cpu variable `kvm_host_data`.
In preparation for introducing independent per-CPU region for nVHE hyp,
create two separate instances of `kvm_host_data`, one for VHE and one
for nVHE.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com
2020-09-30 08:37:13 +01:00
David Brazdil
df4c8214a1 kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp
Hyp keeps track of which cores require SSBD callback by accessing a
kernel-proper global variable. Create an nVHE symbol of the same name
and copy the value from kernel proper to nVHE as KVM is being enabled
on a core.

Done in preparation for separating percpu memory owned by kernel
proper and nVHE.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-8-dbrazdil@google.com
2020-09-30 08:37:13 +01:00
David Brazdil
572494995b kvm: arm64: Add helpers for accessing nVHE hyp per-cpu vars
Defining a per-CPU variable in hyp/nvhe will result in its name being
prefixed with __kvm_nvhe_. Add helpers for declaring these variables
in kernel proper and accessing them with this_cpu_ptr and per_cpu_ptr.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-7-dbrazdil@google.com
2020-09-30 08:37:13 +01:00
David Brazdil
ea391027d3 kvm: arm64: Remove hyp_adr/ldr_this_cpu
The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code
because they always needed to use TPIDR_EL2 for base, while
adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and
_EL1 based on VHE/nVHE.

Simplify this now that the hyp mode case can be handled using the
__KVM_VHE/NVHE_HYPERVISOR__ macros.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-6-dbrazdil@google.com
2020-09-30 08:37:07 +01:00
David Brazdil
717cf94adb kvm: arm64: Remove __hyp_this_cpu_read
this_cpu_ptr is meant for use in kernel proper because it selects between
TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always
select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by
selecting _EL2 register under __KVM_NVHE_HYPERVISOR__. VHE continues
selecting the register using alternatives.

Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check
which is omitted by the hyp helpers. Preserve the behavior for nVHE by
overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend
the checks into VHE hyp code.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-5-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
David Brazdil
3471ee06e3 kvm: arm64: Only define __kvm_ex_table for CONFIG_KVM
Minor cleanup that only creates __kvm_ex_table ELF section and
related symbols if CONFIG_KVM is enabled. Also useful as more
hyp-specific sections will be added.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-4-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
David Brazdil
ce492a16ff kvm: arm64: Move nVHE hyp namespace macros to hyp_image.h
Minor cleanup to move all macros related to prefixing nVHE hyp section
and symbol names into one place: hyp_image.h.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-3-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
David Brazdil
ab25464bda kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY
Relying on objcopy to prefix the ELF section names of the nVHE hyp code
is brittle and prevents us from using wildcards to match specific
section names.

Improve the build rules by partially linking all '.nvhe.o' files and
prefixing their ELF section names using a linker script. Continue using
objcopy for prefixing ELF symbol names.

One immediate advantage of this approach is that all subsections
matching a pattern can be merged into a single prefixed section, eg.
.text and .text.* can be linked into a single '.hyp.text'. This removes
the need for -fno-reorder-functions on GCC and will be useful in the
future too: LTO builds use .text subsections, compilers routinely
generate .rodata subsections, etc.

Partially linking all hyp code into a single object file also makes it
easier to analyze.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-2-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
Will Deacon
780c083a8f arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
The PR_SPEC_DISABLE_NOEXEC option to the PR_SPEC_STORE_BYPASS prctl()
allows the SSB mitigation to be enabled only until the next execve(),
at which point the state will revert back to PR_SPEC_ENABLE and the
mitigation will be disabled.

Add support for PR_SPEC_DISABLE_NOEXEC on arm64.

Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Will Deacon
5c8b0cbd9d arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
The kbuild robot reports that we're relying on an implicit inclusion to
get a definition of task_stack_page() in the Spectre-v4 mitigation code,
which is not always in place for some configurations:

  | arch/arm64/kernel/proton-pack.c:329:2: error: implicit declaration of function 'task_stack_page' [-Werror,-Wimplicit-function-declaration]
  |         task_pt_regs(task)->pstate |= val;
  |         ^
  | arch/arm64/include/asm/processor.h:268:36: note: expanded from macro 'task_pt_regs'
  |         ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
  |                                           ^
  | arch/arm64/kernel/proton-pack.c:329:2: note: did you mean 'task_spread_page'?

Add the missing include to fix the build error.

Fixes: a44acf477220 ("arm64: Move SSBD prctl() handler alongside other spectre mitigation code")
Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009260013.Ul7AD29w%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Will Deacon
9ef2b48be9 KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
Patching the EL2 exception vectors is integral to the Spectre-v2
workaround, where it can be necessary to execute CPU-specific sequences
to nobble the branch predictor before running the hypervisor text proper.

Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors
to be patched even when KASLR is not enabled.

Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier
31c84d6c9c arm64: Get rid of arm64_ssbd_state
Out with the old ghost, in with the new...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier
d63d975a71 KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
Convert the KVM WA2 code to using the Spectre infrastructure,
making the code much more readable. It also allows us to
take SSBS into account for the mitigation.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier
7311467702 KVM: arm64: Get rid of kvm_arm_have_ssbd()
kvm_arm_have_ssbd() is now completely unused, get rid of it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier
29e8910a56 KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
Owing to the fact that the host kernel is always mitigated, we can
drastically simplify the WA2 handling by keeping the mitigation
state ON when entering the guest. This means the guest is either
unaffected or not mitigated.

This results in a nice simplification of the mitigation space,
and the removal of a lot of code that was never really used anyway.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
c28762070c arm64: Rewrite Spectre-v4 mitigation code
Rewrite the Spectre-v4 mitigation handling code to follow the same
approach as that taken by Spectre-v2.

For now, report to KVM that the system is vulnerable (by forcing
'ssbd_state' to ARM64_SSBD_UNKNOWN), as this will be cleared up in
subsequent steps.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
9e78b659b4 arm64: Move SSBD prctl() handler alongside other spectre mitigation code
As part of the spectre consolidation effort to shift all of the ghosts
into their own proton pack, move all of the horrible SSBD prctl() code
out of its own 'ssbd.c' file.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
9b0955baa4 arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4
In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR
to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't
_entirely_ accurate, as we also need to take into account the interaction
with SSBS, but that will be taken care of in subsequent patches.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
532d581583 arm64: Treat SSBS as a non-strict system feature
If all CPUs discovered during boot have SSBS, then spectre-v4 will be
considered to be "mitigated". However, we still allow late CPUs without
SSBS to be onlined, albeit with a "SANITY CHECK" warning. This is
problematic for userspace because it means that the system can quietly
transition to "Vulnerable" at runtime.

Avoid this by treating SSBS as a non-strict system feature: if all of
the CPUs discovered during boot have SSBS, then late arriving secondaries
better have it as well.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
a8de949893 arm64: Group start_thread() functions together
The is_ttbrX_addr() functions have somehow ended up in the middle of
the start_thread() functions, so move them out of the way to keep the
code readable.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Marc Zyngier
e1026237f9 KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2
If the system is not affected by Spectre-v2, then advertise to the KVM
guest that it is not affected, without the need for a safelist in the
guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon
d4647f0a2a arm64: Rewrite Spectre-v2 mitigation code
The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain.
This is largely due to it being written hastily, without much clue as to
how things would pan out, and also because it ends up mixing policy and
state in such a way that it is very difficult to figure out what's going
on.

Rewrite the Spectre-v2 mitigation so that it clearly separates state from
policy and follows a more structured approach to handling the mitigation.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon
455697adef arm64: Introduce separate file for spectre mitigations and reporting
The spectre mitigation code is spread over a few different files, which
makes it both hard to follow, but also hard to remove it should we want
to do that in future.

Introduce a new file for housing the spectre mitigations, and populate
it with the spectre-v1 reporting code to start with.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon
688f1e4b6d arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2
For better or worse, the world knows about "Spectre" and not about
"Branch predictor hardening". Rename ARM64_HARDEN_BRANCH_PREDICTOR to
ARM64_SPECTRE_V2 as part of moving all of the Spectre mitigations into
their own little corner.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon
b181048f41 KVM: arm64: Simplify install_bp_hardening_cb()
Use is_hyp_mode_available() to detect whether or not we need to patch
the KVM vectors for branch hardening, which avoids the need to take the
vector pointers as parameters.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon
5359a87d5b KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE
The removal of CONFIG_HARDEN_BRANCH_PREDICTOR means that
CONFIG_KVM_INDIRECT_VECTORS is synonymous with CONFIG_RANDOMIZE_BASE,
so replace it.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon
6e5f092784 arm64: Remove Spectre-related CONFIG_* options
The spectre mitigations are too configurable for their own good, leading
to confusing logic trying to figure out when we should mitigate and when
we shouldn't. Although the plethora of command-line options need to stick
around for backwards compatibility, the default-on CONFIG options that
depend on EXPERT can be dropped, as the mitigations only do anything if
the system is vulnerable, a mitigation is available and the command-line
hasn't disabled it.

Remove CONFIG_HARDEN_BRANCH_PREDICTOR and CONFIG_ARM64_SSBD in favour of
enabling this code unconditionally.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Marc Zyngier
39533e1206 arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs
Commit 606f8e7b27 ("arm64: capabilities: Use linear array for
detection and verification") changed the way we deal with per-CPU errata
by only calling the .matches() callback until one CPU is found to be
affected. At this point, .matches() stop being called, and .cpu_enable()
will be called on all CPUs.

This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be
mitigated.

In order to address this, forcefully call the .matches() callback from a
.cpu_enable() callback, which brings us back to the original behaviour.

Fixes: 606f8e7b27 ("arm64: capabilities: Use linear array for detection and verification")
Cc: <stable@vger.kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:03 +01:00