Commit graph

737816 commits

Author SHA1 Message Date
Christoffer Dall
8a43a2b34b KVM: arm/arm64: Move arm64-only vgic-v2-sr.c file to arm64
The vgic-v2-sr.c file now only contains the logic to replay unaligned
accesses to the virtual CPU interface on 16K and 64K page systems, which
is only relevant on 64-bit platforms.  Therefore move this file to the
arm64 KVM tree, remove the compile directive from the 32-bit side
makefile, and remove the ifdef in the C file.

Since this file also no longer saves/restores anything, rename the file
to vgic-v2-cpuif-proxy.c to more accurately describe the logic in this
file.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:20 +00:00
Christoffer Dall
75174ba6ca KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code
We can program the GICv2 hypervisor control interface logic directly
from the core vgic code and can instead do the save/restore directly
from the flush/sync functions, which can lead to a number of future
optimizations.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:20 +00:00
Christoffer Dall
bb5ed70359 KVM: arm/arm64: Get rid of vgic_elrsr
There is really no need to store the vgic_elrsr on the VGIC data
structures as the only need we have for the elrsr is to figure out if an
LR is inactive when we save the VGIC state upon returning from the
guest.  We can might as well store this in a temporary local variable.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:20 +00:00
Christoffer Dall
b7787e6666 KVM: arm64: Cleanup __activate_traps and __deactive_traps for VHE and non-VHE
To make the code more readable and to avoid the overhead of a function
call, let's get rid of a pair of the alternative function selectors and
explicitly call the VHE and non-VHE functions using the has_vhe() static
key based selector instead, telling the compiler to try to inline the
static function if it can.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:19 +00:00
Christoffer Dall
a2465629b6 KVM: arm64: Configure c15, PMU, and debug register traps on cpu load/put for VHE
We do not have to change the c15 trap setting on each switch to/from the
guest on VHE systems, because this setting only affects guest EL1/EL0
(and therefore not the VHE host).

The PMU and debug trap configuration can also be done on vcpu load/put
instead, because they don't affect how the VHE host kernel can access the
debug registers while executing KVM kernel code.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:19 +00:00
Christoffer Dall
c16c1131fb KVM: arm64: Directly call VHE and non-VHE FPSIMD enabled functions
There is no longer a need for an alternative to choose the right
function to tell us whether or not FPSIMD was enabled for the VM,
because we can simply can the appropriate functions directly from within
the _vhe and _nvhe run functions.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:19 +00:00
Christoffer Dall
d5a21bcc29 KVM: arm64: Move common VHE/non-VHE trap config in separate functions
As we are about to be more lazy with some of the trap configuration
register read/writes for VHE systems, move the logic that is currently
shared between VHE and non-VHE into a separate function which can be
called from either the world-switch path or from vcpu_load/vcpu_put.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:19 +00:00
Christoffer Dall
b9f8ca4db4 KVM: arm64: Defer saving/restoring 32-bit sysregs to vcpu load/put
When running a 32-bit VM (EL1 in AArch32), the AArch32 system registers
can be deferred to vcpu load/put on VHE systems because neither
the host kernel nor host userspace uses these registers.

Note that we can't save DBGVCR32_EL2 conditionally based on the state of
the debug dirty flag on VHE after this change, because during
vcpu_load() we haven't calculated a valid debug flag yet, and when we've
restored the register during vcpu_load() we also have to save it during
vcpu_put().  This means that we'll always restore/save the register for
VHE on load/put, but luckily vcpu load/put are called rarely, so saving
an extra register unconditionally shouldn't significantly hurt
performance.

We can also not defer saving FPEXC32_32 because this register only holds
a guest-valid value for 32-bit guests during the exit path when the
guest has used FPSIMD registers and restored the register in the early
assembly handler from taking the EL2 fault, and therefore we have to
check if fpsimd is enabled for the guest in the exit path and save the
register then, for both VHE and non-VHE guests.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:18 +00:00
Christoffer Dall
a892819560 KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers
32-bit registers are not used by a 64-bit host kernel and can be
deferred, but we need to rework the accesses to these register to access
the latest values depending on whether or not guest system registers are
loaded on the CPU or only reside in memory.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:18 +00:00
Christoffer Dall
fc7563b340 KVM: arm64: Defer saving/restoring 64-bit sysregs to vcpu load/put on VHE
Some system registers do not affect the host kernel's execution and can
therefore be loaded when we are about to run a VCPU and we don't have to
restore the host state to the hardware before the time when we are
actually about to return to userspace or schedule out the VCPU thread.

The EL1 system registers and the userspace state registers only
affecting EL0 execution do not need to be saved and restored on every
switch between the VM and the host, because they don't affect the host
kernel's execution.

We mark all registers which are now deffered as such in the
vcpu_{read,write}_sys_reg accessors in sys-regs.c to ensure the most
up-to-date copy is always accessed.

Note MPIDR_EL1 (controlled via VMPIDR_EL2) is accessed from other vcpu
threads, for example via the GIC emulation, and therefore must be
declared as immediate, which is fine as the guest cannot modify this
value.

The 32-bit sysregs can also be deferred but we do this in a separate
patch as it requires a bit more infrastructure.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:18 +00:00
Christoffer Dall
6d4bd90964 KVM: arm64: Prepare to handle deferred save/restore of ELR_EL1
ELR_EL1 is not used by a VHE host kernel and can be deferred, but we
need to rework the accesses to this register to access the latest value
depending on whether or not guest system registers are loaded on the CPU
or only reside in memory.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:17 +00:00
Christoffer Dall
00536ec476 KVM: arm/arm64: Prepare to handle deferred save/restore of SPSR_EL1
SPSR_EL1 is not used by a VHE host kernel and can be deferred, but we
need to rework the accesses to this register to access the latest value
depending on whether or not guest system registers are loaded on the CPU
or only reside in memory.

The handling of accessing the various banked SPSRs for 32-bit VMs is a
bit clunky, but this will be improved in following patches which will
first prepare and subsequently implement deferred save/restore of the
32-bit registers, including the 32-bit SPSRs.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:17 +00:00
Christoffer Dall
d47533dab9 KVM: arm64: Introduce framework for accessing deferred sysregs
We are about to defer saving and restoring some groups of system
registers to vcpu_put and vcpu_load on supported systems.  This means
that we need some infrastructure to access system registes which
supports either accessing the memory backing of the register or directly
accessing the system registers, depending on the state of the system
when we access the register.

We do this by defining read/write accessor functions, which can handle
both "immediate" and "deferrable" system registers.  Immediate registers
are always saved/restored in the world-switch path, but deferrable
registers are only saved/restored in vcpu_put/vcpu_load when supported
and sysregs_loaded_on_cpu will be set in that case.

Note that we don't use the deferred mechanism yet in this patch, but only
introduce infrastructure.  This is to improve convenience of review in
the subsequent patches where it is clear which registers become
deferred.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:17 +00:00
Christoffer Dall
8d404c4c24 KVM: arm64: Rewrite system register accessors to read/write functions
Currently we access the system registers array via the vcpu_sys_reg()
macro.  However, we are about to change the behavior to some times
modify the register file directly, so let's change this to two
primitives:

 * Accessor macros vcpu_write_sys_reg() and vcpu_read_sys_reg()
 * Direct array access macro __vcpu_sys_reg()

The accessor macros should be used in places where the code needs to
access the currently loaded VCPU's state as observed by the guest.  For
example, when trapping on cache related registers, a write to a system
register should go directly to the VCPU version of the register.

The direct array access macro can be used in places where the VCPU is
known to never be running (for example userspace access) or for
registers which are never context switched (for example all the PMU
system registers).

This rewrites all users of vcpu_sys_regs to one of the macros described
above.

No functional change.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:16 +00:00
Christoffer Dall
52f6c4f021 KVM: arm64: Change 32-bit handling of VM system registers
We currently handle 32-bit accesses to trapped VM system registers using
the 32-bit index into the coproc array on the vcpu structure, which is a
union of the coproc array and the sysreg array.

Since all the 32-bit coproc indices are created to correspond to the
architectural mapping between 64-bit system registers and 32-bit
coprocessor registers, and because the AArch64 system registers are the
double in size of the AArch32 coprocessor registers, we can always find
the system register entry that we must update by dividing the 32-bit
coproc index by 2.

This is going to make our lives much easier when we have to start
accessing system registers that use deferred save/restore and might
have to be read directly from the physical CPU.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:16 +00:00
Christoffer Dall
0c389d90eb KVM: arm64: Don't save the host ELR_EL2 and SPSR_EL2 on VHE systems
On non-VHE systems we need to save the ELR_EL2 and SPSR_EL2 so that we can
return to the host in EL1 in the same state and location where we issued a
hypercall to EL2, but on VHE ELR_EL2 and SPSR_EL2 are not useful because we
never enter a guest as a result of an exception entry that would be directly
handled by KVM. The kernel entry code already saves ELR_EL1/SPSR_EL1 on
exception entry, which is enough.  Therefore, factor out these registers into
separate save/restore functions, making it easy to exclude them from the VHE
world-switch path later on.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:16 +00:00
Christoffer Dall
4cdecaba01 KVM: arm64: Unify non-VHE host/guest sysreg save and restore functions
There is no need to have multiple identical functions with different
names for saving host and guest state.  When saving and restoring state
for the host and guest, the state is the same for both contexts, and
that's why we have the kvm_cpu_context structure.  Delete one
version and rename the other into simply save/restore.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:15 +00:00
Christoffer Dall
0a62d43314 KVM: arm/arm64: Remove leftover comment from kvm_vcpu_run_vhe
The comment only applied to SPE on non-VHE systems, so we simply remove
it.

Suggested-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:15 +00:00
Christoffer Dall
f837453d0e KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore functions
As we are about to handle system registers quite differently between VHE
and non-VHE systems.  In preparation for that, we need to split some of
the handling functions between VHE and non-VHE functionality.

For now, we simply copy the non-VHE functions, but we do change the use
of static keys for VHE and non-VHE functionality now that we have
separate functions.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:15 +00:00
Christoffer Dall
2b88104467 KVM: arm64: Rewrite sysreg alternatives to static keys
As we are about to move calls around in the sysreg save/restore logic,
let's first rewrite the alternative function callers, because it is
going to make the next patches much easier to read.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:14 +00:00
Christoffer Dall
060701f04a KVM: arm64: Move userspace system registers into separate function
There's a semantic difference between the EL1 registers that control
operation of a kernel running in EL1 and EL1 registers that only control
userspace execution in EL0.  Since we can defer saving/restoring the
latter, move them into their own function.

The ARMv8 ARM (ARM DDI 0487C.a) Section D10.2.1 recommends that
ACTLR_EL1 has no effect on the processor when running the VHE host, and
we can therefore move this register into the EL1 state which is only
saved/restored on vcpu_put/load for a VHE host.

We also take this chance to rename the function saving/restoring the
remaining system register to make it clear this function deals with
the EL1 system registers.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:14 +00:00
Christoffer Dall
04fef05700 KVM: arm64: Remove noop calls to timer save/restore from VHE switch
The VHE switch function calls __timer_enable_traps and
__timer_disable_traps which don't do anything on VHE systems.
Therefore, simply remove these calls from the VHE switch function and
make the functions non-conditional as they are now only called from the
non-VHE switch path.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:14 +00:00
Christoffer Dall
34f8cdf1df KVM: arm64: Don't deactivate VM on VHE systems
There is no need to reset the VTTBR to zero when exiting the guest on
VHE systems.  VHE systems don't use stage 2 translations for the EL2&0
translation regime used by the host.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:13 +00:00
Christoffer Dall
86d05682b4 KVM: arm64: Remove kern_hyp_va() use in VHE switch function
VHE kernels run completely in EL2 and therefore don't have a notion of
kernel and hyp addresses, they are all just kernel addresses.  Therefore
don't call kern_hyp_va() in the VHE switch function.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:13 +00:00
Christoffer Dall
3f5c90b890 KVM: arm64: Introduce VHE-specific kvm_vcpu_run
So far this is mostly (see below) a copy of the legacy non-VHE switch
function, but we will start reworking these functions in separate
directions to work on VHE and non-VHE in the most optimal way in later
patches.

The only difference after this patch between the VHE and non-VHE run
functions is that we omit the branch-predictor variant-2 hardening for
QC Falkor CPUs, because this workaround is specific to a series of
non-VHE ARMv8.0 CPUs.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:13 +00:00
Christoffer Dall
dc251406bf KVM: arm64: Factor out fault info population and gic workarounds
The current world-switch function has functionality to detect a number
of cases where we need to fixup some part of the exit condition and
possibly run the guest again, before having restored the host state.

This includes populating missing fault info, emulating GICv2 CPU
interface accesses when mapped at unaligned addresses, and emulating
the GICv3 CPU interface on systems that need it.

As we are about to have an alternative switch function for VHE systems,
but VHE systems still need the same early fixup logic, factor out this
logic into a separate function that can be shared by both switch
functions.

No functional change.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:12 +00:00
Christoffer Dall
014c4c77aa KVM: arm64: Improve debug register save/restore flow
Instead of having multiple calls from the world switch path to the debug
logic, each figuring out if the dirty bit is set and if we should
save/restore the debug registers, let's just provide two hooks to the
debug save/restore functionality, one for switching to the guest
context, and one for switching to the host context, and we get the
benefit of only having to evaluate the dirty flag once on each path,
plus we give the compiler some more room to inline some of this
functionality.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:12 +00:00
Christoffer Dall
5742d04912 KVM: arm64: Slightly improve debug save/restore functions
The debug save/restore functions can be improved by using the has_vhe()
static key instead of the instruction alternative.  Using the static key
uses the same paradigm as we're going to use elsewhere, it makes the
code more readable, and it generates slightly better code (no
stack setups and function calls unless necessary).

We also use a static key on the restore path, because it will be
marginally faster than loading a value from memory.

Finally, we don't have to conditionally clear the debug dirty flag if
it's set, we can just clear it.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:12 +00:00
Christoffer Dall
54ceb1bcf8 KVM: arm64: Move debug dirty flag calculation out of world switch
There is no need to figure out inside the world-switch if we should
save/restore the debug registers or not, we might as well do that in the
higher level debug setup code, making it easier to optimize down the
line.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:11 +00:00
Christoffer Dall
e72341c512 KVM: arm/arm64: Introduce vcpu_el1_is_32bit
We have numerous checks around that checks if the HCR_EL2 has the RW bit
set to figure out if we're running an AArch64 or AArch32 VM.  In some
cases, directly checking the RW bit (given its unintuitive name), is a
bit confusing, and that's not going to improve as we move logic around
for the following patches that optimize KVM on AArch64 hosts with VHE.

Therefore, introduce a helper, vcpu_el1_is_32bit, and replace existing
direct checks of HCR_EL2.RW with the helper.

Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:11 +00:00
Christoffer Dall
bc192ceec3 KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs
As we are about to move a bunch of save/restore logic for VHE kernels to
the load and put functions, we need some infrastructure to do this.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:11 +00:00
Christoffer Dall
3df59d8dd3 KVM: arm/arm64: Get rid of vcpu->arch.irq_lines
We currently have a separate read-modify-write of the HCR_EL2 on entry
to the guest for the sole purpose of setting the VF and VI bits, if set.
Since this is most rarely the case (only when using userspace IRQ chip
and interrupts are in flight), let's get rid of this operation and
instead modify the bits in the vcpu->arch.hcr[_el2] directly when
needed.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:10 +00:00
Shih-Wei Li
35a84dec00 KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flag
We always set the IMO and FMO bits in the HCR_EL2 when running the
guest, regardless if we use the vgic or not.  By moving these flags to
HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of
HCR_EL2 in the world switch code, and we can also soon get rid of the
other one.

This is safe, because even though the IMO and FMO bits control both
taking the interrupts to EL2 and remapping ICC_*_EL1 to ICV_*_EL1 when
executed at EL1, as long as we ensure that these bits are clear when
running the EL1 host, we're OK, because we reset the HCR_EL2 to only
have the HCR_RW bit set when returning to EL1 on non-VHE systems.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:10 +00:00
Christoffer Dall
8f17f5e469 KVM: arm64: Rework hyp_panic for VHE and non-VHE
VHE actually doesn't rely on clearing the VTTBR when returning to the
host kernel, and that is the current key mechanism of hyp_panic to
figure out how to attempt to return to a state good enough to print a
panic statement.

Therefore, we split the hyp_panic function into two functions, a VHE and
a non-VHE, keeping the non-VHE version intact, but changing the VHE
behavior.

The vttbr_el2 check on VHE doesn't really make that much sense, because
the only situation where we can get here on VHE is when the hypervisor
assembly code actually called into hyp_panic, which only happens when
VBAR_EL2 has been set to the KVM exception vectors.  On VHE, we can
always safely disable the traps and restore the host registers at this
point, so we simply do that unconditionally and call into the panic
function directly.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:10 +00:00
Christoffer Dall
4464e210de KVM: arm64: Avoid storing the vcpu pointer on the stack
We already have the percpu area for the host cpu state, which points to
the VCPU, so there's no need to store the VCPU pointer on the stack on
every context switch.  We can be a little more clever and just use
tpidr_el2 for the percpu offset and load the VCPU pointer from the host
context.

This has the benefit of being able to retrieve the host context even
when our stack is corrupted, and it has a potential performance benefit
because we trade a store plus a load for an mrs and a load on a round
trip to the guest.

This does require us to calculate the percpu offset without including
the offset from the kernel mapping of the percpu array to the linear
mapping of the array (which is what we store in tpidr_el1), because a
PC-relative generated address in EL2 is already giving us the hyp alias
of the linear mapping of a kernel address.  We do this in
__cpu_init_hyp_mode() by using kvm_ksym_ref().

The code that accesses ESR_EL2 was previously using an alternative to
use the _EL1 accessor on VHE systems, but this was actually unnecessary
as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2
accessor does the same thing on both systems.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:09 +00:00
Christoffer Dall
829a586354 KVM: arm/arm64: Move vcpu_load call after kvm_vcpu_first_run_init
Moving the call to vcpu_load() in kvm_arch_vcpu_ioctl_run() to after
we've called kvm_vcpu_first_run_init() simplifies some of the vgic and
there is also no need to do vcpu_load() for things such as handling the
immediate_exit flag.

Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:09 +00:00
Christoffer Dall
7a364bd5db KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN
Calling vcpu_load() registers preempt notifiers for this vcpu and calls
kvm_arch_vcpu_load().  The latter will soon be doing a lot of heavy
lifting on arm/arm64 and will try to do things such as enabling the
virtual timer and setting us up to handle interrupts from the timer
hardware.

Loading state onto hardware registers and enabling hardware to signal
interrupts can be problematic when we're not actually about to run the
VCPU, because it makes it difficult to establish the right context when
handling interrupts from the timer, and it makes the register access
code difficult to reason about.

Luckily, now when we call vcpu_load in each ioctl implementation, we can
simply remove the call from the non-KVM_RUN vcpu ioctls, and our
kvm_arch_vcpu_load() is only used for loading vcpu content to the
physical CPU when we're actually going to run the vcpu.

Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19 10:53:09 +00:00
Shanker Donthineni
250be9d61c KVM: arm/arm64: No need to zero CNTVOFF in kvm_timer_vcpu_put() for VHE
In AArch64/AArch32, the virtual counter uses a fixed virtual offset
of zero in the following situations as per ARMv8 specifications:

1) HCR_EL2.E2H is 1, and CNTVCT_EL0/CNTVCT are read from EL2.
2) HCR_EL2.{E2H, TGE} is {1, 1}, and either:
   — CNTVCT_EL0 is read from Non-secure EL0 or EL2.
   — CNTVCT is read from Non-secure EL0.

So, no need to zero CNTVOFF_EL2/CNTVOFF for VHE case.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-26 10:48:02 +01:00
Jérémy Fanguède
b9fb17395b KVM: arm: Enable emulation of the physical timer
Set the handlers to emulate read and write operations for CNTP_CTL,
CNTP_CVAL and CNTP_TVAL registers in such a way that VMs can use the
physical timer.

Signed-off-by: Jérémy Fanguède <j.fanguede@virtualopensystems.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-26 10:48:02 +01:00
Jérémy Fanguède
eac137b4a9 KVM: arm64: Enable the EL1 physical timer for AArch32 guests
Some 32bits guest OS can use the CNTP timer, however KVM does not
handle the accesses, injecting a fault instead.

Use the proper handlers to emulate the EL1 Physical Timer (CNTP)
register accesses of AArch32 guests.

Signed-off-by: Jérémy Fanguède <j.fanguede@virtualopensystems.com>
Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-26 10:48:02 +01:00
Dave Martin
005781be12 arm64: KVM: Move CPU ID reg trap setup off the world switch path
The HCR_EL2.TID3 flag needs to be set when trapping guest access to
the CPU ID registers is required.  However, the decision about
whether to set this bit does not need to be repeated at every
switch to the guest.

Instead, it's sufficient to make this decision once and record the
outcome.

This patch moves the decision to vcpu_reset_hcr() and records the
choice made in vcpu->arch.hcr_el2.  The world switch code can then
load this directly when switching to the guest without the need for
conditional logic on the critical path.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Suggested-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-26 10:48:01 +01:00
Mark Rutland
cc33c4e201 arm64/kvm: Prohibit guest LOR accesses
We don't currently limit guest accesses to the LOR registers, which we
neither virtualize nor context-switch. As such, guests are provided with
unusable information/controls, and are not isolated from each other (or
the host).

To prevent these issues, we can trap register accesses and present the
illusion LORegions are unssupported by the CPU. To do this, we mask
ID_AA64MMFR1.LO, and set HCR_EL2.TLOR to trap accesses to the following
registers:

* LORC_EL1
* LOREA_EL1
* LORID_EL1
* LORN_EL1
* LORSA_EL1

... when trapped, we inject an UNDEFINED exception to EL1, simulating
their non-existence.

As noted in D7.2.67, when no LORegions are implemented, LoadLOAcquire
and StoreLORelease must behave as LoadAcquire and StoreRelease
respectively. We can ensure this by clearing LORC_EL1.EN when a CPU's
EL2 is first initialized, as the host kernel will not modify this.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-26 10:48:01 +01:00
Linus Torvalds
4a3928c6f8 Linux 4.16-rc3 2018-02-25 18:50:41 -08:00
Linus Torvalds
e1171aca7d Xtensa fixes for 4.16
- fix memory accounting when reserved memory is in high memory region;
 - fix DMA allocation from high memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAlqTLMUTHGpjbXZia2Jj
 QGdtYWlsLmNvbQAKCRBR+cyR+D+gRF/3D/49eKQJf8P/F1MQlZCGXPPR+NAtPnVe
 NakVAE4jNZ/6m+pt694hdifG2LIbo5ePxWeW+7Nmn5POI6QO71qZEJR9KKD6Sbs5
 1pG/+BtLHyRgef7QEqWuCaljFSvoIm7CrXnBvFbEN8FAiDlZ7MDu8w3iNtULTcyu
 jxML8vfAIq8w/aRa0v+GlKNkQDk+K6UCvzTn8oBGet5zWIJ/ufjLreNQm0qkDyB9
 K+UkyIUg94hC79RM4ZcJhka+cBq6B6qITft0TbdpMzWeYyZH01zGaw1t2tgi596B
 T/rIlP4fpjF+aBHRoWBoiz0I0sFEoTtwiEWXS+PBSDfoDSzT9KYH+eJMi6KjQxXR
 MwPmSz8wgMoObX91TO5pMhq281yZMkuee4mEdutssXisHNscwimpnlrCxAQLDLY7
 kosMxHkU4UW/elpnPUWTWWReJjif9+cXzu69Hwn1QKWIZJfbPeoIqGtiFeZTBAeR
 NANhS80A4Psk7XF/hgyWD44xXYB7PXLzg1cYyQDz/3hTydAnoh9Rq8I0T4ccmbDO
 c4SpKwe0I9H/b8jBAuxu6YmF02EOMPNQapVOoTe/mNLUj5tsOUH8oDx/P2fxWUpZ
 bY+9lwwo32AljujVxGEmJmtKKZQoJYfNlNDWIgL/fSd1sXE1Sft2M6eRG54K5tEa
 +Q6M+lu+kXKWoQ==
 =IRmt
 -----END PGP SIGNATURE-----

Merge tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:
 "Two fixes for reserved memory/DMA buffers allocation in high memory on
  xtensa architecture

   - fix memory accounting when reserved memory is in high memory region

   - fix DMA allocation from high memory"

* tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: support DMA buffers in high memory
  xtensa: fix high memory/reserved memory collision
2018-02-25 17:02:24 -08:00
Linus Torvalds
c23a757591 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A small set of fixes:

   - UAPI data type correction for hyperv

   - correct the cpu cores field in /proc/cpuinfo on CPU hotplug

   - return proper error code in the resctrl file system failure path to
     avoid silent subsequent failures

   - correct a subtle accounting issue in the new vector allocation code
     which went unnoticed for a while and caused suspend/resume
     failures"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
  x86/topology: Fix function name in documentation
  x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system
  x86/apic/vector: Handle vector release on CPU unplug correctly
  genirq/matrix: Handle CPU offlining proper
  x86/headers/UAPI: Use __u64 instead of u64 in <uapi/asm/hyperv.h>
2018-02-25 16:58:55 -08:00
Linus Torvalds
e912bf2cf7 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Thomas Gleixner:
 "A single commit which shuts up a bogus GCC-8 warning"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/oprofile: Fix bogus GCC-8 warning in nmi_setup()
2018-02-25 16:57:22 -08:00
Linus Torvalds
9c897096bb Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
 "Three patches to fix memory ordering issues on ALPHA and a comment to
  clarify the usage scope of a mutex internal function"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
  locking/xchg/alpha: Clean up barrier usage by using smp_mb() in place of __ASM__MB
  locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
  locking/mutex: Add comment to __mutex_owner() to deter usage
2018-02-25 16:29:59 -08:00
Linus Torvalds
297ea1b7f7 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cleanup patchlet from Thomas Gleixner:
 "A single commit removing a bunch of bogus double semicolons all over
  the tree"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide/trivial: Remove ';;$' typo noise
2018-02-25 16:27:51 -08:00
Linus Torvalds
c89be52426 NFS client bugfixes for Linux 4.16
Hightlights include:
 - Fix a broken cast in nfs4_callback_recallany()
 - Fix an Oops during NFSv4 migration events
 - make struct nlmclnt_fl_close_lock_ops static
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJakuNIAAoJEGcL54qWCgDykYYQAITHRrWP7tQ6aSpZxW5+Un5z
 6K3RRbfxFjHWVyaePCBzRMOtPTA/puqO0ggx9H+2D4+u2GeXhFl7FMdIuLueGKrc
 rh0wzB6+KiHvqK8NT3g4c2VzZbGJ8IWB6jlNaA3ZyHRJcO+Oi3rQhYBNZpVqP6ny
 M1C3yXQTUtA13aOLjeThoAKIJyknwdZcsiMTptJslvSsQ9PL0w6m6jZKrVHu6Rc+
 Hg12FFptaKien/gj2IUYJb6Z2Mz3arJu1Y7cm1P/zH/NBs37ynMUsrb9AvPbzvRm
 PvPRT4ugNOlTgDTaIT2JHwP2bhlp2JF+Tdzq7WYE3ek2CEPUD49jv07MlgTHxy/w
 +tcp/322ZCxvKLjHTeEWqGn4T0TZ1TdPrd4dIJsjox9Ffy72Z3rOjvLKt5UrRV0i
 8IhiE3/ruHFpB75Yfi7ABEIH8aEwwmchQTf5bth0ZKoZdaEPHmy3xnJkxa+wlD1V
 Hp6KoqMNlDXEFx2Ih/SD6j50MFKszq6+cjUk2D8iclLnelXhu9iddFQ+PFNfsxVZ
 WSo4AWZoPbtbLnn9Ez9dkdsJILKv86LbEvYxLX6/LnxLzzX70E34tbRQa+drVeR3
 Na6czRpld85juKWgiFkzNx+zD4TaBAxUMbQH2Gbwngiz5RoT6SnkkdkAK3EW7nhR
 pIJgZGiq/NG3NLiCxIgs
 =l2jv
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - fix a broken cast in nfs4_callback_recallany()

 - fix an Oops during NFSv4 migration events

 - make struct nlmclnt_fl_close_lock_ops static

* tag 'nfs-for-4.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: make struct nlmclnt_fl_close_lock_ops static
  nfs: system crashes after NFS4ERR_MOVED recovery
  NFSv4: Fix broken cast in nfs4_callback_recallany()
2018-02-25 13:43:18 -08:00
Linus Torvalds
3664ce2d93 powerpc fixes for 4.16 #4
Add handling for a missing instruction in our 32-bit BPF JIT so that it can be
 used for seccomp filtering.
 
 Add a missing NULL pointer check before a function call in new EEH code.
 
 Fix an error path in the new ocxl driver to correctly return EFAULT.
 
 The support for the new ibm,drc-info device tree property turns out to need
 several fixes, so for now we just stop advertising to firmware that we support
 it until the bugs can be ironed out.
 
 One fix for the new drmem code which was incorrectly modifying the device tree
 in place.
 
 Finally two fixes for the RFI flush support, so that firmware can advertise to
 us that it should be disabled entirely so as not to affect performance.
 
 Thanks to:
   Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord, Michael Bringmann.
 -----BEGIN PGP SIGNATURE-----
 
 iQIwBAABCAAaBQJakNObExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAy
 5A//RHFnwaXrOmcQ7h6fJim9lh+CsbhyHkf+hEGqJ4gp/ZOxjgWLjxgqB8O09hLb
 CZUvbqc3phip431E+qCw1KVVoPzKXVVqEQD1mckvpVwVL7vVxOvcNfjzRH3oMsjC
 jLQAfEvZMO6Lf14tb83vC5YBYOrWjojyOqJ/DbFzedw9hLLt1JTI12KfcgRZo5xr
 3WhyevTHa4JJSgLhY9r34Q2hsf7CS2yTfhHa2+iHFPg2fINYb+Ld8V5JNj3JiFtx
 fd6JiPFjQSyK+xXwK55eQGqMXZdEJU0iSwnYWBLUnFGaK7MQpEIJdv7FiRjD7Hcp
 b3wpBuCSsN2mQIYpsoBY3GUm0thtAIbLDTRq1nAWdULbnOi1QBQy694Vgn4f8A79
 T10Y1QuuwL2nInPy9qb/c6sukGz/czu9txFyqO/PwtRs99A6i4hRAtfKLUeW0cbG
 QxI3GMfxst31SThqjVQifyhy7omXEjU8Oc6o1VMp8Vm1eH1QviwzcfmKA5UQuq3k
 Y+ninT4wjo8MceAZ6F3w767DQmTK9e9lcgLU1ZZd/jntcnDzTr6AF54TFNmmicRb
 o0arHMneMDo1IHEUfeski+0j0uZGuA/reQ6A/mDGSxum09voPwtWEH+dlAr9oKSQ
 lgETCP6GN7rWluEWTYidgGGhV77QsfLNZyJYbGJXQ8cValo=
 =mTqU
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Add handling for a missing instruction in our 32-bit BPF JIT so that
   it can be used for seccomp filtering.

 - Add a missing NULL pointer check before a function call in new EEH
   code.

 - Fix an error path in the new ocxl driver to correctly return EFAULT.

 - The support for the new ibm,drc-info device tree property turns out
   to need several fixes, so for now we just stop advertising to
   firmware that we support it until the bugs can be ironed out.

 - One fix for the new drmem code which was incorrectly modifying the
   device tree in place.

 - Finally two fixes for the RFI flush support, so that firmware can
   advertise to us that it should be disabled entirely so as not to
   affect performance.

Thanks to: Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord,
Michael Bringmann.

* tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv: Support firmware disable of RFI flush
  powerpc/pseries: Support firmware disable of RFI flush
  powerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2
  powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
  powerpc/pseries: Revert support for ibm,drc-info devtree property
  powerpc/pseries: Fix duplicate firmware feature for DRC_INFO
  ocxl: Fix potential bad errno on irq allocation
  powerpc/eeh: Fix crashes in eeh_report_resume()
2018-02-24 16:05:50 -08:00