KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em

[ Upstream commit 470750b342 ]

Keep CR3 load/store exiting enable as needed when running L2 in order to
honor L1's desires.  This fixes a largely theoretical bug where L1 could
intercept CR3 but not CR0.PG and end up not getting the desired CR3 exits
when L2 enables paging.  In other words, the existing !is_paging() check
inadvertantly handles the normal case for L2 where vmx_set_cr0() is
called during VM-Enter, which is guaranteed to run with paging enabled,
and thus will never clear the bits.

Removing the !is_paging() check will also allow future consolidation and
cleanup of the related code.  From a performance perspective, this is
all a nop, as the VMCS controls shadow will optimize away the VMWRITE
when the controls are in the desired state.

Add a comment explaining why CR3 is intercepted, with a big disclaimer
about not querying the old CR3.  Because vmx_set_cr0() is used for flows
that are not directly tied to MOV CR3, e.g. vCPU RESET/INIT and nested
VM-Enter, it's possible that is_paging() is not synchronized with CR3
load/store exiting.  This is actually guaranteed in the current code, as
KVM starts with CR3 interception disabled.  Obviously that can be fixed,
but there's no good reason to play whack-a-mole, and it tends to end
poorly, e.g. descriptor table exiting for UMIP emulation attempted to be
precise in the past and ended up botching the interception toggling.

Fixes: fe3ef05c75 ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-25-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stable-dep-of: c4abd73520 ("KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest")
Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit is contained in:
Sean Christopherson 2021-07-13 09:33:02 -07:00 committed by Greg Kroah-Hartman
parent 7fc51da40b
commit c710ff0612

View file

@ -3063,10 +3063,14 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu)
kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
}
#define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \
CPU_BASED_CR3_STORE_EXITING)
void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long hw_cr0;
u32 tmp;
hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
if (is_unrestricted_guest(vcpu))
@ -3093,18 +3097,42 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
#endif
if (enable_ept && !is_unrestricted_guest(vcpu)) {
/*
* Ensure KVM has an up-to-date snapshot of the guest's CR3. If
* the below code _enables_ CR3 exiting, vmx_cache_reg() will
* (correctly) stop reading vmcs.GUEST_CR3 because it thinks
* KVM's CR3 is installed.
*/
if (!kvm_register_is_available(vcpu, VCPU_EXREG_CR3))
vmx_cache_reg(vcpu, VCPU_EXREG_CR3);
/*
* When running with EPT but not unrestricted guest, KVM must
* intercept CR3 accesses when paging is _disabled_. This is
* necessary because restricted guests can't actually run with
* paging disabled, and so KVM stuffs its own CR3 in order to
* run the guest when identity mapped page tables.
*
* Do _NOT_ check the old CR0.PG, e.g. to optimize away the
* update, it may be stale with respect to CR3 interception,
* e.g. after nested VM-Enter.
*
* Lastly, honor L1's desires, i.e. intercept CR3 loads and/or
* stores to forward them to L1, even if KVM does not need to
* intercept them to preserve its identity mapped page tables.
*/
if (!(cr0 & X86_CR0_PG)) {
/* From paging/starting to nonpaging */
exec_controls_setbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
CPU_BASED_CR3_STORE_EXITING);
vcpu->arch.cr0 = cr0;
vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
} else if (!is_paging(vcpu)) {
/* From nonpaging to paging */
exec_controls_clearbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
CPU_BASED_CR3_STORE_EXITING);
exec_controls_setbit(vmx, CR3_EXITING_BITS);
} else if (!is_guest_mode(vcpu)) {
exec_controls_clearbit(vmx, CR3_EXITING_BITS);
} else {
tmp = exec_controls_get(vmx);
tmp &= ~CR3_EXITING_BITS;
tmp |= get_vmcs12(vcpu)->cpu_based_vm_exec_control & CR3_EXITING_BITS;
exec_controls_set(vmx, tmp);
}
if (!is_paging(vcpu) != !(cr0 & X86_CR0_PG)) {
vcpu->arch.cr0 = cr0;
vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
}