Commit graph

112 commits

Author SHA1 Message Date
Yu Zhang
e911eb3b34 KVM: x86: Add return value to kvm_cpuid().
Return false in kvm_cpuid() when it fails to find the cpuid
entry. Also, this routine(and its caller) is optimized with
a new argument - check_limit, so that the check_cpuid_limit()
fall back can be avoided.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-24 18:09:15 +02:00
Wanpeng Li
adfe20fb48 KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
Add an nested_apf field to vcpu->arch.exception to identify an async page
fault, and constructs the expected vm-exit information fields. Force a
nested VM exit from nested_vmx_check_exception() if the injected #PF is
async page fault.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-14 14:26:16 +02:00
Paolo Bonzini
c8401dda2f KVM: x86: fix singlestepping over syscall
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.

KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice.  This does not affect Linux guests, as they use an IST or task gate
for #DB.

This fixes CVE-2017-7518.

Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-22 16:13:29 +02:00
Ladi Prosek
6ed071f051 KVM: x86: fix emulation of RSM and IRET instructions
On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
on hflags is reverted later on in x86_emulate_instruction where hflags are
overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.

Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
an instruction is emulated, this commit deletes emul_flags altogether and
makes the emulator access vcpu->arch.hflags using two new accessors. This
way all changes, on the emulator side as well as in functions called from
the emulator and accessing vcpu state with emul_to_vcpu, are preserved.

More details on the bug and its manifestation with Windows and OVMF:

  It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
  I believe that the SMM part explains why we started seeing this only with
  OVMF.

  KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
  the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
  later on in x86_emulate_instruction we overwrite arch.hflags with
  ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
  The AMD-specific hflag of interest here is HF_NMI_MASK.

  When rebooting the system, Windows sends an NMI IPI to all but the current
  cpu to shut them down. Only after all of them are parked in HLT will the
  initiating cpu finish the restart. If NMI is masked, other cpus never get
  the memo and the initiating cpu spins forever, waiting for
  hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.

Fixes: a584539b24 ("KVM: x86: pass the whole hflags field to emulator and back")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27 16:54:09 +02:00
Tom Lendacky
0f89b207b0 kvm: svm: Use the hardware provided GPA instead of page walk
When a guest causes a NPF which requires emulation, KVM sometimes walks
the guest page tables to translate the GVA to a GPA. This is unnecessary
most of the time on AMD hardware since the hardware provides the GPA in
EXITINFO2.

The only exception cases involve string operations involving rep or
operations that use two memory locations. With rep, the GPA will only be
the value of the initial NPF and with dual memory locations we won't know
which memory address was translated into EXITINFO2.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-09 14:47:58 +01:00
Radim Krčmář
7a036a6f67 KVM: x86: add read_phys to x86_emulate_ops
We want to read the physical memory when emulating RSM.

X86EMUL_IO_NEEDED is returned on all errors for consistency with other
helpers.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-04 16:24:31 +01:00
Paolo Bonzini
64d6067057 KVM: x86: stubs for SMM support
This patch adds the interface between x86.c and the emulator: the
SMBASE register, a new emulator flag, the RSM instruction.  It also
adds a new request bit that will be used by the KVM_SMI ioctl.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04 16:01:45 +02:00
Paolo Bonzini
a584539b24 KVM: x86: pass the whole hflags field to emulator and back
The hflags field will contain information about system management mode
and will be useful for the emulator.  Pass the entire field rather than
just the guest-mode information.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04 16:01:05 +02:00
Nadav Amit
801806d956 KVM: x86: IRET emulation does not clear NMI masking
The IRET instruction should clear NMI masking, but the current implementation
does not do so.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-26 12:14:42 +01:00
Paolo Bonzini
17052f16a5 KVM: emulate: put pointers in the fetch_cache
This simplifies the code a bit, especially the overflow checks.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:14:03 +02:00
Bandan Das
41061cdb98 KVM: emulate: do not initialize memopp
rip_relative is only set if decode_modrm runs, and if you have ModRM
you will also have a memopp.  We can then access memopp unconditionally.
Note that rip_relative cannot be hoisted up to decode_modrm, or you
break "mov $0, xyz(%rip)".

Also, move typecast on "out of range value" of mem.ea to decode_modrm.

Together, all these optimizations save about 50 cycles on each emulated
instructions (4-6%).

Signed-off-by: Bandan Das <bsd@redhat.com>
[Fix immediate operands with rip-relative addressing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:14:01 +02:00
Bandan Das
573e80fe04 KVM: emulate: rework seg_override
x86_decode_insn already sets a default for seg_override,
so remove it from the zeroed area. Also replace set/get functions
with direct access to the field.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:14:01 +02:00
Bandan Das
c44b4c6ab8 KVM: emulate: clean up initializations in init_decode_cache
A lot of initializations are unnecessary as they get set to
appropriate values before actually being used. Optimize
placement of fields in x86_emulate_ctxt

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:14:00 +02:00
Bandan Das
1498507a47 KVM: emulate: move init_decode_cache to emulate.c
Core emulator functions all belong in emulator.c,
x86 should have no knowledge of emulator internals

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:13:59 +02:00
Paolo Bonzini
54cfdb3e95 KVM: emulate: speed up emulated moves
We can just blindly move all 16 bytes of ctxt->src's value to ctxt->dst.
write_register_operand will take care of writing only the lower bytes.

Avoiding a call to memcpy (the compiler optimizes it out) gains about
200 cycles on kvm-unit-tests for register-to-register moves, and makes
them about as fast as arithmetic instructions.

We could perhaps get a larger speedup by moving all instructions _except_
moves out of x86_emulate_insn, removing opcode_len, and replacing the
switch statement with an inlined em_mov.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-11 09:13:58 +02:00
Jan Kiszka
6cbc5f5a80 KVM: nSVM: Set correct port for IOIO interception evaluation
Obtaining the port number from DX is bogus as a) there are immediate
port accesses and b) user space may have changed the register content
while processing the PIO access. Forward the correct value from the
instruction emulator instead.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09 18:09:56 +02:00
Nadav Amit
67f4d4288c KVM: x86: rdpmc emulation checks the counter incorrectly
The rdpmc emulation checks that the counter (ECX) is not higher than 2, without
taking into considerations bits 30:31 role (e.g., bit 30 marks whether the
counter is fixed). The fix uses the pmu information for checking the validity
of the pmu counter.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 17:46:18 +02:00
Paolo Bonzini
fb5e336b97 KVM: x86: drop set_rflags callback
Not needed anymore now that the CPL is computed directly
during task switch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-22 17:47:16 +02:00
Borislav Petkov
b51e974fcd kvm, emulator: Rename VendorSpecific flag
Call it EmulateOnUD which is exactly what we're trying to do with
vendor-specific instructions.

Rename ->only_vendor_specific_insn to something shorter, while at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:40 +01:00
Borislav Petkov
1ce19dc16c kvm, emulator: Use opcode length
Add a field to the current emulation context which contains the
instruction opcode length. This will streamline handling of opcodes of
different length.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 18:54:39 +01:00
Gleb Natapov
b3356bf0db KVM: emulator: optimize "rep ins" handling
Optimize "rep ins" by allowing emulator to write back more than one
datum at a time. Introduce new operand type OP_MEM_STR which tells
writeback() that dst contains pointer to an array that should be written
back as opposite to just one data element.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06 18:07:38 +03:00
Gleb Natapov
9d1b39a967 KVM: emulator: make x86 emulation modes enum instead of defines
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06 18:07:01 +03:00
Mathias Krause
0225fb509d KVM: x86 emulator: constify emulate_ops
We never change emulate_ops[] at runtime so it should be r/o.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05 12:41:48 +03:00
Avi Kivity
dd856efafe KVM: x86 emulator: access GPRs on demand
Instead of populating the entire register file, read in registers
as they are accessed, and write back only the modified ones.  This
saves a VMREAD and VMWRITE on Intel (for rsp, since it is not usually
used during emulation), and a two 128-byte copies for the registers.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27 18:38:55 -03:00
Avi Kivity
cbd27ee783 KVM: x86 emulator: initialize memop
memop is not initialized; this can lead to a two-byte operation
following a 4-byte operation to see garbage values.  Usually
truncation fixes things fot us later on, but at least in one case
(call abs) it doesn't.

Fix by moving memop to the auto-initialized field area.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-09 14:19:02 +03:00
Avi Kivity
0017f93a27 KVM: x86 emulator: change ->get_cpuid() accessor to use the x86 semantics
Instead of getting an exact leaf, follow the spec and fall back to the last
main leaf instead.  This lets us easily emulate the cpuid instruction in the
emulator.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-09 14:19:00 +03:00
Avi Kivity
cbe2c9d30a KVM: x86 emulator: MMX support
General support for the MMX instruction set.  Special care is taken
to trap pending x87 exceptions so that they are properly reflected
to the guest.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-16 20:36:16 -03:00
Kevin Wolf
4cee4798a3 KVM: x86 emulator: Allow PM/VM86 switch during task switch
Task switches can switch between Protected Mode and VM86. The current
mode must be updated during the task switch emulation so that the new
segment selectors are interpreted correctly.

In order to let privilege checks succeed, rflags needs to be updated in
the vcpu struct as this causes a CPL update.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 14:10:29 +02:00
Kevin Wolf
7f3d35fddd KVM: x86 emulator: Fix task switch privilege checks
Currently, all task switches check privileges against the DPL of the
TSS. This is only correct for jmp/call to a TSS. If a task gate is used,
the DPL of this take gate is used for the check instead. Exceptions,
external interrupts and iret shouldn't perform any check.

[avi: kill kvm-kmod remnants]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 14:10:26 +02:00
Stephan Bärwolf
c2226fc9e8 KVM: x86: fix missing checks in syscall emulation
On hosts without this patch, 32bit guests will crash (and 64bit guests
may behave in a wrong way) for example by simply executing following
nasm-demo-application:

    [bits 32]
    global _start
    SECTION .text
    _start: syscall

(I tested it with winxp and linux - both always crashed)

    Disassembly of section .text:

    00000000 <_start>:
       0:   0f 05                   syscall

The reason seems a missing "invalid opcode"-trap (int6) for the
syscall opcode "0f05", which is not available on Intel CPUs
within non-longmodes, as also on some AMD CPUs within legacy-mode.
(depending on CPU vendor, MSR_EFER and cpuid)

Because previous mentioned OSs may not engage corresponding
syscall target-registers (STAR, LSTAR, CSTAR), they remain
NULL and (non trapping) syscalls are leading to multiple
faults and finally crashs.

Depending on the architecture (AMD or Intel) pretended by
guests, various checks according to vendor's documentation
are implemented to overcome the current issue and behave
like the CPUs physical counterparts.

[mtosatti: cleanup/beautify code]

Signed-off-by: Stephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-01 11:43:40 +02:00
Stephan Bärwolf
bdb42f5afe KVM: x86: extend "struct x86_emulate_ops" with "get_cpuid"
In order to be able to proceed checks on CPU-specific properties
within the emulator, function "get_cpuid" is introduced.
With "get_cpuid" it is possible to virtually call the guests
"cpuid"-opcode without changing the VM's context.

[mtosatti: cleanup/beautify code]

Signed-off-by: Stephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-01 11:43:33 +02:00
Avi Kivity
222d21aa07 KVM: x86 emulator: implement RDPMC (0F 33)
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-27 11:24:43 +02:00
Xiao Guangrong
1cb3f3ae5a KVM: x86: retry non-page-table writing instructions
If the emulation is caused by #PF and it is non-page_table writing instruction,
it means the VM-EXIT is caused by shadow page protected, we can zap the shadow
page and retry this instruction directly

The idea is from Avi

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-27 11:16:50 +02:00
Avi Kivity
b1ea50b2b6 KVM: x86 emulator: expand decode flags to 64 bits
Unifiying the operands means not taking advantage of the fact that some
operand types can only go into certain operands (for example, DI can only
be used by the destination), so we need more bits to hold the operand type.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:52:49 +03:00
Avi Kivity
f09ed83e21 KVM: x86 emulator: move memop, memopp into emulation context
Simplifies further generalization of decode.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:52:47 +03:00
Avi Kivity
9dac77fa40 KVM: x86 emulator: fold decode_cache into x86_emulate_ctxt
This saves a lot of pointless casts x86_emulate_ctxt and decode_cache.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 13:16:09 +03:00
Avi Kivity
36dd9bb5ce KVM: x86 emulator: rename decode_cache::eip to _eip
The name eip conflicts with a field of the same name in x86_emulate_ctxt,
which we plan to fold decode_cache into.

The name _eip is unfortunate, but what's really needed is a refactoring
here, not a better name.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 13:16:09 +03:00
Takuya Yoshikawa
b5c9ff731f KVM: x86 emulator: Avoid clearing the whole decode_cache
During tracing the emulator, we noticed that init_emulate_ctxt()
sometimes took a bit longer time than we expected.

This patch is for mitigating the problem by some degree.

By looking into the function, we soon notice that it clears the whole
decode_cache whose size is about 2.5K bytes now.  Furthermore, most of
the bytes are taken for the two read_cache arrays, which are used only
by a few instructions.

Considering the fact that we are not assuming the cache arrays have
been cleared when we store actual data, we do not need to clear the
arrays: 2K bytes elimination.  In addition, we can avoid clearing the
fetch_cache and regs arrays.

This patch changes the initialization not to clear the arrays.

On our 64-bit host, init_emulate_ctxt() becomes 0.3 to 0.5us faster with
this patch applied.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 11:45:09 +03:00
Takuya Yoshikawa
7b105ca290 KVM: x86 emulator: Stop passing ctxt->ops as arg of emul functions
Dereference it in the actual users.

This not only cleans up the emulator but also makes it easy to convert
the old emulation functions to the new em_xxx() form later.

Note: Remove some inline keywords to let the compiler decide inlining.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 11:44:59 +03:00
Avi Kivity
1aa366163b KVM: x86 emulator: consolidate segment accessors
Instead of separate accessors for the segment selector and cached descriptor,
use one accessor for both.  This simplifies the code somewhat.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:47:39 -04:00
Avi Kivity
40e19b519c KVM: SVM: Get rid of x86_intercept_map::valid
By reserving 0 as an invalid x86_intercept_stage, we no longer
need to store a valid flag in x86_intercept_map.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:37 -04:00
Avi Kivity
13db70eca6 KVM: x86 emulator: drop x86_emulate_ctxt::vcpu
No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:24 -04:00
Avi Kivity
bcaf5cc543 KVM: x86 emulator: add new ->wbinvd() callback
Instead of calling kvm_emulate_wbinvd() directly.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:20 -04:00
Avi Kivity
d6aa10003b KVM: x86 emulator: add ->fix_hypercall() callback
Artificial, but needed to remove direct calls to KVM.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:18 -04:00
Avi Kivity
6c3287f7c5 KVM: x86 emulator: add new ->halt() callback
Instead of reaching into vcpu internals.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:17 -04:00
Avi Kivity
3cb16fe78c KVM: x86 emulator: make emulate_invlpg() an emulator callback
Removing direct calls to KVM.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:15 -04:00
Avi Kivity
1ac9d0cfb0 KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()
Replacing direct calls to realmode_lgdt(), realmode_lidt().

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:08 -04:00
Avi Kivity
2953538ebb KVM: x86 emulator: drop vcpu argument from intercept callback
Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:05 -04:00
Avi Kivity
717746e382 KVM: x86 emulator: drop vcpu argument from cr/dr/cpl/msr callbacks
Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:39:03 -04:00
Avi Kivity
4bff1e86ad KVM: x86 emulator: drop vcpu argument from segment/gdt/idt callbacks
Making the emulator caller agnostic.

[Takuya Yoshikawa: fix typo leading to LDT failures]

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-22 08:35:20 -04:00