Commit graph

662507 commits

Author SHA1 Message Date
Paul Mackerras
fb7dcf723d Merge remote-tracking branch 'remotes/powerpc/topic/xive' into kvm-ppc-next
This merges in the powerpc topic/xive branch to bring in the code for
the in-kernel XICS interrupt controller emulation to use the new XIVE
(eXternal Interrupt Virtualization Engine) hardware in the POWER9 chip
directly, rather than via a XICS emulation in firmware.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-28 08:23:16 +10:00
Denis Kirjanov
db4b0dfab7 KVM: PPC: Book3S HV: Avoid preemptibility warning in module initialization
With CONFIG_DEBUG_PREEMPT, get_paca() produces the following warning
in kvmppc_book3s_init_hv() since it calls debug_smp_processor_id().

There is no real issue with the xics_phys field.
If paca->kvm_hstate.xics_phys is non-zero on one cpu, it will be
non-zero on them all.  Therefore this is not fixing any actual
problem, just the warning.

[  138.521188] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/5596
[  138.521308] caller is .kvmppc_book3s_init_hv+0x184/0x350 [kvm_hv]
[  138.521404] CPU: 5 PID: 5596 Comm: modprobe Not tainted 4.11.0-rc3-00022-gc7e790c #1
[  138.521509] Call Trace:
[  138.521563] [c0000007d018b810] [c0000000023eef10] .dump_stack+0xe4/0x150 (unreliable)
[  138.521694] [c0000007d018b8a0] [c000000001f6ec04] .check_preemption_disabled+0x134/0x150
[  138.521829] [c0000007d018b940] [d00000000a010274] .kvmppc_book3s_init_hv+0x184/0x350 [kvm_hv]
[  138.521963] [c0000007d018ba00] [c00000000191d5cc] .do_one_initcall+0x5c/0x1c0
[  138.522082] [c0000007d018bad0] [c0000000023e9494] .do_init_module+0x84/0x240
[  138.522201] [c0000007d018bb70] [c000000001aade18] .load_module+0x1f68/0x2a10
[  138.522319] [c0000007d018bd20] [c000000001aaeb30] .SyS_finit_module+0xc0/0xf0
[  138.522439] [c0000007d018be30] [c00000000191baec] system_call+0x38/0xfc

Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-28 08:21:51 +10:00
Benjamin Herrenschmidt
5af5099385 KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller
This patch makes KVM capable of using the XIVE interrupt controller
to provide the standard PAPR "XICS" style hypercalls. It is necessary
for proper operations when the host uses XIVE natively.

This has been lightly tested on an actual system, including PCI
pass-through with a TG3 device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Cleanup pr_xxx(), unsplit pr_xxx() strings, etc., fix build
 failures by adding KVM_XIVE which depends on KVM_XICS and XIVE, and
 adding empty stubs for the kvm_xive_xxx() routines, fixup subject,
 integrate fixes from Paul for building PR=y HV=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27 21:37:29 +10:00
Nicholas Piggin
8bf8f2e8c7 powerpc/64s: Revert setting of LPCR[LPES] on POWER9
The XIVE enablement patches included a change to set the LPES (Logical
Partitioning Environment Selector) bit (bit # 3) in LPCR (Logical Partitioning
Control Register) on POWER9 hosts. This bit sets external interrupts to guest
delivery mode, which uses SRR0/1. The host's EE interrupt handler is written to
expect HSRR0/1 (for earlier CPUs). This should be fine because XIVE is
configured not to deliver EEs to the host (Hypervisor Virtulization Interrupt is
used instead) so the EE handler should never be executed.

However a bug in interrupt controller code, hardware, or odd configuration of a
simulator could result in the host getting an EE incorrectly. Keeping the EE
delivery mode matching the host EE handler prevents strange crashes due to using
the wrong exception registers.

KVM will configure the LPCR to set LPES prior to running a guest so that EEs are
delivered to the guest using SRR0/1.

Fixes: 08a1e650cc ("powerpc: Fixup LPCR:PECE and HEIC setting on POWER9")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Massage change log to avoid referring to LPES0 which is now renamed LPES]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-26 11:40:21 +10:00
Paolo Bonzini
bd17117bb2 KVM: s390: Guarded storage fixup and keyless subset mode
- detect and use the keyless subset mode (guests without
   storage keys)
 - fix vSIE support for sdnxc
 - fix machine check data for guarded storage
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJY+dkNAAoJEBF7vIC1phx8HC4P+gJtKOhNT9+mAkI3ZEuIH+UY
 OB+2XIkvJpw1F47ZpL9dReEUtOYdF+r0sBmeel+zWjzfwLFBrZqgVRLG18St1JkQ
 i4jGkW7oQyZ7Lc9fhZA5hcRpaN8zN571+wAiYPNtg13oTzsm2N1gmMVBKVC6LMWi
 aoNGornkow9g6VhQiA9yl5ZHtbWEHT2hTE7CmUKPSAqe/SsdTSZ6svSCzWeRsUmT
 +/bmgZuy07Uol6N99xQWGWtYSKQML32THA/b14i14GEy2c7dUnMyE3PQ0jnOi4zd
 Jg9xr2+VOgcZvubhXrslVOQr34qZkAkuCWFC8hqgc6LYJjKXimW/9MCOVDqsLHD8
 gIFXVHaIX30HwUEK4A3pDByBRVYyZ5RtdB/LcIoslbYathig+JYaXEPw2PHmEi2K
 IwUK+dBJpQUpjh2djZBxmC2zMFjzDqQ0lB1y1UkZUeHFVTYSzsi/TI+ovrg6rNJ5
 WQLJIOBY0jNEWatM5ZNUfXdiCO8na0bVJlZU+YUoVFXu0DNmotf8gwfuqj3V2j7v
 JoHGGpjZyw6wwi0DPGo2tJ7h78s5psRGJIuBmRl9FaTn8KavZBLE+/s7TNO+SOTH
 Q08Qvowoz7eVZlKWtF3XG6FgrptNKpXO3YPj/c+kHW+b39JUr6t/1uoKWw7PrP8d
 IIXwghvNwgq8hWgt+zD6
 =Z5HR
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Guarded storage fixup and keyless subset mode

- detect and use the keyless subset mode (guests without
  storage keys)
- fix vSIE support for sdnxc
- fix machine check data for guarded storage
2017-04-21 12:47:47 +02:00
Paolo Bonzini
ec594c4710 Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-04-21 12:47:31 +02:00
Paolo Bonzini
8afd74c296 Merge branch 'x86/process' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD
Required for KVM support of the CPUID faulting feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 11:55:06 +02:00
David Hildenbrand
fe0e80befd KVM: VMX: drop vmm_exclusive module parameter
vmm_exclusive=0 leads to KVM setting X86_CR4_VMXE always and calling
VMXON only when the vcpu is loaded. X86_CR4_VMXE is used as an
indication in cpu_emergency_vmxoff() (called on kdump) if VMXOFF has to be
called. This is obviously not the case if both are used independtly.
Calling VMXOFF without a previous VMXON will result in an exception.

In addition, X86_CR4_VMXE is used as a mean to test if VMX is already in
use by another VMM in hardware_enable(). So there can't really be
co-existance. If the other VMM is prepared for co-existance and does a
similar check, only one VMM can exist. If the other VMM is not prepared
and blindly sets/clears X86_CR4_VMXE, we will get inconsistencies with
X86_CR4_VMXE.

As we also had bug reports related to clearing of vmcs with vmm_exclusive=0
this seems to be pretty much untested. So let's better drop it.

While at it, directly move setting/clearing X86_CR4_VMXE into
kvm_cpu_vmxon/off.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 11:42:49 +02:00
Farhan Ali
730cd632c4 KVM: s390: Support keyless subset guest mode
If the KSS facility is available on the machine, we also make it
available for our KVM guests.

The KSS facility bypasses storage key management as long as the guest
does not issue a related instruction. When that happens, the control is
returned to the host, which has to turn off KSS for a guest vcpu
before retrying the instruction.

Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-21 11:08:11 +02:00
Farhan Ali
71cb1bf66e s390/sclp: Detect KSS facility
Let's detect the keyless subset facility.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-21 11:08:04 +02:00
Thomas Huth
feafd13c96 KVM: PPC: Book3S PR: Do not fail emulation with mtspr/mfspr for unknown SPRs
According to the PowerISA 2.07, mtspr and mfspr should not always
generate an illegal instruction exception when being used with an
undefined SPR, but rather treat the instruction as a NOP or inject a
privilege exception in some cases, too - depending on the SPR number.
Also turn the printk here into a ratelimited print statement, so that
the guest can not flood the dmesg log of the host by issueing lots of
illegal mtspr/mfspr instruction here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:32 +10:00
Alexey Kardashevskiy
121f80ba68 KVM: PPC: VFIO: Add in-kernel acceleration for VFIO
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
without passing them to user space which saves time on switching
to user space and back.

This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
KVM tries to handle a TCE request in the real mode, if failed
it passes the request to the virtual mode to complete the operation.
If it a virtual mode handler fails, the request is passed to
the user space; this is not expected to happen though.

To avoid dealing with page use counters (which is tricky in real mode),
this only accelerates SPAPR TCE IOMMU v2 clients which are required
to pre-register the userspace memory. The very first TCE request will
be handled in the VFIO SPAPR TCE driver anyway as the userspace view
of the TCE table (iommu_table::it_userspace) is not allocated till
the very first mapping happens and we cannot call vmalloc in real mode.

If we fail to update a hardware IOMMU table unexpected reason, we just
clear it and move on as there is nothing really we can do about it -
for example, if we hot plug a VFIO device to a guest, existing TCE tables
will be mirrored automatically to the hardware and there is no interface
to report to the guest about possible failures.

This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
and associates a physical IOMMU table with the SPAPR TCE table (which
is a guest view of the hardware IOMMU table). The iommu_table object
is cached and referenced so we do not have to look up for it in real mode.

This does not implement the UNSET counterpart as there is no use for it -
once the acceleration is enabled, the existing userspace won't
disable it unless a VFIO container is destroyed; this adds necessary
cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.

This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
space.

This adds real mode version of WARN_ON_ONCE() as the generic version
causes problems with rcu_sched. Since we testing what vmalloc_to_phys()
returns in the code, this also adds a check for already existing
vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect().

This finally makes use of vfio_external_user_iommu_id() which was
introduced quite some time ago and was considered for removal.

Tests show that this patch increases transmission speed from 220MB/s
to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:26 +10:00
Alexey Kardashevskiy
b1af23d836 KVM: PPC: iommu: Unify TCE checking
This reworks helpers for checking TCE update parameters in way they
can be used in KVM.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:21 +10:00
Alexey Kardashevskiy
da6f59e192 KVM: PPC: Use preregistered memory API to access TCE list
VFIO on sPAPR already implements guest memory pre-registration
when the entire guest RAM gets pinned. This can be used to translate
the physical address of a guest page containing the TCE list
from H_PUT_TCE_INDIRECT.

This makes use of the pre-registrered memory API to access TCE list
pages in order to avoid unnecessary locking on the KVM memory
reverse map as we know that all of guest memory is pinned and
we have a flat array mapping GPA to HPA which makes it simpler and
quicker to index into that array (even with looking up the
kernel page tables in vmalloc_to_phys) than it is to find the memslot,
lock the rmap entry, look up the user page tables, and unlock the rmap
entry. Note that the rmap pointer is initialized to NULL
where declared (not in this patch).

If a requested chunk of memory has not been preregistered, this will
fall back to non-preregistered case and lock rmap.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:16 +10:00
Alexey Kardashevskiy
503bfcbe18 KVM: PPC: Pass kvm* to kvmppc_find_table()
The guest view TCE tables are per KVM anyway (not per VCPU) so pass kvm*
there. This will be used in the following patches where we will be
attaching VFIO containers to LIOBNs via ioctl() to KVM (rather than
to VCPU).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:12 +10:00
Alexey Kardashevskiy
e91aa8e6ec KVM: PPC: Enable IOMMU_API for KVM_BOOK3S_64 permanently
It does not make much sense to have KVM in book3s-64 and
not to have IOMMU bits for PCI pass through support as it costs little
and allows VFIO to function on book3s KVM.

Having IOMMU_API always enabled makes it unnecessary to have a lot of
"#ifdef IOMMU_API" in arch/powerpc/kvm/book3s_64_vio*. With those
ifdef's we could have only user space emulated devices accelerated
(but not VFIO) which do not seem to be very useful.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:07 +10:00
Alexey Kardashevskiy
4898d3f49b KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number
This adds a capability number for in-kernel support for VFIO on
SPAPR platform.

The capability will tell the user space whether in-kernel handlers of
H_PUT_TCE can handle VFIO-targeted requests or not. If not, the user space
must not attempt allocating a TCE table in the host kernel via
the KVM_CREATE_SPAPR_TCE KVM ioctl because in that case TCE requests
will not be passed to the user space which is desired action in
the situation like that.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:39:01 +10:00
Paul Mackerras
644d2d6fef Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in the commits in the topic/ppc-kvm branch of the powerpc
tree to get the changes to arch/powerpc which subsequent patches will
rely on.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:38:33 +10:00
Alexey Kardashevskiy
3762d45aa7 KVM: PPC: Align the table size to system page size
At the moment the userspace can request a table smaller than a page size
and this value will be stored as kvmppc_spapr_tce_table::size.
However the actual allocated size will still be aligned to the system
page size as alloc_page() is used there.

This aligns the table size up to the system page size. It should not
change the existing behaviour but when in-kernel TCE acceleration patchset
reaches the upstream kernel, this will allow small TCE tables be
accelerated as well: PCI IODA iommu_table allocator already aligns
the size and, without this patch, an IOMMU group won't attach to LIOBN
due to the mismatching table size.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:38:20 +10:00
Alexey Kardashevskiy
96df226769 KVM: PPC: Book3S PR: Preserve storage control bits
PR KVM page fault handler performs eaddr to pte translation for a guest,
however kvmppc_mmu_book3s_64_xlate() does not preserve WIMG bits
(storage control) in the kvmppc_pte struct. If PR KVM is running as
a second level guest under HV KVM, and PR KVM tries inserting HPT entry,
this fails in HV KVM if it already has this mapping.

This preserves WIMG bits between kvmppc_mmu_book3s_64_xlate() and
kvmppc_mmu_map_page().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:38:14 +10:00
Alexey Kardashevskiy
bd9166ffe6 KVM: PPC: Book3S PR: Exit KVM on failed mapping
At the moment kvmppc_mmu_map_page() returns -1 if
mmu_hash_ops.hpte_insert() fails for any reason so the page fault handler
resumes the guest and it faults on the same address again.

This adds distinction to kvmppc_mmu_map_page() to return -EIO if
mmu_hash_ops.hpte_insert() failed for a reason other than full pteg.
At the moment only pSeries_lpar_hpte_insert() returns -2 if
plpar_pte_enter() failed with a code other than H_PTEG_FULL.
Other mmu_hash_ops.hpte_insert() instances can only fail with
-1 "full pteg".

With this change, if PR KVM fails to update HPT, it can signal
the userspace about this instead of returning to guest and having
the very same page fault over and over again.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:38:04 +10:00
Alexey Kardashevskiy
9eecec126e KVM: PPC: Book3S PR: Get rid of unused local variable
@is_mmio has never been used since introduction in
commit 2f4cf5e42d ("Add book3s.c") from 2009.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:38:00 +10:00
Markus Elfring
37655490db KVM: PPC: e500: Use kcalloc() in e500_mmu_host_init()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:37:55 +10:00
Markus Elfring
a1c52e1c7c KVM: PPC: Book3S HV: Use common error handling code in kvmppc_clr_passthru_irq()
Add a jump target so that a bit of exception handling can be better reused
at the end of this function.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:37:50 +10:00
Paul Mackerras
9b5ab00513 KVM: PPC: Add MMIO emulation for remaining floating-point instructions
For completeness, this adds emulation of the lfiwax and lfiwzx
instructions.  With this, all floating-point load and store instructions
as of Power ISA V2.07 are emulated.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:37:44 +10:00
Paul Mackerras
ceba57df43 KVM: PPC: Emulation for more integer loads and stores
This adds emulation for the following integer loads and stores,
thus enabling them to be used in a guest for accessing emulated
MMIO locations.

- lhaux
- lwaux
- lwzux
- ldu
- lwa
- stdux
- stwux
- stdu
- ldbrx
- stdbrx

Previously, most of these would cause an emulation failure exit to
userspace, though ldu and lwa got treated incorrectly as ld, and
stdu got treated incorrectly as std.

This also tidies up some of the formatting and updates the comment
listing instructions that still need to be implemented.

With this, all integer loads and stores that are defined in the Power
ISA v2.07 are emulated, except for those that are permitted to trap
when used on cache-inhibited or write-through mappings (and which do
in fact trap on POWER8), that is, lmw/stmw, lswi/stswi, lswx/stswx,
lq/stq, and l[bhwdq]arx/st[bhwdq]cx.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:37:38 +10:00
Alexey Kardashevskiy
91242fd1a3 KVM: PPC: Add MMIO emulation for stdx (store doubleword indexed)
This adds missing stdx emulation for emulated MMIO accesses by KVM
guests.  This allows the Mellanox mlx5_core driver from recent kernels
to work when MMIO emulation is enforced by userspace.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:37:33 +10:00
Bin Lu
6f63e81bda KVM: PPC: Book3S: Add MMIO emulation for FP and VSX instructions
This patch provides the MMIO load/store emulation for instructions
of 'double & vector unsigned char & vector signed char & vector
unsigned short & vector signed short & vector unsigned int & vector
signed int & vector double '.

The instructions that this adds emulation for are:

- ldx, ldux, lwax,
- lfs, lfsx, lfsu, lfsux, lfd, lfdx, lfdu, lfdux,
- stfs, stfsx, stfsu, stfsux, stfd, stfdx, stfdu, stfdux, stfiwx,
- lxsdx, lxsspx, lxsiwax, lxsiwzx, lxvd2x, lxvw4x, lxvdsx,
- stxsdx, stxsspx, stxsiwx, stxvd2x, stxvw4x

[paulus@ozlabs.org - some cleanups, fixes and rework, make it
 compile for Book E, fix build when PR KVM is built in]

Signed-off-by: Bin Lu <lblulb@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 11:36:41 +10:00
Paul Mackerras
307d927967 KVM: PPC: Provide functions for queueing up FP/VEC/VSX unavailable interrupts
This provides functions that can be used for generating interrupts
indicating that a given functional unit (floating point, vector, or
VSX) is unavailable.  These functions will be used in instruction
emulation code.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-04-20 10:39:50 +10:00
Radim Krčmář
3325187061 KVM: nVMX: fix AD condition when handling EPT violation
I have introduced this bug when applying and simplifying Paolo's patch
as we agreed on the list.  The original was "x &= ~y; if (z) x |= y;".

Here is the story of a bad workflow:

  A maintainer was already testing with the intended change, but it was
  applied only to a testing repo on a different machine.  When the time
  to push tested patches to kvm/next came, he realized that this change
  was missing and quickly added it to the maintenance repo, didn't test
  again (because the change is trivial, right), and pushed the world to
  fire.

Fixes: ae1e2d1082 ("kvm: nVMX: support EPT accessed/dirty bits")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-13 19:36:55 +02:00
Ladi Prosek
405a353a0e KVM: x86: Add MSR_AMD64_DC_CFG to the list of ignored MSRs
Hyper-V writes 0x800000000000 to MSR_AMD64_DC_CFG when running on AMD CPUs
as recommended in erratum 383, analogous to our svm_init_erratum_383.

By ignoring the MSR, this patch enables running Hyper-V in L1 on AMD.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 21:09:24 +02:00
Wanpeng Li
5a48a62271 x86/kvm: virt_xxx memory barriers instead of mandatory barriers
virt_xxx memory barriers are implemented trivially using the low-level
__smp_xxx macros, __smp_xxx is equal to a compiler barrier for strong
TSO memory model, however, mandatory barriers will unconditional add
memory barriers, this patch replaces the rmb() in kvm_steal_clock() by
virt_rmb().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:38 +02:00
Denis Plotnikov
bd8fab39cd KVM: x86: fix maintaining of kvm_clock stability on guest CPU hotplug
VCPU TSC synchronization is perfromed in kvm_write_tsc() when the TSC
value being set is within 1 second from the expected, as obtained by
extrapolating of the TSC in already synchronized VCPUs.

This is naturally achieved on all VCPUs at VM start and resume;
however on VCPU hotplug it is not: the newly added VCPU is created
with TSC == 0 while others are well ahead.

To compensate for that, consider host-initiated kvm_write_tsc() with
TSC == 0 a special case requiring synchronization regardless of the
current TSC on other VCPUs.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:15 +02:00
Denis Plotnikov
c5e8ec8e9b KVM: x86: remaster kvm_write_tsc code
Reuse existing code instead of using inline asm.
Make the code more concise and clear in the TSC
synchronization part.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:15 +02:00
David Hildenbrand
900ab14ca9 KVM: x86: use irqchip_kernel() to check for pic+ioapic
Although the current check is not wrong, this check explicitly includes
the pic.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:15 +02:00
David Hildenbrand
b5e7cf52e1 KVM: x86: simplify pic_ioport_read()
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:15 +02:00
David Hildenbrand
84a5c79e70 KVM: x86: set data directly in picdev_read()
Now it looks almost as picdev_write().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:15 +02:00
David Hildenbrand
9fecaa9e32 KVM: x86: drop picdev_in_range()
We already have the exact same checks a couple of lines below.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
dc24d1d2cb KVM: x86: make kvm_pic_reset() static
Not used outside of i8259.c, so let's make it static.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
e21d1758b0 KVM: x86: simplify pic_unlock()
We can easily compact this code and get rid of one local variable.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
8c6b7828c2 KVM: x86: cleanup return handling in setup_routing_entry()
Let's drop the goto and return the error value directly.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
43ae312ca1 KVM: x86: drop goto label in kvm_set_routing_entry()
No need for the goto label + local variable "r".

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
993225adf4 KVM: x86: rename kvm_vcpu_request_scan_ioapic()
Let's rename it into a proper arch specific callback.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
ca8ab3f895 KVM: x86: directly call kvm_make_scan_ioapic_request() in ioapic.c
We know there is an ioapic, so let's call it directly.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
d62f270b2d KVM: x86: remove all-vcpu request from kvm_ioapic_init()
kvm_ioapic_init() is guaranteed to be called without any created VCPUs,
so doing an all-vcpu request results in a NOP.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
445ee82d7a KVM: x86: KVM_IRQCHIP_PIC_MASTER only has 8 pins
Currently, one could set pin 8-15, implicitly referring to
KVM_IRQCHIP_PIC_SLAVE.

Get rid of the two local variables max_pin and delta on the way.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
49f520b99a KVM: x86: push usage of slots_lock down
Let's just move it to the place where it is actually needed.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
ba7454e17f KVM: x86: don't take kvm->irq_lock when creating IRQCHIP
I don't see any reason any more for this lock, seemed to be used to protect
removal of kvm->arch.vpic / kvm->arch.vioapic when already partially
inititalized, now access is properly protected using kvm->arch.irqchip_mode
and this shouldn't be necessary anymore.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
33392b4911 KVM: x86: convert kvm_(set|get)_ioapic() into void
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
David Hildenbrand
4c0b06d886 KVM: x86: remove duplicate checks for ioapic
When handling KVM_GET_IRQCHIP, we already check irqchip_kernel(), which
implies a fully inititalized ioapic.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00