Commit Graph

495 Commits

Author SHA1 Message Date
Linus Torvalds 09d1c6a80f Generic:
- Use memdup_array_user() to harden against overflow.
 
 - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.
 
 - Clean up Kconfigs that all KVM architectures were selecting
 
 - New functionality around "guest_memfd", a new userspace API that
   creates an anonymous file and returns a file descriptor that refers
   to it.  guest_memfd files are bound to their owning virtual machine,
   cannot be mapped, read, or written by userspace, and cannot be resized.
   guest_memfd files do however support PUNCH_HOLE, which can be used to
   switch a memory area between guest_memfd and regular anonymous memory.
 
 - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
   per-page attributes for a given page of guest memory; right now the
   only attribute is whether the guest expects to access memory via
   guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
   TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees
   confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM).
 
 x86:
 
 - Support for "software-protected VMs" that can use the new guest_memfd
   and page attributes infrastructure.  This is mostly useful for testing,
   since there is no pKVM-like infrastructure to provide a meaningfully
   reduced TCB.
 
 - Fix a relatively benign off-by-one error when splitting huge pages during
   CLEAR_DIRTY_LOG.
 
 - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf
   TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE.
 
 - Use more generic lockdep assertions in paths that don't actually care
   about whether the caller is a reader or a writer.
 
 - let Xen guests opt out of having PV clock reported as "based on a stable TSC",
   because some of them don't expect the "TSC stable" bit (added to the pvclock
   ABI by KVM, but never set by Xen) to be set.
 
 - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL.
 
 - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always
   flushes on nested transitions, i.e. always satisfies flush requests.  This
   allows running bleeding edge versions of VMware Workstation on top of KVM.
 
 - Sanity check that the CPU supports flush-by-ASID when enabling SEV support.
 
 - On AMD machines with vNMI, always rely on hardware instead of intercepting
   IRET in some cases to detect unmasking of NMIs
 
 - Support for virtualizing Linear Address Masking (LAM)
 
 - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state
   prior to refreshing the vPMU model.
 
 - Fix a double-overflow PMU bug by tracking emulated counter events using a
   dedicated field instead of snapshotting the "previous" counter.  If the
   hardware PMC count triggers overflow that is recognized in the same VM-Exit
   that KVM manually bumps an event count, KVM would pend PMIs for both the
   hardware-triggered overflow and for KVM-triggered overflow.
 
 - Turn off KVM_WERROR by default for all configs so that it's not
   inadvertantly enabled by non-KVM developers, which can be problematic for
   subsystems that require no regressions for W=1 builds.
 
 - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL
   "features".
 
 - Don't force a masterclock update when a vCPU synchronizes to the current TSC
   generation, as updating the masterclock can cause kvmclock's time to "jump"
   unexpectedly, e.g. when userspace hotplugs a pre-created vCPU.
 
 - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths,
   partly as a super minor optimization, but mostly to make KVM play nice with
   position independent executable builds.
 
 - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
   CONFIG_HYPERV as a minor optimization, and to self-document the code.
 
 - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
   at build time.
 
 ARM64:
 
 - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
   base granule sizes. Branch shared with the arm64 tree.
 
 - Large Fine-Grained Trap rework, bringing some sanity to the
   feature, although there is more to come. This comes with
   a prefix branch shared with the arm64 tree.
 
 - Some additional Nested Virtualization groundwork, mostly
   introducing the NV2 VNCR support and retargetting the NV
   support to that version of the architecture.
 
 - A small set of vgic fixes and associated cleanups.
 
 Loongarch:
 
 - Optimization for memslot hugepage checking
 
 - Cleanup and fix some HW/SW timer issues
 
 - Add LSX/LASX (128bit/256bit SIMD) support
 
 RISC-V:
 
 - KVM_GET_REG_LIST improvement for vector registers
 
 - Generate ISA extension reg_list using macros in get-reg-list selftest
 
 - Support for reporting steal time along with selftest
 
 s390:
 
 - Bugfixes
 
 Selftests:
 
 - Fix an annoying goof where the NX hugepage test prints out garbage
   instead of the magic token needed to run the test.
 
 - Fix build errors when a header is delete/moved due to a missing flag
   in the Makefile.
 
 - Detect if KVM bugged/killed a selftest's VM and print out a helpful
   message instead of complaining that a random ioctl() failed.
 
 - Annotate the guest printf/assert helpers with __printf(), and fix the
   various bugs that were lurking due to lack of said annotation.
 
 There are two non-KVM patches buried in the middle of guest_memfd support:
 
   fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
   mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
 
 The first is small and mostly suggested-by Christian Brauner; the second
 a bit less so but it was written by an mm person (Vlastimil Babka).
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWcMWkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO15gf/WLmmg3SET6Uzw9iEq2xo28831ZA+
 6kpILfIDGKozV5safDmMvcInlc/PTnqOFrsKyyN4kDZ+rIJiafJdg/loE0kPXBML
 wdR+2ix5kYI1FucCDaGTahskBDz8Lb/xTpwGg9BFLYFNmuUeHc74o6GoNvr1uliE
 4kLZL2K6w0cSMPybUD+HqGaET80ZqPwecv+s1JL+Ia0kYZJONJifoHnvOUJ7DpEi
 rgudVdgzt3EPjG0y1z6MjvDBXTCOLDjXajErlYuZD3Ej8N8s59Dh2TxOiDNTLdP4
 a4zjRvDmgyr6H6sz+upvwc7f4M4p+DBvf+TkWF54mbeObHUYliStqURIoA==
 =66Ws
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Generic:

   - Use memdup_array_user() to harden against overflow.

   - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all
     architectures.

   - Clean up Kconfigs that all KVM architectures were selecting

   - New functionality around "guest_memfd", a new userspace API that
     creates an anonymous file and returns a file descriptor that refers
     to it. guest_memfd files are bound to their owning virtual machine,
     cannot be mapped, read, or written by userspace, and cannot be
     resized. guest_memfd files do however support PUNCH_HOLE, which can
     be used to switch a memory area between guest_memfd and regular
     anonymous memory.

   - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
     per-page attributes for a given page of guest memory; right now the
     only attribute is whether the guest expects to access memory via
     guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
     TDX or ARM64 pKVM is checked by firmware or hypervisor that
     guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in
     the case of pKVM).

  x86:

   - Support for "software-protected VMs" that can use the new
     guest_memfd and page attributes infrastructure. This is mostly
     useful for testing, since there is no pKVM-like infrastructure to
     provide a meaningfully reduced TCB.

   - Fix a relatively benign off-by-one error when splitting huge pages
     during CLEAR_DIRTY_LOG.

   - Fix a bug where KVM could incorrectly test-and-clear dirty bits in
     non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with
     a non-huge SPTE.

   - Use more generic lockdep assertions in paths that don't actually
     care about whether the caller is a reader or a writer.

   - let Xen guests opt out of having PV clock reported as "based on a
     stable TSC", because some of them don't expect the "TSC stable" bit
     (added to the pvclock ABI by KVM, but never set by Xen) to be set.

   - Revert a bogus, made-up nested SVM consistency check for
     TLB_CONTROL.

   - Advertise flush-by-ASID support for nSVM unconditionally, as KVM
     always flushes on nested transitions, i.e. always satisfies flush
     requests. This allows running bleeding edge versions of VMware
     Workstation on top of KVM.

   - Sanity check that the CPU supports flush-by-ASID when enabling SEV
     support.

   - On AMD machines with vNMI, always rely on hardware instead of
     intercepting IRET in some cases to detect unmasking of NMIs

   - Support for virtualizing Linear Address Masking (LAM)

   - Fix a variety of vPMU bugs where KVM fail to stop/reset counters
     and other state prior to refreshing the vPMU model.

   - Fix a double-overflow PMU bug by tracking emulated counter events
     using a dedicated field instead of snapshotting the "previous"
     counter. If the hardware PMC count triggers overflow that is
     recognized in the same VM-Exit that KVM manually bumps an event
     count, KVM would pend PMIs for both the hardware-triggered overflow
     and for KVM-triggered overflow.

   - Turn off KVM_WERROR by default for all configs so that it's not
     inadvertantly enabled by non-KVM developers, which can be
     problematic for subsystems that require no regressions for W=1
     builds.

   - Advertise all of the host-supported CPUID bits that enumerate
     IA32_SPEC_CTRL "features".

   - Don't force a masterclock update when a vCPU synchronizes to the
     current TSC generation, as updating the masterclock can cause
     kvmclock's time to "jump" unexpectedly, e.g. when userspace
     hotplugs a pre-created vCPU.

   - Use RIP-relative address to read kvm_rebooting in the VM-Enter
     fault paths, partly as a super minor optimization, but mostly to
     make KVM play nice with position independent executable builds.

   - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
     CONFIG_HYPERV as a minor optimization, and to self-document the
     code.

   - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV
     "emulation" at build time.

  ARM64:

   - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base
     granule sizes. Branch shared with the arm64 tree.

   - Large Fine-Grained Trap rework, bringing some sanity to the
     feature, although there is more to come. This comes with a prefix
     branch shared with the arm64 tree.

   - Some additional Nested Virtualization groundwork, mostly
     introducing the NV2 VNCR support and retargetting the NV support to
     that version of the architecture.

   - A small set of vgic fixes and associated cleanups.

  Loongarch:

   - Optimization for memslot hugepage checking

   - Cleanup and fix some HW/SW timer issues

   - Add LSX/LASX (128bit/256bit SIMD) support

  RISC-V:

   - KVM_GET_REG_LIST improvement for vector registers

   - Generate ISA extension reg_list using macros in get-reg-list
     selftest

   - Support for reporting steal time along with selftest

  s390:

   - Bugfixes

  Selftests:

   - Fix an annoying goof where the NX hugepage test prints out garbage
     instead of the magic token needed to run the test.

   - Fix build errors when a header is delete/moved due to a missing
     flag in the Makefile.

   - Detect if KVM bugged/killed a selftest's VM and print out a helpful
     message instead of complaining that a random ioctl() failed.

   - Annotate the guest printf/assert helpers with __printf(), and fix
     the various bugs that were lurking due to lack of said annotation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits)
  x86/kvm: Do not try to disable kvmclock if it was not enabled
  KVM: x86: add missing "depends on KVM"
  KVM: fix direction of dependency on MMU notifiers
  KVM: introduce CONFIG_KVM_COMMON
  KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd
  KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
  RISC-V: KVM: selftests: Add get-reg-list test for STA registers
  RISC-V: KVM: selftests: Add steal_time test support
  RISC-V: KVM: selftests: Add guest_sbi_probe_extension
  RISC-V: KVM: selftests: Move sbi_ecall to processor.c
  RISC-V: KVM: Implement SBI STA extension
  RISC-V: KVM: Add support for SBI STA registers
  RISC-V: KVM: Add support for SBI extension registers
  RISC-V: KVM: Add SBI STA info to vcpu_arch
  RISC-V: KVM: Add steal-update vcpu request
  RISC-V: KVM: Add SBI STA extension skeleton
  RISC-V: paravirt: Implement steal-time support
  RISC-V: Add SBI STA extension definitions
  RISC-V: paravirt: Add skeleton for pv-time support
  RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr()
  ...
2024-01-17 13:03:37 -08:00
Paolo Bonzini caadf876bb KVM: introduce CONFIG_KVM_COMMON
CONFIG_HAVE_KVM is currently used by some architectures to either
enabled the KVM config proper, or to enable host-side code that is
not part of the KVM module.  However, CONFIG_KVM's "select" statement
in virt/kvm/Kconfig corresponds to a third meaning, namely to
enable common Kconfigs required by all architectures that support
KVM.

These three meanings can be replaced respectively by an
architecture-specific Kconfig, by IS_ENABLED(CONFIG_KVM), or by
a new Kconfig symbol that is in turn selected by the
architecture-specific "config KVM".

Start by introducing such a new Kconfig symbol, CONFIG_KVM_COMMON.
Unlike CONFIG_HAVE_KVM, it is selected by CONFIG_KVM, not by
architecture code, and it brings in all dependencies of common
KVM code.  In particular, INTERVAL_TREE was missing in loongarch
and riscv, so that is another thing that is fixed.

Fixes: 8132d887a7 ("KVM: remove CONFIG_HAVE_KVM_EVENTFD", 2023-12-08)
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d3825539d8@infradead.org/
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-08 08:09:38 -05:00
Bjorn Helgaas 2f9060b1db MIPS: Fix typos
Fix typos, most reported by "codespell arch/mips".  Only touches comments,
no code changes.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-mips@vger.kernel.org
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-08 10:39:12 +01:00
Paolo Bonzini 8132d887a7 KVM: remove CONFIG_HAVE_KVM_EVENTFD
virt/kvm/eventfd.c is compiled unconditionally, meaning that the ioeventfds
member of struct kvm is accessed unconditionally.  CONFIG_HAVE_KVM_EVENTFD
therefore must be defined for KVM common code to compile successfully,
remove it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-08 15:43:33 -05:00
Sean Christopherson f128cf8cfb KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER
Convert KVM_ARCH_WANT_MMU_NOTIFIER into a Kconfig and select it where
appropriate to effectively maintain existing behavior.  Using a proper
Kconfig will simplify building more functionality on top of KVM's
mmu_notifier infrastructure.

Add a forward declaration of kvm_gfn_range to kvm_types.h so that
including arch/powerpc/include/asm/kvm_ppc.h's with CONFIG_KVM=n doesn't
generate warnings due to kvm_gfn_range being undeclared.  PPC defines
hooks for PR vs. HV without guarding them via #ifdeffery, e.g.

  bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);

Alternatively, PPC could forward declare kvm_gfn_range, but there's no
good reason not to define it in common KVM.

Acked-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Message-Id: <20231027182217.3615211-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-11-13 05:29:09 -05:00
Paolo Bonzini e2145c99b5 KVM: MIPS: fix -Wunused-but-set-variable warning
The variable is completely unused, remove it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:25:40 -04:00
Paolo Bonzini 0d15bf966d Common KVM changes for 6.6:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
    action specific data without needing to constantly update the main handlers.
 
  - Drop unused function declarations
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmTudpYSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5xJUQAKnMVEV+7gRtfV5KCJFRTNAMxo4zSIt/
 K6QX+x/SwUriXj4nTAlvAtju1xz4nwTYBABKj3bXEaLpVjIUIbnEzEGuTKKK6XY9
 UyJKVgafwLuKLWPYN/5Zv5SCO7DmVC9W3lVMtchgt7gFcRxtZhmEn53boHhrhan0
 /2L5XD6N9rd81Zmd/rQkJNRND7XY3HkvDSnfmsRI/rfFUglCUHBDp4c2Wkmz+Dnb
 ux7N37si5OTbRVp18VzbLg1jalstDEm36ZQ7tLkvIbNbZV6pV93/ZmcTmsOruTeN
 gHVr6/RXmKKwgO3wtZ9DKL6oMcoh20yoT+vqhbaihVssLPGPusk7S2cCQ7529u8/
 Oda+w67MMdbE46N9CmB56fkpwNvn9nLCoQFhMhXBWhPJVNmorpiR6drHKqLy5zCq
 lrsWGqXU/DXA2PwdsztfIIMVeALawzExHu9ayppcKwb4S8TLJhLma7dT+EvwUxuV
 hoswaIT7Tq2ptZ34Fo5/vEz+90u2wi7LynHrNdTs7NLsW+WI/jab7KxKc+mf5WYh
 KuMzqmmPXmWRFupFeDa61YY5PCvMddDeac/jCYL/2cr73RA8bUItivwt5mEg5nOW
 9NEU+cLbl1s8g2KfxwhvodVkbhiNGf8MkVpE5skHHh9OX8HYzZUa/s6uUZO1O0eh
 XOk+fa9KWabt
 =n819
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.6' of https://github.com/kvm-x86/linux into HEAD

Common KVM changes for 6.6:

 - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
   action specific data without needing to constantly update the main handlers.

 - Drop unused function declarations
2023-08-31 13:19:55 -04:00
Sean Christopherson 3e1efe2b67 KVM: Wrap kvm_{gfn,hva}_range.pte in a per-action union
Wrap kvm_{gfn,hva}_range.pte in a union so that future notifier events can
pass event specific information up and down the stack without needing to
constantly expand and churn the APIs.  Lockless aging of SPTEs will pass
around a bitmap, and support for memory attributes will pass around the
new attributes for the range.

Add a "KVM_NO_ARG" placeholder to simplify handling events without an
argument (creating a dummy union variable is midly annoying).

Opportunstically drop explicit zero-initialization of the "pte" field, as
omitting the field (now a union) has the same effect.

Cc: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/all/CAOUHufagkd2Jk3_HrVoFFptRXM=hX2CV8f+M-dka-hJU4bP8kw@mail.gmail.com
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/r/20230729004144.1054885-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-08-17 11:26:53 -07:00
David Matlack 619b507244 KVM: Move kvm_arch_flush_remote_tlbs_memslot() to common code
Move kvm_arch_flush_remote_tlbs_memslot() to common code and drop
"arch_" from the name. kvm_arch_flush_remote_tlbs_memslot() is just a
range-based TLB invalidation where the range is defined by the memslot.
Now that kvm_flush_remote_tlbs_range() can be called from common code we
can just use that and drop a bunch of duplicate code from the arch
directories.

Note this adds a lockdep assertion for slots_lock being held when
calling kvm_flush_remote_tlbs_memslot(), which was previously only
asserted on x86. MIPS has calls to kvm_flush_remote_tlbs_memslot(),
but they all hold the slots_lock, so the lockdep assertion continues to
hold true.

Also drop the CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT ifdef gating
kvm_flush_remote_tlbs_memslot(), since it is no longer necessary.

Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-7-rananta@google.com
2023-08-17 09:40:35 +01:00
David Matlack a1342c8027 KVM: Rename kvm_arch_flush_remote_tlb() to kvm_arch_flush_remote_tlbs()
Rename kvm_arch_flush_remote_tlb() and the associated macro
__KVM_HAVE_ARCH_FLUSH_REMOTE_TLB to kvm_arch_flush_remote_tlbs() and
__KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS respectively.

Making the name plural matches kvm_flush_remote_tlbs() and makes it more
clear that this function can affect more than one remote TLB.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-2-rananta@google.com
2023-08-17 09:35:14 +01:00
Thomas Bogendoerfer 3a6dbb6917 MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled
Commit e4de205769 ("MIPS: KVM: Fix NULL pointer dereference") missed
converting one place accessing cop0 registers, which results in a build
error, if KVM_MIPS_DEBUG_COP0_COUNTERS is enabled.

Fixes: e4de205769 ("MIPS: KVM: Fix NULL pointer dereference")
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-07-07 14:43:01 +02:00
Huacai Chen e4de205769 MIPS: KVM: Fix NULL pointer dereference
After commit 45c7e8af4a ("MIPS: Remove KVM_TE support") we
get a NULL pointer dereference when creating a KVM guest:

[  146.243409] Starting KVM with MIPS VZ extensions
[  149.849151] CPU 3 Unable to handle kernel paging request at virtual address 0000000000000300, epc == ffffffffc06356ec, ra == ffffffffc063568c
[  149.849177] Oops[#1]:
[  149.849182] CPU: 3 PID: 2265 Comm: qemu-system-mip Not tainted 6.4.0-rc3+ #1671
[  149.849188] Hardware name: THTF CX TL630 Series/THTF-LS3A4000-7A1000-ML4A, BIOS KL4.1F.TF.D.166.201225.R 12/25/2020
[  149.849192] $ 0   : 0000000000000000 000000007400cce0 0000000000400004 ffffffff8119c740
[  149.849209] $ 4   : 000000007400cce1 000000007400cce1 0000000000000000 0000000000000000
[  149.849221] $ 8   : 000000240058bb36 ffffffff81421ac0 0000000000000000 0000000000400dc0
[  149.849233] $12   : 9800000102a07cc8 ffffffff80e40e38 0000000000000001 0000000000400dc0
[  149.849245] $16   : 0000000000000000 9800000106cd0000 9800000106cd0000 9800000100cce000
[  149.849257] $20   : ffffffffc0632b28 ffffffffc05b31b0 9800000100ccca00 0000000000400000
[  149.849269] $24   : 9800000106cd09ce ffffffff802f69d0
[  149.849281] $28   : 9800000102a04000 9800000102a07cd0 98000001106a8000 ffffffffc063568c
[  149.849293] Hi    : 00000335b2111e66
[  149.849295] Lo    : 6668d90061ae0ae9
[  149.849298] epc   : ffffffffc06356ec kvm_vz_vcpu_setup+0xc4/0x328 [kvm]
[  149.849324] ra    : ffffffffc063568c kvm_vz_vcpu_setup+0x64/0x328 [kvm]
[  149.849336] Status: 7400cce3 KX SX UX KERNEL EXL IE
[  149.849351] Cause : 1000000c (ExcCode 03)
[  149.849354] BadVA : 0000000000000300
[  149.849357] PrId  : 0014c004 (ICT Loongson-3)
[  149.849360] Modules linked in: kvm nfnetlink_queue nfnetlink_log nfnetlink fuse sha256_generic libsha256 cfg80211 rfkill binfmt_misc vfat fat snd_hda_codec_hdmi input_leds led_class snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_pcm snd_timer snd serio_raw xhci_pci radeon drm_suballoc_helper drm_display_helper xhci_hcd ip_tables x_tables
[  149.849432] Process qemu-system-mip (pid: 2265, threadinfo=00000000ae2982d2, task=0000000038e09ad4, tls=000000ffeba16030)
[  149.849439] Stack : 9800000000000003 9800000100ccca00 9800000100ccc000 ffffffffc062cef4
[  149.849453]         9800000102a07d18 c89b63a7ab338e00 0000000000000000 ffffffff811a0000
[  149.849465]         0000000000000000 9800000106cd0000 ffffffff80e59938 98000001106a8920
[  149.849476]         ffffffff80e57f30 ffffffffc062854c ffffffff811a0000 9800000102bf4240
[  149.849488]         ffffffffc05b0000 ffffffff80e3a798 000000ff78000000 000000ff78000010
[  149.849500]         0000000000000255 98000001021f7de0 98000001023f0078 ffffffff81434000
[  149.849511]         0000000000000000 0000000000000000 9800000102ae0000 980000025e92ae28
[  149.849523]         0000000000000000 c89b63a7ab338e00 0000000000000001 ffffffff8119dce0
[  149.849535]         000000ff78000010 ffffffff804f3d3c 9800000102a07eb0 0000000000000255
[  149.849546]         0000000000000000 ffffffff8049460c 000000ff78000010 0000000000000255
[  149.849558]         ...
[  149.849565] Call Trace:
[  149.849567] [<ffffffffc06356ec>] kvm_vz_vcpu_setup+0xc4/0x328 [kvm]
[  149.849586] [<ffffffffc062cef4>] kvm_arch_vcpu_create+0x184/0x228 [kvm]
[  149.849605] [<ffffffffc062854c>] kvm_vm_ioctl+0x64c/0xf28 [kvm]
[  149.849623] [<ffffffff805209c0>] sys_ioctl+0xc8/0x118
[  149.849631] [<ffffffff80219eb0>] syscall_common+0x34/0x58

The root cause is the deletion of kvm_mips_commpage_init() leaves vcpu
->arch.cop0 NULL. So fix it by making cop0 from a pointer to an embedded
object.

Fixes: 45c7e8af4a ("MIPS: Remove KVM_TE support")
Cc: stable@vger.kernel.org
Reported-by: Yu Zhao <yuzhao@google.com>
Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-07-03 16:06:19 +02:00
Linus Torvalds c8c655c34e s390:
* More phys_to_virt conversions
 
 * Improvement of AP management for VSIE (nested virtualization)
 
 ARM64:
 
 * Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 * New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 * Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 * A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 * The usual selftest fixes and improvements.
 
 KVM x86 changes for 6.4:
 
 * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
   and by giving the guest control of CR0.WP when EPT is enabled on VMX
   (VMX-only because SVM doesn't support per-bit controls)
 
 * Add CR0/CR4 helpers to query single bits, and clean up related code
   where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
   as a bool
 
 * Move AMD_PSFD to cpufeatures.h and purge KVM's definition
 
 * Avoid unnecessary writes+flushes when the guest is only adding new PTEs
 
 * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
   when emulating invalidations
 
 * Clean up the range-based flushing APIs
 
 * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
   A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
   changed SPTE" overhead associated with writing the entire entry
 
 * Track the number of "tail" entries in a pte_list_desc to avoid having
   to walk (potentially) all descriptors during insertion and deletion,
   which gets quite expensive if the guest is spamming fork()
 
 * Disallow virtualizing legacy LBRs if architectural LBRs are available,
   the two are mutually exclusive in hardware
 
 * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
   after KVM_RUN, similar to CPUID features
 
 * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
 
 * Apply PMU filters to emulated events and add test coverage to the
   pmu_event_filter selftest
 
 x86 AMD:
 
 * Add support for virtual NMIs
 
 * Fixes for edge cases related to virtual interrupts
 
 x86 Intel:
 
 * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
   not being reported due to userspace not opting in via prctl()
 
 * Fix a bug in emulation of ENCLS in compatibility mode
 
 * Allow emulation of NOP and PAUSE for L2
 
 * AMX selftests improvements
 
 * Misc cleanups
 
 MIPS:
 
 * Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
 
 Generic:
 
 * Drop unnecessary casts from "void *" throughout kvm_main.c
 
 * Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
 Documentation:
 
 * Fix goof introduced by the conversion to rST
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
 WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
 huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
 XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
 j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
 =S2Hq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "s390:

   - More phys_to_virt conversions

   - Improvement of AP management for VSIE (nested virtualization)

  ARM64:

   - Numerous fixes for the pathological lock inversion issue that
     plagued KVM/arm64 since... forever.

   - New framework allowing SMCCC-compliant hypercalls to be forwarded
     to userspace, hopefully paving the way for some more features being
     moved to VMMs rather than be implemented in the kernel.

   - Large rework of the timer code to allow a VM-wide offset to be
     applied to both virtual and physical counters as well as a
     per-timer, per-vcpu offset that complements the global one. This
     last part allows the NV timer code to be implemented on top.

   - A small set of fixes to make sure that we don't change anything
     affecting the EL1&0 translation regime just after having having
     taken an exception to EL2 until we have executed a DSB. This
     ensures that speculative walks started in EL1&0 have completed.

   - The usual selftest fixes and improvements.

  x86:

   - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
     enabled, and by giving the guest control of CR0.WP when EPT is
     enabled on VMX (VMX-only because SVM doesn't support per-bit
     controls)

   - Add CR0/CR4 helpers to query single bits, and clean up related code
     where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
     return as a bool

   - Move AMD_PSFD to cpufeatures.h and purge KVM's definition

   - Avoid unnecessary writes+flushes when the guest is only adding new
     PTEs

   - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
     optimizations when emulating invalidations

   - Clean up the range-based flushing APIs

   - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
     single A/D bit using a LOCK AND instead of XCHG, and skip all of
     the "handle changed SPTE" overhead associated with writing the
     entire entry

   - Track the number of "tail" entries in a pte_list_desc to avoid
     having to walk (potentially) all descriptors during insertion and
     deletion, which gets quite expensive if the guest is spamming
     fork()

   - Disallow virtualizing legacy LBRs if architectural LBRs are
     available, the two are mutually exclusive in hardware

   - Disallow writes to immutable feature MSRs (notably
     PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features

   - Overhaul the vmx_pmu_caps selftest to better validate
     PERF_CAPABILITIES

   - Apply PMU filters to emulated events and add test coverage to the
     pmu_event_filter selftest

   - AMD SVM:
       - Add support for virtual NMIs
       - Fixes for edge cases related to virtual interrupts

   - Intel AMX:
       - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
         XTILE_DATA is not being reported due to userspace not opting in
         via prctl()
       - Fix a bug in emulation of ENCLS in compatibility mode
       - Allow emulation of NOP and PAUSE for L2
       - AMX selftests improvements
       - Misc cleanups

  MIPS:

   - Constify MIPS's internal callbacks (a leftover from the hardware
     enabling rework that landed in 6.3)

  Generic:

   - Drop unnecessary casts from "void *" throughout kvm_main.c

   - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
     struct size by 8 bytes on 64-bit kernels by utilizing a padding
     hole

  Documentation:

   - Fix goof introduced by the conversion to rST"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
  KVM: s390: pci: fix virtual-physical confusion on module unload/load
  KVM: s390: vsie: clarifications on setting the APCB
  KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
  KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
  KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
  KVM: selftests: Test the PMU event "Instructions retired"
  KVM: selftests: Copy full counter values from guest in PMU event filter test
  KVM: selftests: Use error codes to signal errors in PMU event filter test
  KVM: selftests: Print detailed info in PMU event filter asserts
  KVM: selftests: Add helpers for PMC asserts in PMU event filter test
  KVM: selftests: Add a common helper for the PMU event filter guest code
  KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
  KVM: arm64: vhe: Drop extra isb() on guest exit
  KVM: arm64: vhe: Synchronise with page table walker on MMU update
  KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
  KVM: arm64: nvhe: Synchronise with page table walker on TLBI
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
  KVM: arm64: vgic: Don't acquire its_lock before config_lock
  KVM: selftests: Add test to verify KVM's supported XCR0
  ...
2023-05-01 12:06:20 -07:00
Paul E. McKenney 79cf833be6 kvm: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it.  Therefore, remove the "select SRCU"
Kconfig statements from the various KVM Kconfig files.

Acked-by: Sean Christopherson <seanjc@google.com> (x86)
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <kvm@vger.kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org> (arm64)
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Anup Patel <anup@brainfault.org> (riscv)
Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390)
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05 13:47:42 +00:00
Sean Christopherson 7ffc2e8951 KVM: MIPS: Make kvm_mips_callbacks const
Make kvm_mips_callbacks fully const as it's now hardcoded to point at
kvm_vz_callbacks, the only remaining the set of callbacks.

Link: https://lore.kernel.org/all/beb697c2-dfad-780e-4638-76b229f28731@linaro.org
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20230224192832.1286267-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-03-23 16:11:36 -07:00
Thomas Huth d8708b80fa KVM: Change return type of kvm_arch_vm_ioctl() to "int"
All kvm_arch_vm_ioctl() implementations now only deal with "int"
types as return values, so we can change the return type of these
functions to use "int" instead of "long".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20230208140105.655814-7-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-16 10:18:07 -04:00
Linus Torvalds 498a1cf902 Kbuild updates for v6.3
- Change V=1 option to print both short log and full command log.
 
  - Allow V=1 and V=2 to be combined as V=12.
 
  - Make W=1 detect wrong .gitignore files.
 
  - Tree-wide cleanups for unused command line arguments passed to Clang.
 
  - Stop using -Qunused-arguments with Clang.
 
  - Make scripts/setlocalversion handle only correct release tags instead
    of any arbitrary annotated tag.
 
  - Create Debian and RPM source packages without cleaning the source tree.
 
  - Various cleanups for packaging.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmP7iHoVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGL/cQAK9q5rsNL5a2LgTbm89ORA+UV+ST
 hrAoGo5DkJHUbVH53oPzyLynFBZPvUzLK8yjApjXkyAzy2hXYnj+vbTs0s+JVCFL
 owS4NB0YP+tpHGuy8bGpWI0GMZSMwmspUteqxk86zuH8uQVAhnCaeV1/Cr6Aqj1h
 2jk1FZid3/h7qEkEgu5U8soeyFnV6VhAT6Ie5yfZ2O2RdsSqPUh6vfKrgdyW4RWz
 gito0SOUwvjIDfSmTnIIacUibisPRv2OW29OvmDp1aXj5rMhe3UfOznVE3NR86yl
 ZbWDAIm6KYT8V1ASOoAUR80qent9IPKytThLK9BVEQCT6bsujCZMvhYhhEvO30TF
 Lzsdr+FrES//xag3+hgc63FEied2xxWGQG1cRtzAhfRL9tJ03+mY1omoW6SyKqW/
 Gc9PIcTgQbCIrkeL0HuAI1q3I1vkvHXInJKtGkoHh1J9aJ8v5gQpwGA+DDRUnA+A
 LQSeEbT2Hf3MoF4CqZRnConvfhlMuLI+j5v54YPrhokxXmv7u807kjfwMFTiZ/+m
 CJFlEMf9YRv3pi8g/AYyGAg5ZQigCwzOCRUC5kguFqzZdgnjiI907GEL804lm1Mg
 lpx/HtYPyxwWEd2XyU6/C9AEIl3gm7MBd6b1tD54Tb/VmE+AvjS/O9jFYXZqnAnM
 Llv4BfK/cQKwHb6o
 =HpFZ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Change V=1 option to print both short log and full command log

 - Allow V=1 and V=2 to be combined as V=12

 - Make W=1 detect wrong .gitignore files

 - Tree-wide cleanups for unused command line arguments passed to Clang

 - Stop using -Qunused-arguments with Clang

 - Make scripts/setlocalversion handle only correct release tags instead
   of any arbitrary annotated tag

 - Create Debian and RPM source packages without cleaning the source
   tree

 - Various cleanups for packaging

* tag 'kbuild-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (74 commits)
  kbuild: rpm-pkg: remove unneeded KERNELRELEASE from modules/headers_install
  docs: kbuild: remove description of KBUILD_LDS_MODULE
  .gitattributes: use 'dts' diff driver for *.dtso files
  kbuild: deb-pkg: improve the usability of source package
  kbuild: deb-pkg: fix binary-arch and clean in debian/rules
  kbuild: tar-pkg: use tar rules in scripts/Makefile.package
  kbuild: make perf-tar*-src-pkg work without relying on git
  kbuild: deb-pkg: switch over to source format 3.0 (quilt)
  kbuild: deb-pkg: make .orig tarball a hard link if possible
  kbuild: deb-pkg: hide KDEB_SOURCENAME from Makefile
  kbuild: srcrpm-pkg: create source package without cleaning
  kbuild: rpm-pkg: build binary packages from source rpm
  kbuild: deb-pkg: create source package without cleaning
  kbuild: add a tool to list files ignored by git
  Documentation/llvm: add Chimera Linux, Google and Meta datacenters
  setlocalversion: use only the correct release tag for git-describe
  setlocalversion: clean up the construction of version output
  .gitignore: ignore *.cover and *.mbx
  kbuild: remove --include-dir MAKEFLAG from top Makefile
  kbuild: fix trivial typo in comment
  ...
2023-02-26 11:53:25 -08:00
Nathan Chancellor 80a20d2f82 MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT
-Wa,-msoft-float is tested with as-option, which will be a problem for
clang with an upcoming change to move as-option to use KBUILD_AFLAGS
instead of KBUILD_CFLAGS due to a lack of '-mno-abicalls' in
KBUILD_AFLAGS at the point that this check occurs; $(cflags-y) is added
to KBUILD_AFLAGS towards the end of this file.

  clang: error: ignoring '-fno-PIE' option as it cannot be used with implicit usage of -mabicalls and the N64 ABI [-Werror,-Woption-ignored]

This could be resolved by switching to a cc-option check but
'$(cflags-y)' would need to be added so that '-mno-abicalls' is present
for the test. However, this check is no longer necessary, as
-msoft-float is supported by all supported assembler versions (GNU as
2.25+ and LLVM 11+). Eliminate GAS_HAS_SET_HARDFLOAT and all of its
uses, inlining SET_HARDFLOAT where necessary.

Link: https://lore.kernel.org/202209101939.bvk64Fok-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-01-26 12:41:16 +09:00
Sean Christopherson 441f7bfa99 KVM: Opt out of generic hardware enabling on s390 and PPC
Allow architectures to opt out of the generic hardware enabling logic,
and opt out on both s390 and PPC, which don't need to manually enable
virtualization as it's always on (when available).

In addition to letting s390 and PPC drop a bit of dead code, this will
hopefully also allow ARM to clean up its related code, e.g. ARM has its
own per-CPU flag to track which CPUs have enable hardware due to the
need to keep hardware enabled indefinitely when pKVM is enabled.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-50-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:48:37 -05:00
Sean Christopherson 81a1cf9f89 KVM: Drop kvm_arch_check_processor_compat() hook
Drop kvm_arch_check_processor_compat() and its support code now that all
architecture implementations are nops.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20221130230934.1014142-33-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:28 -05:00
Sean Christopherson a578a0a9e3 KVM: Drop kvm_arch_{init,exit}() hooks
Drop kvm_arch_init() and kvm_arch_exit() now that all implementations
are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-30-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:23 -05:00
Sean Christopherson eed9fcdf57 KVM: MIPS: Register die notifier prior to kvm_init()
Call kvm_init() only after _all_ setup is complete, as kvm_init() exposes
/dev/kvm to userspace and thus allows userspace to create VMs (and call
other ioctls).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221130230934.1014142-24-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:15 -05:00
Sean Christopherson 3fb8e89aa2 KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init()
Invoke kvm_mips_emulation_init() directly from kvm_mips_init() instead
of bouncing through kvm_init()=>kvm_arch_init().  Functionally, this is
a glorified nop as invoking kvm_arch_init() is the very first action
performed by kvm_init().

Emptying kvm_arch_init() will allow dropping the hook entirely once all
architecture implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221130230934.1014142-23-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:13 -05:00
Sean Christopherson 1cfc1c7bf5 KVM: MIPS: Hardcode callbacks to hardware virtualization extensions
Now that KVM no longer supports trap-and-emulate (see commit 45c7e8af4a
"MIPS: Remove KVM_TE support"), hardcode the MIPS callbacks to the
virtualization callbacks.

Harcoding the callbacks eliminates the technically-unnecessary check on
non-NULL kvm_mips_callbacks in kvm_arch_init().  MIPS has never supported
multiple in-tree modules, i.e. barring an out-of-tree module, where
copying and renaming kvm.ko counts as "out-of-tree", KVM could never
encounter a non-NULL set of callbacks during module init.

The callback check is also subtly broken, as it is not thread safe,
i.e. if there were multiple modules, loading both concurrently would
create a race between checking and setting kvm_mips_callbacks.

Given that out-of-tree shenanigans are not the kernel's responsibility,
hardcode the callbacks to simplify the code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221130230934.1014142-22-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:12 -05:00
Sean Christopherson 63a1bd8ad1 KVM: Drop arch hardware (un)setup hooks
Drop kvm_arch_hardware_setup() and kvm_arch_hardware_unsetup() now that
all implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:40:54 -05:00
Feiyang Chen 22c4e80466 MIPS&LoongArch&NIOS2: adjust prototypes of p?d_init()
Patch series "mm/sparse-vmemmap: Generalise helpers and enable for
LoongArch", v14.

This series is in order to enable sparse-vmemmap for LoongArch.  But
LoongArch cannot use generic helpers directly because MIPS&LoongArch need
to call pgd_init()/pud_init()/pmd_init() when populating page tables.  So
we adjust the prototypes of p?d_init() to make generic helpers can call
them, then enable sparse-vmemmap with generic helpers, and to be further,
generalise vmemmap_populate_hugepages() for ARM64, X86 and LoongArch.


This patch (of 4):

We are preparing to add sparse vmemmap support to LoongArch.  MIPS and
LoongArch need to call pgd_init()/pud_init()/pmd_init() when populating
page tables, so adjust their prototypes to make generic helpers can call
them.

NIOS2 declares pmd_init() but doesn't use, just remove it to avoid build
errors.

Link: https://lkml.kernel.org/r/20221027125253.3458989-1-chenhuacai@loongson.cn
Link: https://lkml.kernel.org/r/20221027125253.3458989-2-chenhuacai@loongson.cn
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Min Zhou <zhoumin@loongson.cn>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-11 18:12:11 -08:00
Paolo Bonzini c59fb12758 KVM: remove KVM_REQ_UNHALT
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return
value of kvm_vcpu_block/kvm_vcpu_halt.  Remove it.

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20220921003201.1441511-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:37:21 -04:00
Paolo Bonzini 599275c060 KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM_REQ_UNHALT is a weird request that simply reports the value of
kvm_arch_vcpu_runnable() on exit from kvm_vcpu_halt().  Only
MIPS and x86 are looking at it, the others just clear it.  Check
the state of the vCPU directly so that the request is handled
as a nop on all architectures.

No functional change intended, except for corner cases where an
event arrive immediately after a signal become pending or after
another similar host-side event.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220921003201.1441511-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:37:21 -04:00
Chao Peng 20ec3ebd70 KVM: Rename mmu_notifier_* to mmu_invalidate_*
The motivation of this renaming is to make these variables and related
helper functions less mmu_notifier bound and can also be used for non
mmu_notifier based page invalidation. mmu_invalidate_* was chosen to
better describe the purpose of 'invalidating' a page that those
variables are used for.

  - mmu_notifier_seq/range_start/range_end are renamed to
    mmu_invalidate_seq/range_start/range_end.

  - mmu_notifier_retry{_hva} helper functions are renamed to
    mmu_invalidate_retry{_hva}.

  - mmu_notifier_count is renamed to mmu_invalidate_in_progress to
    avoid confusion with mn_active_invalidate_count.

  - While here, also update kvm_inc/dec_notifier_count() to
    kvm_mmu_invalidate_begin/end() to match the change for
    mmu_notifier_count.

No functional change intended.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-19 04:05:41 -04:00
Mike Rapoport bb5af4f67a mips: rename PGD_ORDER to PGD_TABLE_ORDER
This is the order of the page table allocation, not the order of a PGD.

While at it remove unused defintion of _PGD_ORDER in asm-offsets.

Link: https://lkml.kernel.org/r/20220703141203.147893-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Xuerui Wang <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:42 -07:00
Julia Lawall 94bd83e45a MIPS: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-05-04 22:22:59 +02:00
Linus Torvalds 5fdb26213f ARM:
* A couple of fixes when handling an exception while a SError has been delivered
 
 * Workaround for Cortex-A510's single-step erratum
 
 RISCV:
 
 * Make CY, TM, and IR counters accessible in VU mode
 
 * Fix SBI implementation version
 
 x86:
 
 * Report deprecation of x87 features in supported CPUID
 
 * Preparation for fixing an interrupt delivery race on AMD hardware
 
 * Sparse fix
 
 All except POWER and s390:
 
 * Rework guest entry code to correctly mark noinstr areas and fix vtime'
   accounting (for x86, this was already mostly correct but not entirely;
   for ARM, MIPS and RISC-V it wasn't)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmH+E4AUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNujwf+ON/8pBWyMPdjiY5l5SyNLpRup8Su
 zkQoMEDICI7khYUz2bEAjOazFWHmHPsdogAlG82QeJCbFmCqyMb6iX0uWj53BdGP
 P2bOM/tXbulvKBBeiTritkUUNO+hBmmSF+57AOJSW+Enhc7HFwk54cuft6f30r+d
 JRaEOhPOP34hQ+wFQQhZZh72BaZBqgnrYwZDp1TiC0Wh8v7P4Nf+NFtEgba2nsxC
 xfz5PrEhvegtU8Ee9JAF2bAl7851WJq557P2cOpghtUMgh4t6GzCcUOCKIie67oQ
 0Vaf+OieAopdT+QNazSEWO9zxl7eTpWjk2hrwsDrgKHAL/YmuWJuSyEYIA==
 =C0LZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - A couple of fixes when handling an exception while a SError has
     been delivered

   - Workaround for Cortex-A510's single-step erratum

  RISC-V:

   - Make CY, TM, and IR counters accessible in VU mode

   - Fix SBI implementation version

  x86:

   - Report deprecation of x87 features in supported CPUID

   - Preparation for fixing an interrupt delivery race on AMD hardware

   - Sparse fix

  All except POWER and s390:

   - Rework guest entry code to correctly mark noinstr areas and fix
     vtime' accounting (for x86, this was already mostly correct but not
     entirely; for ARM, MIPS and RISC-V it wasn't)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
  KVM: x86: Report deprecated x87 features in supported CPUID
  KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
  KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
  KVM: arm64: Avoid consuming a stale esr value when SError occur
  RISC-V: KVM: Fix SBI implementation version
  RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
  kvm/riscv: rework guest entry logic
  kvm/arm64: rework guest entry logic
  kvm/x86: rework guest entry logic
  kvm/mips: rework guest entry logic
  kvm: add guest_state_{enter,exit}_irqoff()
  KVM: x86: Move delivery of non-APICv interrupt into vendor code
  kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
2022-02-05 09:55:59 -08:00
Mark Rutland 72e3244512 kvm/mips: rework guest entry logic
In kvm_arch_vcpu_ioctl_run() we use guest_enter_irqoff() and
guest_exit_irqoff() directly, with interrupts masked between these. As
we don't handle any timer ticks during this window, we will not account
time spent within the guest as guest time, which is unfortunate.

Additionally, we do not inform lockdep or tracing that interrupts will
be enabled during guest execution, which caan lead to misleading traces
and warnings that interrupts have been enabled for overly-long periods.

This patch fixes these issues by using the new timing and context
entry/exit helpers to ensure that interrupts are handled during guest
vtime but with RCU watching, with a sequence:

	guest_timing_enter_irqoff();

	guest_state_enter_irqoff();
	< run the vcpu >
	guest_state_exit_irqoff();

	< take any pending IRQs >

	guest_timing_exit_irqoff();

In addition, as guest exits during the "run the vcpu" step are handled
by kvm_mips_handle_exit(), a wrapper function is added which ensures
that such exists are handled with a sequence:

	guest_state_exit_irqoff();
	< handle the exit >
	guest_state_enter_irqoff();

This means that exits which stop the vCPU running will have a redundant
guest_state_enter_irqoff() .. guest_state_exit_irqoff() sequence, which
can be addressed with future rework.

Since instrumentation may make use of RCU, we must also ensure that no
instrumented code is run during the EQS. I've split out the critical
section into a new kvm_mips_enter_exit_vcpu() helper which is marked
noinstr.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Message-Id: <20220201132926.3301912-6-mark.rutland@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-01 08:51:53 -05:00
Randy Dunlap 2161ba0709 MIPS: KVM: fix vz.c kernel-doc notation
Fix all kernel-doc warnings in mips/kvm/vz.c as reported by the
kernel test robot:

  arch/mips/kvm/vz.c:471: warning: Function parameter or member 'out_compare' not described in '_kvm_vz_save_htimer'
  arch/mips/kvm/vz.c:471: warning: Function parameter or member 'out_cause' not described in '_kvm_vz_save_htimer'
  arch/mips/kvm/vz.c:471: warning: Excess function parameter 'compare' description in '_kvm_vz_save_htimer'
  arch/mips/kvm/vz.c:471: warning: Excess function parameter 'cause' description in '_kvm_vz_save_htimer'
  arch/mips/kvm/vz.c:1551: warning: No description found for return value of 'kvm_trap_vz_handle_cop_unusable'
  arch/mips/kvm/vz.c:1552: warning: expecting prototype for kvm_trap_vz_handle_cop_unusuable(). Prototype was for kvm_trap_vz_handle_cop_unusable() instead
  arch/mips/kvm/vz.c:1597: warning: No description found for return value of 'kvm_trap_vz_handle_msa_disabled'

Fixes: c992a4f6a9 ("KVM: MIPS: Implement VZ support")
Fixes: f4474d50c7 ("KVM: MIPS/VZ: Support hardware guest timer")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: kvm@vger.kernel.org
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-02-01 08:36:27 +01:00
David Woodhouse ae1b606e62 KVM: mips: Use Makefile.kvm for common files
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211121125451.9489-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-09 12:58:37 -05:00
Sean Christopherson 91b99ea706 KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()
Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting
the actual "block" sequences into a separate helper (to be named
kvm_vcpu_block()).  x86 will use the standalone block-only path to handle
non-halt cases where the vCPU is not runnable.

Rename block_ns to halt_ns to match the new function name.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:51 -05:00
Maciej S. Szmigiero ed922739c9 KVM: Use interval tree to do fast hva lookup in memslots
The current memslots implementation only allows quick binary search by gfn,
quick lookup by hva is not possible - the implementation has to do a linear
scan of the whole memslots array, even though the operation being performed
might apply just to a single memslot.

This significantly hurts performance of per-hva operations with higher
memslot counts.

Since hva ranges can overlap between memslots an interval tree is needed
for tracking them.

[sean: handle interval tree updates in kvm_replace_memslot()]
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <d66b9974becaa9839be9c4e1a5de97b177b4ac20.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:32 -05:00
Sean Christopherson 6a99c6e3f5 KVM: Stop passing kvm_userspace_memory_region to arch memslot hooks
Drop the @mem param from kvm_arch_{prepare,commit}_memory_region() now
that its use has been removed in all architectures.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <aa5ed3e62c27e881d0d8bc0acbc1572bc336dc19.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:25 -05:00
Sean Christopherson 3b1816177b KVM: MIPS: Drop pr_debug from memslot commit to avoid using "mem"
Remove an old (circa 2012) kvm_debug from kvm_arch_commit_memory_region()
to print basic information when committing a memslot change.  The primary
motivation for removing the kvm_debug is to avoid using @mem, the user
memory region, so that said param can be removed.

Alternatively, the debug message could be converted to use @new, but that
would require synthesizing select state to play nice with the DELETED
case, which will pass NULL for @new in the future.  And there's no
argument to be had for dumping generic information in an arch callback,
i.e. if there's a good reason for the debug message, then it belongs in
common KVM code where all architectures can benefit.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <446929a668f6e1346751571b71db41e94e976cdf.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:21 -05:00
Sean Christopherson 537a17b314 KVM: Let/force architectures to deal with arch specific memslot data
Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch
code to handle propagating arch specific data from "new" to "old" when
necessary.  This is a baby step towards dynamically allocating "new" from
the get go, and is a (very) minor performance boost on x86 due to not
unnecessarily copying arch data.

For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE
and FLAGS_ONLY.  This is functionally a nop as the previous behavior
would overwrite the pointer for CREATE, and eventually discard/ignore it
for DELETE.

For x86, copy the arch data only for FLAGS_ONLY changes.  Unlike PPC HV,
x86 needs to reallocate arch data in the MOVE case as the size of x86's
allocations depend on the alignment of the memslot's gfn.

Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to
match the "commit" prototype.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
[mss: add missing RISCV kvm_arch_prepare_memory_region() change]
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:20 -05:00
Marc Zyngier 75a9869f31 KVM: mips: Use kvm_get_vcpu() instead of open-coded access
As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-3-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:13 -05:00
Marc Zyngier 27592ae8db KVM: Move wiping of the kvm->vcpus array to common code
All architectures have similar loops iterating over the vcpus,
freeing one vcpu at a time, and eventually wiping the reference
off the vcpus array. They are also inconsistently taking
the kvm->lock mutex when wiping the references from the array.

Make this code common, which will simplify further changes.
The locking is dropped altogether, as this should only be called
when there is no further references on the kvm structure.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-2-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:13 -05:00
Vitaly Kuznetsov 57a2e13ebd KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18 02:12:14 -05:00
Linus Torvalds a3f3677380 - added printing of CPU options for /proc/cpuinfo
- removed support for Netlogic SOCs
 - fixes and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmGE6OEaHHRzYm9nZW5k
 QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHA3zQ//Ze+qwFP53nl21DTx+viH
 1wvoy+rMa95Udrzf8/Mp8qskZG9Jq2UKskYeWPq2cgMD2X7JSh8b0Xz3O9K8XxKD
 a21jGoY61np7HHl9XOhYD7A87DvrHS2VrYh6IiN9tqV5tK0x7uzYnH0pJDqhU4im
 0zDtnnvVDvUPU9EDEusD8xPaKHquQXEd0JyNXLo+hvuwc1RZ2VckjE8dsS3I0kn2
 2Sz4j7K/e1hB85vE1LXuuU8cL8IuMpDxkDnK6STA7mzNBPAdqnClnKuB70zjnwp0
 lCjZBWmSeVIqCEdoerB7wgn1g+8k/1TBbCoqtN2poQMiLBUKFbfv5gXg3Uei0SDa
 Q/c3od8Mr2jn39EbG5/cf0FOyuKtLDFvGXASaSuW7Xhvn3ue4I6MrKh2bb5nimXP
 3ShvsvYfoc2cGSz7Pt7tOCP6WHFlIi+9wjZEr8XZlGhJE6BgE4CbJHA3/Q4VAl+Q
 4whAcNldSX+sVS/dSHPo0s/9eLCPm1EWulD8mBpvj27cmeUVhqXCIY34Df7aIDBi
 F1+rFxt6/uDTDEJp3PkwkmVJuyzivN5CcXl4xks/MaJ/8BgYZVRbafyGlJxm+MMf
 ufAXvfiwZhyVg4cHajtMiFO2m1itmoyg3CvIffiVZMbegqy4Dsu+ZIf0csPmGheK
 88AfVKBfrRwrJG1yK2iX/Dw=
 =mvn0
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

 - added printing of CPU options for /proc/cpuinfo

 - removed support for Netlogic SOCs

 - fixes and cleanup

* tag 'mips_5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Cobalt: Explain GT64111 early PCI fixup
  mips: fix HUGETLB function without THP enabled
  mips: cm: Convert to bitfield API to fix out-of-bounds access
  MIPS: Remove NETLOGIC support
  MIPS: kernel: proc: add CPU option reporting
  MIPS: kernel: proc: use seq_puts instead of seq_printf
  MIPS: kernel: proc: fix trivial style errors
  MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL
  MIPS: octeon: Remove unused functions
  MIPS: Loongson64: Add of_node_put() before break
  bcm47xx: Replace printk(KERN_ALERT ... pci_devname(dev)) with pci_alert()
  bcm47xx: Get rid of redundant 'else'
  MIPS: sni: Fix the build
  MIPS: Avoid macro redefinitions
  MIPS: loongson64: Fix no screen display during boot-up
  MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
2021-11-05 08:13:19 -07:00
Thomas Bogendoerfer 95b8a5e011 MIPS: Remove NETLOGIC support
No (active) developer owns this hardware, so let's remove Linux support.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-10-24 17:24:47 +02:00
Juergen Gross a1c42ddedf kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS
KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the
number of allowed vcpu-ids. This has already led to confusion, so
rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more
clear

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210913135745.13944-3-jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:27:05 -04:00
Linus Torvalds 192ad3c27a ARM:
- Page ownership tracking between host EL1 and EL2
 
 - Rely on userspace page tables to create large stage-2 mappings
 
 - Fix incompatibility between pKVM and kmemleak
 
 - Fix the PMU reset state, and improve the performance of the virtual PMU
 
 - Move over to the generic KVM entry code
 
 - Address PSCI reset issues w.r.t. save/restore
 
 - Preliminary rework for the upcoming pKVM fixed feature
 
 - A bunch of MM cleanups
 
 - a vGIC fix for timer spurious interrupts
 
 - Various cleanups
 
 s390:
 
 - enable interpretation of specification exceptions
 
 - fix a vcpu_idx vs vcpu_id mixup
 
 x86:
 
 - fast (lockless) page fault support for the new MMU
 
 - new MMU now the default
 
 - increased maximum allowed VCPU count
 
 - allow inhibit IRQs on KVM_RUN while debugging guests
 
 - let Hyper-V-enabled guests run with virtualized LAPIC as long as they
   do not enable the Hyper-V "AutoEOI" feature
 
 - fixes and optimizations for the toggling of AMD AVIC (virtualized LAPIC)
 
 - tuning for the case when two-dimensional paging (EPT/NPT) is disabled
 
 - bugfixes and cleanups, especially with respect to 1) vCPU reset and
   2) choosing a paging mode based on CR0/CR4/EFER
 
 - support for 5-level page table on AMD processors
 
 Generic:
 
 - MMU notifier invalidation callbacks do not take mmu_lock unless necessary
 
 - improved caching of LRU kvm_memory_slot
 
 - support for histogram statistics
 
 - add statistics for halt polling and remote TLB flush requests
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmE2CIAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMyqwf+Ky2WoThuQ9Ra0r/m8pUTAx5+gsAf
 MmG24rNLE+26X0xuBT9Q5+etYYRLrRTWJvo5cgHooz7muAYW6scR+ho5xzvLTAxi
 DAuoijkXsSdGoFCp0OMUHiwG3cgY5N7feTEwLPAb2i6xr/l6SZyCP4zcwiiQbJ2s
 UUD0i3rEoNQ02/hOEveud/ENxzUli9cmmgHKXR3kNgsJClSf1fcuLnhg+7EGMhK9
 +c2V+hde5y0gmEairQWm22MLMRolNZ5NL4kjykiNh2M5q9YvbHe5+f/JmENlNZMT
 bsUQT6Ry1ukuJ0V59rZvUw71KknPFzZ3d6HgW4pwytMq6EJKiISHzRbVnQ==
 =FCAB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - Page ownership tracking between host EL1 and EL2
   - Rely on userspace page tables to create large stage-2 mappings
   - Fix incompatibility between pKVM and kmemleak
   - Fix the PMU reset state, and improve the performance of the virtual
     PMU
   - Move over to the generic KVM entry code
   - Address PSCI reset issues w.r.t. save/restore
   - Preliminary rework for the upcoming pKVM fixed feature
   - A bunch of MM cleanups
   - a vGIC fix for timer spurious interrupts
   - Various cleanups

  s390:
   - enable interpretation of specification exceptions
   - fix a vcpu_idx vs vcpu_id mixup

  x86:
   - fast (lockless) page fault support for the new MMU
   - new MMU now the default
   - increased maximum allowed VCPU count
   - allow inhibit IRQs on KVM_RUN while debugging guests
   - let Hyper-V-enabled guests run with virtualized LAPIC as long as
     they do not enable the Hyper-V "AutoEOI" feature
   - fixes and optimizations for the toggling of AMD AVIC (virtualized
     LAPIC)
   - tuning for the case when two-dimensional paging (EPT/NPT) is
     disabled
   - bugfixes and cleanups, especially with respect to vCPU reset and
     choosing a paging mode based on CR0/CR4/EFER
   - support for 5-level page table on AMD processors

  Generic:
   - MMU notifier invalidation callbacks do not take mmu_lock unless
     necessary
   - improved caching of LRU kvm_memory_slot
   - support for histogram statistics
   - add statistics for halt polling and remote TLB flush requests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
  KVM: Drop unused kvm_dirty_gfn_invalid()
  KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
  KVM: MMU: mark role_regs and role accessors as maybe unused
  KVM: MIPS: Remove a "set but not used" variable
  x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
  KVM: stats: Add VM stat for remote tlb flush requests
  KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
  KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
  KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
  Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
  KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
  kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
  kvm: x86: Increase MAX_VCPUS to 1024
  kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
  KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
  KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
  KVM: s390: index kvm->arch.idle_mask by vcpu_idx
  KVM: s390: Enable specification exception interpretation
  KVM: arm64: Trim guest debug exception handling
  KVM: SVM: Add 5-level page table support for SVM
  ...
2021-09-07 13:40:51 -07:00
Huacai Chen a3cf527e70 KVM: MIPS: Remove a "set but not used" variable
This fix a build warning:

   arch/mips/kvm/vz.c: In function '_kvm_vz_restore_htimer':
>> arch/mips/kvm/vz.c:392:10: warning: variable 'freeze_time' set but not used [-Wunused-but-set-variable]
     392 |  ktime_t freeze_time;
         |          ^~~~~~~~~~~

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Message-Id: <20210406024911.2008046-1-chenhuacai@loongson.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-06 06:52:10 -04:00
Huilong Deng 126b393686 MIPS: Return true/false (not 1/0) from bool functions
./arch/mips/kernel/uprobes.c:261:8-9: WARNING: return of 0/1 in function
'arch_uprobe_skip_sstep' with return type bool
./arch/mips/kernel/uprobes.c:78:10-11: WARNING: return of 0/1 in
function 'is_trap_insn' with return type bool
./arch/mips/kvm/mmu.c:489:9-10: WARNING: return of 0/1 in function
'kvm_test_age_gfn' with return type bool
./arch/mips/kvm/mmu.c:445:8-9: WARNING: return of 0/1 in function
'kvm_unmap_gfn_range' with return type bool

Signed-off-by: Huilong Deng <denghuilong@cdjrlc.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-08-21 10:37:30 +02:00
Jing Zhang f95937ccf5 KVM: stats: Support linear and logarithmic histogram statistics
Add new types of KVM stats, linear and logarithmic histogram.
Histogram are very useful for observing the value distribution
of time or size related stats.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210802165633.1866976-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-20 16:06:32 -04:00