Commit Graph

527 Commits

Author SHA1 Message Date
Mingwei Zhang 78b5605d44 KVM: Documentation: Add the missing description for tdp_mmu_page into kvm_mmu_page
Add the description for tdp_mmu_page into kvm_mmu_page description.
tdp_mmu_page is a field to differentiate shadow pages from TDP MMU and
non-TDP MMU.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-7-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:32:18 -07:00
Mingwei Zhang c3204c406b KVM: Documentation: Add the missing description for mmu_valid_gen into kvm_mmu_page
Add the description for mmu_valid_gen into kvm_mmu_page description.
mmu_valid_gen is used in shadow MMU for fast zapping. Update the doc to
reflect that.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-6-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:32 -07:00
Mingwei Zhang 6a713928ae KVM: Documentation: Add the missing description for tdp_mmu_root_count into kvm_mmu_page
Add the description of tdp_mmu_root_count into kvm_mmu_page description and
combine it with the description of root_count. tdp_mmu_root_count is an
atomic counter used only in TDP MMU. Update the doc.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-5-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:32 -07:00
Mingwei Zhang b40a2455e9 KVM: Documentation: Add the missing description for ptep in kvm_mmu_page
Add the missing description for ptep in kvm_mmu_page description. ptep is
used when TDP MMU is enabled and it shares the storage with parent_ptes.
Update the doc to help readers to get up-to-date info.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-4-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:32 -07:00
Mingwei Zhang fdaca560b6 KVM: Documentation: Update the field name gfns and its description in kvm_mmu_page
Update the field 'gfns' in kvm_mmu_page to 'shadowed_translation' to be
consistent with the code. Also update the corresponding 'gfns' in the
comments. The more detailed description of 'shadowed_translation' is
already inlined in the data structure definition, so no need to duplicate
the text but simply just update the names.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-3-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:31 -07:00
Mingwei Zhang b207cfbc8c KVM: Documentation: Add the missing description for guest_mode in kvm_mmu_page_role
Add the missing description for guest_mode in kvm_mmu_page_role
description.  guest_mode tells KVM whether a shadow page is used for the L1
or an L2. Update the missing field in documentation.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20230912184553.1887764-2-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:31 -07:00
Michal Luczaj 57f33f1a87 KVM: Correct kvm_vcpu_event(s) typo in KVM API documentation
Set KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS parameter type to
`struct kvm_vcpu_events`. Events, plural.

Opportunistically fix few other typos.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230814222358.707877-4-mhal@rbox.co
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-09-27 13:29:22 -07:00
Linus Torvalds 0c02183427 ARM:
* Clean up vCPU targets, always returning generic v8 as the preferred target
 
 * Trap forwarding infrastructure for nested virtualization (used for traps
   that are taken from an L2 guest and are needed by the L1 hypervisor)
 
 * FEAT_TLBIRANGE support to only invalidate specific ranges of addresses
   when collapsing a table PTE to a block PTE.  This avoids that the guest
   refills the TLBs again for addresses that aren't covered by the table PTE.
 
 * Fix vPMU issues related to handling of PMUver.
 
 * Don't unnecessary align non-stack allocations in the EL2 VA space
 
 * Drop HCR_VIRT_EXCP_MASK, which was never used...
 
 * Don't use smp_processor_id() in kvm_arch_vcpu_load(),
   but the cpu parameter instead
 
 * Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
 
 * Remove prototypes without implementations
 
 RISC-V:
 
 * Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
 
 * Added ONE_REG interface for SATP mode
 
 * Added ONE_REG interface to enable/disable multiple ISA extensions
 
 * Improved error codes returned by ONE_REG interfaces
 
 * Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
 
 * Added get-reg-list selftest for KVM RISC-V
 
 s390:
 
 * PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
   Allows a PV guest to use crypto cards. Card access is governed by
   the firmware and once a crypto queue is "bound" to a PV VM every
   other entity (PV or not) looses access until it is not bound
   anymore. Enablement is done via flags when creating the PV VM.
 
 * Guest debug fixes (Ilya)
 
 x86:
 
 * Clean up KVM's handling of Intel architectural events
 
 * Intel bugfixes
 
 * Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug
   registers and generate/handle #DBs
 
 * Clean up LBR virtualization code
 
 * Fix a bug where KVM fails to set the target pCPU during an IRTE update
 
 * Fix fatal bugs in SEV-ES intrahost migration
 
 * Fix a bug where the recent (architecturally correct) change to reinject
   #BP and skip INT3 broke SEV guests (can't decode INT3 to skip it)
 
 * Retry APIC map recalculation if a vCPU is added/enabled
 
 * Overhaul emergency reboot code to bring SVM up to par with VMX, tie the
   "emergency disabling" behavior to KVM actually being loaded, and move all of
   the logic within KVM
 
 * Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC
   ratio MSR cannot diverge from the default when TSC scaling is disabled
   up related code
 
 * Add a framework to allow "caching" feature flags so that KVM can check if
   the guest can use a feature without needing to search guest CPUID
 
 * Rip out the ancient MMU_DEBUG crud and replace the useful bits with
   CONFIG_KVM_PROVE_MMU
 
 * Fix KVM's handling of !visible guest roots to avoid premature triple fault
   injection
 
 * Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface
   that is needed by external users (currently only KVMGT), and fix a variety
   of issues in the process
 
 This last item had a silly one-character bug in the topic branch that
 was sent to me.  Because it caused pretty bad selftest failures in
 some configurations, I decided to squash in the fix.  So, while the
 exact commit ids haven't been in linux-next, the code has (from the
 kvm-x86 tree).
 
 Generic:
 
 * Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
   action specific data without needing to constantly update the main handlers.
 
 * Drop unused function declarations
 
 Selftests:
 
 * Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
 
 * Add support for printf() in guest code and covert all guest asserts to use
   printf-based reporting
 
 * Clean up the PMU event filter test and add new testcases
 
 * Include x86 selftests in the KVM x86 MAINTAINERS entry
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1
 CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj
 5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM
 v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n
 hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu
 RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g==
 =W+4p
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Clean up vCPU targets, always returning generic v8 as the preferred
     target

   - Trap forwarding infrastructure for nested virtualization (used for
     traps that are taken from an L2 guest and are needed by the L1
     hypervisor)

   - FEAT_TLBIRANGE support to only invalidate specific ranges of
     addresses when collapsing a table PTE to a block PTE. This avoids
     that the guest refills the TLBs again for addresses that aren't
     covered by the table PTE.

   - Fix vPMU issues related to handling of PMUver.

   - Don't unnecessary align non-stack allocations in the EL2 VA space

   - Drop HCR_VIRT_EXCP_MASK, which was never used...

   - Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
     parameter instead

   - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()

   - Remove prototypes without implementations

  RISC-V:

   - Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest

   - Added ONE_REG interface for SATP mode

   - Added ONE_REG interface to enable/disable multiple ISA extensions

   - Improved error codes returned by ONE_REG interfaces

   - Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V

   - Added get-reg-list selftest for KVM RISC-V

  s390:

   - PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)

     Allows a PV guest to use crypto cards. Card access is governed by
     the firmware and once a crypto queue is "bound" to a PV VM every
     other entity (PV or not) looses access until it is not bound
     anymore. Enablement is done via flags when creating the PV VM.

   - Guest debug fixes (Ilya)

  x86:

   - Clean up KVM's handling of Intel architectural events

   - Intel bugfixes

   - Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
     debug registers and generate/handle #DBs

   - Clean up LBR virtualization code

   - Fix a bug where KVM fails to set the target pCPU during an IRTE
     update

   - Fix fatal bugs in SEV-ES intrahost migration

   - Fix a bug where the recent (architecturally correct) change to
     reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
     skip it)

   - Retry APIC map recalculation if a vCPU is added/enabled

   - Overhaul emergency reboot code to bring SVM up to par with VMX, tie
     the "emergency disabling" behavior to KVM actually being loaded,
     and move all of the logic within KVM

   - Fix user triggerable WARNs in SVM where KVM incorrectly assumes the
     TSC ratio MSR cannot diverge from the default when TSC scaling is
     disabled up related code

   - Add a framework to allow "caching" feature flags so that KVM can
     check if the guest can use a feature without needing to search
     guest CPUID

   - Rip out the ancient MMU_DEBUG crud and replace the useful bits with
     CONFIG_KVM_PROVE_MMU

   - Fix KVM's handling of !visible guest roots to avoid premature
     triple fault injection

   - Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
     API surface that is needed by external users (currently only
     KVMGT), and fix a variety of issues in the process

  Generic:

   - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
     events to pass action specific data without needing to constantly
     update the main handlers.

   - Drop unused function declarations

  Selftests:

   - Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs

   - Add support for printf() in guest code and covert all guest asserts
     to use printf-based reporting

   - Clean up the PMU event filter test and add new testcases

   - Include x86 selftests in the KVM x86 MAINTAINERS entry"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
  KVM: x86/mmu: Include mmu.h in spte.h
  KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
  KVM: x86/mmu: Disallow guest from using !visible slots for page tables
  KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
  KVM: x86/mmu: Harden new PGD against roots without shadow pages
  KVM: x86/mmu: Add helper to convert root hpa to shadow page
  drm/i915/gvt: Drop final dependencies on KVM internal details
  KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
  KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
  KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
  KVM: x86/mmu: Assert that correct locks are held for page write-tracking
  KVM: x86/mmu: Rename page-track APIs to reflect the new reality
  KVM: x86/mmu: Drop infrastructure for multiple page-track modes
  KVM: x86/mmu: Use page-track notifiers iff there are external users
  KVM: x86/mmu: Move KVM-only page-track declarations to internal header
  KVM: x86: Remove the unused page-track hook track_flush_slot()
  drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
  KVM: x86: Add a new page-track hook to handle memslot deletion
  drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
  KVM: x86: Reject memslot MOVE operations if KVMGT is attached
  ...
2023-09-07 13:52:20 -07:00
Linus Torvalds ec0e2dc810 VFIO updates for v6.6-rc1
- VFIO direct character device (cdev) interface support.  This extracts
    the vfio device fd from the container and group model, and is intended
    to be the native uAPI for use with IOMMUFD. (Yi Liu)
 
  - Enhancements to the PCI hot reset interface in support of cdev usage.
    (Yi Liu)
 
  - Fix a potential race between registering and unregistering vfio files
    in the kvm-vfio interface and extend use of a lock to avoid extra
    drop and acquires. (Dmitry Torokhov)
 
  - A new vfio-pci variant driver for the AMD/Pensando Distributed Services
    Card (PDS) Ethernet device, supporting live migration. (Brett Creeley)
 
  - Cleanups to remove redundant owner setup in cdx and fsl bus drivers,
    and simplify driver init/exit in fsl code. (Li Zetao)
 
  - Fix uninitialized hole in data structure and pad capability structures
    for alignment. (Stefan Hajnoczi)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmTvnDUbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsimEEP/AzG+VRcu5LfYbLGLe0z
 zB8ts6G7S78wXlmfN/LYi3v92XWvMMcm+vYF8oNAMfr1YL5sibWN6UtQfY1KCr7h
 nWKdQdqjajJ5yDDZnOFdhqHJGNfmZw6+fey8Z0j8zRI2oymK4DncWWX3g/7L1SNr
 9tIexGJef+mOdAmC94yOut3YviAaZ+f95T/xrdXHzzoNr50DD0+PD6AJdKJfKggP
 vhiC/DAYH3Fofaa6tRasgWuKCYWdjZLR/kxgNpeEmW6kZnbq/dnzZ+kgn4HH1f9G
 8p7UKVARR6FfG5aLheWu6Y9PDaKnfnqu8y/hobuE/ivXcmqqK+a6xSxrjgbVs8WJ
 94SYnTBRoTlDJaKWa7GxqdgzJnV+s5ZyAgPhjzdi6mLTPWGzkuLhFWGtYL+LZAQ6
 pNeZSM6CFBk+bva/xT0nNPCXxPh+/j/Y0G18FREj8aPFc03HrJQqz0RLydvTnoDz
 nX/by5KdzMSVSVLPr4uDMtAsgxsGqWiFcp7QMw1HhhlLWxqmYbA+mLZaqyMZUUOx
 6b/P8WXT9P2I+qPVKWQ5CWyqpsEqm6P+72yg6LOM9kINvgwDhOa7cagMXIuMWYMH
 Rf97FL+K8p1eIy6AnvRHgFBMM5185uG+0YcJyVqtucDr/k8T/Om6ujAI6JbWtNe6
 cLgaVAqKOYqCR4HC9bfVGSbd
 =eKSR
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - VFIO direct character device (cdev) interface support. This extracts
   the vfio device fd from the container and group model, and is
   intended to be the native uAPI for use with IOMMUFD (Yi Liu)

 - Enhancements to the PCI hot reset interface in support of cdev usage
   (Yi Liu)

 - Fix a potential race between registering and unregistering vfio files
   in the kvm-vfio interface and extend use of a lock to avoid extra
   drop and acquires (Dmitry Torokhov)

 - A new vfio-pci variant driver for the AMD/Pensando Distributed
   Services Card (PDS) Ethernet device, supporting live migration (Brett
   Creeley)

 - Cleanups to remove redundant owner setup in cdx and fsl bus drivers,
   and simplify driver init/exit in fsl code (Li Zetao)

 - Fix uninitialized hole in data structure and pad capability
   structures for alignment (Stefan Hajnoczi)

* tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfio: (53 commits)
  vfio/pds: Send type for SUSPEND_STATUS command
  vfio/pds: fix return value in pds_vfio_get_lm_file()
  pds_core: Fix function header descriptions
  vfio: align capability structures
  vfio/type1: fix cap_migration information leak
  vfio/fsl-mc: Use module_fsl_mc_driver macro to simplify the code
  vfio/cdx: Remove redundant initialization owner in vfio_cdx_driver
  vfio/pds: Add Kconfig and documentation
  vfio/pds: Add support for firmware recovery
  vfio/pds: Add support for dirty page tracking
  vfio/pds: Add VFIO live migration support
  vfio/pds: register with the pds_core PF
  pds_core: Require callers of register/unregister to pass PF drvdata
  vfio/pds: Initial support for pds VFIO driver
  vfio: Commonize combine_ranges for use in other VFIO drivers
  kvm/vfio: avoid bouncing the mutex when adding and deleting groups
  kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add()
  docs: vfio: Add vfio device cdev description
  vfio: Compile vfio_group infrastructure optionally
  vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev()
  ...
2023-08-30 20:36:01 -07:00
Bjorn Helgaas d56b699d76 Documentation: Fix typos
Fix typos in Documentation.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230814212822.193684-4-helgaas@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-08-18 11:29:03 -06:00
Haibo Xu 031f9efafc KVM: riscv: Add KVM_GET_REG_LIST API support
KVM_GET_REG_LIST API will return all registers that are available to
KVM_GET/SET_ONE_REG APIs. It's very useful to identify some platform
regression issue during VM migration.

Since this API was already supported on arm64, it is straightforward
to enable it on riscv with similar code structure.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-08-09 12:15:25 +05:30
Daniel Henrique Barboza e47f3c2843 docs: kvm: riscv: document EBUSY in KVM_SET_ONE_REG
The EBUSY errno is being used for KVM_SET_ONE_REG as a way to tell
userspace that a given reg can't be changed after the vcpu started.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-08-08 17:26:03 +05:30
Yi Liu dcc31ea60b kvm/vfio: Accept vfio device file from userspace
This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*.
Old userspace uses KVM_DEV_VFIO_GROUP* works as well.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yanting Jiang <yanting.jiang@intel.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20230718135551.6592-6-yi.l.liu@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-07-25 10:18:42 -06:00
Linus Torvalds 7210de3a32 A half-dozen late arriving docs patches. They are mostly fixes, but we
also have a kernel-doc tweak for enums and the long-overdue removal of the
 outdated and redundant patch-submission comments at the top of the
 MAINTAINERS file.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmSnR4EPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YfWIH/0b+gYD0PftjpG1MfPTlvsxm3yiO2IkZR1rX
 ZEvzMIk3cqDsZuhv8g4Xh3qrn7QHW9JE8XbOdkMDw+Hd1kkmYeweVhsLhcar44ai
 KPBXCbnd6bU6HcjT/o/AEkYVJzZDmKbt8ALi5C81xu8bWn2iybKgnJv1a3M1PFAx
 Dr5ne14HTEau5ewYeYPhkC2n1XRIE1BV0k4PdZlQE/67uwhplh9J2P/DiXh3I9DT
 0oxh8cZHRVheCkXNYseMWEC5V+xFfh3jP/fvIefuNGCb7AGDSE4s+Wx8I9CbduIN
 SwFtsqXRm2cQ8aj950T0E4JQZLVY0DJKrIJo0qh3LrfUYTinQx0=
 =tuod
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.5-2' of git://git.lwn.net/linux

Pull mode documentation updates from Jonathan Corbet:
 "A half-dozen late arriving docs patches. They are mostly fixes, but we
  also have a kernel-doc tweak for enums and the long-overdue removal of
  the outdated and redundant patch-submission comments at the top of the
  MAINTAINERS file"

* tag 'docs-6.5-2' of git://git.lwn.net/linux:
  scripts: kernel-doc: support private / public marking for enums
  Documentation: KVM: SEV: add a missing backtick
  Documentation: ACPI: fix typo in ssdt-overlays.rst
  Fix documentation of panic_on_warn
  docs: remove the tips on how to submit patches from MAINTAINERS
  docs: fix typo in zh_TW and zh_CN translation
2023-07-06 22:15:38 -07:00
Changyuan Lyu 1e6115f50b Documentation: KVM: SEV: add a missing backtick
``ENOTTY` -> ``ENOTTY``.

Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20230624165858.21777-1-changyuanl@google.com>
2023-07-04 08:30:49 -06:00
Linus Torvalds e8069f5a8e ARM64:
* Eager page splitting optimization for dirty logging, optionally
   allowing for a VM to avoid the cost of hugepage splitting in the stage-2
   fault path.
 
 * Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
   services that live in the Secure world. pKVM intervenes on FF-A calls
   to guarantee the host doesn't misuse memory donated to the hyp or a
   pKVM guest.
 
 * Support for running the split hypervisor with VHE enabled, known as
   'hVHE' mode. This is extremely useful for testing the split
   hypervisor on VHE-only systems, and paves the way for new use cases
   that depend on having two TTBRs available at EL2.
 
 * Generalized framework for configurable ID registers from userspace.
   KVM/arm64 currently prevents arbitrary CPU feature set configuration
   from userspace, but the intent is to relax this limitation and allow
   userspace to select a feature set consistent with the CPU.
 
 * Enable the use of Branch Target Identification (FEAT_BTI) in the
   hypervisor.
 
 * Use a separate set of pointer authentication keys for the hypervisor
   when running in protected mode, as the host is untrusted at runtime.
 
 * Ensure timer IRQs are consistently released in the init failure
   paths.
 
 * Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
   (FEAT_EVT), as it is a register commonly read from userspace.
 
 * Erratum workaround for the upcoming AmpereOne part, which has broken
   hardware A/D state management.
 
 RISC-V:
 
 * Redirect AMO load/store misaligned traps to KVM guest
 
 * Trap-n-emulate AIA in-kernel irqchip for KVM guest
 
 * Svnapot support for KVM Guest
 
 s390:
 
 * New uvdevice secret API
 
 * CMM selftest and fixes
 
 * fix racy access to target CPU for diag 9c
 
 x86:
 
 * Fix missing/incorrect #GP checks on ENCLS
 
 * Use standard mmu_notifier hooks for handling APIC access page
 
 * Drop now unnecessary TR/TSS load after VM-Exit on AMD
 
 * Print more descriptive information about the status of SEV and SEV-ES during
   module load
 
 * Add a test for splitting and reconstituting hugepages during and after
   dirty logging
 
 * Add support for CPU pinning in demand paging test
 
 * Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
   included along the way
 
 * Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
   recovery threads (because nx_huge_pages=off can be toggled at runtime)
 
 * Move handling of PAT out of MTRR code and dedup SVM+VMX code
 
 * Fix output of PIC poll command emulation when there's an interrupt
 
 * Add a maintainer's handbook to document KVM x86 processes, preferred coding
   style, testing expectations, etc.
 
 * Misc cleanups, fixes and comments
 
 Generic:
 
 * Miscellaneous bugfixes and cleanups
 
 Selftests:
 
 * Generate dependency files so that partial rebuilds work as expected
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
 6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
 F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
 FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
 LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
 m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
 =AMXp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Eager page splitting optimization for dirty logging, optionally
     allowing for a VM to avoid the cost of hugepage splitting in the
     stage-2 fault path.

   - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
     with services that live in the Secure world. pKVM intervenes on
     FF-A calls to guarantee the host doesn't misuse memory donated to
     the hyp or a pKVM guest.

   - Support for running the split hypervisor with VHE enabled, known as
     'hVHE' mode. This is extremely useful for testing the split
     hypervisor on VHE-only systems, and paves the way for new use cases
     that depend on having two TTBRs available at EL2.

   - Generalized framework for configurable ID registers from userspace.
     KVM/arm64 currently prevents arbitrary CPU feature set
     configuration from userspace, but the intent is to relax this
     limitation and allow userspace to select a feature set consistent
     with the CPU.

   - Enable the use of Branch Target Identification (FEAT_BTI) in the
     hypervisor.

   - Use a separate set of pointer authentication keys for the
     hypervisor when running in protected mode, as the host is untrusted
     at runtime.

   - Ensure timer IRQs are consistently released in the init failure
     paths.

   - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
     Traps (FEAT_EVT), as it is a register commonly read from userspace.

   - Erratum workaround for the upcoming AmpereOne part, which has
     broken hardware A/D state management.

  RISC-V:

   - Redirect AMO load/store misaligned traps to KVM guest

   - Trap-n-emulate AIA in-kernel irqchip for KVM guest

   - Svnapot support for KVM Guest

  s390:

   - New uvdevice secret API

   - CMM selftest and fixes

   - fix racy access to target CPU for diag 9c

  x86:

   - Fix missing/incorrect #GP checks on ENCLS

   - Use standard mmu_notifier hooks for handling APIC access page

   - Drop now unnecessary TR/TSS load after VM-Exit on AMD

   - Print more descriptive information about the status of SEV and
     SEV-ES during module load

   - Add a test for splitting and reconstituting hugepages during and
     after dirty logging

   - Add support for CPU pinning in demand paging test

   - Add support for AMD PerfMonV2, with a variety of cleanups and minor
     fixes included along the way

   - Add a "nx_huge_pages=never" option to effectively avoid creating NX
     hugepage recovery threads (because nx_huge_pages=off can be toggled
     at runtime)

   - Move handling of PAT out of MTRR code and dedup SVM+VMX code

   - Fix output of PIC poll command emulation when there's an interrupt

   - Add a maintainer's handbook to document KVM x86 processes,
     preferred coding style, testing expectations, etc.

   - Misc cleanups, fixes and comments

  Generic:

   - Miscellaneous bugfixes and cleanups

  Selftests:

   - Generate dependency files so that partial rebuilds work as
     expected"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
  Documentation/process: Add a maintainer handbook for KVM x86
  Documentation/process: Add a label for the tip tree handbook's coding style
  KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
  RISC-V: KVM: Remove unneeded semicolon
  RISC-V: KVM: Allow Svnapot extension for Guest/VM
  riscv: kvm: define vcpu_sbi_ext_pmu in header
  RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
  RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel emulation of AIA APLIC
  RISC-V: KVM: Implement device interface for AIA irqchip
  RISC-V: KVM: Skeletal in-kernel AIA irqchip support
  RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
  RISC-V: KVM: Add APLIC related defines
  RISC-V: KVM: Add IMSIC related defines
  RISC-V: KVM: Implement guest external interrupt line management
  KVM: x86: Remove PRIx* definitions as they are solely for user space
  s390/uv: Update query for secret-UVCs
  s390/uv: replace scnprintf with sysfs_emit
  s390/uvdevice: Add 'Lock Secret Store' UVC
  ...
2023-07-03 15:32:22 -07:00
Paolo Bonzini 36b68d360a KVM x86 changes for 6.5:
- Move handling of PAT out of MTRR code and dedup SVM+VMX code
 
  - Fix output of PIC poll command emulation when there's an interrupt
 
  - Fix a longstanding bug in the reporting of the number of entries returned by
    KVM_GET_CPUID2
 
  - Add a maintainer's handbook to document KVM x86 processes, preferred coding
    style, testing expectations, etc.
 
  - Misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmSaGMMSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5iDIP/0PwY3J5odTEUTnAyuDFPimd5PBt9k/O
 B414wdpSKVgzq+0An4qM9mKRnklVIh2p8QqQTvDhcBUg3xb6CX9xZ4ery7hp/T5O
 tr5bAXs2AYX6jpxvsopt+w+E9j6fvkJhcJCRU9im3QbrqwUE+ecyU5OHvmv2n/GO
 syVZJbPOYuoLPKDjlSMrScE6fWEl9UOvHc5BK/vafTeyisMG3vv1BSmJj6GuiNNk
 TS1RRIg//cOZghQyDfdXt0azTmakNZyNn35xnoX9x8SRmdRykyUjQeHmeqWxPDso
 kiGO+CGancfS57S6ZtCkJjqEWZ1o/zKdOxr8MMf/3nJhv4kY7/5XtlVoACv5soW9
 bZEmNiXIaSbvKNMwAlLJxHFbLa1sMdSCb345CIuMdt5QiWJ53ZiTyIAJX6+eL+Zf
 8nkeekgPf5VUs6Zt0RdRPyvo+W7Vp9BtI87yDXm1nQKpbys2pt6CD3YB/oF4QViG
 a5cyGoFuqRQbS3nmbshIlR7EanTuxbhLZKrNrFnolZ5e624h3Cnk2hVsfTznVGiX
 vNHWM80phk1CWB9McErrZVkGfjlyVyBL13CBB2XF7Dl6PfF6/N22a9bOuTJD3tvk
 PlNx4hvZm3esvvyGpjfbSajTKYE8O7rxiE1KrF0BpZ5IUl5WSiTr6XCy/yI/mIeM
 hay2IWhPOF2z
 =D0BH
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-misc-6.5' of https://github.com/kvm-x86/linux into HEAD

KVM x86 changes for 6.5:

* Move handling of PAT out of MTRR code and dedup SVM+VMX code

* Fix output of PIC poll command emulation when there's an interrupt

* Add a maintainer's handbook to document KVM x86 processes, preferred coding
  style, testing expectations, etc.

* Misc cleanups
2023-07-01 07:08:59 -04:00
Linus Torvalds 6aeadf7896 Move the arm64 architecture documentation under Documentation/arch/. This
brings some order to the documentation directory, declutters the top-level
 directory, and makes the documentation organization more closely match that
 of the source.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmSbDsIPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y+ksH/2Xqun1ipPvu66+bBdPIf8N9AVFatl2q3mt4
 tgX3A4RH3Ejklb4GbRLOIP23PmCxt7LRv4P05ttw8VpTP3A+Cw1d1s2RxiXGvfDE
 j7IW6hrpUmVoDdiDCRGtjdIa7MVI5aAsj8CCTjEFywGi5CQe0Uzq4aTUKoxJDEnu
 GYVy2CwDNEt4GTQ6ClPpFx2rc4UZf/H2XqXsnod9ef8A5Nkt3EtgoS1hh3o1QZGA
 Mqx2HAOVS1tb6GUVUbVLCdj40+YjBLjXFlsH4dA+wsFFdUlZLKuTesdiAMg2X6eT
 E8C/6oRT+OiWbrnXUTJEn8z98Ds8VHn7D4n97O9bIQ+R9AFtmPI=
 =H/+D
 -----END PGP SIGNATURE-----

Merge tag 'docs-arm64-move' of git://git.lwn.net/linux

Pull arm64 documentation move from Jonathan Corbet:
 "Move the arm64 architecture documentation under Documentation/arch/.

  This brings some order to the documentation directory, declutters the
  top-level directory, and makes the documentation organization more
  closely match that of the source"

* tag 'docs-arm64-move' of git://git.lwn.net/linux:
  perf arm-spe: Fix a dangling Documentation/arm64 reference
  mm: Fix a dangling Documentation/arm64 reference
  arm64: Fix dangling references to Documentation/arm64
  dt-bindings: fix dangling Documentation/arm64 reference
  docs: arm64: Move arm64 documentation under Documentation/arch/
2023-06-27 21:52:15 -07:00
Linus Torvalds a354049532 It's been a relatively calm cycle in docsland. We do have:
- Some initial page-table documentation from Linus (the other Linus)
 
 - Regression-handling documentation improvements from Thorsten
 
 - Addition of kerneldoc documentation for the ERR_PTR() and related
   macros from James Seo
 
 ...and the usual collection of fixes and updates.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmSbC9wPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Yw7YH/Rcd2oVQ/B8ui9TYcXTQid0ly5GvLl/ot0zf
 pml725bZSKodcdtmLvQ6CzMGRdzxhQpVfzy21zHAlQWiBMdheWeu0Etmpspn8fCI
 wnJIlUbGdp5Aq4ZtoJPTtE3vXvWEQ32gVytGjbTVNtSLRLXQ1bc+A/IvmRj3jdkV
 dwPfN7hPLVhBt5770pHMywlFVBQ9FUjUNC+uX0JkcNZJ3598c4ZzndBEaLdqfPHC
 DtWucRdnHubTncKECgYbspsfH6zuntFk8FgsD1gZ1K9izMAwVBsKSS+MeOz8oxx8
 rPq4Tscqs/9mpist/PqxEu0fvTC3xsyMbxLA4hAORmgpdnbWIaQ=
 =q2B4
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.5' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It's been a relatively calm cycle in docsland. We do have:

   - Some initial page-table documentation from Linus (the other Linus)

   - Regression-handling documentation improvements from Thorsten

   - Addition of kerneldoc documentation for the ERR_PTR() and related
     macros from James Seo

  ... and the usual collection of fixes and updates"

* tag 'docs-6.5' of git://git.lwn.net/linux:
  docs: consolidate storage interfaces
  Documentation: update git configuration for Link: tag
  Documentation: KVM: make corrections to vcpu-requests.rst
  Documentation: KVM: make corrections to ppc-pv.rst
  Documentation: KVM: make corrections to locking.rst
  Documentation: KVM: make corrections to halt-polling.rst
  Documentation: virt: correct location of haltpoll module params
  Documentation/mm: Initial page table documentation
  docs: crypto: async-tx-api: fix typo in struct name
  docs/doc-guide: Clarify how to write tables
  docs: handling-regressions: rework section about fixing procedures
  docs: process: fix a typoed cross-reference
  docs: submitting-patches: Discuss interleaved replies
  MAINTAINERS: direct process doc changes to a dedicated ML
  Documentation: core-api: Add error pointer functions to kernel-api
  err.h: Add missing kerneldocs for error pointer functions
  Documentation: conf.py: Add __force to c_id_attributes
  docs: clarify KVM related kernel parameters' descriptions
  docs: consolidate human interface subsystems
  docs: admin-guide: Add information about intel_pstate active mode
2023-06-27 11:33:47 -07:00
Jonathan Corbet e4624435f3 docs: arm64: Move arm64 documentation under Documentation/arch/
Architecture-specific documentation is being moved into Documentation/arch/
as a way of cleaning up the top-level documentation directory and making
the docs hierarchy more closely match the source hierarchy.  Move
Documentation/arm64 into arch/ (along with the Chinese equvalent
translations) and fix up documentation references.

Cc: Will Deacon <will@kernel.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Hu Haowen <src.res@email.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Yantengsi <siyanteng@loongson.cn>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-21 08:51:51 -06:00
Randy Dunlap 6f7f812f54 Documentation: virt: Clean up paravirt_ops doc
Clarify language. Clean up grammar. Hyphenate some words.

Change "low-ops" to "low-level" since "low-ops" isn't defined or even
mentioned anywhere else in the kernel source tree.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230610054310.6242-1-rdunlap@infradead.org
2023-06-19 12:09:54 +02:00
Randy Dunlap 95b4d47a44 Documentation: KVM: make corrections to vcpu-requests.rst
Make corrections to punctuation and grammar.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Christoffer Dall <cdall@linaro.org>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-5-rdunlap@infradead.org
2023-06-16 08:20:53 -06:00
Randy Dunlap daa3a39731 Documentation: KVM: make corrections to ppc-pv.rst
Correct the path of a header file.
Change "guest to ... guest" to "guest to ... host" in one place.
Hyphenate "32-bit" systems.
Add a comma at one parenthetical phrase.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: Alexander Graf <agraf@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-4-rdunlap@infradead.org
2023-06-16 08:20:53 -06:00
Randy Dunlap c37fa9dbb7 Documentation: KVM: make corrections to locking.rst
Correct grammar and punctuation.
Use "read-only" for consistency.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-3-rdunlap@infradead.org
2023-06-16 08:20:53 -06:00
Randy Dunlap 4c60d49913 Documentation: KVM: make corrections to halt-polling.rst
Module parameters are in sysfs, not debugfs, so change that.

Remove superfluous "that" following "Note:".
Hyphenate "system-wide" values.
Hyphenate "trade-off".

Don't treat "denial of service" as a verb.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230612030810.23376-2-rdunlap@infradead.org
2023-06-16 08:20:53 -06:00
Randy Dunlap 1954d51592 Documentation: virt: correct location of haltpoll module params
Module parameters are located in sysfs, not debugfs, so correct the
statement.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230610054302.6223-1-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-16 08:17:21 -06:00
Binbin Wu 06b66e0500 KVM: x86: Fix a typo in Documentation/virt/kvm/x86/mmu.rst
L1 CR4.LA57 should be '0' instead of '1' when shadowing 5-level NPT
for 4-level NPT L1 guest.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Link: https://lore.kernel.org/r/20230518091339.1102-4-binbin.wu@linux.intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-06-05 13:07:21 -07:00
Ricardo Koller 2f440b72e8 KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
Add a capability for userspace to specify the eager split chunk size.
The chunk size specifies how many pages to break at a time, using a
single allocation. Bigger the chunk size, more pages need to be
allocated ahead of time.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230426172330.1439644-6-ricarkol@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-05-16 17:39:18 +00:00
Linus Torvalds 7df047b3f0 VFIO updates for v6.4-rc1
- Expose and allow R/W access to the PCIe DVSEC capability through
    vfio-pci, as we already do with the legacy vendor capability.
    (K V P Satyanarayana)
 
  - Fix kernel-doc issues with structure definitions. (Simon Horman)
 
  - Clarify ordering of operations relative to the kvm-vfio device for
    driver dependencies against the kvm pointer. (Yi Liu)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmRRPjAbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiePoQAJkUBO4bN3BvTG3iOkh1
 vGrD3llhoUeD3AetT+5Y7pX0ml6wx9SLjcuFIUyYDMomLebpFXQAlI3gjE5eGbnf
 PVfn6zJDrcA5w2tFwkJHUnN+RgcaPW5tiviP6XkRFpS95HCQRsFdp9GhI2U3X7q7
 1G1pTV4Fb6BPhU0zAOueW1HKovTFDi3qbzaG5JjosdZc8V32/USQ6ekkXSGMbuzN
 HsqLi2ujrtwAYluNKGxsuiDKXBFT8z8Mji7S9Z8RqoR0IF9q8JMUC0UMImn/XhKN
 4XgCsibPt4o+RSStAf3vUPoP1l3I5J+3dE4ynLiHN8mNmbJ2gCLzQx18Dfww3V9n
 Ao8wdwZvvskfc+gkEJAh47OFf9GvCvvHtg3uZS9KPDidS3uY8K4ooIr4Pv1F+QMh
 rcnjZRFCakPB2tijmtg+1/0dY5/YEAn0mYWQ8ph5Ips36q30WBrbhBLJ47wE6PZ0
 bsChcCU+4wK9e20lzMewa9QYnrRkvjGbmIo3GnjyPt4aG0VM08orIePy7ZVen3S3
 +ieXBbuW61BLN/fZBFrDB4IxmekWq2TkkxSQCe8rn+jgs7qUUlFzn34Iqna6QTQV
 pG+O3pyflULZGZHkij4lOn1i2SqqbLxZofYose9hHq5r6ALD5EFNg3QVLuiAc7Tb
 3ctmfePHowrzKMyHMlXb8DE7
 =Ylh5
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.4-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Expose and allow R/W access to the PCIe DVSEC capability through
   vfio-pci, as we already do with the legacy vendor capability
   (K V P Satyanarayana)

 - Fix kernel-doc issues with structure definitions (Simon Horman)

 - Clarify ordering of operations relative to the kvm-vfio device for
   driver dependencies against the kvm pointer (Yi Liu)

* tag 'vfio-v6.4-rc1' of https://github.com/awilliam/linux-vfio:
  docs: kvm: vfio: Suggest KVM_DEV_VFIO_GROUP_ADD vs VFIO_GROUP_GET_DEVICE_FD ordering
  vfio: correct kdoc for ops structures
  vfio/pci: Add DVSEC PCI Extended Config Capability to user visible list.
2023-05-02 11:56:43 -07:00
Linus Torvalds c8c655c34e s390:
* More phys_to_virt conversions
 
 * Improvement of AP management for VSIE (nested virtualization)
 
 ARM64:
 
 * Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 * New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 * Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 * A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 * The usual selftest fixes and improvements.
 
 KVM x86 changes for 6.4:
 
 * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
   and by giving the guest control of CR0.WP when EPT is enabled on VMX
   (VMX-only because SVM doesn't support per-bit controls)
 
 * Add CR0/CR4 helpers to query single bits, and clean up related code
   where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
   as a bool
 
 * Move AMD_PSFD to cpufeatures.h and purge KVM's definition
 
 * Avoid unnecessary writes+flushes when the guest is only adding new PTEs
 
 * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
   when emulating invalidations
 
 * Clean up the range-based flushing APIs
 
 * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
   A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
   changed SPTE" overhead associated with writing the entire entry
 
 * Track the number of "tail" entries in a pte_list_desc to avoid having
   to walk (potentially) all descriptors during insertion and deletion,
   which gets quite expensive if the guest is spamming fork()
 
 * Disallow virtualizing legacy LBRs if architectural LBRs are available,
   the two are mutually exclusive in hardware
 
 * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
   after KVM_RUN, similar to CPUID features
 
 * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
 
 * Apply PMU filters to emulated events and add test coverage to the
   pmu_event_filter selftest
 
 x86 AMD:
 
 * Add support for virtual NMIs
 
 * Fixes for edge cases related to virtual interrupts
 
 x86 Intel:
 
 * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
   not being reported due to userspace not opting in via prctl()
 
 * Fix a bug in emulation of ENCLS in compatibility mode
 
 * Allow emulation of NOP and PAUSE for L2
 
 * AMX selftests improvements
 
 * Misc cleanups
 
 MIPS:
 
 * Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
 
 Generic:
 
 * Drop unnecessary casts from "void *" throughout kvm_main.c
 
 * Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
 Documentation:
 
 * Fix goof introduced by the conversion to rST
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
 WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
 huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
 XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
 j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
 =S2Hq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "s390:

   - More phys_to_virt conversions

   - Improvement of AP management for VSIE (nested virtualization)

  ARM64:

   - Numerous fixes for the pathological lock inversion issue that
     plagued KVM/arm64 since... forever.

   - New framework allowing SMCCC-compliant hypercalls to be forwarded
     to userspace, hopefully paving the way for some more features being
     moved to VMMs rather than be implemented in the kernel.

   - Large rework of the timer code to allow a VM-wide offset to be
     applied to both virtual and physical counters as well as a
     per-timer, per-vcpu offset that complements the global one. This
     last part allows the NV timer code to be implemented on top.

   - A small set of fixes to make sure that we don't change anything
     affecting the EL1&0 translation regime just after having having
     taken an exception to EL2 until we have executed a DSB. This
     ensures that speculative walks started in EL1&0 have completed.

   - The usual selftest fixes and improvements.

  x86:

   - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
     enabled, and by giving the guest control of CR0.WP when EPT is
     enabled on VMX (VMX-only because SVM doesn't support per-bit
     controls)

   - Add CR0/CR4 helpers to query single bits, and clean up related code
     where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
     return as a bool

   - Move AMD_PSFD to cpufeatures.h and purge KVM's definition

   - Avoid unnecessary writes+flushes when the guest is only adding new
     PTEs

   - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
     optimizations when emulating invalidations

   - Clean up the range-based flushing APIs

   - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
     single A/D bit using a LOCK AND instead of XCHG, and skip all of
     the "handle changed SPTE" overhead associated with writing the
     entire entry

   - Track the number of "tail" entries in a pte_list_desc to avoid
     having to walk (potentially) all descriptors during insertion and
     deletion, which gets quite expensive if the guest is spamming
     fork()

   - Disallow virtualizing legacy LBRs if architectural LBRs are
     available, the two are mutually exclusive in hardware

   - Disallow writes to immutable feature MSRs (notably
     PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features

   - Overhaul the vmx_pmu_caps selftest to better validate
     PERF_CAPABILITIES

   - Apply PMU filters to emulated events and add test coverage to the
     pmu_event_filter selftest

   - AMD SVM:
       - Add support for virtual NMIs
       - Fixes for edge cases related to virtual interrupts

   - Intel AMX:
       - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
         XTILE_DATA is not being reported due to userspace not opting in
         via prctl()
       - Fix a bug in emulation of ENCLS in compatibility mode
       - Allow emulation of NOP and PAUSE for L2
       - AMX selftests improvements
       - Misc cleanups

  MIPS:

   - Constify MIPS's internal callbacks (a leftover from the hardware
     enabling rework that landed in 6.3)

  Generic:

   - Drop unnecessary casts from "void *" throughout kvm_main.c

   - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
     struct size by 8 bytes on 64-bit kernels by utilizing a padding
     hole

  Documentation:

   - Fix goof introduced by the conversion to rST"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
  KVM: s390: pci: fix virtual-physical confusion on module unload/load
  KVM: s390: vsie: clarifications on setting the APCB
  KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
  KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
  KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
  KVM: selftests: Test the PMU event "Instructions retired"
  KVM: selftests: Copy full counter values from guest in PMU event filter test
  KVM: selftests: Use error codes to signal errors in PMU event filter test
  KVM: selftests: Print detailed info in PMU event filter asserts
  KVM: selftests: Add helpers for PMC asserts in PMU event filter test
  KVM: selftests: Add a common helper for the PMU event filter guest code
  KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
  KVM: arm64: vhe: Drop extra isb() on guest exit
  KVM: arm64: vhe: Synchronise with page table walker on MMU update
  KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
  KVM: arm64: nvhe: Synchronise with page table walker on TLBI
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
  KVM: arm64: vgic: Don't acquire its_lock before config_lock
  KVM: selftests: Add test to verify KVM's supported XCR0
  ...
2023-05-01 12:06:20 -07:00
Paolo Bonzini e1a6d5cf10 Common KVM changes for 6.4:
- Drop unnecessary casts from "void *" throughout kvm_main.c
 
  - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
    size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
  - Fix a documentation format goof that was introduced when the KVM docs
    were converted to ReST
 
  - Constify MIPS's internal callbacks (a leftover from the hardware enabling
    rework that landed in 6.3)
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmRGrVkSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL52ZAP/0/6KOa6ZSvkRh+7MwQDkfeXkkbRIyyY
 ItPspXCqCmD9X79m2r/5PCfpLgWDizROzOxLXb2bMhh7DqPczWWMvwEfZxBRK9LN
 5zpHRdiiJJLR0HMdQtWkM5tdDCw/v37aQPkWyaZC/zDi2Zv6YPtPJVEBd38Squoh
 vJ8zQp3c1qxHJWKvNaS6JY7NQ1B1sI3e7H9VEldR2d3RAinuAnIMgi+I8WqU6RT1
 IdIYkemKrgquO9OPGeBxMV4ri5Km9FBdzb8LRkzzfYaELzVsrRxhXBOc9zaasgYK
 YVbJSINeq5dIpwoXI9tqDBJTUIAPJ3NOwK/4E6qc6YEIZoT7euKGgGAqI879TSKm
 zNR8b1ijVu5DquJbDFP8AR2UZnqCEIQ/EuuJdkHxFE5wQnNjgNJtSHZVJX/cKqW9
 wnXCqK6wQoAUq7pUgyqTsy3SCiRQddEtwsMcf/CdWRPXcgDqQ1P3UmVupLcEtL0I
 B+I7S+L64/KOHGeQsEKrohAOFBsMFVEkSkthyflg6/RFv1heHo2lx3njFKYm9lCW
 LDCd70+iHD8e5/X4RCWAjB0EaqM3MYpAU2UtD8Pbjx/DiZDLWEjDD0B2LkI0uinS
 +Ebdc5M9zGrNawiAzvF+MhZfDWut4Cr0tS5cPttXX3lg8aPl3nZL2G3nlk4vgpec
 jgNvjwQ5hUyv
 =Qw05
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.4' of https://github.com/kvm-x86/linux into HEAD

Common KVM changes for 6.4:

 - Drop unnecessary casts from "void *" throughout kvm_main.c

 - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole

 - Fix a documentation format goof that was introduced when the KVM docs
   were converted to ReST

 - Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
2023-04-26 15:48:44 -04:00
Paolo Bonzini 4f382a79a6 KVM/arm64 updates for 6.4
- Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 - New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 - Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 - A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 - The usual selftest fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmRCZIwPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDoZ8P/ioXAdDbAE4hTuyD2YdKJ3IGWN3pg52Z7xc2
 rBXXFrbK9+n9FEc3AVdHoGsRPDP0Ynl+apj+aB0Klr/Fl0KKqac+W0ARX9rn1mI1
 HjeygFPaGnXjMUp0BjeSLS+g3b0gebELJ6R1QEe1/MIPb8Se7M1y3ZpMWdhe0PPL
 vyzw3LZq2OAlLgWKZhAfhh03qdr2kqJxypYs6nMrcexfn8dXT78dsYKW1nXmqKcE
 61Gg23MDPUoexYpUhm+ym5t8hltoI1di8faPmxEpaFzpSDyAg8V5vo6LiW9jn3cf
 RX0Sikk1laiRAhVbbIFCKC148vFyKxum3scpKyb91Qc+sK1kmIcxvEqlc6SfG9je
 +5ndZwAfXtW6SMSOyX8y5fXbee7M0sx3n3le9BNgwXfmLWg/GHXJ544dJgVIlf/e
 0Z+8QnP1IUDfARR/b2FlW7A7XLzNHQzO379ekcAdUptbGwlS9CrW6SJ83QR7K6fB
 bh0aSSELKsD7pX8wnNyNACvmz2zL12ITlDKdZWUr8MSxyTjgVy7s0BDsQT3sbrA1
 1sH++RvUWfC2k7tVT3vjZFzUDlPw3bnZmo5YMWRTMbXEdr1V5rDw5F5IXit13KeT
 8bk0hnJgnLmyoX2A17v5dkFMIKD7p13tqDRdfFcn0ru63HIKxgkS3ITkDmsAQELK
 DHT7RBE0
 =Bhta
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.4

- Numerous fixes for the pathological lock inversion issue that
  plagued KVM/arm64 since... forever.

- New framework allowing SMCCC-compliant hypercalls to be forwarded
  to userspace, hopefully paving the way for some more features
  being moved to VMMs rather than be implemented in the kernel.

- Large rework of the timer code to allow a VM-wide offset to be
  applied to both virtual and physical counters as well as a
  per-timer, per-vcpu offset that complements the global one.
  This last part allows the NV timer code to be implemented on
  top.

- A small set of fixes to make sure that we don't change anything
  affecting the EL1&0 translation regime just after having having
  taken an exception to EL2 until we have executed a DSB. This
  ensures that speculative walks started in EL1&0 have completed.

- The usual selftest fixes and improvements.
2023-04-26 15:46:52 -04:00
Linus Torvalds bc1bb2a49b - Add the necessary glue so that the kernel can run as a confidential
SEV-SNP vTOM guest on Hyper-V. A vTOM guest basically splits the
   address space in two parts: encrypted and unencrypted. The use case
   being running unmodified guests on the Hyper-V confidential computing
   hypervisor
 
 - Double-buffer messages between the guest and the hardware PSP device
   so that no partial buffers are copied back'n'forth and thus potential
   message integrity and leak attacks are possible
 
 - Name the return value the sev-guest driver returns when the hw PSP
   device hasn't been called, explicitly
 
 - Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmRGl8gACgkQEsHwGGHe
 VUoEDhAAiw4+2nZR7XUJ7pewlXG7AJJZsVIpzzcF6Gyymn0LFCyMnP7O3snmFqzz
 aik0q2LzWrmDQ3Nmmzul0wtdsuW7Nik6BP9oF3WnB911+gGbpXyNWZ8EhOPNzkUR
 9D8Sp6f0xmqNE3YuzEpanufiDswgUxi++DRdmIRAs1TTh4bfUFWZcib1pdwoqSmR
 oS3UfVwVZ4Ee2Qm1f3n3XQ0FUpsjWeARPExUkLEvd8XeonTP+6aGAdggg9MnPcsl
 3zpSmOpuZ6VQbDrHxo3BH9HFuIUOd6S9PO++b9F6WxNPGEMk7fHa7ahOA6HjhgVz
 5Da3BN16OS9j64cZsYHMPsBcd+ja1YmvvZGypsY0d6X4d3M1zTPW+XeLbyb+VFBy
 SvA7z+JuxtLKVpju65sNiJWw8ZDTSu+eEYNDeeGLvAj3bxtclJjcPdMEPdzxmC5K
 eAhmRmiFuVM4nXMAR6cspVTsxvlTHFtd5gdm6RlRnvd7aV77Zl1CLzTy8IHTVpvI
 t7XTbtjEjYc0pI6cXXptHEOnBLjXUMPcqgGFgJYEauH6EvrxoWszUZD0tS3Hw80A
 K+Rwnc70ubq/PsgZcF4Ayer1j49z1NPfk5D4EA7/ChN6iNhQA8OqHT1UBrHAgqls
 2UAwzE2sQZnjDvGZghlOtFIQUIhwue7m93DaRi19EOdKYxVjV6U=
 =ZAw9
 -----END PGP SIGNATURE-----

Merge tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:

 - Add the necessary glue so that the kernel can run as a confidential
   SEV-SNP vTOM guest on Hyper-V. A vTOM guest basically splits the
   address space in two parts: encrypted and unencrypted. The use case
   being running unmodified guests on the Hyper-V confidential computing
   hypervisor

 - Double-buffer messages between the guest and the hardware PSP device
   so that no partial buffers are copied back'n'forth and thus potential
   message integrity and leak attacks are possible

 - Name the return value the sev-guest driver returns when the hw PSP
   device hasn't been called, explicitly

 - Cleanups

* tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyperv: Change vTOM handling to use standard coco mechanisms
  init: Call mem_encrypt_init() after Hyper-V hypercall init is done
  x86/mm: Handle decryption/re-encryption of bss_decrypted consistently
  Drivers: hv: Explicitly request decrypted in vmap_pfn() calls
  x86/hyperv: Reorder code to facilitate future work
  x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM
  x86/sev: Change snp_guest_issue_request()'s fw_err argument
  virt/coco/sev-guest: Double-buffer messages
  crypto: ccp: Get rid of __sev_platform_init_locked()'s local function pointer
  crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
2023-04-25 10:48:08 -07:00
Linus Torvalds c23f28975a Commit volume in documentation is relatively low this time, but there is
still a fair amount going on, including:
 
 - Reorganizing the architecture-specific documentation under
   Documentation/arch.  This makes the structure match the source directory
   and helps to clean up the mess that is the top-level Documentation
   directory a bit.  This work creates the new directory and moves x86 and
   most of the less-active architectures there.  The current plan is to move
   the rest of the architectures in 6.5, with the patches going through the
   appropriate subsystem trees.
 
 - Some more Spanish translations and maintenance of the Italian
   translation.
 
 - A new "Kernel contribution maturity model" document from Ted.
 
 - A new tutorial on quickly building a trimmed kernel from Thorsten.
 
 Plus the usual set of updates and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmRGze0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y/VsH/RyWqinorRVFZmHqRJMRhR0j7hE2pAgK5prE
 dGXYVtHHNQ+25thNaqhZTOLYFbSX6ii2NG7sLRXmyOTGIZrhUCFFXCHkuq4ZUypR
 gJpMUiKQVT4dhln3gIZ0k09NSr60gz8UTcq895N9UFpUdY1SCDhbCcLc4uXTRajq
 NrdgFaHWRkPb+gBRbXOExYm75DmCC6Ny5AyGo2rXfItV//ETjWIJVQpJhlxKrpMZ
 3LgpdYSLhEFFnFGnXJ+EAPJ7gXDi2Tg5DuPbkvJyFOTouF3j4h8lSS9l+refMljN
 xNRessv+boge/JAQidS6u8F2m2ESSqSxisv/0irgtKIMJwXaoX4=
 =1//8
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.4' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "Commit volume in documentation is relatively low this time, but there
  is still a fair amount going on, including:

   - Reorganize the architecture-specific documentation under
     Documentation/arch

     This makes the structure match the source directory and helps to
     clean up the mess that is the top-level Documentation directory a
     bit. This work creates the new directory and moves x86 and most of
     the less-active architectures there.

     The current plan is to move the rest of the architectures in 6.5,
     with the patches going through the appropriate subsystem trees.

   - Some more Spanish translations and maintenance of the Italian
     translation

   - A new "Kernel contribution maturity model" document from Ted

   - A new tutorial on quickly building a trimmed kernel from Thorsten

  Plus the usual set of updates and fixes"

* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
  media: Adjust column width for pdfdocs
  media: Fix building pdfdocs
  docs: clk: add documentation to log which clocks have been disabled
  docs: trace: Fix typo in ftrace.rst
  Documentation/process: always CC responsible lists
  docs: kmemleak: adjust to config renaming
  ELF: document some de-facto PT_* ABI quirks
  Documentation: arm: remove stih415/stih416 related entries
  docs: turn off "smart quotes" in the HTML build
  Documentation: firmware: Clarify firmware path usage
  docs/mm: Physical Memory: Fix grammar
  Documentation: Add document for false sharing
  dma-api-howto: typo fix
  docs: move m68k architecture documentation under Documentation/arch/
  docs: move parisc documentation under Documentation/arch/
  docs: move ia64 architecture docs under Documentation/arch/
  docs: Move arc architecture docs under Documentation/arch/
  docs: move nios2 documentation under Documentation/arch/
  docs: move openrisc documentation under Documentation/arch/
  docs: move superh documentation under Documentation/arch/
  ...
2023-04-24 12:35:49 -07:00
Yi Liu 705b004ee3 docs: kvm: vfio: Suggest KVM_DEV_VFIO_GROUP_ADD vs VFIO_GROUP_GET_DEVICE_FD ordering
as some vfio_device's open_device op requires kvm pointer and kvm pointer
set is part of GROUP_ADD.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230421053611.55839-1-yi.l.liu@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-04-21 13:48:44 -06:00
Marc Zyngier 6dcf7316e0 Merge branch kvm-arm64/smccc-filtering into kvmarm-master/next
* kvm-arm64/smccc-filtering:
  : .
  : SMCCC call filtering and forwarding to userspace, courtesy of
  : Oliver Upton. From the cover letter:
  :
  : "The Arm SMCCC is rather prescriptive in regards to the allocation of
  : SMCCC function ID ranges. Many of the hypercall ranges have an
  : associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
  : room for vendor-specific implementations.
  :
  : The ever-expanding SMCCC surface leaves a lot of work within KVM for
  : providing new features. Furthermore, KVM implements its own
  : vendor-specific ABI, with little room for other implementations (like
  : Hyper-V, for example). Rather than cramming it all into the kernel we
  : should provide a way for userspace to handle hypercalls."
  : .
  KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC"
  KVM: arm64: Test that SMC64 arch calls are reserved
  KVM: arm64: Prevent userspace from handling SMC64 arch range
  KVM: arm64: Expose SMC/HVC width to userspace
  KVM: selftests: Add test for SMCCC filter
  KVM: selftests: Add a helper for SMCCC calls with SMC instruction
  KVM: arm64: Let errors from SMCCC emulation to reach userspace
  KVM: arm64: Return NOT_SUPPORTED to guest for unknown PSCI version
  KVM: arm64: Introduce support for userspace SMCCC filtering
  KVM: arm64: Add support for KVM_EXIT_HYPERCALL
  KVM: arm64: Use a maple tree to represent the SMCCC filter
  KVM: arm64: Refactor hvc filtering to support different actions
  KVM: arm64: Start handling SMCs from EL1
  KVM: arm64: Rename SMC/HVC call handler to reflect reality
  KVM: arm64: Add vm fd device attribute accessors
  KVM: arm64: Add a helper to check if a VM has ran once
  KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-21 09:44:32 +01:00
Marc Zyngier 0e5c9a9d65 KVM: arm64: Expose SMC/HVC width to userspace
When returning to userspace to handle a SMCCC call, we consistently
set PC to point to the instruction immediately after the HVC/SMC.

However, should userspace need to know the exact address of the
trapping instruction, it needs to know about the *size* of that
instruction. For AArch64, this is pretty easy. For AArch32, this
is a bit more funky, as Thumb has 16bit encodings for both HVC
and SMC.

Expose this to userspace with a new flag that directly derives
from ESR_EL2.IL. Also update the documentation to reflect the PC
state at the point of exit.

Finally, this fixes a small buglet where the hypercall.{args,ret}
fields would not be cleared on exit, and could contain some
random junk.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/86pm8iv8tj.wl-maz@kernel.org
2023-04-05 19:20:23 +01:00
Oliver Upton 821d935c87 KVM: arm64: Introduce support for userspace SMCCC filtering
As the SMCCC (and related specifications) march towards an 'everything
and the kitchen sink' interface for interacting with a system it becomes
less likely that KVM will support every related feature. We could do
better by letting userspace have a crack at it instead.

Allow userspace to define an 'SMCCC filter' that applies to both HVCs
and SMCs initiated by the guest. Supporting both conduits with this
interface is important for a couple of reasons. Guest SMC usage is table
stakes for a nested guest, as HVCs are always taken to the virtual EL2.
Additionally, guests may want to interact with a service on the secure
side which can now be proxied by userspace.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-10-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Oliver Upton d824dff191 KVM: arm64: Add support for KVM_EXIT_HYPERCALL
In anticipation of user hypercall filters, add the necessary plumbing to
get SMCCC calls out to userspace. Even though the exit structure has
space for KVM to pass register arguments, let's just avoid it altogether
and let userspace poke at the registers via KVM_GET_ONE_REG.

This deliberately stretches the definition of a 'hypercall' to cover
SMCs from EL1 in addition to the HVCs we know and love. KVM doesn't
support EL1 calls into secure services, but now we can paint that as a
userspace problem and be done with it.

Finally, we need a flag to let userspace know what conduit instruction
was used (i.e. SMC vs. HVC).

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-9-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Oliver Upton e65733b5c5 KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL
The 'longmode' field is a bit annoying as it blows an entire __u32 to
represent a boolean value. Since other architectures are looking to add
support for KVM_EXIT_HYPERCALL, now is probably a good time to clean it
up.

Redefine the field (and the remaining padding) as a set of flags.
Preserve the existing ABI by using bit 0 to indicate if the guest was in
long mode and requiring that the remaining 31 bits must be zero.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-2-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Takahiro Itazuri fb5015bc8b docs: kvm: x86: Fix broken field list
Add a missing ":" to fix a broken field list.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
Fixes: ba7bb663f5 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")
Message-Id: <20230331093116.99820-1-itazur@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-04-04 13:22:05 -04:00
Jonathan Corbet ff61f0791c docs: move x86 documentation into Documentation/arch/
Move the x86 documentation under Documentation/arch/ as a way of cleaning
up the top-level directory and making the structure of our docs more
closely match the structure of the source directories it describes.

All in-kernel references to the old paths have been updated.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-03-30 12:58:51 -06:00
Marc Zyngier 1935d34afa KVM: arm64: Document KVM_ARM_SET_CNT_OFFSETS and co
Add some basic documentation on the effects of KVM_ARM_SET_CNT_OFFSETS.

Reviewed-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230330174800.2677007-16-maz@kernel.org
2023-03-30 19:01:10 +01:00
Jun Miao b0d237087c KVM: Fix comments that refer to the non-existent install_new_memslots()
Fix stale comments that were left behind when install_new_memslots() was
replaced by kvm_swap_active_memslots() as part of the scalable memslots
rework.

Fixes: a54d806688 ("KVM: Keep memslots in tree-based structures instead of array-based ones")
Signed-off-by: Jun Miao <jun.miao@intel.com>
Link: https://lore.kernel.org/r/20230223052851.1054799-1-jun.miao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-03-24 08:20:17 -07:00
Shaoqin Huang 752b8a9b4d KVM: Add the missed title format
The 7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 now is not a title, make it
as a title to keep the format consistent.

Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: 106ee47dc6 ("docs: kvm: Convert api.txt to ReST format")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230220034910.11024-1-shahuang@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-03-23 16:11:22 -07:00
Dionna Glaze 0144e3b85d x86/sev: Change snp_guest_issue_request()'s fw_err argument
The GHCB specification declares that the firmware error value for
a guest request will be stored in the lower 32 bits of EXIT_INFO_2.  The
upper 32 bits are for the VMM's own error code. The fw_err argument to
snp_guest_issue_request() is thus a misnomer, and callers will need
access to all 64 bits.

The type of unsigned long also causes problems, since sw_exit_info2 is
u64 (unsigned long long) vs the argument's unsigned long*. Change this
type for issuing the guest request. Pass the ioctl command struct's error
field directly instead of in a local variable, since an incomplete guest
request may not set the error code, and uninitialized stack memory would
be written back to user space.

The firmware might not even be called, so bookend the call with the no
firmware call error and clear the error.

Since the "fw_err" field is really exitinfo2 split into the upper bits'
vmm error code and lower bits' firmware error code, convert the 64 bit
value to a union.

  [ bp:
   - Massage commit message
   - adjust code
   - Fix a build issue as
   Reported-by: kernel test robot <lkp@intel.com>
   Link: https://lore.kernel.org/oe-kbuild-all/202303070609.vX6wp2Af-lkp@intel.com
   - print exitinfo2 in hex
   Tom:
    - Correct -EIO exit case. ]

Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230214164638.1189804-5-dionnaglaze@google.com
Link: https://lore.kernel.org/r/20230307192449.24732-12-bp@alien8.de
2023-03-21 15:43:19 +01:00
Peter Gonda efb339a833 crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
The PSP can return a "firmware error" code of -1 in circumstances where
the PSP has not actually been called. To make this protocol unambiguous,
name the value SEV_RET_NO_FW_CALL.

  [ bp: Massage a bit. ]

Signed-off-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20221207010210.2563293-2-dionnaglaze@google.com
2023-03-21 11:37:32 +01:00
Thomas Huth 2def950c63 KVM: arm64: Limit length in kvm_vm_ioctl_mte_copy_tags() to INT_MAX
In case of success, this function returns the amount of handled bytes.
However, this does not work for large values: The function is called
from kvm_arch_vm_ioctl() (which still returns a long), which in turn
is called from kvm_vm_ioctl() in virt/kvm/kvm_main.c. And that function
stores the return value in an "int r" variable. So the upper 32-bits
of the "long" return value are lost there.

KVM ioctl functions should only return "int" values, so let's limit
the amount of bytes that can be requested here to INT_MAX to avoid
the problem with the truncated return value. We can then also change
the return type of the function to "int" to make it clearer that it
is not possible to return a "long" here.

Fixes: f0376edb1d ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Message-Id: <20230208140105.655814-5-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-16 10:18:06 -04:00
Linus Torvalds 49d5759268 ARM:
- Provide a virtual cache topology to the guest to avoid
   inconsistencies with migration on heterogenous systems. Non secure
   software has no practical need to traverse the caches by set/way in
   the first place.
 
 - Add support for taking stage-2 access faults in parallel. This was an
   accidental omission in the original parallel faults implementation,
   but should provide a marginal improvement to machines w/o FEAT_HAFDBS
   (such as hardware from the fruit company).
 
 - A preamble to adding support for nested virtualization to KVM,
   including vEL2 register state, rudimentary nested exception handling
   and masking unsupported features for nested guests.
 
 - Fixes to the PSCI relay that avoid an unexpected host SVE trap when
   resuming a CPU when running pKVM.
 
 - VGIC maintenance interrupt support for the AIC
 
 - Improvements to the arch timer emulation, primarily aimed at reducing
   the trap overhead of running nested.
 
 - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
   interest of CI systems.
 
 - Avoid VM-wide stop-the-world operations when a vCPU accesses its own
   redistributor.
 
 - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
   in the host.
 
 - Aesthetic and comment/kerneldoc fixes
 
 - Drop the vestiges of the old Columbia mailing list and add [Oliver]
   as co-maintainer
 
 This also drags in arm64's 'for-next/sme2' branch, because both it and
 the PSCI relay changes touch the EL2 initialization code.
 
 RISC-V:
 
 - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE
 
 - Correctly place the guest in S-mode after redirecting a trap to the guest
 
 - Redirect illegal instruction traps to guest
 
 - SBI PMU support for guest
 
 s390:
 
 - Two patches sorting out confusion between virtual and physical
   addresses, which currently are the same on s390.
 
 - A new ioctl that performs cmpxchg on guest memory
 
 - A few fixes
 
 x86:
 
 - Change tdp_mmu to a read-only parameter
 
 - Separate TDP and shadow MMU page fault paths
 
 - Enable Hyper-V invariant TSC control
 
 - Fix a variety of APICv and AVIC bugs, some of them real-world,
   some of them affecting architecurally legal but unlikely to
   happen in practice
 
 - Mark APIC timer as expired if its in one-shot mode and the count
   underflows while the vCPU task was being migrated
 
 - Advertise support for Intel's new fast REP string features
 
 - Fix a double-shootdown issue in the emergency reboot code
 
 - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM
   similar treatment to VMX
 
 - Update Xen's TSC info CPUID sub-leaves as appropriate
 
 - Add support for Hyper-V's extended hypercalls, where "support" at this
   point is just forwarding the hypercalls to userspace
 
 - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and
   MSR filters
 
 - One-off fixes and cleanups
 
 - Fix and cleanup the range-based TLB flushing code, used when KVM is
   running on Hyper-V
 
 - Add support for filtering PMU events using a mask.  If userspace
   wants to restrict heavily what events the guest can use, it can now
   do so without needing an absurd number of filter entries
 
 - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
   support is disabled
 
 - Add PEBS support for Intel Sapphire Rapids
 
 - Fix a mostly benign overflow bug in SEV's send|receive_update_data()
 
 - Move several SVM-specific flags into vcpu_svm
 
 x86 Intel:
 
 - Handle NMI VM-Exits before leaving the noinstr region
 
 - A few trivial cleanups in the VM-Enter flows
 
 - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support
   EPTP switching (or any other VM function) for L1
 
 - Fix a crash when using eVMCS's enlighted MSR bitmaps
 
 Generic:
 
 - Clean up the hardware enable and initialization flow, which was
   scattered around multiple arch-specific hooks.  Instead, just
   let the arch code call into generic code.  Both x86 and ARM should
   benefit from not having to fight common KVM code's notion of how
   to do initialization.
 
 - Account allocations in generic kvm_arch_alloc_vm()
 
 - Fix a memory leak if coalesced MMIO unregistration fails
 
 selftests:
 
 - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit
   the correct hypercall instruction instead of relying on KVM to patch
   in VMMCALL
 
 - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmP2YA0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPg/Qf+J6nT+TkIa+8Ei+fN1oMTDp4YuIOx
 mXvJ9mRK9sQ+tAUVwvDz3qN/fK5mjsYbRHIDlVc5p2Q3bCrVGDDqXPFfCcLx1u+O
 9U9xjkO4JxD2LS9pc70FYOyzVNeJ8VMGOBbC2b0lkdYZ4KnUc6e/WWFKJs96bK+H
 duo+RIVyaMthnvbTwSv1K3qQb61n6lSJXplywS8KWFK6NZAmBiEFDAWGRYQE9lLs
 VcVcG0iDJNL/BQJ5InKCcvXVGskcCm9erDszPo7w4Bypa4S9AMS42DHUaRZrBJwV
 /WqdH7ckIz7+OSV0W1j+bKTHAFVTCjXYOM7wQykgjawjICzMSnnG9Gpskw==
 =goe1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Provide a virtual cache topology to the guest to avoid
     inconsistencies with migration on heterogenous systems. Non secure
     software has no practical need to traverse the caches by set/way in
     the first place

   - Add support for taking stage-2 access faults in parallel. This was
     an accidental omission in the original parallel faults
     implementation, but should provide a marginal improvement to
     machines w/o FEAT_HAFDBS (such as hardware from the fruit company)

   - A preamble to adding support for nested virtualization to KVM,
     including vEL2 register state, rudimentary nested exception
     handling and masking unsupported features for nested guests

   - Fixes to the PSCI relay that avoid an unexpected host SVE trap when
     resuming a CPU when running pKVM

   - VGIC maintenance interrupt support for the AIC

   - Improvements to the arch timer emulation, primarily aimed at
     reducing the trap overhead of running nested

   - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
     interest of CI systems

   - Avoid VM-wide stop-the-world operations when a vCPU accesses its
     own redistributor

   - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected
     exceptions in the host

   - Aesthetic and comment/kerneldoc fixes

   - Drop the vestiges of the old Columbia mailing list and add [Oliver]
     as co-maintainer

  RISC-V:

   - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE

   - Correctly place the guest in S-mode after redirecting a trap to the
     guest

   - Redirect illegal instruction traps to guest

   - SBI PMU support for guest

  s390:

   - Sort out confusion between virtual and physical addresses, which
     currently are the same on s390

   - A new ioctl that performs cmpxchg on guest memory

   - A few fixes

  x86:

   - Change tdp_mmu to a read-only parameter

   - Separate TDP and shadow MMU page fault paths

   - Enable Hyper-V invariant TSC control

   - Fix a variety of APICv and AVIC bugs, some of them real-world, some
     of them affecting architecurally legal but unlikely to happen in
     practice

   - Mark APIC timer as expired if its in one-shot mode and the count
     underflows while the vCPU task was being migrated

   - Advertise support for Intel's new fast REP string features

   - Fix a double-shootdown issue in the emergency reboot code

   - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give
     SVM similar treatment to VMX

   - Update Xen's TSC info CPUID sub-leaves as appropriate

   - Add support for Hyper-V's extended hypercalls, where "support" at
     this point is just forwarding the hypercalls to userspace

   - Clean up the kvm->lock vs. kvm->srcu sequences when updating the
     PMU and MSR filters

   - One-off fixes and cleanups

   - Fix and cleanup the range-based TLB flushing code, used when KVM is
     running on Hyper-V

   - Add support for filtering PMU events using a mask. If userspace
     wants to restrict heavily what events the guest can use, it can now
     do so without needing an absurd number of filter entries

   - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
     support is disabled

   - Add PEBS support for Intel Sapphire Rapids

   - Fix a mostly benign overflow bug in SEV's
     send|receive_update_data()

   - Move several SVM-specific flags into vcpu_svm

  x86 Intel:

   - Handle NMI VM-Exits before leaving the noinstr region

   - A few trivial cleanups in the VM-Enter flows

   - Stop enabling VMFUNC for L1 purely to document that KVM doesn't
     support EPTP switching (or any other VM function) for L1

   - Fix a crash when using eVMCS's enlighted MSR bitmaps

  Generic:

   - Clean up the hardware enable and initialization flow, which was
     scattered around multiple arch-specific hooks. Instead, just let
     the arch code call into generic code. Both x86 and ARM should
     benefit from not having to fight common KVM code's notion of how to
     do initialization

   - Account allocations in generic kvm_arch_alloc_vm()

   - Fix a memory leak if coalesced MMIO unregistration fails

  selftests:

   - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to
     emit the correct hypercall instruction instead of relying on KVM to
     patch in VMMCALL

   - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits)
  KVM: SVM: hyper-v: placate modpost section mismatch error
  KVM: x86/mmu: Make tdp_mmu_allowed static
  KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
  KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
  KVM: arm64: nv: Filter out unsupported features from ID regs
  KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
  KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
  KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
  KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
  KVM: arm64: nv: Handle SMCs taken from virtual EL2
  KVM: arm64: nv: Handle trapped ERET from virtual EL2
  KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
  KVM: arm64: nv: Support virtual EL2 exceptions
  KVM: arm64: nv: Handle HCR_EL2.NV system register traps
  KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
  KVM: arm64: nv: Add EL2 system registers to vcpu context
  KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
  KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
  KVM: arm64: nv: Introduce nested virtualization VCPU feature
  KVM: arm64: Use the S2 MMU context to iterate over S2 table
  ...
2023-02-25 11:30:21 -08:00
Linus Torvalds 70756b49be It has been a moderately calm cycle for documentation; the significant
changes include:
 
 - Some significant additions to the memory-management documentation
 
 - Some improvements to navigation in the HTML-rendered docs
 
 - More Spanish and Chinese translations
 
 ...and the usual set of typo fixes and such.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmPzkQUPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YC0QH/09u10xV3N+RuveNE/tArVxKcQi7JZd/xugQ
 toSXygh64WY10lzwi7Ms1bHZzpPYB0fOrqTGNqNQuhrVTjQzaZB0BBJqm8lwt2w/
 S/Z5wj+IicJTmQ7+0C2Hc/dcK5SCPfY3CgwqOUVdr3dEm1oU+4QaBy31fuIJJ0Hx
 NdbXBco8BZqJX9P67jwp9vbrFrSGBjPI0U4HNHVjrWlcBy8JT0aAnf0fyWFy3orA
 T86EzmEw8drA1mXsHa5pmVwuHDx2X+D+eRurG9llCBrlIG9EDSmnalY4BeGqR4LS
 oDrEH6M91I5+9iWoJ0rBheD8rPclXO2HpjXLApXzTjrORgEYZsM=
 =MCdX
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.3' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It has been a moderately calm cycle for documentation; the significant
  changes include:

   - Some significant additions to the memory-management documentation

   - Some improvements to navigation in the HTML-rendered docs

   - More Spanish and Chinese translations

  ... and the usual set of typo fixes and such"

* tag 'docs-6.3' of git://git.lwn.net/linux: (68 commits)
  Documentation/watchdog/hpwdt: Fix Format
  Documentation/watchdog/hpwdt: Fix Reference
  Documentation: core-api: padata: correct spelling
  docs/mm: Physical Memory: correct spelling in reference to CONFIG_PAGE_EXTENSION
  docs: Use HTML comments for the kernel-toc SPDX line
  docs: Add more information to the HTML sidebar
  Documentation: KVM: Update AMD memory encryption link
  printk: Document that CONFIG_BOOT_PRINTK_DELAY required for boot_delay=
  Documentation: userspace-api: correct spelling
  Documentation: sparc: correct spelling
  Documentation: driver-api: correct spelling
  Documentation: admin-guide: correct spelling
  docs: add workload-tracing document to admin-guide
  docs/admin-guide/mm: remove useless markup
  docs/mm: remove useless markup
  docs/mm: Physical Memory: remove useless markup
  docs/sp_SP: Add process magic-number translation
  docs: ftrace: always use canonical ftrace path
  Doc/damon: fix the data path error
  dma-buf: Add "dma-buf" to title of documentation
  ...
2023-02-22 12:00:20 -08:00
Paolo Bonzini e4922088f8 * Two more V!=R patches
* The last part of the cmpxchg patches
 * A few fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmPkwH0ACgkQ41TmuOI4
 ufhrshAAmv9OlCNVsGTmQLpEnGdnxGM2vBPDEygdi+oVHtpMBFn27R3fu295aUR0
 v0o3xsSImhaOU03OxWrsLqPanEL5BqnicLwkL4xou3NXXD4Wo0Zrstd3ykfaODhq
 bTDx7zC2zMQ5J+LPuwDaYUat5R0bHv7cULv1CKLdyISnPGafy0kpUPvC30nymJZi
 nV7/DjvDYbuOFfhdTEOklGRXvMSEBPLGhIJk/cYZzJECNeNJFUeSs+00uNJ8P6WO
 BQD/FLWie+Fn6lTGIUhulZCPf65KI4bHHLB6WFXA5Jy+O08urdtLiZwlBC4iNsFV
 NFIwangpJ/RnupJoOMwQfw31op5SZuiOYn91njaGIiLpHgvA9+iaERsqXtjp8NW7
 /ne1TZqtrGbYY71XvZ/yPQU5VGc/MG1CyCGX1CPNSQO7v4yl27BNChxdkBHzzm2u
 C0IuLZuXl25XwAt8xbdi65fb84pJOeWRU4Zoe4cUZ3drBy5cZsmFXe3lhEAqs7nf
 MB9XekTLpZ6pCqTE1u/BOrobVg5es/lDQiDeLCvDe1I3I5inSD6ehjJz7qjK0w8o
 3pn0rb+Kb4Ijzfi4RNbgJXmBNzkwwSSPPwYt4THHOZtr8p0fZMBeGHqq1wTJmKcq
 M/+9w4cZqgFpdyNqitj8NyTayX1Lj4LWayexCBYaGkLuHTD6cCk=
 =HOly
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-6.3-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

* Two more V!=R patches
* The last part of the cmpxchg patches
* A few fixes
2023-02-15 12:35:26 -05:00
Paolo Bonzini 33436335e9 KVM/riscv changes for 6.3
- Fix wrong usage of PGDIR_SIZE to check page sizes
 - Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
 - Redirect illegal instruction traps to guest
 - SBI PMU support for guest
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmPifFIACgkQrUjsVaLH
 LAcEyxAAinMBaBhiPmwWZQvcCzh/UFmJo8BQCwAPuwoc/a4ZGAR7ylzd0oJilP8M
 wSgX6Ad8XF+CEW2VpxW9nwyi41N25ep1Lrf8vOaWy9L9QNUo0t15WrCIbXT2p399
 HrK9fz7HHKKIMsJy+rYb9EepdmMf55xtr1Y/EjyvhoDQbrEMlKsAODYz/SUoriQG
 Tn3cCYBzLdvzDzu0xXM9v+nsetWXdajK/v4je+mE3NQceXhePAO4oVWP4IpnoROd
 ZQm3evvVdf0WtKG9curxwMB7jjBqDBFrcLYl0qHGa7pi2o5PzVM7esgaV47KwetH
 IgA/Mrf1IfzpgM7VYDDax5wUHlKj63KisqU0J8rU3PUloQXaWqv7+ho51t9GzZ/i
 9x4uyO/evVntgyTw6HCbqmQJDgEtJiG1ydrR/ydBMYHLnh7LPY2UpKgcqmirtbkK
 1/DYDp84vikQ5VW1hc8IACdoBShh9Moh4xsEStzkTrIeHcZCjtORXUh8UIPZ0Mu2
 7Mnkktu9I55SLwA3rwH/EYT1ISrOV1G+q3wfqgeLpn8YUWwCIiqWQ5Ur0/WSMJse
 uJ3HedZDzj9T4n4khX+mKEYh6joAafQZag+4TID2lRSwd0S/mpeC22hYrViMdDmq
 yhE+JNin/sz4AVaHNzGwfqk2NC2RFl9aRn2X0xTwyBubif9pKMQ=
 =spUL
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.3-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.3

- Fix wrong usage of PGDIR_SIZE to check page sizes
- Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
2023-02-15 12:33:28 -05:00
Janis Schoetterl-Glausch a7b0417328 Documentation: KVM: s390: Describe KVM_S390_MEMOP_F_CMPXCHG
Describe the semantics of the new KVM_S390_MEMOP_F_CMPXCHG flag for
absolute vm write memops which allows user space to perform (storage key
checked) cmpxchg operations on guest memory.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-14-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-14-scgl@linux.ibm.com>
[frankja@de.ibm.com: Removed a line from an earlier version]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:06:00 +01:00
Nico Boehr f2d3155e2a KVM: s390: disable migration mode when dirty tracking is disabled
Migration mode is a VM attribute which enables tracking of changes in
storage attributes (PGSTE). It assumes dirty tracking is enabled on all
memslots to keep a dirty bitmap of pages with changed storage attributes.

When enabling migration mode, we currently check that dirty tracking is
enabled for all memslots. However, userspace can disable dirty tracking
without disabling migration mode.

Since migration mode is pointless with dirty tracking disabled, disable
migration mode whenever userspace disables dirty tracking on any slot.

Also update the documentation to clarify that dirty tracking must be
enabled when enabling migration mode, which is already enforced by the
code in kvm_s390_vm_start_migration().

Also highlight in the documentation for KVM_S390_GET_CMMA_BITS that it
can now fail with -EINVAL when dirty tracking is disabled while
migration mode is on. Move all the error codes to a table so this stays
readable.

To disable migration mode, slots_lock should be held, which is taken
in kvm_set_memory_region() and thus held in
kvm_arch_prepare_memory_region().

Restructure the prepare code a bit so all the sanity checking is done
before disabling migration mode. This ensures migration mode isn't
disabled when some sanity check fails.

Cc: stable@vger.kernel.org
Fixes: 190df4a212 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20230127140532.230651-2-nrb@linux.ibm.com
Message-Id: <20230127140532.230651-2-nrb@linux.ibm.com>
[frankja@linux.ibm.com: fixed commit message typo, moved api.rst error table upwards]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Paolo Bonzini 25b72cf7da KVM/arm64 fixes for 6.2, take #3
- Yet another fix for non-CPU accesses to the memory backing
   the VGICv3 subsystem
 
 - A set of fixes for the setlftest checking for the S1PTW
   behaviour after the fix that went in ealier in the cycle
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmPWwGEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDi/wP/3kbZrZ+y/YNcYioQwqibRS5DKuACKXM1dbh
 sMX0e8t3frmkrfkHZ1FsBNjSWtDLmRbjANNDWi8ypAXaPVm7/0whFqkJgyPWDO+v
 /1VXYMwMjy2zpWfPGPu+/fQL0Ninp+EfLP3Y2/Lr8VW5rH21bfuQ1rm41ucK/jB5
 IsMiQ+YObZUTrSq22fHfNJKc8fysSqeMHW96bl0QnJxf6aDDieZFGF9rlRQf/faq
 lPux0faasgQC0VgXlokWGdU1x5kXIf3Ta4VtiKARKNwxziuG8B484+5hHXvoBR1h
 bXFJJUQjQs2qBuH75BJftini9fvWvQPgbk4NvkD1tlyMhlZ5w2MTTKB4QmuW/WDT
 OGuGXAcuP2stm0dUaSn1aCwzfYgtihssp+RCAB5DOoL64i/CtHl+FJgz8wZfDPRk
 UNXdK2JccDfD6bGv/kQqPJoozjI5e8Ha2ks1O4IPHIDpIsVMIWRRGULgIRvLaHaS
 iaR7Vx+XgzW50Knj++S85eak/aTSkVaykYZIiiB4DTai1/XuAZfMA79X6IvQLxHq
 419FHmXwhJmYdWZ/JFBXWnbR6wRJiv4TR23A5u8X6o/YgBn6fmwAt6o8Avk1quZQ
 mslRPHG45hM/7Z7uSEsIQnbVVnHPhbaKr3GmHlJJ4zXRI8GaSMe23wpnJdUj1q9a
 w1Oe0rpq
 =2l/n
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.2, take #3

- Yet another fix for non-CPU accesses to the memory backing
  the VGICv3 subsystem

- A set of fixes for the setlftest checking for the S1PTW
  behaviour after the fix that went in ealier in the cycle
2023-02-04 08:57:43 -05:00
Wyes Karny fbabc2eaef Documentation: KVM: Update AMD memory encryption link
Update AMD memory encryption white-paper document link.
Previous link is not available. Update new available link.

Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Carlos Bilbao <carlos.bilbao@amd.com>
Link: https://lore.kernel.org/r/20230125175948.21100-1-wyes.karny@amd.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-02-02 11:32:57 -07:00
Gavin Shan 6028acbe3a KVM: arm64: Allow no running vcpu on saving vgic3 pending table
We don't have a running VCPU context to save vgic3 pending table due
to KVM_DEV_ARM_VGIC_{GRP_CTRL, SAVE_PENDING_TABLES} command on KVM
device "kvm-arm-vgic-v3". The unknown case is caught by kvm-unit-tests.

   # ./kvm-unit-tests/tests/its-pending-migration
   WARNING: CPU: 120 PID: 7973 at arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3325 \
   mark_page_dirty_in_slot+0x60/0xe0
    :
   mark_page_dirty_in_slot+0x60/0xe0
   __kvm_write_guest_page+0xcc/0x100
   kvm_write_guest+0x7c/0xb0
   vgic_v3_save_pending_tables+0x148/0x2a0
   vgic_set_common_attr+0x158/0x240
   vgic_v3_set_attr+0x4c/0x5c
   kvm_device_ioctl+0x100/0x160
   __arm64_sys_ioctl+0xa8/0xf0
   invoke_syscall.constprop.0+0x7c/0xd0
   el0_svc_common.constprop.0+0x144/0x160
   do_el0_svc+0x34/0x60
   el0_svc+0x3c/0x1a0
   el0t_64_sync_handler+0xb4/0x130
   el0t_64_sync+0x178/0x17c

Use vgic_write_guest_lock() to save vgic3 pending table.

Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-5-gshan@redhat.com
2023-01-29 18:46:11 +00:00
Gavin Shan 2f8b1ad222 KVM: arm64: Allow no running vcpu on restoring vgic3 LPI pending status
We don't have a running VCPU context to restore vgic3 LPI pending status
due to command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} on KVM
device "kvm-arm-vgic-its".

Use vgic_write_guest_lock() to restore vgic3 LPI pending status.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-4-gshan@redhat.com
2023-01-29 18:46:11 +00:00
Wang Yong 608348285a Documentation: KVM: fix typos in running-nested-guests.rst
change "gues" to "guest" and remove redundant ")".

Signed-off-by: Wang Yong <yongw.kernel@gmail.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Link: https://lore.kernel.org/r/20230110150046.549755-1-yongw.kernel@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-01-26 11:23:32 -07:00
SeongJae Park 941c95fdd6 Docs/subsystem-apis: Remove '[The ]Linux' prefixes from titles of listed documents
Some documents that listed on subsystem-apis have 'Linux' or 'The Linux'
title prefixes.  It's duplicated information, and makes finding the
document of interest with human eyes not easy.  Remove the prefixes from
the titles.

Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230122184834.181977-1-sj@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-01-24 15:27:08 -07:00
Aaron Lewis 14329b825f KVM: x86/pmu: Introduce masked events to the pmu event filter
When building a list of filter events, it can sometimes be a challenge
to fit all the events needed to adequately restrict the guest into the
limited space available in the pmu event filter.  This stems from the
fact that the pmu event filter requires each event (i.e. event select +
unit mask) be listed, when the intention might be to restrict the
event select all together, regardless of it's unit mask.  Instead of
increasing the number of filter events in the pmu event filter, add a
new encoding that is able to do a more generalized match on the unit mask.

Introduce masked events as another encoding the pmu event filter
understands.  Masked events has the fields: mask, match, and exclude.
When filtering based on these events, the mask is applied to the guest's
unit mask to see if it matches the match value (i.e. umask & mask ==
match).  The exclude bit can then be used to exclude events from that
match.  E.g. for a given event select, if it's easier to say which unit
mask values shouldn't be filtered, a masked event can be set up to match
all possible unit mask values, then another masked event can be set up to
match the unit mask values that shouldn't be filtered.

Userspace can query to see if this feature exists by looking for the
capability, KVM_CAP_PMU_EVENT_MASKED_EVENTS.

This feature is enabled by setting the flags field in the pmu event
filter to KVM_PMU_EVENT_FLAG_MASKED_EVENTS.

Events can be encoded by using KVM_PMU_ENCODE_MASKED_ENTRY().

It is an error to have a bit set outside the valid bits for a masked
event, and calls to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in
such cases, including the high bits of the event select (35:32) if
called on Intel.

With these updates the filter matching code has been updated to match on
a common event.  Masked events were flexible enough to handle both event
types, so they were used as the common event.  This changes how guest
events get filtered because regardless of the type of event used in the
uAPI, they will be converted to masked events.  Because of this there
could be a slight performance hit because instead of matching the filter
event with a lookup on event select + unit mask, it does a lookup on event
select then walks the unit masks to find the match.  This shouldn't be a
big problem because I would expect the set of common event selects to be
small, and if they aren't the set can likely be reduced by using masked
events to generalize the unit mask.  Using one type of event when
filtering guest events allows for a common code path to be used.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20221220161236.555143-5-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-24 10:06:12 -08:00
Paolo Bonzini f15a87c006 Merge branch 'kvm-lapic-fix-and-cleanup' into HEAD
The first half or so patches fix semi-urgent, real-world relevant APICv
and AVIC bugs.

The second half fixes a variety of AVIC and optimized APIC map bugs
where KVM doesn't play nice with various edge cases that are
architecturally legal(ish), but are unlikely to occur in most real world
scenarios

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-24 06:08:01 -05:00
Paolo Bonzini dc7c31e922 Merge branch 'kvm-v6.2-rc4-fixes' into HEAD
ARM:

* Fix the PMCR_EL0 reset value after the PMU rework

* Correctly handle S2 fault triggered by a S1 page table walk
  by not always classifying it as a write, as this breaks on
  R/O memslots

* Document why we cannot exit with KVM_EXIT_MMIO when taking
  a write fault from a S1 PTW on a R/O memslot

* Put the Apple M2 on the naughty list for not being able to
  correctly implement the vgic SEIS feature, just like the M1
  before it

* Reviewer updates: Alex is stepping down, replaced by Zenghui

x86:

* Fix various rare locking issues in Xen emulation and teach lockdep
  to detect them

* Documentation improvements

* Do not return host topology information from KVM_GET_SUPPORTED_CPUID
2023-01-24 06:05:23 -05:00
Sean Christopherson 5b84b02917 KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
Apply KVM's hotplug hack if and only if userspace has enabled 32-bit IDs
for x2APIC.  If 32-bit IDs are not enabled, disable the optimized map to
honor x86 architectural behavior if multiple vCPUs shared a physical APIC
ID.  As called out in the changelog that added the hack, all CPUs whose
(possibly truncated) APIC ID matches the target are supposed to receive
the IPI.

  KVM intentionally differs from real hardware, because real hardware
  (Knights Landing) does just "x2apic_id & 0xff" to decide whether to
  accept the interrupt in xAPIC mode and it can deliver one interrupt to
  more than one physical destination, e.g. 0x123 to 0x123 and 0x23.

Applying the hack even when x2APIC is not fully enabled means KVM doesn't
correctly handle scenarios where the guest has aliased xAPIC IDs across
multiple vCPUs, as only the vCPU with the lowest vCPU ID will receive any
interrupts.  It's extremely unlikely any real world guest aliases APIC
IDs, or even modifies APIC IDs, but KVM's behavior is arbitrary, e.g. the
lowest vCPU ID "wins" regardless of which vCPU is "aliasing" and which
vCPU is "normal".

Furthermore, the hack is _not_ guaranteed to work!  The hack works if and
only if the optimized APIC map is successfully allocated.  If the map
allocation fails (unlikely), KVM will fall back to its unoptimized
behavior, which _does_ honor the architectural behavior.

Pivot on 32-bit x2APIC IDs being enabled as that is required to take
advantage of the hotplug hack (see kvm_apic_state_fixup()), i.e. won't
break existing setups unless they are way, way off in the weeds.

And an entry in KVM's errata to document the hack.  Alternatively, KVM
could provide an actual x2APIC quirk and document the hack that way, but
there's unlikely to ever be a use case for disabling the quirk.  Go the
errata route to avoid having to validate a quirk no one cares about.

Fixes: 5bd5db385b ("KVM: x86: allow hotplug of VCPU with APIC ID over 0xff")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230106011306.85230-23-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-13 10:45:30 -05:00
David Woodhouse 310bc39546 KVM: x86/xen: Avoid deadlock by adding kvm->arch.xen.xen_lock leaf node lock
In commit 14243b3871 ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN
and event channel delivery") the clever version of me left some helpful
notes for those who would come after him:

       /*
        * For the irqfd workqueue, using the main kvm->lock mutex is
        * fine since this function is invoked from kvm_set_irq() with
        * no other lock held, no srcu. In future if it will be called
        * directly from a vCPU thread (e.g. on hypercall for an IPI)
        * then it may need to switch to using a leaf-node mutex for
        * serializing the shared_info mapping.
        */
       mutex_lock(&kvm->lock);

In commit 2fd6df2f2b ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
the other version of me ran straight past that comment without reading it,
and introduced a potential deadlock by taking vcpu->mutex and kvm->lock
in the wrong order.

Solve this as originally suggested, by adding a leaf-node lock in the Xen
state rather than using kvm->lock for it.

Fixes: 2fd6df2f2b ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20230111180651.14394-4-dwmw2@infradead.org>
[Rebase, add docs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-11 17:45:58 -05:00
Paolo Bonzini 71d0393576 KVM/arm64 fixes for 6.2, take #1
- Fix the PMCR_EL0 reset value after the PMU rework
 
 - Correctly handle S2 fault triggered by a S1 page table walk
   by not always classifying it as a write, as this breaks on
   R/O memslots
 
 - Document why we cannot exit with KVM_EXIT_MMIO when taking
   a write fault from a S1 PTW on a R/O memslot
 
 - Put the Apple M2 on the naughty step for not being able to
   correctly implement the vgic SEIS feature, just liek the M1
   before it
 
 - Reviewer updates: Alex is stepping down, replaced by Zenghui
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmO27gQPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDwioP/A0UE7ujSxv3dlBstBhmtzOoX64pRufX01Kr
 1oF24M1VuTVLwl3pp1nWH10SVWv5kukYZJAJ/3tDJOaMt/Q9c0exPCPc95i2p/r7
 OC9j8rZVZnjGN6sAP5zazIT67tSanyLDeCC+j4J1pw20r2tB67LKSOoozEb5How7
 CX+Oa2OiEiI34jp33v3mFQ3VxY3714QUMBUK7n+L29IFXGmQp6dfbhn2iY3uNpoU
 YYrkPzBLUC1H//oCx0qoDDCXXeOKMGuWP1At5GIDz6ZSCBVpKdVbftCC59Dk7dDz
 7BdQ5JoEc15RTZajdopOog4RV4YHP8VszaClhCA1ML0Pd2Mf4UVLlPnn7F+3yR3r
 pMgjlOAlLJwHiwggJZ0EQ0wFdx9LuGeu3OwckGE/JxeEwaMdzGAEfcFoAGZV0ExZ
 7riiKS+NmtrkuE9wJfWOrpDiseymmUbuhHq+F/HDq/SP6UdezAylkcxZRuN/ZCRc
 9XVhTcWu/UPxoaSSd/sB4l9X8Ey/cZe28+kV7eE/m2g79bZKxHd4UUOUymb/aJxj
 og10A6i0B1DOWMtKJ9hEsB6wI6Hllrqcbo8ewX1znKoKbfHZDeU/N5D4ZvTz85sf
 zyqbsSZPDxMOwBPYTqZqG65tEWWw68HIJ9cqQzKDehN1Xm1coNIWSPrUnBMpSsWJ
 qDQNmIzf
 =XBtQ
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm64 fixes for 6.2, take #1

- Fix the PMCR_EL0 reset value after the PMU rework

- Correctly handle S2 fault triggered by a S1 page table walk
  by not always classifying it as a write, as this breaks on
  R/O memslots

- Document why we cannot exit with KVM_EXIT_MMIO when taking
  a write fault from a S1 PTW on a R/O memslot

- Put the Apple M2 on the naughty step for not being able to
  correctly implement the vgic SEIS feature, just liek the M1
  before it

- Reviewer updates: Alex is stepping down, replaced by Zenghui
2023-01-11 13:31:53 -05:00
Paolo Bonzini 3a9ae31ac2 Documentation: kvm: fix SRCU locking order docs
kvm->srcu is taken in KVM_RUN and several other vCPU ioctls, therefore
vcpu->mutex is susceptible to the same deadlock that is documented
for kvm->slots_lock.  The same holds for kvm->lock, since kvm->lock
is held outside vcpu->mutex.  Fix the documentation and rearrange it
to highlight the difference between these locks and kvm->slots_arch_lock,
and how kvm->slots_arch_lock can be useful while processing a vmexit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-11 13:31:33 -05:00
Paolo Bonzini 45e966fcca KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID
Passing the host topology to the guest is almost certainly wrong
and will confuse the scheduler.  In addition, several fields of
these CPUID leaves vary on each processor; it is simply impossible to
return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
they can be passed to KVM_SET_CPUID2.

The values that will most likely prevent confusion are all zeroes.
Userspace will have to override it anyway if it wishes to present a
specific topology to the guest.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-09 05:35:21 -05:00
Marc Zyngier afbb1b1cae Merge branch kvm-arm64/s1ptw-write-fault into kvmarm-master/fixes
* kvm-arm64/s1ptw-write-fault:
  : .
  : Fix S1PTW fault handling that was until then always taken
  : as a write. From the cover letter:
  :
  : `Recent developments on the EFI front have resulted in guests that
  : simply won't boot if the page tables are in a read-only memslot and
  : that you're a bit unlucky in the way S2 gets paged in... The core
  : issue is related to the fact that we treat a S1PTW as a write, which
  : is close enough to what needs to be done. Until to get to RO memslots.
  :
  : The first patch fixes this and is definitely a stable candidate. It
  : splits the faulting of page tables in two steps (RO translation fault,
  : followed by a writable permission fault -- should it even happen).
  : The second one documents the slightly odd behaviour of PTW writes to
  : RO memslot, which do not result in a KVM_MMIO exit. The last patch is
  : totally optional, only tangentially related, and randomly repainting
  : stuff (maybe that's contagious, who knows)."
  :
  : .
  KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*
  KVM: arm64: Document the behaviour of S1PTW faults on RO memslots
  KVM: arm64: Fix S1PTW handling on RO memslots

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-01-05 15:25:54 +00:00
Marc Zyngier b8f8d190fa KVM: arm64: Document the behaviour of S1PTW faults on RO memslots
Although the KVM API says that a write to a RO memslot must result
in a KVM_EXIT_MMIO describing the write, the arm64 architecture
doesn't provide the *data* written by a Stage-1 page table walk
(we only get the address).

Since there isn't much userspace can do with so little information
anyway, document the fact that such an access results in a guest
exception, not an exit. This is consistent with the guest being
terminally broken anyway.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-01-03 10:01:52 +00:00
Isaku Yamahata 0bf50497f0 KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock
Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock now
that KVM hooks CPU hotplug during the ONLINE phase, which can sleep.
Previously, KVM hooked the STARTING phase, which is not allowed to sleep
and thus could not take kvm_lock (a mutex).  This effectively allows the
task that's initiating hardware enabling/disabling to preempted and/or
migrated.

Note, the Documentation/virt/kvm/locking.rst statement that kvm_count_lock
is "raw" because hardware enabling/disabling needs to be atomic with
respect to migration is wrong on multiple fronts.  First, while regular
spinlocks can be preempted, the task holding the lock cannot be migrated.
Second, preventing migration is not required.  on_each_cpu() disables
preemption, which ensures that cpus_hardware_enabled correctly reflects
hardware state.  The task may be preempted/migrated between bumping
kvm_usage_count and invoking on_each_cpu(), but that's perfectly ok as
kvm_usage_count is still protected, e.g. other tasks that call
hardware_enable_all() will be blocked until the preempted/migrated owner
exits its critical section.

KVM does have lockless accesses to kvm_usage_count in the suspend/resume
flows, but those are safe because all tasks must be frozen prior to
suspending CPUs, and a task cannot be frozen while it holds one or more
locks (userspace tasks are frozen via a fake signal).

Preemption doesn't need to be explicitly disabled in the hotplug path.
The hotplug thread is pinned to the CPU that's being hotplugged, and KVM
only cares about having a stable CPU, i.e. to ensure hardware is enabled
on the correct CPU.  Lockep, i.e. check_preemption_disabled(), plays nice
with this state too, as is_percpu_thread() is true for the hotplug thread.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-45-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:48:34 -05:00
Sean Christopherson 3af4a9e61e KVM: x86: Serialize vendor module initialization (hardware setup)
Acquire a new mutex, vendor_module_lock, in kvm_x86_vendor_init() while
doing hardware setup to ensure that concurrent calls are fully serialized.
KVM rejects attempts to load vendor modules if a different module has
already been loaded, but doesn't handle the case where multiple vendor
modules are loaded at the same time, and module_init() doesn't run under
the global module_mutex.

Note, in practice, this is likely a benign bug as no platform exists that
supports both SVM and VMX, i.e. barring a weird VM setup, one of the
vendor modules is guaranteed to fail a support check before modifying
common KVM state.

Alternatively, KVM could perform an atomic CMPXCHG on .hardware_enable,
but that comes with its own ugliness as it would require setting
.hardware_enable before success is guaranteed, e.g. attempting to load
the "wrong" could result in spurious failure to load the "right" module.

Introduce a new mutex as using kvm_lock is extremely deadlock prone due
to kvm_lock being taken under cpus_write_lock(), and in the future, under
under cpus_read_lock().  Any operation that takes cpus_read_lock() while
holding kvm_lock would potentially deadlock, e.g. kvm_timer_init() takes
cpus_read_lock() to register a callback.  In theory, KVM could avoid
such problematic paths, i.e. do less setup under kvm_lock, but avoiding
all calls to cpus_read_lock() is subtly difficult and thus fragile.  E.g.
updating static calls also acquires cpus_read_lock().

Inverting the lock ordering, i.e. always taking kvm_lock outside
cpus_read_lock(), is not a viable option as kvm_lock is taken in various
callbacks that may be invoked under cpus_read_lock(), e.g. x86's
kvmclock_cpufreq_notifier().

The lockdep splat below is dependent on future patches to take
cpus_read_lock() in hardware_enable_all(), but as above, deadlock is
already is already possible.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.0.0-smp--7ec93244f194-init2 #27 Tainted: G           O
  ------------------------------------------------------
  stable/251833 is trying to acquire lock:
  ffffffffc097ea28 (kvm_lock){+.+.}-{3:3}, at: hardware_enable_all+0x1f/0xc0 [kvm]

               but task is already holding lock:
  ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]

               which lock already depends on the new lock.

               the existing dependency chain (in reverse order) is:

               -> #1 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2a/0xa0
         __cpuhp_setup_state+0x2b/0x60
         __kvm_x86_vendor_init+0x16a/0x1870 [kvm]
         kvm_x86_vendor_init+0x23/0x40 [kvm]
         0xffffffffc0a4d02b
         do_one_initcall+0x110/0x200
         do_init_module+0x4f/0x250
         load_module+0x1730/0x18f0
         __se_sys_finit_module+0xca/0x100
         __x64_sys_finit_module+0x1d/0x20
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd

               -> #0 (kvm_lock){+.+.}-{3:3}:
         __lock_acquire+0x16f4/0x30d0
         lock_acquire+0xb2/0x190
         __mutex_lock+0x98/0x6f0
         mutex_lock_nested+0x1b/0x20
         hardware_enable_all+0x1f/0xc0 [kvm]
         kvm_dev_ioctl+0x45e/0x930 [kvm]
         __se_sys_ioctl+0x77/0xc0
         __x64_sys_ioctl+0x1d/0x20
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd

               other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(cpu_hotplug_lock);
                                 lock(kvm_lock);
                                 lock(cpu_hotplug_lock);
    lock(kvm_lock);

                *** DEADLOCK ***

  1 lock held by stable/251833:
   #0: ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:03 -05:00
Paolo Bonzini fc471e8310 Merge branch 'kvm-late-6.1' into HEAD
x86:

* Change tdp_mmu to a read-only parameter

* Separate TDP and shadow MMU page fault paths

* Enable Hyper-V invariant TSC control

selftests:

* Use TAP interface for kvm_binary_stats_test and tsc_msrs_test

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:36:47 -05:00
Paolo Bonzini a5496886eb Merge branch 'kvm-late-6.1-fixes' into HEAD
x86:

* several fixes to nested VMX execution controls

* fixes and clarification to the documentation for Xen emulation

* do not unnecessarily release a pmu event with zero period

* MMU fixes

* fix Coverity warning in kvm_hv_flush_tlb()

selftests:

* fixes for the ucall mechanism in selftests

* other fixes mostly related to compilation with clang
2022-12-28 07:19:14 -05:00
Paolo Bonzini 02d9a04da4 Documentation: kvm: clarify SRCU locking order
Currently only the locking order of SRCU vs kvm->slots_arch_lock
and kvm->slots_lock is documented.  Extend this to kvm->lock
since Xen emulation got it terribly wrong.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-28 06:02:54 -05:00
David Woodhouse af2808906a KVM: x86/xen: Documentation updates and clarifications
Most notably, the KVM_XEN_EVTCHN_RESET feature had escaped documentation
entirely. Along with how to turn most stuff off on SHUTDOWN_soft_reset.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-6-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:50 -05:00
Sean Christopherson 23e528d9bc KVM: Delete extra block of "};" in the KVM API documentation
Delete an extra block of code/documentation that snuck in when KVM's
documentation was converted to ReST format.

Fixes: 106ee47dc6 ("docs: kvm: Convert api.txt to ReST format")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221207003637.2041211-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:00:51 -05:00
Linus Torvalds 8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of a unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
   running on top of a L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight psuedo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
    in the future.  Anything that tiggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a97d: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     no-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of a unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
     guest running on top of a L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight psuedo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that tiggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Sean Christopherson 549a715b98 KVM: x86: Add proper ReST tables for userspace MSR exits/flags
Add ReST formatting to the set of userspace MSR exits/flags so that the
resulting HTML docs generate a table instead of malformed gunk.  This
also fixes a warning that was introduced by a recent cleanup of the
relevant documentation (yay copy+paste).

 >> Documentation/virt/kvm/api.rst:7287: WARNING: Block quote ends
    without a blank line; unexpected unindent.

Fixes: 1ae099540e ("KVM: x86: Allow deflecting unknown MSR accesses to user space")
Fixes: 1f15814718 ("KVM: x86: Clean up KVM_CAP_X86_USER_SPACE_MSR documentation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221207000959.2035098-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-14 13:28:31 -05:00
Linus Torvalds a89ef2aa55 Add TDX guest attestation infrastructure and driver
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmOXYmwACgkQaDWVMHDJ
 krD8hg/+J0hUTfljmlCctwGZyqVR3Y2E722wL9oTvbgYiUAtFrARzfPF0WNwvHi5
 Ywvod5hQ4unPoluthdVAD/uJqcPVhjIZ7CvNTGrS8J7ED5x5ydGLNWAL3Rn+9s6O
 xkz/DsV4zl+cPQ60XLsO+3Mc6RhwVs9DUthpUovl22epmgmRPCovkHWkvQsZajJq
 ceF/78ThfrkG4dDouaIXi1gsmKLLzU4KdHeBATMg0bgPQXFJZSGBCLaeJXWmLapq
 7N3SznUqDMn4Plr/IuP4XuMA6VTVojrakCcBmw5SGVqhkVWGM1/FMg7jHSQS7Z5V
 5uG7CkhTBqh17v9xKwDMPh34D51TLtNifA7jbecyL5155czFkj7BoSwEFINU/wCz
 agUO9NvK9j1chUnA2UGqGQigM3nWGZHMwaQjfgBWyq5gqF8HURUUrjx6XuunOfmB
 1byyrDu0g48u/zaQ/RpNfewz1ZY+WylDPcqOhYaVWF1PYThStML/VMBKpdsl1Ovw
 nytUdQsaBIjFHQdB+snizaF93+/0FG+FTGAlDnHYmey/8plL2LYuzrcDnDYnGEXa
 tN3HFd2lAi4JBLmvmgF39gH+BLXuKTLweIhwTXZTn91cfire3yxiXAnLd0tuptMP
 aXFddxKMdMpxTqzy2X+8gJjqCr2lZ9gZkxaPsWwrBM+xrJf0p2w=
 =JGnq
 -----END PGP SIGNATURE-----

Merge tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 tdx updates from Dave Hansen:
 "This includes a single chunk of new functionality for TDX guests which
  allows them to talk to the trusted TDX module software and obtain an
  attestation report.

  This report can then be used to prove the trustworthiness of the guest
  to a third party and get access to things like storage encryption
  keys"

* tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/tdx: Test TDX attestation GetReport support
  virt: Add TDX guest driver
  x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module
2022-12-12 14:27:49 -08:00
Paolo Bonzini 9352e7470a Merge remote-tracking branch 'kvm/queue' into HEAD
x86 Xen-for-KVM:

* Allow the Xen runstate information to cross a page boundary

* Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

* add support for 32-bit guests in SCHEDOP_poll

x86 fixes:

* One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

* Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
   years back when eliminating unnecessary barriers when switching between
   vmcs01 and vmcs02.

* Clean up the MSR filter docs.

* Clean up vmread_error_trampoline() to make it more obvious that params
  must be passed on the stack, even for x86-64.

* Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
  of the current guest CPUID.

* Fudge around a race with TSC refinement that results in KVM incorrectly
  thinking a guest needs TSC scaling when running on a CPU with a
  constant TSC, but no hardware-enumerated TSC frequency.

* Advertise (on AMD) that the SMM_CTL MSR is not supported

* Remove unnecessary exports

Selftests:

* Fix an inverted check in the access tracking perf test, and restore
  support for asserting that there aren't too many idle pages when
  running on bare metal.

* Fix an ordering issue in the AMX test introduced by recent conversions
  to use kvm_cpu_has(), and harden the code to guard against similar bugs
  in the future.  Anything that tiggers caching of KVM's supported CPUID,
  kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
  the caching occurs before the test opts in via prctl().

* Fix build errors that occur in certain setups (unsure exactly what is
  unique about the problematic setup) due to glibc overriding
  static_assert() to a variant that requires a custom message.

* Introduce actual atomics for clear/set_bit() in selftests

Documentation:

* Remove deleted ioctls from documentation

* Various fixes
2022-12-12 15:54:07 -05:00
Paolo Bonzini eb5618911a KVM/arm64 updates for 6.2
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 - Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
   option, which multi-process VMMs such as crosvm rely on.
 
 - Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 - Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 - Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 - Add/Enable/Fix a bunch of selftests covering memslots, breakpoints,
   stage-2 faults and access tracking. You name it, we got it, we
   probably broke it.
 
 - Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 As a side effect, this tag also drags:
 
 - The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring
   series
 
 - A shared branch with the arm64 tree that repaints all the system
   registers to match the ARM ARM's naming, and resulting in
   interesting conflicts
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmOODb0PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDztsQAInRnsgLl57/SpqhZzExNCllN6AT/bdeB3uz
 rnw3ScJOV174uNKp8lnPWoTvu2YUGiVtBp6tFHhDI8le7zHX438ZT8KE5mcs8p5i
 KfFKnb8SHV2DDpqkcy24c0Xl/6vsg1qkKrdfJb49yl5ZakRITDpynW/7tn6dXsxX
 wASeGFdCYeW4g2xMQzsCbtx6LgeQ8uomBmzRfPrOtZHYYxAn6+4Mj4595EC1sWxM
 AQnbp8tW3Vw46saEZAQvUEOGOW9q0Nls7G21YqQ52IA+ZVDK1LmAF2b1XY3edjkk
 pX8EsXOURfqdasBxfSfF3SgnUazoz9GHpSzp1cTVTktrPp40rrT7Ldtml0ktq69d
 1malPj47KVMDsIq0kNJGnMxciXFgAHw+VaCQX+k4zhIatNwviMbSop2fEoxj22jc
 4YGgGOxaGrnvmAJhreCIbr4CkZk5CJ8Zvmtfg+QM6npIp8BY8896nvORx/d4i6tT
 H4caadd8AAR56ANUyd3+KqF3x0WrkaU0PLHJLy1tKwOXJUUTjcpvIfahBAAeUlSR
 qEFrtb+EEMPgAwLfNOICcNkPZR/yyuYvM+FiUQNVy5cNiwFkpztpIctfOFaHySGF
 K07O2/a1F6xKL0OKRUg7hGKknF9ecmux4vHhiUMuIk9VOgNTWobHozBDorLKXMzC
 aWa6oGVC
 =iIPT
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.2

- Enable the per-vcpu dirty-ring tracking mechanism, together with an
  option to keep the good old dirty log around for pages that are
  dirtied by something other than a vcpu.

- Switch to the relaxed parallel fault handling, using RCU to delay
  page table reclaim and giving better performance under load.

- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
  option, which multi-process VMMs such as crosvm rely on.

- Merge the pKVM shadow vcpu state tracking that allows the hypervisor
  to have its own view of a vcpu, keeping that state private.

- Add support for the PMUv3p5 architecture revision, bringing support
  for 64bit counters on systems that support it, and fix the
  no-quite-compliant CHAIN-ed counter support for the machines that
  actually exist out there.

- Fix a handful of minor issues around 52bit VA/PA support (64kB pages
  only) as a prefix of the oncoming support for 4kB and 16kB pages.

- Add/Enable/Fix a bunch of selftests covering memslots, breakpoints,
  stage-2 faults and access tracking. You name it, we got it, we
  probably broke it.

- Pick a small set of documentation and spelling fixes, because no
  good merge window would be complete without those.

As a side effect, this tag also drags:

- The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring
  series

- A shared branch with the arm64 tree that repaints all the system
  registers to match the ARM ARM's naming, and resulting in
  interesting conflicts
2022-12-09 09:12:12 +01:00
Marc Zyngier 86f27d849b Merge branch kvm-arm64/misc-6.2 into kvmarm-master/next
* kvm-arm64/misc-6.2:
  : .
  : Misc fixes for 6.2:
  :
  : - Fix formatting for the pvtime documentation
  :
  : - Fix a comment in the VHE-specific Makefile
  : .
  KVM: arm64: Fix typo in comment
  KVM: arm64: Fix pvtime documentation

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:39:12 +00:00
Marc Zyngier 382b5b87a9 Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next
* kvm-arm64/mte-map-shared:
  : .
  : Update the MTE support to allow the VMM to use shared mappings
  : to back the memslots exposed to MTE-enabled guests.
  :
  : Patches courtesy of Catalin Marinas and Peter Collingbourne.
  : .
  : Fix a number of issues with MTE, such as races on the tags
  : being initialised vs the PG_mte_tagged flag as well as the
  : lack of support for VM_SHARED when KVM is involved.
  :
  : Patches from Catalin Marinas and Peter Collingbourne.
  : .
  Documentation: document the ABI changes for KVM_CAP_ARM_MTE
  KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled
  KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled
  arm64: mte: Lock a page for MTE tag initialisation
  mm: Add PG_arch_3 page flag
  KVM: arm64: Simplify the sanitise_mte_tags() logic
  arm64: mte: Fix/clarify the PG_mte_tagged semantics
  mm: Do not enable PG_arch_2 for all 64-bit architectures

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:38:24 +00:00
David Matlack 34e30ebbe4 KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns
Clarify the existing documentation about how KVM_CAP_HALT_POLL and
halt_poll_ns interact to make it clear that VMs using KVM_CAP_HALT_POLL
ignore halt_poll_ns.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20221201195249.3369720-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 13:20:30 -05:00
David Matlack b8b43a4c2e KVM: Move halt-polling documentation into common directory
Move halt-polling.rst into the common KVM documentation directory and
out of the x86-specific directory. Halt-polling is a common feature and
the existing documentation is already written as such.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20221201195249.3369720-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 13:20:30 -05:00
Paolo Bonzini b376144595 Misc KVM x86 fixes and cleanups for 6.2:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
  - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
  - Clean up the MSR filter docs.
 
  - Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
  - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
  - Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmOJQesSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5IR4QAKGPbLRykY/2FohV2HDu5fDPxA2Fe9nu
 5W7ZIptQu+tQtCTWKFEjcQdwYoNrLbr0hr1eGubVbIvBqJbwPQfH7G0765eOIcvy
 s6Zn2N24IisIoUxdkJGOL3Tt1UR7wCFbwC+ms0i4FQvMcw+TbM0BTHgJDdwR5laX
 mGN7ubz5iImwDFFE3Bd8Qy5I+FGL9CI60l+RzK6b7J8HYi1wOBMLU9QueF/dB7gR
 g+navZJAAnvN6YIkjP5j8yPBuvhDzni379ue5ATDq1ALvyyI7xlYALsxpUjCnLuo
 CkbvgmfmC94Vdm7pzFgsbazUN2oIjwoimjFQHP1bf8Jmd+770R282JpnwiD/ydCV
 Tl2ArwXA2zxVxNZm9g/XqPBwWBWWgWfYIQbuuxc065MnXCnHkY5UGGf0JNx41CDl
 hdtm9DHkft8+6kyBBmgkdKxd328Znljq02v3nLePUipfpDVaNj4VAUj9IpV6Lpuj
 GJjs4Wx7oqFwH1Im/LqZgnOGwgkSj3ObHtkYx2RSrQAQultbjuplFz2qZWP8PF6A
 FrJbcddKOmLINrdNOlvTd5WKCAjtV8vycjFkk+/7H67rpZdM8AI1StrzMP6gmwg4
 ARozZJ2UF8nTriRYFQbFQyNm9bBTZ7YQ/HajqfhvCuZLi7i1EaImhC0F1xn7IU5S
 6XhvQPvjRTgS
 =i6OA
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-fixes-6.2-1' of https://github.com/kvm-x86/linux into HEAD

Misc KVM x86 fixes and cleanups for 6.2:

 - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

 - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
   years back when eliminating unnecessary barriers when switching between
   vmcs01 and vmcs02.

 - Clean up the MSR filter docs.

 - Clean up vmread_error_trampoline() to make it more obvious that params
   must be passed on the stack, even for x86-64.

 - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
   of the current guest CPUID.

 - Fudge around a race with TSC refinement that results in KVM incorrectly
   thinking a guest needs TSC scaling when running on a CPU with a
   constant TSC, but no hardware-enumerated TSC frequency.
2022-12-02 12:56:25 -05:00
Javier Martinez Canillas 10c5e80b2c KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
The ioctls are missing an architecture property that is present in others.

Suggested-by: Sergio Lopez Pascual <slp@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-5-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 12:54:41 -05:00
Javier Martinez Canillas 30ee198ce4 KVM: Reference to kvm_userspace_memory_region in doc and comments
There are still references to the removed kvm_memory_region data structure
but the doc and comments should mention struct kvm_userspace_memory_region
instead, since that is what's used by the ioctl that replaced the old one
and this data structure support the same set of flags.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-4-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 12:54:40 -05:00
Javier Martinez Canillas 66a9221d73 KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
The documentation says that the ioctl has been deprecated, but it has been
actually removed and the remaining references are just left overs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-3-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 12:54:40 -05:00
Javier Martinez Canillas 61e15f8712 KVM: Delete all references to removed KVM_SET_MEMORY_REGION ioctl
The documentation says that the ioctl has been deprecated, but it has been
actually removed and the remaining references are just left overs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-2-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 12:54:30 -05:00
Sean Christopherson 1f15814718 KVM: x86: Clean up KVM_CAP_X86_USER_SPACE_MSR documentation
Clean up the KVM_CAP_X86_USER_SPACE_MSR documentation to eliminate
misleading and/or inconsistent verbiage, and to actually document what
accesses are intercepted by which flags.

  - s/will/may since not all #GPs are guaranteed to be intercepted
  - s/deflect/intercept to align with common KVM terminology
  - s/user space/userspace to align with the majority of KVM docs
  - Avoid using "trap" terminology, as KVM exits to userspace _before_
    stepping, i.e. doesn't exhibit trap-like behavior
  - Actually document the flags

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220831001706.4075399-4-seanjc@google.com
2022-11-30 16:19:43 -08:00
Sean Christopherson b93d2ec34e KVM: x86: Reword MSR filtering docs to more precisely define behavior
Reword the MSR filtering documentatiion to more precisely define the
behavior of filtering using common virtualization terminology.

  - Explicitly document KVM's behavior when an MSR is denied
  - s/handled/allowed as there is no guarantee KVM will "handle" the
    MSR access
  - Drop the "fall back" terminology, which incorrectly suggests that
    there is existing KVM behavior to fall back to
  - Fix an off-by-one error in the range (the end is exclusive)
  - Call out the interaction between MSR filtering and
    KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER
  - Delete the redundant paragraph on what '0' and '1' in the bitmap
    means, it's covered by the sections on KVM_MSR_FILTER_{READ,WRITE}
  - Delete the clause on x2APIC MSR behavior depending on APIC base, this
    is covered by stating that KVM follows architectural behavior when
    emulating/virtualizing MSR accesses

Reported-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220831001706.4075399-3-seanjc@google.com
2022-11-30 16:19:43 -08:00
Sean Christopherson 5c8c0b3273 KVM: x86: Delete documentation for READ|WRITE in KVM_X86_SET_MSR_FILTER
Delete the paragraph that describes the behavior when both
KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE are set for a range.  There is
nothing special about KVM's handling of this combination, whereas
explicitly documenting the combination suggests that there is some magic
behavior the user needs to be aware of.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220831001706.4075399-2-seanjc@google.com
2022-11-30 16:19:43 -08:00
David Woodhouse d8ba8ba4c8 KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
Closer inspection of the Xen code shows that we aren't supposed to be
using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be
explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall.
If we randomly set the top bit of ->state_entry_time for a guest that
hasn't asked for it and doesn't expect it, that could make the runtimes
fail to add up and confuse the guest. Without the flag it's perfectly
safe for a vCPU to read its own vcpu_runstate_info; just not for one
vCPU to read *another's*.

I briefly pondered adding a word for the whole set of VMASST_TYPE_*
flags but the only one we care about for HVM guests is this, so it
seemed a bit pointless.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221127122210.248427-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30 10:59:37 -05:00
Peter Collingbourne a4baf8d263 Documentation: document the ABI changes for KVM_CAP_ARM_MTE
Document both the restriction on VM_MTE_ALLOWED mappings and
the relaxation for shared mappings.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-9-pcc@google.com
2022-11-29 09:26:07 +00:00
Paolo Bonzini 1e79a9e3ab - Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support
 - Removal of a unused function
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmN/eYwACgkQ41TmuOI4
 ufjoxA/9Et38aXO/IhmUt8v0QhA4yec+sc5GSFfQSYehej/1Vqhw0DXx+ORUiRgg
 +rtiXJSSqkuD2dL+BDffY2xoul6nzNdVf4AbkcnrWscfWr6xwVYlPvuL0ymGI6J2
 U/IPedRoKw0bHw/wHs05yV5PubrRwDFERKhtyXWYGbPJhX0w2n3IFOoKH1oWBhLW
 Dc8jEs6t3gDbJ71Er0xoeBUoiuu+PgZG06cpOvzBZ0KclRgjADXyISqqk8/4mu8w
 R+/Wf8NcrbQYV1jfCeq5zIsKC8uvnFj25UuyTLumn5vh+dNNsvE72Khe4tz7LI0I
 ZPZ+GZuemu7Yi12dKjw4Sw3ui0ejWH/5XL1SVB0X/xYIWrBqOot+Lq6538GCng+c
 tJt+zsu64VFgXCCZ8O9qO4uE4DBL70H3ThT7VZxIghSTZtY0xh3uFc64f3/3d9dy
 K4WTJHrmMxhXaA/rqtIa8I53JvFl8CztofZATiQQesyPuc7lZ01w1Co5el4xYaxe
 YknyMTq11qf/iYqVOW7sjoWW/YRuuMZ4+FhpI3o/SllVdN98iTwkk1kP3wcoBO5P
 bvzpm+WXHbv9OxifPrqkqv34+upbjfEmSogHudQzagBX4vl3rZRfBCdQGCAha0Uc
 ZYyg68kiil5sWmHI/Ln/ZjANYfbS5sF0CreuWxnmqcwKl2NSN/E=
 =/1yt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-6.2-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support
- Removal of a unused function
2022-11-28 13:34:47 -05:00
Claudio Imbrenda d9459922a1 KVM: s390: pv: api documentation for asynchronous destroy
Add documentation for the new commands added to the KVM_S390_PV_COMMAND
ioctl.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-3-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-3-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-23 09:06:50 +00:00
Kuppuswamy Sathyanarayanan 6c8c1406a6 virt: Add TDX guest driver
TDX guest driver exposes IOCTL interfaces to service TDX guest
user-specific requests. Currently, it is only used to allow the user to
get the TDREPORT to support TDX attestation.

Details about the TDX attestation process are documented in
Documentation/x86/tdx.rst, and the IOCTL details are documented in
Documentation/virt/coco/tdx-guest.rst.

Operations like getting TDREPORT involves sending a blob of data as
input and getting another blob of data as output. It was considered
to use a sysfs interface for this, but it doesn't fit well into the
standard sysfs model for configuring values. It would be possible to
do read/write on files, but it would need multiple file descriptors,
which would be somewhat messy. IOCTLs seem to be the best fitting
and simplest model for this use case. The AMD sev-guest driver also
uses the IOCTL interface to support attestation.

[Bagas Sanjaya: Ack is for documentation portion]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/all/20221116223820.819090-3-sathyanarayanan.kuppuswamy%40linux.intel.com
2022-11-17 11:04:23 -08:00
Usama Arif 83f8a81dec KVM: arm64: Fix pvtime documentation
This includes table format and using reST labels for
cross-referencing to vcpu.rst.

Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221103131210.3603385-1-usama.arif@bytedance.com
2022-11-11 16:29:22 +00:00