Commit Graph

269 Commits

Author SHA1 Message Date
Dexuan Cui ec97e11298 Drivers: hv: vmbus: Call hv_synic_free() if hv_synic_alloc() fails
Commit 572086325c ("Drivers: hv: vmbus: Cleanup synic memory free path")
says "Any memory allocations that succeeded will be freed when the caller
cleans up by calling hv_synic_free()", but if the get_zeroed_page() in
hv_synic_alloc() fails, currently hv_synic_free() is not really called
in vmbus_bus_init(), consequently there will be a memory leak, e.g.
hv_context.hv_numa_map is not freed in the error path. Fix this by
updating the goto labels.

Cc: stable@kernel.org
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Fixes: 4df4cb9e99 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20230504224155.10484-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-05-08 17:37:20 +00:00
Linus Torvalds da46b58ff8 hyperv-next for v6.4
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmRHJSgTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXjSOCAClsmFmyP320yAB74vQer5cSzxbIpFW
 3qt/P3D8zABn0UxjjmD8+LTHuyB+72KANU6qQ9No6zdYs8yaA1vGX8j8UglWWHuj
 fmaAD4DuZl+V+fmqDgHukgaPlhakmW0m5tJkR+TW3kCgnyrtvSWpXPoxUAe6CLvj
 Kb/SPl6ylHRWlIAEZ51gy0Ipqxjvs5vR/h9CWpTmRMuZvxdWUro2Cm82wJgzXPqq
 3eLbAzB29kLFEIIUpba9a/rif1yrWgVFlfpuENFZ+HUYuR78wrPB9evhwuPvhXd2
 +f+Wk0IXORAJo8h7aaMMIr6bd4Lyn98GPgmS5YSe92HRIqjBvtYs3Dq8
 =F6+n
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - PCI passthrough for Hyper-V confidential VMs (Michael Kelley)

 - Hyper-V VTL mode support (Saurabh Sengar)

 - Move panic report initialization code earlier (Long Li)

 - Various improvements and bug fixes (Dexuan Cui and Michael Kelley)

* tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
  PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg
  Drivers: hv: move panic report code from vmbus to hv early init code
  x86/hyperv: VTL support for Hyper-V
  Drivers: hv: Kconfig: Add HYPERV_VTL_MODE
  x86/hyperv: Make hv_get_nmi_reason public
  x86/hyperv: Add VTL specific structs and hypercalls
  x86/init: Make get/set_rtc_noop() public
  x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
  x86/hyperv: Add callback filter to cpumask_to_vpset()
  Drivers: hv: vmbus: Remove the per-CPU post_msg_page
  clocksource: hyper-v: make sure Invariant-TSC is used if it is available
  PCI: hv: Enable PCI pass-thru devices in Confidential VMs
  Drivers: hv: Don't remap addresses that are above shared_gpa_boundary
  hv_netvsc: Remove second mapping of send and recv buffers
  Drivers: hv: vmbus: Remove second way of mapping ring buffers
  Drivers: hv: vmbus: Remove second mapping of VMBus monitor pages
  swiotlb: Remove bounce buffer remapping for Hyper-V
  Driver: VMBus: Add Devicetree support
  dt-bindings: bus: Add Hyper-V VMBus
  Drivers: hv: vmbus: Convert acpi_device to more generic platform_device
  ...
2023-04-27 17:17:12 -07:00
Linus Torvalds 888d3c9f7f sysctl-6.4-rc1
This pull request goes with only a few sysctl moves from the
 kernel/sysctl.c file, the rest of the work has been put towards
 deprecating two API calls which incur recursion and prevent us
 from simplifying the registration process / saving memory per
 move. Most of the changes have been soaking on linux-next since
 v6.3-rc3.
 
 I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's
 feedback that we should see if we could *save* memory with these
 moves instead of incurring more memory. We currently incur more
 memory since when we move a syctl from kernel/sysclt.c out to its
 own file we end up having to add a new empty sysctl used to register
 it. To achieve saving memory we want to allow syctls to be passed
 without requiring the end element being empty, and just have our
 registration process rely on ARRAY_SIZE(). Without this, supporting
 both styles of sysctls would make the sysctl registration pretty
 brittle, hard to read and maintain as can be seen from Meng Tang's
 efforts to do just this [0]. Fortunately, in order to use ARRAY_SIZE()
 for all sysctl registrations also implies doing the work to deprecate
 two API calls which use recursion in order to support sysctl
 declarations with subdirectories.
 
 And so during this development cycle quite a bit of effort went into
 this deprecation effort. I've annotated the following two APIs are
 deprecated and in few kernel releases we should be good to remove them:
 
   * register_sysctl_table()
   * register_sysctl_paths()
 
 During this merge window we should be able to deprecate and unexport
 register_sysctl_paths(), we can probably do that towards the end
 of this merge window.
 
 Deprecating register_sysctl_table() will take a bit more time but
 this pull request goes with a few example of how to do this.
 
 As it turns out each of the conversions to move away from either of
 these two API calls *also* saves memory. And so long term, all these
 changes *will* prove to have saved a bit of memory on boot.
 
 The way I see it then is if remove a user of one deprecated call, it
 gives us enough savings to move one kernel/sysctl.c out from the
 generic arrays as we end up with about the same amount of bytes.
 
 Since deprecating register_sysctl_table() and register_sysctl_paths()
 does not require maintainer coordination except the final unexport
 you'll see quite a bit of these changes from other pull requests, I've
 just kept the stragglers after rc3.
 
 Most of these changes have been soaking on linux-next since around rc3.
 
 [0] https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRHAjQSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinTzgQAI/uKHKi0VlUR1l2Psl0XbseUVueuyj3
 ZDxSJpbVUmsoDf2MlLjzB8mYE3ricnNTDbLr7qOyA6pXdM1N0mY5LQmRVRu8/ffd
 2T1hQ5pl7YnJdWP5dPhcF9Y+jnu1tjX1MW5DS4fzllwK7FnD86HuIruGq52RAPS/
 /FH+BD9eodLWWXk6A/o2GFqoWxPKQI0GLxEYWa7Hg7yt8E/3PQL9QsRzn8i6U+HW
 BrN/+G3YD1VCCzXu0UAeXnm+i1Z7CdvqNdZuSkvE3DObiZ5WpOS+/i7FrDB7zdiu
 zAbHaifHnDPtcK3w2ZodbLAAwEWD/mG4iwIjE2kgIMVYxBv7TFDBRREXAWYAevIT
 UUuZnWDQsGaWdjywrebaUycEfd6dytKyan0fTXgMFkcoWRjejhitfdM2iZDdQROg
 q453p4HqOw4vTrhy4ov4zOX7J3EFiBzpZdl+SmLqcXk+jbLVb/Q9snUWz1AFtHBl
 gHoP5bS82uVktGG3MsObjgTzYYMQjO9YGIrVuW1VP9uWs8WaoWx6M9FQJIIhtwE+
 h6wG2s7CjuFWnS0/IxWmDOn91QyUn1w7ohiz9TuvYj/5GLSBpBDGCJHsNB5T2WS1
 qbQRaZ2Kg3j9TeyWfXxdlxBx7bt3ni+J/IXDY0zom2sTpGHKl8D2g5AzmEXJDTpl
 kd7Z3gsmwhDh
 =0U0W
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "This only does a few sysctl moves from the kernel/sysctl.c file, the
  rest of the work has been put towards deprecating two API calls which
  incur recursion and prevent us from simplifying the registration
  process / saving memory per move. Most of the changes have been
  soaking on linux-next since v6.3-rc3.

  I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's
  feedback that we should see if we could *save* memory with these moves
  instead of incurring more memory. We currently incur more memory since
  when we move a syctl from kernel/sysclt.c out to its own file we end
  up having to add a new empty sysctl used to register it. To achieve
  saving memory we want to allow syctls to be passed without requiring
  the end element being empty, and just have our registration process
  rely on ARRAY_SIZE(). Without this, supporting both styles of sysctls
  would make the sysctl registration pretty brittle, hard to read and
  maintain as can be seen from Meng Tang's efforts to do just this [0].
  Fortunately, in order to use ARRAY_SIZE() for all sysctl registrations
  also implies doing the work to deprecate two API calls which use
  recursion in order to support sysctl declarations with subdirectories.

  And so during this development cycle quite a bit of effort went into
  this deprecation effort. I've annotated the following two APIs are
  deprecated and in few kernel releases we should be good to remove
  them:

   - register_sysctl_table()
   - register_sysctl_paths()

  During this merge window we should be able to deprecate and unexport
  register_sysctl_paths(), we can probably do that towards the end of
  this merge window.

  Deprecating register_sysctl_table() will take a bit more time but this
  pull request goes with a few example of how to do this.

  As it turns out each of the conversions to move away from either of
  these two API calls *also* saves memory. And so long term, all these
  changes *will* prove to have saved a bit of memory on boot.

  The way I see it then is if remove a user of one deprecated call, it
  gives us enough savings to move one kernel/sysctl.c out from the
  generic arrays as we end up with about the same amount of bytes.

  Since deprecating register_sysctl_table() and register_sysctl_paths()
  does not require maintainer coordination except the final unexport
  you'll see quite a bit of these changes from other pull requests, I've
  just kept the stragglers after rc3"

Link: https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org [0]

* tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (29 commits)
  fs: fix sysctls.c built
  mm: compaction: remove incorrect #ifdef checks
  mm: compaction: move compaction sysctl to its own file
  mm: memory-failure: Move memory failure sysctls to its own file
  arm: simplify two-level sysctl registration for ctl_isa_vars
  ia64: simplify one-level sysctl registration for kdump_ctl_table
  utsname: simplify one-level sysctl registration for uts_kern_table
  ntfs: simplfy one-level sysctl registration for ntfs_sysctls
  coda: simplify one-level sysctl registration for coda_table
  fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls
  xfs: simplify two-level sysctl registration for xfs_table
  nfs: simplify two-level sysctl registration for nfs_cb_sysctls
  nfs: simplify two-level sysctl registration for nfs4_cb_sysctls
  lockd: simplify two-level sysctl registration for nlm_sysctls
  proc_sysctl: enhance documentation
  xen: simplify sysctl registration for balloon
  md: simplify sysctl registration
  hv: simplify sysctl registration
  scsi: simplify sysctl registration with register_sysctl()
  csky: simplify alignment sysctl registration
  ...
2023-04-27 16:52:33 -07:00
Linus Torvalds 556eb8b791 Driver core changes for 6.4-rc1
Here is the large set of driver core changes for 6.4-rc1.
 
 Once again, a busy development cycle, with lots of changes happening in
 the driver core in the quest to be able to move "struct bus" and "struct
 class" into read-only memory, a task now complete with these changes.
 
 This will make the future rust interactions with the driver core more
 "provably correct" as well as providing more obvious lifetime rules for
 all busses and classes in the kernel.
 
 The changes required for this did touch many individual classes and
 busses as many callbacks were changed to take const * parameters
 instead.  All of these changes have been submitted to the various
 subsystem maintainers, giving them plenty of time to review, and most of
 them actually did so.
 
 Other than those changes, included in here are a small set of other
 things:
   - kobject logging improvements
   - cacheinfo improvements and updates
   - obligatory fw_devlink updates and fixes
   - documentation updates
   - device property cleanups and const * changes
   - firwmare loader dependency fixes.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
 LEGadNS38k5fs+73UaxV
 =7K4B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firwmare loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Long Li 9c318a1d9b Drivers: hv: move panic report code from vmbus to hv early init code
The panic reporting code was added in commit 81b18bce48
("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")

It was added to the vmbus driver. The panic reporting has no dependence
on vmbus, and can be enabled at an earlier boot time when Hyper-V is
initialized.

This patch moves the panic reporting code out of vmbus. There is no
functionality changes. During moving, also refactored some cleanup
functions into hv_kmsg_dump_unregister().

Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1682030946-6372-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-20 23:00:37 +00:00
Wei Liu 21eb596fce Merge remote-tracking branch 'tip/x86/sev' into hyperv-next
Merge the following 6 patches from tip/x86/sev, which are taken from
Michael Kelley's series [0]. The rest of Michael's series depend on
them.

  x86/hyperv: Change vTOM handling to use standard coco mechanisms
  init: Call mem_encrypt_init() after Hyper-V hypercall init is done
  x86/mm: Handle decryption/re-encryption of bss_decrypted consistently
  Drivers: hv: Explicitly request decrypted in vmap_pfn() calls
  x86/hyperv: Reorder code to facilitate future work
  x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM

0: https://lore.kernel.org/linux-hyperv/1679838727-87310-1-git-send-email-mikelley@microsoft.com/
2023-04-17 19:18:13 +00:00
Saurabh Sengar f83705a512 Driver: VMBus: Add Devicetree support
Update the driver to support Devicetree boot as well along with ACPI.
At present the Devicetree parsing only provides the mmio region info
and is not the exact copy of ACPI parsing. This is sufficient to cater
all the current Devicetree usecases for VMBus.

Currently Devicetree is supported only for x86 systems.

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1679298460-11855-6-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-17 19:16:22 +00:00
Saurabh Sengar 9c84342380 Drivers: hv: vmbus: Convert acpi_device to more generic platform_device
VMBus driver code currently has direct dependency on ACPI and struct
acpi_device.  As a staging step toward optionally configuring based on
Devicetree instead of ACPI, use a more generic platform device to reduce
the dependency on ACPI where possible, though the dependency on ACPI
is not completely removed.  Also rename the function vmbus_acpi_remove()
to the more generic vmbus_mmio_remove().

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1679298460-11855-4-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-17 19:16:22 +00:00
Luis Chamberlain 525f23fe58 hv: simplify sysctl registration
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
2023-04-13 11:49:20 -07:00
Michael Kelley 812b0597fb x86/hyperv: Change vTOM handling to use standard coco mechanisms
Hyper-V guests on AMD SEV-SNP hardware have the option of using the
"virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
architecture. With vTOM, shared vs. private memory accesses are
controlled by splitting the guest physical address space into two
halves.

vTOM is the dividing line where the uppermost bit of the physical
address space is set; e.g., with 47 bits of guest physical address
space, vTOM is 0x400000000000 (bit 46 is set).  Guest physical memory is
accessible at two parallel physical addresses -- one below vTOM and one
above vTOM.  Accesses below vTOM are private (encrypted) while accesses
above vTOM are shared (decrypted). In this sense, vTOM is like the
GPA.SHARED bit in Intel TDX.

Support for Hyper-V guests using vTOM was added to the Linux kernel in
two patch sets[1][2]. This support treats the vTOM bit as part of
the physical address. For accessing shared (decrypted) memory, these
patch sets create a second kernel virtual mapping that maps to physical
addresses above vTOM.

A better approach is to treat the vTOM bit as a protection flag, not
as part of the physical address. This new approach is like the approach
for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
virtual mapping, the existing mapping is updated using recently added
coco mechanisms.

When memory is changed between private and shared using
set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
existing kernel mapping are changed to add or remove the vTOM bit in the
guest physical address, just as with TDX. The hypercalls to change the
memory status on the host side are made using the existing callback
mechanism. Everything just works, with a minor tweak to map the IO-APIC
to use private accesses.

To accomplish the switch in approach, the following must be done:

* Update Hyper-V initialization to set the cc_mask based on vTOM
  and do other coco initialization.

* Update physical_mask so the vTOM bit is no longer treated as part
  of the physical address

* Remove CC_VENDOR_HYPERV and merge the associated vTOM functionality
  under CC_VENDOR_AMD. Update cc_mkenc() and cc_mkdec() to set/clear
  the vTOM bit as a protection flag.

* Code already exists to make hypercalls to inform Hyper-V about pages
  changing between shared and private.  Update this code to run as a
  callback from __set_memory_enc_pgtable().

* Remove the Hyper-V special case from __set_memory_enc_dec()

* Remove the Hyper-V specific call to swiotlb_update_mem_attributes()
  since mem_encrypt_init() will now do it.

* Add a Hyper-V specific implementation of the is_private_mmio()
  callback that returns true for the IO-APIC and vTPM MMIO addresses

  [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
  [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/

  [ bp: Touchups. ]

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/1679838727-87310-7-git-send-email-mikelley@microsoft.com
2023-03-27 09:31:43 +02:00
Greg Kroah-Hartman 75cff725d9 driver core: bus: mark the struct bus_type for sysfs callbacks as constant
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-23 13:20:40 +01:00
Linus Torvalds a93e884edf Driver core changes for 6.3-rc1
Here is the large set of driver core changes for 6.3-rc1.
 
 There's a lot of changes this development cycle, most of the work falls
 into two different categories:
   - fw_devlink fixes and updates.  This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.
   - driver core changes to work to make struct bus_type able to be moved
     into read-only memory (i.e. const)  The recent work with Rust has
     pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only making
     things safer overall.  This is the contuation of that work (started
     last release with kobject changes) in moving struct bus_type to be
     constant.  We didn't quite make it for this release, but the
     remaining patches will be finished up for the release after this
     one, but the groundwork has been laid for this effort.
 
 Other than that we have in here:
   - debugfs memory leak fixes in some subsystems
   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.
   - cacheinfo rework and fixes
   - Other tiny fixes, full details are in the shortlog
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
 6oeFOjD3JDju3cQsfGgd
 =Su6W
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.3-rc1.

  There's a lot of changes this development cycle, most of the work
  falls into two different categories:

   - fw_devlink fixes and updates. This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.

   - driver core changes to work to make struct bus_type able to be
     moved into read-only memory (i.e. const) The recent work with Rust
     has pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only
     making things safer overall. This is the contuation of that work
     (started last release with kobject changes) in moving struct
     bus_type to be constant. We didn't quite make it for this release,
     but the remaining patches will be finished up for the release after
     this one, but the groundwork has been laid for this effort.

  Other than that we have in here:

   - debugfs memory leak fixes in some subsystems

   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.

   - cacheinfo rework and fixes

   - Other tiny fixes, full details are in the shortlog

  All of these have been in linux-next for a while with no reported
  problems"

[ Geert Uytterhoeven points out that that last sentence isn't true, and
  that there's a pending report that has a fix that is queued up - Linus ]

* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
  debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
  OPP: fix error checking in opp_migrate_dentry()
  debugfs: update comment of debugfs_rename()
  i3c: fix device.h kernel-doc warnings
  dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
  driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
  Revert "driver core: add error handling for devtmpfs_create_node()"
  Revert "devtmpfs: add debug info to handle()"
  Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
  driver core: cpu: don't hand-override the uevent bus_type callback.
  devtmpfs: remove return value of devtmpfs_delete_node()
  devtmpfs: add debug info to handle()
  driver core: add error handling for devtmpfs_create_node()
  driver core: bus: update my copyright notice
  driver core: bus: add bus_get_dev_root() function
  driver core: bus: constify bus_unregister()
  driver core: bus: constify some internal functions
  driver core: bus: constify bus_get_kset()
  driver core: bus: constify bus_register/unregister_notifier()
  driver core: remove private pointer from struct bus_type
  ...
2023-02-24 12:58:55 -08:00
Greg Kroah-Hartman 2a81ada32f driver core: make struct bus_type.uevent() take a const *
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.

Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-27 13:45:52 +01:00
Jinank Jain 8536290f00 Drivers: hv: Enable vmbus driver for nested root partition
Currently VMBus driver is not initialized for root partition but we need
to enable the VMBus driver for nested root partition. This is required,
so that L2 root can use the VMBus devices.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/c3cdd2cf2bffeba388688640eb61bc182e4c041d.1672639707.git.jinankjain@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-01-17 13:37:19 +00:00
Linus Torvalds 456ed864fd ACPI updates for 6.2-rc1
- Update the ACPICA code in the kernel to the 20221020 upstream
    version and fix a couple of issues in it:
 
    * Make acpi_ex_load_op() match upstream implementation (Rafael
      Wysocki).
    * Add support for loong_arch-specific APICs in MADT (Huacai Chen).
    * Add support for fixed PCIe wake event (Huacai Chen).
    * Add EBDA pointer sanity checks (Vit Kabele).
    * Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele).
    * Add CCEL table support to both compiler/disassembler (Kuppuswamy
      Sathyanarayanan).
    * Add a couple of new UUIDs to the known UUID list (Bob Moore).
    * Add support for FFH Opregion special context data (Sudeep Holla).
    * Improve warning message for "invalid ACPI name" (Bob Moore).
    * Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT
      table (Alison Schofield).
    * Prepare IORT support for revision E.e (Robin Murphy).
    * Finish support for the CDAT table (Bob Moore).
    * Fix error code path in acpi_ds_call_control_method() (Rafael
      Wysocki).
    * Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li
      Zetao).
    * Update the version of the ACPICA code in the kernel (Bob Moore).
 
  - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
    enumeration code (Giulio Benetti).
 
  - Change the return type of the ACPI driver remove callback to void and
    update its users accordingly (Dawei Li).
 
  - Add general support for FFH address space type and implement the low-
    level part of it for ARM64 (Sudeep Holla).
 
  - Fix stale comments in the ACPI tables parsing code and make it print
    more messages related to MADT (Hanjun Guo, Huacai Chen).
 
  - Replace invocations of generic library functions with more kernel-
    specific counterparts in the ACPI sysfs interface (Christophe JAILLET,
    Xu Panda).
 
  - Print full name paths of ACPI power resource objects during
    enumeration (Kane Chen).
 
  - Eliminate a compiler warning regarding a missing function prototype
    in the ACPI power management code (Sudeep Holla).
 
  - Fix and clean up the ACPI processor driver (Rafael Wysocki, Li Zhong,
    Colin Ian King, Sudeep Holla).
 
  - Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC
    driver (Mia Kanashi).
 
  - Add some mew ACPI backlight handling quirks and update some existing
    ones (Hans de Goede).
 
  - Make the ACPI backlight driver prefer the native backlight control
    over vendor backlight control when possible (Hans de Goede).
 
  - Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König).
 
  - Use xchg_release() instead of cmpxchg() for updating new GHES cache
    slots (Ard Biesheuvel).
 
  - Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay Lu).
 
  - Add new I2C device enumeration quirks for Medion Lifetab S10346 and
    Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede).
 
  - Make the ACPI battery driver notify user space about adding new
    battery hooks and removing the existing ones (Armin Wolf).
 
  - Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE()
    for freeing acpi_object structures to help diagnostics (Wang ShaoBo).
 
  - Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface
    code (ye xingchen).
 
  - Fix the _FIF package extraction failure handling in the ACPI fan
    driver (Hanjun Guo).
 
  - Fix the PCC mailbox handling error code path (Huisong Li).
 
  - Avoid using PCC Opregions if there is no platform interrupt allocated
    for this purpose (Huisong Li).
 
  - Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and
    CPPC library (ye xingchen).
 
  - Fix some kernel-doc issues in the ACPI GSI processing code (Xiongfeng
    Wang).
 
  - Fix name memory leak in pnp_alloc_dev() (Yang Yingliang).
 
  - Do not disable PNP devices on suspend when they cannot be re-enabled
    on resume (Hans de Goede).
 
  - Clean up the ACPI thermal driver a bit (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmOXV10SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxuOwP/2zew6val2Jf7I/Yxf1iQLlRyGmhFnaH
 wpltJvBjlHjAUKnPQ/kLYK9fjuUY5HVgjOE03WpwhFUpmhftYTrSkhoVkJ1Mw9Zl
 RNOAEgCG484ThHiTIVp/dMPxrtfuqpdbamhWX3Q51IfXjGW8Vc/lDxIa3k/JQxyq
 ko8GFPCoebJrSCfuwaAf2+xSQaf6dq4jpL/rlIk+nYMMB9mQmXhNEhc+l97NaCe8
 MyCIGynyNbhGsIlwdHRvTp04EIe8h0Z1+Dyns7g/TrzHj3Aezy7QVZbn8sKdZWa1
 W/Ck9QST5tfpDWyr+hUXxUJjEn4Yy+GXjM2xON0EMx5q+JD9XsOpwWOVwTR7CS5s
 FwEd6I89SC8OZM86AgMtnGxygjpK24R/kGzHjhG15IQCsypc8Rvzoxl0L0YVoon/
 UTkE57GzNWVzu0pY/oXJc2aT7lVqFXMFZ6ft/zHnBRnQmrcIi+xgDO5ni5KxctFN
 TVFwbAMCuwVx6IOcVQCZM2g4aJw426KpUn19fKnXvPwR5UIufBaCzSKWMiYrtdXr
 O5BM8ElYuyKCWGYEE0GSMjZygyDpyY6ENLH7s7P1IEmFyigBzaaGBbKm108JJq4V
 eCWJYTAx8pAptsU/vfuMvEQ1ErfhZ3TTokA5Lv0uPf53VcAnWDb7EAbW6ZGMwFSI
 IaV6cv6ILoqO
 =GVzp
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and PNP updates from Rafael Wysocki:
 "These include new code (for instance, support for the FFH address
  space type and support for new firmware data structures in ACPICA),
  some new quirks (mostly related to backlight handling and I2C
  enumeration), a number of fixes and a fair amount of cleanups all
  over.

  Specifics:

   - Update the ACPICA code in the kernel to the 20221020 upstream
     version and fix a couple of issues in it:
      - Make acpi_ex_load_op() match upstream implementation (Rafael
        Wysocki)
      - Add support for loong_arch-specific APICs in MADT (Huacai Chen)
      - Add support for fixed PCIe wake event (Huacai Chen)
      - Add EBDA pointer sanity checks (Vit Kabele)
      - Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele)
      - Add CCEL table support to both compiler/disassembler (Kuppuswamy
        Sathyanarayanan)
      - Add a couple of new UUIDs to the known UUID list (Bob Moore)
      - Add support for FFH Opregion special context data (Sudeep
        Holla)
      - Improve warning message for "invalid ACPI name" (Bob Moore)
      - Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT
        table (Alison Schofield)
      - Prepare IORT support for revision E.e (Robin Murphy)
      - Finish support for the CDAT table (Bob Moore)
      - Fix error code path in acpi_ds_call_control_method() (Rafael
        Wysocki)
      - Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li
        Zetao)
      - Update the version of the ACPICA code in the kernel (Bob Moore)

   - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
     enumeration code (Giulio Benetti)

   - Change the return type of the ACPI driver remove callback to void
     and update its users accordingly (Dawei Li)

   - Add general support for FFH address space type and implement the
     low- level part of it for ARM64 (Sudeep Holla)

   - Fix stale comments in the ACPI tables parsing code and make it
     print more messages related to MADT (Hanjun Guo, Huacai Chen)

   - Replace invocations of generic library functions with more kernel-
     specific counterparts in the ACPI sysfs interface (Christophe
     JAILLET, Xu Panda)

   - Print full name paths of ACPI power resource objects during
     enumeration (Kane Chen)

   - Eliminate a compiler warning regarding a missing function prototype
     in the ACPI power management code (Sudeep Holla)

   - Fix and clean up the ACPI processor driver (Rafael Wysocki, Li
     Zhong, Colin Ian King, Sudeep Holla)

   - Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC
     driver (Mia Kanashi)

   - Add some mew ACPI backlight handling quirks and update some
     existing ones (Hans de Goede)

   - Make the ACPI backlight driver prefer the native backlight control
     over vendor backlight control when possible (Hans de Goede)

   - Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König)

   - Use xchg_release() instead of cmpxchg() for updating new GHES cache
     slots (Ard Biesheuvel)

   - Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay
     Lu)

   - Add new I2C device enumeration quirks for Medion Lifetab S10346 and
     Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede)

   - Make the ACPI battery driver notify user space about adding new
     battery hooks and removing the existing ones (Armin Wolf)

   - Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE()
     for freeing acpi_object structures to help diagnostics (Wang
     ShaoBo)

   - Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface
     code (ye xingchen)

   - Fix the _FIF package extraction failure handling in the ACPI fan
     driver (Hanjun Guo)

   - Fix the PCC mailbox handling error code path (Huisong Li)

   - Avoid using PCC Opregions if there is no platform interrupt
     allocated for this purpose (Huisong Li)

   - Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and
     CPPC library (ye xingchen)

   - Fix some kernel-doc issues in the ACPI GSI processing code
     (Xiongfeng Wang)

   - Fix name memory leak in pnp_alloc_dev() (Yang Yingliang)

   - Do not disable PNP devices on suspend when they cannot be
     re-enabled on resume (Hans de Goede)

   - Clean up the ACPI thermal driver a bit (Rafael Wysocki)"

* tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits)
  ACPI: x86: Add skip i2c clients quirk for Medion Lifetab S10346
  ACPI: APEI: EINJ: Refactor available_error_type_show()
  ACPI: APEI: EINJ: Fix formatting errors
  ACPI: processor: perflib: Adjust acpi_processor_notify_smm() return value
  ACPI: processor: perflib: Rearrange acpi_processor_notify_smm()
  ACPI: processor: perflib: Rearrange unregistration routine
  ACPI: processor: perflib: Drop redundant parentheses
  ACPI: processor: perflib: Adjust white space
  ACPI: processor: idle: Drop unnecessary statements and parens
  ACPI: thermal: Adjust critical.flags.valid check
  ACPI: fan: Convert to use sysfs_emit_at() API
  ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage()
  ACPI: battery: Call power_supply_changed() when adding hooks
  ACPI: use sysfs_emit() instead of scnprintf()
  ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Tab 3 Pro (YT3-X90F)
  ACPI: APEI: Remove a useless include
  PNP: Do not disable devices on suspend when they cannot be re-enabled on resume
  ACPI: processor: Silence missing prototype warnings
  ACPI: processor_idle: Silence missing prototype warnings
  ACPI: PM: Silence missing prototype warning
  ...
2022-12-12 13:38:17 -08:00
Linus Torvalds 9d33edb20f Updates for the interrupt core and driver subsystem:
- Core:
 
    The bulk is the rework of the MSI subsystem to support per device MSI
    interrupt domains. This solves conceptual problems of the current
    PCI/MSI design which are in the way of providing support for PCI/MSI[-X]
    and the upcoming PCI/IMS mechanism on the same device.
 
    IMS (Interrupt Message Store] is a new specification which allows device
    manufactures to provide implementation defined storage for MSI messages
    contrary to the uniform and specification defined storage mechanisms for
    PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations
    of the MSI-X table, but also gives the device manufacturer the freedom to
    store the message in arbitrary places, even in host memory which is shared
    with the device.
 
    There have been several attempts to glue this into the current MSI code,
    but after lengthy discussions it turned out that there is a fundamental
    design problem in the current PCI/MSI-X implementation. This needs some
    historical background.
 
    When PCI/MSI[-X] support was added around 2003, interrupt management was
    completely different from what we have today in the actively developed
    architectures. Interrupt management was completely architecture specific
    and while there were attempts to create common infrastructure the
    commonalities were rudimentary and just providing shared data structures and
    interfaces so that drivers could be written in an architecture agnostic
    way.
 
    The initial PCI/MSI[-X] support obviously plugged into this model which
    resulted in some basic shared infrastructure in the PCI core code for
    setting up MSI descriptors, which are a pure software construct for holding
    data relevant for a particular MSI interrupt, but the actual association to
    Linux interrupts was completely architecture specific. This model is still
    supported today to keep museum architectures and notorious stranglers
    alive.
 
    In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel,
    which was creating yet another architecture specific mechanism and resulted
    in an unholy mess on top of the existing horrors of x86 interrupt handling.
    The x86 interrupt management code was already an incomprehensible maze of
    indirections between the CPU vector management, interrupt remapping and the
    actual IO/APIC and PCI/MSI[-X] implementation.
 
    At roughly the same time ARM struggled with the ever growing SoC specific
    extensions which were glued on top of the architected GIC interrupt
    controller.
 
    This resulted in a fundamental redesign of interrupt management and
    provided the today prevailing concept of hierarchical interrupt
    domains. This allowed to disentangle the interactions between x86 vector
    domain and interrupt remapping and also allowed ARM to handle the zoo of
    SoC specific interrupt components in a sane way.
 
    The concept of hierarchical interrupt domains aims to encapsulate the
    functionality of particular IP blocks which are involved in interrupt
    delivery so that they become extensible and pluggable. The X86
    encapsulation looks like this:
 
                                             |--- device 1
      [Vector]---[Remapping]---[PCI/MSI]--|...
                                             |--- device N
 
    where the remapping domain is an optional component and in case that it is
    not available the PCI/MSI[-X] domains have the vector domain as their
    parent. This reduced the required interaction between the domains pretty
    much to the initialization phase where it is obviously required to
    establish the proper parent relation ship in the components of the
    hierarchy.
 
    While in most cases the model is strictly representing the chain of IP
    blocks and abstracting them so they can be plugged together to form a
    hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware
    it's clear that the actual PCI/MSI[-X] interrupt controller is not a global
    entity, but strict a per PCI device entity.
 
    Here we took a short cut on the hierarchical model and went for the easy
    solution of providing "global" PCI/MSI domains which was possible because
    the PCI/MSI[-X] handling is uniform across the devices. This also allowed
    to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in
    turn made it simple to keep the existing architecture specific management
    alive.
 
    A similar problem was created in the ARM world with support for IP block
    specific message storage. Instead of going all the way to stack a IP block
    specific domain on top of the generic MSI domain this ended in a construct
    which provides a "global" platform MSI domain which allows overriding the
    irq_write_msi_msg() callback per allocation.
 
    In course of the lengthy discussions we identified other abuse of the MSI
    infrastructure in wireless drivers, NTB etc. where support for
    implementation specific message storage was just mindlessly glued into the
    existing infrastructure. Some of this just works by chance on particular
    platforms but will fail in hard to diagnose ways when the driver is used
    on platforms where the underlying MSI interrupt management code does not
    expect the creative abuse.
 
    Another shortcoming of today's PCI/MSI-X support is the inability to
    allocate or free individual vectors after the initial enablement of
    MSI-X. This results in an works by chance implementation of VFIO (PCI
    pass-through) where interrupts on the host side are not set up upfront to
    avoid resource exhaustion. They are expanded at run-time when the guest
    actually tries to use them. The way how this is implemented is that the
    host disables MSI-X and then re-enables it with a larger number of
    vectors again. That works by chance because most device drivers set up
    all interrupts before the device actually will utilize them. But that's
    not universally true because some drivers allocate a large enough number
    of vectors but do not utilize them until it's actually required,
    e.g. for acceleration support. But at that point other interrupts of the
    device might be in active use and the MSI-X disable/enable dance can
    just result in losing interrupts and therefore hard to diagnose subtle
    problems.
 
    Last but not least the "global" PCI/MSI-X domain approach prevents to
    utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS
    is not longer providing a uniform storage and configuration model.
 
    The solution to this is to implement the missing step and switch from
    global PCI/MSI domains to per device PCI/MSI domains. The resulting
    hierarchy then looks like this:
 
                               |--- [PCI/MSI] device 1
      [Vector]---[Remapping]---|...
                               |--- [PCI/MSI] device N
 
    which in turn allows to provide support for multiple domains per device:
 
                               |--- [PCI/MSI] device 1
                               |--- [PCI/IMS] device 1
      [Vector]---[Remapping]---|...
                               |--- [PCI/MSI] device N
                               |--- [PCI/IMS] device N
 
    This work converts the MSI and PCI/MSI core and the x86 interrupt
    domains to the new model, provides new interfaces for post-enable
    allocation/free of MSI-X interrupts and the base framework for PCI/IMS.
    PCI/IMS has been verified with the work in progress IDXD driver.
 
    There is work in progress to convert ARM over which will replace the
    platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
    "solutions" are in the works as well.
 
  - Drivers:
 
    - Updates for the LoongArch interrupt chip drivers
 
    - Support for MTK CIRQv2
 
    - The usual small fixes and updates all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC
 DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf
 br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2
 wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y
 wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7
 fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+
 CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT
 gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR
 BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y
 NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5
 /FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM
 03JfvdxnueM3gw==
 =9erA
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt core and driver subsystem:

  The bulk is the rework of the MSI subsystem to support per device MSI
  interrupt domains. This solves conceptual problems of the current
  PCI/MSI design which are in the way of providing support for
  PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.

  IMS (Interrupt Message Store] is a new specification which allows
  device manufactures to provide implementation defined storage for MSI
  messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
  message store which is uniform accross all devices). The PCI/MSI[-X]
  uniformity allowed us to get away with "global" PCI/MSI domains.

  IMS not only allows to overcome the size limitations of the MSI-X
  table, but also gives the device manufacturer the freedom to store the
  message in arbitrary places, even in host memory which is shared with
  the device.

  There have been several attempts to glue this into the current MSI
  code, but after lengthy discussions it turned out that there is a
  fundamental design problem in the current PCI/MSI-X implementation.
  This needs some historical background.

  When PCI/MSI[-X] support was added around 2003, interrupt management
  was completely different from what we have today in the actively
  developed architectures. Interrupt management was completely
  architecture specific and while there were attempts to create common
  infrastructure the commonalities were rudimentary and just providing
  shared data structures and interfaces so that drivers could be written
  in an architecture agnostic way.

  The initial PCI/MSI[-X] support obviously plugged into this model
  which resulted in some basic shared infrastructure in the PCI core
  code for setting up MSI descriptors, which are a pure software
  construct for holding data relevant for a particular MSI interrupt,
  but the actual association to Linux interrupts was completely
  architecture specific. This model is still supported today to keep
  museum architectures and notorious stragglers alive.

  In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
  kernel, which was creating yet another architecture specific mechanism
  and resulted in an unholy mess on top of the existing horrors of x86
  interrupt handling. The x86 interrupt management code was already an
  incomprehensible maze of indirections between the CPU vector
  management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
  implementation.

  At roughly the same time ARM struggled with the ever growing SoC
  specific extensions which were glued on top of the architected GIC
  interrupt controller.

  This resulted in a fundamental redesign of interrupt management and
  provided the today prevailing concept of hierarchical interrupt
  domains. This allowed to disentangle the interactions between x86
  vector domain and interrupt remapping and also allowed ARM to handle
  the zoo of SoC specific interrupt components in a sane way.

  The concept of hierarchical interrupt domains aims to encapsulate the
  functionality of particular IP blocks which are involved in interrupt
  delivery so that they become extensible and pluggable. The X86
  encapsulation looks like this:

                                            |--- device 1
     [Vector]---[Remapping]---[PCI/MSI]--|...
                                            |--- device N

  where the remapping domain is an optional component and in case that
  it is not available the PCI/MSI[-X] domains have the vector domain as
  their parent. This reduced the required interaction between the
  domains pretty much to the initialization phase where it is obviously
  required to establish the proper parent relation ship in the
  components of the hierarchy.

  While in most cases the model is strictly representing the chain of IP
  blocks and abstracting them so they can be plugged together to form a
  hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
  hardware it's clear that the actual PCI/MSI[-X] interrupt controller
  is not a global entity, but strict a per PCI device entity.

  Here we took a short cut on the hierarchical model and went for the
  easy solution of providing "global" PCI/MSI domains which was possible
  because the PCI/MSI[-X] handling is uniform across the devices. This
  also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
  unchanged which in turn made it simple to keep the existing
  architecture specific management alive.

  A similar problem was created in the ARM world with support for IP
  block specific message storage. Instead of going all the way to stack
  a IP block specific domain on top of the generic MSI domain this ended
  in a construct which provides a "global" platform MSI domain which
  allows overriding the irq_write_msi_msg() callback per allocation.

  In course of the lengthy discussions we identified other abuse of the
  MSI infrastructure in wireless drivers, NTB etc. where support for
  implementation specific message storage was just mindlessly glued into
  the existing infrastructure. Some of this just works by chance on
  particular platforms but will fail in hard to diagnose ways when the
  driver is used on platforms where the underlying MSI interrupt
  management code does not expect the creative abuse.

  Another shortcoming of today's PCI/MSI-X support is the inability to
  allocate or free individual vectors after the initial enablement of
  MSI-X. This results in an works by chance implementation of VFIO (PCI
  pass-through) where interrupts on the host side are not set up upfront
  to avoid resource exhaustion. They are expanded at run-time when the
  guest actually tries to use them. The way how this is implemented is
  that the host disables MSI-X and then re-enables it with a larger
  number of vectors again. That works by chance because most device
  drivers set up all interrupts before the device actually will utilize
  them. But that's not universally true because some drivers allocate a
  large enough number of vectors but do not utilize them until it's
  actually required, e.g. for acceleration support. But at that point
  other interrupts of the device might be in active use and the MSI-X
  disable/enable dance can just result in losing interrupts and
  therefore hard to diagnose subtle problems.

  Last but not least the "global" PCI/MSI-X domain approach prevents to
  utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
  that IMS is not longer providing a uniform storage and configuration
  model.

  The solution to this is to implement the missing step and switch from
  global PCI/MSI domains to per device PCI/MSI domains. The resulting
  hierarchy then looks like this:

                              |--- [PCI/MSI] device 1
     [Vector]---[Remapping]---|...
                              |--- [PCI/MSI] device N

  which in turn allows to provide support for multiple domains per
  device:

                              |--- [PCI/MSI] device 1
                              |--- [PCI/IMS] device 1
     [Vector]---[Remapping]---|...
                              |--- [PCI/MSI] device N
                              |--- [PCI/IMS] device N

  This work converts the MSI and PCI/MSI core and the x86 interrupt
  domains to the new model, provides new interfaces for post-enable
  allocation/free of MSI-X interrupts and the base framework for
  PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
  driver.

  There is work in progress to convert ARM over which will replace the
  platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
  "solutions" are in the works as well.

  Drivers:

   - Updates for the LoongArch interrupt chip drivers

   - Support for MTK CIRQv2

   - The usual small fixes and updates all over the place"

* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
  irqchip/ti-sci-inta: Fix kernel doc
  irqchip/gic-v2m: Mark a few functions __init
  irqchip/gic-v2m: Include arm-gic-common.h
  irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
  iommu/amd: Enable PCI/IMS
  iommu/vt-d: Enable PCI/IMS
  x86/apic/msi: Enable PCI/IMS
  PCI/MSI: Provide pci_ims_alloc/free_irq()
  PCI/MSI: Provide IMS (Interrupt Message Store) support
  genirq/msi: Provide constants for PCI/IMS support
  x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
  PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
  PCI/MSI: Provide prepare_desc() MSI domain op
  PCI/MSI: Split MSI-X descriptor setup
  genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
  genirq/msi: Provide msi_domain_alloc_irq_at()
  genirq/msi: Provide msi_domain_ops:: Prepare_desc()
  genirq/msi: Provide msi_desc:: Msi_data
  genirq/msi: Provide struct msi_map
  x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
  ...
2022-12-12 11:21:29 -08:00
Rafael J. Wysocki 45494d77f2 Merge branches 'acpi-scan', 'acpi-bus', 'acpi-tables' and 'acpi-sysfs'
Merge ACPI changes related to device enumeration, device object
managenet, operation region handling, table parsing and sysfs
interface:

 - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
   enumeration code (Giulio Benetti).

 - Change the return type of the ACPI driver remove callback to void and
   update its users accordingly (Dawei Li).

 - Add general support for FFH address space type and implement the low-
   level part of it for ARM64 (Sudeep Holla).

 - Fix stale comments in the ACPI tables parsing code and make it print
   more messages related to MADT (Hanjun Guo, Huacai Chen).

 - Replace invocations of generic library functions with more kernel-
   specific counterparts in the ACPI sysfs interface (Christophe JAILLET,
   Xu Panda).

* acpi-scan:
  ACPI: scan: substitute empty_zero_page with helper ZERO_PAGE(0)

* acpi-bus:
  ACPI: FFH: Silence missing prototype warnings
  ACPI: make remove callback of ACPI driver void
  ACPI: bus: Fix the _OSC capability check for FFH OpRegion
  arm64: Add architecture specific ACPI FFH Opregion callbacks
  ACPI: Implement a generic FFH Opregion handler

* acpi-tables:
  ACPI: tables: Fix the stale comments for acpi_locate_initial_tables()
  ACPI: tables: Print CORE_PIC information when MADT is parsed

* acpi-sysfs:
  ACPI: sysfs: use sysfs_emit() to instead of scnprintf()
  ACPI: sysfs: Use kstrtobool() instead of strtobool()
2022-12-12 14:55:44 +01:00
Guilherme G. Piccoli d786e00d19 drivers: hv, hyperv_fb: Untangle and refactor Hyper-V panic notifiers
Currently Hyper-V guests are among the most relevant users of the panic
infrastructure, like panic notifiers, kmsg dumpers, etc. The reasons rely
both in cleaning-up procedures (closing hypervisor <-> guest connection,
disabling some paravirtualized timer) as well as to data collection
(sending panic information to the hypervisor) and framebuffer management.

The thing is: some notifiers are related to others, ordering matters, some
functionalities are duplicated and there are lots of conditionals behind
sending panic information to the hypervisor. As part of an effort to
clean-up the panic notifiers mechanism and better document things, we
hereby address some of the issues/complexities of Hyper-V panic handling
through the following changes:

(a) We have die and panic notifiers on vmbus_drv.c and both have goals of
sending panic information to the hypervisor, though the panic notifier is
also responsible for a cleaning-up procedure.

This commit clears the code by splitting the panic notifier in two, one
for closing the vmbus connection whereas the other is only for sending
panic info to hypervisor. With that, it was possible to merge the die and
panic notifiers in a single/well-documented function, and clear some
conditional complexities on sending such information to the hypervisor.

(b) There is a Hyper-V framebuffer panic notifier, which relies in doing
a vmbus operation that demands a valid connection. So, we must order this
notifier with the panic notifier from vmbus_drv.c, to guarantee that the
framebuffer code executes before the vmbus connection is unloaded.

Also, this commit removes a useless header.

Although there is code rework and re-ordering, we expect that this change
has no functional regressions but instead optimize the path and increase
panic reliability on Hyper-V. This was tested on Hyper-V with success.

Cc: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Tianyu Lan <Tianyu.Lan@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Fabio A M Martins <fabiomirmar@gmail.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220819221731.480795-11-gpiccoli@igalia.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-28 16:48:20 +00:00
Dawei Li 6c0eb5ba35 ACPI: make remove callback of ACPI driver void
For bus-based driver, device removal is implemented as:
1 device_remove()->
2   bus->remove()->
3     driver->remove()

Driver core needs no inform from callee(bus driver) about the
result of remove callback. In that case, commit fc7a6209d5
("bus: Make remove callback return void") forces bus_type::remove
be void-returned.

Now we have the situation that both 1 & 2 of calling chain are
void-returned, so it does not make much sense for 3(driver->remove)
to return non-void to its caller.

So the basic idea behind this change is making remove() callback of
any bus-based driver to be void-returned.

This change, for itself, is for device drivers based on acpi-bus.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Lee Jones <lee@kernel.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dawei Li <set_pte_at@outlook.com>
Reviewed-by: Maximilian Luz <luzmaximilian@gmail.com>  # for drivers/platform/surface/*
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-23 19:11:22 +01:00
Yang Yingliang 25c94b0515 Drivers: hv: vmbus: fix possible memory leak in vmbus_device_register()
If device_register() returns error in vmbus_device_register(),
the name allocated by dev_set_name() must be freed. As comment
of device_register() says, it should use put_device() to give
up the reference in the error path. So fix this by calling
put_device(), then the name can be freed in kobject_cleanup().

Fixes: 09d50ff8a2 ("Staging: hv: make the Hyper-V virtual bus code build")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20221119081135.1564691-3-yangyingliang@huawei.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-21 10:57:42 +00:00
Thomas Gleixner e5dfd093ec clocksource/drivers/hyper-v: Include asm/hyperv-tlfs.h not asm/mshyperv.h
clocksource/hyperv_timer.h is included into the VDSO build. It includes
asm/mshyperv.h which in turn includes the world and some more. This worked
so far by chance, but any subtle change in the include chain results in a
build breakage because VDSO builds are building user space libraries.

Include asm/hyperv-tlfs.h instead which contains everything what the VDSO
build needs except the hv_get_raw_timer() define. Move this define into a
separate header file, which contains the prerequisites (msr.h) and is
included by clocksource/hyperv_timer.h.

Fixup drivers/hv/vmbus_drv.c which relies on the indirect include of
asm/mshyperv.h.

With that the VDSO build only pulls in the minimum requirements.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/87fsemtut0.ffs@tglx
2022-11-17 13:58:32 +01:00
Linus Torvalds 0e0073eb1b hyperv-next for 6.1
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmNDMCsTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXpmUB/4k6V0JlLc9kxClxK8KCzAasby5hnGz
 EPOXyFi6ff1CSiMdpV+oWh/ENQr8lSQYsj6esMNNRr4DLDJ9o6NGtdK05HzizCjs
 KwHlvjF0eRsr60zSIySdeAwPyEyupqcpkLSF6oGi53f6OQuC2LDQa/h7FMnSKgxD
 TFzkWbFAf1kl/nQTUenxTh+u16umaKsedrD9tQli1vIyhltQbu9Hw2gAV1vWxdKO
 Md6BRwaMMz5Lffs6eM2ULlyxNzK1Pk1LWdPeNVHHIeBVxI+YMememFCOA5BBXLXk
 QlVGIPtbyHtryK9DRFyhk9UWgdAZIBCDHxBOqXXySOGA+Kc7/DK5zCeB
 =f55l
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20221009' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - Remove unnecessary delay while probing for VMBus (Stanislav
   Kinsburskiy)

 - Optimize vmbus_on_event (Saurabh Sengar)

 - Fix a race in Hyper-V DRM driver (Saurabh Sengar)

 - Miscellaneous clean-up patches from various people

* tag 'hyperv-next-signed-20221009' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Replace kmap() with kmap_local_page()
  drm/hyperv: Add ratelimit on error message
  hyperv: simplify and rename generate_guest_id
  Drivers: hv: vmbus: Split memcpy of flex-array
  scsi: storvsc: remove an extraneous "to" in a comment
  Drivers: hv: vmbus: Don't wait for the ACPI device upon initialization
  Drivers: hv: vmbus: Use PCI_VENDOR_ID_MICROSOFT for better discoverability
  Drivers: hv: vmbus: Fix kernel-doc
  drm/hyperv: Don't overwrite dirt_needed value set by host
  Drivers: hv: vmbus: Optimize vmbus_on_event
2022-10-10 13:59:01 -07:00
Rafael J. Wysocki 80487a37de Merge branch 'acpi-dev'
Merge changes regarding the management of ACPI device objects for
6.1-rc1:

 - Rename ACPI device object reference counting functions (Rafael
   Wysocki).

 - Rearrange ACPI device object initialization code (Rafael Wysocki).

 - Drop parent field from struct acpi_device (Rafael Wysocki).

 - Extend the the int3472-tps68470 driver to support multiple consumers
   of a single TPS68470 along with the requisite framework-level
   support (Daniel Scally).

* acpi-dev:
  platform/x86: int3472: Add board data for Surface Go2 IR camera
  platform/x86: int3472: Support multiple gpio lookups in board data
  platform/x86: int3472: Support multiple clock consumers
  ACPI: bus: Add iterator for dependent devices
  ACPI: scan: Add acpi_dev_get_next_consumer_dev()
  ACPI: property: Use acpi_dev_parent()
  ACPI: Drop redundant acpi_dev_parent() header
  ACPI: PM: Fix NULL argument handling in acpi_device_get/set_power()
  ACPI: Drop parent field from struct acpi_device
  ACPI: scan: Eliminate __acpi_device_add()
  ACPI: scan: Rearrange initialization of ACPI device objects
  ACPI: scan: Rename acpi_bus_get_parent() and rearrange it
  ACPI: Rename acpi_bus_get/put_acpi_device()
2022-09-30 20:05:16 +02:00
Kees Cook fb2d14add4 Drivers: hv: vmbus: Split memcpy of flex-array
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), split the memcpy() of the header and the payload
so no false positive run-time overflow warning will be generated. This
results in the already inlined memcpy getting unrolled a little more,
which very slightly increases text size:

$ size drivers/hv/vmbus_drv.o.before drivers/hv/vmbus_drv.o
   text    data     bss     dec     hex filename
  22968    5239     232   28439    6f17 drivers/hv/vmbus_drv.o.before
  23032    5239     232   28503    6f57 drivers/hv/vmbus_drv.o

Avoids the run-time false-positive warning:

  memcpy: detected field-spanning write (size 212) of single field "&ctx->msg" at drivers/hv/vmbus_drv.c:1133 (size 16)

[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org/

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220927211736.3241175-1-keescook@chromium.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-28 13:35:26 +00:00
Stanislav Kinsburskiy f7ac541e18 Drivers: hv: vmbus: Don't wait for the ACPI device upon initialization
Waiting to 5 seconds in case of missing VMBus ACPI device is redundant as the
device is either present already or won't be available at all.

This patch enforces synchronous probing to make sure the bus traversal,
happening upon driver registering will either find the device (if present) or
not spend any additional time if device is absent.

Signed-off-by: Stanislav Kinsburskiy <stanislav.kinsburskiy@gmail.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>
CC: Wei Liu <wei.liu@kernel.org>
CC: Dexuan Cui <decui@microsoft.com>
CC: linux-hyperv@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/166378554568.581670.1124852716698789244.stgit@skinsburskii-cloud-desktop.internal.cloudapp.net
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-23 10:40:09 +00:00
Easwar Hariharan a99aaf2e3b Drivers: hv: vmbus: Use PCI_VENDOR_ID_MICROSOFT for better discoverability
pci_ids.h already defines PCI_VENDOR_ID_MICROSOFT, and is included via
linux/pci.h. Use the define instead of the magic number.

Signed-off-by: Easwar Hariharan <easwar.hariharan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1663625084-2518-2-git-send-email-eahariha@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-23 10:30:37 +00:00
Jiapeng Chong e1a863cddb Drivers: hv: vmbus: Fix kernel-doc
drivers/hv/vmbus_drv.c:1587: warning: expecting prototype for __vmbus_child_driver_register(). Prototype was for __vmbus_driver_register() instead.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2210
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220919063815.1881-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-23 10:29:08 +00:00
Vitaly Kuznetsov f0880e2cb7 Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region
Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.

$ cat /proc/iomem
...
f8000000-fffbffff : PCI Bus 0000:00
  f8000000-fbffffff : 0000:00:08.0
    f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
...
fe0000000-fffffffff : PCI Bus 0000:00
  fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
    fe0000000-fe07fffff : 2ba2:00:02.0
      fe0000000-fe07fffff : mlx4_core

the interesting part is the 'f8000000' region as it is actually the
VM's framebuffer:

$ lspci -v
...
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
	Flags: bus master, fast devsel, latency 0, IRQ 11
	Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
...

 hv_vmbus: registering driver hyperv_drm
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
 hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
 hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?

Note: "Cannot request framebuffer" is not a fatal error in
hyperv_setup_gen1() as the code assumes there's some other framebuffer
device there but we actually have some other PCI device (mlx4 in this
case) config space there!

The problem appears to be that vmbus_allocate_mmio() can use dedicated
framebuffer region to serve any MMIO request from any device. The
semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.
The existing semantics are really "prefer_fb_overlap". This patch
implements the expected and needed semantics, which is to not allocate
from the frame buffer space when !fb_overlap_ok.

Note, Gen2 VMs are usually unaffected by the issue because
framebuffer region is already taken by EFI fb (in case kernel supports
it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
load after it. Devices can be brought up in any sequence so let's
resolve the issue by always ignoring 'fb_mmio' region for non-FB
requests, even if the region is unclaimed.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20220827130345.1320254-4-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-05 17:02:15 +00:00
Vitaly Kuznetsov 2a8a8afba0 Drivers: hv: Always reserve framebuffer region for Gen1 VMs
vmbus_reserve_fb() tries reserving framebuffer region iff
'screen_info.lfb_base' is set. Gen2 VMs seem to have it set by EFI
and/or by the kernel EFI FB driver (or, in some edge cases like kexec,
the address where the buffer was moved, see
https://lore.kernel.org/all/20201014092429.1415040-1-kasong@redhat.com/)
but on Gen1 VM it depends on bootloader behavior. With grub, it depends
on 'gfxpayload=' setting but in some cases it is observed to be zero.
That being said, relying on 'screen_info.lfb_base' to reserve
framebuffer region is risky. For Gen1 VMs, it should always be
possible to get the address from the dedicated PCI device instead.

Check for legacy PCI video device presence and reserve the whole
region for framebuffer on Gen1 VMs.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20220827130345.1320254-3-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-09-05 17:02:15 +00:00
Rafael J. Wysocki 62fcb99bdf ACPI: Drop parent field from struct acpi_device
The parent field in struct acpi_device is, in fact, redundant,
because the dev.parent field in it effectively points to the same
object and it is used by the driver core.

Accordingly, the parent field can be dropped from struct acpi_device
and for this purpose define acpi_dev_parent() to retrieve a parent
struct acpi_device pointer from the dev.parent field in struct
acpi_device.  Next, update all of the users of the parent field
in struct acpi_device to use acpi_dev_parent() instead of it and
drop it.

While at it, drop the ACPI_IS_ROOT_DEVICE() macro that is only used
in one place in a confusing way.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com>
2022-08-24 20:55:24 +02:00
Shradha Gupta 52be93558a Drivers: hv: vm_bus: Handle vmbus rescind calls after vmbus is suspended
Add a flag to indicate that the vmbus is suspended so we should ignore
any offer message. Add a new work_queue for rescind msg, so we could drain
it along with other offer work_queues upon suspension.
It was observed that in some hibernation related scenario testing, after
vmbus_bus_suspend() we get rescind offer message for the vmbus. This would
lead to processing of a rescind message for a channel that has already been
suspended.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220711041147.GA5569@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-07-11 18:53:42 +00:00
Saurabh Sengar 6640b5df1a Drivers: hv: vmbus: Don't assign VMbus channel interrupts to isolated CPUs
When initially assigning a VMbus channel interrupt to a CPU, don’t choose
a managed IRQ isolated CPU (as specified on the kernel boot line with
parameter 'isolcpus=managed_irq,<#cpu>'). Also, when using sysfs to change
the CPU that a VMbus channel will interrupt, don't allow changing to a
managed IRQ isolated CPU.

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1653637439-23060-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-08 12:24:50 +00:00
Linus Torvalds 500a434fc5 Driver core changes for 5.19-rc1
Here is the set of driver core changes for 5.19-rc1.
 
 Note, I'm not really happy with this pull request as-is, see below for
 details, but overall this is all good for everything but a small set of
 systems, which we have a fix for already.
 
 Lots of tiny driver core changes and cleanups happened this cycle,
 but the two major things were:
 
 	- firmware_loader reorganization and additions including the
 	  ability to have XZ compressed firmware images and the ability
 	  for userspace to initiate the firmware load when it needs to,
 	  instead of being always initiated by the kernel. FPGA devices
 	  specifically want this ability to have their firmware changed
 	  over the lifetime of the system boot, and this allows them to
 	  work without having to come up with yet-another-custom-uapi
 	  interface for loading firmware for them.
 	- physical location support added to sysfs so that devices that
 	  know this information, can tell userspace where they are
 	  located in a common way.  Some ACPI devices already support
 	  this today, and more bus types should support this in the
 	  future.
 
 Smaller changes included:
 	- driver_override api cleanups and fixes
 	- error path cleanups and fixes
 	- get_abi script fixes
 	- deferred probe timeout changes.
 
 It's that last change that I'm the most worried about.  It has been
 reported to cause boot problems for a number of systems, and I have a
 tested patch series that resolves this issue.  But I didn't get it
 merged into my tree before 5.18-final came out, so it has not gotten any
 linux-next testing.
 
 I'll send the fixup patches (there are 2) as a follow-on series to this
 pull request if you want to take them directly, _OR_ I can just revert
 the probe timeout changes and they can wait for the next -rc1 merge
 cycle.  Given that the fixes are tested, and pretty simple, I'm leaning
 toward that choice.  Sorry this all came at the end of the merge window,
 I should have resolved this all 2 weeks ago, that's my fault as it was
 in the middle of some travel for me.
 
 All have been tested in linux-next for weeks, with no reported issues
 other than the above-mentioned boot time outs.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
 GkjLXBGNWOPBa5cU52qf
 =yEi/
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of driver core changes for 5.19-rc1.

  Lots of tiny driver core changes and cleanups happened this cycle, but
  the two major things are:

   - firmware_loader reorganization and additions including the ability
     to have XZ compressed firmware images and the ability for userspace
     to initiate the firmware load when it needs to, instead of being
     always initiated by the kernel. FPGA devices specifically want this
     ability to have their firmware changed over the lifetime of the
     system boot, and this allows them to work without having to come up
     with yet-another-custom-uapi interface for loading firmware for
     them.

   - physical location support added to sysfs so that devices that know
     this information, can tell userspace where they are located in a
     common way. Some ACPI devices already support this today, and more
     bus types should support this in the future.

  Smaller changes include:

   - driver_override api cleanups and fixes

   - error path cleanups and fixes

   - get_abi script fixes

   - deferred probe timeout changes.

  It's that last change that I'm the most worried about. It has been
  reported to cause boot problems for a number of systems, and I have a
  tested patch series that resolves this issue. But I didn't get it
  merged into my tree before 5.18-final came out, so it has not gotten
  any linux-next testing.

  I'll send the fixup patches (there are 2) as a follow-on series to this
  pull request.

  All have been tested in linux-next for weeks, with no reported issues
  other than the above-mentioned boot time-outs"

* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  driver core: fix deadlock in __device_attach
  kernfs: Separate kernfs_pr_cont_buf and rename_lock.
  topology: Remove unused cpu_cluster_mask()
  driver core: Extend deferred probe timeout on driver registration
  MAINTAINERS: add Russ Weight as a firmware loader maintainer
  driver: base: fix UAF when driver_attach failed
  test_firmware: fix end of loop test in upload_read_show()
  driver core: location: Add "back" as a possible output for panel
  driver core: location: Free struct acpi_pld_info *pld
  driver core: Add "*" wildcard support to driver_async_probe cmdline param
  driver core: location: Check for allocations failure
  arch_topology: Trace the update thermal pressure
  kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
  export: fix string handling of namespace in EXPORT_SYMBOL_NS
  rpmsg: use local 'dev' variable
  rpmsg: Fix calling device_lock() on non-initialized device
  firmware_loader: describe 'module' parameter of firmware_upload_register()
  firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
  firmware_loader: Fix configs for sysfs split
  selftests: firmware: Add firmware upload selftests
  ...
2022-06-03 11:48:47 -07:00
Michael Kelley a6b94c6b49 Drivers: hv: vmbus: Remove support for Hyper-V 2008 and Hyper-V 2008R2/Win7
The VMbus driver has special case code for running on the first released
versions of Hyper-V: 2008 and 2008 R2/Windows 7. These versions are now
out of support (except for extended security updates) and lack the
performance features needed for effective production usage of Linux
guests.

Simplify the code by removing the negotiation of the VMbus protocol
versions required for these releases of Hyper-V, and by removing the
special case code for handling these VMbus protocol versions.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1651509391-2058-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-05-11 17:49:49 +00:00
Krzysztof Kozlowski 01ed100276 hv: Use driver_set_override() instead of open-coding
Use a helper to set driver_override to the reduce amount of duplicated
code.  Make the driver_override field const char, because it is not
modified by the core and it matches other subsystems.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:53 +02:00
Linus Torvalds 42e7a03d3b hyperv-fixes for 5.18-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmJO+AATHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXi4VB/9NvwUuqgQWxGmaSrITVPLXtwDjGQc8
 Tt3shHWYp9qRuXbX6H7K/PDvyQreLytpStj7JL8rMUsLsccHaBPGTC1czN+oGuwx
 upxKxWzkRGB8DUMD5pXuP9C/XJxVAUGJJ5sJx40HMBblsNi/PSqVzd1bIvV168g4
 hFSPzGJXsbDJZfGloQux5y4NxkVl4k8g6v7EBV0Qxiu0oFhTxJjFzuK6Rau4/ajS
 cXKIpgtjuAXExfgpvORKTs/K98e6Um42sFt5JwUShm9Yocas8POqUo7q0Qw4obcL
 4K14j9t2uGOrrfOld4kGa5Emx0lnRjXMv0EiVaA3tns7GA9//06KQhOE
 =/mS+
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Correctly propagate coherence information for VMbus devices (Michael
   Kelley)

 - Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng)

 - Use barrier to prevent reording when reading ring buffer (Michael
   Kelley)

 - Use virt_store_mb in favour of smp_store_mb (Andrea Parri)

 - Fix VMbus device object initialization (Andrea Parri)

 - Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri)

 - Fix a crash when unloading VMbus module (Guilherme G. Piccoli)

* tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
  Drivers: hv: balloon: Disable balloon and hot-add accordingly
  Drivers: hv: balloon: Support status report for larger page sizes
  Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
  PCI: hv: Propagate coherence from VMbus device to PCI device
  Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
  Drivers: hv: vmbus: Fix potential crash on module unload
  Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
  Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
2022-04-07 06:35:34 -10:00
Michael Kelley 37200078ed Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
VMbus synthetic devices are not represented in the ACPI DSDT -- only
the top level VMbus device is represented. As a result, on ARM64
coherence information in the _CCA method is not specified for
synthetic devices, so they default to not hardware coherent.
Drivers for some of these synthetic devices have been recently
updated to use the standard DMA APIs, and they are incurring extra
overhead of unneeded software coherence management.

Fix this by propagating coherence information from the VMbus node
in ACPI to the individual synthetic devices. There's no effect on
x86/x64 where devices are always hardware coherent.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1648138492-2191-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-03-29 12:12:50 +00:00
Guilherme G. Piccoli 792f232d57 Drivers: hv: vmbus: Fix potential crash on module unload
The vmbus driver relies on the panic notifier infrastructure to perform
some operations when a panic event is detected. Since vmbus can be built
as module, it is required that the driver handles both registering and
unregistering such panic notifier callback.

After commit 74347a99e7 ("x86/Hyper-V: Unload vmbus channel in hv panic callback")
though, the panic notifier registration is done unconditionally in the module
initialization routine whereas the unregistering procedure is conditionally
guarded and executes only if HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE capability
is set.

This patch fixes that by unconditionally unregistering the panic notifier
in the module's exit routine as well.

Fixes: 74347a99e7 ("x86/Hyper-V: Unload vmbus channel in hv panic callback")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220315203535.682306-1-gpiccoli@igalia.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-03-29 12:05:12 +00:00
Andrea Parri (Microsoft) 3a5469582c Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
Initialize the device's dma_{mask,parms} pointers and the device's
dma_mask value before invoking device_register().  Address the
following trace with 5.17-rc7:

[   49.646839] WARNING: CPU: 0 PID: 189 at include/linux/dma-mapping.h:543
	netvsc_probe+0x37a/0x3a0 [hv_netvsc]
[   49.646928] Call Trace:
[   49.646930]  <TASK>
[   49.646935]  vmbus_probe+0x40/0x60 [hv_vmbus]
[   49.646942]  really_probe+0x1ce/0x3b0
[   49.646948]  __driver_probe_device+0x109/0x180
[   49.646952]  driver_probe_device+0x23/0xa0
[   49.646955]  __device_attach_driver+0x76/0xe0
[   49.646958]  ? driver_allows_async_probing+0x50/0x50
[   49.646961]  bus_for_each_drv+0x84/0xd0
[   49.646964]  __device_attach+0xed/0x170
[   49.646967]  device_initial_probe+0x13/0x20
[   49.646970]  bus_probe_device+0x8f/0xa0
[   49.646973]  device_add+0x41a/0x8e0
[   49.646975]  ? hrtimer_init+0x28/0x80
[   49.646981]  device_register+0x1b/0x20
[   49.646983]  vmbus_device_register+0x5e/0xf0 [hv_vmbus]
[   49.646991]  vmbus_add_channel_work+0x12d/0x190 [hv_vmbus]
[   49.646999]  process_one_work+0x21d/0x3f0
[   49.647002]  worker_thread+0x4a/0x3b0
[   49.647005]  ? process_one_work+0x3f0/0x3f0
[   49.647007]  kthread+0xff/0x130
[   49.647011]  ? kthread_complete_and_exit+0x20/0x20
[   49.647015]  ret_from_fork+0x22/0x30
[   49.647020]  </TASK>
[   49.647021] ---[ end trace 0000000000000000 ]---

Fixes: 743b237c3a ("scsi: storvsc: Add Isolation VM support for storvsc driver")
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220315141053.3223-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-03-29 12:03:25 +00:00
Andrea Parri (Microsoft) 9f8b577f7b Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
hv_panic_page might contain guest-sensitive information, do not dump it
over to Hyper-V by default in isolated guests.

While at it, update some comments in hyperv_{panic,die}_event().

Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/20220301141135.2232-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-03-29 11:59:44 +00:00
Linus Torvalds 66711cfea6 hyperv-next for 5.18
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmI6MhgTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXiBkB/9FEaebcytvBMNVnz/fprfQWfYxdrUB
 wmDB9dnJVPX6UcWg6DuDv7KqatimDp7JrnKrNsrqlgF6Wafn/wI3Cdf8ZUi8CgtZ
 Walq8m2QM9j8WCm4EtgmluzGkHjAn6llmjE7XoLmo/A8MYp0FZhJzghKfCEu54EB
 9DhgqQX+zzy7n32CxoAWrGrHegnGeqMpj3rhQ4qVuNn+bOzZ/wTyJQPmXTu4hSW1
 0PrXRnfzqnODkB6QDUantABZM5bh1VsqwIlENp+xAFQVsQpsc35lW7AUg7JXczVn
 ES0+UHG7wVWPJCtIfD4FgIFT1xekPKxRYX9U66LMRbWGYhXfohQX+wFt
 =u62V
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:
 "Minor patches from various people"

* tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Output host build info as normal Windows version number
  hv_balloon: rate-limit "Unhandled message" warning
  drivers: hv: log when enabling crash_kexec_post_notifiers
  hv_utils: Add comment about max VMbus packet size in VSS driver
  Drivers: hv: Compare cpumasks and not their weights in init_vp_index()
  Drivers: hv: Rename 'alloced' to 'allocated'
  Drivers: hv: vmbus: Use struct_size() helper in kmalloc()
2022-03-24 12:30:37 -07:00
Michael Kelley 6bf625a414 Drivers: hv: vmbus: Rework use of DMA_BIT_MASK(64)
Using DMA_BIT_MASK(64) as an initializer for a global variable
causes problems with Clang 12.0.1. The compiler doesn't understand
that value 64 is excluded from the shift at compile time, resulting
in a build error.

While this is a compiler problem, avoid the issue by setting up
the dma_mask memory as part of struct hv_device, and initialize
it using dma_set_mask().

Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 743b237c3a ("scsi: storvsc: Add Isolation VM support for storvsc driver")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/1644176216-12531-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-07 17:55:30 +00:00
Miaoqian Lin 8bc69f8632 Drivers: hv: vmbus: Fix memory leak in vmbus_add_channel_kobj
kobject_init_and_add() takes reference even when it fails.
According to the doc of kobject_init_and_add():

   If this function returns an error, kobject_put() must be called to
   properly clean up the memory associated with the object.

Fix memory leak by calling kobject_put().

Fixes: c2e5df616e ("vmbus: add per-channel sysfs info")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Juan Vazquez <juvazq@linux.microsoft.com>
Link: https://lore.kernel.org/r/20220203173008.43480-1-linmq006@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-07 17:52:22 +00:00
Vitaly Kuznetsov de96e8a098 Drivers: hv: Rename 'alloced' to 'allocated'
'Alloced' is not a real word and only saves us two letters, let's
use 'allocated' instead.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220128103412.3033736-2-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-03 12:47:31 +00:00
Gustavo A. R. Silva a70d298c44 Drivers: hv: vmbus: Use struct_size() helper in kmalloc()
Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/hv/vmbus_drv.c:1132:31: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220125180131.GA67746@embeddedor
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-03 12:42:31 +00:00
Linus Torvalds cb3f09f9af hyperv-next for 5.17
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmHhw7oTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXrjSB/979LV4Dn1PMcFYsSdlFEMeHcjzJdw/
 kFnLPXMaPJyfg6QPuf83jxzw9uxw8fcePMdVq/FFBtmVV9fJMAv62B8jaGS1p58c
 WnAg+7zsTN+xEoJn+tskSSon8BNMWVrl41zP3K4Ged+5j8UEBk62GB8Orz1qkpwL
 fTh3/+xAvczJeD4zZb1dAm4WnmcQJ4vhg45p07jX6owvnwQAikMFl45aSW54I5o8
 vAxGzFgdsZ2NtExnRNKh3b3DozA8JUE89KckBSZnDtq4rH8Fyy6Wij56Hc6v6Cml
 SUohiNbHX7hsNwit/lxL8wuF97IiA0pQSABobEg3rxfTghTUep51LlaN
 =/m4A
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - More patches for Hyper-V isolation VM support (Tianyu Lan)

 - Bug fixes and clean-up patches from various people

* tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  scsi: storvsc: Fix storvsc_queuecommand() memory leak
  x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi()
  Drivers: hv: vmbus: Initialize request offers message for Isolation VM
  scsi: storvsc: Fix unsigned comparison to zero
  swiotlb: Add CONFIG_HAS_IOMEM check around swiotlb_mem_remap()
  x86/hyperv: Fix definition of hv_ghcb_pg variable
  Drivers: hv: Fix definition of hypercall input & output arg variables
  net: netvsc: Add Isolation VM support for netvsc driver
  scsi: storvsc: Add Isolation VM support for storvsc driver
  hyper-v: Enable swiotlb bounce buffer for Isolation VM
  x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has()
  swiotlb: Add swiotlb bounce buffer remap function for HV IVM
2022-01-16 15:53:00 +02:00
Sebastian Andrzej Siewior 703f7066f4 random: remove unused irq_flags argument from add_interrupt_randomness()
Since commit
   ee3e00e9e7 ("random: use registers from interrupted code for CPU's w/o a cycle counter")

the irq_flags argument is no longer used.

Remove unused irq_flags.

Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-07 00:25:25 +01:00
Tianyu Lan 743b237c3a scsi: storvsc: Add Isolation VM support for storvsc driver
In Isolation VM, all shared memory with host needs to mark visible
to host via hvcall. vmbus_establish_gpadl() has already done it for
storvsc rx/tx ring buffer. The page buffer used by vmbus_sendpacket_
mpb_desc() still needs to be handled. Use DMA API(scsi_dma_map/unmap)
to map these memory during sending/receiving packet and return swiotlb
bounce buffer dma address. In Isolation VM, swiotlb  bounce buffer is
marked to be visible to host and the swiotlb force mode is enabled.

Set device's dma min align mask to HV_HYP_PAGE_SIZE - 1 in order to
keep the original data offset in the bounce buffer.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20211213071407.314309-5-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-12-20 18:01:09 +00:00
Uwe Kleine-König fc7a6209d5 bus: Make remove callback return void
The driver core ignores the return value of this callback because there
is only little it can do when a device disappears.

This is the final bit of a long lasting cleanup quest where several
buses were converted to also return void from their remove callback.
Additionally some resource leaks were fixed that were caused by drivers
returning an error code in the expectation that the driver won't go
away.

With struct bus_type::remove returning void it's prevented that newly
implemented buses return an ignored error code and so don't anticipate
wrong expectations for driver authors.

Reviewed-by: Tom Rix <trix@redhat.com> (For fpga)
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio)
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts)
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb)
Acked-by: Pali Rohár <pali@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media)
Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform)
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com> (For xen)
Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd)
Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb)
Acked-by: Johan Hovold <johan@kernel.org>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus)
Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio)
Acked-by: Maximilian Luz <luzmaximilian@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec)
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack)
Acked-by: Geoff Levand <geoff@infradead.org> (For ps3)
Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt)
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th)
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia)
Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI)
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr)
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid)
Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM)
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa)
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire)
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid)
Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox)
Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss)
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC)
Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-21 11:53:42 +02:00
Andy Shevchenko f39650de68 kernel.h: split out panic and oops helpers
kernel.h is being used as a dump for all kinds of stuff for a long time.
Here is the attempt to start cleaning it up by splitting out panic and
oops helpers.

There are several purposes of doing this:
- dropping dependency in bug.h
- dropping a loop by moving out panic_notifier.h
- unload kernel.h from something which has its own domain

At the same time convert users tree-wide to use new headers, although for
the time being include new header back to kernel.h to avoid twisted
indirected includes for existing users.

[akpm@linux-foundation.org: thread_info.h needs limits.h]
[andriy.shevchenko@linux.intel.com: ia64 fix]
  Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com

Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:04 -07:00