Commit graph

386 commits

Author SHA1 Message Date
Jason Gunthorpe
aa0958570f iommu: Add iommu_init/deinit_device() paired functions
Move the driver init and destruction code into two logically paired
functions.

There is a subtle ordering dependency in how the group's domains are
freed, the current code does the kobject_put() on the group which will
hopefully trigger the free of the domains before the module_put() that
protects the domain->ops.

Reorganize this to be explicit and documented. The domains are cleaned up
by iommu_deinit_device() if it is the last device to be deinit'd from the
group.  This must be done in a specific order - after
ops->release_device() and before the module_put(). Make it very clear and
obvious by putting the order directly in one function.

Leave WARN_ON's in case the refcounting gets messed up somehow.

This also moves the module_put() and dev_iommu_free() under the
group->mutex to keep the code simple.

Building paired functions like this helps ensure that error cleanup flows
in __iommu_probe_device() are correct because they share the same code
that handles the normal flow. These details become relavent as following
patches add more error unwind into __iommu_probe_device(), and ultimately
a following series adds fine-grained locking to __iommu_probe_device().

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-07-14 16:14:14 +02:00
Jason Gunthorpe
df15d76dca iommu: Simplify the __iommu_group_remove_device() flow
Instead of returning the struct group_device and then later freeing it, do
the entire free under the group->mutex and defer only putting the
iommu_group.

It is safe to remove the sysfs_links and free memory while holding that
mutex.

Move the sanity assert of the group status into
__iommu_group_free_device().

The next patch will improve upon this and consolidate the group put and
the mutex into __iommu_group_remove_device().

__iommu_group_free_device() is close to being the paired undo of
iommu_group_add_device(), following patches will improve on that.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-07-14 16:14:13 +02:00
Jason Gunthorpe
7bdb99622f iommu: Inline iommu_group_get_for_dev() into __iommu_probe_device()
This is the only caller, and it doesn't need the generality of the
function. We already know there is no iommu_group, so it is simply two
function calls.

Moving it here allows the following patches to split the logic in these
functions.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-07-14 16:14:13 +02:00
Jason Gunthorpe
5665d15d3c iommu: Use iommu_group_ref_get/put() for dev->iommu_group
No reason to open code this, use the proper helper functions.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-07-14 16:14:12 +02:00
Jason Gunthorpe
6eb4da8cf5 iommu: Have __iommu_probe_device() check for already probed devices
This is a step toward making __iommu_probe_device() self contained.

It should, under proper locking, check if the device is already associated
with an iommu driver and resolve parallel probes. All but one of the
callers open code this test using two different means, but they all
rely on dev->iommu_group.

Currently the bus_iommu_probe()/probe_iommu_group() and
probe_acpi_namespace_devices() rejects already probed devices with an
unlocked read of dev->iommu_group. The OF and ACPI "replay" functions use
device_iommu_mapped() which is the same read without the pointless
refcount.

Move this test into __iommu_probe_device() and put it under the
iommu_probe_device_lock. The store to dev->iommu_group is in
iommu_group_add_device() which is also called under this lock for iommu
driver devices, making it properly locked.

The only path that didn't have this check is the hotplug path triggered by
BUS_NOTIFY_ADD_DEVICE. The only way to get dev->iommu_group assigned
outside the probe path is via iommu_group_add_device(). Today the only
caller is VFIO no-iommu which never associates with an iommu driver. Thus
adding this additional check is safe.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-07-14 16:14:12 +02:00
Linus Torvalds
d35ac6ac0e IOMMU Updates for Linux v6.5
Including:
 
 	- Core changes:
 	  - iova_magazine_alloc() optimization
 	  - Make flush-queue an IOMMU driver capability
 	  - Consolidate the error handling around device attachment
 
 	- AMD IOMMU changes:
 	  - AVIC Interrupt Remapping Improvements
 	  - Some minor fixes and cleanups
 
 	- Intel VT-d changes from Lu Baolu:
 	  - Small and misc cleanups
 
 	- ARM-SMMU changes from Will Deacon:
 	  - Device-tree binding updates:
 	    * Add missing clocks for SC8280XP and SA8775 Adreno SMMUs
 	    * Add two new Qualcomm SMMUs in SDX75 and SM6375
 	  - Workarounds for Arm MMU-700 errata:
 	    * 1076982: Avoid use of SEV-based cmdq wakeup
 	    * 2812531: Terminate command batches with a CMD_SYNC
 	    * Enforce single-stage translation to avoid nesting-related errata
 	  - Set the correct level hint for range TLB invalidation on teardown
 
 	- Some other minor fixes and cleanups (including Freescale PAMU and
 	  virtio-iommu changes)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmSVnS0ACgkQK/BELZcB
 GuM4txAAvtE5pMxM4V/9uTJt+de/vd8XiaH2kfQEULCJm2Yz07Z5+oE+QRtjPc2D
 No+98IGMJCNOg+U+6JZ8P2GR3/soFvKdYjhY/iKTXK+C6jiy3dStIFN/KzzHkbpu
 Y/fUZ5B+DizTO6837osDWIdAz3PcwV3Vk/ogHe3FoHWU13RJYOMp2FAox0QreBNE
 kb7tK3ki/RCasbF9rMt9ClB0SZEVDysRkYF7AtXtsMNVm5jpQAITXVcNUYMeaJFL
 n0J8hjn3EiZj7dgzxbL5bRgDyfPadwJkWz2BxkQ6x0gopgHu0EimGL8p2Bei2f8x
 lv2y692L6zZth2ZgjSkecf3Lo4YHirsP/1U1zrLDjEgeBZ0vRxiX0qsvCb9692C1
 +shy5jOX22ub+zJ2UFHMNGKu3ZdhcKi+meejdqM/GrHcRfZABh26bQILFnPF3Oxp
 2WFb2v7Hq9qdQP50jsGbLji6n165aRW969fBdsk1uDUoCDHNOcdHQS3FsiKAAz5d
 /Z/3PR9tQgnF9bDXJB6RbGJ1rQxHlfvarOQCAYiC02ALj4FnuSLiFSBLe1bI4InR
 AgmnQaH2jmFMWHibdvj3q3sm33sLhOjmAE+ZX0YOhFfgrRGHq88qRwV53IfW477E
 8a+6A+tnu28axk7yVJMvvz5/PeYkD2CMeplYQycUiaQutjvN9sk=
 =aRMe
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - iova_magazine_alloc() optimization
   - Make flush-queue an IOMMU driver capability
   - Consolidate the error handling around device attachment

  AMD IOMMU changes:
   - AVIC Interrupt Remapping Improvements
   - Some minor fixes and cleanups

  Intel VT-d changes from Lu Baolu:
   - Small and misc cleanups

  ARM-SMMU changes from Will Deacon:
   - Device-tree binding updates:
      - Add missing clocks for SC8280XP and SA8775 Adreno SMMUs
      - Add two new Qualcomm SMMUs in SDX75 and SM6375
   - Workarounds for Arm MMU-700 errata:
      - 1076982: Avoid use of SEV-based cmdq wakeup
      - 2812531: Terminate command batches with a CMD_SYNC
      - Enforce single-stage translation to avoid nesting-related errata
   - Set the correct level hint for range TLB invalidation on teardown

  .. and some other minor fixes and cleanups (including Freescale PAMU
  and virtio-iommu changes)"

* tag 'iommu-updates-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (50 commits)
  iommu/vt-d: Remove commented-out code
  iommu/vt-d: Remove two WARN_ON in domain_context_mapping_one()
  iommu/vt-d: Handle the failure case of dmar_reenable_qi()
  iommu/vt-d: Remove unnecessary (void*) conversions
  iommu/amd: Remove extern from function prototypes
  iommu/amd: Use BIT/BIT_ULL macro to define bit fields
  iommu/amd: Fix DTE_IRQ_PHYS_ADDR_MASK macro
  iommu/amd: Fix compile error for unused function
  iommu/amd: Improving Interrupt Remapping Table Invalidation
  iommu/amd: Do not Invalidate IRT when IRTE caching is disabled
  iommu/amd: Introduce Disable IRTE Caching Support
  iommu/amd: Remove the unused struct amd_ir_data.ref
  iommu/amd: Switch amd_iommu_update_ga() to use modify_irte_ga()
  iommu/arm-smmu-v3: Set TTL invalidation hint better
  iommu/arm-smmu-v3: Document nesting-related errata
  iommu/arm-smmu-v3: Add explicit feature for nesting
  iommu/arm-smmu-v3: Document MMU-700 erratum 2812531
  iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982
  dt-bindings: arm-smmu: Add SDX75 SMMU compatible
  dt-bindings: arm-smmu: Add SM6375 GPU SMMU
  ...
2023-06-29 20:51:03 -07:00
Robin Murphy
cb147bbe22 dma-mapping: name SG DMA flag helpers consistently
sg_is_dma_bus_address() is inconsistent with the naming pattern of its
corresponding setters and its own kerneldoc, so take the majority vote and
rename it sg_dma_is_bus_address() (and fix up the missing underscores in
the kerneldoc too).  This gives us a nice clear pattern where SG DMA flags
are SG_DMA_<NAME>, and the helpers for acting on them are
sg_dma_<action>_<name>().

Link: https://lkml.kernel.org/r/20230612153201.554742-14-catalin.marinas@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
  Link: https://lore.kernel.org/r/fa2eca2862c7ffc41b50337abffb2dfd2864d3ea.1685036694.git.robin.murphy@arm.com
Tested-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:19:22 -07:00
Jason Gunthorpe
5957c19305 iommu: Tidy the control flow in iommu_group_store_type()
Use a normal "goto unwind" instead of trying to be clever with checking
!ret and manually managing the unlock.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/17-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:58 +02:00
Jason Gunthorpe
e996c12d76 iommu: Remove __iommu_group_for_each_dev()
The last two users of it are quite trivial, just open code the one line
loop.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/16-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:58 +02:00
Jason Gunthorpe
1000dccd5d iommu: Allow IOMMU_RESV_DIRECT to work on ARM
For now several ARM drivers do not allow mappings to be created until a
domain is attached. This means they do not technically support
IOMMU_RESV_DIRECT as it requires the 1:1 maps to work continuously.

Currently if the platform requests these maps on ARM systems they are
silently ignored.

Work around this by trying again to establish the direct mappings after
the domain is attached if the pre-attach attempt failed.

In the long run the drivers will be fixed to fully setup domains when they
are created without waiting for attachment.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/15-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:57 +02:00
Jason Gunthorpe
d99be00f42 iommu: Consolidate the default_domain setup to one function
Make iommu_change_dev_def_domain() general enough to setup the initial
default_domain or replace it with a new default_domain. Call the new
function iommu_setup_default_domain() and make it the only place in the
code that stores to group->default_domain.

Consolidate the three copies of the default_domain setup sequence. The flow
flow requires:

 - Determining the domain type to use
 - Checking if the current default domain is the same type
 - Allocating a domain
 - Doing iommu_create_device_direct_mappings()
 - Attaching it to devices
 - Store group->default_domain

This adjusts the domain allocation from the prior patch to be able to
detect if each of the allocation steps is already the domain we already
have, which is a more robust version of what change default domain was
already doing.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/14-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:57 +02:00
Jason Gunthorpe
fcbb0a4d73 iommu: Revise iommu_group_alloc_default_domain()
Robin points out that the fallback to guessing what domains the driver
supports should only happen if the driver doesn't return a preference from
its ops->def_domain_type().

Re-organize iommu_group_alloc_default_domain() so it internally uses
iommu_def_domain_type only during the fallback and makes it clearer how
the fallback sequence works.

Make iommu_group_alloc_default_domain() return the domain so the return
based logic is cleaner and to prepare for the next patch.

Remove the iommu_alloc_default_domain() function as it is now trivially
just calling iommu_group_alloc_default_domain().

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/13-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:56 +02:00
Jason Gunthorpe
8b4eb75ee5 iommu: Consolidate the code to calculate the target default domain type
Put all the code to calculate the default domain type into one
function. Make the function able to handle the
iommu_change_dev_def_domain() by taking in the target domain type and
erroring out if the target type isn't reachable.

This makes it really clear that specifying a 0 type during
iommu_change_dev_def_domain() will have the same outcome as the normal
probe path.

Remove the obfuscating use of __iommu_group_for_each_dev() and related
struct __group_domain_type.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/12-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:56 +02:00
Jason Gunthorpe
dfddd54dc7 iommu: Remove the assignment of group->domain during default domain alloc
group->domain should only be set once all the device's drivers have
had their ops->attach_dev() called. iommu_group_alloc_default_domain()
doesn't do this, so it shouldn't set the value.

The previous patches organized things so that each caller of
iommu_group_alloc_default_domain() follows up with calling
__iommu_group_set_domain_internal() that does set the group->domain.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:55 +02:00
Jason Gunthorpe
152431e4fe iommu: Do iommu_group_create_direct_mappings() before attach
The iommu_probe_device() path calls iommu_create_device_direct_mappings()
after attaching the device.

IOMMU_RESV_DIRECT maps need to be continually in place, so if a hotplugged
device has new ranges the should have been mapped into the default domain
before it is attached.

Move the iommu_create_device_direct_mappings() call up.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:55 +02:00
Jason Gunthorpe
e7f85dfbbc iommu: Fix iommu_probe_device() to attach the right domain
The general invariant is that all devices in an iommu_group are attached
to group->domain. We missed some cases here where an owned group would not
get the device attached.

Rework this logic so it follows the default domain flow of the
bus_iommu_probe() - call iommu_alloc_default_domain(), then use
__iommu_group_set_domain_internal() to set up all the devices.

Finally always attach the device to the current domain if it is already
set.

This is an unlikely functional issue as iommufd uses iommu_attach_group().
It is possible to hot plug in a new group member, add a vfio driver to it
and then hot add it to an existing iommufd. In this case it is required
that the core code set the iommu_domain properly since iommufd won't call
iommu_attach_group() again.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/9-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:54 +02:00
Jason Gunthorpe
2f74198ae0 iommu: Replace iommu_group_do_dma_first_attach with __iommu_device_set_domain
Since __iommu_device_set_domain() now knows how to handle deferred attach
we can just call it directly from the only call site.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:54 +02:00
Jason Gunthorpe
0046a4337e iommu: Remove iommu_group_do_dma_first_attach() from iommu_group_add_device()
This function is only used to construct the groups, it should not be
operating the iommu driver.

External callers in VFIO and POWER do not have any iommu drivers on the
devices so group->domain will be NULL.

The only internal caller is from iommu_probe_device() which already calls
iommu_group_do_dma_first_attach(), meaning we are calling it twice in the
only case it matters.

Since iommu_probe_device() is the logical place to sort out the group's
domain, remove the call from iommu_group_add_device().

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:53 +02:00
Jason Gunthorpe
d257344c66 iommu: Replace __iommu_group_dma_first_attach() with set_domain
Reorganize the attach_deferred logic to set dev->iommu->attach_deferred
immediately during probe and then have __iommu_device_set_domain() check
it and not attach the default_domain.

This is to prepare for removing the group->domain set from
iommu_group_alloc_default_domain() by calling __iommu_group_set_domain()
to set the group->domain.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:53 +02:00
Jason Gunthorpe
4c8ad9da05 iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain()
This is missing re-attach error handling if the attach fails, use the
common code.

The ugly "group->domain = prev_domain" will be cleaned in a later patch.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:52 +02:00
Jason Gunthorpe
ecd60dc5d2 iommu: Use __iommu_group_set_domain() for __iommu_attach_group()
The error recovery here matches the recovery inside
__iommu_group_set_domain(), so just use it directly.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:52 +02:00
Jason Gunthorpe
dcf40ed3a2 iommu: Make __iommu_group_set_domain() handle error unwind
Let's try to have a consistent and clear strategy for error handling
during domain attach failures.

There are two broad categories, the first is callers doing destruction and
trying to set the domain back to a previously good domain. These cases
cannot handle failure during destruction flows and must succeed, or at
least avoid a UAF on the current group->domain which is likely about to be
freed.

Many of the drivers are well behaved here and will not hit the WARN_ON's
or a UAF, but some are doing hypercalls/etc that can fail unpredictably
and don't meet the expectations.

The second case is attaching a domain for the first time in a failable
context, failure should restore the attachment back to group->domain using
the above unfailable operation.

Have __iommu_group_set_domain_internal() execute a common algorithm that
tries to achieve this, and in the worst case, would leave a device
"detached" or assigned to a global blocking domain. This relies on some
existing common driver behaviors where attach failure will also do detatch
and true IOMMU_DOMAIN_BLOCK implementations that are not allowed to ever
fail.

Name the first case with __iommu_group_set_domain_nofail() to make it
clear.

Pull all the error handling and WARN_ON generation into
__iommu_group_set_domain_internal().

Avoid the obfuscating use of __iommu_group_for_each_dev() and be more
careful about what should happen during failures by only touching devices
we've already touched.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:51 +02:00
Jason Gunthorpe
3006b15b36 iommu: Add for_each_group_device()
Convenience macro to iterate over every struct group_device in the group.

Replace all open coded list_for_each_entry's with this macro.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:51 +02:00
Jason Gunthorpe
4db0e5f887 iommu: Replace iommu_group_device_count() with list_count_nodes()
No reason to wrapper a standard function, just call the library directly.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-23 08:15:50 +02:00
Florian Fainelli
32261d1094 iommu: Suppress empty whitespaces in prints
If IOMMU_CMD_LINE_DMA_API or IOMMU_CMD_LINE_STRICT are not set in
iommu_cmd_line, we will be emitting a whitespace before the newline.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230509191049.1752259-1-f.fainelli@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-22 17:41:51 +02:00
Robin Murphy
a4fdd97622 iommu: Use flush queue capability
It remains really handy to have distinct DMA domain types within core
code for the sake of default domain policy selection, but we can now
hide that detail from drivers by using the new capability instead.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> # amd, intel, smmu-v3
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1c552d99e8ba452bdac48209fa74c0bdd52fd9d9.1683233867.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-05-22 17:38:45 +02:00
Linus Torvalds
58390c8ce1 IOMMU Updates for Linux 6.4
Including:
 
 	- Convert to platform remove callback returning void
 
 	- Extend changing default domain to normal group
 
 	- Intel VT-d updates:
 	    - Remove VT-d virtual command interface and IOASID
 	    - Allow the VT-d driver to support non-PRI IOPF
 	    - Remove PASID supervisor request support
 	    - Various small and misc cleanups
 
 	- ARM SMMU updates:
 	    - Device-tree binding updates:
 	        * Allow Qualcomm GPU SMMUs to accept relevant clock properties
 	        * Document Qualcomm 8550 SoC as implementing an MMU-500
 	        * Favour new "qcom,smmu-500" binding for Adreno SMMUs
 
 	    - Fix S2CR quirk detection on non-architectural Qualcomm SMMU
 	      implementations
 
 	    - Acknowledge SMMUv3 PRI queue overflow when consuming events
 
 	    - Document (in a comment) why ATS is disabled for bypass streams
 
 	- AMD IOMMU updates:
 	    - 5-level page-table support
 	    - NUMA awareness for memory allocations
 
 	- Unisoc driver: Support for reattaching an existing domain
 
 	- Rockchip driver: Add missing set_platform_dma_ops callback
 
 	- Mediatek driver: Adjust the dma-ranges
 
 	- Various other small fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmRONeAACgkQK/BELZcB
 GuPmpw/8C9ruxQ0JU5rcDBXQGvos4gMmxlbELMrBpbbiTtdb35xchpKfdhnECGIF
 k2SrrcF40R/S82SyzNU/eZtGKirtcXvGFraUFgu/QdCcnnqpRHs+IJMXX2NJP+it
 +0wO1uiInt3CN1ERcR4F31cDKiWjDG8bvQVE5LIyiy4KrIU5ld2G91Fkaa0R13Au
 6H+/wKkcUC6OyaGE6wPx474xBkapT20vj5AIQuAWisXJJR0wbBon1sUTo/IRKsU+
 IkNxH0W+1PNImJ+crAdf/nkOlyqoChY4ww6cm07LrOsBLIsX5bCqXfL4HvKthElD
 MEgk2SN5kfjfR5Vf29W4hZVM1CT8VbhO41I7OzaZ6X6RU2PXoldPKlgKtZGeSKn1
 9bcMpSgB0BtbttvBevSkxTo5KHFozXS2DG3DFoMB3yFMme8Th0LrhBZ9oB7NIPNw
 ntMo4K75vviC6Vvzjy4Anj/+y+Zm3W6wDDP7F12O6WZLkK5s4hrSsHUm/MQnnKQP
 muJlG870RnSl73xUQZe3cuBxktXuJ3EHqqYIPE0npzvauu8hhWcis3opf2Y+U2s8
 aBCCIgp5kTKqjHLh2e4lNCKZf1/b/dhxRcRBQhpAIb8YsjMlIJyM+G8Jz6K6gBga
 5Ld+68UQ3oHJwoLV1HCFN8jbpQ9KZn1s9+h3yrYjRAcLNiFb3nU=
 =OvTo
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Convert to platform remove callback returning void

 - Extend changing default domain to normal group

 - Intel VT-d updates:
     - Remove VT-d virtual command interface and IOASID
     - Allow the VT-d driver to support non-PRI IOPF
     - Remove PASID supervisor request support
     - Various small and misc cleanups

 - ARM SMMU updates:
     - Device-tree binding updates:
         * Allow Qualcomm GPU SMMUs to accept relevant clock properties
         * Document Qualcomm 8550 SoC as implementing an MMU-500
         * Favour new "qcom,smmu-500" binding for Adreno SMMUs

     - Fix S2CR quirk detection on non-architectural Qualcomm SMMU
       implementations

     - Acknowledge SMMUv3 PRI queue overflow when consuming events

     - Document (in a comment) why ATS is disabled for bypass streams

 - AMD IOMMU updates:
     - 5-level page-table support
     - NUMA awareness for memory allocations

 - Unisoc driver: Support for reattaching an existing domain

 - Rockchip driver: Add missing set_platform_dma_ops callback

 - Mediatek driver: Adjust the dma-ranges

 - Various other small fixes and cleanups

* tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (82 commits)
  iommu: Remove iommu_group_get_by_id()
  iommu: Make iommu_release_device() static
  iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()
  iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
  iommu/vt-d: Remove BUG_ON in map/unmap()
  iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
  iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
  iommu/vt-d: Remove BUG_ON on checking valid pfn range
  iommu/vt-d: Make size of operands same in bitwise operations
  iommu/vt-d: Remove PASID supervisor request support
  iommu/vt-d: Use non-privileged mode for all PASIDs
  iommu/vt-d: Remove extern from function prototypes
  iommu/vt-d: Do not use GFP_ATOMIC when not needed
  iommu/vt-d: Remove unnecessary checks in iopf disabling path
  iommu/vt-d: Move PRI handling to IOPF feature path
  iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
  iommu/vt-d: Move iopf code from SVA to IOPF enabling path
  iommu/vt-d: Allow SVA with device-specific IOPF
  dmaengine: idxd: Add enable/disable device IOPF feature
  arm64: dts: mt8186: Add dma-ranges for the parent "soc" node
  ...
2023-04-30 13:00:38 -07:00
Linus Torvalds
cec24b8b6b Char/Misc drivers for 6.4-rc1
Here is the "big" set of char/misc and other driver subsystems for
 6.4-rc1.
 
 It's pretty big, but due to the removal of pcmcia drivers, almost breaks
 even for number of lines added vs. removed, a nice change.
 
 Included in here are:
   - removal of unused PCMCIA drivers (finally!)
   - Interconnect driver updates and additions
   - Lots of IIO driver updates and additions
   - MHI driver updates
   - Coresight driver updates
   - NVMEM driver updates, which required some OF updates
   - W1 driver updates and a new maintainer to manage the subsystem
   - FPGA driver updates
   - New driver subsystem, CDX, for AMD systems
   - lots of other small driver updates and additions
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp5Eg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynSXgCg0kSw3vUYwpsnhAsQkoPw1QVA23sAn2edRCMa
 GEkPWjrROueCom7xbLMu
 =eR+P
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc drivers updates from Greg KH:
 "Here is the "big" set of char/misc and other driver subsystems for
  6.4-rc1.

  It's pretty big, but due to the removal of pcmcia drivers, almost
  breaks even for number of lines added vs. removed, a nice change.

  Included in here are:

   - removal of unused PCMCIA drivers (finally!)

   - Interconnect driver updates and additions

   - Lots of IIO driver updates and additions

   - MHI driver updates

   - Coresight driver updates

   - NVMEM driver updates, which required some OF updates

   - W1 driver updates and a new maintainer to manage the subsystem

   - FPGA driver updates

   - New driver subsystem, CDX, for AMD systems

   - lots of other small driver updates and additions

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (196 commits)
  mcb-lpc: Reallocate memory region to avoid memory overlapping
  mcb-pci: Reallocate memory region to avoid memory overlapping
  mcb: Return actual parsed size when reading chameleon table
  kernel/configs: Drop Android config fragments
  virt: acrn: Replace obsolete memalign() with posix_memalign()
  spmi: Add a check for remove callback when removing a SPMI driver
  spmi: fix W=1 kernel-doc warnings
  spmi: mtk-pmif: Drop of_match_ptr for ID table
  spmi: pmic-arb: Convert to platform remove callback returning void
  spmi: mtk-pmif: Convert to platform remove callback returning void
  spmi: hisi-spmi-controller: Convert to platform remove callback returning void
  w1: gpio: remove unnecessary ENOMEM messages
  w1: omap-hdq: remove unnecessary ENOMEM messages
  w1: omap-hdq: add SPDX tag
  w1: omap-hdq: allow compile testing
  w1: matrox: remove unnecessary ENOMEM messages
  w1: matrox: use inline over __inline__
  w1: matrox: switch from asm to linux header
  w1: ds2482: do not use assignment in if condition
  w1: ds2482: drop unnecessary header
  ...
2023-04-27 12:07:50 -07:00
Joerg Roedel
e51b419839 Merge branches 'iommu/fixes', 'arm/allwinner', 'arm/exynos', 'arm/mediatek', 'arm/omap', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'ppc/pamu', 'unisoc', 'x86/vt-d', 'x86/amd', 'core' and 'platform-remove_new' into next 2023-04-14 13:45:50 +02:00
Jason Gunthorpe
f7f9c054a2 iommu: Remove iommu_group_get_by_id()
This is never called.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-60bbc66d7e92+24-rm_iommu_get_by_id_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-04-14 13:09:07 +02:00
Jason Gunthorpe
e223864f82 iommu: Make iommu_release_device() static
This is not called outside the core code, and indeed cannot be called
correctly outside the bus notifier. Make it static.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-c3da18124d2d+56-rm_iommu_release_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-04-14 13:07:53 +02:00
Jerry Snitselaar
8f880d19e6 iommu/amd: Set page size bitmap during V2 domain allocation
With the addition of the V2 page table support, the domain page size
bitmap needs to be set prior to iommu core setting up direct mappings
for reserved regions. When reserved regions are mapped, if this is not
done, it will be looking at the V1 page size bitmap when determining
the page size to use in iommu_pgsize(). When it gets into the actual
amd mapping code, a check of see if the page size is supported can
fail, because at that point it is checking it against the V2 page size
bitmap which only supports 4K, 2M, and 1G.

Add a check to __iommu_domain_alloc() to not override the
bitmap if it was already set by the iommu ops domain_alloc() code path.

Cc: Vasant Hegde <vasant.hegde@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Fixes: 4db6c41f09 ("iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20230404072742.1895252-1-jsnitsel@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-04-13 11:56:19 +02:00
Nipun Gupta
3f47d3e44d iommu: Add iommu probe for CDX bus
Add CDX bus to iommu_buses so that IOMMU probe is called
for it.

Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
Link: https://lore.kernel.org/r/20230313132636.31850-3-nipun.gupta@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-29 12:26:32 +02:00
Greg Kroah-Hartman
b18d0a0f92 iommu: make the pointer to struct bus_type constant
A number of iommu functions take a struct bus_type * and never modify
the data passed in, so make them all const * as that is what the driver
core is expecting to have passed into as well.

This is a step toward making all struct bus_type pointers constant in
the kernel.

Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: iommu@lists.linux.dev
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20230313182918.1312597-34-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-23 13:21:54 +01:00
Lu Baolu
c33fcc13ee iommu: Use sysfs_emit() for sysfs show
Use sysfs_emit() instead of the sprintf() for sysfs entries. sysfs_emit()
knows the maximum of the temporary buffer used for outputting sysfs
content and avoids overrunning the buffer length.

Prefer 'long long' over 'long long int' as suggested by checkpatch.pl.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322123421.278852-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:47:10 +01:00
Lu Baolu
4c8444f19e iommu: Cleanup iommu_change_dev_def_domain()
As the singleton group limitation has been removed, cleanup the code
in iommu_change_dev_def_domain() accordingly.

Documentation is also updated.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322064956.263419-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:45:17 +01:00
Lu Baolu
49a22aae7d iommu: Replace device_lock() with group->mutex
device_lock() was used in iommu_group_store_type() to prevent the
devices in an iommu group from being attached by any device driver.
On the other hand, in order to avoid lock race between group->mutex
and device_lock(), it limited the usage scenario to the singleton
groups.

We already have the DMA ownership scheme to avoid driver attachment
and group->mutex ensures that device ops are always valid, there's
no need for device_lock() anymore. Remove device_lock() and the
singleton group limitation.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322064956.263419-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:45:17 +01:00
Lu Baolu
33793748de iommu: Move lock from iommu_change_dev_def_domain() to its caller
The intention is to make it possible to put group ownership check and
default domain change in a same critical region protected by the group's
mutex lock. No intentional functional change.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322064956.263419-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:45:17 +01:00
Lu Baolu
dba9ca9d41 iommu: Same critical region for device release and removal
In a non-driver context, it is crucial to ensure the consistency of a
device's iommu ops. Otherwise, it may result in a situation where a
device is released but it's iommu ops are still used.

Put the ops->release_device and __iommu_group_remove_device() in a same
group->mutext critical region, so that, as long as group->mutex is held
and the device is in its group's device list, its iommu ops are always
consistent. Add check of group ownership if the released device is the
last one.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322064956.263419-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:45:16 +01:00
Lu Baolu
293f2564f3 iommu: Split iommu_group_remove_device() into helpers
So that code could be re-used by iommu_release_device() in the subsequent
change. No intention for functionality change.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230322064956.263419-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 15:45:16 +01:00
Thomas Weißschuh
aa977833de iommu: Make kobj_type structure constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definition to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20230214-kobj_type-iommu-v1-1-e7392834b9d0@weissschuh.net
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22 14:19:04 +01:00
Linus Torvalds
143c7bc649 iommufd for 6.3
Some polishing and small fixes for iommufd:
 
 - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem
 
 - Use GFP_KERNEL_ACCOUNT inside the iommu_domains
 
 - Support VFIO_NOIOMMU mode with iommufd
 
 - Various typos
 
 - A list corruption bug if HWPTs are used for attach
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY/TgzQAKCRCFwuHvBreF
 Ya3AAP4/WxTJIbDvtTyH3Fae3NxTdO8j8gsUvU1vrRYG83zdnAEAxd1yii7GEO8D
 crkeq9D4FUiPAkFnJ64Exw2FHb060Qg=
 =RABK
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "Some polishing and small fixes for iommufd:

   - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt
     subsystem

   - Use GFP_KERNEL_ACCOUNT inside the iommu_domains

   - Support VFIO_NOIOMMU mode with iommufd

   - Various typos

   - A list corruption bug if HWPTs are used for attach"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd: Do not add the same hwpt to the ioas->hwpt_list twice
  iommufd: Make sure to zero vfio_iommu_type1_info before copying to user
  vfio: Support VFIO_NOIOMMU with iommufd
  iommufd: Add three missing structures in ucmd_buffer
  selftests: iommu: Fix test_cmd_destroy_access() call in user_copy
  iommu: Remove IOMMU_CAP_INTR_REMAP
  irq/s390: Add arch_is_isolated_msi() for s390
  iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI
  genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI
  genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code
  iommufd: Convert to msi_device_has_isolated_msi()
  vfio/type1: Convert to iommu_group_has_isolated_msi()
  iommu: Add iommu_group_has_isolated_msi()
  genirq/msi: Add msi_device_has_isolated_msi()
2023-02-24 14:34:12 -08:00
Jason Gunthorpe
939204e4df Linux 6.2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPyoZYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGcE0H/1imH5XOfowBdPQU
 p06pCJGKQyEsGnn+kXd7UXes9N/uZFQgOzY9sFspS1ZpXfm60zDcWCeJT2l3qatK
 dtmAGxTEBeZJ8JuevtBiedWy9pJPpvMsfeZd85XzGDRxNUnGT5HgU0/98NpIjysb
 9HTPrpJO9HlmoAKkFDu+Z/kLJp+obns1yQOCH5glOREsPY+4SX76bjPjrbSic0oj
 oDSSBpM2gfdwHWnOKkXhgNuu8zr+hS3LaU1HMj6Kgy3Huz2NjGlgXrRpzutTHEmT
 cmt3Dl5hdIeUtMCt8LbQcngjTg/rX11rFdWaOp/MOuD6U7cqTCWeEDyVsPicFehH
 wdsIfgw=
 =+SoL
 -----END PGP SIGNATURE-----

Merge tag 'v6.2' into iommufd.git for-next

Resolve conflicts from the signature change in iommu_map:

 - drivers/infiniband/hw/usnic/usnic_uiom.c
   Switch iommu_map_atomic() to iommu_map(.., GFP_ATOMIC)

 - drivers/vfio/vfio_iommu_type1.c
   Following indenting change for GFP_KERNEL

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-21 11:11:03 -04:00
Joerg Roedel
bedd29d793 Merge branches 'apple/dart', 'arm/exynos', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next 2023-02-18 15:43:04 +01:00
Vasant Hegde
2cc73c5712 iommu: Attach device group to old domain in error path
iommu_attach_group() attaches all devices in a group to domain and then
sets group domain (group->domain). Current code (__iommu_attach_group())
does not handle error path. This creates problem as devices to domain
attachment is in inconsistent state.

Flow:
  - During boot iommu attach devices to default domain
  - Later some device driver (like amd/iommu_v2 or vfio) tries to attach
    device to new domain.
  - In iommu_attach_group() path we detach device from current domain.
    Then it tries to attach devices to new domain.
  - If it fails to attach device to new domain then device to domain link
    is broken.
  - iommu_attach_group() returns error.
  - At this stage iommu_attach_group() caller thinks, attaching device to
    new domain failed and devices are still attached to old domain.
  - But in reality device to old domain link is broken. It will result
    in all sort of failures (like IO page fault) later.

To recover from this situation, we need to attach all devices back to the
old domain. Also log warning if it fails attach device back to old domain.

Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Reported-by: Matt Fagnani <matt.fagnani@bell.net>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Matt Fagnani <matt.fagnani@bell.net>
Link: https://lore.kernel.org/r/20230215052642.6016-1-vasant.hegde@amd.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865
Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-18 15:34:24 +01:00
Jason Gunthorpe
4daa861174 iommu: Fix error unwind in iommu_group_alloc()
If either iommu_group_grate_file() fails then the
iommu_group is leaked.

Destroy it on these error paths.

Found by kselftest/iommu/iommufd_fail_nth

Fixes: bc7d12b91b ("iommu: Implement reserved_regions iommu-group sysfs file")
Fixes: c52c72d3de ("iommu: Add sysfs attribyte for domain type")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/0-v1-8f616bee028d+8b-iommu_group_alloc_leak_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16 10:20:31 +01:00
Jason Gunthorpe
fd9f2a9122 Merge branch 'iommu-memory-accounting' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joro/iommu intoiommufd/for-next
Jason Gunthorpe says:

====================
iommufd follows the same design as KVM and uses memory cgroups to limit
the amount of kernel memory a iommufd file descriptor can pin down. The
various internal data structures already use GFP_KERNEL_ACCOUNT to charge
its own memory.

However, one of the biggest consumers of kernel memory is the IOPTEs
stored under the iommu_domain and these allocations are not tracked.

This series is the first step in fixing it.

The iommu driver contract already includes a 'gfp' argument to the
map_pages op, allowing iommufd to specify GFP_KERNEL_ACCOUNT and then
having the driver allocate the IOPTE tables with that flag will capture a
significant amount of the allocations.

Update the iommu_map() API to pass in the GFP argument, and fix all call
sites. Replace iommu_map_atomic().

Audit the "enterprise" iommu drivers to make sure they do the right thing.
Intel and S390 ignore the GFP argument and always use GFP_ATOMIC. This is
problematic for iommufd anyhow, so fix it. AMD and ARM SMMUv2/3 are
already correct.

A follow up series will be needed to capture the allocations made when the
iommu_domain itself is allocated, which will complete the job.
====================

* 'iommu-memory-accounting' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/s390: Use GFP_KERNEL in sleepable contexts
  iommu/s390: Push the gfp parameter to the kmem_cache_alloc()'s
  iommu/intel: Use GFP_KERNEL in sleepable contexts
  iommu/intel: Support the gfp argument to the map_pages op
  iommu/intel: Add a gfp parameter to alloc_pgtable_page()
  iommufd: Use GFP_KERNEL_ACCOUNT for iommu_map()
  iommu/dma: Use the gfp parameter in __iommu_dma_alloc_noncontiguous()
  iommu: Add a gfp parameter to iommu_map_sg()
  iommu: Remove iommu_map_atomic()
  iommu: Add a gfp parameter to iommu_map()

Link: https://lore.kernel.org/linux-iommu/0-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-01-30 13:54:35 -04:00
Joerg Roedel
ff489fe002 Merge branch 'iommu-memory-accounting' into core
Merge patch-set from Jason:

	"Let iommufd charge IOPTE allocations to the memory cgroup"

Description:

IOMMUFD follows the same design as KVM and uses memory cgroups to limit
the amount of kernel memory a iommufd file descriptor can pin down. The
various internal data structures already use GFP_KERNEL_ACCOUNT to charge
its own memory.

However, one of the biggest consumers of kernel memory is the IOPTEs
stored under the iommu_domain and these allocations are not tracked.

This series is the first step in fixing it.

The iommu driver contract already includes a 'gfp' argument to the
map_pages op, allowing iommufd to specify GFP_KERNEL_ACCOUNT and then
having the driver allocate the IOPTE tables with that flag will capture a
significant amount of the allocations.

Update the iommu_map() API to pass in the GFP argument, and fix all call
sites. Replace iommu_map_atomic().

Audit the "enterprise" iommu drivers to make sure they do the right thing.
Intel and S390 ignore the GFP argument and always use GFP_ATOMIC. This is
problematic for iommufd anyhow, so fix it. AMD and ARM SMMUv2/3 are
already correct.

A follow up series will be needed to capture the allocations made when the
iommu_domain itself is allocated, which will complete the job.

Link: https://lore.kernel.org/linux-iommu/0-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com/
2023-01-25 11:54:58 +01:00
Jason Gunthorpe
f2b2c051be iommu: Add a gfp parameter to iommu_map_sg()
Follow the pattern for iommu_map() and remove iommu_map_sg_atomic().

This allows __iommu_dma_alloc_noncontiguous() to use a GFP_KERNEL
allocation here, based on the provided gfp flags.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25 11:52:03 +01:00
Jason Gunthorpe
4dc6376af5 iommu: Remove iommu_map_atomic()
There is only one call site and it can now just pass the GFP_ATOMIC to the
normal iommu_map().

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25 11:52:02 +01:00