Commit Graph

2972 Commits

Author SHA1 Message Date
Joerg Roedel a11bfde9c7 iommu/vt-d: Do deferred attachment in iommu_need_mapping()
The attachment of deferred devices needs to happen before the check
whether the device is identity mapped or not. Otherwise the check will
return wrong results, cause warnings boot failures in kdump kernels, like

	WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 domain_get_iommu+0x61/0x70

	[...]

	 Call Trace:
	  __intel_map_single+0x55/0x190
	  intel_alloc_coherent+0xac/0x110
	  dmam_alloc_attrs+0x50/0xa0
	  ahci_port_start+0xfb/0x1f0 [libahci]
	  ata_host_start.part.39+0x104/0x1e0 [libata]

With the earlier check the kdump boot succeeds and a crashdump is written.

Fixes: 1ee0186b9a ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18 17:21:51 +01:00
Joerg Roedel 034d98cc0c iommu/vt-d: Move deferred device attachment into helper function
Move the code that does the deferred device attachment into a separate
helper function.

Fixes: 1ee0186b9a ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18 17:21:51 +01:00
Joerg Roedel 1d4615978f iommu/vt-d: Add attach_deferred() helper
Implement a helper function to check whether a device's attach process
is deferred.

Fixes: 1ee0186b9a ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18 17:21:51 +01:00
Linus Torvalds 4fc2ea6a86 IOMMU Updates for Linux v5.6
Including:
 
 	- Allow to compile the ARM-SMMU drivers as modules.
 
 	- Fixes and cleanups for the ARM-SMMU drivers and io-pgtable code
 	  collected by Will Deacon. The merge-commit (6855d1ba75) has all the
 	  details.
 
 	- Cleanup of the iommu_put_resv_regions() call-backs in various drivers.
 
 	- AMD IOMMU driver cleanups.
 
 	- Update for the x2APIC support in the AMD IOMMU driver.
 
 	- Preparation patches for Intel VT-d nested mode support.
 
 	- RMRR and identity domain handling fixes for the Intel VT-d driver.
 
 	- More small fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl46yTsACgkQK/BELZcB
 GuPldg//bHDyntUWHta11oHFnUh759mveFFsfYto/2dV7hyHOM1lgv2+ZaXdOYxE
 03f9d9K9b4F3amF8wkluu9Z1Lve40JpZwD3WKTTg8sImh0z61nWoJ7+uE8ZjNzmA
 /6pNqIPJ4w8n5Wz2yycTR7GeUekM0X5nvF0IBlVcX3UYFG0DDlL/eldLVSk43Wmw
 P9C2dFwQLHBKhqhquqnXP1Z7YRfG9lX2ONAoMHSj3x2fm7UySSkrioBR0VrI6+sH
 8/LFL+NgqQEZBTK0I8tZRwNuDUoHQ396PAdYEiXFFzWBLrEaadAuoh4WeLnLzlRe
 G601on9ScUg1xy4lQHQiZDNKYqeB7gETMW/V2Q5nCaTrq2i1D5M6IwiNvOiV6crj
 ZtFBqejng1BXfqLOxutHYyjNQBB/KxmWR3zT2p/spcAgxhUAJxS5GkbGm5G3OucN
 2xWSnV+Dyu5jei8Ua5/onCdcuvCFi41xa9aCRrTSymIZ2W6aCNjm2mrctyRr1xmA
 AwHBZ0/dx/v3vZA8GnkBrrvAYVsTWiMbHD/2sChDjnwedO0o0pMooHEjywmDgMON
 /qXQjqR7pDpyxDA0US+30WfA3tkUcpLskza0ugWKtr/8RcHffrt5s9L0PpklgmtN
 ILC2/zNB43mIFw8uGJdDVFw4aYuCAqIFFHkqQ5+hOBGn/ePHawc=
 =HMIT
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Allow compiling the ARM-SMMU drivers as modules.

 - Fixes and cleanups for the ARM-SMMU drivers and io-pgtable code
   collected by Will Deacon. The merge-commit (6855d1ba75) has all the
   details.

 - Cleanup of the iommu_put_resv_regions() call-backs in various
   drivers.

 - AMD IOMMU driver cleanups.

 - Update for the x2APIC support in the AMD IOMMU driver.

 - Preparation patches for Intel VT-d nested mode support.

 - RMRR and identity domain handling fixes for the Intel VT-d driver.

 - More small fixes and cleanups.

* tag 'iommu-updates-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits)
  iommu/amd: Remove the unnecessary assignment
  iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
  iommu/vt-d: Unnecessary to handle default identity domain
  iommu/vt-d: Allow devices with RMRRs to use identity domain
  iommu/vt-d: Add RMRR base and end addresses sanity check
  iommu/vt-d: Mark firmware tainted if RMRR fails sanity check
  iommu/amd: Remove unused struct member
  iommu/amd: Replace two consecutive readl calls with one readq
  iommu/vt-d: Don't reject Host Bridge due to scope mismatch
  PCI/ATS: Add PASID stubs
  iommu/arm-smmu-v3: Return -EBUSY when trying to re-add a device
  iommu/arm-smmu-v3: Improve add_device() error handling
  iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
  iommu/arm-smmu-v3: Add second level of context descriptor table
  iommu/arm-smmu-v3: Prepare for handling arm_smmu_write_ctx_desc() failure
  iommu/arm-smmu-v3: Propagate ssid_bits
  iommu/arm-smmu-v3: Add support for Substream IDs
  iommu/arm-smmu-v3: Add context descriptor tables allocators
  iommu/arm-smmu-v3: Prepare arm_smmu_s1_cfg for SSID support
  ACPI/IORT: Parse SSID property of named component node
  ...
2020-02-05 17:49:54 +00:00
Linus Torvalds 26dca6dbd6 pci-v5.6-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl40PWgUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwclA/+Id/7Uc5S0r7xgFQRr3lbn0hHcx7f
 oBgmm6kGl8bu77MDiY32WLmPsp9e4BlK2M765cKQL5n20y8CzJ+kthZM8tZEDba4
 pnrZnWZ0A2xaBKzJqqYDtCqAeP97noCs4zBLo3JCA6jYCYI5bkvmdMQRlRjTUofO
 tkenGE+vexaJsLB7ghNskL3xGMueXLtLf/hXvaC6WGbSI9/zUmliHDL53DoKDPRo
 /9TGYDMwItZz+BhmBJz8hAL4naQIhIcDk2mz7CzWkY9xDhCJ1yeEwFvtvJwq0uM2
 Nmtq1g6yCB3sjlx+bRzrioLnouflztK1PGRbNugrMkR5XM9HIFmNwaDrqpU11ffA
 LQabMpbS3RWH3hbh4LYVMW13hbO+ld7/NG8jMFce2LHBWaGj6YejUQGdifz6vGRk
 JnDOgP19v5gWw08ibwkdfYzznPfMXp5IzFdJQFKhK+ugGDSJ8VeXiQ/pWtzghl3z
 P/puRw0BiL7ob/FUmhwn4J1Ytml7PZE+cJVN2l4C/CwKxR583GRUDgSHNL7Dky+o
 GcH9Tmjt4hQMNYRP01PACUmFYJwDfB+zgQ64a+uJsQwl/j+yfMnc1t/kqdM6yC9J
 GgkqLp989G/a3n9w5IC1P8aDYiwRqABvAFzlP9OZcIMUwmWbrhH175Qf6skKYIhH
 q9RKcLVXZdRS3mc=
 =fQ0E
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 "Resource management:

   - Improve resource assignment for hot-added nested bridges, e.g.,
     Thunderbolt (Nicholas Johnson)

  Power management:

   - Optionally print config space of devices before suspend (Chen Yu)

   - Increase D3 delay for AMD Ryzen5/7 XHCI controllers (Daniel Drake)

  Virtualization:

   - Generalize DMA alias quirks (James Sewart)

   - Add DMA alias quirk for PLX PEX NTB (James Sewart)

   - Fix IOV memory leak (Navid Emamdoost)

  AER:

   - Log which device prevents error recovery (Yicong Yang)

  Peer-to-peer DMA:

   - Whitelist Intel SkyLake-E (Armen Baloyan)

  Broadcom iProc host bridge driver:

   - Apply PAXC quirk whether driver is built-in or module (Wei Liu)

  Broadcom STB host bridge driver:

   - Add Broadcom STB PCIe host controller driver (Jim Quinlan)

  Intel Gateway SoC host bridge driver:

   - Add driver for Intel Gateway SoC (Dilip Kota)

  Intel VMD host bridge driver:

   - Add support for DMA aliases on other buses (Jon Derrick)

   - Remove dma_map_ops overrides (Jon Derrick)

   - Remove now-unused X86_DEV_DMA_OPS (Christoph Hellwig)

  NVIDIA Tegra host bridge driver:

   - Fix Tegra30 afi_pex2_ctrl register offset (Marcel Ziswiler)

  Panasonic UniPhier host bridge driver:

   - Remove module code since driver can't be built as a module
     (Masahiro Yamada)

  Qualcomm host bridge driver:

   - Add support for SDM845 PCIe controller (Bjorn Andersson)

  TI Keystone host bridge driver:

   - Fix "num-viewport" DT property error handling (Kishon Vijay Abraham I)

   - Fix link training retries initiation (Yurii Monakov)

   - Fix outbound region mapping (Yurii Monakov)

  Misc:

   - Add Switchtec Gen4 support (Kelvin Cao)

   - Add Switchtec Intercomm Notify and Upstream Error Containment
     support (Logan Gunthorpe)

   - Use dma_set_mask_and_coherent() since Switchtec supports 64-bit
     addressing (Wesley Sheng)"

* tag 'pci-v5.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (60 commits)
  PCI: Allow adjust_bridge_window() to shrink resource if necessary
  PCI: Set resource size directly in adjust_bridge_window()
  PCI: Rename extend_bridge_window() to adjust_bridge_window()
  PCI: Rename extend_bridge_window() parameter
  PCI: Consider alignment of hot-added bridges when assigning resources
  PCI: Remove local variable usage in pci_bus_distribute_available_resources()
  PCI: Pass size + alignment to pci_bus_distribute_available_resources()
  PCI: Rename variables
  PCI: vmd: Add two VMD Device IDs
  PCI: Remove unnecessary braces
  PCI: brcmstb: Add MSI support
  PCI: brcmstb: Add Broadcom STB PCIe host controller driver
  x86/PCI: Remove X86_DEV_DMA_OPS
  PCI: vmd: Remove dma_map_ops overrides
  iommu/vt-d: Remove VMD child device sanity check
  iommu/vt-d: Use pci_real_dma_dev() for mapping
  PCI: Introduce pci_real_dma_dev()
  x86/PCI: Expose VMD's pci_dev in struct pci_sysdata
  x86/PCI: Add to_pci_sysdata() helper
  PCI/AER: Initialize aer_fifo
  ...
2020-01-31 14:48:54 -08:00
Bjorn Helgaas db83c269d2 Merge branch 'pci/host-vmd'
- Save VMD's pci_dev in x86 struct pci_sysdata (Jon Derrick)

  - Add pci_real_dma_dev() for DMA aliases not on the same bus as requester
    (Jon Derrick)

  - Add IOMMU mappings for pci_real_dma_dev() (Jon Derrick)

  - Remove IOMMU sanity checks for VMD devices (Jon Derrick)

  - Remove VMD dma_map_ops overrides (Jon Derrick)

  - Remove unused X86_DEV_DMA_OPS (Christoph Hellwig)

  - Add VMD device IDs that need bus restriction mode (Sushma Kalakota)

* pci/host-vmd:
  PCI: vmd: Add two VMD Device IDs
  x86/PCI: Remove X86_DEV_DMA_OPS
  PCI: vmd: Remove dma_map_ops overrides
  iommu/vt-d: Remove VMD child device sanity check
  iommu/vt-d: Use pci_real_dma_dev() for mapping
  PCI: Introduce pci_real_dma_dev()
  x86/PCI: Expose VMD's pci_dev in struct pci_sysdata
  x86/PCI: Add to_pci_sysdata() helper
2020-01-29 17:00:02 -06:00
Linus Torvalds 6a1000bd27 ioremap changes for 5.6
- remove ioremap_nocache given that is is equivalent to
    ioremap everywhere
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH
 J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr
 +3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9
 wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf
 eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp
 25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI
 ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R
 IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+
 b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq
 wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei
 /rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/
 Kdg=
 =TUCJ
 -----END PGP SIGNATURE-----

Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap

Pull ioremap updates from Christoph Hellwig:
 "Remove the ioremap_nocache API (plus wrappers) that are always
  identical to ioremap"

* tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap:
  remove ioremap_nocache and devm_ioremap_nocache
  MIPS: define ioremap_nocache to ioremap
2020-01-27 13:03:00 -08:00
Jon Derrick e3560ee4cf iommu/vt-d: Remove VMD child device sanity check
Remove the sanity check required for VMD child devices.  The new
pci_real_dma_dev() DMA alias mechanism places them in the same IOMMU group
as the VMD endpoint.  Assignment of the group would require assigning the
VMD endpoint, where unbinding the VMD endpoint removes the child device
domain from the hierarchy.

Link: https://lore.kernel.org/r/1579613871-301529-6-git-send-email-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
2020-01-24 14:58:45 -06:00
Jon Derrick 2b0140c696 iommu/vt-d: Use pci_real_dma_dev() for mapping
The PCI device may have a DMA requester on another bus, such as VMD
subdevices needing to use the VMD endpoint.  This case requires the real
DMA device for the IOMMU mapping, so use pci_real_dma_dev() to find that
device.

Link: https://lore.kernel.org/r/1579613871-301529-5-git-send-email-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
2020-01-24 14:58:33 -06:00
Joerg Roedel e3b5ee0cfb Merge branches 'iommu/fixes', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next 2020-01-24 15:39:39 +01:00
Adrian Huang 154e3a65f4 iommu/amd: Remove the unnecessary assignment
The assignment of the global variable 'iommu_detected' has been
moved from amd_iommu_init_dma_ops() to amd_iommu_detect(), so
this patch removes the assignment in amd_iommu_init_dma_ops().

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:38:31 +01:00
Lu Baolu 857f081426 iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
Address field in device TLB invalidation descriptor is qualified
by the S field. If S field is zero, a single page at page address
specified by address [63:12] is requested to be invalidated. If S
field is set, the least significant bit in the address field with
value 0b (say bit N) indicates the invalidation address range. The
spec doesn't require the address [N - 1, 0] to be cleared, hence
remove the unnecessary WARN_ON_ONCE().

Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
to invalidating all the cached mappings on an endpoint, and below
overflow error will be triggered.

[...]
UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

Reported-and-tested-by: Frank <fgndev@posteo.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:36:27 +01:00
Lu Baolu b89b6605b8 iommu/vt-d: Unnecessary to handle default identity domain
The iommu default domain framework has been designed to take
care of setting identity default domain type. It's unnecessary
to handle this again in the VT-d driver. Hence, remove it.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:32:54 +01:00
Lu Baolu 9235cb13d7 iommu/vt-d: Allow devices with RMRRs to use identity domain
Since commit ea2447f700 ("intel-iommu: Prevent devices with
RMRRs from being placed into SI Domain"), the Intel IOMMU driver
doesn't allow any devices with RMRR locked to use the identity
domain. This was added to to fix the issue where the RMRR info
for devices being placed in and out of the identity domain gets
lost. This identity maps all RMRRs when setting up the identity
domain, so that devices with RMRRs could also use it.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:32:54 +01:00
Barret Rhoden ce4cc52b51 iommu/vt-d: Add RMRR base and end addresses sanity check
The VT-d spec specifies requirements for the RMRR entries base and
end (called 'Limit' in the docs) addresses.

This commit will cause the DMAR processing to mark the firmware as
tainted if any RMRR entries that do not meet these requirements.

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:32:53 +01:00
Barret Rhoden f5a68bb075 iommu/vt-d: Mark firmware tainted if RMRR fails sanity check
RMRR entries describe memory regions that are DMA targets for devices
outside the kernel's control.

RMRR entries that fail the sanity check are pointing to regions of
memory that the firmware did not tell the kernel are reserved or
otherwise should not be used.

Instead of aborting DMAR processing, this commit marks the firmware
as tainted. These RMRRs will still be identity mapped, otherwise,
some devices, e.x. graphic devices, will not work during boot.

Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Fixes: f036c7fa0a ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:32:15 +01:00
Shuah Khan 8c17bbf6c8 iommu/amd: Fix IOMMU perf counter clobbering during init
init_iommu_perf_ctr() clobbers the register when it checks write access
to IOMMU perf counters and fails to restore when they are writable.

Add save and restore to fix it.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 30861ddc9c ("perf/x86/amd: Add IOMMU Performance Counter resource management")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:28:40 +01:00
Jerry Snitselaar bf708cfb2f iommu/vt-d: Call __dmar_remove_one_dev_info with valid pointer
It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org # 5.3+
Cc: linux-kernel@vger.kernel.org
Fixes: ae23bfb68f ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-24 15:23:50 +01:00
Adrian Huang 9646674878 iommu/amd: Remove unused struct member
Commit c805b428f2 ("iommu/amd: Remove amd_iommu_pd_list") removes
the global list for the allocated protection domains. The
corresponding member 'list' of the protection_domain struct is
not used anymore, so it can be removed.

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-17 11:05:56 +01:00
Adrian Huang 62dcee7160 iommu/amd: Replace two consecutive readl calls with one readq
Optimize the reigster reading by using readq instead of the two
consecutive readl calls.

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-17 11:05:56 +01:00
jimyan 53291622e2 iommu/vt-d: Don't reject Host Bridge due to scope mismatch
On a system with two host bridges(0000:00:00.0,0000:80:00.0), iommu
initialization fails with

    DMAR: Device scope type does not match for 0000:80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):

but the device has a type 0 PCI header:
80:00.0 Class 0600: Device 8086:2020 (rev 06)
00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Add the class
0x06 ("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan <jimyan@baidu.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-17 10:57:30 +01:00
Will Deacon 92c1d360dc iommu/arm-smmu-v3: Return -EBUSY when trying to re-add a device
Although we WARN in arm_smmu_add_device() if the device being added has
been added already without a subsequent call to arm_smmu_remove_device(),
we still continue half-heartedly, initialising the stream-table for any
new StreamIDs that may have magically appeared and re-establishing device
links that should still be there from last time.

Given that calling ->add_device() twice without removing the device in the
meantime is indicative of an error in the caller, just return -EBUSY after
warning.

Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jean Philippe-Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:30:28 +00:00
Jean-Philippe Brucker a2be6218e6 iommu/arm-smmu-v3: Improve add_device() error handling
Let add_device() clean up after itself. The iommu_bus_init() function
does call remove_device() on error, but other sites (e.g. of_iommu) do
not.

Don't free level-2 stream tables because we'd have to track if we
allocated each of them or if they are used by other endpoints. It's not
worth the hassle since they are managed resources.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:30:27 +00:00
Will Deacon d71e01716b iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
If, for some bizarre reason, the compiler decided to split up the write
of STE DWORD 0, we could end up making a partial structure valid.

Although this probably won't happen, follow the example of the
context-descriptor code and use WRITE_ONCE() to ensure atomicity of the
write.

Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:30:27 +00:00
Jean-Philippe Brucker 73af06f589 iommu/arm-smmu-v3: Add second level of context descriptor table
The SMMU can support up to 20 bits of SSID. Add a second level of page
tables to accommodate this. Devices that support more than 1024 SSIDs now
have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context
descriptors (64kB), allocated on demand.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:06:50 +00:00
Jean-Philippe Brucker 492ddc79e0 iommu/arm-smmu-v3: Prepare for handling arm_smmu_write_ctx_desc() failure
Second-level context descriptor tables will be allocated lazily in
arm_smmu_write_ctx_desc(). Help with handling allocation failure by
moving the CD write into arm_smmu_domain_finalise_s1().

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[will: Add comment per discussion on list]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:05:54 +00:00
Jean-Philippe Brucker 2505ec6f85 iommu/arm-smmu-v3: Propagate ssid_bits
Now that we support substream IDs, initialize s1cdmax with the number of
SSID bits supported by a master and the SMMU.

Context descriptor tables are allocated once for the first master
attached to a domain. Therefore attaching multiple devices with
different SSID sizes is tricky, and we currently don't support it.

As a future improvement it would be nice to at least support attaching a
SSID-capable device to a domain that isn't using SSID, by reallocating
the SSID table. This would allow supporting a SSID-capable device that
is in the same IOMMU group as a bridge, for example. Varying SSID size
is less of a concern, since the PCIe specification "highly recommends"
that devices supporting PASID implement all 20 bits of it.

Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:05:53 +00:00
Jean-Philippe Brucker 87f42391f6 iommu/arm-smmu-v3: Add support for Substream IDs
At the moment, the SMMUv3 driver implements only one stage-1 or stage-2
page directory per device. However SMMUv3 allows more than one address
space for some devices, by providing multiple stage-1 page directories. In
addition to the Stream ID (SID), that identifies a device, we can now have
Substream IDs (SSID) identifying an address space. In PCIe, SID is called
Requester ID (RID) and SSID is called Process Address-Space ID (PASID).
A complete stage-1 walk goes through the context descriptor table:

      Stream tables       Ctx. Desc. tables       Page tables
        +--------+   ,------->+-------+   ,------->+-------+
        :        :   |        :       :   |        :       :
        +--------+   |        +-------+   |        +-------+
   SID->|  STE   |---'  SSID->|  CD   |---'  IOVA->|  PTE  |--> IPA
        +--------+            +-------+            +-------+
        :        :            :       :            :       :
        +--------+            +-------+            +-------+

Rewrite arm_smmu_write_ctx_desc() to modify context descriptor table
entries. To keep things simple we only implement one level of context
descriptor tables here, but as with stream and page tables, an SSID can
be split to index multiple levels of tables.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:05:53 +00:00
Jean-Philippe Brucker a557aff0c7 iommu/arm-smmu-v3: Add context descriptor tables allocators
Support for SSID will require allocating context descriptor tables. Move
the context descriptor allocation to separate functions.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:00:57 +00:00
Jean-Philippe Brucker 7bc4f3fae9 iommu/arm-smmu-v3: Prepare arm_smmu_s1_cfg for SSID support
When adding SSID support to the SMMUv3 driver, we'll need to manipulate
leaf pasid tables and context descriptors. Extract the context
descriptor structure and align with the way stream tables are handled.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:00:57 +00:00
Jean-Philippe Brucker 89535821c0 iommu/arm-smmu-v3: Parse PASID devicetree property of platform devices
For platform devices that support SubstreamID (SSID), firmware provides
the number of supported SSID bits. Restrict it to what the SMMU supports
and cache it into master->ssid_bits, which will also be used for PCI
PASID.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 16:00:57 +00:00
Jean-Philippe Brucker 9bb9069cfb iommu/arm-smmu-v3: Drop __GFP_ZERO flag from DMA allocation
Since commit 518a2f1925 ("dma-mapping: zero memory returned from
dma_alloc_*"), dma_alloc_* always initializes memory to zero, so there
is no need to use dma_zalloc_* or pass the __GFP_ZERO flag anymore.

The flag was introduced by commit 04fa26c71b ("iommu/arm-smmu: Convert
DMA buffer allocations to the managed API"), since the managed API
didn't provide a dmam_zalloc_coherent() function.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-15 15:59:13 +00:00
Robin Murphy 79f7a5cb87 iommu/arm-smmu: Improve SMR mask test
Make the SMR mask test more robust against SMR0 being live
at probe time, which might happen once we start supporting
firmware reservations for framebuffers and suchlike.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 16:09:58 +00:00
Robin Murphy db6903010a iommu/io-pgtable-arm: Prepare for TTBR1 usage
Now that we can correctly extract top-level indices without relying on
the remaining upper bits being zero, the only remaining impediments to
using a given table for TTBR1 are the address validation on map/unmap
and the awkward TCR translation granule format. Add a quirk so that we
can do the right thing at those points.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:25 +00:00
Will Deacon ac4b80e5b9 iommu/io-pgtable-arm: Rationalise VTCR handling
Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling")
reworked the way in which the TCR register value is returned from the
io-pgtable code when targetting the Arm long-descriptor format, in
preparation for allowing page-tables to target TTBR1.

As it turns out, the new interface is a lot nicer to use, so do the same
conversion for the VTCR register even though there is only a single base
register for stage-2 translation.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:25 +00:00
Will Deacon fba6e96077 iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
Now that we have arm-smmu.h defining various SMMU constants, ensure that
they are namespaced with the ARM_SMMU_ prefix in order to avoid conflicts
with the CPU, such as the one we're currently bodging around with the
TCR.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:25 +00:00
Robin Murphy fb485eb18e iommu/io-pgtable-arm: Rationalise TCR handling
Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
Will Deacon 6f932ad369 iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR register,
which is required to be set to 1 by the architecture. Unfortunately, we
accidentally treat this as a signed quantity which means we also set the
upper 32 bits of the VTCR to one, and they are required to be zero.

Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted
sign-extension up to 64 bits.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
Robin Murphy 7618e47909 iommu/io-pgtable-arm: Improve attribute handling
By VMSA rules, using Normal Non-Cacheable type with a shareability
attribute of anything other than Outer Shareable is liable to lead into
unpredictable territory:

| Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
|
| A memory region with a resultant memory type attribute of Normal, and
| a resultant cacheability attribute of Inner Non-cacheable, Outer
| Non-cacheable, must have a resultant shareability attribute of Outer
| Shareable, otherwise shareability is UNPREDICTABLE

Although the SMMU architectures seem to give some slightly stronger
guarantees of Non-Cacheable output types becoming implicitly Outer
Shareable in most cases, we may as well be explicit and not take any
chances. It's also weird that LPAE attribute handling is currently split
between prot_to_pte() and init_pte() given that it can all be statically
determined up-front. Thus, collect *all* the LPAE attributes into
prot_to_pte() in order to logically pick the shareability based on the
incoming IOMMU API prot value, and tweak the short-descriptor code to
stop setting TTBR0.NOS for Non-Cacheable walks.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
Will Deacon 30d2acb673 iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
Commit 9e6ea59f3f ("iommu/io-pgtable: Support non-coherent page tables")
added support for non-coherent page-table walks to the Arm IOMMU page-table
backends. Unfortunately, it left the stage-2 allocator unchanged, so let's
hook that up in the same way.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:39:43 +00:00
Robin Murphy d1e5f26f14 iommu/io-pgtable-arm: Rationalise TTBRn handling
TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Drop change to ttbr value]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:39:23 +00:00
Masahiro Yamada cd037ff2f9 iommu/arm-smmu: Fix -Wunused-const-variable warning
For ARCH=arm builds, OF is not necessarily enabled, that is, you can
build this driver without CONFIG_OF.

When CONFIG_OF is unset, of_match_ptr() is NULL, and arm_smmu_of_match
is left orphan.

Building it with W=1 emits a warning:

drivers/iommu/arm-smmu.c:1904:34: warning: ‘arm_smmu_of_match’ defined but not used [-Wunused-const-variable=]
 static const struct of_device_id arm_smmu_of_match[] = {
                                  ^~~~~~~~~~~~~~~~~

There are two ways to fix this:

 - annotate arm_smmu_of_match with __maybe_unused (or surround the
   code with #ifdef CONFIG_OF ... #endif)

 - stop using of_match_ptr()

This commit took the latter solution.

It slightly increases the object size, but it is probably not a big deal
because arm_smmu_device_dt_probe() is also compiled irrespective of
CONFIG_OF.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 09:49:27 +00:00
Masahiro Yamada 8efda06f83 iommu/arm-smmu-v3: Remove useless of_match_ptr()
The CONFIG option controlling this driver, ARM_SMMU_V3,
depends on ARM64, which select's OF.

So, CONFIG_OF is always defined when building this driver.
of_match_ptr(arm_smmu_of_match) is the same as arm_smmu_of_match.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 09:49:27 +00:00
Masahiro Yamada 322a9bbb72 iommu/arm-smmu-v3: Fix resource_size check
This is an off-by-one mistake.

resource_size() returns res->end - res->start + 1.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 09:49:27 +00:00
Shameer Kolothum 935d43ba27 iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA
CMDQ_OP_TLBI_NH_VA requires VMID and this was missing since
commit 1c27df1c0a ("iommu/arm-smmu: Use correct address mask
for CMD_TLBI_S2_IPA"). Add it back.

Fixes: 1c27df1c0a ("iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 09:49:27 +00:00
Will Deacon fc10cca69e drivers/iommu: Initialise module 'owner' field in iommu_device_set_ops()
Requiring each IOMMU driver to initialise the 'owner' field of their
'struct iommu_ops' is error-prone and easily forgotten. Follow the
example set by PCI and USB by assigning THIS_MODULE automatically when
registering the ops structure with IOMMU core.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 09:49:13 +00:00
Qian Cai 55817b340a iommu/dma: fix variable 'cookie' set but not used
The commit c18647900e ("iommu/dma: Relax locking in
iommu_dma_prepare_msi()") introduced a compliation warning,

drivers/iommu/dma-iommu.c: In function 'iommu_dma_prepare_msi':
drivers/iommu/dma-iommu.c:1206:27: warning: variable 'cookie' set but
not used [-Wunused-but-set-variable]
  struct iommu_dma_cookie *cookie;
                           ^~~~~~

Fixes: c18647900e ("iommu/dma: Relax locking in iommu_dma_prepare_msi()")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 17:08:58 +01:00
Jon Derrick f78947c409 iommu/vt-d: Unlink device if failed to add to group
If the device fails to be added to the group, make sure to unlink the
reference before returning.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Fixes: 39ab9555c2 ("iommu: Add sysfs bindings for struct iommu_device")
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 17:08:57 +01:00
Jon Derrick 7d4e6ccd1f iommu: Remove device link to group on failure
This adds the missing teardown step that removes the device link from
the group when the device addition fails.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Fixes: 797a8b4d76 ("iommu: Handle default domain attach failure")
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 17:08:57 +01:00
Patrick Steinhardt 4a350a0ee5 iommu/vt-d: Fix adding non-PCI devices to Intel IOMMU
Starting with commit fa212a97f3 ("iommu/vt-d: Probe DMA-capable ACPI
name space devices"), we now probe DMA-capable ACPI name
space devices. On Dell XPS 13 9343, which has an Intel LPSS platform
device INTL9C60 enumerated via ACPI, this change leads to the following
warning:

    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 1 at pci_device_group+0x11a/0x130
    CPU: 1 PID: 1 Comm: swapper/0 Tainted: G                T 5.5.0-rc3+ #22
    Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A20 06/06/2019
    RIP: 0010:pci_device_group+0x11a/0x130
    Code: f0 ff ff 48 85 c0 49 89 c4 75 c4 48 8d 74 24 10 48 89 ef e8 48 ef ff ff 48 85 c0 49 89 c4 75 af e8 db f7 ff ff 49 89 c4 eb a5 <0f> 0b 49 c7 c4 ea ff ff ff eb 9a e8 96 1e c7 ff 66 0f 1f 44 00 00
    RSP: 0000:ffffc0d6c0043cb0 EFLAGS: 00010202
    RAX: 0000000000000000 RBX: ffffa3d1d43dd810 RCX: 0000000000000000
    RDX: ffffa3d1d4fecf80 RSI: ffffa3d12943dcc0 RDI: ffffa3d1d43dd810
    RBP: ffffa3d1d43dd810 R08: 0000000000000000 R09: ffffa3d1d4c04a80
    R10: ffffa3d1d4c00880 R11: ffffa3d1d44ba000 R12: 0000000000000000
    R13: ffffa3d1d4383b80 R14: ffffa3d1d4c090d0 R15: ffffa3d1d4324530
    FS:  0000000000000000(0000) GS:ffffa3d1d6700000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000000460a001 CR4: 00000000003606e0
    Call Trace:
     ? iommu_group_get_for_dev+0x81/0x1f0
     ? intel_iommu_add_device+0x61/0x170
     ? iommu_probe_device+0x43/0xd0
     ? intel_iommu_init+0x1fa2/0x2235
     ? pci_iommu_init+0x52/0xe7
     ? e820__memblock_setup+0x15c/0x15c
     ? do_one_initcall+0xcc/0x27e
     ? kernel_init_freeable+0x169/0x259
     ? rest_init+0x95/0x95
     ? kernel_init+0x5/0xeb
     ? ret_from_fork+0x35/0x40
    ---[ end trace 28473e7abc25b92c ]---
    DMAR: ACPI name space devices didn't probe correctly

The bug results from the fact that while we now enumerate ACPI devices,
we aren't able to handle any non-PCI device when generating the device
group. Fix the issue by implementing an Intel-specific callback that
returns `pci_device_group` only if the device is a PCI device.
Otherwise, it will return a generic device group.

Fixes: fa212a97f3 ("iommu/vt-d: Probe DMA-capable ACPI name space devices")
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Cc: stable@vger.kernel.org # v5.3+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 17:08:51 +01:00
Adrian Huang bde9e6b9ba iommu/amd: Fix typos for PPR macros
The bit 13 and bit 14 of the IOMMU control register are
PPRLogEn and PPRIntEn. They are related to PPR (Peripheral Page
Request) instead of 'PPF'. Fix them accrodingly.

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:16:27 +01:00
Adrian Huang 858defad2a iommu/amd: Remove local variables
The usage of the local variables 'range' and 'misc' was removed
from commit 226e889b20 ("iommu/amd: Remove first/last_device handling")
and commit 23c742db21 ("iommu/amd: Split out PCI related parts of
IOMMU initialization"). So, remove them accrodingly.

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:14:39 +01:00
Lu Baolu e2726daea5 iommu/vt-d: debugfs: Add support to show page table internals
Export page table internals of the domain attached to each device.
Example of such dump on a Skylake machine:

$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
[ ... ]
Device 0000:00:14.0 with pasid 0 @0x15f3d9000
IOVA_PFN                PML5E                   PML4E
0x000000008ced0 |       0x0000000000000000      0x000000015f3da003
0x000000008ced1 |       0x0000000000000000      0x000000015f3da003
0x000000008ced2 |       0x0000000000000000      0x000000015f3da003
0x000000008ced3 |       0x0000000000000000      0x000000015f3da003
0x000000008ced4 |       0x0000000000000000      0x000000015f3da003
0x000000008ced5 |       0x0000000000000000      0x000000015f3da003
0x000000008ced6 |       0x0000000000000000      0x000000015f3da003
0x000000008ced7 |       0x0000000000000000      0x000000015f3da003
0x000000008ced8 |       0x0000000000000000      0x000000015f3da003
0x000000008ced9 |       0x0000000000000000      0x000000015f3da003

PDPE                    PDE                     PTE
0x000000015f3db003      0x000000015f3dc003      0x000000008ced0003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced1003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced2003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced3003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced4003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced5003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced6003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced7003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced8003
0x000000015f3db003      0x000000015f3dc003      0x000000008ced9003
[ ... ]

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu b802d070a5 iommu/vt-d: Use iova over first level
After we make all map/unmap paths support first level page table.
Let's turn it on if hardware supports scalable mode.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu 64229e8f37 iommu/vt-d: Update first level super page capability
First-level translation may map input addresses to 4-KByte pages,
2-MByte pages, or 1-GByte pages. Support for 4-KByte pages and
2-Mbyte pages are mandatory for first-level translation. Hardware
support for 1-GByte page is reported through the FL1GP field in
the Capability Register.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu cb8b892dce iommu/vt-d: Make first level IOVA canonical
First-level translation restricts the input-address to a canonical
address (i.e., address bits 63:N have the same value as address
bit [N-1], where N is 48-bits with 4-level paging and 57-bits with
5-level paging). (section 3.6 in the spec)

This makes first level IOVA canonical by using IOVA with bit [N-1]
always cleared.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu 33cd6e642d iommu/vt-d: Flush PASID-based iotlb for iova over first level
When software has changed first-level tables, it should invalidate
the affected IOTLB and the paging-structure-caches using the PASID-
based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu ddf09b6d43 iommu/vt-d: Setup pasid entries for iova over first level
Intel VT-d in scalable mode supports two types of page tables for
IOVA translation: first level and second level. The IOMMU driver
can choose one from both for IOVA translation according to the use
case. This sets up the pasid entry if a domain is selected to use
the first-level page table for iova translation.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu 87208f22a4 iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup
Current intel_pasid_setup_first_level() use 5-level paging for
first level translation if CPUs use 5-level paging mode too.
This makes sense for SVA usages since the page table is shared
between CPUs and IOMMUs. But it makes no sense if we only want
to use first level for IOVA translation. Add PASID_FLAG_FL5LP
bit in the flags which indicates whether the 5-level paging
mode should be used.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu 2cd1311a26 iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr
This adds the Intel VT-d specific callback of setting
DOMAIN_ATTR_NESTING domain attribution. It is necessary
to let the VT-d driver know that the domain represents
a virtual machine which requires the IOMMU hardware to
support nested translation mode. Return success if the
IOMMU hardware suports nested mode, otherwise failure.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:58 +01:00
Lu Baolu a1948f2e0a iommu/vt-d: Identify domains using first level page table
This checks whether a domain should use the first level page
table for map/unmap and marks it in the domain structure.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Lu Baolu 8e3391cfdc iommu/vt-d: Loose requirement for flush queue initializaton
Currently if flush queue initialization fails, we return error
or enforce the system-wide strict mode. These are unnecessary
because we always check the existence of a flush queue before
queuing any iova's for lazy flushing. Printing a informational
message is enough.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Lu Baolu 10f8008f0f iommu/vt-d: Avoid iova flush queue in strict mode
If Intel IOMMU strict mode is enabled by users, it's unnecessary
to create the iova flush queue.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Lu Baolu 984d03adc9 iommu/vt-d: trace: Extend map_sg trace event
Current map_sg stores trace message in a coarse manner. This
extends it so that more detailed messages could be traced.

The map_sg trace message looks like:

map_sg: dev=0000:00:17.0 [1/9] dev_addr=0xf8f90000 phys_addr=0x158051000 size=4096
map_sg: dev=0000:00:17.0 [2/9] dev_addr=0xf8f91000 phys_addr=0x15a858000 size=4096
map_sg: dev=0000:00:17.0 [3/9] dev_addr=0xf8f92000 phys_addr=0x15aa13000 size=4096
map_sg: dev=0000:00:17.0 [4/9] dev_addr=0xf8f93000 phys_addr=0x1570f1000 size=8192
map_sg: dev=0000:00:17.0 [5/9] dev_addr=0xf8f95000 phys_addr=0x15c6d0000 size=4096
map_sg: dev=0000:00:17.0 [6/9] dev_addr=0xf8f96000 phys_addr=0x157194000 size=4096
map_sg: dev=0000:00:17.0 [7/9] dev_addr=0xf8f97000 phys_addr=0x169552000 size=4096
map_sg: dev=0000:00:17.0 [8/9] dev_addr=0xf8f98000 phys_addr=0x169dde000 size=4096
map_sg: dev=0000:00:17.0 [9/9] dev_addr=0xf8f99000 phys_addr=0x148351000 size=4096

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 034d473109 iommu/vt-d: Misc macro clean up for SVM
Use combined macros for_each_svm_dev() to simplify SVM device iteration
and error checking.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 5f75585e19 iommu/vt-d: Avoid sending invalid page response
Page responses should only be sent when last page in group (LPIG) or
private data is present in the page request. This patch avoids sending
invalid descriptors.

Fixes: 5d308fc1ec ("iommu/vt-d: Add 256-bit invalidation descriptor support")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 59a623374d iommu/vt-d: Replace Intel specific PASID allocator with IOASID
Make use of generic IOASID code to manage PASID allocation,
free, and lookup. Replace Intel specific code.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 39d630e332 iommu/vt-d: Fix off-by-one in PASID allocation
PASID allocator uses IDR which is exclusive for the end of the
allocation range. There is no need to decrement pasid_max.

Fixes: af39507305 ("iommu/vt-d: Apply global PASID in SVA")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan d62efd4fa6 iommu/vt-d: Avoid duplicated code for PASID setup
After each setup for PASID entry, related translation caches must be
flushed. We can combine duplicated code into one function which is less
error prone.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 6eba09a4b5 iommu/vt-d: Reject SVM bind for failed capability check
Add a check during SVM bind to ensure CPU and IOMMU hardware capabilities
are met.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan 79db7e1b4c iommu/vt-d: Match CPU and IOMMU paging mode
When setting up first level page tables for sharing with CPU, we need
to ensure IOMMU can support no less than the levels supported by the
CPU.

It is not adequate, as in the current code, to set up 5-level paging
in PASID entry First Level Paging Mode(FLPM) solely based on CPU.

Currently, intel_pasid_setup_first_level() is only used by native SVM
code which already checks paging mode matches. However, future use of
this helper function may not be limited to native SVM.
https://lkml.org/lkml/2019/11/18/1037

Fixes: 437f35e1cd ("iommu/vt-d: Add first level page table interface")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Jacob Pan ff3dc6521f iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks
Shared Virtual Memory(SVM) is based on a collective set of hardware
features detected at runtime. There are requirements for matching CPU
and IOMMU capabilities.

The current code checks CPU and IOMMU feature set for SVM support but
the result is never stored nor used. Therefore, SVM can still be used
even when these checks failed. The consequences can be:
1. CPU uses 5-level paging mode for virtual address of 57 bits, but
IOMMU can only support 4-level paging mode with 48 bits address for DMA.
2. 1GB page size is used by CPU but IOMMU does not support it. VT-d
unrecoverable faults may be generated.

The best solution to fix these problems is to prevent them in the first
place.

This patch consolidates code for checking PASID, CPU vs. IOMMU paging
mode compatibility, as well as provides specific error messages for
each failed checks. On sane hardware configurations, these error message
shall never appear in kernel log.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Lu Baolu 046182525d iommu/vt-d: Add Kconfig option to enable/disable scalable mode
This adds Kconfig option INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
to make it easier for distributions to enable or disable the
Intel IOMMU scalable mode by default during kernel build.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07 14:05:57 +01:00
Christoph Hellwig 4bdc0d676a remove ioremap_nocache and devm_ioremap_nocache
ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-06 09:45:59 +01:00
Joerg Roedel 2ca6b6dc85 iommu/amd: Remove unused variable
The iommu variable in set_device_exclusion_range() us unused
now and causes a compiler warning. Remove it.

Fixes: 387caf0b75 ("iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:09:46 +01:00
Thierry Reding c11738cf9d iommu: virtio: Use generic_iommu_put_resv_regions()
Use the new standard function instead of open-coding it.

Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: virtualization@lists.linux-foundation.org
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Thierry Reding 0ecdebb7da iommu: intel: Use generic_iommu_put_resv_regions()
Use the new standard function instead of open-coding it.

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Thierry Reding 55c2564a68 iommu: amd: Use generic_iommu_put_resv_regions()
Use the new standard function instead of open-coding it.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Thierry Reding a66c5dc549 iommu: arm: Use generic_iommu_put_resv_regions()
Use the new standard function instead of open-coding it.

Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Thierry Reding f9f6971ebb iommu: Implement generic_iommu_put_resv_regions()
Implement a generic function for removing reserved regions. This can be
used by drivers that don't do anything fancy with these regions other
than allocating memory for them.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Qian Cai 944c917539 iommu/iova: Silence warnings under memory pressure
When running heavy memory pressure workloads, this 5+ old system is
throwing endless warnings below because disk IO is too slow to recover
from swapping. Since the volume from alloc_iova_fast() could be large,
once it calls printk(), it will trigger disk IO (writing to the log
files) and pending softirqs which could cause an infinite loop and make
no progress for days by the ongoimng memory reclaim. This is the counter
part for Intel where the AMD part has already been merged. See the
commit 3d70889532 ("iommu/amd: Silence warnings under memory
pressure"). Since the allocation failure will be reported in
intel_alloc_iova(), so just call dev_err_once() there because even the
"ratelimited" is too much, and silence the one in alloc_iova_mem() to
avoid the expensive warn_alloc().

 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 slab_out_of_memory: 66 callbacks suppressed
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 0: slabs: 1822, objs: 16398, free: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 1: slabs: 2051, objs: 18459, free: 31
   node 0: slabs: 697, objs: 4182, free: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   node 1: slabs: 381, objs: 2286, free: 27
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 warn_alloc: 96 callbacks suppressed
 kworker/11:1H: page allocation failure: order:0,
mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-1
 CPU: 11 PID: 1642 Comm: kworker/11:1H Tainted: G    B
 Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19
12/27/2015
 Workqueue: kblockd blk_mq_run_work_fn
 Call Trace:
  dump_stack+0xa0/0xea
  warn_alloc.cold.94+0x8a/0x12d
  __alloc_pages_slowpath+0x1750/0x1870
  __alloc_pages_nodemask+0x58a/0x710
  alloc_pages_current+0x9c/0x110
  alloc_slab_page+0xc9/0x760
  allocate_slab+0x48f/0x5d0
  new_slab+0x46/0x70
  ___slab_alloc+0x4ab/0x7b0
  __slab_alloc+0x43/0x70
  kmem_cache_alloc+0x2dd/0x450
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
  alloc_iova+0x33/0x210
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
  alloc_iova_fast+0x62/0x3d1
   node 1: slabs: 381, objs: 2286, free: 27
  intel_alloc_iova+0xce/0xe0
  intel_map_sg+0xed/0x410
  scsi_dma_map+0xd7/0x160
  scsi_queue_rq+0xbf7/0x1310
  blk_mq_dispatch_rq_list+0x4d9/0xbc0
  blk_mq_sched_dispatch_requests+0x24a/0x300
  __blk_mq_run_hw_queue+0x156/0x230
  blk_mq_run_work_fn+0x3b/0x40
  process_one_work+0x579/0xb90
  worker_thread+0x63/0x5b0
  kthread+0x1e6/0x210
  ret_from_fork+0x3a/0x50
 Mem-Info:
 active_anon:2422723 inactive_anon:361971 isolated_anon:34403
  active_file:2285 inactive_file:1838 isolated_file:0
  unevictable:0 dirty:1 writeback:5 unstable:0
  slab_reclaimable:13972 slab_unreclaimable:453879
  mapped:2380 shmem:154 pagetables:6948 bounce:0
  free:19133 free_pcp:7363 free_cma:0

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:07:03 +01:00
Krzysztof Kozlowski d0432345b4 iommu: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:26 +01:00
Suravee Suthikulpanit 966b753cf3 iommu/amd: Only support x2APIC with IVHD type 11h/40h
Current implementation for IOMMU x2APIC support makes use of
the MMIO access to MSI capability block registers, which requires
checking EFR[MsiCapMmioSup]. However, only IVHD type 11h/40h contain
the information, and not in the IVHD type 10h IOMMU feature reporting
field. Since the BIOS in newer systems, which supports x2APIC, would
normally contain IVHD type 11h/40h, remove the IOMMU_FEAT_XTSUP_SHIFT
check for IVHD type 10h, and only support x2APIC with IVHD type 11h/40h.

Fixes: 6692981295 ('iommu/amd: Add support for X2APIC IOMMU interrupts')
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:15 +01:00
Suravee Suthikulpanit 813071438e iommu/amd: Check feature support bit before accessing MSI capability registers
The IOMMU MMIO access to MSI capability registers is available only if
the EFR[MsiCapMmioSup] is set. Current implementation assumes this bit
is set if the EFR[XtSup] is set, which might not be the case.

Fix by checking the EFR[MsiCapMmioSup] before accessing the MSI address
low/high and MSI data registers via the MMIO.

Fixes: 6692981295 ('iommu/amd: Add support for X2APIC IOMMU interrupts')
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:15 +01:00
Adrian Huang 387caf0b75 iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions
Some buggy BIOSes might define multiple exclusion ranges of the
IVMD entries which are associated with the same IOMMU hardware.
This leads to the overwritten exclusion range (exclusion_start
and exclusion_length members) in set_device_exclusion_range().

Here is a real case:
When attaching two Broadcom RAID controllers to a server, the first
one reports the failure during booting (the disks connecting to the
RAID controller cannot be detected).

This patch prevents the issue by treating per-device exclusion
ranges as r/w unity-mapped regions.

Discussion:
  * https://lists.linuxfoundation.org/pipermail/iommu/2019-November/040140.html

Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:15 +01:00
Will Deacon 1ea27ee2f7 iommu/arm-smmu: Update my email address in MODULE_AUTHOR()
I no longer work for Arm, so update the stale reference to my old email
address.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:06 +01:00
Will Deacon cd221bd24f iommu/arm-smmu: Allow building as a module
By conditionally dropping support for the legacy binding and exporting
the newly introduced 'arm_smmu_impl_init()' function we can allow the
ARM SMMU driver to be built as a module.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 7359572e1a iommu/arm-smmu: Unregister IOMMU and bus ops on device removal
When removing the SMMU driver, we need to clear any state that we
registered during probe. This includes our bus ops, sysfs entries and
the IOMMU device registered for early firmware probing of masters.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 2852ad05e3 iommu/arm-smmu-v3: Allow building as a module
By removing the redundant call to 'pci_request_acs()' we can allow the
ARM SMMUv3 driver to be built as a module.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Ard Biesheuvel d3daf66621 iommu/arm-smmu: Support SMMU module probing from the IORT
Add support for SMMU drivers built as modules to the ACPI/IORT device
probing path, by deferring the probe of the master if the SMMU driver is
known to exist but has not been loaded yet. Given that the IORT code
registers a platform device for each SMMU that it discovers, we can
easily trigger the udev based autoloading of the SMMU drivers by making
the platform device identifier part of the module alias.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: John Garry <john.garry@huawei.com> # only manual smmu ko loading
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon ab24677471 iommu/arm-smmu-v3: Unregister IOMMU and bus ops on device removal
When removing the SMMUv3 driver, we need to clear any state that we
registered during probe. This includes our bus ops, sysfs entries and
the IOMMU device registered for early firmware probing of masters.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 34debdca68 iommu/arm-smmu: Prevent forced unbinding of Arm SMMU drivers
Forcefully unbinding the Arm SMMU drivers is a pretty dangerous operation,
since it will likely lead to catastrophic failure for any DMA devices
mastering through the SMMU being unbound. When the driver then attempts
to "handle" the fatal faults, it's very easy to trip over dead data
structures, leading to use-after-free.

On John's machine, he reports that the machine was "unusable" due to
loss of the storage controller following a forced unbind of the SMMUv3
driver:

  | # cd ./bus/platform/drivers/arm-smmu-v3
  | # echo arm-smmu-v3.0.auto > unbind
  | hisi_sas_v2_hw HISI0162:01: CQE_AXI_W_ERR (0x800) found!
  | platform arm-smmu-v3.0.auto: CMD_SYNC timeout at 0x00000146
  | [hwprod 0x00000146, hwcons 0x00000000]

Prevent this forced unbinding of the drivers by setting "suppress_bind_attrs"
to true.

Link: https://lore.kernel.org/lkml/06dfd385-1af0-3106-4cc5-6a5b8e864759@huawei.com
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon b06c076ea9 Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"
This reverts commit addb672f20.

Let's get the SMMU driver building as a module, which means putting
back some dead code that we used to carry.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 6e8fa7404c Revert "iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular"
This reverts commit c07b6426df.

Let's get the SMMUv3 driver building as a module, which means putting
back some dead code that we used to carry.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 4312cf7f16 drivers/iommu: Allow IOMMU bus ops to be unregistered
'bus_set_iommu()' allows IOMMU drivers to register their ops for a given
bus type. Unfortunately, it then doesn't allow them to be removed, which
is necessary for modular drivers to shutdown cleanly so that they can be
reloaded later on.

Allow 'bus_set_iommu()' to take a NULL 'ops' argument, which clear the
ops pointer for the selected bus_type.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 386dce2788 iommu/of: Take a ref to the IOMMU driver during ->of_xlate()
Ensure that we hold a reference to the IOMMU driver module while calling
the '->of_xlate()' callback during early device probing.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 25f003de98 drivers/iommu: Take a ref to the IOMMU driver prior to ->add_device()
To avoid accidental removal of an active IOMMU driver module, take a
reference to the driver module in 'iommu_probe_device()' immediately
prior to invoking the '->add_device()' callback and hold it until the
after the device has been removed by '->remove_device()'.

Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon 6bf6c24720 iommu/of: Request ACS from the PCI core when configuring IOMMU linkage
To avoid having to export 'pci_request_acs()' to modular IOMMU drivers,
move the call into the 'of_dma_configure()' path in a similar manner to
the way in which ACS is configured when probing via ACPI/IORT.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Will Deacon a7ba5c3d00 drivers/iommu: Export core IOMMU API symbols to permit modular drivers
Building IOMMU drivers as modules requires that the core IOMMU API
symbols are exported as GPL symbols.

Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: John Garry <john.garry@huawei.com> # smmu v3
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-23 14:06:05 +01:00
Linus Torvalds b371ddb94f IOMMU Fixes for Linux v5.5-rc2
Including:
 
 	- Fix kmemleak warning in IOVA code
 
 	- Fix compile warnings on ARM32/64 in dma-iommu code due to
 	  dma_mask type mismatches
 
 	- Make ISA reserved regions relaxable, so that VFIO can assign
 	  devices which have such regions defined
 
 	- Fix mapping errors resulting in IO page-faults in the VT-d
 	  driver
 
 	- Make sure direct mappings for a domain are created after the
 	  default domain is updated
 
 	- Map ISA reserved regions in the VT-d driver with correct
 	  permissions
 
 	- Remove unneeded check for PSI capability in the IOTLB flush
 	  code of the VT-d driver
 
 	- Lockdep fix iommu_dma_prepare_msi()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl38rzEACgkQK/BELZcB
 GuNPChAAzFdZw0GRphdnsrGog7vJICukFshPifLD8NeJXYzqLzRY89LT/sg4gZrZ
 K3Uibg8+0OmWl21JqAzDzXeHYUwDV0Xe/ygjeOdqFn3LY8zCo6UcY4OLCZ1az/XU
 om/yjTBgZjgBcUAxkzRJSdditQ2p7ItEa4dXnlpeCV07vQEmS/5x8JkNsea7CG2h
 bvBLYW5DpJ1LsJo1WjONHw0DvRkExQsXZaA3zj/6BzfQIXUnkF1Imkgr9gTbXXOl
 nGHHRLVtsFqv0U5JWz6fZh4/UgvInq45gZIkvvxQWAM/Kn9wxe2RKDwpQJ1wZ8wc
 S5fwSPa5g5k2X73BbEHx7AFYESpgCRFOeG74i9b7/DlzsbM+aTGPZ1/4kLt9fl+u
 +AOUV3l9/rqrrmeUEBF7F3kFC9/OL0KIT17xdJfQG1x3RBm9OHy1q0GQH4q8ZbWM
 aoWg3Ryc4uO/4Majm/kIjADKR0512LvplXsRXhWpud37szhL6vMJDxBb1zKZJgQ1
 j/PFUWgolCvmSG1Q048I9pljrsqfE9pgQhmITQ9VAny6eAaZgT7Y21MbBTyQksem
 /O08TWGAFddH4U9pGnQ1ST/q5hcVvnUgzy12A3MOuOYh5tWfAbeZparjQsu7bFhp
 uaJudGyXg9Xgrg82XQ2x/Nk710wLRAGG4VKnUz+mBDEzOOmcems=
 =VWME
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Fix kmemleak warning in IOVA code

 - Fix compile warnings on ARM32/64 in dma-iommu code due to dma_mask
   type mismatches

 - Make ISA reserved regions relaxable, so that VFIO can assign devices
   which have such regions defined

 - Fix mapping errors resulting in IO page-faults in the VT-d driver

 - Make sure direct mappings for a domain are created after the default
   domain is updated

 - Map ISA reserved regions in the VT-d driver with correct permissions

 - Remove unneeded check for PSI capability in the IOTLB flush code of
   the VT-d driver

 - Lockdep fix iommu_dma_prepare_msi()

* tag 'iommu-fixes-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/dma: Relax locking in iommu_dma_prepare_msi()
  iommu/vt-d: Remove incorrect PSI capability check
  iommu/vt-d: Allocate reserved region for ISA with correct permission
  iommu: set group default domain before creating direct mappings
  iommu/vt-d: Fix dmar pte read access not set error
  iommu/vt-d: Set ISA bridge reserved region as relaxable
  iommu/dma: Rationalise types for DMA masks
  iommu/iova: Init the struct iova to fix the possible memleak
2019-12-20 10:42:25 -08:00