[ Upstream commit 462e768b55 ]
There is a typo so this loop does i++ where i-- was intended. It will
result in looping until the kernel crashes.
Fixes: 2659392856 ("iommu/mediatek: Add error path for loop of mm_dts_parse")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/Y5C3mTam2nkbaz6o@kili
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef5bb8e7a7 ]
This driver treats IOMMU_DOMAIN_IDENTITY the same as UNMANAGED, which
cannot possibly be correct.
UNMANAGED domains are required to start out blocking all DMAs. This seems
to be what this driver does as it allocates a first level 'dt' for the IO
page table that is 0 filled.
Thus UNMANAGED looks like a working IO page table, and so IDENTITY must be
a mistake. Remove it.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-97f0adf27b5e+1f0-s50_identity_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef693a8440 ]
Fix the smatch warnings:
drivers/iommu/mtk_iommu.c:878 mtk_iommu_mm_dts_parse() error: uninitialized
symbol 'larbnode'.
If someone abuse the dtsi node(Don't follow the definition of dt-binding),
for example "mediatek,larbs" is provided as boolean property, "larb_nr"
will be zero and cause abnormal.
To fix this problem and improve the code safety, add some checking
for the invalid input from dtsi, e.g. checking the larb_nr/larbid valid
range, and avoid "mediatek,larb-id" property conflicts in the smi-larb
nodes.
Fixes: d2e9a1102c ("iommu/mediatek: Contain MM IOMMU flow with the MM TYPE")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20221018024258.19073-5-yong.wu@mediatek.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2659392856 ]
The mtk_iommu_mm_dts_parse will parse the smi larbs nodes. if the i+1
larb is parsed fail, we should put_device for the i..0 larbs.
There are two places need to comment:
1) The larbid may be not linear mapping, we should loop whole
the array in the error path.
2) I move this line position: "data->larb_imu[id].dev = &plarbdev->dev;"
before "if (!plarbdev->dev.driver)", That means set
data->larb_imu[id].dev before the error path. then we don't need
"platform_device_put(plarbdev)" again in probe_defer case. All depend
on "put_device" of the error path in error cases.
Fixes: d2e9a1102c ("iommu/mediatek: Contain MM IOMMU flow with the MM TYPE")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20221018024258.19073-4-yong.wu@mediatek.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5765a1b44 ]
In order to simplify the error patch(avoid call of_node_put), Use
component_match_add instead component_match_add_release since we are only
interested in the "device" here. Then we could always call of_node_put in
normal path.
Strictly this is not a fixes patch, but it is a prepare for adding the
error path, thus I add a Fixes tag too.
Fixes: d2e9a1102c ("iommu/mediatek: Contain MM IOMMU flow with the MM TYPE")
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20221018024258.19073-3-yong.wu@mediatek.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dcb40e9fcc ]
Add platform_device_put to match with of_find_device_by_node.
Meanwhile, I add a new variable "pcommdev" which is for smi common device.
Otherwise, "platform_device_put(plarbdev)" for smi-common dev may be not
readable. And add a checking for whether pcommdev is NULL.
Fixes: d2e9a1102c ("iommu/mediatek: Contain MM IOMMU flow with the MM TYPE")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20221018024258.19073-2-yong.wu@mediatek.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6cf0981c22 ]
As comment of pci_get_domain_bus_and_slot() says, it returns
a pci device with refcount increment, when finish using it,
the caller must decrement the reference count by calling
pci_dev_put(). So call it before returning from ppr_notifier()
to avoid refcount leak.
Fixes: daae2d25a4 ("iommu/amd: Don't copy GCR3 table root pointer")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221118093604.216371-1-yangyingliang@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73b6924cde ]
platform_get_resource() may return NULL pointer, we need check its
return value to avoid null-ptr-deref in resource_size().
Fixes: 42d57fc58a ("iommu/mediatek: Initialise/Remove for multi bank dev")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20221029103550.3774365-1-yangyingliang@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7eb99841f3 ]
As pointed out in the corresponding downstream fix [0], the permission bits
of the page table entries are compatible between v1 and v2 of the IOMMU.
This is in contrast to the current mainline code that incorrectly assumes
that the read and write permission bits are switched. Fix the permission
bits by reusing the v1 bit defines.
[0] e3bc123a22
Fixes: c55356c534 ("iommu: rockchip: Add support for iommu v2")
Signed-off-by: Michael Riesch <michael.riesch@wolfvision.net>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20221102063553.2464161-1-michael.riesch@wolfvision.net
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e563cc0c78 ]
Allocated iova ranges need to be invalidated immediately or otherwise
they might or might not work when used by master or CPU. This was
discovered when running video decoder conformity test with Cedrus. Some
videos were now and then decoded incorrectly and generated page faults.
According to vendor driver, it's enough to invalidate just start and end
TLB and PTW cache lines. Documentation says that neighbouring lines must
be invalidated too. Finally, when page fault occurs, that iova must be
invalidated the same way, according to documentation.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221025165415.307591-6-jernej.skrabec@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67a8a67f9e ]
Function sun50i_table_flush() takes number of entries as an argument,
not number of bytes. Fix that mistake in sun50i_dte_get_page_table().
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221025165415.307591-5-jernej.skrabec@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eac0104dc6 ]
Because driver has enum type permissions and iommu subsystem has bitmap
type, we have to be careful how check for combined read and write
permissions is done. In such case, we have to mask both permissions and
check that both are set at the same time.
Current code just masks both flags but doesn't check that both are set.
In short, it always sets R/W permission, regardles if requested
permissions were RO, WO or RW. Fix that.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221025165415.307591-4-jernej.skrabec@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cef20703e2 ]
We have to reset masters for all faults - permissions, L1 fault or L2
fault. Currently it's done only for permissions. If other type of fault
happens, master is in locked up state. Fix that by really considering
all fault sources.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221025165415.307591-3-jernej.skrabec@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ad0c1252e ]
Reset signal is asserted by writing 0 to the corresponding locations of
masters we want to reset. So in order to deassert all reset signals, we
should write 1's to all locations.
Current code writes 1's to locations of masters which were just reset
which is good. However, at the same time it also writes 0's to other
locations and thus asserts reset signals of remaining masters. Fix code
by writing all 1's when we want to deassert all reset signals.
This bug was discovered when working with Cedrus (video decoder). When
it faulted, display went blank due to reset signal assertion.
Fixes: 4100b8c229 ("iommu: Add Allwinner H6 IOMMU driver")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20221025165415.307591-2-jernej.skrabec@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf8d2dd2ed ]
Since commit fa7e9ecc5e ("iommu/s390: Tolerate repeat attach_dev
calls") we can end up with duplicates in the list of devices attached to
a domain. This is inefficient and confusing since only one domain can
actually be in control of the IOMMU translations for a device. Fix this
by detaching the device from the previous domain, if any, on attach.
Add a WARN_ON() in case we still have attached devices on freeing the
domain. While here remove the re-attach on failure dance as it was
determined to be unlikely to help and may confuse debug and recovery.
Fixes: fa7e9ecc5e ("iommu/s390: Tolerate repeat attach_dev calls")
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20221025115657.1666860-2-schnelle@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4bedbbd782 ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() for the error path to avoid reference count leak.
Fixes: 2e45528930 ("iommu/vt-d: Unify the way to process DMAR device scope array")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Link: https://lore.kernel.org/r/20221121113649.190393-3-wangxiongfeng2@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afca9e19cc ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() before 'return true' to avoid reference count leak.
Fixes: 89a6079df7 ("iommu/vt-d: Force IOMMU on for platform opt in hint")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Link: https://lore.kernel.org/r/20221121113649.190393-2-wangxiongfeng2@huawei.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7fc961cf7f upstream.
SRS cap is the hardware cap telling if the hardware IOMMU can support
requests seeking supervisor privilege or not. SRE bit in scalable-mode
PASID table entry is treated as Reserved(0) for implementation not
supporting SRS cap.
Checking SRS cap before setting SRE bit can avoid the non-recoverable
fault of "Non-zero reserved field set in PASID Table Entry" caused by
setting SRE bit while there is no SRS cap support. The fault messages
look like below:
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read NO_PASID] Request device [00:0d.0] fault addr 0x1154e1000
[fault reason 0x5a]
SM: Non-zero reserved field set in PASID Table Entry
Fixes: 6f7db75e1c ("iommu/vt-d: Add second level page table interface")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221115070346.1112273-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 242b0aaeab upstream.
The A/D bits are preseted for IOVA over first level(FL) usage for both
kernel DMA (i.e, domain typs is IOMMU_DOMAIN_DMA) and user space DMA
usage (i.e., domain type is IOMMU_DOMAIN_UNMANAGED).
Presetting A bit in FL requires to preset the bit in every related paging
entries, including the non-leaf ones. Otherwise, hardware may treat this
as an error. For example, in a case of ECAP_REG.SMPWC==0, DMA faults might
occur with below DMAR fault messages (wrapped for line length) dumped.
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read NO_PASID] Request device [aa:00.0] fault addr 0x10c3a6000
[fault reason 0x90]
SM: A/D bit update needed in first-level entry when set up in no snoop
Fixes: 289b3b005c ("iommu/vt-d: Preset A/D bits for user space DMA usage")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113010324.1094483-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 620bf9f981 ]
A splat from kmem_cache_destroy() was seen with a kernel prior to
commit ee2653bbe8 ("iommu/vt-d: Remove domain and devinfo mempool")
when there was a failure in init_dmars(), because the iommu_domain
cache still had objects. While the mempool code is now gone, there
still is a leak of the si_domain memory if init_dmars() fails. So
clean up si_domain in the init_dmars() error path.
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Fixes: 86080ccc22 ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20221010144842.308890-1-jsnitsel@redhat.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24b6c7798a ]
The DMA operations of HiSilicon PTT device can only work properly with
identical mappings. So add a quirk for the device to force the domain
as passthrough.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220816114414.4092-2-yangyicong@huawei.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 184233a520 ]
There are two issues here:
1) The "len" variable needs to be checked before the very first write.
Otherwise if omap2_iommu_dump_ctx() with "bytes" less than 32 it is a
buffer overflow.
2) The snprintf() function returns the number of bytes that *would* have
been copied if there were enough space. But we want to know the
number of bytes which were *actually* copied so use scnprintf()
instead.
Fixes: bd4396f09a ("iommu/omap: Consolidate OMAP IOMMU modules")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://lore.kernel.org/r/YuvYh1JbE3v+abd5@kili
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commit 9cd4f14344.
Some issues were reported on the original commit. Some thunderbolt devices
don't work anymore due to the following DMA fault.
DMAR: DRHD: handling fault status reg 2
DMAR: [INTR-REMAP] Request device [09:00.0] fault index 0x8080
[fault reason 0x25]
Blocked a compatibility format interrupt request
Bring it back for now to avoid functional regression.
Fixes: 9cd4f14344 ("iommu/vt-d: Fix possible recursive locking in intel_iommu_init()")
Link: https://lore.kernel.org/linux-iommu/485A6EA5-6D58-42EA-B298-8571E97422DE@getmailspring.com/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216497
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.19.x
Reported-and-tested-by: George Hilliard <thirtythreeforty@gmail.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220920081701.3453504-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The AMD IOMMU driver cannot activate PASID mode on a RID without the RID's
translation being set to IDENTITY. Further it requires changing the RID's
page table layout from the normal v1 IOMMU_DOMAIN_IDENTITY layout to a
different v2 layout.
It does this by creating a new iommu_domain, configuring that domain for
v2 identity operation and then attaching it to the group, from within the
driver. This logic assumes the group is already set to the IDENTITY domain
and is being used by the DMA API.
However, since the ownership logic is based on the group's domain pointer
equaling the default domain to detect DMA API ownership, this causes it to
look like the group is not attached to the DMA API any more. This blocks
attaching drivers to any other devices in the group.
In a real system this manifests itself as the HD-audio devices on some AMD
platforms losing their device drivers.
Work around this unique behavior of the AMD driver by checking for
equality of IDENTITY domains based on their type, not their pointer
value. This allows the AMD driver to have two IDENTITY domains for
internal purposes without breaking the check.
Have the AMD driver properly declare that the special domain it created is
actually an IDENTITY domain.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: stable@vger.kernel.org
Fixes: 512881eacf ("bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management")
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-ea566e16b06b+811-amd_owner_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.
The dmar_global_lock used in the intel_iommu_init() might cause recursive
locking issue, for example, intel_iommu_get_resv_regions() is taking the
dmar_global_lock from within a section where intel_iommu_init() already
holds it via probe_acpi_namespace_devices().
Using dmar_global_lock in intel_iommu_init() could be relaxed since it is
unlikely that any IO board must be hot added before the IOMMU subsystem is
initialized. This eliminates the possible recursive locking issue by moving
down DMAR hotplug support after the IOMMU is initialized and removing the
uses of dmar_global_lock in intel_iommu_init().
Fixes: d5692d4af0 ("iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/894db0ccae854b35c73814485569b634237b5538.1657034828.git.robin.murphy@arm.com
Link: https://lore.kernel.org/r/20220718235325.3952426-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit e8ae0e140c ("vfio: Require that devices support DMA cache
coherence") requires IOMMU drivers to advertise
IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does
not provide to userspace the ability to maintain coherency through cache
invalidations, it requires hardware coherency. Advertise the capability
in order to restore VFIO support.
The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can
enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported".
While virtio-iommu cannot enforce coherency (of PCIe no-snoop
transactions), it does support IOMMU_CACHE.
We can distinguish different cases of non-coherent DMA:
(1) When accesses from a hardware endpoint are not coherent. The host
would describe such a device using firmware methods ('dma-coherent'
in device-tree, '_CCA' in ACPI), since they are also needed without
a vIOMMU. In this case mappings are created without IOMMU_CACHE.
virtio-iommu doesn't need any additional support. It sends the same
requests as for coherent devices.
(2) When the physical IOMMU supports non-cacheable mappings. Supporting
those would require a new feature in virtio-iommu, new PROBE request
property and MAP flags. Device drivers would use a new API to
discover this since it depends on the architecture and the physical
IOMMU.
(3) When the hardware supports PCIe no-snoop. It is possible for
assigned PCIe devices to issue no-snoop transactions, and the
virtio-iommu specification is lacking any mention of this.
Arm platforms don't necessarily support no-snoop, and those that do
cannot enforce coherency of no-snoop transactions. Device drivers
must be careful about assuming that no-snoop transactions won't end
up cached; see commit e02f5c1bb2 ("drm: disable uncached DMA
optimization for ARM and arm64"). On x86 platforms, the host may or
may not enforce coherency of no-snoop transactions with the physical
IOMMU. But according to the above commit, on x86 a driver which
assumes that no-snoop DMA is compatible with uncached CPU mappings
will also work if the host enforces coherency.
Although these issues are not specific to virtio-iommu, it could be
used to facilitate discovery and configuration of no-snoop. This
would require a new feature bit, PROBE property and ATTACH/MAP
flags.
Cc: stable@vger.kernel.org
Fixes: e8ae0e140c ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With CONFIG_INTEL_IOMMU_DEBUGFS enabled, below lockdep splat are seen
when an I/O fault occurs on a machine with an Intel IOMMU in it.
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0
[fault reason 0x05] PTE Write access is not set
DMAR: Dump dmar0 table entries for IOVA 0x0
DMAR: root entry: 0x0000000127f42001
DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001
================================
WARNING: inconsistent lock state
5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes:
ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0xce/0x2d0
_raw_spin_lock+0x33/0x80
klist_add_tail+0x46/0x80
bus_add_device+0xee/0x150
device_add+0x39d/0x9a0
add_memory_block+0x108/0x1d0
memory_dev_init+0xe1/0x117
driver_init+0x43/0x4d
kernel_init_freeable+0x1c2/0x2cc
kernel_init+0x16/0x140
ret_from_fork+0x1f/0x30
irq event stamp: 7812
hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60
softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
The klist iterator functions using spin_*lock_irq*() but the klist
insertion functions using spin_*lock(), combined with the Intel DMAR
IOMMU driver iterating over klists from atomic (hardirq) context, where
pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates
over klists.
As currently there's no plan to fix the klist to make it safe to use in
atomic context, this fixes the lockdep splat by avoid calling
pci_get_domain_bus_and_slot() in the hardirq context.
Fixes: 8ac0b64b97 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()")
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/
Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The per domain spinlock is acquired in iommu_flush_dev_iotlb(), which
is possbile to be called in the interrupt context. For example, the
drm-intel's CI system got completely blocked with below error:
WARNING: inconsistent lock state
6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0xd3/0x310
_raw_spin_lock+0x2a/0x40
domain_update_iommu_cap+0x20b/0x2c0
intel_iommu_attach_device+0x5bd/0x860
__iommu_attach_device+0x18/0xe0
bus_iommu_probe+0x1f3/0x2d0
bus_set_iommu+0x82/0xd0
intel_iommu_init+0xe45/0x102a
pci_iommu_init+0x9/0x31
do_one_initcall+0x53/0x2f0
kernel_init_freeable+0x18f/0x1e1
kernel_init+0x11/0x120
ret_from_fork+0x1f/0x30
irq event stamp: 162354
hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70
hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50
softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&domain->lock);
<Interrupt>
lock(&domain->lock);
*** DEADLOCK ***
1 lock held by swapper/6/0:
This coverts the spin_lock/unlock() into the irq save/restore varieties
to fix the recursive locking issues.
Fixes: ffd5869d93 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The Intel IOMMU driver possibly selects between the first-level and the
second-level translation tables for DMA address translation. However,
the levels of page-table walks for the 4KB base page size are calculated
from the SAGAW field of the capability register, which is only valid for
the second-level page table. This causes the IOMMU driver to stop working
if the hardware (or the emulated IOMMU) advertises only first-level
translation capability and reports the SAGAW field as 0.
This solves the above problem by considering both the first level and the
second level when calculating the supported page table levels.
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The translation table copying code for kdump kernels is currently based
on the extended root/context entry formats of ECS mode defined in older
VT-d v2.5, and doesn't handle the scalable mode formats. This causes
the kexec capture kernel boot failure with DMAR faults if the IOMMU was
enabled in scalable mode by the previous kernel.
The ECS mode has already been deprecated by the VT-d spec since v3.0 and
Intel IOMMU driver doesn't support this mode as there's no real hardware
implementation. Hence this converts ECS checking in copying table code
into scalable mode.
The existing copying code consumes a bit in the context entry as a mark
of copied entry. It needs to work for the old format as well as for the
extended context entries. As it's hard to find such a common bit for both
legacy and scalable mode context entries. This replaces it with a per-
IOMMU bitmap.
Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We started using a 64 bit completion value. Unfortunately, we only
stored the low 32-bits, so a very large completion value would never
be matched in iommu_completion_wait().
Fixes: c69d89aff3 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Link: https://lore.kernel.org/r/20220801192229.3358786-1-jsperbeck@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This reverts commit b09796d528.
An issue was reported[1] on the original commit. I'll need to address that
before I can delete the use of driver_deferred_probe_check_state(). So,
bring it back for now.
[1] - https://lore.kernel.org/lkml/4799738.LvFx2qVVIh@steina-w/
Fixes: b09796d528 ("iommu/of: Delete usage of driver_deferred_probe_check_state()")
Reported-by: Jean-Philippe Brucker <jpb@kernel.org>
Tested-by: Jean-Philippe Brucker <jpb@kernel.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- A bunch of small fixes for the recently merged LoongArch drivers
- A leftover from the non-SMP IRQ affinity rework affecting
the Hyper-V IOMMU code
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmL2TNUPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDYH0P/0oThDALd3H/ieelfUBG1LCTRWR+b0O3E4Ge
6tJkFwhhEEpqDmtJBAWAtuWCrwwpipCWif2TCpvkbD3mhf8LWqs5HNWB5qSoK6th
dWMszhV4ljr8WH/mELxbuHapPuSYUSXV6Ty4f5b8A2KvSTNXmHhXQFU+22eG13TZ
w4aXGLfGunD0ozA1l2VR6EyCFfwiRg07jQUj48Hm0UuaBAQpDf3kfFDM/aM4rpNA
TLpdYO9kqppN1VoeovUP4H3nmCpwbvT2mPNmbz29pfkCHIKkhgFNrnh3Po8DQTrL
ddUECSjk7F4KGC4e/C5zBq/A09Znj1IqrMlP/pRvj56jrdvxEIvpdE1Y7/8AoWYp
DTODGldFjaUhpnOGUj0x3GfcUFC0qQL8lqi+qJ4YzKE7BXI8tjjjmXBGLgRjQCox
h9cFwYDdnALF/kKi1LISbCTf5ali6paB3xkb/VHPZ2yCjdiHgGaAYTXZjSHXqXa8
f6jvdl3olGUgwdpX2LprzFPTvspu7rImKNXRrqkrGBuUugjibr4sHYqZjfFNJ8Y/
uOKpzku8Ck+Husg7nsytFPYrYEss446/+dLxyJo+YXZS5+b30hC1meCFnK0qk0bk
mgX6xP/gdP3Hg898ZenVLB3rmIWMR76tdVQ4wTdvMx93xZTl6bcxj8axwjhvCVm7
SqJuwBDk
=GV+C
-----END PGP SIGNATURE-----
Merge tag 'irqchip-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:
- A bunch of small fixes for the recently merged LoongArch drivers
- A leftover from the non-SMP IRQ affinity rework affecting
the Hyper-V IOMMU code
Link: https://lore.kernel.org/r/20220812125910.2227338-1-maz@kernel.org
This branch consists of:
Qu Wenruo:
lib: bitmap: fix the duplicated comments on bitmap_to_arr64()
https://lore.kernel.org/lkml/0d85e1dbad52ad7fb5787c4432bdb36cbd24f632.1656063005.git.wqu@suse.com/
Alexander Lobakin:
bitops: let optimize out non-atomic bitops on compile-time constants
https://lore.kernel.org/lkml/20220624121313.2382500-1-alexandr.lobakin@intel.com/T/
Yury Norov:
lib: cleanup bitmap-related headers
https://lore.kernel.org/linux-arm-kernel/YtCVeOGLiQ4gNPSf@yury-laptop/T/#m305522194c4d38edfdaffa71fcaaf2e2ca00a961
Alexander Lobakin:
x86/olpc: fix 'logical not is only applied to the left hand side'
https://www.spinics.net/lists/kernel/msg4440064.html
Yury Norov:
lib/nodemask: inline wrappers around bitmap
https://lore.kernel.org/all/20220723214537.2054208-1-yury.norov@gmail.com/
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmLpVvwACgkQsUSA/Tof
vsiAHgwAwS9pl8GJ+fKYnue2CYo9349d2oT6BBUs/Rv8uqYEa4QkpYsR7NS733TG
pos0hhoRvSOzrUP4qppXUjfJ+NkzLgpnKFOeWfFoNAKlHuaaMRvF3Y0Q/P8g0/Kg
HPWcCQLHyCH9Wjs3e2TTgRjxTrHuruD2VJ401/PX/lw0DicUhmev5mUFa10uwFkP
ZJRprjoFn9HJ0Hk16pFZDi36d3YumhACOcWRiJdoBDrEPV3S6lm9EeOy/yHBNp5k
9bKj+RboeT2t70KaZcKv+M5j1nu0cAhl7kRkjcxcmGyimI0l82Vgq9yFxhGqvWg8
RnCrJ5EaO08FGCAKG9GEwzdiNa24Gdq5XZSpQA7JZHmhmchpnnlNenJicyv0gOQi
abChZeWSEsyA+78l2+kk9nezfVKUOnKDEZQxBVTOyWsmZYxHZV94oam340VjQDaY
4/fETdOy/qqPIxnpxAeFGWxZjcVaYiYPLj7KLPMsB0aAAF7pZrem465vSfgbrE81
+gCdqrWd
=4dTW
-----END PGP SIGNATURE-----
Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:
- fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo)
- optimize out non-atomic bitops on compile-time constants (Alexander
Lobakin)
- cleanup bitmap-related headers (Yury Norov)
- x86/olpc: fix 'logical not is only applied to the left hand side'
(Alexander Lobakin)
- lib/nodemask: inline wrappers around bitmap (Yury Norov)
* tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits)
lib/nodemask: inline next_node_in() and node_random()
powerpc: drop dependency on <asm/machdep.h> in archrandom.h
x86/olpc: fix 'logical not is only applied to the left hand side'
lib/cpumask: move some one-line wrappers to header file
headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure
headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>
headers/deps: mm: Optimize <linux/gfp.h> header dependencies
lib/cpumask: move trivial wrappers around find_bit to the header
lib/cpumask: change return types to unsigned where appropriate
cpumask: change return types to bool where appropriate
lib/bitmap: change type of bitmap_weight to unsigned long
lib/bitmap: change return types to bool where appropriate
arm: align find_bit declarations with generic kernel
iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
lib/test_bitmap: test the tail after bitmap_to_arr64()
lib/bitmap: fix off-by-one in bitmap_to_arr64()
lib: test_bitmap: add compile-time optimization/evaluations assertions
bitmap: don't assume compiler evaluates small mem*() builtins calls
net/ice: fix initializing the bitmap in the switch code
bitops: let optimize out non-atomic bitops on compile-time constants
...
This reverts commit 4bf7fda4dc.
It turns out that it was hopelessly naive to think that this would work,
considering that we've always done this. The first machine I actually
tested this on broke at bootup, getting to
Reached target cryptsetup.target - Local Encrypted Volumes.
and then hanging. It's unclear what actually fails, since there's a lot
else going on around that time (eg amdgpu probing also happens around
that same time, but it could be some other random init thing that didn't
complete earlier and just caused the boot to hang at that point).
The expectations that we should default to some unsafe and untested mode
seems entirely unfounded, and the belief that this wouldn't affect
modern systems is clearly entirely false. The machine in question is
about two years old, so it's not exactly shiny, but it's also not some
dusty old museum piece PDP-11 in a closet.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin Murphy,
Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmLuIYULHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPS5A//Ty1ZNyXExmwZ6J6g7/oIvQlpAHilDr22mCd8tR8Y
Ne7TgLa/X+usFvJTxJfkvg/LNMDjD7qx0J/mhDGm4reOFcEL4/PBy0rDSOgnmntV
k/fPhgwnpuztiAQ+s+WkJ3pkrmG1HaEId7GGj2JaoYdas6RX2mGX7vL8uvUFepjw
lYPAqWMtJHkOfsDK0PqqyQsr7dcC6lyFLqnn/wqvHtTJeKCfGs6W/SIrlWme2SZY
3dNx84ZR1uPjaazAmtf2IWfjh/TBmd0ETRYycgUUKRP9iwsCkBQDBwsBGSIYXiWj
BUKQ5oMvjAlUGRF0jYz9e77KuedE6GxWiXNQstitBmid142M37DHA5tvZRf65MPS
THHcjTDmmoaO4YfFhhXOcFOrjG4/V8bF7fgHB6XkHDjhVVTcnIx8zuOAXIVBZvIV
VAALmamBqEfIZZrCqgr7hzFssK2bip+TIMkdoD46Wcr+D7bAlujhuzWxubn9+ulT
23v/pAvC80ut6LvKj6EA+GpRm/pejfOtEbjXPoO2hguNxvuUKvPQqNh9hy0q+v1e
8n2Y/4lhy5bv02S7wKooNkfCoV753jBY1TIru45UmEYc3EkTQPii6okYe0DvW4QX
VCnKgo156wSBfE+9eWdxCROv2SZqJFMV/wL3vw54dpJQMbDy7VkNsh4mGREdUkU1
uek=
=Bv19
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- convert arm32 to the common dma-direct code (Arnd Bergmann, Robin
Murphy, Christoph Hellwig)
- restructure the PCIe peer to peer mapping support (Logan Gunthorpe)
- allow the IOMMU code to communicate an optional DMA mapping length
and use that in scsi and libata (John Garry)
- split the global swiotlb lock (Tianyu Lan)
- various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang,
Lukas Bulwahn, Robin Murphy)
* tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping: (45 commits)
swiotlb: fix passing local variable to debugfs_create_ulong()
dma-mapping: reformat comment to suppress htmldoc warning
PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg()
RDMA/rw: drop pci_p2pdma_[un]map_sg()
RDMA/core: introduce ib_dma_pci_p2p_dma_supported()
nvme-pci: convert to using dma_map_sgtable()
nvme-pci: check DMA ops when indicating support for PCI P2PDMA
iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg
iommu: Explicitly skip bus address marked segments in __iommu_map_sg()
dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support
dma-direct: support PCI P2PDMA pages in dma-direct map_sg
dma-mapping: allow EREMOTEIO return code for P2PDMA transfers
PCI/P2PDMA: Introduce helpers for dma_map_sg implementations
PCI/P2PDMA: Attempt to set map_type if it has not been set
lib/scatterlist: add flag for indicating P2PDMA segments in an SGL
swiotlb: clean up some coding style and minor issues
dma-mapping: update comment after dmabounce removal
scsi: sd: Add a comment about limiting max_sectors to shost optimal limit
ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors
scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit
...
Including:
- Most intrusive patch is small and changes the default
allocation policy for DMA addresses. Before the change the
allocator tried its best to find an address in the first 4GB.
But that lead to performance problems when that space gets
exhaused, and since most devices are capable of 64-bit DMA
these days, we changed it to search in the full DMA-mask
range from the beginning. This change has the potential to
uncover bugs elsewhere, in the kernel or the hardware. There
is a Kconfig option and a command line option to restore the
old behavior, but none of them is enabled by default.
- Add Robin Murphy as reviewer of IOMMU code and maintainer for
the dma-iommu and iova code
- Chaning IOVA magazine size from 1032 to 1024 bytes to save
memory
- Some core code cleanups and dead-code removal
- Support for ACPI IORT RMR node
- Support for multiple PCI domains in the AMD-Vi driver
- ARM SMMU changes from Will Deacon:
- Add even more Qualcomm device-tree compatible strings
- Support dumping of IMP DEF Qualcomm registers on TLB sync
timeout
- Fix reference count leak on device tree node in Qualcomm
driver
- Intel VT-d driver updates from Lu Baolu:
- Make intel-iommu.h private
- Optimize the use of two locks
- Extend the driver to support large-scale platforms
- Cleanup some dead code
- MediaTek IOMMU refactoring and support for TTBR up to 35bit
- Basic support for Exynos SysMMU v7
- VirtIO IOMMU driver gets a map/unmap_pages() implementation
- Other smaller cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmLs3DIACgkQK/BELZcB
GuMizhAAguAnLLOkOLlR9/MhrTZfNXCUX+bfrEIevjFXMw4iPNfCCr4ydQ7EdVK6
ZA/3Z89huYl0d0x/FELolnQi+HOeqYrfTDe4rB7TgNgwZnWa+fdHcyYkgBGyfPaV
ilgjNcx8o//9o4NasyB6kU395jVmFxb735gMTTb+tcO9fr+/qIB6hxrHuCklxrNr
C7wK6kkoDPi5n0QuXCSjXEx2Hk245pAWKPLwqxsUYzHGlLfl7ULOxw65BUBGvn/H
uCsTfJFu7u+ErwQYf0qPuOwRBnRdsx9g5EAnfab8p074SoKWvbNnftIxgIRp8ZEM
YgCbhYa1GOFI4r+XzqRzEbc0/vPSttims4Jqz0KxYs7pr5EoVifrWLJFjJdCdc2h
Tio1gTvOq8HbH63kwYNKJhg4iSC6zVd37ihEhvfFO6LcgFl4iCfd2o9zK7oY40J4
XoOxofVnJ2e3tzdhZ/n5quCXiudHixm6WuVa7QYKscF7Ud0tY1wWKuibdlMQTeNM
68MvtlteKcfs1BrWzZyrFMrFeAfIY8LI82y6jdJuoNMU5LE9+5yelXBdJhnVygZ+
Jglv1TIt6W/z1H5JgXtNVZ1wWgBm7rurOqNyfN8XCd8eP1z321CLfX8ujkhKrIWP
ApG15cwvpnh1JX630+UFiEikTGU0fb2orMdPwYmwuu8DAsoLVHE=
=hI2K
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.20-or-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- The most intrusive patch is small and changes the default allocation
policy for DMA addresses.
Before the change the allocator tried its best to find an address in
the first 4GB. But that lead to performance problems when that space
gets exhaused, and since most devices are capable of 64-bit DMA these
days, we changed it to search in the full DMA-mask range from the
beginning.
This change has the potential to uncover bugs elsewhere, in the
kernel or the hardware. There is a Kconfig option and a command line
option to restore the old behavior, but none of them is enabled by
default.
- Add Robin Murphy as reviewer of IOMMU code and maintainer for the
dma-iommu and iova code
- Chaning IOVA magazine size from 1032 to 1024 bytes to save memory
- Some core code cleanups and dead-code removal
- Support for ACPI IORT RMR node
- Support for multiple PCI domains in the AMD-Vi driver
- ARM SMMU changes from Will Deacon:
- Add even more Qualcomm device-tree compatible strings
- Support dumping of IMP DEF Qualcomm registers on TLB sync
timeout
- Fix reference count leak on device tree node in Qualcomm driver
- Intel VT-d driver updates from Lu Baolu:
- Make intel-iommu.h private
- Optimize the use of two locks
- Extend the driver to support large-scale platforms
- Cleanup some dead code
- MediaTek IOMMU refactoring and support for TTBR up to 35bit
- Basic support for Exynos SysMMU v7
- VirtIO IOMMU driver gets a map/unmap_pages() implementation
- Other smaller cleanups and fixes
* tag 'iommu-updates-v5.20-or-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (116 commits)
iommu/amd: Fix compile warning in init code
iommu/amd: Add support for AVIC when SNP is enabled
iommu/amd: Simplify and Consolidate Virtual APIC (AVIC) Enablement
ACPI/IORT: Fix build error implicit-function-declaration
drivers: iommu: fix clang -wformat warning
iommu/arm-smmu: qcom_iommu: Add of_node_put() when breaking out of loop
iommu/arm-smmu-qcom: Add SM6375 SMMU compatible
dt-bindings: arm-smmu: Add compatible for Qualcomm SM6375
MAINTAINERS: Add Robin Murphy as IOMMU SUBSYTEM reviewer
iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
iommu/amd: Set translation valid bit only when IO page tables are in use
iommu/amd: Introduce function to check and enable SNP
iommu/amd: Globally detect SNP support
iommu/amd: Process all IVHDs before enabling IOMMU features
iommu/amd: Introduce global variable for storing common EFR and EFR2
iommu/amd: Introduce Support for Extended Feature 2 Register
iommu/amd: Change macro for IOMMU control register bit shift to decimal value
iommu/exynos: Enable default VM instance on SysMMU v7
iommu/exynos: Add SysMMU v7 register set
...
Here is the set of driver core and kernfs changes for 6.0-rc1.
"biggest" thing in here is some scalability improvements for kernfs for
large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed
and discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYuqCnw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym/JgCcCnaycJY00ZPRQm3LQCyzfJ0HgqoAn2qxGV+K
NKycLeXZSnuvIA87dycE
=/4Jk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core / kernfs updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.0-rc1.
The "biggest" thing in here is some scalability improvements for
kernfs for large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed and
discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems"
* tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits)
docs: embargoed-hardware-issues: fix invalid AMD contact email
firmware_loader: Replace kmap() with kmap_local_page()
sysfs docs: ABI: Fix typo in comment
kobject: fix Kconfig.debug "its" grammar
kernfs: Fix typo 'the the' in comment
docs: driver-api: firmware: add driver firmware guidelines. (v3)
arch_topology: Fix cache attributes detection in the CPU hotplug path
ACPI: PPTT: Leave the table mapped for the runtime usage
cacheinfo: Use atomic allocation for percpu cache attributes
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
MAINTAINERS: Change mentions of mpm to olivia
docs: ABI: sysfs-devices-soc: Update Lee Jones' email address
docs: ABI: sysfs-class-pwm: Update Lee Jones' email address
Documentation/process: Add embargoed HW contact for LLVM
Revert "kernfs: Change kernfs_notify_list to llist."
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
...
Recent changes to solve inconsistencies in handling IRQ masks #ifdef
out the affinity field in irq_common_data for non-SMP configurations.
The current code in hyperv_irq_remapping_alloc() gets a compiler error
in that case.
Fix this by using the new irq_data_update_affinity() helper, which
handles the non-SMP case correctly.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: aa0813581b ("genirq: Provide an IRQ affinity mask in non-SMP configs")
Link: https://lore.kernel.org/r/1658796820-2261-1-git-send-email-mikelley@microsoft.com
core:
- Fix a few inconsistencies between UP and SMP vs. interrupt affinities
- Small updates and cleanups all over the place
drivers:
- New driver for the LoongArch interrupt controller
- New driver for the Renesas RZ/G2L interrupt controller
- Hotpath optimization for SiFive PLIC
- Workaround for broken PLIC edge triggered interrupts
- Simall cleanups and improvements as usual
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmLn5agTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoV2HD/4u0+09Fd8Awt1Knnb4CInmwFihZ/bu
EiS1Air+MEJ/fyFb5sT/Dn8YdUWYA6a3ifpLMGBwrKCcb5pMwPEtI8uqjSmtgsN/
2Z4o3N5v6EgM25CtrHNBrXK0E9Rz5Py49gm5p3K7+h4g63z9JwrM4G0Bvr8+krLS
EV9IZU6kVmGC6gnG/MspkArsLk1rCM0PU0SJ2lEPsWd1fjhVSDfunvy/qnnzXRzz
wjrcAf+a2Kgb1TMnpL6tx9d2Xx8KrKfODZTdOmPHrdv58F0EbJzapJnAVkYZDPtR
LE2kQc2Qhdawx0kgNNNhvu9P6oZtpnK9N7KAhDQdw17sgrRygINjAMSEe2RykYL1
lK+lJOIzfyd2JkEuC/8w1ZezL88S0EaTNawrkxjJ8L3fa7WDbwilCC+1w95QydCv
sQB137OaLKgWetcRsht9PLWFb4ujkWzxoPf2cPPsm81EzCicNtBuNPLReBTcNrWJ
X2VPpbaqRK8t8bnkXRqhahbq7f8c86feoICHfA4c7T4eZUp/Oq6T8aNvf6WPgjae
I2/FO6kxZj3CQqFkhJGhiZRtGZdx6HLCsL84A+2Ktsra+D8+/qecZCnkHYtz0Vo6
aFuGg+Wj+zuc2QfdaWwG8Dh5dijbxgHGHhzbh9znsWzytN9gfoBxuvBejf65i6sC
In63mEkv35ttfA==
=OnhF
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for interrupt core and drivers:
Core:
- Fix a few inconsistencies between UP and SMP vs interrupt
affinities
- Small updates and cleanups all over the place
New drivers:
- LoongArch interrupt controller
- Renesas RZ/G2L interrupt controller
Updates:
- Hotpath optimization for SiFive PLIC
- Workaround for broken PLIC edge triggered interrupts
- Simall cleanups and improvements as usual"
* tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
irqchip/mmp: Declare init functions in common header file
irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
genirq: Use for_each_action_of_desc in actions_show()
irqchip / ACPI: Introduce ACPI_IRQ_MODEL_LPIC for LoongArch
irqchip: Add LoongArch CPU interrupt controller support
irqchip: Add Loongson Extended I/O interrupt controller support
irqchip/loongson-liointc: Add ACPI init support
irqchip/loongson-pch-msi: Add ACPI init support
irqchip/loongson-pch-pic: Add ACPI init support
irqchip: Add Loongson PCH LPC controller support
LoongArch: Prepare to support multiple pch-pic and pch-msi irqdomain
LoongArch: Use ACPI_GENERIC_GSI for gsi handling
genirq/generic_chip: Export irq_unmap_generic_chip
ACPI: irq: Allow acpi_gsi_to_irq() to have an arch-specific fallback
APCI: irq: Add support for multiple GSI domains
LoongArch: Provisionally add ACPICA data structures
irqdomain: Use hwirq_max instead of revmap_size for NOMAP domains
irqdomain: Report irq number for NOMAP domains
irqchip/gic-v3: Fix comment typo
dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/V2L SoC
...
A recent commit introduced these compile warnings:
CC drivers/iommu/amd/init.o
drivers/iommu/amd/init.c:938:12: error: ‘iommu_init_ga_log’ defined but not used [-Werror=unused-function]
938 | static int iommu_init_ga_log(struct amd_iommu *iommu)
| ^~~~~~~~~~~~~~~~~
drivers/iommu/amd/init.c:902:12: error: ‘iommu_ga_log_enable’ defined but not used [-Werror=unused-function]
902 | static int iommu_ga_log_enable(struct amd_iommu *iommu)
| ^~~~~~~~~~~~~~~~~~~
The warnings appear because both functions are defined when IRQ
remapping is not enabled, but only used when IRQ remapping is enabled.
Fix it by only defining the functions when IRQ remapping is enabled.
Fixes: c5e1a1eb92 ("iommu/amd: Simplify and Consolidate Virtual APIC (AVIC) Enablement")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220729100432.22474-1-joro@8bytes.org
In order to support AVIC on SNP-enabled system, The IOMMU driver needs to
check EFR2[SNPAVICSup] and enables the support by setting SNPAVICEn bit
in the IOMMU control register (MMIO offset 18h).
For detail, please see section "SEV-SNP Guest Virtual APIC Support" of the
AMD I/O Virtualization Technology (IOMMU) Specification.
(https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf)
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20220726134348.6438-3-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Call pci_p2pdma_map_segment() when a PCI P2PDMA page is seen so the bus
address is set in the dma address and the segment is marked with
sg_dma_mark_bus_address(). iommu_map_sg() will then skip these segments.
Then, in __finalise_sg(), copy the dma address from the input segment
to the output segment. __invalidate_sg() must also learn to skip these
segments.
A P2PDMA page may have three possible outcomes when being mapped:
1) If the data path between the two devices doesn't go through
the root port, then it should be mapped with a PCI bus address
2) If the data path goes through the host bridge, it should be mapped
normally with an IOMMU IOVA.
3) It is not possible for the two devices to communicate and thus
the mapping operation should fail (and it will return -EREMOTEIO).
Similar to dma-direct, the sg_dma_mark_pci_p2pdma() flag is used to
indicate bus address segments. On unmap, P2PDMA segments are skipped
over when determining the start and end IOVA addresses.
With this change, the flags variable in the dma_map_ops is set to
DMA_F_PCI_P2PDMA_SUPPORTED to indicate support for P2PDMA pages.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In order to support PCI P2PDMA mappings with dma-iommu, explicitly skip
any segments marked with sg_dma_mark_bus_address() in __iommu_map_sg().
These segments should not be mapped into the IOVA and will be handled
separately in as subsequent patch for dma-iommu.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When building with Clang we encounter the following warning:
| drivers/iommu/msm_iommu.c:603:6: error: format specifies type 'unsigned
| short' but the argument has type 'int' [-Werror,-Wformat] sid);
`sid` is an int, use the proper format specifier `%x`.
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220721210331.4012015-1-justinstitt@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In qcom_iommu_has_secure_context(), we should call of_node_put()
for the reference 'child' when breaking out of for_each_child_of_node()
which will automatically increase and decrease the refcount.
Fixes: d051f28c88 ("iommu/qcom: Initialize secure page table")
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220719124955.1242171-1-windhl@126.com
Signed-off-by: Will Deacon <will@kernel.org>