7301 commits

Author SHA1 Message Date
Hou Zhiqiang
bcbe0d9a8d PCI: mobiveil: Unify register accessors
It is confusing to have two sets of functions to read/write
registers, some with csr_readl()/csr_writel(), while others with
read_paged_register()/write_paged_register().

In the register space, the lower 3KB of the 4KB PCIe configuration space
can be accessed directly and the upper 1KB through a simple paging
mechanism.

Unify the register accessors in csr_readl() and csr_writel() by
comparing the register offset with the 3KB page access boundary inside
the accessors, so that the paging mechanism is hidden behind the common
csr_readl()/csr_writel() function calls.

Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com>
Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in>
2019-07-08 11:23:13 +01:00
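
The unified accessor idea described in the commit above can be sketched in user space as follows. This is only an illustrative model, not the driver's code: the `fake_*` names, the number of pages, and the way the page index is derived from the offset are all assumptions; only the 3KB boundary and the 1KB paged window come from the commit log.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only; sizes and names below are assumptions. */
#define PAGE_BOUNDARY 0x0c00u /* 3 KB: offsets below this are direct */
#define WINDOW_SIZE   0x0400u /* 1 KB window used for paged access */
#define NUM_PAGES     8u      /* hypothetical number of selectable pages */

struct fake_regs {
	uint32_t direct[PAGE_BOUNDARY / 4];         /* lower 3 KB */
	uint32_t paged[NUM_PAGES][WINDOW_SIZE / 4]; /* storage behind the window */
	uint32_t page_sel;                          /* currently selected page */
};

/* Resolve an offset to a storage cell, selecting a page when needed. */
static uint32_t *fake_reg_addr(struct fake_regs *r, uint32_t off)
{
	assert((off & 3u) == 0 && off < NUM_PAGES * WINDOW_SIZE);

	if (off < PAGE_BOUNDARY) {
		r->page_sel = 0; /* direct access: no page selected */
		return &r->direct[off / 4];
	}

	r->page_sel = off / WINDOW_SIZE; /* page index from the high bits */
	return &r->paged[r->page_sel][(off % WINDOW_SIZE) / 4];
}

/* Callers use one pair of accessors and never see the paging. */
static uint32_t fake_csr_readl(struct fake_regs *r, uint32_t off)
{
	return *fake_reg_addr(r, off);
}

static void fake_csr_writel(struct fake_regs *r, uint32_t off, uint32_t val)
{
	*fake_reg_addr(r, off) = val;
}
```

The point of the design is that the boundary comparison lives in one helper, so no caller has to choose between a direct and a paged accessor.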
Rafael J. Wysocki
3dbeb44854 Merge branch 'pm-sleep'
* pm-sleep:
  PM: sleep: Drop dev_pm_skip_next_resume_phases()
  ACPI: PM: Drop unused function and function header
  ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS
  ACPI: PM: Simplify and fix PM domain hibernation callbacks
  PCI: PM: Simplify bus-level hibernation callbacks
  PM: ACPI/PCI: Resume all devices during hibernation
  kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset()
  PM: sleep: Update struct wakeup_source documentation
  drivers: base: power: remove wakeup_sources_stats_dentry variable
  PM: suspend: Rename pm_suspend_via_s2idle()
  PM: sleep: Show how long dpm_suspend_start() and dpm_suspend_end() take
  PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
2019-07-08 10:51:25 +02:00
Leonard Crestez
7e8ab1b268 PCI: imx6: Simplify Kconfig depends on
The imx6 driver can be used on imx6sx without enabling support for
imx6q or imx7d but the "depends on" condition doesn't allow that.

Instead of making the condition even longer just make it depend on
"ARCH_MXC || COMPILE_TEST" instead.

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
2019-07-05 16:44:41 +01:00
Dexuan Cui
4df591b20b PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
Fix a use-after-free in hv_eject_device_work().

Fixes: 05f151a73e ("PCI: hv: Fix a memory leak in hv_eject_device_work()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: stable@vger.kernel.org
2019-07-05 14:37:44 +01:00
Vidya Sagar
7be142caab PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30
The PCI Tegra controller conversion to a device tree configurable
driver in commit d1523b52bf ("PCI: tegra: Move PCIe driver
to drivers/pci/host") implied that code for the driver can be
compiled in for a kernel supporting multiple platforms.

Unfortunately, a blind move of the code did not check that some of the
quirks that were applied in arch/arm (eg enabling Relaxed Ordering on
all PCI devices - since the quirk hook erroneously matches PCI_ANY_ID
for both Vendor-ID and Device-ID) are now applied in all kernels that
compile the PCI Tegra controller driver, DT and ACPI alike.

This is completely wrong, in that enablement of Relaxed Ordering is only
required by default in Tegra20 platforms as described in the Tegra20
Technical Reference Manual (available at
https://developer.nvidia.com/embedded/downloads#?search=tegra%202 in
Section 34.1, where it is mentioned that Relaxed Ordering bit needs to
be enabled in its root ports to avoid deadlock in hardware) and in the
Tegra30 platforms for the same reasons (unfortunately not documented
in the TRM).

There is no other strict requirement on PCI devices Relaxed Ordering
enablement on any other Tegra platforms or PCI host bridge driver.

Fix this quite upsetting situation by limiting the vendor and device IDs
to which the Relaxed Ordering quirk applies to the root ports in
question, reported above.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[lorenzo.pieralisi@arm.com: completely rewrote the commit log/fixes tag]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-05 13:59:59 +01:00
Manikanta Maddireddy
4b16a82279 PCI: tegra: Change link retry log level to debug
The driver checks for link up three times before giving up, and each
retry attempt is printed as an error. Letting users know that the PCIe
link is down and in the process of being brought up again is debug
information, not an error condition.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-05 13:59:59 +01:00
Manikanta Maddireddy
dbdcc22c84 PCI: tegra: Add support for GPIO based PERST#
Tegra PCIe has fixed per port SFIO line to signal PERST#, which can be
controlled by AFI port register. However, if a platform routes a
different GPIO to the PCIe slot, then port register cannot control it.
Add support for GPIO based PERST# signal for such platforms. GPIO number
comes from per port PCIe device tree node. PCIe driver probe doesn't
fail if per port "reset-gpios" property is not populated, so platforms
that require this workaround must make sure that the DT property is not
missed in the corresponding device tree.

Link: https://lore.kernel.org/linux-pci/20190705084850.30777-1-jonathanh@nvidia.com/
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[lorenzo.pieralisi@arm.com: squashed in fix in Link]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-07-05 13:57:58 +01:00
Alex Williamson
06013b647c PCI/IOV: Assume SR-IOV VFs support extended config space.
The SR-IOV specification requires both PFs and VFs to implement a PCIe
capability.  Generally this is sufficient to assume extended config space
is present, but we generally also perform additional tests to make sure the
extended config space is reachable and not simply an alias of standard
config space.  For a VF to exist extended config space must be accessible
on the PF, therefore we can also assume it to be accessible on the VF.
This enables a micro performance optimization previously implemented in
commit 975bb8b4dc ("PCI/IOV: Use VF0 cached config space size for other
VFs") to speed up probing of VFs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Hao Zheng <yinhe@linux.alibaba.com>
2019-07-03 08:58:57 -05:00
Alex Williamson
76bf6a8634 Revert "PCI/IOV: Use VF0 cached config space size for other VFs"
Revert 975bb8b4dc ("PCI/IOV: Use VF0 cached config space size for other
VFs"), which attempted to cache the config space size from the first VF to
re-use for subsequent VFs.

The cached value was determined prior to discovering the PCIe capability on
the VF, which resulted in the first VF reporting the correct config space
size (4K), as it has a special case through pci_cfg_space_size(), while all
the other VFs only reported 256 bytes.  As this was only a performance
optimization, we're better off without it.

Fixes: 975bb8b4dc ("PCI/IOV: Use VF0 cached config space size for other VFs")
Link: https://lore.kernel.org/r/156046663197.29869.3633634445109057665.stgit@gimli.home
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Hao Zheng <yinhe@linux.alibaba.com>
2019-07-03 08:58:19 -05:00
Markus Elfring
590a18e171 PCI: Use seq_puts() instead of seq_printf() in show_device()
The driver name in /proc/bus/pci/devices can be printed without a printf
format specification, so use seq_puts() instead of seq_printf().

This issue was detected by using the Coccinelle software.

Link: https://lore.kernel.org/r/a6b110cb-0d0e-5dc3-9ca1-9041609cf74c@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-07-02 18:38:51 -05:00
Logan Gunthorpe
9c002bb66f PCI/P2PDMA: Fix missing check for dma_virt_ops
Drivers that use dma_virt_ops were meant to be rejected when testing
compatibility for P2PDMA.

This check got inadvertently dropped in one of the later versions of the
original patchset, so add it back.

Fixes: 52916982af ("PCI/P2PDMA: Support peer-to-peer memory")
Link: https://lore.kernel.org/r/20190702173544.21950-1-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-07-02 18:35:15 -05:00
Rafael J. Wysocki
a78ae45a79 PCI: PM: Simplify bus-level hibernation callbacks
After a previous change causing all runtime-suspended PCI devices
to be resumed before creating a snapshot image of memory during
hibernation, it is not necessary to worry about the case in which
they might be left in runtime-suspend any more, so get rid of the
code related to that from bus-level PCI hibernation callbacks.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2019-07-03 00:13:24 +02:00
Rafael J. Wysocki
501debd4aa PM: ACPI/PCI: Resume all devices during hibernation
Both the PCI bus type and the ACPI PM domain avoid resuming
runtime-suspended devices with DPM_FLAG_SMART_SUSPEND set during
hibernation (before creating the snapshot image of system memory),
but that turns out to be a mistake.  It leads to functional issues
and adds complexity that's hard to justify.

For this reason, resume all runtime-suspended PCI devices and all
devices in the ACPI PM domains before creating a snapshot image of
system memory during hibernation.

Fixes: 05087360fd (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Link: https://lore.kernel.org/linux-acpi/917d4399-2e22-67b1-9d54-808561f9083f@uwyo.edu/T/#maf065fe6e4974f2a9d79f332ab99dfaba635f64c
Reported-by: Robert R. Howell <RHowell@uwyo.edu>
Tested-by: Robert R. Howell <RHowell@uwyo.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2019-07-03 00:13:24 +02:00
Nicholas Johnson
6a381ea694 PCI: Skip resource distribution when no hotplug bridges
If "hotplug_bridges == 0", "!dev->is_hotplug_bridge" is always true, so the
loop that divides the remaining resources among hotplug-capable bridges
does nothing.

Check for "hotplug_bridges == 0" earlier, so we don't even have to compute
the amount of remaining resources.  No functional change intended.

Link: https://lore.kernel.org/r/PS2P216MB0642C7A485649D2D787A1C6F80000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM
Link: https://lore.kernel.org/r/20190622210310.180905-3-helgaas@kernel.org
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-07-02 13:56:54 -05:00
Nicholas Johnson
5c6bcc344b PCI: Simplify pci_bus_distribute_available_resources()
Reorder pci_bus_distribute_available_resources() to group related code
together.  No functional change intended.

Link: https://lore.kernel.org/r/PS2P216MB0642C7A485649D2D787A1C6F80000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM
Link: https://lore.kernel.org/r/20190622210310.180905-2-helgaas@kernel.org
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2019-07-02 13:56:26 -05:00
Christoph Hellwig
d0b3517dbc PCI/P2PDMA: use the dev_pagemap internal refcount
The functionality is identical to the one currently open coded in
p2pdma.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-02 14:32:44 -03:00
Christoph Hellwig
d8668bb045 memremap: pass a struct dev_pagemap to ->kill and ->cleanup
Passing the actual typed structure leads to more understandable code
vs just passing the ref member.

Reported-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-02 14:32:44 -03:00
Christoph Hellwig
1e240e8d4a memremap: move dev_pagemap callbacks into a separate structure
The dev_pagemap is growing too many callbacks.  Move them into a
separate ops structure so that they are not duplicated for multiple
instances, and an attacker can't easily overwrite them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-02 14:32:44 -03:00
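
The two memremap commits above (a separate ops structure, and passing the typed structure to the callbacks) follow a common C pattern that can be sketched as below. This is a simplified stand-alone model; the `fake_pagemap` types, the `demo_*` callbacks, and the release helper are hypothetical and do not reproduce the kernel's struct dev_pagemap.

```c
#include <assert.h>
#include <stddef.h>

struct fake_pagemap;

/* Callbacks live in one table that can be const and shared. */
struct fake_pagemap_ops {
	void (*kill)(struct fake_pagemap *pgmap);
	void (*cleanup)(struct fake_pagemap *pgmap);
};

/* Each instance only carries a pointer to the shared table. */
struct fake_pagemap {
	const struct fake_pagemap_ops *ops;
	int killed;
	int cleaned;
};

/* Callbacks receive the typed structure rather than one of its members. */
static void demo_kill(struct fake_pagemap *pgmap)
{
	pgmap->killed = 1;
}

static void demo_cleanup(struct fake_pagemap *pgmap)
{
	pgmap->cleaned = 1;
}

static const struct fake_pagemap_ops demo_ops = {
	.kill = demo_kill,
	.cleanup = demo_cleanup,
};

/* Hypothetical teardown path that dispatches through the ops table. */
static void fake_pagemap_release(struct fake_pagemap *pgmap)
{
	assert(pgmap != NULL && pgmap->ops != NULL);

	if (pgmap->ops->kill)
		pgmap->ops->kill(pgmap);
	if (pgmap->ops->cleanup)
		pgmap->ops->cleanup(pgmap);
}
```

Because the table is `const`, it can be placed in read-only memory, which is the property the commit log refers to when it says an attacker cannot easily overwrite the callbacks.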
Rafael J. Wysocki
28ad4b4e34 Merge back PCI power management material for v5.3.
2019-06-30 13:41:52 +02:00
David S. Miller
d96ff269a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The new route handling in ip_mc_finish_output() from 'net' overlapped
with the new support for returning congestion notifications from BPF
programs.

In order to handle this I had to take the dev_loopback_xmit() calls
out of the switch statement.

The aquantia driver conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 21:06:39 -07:00
Vidya Sagar
ca98329d3b PCI: dwc: Export APIs to support .remove() implementation
Export all configuration space access APIs and also other APIs to
support host controller drivers of dwc core based implementations while
adding support for .remove() hook to build their respective drivers as
modules.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2019-06-27 12:02:46 +01:00
Vidya Sagar
7bc082d7e9 PCI: dwc: Cleanup DBI,ATU read and write APIs
Cleanup DBI read and write APIs by removing leading "__" (underscore)
from their names as there is no reason to have leading underscores
in the first place in the function definition.

Remove dbi/dbi2 base address parameters as the same behaviour can be
obtained through read and write APIs. Since dw_pcie_{readl/writel}_dbi()
APIs can't be used for ATU read/write as ATU base address could be
different from DBI base address, implement ATU read/write APIs using ATU
base address without using dw_pcie_{readl/writel}_dbi() APIs.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
2019-06-27 12:02:46 +01:00
Vidya Sagar
9d071cade3 PCI: dwc: Add API support to de-initialize host
Add an API to group all the tasks to be done to de-initialize host which
can then be called by any dwc core based driver implementations
while adding .remove() support in their respective drivers.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2019-06-27 12:02:46 +01:00
Rafael J. Wysocki
b51033e06c PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete()
In pci_pm_complete() there are checks to decide whether or not to
resume devices that were left in runtime-suspend during the preceding
system-wide transition into a sleep state.  They involve checking the
current power state of the device and comparing it with the power
state of it set before the preceding system-wide transition, but the
platform component of the device's power state is not handled
correctly in there.

Namely, on platforms with ACPI, the device power state information
needs to be updated with care, so that the reference counters of
power resources used by the device (if any) are set to ensure that
the refreshed power state of it will be maintained going forward.

To that end, introduce a new ->refresh_state() platform PM callback
for PCI devices, for asking the platform to refresh the device power
state data and ensure that the corresponding power state will be
maintained going forward, make it invoke acpi_device_update_power()
(for devices with ACPI PM) on platforms with ACPI and make
pci_pm_complete() use it, through a new pci_refresh_power_state()
wrapper function.

Fixes: a0d2a959d3 (PCI: Avoid unnecessary resume after direct-complete)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-27 12:33:26 +02:00
Mika Westerberg
53b22f900c PCI / ACPI: Add _PR0 dependent devices
If otherwise unrelated PCI devices share ACPI power resources, turning
them on causes the devices to enter the D0uninitialized power state,
which may cause problems.

For example, in Intel Ice Lake two root ports (RP0 and RP1), the
Thunderbolt controller (NHI) and the xHCI controller all share power
resources, as can be seen in the topology below where power resources
are marked with []:

  Host bridge
    |
    +- RP0 ---\
    +- RP1 ---|--+--> [TBT]
    +- NHI --/   |
    |            |
    |            v
    +- xHCI --> [D3C]

Consider a situation where all devices sharing the power resources are
in D3cold (the power resources are turned off) and, for example, the
Thunderbolt controller is runtime resumed, so the power resources are
turned on. This means that the other devices sharing them
(RP0, RP1 and xHCI) are transitioned into D0uninitialized state. If they
were configured to trigger wake (PME) on a certain event that
configuration gets lost after reset so we would need to re-initialize
them to get the wakeup working as expected again. To do so we would need
to runtime resume all of them to make sure their registers get restored
properly before we can runtime suspend them again.

Since we just added the concept of a "_PR0 dependent device" we can solve
this by calling the relevant add/remove functions when the PCI device is
bound to its ACPI representation. If it has power resources the PCI device
will be added as a dependent device to them and runtime resumed whenever
they are physically turned on. This should make sure PCI core can
reconfigure wakes after the device is transitioned into D0uninitialized.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-27 12:31:57 +02:00
Mika Westerberg
83a16e3f6d PCI / ACPI: Use cached ACPI device state to get PCI device power state
The ACPI power state returned by acpi_device_get_power() may depend on
the configuration of ACPI power resources in the system which may change
any time after acpi_device_get_power() has returned, unless the
reference counters of the ACPI power resources in question are set to
prevent that from happening. Thus it is invalid to use acpi_device_get_power()
in acpi_pci_get_power_state() the way it is done now and the value of
the ->power.state field in the corresponding struct acpi_device objects
(which reflects the ACPI power resources reference counting, among other
things) should be used instead.

An example where this becomes an issue is Intel Ice Lake, where the
Thunderbolt controller (NHI), two PCIe root ports (RP0 and RP1) and xHCI
all share the same power resources. The following picture with power
resources marked with [] shows the topology:

  Host bridge
    |
    +- RP0 ---\
    +- RP1 ---|--+--> [TBT]
    +- NHI --/   |
    |            |
    |            v
    +- xHCI --> [D3C]

Here TBT and D3C are the shared ACPI power resources. ACPI _PR3() method
of the devices in question returns either TBT or D3C or both.

Say we runtime suspend first the root ports RP0 and RP1, then NHI. Now
since the TBT power resource is still on when the root ports are runtime
suspended their dev->current_state is set to D3hot. When NHI is runtime
suspended, TBT is finally turned off but the state of the root ports
remains D3hot. Now when the xHCI is runtime suspended, D3C also gets turned
off. PCI core thus has power states of these devices cached in their
dev->current_state as follows:

  RP0 -> D3hot
  RP1 -> D3hot
  NHI -> D3cold
  xHCI -> D3cold

If the user now runs lspci for instance, the result is all 1's like in
the below output (00:07.0 is the first root port, RP0):

00:07.0 PCI bridge: Intel Corporation Device 8a1d (rev ff) (prog-if ff)
    !!! Unknown header type 7f
    Kernel driver in use: pcieport

In short the hardware state is not in sync with the software state
anymore. The exact same thing happens with the PME polling thread which
ends up bringing the root ports back into D0 after they are runtime
suspended.

For this reason, modify acpi_pci_get_power_state() so that it uses the
ACPI device power state that was cached by the ACPI core. This makes the
PCI device power state match the ACPI device power state regardless of
state of the shared power resources which may still be on at this point.

Link: https://lore.kernel.org/r/20190618161858.77834-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-27 12:31:57 +02:00
Rafael J. Wysocki
471a739a47 PCI: PM: Avoid skipping bus-level PM on platforms without ACPI
There are platforms that do not call pm_set_suspend_via_firmware(),
so pm_suspend_via_firmware() returns 'false' on them, but the power
states of PCI devices (PCIe ports in particular) are changed as a
result of powering down core platform components during system-wide
suspend.  Thus the pm_suspend_via_firmware() checks in
pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by
commit 3e26c5feed ("PCI: PM: Skip devices in D0 for suspend-to-
idle") are not sufficient to determine that devices left in D0
during suspend will remain in D0 during resume and so the bus-level
power management can be skipped for them.

For this reason, introduce a new global suspend flag,
PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only
and replace the pm_suspend_via_firmware() checks mentioned above
with checks against this flag.

Fixes: 3e26c5feed ("PCI: PM: Skip devices in D0 for suspend-to-idle")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-26 23:51:56 +02:00
Bharat Kumar Gogada
181fa434d0 PCI: xilinx-nwl: Fix Multi MSI data programming
According to the PCI Local Bus specification Revision 3.0,
section 6.8.1.3 (Message Control for MSI), endpoints that
are Multiple Message Capable as defined by bits [3:1] in
the Message Control for MSI can request a number of vectors
that is power-of-two aligned.

As specified in section 6.8.1.6 "Message data for MSI", the Multiple
Message Enable field (bits [6:4] of the Message Control register)
defines the number of low order message data bits the function is
permitted to modify to generate its system software allocated
vectors.

The MSI controller in the Xilinx NWL PCIe controller supports a number
of MSI vectors specified through a bitmap, and the hwirq number for an
MSI (that is, the value written in the MSI data TLP) is determined by
the bitmap allocation.

For instance, in a situation where two endpoints sitting on
the PCI bus request the following MSI configuration, with
the current PCI Xilinx bitmap allocation code (that does not
align MSI vector allocation on a power of two boundary):

Endpoint #1: Requesting 1 MSI vector - allocated bitmap bit 0
Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2]

The bitmap value(s) corresponds to the hwirq number that is programmed
into the Message Data for MSI field in the endpoint MSI capability
and is detected by the root complex to fire the corresponding
MSI irqs. The value written in Message Data for MSI field corresponds
to the first bit allocated in the bitmap for Multi MSI vectors.

The current Xilinx NWL MSI allocation code allows a bitmap allocation
that is not aligned on a power-of-two boundary, so endpoint #2 is allowed to
toggle Message Data bit[0] to differentiate between its two vectors
(meaning that the MSI data will be respectively 0x0 and 0x1 for the two
vectors allocated to endpoint #2).

This clearly aliases with the Endpoint #1 vector allocation, resulting
in a broken Multi MSI implementation.

Update the code to allocate MSI bitmap ranges with a power of two
alignment, fixing the bug.

Fixes: ab597d35ef ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller")
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-26 10:56:51 +01:00
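
The aliasing described in the commit above, and the effect of power-of-two aligned allocation, can be reproduced with a small user-space model. This is a sketch under stated assumptions: the 32-vector bitmap, the function names, and the `msi_data()` helper (which models an endpoint rewriting only the low-order message data bits) are all hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECTORS 32u /* hypothetical size of the MSI bitmap */

static uint32_t run_mask(unsigned int nr)
{
	return nr >= 32u ? 0xffffffffu : (1u << nr) - 1u;
}

/* First-fit allocation with no alignment: models the broken behaviour. */
static int alloc_first_fit(uint32_t *bitmap, unsigned int nr)
{
	assert(nr != 0 && nr <= NR_VECTORS);

	for (unsigned int base = 0; base + nr <= NR_VECTORS; base++) {
		uint32_t mask = run_mask(nr) << base;

		if ((*bitmap & mask) == 0) {
			*bitmap |= mask;
			return (int)base;
		}
	}
	return -1;
}

/* Allocation of nr vectors (a power of two) on an nr-aligned boundary. */
static int alloc_aligned(uint32_t *bitmap, unsigned int nr)
{
	assert(nr != 0 && (nr & (nr - 1u)) == 0 && nr <= NR_VECTORS);

	for (unsigned int base = 0; base + nr <= NR_VECTORS; base += nr) {
		uint32_t mask = run_mask(nr) << base;

		if ((*bitmap & mask) == 0) {
			*bitmap |= mask;
			return (int)base;
		}
	}
	return -1;
}

/*
 * Message data an endpoint sends for its i-th vector: it may only
 * rewrite the low log2(nr) bits of the programmed base value.
 */
static uint32_t msi_data(uint32_t base, unsigned int nr, unsigned int i)
{
	assert(i < nr);
	return (base & ~(nr - 1u)) | i;
}
```

With first-fit allocation, endpoint #2 is programmed with base 1, so its first vector is signalled with data 0x0, which is endpoint #1's vector. With aligned allocation its base is 2, so its two vectors use data 0x2 and 0x3 and no longer collide.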
Rafael J. Wysocki
25bc694a8a Merge back PCI power management material for v5.3.
2019-06-24 10:11:27 +02:00
Suzuki K Poulose
418e3ea157 bus_find_device: Unify the match callback with class_find_device
There is an arbitrary difference between the prototypes of
bus_find_device() and class_find_device() preventing their callers
from passing the same pair of data and match() arguments to both of
them, which is the const qualifier used in the prototype of
class_find_device().  If that qualifier is also used in the
bus_find_device() prototype, it will be possible to pass the same
match() callback function to both bus_find_device() and
class_find_device(), which will allow some optimizations to be made in
order to avoid code duplication going forward.  Also with that, constify
the "data" parameter as it is passed as a const to the match function.

For this reason, change the prototype of bus_find_device() to match
the prototype of class_find_device() and adjust its callers to use the
const qualifier in accordance with the new prototype of it.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Kershner <david.kershner@unisys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Hartmut Knaack <knaack.h@gmx.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Sebastian Ott <sebott@linux.ibm.com>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Yehezkel Bernat <YehezkelShB@gmail.com>
Cc: rafael@kernel.org
Acked-by: Corey Minyard <minyard@acm.org>
Acked-by: David Kershner <david.kershner@unisys.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de> # for the I2C parts
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-24 05:22:31 +02:00
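
The benefit of giving both lookup functions the same match() prototype can be illustrated with a reduced model. The sketch below is not the driver core API: `fake_device`, the array-based lookups, and `match_name()` are hypothetical stand-ins, and the real functions take additional arguments. It only shows that once both lookups accept `const void *data`, one callback serves both.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fake_device {
	const char *name;
};

/* Shared callback type: note the const-qualified data argument. */
typedef int (*match_fn)(struct fake_device *dev, const void *data);

static struct fake_device *fake_find(struct fake_device *devs, size_t n,
				     const void *data, match_fn match)
{
	assert(match != NULL);

	for (size_t i = 0; i < n; i++)
		if (match(&devs[i], data))
			return &devs[i];
	return NULL;
}

/* Two lookups with identical prototypes, as the commit aims for. */
static struct fake_device *fake_bus_find_device(struct fake_device *devs,
						size_t n, const void *data,
						match_fn match)
{
	return fake_find(devs, n, data, match);
}

static struct fake_device *fake_class_find_device(struct fake_device *devs,
						  size_t n, const void *data,
						  match_fn match)
{
	return fake_find(devs, n, data, match);
}

/* A single match callback usable with both lookups. */
static int match_name(struct fake_device *dev, const void *data)
{
	return strcmp(dev->name, (const char *)data) == 0;
}
```

Without the shared `const` qualifier, a caller would need two near-identical callbacks differing only in the type of `data`, which is the duplication the commit log wants to avoid.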
Linus Torvalds
b253d5f3ec pci-v5.2-fixes-1

Merge tag 'pci-v5.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fix from Bjorn Helgaas:
 "If an IOMMU is present, ignore the P2PDMA whitelist we added for v5.2
  because we don't yet know how to support P2PDMA in that case (Logan
  Gunthorpe)"

* tag 'pci-v5.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present
2019-06-22 09:42:29 -07:00
David S. Miller
92ad6325cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor SPDX change conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-22 08:59:24 -04:00
Heiner Kallweit
4cfd218855 PCI: let pci_disable_link_state propagate errors
Drivers may rely on pci_disable_link_state() having disabled certain
ASPM link states. If OS can't control ASPM then pci_disable_link_state()
turns into a no-op w/o informing the caller. The driver therefore may
falsely assume the respective ASPM link states are disabled.
Let pci_disable_link_state() propagate errors to the caller, enabling
the caller to react accordingly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21 22:05:42 -04:00
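
Why the return value matters to callers can be shown with a small model. Everything here is hypothetical: the `fake_link` structure, the state flags, the choice of `-EPERM` as the error code, and the driver-side helper are assumptions for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FAKE_LINK_STATE_L0S 0x1u
#define FAKE_LINK_STATE_L1  0x2u

struct fake_link {
	int os_controls_aspm;  /* 0 if the OS may not touch ASPM */
	unsigned int enabled;  /* currently enabled link states */
};

/* Returns 0 on success, or a negative errno instead of silently doing nothing. */
static int fake_disable_link_state(struct fake_link *link, unsigned int states)
{
	assert(link != NULL);

	if (!link->os_controls_aspm)
		return -EPERM; /* hypothetical error code */

	link->enabled &= ~states;
	return 0;
}

/* Caller side: fall back to a workaround when L1 could not be disabled. */
static int fake_driver_setup(struct fake_link *link, int *use_workaround)
{
	int ret = fake_disable_link_state(link, FAKE_LINK_STATE_L1);

	*use_workaround = (ret != 0);
	return ret;
}
```

With a void function the second case would be indistinguishable from success, and the driver would wrongly assume L1 is off.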
Benjamin Herrenschmidt
7ac0d094fb PCI: Don't auto-realloc if we're preserving firmware config
Prevent auto-enabling of bridge reallocation when the FW tells us that the
initial configuration must be preserved for a given host bridge.

Link: https://lore.kernel.org/r/20190615002359.29577-3-benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-21 18:11:53 -05:00
Marek Vasut
dc6b698a86 PCI: sysfs: Ignore lockdep for remove attribute
With CONFIG_PROVE_LOCKING=y, using sysfs to remove a bridge with a device
below it causes a lockdep warning, e.g.,

  # echo 1 > /sys/class/pci_bus/0000:00/device/0000:00:00.0/remove
  ============================================
  WARNING: possible recursive locking detected
  ...
  pci_bus 0000:01: busn_res: [bus 01] is released

The remove recursively removes the subtree below the bridge.  Each call
uses a different lock so there's no deadlock, but the locks were all
created with the same lockdep key so the lockdep checker can't tell them
apart.

Mark the "remove" sysfs attribute with __ATTR_IGNORE_LOCKDEP() as it is
safe to ignore the lockdep check between different "remove" kernfs
instances.

There's discussion about a similar issue in USB at [1], which resulted in
356c05d58a ("sysfs: get rid of some lockdep false positives") and
e9b526fe70 ("i2c: suppress lockdep warning on delete_device"), which do
basically the same thing for USB "remove" and i2c "delete_device" files.

[1] https://lore.kernel.org/r/Pine.LNX.4.44L0.1204251436140.1206-100000@iolanthe.rowland.org
Link: https://lore.kernel.org/r/20190526225151.3865-1-marek.vasut@gmail.com
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
[bhelgaas: trim commit log, details at above links]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wolfram Sang <wsa@the-dreams.de>
2019-06-21 09:40:13 -05:00
Manikanta Maddireddy
2d8c736158 PCI: tegra: Put PEX CLK & BIAS pads in DPD mode
In the Tegra210 AFI design, the clamp value for the BIAS pad is 0, which
keeps the bias pad in non-power-down mode. This leads to a power
consumption of 2 mW in the BIAS pad even if the PCIe partition is power
gated. To avoid unnecessary power consumption, put the PEX CLK & BIAS pads
in deep power down (DPD) mode when the PCIe partition is power gated.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:40:48 +01:00
Manikanta Maddireddy
adb2653b3d PCI: tegra: Add AFI_PEX2_CTRL reg offset as part of SoC struct
Tegra186 and Tegra30 have three PCIe root ports. The AFI_PEX2_CTRL register
is defined for the third root port. The offset of this register in Tegra186
differs from that in Tegra30, so add the offset to the SoC data structure.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:40:48 +01:00
Manikanta Maddireddy
c894121d01 PCI: tegra: Change PRSNT_SENSE IRQ log to debug
PRSNT_MAP bit field is programmed to update the slot present status.
PRSNT_SENSE IRQ is triggered when this bit field is programmed, which is
not an error. Add a new if condition to trap PRSNT_SENSE code and print
it with debug log level.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:40:48 +01:00
Manikanta Maddireddy
b5b4717ea0 PCI: tegra: Program AFI_CACHE_BAR_{0,1}_{ST,SZ} registers only for Tegra20
Cacheable upstream transactions are supported in Tegra20 and Tegra186
only.

AFI_CACHE_BAR_{0,1}_{ST,SZ} registers are available in Tegra20 to
support cacheable upstream transactions. In Tegra186, AFI_AXCACHE
register is defined instead of AFI_CACHE_BAR_{0,1}_{ST,SZ} to be in line
with its memory subsystem design.

Therefore, program AFI_CACHE_BAR_{0,1}_{ST,SZ} registers only for Tegra20.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:40:48 +01:00
Manikanta Maddireddy
eef4a35026 PCI: tegra: Fix PLLE power down issue due to CLKREQ# signal
Disable controllers which failed to bring the link up and configure
CLKREQ# signals of these controllers as GPIO. This is required to avoid
CLKREQ# signal of inactive controllers interfering with PLLE power down
sequence.

PCIE_CLKREQ_GPIO bits are defined only in Tegra186, however programming
these bits in other SoCs doesn't cause any side effects. Program these
bits for all Tegra SoCs to avoid a conditional check.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:23:05 +01:00
Manikanta Maddireddy
c23ae2aec5 PCI: tegra: Set target speed as Gen1 before starting LTSSM
PCIe link up fails with a few legacy endpoints if the root port advertises
both Gen-1 and Gen-2 speeds in Tegra, because link number negotiation
fails when both Gen1 & Gen2 are advertised. Tegra doesn't retry link up by
advertising only Gen1. Hence, the strategy followed here is to initially
advertise only Gen-1 and, after the link is up, retrain the link to Gen-2
speed.

Tegra doesn't support HW autonomous speed change. The link comes up in Gen1
even if Gen2 is advertised, so there is no downside to this change.

This behavior is observed with following two PCIe devices on Tegra:

- Fusion HDTV 5 Express card
- IOGear SIL - PCIE - SATA card

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:22:27 +01:00
Manikanta Maddireddy
9f570b6c24 PCI: tegra: Update flow control timer frequency in Tegra210
Recommended UpdateFC threshold in Tegra210 is 0x60 for best performance
of x1 link. Setting this to 0x60 provides the best balance between number
of UpdateFC packets and read data sent over the link.

UpdateFC timer frequency, expressed as an interval in nsec, is equal to
twice the register value, i.e. (2 * 0x60) = 192 nsec.
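
The arithmetic above can be written as a one-line helper. This is only a
sketch; the function name is illustrative and does not come from the
driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per the commit text, the UpdateFC timer value in nanoseconds is twice
 * the raw register content.
 */
uint32_t updatefc_timer_ns(uint32_t reg_value)
{
	assert(reg_value <= UINT32_MAX / 2u);  /* the doubling must not overflow */
	return 2u * reg_value;
}
```

For the recommended Tegra210 threshold, updatefc_timer_ns(0x60) evaluates
to 2 * 96 = 192.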

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:22:12 +01:00
Manikanta Maddireddy
191cd6fb5d PCI: tegra: Add SW fixup for RAW violations
The logic which blocks read requests until AFI gets an ACK from the memory
controller for all outstanding writes does not behave correctly when the
number of outstanding writes exceeds 32 in Tegra124 and Tegra132.

SW fixup is to prevent writes from accumulating more than 32 by:

- limiting outstanding posted writes to 14
- modifying Gen1 and Gen2 UpdateFC timer frequency

UpdateFC timer frequency is equal to twice the value of register content
in nsec. These settings are recommended after stress testing with
different values.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:21:52 +01:00
Manikanta Maddireddy
b2634cd0d2 PCI: tegra: Increase the deskew retry time
Sometimes link speed change from Gen2 to Gen1 fails due to instability
in deskew logic on lane-0 in Tegra210. Increase the deskew retry time
to resolve this issue.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:21:44 +01:00
Manikanta Maddireddy
f1178099a6 PCI: tegra: Enable PCIe xclk clock clamping
Enable xclk clock clamping when entering L1. Clamp threshold will
determine the time spent waiting for clock module to turn on xclk after
signaling it. Default threshold value in Tegra124 and Tegra210 is not
enough to turn on xclk clock. Increase the clamp threshold to meet the
clock module timing in Tegra124 and Tegra210. Default threshold value is
enough in Tegra20, Tegra30 and Tegra186.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:21:31 +01:00
Manikanta Maddireddy
52db2fd89e PCI: tegra: Process pending DLL transactions before entering L1 or L2
PM messages are truncated while entering L1 or L2, which results in
receiver errors. Set the required bit to finish processing DLLPs before
the link enters L1 or L2.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:21:21 +01:00
Manikanta Maddireddy
92bd94f1fd PCI: tegra: Disable AFI dynamic clock gating
Outstanding write counter in AFI is used to generate idle signal to
dynamically gate the AFI clock. When there are 32 outstanding writes
from AFI to memory, the outstanding write counter overflows and
indicates that there are "0" outstanding write transactions.

When the memory controller is under heavy load, write completions to AFI
get delayed and the AFI write counter overflows. This causes AFI clock
gating even when there are outstanding transactions towards the memory
controller, resulting in a system hang.

Disable dynamic clock gating of AFI clock to avoid system hang.
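
The failure mode can be modelled in a few lines. This sketch assumes the
counter is 5 bits wide, which is inferred from the statement that 32
outstanding writes read back as 0; the names are illustrative and are not
register definitions from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 5-bit counter, so the value wraps to 0 at 32. */
#define MODEL_WR_COUNTER_BITS 5u
#define MODEL_WR_COUNTER_MASK ((1u << MODEL_WR_COUNTER_BITS) - 1u)

/* Value the counter holds when 'outstanding' writes are in flight. */
uint32_t model_write_counter(uint32_t outstanding)
{
	assert(MODEL_WR_COUNTER_BITS < 32u);  /* keeps the mask shift defined */
	return outstanding & MODEL_WR_COUNTER_MASK;
}

/* Idle signal derived from the counter alone. */
bool model_afi_idle(uint32_t outstanding)
{
	return model_write_counter(outstanding) == 0;
}

/* With the override set, the clock is never gated by the idle signal. */
bool model_clock_gated(uint32_t outstanding, bool clken_override)
{
	return !clken_override && model_afi_idle(outstanding);
}
```

In this model 32 outstanding writes look idle, so the clock would be gated
unless the override is set, which is the effect the patch relies on.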

CLKEN_OVERRIDE bit is not defined in Tegra20 and Tegra30, however
programming this bit doesn't cause any side effects. Program this
bit for all Tegra SoCs to avoid conditional check.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:20:51 +01:00
Manikanta Maddireddy
7763cc24e2 PCI: tegra: Enable opportunistic UpdateFC and ACK
Enable opportunistic UpdateFC and ACK to allow the data link layer to send
pending ACKs and UpdateFC packets when the link is idle instead of waiting
for timers to expire. This improves PCIe performance through better
utilization of PCIe bandwidth.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:20:40 +01:00
Manikanta Maddireddy
2513a4ee47 PCI: tegra: Program UPHY electrical settings for Tegra210
UPHY electrical programming guidelines are documented in Tegra210 TRM.
Program these electrical settings for proper eye diagram in Gen1 and Gen2
link speeds.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:20:32 +01:00
Manikanta Maddireddy
c635a815c8 PCI: tegra: Advertise PCIe Advanced Error Reporting (AER) capability
Default root port setting hides AER capability. This patch enables the
advertisement of AER capability by root port.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:20:25 +01:00
Manikanta Maddireddy
538123a29a PCI: tegra: Add PCIe Gen2 link speed support
Tegra124, Tegra132, Tegra210 and Tegra186 support Gen2 link speed. After
PCIe link is up in Gen1, set target link speed as Gen2 and retrain link.
Link switches to Gen2 speed if Gen2 capable end point is connected,
otherwise the link stays in Gen1.

Per PCIe 4.0r0.9 sec 7.6.3.7 implementation note, driver needs to wait for
PCIe LTSSM to come back from recovery before retraining the link.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:19:47 +01:00
Manikanta Maddireddy
d1f9113faf PCI: tegra: Fix PCIe host power up sequence
The PCIe host power up sequence requires programming the AFI (AXI to FPCI
bridge) registers first and then the PCIe registers, otherwise AFI register
settings may not latch to the PCIe IP.

PCIe root port starts LTSSM as soon as PCIe xrst is deasserted.
So deassert PCIe xrst after programming PCIe registers.

Modify PCIe power up sequence as follows:

- Power ungate PCIe partition
- Enable AFI clock
- Deassert AFI reset
- Program AFI registers
- Enable PCIe clock
- Deassert PCIe reset
- Program PCIe PHY
- Program PCIe pad control registers
- Program PCIe root port registers
- Deassert PCIe xrst to start LTSSM
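
The ordering constraints behind the list above can be sketched as a small
checker. The enum names are illustrative labels for the listed steps, not
identifiers from the driver, and the checker only encodes the two rules the
text states: AFI registers before root port registers, and xrst deasserted
after the root port registers are programmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One label per step of the list above, in the listed order. */
enum pwr_step {
	STEP_POWER_UNGATE,
	STEP_AFI_CLK_ENABLE,
	STEP_AFI_RESET_DEASSERT,
	STEP_AFI_PROGRAM,
	STEP_PCIE_CLK_ENABLE,
	STEP_PCIE_RESET_DEASSERT,
	STEP_PHY_PROGRAM,
	STEP_PAD_CTRL_PROGRAM,
	STEP_ROOT_PORT_PROGRAM,
	STEP_XRST_DEASSERT,
	STEP_COUNT
};

/* Position of 'step' in 'seq', or n if the step is missing. */
size_t step_index(const enum pwr_step *seq, size_t n, enum pwr_step step)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (seq[i] == step)
			return i;
	return n;
}

/* True if the sequence respects both ordering rules. */
bool sequence_is_valid(const enum pwr_step *seq, size_t n)
{
	size_t afi, rp, xrst;

	assert(seq != NULL);
	afi = step_index(seq, n, STEP_AFI_PROGRAM);
	rp = step_index(seq, n, STEP_ROOT_PORT_PROGRAM);
	xrst = step_index(seq, n, STEP_XRST_DEASSERT);
	if (afi == n || rp == n || xrst == n)
		return false;  /* a required step is missing */
	return afi < rp && rp < xrst;
}
```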

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:18:50 +01:00
Manikanta Maddireddy
316b9ef1ee PCI: tegra: Mask AFI_INTR in runtime suspend
AFI_INTR is unmasked in tegra_pcie_enable_controller(), mask it to avoid
unwanted interrupts raised by AFI after pex_rst is asserted.

The following sequence triggers such a scenario:

- tegra_pcie_remove() triggers runtime suspend
- pex_rst is asserted in runtime suspend
- PRSNT_MAP bit field in RP_PRIV_MISC register changes from EP_PRSNT to
  EP_ABSNT
- This is sensed by AFI and triggers "Slot present pin change" interrupt
- tegra_pcie_isr() function accesses AFI register when runtime suspend
  is going through power off sequence

Resulting faulty backtrace:

rmmod pci-tegra
 pci_generic_config_write32: 108 callbacks suppressed
 pci_bus 0002:00: 2-byte config write to 0002:00:02.0 offset 0x4c may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:02.0 offset 0x9c may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:02.0 offset 0x88 may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:02.0 offset 0x90 may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:02.0 offset 0x4 may corrupt adjacent RW1C bits
 igb 0002:04:00.1: removed PHC on enP2p4s0f1
 igb 0002:04:00.0: removed PHC on enP2p4s0f0
 pci_bus 0002:00: 2-byte config write to 0002:00:01.0 offset 0x4c may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:01.0 offset 0x9c may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:01.0 offset 0x88 may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:01.0 offset 0x90 may corrupt adjacent RW1C bits
 pci_bus 0002:00: 2-byte config write to 0002:00:01.0 offset 0x4 may corrupt adjacent RW1C bits
 rcu: INFO: rcu_preempt self-detected stall on CPU
 SError Interrupt on CPU0, code 0xbf000002 -- SError
 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.1.0-rc3-next-20190405-00027-gcd8110499e6f-dirty #42
 Hardware name: NVIDIA Jetson TX1 Developer Kit (DT)
 pstate: 20000085 (nzCv daIf -PAN -UAO)
 pc : tegra_pcie_isr+0x58/0x178 [pci_tegra]
 lr : tegra_pcie_isr+0x40/0x178 [pci_tegra]
 sp : ffff000010003da0
 x29: ffff000010003da0 x28: 0000000000000000
 x27: ffff8000f9e61000 x26: ffff000010fbf420
 x25: ffff000011427f93 x24: ffff8000fa600410
 x23: ffff00001129d000 x22: ffff00001129d000
 x21: ffff8000f18bf3c0 x20: 0000000000000070
 x19: 00000000ffffffff x18: 0000000000000000
 x17: 0000000000000000 x16: 0000000000000000
 x15: 0000000000000000 x14: ffff000008d40a48
 x13: ffff000008d40a30 x12: ffff000008d40a20
 x11: ffff000008d40a10 x10: ffff000008d40a00
 x9 : ffff000008d409e8 x8 : ffff000008d40ae8
 x7 : ffff000008d40ad0 x6 : ffff000010003e58
 x5 : ffff8000fac00248 x4 : 0000000000000000
 x3 : ffff000008d40b08 x2 : fffffffffffffff8
 x1 : ffff000008d3f4e8 x0 : 00000000ffffffff
 Kernel panic - not syncing: Asynchronous SError Interrupt
 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.1.0-rc3-next-20190405-00027-gcd8110499e6f-dirty #42
 Hardware name: NVIDIA Jetson TX1 Developer Kit (DT)
 Call trace:
  dump_backtrace+0x0/0x158
  show_stack+0x14/0x20
  dump_stack+0xa8/0xcc
  panic+0x140/0x2f4
  nmi_panic+0x6c/0x70
  arm64_serror_panic+0x74/0x80
  __pte_error+0x0/0x28
  el1_error+0x84/0xf8
  tegra_pcie_isr+0x58/0x178 [pci_tegra]
  __handle_irq_event_percpu+0x70/0x198
  handle_irq_event_percpu+0x34/0x88
  handle_irq_event+0x48/0x78
  handle_fasteoi_irq+0xb4/0x190
  generic_handle_irq+0x24/0x38
  __handle_domain_irq+0x5c/0xb8
  gic_handle_irq+0x58/0xa8
  el1_irq+0xb8/0x180
  cpuidle_enter_state+0x138/0x358
  cpuidle_enter+0x18/0x20
  call_cpuidle+0x1c/0x48
  do_idle+0x230/0x2d0
  cpu_startup_entry+0x20/0x28
  rest_init+0xd4/0xe0
  arch_call_rest_init+0xc/0x14
  start_kernel+0x444/0x470

AFI_INTR is re-enabled on resume in tegra_pcie_pm_resume() through
tegra_pcie_enable_controller().

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[lorenzo.pieralisi@arm.com: updated log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:14:28 +01:00
Manikanta Maddireddy
973d7499c5 PCI: tegra: Rearrange Tegra PCIe driver functions
Tegra PCIe has register specifications for:

 - AXI to FPCI(AFI) bridge
 - Multiple PCIe root ports
 - PCIe PHY
 - PCIe pad control

Rearrange Tegra PCIe driver functions so that each function programs
the required module only.

- tegra_pcie_enable_controller(): Program AFI module and enable PCIe
  controller
- tegra_pcie_phy_power_on(): Bring up PCIe PHY
- tegra_pcie_apply_pad_settings(): Program PCIe REFCLK pad settings
- tegra_pcie_enable_ports(): Program each root port and bring up PCIe
  link

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:12:56 +01:00
Manikanta Maddireddy
1056dda8a8 PCI: tegra: Handle failure cases in tegra_pcie_power_on()
Unroll the PCIe power on sequence if any one of the steps fails in
tegra_pcie_power_on().
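
A generic sketch of the unwinding pattern, assuming three hypothetical
steps. The step helpers, the 'held' bitmask and the -EIO error code are
made up for illustration and do not mirror the actual resources handled by
tegra_pcie_power_on().

```c
#include <assert.h>
#include <errno.h>

/* Bit i is set while the resource of step i is held. */
unsigned int held;

/* Hypothetical step: acquires resource 'id' unless id equals 'fail_at'. */
int step_enable(int id, int fail_at)
{
	assert(id >= 0 && id < 3);
	if (id == fail_at)
		return -EIO;
	held |= 1u << id;
	return 0;
}

void step_disable(int id)
{
	held &= ~(1u << id);
}

/* Run the steps in order; on failure, undo completed steps in reverse. */
int model_power_on(int fail_at)
{
	int err;

	err = step_enable(0, fail_at);  /* e.g. power domain */
	if (err)
		return err;

	err = step_enable(1, fail_at);  /* e.g. clock */
	if (err)
		goto disable_0;

	err = step_enable(2, fail_at);  /* e.g. reset deassert */
	if (err)
		goto disable_1;

	return 0;

disable_1:
	step_disable(1);
disable_0:
	step_disable(0);
	return err;
}
```

Whichever step fails, 'held' ends up at 0 again, so a failed power on
leaves nothing enabled.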

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2019-06-20 17:12:45 +01:00
Logan Gunthorpe
6dbbd053e6 PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present
Presently, there is no path to DMA map P2PDMA memory, so if a TLP targeting
this memory hits the root complex and an IOMMU is present, the IOMMU will
reject the transaction, even if the RC would support P2PDMA.

So until the kernel knows to map these DMA addresses in the IOMMU, we
should not enable the whitelist when an IOMMU is present.
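
The resulting decision can be sketched as follows. This is a standalone
model under stated assumptions: the struct and function names are
hypothetical and do not correspond to the P2PDMA code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host description; the field names are assumptions. */
struct p2p_model_host {
	bool iommu_present;    /* an IOMMU sits behind the root complex */
	bool rc_on_whitelist;  /* root complex is known to route P2P TLPs */
};

/* True if P2P traffic that must cross the root complex may be allowed. */
bool p2p_through_rc_allowed(const struct p2p_model_host *h)
{
	assert(h != NULL);
	if (h->iommu_present)
		return false;  /* whitelist ignored: the IOMMU would reject the TLP */
	return h->rc_on_whitelist;
}
```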

Link: https://lore.kernel.org/linux-pci/20190522201252.2997-1-logang@deltatee.com/
Fixes: 0f97da8310 ("PCI/P2PDMA: Allow P2P DMA between any devices under AMD ZEN Root Complex")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
2019-06-19 16:43:42 -05:00
Linus Torvalds
abf02e2964 Power management fix for 5.2-rc6
Prevent PCI bridges in general (and PCIe ports in particular)
 from being put into low-power states during system-wide suspend
 transitions if there are any devices in D0 below them and refine
 the handling of PCI devices in D0 during suspend-to-idle cycles.

Merge tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Prevent PCI bridges in general (and PCIe ports in particular) from
  being put into low-power states during system-wide suspend transitions
  if there are any devices in D0 below them and refine the handling of
  PCI devices in D0 during suspend-to-idle cycles"

* tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI: PM: Skip devices in D0 for suspend-to-idle
2019-06-19 11:44:04 -07:00
Mika Westerberg
000dd5316e PCI: Do not poll for PME if the device is in D3cold
PME polling does not take into account that a device that is directly
connected to the host bridge may go into D3cold as well. This leads to a
situation where the PME poll thread reads from a config space of a
device that is in D3cold and gets incorrect information because the
config space is not accessible.

Here is an example from Intel Ice Lake system where two PCIe root ports
are in D3cold (I've instrumented the kernel to log the PMCSR register
contents):

  [   62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff
  [   62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff

Since 0xffff is interpreted so that PME is pending, the root ports will
be runtime resumed. This repeats over and over again essentially
blocking all runtime power management.

Prevent this from happening by checking whether the device is in D3cold
before its PME status is read.
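
A small model of why the all-ones read is misleading and how the guard
helps. PME_Status is bit 15 of PMCSR; everything else here (the struct,
the enum and the function names) is an illustrative assumption rather than
the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PME_STATUS 0x8000u  /* PME_Status, bit 15 of PMCSR */

enum model_pstate { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

struct pme_model_dev {
	enum model_pstate state;
	uint16_t pmcsr;  /* value reported while config space is accessible */
};

/* Config reads from an inaccessible device come back as all ones. */
uint16_t model_read_pmcsr(const struct pme_model_dev *dev)
{
	assert(dev != NULL);
	return dev->state == MODEL_D3COLD ? 0xffffu : dev->pmcsr;
}

/* Unguarded check: mistakes the all-ones read for "PME pending". */
bool pme_pending_unguarded(const struct pme_model_dev *dev)
{
	return (model_read_pmcsr(dev) & MODEL_PME_STATUS) != 0;
}

/* Guarded check: never reads config space of a device in D3cold. */
bool pme_pending_guarded(const struct pme_model_dev *dev)
{
	if (dev->state == MODEL_D3COLD)
		return false;
	return pme_pending_unguarded(dev);
}
```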

Fixes: 71a83bd727 ("PCI/PM: add runtime PM support to PCIe port")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: 3.6+ <stable@vger.kernel.org> # v3.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-18 01:40:41 +02:00
Mika Westerberg
c2bf1fc212 PCI: Add missing link delays required by the PCIe spec
Currently Linux does not follow PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that
consists of a PCIe switch and two PCIe endpoints:

  +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                  +-01.0-[04-36]-- DS hotplug port
                                  +-02.0-[37]----00.0 xHCI controller
                                  \-04.0-[38-6b]-- DS hotplug port

The root port (1b.0) and the PCIe switch downstream ports are all PCIe
gen3 so they support 8GT/s link speeds.

We wait for the PCIe hierarchy to enter D3cold (runtime):

  pcieport 0000:00:1b.0: power state changed by ACPI to D3cold

When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
PCIe switch is put to reset and its power is re-applied. This means that
we must follow the rules in PCIe 4.0 section 6.6.1.

For the PCIe gen3 ports we are dealing with here, the following applies:

  With a Downstream Port that supports Link speeds greater than 5.0
  GT/s, software must wait a minimum of 100 ms after Link training
  completes before sending a Configuration Request to the device
  immediately below that Port. Software can determine when Link training
  completes by polling the Data Link Layer Link Active bit or by setting
  up an associated interrupt (see Section 6.7.3.3).

Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):

  pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
  pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
  pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
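
The timing rule can be sketched as a helper that computes the earliest
allowed access time. The branch for ports faster than 5.0 GT/s follows the
text quoted above; the branch for slower ports (100 ms counted from reset
exit) is an assumption about the same spec section and is not quoted in
this log. All names and the speed encoding are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_LINK_DELAY_MS 100u

/* Link speeds in units of 0.1 GT/s to avoid floating point. */
#define MODEL_SPEED_2_5GT 25u
#define MODEL_SPEED_5GT   50u
#define MODEL_SPEED_8GT   80u

/*
 * Earliest time, in ms, at which the first Configuration Request may be
 * sent to the device below a downstream port.
 *
 * reset_exit_ms:  time at which reset was released
 * link_active_ms: time at which Data Link Layer Link Active became set
 */
uint32_t earliest_config_access_ms(uint32_t max_speed,
				   uint32_t reset_exit_ms,
				   uint32_t link_active_ms)
{
	assert(link_active_ms >= reset_exit_ms);
	if (max_speed > MODEL_SPEED_5GT)
		return link_active_ms + MODEL_LINK_DELAY_MS;  /* rule quoted above */
	return reset_exit_ms + MODEL_LINK_DELAY_MS;  /* assumed rule, slower ports */
}
```

For a gen3 port whose link becomes active 40 ms after reset exit at time 0,
the helper returns 140, whereas a fixed 100 ms delay from reset exit would
allow the access 40 ms too early.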

I've instrumented the kernel with additional logging so we can see the
actual delays the kernel performs:

  pcieport 0000:00:1b.0: power state changed by ACPI to D0
  pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
  pcieport 0000:00:1b.0: waking up bus
  pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
  pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
  ...
  pcieport 0000:00:1b.0: PME# disabled
  pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:01:00.0: PME# disabled
  pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:02:00.0: PME# disabled
  pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
  pcieport 0000:02:01.0: PME# disabled
  pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:02:02.0: PME# disabled
  pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:02:04.0: PME# disabled
  pcieport 0000:02:01.0: PME# enabled
  pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
  pcieport 0000:02:04.0: PME# enabled
  pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
  thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
  ...
  thunderbolt 0000:03:00.0: PME# disabled
  xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
  ...
  xhci_hcd 0000:37:00.0: PME# disabled

For the switch upstream port (01:00.0) we wait for 100ms but without taking
into account the DLLLA requirement. We then wait 10ms for the D3hot -> D0
transition of the root port and the two downstream hotplug ports. This
means that we deviate from what the spec requires.

Performing the same check for system sleep (s2idle) transitions we can
see following when resuming from s2idle:

  pcieport 0000:00:1b.0: power state changed by ACPI to D0
  pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
  ...
  pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  ...
  pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
  pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
  pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
  pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
  pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
  pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
  pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
  pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
  pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
  pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
  pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
  pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
  pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
  pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
  pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
  pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
  pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
  pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
  pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
  pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
  pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
  pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
  pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
  pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
  pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
  pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
  pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
  pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
  pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
  pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
  pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
  pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
  pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
  xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
  ...
  thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)

This is even worse. None of the mandatory delays are performed. If this
were S3 instead of s2idle then, according to PCI FW spec 3.2 section
4.6.8, there is a specific _DSM that allows the OS to skip the delays,
but this platform does not provide the _DSM and does not go to S3 anyway,
so no firmware is involved that could already handle these delays.

In this particular Intel Coffee Lake platform these delays are not
actually needed because there is an additional delay as part of the ACPI
power resource that is used to turn on power to the hierarchy. However,
since that additional delay is not required by any of the standards (PCIe,
ACPI), it is not present in Intel Ice Lake, for example, where missing the
mandatory delays causes pciehp to start tearing down the stack too early
(links are not yet trained).

For this reason, change the PCIe portdrv PM resume hooks so that they
perform the mandatory delays before the downstream component gets
resumed. We perform the delays before port services are resumed because
otherwise pciehp might find that the link is not up (even if it is just
training) and tear down the hierarchy.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-18 01:40:41 +02:00
Ley Foon Tan
7a28db0a25 PCI: altera: Fix configuration type based on secondary number
Unlike the previous version (V1), the Stratix 10 PCIe controller does not
support Type 1 to Type 0 conversion, so the PCIe controller configuration
mechanism needs to send a Type 0 config TLP if the target bus number
matches the secondary bus number.

Implement a function to form a TLP header that depends on the PCIe
controller version, so that the header can be formed according to
specific host controller HW internals, fixing the type conversion issue.
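
A rough sketch of a version-dependent choice of config TLP type. The V1
branch (Type 0 only for the root bus, with hardware doing the conversion
further down) and the handling of the root bus on the newer controller are
assumptions made for illustration; only the secondary-bus rule comes from
the text above. The struct and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_ctrl_version { MODEL_CTRL_V1, MODEL_CTRL_V2 };  /* V2: Stratix 10 */
enum model_cfg_type { MODEL_CFG_TYPE0, MODEL_CFG_TYPE1 };

struct altera_model_ctrl {
	enum model_ctrl_version version;
	uint8_t root_bus;       /* bus number of the root port itself */
	uint8_t secondary_bus;  /* bus directly below the root port */
};

enum model_cfg_type cfg_type_for_bus(const struct altera_model_ctrl *c,
				     uint8_t target_bus)
{
	assert(c != NULL);
	if (c->version == MODEL_CTRL_V1) {
		/* Assumption: V1 hardware converts Type 1 to Type 0 itself. */
		return target_bus == c->root_bus ? MODEL_CFG_TYPE0
						 : MODEL_CFG_TYPE1;
	}
	/* No hardware conversion: software picks Type 0 for the secondary bus
	 * (and, by assumption, for the root bus as well). */
	if (target_bus == c->root_bus || target_bus == c->secondary_bus)
		return MODEL_CFG_TYPE0;
	return MODEL_CFG_TYPE1;
}
```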

Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2019-06-17 12:22:25 +01:00
Miquel Raynal
c369b536f8 PCI: armada8k: Add PHYs support
Bring PHY support for the Armada8k driver.

The Armada8k IP only supports x1, x2 or x4 link widths. Iterate over
the DT 'phys' entries and configure them one by one. Use
phy_set_mode_ext() to make use of the submode parameter (initially
introduced for Ethernet modes). For PCI configuration, let the submode
be the width (1, 2, 4, etc) so that the PHY driver knows how many
lanes are bundled. Do not error out in case of error for compatibility
reasons.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
2019-06-17 12:15:07 +01:00
Rafael J. Wysocki
0c7376ada9 PCI: PM: Replace pci_dev_keep_suspended() with two functions
The code in pci_dev_keep_suspended() is relatively hard to follow due
to the negative checks in it and in its callers and the function has
a possible side-effect (disabling the PME) which doesn't really match
its role.

For this reason, move the PME disabling from pci_dev_keep_suspended()
to a separate function and change the semantics (and name) of the
rest of it, so that 'true' is returned when the device needs to be
resumed (and not the other way around).  Change the callers of
pci_dev_keep_suspended() accordingly.

While at it, make the code flow in pci_pm_poweroff() mirror that of
pci_pm_suspend() more closely to avoid arbitrary differences between
them.

This is a cosmetic change with no intention to alter behavior.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-17 12:30:24 +02:00
Rafael J. Wysocki
234f223d63 PCI: PM: Avoid resuming devices in D3hot during system suspend
The current code resumes devices in D3hot during system suspend if
the target power state for them is D3cold, but that is not necessary
in general.  It only is necessary to do that if the platform firmware
requires the device to be resumed, but that should be covered by
the platform_pci_need_resume() check anyway, so rework
pci_dev_keep_suspended() to avoid returning 'false' for devices
in D3hot which need not be resumed due to platform firmware
requirements.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-17 12:30:24 +02:00
Dan Williams
50f44ee724 mm/devm_memremap_pages: fix final page put race
Logan noticed that devm_memremap_pages_release() kills the percpu_ref,
drops all the page references that were acquired at init, and then
immediately proceeds to unplug, via arch_remove_memory(), the backing
pages for the pagemap.  If for some reason device shutdown actually collides
with a busy / elevated-ref-count page then arch_remove_memory() should
be deferred until after that reference is dropped.

As it stands the "wait for last page ref drop" happens *after*
devm_memremap_pages_release() returns, which is obviously too late and
can lead to crashes.

Fix this situation by assigning the responsibility to wait for the
percpu_ref to go idle to devm_memremap_pages() with a new ->cleanup()
callback.  Implement the new cleanup callback for all
devm_memremap_pages() users: pmem, devdax, hmm, and p2pdma.
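
The ordering this fix establishes can be sketched with a toy Python model.
Everything here is an assumption made for illustration: the class, its
methods and the busy_pages counter are hypothetical stand-ins, and the
"wait" is simulated by simply draining the counter.

```python
class PageMapModel:
    """Toy stand-in for a pagemap whose pages may still be referenced."""

    def __init__(self, busy_pages):
        self.busy_pages = busy_pages  # references still held elsewhere
        self.events = []

    def kill(self):
        # Models killing the percpu_ref: no new references can be taken.
        self.events.append("kill")

    def cleanup(self):
        # Models the new ->cleanup() callback: wait until the last
        # reference is dropped (simulated here by draining the counter).
        while self.busy_pages:
            self.busy_pages -= 1
        self.events.append("cleanup")

    def remove_memory(self):
        # Models arch_remove_memory(): only safe once nothing is busy.
        if self.busy_pages:
            raise RuntimeError("backing pages removed while referenced")
        self.events.append("remove_memory")


def release(pgmap):
    """Release path in the fixed order: kill, wait, then unplug."""
    pgmap.kill()
    pgmap.cleanup()
    pgmap.remove_memory()
    return pgmap.events
```

Calling remove_memory() before cleanup() on a model with busy pages raises,
which mirrors the crash scenario described above.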

Link: http://lkml.kernel.org/r/155727339156.292046.5432007428235387859.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: 41e94a8513 ("add devm_memremap_pages")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
1570175abd PCI/P2PDMA: track pgmap references per resource, not globally
In preparation for fixing a race between devm_memremap_pages_release()
and the final put of a page from the device-page-map, allocate a
percpu-ref per p2pdma resource mapping.

Link: http://lkml.kernel.org/r/155727338646.292046.9922678317501435597.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Dan Williams
e615a19121 PCI/P2PDMA: fix the gen_pool_add_virt() failure path
The pci_p2pdma_add_resource() implementation immediately frees the pgmap
if gen_pool_add_virt() fails.  However, that means that when @dev
triggers a devres release, devm_memremap_pages_release() will crash
trying to access the freed @pgmap.

Use the new devm_memunmap_pages() to manually free the mapping in the
error path.

Link: http://lkml.kernel.org/r/155727337603.292046.13101332703665246702.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fixes: 52916982af ("PCI/P2PDMA: Support peer-to-peer memory")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Rafael J. Wysocki
3e26c5feed PCI: PM: Skip devices in D0 for suspend-to-idle
Commit d491f2b752 ("PCI: PM: Avoid possible suspend-to-idle issue")
attempted to avoid a problem with devices whose drivers want them to
stay in D0 over suspend-to-idle and resume, but it did not go as far
as it should with that.

Namely, first of all, the power state of a PCI bridge with a
downstream device in D0 must be D0 (based on the PCI PM spec r1.2,
sec 6, table 6-1, if the bridge is not in D0, there can be no PCI
transactions on its secondary bus), but that is not actively enforced
during system-wide PM transitions, so use the skip_bus_pm flag
introduced by commit d491f2b752 for that.

Second, the configuration of devices left in D0 (whatever the reason)
during suspend-to-idle need not be changed and attempting to put them
into D0 again by force is pointless, so explicitly avoid doing that.

Fixes: d491f2b752 ("PCI: PM: Avoid possible suspend-to-idle issue")
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-06-14 00:03:27 +02:00
Gustavo Pimentel
de76cda215 PCI: Decode PCIe 32 GT/s link speed
PCIe r5.0, sec 7.5.3.18, defines a new 32.0 GT/s bit in the Supported Link
Speeds Vector of Link Capabilities 2.  Decode this new speed.  This does
not affect the speed of the link, which should be negotiated automatically
by the hardware; it only adds decoding when showing the speed to the user.
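
A minimal Python sketch of this kind of decoding, assuming the Supported
Link Speeds Vector bit layout of Link Capabilities 2 (bit 1 = 2.5 GT/s up
to bit 5 = 32.0 GT/s). The function name is hypothetical and this is not
the kernel implementation.

```python
# Supported Link Speeds Vector bits in Link Capabilities 2,
# listed from fastest to slowest.
SUPPORTED_SPEED_BITS = [
    (0x20, "32.0 GT/s"),
    (0x10, "16.0 GT/s"),
    (0x08, "8.0 GT/s"),
    (0x04, "5.0 GT/s"),
    (0x02, "2.5 GT/s"),
]


def highest_supported_speed(lnkcap2):
    """Return the fastest speed advertised in a LNKCAP2 register value."""
    for bit, name in SUPPORTED_SPEED_BITS:
        if lnkcap2 & bit:
            return name
    return "Unknown speed"
```

Dropping the 0x20 entry from the table reproduces the old behaviour for a
device that advertises only the 32.0 GT/s bit.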

Previously, reading the speed of a link operating at this speed showed
"Unknown speed" instead of "32.0 GT/s".

Link: https://lore.kernel.org/lkml/92365e3caf0fc559f9ab14bcd053bfc92d4f661c.1559664969.git.gustavo.pimentel@synopsys.com
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-13 16:49:45 -05:00
Alex Williamson
2d2f4273cb PCI: Always allow probing with driver_override
Commit 0e7df22401 ("PCI: Add sysfs sriov_drivers_autoprobe to control
VF driver binding") introduced the sriov_drivers_autoprobe attribute
which allows users to prevent the kernel from automatically probing a
driver for new VFs as they are created.  This allows VFs to be spawned
without automatically binding the new device to a host driver, such as
in cases where the user intends to use the device only with a meta
driver like vfio-pci.  However, the current implementation prevents any
use of drivers_probe with the VF while sriov_drivers_autoprobe=0.  This
blocks the now current general practice of setting driver_override
followed by using drivers_probe to bind a device to a specified driver.

The kernel never automatically sets a driver_override, so it seems
we can assume a driver_override reflects the intent of the user.  Also,
probing a device using a driver_override match seems outside the scope
of the 'auto' part of sriov_drivers_autoprobe.  Therefore, let's allow
driver_override matches regardless of sriov_drivers_autoprobe, which we
can do by simply testing if a driver_override is set for a device as a
'can probe' condition.
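
The resulting 'can probe' condition can be written as a small Python
predicate. This is an illustrative model, assuming the three inputs below
capture everything relevant; the function and parameter names are
hypothetical.

```python
def can_probe(is_virtfn, pf_drivers_autoprobe, driver_override):
    """Model of the 'can probe' condition described above.

    is_virtfn:            the device is an SR-IOV virtual function
    pf_drivers_autoprobe: sriov_drivers_autoprobe value of its PF
    driver_override:      driver name set by the user, or None
    """
    if not is_virtfn:
        return True  # only VFs are subject to sriov_drivers_autoprobe
    if pf_drivers_autoprobe:
        return True
    # Autoprobe is off: allow probing only on explicit user intent.
    return bool(driver_override)
```

With autoprobe disabled, a VF with driver_override set to "vfio-pci" can
be probed, while a VF without an override cannot.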

Fixes: 0e7df22401 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding")
Link: https://lore.kernel.org/lkml/155742996741.21878.569845487290798703.stgit@gimli.home
Link: https://lore.kernel.org/linux-pci/155672991496.20698.4279330795743262888.stgit@gimli.home/T/#u
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-13 16:49:06 -05:00
Abhishek Sahu
6d2e369f0d PCI: Add NVIDIA GPU multi-function power dependencies
The NVIDIA Turing GPU is a multi-function PCI device with the following
functions:

  - Function 0: VGA display controller
  - Function 1: Audio controller
  - Function 2: USB xHCI Host controller
  - Function 3: USB Type-C UCSI controller

Function 0 is tightly coupled with other functions in the hardware.  When
function 0 is in D3, it gates power for hardware blocks used by other
functions, which means those functions only work when function 0 is in D0.
If any of these functions (1/2/3) are in D0, then function 0 should also be
in D0.

Commit 07f4f97d7b ("vga_switcheroo: Use device link for HDA controller")
already creates a device link to show the dependency of function 1 on
function 0 of this GPU.  Create additional device links to express the
dependencies of functions 2 and 3 on function 0.  This means function 0
will be in D0 if any other function is in D0.
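
As a rough model of what these links express (not of the device link
implementation itself), the set of functions that must be in D0 can be
derived from (consumer, supplier) pairs. The names below are hypothetical
and only one level of dependency is followed.

```python
# (consumer, supplier) pairs for the functions listed above:
# functions 1, 2 and 3 each depend on function 0.
LINKS = [(1, 0), (2, 0), (3, 0)]


def functions_required_in_d0(active, links):
    """Return the functions that must be in D0.

    active: set of function numbers that are currently in D0.
    A supplier must be in D0 whenever one of its consumers is.
    """
    required = set(active)
    for consumer, supplier in links:
        if consumer in active:
            required.add(supplier)
    return required
```

For example, if only function 2 (xHCI) is active, function 0 is required
in D0 as well; with no active function, nothing is required.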

[bhelgaas: I think the PCI spec expectation is that functions can be
power-managed independently, so I don't think this device is technically
compliant.  For example, the PCIe r5.0 spec, sec 1.4, says "the PCI/PCIe
hardware/software model includes architectural constructs necessary to
discover, configure, and use a Function, without needing Function-specific
knowledge" and sec 5.1 says "D states are associated with a particular
Function" and "PM provides ... a mechanism to identify power management
capabilities of a given Function [and] the ability to transition a Function
into a certain power management state."]

Link: https://lore.kernel.org/lkml/20190606092225.17960-3-abhsahu@nvidia.com
Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-13 15:53:40 -05:00
Abhishek Sahu
a17beb1a08 PCI: Generalize multi-function power dependency device links
Although not allowed by the PCI specs, some multi-function devices have
power dependencies between the functions.  For example, function 1 may not
work unless function 0 is in the D0 power state.

The existing quirk_gpu_hda() adds a device link to express this dependency
for GPU and HDA devices, but it really is not specific to those device
types.

Generalize it and rename it to pci_create_device_link() so we can create
dependencies between any "consumer" and "producer" functions of a
multi-function device, where the consumer is only functional if the
producer is in D0.  This reorganization should not affect any
functionality.

Link: https://lore.kernel.org/lkml/20190606092225.17960-2-abhsahu@nvidia.com
Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
[bhelgaas: commit log, reword diagnostic]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-06-13 15:52:37 -05:00
Logan Gunthorpe
fcdf8e95fa PCI/switchtec: Add module parameter to request more interrupts
Since we want to use more interrupts in the NTB MSI code, we need to
be able to allocate more (sometimes virtual) interrupts in the
switchtec driver. Therefore, add a module parameter to request the
allocation of additional interrupts.

This puts virtually no limit on the number of MSI interrupts available
to NTB clients.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2019-06-13 08:59:38 -04:00
Logan Gunthorpe
d7cc609fb6 PCI/MSI: Support allocating virtual MSI interrupts
For NTB devices, we want to be able to trigger MSI interrupts
through a memory window. In these cases we may want to use
more interrupts than the NTB PCI device has available in its MSI-X
table.

We allow for this by creating a new 'virtual' interrupt. These
interrupts are allocated as usual but are not programmed into the
MSI-X table (as there may not be space for them).
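
A toy Python model of the idea, assuming a vector counts as 'virtual' when
its index falls outside the hardware MSI-X table. The data layout and
function names are hypothetical and only illustrate the split between
allocation and table programming.

```python
def allocate_vectors(table_size, requested):
    """Allocate `requested` vectors; those beyond the table are virtual."""
    return [
        {"index": i, "is_virtual": i >= table_size}
        for i in range(requested)
    ]


def program_table(vectors, table_size):
    """Write only the non-virtual vectors into a model MSI-X table."""
    table = [None] * table_size
    for vec in vectors:
        if vec["is_virtual"]:
            continue  # no table slot exists for this vector
        table[vec["index"]] = "programmed"
    return table
```

Requesting 6 vectors on a device with a 4-entry table yields 4 programmed
entries and 2 virtual vectors that never touch the table.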

The MSI address and data will then be handled through an NTB MSI library
introduced later in this series.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2019-06-13 08:59:34 -04:00
Alan Mikhak
dbb7bbcc8a PCI: endpoint: Clear BAR before freeing its space
The associated pci_epf_bar structure is needed in pci_epc_clear_bar() to
clear a BAR correctly, but it is reset in pci_epf_free_space() (which
is called first), causing pci_epc_clear_bar() to fail.

Reorder the pci_epc_clear_bar()/pci_epf_free_space() calls
to fix the issue.

Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com>
[lorenzo.pieralisi@arm.com: reworded the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2019-06-11 10:57:54 +01:00
Alan Mikhak
3041a64361 PCI: endpoint: Skip odd BAR when skipping 64bit BAR
Always skip odd BAR when skipping 64bit BARs in pci_epf_test_set_bar()
and pci_epf_test_alloc_space() otherwise pci_epf_test_set_bar() will
call pci_epc_set_bar() on an odd loop index when skipping reserved 64bit
BAR.

Moreover, pci_epf_test_alloc_space() will call pci_epf_alloc_space() on
bind for an odd loop index when BAR is 64bit but leaks on subsequent
unbind by not calling pci_epf_free_space().
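
The loop shape this describes can be sketched in Python as follows. It is
a model under stated assumptions: six BARs, a per-BAR flags value in which
0x04 marks a 64-bit memory BAR, and a hypothetical reserved_mask bitmap.
The step is computed before the reserved check so that the odd half of a
reserved 64-bit BAR is skipped too.

```python
MEM_TYPE_64 = 0x04  # 64-bit memory BAR flag


def bars_to_configure(bar_flags, reserved_mask=0):
    """Return the BAR indices a set/alloc loop would actually touch."""
    selected = []
    bar = 0
    while bar < 6:
        # A 64-bit BAR consumes the following odd BAR as its upper half.
        step = 2 if bar_flags[bar] & MEM_TYPE_64 else 1
        if not reserved_mask & (1 << bar):
            selected.append(bar)
        bar += step
    return selected
```

With BAR 0 and BAR 4 marked 64-bit, the loop visits 0, 2, 3 and 4; if
BAR 0 is also reserved it visits 2, 3 and 4 and never lands on BAR 1.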

Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-11 10:55:36 +01:00
Alan Mikhak
f16fb16ed1 PCI: endpoint: Allocate enough space for fixed size BAR
PCI endpoint test function code should honor the .bar_fixed_size parameter
from underlying endpoint controller drivers or results may be unexpected.

In pci_epf_test_alloc_space(), check if BAR being used for test
register space is a fixed size BAR. If so, allocate the required fixed
size.
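
In Python terms the size selection amounts to the sketch below. It assumes
a per-BAR table of fixed sizes in which 0 means "not fixed"; the function
name is hypothetical and any rounding done by the allocator is left out.

```python
def reg_space_size(requested, bar_fixed_size, test_reg_bar):
    """Size to allocate for the test register space.

    bar_fixed_size: per-BAR fixed sizes reported by the controller
                    driver, 0 where the BAR size is not fixed.
    """
    fixed = bar_fixed_size[test_reg_bar]
    if fixed:
        # The controller dictates the size of this BAR.
        return fixed
    return requested
```

For instance, a 512-byte request on a BAR fixed at 1 MiB allocates 1 MiB,
while the same request on a non-fixed BAR keeps its 512 bytes.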

Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2019-06-11 10:55:22 +01:00
Alan Mikhak
db7a62482d PCI: endpoint: Set endpoint controller pointer to NULL
Set endpoint controller pointer to NULL in pci_epc_remove_epf()
to avoid -EBUSY on subsequent call to pci_epc_add_epf().

Add a check for NULL endpoint function pointer.

Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2019-06-11 10:55:12 +01:00
Jean-Philippe Brucker
59b099a6c7 PCI: OF: Initialize dev->fwnode appropriately
For PCI devices that have an OF node, set the fwnode as well. This way
drivers that rely on fwnode don't need the special case described by
commit f94277af03 ("of/platform: Initialise dev->fwnode appropriately").

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-06 17:32:13 -04:00
Niklas Cassel
64adde31c8 PCI: qcom: Ensure that PERST is asserted for at least 100 ms
Currently, there is only a 1 ms sleep after asserting PERST.

Reading the datasheets for different endpoints, some require PERST to be
asserted for 10 ms in order for the endpoint to perform a reset, others
require it to be asserted for 50 ms.

Several SoCs using this driver use PCIe Mini Card, where we don't know
what endpoint will be plugged in.

The PCI Express Card Electromechanical Specification r2.0, section
2.2, "PERST# Signal" specifies:

"On power up, the deassertion of PERST# is delayed 100 ms (TPVPERL) from
the power rails achieving specified operating limits."

Add a sleep of 100 ms before deasserting PERST, in order to ensure that
we are compliant with the spec.

Fixes: 82a823833f ("PCI: qcom: Add Qualcomm PCIe controller driver")
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Cc: stable@vger.kernel.org # 4.5+
2019-05-30 16:51:12 +01:00
Ley Foon Tan
c7ddfd3514 PCI: altera-msi: Allow building as module
Altera MSI IP is a soft IP and is only available after
an FPGA image (with design containing it) is programmed.

Make the driver modular to support the use case where the FPGA image is
programmed after the kernel has booted, so that the driver can be loaded
upon request.

Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2019-05-30 15:34:54 +01:00
Ley Foon Tan
ec15c4d0d5 PCI: altera: Allow building as module
Altera PCIe Rootport IP is a soft IP and is only available after
an FPGA image (whose design contains it) is programmed.

Make the driver modular to support use cases where the FPGA image is
programmed after the kernel has booted, so that the driver
can be loaded upon request.

Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2019-05-30 15:28:01 +01:00
Alex Williamson
76002d8b48 PCI: Return error if cannot probe VF
Commit 0e7df22401 ("PCI: Add sysfs sriov_drivers_autoprobe to control
VF driver binding") allows the user to specify that drivers for VFs of
a PF should not be probed, but it actually causes pci_device_probe() to
return success back to the driver core in this case.  Therefore by all
sysfs appearances the device is bound to a driver, the driver link from
the device exists as does the device link back from the driver, yet the
driver's probe function is never called on the device.  We also fail to
do any sort of cleanup when we're prohibited from probing the device:
the IRQ setup remains in place and we even hold a device reference.

Instead, abort with errno before any setup or references are taken when
pci_device_can_probe() prevents us from trying to probe the device.

Link: https://lore.kernel.org/lkml/155672991496.20698.4279330795743262888.stgit@gimli.home
Fixes: 0e7df22401 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-05-30 08:40:36 -05:00
Bjorn Andersson
67021ae0bb PCI: qcom: Add QCS404 PCIe controller support
The QCS404 platform contains a PCIe version 2.4.0 controller and a
Qualcomm PCIe2 PHY. The driver already supports version 2.4.0, for the
IPQ4019, but this support touches clocks and resets related to the PHY
as well and there's no upstream driver for the PHY.

On QCS404 we must initialize the PHY, so a separate PHY driver is
implemented to take care of this and the controller driver is updated to
not require the PHY related resources. This is done by relying on the
fact that operations in both the clock and reset framework are NOPs when
passed NULL, so we can isolate this change to only the
qcom_pcie_get_resources_2_4_0() function.

For QCS404 we also need to enable the AHB (iface) clock, in order to
access the register space of the controller, but as this is not part of
the IPQ4019 DT binding this is only added for new users of the 2.4.0
controller.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
2019-05-29 17:12:35 +01:00
Bjorn Andersson
5aa180974e PCI: qcom: Use clk bulk API for 2.4.0 controllers
Before introducing the QCS404 platform, which uses the same PCIe
controller as IPQ4019, migrate this to use the bulk clock API, in order
to make the error paths slightly cleaner.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
2019-05-29 17:12:24 +01:00
Rafael J. Wysocki
d491f2b752 PCI: PM: Avoid possible suspend-to-idle issue
If a PCI driver leaves the device handled by it in D0 and calls
pci_save_state() on the device in its ->suspend() or ->suspend_late()
callback, it can expect the device to stay in D0 over the whole
s2idle cycle.  However, that may not be the case if there is a
spurious wakeup while the system is suspended, because in that case
pci_pm_suspend_noirq() will run again after pci_pm_resume_noirq()
which calls pci_restore_state(), via pci_pm_default_resume_early(),
so state_saved is cleared and the second iteration of
pci_pm_suspend_noirq() will invoke pci_prepare_to_sleep() which
may change the power state of the device.

To avoid that, add a new internal flag, skip_bus_pm, that will be set
by pci_pm_suspend_noirq() when it runs for the first time during the
given system suspend-resume cycle if the state of the device has
been saved already and the device is still in D0.  Setting that flag
will cause the next iterations of pci_pm_suspend_noirq() to set
state_saved for pci_pm_resume_noirq(), so that it always restores the
device state from the originally saved data, and avoid calling
pci_prepare_to_sleep() for the device.

Fixes: 33e4f80ee6 ("ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-05-27 10:55:08 +02:00
Rafael J. Wysocki
9a51c6b1f9 ACPI/PCI: PM: Add missing wakeup.flags.valid checks
Both acpi_pci_need_resume() and acpi_dev_needs_resume() check if the
current ACPI wakeup configuration of the device matches what is
expected as far as system wakeup from sleep states is concerned, as
reflected by the device_may_wakeup() return value for the device.

However, they only should do that if wakeup.flags.valid is set for
the device's ACPI companion, because otherwise the wakeup.prepare_count
value for it is meaningless.

Add the missing wakeup.flags.valid checks to these functions.
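
The guarded comparison can be modelled as a small Python predicate. This
covers only the wakeup-configuration check discussed here, not any other
condition those functions may evaluate, and the names are hypothetical.

```python
def wakeup_config_mismatch(wakeup_valid, may_wakeup, prepare_count):
    """True if the ACPI wakeup setup disagrees with device_may_wakeup().

    wakeup_valid:  models wakeup.flags.valid of the ACPI companion
    may_wakeup:    models the device_may_wakeup() return value
    prepare_count: models wakeup.prepare_count
    """
    if not wakeup_valid:
        # prepare_count is meaningless without valid wakeup flags.
        return False
    return may_wakeup != (prepare_count > 0)
```

Without the validity guard, a device that may wake up but has no valid
ACPI wakeup data would be reported as mismatched purely because its
prepare_count is 0.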

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-05-27 10:51:06 +02:00
Linus Torvalds
414147d99b pci-v5.2-changes
Merge tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration changes:

   - Add _HPX Type 3 settings support, which gives firmware more
     influence over device configuration (Alexandru Gagniuc)

   - Support fixed bus numbers from bridge Enhanced Allocation
     capabilities (Subbaraya Sundeep)

   - Add "external-facing" DT property to identify cases where we
     require IOMMU protection against untrusted devices (Jean-Philippe
     Brucker)

   - Enable PCIe services for host controller drivers that use managed
     host bridge alloc (Jean-Philippe Brucker)

   - Log PCIe port service messages with pci_dev, not the pcie_device
     (Frederick Lawler)

   - Convert pciehp from pciehp_debug module parameter to generic
     dynamic debug (Frederick Lawler)

  Peer-to-peer DMA:

   - Add whitelist of Root Complexes that support peer-to-peer DMA
     between Root Ports (Christian König)

  Native controller drivers:

   - Add PCI host bridge DMA ranges for bridges that can't DMA
     everywhere, e.g., iProc (Srinath Mannam)

   - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan
     Chocron)

   - Fix Tegra MSI target allocation so DMA doesn't generate unwanted
     MSIs (Vidya Sagar)

   - Fix of_node reference leaks (Wen Yang)

   - Fix Hyper-V module unload & device removal issues (Dexuan Cui)

   - Cleanup R-Car driver (Marek Vasut)

   - Cleanup Keystone driver (Kishon Vijay Abraham I)

   - Cleanup i.MX6 driver (Andrey Smirnov)

  Significant bug fixes:

   - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude
     Paul)

   - Fix Switchtec firmware update performance issue (Wesley Sheng)

   - Work around Pericom switch link retraining erratum (Stefan Mätje)"

* tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits)
  MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
  PCI: Replace printk(KERN_INFO) with pr_info(), etc
  PCI: Use dev_printk() when possible
  PCI: Cleanup setup-bus.c comments and whitespace
  PCI: imx6: Allow asynchronous probing
  PCI: dwc: Save root bus for driver remove hooks
  PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
  PCI: dwc: Free MSI in dw_pcie_host_init() error path
  PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi()
  ...
2019-05-14 10:30:10 -07:00
Bjorn Helgaas
c7a1c2bbb6 Merge branch 'pci/trivial'
  - Cleanup PCI register definitions, typos, etc (Bjorn Helgaas)

  - Remove unnecessary use of user-space types in CPER (Bjorn Helgaas)

  - Cleanup setup-bus.c comments & whitespace (Nicholas Johnson)

* pci/trivial:
  PCI: Cleanup setup-bus.c comments and whitespace
  CPER: Remove unnecessary use of user-space types
  CPER: Add UEFI spec references
  PCI: Fix comment typos
  PCI: Cleanup register definition width and whitespace

# Conflicts:
#	drivers/pci/pci.c
#	drivers/pci/setup-bus.c
2019-05-13 18:34:48 -05:00
Bjorn Helgaas
f8587c80c6 Merge branch 'pci/printk-portdrv'
  - Change some desirable KERN_DEBUG messages to KERN_INFO/KERN_ERR
    (Frederick Lawler)

  - Log PCIe port service messages with pci_dev, not the pcie_device
    (Frederick Lawler)

  - Convert pciehp from pciehp_debug module parameter to generic dynamic
    debug (Frederick Lawler)

* pci/printk-portdrv:
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
2019-05-13 18:34:47 -05:00
Bjorn Helgaas
192415f498 Merge branch 'pci/printk'
* pci/printk:
  PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
  PCI: Replace printk(KERN_INFO) with pr_info(), etc
  PCI: Use dev_printk() when possible
2019-05-13 18:34:46 -05:00
Bjorn Helgaas
f2e9468316 Merge branch 'pci/iova-dma-ranges'
  - Add list of legal DMA address ranges to PCI host bridge (Srinath
    Mannam)

  - Reserve inaccessible DMA ranges so IOMMU doesn't allocate them (Srinath
    Mannam)

  - Parse iProc DT dma-ranges to learn what PCI devices can reach via DMA
    (Srinath Mannam)

* pci/iova-dma-ranges:
  PCI: iproc: Add sorted dma ranges resource entries to host bridge
  iommu/dma: Reserve IOVA for PCIe inaccessible DMA address
  PCI: Add dma_ranges window list

# Conflicts:
#	drivers/pci/probe.c
2019-05-13 18:34:45 -05:00
Bjorn Helgaas
ee6df38da8 Merge branch 'remotes/lorenzo/pci/misc'
  - Exit pcitest with error code when test fails (Jean-Jacques Hiblot)

  - Fix leaked of_node references in dra7xx, uniphier, layerscape,
    rockchip, aardvark, iproc, mediatek, rpadlpar (Wen Yang)

  - Fix pcitest "help" option parsing (Kishon Vijay Abraham I)

  - Fix Makefile bug that inadvertently removes pcitest.sh (Kishon Vijay
    Abraham I)

  - Check for alloc_workqueue() failure in endpoint test driver (Kangjie
    Lu)

* remotes/lorenzo/pci/misc:
  PCI: endpoint: Fix a potential NULL pointer dereference
  tools: PCI: Handle pcitest.sh independently from pcitest
  tools: PCI: Add 'h' in optstring of getopt()
  PCI: mediatek: Fix a leaked reference by adding missing of_node_put()
  PCI: iproc: Fix a leaked reference by adding missing of_node_put()
  PCI: aardvark: Fix a leaked reference by adding missing of_node_put()
  PCI: rockchip: Fix a leaked reference by adding missing of_node_put()
  PCI: dwc: layerscape: Fix a leaked reference by adding missing of_node_put()
  PCI: uniphier: Fix a leaked reference by adding missing of_node_put()
  PCI: dwc: pci-dra7xx: Fix a leaked reference by adding missing of_node_put()
  tools: PCI: Exit with error code when test fails
2019-05-13 18:34:44 -05:00
Bjorn Helgaas
ed0eaf3205 Merge branch 'remotes/lorenzo/pci/xilinx'
  - Check for __get_free_pages() failure in xilinx (Kangjie Lu)

* remotes/lorenzo/pci/xilinx:
  PCI: xilinx: Check for __get_free_pages() failure
2019-05-13 18:34:44 -05:00
Bjorn Helgaas
cdf4315502 Merge branch 'remotes/lorenzo/pci/tegra'
  - Use DMA-API to get tegra MSI address to prevent device DMA from
    generating unwanted MSIs (Vidya Sagar)

* remotes/lorenzo/pci/tegra:
  PCI: tegra: Use the DMA-API to get the MSI address
2019-05-13 18:34:43 -05:00
Bjorn Helgaas
673525c5c2 Merge branch 'remotes/lorenzo/pci/rockchip'
  - Fix rockchip bitwise operations that overflow type (Colin Ian King)

* remotes/lorenzo/pci/rockchip:
  PCI: rockchip: Fix rockchip_pcie_ep_assert_intx() bitwise operations
2019-05-13 18:34:43 -05:00
Bjorn Helgaas
bac9789e53 Merge branch 'remotes/lorenzo/pci/rcar'
  - Use BIT() when appropriate in rcar (Marek Vasut)

  - Use u32 to match rcar hardware register widths (Marek Vasut)

  - Use BITS_PER_BYTE when appropriate in rcar (Marek Vasut)

  - Remove unnecessary casts in rcar (Marek Vasut)

  - Fix 64-bit MSI target addresses in rcar (Marek Vasut)

  - Check for __get_free_pages() failure in rcar (Kangjie Lu)

  - Fix shadowed rcar "irq" variable (Wolfram Sang)

* remotes/lorenzo/pci/rcar:
  PCI: rcar: Do not shadow the 'irq' variable
  PCI: rcar: Fix a potential NULL pointer dereference
  PCI: rcar: Fix 64bit MSI message address handling
  PCI: rcar: Clean up debug messages
  PCI: rcar: Replace (8 * n) with (BITS_PER_BYTE * n)
  PCI: rcar: Replace various variable types with unsigned ones for register values
  PCI: rcar: Replace unsigned long with u32/unsigned int in register accessors
  PCI: rcar: Clean up remaining macros defining bits

# Conflicts:
#	drivers/pci/controller/pcie-rcar.c
2019-05-13 18:34:42 -05:00
Bjorn Helgaas
fb8a85fabd Merge branch 'remotes/lorenzo/pci/mediatek'
  - Make mediatek clocks optional, not required (Chunfeng Yun)

  - Remove unused mediatek mt2712 "num-lanes" DT property (Honghui Zhang)

* remotes/lorenzo/pci/mediatek:
  arm64: dts: mt2712: Remove un-used property for PCIe
  PCI: mediatek: Get optional clocks with devm_clk_get_optional()
2019-05-13 18:34:41 -05:00
Bjorn Helgaas
0b8439d374 Merge branch 'remotes/lorenzo/pci/keystone'
  - Move IRQ register address computation inside macros (Kishon Vijay
    Abraham I)

  - Separate legacy IRQ and MSI configuration (Kishon Vijay Abraham I)

  - Use hwirq, not virq, to get MSI IRQ number offset (Kishon Vijay Abraham
    I)

  - Squash ks_pcie_handle_msi_irq() into ks_pcie_msi_irq_handler() (Kishon
    Vijay Abraham I)

  - Add dwc support for platforms with custom MSI controllers (Kishon Vijay
    Abraham I)

  - Add keystone-specific MSI controller (Kishon Vijay Abraham I)

  - Remove dwc host_ops previously used for keystone-specific MSI (Kishon
    Vijay Abraham I)

  - Skip dwc default MSI init if platform has custom MSI controller (Kishon
    Vijay Abraham I)

  - Implement .start_link() and .stop_link() for keystone endpoint support
    (Kishon Vijay Abraham I)

  - Add keystone "reg-names" DT binding (Kishon Vijay Abraham I)

  - Squash ks_pcie_dw_host_init() into ks_pcie_add_pcie_port() (Kishon
    Vijay Abraham I)

  - Get keystone register resources from DT by name, not index (Kishon
    Vijay Abraham I)

  - Get DT resources in .probe() to prepare for endpoint support (Kishon
    Vijay Abraham I)

  - Add "ti,syscon-pcie-mode" DT property for PCIe mode configuration
    (Kishon Vijay Abraham I)

  - Explicitly set keystone to host mode (Kishon Vijay Abraham I)

  - Document DT "atu" reg-names requirement for DesignWare core >= 4.80
    (Kishon Vijay Abraham I)

  - Enable dwc iATU unroll for endpoint mode as well as host mode (Kishon
    Vijay Abraham I)

  - Add dwc "version" to identify core >= 4.80 for ATU programming (Kishon
    Vijay Abraham I)

  - Don't build ARM32-specific keystone code on ARM64 (Kishon Vijay Abraham
    I)

  - Add DT binding for keystone PCIe RC in AM654 SoC (Kishon Vijay Abraham
    I)

  - Add keystone support for AM654 SoC PCIe RC (Kishon Vijay Abraham I)

  - Reset keystone PHYs before enabling them (Kishon Vijay Abraham I)

  - Make of_pci_get_max_link_speed() available to endpoint drivers as well
    as host drivers (Kishon Vijay Abraham I)

  - Add keystone support for DT "max-link-speed" property (Kishon Vijay
    Abraham I)

  - Add endpoint library support for BAR buffer alignment (Kishon Vijay
    Abraham I)

  - Make all dw_pcie_ep_ops structs const (Kishon Vijay Abraham I)

  - Fix fencepost error in dw_pcie_ep_find_capability() (Kishon Vijay
    Abraham I)

  - Add dwc hooks for dbi/dbi2 that share the same address space (Kishon
    Vijay Abraham I)

  - Add keystone support for TI AM654x in endpoint mode (Kishon Vijay
    Abraham I)

  - Configure designware endpoints to advertise smallest resizable BAR
    (1MB) (Kishon Vijay Abraham I)

  - Align designware endpoint ATU windows for raising MSIs (Kishon Vijay
    Abraham I)

  - Add endpoint test support for TI AM654x (Kishon Vijay Abraham I)

  - Fix endpoint test test_reg_bar issue (Kishon Vijay Abraham I)

* remotes/lorenzo/pci/keystone:
  misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test
  misc: pci_endpoint_test: Add support to test PCI EP in AM654x
  PCI: designware-ep: Use aligned ATU window for raising MSI interrupts
  PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size
  PCI: keystone: Add support for PCIe EP in AM654x Platforms
  dt-bindings: PCI: Add PCI EP DT binding documentation for AM654
  PCI: dwc: Add callbacks for accessing dbi2 address space
  PCI: dwc: Fix dw_pcie_ep_find_capability() to return correct capability offset
  PCI: dwc: Add const qualifier to struct dw_pcie_ep_ops
  PCI: endpoint: Add support to specify alignment for buffers allocated to BARs
  PCI: keystone: Add support to set the max link speed from DT
  PCI: OF: Allow of_pci_get_max_link_speed() to be used by PCI Endpoint drivers
  PCI: keystone: Invoke phy_reset() API before enabling PHY
  PCI: keystone: Add support for PCIe RC in AM654x Platforms
  dt-bindings: PCI: Add PCI RC DT binding documentation for AM654
  PCI: keystone: Prevent ARM32 specific code to be compiled for ARM64
  PCI: dwc: Fix ATU identification for designware version >= 4.80
  PCI: dwc: Enable iATU unroll for endpoint too
  dt-bindings: PCI: Document "atu" reg-names
  PCI: keystone: Explicitly set the PCIe mode
  dt-bindings: PCI: Add dt-binding to configure PCIe mode
  PCI: keystone: Move resources initialization to prepare for EP support
  PCI: keystone: Use platform_get_resource_byname() to get memory resources
  PCI: keystone: Perform host initialization in a single function
  dt-bindings: PCI: keystone: Add "reg-names" binding information
  PCI: keystone: Cleanup error_irq configuration
  PCI: keystone: Add start_link()/stop_link() dw_pcie_ops
  PCI: dwc: Remove default MSI initialization for platform specific MSI chips
  PCI: dwc: Remove Keystone specific dw_pcie_host_ops
  PCI: keystone: Use Keystone specific msi_irq_chip
  PCI: dwc: Add support to use non default msi_irq_chip
  PCI: keystone: Cleanup ks_pcie_msi_irq_handler()
  PCI: keystone: Use hwirq to get the MSI IRQ number offset
  PCI: keystone: Add separate functions for configuring MSI and legacy interrupt
  PCI: keystone: Cleanup interrupt related macros

# Conflicts:
#	drivers/pci/controller/dwc/pcie-designware.h
2019-05-13 18:34:41 -05:00
Bjorn Helgaas
b138f67d7b Merge branch 'remotes/lorenzo/pci/iproc'
  - Work around iproc CRS completion issues (Srinath Mannam)

  - Allow smaller iproc outbound windows so driver can work on 32-bit
    systems (Srinath Mannam)

  - Use iproc-specific config read for PAXBv2 (not PAXB) (Srinath Mannam)

* remotes/lorenzo/pci/iproc:
  PCI: iproc: Enable iProc config read for PAXBv2
  PCI: iproc: Allow outbound configuration for 32-bit I/O region
  PCI: iproc: Add CRS check in config read
2019-05-13 18:34:39 -05:00
Bjorn Helgaas
5349abcf8e Merge branch 'remotes/lorenzo/pci/imx'
  - Simplify imx7d_pcie_wait_for_phy_pll_lock() by using
    regmap_read_poll_timeout() (Andrey Smirnov)

  - Drop imx6_pcie_wait_for_link() in favor of the more generic
    dw_pcie_wait_for_link() (Andrey Smirnov)

  - Return -ETIMEDOUT instead of -EINVAL from
    imx6_pcie_wait_for_speed_change() (Andrey Smirnov)

  - Remove unused PCIE_PL_PFLR_* constants from imx6 (Andrey Smirnov)

  - Use shared PHY debug register definitions in imx6 (Andrey Smirnov)

  - Use BIT() in imx6 (Andrey Smirnov)

  - Simplify imx6 PHY bit operations (Andrey Smirnov)

  - Simplify imx6 pcie_phy_poll_ack() (Andrey Smirnov)

  - Use data types that match actual imx6 PHY register width (Andrey
    Smirnov)

  - Mark imx6 suspend support with drvdata flags instead of checking
    variants (Andrey Smirnov)

  - Sleep instead of delay in imx6_pcie_enable_ref_clk() (Andrey Smirnov)

* remotes/lorenzo/pci/imx:
  PCI: imx6: Use usleep_range() in imx6_pcie_enable_ref_clk()
  PCI: imx6: Use flags to indicate support for suspend
  PCI: imx6: Restrict PHY register data to 16-bit
  PCI: imx6: Simplify pcie_phy_poll_ack()
  PCI: imx6: Simplify bit operations in PHY functions
  PCI: imx6: Make use of BIT() in constant definitions
  PCI: dwc: imx6: Share PHY debug register definitions
  PCI: imx6: Remove PCIE_PL_PFLR_* constants
  PCI: imx6: Return -ETIMEOUT from imx6_pcie_wait_for_speed_change()
  PCI: imx6: Drop imx6_pcie_wait_for_link()
  PCI: imx6: Simplify imx7d_pcie_wait_for_phy_pll_lock()
2019-05-13 18:34:39 -05:00