Commit graph

798 commits

Author SHA1 Message Date
Linus Torvalds
5b47b10e8f pci-v5.12-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmA2xiQUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vzRDA/9GCyEskI9DMtyT9UeoTMzpHcUZpaU
 eCbLa2BSPjOKlrHLnPY7IwE0nT7ihe4OOcm8uOYOWtulE46XJNCHfxlUYP3SbI0Y
 JlG0FBCh4ldzCzzKsftwkSvVhk+gn+ms9ucJ8q2iBSOXVhG/41IbX7++8IfbQM4v
 VHjdYUmTCCiOSRDtBVi82p4+GAHxH8IhaB0gDNb1Q7myj+qJKL5nKjK/nukgO0fO
 UpCnSxyua48Ij+c59Y1QAIhGeORq5Gg5Q4ussY3FxS9ovhZODEGQwCFniTfilqRw
 wEB9Fb8tiPY60ljEyDPnERMkiW69zutTJqOY4LfwmoRM9IEbxD6VPIqF5gin8sB7
 pHhX4KUU+eB1hQdK9SGKjkwyehquNKzTdxsu2jccltOKwBm5jcXYeOvu2bJTzZn+
 rrZPYJoA1dQig3bEuOzsBxvW4Jaj7IsVfVcao4OzXyh8Y7tLr9kVDXxr7JC/EkPM
 zRK24yglERD2J1JXgNMvOuJQj6JmRHhEbV/faZci8x8ZEaz1FawRAUZqHf/gGmnW
 2CllarHbRnchPyD8btv03Mp84WG6fCfKy7zG2D8HxOsiStDO/5ICehHtGcvYg7IL
 RuE4Tj8OKdcbw/8cO4C3842FqiSj34+jooNIHSLyBqcpJam6VsN4XqNIZCL+DeG5
 Q2JXruAaahTWOZg=
 =GXL5
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Remove unnecessary locking around _OSC (Bjorn Helgaas)
   - Clarify message about _OSC failure (Bjorn Helgaas)
   - Remove notification of PCIe bandwidth changes (Bjorn Helgaas)
   - Tidy checking of syscall user config accessors (Heiner Kallweit)

  Resource management:
   - Decline to resize resources if boot config must be preserved (Ard
     Biesheuvel)
   - Fix pci_register_io_range() memory leak (Geert Uytterhoeven)

  Error handling (Keith Busch):
   - Clear error status from the correct device
   - Retain error recovery status so drivers can use it after reset
   - Log the type of Port (Root or Switch Downstream) that we reset
   - Always request a reset for Downstream Ports in frozen state

  Endpoint framework and NTB (Kishon Vijay Abraham I):
   - Make *_get_first_free_bar() take into account 64 bit BAR
   - Add helper API to get the 'next' unreserved BAR
   - Make *_free_bar() return error codes on failure
   - Remove unused pci_epf_match_device()
   - Add support to associate secondary EPC with EPF
   - Add support in configfs to associate two EPCs with EPF
   - Add pci_epc_ops to map MSI IRQ
   - Add pci_epf_ops to expose function-specific attrs
   - Allow user to create sub-directory of 'EPF Device' directory
   - Implement ->msi_map_irq() ops for cadence
   - Configure LM_EP_FUNC_CFG based on epc->function_num_map for cadence
   - Add EP function driver to provide NTB functionality
   - Add support for EPF PCI Non-Transparent Bridge
   - Add specification for PCI NTB function device
   - Add PCI endpoint NTB function user guide
   - Add configfs binding documentation for pci-ntb endpoint function

  Broadcom STB PCIe controller driver:
   - Add support for BCM4908 and external PERST# signal controller
     (Rafał Miłecki)

  Cadence PCIe controller driver:
   - Retrain Link to work around Gen2 training defect (Nadeem Athani)
   - Fix merge botch in cdns_pcie_host_map_dma_ranges() (Krzysztof
     Wilczyński)

  Freescale Layerscape PCIe controller driver:
   - Add LX2160A rev2 EP mode support (Hou Zhiqiang)
   - Convert to builtin_platform_driver() (Michael Walle)

  MediaTek PCIe controller driver:
   - Fix OF node reference leak (Krzysztof Wilczyński)

  Microchip PolarFlare PCIe controller driver:
   - Add Microchip PolarFire PCIe controller driver (Daire McNamara)

  Qualcomm PCIe controller driver:
   - Use PHY_REFCLK_USE_PAD only for ipq8064 (Ansuel Smith)
   - Add support for ddrss_sf_tbu clock for sm8250 (Dmitry Baryshkov)

  Renesas R-Car PCIe controller driver:
   - Drop PCIE_RCAR config option (Lad Prabhakar)
   - Always allocate MSI addresses in 32bit space (Marek Vasut)

  Rockchip PCIe controller driver:
   - Add FriendlyARM NanoPi M4B DT binding (Chen-Yu Tsai)
   - Make 'ep-gpios' DT property optional (Chen-Yu Tsai)

  Synopsys DesignWare PCIe controller driver:
   - Work around ECRC configuration hardware defect (Vidya Sagar)
   - Drop support for config space in DT 'ranges' (Rob Herring)
   - Change size to u64 for EP outbound iATU (Shradha Todi)
   - Add upper limit address for outbound iATU (Shradha Todi)
   - Make dw_pcie ops optional (Jisheng Zhang)
   - Remove unnecessary dw_pcie_ops from al driver (Jisheng Zhang)

  Xilinx Versal CPM PCIe controller driver:
   - Fix OF node reference leak (Pan Bian)

  Miscellaneous:
   - Remove tango host controller driver (Arnd Bergmann)
   - Remove IRQ handler & data together (altera-msi, brcmstb, dwc)
     (Martin Kaiser)
   - Fix xgene-msi race in installing chained IRQ handler (Martin
     Kaiser)
   - Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy (Junhao He)
   - Fix pci-bridge-emul array overruns (Russell King)
   - Remove obsolete uses of WARN_ON(in_interrupt()) (Sebastian Andrzej
     Siewior)"

* tag 'pci-v5.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (69 commits)
  PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064
  PCI: qcom: Add support for ddrss_sf_tbu clock
  dt-bindings: PCI: qcom: Document ddrss_sf_tbu clock for sm8250
  PCI: al: Remove useless dw_pcie_ops
  PCI: dwc: Don't assume the ops in dw_pcie always exist
  PCI: dwc: Add upper limit address for outbound iATU
  PCI: dwc: Change size to u64 for EP outbound iATU
  PCI: dwc: Drop support for config space in 'ranges'
  PCI: layerscape: Convert to builtin_platform_driver()
  PCI: layerscape: Add LX2160A rev2 EP mode support
  dt-bindings: PCI: layerscape: Add LX2160A rev2 compatible strings
  PCI: dwc: Work around ECRC configuration issue
  PCI/portdrv: Report reset for frozen channel
  PCI/AER: Specify the type of Port that was reset
  PCI/ERR: Retain status from error notification
  PCI/AER: Clear AER status from Root Port when resetting Downstream Port
  PCI/ERR: Clear status of the reporting device
  dt-bindings: arm: rockchip: Add FriendlyARM NanoPi M4B
  PCI: rockchip: Make 'ep-gpios' DT property optional
  Documentation: PCI: Add PCI endpoint NTB function user guide
  ...
2021-02-25 09:56:08 -08:00
Bjorn Helgaas
215fc27dd8 Merge branch 'pci/link'
- Remove bandwidth notification for now to avoid reporting spam (Bjorn
  Helgaas)

* pci/link:
  PCI/LINK: Remove bandwidth notification
2021-02-24 14:59:18 -06:00
Keith Busch
ba952824e6 PCI/portdrv: Report reset for frozen channel
The PCI error recovery always resets the link for a frozen state, so the
port driver should return that a reset is required for its result. This
will get the .slot_reset() callback invoked, which is necessary to
restore the port's config space. Without this, the driver had been
relying on downstream drivers to return this status.

Link: https://lore.kernel.org/r/20210104230300.1277180-6-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23 17:10:42 -06:00
Keith Busch
33ac78bd3b PCI/AER: Specify the type of Port that was reset
The AER driver may be called upon to reset either a Downstream or a Root
Port. Check which type it is to properly identify it when logging that
the reset occurred.

Link: https://lore.kernel.org/r/20210104230300.1277180-5-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23 17:10:42 -06:00
Keith Busch
387c72cdd7 PCI/ERR: Retain status from error notification
Overwriting the frozen detected status with the result of the link reset
loses the NEED_RESET result that drivers are depending on for error
handling to report the .slot_reset() callback. Retain this status so
that subsequent error handling has the correct flow.

Link: https://lore.kernel.org/r/20210104230300.1277180-4-kbusch@kernel.org
Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23 17:10:42 -06:00
Keith Busch
7a8a22be35 PCI/AER: Clear AER status from Root Port when resetting Downstream Port
The pci_dev parameter given to aer_root_reset() may be a Downstream Port
rather than the Root Port. Get the Root Port from the provided device in
order to clear the root's AER status.

Link: https://lore.kernel.org/r/20210104230300.1277180-3-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23 17:10:42 -06:00
Keith Busch
7d7cbeaba5 PCI/ERR: Clear status of the reporting device
Error handling operates on the first Downstream Port above the detected
error, but the error may have been reported by a downstream device.
Clear the AER status of the device that reported the error rather than
the first Downstream Port.

Link: https://lore.kernel.org/r/20210104230300.1277180-2-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23 17:10:10 -06:00
Bjorn Helgaas
b4c7d2076b PCI/LINK: Remove bandwidth notification
The PCIe Bandwidth Change Notification feature logs messages when the link
bandwidth changes.  Some users have reported that these messages occur
often enough to significantly reduce NVMe performance.  GPUs also seem to
generate these messages.

We don't know why the link bandwidth changes, but in the reported cases
there's no indication that it's caused by hardware failures.

Remove the bandwidth change notifications for now.  Hopefully we can add
this back when we have a better understanding of why this happens and how
we can make the messages useful instead of overwhelming.

Link: https://lore.kernel.org/r/20200115221008.GA191037@google.com/
Link: https://lore.kernel.org/r/155605909349.3575.13433421148215616375.stgit@gimli.home/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206197
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-02-02 14:25:11 -06:00
Bjorn Helgaas
40fb68c772 Revert "PCI/ASPM: Save/restore L1SS Capability for suspend/resume"
This reverts commit 4257f7e008.

Kenneth reported that after 4257f7e008, he sees a torrent of disk I/O
errors on his NVMe device after suspend/resume until a reboot.

Link: https://lore.kernel.org/linux-pci/20201228040513.GA611645@bjorn-Precision-5520/
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-01-27 10:12:43 -06:00
Bjorn Helgaas
72b3a644bb Merge branch 'pci/ptm'
- Save/restore Precision Time Measurement Capability for suspend/resume
  (David E. Box)

- Disable PTM during suspend to save power (David E. Box)

* pci/ptm:
  PCI: Disable PTM during suspend to save power
  PCI/PTM: Save/restore Precision Time Measurement Capability for suspend/resume
2020-12-15 15:11:09 -06:00
Bjorn Helgaas
6a94785fb9 Merge branch 'pci/err'
- Stop writing AER Capability when we don't own it (Sean V Kelley)

- Bind RCEC devices to the Port driver (Qiuxu Zhuo)

- Cache the RCEC RA Capability offset (Sean V Kelley)

- Add pci_walk_bridge() (Sean V Kelley)

- Clear AER status only when we control AER (Sean V Kelley)

- Recover from RCEC AER errors (Sean V Kelley)

- Add pcie_link_rcec() to associate RCiEPs with RCECs (Sean V Kelley)

- Recover from RCiEP AER errors (Sean V Kelley)

- Add pcie_walk_rcec() for RCEC AER handling (Sean V Kelley)

- Add pcie_walk_rcec() for RCEC PME handling (Sean V Kelley)

- Add RCEC AER error injection support (Qiuxu Zhuo)

* pci/err:
  PCI/AER: Add RCEC AER error injection support
  PCI/PME: Add pcie_walk_rcec() to RCEC PME handling
  PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
  PCI/ERR: Recover from RCiEP AER errors
  PCI/ERR: Add pcie_link_rcec() to associate RCiEPs
  PCI/ERR: Recover from RCEC AER errors
  PCI/ERR: Clear AER status only when we control AER
  PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
  PCI/ERR: Avoid negated conditional for clarity
  PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
  PCI/ERR: Simplify by computing pci_pcie_type() once
  PCI/ERR: Simplify by using pci_upstream_bridge()
  PCI/ERR: Rename reset_link() to reset_subordinates()
  PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities()
  PCI/ERR: Bind RCEC devices to the Root Port driver
  PCI/AER: Write AER Capability only when we control it
2020-12-15 15:11:06 -06:00
David E. Box
a697f072f5 PCI: Disable PTM during suspend to save power
There are systems (for example, Intel based mobile platforms since Coffee
Lake) where the power drawn while suspended can be significantly reduced by
disabling Precision Time Measurement (PTM) on PCIe root ports as this
allows the port to enter a lower-power PM state and the SoC to reach a
lower-power idle state. To save this power, disable the PTM feature on root
ports during pci_prepare_to_sleep() and pci_finish_runtime_suspend().  The
feature will be returned to its previous state during restore and error
recovery.

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209361
Link: https://lore.kernel.org/r/20201207223951.19667-2-david.e.box@linux.intel.com
Reported-by: Len Brown <len.brown@intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-10 14:45:14 -06:00
David E. Box
39850ed510 PCI/PTM: Save/restore Precision Time Measurement Capability for suspend/resume
The PCI subsystem does not currently save and restore the configuration
space for the Precision Time Measurement (PTM) Extended Capability leading
to the possibility of the feature returning disabled on S3 resume.  This
has been observed on Intel Coffee Lake desktops. Add save/restore of the
PTM control register. This saves the PTM Enable, Root Select, and Effective
Granularity bits.

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20201207223951.19667-1-david.e.box@linux.intel.com
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-10 14:43:54 -06:00
Qiuxu Zhuo
d292dd0eb3 PCI/AER: Add RCEC AER error injection support
Root Complex Event Collectors (RCEC) appear as peers to Root Ports and may
also have the AER capability.

Add RCEC support to the AER error injection driver.

Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-16-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2020-12-05 15:26:02 -06:00
Sean V Kelley
9a2f604f44 PCI/PME: Add pcie_walk_rcec() to RCEC PME handling
Root Complex Event Collectors (RCEC) appear as peers of Root Ports and also
have the PME capability. As with AER, there is a need to be able to walk
the RCiEPs associated with their RCEC for purposes of acting upon them with
callbacks.

Add RCEC support through the use of pcie_walk_rcec() to the current PME
service driver and attach the PME service driver to the RCEC device.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-15-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-05 15:26:02 -06:00
Sean V Kelley
af113553d9 PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
Root Complex Event Collectors (RCEC) appear as peers to Root Ports and also
have the AER capability. In addition, actions need to be taken for
associated RCiEPs. In such cases the RCECs will need to be walked in order
to find and act upon their respective RCiEPs.

Extend the existing ability to link the RCECs with a walking function
pcie_walk_rcec(). Add RCEC support to the current AER service driver and
attach the AER service driver to the RCEC device.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-14-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2020-12-05 15:26:02 -06:00
Qiuxu Zhuo
5790862255 PCI/ERR: Recover from RCiEP AER errors
Add support for handling AER errors detected by Root Complex Integrated
Endpoints (RCiEPs).  These errors are signaled to software natively via a
Root Complex Event Collector (RCEC) or non-natively via ACPI APEI if the
platform retains control of AER or uses a non-standard RCEC-like device.

When recovering from RCiEP errors, the Root Error Command and Status
registers are in the AER Capability of an associated RCEC (if any), not in
a Root Port.  In the non-native case, the platform is responsible for those
registers and we can't touch them.

[bhelgaas: commit log, etc]
Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-13-sean.v.kelley@intel.com
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-05 15:26:02 -06:00
Sean V Kelley
507b460f81 PCI/ERR: Add pcie_link_rcec() to associate RCiEPs
A Root Complex Event Collector terminates error and PME messages from
associated RCiEPs.

Use the RCEC Endpoint Association Extended Capability to identify
associated RCiEPs. Link the associated RCiEPs as the RCECs are enumerated.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-12-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-05 15:26:02 -06:00
Sean V Kelley
a175102b0a PCI/ERR: Recover from RCEC AER errors
A Root Complex Event Collector (RCEC) collects and signals AER errors that
were detected by Root Complex Integrated Endpoints (RCiEPs), but it may
also signal errors it detects itself.  This is analogous to errors detected
and signaled by a Root Port.

Update the AER service driver to claim RCECs in addition to Root Ports.
Add support for handling RCEC-detected AER errors.  This does not
include handling RCiEP-detected errors that are signaled by the RCEC.

Note that we expect these errors only from the native AER and APEI paths,
not from DPC or EDR.

[bhelgaas: split from combined RCEC/RCiEP patch, commit log]
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-05 15:25:58 -06:00
Sean V Kelley
aa344bc8b7 PCI/ERR: Clear AER status only when we control AER
In some cases a bridge may not exist as the hardware controlling may be
handled only by firmware and so is not visible to the OS. This scenario is
also possible in future use cases involving non-native use of RCECs by
firmware. In this scenario, we expect the platform to retain control of the
bridge and to clear error status itself.

Clear error status only when the OS has native control of AER.

Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
05e9ae19ab PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
Consolidate subordinate bus checks with pci_walk_bus() into
pci_walk_bridge() for walking below potentially AER affected bridges.

Link: https://lore.kernel.org/r/20201121001036.8560-10-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
3d7d8fc78f PCI/ERR: Avoid negated conditional for clarity
Reverse the sense of the Root Port/Downstream Port conditional for clarity.
No functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-9-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
0791721d80 PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
pcie_do_recovery() may be called with "dev" being either a bridge (Root
Port or Switch Downstream Port) or an Endpoint.  The bulk of the function
deals with the bridge, so if we start with an Endpoint, we reset "dev" to
be the bridge leading to it.

For clarity, replace "dev" in the body of the function with "bridge".  No
functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-8-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
480ef7cb9f PCI/ERR: Simplify by computing pci_pcie_type() once
Instead of calling pci_pcie_type(dev) twice, call it once and save the
result.  No functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-7-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
5d69dcc9f8 PCI/ERR: Simplify by using pci_upstream_bridge()
Use pci_upstream_bridge() in place of dev->bus->self.  No functional change
intended.

Link: https://lore.kernel.org/r/20201121001036.8560-6-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
8f1bbfbc35 PCI/ERR: Rename reset_link() to reset_subordinates()
reset_link() appears to be misnamed.  The point is to reset any devices
below a given bridge, so rename it to reset_subordinates() to make it clear
that we are passing a bridge with the intent to reset the devices below it.

Link: https://lore.kernel.org/r/20201121001036.8560-5-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:58 -06:00
Sean V Kelley
9065563198 PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities()
Extend support for Root Complex Event Collectors by decoding and caching
the RCEC Endpoint Association Extended Capabilities when enumerating. Use
that cached information for later error source reporting. See PCIe r5.0,
sec 7.9.10.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-4-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2020-12-04 11:18:50 -06:00
Qiuxu Zhuo
c9d659b607 PCI/ERR: Bind RCEC devices to the Root Port driver
If a Root Complex Integrated Endpoint (RCiEP) is implemented, it may signal
errors through a Root Complex Event Collector (RCEC).  Each RCiEP must be
associated with no more than one RCEC.

For an RCEC (which is technically not a Bridge), error messages "received"
from associated RCiEPs must be enabled for "transmission" in order to cause
a System Error via the Root Control register or (when the Advanced Error
Reporting Capability is present) reporting via the Root Error Command
register and logging in the Root Error Status register and Error Source
Identification register.

Given the commonality with Root Ports and the need to also support AER and
PME services for RCECs, extend the Root Port driver to support RCEC devices
by adding the RCEC Class ID to the driver structure.

Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-3-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2020-12-04 11:18:42 -06:00
Sean V Kelley
50cc18fcd3 PCI/AER: Write AER Capability only when we control it
If an OS has not been granted AER control via _OSC, it should not make
changes to PCI_ERR_ROOT_COMMAND and PCI_ERR_ROOT_STATUS related registers.
Per section 4.5.1 of the System Firmware Intermediary (SFI) _OSC and DPC
Updates ECN [1], this bit also covers these aspects of the PCI Express
Advanced Error Reporting. Based on the above and earlier discussion [2],
make the following changes:

Add a check for the native case (i.e., AER control via _OSC)

Note that the previous "clear, reset, enable" order suggests that the reset
might cause errors that we should ignore. After this commit, those errors
(if any) will remain logged in the PCI_ERR_ROOT_STATUS register.

[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
    2020, affecting PCI Firmware Specification, Rev. 3.2
    https://members.pcisig.com/wg/PCI-SIG/document/14076
[2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/

Link: https://lore.kernel.org/r/20201121001036.8560-2-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-12-01 18:08:05 -06:00
Vidya Sagar
4257f7e008 PCI/ASPM: Save/restore L1SS Capability for suspend/resume
Previously ASPM L1 Substates control registers (CTL1 and CTL2) weren't
saved and restored during suspend/resume leading to L1 Substates
configuration being lost post-resume.

Save the L1 Substates control registers so that the configuration is
retained post-resume.

Link: https://lore.kernel.org/r/20201024190442.871-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-11-20 11:17:44 -06:00
Bjorn Helgaas
8b28a3f346 Merge branch 'pci/misc'
- Remove unnecessary #includes (Gustavo Pimentel)

- Fix intel_mid_pci.c build error when !CONFIG_ACPI (Randy Dunlap)

- Use scnprintf(), not snprintf(), in sysfs "show" functions (Krzysztof
  Wilczyński)

- Simplify pci-pf-stub by using module_pci_driver() (Liu Shixin)

- Print IRQ used by Link Bandwidth Notification (Dongdong Liu)

- Update sysfs mmap-related #ifdef comments (Clint Sbisa)

- Simplify pci_dev_reset_slot_function() (Lukas Wunner)

- Use "NULL" instead of "0" to fix sparse warnings (Gustavo Pimentel)

- Simplify bool comparisons (Krzysztof Wilczyński)

- Drop double zeroing for P2PDMA sg_init_table() (Julia Lawall)

* pci/misc:
  PCI: v3-semi: Remove unneeded break
  PCI/P2PDMA: Drop double zeroing for sg_init_table()
  PCI: Simplify bool comparisons
  PCI: endpoint: Use "NULL" instead of "0" as a NULL pointer
  PCI: Simplify pci_dev_reset_slot_function()
  PCI: Update mmap-related #ifdef comments
  PCI/LINK: Print IRQ number used by port
  PCI/IOV: Simplify pci-pf-stub with module_pci_driver()
  PCI: Use scnprintf(), not snprintf(), in sysfs "show" functions
  x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled
  PCI: Remove unnecessary header includes
2020-10-21 09:58:36 -05:00
Bjorn Helgaas
5cfdc750bc Merge branch 'pci/hotplug'
- Use for_each_child_of_node() and for_each_node_by_name() instead of
  open-coding them (Qinglang Miao)

- Reduce pciehp noisiness on hot removal (Lukas Wunner)

- Remove unused assignment in shpchp (Krzysztof Wilczyński)

* pci/hotplug:
  PCI: shpchp: Remove unused 'rc' assignment
  PCI: pciehp: Reduce noisiness on hot removal
  PCI: rpadlpar: Use for_each_child_of_node() and for_each_node_by_name()
2020-10-21 09:58:35 -05:00
Saheed O. Bolarinwa
df8f10587d PCI/ASPM: Remove struct pcie_link_state.l1ss
Previously we computed L1.2 parameters in the enumeration path, saved them
in struct pcie_link_state.l1ss, and programmed them into the devices
whenever we enabled or disabled L1.2 on the link.  But these parameters are
constant and don't need to be updated when enabling/disabling L1.2.

Compute and program the L1.2 parameters once during enumeration and remove
the struct pcie_link_state.l1ss member.  No functional change intended.

[bhelgaas: rework to program L1.2 parameters during enumeration]
Link: https://lore.kernel.org/r/20201015193039.12585-13-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:19 -05:00
Saheed O. Bolarinwa
187f91db82 PCI/ASPM: Remove struct aspm_register_info.l1ss_cap
Previously we stored the L1SS Capabilities value in the struct
aspm_register_info.

We only need this information in one place, so read it there and remove
struct aspm_register_info completely, since it's now empty.  No functional
change intended.

[bhelgaas: split up, don't cache l1ss_cap in pci_dev]
Link: https://lore.kernel.org/r/20201015193039.12585-12-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:15 -05:00
Bjorn Helgaas
1e8955fd83 PCI/ASPM: Pass L1SS Capabilities value, not struct aspm_register_info
aspm_calc_l1ss_info() needs only the L1SS Capabilities.  It doesn't need
anything else from struct aspm_register_info, so pass only the Capabilities
value.  No functional change intended.

Link: https://lore.kernel.org/r/20201015193039.12585-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:12 -05:00
Saheed O. Bolarinwa
28a1488e55 PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl1
Previously we stored the L1SS Control 1 register in the struct
aspm_register_info.

We only need this information in one place, so read it there and remove it
from struct aspm_register_info.  No functional change intended.

[bhelgaas: split ctl1/ctl2]
Link: https://lore.kernel.org/r/20201015193039.12585-10-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:09 -05:00
Bjorn Helgaas
81c2b807c8 PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl2 (unused)
We never use the aspm_register_info.l1ss_ctl2 value, so remove it.  No
functional change intended.

Link: https://lore.kernel.org/r/20201015193039.12585-9-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:07 -05:00
Saheed O. Bolarinwa
ecdf57b4f6 PCI/ASPM: Remove struct aspm_register_info.l1ss_cap_ptr
Save the L1 Substates Capability pointer in struct pci_dev.  Then we don't
have to keep track of it in the struct aspm_register_info and struct
pcie_link_state, which makes the code easier to read.  No functional change
intended.

[bhelgaas: split to a separate patch]
Link: https://lore.kernel.org/r/20201015193039.12585-8-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:04 -05:00
Saheed O. Bolarinwa
5f7875d651 PCI/ASPM: Remove struct aspm_register_info.latency_encoding
Previously we stored L0s and L1 Exit Latency information from the Link
Capabilities register in the struct aspm_register_info.

We only need these latencies when we already have the Link Capabilities
values, so use those directly and remove the latencies from struct
aspm_register_info.  No functional change intended.

Link: https://lore.kernel.org/r/20201015193039.12585-7-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:21:02 -05:00
Saheed O. Bolarinwa
67bcc9ad68 PCI/ASPM: Remove struct aspm_register_info.enabled
Previously we stored the "ASPM Control" bits from the Link Control register
in the struct aspm_register_info.

Read PCI_EXP_LNKCTL directly when needed.  This means we can use the
PCI_EXP_LNKCTL_ASPM_* bits directly instead of the similar but different
PCIE_LINK_STATE_* bits.  No functional change intended.

[bhelgaas: drop get_aspm_enable() and read LNKCTL once directly]
Link: https://lore.kernel.org/r/20201015193039.12585-6-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:20:59 -05:00
Saheed O. Bolarinwa
c6e5f02b52 PCI/ASPM: Remove struct aspm_register_info.support
Previously we stored the "ASPM Support" field from the Link Capabilities
register in the struct aspm_register_info.

Read the Link Capabilities directly when needed and remove it from the
struct aspm_register_info.  No functional change intended.

[bhelgaas: remove pci_dev cached copy since LNKCAP isn't truly read-only,
add PCI_EXP_LNKCAP_ASPM_L0S & PCI_EXP_LNKCAP_ASPM_L1, check them directly
instead of adding aspm_support()]
Link: https://lore.kernel.org/r/20201015193039.12585-5-helgaas@kernel.org
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:20:53 -05:00
Bjorn Helgaas
190cd42cc1 PCI/ASPM: Use 'parent' and 'child' for readability
Other users of link->pdev and link->downstream, e.g., pcie_aspm_cap_init(),
pcie_config_aspm_l1ss(), and pcie_config_aspm_link(), use "parent" and
"child" as local names.

Do the same in aspm_calc_l1ss_info() for readability.  No functional change
intended.

Link: https://lore.kernel.org/r/20201015193039.12585-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:20:51 -05:00
Bjorn Helgaas
08e869ee16 PCI/ASPM: Move LTR path check to where it's used
pcie_get_aspm_reg() mostly reads ASPM-related registers, but in some cases
it also updates the value read from PCI_L1SS_CAP based on LTR properties.

Move this update to the point where the value is used to make the code more
readable.

No functional change intended, although previously we could clear
PCI_L1SS_CAP_ASPM_L1_2 for both ends of the link, and now we'll only do it
for the downstream end of a link.  This shouldn't matter because we always
test that bit by ANDing l1ss_cap for the upstream and downstream ends.

Link: https://lore.kernel.org/r/20201015193039.12585-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:20:48 -05:00
Bjorn Helgaas
0f1619cf82 PCI/ASPM: Move pci_clear_and_set_dword() earlier
Move pci_clear_and_set_dword() earlier in file to prepare for future patch.
No functional change intended.

Link: https://lore.kernel.org/r/20201015193039.12585-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-10-16 11:20:45 -05:00
Lukas Wunner
8a61449941 PCI: pciehp: Reduce noisiness on hot removal
When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously.  I've seen delays
of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the
other bit may still be set.  It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear.  So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity.  If the slot is no longer
occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
  KERN_ERR  pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
  KERN_ERR  pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
  KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
  KERN_ERR  pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs.  In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change.  In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f9914
("PCI: pciehp: Tolerate Presence Detect hardwired to zero").  Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1.  On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat).  To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3d ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants.  Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all.  Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in
  pcie_wait_for_presence().  The sole caller of that function,
  pciehp_check_link_status(), ignores the timeout and carries on.  It emits
  error messages of its own and I don't think this particular message adds
  much value.

* There's a single error condition in pciehp_check_link_status() which
  does not emit a message.  Adding one allows dropping the "Failed to check
  link status" message emitted by board_added() if
  pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO
  severity and rephrase them to look as innocuous as possible.  To this
  end, move the message emitted by pcie_wait_for_link_delay() to its
  callers.

As a result, Stuart Hayes' splat becomes:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits
were set a little longer than expected.  Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/
Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2020-09-17 16:22:36 -05:00
Dongdong Liu
7b6f224088 PCI/LINK: Print IRQ number used by port
Print the IRQ used by PCIe Link Bandwidth Notification services port as
AER, PME and DPC do.  It provides convenience to track PCIe BW notification
interrupt counts of certain port from /proc/interrupts.

The dmesg log is as below:

  pcieport 0000:00:00.0: bw_notification: enabled with IRQ 1166

Link: https://lore.kernel.org/r/1599737055-73624-1-git-send-email-liudongdong3@huawei.com
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-09-17 12:40:25 -05:00
Linus Torvalds
049eb096da pci-v5.9-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl8sdUkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwH2Q/7Brcm1uyLORSzseGsaXSGMncBs2YB
 aKbfhyy4BPsDIZRLnzcfRZzgKo3f4jlLH9dJ6nBukbNXCvS/g7oYCXtNKVuB70MD
 IgBH3OJxLmqsYgDkoQmj1fZBCBhdqMgGbRmeIPLqiIBrWOJkBpGHXKpb0XtyXAas
 CpD0Tvr0JBeHMluZq6Uay09jBDKexeCFrT5HCoVaRMXT/C/iB5K1oMrUczzITsdi
 jB9xesDjh32rYtaePKfuL8itbRT7jtqOwQlk7sCtnMNamaOOaYO/s6hL5v/4GxMh
 rtWa1knOxxA1nOsnEkUEHi0Fj/+9zXDIdb7v6thRDo0ZgWQxl7l3nshvmPcxX421
 tpCm3HqmvHzGqSI85Rtr3p4XKm9e+IjgE2EA/J6Y8Q6Grrb0EGJituhO4meL2Ciq
 6mxdhu7InxDJ2p3TLGas3fB/1hrCO0Fc0pQoBJx7YgqA1ANyld9DYCkDN6IDoZBI
 uUjKgkE1dfbW/pGjotjhBsmz3dycZHkurIFdt1iX/Xtt5KKdPAzu9yM2U03iIS2R
 im1wZ/THiS/YCOlgL/J8+DHTY0ZvXjAdbiSPjTFfwb9XTh8aHVWtFaaZON1jRIjg
 xMpIY0SxfshpLx631ThZdDTDiOwE8D3B+1n/kMwps6HOLpxOoJZeSGTRCt9wGP40
 j58DTtLm5FKpdYc=
 =moI9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Fix pci_cfg_wait queue locking problem (Bjorn Helgaas)
   - Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
     Saheed)
   - Align PCIe capability and PCI accessor return values (Bolarinwa
     Olayemi Saheed)
   - Fix pci_create_slot() reference count leak (Qiushi Wu)
   - Announce device after early fixups (Tiezhu Yang)

  PCI device hotplug:
   - Make rpadlpar functions static (Wei Yongjun)

  Driver binding:
   - Add device even if driver attach failed (Rajat Jain)

  Virtualization:
   - xen: Remove redundant initialization of irq (Colin Ian King)

  IOMMU:
   - Add pci_pri_supported() to check device or associated PF (Ashok Raj)
   - Release IVRS table in AMD ACS quirk (Hanjun Guo)
   - Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng)
   - Treat "external-facing" devices themselves as internal (Rajat Jain)

  MSI:
   - Forward MSI-X error code in pci_alloc_irq_vectors_affinity() (Piotr
     Stankiewicz)

  Error handling:
   - Clear PCIe Device Status errors only if OS owns AER (Jonathan
     Cameron)
   - Log correctable errors as warning, not error (Matt Jolly)
   - Use 'pci_channel_state_t' instead of 'enum pci_channel_state' (Luc
     Van Oostenryck)

  Peer-to-peer DMA:
   - Allow P2PDMA on AMD Zen and newer CPUs (Logan Gunthorpe)

  ASPM:
   - Add missing newline in sysfs 'policy' (Xiongfeng Wang)

  Native PCIe controllers:
   - Convert to devm_platform_ioremap_resource_byname() (Dejin Zheng)
   - Convert to devm_platform_ioremap_resource() (Dejin Zheng)
   - Remove duplicate error message from devm_pci_remap_cfg_resource()
     callers (Dejin Zheng)
   - Fix runtime PM imbalance on error (Dinghao Liu)
   - Remove dev_err() when handing an error from platform_get_irq()
     (Krzysztof Wilczyński)
   - Use pci_host_bridge.windows list directly instead of splicing in a
     temporary list for cadence, mvebu, host-common (Rob Herring)
   - Use pci_host_probe() instead of open-coding all the pieces for
     altera, brcmstb, iproc, mobiveil, rcar, rockchip, tegra, v3,
     versatile, xgene, xilinx, xilinx-nwl (Rob Herring)
   - Default host bridge parent device to the platform device (Rob
     Herring)
   - Use pci_is_root_bus() instead of tracking root bus number
     separately in aardvark, designware (imx6, keystone,
     designware-host), mobiveil, xilinx-nwl, xilinx, rockchip, rcar (Rob
     Herring)
   - Set host bridge bus number in pci_scan_root_bus_bridge() instead of
     each driver for aardvark, designware-host, host-common, mediatek,
     rcar, tegra, v3-semi (Rob Herring)
   - Move DT resource setup into devm_pci_alloc_host_bridge() (Rob
     Herring)
   - Set bridge map_irq and swizzle_irq to default functions; drivers
     that don't support legacy IRQs (iproc) need to undo this (Rob
     Herring)

  ARM Versatile PCIe controller driver:
   - Drop flag PCI_ENABLE_PROC_DOMAINS (Rob Herring)

  Cadence PCIe controller driver:
   - Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property
     (Kishon Vijay Abraham I)
   - Remove "mem" from reg binding (Kishon Vijay Abraham I)
   - Fix cdns_pcie_{host|ep}_setup() error path (Kishon Vijay Abraham I)
   - Convert all r/w accessors to perform only 32-bit accesses (Kishon
     Vijay Abraham I)
   - Add support to start link and verify link status (Kishon Vijay
     Abraham I)
   - Allow pci_host_bridge to have custom pci_ops (Kishon Vijay Abraham I)
   - Add new *ops* for CPU addr fixup (Kishon Vijay Abraham I)
   - Fix updating Vendor ID and Subsystem Vendor ID register (Kishon
     Vijay Abraham I)
   - Use bridge resources for outbound window setup (Rob Herring)
   - Remove private bus number and range storage (Rob Herring)

  Cadence PCIe endpoint driver:
   - Add MSI-X support (Alan Douglas)

  HiSilicon PCIe controller driver:
   - Remove non-ECAM HiSilicon hip05/hip06 driver (Rob Herring)

  Intel VMD host bridge driver:
   - Use Shadow MEMBAR registers for QEMU/KVM guests (Jon Derrick)

  Loongson PCIe controller driver:
   - Use DECLARE_PCI_FIXUP_EARLY for bridge_class_quirk() (Tiezhu Yang)

  Marvell Aardvark PCIe controller driver:
   - Indicate error in 'val' when config read fails (Pali Rohár)
   - Don't touch PCIe registers if no card connected (Pali Rohár)

  Marvell MVEBU PCIe controller driver:
   - Setup BAR0 in order to fix MSI (Shmuel Hazan)

  Microsoft Hyper-V host bridge driver:
   - Fix a timing issue which causes kdump to fail occasionally (Wei Hu)
   - Make some functions static (Wei Yongjun)

  NVIDIA Tegra PCIe controller driver:
   - Revert tegra124 raw_violation_fixup (Nicolas Chauvet)
   - Remove PLL power supplies (Thierry Reding)

  Qualcomm PCIe controller driver:
   - Change duplicate PCI reset to phy reset (Abhishek Sahu)
   - Add missing ipq806x clocks in PCIe driver (Ansuel Smith)
   - Add missing reset for ipq806x (Ansuel Smith)
   - Add ext reset (Ansuel Smith)
   - Use bulk clk API and assert on error (Ansuel Smith)
   - Add support for tx term offset for rev 2.1.0 (Ansuel Smith)
   - Define some PARF params needed for ipq8064 SoC (Ansuel Smith)
   - Add ipq8064 rev2 variant (Ansuel Smith)
   - Support PCI speed set for ipq806x (Sham Muthayyan)

  Renesas R-Car PCIe controller driver:
   - Use devm_pci_alloc_host_bridge() (Rob Herring)
   - Use struct pci_host_bridge.windows list directly (Rob Herring)
   - Convert rcar-gen2 to use modern host bridge probe functions (Rob
     Herring)

  TI J721E PCIe driver:
   - Add TI J721E PCIe host and endpoint driver (Kishon Vijay Abraham I)

  Xilinx Versal CPM PCIe controller driver:
   - Add Versal CPM Root Port driver and YAML schema (Bharat Kumar
     Gogada)

  MicroSemi Switchtec management driver:
   - Add missing __iomem and __user tags to fix sparse warnings (Logan
     Gunthorpe)

  Miscellaneous:
   - Replace http:// links with https:// (Alexander A. Klimov)
   - Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn
     Helgaas)
   - Remove unused pci_lost_interrupt() (Heiner Kallweit)
   - Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)
   - Fix kerneldoc warnings (Krzysztof Kozlowski)"

* tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
  PCI: Fix kerneldoc warnings
  PCI: xilinx-cpm: Add Versal CPM Root Port driver
  PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port
  PCI: Set bridge map_irq and swizzle_irq to default functions
  PCI: Move DT resource setup into devm_pci_alloc_host_bridge()
  PCI: rcar-gen2: Convert to use modern host bridge probe functions
  PCI: Remove dev_err() when handing an error from platform_get_irq()
  MAINTAINERS: Add Kishon Vijay Abraham I for TI J721E SoC PCIe
  misc: pci_endpoint_test: Add J721E in pci_device_id table
  PCI: j721e: Add TI J721E PCIe driver
  PCI: switchtec: Add missing __iomem tag to fix sparse warnings
  PCI: switchtec: Add missing __iomem and __user tags to fix sparse warnings
  PCI: rpadlpar: Make functions static
  PCI/P2PDMA: Allow P2PDMA on AMD Zen and newer CPUs
  PCI: Release IVRS table in AMD ACS quirk
  PCI: Announce device after early fixups
  PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
  PCI: Remove unused pci_lost_interrupt()
  dt-bindings: PCI: Add EP mode dt-bindings for TI's J721E SoC
  dt-bindings: PCI: Add host mode dt-bindings for TI's J721E SoC
  ...
2020-08-07 18:48:15 -07:00
Bjorn Helgaas
0caa17f5f2 Merge branch 'pci/misc'
- Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
  Saheed)

- Align PCIe capability and PCI accessor return values (Bolarinwa Olayemi
  Saheed)

- Replace http:// links with https:// (Alexander A. Klimov)

- Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn Helgaas)

- Update panic message to mention kzalloc(), not kmalloc() (Liao Pingfang)

- Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)

- Remove unused pci_lost_interrupt() (Heiner Kallweit)

* pci/misc:
  PCI: Remove unused pci_lost_interrupt()
  PCI: Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h
  PCI: Fix error in panic message
  PCI: Replace lkml.org, spinics, gmane with lore.kernel.org
  PCI: Replace http:// links with https://
  PCI: Align PCIe capability and PCI accessor return values
  PCI: Convert PCIe capability PCIBIOS errors to errno
2020-08-05 18:24:16 -05:00
Bjorn Helgaas
b0735e8d2c Merge branch 'pci/error'
- Use pci_channel_state_t instead of enum pci_channel_state (Luc Van
  Oostenryck)

- Simplify __aer_print_error() (Bjorn Helgaas)

- Log AER correctable errors as warning, not error (Matt Jolly)

- Rename pci_aer_clear_device_status() to pcie_clear_device_status() (Bjorn
  Helgaas)

- Clear PCIe Device Status errors only if OS owns AER (Jonathan Cameron)

* pci/error:
  PCI/ERR: Clear PCIe Device Status errors only if OS owns AER
  PCI/ERR: Rename pci_aer_clear_device_status() to pcie_clear_device_status()
  PCI/AER: Log correctable errors as warning, not error
  PCI/AER: Simplify __aer_print_error()
  PCI: Use 'pci_channel_state_t' instead of 'enum pci_channel_state'
2020-08-05 18:24:15 -05:00
Jonathan Cameron
068c29a248 PCI/ERR: Clear PCIe Device Status errors only if OS owns AER
pcie_clear_device_status() resets the error bits in the PCIe Device Status
Register (PCI_EXP_DEVSTA).

Previously we did this unconditionally, but on ACPI systems, the _OSC AER
bit negotiates control of the AER capability.  Per sec 4.5.1 of the System
Firmware Intermediary _OSC and DPC Updates ECN [1], this bit also covers
other error enable/status bits including the following:

  Correctable Error Reporting Enable
  Non-Fatal Error Reporting Enable
  Fatal Error Reporting Enable
  Unsupported Request Reporting Enable

These bits are all in the PCIe Device Control register (the ECN omitted
"Reporting", but I think that's a typo), so by implication the _OSC AER bit
also applies to the error status bits in the PCIe Device Status register:

  Correctable Error Detected
  Non-Fatal Error Detected
  Fatal Error Detected
  Unsupported Request Detected

Clear the PCIe Device Status error bits only when the OS controls the AER
capability and related error enable/status bits.  If platform firmware
controls the AER capability, firmware is responsible for clearing these
bits.

One call path leading here is:

  ghes_do_proc
    ghes_handle_aer
      aer_recover_queue
        schedule_work(&aer_recover_work)
  ...
  aer_recover_work_func
    pcie_do_recovery
      pcie_clear_device_status

[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
    2020, affecting PCI Firmware Specification, Rev. 3.2
    https://members.pcisig.com/wg/PCI-SIG/document/14076
[bhelgaas: commit log, move test from pcie_clear_device_status() to callers]
Link: https://lore.kernel.org/r/20200622113523.891666-1-Jonathan.Cameron@huawei.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-07-22 15:41:03 -05:00