Commit Graph

1900 Commits

Author SHA1 Message Date
Yazen Ghannam 6e241bc93c EDAC/amd64: Remove scrub rate control for Family 17h and later
The scrub registers on AMD Family 17h and later may be inaccessible to
the OS. Furthermore, hardware designers recommend that the scrubbing
feature is managed by the firmware.

Remove support for the sdram_scrub_rate interface for AMD Family 17h
systems and later by not setting the scrub function pointers. The EDAC MC
core will then not expose the scrub files in sysfs.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230127170419.1824692-3-yazen.ghannam@amd.com
2023-02-09 11:25:21 +01:00
Yazen Ghannam fdce765a13 EDAC/amd64: Don't set up EDAC PCI control on Family 17h+
EDAC PCI control is used to detect/report legacy PCI errors like
"Parity" and "SERROR". Modern AMD systems use PCIe Advanced Error
Reporting (AER), and legacy PCI errors should not be reported.

Remove EDAC PCI control setup on AMD Family 17h and later systems.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230127170419.1824692-2-yazen.ghannam@amd.com
2023-02-09 11:14:53 +01:00
Youquan Song 221aa03fb4 EDAC/i10nm: Add driver decoder for Sapphire Rapids server
Intel SDM (December 2022) vol3B 17.13.2 contains IMC MC error codes
for Sapphire Rapids. Current i10nm_edac only supports firmware decoder
(ACPI DSM methods) for Sapphire Rapids. So add the driver decoder
(decoding DDR memory errors via extracting error information from the
IMC MC error codes) for Sapphire Rapids for better decoding performance.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2023-02-08 08:29:11 -08:00
Qiuxu Zhuo ba987eaaab EDAC/i10nm: Add Intel Granite Rapids server support
The Granite Rapids CPU model uses similar memory controller registers
as Sapphire Rapids server but with some different configurations:

- Various memory controller numbers for different Granite Rapids CPUs.
  So detect the number of present memory controllers at run time.

- Different MMIO offsets of memory controllers.

- Different triples of bus/dev/fun of some PCI devices used in i10nm_edac.

Add above configurations and Granite Rapids CPU model ID for EDAC support.

[Tony: Fixed 2 typos s/strcture/structure/]

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
2023-01-25 08:17:30 -08:00
Qiuxu Zhuo dd7814b785 EDAC/i10nm: Make more configurations CPU model specific
The numbers of memory controllers per socket, channels per memory
controller, DIMMs per channel and the triples of bus/device/function
of PCI devices used in i10nm_edac can be CPU model specific.
Add new fields to the structure res_config for above numbers and
triples to make them CPU model specific.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
2023-01-25 08:17:20 -08:00
Qiuxu Zhuo e4b2bc6616 EDAC/i10nm: Add Intel Emerald Rapids server support
The Emerald Rapids CPU model uses similar memory controller registers
as Sapphire Rapids server. Add Emerald Rapids CPU model number ID for
EDAC support.

Tested-by: Li Zhang <li4.zhang@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
2023-01-25 08:17:08 -08:00
Qiuxu Zhuo d2415e2e53 EDAC/skx_common: Delete duplicated and unreachable code
skx_mce_check_error() returns early if the error isn't from memory.
So when skx_mce_output_error() is invoked from skx_mce_check_error(),
it doesn't need to re-check whether the error is from memory. Delete
the duplicated and unreachable code from skx_mce_output_error().

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
2023-01-25 08:16:53 -08:00
Qiuxu Zhuo 6e8746cb73 EDAC/skx_common: Enable EDAC support for the "near" memory
The current {skx,i10nm}_edac miss the EDAC support to decode errors from
the 1st level memory (the fast "near" memory as cache) of the 2-level
memory system. Introduce a helper function skx_error_in_mem() to check
whether errors are from memory at the beginning of skx_mce_check_error().

As long as the errors are from memory (either the 1-level memory system
or the 2-level memory system), decode the errors.

Reported-and-tested-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
2023-01-25 08:16:44 -08:00
Manivannan Sadhasivam 977c6ba624 EDAC/qcom: Do not pass llcc_driv_data as edac_device_ctl_info's pvt_info
The memory for llcc_driv_data is allocated by the LLCC driver. But when
it is passed as the private driver info to the EDAC core, it will get freed
during the qcom_edac driver release. So when the qcom_edac driver gets probed
again, it will try to use the freed data leading to the use-after-free bug.

Hence, do not pass llcc_driv_data as pvt_info but rather reference it
using the platform_data pointer in the qcom_edac driver.

Fixes: 27450653f1 ("drivers: edac: Add EDAC driver support for QCOM SoCs")
Reported-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Steev Klimaszewski <steev@kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Cc: <stable@vger.kernel.org> # 4.20
Link: https://lore.kernel.org/r/20230118150904.26913-4-manivannan.sadhasivam@linaro.org
2023-01-20 19:47:34 +01:00
Manivannan Sadhasivam 8d8fcc391f EDAC/qcom: Add platform_device_id table for module autoloading
Add a device ID table so that the driver loads automatically when the
associated platform_device gets registered.

Reported-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Steev Klimaszewski <steev@kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Link: https://lore.kernel.org/r/20230118150904.26913-3-manivannan.sadhasivam@linaro.org
2023-01-20 16:59:53 +01:00
Manivannan Sadhasivam cec669ff71 EDAC/device: Respect any driver-supplied workqueue polling value
The EDAC drivers may optionally pass the poll_msec value. Use that value
if available, else fall back to 1000ms.

  [ bp: Touchups. ]

Fixes: e27e3dac65 ("drivers/edac: add edac_device class")
Reported-by: Luca Weiss <luca.weiss@fairphone.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Steev Klimaszewski <steev@kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Cc: <stable@vger.kernel.org> # 4.9
Link: https://lore.kernel.org/r/COZYL8MWN97H.MROQ391BGA09@otso
2023-01-19 11:43:16 +01:00
Tony Luck 8a01ec97dc x86/mce: Mask out non-address bits from machine check bank
Systems that support various memory encryption schemes (MKTME, TDX, SEV)
use high order physical address bits to indicate which key should be
used for a specific memory location.

When a memory error is reported, some systems may report those key
bits in the IA32_MCi_ADDR machine check MSR.

The Intel SDM has a footnote for the contents of the address register
that says: "Useful bits in this field depend on the address methodology
in use when the register state is saved."

AMD Processor Programming Reference has a more explicit description
of the MCA_ADDR register:

 "For physical addresses, the most significant bit is given by
  Core::X86::Cpuid::LongModeInfo[PhysAddrSize]."

Add a new #define MCI_ADDR_PHYSADDR for the mask of valid physical
address bits within the machine check bank address register. Use this
mask for recoverable machine check handling and in the EDAC driver to
ignore any key bits that may be present.

  [ Tony: Based on independent fixes proposed by Fan Du and Isaku Yamahata ]

Reported-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reported-by: Fan Du <fan.du@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20230109152936.397862-1-tony.luck@intel.com
2023-01-10 11:47:07 +01:00
Sai Krishna Potthuri 3bd2706c91 EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM
Add EDAC support for Xilinx ZynqMP OCM Controller, so this driver reports CE and
UE errors upon interrupt generation. Also add debugfs files for error injection.

On Xilinx ZynqMP platform, both OCM Controller driver(zynqmp_edac) and DDR
Memory Controller driver(synopsys_edac) co-exist which means both can be loaded
at a time. This scenario is tested on Xilinx ZynqMP platform.

Fix following issue reported by the robot:
  "MAINTAINERS references a file that doesn't exist:
  Documentation/devicetree/bindings/edac/xlnx,zynqmp-ocmc.yaml"

  [ bp:
    - Massage commit message
    - s/EDAC_ZYNQMP_OCM/EDAC_ZYNQMP/
    - Touchups
      ]

Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230104084512.1855243-3-sai.krishna.potthuri@amd.com
2023-01-09 11:13:58 +01:00
Miaoqian Lin e7a293658c EDAC/highbank: Fix memory leak in highbank_mc_probe()
When devres_open_group() fails, it returns -ENOMEM without freeing memory
allocated by edac_mc_alloc().

Call edac_mc_free() on the error handling path to avoid a memory leak.

  [ bp: Massage commit message. ]

Fixes: a1b01edb27 ("edac: add support for Calxeda highbank memory controller")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20221229054825.1361993-1-linmq006@gmail.com
2023-01-03 17:03:57 +01:00
Eliav Farber e840774379 EDAC/device: Fix period calculation in edac_device_reset_delay_period()
Fix period calculation in case user sets a value of 1000.  The input of
round_jiffies_relative() should be in jiffies and not in milli-seconds.

  [ bp: Use the same code pattern as in edac_device_workq_setup() for
    clarity. ]

Fixes: c4cf3b454e ("EDAC: Rework workqueue handling")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20221020124458.22153-1-farbere@amazon.com
2022-12-30 15:51:41 +01:00
Borislav Petkov (AMD) 3919430fe9 Merge branches 'edac-ghes' and 'edac-misc' into edac-updates-for-v6.2
Combine all queued EDAC changes for submission into v6.2:

* ras/edac-ghes:
  EDAC/igen6: Return the correct error type when not the MC owner
  apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
  EDAC: Check for GHES preference in the chipset-specific EDAC drivers
  EDAC/ghes: Make ghes_edac a proper module
  EDAC/ghes: Prepare to make ghes_edac a proper module
  EDAC/ghes: Add a notifier for reporting memory errors
  efi/cper: Export several helpers for ghes_edac to use

* ras/edac-misc:
  EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
  EDAC/i5400: Fix typo in comment: vaious -> various
  EDAC/mc_sysfs: Increase legacy channel support to 12
  MAINTAINERS: Make Mauro EDAC reviewer
  MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac
  EDAC/i5000: Mark as BROKEN

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2022-12-12 15:40:03 +01:00
Yang Yingliang 9c89215559 EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
As the comment of pci_get_domain_bus_and_slot() says, it returns
a PCI device with refcount incremented, so it doesn't need to
call an extra pci_dev_get() in pci_get_dev_wrapper(), and the PCI
device needs to be put in the error path.

Fixes: d4dc89d069 ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20221128065512.3572550-1-yangyingliang@huawei.com
2022-11-28 09:42:41 -08:00
Chen Zhang b586a59e14 EDAC/i5400: Fix typo in comment: vaious -> various
Fix spelling typo in comment: vaious -> various.

  [ bp: Massage. ]

Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Chen Zhang <chenzhang@kylinos.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221102081248.45694-1-chenzhang@kylinos.cn
2022-11-25 19:29:02 +01:00
Yazen Ghannam 25836ce1df EDAC/mc_sysfs: Increase legacy channel support to 12
Newer AMD systems, such as Genoa, can support up to 12 channels per EDAC
"mc" device. These are detected by the device's EDAC module, and the
current EDAC interface is properly enumerated. However, the legacy EDAC
sysfs interface provides device attributes only for channels 0 to 7.
Therefore, channels 8 to 11 will not be visible in the legacy interface.
This was overlooked in the initial support for AMD Genoa.

Add additional device attributes so that up to 12 channels are visible
in the legacy EDAC sysfs interface.

Fixes: e2be5955a8 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221018153630.14664-1-yazen.ghannam@amd.com
2022-10-31 11:03:34 +01:00
Jia He f5e32344d4 EDAC/igen6: Return the correct error type when not the MC owner
Return -EBUSY instead of -ENODEV just like the other EDAC drivers do.

  [ bp: Rewrite text. ]

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221018082214.569504-8-justin.he@arm.com
2022-10-25 10:10:54 +02:00
Jia He 315bada690 EDAC: Check for GHES preference in the chipset-specific EDAC drivers
Call ghes_get_devices() to check whether ghes_edac should be used on the
platform where it is preferred over the corresponding chipset-specific
EDAC driver.

Unlike the existing edac_get_owner() check, the ghes_get_devices() check
works independent to the module_init ordering.

  [ bp: Massage. ]

Suggested-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-6-justin.he@arm.com
2022-10-21 22:09:54 +02:00
Jia He 802e7f1dfe EDAC/ghes: Make ghes_edac a proper module
Commit

  dc4e8c07e9 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")

introduced a bug leading to ghes_edac_register() to be invoked before
edac_init(). Because at that time the bus "edac" hadn't been even
registered, this created sysfs nodes as /devices/mc0 instead of
/sys/devices/system/edac/mc/mc0 on an Ampere eMag server.

Fix this by turning ghes_edac into a proper module.

The list of GHES devices returned is not protected from being modified
concurrently but it is pretty static as it gets created only during GHES
init and latter is not a module so...

  [ bp: Massage. ]

Fixes: dc4e8c07e9 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")
Co-developed-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-5-justin.he@arm.com
2022-10-21 21:59:19 +02:00
Jia He 9057a3f7ac EDAC/ghes: Prepare to make ghes_edac a proper module
To make ghes_edac a proper module, prepare to decouple its dependencies
from GHES.

Move the ghes_edac.force_load parameter to ghes.c in order to
properly control whether ghes_edac should be force-loaded: In
ghes_edac_register() it is too late to set the module flag.

Introduce a helper ghes_get_devices(), which returns the list of GHES
devices which got probed when the platform-check passes on the system.

The previous force_load check is not needed in ghes_edac_unregister()
since it will be checked in the module's init function of ghes_edac
later.

  [ bp: Massage. ]

Suggested-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-4-justin.he@arm.com
2022-10-21 19:32:38 +02:00
Jia He 8e40612f61 EDAC/ghes: Add a notifier for reporting memory errors
In order to make it a proper module and disentangle it from facilities,
add a notifier for reporting memory errors. Use an atomic notifier
because calls sites like ghes_proc_in_irq() run in interrupt context.

  [ bp: Massage commit message. ]

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-3-justin.he@arm.com
2022-10-20 13:25:53 +02:00
Aristeu Rozanski 7556419180 EDAC/i5000: Mark as BROKEN
i5000_edac supports very old hardware which isn't available and it's
been broken for single/dual channel for many years without anyone
noticing. Marking as BROKEN.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220921181009.oxytvicy6sry6it7@redhat.com
2022-10-17 16:39:32 +02:00
Palmer Dabbelt 1a5a2cbd21
Merge patch series "Use composable cache instead of L2 cache"
Zong Li <zong.li@sifive.com> says:

Since composable cache may be L3 cache if private L2 cache exists, we
should use its original name "composable cache" to prevent confusion.

This patchset contains the modification which is related to ccache, such
as DT binding and EDAC driver.

* b4-shazam-merge:
  riscv: Add cache information in AUX vector
  soc: sifive: ccache: define the macro for the register shifts
  soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes
  soc: sifive: ccache: reduce printing on init
  soc: sifive: ccache: determine the cache level from dts
  soc: sifive: ccache: Rename SiFive L2 cache to Composable cache.
  dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache

Link: https://lore.kernel.org/r/20220913061817.22564-1-zong.li@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13 11:07:13 -07:00
Greentime Hu ca120a79cf
soc: sifive: ccache: Rename SiFive L2 cache to Composable cache.
Since composable cache may be L3 cache if there is a L2 cache, we should
use its original name composable cache to prevent confusion.

There are some new lines were generated due to adding the compatible
"sifive,ccache0" into ID table and indent requirement.

The sifive L2 has been renamed to sifive CCACHE, EDAC driver needs to
apply the change as well.

Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Co-developed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220913061817.22564-3-zong.li@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13 11:06:51 -07:00
Borislav Petkov c257795609 Merge branches 'edac-drivers' and 'edac-misc' into edac-updates-for-v6.1
Combine all queued EDAC changes for submission into v6.1:

* edac-drivers:
  EDAC/ie31200: Add Skylake-S support

* edac-misc:
  EDAC/i7300: Correct the i7300_exit() function name in comment
  x86/sb_edac: Add row column translation for Broadwell
  EDAC/i10nm: Print an extra register set of retry_rd_err_log
  EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM
  EDAC/skx_common: Add ChipSelect ADXL component
  EDAC/ppc_4xx: Reorder symbols to get rid of a few forward declarations
  EDAC: Remove obsolete declarations in edac_module.h
  EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs
  EDAC/skx_common: Make output format similar
  EDAC/skx_common: Use driver decoder first
  EDAC/mc: Drop duplicated dimm->nr_pages debug printout
  EDAC/mc: Replace spaces with tabs in memtype flags definition
  EDAC/wq: Remove unneeded flush_workqueue()

Signed-off-by: Borislav Petkov <bp@suse.de>
2022-10-04 10:00:25 +02:00
Colin Ian King d3923513ed EDAC/i7300: Correct the i7300_exit() function name in comment
The incorrect function name is being used in the comment for function
i7300_exit. Correct this.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220805125008.2346559-1-colin.i.king@gmail.com
2022-09-23 23:07:17 +02:00
Youquan Song d389059685 x86/sb_edac: Add row column translation for Broadwell
The sb_edac driver lacks translation for DIMM internal address.

Add memory address translation for row/column/bank/bank_group
on Broadwell.

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220722233338.341567-1-tony.luck@intel.com
2022-09-23 12:34:23 -07:00
Qiuxu Zhuo d5f5e49953 EDAC/i10nm: Print an extra register set of retry_rd_err_log
Sapphire Rapids server adds an extra register set for logging more
retry_rd_err_log data. So add code to print the extra register set.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220722233338.341567-1-tony.luck@intel.com
2022-09-23 12:33:27 -07:00
Qiuxu Zhuo acd4cf68fe EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM
An HBM memory channel is divided into two pseudo channels. Each
pseudo channel has its own retry_rd_err_log registers. Retrieve and
print retry_rd_err_log registers of the HBM pseudo channel if the
memory error is from HBM.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220722233338.341567-1-tony.luck@intel.com
2022-09-23 12:33:18 -07:00
Qiuxu Zhuo 14646de48b EDAC/skx_common: Add ChipSelect ADXL component
Each pseudo channel of HBM has its own retry_rd_err_log registers.
The bit 0 of ChipSelect ADXL component encodes the pseudo channel
number of HBM memory. So add ChipSelect ADXL component to get HBM
pseudo channel number.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220722233338.341567-1-tony.luck@intel.com
2022-09-23 12:33:08 -07:00
Uwe Kleine-König d5e4eeea0c EDAC/ppc_4xx: Reorder symbols to get rid of a few forward declarations
When moving the definition of ppc4xx_edac_driver further down, the
forward declarations can just be dropped.

Do this to reduce needless line repetition.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220917232013.489931-1-u.kleine-koenig@pengutronix.de
2022-09-18 19:35:22 +02:00
Gaosheng Cui d42d6f5a5f EDAC: Remove obsolete declarations in edac_module.h
Commit

  4de78c6877 ("drivers/edac: mod PCI poll names"),

renamed the respective variables and accessors but left the old accessor
declarations edac_get_log_ce(), edac_get_log_ue(), edac_get_poll_msec()
and edac_get_panic_on_ue() in place. Remove them.

  [ bp: Masssage commit message. ]

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220911094038.3224365-1-cuigaosheng1@huawei.com
2022-09-11 12:26:41 +02:00
Youquan Song 2738c69a88 EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs
Current i10nm_edac only supports firmware decoder (ACPI DSM methods).
MCA bank registers of Ice Lake or Tremont CPUs contain the information
to decode DDR memory errors. To get better decoding performance, add
the driver decoder (decoding DDR memory errors via extracting error
information from MCA bank registers) for Ice Lake and Tremont CPUs.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220901194310.115427-1-tony.luck@intel.com/
2022-09-08 11:40:01 -07:00
Qiuxu Zhuo 627d551a9e EDAC/skx_common: Make output format similar
The decoded output format of driver decoder is different from the
output format of firmware decoder. Make output format similar regardless
of decode function (Align driver decoder's to firmware decoder's).

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220901194310.115427-1-tony.luck@intel.com/
2022-09-08 11:40:00 -07:00
Qiuxu Zhuo fe32f36693 EDAC/skx_common: Use driver decoder first
The performance of driver decoder[1] is better than the performance
of firmware decoder[2], especially on frequent correctable errors.

So use the driver decoder first, fall back to firmware decoder if
the driver decoder is unavailable. Also rename the function pointer
skx_decode to driver_decode (better name to contrast with adxl_decode).

[1] Decode errors by extracting error information from registers of
    memory controllers and/or MCA bank registers.

[2] Decode errors by calling ACPI DSM methods.

Co-developed-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20220901194310.115427-1-tony.luck@intel.com/
2022-09-08 11:40:00 -07:00
Serge Semin 93df194765 EDAC/mc: Drop duplicated dimm->nr_pages debug printout
The duplicated edac_dbg()-based dimm->nr_pages print was introduced in

  6e84d359b2 ("edac_mc: Cleanup per-dimm_info debug messages").

The duplicated line can be found even in the commit message text:

  [ 1011.380101] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
  [ 1011.380103] EDAC DEBUG: edac_mc_dump_dimm:   dimm->grain = 8
  [ 1011.380104] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000

Drop the second edac_dbg() call.

  [ bp: Massage commit message. ]

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220822190730.27277-14-Sergey.Semin@baikalelectronics.ru
2022-09-01 08:52:18 +02:00
ran jianping fb4b968577 EDAC/wq: Remove unneeded flush_workqueue()
destroy_workqueue() already takes care of flushing the workqueue so
there is no need to flush it explicitly.

  [ bp: Massage commit message. ]

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ran jianping <ran.jianping@zte.com.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220424062127.3219542-1-ran.jianping@zte.com.cn
2022-08-25 10:50:35 +02:00
Josh Hant 7a14a11f93 EDAC/ie31200: Add Skylake-S support
Add device IDs for Skylake-S CPUs according to datasheet.

Signed-off-by: Josh Hant <joshuahant@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/20220712102121.20812-1-joshuahant@gmail.com
2022-08-25 10:28:01 +02:00
Linus Torvalds cae4199f93 powerpc updates for 6.0
- Add support for syscall stack randomization.
 
  - Add support for atomic operations to the 32 & 64-bit BPF JIT.
 
  - Full support for KASAN on 64-bit Book3E.
 
  - Add a watchdog driver for the new PowerVM hypervisor watchdog.
 
  - Add a number of new selftests for the Power10 PMU support.
 
  - Add a driver for the PowerVM Platform KeyStore.
 
  - Increase the NMI watchdog timeout during live partition migration, to avoid timeouts
    due to increased memory access latency.
 
  - Add support for using the 'linux,pci-domain' device tree property for PCI domain
    assignment.
 
  - Many other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas
 Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz,
 Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg
 Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
 Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna
 Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant,
 Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
 Jianfeng, Zhouyi Zhou.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmLuAPgTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBPpD/9kY/T0qlOXABxlZCgtqeAjPX+2xpnY
 BF+TlsN1TS1auFcEZL2BapmVacsvOeGEFDVuZHZvZJc69Hx+gSjnjFCnZjp6n+Yz
 wt6y9w9Pu0t/sjD5vNQ46O15/dXqm6RoVI7um12j/WLMN8Ko5+x3gKAyQONjQd2/
 1kPcxVH6FUosAdnCuvIcqCX4e4IIHl2ZkitHOTXoQUvUy9oAK/mOBnwqZ6zLGUKC
 E5M+Zyt4RFGxhPs48FkX6Nq6crDGU/P0VJpDKkR/t7GHnE67Bm70gZougAPrzrgP
 nx8zoTWgDKpqDeuqK7pFcyKgNS3dKbxsN3sAfKHOWu/YnV4wMyy+7fmwagMauki7
 lXccKN6F/r+8JcMNx80Jp/dAw3ZdLceP38M3Ryf8IL6lTfkNySumUvrKJn6r1Cu1
 wvzhgyEuDawss9KHdEmXcA2i3+XVZvitaipO7JWUC8pblrP1SJMoPfIIe9zh3y3M
 pyZj0TcGJ8XaK+badvI+PW/K/KeRgXEY8HpC3wDHSoIkli3OE4jDwXn6TiZgvm3n
 k0sKL8YSmQZ8hP8QAkR+r8NQKYqLlfyPxdslK5omDPxfub5Uzk9ZV2Ep7svkaiQn
 Wqjq27Dpz8+w0XPjsQ0Tkv+ByTkOhrawOH7x9SpFLHpv9g5otcYmS79NkO/htx8C
 6LyPNx1VYn5IRA==
 =tRkm
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for syscall stack randomization

 - Add support for atomic operations to the 32 & 64-bit BPF JIT

 - Full support for KASAN on 64-bit Book3E

 - Add a watchdog driver for the new PowerVM hypervisor watchdog

 - Add a number of new selftests for the Power10 PMU support

 - Add a driver for the PowerVM Platform KeyStore

 - Increase the NMI watchdog timeout during live partition migration, to
   avoid timeouts due to increased memory access latency

 - Add support for using the 'linux,pci-domain' device tree property for
   PCI domain assignment

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N.  Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.

* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
  powerpc/64e: Fix kexec build error
  EDAC/ppc_4xx: Include required of_irq header directly
  powerpc/pci: Fix PHB numbering when using opal-phbid
  powerpc/64: Init jump labels before parse_early_param()
  selftests/powerpc: Avoid GCC 12 uninitialised variable warning
  powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
  powerpc/xive: Fix refcount leak in xive_get_max_prio
  powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
  powerpc/perf: Include caps feature for power10 DD1 version
  powerpc: add support for syscall stack randomization
  powerpc: Move system_call_exception() to syscall.c
  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
  powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
  powerpc/powernv: Avoid crashing if rng is NULL
  selftests/powerpc: Fix matrix multiply assist test
  powerpc/signal: Update comment for clarity
  powerpc: make facility_unavailable_exception 64s
  powerpc/platforms/83xx/suspend: Remove write-only global variable
  powerpc/platforms/83xx/suspend: Prevent unloading the driver
  powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
  ...
2022-08-06 16:38:17 -07:00
Linus Torvalds 5f0848190c platform-drivers-x86 for v6.0-1
Highlights:
  -  Microsoft Surface:
     - SSAM hot unplug support
     - Surface Pro 8 keyboard cover support
     - Tablet mode switch support for Surface Pro 8 and Surface Laptop Studio
  -  thinkpad_acpi: AMD Automatice Mode Transitions (AMT) support
  -  Mellanox:
     - Vulcan chassis COMe NVSwitch management support
     - XH3000 support
  - New generic/shared Intel P2SB (Primary to Sideband) support
  - Lots of small cleanups
  - Various small bugfixes
  - Various new hardware ids / quirks additions
 
 The following is an automated git shortlog grouped by driver:
 
 ACPI:
  -  video: Fix acpi_video_handles_brightness_key_presses()
  -  video: Change how we determine if brightness key-presses are handled
 
 Documentation/ABI:
  -  Add new attributes for mlxreg-io sysfs interfaces
  -  mlxreg-io: Fix contact info
 
 Drop the PMC_ATOM Kconfig option:
  - Drop the PMC_ATOM Kconfig option
 
 EDAC, pnd2:
  -  convert to use common P2SB accessor
  -  Use proper I/O accessors and address space annotation
 
 HID:
  -  surface-hid: Add support for hot-removal
 
 ISST:
  -  PUNIT device mapping with Sub-NUMA clustering
 
 Kconfig:
  -  Remove unnecessary "if X86"
 
 MAINTAINERS:
  -  repair file entry in MICROSOFT SURFACE AGGREGATOR TABLET-MODE SWITCH
 
 Merge tag 'ib-mfd-edac-i2c-leds-pinctrl-platform-watchdog-v5.20' into review-hans:
  - Merge tag 'ib-mfd-edac-i2c-leds-pinctrl-platform-watchdog-v5.20' into review-hans
 
 Move AMD platform drivers to separate directory:
  - Move AMD platform drivers to separate directory
 
 acer-wmi:
  -  Use backlight helper
 
 acer_wmi:
  -  Cleanup Kconfig selects
 
 apple-gmux:
  -  Use backlight helper
 
 asus-wmi:
  -  Add mic-mute LED classdev support
  -  Add key mappings
 
 compal-laptop:
  -  Use backlight helper
 
 efi:
  -  Fix efi_power_off() not being run before acpi_power_off() when necessary
 
 gigabyte-wmi:
  -  add support for B660I AORUS PRO DDR4
 
 hp-wmi:
  -  Ignore Sanitization Mode event
 
 i2c:
  -  i801: convert to use common P2SB accessor
 
 ideapad-laptop:
  -  Add Ideapad 5 15ITL05 to ideapad_dytc_v4_allow_table[]
  -  Add allow_v4_dytc module parameter
 
 intel/pmc:
  -  Add Alder Lake N support to PMC core driver
 
 intel_atomisp2_led:
  -  Also turn off the always-on camera LED on the Asus T100TAF
 
 leds:
  -  simatic-ipc-leds-gpio: Add GPIO version of Siemens driver
  -  simatic-ipc-leds: Convert to use P2SB accessor
 
 mfd:
  -  lpc_ich: Add support for pinctrl in non-ACPI system
  -  lpc_ich: Switch to generic p2sb_bar()
  -  lpc_ich: Factor out lpc_ich_enable_spi_write()
 
 mlx-platform:
  -  Add COME board revision register
  -  Add support for new system XH3000
  -  Introduce support for COMe NVSwitch management module for Vulcan chassis
  -  Add support for systems equipped with two ASICs
  -  Add cosmetic changes for alignment
  -  Make activation of some drivers conditional
 
 p2sb:
  -  Move out of X86_PLATFORM_DEVICES dependency
 
 panasonic-laptop:
  -  Use acpi_video_get_backlight_type()
  -  filter out duplicate volume up/down/mute keypresses
  -  don't report duplicate brightness key-presses
  -  revert "Resolve hotkey double trigger bug"
  -  sort includes alphabetically
  -  de-obfuscate button codes
 
 pinctrl:
  -  intel: Check against matching data instead of ACPI companion
 
 platform/mellanox:
  -  mlxreg-lc: Fix error flow and extend verbosity
  -  mlxreg-io: Add locking for io operations
  -  nvsw-sn2201: fix error code in nvsw_sn2201_create_static_devices()
 
 platform/olpc:
  -  Fix uninitialized data in debugfs write
 
 platform/surface:
  -  gpe: Add support for 13" Intel version of Surface Laptop 4
  -  tabletsw: Fix __le32 integer access
  -  Update copyright year of various drivers
  -  aggregator: Move subsystem hub drivers to their own module
  -  aggregator: Move device registry helper functions to core module
  -  aggregator_registry: Add support for tablet mode switch on Surface Laptop Studio
  -  aggregator_registry: Add support for tablet mode switch on Surface Pro 8
  -  Add KIP/POS tablet-mode switch driver
  -  aggregator: Add helper macros for requests with argument and return value
  -  aggregator: Reserve more event- and target-categories
  -  avoid flush_scheduled_work() usage
  -  aggregator_registry: Add support for keyboard cover on Surface Pro 8
  -  aggregator_registry: Add KIP device hub
  -  aggregator_registry: Change device ID for base hub
  -  aggregator_registry: Generify subsystem hub functionality
  -  aggregator: Add comment for KIP subsystem category
  -  aggregator_registry: Use client device wrappers for notifier registration
  -  aggregator: Allow notifiers to avoid communication on unregistering
  -  aggregator: Allow devices to be marked as hot-removed
  -  aggregator: Allow is_ssam_device() to be used when CONFIG_SURFACE_AGGREGATOR_BUS is disabled
 
 platform/x86/amd/pmc:
  -  Add new platform support
  -  Add new acpi id for PMC controller
 
 platform/x86/dell:
  -  Kconfig: Remove unnecessary "depends on X86_PLATFORM_DEVICES"
 
 platform/x86/intel:
  -  Add Primary to Sideband (P2SB) bridge support
 
 platform/x86/intel/ifs:
  -  Mark as BROKEN
 
 platform/x86/intel/pmt:
  -  telemetry: Fix fixed region handling
 
 platform/x86/intel/vsec:
  -  Fix wrong type for local status variables
  -  Add PCI error recovery support to Intel PMT
  -  Add support for Raptor Lake
  -  Rework early hardware code
 
 pmc_atom:
  -  Fix comment typo
  -  Match all Lex BayTrail boards with critclk_systems DMI table
 
 power/supply:
  -  surface_battery: Use client device wrappers for notifier registration
  -  surface_charger: Use client device wrappers for notifier registration
 
 serial-multi-instantiate:
  -  Sort ACPI IDs by HID
  -  Get rid of redundant 'else'
  -  Use while (i--) pattern to clean up
  -  Improve dev_err_probe() messaging
  -  Drop duplicate check
  -  Improve autodetection
 
 simatic-ipc:
  -  drop custom P2SB bar code
 
 sony-laptop:
  -  Remove useless comparisons in sony_pic_read_possible_resource()
 
 system76_acpi:
  -  Use dev_get_drvdata
 
 thinkpad_acpi:
  -  Enable AMT by default on supported systems
  -  Add support for hotkey 0x131a
  -  Add support for automatic mode transitions
  -  profile capabilities as integer
  -  do not use PSC mode on Intel platforms
  -  Fix a memory leak of EFCH MMIO resource
  -  Replace custom str_on_off() etc
  -  Sort headers for better maintenance
  -  Use backlight helper
 
 tools/power/x86/intel-speed-select:
  -  Remove unneeded semicolon
  -  Fix off by one check
 
 watchdog:
  -  simatic-ipc-wdt: convert to use P2SB accessor
 
 x86-android-tablets:
  -  Fix Lenovo Yoga Tablet 2 830/1050 poweroff again
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmLrndEUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9zeMgf7BjSCz6ZA8SSY1i8QHDTvdjySihHJ
 j07Gn3j1T/5G00R/r6viMDE4PxcYvMAPXjq3azepKQd8H5kGfE323SA6fgWFPAvi
 P2OvEfvWfI5S8FYGYPBkNP2MjQ5MFe7qzLEh3+wQH0ocJ7WRCi457B4Xvtd2gWI3
 dHj5gMSWC3O5xNa2S4Mg3dnD9uJlwhX+FNjWIuRy8eh5+DikgByyC4B+uW6WtO5e
 t0rmIm6q5wUzB7dIetJLoAQwrcpYAOkK7L33G9h/7knWAfiJfklaKTbftnoxbDSv
 iGWODkLDyob4C48DmVusS6WMEhPUzl/R33+tk6LjVt/YOYOP030EMtCECQ==
 =Krhc
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Hans de Goede:

 - Microsoft Surface:
     - SSAM hot unplug support
     - Surface Pro 8 keyboard cover support
     - Tablet mode switch support for Surface Pro 8 and Surface Laptop
       Studio

 - thinkpad_acpi:
     - AMD Automatice Mode Transitions (AMT) support

 - Mellanox:
     - Vulcan chassis COMe NVSwitch management support
     - XH3000 support

 - New generic/shared Intel P2SB (Primary to Sideband) support

 - Lots of small cleanups

 - Various small bugfixes

 - Various new hardware ids / quirks additions

* tag 'platform-drivers-x86-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (105 commits)
  platform/x86/intel/vsec: Fix wrong type for local status variables
  platform/x86: p2sb: Move out of X86_PLATFORM_DEVICES dependency
  platform/x86: pmc_atom: Fix comment typo
  platform/surface: gpe: Add support for 13" Intel version of Surface Laptop 4
  platform/olpc: Fix uninitialized data in debugfs write
  platform/mellanox: mlxreg-lc: Fix error flow and extend verbosity
  platform/x86: pmc_atom: Match all Lex BayTrail boards with critclk_systems DMI table
  platform/x86: sony-laptop: Remove useless comparisons in sony_pic_read_possible_resource()
  tools/power/x86/intel-speed-select: Remove unneeded semicolon
  tools/power/x86/intel-speed-select: Fix off by one check
  platform/surface: tabletsw: Fix __le32 integer access
  Documentation/ABI: Add new attributes for mlxreg-io sysfs interfaces
  Documentation/ABI: mlxreg-io: Fix contact info
  platform/mellanox: mlxreg-io: Add locking for io operations
  platform/x86: mlx-platform: Add COME board revision register
  platform/x86: mlx-platform: Add support for new system XH3000
  platform/x86: mlx-platform: Introduce support for COMe NVSwitch management module for Vulcan chassis
  platform/x86: mlx-platform: Add support for systems equipped with two ASICs
  platform/x86: mlx-platform: Add cosmetic changes for alignment
  platform/x86: mlx-platform: Make activation of some drivers conditional
  ...
2022-08-04 18:19:14 -07:00
Linus Torvalds c1c76700a0 SPDX changes for 6.0-rc1
Here is the set of SPDX comment updates for 6.0-rc1.
 
 Nothing huge here, just a number of updated SPDX license tags and
 cleanups based on the review of a number of common patterns in GPLv2
 boilerplate text.  Also included in here are a few other minor updates,
 2 USB files, and one Documentation file update to get the SPDX lines
 correct.
 
 All of these have been in the linux-next tree for a very long time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
 4AKdqbiBNlFbCroQwmeQ
 =v1sg
 -----END PGP SIGNATURE-----

Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
 "Here is the set of SPDX comment updates for 6.0-rc1.

  Nothing huge here, just a number of updated SPDX license tags and
  cleanups based on the review of a number of common patterns in GPLv2
  boilerplate text.

  Also included in here are a few other minor updates, two USB files,
  and one Documentation file update to get the SPDX lines correct.

  All of these have been in the linux-next tree for a very long time"

* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
  Documentation: samsung-s3c24xx: Add blank line after SPDX directive
  x86/crypto: Remove stray comment terminator
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
  ...
2022-08-04 12:12:54 -07:00
Christophe Leroy bce02f71e4 EDAC/ppc_4xx: Include required of_irq header directly
Commit 4d5c5bad51 ("powerpc: Remove asm/prom.h from asm/mpc52xx.h
and asm/pci.h") that cleans up powerpc's asm/prom.h leads to build
errors in ppc4xx_edac.c due to missing header. Include required
header directly to avoid the build failure.

Fixes: 4d5c5bad51 ("powerpc: Remove asm/prom.h from asm/mpc52xx.h and asm/pci.h")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/993f5a7da371458cb819b5f3f569073c78523b01.1659436180.git.christophe.leroy@csgroup.eu
2022-08-02 22:30:14 +10:00
Michael Ellerman 4177ab2283 EDAC/mpc85xx: Include required of headers directly
A subsequent commit to cleanup powerpc's asm/prom.h leads to build
errors in mpc85xx_edac.c due to missing headers. Include all required
headers directly to avoid the build failure.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2022-07-28 16:22:12 +10:00
Toshi Kani 5e2805d537 EDAC/ghes: Set the DIMM label unconditionally
The commit

  cb51a371d0 ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports")

enforced that both the bank and device strings passed to
dimm_setup_label() are not NULL.

However, there are BIOSes, for example on a

  HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019

which don't populate both strings:

  Handle 0x0020, DMI type 17, 84 bytes
  Memory Device
          Array Handle: 0x0013
          Error Information Handle: Not Provided
          Total Width: 72 bits
          Data Width: 64 bits
          Size: 32 GB
          Form Factor: DIMM
          Set: None
          Locator: PROC 1 DIMM 1        <===== device
          Bank Locator: Not Specified   <===== bank

This results in a buffer overflow because ghes_edac_register() calls
strlen() on an uninitialized label, which had non-zero values left over
from krealloc_array():

  detected buffer overflow in __fortify_strlen
   ------------[ cut here ]------------
   kernel BUG at lib/string_helpers.c:983!
   invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
   CPU: 1 PID: 1 Comm: swapper/0 Tainted: G          I       5.18.6-200.fc36.x86_64 #1
   Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
   RIP: 0010:fortify_panic
   ...
   Call Trace:
    <TASK>
    ghes_edac_register.cold
    ghes_probe
    platform_probe
    really_probe
    __driver_probe_device
    driver_probe_device
    __driver_attach
    ? __device_attach_driver
    bus_for_each_dev
    bus_add_driver
    driver_register
    acpi_ghes_init
    acpi_init
    ? acpi_sleep_proc_init
    do_one_initcall

The label contains garbage because the commit in Fixes reallocs the
DIMMs array while scanning the system but doesn't clear the newly
allocated memory.

Change dimm_setup_label() to always initialize the label to fix the
issue. Set it to the empty string in case BIOS does not provide both
bank and device so that ghes_edac_register() can keep the default label
given by edac_mc_alloc_dimms().

  [ bp: Rewrite commit message. ]

Fixes: b9cae27728 ("EDAC/ghes: Scan the system once on driver init")
Co-developed-by: Robert Richter <rric@kernel.org>
Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Robert Elliott <elliott@hpe.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220719220124.760359-1-toshi.kani@hpe.com
2022-07-27 10:42:52 +02:00
Sherry Sun 4bcffe9417 EDAC/synopsys: Re-enable the error interrupts on v3 hw
zynqmp_get_error_info() writes 0 to the ECC_CLR_OFST register after
an interrupt for a {un-,}correctable error is raised, which disables
the error interrupts. Then the interrupt handler will be called only
once. Therefore, re-enable the error interrupt line at the end of
intr_handler() for v3.x Synopsys EDAC DDR.

Fixes: f7824ded41 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Shubhrajyoti Datta <Shubhrajyoti.datta@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220427015137.8406-3-sherry.sun@nxp.com
2022-07-22 14:31:38 +02:00
Sherry Sun be76ceaf03 EDAC/synopsys: Use the correct register to disable the error interrupt on v3 hw
v3.x Synopsys EDAC DDR doesn't have the QOS Interrupt register. Use the
ECC Clear Register to disable the error interrupts instead.

Fixes: f7824ded41 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Shubhrajyoti Datta <Shubhrajyoti.datta@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220427015137.8406-2-sherry.sun@nxp.com
2022-07-22 14:31:30 +02:00
Andy Shevchenko 7b2db7049b EDAC, pnd2: convert to use common P2SB accessor
Since we have a common P2SB accessor in tree we may use it instead of
open coded variants.

Replace custom code by p2sb_bar() call.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Henning Schild <henning.schild@siemens.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Lee Jones <lee@kernel.org>
2022-07-14 10:50:36 +01:00
Andy Shevchenko 6adc32f58b EDAC, pnd2: Use proper I/O accessors and address space annotation
The driver uses rather voodoo kind of castings and I/O accessors.
Replace it with proper __iomem annotation and readl()/readq() calls.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Henning Schild <henning.schild@siemens.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Lee Jones <lee@kernel.org>
2022-07-14 10:50:36 +01:00
Thomas Gleixner 3bb165608e treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_56.RULE (part 2)
Based on the normalized pattern:

    this file is licensed under the terms of the gnu general public
    license version 2 this program is licensed as is without any warranty
    of any kind whether express or implied

extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

has been chosen to replace the boilerplate/reference.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 14:51:35 +02:00
Linus Torvalds abc8babefb - A gargen variety of fixes which don't fit any other tip bucket:
- Remove function export
  - Correct asm constraint
  - Fix __setup handlers retval
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKL6VkACgkQEsHwGGHe
 VUqs6g/+Ikpd4Mrou4P5Ul8QNdN9mEzwUfW6i8VpoA3h1L6mKkZxbUsbSz9xInjw
 MAhrcevujW6GwdQdus2sUcSlX+jxl6c/IlMdf8RegNPY/JBPDX4dRA7rPetvZEDm
 ZiIYVTiEzJoOzPDJeO7a3v5EHPsY6CjsCFhGz7hjIcrwQjzCLkL5MqG+WDAtebe+
 QVdbllD2RlZNPDyHYE5Lqh1h+Y0e4n6kS7LCWxexfHlNOZ5KBRVyIJvz/xOZFZ1/
 9oX0UDD2gfH5chLs8GKsr7cZYERMtNlKBPoxGzl8iKF4iUeiksdj3P5y+mdcFaDG
 YbM7aXewmbyLyiCkh1zXU6Mw3lK1VfUtVXtEYj+qXf1jWp59ctNEJkc6/VAcaKh7
 oS7MNG7Y44B8XwdH7MiqDE7eVCnqEjIR+BIiwjyXNLFP1AXZMAXuBzXPF/vZ3Gyf
 3N5vzO4VNEN6Oa1TReSspKwYvq2uPtHMjLX2rT6Py2ru32mj2dCc5E7GD83RKL8V
 vDIz4VGOZyGfjp6gClMBsyK4mYwSwgXbnOci7DJn56mMf2qzBJITILXc31zz4gX2
 E9kiBu/4Mwjnrx9QRpCNXu7iddBA3YM2NMtNlwBcCgZOFaFz/yOx9TpnugF17WHQ
 VVtQi8wlcsS+F05Y11b7euusMQyk1EpWabIrw8UQd+61Dwpz58Q=
 =/WGB
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:
 "A variety of fixes which don't fit any other tip bucket:

   - Remove unnecessary function export

   - Correct asm constraint

   - Fix __setup handlers retval"

* tag 'x86_misc_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Cleanup the control_va_addr_alignment() __setup handler
  x86: Fix return value of __setup handlers
  x86/delay: Fix the wrong asm constraint in delay_loop()
  x86/amd_nb: Unexport amd_cache_northbridges()
2022-05-23 19:32:59 -07:00
Linus Torvalds 0be3ff0ccb - Switch ghes_edac to use the CPER error reporting routines and simplify
the code considerably this way
 
 - Rip out the silly edac_align_ptr() contraption which was computing the
 size of the private structures of each driver and thus allowing for a
 one-shot memory allocation. This was clearly unnecessary and confusing
 so switch to simple and boring kmalloc* calls.
 
 - Last but not least, the usual garden variety of fixes, cleanups and
 improvements all over EDAC land
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLRJkACgkQEsHwGGHe
 VUorYQ//a3s/j1wIXB5J0q7c9xlumqhxJz9P2A42kZMEW6MR94Ovr2lDnN6FRN5z
 dlLLn/fxjh3El084jaKrfhHHyB0Z78Qte/Caf4E3HVuhmZ2dQw58vXAm3TNMsiPz
 DEnJrRJ/vuX/VEcuuvX9wwSovPqNINW4lb9cWcIfGPToX051coUvuxTQXmCO80Hd
 2syv88S0a8tw94E6DeB+5hhAQdgdV2dK3rZChTNi1guDqHqv14E6oQowWe6+Dvq/
 XGBbJtmjuWsh2ZtS1KDnGYO0jvzLxe/5kjdgXYUoftG30MVTkVV0pBk0G+lPQQBN
 2nSLd9zEgSceB5SlNlfWtQQuL1I56q3chxT7mj5JBPRsqQmV6Rxg9E0jnyiUH6Cf
 Q9btDizjU7vUpDKe1Y8fJEMR3nXTIK58AnjcDmTZIu5hVZFY2nYnql0txClmkTUE
 Bffud97C7a8uiSECp6oS5vjQHK12xwqiD8KRIaAHlBDYnpqTOJw/mDoKUvV74yiJ
 TRvvPAiPgoA5ZLLkCFKxA7IzFXtgz9HL7m/MbbBo63ed187qvMyBxcyb8Teih/iy
 u6eK0W1fux+zEaS6q5Jp0v415aqVvoa0UHgImTlOJhBaWENlEQixHslFMaqnlDTV
 yhG405KxxMgW9/L9nI4kqP827zIr4iXJCVg3rJsOdytEzfwWy2o=
 =Hwmn
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Switch ghes_edac to use the CPER error reporting routines and
   simplify the code considerably this way

 - Rip out the silly edac_align_ptr() contraption which was computing
   the size of the private structures of each driver and thus allowing
   for a one-shot memory allocation. This was clearly unnecessary and
   confusing so switch to simple and boring kmalloc* calls.

 - Last but not least, the usual garden variety of fixes, cleanups and
   improvements all over EDAC land

* tag 'edac_updates_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/xgene: Fix typo processsors -> processors
  EDAC/i5100: Remove unused inline function i5100_nrecmema_dm_buf_id()
  EDAC: Use kcalloc()
  EDAC/ghes: Change ghes_hw from global to static
  EDAC/armada_xp: Use devm_platform_ioremap_resource()
  EDAC/synopsys: Add a SPDX identifier
  EDAC/synopsys: Add driver support for i.MX platforms
  EDAC/dmc520: Don't print an error for each unconfigured interrupt line
  EDAC/mc: Get rid of edac_align_ptr()
  EDAC/device: Sanitize edac_device_alloc_ctl_info() definition
  EDAC/device: Get rid of the silly one-shot memory allocation in edac_device_alloc_ctl_info()
  EDAC/pci: Get rid of the silly one-shot memory allocation in edac_pci_alloc_ctl_info()
  EDAC/mc: Get rid of silly one-shot struct allocation in edac_mc_alloc()
  efi/cper: Reformat CPER memory error location to more readable
  EDAC/ghes: Unify CPER memory error location reporting
  efi/cper: Add a cper_mem_err_status_str() to decode error description
  powerpc/85xx: Remove fsl,85... bindings
2022-05-23 17:34:20 -07:00
Borislav Petkov be80a1ca51 Merge branches 'edac-misc' and 'edac-alloc-cleanup' into edac-updates-for-v5.19
Combine all collected EDAC changes for submission into v5.19:

* edac-misc:
  EDAC/xgene: Fix typo processsors -> processors
  EDAC/i5100: Remove unused inline function i5100_nrecmema_dm_buf_id()
  EDAC/ghes: Change ghes_hw from global to static
  EDAC/armada_xp: Use devm_platform_ioremap_resource()
  EDAC/synopsys: Add a SPDX identifier
  EDAC/synopsys: Add driver support for i.MX platforms
  EDAC/dmc520: Don't print an error for each unconfigured interrupt line
  efi/cper: Reformat CPER memory error location to more readable
  EDAC/ghes: Unify CPER memory error location reporting
  efi/cper: Add a cper_mem_err_status_str() to decode error description
  powerpc/85xx: Remove fsl,85... bindings

* edac-alloc-cleanup:
  EDAC: Use kcalloc()
  EDAC/mc: Get rid of edac_align_ptr()
  EDAC/device: Sanitize edac_device_alloc_ctl_info() definition
  EDAC/device: Get rid of the silly one-shot memory allocation in edac_device_alloc_ctl_info()
  EDAC/pci: Get rid of the silly one-shot memory allocation in edac_pci_alloc_ctl_info()
  EDAC/mc: Get rid of silly one-shot struct allocation in edac_mc_alloc()

Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-23 10:19:30 +02:00
Julia Lawall 2aeb1f5fbb EDAC/xgene: Fix typo processsors -> processors
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220521111145.81697-39-Julia.Lawall@inria.fr
2022-05-21 16:03:49 +02:00
YueHaibing 2edb9863e1 EDAC/i5100: Remove unused inline function i5100_nrecmema_dm_buf_id()
Commit

  a4972b1b9a ("edac: i5100_edac: Remove unused i5100_recmema_dm_buf_id")

left this function unused. Remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220514080433.29944-1-yuehaibing@huawei.com
2022-05-17 17:44:50 +02:00
Borislav Petkov 13088b65d9 EDAC: Use kcalloc()
It is syntactic sugar anyway:

  # drivers/edac/edac_mc.o:

   text    data     bss     dec     hex filename
  13378     324       8   13710    358e edac_mc.o.before
  13378     324       8   13710    358e edac_mc.o.after

md5:
   70a53ee3ac7f867730e35c2be9110748  edac_mc.o.before.asm
   70a53ee3ac7f867730e35c2be9110748  edac_mc.o.after.asm

  # drivers/edac/edac_device.o:

   text    data     bss     dec     hex filename
   5704     120       4    5828    16c4 edac_device.o.before
   5704     120       4    5828    16c4 edac_device.o.after

md5:
   880563c859da6eb9aca85ec431fdbaeb  edac_device.o.before.asm
   880563c859da6eb9aca85ec431fdbaeb  edac_device.o.after.asm

No functional changes.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220412211957.28899-1-bp@alien8.de
2022-05-02 11:32:44 +02:00
Tom Rix 815fad6e4f EDAC/ghes: Change ghes_hw from global to static
Smatch reports this issue
  ghes_edac.c:44:3: warning: symbol 'ghes_hw' was not declared. Should it be static?

ghes_hw is used only in ghes_edac.c so change its storage-class
specifier to static.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220421135319.1508754-1-trix@redhat.com
2022-04-29 11:23:09 +02:00
Lv Ruyi 2f58783c5d EDAC/armada_xp: Use devm_platform_ioremap_resource()
Use the devm_platform_ioremap_resource() helper instead of calling
platform_get_resource() and devm_ioremap_resource() separately. Make the
code simpler without functional changes.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jan Luebbe <jlu@pengutronix.de>
Link: https://lore.kernel.org/r/20220421084621.2615517-1-lv.ruyi@zte.com.cn
2022-04-29 11:14:08 +02:00
Shubhrajyoti Datta 9ae83ec8b8 EDAC/synopsys: Add a SPDX identifier
Replace the copyright boilerplate with a SPDX identifier.

  [ bp: Rewrite commit message. ]

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220428044051.2842687-1-shubhrajyoti.datta@xilinx.com
2022-04-28 15:38:08 +02:00
Sherry Sun 5297ecfe24 EDAC/synopsys: Add driver support for i.MX platforms
i.MX8MP use Synopsys v3.70a DDR controller IP so add support for it with
the Synopsys driver.

Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@amd.com>
Link: https://lore.kernel.org/r/20220428023209.18087-1-sherry.sun@nxp.com
2022-04-28 15:15:59 +02:00
Tyler Hicks ad2df24732 EDAC/dmc520: Don't print an error for each unconfigured interrupt line
The dmc520 driver requires that at least one interrupt line, out of the
ten possible, is configured. The driver prints an error and returns
-EINVAL from its .probe function if there are no interrupt lines
configured.

Don't print a KERN_ERR level message for each interrupt line that's
unconfigured as that can confuse users into thinking that there is an
error condition.

Before this change, the following KERN_ERR level messages would be
reported if only dram_ecc_errc and dram_ecc_errd were configured in the
device tree:

  dmc520 68000000.dmc: IRQ ram_ecc_errc not found
  dmc520 68000000.dmc: IRQ ram_ecc_errd not found
  dmc520 68000000.dmc: IRQ failed_access not found
  dmc520 68000000.dmc: IRQ failed_prog not found
  dmc520 68000000.dmc: IRQ link_err not
  dmc520 68000000.dmc: IRQ temperature_event not found
  dmc520 68000000.dmc: IRQ arch_fsm not found
  dmc520 68000000.dmc: IRQ phy_request not found

Fixes: 1088750d78 ("EDAC: Add EDAC driver for DMC520")
Reported-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220111163800.22362-1-tyhicks@linux.microsoft.com
2022-04-19 11:25:41 +02:00
Shubhrajyoti Datta e2932d1f6f EDAC/synopsys: Read the error count from the correct register
Currently, the error count is read wrongly from the status register. Read
the count from the proper error count register (ERRCNT).

  [ bp: Massage. ]

Fixes: b500b4a029 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220414102813.4468-1-shubhrajyoti.datta@xilinx.com
2022-04-14 14:44:49 +02:00
Borislav Petkov 713c4ff885 EDAC/mc: Get rid of edac_align_ptr()
Get rid of it now that it is unused.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220310095254.1510-6-bp@alien8.de
2022-04-11 12:04:11 +02:00
Borislav Petkov 0d24a49e88 EDAC/device: Sanitize edac_device_alloc_ctl_info() definition
Shorten argument names, correct formatting, sort local vars in reverse
x-mas tree order.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220310095254.1510-5-bp@alien8.de
2022-04-11 11:50:39 +02:00
Borislav Petkov 9fb9ce392a EDAC/device: Get rid of the silly one-shot memory allocation in edac_device_alloc_ctl_info()
Use boring kzalloc() instead. Add pointers to the different allocated
members in struct edac_device_ctl_info for easier freeing later.

One of the reasons, perhaps, why it was done this way is to be able to
do a single kfree(ctl_info) without having to kfree() the other parts of
the struct too but that is not nearly a sensible reason as to why there
should be this obscure pointer alignment.

There should be no functional changes resulting from this.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220310095254.1510-4-bp@alien8.de
2022-04-11 11:43:26 +02:00
Borislav Petkov fb8cd45ca3 EDAC/pci: Get rid of the silly one-shot memory allocation in edac_pci_alloc_ctl_info()
Use boring kzalloc() instead.

There should be no functional changes resulting from this.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220310095254.1510-3-bp@alien8.de
2022-04-11 11:38:10 +02:00
Borislav Petkov 0bbb265f70 EDAC/mc: Get rid of silly one-shot struct allocation in edac_mc_alloc()
This has probably meant something at some point but there's no need for
it anymore - the struct mem_ctl_info allocation can happen with normal,
boring k*alloc() calls like everyone else does it.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220310095254.1510-2-bp@alien8.de
2022-04-11 11:25:39 +02:00
Shuai Xue ed27b5df38 EDAC/ghes: Unify CPER memory error location reporting
Switch the GHES EDAC memory error reporting functions to use the common
CPER ones and get rid of code duplication.

  [ bp:
      - rewrite commit message, remove useless text
      - rip out useless reformatting
      - align function params on the opening brace
      - rename function to a more descriptive name
      - drop useless function exports
      - handle buffer lengths properly when printing other detail
      - remove useless casting
  ]

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220308144053.49090-3-xueshuai@linux.alibaba.com
2022-04-08 11:31:18 +02:00
Christophe Leroy b2fa90ef62 powerpc/85xx: Remove fsl,85... bindings
Since

  8a4ab218ef ("powerpc/85xx: Change deprecated binding for 85xx-based boards")

those bindings are not used anymore.

A comment in drivers/edac/mpc85xx_edac.c say they are to be removed
with kernel 2.6.30.

Remove them now.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Scott Wood <oss@buserror.net>
Link: https://lore.kernel.org/r/82a8bc4450a4daee50ee5fada75621fecb3703ff.1648721299.git.christophe.leroy@csgroup.eu
2022-04-05 22:03:02 +02:00
Muralidhara M K e1907d3751 x86/amd_nb: Unexport amd_cache_northbridges()
amd_cache_northbridges() is exported by amd_nb.c and is called by
amd64-agp.c and amd64_edac.c modules at module_init() time so that NB
descriptors are properly cached before those drivers can use them.

However, the init_amd_nbs() initcall already does call
amd_cache_northbridges() unconditionally and thus makes sure the NB
descriptors are enumerated.

That initcall is a fs_initcall type which is on the 5th group (starting
from 0) of initcalls that gets run in increasing numerical order by the
init code.

The module_init() call is turned into an __initcall() in the MODULE=n
case and those are device-level initcalls, i.e., group 6.

Therefore, the northbridges caching is already finished by the time
module initialization starts and thus the correct initialization order
is retained.

Unexport amd_cache_northbridges(), update dependent modules to
call amd_nb_num() instead. While at it, simplify the checks in
amd_cache_northbridges().

  [ bp: Heavily massage and *actually* explain why the change is ok. ]

Signed-off-by: Muralidhara M K <muralimk@amd.com>
Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324122729.221765-1-nchatrad@amd.com
2022-04-05 19:22:27 +02:00
Borislav Petkov 1422df58e5 Merge branch 'edac-amd64' into edac-updates-for-v5.18
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-21 10:34:57 +01:00
Rabara Niravkumar L e1bca853dd EDAC/altera: Add SDRAM ECC check for U-Boot
A bug in legacy U-Boot causes a crash during SDRAM boot if ECC is not
enabled in the bitstream but enabled in the Linux config.

Memory mapped read of the ECC Enabled bit was only enabled if U-Boot
determined ECC was enabled in the bitstream.

The Linux driver checks the ECC enable bit using a memory map read.
In the ECC disabled bitstream case, U-Boot didn't enable ECC register
memory map reads and since they are not allowed this results in a crash.

Always read the ECC Enable register through a SMC call which is always
allowed and it works with legacy and current U-Boot.

  [ bp: Massage commit message. ]

Signed-off-by: Rabara Niravkumar L <niravkumar.l.rabara@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20220305014118.4794-1-niravkumar.l.rabara@intel.com
2022-03-16 09:56:39 +01:00
Yazen Ghannam 2151c84ece EDAC/amd64: Add new register offset support and related changes
Introduce a "family flags" bitmask that can be used to indicate any
special behavior needed on a per-family basis.

Add a flag to indicate a system uses the new register offsets introduced
with Family 19h Model 10h.

Use this flag to account for register offset changes, a new bitfield
indicating DDR5 use on a memory controller, and to set the proper number
of chip select masks.

Rework f17_addr_mask_to_cs_size() to properly handle the change in chip
select masks. And update code comments to reflect the updated Chip
Select, DIMM, and Mask relationships.

[uninitialized variable warning]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20220202144307.2678405-3-yazen.ghannam@amd.com
2022-02-23 22:01:33 +01:00
Yazen Ghannam 75aeaaf23d EDAC/amd64: Set memory type per DIMM
Current AMD systems allow mixing of DIMM types within a system. However,
DIMMs within a channel, i.e. managed by a single Unified Memory
Controller (UMC), must be of the same type.

Handle this possible configuration by checking and setting the memory
type for each individual "UMC" structure.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20220202144307.2678405-2-yazen.ghannam@amd.com
2022-02-23 21:53:45 +01:00
Eliav Farber f8efca92ae EDAC: Fix calculation of returned address and next offset in edac_align_ptr()
Do alignment logic properly and use the "ptr" local variable for
calculating the remainder of the alignment.

This became an issue because struct edac_mc_layer has a size that is not
zero modulo eight, and the next offset that was prepared for the private
data was unaligned, causing an alignment exception.

The patch in Fixes: which broke this actually wanted to "what we
actually care about is the alignment of the actual pointer that's about
to be returned." But it didn't check that alignment.

Use the correct variable "ptr" for that.

  [ bp: Massage commit message. ]

Fixes: 8447c4d15e ("edac: Do alignment logic properly in edac_align_ptr()")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com
2022-02-15 15:54:46 +01:00
Sergey Shtylyov dfd0dfb9a7 EDAC/xgene: Fix deferred probing
The driver overrides error codes returned by platform_get_irq_optional()
to -EINVAL for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.

  [ bp: Massage commit message. ]

Fixes: 0d4429301c ("EDAC: Add APM X-Gene SoC EDAC driver")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-3-s.shtylyov@omp.ru
2022-01-30 01:06:35 +01:00
Sergey Shtylyov 279eb8575f EDAC/altera: Fix deferred probing
The driver overrides the error codes returned by platform_get_irq() to
-ENODEV for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.

  [ bp: Massage commit message. ]

Fixes: 71bcada88b ("edac: altera: Add Altera SDRAM EDAC support")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-2-s.shtylyov@omp.ru
2022-01-28 21:38:46 +01:00
Eliav Farber b0596da1a0 EDAC/mc: Remove unnecessary cast to char * in edac_align_ptr()
Remove the forgotten (char *) casts as that function returns void *.

  [ bp: Rewrite commit message. ]

Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220113100622.12783-3-farbere@amazon.com
2022-01-26 19:42:09 +01:00
Greg Kroah-Hartman 625c6b5569 EDAC: Use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the edac sysfs code to use default_groups field which has
been the preferred way since

  aa30f47cf6 ("kobject: Add support for default attribute groups to kobj_type")

so that the obsolete default_attrs field can be removed soon.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220104112401.1067148-2-gregkh@linuxfoundation.org
2022-01-23 20:01:29 +01:00
Greg Kroah-Hartman 11413893a0 EDAC: Use proper list of struct attribute for attributes
The EDAC sysfs code is doing some crazy casting of the list of
attributes that is not necessary at all. Instead, properly point to the
correct attribute structure in the lists, which removes the need to
cast anything and the code is now properly typesafe (as much as sysfs
attribute logic is typesafe...)

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220104112401.1067148-1-gregkh@linuxfoundation.org
2022-01-23 20:01:25 +01:00
Linus Torvalds ff8be96420 - Add support for version 3 of the Synopsys DDR controller to synopsys_edac
- Add support for DRR5 and new models 0x10-0x1f and 0x50-0x5f of AMD
   family 0x19 CPUs to amd64_edac
 
 - The usual set of fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHb/PwACgkQEsHwGGHe
 VUqhbhAAo0mRNnBF3CJn1zlXRgmqrvV1IPnJQNp+z5iaXY1vr0qRMgO4OcgsJrxF
 nxrx/fdAYlQQO4vz1iq4t4j+eazOQyM/JZ0DKi4e+Dw2mC0axdCx8a0pyl1g2de6
 oQ5GkplRKUFn+3bTJpHIE5QnCOD7S85Mrp1F3Soa6jD9i+HwQIqAoltNMcCP7Yei
 ibhWUBX2H/oYcHARecIkP/YEyzSEHhcX6LRjNILW5haZQ6GziQUFzKUUwpUS3hsz
 9i6hXnHXEPhOq8JyoyWWhvVDywFK9z8lh57G7DFfZIhAk1FjuLDP2iI270D/LkYF
 shq6+M8ST9yqwOMV3Iaoa8VZFf/fjTyV0E0L2p2+faxaJ66rqdzbagLIZQv6hDNe
 N1/LD72/Io4et1kEbbaHm5jpxzSJ0jQwu1o+rY1/NmKsWhzE6V4X0GnDTZzZwP9b
 CbFJAWdCD+fi3WQjzv8HLVepjIsV+R3VVTOLq2oodn/mtoK0DRU/vTeCCwRS9ntF
 IyF55L/jSqy9CtP119KBnItGo4b84UrJDozXizGtc6Zt3chz7ljSpa2gJrKF+fCR
 Yhyr9Pt+vYQBpnIMDu1BPcoE58pwZYKoOSO00COUHHsLn+u8qhetGmQzYwWHCq7J
 hz8HHZnlTdFseZTp5tavk3B4md5HiPu+A7GevK3YH3lBv/vEmDM=
 =85ZE
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Add support for version 3 of the Synopsys DDR controller to
   synopsys_edac

 - Add support for DRR5 and new models 0x10-0x1f and 0x50-0x5f of AMD
   family 0x19 CPUs to amd64_edac

 - The usual set of fixes and cleanups

* tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Add support for family 19h, models 50h-5fh
  EDAC/sb_edac: Remove redundant initialization of variable rc
  RAS/CEC: Remove a repeated 'an' in a comment
  EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
  EDAC: Add RDDR5 and LRDDR5 memory types
  EDAC/sifive: Fix non-kernel-doc comment
  dt-bindings: memory: Add entry for version 3.80a
  EDAC/synopsys: Enable the driver on Intel's N5X platform
  EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
  EDAC/synopsys: Use the quirk for version instead of ddr version
2022-01-10 11:45:23 -08:00
Linus Torvalds 7e740ae635 - First part of a series to move the AMD address translation code from
arch/x86/ to amd64_edac as that is its only user anyway
 
 - Some MCE error injection improvements to the AMD side
 
 - Reorganization of the #MC handler code and the facilities it calls to
 make it noinstr-safe
 
 - Add support for new AMD MCA bank types and non-uniform banks layout
 
 - The usual set of cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcGZ4ACgkQEsHwGGHe
 VUr6Zw//WBvNvfV/akQGsvVo94G0DaF+buYB+Tl1p0goMd7QfKA5iHxjB1alEJC2
 dTchIr7pjiiE3nr4svuWLLQZamx8kMwQNqipioBHXg3YThj0wD4PbUOC9TlIceBR
 3yxVbvwlD7Y7sb2PII6IMlagzTiIeW0/ps29DHFr5vqDBvEanNdAHoV/h2vQi+76
 Ma96psIxzTMSk11yGB6l9k66EASCdDGBU7sODjup7wuQmuRaQ/1oJAWY0wIJvJez
 frjpaz/YKmlTwTf9bxoJbky2FkeBsD4yXXUGwjDgMq0EyUUaeSbvaQkm8gSHX9Yr
 VDDv1WvT6QIw6x7Wc4skS8lWmZghNBbAHOoNS31BPJ2IDmFWkF5Q2bNEuHrtU4EC
 0mkNeyN6x48L/F8j/1aE/tm+SjiGexZX4zhi6MNWReTV140I1zqQq/r7CCu5+MEa
 PAB1YH/96k2dMPT6mbFrRIFJmkDuBuZOAkuwYWEjO/XjPl2SGBGj1jKolWW3qjRR
 Po7vBJnDt7wgigWFh6+R4rJv+fh87XfB7B2wEOt4Yn37jUkK6dNRIy0zFmDaC1J2
 bHgsJbWC+Sgs1G57gnYABJYzLj7RRdDyCu1/UUVyBBP7/WfZJw0kjABE7p3AaYTd
 15JV1L0c/Ypuv05LJf40LkyF2F5w2fnP5QM2Rr8U4xW/GumEyWs=
 =8Hu7
 -----END PGP SIGNATURE-----

Merge tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:
 "A relatively big amount of movements in RAS-land this time around:

   - First part of a series to move the AMD address translation code
     from arch/x86/ to amd64_edac as that is its only user anyway

   - Some MCE error injection improvements to the AMD side

   - Reorganization of the #MC handler code and the facilities it calls
     to make it noinstr-safe

   - Add support for new AMD MCA bank types and non-uniform banks layout

   - The usual set of cleanups and fixes"

* tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/mce: Reduce number of machine checks taken during recovery
  x86/mce/inject: Avoid out-of-bounds write when setting flags
  x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
  x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
  x86/mce: Check regs before accessing it
  x86/mce: Mark mce_start() noinstr
  x86/mce: Mark mce_timed_out() noinstr
  x86/mce: Move the tainting outside of the noinstr region
  x86/mce: Mark mce_read_aux() noinstr
  x86/mce: Mark mce_end() noinstr
  x86/mce: Mark mce_panic() noinstr
  x86/mce: Prevent severity computation from being instrumented
  x86/mce: Allow instrumentation during task work queueing
  x86/mce: Remove noinstr annotation from mce_setup()
  x86/mce: Use mce_rdmsrl() in severity checking code
  x86/mce: Remove function-local cpus variables
  x86/mce: Do not use memset to clear the banks bitmaps
  x86/mce/inject: Set the valid bit in MCA_STATUS before error injection
  x86/mce/inject: Check if a bank is populated before injecting
  x86/mce: Get rid of cpu_missing
  ...
2022-01-10 11:43:09 -08:00
Borislav Petkov da0119a912 Merge branches 'edac-misc' and 'edac-amd64' into edac-updates-for-v5.17
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-01-10 10:07:00 +01:00
Qiuxu Zhuo c370baa328 EDAC/i10nm: Release mdev/mbase when failing to detect HBM
On systems without HBM (High Bandwidth Memory) mdev/mbase are not
released/unmapped.

Add the code to release mdev/mbase when failing to detect HBM.

[Tony: re-word commit message]

Cc: <stable@vger.kernel.org>
Fixes: c945088384 ("EDAC/i10nm: Add support for high bandwidth memory")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20211224091126.1246-1-qiuxu.zhuo@intel.com
2022-01-04 09:08:00 -08:00
Marc Bevand 0b8bf9cb14 EDAC/amd64: Add support for family 19h, models 50h-5fh
Add the new family 19h models 50h-5fh PCI IDs (device 18h functions 0
and 6) to support Ryzen 5000 APUs ("Cezanne").

Signed-off-by: Marc Bevand <m@zorinaq.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20211221233112.556927-1-m@zorinaq.com
2021-12-24 10:53:41 +01:00
Yazen Ghannam 91f75eb481 x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank
number "i" is either the same across all CPUs or is Reserved/Read-as-Zero.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     CS
    3      SMU    RAZ

Future AMD systems will lay out MCA bank types such that the type of
bank number "i" may be different across CPUs.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     NBIO
    3      SMU    RAZ

Change the structures that cache MCA bank types to be per-CPU and update
smca_get_bank_type() to handle this change.

Move some SMCA-specific structures to amd.c from mce.h, since they no
longer need to be global.

Break out the "count" for bank types from struct smca_hwid, since this
should provide a per-CPU count rather than a system-wide count.

Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The
values in this array should not change at runtime.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com
2021-12-22 17:22:09 +01:00
Yazen Ghannam 5176a93ab2 x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error
descriptions to edac_mce_amd.

The "PHY" bank types all have the same error descriptions, and the NBIF
and SHUB bank types have the same error descriptions. So reuse the same
arrays where appropriate.

  [ bp: Remove useless comments over hwid types. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.com
2021-12-22 17:19:18 +01:00
Colin Ian King 567617baac EDAC/sb_edac: Remove redundant initialization of variable rc
The variable rc is being initialized with a value that is never read, it
is being updated later on. The assignment is redundant and thus remove
it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211126221848.1125321-1-colin.i.king@gmail.com
2021-12-21 12:02:11 +01:00
Yazen Ghannam e2be5955a8 EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
Add a new family type for AMD Family 19h Models 10h to 1Fh. Use this new
family type for Models A0h to AFh also.

Increase the maximum number of controllers from 8 to 12.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208174356.1997855-3-yazen.ghannam@amd.com
2021-12-10 12:54:33 +01:00
Yazen Ghannam f957112423 EDAC: Add RDDR5 and LRDDR5 memory types
Include Registered-DDR5 and Load-Reduced DDR5 in the list of memory
types.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208174356.1997855-2-yazen.ghannam@amd.com
2021-12-10 12:51:28 +01:00
Randy Dunlap ad2c302bc6 EDAC/sifive: Fix non-kernel-doc comment
scripts/kernel-doc complains about a comment that begins with "/**"
but is not in kernel-doc format, so correct it.

Prevents this warning:

  drivers/edac/sifive_edac.c:23: warning: This comment starts with '/**', \
  but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
    * EDAC error callback

Fixes: 91abaeaaff ("EDAC/sifive: Add EDAC platform driver for SiFive SoCs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211201030913.10283-1-rdunlap@infradead.org
2021-12-05 19:54:46 +01:00
Dinh Nguyen f6bc0d8bc2 EDAC/synopsys: Enable the driver on Intel's N5X platform
Intel's N5X platform is also using the Synopsys EDAC controller.

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-3-dinguyen@kernel.org
2021-11-20 19:41:48 +01:00
Dinh Nguyen f7824ded41 EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
Add support for version 3.80a of the Synopsys DDR controller. This
version of the controller has the following differences:

- UE/CE are auto cleared
- Interrupts are supported by default

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-2-dinguyen@kernel.org
2021-11-20 19:23:49 +01:00
Dinh Nguyen bd1d6da17c EDAC/synopsys: Use the quirk for version instead of ddr version
Version 2.40a supports DDR_ECC_INTR_SUPPORT for a quirk, so use that
quirk to determine a call to setup_address_map().

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-1-dinguyen@kernel.org
2021-11-20 18:13:06 +01:00
Yazen Ghannam 70aeb807cf EDAC/amd64: Add context struct
Define an address translation context struct. This will hold values that
will be passed between multiple functions.

Save return address, Node ID, and the Instance ID number to start.
Currently, the UMC number is used as the Instance ID, but future DF
versions may use another value.

Also include a "tmp" field to use when reading registers. This is to
avoid having to define a temporary variable in multiple functions.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-5-yazen.ghannam@amd.com
2021-11-15 12:54:16 +01:00
Yazen Ghannam 448c3d6085 EDAC/amd64: Allow for DF Indirect Broadcast reads
The DF Indirect Access method allows for "Broadcast" accesses in which
case no specific instance is targeted. Add support using a reserved
instance ID of 0xFF to indicate a broadcast access. Set the FICAA
register appropriately.

Define helpers functions for instance and broadcast reads and use them
where appropriate.

Drop the "amd_" prefix since these functions are all static.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-4-yazen.ghannam@amd.com
2021-11-15 12:48:55 +01:00
Yazen Ghannam b3218ae477 x86/amd_nb, EDAC/amd64: Move DF Indirect Read to AMD64 EDAC
df_indirect_read() is used only for address translation. Move it to EDAC
along with the translation code.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-3-yazen.ghannam@amd.com
2021-11-15 12:44:47 +01:00
Yazen Ghannam 0b746e8c1e x86/MCE/AMD, EDAC/amd64: Move address translation to AMD64 EDAC
The address translation code used for current AMD systems is
non-architectural. So move it to EDAC.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-2-yazen.ghannam@amd.com
2021-11-15 12:36:32 +01:00