Commit graph

482 commits

Author SHA1 Message Date
Dan Williams
8b3b1c0dc5 tools/testing/cxl: Make mock CEDT parsing more robust
Accept any cxl_test topology device as the first argument in
cxl_chbs_context.

This is in preparation for reworking the detection of the component
registers across VH and RCH topologies. Move
mock_acpi_table_parse_cedt() beneath the definition of is_mock_port()
and use is_mock_port() instead of the explicit mock cxl_acpi device
check.

Acked-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166993043433.1882361.17651413716599606118.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:15:16 -08:00
Dan Williams
4029c32fb6 cxl/acpi: Move rescan to the workqueue
Now that the cxl_mem driver has a need to take the root device lock, the
cxl_bus_rescan() needs to run outside of the root lock context. That
need arises from RCH topologies and the locking that the cxl_mem driver
does to attach a descendant to an upstream port. In the RCH case the
lock needed is the CXL root device lock [1].
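
As a sketch of that deferral, assuming a CXL-private workqueue (cxl_bus_wq
and the helper name below are illustrative, not necessarily the exact
symbols):

    #include <linux/workqueue.h>

    static void cxl_bus_rescan_work(struct work_struct *work)
    {
            /* process context, no root device lock held here */
            bus_rescan_devices(&cxl_bus_type);
    }
    static DECLARE_WORK(rescan_work, cxl_bus_rescan_work);

    void cxl_bus_rescan(void)
    {
            /* defer instead of rescanning synchronously under the caller's locks */
            queue_work(cxl_bus_wq, &rescan_work);
    }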

Link: http://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com [1]
Tested-by: Robert Richter <rrichter@amd.com>
Link: http://lore.kernel.org/r/166993042884.1882361.5633723613683058881.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:10:20 -08:00
Dan Williams
03ff079aa6 cxl/pmem: Remove the cxl_pmem_wq and related infrastructure
Now that cxl_nvdimm and cxl_pmem_region objects are torn down
synchronously with the removal of either the bridge or an endpoint, the
cxl_pmem_wq infrastructure can be jettisoned.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993042335.1882361.17022872468068436287.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:07:22 -08:00
Dan Williams
f17b558d66 cxl/pmem: Refactor nvdimm device registration, delete the workqueue
The three objects 'struct cxl_nvdimm_bridge', 'struct cxl_nvdimm', and
'struct cxl_pmem_region' manage CXL persistent memory resources. The
bridge represents base platform resources, the nvdimm represents one or
more endpoints, and the region is a collection of nvdimms that
contribute to an assembled address range.

Their relationship is such that a region is torn down if any component
endpoints are removed. All regions and endpoints are torn down if the
foundational bridge device goes down.

A workqueue was deployed to manage these interdependencies, but it is
difficult to reason about, and fragile. A recent attempt to take the CXL
root device lock in the cxl_mem driver was reported by lockdep as
colliding with the flush_work() in the cxl_pmem flows.

Instead of the workqueue, arrange for all pmem/nvdimm devices to be torn
down immediately and hierarchically. A similar change is made to both
the 'cxl_nvdimm' and 'cxl_pmem_region' objects. For bisect-ability both
changes are made in the same patch which unfortunately makes the patch
bigger than desired.

Arrange for cxl_memdev and cxl_region to register a cxl_nvdimm and
cxl_pmem_region as a devres release action of the bridge device.
Additionally, include a devres release action of the cxl_memdev or
cxl_region device that triggers the bridge's release action if an endpoint
exits before the bridge. I.e. this allows either unplugging the bridge, or
unplugging an endpoint, to result in the same cleanup actions.

To keep the patch smaller the cleanup of the now defunct workqueue
infrastructure is saved for a follow-on patch.
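
A rough sketch of that devres arrangement (cxl_nvd_unregister() and
detach_nvd() are illustrative names, not the exact helpers):

    static void cxl_nvd_unregister(void *_cxl_nvd)
    {
            struct cxl_nvdimm *cxl_nvd = _cxl_nvd;

            device_unregister(&cxl_nvd->dev);
    }

    /* tear the cxl_nvdimm down when the bridge is removed... */
    rc = devm_add_action_or_reset(&cxl_nvb->dev, cxl_nvd_unregister, cxl_nvd);
    if (rc)
            return rc;

    /* ...and fire that same teardown early if the memdev departs first */
    return devm_add_action_or_reset(&cxlmd->dev, detach_nvd, cxl_nvd);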

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993041773.1882361.16444301376147207609.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:07:22 -08:00
Dan Williams
16d53cb0d6 cxl/region: Drop redundant pmem region release handling
Now that a cxl_nvdimm object can only experience ->remove() via an
unregistration event (because the cxl_nvdimm bind attributes are
suppressed), additional cleanups are possible.

It is already the case that the removal of a cxl_memdev object triggers
->remove() on any associated region. With that mechanism in place there
is no need for the cxl_nvdimm removal to trigger the same. Just rely on
cxl_region_detach() to tear down the whole cxl_pmem_region.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993041215.1882361.6321535567798911286.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:06:29 -08:00
Dan Williams
cb9cfff82f cxl/acpi: Simplify cxl_nvdimm_bridge probing
The 'struct cxl_nvdimm_bridge' object advertises platform CXL PMEM
resources. It coordinates with libnvdimm to attach nvdimm devices and
regions for each corresponding CXL object. That coordination is
complicated, i.e. difficult to reason about, and, as it turns out, redundant.
It is already the case that the CXL core knows how to tear down a
cxl_region when a cxl_memdev goes through ->remove(), so that pathway
can be extended to directly cleanup cxl_nvdimm and cxl_pmem_region
objects.

Towards the goal of ripping out the cxl_nvdimm_bridge state machine,
arrange for cxl_acpi to optionally pre-load the cxl_pmem driver so that
the nvdimm bridge is active synchronously with
devm_cxl_add_nvdimm_bridge(), and remove all the bind attributes for the
cxl_nvdimm* objects since the cxl root device and cxl_memdev bind
attributes are sufficient.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993040668.1882361.7450361097265836752.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 15:52:36 -08:00
Dave Jiang
452996fa07 cxl/pmem: add provider name to cxl pmem dimm attribute group
Add a provider name in order to associate the test dimm from cxl_test with
the cxl pmem device when going through sysfs for security testing.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983618174.2734609.15600031015423828810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
bd429e5355 cxl/pmem: add id attribute to CXL based nvdimm
Add an 'id' group attribute for the CXL based nvdimm object. The addition
allows ndctl to display the "unique id" for the nvdimm. The serial number for
the CXL memory device will be used for this id.

[
  {
      "dev":"nmem10",
      "id":"0x4",
      "security":"disabled"
  },
]

The id attribute is needed by the ndctl security key management to setup a
keyblob with a unique file name tied to the mem device.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983617029.2734609.8251308562882142281.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
dcedadfae2 nvdimm/cxl/pmem: Add support for master passphrase disable security command
The original nvdimm_security_ops ->disable() only supports user passphrase
for security disable. The CXL spec introduced the disabling of master
passphrase. Add a ->disable_master() callback to support this new operation
while leaving the old ->disable() mechanism alone. A "disable_master" command
is added for the sysfs attribute in order to allow the command to be issued
from userspace. ndctl will need enabling in order to utilize this new
operation.
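
In sketch form the ops table grows one entry while the existing ->disable()
path is untouched (handler names here are illustrative):

    static const struct nvdimm_security_ops __cxl_security_ops = {
            .get_flags = cxl_pmem_get_security_flags,
            .change_key = cxl_pmem_security_change_key,
            .disable = cxl_pmem_security_disable,      /* user passphrase only */
            .freeze = cxl_pmem_security_freeze,
            .unlock = cxl_pmem_security_unlock,
            .erase = cxl_pmem_security_passphrase_erase,
            .disable_master = cxl_pmem_security_disable_master, /* new */
    };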

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983616454.2734609.14204031148234398086.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
3b502e886d cxl/pmem: Add "Passphrase Secure Erase" security command support
Create callback function to support the nvdimm_security_ops() ->erase()
callback. Translate the operation to send "Passphrase Secure Erase"
security command for CXL memory device.

When the mem device is secure erased, cpu_cache_invalidate_memregion() is
called in order to invalidate all CPU caches before attempting to access
the mem device again.

See CXL 3.0 spec section 8.2.9.8.6.6 for reference.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983615293.2734609.10358657600295932156.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
2bb692f7a6 cxl/pmem: Add "Unlock" security command support
Create callback function to support the nvdimm_security_ops() ->unlock()
callback. Translate the operation to send "Unlock" security command for CXL
mem device.

When the mem device is unlocked, cpu_cache_invalidate_memregion() is called
in order to invalidate all CPU caches before attempting to access the mem
device.

See CXL rev3.0 spec section 8.2.9.8.6.4 for reference.
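
The ordering matters: unlock first, then invalidate. A condensed sketch,
where cxl_send_unlock() stands in for the driver's actual mailbox
submission path:

    /* issue the CXL "Unlock" mailbox command with the user passphrase */
    rc = cxl_send_unlock(cxlds, key_data);
    if (rc < 0)
            return rc;

    /*
     * Previously locked contents are now readable; flush stale CPU
     * cachelines before anything touches the memory again.
     */
    cpu_cache_invalidate_memregion(IORES_DESC_PERSISTENT_MEMORY);
    return 0;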

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983614167.2734609.15124543712487741176.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
a072f7b797 cxl/pmem: Add "Freeze Security State" security command support
Create callback function to support the nvdimm_security_ops() ->freeze()
callback. Translate the operation to send "Freeze Security State" security
command for CXL memory device.

See CXL rev3.0 spec section 8.2.9.8.6.5 for reference.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983613019.2734609.10645754779802492122.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
c4ef680d0b cxl/pmem: Add Disable Passphrase security command support
Create callback function to support the nvdimm_security_ops ->disable()
callback. Translate the operation to send "Disable Passphrase" security
command for CXL memory device. The operation supports disabling a
passphrase for the CXL persistent memory device. In the original
implementation of nvdimm_security_ops, this operation only supports
disabling of the user passphrase. This is because the NFIT version of
disable passphrase only supported disabling of the user passphrase. The CXL
spec allows disabling of the master passphrase as well, which
nvdimm_security_ops does not support yet. In this commit, the callback
function will only support the user passphrase.

See CXL rev3.0 spec section 8.2.9.8.6.3 for reference.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983611878.2734609.10602135274526390127.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
997469407f cxl/pmem: Add "Set Passphrase" security command support
Create callback function to support the nvdimm_security_ops ->change_key()
callback. Translate the operation to send "Set Passphrase" security command
for CXL memory device. The operation supports setting a passphrase for the
CXL persistent memory device. It also supports the changing of the
currently set passphrase. The operation allows manipulation of a user
passphrase or a master passphrase.

See CXL rev3.0 spec section 8.2.9.8.6.2 for reference.

However, the spec leaves a gap with respect to master passphrase usage. The
spec does not define any way to query whether master passphrase support is
available on the device, nor do the commands that utilize the master
passphrase return a specific error indicating that the master passphrase is
not supported. If a device does not support the master passphrase and a
command is issued with one, the error returned by the device will be
ambiguous.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983610751.2734609.4445075071552032091.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
3282811555 cxl/pmem: Introduce nvdimm_security_ops with ->get_flags() operation
Add nvdimm_security_ops support for CXL memory device with the introduction
of the ->get_flags() callback function. This is part of the "Persistent
Memory Data-at-rest Security" command set for CXL memory device support.
The ->get_flags() function provides the security state of the persistent
memory device defined by the CXL 3.0 spec section 8.2.9.8.6.1.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983609611.2734609.13231854299523325319.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-30 16:30:47 -08:00
Adam Manzanares
3b39fd6cf1 cxl: Replace HDM decoder granularity magic numbers
When reviewing the CFMWS parsing code that deals with the HDM decoders,
I noticed a couple of magic numbers. This commit replaces these magic numbers
with constants defined by the CXL 3.0 specification.

v2:
 - Change references to CXL 3.0 specification (David)
 - CXL_DECODER_MAX_GRANULARITY_ORDER -> CXL_DECODER_MAX_ENCODED_IG (Dan)

Signed-off-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20220829220249.243888-1-a.manzanares@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 12:11:27 -08:00
Robert Richter
b51d767521 cxl/acpi: Improve debug messages in cxl_acpi_probe()
In cxl_acpi_probe() the iterator bus_for_each_dev() walks through all
CXL hosts. Since all dev_*() debug messages point to the ACPI0017
device which is the CXL root for all hosts, the device information is
pointless as it is always the same device. Change this to use the host
device instead.

Also, add additional host specific information such as CXL support,
UID and CHBCR.

This is an example log:

 acpi ACPI0016:00: UID found: 4
 acpi ACPI0016:00: CHBCR found: 0x28090000000
 acpi ACPI0016:00: dport added to root0
 acpi ACPI0016:00: host-bridge: ACPI0016:00
  pci0000:7f: host supports CXL

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20221018132341.76259-6-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 12:11:27 -08:00
Robert Richter
58eef878fc cxl: Unify debug messages when calling devm_cxl_add_dport()
CXL dports are added in a couple of code paths using
devm_cxl_add_dport(). Debug messages are individually generated, but are
incomplete and inconsistent. Change this by moving its generation to
devm_cxl_add_dport(). This unifies the messages and reduces code
duplication.  Also, generate messages on failure. Use a
__devm_cxl_add_dport() wrapper to keep the readability of the error
exits.
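
A sketch of the wrapper split described above:

    struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
                                         struct device *dport_dev, int port_id,
                                         resource_size_t component_reg_phys)
    {
            struct cxl_dport *dport;

            dport = __devm_cxl_add_dport(port, dport_dev, port_id,
                                         component_reg_phys);
            if (IS_ERR(dport))
                    dev_dbg(dport_dev, "failed to add dport to %s: %ld\n",
                            dev_name(&port->dev), PTR_ERR(dport));
            else
                    dev_dbg(dport_dev, "dport added to %s\n",
                            dev_name(&port->dev));
            return dport;
    }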

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20221018132341.76259-5-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:37:08 -08:00
Robert Richter
f3cd264c4e cxl: Unify debug messages when calling devm_cxl_add_port()
CXL ports are added in a couple of code paths using devm_cxl_add_port().
Debug messages are individually generated, but are incomplete and
inconsistent. Change this by moving its generation to
devm_cxl_add_port(). This unifies the messages and reduces code
duplication.  Also, generate messages on failure. Use a
__devm_cxl_add_port() wrapper to keep the readability of the error
exits.

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20221018132341.76259-4-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:37:08 -08:00
Robert Richter
3bb80da51b cxl/core: Check physical address before mapping it in devm_cxl_iomap_block()
The physical base address of a CXL range can be invalid and is then
set to CXL_RESOURCE_NONE. In general software shall prevent such
situations, but it is hard to prove this can never happen. E.g. in
add_port_attach_ep() there is the following:

      component_reg_phys = find_component_registers(uport_dev);
      port = devm_cxl_add_port(&parent_port->dev, uport_dev,
              component_reg_phys, parent_dport);

find_component_registers() and subsequent functions (e.g.
cxl_regmap_to_base()) may return CXL_RESOURCE_NONE. But it is written
to the port without any further check in cxl_port_alloc():

      port->component_reg_phys = component_reg_phys;

It is later used directly in devm_cxl_setup_hdm() to map I/O
ranges with devm_cxl_iomap_block(). This is just one example.

Check this condition. Also, do not fail silently like an ioremap()
failure; use a WARN_ON_ONCE() for it.
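
In sketch form the guard at the top of devm_cxl_iomap_block() amounts to:

    /* reject an invalid base loudly rather than mapping it */
    if (WARN_ON_ONCE(addr == CXL_RESOURCE_NONE))
            return NULL;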

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20221018132341.76259-3-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:37:08 -08:00
Robert Richter
fa89248e66 cxl/core: Remove duplicate declaration of devm_cxl_iomap_block()
The function devm_cxl_iomap_block() is only used in the core code. It is
declared in two header files: drivers/cxl/core/core.h and drivers/cxl/cxl.h.
Remove the unused declaration in drivers/cxl/cxl.h.

Fix the build error in regs.c found by the kernel test robot by including
"core.h" there.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20221018132341.76259-2-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:37:08 -08:00
Ira Weiny
487d828d75 cxl/doe: Request exclusive DOE access
The PCIe Data Object Exchange (DOE) mailbox is a protocol run over
configuration cycles.  It assumes one initiator at a time.  While the
kernel has control of the mailbox, user space writes could interfere with
the kernel's access.

Mark DOE mailbox config space exclusive when iterated by the CXL driver.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:07:22 -08:00
Dan Williams
8f401ec1c8 cxl/region: Recycle region ids
At region creation time the next region-id is atomically cached so that
there is predictability of region device names. If that region is
destroyed and then a new one is created the region id increments. That
ends up looking like a memory leak, or is otherwise surprising that
identifiers roll forward even after destroying all previously created
regions.

Try to reuse rather than free old region ids at region release time.

While this fixes a cosmetic issue, the needlessly advancing memory
region-id gives the appearance of a memory leak, hence the "Fixes" tag,
but no "Cc: stable" tag.

Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 779dd20cfb ("cxl/region: Add region creation support")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752186062.947915.13200195701224993317.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 16:03:43 -07:00
Dan Williams
e4f6dfa9ef cxl/region: Fix 'distance' calculation with passthrough ports
When programming port decode targets, the algorithm wants to ensure that
two devices are compatible to be programmed as peers beneath a given
port. A compatible peer is a target that shares the same dport, and
where that target's interleave position also routes it to the same
dport. Compatibility is determined by the device's interleave position
being >= to distance. For example, if a given dport can only map every
Nth position then positions less than N away from the last target
programmed are incompatible.

The @distance for the host-bridge's cxl_port in a simple dual-ported
host-bridge configuration with 2 direct-attached devices is 1, i.e. an
x2 region divided by 2 dports to reach 2 region targets.

An x4 region under an x2 host-bridge would need 2 intervening switches
where the @distance at the host bridge level is 2 (x4 region divided by
2 switches to reach 4 devices).

However, the distance between peers underneath a single ported
host-bridge is always zero because there is no limit to the number of
devices that can be mapped. In other words, there are no decoders to
program in a passthrough, all descendants are mapped, and distance only
starts to matter for the intervening descendant ports of the passthrough
port.

Add tracking for the number of dports mapped to a port, and use that to
detect the passthrough case for calculating @distance.
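
Sketched against the description above (member names follow the prose, not
necessarily the final code):

    /*
     * A passthrough port maps every descendant through its single dport,
     * so peers impose no restriction on each other: distance is 0.
     * Otherwise distance is region targets divided by dports at this port.
     */
    if (parent_port->nr_dports > 1)
            distance = p->nr_targets / parent_port->nr_dports;
    else
            distance = 0;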

Cc: <stable@vger.kernel.org>
Reported-by: Bobo WL <lmw.bobo@gmail.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 16:01:24 -07:00
Dan Williams
4d07ae22e7 cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak
When a cxl_nvdimm object goes through a ->remove() event (device
physically removed, nvdimm-bridge disabled, or nvdimm device disabled),
then any associated regions must also be disabled. As highlighted by the
cxl-create-region.sh test [1], a single device may host multiple
regions, but the driver was only tracking one region at a time. This
leads to a situation where only the last enabled region per nvdimm
device is cleaned up properly. Other regions are leaked, and this also
causes cxl_memdev reference leaks.

Fix the tracking by allowing cxl_nvdimm objects to track multiple region
associations.

Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1]
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Dan Williams
0d9e734018 cxl/region: Fix cxl_region leak, cleanup targets at region delete
When a region is deleted any targets that have been previously assigned
to that region hold references to it. Trigger those references to
drop by detaching all targets at unregister_region() time.

Otherwise that region object will leak as userspace has lost the ability
to detach targets once region sysfs is torn down.

Cc: <stable@vger.kernel.org>
Fixes: b9686e8c8e ("cxl/region: Enable the assignment of endpoint decoders to regions")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183055.947915.17681995648556534844.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Dan Williams
a90accb358 cxl/region: Fix region HPA ordering validation
Some regions may not have any address space allocated. Skip them when
validating HPA order, otherwise a crash like the following may result:

 devm_cxl_add_region: cxl_acpi cxl_acpi.0: decoder3.4: created region9
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 [..]
 RIP: 0010:store_targetN+0x655/0x1740 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  kernfs_fop_write_iter+0x144/0x200
  vfs_write+0x24a/0x4d0
  ksys_write+0x69/0xf0
  do_syscall_64+0x3a/0x90

store_targetN+0x655/0x1740:
alloc_region_ref at drivers/cxl/core/region.c:676
(inlined by) cxl_port_attach_region at drivers/cxl/core/region.c:850
(inlined by) cxl_region_attach at drivers/cxl/core/region.c:1290
(inlined by) attach_target at drivers/cxl/core/region.c:1410
(inlined by) store_targetN at drivers/cxl/core/region.c:1453

Cc: <stable@vger.kernel.org>
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166752182461.947915.497032805239915067.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Yu Zhe
4f1aa35f1f cxl/pmem: Use size_add() against integer overflow
"struct_size() + n" may cause a integer overflow,
use size_add() to handle it.
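
For illustration (the struct and field names are placeholders, not the
actual call site):

    #include <linux/overflow.h>

    /* before: the "+ extra" can wrap if struct_size() is near SIZE_MAX */
    buf = kzalloc(struct_size(rec, data, nr) + extra, GFP_KERNEL);

    /* after: size_add() saturates at SIZE_MAX so the allocation fails cleanly */
    buf = kzalloc(size_add(struct_size(rec, data, nr), extra), GFP_KERNEL);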

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220927070247.23148-1-yuzhe@nfschina.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-03 11:20:46 -07:00
Vishal Verma
71ee71d7ad cxl/region: Fix decoder allocation crash
When an intermediate port's decoders have been exhausted by existing
regions, and creating a new region with the port in question in its
hierarchical path is attempted, cxl_port_attach_region() fails to find a
port decoder (as would be expected), and drops into the failure / cleanup
path.

However, during cleanup of the region reference, a sanity check attempts
to dereference the decoder, which in the above case didn't exist. This
causes a NULL pointer dereference BUG.

To fix this, refactor the decoder allocation and de-allocation into
helper routines, and in this 'free' routine, check that the decoder,
@cxld, is valid before attempting any operations on it.

Cc: <stable@vger.kernel.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Link: https://lore.kernel.org/r/20221101074100.1732003-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-01 15:33:07 -07:00
Jonathan Cameron
f010c75c05 cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA.
Writes to the device must include an offset and size as defined in
CXL 2.0 8.2.9.5.2.4 Set LSA (Opcode 4103h).

The Fixes tag is non-obvious as this code has been through several
reworks and variable renames, and wasn't in use until the addition
of the region code.

Due to a bug in QEMU CXL emulation this overrun resulted in QEMU
crashing.
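
The header being accounted for is the 8 bytes in front of the label data in
the Set LSA payload, roughly:

    struct cxl_mbox_set_lsa {
            __le32 offset;
            __le32 reserved;
            u8 data[];
    } __packed;

    /* the mailbox payload size must include the 8-byte header */
    size_t payload_size = struct_size(set_lsa, data, write_len);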

Reported-by: Bobo WL <lmw.bobo@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 60b8f17215 ("cxl/pmem: Translate NVDIMM label commands to CXL label commands")
Link: https://lore.kernel.org/r/20220815154044.24733-3-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Jonathan Cameron
2816e24b05 cxl/region: Fix null pointer dereference due to pass through decoder commit
Not all decoders have a commit callback.

The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none.  As such the CXL core creates a special pass through decoder
instance without a commit callback.

Prior to this patch, the ->commit() callback was called unconditionally.
Thus a configuration with 1 Host Bridge, 1 Root Port, 1 switch with
multiple downstream ports below which there are multiple CXL type 3
devices results in a situation where committing the region causes a null
pointer dereference.
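
The fix reduces to guarding the callback invocation, along these lines:

    /* pass-through decoders have no HDM decoder registers to program */
    if (cxld->commit)
            rc = cxld->commit(cxld);
    else
            rc = 0;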

Reported-by: Bobo WL <lmw.bobo@gmail.com>
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20220818164210.2084-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Jonathan Cameron
cf00b33058 cxl/mbox: Add a check on input payload size
A bug in the LSA code resulted in transfers slightly larger
than the mailbox size. Let us make it easier to catch similar
issues in the future by adding a low-level check.
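
Something along these lines in the mailbox submission path (field names
approximate the mailbox command and device state structures):

    /* reject commands whose input payload exceeds what the device accepts */
    if (mbox_cmd->size_in > cxlds->payload_size)
            return -E2BIG;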

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220815154044.24733-2-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Dan Williams
1cd8a2537e cxl/hdm: Fix skip allocations vs multiple pmem allocations
Vishal notes that when attempting to define a second pmem region on a
device the DPA allocation fails with a message of the form:

    decoder11.1: failed to reserve skipped space

Recall that the skip setting is used when there is a pmem allocation in
the presence of free ram DPA space. The first pmem allocation skips over
the free ram and subsequent pmem allocations do not require a skip. The
bug is that a skip is still attempted and the DPA reservation code
flags the double skip allocation conflict.

Fixes: cf880423b6 ("cxl/hdm: Add support for allocating DPA to an endpoint decoder")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973754730.1558392.15466392461645857658.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:11:38 -07:00
Dan Williams
4d8e4ea5bb cxl/region: Disallow region granularity != window granularity
The endpoint decode granularity must be <= the window granularity
otherwise capacity in the endpoints is lost in the decode. Consider an
attempt to have a region granularity of 512 with 4 devices within a
window that maps 2 host bridges at a granularity of 256 bytes:

HPA	DPA Offset	HB	Port	EP
0x0	0x0		0	0	0
0x100	0x0		1	0	2
0x200	0x100		0	0	0
0x300	0x100		1	0	2
0x400	0x200		0	1	1
0x500	0x200		1	1	3
0x600	0x300		0	1	1
0x700	0x300		1	1	3
0x800	0x400		0	0	0
0x900	0x400		1	0	2
0xA00	0x500		0	0	0
0xB00	0x500		1	0	2

Notice how endpoint0 maps HPA 0x0 and 0x200 correctly, but then at HPA
0x800 it results in DPA 0x200-0x400 being skipped.

Fix this by restricting the region granularity to be equal to the window
granularity, resulting in the following for an x4 region under an x2 window
at a granularity of 256.

HPA	DPA Offset	HB	Port	EP
0x0	0x0		0	0	0
0x100	0x0		1	0	2
0x200	0x0		0	1	1
0x300	0x0		1	1	3
0x400	0x100		0	0	0
0x500	0x100		1	0	2
0x600	0x100		0	1	1
0x700	0x100		1	1	3

Not that it ever made practical sense to support region granularity >
window granularity. The window rotates host bridges causing endpoints to
never see a consecutive stream of requests at the desired granularity
without breaks to issue cycles to the other host bridge.

Fixes: 80d10a6cee ("cxl/region: Add interleave geometry attributes")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973127171.1526540.9923273539049172976.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:04 -07:00
Dan Williams
298d44d04b cxl/region: Fix x1 interleave to greater than x1 interleave routing
In cases where the decode fans out as it traverses downstream, the
interleave granularity needs to increment to identify the port selector
bits out of the remaining address bits. For example, recall that with an
x2 parent port intereleave (IW == 1), the downstream decode for children
of those ports will either see address bit IG+8 always set, or address
bit IG+8 always clear. So if the child port needs to select a downstream
port it can only use address bits starting at IG+9 (where IG and IW are
the CXL encoded values for interleave granularity (ilog2(ig) - 8) and
ways (ilog2(iw))).

When the parent port interleave is x1 no such masking occurs and the
child port can maintain the granularity that was routed to the parent
port.

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973126583.1526540.657948655360009242.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:03 -07:00
Dan Williams
910bc55da8 cxl/region: Move HPA setup to cxl_region_attach()
A recent bug fix added the setup of the endpoint decoder interleave
geometry settings to cxl_region_attach(). Move the HPA setup there as
well to keep all endpoint decoder parameter setting in a central
location.

For symmetry, move endpoint HPA teardown to cxl_region_detach(), and for
switches move HPA setup / teardown to cxl_port_{setup,reset}_targets().

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973126020.1526540.14701949254436069807.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:03 -07:00
Dan Williams
2901c8bded cxl/region: Fix decoder interleave programming
Jonathan notes:

"Curiously interleave ways = 1 for the EPs which is obviously wrong"

...while testing the latest CXL development branch on QEMU.

It turns out the region creation process failed to program the endpoint
decoders. This was missed because the default settings of x1 at 4K
interleave still result in the region appearing to function. Jonathan
caught the bug by reverse mapping the translations that need to happen
for the QEMU support.

Link: https://lore.kernel.org/r/62e95fdf9f6e2_30440294e4@dwillia2-xfh.jf.intel.com.notmuch
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165951146336.967013.11160153960900111443.stgit@dwillia2-xfh.jf.intel.com
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:19 -07:00
Bagas Sanjaya
038e6eb803 cxl/region: describe targets and nr_targets members of cxl_region_params
Sphinx reported undescribed parameters in cxl_region_params struct:

./drivers/cxl/cxl.h:376: warning: Function parameter or member 'targets' not described in 'cxl_region_params'
./drivers/cxl/cxl.h:376: warning: Function parameter or member 'nr_targets' not described in 'cxl_region_params'

Describe these members.

Fixes: b9686e8c8e ("cxl/region: Enable the assignment of endpoint decoders to regions")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220804075448.98241-3-bagasdotme@gmail.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Bagas Sanjaya
f13da0d9c3 cxl/regions: add padding for cxl_rr_ep_add nested lists
Sphinx reported indentation warnings:

Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:732: WARNING: Unexpected indentation.
Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:733: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:735: WARNING: Unexpected indentation.

These warnings above are due to missing blank line padding in the nested list
in kernel-doc comment for cxl_rr_ep_add().

Add the paddings to fix the warnings.

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220804075448.98241-2-bagasdotme@gmail.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Carpenter
9fd2cf4d6f cxl/region: Fix IS_ERR() vs NULL check
The nvdimm_pmem_region_create() function returns NULL on error.  It does
not return error pointers.
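
The corrected error check, in outline:

    nd_region = nvdimm_pmem_region_create(cxl_nvb->nvdimm_bus, ndr_desc);
    if (!nd_region)         /* NULL on error, never an ERR_PTR() */
            return -ENOMEM;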

Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/Yuo65lq2WtfdGJ0X@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
e29a8995d6 cxl/region: Fix region reference target accounting
Dan reports:

    The error handling in cxl_port_attach_region() looks like it might
    have a similar bug.  The cxl_rr->nr_targets++; might want a --.

    That function is more complicated.

Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that
flow is not clear. Fix the bug and the clarity by separating the 'new'
region-reference case from the 'extend' region-reference case. This also
moves the host-physical-address (HPA) validation, that the HPA of a new
region being accounted to the port is greater than the HPA of all other
regions associated with the port, to alloc_region_ref().

Introduce @nr_targets_inc to track when the error exit path needs to
clean up cxl_rr->nr_targets.

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lore.kernel.org/r/165939482134.252363.1915691883146696327.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
69c9961387 cxl/region: Fix region commit uninitialized variable warning
0day robot reports:

drivers/cxl/core/region.c:196 cxl_region_decode_commit() error: uninitialized symbol 'rc'.

The re-checking of loop termination conditions to determine "success"
makes it hard to see that @rc is initialized in all cases. Remove those
to make it explicit that @rc reflects a commit error and that the rest
of logic is concerned with unwinding committed decoders.

This change potentially results in cxl_region_decode_reset() being
called with @count == 0 where it was not called before, but
cxl_region_decode_reset() treats that as a nop.

Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/165951148105.967013.14191992449932268431.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
8d42854257 cxl/region: Fix port setup uninitialized variable warnings
0day robot reports:

drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'eiw'.
drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peig'.
drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peiw'.

...which are all valid reports. Add a debug statement to consume these,
albeit unexpected, errors.

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165951147487.967013.929590444907251028.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
817b279467 cxl/region: Stop initializing interleave granularity
In preparation for a patch that validates that the region ways setting
is compatible with the granularity setting, the initial granularity
setting needs to start at zero to indicate "unset".

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/165853777484.2430596.3423921169034844397.stgit@dwillia2-xfh.jf.intel.com
[djbw: fix up unused variable]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:38:06 -07:00
Dan Williams
4d5c42a80b cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
After adding support for emulating platform firmware established DPA
reservations, the cxl-topology.sh [1] unit test started crashing with
the following signature:

 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP
 [..]
 RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  __cxl_dpa_release+0x1b/0xd0 [cxl_core]
  cxl_dpa_release+0x1d/0x30 [cxl_core]
  release_nodes+0x63/0x90
  devres_release_all+0x88/0xc0

...i.e. a use after free of a 'struct cxl_endpoint_decoder' object. This
results from the ordering of init_hdm_decoder() before add_hdm_decoder()
where, at release time, the decoder is unregistered and released before
the DPA reservation.

Fix this by extending the life of the object until all DPA reservations
have been released which also preserves platform decoder settings being
settled by the time the decoder is published in sysfs (KOBJ_ADD time).

Note that the @len == 0 case in __cxl_dpa_reserve() is avoided in
practice as this function is only called for committed decoders and new
non-zero DPA allocations.

Link: https://github.com/pmem/ndctl/blob/pending/test/cxl-topology.sh [1]
Fixes: 9c57cde0dc ("cxl/hdm: Enumerate allocated DPA")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165896020625.3546860.12390103413706292760.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:36:33 -07:00
Dan Williams
e77483055c cxl/acpi: Minimize granularity for x1 interleaves
The kernel enforces that region granularity is >= the top-level
interleave-granularity for the given CXL window. However, when the CXL
window interleave is x1, i.e. non-interleaved at the host bridge level,
then the specified granularity does not matter. Override the window
specified granularity to the CXL minimum so that any valid region
granularity is >= the root granularity.
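
A sketch of the override while parsing a CFMWS entry (the decode macros are
approximations of the ACPI parsing helpers):

    ways = CFMWS_INTERLEAVE_WAYS(cfmws);
    if (ways == 1)
            /* non-interleaved window: don't constrain region granularity */
            ig = CXL_DECODER_MIN_GRANULARITY;       /* 256 bytes */
    else
            ig = CFMWS_INTERLEAVE_GRANULARITY(cfmws);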

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com
[djbw: add CXL_DECODER_MIN_GRANULARITY per vishal]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:36:33 -07:00
Dan Williams
2bde6dbebc cxl/region: Delete 'region' attribute from root decoders
For switch and endpoint decoders the relationship of decoders to regions
is 1:1. However, for root decoders the relationship is 1:N. Also,
regions are already children of root decoders, so the 1:N relationship
is observed by walking the following glob:

    /sys/bus/cxl/devices/$decoder/region*

Hide the vestigial 'region' attribute for root decoders.
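
One way to express the hiding is the attribute group's ->is_visible() hook,
sketched here (dev_attr_region and the helper names are illustrative):

    static umode_t cxl_decoder_visible(struct kobject *kobj,
                                       struct attribute *a, int n)
    {
            struct device *dev = kobj_to_dev(kobj);

            /* root decoders enumerate regions as child devices instead */
            if (a == &dev_attr_region.attr && is_root_decoder(dev))
                    return 0;
            return a->mode;
    }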

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/165853776328.2430596.4647259305040072751.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:36:33 -07:00
Dan Williams
a53c28b6ae cxl/acpi: Autoload driver for 'cxl_acpi' test devices
In support of CXL unit tests in the ndctl project, arrange for the
cxl_acpi driver to load in response to the registration of cxl_test
devices.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165853775783.2430596.13637998086505316619.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:36:32 -07:00
Dan Carpenter
5e42bcbc3f cxl/region: decrement ->nr_targets on error in cxl_region_attach()
The ++ needs a matching -- on the cleanup path.  If the p->nr_targets
value gets to be more than 16 it leads to uninitialized data in
cxl_port_setup_targets().

drivers/cxl/core/region.c:995 cxl_port_setup_targets() error: uninitialized symbol 'eiw'.

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YuepCvUAoCtdpcoO@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 14:24:34 -07:00
Dan Carpenter
c7e3548cac cxl/region: prevent underflow in ways_to_cxl()
The "ways" variable comes from the user.  The ways_to_cxl() function
has an upper bound but it doesn't check for negatives.  Make
the "ways" variable an unsigned int to fix this bug.

Fixes: 80d10a6cee ("cxl/region: Add interleave geometry attributes")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/Yueo3NV2hFCXx1iV@kili
[djbw: fixup interleave_ways_store() to only accept unsigned input]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 12:12:33 -07:00
Dan Carpenter
88ab1dde79 cxl/region: uninitialized variable in alloc_hpa()
This should check "p->res" instead of "res" (which is uninitialized).

Fixes: 23a22cd1c9 ("cxl/region: Allocate HPA capacity to regions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/Yueor88I/DkVSOtL@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 12:12:33 -07:00
Dan Williams
04ad63f086 cxl/region: Introduce cxl_pmem_region objects
The LIBNVDIMM subsystem is a platform agnostic representation of system
NVDIMM / persistent memory resources. To date, the CXL subsystem's
interaction with LIBNVDIMM has been to register an nvdimm-bridge device
and cxl_nvdimm objects to proxy CXL capabilities into existing LIBNVDIMM
subsystem mechanics.

With regions the approach is the same. Create a new cxl_pmem_region
object to proxy CXL region details into a LIBNVDIMM definition. With
this enabling LIBNVDIMM can partition CXL persistent memory regions with
legacy namespace labels. A follow-on patch will add CXL region label and
CXL namespace label support to persist region configurations across
driver reload / system-reset events.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784340111.1758207.3036498385188290968.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-26 12:23:01 -07:00
Dan Williams
99183d26ed cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge
Be careful to only disable cxl_pmem objects related to a given
cxl_nvdimm_bridge. Otherwise, offline_nvdimm_bus() reaches across CXL
domains and disables more than is expected.

Fixes: 21083f5152 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784339569.1758207.1557084545278004577.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-26 12:23:01 -07:00
Dan Williams
8d48817df6 cxl/region: Add region driver boiler plate
The CXL region driver is responsible for routing fully formed CXL
regions to libnvdimm for persistent memory regions, to device-dax for
volatile memory regions, or for just acting as an enumeration placeholder
if the region was set up and configuration-locked by platform firmware.
In the platform-firmware-setup case the expectation is that the region is
already accounted for in the system memory map, i.e. already enabled as
"System RAM".

For now, just attach to CXL regions in the CXL_CONFIG_COMMIT state, and
take no further action.

Given this driver is just a small / simple router, include it in the
core rather than its own module.
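
In outline, the initial driver amounts to little more than (a sketch,
locking elided):

    static int cxl_region_probe(struct device *dev)
    {
            struct cxl_region *cxlr = to_cxl_region(dev);
            struct cxl_region_params *p = &cxlr->params;

            /* only fully configured, committed regions bind for now */
            if (p->state < CXL_CONFIG_COMMIT)
                    return -ENXIO;

            /* no further action yet; the region is just enumerated */
            return 0;
    }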

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-18-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-26 12:23:01 -07:00
Dan Williams
176baefb2e cxl/hdm: Commit decoder state to hardware
After all the soft validation of the region has completed, convey the
region configuration to hardware while being careful to commit decoders
in specification mandated order. In addition to programming the endpoint
decoder base-address, interleave ways and granularity, the switch
decoder target lists are also established.

While the kernel can enforce spec-mandated commit order, it can not
enforce spec-mandated reset order. For example, the kernel can't stop
someone from removing an endpoint device that is occupying decoderN in a
switch decoder where decoderN+1 is also committed. To reset decoderN,
decoderN+1 must be torn down first. That "tear down the world"
implementation is saved for a follow-on patch.

Callback operations are provided for the 'commit' and 'reset'
operations. While those callbacks may prove useful for CXL accelerators
(Type-2 devices with memory) the primary motivation is to enable a
simple way for cxl_test to intercept those operations.
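
The hooks hang off the decoder object, roughly:

    struct cxl_decoder {
            /* ...existing fields (device, hpa_range, interleave settings)... */

            /* convey soft-validated settings to the HDM decoder registers */
            int (*commit)(struct cxl_decoder *cxld);
            /* undo decoder programming, observing reverse commit order */
            int (*reset)(struct cxl_decoder *cxld);
    };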

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784338418.1758207.14659830845389904356.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:07 -07:00
Dan Williams
27b3f8d138 cxl/region: Program target lists
Once the region's interleave geometry (ways, granularity, size) is
established and all the endpoint decoder targets are assigned, the next
phase is to program all the intermediate decoders. Specifically, each
CXL switch in the path between the endpoint and its CXL host-bridge
(including the logical switch internal to the host-bridge) needs to have
its decoders programmed and the target list order assigned.

The difficulty in this implementation lies in determining which endpoint
decoder ordering combinations are valid. Consider the cxl_test case of 2
host bridges, each of those host-bridges attached to 2 switches, and
each of those switches attached to 2 endpoints for a potential 8-way
interleave. The x2 interleave at the host-bridge level requires that all
even numbered endpoint decoder positions be located on the "left" hand
side of the topology tree, and the odd numbered positions on the other.
The endpoints that are peers on the same switch need to have a position
that can be routed with a dedicated address bit per-endpoint. See
check_last_peer() for the details.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784337827.1758207.132121746122685208.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:07 -07:00
Dan Williams
384e624bb2 cxl/region: Attach endpoint decoders
CXL regions (interleave sets) are made up of a set of memory devices
where each device maps a portion of the interleave with one of its
decoders (see CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure).
As endpoint decoders are identified by a provisioning tool they can be
added to a region provided the region interleave properties are set
(way, granularity, HPA) and DPA has been assigned to the decoder.

The attach event triggers several validation checks, for example:
- is the DPA sized appropriately for the region
- is the decoder reachable via the host-bridges identified by the
  region's root decoder
- is the device already active in a different region position slot
- are there already regions with a higher HPA active on a given port
  (per CXL 2.0 8.2.5.12.20 Committing Decoder Programming)

...and the attach event affords an opportunity to collect data and
resources relevant to later programming the target lists in switch
decoders, for example:
- allocate a decoder at each cxl_port in the decode chain
- for a given switch port, how many of the region's endpoints are hosted
  through the port
- how many unique targets (next hops) does a port need to map to reach
  those endpoints

The act of reconciling this information and deploying it to the decoder
configuration is saved for a follow-on patch.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784337277.1758207.4108508181328528703.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:07 -07:00
Dan Williams
6aa41144e7 cxl/acpi: Add a host-bridge index lookup mechanism
The ACPI CXL Fixed Memory Window Structure (CFMWS) defines multiple
methods to determine which host bridge provides access to a given
endpoint relative to that device's position in the interleave. The
"Interleave Arithmetic" defines either a "standard modulo" /
round-robin algorithm, or an "xormap" based algorithm which can be defined
as a non-linear transform. Given that there are already more options
beyond "standard modulo" and that "xormap" may turn out to be ACPI CXL
specific, provide a callback for the region provisioning code to map
endpoint positions back to expected host bridge id (cxl_dport target).

For now just support the simple modulo math case and save the xormap for
a follow-on change.
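
For the modulo case the lookup is essentially the interleave position modulo
the number of targets in the window's target list; a sketch (names
approximate):

    static struct cxl_dport *cxl_hb_modulo(struct cxl_root_decoder *cxlrd,
                                           int pos)
    {
            /* host bridge serving interleave position @pos under this window */
            return cxlrd->cxlsd.target[pos % cxlrd->cxlsd.nr_targets];
    }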

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-14-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:07 -07:00
Dan Williams
b9686e8c8e cxl/region: Enable the assignment of endpoint decoders to regions
The region provisioning process involves allocating DPA to a set of
endpoint decoders, and HPA plus the region geometry to a region device.
Then the decoder is assigned to the region. At this point several
validation steps can be performed to validate that the decoder is
suitable to participate in the region.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/165784336184.1758207.16403282029203949622.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:07 -07:00
Dan Williams
23a22cd1c9 cxl/region: Allocate HPA capacity to regions
After a region's interleave parameters (ways and granularity) are set,
add a way for regions to allocate HPA (host physical address space) from
the free capacity in their parent root-decoder. The allocator for this
capacity reuses the 'struct resource' based allocator used for
CONFIG_DEVICE_PRIVATE.

Once the tuple of "ways, granularity, [uuid], and size" is set, the
region configuration transitions to the CXL_CONFIG_INTERLEAVE_ACTIVE
state which is a precursor to allowing endpoint decoders to be added to
a region.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784335630.1758207.420216490941955417.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:06 -07:00
Ben Widawsky
80d10a6cee cxl/region: Add interleave geometry attributes
Add ABI to allow the number of devices that comprise a region to be
set as well as the interleave granularity for the region.

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
[djbw: reword changelog]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-11-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:06 -07:00
Ben Widawsky
dd5ba0ebbd cxl/region: Add a 'uuid' attribute
The process of provisioning a region involves triggering the creation of
a new region object, pouring in the configuration, and then binding that
configured object to the region driver to start its operation. For
persistent memory regions the CXL specification mandates that they be
identified by a uuid. Add an ABI for userspace to specify a region's
uuid.
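
An illustrative write of the new attribute (region instance name
hypothetical):

  uuidgen > /sys/bus/cxl/devices/region0/uuid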

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
[djbw: simplify locking]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784334465.1758207.8224025435884752570.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-25 12:18:06 -07:00
Ben Widawsky
779dd20cfb cxl/region: Add region creation support
CXL 2.0 allows for dynamic provisioning of new memory regions (system
physical address resources like "System RAM" and "Persistent Memory").
Whereas DDR and PMEM resources are conveyed statically at boot, CXL
allows for assembling and instantiating new regions from the available
capacity of CXL memory expanders in the system.

Sysfs with an "echo $region_name > $create_region_attribute" interface
is chosen as the mechanism to initiate the provisioning process. This
was chosen over ioctl() and netlink() to keep the configuration
interface entirely in a pseudo-fs interface, and it was chosen over
configfs since, aside from this one creation event, the interface is
read-mostly. I.e. configfs supports cases where an object is designed to
be provisioned each boot, like an iSCSI storage target, whereas CXL
region creation is mostly for PMEM regions, which are usually created
once per lifetime of a server instance. This is an improvement over
nvdimm, which pre-created "seed" devices that tended to confuse users
looking to determine which devices are active and which are idle.

Recall that the major change that CXL brings over previous persistent
memory architectures is the ability to dynamically define new regions.
Compare that to drivers like 'nfit' where the region configuration is
statically defined by platform firmware.

Regions are created as a child of a root decoder that encompasses an
address space with constraints. When created through sysfs, the root
decoder is explicit. When created from an LSA's region structure a root
decoder will possibly need to be inferred by the driver.

Upon region creation through sysfs, a vacant region is created with a
unique name. Regions have a number of attributes that must be configured
before the region can be bound to the driver where HDM decoder
programming is completed.

An example of creating a new region:

- Allocate a new region name:
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)

- Create a new region by name:
while
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
do true; done

- Region now exists in sysfs:
stat -t /sys/bus/cxl/devices/decoder0.0/$region

- Delete the region, and name:
echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com
[djbw: simplify locking, reword changelog]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:25 -07:00
Dan Williams
7f8faf96a2 cxl/mem: Enumerate port targets before adding endpoints
The port scanning algorithm in devm_cxl_enumerate_ports() walks up the
topology and adds cxl_port objects starting from the root down to the
endpoint. When those ports are initially created they know all their
dports, but they do not know the downstream cxl_port instance that
represents the next descendant in the topology. Rework create_endpoint()
into devm_cxl_add_endpoint() that enumerates the downstream cxl_port
topology into each port's 'struct cxl_ep' record for each endpoint of
which the port is an ancestor.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-7-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:25 -07:00
Ben Widawsky
538831f1be cxl/hdm: Add sysfs attributes for interleave ways + granularity
The region provisioning flow involves selecting interleave ways +
granularity settings for a region, and then programming the decoder
topology to meet those constraints, if possible. For example, root
decoders set the minimum interleave ways + granularity for any hosted
regions.

Given decoder programming is not atomic and collisions can occur between
multiple requesting regions userspace will be responsible for conflict
resolution and it needs these attributes to make those decisions.
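
A sketch of how userspace might consult these read-only attributes
(attribute and decoder names assumed for illustration):

  cat /sys/bus/cxl/devices/decoder0.0/interleave_ways
  cat /sys/bus/cxl/devices/decoder0.0/interleave_granularity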

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784332235.1758207.7185062713652694607.stgit@dwillia2-xfh.jf.intel.com
[djbw: reword changelog, make read-only, add sysfs ABI documentation]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:25 -07:00
Dan Williams
391785859e cxl/port: Move dport tracking to an xarray
Reduce the complexity and the overhead of walking the topology to
determine endpoint connectivity to root decoder interleave
configurations.

Note that cxl_detach_ep(), after it determines that the last @ep has
departed and decides to delete the port, now needs to walk the dport
array with the device_lock() held to remove entries. Previously
list_splice_init() could be used to atomically delete all dport entries at
once and then perform entry tear down outside the lock. There is no
list_splice_init() equivalent for the xarray.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784331647.1758207.6345820282285119339.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:24 -07:00
Dan Williams
256d0e9ee4 cxl/port: Move 'cxl_ep' references to an xarray per port
In preparation for region provisioning that needs to walk the topology
by endpoints, use an xarray to record endpoint interest in a given port.
In addition to being more space and time efficient it also reduces the
complexity of the implementation by moving locking internal to the
xarray implementation. It also allows for a single cxl_ep reference to
be recorded in multiple xarrays.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-2-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:24 -07:00
Dan Williams
1b58b4cac6 cxl/port: Record parent dport when adding ports
At the time that cxl_port instances are being created, cache the dport
from the parent port that points to this new child port. This will be
useful for region provisioning when walking the tree to calculate
decoder targets, and saves rewalking the dport list after the fact to
build this information.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-1-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:24 -07:00
Dan Williams
de516b4011 cxl/port: Record dport in endpoint references
Recall that the primary role of the cxl_mem driver is to probe if the
given endpoint is connected to a CXL port topology. In that process it
walks its device ancestry to its PCI root port. If that root port is
also a CXL root port then the probe process adds cxl_port object
instances at switches in the path between the root and the endpoint. As
those cxl_port instances are added, or if a previous enumeration
attempt already created the port, a 'struct cxl_ep' instance is
registered with that port to track the endpoints interested in that
port.

At the time the cxl_ep is registered the downstream egress path from the
port to the endpoint is known. Take the opportunity to record that
information as it will be needed for dynamic programming of decoder
targets during region provisioning.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784329944.1758207.15203961796832072116.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:24 -07:00
Dan Williams
cf880423b6 cxl/hdm: Add support for allocating DPA to an endpoint decoder
The region provisioning flow will roughly follow a sequence of:

1/ Allocate DPA to a set of decoders

2/ Allocate HPA to a region

3/ Associate decoders with a region and validate that the DPA allocations
   and topologies match the parameters of the region.

For now, this change (step 1) arranges for DPA capacity to be allocated
and deleted from non-committed decoders based on the decoder's mode /
partition selection. Capacity is allocated from the lowest DPA in the
partition and any 'pmem' allocation blocks out all remaining ram
capacity in its 'skip' setting. DPA allocations are enforced in decoder
instance order. I.e. decoder N + 1 always starts at a higher DPA than
instance N, and deleting allocations must proceed from the
highest-instance allocated decoder to the lowest.
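
A sketch of driving an allocation from userspace, assuming a 'dpa_size'
attribute on endpoint decoders (names hypothetical):

  # capacity comes from the partition selected by the decoder's mode
  echo $(( 256 << 20 )) > /sys/bus/cxl/devices/decoder3.0/dpa_size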

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784329399.1758207.16732038126938632700.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:24 -07:00
Dan Williams
0c33b39352 cxl/hdm: Track next decoder to allocate
The CXL specification enforces that endpoint decoders are committed in
hw instance id order. In preparation for adding dynamic DPA allocation,
record the hw instance id in endpoint decoders, and enforce allocations
to occur in hw instance id order.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784328827.1758207.9627538529944559954.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:23 -07:00
Dan Williams
2c8669033f cxl/hdm: Add 'mode' attribute to decoder objects
Recall that the Device Physical Address (DPA) space of a CXL Memory
Expander is potentially partitioned into a volatile and persistent
portion. A decoder maps a Host Physical Address (HPA) range to a DPA
range and that translation depends on the value of all previous (lower
instance number) decoders before the current one.

In preparation for allowing dynamic provisioning of regions, decoders
need an ABI to indicate which DPA partition a decoder targets. This ABI
needs to be prepared for the possibility that some other agent committed
and locked a decoder that spans the partition boundary.

Add 'decoderX.Y/mode' to endpoint decoders that indicates which
partition 'ram' / 'pmem' the decoder targets, or 'mixed' if the decoder
currently spans the partition boundary.
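
For example (decoder instance name illustrative):

  cat /sys/bus/cxl/devices/decoder3.0/mode
  pmem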

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603881967.551046.6007594190951596439.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:23 -07:00
Dan Williams
9c57cde0dc cxl/hdm: Enumerate allocated DPA
In preparation for provisioning CXL regions, add accounting for the DPA
space consumed by existing regions / decoders. Recall, a CXL region is a
memory range composed of one or more endpoint devices contributing a
mapping of their DPA into HPA space through a decoder.

Record the DPA ranges covered by committed decoders at initial probe of
endpoint ports relative to a per-device resource tree of the DPA type
(pmem or volatile-ram).

The cxl_dpa_rwsem semaphore is introduced to globally synchronize DPA
state across all endpoints and their decoders at once. The vast majority
of DPA operations are reads as region creation is expected to be as rare
as disk partitioning and volume creation. The device_lock() for this
synchronization is specifically avoided out of concern for entangling with
sysfs attribute removal.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784327682.1758207.7914919426043855876.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 17:19:12 -07:00
Dan Williams
3bf65915ce cxl/core: Define a 'struct cxl_endpoint_decoder'
Previously the target routing specifics of switch decoders and platform
CXL window resource tracking of root decoders were factored out of
'struct cxl_decoder'. While switch decoders translate from SPA to
downstream ports, endpoint decoders translate from SPA to DPA.

This patch, 3 of 3, adds a 'struct cxl_endpoint_decoder' that tracks an
endpoint-specific Device Physical Address (DPA) resource. For now this
just defines ->dpa_res; a follow-on patch will handle requesting DPA
resource ranges from a device-DPA resource tree.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784327088.1758207.15502834501671201192.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 08:41:20 -07:00
Dan Williams
0f157c7fa1 cxl/core: Define a 'struct cxl_root_decoder'
Previously the target routing specifics of switch decoders were factored
out of 'struct cxl_decoder' into 'struct cxl_switch_decoder'.

This patch, 2 of 3, adds a 'struct cxl_root_decoder' as a superset of a
switch decoder that also tracks the associated CXL window platform
resource.

Note that the reason the resource for a given root decoder needs to be
looked up after the fact (i.e. after cxl_parse_cfmws() and
add_cxl_resource()) is because add_cxl_resource() may have merged CXL
windows in order to keep them at the top of the resource tree / decode
hierarchy.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784326541.1758207.9915663937394448341.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 08:40:47 -07:00
Dan Williams
974854ab07 cxl/acpi: Track CXL resources in iomem_resource
Recall that CXL capable address ranges, on ACPI platforms, are published
in the CEDT.CFMWS (CXL Early Discovery Table: CXL Fixed Memory Window
Structures). These windows represent both the actively mapped capacity
and the potential address space that can be dynamically assigned to a
new CXL decode configuration (region / interleave-set).

CXL endpoints, like DDR DIMMs, can be mapped at any physical address,
including 0 and legacy ranges.

There is an expectation and requirement that the /proc/iomem interface
and the iomem_resource tree in the kernel reflect the full set of
platform address ranges. I.e. that every address range that platform
firmware and bus drivers enumerate be reflected as an iomem_resource
entry. The hard requirement to do this for CXL arises from the fact that
facilities like CONFIG_DEVICE_PRIVATE expect to be able to treat empty
iomem_resource ranges as free for software to use as proxy address
space. Without CXL publishing its potential address ranges in
iomem_resource, the CONFIG_DEVICE_PRIVATE mechanism may inadvertently
steal capacity reserved for runtime provisioning of new CXL regions.

So, iomem_resource needs to know about both active and potential CXL
resource ranges. The active CXL resources might already be reflected in
iomem_resource as "System RAM". insert_resource_expand_to_fit() handles
re-parenting "System RAM" underneath a CXL window.

The "_expand_to_fit()" behavior handles cases where a CXL window is not
a strict superset of an existing entry in the iomem_resource tree. The
"_expand_to_fit()" behavior is acceptable from the perspective of
resource allocation. The expansion happens because a conflicting
resource range is already populated, which means the resource boundary
expansion does not result in any additional free CXL address space being
made available. CXL address space allocation is always bounded by the
original unexpanded address range.

However, the potential for expansion does mean that something like
walk_iomem_res_desc(IORES_DESC_CXL...) can only return fuzzy answers on
corner case platforms that cause the resource tree to expand a CXL
window resource over a range that is not decoded by CXL. This would be
an odd platform configuration, but if it becomes a problem in practice
the CXL subsystem could just publish an API that returns definitive
answers.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/165784325943.1758207.5310344844375305118.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 08:40:01 -07:00
Dan Williams
e636479e2f cxl/core: Define a 'struct cxl_switch_decoder'
Currently 'struct cxl_decoder' contains the superset of attributes
needed for all decoder types. Before more type-specific attributes are
added to the common definition, reorganize 'struct cxl_decoder' into type
specific objects.

This patch, the first of three, factors out a cxl_switch_decoder type.
See the new kdoc for what a 'struct cxl_switch_decoder' represents in a
CXL topology.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/165784325340.1758207.5064717153608954960.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21 08:34:16 -07:00
Ira Weiny
c97006046c cxl/port: Read CDAT table
The per-device CDAT data provides performance data that is relevant for
mapping which CXL devices can participate in which CXL ranges by QTG
(QoS Throttling Group) (per ECN: CXL 2.0 CEDT CFMWS & QTG_DSM) [1]. The
QTG association specified in the ECN is advisory. Until the
cxl_acpi driver grows support for invoking the QTG _DSM method the CDAT
data is only of interest to userspace that may need it for debug
purposes.

Search the DOE mailboxes available, query CDAT data, cache the data and
make it available via a sysfs binary attribute per endpoint at:

/sys/bus/cxl/devices/endpointX/CDAT

...similar to other ACPI-structured table data in
/sys/firmware/ACPI/tables. The CDAT is relative to 'struct cxl_port'
objects since switches in addition to endpoints can host a CDAT
instance. Switch CDAT support is not implemented.
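
For debug purposes the raw table can be dumped directly, e.g. (endpoint
instance name illustrative):

  hexdump -C /sys/bus/cxl/devices/endpoint2/CDAT | head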

This does not support table updates at runtime. It will always provide
whatever was there when first cached. It is also the case that table
updates are not expected outside of explicit DPA address map affecting
commands like Set Partition with the immediate flag set. Given that the
driver does not support Set Partition with the immediate flag set there
is no current need for update support.

Link: https://www.computeexpresslink.org/spec-landing [1]
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
[djbw: drop in-kernel parsing infra for now, and other minor fixups]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220719205249.566684-7-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19 15:38:05 -07:00
Ira Weiny
3eddcc9385 cxl/pci: Create PCI DOE mailbox's for memory devices
DOE mailbox objects will be needed for various mailbox communications
with each memory device.

Iterate each DOE mailbox capability and create PCI DOE mailbox objects
as found.

It is not anticipated that this is the final resting place for the
iteration of the DOE devices.  The support of switch ports will drive
this code into the PCIe side.  In this imagined architecture the CXL
port driver would then query into the PCI device for the DOE mailbox
array.

For now creating the mailboxes in the CXL port is good enough for the
endpoints.  Later PCIe ports will need to support this to support switch
ports more generically.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220719205249.566684-5-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19 15:38:04 -07:00
Dan Williams
b060edfd8c cxl/pmem: Delete unused nvdimm attribute
While there is a need to go from a LIBNVDIMM 'struct nvdimm' to a CXL
'struct cxl_nvdimm', there is no use case to go the other direction.
Likely this is a leftover from an early version of the referenced commit
before it implemented devm for releasing the created nvdimm.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-19-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-11 12:34:55 -07:00
Dan Williams
9e9e44017d cxl/hdm: Initialize decoder type for memory expander devices
Unless and until accelerator (type-2) drivers start registering for
CXL.mem mapping services from the CXL subsystem core, initialize idle
HDM decoders to the "expander" type. I.e. the only CXL devices using the
CXL core presently are those implementing the CXL 2.0 Type-3 memory
expander device class code that the cxl_pci driver claims.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-6-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10 13:41:49 -07:00
Dan Williams
ee80001083 cxl/port: Cache CXL host bridge data
Region creation has need for checking host-bridge connectivity when
adding endpoints to regions. Record, at port creation time, the
host-bridge to provide a useful shortcut from any location in the
topology to the most-significant ancestor.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220624041950.559155-4-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10 12:10:07 -07:00
Dan Williams
e7ad1bf683 tools/testing/cxl: Add partition support
In support of testing DPA allocation mechanisms in the CXL core, the
cxl_test environment needs to support establishing and retrieving the
'pmem' partition boundary.

Replace the platform_device_add_resources() method for delineating DPA
within an endpoint with an emulated DEV_SIZE amount of partitionable
capacity. Set DEV_SIZE such that an endpoint has enough capacity to
simultaneously participate in 8 distinct regions.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603887411.551046.13234212587991192347.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10 10:29:26 -07:00
Dan Williams
cc2a487870 cxl/mem: Add a debugfs version of 'iomem' for DPA, 'dpamem'
Dump the device-physical-address map for a CXL expander in /proc/iomem
style format. E.g.:

  cat /sys/kernel/debug/cxl/mem1/dpamem
  00000000-0fffffff : ram
  10000000-1fffffff : pmem

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603885318.551046.8308248564880066726.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10 10:10:30 -07:00
Dan Williams
9b99ecf5a3 cxl/debug: Move debugfs init to cxl_core_init()
In preparation for a new cxl debugfs file, move 'cxl' directory
establishment and teardown to the core and let subsequent init routines
reference that setup.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603884654.551046.4962104601691723080.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10 09:57:28 -07:00
Ben Widawsky
14e473e1a7 cxl/hdm: Require all decoders to be enumerated
In preparation for region provisioning all device decoders need to be
enumerated since DPA allocations are calculated by summing the
capacities of all decoders in a set. I.e. the programming for decoder[N]
depends on the state of decoder[N-1], so skipping over decoders that
fail to initialize prevents accurate DPA accounting.

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
[djbw: reword changelog]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603879664.551046.6863805202478861026.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 19:45:32 -07:00
Dan Williams
d3b75029f3 cxl/mem: Convert partition-info to resources
To date the per-device-partition DPA range information has only been
used for enumeration purposes. In preparation for allocating regions
from available DPA capacity, convert those ranges into DPA-type resource
trees.

With resources and the new add_dpa_res() helper some open coded end
address calculations and debug prints can be cleaned.

The 'cxlds->pmem_res' and 'cxlds->ram_res' resources are child resources
of the total-device DPA space and they in turn will host DPA allocations
from cxl_endpoint_decoder instances (tracked by cxled->dpa_res).

Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603878921.551046.8127845916514734142.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 19:43:30 -07:00
Dan Williams
419af595b1 cxl: Introduce cxl_to_{ways,granularity}
Interleave granularity and ways have CXL specification defined encodings.
Promote the conversion helpers to a common header, and use them to
replace other open-coded instances.

Force caller to consider the error case of the conversion similarly to
other conversion helpers like kstrto*().
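
For reference, the specification encodes granularity as a power-of-2
multiple of 256 bytes, roughly (illustrative shell arithmetic, not the
kernel helper itself):

  eig=2
  echo $(( 256 << eig ))   # 1024 byte interleave granularity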

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165603875016.551046.17236943065932132355.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 16:42:44 -07:00
Dan Williams
885d3bed6d cxl/core: Drop is_cxl_decoder()
This helper was only used to identify the object type for lockdep
purposes. Now that lockdep support is done with explicit lock classes,
this helper can be dropped.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603874340.551046.15491766127759244728.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 16:41:02 -07:00
Dan Williams
e50fe01e1f cxl/core: Drop ->platform_res attribute for root decoders
Root decoders are responsible for hosting the available host address
space for endpoints and regions to claim. The tracking of that available
capacity can be done in iomem_resource directly. As a result, root
decoders no longer need to host their own resource tree. The
current ->platform_res attribute was added prematurely.

Otherwise, ->hpa_range fills the role of conveying the current decode
range of the decoder.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603873619.551046.791596854070136223.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 16:23:37 -07:00
Dan Williams
e8b7ea58ab cxl/core: Rename ->decoder_range ->hpa_range
In preparation for growing a ->dpa_range attribute for endpoint
decoders, rename the current ->decoder_range to the more descriptive
->hpa_range.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603872867.551046.2170426227407458814.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 16:22:41 -07:00
Ben Widawsky
04ed37a2ba cxl/hdm: Use local hdm variable
Save a few characters and use the already initialized local variable.

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603872171.551046.913207574344536475.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 16:21:19 -07:00
Dan Williams
fe80f1ad59 cxl/port: Keep port->uport valid for the entire life of a port
The upcoming region provisioning implementation has a need to
dereference port->uport during the port unregister flow. Specifically,
endpoint decoders need to be able to lookup their corresponding memdev
via port->uport.

The existing ->dead flag was added for cases where the core was
committed to tearing down the port, but needed to drop locks before
calling device_unregister(). Reuse that flag to indicate to
delete_endpoint() that it has no "release action" work to do as
unregister_port() will handle it.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/165603871491.551046.6682199179541194356.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09 13:06:50 -07:00
Vishal Verma
e35f571890 cxl/mbox: Fix missing variable payload checks in cmd size validation
The conversion of command sizes to unsigned missed a couple of checks
against variable size payloads during command validation, which made all
variable payload commands unconditionally fail. Add the checks back using
the new CXL_VARIABLE_PAYLOAD scheme.

Fixes: 26f89535a5 ("cxl/mbox: Use type __u32 for mailbox payload sizes")
Cc: <stable@vger.kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Reported-by: Abhi Cs <abhi.cs@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20220628220109.633564-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-28 22:03:18 -07:00
Alison Schofield
8a66487506 cxl/mbox: Use __le32 in get,set_lsa mailbox structures
CXL specification defines these as little endian.

Fixes: 60b8f17215 ("cxl/pmem: Translate NVDIMM label commands to CXL label commands")
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20220225221456.1025635-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21 14:09:00 -07:00
Ben Widawsky
8ae3cebc17 cxl/core: Use is_endpoint_decoder
Save some characters and directly check decoder type rather than port
type. There's no need to check if the port is an endpoint port since, by
this point, cxl_endpoint_decoder_alloc() has a specified type.

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21 14:09:00 -07:00
Jonathan Cameron
db9a3a35d3 cxl: Fix cleanup of port devices on failure to probe driver.
The device is created, and then there is a check whether a driver
successfully bound to it. In the event the bind fails (e.g. a failure in
cxl_port_probe()) the device is left registered. When a bus rescan later
occurs, fresh devices are created, leading to multiple devices
representing the same underlying hardware. Bad things may follow, and at
the very least we have far too many devices.

Fix by ensuring autoremove is registered if device creation succeeds,
but does not depend on successful binding to a driver.

The bug was observed as a side effect of incorrect ownership in
[PATCH v9 6/9] cxl/port: Read CDAT table
but will result from any failure in cxl_port_probe().

Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220609134519.11668-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21 14:09:00 -07:00
Dan Williams
34e37b4c43 cxl/port: Enable HDM Capability after validating DVSEC Ranges
CXL memory expanders that support the CXL 2.0 memory device class code
include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
on a "mem_enable" bit being set in configuration space before either
mechanism activates. When the HDM Decoder Capability is enabled the CXL
DVSEC Range settings are ignored.

Previously, the cxl_mem driver was relying on platform-firmware to set
"mem_enable". That is an invalid assumption as there is no requirement
that platform-firmware sets the bit before the driver sees a device,
especially in hot-plug scenarios. Additionally, ACPI-platforms that
support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
Table). That table outlines the platform permissible address ranges for
CXL operation. So, there is a need for the driver to set "mem_enable",
and there is information available to determine the validity of the CXL
DVSEC Ranges.

Arrange for the driver to optionally enable the HDM Decoder Capability
if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
configuration was invalid. Be careful to only disable memory decode if
the kernel was the one to enable it. In other words, if CXL is backing
all of kernel memory at boot the device needs to maintain "mem_enable"
and "HDM Decoder enable" all the way up to handoff back to platform
firmware (e.g. ACPI S5 state entry may require CXL memory to stay
active).

Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
[dan: fix early termination of range-allowed loop]
Cc: Ariel Sibley <ariel.sibley@microchip.com>
[ariel: Memory_size must be non-zero]
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165307136375.2499769.861793697156744166.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-20 12:30:53 -07:00
Dan Williams
fcfbc93cc3 cxl/port: Reuse 'struct cxl_hdm' context for hdm init
The port driver maps component registers for port operations. Reuse that
mapping for HDM Decoder Capability setup / enable. Move
devm_cxl_setup_hdm() before cxl_hdm_decode_init() and plumb @cxlhdm
through the hdm init helpers.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291691712.1426646.14336397551571515480.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:42 -07:00
Dan Williams
5e5f4ad52f cxl/port: Move endpoint HDM Decoder Capability init to port driver
The responsibility for establishing HDM Decoder Capability based
operation is more closely tied to port enabling than memdev enabling
which is concerned with port enumeration. This later enables reusing
@cxlhdm for probing / controlling "global enable" for the HDM Decoder
Capability. For now, just do the nominal move.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291691167.1426646.7936109077255288258.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
92804edb11 cxl/pci: Drop @info argument to cxl_hdm_decode_init()
Now that nothing external to cxl_hdm_decode_init() considers
'struct cxl_endpoint_dvsec_info', move it internal to
cxl_hdm_decode_init().

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291690612.1426646.7866084245521113414.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
a12562bb70 cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
In preparation for changing how the driver handles 'mem_enable' in the CXL
DVSEC control register, merge the contents of cxl_hdm_decode_init() into
cxl_dvsec_ranges() and rename the combined function cxl_hdm_decode_init().
The possible cleanups and fixes that result from this merge are saved for a
follow-on change.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165291690027.1426646.10249756632415633752.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
dd2d42ad6f cxl/mem: Skip range enumeration if mem_enable clear
When a device does not have mem_enable set, the current range
settings are moot. Skip the enumeration and cause cxl_hdm_decode_init()
to proceed directly to enable the HDM Decoder Capability.

Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291689442.1426646.18012291761753694336.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
14d7887407 cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
In preparation for fixing the setting of the 'mem_enabled' bit in CXL
DVSEC Control register, move all CXL DVSEC range enumeration into the
same source file.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
2e4ba0ec97 cxl/pci: Move cxl_await_media_ready() to the core
Allow cxl_await_media_ready() to be mocked for testing purposes rather
than carrying the maintenance burden of an indirect function call in the
mainline driver.

With the move cxl_await_media_ready() can no longer reuse the mailbox
timeout override, so add a media_ready_timeout module parameter to the
core to backfill.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
75b7ae2999 cxl/mem: Validate port connectivity before dvsec ranges
In preparation for validating DVSEC ranges against the platform declared
CXL memory ranges (ACPI CFMWS) move port enumeration before the
endpoint's decoder validation. Ultimately this logic will move to the
port driver, but create a bisect point before that larger move.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291687749.1426646.18091538443879226995.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
76a4121e86 cxl/mem: Fix cxl_mem_probe() error exit
The addition of cxl_mem_active() broke error exit scenarios for
cxl_mem_probe(). Return early rather than proceed with disabling
suspend, and update the label name since it is no longer a terminal
"out" label that exits the function.

Fixes: 9ea4dcf498 ("PM: CXL: Disable suspend")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291687176.1426646.15449254938752532784.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
Dan Williams
194d5edadf cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
A check of mem_info_valid already happens in __cxl_dvsec_ranges(). Rely
on that instead of calling wait_for_valid() again.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686632.1426646.7479581732894574486.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:40 -07:00
Dan Williams
1e14c9fbb5 cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
Now that wait_for_media() does nothing supplemental to
wait_for_media_ready() just promote wait_for_media_ready() to a common
helper and drop wait_for_media().

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686046.1426646.4390664747934592185.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:40 -07:00
Dan Williams
2bcf3bbd34 cxl/mem: Drop mem_enabled check from wait_for_media()
Media ready is asserted by the device independent of whether mem_enabled
was ever set. Drop this check to allow for dropping wait_for_media() in
favor of ->wait_media_ready().

Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291685501.1426646.10372821863672431074.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:40 -07:00
Dan Williams
38a34e1076 cxl: Drop cxl_device_lock()
Now that all CXL subsystem locking is validated with custom lock
classes, there is no need for the custom usage of the lockdep_mutex.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 14:01:55 -07:00
Dan Williams
d864b8ea64 cxl/acpi: Add root device lockdep validation
The CXL "root" device, ACPI0017, is an attach point for coordinating
platform level CXL resources and is the parent device for a CXL port
topology tree. As such it has distinct locking rules relative to other
CXL subsystem objects, but because it is an ACPI device the lock class
is established well before it is given to the cxl_acpi driver.

However, the lockdep API does support changing the lock class "live" for
situations like this. Add a device_lock_set_class() helper that a driver
can use in ->probe() to set a custom lock class, and
device_lock_reset_class() to return to the default "no validate" class
before the custom lock class key goes out of scope after ->remove().

Note the helpers are all macros to support dead code elimination in the
CONFIG_PROVE_LOCKING=n case; however, device_lock_set_class() still needs
#ifdef CONFIG_PROVE_LOCKING since lockdep_match_class() explicitly does
not have a helper in the CONFIG_PROVE_LOCKING=n case (see comment in
lockdep.h). The lockdep API needs 2 small tweaks to prevent "unused"
warnings for the @key argument to lock_set_class(), and a new
lock_set_novalidate_class() is added to supplement
lockdep_set_novalidate_class() in the cases where the lock class is
converted while the lock is held.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165100081305.1528964.11138612430659737238.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 14:01:54 -07:00
Dan Williams
3750d01318 cxl: Replace lockdep_mutex with local lock classes
In response to an attempt to expand dev->lockdep_mutex for device_lock()
validation [1], Peter points out [2] that the lockdep API already has
the ability to assign a dedicated lock class per subsystem device-type.

Use lockdep_set_class() to override the default device_lock()
'__lockdep_no_validate__' class for each CXL subsystem device-type. This
enables lockdep to detect deadlocks and recursive locking within the
device-driver core and the subsystem. The
lockdep_set_class_and_subclass() API is used for port objects that
recursively lock the 'cxl_port_key' class by hierarchical topology
depth.

Link: https://lore.kernel.org/r/164982968798.684294.15817853329823976469.stgit@dwillia2-desk3.amr.corp.intel.com [1]
Link: https://lore.kernel.org/r/Ylf0dewci8myLvoW@hirez.programming.kicks-ass.net [2]
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165055519317.3745911.7342499516839702840.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 14:01:54 -07:00
Dan Carpenter
35e01667c8 cxl/mbox: fix logical vs bitwise typo
This should be bitwise & instead of &&.

Fixes: 6179045ccc ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YmpgkbbQ1Yxu36uO@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 10:18:31 -07:00
Alison Schofield
280302f0e8 cxl/mbox: Replace NULL check with IS_ERR() after vmemdup_user()
vmemdup_user() returns an ERR_PTR() on failure. Use IS_ERR()
to check the return value.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220407010915.1211258-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22 16:12:04 -07:00
Alison Schofield
26f89535a5 cxl/mbox: Use type __u32 for mailbox payload sizes
Payload sizes for mailbox commands are expected to be positive values
coming from userspace. The documentation correctly describes these as
always unsigned values. The mailbox and send structures that support
the mailbox commands however, use __s32 types for the payloads.

Replace __s32 with __u32 in the mailbox and send command structures
and update usages.

Kernel users of the interface already block all negative values and
there is no known ability for userspace to have grown a dependency on
submitting negative values to the kernel. The known user of the IOCTL,
the CXL command line interface (cxl-cli) already enforces positive
size values.

A Smatch warning about signedness uncovered this issue.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20220414051246.1244575-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22 16:12:04 -07:00
Dan Williams
9ea4dcf498 PM: CXL: Disable suspend
The CXL specification claims S3 support at a hardware level, but at a
system software level there are some missing pieces. Section 9.4 (CXL
2.0) rightly claims that "CXL mem adapters may need aux power to retain
memory context across S3", but there is no enumeration mechanism for the
OS to determine if a given adapter has that support. Moreover the save
state and resume image for the system may inadvertently end up in a CXL
device that needs to be restored before the save state is recoverable.
I.e. a circular dependency that is not resolvable without a third party
save-area.

Arrange for the cxl_mem driver to fail S3 attempts. This still nominally
allows for suspend, but requires unbinding all CXL memory devices before
the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind
flow is intended to also tear down all CXL memory regions associated
with a given cxl_memdev.

It is reasonable to assume that any device participating in a System RAM
range published in the EFI memory map is covered by aux power and
save-area outside the device itself. So this restriction can be
minimized in the future once pre-existing region enumeration support
arrives, and perhaps a spec update to clarify if the EFI memory map is
sufficient for determining the range of devices managed by
platform-firmware for S3 support.

Per Rafael, if the CXL configuration prevents suspend then it should
fail early before tasks are frozen, and mem_sleep should stop showing
'mem' as an option [1]. Effectively CXL augments the platform suspend
->valid() op since, for example, the ACPI ops are not aware of the CXL /
PCI dependencies. Given the split role of platform firmware vs OS
provisioned CXL memory it is up to the cxl_mem driver to determine if
the CXL configuration has elements that platform firmware may not be
prepared to restore.
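
One way to observe the effect from userspace (standard PM sysfs path;
output and behavior as described above, shown here only as a sketch):

  cat /sys/power/mem_sleep
  # with cxl_mem bound, suspend-to-RAM should no longer be offered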

Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1]
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22 16:09:42 -07:00
Dan Williams
35ee1f4990 cxl/mem: Replace redundant debug message with a comment
cxl_mem_probe() already emits a log message when HDM operation can not
be established. Delete the similar one in cxl_hdm_decode_init().

What is less obvious is why global_ctrl being enabled makes positive
values of info->ranges irrelevant, and the Linux behavior with respect
to the spec recommendation to mirror CXL Range registers with HDM
Decoder Base + Size registers.

Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164944616743.454665.7055846627973202403.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Dan Williams
31e624a77e cxl/mem: Rename cxl_dvsec_decode_init() to cxl_hdm_decode_init()
cxl_dvsec_decode_init() is tasked with checking whether legacy DVSEC
range based decode is in effect, or whether HDM can be enabled / already
is enabled. As such it either succeeds or fails and that result is the
return value. The @do_hdm_init variable is misleading in the case where
HDM operation is already found to be active, so just call it @retval.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730736435.3806189.2537160791687837469.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Dan Williams
36bfc6ad50 cxl/pci: Make cxl_dvsec_ranges() failure not fatal to cxl_pci
cxl_dvsec_ranges(), the helper for enumerating the presence of an active
legacy CXL.mem configuration on a CXL 2.0 Memory Expander, is not fatal
for cxl_pci because there is still value to enable mailbox operations
even if CXL.mem operation is disabled. Recall that the reason cxl_pci
does this initialization and not cxl_mem is to preserve the useful
property (for unit testing) that cxl_mem is cxl_memdev + mmio generic,
and does not require access to a 'struct pci_dev' to issue config
cycles.

Update 'struct cxl_endpoint_dvsec_info' to carry either a positive
number of non-zero size legacy CXL DVSEC ranges, or the negative error
code from __cxl_dvsec_ranges() in its @ranges member.

Reported-by: Krzysztof Zach <krzysztof.zach@intel.com>
Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730735869.3806189.4032428192652531946.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Dan Williams
fbaf2b079d cxl/mem: Make cxl_dvsec_range() init failure fatal
In preparation for the cxl_pci driver to continue operation after
cxl_dvsec_range() failure, update cxl_mem to check for negative error
codes in info->ranges. Treat that condition as fatal regardless of the
state of the HDM configuration since cxl_mem needs positive confirmation
that legacy ranges were not established by platform firmware or another
agent.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730735324.3806189.4167509857771192422.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Dan Williams
e39f9be08d cxl/pci: Add debug for DVSEC range init failures
In preparation for not treating DVSEC range initialization failures as
fatal to cxl_pci_probe() add individual dev_dbg() statements for each of
the major failure reasons in cxl_dvsec_ranges().

The rationale for cxl_dvsec_ranges() failure not being fatal is that
there is still value for cxl_pci to enable mailbox operations even if
CXL.mem operation is disabled.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730734812.3806189.2726330688692684104.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Dan Williams
e08063fb87 cxl/mem: Drop DVSEC vs EFI Memory Map sanity check
When the driver finds legacy DVSEC ranges active on a CXL Memory
Expander it indicates that platform firmware is not aware of, or is
deliberately disabling, common CXL 2.0 operation. In this case Linux
generally has no choice, but to leave the device alone.

The driver attempts to validate that the DVSEC range is in the EFI
memory map. Remove that logic since there is no requirement that the
BIOS publish DVSEC ranges in the EFI Memory Map.

In the future the driver will want to permanently reserve this capacity
out of the available CFMWS capacity and hide it from
request_free_mem_region(), but it serves no purpose to warn about the
range not appearing in the EFI Memory Map.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164730734246.3806189.13995924771963139898.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:57 -07:00
Davidlohr Bueso
c43e036d6f cxl/mbox: Use new return_code handling
Use the global cxl_mbox_cmd_rc table to improve debug messaging
in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd()
to map to proper kernel-style errno codes - this patch
continues to use -ENXIO only, so there is no change in semantics.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-5-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:02 -07:00
Davidlohr Bueso
92fcc1abab cxl/mbox: Improve handling of mbox_cmd hw return codes
Upon a completed command the caller is still expected to check
the actual return_code register to ensure it succeeded. This
adds, per the spec, the potential command return codes. It maps
the hardware return codes to the kernel's errno style, and by
default continues to use -ENXIO (Command completed, but device
reported an error).
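
For illustration, a minimal sketch of such a table; the enum values,
field names, and helper below are assumptions for the example rather
than the driver's exact identifiers:

  enum {                              /* hardware return codes (subset) */
      CXL_MBOX_CMD_RC_SUCCESS = 0,
      CXL_MBOX_CMD_RC_BACKGROUND,
      CXL_MBOX_CMD_RC_INPUT,
      CXL_MBOX_CMD_RC_UNSUPPORTED,
      CXL_MBOX_CMD_RC_INTERNAL,
  };

  static const struct {
      int err;                        /* kernel-style errno */
      const char *desc;               /* text for debug messages */
  } cxl_mbox_cmd_rctable[] = {
      [CXL_MBOX_CMD_RC_SUCCESS]     = { 0, "success" },
      [CXL_MBOX_CMD_RC_BACKGROUND]  = { -ENXIO, "background cmd started" },
      [CXL_MBOX_CMD_RC_INPUT]       = { -ENXIO, "cmd input was invalid" },
      [CXL_MBOX_CMD_RC_UNSUPPORTED] = { -ENXIO, "cmd is not supported" },
      [CXL_MBOX_CMD_RC_INTERNAL]    = { -ENXIO, "internal device error" },
  };

  static const char *cxl_mbox_cmd_rc2str(u16 return_code)
  {
      if (return_code >= ARRAY_SIZE(cxl_mbox_cmd_rctable))
          return "unknown return code";
      return cxl_mbox_cmd_rctable[return_code].desc;
  }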

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-4-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:02 -07:00
Davidlohr Bueso
cbe83a2052 cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
Also mention the need for the caller to check against any
errors from the hardware in return_code.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-3-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Davidlohr Bueso
ee92c7e261 cxl/mbox: Drop mbox_mutex comment
... we have lockdep for this.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-2-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
6aa657f416 cxl/pmem: Remove CXL SET_PARTITION_INFO from exclusive_cmds list
With SET_PARTITION_INFO on the exclusive_cmds list for the CXL_PMEM
driver, userspace cannot execute a set-partition command without
first unbinding the pmem driver from the device.

When userspace requests a partition change to take effect on the
next reboot this unbind requirement is unnecessarily restrictive.
The driver does not need to enforce an unbind because partitions
will not change until the next reboot. Of course, userspace still
needs to be aware that changing the size of persistent capacity
on the next reboot will result in the loss of stored data. That
can happen regardless of whether the pmem driver is bound at the
time the set-partition command is issued.

When userspace requests a partition change to take effect immediately,
restrictions are needed. The CXL_MEM driver currently blocks the usage
of immediate mode, making the presence of SET_PARTITION_INFO in this
exclusive commands list redundant.

In the future, when the CXL_MEM driver adds support for immediate
changes to device partitions it will ensure that the partition change
will not affect any active decode. That means the work will not fall
right back here, onto the CXL_PMEM driver.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/accc6abc878f0662093b81490a1a052f2ff6f06e.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
6179045ccc cxl/mbox: Block immediate mode in SET_PARTITION_INFO command
User space may send the SET_PARTITION_INFO mailbox command using
the IOCTL interface. Inspect the input payload and fail if the
immediate flag is set.

This is the first instance of the driver inspecting an input payload
from user space. Assume there will be more such cases and implement
with an extensible helper.

In order for the kernel to react to an immediate partition change it
needs to assert that the change will not affect any active decode. At
a minimum this requires validating that the device is using HDM
decoders instead of the CXL DVSEC for decode, and that none of the
active HDM decoders are affected by the partition change. For now,
just fail until that support arrives.
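
An illustrative sketch of that payload inspection; the structure layout
follows the Set Partition Info input payload, but the names here are
assumptions for the example:

  struct cxl_mbox_set_partition_info {
      __le64 volatile_capacity;       /* in CXL capacity multiples */
      u8 flags;
  } __packed;

  #define CXL_SET_PARTITION_IMMEDIATE_FLAG BIT(0)

  /* extensible hook: return false to reject a user payload */
  static bool cxl_payload_from_user_allowed(u16 opcode, void *payload_in)
  {
      switch (opcode) {
      case CXL_MBOX_OP_SET_PARTITION_INFO: {
          struct cxl_mbox_set_partition_info *pi = payload_in;

          /* no support yet for immediate partition changes */
          if (pi->flags & CXL_SET_PARTITION_IMMEDIATE_FLAG)
              return false;
          break;
      }
      default:
          break;
      }
      return true;
  }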

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/241821186c363833980adbc389e2c547bc5a6395.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
2dd5600a0e cxl/mbox: Move cxl_mem_command param to a local variable
cxl_validate_command_from_user() is now the single point of validation
for mailbox commands coming from user space. Previously, it returned
a cxl_mem_command, but that was not sufficient when validation of the
actual mailbox command became a requirement. Now, it returns a fully
validated cxl_mbox_cmd.

Remove the extraneous cxl_mem_command parameter. Define and use a
local version only.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/c11a437896d914daf36f5ac8ec62f999c5ec2da7.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
d97fe8eec2 cxl/mbox: Make handle_mailbox_cmd_from_user() use a mbox param
Previously, handle_mailbox_cmd_from_user() constructed the mailbox
command and dispatched it to the hardware. The construction work
has moved to the validation path.

handle_mailbox_cmd_from_user() now expects a fully validated
mbox param. Make its caller, cxl_send_cmd(), deliver it. Update
the comments and dereferencing of the new mbox parameter.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/77050ba512d6c30eccf7505467509e460dd325a0.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
82b8ba2953 cxl/mbox: Remove dependency on cxl_mem_command for a debug msg
In preparation for removing access to struct cxl_mem_command,
change this debug message to use cxl_mbox_cmd fields instead.
Retrieve the pretty command name from cxl_mbox_cmd using a new
opcode to command name helper.
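
A sketch of such a helper; cxl_mem_find_command() and the
cxl_command_names[] table are assumed names for the lookup
infrastructure it builds on:

  static const char *cxl_mem_opcode_to_name(u16 opcode)
  {
      struct cxl_mem_command *c;

      c = cxl_mem_find_command(opcode);
      if (!c)
          return NULL;

      return cxl_command_names[c->info.id].name;
  }

  /* debug message keyed off the mbox command rather than cxl_mem_command */
  dev_dbg(dev, "Submitting %s command for user\n",
          cxl_mem_opcode_to_name(mbox_cmd->opcode));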

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/57265751d336a6e95f5ca31a9c77189408b05742.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Alison Schofield
9ae016aeb7 cxl/mbox: Construct a users cxl_mbox_cmd in the validation path
This is a step in refactoring the handling of user space mailbox
commands. The intent is to have all the validation work originate
in cxl_validate_cmd_from_user().

Move the construction and validation of a mailbox command to the
validation path. Continue to pass both the out_cmd and the mbox_cmd
until handle_mbox_cmd_from_user() learns how to use a mbox_cmd param.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/c9fbdad968a2b619f9108bb6c37cef1a853cdf5a.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:00 -07:00
Alison Schofield
63cf60b7e0 cxl/mbox: Move build of user mailbox cmd to a helper functions
In preparation for moving the construction of a mailbox command
to the validation path, extract the work into helper functions.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/493d7618a846d787c3ae28778935ca35e2b85eed.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:00 -07:00
Alison Schofield
39ed8da4f3 cxl/mbox: Move raw command warning to raw command validation
This move serves two purposes: 1) Emit the warning in the raw
command validation path, and 2) Remove the dependency on the
struct cxl_mem_command in handle_mailbox_cmd_from_user() in
preparation for a refactor of that function.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/df5f0e0ec8afa1f75299aa86b4226ab4479ef325.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:00 -07:00
Alison Schofield
6dd0e5cc87 cxl/mbox: Move cxl_mem_command construction to helper funcs
Sanitizing and constructing a cxl_mem_command from a userspace
command is part of the validation process prior to submitting
the command to a CXL device. Move this work to helper functions:
cxl_to_mem_cmd(), cxl_to_mem_cmd_raw().

This declutters cxl_validate_cmd_from_user() in preparation for
adding new validation steps.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/7d9b826f29262e3a484cb4bb7b63872134d60bd7.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:00 -07:00
Dan Williams
d28820419c cxl/pci: Drop shadowed variable
0day reports that wait_for_media_ready() declares an @rc variable twice.

>> drivers/cxl/pci.c:439:7: warning: Local variable 'rc' shadows outer variable [shadowVariable]
     int rc;
         ^
   drivers/cxl/pci.c:431:6: note: Shadowed declaration
    int rc, i;
        ^
   drivers/cxl/pci.c:439:7: note: Shadow variable
     int rc;
         ^

Cc: Randy Dunlap <rdunlap@infradead.org>
Fixes: 523e594d9c ("cxl/pci: Implement wait for media active")
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/164944636936.455177.14136200464724208233.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-08 12:59:43 -07:00
Wan Jiabing
05e815539f cxl/core/port: Fix NULL but dereferenced coccicheck error
Fix the following coccicheck warning:
./drivers/cxl/core/port.c:913:21-24: ERROR: port is NULL but dereferenced.

The put_device() is only relevant in the is_cxl_root() case.

Fixes: 2703c16c75 ("cxl/core/port: Add switch port enumeration")
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Link: https://lore.kernel.org/r/20220307094158.404882-1-wanjiabing@vivo.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-22 10:51:17 -07:00
Dan Williams
74be98774d cxl/port: Hold port reference until decoder release
KASAN + DEBUG_KOBJECT_RELEASE reports a potential use-after-free in
cxl_decoder_release() where it goes to reference its parent, a cxl_port,
to free its id back to port->decoder_ida.

 BUG: KASAN: use-after-free in to_cxl_port+0x18/0x90 [cxl_core]
 Read of size 8 at addr ffff888119270908 by task kworker/35:2/379

 CPU: 35 PID: 379 Comm: kworker/35:2 Tainted: G           OE     5.17.0-rc2+ #198
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 Workqueue: events kobject_delayed_cleanup
 Call Trace:
  <TASK>
  dump_stack_lvl+0x59/0x73
  print_address_description.constprop.0+0x1f/0x150
  ? to_cxl_port+0x18/0x90 [cxl_core]
  kasan_report.cold+0x83/0xdf
  ? to_cxl_port+0x18/0x90 [cxl_core]
  to_cxl_port+0x18/0x90 [cxl_core]
  cxl_decoder_release+0x2a/0x60 [cxl_core]
  device_release+0x5f/0x100
  kobject_cleanup+0x80/0x1c0

The device core only guarantees parent lifetime until all children are
unregistered. If a child needs a parent to complete its ->release()
callback that child needs to hold a reference to extend the lifetime of
the parent.
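
A sketch of the resulting pattern with standard driver-core reference
counting (simplified from the decoder code):

  static void cxl_decoder_release(struct device *dev)
  {
      struct cxl_decoder *cxld = to_cxl_decoder(dev);
      struct cxl_port *port = to_cxl_port(dev->parent);

      ida_free(&port->decoder_ida, cxld->id);
      kfree(cxld);
      put_device(&port->dev);   /* drop the reference taken at alloc time */
  }

  /* ...and in cxl_decoder_alloc(): pin the port for the decoder's lifetime */
  get_device(&port->dev);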

Fixes: 40ba17afdf ("cxl/acpi: Introduce cxl_decoder objects")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Tested-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164505751190.4175768.13324905271463416712.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-17 16:51:13 -08:00
Dan Williams
41ae9105f5 cxl/port: Fix endpoint refcount leak
An endpoint can be unregistered via two paths. Either its parent port is
unregistered, or the memdev that registered the endpoint is removed. The
memdev remove path is responsible for synchronizing against the parent
->remove() event and if the memdev remove path wins, manually trigger
unregister_port() via devm_release_action(). Until that race is resolved
the memdev remove path holds a reference on the endpoint.

If the parent port for the endpoint can not be found, that is an
indication that the endpoint has already been unregistered. Be sure to
drop the reference in all exit paths from delete_endpoint().

Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Link: https://lore.kernel.org/r/164454148209.3429624.12905500880311609053.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-17 16:50:30 -08:00
Dan Williams
e6e17cc6ed cxl/core: Fix cxl_device_lock() class detection
If cxl_device_lock() is used on a non-CXL device the expectation is that
the lock class will fall back to CXL_ANON_LOCK. Instead it crashes when
trying to determine if the device is a 'decoder'. Specifically when the
device has a NULL type pointer. Just check for NULL before
de-referencing ->release.

Fixes: 3c5b903955 ("cxl: Prove CXL locking")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164439225406.2941117.3927102269866914339.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-11 13:27:18 -08:00
Dan Williams
5c3c067b60 cxl/core/port: Fix unregister_port() lock assertion
The device_lock_assert() in unregister_port() fails to pick the right
device leading to splats like the following from:

echo "ACPI0017:00" > /sys/bus/platform/drivers/cxl_acpi/unbind

 WARNING: CPU: 32 PID: 1147 at include/linux/device.h:787 unregister_port+0x49/0x50 [cxl_c
 [..]
 RIP: 0010:unregister_port+0x49/0x50 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  release_nodes+0x63/0x80
  devres_release_all+0x8b/0xc0
  __device_release_driver+0x190/0x240
  device_driver_detach+0x3e/0xa0
  unbind_store+0x113/0x130

Fix it up to assert on the device_lock() for ACPI0017 for root and 1st
level ports, and parent ports for all the rest.

Fixes: 54cdbf845c ("cxl/port: Add a driver for 'struct cxl_port' objects")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164439224893.2941117.18331456248117887720.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-11 13:26:49 -08:00
Jonathan Cameron
74b0fe8040 cxl/regs: Fix size of CXL Capability Header Register
In CXL 2.0, 8.2.5.1 CXL Capability Header Register: this register
is given as 32 bits.

8.2.3 which covers the CXL 2.0 Component registers, including the
CXL Capability Header Register states that access restrictions
specified in Section 8.2.2 apply.

8.2.2 includes:
* A 32 bit register shall be accessed as a 4 Byte quantity.
...
If these rules are not followed, the behavior is undefined.

Discovered during review of CXL QEMU emulation. Alex Bennée pointed
out there was a comment saying that 4 byte registers must be read
with a 4 byte read, but 8 byte reads were being emulated.

https://lore.kernel.org/qemu-devel/87bkzyd3c7.fsf@linaro.org/

Fixing that led to this code failing. Whilst a given hardware
implementation 'might' work with an 8 byte read, it should not be relied
upon. The QEMU emulation v5 will return 0 and log the wrong access width.

The code has since moved, so there is one Fixes tag for where this change
directly applies, and also a reference to the earlier introduction of the
code for backports.
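
A sketch of the corrected access width; the register offset and field
macros are illustrative names:

  /* 32 bit register: must be a 4 byte access, not an 8 byte readq() */
  u32 cap_hdr = readl(base + CXL_CM_CAP_HDR_OFFSET);

  if (FIELD_GET(CXL_CM_CAP_HDR_ID_MASK, cap_hdr) != CM_CAP_HDR_CAP_ID)
      return;   /* not a component register capability header */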

Fixes: 0f06157e01 ("cxl/core: Move register mapping infrastructure")
Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20220201153437.2873-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 23:15:33 -08:00
Dan Williams
7004cc9d15 cxl/core/port: Handle invalid decoders
In case init_hdm_decoder() finds invalid settings, skip to the next
valid decoder. Only fail port enumeration if zero valid decoders are
found. This protects the driver init against broken hardware and / or
future interleave capabilities.
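
The policy amounts to: skip any decoder whose settings fail validation,
and only error out if nothing usable was found. A hedged sketch (loop
bounds and the add helper are assumptions):

  int committed = 0, i;

  for (i = 0; i < decoder_count; i++) {
      if (init_hdm_decoder(port, cxld, i))
          continue;             /* invalid settings: skip this decoder */
      if (!add_hdm_decoder(port, cxld))
          committed++;
  }

  if (!committed)
      return -ENXIO;            /* zero valid decoders: fail enumeration */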

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164317464918.3438644.12371149695618136198.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 23:15:10 -08:00
Dan Williams
0909b4e528 cxl/core/port: Fix / relax decoder target enumeration
If the decoder is not presently active the target_list may not be
accurate. Perform a best effort mapping and assume that it will be fixed
up when the decoder is enabled.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164317464406.3438644.6609329492458460242.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 23:15:10 -08:00
Ben Widawsky
9b71e1c9c3 cxl/core/port: Add endpoint decoders
Recall that a CXL Port is any object that publishes a CXL HDM Decoder
Capability structure. That is, Host Bridges and Switches that have been
enabled so far. Now, add decoder support to the 'endpoint' CXL Ports
registered by the cxl_mem driver. They mostly share the same enumeration
as Bridges and Switches, but without a target list. The target of
endpoint decode is device-internal DPA space, not another downstream
port.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: clarify changelog, hookup enumeration in the port driver]
Link: https://lore.kernel.org/r/164386092069.765089.14895687988217608642.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Dan Williams
8aea0ef19f cxl/core: Move target_list out of base decoder attributes
In preparation for introducing endpoint decoder objects, move the
target_list attribute out of the common set since it has no meaning for
endpoint decoders.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298430100.3018233.4715072508880290970.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Ben Widawsky
8dd2bc0f8e cxl/mem: Add the cxl_mem driver
At this point the subsystem can enumerate all CXL ports (CXL.mem decode
resources in upstream switch ports and host bridges) in a system. The
last mile is connecting those ports to endpoints.

The cxl_mem driver connects an endpoint device to the platform CXL.mem
protocol decode-topology. At ->probe() time it walks its
device-topology-ancestry and adds a CXL Port object at every Upstream
Port hop until it gets to CXL root. The CXL root object is only present
after a platform firmware driver registers platform CXL resources. For
ACPI based platform this is managed by the ACPI0017 device and the
cxl_acpi driver.

The ports are registered such that disabling a given port automatically
unregisters all descendant ports, and the chain can only be registered
after the root is established.

Given ACPI device scanning may run asynchronously compared to PCI device
scanning, the root driver is tasked with rescanning the bus after the
root successfully probes.

Conversely, if any port in the chain between the root and an endpoint
becomes disconnected it subsequently triggers the endpoint to
unregister. Given lock dependencies the endpoint unregistration happens
in a workqueue asynchronously. If userspace cares about synchronizing
delayed work after port events the /sys/bus/cxl/flush attribute is
available for that purpose.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: clarify changelog, rework hotplug support]
Link: https://lore.kernel.org/r/164398782997.903003.9725273241627693186.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Dan Williams
2703c16c75 cxl/core/port: Add switch port enumeration
So far the platform level CXL resources have been enumerated by the
cxl_acpi driver, and cxl_pci has gathered all the pre-requisite
information it needs to fire up a cxl_mem driver. However, the first
thing the cxl_mem driver will be tasked to do is validate that all the
PCIe Switches in its ancestry also have CXL capabilities and a CXL.mem
link established.

Provide a common mechanism for a CXL.mem endpoint driver to enumerate
all the ancestor CXL ports in the topology and validate CXL.mem
connectivity.

Multiple endpoints may end up racing to establish a shared port in the
topology. This race is resolved via taking the device-lock on a parent
CXL Port before establishing a new child. The winner of the race
establishes the port; the loser simply registers its interest in the
port via a 'struct cxl_ep' place-holder reference.

At endpoint teardown the same parent port lock is taken as 'struct
cxl_ep' references are deleted. Last endpoint to drop its reference
unregisters the port.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164398731146.902644.1029761300481366248.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Dan Williams
cf1f6877b0 cxl/memdev: Add numa_node attribute
While CXL memory targets will have their own memory target node,
individual memory devices may be affinitized like other PCI devices.
Emit that attribute for memdevs.
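
The attribute is a thin wrapper over the device's generic NUMA
affinity; a sketch:

  static ssize_t numa_node_show(struct device *dev,
                                struct device_attribute *attr, char *buf)
  {
      return sysfs_emit(buf, "%d\n", dev_to_node(dev));
  }
  static DEVICE_ATTR_RO(numa_node);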

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298428430.3018233.16409089892707993289.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Dan Williams
bcc79ea343 cxl/pci: Emit device serial number
Per the CXL specification (8.1.12.2 Memory Device PCIe Capabilities and
Extended Capabilities) the Device Serial Number capability is mandatory.
Emit it for user tooling to identify devices.

It is reasonable to ask whether the attribute should be added to the
list of PCI sysfs device attributes. The PCI layer can optionally emit
it too, but the CXL subsystem is aiming to preserve its independence and
the possibility of CXL topologies with non-PCI devices in it. To date
that has only proven useful for the 'cxl_test' model, but as can be seen
with ACPI0016 devices, sometimes all that is needed is a
platform firmware table to point to CXL Component Registers in MMIO
space to define a "CXL" device.
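
A sketch, assuming the PCI core's pci_get_dsn() helper is used to read
the capability and a 'serial' field in the device state caches it:

  /* at cxl_pci probe time */
  cxlds->serial = pci_get_dsn(pdev);

  /* memdev sysfs attribute */
  static ssize_t serial_show(struct device *dev,
                             struct device_attribute *attr, char *buf)
  {
      struct cxl_memdev *cxlmd = to_cxl_memdev(dev);

      return sysfs_emit(buf, "%#llx\n", cxlmd->cxlds->serial);
  }
  static DEVICE_ATTR_RO(serial);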

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164366608838.196598.16856227191534267098.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:32 -08:00
Ben Widawsky
523e594d9c cxl/pci: Implement wait for media active
CXL 2.0 8.1.3.8.2 states:

  Memory_Active: When set, indicates that the CXL Range 1 memory is
  fully initialized and available for software use. Must be set within
  Range 1 Memory_Active_Timeout of deassertion of reset to CXL device
  if CXL.mem HwInit Mode=1

Unfortunately, Memory_Active can take quite a long time depending on
media size (up to 256s per 2.0 spec). Provide a callback for the
eventual establishment of CXL.mem operations via the 'cxl_mem' driver
to the 'struct cxl_memdev'. The implementation waits for 60s by default for
now and can be overridden by the mbox_ready_time module parameter.
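
A hedged sketch of that wait; the DVSEC register offset and bit macros
are assumptions for illustration:

  static int wait_for_media_ready(struct cxl_dev_state *cxlds)
  {
      struct pci_dev *pdev = to_pci_dev(cxlds->dev);
      int d = cxlds->cxl_dvsec;
      int i;

      for (i = mbox_ready_time; i; i--) {
          u32 temp;
          int rc;

          rc = pci_read_config_dword(pdev,
                                     d + CXL_DVSEC_RANGE_SIZE_LOW(0), &temp);
          if (rc)
              return rc;
          if (temp & CXL_DVSEC_MEM_ACTIVE)
              return 0;         /* Memory_Active is set */
          msleep(1000);
      }

      return -ETIMEDOUT;
  }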

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: switch to sleeping wait]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298427373.3018233.9309741847039301834.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:31 -08:00
Ben Widawsky
560f785590 cxl/pci: Retrieve CXL DVSEC memory info
Before CXL 2.0 HDM Decoder Capability mechanisms can be utilized in a
device the driver must determine that the device is ready for CXL.mem
operation and that platform firmware, or some other agent, has
established an active decode via the legacy CXL 1.1 decoder mechanism.

This legacy mechanism is defined in the CXL DVSEC as a set of range
registers and status bits that take time to settle after a reset.

Validate the CXL memory decode setup via the DVSEC and cache it for
later consideration by the cxl_mem driver (to be added). Failure to
validate is not fatal to the cxl_pci driver since that is only providing
CXL command support over PCI.mmio, and might be needed to rectify CXL
DVSEC validation problems.

Any potential ranges that the device is already claiming via DVSEC need
to be reconciled with the dynamic provisioning ranges provided by
platform firmware (like ACPI CEDT.CFMWS). Leave that reconciliation to
the cxl_mem driver.

[djbw: shorten defines]
[djbw: change precise spin wait to generous msleep]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: clarify changelog]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164375911821.559935.7375160041663453400.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:31 -08:00
Ben Widawsky
06e279e5eb cxl/pci: Cache device DVSEC offset
The PCIe device DVSEC, defined in the CXL 2.0 spec, 8.1.3 is required to
be implemented by CXL 2.0 endpoint devices. In preparation for consuming
this information in a new cxl_mem driver, retrieve the CXL DVSEC
position and warn about the implications of not finding it. Allow for
mailbox operation even if the CXL DVSEC is missing.
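
A sketch of the probe-time lookup; the constant names for the CXL
vendor ID and the PCIe Device DVSEC ID are assumptions:

  cxlds->cxl_dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
                                               CXL_DVSEC_PCIE_DEVICE);
  if (!cxlds->cxl_dvsec)
      dev_warn(&pdev->dev,
               "Device DVSEC not present, skipping CXL.mem init\n");
  /* ...mailbox setup proceeds regardless */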

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164375309615.513620.7874131241128599893.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:31 -08:00
Ben Widawsky
4112a08dd3 cxl/pci: Store component register base in cxlds
In preparation for defining a cxl_port object to represent the decoder
resources of a memory expander capture the component register base
address.

The port driver uses the component register base to enumerate the HDM
Decoder Capability structure. Unlike other cxl_port objects the endpoint
port decodes from upstream SPA to downstream DPA rather than upstream
port to downstream port.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: clarify changelog]
Link: https://lore.kernel.org/r/164375084181.484304.3919737667590006795.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
664bf11583 cxl/core/port: Remove @host argument for dport + decoder enumeration
Now that dport and decoder enumeration is centralized in the port
driver, the @host argument for these helpers can be made implicit. For
the root port the host is the port's uport device (ACPI0017 for
cxl_acpi), and for all other descendant ports the devm context is the
parent of @port.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164375043390.484143.17617734732003230076.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Ben Widawsky
54cdbf845c cxl/port: Add a driver for 'struct cxl_port' objects
The need for a CXL port driver and a dedicated cxl_bus_type is driven by
a need to simultaneously support 2 independent physical memory decode
domains (cache coherent CXL.mem and uncached PCI.mmio) that also
intersect at a single PCIe device node. A CXL Port is a device that
advertises a CXL Component Register block with an "HDM Decoder
Capability Structure".

From Documentation/driver-api/cxl/memory-devices.rst:

    Similar to how a RAID driver takes disk objects and assembles them into
    a new logical device, the CXL subsystem is tasked to take PCIe and ACPI
    objects and assemble them into a CXL.mem decode topology. The need for
    runtime configuration of the CXL.mem topology is also similar to RAID in
    that different environments with the same hardware configuration may
    decide to assemble the topology in contrasting ways. One may choose
    performance (RAID0) striping memory across multiple Host Bridges and
    endpoints while another may opt for fault tolerance and disable any
    striping in the CXL.mem topology.

The port driver identifies whether an endpoint Memory Expander is
connected to a CXL topology. If an active (bound to the 'cxl_port'
driver) CXL Port is not found at every PCIe Switch Upstream port and an
active "root" CXL Port then the device is just a plain PCIe endpoint
only capable of participating in PCI.mmio and DMA cycles, not CXL.mem
coherent interleave sets.

The 'cxl_port' driver lets the CXL subsystem leverage driver-core
infrastructure for setup and teardown of register resources and
communicating device activation status to userspace. The cxl_bus_type
can rendezvous the async arrival of platform level CXL resources (via
the 'cxl_acpi' driver) with the asynchronous enumeration of Memory
Expander endpoints, while also implementing a hierarchical locking model
independent of the associated 'struct pci_dev' locking model. The
locking for dport and decoder enumeration is now handled in the core
rather than callers.

For now the port driver only enumerates and registers CXL resources
(downstream port metadata and decoder resources); later it will be used
to take action on its decoders in response to CXL.mem region
provisioning requests.

Note1: cxlpci.h has long depended on pci.h, but port.c was the first to
not include pci.h. Carry that dependency in cxlpci.h.

Note2: cxl port enumeration and probing complicates CXL subsystem init
to the point that it helps to have centralized debug logging of probe
events in cxl_bus_probe().

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/164374948116.464348.1772618057599155408.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
83fbdbe4c1 cxl/core: Emit modalias for CXL devices
In order to enable libkmod lookups for CXL device objects to their
corresponding module, add 'modalias' to the base attribute of CXL
devices.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298424120.3018233.15611905873808708542.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
d17d0540a0 cxl/core/hdm: Add CXL standard decoder enumeration to the core
Unlike the decoder enumeration for "root decoders" described by platform
firmware, standard decoders can be enumerated from the component
registers space once the base address has been identified (via PCI,
ACPI, or another mechanism).

Add common infrastructure for HDM (Host-managed-Device-Memory) Decoder
enumeration and share it between host-bridge, upstream switch port, and
cxl_test defined decoders.

The locking model for switch level decoders is to hold the port lock
over the enumeration. This facilitates moving the dport and decoder
enumeration to a 'port' driver. For now, the only enumerator of decoder
resources is the cxl_acpi root driver.

Co-developed-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164374688404.395335.9239248252443123526.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
98d2d3a264 cxl/core: Generalize dport enumeration in the core
The core houses infrastructure for decoder resources. A CXL port's
dports are more closely related to decoder infrastructure than topology
enumeration. Implement generic PCI based dport enumeration in the core,
i.e. arrange for existing root port enumeration from cxl_acpi to share
code with switch port enumeration which just amounts to a small
difference in a pci_walk_bus() invocation once the appropriate 'struct
pci_bus' has been retrieved.

Set the convention that decoder objects are registered after all dports
are enumerated. This enables userspace to know when the CXL core is
finished establishing 'dportX' links underneath the 'portX' object.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164368114191.354031.5270501846455462665.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
af9cae9fac cxl/pci: Rename pci.h to cxlpci.h
Similar to the mem.h rename, if the core wants to reuse definitions from
drivers/cxl/pci.h it is unable to use <pci.h> as that collides with
archs that have an arch/$arch/include/asm/pci.h, like MIPS.

Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298422510.3018233.14693126572756675563.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:30 -08:00
Dan Williams
c978f1b10a cxl/port: Up-level cxl_add_dport() locking requirements to the caller
In preparation for moving dport enumeration into the core, require the
port device lock to be acquired by the caller.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164367759016.324231.105551648350470000.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
a46cfc0f01 cxl/pmem: Introduce a find_cxl_root() helper
In preparation for switch port enumeration while also preserving the
potential for multi-domain / multi-root CXL topologies, introduce a
'struct device' generic mechanism for retrieving a root CXL port, if one
is registered. Note that the only known multi-domain CXL configurations
are running the cxl_test unit test on a system that also publishes an
ACPI0017 device.

With this in hand the nvdimm-bridge lookup can be done with
device_find_child() instead of bus_find_device() + custom mocked lookup
infrastructure in cxl_test.

The mechanism looks for a 2nd level port since the root level topology
is platform-firmware specific and the 2nd level down follows standard
PCIe topology expectations. The cxl_acpi 2nd level is associated with a
PCIe Root Port.

Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164367562182.225521.9488555616768096049.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
5ff7316f6f cxl/port: Introduce cxl_port_to_pci_bus()
Add a helper for converting a PCI enumerated cxl_port into the pci_bus
that hosts its dports. For switch ports this is trivial, but for root
ports there is no generic way to go from a platform defined host bridge
device, like ACPI0016 to its corresponding pci_bus. Rather than spill
ACPI goop outside of the cxl_acpi driver, just arrange for it to
register an xarray translation from the uport device to the
corresponding pci_bus.

This is in preparation for centralizing dport enumeration in the core.
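
A sketch of the translation; the xarray name is an assumption:

  struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port)
  {
      /* switch port: the uport is a PCI device, use its secondary bus */
      if (dev_is_pci(port->uport))
          return to_pci_dev(port->uport)->subordinate;

      /* root port: cxl_acpi registered a uport -> pci_bus entry at probe */
      return xa_load(&cxl_root_buses, (unsigned long)port->uport);
  }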

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164364745633.85488.9744017377155103992.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
86c8ea0f3b cxl/core/port: Use dedicated lock for decoder target list
Lockdep reports:

 ======================================================
 WARNING: possible circular locking dependency detected
 5.16.0-rc1+ #142 Tainted: G           OE
 ------------------------------------------------------
 cxl/1220 is trying to acquire lock:
 ffff979b85475460 (kn->active#144){++++}-{0:0}, at: __kernfs_remove+0x1ab/0x1e0

 but task is already holding lock:
 ffff979b87ab38e8 (&dev->lockdep_mutex#2/4){+.+.}-{3:3}, at: cxl_remove_ep+0x50c/0x5c0 [cxl_core]

...where cxl_remove_ep() is a helper that wants to delete ports while
holding a lock on the host device for that port. That sets up a lockdep
violation whereby target_list_show() can not rely on holding the decoder's
device lock while walking the target_list. Switch to a dedicated seqlock
for this purpose.
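
A sketch of the reader side with a dedicated seqlock; the emit helper
is assumed:

  static ssize_t target_list_show(struct device *dev,
                                  struct device_attribute *attr, char *buf)
  {
      struct cxl_decoder *cxld = to_cxl_decoder(dev);
      unsigned int seq;
      ssize_t offset;

      do {
          seq = read_seqbegin(&cxld->target_lock);
          offset = emit_target_list(cxld, buf);
      } while (read_seqretry(&cxld->target_lock, seq));

      return offset;
  }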

Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164367209095.208169.1171673319121271280.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
3c5b903955 cxl: Prove CXL locking
When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets
an additional mutex that is not clobbered by
lockdep_set_novalidate_class() like the typical device_lock(). This
allows for local annotation of subsystem locks with mutex_lock_nested()
per the subsystem's object/lock hierarchy. For CXL, this primarily needs
the ability to lock ports by depth and child objects of ports by their
parent port's lock.
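
A hedged sketch of the depth-annotated locking on that
CONFIG_PROVE_LOCKING-only mutex; the lock-class names are assumptions:

  static inline void cxl_nested_lock(struct device *dev)
  {
      if (is_cxl_port(dev)) {
          struct cxl_port *port = to_cxl_port(dev);

          /* shallower ports may be held over deeper ports */
          mutex_lock_nested(&dev->lockdep_mutex,
                            CXL_PORT_LOCK + port->depth);
      } else {
          mutex_lock_nested(&dev->lockdep_mutex, CXL_ANON_LOCK);
      }
  }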

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Ben Widawsky
53fa1bff34 cxl/core: Track port depth
In preparation for proving CXL subsystem usage of the device_lock()
order, track the depth of ports with the expectation that shallower port
locks can be held over deeper port locks.
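
The bookkeeping itself is small; a sketch of deriving the depth at port
creation:

  /* the root port is at depth 0; every child port is one level deeper */
  port->depth = parent_port ? parent_port->depth + 1 : 0;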

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298419321.3018233.4469731547378993606.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Ben Widawsky
d2b61ed2ff cxl/core/port: Make passthrough decoder init implicit
Unused CXL decoders, or ports which use a passthrough decoder (no HDM
decoder registers), are expected to be initialized in a specific way.
Since upcoming drivers will want the same initialization, and it was
already a requirement to have consumers of the API configure the decoder
specific to their needs, initialize to this passthrough state by
default.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298418778.3018233.13573986275832546547.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
d621bc2e72 cxl/core: Fix cxl_probe_component_regs() error message
Fix a '\n' vs '/n' typo.

Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298418268.3018233.17790073375430834911.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky
d54c1bbe2d cxl/core/port: Clarify decoder creation
Add wrappers for the creation of decoder objects at the root level and
switch level, and keep the core helper private to cxl/core/port.c. Root
decoders are static descriptors conveyed from platform firmware (e.g.
ACPI CFMWS). Switch decoders are CXL standard decoders enumerated via
the HDM decoder capability structure. The base address for the HDM
decoder capability structure may be conveyed either by PCIe or platform
firmware (ACPI CEDT.CHBS).

Additionally, the kdoc descriptions for these helpers and their
dependencies are updated.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: fixup changelog, clarify kdoc]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164366463014.111117.9714595404002687111.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky
608135db1b cxl/core: Convert decoder range to resource
CXL decoders manage address ranges in a hierarchical fashion whereby a
leaf is a unique subregion of its parent decoder (midlevel or root). It
therefore makes sense to use the resource API for handling this.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v1)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298417191.3018233.5201055578165414714.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Dan Williams
c3bca8d4bb cxl/decoder: Hide physical address information from non-root
Just like /proc/iomem, CXL physical address information is reserved for
root only.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298416650.3018233.450720006145238709.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Dan Williams
0ff0af1821 cxl/core/port: Rename bus.c to port.c
Given it is dominated by port infrastructure, and will only acquire
more, rename bus.c to port.c.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298416136.3018233.15442880970000855425.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky
c57cae78bf cxl: Introduce module_cxl_driver
Many CXL drivers simply want to register and unregister themselves.
module_driver already supported this. A simple wrapper around that
reduces a decent amount of boilerplate in upcoming patches.
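
The wrapper reduces to a single macro around module_driver(); a sketch:

  #define module_cxl_driver(__cxl_driver) \
          module_driver(__cxl_driver, cxl_driver_register, cxl_driver_unregister)

  /* usage in a driver that only registers and unregisters itself */
  module_cxl_driver(cxl_port_driver);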

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298415591.3018233.13608495220547681412.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky
303ebc1b17 cxl/acpi: Map component registers for Root Ports
This implements the TODO in cxl_acpi for mapping component registers.
cxl_acpi becomes the second consumer of CXL register block enumeration
(cxl_pci being the first). Moving the functionality to cxl_core allows
both of these drivers to share it. Equally importantly, it allows
cxl_core itself to use the functionality in the future.

CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
distinction being they're a part of the CXL 2.0 host bridge. While
mapping their component registers is not immediately useful for the CXL
drivers, the movement of register block enumeration into core is a vital
step towards HDM decoder programming.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: fix cxl_regmap_to_base() failure cases]
Link: https://lore.kernel.org/r/164298415080.3018233.14694957480228676592.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky
8baa787b93 cxl/pci: Add new DVSEC definitions
In preparation for properly supporting memory active timeout, and later
on, other attributes obtained from DVSEC fields, add the full list of
DVSEC identifiers from the CXL 2.0 specification.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v1)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298414567.3018233.12005290051592771878.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:27 -08:00
Ben Widawsky
46c6ad2762 cxl: Flesh out register names
Get a better naming scheme in place for upcoming additions. By dropping
redundant usages of CXL and DVSEC where appropriate we can get more
concise and also more grepable defines.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298414022.3018233.15522855498759815097.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:27 -08:00
Dan Williams
4f195ee73a cxl/pci: Defer mailbox status checks to command timeouts
Device status can change without warning at any point in time. This
effectively means that no amount of status checking before a command is
submitted can guarantee that the device is not in an error condition
when the command is later submitted. The clearest signal that a device
is not able to process commands is if it fails to process commands.

With the above understanding in hand, update cxl_pci_setup_mailbox() to
validate the readiness of the mailbox once at the beginning of time, and
then use timeouts and busy sequencing errors as the only occasions to
report status.

Just as before, unless and until the driver gains a reset recovery path,
doorbell clearing failures by the device are fatal to mailbox
operations.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298413480.3018233.9643395389297971819.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:27 -08:00
Ben Widawsky
229e8828c2 cxl/pci: Implement Interface Ready Timeout
The original driver implementation used the doorbell timeout for the
Mailbox Interface Ready bit to piggyback off of, since the latter does
not have a defined timeout. This functionality, introduced in commit
8adaf747c9 ("cxl/mem: Find device capabilities"), needs improvement as
the recent "Add Mailbox Ready Time" ECN timeout indicates that the
mailbox ready time can be significantly longer that 2 seconds.

While the specification limits the maximum timeout to 256s, the cxl_pci
driver gives up on the mailbox after 60s. This value corresponds with
important timeout values already present in the kernel. A module
parameter is provided as an emergency override and represents the
default Linux policy for all devices.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: add modparam, drop check_device_status()]
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/164367306565.208548.1932299464604450843.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:27 -08:00
Ben Widawsky
68cdd3d2af cxl: Rename CXL_MEM to CXL_PCI
The cxl_mem module was renamed cxl_pci in commit 21e9f76733 ("cxl:
Rename mem to pci"). In preparation for adding an ancillary driver for
cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead and
rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Free up the CXL_MEM name for
that new driver to manage CXL.mem endpoint operations.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298412409.3018233.12407355692407890752.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:27 -08:00
Nathan Chancellor
be185c2988 cxl/core: Remove cxld_const_init in cxl_decoder_alloc()
Commit 48667f6761 ("cxl/core: Split decoder setup into alloc + add")
aimed to fix a large stack frame warning but from v5 to v6, it
introduced a new instance of the warning due to allocating
cxld_const_init on the stack, which was done due to the use of const on
the nr_target member of the cxl_decoder struct. With ARCH=arm
allmodconfig minus CONFIG_KASAN:

GCC 11.2.0:

drivers/cxl/core/bus.c: In function ‘cxl_decoder_alloc’:
drivers/cxl/core/bus.c:523:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
  523 | }
      | ^
cc1: all warnings being treated as errors

Clang 12.0.1:

drivers/cxl/core/bus.c:486:21: error: stack frame size of 1056 bytes in function 'cxl_decoder_alloc' [-Werror,-Wframe-larger-than=]
struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
                    ^
1 error generated.

Revert that part of the change, which makes the stack frame of
cxl_decoder_alloc() much more reasonable.

Fixes: 48667f6761 ("cxl/core: Split decoder setup into alloc + add")
Link: https://github.com/ClangBuiltLinux/linux/issues/1539
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211210213627.2477370-1-nathan@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-01-04 17:29:31 -08:00
Dan Williams
53989fad12 cxl/pmem: Fix module reload vs workqueue state
A test of the form:

    while true; do modprobe -r cxl_pmem; modprobe cxl_pmem; done

May lead to a crash signature of the form:

    BUG: unable to handle page fault for address: ffffffffc0660030
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    [..]
    Workqueue: cxl_pmem 0xffffffffc0660030
    RIP: 0010:0xffffffffc0660030
    Code: Unable to access opcode bytes at RIP 0xffffffffc0660006.
    [..]
    Call Trace:
     ? process_one_work+0x4ec/0x9c0
     ? pwq_dec_nr_in_flight+0x100/0x100
     ? rwlock_bug.part.0+0x60/0x60
     ? worker_thread+0x2eb/0x700

In that report the 0xffffffffc0660030 address corresponds to the former
function address of cxl_nvb_update_state() from a previous load of the
module, not the current address. Fix that by arranging for ->state_work
in the 'struct cxl_nvdimm_bridge' object to be reinitialized on cxl_pmem
module reload.

Details:

Recall that the CXL subsystem wants to link a CXL memory expander device to
an NVDIMM sub-hierarchy when both a persistent memory range has been
registered by the CXL platform driver (cxl_acpi) *and* when that CXL
memory expander has published persistent memory capacity (Get Partition
Info). To this end the cxl_nvdimm_bridge driver arranges to rescan the
CXL bus when either of those conditions changes. The helper
bus_rescan_devices() can not be called underneath the device_lock() for
any device on that bus, so the cxl_nvdimm_bridge driver uses a workqueue
for the rescan.

Typically a driver allocates driver data to hold a 'struct work_struct'
for a driven device, but for a workqueue that may run after ->remove()
returns, driver data will have been freed. The 'struct
cxl_nvdimm_bridge' object holds the state and work_struct directly.
Unfortunately it was only arranging for that infrastructure to be
initialized once per device creation rather than the necessary once per
workqueue (cxl_pmem_wq) creation.

Introduce is_cxl_nvdimm_bridge() and cxl_nvdimm_bridge_reset() in
support of invalidating stale references to a recently destroyed
cxl_pmem_wq.

Cc: <stable@vger.kernel.org>
Fixes: 8fdcb1704f ("cxl/pmem: Add initial infrastructure for pmem support")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/163665474585.3505991.8397182770066720755.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Alison Schofield
fd49f99c18 ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT
During NUMA init, CXL memory defined in the SRAT Memory Affinity
subtable may be assigned to a NUMA node. Since there is no
requirement that the SRAT be comprehensive for CXL memory another
mechanism is needed to assign NUMA nodes to CXL memory not identified
in the SRAT.

Use the CXL Fixed Memory Window Structure (CFMWS) of the ACPI CXL
Early Discovery Table (CEDT) to find all CXL memory ranges.
Create a NUMA node for each CFMWS that is not already assigned to
a NUMA node. Add a memblk attaching its host physical address
range to the node.

Note that these ranges may not actually map any memory at boot time.
They may describe persistent capacity or may be present to enable
hot-plug.

Consumers can use phys_to_target_node() to discover the NUMA node.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/163553711933.2509508.2203471175679990.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Dan Williams
814dff9ae2 cxl/test: Mock acpi_table_parse_cedt()
Now that cxl_acpi has been converted to use the core ACPI CEDT sub-table
parser, update cxl_test to inject CFMWS and CHBS data directly into
cxl_acpi's handlers.

Cc: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/163553711363.2509508.17428994087868269952.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Dan Williams
f4ce1f766f cxl/acpi: Convert CFMWS parsing to ACPI sub-table helpers
The cxl_acpi driver originally open-coded its table parsing since the
ACPI subtable helpers were marked __init and only used in early NUMA
initialization. Now that those helpers have been exported for driver
usage, replace the open-coded solution with the common one.

Cc: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/163553710810.2509508.14686373989517930921.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:59 -08:00
Ira Weiny
a91bd78967 cxl/memdev: Remove unused cxlmd field
This field was left over when the connection between the cxl_memdev and
cxl_mem was tighter. It is no longer set nor used, so remove it. [1]

Link: https://lore.kernel.org/r/CAPcyv4hcgh2gb8qsS_UXTBSGqYfMPnC6p5kkvNUjm+V6kVKM5g@mail.gmail.com/

Suggested-by: Jonathan.Cameron@huawei.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211103234857.3689354-1-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:59 -08:00
Dan Williams
affec78274 cxl/core: Convert to EXPORT_SYMBOL_NS_GPL
It turns out that the usb example of specifying the subsystem namespace
at build time is not preferred. The rationale for that preference has
become more apparent as CXL patches with plain EXPORT_SYMBOL_GPL beg the
question, "why would any code other than CXL care about this symbol?".
Make the namespace explicit.
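
A sketch of the explicit form, with the matching import on the consumer
side:

  /* in drivers/cxl/core/: export into the CXL symbol namespace */
  EXPORT_SYMBOL_NS_GPL(cxl_driver_register, CXL);

  /* in a consumer module such as cxl_pci */
  MODULE_IMPORT_NS(CXL);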

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163676356810.3618264.601632777702192938.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:59 -08:00
Ira Weiny
5e2411ae80 cxl/memdev: Change cxl_mem to a more descriptive name
The 'struct cxl_mem' object actually represents the state of a CXL
device within the driver. Comments indicating that 'struct cxl_mem' is a
device itself are incorrect. It is data layered on top of a CXL Memory
Expander class device. Rename it 'struct cxl_dev_state'. The 'struct
cxl_memdev' structure represents a Linux CXL memory device object, and
it uses services and information provided by 'struct cxl_dev_state'.

Update the structure name, function names, and the kdocs to reflect the
real uses of this structure.

Some helper functions that were previously prefixed "cxl_mem_" are
renamed to just "cxl_".

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211102202901.3675568-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Ira Weiny
888e034a74 cxl/mbox: Remove bad comment
__cxl_mem_mbox_send_cmd() no longer exists.  Remove the reference.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211102202901.3675568-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Dan Williams
08b9e0ab8a cxl/pmem: Fix reference counting for delayed work
There is a potential race between queue_work() returning and the
queued-work running that could result in put_device() running before
get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
takes the reference unconditionally, but drops it if no new work was
queued, to keep the references balanced.
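
A minimal userspace sketch of that balanced-reference pattern, with
hypothetical take_ref()/drop_ref()/try_queue() stand-ins for
get_device()/put_device()/queue_work():

#include <stdbool.h>
#include <stdio.h>

static int refcount = 1;

static void take_ref(void) { refcount++; }
static void drop_ref(void) { refcount--; }

/* Stand-in for queue_work(): returns false if work was already pending. */
static bool try_queue(bool already_pending)
{
    return !already_pending;
}

/* Take the reference before queueing; drop it if nothing new was queued. */
static void state_work(bool already_pending)
{
    take_ref();
    if (!try_queue(already_pending))
        drop_ref();
}

int main(void)
{
    state_work(false);   /* queued: the worker drops the reference later */
    state_work(true);    /* not queued: reference dropped immediately */
    printf("refcount = %d\n", refcount);   /* 2: base ref + pending worker */
    return 0;
}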

Fixes: 8fdcb1704f ("cxl/pmem: Add initial infrastructure for pmem support")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163553734757.2509761.3305231863616785470.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Linus Torvalds
dd72945c43 cxl for v5.16
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
   Host Bridge) in the CHBS (CXL Host Bridge Structure).
 
 - Introduce a common pci_find_dvsec_capability() helper, clean up open
   coded implementations in various drivers.
 
 - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
   is a module built from tools/testing/cxl/ that mocks up a CXL topology
   to augment the nascent support for emulation of CXL devices in QEMU.
 
 - Convert libnvdimm to use the uuid API.
 
 - Complete the definition of CXL namespace labels in libnvdimm.
 
 - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
   mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
 
 - Continue to sort and refactor functionality into distinct driver and
   core-infrastructure buckets. For example, mailbox handling is now a
   generic core capability consumed by the PCI and cxl_test drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
 Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
 l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
 =im9j
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "More preparation and plumbing work in the CXL subsystem.

  From an end user perspective the highlight here is lighting up the CXL
  Persistent Memory related commands (label read / write) with the
  generic ioctl() front-end in LIBNVDIMM.

  Otherwise, the ability to instantiate new persistent and volatile
  memory regions is still on track for v5.17.

  Summary:

   - Fix support for platforms that do not enumerate every ACPI0016 (CXL
     Host Bridge) in the CHBS (CXL Host Bridge Structure).

   - Introduce a common pci_find_dvsec_capability() helper, clean up
     open coded implementations in various drivers.

   - Add 'cxl_test' for regression testing CXL subsystem ABIs.
     'cxl_test' is a module built from tools/testing/cxl/ that mocks up
     a CXL topology to augment the nascent support for emulation of CXL
     devices in QEMU.

   - Convert libnvdimm to use the uuid API.

   - Complete the definition of CXL namespace labels in libnvdimm.

   - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
     mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.

   - Continue to sort and refactor functionality into distinct driver
     and core-infrastructure buckets. For example, mailbox handling is
     now a generic core capability consumed by the PCI and cxl_test
     drivers"

* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
  ocxl: Use pci core's DVSEC functionality
  cxl/pci: Use pci core's DVSEC functionality
  PCI: Add pci_find_dvsec_capability to find designated VSEC
  cxl/pci: Split cxl_pci_setup_regs()
  cxl/pci: Add @base to cxl_register_map
  cxl/pci: Make more use of cxl_register_map
  cxl/pci: Remove pci request/release regions
  cxl/pci: Fix NULL vs ERR_PTR confusion
  cxl/pci: Remove dev_dbg for unknown register blocks
  cxl/pci: Convert register block identifiers to an enum
  cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
  cxl/pci: Disambiguate cxl_pci further from cxl_mem
  Documentation/cxl: Add bus internal docs
  cxl/core: Split decoder setup into alloc + add
  tools/testing/cxl: Introduce a mock memory device + driver
  cxl/mbox: Move command definitions to common location
  cxl/bus: Populate the target list at decoder create
  tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  cxl/pmem: Add support for multiple nvdimm-bridge objects
  cxl/pmem: Translate NVDIMM label commands to CXL label commands
  ...
2021-11-08 11:49:48 -08:00
Ben Widawsky
55006a2c94 cxl/pci: Use pci core's DVSEC functionality
Reduce maintenance burden of DVSEC query implementation by using the
centralized PCI core implementation.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: kill cxl_pci_dvsec()]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163379788528.692348.11581080806976608802.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:52 -07:00
Ben Widawsky
85afc3175a cxl/pci: Split cxl_pci_setup_regs()
In preparation for moving parts of register mapping to cxl_core, split
cxl_pci_setup_regs() into a helper that finds register blocks,
(cxl_find_regblock()), and a generic wrapper that probes the precise
register sets within a block (cxl_setup_regs()).

Move the actual mapping (cxl_map_regs()) of the only register-set that
cxl_pci cares about (memory device registers) up a level from the former
cxl_pci_setup_regs() into cxl_pci_probe().

With this change the unused component registers are no longer mapped,
but the helpers are primed to move into the core.

[djbw: drop cxl_map_regs() for component registers]

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: rebase on the cxl_register_map refactor]
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163434053788.914258.18412599112859205220.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Dan Williams
a261e9a157 cxl/pci: Add @base to cxl_register_map
In addition to carrying @barno, @block_offset, and @reg_type, add @base
to keep all map/unmap parameters in one object. The helpers
cxl_{map,unmap}_regblock() handle adjusting @base to the @block_offset
at map and unmap time.

Document that @base incorporates @block_offset so that downstream
consumers of a mapped cxl_register_map instance do not need to perform
any fixups and can use @base directly.
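
A rough sketch of that bookkeeping. The field names mirror the changelog,
but the struct layout and the map/unmap helpers below are illustrative
stand-ins for the real ioremap-style code:

#include <stdint.h>
#include <stdio.h>

struct fake_register_map {
    void *base;             /* already adjusted by block_offset */
    int barno;
    uint64_t block_offset;
};

static unsigned char fake_bar[0x20000];   /* pretend this is the mapped BAR */

/* Map the register block: @base absorbs @block_offset at map time. */
static void map_regblock(struct fake_register_map *map)
{
    map->base = fake_bar + map->block_offset;
}

/* Unmap: forget the adjusted pointer again. */
static void unmap_regblock(struct fake_register_map *map)
{
    map->base = NULL;
}

int main(void)
{
    struct fake_register_map map = { .barno = 0, .block_offset = 0x10000 };

    map_regblock(&map);
    /* Consumers read relative to @base with no extra fixups. */
    printf("register 0 lives at offset %#lx within the BAR\n",
           (unsigned long)((unsigned char *)map.base - fake_bar));
    unmap_regblock(&map);
    return 0;
}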

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163433497228.889435.11271988238496181536.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Ben Widawsky
7dc7a64de2 cxl/pci: Make more use of cxl_register_map
The structure exists to pass around information about register mapping.
Use it for passing @barno and @block_offset, and eliminate duplicate
local variables.

The helpers that use @map do not care about @cxlm, so just pass them a
pdev instead.

[djbw: reorder before cxl_pci_setup_regs() refactor to improve readability]

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
[djbw: separate @base conversion]
Link: https://lore.kernel.org/r/163416901172.806743.10056306321247850914.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Ben Widawsky
84e36a9d1b cxl/pci: Remove pci request/release regions
Quoting Dan, "... the request + release regions should probably just be
dropped. It's not like any of the register enumeration would collide
with someone else who already has the registers mapped. The collision
only comes when the registers are mapped for their final usage, and that
will have more precision in the request."

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163379785872.692348.8981679111988251260.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Dan Williams
ca76a3a805 cxl/pci: Fix NULL vs ERR_PTR confusion
cxl_pci_map_regblock() may return an ERR_PTR(), but cxl_pci_setup_regs()
is only prepared for NULL as the error case. Pick the minimal fix for
-stable backport purposes and just have cxl_pci_map_regblock() return
NULL for errors.

Fixes: f8a7e8c29b ("cxl/pci: Reserve all device regions at once")
Cc: <stable@vger.kernel.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163433325724.834522.17809774578178224149.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Ben Widawsky
d22fed9c2b cxl/pci: Remove dev_dbg for unknown register blocks
While interesting to driver developers, the dev_dbg message doesn't do
much except clutter up logs. This information should be attainable
through sysfs, and someday through lspci-like utilities. This change
additionally helps reduce the LOC in a subsequent patch to refactor some
of cxl_pci register mapping.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163379784717.692348.3478221381958300790.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Ben Widawsky
cdcce47cb3 cxl/pci: Convert register block identifiers to an enum
In preparation for passing around the Register Block Indicator (RBI) as
a parameter, it is desirable to convert the type to an enum so that the
interface can use a well defined parameter.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: changelog fixups]
Link: https://lore.kernel.org/r/163379784199.692348.4366131432595112771.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Alison Schofield
91a45b12d4 cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
When an ACPI0016 Host Bridge device is present yet no corresponding
CEDT Host Bridge Structure (CHBS) exists, the ACPI probe method
fails.

Rather than fail, emit this warning and continue:
cxl_acpi ACPI0017:00: No CHBS found for Host Bridge: ACPI0016:02

This error may occur on systems that are not compliant with the
ACPI specification. Compliant systems include a CHBS entry for
every CXL host bridge that is present at boot.

Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211007213426.392644-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-08 01:05:31 -07:00
Kees Cook
301e68dd9b cxl/core: Replace unions with struct_group()
Use the newly introduced struct_group_typed() macro to clean up the
declaration of struct cxl_regs.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: linux-cxl@vger.kernel.org
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2021-09-25 08:20:47 -07:00
Ben Widawsky
ed97afb533 cxl/pci: Disambiguate cxl_pci further from cxl_mem
Commit 21e9f76733 ("cxl: Rename mem to pci") introduced the cxl_pci
driver which had formerly been named cxl_mem. At the time, the goal was
to be as light touch as possible because there were other patches in
flight. Since things have settled now, and a new cxl_mem driver will be
introduced shortly, spend the LOC now to clean up the existing names.

While here, fix the kernel docs to explain the situation better after
the core rework that has already landed.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210913163324.1008564-4-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:18:49 -07:00
Dan Williams
48667f6761 cxl/core: Split decoder setup into alloc + add
The kbuild robot reports:

    drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
    limit (1024) in function 'devm_cxl_add_decoder'

It is also the case the devm_cxl_add_decoder() is unwieldy to use for
all the different decoder types. Fix the stack usage by splitting the
creation into alloc and add steps. This also allows for context
specific construction before adding.

With the split the caller is responsible for registering a devm callback
to trigger device_unregister() for the decoder rather than it being
implicit in the decoder registration. I.e. the routine that calls alloc
is responsible for calling put_device() if the "add" operation fails.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163225205828.3038145.6831131648369404859.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
7d3eb23c4c tools/testing/cxl: Introduce a mock memory device + driver
Introduce an emulated device-set plus driver to register CXL memory
devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
This enables the development of HDM Decoder (Host-managed Device Memory
Decoder) programming flow (region provisioning) in an environment that
can be updated alongside the kernel as it gains more functionality.

Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
bus, the cxl_mock_mem module attaches to CXL expanders on the platform
bus emitted by cxl_test.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116440099.2460985.10692549614409346604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
49be6dd807 cxl/mbox: Move command definitions to common location
In preparation for cxl_test to mock responses to mailbox command
requests, move some definitions from core/mbox.c to cxlmem.h.

No functional changes intended.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116439547.2460985.10457111177103589574.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
a5c2580216 cxl/bus: Populate the target list at decoder create
As found by cxl_test, the implementation populated the target_list for
the single-dport exceptional case, but missed populating the target_list
for the typical multi-dport case. Root decoders always know their target
list at the beginning of time, and even switch-level decoders should
have a target list of one or more zeros by default, depending on the
interleave-ways setting.

Walk the hosting port's dport list and populate based on the passed in
map.

Move devm_cxl_add_passthrough_decoder() out of line now that it does the
work of generating a target_map.

Before:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0

0

After:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0
0,1,2,3
0
0,1,2,3

Where root2 is a CXL topology root object generated by 'cxl_test'.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116439000.2460985.11713777051267946018.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
67dcdd4d3b tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
Create an environment for CXL plumbing unit tests. Especially when it
comes to an algorithm for HDM Decoder (Host-managed Device Memory
Decoder) programming, the availability of an in-kernel-tree emulation
environment for CXL configuration complexity and corner cases speeds
development and deters regressions.

The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
an external module, cxl_test.ko built out of the tools/testing/cxl/
directory, provides mock implementations of kernel APIs and kernel
objects to simulate a real world device hierarchy.

One feedback for the tools/testing/nvdimm/ proposal was "why not do this
in QEMU?". In fact, the CXL development community has developed a QEMU
model for CXL [1]. However, there are a few blocking issues that keep
QEMU from being a tight fit for topology + provisioning unit tests:

1/ The QEMU community has yet to show interest in merging any of this
   support that has had patches on the list since November 2020. So,
   testing CXL to date involves building custom QEMU with out-of-tree
   patches.

2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
   path to be emulated by QEMU without major infrastructure work. This
   is easier to achieve with the alloc_mock_res() approach taken in this
   patch to shortcut-define emulated system physical address ranges with
   interleave behavior.

The QEMU enabling has been critical to get the driver off the ground,
and may still move forward, but it does not address the ongoing needs of
a regression testing environment and test driven development.

This patch adds an ACPI CXL Platform definition with emulated CXL
multi-ported host-bridges. A follow on patch adds emulated memory
expander devices.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
Link: https://lore.kernel.org/r/163164680798.2831381.838684634806668012.stgit@dwillia2-desk3.amr.corp.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Dan Williams
2e52b6256b cxl/pmem: Add support for multiple nvdimm-bridge objects
In preparation for a mocked unit test environment for CXL objects, allow
for multiple unique nvdimm-bridge objects.

For now, just allow multiple bridges to be registered. Later, when there
are multiple present, further updates are needed to
cxl_find_nvdimm_bridge() to identify which bridge is associated with
which CXL hierarchy for nvdimm registration.

Note that this does change the kernel device-name for the bridge object.
User space should not have any attachment to the device name at this
point as it is still early days in the CXL driver development.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163164647007.2831228.2150246954620721526.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Dan Williams
60b8f17215 cxl/pmem: Translate NVDIMM label commands to CXL label commands
The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
translate the Linux command payload to the device native command format.
The LIBNVDIMM commands get-config-size, get-config-data, and
set-config-data, map to the CXL memory device commands device-identify,
get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
NVDIMM device arranges for the provisioning of namespaces. Additionally
for CXL the LSA is used for provisioning regions as well.

The data from device-identify is already cached in the 'struct cxl_mem'
instance associated with @cxl_nvd, so that payload return is simply
crafted and no CXL command is issued. The conversion for get-lsa is
straightforward, but the conversion for set-lsa requires an allocation
to insert the set-lsa header in front of the payload.
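
A hedged sketch of just that set-lsa conversion: the input payload is an
offset header followed by the label data, so the proxy allocates a new
buffer and copies the header in front of the user payload. The struct name
and field sizes below are illustrative, not the UAPI definitions.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative layout: a Set LSA input is an offset header plus data. */
struct fake_set_lsa_hdr {
    uint32_t offset;
    uint32_t reserved;
};

/* Build a Set LSA payload from a set-config-data style request. */
static void *build_set_lsa(uint32_t offset, const void *buf, size_t len,
                           size_t *out_len)
{
    struct fake_set_lsa_hdr hdr = { .offset = offset };
    uint8_t *payload = malloc(sizeof(hdr) + len);

    if (!payload)
        return NULL;
    memcpy(payload, &hdr, sizeof(hdr));          /* header first ... */
    memcpy(payload + sizeof(hdr), buf, len);     /* ... then the label data */
    *out_len = sizeof(hdr) + len;
    return payload;
}

int main(void)
{
    uint8_t label[16] = { 0xde, 0xad };
    size_t len;
    void *payload = build_set_lsa(64, label, sizeof(label), &len);

    printf("set-lsa payload: %zu bytes (header %zu + data %zu)\n",
           len, sizeof(struct fake_set_lsa_hdr), sizeof(label));
    free(payload);
    return 0;
}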

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163122524923.2534512.9431316965424264864.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:09 -07:00
Dan Williams
12f3856ad4 cxl/mbox: Add exclusive kernel command support
The CXL_PMEM driver expects exclusive control of the label storage area
space. Similar to the LIBNVDIMM expectation that the label storage area
is only writable from userspace when the corresponding memory device is
not active in any region, the expectation is the native CXL_PCI UAPI
path is disabled while the cxl_nvdimm for a given cxl_memdev device is
active in LIBNVDIMM.

Add the ability to toggle the availability of a given command for the
UAPI path. Use that new capability to shutdown changes to partitions and
the label storage area while the cxl_nvdimm device is actively proxying
commands for LIBNVDIMM.
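
An illustrative sketch of such a toggle using a plain bitmask; the field
name, command ids, and claim/release helpers below are invented for the
example rather than taken from the driver:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum fake_cmd { FAKE_CMD_SET_PARTITION = 1, FAKE_CMD_SET_LSA = 2 };

static uint64_t exclusive_cmds;    /* commands reserved for the kernel */

static void claim_exclusive(enum fake_cmd cmd)   { exclusive_cmds |= 1ULL << cmd; }
static void release_exclusive(enum fake_cmd cmd) { exclusive_cmds &= ~(1ULL << cmd); }

/* UAPI path: refuse commands the kernel currently owns. */
static bool uapi_cmd_allowed(enum fake_cmd cmd)
{
    return !(exclusive_cmds & (1ULL << cmd));
}

int main(void)
{
    claim_exclusive(FAKE_CMD_SET_LSA);     /* e.g. while the nvdimm is active */
    printf("set-lsa via ioctl allowed: %d\n", uapi_cmd_allowed(FAKE_CMD_SET_LSA));
    release_exclusive(FAKE_CMD_SET_LSA);   /* e.g. on nvdimm teardown */
    printf("set-lsa via ioctl allowed: %d\n", uapi_cmd_allowed(FAKE_CMD_SET_LSA));
    return 0;
}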

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163164579468.2830966.6980053377428474263.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:57 -07:00
Dan Williams
ff56ab9e16 cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP
Define enabled_cmds as an embedded member of 'struct cxl_mem' rather
than a pointer to another dynamic allocation.

As this leaves only one user of cxl_cmd_count, just open code it and
delete the helper.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116436415.2460985.10101824045493194813.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:57 -07:00
Dan Williams
5a2328f4e8 cxl/pci: Use module_pci_driver
Now that cxl_mem_{init,exit} no longer need to manage debugfs, switch
back to the smaller form of the boiler plate.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116435825.2460985.7201322215431441130.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:57 -07:00
Dan Williams
4faf31b434 cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
Now that the internals of mailbox operations are abstracted from the PCI
specifics a bulk of infrastructure can move to the core.

The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
to the equivalent functionality provided by the CXL hardware mailbox
interface. In support of that intent move the mailbox implementation to
a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
nvdimm command proxy path to share.

A unit test framework seeks to implement a unit test backend transport
for mailbox commands to communicate mocked up payloads. It can reuse all
of the mailbox infrastructure minus the PCI specifics, so that also gets
moved to the core.

Finally with the mailbox infrastructure and ioctl handling being
transport generic, there is no longer any need to pass
file_operations to devm_cxl_add_memdev(). That allows all the ioctl
boilerplate to move into the core for unit test reuse.

No functional change intended, just code movement.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116435233.2460985.16197340449713287180.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:56 -07:00
Dan Williams
4cb35f1ca0 cxl/pci: Drop idr.h
Commit 3d135db510 ("cxl/core: Move memdev management to core") left
this straggling include for cxl_memdev setup. Clean it up.

Cc: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163116434668.2460985.12264757586266849616.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:56 -07:00
Dan Williams
b64955a929 cxl/mbox: Introduce the mbox_send operation
In preparation for implementing a unit test backend transport for ioctl
operations, and making the mailbox available to the cxl/pmem
infrastructure, move the existing PCI specific portion of mailbox handling
to an "mbox_send" operation.

With this split all the PCI-specific transport details are comprehended
by a single operation and the rest of the mailbox infrastructure is
'struct cxl_mem' and 'struct cxl_memdev' generic.
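
A minimal sketch of that split: transport-generic code calls through an
mbox_send hook and each backend (PCI, mock test) supplies its own
implementation. The struct layouts and names below are illustrative, not
the kernel definitions.

#include <stdio.h>

struct fake_mbox_cmd {
    int opcode;
    int rc;                 /* filled in by the transport */
};

struct fake_dev_state {
    /* The only transport-specific detail the generic code needs. */
    int (*mbox_send)(struct fake_dev_state *state, struct fake_mbox_cmd *cmd);
};

/* Transport-generic path: validation, serialization, etc. would live here. */
static int send_cmd(struct fake_dev_state *state, struct fake_mbox_cmd *cmd)
{
    return state->mbox_send(state, cmd);
}

/* One backend: a PCI driver would poke doorbell registers here. */
static int pci_like_mbox_send(struct fake_dev_state *state,
                              struct fake_mbox_cmd *cmd)
{
    (void)state;
    printf("sending opcode %#x over the 'PCI' transport\n", cmd->opcode);
    cmd->rc = 0;
    return 0;
}

int main(void)
{
    struct fake_dev_state state = { .mbox_send = pci_like_mbox_send };
    struct fake_mbox_cmd cmd = { .opcode = 0x4102 };   /* arbitrary opcode */

    return send_cmd(&state, &cmd);
}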

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163116434098.2460985.9004760022659400540.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:56 -07:00
Dan Williams
13e7749d06 cxl/pci: Clean up cxl_mem_get_partition_info()
Commit 0b9159d0ff ("cxl/pci: Store memory capacity values") missed
updating the kernel-doc for 'struct cxl_mem' leading to the following
warnings:

./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'

Also, it is redundant to describe those same parameters in the
kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
routine updates the values in @cxlm, just do that implicitly internal to
the helper.

Cc: Ira Weiny <ira.weiny@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163157174216.2653013.1277706528753990974.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:56 -07:00
Dan Williams
99e222a5f1 cxl/pci: Make 'struct cxl_mem' device type generic
In preparation for adding a unit test provider of a cxl_memdev, convert
the 'struct cxl_mem' driver context to carry a generic device rather
than a pci device.

Note, some dev_dbg() lines needed extra reformatting per clang-format.

This conversion also allows the cxl_mem_create() and
devm_cxl_add_memdev() calling conventions to be simplified. The "host"
for a cxl_memdev must be the same device used by the driver that allocated
@cxlm.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163116432973.2460985.7553504957932024222.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:56 -07:00
Linus Torvalds
70868a1805 cxl for v5.15
- Fix detection of CXL host bridges to filter out disabled ACPI0016
   devices in the ACPI DSDT.
 
 - Fix kernel lockdown integration to disable raw commands when raw PCI
   access is disabled.
 
 - Fix a broken debug message.
 
 - Add support for "Get Partition Info". I.e. enumerate the split between
   volatile and persistent capacity on bi-modal CXL memory expanders.
 
 - Re-factor the core by subject area. This is a work in progress.
 
 - Prepare libnvdimm to understand CXL labels in addition to EFI labels.
   This is a work in progress.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYTlXFQAKCRDfioYZHlFs
 Z4LXAQCKhh1VHhPHHBF0xkWjriJecM7ZT0AuEXdD9SnX3B6tXgEA6hwIMKGFqEOS
 hDqaQfk3ooydwEnItBhovFo+B8H+Qg4=
 =CDUy
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:

 - Fix detection of CXL host bridges to filter out disabled ACPI0016
   devices in the ACPI DSDT.

 - Fix kernel lockdown integration to disable raw commands when raw PCI
   access is disabled.

 - Fix a broken debug message.

 - Add support for "Get Partition Info". I.e. enumerate the split
   between volatile and persistent capacity on bi-modal CXL memory
   expanders.

 - Re-factor the core by subject area. This is a work in progress.

 - Prepare libnvdimm to understand CXL labels in addition to EFI labels.
   This is a work in progress.

* tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (25 commits)
  cxl/registers: Fix Documentation warning
  cxl/pmem: Fix Documentation warning
  cxl/uapi: Fix defined but not used warnings
  cxl/pci: Fix debug message in cxl_probe_regs()
  cxl/pci: Fix lockdown level
  cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
  libnvdimm/labels: Add claim class helpers
  libnvdimm/labels: Add type-guid helpers
  libnvdimm/labels: Add blk special cases for nlabel and position helpers
  libnvdimm/labels: Add blk isetcookie set / validation helpers
  libnvdimm/labels: Add a checksum calculation helper
  libnvdimm/labels: Introduce label setter helpers
  libnvdimm/labels: Add isetcookie validation helper
  libnvdimm/labels: Introduce getters for namespace label fields
  cxl/mem: Adjust ram/pmem range to represent DPA ranges
  cxl/mem: Account for partitionable space in ram/pmem ranges
  cxl/pci: Store memory capacity values
  cxl/pci: Simplify register setup
  cxl/pci: Ignore unknown register block types
  cxl/core: Move memdev management to core
  ...
2021-09-09 11:48:27 -07:00
Dan Williams
2b922a9d06 cxl/registers: Fix Documentation warning
Commit 0f06157e01 ("cxl/core: Move register mapping infrastructure")
neglected to add a DOC header for the new drivers/cxl/core/regs.c file.

Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163072206675.2250120.3527179192933919995.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-07 11:39:02 -07:00
Dan Williams
a01da6ca7d cxl/pmem: Fix Documentation warning
Commit 06737cd0d2 ("cxl/core: Move pmem functionality") neglected to
add a DOC header for the new drivers/cxl/core/pmem.c file.

Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163072206163.2250120.11486436976516079516.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-07 11:39:01 -07:00
Li Qiang (Johnny Li)
da582aa5ad cxl/pci: Fix debug message in cxl_probe_regs()
The indicator string for the mbox and memdev register sets was
incorrectly set to "status" in the error message.

Cc: <stable@vger.kernel.org>
Fixes: 30af97296f ("cxl/pci: Map registers based on capabilities")
Signed-off-by: Li Qiang (Johnny Li) <johnny.li@montage-tech.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163072205089.2250120.8103605864156687395.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-07 11:39:01 -07:00
Dan Williams
9e56614c44 cxl/pci: Fix lockdown level
A proposed rework of security_locked_down() users identified that the
cxl_pci driver was passing the wrong lockdown_reason. Update
cxl_mem_raw_command_allowed() to fail raw command access when raw pci
access is also disabled.

Fixes: 13237183c7 ("cxl/mem: Add a "RAW" send command")
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/163072204525.2250120.16615792476976546735.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-07 11:39:01 -07:00
Alison Schofield
a7bfaad54b cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
During CXL ACPI probe, host bridge ports are discovered by scanning
the ACPI0017 root port for ACPI0016 host bridge devices. The scan
matches on the hardware id of "ACPI0016". An issue occurs when an
ACPI0016 device is defined in the DSDT yet disabled on the platform.
Attempts by the cxl_acpi driver to add host bridge ports using a
disabled device fails, and the entire cxl_acpi probe fails.

The DSDT table includes an _STA method that sets the status and the
ACPI subsystem has checks available to examine it. One such check is
in the acpi_pci_find_root() path. Move the call to acpi_pci_find_root()
to the matching function to prevent this issue when adding either
upstream or downstream ports.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Fixes: 7d4b5ca2e2 ("cxl/acpi: Add downstream port data to cxl_port instances")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163072203957.2250120.2178685721061002124.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-07 11:39:01 -07:00
Ira Weiny
ceeb0da0a0 cxl/mem: Adjust ram/pmem range to represent DPA ranges
CXL spec defines the volatile DPA range to be 0 to Volatile memory size.
It further defines the persistent DPA range to follow directly after the
end of the Volatile DPA through the persistent memory size.  Essentially

Volatile DPA range   = [0, Volatile size)
Persistent DPA range = [Volatile size, Volatile size + Persistent size)

Adjust the pmem_range start to reflect this and remove the TODO.
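
As a toy illustration of those ranges (example sizes only, not driver
code):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t volatile_bytes = 16ULL << 30;     /* example: 16 GiB volatile */
    uint64_t persistent_bytes = 32ULL << 30;   /* example: 32 GiB persistent */

    /* Volatile DPA range   = [0, volatile size) */
    printf("ram  range: [0x0, %#" PRIx64 ")\n", volatile_bytes);
    /* Persistent DPA range = [volatile size, volatile size + persistent size) */
    printf("pmem range: [%#" PRIx64 ", %#" PRIx64 ")\n",
           volatile_bytes, volatile_bytes + persistent_bytes);
    return 0;
}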

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210617221620.1904031-4-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-10 18:50:04 -07:00
Ira Weiny
f847502ad8 cxl/mem: Account for partitionable space in ram/pmem ranges
Memory devices may specify volatile only, persistent only, and
partitionable space which when added together result in a total capacity.

If Identify Memory Device.Partition Alignment != 0 the device supports
partitionable space.  This partitionable space can be split between
volatile and persistent space.  The total volatile and persistent sizes
are reported in Get Partition Info, i.e.

	active volatile memory = volatile only + partitionable volatile
	active persistent memory = persistent only + partitionable persistent

Define cxl_mem_get_partition(), check for partitionable support, and use
cxl_mem_get_partition() if applicable.
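
A toy calculation of that accounting, with made-up capacities standing in
for the values a real device would report:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Fixed capacities (example values). */
    uint64_t volatile_only   = 8ULL << 30;
    uint64_t persistent_only = 8ULL << 30;
    uint64_t partition_align = 256ULL << 20;   /* != 0: partitionable space */

    /* Current split of the partitionable space (example values). */
    uint64_t partitionable_volatile   = 4ULL << 30;
    uint64_t partitionable_persistent = 12ULL << 30;

    if (partition_align) {
        uint64_t active_volatile   = volatile_only + partitionable_volatile;
        uint64_t active_persistent = persistent_only + partitionable_persistent;

        printf("active volatile:   %llu GiB\n",
               (unsigned long long)(active_volatile >> 30));
        printf("active persistent: %llu GiB\n",
               (unsigned long long)(active_persistent >> 30));
    }
    return 0;
}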

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-10 11:57:59 -07:00
Ira Weiny
0b9159d0ff cxl/pci: Store memory capacity values
The Identify Memory Device command returns information about the
volatile only and persistent only memory capacities.  Store those values
in the cxl_mem structure for later use.  While at it, reuse those
calculations to calculate the ram and pmem ranges.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210617221620.1904031-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-07 01:01:09 -07:00
Ben Widawsky
5b68705d1e cxl/pci: Simplify register setup
It is desirable to retain the mappings from the calling function. By
simplifying this code, it will be much more straightforward to do that.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210716231548.174778-3-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:27:02 -07:00
Ben Widawsky
1e39db573e cxl/pci: Ignore unknown register block types
In an effort to explicitly avoid supporting vendor-specific register
blocks (which can happily be mapped from userspace), entirely skip
probing unknown types. The secondary benefit of this will be revealed
in the future with code simplification.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210716231548.174778-2-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:27:02 -07:00
Ben Widawsky
3d135db510 cxl/core: Move memdev management to core
The motivation for moving cxl_memdev allocation to the core (beyond
better file organization of sysfs attributes in core/ and drivers in
cxl/), is that device lifetime is longer than module lifetime. The cxl_pci
module should be free to come and go without needing to coordinate with
devices that need the text associated with cxl_memdev_release() to stay
resident. The move fixes a use after free bug when looping driver
load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.

Another motivation for disconnecting cxl_memdev creation from cxl_pci is
to enable other drivers, like a unit test driver, to register memdevs.

Fixes: b39cb1052a ("cxl/mem: Register CXL memX devices")
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792540495.368511.9748638751088219595.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:54 -07:00
Dan Williams
9cc238c7a5 cxl/pci: Introduce cdevm_file_operations
In preparation for moving cxl_memdev allocation to the core, introduce
cdevm_file_operations to coordinate file operations shutdown relative to
driver data release.

The motivation for moving cxl_memdev allocation to the core (beyond
better file organization of sysfs attributes in core/ and drivers in
cxl/), is that device lifetime is longer than module lifetime. The cxl_pci
module should be free to come and go without needing to coordinate with
devices that need the text associated with cxl_memdev_release() to stay
resident. The move will fix a use after free bug when looping driver
load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.

Another motivation for passing in file_operations to the core cxl_memdev
creation flow is to allow for alternate drivers, like unit test code, to
define their own ioctl backends.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792539962.368511.2962268954245340288.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Dan Williams
0f06157e01 cxl/core: Move register mapping infrastructure
The register mapping infrastructure is large enough to move to its own
compilation unit. This also cleans up an unnecessary include of <mem.h>
in core/bus.c.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162800068975.665205.12895551621746585289.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Dan Williams
06737cd0d2 cxl/core: Move pmem functionality
Refactor the pmem / nvdimm-bridge functionality from core/bus.c to
core/pmem.c. Introduce drivers/cxl/core/core.h to communicate data
structures and helpers between the core bus and other functionality that
registers devices on the bus.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792538899.368511.3881663908293411300.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Ben Widawsky
95aaed2668 cxl/core: Improve CXL core kernel docs
Now that CXL core's role is well understood, the documentation should
reflect that information.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792538379.368511.9055351193841619781.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Ben Widawsky
5161a55c06 cxl: Move cxl_core to new directory
CXL core is growing, and it's already arguably unmanageable. To support
future growth, move core functionality to a new directory and rename the
file to represent just bus support. Future work will remove non-bus
functionality.

Note that mem.h is renamed to cxlmem.h to avoid a namespace collision
with the global ARCH=um mem.h header.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Uwe Kleine-König
fc7a6209d5 bus: Make remove callback return void
The driver core ignores the return value of this callback because there
is only little it can do when a device disappears.

This is the final bit of a long lasting cleanup quest where several
buses were converted to also return void from their remove callback.
Additionally some resource leaks were fixed that were caused by drivers
returning an error code in the expectation that the driver won't go
away.

With struct bus_type::remove returning void, newly implemented buses are
prevented from returning an error code that would be ignored, so driver
authors are not given wrong expectations.

Reviewed-by: Tom Rix <trix@redhat.com> (For fpga)
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio)
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts)
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb)
Acked-by: Pali Rohár <pali@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media)
Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform)
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com> (For xen)
Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd)
Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb)
Acked-by: Johan Hovold <johan@kernel.org>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus)
Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio)
Acked-by: Maximilian Luz <luzmaximilian@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec)
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack)
Acked-by: Geoff Levand <geoff@infradead.org> (For ps3)
Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt)
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th)
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia)
Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI)
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr)
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid)
Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM)
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa)
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire)
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid)
Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox)
Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss)
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC)
Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-21 11:53:42 +02:00
Ben Widawsky
4ad6181e4b cxl/pci: Rename CXL REGLOC ID
The current naming is confusing and wrong. The Register Locator is
identified by the DVSEC identifier, not an offset.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210618003009.956929-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-17 17:37:18 -07:00
Alison Schofield
3e23d17ce1 cxl/acpi: Use the ACPI CFMWS to create static decoder objects
The ACPI CXL Early Discovery Table (CEDT) includes a list of CXL memory
resources in CXL Fixed Memory Window Structures (CFMWS). Retrieve each
CFMWS in the CEDT and add a cxl_decoder object to the root port (root0)
for each memory resource.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/d2b73eecfb7ea22e1103f1894b271a89958b4c41.1623968958.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-17 17:35:43 -07:00
Alison Schofield
da6aafec3d cxl/acpi: Add the Host Bridge base address to CXL port objects
The base address for the Host Bridge port component registers is located
in the CXL Host Bridge Structure (CHBS) of the ACPI CXL Early Discovery
Table (CEDT). Retrieve the CHBS for each Host Bridge (ACPI0016 device)
and include that base address in the port object.

Co-developed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/a475ce137b899bc7ae5ba9550b5f198cb29ccbfd.1623968958.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-17 17:35:43 -07:00
Dan Williams
21083f5152 cxl/pmem: Register 'pmem' / cxl_nvdimm devices
While a memX device on /sys/bus/cxl represents a CXL memory expander
control interface, a pmemX device represents the persistent memory
sub-functionality. It bridges the CXL subsystem to the libnvdimm nmemX
control interface.

With this skeleton ndctl can now see persistent memory devices on a
"CXL" bus. Later patches add support for translating libnvdimm native
commands to CXL commands.

# ndctl list -BDiu -b CXL
{
  "provider":"CXL",
  "dev":"ndbus1",
  "dimms":[
    {
      "dev":"nmem1",
      "state":"disabled"
    },
    {
      "dev":"nmem0",
      "state":"disabled"
    }
  ]
}

Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
cxl_nvdimm devices to keep their enabled state synchronized with the
registration state of their corresponding device on the nvdimm_bus. In
other words, always arrange for cxl_nvdimm_driver.remove() to unregister
nvdimms from an nvdimm_bus ahead of the bus being unregistered.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162380012696.3039556.4293801691038740850.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:34 -07:00
Dan Williams
8fdcb1704f cxl/pmem: Add initial infrastructure for pmem support
Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
bus hierarchy. Also, flesh out the cxl_bus definition to allow a
cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
nvdimm-bus registration.

The creation of the bridge is gated on the detection of a PMEM capable
address space registered to the root. The bridge indirection allows the
libnvdimm module to remain unloaded on platforms without PMEM support.

Given that the probing of ACPI0017 is asynchronous to CXL endpoint
devices, and the expectation that CXL endpoint devices register other
PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
workqueue is needed to run bus_rescan_devices() outside of the
device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
as they arrive. For now only the bus is taken online/offline in the
workqueue.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162379909706.2993820.14051258608641140169.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:14 -07:00
Dan Williams
6af7139c97 cxl/core: Add cxl-bus driver infrastructure
Enable devices on the 'cxl' bus to be attached to drivers. The initial
user of this functionality is a driver for an 'nvdimm-bridge' device
that anchors a libnvdimm hierarchy attached to CXL persistent memory
resources. Other device types that will leverage this include:

cxl_port: map and use component register functionality (HDM Decoders)

cxl_nvdimm: translate CXL memory expander endpoints to libnvdimm
	    'nvdimm' objects

cxl_region: translate CXL interleave sets to libnvdimm 'region' objects

The pairing of devices to drivers is handled through the cxl_device_id()
matching to cxl_driver.id values. A cxl_device_id() of '0' indicates no
driver support.

In addition to ->match(), ->probe(), and ->remove() support for the
'cxl' bus introduce MODULE_ALIAS_CXL() to autoload modules containing
cxl-drivers. Drivers are added in follow-on changes.
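
An illustrative sketch of that id-based pairing; the enum values, struct
layouts, and names below are stand-ins rather than the kernel definitions,
but they follow the rule that an id of '0' means no driver support:

#include <stdio.h>

enum fake_cxl_id { FAKE_CXL_NONE = 0, FAKE_CXL_NVDIMM_BRIDGE = 1, FAKE_CXL_NVDIMM = 2 };

struct fake_cxl_device { const char *name; enum fake_cxl_id id; };
struct fake_cxl_driver { const char *name; enum fake_cxl_id id; };

/* Bus match: a device with id 0 never binds; otherwise ids must agree. */
static int fake_bus_match(const struct fake_cxl_device *dev,
                          const struct fake_cxl_driver *drv)
{
    return dev->id != FAKE_CXL_NONE && dev->id == drv->id;
}

int main(void)
{
    struct fake_cxl_device bridge = { "nvdimm-bridge0", FAKE_CXL_NVDIMM_BRIDGE };
    struct fake_cxl_driver bridge_drv = { "cxl_nvdimm_bridge", FAKE_CXL_NVDIMM_BRIDGE };
    struct fake_cxl_driver nvdimm_drv = { "cxl_nvdimm", FAKE_CXL_NVDIMM };

    printf("%s -> %s: %d\n", bridge.name, bridge_drv.name,
           fake_bus_match(&bridge, &bridge_drv));
    printf("%s -> %s: %d\n", bridge.name, nvdimm_drv.name,
           fake_bus_match(&bridge, &nvdimm_drv));
    return 0;
}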

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162379909190.2993820.6134168109678004186.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:46:34 -07:00
Ben Widawsky
87815ee9d0 cxl/pci: Add media provisioning required commands
Some of the commands have already been defined for the support of RAW
commands (to be blocked). Unlike their usage in the RAW interface, when
used through the supported interface, they will be coordinated and
marshalled along with other commands being issued by userspace and the
driver itself. That coordination will be added later.

The list of commands was determined based on the learnings from
libnvdimm and this list is provided directly from Dan.

Recommended-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210413140907.534404-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-14 23:54:53 -07:00
Ben Widawsky
ba26864736 cxl/component_regs: Fix offset
The CXL.cache and CXL.mem registers begin after the CXL.io registers
which occupy the first 0x1000 bytes. The current code wasn't setting
this up properly for future users of the component registers. It was
correct for the probing code however.

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210611051113.224328-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-12 10:30:41 -07:00
Ben Widawsky
6423035fd2 cxl/hdm: Fix decoder count calculation
The decoder count in the HDM decoder capability structure is an encoded
field. As defined in the spec:

Decoder Count: Reports the number of memory address decoders implemented
by the component.
0 – 1 Decoder
1 – 2 Decoders
2 – 4 Decoders
3 – 6 Decoders
4 – 8 Decoders
5 – 10 Decoders
All other values are reserved

Nothing is actually fixed by this as nothing actually used this mapping
yet.
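
For illustration, the encoding in the table above maps to an actual count
roughly like this (a sketch, not the kernel's macro):

#include <stdio.h>

/* Decode the HDM Decoder Capability "Decoder Count" field. */
static int hdm_decoder_count(unsigned int encoded)
{
    if (encoded == 0)
        return 1;
    if (encoded <= 5)
        return 2 * encoded;        /* 1->2, 2->4, 3->6, 4->8, 5->10 */
    return -1;                     /* reserved encoding */
}

int main(void)
{
    unsigned int enc;

    for (enc = 0; enc <= 6; enc++)
        printf("encoded %u -> %d decoders\n", enc, hdm_decoder_count(enc));
    return 0;
}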

Cc: Ira Weiny <ira.weiny@intel.com>
Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210611190111.121295-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-12 10:29:03 -07:00
Dan Williams
40ba17afdf cxl/acpi: Introduce cxl_decoder objects
A cxl_decoder is a child of a cxl_port. It represents a hardware decoder
configuration of an upstream port to one or more of its downstream
ports. The decoder is either represented in CXL standard HDM decoder
registers (see CXL 2.0 section 8.2.5.12 CXL HDM Decoder Capability
Structure), or it is a static decode configuration communicated by
platform firmware (see the CXL Early Discovery Table: Fixed Memory
Window Structure).

The firmware described and hardware described decoders differ slightly
leading to 2 different sub-types of decoders, cxl_decoder_root and
cxl_decoder_switch. At the root level the decode capabilities restrict
what can be mapped beneath them. Mid-level switch decoders are
configured for either accelerator (type-2) or memory-expander (type-3)
operation, but they are otherwise agnostic to the type of memory
(volatile vs persistent) being mapped.

Here is an example topology from a single-ported host-bridge environment
without CFMWS decodes enumerated.

    /sys/bus/cxl/devices/root0
    ├── devtype
    ├── dport0 -> ../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00
    ├── port1
    │   ├── decoder1.0
    │   │   ├── devtype
    │   │   ├── locked
    │   │   ├── size
    │   │   ├── start
    │   │   ├── subsystem -> ../../../../../../bus/cxl
    │   │   ├── target_list
    │   │   ├── target_type
    │   │   └── uevent
    │   ├── devtype
    │   ├── dport0 -> ../../../../pci0000:34/0000:34:00.0
    │   ├── subsystem -> ../../../../../bus/cxl
    │   ├── uevent
    │   └── uport -> ../../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00
    ├── subsystem -> ../../../../bus/cxl
    ├── uevent
    └── uport -> ../../ACPI0017:00

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325695128.2293823.17519927266014762694.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:39 -07:00
Dan Williams
3b94ce7b7b cxl/acpi: Enumerate host bridge root ports
While the resources enumerated by the CEDT.CFMWS identify a cxl_port
with host bridges as downstream ports, host bridges themselves are
upstream ports that decode to downstream ports represented by PCIe Root
Ports. Walk the PCIe Root Ports connected to a CXL Host Bridge,
identified by the ACPI0016 _HID, and add each one as a cxl_dport of the
host bridge cxl_port.

For now, component registers are not enumerated, only the first-order
uport / dport relationships.
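
A hedged sketch of that walk; the dport-registration helper shown
(cxl_add_dport()) is illustrative:

    /* Register each PCIe Root Port below a CXL host bridge as a dport */
    static int match_add_root_port(struct pci_dev *pdev, void *data)
    {
            struct cxl_port *port = data;

            if (!pci_is_pcie(pdev) ||
                pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT)
                    return 0;

            /* component registers and port_id refinement come later */
            cxl_add_dport(port, &pdev->dev, pci_dev_id(pdev), CXL_RESOURCE_NONE);
            return 0;
    }

    static void add_host_bridge_dports(struct pci_bus *bus, struct cxl_port *port)
    {
            pci_walk_bus(bus, match_add_root_port, port);
    }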

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325451145.2293126.10149150938788969381.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:39 -07:00
Dan Williams
7d4b5ca2e2 cxl/acpi: Add downstream port data to cxl_port instances
In preparation for infrastructure that enumerates and configures the CXL
decode mechanism of an upstream port to its downstream ports, add a
representation of a CXL downstream port.

On ACPI systems the top-most logical downstream ports in the hierarchy
are the host bridges (ACPI0016 devices) that decode the memory windows
described by the CXL Early Discovery Table Fixed Memory Window
Structures (CEDT.CFMWS).
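
A rough sketch of the downstream port representation; fields are
indicative:

    struct cxl_dport {
            struct device *dport;                /* device acting as the dport */
            int port_id;                         /* hw id (e.g. _UID or port number) */
            resource_size_t component_reg_phys;  /* component register base, if known */
            struct cxl_port *port;               /* the upstream cxl_port */
            struct list_head list;               /* entry on the port's dport list */
    };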

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/162325450624.2293126.3533006409920271718.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:39 -07:00
Dan Williams
3feaa2d358 cxl/Kconfig: Default drivers to CONFIG_CXL_BUS
CONFIG_CXL_BUS is default 'n', as expected for new functionality. When
it is enabled, do not make the end user hunt for all the expected
sub-options to enable. For example, CONFIG_CXL_BUS without CONFIG_CXL_MEM
is an odd/expert configuration, as is CONFIG_CXL_MEM without
CONFIG_CXL_ACPI (on ACPI-capable platforms). Default CONFIG_CXL_MEM and
CONFIG_CXL_ACPI to CONFIG_CXL_BUS.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325450105.2293126.17046356425194082921.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:38 -07:00
Dan Williams
4812be97c0 cxl/acpi: Introduce the root of a cxl_port topology
While CXL builds upon the PCI software model for enumeration and
endpoint control, a static platform component is required to bootstrap
the CXL memory layout. Similar to how ACPI identifies root-level PCI
memory resources, ACPI data enumerates the address space and interleave
configuration for CXL Memory.

In addition to identifying host bridges, ACPI is responsible for
enumerating the CXL memory space that can be addressed by downstream
decoders. This is similar to the requirement for ACPI to publish
resources via the _CRS method for PCI host bridges. Specifically, ACPI
publishes a table, CXL Early Discovery Table (CEDT), which includes a
list of CXL Memory resources, CXL Fixed Memory Window Structures
(CFMWS).

For now, introduce the core infrastructure for a cxl_port hierarchy
starting with a root level anchor represented by the ACPI0017 device.

Follow-on changes model support for the configurable decode capabilities
of cxl_port instances, i.e. CXL switch support.
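
A minimal sketch of the bootstrap, assuming a devm_cxl_add_port()-style
helper; the helper name and the CXL_RESOURCE_NONE sentinel are
illustrative:

    #include <linux/acpi.h>
    #include <linux/module.h>
    #include <linux/platform_device.h>

    static int cxl_acpi_probe(struct platform_device *pdev)
    {
            struct device *host = &pdev->dev;
            struct cxl_port *root_port;

            /* anchor the cxl_port hierarchy at the ACPI0017 device */
            root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
            if (IS_ERR(root_port))
                    return PTR_ERR(root_port);

            /* CEDT.CFMWS parsing and host bridge enumeration follow here */
            return 0;
    }

    static const struct acpi_device_id cxl_acpi_ids[] = {
            { "ACPI0017" },  /* CXL root object */
            { }
    };
    MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);

    static struct platform_driver cxl_acpi_driver = {
            .probe = cxl_acpi_probe,
            .driver = {
                    .name = "cxl_acpi",
                    .acpi_match_table = cxl_acpi_ids,
            },
    };
    module_platform_driver(cxl_acpi_driver);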

Co-developed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325449515.2293126.15303270193010154608.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:38 -07:00
Dan Williams
605a5e41db cxl/pci: Fixup devm_cxl_iomap_block() to take a 'struct device *'
The expectation is that devm functions take 'struct device *' and pci
functions take 'struct pci_dev *'. Swap out the @pdev argument for @dev
and fix up related helpers.
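
Illustratively, the prototype changes along these lines (the signature
shown is a sketch, not the verbatim declaration):

    /* before: pci-specific, though nothing pci-specific is needed */
    void __iomem *devm_cxl_iomap_block(struct pci_dev *pdev,
                                       resource_size_t addr,
                                       resource_size_t length);

    /* after: devm_* helpers conventionally take a 'struct device *' */
    void __iomem *devm_cxl_iomap_block(struct device *dev,
                                       resource_size_t addr,
                                       resource_size_t length);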

Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162216592374.3833641.13281743585064451514.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-05 17:39:12 -07:00