Commit graph

287 commits

Author SHA1 Message Date
Robert Richter
0312171289 cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window
commit 0cab687205 upstream.

The Linux CXL subsystem is built on the assumption that HPA == SPA.
That is, the host physical address (HPA) the HDM decoder registers are
programmed with are system physical addresses (SPA).

During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
8.1.3.8) are checked if the memory is enabled and the CXL range is in
a HPA window that is described in a CFMWS structure of the CXL host
bridge (cxl-3.1, 9.18.1.3).

Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
window and the CXL memory range will be disabled then. The HDM decoder
stops working which causes system memory being disabled and further a
system hang during HDM decoder initialization, typically when a CXL
enabled kernel boots.

Prevent a system hang and do not disable the HDM decoder if the
decoder's CXL range is not found in a CFMWS window.

Note the change only fixes a hardware hang, but does not implement
HPA/SPA translation. Support for this can be added in a follow on
patch series.

Signed-off-by: Robert Richter <rrichter@amd.com>
Fixes: 34e37b4c43 ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240216160113.407141-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-01 13:26:31 +01:00
Quanquan Cao
ec745eeff4 cxl/region:Fix overflow issue in alloc_hpa()
commit d76779dd36 upstream.

Creating a region with 16 memory devices caused a problem. The div_u64_rem
function, used for dividing an unsigned 64-bit number by a 32-bit one,
faced an issue when SZ_256M * p->interleave_ways. The result surpassed
the maximum limit of the 32-bit divisor (4G), leading to an overflow
and a remainder of 0.
note: At this point, p->interleave_ways is 16, meaning 16 * 256M = 4G

To fix this issue, I replaced the div_u64_rem function with div64_u64_rem
and adjusted the type of the remainder.

Signed-off-by: Quanquan Cao <caoqq@fujitsu.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 23a22cd1c9 ("cxl/region: Allocate HPA capacity to regions")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-31 16:17:12 -08:00
Jim Harris
3a46101871 cxl/region: fix x9 interleave typo
[ Upstream commit c7ad3dc364 ]

CXL supports x3, x6 and x12 - not x9.

Fixes: 80d10a6cee ("cxl/region: Add interleave geometry attributes")
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/169904271254.204936.8580772404462743630.stgit@ubuntu
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25 15:27:48 -08:00
Huang Ying
970c0899a4 cxl/port: Fix decoder initialization when nr_targets > interleave_ways
commit d6488fee66 upstream.

The decoder_populate_targets() helper walks all of the targets in a port
and makes sure they can be looked up in @target_map. Where @target_map
is a lookup table from target position to target id (corresponding to a
cxl_dport instance). However @target_map is only responsible for
conveying the active dport instances as indicated by interleave_ways.

When nr_targets > interleave_ways it results in
decoder_populate_targets() walking off the end of the valid entries in
@target_map. Given target_map is initialized to 0 it results in the
dport lookup failing if position 0 is not mapped to a dport with an id
of 0:

  cxl_port port3: Failed to populate active decoder targets
  cxl_port port3: Failed to add decoder
  cxl_port port3: Failed to add decoder3.0
  cxl_bus_probe: cxl_port port3: probe: -6

This bug also highlights that when the decoder's ->targets[] array is
written in cxl_port_setup_targets() it is missing a hold of the
targets_lock to synchronize against sysfs readers of the target list. A
fix for that is saved for a later patch.

Fixes: a5c2580216 ("cxl/bus: Populate the target list at decoder create")
Cc:  <stable@vger.kernel.org>
Signed-off-by: Huang, Ying <ying.huang@intel.com>
[djbw: rewrite the changelog, find the Fixes: tag]
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-25 15:27:44 -08:00
Robert Richter
9e1e0887ea cxl/port: Fix NULL pointer access in devm_cxl_add_port()
commit a70fc4ed20 upstream.

In devm_cxl_add_port() the port creation may fail and its associated
pointer does not contain a valid address. During error message
generation this invalid port address is used. Fix that wrong address
access.

Fixes: f3cd264c4e ("cxl: Unify debug messages when calling devm_cxl_add_port()")
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230519215436.3394532-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28 17:07:23 +00:00
Jim Harris
31f6ff62df cxl/region: Fix x1 root-decoder granularity calculations
[ Upstream commit 98a04c7ace ]

Root decoder granularity must match value from CFWMS, which may not
be the region's granularity for non-interleaved root decoders.

So when calculating granularities for host bridge decoders, use the
region's granularity instead of the root decoder's granularity to ensure
the correct granularities are set for the host bridge decoders and any
downstream switch decoders.

Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.

Region created with 2048 granularity using following command line:

cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
		  -g 2048 -s 2048M

Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.

Before this patch:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
"interleave_granularity":256,

After:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
"interleave_granularity":2048,

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
[djbw: fixup the prebuilt cxl_test region]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:18 +00:00
Dan Williams
683b6a7324 tools/testing/cxl: Define a fixed volatile configuration to parse
[ Upstream commit 3d8f7ccaa6 ]

Take two endpoints attached to the first switch on the first host-bridge
in the cxl_test topology and define a pre-initialized region. This is a
x2 interleave underneath a x1 CXL Window.

$ modprobe cxl_test
$ # cxl list -Ru
{
  "region":"region3",
  "resource":"0xf010000000",
  "size":"512.00 MiB (536.87 MB)",
  "interleave_ways":2,
  "interleave_granularity":4096,
  "decode_state":"commit"
}

Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 98a04c7ace ("cxl/region: Fix x1 root-decoder granularity calculations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Dan Williams
8cdc6b8b81 cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
[ Upstream commit 7592d935b7 ]

tl;dr: Clean up an unnecessary export and enable cxl_test.

An RCD (Restricted CXL Device), in contrast to a typical CXL device in
a VH topology, obtains its component registers from the bottom half of
the associated CXL host bridge RCRB (Root Complex Register Block). In
turn this means that cxl_rcrb_to_component() needs to be called from
devm_cxl_add_endpoint().

Presently devm_cxl_add_endpoint() is part of the CXL core, but the only
user is the CXL mem module. Move it from cxl_core to cxl_mem to not only
get rid of an unnecessary export, but to also enable its call out to
cxl_rcrb_to_component(), in a subsequent patch, to be mocked by
cxl_test. Recall that cxl_test can only mock exported symbols, and since
cxl_rcrb_to_component() is itself inside the core, all callers must be
outside of cxl_core to allow cxl_test to mock it.

Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993045072.1882361.13944923741276843683.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 98a04c7ace ("cxl/region: Fix x1 root-decoder granularity calculations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Robert Richter
8fce427169 cxl: Unify debug messages when calling devm_cxl_add_port()
[ Upstream commit f3cd264c4e ]

CXL ports are added in a couple of code paths using devm_cxl_add_port().
Debug messages are individually generated, but are incomplete and
inconsistent. Change this by moving its generation to
devm_cxl_add_port(). This unifies the messages and reduces code
duplication.  Also, generate messages on failure. Use a
__devm_cxl_add_port() wrapper to keep the readability of the error
exits.

Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20221018132341.76259-4-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 98a04c7ace ("cxl/region: Fix x1 root-decoder granularity calculations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Jim Harris
90db4c1d5e cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails
[ Upstream commit 0718588c7a ]

Commit 5e42bcbc3f ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed.

Commit 86987c7662 ("cxl/region: Cleanup target list on attach error")
extended that cleanup to also clear cxled->pos and p->targets[pos]. The
initialization error was incidentally fixed separately by:
Commit 8d42854257 ("cxl/region: Fix port setup uninitialized variable
warnings") which was merged a few days after 5e42bcbc3f.

But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:

1) the cleanup does not set the decoder's region to NULL, which results
   in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
   errors associated with the upstream switch

Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately. Then the resources
associated with this target get cleanup up as normal when the failed
region is deleted.

The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to prevent against
that overflow.

Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region.

Fixes: 5e42bcbc3f ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Dan Williams
c415f113d9 cxl/region: Move region-position validation to a helper
[ Upstream commit 9995576cef ]

In preparation for region autodiscovery, that needs all devices
discovered before their relative position in the region can be
determined, consolidate all position dependent validation in a helper.

Recall that in the on-demand region creation flow the end-user picks the
position of a given endpoint decoder in a region. In the autodiscovery
case the position of an endpoint decoder can only be determined after
all other endpoint decoders that claim to decode the region's address
range have been enumerated and attached. So, in the autodiscovery case
endpoint decoders may be attached before their relative position is
known. Once all decoders arrive, then positions can be determined and
validated with cxl_region_validate_position() the same as user initiated
on-demand creation.

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601997584.1924368.4615769326126138969.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 0718588c7a ("cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Dan Williams
008b08ab07 cxl/region: Cleanup target list on attach error
[ Upstream commit 86987c7662 ]

Jonathan noticed that the target list setup is not unwound completely
upon error. Undo all the setup in the 'err_decrement:' exit path.

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Link: http://lore.kernel.org/r/20230208123031.00006990@Huawei.com
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167601996980.1924368.390423634911157277.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 0718588c7a ("cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Dan Williams
93d242f63e cxl/region: Validate region mode vs decoder mode
[ Upstream commit 1b9b7a6fd6 ]

In preparation for a new region mode, do not, for example, allow
'ram' decoders to be assigned to 'pmem' regions and vice versa.

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/167601995111.1924368.7459128614177994602.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Stable-dep-of: 0718588c7a ("cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:07:17 +00:00
Dan Williams
7c7371b41a cxl/mem: Fix shutdown order
[ Upstream commit 88d3917f82 ]

Ira reports that removing cxl_mock_mem causes a crash with the following
trace:

 BUG: kernel NULL pointer dereference, address: 0000000000000044
 [..]
 RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  cxl_region_detach+0xe8/0x210 [cxl_core]
  cxl_decoder_kill_region+0x27/0x40 [cxl_core]
  cxld_unregister+0x29/0x40 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  device_unregister+0x13/0x60
  devm_release_action+0x4d/0x90
  ? __pfx_unregister_port+0x10/0x10 [cxl_core]
  delete_endpoint+0x121/0x130 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  ? lock_release+0x142/0x290
  cdev_device_del+0x15/0x50
  cxl_memdev_unregister+0x54/0x70 [cxl_core]

This crash is due to the clearing out the cxl_memdev's driver context
(@cxlds) before the subsystem is done with it. This is ultimately due to
the region(s), that this memdev is a member, being torn down and expecting
to be able to de-reference @cxlds, like here:

static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
...
                if (cxlds->rcd)
                        goto endpoint_reset;
...

Fix it by keeping the driver context valid until memdev-device
unregistration, and subsequently the entire stack of related
dependencies, unwinds.

Fixes: 9cc238c7a5 ("cxl/pci: Introduce cdevm_file_operations")
Reported-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:52:13 +01:00
Breno Leitao
6806494ed4 cxl/acpi: Return 'rc' instead of '0' in cxl_parse_cfmws()
[ Upstream commit 91019b5bc7 ]

Driver initialization returned success (return 0) even if the
initialization (cxl_decoder_add() or acpi_table_parse_cedt()) failed.

Return the error instead of swallowing it.

Fixes: f4ce1f766f ("cxl/acpi: Convert CFMWS parsing to ACPI sub-table helpers")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20230714093146.2253438-2-leitao@debian.org
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03 10:24:04 +02:00
Breno Leitao
748fadc08b cxl/acpi: Fix a use-after-free in cxl_parse_cfmws()
[ Upstream commit 4cf67d3cc9 ]

KASAN and KFENCE detected an user-after-free in the CXL driver. This
happens in the cxl_decoder_add() fail path. KASAN prints the following
error:

   BUG: KASAN: slab-use-after-free in cxl_parse_cfmws (drivers/cxl/acpi.c:299)

This happens in cxl_parse_cfmws(), where put_device() is called,
releasing cxld, which is accessed later.

Use the local variables in the dev_err() instead of pointing to the
released memory. Since the dev_err() is printing a resource, change the open
coded print format to use the %pr format specifier.

Fixes: e50fe01e1f ("cxl/core: Drop ->platform_res attribute for root decoders")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20230714093146.2253438-1-leitao@debian.org
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03 10:24:04 +02:00
Dave Jiang
c9e09b070d cxl: Wait Memory_Info_Valid before access memory related info
commit ce17ad0d54 upstream.

The Memory_Info_Valid bit (CXL 3.0 8.1.3.8.2) indicates that the CXL
Range Size High and Size Low registers are valid. The bit must be set
within 1 second of reset deassertion to the device. Check valid bit
before we check the Memory_Active bit when waiting for
cxl_await_media_ready() to ensure that the memory info is valid for
consumption. Also ensures both DVSEC ranges 1 and 2 are ready if DVSEC
Capability indicates they are both supported.

Fixes: 523e594d9c ("cxl/pci: Implement wait for media active")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168444687469.3134781.11033518965387297327.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30 14:03:32 +01:00
Dan Williams
a6f5c84b41 cxl/hdm: Fail upon detecting 0-sized decoders
commit 7701c8bef4 upstream.

Decoders committed with 0-size lead to later crashes on shutdown as
__cxl_dpa_release() assumes a 'struct resource' has been established in
the in 'cxlds->dpa_res'. Just fail the driver load in this instance
since there are deeper problems with the enumeration or the setup when
this happens.

Fixes: 9c57cde0dc ("cxl/hdm: Enumerate allocated DPA")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/168149843516.792294.11872242648319572632.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11 23:03:05 +09:00
Lukas Wunner
5f625160b6 cxl/pci: Handle excessive CDAT length
commit 4fe2c13d59 upstream.

If the length in the CDAT header is larger than the concatenation of the
header and all table entries, then the CDAT exposed to user space
contains trailing null bytes.

Not every consumer may be able to handle that.  Per Postel's robustness
principle, "be liberal in what you accept" and silently reduce the
cached length to avoid exposing those null bytes.

Fixes: c97006046c ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: stable@vger.kernel.org # v6.0+
Link: https://lore.kernel.org/r/6d98b3c7da5343172bd3ccabfabbc1f31c079d74.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13 16:55:25 +02:00
Lukas Wunner
0d8dc8993a cxl/pci: Handle truncated CDAT entries
commit b56faef231 upstream.

If truncated CDAT entries are received from a device, the concatenation
of those entries constitutes a corrupt CDAT, yet is happily exposed to
user space.

Avoid by verifying response lengths and erroring out if truncation is
detected.

The last CDAT entry may still be truncated despite the checks introduced
herein if the length in the CDAT header is too small.  However, that is
easily detectable by user space because it reaches EOF prematurely.
A subsequent commit which rightsizes the CDAT response allocation closes
that remaining loophole.

The two lines introduced here which exceed 80 chars are shortened to
less than 80 chars by a subsequent commit which migrates to a
synchronous DOE API and replaces "t.task.rv" by "rc".

The existing acpi_cdat_header and acpi_table_cdat struct definitions
provided by ACPICA cannot be used because they do not employ __le16 or
__le32 types.  I believe that cannot be changed because those types are
Linux-specific and ACPI is specified for little endian platforms only,
hence doesn't care about endianness.  So duplicate the structs.

Fixes: c97006046c ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: stable@vger.kernel.org # v6.0+
Link: https://lore.kernel.org/r/bce3aebc0e8e18a1173425a7a865b232c3912963.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13 16:55:25 +02:00
Lukas Wunner
ff7edd1ac6 cxl/pci: Handle truncated CDAT header
commit 34bafc747c upstream.

cxl_cdat_get_length() only checks whether the DOE response size is
sufficient for the Table Access response header (1 dword), but not the
succeeding CDAT header (1 dword length plus other fields).

It thus returns whatever uninitialized memory happens to be on the stack
if a truncated DOE response with only 1 dword was received.  Fix it.

Fixes: c97006046c ("cxl/port: Read CDAT table")
Reported-by: Ming Li <ming4.li@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ming Li <ming4.li@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: stable@vger.kernel.org # v6.0+
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/000e69cd163461c8b1bc2cf4155b6e25402c29c7.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13 16:55:25 +02:00
Lukas Wunner
021544721f cxl/pci: Fix CDAT retrieval on big endian
commit fbaa38214c upstream.

The CDAT exposed in sysfs differs between little endian and big endian
arches:  On big endian, every 4 bytes are byte-swapped.

PCI Configuration Space is little endian (PCI r3.0 sec 6.1).  Accessors
such as pci_read_config_dword() implicitly swap bytes on big endian.
That way, the macros in include/uapi/linux/pci_regs.h work regardless of
the arch's endianness.  For an example of implicit byte-swapping, see
ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx
(Load Word Byte-Reverse Indexed).

DOE Read/Write Data Mailbox Registers are unlike other registers in
Configuration Space in that they contain or receive a 4 byte portion of
an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f).
They need to be copied to or from the request/response buffer verbatim.
So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit
byte-swapping.

The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume
implicit byte-swapping.  Byte-swap requests after constructing them with
those macros and byte-swap responses before parsing them.

Change the request and response type to __le32 to avoid sparse warnings.
Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for
consistency.

Fixes: c97006046c ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org # v6.0+
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13 16:55:24 +02:00
Dan Williams
a371788d4f cxl/pmem: Fix nvdimm registration races
commit f57aec443c upstream.

A loop of the form:

    while true; do modprobe cxl_pci; modprobe -r cxl_pci; done

...fails with the following crash signature:

    BUG: kernel NULL pointer dereference, address: 0000000000000040
    [..]
    RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     cxl_pmem_ctl+0x121/0x240 [cxl_pmem]
     nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm]
     nd_label_data_init+0x135/0x7e0 [libnvdimm]
     nvdimm_probe+0xd6/0x1c0 [libnvdimm]
     nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
     really_probe+0xde/0x380
     __driver_probe_device+0x78/0x170
     driver_probe_device+0x1f/0x90
     __device_attach_driver+0x85/0x110
     bus_for_each_drv+0x7d/0xc0
     __device_attach+0xb4/0x1e0
     bus_probe_device+0x9f/0xc0
     device_add+0x445/0x9c0
     nd_async_device_register+0xe/0x40 [libnvdimm]
     async_run_entry_fn+0x30/0x130

...namely that the bottom half of async nvdimm device registration runs
after the CXL has already torn down the context that cxl_pmem_ctl()
needs. Unlike the ACPI NFIT case that benefits from launching multiple
nvdimm device registrations in parallel from those listed in the table,
CXL is already marked PROBE_PREFER_ASYNCHRONOUS. So provide for a
synchronous registration path to preclude this scenario.

Fixes: 21083f5152 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:20 +01:00
Dan Williams
15f9f8eb3b cxl/region: Fix passthrough-decoder detection
commit 711442e29f upstream.

A passthrough decoder is a decoder that maps only 1 target. It is a
special case because it does not impose any constraints on the
interleave-math as compared to a decoder with multiple targets. Extend
the passthrough case to multi-target-capable decoders that only have one
target selected. I.e. the current code was only considering passthrough
*ports* which are only a subset of the potential passthrough decoder
scenarios.

Fixes: e4f6dfa9ef ("cxl/region: Fix 'distance' calculation with passthrough ports")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167564540422.847146.13816934143225777888.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:11:52 +01:00
Fan Ni
a04c7d062b cxl/region: Fix null pointer dereference for resetting decoder
commit 4fa4302d6d upstream.

Not all decoders have a reset callback.

The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none.  As such the CXL core creates a special pass through decoder
instance without a commit/reset callback.

Prior to this patch, the ->reset() callback was called unconditionally when
calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
1 Root Port, and one directly attached CXL type 3 device or multiple CXL
type 3 devices attached to downstream ports of a switch can cause a null
pointer dereference.

Before the fix, a kernel crash was observed when we destroy the region, and
a pass through decoder is reset.

The issue can be reproduced as below,
    1) create a region with a CXL setup which includes a HB with a
    single root port under which a memdev is attached directly.
    2) destroy the region with cxl destroy-region regionX -f.

Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Cc: <stable@vger.kernel.org>
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/20221215170909.2650271-1-fan.ni@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:11:52 +01:00
Dan Williams
b8b9b0b857 cxl/region: Fix missing probe failure
commit bf3e5da8cb upstream.

cxl_region_probe() allows for regions not in the 'commit' state to be
enabled. Fail probe when the region is not committed otherwise the
kernel may indicate that an address range is active when none of the
decoders are active.

Fixes: 8d48817df6 ("cxl/region: Add region driver boiler plate")
Cc: <stable@vger.kernel.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/166993220462.1995348.1698008475198427361.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07 11:11:39 +01:00
Fan Ni
189c499376 cxl/region: Fix memdev reuse check
commit f04facfb99 upstream.

Due to a typo, the check of whether or not a memdev has already been
used as a target for the region (above code piece) will always be
skipped. Given a memdev with more than one HDM decoder, an interleaved
region can be created that maps multiple HPAs to the same DPA. According
to CXL spec 3.0 8.1.3.8.4, "Aliasing (mapping more than one Host
Physical Address (HPA) to a single Device Physical Address) is
forbidden."

Fix this by using existing iterator for memdev reuse check.

Cc: <stable@vger.kernel.org>
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/20221107212153.745993-1-fan.ni@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07 11:11:36 +01:00
Dan Williams
8f401ec1c8 cxl/region: Recycle region ids
At region creation time the next region-id is atomically cached so that
there is predictability of region device names. If that region is
destroyed and then a new one is created the region id increments. That
ends up looking like a memory leak, or is otherwise surprising that
identifiers roll forward even after destroying all previously created
regions.

Try to reuse rather than free old region ids at region release time.

While this fixes a cosmetic issue, the needlessly advancing memory
region-id gives the appearance of a memory leak, hence the "Fixes" tag,
but no "Cc: stable" tag.

Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 779dd20cfb ("cxl/region: Add region creation support")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752186062.947915.13200195701224993317.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 16:03:43 -07:00
Dan Williams
e4f6dfa9ef cxl/region: Fix 'distance' calculation with passthrough ports
When programming port decode targets, the algorithm wants to ensure that
two devices are compatible to be programmed as peers beneath a given
port. A compatible peer is a target that shares the same dport, and
where that target's interleave position also routes it to the same
dport. Compatibility is determined by the device's interleave position
being >= to distance. For example, if a given dport can only map every
Nth position then positions less than N away from the last target
programmed are incompatible.

The @distance for the host-bridge's cxl_port in a simple dual-ported
host-bridge configuration with 2 direct-attached devices is 1, i.e. An
x2 region divided by 2 dports to reach 2 region targets.

An x4 region under an x2 host-bridge would need 2 intervening switches
where the @distance at the host bridge level is 2 (x4 region divided by
2 switches to reach 4 devices).

However, the distance between peers underneath a single ported
host-bridge is always zero because there is no limit to the number of
devices that can be mapped. In other words, there are no decoders to
program in a passthrough, all descendants are mapped and distance only
starts matters for the intervening descendant ports of the passthrough
port.

Add tracking for the number of dports mapped to a port, and use that to
detect the passthrough case for calculating @distance.

Cc: <stable@vger.kernel.org>
Reported-by: Bobo WL <lmw.bobo@gmail.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 16:01:24 -07:00
Dan Williams
4d07ae22e7 cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak
When a cxl_nvdimm object goes through a ->remove() event (device
physically removed, nvdimm-bridge disabled, or nvdimm device disabled),
then any associated regions must also be disabled. As highlighted by the
cxl-create-region.sh test [1], a single device may host multiple
regions, but the driver was only tracking one region at a time. This
leads to a situation where only the last enabled region per nvdimm
device is cleaned up properly. Other regions are leaked, and this also
causes cxl_memdev reference leaks.

Fix the tracking by allowing cxl_nvdimm objects to track multiple region
associations.

Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1]
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Dan Williams
0d9e734018 cxl/region: Fix cxl_region leak, cleanup targets at region delete
When a region is deleted any targets that have been previously assigned
to that region hold references to it. Trigger those references to
drop by detaching all targets at unregister_region() time.

Otherwise that region object will leak as userspace has lost the ability
to detach targets once region sysfs is torn down.

Cc: <stable@vger.kernel.org>
Fixes: b9686e8c8e ("cxl/region: Enable the assignment of endpoint decoders to regions")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183055.947915.17681995648556534844.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Dan Williams
a90accb358 cxl/region: Fix region HPA ordering validation
Some regions may not have any address space allocated. Skip them when
validating HPA order otherwise a crash like the following may result:

 devm_cxl_add_region: cxl_acpi cxl_acpi.0: decoder3.4: created region9
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 [..]
 RIP: 0010:store_targetN+0x655/0x1740 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  kernfs_fop_write_iter+0x144/0x200
  vfs_write+0x24a/0x4d0
  ksys_write+0x69/0xf0
  do_syscall_64+0x3a/0x90

store_targetN+0x655/0x1740:
alloc_region_ref at drivers/cxl/core/region.c:676
(inlined by) cxl_port_attach_region at drivers/cxl/core/region.c:850
(inlined by) cxl_region_attach at drivers/cxl/core/region.c:1290
(inlined by) attach_target at drivers/cxl/core/region.c:1410
(inlined by) store_targetN at drivers/cxl/core/region.c:1453

Cc: <stable@vger.kernel.org>
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166752182461.947915.497032805239915067.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Yu Zhe
4f1aa35f1f cxl/pmem: Use size_add() against integer overflow
"struct_size() + n" may cause a integer overflow,
use size_add() to handle it.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220927070247.23148-1-yuzhe@nfschina.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-03 11:20:46 -07:00
Vishal Verma
71ee71d7ad cxl/region: Fix decoder allocation crash
When an intermediate port's decoders have been exhausted by existing
regions, and creating a new region with the port in question in it's
hierarchical path is attempted, cxl_port_attach_region() fails to find a
port decoder (as would be expected), and drops into the failure / cleanup
path.

However, during cleanup of the region reference, a sanity check attempts
to dereference the decoder, which in the above case didn't exist. This
causes a NULL pointer dereference BUG.

To fix this, refactor the decoder allocation and de-allocation into
helper routines, and in this 'free' routine, check that the decoder,
@cxld, is valid before attempting any operations on it.

Cc: <stable@vger.kernel.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Link: https://lore.kernel.org/r/20221101074100.1732003-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-01 15:33:07 -07:00
Jonathan Cameron
f010c75c05 cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA.
Writes to the device must include an offset and size as defined in
CXL 2.0 8.2.9.5.2.4 Set LSA (Opcode 4103h)

Fixes tag is non obvious as this code has been through several
reworks and variable names + wasn't in use until the addition
of the region code.

Due to a bug in QEMU CXL emulation this overrun resulted in QEMU
crashing.

Reported-by: Bobo WL <lmw.bobo@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 60b8f17215 ("cxl/pmem: Translate NVDIMM label commands to CXL label commands")
Link: https://lore.kernel.org/r/20220815154044.24733-3-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Jonathan Cameron
2816e24b05 cxl/region: Fix null pointer dereference due to pass through decoder commit
Not all decoders have a commit callback.

The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none.  As such the CXL core creates a special pass through decoder
instance without a commit callback.

Prior to this patch, the ->commit() callback was called unconditionally.
Thus a configuration with 1 Host Bridge, 1 Root Port, 1 switch with
multiple downstream ports below which there are multiple CXL type 3
devices results in a situation where committing the region causes a null
pointer dereference.

Reported-by: Bobo WL <lmw.bobo@gmail.com>
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20220818164210.2084-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Jonathan Cameron
cf00b33058 cxl/mbox: Add a check on input payload size
A bug in the LSA code resulted in transfers slightly larger
than the mailbox size. Let us make it easier to catch similar
issues in future by adding a low level check.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220815154044.24733-2-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Dan Williams
1cd8a2537e cxl/hdm: Fix skip allocations vs multiple pmem allocations
Vishal notes that when attempting to define a second pmem region on a
device the DPA allocation fails with a message of the form:

    decoder11.1: failed to reserve skipped space

Recall that the skip setting is used when there is a pmem allocation in
the presence of free ram DPA space. The first pmem allocation skips over
the free ram and subsequent pmem allocations do not require a skip. The
bug is that a skip is still attempted and the DPA reservation code
flags the double skip allocation conflict.

Fixes: cf880423b6 ("cxl/hdm: Add support for allocating DPA to an endpoint decoder")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973754730.1558392.15466392461645857658.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:11:38 -07:00
Dan Williams
4d8e4ea5bb cxl/region: Disallow region granularity != window granularity
The endpoint decode granularity must be <= the window granularity
otherwise capacity in the endpoints is lost in the decode. Consider an
attempt to have a region granularity of 512 with 4 devices within a
window that maps 2 host bridges at a granularity of 256 bytes:

HPA	DPA Offset	HB	Port	EP
0x0	0x0		0	0	0
0x100	0x0		1	0	2
0x200	0x100		0	0	0
0x300	0x100		1	0	2
0x400	0x200		0	1	1
0x500	0x200		1	1	3
0x600	0x300		0	1	1
0x700	0x300		1	1	3
0x800	0x400		0	0	0
0x900	0x400		1	0	2
0xA00	0x500		0	0	0
0xB00	0x500		1	0	2

Notice how endpoint0 maps HPA 0x0 and 0x200 correctly, but then at HPA
0x800 it results in DPA 0x200-0x400 on being skipped.

Fix this by restricing the region granularity to be equal to the window
granularity resulting in the following for a x4 region under a x2 window
at a granularity of 256.

HPA	DPA Offset	HB	Port	EP
0x0	0x0		0	0	0
0x100	0x0		1	0	2
0x200	0x0		0	1	1
0x300	0x0		1	1	3
0x400	0x100		0	0	0
0x500	0x100		1	0	2
0x600	0x100		0	1	1
0x700	0x100		1	1	3

Not that it ever made practical sense to support region granularity >
window granularity. The window rotates host bridges causing endpoints to
never see a consecutive stream of requests at the desired granularity
without breaks to issue cycles to the other host bridge.

Fixes: 80d10a6cee ("cxl/region: Add interleave geometry attributes")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973127171.1526540.9923273539049172976.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:04 -07:00
Dan Williams
298d44d04b cxl/region: Fix x1 interleave to greater than x1 interleave routing
In cases where the decode fans out as it traverses downstream, the
interleave granularity needs to increment to identify the port selector
bits out of the remaining address bits. For example, recall that with an
x2 parent port intereleave (IW == 1), the downstream decode for children
of those ports will either see address bit IG+8 always set, or address
bit IG+8 always clear. So if the child port needs to select a downstream
port it can only use address bits starting at IG+9 (where IG and IW are
the CXL encoded values for interleave granularity (ilog2(ig) - 8) and
ways (ilog2(iw))).

When the parent port interleave is x1 no such masking occurs and the
child port can maintain the granularity that was routed to the parent
port.

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973126583.1526540.657948655360009242.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:03 -07:00
Dan Williams
910bc55da8 cxl/region: Move HPA setup to cxl_region_attach()
A recent bug fix added the setup of the endpoint decoder interleave
geometry settings to cxl_region_attach(). Move the HPA setup there as
well to keep all endpoint decoder parameter setting in a central
location.

For symmetry, move endpoint HPA teardown to cxl_region_detach(), and for
switches move HPA setup / teardown to cxl_port_{setup,reset}_targets().

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165973126020.1526540.14701949254436069807.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 16:10:03 -07:00
Dan Williams
2901c8bded cxl/region: Fix decoder interleave programming
Jonathan notes:

"Curiously interleave ways = 1 for the EPs which is obviously wrong"

...while testing the latest CXL development branch on QEMU.

It turns out the region creation process failed to program the endpoint
decoders. This was missed because the default settings of x1 at 4K
intereleave still results in the region appearing to function. Jonathan
caught the bug by reverse mapping the translations that need to happen
for the QEMU support.

Link: https://lore.kernel.org/r/62e95fdf9f6e2_30440294e4@dwillia2-xfh.jf.intel.com.notmuch
Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165951146336.967013.11160153960900111443.stgit@dwillia2-xfh.jf.intel.com
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:19 -07:00
Bagas Sanjaya
038e6eb803 cxl/region: describe targets and nr_targets members of cxl_region_params
Sphinx reported undescribed parameters in cxl_region_params struct:

./drivers/cxl/cxl.h:376: warning: Function parameter or member 'targets' not described in 'cxl_region_params'
./drivers/cxl/cxl.h:376: warning: Function parameter or member 'nr_targets' not described in 'cxl_region_params'

Describe these members.

Fixes: b9686e8c8e ("cxl/region: Enable the assignment of endpoint decoders to regions")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220804075448.98241-3-bagasdotme@gmail.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Bagas Sanjaya
f13da0d9c3 cxl/regions: add padding for cxl_rr_ep_add nested lists
Sphinx reported indentation warnings:

Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:732: WARNING: Unexpected indentation.
Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:733: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:735: WARNING: Unexpected indentation.

These warnings above are due to missing blank line padding in the nested list
in kernel-doc comment for cxl_rr_ep_add().

Add the paddings to fix the warnings.

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220804075448.98241-2-bagasdotme@gmail.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Carpenter
9fd2cf4d6f cxl/region: Fix IS_ERR() vs NULL check
The nvdimm_pmem_region_create() function returns NULL on error.  It does
not return error pointers.

Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/Yuo65lq2WtfdGJ0X@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
e29a8995d6 cxl/region: Fix region reference target accounting
Dan reports:

    The error handling in cxl_port_attach_region() looks like it might
    have a similar bug.  The cxl_rr->nr_targets++; might want a --.

    That function is more complicated.

Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that
flow is not clear. Fix the bug and the clarity by separating the 'new'
region-reference case from the 'extend' region-reference case. This also
moves the host-physical-address (HPA) validation, that the HPA of a new
region being accounted to the port is greater than the HPA of all other
regions associated with the port, to alloc_region_ref().

Introduce @nr_targets_inc to track when the error exit path needs to
clean up cxl_rr->nr_targets.

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lore.kernel.org/r/165939482134.252363.1915691883146696327.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
69c9961387 cxl/region: Fix region commit uninitialized variable warning
0day robot reports:

drivers/cxl/core/region.c:196 cxl_region_decode_commit() error: uninitialized symbol 'rc'.

The re-checking of loop termination conditions to determine "success"
makes it hard to see that @rc is initialized in all cases. Remove those
to make it explicit that @rc reflects a commit error and that the rest
of logic is concerned with unwinding committed decoders.

This change potentially results in cxl_region_decode_reset() being
called with @count == 0 where it was not called before, but
cxl_region_decode_reset() treats that as a nop.

Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/165951148105.967013.14191992449932268431.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
8d42854257 cxl/region: Fix port setup uninitialized variable warnings
0day robot reports:

drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'eiw'.
drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peig'.
drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peiw'.

...which are all valid reports. Add debug statement to consume the,
albeit unexpected, errors.

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165951147487.967013.929590444907251028.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
817b279467 cxl/region: Stop initializing interleave granularity
In preparation for a patch that validates that the region ways setting
is compatible with the granularity setting, the initial granularity
setting needs to start at zero to indicate "unset".

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/165853777484.2430596.3423921169034844397.stgit@dwillia2-xfh.jf.intel.com
[djbw: fix up unused variable]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:38:06 -07:00
Dan Williams
4d5c42a80b cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
After adding support for emulating platform firmware established DPA
reservations, the cxl-topology.sh [1] unit test started crashing with
the following signature:

 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP
 [..]
 RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
 [..]
 Call Trace:
  <TASK>
  __cxl_dpa_release+0x1b/0xd0 [cxl_core]
  cxl_dpa_release+0x1d/0x30 [cxl_core]
  release_nodes+0x63/0x90
  devres_release_all+0x88/0xc0

...i.e. a use after free of a 'struct cxl_endpoint_decoder' object. This
results from the ordering of init_hdm_decoder() before add_hdm_decoder()
where, at release time, the decoder is unregistered and released before
the DPA reservation.

Fix this by extending the life of the object until all DPA reservations
have been released which also preserves platform decoder settings being
settled by the time the decoder is published in sysfs (KOBJ_ADD time).

Note that the @len == 0 case in __cxl_dpa_reserve() is avoided in
practice as this function is only called for committed decoders and new
non-zero DPA allocations.

Link: https://github.com/pmem/ndctl/blob/pending/test/cxl-topology.sh [1]
Fixes: 9c57cde0dc ("cxl/hdm: Enumerate allocated DPA")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165896020625.3546860.12390103413706292760.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-01 15:36:33 -07:00