linux-stable/drivers/nvdimm
Dan Williams 18c65667fa cxl/pmem: Fix nvdimm registration races
commit f57aec443c upstream.

A loop of the form:

    while true; do modprobe cxl_pci; modprobe -r cxl_pci; done

...fails with the following crash signature:

    BUG: kernel NULL pointer dereference, address: 0000000000000040
    [..]
    RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     cxl_pmem_ctl+0x121/0x240 [cxl_pmem]
     nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm]
     nd_label_data_init+0x135/0x7e0 [libnvdimm]
     nvdimm_probe+0xd6/0x1c0 [libnvdimm]
     nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
     really_probe+0xde/0x380
     __driver_probe_device+0x78/0x170
     driver_probe_device+0x1f/0x90
     __device_attach_driver+0x85/0x110
     bus_for_each_drv+0x7d/0xc0
     __device_attach+0xb4/0x1e0
     bus_probe_device+0x9f/0xc0
     device_add+0x445/0x9c0
     nd_async_device_register+0xe/0x40 [libnvdimm]
     async_run_entry_fn+0x30/0x130

...namely that the bottom half of async nvdimm device registration runs
after the CXL has already torn down the context that cxl_pmem_ctl()
needs. Unlike the ACPI NFIT case that benefits from launching multiple
nvdimm device registrations in parallel from those listed in the table,
CXL is already marked PROBE_PREFER_ASYNCHRONOUS. So provide for a
synchronous registration path to preclude this scenario.

Fixes: 21083f5152 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:29:42 +01:00
..
Kconfig nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE 2023-01-28 15:32:36 -08:00
Makefile drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set 2022-03-23 12:17:36 -07:00
badrange.c
btt.c nvdimm-btt: Use the enum req_op type 2022-07-14 12:14:31 -06:00
btt.h nvdimm-btt: convert to blk_alloc_disk/blk_cleanup_disk 2021-06-01 07:42:23 -06:00
btt_devs.c nvdimm: Drop nd_device_lock() 2022-04-28 14:01:55 -07:00
bus.c cxl/pmem: Fix nvdimm registration races 2023-03-10 09:29:42 +01:00
claim.c
core.c nvdimm: Fix firmware activation deadlock scenarios 2022-04-28 14:01:56 -07:00
dax_devs.c nvdimm: Replace lockdep_mutex with local lock classes 2022-04-28 14:01:55 -07:00
dimm.c libnvdimm: Make remove callback return void 2021-02-16 19:35:29 -08:00
dimm_devs.c cxl/pmem: Fix nvdimm registration races 2023-03-10 09:29:42 +01:00
e820.c
label.c nvdimm/region: Delete nd_blk_region infrastructure 2022-03-11 15:53:13 -08:00
label.h nvdimm/region: Delete nd_blk_region infrastructure 2022-03-11 15:53:13 -08:00
namespace_devs.c libnvdimm for 6.1 2022-10-14 18:41:41 -07:00
nd-core.h cxl/pmem: Fix nvdimm registration races 2023-03-10 09:29:42 +01:00
nd.h nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE 2023-01-28 15:32:36 -08:00
nd_perf.c drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set 2022-03-23 12:17:36 -07:00
nd_virtio.c block: pass a block_device and opf to bio_alloc 2022-02-02 07:49:59 -07:00
of_pmem.c
pfn.h
pfn_devs.c nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE 2023-01-28 15:32:36 -08:00
pmem.c Merge branch 'for-6.0/dax' into libnvdimm-fixes 2022-09-24 18:14:12 -07:00
pmem.h dax: introduce DAX_RECOVERY_WRITE dax access mode 2022-05-16 13:35:56 -07:00
region.c nvdimm/region: Move cache management to the region driver 2022-12-02 23:52:32 -08:00
region_devs.c nvdimm/region: Move cache management to the region driver 2022-12-02 23:52:32 -08:00
security.c nvdimm/region: Move cache management to the region driver 2022-12-02 23:52:32 -08:00
virtio_pmem.c virtio_pmem: set device ready in probe() 2022-08-11 04:06:37 -04:00
virtio_pmem.h