linux-stable/include/linux/memregion.h

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _MEMREGION_H_
#define _MEMREGION_H_
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/range.h>
#include <linux/bug.h>

struct memregion_info {
	int target_node;
	struct range range;
};

#ifdef CONFIG_MEMREGION
int memregion_alloc(gfp_t gfp);
void memregion_free(int id);
#else
static inline int memregion_alloc(gfp_t gfp)
{
	return -ENOMEM;
}
static inline void memregion_free(int id)
{
}
#endif

/**
 * cpu_cache_invalidate_memregion - drop any CPU cached data for
 *     memregions described by @res_desc
 * @res_desc: one of the IORES_DESC_* types
 *
 * Perform cache maintenance after a memory event / operation that
 * changes the contents of physical memory in a cache-incoherent manner.
 * For example, device memory technologies like NVDIMM and CXL have
 * device secure erase, and dynamic region provision that can replace
 * the memory mapped to a given physical address.
 *
 * Limit the functionality to architectures that have an efficient way
 * to writeback and invalidate potentially terabytes of address space at
 * once.  Note that this routine may or may not write back any dirty
 * contents while performing the invalidation. It is only exported for
 * the explicit usage of the NVDIMM and CXL modules in the 'DEVMEM'
 * symbol namespace on bare platforms.
 *
 * Returns 0 on success or negative error code on a failure to perform
 * the cache maintenance.
 */
#ifdef CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
int cpu_cache_invalidate_memregion(int res_desc);
bool cpu_cache_has_invalidate_memregion(void);
#else
static inline bool cpu_cache_has_invalidate_memregion(void)
{
	return false;
}

static inline int cpu_cache_invalidate_memregion(int res_desc)
{
	WARN_ON_ONCE("CPU cache invalidation required");
	return -ENXIO;
}
#endif
#endif /* _MEMREGION_H_ */
lib: Uplevel the pmem "region" ida to a global allocator In preparation for handling platform differentiated memory types beyond persistent memory, uplevel the "region" identifier to a global number space. This enables a device-dax instance to be registered to any memory type with guaranteed unique names. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:31 +00:00			`/* SPDX-License-Identifier: GPL-2.0 */`
			`#ifndef _MEMREGION_H_`
			`#define _MEMREGION_H_`
			`#include <linux/types.h>`
			`#include <linux/errno.h>`
dax/hmem: Convey the dax range via memregion_info() In preparation for hmem platform devices to be unregistered, stop using platform_device_add_resources() to convey the address range. The platform_device_add_resources() API causes an existing "Soft Reserved" iomem resource to be re-parented under an inserted platform device resource. When that platform device is deleted it removes the platform device resource and all children. Instead, it is sufficient to convey just the address range and let request_mem_region() insert resources to indicate the devices active in the range. This allows the "Soft Reserved" resource to be re-enumerated upon the next probe event. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167602002217.1924368.7036275892522551624.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> 2023-02-10 09:07:02 +00:00			`#include <linux/range.h>`
memregion: Add cpu_cache_invalidate_memregion() interface With CXL security features, and CXL dynamic provisioning, global CPU cache flushing nvdimm requirements are no longer specific to that subsystem, even beyond the scope of security_ops. CXL will need such semantics for features not necessarily limited to persistent memory. The functionality this is enabling is to be able to instantaneously secure erase potentially terabytes of memory at once and the kernel needs to be sure that none of the data from before the erase is still present in the cache. It is also used when unlocking a memory device where speculative reads and firmware accesses could have cached poison from before the device was unlocked. Lastly this facility is used when mapping new devices, or new capacity into an established physical address range. I.e. when the driver switches DeviceA mapping AddressX to DeviceB mapping AddressX then any cached data from DeviceA:AddressX needs to be invalidated. This capability is typically only used once per-boot (for unlock), or once per bare metal provisioning event (secure erase), like when handing off the system to another tenant or decommissioning a device. It may also be used for dynamic CXL region provisioning. Users must first call cpu_cache_has_invalidate_memregion() to know whether this functionality is available on the architecture. On x86 this respects the constraints of when wbinvd() is tolerable. It is already the case that wbinvd() is problematic to allow in VMs due its global performance impact and KVM, for example, has been known to just trap and ignore the call. With confidential computing guest execution of wbinvd() may even trigger an exception. Given guests should not be messing with the bare metal address map via CXL configuration changes cpu_cache_has_invalidate_memregion() returns false in VMs. While this global cache invalidation facility, is exported to modules, since NVDIMM and CXL support can be built as a module, it is not for general use. The intent is that this facility is not available outside of specific "device-memory" use cases. To make that expectation as clear as possible the API is scoped to a new "DEVMEM" module namespace that only the NVDIMM and CXL subsystems are expected to import. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> 2022-10-28 18:34:04 +00:00			`#include <linux/bug.h>`
lib: Uplevel the pmem "region" ida to a global allocator In preparation for handling platform differentiated memory types beyond persistent memory, uplevel the "region" identifier to a global number space. This enables a device-dax instance to be registered to any memory type with guaranteed unique names. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:31 +00:00
device-dax: Add a driver for "hmem" devices Platform firmware like EFI/ACPI may publish "hmem" platform devices. Such a device is a performance differentiated memory range likely reserved for an application specific use case. The driver gives access to 100% of the capacity via a device-dax mmap instance by default. However, if over-subscription and other kernel memory management is desired the resulting dax device can be assigned to the core-mm via the kmem driver. This consumes "hmem" devices the producer of "hmem" devices is saved for a follow-on patch so that it can reference the new CONFIG_DEV_DAX_HMEM symbol to gate performing the enumeration work. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:43 +00:00			`struct memregion_info {`
			`int target_node;`
dax/hmem: Convey the dax range via memregion_info() In preparation for hmem platform devices to be unregistered, stop using platform_device_add_resources() to convey the address range. The platform_device_add_resources() API causes an existing "Soft Reserved" iomem resource to be re-parented under an inserted platform device resource. When that platform device is deleted it removes the platform device resource and all children. Instead, it is sufficient to convey just the address range and let request_mem_region() insert resources to indicate the devices active in the range. This allows the "Soft Reserved" resource to be re-enumerated upon the next probe event. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167602002217.1924368.7036275892522551624.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> 2023-02-10 09:07:02 +00:00			`struct range range;`
device-dax: Add a driver for "hmem" devices Platform firmware like EFI/ACPI may publish "hmem" platform devices. Such a device is a performance differentiated memory range likely reserved for an application specific use case. The driver gives access to 100% of the capacity via a device-dax mmap instance by default. However, if over-subscription and other kernel memory management is desired the resulting dax device can be assigned to the core-mm via the kmem driver. This consumes "hmem" devices the producer of "hmem" devices is saved for a follow-on patch so that it can reference the new CONFIG_DEV_DAX_HMEM symbol to gate performing the enumeration work. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:43 +00:00			`};`

lib: Uplevel the pmem "region" ida to a global allocator In preparation for handling platform differentiated memory types beyond persistent memory, uplevel the "region" identifier to a global number space. This enables a device-dax instance to be registered to any memory type with guaranteed unique names. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:31 +00:00			`#ifdef CONFIG_MEMREGION`
			`int memregion_alloc(gfp_t gfp);`
			`void memregion_free(int id);`
			`#else`
			`static inline int memregion_alloc(gfp_t gfp)`
			`{`
			`return -ENOMEM;`
			`}`
memregion: Fix memregion_free() fallback definition In the CONFIG_MEMREGION=n case, memregion_free() is meant to be a static inline. 0day reports: In file included from drivers/cxl/core/port.c:4: include/linux/memregion.h:19:6: warning: no previous prototype for function 'memregion_free' [-Wmissing-prototypes] Mark memregion_free() static. Fixes: 33dd70752cd7 ("lib: Uplevel the pmem "region" ida to a global allocator") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165601455171.4042645.3350844271068713515.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com> 2022-06-23 20:02:31 +00:00			`static inline void memregion_free(int id)`
lib: Uplevel the pmem "region" ida to a global allocator In preparation for handling platform differentiated memory types beyond persistent memory, uplevel the "region" identifier to a global number space. This enables a device-dax instance to be registered to any memory type with guaranteed unique names. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:31 +00:00			`{`
			`}`
			`#endif`
memregion: Add cpu_cache_invalidate_memregion() interface With CXL security features, and CXL dynamic provisioning, global CPU cache flushing nvdimm requirements are no longer specific to that subsystem, even beyond the scope of security_ops. CXL will need such semantics for features not necessarily limited to persistent memory. The functionality this is enabling is to be able to instantaneously secure erase potentially terabytes of memory at once and the kernel needs to be sure that none of the data from before the erase is still present in the cache. It is also used when unlocking a memory device where speculative reads and firmware accesses could have cached poison from before the device was unlocked. Lastly this facility is used when mapping new devices, or new capacity into an established physical address range. I.e. when the driver switches DeviceA mapping AddressX to DeviceB mapping AddressX then any cached data from DeviceA:AddressX needs to be invalidated. This capability is typically only used once per-boot (for unlock), or once per bare metal provisioning event (secure erase), like when handing off the system to another tenant or decommissioning a device. It may also be used for dynamic CXL region provisioning. Users must first call cpu_cache_has_invalidate_memregion() to know whether this functionality is available on the architecture. On x86 this respects the constraints of when wbinvd() is tolerable. It is already the case that wbinvd() is problematic to allow in VMs due its global performance impact and KVM, for example, has been known to just trap and ignore the call. With confidential computing guest execution of wbinvd() may even trigger an exception. Given guests should not be messing with the bare metal address map via CXL configuration changes cpu_cache_has_invalidate_memregion() returns false in VMs. While this global cache invalidation facility, is exported to modules, since NVDIMM and CXL support can be built as a module, it is not for general use. The intent is that this facility is not available outside of specific "device-memory" use cases. To make that expectation as clear as possible the API is scoped to a new "DEVMEM" module namespace that only the NVDIMM and CXL subsystems are expected to import. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> 2022-10-28 18:34:04 +00:00
			`/**`
			`* cpu_cache_invalidate_memregion - drop any CPU cached data for`
			`* memregions described by @res_desc`
			`* @res_desc: one of the IORES_DESC_* types`
			`*`
			`* Perform cache maintenance after a memory event / operation that`
			`* changes the contents of physical memory in a cache-incoherent manner.`
			`* For example, device memory technologies like NVDIMM and CXL have`
			`* device secure erase, and dynamic region provision that can replace`
			`* the memory mapped to a given physical address.`
			`*`
			`* Limit the functionality to architectures that have an efficient way`
			`* to writeback and invalidate potentially terabytes of address space at`
			`* once. Note that this routine may or may not write back any dirty`
			`* contents while performing the invalidation. It is only exported for`
			`* the explicit usage of the NVDIMM and CXL modules in the 'DEVMEM'`
			`* symbol namespace on bare platforms.`
			`*`
			`* Returns 0 on success or negative error code on a failure to perform`
			`* the cache maintenance.`
			`*/`
			`#ifdef CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION`
			`int cpu_cache_invalidate_memregion(int res_desc);`
			`bool cpu_cache_has_invalidate_memregion(void);`
			`#else`
			`static inline bool cpu_cache_has_invalidate_memregion(void)`
			`{`
			`return false;`
			`}`

			`static inline int cpu_cache_invalidate_memregion(int res_desc)`
			`{`
			`WARN_ON_ONCE("CPU cache invalidation required");`
			`return -ENXIO;`
			`}`
			`#endif`
lib: Uplevel the pmem "region" ida to a global allocator In preparation for handling platform differentiated memory types beyond persistent memory, uplevel the "region" identifier to a global number space. This enables a device-dax instance to be registered to any memory type with guaranteed unique names. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2019-11-07 01:43:31 +00:00			`#endif /* _MEMREGION_H_ */`