No description
Find a file
David Hildenbrand 7b7b27214b mm/memory_hotplug: introduce add_memory_driver_managed()
Patch series "mm/memory_hotplug: Interface to add driver-managed system
ram", v4.

kexec (via kexec_load()) can currently not properly handle memory added
via dax/kmem, and will have similar issues with virtio-mem.  kexec-tools
will currently add all memory to the fixed-up initial firmware memmap.  In
case of dax/kmem, this means that - in contrast to a proper reboot - how
that persistent memory will be used can no longer be configured by the
kexec'd kernel.  In case of virtio-mem it will be harmful, because that
memory might contain inaccessible pieces that require coordination with
hypervisor first.

In both cases, we want to let the driver in the kexec'd kernel handle
detecting and adding the memory, like during an ordinary reboot.
Introduce add_memory_driver_managed().  More on the samentics are in patch
#1.

In the future, we might want to make this behavior configurable for
dax/kmem- either by configuring it in the kernel (which would then also
allow to configure kexec_file_load()) or in kexec-tools by also adding
"System RAM (kmem)" memory from /proc/iomem to the fixed-up initial
firmware memmap.

More on the motivation can be found in [1] and [2].

[1] https://lkml.kernel.org/r/20200429160803.109056-1-david@redhat.com
[2] https://lkml.kernel.org/r/20200430102908.10107-1-david@redhat.com

This patch (of 3):

Some device drivers rely on memory they managed to not get added to the
initial (firmware) memmap as system RAM - so it's not used as initial
system RAM by the kernel and the driver is under control.  While this is
the case during cold boot and after a reboot, kexec is not aware of that
and might add such memory to the initial (firmware) memmap of the kexec
kernel.  We need ways to teach kernel and userspace that this system ram
is different.

For example, dax/kmem allows to decide at runtime if persistent memory is
to be used as system ram.  Another future user is virtio-mem, which has to
coordinate with its hypervisor to deal with inaccessible parts within
memory resources.

We want to let users in the kernel (esp. kexec) but also user space
(esp. kexec-tools) know that this memory has different semantics and
needs to be handled differently:
1. Don't create entries in /sys/firmware/memmap/
2. Name the memory resource "System RAM ($DRIVER)" (exposed via
   /proc/iomem) ($DRIVER might be "kmem", "virtio_mem").
3. Flag the memory resource IORESOURCE_MEM_DRIVER_MANAGED

/sys/firmware/memmap/ [1] represents the "raw firmware-provided memory
map" because "on most architectures that firmware-provided memory map is
modified afterwards by the kernel itself".  The primary user is kexec on
x86-64.  Since commit d96ae53091 ("memory-hotplug: create
/sys/firmware/memmap entry for new memory"), we add all hotplugged memory
to that firmware memmap - which makes perfect sense for traditional memory
hotplug on x86-64, where real HW will also add hotplugged DIMMs to the
firmware memmap.  We replicate what the "raw firmware-provided memory map"
looks like after hot(un)plug.

To keep things simple, let the user provide the full resource name instead
of only the driver name - this way, we don't have to manually
allocate/craft strings for memory resources.  Also use the resource name
to make decisions, to avoid passing additional flags.  In case the name
isn't "System RAM", it's special.

We don't have to worry about firmware_map_remove() on the removal path.
If there is no entry, it will simply return with -EINVAL.

We'll adapt dax/kmem in a follow-up patch.

[1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: http://lkml.kernel.org/r/20200508084217.9160-1-david@redhat.com
Link: http://lkml.kernel.org/r/20200508084217.9160-3-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:23 -07:00
arch powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() 2020-06-04 19:06:23 -07:00
block for-5.8/drivers-2020-06-01 2020-06-02 15:37:03 -07:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto crypto: engine - do not requeue in case of fatal error 2020-05-28 17:27:52 +10:00
Documentation mm/debug: add tests validating architecture page table helpers 2020-06-04 19:06:21 -07:00
drivers drm: remove drm specific kmap_atomic code 2020-06-04 19:06:22 -07:00
fs Merge branch 'akpm' (patches from Andrew) 2020-06-03 20:24:15 -07:00
include mm/memory_hotplug: introduce add_memory_driver_managed() 2020-06-04 19:06:23 -07:00
init Merge branch 'akpm' (patches from Andrew) 2020-06-03 20:24:15 -07:00
ipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
kernel kcov: collect coverage from interrupts 2020-06-04 19:06:20 -07:00
lib mm/debug: add tests validating architecture page table helpers 2020-06-04 19:06:21 -07:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm/memory_hotplug: introduce add_memory_driver_managed() 2020-06-04 19:06:23 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
scripts Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
security mm: add kvfree_sensitive() for freeing sensitive data objects 2020-06-04 19:06:22 -07:00
sound MIPS updates for v5.8: 2020-06-03 13:32:21 -07:00
tools Merge branch 'akpm' (patches from Andrew) 2020-06-03 20:24:15 -07:00
usr kbuild: fix comment about missing include guard detection 2020-04-11 12:09:48 +09:00
virt KVM/arm64 updates for Linux 5.8: 2020-06-01 04:26:27 -04:00
.clang-format block: add bio_for_each_bvec_all() 2020-05-25 11:25:24 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
.mailmap A fair amount of stuff this time around, dominated by yet another massive 2020-06-01 15:45:27 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS mailmap: change email for Ricardo Ribalda 2020-05-25 18:59:59 -06:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS media updates for v5.8-rc1 2020-06-03 20:59:38 -07:00
Makefile arm64 updates for 5.8 2020-06-01 15:18:27 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.