linux-stable/Documentation
Michal Hocko 0d73e773ed mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
[ Upstream commit 7550c60798 ]

Patch series "THP eligibility reporting via proc".

This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable.  The trigger for the change is a
regression report [2] and the long follow up discussion.  In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.

A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings.  Again a lack of a proper API led to an
abuse.

The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.

The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well.  [1]

http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz

This patch (of 3):

Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags.  And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.

Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.

http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps.  After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.

In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.

Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-26 09:32:44 +01:00
..
ABI xen/balloon: add runtime control for scrubbing ballooned out pages 2018-09-14 08:51:10 -04:00
accelerators ocxl: Document new OCXL IOCTLs 2018-06-03 20:40:33 +10:00
accounting
acpi ACPI: property: graph: Update graph documentation to use generic references 2018-07-23 12:44:52 +02:00
admin-guide x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off 2019-01-09 17:38:41 +01:00
aoe
arm ARM: Device-tree updates 2018-06-11 17:57:38 -07:00
arm64 Documentation/arm64/sve: Couple of improvements and typos 2018-08-29 11:33:19 +01:00
auxdisplay Doc: misc-devices: move lcd-panel-cgram.txt to auxdisplay/ 2018-04-12 16:08:02 +02:00
backlight
block block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
blockdev zram: introduce zram memory tracking 2018-06-07 17:34:34 -07:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-08-07 11:02:05 -07:00
bus-devices
cdrom Documentation/cdrom: fix German sharp s in LaTex 2018-03-08 19:35:29 -07:00
cgroup-v1 page cache: use xa_lock 2018-04-11 10:28:39 -07:00
cma
connector
console Documentation: corrections to console/console.txt 2018-08-10 16:09:40 -06:00
core-api idr: Change documentation license 2018-10-15 16:31:29 -04:00
cpu-freq cpufreq: Drop cpufreq_table_validate_and_show() 2018-04-10 08:40:45 +02:00
cpuidle cpuidle: Add definition of residency to sysfs documentation 2018-04-09 13:44:37 +02:00
crypto crypto: remove redundant type flags from tfm allocation 2018-07-09 00:30:29 +08:00
dev-tools doc: dev-tools: kselftest.rst: update contributing new tests 2018-06-29 09:01:50 -06:00
device-mapper dm raid: bump target version, update comments and documentation 2018-09-06 17:07:58 -04:00
devicetree can: hi311x: Use level-triggered interrupt 2018-12-01 09:37:30 +01:00
doc-guide Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
driver-api docs: fpga: document fpga manager flags 2018-09-30 08:49:55 -07:00
driver-model dmaengine: add a new helper dmaenginem_async_device_register 2018-07-30 10:50:22 +05:30
early-userspace initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
EDID
extcon
fault-injection Documentation: nvme: Documentation for nvme fault injection 2018-03-26 08:53:43 -06:00
fb uvesafb: Fix URLs in the documentation 2018-09-26 18:11:23 +02:00
features ARM: 8777/1: Hook up SYNC_CORE functionality for sys_membarrier() 2018-07-11 11:02:08 +01:00
filesystems mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps 2019-01-26 09:32:44 +01:00
firmware_class
fmc
fpga docs: fpga: add a document for FPGA Device Feature List (DFL) Framework Overview 2018-07-15 13:55:44 +02:00
gpio Documentation: gpio: Move drivers-on-gpio.txt to driver-api 2018-03-23 04:22:29 +01:00
gpu drm/msm/gpu: Add the buffer objects from the submit to the crash dump 2018-07-30 08:50:10 -04:00
hid
hwmon hwmon: (ina2xx) fix sysfs shunt resistor read access 2018-08-26 17:45:25 -07:00
i2c i2c: refactor function to release a DMA safe buffer 2018-08-30 23:13:15 +02:00
ia64 ia64: doc: tweak whitespace for 'console=' parameter 2018-03-05 14:41:38 -08:00
ide
iio
infiniband Documentation/ABI: update infiniband sysfs interfaces 2018-02-23 08:18:33 -07:00
input input: add MT_TOOL_DIAL 2018-07-17 15:33:47 +02:00
ioctl Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
isdn Documentation/isdn: check and fix dead links ... 2018-03-26 12:31:13 -04:00
kbuild kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt 2018-08-22 23:21:42 +09:00
kdump
kernel-hacking doc:it_IT: translation for kernel-hacking 2018-07-26 16:21:09 -06:00
laptops platform/x86: thinkpad_acpi: silence HKEY 0x6032, 0x60f0, 0x6030 2018-05-07 15:10:31 +03:00
leds
lightnvm
livepatch livepatch: Remove not longer valid limitations from the documentation 2018-05-24 15:37:57 +02:00
locking locking: Implement an algorithm choice for Wound-Wait mutexes 2018-07-03 09:44:36 +02:00
m68k
maintainer docs: Fix more broken references 2018-06-15 18:11:26 -03:00
md
media media: replace ADOBERGB by OPRGB 2018-11-13 11:08:54 -08:00
memory-devices
mic
mips
misc-devices pci_endpoint_test: Add 2 ioctl commands 2018-07-19 11:46:57 +01:00
mmc
mtd
namespaces
netlabel
networking net-tcp: /proc/sys/net/ipv4/tcp_probe_interval is a u32 not int 2018-09-26 20:33:21 -07:00
nfc
nios2
nvdimm
nvmem
openrisc
parisc
PCI Merge branch 'remotes/lorenzo/pci/dwc' 2018-08-15 14:59:11 -05:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf drivers/bus: Move Arm CCN PMU driver 2018-03-06 17:26:15 +01:00
phy
platform
power PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
powerpc powerpc: Document issues with TM on POWER9 2018-07-02 23:54:29 +10:00
pps
process Code of Conduct: Change the contact email address 2018-10-22 07:33:36 +01:00
pti
ptp ptp: Fix documentation to match code. 2018-03-26 12:13:21 -04:00
rapidio Documentation: rapidio: move sysfs interface to ABI 2018-02-23 08:25:45 -07:00
RCU rculist: Improve documentation for list_for_each_entry_from_rcu() 2018-07-12 15:39:25 -07:00
riscv perf: riscv: Add Document for Future Porting Guide 2018-06-04 14:02:11 -07:00
s390 vfio-ccw: update documentation 2018-03-01 17:32:14 +01:00
scheduler sched/deadline/Documentation: Add overrun signal and GRUB-PA documentation 2018-05-14 09:12:27 +02:00
scsi scsi: documentation: add scsi_mod.use_blk_mq to scsi-parameters 2018-08-27 12:26:10 -04:00
security Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables 2018-06-14 12:21:18 +09:00
serial
sh
sound ASoC: Updates for v4.19 2018-08-13 12:12:31 +02:00
sparc sparc64: Add support for ADI (Application Data Integrity) 2018-03-18 07:38:48 -07:00
sphinx Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
sphinx-static
spi
sysctl namei: allow restricted O_CREAT of FIFOs and regular files 2018-08-23 18:48:43 -07:00
target
thermal thermal: Add cooling device's statistics in sysfs 2018-04-02 21:49:01 +08:00
timers timekeeping.txt: Correct maxCount of n-bit binary counter 2018-07-23 09:33:06 -06:00
trace This was a moderately busy cycle for docs, with the usual collection of 2018-08-14 14:29:31 -07:00
translations This was a moderately busy cycle for docs, with the usual collection of 2018-08-14 14:29:31 -07:00
usb USB-serial updates for v4.19-rc1 2018-07-20 21:47:15 +02:00
userspace-api x86/speculation: Add prctl() control for indirect branch speculation 2018-12-05 19:32:04 +01:00
virtual KVM: x86: Control guest reads of MSR_PLATFORM_INFO 2018-09-20 00:51:46 +02:00
vm docs/vm: move ksm and transhuge from "user" to "internals" section. 2018-05-29 06:45:55 -06:00
w1 w1: fix w1_ds2438 documentation 2018-07-07 17:27:13 +02:00
watchdog watchdog: remove bfin_wdt driver 2018-03-26 15:57:04 +02:00
wimax
x86 x86/mm: Move LDT remap out of KASLR region on 5-level paging 2018-11-27 16:13:08 +01:00
xtensa
.gitignore
00-INDEX docs: admin-guide: add cgroup-v2 documentation 2018-05-10 15:42:41 -06:00
atomic_bitops.txt
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py ext4: import inode data fork chapter from wiki page 2018-07-29 15:45:00 -04:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt Documentation: remove stale firmware API reference 2018-05-14 16:44:41 +02:00
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff
efi-stub.txt
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
Intel-IOMMU.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt kprobes/Documentation: Fix various typos 2018-06-22 11:10:55 +02:00
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt
mailbox.txt
Makefile
memory-barriers.txt sched/Documentation: Update wake_up() & co. memory-barrier guarantees 2018-07-17 09:30:34 +02:00
memory-hotplug.txt
men-chameleon-bus.txt
nommu-mmap.txt Documentation: nommu-map: Fix duplicate word typo 2018-06-26 09:01:27 -06:00
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt rfkill: Fix several typos in documentation 2018-06-15 13:36:08 +02:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-06-08 10:24:27 -06:00
vfio.txt vfio: fix documentation 2018-05-08 09:16:41 -06:00
video-output.txt
xillybus.txt
xz.txt
zorro.txt