linux-stable/Documentation/filesystems
Vlastimil Babka c261e7d94f mm, proc: account for shmem swap in /proc/pid/smaps
Currently, /proc/pid/smaps will always show "Swap: 0 kB" for
shmem-backed mappings, even if the mapped portion does contain pages
that were swapped out.  This is because, unlike private anonymous
mappings, shmem does not change the pte to a swap entry but to pte_none
when swapping the page out.  In the smaps page walk, such a page thus
looks like it was never faulted in.

This patch changes smaps_pte_entry() to determine the swap status for
such pte_none entries for shmem mappings, similarly to how
mincore_page() does it.  Swapped out shmem pages are thus accounted for.
For private mappings of tmpfs files that COWed some of the pages, the
swapped out status of the original shmem pages is naturally ignored.  If
some of the private copies were also swapped out, they are accounted via
their page table swap entries, so the resulting reported swap usage is
then a sum of both swapped out private copies and swapped out shmem
pages that were not COWed.  No double accounting can thus happen.
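
A minimal userspace sketch of how the resulting numbers could be read
back, assuming the usual smaps layout of one header line per mapping
followed by "Key: value kB" lines.  The helper name swap_by_mapping and
the SAMPLE text are hypothetical and the sample values are made up for
illustration; they are not output from a real system.

```python
import re

# A mapping header looks like: "start-end perms offset dev inode [path]".
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")


def swap_by_mapping(smaps_text):
    """Sum the "Swap:" field (in kB) per mapping name in smaps-style text."""
    totals = {}
    name = None
    for line in smaps_text.splitlines():
        match = HEADER.match(line)
        if match:
            # Mappings without a path are grouped under a placeholder name.
            name = match.group(1) or "[anon]"
            continue
        # "SwapPss:" does not start with "Swap:", so it is skipped here.
        if name is not None and line.startswith("Swap:"):
            totals[name] = totals.get(name, 0) + int(line.split()[1])
    return totals


# Illustrative input only (hypothetical addresses and sizes).
SAMPLE = """\
7f0000000000-7f0000200000 rw-s 00000000 00:13 1234 /dev/shm/file
Size:               2048 kB
Rss:                1024 kB
Swap:                512 kB
SwapPss:               0 kB
7f0000400000-7f0000500000 rw-p 00000000 00:00 0
Size:               1024 kB
Swap:                 64 kB
SwapPss:              64 kB
"""

if __name__ == "__main__":
    for mapping, kb in swap_by_mapping(SAMPLE).items():
        print(f"{mapping}: {kb} kB swapped")
```

On a live system the same function could be fed the contents of
/proc/<pid>/smaps instead of SAMPLE.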

The accounting is arguably still not as precise as for private anonymous
mappings, since now we will also count pages that the process in
question never accessed, but another process populated them and then let
them become swapped out.  I believe it is still less confusing and
subtle than not showing any swap usage by shmem mappings at all.
The swapped out counter might be of interest to users who would like to
prevent future swapins during performance critical operations and
pre-fault the pages at their convenience.  Especially for larger swapped
out regions, the cost of swapin is much higher than a fresh page
allocation.  So a differentiation between pte_none and swapped out is
important for those use cases.
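
As a rough illustration of the pre-faulting use case, the sketch below
reads one byte from every page of a mapping so that any swapped out
pages are brought back before a critical section.  The function name
prefault is hypothetical, and the approach (touching each page from
userspace) is an assumption about how a user might do this, not
something this patch provides.

```python
import mmap


def prefault(mapping, page_size=mmap.PAGESIZE):
    """Read one byte per page so each page is faulted (or swapped) in.

    Returns the XOR of the bytes read, only so the reads have a visible
    result.
    """
    checksum = 0
    for offset in range(0, len(mapping), page_size):
        checksum ^= mapping[offset]
    return checksum


if __name__ == "__main__":
    # Small shared anonymous mapping, used here only as a stand-in.
    region = mmap.mmap(-1, 16 * mmap.PAGESIZE)
    prefault(region)
    region.close()
```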

One downside of this patch is that it makes /proc/pid/smaps more
expensive for shmem mappings, as we consult the radix tree for each
pte_none entry, so the overall complexity is O(n*log(n)).  I have
measured this on a process that creates a 2GB mapping and dirties single
pages with a stride of 2MB, and timed how long it takes to cat
/proc/pid/smaps of this process 100 times.
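
The measurement setup described above could be approximated as follows.
This is a sketch under stated assumptions, not the program used for the
numbers below: the helper names are hypothetical, the process reads its
own smaps rather than being inspected by a separate cat, and
/dev/shm/file is only an example path.

```python
import mmap
import os
import time

STRIDE = 2 * 1024 * 1024  # 2MB stride, as in the description above


def make_mapping(size, shm_path=None):
    """Private anonymous mapping, or a shared mapping of shm_path if given."""
    if shm_path is None:
        return mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    fd = os.open(shm_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED)
    finally:
        os.close(fd)  # the mapping stays valid after the fd is closed


def dirty_with_stride(mapping, stride=STRIDE):
    """Write one byte every `stride` bytes; return how many pages were hit."""
    touched = 0
    for offset in range(0, len(mapping), stride):
        mapping[offset] = 1
        touched += 1
    return touched


def time_smaps_reads(pid, repeats=100):
    """Return the seconds spent reading /proc/<pid>/smaps `repeats` times."""
    path = f"/proc/{pid}/smaps"
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as smaps:
            smaps.read()
    return time.perf_counter() - start


if __name__ == "__main__":
    size = 2 * 1024 ** 3  # 2GB mapping
    region = make_mapping(size)  # pass "/dev/shm/file" for the shmem case
    dirty_with_stride(region)
    print(f"{time_smaps_reads(os.getpid()):.3f}s for 100 reads")
```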

Private anonymous mapping:

real    0m0.949s
user    0m0.116s
sys     0m0.348s

Mapping of a /dev/shm/file:

real    0m3.831s
user    0m0.180s
sys     0m3.212s

The difference is rather substantial, so the next patch will reduce the
cost for shared or read-only mappings.

In a less controlled experiment, I've gathered pids of processes on my
desktop that have either '/dev/shm/*' or 'SYSV*' in smaps.  This
included the Chrome browser and some KDE processes.  Again, I've run cat
/proc/pid/smaps on each 100 times.

Before this patch:

real    0m9.050s
user    0m0.518s
sys     0m8.066s

After this patch:

real    0m9.221s
user    0m0.541s
sys     0m8.187s

This suggests low impact on average systems.

Note that this patch doesn't attempt to adjust the SwapPss field for
shmem mappings, which would need extra work to determine who else could
have the pages mapped.  Thus the value stays zero except for COWed
swapped out pages in a shmem mapping, which are accounted as usual.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
caching FS-Cache: Count the number of initialised operations 2015-04-02 14:28:53 +01:00
cifs
configfs configfs: implement binary attributes 2016-01-04 12:31:46 +01:00
nfs ipconfig: send Client-identifier in DHCP requests 2015-10-18 19:23:52 -07:00
pohmelfs
.gitignore Documentation: update .gitignore files 2014-09-26 11:02:59 +02:00
00-INDEX dax: replace XIP documentation with DAX documentation 2015-02-16 17:56:03 -08:00
9p.txt
adfs.txt
affs.txt
afs.txt
autofs4-mount-control.txt
autofs4.txt autofs: the documentation I wanted to read 2014-10-14 02:18:17 +02:00
automount-support.txt Documentation: remove outdated information from automount-support.txt 2015-05-15 01:10:38 -04:00
befs.txt
bfs.txt
btrfs.txt Documentation: filesystems: btrfs: Fixed typos and whitespace 2015-07-09 14:31:06 -06:00
ceph.txt
coda.txt
cramfs.txt
dax.txt dax: add huge page fault support 2015-09-08 15:35:28 -07:00
debugfs.txt debugfs: Pass bool pointer to debugfs_create_bool() 2015-10-04 11:36:07 +01:00
devpts.txt
directory-locking
dlmfs.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
dnotify.txt
dnotify_test.c
ecryptfs.txt
efivarfs.txt
exofs.txt
ext2.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext3.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext4.txt ext4: add DAX functionality 2015-02-16 17:56:04 -08:00
f2fs.txt f2fs: introduce new option for controlling data flush 2015-12-16 09:25:48 -08:00
fiemap.txt fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
files.txt
fuse.txt
gfs2-glocks.txt gfs2: Remove gl_spin define 2015-10-29 12:57:48 -05:00
gfs2-uevents.txt
gfs2.txt
hfs.txt
hfsplus.txt
hpfs.txt
inotify.txt inotify: update documentation to reflect code changes 2015-02-10 14:30:28 -08:00
isofs.txt
jfs.txt
Locking switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
locks.txt
logfs.txt
Makefile configfs: remove old API 2015-10-13 22:17:57 -07:00
mandatory-locking.txt
ncpfs.txt
nilfs2.txt
ntfs.txt NTFS: Remove changelog from Documentation/filesystems/ntfs.txt. 2014-10-16 12:43:57 +01:00
ocfs2.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
omfs.txt
overlayfs.txt Remove email address from Documentation/filesystems/overlayfs.txt 2015-11-11 10:04:53 -07:00
path-lookup.md Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
path-lookup.txt Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
porting switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
proc.txt mm, proc: account for shmem swap in /proc/pid/smaps 2016-01-14 16:00:49 -08:00
qnx6.txt
quota.txt quota: Update documentation 2015-05-18 11:23:07 +02:00
ramfs-rootfs-initramfs.txt
relay.txt
romfs.txt
seq_file.txt Documentation: update seq_file 2014-12-29 15:40:18 -07:00
sharedsubtree.txt
spufs.txt
squashfs.txt Squashfs: Add LZ4 compression configuration option 2014-11-27 18:48:44 +00:00
sysfs-pci.txt
sysfs-tagging.txt sysfs-tagging.txt: fix pre-kernfs references 2015-09-13 14:38:51 -06:00
sysfs.txt sysfs.txt: mention that store method buffers are null-terminated 2015-09-13 14:38:51 -06:00
sysv-fs.txt
tmpfs.txt
ubifs.txt
udf.txt
ufs.txt
vfat.txt
vfs.txt switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
xfs-delayed-logging-design.txt
xfs-self-describing-metadata.txt
xfs.txt xfs: fix kernel version in docs 2015-06-01 07:15:38 +10:00