linux-stable/fs
Minchan Kim 0726b01e70 mm/madvise: pass mm to do_madvise
Patch series "introduce memory hinting API for external process", v9.

Now, we have MADV_PAGEOUT and MADV_COLD as madvise hinting API.  With
that, application could give hints to kernel what memory range are
preferred to be reclaimed.  However, in some platform(e.g., Android), the
information required to make the hinting decision is not known to the app.
Instead, it is known to a centralized userspace daemon(e.g.,
ActivityManagerService), and that daemon must be able to initiate reclaim
on its own without any app involvement.

To solve the concern, this patch introduces new syscall -
process_madvise(2).  Bascially, it's same with madvise(2) syscall but it
has some differences.

1. It needs pidfd of target process to provide the hint

2. It supports only MADV_{COLD|PAGEOUT|MERGEABLE|UNMEREABLE} at this
   moment.  Other hints in madvise will be opened when there are explicit
   requests from community to prevent unexpected bugs we couldn't support.

3. Only privileged processes can do something for other process's
   address space.

For more detail of the new API, please see "mm: introduce external memory
hinting API" description in this patchset.

This patch (of 3):

In upcoming patches, do_madvise will be called from external process
context so we shouldn't asssume "current" is always hinted process's
task_struct.

Furthermore, we must not access mm_struct via task->mm, but obtain it via
access_mm() once (in the following patch) and only use that pointer [1],
so pass it to do_madvise() as well.  Note the vma->vm_mm pointers are
safe, so we can use them further down the call stack.

And let's pass current->mm as arguments of do_madvise so it shouldn't
change existing behavior but prepare next patch to make review easy.

[vbabka@suse.cz: changelog tweak]
[minchan@kernel.org: use current->mm for io_uring]
  Link: http://lkml.kernel.org/r/20200423145215.72666-1-minchan@kernel.org
[akpm@linux-foundation.org: fix it for upstream changes]
[akpm@linux-foundation.org: whoops]
[rdunlap@infradead.org: add missing includes]

Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jann Horn <jannh@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: John Dias <joaodias@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: SeongJae Park <sj38.park@gmail.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Oleksandr Natalenko <oleksandr@redhat.com>
Cc: SeongJae Park <sjpark@amazon.de>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: <linux-man@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200901000633.1920247-1-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200622192900.22757-1-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200302193630.68771-2-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200622192900.22757-2-minchan@kernel.org
Link: https://lkml.kernel.org/r/20200901000633.1920247-2-minchan@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-18 09:27:09 -07:00
..
9p bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag 2020-09-24 13:43:39 -06:00
adfs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
affs affs: fix basic permission bits to actually work 2020-08-31 12:20:31 +02:00
afs afs fixes 2020-10-16 15:22:41 -07:00
autofs autofs: harden ioctl table 2020-10-16 11:11:22 -07:00
befs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
bfs
btrfs block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
cachefiles cachefiles: switch to kernel_write 2020-07-08 08:27:56 +02:00
ceph We have an inode number handling change, prompted by s390x which is 2020-08-28 10:33:04 -07:00
cifs cifs: Fix incomplete memory allocation on setxattr path 2020-10-10 15:52:54 -07:00
coda docs: filesystems: convert coda.txt to ReST 2020-05-05 09:22:21 -06:00
configfs fs: configfs: delete repeated words in comments 2020-10-16 11:11:19 -07:00
cramfs
crypto fscrypt: export fscrypt_d_revalidate() 2020-09-28 14:44:51 -07:00
debugfs debugfs: Fix module state check condition 2020-09-04 18:12:52 +02:00
devpts
dlm networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
ecryptfs mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
efivarfs efivarfs: Replace invalid slashes with exclamation marks in dentries. 2020-09-25 23:29:04 +02:00
efs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
erofs erofs: remove unnecessary enum entries 2020-10-09 10:37:42 +08:00
exfat exfat: fix use of uninitialized spinlock on error path 2020-10-07 14:27:13 +09:00
exportfs
ext2 \n 2020-10-15 14:56:15 -07:00
ext4 mm/readahead: make page_cache_ra_unbounded take a readahead_control 2020-10-16 11:11:16 -07:00
f2fs f2fs-for-5.10-rc1 2020-10-16 15:14:43 -07:00
fat fat: fix fat_ra_init() for data clusters == 0 2020-08-12 10:58:01 -07:00
freevxfs
fscache Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
fuse block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
gfs2 Fix memory leak on filesystem withdraw 2020-08-28 10:41:00 -07:00
hfs block: move block-related definitions out of fs.h 2020-06-24 09:16:02 -06:00
hfsplus treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
hostfs
hpfs hpfs: fix warning due to superfluous semicolon 2020-06-06 10:08:17 -07:00
hugetlbfs hugetlbfs: prevent filesystem stacking of hugetlbfs 2020-08-12 10:57:56 -07:00
iomap iomap: Call inode_dio_end() before generic_write_sync() 2020-09-28 08:51:08 -07:00
isofs Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
jbd2 jbd2: clean up checksum verification in do_one_pass() 2020-08-19 12:04:35 -04:00
jffs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
jfs fs: Introduce i_blocks_per_page 2020-09-21 08:59:26 -07:00
kernfs fsnotify: pass dir and inode arguments to fsnotify() 2020-07-27 23:15:48 +02:00
lockd
minix fs/minix: remove expected error message in block_to_path() 2020-08-12 10:58:00 -07:00
nfs block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
nfs_common treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfsd block: add a bdev_is_partition helper 2020-09-25 08:18:57 -06:00
nilfs2 nilfs2: fix some kernel-doc warnings for nilfs2 2020-10-16 11:11:22 -07:00
nls treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
notify mm, memcg: rework remote charging API to support nesting 2020-10-18 09:27:09 -07:00
ntfs ntfs: add check for mft record size in superblock 2020-10-13 18:38:27 -07:00
ocfs2 ocfs2: fix potential soft lockup during fstrim 2020-10-13 18:38:27 -07:00
omfs treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
openpromfs
orangefs orangefs: remove unnecessary assignment to variable ret 2020-08-04 15:01:58 -04:00
overlayfs ovl: use generic vfs_ioc_setflags_prepare() helper 2020-10-06 15:38:15 +02:00
proc mm: remove the now-unnecessary mmget_still_valid() hack 2020-10-16 11:11:22 -07:00
pstore treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
qnx4
qnx6 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
quota \n 2020-10-15 14:56:15 -07:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-16 11:11:22 -07:00
reiserfs reiserfs: Fix oops during mount 2020-10-01 11:15:31 +02:00
romfs ROMFS: support inode blocks calculation 2020-10-16 11:11:22 -07:00
squashfs squashfs: avoid bio_alloc() failure with 1Mbyte blocks 2020-08-21 09:52:53 -07:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2020-10-02 12:02:30 +02:00
sysv
tracefs
ubifs block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
udf udf: Limit sparing table size 2020-09-29 17:21:54 +02:00
ufs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
unicode unicode: Add utf8_casefold_hash 2020-09-10 14:03:31 -07:00
vboxsf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2020-10-15 15:11:56 -07:00
verity fs-verity: use smp_load_acquire() for ->i_verity_info 2020-07-21 16:02:41 -07:00
xfs New code for 5.10: 2020-10-14 14:06:06 -07:00
zonefs zonefs: add zone-capacity support 2020-08-11 17:42:24 +09:00
aio.c Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-12 16:35:51 -07:00
anon_inodes.c
attr.c
bad_inode.c fs: move the fiemap definitions out of fs.h 2020-06-03 23:16:55 -04:00
binfmt_aout.c exec: Rename flush_old_exec begin_new_exec 2020-05-07 16:55:47 -05:00
binfmt_elf.c binfmt_elf: take the mmap lock around find_extend_vma() 2020-10-18 09:27:09 -07:00
binfmt_elf_fdpic.c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot 2020-10-16 11:11:21 -07:00
binfmt_em86.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_script.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
block_dev.c block: add a bdget_part helper 2020-10-05 10:38:33 -06:00
buffer.c mm, memcg: rework remote charging API to support nesting 2020-10-18 09:27:09 -07:00
char_dev.c vfs: allow unprivileged whiteout creation 2020-05-14 16:44:23 +02:00
compat_binfmt_elf.c Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
coredump.c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot 2020-10-16 11:11:21 -07:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c iomap: Change calling convention for zeroing 2020-09-21 08:59:27 -07:00
dcache.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
dcookies.c
direct-io.c \n 2020-10-15 15:03:10 -07:00
drop_caches.c
eventfd.c eventfd: convert to f_op->read_iter() 2020-05-06 22:33:43 -04:00
eventpoll.c ep_create_wakeup_source(): dentry name can change under you... 2020-09-24 19:41:58 -04:00
exec.c powerpc updates for 5.10 2020-10-16 12:21:15 -07:00
fcntl.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fhandle.c
file.c io_uring: don't rely on weak ->files references 2020-09-30 20:32:32 -06:00
file_table.c Revert "fs: Do not check if there is a fsnotify watcher on pseudo inodes" 2020-06-29 09:40:55 -07:00
filesystems.c
fs-writeback.c block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
fs_context.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c
fs_struct.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
fs_types.c
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c fs: add a filesystem flag for THPs 2020-10-16 11:11:15 -07:00
internal.h fs: remove compat_sys_mount 2020-09-22 23:45:57 -04:00
io-wq.c io-wq: kill unused IO_WORKER_F_EXITING 2020-09-30 20:32:34 -06:00
io-wq.h io_uring: add blkcg accounting to offloaded operations 2020-09-30 20:32:34 -06:00
io_uring.c mm/madvise: pass mm to do_madvise 2020-10-18 09:27:09 -07:00
ioctl.c fs: remove ksys_ioctl 2020-07-31 08:16:01 +02:00
Kconfig tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
Kconfig.binfmt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
kernel_read_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
libfs.c fs: Add standard casefolding support 2020-09-10 14:03:31 -07:00
locks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
Makefile Char/Misc driver patches for 5.10-rc1 2020-10-15 10:01:51 -07:00
mbcache.c
mount.h proc/mounts: add cursor 2020-05-14 16:44:24 +02:00
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c fs: remove the unused SB_I_MULTIROOT flag 2020-09-24 13:43:38 -06:00
namespace.c Merge branch 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-12 16:44:57 -07:00
no-block.c
nsfs.c nsproxy: attach to namespaces via pidfds 2020-05-13 11:41:22 +02:00
open.c exec: move S_ISREG() check earlier 2020-08-12 10:58:01 -07:00
pipe.c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-11 11:11:35 -07:00
pnode.c
pnode.h
posix_acl.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
proc_namespace.c Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-06-04 13:54:34 -07:00
read_write.c Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-12 16:35:51 -07:00
readdir.c fs: remove ksys_getdents64 2020-07-31 08:16:00 +02:00
select.c pselect6() and friends: take handling the combined 6th/7th args into helper 2020-05-29 19:10:42 -04:00
seq_file.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
signalfd.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
splice.c Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-12 16:35:51 -07:00
stack.c
stat.c New code for 5.8: 2020-06-02 19:45:12 -07:00
statfs.c
super.c bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag 2020-09-24 13:43:39 -06:00
sync.c overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
timerfd.c
userfaultfd.c mm: remove the now-unnecessary mmget_still_valid() hack 2020-10-16 11:11:22 -07:00
utimes.c fs: expose utimes_common 2020-07-31 08:16:01 +02:00
xattr.c fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr 2020-10-13 18:38:27 -07:00