linux-stable/mm
Eric B Munson 5bbe3547aa mm: allow compaction of unevictable pages
Currently, pages which are marked as unevictable are protected from
compaction, but not from other types of migration.  The POSIX real time
extension explicitly states that mlock() will prevent a major page
fault, but the spirit of this is that mlock() should give a process the
ability to control sources of latency, including minor page faults.
However, the mlock manpage only explicitly says that a locked page will
not be written to swap and this can cause some confusion.  The
compaction code today does not give a developer who wants to avoid swap
but wants to have large contiguous areas available any method to achieve
this state.  This patch introduces a sysctl for controlling compaction
behavior with respect to the unevictable lru.  Users who demand no page
faults after a page is present can set compact_unevictable_allowed to 0
and users who need the large contiguous areas can enable compaction on
locked memory by leaving the default value of 1.

To illustrate this problem I wrote a quick test program that mmaps a
large number of 1MB files filled with random data.  These maps are
created locked and read only.  Then every other mmap is unmapped and I
attempt to allocate huge pages to the static huge page pool.  When the
compact_unevictable_allowed sysctl is 0, I cannot allocate hugepages
after fragmenting memory.  When the value is set to 1, allocations
succeed.

Signed-off-by: Eric B Munson <emunson@akamai.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:17 -07:00
..
kasan kasan, module, vmalloc: rework shadow allocation for modules 2015-03-12 18:46:08 -07:00
backing-dev.c Merge branch 'lazytime' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-02-17 16:12:34 -08:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma.c mm: cma: constify and use correct signness in mm/cma.c 2015-04-14 16:49:04 -07:00
cma.h mm: cma: allocation trigger 2015-04-14 16:49:00 -07:00
cma_debug.c mm-cma-allocation-trigger-fix 2015-04-14 16:49:00 -07:00
compaction.c mm: allow compaction of unevictable pages 2015-04-15 16:35:17 -07:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c vfs: remove get_xip_mem 2015-02-16 17:56:03 -08:00
failslab.c
filemap.c Merge branch 'akpm' (patches from Andrew) 2015-04-14 16:49:17 -07:00
frontswap.c mm/frontswap.c: fix the condition in BUG_ON 2014-12-10 17:41:08 -08:00
gup.c mm: move mm_populate()-related code to mm/gup.c 2015-04-14 16:49:00 -07:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c mm, thp: really limit transparent hugepage allocation to local node 2015-04-14 16:49:03 -07:00
hugetlb.c mm, hugetlb: abort __get_user_pages if current has been oom killed 2015-04-14 16:49:06 -07:00
hugetlb_cgroup.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c
internal.h mm/compaction: enhance compaction finish condition 2015-04-14 16:49:01 -07:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig mm: cma: debugfs interface 2015-04-14 16:49:00 -07:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c kmemleak: disable kasan instrumentation for kmemleak 2015-02-13 21:21:41 -08:00
ksm.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
list_lru.c memcg: reparent list_lrus and free kmemcg_id on css offline 2015-02-12 18:54:10 -08:00
maccess.c
madvise.c vfs: remove get_xip_mem 2015-02-16 17:56:03 -08:00
Makefile mm: move memtest under mm 2015-04-14 16:49:06 -07:00
memblock.c mm/memblock.c: rename local variable of memblock_type to `type' 2015-04-14 16:49:00 -07:00
memcontrol.c memcg: remove obsolete comment 2015-04-15 16:35:16 -07:00
memory-failure.c mm/memory-failure.c: define page types for action_result() in one place 2015-04-15 16:35:16 -07:00
memory.c mm: refactor do_wp_page handling of shared vma into a function 2015-04-14 16:49:03 -07:00
memory_hotplug.c mm, hotplug: fix concurrent memory hot-add deadlock 2015-04-14 16:49:00 -07:00
mempolicy.c mm, thp: really limit transparent hugepage allocation to local node 2015-04-14 16:49:03 -07:00
mempool.c mm, mempool: do not allow atomic resizing 2015-04-14 16:49:06 -07:00
memtest.c memtest: use phys_addr_t for physical addresses 2015-04-14 16:49:06 -07:00
migrate.c mm/migrate: check-before-clear PageSwapCache 2015-04-15 16:35:17 -07:00
mincore.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
mlock.c mm: move mm_populate()-related code to mm/gup.c 2015-04-14 16:49:00 -07:00
mm_init.c mm/mm_init.c: mark mminit_loglevel __meminitdata 2015-02-12 18:54:11 -08:00
mmap.c mm: rename __mlock_vma_pages_range() to populate_vma_page_range() 2015-04-14 16:49:00 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c mm: numa: preserve PTE write permissions across a NUMA hinting fault 2015-03-25 16:20:31 -07:00
mremap.c fix mremap() vs. ioctx_kill() race 2015-04-06 17:50:59 -04:00
msync.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
nobootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
nommu.c mm/nommu.c: export symbol max_mapnr 2015-03-12 18:46:08 -07:00
oom_kill.c mm/oom_kill.c: fix typo in comment 2015-04-15 16:35:16 -07:00
page-writeback.c mm/page-writeback: check-before-clear PageReclaim 2015-04-15 16:35:17 -07:00
page_alloc.c mm/page_alloc.c: clean up comment 2015-04-14 16:49:04 -07:00
page_counter.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c fs: move struct kiocb to fs.h 2015-03-25 20:28:11 -04:00
page_isolation.c mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate 2015-03-25 16:20:30 -07:00
page_owner.c mm/page_owner.c: remove unnecessary stack_trace field 2015-02-11 17:06:07 -08:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: Fix trivial typos in comments 2015-03-24 13:41:54 -04:00
pgtable-generic.c mm: convert p[te|md]_mknonnuma and remaining page table manipulations 2015-02-12 18:54:08 -08:00
process_vm_access.c process_vm_access: switch to {compat_,}import_iovec() 2015-04-11 22:27:12 -04:00
quicklist.c
readahead.c fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info 2015-01-20 14:03:04 -07:00
rmap.c mm: fix anon_vma->degree underflow in anon_vma endless growing prevention 2015-03-25 16:20:30 -07:00
shmem.c Merge branch 'iocb' into for-next 2015-04-11 22:24:41 -04:00
slab.c mm: remove GFP_THISNODE 2015-04-14 16:49:03 -07:00
slab.h slub: make dead caches discard free slabs immediately 2015-02-12 18:54:10 -08:00
slab_common.c mm: slub: add kernel address sanitizer support for slub allocator 2015-02-13 21:21:41 -08:00
slob.c slob: make slob_alloc_node() static and remove EXPORT_SYMBOL() 2015-04-14 16:48:59 -07:00
slub.c slub: use bool function return values of true/false not 1/0 2015-04-14 16:48:59 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap.c Merge branch 'for-3.20/bdi' of git://git.kernel.dk/linux-block 2015-02-12 13:50:21 -08:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c fs: remove mapping->backing_dev_info 2015-01-20 14:03:05 -07:00
swapfile.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
truncate.c page_writeback: clean up mess around cancel_dirty_page() 2015-04-14 16:49:01 -07:00
util.c mm/util: add kstrdup_const 2015-02-13 21:21:35 -08:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm: change vunmap to tear down huge KVA mappings 2015-04-14 16:49:04 -07:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c Merge branch 'akpm' (patches from Andrew) 2015-02-12 18:54:28 -08:00
vmstat.c vmstat: Reduce time interval to stat update on idle cpu 2015-02-11 17:06:07 -08:00
workingset.c list_lru: add helpers to isolate items 2015-02-12 18:54:10 -08:00
zbud.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00
zpool.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00
zsmalloc.c mm/zsmalloc: add statistics support 2015-02-12 18:54:12 -08:00
zswap.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00