linux-stable/mm
Jakub Kicinski 5c84ab9c96 mm, memcg: do not high throttle allocators based on wraparound
commit 9b8b17541f upstream.

If a cgroup violates its memory.high constraints, we may end up unduly
penalising it.  For example, for the following hierarchy:

  A:   max high, 20 usage
  A/B: 9 high, 10 usage
  A/C: max high, 10 usage

We would end up doing the following calculation below when calculating
high delay for A/B:

  A/B: 10 - 9 = 1...
  A:   20 - PAGE_COUNTER_MAX = 21, so set max_overage to 21.

This gets worse with higher disparities in usage in the parent.

I have no idea how this disappeared from the final version of the patch,
but it is certainly Not Good(tm).  This wasn't obvious in testing because,
for a simple cgroup hierarchy with only one child, the result is usually
roughly the same.  It's only in more complex hierarchies that things go
really awry (although still, the effects are limited to a maximum of 2
seconds in schedule_timeout_killable at a maximum).

[chris@chrisdown.name: changelog]
Fixes: e26733e0d0 ("mm, memcg: throttle allocators based on ancestral memory.high")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[5.4.x]
Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:17 +02:00
..
kasan mm: introduce compound_nr() 2019-09-24 15:54:08 -07:00
backing-dev.c memcg: fix a crash in wb_workfn when a device disappears 2020-02-11 04:35:11 -08:00
balloon_compaction.c mm/balloon_compaction: suppress allocation warnings 2019-09-04 07:42:01 -04:00
cleancache.c Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
cma.c mm/cma.c: fail if fixed declaration can't be honored 2019-07-16 19:23:21 -07:00
cma.h
cma_debug.c mm/cma_debug.c: fix the break condition in cma_maxchunk_get() 2019-05-14 09:47:45 -07:00
compaction.c mm, compaction: fix wrong pfn handling in __reset_isolation_pfn() 2019-10-14 15:04:01 -07:00
debug.c mm/debug.c: always print flags in dump_page() 2020-03-05 16:43:51 +01:00
debug_page_ref.c
dmapool.c mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options 2019-07-12 11:05:46 -07:00
early_ioremap.c
fadvise.c fs: Export generic_fadvise() 2019-08-30 22:43:58 -07:00
failslab.c mm/failslab.c: by default, do not fail allocations with direct reclaim only 2019-07-12 11:05:43 -07:00
filemap.c mm: drop mmap_sem before calling balance_dirty_pages() in write fault 2020-01-09 10:19:55 +01:00
frame_vector.c mm: untag user pointers in get_vaddr_frames 2019-09-25 17:51:41 -07:00
frontswap.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482 2019-06-19 17:09:52 +02:00
gup.c mm/gup: allow FOLL_FORCE for get_user_pages_fast() 2020-03-05 16:43:51 +01:00
gup_benchmark.c mm/gup: fix memory leak in __gup_benchmark_ioctl 2020-01-09 10:20:00 +01:00
highmem.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
hmm.c pagewalk: separate function pointers from iterator data 2019-09-07 04:28:04 -03:00
huge_memory.c mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() 2020-03-12 13:00:19 +01:00
hugetlb.c mm/hugetlb: defer freeing of huge pages if in non-task context 2020-01-09 10:20:07 +01:00
hugetlb_cgroup.c mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup() 2019-11-15 18:34:00 -08:00
hwpoison-inject.c hwpoison-inject: no need to check return value of debugfs_create functions 2019-06-03 15:39:40 +02:00
init-mm.c mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch 2019-10-19 06:32:32 -04:00
internal.h mm: drop mmap_sem before calling balance_dirty_pages() in write fault 2020-01-09 10:19:55 +01:00
interval_tree.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 2019-06-19 17:09:08 +02:00
Kconfig mm,thp: add read-only THP support for (non-shmem) FS 2019-09-24 15:54:11 -07:00
Kconfig.debug mm, page_owner, debug_pagealloc: save and dump freeing stack trace 2019-09-24 15:54:08 -07:00
khugepaged.c mm,thp: recheck each page before collapsing file THP 2019-11-15 18:34:00 -08:00
kmemleak-test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
kmemleak.c kmemleak: Do not corrupt the object_list during clean-up 2019-10-14 08:56:16 -07:00
ksm.c mm/ksm.c: don't WARN if page is still mapped in remove_stable_node() 2019-11-22 09:11:18 -08:00
list_lru.c mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages 2019-07-12 11:05:44 -07:00
maccess.c uaccess: Add non-pagefault user-space write function 2020-01-17 19:48:40 +01:00
madvise.c mm: do not allow MADV_PAGEOUT for CoW pages 2020-03-25 08:25:57 +01:00
Makefile mm: silence -Woverride-init/initializer-overrides 2019-09-24 15:54:10 -07:00
memblock.c mm: memblock: do not enforce current limit for memblock_phys* family 2019-10-19 06:32:32 -04:00
memcontrol.c mm, memcg: do not high throttle allocators based on wraparound 2020-04-17 10:50:17 +02:00
memfd.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
memory-failure.c mm/memory-failure.c: don't access uninitialized memmaps in memory_failure() 2019-10-19 06:32:31 -04:00
memory.c mm: drop mmap_sem before calling balance_dirty_pages() in write fault 2020-01-09 10:19:55 +01:00
memory_hotplug.c mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled 2020-03-12 13:00:19 +01:00
mempolicy.c mm: mempolicy: require at least one nodeid for MPOL_PREFERRED 2020-04-08 09:08:47 +02:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memremap.c mm/memory_hotplug: shrink zones when offlining memory 2020-01-09 10:19:56 +01:00
memtest.c
migrate.c mm: move_pages: report the number of non-attempted pages 2020-02-11 04:35:13 -08:00
mincore.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
mlock.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
mm_init.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mmap.c mm: Avoid creating virtual address aliases in brk()/mmap()/mremap() 2020-02-28 17:22:21 +01:00
mmu_context.c
mmu_gather.c mm/mmu_gather: invalidate TLB correctly on batch allocation failure and flush 2020-02-11 04:35:42 -08:00
mmu_notifier.c mm/mmu_notifiers: use the right return code for WARN_ON 2019-11-06 08:47:50 -08:00
mmzone.c
mprotect.c mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa 2020-03-12 13:00:19 +01:00
mremap.c mm: Avoid creating virtual address aliases in brk()/mmap()/mremap() 2020-02-28 17:22:21 +01:00
msync.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
nommu.c x86/mm: split vmalloc_sync_all() 2020-03-25 08:25:58 +01:00
oom_kill.c mm/oom: fix pgtables units mismatch in Killed process message 2020-01-09 10:19:57 +01:00
page-writeback.c mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio() 2020-01-23 08:22:41 +01:00
page_alloc.c mm/page_alloc.c: fix uninitialized memmaps on a partially populated last section 2020-02-11 04:35:42 -08:00
page_counter.c
page_ext.c mm, page_owner: fix off-by-one error in __set_page_owner_handle() 2019-10-14 15:04:00 -07:00
page_idle.c mm/page_idle.c: fix oops because end_pfn is larger than max_pfn 2019-06-29 16:43:45 +08:00
page_io.c mm/page_io.c: do not free shared swap slots 2019-11-15 18:34:00 -08:00
page_isolation.c mm/page_isolation.c: change the prototype of undo_isolate_page_range() 2019-07-12 11:05:43 -07:00
page_owner.c mm/page_owner: don't access uninitialized memmaps when reading /proc/pagetypeinfo 2019-10-19 06:32:31 -04:00
page_poison.c mm/page_poison.c: fix a typo in a comment 2019-09-24 15:54:08 -07:00
page_vma_mapped.c mm: introduce page_size() 2019-09-24 15:54:08 -07:00
pagewalk.c pagewalk: use lockdep_assert_held for locking validation 2019-09-07 04:28:04 -03:00
percpu-internal.h percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-km.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-stats.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-vm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu.c percpu: Use struct_size() helper 2019-09-04 13:40:49 -07:00
pgtable-generic.c x86/mm: Page size aware flush_tlb_mm_range() 2018-10-09 16:51:11 +02:00
process_vm_access.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
readahead.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
rmap.c mm: include <linux/huge_mm.h> for is_vma_temporary_stack 2019-10-19 06:32:32 -04:00
rodata_test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
shmem.c mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment 2020-01-23 08:22:39 +01:00
shuffle.c mm: fix -Wmissing-prototypes warnings 2019-10-07 15:47:19 -07:00
shuffle.h mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
slab.c mm, debug_pagealloc: don't rely on static keys too early 2020-01-23 08:22:40 +01:00
slab.h mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly 2019-11-06 08:47:50 -08:00
slab_common.c mm: memcg/slab: call flush_memcg_workqueue() only if memcg workqueue is valid 2020-01-23 08:22:39 +01:00
slob.c mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two) 2019-10-07 15:47:20 -07:00
slub.c slub: improve bit diffusion for freelist ptr obfuscation 2020-04-13 10:48:07 +02:00
sparse-vmemmap.c mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap() 2019-07-18 17:08:07 -07:00
sparse.c mm/sparse: fix kernel crash with pfn_section_valid check 2020-04-01 11:02:03 +02:00
swap.c mm: introduce MADV_COLD 2019-09-25 17:51:41 -07:00
swap_cgroup.c
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
swapfile.c mm/swapfile.c: move inode_lock out of claim_swapfile 2020-04-01 11:02:02 +02:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c usercopy: Avoid HIGHMEM pfn warning 2019-09-17 15:20:17 -07:00
userfaultfd.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 2019-06-19 17:09:53 +02:00
util.c arm64, mm: make randomization selected by generic topdown mmap layout 2019-09-24 15:54:11 -07:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c x86/mm: split vmalloc_sync_all() 2020-03-25 08:25:58 +01:00
vmpressure.c mm/vmpressure.c: fix a signedness bug in vmpressure_register_event() 2019-10-07 15:47:19 -07:00
vmscan.c mm/vmscan.c: don't round up scan size for online memory cgroup 2020-02-28 17:22:20 +01:00
vmstat.c mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo 2019-11-06 08:47:50 -08:00
workingset.c mm: workingset: fix vmstat counters for shadow nodes 2019-08-13 16:06:52 -07:00
z3fold.c mm/z3fold.c: claim page in the beginning of free 2019-10-07 15:47:19 -07:00
zbud.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zpool.c zpool: add malloc_support_movable to zpool_driver 2019-09-24 15:54:12 -07:00
zsmalloc.c mm/zsmalloc.c: fix the migrated zspage statistics. 2020-01-09 10:19:56 +01:00
zswap.c zswap: do not map same object twice 2019-09-24 15:54:12 -07:00