linux-stable/mm
NeilBrown a37b0715dd mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the
loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a
daemon needs to write to one bdi (the final bdi) in order to free up
writes queued to another bdi (the client bdi).

The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
pages, so that it can still dirty pages after other processses have been
throttled.  The purpose of this is to avoid deadlock that happen when
the PF_LESS_THROTTLE process must write for any dirty pages to be freed,
but it is being thottled and cannot write.

This approach was designed when all threads were blocked equally,
independently on which device they were writing to, or how fast it was.
Since that time the writeback algorithm has changed substantially with
different threads getting different allowances based on non-trivial
heuristics.  This means the simple "add 25%" heuristic is no longer
reliable.

The important issue is not that the daemon needs a *larger* dirty page
allowance, but that it needs a *private* dirty page allowance, so that
dirty pages for the "client" bdi that it is helping to clear (the bdi
for an NFS filesystem or loop block device etc) do not affect the
throttling of the daemon writing to the "final" bdi.

This patch changes the heuristic so that the task is not throttled when
the bdi it is writing to has a dirty page count below below (or equal
to) the free-run threshold for that bdi.  This ensures it will always be
able to have some pages in flight, and so will not deadlock.

In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might
still be throttled by global threshold, but that is acceptable as it is
only the deadlock state that is interesting for this flag.

This approach of "only throttle when target bdi is busy" is consistent
with the other use of PF_LESS_THROTTLE in current_may_throttle(), were
it causes attention to be focussed only on the target bdi.

So this patch
 - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
 - removes the 25% bonus that that flag gives, and
 - If PF_LOCAL_THROTTLE is set, don't delay at all unless the
   global and the local free-run thresholds are exceeded.

Note that previously realtime threads were treated the same as
PF_LESS_THROTTLE threads.  This patch does *not* change the behvaiour
for real-time threads, so it is now different from the behaviour of nfsd
and loop tasks.  I don't know what is wanted for realtime.

[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Chuck Lever <chuck.lever@oracle.com>	[nfsd]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:08 -07:00
..
kasan kasan: disable branch tracing for core runtime 2020-05-23 10:26:31 -07:00
backing-dev.c bdi: add a ->dev_name field to struct backing_dev_info 2020-05-09 16:07:57 -06:00
balloon_compaction.c
cleancache.c
cma.c mm: cma: NUMA node interface 2020-04-10 15:36:21 -07:00
cma.h
cma_debug.c mm/cma_debug.c: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
compaction.c mm/compaction: add missing annotation for compact_lock_irqsave 2020-04-07 10:43:41 -07:00
debug.c mm, dump_page(): do not crash with invalid mapping pointer 2020-06-02 10:59:06 -07:00
debug_page_ref.c
dmapool.c mm/dmapool.c: micro-optimisation remove unnecessary branch 2020-04-07 10:43:42 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
failslab.c
filemap.c mm/filemap.c: remove misleading comment 2020-06-02 10:59:08 -07:00
frame_vector.c
frontswap.c
gup.c mm/gup: fix fixup_user_fault() on multiple retries 2020-05-14 10:00:35 -07:00
gup_benchmark.c mm/gup_benchmark: support pin_user_pages() and related calls 2020-04-02 09:35:27 -07:00
highmem.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
hmm.c mm/hmm: return error for non-vma snapshots 2020-03-30 16:58:36 -03:00
huge_memory.c userfaultfd: wp: support swap and page migration 2020-04-07 10:43:39 -07:00
hugetlb.c mm/hugetlb: fix a addressing exception caused by huge_pte_offset 2020-04-21 11:11:55 -07:00
hugetlb_cgroup.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
hwpoison-inject.c mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
init-mm.c
internal.h mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
interval_tree.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 2019-06-19 17:09:08 +02:00
Kconfig libnvdimm for 5.7 2020-04-08 21:03:40 -07:00
Kconfig.debug mm: add generic ptdump 2020-02-04 03:05:25 +00:00
khugepaged.c mm,thp: stop leaking unreleased file pages 2020-05-28 11:35:40 -07:00
kmemleak-test.c
kmemleak.c mm/kmemleak.c: use address-of operator on section symbols 2020-04-02 09:35:26 -07:00
ksm.c mm/ksm: fix NULL pointer dereference when KSM zero page is enabled 2020-04-21 11:11:55 -07:00
list_lru.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
maccess.c
madvise.c mm: check that mm is still valid in madvise() 2020-04-24 13:28:03 -07:00
Makefile mm: introduce Reported pages 2020-04-07 10:43:38 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: update huge page-table entry callbacks 2020-04-02 09:35:29 -07:00
memblock.c mm: cma: NUMA node interface 2020-04-10 15:36:21 -07:00
memcontrol.c mm, memcg: fix error return value of mem_cgroup_css_alloc() 2020-05-07 19:27:20 -07:00
memfd.c
memory-failure.c mm: code cleanup for MADV_FREE 2020-04-07 10:43:38 -07:00
memory.c mm/memory.c: add vm_insert_pages() 2020-04-10 15:36:21 -07:00
memory_hotplug.c mm/memory_hotplug: add pgprot_t to mhp_params 2020-04-10 15:36:21 -07:00
mempolicy.c libnvdimm for 5.7 2020-04-08 21:03:40 -07:00
mempool.c
memremap.c mm/memremap: set caching mode for PCI P2PDMA memory to WC 2020-04-10 15:36:21 -07:00
memtest.c
migrate.c mm/migrate.c: call detach_page_private to cleanup code 2020-06-02 10:59:08 -07:00
mincore.c mm: pagewalk: add 'depth' parameter to pte_hole 2020-02-04 03:05:25 +00:00
mlock.c
mm_init.c mm/mm_init.c: clean code. Use BUILD_BUG_ON when comparing compile time constant 2020-04-07 10:43:41 -07:00
mmap.c mm/vma: introduce VM_ACCESS_FLAGS 2020-04-10 15:36:21 -07:00
mmu_context.c
mmu_gather.c asm-generic/tlb: provide MMU_GATHER_TABLE_FREE 2020-02-04 03:05:26 +00:00
mmu_notifier.c mm/mmu_notifier: silence PROVE_RCU_LIST warnings 2020-03-21 18:56:06 -07:00
mmzone.c
mprotect.c mm/vma: introduce VM_ACCESS_FLAGS 2020-04-10 15:36:21 -07:00
mremap.c userfaultfd: fix remap event with MREMAP_DONTUNMAP 2020-05-14 10:00:35 -07:00
msync.c
nommu.c x86/mm: split vmalloc_sync_all() 2020-03-21 18:56:06 -07:00
oom_kill.c mm, oom: dump stack of victim when reaping failed 2020-01-31 10:30:38 -08:00
page-writeback.c mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE 2020-06-02 10:59:08 -07:00
page_alloc.c mm: limit boost_watermark on small zones 2020-05-07 19:27:21 -07:00
page_counter.c mm, memcg: prevent memory.min load/store tearing 2020-04-02 09:35:29 -07:00
page_ext.c mm/page_ext.c: drop pfn_present() check when onlining 2020-04-07 10:43:40 -07:00
page_idle.c
page_io.c fs: Enable bmap() function to properly return errors 2020-02-03 08:05:37 -05:00
page_isolation.c mm: add function __putback_isolated_page 2020-04-07 10:43:38 -07:00
page_owner.c
page_poison.c
page_reporting.c mm/page_reporting: add budget limit on how many pages can be reported per pass 2020-04-07 10:43:39 -07:00
page_reporting.h mm: introduce Reported pages 2020-04-07 10:43:38 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page 2020-01-31 10:30:38 -08:00
pagewalk.c x86: mm: avoid allocating struct mm_struct on the stack 2020-02-04 03:05:25 +00:00
percpu-internal.h
percpu-km.c
percpu-stats.c percpu: update copyright emails to dennis@kernel.org 2020-04-01 10:09:12 -07:00
percpu-vm.c
percpu.c percpu: make pcpu_alloc() aware of current gfp context 2020-05-07 19:27:21 -07:00
pgtable-generic.c asm-generic/mm: stub out p{4,u}d_clear_bad() if __PAGETABLE_P{4,U}D_FOLDED 2019-12-01 06:29:19 -08:00
process_vm_access.c mm: docs: Fix a comment in process_vm_rw_core 2020-03-25 10:04:01 -05:00
ptdump.c x86: mm: avoid allocating struct mm_struct on the stack 2020-02-04 03:05:25 +00:00
readahead.c mm: use memalloc_nofs_save in readahead path 2020-06-02 10:59:07 -07:00
rmap.c mm: prevent a warning when casting void* -> enum 2020-04-07 10:43:41 -07:00
rodata_test.c
shmem.c mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path 2020-04-21 11:11:56 -07:00
shuffle.c mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
shuffle.h mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
slab.c mm, debug_pagealloc: don't rely on static keys too early 2020-01-13 18:19:02 -08:00
slab.h mm: kmem: rename (__)memcg_kmem_(un)charge_memcg() to __memcg_kmem_(un)charge() 2020-04-02 09:35:28 -07:00
slab_common.c usercopy: mark dma-kmalloc caches as usercopy caches 2020-06-02 10:59:06 -07:00
slob.c
slub.c mm/slub: fix stack overruns with SLUB_STATS 2020-06-02 10:59:06 -07:00
sparse-vmemmap.c
sparse.c mm/sparse.c: move subsection_map related functions together 2020-04-07 10:43:40 -07:00
swap.c mm: huge tmpfs: try to split_huge_page() when punching hole 2020-04-07 10:43:41 -07:00
swap_cgroup.c
swap_slots.c mm/swap_slots.c: assign|reset cache slot by value directly 2020-04-02 09:35:27 -07:00
swap_state.c mm/swap_state.c: use the same way to count page in [add_to|delete_from]_swap_cache 2020-04-02 09:35:28 -07:00
swapfile.c proc: faster open/read/close with "permanent" files 2020-04-07 10:43:42 -07:00
truncate.c
usercopy.c
userfaultfd.c userfaultfd: wp: support write protection for userfault vma range 2020-04-07 10:43:39 -07:00
util.c mm/mmap.c: rb_parent is not necessary in __vma_link_list() 2019-12-01 06:29:19 -08:00
vmacache.c
vmalloc.c vmalloc: fix remap_vmalloc_range() bounds checks 2020-04-21 11:11:56 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE 2020-06-02 10:59:08 -07:00
vmstat.c mm, thp: track fallbacks due to failed memcg charges separately 2020-04-07 10:43:38 -07:00
workingset.c mm: vmscan: detect file thrashing at the reclaim root 2019-12-01 12:59:07 -08:00
z3fold.c mm/z3fold: silence kmemleak false positives of slots 2020-05-28 11:35:40 -07:00
zbud.c
zpool.c
zsmalloc.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
zswap.c mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00