linux-stable/kernel
NeilBrown a37b0715dd mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the
loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a
daemon needs to write to one bdi (the final bdi) in order to free up
writes queued to another bdi (the client bdi).

The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
pages, so that it can still dirty pages after other processses have been
throttled.  The purpose of this is to avoid deadlock that happen when
the PF_LESS_THROTTLE process must write for any dirty pages to be freed,
but it is being thottled and cannot write.

This approach was designed when all threads were blocked equally,
independently on which device they were writing to, or how fast it was.
Since that time the writeback algorithm has changed substantially with
different threads getting different allowances based on non-trivial
heuristics.  This means the simple "add 25%" heuristic is no longer
reliable.

The important issue is not that the daemon needs a *larger* dirty page
allowance, but that it needs a *private* dirty page allowance, so that
dirty pages for the "client" bdi that it is helping to clear (the bdi
for an NFS filesystem or loop block device etc) do not affect the
throttling of the daemon writing to the "final" bdi.

This patch changes the heuristic so that the task is not throttled when
the bdi it is writing to has a dirty page count below below (or equal
to) the free-run threshold for that bdi.  This ensures it will always be
able to have some pages in flight, and so will not deadlock.

In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might
still be throttled by global threshold, but that is acceptable as it is
only the deadlock state that is interesting for this flag.

This approach of "only throttle when target bdi is busy" is consistent
with the other use of PF_LESS_THROTTLE in current_may_throttle(), were
it causes attention to be focussed only on the target bdi.

So this patch
 - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
 - removes the 25% bonus that that flag gives, and
 - If PF_LOCAL_THROTTLE is set, don't delay at all unless the
   global and the local free-run thresholds are exceeded.

Note that previously realtime threads were treated the same as
PF_LESS_THROTTLE threads.  This patch does *not* change the behvaiour
for real-time threads, so it is now different from the behaviour of nfsd
and loop tasks.  I don't know what is wanted for realtime.

[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Chuck Lever <chuck.lever@oracle.com>	[nfsd]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:08 -07:00
..
bpf Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-06-01 12:00:10 -07:00
cgroup Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window" 2020-04-09 14:55:46 -04:00
configs compiler: remove CONFIG_OPTIMIZE_INLINING entirely 2020-04-07 10:43:42 -07:00
debug SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
dma dma-debug: fix displaying of dma allocation type 2020-04-08 21:46:57 +02:00
events perf/core: fix parent pid/tid in task exit events 2020-04-22 23:10:14 +02:00
gcov kernel/gcov/fs.c: gcov_seq_next() should increase position index 2020-04-10 15:36:22 -07:00
irq genirq: Remove setup_irq() and remove_irq() 2020-04-14 10:08:50 +02:00
livepatch New tracing features: 2019-11-27 11:42:01 -08:00
locking locking/lockdep: Improve 'invalid wait context' splat 2020-04-08 12:05:07 +02:00
power PM: hibernate: Freeze kernel threads in software_resume() 2020-04-27 10:30:30 +02:00
printk Printk changes for 5.8 2020-06-01 12:13:30 -07:00
rcu Merge branch 'urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent 2020-04-14 08:36:41 +02:00
sched sched/fair: Don't NUMA balance for kthreads 2020-05-26 18:34:58 +02:00
time proc, time/namespace: Show clock symbolic names in /proc/pid/timens_offsets 2020-04-16 12:10:54 +02:00
trace Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-15 13:10:06 -07:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
acct.c acct: stop using get_seconds() 2019-12-18 18:07:31 +01:00
async.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
audit.c audit: check the length of userspace generated audit records 2020-04-20 17:10:58 -04:00
audit.h audit: trigger accompanying records when no rules present 2020-03-12 10:42:51 -04:00
audit_fsnotify.c fsnotify: use helpers to access data by data_type 2020-03-23 18:19:06 +01:00
audit_tree.c
audit_watch.c \n 2020-04-06 08:58:42 -07:00
auditfilter.c audit: fix error handling in audit_data_to_entry() 2020-02-22 20:36:47 -05:00
auditsc.c audit: trigger accompanying records when no rules present 2020-03-12 10:42:51 -04:00
backtracetest.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
bounds.c
capability.c
compat.c y2038: remove unused time32 interfaces 2020-02-21 11:22:15 -08:00
configs.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
context_tracking.c context-tracking: Introduce CONFIG_HAVE_TIF_NOHZ 2020-02-14 16:05:04 +01:00
cpu.c CPU (hotplug) updates: 2020-03-30 18:06:39 -07:00
cpu_pm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 2019-06-05 17:36:37 +02:00
crash_core.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230 2019-06-19 17:09:06 +02:00
crash_dump.c
cred.c kernel: doc: remove outdated comment cred.c 2020-03-25 10:04:01 -05:00
delayacct.c
dma.c
elfcore.c kernel/elfcore.c: include proper prototypes 2019-09-25 17:51:39 -07:00
exec_domain.c
exit.c proc: Put thread_pid in release_task not proc_flush_pid 2020-04-24 15:49:00 -05:00
extable.c kernel/extable.c: use address-of operator on section symbols 2020-04-07 10:43:42 -07:00
fail_function.c fail_function: no need to check return value of debugfs_create functions 2019-06-03 15:49:06 +02:00
fork.c fork: prevent accidental access to clone3 features 2020-05-08 17:31:50 +02:00
freezer.c Revert "libata, freezer: avoid block device removal while system is frozen" 2019-10-06 09:11:37 -06:00
futex.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-03-30 16:17:15 -07:00
gen_kheaders.sh kheaders: explain why include/config/autoconf.h is excluded from md5sum 2019-11-11 20:10:01 +09:00
groups.c
hung_task.c
iomem.c mm/nvdimm: add is_ioremap_addr and use that to check ioremap address 2019-07-12 11:05:40 -07:00
irq_work.c lockdep: Annotate irq_work 2020-03-21 16:00:24 +01:00
jump_label.c jump_label: Don't warn on __exit jump entries 2019-08-29 15:10:10 +01:00
kallsyms.c kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol() 2020-04-07 10:43:44 -07:00
kcmp.c kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve 2020-03-25 10:04:01 -05:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks sched/rt, locking: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
Kconfig.preempt sched/Kconfig: Fix spelling mistake in user-visible help text 2019-11-12 11:35:32 +01:00
kcov.c kernel/kcov.c: fix typos in kcov_remote_start documentation 2020-05-07 19:27:20 -07:00
kexec.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_core.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_elf.c kexec_elf: support 32 bit ELF files 2019-09-06 23:58:44 +02:00
kexec_file.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_internal.h kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kheaders.c kheaders: Move from proc to sysfs 2019-05-24 20:16:01 +02:00
kmod.c kmod: make request_module() return an error when autoloading is disabled 2020-04-10 15:36:22 -07:00
kprobes.c kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic 2020-01-09 12:40:13 +01:00
ksysfs.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 170 2019-05-30 11:26:39 -07:00
kthread.c kthread: Do not preempt current task if it is going to call schedule() 2020-03-20 13:06:20 +01:00
latencytop.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
Makefile kcov: ignore fault-inject and stacktrace 2020-01-31 10:30:41 -08:00
module-internal.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
module.c Modules updates for v5.7 2020-04-09 12:52:34 -07:00
module_signature.c MODSIGN: Export module signature definitions 2019-08-05 18:39:56 -04:00
module_signing.c MODSIGN: Export module signature definitions 2019-08-05 18:39:56 -04:00
notifier.c x86/mm: split vmalloc_sync_all() 2020-03-21 18:56:06 -07:00
nsproxy.c ns: Introduce Time Namespace 2020-01-14 12:20:48 +01:00
padata.c padata: add separate cpuhp node for CPUHP_PADATA_DEAD 2020-04-30 15:19:33 +10:00
panic.c locking/refcount: Remove unused 'refcount_error_report()' function 2019-11-25 09:15:42 +01:00
params.c lockdown: Lock down module params that specify hardware parameters (eg. ioport) 2019-08-19 21:54:16 -07:00
pid.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-04-10 12:59:56 -07:00
pid_namespace.c pid: Improve the comment about waiting in zap_pid_ns_processes 2020-02-28 16:29:12 -06:00
profile.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
ptrace.c ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() 2020-01-18 13:51:39 +01:00
range.c
reboot.c printk: Collapse shutdown types into a single dump reason 2020-05-30 10:34:03 -07:00
relay.c
resource.c mm/memory_hotplug.c: use PFN_UP / PFN_DOWN in walk_system_ram_range() 2019-09-24 15:54:09 -07:00
rseq.c rseq: Reject unknown flags on rseq unregister 2019-12-25 10:41:20 +01:00
seccomp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-03-31 17:29:33 -07:00
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-04-23 13:30:18 -07:00
smp.c cpu/hotplug: Move bringup of secondary CPUs out of smp_init() 2020-03-25 12:59:37 +01:00
smpboot.c
smpboot.h
softirq.c lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}() 2020-03-21 16:03:54 +01:00
stackleak.c
stacktrace.c stacktrace: Get rid of unneeded '!!' pattern 2019-11-11 10:30:59 +01:00
stop_machine.c stop_machine: Make stop_cpus() static 2020-01-17 10:19:21 +01:00
sys.c mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE 2020-06-02 10:59:08 -07:00
sys_ni.c y2038: allow disabling time32 system calls 2019-11-15 14:38:30 +01:00
sysctl-test.c kunit: allow kunit tests to be loaded as a module 2020-01-09 16:42:29 -07:00
sysctl.c mm/compaction: Disable compact_unevictable_allowed on RT 2020-04-02 09:35:31 -07:00
sysctl_binary.c sysctl: Remove the sysctl system call 2019-11-26 13:03:56 -06:00
task_work.c task_work_run: don't take ->pi_lock unconditionally 2020-03-02 14:06:33 -07:00
taskstats.c taskstats: fix data-race 2019-12-04 15:18:39 +01:00
test_kprobes.c
torture.c CPU (hotplug) updates: 2020-03-30 18:06:39 -07:00
tracepoint.c The main changes in this release include: 2019-07-18 11:51:00 -07:00
tsacct.c tsacct: add 64-bit btime field 2019-12-18 18:07:31 +01:00
ucount.c ucount: Make sure ucounts in /proc/sys/user don't regress again 2020-04-07 21:51:27 +02:00
uid16.c
uid16.h
umh.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-15 13:10:06 -07:00
up.c smp/up: Make smp_call_function_single() match SMP semantics 2020-02-07 15:34:12 +01:00
user-return-notifier.c
user.c Keyrings namespacing 2019-07-08 19:36:47 -07:00
user_namespace.c Keyrings namespacing 2019-07-08 19:36:47 -07:00
utsname.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
utsname_sysctl.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
watchdog.c watchdog/softlockup: Enforce that timestamp is valid on boot 2020-01-17 11:19:22 +01:00
watchdog_hld.c
workqueue.c workqueue: Remove the warning in wq_worker_sleeping() 2020-04-08 11:35:20 +02:00
workqueue_internal.h