linux-stable/include/linux/sched
Frederic Weisbecker a85c2257a8 sched/isolation: add cpu_is_isolated() API
Patch series "memcg, cpuisol: do not interfere pcp cache charges draining
with cpuisol workloads".

Leonardo has reported [1] that pcp memcg charge draining can interfere
with CPU-isolated workloads.  The draining is done from a WQ context,
with a pcp worker scheduled on each CPU that holds cached charges for
a specific memcg hierarchy.  This is not a common operation [2], but it
can be triggered from userspace, so some care is definitely due.

Leonardo has tried to address the issue by allowing remote charge draining
[3].  This approach requires additional locking to synchronize draining of
the pcp caches from a remote cpu with local pcp consumers.  Even though the
proposed lock was per-cpu, there is still potential for contention and less
predictable behavior.

This patchset addresses the issue from a different angle.  Rather than
dealing with potential synchronization, cpus which are isolated are
simply never scheduled to be drained.  This means that a small amount of
charges could be left lying around, either waiting for later use or being
flushed when a different memcg is charged from the same cpu.  More details
are in patch 2.  The first patch, from Frederic, implements an
abstraction to tell whether a specific cpu has been isolated and therefore
requires special treatment.
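
As a rough illustration of the approach described above, the following
userspace sketch models a drain pass that skips isolated cpus.  It is not
the code from patch 2: struct pcp_stock, its fields and schedule_drain()
are hypothetical names invented for this example, and the "drain" is
modeled by simply zeroing a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu state: cached charges plus an isolation flag. */
struct pcp_stock {
	unsigned int cached_charges;
	bool isolated;
};

/*
 * Model of one drain pass over n cpus.  Cpus with nothing cached and
 * isolated cpus are skipped; charges on an isolated cpu stay cached
 * until that cpu uses or flushes them itself.
 * Returns the number of cpus for which a drain was scheduled.
 */
static size_t schedule_drain(struct pcp_stock stock[], size_t n)
{
	size_t scheduled = 0;

	for (size_t cpu = 0; cpu < n; cpu++) {
		if (stock[cpu].cached_charges == 0)
			continue;	/* nothing to drain */
		if (stock[cpu].isolated)
			continue;	/* never disturb an isolated cpu */
		stock[cpu].cached_charges = 0;	/* stands in for the pcp worker */
		scheduled++;
	}
	return scheduled;
}
```

The trade-off is visible in the model: no lock is taken on behalf of the
isolated cpu, at the cost of leaving its cached charges in place.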


This patch (of 2):

Provide a new API to check whether a CPU has been isolated through either
the isolcpus= or the nohz_full= kernel parameter.

It is meant for avoiding kernel work that can safely be spared on CPUs
running sensitive workloads that can't bear any disturbance.  pcp cache
draining is one example of such work.
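
The check described above can be modeled in userspace as follows.  This is
a sketch, not the kernel header: it assumes that isolcpus= and nohz_full=
each remove a cpu from a separate housekeeping set, and that a cpu counts
as isolated when it is missing from either set.  struct housekeeping,
hk_test_cpu() and model_cpu_is_isolated() are hypothetical names, and the
model is limited to 64 cpus.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical housekeeping categories, one per boot parameter. */
enum hk_type {
	HK_DOMAIN,	/* cleared for cpus listed in isolcpus= */
	HK_TICK,	/* cleared for cpus listed in nohz_full= */
	HK_TYPE_COUNT
};

/* One bitmask per category; bit N set means cpu N does housekeeping. */
struct housekeeping {
	uint64_t cpus[HK_TYPE_COUNT];
};

/* True if cpu (which must be below 64) is in the given housekeeping set. */
static bool hk_test_cpu(const struct housekeeping *hk, enum hk_type type,
			unsigned int cpu)
{
	return cpu < 64 && ((hk->cpus[type] >> cpu) & 1u);
}

/* A cpu is treated as isolated if it is absent from either set. */
static bool model_cpu_is_isolated(const struct housekeeping *hk,
				  unsigned int cpu)
{
	return !hk_test_cpu(hk, HK_DOMAIN, cpu) ||
	       !hk_test_cpu(hk, HK_TICK, cpu);
}
```

A caller such as a drain path would test each cpu with this predicate and
skip the ones for which it returns true.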

Link: https://lkml.kernel.org/r/20230317134448.11082-1-mhocko@kernel.org
Link: https://lkml.kernel.org/r/20230317134448.11082-2-mhocko@kernel.org
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:29:43 -07:00
affinity.h
autogroup.h
clock.h sched/clock: Make local_clock() noinstr 2023-01-31 15:01:47 +01:00
cond_resched.h
coredump.h mm: implement memory-deny-write-execute as a prctl 2023-02-02 22:33:24 -08:00
cpufreq.h
cputime.h cputime: remove cputime_to_nsecs fallback 2022-12-27 12:52:17 +01:00
deadline.h
debug.h
hotplug.h
idle.h cpuidle, sched: Remove instrumentation from TIF_{POLLING_NRFLAG,NEED_RESCHED} 2023-01-13 11:48:16 +01:00
init.h
isolation.h sched/isolation: add cpu_is_isolated() API 2023-04-18 16:29:43 -07:00
jobctl.h
loadavg.h
mm.h lazy tlb: allow lazy tlb mm refcounting to be configurable 2023-03-28 16:20:08 -07:00
nohz.h
numa_balancing.h
posix-timers.h
prio.h
rseq_api.h
rt.h
sd_flags.h
signal.h
smt.h
stat.h
sysctl.h
task.h x86/mm: Use mm_alloc() in poking_init() 2022-12-15 10:37:26 -08:00
task_flags.h
task_stack.h
thread_info_api.h
topology.h
types.h
user.h
wake_q.h
xacct.h