linux-stable/kernel/sched
Linus Torvalds ed3b7923a8 Scheduler changes for v6.5:
- Scheduler SMP load-balancer improvements:
 
     - Avoid unnecessary migrations within SMT domains on hybrid systems.
 
       Problem:
 
         On hybrid CPU systems, (processors with a mixture of higher-frequency
 	SMT cores and lower-frequency non-SMT cores), under the old code
 	lower-priority CPUs pulled tasks from the higher-priority cores if
 	more than one SMT sibling was busy - resulting in many unnecessary
 	task migrations.
 
       Solution:
 
         The new code improves the load balancer to recognize SMT cores with more
         than one busy sibling and allows lower-priority CPUs to pull tasks, which
         avoids superfluous migrations and lets lower-priority cores inspect all SMT
         siblings for the busiest queue.
 
     - Implement the 'runnable boosting' feature in the EAS balancer: consider CPU
       contention in frequency, EAS max util & load-balance busiest CPU selection.
 
       This improves CPU utilization for certain workloads, while leaves other key
       workloads unchanged.
 
 - Scheduler infrastructure improvements:
 
     - Rewrite the scheduler topology setup code by consolidating it
       into the build_sched_topology() helper function and building
       it dynamically on the fly.
 
     - Resolve the local_clock() vs. noinstr complications by rewriting
       the code: provide separate sched_clock_noinstr() and
       local_clock_noinstr() functions to be used in instrumentation code,
       and make sure it is all instrumentation-safe.
 
 - Fixes:
 
     - Fix a kthread_park() race with wait_woken()
 
     - Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
        - Fix UP PREEMPT bug by unifying the SMP and UP implementations.
        - Fix task_struct::saved_state handling.
 
     - Fix various rq clock update bugs, unearthed by turning on the rq clock
       debugging code.
 
     - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger by
       creating enough cgroups, by removing the warnign and restricting
       window size triggers to PSI file write-permission or CAP_SYS_RESOURCE.
 
     - Propagate SMT flags in the topology when removing degenerate domain
 
     - Fix grub_reclaim() calculation bug in the deadline scheduler code
 
     - Avoid resetting the min update period when it is unnecessary, in
       psi_trigger_destroy().
 
     - Don't balance a task to its current running CPU in load_balance(),
       which was possible on certain NUMA topologies with overlapping
       groups.
 
     - Fix the sched-debug printing of rq->nr_uninterruptible
 
 - Cleanups:
 
     - Address various -Wmissing-prototype warnings, as a preparation
       to (maybe) enable this warning in the future.
 
     - Remove unused code
 
     - Mark more functions __init
 
     - Fix shadow-variable warnings
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSatWQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j62xAAuGOx1LcDfRGC6WGQzp1zOdlsVQtnDvlS
 qL58zYSHgizprpVQ3j87SBaG4CHCdvd2Bo36yW0lNZS4nd203qdq7fkrMb3hPP/w
 egUQUzMegf5fF6BWldKeMjuHSt+twFQz/ZAKK8iSbAir6CHNAqbNst1oL0i/+Tyk
 o33hBs1hT5tnbFb1NSVZkX4k+qT3LzTW4K2QgjjGtkScr6yHh2BdEVefyigWOjdo
 9s02d00ll9a2r+F5txlN7Dnw6TN7rmTXGMOJU5bZvBE90/anNiAorMXHJdEKCyUR
 u9+JtBdJWiCplGa/tSRcxT16ZW1VdtTnd9q66TDhXREd2UNDFqBEyg5Wl77K4Tlf
 vKFajmj/to+cTbuv6m6TVR+zyXpdEpdL6F04P44U3qiJvDobBqeDNKHHIqpmbHXl
 AXUXcPWTVAzXX1Ce5M+BeAgTBQ1T7C5tELILrTNQHJvO1s9VVBRFZ/l65Ps4vu7T
 wIZ781IFuopk0zWqHovNvgKrJ7oFmOQQZFttQEe8n6nafkjI7u+IZ8FayiGaUMRr
 4GawFGUCEdYh8z9qyslGKe8Q/Rphfk6hxMFRYUJpDmubQ0PkMeDjDGq77jDGl1PF
 VqwSDEyOaBJs7Gqf/mem00JtzBmXhkhm1SEjggHMI2IQbr/eeBXoLQOn3CDapO/N
 PiDbtX760ic=
 =EWQA
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Scheduler SMP load-balancer improvements:

   - Avoid unnecessary migrations within SMT domains on hybrid systems.

     Problem:

        On hybrid CPU systems, (processors with a mixture of
        higher-frequency SMT cores and lower-frequency non-SMT cores),
        under the old code lower-priority CPUs pulled tasks from the
        higher-priority cores if more than one SMT sibling was busy -
        resulting in many unnecessary task migrations.

     Solution:

        The new code improves the load balancer to recognize SMT cores
        with more than one busy sibling and allows lower-priority CPUs
        to pull tasks, which avoids superfluous migrations and lets
        lower-priority cores inspect all SMT siblings for the busiest
        queue.

   - Implement the 'runnable boosting' feature in the EAS balancer:
     consider CPU contention in frequency, EAS max util & load-balance
     busiest CPU selection.

     This improves CPU utilization for certain workloads, while leaves
     other key workloads unchanged.

  Scheduler infrastructure improvements:

   - Rewrite the scheduler topology setup code by consolidating it into
     the build_sched_topology() helper function and building it
     dynamically on the fly.

   - Resolve the local_clock() vs. noinstr complications by rewriting
     the code: provide separate sched_clock_noinstr() and
     local_clock_noinstr() functions to be used in instrumentation code,
     and make sure it is all instrumentation-safe.

  Fixes:

   - Fix a kthread_park() race with wait_woken()

   - Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
       - Fix UP PREEMPT bug by unifying the SMP and UP implementations
       - Fix task_struct::saved_state handling

   - Fix various rq clock update bugs, unearthed by turning on the rq
     clock debugging code.

   - Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger
     by creating enough cgroups, by removing the warnign and restricting
     window size triggers to PSI file write-permission or
     CAP_SYS_RESOURCE.

   - Propagate SMT flags in the topology when removing degenerate domain

   - Fix grub_reclaim() calculation bug in the deadline scheduler code

   - Avoid resetting the min update period when it is unnecessary, in
     psi_trigger_destroy().

   - Don't balance a task to its current running CPU in load_balance(),
     which was possible on certain NUMA topologies with overlapping
     groups.

   - Fix the sched-debug printing of rq->nr_uninterruptible

  Cleanups:

   - Address various -Wmissing-prototype warnings, as a preparation to
     (maybe) enable this warning in the future.

   - Remove unused code

   - Mark more functions __init

   - Fix shadow-variable warnings"

* tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()
  sched/core: Avoid double calling update_rq_clock() in __balance_push_cpu_stop()
  sched/core: Fixed missing rq clock update before calling set_rq_offline()
  sched/deadline: Update GRUB description in the documentation
  sched/deadline: Fix bandwidth reclaim equation in GRUB
  sched/wait: Fix a kthread_park race with wait_woken()
  sched/topology: Mark set_sched_topology() __init
  sched/fair: Rename variable cpu_util eff_util
  arm64/arch_timer: Fix MMIO byteswap
  sched/fair, cpufreq: Introduce 'runnable boosting'
  sched/fair: Refactor CPU utilization functions
  cpuidle: Use local_clock_noinstr()
  sched/clock: Provide local_clock_noinstr()
  x86/tsc: Provide sched_clock_noinstr()
  clocksource: hyper-v: Provide noinstr sched_clock()
  clocksource: hyper-v: Adjust hv_read_tsc_page_tsc() to avoid special casing U64_MAX
  x86/vdso: Fix gettimeofday masking
  math64: Always inline u128 version of mul_u64_u64_shr()
  s390/time: Provide sched_clock_noinstr()
  loongarch: Provide noinstr sched_clock_read()
  ...
2023-06-27 14:03:21 -07:00
..
autogroup.c sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE() 2022-08-12 11:25:10 +02:00
autogroup.h sched/headers: Add header guard to kernel/sched/stats.h and kernel/sched/autogroup.h 2022-02-23 08:22:00 +01:00
build_policy.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
build_utility.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
clock.c sched/clock: Provide local_clock_noinstr() 2023-06-05 21:11:09 +02:00
completion.c sched/completion: Add wait_for_completion_state() 2022-09-07 21:53:49 +02:00
core.c Scheduler changes for v6.5: 2023-06-27 14:03:21 -07:00
core_sched.c sched: Rename task_running() to task_on_cpu() 2022-09-07 21:53:47 +02:00
cpuacct.c Merge branch 'sched/fast-headers' into sched/core 2022-03-15 09:05:05 +01:00
cpudeadline.c sched/core: Introduce sched_asym_cpucap_active() 2022-08-02 12:32:45 +02:00
cpudeadline.h
cpufreq.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
cpufreq_schedutil.c sched/fair, cpufreq: Introduce 'runnable boosting' 2023-06-05 21:13:44 +02:00
cpupri.c sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE() 2022-08-12 11:25:10 +02:00
cpupri.h
cputime.c cputime: remove cputime_to_nsecs fallback 2022-12-27 12:52:17 +01:00
deadline.c sched/deadline: Fix bandwidth reclaim equation in GRUB 2023-06-16 22:08:11 +02:00
debug.c sched/debug: Correct printing for rq->nr_uninterruptible 2023-05-08 10:58:39 +02:00
fair.c sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle() 2023-06-16 22:08:13 +02:00
features.h sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg 2022-06-28 09:08:30 +02:00
idle.c sched/idle: Mark arch_cpu_idle_dead() __noreturn 2023-03-08 08:44:28 -08:00
isolation.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
loadavg.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
Makefile sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
membarrier.c sched/membarrier: Introduce MEMBARRIER_CMD_GET_REGISTRATIONS 2023-01-07 11:29:29 +01:00
pelt.c sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
pelt.h sched/fair: Decay task PELT values during wakeup migration 2022-06-28 09:17:46 +02:00
psi.c sched/psi: Avoid resetting the min update period when it is unnecessary 2023-05-20 12:53:16 +02:00
rt.c sched/rt: Fix bad task migration for rt tasks 2023-04-21 13:24:21 +02:00
sched-pelt.h
sched.h sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle() 2023-06-16 22:08:13 +02:00
smp.h sched, smp: Trace smp callback causing an IPI 2023-03-24 11:01:29 +01:00
stats.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
stats.h sched/psi: Use task->psi_flags to clear in CPU migration 2022-10-30 10:12:15 +01:00
stop_task.c sched: Add update_current_exec_runtime helper 2022-08-27 00:05:35 +02:00
swait.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
topology.c sched/core: Fixed missing rq clock update before calling set_rq_offline() 2023-06-16 22:08:12 +02:00
wait.c sched/wait: Fix a kthread_park race with wait_woken() 2023-06-16 17:08:01 +02:00
wait_bit.c wait_on_bit: add an acquire memory barrier 2022-08-26 09:30:25 -07:00