linux-stable/include
Matt Fleming a55c7454a8 sched/topology: Improve load balancing on AMD EPYC systems
SD_BALANCE_{FORK,EXEC} and SD_WAKE_AFFINE are stripped in sd_init()
for any sched domains with a NUMA distance greater than 2 hops
(RECLAIM_DISTANCE). The idea being that it's expensive to balance
across domains that far apart.

However, as is rather unfortunately explained in:

  commit 32e45ff43e ("mm: increase RECLAIM_DISTANCE to 30")

the value for RECLAIM_DISTANCE is based on node distance tables from
2011-era hardware.

Current AMD EPYC machines have the following NUMA node distances:

 node distances:
 node   0   1   2   3   4   5   6   7
   0:  10  16  16  16  32  32  32  32
   1:  16  10  16  16  32  32  32  32
   2:  16  16  10  16  32  32  32  32
   3:  16  16  16  10  32  32  32  32
   4:  32  32  32  32  10  16  16  16
   5:  32  32  32  32  16  10  16  16
   6:  32  32  32  32  16  16  10  16
   7:  32  32  32  32  16  16  16  10

where 2 hops is 32.

The result is that the scheduler fails to load balance properly across
NUMA nodes on different sockets -- 2 hops apart.

For example, pinning 16 busy threads to NUMA nodes 0 (CPUs 0-7) and 4
(CPUs 32-39) like so,

  $ numactl -C 0-7,32-39 ./spinner 16

causes all threads to fork and remain on node 0 until the active
balancer kicks in after a few seconds and forcibly moves some threads
to node 4.

Override node_reclaim_distance for AMD Zen.

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suravee.Suthikulpanit@amd.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas.Lendacky@amd.com
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190808195301.13222-3-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 09:17:37 +02:00
..
acpi
asm-generic Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2019-07-17 13:13:41 -07:00
clocksource
crypto
drm drm fixes for -rc1: 2019-07-19 12:29:43 -07:00
dt-bindings ARM: Device-tree updates 2019-07-19 17:19:24 -07:00
keys
kvm
linux sched/topology: Improve load balancing on AMD EPYC systems 2019-09-03 09:17:37 +02:00
math-emu
media
misc
net tcp: be more careful in tcp_fragment() 2019-07-21 20:41:24 -07:00
pcmcia
ras
rdma
scsi SCSI fixes on 20190720 2019-07-20 10:04:58 -07:00
soc ARM: SoC-related driver updates 2019-07-19 17:13:56 -07:00
sound
target
trace NFS client updates for Linux 5.3 2019-07-18 14:32:33 -07:00
uapi media updates for v5.3-rc1 2019-07-22 09:01:47 -07:00
vdso
video
xen xen: remove tmem driver 2019-07-17 08:09:58 +02:00
Kbuild kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist. 2019-07-21 11:43:43 -07:00