linux-stable/include
Feng Tang 4df910620b mm: memcg: relayout structure mem_cgroup to avoid cache interference
0day reported one -22.7% regression for will-it-scale page_fault2
case [1] on a 4 sockets 144 CPU platform, and bisected to it to be
caused by Waiman's optimization (commit bd0b230fe1) of saving one
'struct page_counter' space for 'struct mem_cgroup'.

Initially we thought it was due to the cache alignment change introduced
by the patch, but further debug shows that it is due to some hot data
members ('vmstats_local', 'vmstats_percpu', 'vmstats') sit in 2 adjacent
cacheline (2N and 2N+1 cacheline), and when adjacent cache line prefetch
is enabled, it triggers an "extended level" of cache false sharing for
2 adjacent cache lines.

So exchange the 2 member blocks, while keeping mostly the original
cache alignment, which can restore and even enhance the performance,
and save 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816,
with 0day's default RHEL-8.3 kernel config)

[1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/

Fixes: bd0b230fe1 ("mm/memcg: unify swap and memsw page counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-26 09:35:49 -08:00
..
acpi pci-v5.10-changes 2020-10-22 12:41:00 -07:00
asm-generic Merge branch 'for-5.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2020-11-15 08:57:19 -08:00
clocksource
crypto
drm drm: drm_print.h: fix kernel-doc markups 2020-10-27 11:21:39 +01:00
dt-bindings ARM: Devicetree updates 2020-10-24 10:44:18 -07:00
keys
kunit kunit: fix display of failed expectations for strings 2020-11-10 13:45:15 -07:00
kvm ARM: 2020-10-23 11:17:56 -07:00
linux mm: memcg: relayout structure mem_cgroup to avoid cache interference 2020-11-26 09:35:49 -08:00
math-emu
media ARM: SoC platform updates 2020-10-24 10:33:08 -07:00
memory
misc
net ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module 2020-11-19 10:49:50 -08:00
pcmcia
ras mm,hwpoison: introduce MF_MSG_UNSPLIT_THP 2020-10-16 11:11:17 -07:00
rdma RDMA: Add rdma_connect_locked() 2020-10-28 09:14:49 -03:00
scsi scsi: libiscsi: Fix NOP race condition 2020-11-16 22:32:50 -05:00
soc ARM: SoC-related driver updates 2020-10-24 10:39:22 -07:00
sound ASoC: Fixes for v5.11 2020-11-19 19:56:29 +01:00
target
trace Just one quick fix for a tracing oops. 2020-11-18 12:06:34 -08:00
uapi GPIO fixes for the v5.10 series: 2020-11-13 10:55:50 -08:00
vdso
video gpu: ipu-v3: remove unused functions 2020-10-26 10:42:38 +01:00
xen xen: branch for v5.10-rc1c 2020-10-25 10:55:35 -07:00