linux-stable/arch/powerpc
Laurent Dufour bd1dd4c5f5 powerpc/pseries: Prevent free CPU ids being reused on another node
When a CPU is hot added, the CPU ids are taken from the available mask
from the lower possible set. If that set of values was previously used
for a CPU attached to a different node, it appears to an application as
if these CPUs have migrated from one node to another node which is not
expected.

To prevent this, it is needed to record the CPU ids used for each node
and to not reuse them on another node. However, to prevent CPU hot plug
to fail, in the case the CPU ids is starved on a node, the capability to
reuse other nodes’ free CPU ids is kept. A warning is displayed in such
a case to warn the user.

A new CPU bit mask (node_recorded_ids_map) is introduced for each
possible node. It is populated with the CPU onlined at boot time, and
then when a CPU is hot plugged to a node. The bits in that mask remain
when the CPU is hot unplugged, to remind this CPU ids have been used for
this node.

If no id set was found, a retry is made without removing the ids used on
the other nodes to try reusing them. This is the way ids have been
allocated prior to this patch.

The effect of this patch can be seen by removing and adding CPUs using
the Qemu monitor. In the following case, the first CPU from the node 2
is removed, then the first one from the node 1 is removed too. Later,
the first CPU of the node 2 is added back. Without that patch, the
kernel will number these CPUs using the first CPU ids available which
are the ones freed when removing the second CPU of the node 0. This
leads to the CPU ids 16-23 to move from the node 1 to the node 2. With
the patch applied, the CPU ids 32-39 are used since they are the lowest
free ones which have not been used on another node.

At boot time:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Vanilla kernel, after the CPU hot unplug/plug operations:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 24 25 26 27 28 29 30 31
  node 2 cpus: 16 17 18 19 20 21 22 23 40 41 42 43 44 45 46 47

Patched kernel, after the CPU hot unplug/plug operations:
  [root@vm40 ~]# numactl -H | grep cpus
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 1 cpus: 24 25 26 27 28 29 30 31
  node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210429174908.16613-1-ldufour@linux.ibm.com
2021-08-10 23:14:55 +10:00
..
boot powerpc: move the install rule to arch/powerpc/Makefile 2021-08-04 10:53:39 +10:00
configs TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
crypto crypto: powepc/sha1 - remove unneeded semicolon 2021-03-07 15:13:14 +11:00
include pseries/drmem: update LMBs after LPM 2021-08-10 23:14:55 +10:00
kernel powerpc/smp: Use existing L2 cache_map cpumask to find L3 cache siblings 2021-08-04 10:53:39 +10:00
kexec powerpc/kexec: blacklist functions called in real mode for kprobe 2021-07-26 20:38:51 +10:00
kvm KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state 2021-07-23 16:19:38 +10:00
lib powerpc: Only build restart_table.c for 64s 2021-07-01 22:50:54 +10:00
math-emu powerpc/64s: avoid reloading (H)SRR registers if they are still valid 2021-06-25 00:06:55 +10:00
mm pseries/drmem: update LMBs after LPM 2021-08-10 23:14:55 +10:00
net powerpc/bpf: Reject atomic ops in ppc32 JIT 2021-07-05 22:23:25 +10:00
perf powerpc/64s/perf: Always use SIAR for kernel interrupts 2021-08-04 10:53:39 +10:00
platforms powerpc/pseries: Prevent free CPU ids being reused on another node 2021-08-10 23:14:55 +10:00
purgatory powerpc/kexec: Don't use .machine ppc64 in trampoline_64.S 2021-04-08 21:17:43 +10:00
sysdev powerpc/xive: Fix error handling when allocating an IPI 2021-07-05 22:23:25 +10:00
tools
xmon powerpc updates for 5.14 2021-07-02 12:54:34 -07:00
Kbuild
Kconfig powerpc updates for 5.14 2021-07-02 12:54:34 -07:00
Kconfig.debug powerpc: Make PPC_IRQ_SOFT_MASK_DEBUG depend on PPC64 2021-06-25 00:07:09 +10:00
Makefile powerpc: move the install rule to arch/powerpc/Makefile 2021-08-04 10:53:39 +10:00
Makefile.postlink