linux-stable/kernel/irq
Dongli Zhang a40209d355 genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
commit a6c11c0a52 upstream.

The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.

When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.

Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.

However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.

In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.

As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.

To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.

Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Fixes: f0383c24b4 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16 13:23:38 +02:00
..
Kconfig genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY 2020-11-18 19:18:41 +01:00
Makefile Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-11-13 17:33:11 -08:00
affinity.c genirq/affinity: Spread IRQs to all available NUMA nodes 2019-02-12 19:46:57 +01:00
autoprobe.c genirq: Delay deactivation in free_irq() 2019-07-21 09:03:12 +02:00
chip.c genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP 2021-08-26 08:36:40 -04:00
cpuhotplug.c genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline 2024-06-16 13:23:38 +02:00
debug.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2018-02-01 13:36:15 -08:00
debugfs.c x86/apic/msi: Plug non-maskable MSI affinity race 2020-02-11 04:34:18 -08:00
devres.c genirq: Add missing SPDX identifiers 2018-03-20 14:23:28 +01:00
dummychip.c genirq: Add missing SPDX identifiers 2018-03-20 14:23:28 +01:00
generic-chip.c genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware 2023-11-28 16:46:34 +00:00
handle.c random: remove unused irq_flags argument from add_interrupt_randomness() 2022-06-25 11:49:01 +02:00
internals.h genirq: Synchronize interrupt thread startup 2022-05-12 12:20:24 +02:00
ipi.c genirq: Add missing SPDX identifiers 2018-03-20 14:23:28 +01:00
irq_sim.c genirq/irq_sim: Remove the license boilerplate 2018-04-26 22:26:39 +02:00
irqdesc.c genirq: Synchronize interrupt thread startup 2022-05-12 12:20:24 +02:00
irqdomain.c irqdomain: Drop bogus fwspec-mapping error handling 2023-03-11 16:31:52 +01:00
manage.c genirq: Synchronize interrupt thread startup 2022-05-12 12:20:24 +02:00
matrix.c genirq/matrix: Exclude managed interrupts in irq_matrix_allocated() 2023-11-20 10:29:16 +01:00
migration.c genirq/migration: Avoid out of line call if pending is not set 2018-06-06 15:18:20 +02:00
msi.c genirq/msi: Ensure deactivation on teardown 2021-08-26 08:36:40 -04:00
pm.c genirq: Add missing SPDX identifiers 2018-03-20 14:23:28 +01:00
proc.c genirq/proc: Reject invalid affinity masks (again) 2020-02-28 16:38:59 +01:00
resend.c genirq: Prevent NULL pointer dereference in resend_irqs() 2019-09-19 09:09:34 +02:00
settings.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
spurious.c genirq: Cleanup top of file comments 2018-03-20 14:23:27 +01:00
timings.c genirq: Remove license boilerplate/references 2018-03-20 14:23:28 +01:00