linux-stable/arch/riscv/kernel
Xianting Tian 150573c60c RISC-V: Fixup schedule out issue in machine_crash_shutdown()
commit ad943893d5 upstream.

Current task of executing crash kexec will be schedule out when panic is
triggered by RCU Stall, as it needs to wait rcu completion. It lead to
inability to enter the crash system.

The implementation of machine_crash_shutdown() is non-standard for RISC-V
according to other Arch's implementation(eg, x86, arm64), we need to send
IPI to stop secondary harts.

[224521.877268] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[224521.883471] rcu: 	0-...0: (3 GPs behind) idle=cfa/0/0x1 softirq=3968793/3968793 fqs=2495
[224521.891742] 	(detected by 2, t=5255 jiffies, g=60855593, q=328)
[224521.897754] Task dump for CPU 0:
[224521.901074] task:swapper/0     state:R  running task   stack:  0 pid:  0 ppid:   0 flags:0x00000008
[224521.911090] Call Trace:
[224521.913638] [<ffffffe000c432de>] __schedule+0x208/0x5ea
[224521.918957] Kernel panic - not syncing: RCU Stall
[224521.923773] bad: scheduling from the idle thread!
[224521.928571] CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Tainted: G   O  5.10.113-yocto-standard #1
[224521.938658] Call Trace:
[224521.941200] [<ffffffe00020395c>] walk_stackframe+0x0/0xaa
[224521.946689] [<ffffffe000c34f8e>] show_stack+0x32/0x3e
[224521.951830] [<ffffffe000c39020>] dump_stack_lvl+0x7e/0xa2
[224521.957317] [<ffffffe000c39058>] dump_stack+0x14/0x1c
[224521.962459] [<ffffffe000243884>] dequeue_task_idle+0x2c/0x40
[224521.968207] [<ffffffe000c434f4>] __schedule+0x41e/0x5ea
[224521.973520] [<ffffffe000c43826>] schedule+0x34/0xe4
[224521.978487] [<ffffffe000c46cae>] schedule_timeout+0xc6/0x170
[224521.984234] [<ffffffe000c4491e>] wait_for_completion+0x98/0xf2
[224521.990157] [<ffffffe00026d9e2>] __wait_rcu_gp+0x148/0x14a
[224521.995733] [<ffffffe0002761c4>] synchronize_rcu+0x5c/0x66
[224522.001307] [<ffffffe00026f1a6>] rcu_sync_enter+0x54/0xe6
[224522.006795] [<ffffffe00025a436>] percpu_down_write+0x32/0x11c
[224522.012629] [<ffffffe000c4266a>] _cpu_down+0x92/0x21a
[224522.017771] [<ffffffe000219a0a>] smp_shutdown_nonboot_cpus+0x90/0x118
[224522.024299] [<ffffffe00020701e>] machine_crash_shutdown+0x30/0x4a
[224522.030483] [<ffffffe00029a3f8>] __crash_kexec+0x62/0xa6
[224522.035884] [<ffffffe000c3515e>] panic+0xfa/0x2b6
[224522.040678] [<ffffffe0002772be>] rcu_sched_clock_irq+0xc26/0xcb8
[224522.046774] [<ffffffe00027fc7a>] update_process_times+0x62/0x8a
[224522.052785] [<ffffffe00028d522>] tick_sched_timer+0x9e/0x102
[224522.058533] [<ffffffe000280c3a>] __hrtimer_run_queues+0x16a/0x318
[224522.064716] [<ffffffe0002812ec>] hrtimer_interrupt+0xd4/0x228
[224522.070551] [<ffffffe0009a69b6>] riscv_timer_interrupt+0x3c/0x48
[224522.076646] [<ffffffe000268f8c>] handle_percpu_devid_irq+0xb0/0x24c
[224522.083004] [<ffffffe00026428e>] __handle_domain_irq+0xa8/0x122
[224522.089014] [<ffffffe00062f954>] riscv_intc_irq+0x38/0x60
[224522.094501] [<ffffffe000201bd4>] ret_from_exception+0x0/0xc
[224522.100161] [<ffffffe000c42146>] rcu_eqs_enter.constprop.0+0x8c/0xb8

With the patch, it can enter crash system when RCU Stall occur.

Fixes: e53d28180d ("RISC-V: Add kdump support")
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220811074150.3020189-4-xianting.tian@linux.alibaba.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17 14:22:52 +02:00
..
probes riscv:uprobe fix SR_SPIE set/clear handling 2022-08-17 14:22:52 +02:00
vdso riscv/vdso: Move vdso data page up front 2021-10-02 13:42:25 -07:00
.gitignore
asm-offsets.c riscv: Introduce structure that group all variables regarding kernel mapping 2021-07-05 18:04:00 -07:00
cacheinfo.c drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION() 2021-09-01 10:29:10 +02:00
cpu-hotplug.c riscv: cpu-hotplug: clear cpu from numa map when teardown 2022-02-16 12:56:18 +01:00
cpu.c RISC-V: Rename and move plic_find_hart_id() to arch directory 2020-06-09 19:11:20 -07:00
cpu_ops.c treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
cpu_ops_sbi.c
cpu_ops_spinwait.c
cpufeature.c riscv: Add __init section marker to some functions again 2021-05-29 13:39:27 -07:00
crash_dump.c RISC-V: Add crash kernel support 2021-04-26 08:25:24 -07:00
crash_save_regs.S RISC-V: Fixup get incorrect user mode PC for kernel mode regs 2022-08-17 14:22:52 +02:00
efi-header.S RISC-V: Add PE/COFF header for EFI stub 2020-10-02 14:31:16 -07:00
efi.c riscv: read-only pages should not be writable 2022-06-14 18:36:11 +02:00
entry.S riscv: fix oops caused by irqsoff latency tracer 2022-03-02 11:48:08 +01:00
fpu.S
ftrace.c riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT 2021-01-14 15:09:05 -08:00
head.h RISC-V: enable XIP 2021-04-26 08:31:28 -07:00
head.S riscv: Initialize thread pointer before calling C functions 2022-06-09 10:22:26 +02:00
image-vars.h arch/riscv:fix typo in a comment in arch/riscv/kernel/image-vars.h 2021-02-18 23:18:00 -08:00
irq.c RISC-V: Remove do_IRQ() function 2020-06-09 19:11:24 -07:00
jump_label.c riscv: Add jump-label implementation 2020-07-30 11:37:43 -07:00
kexec_relocate.S riscv: Don't use va_pa_offset on kdump 2022-01-27 11:02:50 +01:00
kgdb.c riscv: Fix "no previous prototype" compile warning in kgdb.c file 2020-07-09 20:09:30 -07:00
machine_kexec.c RISC-V: Fixup schedule out issue in machine_crash_shutdown() 2022-08-17 14:22:52 +02:00
Makefile riscv: fix oops caused by irqsoff latency tracer 2022-03-02 11:48:08 +01:00
mcount-dyn.S riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT 2021-01-14 15:09:05 -08:00
mcount.S riscv: Workaround mcount name prior to clang-13 2021-04-26 08:25:01 -07:00
module-sections.c
module.c riscv: Fix auipc+jalr relocation range checks 2022-03-16 14:23:43 +01:00
patch.c riscv: patch_text: Fixup last cpu should be master 2022-05-09 09:14:31 +02:00
perf_callchain.c uaccess: fix type mismatch warnings from access_ok() 2022-04-08 14:24:01 +02:00
perf_event.c
perf_regs.c perf/arch: Remove perf_sample_data::regs_user_copy 2020-11-09 18:12:34 +01:00
process.c riscv: Turn has_fpu into a static key if FPU=y 2021-05-25 22:56:57 -07:00
ptrace.c riscv: Ensure the value of FP registers in the core dump file is up to date 2021-08-24 20:54:10 -07:00
reset.c riscv: set default pm_power_off to NULL 2022-08-17 14:22:49 +02:00
riscv_ksyms.c riscv: provide memmove implementation 2020-12-10 17:27:54 -08:00
sbi.c RISC-V Patches for the 5.13 Merge Window, Part 1 2021-05-06 09:24:18 -07:00
setup.c RISC-V: Mark IORESOURCE_EXCLUSIVE for reserved mem instead of IORESOURCE_BUSY 2022-06-09 10:22:26 +02:00
signal.c riscv: Turn has_fpu into a static key if FPU=y 2021-05-25 22:56:57 -07:00
smp.c RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n 2022-01-27 11:02:50 +01:00
smpboot.c sched/core: Initialize the idle task with preemption disabled 2021-05-12 13:01:45 +02:00
soc.c riscv: Fix builtin DTB handling 2021-01-07 19:00:50 -08:00
stacktrace.c riscv: eliminate unreliable __builtin_frame_address(1) 2022-02-16 12:56:18 +01:00
sys_riscv.c RISC-V: Don't allow write+exec only page mapping request in mmap 2020-06-18 17:28:53 -07:00
syscall_table.c riscv/vdso: Refactor asm/vdso.h 2021-10-02 13:42:23 -07:00
time.c RISC-V Patches for the 5.13 Merge Window, Part 1 2021-05-06 09:24:18 -07:00
trace_irq.c riscv: fix oops caused by irqsoff latency tracer 2022-03-02 11:48:08 +01:00
trace_irq.h riscv: fix oops caused by irqsoff latency tracer 2022-03-02 11:48:08 +01:00
traps.c trap: cleanup trap_init() 2021-09-08 11:50:27 -07:00
traps_misaligned.c
vdso.c riscv/vdso: make arch_setup_additional_pages wait for mmap_sem for write killable 2021-10-02 13:42:26 -07:00
vmlinux-xip.lds.S riscv: Move EXCEPTION_TABLE to RO_DATA segment 2021-09-10 23:59:48 -07:00
vmlinux.lds.S riscv: Move EXCEPTION_TABLE to RO_DATA segment 2021-09-10 23:59:48 -07:00