linux-stable/arch/powerpc/kernel/interrupt.c

671 lines
17 KiB
C
Raw Normal View History

powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/context_tracking.h>
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#include <linux/err.h>
#include <linux/compat.h>
#include <linux/sched/debug.h> /* for show_regs */
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#include <asm/asm-prototypes.h>
#include <asm/kup.h>
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#include <asm/cputime.h>
#include <asm/hw_irq.h>
#include <asm/interrupt.h>
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#include <asm/kprobes.h>
#include <asm/paca.h>
#include <asm/ptrace.h>
#include <asm/reg.h>
#include <asm/signal.h>
#include <asm/switch_to.h>
#include <asm/syscall.h>
#include <asm/time.h>
powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state If a system call is made with a transaction active, the kernel immediately aborts it and returns. scv system calls disable irqs even earlier in their interrupt handler, and tabort_syscall does not fix this up. This can result in irq soft-mask state being messed up on the next kernel entry, and crashing at BUG_ON(arch_irq_disabled_regs(regs)) in the kernel exit handlers, or possibly worse. This can't easily be fixed in asm because at this point an async irq may have hit, which is soft-masked and marked pending. The pending interrupt has to be replayed before returning to userspace. The fix is to move the tabort_syscall code to C in the main syscall handler, and just skip the system call but otherwise return as usual, which will take care of the pending irqs. This also does a bunch of other things including possible signal delivery to the process, but the doomed transaction should still be aborted when it is eventually returned to. The sc system call path is changed to use the new C function as well to reduce code and path differences. This slows down how quickly system calls are aborted when called while a transaction is active, which could potentially impact TM performance. But making any system call is already bad for performance, and TM is on the way out, so go with simpler over faster. Fixes: 7fa95f9adaee7 ("powerpc/64s: system call support for scv/rfscv instructions") Reported-by: Eirik Fuller <efuller@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use #ifdef rather than IS_ENABLED() to fix build error on 32-bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210903125707.1601269-1-npiggin@gmail.com
2021-09-03 12:57:06 +00:00
#include <asm/tm.h>
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#include <asm/unistd.h>
#if defined(CONFIG_PPC_ADV_DEBUG_REGS) && defined(CONFIG_PPC32)
unsigned long global_dbcr0[NR_CPUS];
#endif
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
typedef long (*syscall_fn)(long, long, long, long, long, long);
#ifdef CONFIG_PPC_BOOK3S_64
DEFINE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
static inline bool exit_must_hard_disable(void)
{
return static_branch_unlikely(&interrupt_exit_not_reentrant);
}
#else
static inline bool exit_must_hard_disable(void)
{
return true;
}
#endif
/*
* local irqs must be disabled. Returns false if the caller must re-enable
* them, check for new work, and try again.
*
* This should be called with local irqs disabled, but if they were previously
* enabled when the interrupt handler returns (indicating a process-context /
* synchronous interrupt) then irqs_enabled should be true.
*
* restartable is true then EE/RI can be left on because interrupts are handled
* with a restart sequence.
*/
static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
{
/* This must be done with RI=1 because tracing may touch vmaps */
trace_hardirqs_on();
if (exit_must_hard_disable() || !restartable)
__hard_EE_RI_disable();
#ifdef CONFIG_PPC64
/* This pattern matches prep_irq_for_idle */
if (unlikely(lazy_irq_pending_nocheck())) {
if (exit_must_hard_disable() || !restartable) {
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
__hard_RI_enable();
}
trace_hardirqs_off();
return false;
}
#endif
return true;
}
/* Has to run notrace because it is entered not completely "reconciled" */
notrace long system_call_exception(long r3, long r4, long r5,
long r6, long r7, long r8,
unsigned long r0, struct pt_regs *regs)
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
{
syscall_fn f;
kuep_lock();
regs->orig_gpr3 = r3;
if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
trace_hardirqs_off(); /* finish reconciling */
CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
user_exit_irqoff();
BUG_ON(regs_is_unrecoverable(regs));
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
BUG_ON(!(regs->msr & MSR_PR));
BUG_ON(arch_irq_disabled_regs(regs));
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#ifdef CONFIG_PPC_PKEY
if (mmu_has_feature(MMU_FTR_PKEY)) {
unsigned long amr, iamr;
bool flush_needed = false;
/*
* When entering from userspace we mostly have the AMR/IAMR
* different from kernel default values. Hence don't compare.
*/
amr = mfspr(SPRN_AMR);
iamr = mfspr(SPRN_IAMR);
regs->amr = amr;
regs->iamr = iamr;
if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP)) {
mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
flush_needed = true;
}
if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
flush_needed = true;
}
if (flush_needed)
isync();
} else
#endif
kuap_assert_locked();
booke_restore_dbcr0();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
account_cpu_user_entry();
account_stolen_time();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
/*
* This is not required for the syscall exit path, but makes the
* stack frame look nicer. If this was initialised in the first stack
* frame, or if the unwinder was taught the first stack frame always
* returns to user with IRQS_ENABLED, this store could be avoided!
*/
irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
/*
* If system call is called with TM active, set _TIF_RESTOREALL to
* prevent RFSCV being used to return to userspace, because POWER9
* TM implementation has problems with this instruction returning to
* transactional state. Final register values are not relevant because
* the transaction will be aborted upon return anyway. Or in the case
* of unsupported_scv SIGILL fault, the return state does not much
* matter because it's an edge case.
*/
if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state If a system call is made with a transaction active, the kernel immediately aborts it and returns. scv system calls disable irqs even earlier in their interrupt handler, and tabort_syscall does not fix this up. This can result in irq soft-mask state being messed up on the next kernel entry, and crashing at BUG_ON(arch_irq_disabled_regs(regs)) in the kernel exit handlers, or possibly worse. This can't easily be fixed in asm because at this point an async irq may have hit, which is soft-masked and marked pending. The pending interrupt has to be replayed before returning to userspace. The fix is to move the tabort_syscall code to C in the main syscall handler, and just skip the system call but otherwise return as usual, which will take care of the pending irqs. This also does a bunch of other things including possible signal delivery to the process, but the doomed transaction should still be aborted when it is eventually returned to. The sc system call path is changed to use the new C function as well to reduce code and path differences. This slows down how quickly system calls are aborted when called while a transaction is active, which could potentially impact TM performance. But making any system call is already bad for performance, and TM is on the way out, so go with simpler over faster. Fixes: 7fa95f9adaee7 ("powerpc/64s: system call support for scv/rfscv instructions") Reported-by: Eirik Fuller <efuller@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use #ifdef rather than IS_ENABLED() to fix build error on 32-bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210903125707.1601269-1-npiggin@gmail.com
2021-09-03 12:57:06 +00:00
/*
* If the system call was made with a transaction active, doom it and
* return without performing the system call. Unless it was an
* unsupported scv vector, in which case it's treated like an illegal
* instruction.
*/
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
!trap_is_unsupported_scv(regs)) {
/* Enable TM in the kernel, and disable EE (for scv) */
hard_irq_disable();
mtmsr(mfmsr() | MSR_TM);
/* tabort, this dooms the transaction, nothing else */
asm volatile(".long 0x7c00071d | ((%0) << 16)"
:: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
/*
* Userspace will never see the return value. Execution will
* resume after the tbegin. of the aborted transaction with the
* checkpointed register state. A context switch could occur
* or signal delivered to the process before resuming the
* doomed transaction context, but that should all be handled
* as expected.
*/
return -ENOSYS;
}
#endif // CONFIG_PPC_TRANSACTIONAL_MEM
local_irq_enable();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
if (unlikely(read_thread_flags() & _TIF_SYSCALL_DOTRACE)) {
if (unlikely(trap_is_unsupported_scv(regs))) {
/* Unsupported scv vector */
_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
return regs->gpr[3];
}
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
/*
* We use the return value of do_syscall_trace_enter() as the
* syscall number. If the syscall was rejected for any reason
* do_syscall_trace_enter() returns an invalid syscall number
* and the test against NR_syscalls will fail and the return
* value to be used is in regs->gpr[3].
*/
r0 = do_syscall_trace_enter(regs);
if (unlikely(r0 >= NR_syscalls))
return regs->gpr[3];
r3 = regs->gpr[3];
r4 = regs->gpr[4];
r5 = regs->gpr[5];
r6 = regs->gpr[6];
r7 = regs->gpr[7];
r8 = regs->gpr[8];
} else if (unlikely(r0 >= NR_syscalls)) {
if (unlikely(trap_is_unsupported_scv(regs))) {
/* Unsupported scv vector */
_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
return regs->gpr[3];
}
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
return -ENOSYS;
}
/* May be faster to do array_index_nospec? */
barrier_nospec();
if (unlikely(is_compat_task())) {
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
f = (void *)compat_sys_call_table[r0];
r3 &= 0x00000000ffffffffULL;
r4 &= 0x00000000ffffffffULL;
r5 &= 0x00000000ffffffffULL;
r6 &= 0x00000000ffffffffULL;
r7 &= 0x00000000ffffffffULL;
r8 &= 0x00000000ffffffffULL;
} else {
f = (void *)sys_call_table[r0];
}
return f(r3, r4, r5, r6, r7, r8);
}
static notrace void booke_load_dbcr0(void)
{
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
unsigned long dbcr0 = current->thread.debug.dbcr0;
if (likely(!(dbcr0 & DBCR0_IDM)))
return;
/*
* Check to see if the dbcr0 register is set up to debug.
* Use the internal debug mode bit to do this.
*/
mtmsr(mfmsr() & ~MSR_DE);
if (IS_ENABLED(CONFIG_PPC32)) {
isync();
global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
}
mtspr(SPRN_DBCR0, dbcr0);
mtspr(SPRN_DBSR, -1);
#endif
}
static void check_return_regs_valid(struct pt_regs *regs)
{
#ifdef CONFIG_PPC_BOOK3S_64
unsigned long trap, srr0, srr1;
static bool warned;
u8 *validp;
char *h;
if (trap_is_scv(regs))
return;
trap = TRAP(regs);
// EE in HV mode sets HSRRs like 0xea0
if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
trap = 0xea0;
switch (trap) {
case 0x980:
case INTERRUPT_H_DATA_STORAGE:
case 0xe20:
case 0xe40:
case INTERRUPT_HMI:
case 0xe80:
case 0xea0:
case INTERRUPT_H_FAC_UNAVAIL:
case 0x1200:
case 0x1500:
case 0x1600:
case 0x1800:
validp = &local_paca->hsrr_valid;
if (!*validp)
return;
srr0 = mfspr(SPRN_HSRR0);
srr1 = mfspr(SPRN_HSRR1);
h = "H";
break;
default:
validp = &local_paca->srr_valid;
if (!*validp)
return;
srr0 = mfspr(SPRN_SRR0);
srr1 = mfspr(SPRN_SRR1);
h = "";
break;
}
if (srr0 == regs->nip && srr1 == regs->msr)
return;
/*
* A NMI / soft-NMI interrupt may have come in after we found
* srr_valid and before the SRRs are loaded. The interrupt then
* comes in and clobbers SRRs and clears srr_valid. Then we load
* the SRRs here and test them above and find they don't match.
*
* Test validity again after that, to catch such false positives.
*
* This test in general will have some window for false negatives
* and may not catch and fix all such cases if an NMI comes in
* later and clobbers SRRs without clearing srr_valid, but hopefully
* such things will get caught most of the time, statistically
* enough to be able to get a warning out.
*/
barrier();
if (!*validp)
return;
if (!warned) {
warned = true;
printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
show_regs(regs);
}
*validp = 0; /* fixup */
#endif
}
static notrace unsigned long
interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
{
unsigned long ti_flags;
again:
ti_flags = read_thread_flags();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
local_irq_enable();
if (ti_flags & _TIF_NEED_RESCHED) {
schedule();
} else {
/*
* SIGPENDING must restore signal handler function
* argument GPRs, and some non-volatiles (e.g., r1).
* Restore all for now. This could be made lighter.
*/
if (ti_flags & _TIF_SIGPENDING)
ret |= _TIF_RESTOREALL;
do_notify_resume(regs, ti_flags);
}
local_irq_disable();
ti_flags = read_thread_flags();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
}
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
unlikely((ti_flags & _TIF_RESTORE_TM))) {
restore_tm_state(regs);
} else {
unsigned long mathflags = MSR_FP;
if (cpu_has_feature(CPU_FTR_VSX))
mathflags |= MSR_VEC | MSR_VSX;
else if (cpu_has_feature(CPU_FTR_ALTIVEC))
mathflags |= MSR_VEC;
powerpc/64s: Fix restore_math unnecessarily changing MSR Before returning to user, if there are missing FP/VEC/VSX bits from the user MSR then those registers had been saved and must be restored again before use. restore_math will decide whether to restore immediately, or skip the restore and let fp/vec/vsx unavailable faults demand load the registers. Each time restore_math restores one of the FP/VSX or VEC register sets is loaded, an 8-bit counter is incremented (load_fp and load_vec). When these wrap to zero, restore_math no longer restores that register set until after they are next demand faulted. It's quite usual for those counters to have different values, so if one wraps to zero and restore_math no longer restores its registers or user MSR bit but the other is not zero yet does not need to be restored (because the kernel is not frequently using the FPU), then restore_math will be called and it will also not return in the early exit check. This causes msr_check_and_set to test and set the MSR at every kernel exit despite having no work to do. This can cause workloads (e.g., a NULL syscall microbenchmark) to run fast for a time while both counters are non-zero, then slow down when one of the counters reaches zero, then speed up again after the second counter reaches zero. The cost is significant, about 10% slowdown on a NULL syscall benchmark, and the jittery behaviour is very undesirable. Fix this by having restore_math test all conditions first, and only update MSR if we will be loading registers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200623234139.2262227-2-npiggin@gmail.com
2020-06-23 23:41:38 +00:00
/*
* If userspace MSR has all available FP bits set,
* then they are live and no need to restore. If not,
* it means the regs were given up and restore_math
* may decide to restore them (to avoid taking an FP
* fault).
*/
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
if ((regs->msr & mathflags) != mathflags)
restore_math(regs);
}
}
check_return_regs_valid(regs);
user_enter_irqoff();
if (!prep_irq_for_enabled_exit(true)) {
user_exit_irqoff();
local_irq_enable();
local_irq_disable();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
goto again;
}
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
local_paca->tm_scratch = regs->msr;
#endif
booke_load_dbcr0();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
account_cpu_user_exit();
/* Restore user access locks last */
kuap_user_restore(regs);
kuep_unlock();
powerpc/64/sycall: Implement syscall entry/exit logic in C System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-02-25 17:35:34 +00:00
return ret;
}
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
/*
* This should be called after a syscall returns, with r3 the return value
* from the syscall. If this function returns non-zero, the system call
* exit assembly should additionally load all GPR registers and CTR and XER
* from the interrupt frame.
*
* The function graph tracer can not trace the return side of this function,
* because RI=0 and soft mask state is "unreconciled", so it is marked notrace.
*/
notrace unsigned long syscall_exit_prepare(unsigned long r3,
struct pt_regs *regs,
long scv)
{
unsigned long ti_flags;
unsigned long ret = 0;
bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
CT_WARN_ON(ct_state() == CONTEXT_USER);
kuap_assert_locked();
regs->result = r3;
/* Check whether the syscall is issued inside a restartable sequence */
rseq_syscall(regs);
ti_flags = read_thread_flags();
if (unlikely(r3 >= (unsigned long)-MAX_ERRNO) && is_not_scv) {
if (likely(!(ti_flags & (_TIF_NOERROR | _TIF_RESTOREALL)))) {
r3 = -r3;
regs->ccr |= 0x10000000; /* Set SO bit in CR */
}
}
if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
if (ti_flags & _TIF_RESTOREALL)
ret = _TIF_RESTOREALL;
else
regs->gpr[3] = r3;
clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
} else {
regs->gpr[3] = r3;
}
if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
do_syscall_trace_leave(regs);
ret |= _TIF_RESTOREALL;
}
local_irq_disable();
ret = interrupt_exit_user_prepare_main(ret, regs);
#ifdef CONFIG_PPC64
regs->exit_result = ret;
#endif
return ret;
}
#ifdef CONFIG_PPC64
notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *regs)
{
/*
* This is called when detecting a soft-pending interrupt as well as
* an alternate-return interrupt. So we can't just have the alternate
* return path clear SRR1[MSR] and set PACA_IRQ_HARD_DIS (unless
* the soft-pending case were to fix things up as well). RI might be
* disabled, in which case it gets re-enabled by __hard_irq_disable().
*/
__hard_irq_disable();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
#ifdef CONFIG_PPC_BOOK3S_64
set_kuap(AMR_KUAP_BLOCKED);
#endif
trace_hardirqs_off();
user_exit_irqoff();
account_cpu_user_entry();
BUG_ON(!user_mode(regs));
regs->exit_result = interrupt_exit_user_prepare_main(regs->exit_result, regs);
return regs->exit_result;
}
#endif
notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
{
unsigned long ret;
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
BUG_ON(regs_is_unrecoverable(regs));
BUG_ON(arch_irq_disabled_regs(regs));
CT_WARN_ON(ct_state() == CONTEXT_USER);
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
/*
* We don't need to restore AMR on the way back to userspace for KUAP.
* AMR can only have been unlocked if we interrupted the kernel.
*/
kuap_assert_locked();
local_irq_disable();
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
ret = interrupt_exit_user_prepare_main(0, regs);
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
#ifdef CONFIG_PPC64
regs->exit_result = ret;
#endif
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
return ret;
}
void preempt_schedule_irq(void);
notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
{
unsigned long flags;
unsigned long ret = 0;
unsigned long kuap;
bool stack_store = read_thread_flags() & _TIF_EMULATE_STACK_STORE;
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
if (regs_is_unrecoverable(regs))
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
unrecoverable_exception(regs);
/*
* CT_WARN_ON comes here via program_check_exception,
* so avoid recursion.
*/
if (TRAP(regs) != INTERRUPT_PROGRAM)
CT_WARN_ON(ct_state() == CONTEXT_USER);
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
kuap = kuap_get_and_assert_locked();
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
local_irq_save(flags);
if (!arch_irq_disabled_regs(regs)) {
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
/* Returning to a kernel context with local irqs enabled. */
WARN_ON_ONCE(!(regs->msr & MSR_EE));
again:
if (IS_ENABLED(CONFIG_PREEMPT)) {
/* Return to preemptible kernel context */
if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
if (preempt_count() == 0)
preempt_schedule_irq();
}
}
check_return_regs_valid(regs);
/*
* Stack store exit can't be restarted because the interrupt
* stack frame might have been clobbered.
*/
if (!prep_irq_for_enabled_exit(unlikely(stack_store))) {
/*
* Replay pending soft-masked interrupts now. Don't
* just local_irq_enabe(); local_irq_disable(); because
* if we are returning from an asynchronous interrupt
* here, another one might hit after irqs are enabled,
* and it would exit via this same path allowing
* another to fire, and so on unbounded.
*/
hard_irq_disable();
replay_soft_interrupts();
/* Took an interrupt, may have more exit work to do. */
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
goto again;
}
#ifdef CONFIG_PPC64
/*
* An interrupt may clear MSR[EE] and set this concurrently,
* but it will be marked pending and the exit will be retried.
* This leaves a racy window where MSR[EE]=0 and HARD_DIS is
* clear, until interrupt_exit_kernel_restart() calls
* hard_irq_disable(), which will set HARD_DIS again.
*/
local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
} else {
check_return_regs_valid(regs);
if (unlikely(stack_store))
__hard_EE_RI_disable();
/*
* Returning to a kernel context with local irqs disabled.
* Here, if EE was enabled in the interrupted context, enable
* it on return as well. A problem exists here where a soft
* masked interrupt may have cleared MSR[EE] and set HARD_DIS
* here, and it will still exist on return to the caller. This
* will be resolved by the masked interrupt firing again.
*/
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
if (regs->msr & MSR_EE)
local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
#endif /* CONFIG_PPC64 */
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
}
if (unlikely(stack_store)) {
clear_bits(_TIF_EMULATE_STACK_STORE, &current_thread_info()->flags);
ret = 1;
}
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
local_paca->tm_scratch = regs->msr;
#endif
/*
* 64s does not want to mfspr(SPRN_AMR) here, because this comes after
* mtmsr, which would cause Read-After-Write stalls. Hence, take the
* AMR value from the check above.
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
*/
kuap_kernel_restore(regs, kuap);
powerpc/64s: Implement interrupt exit logic in C Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 17:35:37 +00:00
return ret;
}
#ifdef CONFIG_PPC64
notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
{
__hard_irq_disable();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
#ifdef CONFIG_PPC_BOOK3S_64
set_kuap(AMR_KUAP_BLOCKED);
#endif
trace_hardirqs_off();
user_exit_irqoff();
account_cpu_user_entry();
BUG_ON(!user_mode(regs));
regs->exit_result |= interrupt_exit_user_prepare(regs);
return regs->exit_result;
}
/*
* No real need to return a value here because the stack store case does not
* get restarted.
*/
notrace unsigned long interrupt_exit_kernel_restart(struct pt_regs *regs)
{
__hard_irq_disable();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
#ifdef CONFIG_PPC_BOOK3S_64
set_kuap(AMR_KUAP_BLOCKED);
#endif
if (regs->softe == IRQS_ENABLED)
trace_hardirqs_off();
BUG_ON(user_mode(regs));
return interrupt_exit_kernel_prepare(regs);
}
#endif