Commit graph

11 commits

Author SHA1 Message Date
Tiezhu Yang
cb8a2ef084 LoongArch: Add ORC stack unwinder support
The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the format
of the ORC data is much simpler than DWARF, which in turn allows the ORC
unwinder to be much simpler and faster.

The ORC data consists of unwind tables which are generated by objtool.
After analyzing all the code paths of a .o file, it determines information
about the stack state at each instruction address in the file and outputs
that information to the .orc_unwind and .orc_unwind_ip sections.

The per-object ORC sections are combined at link time and are sorted and
post-processed at boot time. The unwinder uses the resulting data to
correlate instruction addresses with their stack states at run time.

Most of the logic are similar with x86, in order to get ra info before ra
is saved into stack, add ra_reg and ra_offset into orc_entry. At the same
time, modify some arch-specific code to silence the objtool warnings.

Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-11 22:23:47 +08:00
Huacai Chen
5056c596c3 LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
Machines which have more than 8 nodes fail to boot SMP after commit
a2ccf46333 ("LoongArch/smp: Call rcutree_report_cpu_starting()
earlier"). Because such machines use tlb-based per-cpu base address
rather than dmw-based per-cpu base address, resulting per-cpu variables
can only be accessed after tlb_init(). But rcutree_report_cpu_starting()
is now called before tlb_init() and accesses per-cpu variables indeed.

Since the original patch want to avoid the lockdep warning caused by
page allocation in tlb_init(), we can move rcutree_report_cpu_starting()
to tlb_init() where after tlb exception configuration but before page
allocation.

Fixes: a2ccf46333 ("LoongArch/smp: Call rcutree_report_cpu_starting() earlier")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-01-26 16:22:07 +08:00
Bibo Mao
c718a0bad7 LoongArch: Fix some build warnings with W=1
There are some building warnings when building LoongArch kernel with W=1
as following, this patch fixes them.

arch/loongarch/kernel/acpi.c:284:13: warning: no previous prototype for ‘acpi_numa_arch_fixup’ [-Wmissing-prototypes]
  284 | void __init acpi_numa_arch_fixup(void) {}
      |             ^~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/time.c:32:13: warning: no previous prototype for ‘constant_timer_interrupt’ [-Wmissing-prototypes]
   32 | irqreturn_t constant_timer_interrupt(int irq, void *data)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/traps.c:496:25: warning: no previous prototype for 'do_fpe' [-Wmissing-prototypes]
  496 | asmlinkage void noinstr do_fpe(struct pt_regs *regs
      |                         ^~~~~~
arch/loongarch/kernel/traps.c:813:22: warning: variable ‘opcode’ set but not used [-Wunused-but-set-variable]
  813 |         unsigned int opcode;
      |                      ^~~~~~
arch/loongarch/kernel/signal.c:895:14: warning: no previous prototype for ‘get_sigframe’ [-Wmissing-prototypes]
  895 | void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
      |              ^~~~~~~~~~~~
arch/loongarch/kernel/syscall.c:21:40: warning: initialized field overwritten [-Woverride-init]
   21 | #define __SYSCALL(nr, call)     [nr] = (call),
      |                                        ^
arch/loongarch/kernel/syscall.c:40:14: warning: no previous prototype for ‘do_syscall’ [-Wmissing-prototypes]
   40 | void noinstr do_syscall(struct pt_regs *regs)
      |              ^~~~~~~~~~
arch/loongarch/kernel/smp.c:502:17: warning: no previous prototype for ‘start_secondary’ [-Wmissing-prototypes]
  502 | asmlinkage void start_secondary(void)
      |                 ^~~~~~~~~~~~~~~
arch/loongarch/kernel/process.c:309:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes]
  309 | unsigned long arch_align_stack(unsigned long sp)
      |               ^~~~~~~~~~~~~~~~
arch/loongarch/kernel/topology.c:13:5: warning: no previous prototype for ‘arch_register_cpu’ [-Wmissing-prototypes]
   13 | int arch_register_cpu(int cpu)
      |     ^~~~~~~~~~~~~~~~~
arch/loongarch/kernel/topology.c:27:6: warning: no previous prototype for ‘arch_unregister_cpu’ [-Wmissing-prototypes]
   27 | void arch_unregister_cpu(int cpu)
      |      ^~~~~~~~~~~~~~~~~~~
arch/loongarch/kernel/module-sections.c:103:5: warning: no previous prototype for ‘module_frob_arch_sections’ [-Wmissing-prototypes]
  103 | int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/mm/hugetlbpage.c:56:5: warning: no previous prototype for ‘is_aligned_hugepage_range’ [-Wmissing-prototypes]
   56 | int is_aligned_hugepage_range(unsigned long addr, unsigned long len)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20 14:26:28 +08:00
Matthew Wilcox (Oracle)
a6d01af08b loongarch: implement the new page table range API
Add update_mmu_cache_range() and change _PFN_SHIFT to PFN_PTE_SHIFT.  It
would probably be more efficient to implement __update_tlb() by flushing
the entire folio instead of calling __update_tlb() N times, but I'll leave
that for someone who understands the architecture better.

Link: https://lkml.kernel.org/r/20230802151406.3735276-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:21 -07:00
Huacai Chen
01158487af LoongArch: Introduce hardware page table walker
Loongson-3A6000 and newer processors have hardware page table walker
(PTW) support. PTW can handle all fastpaths of TLBI/TLBL/TLBS/TLBM
exceptions by hardware, software only need to handle slowpaths (page
faults).

BTW, PTW doesn't append _PAGE_MODIFIED for page table entries, so we
change pmd_dirty() and pte_dirty() to also check _PAGE_DIRTY for the
"dirty" attribute.

Signed-off-by: Liang Gao <gaoliang@loongson.cn>
Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29 20:58:44 +08:00
Jinyang He
dc74a9e8a8 LoongArch: Add generic ex-handler unwind in prologue unwinder
When exception is triggered, code flow go handle_\exception in some
cases. One of stackframe in this case as follows,

high -> +-------+
        | REGS  |  <- a pt_regs
        |       |
        |       |  <- ex trigger
        | REGS  |  <- ex pt_regs   <-+
        |       |                    |
        |       |                    |
low  -> +-------+           ->unwind-+

When unwinder unwinds to handler_\exception it cannot go on prologue
analysis. Because it is an asynchronous code flow, we should get the
next frame PC from regs->csr_era rather than regs->regs[1]. At init time
we copy the handlers to eentry and also copy them to NUMA-affine memory
named pcpu_handlers if NUMA is enabled. Thus, unwinder cannot unwind
normally. To solve this, we try to give some hints in handler_\exception
and fixup unwinders in unwind_next_frame().

Reported-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-01-17 11:42:16 +08:00
Huacai Chen
1299a129a9 LoongArch: Flush TLB earlier at initialization
Move local_flush_tlb_all() earlier (just after setup_ptwalker() and
before page allocation). This can avoid stale TLB entries misguiding
the later page allocation. Without this patch the second kernel of
kexec/kdump fails to boot SMP.

BTW, move output_pgtable_bits_defines() into tlb_init() since it has
nothing to do with tlb handler setup.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12 16:36:08 +08:00
Huacai Chen
26808cebf1 LoongArch: Fix EENTRY/MERRENTRY setting in setup_tlb_handler()
setup_tlb_handler() is expected to set per-cpu exception handlers, but
it only set the TLBRENTRY successfully because of copy & paste errors,
so fix it.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Huacai Chen
bab1c299f3 LoongArch: Fix sleeping in atomic context in setup_tlb_handler()
Since setup_tlb_handler() is executed in atomic context, we should use
GFP_ATOMIC instead of GFP_KERNEL to alloc pages. Otherwise we will get
a "sleeping in atomic context" error:

[    0.013118] BUG: sleeping function called from invalid context at mm/page_alloc.c:5158
[    0.013126] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[    0.013131] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19-rc3+ #1008 1a223086d14d07967cc427f15d52139422271360
[    0.013136] Hardware name: Loongson Loongson-3A5000-7A1000-1w-V0.1-CRB/Loongson-LS3A5000-7A1000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V2.0.04082-beta7 04/27
[    0.013140] Stack : 90000000015fc990 9000000100493c18 9000000000df3370 9000000100490000
[    0.013151]         9000000100493b50 0000000000000000 9000000100493b58 9000000001417ef0
[    0.013160]         900000000199e54e 0000000000000040 9000000100493c18 90000000015f7a98
[    0.013168]         ffffffffffffffff 6de72f8b42179d1e 9000000100403b80 90000000015f7890
[    0.013176]         0000000000000001 00000000fffff175 9000000000eb9860 9000000001530b4b
[    0.013184]         9000000000e99e60 0000000000000013 0000000006ecc000 0000000000000001
[    0.013193]         90000000015f7a98 9000000001417ef0 0000000000000004 0000000000000000
[    0.013201]         0000000000000cc0 0000000000000000 0000000000000001 90000000015fc990
[    0.013209]         9000000000217e74 9000000001603b6b 9000000000208640 0000000000000000
[    0.013217]         00000000000000b0 0000000000000004 0000000000000000 0000000000070000
[    0.013225]         ...
[    0.013229] Call Trace:
[    0.013230] [<9000000000208640>] show_stack+0x4c/0x14c
[    0.013240] [<9000000000df3370>] dump_stack_lvl+0x70/0xac
[    0.013246] [<9000000000270c8c>] ___might_sleep+0x104/0x124
[    0.013253] [<9000000000477e84>] __alloc_pages+0x240/0x464
[    0.013260] [<9000000000214214>] setup_tlb_handler+0x104/0x1e8
[    0.013265] [<9000000000214324>] tlb_init+0x2c/0x3c
[    0.013270] [<9000000000208b74>] per_cpu_trap_init+0xec/0x108
[    0.013275] [<9000000000202850>] cpu_probe+0x400/0x8a4
[    0.013279] [<900000000020d160>] start_secondary+0x5c/0x3d4

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Huacai Chen
d4b6f1562a LoongArch: Add Non-Uniform Memory Access (NUMA) support
Add Non-Uniform Memory Access (NUMA) support for LoongArch. LoongArch
has 48-bit physical address, but the HyperTransport I/O bus only support
40-bit address, so we need a custom phys_to_dma() and dma_to_phys() to
extract the 4-bit node id (bit 44~47) from Loongson-3's 48-bit physical
address space and embed it into 40-bit. In the 40-bit dma address, node
id offset can be read from the LS7A_DMA_CFG register.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-03 20:09:29 +08:00
Huacai Chen
09cfefb7fa LoongArch: Add memory management
Add memory management support for LoongArch, including: cache and tlb
management, page fault handling and ioremap/mmap support.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-03 20:09:28 +08:00