Commit graph

1561 commits

Author SHA1 Message Date
Anshuman Khandual
07431506e8 mm/hugetlb: generalize ARCH_WANT_GENERAL_HUGETLB
ARCH_WANT_GENERAL_HUGETLB config has duplicate definitions on platforms
that subscribe it.  Instead make it a generic config option which can be
selected on applicable platforms when required.

Link: https://lkml.kernel.org/r/1643718465-4324-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:08 -07:00
Linus Torvalds
77fe1ba902 RISC-V Fixes for 5.17-rc8
* A fix to prevent users from enabling the alternatives framework (and
   thus errata handling) on XIP kernels, where runtime code patching does
   not function correctly.
 * A fix to properly detect offset overflow for AUIPC-based relocations
   in modules.  This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is loaded.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmIrfkkTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYib4WD/9C/v+AVqbFQuf4qhXlwlros+3s00/e
 LYlD79ddDYvU0wvjvUDBx3WLC/tk2dI96VDuPcUNC60T7W1x0K9TmzcR1YqV9efL
 sx3c7SnV5J1lnmGM+/1EtJWGbYJkSqmH6bk3BvTVCpFGpGd1UcEHu0zMkOq54AIK
 5GD3IG7dpKYPpzziIG3T7F96RFn+N4N4cORLAGFXa07TAcQfwJvL2rnAy6YMza6Q
 NIhpD22sNrKX/9tImwN9YCwZB0G+mrNDLzU8KQAzz1DcBtoWQoEV1b+vRyuubVP+
 YSCGO3e1PWMAU3UKzGMYjGPik8WHTqUrst25affV001v1xYEy835K3td+PiDcOfy
 eHwhhJJOE9TiQ28i1z4F5HsEbT+Sk61Xyf8cXra5y4utlQowIrO5mu4S2GNOi/3Z
 kbDmUzWHZtQ2yNyz4WlmXxwLWAgx7QeRTax7PvqnMLK/Wxbzy9qgbx1MzjmmaCBI
 aI3D6CMc3h54MoWidsrBu1Hmx+qm7UqR7ML2jcEI3LqBB+lIYHNJh/d/Brg/lyTh
 5V/ncrILYalT0qd2zDSODzYYKFm2EucrneJNiJr720McFc54V2ZzLKOWg4gLdBUl
 KJypYoGw2fDbpVqmlN6/oSCWI1RETq6u5kOiCPBaXz/2sKz6aT3xq5m5wXynQAZi
 NzOHZEqS9GYAQQ==
 =qUfT
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - prevent users from enabling the alternatives framework (and thus
   errata handling) on XIP kernels, where runtime code patching does not
   function correctly.

 - properly detect offset overflow for AUIPC-based relocations in
   modules. This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is
   loaded.

* tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix auipc+jalr relocation range checks
  riscv: alternative only works on !XIP_KERNEL
2022-03-11 12:28:21 -08:00
Emil Renner Berthing
0966d38583
riscv: Fix auipc+jalr relocation range checks
RISC-V can do PC-relative jumps with a 32bit range using the following
two instructions:

	auipc	t0, imm20	; t0 = PC + imm20 * 2^12
	jalr	ra, t0, imm12	; ra = PC + 4, PC = t0 + imm12

Crucially both the 20bit immediate imm20 and the 12bit immediate imm12
are treated as two's-complement signed values. For this reason the
immediates are usually calculated like this:

	imm20 = (offset + 0x800) >> 12
	imm12 = offset & 0xfff

..where offset is the signed offset from the auipc instruction. When
the 11th bit of offset is 0 the addition of 0x800 doesn't change the top
20 bits and imm12 considered positive. When the 11th bit is 1 the carry
of the addition by 0x800 means imm20 is one higher, but since imm12 is
then considered negative the two's complement representation means it
all cancels out nicely.

However, this addition by 0x800 (2^11) means an offset greater than or
equal to 2^31 - 2^11 would overflow so imm20 is considered negative and
result in a backwards jump. Similarly the lower range of offset is also
moved down by 2^11 and hence the true 32bit range is

	[-2^31 - 2^11, 2^31 - 2^11)

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-10 20:37:44 -08:00
Jisheng Zhang
c80ee64a80
riscv: alternative only works on !XIP_KERNEL
The alternative mechanism needs runtime code patching, it can't work
on XIP_KERNEL. And the errata workarounds are implemented via the
alternative mechanism. So add !XIP_KERNEL dependency for alternative
and erratas.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-10 10:05:19 -08:00
Linus Torvalds
07ebd38a0d RISC-V Fixes for 5.17-rc7
* Fixes for a handful of KASAN-related crashes.
 * A fix to avoid a crash during boot for SPARSEMEM && !SPARSEMEM_VMEMMAP
   configurations.
 * A fix to stop reporting some incorrect errors under DEBUG_VIRTUAL.
 * A fix for the K210's device tree to properly populate the interrupt
   map, so hart1 will get interrupts again.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIiNtYTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQf8cD/92NMaclwHMVjQ07svZloQcgDp+JSA5
 JP2EYHuDy3UCZsJSdJY8zJZ+Ct81MxNSNDDpLCLQCZe8fD8hA+FOOVlt8a21SqNH
 Pc96ycqIhD/QrfBlcYw5+8N3n5zNTpPSMjazrBphKj56qNWcAXdvQwQTh56pXGj+
 3J5vf3L8xlnx8mlTUMYqHivHKl4cJhYOY/ICwXjpZnRYx0NRF32cquo5A4Uh65ls
 qQjeKL2WXZd44avWK9IkDcBLpjyxr+pJmCsbIntvwK23bz37/SXmk4G2f5/8sBtH
 RK6RDLU1LIH8YNCq5KvAv9/qZZPkuvOKig//lWfcsOLYv43+bp2cGVlO4Z4gvUw3
 qRsrQxXxS+FQFxH5Fxre7UWqLlM9EUHUdbx/aXyGSF5e1DXuD8GcDSt0pOwQboiu
 xKqRxuMozr6ZiHlug3mUcEwzeDAHOwPWrIDSXNELMj+5r/8QogkcPaFUFFqmvigj
 gIwGMiPKe0nQ9XfAUAsjVTL3ozlGXa6nabbVNnA4N05a/scToy3hnFkYo2iEpjyH
 0sxyQ96AaKnN4ydWBsy+y/HA13CbWRP+dgfgaG1BaWQCQ4kh/FN3A3FYpvubBjIm
 5rslXvsmWEkCt/U/K0BY3t6Pvw9GNryAXWDsyPACaVFjMErZPqwwkRtp1PqoKpd6
 XiYQ1nJxgZPCRQ==
 =Ff5J
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Fixes for a handful of KASAN-related crashes.

 - A fix to avoid a crash during boot for SPARSEMEM &&
   !SPARSEMEM_VMEMMAP configurations.

 - A fix to stop reporting some incorrect errors under DEBUG_VIRTUAL.

 - A fix for the K210's device tree to properly populate the interrupt
   map, so hart1 will get interrupts again.

* tag 'riscv-for-linus-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: dts: k210: fix broken IRQs on hart1
  riscv: Fix kasan pud population
  riscv: Move high_memory initialization to setup_bootmem
  riscv: Fix config KASAN && DEBUG_VIRTUAL
  riscv: Fix DEBUG_VIRTUAL false warnings
  riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  riscv: Fix is_linear_mapping with recent move of KASAN region
2022-03-04 11:54:06 -08:00
Niklas Cassel
74583f1b92
riscv: dts: k210: fix broken IRQs on hart1
Commit 67d96729a9 ("riscv: Update Canaan Kendryte K210 device tree")
incorrectly removed two entries from the PLIC interrupt-controller node's
interrupts-extended property.

The PLIC driver cannot know the mapping between hart contexts and hart ids,
so this information has to be provided by device tree, as specified by the
PLIC device tree binding.

The PLIC driver uses the interrupts-extended property, and initializes the
hart context registers in the exact same order as provided by the
interrupts-extended property.

In other words, if we don't specify the S-mode interrupts, the PLIC driver
will simply initialize the hart0 S-mode hart context with the hart1 M-mode
configuration. It is therefore essential to specify the S-mode IRQs even
though the system itself will only ever be running in M-mode.

Re-add the S-mode interrupts, so that we get working IRQs on hart1 again.

Cc: <stable@vger.kernel.org>
Fixes: 67d96729a9 ("riscv: Update Canaan Kendryte K210 device tree")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 20:04:21 -08:00
Alexandre Ghiti
e4fcfe6eca
riscv: Fix kasan pud population
In sv48, the kasan inner regions are not aligned on PGDIR_SIZE and then
when we populate the kasan linear mapping region, we clear the kasan
vmalloc region which is in the same PGD.

Fix this by copying the content of the kasan early pud after allocating a
new PGD for the first time.

Fixes: e8a62cc26d ("riscv: Implement sv48 support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:29 -08:00
Alexandre Ghiti
625e24a550
riscv: Move high_memory initialization to setup_bootmem
high_memory used to be initialized in mem_init, way after setup_bootmem.
But a call to dma_contiguous_reserve in this function gives rise to the
below warning because high_memory is equal to 0 and is used at the very
beginning at cma_declare_contiguous_nid.

It went unnoticed since the move of the kasan region redefined
KERN_VIRT_SIZE so that it does not encompass -1 anymore.

Fix this by initializing high_memory in setup_bootmem.

------------[ cut here ]------------
virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff)
WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27
Hardware name: riscv-virtio,qemu (DT)
epc : __virt_to_phys+0xac/0x1b8
 ra : __virt_to_phys+0xac/0x1b8
epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30
 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657
 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70
 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000
 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747
 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828
 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000
 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000
 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8
 t5 : fffffffef09406e9 t6 : ffffffff84a03758
status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a
[<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4
[<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e
[<ffffffff83208fc2>] paging_init+0x12c/0x35e
[<ffffffff83206bd2>] setup_arch+0x120/0x74e
[<ffffffff83201416>] start_kernel+0xce/0x68c
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<0000000000000000>] 0x0
softirqs last  enabled at (0): [<0000000000000000>] 0x0
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:12 -08:00
Alexandre Ghiti
c648c4bb7d
riscv: Fix config KASAN && DEBUG_VIRTUAL
__virt_to_phys function is called very early in the boot process (ie
kasan_early_init) so it should not be instrumented by KASAN otherwise it
bugs.

Fix this by declaring phys_addr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 (riscv: Add KASAN support)
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:41 -08:00
Alexandre Ghiti
5f763b3b59
riscv: Fix DEBUG_VIRTUAL false warnings
KERN_VIRT_SIZE used to encompass the kernel mapping before it was
redefined when moving the kasan mapping next to the kernel mapping to only
match the maximum amount of physical memory.

Then, kernel mapping addresses that go through __virt_to_phys are now
declared as wrong which is not true, one can use __virt_to_phys on such
addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:04 -08:00
Alexandre Ghiti
a3d3280378
riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
In order to get the pfn of a struct page* when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized which
happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then tries to dereference a null mem_section pointer.

Fix this by removing the usage of this function in kasan_early_init.

Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:30 -08:00
Alexandre Ghiti
8b274f2238
riscv: Fix is_linear_mapping with recent move of KASAN region
The KASAN region was recently moved between the linear mapping and the
kernel mapping, is_linear_mapping used to check the validity of an
address by using the start of the kernel mapping, which is now wrong.

Fix this by using the maximum size of the physical memory.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:02 -08:00
Linus Torvalds
2c8c230eda RISC-V Fixes for 5.17-rc6
* A fix for the K210 sdcard defconfig, to avoid using a fixed delay for
   the root FS.
 * A fix to make sure there's a proper call frame for
   trace_hardirqs_{on,off}().
 
 ---
 
 There are a handful of additional fixes in flight, but not for this
 week.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIZHmQTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQT4ND/42sEQLhcQcDpdvFDX/0zBr1Y8RNy25
 7I9JBYmuTK5AfwmE52I/OcdCLE9bNELH1g+LMK/3amEqhkUtDelBb5UdC4TYfvRm
 SRlj75XKPxESMEW9EjU5BeAz+uDI4oMkOmDPyp+Xv/OayGrFQIPUTo75/SiOdlH7
 a2khiH4/OxqkVlOff3Ko96M4RNSUeUIEVSfrH4pgJC8n+031u02TvR1IIx5TT7ti
 W5YIMw6VZ32Gl5ByZaBMbs9pz+iOKDrn3UfnPrVpbs6P3389EmR4btJpqfzN9JeC
 UQzcx4rqoDzTtJvqkOxiR+Ig4nNJGyeYVvxaGH67MkD/nz6rS26adIs+xPGKjDCC
 TtFyLt4h2+JX+1kNiutTLrQLAaQO4N+LSkysIsoSr9wNGCdnSrAQRxoOLwIuTMBS
 61kRsBvuiuRJZQlbgkP2tiTug/8dYs4vQzPNeC5VO/c3MZB5/j2ykYdKBSsElrpi
 +br602CMdeqvT+M+pT9lWdxa8X9lbYVm1z3hx2FyRdbnYw3nbcQq/mGp8Ju9O5zt
 JXajwPFtUPpWXzm2CcRjeh+2GKoLetgVpwHAIOmnDd6meTXp/BEu12+o7c8vf3H7
 k12BPHVPH1gPklG2Oh8Z/UvevICd4AHlSJT7J19xh7fYVMaaALVQ71Nd2jFbv9N/
 eu8KKxR2pKXiPw==
 =1a16
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the K210 sdcard defconfig, to avoid using a
   fixed delay for the root FS

 - A fix to make sure there's a proper call frame for
   trace_hardirqs_{on,off}().

* tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: fix oops caused by irqsoff latency tracer
  riscv: fix nommu_k210_sdcard_defconfig
2022-02-26 10:26:24 -08:00
Changbin Du
22e2100b1b
riscv: fix oops caused by irqsoff latency tracer
The trace_hardirqs_{on,off}() require the caller to setup frame pointer
properly. This because these two functions use macro 'CALLER_ADDR1' (aka.
__builtin_return_address(1)) to acquire caller info. If the $fp is used
for other purpose, the code generated this macro (as below) could trigger
memory access fault.

   0xffffffff8011510e <+80>:    ld      a1,-16(s0)
   0xffffffff80115112 <+84>:    ld      s2,-8(a1)  # <-- paging fault here

The oops message during booting if compiled with 'irqoff' tracer enabled:
[    0.039615][    T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8
[    0.041925][    T0] Oops [#1]
[    0.042063][    T0] Modules linked in:
[    0.042864][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 #29
[    0.043568][    T0] Hardware name: riscv-virtio,qemu (DT)
[    0.044343][    T0] epc : trace_hardirqs_on+0x56/0xe2
[    0.044601][    T0]  ra : restore_all+0x12/0x6e
[    0.044721][    T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0
[    0.044801][    T0]  gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020
[    0.044882][    T0]  t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0
[    0.044967][    T0]  s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100
[    0.045046][    T0]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[    0.045124][    T0]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45
[    0.045210][    T0]  s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50
[    0.045289][    T0]  s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8
[    0.045389][    T0]  s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000
[    0.045474][    T0]  s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000
[    0.045548][    T0]  t5 : 0000000000000000 t6 : ffffffff814aa368
[    0.045620][    T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d
[    0.046402][    T0] [<ffffffff80003b94>] restore_all+0x12/0x6e

This because the $fp(aka. $s0) register is not used as frame pointer in the
assembly entry code.

	resume_kernel:
		REG_L s0, TASK_TI_PREEMPT_COUNT(tp)
		bnez s0, restore_all
		REG_L s0, TASK_TI_FLAGS(tp)
                andi s0, s0, _TIF_NEED_RESCHED
                beqz s0, restore_all
                call preempt_schedule_irq
                j restore_all

To fix above issue, here we add one extra level wrapper for function
trace_hardirqs_{on,off}() so they can be safely called by low level entry
code.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: 3c46979829 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-24 20:30:30 -08:00
Damien Le Moal
762e52f79c
riscv: fix nommu_k210_sdcard_defconfig
Instead of an arbitrary delay, use the "rootwait" kernel option to wait
for the mmc root device to be ready.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Fixes: 7e09fd3994 ("riscv: Add Canaan Kendryte K210 SD card defconfig")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-24 19:22:55 -08:00
Linus Torvalds
241c32d853 RISC-V Fixes for 5.17-rc5
* A set of three fixes, all aimed at fixing some fallout from the recent
   sparse hart ID support.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIP278THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQYigD/93G1sebBddYopE7oWydYZekkS5es4O
 W7D0QocH+vUsI5QA5788BWe6G2ShgpfXkOkPrVMSCfIZ+sorCbha36Oaj5y8XAMO
 M2PO1mNXWIoz/JuoYlO0iIUNKsc/Ensjfx4zzfcoFPurMrydsBJcrg1MXzFu677y
 dLuv/QDTMDtkIpkCF7QlpgwkzPLrh+vMLtzaW2gGAw/HKr7fSf0pf8/x9FWVKfPX
 3vBGkhKr12Fjwu941oW0rgndxm3qPDINhz0AqpLzyk73c1lwgOJTzIif46/BxyA6
 IZhw2tr4D0CiIAvrhxWKdkO21SFz40rwkS0AX+s+DgxE1DEZhthvaarsNT6Gqse3
 oYlB1aNBPThAHWLL7gTcXR+62kQH1KbqBNNY4rm0ciprBr+sx/m8J3K1tESnW9u9
 EMt3MMkdpLmx2cilpNe9gz2qwjM9C07j5JnO66Sa5MGQ60lp9UzfVRx2HDiwotxt
 0ZQjavxfE+Ndo9mSPfSEnNu8c6csTQGzJqbpdR0wvK603TBpG1x25gsEepbyRU4k
 cl3e1nN8WYH24Zhj0a3MqCDON1uq1K0aSanqq63C8MAcQTF65ALBK9JScdafgOI0
 ALi/o6D+Eus9Vm/a3gAVFmaDT0FIZc1MtYj5TcgzUFD2I5nEOBFu0Be6ApXVORol
 JeDRg5CtG7g7yg==
 =PQE3
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A set of three fixes, all aimed at fixing some fallout from the recent
  sparse hart ID support"

* tag 'riscv-for-linus-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Fix IPI/RFENCE hmask on non-monotonic hartid ordering
  RISC-V: Fix handling of empty cpu masks
  RISC-V: Fix hartid mask handling for hartid 31 and up
2022-02-18 16:14:13 -08:00
Geert Uytterhoeven
5feef64f4c
RISC-V: Fix IPI/RFENCE hmask on non-monotonic hartid ordering
If the boot CPU does not have the lowest hartid, "hartid - hbase" can
become negative, leading to an incorrect hmask, causing userspace to
crash with SEGV.  This is observed on e.g. Starlight Beta, where cpuid 1
maps to hartid 0, and cpuid 0 maps to hartid 1.

Fix this by detecting this case, and shifting the accumulated mask and
updating hbase, if possible.

Fixes: 26fb751ca3 ("RISC-V: Do not use cpumask data structure for hartid bitmap")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 12:27:45 -08:00
Geert Uytterhoeven
2b35d5b7d1
RISC-V: Fix handling of empty cpu masks
The cpumask rework slightly changed the behavior of the code.  Fix this
by treating an empty cpumask as meaning all online CPUs.

Extracted from a patch by Atish Patra <atishp@rivosinc.com>.

Reported-by: Jessica Clarke <jrtc27@jrtc27.com>
Fixes: 26fb751ca3 ("RISC-V: Do not use cpumask data structure for hartid bitmap")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 12:27:44 -08:00
Geert Uytterhoeven
12f4a665cc
RISC-V: Fix hartid mask handling for hartid 31 and up
Jessica reports that using "1 << hartid" causes undefined behavior for
hartid 31 and up.

Fix this by using the BIT() helper instead of an explicit shift.

Reported-by: Jessica Clarke <jrtc27@jrtc27.com>
Fixes: 26fb751ca3 ("RISC-V: Do not use cpumask data structure for hartid bitmap")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 12:27:38 -08:00
Linus Torvalds
1d41d2e826 RISC-V Fixes for 5.17-rc4
* A fix to avoid undefined behavior when stack backtracing, which
   manifests in GCC as incorrect stack addresses.
 * A few fixes for the XIP kernels.
 * A fix to tracking NUMA state on CPU hotplug.
 * Support for the recently relesaed binutils-2.38, which changed the
   default ISA version to one without CSRs or fence.i in I.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIGtOwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQcR9D/9lzWPlayIts89Jz3DHrxVeBY13E3sh
 VqbnFxXzKe8Z1RwH4/ThTfsRP1MXislmc4xoRwUfRVJj2OWLDEBJ/2Sj/AJPFF/Z
 GopDgaT4pdFQ4DH5G8zgnkeAHqa+pMnXfmnmIuwIK2TbropDHoeR3tZzcnlevB7G
 CQL/N7aXtScnnXOAuTaFl9Pgxf5vnqA6NURrWMUXF6Y1e2vQKOg4eDmMTpyb+sG+
 3N/N5vyHg2EBi9nng05uinycjjNUIXfkJ861ZtAVqQUws1+5JtpMsEriadn6LRi8
 Uw+N7XeGdLcN79cHP70Wj4nf256VLXj/B2G3lL2oXRdidyVXKwv3UrbnqPhUvHOn
 QSO+siBetbwG8VvHB8jOZ1x7qKnYUdPgtbwda6EyYDwMrxVRE6dnGA5eW9IQfVse
 7LgGWZCYAcEdzTgPnq9C0mRdgPfZPJTkNnyF5VhnwIDt3mBKEQiXxjK6t4VJxJge
 VK80d8hhabTjxWVRuJIaxdSarRfCWfx3416TAgxbQAvoodDLWK1SQ9xfIgU+fXhB
 1PqHOu9w7M3YhTGb7yTX2mG9mqsCEx+qYajfZdZS3Ejnnu+6eFwjK4LN3jNip4tQ
 2TNqVjWgYmGxSytlI9ZoHsS+CAzNRN9rm7KheIrpqgiz7JjVvZtWkqhuW4YhHtmY
 d+7I3O5DvPHsqw==
 =MVlA
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to avoid undefined behavior when stack backtracing, which
   manifests in GCC as incorrect stack addresses

 - A few fixes for the XIP kernels

 - A fix to tracking NUMA state on CPU hotplug

 - Support for the recently relesaed binutils-2.38, which changed the
   default ISA version to one without CSRs or fence.i in 'I' extension

* tag 'riscv-for-linus-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: fix build with binutils 2.38
  riscv: cpu-hotplug: clear cpu from numa map when teardown
  riscv: extable: fix err reg writing in dedicated uaccess handler
  riscv/mm: Add XIP_FIXUP for riscv_pfn_base
  riscv/mm: Add XIP_FIXUP for phys_ram_base
  riscv: Fix XIP_FIXUP_FLASH_OFFSET
  riscv: eliminate unreliable __builtin_frame_address(1)
2022-02-11 12:02:09 -08:00
Aurelien Jarno
6df2a016c0
riscv: fix build with binutils 2.38
From version 2.38, binutils default to ISA spec version 20191213. This
means that the csr read/write (csrr*/csrw*) instructions and fence.i
instruction has separated from the `I` extension, become two standalone
extensions: Zicsr and Zifencei. As the kernel uses those instruction,
this causes the following build failure:

  CC      arch/riscv/kernel/vdso/vgettimeofday.o
  <<BUILDDIR>>/arch/riscv/include/asm/vdso/gettimeofday.h: Assembler messages:
  <<BUILDDIR>>/arch/riscv/include/asm/vdso/gettimeofday.h:71: Error: unrecognized opcode `csrr a5,0xc01'
  <<BUILDDIR>>/arch/riscv/include/asm/vdso/gettimeofday.h:71: Error: unrecognized opcode `csrr a5,0xc01'
  <<BUILDDIR>>/arch/riscv/include/asm/vdso/gettimeofday.h:71: Error: unrecognized opcode `csrr a5,0xc01'
  <<BUILDDIR>>/arch/riscv/include/asm/vdso/gettimeofday.h:71: Error: unrecognized opcode `csrr a5,0xc01'

The fix is to specify those extensions explicitely in -march. However as
older binutils version do not support this, we first need to detect
that.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-10 09:17:01 -08:00
Pingfan Liu
f40fe31c01
riscv: cpu-hotplug: clear cpu from numa map when teardown
There is numa_add_cpu() when cpus online, accordingly, there should be
numa_remove_cpu() when cpus offline.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Fixes: 4f0e8eef77 ("riscv: Add numa support for riscv64 platform")
Cc: stable@vger.kernel.org
[Palmer: Add missing NUMA include]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-10 09:16:50 -08:00
Jisheng Zhang
f81393a5b2
riscv: extable: fix err reg writing in dedicated uaccess handler
Mayuresh reported commit 20802d8d47 ("riscv: extable: add a dedicated
uaccess handler") breaks the writev02 test case in LTP. This is due to
the err reg isn't correctly set with the errno(-EFAULT in writev02
case). First of all, the err and zero regs are reg numbers rather than
reg offsets in struct pt_regs; Secondly, regs_set_gpr() should write
the regs when offset isn't zero(zero means epc)

Fix it by correcting regs_set_gpr() logic and passing the correct reg
offset to it.

Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Fixes: 20802d8d47 ("riscv: extable: add a dedicated uaccess handler")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-08 17:02:47 -08:00
Paolo Bonzini
7e6a6b400d KVM/arm64 fixes for 5.17, take #2
- A couple of fixes when handling an exception while a SError has been
   delivered
 
 - Workaround for Cortex-A510's single-step[ erratum
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmH9LlcPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDLTcP/3Ry8CzvPubZquMyNdRUFvEg2EcfTa6vtIGW
 Fw7ap2hwPUaXUgJKDihMFIWj3Wf/wPmXw4t2Sr8R/yq8v9kWe+IG1isnT0yQhY3W
 kLXEqc8Mu4Rf8+jvlFHsp5mLENHIswpWAv/EY49ChgZkNmtkKpnPm1qnD89d8bNv
 tUwooDWidQ/7nXdM3z6zygSROJS24+OGTYTWzOQ1KgV3FGaXbqYiCleoPOpRR/Tc
 DQQWF/tVl8bZCqgkGKZCv3aXT0ZUPrQggARJGai78vP0l2sE/Kyaydgq5I7npZja
 2L2U4kDNoPYIVa8A1jvV3Ef3AqNFs6B7+jXWfYIgAcXjCYzDK3cZcxavf/Inq9F1
 3udVGJGSzH1KkGaihW3BVhsqGORRHKCdksJzWRgqf6vGyJhJw0u0D2u1rTWcT+jw
 Nm4KxShp0CX59HSLnVF5sR0Mct3jNNZ7UCCgH7q10wuBqYRfJT32hCo2ZrT7g9oD
 IQ+pa2dVYa3SaKZ4O6T/lSlbLOuuxtvmcEIfxYpPD6m10S5RrxOdsW3MCtiYM5HQ
 24oo2mk6NIu/va0XxhcW+NBMcYtLQD9JUGbkUkpcRy2mgilTi9b4YPp+muYM7plQ
 /S1gj2kGY8vjMg0H+wysjMJyl2huEwSRsZ/UfxCAgW+MYhHLDxhxAnDWc8EcwGgE
 tUzomowB
 =Mbx/
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #2

- A couple of fixes when handling an exception while a SError has been
  delivered

- Workaround for Cortex-A510's single-step[ erratum
2022-02-05 00:58:25 -05:00
Palmer Dabbelt
ca0cb9a60f
riscv/mm: Add XIP_FIXUP for riscv_pfn_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:27:23 -08:00
Palmer Dabbelt
4b1c70aa8e
riscv/mm: Add XIP_FIXUP for phys_ram_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 6d7f91d914 ("riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:18:56 -08:00
Myrtle Shah
3c04d84508
riscv: Fix XIP_FIXUP_FLASH_OFFSET
There were several problems with the calculation. Not only was an 'and'
being computed into t1 but thrown away; but the 'and' itself would
cause problems if the granularity of the XIP physical address was less
than XIP_OFFSET - in my case I had the kernel image at 2MB in SPI flash.

Fixes: f9ace4ede4 ("riscv: remove .text section size limitation for XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Myrtle Shah <gatecat@ds0.me>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 11:11:07 -08:00
Changbin Du
6a00ef4493
riscv: eliminate unreliable __builtin_frame_address(1)
I tried different pieces of code which uses __builtin_frame_address(1)
(with both gcc version 7.5.0 and 10.3.0) to verify whether it works as
expected on riscv64. The result is negative.

What the compiler had generated is as below:
31                      fp = (unsigned long)__builtin_frame_address(1);
   0xffffffff80006024 <+200>:   ld      s1,0(s0)

It takes '0(s0)' as the address of frame 1 (caller), but the actual address
should be '-16(s0)'.

          |       ...       | <-+
          +-----------------+   |
          | return address  |   |
          | previous fp     |   |
          | saved registers |   |
          | local variables |   |
  $fp --> |       ...       |   |
          +-----------------+   |
          | return address  |   |
          | previous fp --------+
          | saved registers |
  $sp --> | local variables |
          +-----------------+

This leads the kernel can not dump the full stack trace on riscv.

[    7.222126][    T1] Call Trace:
[    7.222804][    T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a

This problem is not exposed on most riscv builds just because the '0(s0)'
occasionally is the address frame 2 (caller's caller), if only ra and fp
are stored in frame 1 (caller).

          |       ...       | <-+
          +-----------------+   |
          | return address  |   |
  $fp --> | previous fp     |   |
          +-----------------+   |
          | return address  |   |
          | previous fp --------+
          | saved registers |
  $sp --> | local variables |
          +-----------------+

This could be a *bug* of gcc that should be fixed. But as noted in gcc
manual "Calling this function with a nonzero argument can have
unpredictable effects, including crashing the calling program.", let's
remove the '__builtin_frame_address(1)' in backtrace code.

With this fix now it can show full stack trace:
[   10.444838][    T1] Call Trace:
[   10.446199][    T1] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a
[   10.447711][    T1] [<ffffffff800060ac>] show_stack+0x32/0x3e
[   10.448710][    T1] [<ffffffff80a005c0>] dump_stack_lvl+0x58/0x7a
[   10.449941][    T1] [<ffffffff80a005f6>] dump_stack+0x14/0x1c
[   10.450929][    T1] [<ffffffff804c04ee>] ubsan_epilogue+0x10/0x5a
[   10.451869][    T1] [<ffffffff804c092e>] __ubsan_handle_load_invalid_value+0x6c/0x78
[   10.453049][    T1] [<ffffffff8018f834>] __pagevec_release+0x62/0x64
[   10.455476][    T1] [<ffffffff80190830>] truncate_inode_pages_range+0x132/0x5be
[   10.456798][    T1] [<ffffffff80190ce0>] truncate_inode_pages+0x24/0x30
[   10.457853][    T1] [<ffffffff8045bb04>] kill_bdev+0x32/0x3c
...

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: eac2f3059e ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 10:12:32 -08:00
Anup Patel
403271548a RISC-V: KVM: Fix SBI implementation version
The SBI implementation version returned by KVM RISC-V should be the
Host Linux version code.

Fixes: c62a768597 ("RISC-V: KVM: Add SBI v0.2 base extension")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-02-02 18:58:06 +05:30
Mayuresh Chitale
de1d7b6a51 RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
Those applications that run in VU mode and access the time CSR cause
a virtual instruction trap as Guest kernel currently does not
initialize the scounteren CSR.

To fix this, we should make CY, TM, and IR counters accessibile
by default in VU mode (similar to OpenSBI).

Fixes: a33c72faf2 ("RISC-V: KVM: Implement VCPU create, init and
destroy functions")
Cc: stable@vger.kernel.org
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-02-02 18:57:10 +05:30
Mark Rutland
6455317e4d kvm/riscv: rework guest entry logic
In kvm_arch_vcpu_ioctl_run() we enter an RCU extended quiescent state
(EQS) by calling guest_enter_irqoff(), and unmask IRQs prior to exiting
the EQS by calling guest_exit(). As the IRQ entry code will not wake RCU
in this case, we may run the core IRQ code and IRQ handler without RCU
watching, leading to various potential problems.

Additionally, we do not inform lockdep or tracing that interrupts will
be enabled during guest execution, which caan lead to misleading traces
and warnings that interrupts have been enabled for overly-long periods.

This patch fixes these issues by using the new timing and context
entry/exit helpers to ensure that interrupts are handled during guest
vtime but with RCU watching, with a sequence:

	guest_timing_enter_irqoff();

	guest_state_enter_irqoff();
	< run the vcpu >
	guest_state_exit_irqoff();

	< take any pending IRQs >

	guest_timing_exit_irqoff();

Since instrumentation may make use of RCU, we must also ensure that no
instrumented code is run during the EQS. I've split out the critical
section into a new kvm_riscv_enter_exit_vcpu() helper which is marked
noinstr.

Fixes: 99cdc6c18c ("RISC-V: Add initial skeletal KVM support")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anup Patel <anup@brainfault.org>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Tested-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-02-02 17:45:44 +05:30
Linus Torvalds
3689f9f8b0 bitmap patches for 5.17-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y
 b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC
 A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s
 iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE
 m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z
 9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7
 MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC
 nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY
 CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk
 5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA==
 =RKW4
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - introduce for_each_set_bitrange()

 - use find_first_*_bit() instead of find_next_*_bit() where possible

 - unify for_each_bit() macros

* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
  vsprintf: rework bitmap_list_string
  lib: bitmap: add performance test for bitmap_print_to_pagebuf
  bitmap: unify find_bit operations
  mm/percpu: micro-optimize pcpu_is_populated()
  Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
  find: micro-optimize for_each_{set,clear}_bit()
  include/linux: move for_each_bit() macros from bitops.h to find.h
  cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
  tools: sync tools/bitmap with mother linux
  all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
  cpumask: use find_first_and_bit()
  lib: add find_first_and_bit()
  arch: remove GENERIC_FIND_FIRST_BIT entirely
  include: move find.h from asm_generic to linux
  bitops: move find_bit_*_le functions from le.h to find.h
  bitops: protect find_first_{,zero}_bit properly
2022-01-23 06:20:44 +02:00
Linus Torvalds
7867e40278 RISC-V Patches for the 5.17 Merge Window, Part 2
* Support for sv48 paging.
 * Hart ID mappings are now sparse, which enables more CPUs to come up on
   systems with sparse hart IDs.
 * A handful of cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmHq5KQTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiY+yEACrEgyrb/YBVxmO8q2L4eodF4J/coxz
 g5pZkmZeZkGKkhpk7jrf+/P+7iLgT9x+EFzinofl4/QZeSeVHZyawiHpNUa72JUS
 Pk1hs2EpfQafbXgpSKGL6HLyJXESe/kzX31s4PJudNz7GcQBAL54hYCrTb79lmXX
 FQPVYZ71EZhGupRkSyTI1Bmp2wjdwy00Lto3/pQ5zZkVoLX7yyoL5XGDFqtNF+vI
 tiHXzWY5+TYDFhV+g6f8x9kIbqr7H7Gk2MzzOYoSp3jgOTTPGlnjXVOjWiCv5WWH
 6tt2fas4tlV4t4gPqIrDUQtkc3gvvfMSHY9gwid9kZKd8RNT2NoHjkLLeRqmcMZZ
 4R7Gu/QKmulZvnjQ99mxcdhrg9xse+FG7WqDeQykaLpGo9UBqQ9of+L5z7Mt9T23
 vn62G5qeH7AOrMNWdR0PWCQPFQlRibxpvVvzEVhgg+uUK19ReicHAgTEE/457E81
 Rx/f8zb3PbaAnmQEhPs8IVXJo6AXSzZQiUTD3KRJXNeYa4scHaVKGVY9RQQ9tDcH
 tJKUXHumITrLxxgxTktYHPTBgtoDvjYMGv/iSSJDkrScpWW1tsXadbm3ma6cPxQJ
 DQO+o/J8+DqWLFbwVZ93oL/19uZHth6y43kBocZJuVetg27a3CD5PutQWoqrbGjV
 m5oedHdFW1Oouw==
 =WKqE
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for sv48 paging

 - Hart ID mappings are now sparse, which enables more CPUs to come up
   on systems with sparse hart IDs

 - A handful of cleanups and fixes

* tag 'riscv-for-linus-5.17-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (27 commits)
  RISC-V: nommu_virt: Drop unused SLAB_MERGE_DEFAULT
  RISC-V: Remove redundant err variable
  riscv: dts: sifive unmatched: Add gpio poweroff
  riscv: canaan: remove useless select of non-existing config SYSCON
  RISC-V: Do not use cpumask data structure for hartid bitmap
  RISC-V: Move spinwait booting method to its own config
  RISC-V: Move the entire hart selection via lottery to SMP
  RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
  RISC-V: Do not print the SBI version during HSM extension boot print
  RISC-V: Avoid using per cpu array for ordered booting
  riscv: default to CONFIG_RISCV_SBI_V01=n
  riscv: fix boolconv.cocci warnings
  riscv: Explicit comment about user virtual address space size
  riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
  riscv: Implement sv48 support
  asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
  riscv: Allow to dynamically define VA_BITS
  riscv: Introduce functions to switch pt_ops
  riscv: Split early kasan mapping to prepare sv48 introduction
  riscv: Move KASAN mapping next to the kernel mapping
  ...
2022-01-22 09:34:49 +02:00
Palmer Dabbelt
c59cd507fb
RISC-V: nommu_virt: Drop unused SLAB_MERGE_DEFAULT
Our nommu_virt_defconfig set SLOB=y and SLAB_MERGE_DEFAULT=n.  As of
eb52c0fc23 ("mm: Make SLAB_MERGE_DEFAULT depend on SL[AU]B") it's no
longer necessary to set the second, which appears to never have had any
effect for SLOB=y anyway.

This was suggested by savedefconfig.

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 12:48:14 -08:00
Minghao Chi
8da46c0f98
RISC-V: Remove redundant err variable
Return value from user_regset_copyin() directly instead of taking this
in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 10:24:12 -08:00
Ron Economos
db3f02df18
riscv: dts: sifive unmatched: Add gpio poweroff
Some of the GPIO pins on the Unmatched are wire up to control the power
of the board, indicate that in the device tree.

Signed-off-by: Ron Economos <w6rz@comcast.net>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:37:23 -08:00
Atish Patra
26fb751ca3
RISC-V: Do not use cpumask data structure for hartid bitmap
Currently, SBI APIs accept a hartmask that is generated from struct
cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
is not the correct data structure for hartids as it can be higher
than NR_CPUs for platforms with sparse or discontguous hartids.

Remove all association between hartid mask and struct cpumask.

Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:22 -08:00
Atish Patra
2ffc48fc70
RISC-V: Move spinwait booting method to its own config
The spinwait booting method should only be used for platforms with older
firmware without SBI HSM extension or M-mode firmware because spinwait
method can't support cpu hotplug, kexec or sparse hartid. It is better
to move the entire spinwait implementation to its own config which can
be disabled if required. It is enabled by default to maintain backward
compatibility and M-mode Linux.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:16 -08:00
Atish Patra
0b39eb38f8
RISC-V: Move the entire hart selection via lottery to SMP
The booting hart selection via lottery is only useful for SMP systems.
Moreover, the lottery selection is only necessary for systems using
spinwait booting method. It is better to keep the entire lottery
selection together so that it can be disabled in future.

Move the lottery selection code to under CONFIG_SMP.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:11 -08:00
Atish Patra
c78f94f35c
RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
The __cpu_up_stack/task_pointer array is only used for spinwait method
now. The per cpu array based lookup is also fragile for platforms with
discontiguous/sparse hartids. The spinwait method is only used for
M-mode Linux or older firmwares without SBI HSM extension. For general
Linux systems, ordered booting method is preferred anyways to support
cpu hotplug and kexec.

Make sure that __cpu_up_stack/task_pointer is only used for spinwait
method. Take this opportunity to rename it to
__cpu_spinwait_stack/task_pointer to emphasize the purpose as well.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:08 -08:00
Atish Patra
410bb20a69
RISC-V: Do not print the SBI version during HSM extension boot print
The HSM extension information log also prints the SBI version v0.2. This
is misleading as the underlying firmware SBI version may be different
from v0.2.

Remove the unncessary printing of SBI version.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:03 -08:00
Atish Patra
9a2451f186
RISC-V: Avoid using per cpu array for ordered booting
Currently both order booting and spinwait approach uses a per cpu
array to update stack & task pointer. This approach will not work for the
following cases.
1. If NR_CPUs are configured to be less than highest hart id.
2. A platform has sparse hartid.

This issue can be fixed for ordered booting as the booting cpu brings up
one cpu at a time using SBI HSM extension which has opaque parameter
that is unused until now.

Introduce a common secondary boot data structure that can store the stack
and task pointer. Secondary harts will use this data while booting up
to setup the sp & tp.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:26:59 -08:00
Heinrich Schuchardt
3938d5a2f9
riscv: default to CONFIG_RISCV_SBI_V01=n
The SBI 0.1 specification is obsolete. The current version is 0.3.
Hence we should not rely by default on SBI 0.1 being implemented.

Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 08:49:12 -08:00
Linus Torvalds
f4484d138b Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "55 patches.

  Subsystems affected by this patch series: percpu, procfs, sysctl,
  misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2,
  hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits)
  lib: remove redundant assignment to variable ret
  ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
  kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
  lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
  btrfs: use generic Kconfig option for 256kB page size limit
  arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
  configs: introduce debug.config for CI-like setup
  delayacct: track delays from memory compact
  Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
  delayacct: cleanup flags in struct task_delay_info and functions use it
  delayacct: fix incomplete disable operation when switch enable to disable
  delayacct: support swapin delay accounting for swapping without blkio
  panic: remove oops_id
  panic: use error_report_end tracepoint on warnings
  fs/adfs: remove unneeded variable make code cleaner
  FAT: use io_schedule_timeout() instead of congestion_wait()
  hfsplus: use struct_group_attr() for memcpy() region
  nilfs2: remove redundant pointer sbufs
  fs/binfmt_elf: use PT_LOAD p_align values for static PIE
  const_structs.checkpatch: add frequently used ops structs
  ...
2022-01-20 10:41:01 +02:00
Kefeng Wang
7ecd19cfdf mm: percpu: generalize percpu related config
Patch series "mm: percpu: Cleanup percpu first chunk function".

When supporting page mapping percpu first chunk allocator on arm64, we
found there are lots of duplicated codes in percpu embed/page first chunk
allocator.  This patchset is aimed to cleanup them and should no function
change.

The currently supported status about 'embed' and 'page' in Archs shows
below,

	embed: NEED_PER_CPU_PAGE_FIRST_CHUNK
	page:  NEED_PER_CPU_EMBED_FIRST_CHUNK

		embed	page
	------------------------
	arm64	  Y	 Y
	mips	  Y	 N
	powerpc	  Y	 Y
	riscv	  Y	 N
	sparc	  Y	 Y
	x86	  Y	 Y
	------------------------

There are two interfaces about percpu first chunk allocator,

 extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
                                size_t atom_size,
                                pcpu_fc_cpu_distance_fn_t cpu_distance_fn,
-                               pcpu_fc_alloc_fn_t alloc_fn,
-                               pcpu_fc_free_fn_t free_fn);
+                               pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);

 extern int __init pcpu_page_first_chunk(size_t reserved_size,
-                               pcpu_fc_alloc_fn_t alloc_fn,
-                               pcpu_fc_free_fn_t free_fn,
-                               pcpu_fc_populate_pte_fn_t populate_pte_fn);
+                               pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);

The pcpu_fc_alloc_fn_t/pcpu_fc_free_fn_t is killed, we provide generic
pcpu_fc_alloc() and pcpu_fc_free() function, which are called in the
pcpu_embed/page_first_chunk().

1) For pcpu_embed_first_chunk(), pcpu_fc_cpu_to_node_fn_t is needed to be
   provided when archs supported NUMA.

2) For pcpu_page_first_chunk(), the pcpu_fc_populate_pte_fn_t is killed too,
   a generic pcpu_populate_pte() which marked '__weak' is provided, if you
   need a different function to populate pte on the arch(like x86), please
   provide its own implementation.

[1] https://github.com/kevin78/linux.git percpu-cleanup

This patch (of 4):

The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/
NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs, which have
duplicate definitions on platforms that subscribe it.

Move them into mm, drop these redundant definitions and instead just
select it on applicable platforms.

Link: https://lkml.kernel.org/r/20211216112359.103822-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20211216112359.103822-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:52 +02:00
kernel test robot
20aa49541a
riscv: fix boolconv.cocci warnings
arch/riscv/mm/init.c:48:11-16: WARNING: conversion to bool not needed here

 Remove unneeded conversion to bool

Semantic patch information:
 Relational and logical operators evaluate to bool,
 explicit conversion is overly verbose and unneeded.

Generated by: scripts/coccinelle/misc/boolconv.cocci

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:52:30 -08:00
Palmer Dabbelt
0c34e79e52
RISC-V: Introduce sv48 support without relocatable kernel
This patchset allows to have a single kernel for sv39 and sv48 without
being relocatable.

The idea comes from Arnd Bergmann who suggested to do the same as x86,
that is mapping the kernel to the end of the address space, which allows
the kernel to be linked at the same address for both sv39 and sv48 and
then does not require to be relocated at runtime.

This implements sv48 support at runtime. The kernel will try to boot
with 4-level page table and will fallback to 3-level if the HW does not
support it. Folding the 4th level into a 3-level page table has almost
no cost at runtime.

Note that kasan region had to be moved to the end of the address space
since its location must be known at compile-time and then be valid for
both sv39 and sv48 (and sv57 that is coming).

* riscv-sv48-v3:
  riscv: Explicit comment about user virtual address space size
  riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
  riscv: Implement sv48 support
  asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
  riscv: Allow to dynamically define VA_BITS
  riscv: Introduce functions to switch pt_ops
  riscv: Split early kasan mapping to prepare sv48 introduction
  riscv: Move KASAN mapping next to the kernel mapping
  riscv: Get rid of MAXPHYSMEM configs

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:37:44 -08:00
Alexandre Ghiti
c774de22c4
riscv: Explicit comment about user virtual address space size
Define precisely the size of the user accessible virtual space size
for sv32/39/48 mmu types and explain why the whole virtual address
space is split into 2 equal chunks between kernel and user space.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:11 -08:00
Alexandre Ghiti
73c7c8f68e
riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
Now that the mmu type is determined at runtime using SATP
characteristic, use the global variable pgtable_l4_enabled to output
mmu type of the processor through /proc/cpuinfo instead of relying on
device tree infos.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:10 -08:00
Alexandre Ghiti
e8a62cc26d
riscv: Implement sv48 support
By adding a new 4th level of page table, give the possibility to 64bit
kernel to address 2^48 bytes of virtual address: in practice, that offers
128TB of virtual address space to userspace and allows up to 64TB of
physical memory.

If the underlying hardware does not support sv48, we will automatically
fallback to a standard 3-level page table by folding the new PUD level into
PGDIR level. In order to detect HW capabilities at runtime, we
use SATP feature that ignores writes with an unsupported mode.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:09 -08:00