Commit Graph

245 Commits

Author SHA1 Message Date
Alexandre Ghiti 54c5639d8f
riscv: Fix asan-stack clang build
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:54:50 -07:00
Alexandre Ghiti cf11d01135
riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.

We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:53:42 -07:00
Alexandre Ghiti bb8958d5dc
riscv: Flush current cpu icache before other cpus
On SiFive Unmatched, I recently fell onto the following BUG when booting:

[    0.000000] ftrace: allocating 36610 entries in 144 pages
[    0.000000] Oops - illegal instruction [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.1+ #5
[    0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT)
[    0.000000] epc : riscv_cpuid_to_hartid_mask+0x6/0xae
[    0.000000]  ra : __sbi_rfence_v02+0xc8/0x10a
[    0.000000] epc : ffffffff80007240 ra : ffffffff80009964 sp : ffffffff81803e10
[    0.000000]  gp : ffffffff81a1ea70 tp : ffffffff8180f500 t0 : ffffffe07fe30000
[    0.000000]  t1 : 0000000000000004 t2 : 0000000000000000 s0 : ffffffff81803e60
[    0.000000]  s1 : 0000000000000000 a0 : ffffffff81a22238 a1 : ffffffff81803e10
[    0.000000]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[    0.000000]  a5 : 0000000000000000 a6 : ffffffff8000989c a7 : 0000000052464e43
[    0.000000]  s2 : ffffffff81a220c8 s3 : 0000000000000000 s4 : 0000000000000000
[    0.000000]  s5 : 0000000000000000 s6 : 0000000200000100 s7 : 0000000000000001
[    0.000000]  s8 : ffffffe07fe04040 s9 : ffffffff81a22c80 s10: 0000000000001000
[    0.000000]  s11: 0000000000000004 t3 : 0000000000000001 t4 : 0000000000000008
[    0.000000]  t5 : ffffffcf04000808 t6 : ffffffe3ffddf188
[    0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000002
[    0.000000] [<ffffffff80007240>] riscv_cpuid_to_hartid_mask+0x6/0xae
[    0.000000] [<ffffffff80009474>] sbi_remote_fence_i+0x1e/0x26
[    0.000000] [<ffffffff8000b8f4>] flush_icache_all+0x12/0x1a
[    0.000000] [<ffffffff8000666c>] patch_text_nosync+0x26/0x32
[    0.000000] [<ffffffff8000884e>] ftrace_init_nop+0x52/0x8c
[    0.000000] [<ffffffff800f051e>] ftrace_process_locs.isra.0+0x29c/0x360
[    0.000000] [<ffffffff80a0e3c6>] ftrace_init+0x80/0x130
[    0.000000] [<ffffffff80a00f8c>] start_kernel+0x5c4/0x8f6
[    0.000000] ---[ end trace f67eb9af4d8d492b ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

While ftrace is looping over a list of addresses to patch, it always failed
when patching the same function: riscv_cpuid_to_hartid_mask. Looking at the
backtrace, the illegal instruction is encountered in this same function.
However, patch_text_nosync, after patching the instructions, calls
flush_icache_range. But looking at what happens in this function:

flush_icache_range -> flush_icache_all
                   -> sbi_remote_fence_i
                   -> __sbi_rfence_v02
                   -> riscv_cpuid_to_hartid_mask

The icache and dcache of the current cpu are never synchronized between the
patching of riscv_cpuid_to_hartid_mask and calling this same function.

So fix this by flushing the current cpu's icache before asking for the other
cpus to do the same.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: fab957c11e ("RISC-V: Atomic and Locking Code")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 18:24:15 -07:00
Linus Torvalds 063df71a57 RISC-V Patches for the 5.15 Merge Window, Part 1
* Support for PC-relative instructions (auipc and branches) in kprobes.
 * Support for forced IRQ threading.
 * Support for the hlt/nohlt kernel command line options, via the generic
   idle loop.
 * Support for showing the edge/level triggered behavior of interrupts in
   /proc/interrupts.
 * A handful of cleanups to our address mapping mechanisms.
 * Support for allocating gigantic hugepages via CMA.
 * Support for the undefined behavior sanitizer.
 * A handful of cleanups to the VDSO that allow the kernel to build with
   LLD.
 * Support for hugepage migration.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmEzxA4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYifuvD/93Tr9oAy6dwkGoa/VIzBc/nklCeTiO
 cv2W9eFQMG+PdRh1Ezi9FrLO6qZoADGOPPcnuc2HTTe58CWWWg0juC862H/N2k3V
 +3LGhHTv3E7NsEZNa4YhQXw6pPmj5OhbHab87HNowHpuNJGpKYE6Q+QbnYl5vorI
 9YS10pRpqvO6tdHUW335Z5N0nW9X3HCHD6NUb1ZV8qd34QzP+XeCJMqVXpoOijNO
 96lhyDYPivM2Rhi+uvtSjUeWzbddnN/P9IywfECddHFJyGvoEtiIROo9d2sR6pla
 lAa1YOw0ybD/UPHtL/diEU+K8ivNVKdRN/YvlzHS4CdCrPyuWSUmBMozsOBMBwf1
 UZCMDwmP2sQkHu1Ht9g1fXwk2R85j1s+qE5cZEXyAEULKfHlmt4vygd1xgm8DTf9
 xgS+np2g0frcmKU8cLPmC7Sr0YI1fPeDMRkBXTYE1SmBF3eHrZCCbeM1H97p2mUC
 zs/Dvyoj7e25HGyYqxN3lHZzhMluqh7wCQSOQMB2YxVwdiA4VjyjxLF4MKAdXL8i
 F1QTETsjLGu6UiMfNIQ4RVs981/fGYSANdOvMfwo447gVH6JaxC51sLo6YIDtXss
 tLzchHNu5IMNqB7XOb1gU8ZKeXruG9lrxwUOyEIYAVi6pHCbkdLV0A8ymwEjj06b
 aGV4xFPaAGvDLQ==
 =rDB3
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - support PC-relative instructions (auipc and branches) in kprobes

 - support for forced IRQ threading

 - support for the hlt/nohlt kernel command line options, via the
   generic idle loop

 - show the edge/level triggered behavior of interrupts
   in /proc/interrupts

 - a handful of cleanups to our address mapping mechanisms

 - support for allocating gigantic hugepages via CMA

 - support for the undefined behavior sanitizer (UBSAN)

 - a handful of cleanups to the VDSO that allow the kernel to build with
   LLD.

 - support for hugepage migration

* tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  riscv: add support for hugepage migration
  RISC-V: Fix VDSO build for !MMU
  riscv: use strscpy to replace strlcpy
  riscv: explicitly use symbol offsets for VDSO
  riscv: Enable Undefined Behavior Sanitizer UBSAN
  riscv: Keep the riscv Kconfig selects sorted
  riscv: Support allocating gigantic hugepages using CMA
  riscv: fix the global name pfn_base confliction error
  riscv: Move early fdt mapping creation in its own function
  riscv: Simplify BUILTIN_DTB device tree mapping handling
  riscv: Use __maybe_unused instead of #ifdefs around variable declarations
  riscv: Get rid of map_size parameter to create_kernel_page_table
  riscv: Introduce va_kernel_pa_offset for 32-bit kernel
  riscv: Optimize kernel virtual address conversion macro
  dt-bindings: riscv: add starfive jh7100 bindings
  riscv: Enable GENERIC_IRQ_SHOW_LEVEL
  riscv: Enable idle generic idle loop
  riscv: Allow forced irq threading
  riscv: Implement thread_struct whitelist for hardened usercopy
  riscv: kprobes: implement the branch instructions
  ...
2021-09-05 11:31:23 -07:00
Linus Torvalds 14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Mike Rapoport a7259df767 memblock: make memblock_find_in_range method private
There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.

Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.

This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().

Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>			[riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Linus Torvalds 9e5f3ffcf1 Devicetree updates for v5.15:
- Refactor arch kdump DT related code to a common implementation
 
 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties
 
 - Various clean-ups to DT FDT code
 
 - Fix a runtime error for !CONFIG_SYSFS
 
 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.
 
 - Convert a bunch of memory controller bindings to schemas
 
 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.
 
 - Extend nvmem bindings to handle bit offsets in unit-addresses
 
 - Add DT schemas for HiKey 970 PCIe PHY
 
 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings
 
 - Enable dtc pci_device_reg warning by default
 
 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmEuWEsQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw+CtD/45m84GisULb7FFmlo+WY2SbzE8a+MUEXo0
 5ZZoMViSvBchphap9ueFNDdrLMUOHMsFaxHuTCUxXr4tq7EOemM7Br4OLiwiRrM5
 o2CwBvXYu+49c4UKVFMM6RCKFiXvw5NLI4Twpj4Ge8farHvt9Ecwtq+Y+RYWgFk2
 xwXWut7ZK3zBU6B+s4MRBATCFTD5oC4pAJIK3OQUlUPqZEQqdTRBKv5lyg+VUY2k
 eU0Cyzm0dZAmtjAu8ovhVNLfK1pp165QiaFIE1qh5H3ZVZAJlNyqN4jBDx9E4pLj
 BeazrsqfOkC8mZC+T7TgixhwB6D+r6/JW9NiCjYbarXibIsUOKSTKtj8XR8eZF/g
 sLeVDx33U5S+dlj1OB7scwq4Q9sG27ii2rlkvafA5KKBjoR2dzz7o9JesCV1Guha
 goPXmcd08e+KrjINxVc6gk4Y+KG8u+G7qnXnnmSatESJKxiDu1OgU3L16mlTJFaM
 hBmrh5rx1y8EkQnzgceTZIIWh30poSQKKyDB6Ta4Dude5JE+rS30oVURDR7MIrav
 rY70OYOiSq/nCcC7bc0Yu0UxJi+bwH28WvsD0aeCUOBTFsnI4j2uvsPsh3Aq74O0
 UbQmUCMxhpmsDVdIOqlS1IVH8M79I+BrDTPVP6EE96ttoj9FbSi6AgjeGJzVMC99
 EhtWe+gKTQ==
 =28CD
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Refactor arch kdump DT related code to a common implementation

 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties

 - Various clean-ups to DT FDT code

 - Fix a runtime error for !CONFIG_SYSFS

 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.

 - Convert a bunch of memory controller bindings to schemas

 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.

 - Extend nvmem bindings to handle bit offsets in unit-addresses

 - Add DT schemas for HiKey 970 PCIe PHY

 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings

 - Enable dtc pci_device_reg warning by default

 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft

* tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (78 commits)
  dt-bindings: display: remove zte,vou.txt binding doc
  dt-bindings: hwmon: merge max1619 into trivial devices
  dt-bindings: mtd-physmap: Add 'arm,vexpress-flash' compatible
  dt-bindings: PCI: imx6: convert the imx pcie controller to dtschema
  dt-bindings: Use 'enum' instead of 'oneOf' plus 'const' entries
  dt-bindings: Add vendor prefix for Topic Embedded Systems
  of: fdt: Rename reserve_elfcorehdr() to fdt_reserve_elfcorehdr()
  arm64: kdump: Remove custom linux,usable-memory-range handling
  arm64: kdump: Remove custom linux,elfcorehdr handling
  riscv: Remove non-standard linux,elfcorehdr handling
  of: fdt: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD) instead of #ifdef
  of: fdt: Add generic support for handling usable memory range property
  of: fdt: Add generic support for handling elf core headers property
  crash_dump: Make elfcorehdr address/size symbols always visible
  dt-bindings: memory: convert Samsung Exynos DMC to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos PPMU to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos NoCP to dtschema
  kbuild: Enable dtc 'pci_device_reg' warning by default
  dt-bindings: soc: remove obsolete zte zx header
  dt-bindings: clock: remove obsolete zte zx header
  ...
2021-09-01 18:34:51 -07:00
Geert Uytterhoeven 2931ea847d riscv: Remove non-standard linux,elfcorehdr handling
RISC-V uses platform-specific code to locate the elf core header in
memory.  However, this does not conform to the standard
"linux,elfcorehdr" DT bindings, as it relies on a reserved memory node
with the "linux,elfcorehdr" compatible value, instead of on a
"linux,elfcorehdr" property under the "/chosen" node.

The non-compliant code can just be removed, as the standard behavior is
already implemented by platform-agnostic handling in the FDT core code.

Fixes: 5640975003 ("RISC-V: Add crash kernel support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/41c75d6ee3114ae6304f8afe0051895af91200ee.1628670468.git.geert+renesas@glider.be
2021-08-24 17:09:01 -05:00
Kefeng Wang 8ba1a8b77b
riscv: Support allocating gigantic hugepages using CMA
This patch adds support to allocate gigantic hugepages using CMA by
specifying the hugetlb_cma= kernel parameter.  This is only supported on
RV64.

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-13 21:33:54 -07:00
Kenneth Lee fb31f0a499
riscv: fix the global name pfn_base confliction error
RISCV uses a global variable pfn_base for page/pfn translation. But this
is a common name and will be used elsewhere. In those cases, the
page-pfn macros which refer to this name will be referred to the
local/input variable instead. (such as in vfio_pin_pages_remote). This
make everything wrong.

This patch changes the name from pfn_base to riscv_pfn_base to fix
this problem.

Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-13 15:31:51 -07:00
Alexandre Ghiti fdf3a7a1e0
riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE
The current comment states that we check if the 64-bit kernel mapping
overlaps with the last 4K of the address space that is reserved to
error values in create_kernel_page_table, which is not the case since it
is done in setup_vm. But anyway, remove the reference to any function
and simply note that in 64-bit kernel, the check should be done as soon
as the kernel mapping base address is known.

Fixes: db6b84a368 ("riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12 07:16:58 -07:00
Alexandre Ghiti fe45ffa4c5
riscv: Move early fdt mapping creation in its own function
The code that handles the early fdt mapping is hard to read and does not
create the same mapping size depending on the kernel:

- for 64-bit, 2 PMD entries are used which amounts to a 4MB mapping
- for 32-bit, 2 PGDIR entries are used which amounts to a 8MB mapping

So keep using 2 PMD entries for 64-bit and use only one PGD entry for
32-bit needed to cover 4MB. Move that into a new function called
create_fdt_early_page_table which, using the same naming as
create_kernel_page_table.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:38 -07:00
Alexandre Ghiti 977765ce31
riscv: Simplify BUILTIN_DTB device tree mapping handling
__PAGETABLE_PMD_FOLDED defines a 2-level page table that is only used in
32-bit kernel, so there is no need to check for CONFIG_64BIT in #ifndef
__PAGETABLE_PMD_FOLDED and vice-versa.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:34 -07:00
Alexandre Ghiti 6f3e5fd241
riscv: Use __maybe_unused instead of #ifdefs around variable declarations
This allows to simplify the code and make it more readable.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:31 -07:00
Alexandre Ghiti 526f83df1d
riscv: Get rid of map_size parameter to create_kernel_page_table
The kernel must always be mapped using PMD_SIZE, and this is already the
case, this just simplifies create_kernel_page_table.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:20 -07:00
Alexandre Ghiti 0aba691a74
riscv: Introduce va_kernel_pa_offset for 32-bit kernel
va_kernel_pa_offset was only used for 64-bit as the kernel mapping lies
in the linear mapping for 32-bit kernel and then only the offset between
the PAGE_OFFSET and the kernel load address is needed.

But this distinction complexifies the code with #ifdefs and especially
with a separate definition of the address conversions macros.

Simplify the code by defining this variable for both 32-bit and 64-bit.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:11 -07:00
Alexandre Ghiti 6d7f91d914
riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion
The usage of CONFIG_PHYS_RAM_BASE for all kernel types was a mistake:
this value is implementation-specific and this breaks the genericity of
the RISC-V kernel.

Fix this by introducing a new variable phys_ram_base that holds this
value at runtime and use it in the kernel physical address conversion
macro. Since this value is used only for XIP kernels, evaluate it only if
CONFIG_XIP_KERNEL is set which in addition optimizes this macro for
standard kernels at compile-time.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-06 22:41:28 -07:00
Alexandre Ghiti db6b84a368
riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE
The check that is done in setup_bootmem currently only works for 32-bit
kernel since the kernel mapping has been moved outside of the linear
mapping for 64-bit kernel. So make sure that for 64-bit kernel, the kernel
mapping does not overlap with the last 4K of the addressable memory.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 21:34:36 -07:00
Alexandre Ghiti c99127c452
riscv: Make sure the linear mapping does not use the kernel mapping
For 64-bit kernel, the end of the address space is occupied by the
kernel mapping and currently, the functions to populate the kernel page
tables (i.e. create_p*d_mapping) do not override existing mapping so we
must make sure the linear mapping does not map memory in the kernel mapping
by clipping the memory above the memory limit.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: c9811e379b ("riscv: Add mem kernel parameter support")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 20:48:04 -07:00
Alexandre Ghiti c09dc9e1cd
riscv: Fix memory_limit for 64-bit kernel
As described in Documentation/riscv/vm-layout.rst, the end of the
virtual address space for 64-bit kernel is occupied by the modules/BPF/
kernel mappings so this actually reduces the amount of memory we are able
to map and then use in the linear mapping. So make sure this limit is
correctly set.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 20:29:30 -07:00
Palmer Dabbelt 444818b599
Merge remote-tracking branch 'riscv/riscv-fix-32bit' into fixes
This contains a single fix for 32-bit boot.  It happens this was already
fixed by c9811e379b ("riscv: Add mem kernel parameter support"), but
the bug existed before that feature addition so I've applied the patch
earlier and then merged it in (which results in a conflict, which is
fixed via not changing the resulting tree).

* riscv/riscv-fix-32bit:
  riscv: Fix 32-bit RISC-V boot failure
2021-07-21 22:18:58 -07:00
Bin Meng d0e4dae744
riscv: Fix 32-bit RISC-V boot failure
Commit dd2d082b57 ("riscv: Cleanup setup_bootmem()") adjusted
the calling sequence in setup_bootmem(), which invalidates the fix
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
did for 32-bit RISC-V unfortunately.

So now 32-bit RISC-V does not boot again when testing booting kernel
on QEMU 'virt' with '-m 2G', which was exactly what the original
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
tried to fix.

Fixes: dd2d082b57 ("riscv: Cleanup setup_bootmem()")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-21 22:17:41 -07:00
Linus Torvalds 9b76d71fa8 RISC-V Patches for the 5.14 Merge Window, Part 1
In addition to We have a handful of new features for 5.14:
 
 * Support for transparent huge pages.
 * Support for generic PCI resources mapping.
 * Support for the mem= kernel parameter.
 * Support for KFENCE.
 * A handful of fixes to avoid W+X mappings in the kernel.
 * Support for VMAP_STACK based overflow detection.
 * An optimized copy_{to,from}_user.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmDn3XITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiW4UD/9Z0BJNXG9rOERofyFWwb7EYchT551A
 Coi8BFpuUCZfT9qonuBzQcPrAlXH/T9yMDGiShZio9jh29bnaOIqo5NrvNjB88VU
 LdarNeiumPM4SCQFIsbIBnRrk5OQDtzsPx+dS5xVUQlnHUV26xBakAJo3K3FxG9Y
 bl2JU+LTvP52eBKpKHp0i2pbuC2z0dBEu9Y3d4q8phI+YeolJ0rgOCbMra+maCls
 kk5TeROrDJpTg5INsWpVNvKvRk+sNlh0K9AsZHL1fIA8d2UFyTeCltUNjkWkIZA/
 SBiS+ruxtLD9rTPOrF1i+rvW+rs/gL1LkPjOvoilTMuNm5ltCu5MLsflX6oZ5wX4
 2vAXup6HQzLcJpCCq2QyMIl2s8ORV+gY89nYmRYHLzCoqGtF/WhbSFSKjqNM1+Kf
 /M4C9q3kEopkdlHuKahR8dP4wYz6kP1kEijv83P21ahUFsShSLyLAegYoKO0pNeC
 H/WOWMBqpG+rE80Tsctfo/z3g16Wc51yesYjXsOP5iRYoytG2uGLtzG7fPTjtyuC
 Fa3Ue/9d/W/XyZ7bzolvGRoaeoHqoIXb07MsDLDhO2cHYg6B/wIvHXfPcM7ovR6f
 VT0aRu85dvvewlukuqyH5+rwSPNQfzHo32GIiK7h8QjjIEh1axOFi+7oXGfbcIUw
 EP0u4AZsK9iHZg==
 =uB+H
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
 "We have a handful of new features for 5.14:

   - Support for transparent huge pages.

   - Support for generic PCI resources mapping.

   - Support for the mem= kernel parameter.

   - Support for KFENCE.

   - A handful of fixes to avoid W+X mappings in the kernel.

   - Support for VMAP_STACK based overflow detection.

   - An optimized copy_{to,from}_user"

* tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits)
  riscv: xip: Fix duplicate included asm/pgtable.h
  riscv: Fix PTDUMP output now BPF region moved back to module region
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
  riscv: add VMAP_STACK overflow detection
  riscv: ptrace: add argn syntax
  riscv: mm: fix build errors caused by mk_pmd()
  riscv: Introduce structure that group all variables regarding kernel mapping
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Enable KFENCE for riscv64
  RISC-V: Use asm-generic for {in,out}{bwlq}
  riscv: add ASID-based tlbflushing methods
  riscv: pass the mm_struct to __sbi_tlb_flush_range
  riscv: Add mem kernel parameter support
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: Only initialize swiotlb when necessary
  riscv: fix typo in init.c
  riscv: Cleanup unused functions
  riscv: mm: Use better bitmap_zalloc()
  ...
2021-07-09 10:36:29 -07:00
Alexandre Ghiti 7761e36bc7
riscv: Fix PTDUMP output now BPF region moved back to module region
BPF region was moved back to the region below the kernel at the end of
the module region by 3a02764c37 ("riscv: Ensure BPF_JIT_REGION_START
aligned with PMD size"), so reflect this change in kernel page table
output.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 3a02764c37 ("riscv: Ensure BPF_JIT_REGION_START aligned with PMD size")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-06 15:21:27 -07:00
Alexandre Ghiti 658e2c5125
riscv: Introduce structure that group all variables regarding kernel mapping
We have a lot of variables that are used to hold kernel mapping addresses,
offsets between physical and virtual mappings and some others used for XIP
kernels: they are all defined at different places in mm/init.c, so group
them into a single structure with, for some of them, more explicit and concise
names.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-05 18:04:00 -07:00
Palmer Dabbelt 01112e5e20
Merge branch 'riscv-wx-mappings' into for-next
This contains both the short-term fix for the W+X boot mappings and the
larger cleanup.

* riscv-wx-mappings:
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: mm: Fix W+X mappings at boot
2021-06-30 21:50:32 -07:00
Alexandre Ghiti e5c35fa040
riscv: Map the kernel with correct permissions the first time
For 64-bit kernels, we map all the kernel with write and execute
permissions and afterwards remove writability from text and executability
from data.

For 32-bit kernels, the kernel mapping resides in the linear mapping, so we
map all the linear mapping as writable and executable and afterwards we
remove those properties for unused memory and kernel mapping as
described above.

Change this behavior to directly map the kernel with correct permissions
and avoid going through the whole mapping to fix the permissions.

At the same time, this fixes an issue introduced by commit 2bfc6cd81b
("riscv: Move kernel mapping outside of linear mapping") as reported
here https://github.com/starfive-tech/linux/issues/17.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 21:18:58 -07:00
Liu Shixin 47513f243b
riscv: Enable KFENCE for riscv64
Add architecture specific implementation details for KFENCE and enable
KFENCE for the riscv64 architecture. In particular, this implements the
required interface in <asm/kfence.h>.

KFENCE requires that attributes for pages from its memory pool can
individually be set. Therefore, force the kfence pool to be mapped at
page granularity.

Testing this patch using the testcases in kfence_test.c and all passed.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:41 -07:00
Guo Ren 3f1e782998
riscv: add ASID-based tlbflushing methods
Implement optimized version of the tlb flushing routines for systems
using ASIDs. These are behind the use_asid_allocator static branch to
not affect existing systems not using ASIDs.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
[hch: rebased on top of previous cleanups, use the same algorithm as
      the non-ASID based code for local vs global flushes, keep functions
      as local as possible]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:39 -07:00
Christoph Hellwig 70c7605c08
riscv: pass the mm_struct to __sbi_tlb_flush_range
Move the call mm_cpumask from the callers into __sbi_tlb_flush_range to
reduce a bit of duplicate code and prepare for future changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:38 -07:00
Jisheng Zhang 3a02764c37
riscv: Ensure BPF_JIT_REGION_START aligned with PMD size
Andreas reported commit fc8504765e ("riscv: bpf: Avoid breaking W^X")
breaks booting with one kind of defconfig, I reproduced a kernel panic
with the defconfig:

[    0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220
[    0.139159] Oops [#1]
[    0.139303] Modules linked in:
[    0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1
[    0.139934] Hardware name: riscv-virtio,qemu (DT)
[    0.140193] epc : __memset+0xc4/0xfc
[    0.140416]  ra : skb_flow_dissector_init+0x1e/0x82
[    0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0
[    0.140878]  gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158
[    0.141156]  t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0
[    0.141424]  s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000
[    0.141654]  a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064
[    0.141893]  a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff
[    0.142126]  s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088
[    0.142353]  s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438
[    0.142584]  s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac
[    0.142810]  s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000
[    0.143042]  t5 : 0000000000000155 t6 : 00000000000003ff
[    0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f
[    0.143560] [<ffffffff8029806c>] __memset+0xc4/0xfc
[    0.143859] [<ffffffff8061e984>] init_default_flow_dissectors+0x22/0x60
[    0.144092] [<ffffffff800010fc>] do_one_initcall+0x3e/0x168
[    0.144278] [<ffffffff80600df0>] kernel_init_freeable+0x1c8/0x224
[    0.144479] [<ffffffff804868a8>] kernel_init+0x12/0x110
[    0.144658] [<ffffffff800022de>] ret_from_exception+0x0/0xc
[    0.145124] ---[ end trace f1e9643daa46d591 ]---

After some investigation, I think I found the root cause: commit
2bfc6cd81b ("move kernel mapping outside of linear mapping") moves
BPF JIT region after the kernel:

| #define BPF_JIT_REGION_START	PFN_ALIGN((unsigned long)&_end)

The &_end is unlikely aligned with PMD size, so the front bpf jit
region sits with part of kernel .data section in one PMD size mapping.
But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is
called to make the first bpf jit prog ROX, we will make part of kernel
.data section RO too, so when we write to, for example memset the
.data section, MMU will trigger a store page fault.

To fix the issue, we need to ensure the BPF JIT region is PMD size
aligned. This patch acchieve this goal by restoring the BPF JIT region
to original position, I.E the 128MB before kernel .text section. The
modification to kasan_init.c is inspired by Alexandre.

Fixes: fc8504765e ("riscv: bpf: Avoid breaking W^X")
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:10:05 -07:00
Jisheng Zhang 314b781706
riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' name
commit 2bfc6cd81b ("riscv: Move kernel mapping outside of linear
mapping") makes use of MODULES_VADDR to populate kernel, BPF, modules
mapping. Currently, MODULES_VADDR is defined as below for RV64:

| #define MODULES_VADDR   (PFN_ALIGN((unsigned long)&_end) - SZ_2G)

But kasan_init() has two local variables which are also named as _start,
_end, so MODULES_VADDR is evaluated with the local variable _end
rather than the global "_end" as we expected. Fix this issue by
renaming the two local variables.

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:09:56 -07:00
Kefeng Wang c9811e379b
riscv: Add mem kernel parameter support
The memblock_enforce_memory_limit() could change the memblock
range, so move the dram_end assignment after it in bootmem_init(),
then support mem= cmdline.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-15 08:59:04 -07:00
Kefeng Wang ce3aca0465
riscv: Only initialize swiotlb when necessary
The SWIOTLB buffer is not needed unless the physical address space
is beyond the limit of dma, only initialize swiotlb when swiotlb_force
is true or not all system memory is DMA-able.

Also move the swiotlb_init() into mem_init().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11 13:42:26 -07:00
Vitaly Wool ae3d69bcc4
riscv: fix typo in init.c
Commit 0106235682 introduced a typo in "__initdata" spelling
which led to build breakage for XIP. Fix that.

Fixes: 0106235682 ("riscv: mm: init: Consolidate vars, functions")
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-08 17:32:24 -07:00
Kefeng Wang 5def4429ae
riscv: mm: Use better bitmap_zalloc()
Use better bitmap_zalloc() to allocate bitmap.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-08 17:12:30 -07:00
Jisheng Zhang 8a4102a0cf
riscv: mm: Fix W+X mappings at boot
When the kernel mapping was moved the last 2GB of the address space,
(__va(PFN_PHYS(max_low_pfn))) is much smaller than the .data section
start address, the last set_memory_nx() in protect_kernel_text_data()
will fail, thus the .data section is still mapped as W+X. This results
in below W+X mapping waring at boot. Fix it by passing the correct
.data section page num to the set_memory_nx().

[    0.396516] ------------[ cut here ]------------
[    0.396889] riscv/mm: Found insecure W+X mapping at address (____ptrval____)/0xffffffff80c00000
[    0.398347] WARNING: CPU: 0 PID: 1 at arch/riscv/mm/ptdump.c:258 note_page+0x244/0x24a
[    0.398964] Modules linked in:
[    0.399459] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1+ #14
[    0.400003] Hardware name: riscv-virtio,qemu (DT)
[    0.400591] epc : note_page+0x244/0x24a
[    0.401368]  ra : note_page+0x244/0x24a
[    0.401772] epc : ffffffff80007c86 ra : ffffffff80007c86 sp : ffffffe000e7bc30
[    0.402304]  gp : ffffffff80caae88 tp : ffffffe000e70000 t0 : ffffffff80cb80cf
[    0.402800]  t1 : ffffffff80cb80c0 t2 : 0000000000000000 s0 : ffffffe000e7bc80
[    0.403310]  s1 : ffffffe000e7bde8 a0 : 0000000000000053 a1 : ffffffff80c83ff0
[    0.403805]  a2 : 0000000000000010 a3 : 0000000000000000 a4 : 6c7e7a5137233100
[    0.404298]  a5 : 6c7e7a5137233100 a6 : 0000000000000030 a7 : ffffffffffffffff
[    0.404849]  s2 : ffffffff80e00000 s3 : 0000000040000000 s4 : 0000000000000000
[    0.405393]  s5 : 0000000000000000 s6 : 0000000000000003 s7 : ffffffe000e7bd48
[    0.405935]  s8 : ffffffff81000000 s9 : ffffffffc0000000 s10: ffffffe000e7bd48
[    0.406476]  s11: 0000000000001000 t3 : 0000000000000072 t4 : ffffffffffffffff
[    0.407016]  t5 : 0000000000000002 t6 : ffffffe000e7b978
[    0.407435] status: 0000000000000120 badaddr: 0000000000000000 cause: 0000000000000003
[    0.408052] Call Trace:
[    0.408343] [<ffffffff80007c86>] note_page+0x244/0x24a
[    0.408855] [<ffffffff8010c5a6>] ptdump_hole+0x14/0x1e
[    0.409263] [<ffffffff800f65c6>] walk_pgd_range+0x2a0/0x376
[    0.409690] [<ffffffff800f6828>] walk_page_range_novma+0x4e/0x6e
[    0.410146] [<ffffffff8010c5f8>] ptdump_walk_pgd+0x48/0x78
[    0.410570] [<ffffffff80007d66>] ptdump_check_wx+0xb4/0xf8
[    0.410990] [<ffffffff80006738>] mark_rodata_ro+0x26/0x2e
[    0.411407] [<ffffffff8031961e>] kernel_init+0x44/0x108
[    0.411814] [<ffffffff80002312>] ret_from_exception+0x0/0xc
[    0.412309] ---[ end trace 7ec3459f2547ea83 ]---
[    0.413141] Checked W+X mappings: failed, 512 W+X pages found

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-01 21:15:09 -07:00
Jisheng Zhang 0106235682
riscv: mm: init: Consolidate vars, functions
Consolidate the following items in init.c

Staticize global vars as much as possible;
Add __initdata mark if the global var isn't needed after init
Add __init mark if the func isn't needed after init
Add __ro_after_init if the global var is read only after init

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 13:51:16 -07:00
Jisheng Zhang 3df952ae2a
riscv: Add __init section marker to some functions again
These functions are not needed after booting, so mark them as __init
to move them to the __init section.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 13:39:27 -07:00
Jisheng Zhang 8237c5243a
riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()
Directly passing the cpu to flush_icache_deferred() rather than calling
smp_processor_id() again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Palmer: drop the QEMU performance numbers, and update the comment]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:52 -07:00
Kefeng Wang 50bae95e17
riscv: mm: Drop redundant _sdata and _edata declaration
The _sdata/_edata is already in sections.h, drop redundant
declaration.

Also move _xiprom/_exiprom declarations at the beginning of
the file, cleanup one CONFIG_XIP_KERNEL.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:51 -07:00
Kefeng Wang f842f5ff6a
riscv: Move setup_bootmem into paging_init
Make setup_bootmem() static.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:50 -07:00
Jisheng Zhang 8f3e136ff3
riscv: mm: Remove setup_zero_page()
The empty_zero_page sits at .bss..page_aligned section, so will be
cleared to zero during clearing bss, we don't need to clear it again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:48 -07:00
Nanyong Sun e88b333142
riscv: mm: add THP support on 64-bit
Bring Transparent HugePage support to riscv. A
transparent huge page is always represented as a pmd.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 10:20:02 -07:00
Nanyong Sun c3b2d67046
riscv: mm: add param stride for __sbi_tlb_flush_range
Add a parameter: stride for __sbi_tlb_flush_range(),
represent the page stride between the address of start and end.
Normally, the stride is PAGE_SIZE, and when flush huge page
address, the stride can be the huge page size such as:PMD_SIZE,
then it only need to flush one tlb entry if the address range
within PMD_SIZE.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 10:19:47 -07:00
Geert Uytterhoeven 8d91b09733
riscv: Consistify protect_kernel_linear_mapping_text_rodata() use
The various uses of protect_kernel_linear_mapping_text_rodata() are
not consistent:
  - Its definition depends on "64BIT && !XIP_KERNEL",
  - Its forward declaration depends on MMU,
  - Its single caller depends on "STRICT_KERNEL_RWX && 64BIT && MMU &&
    !XIP_KERNEL".

Fix this by settling on the dependencies of the caller, which can be
simplified as STRICT_KERNEL_RWX depends on "MMU && !XIP_KERNEL".
Provide a dummy definition, as the caller is protected by
"IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)" instead of "#ifdef
CONFIG_STRICT_KERNEL_RWX".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:15 -07:00
Geert Uytterhoeven 8db6f937f4
riscv: Only extend kernel reservation if mapped read-only
When the kernel mapping was moved outside of the linear mapping, the
kernel memory reservation was increased, to take into account mapping
granularity.  However, this is done unconditionally, regardless of
whether the kernel memory is mapped read-only or not.

If this extension is not needed, up to 2 MiB may be lost, which has a
big impact on e.g. Canaan K210 (64-bit nommu) platforms with only 8 MiB
of RAM.

Reclaim the lost memory by only extending the reserved region when
needed, i.e. depending on a simplified version of the conditional logic
around the call to protect_kernel_linear_mapping_text_rodata().

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:12 -07:00
Linus Torvalds 939b7cbc00 RISC-V Patches for the 5.13 Merge Window, Part 1
* Support for the memtest= kernel command-line argument.
 * Support for building the kernel with FORTIFY_SOURCE.
 * Support for generic clockevent broadcasts.
 * Support for the buildtar build target.
 * Some build system cleanups to pass more LLVM-friendly arguments.
 * Support for kprobes.
 * A rearranged kernel memory map, the first part of supporting sv48
   systems.
 * Improvements to kexec, along with support for kdump and crash kernels.
 * An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).
 * Support for XIP.
 * A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.
 
 Along with a bunch of cleanups.  There are already a handful of fixes
 on the list so there will likely be a part 2.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmCS4lITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieZqEACSihfcOgZ/oyGWN3chca917/yCWimM
 DOu37Zlh81TNPgzzJwbT44IY5sg/lSecwktxs665TChiJjr3JlM4jmz+u64KOTA8
 mTWhqZNr5zT9kFj/m3x0V9yYOVr9g43QRmIlc14d+8JaQDw0N8WeH/yK85/CXDSS
 X5gQK/e9q/yPf/NPyPuPm67jDsFnJERINWaAHI8lhA5fvFyy/xRLmSkuexchysss
 XOGfyxxX590jGLK1vD+5wccX7ZwfwU4jriTaxyah/VBl8QUur/xSPVyspHIdWiMG
 jrNXI1dg6oI861BdjryUpZI0iYJaRe5FRWUx7uTIqHfIyL/MnvYI7USVYOOPb72M
 yZgN903R++5NeUUVTzfXwaigTwfXAPB6USFqZpEfRAf204pgNybmznJWThAVBdYG
 rUixp7GsEMU3aAT2tE/iHR33JQxQfnZq8Tg43/4gB7MoACrzQrYrGcPnj9xssMyV
 F1hnao3dr+5Xjo3MwfkW9JvLPwvDuE3mdrdj+a0XZ45gbTJeuBhYxo3VOsFeijhQ
 gf/VYuoNn5iae9fiMzx5rlmFT9NJDYKDhla+BpAel84/6nRryyfCZCaE5FvDynOO
 CNQynaeJMIMEygPBYR9FVVCwm+EtVsz3NVFKEuo5ilQpgX8ipctxiqy2+moZALLN
 OWlEH6BKEgXqkw==
 =PsA8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the memtest= kernel command-line argument.

 - Support for building the kernel with FORTIFY_SOURCE.

 - Support for generic clockevent broadcasts.

 - Support for the buildtar build target.

 - Some build system cleanups to pass more LLVM-friendly arguments.

 - Support for kprobes.

 - A rearranged kernel memory map, the first part of supporting sv48
   systems.

 - Improvements to kexec, along with support for kdump and crash
   kernels.

 - An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).

 - Support for XIP.

 - A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.

... along with a bunch of cleanups.  There are already a handful of fixes
on the list so there will likely be a part 2.

* tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (45 commits)
  RISC-V: Always define XIP_FIXUP
  riscv: Remove 32b kernel mapping from page table dump
  riscv: Fix 32b kernel build with CONFIG_DEBUG_VIRTUAL=y
  RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
  RISC-V: Enable Microchip PolarFire ICICLE SoC
  RISC-V: Initial DTS for Microchip ICICLE board
  dt-bindings: riscv: microchip: Add YAML documentation for the PolarFire SoC
  RISC-V: Add Microchip PolarFire SoC kconfig option
  RISC-V: enable XIP
  RISC-V: Add crash kernel support
  RISC-V: Add kdump support
  RISC-V: Improve init_resources()
  RISC-V: Add kexec support
  RISC-V: Add EM_RISCV to kexec UAPI header
  riscv: vdso: fix and clean-up Makefile
  riscv/mm: Use BUG_ON instead of if condition followed by BUG.
  riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe
  riscv: Set ARCH_HAS_STRICT_MODULE_RWX if MMU
  riscv: module: Create module allocations without exec permissions
  riscv: bpf: Avoid breaking W^X
  ...
2021-05-06 09:24:18 -07:00
Alexandre Ghiti 28252e0864
riscv: Remove 32b kernel mapping from page table dump
The 32b kernel mapping lies in the linear mapping, there is no point in
printing its address in page table dump, so remove this leftover that
comes from moving the kernel mapping outside the linear mapping for 64b
kernel.

Fixes: e9efb21fe352 ("riscv: Prepare ptdump for vm layout dynamic addresses")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-01 08:53:32 -07:00
Kefeng Wang 1f9d03c5e9 mm: move mem_init_print_info() into mm_init()
mem_init_print_info() is called in mem_init() on each architecture, and
pass NULL argument, so using void argument and move it into mm_init().

Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>	[x86]
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>	[powerpc]
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>	[sparc64]
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>	[arm]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:42 -07:00