Commit Graph

231 Commits

Author SHA1 Message Date
Ard Biesheuvel 9684ec186f arm64: Enable LPA2 at boot if supported by the system
Update the early kernel mapping code to take 52-bit virtual addressing
into account based on the LPA2 feature. This is a bit more involved than
LVA (which is supported with 64k pages only), given that some page table
descriptor bits change meaning in this case.

To keep the handling in asm to a minimum, the initial ID map is still
created with 48-bit virtual addressing, which implies that the kernel
image must be loaded into 48-bit addressable physical memory. This is
currently required by the boot protocol, even though we happen to
support placement outside of that for LVA/64k based configurations.

Enabling LPA2 involves more than setting TCR.T1SZ to a lower value,
there is also a DS bit in TCR that needs to be set, and which changes
the meaning of bits [9:8] in all page table descriptors. Since we cannot
enable DS and every live page table descriptor at the same time, let's
pivot through another temporary mapping. This avoids the need to
reintroduce manipulations of the page tables with the MMU and caches
disabled.

To permit the LPA2 feature to be overridden on the kernel command line,
which may be necessary to work around silicon errata, or to deal with
mismatched features on heterogeneous SoC designs, test for CPU feature
overrides first, and only then enable LPA2.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-78-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-16 12:42:40 +00:00
Linus Torvalds 8f6f76a6a2 As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs.
 
 The lengthier patch series are
 
 - "kdump: use generic functions to simplify crashkernel reservation in
   arch", from Baoquan He.  This is mainly cleanups and consolidation of
   the "crashkernel=" kernel parameter handling.
 
 - After much discussion, David Laight's "minmax: Relax type checks in
   min() and max()" is here.  Hopefully reduces some typecasting and the
   use of min_t() and max_t().
 
 - A group of patches from Oleg Nesterov which clean up and slightly fix
   our handling of reads from /proc/PID/task/...  and which remove
   task_struct.therad_group.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
 jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
 =On4u
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig->stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
2023-11-02 20:53:31 -10:00
Catalin Marinas 65033574ad arm64: swiotlb: Reduce the default size if no ZONE_DMA bouncing needed
With CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC enabled, the arm64 kernel still
allocates the default SWIOTLB buffer (64MB) even if ZONE_DMA is disabled
or all the RAM fits into this zone. However, this potentially wastes a
non-negligible amount of memory on platforms with little RAM.

Reduce the SWIOTLB size to 1MB per 1GB of RAM if only needed for
kmalloc() buffer bouncing.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Ross Burton <ross.burton@arm.com>
Cc: Ross Burton <ross.burton@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2023-10-13 16:10:39 +01:00
Baoquan He fdc268232d arm64: kdump: use generic interface to simplify crashkernel reservation
With the help of newly changed function parse_crashkernel() and generic
reserve_crashkernel_generic(), crashkernel reservation can be simplified
by steps:

1) Add a new header file <asm/crash_core.h>, and define CRASH_ALIGN,
   CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX and
   DEFAULT_CRASH_KERNEL_LOW_SIZE in <asm/crash_core.h>;

2) Add arch_reserve_crashkernel() to call parse_crashkernel() and
   reserve_crashkernel_generic();

3) Add ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION Kconfig in
   arch/arm64/Kconfig.

The old reserve_crashkernel_low() and reserve_crashkernel() can be
removed.

Link: https://lkml.kernel.org/r/20230914033142.676708-8-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:58 -07:00
Baoquan He a9e1a3d84e crash_core: change the prototype of function parse_crashkernel()
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added.  Make adjustments in
all call sites of parse_crashkernel() in arch.

Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:58 -07:00
Linus Torvalds 6c1b980a7e dma-maping updates for Linux 6.6
- allow dynamic sizing of the swiotlb buffer, to cater for secure
    virtualization workloads that require all I/O to be bounce buffered
    (Petr Tesarik)
  - move a declaration to a header (Arnd Bergmann)
  - check for memory region overlap in dma-contiguous (Binglei Wang)
  - remove the somewhat dangerous runtime swiotlb-xen enablement and
    unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
  - per-node CMA improvements (Yajun Deng)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
 W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
 A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
 GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
 r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
 VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
 ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
 j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
 NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
 viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
 n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
 cNY=
 =kVVr
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-maping updates from Christoph Hellwig:

 - allow dynamic sizing of the swiotlb buffer, to cater for secure
   virtualization workloads that require all I/O to be bounce buffered
   (Petr Tesarik)

 - move a declaration to a header (Arnd Bergmann)

 - check for memory region overlap in dma-contiguous (Binglei Wang)

 - remove the somewhat dangerous runtime swiotlb-xen enablement and
   unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)

 - per-node CMA improvements (Yajun Deng)

* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: optimize get_max_slots()
  swiotlb: move slot allocation explanation comment where it belongs
  swiotlb: search the software IO TLB only if the device makes use of it
  swiotlb: allocate a new memory pool when existing pools are full
  swiotlb: determine potential physical address limit
  swiotlb: if swiotlb is full, fall back to a transient memory pool
  swiotlb: add a flag whether SWIOTLB is allowed to grow
  swiotlb: separate memory pool data from other allocator data
  swiotlb: add documentation and rename swiotlb_do_find_slots()
  swiotlb: make io_tlb_default_mem local to swiotlb.c
  swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
  dma-contiguous: check for memory region overlap
  dma-contiguous: support numa CMA for specified node
  dma-contiguous: support per-numa CMA for all architectures
  dma-mapping: move arch_dma_set_mask() declaration to header
  swiotlb: unexport is_swiotlb_active
  x86: always initialize xen-swiotlb when xen-pcifront is enabling
  xen/pci: add flag for PCI passthrough being possible
2023-08-29 20:32:10 -07:00
Zhang Jianhua 4e0bacd65e arm64: fix build warning for ARM64_MEMSTART_SHIFT
When building with W=1, the following warning occurs.

arch/arm64/include/asm/kernel-pgtable.h:129:41: error: "PUD_SHIFT" is not defined, evaluates to 0 [-Werror=undef]
  129 | #define ARM64_MEMSTART_SHIFT            PUD_SHIFT
      |                                         ^~~~~~~~~
arch/arm64/include/asm/kernel-pgtable.h:142:5: note: in expansion of macro ‘ARM64_MEMSTART_SHIFT’
  142 | #if ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
      |     ^~~~~~~~~~~~~~~~~~~~

The generic PUD_SHIFT was defined in include/asm-generic/pgtable-nopud.h,
however the #ifndef __ASSEMBLY__ guard in this header file makes it unavailable
for assembly files. While someone .S file include the <asm/kernel-pgtable.h>,
the build warning would occur. Now move the macro ARM64_MEMSTART_SHIFT and
ARM64_MEMSTART_ALIGN to arch/arm64/mm/init.c where it is used only, to avoid
this issue.

Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230804075615.3334756-1-chris.zjh@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04 17:19:44 +01:00
Yajun Deng 22e4a348f8 dma-contiguous: support per-numa CMA for all architectures
In the commit b7176c261c ("dma-contiguous: provide the ability to
reserve per-numa CMA"), Barry adds DMA_PERNUMA_CMA for ARM64.

But this feature is architecture independent, so support per-numa CMA
for all architectures, and enable it by default if NUMA.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-07-31 17:54:28 +02:00
Linus Torvalds 6e17c6de3d - Yosry Ahmed brought back some cgroup v1 stats in OOM logs.
- Yosry has also eliminated cgroup's atomic rstat flushing.
 
 - Nhat Pham adds the new cachestat() syscall.  It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability.
 
 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning.
 
 - Lorenzo Stoakes has done some maintanance work on the get_user_pages()
   interface.
 
 - Liam Howlett continues with cleanups and maintenance work to the maple
   tree code.  Peng Zhang also does some work on maple tree.
 
 - Johannes Weiner has done some cleanup work on the compaction code.
 
 - David Hildenbrand has contributed additional selftests for
   get_user_pages().
 
 - Thomas Gleixner has contributed some maintenance and optimization work
   for the vmalloc code.
 
 - Baolin Wang has provided some compaction cleanups,
 
 - SeongJae Park continues maintenance work on the DAMON code.
 
 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting.
 
 - Christoph Hellwig has some cleanups for the filemap/directio code.
 
 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the provided
   APIs rather than open-coding accesses.
 
 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings.
 
 - John Hubbard has a series of fixes to the MM selftesting code.
 
 - ZhangPeng continues the folio conversion campaign.
 
 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock.
 
 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
   128 to 8.
 
 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management.
 
 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code.
 
 - Vishal Moola also has done some folio conversion work.
 
 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
 joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
 J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
 =B7yQ
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
2023-06-28 10:28:11 -07:00
Catalin Marinas 1c1a429efd arm64: enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64
With the DMA bouncing of unaligned kmalloc() buffers now in place, enable
it for arm64 to allow the kmalloc-{8,16,32,48,96} caches.  In addition,
always create the swiotlb buffer even when the end of RAM is within the
32-bit physical address range (the swiotlb buffer can still be disabled on
the kernel command line).

Link: https://lkml.kernel.org/r/20230612153201.554742-18-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:19:23 -07:00
Baoquan He 6c4dcaddbd arm64: kdump: simplify the reservation behaviour of crashkernel=,high
On arm64, reservation for 'crashkernel=xM,high' is taken by searching for
suitable memory region top down. If the 'xM' of crashkernel high memory
is reserved from high memory successfully, it will try to reserve
crashkernel low memory later accoringly. Otherwise, it will try to search
low memory area for the 'xM' suitable region. Please see the details in
Documentation/admin-guide/kernel-parameters.txt.

While we observed an unexpected case where a reserved region crosses the
high and low meomry boundary. E.g on a system with 4G as low memory end,
user added the kernel parameters like: 'crashkernel=512M,high', it could
finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel.
The crashkernel high region crossing low and high memory boudary will bring
issues:

1) For crashkernel=x,high, if getting crashkernel high region across
low and high memory boundary, then user will see two memory regions in
low memory, and one memory region in high memory. The two crashkernel
low memory regions are confusing as shown in above example.

2) If people explicityly specify "crashkernel=x,high crashkernel=y,low"
and y <= 128M, when crashkernel high region crosses low and high memory
boundary and the part of crashkernel high reservation below boundary is
bigger than y, the expected crahskernel low reservation will be skipped.
But the expected crashkernel high reservation is shrank and could not
satisfy user space requirement.

3) The crossing boundary behaviour of crahskernel high reservation is
different than x86 arch. On x86_64, the low memory end is 4G fixedly,
and the memory near 4G is reserved by system, e.g for mapping firmware,
pci mapping, so the crashkernel reservation crossing boundary never happens.
From distros point of view, this brings inconsistency and confusion. Users
need to dig into x86 and arm64 system details to find out why.

For kernel itself, the impact of issue 3) could be slight. While issue
1) and 2) cause actual impact because it brings obscure semantics and
behaviour to crashkernel=,high reservation.

Here, for crashkernel=xM,high, search the high memory for the suitable
region only in high memory. If failed, try reserving the suitable
region only in low memory. Like this, the crashkernel high region will
only exist in high memory, and crashkernel low region only exists in low
memory. The reservation behaviour for crashkernel=,high is clearer and
simpler.

Note: RPi4 has different zone ranges than normal memory. Its DMA zone is
0~1G, and DMA32 zone is 1G~4G if CONFIG_ZONE_DMA|DMA32 are enabled by
default. The low memory end is 1G in order to validate all devices, high
memory starts at 1G memory. However, for being consistent with normal
arm64 system, its low memory end is still 1G, while reserving crashkernel
high memory from 4G if crashkernel=size,high specified. This will remove
confusion.

With above change applied, summary of arm64 crashkernel reservation range:
1)
RPi4(zone DMA:0~1G; DMA32:1G~4G):
 crashkernel=size
  0~1G: low memory | 1G~top: high memory

 crashkernel=size,high
  0~1G: low memory | 4G~top: high memory

2)
Other normal system:
 crashkernel=size
 crashkernel=size,high
  0~4G: low memory | 4G~top: high memory

3)
Systems w/o zone DMA|DMA32
 crashkernel=size
 crashkernel=size,high
  0~top: low memory

Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/ZGIBSEoZ7VRVvP8H@MiWiFi-R3L-srv
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-09 17:37:28 +01:00
Baoquan He 504cae453f arm64: kdump: defer the crashkernel reservation for platforms with no DMA memory zones
In commit 031495635b ("arm64: Do not defer reserve_crashkernel() for
platforms with no DMA memory zones"), reserve_crashkernel() is called
much earlier in arm64_memblock_init() to avoid causing base apge
mapping on platforms with no DMA meomry zones.

With taking off protection on crashkernel memory region, no need to call
reserve_crashkernel() specially in advance. The deferred invocation of
reserve_crashkernel() in bootmem_init() can cover all cases. So revert
the whole commit now.

Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230407011507.17572-4-bhe@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-04-11 19:24:46 +01:00
Zhen Lei a9ae89df73 arm64: kdump: Support crashkernel=X fall back to reserve region above DMA zones
For crashkernel=X without '@offset', select a region within DMA zones
first, and fall back to reserve region above DMA zones. This allows
users to use the same configuration on multiple platforms.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221116121044.1690-3-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 14:14:35 +00:00
Zhen Lei a149cf00b1 arm64: kdump: Provide default size when crashkernel=Y,low is not specified
Try to allocate at least 128 MiB low memory automatically for the case
that crashkernel=,high is explicitly specified, while crashkenrel=,low
is omitted. This allows users to focus more on the high memory
requirements of their business rather than the low memory requirements
of the crash kernel booting.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221116121044.1690-2-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 14:14:35 +00:00
Mark Brown 2d987e64e8 arm64/sysreg: Add _EL1 into ID_AA64MMFR0_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR0_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Linus Torvalds 97a77ab14f EFI updates for v5.20
- Enable mirrored memory for arm64
 - Fix up several abuses of the efivar API
 - Refactor the efivar API in preparation for moving the 'business logic'
   part of it into efivarfs
 - Enable ACPI PRM on arm64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmLhuDIACgkQw08iOZLZ
 jyS9IQv/Wc2nhjN50S3gfrL+68/el/hGdP/J0FK5BOOjNosG2t1ZNYZtSthXqpPH
 hRrTU2m6PpQUalRpFDyLiHkJvdBFQe4VmvrzBa3TIBIzyflLQPJzkWrqThPchV+B
 qi4lmCtTDNIEJmayewqx1wWA+QmUiyI5zJ8wrZp84LTctBPL75seVv0SB20nqai0
 3/I73omB2RLVGpCpeWvb++vePXL8euFW3FEwCTM8hRboICjORTyIZPy8Y5os+3xT
 UgrIgVDOtn1Xwd4tK0qVwjOVA51east4Fcn3yGOrL40t+3SFm2jdpAJOO3UvyNPl
 vkbtjvXsIjt3/oxreKxXHLbamKyueWIfZRyCLsrg6wrr96oypPk6ID4iDCQoen/X
 Zf0VjM2vmvSd4YgnEIblOfSBxVg48cHJA4iVHVxFodNTrVnzGGFYPTmNKmJqo+Xn
 JeUILM7jlR4h/t0+cTTK3Busu24annTuuz5L5rjf4bUm6pPf4crb1yJaFWtGhlpa
 er233D6O
 =zI0R
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - Enable mirrored memory for arm64

 - Fix up several abuses of the efivar API

 - Refactor the efivar API in preparation for moving the 'business
   logic' part of it into efivarfs

 - Enable ACPI PRM on arm64

* tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
  ACPI: Move PRM config option under the main ACPI config
  ACPI: Enable Platform Runtime Mechanism(PRM) support on ARM64
  ACPI: PRM: Change handler_addr type to void pointer
  efi: Simplify arch_efi_call_virt() macro
  drivers: fix typo in firmware/efi/memmap.c
  efi: vars: Drop __efivar_entry_iter() helper which is no longer used
  efi: vars: Use locking version to iterate over efivars linked lists
  efi: pstore: Omit efivars caching EFI varstore access layer
  efi: vars: Add thin wrapper around EFI get/set variable interface
  efi: vars: Don't drop lock in the middle of efivar_init()
  pstore: Add priv field to pstore_record for backend specific use
  Input: applespi - avoid efivars API and invoke EFI services directly
  selftests/kexec: remove broken EFI_VARS secure boot fallback check
  brcmfmac: Switch to appropriate helper to load EFI variable contents
  iwlwifi: Switch to proper EFI variable store interface
  media: atomisp_gmin_platform: stop abusing efivar API
  efi: efibc: avoid efivar API for setting variables
  efi: avoid efivars layer when loading SSDTs from variables
  efi: Correct comment on efi_memmap_alloc
  memblock: Disable mirror feature if kernelcore is not specified
  ...
2022-08-03 14:38:02 -07:00
Anshuman Khandual 4890cc18f9 arm64/mm: Define defer_reserve_crashkernel()
Crash kernel memory reservation gets deferred, when either CONFIG_ZONE_DMA
or CONFIG_ZONE_DMA32 config is enabled on the platform. This deferral also
impacts overall linear mapping creation including the crash kernel itself.
Just encapsulate this deferral check in a new helper for better clarity.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220705062556.1845734-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:43:47 +01:00
Ma Wupeng c0b978fedf arm64: mm: Only remove nomap flag for initrd
Commit 177e15f0c1 ("arm64: add the initrd region to the linear mapping explicitly")
remove all the flags of the memory used by initrd. This is fine since
MEMBLOCK_MIRROR is not used in arm64.

However with mirrored feature introduced to arm64, this will clear the mirrored
flag used by initrd, which will lead to error log printed by
find_zone_movable_pfns_for_nodes() if the lower 4G range has some non-mirrored
memory.

To solve this problem, only MEMBLOCK_NOMAP flag will be removed via
memblock_clear_nomap().

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614092156.1972846-5-mawupeng1@huawei.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-15 12:14:15 +02:00
Linus Torvalds 3f306ea2e1 dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy)
  - takes min align mask into account for the swiotlb max mapping size
    (Tianyu Lan)
  - use GFP_ATOMIC in dma-debug (Mikulas Patocka)
  - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)
  - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)
  - cleanup swiotlb initialization and share more code with swiotlb-xen
    (me, Stefano Stabellini)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT
 Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B
 r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP
 ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2
 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD
 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h
 V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9
 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0
 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu
 yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK
 otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff
 HNA=
 =n2rk
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Catalin Marinas 201729d53a Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
2022-05-20 18:50:35 +01:00
Zhen Lei 8f0f104e2a arm64: kdump: Do not allocate crash low memory if not needed
When "crashkernel=X,high" is specified, the specified "crashkernel=Y,low"
memory is not required in the following corner cases:
1. If both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, it means
   that the devices can access any memory.
2. If the system memory is small, the crash high memory may be allocated
   from the DMA zones. If that happens, there's no need to allocate
   another crash low memory because there's already one.

Add condition '(crash_base >= CRASH_ADDR_LOW_MAX)' to determine whether
the 'high' memory is allocated above DMA zones. Note: when both
CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the entire physical
memory is DMA accessible, CRASH_ADDR_LOW_MAX equals 'PHYS_MASK + 1'.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220511032033.426-1-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 20:01:38 +01:00
Chen Zhou 944a45abfa arm64: kdump: Reimplement crashkernel=X
There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel in DMA zone, which
will fail when there is not enough low memory.
2. If reserving crashkernel above DMA zone, in this case, crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, introduce crashkernel=X,[high,low].
The "crashkernel=X,high" is used to select a region above DMA zone, and
the "crashkernel=Y,low" is used to allocate specified size low memory.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20220506114402.365-4-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Zhen Lei e6b394425c arm64: Use insert_resource() to simplify code
insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-3-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-07 19:54:33 +01:00
Kefeng Wang f41ef4c2ee arm64: mm: Cleanup useless parameters in zone_sizes_init()
Directly use max_pfn for max and no one use min, kill them.

Reviewed-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220411092455.1461-4-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-05 08:48:09 +01:00
Christoph Hellwig c6af2aa9ff swiotlb: make the swiotlb_init interface more useful
Pass a boolean flag to indicate if swiotlb needs to be enabled based on
the addressing needs, and replace the verbose argument with a set of
flags, including one to force enable bounce buffering.

Note that this patch removes the possibility to force xen-swiotlb use
with the swiotlb=force parameter on the command line on x86 (arm and
arm64 never supported that), but this interface will be restored shortly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:11 +02:00
Julia Lawall dd671f16b1 arm64: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220318103729.157574-10-Julia.Lawall@inria.fr
[will: Squashed in 20220318103729.157574-28-Julia.Lawall@inria.fr]
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-04 10:32:50 +01:00
Linus Torvalds 52deda9551 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "Various misc subsystems, before getting into the post-linux-next
  material.

  41 patches.

  Subsystems affected by this patch series: procfs, misc, core-kernel,
  lib, checkpatch, init, pipe, minix, fat, cgroups, kexec, kdump,
  taskstats, panic, kcov, resource, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (41 commits)
  Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang"
  kernel/resource: fix kfree() of bootmem memory again
  kcov: properly handle subsequent mmap calls
  kcov: split ioctl handling into locked and unlocked parts
  panic: move panic_print before kmsg dumpers
  panic: add option to dump all CPUs backtraces in panic_print
  docs: sysctl/kernel: add missing bit to panic_print
  taskstats: remove unneeded dead assignment
  kasan: no need to unset panic_on_warn in end_report()
  ubsan: no need to unset panic_on_warn in ubsan_epilogue()
  panic: unset panic_on_warn inside panic()
  docs: kdump: add scp example to write out the dump file
  docs: kdump: update description about sysfs file system support
  arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible
  cgroup: use irqsave in cgroup_rstat_flush_locked().
  fat: use pointer to simple type in put_user()
  minix: fix bug when opening a file with O_DIRECT
  ...
2022-03-24 14:14:07 -07:00
Jisheng Zhang d339f1584f arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
Replace the conditional compilation using "#ifdef CONFIG_KEXEC_CORE" by a
check for "IS_ENABLED(CONFIG_KEXEC_CORE)", to simplify the code and
increase compile coverage.

Link: https://lkml.kernel.org/r/20211206160514.2000-5-jszhang@kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23 19:00:34 -07:00
Will Deacon 770093459b arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition
Commit 031495635b ("arm64: Do not defer reserve_crashkernel() for
platforms with no DMA memory zones") introduced different definitions
for 'arm64_dma_phys_limit' depending on CONFIG_ZONE_DMA{,32} based on
a late suggestion from Pasha. Sadly, this results in a build error when
passing W=1:

  | arch/arm64/mm/init.c:90:19: error: conflicting type qualifiers for 'arm64_dma_phys_limit'

Drop the 'const' for now and use '__ro_after_init' consistently.

Link: https://lore.kernel.org/r/202203090241.aj7paWeX-lkp@intel.com
Link: https://lore.kernel.org/r/CA+CK2bDbbx=8R=UthkMesWOST8eJMtOGJdfMRTFSwVmo0Vn0EA@mail.gmail.com
Fixes: 031495635b ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-09 12:21:37 +00:00
Vijay Balakrishna 031495635b arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
The following patches resulted in deferring crash kernel reservation to
mem_init(), mainly aimed at platforms with DMA memory zones (no IOMMU),
in particular Raspberry Pi 4.

commit 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
commit 8424ecdde7 ("arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges")
commit 0a30c53573 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
commit 2687275a58 ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required")

Above changes introduced boot slowdown due to linear map creation for
all the memory banks with NO_BLOCK_MAPPINGS, see discussion[1].  The proposed
changes restore crash kernel reservation to earlier behavior thus avoids
slow boot, particularly for platforms with IOMMU (no DMA memory zones).

Tested changes to confirm no ~150ms boot slowdown on our SoC with IOMMU
and 8GB memory.  Also tested with ZONE_DMA and/or ZONE_DMA32 configs to confirm
no regression to deferring scheme of crash kernel memory reservation.
In both cases successfully collected kernel crash dump.

[1] https://lore.kernel.org/all/9436d033-579b-55fa-9b00-6f4b661c2dd7@linux.microsoft.com/

Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Cc: stable@vger.kernel.org
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/r/1646242689-20744-1-git-send-email-vijayb@linux.microsoft.com
[will: Add #ifdef CONFIG_KEXEC_CORE guards to fix 'crashk_res' references in allnoconfig build]
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 10:22:33 +00:00
Peng Fan bb425a7598 arm64: mm: apply __ro_after_init to memory_limit
This variable is only set during initialization, so mark with
__ro_after_init.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20211215064559.2843555-1-peng.fan@oss.nxp.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-20 09:15:16 +00:00
Will Deacon 16c200e040 Merge branch 'for-next/pfn-valid' into for-next/core
* for-next/pfn-valid:
  arm64/mm: drop HAVE_ARCH_PFN_VALID
  dma-mapping: remove bogus test for pfn_valid from dma_map_resource
2021-10-29 12:25:19 +01:00
Anshuman Khandual 3de360c3fd arm64/mm: drop HAVE_ARCH_PFN_VALID
CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
platforms and free_unused_memmap() would just return without creating any
holes in the memmap mapping.  There is no need for any special handling in
pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped.  This also moves
the pfn upper bits sanity check into generic pfn_valid().

[rppt: rebased on v5.15-rc3]

Link: https://lkml.kernel.org/r/1621947349-25421-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20210930013039.11260-3-rppt@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 14:54:45 +01:00
Will Deacon e63cf610ea arm64: mm: Drop pointless call to set_max_mapnr()
set_max_mapnr() is an empty stub function if CONFIG_NUMA=y, otherwise it
assigns to the 'max_mapnr' variable which is used to provide a generic
pfn_valid() implementation if CONFIG_MMU=n.

Since we don't support nommu on arm64, drop the pointless call to
set_max_mapnr() from mem_init().

Link: https://lore.kernel.org/r/130a50d7-92fd-31fa-261e-f73dadcb4fcf@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 13:54:04 +01:00
Linus Torvalds e99f23c5bf arm64 fixes:
- Limit the linear region to 51-bit when KVM is running in nVHE mode
   otherwise, depending on the placement of the ID map, kernel-VA to
   hyp-VA translations may produce addresses that either conflict with
   other HYP mappings or generate addresses outside of the 52-bit
   addressable range.
 
 - Instruct kmemleak not to scan the memory reserved for kdump as this
   range is removed from the kernel linear map and therefore not
   accessible.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmE7ldgACgkQa9axLQDI
 XvEIJRAAm6upVb+1mMtDrCpWBBc24PCfJctJruKpckH10JkfiSHZxPUGGbWH4gx6
 1eb9QuVRQ5KwanZp7J/ugqYfQlWH7JQqViV5NPRX7dL/aeR/xvCem1LpcgOMP6XY
 0z0LkJAqK6ayxtvhxHnG57SaCbLAE/8Ctok1pPKDOBeXqBlV1tOgpPOA2+PB9Vs6
 +r4kspW/tgk4wRIl+xNjOmPxz+Ej6Y7cgzhmVnByqW0Aaer0bTUYcBNgXf0959rG
 cqZybW1ugdtOP8js1BsUDGJyjF05V77beyC/8h0x5bF/8tfscuxTDfMdbdlCNnpj
 PG/z3fnoPRzXj4hZkcMdVkwtj1CcarRkgZLIDyIOf7nlBbOlGvWsjV9SK2wsADcq
 4pYMT36rv4RXs2bt1ET58a6eFWXTsC31hX+IUaIMRI7BwlJvh4JEekT7DpLvpLvJ
 4qdP8KoBPRgm1b5XjRqOF7XBpLoJHSPcLQ6VvatYZcfZaUXyyAfwTpLi7CGqD+Qs
 rqAtMjLFYZ+vUM8clhAlLsUhAZH1JH6am+qOE8qjUGdKGqFfECv2ViB8PMRgk1MH
 YxHot6VhemzKre9U7aVjlHBjrxPP/zRhmLzIQ1/SrP6x6kxxF2JUR45NfUMQO810
 yPW52qoSSk6P4ld6ka7jDGE0bZE2up2mkO15H6WcgML4dSoBvHQ=
 =7RDb
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Limit the linear region to 51-bit when KVM is running in nVHE mode.

   Otherwise, depending on the placement of the ID map, kernel-VA to
   hyp-VA translations may produce addresses that either conflict with
   other HYP mappings or generate addresses outside of the 52-bit
   addressable range.

 - Instruct kmemleak not to scan the memory reserved for kdump as this
   range is removed from the kernel linear map and therefore not
   accessible.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kdump: Skip kmemleak scan reserved memory for kdump
  arm64: mm: limit linear region to 51 bits for KVM in nVHE mode
2021-09-10 11:58:20 -07:00
Chen Wandun 85f58eb188 arm64: kdump: Skip kmemleak scan reserved memory for kdump
Trying to boot with kdump + kmemleak, command will result in a crash:
"echo scan > /sys/kernel/debug/kmemleak"

crashkernel reserved: 0x0000000007c00000 - 0x0000000027c00000 (512 MB)
Kernel command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.14.0-rc5-next-20210809+ root=/dev/mapper/ao-root ro rd.lvm.lv=ao/root rd.lvm.lv=ao/swap crashkernel=512M
Unable to handle kernel paging request at virtual address ffff000007c00000
Mem abort info:
  ESR = 0x96000007
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x07: level 3 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000007
  CM = 0, WnR = 0
swapper pgtable: 64k pages, 48-bit VAs, pgdp=00002024f0d80000
[ffff000007c00000] pgd=1800205ffffd0003, p4d=1800205ffffd0003, pud=1800205ffffd0003, pmd=1800205ffffc0003, pte=0068000007c00f06
Internal error: Oops: 96000007 [#1] SMP
pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : scan_block+0x98/0x230
lr : scan_block+0x94/0x230
sp : ffff80008d6cfb70
x29: ffff80008d6cfb70 x28: 0000000000000000 x27: 0000000000000000
x26: 00000000000000c0 x25: 0000000000000001 x24: 0000000000000000
x23: ffffa88a6b18b398 x22: ffff000007c00ff9 x21: ffffa88a6ac7fc40
x20: ffffa88a6af6a830 x19: ffff000007c00000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff
x14: ffffffff00000000 x13: ffffffffffffffff x12: 0000000000000020
x11: 0000000000000000 x10: 0000000001080000 x9 : ffffa88a6951c77c
x8 : ffffa88a6a893988 x7 : ffff203ff6cfb3c0 x6 : ffffa88a6a52b3c0
x5 : ffff203ff6cfb3c0 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000001 x1 : ffff20226cb56a40 x0 : 0000000000000000
Call trace:
 scan_block+0x98/0x230
 scan_gray_list+0x120/0x270
 kmemleak_scan+0x3a0/0x648
 kmemleak_write+0x3ac/0x4c8
 full_proxy_write+0x6c/0xa0
 vfs_write+0xc8/0x2b8
 ksys_write+0x70/0xf8
 __arm64_sys_write+0x24/0x30
 invoke_syscall+0x4c/0x110
 el0_svc_common+0x9c/0x190
 do_el0_svc+0x30/0x98
 el0_svc+0x28/0xd8
 el0t_64_sync_handler+0x90/0xb8
 el0t_64_sync+0x180/0x184

The reserved memory for kdump will be looked up by kmemleak, this area
will be set invalid when kdump service is bring up. That will result in
crash when kmemleak scan this area.

Fixes: a7259df767 ("memblock: make memblock_find_in_range method private")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210910064844.3827813-1-chenwandun@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-10 11:58:59 +01:00
Ard Biesheuvel 88053ec8cb arm64: mm: limit linear region to 51 bits for KVM in nVHE mode
KVM in nVHE mode divides up its VA space into two equal halves, and
picks the half that does not conflict with the HYP ID map to map its
linear region. This worked fine when the kernel's linear map itself was
guaranteed to cover precisely as many bits of VA space, but this was
changed by commit f4693c2716 ("arm64: mm: extend linear region for
52-bit VA configurations").

The result is that, depending on the placement of the ID map, kernel-VA
to hyp-VA translations may produce addresses that either conflict with
other HYP mappings (including the ID map itself) or generate addresses
outside of the 52-bit addressable range, neither of which is likely to
lead to anything useful.

Given that 52-bit capable cores are guaranteed to implement VHE, this
only affects configurations such as pKVM where we opt into non-VHE mode
even if the hardware is VHE capable. So just for these configurations,
let's limit the kernel linear map to 51 bits and work around the
problem.

Fixes: f4693c2716 ("arm64: mm: extend linear region for 52-bit VA configurations")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210826165613.60774-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-09 18:02:08 +01:00
Linus Torvalds 14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Mike Rapoport a7259df767 memblock: make memblock_find_in_range method private
There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.

Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.

This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().

Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>			[riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Linus Torvalds 9e5f3ffcf1 Devicetree updates for v5.15:
- Refactor arch kdump DT related code to a common implementation
 
 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties
 
 - Various clean-ups to DT FDT code
 
 - Fix a runtime error for !CONFIG_SYSFS
 
 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.
 
 - Convert a bunch of memory controller bindings to schemas
 
 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.
 
 - Extend nvmem bindings to handle bit offsets in unit-addresses
 
 - Add DT schemas for HiKey 970 PCIe PHY
 
 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings
 
 - Enable dtc pci_device_reg warning by default
 
 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmEuWEsQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw+CtD/45m84GisULb7FFmlo+WY2SbzE8a+MUEXo0
 5ZZoMViSvBchphap9ueFNDdrLMUOHMsFaxHuTCUxXr4tq7EOemM7Br4OLiwiRrM5
 o2CwBvXYu+49c4UKVFMM6RCKFiXvw5NLI4Twpj4Ge8farHvt9Ecwtq+Y+RYWgFk2
 xwXWut7ZK3zBU6B+s4MRBATCFTD5oC4pAJIK3OQUlUPqZEQqdTRBKv5lyg+VUY2k
 eU0Cyzm0dZAmtjAu8ovhVNLfK1pp165QiaFIE1qh5H3ZVZAJlNyqN4jBDx9E4pLj
 BeazrsqfOkC8mZC+T7TgixhwB6D+r6/JW9NiCjYbarXibIsUOKSTKtj8XR8eZF/g
 sLeVDx33U5S+dlj1OB7scwq4Q9sG27ii2rlkvafA5KKBjoR2dzz7o9JesCV1Guha
 goPXmcd08e+KrjINxVc6gk4Y+KG8u+G7qnXnnmSatESJKxiDu1OgU3L16mlTJFaM
 hBmrh5rx1y8EkQnzgceTZIIWh30poSQKKyDB6Ta4Dude5JE+rS30oVURDR7MIrav
 rY70OYOiSq/nCcC7bc0Yu0UxJi+bwH28WvsD0aeCUOBTFsnI4j2uvsPsh3Aq74O0
 UbQmUCMxhpmsDVdIOqlS1IVH8M79I+BrDTPVP6EE96ttoj9FbSi6AgjeGJzVMC99
 EhtWe+gKTQ==
 =28CD
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Refactor arch kdump DT related code to a common implementation

 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties

 - Various clean-ups to DT FDT code

 - Fix a runtime error for !CONFIG_SYSFS

 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.

 - Convert a bunch of memory controller bindings to schemas

 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.

 - Extend nvmem bindings to handle bit offsets in unit-addresses

 - Add DT schemas for HiKey 970 PCIe PHY

 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings

 - Enable dtc pci_device_reg warning by default

 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft

* tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (78 commits)
  dt-bindings: display: remove zte,vou.txt binding doc
  dt-bindings: hwmon: merge max1619 into trivial devices
  dt-bindings: mtd-physmap: Add 'arm,vexpress-flash' compatible
  dt-bindings: PCI: imx6: convert the imx pcie controller to dtschema
  dt-bindings: Use 'enum' instead of 'oneOf' plus 'const' entries
  dt-bindings: Add vendor prefix for Topic Embedded Systems
  of: fdt: Rename reserve_elfcorehdr() to fdt_reserve_elfcorehdr()
  arm64: kdump: Remove custom linux,usable-memory-range handling
  arm64: kdump: Remove custom linux,elfcorehdr handling
  riscv: Remove non-standard linux,elfcorehdr handling
  of: fdt: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD) instead of #ifdef
  of: fdt: Add generic support for handling usable memory range property
  of: fdt: Add generic support for handling elf core headers property
  crash_dump: Make elfcorehdr address/size symbols always visible
  dt-bindings: memory: convert Samsung Exynos DMC to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos PPMU to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos NoCP to dtschema
  kbuild: Enable dtc 'pci_device_reg' warning by default
  dt-bindings: soc: remove obsolete zte zx header
  dt-bindings: clock: remove obsolete zte zx header
  ...
2021-09-01 18:34:51 -07:00
Will Deacon 3eb9cdffb3 Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"
This partially reverts commit 16c9afc776.

Alex Bee reports a regression in 5.14 on their RK3328 SoC when
configuring the PL330 DMA controller:

 | ------------[ cut here ]------------
 | WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
 | Modules linked in: spi_rockchip(+) fuse
 | CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
 | Hardware name: Pine64 Rock64 (DT)
 | pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
 | pc : dma_map_resource+0x68/0xc0
 | lr : pl330_prep_slave_fifo+0x78/0xd0

This appears to be because dma_map_resource() is being called for a
physical address which does not correspond to a memory address yet does
have a valid 'struct page' due to the way in which the vmemmap is
constructed.

Prior to 16c9afc776 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), the arm64
implementation of pfn_valid() called memblock_is_memory() to return
'false' for such regions and the DMA mapping request would proceed.
However, now that we are using the generic implementation where only the
presence of the memory map entry is considered, we return 'true' and
erroneously fail with DMA_MAPPING_ERROR because we identify the region
as DRAM.

Although fixing this in the DMA mapping code is arguably the right fix,
it is a risky, cross-architecture change at this stage in the cycle. So
just revert arm64 back to its old pfn_valid() implementation for v5.14.
The change to the generic pfn_valid() code is preserved from the original
patch, so as to avoid impacting other architectures.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Alex Bee <knaerzche@gmail.com>
Link: https://lore.kernel.org/r/d3a3c828-b777-faf8-e901-904995688437@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-25 11:33:24 +01:00
Geert Uytterhoeven b261dba2fd arm64: kdump: Remove custom linux,usable-memory-range handling
Remove the architecture-specific code for handling the
"linux,usable-memory-range" property under the "/chosen" node in DT, as
the platform-agnostic FDT core code already takes care of this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/7356c531c49a24b4a55577bf8e46d93f4d8ae460.1628670468.git.geert+renesas@glider.be
2021-08-24 17:09:01 -05:00
Geert Uytterhoeven 57beb9bd18 arm64: kdump: Remove custom linux,elfcorehdr handling
Remove the architecture-specific code for handling the
"linux,elfcorehdr" property under the "/chosen" node in DT, as the
platform-agnostic handling in the FDT core code already takes care of
this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/3b8f801f9b92066855e87f3079fafc153ab20f69.1628670468.git.geert+renesas@glider.be
2021-08-24 17:09:01 -05:00
Linus Torvalds 71bd934101 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "190 patches.

  Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
  vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
  migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
  zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
  core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
  signals, exec, kcov, selftests, compress/decompress, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  ipc/util.c: use binary search for max_idx
  ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
  ipc: use kmalloc for msg_queue and shmid_kernel
  ipc sem: use kvmalloc for sem_undo allocation
  lib/decompressors: remove set but not used variabled 'level'
  selftests/vm/pkeys: exercise x86 XSAVE init state
  selftests/vm/pkeys: refill shadow register after implicit kernel write
  selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
  selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
  kcov: add __no_sanitize_coverage to fix noinstr for all architectures
  exec: remove checks in __register_bimfmt()
  x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
  hfsplus: report create_date to kstat.btime
  hfsplus: remove unnecessary oom message
  nilfs2: remove redundant continue statement in a while-loop
  kprobes: remove duplicated strong free_insn_page in x86 and s390
  init: print out unknown kernel parameters
  checkpatch: do not complain about positive return values starting with EPOLL
  checkpatch: improve the indented label test
  checkpatch: scripts/spdxcheck.py now requires python3
  ...
2021-07-02 12:08:10 -07:00
Anshuman Khandual 16c9afc776 arm64/mm: drop HAVE_ARCH_PFN_VALID
CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
platforms and free_unused_memmap() would just return without creating any
holes in the memmap mapping.  There is no need for any special handling in
pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped.  This also moves
the pfn upper bits sanity check into generic pfn_valid().

Link: https://lkml.kernel.org/r/1621947349-25421-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:29 -07:00
Mike Rapoport a7d9f306ba arm64: drop pfn_valid_within() and simplify pfn_valid()
The arm64's version of pfn_valid() differs from the generic because of two
reasons:

* Parts of the memory map are freed during boot. This makes it necessary to
  verify that there is actual physical memory that corresponds to a pfn
  which is done by querying memblock.

* There are NOMAP memory regions. These regions are not mapped in the
  linear map and until the previous commit the struct pages representing
  these areas had default values.

As the consequence of absence of the special treatment of NOMAP regions in
the memory map it was necessary to use memblock_is_map_memory() in
pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
generic mm functionality would not treat a NOMAP page as a normal page.

Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
the rest of core mm will treat them as unusable memory and thus
pfn_valid_within() is no longer required at all and can be disabled on
arm64.

pfn_valid() can be slightly simplified by replacing
memblock_is_map_memory() with memblock_is_memory().

[rppt@kernel.org: fix merge fix]
  Link: https://lkml.kernel.org/r/YJtoQhidtIJOhYsV@kernel.org

Link: https://lkml.kernel.org/r/20210511100550.28178-5-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:29 -07:00
Mike Rapoport 873ba46391 arm64: decouple check whether pfn is in linear map from pfn_valid()
The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in
the linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_map_memory() wrapper for
memblock_is_map_memory() to perform such check and use it where
appropriate.

Using a wrapper allows to avoid cyclic include dependencies.

While here also update style of pfn_valid() so that both pfn_valid() and
pfn_is_map_memory() declarations will be consistent.

Link: https://lkml.kernel.org/r/20210511100550.28178-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:29 -07:00
Anshuman Khandual 7e04cc9189 arm64/mm: Validate CONFIG_PGTABLE_LEVELS
CONFIG_PGTABLE_LEVELS has been statically defined in (arch/arm64/Kconfig)
depending on the page size and requested virtual address range. In order to
validate this page table levels selection this adds a BUILD_BUG_ON() as per
the existing formula ARM64_HW_PGTABLE_LEVELS(). This would help protect any
inadvertent changes to CONFIG_PGTABLE_LEVELS selection.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1620649326-24115-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:54:39 +01:00
Christoph Hellwig 687842ec50 arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required
Although SWIOTLB_NO_FORCE is meant to allow later calls to swiotlb_init,
today dma_direct_map_page returns error if SWIOTLB_NO_FORCE.

For now, without a larger overhaul of SWIOTLB_NO_FORCE, the best we can
do is to avoid setting SWIOTLB_NO_FORCE in mem_init when we know that it
is going to be required later (e.g. Xen requires it).

CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: catalin.marinas@arm.com
CC: will@kernel.org
CC: linux-arm-kernel@lists.infradead.org
Fixes: 2726bf3ff2 ("swiotlb: Make SWIOTLB_NO_FORCE perform no allocation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210512201823.1963-2-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-05-14 15:52:08 +02:00
Linus Torvalds 51595e3b49 Assorted arm64 fixes and clean-ups, the most important:
- Restore terminal stack frame records. Their previous removal caused
   traces which cross secondary_start_kernel to terminate one entry too
   late, with a spurious "0" entry.
 
 - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR
   register.
 
 - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe
   failure (GTDT), prevent unregistering of GIC SGIs.
 
 - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with having
   to test all the other combinations.
 
 - Documentation fixes and updates: tagged address ABI exceptions on
   brk/mmap/mremap(), event stream frequency, update booting requirements
   on the configuration of traps.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmCVba0ACgkQa9axLQDI
 XvEClxAAsqigp+Mnotdr8YUOuXLjHWU41EMShV6WbFcmlViEyZxxtZ5qavw19T3L
 rPxb8hq9QqI8kCd+j4MAU7cdc0ry+047njJmQ3Va0WeiDsbgEfPvLWPguDbeDFXW
 EjKKib+F/u58IffDkn6rVA7ZVPgYHRH+8yw6EdApp0BN4JuxEFzGBzG4EWKXnNHH
 IOu4IIXlbLX+U1kTtUFR4u6i4uBs2pZdEYzo1NF/Joacg14F01CBRuh8U04eeWFD
 HF4pWd4eCl/bLYPurF1rOi1dIUyrPuaPgNInGEdSaocD0hIvQH0r0wyIt+aMmqvK
 9Jm+dDEGeLxQn2nDrXfyldYG5EbFa3OplkUt2MVDDMWwN2Gpsjlnf/ucff/SBT/N
 7D6AL2OH6KDDCsNgU1JH9H6rAlh4nWJcsMBrWmP7aQtBMRyccQLywrt4HXB8cy7E
 +MyhTit05P3lpsrK2uZSFujK35Ts8hxywA7lAlU7YP4ADKu3Noc6qXSaxZRe+1Gb
 O5k3Qdcih0VLE843PjJj8f8fW1ysJW5J60cK9BaZxpB77gNufKkh/hS6YAiA8qkt
 PT3J0jk/cgGvwKK54rW52dG7qvDImgUMGkXGKQnEimgb62DatCZ4ZOPC+UoiDiqO
 SEd1DSW0Lt1VxVIulAjatVgzIJGM0jGCm9L7/vBguR0+Lahakbg=
 =vYok
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "A mix of fixes and clean-ups that turned up too late for the first
  pull request:

   - Restore terminal stack frame records. Their previous removal caused
     traces which cross secondary_start_kernel to terminate one entry
     too late, with a spurious "0" entry.

   - Fix boot warning with pseudo-NMI due to the way we manipulate the
     PMR register.

   - ACPI fixes: avoid corruption of interrupt mappings on watchdog
     probe failure (GTDT), prevent unregistering of GIC SGIs.

   - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with
     having to test all the other combinations.

   - Documentation fixes and updates: tagged address ABI exceptions on
     brk/mmap/mremap(), event stream frequency, update booting
     requirements on the configuration of traps"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update the stale comment
  arm64: Fix the documented event stream frequency
  arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
  arm64: Explicitly document boot requirements for SVE
  arm64: Explicitly require that FPSIMD instructions do not trap
  arm64: Relax booting requirements for configuration of traps
  arm64: cpufeatures: use min and max
  arm64: stacktrace: restore terminal records
  arm64/vdso: Discard .note.gnu.property sections in vDSO
  arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions
  psci: Remove unneeded semicolon
  ACPI: irq: Prevent unregistering of GIC SGIs
  ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
  arm64: Show three registers per line
  arm64: remove HAVE_DEBUG_BUGVERBOSE
  arm64: alternative: simplify passing alt_region
  arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
  arm64: vdso32: drop -no-integrated-as flag
2021-05-07 12:11:05 -07:00