Commit graph

8022 commits

Author SHA1 Message Date
Liam R. Howlett
17dc622c7b maple_tree: fix mas_prev() and mas_find() state handling
When mas_prev() does not find anything, set the state to MAS_NONE.

Handle the MAS_NONE in mas_find() like a MAS_START.

Link: https://lkml.kernel.org/r/20230120162650.984577-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: <syzbot+502859d610c661e56545@syzkaller.appspotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-09 16:51:31 -08:00
Liam R. Howlett
1202700c3f maple_tree: fix handle of invalidated state in mas_wr_store_setup()
If an invalidated maple state is encountered during write, reset the maple
state to MAS_START.  This will result in a re-walk of the tree to the
correct location for the write.

Link: https://lore.kernel.org/all/20230107020126.1627-1-sj@kernel.org/
Link: https://lkml.kernel.org/r/20230120162650.984577-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-09 16:51:31 -08:00
Liam R. Howlett
5159d64b33 test_maple_tree: test modifications while iterating
Add a testcase to ensure the iterator detects bad states on modifications
and does what the user expects

Link: https://lkml.kernel.org/r/20230120162650.984577-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-09 16:51:31 -08:00
Liam R. Howlett
50e81c82ad maple_tree: reduce user error potential
When iterating, a user may operate on the tree and cause the maple state
to be altered and left in an unintuitive state.  Detect this scenario and
correct it by setting to the limit and invalidating the state.

Link: https://lkml.kernel.org/r/20230120162650.984577-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-09 16:51:30 -08:00
Liam R. Howlett
65be6f058b maple_tree: fix potential rcu issue
Ensure the node isn't dead after reading the node end.

Link: https://lkml.kernel.org/r/20230120162650.984577-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-09 16:51:30 -08:00
Zhaoyang Huang
b2db9ef2c0 mm: move KMEMLEAK's Kconfig items from lib to mm
Have the kmemleak's source code and Kconfig items be in the same directory.

Link: https://lkml.kernel.org/r/1674091345-14799-1-git-send-email-zhaoyang.huang@unisoc.com
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:26 -08:00
NeilBrown
2973d8229b mm: discard __GFP_ATOMIC
__GFP_ATOMIC serves little purpose.  Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.

It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark.  It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.

__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this.  We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.

__GFP_ATOMIC allows the "watermark_boost" to be side-stepped.  It is
probable that testing ALLOC_HARDER is a better fit here.

__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep.  This should test __GFP_DIRECT_RECLAIM instead.

This patch:
 - removes __GFP_ATOMIC
 - allows __GFP_HIGH allocations to ignore watermark boosting as well
   as GFP_ATOMIC requests.
 - makes other adjustments as suggested by the above.

The net result is not change to GFP_ATOMIC allocations.  Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges.  This affects:
  xen, dm, md, ntfs3
  the vermillion frame buffer
  hibernation
  ksm
  swap
all of which likely produce more benefit than cost if these selected
allocation are more likely to succeed quickly.

[mgorman: Minor adjustments to rework on top of a series]
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name
Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:13 -08:00
Vernon Yang
f942b0f052 maple_tree: fix comment of mte_destroy_walk
The parameter name of maple tree is mt, make the comment be mt instead of
mn, and the separator between the parameter name and the description to be
: instead of -.

Link: https://lkml.kernel.org/r/20230111135348.803181-1-vernon2gm@gmail.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:01 -08:00
Vernon Yang
c5d5546ea0 maple_tree: remove the parameter entry of mas_preallocate
The parameter entry of mas_preallocate is not used, so drop it.

Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:32:52 -08:00
Andrew Morton
5ab0fc155d Sync mm-stable with mm-hotfixes-stable to pick up dependent patches
Merge branch 'mm-hotfixes-stable' into mm-stable
2023-01-31 17:25:17 -08:00
ye xingchen
1e90e35b62 Kconfig.debug: fix the help description in SCHED_DEBUG
The correct file path for SCHED_DEBUG is /sys/kernel/debug/sched.

Link: https://lkml.kernel.org/r/202301291013573466558@zte.com.cn
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:10 -08:00
Zhaoyang Huang
993f57e027 mm: use stack_depot_early_init for kmemleak
Mirsad report the below error which is caused by stack_depot_init()
failure in kvcalloc.  Solve this by having stackdepot use
stack_depot_early_init().

On 1/4/23 17:08, Mirsad Goran Todorovac wrote:
I hate to bring bad news again, but there seems to be a problem with the output of /sys/kernel/debug/kmemleak:

[root@pc-mtodorov ~]# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff951c118568b0 (size 16):
comm "kworker/u12:2", pid 56, jiffies 4294893952 (age 4356.548s)
hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
    backtrace:
[root@pc-mtodorov ~]#

Apparently, backtrace of called functions on the stack is no longer
printed with the list of memory leaks.  This appeared on Lenovo desktop
10TX000VCR, with AlmaLinux 8.7 and BIOS version M22KT49A (11/10/2022) and
6.2-rc1 and 6.2-rc2 builds.  This worked on 6.1 with the same
CONFIG_KMEMLEAK=y and MGLRU enabled on a vanilla mainstream kernel from
Mr.  Torvalds' tree.  I don't know if this is deliberate feature for some
reason or a bug.  Please find attached the config, lshw and kmemleak
output.

[vbabka@suse.cz: remove stack_depot_init() call]
Link: https://lore.kernel.org/all/5272a819-ef74-65ff-be61-4d2d567337de@alu.unizg.hr/
Link: https://lkml.kernel.org/r/1674091345-14799-2-git-send-email-zhaoyang.huang@unisoc.com
Fixes: 56a61617dd ("mm: use stack_depot for recording kmemleak's backtrace")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:10 -08:00
Wei Yang
ab6ef70a8b maple_tree: should get pivots boundary by type
We should get pivots boundary by type.  Fixes a potential overindexing of
mt_pivots[].

Link: https://lkml.kernel.org/r/20221112234308.23823-1-richard.weiyang@gmail.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:08 -08:00
Liam Howlett
7327e8111a maple_tree: fix mas_empty_area_rev() lower bound validation
mas_empty_area_rev() was not correctly validating the start of a gap
against the lower limit.  This could lead to the range starting lower than
the requested minimum.

Fix the issue by better validating a gap once one is found.

This commit also adds tests to the maple tree test suite for this issue
and tests the mas_empty_area() function for similar bound checking.

Link: https://lkml.kernel.org/r/20230111200136.1851322-1-Liam.Howlett@oracle.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216911
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: <amanieu@gmail.com>
  Link: https://lore.kernel.org/linux-mm/0b9f5425-08d4-8013-aa4c-e620c3b10bb2@leemhuis.info/
Tested-by: Holger Hoffsttte <holger@applied-asynchrony.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:07 -08:00
Liam Howlett
541e06b772 maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()
Preallocations are common in the VMA code to avoid allocating under
certain locking conditions.  The preallocations must also cover the
worst-case scenario.  Removing the GFP_ZERO flag from the
kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time
spent zeroing memory that may not be used.  Only zero out the necessary
area to keep track of the allocations in the maple state.  Zero the entire
node prior to using it in the tree.

This required internal changes to node counting on allocation, so the test
code is also updated.

This restores some micro-benchmark performance: up to +9% in mmtests mmap1
by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by
Red Hat

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
Link: https://lkml.kernel.org/r/20230105160427.2988454-1-Liam.Howlett@oracle.com
Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com>
Reported-by: Jirka Hladky <jhladky@redhat.com>
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:54 -08:00
Vernon Yang
e11cb683b2 maple_tree: refine mab_calc_split function
Invert the conditional judgment of the mid_split, to focus the return
statement in the last statement, which is easier to understand and for
better readability.

Link: https://lkml.kernel.org/r/20221221060058.609003-8-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:47 -08:00
Vernon Yang
46b3458482 maple_tree: refine ma_state init from mas_start()
If mas->node is an MAS_START, there are three cases, and they all assign
different values to mas->node and mas->offset.  So there is no need to set
them to a default value before updating.

Update them directly to make them easier to understand and for better
readability.

Link: https://lkml.kernel.org/r/20221221060058.609003-7-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:46 -08:00
Vernon Yang
84fd3e1ee3 maple_tree: use macro MA_ROOT_PARENT instead of number
When you need to compare whether node->parent is parent of the
root node, using macro MA_ROOT_PARENT is easier to understand
and for better readability.

Link: https://lkml.kernel.org/r/20221221060058.609003-5-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:46 -08:00
Vernon Yang
bd592703b8 maple_tree: use mt_node_max() instead of direct operations mt_max[]
Use mt_node_max() to get the maximum number of slots for a node,
rather than direct operations mt_max[], makes it better portability.

Link: https://lkml.kernel.org/r/20221221060058.609003-4-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:46 -08:00
Vernon Yang
d56c593c8e maple_tree: remove extra return statement
For functions with a return type of void, it is unnecessary to
add a reurn statement at the end of the function, so drop it.

Link: https://lkml.kernel.org/r/20221221060058.609003-3-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:46 -08:00
Vernon Yang
831978e37e maple_tree: remove extra space and blank line
Patch series "Clean up and refinement for maple tree", v2.

This patchset cleans up and refines some maple tree code.  A few small
changes make the code easier to understand and for better readability.


This patch (of 7):

These extra space and blank lines are unnecessary, so drop them.

Link: https://lkml.kernel.org/r/20221221060058.609003-1-vernon2gm@gmail.com
Link: https://lkml.kernel.org/r/20221221060058.609003-2-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:46 -08:00
Qinglin Pan
6b1ead5985 lib/test_vmalloc.c: add parameter use_huge for fix_size_alloc_test
Add a parameter `use_huge' for fix_size_alloc_test(), which can be used to
test allocation vie vmalloc_huge for both functionality and performance.

Link: https://lkml.kernel.org/r/20221212055657.698420-1-panqinglin2020@iscas.ac.cn
Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:41 -08:00
Andrew Morton
bd86d2ea36 Sync with v6.2-rc4
Merge branch 'master' into mm-hotfixes-stable
2023-01-18 16:52:20 -08:00
Mateusz Guzik
f5fe24ef17 lockref: stop doing cpu_relax in the cmpxchg loop
On the x86-64 architecture even a failing cmpxchg grants exclusive
access to the cacheline, making it preferable to retry the failed op
immediately instead of stalling with the pause instruction.

To illustrate the impact, below are benchmark results obtained by
running various will-it-scale tests on top of the 6.2-rc3 kernel and
Cascade Lake (2 sockets * 24 cores * 2 threads) CPU.

All results in ops/s.  Note there is some variance in re-runs, but the
code is consistently faster when contention is present.

  open3 ("Same file open/close"):
  proc          stock       no-pause
     1         805603         814942       (+%1)
     2        1054980        1054781       (-0%)
     8        1544802        1822858      (+18%)
    24        1191064        2199665      (+84%)
    48         851582        1469860      (+72%)
    96         609481        1427170     (+134%)

  fstat2 ("Same file fstat"):
  proc          stock       no-pause
     1        3013872        3047636       (+1%)
     2        4284687        4400421       (+2%)
     8        3257721        5530156      (+69%)
    24        2239819        5466127     (+144%)
    48        1701072        5256609     (+209%)
    96        1269157        6649326     (+423%)

Additionally, a kernel with a private patch to help access() scalability:
access2 ("Same file access"):

  proc          stock        patched      patched
                                         +nopause
    24        2378041        2005501      5370335  (-15% / +125%)

That is, fixing the problems in access itself *reduces* scalability
after the cacheline ping-pong only happens in lockref with the pause
instruction.

Note that fstat and access benchmarks are not currently integrated into
will-it-scale, but interested parties can find them in pull requests to
said project.

Code at hand has a rather tortured history.  First modification showed
up in commit d472d9d98b ("lockref: Relax in cmpxchg loop"), written
with Itanium in mind.  Later it got patched up to use an arch-dependent
macro to stop doing it on s390 where it caused a significant regression.
Said macro had undergone revisions and was ultimately eliminated later,
going back to cpu_relax.

While I intended to only remove cpu_relax for x86-64, I got the
following comment from Linus:

    I would actually prefer just removing it entirely and see if
    somebody else hollers. You have the numbers to prove it hurts on
    real hardware, and I don't think we have any numbers to the
    contrary.

    So I think it's better to trust the numbers and remove it as a
    failure, than say "let's just remove it on x86-64 and leave
    everybody else with the potentially broken code"

Additionally, Will Deacon (maintainer of the arm64 port, one of the
architectures previously benchmarked):

    So, from the arm64 side of the fence, I'm perfectly happy just
    removing the cpu_relax() calls from lockref.

As such, come back full circle in history and whack it altogether.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/
Acked-by: Tony Luck <tony.luck@intel.com> # ia64
Acked-by: Nicholas Piggin <npiggin@gmail.com> # powerpc
Acked-by: Will Deacon <will@kernel.org> # arm64
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-01-13 14:35:38 -06:00
Randy Dunlap
d09dce1fff lib/win_minmax: use /* notation for regular comments
Don't use kernel-doc "/**" notation for non-kernel-doc comments.
Prevents a kernel-doc warning:

lib/win_minmax.c:31: warning: expecting prototype for lib/minmax.c(). Prototype was for minmax_subwin_update() instead

Link: https://lkml.kernel.org/r/20230102211614.26343-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-11 16:14:21 -08:00
Linus Torvalds
4a4dcea083 v6.2 first rc pull request
A big data corruption regression due to a change in the scatterlist
 
 - Fix compilation warnings on gcc 13
 
 - Oops when using some mlx5 stats
 
 - Bad enforcement of atomic responder resources in mlx5
 
 - Do not wrongly combine non-contiguous pages in scatterlist
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY7izVgAKCRCFwuHvBreF
 YfOXAQC08HilnYdRjlVrxswOQIN1KzHQ63xDXM0Rv99XOcSKCgD+LeXkeCeJ0XWW
 kBtkPnhR194phADv4nWaaSrIc52DtA0=
 =HxxG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Most noticeable is that Yishai found a big data corruption regression
  due to a change in the scatterlist:

   - Do not wrongly combine non-contiguous pages in scatterlist

   - Fix compilation warnings on gcc 13

   - Oops when using some mlx5 stats

   - Bad enforcement of atomic responder resources in mlx5"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
  RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device
  RDMA/srp: Move large values to a new enum for gcc13
2023-01-07 10:06:47 -08:00
Yishai Hadas
e95d50d74b lib/scatterlist: Fix to merge contiguous pages into the last SG properly
When sg_alloc_append_table_from_pages() calls to pages_are_mergeable() in
its 'sgt_append->prv' flow to check whether it can merge contiguous pages
into the last SG, it passes the page arguments in the wrong order.

The first parameter should be the next candidate page to be merged to
the last page and not the opposite.

The current code leads to a corrupted SG which resulted in OOPs and
unexpected errors when non-contiguous pages are merged wrongly.

Fix to pass the page parameters in the right order.

Fixes: 1567b49d1a ("lib/scatterlist: add check when merging zone device pages")
Link: https://lore.kernel.org/r/20230105112339.107969-1-yishaih@nvidia.com
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-01-05 16:01:05 -04:00
YoungJun.park
93ef83050e kunit: alloc_string_stream_fragment error handling bug fix
When it fails to allocate fragment, it does not free and return error.
And check the pointer inappropriately.

Fixed merge conflicts with
commit 618887768b ("kunit: update NULL vs IS_ERR() tests")
Shuah Khan <skhan@linuxfoundation.org>

Signed-off-by: YoungJun.park <her0gyugyu@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-12-26 16:01:36 -07:00
Liam Howlett
c5651b31f5 test_maple_tree: add test for mas_spanning_rebalance() on insufficient data
Add a test to the maple tree test suite for the spanning rebalance
insufficient node issue does not go undetected again.

Link: https://lkml.kernel.org/r/20221219161922.2708732-3-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Liam Howlett
0abb964aae maple_tree: fix mas_spanning_rebalance() on insufficient data
Mike Rapoport contacted me off-list with a regression in running criu. 
Periodic tests fail with an RCU stall during execution.  Although rare, it
is possible to hit this with other uses so this patch should be backported
to fix the regression.

This patchset adds the fix and a test case to the maple tree test
suite.


This patch (of 2):

An insufficient node was causing an out-of-bounds access on the node in
mas_leaf_max_gap().  The cause was the faulty detection of the new node
being a root node when overwriting many entries at the end of the tree.

Fix the detection of a new root and ensure there is sufficient data prior
to entering the spanning rebalance loop.

Link: https://lkml.kernel.org/r/20221219161922.2708732-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20221219161922.2708732-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Mike Rapoport <rppt@kernel.org>
Tested-by: Mike Rapoport <rppt@kernel.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Linus Torvalds
32d528c4b8 SPDX/License additions for 6.2-rc1
Here are 2 small updates for LICENSES and some kernel files that add the
 Copyleft-next license and use it in a SPDX tag as a dual-license for
 some kernel files.
 
 These have been discussed thoroughly in public on the linux-spdx mailing
 list, and have the needed acks on them, as well as having been in
 linux-next with no reported issues for quite some time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY6F1Qg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynGWwCfVJ+Z1CVWSFC8KaaGNiFu/gXmgNUAoKy11gWJ
 8igpSNEkOiGiaGA+AvN+
 =j8iu
 -----END PGP SIGNATURE-----

Merge tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX/License additions from Greg KH:
 "Here are two small updates for LICENSES and some kernel files that add
  the Copyleft-next license and use it in a SPDX tag as a dual-license
  for some kernel files.

  These have been discussed thoroughly in public on the linux-spdx
  mailing list, and have the needed acks on them, as well as having been
  in linux-next with no reported issues for quite some time"

* tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  testing: use the copyleft-next-0.3.1 SPDX tag
  LICENSES: Add the copyleft-next-0.3.1 license
2022-12-20 08:53:16 -06:00
Linus Torvalds
70b07bec95 asm-generic bits for 6.2
There are only three fairly simple patches. The #include
 change to linux/swab.h addresses a userspace build issue,
 and the change to the mmio tracing logic helps provide
 more useful traces.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOgtJUACgkQmmx57+YA
 GNln8Q//dvQ2FRIWBXKh4r6CxtiCx2aktGmnP1YAuaIVuzjGSn/8EQZAoTYN5jKY
 Io8rFt1/FfOMtu3E32JtGpgfDAP/8sfz3Lao9bzJR/Fjv059qL5QCoI3qbEFTNz9
 vzUqiddFZGppn76qsXSA6aItHVDS4Y97XiYRSwSMlpIz+9a84rYxCo04bNR4ut4t
 PR5+lvlTDfGfmj+SebrCt/IEi/FF9ckEYCLJHfaSPcQcujLDZDKPcT2RbubgwHgB
 OfE5Rx25xJxR4BU5MFe74sKn5Qi5HOfr1GrsjL3RbMNiYuHgbwLcZkMXvbZukdHz
 50Gt8UXMAxvZYKz92kyQLYuiKEtFSrQ8JccgqVUWL/lRLDoUkTg4hz4tmGUZE6KP
 ElxdgIBem9yrFX0oCaPNkY5d3MRU2i19FvBfKWKC54NbcmBjpHxxSg+WW/P7Jw+N
 uegj7qcEh7RcQU4w97OW4nS+eZmnXb4O4qXZeFwhXHS/snH7p3iBApyoPlyb+KOs
 np5MWRNaGFfi8BWWeVTX78U2VW8Ql8nnlRIlk/Wwm8AkVaNFQDnffKPi87paZd9o
 Kl+a9broMf4v0Oq5JTxqPMzmn9zUV0rHa1VanRBnNKqTOWalmNcsfsg1Ih9PhAAT
 p3u2CN0cBI7QmrcymJHrCuv0eNJRjsYa5FB4xmhJcJkD2qjsqXI=
 =05F5
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are only three fairly simple patches.

  The #include change to linux/swab.h addresses a userspace build issue,
  and the change to the mmio tracing logic helps provide more useful
  traces"

* tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard
  asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info
  include/uapi/linux/swab: Fix potentially missing __always_inline
2022-12-20 08:32:11 -06:00
Linus Torvalds
6feb57c2fd Kbuild updates for v6.2
- Support zstd-compressed debug info
 
  - Allow W=1 builds to detect objects shared among multiple modules
 
  - Add srcrpm-pkg target to generate a source RPM package
 
  - Make the -s option detection work for future GNU Make versions
 
  - Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
 
  - Allow W=1 builds to detect -Wundef warnings in any preprocessed files
 
  - Raise the minimum supported version of binutils to 2.25
 
  - Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
 
  - Use $(file ...) to read a file if GNU Make >= 4.2 is used
 
  - Print error if GNU Make older than 3.82 is used
 
  - Allow modpost to detect section mismatches with Clang LTO
 
  - Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmOeImsVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG06IP/iVjuWFvnjDZT4X8X6zN8aKp1vtR
 EMkmoRtt5cD4CLb1MG4N7irYHgedQSx4rYceP45MyW1I3egl6Ct14RDyeQ1xSIZb
 XFTLDCZvfl/up3MdiqNAqKRS7x5lk9++7F0t+2SoQxKQyJvm735XreX+VhZ1FeLB
 qcHrmzJ5veky5Ry/3OkNUgKFBjKEAL+qKMc55uvkXqfTb3KoBa2r4VC1OaoYGRru
 R8oF9qQRnGVQAl/LbBVchmgSjxryxPrCvBGiKlK03VkXdzEMHMimEJh3BQ6e0PGo
 gajdk+4liy7z+jQnI7jFhvJjGKzkEP/Bc99M/uS92QX5MgpH6mqpHMoqqPiqW87K
 RmZH37FqRu1Vo8dpibmH6r2K6YD/HHRjaDHk1VuuCQYEn0dsNmokPXOqd/1v0I1i
 TXPjWOw1AID5vMJWllqxFhpeVvf0vx5BT/UNrh68MLqlJZzv2eMVJb4fNy6640ml
 U0NclMnOa3eOmf5z1T7/LqDRTa63Q0kpanRrBpcmVOaqW+ZpQ3SQjh4uBN1PyJHL
 cX3Skc341DyRlFiT54QhGKlm57MEb2gjhBZ3Z4J+b7sEFgvjXH/W8vcOGIKlppmA
 CfYMyres4OV+fJc89ONkWsvLiOP1OeUGPvytm33J5QMKXc8SzOLP0D/F8kjrDflm
 EROKuZ4EA5ej/rOy
 =Ig/Y
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Support zstd-compressed debug info

 - Allow W=1 builds to detect objects shared among multiple modules

 - Add srcrpm-pkg target to generate a source RPM package

 - Make the -s option detection work for future GNU Make versions

 - Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y

 - Allow W=1 builds to detect -Wundef warnings in any preprocessed files

 - Raise the minimum supported version of binutils to 2.25

 - Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used

 - Use $(file ...) to read a file if GNU Make >= 4.2 is used

 - Print error if GNU Make older than 3.82 is used

 - Allow modpost to detect section mismatches with Clang LTO

 - Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y

* tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
  buildtar: fix tarballs with EFI_ZBOOT enabled
  modpost: Include '.text.*' in TEXT_SECTIONS
  padata: Mark padata_work_init() as __ref
  kbuild: ensure Make >= 3.82 is used
  kbuild: refactor the prerequisites of the modpost rule
  kbuild: change module.order to list *.o instead of *.ko
  kbuild: use .NOTINTERMEDIATE for future GNU Make versions
  kconfig: refactor Makefile to reduce process forks
  kbuild: add read-file macro
  kbuild: do not sort after reading modules.order
  kbuild: add test-{ge,gt,le,lt} macros
  Documentation: raise minimum supported version of binutils to 2.25
  kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds
  kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS
  kbuild: Port silent mode detection to future gnu make.
  init/version.c: remove #include <generated/utsrelease.h>
  firmware_loader: remove #include <generated/utsrelease.h>
  modpost: Mark uuid_le type to be suitable only for MEI
  kbuild: add ability to make source rpm buildable using koji
  kbuild: warn objects shared among multiple modules
  ...
2022-12-19 12:33:32 -06:00
Linus Torvalds
158738ea75 Update to zstd-1.5.2
The major goal of this PR is to update the kernel to upstream zstd v1.5.2 [0].
 Specifically to the tag v1.5.2-kernel [1] which includes several cherrypicked
 fixes for the kernel on top of v1.5.2.
 
 Excepting the MAINTAINERS change, all the changes in this PR can be generated by:
 
 ```
 git clone https://github.com/facebook/zstd
 cd zstd/contrib/linux-kernel
 git checkout v1.5.2-kernel
 LINUX=/path/to/linux/repo make import
 ```
 
 These changes have been baking in linux-next since 2022/10/24 when I put up
 my patchset [2]. Notably the first commit is a small refactor of the only zstd
 commit since the last update [3] to fit the upstream import scheme.
 
 Additionally, this PR includes several minor typo fixes, which have all been
 fixed upstream so they are maintained on the next import.
 
 [0] https://github.com/facebook/zstd/releases/tag/v1.5.2
 [1] https://github.com/facebook/zstd/tree/v1.5.2-kernel
 [2] https://lore.kernel.org/lkml/20221024202606.404049-1-nickrterrell@gmail.com/
 [3] 637a642f5c
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmOZK2AACgkQuzRpqaNE
 qPUWxQ//RWViIlgoXrVY8Zv/6zdUk+BJx9Z45pJy/gnu5wWvFDZHcOTPlbcKp7V0
 nVjFLxzNNzGL6JeBL+PPLlDrmL9P5NBtO44fexl2VS0/S3WMm8G5/C/oIPqrd8yD
 I6LdE44Az6Q/7DONkB0r3C+PRq14/a1Me3PVPsrXs4RNPCl+p952ZnwEIM51t+2w
 vX+z7r5C55F/F2KROU2NlgrL8ybjNdQQ2kocLcaEVIJjIX/lgcmfSpEXaOMA/8ah
 HOn8NQnWRkQoNbmD3n0yrF+IC2Ks8fClO5G2N/OTlSBsHLKVtsgXjwMmBMdVBkqi
 2+Qf8FfI6Q2HFDxHf/6oUDEFKbIEsai3lQSelI7FThZzSXzCdDh6w9QbG7FwrTcJ
 As+rQCH5Wb4IYl2WJc/GVkwo0Nwkcjatr7gyGB7Xsn7Xh6xDpQGaY72p7MlEaiCe
 egFa2XrhmpPW1gi7/U/kH8qn+M6aIKzqSHsK7ytBi3KGp+V3ijMPCNQlsqiTC6np
 9SrVjo1Az5mnZdBDYcKX6L189ylH8js+WQRpy2BoOTYYnRiszcBn23o84Z0VWzEd
 MpuLryhGCd0v7mmfapGi2Jgb610leLYMFdkyIddnZnjXvYdFT0/EatwZHZ7vNM4O
 t0LfZrNmMe7TdTAV2cQbHuDdxd3wKwqz1+qPuRIbDOgWUx17Ek0=
 =AXY5
 -----END PGP SIGNATURE-----

Merge tag 'zstd-linus-v6.2' of https://github.com/terrelln/linux

Pull zstd updates from Nick Terrell:
 "Update the kernel to upstream zstd v1.5.2 [0]. Specifically to the tag
  v1.5.2-kernel [1] which includes several cherrypicked fixes for the
  kernel on top of v1.5.2.

  Excepting the MAINTAINERS change, all the changes in this can be
  generated by:

    git clone https://github.com/facebook/zstd
    cd zstd/contrib/linux-kernel
    git checkout v1.5.2-kernel
    LINUX=/path/to/linux/repo make import

  Additionally, this includes several minor typo fixes, which have all
  been fixed upstream so they are maintained on the next import"

Link: https://github.com/facebook/zstd/releases/tag/v1.5.2 [0]
Link: https://github.com/facebook/zstd/tree/v1.5.2-kernel [1]
Link: https://lore.kernel.org/lkml/20221024202606.404049-1-nickrterrell@gmail.com/
Link: 637a642f5c

* tag 'zstd-linus-v6.2' of https://github.com/terrelln/linux:
  zstd: import usptream v1.5.2
  zstd: Move zstd-common module exports to zstd_common_module.c
  lib: zstd: Fix comment typo
  lib: zstd: fix repeated words in comments
  MAINTAINERS: git://github -> https://github.com for terrelln
  lib: zstd: clean up double word in comment.
2022-12-19 12:26:03 -06:00
Linus Torvalds
a6e3e6f138 Some fault-injection improvements from Wei Yongjun which enable stacktrace
filtering on x86_64.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY56YjgAKCRDdBJ7gKXxA
 jni0AQCXYCZE1pOVXnB+IA1G1xkM0f1xDS/D63uJVl7Lyurv5QEAvJWX25DsTbLR
 c0bq3y2PPpHzrcDyPhciVlY/iplHQQM=
 =tfs8
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-12-17-20-32' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull fault-injection updates from Andrew Morton:
 "Some fault-injection improvements from Wei Yongjun which enable
  stacktrace filtering on x86_64"

* tag 'mm-nonmm-stable-2022-12-17-20-32' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  fault-injection: make stacktrace filter works as expected
  fault-injection: make some stack filter attrs more readable
  fault-injection: skip stacktrace filtering by default
  fault-injection: allow stacktrace filter for x86-64
2022-12-19 07:03:44 -06:00
Linus Torvalds
1ea9d333ba - A few late-breaking minor fixups
- Two minor feature patches which were awkwardly dependent on mm-nonmm.
   I need to set up a new branch to handle such things.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY56V1wAKCRDdBJ7gKXxA
 juVQAP9pr5XBx880RJEil6skMCxYJmae8LvYShhvxJi9keot7QEA3wZRlGcllw/3
 fiHcsaBlXqtXBWUbtnMezcdP6gb3TQo=
 =8T1p
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:

 - A few late-breaking minor fixups

 - Two minor feature patches which were awkwardly dependent on mm-nonmm.
   I need to set up a new branch to handle such things.

* tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS: zram: zsmalloc: Add an additional co-maintainer
  mm/kmemleak: use %pK to display kernel pointers in backtrace
  mm: use stack_depot for recording kmemleak's backtrace
  maple_tree: update copyright dates for test code
  maple_tree: fix mas_find_rev() comment
  mm/gup_test: free memory allocated via kvcalloc() using kvfree()
2022-12-19 06:58:57 -06:00
Linus Torvalds
71a7507afb Driver Core changes for 6.2-rc1
Here is the set of driver core and kernfs changes for 6.2-rc1.
 
 The "big" change in here is the addition of a new macro,
 container_of_const() that will preserve the "const-ness" of a pointer
 passed into it.
 
 The "problem" of the current container_of() macro is that if you pass in
 a "const *", out of it can comes a non-const pointer unless you
 specifically ask for it.  For many usages, we want to preserve the
 "const" attribute by using the same call.  For a specific example, this
 series changes the kobj_to_dev() macro to use it, allowing it to be used
 no matter what the const value is.  This prevents every subsystem from
 having to declare 2 different individual macros (i.e.
 kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
 the const value at build time, which having 2 macros would not do
 either.
 
 The driver for all of this have been discussions with the Rust kernel
 developers as to how to properly mark driver core, and kobject, objects
 as being "non-mutable".  The changes to the kobject and driver core in
 this pull request are the result of that, as there are lots of paths
 where kobjects and device pointers are not modified at all, so marking
 them as "const" allows the compiler to enforce this.
 
 So, a nice side affect of the Rust development effort has been already
 to clean up the driver core code to be more obvious about object rules.
 
 All of this has been bike-shedded in quite a lot of detail on lkml with
 different names and implementations resulting in the tiny version we
 have in here, much better than my original proposal.  Lots of subsystem
 maintainers have acked the changes as well.
 
 Other than this change, included in here are smaller stuff like:
   - kernfs fixes and updates to handle lock contention better
   - vmlinux.lds.h fixes and updates
   - sysfs and debugfs documentation updates
   - device property updates
 
 All of these have been in the linux-next tree for quite a while with no
 problems, OTHER than some merge issues with other trees that should be
 obvious when you hit them (block tree deletes a driver that this tree
 modifies, iommufd tree modifies code that this tree also touches).  If
 there are merge problems with these trees, please let me know.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wz3A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yks0ACeKYUlVgCsER8eYW+x18szFa2QTXgAn2h/VhZe
 1Fp53boFaQkGBjl8mGF8
 =v+FB
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of driver core and kernfs changes for 6.2-rc1.

  The "big" change in here is the addition of a new macro,
  container_of_const() that will preserve the "const-ness" of a pointer
  passed into it.

  The "problem" of the current container_of() macro is that if you pass
  in a "const *", out of it can comes a non-const pointer unless you
  specifically ask for it. For many usages, we want to preserve the
  "const" attribute by using the same call. For a specific example, this
  series changes the kobj_to_dev() macro to use it, allowing it to be
  used no matter what the const value is. This prevents every subsystem
  from having to declare 2 different individual macros (i.e.
  kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
  the const value at build time, which having 2 macros would not do
  either.

  The driver for all of this have been discussions with the Rust kernel
  developers as to how to properly mark driver core, and kobject,
  objects as being "non-mutable". The changes to the kobject and driver
  core in this pull request are the result of that, as there are lots of
  paths where kobjects and device pointers are not modified at all, so
  marking them as "const" allows the compiler to enforce this.

  So, a nice side affect of the Rust development effort has been already
  to clean up the driver core code to be more obvious about object
  rules.

  All of this has been bike-shedded in quite a lot of detail on lkml
  with different names and implementations resulting in the tiny version
  we have in here, much better than my original proposal. Lots of
  subsystem maintainers have acked the changes as well.

  Other than this change, included in here are smaller stuff like:

   - kernfs fixes and updates to handle lock contention better

   - vmlinux.lds.h fixes and updates

   - sysfs and debugfs documentation updates

   - device property updates

  All of these have been in the linux-next tree for quite a while with
  no problems"

* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
  device property: Fix documentation for fwnode_get_next_parent()
  firmware_loader: fix up to_fw_sysfs() to preserve const
  usb.h: take advantage of container_of_const()
  device.h: move kobj_to_dev() to use container_of_const()
  container_of: add container_of_const() that preserves const-ness of the pointer
  driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
  driver core: fix up missed scsi/cxlflash class.devnode() conversion.
  driver core: fix up some missing class.devnode() conversions.
  driver core: make struct class.devnode() take a const *
  driver core: make struct class.dev_uevent() take a const *
  cacheinfo: Remove of_node_put() for fw_token
  device property: Add a blank line in Kconfig of tests
  device property: Rename goto label to be more precise
  device property: Move PROPERTY_ENTRY_BOOL() a bit down
  device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
  kernfs: fix all kernel-doc warnings and multiple typos
  driver core: pass a const * into of_device_uevent()
  kobject: kset_uevent_ops: make name() callback take a const *
  kobject: kset_uevent_ops: make filter() callback take a const *
  kobject: make kobject_namespace take a const *
  ...
2022-12-16 03:54:54 -08:00
Linus Torvalds
ba54ff1fb6 Char/Misc driver changes for 6.2-rc1
Here is the large set of char/misc and other driver subsystem changes
 for 6.2-rc1.  Nothing earth-shattering in here at all, just a lot of new
 driver development and minor fixes.  Highlights include:
  - fastrpc driver updates
  - iio new drivers and updates
  - habanalabs driver updates for new hardware and features
  - slimbus driver updates
  - speakup module parameters added to aid in boot time configuration
  - i2c probe_new conversions for lots of different drivers
  - other small driver fixes and additions
 
 One semi-interesting change in here is the increase of the number of
 misc dynamic minors available to 1048448 to handle new huge-cpu systems.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wrdw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykSDgCdHjUHS62/UnKdB9rLtyAOFxS/6DgAn2X4Unf8
 RN8Mn2mUIiBzyu5p+Zc7
 =tK3S
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the large set of char/misc and other driver subsystem changes
  for 6.2-rc1. Nothing earth-shattering in here at all, just a lot of
  new driver development and minor fixes.

  Highlights include:

   - fastrpc driver updates

   - iio new drivers and updates

   - habanalabs driver updates for new hardware and features

   - slimbus driver updates

   - speakup module parameters added to aid in boot time configuration

   - i2c probe_new conversions for lots of different drivers

   - other small driver fixes and additions

  One semi-interesting change in here is the increase of the number of
  misc dynamic minors available to 1048448 to handle new huge-cpu
  systems.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (521 commits)
  extcon: usbc-tusb320: Convert to i2c's .probe_new()
  extcon: rt8973: Convert to i2c's .probe_new()
  extcon: fsa9480: Convert to i2c's .probe_new()
  extcon: max77843: Replace irqchip mask_invert with unmask_base
  chardev: fix error handling in cdev_device_add()
  mcb: mcb-parse: fix error handing in chameleon_parse_gdd()
  drivers: mcb: fix resource leak in mcb_probe()
  coresight: etm4x: fix repeated words in comments
  coresight: cti: Fix null pointer error on CTI init before ETM
  coresight: trbe: remove cpuhp instance node before remove cpuhp state
  counter: stm32-lptimer-cnt: fix the check on arr and cmp registers update
  misc: fastrpc: Add dma_mask to fastrpc_channel_ctx
  misc: fastrpc: Add mmap request assigning for static PD pool
  misc: fastrpc: Safekeep mmaps on interrupted invoke
  misc: fastrpc: Add support for audiopd
  misc: fastrpc: Rework fastrpc_req_munmap
  misc: fastrpc: Use fastrpc_map_put in fastrpc_map_create on fail
  misc: fastrpc: Add fastrpc_remote_heap_alloc
  misc: fastrpc: Add reserved mem support
  misc: fastrpc: Rename audio protection domain to root
  ...
2022-12-16 03:49:24 -08:00
Wei Yongjun
f9eeef5918 fault-injection: make stacktrace filter works as expected
stacktrace filter is checked after others, such as fail-nth, interval and
probability.  This make it doesn't work well as expected.

Fix to running stacktrace filter before other filters.  It will speed up
fault inject testing for driver modules.

Link: https://lkml.kernel.org/r/20220817080332.1052710-5-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Isabella Basso <isabbasso@riseup.net>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:40:44 -08:00
Wei Yongjun
0199907474 fault-injection: make some stack filter attrs more readable
Attributes of stack filter are show as unsigned decimal, such as
'require-start', 'require-end'.  This patch change to show them as
unsigned hexadecimal for more readable.

Before:
  $ echo 0xffffffffc0257000 > /sys/kernel/debug/failslab/require-start
  $ cat /sys/kernel/debug/failslab/require-start
  18446744072638263296

After:
  $ echo 0xffffffffc0257000 > /sys/kernel/debug/failslab/require-start
  $ cat /sys/kernel/debug/failslab/require-start
  0xffffffffc0257000

[wangyufen@huawei.com: use debugfs_create_xul() instead of debugfs_create_xl()]
  Link: https://lkml.kernel.org/r/1664331299-4976-1-git-send-email-wangyufen@huawei.com
Link: https://lkml.kernel.org/r/20220817080332.1052710-4-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Isabella Basso <isabbasso@riseup.net>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:40:44 -08:00
Wei Yongjun
4acb9e5139 fault-injection: skip stacktrace filtering by default
If FAULT_INJECTION_STACKTRACE_FILTER is enabled, the depth is default to
32.  This means fail_stacktrace() will iter each entry's stacktrace, even
if filter is not configured.

This patch changes to quick return from fail_stacktrace() if stacktrace
filter is not set.

Link: https://lkml.kernel.org/r/20220817080332.1052710-3-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Isabella Basso <isabbasso@riseup.net>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:40:43 -08:00
Wei Yongjun
a7ebbbb159 fault-injection: allow stacktrace filter for x86-64
This patchset allow fault injection to run on x86_64 and makes stacktrace
filter work as expected.  With this, we can test a device driver module
with fault injection more easily.


This patch (of 4):

FAULT_INJECTION_STACKTRACE_FILTER option was apparently disallowed on
x86_64 because of problems with the stack unwinder:

    commit 6d690dcac9
    Author: Akinobu Mita <akinobu.mita@gmail.com>
    Date:   Sat May 12 10:36:53 2007 -0700

        fault injection: disable stacktrace filter for x86-64

However, there is no problems whatsoever with this today. Let's allow
it again.

Link: https://lkml.kernel.org/r/20220817080332.1052710-1-weiyongjun1@huawei.com
Link: https://lkml.kernel.org/r/20220817080332.1052710-2-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Isabella Basso <isabbasso@riseup.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:40:43 -08:00
Zhaoyang Huang
56a61617dd mm: use stack_depot for recording kmemleak's backtrace
Using stack_depot to record kmemleak's backtrace which has been
implemented on slub for reducing redundant information.

[akpm@linux-foundation.org: fix build - remove now-unused __save_stack_trace()]
[zhaoyang.huang@unisoc.com: v3]
  Link: https://lkml.kernel.org/r/1667101354-4669-1-git-send-email-zhaoyang.huang@unisoc.com
[akpm@linux-foundation.org: fix v3 layout oddities]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/1666864224-27541-1-git-send-email-zhaoyang.huang@unisoc.com
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <huangzhaoyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:37:49 -08:00
Liam Howlett
d98c86b9f7 maple_tree: fix mas_find_rev() comment
mas_find_rev() uses mas_prev_entry(), not mas_next_entry(), correct comment.

Link: https://lkml.kernel.org/r/20221025173756.2719616-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:37:49 -08:00
Linus Torvalds
94a855111e - Add the call depth tracking mitigation for Retbleed which has
been long in the making. It is a lighterweight software-only fix for
 Skylake-based cores where enabling IBRS is a big hammer and causes a
 significant performance impact.
 
 What it basically does is, it aligns all kernel functions to 16 bytes
 boundary and adds a 16-byte padding before the function, objtool
 collects all functions' locations and when the mitigation gets applied,
 it patches a call accounting thunk which is used to track the call depth
 of the stack at any time.
 
 When that call depth reaches a magical, microarchitecture-specific value
 for the Return Stack Buffer, the code stuffs that RSB and avoids its
 underflow which could otherwise lead to the Intel variant of Retbleed.
 
 This software-only solution brings a lot of the lost performance back,
 as benchmarks suggest:
 
   https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
 
 That page above also contains a lot more detailed explanation of the
 whole mechanism
 
 - Implement a new control flow integrity scheme called FineIBT which is
 based on the software kCFI implementation and uses hardware IBT support
 where present to annotate and track indirect branches using a hash to
 validate them
 
 - Other misc fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
 VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
 VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
 gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
 iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
 Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
 ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
 ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
 73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
 i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
 Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
 zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
 =cRy1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Add the call depth tracking mitigation for Retbleed which has been
   long in the making. It is a lighterweight software-only fix for
   Skylake-based cores where enabling IBRS is a big hammer and causes a
   significant performance impact.

   What it basically does is, it aligns all kernel functions to 16 bytes
   boundary and adds a 16-byte padding before the function, objtool
   collects all functions' locations and when the mitigation gets
   applied, it patches a call accounting thunk which is used to track
   the call depth of the stack at any time.

   When that call depth reaches a magical, microarchitecture-specific
   value for the Return Stack Buffer, the code stuffs that RSB and
   avoids its underflow which could otherwise lead to the Intel variant
   of Retbleed.

   This software-only solution brings a lot of the lost performance
   back, as benchmarks suggest:

       https://lore.kernel.org/all/20220915111039.092790446@infradead.org/

   That page above also contains a lot more detailed explanation of the
   whole mechanism

 - Implement a new control flow integrity scheme called FineIBT which is
   based on the software kCFI implementation and uses hardware IBT
   support where present to annotate and track indirect branches using a
   hash to validate them

 - Other misc fixes and cleanups

* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
  x86/paravirt: Use common macro for creating simple asm paravirt functions
  x86/paravirt: Remove clobber bitmask from .parainstructions
  x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
  x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
  x86/Kconfig: Enable kernel IBT by default
  x86,pm: Force out-of-line memcpy()
  objtool: Fix weak hole vs prefix symbol
  objtool: Optimize elf_dirty_reloc_sym()
  x86/cfi: Add boot time hash randomization
  x86/cfi: Boot time selection of CFI scheme
  x86/ibt: Implement FineIBT
  objtool: Add --cfi to generate the .cfi_sites section
  x86: Add prefix symbols for function padding
  objtool: Add option to generate prefix symbols
  objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  objtool: Slice up elf_create_section_symbol()
  kallsyms: Revert "Take callthunks into account"
  x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
  x86/retpoline: Fix crash printing warning
  x86/paravirt: Fix a !PARAVIRT build warning
  ...
2022-12-14 15:03:00 -08:00
Linus Torvalds
64e7003c6b This update includes the following changes:
API:
 
 - Optimise away self-test overhead when they are disabled.
 - Support symmetric encryption via keyring keys in af_alg.
 - Flip hwrng default_quality, the default is now maximum entropy.
 
 Algorithms:
 
 - Add library version of aesgcm.
 - CFI fixes for assembly code.
 - Add arm/arm64 accelerated versions of sm3/sm4.
 
 Drivers:
 
 - Remove assumption on arm64 that kmalloc is DMA-aligned.
 - Fix selftest failures in rockchip.
 - Add support for RK3328/RK3399 in rockchip.
 - Add deflate support in qat.
 - Merge ux500 into stm32.
 - Add support for TEE for PCI ID 0x14CA in ccp.
 - Add mt7986 support in mtk.
 - Add MaxLinear platform support in inside-secure.
 - Add NPCM8XX support in npcm.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmOZhNQACgkQxycdCkmx
 i6edOQ/+IHYe2Z+fLsMGs0qgTVaEV33O0crTRl/PMkfBJai57grz6x/G9QrkwGHS
 084u4RmwhVrE7Z/pxvey48m0lHMw3H/ElLTRl5LV1zE2OtGgr4VV63wtqthu1QS1
 KblVnjb52DhFhvF1O1IrK9lxyX0lByOiARFVdyZR6+Rb66Xfq8rqk5t8U8mmTUFz
 ds9S2Un4HajgtjNEyI78DOX8o4wVST8tltQs0eVii6T9AeXgSgX37ytD7Xtg/zrz
 /p61KFgKBQkRT7EEGD6xgNrND0vNAp2w98ZTTRXTZI8+Y0aTUcTYya7cXOLBt9bQ
 rA7z9sNKvmwJijTMV6O9eqRGcYfzc2G4qfMhlQqj/P2pjLnEZXdvFNHTTbclR76h
 2UFlZXPDQVQukvnNNnB6bmIvv6DsM+jmGH0pK5BnBJXnD5SOZh1RqjJxw0Kj6QCM
 VxpKDvfStux2Guh6mz1lJna/S44qKy/sVYkWUawcmE4RF2+GfNayM1GUpEUofndE
 vz1yZdgLPETSh5QzKrjFkUAnqo/AsAdc5Qxroz9DRz1BCC0GCuIxjUG8ScTWgcth
 R/reQDczBckCNpPxrWPHHYoVXnAMwEFySfcjZyuCoMO6t6qVUvcjRShCyKwO/JPl
 9YREdRmq0swwIB9cFIrEoWrzc3wjjBtsltDFlkKsa9c92LXoW+g=
 =OpWt
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Optimise away self-test overhead when they are disabled
   - Support symmetric encryption via keyring keys in af_alg
   - Flip hwrng default_quality, the default is now maximum entropy

  Algorithms:
   - Add library version of aesgcm
   - CFI fixes for assembly code
   - Add arm/arm64 accelerated versions of sm3/sm4

  Drivers:
   - Remove assumption on arm64 that kmalloc is DMA-aligned
   - Fix selftest failures in rockchip
   - Add support for RK3328/RK3399 in rockchip
   - Add deflate support in qat
   - Merge ux500 into stm32
   - Add support for TEE for PCI ID 0x14CA in ccp
   - Add mt7986 support in mtk
   - Add MaxLinear platform support in inside-secure
   - Add NPCM8XX support in npcm"

* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
  crypto: ux500/cryp - delete driver
  crypto: stm32/cryp - enable for use with Ux500
  crypto: stm32 - enable drivers to be used on Ux500
  dt-bindings: crypto: Let STM32 define Ux500 CRYP
  hwrng: geode - Fix PCI device refcount leak
  hwrng: amd - Fix PCI device refcount leak
  crypto: qce - Set DMA alignment explicitly
  crypto: octeontx2 - Set DMA alignment explicitly
  crypto: octeontx - Set DMA alignment explicitly
  crypto: keembay - Set DMA alignment explicitly
  crypto: safexcel - Set DMA alignment explicitly
  crypto: hisilicon/hpre - Set DMA alignment explicitly
  crypto: chelsio - Set DMA alignment explicitly
  crypto: ccree - Set DMA alignment explicitly
  crypto: ccp - Set DMA alignment explicitly
  crypto: cavium - Set DMA alignment explicitly
  crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
  crypto: arm64/ghash-ce - use frame_push/pop macros consistently
  crypto: arm64/crct10dif - use frame_push/pop macros consistently
  crypto: arm64/aes-modes - use frame_push/pop macros consistently
  ...
2022-12-14 12:31:09 -08:00
Linus Torvalds
48ea09cdda hardening updates for v6.2-rc1
- Convert flexible array members, fix -Wstringop-overflow warnings,
   and fix KCFI function type mismatches that went ignored by
   maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
 
 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
   add more __alloc_size attributes, and introduce full testing
   of all allocator functions. Finally remove the ksize() side-effect
   so that each allocation-aware checker can finally behave without
   exceptions.
 
 - Introduce oops_limit (default 10,000) and warn_limit (default off)
   to provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook).
 
 - Introduce overflows_type() and castable_to_type() helpers for
   cleaner overflow checking.
 
 - Improve code generation for strscpy() and update str*() kern-doc.
 
 - Convert strscpy and sigphash tests to KUnit, and expand memcpy
   tests.
 
 - Always use a non-NULL argument for prepare_kernel_cred().
 
 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
 
 - Adjust orphan linker section checking to respect CONFIG_WERROR
   (Xin Li).
 
 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
 
 - Fix um vs FORTIFY warnings for always-NULL arguments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
 AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
 F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
 I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
 yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
 d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
 rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
 oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
 0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
 ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
 jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
 AHXcibguPRQBPAdiPQ==
 =yaaN
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening updates from Kees Cook:

 - Convert flexible array members, fix -Wstringop-overflow warnings, and
   fix KCFI function type mismatches that went ignored by maintainers
   (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)

 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
   more __alloc_size attributes, and introduce full testing of all
   allocator functions. Finally remove the ksize() side-effect so that
   each allocation-aware checker can finally behave without exceptions

 - Introduce oops_limit (default 10,000) and warn_limit (default off) to
   provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook)

 - Introduce overflows_type() and castable_to_type() helpers for cleaner
   overflow checking

 - Improve code generation for strscpy() and update str*() kern-doc

 - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests

 - Always use a non-NULL argument for prepare_kernel_cred()

 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)

 - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
   Li)

 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)

 - Fix um vs FORTIFY warnings for always-NULL arguments

* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
  ksmbd: replace one-element arrays with flexible-array members
  hpet: Replace one-element array with flexible-array member
  um: virt-pci: Avoid GCC non-NULL warning
  signal: Initialize the info in ksignal
  lib: fortify_kunit: build without structleak plugin
  panic: Expose "warn_count" to sysfs
  panic: Introduce warn_limit
  panic: Consolidate open-coded panic_on_warn checks
  exit: Allow oops_limit to be disabled
  exit: Expose "oops_count" to sysfs
  exit: Put an upper limit on how often we can oops
  panic: Separate sysctl logic from CONFIG_SMP
  mm/pgtable: Fix multiple -Wstringop-overflow warnings
  mm: Make ksize() a reporting-only function
  kunit/fortify: Validate __alloc_size attribute results
  drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
  drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
  driver core: Add __alloc_size hint to devm allocators
  overflow: Introduce overflows_type() and castable_to_type()
  coredump: Proactively round up to kmalloc bucket size
  ...
2022-12-14 12:20:00 -08:00
Linus Torvalds
08cdc21579 iommufd for 6.2
iommufd is the user API to control the IOMMU subsystem as it relates to
 managing IO page tables that point at user space memory.
 
 It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
 container) which is the VFIO specific interface for a similar idea.
 
 We see a broad need for extended features, some being highly IOMMU device
 specific:
  - Binding iommu_domain's to PASID/SSID
  - Userspace IO page tables, for ARM, x86 and S390
  - Kernel bypassed invalidation of user page tables
  - Re-use of the KVM page table in the IOMMU
  - Dirty page tracking in the IOMMU
  - Runtime Increase/Decrease of IOPTE size
  - PRI support with faults resolved in userspace
 
 Many of these HW features exist to support VM use cases - for instance the
 combination of PASID, PRI and Userspace IO Page Tables allows an
 implementation of DMA Shared Virtual Addressing (vSVA) within a
 guest. Dirty tracking enables VM live migration with SRIOV devices and
 PASID support allow creating "scalable IOV" devices, among other things.
 
 As these features are fundamental to a VM platform they need to be
 uniformly exposed to all the driver families that do DMA into VMs, which
 is currently VFIO and VDPA.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF
 YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl
 s7fAu+CQ1pr9+9NKGevD+frw8Solsw4=
 =jJkd
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd implementation from Jason Gunthorpe:
 "iommufd is the user API to control the IOMMU subsystem as it relates
  to managing IO page tables that point at user space memory.

  It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
  container) which is the VFIO specific interface for a similar idea.

  We see a broad need for extended features, some being highly IOMMU
  device specific:
   - Binding iommu_domain's to PASID/SSID
   - Userspace IO page tables, for ARM, x86 and S390
   - Kernel bypassed invalidation of user page tables
   - Re-use of the KVM page table in the IOMMU
   - Dirty page tracking in the IOMMU
   - Runtime Increase/Decrease of IOPTE size
   - PRI support with faults resolved in userspace

  Many of these HW features exist to support VM use cases - for instance
  the combination of PASID, PRI and Userspace IO Page Tables allows an
  implementation of DMA Shared Virtual Addressing (vSVA) within a guest.
  Dirty tracking enables VM live migration with SRIOV devices and PASID
  support allow creating "scalable IOV" devices, among other things.

  As these features are fundamental to a VM platform they need to be
  uniformly exposed to all the driver families that do DMA into VMs,
  which is currently VFIO and VDPA"

For more background, see the extended explanations in Jason's pull request:

  https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
  iommufd: Change the order of MSI setup
  iommufd: Improve a few unclear bits of code
  iommufd: Fix comment typos
  vfio: Move vfio group specific code into group.c
  vfio: Refactor dma APIs for emulated devices
  vfio: Wrap vfio group module init/clean code into helpers
  vfio: Refactor vfio_device open and close
  vfio: Make vfio_device_open() truly device specific
  vfio: Swap order of vfio_device_container_register() and open_device()
  vfio: Set device->group in helper function
  vfio: Create wrappers for group register/unregister
  vfio: Move the sanity check of the group to vfio_create_group()
  vfio: Simplify vfio_create_group()
  iommufd: Allow iommufd to supply /dev/vfio/vfio
  vfio: Make vfio_container optionally compiled
  vfio: Move container related MODULE_ALIAS statements into container.c
  vfio-iommufd: Support iommufd for emulated VFIO devices
  vfio-iommufd: Support iommufd for physical VFIO devices
  vfio-iommufd: Allow iommufd to be used in place of a container fd
  vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
  ...
2022-12-14 09:15:43 -08:00
Linus Torvalds
e2ca6ba6ba MM patches for 6.2-rc1.
- More userfaultfs work from Peter Xu.
 
 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying.
 
 - Some filemap cleanups from Vishal Moola.
 
 - David Hildenbrand added the ability to selftest anon memory COW handling.
 
 - Some cpuset simplifications from Liu Shixin.
 
 - Addition of vmalloc tracing support by Uladzislau Rezki.
 
 - Some pagecache folioifications and simplifications from Matthew Wilcox.
 
 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it.
 
 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.  This series shold have been in the
   non-MM tree, my bad.
 
 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages.
 
 - DAMON cleanups and tuneups from SeongJae Park
 
 - Tony Luck fixed the handling of COW faults against poisoned pages.
 
 - Peter Xu utilized the PTE marker code for handling swapin errors.
 
 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient.
 
 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand.
 
 - zram support for multiple compression streams from Sergey Senozhatsky.
 
 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway.
 
 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations.
 
 - Vishal Moola removed the try_to_release_page() wrapper.
 
 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache.
 
 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking.
 
 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend.
 
 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range().
 
 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen.
 
 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests.  Better, but still not perfect.
 
 - Christoph Hellwig has removed the .writepage() method from several
   filesystems.  They only need .writepages().
 
 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting.
 
 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines.
 
 - Many singleton patches, as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5j6ZwAKCRDdBJ7gKXxA
 jkDYAP9qNeVqp9iuHjZNTqzMXkfmJPsw2kmy2P+VdzYVuQRcJgEAgoV9d7oMq4ml
 CodAgiA51qwzId3GRytIo/tfWZSezgA=
 =d19R
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - More userfaultfs work from Peter Xu

 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying

 - Some filemap cleanups from Vishal Moola

 - David Hildenbrand added the ability to selftest anon memory COW
   handling

 - Some cpuset simplifications from Liu Shixin

 - Addition of vmalloc tracing support by Uladzislau Rezki

 - Some pagecache folioifications and simplifications from Matthew
   Wilcox

 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
   it

 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.

   This series should have been in the non-MM tree, my bad

 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages

 - DAMON cleanups and tuneups from SeongJae Park

 - Tony Luck fixed the handling of COW faults against poisoned pages

 - Peter Xu utilized the PTE marker code for handling swapin errors

 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient

 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand

 - zram support for multiple compression streams from Sergey Senozhatsky

 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway

 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations

 - Vishal Moola removed the try_to_release_page() wrapper

 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache

 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking

 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend

 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range()

 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen

 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests. Better, but still not perfect

 - Christoph Hellwig has removed the .writepage() method from several
   filesystems. They only need .writepages()

 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting

 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines

 - Many singleton patches, as usual

* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
  mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
  mm: mmu_gather: allow more than one batch of delayed rmaps
  mm: fix typo in struct pglist_data code comment
  kmsan: fix memcpy tests
  mm: add cond_resched() in swapin_walk_pmd_entry()
  mm: do not show fs mm pc for VM_LOCKONFAULT pages
  selftests/vm: ksm_functional_tests: fixes for 32bit
  selftests/vm: cow: fix compile warning on 32bit
  selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
  mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
  mm,thp,rmap: fix races between updates of subpages_mapcount
  mm: memcg: fix swapcached stat accounting
  mm: add nodes= arg to memory.reclaim
  mm: disable top-tier fallback to reclaim on proactive reclaim
  selftests: cgroup: make sure reclaim target memcg is unprotected
  selftests: cgroup: refactor proactive reclaim code to reclaim_until()
  mm: memcg: fix stale protection of reclaim target memcg
  mm/mmap: properly unaccount memory on mas_preallocate() failure
  omfs: remove ->writepage
  jfs: remove ->writepage
  ...
2022-12-13 19:29:45 -08:00
Nick Terrell
70d822cfb7 Merge branch 'zstd-next' into zstd-linus 2022-12-13 16:24:40 -08:00