Commit Graph

9009 Commits

Author SHA1 Message Date
Tanzir Hasan 38b9baf194 lib/string: shrink lib/string.i via IWYU
This diff uses an open source tool include-what-you-use (IWYU) to modify
the include list, changing indirect includes to direct includes. IWYU is
implemented using the IWYUScripts github repository which is a tool that
is currently undergoing development. These changes seek to improve build
times.

This change to lib/string.c resulted in a preprocessed size of
lib/string.i from 26371 lines to 5321 lines (-80%) for the x86
defconfig.

Link: https://github.com/ClangBuiltLinux/IWYUScripts
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tanzir Hasan <tanzirh@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20231226-libstringheader-v6-2-80aa08c7652c@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-01 09:47:59 -08:00
Yury Norov c1f5204efc cpumask: add cpumask_weight_andnot()
Similarly to cpumask_weight_and(), cpumask_weight_andnot() is a handy
helper that may help to avoid creating an intermediate mask just to
calculate number of bits that set in a 1st given mask, and clear in 2nd
one.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01 13:06:40 +01:00
Philipp Stanner 7626913652 pci_iounmap(): Fix MMIO mapping leak
The #ifdef ARCH_HAS_GENERIC_IOPORT_MAP accidentally also guards iounmap(),
which means MMIO mappings are leaked.

Move the guard so we call iounmap() for MMIO mappings.

Fixes: 316e8d79a0 ("pci_iounmap'2: Electric Boogaloo: try to make sense of it all")
Link: https://lore.kernel.org/r/20240131090023.12331-2-pstanner@redhat.com
Reported-by: Danilo Krummrich <dakr@redhat.com>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: <stable@vger.kernel.org> # v5.15+
2024-01-31 14:40:40 -06:00
Linus Torvalds 2a6526c4f3 linux_kselftest-kunit-fixes-6.8-rc3
This kunit fixes update for Linux 6.8-rc3 consists of NULL vs IS_ERR()
 bug fixes, documentation update, MAINTAINERS file update to add
 Rae Moar as a reviewer, and a fix to run test suites only after module
 initialization completes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmW5dOAACgkQCwJExA0N
 QxwvoQ/+LoNEP+yu0C/KSWdNsDCMJizbwoCtoImETVdxxxSXOLKlh6cVtFfZTenN
 lLMX7zfrdIZEKIqIPVa4URq2+M9cHLuFkA8z2h5a8wOLSYxR2EyFq63o0K0NJwuC
 eSA7zgf/YkfEp7AKdkmefCqyn3bB3910KBlComqdtFtc/d/doZkeo2zCpJ5WXPzr
 x0aUGPceUa59u44JCgb1JWjQnA50ST+DYFHbfUEQJs37XfkswaseyCBKzWFAZnzR
 cVFHBVTZRceMrVDaXfSlXLr7FFgh/bQKF9BDTNBOEg9b1TVZmVH5sY5qnXsBTJ86
 nCf4J8bIzBOvLkDFGQVRs+6edvSli5wXWfM9O8hzGEbR+xs8GOrlHitM74hyhX7z
 iq6QZl0FjrGfc98DKTftwO7vDKZcFd2bhpw9ZV1cO4YzYnZVRcWwWYxPO+wS7r3q
 b+vcypxBmBhXDxxNJaJttxTcblNrxuLY1r7oJZaroE2ioKMY3uEqIA2hLagzPp9E
 /q6968b2tvFQrveFT3MJwRAAIq175xBRlNcpF8UkhnGXgZMF4er/teMN8coPeK4q
 N//sxYobNVIJYnz6npIQB6wSobWrc0Qq4AD6YTEYkhbaXiGo3VfdrjwrxTNxryBg
 wWAXvrSGUqWJXZqnSdFpmpdcOAo9r0Pf0NsByW5jpLiObIGQN5U=
 =BWfz
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "NULL vs IS_ERR() bug fixes, documentation update, MAINTAINERS file
  update to add Rae Moar as a reviewer, and a fix to run test suites
  only after module initialization completes"

* tag 'linux_kselftest-kunit-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  Documentation: KUnit: Update the instructions on how to test static functions
  kunit: run test suites only after module initialization completes
  MAINTAINERS: kunit: Add Rae Moar as a reviewer
  kunit: device: Fix a NULL vs IS_ERR() check in init()
  kunit: Fix a NULL vs IS_ERR() bug
2024-01-30 15:12:58 -08:00
Lukas Bulwahn 5c49b6a4a4 vt: remove superfluous CONFIG_HW_CONSOLE
The config HW_CONSOLE is always identical to the config VT and is not
visible in the kernel's build menuconfig. So, CONFIG_HW_CONSOLE is
redundant.

Replace all references to CONFIG_HW_CONSOLE with CONFIG_VT and remove
CONFIG_HW_CONSOLE.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20240108134102.601-1-lukas.bulwahn@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-27 19:03:51 -08:00
Marco Elver 4434a56ec2 stackdepot: make fast paths lock-less again
With the introduction of the pool_rwlock (reader-writer lock), several
fast paths end up taking the pool_rwlock as readers.  Furthermore,
stack_depot_put() unconditionally takes the pool_rwlock as a writer.

Despite allowing readers to make forward-progress concurrently,
reader-writer locks have inherent cache contention issues, which does not
scale well on systems with large CPU counts.

Rework the synchronization story of stack depot to again avoid taking any
locks in the fast paths.  This is done by relying on RCU-protected list
traversal, and the NMI-safe subset of RCU to delay reuse of freed stack
records.  See code comments for more details.

Along with the performance issues, this also fixes incorrect nesting of
rwlock within a raw_spinlock, given that stack depot should still be
usable from anywhere:

 | [ BUG: Invalid wait context ]
 | -----------------------------
 | swapper/0/1 is trying to lock:
 | ffffffff89869be8 (pool_rwlock){..--}-{3:3}, at: stack_depot_save_flags
 | other info that might help us debug this:
 | context-{5:5}
 | 2 locks held by swapper/0/1:
 |  #0: ffffffff89632440 (rcu_read_lock){....}-{1:3}, at: __queue_work
 |  #1: ffff888100092018 (&pool->lock){-.-.}-{2:2}, at: __queue_work  <-- raw_spin_lock

Stack depot usage stats are similar to the previous version after a KASAN
kernel boot:

 $ cat /sys/kernel/debug/stackdepot/stats
 pools: 838
 allocations: 29865
 frees: 6604
 in_use: 23261
 freelist_size: 1879

The number of pools is the same as previously.  The freelist size is
minimally larger, but this may also be due to variance across system
boots.  This shows that even though we do not eagerly wait for the next
RCU grace period (such as with synchronize_rcu() or call_rcu()) after
freeing a stack record - requiring depot_pop_free() to "poll" if an entry
may be used - new allocations are very likely to happen in later RCU grace
periods.

Link: https://lkml.kernel.org/r/20240118110216.2539519-2-elver@google.com
Fixes: 108be8def4 ("lib/stackdepot: allow users to evict stack traces")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-25 23:52:21 -08:00
Marco Elver c2a292545c stackdepot: add stats counters exported via debugfs
Add a few basic stats counters for stack depot that can be used to derive
if stack depot is working as intended.  This is a snapshot of the new
stats after booting a system with a KASAN-enabled kernel:

 $ cat /sys/kernel/debug/stackdepot/stats
 pools: 838
 allocations: 29861
 frees: 6561
 in_use: 23300
 freelist_size: 1840

Generally, "pools" should be well below the max; once the system is
booted, "in_use" should remain relatively steady.

Link: https://lkml.kernel.org/r/20240118110216.2539519-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-25 23:52:21 -08:00
Jens Axboe 2263639f96 iov_iter: streamline iovec/bvec alignment iteration
Rewrite the alignment checking iterators for iovec and bvec to be easier
to read, and also significantly more compact in terms of generated code.
This saves 270 bytes of text on x86-64 for me (with clang-18) and 224
bytes on arm64 (with gcc-13).

In profiles, also saves a bit of time as well for the same workload:

     0.81%     -0.18%  [kernel.vmlinux]  [k] iov_iter_aligned_bvec
     0.48%     -0.09%  [kernel.vmlinux]  [k] iov_iter_is_aligned

which is a nice side benefit as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/544b31f7-6d4b-42f5-a544-1420501f081f@kernel.dk
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

v2: do the other half of the iterators too, as suggested by Keith.
    This further saves some text.
2024-01-25 17:14:44 +01:00
Marcos Paulo de Souza c4bbe83d27 livepatch: Move tests from lib/livepatch to selftests/livepatch
The modules are being moved from lib/livepatch to
tools/testing/selftests/livepatch/test_modules.

This code moving will allow writing more complex tests, like for example an
userspace C code that will call a livepatched kernel function.

The modules are now built as out-of-tree
modules, but being part of the kernel source means they will be maintained.

Another advantage of the code moving is to be able to easily change,
debug and rebuild the tests by running make on the selftests/livepatch
directory, which is not currently possible since the modules on
lib/livepatch are build and installed using the "modules" target.

The current approach also keeps the ability to execute the tests manually
by executing the scripts inside selftests/livepatch directory, as it's
currently supported. If the modules are modified, they needed to be
rebuilt before running the scripts though.

The modules are built before running the selftests when using the
kselftest invocations:

	make kselftest TARGETS=livepatch
or
	make -C tools/testing/selftests/livepatch run_tests

Having the modules being built as out-of-modules requires changing the
currently used 'modprobe' by 'insmod' and adapt the test scripts that
check for the kernel message buffer.

Now it is possible to only compile the modules by running:

	make -C tools/testing/selftests/livepatch/

This way the test modules and other test program can be built in order
to be packaged if so desired.

As there aren't any modules being built on lib/livepatch, remove the
TEST_LIVEPATCH Kconfig and it's references.

Note: "make gen_tar" packages the pre-built binaries into the tarball.
       It means that it will store the test modules pre-built for
       the kernel running on the build host.

       Note that these modules need not binary compatible with
       the kernel built from the same sources. But the same
       is true for other packaged selftest binaries.

       The entire kernel sources are needed for rebuilding
       the selftests on another system.

Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 10:29:47 -07:00
Marco Pagani a1af6a2bfa kunit: run test suites only after module initialization completes
Commit 2810c1e998 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed a wild-memory-access bug that could have
happened during the loading phase of test suites built and executed as
loadable modules. However, it also introduced a problematic side effect
that causes test suites modules to crash when they attempt to register
fake devices.

When a module is loaded, it traverses the MODULE_STATE_UNFORMED and
MODULE_STATE_COMING states before reaching the normal operating state
MODULE_STATE_LIVE. Finally, when the module is removed, it moves to
MODULE_STATE_GOING before being released. However, if the loading
function load_module() fails between complete_formation() and
do_init_module(), the module goes directly from MODULE_STATE_COMING to
MODULE_STATE_GOING without passing through MODULE_STATE_LIVE.

This behavior was causing kunit_module_exit() to be called without
having first executed kunit_module_init(). Since kunit_module_exit() is
responsible for freeing the memory allocated by kunit_module_init()
through kunit_filter_suites(), this behavior was resulting in a
wild-memory-access bug.

Commit 2810c1e998 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed this issue by running the tests when the
module is still in MODULE_STATE_COMING. However, modules in that state
are not fully initialized, lacking sysfs kobjects. Therefore, if a test
module attempts to register a fake device, it will inevitably crash.

This patch proposes a different approach to fix the original
wild-memory-access bug while restoring the normal module execution flow
by making kunit_module_exit() able to detect if kunit_module_init() has
previously initialized the tests suite set. In this way, test modules
can once again register fake devices without crashing.

This behavior is achieved by checking whether mod->kunit_suites is a
virtual or direct mapping address. If it is a virtual address, then
kunit_module_init() has allocated the suite_set in kunit_filter_suites()
using kmalloc_array(). On the contrary, if mod->kunit_suites is still
pointing to the original address that was set when looking up the
.kunit_test_suites section of the module, then the loading phase has
failed and there's no memory to be freed.

v4:
- rebased on 6.8
- noted that kunit_filter_suites() must return a virtual address
v3:
- add a comment to clarify why the start address is checked
v2:
- add include <linux/mm.h>

Fixes: 2810c1e998 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Marco Pagani <marpagan@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:54 -07:00
Dan Carpenter 083974ebb8 kunit: device: Fix a NULL vs IS_ERR() check in init()
The root_device_register() function does not return NULL, it returns
error pointers.  Fix the check to match.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:22 -07:00
Dan Carpenter ed1a72fb0d kunit: Fix a NULL vs IS_ERR() bug
The kunit_device_register() function doesn't return NULL, it returns
error pointers.  Change the KUNIT_ASSERT_NOT_NULL() to check for
ERR_OR_NULL().

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:12 -07:00
Linus Torvalds e5075d8ec5 RISC-V Patches for the 6.8 Merge Window, Part 4
This includes everything from part 2:
 
 * Support for tuning for systems with fast misaligned accesses.
 * Support for SBI-based suspend.
 * Support for the new SBI debug console extension.
 * The T-Head CMOs now use PA-based flushes.
 * Support for enabling the V extension in kernel code.
 * Optimized IP checksum routines.
 * Various ftrace improvements.
 * Support for archrandom, which depends on the Zkr extension.
 
 and then also a fix for those:
 
 * The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmWsCOMTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieQND/0f+1gizTM0OzuqZG9+DOdWTtqmILyr
 sZaYXWBw6SPzbUSlwjoW4Qp/S3Ur7IhrbfttM2aMoS4GHZvSESAXOMXC4c7AnCaQ
 HOXBC2OuXvq6jA0ZjK5XPviR70A/7uD2iu5SNO1hyfJK08LSEu+AulxtkW50+wMc
 bHXSpZxEf8AtwOJK1cRtwhH4qy+Qcs3Nla3jG7OnDsPbhJVcydHx95eCtfwn2cQA
 KwJPN1fjRtm4ALZb91QcMDO8VAoanfPEkSR3DoNVE/UfdTItYk35VHmf4RWh7IWA
 qDnV5Mp/XMX2RmJqwi1ZmSHHX0rfVLL5UqgBhGHC8PuMpLJn5p9U6DZ0qD7YWxcB
 NDlrHsaXt112RHEEM/7CcLkqEexua/ezcC45E5tSQ4sRDZE3fvgbALao67xSQ22D
 lCpVAY0Z3o5oWaM/jISiQHjSNn5RrAwEYSvvv2pkW4QAMShA2eggmQaCF+Jl4EMp
 u6yqJpXxDI99C088uvM6Bi2gcX8fnBSmOzCB/sSU4a1I72UpWrGngqUpTYKHG8Jz
 cTZhbIKmQirBP0vC/UgMOS0sNuw/NykItfRXZ2g0qGKvw1TjJ6djdeZBKcAj3h0E
 fJpMxuhmeOFYE7DavnhSt3CResFTXZzXLChjxGbT+g10YzVEf9g7vBVnjxAwad9f
 tryMVpL/ipGpQQ==
 =Sjhj
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for tuning for systems with fast misaligned accesses.

 - Support for SBI-based suspend.

 - Support for the new SBI debug console extension.

 - The T-Head CMOs now use PA-based flushes.

 - Support for enabling the V extension in kernel code.

 - Optimized IP checksum routines.

 - Various ftrace improvements.

 - Support for archrandom, which depends on the Zkr extension.

 - The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.

* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
  lib: checksum: Fix build with CONFIG_NET=n
  riscv: lib: Check if output in asm goto supported
  riscv: Fix build error on rv32 + XIP
  riscv: optimize ELF relocation function in riscv
  RISC-V: Implement archrandom when Zkr is available
  riscv: Optimize hweight API with Zbb extension
  riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold
  RISC-V: selftests: cbo: Ensure asm operands match constraints
  ...
2024-01-20 11:06:04 -08:00
Linus Torvalds 57f22c8dab strlcpy removal for v6.8-rc1
- Remove of the final (very recent) user of strlcpy() (in bcachefs).
 
 - Remove the strlcpy() API. Long live strscpy().
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmWq5VgWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsHD/9xfEA1YFC4WzTuX1RcsSwZQTGL
 L8ej9NuRiQ57vJA37PEV3wyTIVHLOJDjNr+8cmL1Pu0GR9K4R2s4YzdQtaK6pFeE
 BYuXUOUK9rkQsVLL0DrTv/YryjMah0DDb/M7kKZDRfTgii0yWZ1WqEmO2+9wbdKS
 n7O9oYZreiNkFg3/6yHPYlBve9QXt+VHN/NIxSQqps3BVXRPKcIwCCJq7IiazBpR
 xo7FkhTftmL1ZqGOGRcoY7YKWt7WFg9HPBB30WXkqCIqmaFWm4sBancArVgTgQ+r
 vES/QF4SsFXkprf4fPWuQZlcChc2hibREI9o3t3Qck4FG7W+alXSpj3IxFiZqNFu
 BvNZwKW5/MB2r+CugM12JUszxAVlcwqskoilGOVD65AJ26xUYh2oAr3kpU5L6Nur
 c3zcLSlpec9sYQMdXSGQWOF2juhEp2ikceP5dw5ONcj4P7UXadPnB4hsW8ulG844
 Rh552sR0je5UCxzXNozec9X1JFZf7Z8lOjdRv1Xs549+F2rmzaZAt2eOnageCCO7
 XKoqZ/auIwzj/3WqDxivjs3xT+1PpxJd3bALDXb/iIu10DMbNq7CRwHO+1OZo1e1
 4OLE1gbM3Ldv2WgUe2o1dDURnmKq1aiYN8ThoOIVy9VTC0FOxujVXKsd//f6qpMu
 EGOypgqRBFpVd53DvQ==
 =DuKT
 -----END PGP SIGNATURE-----

Merge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull strlcpy removal from Kees Cook:
 "As promised, this is 'part 2' of the hardening tree, late in -rc1 now
  that all the other trees with strlcpy() removals have landed. One new
  user appeared (in bcachefs) but was a trivial refactor. The kernel is
  now free of the strlcpy() API!

   - Remove of the final (very recent) user of strlcpy() (in bcachefs)

   - Remove the strlcpy() API. Long live strscpy()"

* tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  string: Remove strlcpy()
  bcachefs: Replace strlcpy() with strscpy()
2024-01-19 13:49:16 -08:00
Kees Cook d26270061a string: Remove strlcpy()
With all the users of strlcpy() removed[1] from the kernel, remove the
API, self-tests, and other references. Leave mentions in Documentation
(about its deprecation), and in checkpatch.pl (to help migrate host-only
tools/ usage). Long live strscpy().

Link: https://github.com/KSPP/linux/issues/89 [1]
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-01-19 11:59:11 -08:00
Palmer Dabbelt f24a70106d
lib: checksum: Fix build with CONFIG_NET=n
The generic ipv6 checksums are only defined with CONFIG_NET=y, so gate
the test as well.

Fixes: 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and ip_fast_csum")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401192143.jLdjbIy3-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202401192357.WU4nPRdN-lkp@intel.com/
Reviewed-By: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240119145600.3093-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-19 08:12:38 -08:00
Linus Torvalds 9d1694dc91 for-6.8/block-2024-01-18
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmWpoCgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqIUEADFvJdC2izkPzYzsOrMK5Rt1H7vaHGKhbA+
 zWCuQaa1xQd8bazq+NVnQpbzgclkE/WodTCNfNXcTTjzeQEmcZC888llP3Y9vwyP
 XfEKH7fSaeKvGigJLro1oPe3YV7/t89F5ol3BoZayfzJF8GEU9BXRWzgOkZzijnk
 xdm5wUyn/GknksMuQQraZ+U6bQRFLBOulzoaQeMD6Dosx+uRlM4WvAJawC+uOV6R
 qPT2BVSfYGzmgEKvoaphw0FMkUhFBMDHfXTpQBi5tIzTKOaof8tynYEGz0FHZWeh
 V0JEEp+3jLWFxFXeEcXgBVPJPE8J0DzGm9g17/uwC2Yhmlbw4FKZVRvGG+PpeUso
 D5aqhqm3w0x7HgZ7JKwy/aUctADYvjVcSVzPHTaFK0aCSYCIAXxqv4p7fOoxPqyx
 T32IUHTzGtkCdqzv/xFdtTYhTNM2vyzzbbWj5lXgCBqHsXOVbCh8UM2p+9ec2Umq
 Fo1XF9eoCDe6Sn4s15hJ5G4DEhKGOKkHluvRUdM+0selA5b0sNOeUqlAf2v+0ve3
 Pv3e3X4NPssNIEcsDHf5pc3zGC+LXRS0oFvfIvDESBjwXc3iHIMl+SkjyS57P4Fd
 RKrHEUUiACuCKO/IWqFYLiNBNHnP3RmV5gSxIZr9QJhFSwOzP+/+4++TCdF5vdAV
 amhv+0PdCw==
 =DLW9
 -----END PGP SIGNATURE-----

Merge tag 'for-6.8/block-2024-01-18' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
      - tcp, fc, and rdma target fixes (Maurizio, Daniel, Hannes,
        Christoph)
      - discard fixes and improvements (Christoph)
      - timeout debug improvements (Keith, Max)
      - various cleanups (Daniel, Max, Giuxen)
      - trace event string fixes (Arnd)
      - shadow doorbell setup on reset fix (William)
      - a write zeroes quirk for SK Hynix (Jim)

 - MD pull request via Song:
      - Sparse warning since v6.0 (Bart)
      - /proc/mdstat regression since v6.7 (Yu Kuai)

 - Use symbolic error value (Christian)

 - IO Priority documentation update (Christian)

 - Fix for accessing queue limits without having entered the queue
   (Christoph, me)

 - Fix for loop dio support (Christoph)

 - Move null_blk off deprecated ida interface (Christophe)

 - Ensure nbd initializes full msghdr (Eric)

 - Fix for a regression with the folio conversion, which is now easier
   to hit because of an unrelated change (Matthew)

 - Remove redundant check in virtio-blk (Li)

 - Fix for a potential hang in sbitmap (Ming)

 - Fix for partial zone appending (Damien)

 - Misc changes and fixes (Bart, me, Kemeng, Dmitry)

* tag 'for-6.8/block-2024-01-18' of git://git.kernel.dk/linux: (45 commits)
  Documentation: block: ioprio: Update schedulers
  loop: fix the the direct I/O support check when used on top of block devices
  blk-mq: Remove the hctx 'run' debugfs attribute
  nbd: always initialize struct msghdr completely
  block: Fix iterating over an empty bio with bio_for_each_folio_all
  block: bio-integrity: fix kcalloc() arguments order
  virtio_blk: remove duplicate check if queue is broken in virtblk_done
  sbitmap: remove stale comment in sbq_calc_wake_batch
  block: Correct a documentation comment in blk-cgroup.c
  null_blk: Remove usage of the deprecated ida_simple_xx() API
  block: ensure we hold a queue reference when using queue limits
  blk-mq: rename blk_mq_can_use_cached_rq
  block: print symbolic error name instead of error code
  blk-mq: fix IO hang from sbitmap wakeup race
  nvmet-rdma: avoid circular locking dependency on install_queue()
  nvmet-tcp: avoid circular locking dependency on install_queue()
  nvme-pci: set doorbell config before unquiescing
  block: fix partial zone append completion handling in req_bio_endio()
  block/iocost: silence warning on 'last_period' potentially being unused
  md/raid1: Use blk_opf_t for read and write operations
  ...
2024-01-18 18:22:40 -08:00
Linus Torvalds db5ccb9eb2 cxl for v6.8
- Add support for parsing the Coherent Device Attribute Table (CDAT)
 
 - Add support for calculating a platform CXL QoS class from CDAT data
 
 - Unify the tracing of EFI CXL Events with native CXL Events.
 
 - Add Get Timestamp support
 
 - Miscellaneous cleanups and fixups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZaHVvAAKCRDfioYZHlFs
 Z3sCAQDPHSsHmj845k4lvKbWjys3eh78MKKEFyTXLQgYhOlsGAEAigQY2ZiSum52
 nwdIgpOOADNt0Iq6yXuLsmn9xvY9bAU=
 =HjCl
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The bulk of this update is support for enumerating the performance
  capabilities of CXL memory targets and connecting that to a platform
  CXL memory QoS class. Some follow-on work remains to hook up this data
  into core-mm policy, but that is saved for v6.9.

  The next significant update is unifying how CXL event records (things
  like background scrub errors) are processed between so called
  "firmware first" and native error record retrieval. The CXL driver
  handler that processes the record retrieved from the device mailbox is
  now the handler for that same record format coming from an EFI/ACPI
  notification source.

  This also contains miscellaneous feature updates, like Get Timestamp,
  and other fixups.

  Summary:

   - Add support for parsing the Coherent Device Attribute Table (CDAT)

   - Add support for calculating a platform CXL QoS class from CDAT data

   - Unify the tracing of EFI CXL Events with native CXL Events.

   - Add Get Timestamp support

   - Miscellaneous cleanups and fixups"

* tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
  cxl/core: use sysfs_emit() for attr's _show()
  cxl/pci: Register for and process CPER events
  PCI: Introduce cleanup helpers for device reference counts and locks
  acpi/ghes: Process CXL Component Events
  cxl/events: Create a CXL event union
  cxl/events: Separate UUID from event structures
  cxl/events: Remove passing a UUID to known event traces
  cxl/events: Create common event UUID defines
  cxl/events: Promote CXL event structures to a core header
  cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
  cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
  cxl: Fix device reference leak in cxl_port_perf_data_calculate()
  cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
  cxl: Introduce put_cxl_root() helper
  cxl/port: Fix missing target list lock
  cxl/port: Fix decoder initialization when nr_targets > interleave_ways
  cxl/region: fix x9 interleave typo
  cxl/trace: Pass UUID explicitly to event traces
  cxl/region: use %pap format to print resource_size_t
  cxl/region: Add dev_dbg() detail on failure to allocate HPA space
  ...
2024-01-18 16:22:43 -08:00
Palmer Dabbelt 448857ec53
Merge patch series "RISC-V: Disable DWARF5 with known broken LLVM versions"
Nathan Chancellor <nathan@kernel.org> says:

This series disables DWARF5 for LLVM versions where it is known to be
broken due to linker relaxation.

* b4-shazam-merge:
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig

Link: bbc0f99f3b
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-0-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:30 -08:00
Nathan Chancellor a4426641f0
lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
Fangrui noted that the comment around CONFIG_AS_HAS_NON_CONST_LEB128
could be made more accurate because explicit .sleb128 directives are not
emitted, only .uleb128 directives are. Rename the symbol to
CONFIG_AS_HAS_NON_CONST_ULEB128 as a result.

Further clarifications include replacing "symbol deltas" with the more
accurate "label differences", noting that this issue has been resolved
in newer binutils (2.41+), and it only occurs when a port uses RISC-V
style linker relaxation.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-3-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:28 -08:00
Nathan Chancellor ae84ff9a14
riscv: Restrict DWARF5 when building with LLVM to known working versions
LLVM prior to 18.0.0 would generate incorrect debug info for DWARF5 due
to linker relaxation, which was worked around in clang by defaulting
RISC-V to DWARF4 [1]. Unfortunately, this workaround does not work for
the kernel because the DWARF version can be independently changed from
the default in Kconfig.

Do not allow DWARF5 to be selected for RISC-V when using linker
relaxation (ld.lld >= 15.0.0) and a version of LLVM that does not have
the fixes (the integrated assembler [2] and ld.lld [3] < 18.0.0)
necessary to generate the correct debug info.

Link: bbc0f99f3b [1]
Link: 1df5ea29b4 [2]
Link: 7ffabb61a5 [3]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-2-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:27 -08:00
Charlie Jenkins 6f4c45cbcb
kunit: Add tests for csum_ipv6_magic and ip_fast_csum
Supplement existing checksum tests with tests for csum_ipv6_magic and
ip_fast_csum.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-5-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:33 -08:00
Linus Torvalds 7f5e47f785 17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZaHe5gAKCRDdBJ7gKXxA
 jrAiAQCYZQuwsNVyGJUuPD/GGQzqVUZNpWcuYwMXXAi6dO5rSAD+LDeFviun2K52
 uHCz4iRq5EwNLA+MbdHtAnQzr+e5CQ8=
 =Jjkw
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "For once not mostly MM-related.

  17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable"

* tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  userfaultfd: avoid huge_zero_page in UFFDIO_MOVE
  MAINTAINERS: add entry for shrinker
  selftests: mm: hugepage-vmemmap fails on 64K page size systems
  mm/memory_hotplug: fix memmap_on_memory sysfs value retrieval
  mailmap: switch email for Tanzir Hasan
  mailmap: add old address mappings for Randy
  kernel/crash_core.c: make __crash_hotplug_lock static
  efi: disable mirror feature during crashkernel
  kexec: do syscore_shutdown() in kernel_kexec
  mailmap: update entry for Manivannan Sadhasivam
  fs/proc/task_mmu: move mmu notification mechanism inside mm lock
  mm: zswap: switch maintainers to recently active developers and reviewers
  scripts/decode_stacktrace.sh: optionally use LLVM utilities
  kasan: avoid resetting aux_lock
  lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
  MAINTAINERS: update LTP maintainers
  kdump: defer the insertion of crashkernel resources
2024-01-17 09:31:36 -08:00
Kemeng Shi 5c7fa5c8ad sbitmap: remove stale comment in sbq_calc_wake_batch
After commit 106397376c ("sbitmap: fix batching wakeup"), we may wake
up more than one queue for each batch. Just remove stale comment that
we wake up only one queue for each batch.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240115145626.665562-1-shikemeng@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-15 07:23:50 -07:00
Nathan Chancellor aaa2c9a97c lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
pahole, which generates BTF, relies on elfutils to process DWARF debug
info.  Because kernel modules are relocatable files, elfutils needs to
resolve relocations when processing the DWARF .debug sections.

Hexagon is not supported in binutils or elfutils, so elfutils is unable to
process relocations in kernel modules, causing pahole to crash during BTF
generation.

Do not allow CONFIG_DEBUG_INFO_BTF to be selected for Hexagon until it is
supported in elfutils, so that there are no more cryptic build failures
during BTF generation.

Link: https://lkml.kernel.org/r/20240105-hexagon-disable-btf-v1-1-ddab073e7f74@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312192107.wMIKiZWw-lkp@intel.com/
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Brian Cain <bcain@quicinc.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-12 15:20:45 -08:00
Linus Torvalds 3e7aeb78ab Networking changes for 6.8.
Core & protocols
 ----------------
 
  - Analyze and reorganize core networking structs (socks, netdev,
    netns, mibs) to optimize cacheline consumption and set up
    build time warnings to safeguard against future header changes.
    This improves TCP performances with many concurrent connections
    up to 40%.
 
  - Add page-pool netlink-based introspection, exposing the
    memory usage and recycling stats. This helps indentify
    bad PP users and possible leaks.
 
  - Refine TCP/DCCP source port selection to no longer favor even
    source port at connect() time when IP_LOCAL_PORT_RANGE is set.
    This lowers the time taken by connect() for hosts having
    many active connections to the same destination.
 
  - Refactor the TCP bind conflict code, shrinking related socket
    structs.
 
  - Refactor TCP SYN-Cookie handling, as a preparation step to
    allow arbitrary SYN-Cookie processing via eBPF.
 
  - Tune optmem_max for 0-copy usage, increasing the default value
    to 128KB and namespecifying it.
 
  - Allow coalescing for cloned skbs coming from page pools, improving
    RX performances with some common configurations.
 
  - Reduce extension header parsing overhead at GRO time.
 
  - Add bridge MDB bulk deletion support, allowing user-space to
    request the deletion of matching entries.
 
  - Reorder nftables struct members, to keep data accessed by the
    datapath first.
 
  - Introduce TC block ports tracking and use. This allows supporting
    multicast-like behavior at the TC layer.
 
  - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
    classifiers (RSVP and tcindex).
 
  - More data-race annotations.
 
  - Extend the diag interface to dump TCP bound-only sockets.
 
  - Conditional notification of events for TC qdisc class and actions.
 
  - Support for WPAN dynamic associations with nearby devices, to form
    a sub-network using a specific PAN ID.
 
  - Implement SMCv2.1 virtual ISM device support.
 
  - Add support for Batman-avd mulicast packet type.
 
 BPF
 ---
 
  - Tons of verifier improvements:
    - BPF register bounds logic and range support along with a large
      test suite
    - log improvements
    - complete precision tracking support for register spills
    - track aligned STACK_ZERO cases as imprecise spilled registers. It
      improves the verifier "instructions processed" metric from single
      digit to 50-60% for some programs
    - support for user's global BPF subprogram arguments with few
      commonly requested annotations for a better developer experience
    - support tracking of BPF_JNE which helps cases when the compiler
      transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
      like
    - several fixes
 
  - Add initial TX metadata implementation for AF_XDP with support in
    mlx5 and stmmac drivers. Two types of offloads are supported right
    now, that is, TX timestamp and TX checksum offload.
 
  - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
    kernel and from kernel into BPF work with CFI enabled. This allows
    BPF to work with CONFIG_FINEIBT=y.
 
  - Change BPF verifier logic to validate global subprograms lazily
    instead of unconditionally before the main program, so they can be
    guarded using BPF CO-RE techniques.
 
  - Support uid/gid options when mounting bpffs.
 
  - Add a new kfunc which acquires the associated cgroup of a task
    within a specific cgroup v1 hierarchy where the latter is identified
    by its id.
 
  - Extend verifier to allow bpf_refcount_acquire() of a map value field
    obtained via direct load which is a use-case needed in sched_ext.
 
  - Add BPF link_info support for uprobe multi link along with bpftool
    integration for the latter.
 
  - Support for VLAN tag in XDP hints.
 
  - Remove deprecated bpfilter kernel leftovers given the project
    is developed in user-space (https://github.com/facebook/bpfilter).
 
 Misc
 ----
 
  - Support for parellel TC self-tests execution.
 
  - Increase MPTCP self-tests coverage.
 
  - Updated the bridge documentation, including several so-far
    undocumented features.
 
  - Convert all the net self-tests to run in unique netns, to
    avoid random failures due to conflict and allow concurrent
    runs.
 
  - Add TCP-AO self-tests.
 
  - Add kunit tests for both cfg80211 and mac80211.
 
  - Autogenerate Netlink families documentation from YAML spec.
 
  - Add yml-gen support for fixed headers and recursive nests, the
    tool can now generate user-space code for all genetlink families
    for which we have specs.
 
  - A bunch of additional module descriptions fixes.
 
  - Catch incorrect freeing of pages belonging to a page pool.
 
 Driver API
 ----------
 
  - Rust abstractions for network PHY drivers; do not cover yet the
    full C API, but already allow implementing functional PHY drivers
    in rust.
 
  - Introduce queue and NAPI support in the netdev Netlink interface,
    allowing complete access to the device <> NAPIs <> queues
    relationship.
 
  - Introduce notifications filtering for devlink to allow control
    application scale to thousands of instances.
 
  - Improve PHY validation, requesting rate matching information for
    each ethtool link mode supported by both the PHY and host.
 
  - Add support for ethtool symmetric-xor RSS hash.
 
  - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
    platform.
 
  - Expose pin fractional frequency offset value over new DPLL generic
    netlink attribute.
 
  - Convert older drivers to platform remove callback returning void.
 
  - Add support for PHY package MMD read/write.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Octeon CN10K devices
    - Broadcom 5760X P7
    - Qualcomm SM8550 SoC
    - Texas Instrument DP83TG720S PHY
 
  - Bluetooth:
    - IMC Networks Bluetooth radio
 
 Removed
 -------
 
  - WiFi:
    - libertas 16-bit PCMCIA support
    - Atmel at76c50x drivers
    - HostAP ISA/PCMCIA style 802.11b driver
    - zd1201 802.11b USB dongles
    - Orinoco ISA/PCMCIA 802.11b driver
    - Aviator/Raytheon driver
    - Planet WL3501 driver
    - RNDIS USB 802.11b driver
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - allow one by one port representors creation and removal
      - add temperature and clock information reporting
      - add get/set for ethtool's header split ringparam
      - add again FW logging
      - adds support switchdev hardware packet mirroring
      - iavf: implement symmetric-xor RSS hash
      - igc: add support for concurrent physical and free-running timers
      - i40e: increase the allowable descriptors
    - nVidia/Mellanox:
      - Preparation for Socket-Direct multi-dev netdev. That will allow
        in future releases combining multiple PFs devices attached to
        different NUMA nodes under the same netdev
    - Broadcom (bnxt):
      - TX completion handling improvements
      - add basic ntuple filter support
      - reduce MSIX vectors usage for MQPRIO offload
      - add VXLAN support, USO offload and TX coalesce completion for P7
    - Marvell Octeon EP:
      - xmit-more support
      - add PF-VF mailbox support and use it for FW notifications for VFs
    - Wangxun (ngbe/txgbe):
      - implement ethtool functions to operate pause param, ring param,
        coalesce channel number and msglevel
    - Netronome/Corigine (nfp):
      - add flow-steering support
      - support UDP segmentation offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver
    - stmmac: add support for HW-accelerated VLAN stripping
    - TI AM654x sw: add mqprio, frame preemption & coalescing
    - gve: add support for non-4k page sizes.
    - virtio-net: support dynamic coalescing moderation
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - allow firmware upgrade without a reboot
    - more flexible support for bridge flooding via the compressed
      FID flooding mode
 
  - Ethernet embedded switches:
    - Microchip:
      - fine-tune flow control and speed configurations in KSZ8xxx
      - KSZ88X3: enable setting rmii reference
    - Renesas:
      - add jumbo frames support
    - Marvell:
      - 88E6xxx: add "eth-mac" and "rmon" stats support
 
  - Ethernet PHYs:
    - aquantia: add firmware load support
    - at803x: refactor the driver to simplify adding support for more
      chip variants
    - NXP C45 TJA11xx: Add MACsec offload support
 
  - Wifi:
    - MediaTek (mt76):
      - NVMEM EEPROM improvements
      - mt7996 Extremely High Throughput (EHT) improvements
      - mt7996 Wireless Ethernet Dispatcher (WED) support
      - mt7996 36-bit DMA support
    - Qualcomm (ath12k):
      - support for a single MSI vector
      - WCN7850: support AP mode
    - Intel (iwlwifi):
      - new debugfs file fw_dbg_clear
      - allow concurrent P2P operation on DFS channels
 
  - Bluetooth:
    - QCA2066: support HFP offload
    - ISO: more broadcast-related improvements
    - NXP: better recovery in case receiver/transmitter get out of sync
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWdamsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkGC4P/2xjLzdw22ckSssuE9ORbGko9SNjnqHk
 PQh1E+26BHiCg5KB8VvzMsL78E79MRNXEattSW+1g7dhCvln3oi+Vd0WkdRkgt35
 98Iv18zLbbwFAJeyKvmLAPAkQkMLtVj19QILBBRrugF+egEZgVSE3JBcTAiKv2ZQ
 HzkabA171Ri6LpCcEEtY5XuaKvimGnGzF8YMFf8rX0wtqd2p5kbY9aMe47WAGxvU
 Vf9548XvH+A5yVH2/4/gujtUOpA/RHuhuCMb+oo0cZ+VCC1x9MGzoXzj6r87OTkf
 k2W1whNzcGoin92f+9Lk1JYMuiGKBH4QVaDdNXJnYFSJWPTE7RvRsPzYTSD4/GzK
 yEZbzSJXpy/2vDQm16NoAxl7evRs8Sorzkw4LQRviZHI/5SAkK2ZQiCK5CO8QSYy
 C1LELcV5kn6Foe24xWnrWLjAGug9oJnYoGPMU5gvPmFJMvUMXqm5rmbBgUWL5Rxw
 q1M6gVzabCyWUy6z2G2vaqW2ZntNVvCkdsLtIX0XZkcTzNoP0MA+TuhyGz4wbiuo
 PeyQp/mbGnDgCYggqKIA0YWrTVxkhFrKN520cbO8qXBQytV9oFbM/0/+C0/r/5WX
 pL1JVzLrh6l5ME7EIQfha8UOF9j8q4ueSwb40P3AR2NaZiDABM0zfUZ6+sx+91WF
 ucqPEcZB5cRE
 =1bW6
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most interesting thing is probably the networking structs
  reorganization and a significant amount of changes is around
  self-tests.

  Core & protocols:

   - Analyze and reorganize core networking structs (socks, netdev,
     netns, mibs) to optimize cacheline consumption and set up build
     time warnings to safeguard against future header changes

     This improves TCP performances with many concurrent connections up
     to 40%

   - Add page-pool netlink-based introspection, exposing the memory
     usage and recycling stats. This helps indentify bad PP users and
     possible leaks

   - Refine TCP/DCCP source port selection to no longer favor even
     source port at connect() time when IP_LOCAL_PORT_RANGE is set. This
     lowers the time taken by connect() for hosts having many active
     connections to the same destination

   - Refactor the TCP bind conflict code, shrinking related socket
     structs

   - Refactor TCP SYN-Cookie handling, as a preparation step to allow
     arbitrary SYN-Cookie processing via eBPF

   - Tune optmem_max for 0-copy usage, increasing the default value to
     128KB and namespecifying it

   - Allow coalescing for cloned skbs coming from page pools, improving
     RX performances with some common configurations

   - Reduce extension header parsing overhead at GRO time

   - Add bridge MDB bulk deletion support, allowing user-space to
     request the deletion of matching entries

   - Reorder nftables struct members, to keep data accessed by the
     datapath first

   - Introduce TC block ports tracking and use. This allows supporting
     multicast-like behavior at the TC layer

   - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
     classifiers (RSVP and tcindex)

   - More data-race annotations

   - Extend the diag interface to dump TCP bound-only sockets

   - Conditional notification of events for TC qdisc class and actions

   - Support for WPAN dynamic associations with nearby devices, to form
     a sub-network using a specific PAN ID

   - Implement SMCv2.1 virtual ISM device support

   - Add support for Batman-avd mulicast packet type

  BPF:

   - Tons of verifier improvements:
       - BPF register bounds logic and range support along with a large
         test suite
       - log improvements
       - complete precision tracking support for register spills
       - track aligned STACK_ZERO cases as imprecise spilled registers.
         This improves the verifier "instructions processed" metric from
         single digit to 50-60% for some programs
       - support for user's global BPF subprogram arguments with few
         commonly requested annotations for a better developer
         experience
       - support tracking of BPF_JNE which helps cases when the compiler
         transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
         like
       - several fixes

   - Add initial TX metadata implementation for AF_XDP with support in
     mlx5 and stmmac drivers. Two types of offloads are supported right
     now, that is, TX timestamp and TX checksum offload

   - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
     kernel and from kernel into BPF work with CFI enabled. This allows
     BPF to work with CONFIG_FINEIBT=y

   - Change BPF verifier logic to validate global subprograms lazily
     instead of unconditionally before the main program, so they can be
     guarded using BPF CO-RE techniques

   - Support uid/gid options when mounting bpffs

   - Add a new kfunc which acquires the associated cgroup of a task
     within a specific cgroup v1 hierarchy where the latter is
     identified by its id

   - Extend verifier to allow bpf_refcount_acquire() of a map value
     field obtained via direct load which is a use-case needed in
     sched_ext

   - Add BPF link_info support for uprobe multi link along with bpftool
     integration for the latter

   - Support for VLAN tag in XDP hints

   - Remove deprecated bpfilter kernel leftovers given the project is
     developed in user-space (https://github.com/facebook/bpfilter)

  Misc:

   - Support for parellel TC self-tests execution

   - Increase MPTCP self-tests coverage

   - Updated the bridge documentation, including several so-far
     undocumented features

   - Convert all the net self-tests to run in unique netns, to avoid
     random failures due to conflict and allow concurrent runs

   - Add TCP-AO self-tests

   - Add kunit tests for both cfg80211 and mac80211

   - Autogenerate Netlink families documentation from YAML spec

   - Add yml-gen support for fixed headers and recursive nests, the tool
     can now generate user-space code for all genetlink families for
     which we have specs

   - A bunch of additional module descriptions fixes

   - Catch incorrect freeing of pages belonging to a page pool

  Driver API:

   - Rust abstractions for network PHY drivers; do not cover yet the
     full C API, but already allow implementing functional PHY drivers
     in rust

   - Introduce queue and NAPI support in the netdev Netlink interface,
     allowing complete access to the device <> NAPIs <> queues
     relationship

   - Introduce notifications filtering for devlink to allow control
     application scale to thousands of instances

   - Improve PHY validation, requesting rate matching information for
     each ethtool link mode supported by both the PHY and host

   - Add support for ethtool symmetric-xor RSS hash

   - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
     platform

   - Expose pin fractional frequency offset value over new DPLL generic
     netlink attribute

   - Convert older drivers to platform remove callback returning void

   - Add support for PHY package MMD read/write

  New hardware / drivers:

   - Ethernet:
       - Octeon CN10K devices
       - Broadcom 5760X P7
       - Qualcomm SM8550 SoC
       - Texas Instrument DP83TG720S PHY

   - Bluetooth:
       - IMC Networks Bluetooth radio

  Removed:

   - WiFi:
       - libertas 16-bit PCMCIA support
       - Atmel at76c50x drivers
       - HostAP ISA/PCMCIA style 802.11b driver
       - zd1201 802.11b USB dongles
       - Orinoco ISA/PCMCIA 802.11b driver
       - Aviator/Raytheon driver
       - Planet WL3501 driver
       - RNDIS USB 802.11b driver

  Driver updates:

   - Ethernet high-speed NICs:
       - Intel (100G, ice, idpf):
          - allow one by one port representors creation and removal
          - add temperature and clock information reporting
          - add get/set for ethtool's header split ringparam
          - add again FW logging
          - adds support switchdev hardware packet mirroring
          - iavf: implement symmetric-xor RSS hash
          - igc: add support for concurrent physical and free-running
            timers
          - i40e: increase the allowable descriptors
       - nVidia/Mellanox:
          - Preparation for Socket-Direct multi-dev netdev. That will
            allow in future releases combining multiple PFs devices
            attached to different NUMA nodes under the same netdev
       - Broadcom (bnxt):
          - TX completion handling improvements
          - add basic ntuple filter support
          - reduce MSIX vectors usage for MQPRIO offload
          - add VXLAN support, USO offload and TX coalesce completion
            for P7
       - Marvell Octeon EP:
          - xmit-more support
          - add PF-VF mailbox support and use it for FW notifications
            for VFs
       - Wangxun (ngbe/txgbe):
          - implement ethtool functions to operate pause param, ring
            param, coalesce channel number and msglevel
       - Netronome/Corigine (nfp):
          - add flow-steering support
          - support UDP segmentation offload

   - Ethernet NICs embedded, slower, virtual:
       - Xilinx AXI: remove duplicate DMA code adopting the dma engine
         driver
       - stmmac: add support for HW-accelerated VLAN stripping
       - TI AM654x sw: add mqprio, frame preemption & coalescing
       - gve: add support for non-4k page sizes.
       - virtio-net: support dynamic coalescing moderation

   - nVidia/Mellanox Ethernet datacenter switches:
       - allow firmware upgrade without a reboot
       - more flexible support for bridge flooding via the compressed
         FID flooding mode

   - Ethernet embedded switches:
       - Microchip:
          - fine-tune flow control and speed configurations in KSZ8xxx
          - KSZ88X3: enable setting rmii reference
       - Renesas:
          - add jumbo frames support
       - Marvell:
          - 88E6xxx: add "eth-mac" and "rmon" stats support

   - Ethernet PHYs:
       - aquantia: add firmware load support
       - at803x: refactor the driver to simplify adding support for more
         chip variants
       - NXP C45 TJA11xx: Add MACsec offload support

   - Wifi:
       - MediaTek (mt76):
          - NVMEM EEPROM improvements
          - mt7996 Extremely High Throughput (EHT) improvements
          - mt7996 Wireless Ethernet Dispatcher (WED) support
          - mt7996 36-bit DMA support
       - Qualcomm (ath12k):
          - support for a single MSI vector
          - WCN7850: support AP mode
       - Intel (iwlwifi):
          - new debugfs file fw_dbg_clear
          - allow concurrent P2P operation on DFS channels

   - Bluetooth:
       - QCA2066: support HFP offload
       - ISO: more broadcast-related improvements
       - NXP: better recovery in case receiver/transmitter get out of sync"

* tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits)
  lan78xx: remove redundant statement in lan78xx_get_eee
  lan743x: remove redundant statement in lan743x_ethtool_get_eee
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer()
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel()
  bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter()
  tcp: Revert no longer abort SYN_SENT when receiving some ICMP
  Revert "mlx5 updates 2023-12-20"
  Revert "net: stmmac: Enable Per DMA Channel interrupt"
  ipvlan: Remove usage of the deprecated ida_simple_xx() API
  ipvlan: Fix a typo in a comment
  net/sched: Remove ipt action tests
  net: stmmac: Use interrupt mode INTM=1 for per channel irq
  net: stmmac: Add support for TX/RX channel interrupt
  net: stmmac: Make MSI interrupt routine generic
  dt-bindings: net: snps,dwmac: per channel irq
  net: phy: at803x: make read_status more generic
  net: phy: at803x: add support for cdt cross short test for qca808x
  net: phy: at803x: refactor qca808x cable test get status function
  net: phy: at803x: generalize cdt fault length function
  net: ethernet: cortina: Drop TSO support
  ...
2024-01-11 10:07:29 -08:00
Linus Torvalds de927f6c0b s390 updates for 6.8 merge window
- Add machine variable capacity information to /proc/sysinfo.
 
 - Limit the waste of page tables and always align vmalloc area size
   and base address on segment boundary.
 
 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.
 
 - Reset response code AP_RESPONSE_INVALID_GISA to understandable
   by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.
 
 - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED
   response code when enabling interrupts on behalf of a guest.
 
 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue
   device bound to the vfio_ap device driver when the mediated device is
   attached to a guest, but the queue device is not passed through.
 
 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.
 
 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.
 
 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the
   previous value of the to be changed control register. This is useful if
   a bit is only changed temporarily and the previous content needs to be
   restored.
 
 - The kernel starts with machine checks disabled and is expected to enable
   it once trap_init() is called. However the implementation allows machine
   checks early. Consistently enable it in trap_init() only.
 
 - local_mcck_disable() and local_mcck_enable() assume that machine checks
   are always enabled. Instead implement and use local_mcck_save() and
   local_mcck_restore() to disable machine checks and restore the previous
   state.
 
 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.
 
 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control
   (FPC) register in vCPU, but may lead to corruption of the FPC register
   of the host process. Fix this.
 
 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation,
   a different value is tested for validity than the one that is used.
 
 - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly.
   Instead copy a new floating point control register value into its save
   area and test the validity of the new value when loading it.
 
 - Remove superfluous save_fpu_regs() call.
 
 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.
 
 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.
 
 - Add the vector facility to the z13 architecture level set (ALS).
   All hypervisors support the vector facility since many years.
   This allows compile time optimizations of the kernel.
 
 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result,
   the compiled code will have less runtime checks and less code.
 
 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.
 
 - Convert the struct subchannel spinlock from pointer to member.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZZxnchccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8PyGAP9Z0Z1Pm3hPf24M4FgY5uuRqBHo
 VoiuohOZRGONwifnsgEAmN/3ba/7d/j0iMwpUdUFouiqtwTUihh15wYx1sH2IA8=
 =F6YD
 -----END PGP SIGNATURE-----

Merge tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Add machine variable capacity information to /proc/sysinfo.

 - Limit the waste of page tables and always align vmalloc area size and
   base address on segment boundary.

 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.

 - Reset response code AP_RESPONSE_INVALID_GISA to understandable by
   guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.

 - Improve reaction to adjunct-processor (AP)
   AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts
   on behalf of a guest.

 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP)
   queue device bound to the vfio_ap device driver when the mediated
   device is attached to a guest, but the queue device is not passed
   through.

 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.

 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.

 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return
   the previous value of the to be changed control register. This is
   useful if a bit is only changed temporarily and the previous content
   needs to be restored.

 - The kernel starts with machine checks disabled and is expected to
   enable it once trap_init() is called. However the implementation
   allows machine checks early. Consistently enable it in trap_init()
   only.

 - local_mcck_disable() and local_mcck_enable() assume that machine
   checks are always enabled. Instead implement and use
   local_mcck_save() and local_mcck_restore() to disable machine checks
   and restore the previous state.

 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.

 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point
   control (FPC) register in vCPU, but may lead to corruption of the FPC
   register of the host process. Fix this.

 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation, a
   different value is tested for validity than the one that is used.

 - Get rid of test_fp_ctl(), since it is quite subtle to use it
   correctly. Instead copy a new floating point control register value
   into its save area and test the validity of the new value when
   loading it.

 - Remove superfluous save_fpu_regs() call.

 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.

 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.

 - Add the vector facility to the z13 architecture level set (ALS). All
   hypervisors support the vector facility since many years. This allows
   compile time optimizations of the kernel.

 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As
   result, the compiled code will have less runtime checks and less
   code.

 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.

 - Convert the struct subchannel spinlock from pointer to member.

* tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits)
  Revert "s390: update defconfigs"
  s390/cio: make sch->lock spinlock pointer a member
  s390: update defconfigs
  s390/mm: convert pgste locking functions to C
  s390/fpu: get rid of MACHINE_HAS_VX
  s390/als: add vector facility to z13 architecture level set
  s390/fpu: remove "novx" option
  s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support
  KVM: s390: remove superfluous save_fpu_regs() call
  s390/fpu: get rid of test_fp_ctl()
  KVM: s390: use READ_ONCE() to read fpc register value
  KVM: s390: fix setting of fpc register
  s390/ptrace: handle setting of fpc register correctly
  s390/nmi: implement and use local_mcck_save() / local_mcck_restore()
  s390/nmi: consistently enable machine checks in trap_init()
  s390/ctlreg: return old register contents when changing bits
  s390/ap: handle outband SE bind state change
  s390/ap: store TAPQ hwinfo in struct ap_card
  s390/vfio-ap: fix sysfs status attribute for AP queue devices
  s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command
  ...
2024-01-10 18:18:20 -08:00
Linus Torvalds a05aea98d4 sysctl-6.8-rc1
To help make the move of sysctls out of kernel/sysctl.c not incur a size
 penalty sysctl has been changed to allow us to not require the sentinel, the
 final empty element on the sysctl array. Joel Granados has been doing all this
 work. On the v6.6 kernel we got the major infrastructure changes required to
 support this. For v6.7 we had all arch/ and drivers/ modified to remove
 the sentinel. For v6.8-rc1 we get a few more updates for fs/ directory only.
 The kernel/ directory is left but we'll save that for v6.9-rc1 as those patches
 are still being reviewed. After that we then can expect also the removal of the
 no longer needed check for procname == NULL.
 
 Let us recap the purpose of this work:
 
   - this helps reduce the overall build time size of the kernel and run time
     memory consumed by the kernel by about ~64 bytes per array
   - the extra 64-byte penalty is no longer inncurred now when we move sysctls
     out from kernel/sysctl.c to their own files
 
 Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to see further
 work by Thomas Weißschuh with the constificatin of the struct ctl_table.
 
 Due to Joel Granados's work, and to help bring in new blood, I have suggested
 for him to become a maintainer and he's accepted. So for v6.9-rc1 I look forward
 to seeing him sent you a pull request for further sysctl changes. This also
 removes Iurii Zaikin as a maintainer as he has moved on to other projects and
 has had no time to help at all.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmWdWDESHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinjJAP/jTNNoyzWisvrrvmXqR5txFGLOE+wW6x
 Xv9avuiM+DTHsH/wK8CkXEivwDqYNAZEHU7NEcolS5bJX/ddSRwN9b5aSVlCrUdX
 Ab4rXmpeSCNFp9zNszWJsDuBKIqjvsKw7qGleGtgZ2qAUHbbH30VROLWCggaee50
 wU3icDLdwkasxrcMXy4Sq5dT5wYC4j/QelqBGIkYPT14Arl1im5zqPZ95gmO/s/6
 mdicTAmq+hhAUfUBJBXRKtsvxY6CItxe55Q4fjpncLUJLHUw+VPVNoBKFWJlBwlh
 LO3liKFfakPSkil4/en+/+zuMByd0JBkIzIJa+Kk5kjpbHRhK0RkmU4+Y5G5spWN
 jjLfiv6RxInNaZ8EWQBMfjE95A7PmYDQ4TOH08+OvzdDIi6B0BB5tBGQpG9BnyXk
 YsLg1Uo4CwE/vn1/a9w0rhadjUInvmAryhb/uSJYFz/lmApLm2JUpY3/KstwGetb
 z+HmLstJb24Djkr6pH8DcjhzRBHeWQ5p0b4/6B+v1HqAUuEhdbyw1F2GrDywyF3R
 h/UOAaKLm1+ffdA246o9TejKiDU96qEzzXMaCzPKyestaRZuiyuYEMDhYbvtsMV5
 zIdMJj5HQ+U1KHDv4IN99DEj7+/vjE3f4Sjo+POFpQeQ8/d+fxpFNqXVv449dgnb
 6xEkkxsR0ElM
 =2qBt
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "To help make the move of sysctls out of kernel/sysctl.c not incur a
  size penalty sysctl has been changed to allow us to not require the
  sentinel, the final empty element on the sysctl array. Joel Granados
  has been doing all this work.

  In the v6.6 kernel we got the major infrastructure changes required to
  support this. For v6.7 we had all arch/ and drivers/ modified to
  remove the sentinel. For v6.8-rc1 we get a few more updates for fs/
  directory only.

  The kernel/ directory is left but we'll save that for v6.9-rc1 as
  those patches are still being reviewed. After that we then can expect
  also the removal of the no longer needed check for procname == NULL.

  Let us recap the purpose of this work:

   - this helps reduce the overall build time size of the kernel and run
     time memory consumed by the kernel by about ~64 bytes per array

   - the extra 64-byte penalty is no longer inncurred now when we move
     sysctls out from kernel/sysctl.c to their own files

  Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to
  see further work by Thomas Weißschuh with the constificatin of the
  struct ctl_table.

  Due to Joel Granados's work, and to help bring in new blood, I have
  suggested for him to become a maintainer and he's accepted. So for
  v6.9-rc1 I look forward to seeing him sent you a pull request for
  further sysctl changes. This also removes Iurii Zaikin as a maintainer
  as he has moved on to other projects and has had no time to help at
  all"

* tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: remove struct ctl_path
  sysctl: delete unused define SYSCTL_PERM_EMPTY_DIR
  coda: Remove the now superfluous sentinel elements from ctl_table array
  sysctl: Remove the now superfluous sentinel elements from ctl_table array
  fs: Remove the now superfluous sentinel elements from ctl_table array
  cachefiles: Remove the now superfluous sentinel element from ctl_table array
  sysclt: Clarify the results of selftest run
  sysctl: Add a selftest for handling empty dirs
  sysctl: Fix out of bounds access for empty sysctl registers
  MAINTAINERS: Add Joel Granados as co-maintainer for proc sysctl
  MAINTAINERS: remove Iurii Zaikin from proc sysctl
2024-01-10 17:44:36 -08:00
Linus Torvalds 78273df7f6 header cleanups for 6.8
The goal is to get sched.h down to a type only header, so the main thing
 happening in this patchset is splitting out various _types.h headers and
 dependency fixups, as well as moving some things out of sched.h to
 better locations.
 
 This is prep work for the memory allocation profiling patchset which
 adds new sched.h interdepencencies.
 
 Testing - it's been in -next, and fixes from pretty much all
 architectures have percolated in - nothing major.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWfBwwACgkQE6szbY3K
 bnZPwBAAmuRojXaeWxi01IPIOehSGDe68vw44PR9glEMZvxdnZuPOdvE4/+245/L
 bRKU2WBCjBUokUbV9msIShwRkFTZAmEMPNfPAAsFMA+VXeDYHKB+ZRdwTggNAQ+I
 SG6fZgh5m0HsewCDxU8oqVHkjVq4fXn0cy+aL6xLEd9gu67GoBzX2pDieS2Kvy6j
 jnyoKTxFwb+LTQgph0P4EIpq5I2umAsdLwdSR8EJ+8e9NiNvMo1pI00Lx/ntAnFZ
 JftWUJcMy3TQ5u1GkyfQN9y/yThX1bZK5GvmHS9SJ2Dkacaus5d+xaKCHtRuFS1I
 7C6b8PsNgRczUMumBXus44HdlNfNs1yU3lvVxFvBIPE1qC9pYRHrkWIXXIocXLLC
 oxTEJ6B2G3BQZVQgLIA4fOaxMVhmvKffi/aEZLi9vN9VVosd1a6XNKI6KbyRnXFp
 GSs9qDqszhn5I3GYNlDNQTc/8UsRlhPFgS6nS0By6QnvxtGi9QkU2tBRBsXvqwCy
 cLoCYIhc2tvugHvld70dz26umiJ4rnmxGlobStNoigDvIKAIUt1UmIdr1so8P8eH
 xehnL9ZcOX6xnANDL0AqMFFHV6I58CJynhFdUoXfVQf/DWLGX48mpi9LVNsYBzsI
 CAwVOAQ0UjGrpdWmJ9ueY/ABYqg9vRjzaDEXQ+MhAYO55CLaVsg=
 =3tyT
 -----END PGP SIGNATURE-----

Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs

Pull header cleanups from Kent Overstreet:
 "The goal is to get sched.h down to a type only header, so the main
  thing happening in this patchset is splitting out various _types.h
  headers and dependency fixups, as well as moving some things out of
  sched.h to better locations.

  This is prep work for the memory allocation profiling patchset which
  adds new sched.h interdepencencies"

* tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits)
  Kill sched.h dependency on rcupdate.h
  kill unnecessary thread_info.h include
  Kill unnecessary kernel.h include
  preempt.h: Kill dependency on list.h
  rseq: Split out rseq.h from sched.h
  LoongArch: signal.c: add header file to fix build error
  restart_block: Trim includes
  lockdep: move held_lock to lockdep_types.h
  sem: Split out sem_types.h
  uidgid: Split out uidgid_types.h
  seccomp: Split out seccomp_types.h
  refcount: Split out refcount_types.h
  uapi/linux/resource.h: fix include
  x86/signal: kill dependency on time.h
  syscall_user_dispatch.h: split out *_types.h
  mm_types_task.h: Trim dependencies
  Split out irqflags_types.h
  ipc: Kill bogus dependency on spinlock.h
  shm: Slim down dependencies
  workqueue: Split out workqueue_types.h
  ...
2024-01-10 16:43:55 -08:00
Linus Torvalds 0cb552aa97 This update includes the following changes:
API:
 
 - Add incremental lskcipher/skcipher processing.
 
 Algorithms:
 
 - Remove SHA1 from drbg.
 - Remove CFB and OFB.
 
 Drivers:
 
 - Add comp high perf mode configuration in hisilicon/zip.
 - Add support for 420xx devices in qat.
 - Add IAA Compression Accelerator driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmWdxR4ACgkQxycdCkmx
 i6fAjg//SqOwxeUYWpT4KdMCxGMn7U9iE3wJeX8nqfma3a62Wt2soey7H3GB9G7v
 gEh0OraOKIGeBtS8giIX83SZJOirMlgeE2tngxMmR9O95EUNR0XGnywF/emyt96z
 WcSN1IrRZ8qQzTASBF0KpV2Ir5mNzBiOwU9tVHIztROufA4C1fwKl7yhPM67C3MU
 88vf1R+ZeWUbNbzQNC8oYIqU11dcNaMNhOVPiZCECKbIR6LqwUf3Swexz+HuPR/D
 WTSrb4J3Eeg77SMhI959/Hi53WeEyVW1vWYAVMgfTEFw6PESiOXyPeImfzUMFos6
 fFYIAoQzoG5GlQeYwLLSoZAwtfY+f7gTNoaE+bnPk5317EFzFDijaXrkjjVKqkS2
 OOBfxrMMIGNmxp7pPkt6HPnIvGNTo+SnbAdVIm6M3EN1K+BTGrj7/CTJkcT6XSyK
 nCBL6nbP7zMB1GJfCFGPvlIdW4oYnAfB1Q5YJ9tzYbEZ0t5NWxDKZ45RnM9xQp4Y
 2V1zdfALdqmGRKBWgyUcqp1T4/AYRU0+WaQxz7gHw3BPR4QmfVLPRqiiR7OT0Z+P
 XFotOYD3epVXS1OUyZdLBn5+FXLnRd1uylQ+j8FNfnddr4Nr+tH1J6edK71NMvXG
 Tj7p5rP5bbgvVkD43ywsVnCI0w+9NS55mH5UP2Y4fSLS6p2tJAw=
 =yMmO
 -----END PGP SIGNATURE-----

Merge tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add incremental lskcipher/skcipher processing

  Algorithms:
   - Remove SHA1 from drbg
   - Remove CFB and OFB

  Drivers:
   - Add comp high perf mode configuration in hisilicon/zip
   - Add support for 420xx devices in qat
   - Add IAA Compression Accelerator driver"

* tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (172 commits)
  crypto: iaa - Account for cpu-less numa nodes
  crypto: scomp - fix req->dst buffer overflow
  crypto: sahara - add support for crypto_engine
  crypto: sahara - remove error message for bad aes request size
  crypto: sahara - remove unnecessary NULL assignments
  crypto: sahara - remove 'active' flag from sahara_aes_reqctx struct
  crypto: sahara - use dev_err_probe()
  crypto: sahara - use devm_clk_get_enabled()
  crypto: sahara - use BIT() macro
  crypto: sahara - clean up macro indentation
  crypto: sahara - do not resize req->src when doing hash operations
  crypto: sahara - fix processing hash requests with req->nbytes < sg->length
  crypto: sahara - improve error handling in sahara_sha_process()
  crypto: sahara - fix wait_for_completion_timeout() error handling
  crypto: sahara - fix ahash reqsize
  crypto: sahara - handle zero-length aes requests
  crypto: skcipher - remove excess kerneldoc members
  crypto: shash - remove excess kerneldoc members
  crypto: qat - generate dynamically arbiter mappings
  crypto: qat - add support for ring pair level telemetry
  ...
2024-01-10 12:23:43 -08:00
Linus Torvalds 41daf06ea1 linux_kselftest-kunit-6.8-rc1
This KUnit update for Linux 6.8-rc1 consists of:
 
 - a new feature that adds APIs for managing devices introducing
   a set of helper functions which allow devices (internally a
   struct kunit_device) to be created and managed by KUnit.
   These devices will be automatically unregistered on
   test exit. These helpers can either use a user-provided
   struct device_driver, or have one automatically created and
   managed by KUnit. In both cases, the device lives on a new
   kunit_bus.
 
 - changes to switch drm/tests to use kunit devices
 
 - several fixes and enhancements to attribute feature
 
 - changes to reorganize deferred action function introducing
   KUNIT_DEFINE_ACTION_WRAPPER
 
 - new feature adds ability to run tests after boot using debugfs
 
 - fixes and enhancements to string-stream-test:
   - parse ERR_PTR in string_stream_destroy()
   - unchecked dereference in bug fix in debugfs_print_results()
   - handling errors from alloc_string_stream()
   - NULL-dereference bug fix in kunit_init_suite()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmWdiHIACgkQCwJExA0N
 QxxzCxAAmhn+rkKV4DfuGXxUAJbO109H7LSP1Y7FKMYCVp83msWKASziujb2IQR9
 87jnmgeJMbmQaPcc9m//NHuFhZmJwQZwAdGZryoDiz7XK+1MwLxYeUj92HI7FPaD
 o5Jz6tlqFdehx5jCOymgwbvhI5kJMkQCTTtnEaiHCByfaA02UqmTtt3bXK5OeJkZ
 UG0HqdvI/6Xo01i+BnerRBZFcQV49GMhl4acw1k+dJnPLkqusL6txftRBoKtxuVd
 mXQHKS1SmNgiNA+nqs4d/8qERoMJWuwj6wV4pldVBXhgZwOHXbBxBf67i7hTakE/
 TkEURCkOb5X0QrT6akJj6phJ2xqXsF7xwzBJh9G4jF2Pdwwo8GGuAXW+ol0TRrm8
 ZEQ4eMBGIK07Lb9FeBMLO2bZ0Ox+oiN+YNGY/bs1d6Ibf4PnBUfy7IlmMjKL9h/V
 A/EpYdaq5q72IZZQ2pu1rYkBRPbnP7vHmjLXVYIq7Pq8bLA9/ycKO/0jnGHdo1oz
 rBK/6t7yB+ATi4KeKQpjpmUTX/vdEenUQI/QDn9ngXIEwYQfNrEUZitEvBXR1Kw+
 T8iKDIPFkvb/yEZgjWgNpxETooDx3yBkeeC29gKMj4QoN38wEdfy0Xltp8eqq9cS
 6lijRoipUypHRAuXeSJMW2dflLnFIt4mtC25hBNF+DmyNVT+IF4=
 =79+u
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - a new feature that adds APIs for managing devices introducing a set
   of helper functions which allow devices (internally a struct
   kunit_device) to be created and managed by KUnit.

   These devices will be automatically unregistered on test exit. These
   helpers can either use a user-provided struct device_driver, or have
   one automatically created and managed by KUnit. In both cases, the
   device lives on a new kunit_bus.

 - changes to switch drm/tests to use kunit devices

 - several fixes and enhancements to attribute feature

 - changes to reorganize deferred action function introducing
   KUNIT_DEFINE_ACTION_WRAPPER

 - new feature adds ability to run tests after boot using debugfs

 - fixes and enhancements to string-stream-test:
     - parse ERR_PTR in string_stream_destroy()
     - unchecked dereference in bug fix in debugfs_print_results()
     - handling errors from alloc_string_stream()
     - NULL-dereference bug fix in kunit_init_suite()

* tag 'linux_kselftest-kunit-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (27 commits)
  kunit: Fix some comments which were mistakenly kerneldoc
  kunit: Protect string comparisons against NULL
  kunit: Add example of kunit_activate_static_stub() with pointer-to-function
  kunit: Allow passing function pointer to kunit_activate_static_stub()
  kunit: Fix NULL-dereference in kunit_init_suite() if suite->log is NULL
  kunit: Reset test->priv after each param iteration
  kunit: Add example for using test->priv
  drm/tests: Switch to kunit devices
  ASoC: topology: Replace fake root_device with kunit_device in tests
  overflow: Replace fake root_device with kunit_device
  fortify: test: Use kunit_device
  kunit: Add APIs for managing devices
  Documentation: Add debugfs docs with run after boot
  kunit: add ability to run tests after boot using debugfs
  kunit: add is_init test attribute
  kunit: add example suite to test init suites
  kunit: add KUNIT_INIT_TABLE to init linker section
  kunit: move KUNIT_TABLE out of INIT_DATA
  kunit: tool: add test for parsing attributes
  kunit: tool: fix parsing of test attributes
  ...
2024-01-09 17:16:58 -08:00
Linus Torvalds bd012f3a5b ACPI updates for 6.8-rc1
- Add CSI-2 and DisCo for Imaging support to the ACPI device
    enumeration code (Sakari Ailus, Rafael J. Wysocki).
 
  - Adjust the cpufreq thermal reduction algorithm in the ACPI processor
    driver for Tegra241 (Srikar Srimath Tirumala, Arnd Bergmann).
 
  - Make acpi_proc_quirk_mwait_check() x86-specific (Rafael J. Wysocki).
 
  - Switch over ACPI to using a threaded interrupt handler for the
    SCI (Rafael J. Wysocki).
 
  - Allow ACPI Notify () handlers to run on all CPUs and clean up the
    ACPI interface for deferred events processing (Rafael J. Wysocki).
 
  - Switch over the ACPI EC driver to using a threaded handler for the
    dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki).
 
  - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock to
    keep local interrupts on (Rafael J. Wysocki).
 
  - Adjust the USB4 _OSC handshake to correctly handle cases in which
    certain types of OS control are denied by the platform (Mika
    Westerberg).
 
  - Correct and clean up the generic function for parsing ACPI data-only
    tables with array structure (Yuntao Wang).
 
  - Modify acpi_dev_uid_match() to support different types of its second
    argument and adjust its users accordingly (Raag Jadav).
 
  - Clean up code related to acpi_evaluate_reference() and ACPI device
    lists (Rafael J. Wysocki).
 
  - Use generic ACPI helpers for evaluating trip point temperature
    objects in the ACPI thermal zone driver (Rafael J. Wysockii, Arnd
    Bergmann).
 
  - Add Thermal fast Sampling Period (_TFP) support to the ACPI thermal
    zone driver (Jeff Brasen).
 
  - Modify the ACPI LPIT table handling code to avoid u32 multiplication
    overflows in state residency computations (Nikita Kiryushin).
 
  - Drop an unused helper function from the ACPI backlight (video) driver
    and add a clarifying comment to it (Hans de Goede).
 
  - Update the ACPI backlight driver to avoid using uninitialized memory
    in some cases (Nikita Kiryushin).
 
  - Add ACPI backlight quirk for the Colorful X15 AT 23 laptop (Yuluo
    Qiu).
 
  - Add support for vendor-defined error types to the ACPI APEI error
    injection code (Avadhut Naik).
 
  - Adjust APEI to properly set MF_ACTION_REQUIRED on synchronous memory
    failure events, so they are handled differently from the asynchronous
    ones (Shuai Xue).
 
  - Fix NULL pointer dereference check in the ACPI extlog driver (Prarit
    Bhargava).
 
  - Adjust the ACPI extlog driver to clear the Extended Error Log status
    when RAS_CEC handled the error (Tony Luck).
 
  - Add IRQ override quirks for some Infinity laptops and for TongFang
    GMxXGxx (David McFarland, Hans de Goede).
 
  - Clean up the ACPI NUMA code and fix it to ensure that fake_pxm is not
    the same as one of the real pxm values (Yuntao Wang).
 
  - Fix the fractional clock divider flags in the ACPI LPSS (Intel SoC)
    driver so as to prevent miscalculation of the values in the clock
    divider (Andy Shevchenko).
 
  - Adjust comments in the ACPI watchdog driver to prevent kernel-doc
    from complaining during documentation builds (Randy Dunlap).
 
  - Make the ACPI button driver send wakeup key events to user space in
    addition to power button events on systems that can be woken up by
    the power button (Ken Xue).
 
  - Adjust pnpacpi_parse_allocated_vendor() to use memcpy() on a full
    structure field (Dmitry Antipov).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWb8asSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxhsQP/jfRiEP7L9WUl66PdzxSWi1u7bVUZIbs
 z07ujAFdAbvpdM1WgWVq6mSzYewAqIm0A9Koabj7zKuG4VPh0Gjvq26jrK/et65m
 RJhC/qcnZ4h/2bELf9/JE7FIQMDWBGK8gNHBBXVQOZrQYIiBzJ2xyHJ4F0AvLVW6
 GGuX/4mb00jlWGr6uot6qjBgLLxY0EowneLUuH4onEWrThoNWy7zbD34LSsKuljA
 a69UkQPetXbkX4XQYnt4K4BAnwjRQNU2DlUE9lpMtheTS70wilxrC+P0XaETeO7c
 NCm38X2aUv/hSwJ0BekBRdNEvG/WQsfRdOt9jWAkoCL3oDCZdOgfM6Eas7ZDLF2n
 RoxLk2O9UXFwaSSGBVgkRLPCVyWBNI6C8GXnVDN8f9hqIk+jmlsXaXghpzVlGS54
 +ox6fjO81zJjEBxSP5ACCTNZq3BwwHhPhygtIkTO5JQ9SPn+WYCPM0C5Lcvzoj7A
 x7cdOguddhAi4ZWcoRo2cg7qN6vVaDgDgV+ylzh7q5N4cBY4edCJLzcFFuasriN4
 j9/Uj/EgCafrnOhlTJz0iZkAbPZ6T/qa3qBfF948dtFRkztTsddmGA4xof90jfG9
 /FLXL4wSiXK7jbFeUb1OCLOVANWpjHP3pM3gmnggiI3ApcweEGilhhbgVr7FuCG8
 7qj78EUqNVbW
 =Ntzm
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "From the new features standpoint, the most significant change here is
  the addition of CSI-2 and MIPI DisCo for Imaging support to the ACPI
  device enumeration code that will allow MIPI cameras to be enumerated
  through the platform firmware on systems using ACPI.

  Also significant is the switch-over to threaded interrupt handlers for
  the ACPI SCI and the dedicated EC interrupt (on systems where the
  former is not used) which essentially allows all ACPI code to run with
  local interrupts enabled. That should improve responsiveness
  significantly on systems where multiple GPEs are enabled and the
  handling of one SCI involves many I/O address space accesses which
  previously had to be carried out in one go with disabled interrupts on
  the local CPU.

  Apart from the above, the ACPI thermal zone driver will use the
  Thermal fast Sampling Period (_TFP) object if available, which should
  allow temperature changes to be followed more accurately on some
  systems, the ACPI Notify () handlers can run on all CPUs (not just on
  CPU0), which should generally speed up the processing of events
  signaled through the ACPI SCI, and the ACPI power button driver will
  trigger wakeup key events via the input subsystem (on systems where it
  is a system wakeup device)

  In addition to that, there are the usual bunch of fixes and cleanups.

  Specifics:

   - Add CSI-2 and DisCo for Imaging support to the ACPI device
     enumeration code (Sakari Ailus, Rafael J. Wysocki)

   - Adjust the cpufreq thermal reduction algorithm in the ACPI
     processor driver for Tegra241 (Srikar Srimath Tirumala, Arnd
     Bergmann)

   - Make acpi_proc_quirk_mwait_check() x86-specific (Rafael J. Wysocki)

   - Switch over ACPI to using a threaded interrupt handler for the SCI
     (Rafael J. Wysocki)

   - Allow ACPI Notify () handlers to run on all CPUs and clean up the
     ACPI interface for deferred events processing (Rafael J. Wysocki)

   - Switch over the ACPI EC driver to using a threaded handler for the
     dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki)

   - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock
     to keep local interrupts on (Rafael J. Wysocki)

   - Adjust the USB4 _OSC handshake to correctly handle cases in which
     certain types of OS control are denied by the platform (Mika
     Westerberg)

   - Correct and clean up the generic function for parsing ACPI
     data-only tables with array structure (Yuntao Wang)

   - Modify acpi_dev_uid_match() to support different types of its
     second argument and adjust its users accordingly (Raag Jadav)

   - Clean up code related to acpi_evaluate_reference() and ACPI device
     lists (Rafael J. Wysocki)

   - Use generic ACPI helpers for evaluating trip point temperature
     objects in the ACPI thermal zone driver (Rafael J. Wysockii, Arnd
     Bergmann)

   - Add Thermal fast Sampling Period (_TFP) support to the ACPI thermal
     zone driver (Jeff Brasen)

   - Modify the ACPI LPIT table handling code to avoid u32
     multiplication overflows in state residency computations (Nikita
     Kiryushin)

   - Drop an unused helper function from the ACPI backlight (video)
     driver and add a clarifying comment to it (Hans de Goede)

   - Update the ACPI backlight driver to avoid using uninitialized
     memory in some cases (Nikita Kiryushin)

   - Add ACPI backlight quirk for the Colorful X15 AT 23 laptop (Yuluo
     Qiu)

   - Add support for vendor-defined error types to the ACPI APEI error
     injection code (Avadhut Naik)

   - Adjust APEI to properly set MF_ACTION_REQUIRED on synchronous
     memory failure events, so they are handled differently from the
     asynchronous ones (Shuai Xue)

   - Fix NULL pointer dereference check in the ACPI extlog driver
     (Prarit Bhargava)

   - Adjust the ACPI extlog driver to clear the Extended Error Log
     status when RAS_CEC handled the error (Tony Luck)

   - Add IRQ override quirks for some Infinity laptops and for TongFang
     GMxXGxx (David McFarland, Hans de Goede)

   - Clean up the ACPI NUMA code and fix it to ensure that fake_pxm is
     not the same as one of the real pxm values (Yuntao Wang)

   - Fix the fractional clock divider flags in the ACPI LPSS (Intel SoC)
     driver so as to prevent miscalculation of the values in the clock
     divider (Andy Shevchenko)

   - Adjust comments in the ACPI watchdog driver to prevent kernel-doc
     from complaining during documentation builds (Randy Dunlap)

   - Make the ACPI button driver send wakeup key events to user space in
     addition to power button events on systems that can be woken up by
     the power button (Ken Xue)

   - Adjust pnpacpi_parse_allocated_vendor() to use memcpy() on a full
     structure field (Dmitry Antipov)"

* tag 'acpi-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits)
  ACPI: resource: Add Infinity laptops to irq1_edge_low_force_override
  ACPI: button: trigger wakeup key events
  ACPI: resource: Add another DMI match for the TongFang GMxXGxx
  ACPI: EC: Use a spin lock without disabing interrupts
  ACPI: EC: Use a threaded handler for dedicated IRQ
  ACPI: OSL: Use spin locks without disabling interrupts
  ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
  ACPI: utils: Introduce helper for _DEP list lookup
  ACPI: utils: Fix white space in struct acpi_handle_list definition
  ACPI: utils: Refine acpi_handle_list_equal() slightly
  ACPI: utils: Return bool from acpi_evaluate_reference()
  ACPI: utils: Rearrange in acpi_evaluate_reference()
  ACPI: arm64: export acpi_arch_thermal_cpufreq_pctg()
  ACPI: extlog: Clear Extended Error Log status when RAS_CEC handled the error
  ACPI: LPSS: Fix the fractional clock divider flags
  ACPI: NUMA: Fix the logic of getting the fake_pxm value
  ACPI: NUMA: Optimize the check for the availability of node values
  ACPI: NUMA: Remove unnecessary check in acpi_parse_gi_affinity()
  ACPI: watchdog: fix kernel-doc warnings
  ACPI: extlog: fix NULL pointer dereference check
  ...
2024-01-09 16:12:44 -08:00
Linus Torvalds 9f2a635235 Quite a lot of kexec work this time around. Many singleton patches in
many places.  The notable patch series are:
 
 - nilfs2 folio conversion from Matthew Wilcox in "nilfs2: Folio
   conversions for file paths".
 
 - Additional nilfs2 folio conversion from Ryusuke Konishi in "nilfs2:
   Folio conversions for directory paths".
 
 - IA64 remnant removal in Heiko Carstens's "Remove unused code after
   IA-64 removal".
 
 - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere
   in "Treewide: enable -Wmissing-prototypes".  This had some followup
   fixes:
 
   - Nathan Chancellor has cleaned up the hexagon build in the series
     "hexagon: Fix up instances of -Wmissing-prototypes".
 
   - Nathan also addressed some s390 warnings in "s390: A couple of
     fixes for -Wmissing-prototypes".
 
   - Arnd Bergmann addresses the same warnings for MIPS in his series
     "mips: address -Wmissing-prototypes warnings".
 
 - Baoquan He has made kexec_file operate in a top-down-fitting manner
   similar to kexec_load in the series "kexec_file: Load kernel at top of
   system RAM if required"
 
 - Baoquan He has also added the self-explanatory "kexec_file: print out
   debugging message if required".
 
 - Some checkstack maintenance work from Tiezhu Yang in the series
   "Modify some code about checkstack".
 
 - Douglas Anderson has disentangled the watchdog code's logging when
   multiple reports are occurring simultaneously.  The series is "watchdog:
   Better handling of concurrent lockups".
 
 - Yuntao Wang has contributed some maintenance work on the crash code in
   "crash: Some cleanups and fixes".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZ2R6AAKCRDdBJ7gKXxA
 juCVAP4t76qUISDOSKugB/Dn5E4Nt9wvPY9PcufnmD+xoPsgkQD+JVl4+jd9+gAV
 vl6wkJDiJO5JZ3FVtBtC3DFA/xHtVgk=
 =kQw+
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Quite a lot of kexec work this time around. Many singleton patches in
  many places. The notable patch series are:

   - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio
     conversions for file paths'.

   - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2:
     Folio conversions for directory paths'.

   - IA64 remnant removal in Heiko Carstens's 'Remove unused code after
     IA-64 removal'.

   - Arnd Bergmann has enabled the -Wmissing-prototypes warning
     everywhere in 'Treewide: enable -Wmissing-prototypes'. This had
     some followup fixes:

      - Nathan Chancellor has cleaned up the hexagon build in the series
        'hexagon: Fix up instances of -Wmissing-prototypes'.

      - Nathan also addressed some s390 warnings in 's390: A couple of
        fixes for -Wmissing-prototypes'.

      - Arnd Bergmann addresses the same warnings for MIPS in his series
        'mips: address -Wmissing-prototypes warnings'.

   - Baoquan He has made kexec_file operate in a top-down-fitting manner
     similar to kexec_load in the series 'kexec_file: Load kernel at top
     of system RAM if required'

   - Baoquan He has also added the self-explanatory 'kexec_file: print
     out debugging message if required'.

   - Some checkstack maintenance work from Tiezhu Yang in the series
     'Modify some code about checkstack'.

   - Douglas Anderson has disentangled the watchdog code's logging when
     multiple reports are occurring simultaneously. The series is
     'watchdog: Better handling of concurrent lockups'.

   - Yuntao Wang has contributed some maintenance work on the crash code
     in 'crash: Some cleanups and fixes'"

* tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits)
  crash_core: fix and simplify the logic of crash_exclude_mem_range()
  x86/crash: use SZ_1M macro instead of hardcoded value
  x86/crash: remove the unused image parameter from prepare_elf_headers()
  kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE
  scripts/decode_stacktrace.sh: strip unexpected CR from lines
  watchdog: if panicking and we dumped everything, don't re-enable dumping
  watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/hardlockup: adopt softlockup logic avoiding double-dumps
  kexec_core: fix the assignment to kimage->control_page
  x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init()
  lib/trace_readwrite.c:: replace asm-generic/io with linux/io
  nilfs2: cpfile: fix some kernel-doc warnings
  stacktrace: fix kernel-doc typo
  scripts/checkstack.pl: fix no space expression between sp and offset
  x86/kexec: fix incorrect argument passed to kexec_dprintk()
  x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs
  nilfs2: add missing set_freezable() for freezable kthread
  kernel: relay: remove relay_file_splice_read dead code, doesn't work
  docs: submit-checklist: remove all of "make namespacecheck"
  ...
2024-01-09 11:46:20 -08:00
Linus Torvalds fb46e22a9e Many singleton patches against the MM code. The patch series which
are included in this merge do the following:
 
 - Peng Zhang has done some mapletree maintainance work in the
   series
 
 	"maple_tree: add mt_free_one() and mt_attr() helpers"
 	"Some cleanups of maple tree"
 
 - In the series "mm: use memmap_on_memory semantics for dax/kmem"
   Vishal Verma has altered the interworking between memory-hotplug
   and dax/kmem so that newly added 'device memory' can more easily
   have its memmap placed within that newly added memory.
 
 - Matthew Wilcox continues folio-related work (including a few
   fixes) in the patch series
 
 	"Add folio_zero_tail() and folio_fill_tail()"
 	"Make folio_start_writeback return void"
 	"Fix fault handler's handling of poisoned tail pages"
 	"Convert aops->error_remove_page to ->error_remove_folio"
 	"Finish two folio conversions"
 	"More swap folio conversions"
 
 - Kefeng Wang has also contributed folio-related work in the series
 
 	"mm: cleanup and use more folio in page fault"
 
 - Jim Cromie has improved the kmemleak reporting output in the
   series "tweak kmemleak report format".
 
 - In the series "stackdepot: allow evicting stack traces" Andrey
   Konovalov to permits clients (in this case KASAN) to cause
   eviction of no longer needed stack traces.
 
 - Charan Teja Kalla has fixed some accounting issues in the page
   allocator's atomic reserve calculations in the series "mm:
   page_alloc: fixes for high atomic reserve caluculations".
 
 - Dmitry Rokosov has added to the samples/ dorectory some sample
   code for a userspace memcg event listener application.  See the
   series "samples: introduce cgroup events listeners".
 
 - Some mapletree maintanance work from Liam Howlett in the series
   "maple_tree: iterator state changes".
 
 - Nhat Pham has improved zswap's approach to writeback in the
   series "workload-specific and memory pressure-driven zswap
   writeback".
 
 - DAMON/DAMOS feature and maintenance work from SeongJae Park in
   the series
 
 	"mm/damon: let users feed and tame/auto-tune DAMOS"
 	"selftests/damon: add Python-written DAMON functionality tests"
 	"mm/damon: misc updates for 6.8"
 
 - Yosry Ahmed has improved memcg's stats flushing in the series
   "mm: memcg: subtree stats flushing and thresholds".
 
 - In the series "Multi-size THP for anonymous memory" Ryan Roberts
   has added a runtime opt-in feature to transparent hugepages which
   improves performance by allocating larger chunks of memory during
   anonymous page faults.
 
 - Matthew Wilcox has also contributed some cleanup and maintenance
   work against eh buffer_head code int he series "More buffer_head
   cleanups".
 
 - Suren Baghdasaryan has done work on Andrea Arcangeli's series
   "userfaultfd move option".  UFFDIO_MOVE permits userspace heap
   compaction algorithms to move userspace's pages around rather than
   UFFDIO_COPY'a alloc/copy/free.
 
 - Stefan Roesch has developed a "KSM Advisor", in the series
   "mm/ksm: Add ksm advisor".  This is a governor which tunes KSM's
   scanning aggressiveness in response to userspace's current needs.
 
 - Chengming Zhou has optimized zswap's temporary working memory
   use in the series "mm/zswap: dstmem reuse optimizations and
   cleanups".
 
 - Matthew Wilcox has performed some maintenance work on the
   writeback code, both code and within filesystems.  The series is
   "Clean up the writeback paths".
 
 - Andrey Konovalov has optimized KASAN's handling of alloc and
   free stack traces for secondary-level allocators, in the series
   "kasan: save mempool stack traces".
 
 - Andrey also performed some KASAN maintenance work in the series
   "kasan: assorted clean-ups".
 
 - David Hildenbrand has gone to town on the rmap code.  Cleanups,
   more pte batching, folio conversions and more.  See the series
   "mm/rmap: interface overhaul".
 
 - Kinsey Ho has contributed some maintenance work on the MGLRU
   code in the series "mm/mglru: Kconfig cleanup".
 
 - Matthew Wilcox has contributed lruvec page accounting code
   cleanups in the series "Remove some lruvec page accounting
   functions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZyF2wAKCRDdBJ7gKXxA
 jjWjAP42LHvGSjp5M+Rs2rKFL0daBQsrlvy6/jCHUequSdWjSgEAmOx7bc5fbF27
 Oa8+DxGM9C+fwqZ/7YxU2w/WuUmLPgU=
 =0NHs
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Peng Zhang has done some mapletree maintainance work in the series

	'maple_tree: add mt_free_one() and mt_attr() helpers'
	'Some cleanups of maple tree'

   - In the series 'mm: use memmap_on_memory semantics for dax/kmem'
     Vishal Verma has altered the interworking between memory-hotplug
     and dax/kmem so that newly added 'device memory' can more easily
     have its memmap placed within that newly added memory.

   - Matthew Wilcox continues folio-related work (including a few fixes)
     in the patch series

	'Add folio_zero_tail() and folio_fill_tail()'
	'Make folio_start_writeback return void'
	'Fix fault handler's handling of poisoned tail pages'
	'Convert aops->error_remove_page to ->error_remove_folio'
	'Finish two folio conversions'
	'More swap folio conversions'

   - Kefeng Wang has also contributed folio-related work in the series

	'mm: cleanup and use more folio in page fault'

   - Jim Cromie has improved the kmemleak reporting output in the series
     'tweak kmemleak report format'.

   - In the series 'stackdepot: allow evicting stack traces' Andrey
     Konovalov to permits clients (in this case KASAN) to cause eviction
     of no longer needed stack traces.

   - Charan Teja Kalla has fixed some accounting issues in the page
     allocator's atomic reserve calculations in the series 'mm:
     page_alloc: fixes for high atomic reserve caluculations'.

   - Dmitry Rokosov has added to the samples/ dorectory some sample code
     for a userspace memcg event listener application. See the series
     'samples: introduce cgroup events listeners'.

   - Some mapletree maintanance work from Liam Howlett in the series
     'maple_tree: iterator state changes'.

   - Nhat Pham has improved zswap's approach to writeback in the series
     'workload-specific and memory pressure-driven zswap writeback'.

   - DAMON/DAMOS feature and maintenance work from SeongJae Park in the
     series

	'mm/damon: let users feed and tame/auto-tune DAMOS'
	'selftests/damon: add Python-written DAMON functionality tests'
	'mm/damon: misc updates for 6.8'

   - Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
     memcg: subtree stats flushing and thresholds'.

   - In the series 'Multi-size THP for anonymous memory' Ryan Roberts
     has added a runtime opt-in feature to transparent hugepages which
     improves performance by allocating larger chunks of memory during
     anonymous page faults.

   - Matthew Wilcox has also contributed some cleanup and maintenance
     work against eh buffer_head code int he series 'More buffer_head
     cleanups'.

   - Suren Baghdasaryan has done work on Andrea Arcangeli's series
     'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
     compaction algorithms to move userspace's pages around rather than
     UFFDIO_COPY'a alloc/copy/free.

   - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
     Add ksm advisor'. This is a governor which tunes KSM's scanning
     aggressiveness in response to userspace's current needs.

   - Chengming Zhou has optimized zswap's temporary working memory use
     in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.

   - Matthew Wilcox has performed some maintenance work on the writeback
     code, both code and within filesystems. The series is 'Clean up the
     writeback paths'.

   - Andrey Konovalov has optimized KASAN's handling of alloc and free
     stack traces for secondary-level allocators, in the series 'kasan:
     save mempool stack traces'.

   - Andrey also performed some KASAN maintenance work in the series
     'kasan: assorted clean-ups'.

   - David Hildenbrand has gone to town on the rmap code. Cleanups, more
     pte batching, folio conversions and more. See the series 'mm/rmap:
     interface overhaul'.

   - Kinsey Ho has contributed some maintenance work on the MGLRU code
     in the series 'mm/mglru: Kconfig cleanup'.

   - Matthew Wilcox has contributed lruvec page accounting code cleanups
     in the series 'Remove some lruvec page accounting functions'"

* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
  mm, treewide: introduce NR_PAGE_ORDERS
  selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
  selftests/mm: skip test if application doesn't has root privileges
  selftests/mm: conform test to TAP format output
  selftests: mm: hugepage-mmap: conform to TAP format output
  selftests/mm: gup_test: conform test to TAP format output
  mm/selftests: hugepage-mremap: conform test to TAP format output
  mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
  mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
  mm/memcontrol: remove __mod_lruvec_page_state()
  mm/khugepaged: use a folio more in collapse_file()
  slub: use a folio in __kmalloc_large_node
  slub: use folio APIs in free_large_kmalloc()
  slub: use alloc_pages_node() in alloc_slab_page()
  mm: remove inc/dec lruvec page state functions
  mm: ratelimit stat flush from workingset shrinker
  kasan: stop leaking stack trace handles
  mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
  mm/mglru: add dummy pmd_dirty()
  ...
2024-01-09 11:18:47 -08:00
Linus Torvalds d30e51aa7b slab updates for 6.8
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmWWu9EACgkQu+CwddJF
 iJpXvQf/aGL7uEY57VpTm0t4gPwoZ9r2P89HxI/nQs9XgVzDcBmVp/cC0LDvSdcm
 t91kJO538KeGjMgvlhLMTEuoShH5FlPs6cOwrGAYUoAGa4NwiOpGvliGky+nNHqY
 w887ZgSzVLq0UOuSvn86N6enumMvewt4V+872+OWo6O1HWOJhC0SgHTIa8QPQtwb
 yZ9BghO5IqMRXiZEsSIwyO+tQHcaU6l2G5huFXzgMFUhkQqAB9KTFc3h6rYI+i80
 L4ppNXo2KNPGTDRb9dA8LNMWgvmfjhCb7chs8o1zSY2PwZlkzOix7EUBLCAIbc/2
 EIaFC8AsZjfT47D1t72r8QpHB+C14Q==
 =J+E7
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

 - SLUB: delayed freezing of CPU partial slabs (Chengming Zhou)

   Freezing is an operation involving double_cmpxchg() that makes a slab
   exclusive for a particular CPU. Chengming noticed that we use it also
   in situations where we are not yet installing the slab as the CPU
   slab, because freezing also indicates that the slab is not on the
   shared list. This results in redundant freeze/unfreeze operation and
   can be avoided by marking separately the shared list presence by
   reusing the PG_workingset flag.

   This approach neatly avoids the issues described in 9b1ea29bc0
   ("Revert "mm, slub: consider rest of partial list if acquire_slab()
   fails"") as we can now grab a slab from the shared list in a quick
   and guaranteed way without the cmpxchg_double() operation that
   amplifies the lock contention and can fail.

   As a result, lkp has reported 34.2% improvement of
   stress-ng.rawudp.ops_per_sec

 - SLAB removal and SLUB cleanups (Vlastimil Babka)

   The SLAB allocator has been deprecated since 6.5 and nobody has
   objected so far. We agreed at LSF/MM to wait until the next LTS,
   which is 6.6, so we should be good to go now.

   This doesn't yet erase all traces of SLAB outside of mm/ so some dead
   code, comments or documentation remain, and will be cleaned up
   gradually (some series are already in the works).

   Removing the choice of allocators has already allowed to simplify and
   optimize the code wiring up the kmalloc APIs to the SLUB
   implementation.

* tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
  mm/slub: free KFENCE objects in slab_free_hook()
  mm/slub: handle bulk and single object freeing separately
  mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  mm/slub: fix bulk alloc and free stats
  mm/slub: optimize free fast path code layout
  mm/slub: optimize alloc fastpath code layout
  mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers
  mm/slab: move kmalloc() functions from slab_common.c to slub.c
  mm/slab: move kmalloc_slab() to mm/slab.h
  mm/slab: move kfree() from slab_common.c to slub.c
  mm/slab: move struct kmem_cache_node from slab.h to slub.c
  mm/slab: move memcg related functions from slab.h to slub.c
  mm/slab: move pre/post-alloc hooks from slab.h to slub.c
  mm/slab: consolidate includes in the internal mm/slab.h
  mm/slab: move the rest of slub_def.h to mm/slab.h
  mm/slab: move struct kmem_cache_cpu declaration to slub.c
  mm/slab: remove mm/slab.c and slab_def.h
  mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs
  mm/slab: remove CONFIG_SLAB code from slab common code
  cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks
  ...
2024-01-09 10:36:07 -08:00
Linus Torvalds ab9517fa9a Debugobject changes for v6.8:
- Make tracking object use more robust: it's not safe to access a
    tracking object after releasing the hashbucket lock. Create a
    persistent copy for debug printouts instead.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmWbwj4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iQORAAkO9jZmvIgh3e0boJyvVkYh+m+O0sMSek
 /HAnoCLk5ipA8I8uhK7r5aqLHPPCoxw2MLihnKj4rTK1nPULZU/VMmge+4hUComv
 vy1m7tHeUShdAlxSvjZ/807oh546nA5IySv3qG6KCaemHeELllp3gRoIkkTjmnz6
 GcPUPQ9WIk8ptyplvyxOR4mw6QVynkIofk7p6C0ePwGNn7yFrYczc3VsnFL+Kmat
 fl5ZsyN9bRfG4q4bCqj4nKXuUyiJZdbtO2VkyPZtYFmlpERcvGiDHCtzQoKPHS4k
 CvAz4Eq11DdHjPr5VdRIuW7M91CyGE/2QMOfCOlxPqD+Jh0zTbO00U5BuRhicPXz
 rn0Yg5J54gDqv/2jtq6eSNsM28OrhGEu+nep9165e4BBhI6oYd0Wbq0TH+rlzohb
 048zMePY77nLffwNEpa4dEK2+UKLwufeQiP+d2BGfe96mQQ2i4LIRDzdu/BQtR4W
 CU7gbcy2e4q+hjI0gB7smFhQi5zR7tIt3WSfLY6hRpzDcpRqaTvfILyzWFgYN2RN
 I3B9+n3aTlCyCdeFseFajcVvBg5KexkAaPphhfBrkZ9viB+iKUjmcN8mMrvRfX6I
 vjCeIaNCL7arCAkce0hhHZ0WUf/f+DPoEJCb+Z9EuBmOosO/3tlNH6YGyeJNN+vx
 r7mt+MSC4CY=
 =WFe/
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobject update from Ingo Molnar:

 - Make tracking object use more robust: it's not safe to access a
   tracking object after releasing the hashbucket lock. Create a
   persistent copy for debug printouts instead.

* tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Stop accessing objects after releasing hash bucket lock
2024-01-08 18:35:33 -08:00
Kirill A. Shutemov fd37721803 mm, treewide: introduce NR_PAGE_ORDERS
NR_PAGE_ORDERS defines the number of page orders supported by the page
allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
more natural iteration over them.

[kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
  Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08 15:27:15 -08:00
Linus Torvalds 5db8752c3b vfs-6.8.iov_iter
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZZUzBQAKCRCRxhvAZXjc
 ot+3AQCZw1PBD4azVxFMWH76qwlAGoVIFug4+ogKU/iUa4VLygEA2FJh1vLJw5iI
 LpgBEIUTPVkwtzinAW94iJJo1Vr7NAI=
 =p6PB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs iov_iter cleanups from Christian Brauner:
 "This contains a minor cleanup. The patches drop an unused argument
  from import_single_range() allowing to replace import_single_range()
  with import_ubuf() and dropping import_single_range() completely"

* tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter: replace import_single_range() with import_ubuf()
  iov_iter: remove unused 'iov' argument from import_single_range()
2024-01-08 11:43:04 -08:00
Dan Williams 80dda9a69a Merge branch 'for-6.8/cxl-misc' into for-6.8/cxl
Pick up some miscellaneous fixups for v6.8.
2024-01-05 19:03:06 -08:00
Jakub Kicinski e63c1822ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  e009b2efb7 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()")
  0f2b214779 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL")
https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04 18:06:46 -08:00
Rafael J. Wysocki 8be056a2c0 Merge branches 'acpi-osl', 'acpi-bus' and 'acpi-tables'
Merge low-level ACPICA interface changes, an _SB-scope _OSC handshake
update and a data-only ACPI tables parsing code update for 6.8-rc1:

 - Switch over ACPI to using a threaded interrupt handler for the
   SCI (Rafael J. Wysocki).

 - Allow ACPI Notify () handlers to run on all CPUs and clean up the
   ACPI interface for deferred events processing (Rafael J. Wysocki).

 - Switch over the ACPI EC driver to using a threaded handler for the
   dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki).

 - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock to
   keep local interrupts on (Rafael J. Wysocki).

 - Adjust the USB4 _OSC handshake to correctly handle cases in which
   certain types of OS control are denied by the platform (Mika
   Westerberg).

 - Correct and clean up the generic function for parsing ACPI data-only
   tables with array structure (Yuntao Wang).

* acpi-osl:
  ACPI: EC: Use a spin lock without disabing interrupts
  ACPI: EC: Use a threaded handler for dedicated IRQ
  ACPI: OSL: Use spin locks without disabling interrupts
  ACPI: OSL: Allow Notify () handlers to run on all CPUs
  ACPI: OSL: Rearrange workqueue selection in acpi_os_execute()
  ACPI: OSL: Rework error handling in acpi_os_execute()
  ACPI: OSL: Use a threaded interrupt handler for SCI

* acpi-bus:
  ACPI: Run USB4 _OSC() first with query bit set

* acpi-tables:
  ACPI: tables: Correct and clean up the logic of acpi_parse_entries_array()
2024-01-04 12:35:42 +01:00
David Gow 539e582a37 kunit: Fix some comments which were mistakenly kerneldoc
The KUnit device helpers are documented with kerneldoc in their header
file, but also have short comments over their implementation. These were
mistakenly formatted as kerneldoc comments, even though they're not
valid kerneldoc. It shouldn't cause any serious problems -- this file
isn't included in the docs -- but it could be confusing, and causes
warnings.

Remove the extra '*' so that these aren't treated as kerneldoc.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312181920.H4EPAH20-lkp@intel.com/
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:10:37 -07:00
Richard Fitzgerald 5fb1a8c671 kunit: Add example of kunit_activate_static_stub() with pointer-to-function
Adds a variant of example_static_stub_test() that shows use of a
pointer-to-function with kunit_activate_static_stub().

A const pointer to the add_one() function is declared. This
pointer-to-function is passed to kunit_activate_static_stub() and
kunit_deactivate_static_stub() instead of passing add_one directly.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:07:23 -07:00
Richard Fitzgerald a0b84213f9 kunit: Fix NULL-dereference in kunit_init_suite() if suite->log is NULL
suite->log must be checked for NULL before passing it to
string_stream_clear(). This was done in kunit_init_test() but was missing
from kunit_init_suite().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 6d696c4695c5 ("kunit: add ability to run tests after boot using debugfs")
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:06:19 -07:00
Tanzir Hasan 037d88f0dd lib/trace_readwrite.c:: replace asm-generic/io with linux/io
asm-generic/io.h can be replaced with linux/io.h and the file will still
build correctly.  It is an asm-generic file which should be avoided if
possible.

Link: https://lkml.kernel.org/r/20231221-tracereadwrite-v1-1-a434f25180c7@google.com
Signed-off-by: Tanzir Hasan <tanzirh@google.com>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:29 -08:00
Mathis Marion 90ca22513e lib: crc_ccitt_false() is identical to crc_itu_t()
crc_ccitt_false() was introduced in commit 0d85adb5fb ("lib/crc-ccitt:
Add CCITT-FALSE CRC16 variant"), but it is redundant with crc_itu_t(). 
Since the latter is more used, it is the one being kept.

Link: https://lkml.kernel.org/r/20231219131154.748577-1-Mathis.Marion@silabs.com
Signed-off-by: Mathis Marion <mathis.marion@silabs.com>
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>
Cc: Jérôme Pouiller <jerome.pouiller@silabs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:26 -08:00
Uwe Kleine-König bc09d1dea8 lib: add note about process exit message for DEBUG_STACK_USAGE
DEBUG_STACK_USAGE doesn't only have an influence on the output of sysrq-T
and sysrq-P, it also enables a message at process exit.  See
check_stack_usage() in kernel/exit.c where this is implemented.

Link: https://lkml.kernel.org/r/20231219182808.210284-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:26 -08:00
Andrey Konovalov a914d8d6cf lib/stackdepot: add printk_deferred_enter/exit guards
Patch series "lib/stackdepot, kasan: fixes for stack eviction series", v3.

A few fixes for the stack depot eviction series ("stackdepot: allow
evicting stack traces").

This patch (of 5):

Stack depot functions can be called from various contexts that do
allocations, including with console locks taken.  At the same time, stack
depot functions might print WARNING's or refcount-related failures.

This can cause a deadlock on console locks.

Add printk_deferred_enter/exit guards to stack depot to avoid this.

Link: https://lkml.kernel.org/r/cover.1703020707.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/82092f9040d075a161d1264377d51e0bac847e8a.1703020707.git.andreyknvl@google.com
Fixes: 108be8def4 ("lib/stackdepot: allow users to evict stack traces")
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Closes: https://lore.kernel.org/all/000000000000f56750060b9ad216@google.com/
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 11:58:41 -08:00
Joel Granados c8a65501d3 sysctl: Remove the now superfluous sentinel elements from ctl_table array
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove empty sentinel element from test_table and test_table_unregister.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-28 04:57:57 -08:00
Joel Granados 777740779e sysctl: Add a selftest for handling empty dirs
Basic test to ensure that empty directories can be registered and that
they in turn can serve as a base dir for other registrations.

Add one test to the sysctl selftest module. It first registers an empty
directory under "empty_add" and then uses that as a base to register
another empty dir.
The sysctl bash script then checks that "empty_add" is present and that
there an empty directory within it.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-28 04:57:57 -08:00
Linus Torvalds f5837722ff 11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues or
are not considered backporting material.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZYys4AAKCRDdBJ7gKXxA
 jtmaAQC+o04Ia7IfB8MIqp1p7dNZQo64x/EnGA8YjUnQ8N6IwQD+ImU7dHl9g9Oo
 ROiiAbtMRBUfeJRsExX/Yzc1DV9E9QM=
 =ZGcs
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues
  or are not considered backporting material"

* tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add an old address for Naoya Horiguchi
  mm/memory-failure: cast index to loff_t before shifting it
  mm/memory-failure: check the mapcount of the precise page
  mm/memory-failure: pass the folio and the page to collect_procs()
  selftests: secretmem: floor the memory size to the multiple of page_size
  mm: migrate high-order folios in swap cache correctly
  maple_tree: do not preallocate nodes for slot stores
  mm/filemap: avoid buffered read/write race to read inconsistent data
  kunit: kasan_test: disable fortify string checker on kmalloc_oob_memset
  kexec: select CRYPTO from KEXEC_FILE instead of depending on it
  kexec: fix KEXEC_FILE dependencies
2023-12-27 16:14:41 -08:00
Kent Overstreet 1e2f2d3199 Kill sched.h dependency on rcupdate.h
by moving cond_resched_rcu() to rcupdate_wait.h, we can kill another big
sched.h dependency.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-27 11:50:20 -05:00
Dave Jiang 60e43fe528 lib/firmware_table: tables: Add CDAT table parsing support
The CDAT table is very similar to ACPI tables when it comes to sub-table
and entry structures. The helper functions can be also used to parse the
CDAT table. Add support to the helper functions to deal with an external
CDAT table, and also handle the endieness since CDAT can be processed by a
BE host. Export a function cdat_table_parse() for CXL driver to parse
a CDAT table.

In order to minimize ACPICA code changes, __force is being utilized to deal
with the case of a big endian (BE) host parsing a CDAT. All CDAT data
structure variables are being force casted to __leX as appropriate.

Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/170319615131.2212653.10932785667981494238.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-22 14:23:13 -08:00
Linus Torvalds c0f65a7c11 printk changes for 6.8
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmWFxecACgkQUqAMR0iA
 lPLq0RAAg0YQxIltUuR1/IcsVB6rMz8eekmR62sQlmsM2mYRHiEX4ffvM2aSuoMW
 1ikAqtfDCFpjIg2dxlirIIA1bkCj7P9wUDShwZispPmjiAag1Woj3E02Q7enxljF
 LwkdBP0rm/59jGj3u/HKqQ/vwbusg6cezoCXXUqf9BMc7AaD9w4wyF4LbdyqVdcS
 B1ngNvqyONH7LS2rp3sixowbO0KDgtaPKFg3kC9r5Aqhz0pvJtmrdQhssx5jQzJ3
 emnKdSPvrtbeLMa+/mYO9OsYa2PnqkotLRq5ulW4b/9QNluPmVqQbkyqiFJGldFo
 /jHLIIYM9CwaOhvqLY5o1uBq8jXQH6dyfE8Yv2aOcVGN8loTR3eG77h+ZgDAsIXL
 6p5Twx4zCYlI80R8H5nVdpPa0MUjLBK7bG/oT7pmGnxK9pqtKNdejXqhrPzUcH0i
 sVbR9BfOTcWXQlL1CcQRc5sfoikv0Qp6TQbpaE+qyHogWZI3oOaJBo2ietvYx0Lp
 E73amdvkJ2q55rEspTdLxLDQCd7hov2zkQMtIHzLBybk7i9QaMLnzzjmUe7yy9yc
 il/8Ma+8PtYsPfC77PjMp7shI94Ldps0383I/NAon8DrUG7bV6CXd3QAPB1c7P/b
 AN7Qjig1udk7yZDHCm5TbX/niB3/sfiBWJINHG764+iUa3lkJ8E=
 =m5F8
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:

 - Prevent refcount warning from code releasing a fwnode

* tag 'printk-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  lib/vsprintf: Fix %pfwf when current node refcount == 0
2023-12-22 13:41:29 -08:00
Tianjia Zhang ba3c557420 crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
When the mpi_ec_ctx structure is initialized, some fields are not
cleared, causing a crash when referencing the field when the
structure was released. Initially, this issue was ignored because
memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag.
For example, this error will be triggered when calculating the
Za value for SM2 separately.

Fixes: d58bb7e55a ("lib/mpi: Introduce ec implementation to MPI library")
Cc: stable@vger.kernel.org # v6.5
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22 12:30:19 +08:00
Paolo Abeni 56794e5358 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
  23c93c3b62 ("bnxt_en: do not map packet buffers twice")
  6d1add9553 ("bnxt_en: Modify TX ring indexing logic.")

tools/testing/selftests/net/Makefile
  2258b66648 ("selftests: add vlan hw filter tests")
  a0bc96c0cd ("selftests: net: verify fq per-band packet limit")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-21 22:17:23 +01:00
Matthew Wilcox (Oracle) af73483f4e ida: Fix crash in ida_free when the bitmap is empty
The IDA usually detects double-frees, but that detection failed to
consider the case when there are no nearby IDs allocated and so we have a
NULL bitmap rather than simply having a clear bit.  Add some tests to the
test-suite to be sure we don't inadvertently reintroduce this problem.
Unfortunately they're quite noisy so include a message to disregard
the warnings.

Reported-by: Zhenghan Wang <wzhmmmmm@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-21 10:02:28 -08:00
Mark Rutland 0df52582e0 kcov: remove stale RANDOMIZE_BASE text
The Kconfig help text for CONFIG_KCOV describes that recorded PC values
will not be stable across machines or reboots when RANDOMIZE_BASE is
selected.  This was the case when KCOV was introduced in commit:

  5c9a8750a6 ("kernel: add kcov code coverage")

However, this changed in commit:

  4983f0ab7f ("kcov: make kcov work properly with KASLR enabled")

Since that commit KCOV always subtracts the KASLR offset from PC values,
which ensures that these are stable across machines and across reboots
even when RANDOMIZE_BASE is selected.

Unfortunately, that commit failed to update the Kconfig help text, which
still suggests disabling RANDOMIZE_BASE even though this is no longer
necessary.

Remove the stale Kconfig text.

Link: https://lkml.kernel.org/r/20231204171807.3313022-1-mark.rutland@arm.com
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 15:02:57 -08:00
Borislav Petkov (AMD) ffda655682 UBSAN: use the kernel panic message markers
Use the same splat markers as panic does for easier matching by external
tools scanning kernel dmesg for splats.

Link: https://lkml.kernel.org/r/20231218135339.23209-1-bp@alien8.de
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:14 -08:00
Peng Zhang 7e552dcd80 maple_tree: avoid checking other gaps after getting the largest gap
The last range stored in maple tree is typically quite large.  By checking
if it exceeds the sum of the remaining ranges in that node, it is possible
to avoid checking all other gaps.

Running the maple tree test suite in user mode almost always results in a
near 100% hit rate for this optimization.

Link: https://lkml.kernel.org/r/20231215074632.82045-1-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:14 -08:00
Randy Dunlap d5f6057cf0 maple_tree: fix typos/spellos etc
Fix typos/grammar and spellos in documentation.

Link: https://lkml.kernel.org/r/20231210063839.29967-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:12 -08:00
Andrew Morton 5143eecd2a lib/maple_tree.c: fix build error due to hotfix alteration
Commit 0de56e38b3 ("maple_tree: use maple state end for write
operations") was broken by a later patch "maple_tree: do not preallocate
nodes for slot stores".  But the later patch was scheduled ahead of
0de56e38b3, for 6.7-rc.

This fixlet undoes the damage.

Fixes: 0de56e38b3 ("maple_tree: use maple state end for write operations")
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:11 -08:00
Andrew Morton a721aeac8b sync mm-stable with mm-hotfixes-stable to pick up depended-upon changes 2023-12-20 14:47:18 -08:00
Sidhartha Kumar 4249f13c11 maple_tree: do not preallocate nodes for slot stores
mas_preallocate() defaults to requesting 1 node for preallocation and then
,depending on the type of store, will update the request variable.  There
isn't a check for a slot store type, so slot stores are preallocating the
default 1 node.  Slot stores do not require any additional nodes, so add a
check for the slot store case that will bypass node_count_gfp().  Update
the tests to reflect that slot stores do not require allocations.

User visible effects of this bug include increased memory usage from the
unneeded node that was allocated.

Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com
Fixes: 0b8bb544b1 ("maple_tree: update mas_preallocate() testing")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <stable@vger.kernel.org>	[6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 13:46:19 -08:00
Jakub Kicinski c49b292d03 netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmWAz2EACgkQ6rmadz2v
 bToqrw/9EwroZCc8GEHOKAlb/fzrMvn92rLo0ZW/cGN84QJPnx4zM6Zo0+fgLaaN
 oqqztwMUwdzGC3uX3FfVXaaLKbJ/MeHeL9BXFZNW8zkRHciw4R7kIBhOdPnHyET7
 uT+rQ4xPe1Mt7e9PjepKlSL5mEsxWfBkdUgsdn19Z2Vjdfr9mZMhYWYMJGcfTCD1
 TwxHKBPhq5fN3IsshmMBB8IrRp1HStUKb65MgZ4dI22LJXxTsFkx5XMFXcmuqvkH
 NhKj8jDcPEEh31bYcb6aG2Z4onw5F2lquygjk1Qyy5cyw45m/ipJKAXKdAyvJG+R
 VZCWOET/9wbRwFSK5wxwihCuKghFiofK52i2PcGtXZh0PCouyZZneSJOKM0yVWKO
 BvuJBxK4ETRnQyN6ZxhuJiEXG3/YMBBhyR2TX1LntVK9ct/k7qFVzATG49J39/sR
 SYMbptBRj4a5oMJ1qn0nFVEDFkg0jTnTDNnsEpcz60Ayt6EsJ1XosO5yz2huf861
 xgRMTKMseyG1/uV45tQ8ZPzbSPpBxjUi9Dl3coYsIm1a+y6clWUXcarONY5KVrpS
 CR98DuFgl+E7dXuisd/Kz2p2KxxSPq8nytsmLlgOvrUqhwiXqB+TKN8EHgIapVOt
 l1A5LrzXFTcGlT9MlaWBqEIy83Bu1nqQqbxrAFOE0k8A5jomXaw=
 =stU2
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2023-12-18

This PR is larger than usual and contains changes in various parts
of the kernel.

The main changes are:

1) Fix kCFI bugs in BPF, from Peter Zijlstra.

End result: all forms of indirect calls from BPF into kernel
and from kernel into BPF work with CFI enabled. This allows BPF
to work with CONFIG_FINEIBT=y.

2) Introduce BPF token object, from Andrii Nakryiko.

It adds an ability to delegate a subset of BPF features from privileged
daemon (e.g., systemd) through special mount options for userns-bound
BPF FS to a trusted unprivileged application. The design accommodates
suggestions from Christian Brauner and Paul Moore.

Example:
$ sudo mkdir -p /sys/fs/bpf/token
$ sudo mount -t bpf bpffs /sys/fs/bpf/token \
             -o delegate_cmds=prog_load:MAP_CREATE \
             -o delegate_progs=kprobe \
             -o delegate_attachs=xdp

3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei.

 - Complete precision tracking support for register spills
 - Fix verification of possibly-zero-sized stack accesses
 - Fix access to uninit stack slots
 - Track aligned STACK_ZERO cases as imprecise spilled registers.
   It improves the verifier "instructions processed" metric from single
   digit to 50-60% for some programs.
 - Fix verifier retval logic

4) Support for VLAN tag in XDP hints, from Larysa Zaremba.

5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu.

End result: better memory utilization and lower I$ miss for calls to BPF
via BPF trampoline.

6) Fix race between BPF prog accessing inner map and parallel delete,
from Hou Tao.

7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu.

It allows BPF interact with IPSEC infra. The intent is to support
software RSS (via XDP) for the upcoming ipsec pcpu work.
Experiments on AWS demonstrate single tunnel pcpu ipsec reaching
line rate on 100G ENA nics.

8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao.

9) BPF file verification via fsverity, from Song Liu.

It allows BPF progs get fsverity digest.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits)
  bpf: Ensure precise is reset to false in __mark_reg_const_zero()
  selftests/bpf: Add more uprobe multi fail tests
  bpf: Fail uprobe multi link with negative offset
  selftests/bpf: Test the release of map btf
  s390/bpf: Fix indirect trampoline generation
  selftests/bpf: Temporarily disable dummy_struct_ops test on s390
  x86/cfi,bpf: Fix bpf_exception_cb() signature
  bpf: Fix dtor CFI
  cfi: Add CFI_NOSEAL()
  x86/cfi,bpf: Fix bpf_struct_ops CFI
  x86/cfi,bpf: Fix bpf_callback_t CFI
  x86/cfi,bpf: Fix BPF JIT call
  cfi: Flip headers
  selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
  selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
  selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
  bpf: Limit the number of kprobes when attaching program to multiple kprobes
  bpf: Limit the number of uprobes when attaching program to multiple uprobes
  bpf: xdp: Register generic_kfunc_set with XDP programs
  selftests/bpf: utilize string values for delegate_xxx mount options
  ...
====================

Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18 16:46:08 -08:00
Michal Wajdeczko 342fb97892 kunit: Reset test->priv after each param iteration
If we run parameterized test that uses test->priv to prepare some
custom data, then value of test->priv will leak to the next param
iteration and may be unexpected.  This could be easily seen if
we promote example_priv_test to parameterized test as then only
first test iteration will be successful:

$ ./tools/testing/kunit/kunit.py run \
	--kunitconfig ./lib/kunit/.kunitconfig *.example_priv*

[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] ==================== example_priv_test  ====================
[ ] [PASSED] example value 3
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 2
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 1
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 0
[ ] # example_priv_test: initializing
[ ] # example_priv_test: cleaning up
[ ] # example_priv_test: pass:1 fail:3 skip:0 total:4
[ ] ================ [FAILED] example_priv_test ================
[ ]     # example: initializing suite
[ ]     # module: kunit_example_test
[ ]     # example: exiting suite
[ ] # Totals: pass:1 fail:3 skip:0 total:4
[ ] ===================== [FAILED] example =====================

Fix that by resetting test->priv after each param iteration, in
similar way what we did for the test->status.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:39:05 -07:00
Michal Wajdeczko 2b61582acd kunit: Add example for using test->priv
In a test->priv field the user can store arbitrary data.
Add example how to use this feature in the test code.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:39:00 -07:00
davidgow@google.com 837018388e overflow: Replace fake root_device with kunit_device
Using struct root_device to create fake devices for tests is something
of a hack. The new struct kunit_device is meant for this purpose, so use
it instead.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
davidgow@google.com 46ee8f688e fortify: test: Use kunit_device
Using struct root_device to create fake devices for tests is something
of a hack. The new struct kunit_device is meant for this purpose, so use
it instead.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
davidgow@google.com d03c720e03 kunit: Add APIs for managing devices
Tests for drivers often require a struct device to pass to other
functions. While it's possible to create these with
root_device_register(), or to use something like a platform device, this
is both a misuse of those APIs, and can be difficult to clean up after,
for example, a failed assertion.

Add some KUnit-specific functions for registering and unregistering a
struct device:
- kunit_device_register()
- kunit_device_register_with_driver()
- kunit_device_unregister()

These helpers allocate a on a 'kunit' bus which will either probe the
driver passed in (kunit_device_register_with_driver), or will create a
stub driver (kunit_device_register) which is cleaned up on test shutdown.

Devices are automatically unregistered on test shutdown, but can be
manually unregistered earlier with kunit_device_unregister() in order
to, for example, test device release code.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
Rae Moar c72a870926 kunit: add ability to run tests after boot using debugfs
Add functionality to run built-in tests after boot by writing to a
debugfs file.

Add a new debugfs file labeled "run" for each test suite to use for
this purpose.

As an example, write to the file using the following:

echo "any string" > /sys/kernel/debugfs/kunit/<testsuite>/run

This will trigger the test suite to run and will print results to the
kernel log.

To guard against running tests concurrently with this feature, add a
mutex lock around running kunit. This supports the current practice of
not allowing tests to be run concurrently on the same kernel.

This new functionality could be used to design a parameter
injection feature in the future.

Fixed up merge conflict duing rebase to Linux 6.7-rc6
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:25:49 -07:00
Rae Moar 6c4ea2f48d kunit: add is_init test attribute
Add is_init test attribute of type bool. Add to_string, get, and filter
methods to lib/kunit/attributes.c.

Mark each of the tests in the init section with the is_init=true attribute.

Add is_init to the attributes documentation.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Rae Moar 2cf4528157 kunit: add example suite to test init suites
Add example_init_test_suite to allow for testing the feature of running
test suites marked as init to indicate they use init data and/or
functions.

This suite should always pass and uses a simple init function.

This suite can also be used to test the is_init attribute introduced in
the next patch.

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Rae Moar d81f0d7b8b kunit: add KUNIT_INIT_TABLE to init linker section
Add KUNIT_INIT_TABLE to the INIT_DATA linker section.

Alter the KUnit macros to create init tests:
kunit_test_init_section_suites

Update lib/kunit/executor.c to run both the suites in KUNIT_TABLE and
KUNIT_INIT_TABLE.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Richard Fitzgerald 1557e89d3a kunit: debugfs: Handle errors from alloc_string_stream()
In kunit_debugfs_create_suite() give up and skip creating the debugfs
file if any of the alloc_string_stream() calls return an error or NULL.
Only put a value in the log pointer of kunit_suite and kunit_test if it
is a valid pointer to a log.

This prevents the potential invalid dereference reported by smatch:

 lib/kunit/debugfs.c:115 kunit_debugfs_create_suite() error: 'suite->log'
	dereferencing possible ERR_PTR()
 lib/kunit/debugfs.c:119 kunit_debugfs_create_suite() error: 'test_case->log'
	dereferencing possible ERR_PTR()

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 05e2006ce4 ("kunit: Use string_stream for test log")
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Richard Fitzgerald 34dfd5bb2e kunit: debugfs: Fix unchecked dereference in debugfs_print_results()
Move the call to kunit_suite_has_succeeded() after the check that
the kunit_suite pointer is valid.

This was found by smatch:

 lib/kunit/debugfs.c:66 debugfs_print_results() warn: variable
 dereferenced before check 'suite' (see line 63)

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 38289a26e1 ("kunit: fix debugfs code to use enum kunit_status, not bool")
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Richard Fitzgerald 15bf000014 kunit: string-stream: Allow ERR_PTR to be passed to string_stream_destroy()
Check the stream pointer passed to string_stream_destroy() for
IS_ERR_OR_NULL() instead of only NULL.

Whatever alloc_string_stream() returns should be safe to pass
to string_stream_destroy(), and that will be an ERR_PTR.

It's obviously good practise and generally helpful to also check
for NULL pointers so that client cleanup code can call
string_stream_destroy() unconditionally - which could include
pointers that have never been set to anything and so are NULL.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Richard Fitzgerald 37f0d37ffc kunit: string-stream-test: Avoid cast warning when testing gfp_t flags
Passing a gfp_t to KUNIT_EXPECT_EQ() causes a cast warning:

  lib/kunit/string-stream-test.c:73:9: sparse: sparse: incorrect type in
  initializer (different base types) expected long long right_value
  got restricted gfp_t const __right

Avoid this by testing stream->gfp for the expected value and passing the
boolean result of this comparison to KUNIT_EXPECT_TRUE(), as was already
done a few lines above in string_stream_managed_init_test().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: d1a0d699bf ("kunit: string-stream: Add tests for freeing resource-managed string_stream")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311181918.0mpCu2Xh-lkp@intel.com/
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
David Gow 56778b49c9 kunit: Add a macro to wrap a deferred action function
KUnit's deferred action API accepts a void(*)(void *) function pointer
which is called when the test is exited. However, we very frequently
want to use existing functions which accept a single pointer, but which
may not be of type void*. While this is probably dodgy enough to be on
the wrong side of the C standard, it's been often used for similar
callbacks, and gcc's -Wcast-function-type seems to ignore cases where
the only difference is the type of the argument, assuming it's
compatible (i.e., they're both pointers to data).

However, clang 16 has introduced -Wcast-function-type-strict, which no
longer permits any deviation in function pointer type. This seems to be
because it'd break CFI, which validates the type of function calls.

This rather ruins our attempts to cast functions to defer them, and
leaves us with a few options. The one we've chosen is to implement a
macro which will generate a wrapper function which accepts a void*, and
casts the argument to the appropriate type.

For example, if you were trying to wrap:
void foo_close(struct foo *handle);
you could use:
KUNIT_DEFINE_ACTION_WRAPPER(kunit_action_foo_close,
			    foo_close,
			    struct foo *);

This would create a new kunit_action_foo_close() function, of type
kunit_action_t, which could be passed into kunit_add_action() and
similar functions.

In addition to defining this macro, update KUnit and its tests to use
it.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Jens Axboe ae1914174a cred: get rid of CONFIG_DEBUG_CREDENTIALS
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.

Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15 14:19:48 -08:00
Jakub Kicinski 8f674972d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/intel/iavf/iavf_ethtool.c
  3a0b5a2929 ("iavf: Introduce new state machines for flow director")
  95260816b4 ("iavf: use iavf_schedule_aq_request() helper")
https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  c13e268c07 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic")
  c2f8063309 ("bnxt_en: Refactor RX VLAN acceleration logic.")
  a7445d6980 ("bnxt_en: Add support for new RX and TPA_START completion types for P7")
  1c7fd6ee2f ("bnxt_en: Rename some macros for the P5 chips")
https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/

drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
  bd6781c18c ("bnxt_en: Fix wrong return value check in bnxt_close_nic()")
  84793a4995 ("bnxt_en: Skip nic close/open when configuring tstamp filters")
https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/

drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
  3d7a3f2612 ("net/mlx5: Nack sync reset request when HotPlug is enabled")
  cecf44ea1a ("net/mlx5: Allow sync reset flow when BF MGT interface device is present")
https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14 17:14:41 -08:00
Levi Yun d9d9bd979c maple_tree: change return type of mas_split_final_node as void.
mas_split_final_node() always returns true and its return value is never
checked.

Change return type to void.

Link: https://lkml.kernel.org/r/20231109160821.16248-2-ppbuk5246@gmail.com
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:09 -08:00
Peng Zhang 330018fe69 maple_tree: simplify mas_leaf_set_meta()
Now it seems that the incoming 'end' is already pointing to the last item,
so we can simplify this function, considering only whether the last slot
is being used.  This has passed the maple tree test suite.

Link: https://lkml.kernel.org/r/20231120070937.35481-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:01 -08:00
Peng Zhang 026b935cd9 maple_tree: delete one of the two identical checks
There are two identical checks, delete one of them.

Link: https://lkml.kernel.org/r/20231120070937.35481-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang c5e9412138 maple_tree: remove an unused parameter for ma_meta_end()
The parameter maple_type is not used, so remove it.

Link: https://lkml.kernel.org/r/20231120070937.35481-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang 3f05fcdebf maple_tree: avoid ascending when mas->min is also the parent's minimum
When the child node is the first child of its parent node, mas->min does
not need to be updated. This can reduce the number of ascending times
in some cases.

Link: https://lkml.kernel.org/r/20231120070937.35481-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang 2e783f0c1a maple_tree: move the check forward to avoid static check warning
Patch series "Some cleanups of maple tree", v2.

These are some small cleanups of maple tree.


This patch (of 5):

Put the check for gap before its reference to avoid Smatch static check
warnings.  This is not a bug, it's just a validation program.  Even with
this change, Smatch may still generate warnings because MT_BUG_ON()
doesn't necessarily stop the program.  It may require fixing Smatch itself
to avoid these warnings.

Link: https://lkml.kernel.org/r/20231120070937.35481-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20231120070937.35481-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: http://lists.infradead.org/pipermail/maple-tree/2023-November/003046.html
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Jiapeng Chong d1fefa3d22 maple_tree: remove unused function
The function are defined in the maple_tree.c file, but not called
elsewhere, so delete the unused function.

lib/maple_tree.c:689:29: warning: unused function 'mas_pivot'.

Link: https://lkml.kernel.org/r/20231027084944.24888-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7064
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett a3c63c8c5d maple_tree: mtree_range_walk() clean up
mtree_range_walk() needed to be updated to avoid checking if there was a
pivot value.  On closer examination, the code could avoid setting min or
max in certain scenarios.  The commit removes the extra check for
pivot[offset] before setting max and only sets max when necessary.  It
also only sets min if it is necessary by checking offset 0 prior to the
loop (as it has always done).

The commit also drops a dead node check since the end of the node will
return the array size when the last slot is occupied (by a potential reuse
in a dead node).  The data will be discarded later if the node is marked
dead.

Benchmarking these changes results in an increase in performance of 5.45%
using the BENCH_WALK in the maple tree test code.

Link: https://lkml.kernel.org/r/20231101171629.3612299-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett 24662decdd maple_tree: don't find node end in mtree_lookup_walk()
Since the pivot being set is now reliable, the optimized loop no longer
needs to find the node end.  The redundant check for a dead node can also
be avoided as there is no danger of using the wrong pivot since the
results will be thrown out in the case of a dead node by the later check.

This patch also adds a benchmark test for the function to the maple tree
test framework.  The benchmark shows an average increase performance of
5.98% over 3 runs with this commit.

Link: https://lkml.kernel.org/r/20231101171629.3612299-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett 0de56e38b3 maple_tree: use maple state end for write operations
ma_wr_state was previously tracking the end of the node for writing. 
Since the implementation of the ma_state end tracking, this is duplicated
work.  This patch removes the maple write state tracking of the end of the
node and uses the maple state end instead.

Link: https://lkml.kernel.org/r/20231101171629.3612299-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett 9a40d45c1f maple_tree: remove mas_searchable()
Now that the status of the maple state is outside of the node, the
mas_searchable() function can be dropped for easier open-coding of what is
going on.

Link: https://lkml.kernel.org/r/20231101171629.3612299-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett 067311d33e maple_tree: separate ma_state node from status
The maple tree node is overloaded to keep status as well as the active
node.  This, unfortunately, results in a re-walk on underflow or overflow.
Since the maple state has room, the status can be placed in its own enum
in the structure.  Once an underflow/overflow is detected, certain modes
can restore the status to active and others may need to re-walk just that
one node to see the entry.

The status being an enum has the benefit of detecting unhandled status in
switch statements.

[Liam.Howlett@oracle.com: fix comments about MAS_*]
  Link: https://lkml.kernel.org/r/20231106154124.614247-1-Liam.Howlett@oracle.com
[Liam.Howlett@oracle.com: update forking to separate maple state and node]
  Link: https://lkml.kernel.org/r/20231106154551.615042-1-Liam.Howlett@oracle.com
[Liam.Howlett@oracle.com: fix mas_prev() state separation code]
  Link: https://lkml.kernel.org/r/20231207193319.4025462-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20231101171629.3612299-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett 271f61a8b4 maple_tree: clean up inlines for some functions
There are a few functions which were inlined but are somewhat too large to
inline, so remove the inline key word.

There are also several very small functions which are used in critical
code sections which gcc was not inlining, so make this more strict and use
__always_line for these functions.

Link: https://lkml.kernel.org/r/20231101171629.3612299-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett 1f41ef12ab maple_tree: use cached node end in mas_destroy()
The node end is set during the walk, so use the resulting end instead of
re-fetching it.

Link: https://lkml.kernel.org/r/20231101171629.3612299-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett e9c52d8940 maple_tree: use cached node end in mas_next()
When looking for the next entry, don't recalculate the node end as it is
now tracked in the maple state.

Link: https://lkml.kernel.org/r/20231101171629.3612299-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett 31c532a8af maple_tree: add end of node tracking to the maple state
Analysis of the mas_for_each() iteration showed that there is a
significant time spent finding the end of a node.  This time can be
greatly reduced if the end of the node is cached in the maple state.  Care
must be taken to update & invalidate as necessary.

Link: https://lkml.kernel.org/r/20231101171629.3612299-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett f7a5901895 maple_tree: make mas_erase() more robust
mas_erase() may not deal correctly with all maple states.  Make the
function more robust by ensuring the state is in one of the two acceptable
states.

Link: https://lkml.kernel.org/r/20231101171629.3612299-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett 37a8ab24d3 maple_tree: remove unnecessary default labels from switch statements
Patch series "maple_tree: iterator state changes".

These patches have some general cleanup and a change to separate the maple
state status tracking from the maple state node.

The maple state status change allows for walks to continue from previous
places when the status needs to be recorded to make logical sense for the
next call to the maple state.  For instance, it allows for prev/next to
function in a way that better resembles the linked list.  It also allows
switch statements to be used to detect missed states during compile, and
the addition of fast-path "active" state is cleaner as an enum.

While making the status change, perf showed some very small (one line)
functions that were not inlined even with the inline key word.  Making
these small functions __always_inline is less expensive according to perf.
As part of that change, some inlines have been dropped from larger
functions.

Perf also showed that the commonly used mas_for_each() iterator was
spending a lot of time finding the end of the node.  This series
introduces caching of the end of the node in the maple state (and updating
it during writes).  This caching along with the inline changes yielded at
23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test framework
benchmark.

I've also included a change to mtree_range_walk and mtree_lookup_walk to
take advantage of Peng's change [1] to the initial pivot setup.

mmtests did not produce any significant gains.

[1] https://lore.kernel.org/all/20230711035444.526-1-zhangpeng.00@bytedance.com/T/#u


This patch (of 12):

Removing the default types from the switch statements will cause compile
warnings on missing cases.

Link: https://lkml.kernel.org/r/20231101171629.3612299-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:56 -08:00
Heiko Carstens 18564756ab s390/fpu: get rid of MACHINE_HAS_VX
Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx() which is a
short readable wrapper for "test_facility(129)".

Facility bit 129 is set if the vector facility is present. test_facility()
returns also true for all bits which are set in the architecture level set
of the cpu that the kernel is compiled for. This means that
test_facility(129) is a compile time constant which returns true for z13
and later, since the vector facility bit is part of the z13 kernel ALS.

In result the compiled code will have less runtime checks, and less code.

Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11 14:33:07 +01:00
Arnd Bergmann fd6f52e3fa ida: make 'ida_dump' static
Patch series "Treewide: enable -Wmissing-prototypes", v3.

At this point, there are five architectures with a number of known
regressions: alpha, nios2, mips, sh and sparc.  In the previous version of
this patch, I had turned off the missing prototype warnings for the 15
architectures that still had issues, but since there are only five left, I
think we can leave the rest to the maintainers (Cc'd here) as well.

The series is also likely to cause occasional build regressions on
linux-next as developers add new code that misses prototypes.  Hopefully
this should be resolved by the time the patches make it into a release and
everyone gets the warnings right away.  


This patch (of 6):

There is no global declaration for ida_dump() and no other callers, so
make it static to avoid this warning:

lib/test_ida.c:16:6: error: no previous prototype for 'ida_dump'

Link: https://lkml.kernel.org/r/20231123110506.707903-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20231123110506.707903-2-arnd@kernel.org
Fixes: 8ab8ba38d4 ("ida: Start new test_ida module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tudor Ambarus <tudor.ambarus@linaro.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 17:21:42 -08:00
Juntong Deng 5d4c6ac946 kasan: record and report more information
Record and report more information to help us find the cause of the bug
and to help us correlate the error with other system events.

This patch adds recording and showing CPU number and timestamp at
allocation and free (controlled by CONFIG_KASAN_EXTRA_INFO).  The
timestamps in the report use the same format and source as printk.

Error occurrence timestamp is already implicit in the printk log, and CPU
number is already shown by dump_stack_lvl, so there is no need to add it.

In order to record CPU number and timestamp at allocation and free,
corresponding members need to be added to the relevant data structures,
which will lead to increased memory consumption.

In Generic KASAN, members are added to struct kasan_track.  Since in most
cases, alloc meta is stored in the redzone and free meta is stored in the
object or the redzone, memory consumption will not increase much.

In SW_TAGS KASAN and HW_TAGS KASAN, members are added to struct
kasan_stack_ring_entry.  Memory consumption increases as the size of
struct kasan_stack_ring_entry increases (this part of the memory is
allocated by memblock), but since this is configurable, it is up to the
user to choose.

Link: https://lkml.kernel.org/r/VI1P193MB0752BD991325D10E4AB1913599BDA@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:55 -08:00
Andrey Konovalov bd9d9624b7 lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN
KMSAN is frequently used in fuzzing scenarios and thus saves a lot of
stack traces.  As KMSAN does not support evicting stack traces from the
stack depot, the stack depot capacity might be reached quickly with large
stack records.

Adjust the maximum number of stack depot pools for this case.

The average size of a stack trace saved into the stack depot is ~16
frames.  Thus, adjust the maximum pools number accordingly to keep the
maximum number of stack traces that can be saved into the stack depot
similar to the one that was allowed before the stack trace eviction
changes.

Link: https://lkml.kernel.org/r/301a115cf7ce8ddb42ef6de9151c2bb76ba728fc.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:48 -08:00
Andrey Konovalov 108be8def4 lib/stackdepot: allow users to evict stack traces
Add stack_depot_put, a function that decrements the reference counter on a
stack record and removes it from the stack depot once the counter reaches
0.

Internally, when removing a stack record, the function unlinks it from the
hash table bucket and returns to the freelist.

With this change, the users of stack depot can call stack_depot_put when
keeping a stack trace in the stack depot is not needed anymore.  This
allows avoiding polluting the stack depot with irrelevant stack traces and
thus have more space to store the relevant ones before the stack depot
reaches its capacity.

Link: https://lkml.kernel.org/r/1d1ad5692ee43d4fc2b3fd9d221331d30b36123f.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:47 -08:00
Andrey Konovalov 410b764f89 lib/stackdepot: add refcount for records
Add a reference counter for how many times a stack records has been
  added to stack depot.

Add a new STACK_DEPOT_FLAG_GET flag to stack_depot_save_flags that
  instructs the stack depot to increment the refcount.

Do not yet decrement the refcount; this is implemented in one of the
  following patches.

Do not yet enable any users to use the flag to avoid overflowing the
  refcount.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/a3fc14a2359d019d2a008d4ff8b46a665371ffee.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov 022012dcf4 lib/stackdepot, kasan: add flags to __stack_depot_save and rename
Change the bool can_alloc argument of __stack_depot_save to a u32
  argument that accepts a set of flags.

The following patch will add another flag to stack_depot_save_flags
  besides the existing STACK_DEPOT_FLAG_CAN_ALLOC.

Also rename the function to stack_depot_save_flags, as
  __stack_depot_save is a cryptic name,

Link: https://lkml.kernel.org/r/645fa15239621eebbd3a10331e5864b718839512.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov 4805180bc1 lib/stackdepot: use list_head for stack record links
Switch stack_record to use list_head for links in the hash table and in
  the freelist.

This will allow removing entries from the hash table buckets.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/4787d9a584cd33433d9ee1846b17fa3d3e1987ad.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov a6cd957021 lib/stackdepot: use read/write lock
Currently, stack depot uses the following locking scheme:

1. Lock-free accesses when looking up a stack record, which allows to
   have multiple users to look up records in parallel;
2. Spinlock for protecting the stack depot pools and the hash table
   when adding a new record.

For implementing the eviction of stack traces from stack depot, the
  lock-free approach is not going to work anymore, as we will need to be
  able to also remove records from the hash table.

Convert the spinlock into a read/write lock, and drop the atomic
  accesses, as they are no longer required.

Looking up stack traces is now protected by the read lock and adding new
  records - by the write lock.  One of the following patches will add a
  new function for evicting stack records, which will be protected by the
  write lock as well.

With this change, multiple users can still look up records in parallel.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/9f81ffcc4bb422ebb6326a65a770bf1918634cbb.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov b29d318858 lib/stackdepot: store free stack records in a freelist
Instead of using the global pool_offset variable to find a free slot when
storing a new stack record, mainlain a freelist of free slots within the
allocated stack pools.

A global next_stack variable is used as the head of the freelist, and the
next field in the stack_record struct is reused as freelist link (when the
record is not in the freelist, this field is used as a link in the hash
table).

This is preparatory patch for implementing the eviction of stack records
from the stack depot.

Link: https://lkml.kernel.org/r/b9e4c79955c2121b69301778643b203d3fb09ccc.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov a5d21f7171 lib/stackdepot: store next pool pointer in new_pool
Instead of using the last pointer in stack_pools for storing the pointer
to a new pool (which does not yet store any stack records), use a new
new_pool variable.

This a purely code readability change: it seems more logical to store the
pointer to a pool with a special meaning in a dedicated variable.

Link: https://lkml.kernel.org/r/448bc18296c16bef95cb3167697be6583dcc8ce3.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov b6a353d3eb lib/stackdepot: rename next_pool_required to new_pool_required
Rename next_pool_required to new_pool_required.

This a purely code readability change: the following patch will change
stack depot to store the pointer to the new pool in a separate variable,
and "new" seems like a more logical name.

Link: https://lkml.kernel.org/r/fd7cd6c6eb250c13ec5d2009d75bb4ddd1470db9.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov 94b7d32870 lib/stackdepot: rework helpers for depot_alloc_stack
Split code in depot_alloc_stack and depot_init_pool into 3 functions:

1. depot_keep_next_pool that keeps preallocated memory for the next pool
   if required.

2. depot_update_pools that moves on to the next pool if there's no space
   left in the current pool, uses preallocated memory for the new current
   pool if required, and calls depot_keep_next_pool otherwise.

3. depot_alloc_stack that calls depot_update_pools and then allocates
   a stack record as before.

This makes it somewhat easier to follow the logic of depot_alloc_stack and
also serves as a preparation for implementing the eviction of stack
records from the stack depot.

Link: https://lkml.kernel.org/r/71fb144d42b701fcb46708d7f4be6801a4a8270e.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov fcccc41ecb lib/stackdepot: fix and clean-up atomic annotations
Drop smp_load_acquire from next_pool_required in depot_init_pool, as both
depot_init_pool and the all smp_store_release's to this variable are
executed under the stack depot lock.

Also simplify and clean up comments accompanying the use of atomic
accesses in the stack depot code.

Link: https://lkml.kernel.org/r/c118ef044d8db80248d9e1f14592c72e8429e9d9.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov fc60e0caa9 lib/stackdepot: use fixed-sized slots for stack records
Instead of storing stack records in stack depot pools one right after
another, use fixed-sized slots.

Add a new Kconfig option STACKDEPOT_MAX_FRAMES that allows to select the
size of the slot in frames.  Use 64 as the default value, which is the
maximum stack trace size both KASAN and KMSAN use right now.

Also add descriptions for other stack depot Kconfig options.

This is preparatory patch for implementing the eviction of stack records
from the stack depot.

Link: https://lkml.kernel.org/r/dce7d030a99ff61022509665187fac45b0827298.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov 83130ab2d8 lib/stackdepot: add depot_fetch_stack helper
Add a helper depot_fetch_stack function that fetches the pointer to a
stack record.

With this change, all static depot_* functions now operate on stack pools
and the exported stack_depot_* functions operate on the hash table.

Link: https://lkml.kernel.org/r/170d8c202f29dc8e3d5491ee074d1e9e029a46db.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov 5f9ce55e02 lib/stackdepot: drop valid bit from handles
Stack depot doesn't use the valid bit in handles in any way, so drop it.

Link: https://lkml.kernel.org/r/34969bba2ca6e012c6ad071767197dee64dc5723.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov 603c000c11 lib/stackdepot: simplify __stack_depot_save
The retval local variable in __stack_depot_save has the union type
handle_parts, but the function never uses anything but the union's handle
field.

Define retval simply as depot_stack_handle_t to simplify the code.

Link: https://lkml.kernel.org/r/3b0763c8057a1cf2f200ff250a5f9580ee36a28c.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Andrey Konovalov 0c5d44a814 lib/stackdepot: check disabled flag when fetching
Do not try fetching a stack trace from the stack depot if the
stack_depot_disabled flag is enabled.

Link: https://lkml.kernel.org/r/c3bfa3b7ab00b2e48ab75a3fbb9c67555777cb08.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Andrey Konovalov 4d07a03723 lib/stackdepot: print disabled message only if truly disabled
Patch series "stackdepot: allow evicting stack traces", v4.

Currently, the stack depot grows indefinitely until it reaches its
capacity.  Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot for in-field testing and
in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a large uptime while
   limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach.  This series targets to address that.  Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot.  The users of the stack depot
can do that via new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and
stack_depot_put APIs.

Internal changes to the stack depot code include:

1. Storing stack traces in fixed-frame-sized slots (vs precisely-sized
   slots in the current implementation); the slot size is controlled via
   CONFIG_STACKDEPOT_MAX_FRAMES (default: 64 frames);
2. Keeping available slots in a freelist (vs keeping an offset to the next
   free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
   combined with a spinlock).

This series also integrates the eviction functionality into KASAN: the
tag-based modes evict stack traces when the corresponding entry leaves the
stack ring, and Generic KASAN evicts stack traces for objects once those
leave the quarantine.

With KASAN, despite wasting some space on rounding up the size of each
stack record, the total memory consumed by stack depot gets saturated due
to the eviction of irrelevant stack traces from the stack depot.

With the tag-based KASAN modes, the average total amount of memory used
for stack traces becomes ~0.5 MB (with the current default stack ring size
of 32k entries and the default CONFIG_STACKDEPOT_MAX_FRAMES of 64).  With
Generic KASAN, the stack traces take up ~1 MB per 1 GB of RAM (as the
quarantine's size depends on the amount of RAM).

However, with KMSAN, the stack depot ends up using ~4x more memory per a
stack trace than before.  Thus, for KMSAN, the stack depot capacity is
increased accordingly.  KMSAN uses a lot of RAM for shadow memory anyway,
so the increased stack depot memory usage will not make a significant
difference.

Other users of the stack depot do not save stack traces as often as KASAN
and KMSAN.  Thus, the increased memory usage is taken as an acceptable
trade-off.  In the future, these other users can take advantage of the
eviction API to limit the memory waste.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64.  I haven't done any tests for arm64 modes (the stack
depot without performance optimizations is not suitable for intended use
of those anyway), but I expect a similar result.  Obtaining and copying
stack trace frames when saving them into stack depot is what takes the
most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g.  via a command-line parameter).  This
will be added in a separate series, possibly together with the performance
improvement changes.


This patch (of 22):

Currently, if stack_depot_disable=off is passed to the kernel command-line
after stack_depot_disable=on, stack depot prints a message that it is
disabled, while it is actually enabled.

Fix this by moving printing the disabled message to
stack_depot_early_init.  Place it before the
__stack_depot_early_init_requested check, so that the message is printed
even if early stack depot init has not been requested.

Also drop the stack_table = NULL assignment from disable_stack_depot, as
stack_table is NULL by default.

Link: https://lkml.kernel.org/r/cover.1700502145.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/73a25c5fff29f3357cd7a9330e85e09bc8da2cbe.1700502145.git.andreyknvl@google.com
Fixes: e1fdc40334 ("lib: stackdepot: add support to disable stack depot")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Paul Heidekrüger 83a6fdd6c2 kasan: default to inline instrumentation
KASan inline instrumentation can yield up to a 2x performance gain at the
cost of a larger binary.

Make inline instrumentation the default, as suggested in the bug report
below.

When an architecture does not support inline instrumentation, it should
set ARCH_DISABLE_KASAN_INLINE, as done by PowerPC, for instance.

Link: https://lkml.kernel.org/r/20231109155101.186028-1-paul.heidekrueger@tum.de
Signed-off-by: Paul Heidekrüger <paul.heidekrueger@tum.de>
Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Marco Elver <elver@google.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=203495
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:40 -08:00
Peng Zhang 8e50d32c7a maple_tree: preserve the tree attributes when destroying maple tree
When destroying maple tree, preserve its attributes and then turn it into
an empty tree.  This allows it to be reused without needing to be
reinitialized.

Link: https://lkml.kernel.org/r/20231027033845.90608-10-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang 446e1867e6 maple_tree: update check_forking() and bench_forking()
Updated check_forking() and bench_forking() to use __mt_dup() to duplicate
maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-9-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang f670fa1caa maple_tree: skip other tests when BENCH is enabled
Skip other tests when BENCH is enabled so that performance can be measured
in user space.

Link: https://lkml.kernel.org/r/20231027033845.90608-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang fd32e4e9b7 maple_tree: introduce interfaces __mt_dup() and mtree_dup()
Introduce interfaces __mt_dup() and mtree_dup(), which are used to
duplicate a maple tree.  They duplicate a maple tree in Depth-First Search
(DFS) pre-order traversal.  It uses memcopy() to copy nodes in the source
tree and allocate new child nodes in non-leaf nodes.  The new node is
exactly the same as the source node except for all the addresses stored in
it.  It will be faster than traversing all elements in the source tree and
inserting them one by one into the new tree.  The time complexity of these
two functions is O(n).

The difference between __mt_dup() and mtree_dup() is that mtree_dup()
handles locks internally.

Analysis of the average time complexity of this algorithm:

For simplicity, let's assume that the maximum branching factor of all
non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a
full tree.

Under the given conditions, if there is a maple tree with n elements, the
number of its leaves is n/16.  From bottom to top, the number of nodes in
each level is 1/16 of the number of nodes in the level below.  So the
total number of nodes in the entire tree is given by the sum of n/16 +
n/16^2 + n/16^3 + ...  + 1.  This is a geometric series, and it has log(n)
terms with base 16.  According to the formula for the sum of a geometric
series, the sum of this series can be calculated as (n-1)/15.  Each node
has only one parent node pointer, which can be considered as an edge.  In
total, there are (n-1)/15-1 edges.

This algorithm consists of two operations:

1. Traversing all nodes in DFS order.
2. For each node, making a copy and performing necessary modifications
   to create a new node.

For the first part, DFS traversal will visit each edge twice.  Let
T(ascend) represent the cost of taking one step downwards, and T(descend)
represent the cost of taking one step upwards.  And both of them are
constants (although mas_ascend() may not be, as it contains a loop, but
here we ignore it and treat it as a constant).  So the time spent on the
first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)).

For the second part, each node will be copied, and the cost of copying a
node is denoted as T(copy_node).  For each non-leaf node, it is necessary
to reallocate all child nodes, and the cost of this operation is denoted
as T(dup_alloc).  The behavior behind memory allocation is complex and not
specific to the maple tree operation.  Here, we assume that the time
required for a single allocation is constant.  Since the size of a node is
fixed, both of these symbols are also constants.  We can calculate that
the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15
- n/16) * T(dup_alloc).

Adding both parts together, the total time spent by the algorithm can be
represented as:

((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) -
n/16 * T(dup_alloc) - (T(ascend) + T(descend))

Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)
Let C2 = T(dup_alloc)
Let C3 = T(ascend) + T(descend)

Finally, the expression can be simplified as:
((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3).

This is a linear function, so the average time complexity is O(n).

Link: https://lkml.kernel.org/r/20231027033845.90608-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:32 -08:00
Peng Zhang 4f2267b58a maple_tree: add mt_free_one() and mt_attr() helpers
Patch series "Introduce __mt_dup() to improve the performance of fork()", v7.

This series introduces __mt_dup() to improve the performance of fork(). 
During the duplication process of mmap, all VMAs are traversed and
inserted one by one into the new maple tree, causing the maple tree to be
rebalanced multiple times.  Balancing the maple tree is a costly
operation.  To duplicate VMAs more efficiently, mtree_dup() and __mt_dup()
are introduced for the maple tree.  They can efficiently duplicate a maple
tree.

Here are some algorithmic details about {mtree,__mt}_dup().  We perform a
DFS pre-order traversal of all nodes in the source maple tree.  During
this process, we fully copy the nodes from the source tree to the new
tree.  This involves memory allocation, and when encountering a new node,
if it is a non-leaf node, all its child nodes are allocated at once.

This idea was originally from Liam R.  Howlett's Maple Tree Work email,
and I added some of my own ideas to implement it.  Some previous
discussions can be found in [1].  For a more detailed analysis of the
algorithm, please refer to the logs for patch [3/10] and patch [10/10].

There is a "spawn" in byte-unixbench[2], which can be used to test the
performance of fork().  I modified it slightly to make it work with
different number of VMAs.

Below are the test results.  The first row shows the number of VMAs.  The
second and third rows show the number of fork() calls per ten seconds,
corresponding to next-20231006 and the this patchset, respectively.  The
test results were obtained with CPU binding to avoid scheduler load
balancing that could cause unstable results.  There are still some
fluctuations in the test results, but at least they are better than the
original performance.

21     121   221    421    821    1621   3221   6421   12821  25621  51221
112100 76261 54227  34035  20195  11112  6017   3161   1606   802    393
114558 83067 65008  45824  28751  16072  8922   4747   2436   1233   599
2.19%  8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%

Thanks to Liam and Matthew for the review.


This patch (of 10):

Add two helpers:
1. mt_free_one(), used to free a maple node.
2. mt_attr(), used to obtain the attributes of maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:31 -08:00
Tiezhu Yang 5181dc08f7 test_bpf: Rename second ALU64_SMOD_X to ALU64_SMOD_K
Currently, there are two test cases with same name
"ALU64_SMOD_X: -7 % 2 = -1", the first one is right,
the second one should be ALU64_SMOD_K because its
code is BPF_ALU64 | BPF_MOD | BPF_K.

Before:
test_bpf: #170 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS
test_bpf: #171 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS

After:
test_bpf: #170 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS
test_bpf: #171 ALU64_SMOD_K: -7 % 2 = -1 jited:1 4 PASS

Fixes: daabb2b098 ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231207040851.19730-1-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-09 21:27:54 -08:00
Linus Torvalds 8e819a7623 31 hotfixes. 10 of these address pre-6.6 issues and are marked cc:stable.
The remainder address post-6.6 issues or aren't considered serious enough
 to justify backporting.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZXKEfwAKCRDdBJ7gKXxA
 jlRpAQCiAp1nSqIz/fOKTzoQRaTDXU/m+C+6ZAXdKLDfvQBhpwEAnxxjZ8IgF+8Z
 Klz/GirHX5w5o7jE2wb8iObo1nR75Qo=
 =omRq
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "31 hotfixes. Ten of these address pre-6.6 issues and are marked
  cc:stable. The remainder address post-6.6 issues or aren't considered
  serious enough to justify backporting"

* tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
  mm/madvise: add cond_resched() in madvise_cold_or_pageout_pte_range()
  nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
  mm/hugetlb: have CONFIG_HUGETLB_PAGE select CONFIG_XARRAY_MULTI
  scripts/gdb: fix lx-device-list-bus and lx-device-list-class
  MAINTAINERS: drop Antti Palosaari
  highmem: fix a memory copy problem in memcpy_from_folio
  nilfs2: fix missing error check for sb_set_blocksize call
  kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP
  units: add missing header
  drivers/base/cpu: crash data showing should depends on KEXEC_CORE
  mm/damon/sysfs-schemes: add timeout for update_schemes_tried_regions
  scripts/gdb/tasks: fix lx-ps command error
  mm/Kconfig: make userfaultfd a menuconfig
  selftests/mm: prevent duplicate runs caused by TEST_GEN_PROGS
  mm/damon/core: copy nr_accesses when splitting region
  lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
  checkstack: fix printed address
  mm/memory_hotplug: fix error handling in add_memory_resource()
  mm/memory_hotplug: add missing mem_hotplug_lock
  .mailmap: add a new address mapping for Chester Lin
  ...
2023-12-08 08:36:23 -08:00
Jakub Kicinski 2483e7f04c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/stmicro/stmmac/dwmac5.c
drivers/net/ethernet/stmicro/stmmac/dwmac5.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
drivers/net/ethernet/stmicro/stmmac/hwif.h
  37e4b8df27 ("net: stmmac: fix FPE events losing")
  c3f3b97238 ("net: stmmac: Refactor EST implementation")
https://lore.kernel.org/all/20231206110306.01e91114@canb.auug.org.au/

Adjacent changes:

net/ipv4/tcp_ao.c
  9396c4ee93 ("net/tcp: Don't store TCP-AO maclen on reqsk")
  7b0f570f87 ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-07 17:53:17 -08:00
Andrew Morton 0c92218f4e Merge branch 'master' into mm-hotfixes-stable 2023-12-06 17:03:50 -08:00
Ming Lei 0263f92fad lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
group_cpus_evenly() could be part of storage driver's error handler, such
as nvme driver, when may happen during CPU hotplug, in which storage queue
has to drain its pending IOs because all CPUs associated with the queue
are offline and the queue is becoming inactive.  And handling IO needs
error handler to provide forward progress.

Then deadlock is caused:

1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
   handler is waiting for inflight IO

2) error handler is waiting for CPU hotplug lock

3) inflight IO can't be completed in blk-mq's CPU hotplug handler
   because error handling can't provide forward progress.

Solve the deadlock by not holding CPU hotplug lock in group_cpus_evenly(),
in which two stage spreads are taken: 1) the 1st stage is over all present
CPUs; 2) the end stage is over all other CPUs.

Turns out the two stage spread just needs consistent 'cpu_present_mask',
and remove the CPU hotplug lock by storing it into one local cache.  This
way doesn't change correctness, because all CPUs are still covered.

Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06 16:12:46 -08:00
Yuntao Wang 4b3805daaa ACPI: tables: Correct and clean up the logic of acpi_parse_entries_array()
The original intention of acpi_parse_entries_array() is to return the
number of all matching entries on success. This number may be greater than
the value of the max_entries parameter. When this happens, the function
will output a warning message, indicating that `count - max_entries`
matching entries remain unprocessed and have been ignored.

However, commit 4ceacd02f5 ("ACPI / table: Always count matched and
successfully parsed entries") changed this logic to return the number of
entries successfully processed by the handler. In this case, when the
max_entries parameter is not zero, the number of entries successfully
processed can never be greater than the value of max_entries. In other
words, the expression `count > max_entries` will always evaluate to false.
This means that the logic in the final if statement will never be executed.

Commit 99b0efd7c8 ("ACPI / tables: do not report the number of entries
ignored by acpi_parse_entries()") mentioned this issue, but it tried to fix
it by removing part of the warning message. This is meaningless because the
pr_warn statement will never be executed in the first place.

Commit 8726d4f441 ("ACPI / tables: fix acpi_parse_entries_array() so it
traverses all subtables") introduced an errs variable, which is intended to
make acpi_parse_entries_array() always traverse all of the subtables,
calling as many of the callbacks as possible. However, it seems that the
commit does not achieve this goal. For example, when a handler returns an
error, none of the handlers will be called again in the subsequent
iterations. This result appears to be no different from before the change.

This patch corrects and cleans up the logic of acpi_parse_entries_array(),
making it return the number of all matching entries, rather than the number
of entries successfully processed by handlers. Additionally, if an error
occurs when executing a handler, the function will return -EINVAL immediately.

This patch should not affect existing users of acpi_parse_entries_array().

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-06 20:46:14 +01:00
Herve Codina 5c47251e8c lib/vsprintf: Fix %pfwf when current node refcount == 0
A refcount issue can appeared in __fwnode_link_del() due to the
pr_debug() call:
  WARNING: CPU: 0 PID: 901 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
  Call Trace:
  <TASK>
  ...
  of_node_get+0x1e/0x30
  of_fwnode_get+0x28/0x40
  fwnode_full_name_string+0x34/0x90
  fwnode_string+0xdb/0x140
  ...
  vsnprintf+0x17b/0x630
  ...
  __fwnode_link_del+0x25/0xa0
  fwnode_links_purge+0x39/0xb0
  of_node_release+0xd9/0x180
  ...

Indeed, an fwnode (of_node) is being destroyed and so, of_node_release()
is called because the of_node refcount reached 0.
From of_node_release() several function calls are done and lead to
a pr_debug() calls with %pfwf to print the fwnode full name.
The issue is not present if we change %pfwf to %pfwP.

To print the full name, %pfwf iterates over the current node and its
parents and obtain/drop a reference to all nodes involved.

In order to allow to print the full name (%pfwf) of a node while it is
being destroyed, do not obtain/drop a reference to this current node.

Fixes: a92eb7621b ("lib/vsprintf: Make use of fwnode API to obtain node names and separators")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20231114152655.409331-1-herve.codina@bootlin.com
2023-12-06 11:06:59 +01:00
Jens Axboe 9fd7874c0e
iov_iter: replace import_single_range() with import_ubuf()
With the removal of the 'iov' argument to import_single_range(), the two
functions are now fully identical. Convert the import_single_range()
callers to import_ubuf(), and remove the former fully.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20231204174827.1258875-3-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-05 11:57:37 +01:00
Jens Axboe 6ac805d138
iov_iter: remove unused 'iov' argument from import_single_range()
It is entirely unused, just get rid of it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20231204174827.1258875-2-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-05 11:57:34 +01:00
Vlastimil Babka 2a19be61a6 mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile
Remove CONFIG_SLAB, CONFIG_DEBUG_SLAB, CONFIG_SLAB_DEPRECATED and
everything in Kconfig files and mm/Makefile that depends on those. Since
SLUB is the only remaining allocator, remove the allocator choice, make
CONFIG_SLUB a "def_bool y" for now and remove all explicit dependencies
on SLUB or SLAB as it's now always enabled. Make every option's verbose
name and description refer to "the slab allocator" without refering to
the specific implementation. Do not rename the CONFIG_ option names yet.

Everything under #ifdef CONFIG_SLAB, and mm/slab.c is now dead code, all
code under #ifdef CONFIG_SLUB is now always compiled.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-05 11:14:40 +01:00
Linus Torvalds 669fc83452 Probes fixes for v6.7-r3:
- objpool: Fix objpool overrun case on memory/cache access delay especially
   on the big.LITTLE SoC. The objpool uses a copy of object slot index
   internal loop, but the slot index can be changed on another processor
   in parallel. In that case, the difference of 'head' local copy and the
   'slot->last' index will be bigger than local slot size. In that case,
   we need to re-read the slot::head to update it.
 
 - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since
   kretprobe_holder::rp is RCU managed, it should use rcu_assign_pointer()
   and rcu_dereference_check() correctly. Also adding __rcu tag for
   finding wrong usage by sparse.
 
 - rethook: Fix to use appropriate rcu API for rethook::handler. The same
   as kretprobe, rethook::handler is RCU managed and it should use
   rcu_assign_pointer() and rcu_dereference_check(). This also adds __rcu
   tag for finding wrong usage by sparse.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmVpfBobHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bNyMIAJSLICKQNuFiBJEn/rty
 ACWJ9QMOnwi0DoVaepG/m9QJh6AIUUFW4//9helmSm0GIVzxQ2+f8UeKU+sYiVtH
 ro9atea4W4+FMTvtEB1cU8oG5CDVT4WQdUXbjMktqYe3+WB8Zt8+fIP0mnbTFAVr
 yStpliGPecmlupJVRYqrJGyDdbkUxXxVlPsP/eDrHFgbBWv8Incw0f+MLGSi6LSE
 sZ1MaKCdi2tlHbtD/fiowfLoBMZwQAKY4hq/XguVsWh+BGaGUgwtif+8ESwPeu22
 KEZLyWDQ1N8XBHyOBotV7vsBEwh6LKtLGVXIBsO4KxVyGw6msxWBis0dt/tkn+kk
 LEg=
 =B9WK
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - objpool: Fix objpool overrun case on memory/cache access delay
   especially on the big.LITTLE SoC. The objpool uses a copy of object
   slot index internal loop, but the slot index can be changed on
   another processor in parallel. In that case, the difference of 'head'
   local copy and the 'slot->last' index will be bigger than local slot
   size. In that case, we need to re-read the slot::head to update it.

 - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since
   kretprobe_holder::rp is RCU managed, it should use
   rcu_assign_pointer() and rcu_dereference_check() correctly. Also
   adding __rcu tag for finding wrong usage by sparse.

 - rethook: Fix to use appropriate rcu API for rethook::handler. The
   same as kretprobe, rethook::handler is RCU managed and it should use
   rcu_assign_pointer() and rcu_dereference_check(). This also adds
   __rcu tag for finding wrong usage by sparse.

* tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rethook: Use __rcu pointer for rethook::handler
  kprobes: consistent rcu api usage for kretprobe holder
  lib: objpool: fix head overrun on RK3588 SBC
2023-12-03 08:02:49 +09:00
Linus Torvalds ce474ae7d0 ACPI fixes for 6.7-rc4
- Fix a recently introduced build issue on ARM32 platforms caused by an
    inadvertent header file breakage (Dave Jiang).
 
  - Eliminate questionable usage of acpi_driver_data() in the ACPI
    backlight cooling device code that leads to NULL pointer dereferences
    after recent ACPI core changes (Hans de Goede).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmVqSGMSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxKBsP/RqwUb8oNYxreUmKgIVZ0H8SnPRfluVP
 4St+HeTxsmdN+obglVUWnhZMViewGsionLuq0y/FrYoWLI2F08UQAU8i248h20aZ
 nGXalM+n5H517dOidTJzvGxKHMOA2TrzyVna/IcQAYLnbXmp25j8EdHmuhrI3KfK
 3yxXLe+6J22776U/MMyutR+rwTVE0tgfQOWM4YuxT67uUPQVkKX+3/uYt/EfkKsX
 Dz2ce/5vF28JDjv/yTxaoctMpmjjem97av0J7Y1EM/5K9kTZ070U8OZhYEgBHw5o
 pRalhQlNz5VI/KQy3Mc8n4DmrwrPoktCBI0pQPr8FV56dmCFmTaFY1C4/SxebTr4
 O2U3r5GkmcxLCKZXAUUAc8J+M6BBRHNQtlpBN3iNRG4ID2x72idPn5fr02UeGk4g
 lxysNzcIwxcOuogeKTD2IERzLZA7Ub3qm8gguFOMqnxV1nmPx6f1nl1MuxLsJY4e
 ZQwrwZCDJr5CZUrvJz6Mo9HUibLQOELgtsWinKV9OYUgTOfQ69t+XuwsPYIYtQiH
 UV7qfE+ZLiG/jZPgP6tfN6InbtB9I4xXNMhKyoZrGTNtbKUVLdDHJH62/vIyHBXQ
 VuLr9jH+CuyRjJSRCnbWA1BWkLhWfHQnFaq9UVWAWkVd8MisGVolnCSPGjUyQur7
 ZT7i3c13nxJ2
 =ANS/
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "This fixes a recently introduced build issue on ARM32 and a NULL
  pointer dereference in the ACPI backlight driver due to a design issue
  exposed by a recent change in the ACPI bus type code.

  Specifics:

   - Fix a recently introduced build issue on ARM32 platforms caused by
     an inadvertent header file breakage (Dave Jiang)

   - Eliminate questionable usage of acpi_driver_data() in the ACPI
     backlight cooling device code that leads to NULL pointer
     dereferences after recent ACPI core changes (Hans de Goede)"

* tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: video: Use acpi_video_device for cooling-dev driver data
  ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-02 08:52:20 +09:00
Linus Torvalds e6861be452 More bcachefs bugfixes for 6.7
Bigger/user visible fixes:
 
  - bcache & bcachefs were broken with CFI enabled; patch for closures to
    fix type punning
 
  - mark erasure coding as extra-experimental; there are incompatible
    disk space accounting changes coming for erasure coding, and I'm
    still seeing checksum errors in some tests
 
  - several fixes for durability-related issues (durability is a device
    specific setting where we can tell bcachefs that data on a given
    device should be counted as replicated x times )
 
  - a fix for a rare livelock when a btree node merge then updates a
    parent node that is almost full
 
  - fix a race in the device removal path, where dropping a pointer in a
    btree node to a device would be clobbered by an in flight btree write
    updating the btree node key on completion
 
  - fix one SRCU lock hold time warning in the btree gc code - ther's
    still a bunch more of these to fix
 
  - fix a rare race where we'd start copygc before initializing the "are
    we rw" percpu refcount; copygc would think we were already ro and die
    immediately
 
 https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmVnoHoACgkQE6szbY3K
 bnbzLBAApVEg3kB3XDCHYw+8AxLbzkuKbuV8FR/w+ULYAmRKbnM5e4pM4UJzwVJ9
 vzBS9KUT4mVNpA5zl7FWmqh5AiJkhbPgb/BijtQiS+gz1ofZ8uCW/DjzWZpaTaT9
 0zz9auiKwzJbBmLXC2lWC28MUPjFNXxlP2pfQPqhpKqlGKBC893hKeJ0Veb6dM1R
 DqkctoWtSQzsNpEaXiQpKBNoNUIlYcFX1XXHn+XpPpWNe80SpMfVNCs2qPkMByu/
 V/QULE9cHI7RTu7oyFY80+9xQDeXDDYZgvtpD7hqNPcyyoix+r/DVz1mZe41XF2B
 bvaJhfcdWePctmiuEXJVXT4HSkwwzC6EKHwi7fejGY56hOvsrEAxNzTEIPRNw5st
 ZkZlxASwFqkiJ3ehy+KRngLX2GZSbJsU4aM5ViQJKtz4rBzGyyf0LmMucdxAoDH5
 zLzsAYaA6FkIZ5e5ZNdTDj7/TMnKWXlU9vTttqIpb8s7qSy+3ejk5NuGitJihZ4R
 LAaCTs1JIsItLP47Ko0ZvmKV6CHlmt+Ht8OBqu73BWJ8vsBTQ8JMK4mGt60bwHvm
 LdEMtp3C3FmXFc06zhKoGgjrletZYO6G4mFBPnQqh1brfFXM1prVg3ftDTqBWkMI
 iAz2chiVc8k0qxoSAqylCYFaGzgiBKzw6YMtqPRmZgfLcq/sJ34=
 =vN+y
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs bugfixes from Kent Overstreet:

 - bcache & bcachefs were broken with CFI enabled; patch for closures to
   fix type punning

 - mark erasure coding as extra-experimental; there are incompatible
   disk space accounting changes coming for erasure coding, and I'm
   still seeing checksum errors in some tests

 - several fixes for durability-related issues (durability is a device
   specific setting where we can tell bcachefs that data on a given
   device should be counted as replicated x times)

 - a fix for a rare livelock when a btree node merge then updates a
   parent node that is almost full

 - fix a race in the device removal path, where dropping a pointer in a
   btree node to a device would be clobbered by an in flight btree write
   updating the btree node key on completion

 - fix one SRCU lock hold time warning in the btree gc code - ther's
   still a bunch more of these to fix

 - fix a rare race where we'd start copygc before initializing the "are
   we rw" percpu refcount; copygc would think we were already ro and die
   immediately

* tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits)
  bcachefs: Extra kthread_should_stop() calls for copygc
  bcachefs: Convert gc_alloc_start() to for_each_btree_key2()
  bcachefs: Fix race between btree writes and metadata drop
  bcachefs: move journal seq assertion
  bcachefs: -EROFS doesn't count as move_extent_start_fail
  bcachefs: trace_move_extent_start_fail() now includes errcode
  bcachefs: Fix split_race livelock
  bcachefs: Fix bucket data type for stripe buckets
  bcachefs: Add missing validation for jset_entry_data_usage
  bcachefs: Fix zstd compress workspace size
  bcachefs: bpos is misaligned on big endian
  bcachefs: Fix ec + durability calculation
  bcachefs: Data update path won't accidentaly grow replicas
  bcachefs: deallocate_extra_replicas()
  bcachefs: Proper refcounting for journal_keys
  bcachefs: preserve device path as device name
  bcachefs: Fix an endianness conversion
  bcachefs: Start gc, copygc, rebalance threads after initing writes ref
  bcachefs: Don't stop copygc thread on device resize
  bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops
  ...
2023-12-02 06:02:16 +09:00
Rafael J. Wysocki 7d4c44a53d Merge branch 'acpi-tables'
Merge a fix for a recently introduced build issue on ARM32 platforms
caused by an inadvertent header file breakage (Dave Jiang).

* acpi-tables:
  ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-01 21:32:19 +01:00
wuqiang.matt d67f39d2b8 lib: objpool: fix head overrun on RK3588 SBC
objpool overrun stress with test_objpool on OrangePi5+ SBC triggered the
following kernel warnings:

    WARNING: CPU: 6 PID: 3115 at lib/objpool.c:168 objpool_push+0xc0/0x100

This message is from objpool.c:168:

    WARN_ON_ONCE(tail - head > pool->nr_objs);

The overrun test case is to validate the case that pre-allocated objects
are insufficient: 8 objects are pre-allocated for each node and consumer
thread per node tries to grab 16 objects in a row. The testing system is
OrangePI 5+, with RK3588, a big.LITTLE SOC with 4x A76 and 4x A55. When
disabling either all 4 big or 4 little cores, the overrun tests run well,
and once with big and little cores mixed together, the overrun test would
always cause an overrun loop. It's likely the memory timing differences
of big and little cores cause this trouble. Here are the debugging data
of objpool_try_get_slot after try_cmpxchg_release:

    objpool_pop: cpu: 4/0 0:0 head: 278/279 tail:278 last:276/278

The local copies of 'head' and 'last' were 278 and 276, and reloading of
'slot->head' and 'slot->last' got 279 and 278. After try_cmpxchg_release
'slot->head' became 'head + 1', which is correct. But what's wrong here
is the stale value of 'last', and that stale value of 'last' finally led
the overrun of 'head'.

Memory updating of 'last' and 'head' are performed in push() and pop()
independently, which could be the culprit leading this out of order
visibility of 'last' and 'head'. So for objpool_try_get_slot(), it's
not enough only checking the condition of 'head != slot', the implicit
condition 'last - head <= nr_objs' must also be explicitly asserted to
guarantee 'last' is always behind 'head' before the object retrieving.

This patch will check and try reloading of 'head' and 'last' to ensure
'last' is behind 'head' at the time of object retrieving. Performance
testings show the average impact is about 0.1% for X86_64 and 1.12% for
ARM64. Here are the results:

    OS: Debian 10 X86_64, Linux 6.6rc
    HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
                      1T         2T         4T         8T        16T
    native:     49543304   99277826  199017659  399070324  795185848
    objpool:    29909085   59865637  119692073  239750369  478005250
    objpool+:   29879313   59230743  119609856  239067773  478509029
                     32T        48T        64T        96T       128T
    native:   1596927073 2390099988 2929397330 3183875848 3257546602
    objpool:   957553042 1435814086 1680872925 2043126796 2165424198
    objpool+:  956476281 1434491297 1666055740 2041556569 2157415622

    OS: Debian 11 AARCH64, Linux 6.6rc
    HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
                      1T         2T         4T         8T        16T
    native:     30890508   60399915  123111980  242257008  494002946
    objpool:    14742531   28883047   57739948  115886644  232455421
    objpool+:   14107220   29032998   57286084  113730493  232232850
                     24T        32T        48T        64T        96T
    native:    746406039 1000174750 1493236240 1998318364 2942911180
    objpool:   349164852  467284332  702296756  934459713 1387898285
    objpool+:  348388180  462750976  696606096  927865887 1368402195

Link: https://lore.kernel.org/all/20231114115148.298821-1-wuqiang.matt@bytedance.com/

Fixes: b4edb8d2d4 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-12-01 14:53:55 +09:00
Linus Torvalds 47669f40b1 linux_kselftest-kunit-fixes-6.7-rc4
This KUnit fixes update for Linux 6.7-rc4 consists of three fixes to
 warnings and run-time test behavior. With these fixes, test suite
 counter will be reset correctly before running tests, kunit will warn
 if tests are too slow, and eliminate warning when kfree() as an action.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVosvIACgkQCwJExA0N
 QxyzqxAAtXffplLUm7saZW7bXW3IB5Bugzu5qjp8Zw0YR7WNhgXMimCVHotWhuHX
 CrQiRIf1ZyXt6+qnUYXMIsEpMri03qhun2E8lmsA6Ws7xAHgQh5nORPEZOgBEV+3
 HnyPJTT2RBPBoOZuRTipYeu6MZE1YSnmpGdftYWHWENThqG+qb5WJ/Rs33JSZAnx
 O/XnCHZigqc7aCxZHnm8HcFWfikn4pFg2f76XU7j6uvH+ZIuQJEZvCdukzjWvGeC
 AP47AaZ04rqbkRZAA5nOew7SIgb0fK3dYdxsWfv3ai1gJMReUwqwQQ/44tnxtPs2
 s0d2FzH8+OL3MIQ0PFMH/E16bmAwcGXFlRmJvJFdbMoWa+zji8mhUSw7qIqJaFTI
 6oMHd4XdHGwHYHrLgQgLdIzuggTRSusnaMBubGJe0xo2xCgFI6CjNUzLNk1zsY2L
 OeKNlTUH2l3oOnlia4CsOPzx5HOe204ZCZwQR1fkSQVmfkl5cfz17Mr/lKshxH+Q
 X+t56bLy5rhyUI6LxrAj7AC6E8MVQ/fiial3v1lDmJKD6Bh5YL0n6wAl5eWv3Hcq
 Com0maVlGCaPIErzjaSM9RqBCYT7Le89J/p3lJ7rXF3zUY8B1Miv7IVCeHDOxzpl
 AwMI9PavdHOvh2NeaDku/fQwzID7ytdT7dAktQIviUReqWnZRGU=
 =g2ql
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit fixes from Shuah Khan:
 "Three fixes to warnings and run-time test behavior. With these fixes,
  test suite counter will be reset correctly before running tests, kunit
  will warn if tests are too slow, and eliminate warning when kfree() as
  an action"

* tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: test: Avoid cast warning when adding kfree() as an action
  kunit: Reset suite counter right before running tests
  kunit: Warn if tests are slow
2023-12-01 14:03:05 +09:00
Jakub Kicinski 753c8608f3 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZWiCPAAKCRDbK58LschI
 g4djAQC1FdqCRIFkhbiIRNHTgHjnfQShELQbd9ofJqzylLqmmgD+JI1E7D9SXagm
 pIXQ26EGmq8/VcCT3VLncA8EsC76Gg4=
 =Xowm
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-11-30

We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 58 files changed, 1598 insertions(+), 154 deletions(-).

The main changes are:

1) Add initial TX metadata implementation for AF_XDP with support in mlx5
   and stmmac drivers. Two types of offloads are supported right now, that
   is, TX timestamp and TX checksum offload, from Stanislav Fomichev with
   stmmac implementation from Song Yoong Siang.

2) Change BPF verifier logic to validate global subprograms lazily instead
   of unconditionally before the main program, so they can be guarded using
   BPF CO-RE techniques, from Andrii Nakryiko.

3) Add BPF link_info support for uprobe multi link along with bpftool
   integration for the latter, from Jiri Olsa.

4) Use pkg-config in BPF selftests to determine ld flags which is
   in particular needed for linking statically, from Akihiko Odaki.

5) Fix a few BPF selftest failures to adapt to the upcoming LLVM18,
   from Yonghong Song.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (30 commits)
  bpf/tests: Remove duplicate JSGT tests
  selftests/bpf: Add TX side to xdp_hw_metadata
  selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP
  selftests/bpf: Add TX side to xdp_metadata
  selftests/bpf: Add csum helpers
  selftests/xsk: Support tx_metadata_len
  xsk: Add option to calculate TX checksum in SW
  xsk: Validate xsk_tx_metadata flags
  xsk: Document tx_metadata_len layout
  net: stmmac: Add Tx HWTS support to XDP ZC
  net/mlx5e: Implement AF_XDP TX timestamp and checksum offload
  tools: ynl: Print xsk-features from the sample
  xsk: Add TX timestamp and TX checksum offload support
  xsk: Support tx_metadata_len
  selftests/bpf: Use pkg-config for libelf
  selftests/bpf: Override PKG_CONFIG for static builds
  selftests/bpf: Choose pkg-config for the target
  bpftool: Add support to display uprobe_multi links
  selftests/bpf: Add link_info test for uprobe_multi link
  selftests/bpf: Use bpf_link__destroy in fill_link_info tests
  ...
====================

Conflicts:

Documentation/netlink/specs/netdev.yaml:
  839ff60df3 ("net: page_pool: add nlspec for basic access to page pools")
  48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
https://lore.kernel.org/all/20231201094705.1ee3cab8@canb.auug.org.au/

While at it also regen, tree is dirty after:
  48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
looks like code wasn't re-rendered after "render-max" was removed.

Link: https://lore.kernel.org/r/20231130145708.32573-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30 16:58:42 -08:00
Jakub Kicinski 975f2d73a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30 16:11:19 -08:00
Yujie Liu f690ff9122 bpf/tests: Remove duplicate JSGT tests
It seems unnecessary that JSGT is tested twice (one before JSGE and one
after JSGE) since others are tested only once. Remove the duplicate JSGT
tests.

Fixes: 0bbaa02b48 ("bpf/tests: Add tests to check source register zero-extension")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/20231130034018.2144963-1-yujie.liu@intel.com
2023-11-30 12:17:33 +01:00
Linus Torvalds d2da77f431 parisc architecture fixes for kernel v6.7-rc3:
- Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
   build issues
 - Fix section alignments for ex_table, altinstructions, parisc unwind
   table, jump_table and bug_table
 - Reduce size of bug_table on 64-bit kernel by using relative
   pointers
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZWNy+gAKCRD3ErUQojoP
 XwmuAQDw3a424GmG9c4lxQgPCdtFMUeOCOpMpPNNWdT8UhAcngD9FL5++lt1mwhK
 mq+w+ftx9sATtUIbvX5exkD1dChRaQw=
 =+Q66
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:
 "This patchset fixes and enforces correct section alignments for the
  ex_table, altinstructions, parisc_unwind, jump_table and bug_table
  which are created by inline assembly.

  Due to not being correctly aligned at link & load time they can
  trigger unnecessarily the kernel unaligned exception handler at
  runtime. While at it, I switched the bug table to use relative
  addresses which reduces the size of the table by half on 64-bit.

  We still had the ENOSYM and EREMOTERELEASE errno symbols as left-overs
  from HP-UX, which now trigger build-issues with glibc. We can simply
  remove them.

  Most of the patches are tagged for stable kernel series.

  Summary:

   - Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
     build issues

   - Fix section alignments for ex_table, altinstructions, parisc unwind
     table, jump_table and bug_table

   - Reduce size of bug_table on 64-bit kernel by using relative
     pointers"

* tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Reduce size of the bug_table on 64-bit kernel by half
  parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
  parisc: Use natural CPU alignment for bug_table
  parisc: Ensure 32-bit alignment on parisc unwind section
  parisc: Mark lock_aligned variables 16-byte aligned on SMP
  parisc: Mark jump_table naturally aligned
  parisc: Mark altinstructions read-only and 32-bit aligned
  parisc: Mark ex_table entries 32-bit aligned in uaccess.h
  parisc: Mark ex_table entries 32-bit aligned in assembly.h
2023-11-26 09:59:39 -08:00
Helge Deller e5f3e299a2 parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
Those return codes are only defined for the parisc architecture and
are leftovers from when we wanted to be HP-UX compatible.

They are not returned by any Linux kernel syscall but do trigger
problems with the glibc strerrorname_np() and strerror() functions as
reported in glibc issue #31080.

There is no need to keep them, so simply remove them.

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Bruno Haible <bruno@clisp.org>
Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080
Cc: stable@vger.kernel.org
2023-11-25 09:43:18 +01:00
Jakub Kicinski 53775da0b4 Merge branch 'firmware_loader'
Kory says:

====================
This patch was initially submitted as part of a net patch series.
Conor expressed interest in using it in a different subsystem.
https://lore.kernel.org/netdev/20231116-feature_poe-v1-7-be48044bf249@bootlin.com/

Consequently, I extracted it from the series and submitted it separately.
I first tried to send it to driver-core but it seems also not the best
choice:
https://lore.kernel.org/lkml/2023111720-slicer-exes-7d9f@gregkh/
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-24 18:11:55 -08:00
Kory Maincent a066f906ba firmware_loader: Expand Firmware upload error codes with firmware invalid error
No error code are available to signal an invalid firmware content.
Drivers that can check the firmware content validity can not return this
specific failure to the user-space

Expand the firmware error code with an additional code:
- "firmware invalid" code which can be used when the provided firmware
  is invalid

Sync lib/test_firmware.c file accordingly.

Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231122-feature_firmware_error_code-v3-1-04ec753afb71@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-24 18:09:19 -08:00
Linus Torvalds fa2b906f51 vfs-6.7-rc3.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZWBq0gAKCRCRxhvAZXjc
 ot4EAP48O5ExMtQ3/AIkNDo+/9/Iz4g7bE1HYmdyiMPO3Ou/uwEAySwBXRJrFAsS
 9omvkEdqrfyguW0xgoYwcxBdATVHnAE=
 =ScR3
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Avoid calling back into LSMs from vfs_getattr_nosec() calls.

   IMA used to query inode properties accessing raw inode fields without
   dedicated helpers. That was finally fixed a few releases ago by
   forcing IMA to use vfs_getattr_nosec() helpers.

   The goal of the vfs_getattr_nosec() helper is to query for attributes
   without calling into the LSM layer which would be quite problematic
   because incredibly IMA is called from __fput()...

     __fput()
       -> ima_file_free()

   What it does is to call back into the filesystem to update the file's
   IMA xattr. Querying the inode without using vfs_getattr_nosec() meant
   that IMA didn't handle stacking filesystems such as overlayfs
   correctly. So the switch to vfs_getattr_nosec() is quite correct. But
   the switch to vfs_getattr_nosec() revealed another bug when used on
   stacking filesystems:

     __fput()
       -> ima_file_free()
          -> vfs_getattr_nosec()
             -> i_op->getattr::ovl_getattr()
                -> vfs_getattr()
                   -> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()
                      -> security_inode_getattr() # calls back into LSMs

   Now, if that __fput() happens from task_work_run() of an exiting task
   current->fs and various other pointer could already be NULL. So
   anything in the LSM layer relying on that not being NULL would be
   quite surprised.

   Fix that by passing the information that this is a security request
   through to the stacking filesystem by adding a new internal
   ATT_GETATTR_NOSEC flag. Now the callchain becomes:

     __fput()
       -> ima_file_free()
          -> vfs_getattr_nosec()
             -> i_op->getattr::ovl_getattr()
                -> if (AT_GETATTR_NOSEC)
                          vfs_getattr_nosec()
                   else
                          vfs_getattr()
                   -> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()

 - Fix a bug introduced with the iov_iter rework from last cycle.

   This broke /proc/kcore by copying too much and without the correct
   offset.

 - Add a missing NULL check when allocating the root inode in
   autofs_fill_super().

 - Fix stable writes for multi-device filesystems (xfs, btrfs etc) and
   the block device pseudo filesystem.

   Stable writes used to be a superblock flag only, making it a per
   filesystem property. Add an additional AS_STABLE_WRITES mapping flag
   to allow for fine-grained control.

 - Ensure that offset_iterate_dir() returns 0 after reaching the end of
   a directory so it adheres to getdents() convention.

* tag 'vfs-6.7-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  libfs: getdents() should return 0 after reaching EOD
  xfs: respect the stable writes flag on the RT device
  xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags
  block: update the stable_writes flag in bdev_add
  filemap: add a per-mapping stable writes flag
  autofs: add: new_inode check in autofs_fill_super()
  iov_iter: fix copy_page_to_iter_nofault()
  fs: Pass AT_GETATTR_NOSEC flag to getattr interface function
2023-11-24 09:45:40 -08:00
Kent Overstreet d4e3b928ab closures: CLOSURE_CALLBACK() to fix type punning
Control flow integrity is now checking that type signatures match on
indirect function calls. That breaks closures, which embed a work_struct
in a closure in such a way that a closure_fn may also be used as a
workqueue fn by the underlying closure code.

So we have to change closure fns to take a work_struct as their
argument - but that results in a loss of clarity, as closure fns have
different semantics from normal workqueue functions (they run owning a
ref on the closure, which must be released with continue_at() or
closure_return()).

Thus, this patc introduces CLOSURE_CALLBACK() and closure_type() macros
as suggested by Kees, to smooth things over a bit.

Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 00:29:58 -05:00
Dave Jiang 35732699f5 ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
Linus reported that:
After commit a103f46633 the kernel stopped compiling for
several ARM32 platforms that I am building with a bare metal
compiler. Bare metal compilers (arm-none-eabi-) don't
define __linux__.

This is because the header <acpi/platform/acenv.h> is now
in the include path for <linux/irq.h>:

  CC      arch/arm/kernel/irq.o
  CC      kernel/sysctl.o
  CC      crypto/api.o
In file included from ../include/acpi/acpi.h:22,
                 from ../include/linux/fw_table.h:29,
                 from ../include/linux/acpi.h:18,
                 from ../include/linux/irqchip.h:14,
                 from ../arch/arm/kernel/irq.c:25:
../include/acpi/platform/acenv.h:218:2: error: #error Unknown target environment
  218 | #error Unknown target environment
      |  ^~~~~

The issue is caused by the introducing of splitting out the ACPI code to
support the new generic fw_table code.

Rafael suggested [1] moving the fw_table.h include in linux/acpi.h to below
the linux/mutex.h. Remove the two includes in fw_table.h. Replace
linux/fw_table.h include in fw_table.c with linux/acpi.h.

Link: https://lore.kernel.org/linux-acpi/CAJZ5v0idWdJq3JSqQWLG5q+b+b=zkEdWR55rGYEoxh7R6N8kFQ@mail.gmail.com/
Fixes: a103f46633 ("acpi: Move common tables helper functions to common lib")
Closes: https://lore.kernel.org/linux-acpi/20231114-arm-build-bug-v1-1-458745fe32a4@linaro.org/
Reported-by: Linus Walleij <linus.walleij@linaro.org>
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-22 20:41:34 +01:00
Andrzej Hajda 9bb6362652 debugobjects: Stop accessing objects after releasing hash bucket lock
After release of the hashbucket lock the tracking object can be modified or
freed by a concurrent thread.  Using it in such a case is error prone, even
for printing the object state:

    1. T1 tries to deactivate destroyed object, debugobjects detects it,
       hash bucket lock is released.

    2. T2 preempts T1 and frees the tracking object.

    3. The freed tracking object is allocated and initialized for a
       different to be tracked kernel object.

    4. T1 resumes and reports error for wrong kernel object.

Create a local copy of the tracking object before releasing the hash bucket
lock and use the local copy for reporting and fixups to prevent this.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231025-debugobjects_fix-v3-1-2bc3bf7084c2@intel.com
2023-11-22 10:41:46 +01:00
Jakub Kicinski 53475287da bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZV0kjgAKCRDbK58LschI
 gy0EAP9XwncW2OhO72DpITluFzvWPgB0N97OANKBXjzKJrRAlQD/aUe9nlvBQuad
 WsbMKLeC4wvI2X/4PEIR4ukbuZ3ypAA=
 =LMVg
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-11-21

We've added 85 non-merge commits during the last 12 day(s) which contain
a total of 63 files changed, 4464 insertions(+), 1484 deletions(-).

The main changes are:

1) Huge batch of verifier changes to improve BPF register bounds logic
   and range support along with a large test suite, and verifier log
   improvements, all from Andrii Nakryiko.

2) Add a new kfunc which acquires the associated cgroup of a task within
   a specific cgroup v1 hierarchy where the latter is identified by its id,
   from Yafang Shao.

3) Extend verifier to allow bpf_refcount_acquire() of a map value field
   obtained via direct load which is a use-case needed in sched_ext,
   from Dave Marchevsky.

4) Fix bpf_get_task_stack() helper to add the correct crosstask check
   for the get_perf_callchain(), from Jordan Rome.

5) Fix BPF task_iter internals where lockless usage of next_thread()
   was wrong. The rework also simplifies the code, from Oleg Nesterov.

6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another
   fix for certain BPF UAPI structs to fix verifier failures seen
   in bpf_dynptr usage, from Yonghong Song.

7) Add BPF selftest fixes for map_percpu_stats flakes due to per-CPU BPF
   memory allocator not being able to allocate per-CPU pointer successfully,
   from Hou Tao.

8) Add prep work around dynptr and string handling for kfuncs which
   is later going to be used by file verification via BPF LSM and fsverity,
   from Song Liu.

9) Improve BPF selftests to update multiple prog_tests to use ASSERT_*
   macros, from Yuran Pereira.

10) Optimize LPM trie lookup to check prefixlen before walking the trie,
    from Florian Lehner.

11) Consolidate virtio/9p configs from BPF selftests in config.vm file
    given they are needed consistently across archs, from Manu Bretelle.

12) Small BPF verifier refactor to remove register_is_const(),
    from Shung-Hsi Yu.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca
  selftests/bpf: reduce verboseness of reg_bounds selftest logs
  bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
  bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
  bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  bpf: emit frameno for PTR_TO_STACK regs if it differs from current one
  bpf: smarter verifier log number printing logic
  bpf: omit default off=0 and imm=0 in register state log
  bpf: emit map name in register state if applicable and available
  bpf: print spilled register state in stack slot
  bpf: extract register state printing
  bpf: move verifier state printing code to kernel/bpf/log.c
  bpf: move verbose_linfo() into kernel/bpf/log.c
  bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS
  bpf: Remove test for MOVSX32 with offset=32
  selftests/bpf: add iter test requiring range x range logic
  veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag
  ...
====================

Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 17:53:20 -08:00
Omar Sandoval fe2c34bab6 iov_iter: fix copy_page_to_iter_nofault()
The recent conversion to inline functions made two mistakes:

1. It tries to copy the full amount requested (bytes), not just what's
   available in the kmap'd page (n).
2. It's not applying the offset in the first page.

Note that copy_page_to_iter_nofault() is only used by /proc/kcore. This
was detected by drgn's test suite.

Fixes: f1982740f5 ("iov_iter: Convert iterate*() to inline funcs")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/c1616e06b5248013cbbb1881bb4fef85a7a69ccb.1700257019.git.osandov@fb.com
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-18 16:42:07 +01:00
Sagar Vashnav 239e27a983 crypto: lib/aesgcm - Add kernel docs for aesgcm_mac
Add kernel documentation for the aesgcm_mac.
This function generates the authentication tag using the AES-GCM algorithm.

Signed-off-by: Sagar Vashnav <sagarvashnav72427@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-11-17 19:16:28 +08:00
Puranjay Mohan 5fa201f37c bpf: Remove test for MOVSX32 with offset=32
MOVSX32 only supports sign extending 8-bit and 16-bit operands into 32
bit operands. The "ALU_MOVSX | BPF_W" test tries to sign extend a 32 bit
operand into a 32 bit operand which is equivalent to a normal BPF_MOV.

Remove this test as it tries to run an invalid instruction.

Fixes: daabb2b098 ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202310111838.46ff5b6a-oliver.sang@intel.com
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231110175150.87803-1-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:46:27 -08:00
Linus Torvalds 86d11b0e20 Zstd fixes for v6.7
Only a single line change to fix a benign UBSAN warning that has been
 baking in linux-next for a month. I just missed the merge window, but I
 think it is worthwhile to include this fix in the v6.7 kernel. If you
 would like me to wait for v6.8 please let me know.
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmVUG20ACgkQuzRpqaNE
 qPXx2Q/9GSGokNZm3+ZgINijpxsSz/+WT34ZJ4n14ns97PiiiSLQ10Nzpy3xGfHC
 gQTXYUoRJhuTUsid8QDsbgoLzybOxgZsxQWECrWEioXS+AiGCqGLnC85JdZNpJO3
 hWrxO6j0XfLb6Sgl5gm0CBWVuGkdKSUCCd91iHIZ9vKwFWSsQizGsdyYVXB4gZn4
 qYYGhka5TunDXgaTeqBGlNe3x6b/RvsPTBsX3SnAI3aBd88I6xrpcURrYru3NerZ
 16V2t5+GpiDz8cm7IHwEekh4JNaMwRLO8jhkThdErbLjFs/TU0s2hXkvG0m/yujs
 4Yklv+j0wIJv7f0IMSEZi+msnun+ajx8Mg4P1hi6WEiyylHC7aGAT55ShKhO8Qzx
 FD1kNiNR9svfHL83fJAZ3iEuhOQokRnt2qADutYmp0kq6BCcEnNuMBOpSyw+hzDZ
 MHDCkTujDTobE8gzZBEwUYPiw9kX6c3vsvb/mmNv8+y2LUxyb0GShinw3od0RlKQ
 T5xF6yq9neAxbARFJ9e3Zs51e5OXnCbb+K+pMNqR+nw89hUXiscynN93o4jMCzpn
 VBmhjzaSUTiTABG2TrqWNa5Vr6J3l77sr6T5bvC5aNpQQZ9KZFFPmpup8Z4BtBrG
 89BZqDsrUDRlMpS4Z2ECbl5GUELP3EBSV1NHRgVM4BB7t/UzvzU=
 =w0Q0
 -----END PGP SIGNATURE-----

Merge tag 'zstd-linus-v6.7-rc2' of https://github.com/terrelln/linux

Pull Zstd fix from Nick Terrell:
 "Only a single line change to fix a benign UBSAN warning"

* tag 'zstd-linus-v6.7-rc2' of https://github.com/terrelln/linux:
  zstd: Fix array-index-out-of-bounds UBSAN warning
2023-11-14 23:35:31 -05:00
Nick Terrell 77618db346 zstd: Fix array-index-out-of-bounds UBSAN warning
Zstd used an array of length 1 to mean a flexible array for C89
compatibility. Switch to a C99 flexible array to fix the UBSAN warning.

Tested locally by booting the kernel and writing to and reading from a
BtrFS filesystem with zstd compression enabled. I was unable to reproduce
the issue before the fix, however it is a trivial change.

Link: https://lkml.kernel.org/r/20231012213428.1390905-1-nickrterrell@gmail.com
Reported-by: syzbot+1f2eb3e8cd123ffce499@syzkaller.appspotmail.com
Reported-by: Eric Biggers <ebiggers@kernel.org>
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2023-11-14 17:12:52 -08:00
Richard Fitzgerald 1bddcf77ce kunit: test: Avoid cast warning when adding kfree() as an action
In kunit_log_test() pass the kfree_wrapper() function to kunit_add_action()
instead of directly passing kfree().

This prevents a cast warning:

lib/kunit/kunit-test.c:565:25: warning: cast from 'void (*)(const void *)'
to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible
function type [-Wcast-function-type-strict]

   564		full_log = string_stream_get_string(test->log);
 > 565		kunit_add_action(test, (kunit_action_t *)kfree, full_log);

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311070041.kWVYx7YP-lkp@intel.com/
Fixes: 05e2006ce4 ("kunit: Use string_stream for test log")
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:57 -07:00
Michal Wajdeczko 2e3c94aed5 kunit: Reset suite counter right before running tests
Today we reset the suite counter as part of the suite cleanup,
called from the module exit callback, but it might not work that
well as one can try to collect results without unloading a previous
test (either unintentionally or due to dependencies).

For easy reproduction try to load the kunit-test.ko and then
collect and parse results from the kunit-example-test.ko load.
Parser will complain about mismatch of expected test number:

[ ] KTAP version 1
[ ] 1..1
[ ]     # example: initializing suite
[ ]     KTAP version 1
[ ]     # Subtest: example
..
[ ] # example: pass:5 fail:0 skip:4 total:9
[ ] # Totals: pass:6 fail:0 skip:6 total:12
[ ] ok 7 example

[ ] [ERROR] Test: example: Expected test number 1 but found 7
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 12 tests: passed: 6, skipped: 6, errors: 1

Since we are now printing suite test plan on every module load,
right before running suite tests, we should make sure that suite
counter will also start from 1. Easiest solution seems to be move
counter reset to the __kunit_test_suites_init() function.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:47 -07:00
Maxime Ripard f8f2847f73 kunit: Warn if tests are slow
Kunit recently gained support to setup attributes, the first one being
the speed of a given test, then allowing to filter out slow tests.

A slow test is defined in the documentation as taking more than one
second. There's an another speed attribute called "super slow" but whose
definition is less clear.

Add support to the test runner to check the test execution time, and
report tests that should be marked as slow but aren't.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:37 -07:00
wuqiang.matt 3afe733729 lib: test_objpool: make global variables static
Kernel test robot reported build warnings that structures g_ot_sync_ops,
g_ot_async_ops and g_testcases should be static. These definitions are
only used in test_objpool.c, so make them static

Link: https://lore.kernel.org/all/20231108012248.313574-1-wuqiang.matt@bytedance.com/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311071229.WGrWUjM1-lkp@intel.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-11-10 19:59:04 +09:00
Linus Torvalds c9d01179e1 Second bcachefs pull request for 6.7-rc1
Here's the second big bcachefs pull request. This brings your tree up to
 date with my master branch, which is what existing bcachefs users are
 currently running.
 
 All but the last few patches have been in linux-next, those being small
 fixes. Test results from my dashboard:
   https://evilpiepirate.org/~testdashboard/ci?commit=c7046ed0cf9bb33599aa7e72e7b67bba4be42d64
 
 New features:
  - rebalance_work btree (and metadata version 1.3): the rebalance thread
    no longer has to scan to find extents that need processing - big
    scalability improvement.
  - sb_errors superblock section: this adds counters for each fsck error
    type, since filesystem creation, along with the date of the most
    recent error. It'll get us better bug reports (since users do not
    typically report errors that fsck was able to fix), and I might add
    telemetry for this in the future.
 
 Fixes include:
  - multiple snapshot deletion fixes
  - members_v2 fixups
  - deleted_inodes btree fixes
  - copygc thread no longer spins when a device is full but has no
    fragmented buckets (i.e. rebalance needs to move data around instead)
  - a fix for a memory reclaim issue with the btree key cache: we're now
    careful not to hold the srcu read lock that blocks key cache reclaim
    for too long
  - an early allocator locking fix, from Brian
  - endianness fixes, from Brian
  - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
    performance improvement on multithreaded workloads
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmVH9xYACgkQE6szbY3K
 bnahLRAAiNRZL73SQ+MW79o4yPqGwt0Eyy/mvoiGpZf1B8uXp0oZ55j2w3l887Uf
 LeM03mInAYCPdyp/d4vxqIr96j9BODmRRl8sEkkGdJDzokLG+22F0ovOe45KWTxL
 kBoNdng/O/oeOe/1K7taP3KzBvMx2nOF6oA+xfgyCjECMArAIXek0iocyEUR4Ywd
 vGKhLNn1k2c+94wacnDYwjjdcLBxoqxsFXlpu6V0BcaY+DX4J3aBaGmj75KEoCI0
 VbBOzxrOO4QzJrzW2+hxZZWgGyvReCkBJvqfORfuPxiSbFobTim10MdfZOAMQA1U
 Xr1FTEpK1wMX0/pPVgZRqaOsttC+yc/SsfPNgSxybgHPbDlMLaakDHjvYssbKOYG
 urDWSMG5yCsktSLj95SXsvUFKZaZFD72SKBNdgdt/nZjwTHuNQ7IkdrMwIrCQ/PT
 Ifn50UrR/Ahd8RAd5tyNCPw6U9VfwnxACSNl2KA7ONKpvHb+gSt1JsJTDyz1+gN9
 nFVrw1SHKQ6EIV6XhVon/5DEuRTzqoYGWoN08FHEUq9fBlvnVpmbJErCQMplOjz9
 OQnAfpJH4YqkpXyjFAjP1V0An+RUn8QvDgXNqC9TyvCYuOliVFuil4y7/c+7oIQU
 NEoz+jVLenqsGOGAbduI4/Q567COojRgwEvbebSIxSImXuhCNj4=
 =Lo4N
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Here's the second big bcachefs pull request. This brings your tree up
  to date with my master branch, which is what existing bcachefs users
  are currently running.

  New features:
   - rebalance_work btree (and metadata version 1.3): the rebalance
     thread no longer has to scan to find extents that need processing -
     big scalability improvement.
   - sb_errors superblock section: this adds counters for each fsck
     error type, since filesystem creation, along with the date of the
     most recent error. It'll get us better bug reports (since users do
     not typically report errors that fsck was able to fix), and I might
     add telemetry for this in the future.

  Fixes include:
   - multiple snapshot deletion fixes
   - members_v2 fixups
   - deleted_inodes btree fixes
   - copygc thread no longer spins when a device is full but has no
     fragmented buckets (i.e. rebalance needs to move data around
     instead)
   - a fix for a memory reclaim issue with the btree key cache: we're
     now careful not to hold the srcu read lock that blocks key cache
     reclaim for too long
   - an early allocator locking fix, from Brian
   - endianness fixes, from Brian
   - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
     performance improvement on multithreaded workloads"

* tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs: (70 commits)
  bcachefs: Improve stripe checksum error message
  bcachefs: Simplify, fix bch2_backpointer_get_key()
  bcachefs: kill thing_it_points_to arg to backpointer_not_found()
  bcachefs: bch2_ec_read_extent() now takes btree_trans
  bcachefs: bch2_stripe_to_text() now prints ptr gens
  bcachefs: Don't iterate over journal entries just for btree roots
  bcachefs: Break up bch2_journal_write()
  bcachefs: Replace ERANGE with private error codes
  bcachefs: bkey_copy() is no longer a macro
  bcachefs: x-macro-ify inode flags enum
  bcachefs: Convert bch2_fs_open() to darray
  bcachefs: Move __bch2_members_v2_get_mut to sb-members.h
  bcachefs: bch2_prt_datetime()
  bcachefs: CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y
  bcachefs: Add a comment for BTREE_INSERT_NOJOURNAL usage
  bcachefs: rebalance_work btree is not a snapshots btree
  bcachefs: Add missing printk newlines
  bcachefs: Fix recovery when forced to use JSET_NO_FLUSH journal entry
  bcachefs: .get_parent() should return an error pointer
  bcachefs: Fix bch2_delete_dead_inodes()
  ...
2023-11-07 11:38:38 -08:00
Linus Torvalds b8cc56d041 cxl for v6.7
- Add support for RCH (Restricted CXL Host) Error recovery
 
 - Fix several region assembly bugs
 
 - Fix mem-device lifetime issues relative to the sanitize command and
   RCH topology.
 
 - Refactor ACPI table parsing for CDAT parsing re-use in preparation for
   CXL QOS support.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZUaowQAKCRDfioYZHlFs
 Z75rAP44azzLPwJtva7Ur60KpNsGuoZKhvWWdeI1/zo9k4pHbwEA/Vaf/GGo0U5k
 bMkoTmwPTd7YY79B5HNUQSZsqF9wlAc=
 =TEQ0
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The main new functionality this time is work to allow Linux to
  natively handle CXL link protocol errors signalled via PCIe AER for
  current generation CXL platforms. This required some enlightenment of
  the PCIe AER core to workaround the fact that current generation RCH
  (Restricted CXL Host) platforms physically hide topology details and
  registers via a mechanism called RCRB (Root Complex Register Block).

  The next major highlight is reworks to address bugs in parsing region
  configurations for next generation VH (Virtual Host) topologies. The
  old broken algorithm is replaced with a simpler one that significantly
  increases the number of region configurations supported by Linux. This
  is again relevant for error handling so that forward and reverse
  address translation of memory errors can be carried out by Linux for
  memory regions instantiated by platform firmware.

  As for other cross-tree work, the ACPI table parsing code has been
  refactored for reuse parsing the "CDAT" structure which is an
  ACPI-like data structure that is reported by CXL devices. That work is
  in preparation for v6.8 support for CXL QoS. Think of this as dynamic
  generation of NUMA node topology information generated by Linux rather
  than platform firmware.

  Lastly, a number of internal object lifetime issues have been resolved
  along with misc. fixes and feature updates (decoders_committed sysfs
  ABI).

  Summary:

   - Add support for RCH (Restricted CXL Host) Error recovery

   - Fix several region assembly bugs

   - Fix mem-device lifetime issues relative to the sanitize command and
     RCH topology.

   - Refactor ACPI table parsing for CDAT parsing re-use in preparation
     for CXL QOS support"

* tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (50 commits)
  lib/fw_table: Remove acpi_parse_entries_array() export
  cxl/pci: Change CXL AER support check to use native AER
  cxl/hdm: Remove broken error path
  cxl/hdm: Fix && vs || bug
  acpi: Move common tables helper functions to common lib
  cxl: Add support for reading CXL switch CDAT table
  cxl: Add checksum verification to CDAT from CXL
  cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
  cxl: Add decoders_committed sysfs attribute to cxl_port
  cxl: Add cxl_decoders_committed() helper
  cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm
  cxl/core/regs: Rename phys_addr in cxl_map_component_regs()
  PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
  PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
  cxl/pci: Disable root port interrupts in RCH mode
  cxl/pci: Add RCH downstream port error logging
  cxl/pci: Map RCH downstream AER registers for logging protocol errors
  cxl/pci: Update CXL error logging to use RAS register address
  PCI/AER: Refactor cper_print_aer() for use by CXL driver module
  cxl/pci: Add RCH downstream port AER register discovery
  ...
2023-11-04 16:20:36 -10:00
Linus Torvalds 707df298cb powerpc updates for 6.7
- Add support for KVM running as a nested hypervisor under development versions
    of PowerVM, using the new PAPR nested virtualisation API.
 
  - Add support for the BPF prog pack allocator.
 
  - A rework of the non-server MMU handling to support execute-only on all platforms.
 
  - Some optimisations & cleanups for the powerpc qspinlock code.
 
  - Various other small features and fixes.
 
 Thanks to: Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin Gray,
 Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam Menghani, Geert
 Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Julia
 Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael Neuling, Minjie Du, Muhammad
 Muzammil, Naveen N Rao, Nicholas Piggin, Nick Child, Nysal Jan K.A, Peter
 Lafreniere, Rob Herring, Sachin Sant, Sebastian Andrzej Siewior, Shrikanth
 Hegde, Srikar Dronamraju, Stanislav Kinsburskii, Vaibhav Jain, Wang Yufen, Yang
 Yingliang, Yuan Tan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmVEf38THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgMKgD/4vmPVcBE31xCAuuksrVvmMDRsCoC8N
 IJe4A5dHda1tYgdN2YdeK4LBszv5pWICjf2xZHlNh+L0s3Vxpngd4ycAWGPfDAyk
 SOlM24NCKl5j3327QZEt+iZVmJeTSnrmjxO0A1y04yvzLrfvFT7mbP4EXoidjShd
 GNb/EoH9kkCFn65zulc+lN2itQEX6Ht2GQTAz5z5GKtF6d1zZGM8ftOW+SQ5LeU3
 5JOkQtMtwAKhzBiglA4BB3pQyjaOOkPaTaj/WLoxx5tbVaCkV4wrFq48Bmtbm7E3
 kYkMNoI3IsC615GqY1CaRs/RSpMt74tIVh3tstSecHWRIwNGnfF6zeZpKLvJSs8k
 Qa5greGWMUDuJdDg9oDwAX2AKtO+3byI2v1hKE+sMhMh0eeMtDP9WIrIRg4BDjKL
 mq8RffXLTCtepehgfwBpoZbcvFSwFUMwuihBD7+bDMZQeDbtuFdZ2ouMFXBP9M1n
 cuv4KySouvKv9Xp5EeCkHlpL7QmSqrtSHOPYjoPeLueJYlmjheWdreLM9p7Nl2ma
 5wBxLpdLCGCpDJOyGgWNoQRHXucBNlU97DLx2V70nXG4wvvRyXh9EZ6I2niPSdPx
 N3LJnINz4MJ52Gd1KWJvufOyJlLwXxuI07rzCq67ZegpEPh+baWqVcPscuKU8+q0
 dSh2DPCht8gw1A==
 =ddT4
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for KVM running as a nested hypervisor under development
   versions of PowerVM, using the new PAPR nested virtualisation API

 - Add support for the BPF prog pack allocator

 - A rework of the non-server MMU handling to support execute-only on
   all platforms

 - Some optimisations & cleanups for the powerpc qspinlock code

 - Various other small features and fixes

Thanks to Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin
Gray, Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Julia Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael
Neuling, Minjie Du, Muhammad Muzammil, Naveen N Rao, Nicholas Piggin,
Nick Child, Nysal Jan K.A, Peter Lafreniere, Rob Herring, Sachin Sant,
Sebastian Andrzej Siewior, Shrikanth Hegde, Srikar Dronamraju, Stanislav
Kinsburskii, Vaibhav Jain, Wang Yufen, Yang Yingliang, and Yuan Tan.

* tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (100 commits)
  powerpc/vmcore: Add MMU information to vmcoreinfo
  Revert "powerpc: add `cur_cpu_spec` symbol to vmcoreinfo"
  powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]
  powerpc/bpf: rename powerpc64_jit_data to powerpc_jit_data
  powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack
  powerpc/bpf: implement bpf_arch_text_copy
  powerpc/code-patching: introduce patch_instructions()
  powerpc/32s: Implement local_flush_tlb_page_psize()
  powerpc/pseries: use kfree_sensitive() in plpks_gen_password()
  powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure
  powerpc/fsl_msi: Use device_get_match_data()
  powerpc: Remove cpm_dp...() macros
  powerpc/qspinlock: Rename yield_propagate_owner tunable
  powerpc/qspinlock: Propagate sleepy if previous waiter is preempted
  powerpc/qspinlock: don't propagate the not-sleepy state
  powerpc/qspinlock: propagate owner preemptedness rather than CPU number
  powerpc/qspinlock: stop queued waiters trying to set lock sleepy
  powerpc/perf: Fix disabling BHRB and instruction sampling
  powerpc/trace: Add support for HAVE_FUNCTION_ARG_ACCESS_API
  powerpc/tools: Pass -mabi=elfv2 to gcc-check-mprofile-kernel.sh
  ...
2023-11-03 10:07:39 -10:00
Linus Torvalds 31e5f934ff Tracing updates for v6.7:
- Remove eventfs_file descriptor
 
   This is the biggest change, and the second part of making eventfs
   create its files dynamically.
 
   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and
   file inodes and dentries that were dynamically created. The
   directories were represented by a eventfs_inode and the files
   were represented by a eventfs_file.
 
   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format",
   etc files), the handing of what files are underneath each leaf
   eventfs directory is moved back to the tracing subsystem via a
   callback. When a event is added to the eventfs, it registers
   an array of evenfs_entry's. These hold the names of the files and
   the callbacks to call when the file is referenced. The callback gets
   the name so that the same callback may be used by multiple files.
   The callback then supplies the filesystem_operations structure needed
   to create this file.
 
   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!
 
 - User events now has persistent events that are not associated
   to a single processes. These are privileged events that hang around
   even if no process is attached to them.
 
 - Clean up of seq_buf.
   There's talk about using seq_buf more to replace strscpy() and friends.
   But this also requires some minor modifications of seq_buf to be
   able to do this.
 
 - Expand instance ring buffers individually
   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance.
 
 - Other minor clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZUMrBBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6quzVAQCed/kPM7X9j2QZamJVDruMf2CmVxpu
 /TOvKvSKV584GgEAxLntf5VKx1Q98bc68y3Zkg+OCi8jSgORos1ROmURhws=
 =iIgb
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Remove eventfs_file descriptor

   This is the biggest change, and the second part of making eventfs
   create its files dynamically.

   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and file
   inodes and dentries that were dynamically created. The directories
   were represented by a eventfs_inode and the files were represented by
   a eventfs_file.

   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format", etc
   files), the handing of what files are underneath each leaf eventfs
   directory is moved back to the tracing subsystem via a callback.

   When an event is added to the eventfs, it registers an array of
   evenfs_entry's. These hold the names of the files and the callbacks
   to call when the file is referenced. The callback gets the name so
   that the same callback may be used by multiple files. The callback
   then supplies the filesystem_operations structure needed to create
   this file.

   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!

 - User events now has persistent events that are not associated to a
   single processes. These are privileged events that hang around even
   if no process is attached to them

 - Clean up of seq_buf

   There's talk about using seq_buf more to replace strscpy() and
   friends. But this also requires some minor modifications of seq_buf
   to be able to do this

 - Expand instance ring buffers individually

   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance

 - Other minor clean ups and fixes

* tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  seq_buf: Export seq_buf_puts()
  seq_buf: Export seq_buf_putc()
  eventfs: Use simple_recursive_removal() to clean up dentries
  eventfs: Remove special processing of dput() of events directory
  eventfs: Delete eventfs_inode when the last dentry is freed
  eventfs: Hold eventfs_mutex when calling callback functions
  eventfs: Save ownership and mode
  eventfs: Test for ei->is_freed when accessing ei->dentry
  eventfs: Have a free_ei() that just frees the eventfs_inode
  eventfs: Remove "is_freed" union with rcu head
  eventfs: Fix kerneldoc of eventfs_remove_rec()
  tracing: Have the user copy of synthetic event address use correct context
  eventfs: Remove extra dget() in eventfs_create_events_dir()
  tracing: Have trace_event_file have ref counters
  seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
  eventfs: Fix typo in eventfs_inode union comment
  eventfs: Fix WARN_ON() in create_file_dentry()
  powerpc: Remove initialisation of readpos
  tracing/histograms: Simplify last_cmd_set()
  seq_buf: fix a misleading comment
  ...
2023-11-03 07:41:18 -10:00
Linus Torvalds 2a80532c07 printk changes for 6.7
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmVDk4QACgkQUqAMR0iA
 lPLKtg/9FXMuIRtyiZqHARIjRprxSWyiwVu+ZsDfp8JznP1+WnoWp8E+xf824dbW
 xNnQS1qKecvH+1Nw9jMhXNAlViI5re7ft01yDraA0POjP3P8ar+4CXiB3e8CXTLZ
 VUwMegBlztkcskV0L5EpRRyOa728UF63V3FN41SuR81GaAGOVyZNE4RmlwlsS9xg
 uCj7XEMgVuZpVjnM6Cx/YzUfuWGwFp0eWn31Vc7Dp5Bab/yDvSRdD8H1foo23/3k
 nLh6wDV9UbYbAsgHwLWck18kmn0xsnzK8G08bzFwP8FASHjVuEaxSgJqFBuZxY1j
 O4XFR1zedVOB2Q08fzwi/rVB7jibw9mZZNWceOnrDM8PUx7m/m/YVOO0aNVmBh+f
 uztptyoMw8o93dXwR05Dc5JGyYh4k4eMN9eS0MbALNziFwvc80ln4g1goNYpzjyb
 vfytDoacwuRYD3BkEL5t5BsAK4ULqpTyOZiv0sXUC+cERH1HkUGm5KCPbQawnvaK
 fEkGJTHMtE+I+Cui83jYCdxJVfE1LFcM9Yl7ZDfrVisRQ8/KM9+L68XceRe4E30U
 1YiSJIopeYi5+ABysfE4RPumHWsG6d8FQPtXZyS+/K0cl5j49t36r3IOy0QUh6bn
 G9iZSvO4oxUHz1sXja+X+EKzN1r8fRf/j+ovTrBUpCcZZRqKk9s=
 =Cmex
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Another preparation step for introducing printk kthreads. The main
   piece is a per-console lock with several features:

    - Support three priorities: normal, emergency, and panic. They will
      be defined by a context where the lock is taken. A context with a
      higher priority is allowed to take over the lock from a context
      with a lower one.

      The plan is to use the emergency context for Oops and WARN()
      messages, and also by watchdogs.

      The panic() context will be used on panic CPU.

    - The owner might enter/exit regions where it is not safe to take
      over the lock. It allows the take over the lock a safe way in the
      middle of a message.

      For example, serial drivers emit characters one by one. And the
      serial port is in a safe state in between.

      Only the final console_flush_in_panic() will be allowed to take
      over the lock even in the unsafe state (last chance, pray, and
      hope).

    - A higher priority context might busy wait with a timeout. The
      current owner is informed about the waiter and releases the lock
      on exit from the unsafe state.

    - The new lock is safe even in atomic contexts, including NMI.

   Another change is a safe manipulation of per-console sequence number
   counter under the new lock.

 - simple_strntoull() micro-optimization

 - Reduce pr_flush() pooling time.

 - Calm down false warning about possible buffer invalid access to
   console buffers when CONFIG_PRINTK is disabled.

[ .. and Thomas Gleixner wants to point out that while several of the
  commits are attributed to him, he only authored the early versions of
  said commits, and that John Ogness and Petr Mladek have been the ones
  who sorted out the details and really should be those who get the
  credit   - Linus ]

* tag 'printk-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: uninline simple_strntoull(), reorder arguments
  printk: printk: Remove unnecessary statements'len = 0;'
  printk: Reduce pr_flush() pooling time
  printk: fix illegal pbufs access for !CONFIG_PRINTK
  printk: nbcon: Allow drivers to mark unsafe regions and check state
  printk: nbcon: Add emit function and callback function for atomic printing
  printk: nbcon: Add sequence handling
  printk: nbcon: Add ownership state functions
  printk: nbcon: Add buffer management
  printk: Make static printk buffers available to nbcon
  printk: nbcon: Add acquire/release logic
  printk: Add non-BKL (nbcon) console basic infrastructure
2023-11-03 07:24:22 -10:00
Linus Torvalds 9a719c2145 bitmap patches for v6.7
Hi Linus,
 
 Please pull patches for v6.7.
 
 This request includes "bitmap: cleanup bitmap_*_region() implementation"
 series, and scattered cleanup patches.
 
 Thanks,
         Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmVCgS4ACgkQsUSA/Tof
 vsjIdQv+PnSQ5Lq6ISWYqhV0I60LPLWjf4jm5bgHUT/gKWjUIqYJmYfHD1M1MTkJ
 +qsLdywshSdE62TG/Y0r/i9el8IedJOP1T0Oi9RpVPjV/vZd7BgGYSLfOsZnvV4e
 wmIVKKE5A+uAcKHw2+9MWoK+4LxG6YRWb6AKGroghz3GU70hFz9xY+kwsfP1NxLd
 pqalPYGyyfkte+7uSchwMKfJVkXA5TwxbasB8Qd8s0fM0DNOLcoZbjFxt2ufZzBY
 a57I12nheYagBmLfMPjOT3TR/g9XXQnn8pxxhNM0XJeu73WDno+ZMTmH80SzDuv7
 P6+6KglUHY1IHyeQ0chgwZDusxkCKfR9W6fQ5IhGYJuZkKtzbdsjVf38jJbWwp8n
 ZIFu8n1kkYN7Ap4veOJ32N/cDRN0yR5f2pWxTw2hPifn5Rftl26PhidH0Bjz/F+p
 q4/dIxsGPA6bsQCfZ7XNfGf9pARwLjcHgZt8MMwj2RA2hv+1qyefRav94jUrkyPT
 9gaBkZHi
 =L4AW
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.7' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:
 "This includes the 'bitmap: cleanup bitmap_*_region() implementation'
  series, and scattered cleanup patches"

* tag 'bitmap-for-6.7' of https://github.com/norov/linux:
  buildid: reduce header file dependencies for module
  bitmap: move bitmap_*_region() functions to bitmap.h
  bitmap: drop _reg_op() function
  bitmap: replace _reg_op(REG_OP_ISFREE) with find_next_bit()
  bitmap: replace _reg_op(REG_OP_RELEASE) with bitmap_clear()
  bitmap: replace _reg_op(REG_OP_ALLOC) with bitmap_set()
  bitmap: fix opencoded bitmap_allocate_region()
  bitmap: add test for bitmap_*_region() functions
  bitmap: align __reg_op() wrappers with modern coding style
  lib/bitmap: split-out string-related operations to a separate files
  bitmap: Remove dead code, i.e. bitmap_copy_le()
  bitmap: Fix a typo ("identify map")
  cpumask: kernel-doc cleanups and additions
2023-11-03 07:08:36 -10:00
Linus Torvalds 8f6f76a6a2 As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs.
 
 The lengthier patch series are
 
 - "kdump: use generic functions to simplify crashkernel reservation in
   arch", from Baoquan He.  This is mainly cleanups and consolidation of
   the "crashkernel=" kernel parameter handling.
 
 - After much discussion, David Laight's "minmax: Relax type checks in
   min() and max()" is here.  Hopefully reduces some typecasting and the
   use of min_t() and max_t().
 
 - A group of patches from Oleg Nesterov which clean up and slightly fix
   our handling of reads from /proc/PID/task/...  and which remove
   task_struct.therad_group.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
 jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
 =On4u
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig->stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
2023-11-02 20:53:31 -10:00
Linus Torvalds ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Dan Williams 4b92894064 lib/fw_table: Remove acpi_parse_entries_array() export
Stephen reports that the ACPI helper library rework,
CONFIG_FIRMWARE_TABLE, introduces a new compiler warning:

    WARNING: modpost: vmlinux: acpi_parse_entries_array: EXPORT_SYMBOL used
    for init symbol. Remove __init or EXPORT_SYMBOL.

Delete this export as it turns out it is unneeded, and future work wraps
this in another exported helper. Note that in general
EXPORT_SYMBOL_ACPI_LIB() is needed for exporting symbols that are marked
__init_or_acpilib, but in this case no export is required.

Fixes: a103f46633 ("acpi: Move common tables helper functions to common lib")
Cc: Dave Jiang <dave.jiang@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: http://lore.kernel.org/r/20231030160523.670a7569@canb.auug.org.au
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/169896282222.70775.940454758280866379.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-11-02 15:17:21 -07:00
Christophe JAILLET 70a9affa93 seq_buf: Export seq_buf_puts()
Mark seq_buf_puts() which is part of the seq_buf API to be exported to
kernel loadable GPL modules.

Link: https://lkml.kernel.org/r/b9e3737f66ec2450221b492048ce0d9c65c84953.1698861216.git.christophe.jaillet@wanadoo.fr

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-02 00:19:44 -04:00
Christophe JAILLET 685b38c765 seq_buf: Export seq_buf_putc()
Mark seq_buf_putc() which is part of the seq_buf API to be exported to
kernel loadable GPL modules.

Link: https://lkml.kernel.org/r/5c9a5ed97ac37dbdcd9c1e7bcbdec9ac166e79be.1698861216.git.christophe.jaillet@wanadoo.fr

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-02 00:18:52 -04:00
Linus Torvalds 5eda8f2537 linux_kselftest-kunit-6.7-rc1
This kunit update for Linux 6.7-rc1 consists of:
 
 -- string-stream testing enhancements
 -- several fixes memory leaks
 -- fix to reset status during parameter handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVClgAACgkQCwJExA0N
 QxwXhA//Yn2nL5an6A9ZQire8S0HmxqdAwgROBVb7AaCkJ9L4cY+BYyt+VzTkaGO
 DJ3x5TwBi6BUkDSmAjppu+KhSQqbRvtmC5xIwH5zbSqKesftklJDhNC6SBd8ZFSC
 W5w+wwK8xxqFni2NAQu/ZMHrxgllUlR7ZbCE31U/IUYvW4NP4izz9oX3rqkvgjEU
 aZOQpe3GBVn3jkX+gYopoCqegbSiobjeE83cq8NwFX22rxA8FRDKeJF6/5fv3N1h
 vJF0KR14QIxc+Y50KHLNMpCXLIM8/IXEiPXd0bqzFVEB+/uyikM8BRPQRHaq9PfA
 VhaRSDAe9rYgAQYZcayAPLHlrsrFXS/gFs2x15QxzeLfrke7uJg1jEa8CoxV50f3
 ImAOxbtxXcW0Oz0lU4J5iX6Kq1gwhv/GP/Rgr8Cf2xMo4dWy3k6/dt8Ep7FwuwWk
 +ReDNPBw+FMrRLN3hWTXr2Y7k4avOmCDmjj5L1YVVG9yug0WI4ZD3OelbcXtzAA6
 XUAuB/EjCQYNB8XrOGd7+xX8eZraA/68xgf9gtNM3ycoIQ2UAfGnQnaZm3BXjvGk
 zjnvG8JpaCBObvonK22y1t60Ive1PbZgpdtU2Za1QTUjZWUPPudLHr7oU30crOaW
 +aqtCam6UGU6GlsiMWzbN8cv/q5fo/gCWkfKLGEry2NCz/71mwc=
 =Nn2t
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - string-stream testing enhancements

 - several fixes memory leaks

 - fix to reset status during parameter handling

* tag 'linux_kselftest-kunit-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: test: Fix the possible memory leak in executor_test
  kunit: Fix possible memory leak in kunit_filter_suites()
  kunit: Fix the wrong kfree of copy for kunit_filter_suites()
  kunit: Fix missed memory release in kunit_free_suite_set()
  kunit: Reset test status on each param iteration
  kunit: string-stream: Test performance of string_stream
  kunit: Use string_stream for test log
  kunit: string-stream: Add tests for freeing resource-managed string_stream
  kunit: string-stream: Decouple string_stream from kunit
  kunit: string-stream: Add kunit_alloc_string_stream()
  kunit: Don't use a managed alloc in is_literal()
  kunit: string-stream-test: Add cases for string_stream newline appending
  kunit: string-stream: Add option to make all lines end with newline
  kunit: string-stream: Improve testing of string_stream
  kunit: string-stream: Don't create a fragment for empty strings
2023-11-01 17:02:29 -10:00
Linus Torvalds 05bf73aa27 Probes updates for v6.7:
- cleanups:
   . kprobes: Fixes typo in kprobes samples.
 
   . tracing/eprobes: Remove 'break' after return.
 
 - kretprobe/fprobe performance improvements:
   . lib: Introduce new `objpool`, which is a high performance lockless
     object queue. This uses per-cpu ring array to allocate/release
     objects from the pre-allocated object pool. Since the index of ring
     array is a 32bit sequential counter, we can retry to push/pop the
     object pointer from the ring without lock (as seq-lock does).
 
   . lib: Add an objpool test module to test the functionality and
     evaluate the performance under some circumstances.
 
   . kprobes/fprobe: Improve kretprobe and rethook scalability
     performance with objpool.
     This improves both legacy kretprobe and fprobe exit handler (which
     is based on rethook) to be scalable on SMP systems. Even with
     8-threads parallel test, it shows a great scalability improvement.
 
   . Remove unneeded freelist.h which is replaced by objpool.
 
   . objpool: Add maintainers entry for the objpool.
 
   . objpool: Fix to remove unused include header lines.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmVA54obHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8busoH/3mG/rJwVVJw70zTLlfs
 ko4U1wn16aImYQYYLXkZLlYsKr6Y2dzNkb5C4CEI2r47EZjTamHatGZ6MSwvAtPb
 u9oloHEbRbE6yM+EjrE1JAKT9FwC+21/yZCN2zACZKJRwCwQRzxGIXUwGTWtDNdE
 NySLBDyMoR6zZJsFy8YueFBAJxcZdWIPK6mQH2Y5awVQA4tV7tQEe92KFqUYWTd5
 exbfBbcVG8MBWmrPqRI46Hxh0NWOnPCqFwGqX8Q7hE/yrQnTPzJ+2ZsbYFkGRk6A
 pM5wRCdwO5+OlcHEcEHBMQSGCmFgk6m1UMG8RvbCKyF3cwHbxzlelbjzHosKQvSh
 EKQ=
 =/vZK
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
 "Cleanups:

   - kprobes: Fixes typo in kprobes samples

   - tracing/eprobes: Remove 'break' after return

  kretprobe/fprobe performance improvements:

   - lib: Introduce new `objpool`, which is a high performance lockless
     object queue. This uses per-cpu ring array to allocate/release
     objects from the pre-allocated object pool.

     Since the index of ring array is a 32bit sequential counter, we can
     retry to push/pop the object pointer from the ring without lock (as
     seq-lock does)

   - lib: Add an objpool test module to test the functionality and
     evaluate the performance under some circumstances

   - kprobes/fprobe: Improve kretprobe and rethook scalability
     performance with objpool.

     This improves both legacy kretprobe and fprobe exit handler (which
     is based on rethook) to be scalable on SMP systems. Even with
     8-threads parallel test, it shows a great scalability improvement

   - Remove unneeded freelist.h which is replaced by objpool

   - objpool: Add maintainers entry for the objpool

   - objpool: Fix to remove unused include header lines"

* tag 'probes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobes: unused header files removed
  MAINTAINERS: objpool added
  kprobes: freelist.h removed
  kprobes: kretprobe scalability improvement
  lib: objpool test module added
  lib: objpool added: ring-array based lockless MPMC
  tracing/eprobe: drop unneeded breaks
  samples: kprobes: Fixes a typo
2023-11-01 16:15:42 -10:00
Linus Torvalds 1e0c505e13 asm-generic updates for v6.7
The ia64 architecture gets its well-earned retirement as planned,
 now that there is one last (mostly) working release that will
 be maintained as an LTS kernel.
 
 The architecture specific system call tables are updated for
 the added map_shadow_stack() syscall and to remove references
 to the long-gone sys_lookup_dcookie() syscall.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC40IACgkQYKtH/8kJ
 Uidhmw/9EX+aWSXGoObJ3fngaNSMw+PmrEuP8qEKBHxfKHcCdX3hc451Oh4GlhaQ
 tru91pPwgNvN2/rfoKusxT+V4PemGIzfNni/04rp+P0kvmdw5otQ2yNhsQNsfVmq
 XGWvkxF4P2GO6bkjjfR/1dDq7GtlyXtwwPDKeLbYb6TnJOZjtx+EAN27kkfSn1Ms
 R4Sa3zJ+DfHUmHL5S9g+7UD/CZ5GfKNmIskI4Mz5GsfoUz/0iiU+Bge/9sdcdSJQ
 kmbLy5YnVzfooLZ3TQmBFsO3iAMWb0s/mDdtyhqhTVmTUshLolkPYyKnPFvdupyv
 shXcpEST2XJNeaDRnL2K4zSCdxdbnCZHDpjfl9wfioBg7I8NfhXKpf1jYZHH1de4
 LXq8ndEFEOVQw/zSpYWfQq1sux8Jiqr+UK/ukbVeFWiGGIUs91gEWtPAf8T0AZo9
 ujkJvaWGl98O1g5wmBu0/dAR6QcFJMDfVwbmlIFpU8O+MEaz6X8mM+O5/T0IyTcD
 eMbAUjj4uYcU7ihKzHEv/0SS9Of38kzff67CLN5k8wOP/9NlaGZ78o1bVle9b52A
 BdhrsAefFiWHp1jT6Y9Rg4HOO/TguQ9e6EWSKOYFulsiLH9LEFaB9RwZLeLytV0W
 vlAgY9rUW77g1OJcb7DoNv33nRFuxsKqsnz3DEIXtgozo9CzbYI=
 =H1vH
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull ia64 removal and asm-generic updates from Arnd Bergmann:

 - The ia64 architecture gets its well-earned retirement as planned,
   now that there is one last (mostly) working release that will be
   maintained as an LTS kernel.

 - The architecture specific system call tables are updated for the
   added map_shadow_stack() syscall and to remove references to the
   long-gone sys_lookup_dcookie() syscall.

* tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  hexagon: Remove unusable symbols from the ptrace.h uapi
  asm-generic: Fix spelling of architecture
  arch: Reserve map_shadow_stack() syscall number for all architectures
  syscalls: Cleanup references to sys_lookup_dcookie()
  Documentation: Drop or replace remaining mentions of IA64
  lib/raid6: Drop IA64 support
  Documentation: Drop IA64 from feature descriptions
  kernel: Drop IA64 support from sig_fault handlers
  arch: Remove Itanium (IA-64) architecture
2023-11-01 15:28:33 -10:00
Linus Torvalds 385903a7ec SoC driver updates for 6.7
The highlights for the driver support this time are
 
  - Qualcomm platforms gain support for the Qualcomm Secure Execution
    Environment firmware interface to access EFI variables on certain
    devices, and new features for multiple platform and firmware drivers.
 
  - Arm FF-A firmware support gains support for v1.1 specification features,
    in particular notification and memory transaction descriptor changes.
 
  - SCMI firmware support now support v3.2 features for clock and DVFS
    configuration and a new transport for Qualcomm platforms.
 
  - Minor cleanups and bugfixes are added to pretty much all the active
    platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive, amlogic,
    atmel, tegra, aspeed, vexpress, mediatek, samsung and more.
    In particular, this contains portions of the treewide conversion to
    use __counted_by annotations and the device_get_match_data helper.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC10IACgkQYKtH/8kJ
 UifFoQ//Tw7aux88EA2UkyL2Wulv80NwRQn3tQlxI/6ltjBX64yeQ6Y8OzmYdSYK
 20NEpbU7VWOFftN+D6Jp1HLrvfi0OV9uJn3WiTX3ChgDXixpOXo4TYgNNTlb9uZ4
 MrSTG3NkS27m/oTaCmYprOObgSNLq1FRCGIP7w4U9gyMk9N9FSKMpSJjlH06qPz6
 WBLTaIwPgBsyrLfCdxfA1y7AFCAHVxQJO4bp0VWSIalTrneGTeQrd2FgYMUesQ2e
 fIUNCaU4mpmj8XnQ/W19Wsek8FRB+fOh0hn/Gl+iHYibpxusIsn7bkdZ5BOJn2J0
 OY3C1biopaaxXcZ+wmnX9X0ieZ3TDsHzYOEf0zmNGzMZaZkV8kQt4/Ykv77xz6Gc
 4Bl6JI5QZ4rTZvlHYGMYxhy3hKuB31mO2rHbei7eR7J7UmjzWcl5P6HYfCgj7wzH
 crIWj1IR1Nx6Dt/wXf3HlRcEiAEJ2D0M3KIFjAVT239TsxacBfDrRk+YedF2bKbn
 WMYfVM6jJnPOykGg/gMRlttS/o/7TqHBl3y/900Idiijcm3cRPbQ+uKfkpHXftN/
 2vOtsw7pzEg7QQI9GVrb4drTrLvYJ7GQOi4o0twXTCshlXUk2V684jvHt0emFkdX
 ew9Zft4YLAYSmuJ3XqGhhMP63FsHKMlB1aSTKKPeswdIJmrdO80=
 =QIut
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC driver updates from Arnd Bergmann:
 "The highlights for the driver support this time are

   - Qualcomm platforms gain support for the Qualcomm Secure Execution
     Environment firmware interface to access EFI variables on certain
     devices, and new features for multiple platform and firmware
     drivers.

   - Arm FF-A firmware support gains support for v1.1 specification
     features, in particular notification and memory transaction
     descriptor changes.

   - SCMI firmware support now support v3.2 features for clock and DVFS
     configuration and a new transport for Qualcomm platforms.

   - Minor cleanups and bugfixes are added to pretty much all the active
     platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive,
     amlogic, atmel, tegra, aspeed, vexpress, mediatek, samsung and
     more.

     In particular, this contains portions of the treewide conversion to
     use __counted_by annotations and the device_get_match_data helper"

* tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (156 commits)
  soc: qcom: pmic_glink_altmode: Print return value on error
  firmware: qcom: scm: remove unneeded 'extern' specifiers
  firmware: qcom: scm: add a missing forward declaration for struct device
  firmware: qcom: move Qualcomm code into its own directory
  soc: samsung: exynos-chipid: Convert to platform remove callback returning void
  soc: qcom: apr: Add __counted_by for struct apr_rx_buf and use struct_size()
  soc: qcom: pmic_glink: fix connector type to be DisplayPort
  soc: ti: k3-socinfo: Avoid overriding return value
  soc: ti: k3-socinfo: Fix typo in bitfield documentation
  soc: ti: knav_qmss_queue: Use device_get_match_data()
  firmware: ti_sci: Use device_get_match_data()
  firmware: qcom: qseecom: add missing include guards
  soc/pxa: ssp: Convert to platform remove callback returning void
  soc/mediatek: mtk-mmsys: Convert to platform remove callback returning void
  soc/mediatek: mtk-devapc: Convert to platform remove callback returning void
  soc/loongson: loongson2_guts: Convert to platform remove callback returning void
  soc/litex: litex_soc_ctrl: Convert to platform remove callback returning void
  soc/ixp4xx: ixp4xx-qmgr: Convert to platform remove callback returning void
  soc/ixp4xx: ixp4xx-npe: Convert to platform remove callback returning void
  soc/hisilicon: kunpeng_hccs: Convert to platform remove callback returning void
  ...
2023-11-01 14:46:51 -10:00
Alexey Dobriyan 72fcce70fa vsprintf: uninline simple_strntoull(), reorder arguments
* uninline simple_strntoull(),
  gcc overinlines and this function is not performance critical

* reorder arguments, so that appending INT_MAX as 4th argument
  generates very efficient tail call

Space savings:

	add/remove: 1/0 grow/shrink: 0/3 up/down: 27/-179 (-152)
	Function                            old     new   delta
	simple_strntoll                       -      27     +27
	simple_strtoull                      15      10      -5
	simple_strtoll                       41       7     -34
	vsscanf                            1930    1790    -140

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/all/82a2af6e-9b6c-4a09-89d7-ca90cc1cdad1@p183/
2023-11-01 15:55:39 +01:00
Linus Torvalds 89ed67ef12 Networking changes for 6.7.
Core & protocols
 ----------------
 
  - Support usec resolution of TCP timestamps, enabled selectively by
    a route attribute.
 
  - Defer regular TCP ACK while processing socket backlog, try to send
    a cumulative ACK at the end. Increase single TCP flow performance
    on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
 
  - The Fair Queuing (FQ) packet scheduler:
    - add built-in 3 band prio / WRR scheduling
    - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
    - improve inactive flow reporting
    - optimize the layout of structures for better cache locality
 
  - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
    replacement for the old MD5 option.
 
  - Add more retransmission timeout (RTO) related statistics to TCP_INFO.
 
  - Support sending fragmented skbs over vsock sockets.
 
  - Make sure we send SIGPIPE for vsock sockets if socket was shutdown().
 
  - Add sysctl for ignoring lower limit on lifetime in Router
    Advertisement PIO, based on an in-progress IETF draft.
 
  - Add sysctl to control activation of TCP ping-pong mode.
 
  - Add sysctl to make connection timeout in MPTCP configurable.
 
  - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
    limit the number of wakeups.
 
  - Support netlink GET for MDB (multicast forwarding), allowing user
    space to request a single MDB entry instead of dumping the entire
    table.
 
  - Support selective FDB flushing in the VXLAN tunnel driver.
 
  - Allow limiting learned FDB entries in bridges, prevent OOM attacks.
 
  - Allow controlling via configfs netconsole targets which were created
    via the kernel cmdline at boot, rather than via configfs at runtime.
 
  - Support multiple PTP timestamp event queue readers with different
    filters.
 
  - MCTP over I3C.
 
 BPF
 ---
 
  - Add new veth-like netdevice where BPF program defines the logic
    of the xmit routine. It can operate in L3 and L2 mode.
 
  - Support exceptions - allow asserting conditions which should
    never be true but are hard for the verifier to infer.
    With some extra flexibility around handling of the exit / failure.
    https://lwn.net/Articles/938435/
 
  - Add support for local per-cpu kptr, allow allocating and storing
    per-cpu objects in maps. Access to those objects operates on
    the value for the current CPU. This allows to deprecate local
    one-off implementations of per-CPU storage like
    BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
 
  - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
    for systemd to re-implement the LogNamespace feature which allows
    running multiple instances of systemd-journald to process the logs
    of different services.
 
  - Enable open-coded task_vma iteration, after maple tree conversion
    made it hard to directly walk VMAs in tracing programs.
 
  - Add open-coded task, css_task and css iterator support.
    One of the use cases is customizable OOM victim selection via BPF.
 
  - Allow source address selection with bpf_*_fib_lookup().
 
  - Add ability to pin BPF timer to the current CPU.
 
  - Prevent creation of infinite loops by combining tail calls and
    fentry/fexit programs.
 
  - Add missed stats for kprobes to retrieve the number of missed kprobe
    executions and subsequent executions of BPF programs.
 
  - Inherit system settings for CPU security mitigations.
 
  - Add BPF v4 CPU instruction support for arm32 and s390x.
 
 Changes to common code
 ----------------------
 
  - overflow: add DEFINE_FLEX() for on-stack definition of structs
    with flexible array members.
 
  - Process doc update with more guidance for reviewers.
 
 Driver API
 ----------
 
  - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
    mutex in most places and remove a lot of smaller locks.
 
  - Create a common DPLL configuration API. Allow configuring
    and querying state of PLL circuits used for clock syntonization,
    in network time distribution.
 
  - Unify fragmented and full page allocation APIs in page pool code.
    Let drivers be ignorant of PAGE_SIZE.
 
  - Rework PHY state machine to avoid races with calls to phy_stop().
 
  - Notify DSA drivers of MAC address changes on user ports, improve
    correctness of offloads which depend on matching port MAC addresses.
 
  - Allow antenna control on injected WiFi frames.
 
  - Reduce the number of variants of napi_schedule().
 
  - Simplify error handling when composing devlink health messages.
 
 Misc
 ----
 
  - A lot of KCSAN data race "fixes", from Eric.
 
  - A lot of __counted_by() annotations, from Kees.
 
  - A lot of strncpy -> strscpy and printf format fixes.
 
  - Replace master/slave terminology with conduit/user in DSA drivers.
 
  - Handful of KUnit tests for netdev and WiFi core.
 
 Removed
 -------
 
  - AppleTalk COPS.
 
  - AppleTalk ipddp.
 
  - TI AR7 CPMAC Ethernet driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - add a driver for the Intel E2000 IPUs
      - make CRC/FCS stripping configurable
      - cross-timestamping for E823 devices
      - basic support for E830 devices
      - use aux-bus for managing client drivers
      - i40e: report firmware versions via devlink
    - nVidia/Mellanox:
      - support 4-port NICs
      - increase max number of channels to 256
      - optimize / parallelize SF creation flow
    - Broadcom (bnxt):
      - enhance NIC temperature reporting
      - support PAM4 speeds and lane configuration
    - Marvell OcteonTX2:
      - PTP pulse-per-second output support
      - enable hardware timestamping for VFs
    - Solarflare/AMD:
      - conntrack NAT offload and offload for tunnels
    - Wangxun (ngbe/txgbe):
      - expose HW statistics
    - Pensando/AMD:
      - support PCI level reset
      - narrow down the condition under which skbs are linearized
    - Netronome/Corigine (nfp):
      - support CHACHA20-POLY1305 crypto in IPsec offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Synopsys (stmmac):
      - add Loongson-1 SoC support
      - enable use of HW queues with no offload capabilities
      - enable PPS input support on all 5 channels
      - increase TX coalesce timer to 5ms
    - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
    - xen: support SW packet timestamping
    - add drivers for implementations based on TI's PRUSS (AM64x EVM)
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - avoid poor HW resource use on Spectrum-4 by better block selection
      for IPv6 multicast forwarding and ordering of blocks in ACL region
 
  - Ethernet embedded switches:
    - Microchip:
      - support configuring the drive strength for EMI compliance
      - ksz9477: partial ACL support
      - ksz9477: HSR offload
      - ksz9477: Wake on LAN
    - Realtek:
      - rtl8366rb: respect device tree config of the CPU port
 
  - Ethernet PHYs:
    - support Broadcom BCM5221 PHYs
    - TI dp83867: support hardware LED blinking
 
  - CAN:
    - add support for Linux-PHY based CAN transceivers
    - at91_can: clean up and use rx-offload helpers
 
  - WiFi:
    - MediaTek (mt76):
      - new sub-driver for mt7925 USB/PCIe devices
      - HW wireless <> Ethernet bridging in MT7988 chips
      - mt7603/mt7628 stability improvements
    - Qualcomm (ath12k):
      - WCN7850:
        - enable 320 MHz channels in 6 GHz band
        - hardware rfkill support
        - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS
          to make scan faster
        - read board data variant name from SMBIOS
      - QCN9274: mesh support
    - RealTek (rtw89):
      - TDMA-based multi-channel concurrency (MCC)
    - Silicon Labs (wfx):
      - Remain-On-Channel (ROC) support
 
  - Bluetooth:
    - ISO: many improvements for broadcast support
    - mark BCM4378/BCM4387 as BROKEN_LE_CODED
    - add support for QCA2066
    - btmtksdio: enable Bluetooth wakeup from suspend
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmU8XsYACgkQMUZtbf5S
 Irv19RAAnud/24OOF5XMEJkIcYlnfqximh4XO6PujRSYkSkOUJdZTF6iJPgf3pSP
 YpwoHYbYKHYfeOf8+3bTNESiQNSnoVmvmvwiS6/7lZ3behHUrGLQzW9Htc3EZyWH
 2h6QkDZ5OOjfg0bwYSfp3vXkmMH2k8WE9Y0NvCkhcohqZi13Rmp14RnyPmNb2d1V
 yZRYDMSM133KqE6gnBr1Ct65IEvnKeGlCUN2mTGqOJgdn6DZMsyxvtt0y4rmN7Ab
 41+CgPU5SfxfbYpW+Dl2HJpgfte3WrC57KC6AM0PAPJzPmQWgeB/m9mjz/apj6Bg
 bhsEIo7FdvbCnQm3yWPhK2OgCAcSwLr8jfGMU+Q+W4VnL5SRRR3Rm0zjsze+kHNP
 OfqJgxzl3DpvoJqVBy1h5FGcZt0XHwhksm4cTxWqIahsF+veY0ECBXbuBBQx9XTF
 Y7INfI8ulg7wISJs+CJfIClYkgOibTw2u8taBS5ikbtgxNqp5D4QqODn7UefQap1
 PR/IDYODF+zRgmMJLeBqSa6fij6BkfOEDiOWak5kggBoZdtbtmeKI6tzze06CNdW
 lWv1WEhRufxnwK+IuWsEkjhiMbs2WGLvkJ5JbgQV9BfqHfIfiqBCrcWtT/WbQnGt
 lmU46CXh1t/FZEqbmK9h+8vsIIfrcDl6jb5npEiKPRG00vDKRTM=
 =46nS
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support usec resolution of TCP timestamps, enabled selectively by a
     route attribute.

   - Defer regular TCP ACK while processing socket backlog, try to send
     a cumulative ACK at the end. Increase single TCP flow performance
     on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).

   - The Fair Queuing (FQ) packet scheduler:
       - add built-in 3 band prio / WRR scheduling
       - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
       - improve inactive flow reporting
       - optimize the layout of structures for better cache locality

   - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
     replacement for the old MD5 option.

   - Add more retransmission timeout (RTO) related statistics to
     TCP_INFO.

   - Support sending fragmented skbs over vsock sockets.

   - Make sure we send SIGPIPE for vsock sockets if socket was
     shutdown().

   - Add sysctl for ignoring lower limit on lifetime in Router
     Advertisement PIO, based on an in-progress IETF draft.

   - Add sysctl to control activation of TCP ping-pong mode.

   - Add sysctl to make connection timeout in MPTCP configurable.

   - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
     limit the number of wakeups.

   - Support netlink GET for MDB (multicast forwarding), allowing user
     space to request a single MDB entry instead of dumping the entire
     table.

   - Support selective FDB flushing in the VXLAN tunnel driver.

   - Allow limiting learned FDB entries in bridges, prevent OOM attacks.

   - Allow controlling via configfs netconsole targets which were
     created via the kernel cmdline at boot, rather than via configfs at
     runtime.

   - Support multiple PTP timestamp event queue readers with different
     filters.

   - MCTP over I3C.

  BPF:

   - Add new veth-like netdevice where BPF program defines the logic of
     the xmit routine. It can operate in L3 and L2 mode.

   - Support exceptions - allow asserting conditions which should never
     be true but are hard for the verifier to infer. With some extra
     flexibility around handling of the exit / failure:

          https://lwn.net/Articles/938435/

   - Add support for local per-cpu kptr, allow allocating and storing
     per-cpu objects in maps. Access to those objects operates on the
     value for the current CPU.

     This allows to deprecate local one-off implementations of per-CPU
     storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.

   - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
     for systemd to re-implement the LogNamespace feature which allows
     running multiple instances of systemd-journald to process the logs
     of different services.

   - Enable open-coded task_vma iteration, after maple tree conversion
     made it hard to directly walk VMAs in tracing programs.

   - Add open-coded task, css_task and css iterator support. One of the
     use cases is customizable OOM victim selection via BPF.

   - Allow source address selection with bpf_*_fib_lookup().

   - Add ability to pin BPF timer to the current CPU.

   - Prevent creation of infinite loops by combining tail calls and
     fentry/fexit programs.

   - Add missed stats for kprobes to retrieve the number of missed
     kprobe executions and subsequent executions of BPF programs.

   - Inherit system settings for CPU security mitigations.

   - Add BPF v4 CPU instruction support for arm32 and s390x.

  Changes to common code:

   - overflow: add DEFINE_FLEX() for on-stack definition of structs with
     flexible array members.

   - Process doc update with more guidance for reviewers.

  Driver API:

   - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
     mutex in most places and remove a lot of smaller locks.

   - Create a common DPLL configuration API. Allow configuring and
     querying state of PLL circuits used for clock syntonization, in
     network time distribution.

   - Unify fragmented and full page allocation APIs in page pool code.
     Let drivers be ignorant of PAGE_SIZE.

   - Rework PHY state machine to avoid races with calls to phy_stop().

   - Notify DSA drivers of MAC address changes on user ports, improve
     correctness of offloads which depend on matching port MAC
     addresses.

   - Allow antenna control on injected WiFi frames.

   - Reduce the number of variants of napi_schedule().

   - Simplify error handling when composing devlink health messages.

  Misc:

   - A lot of KCSAN data race "fixes", from Eric.

   - A lot of __counted_by() annotations, from Kees.

   - A lot of strncpy -> strscpy and printf format fixes.

   - Replace master/slave terminology with conduit/user in DSA drivers.

   - Handful of KUnit tests for netdev and WiFi core.

  Removed:

   - AppleTalk COPS.

   - AppleTalk ipddp.

   - TI AR7 CPMAC Ethernet driver.

  Drivers:

   - Ethernet high-speed NICs:
      - Intel (100G, ice, idpf):
         - add a driver for the Intel E2000 IPUs
         - make CRC/FCS stripping configurable
         - cross-timestamping for E823 devices
         - basic support for E830 devices
         - use aux-bus for managing client drivers
         - i40e: report firmware versions via devlink
      - nVidia/Mellanox:
         - support 4-port NICs
         - increase max number of channels to 256
         - optimize / parallelize SF creation flow
      - Broadcom (bnxt):
         - enhance NIC temperature reporting
         - support PAM4 speeds and lane configuration
      - Marvell OcteonTX2:
         - PTP pulse-per-second output support
         - enable hardware timestamping for VFs
      - Solarflare/AMD:
         - conntrack NAT offload and offload for tunnels
      - Wangxun (ngbe/txgbe):
         - expose HW statistics
      - Pensando/AMD:
         - support PCI level reset
         - narrow down the condition under which skbs are linearized
      - Netronome/Corigine (nfp):
         - support CHACHA20-POLY1305 crypto in IPsec offload

   - Ethernet NICs embedded, slower, virtual:
      - Synopsys (stmmac):
         - add Loongson-1 SoC support
         - enable use of HW queues with no offload capabilities
         - enable PPS input support on all 5 channels
         - increase TX coalesce timer to 5ms
      - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
      - xen: support SW packet timestamping
      - add drivers for implementations based on TI's PRUSS (AM64x EVM)

   - nVidia/Mellanox Ethernet datacenter switches:
      - avoid poor HW resource use on Spectrum-4 by better block
        selection for IPv6 multicast forwarding and ordering of blocks
        in ACL region

   - Ethernet embedded switches:
      - Microchip:
         - support configuring the drive strength for EMI compliance
         - ksz9477: partial ACL support
         - ksz9477: HSR offload
         - ksz9477: Wake on LAN
      - Realtek:
         - rtl8366rb: respect device tree config of the CPU port

   - Ethernet PHYs:
      - support Broadcom BCM5221 PHYs
      - TI dp83867: support hardware LED blinking

   - CAN:
      - add support for Linux-PHY based CAN transceivers
      - at91_can: clean up and use rx-offload helpers

   - WiFi:
      - MediaTek (mt76):
         - new sub-driver for mt7925 USB/PCIe devices
         - HW wireless <> Ethernet bridging in MT7988 chips
         - mt7603/mt7628 stability improvements
      - Qualcomm (ath12k):
         - WCN7850:
            - enable 320 MHz channels in 6 GHz band
            - hardware rfkill support
            - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
              make scan faster
            - read board data variant name from SMBIOS
        - QCN9274: mesh support
      - RealTek (rtw89):
         - TDMA-based multi-channel concurrency (MCC)
      - Silicon Labs (wfx):
         - Remain-On-Channel (ROC) support

   - Bluetooth:
      - ISO: many improvements for broadcast support
      - mark BCM4378/BCM4387 as BROKEN_LE_CODED
      - add support for QCA2066
      - btmtksdio: enable Bluetooth wakeup from suspend"

* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
  net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
  net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
  net: mana: Use xdp_set_features_flag instead of direct assignment
  vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
  iavf: delete the iavf client interface
  iavf: add a common function for undoing the interrupt scheme
  iavf: use unregister_netdev
  iavf: rely on netdev's own registered state
  iavf: fix the waiting time for initial reset
  iavf: in iavf_down, don't queue watchdog_task if comms failed
  iavf: simplify mutex_trylock+sleep loops
  iavf: fix comments about old bit locks
  doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
  tools: ynl: introduce option to process unknown attributes or types
  ipvlan: properly track tx_errors
  netdevsim: Block until all devices are released
  nfp: using napi_build_skb() to replace build_skb()
  net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
  net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
  net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
  ...
2023-10-31 05:10:11 -10:00
Linus Torvalds befaa609f4 hardening updates for v6.7-rc1
- Add LKDTM test for stuck CPUs (Mark Rutland)
 
 - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
 
 - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva)
 
 - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh)
 
 - Convert group_info.usage to refcount_t (Elena Reshetova)
 
 - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
 
 - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn)
 
 - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook)
 
 - Fix strtomem() compile-time check for small sources (Kees Cook)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/3cUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsEoEACBGPSiOmfSWdH3TOnIG270PD24
 jGjg8KFv7RC/JTOdYmpLl0okdlGT9LvjN/ToSSDEw3PIayxoXUdhkbYy0MYtiV3m
 yz2ozDTzJuplQX/W2fPE+nXSzIwHao2zjPPFjHnT7lt8IIjhgjiOtLfZ2gGUkW99
 Mdu2aWh3u0r4tC8OS23++yN5ibRc5l72efsjDWjZ0aPXnxE1bjmLMiIPiizpndIf
 beasPuDBs98sJVYouemCwnsPXuXOPz3Q1Cpo/fTd+TMTJCLSemCQZCTuOBU0acI/
 ZjLCgCaJU1yIYKBMtrIN4G9kITZniXX3/Nm4o6NQMVlcCqMeNaHuflomqWoqWfhE
 UPbRo2eghZOaMNiCKLLvZDIqPrh1IcsiEl6Ef3W4hICc42GTK96IuGisIvDXwQ4N
 /SzTOupJuN42noh3z1M3XuZy5RoXJ99IYDNY5CTKf9IdqvA0bbGkU3nb1gZH/xw9
 BjTqKzR/7K1kTXuSgagDZ1Wceej9pZxhX7E3IHYsP8ZOvKug3EeL4yybVwQ3HRfq
 Qnzcp/qPB9cOkLSQXveRTFTsj2mX28Gixct/iDuc1jIYwGQlY1gI6dcUcqby6ptM
 BrQti7eR2NH2+T3aE2UVCIWsZVhx7NaSF+z8JxfAuu56jicc4xJVsi8zrNveWX5M
 m2VXyBl3121BVtKi4w==
 =0iVF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One of the more voluminous set of changes is for adding the new
  __counted_by annotation[1] to gain run-time bounds checking of
  dynamically sized arrays with UBSan.

   - Add LKDTM test for stuck CPUs (Mark Rutland)

   - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)

   - Refactor more 1-element arrays into flexible arrays (Gustavo A. R.
     Silva)

   - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem
     Shaikh)

   - Convert group_info.usage to refcount_t (Elena Reshetova)

   - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)

   - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas
     Bulwahn)

   - Fix randstruct GCC plugin performance mode to stay in groups (Kees
     Cook)

   - Fix strtomem() compile-time check for small sources (Kees Cook)"

* tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits)
  hwmon: (acpi_power_meter) replace open-coded kmemdup_nul
  reset: Annotate struct reset_control_array with __counted_by
  kexec: Annotate struct crash_mem with __counted_by
  virtio_console: Annotate struct port_buffer with __counted_by
  ima: Add __counted_by for struct modsig and use struct_size()
  MAINTAINERS: Include stackleak paths in hardening entry
  string: Adjust strtomem() logic to allow for smaller sources
  hardening: x86: drop reference to removed config AMD_IOMMU_V2
  randstruct: Fix gcc-plugin performance mode to stay in group
  mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by
  drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
  irqchip/imx-intmux: Annotate struct intmux_data with __counted_by
  KVM: Annotate struct kvm_irq_routing_table with __counted_by
  virt: acrn: Annotate struct vm_memory_region_batch with __counted_by
  hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by
  sparc: Annotate struct cpuinfo_tree with __counted_by
  isdn: kcapi: replace deprecated strncpy with strscpy_pad
  isdn: replace deprecated strncpy with strscpy
  NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by
  nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by
  ...
2023-10-30 19:09:55 -10:00
Kent Overstreet ee526b88ca closures: Fix race in closure_sync()
As pointed out by Linus, closure_sync() was racy; we could skip blocking
immediately after a get() and a put(), but then that would skip any
barrier corresponding to the other thread's put() barrier.

To fix this, always do the full __closure_sync() sequence whenever any
get() has happened and the closure might have been used by other
threads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Kent Overstreet 2bce6368c4 closures: Better memory barriers
atomic_(dec|sub)_return_release() are a thing now - use them.

Also, delete the useless barrier in set_closure_fn(): it's redundant
with the memory barrier in closure_put(0.

Since closure_put() would now otherwise just have a release barrier, we
also need a new barrier when the ref hits 0 -
smp_acquire__after_ctrl_dep().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Linus Torvalds 63ce50fff9 Scheduler changes for v6.7 are:
- Fair scheduler (SCHED_OTHER) improvements:
 
     - Remove the old and now unused SIS_PROP code & option
     - Scan cluster before LLC in the wake-up path
     - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
 
  - NUMA scheduling improvements:
 
     - Improve the VMA access-PID code to better skip/scan VMAs
     - Extend tracing to cover VMA-skipping decisions
     - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
     - Generalize numa_map_to_online_node()
 
  - Energy scheduling improvements:
 
     - Remove the EM_MAX_COMPLEXITY limit
     - Add tracepoints to track energy computation
     - Make the behavior of the 'sched_energy_aware' sysctl more consistent
     - Consolidate and clean up access to a CPU's max compute capacity
     - Fix uclamp code corner cases
 
  - RT scheduling improvements:
 
     - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
     - Drive the ->rto_mask with rt_rq->pushable_tasks updates
 
  - Scheduler scalability improvements:
 
     - Rate-limit updates to tg->load_avg
     - On x86 disable IBRS when CPU is offline to improve single-threaded performance
     - Micro-optimize in_task() and in_interrupt()
     - Micro-optimize the PSI code
     - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
 
  - Core scheduler infrastructure improvements:
 
     - Use saved_state to reduce some spurious freezer wakeups
     - Bring in a handful of fast-headers improvements to scheduler headers
     - Make the scheduler UAPI headers more widely usable by user-space
     - Simplify the control flow of scheduler syscalls by using lock guards
     - Fix sched_setaffinity() vs. CPU hotplug race
 
  - Scheduler debuggability improvements:
     - Disallow writing invalid values to sched_rt_period_us
     - Fix a race in the rq-clock debugging code triggering warnings
     - Fix a warning in the bandwidth distribution code
     - Micro-optimize in_atomic_preempt_off() checks
     - Enforce that the tasklist_lock is held in for_each_thread()
     - Print the TGID in sched_show_task()
     - Remove the /proc/sys/kernel/sched_child_runs_first sysctl
 
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU8/NoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gN+xAAvKGYNZBCBG4jowxccgqAbCx81KOhhsy/
 KUaOmdLPg9WaXuqjZ5sggXQCMT0wUqBYAmqV7ts53VhWcma2I1ap4dCM6Jj+RLrc
 vNwkeNetsikiZtarMoCJs5NahL8ULh3liBaoAkkToPjQ5r43aZ/eKwDovEdIKc+g
 +Vgn7jUY8ssIrAOKT1midSwY1y8kAU2AzWOSFDTgedkJP4PgOu9/lBl9jSJ2sYaX
 N4XqONYPXTwOHUtvmzkYILxLz0k0GgJ7hmt78E8Xy2rC4taGCRwCfCMBYxREuwiP
 huo3O1P/iIe5svm4/EBUvcpvf44eAWTV+CD0dnJPwOc9IvFhpSzqSZZAsyy/JQKt
 Lnzmc/xmyc1PnXCYJfHuXrw2/m+MyUHaegPzh5iLJFrlqa79GavOElj0jNTAMzbZ
 39fybzPtuFP+64faRfu0BBlQZfORPBNc/oWMpPKqgP58YGuveKTWaUF5rl5lM7Ne
 nm07uOmq02JVR8YzPl/FcfhU2dPMawWuMwUjEr2eU+lAunY3PF88vu0FALj7iOBd
 66F8qrtpDHJanOxrdEUwSJ7hgw79qY1iw66Db7cQYjMazFKZONxArQPqFUZ0ngLI
 n9hVa7brg1bAQKrQflqjcIAIbpVu3SjPEl15cKpAJTB/gn5H66TQgw8uQ6HfG+h2
 GtOsn1nlvuk=
 =GDqb
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Fair scheduler (SCHED_OTHER) improvements:
   - Remove the old and now unused SIS_PROP code & option
   - Scan cluster before LLC in the wake-up path
   - Use candidate prev/recent_used CPU if scanning failed for cluster
     wakeup

  NUMA scheduling improvements:
   - Improve the VMA access-PID code to better skip/scan VMAs
   - Extend tracing to cover VMA-skipping decisions
   - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
   - Generalize numa_map_to_online_node()

  Energy scheduling improvements:
   - Remove the EM_MAX_COMPLEXITY limit
   - Add tracepoints to track energy computation
   - Make the behavior of the 'sched_energy_aware' sysctl more
     consistent
   - Consolidate and clean up access to a CPU's max compute capacity
   - Fix uclamp code corner cases

  RT scheduling improvements:
   - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
   - Drive the ->rto_mask with rt_rq->pushable_tasks updates

  Scheduler scalability improvements:
   - Rate-limit updates to tg->load_avg
   - On x86 disable IBRS when CPU is offline to improve single-threaded
     performance
   - Micro-optimize in_task() and in_interrupt()
   - Micro-optimize the PSI code
   - Avoid updating PSI triggers and ->rtpoll_total when there are no
     state changes

  Core scheduler infrastructure improvements:
   - Use saved_state to reduce some spurious freezer wakeups
   - Bring in a handful of fast-headers improvements to scheduler
     headers
   - Make the scheduler UAPI headers more widely usable by user-space
   - Simplify the control flow of scheduler syscalls by using lock
     guards
   - Fix sched_setaffinity() vs. CPU hotplug race

  Scheduler debuggability improvements:
   - Disallow writing invalid values to sched_rt_period_us
   - Fix a race in the rq-clock debugging code triggering warnings
   - Fix a warning in the bandwidth distribution code
   - Micro-optimize in_atomic_preempt_off() checks
   - Enforce that the tasklist_lock is held in for_each_thread()
   - Print the TGID in sched_show_task()
   - Remove the /proc/sys/kernel/sched_child_runs_first sysctl

  ... and misc cleanups & fixes"

* tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  sched/fair: Remove SIS_PROP
  sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
  sched/fair: Scan cluster before scanning LLC in wake-up path
  sched: Add cpus_share_resources API
  sched/core: Fix RQCF_ACT_SKIP leak
  sched/fair: Remove unused 'curr' argument from pick_next_entity()
  sched/nohz: Update comments about NEWILB_KICK
  sched/fair: Remove duplicate #include
  sched/psi: Update poll => rtpoll in relevant comments
  sched: Make PELT acronym definition searchable
  sched: Fix stop_one_cpu_nowait() vs hotplug
  sched/psi: Bail out early from irq time accounting
  sched/topology: Rename 'DIE' domain to 'PKG'
  sched/psi: Delete the 'update_total' function parameter from update_triggers()
  sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
  sched/headers: Remove comment referring to rq::cpu_load, since this has been removed
  sched/numa: Complete scanning of inactive VMAs when there is no alternative
  sched/numa: Complete scanning of partial VMAs regardless of PID activity
  sched/numa: Move up the access pid reset logic
  sched/numa: Trace decisions related to skipping VMAs
  ...
2023-10-30 13:12:15 -10:00
Linus Torvalds 3cf3fabccb Locking changes in this cycle are:
- Futex improvements:
 
     - Add the 'futex2' syscall ABI, which is an attempt to get away from the
       multiplex syscall and adds a little room for extentions, while lifting
       some limitations.
 
     - Fix futex PI recursive rt_mutex waiter state bug
 
     - Fix inter-process shared futexes on no-MMU systems
 
     - Use folios instead of pages
 
  - Micro-optimizations of locking primitives:
 
     - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
       architectures, to improve lockref code generation.
 
     - Improve the x86-32 lockref_get_not_zero() main loop by adding
       build-time CMPXCHG8B support detection for the relevant lockref code,
       and by better interfacing the CMPXCHG8B assembly code with the compiler.
 
     - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg()
       code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg().
 
     - Micro-optimize rcuref_put_slowpath()
 
  - Locking debuggability improvements:
 
     - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
 
     - Enforce atomicity of sched_submit_work(), which is de-facto atomic but
       was un-enforced previously.
 
     - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics
 
     - Fix ww_mutex self-tests
 
     - Clean up const-propagation in <linux/seqlock.h> and simplify
       the API-instantiation macros a bit.
 
  - RT locking improvements:
 
     - Provide the rt_mutex_*_schedule() primitives/helpers and use them
       in the rtmutex code to avoid recursion vs. rtlock on the PI state.
 
     - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock()
       and rwbase_read_lock().
 
  - Plus misc fixes & cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU877IRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g9jw/+N7rxQ78dmFCYh4UWnLCYvuKP0/ivHErG
 493JcB8MupuA2tfJHIkDdr4aM2mNq2E61w69/WlZAQWWD6pdOhwgF5Xf5eoEcJm0
 vsAhWBGLxihXdtevPuMAx0dEpg3AMp2wc6i5PkN831KdPUgCNsrKq9Bfnfef7/G8
 MQTSHjmtba6jxleyxfEa4tE2xe5PJX825nRfkX2e1cf+stkYua+uJFxVxUfxFWGE
 4pBy70D9OC7MsJ44WWOA1gwkVtMMiBTmRPNjlP8Gz2GQ0f3ERHRwYk3jDHOPHZI6
 0GNt7pE3IMXQn2UuDtfkvv9IFTd+U5qD+APnWIn2ntWXqzGLFqOlmovMrobVn7El
 olYDCyweWPG71m1Qblsb1VK2QjRPQVJ9NAEg8RlDHIu2ThxHbMysDVGPVOYnPFq4
 S8QFpmldzbNoPU4rDJyT1fAmoUIrusBHkl+Us3yGfC74iM+fHnDEvaSoMZbzEdY1
 x/Nocj9XgKEgfXdYzrCWFmZ9xXqHkO25/wDL6yKqBdQtvaEalXuHTT6mQcYxrUPm
 Xx1BPan2Jg7p4u2oOFcVtKewUtRH9KBx8qytr5S+JK4PJbrBsixMnr84HLd/3X2V
 ykYkO+367T5MTYv4TnJDE5vdurzUqekKSCFPY3skPujPJfdLj1vsPzYf9iMkCLdo
 hU2f/R+Wpdk=
 =36Ff
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Info Molnar:
 "Futex improvements:

   - Add the 'futex2' syscall ABI, which is an attempt to get away from
     the multiplex syscall and adds a little room for extentions, while
     lifting some limitations.

   - Fix futex PI recursive rt_mutex waiter state bug

   - Fix inter-process shared futexes on no-MMU systems

   - Use folios instead of pages

  Micro-optimizations of locking primitives:

   - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
     architectures, to improve lockref code generation

   - Improve the x86-32 lockref_get_not_zero() main loop by adding
     build-time CMPXCHG8B support detection for the relevant lockref
     code, and by better interfacing the CMPXCHG8B assembly code with
     the compiler

   - Introduce arch_sync_try_cmpxchg() on x86 to improve
     sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
     users to sync_try_cmpxchg().

   - Micro-optimize rcuref_put_slowpath()

  Locking debuggability improvements:

   - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well

   - Enforce atomicity of sched_submit_work(), which is de-facto atomic
     but was un-enforced previously.

   - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
     semantics

   - Fix ww_mutex self-tests

   - Clean up const-propagation in <linux/seqlock.h> and simplify the
     API-instantiation macros a bit

  RT locking improvements:

   - Provide the rt_mutex_*_schedule() primitives/helpers and use them
     in the rtmutex code to avoid recursion vs. rtlock on the PI state.

   - Add nested blocking lockdep asserts to rt_mutex_lock(),
     rtlock_lock() and rwbase_read_lock()

  .. plus misc fixes & cleanups"

* tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  futex: Don't include process MM in futex key on no-MMU
  locking/seqlock: Fix grammar in comment
  alpha: Fix up new futex syscall numbers
  locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
  locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
  locking/seqlock: Change __seqprop() to return the function pointer
  locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
  locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
  locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
  locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
  locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
  locking/seqlock: Fix typo in comment
  futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
  locking/local, arch: Rewrite local_add_unless() as a static inline function
  locking/debug: Fix debugfs API return value checks to use IS_ERR()
  locking/ww_mutex/test: Make sure we bail out instead of livelock
  locking/ww_mutex/test: Fix potential workqueue corruption
  locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
  futex: Add sys_futex_requeue()
  futex: Add flags2 argument to futex_requeue()
  ...
2023-10-30 12:38:48 -10:00
Linus Torvalds 9e87705289 Initial bcachefs pull request for 6.7-rc1
Here's the bcachefs filesystem pull request.
 
 One new patch since last week: the exportfs constants ended up
 conflicting with other filesystems that are also getting added to the
 global enum, so switched to new constants picked by Amir.
 
 I'll also be sending another pull request later on in the cycle bringing
 things up to date my master branch that people are currently running;
 that will be restricted to fs/bcachefs/, naturally.
 
 Testing - fstests as well as the bcachefs specific tests in ktest:
   https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 
 It's also been soaking in linux-next, which resulted in a whole bunch of
 smatch complaints and fixes and a patch or two from Kees.
 
 The only new non fs/bcachefs/ patch is the objtool patch that adds
 bcachefs functions to the list of noreturns. The patch that exports
 osq_lock() has been dropped for now, per Ingo.
 
 Prereq patch list:
 
 faf1dce852 objtool: Add bcachefs noreturns
 73badee428 lib/generic-radix-tree.c: Add peek_prev()
 9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
 0fb5d567f5 MAINTAINERS: Add entry for generic-radix-tree
 b414e8ecd4 closures: Add a missing include
 48b7935722 closures: closure_nr_remaining()
 ced58fc7ab closures: closure_wait_event()
 bd0d22e41e MAINTAINERS: Add entry for closures
 8c8d2d9670 bcache: move closures to lib/
 957e48087d locking: export contention tracepoints for bcachefs six locks
 21db931445 lib: Export errname
 83feeb1955 lib/string_helpers: string_get_size() now returns characters wrote
 7d672f4094 stacktrace: Export stack_trace_save_tsk
 771eb4fe8b fs: factor out d_mark_tmpfile()
 2b69987be5 sched: Add task_struct->faults_disabled_mapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmU/wyIACgkQE6szbY3K
 bnZc1xAAqjQBGXdtgtKQvk0/ru0WaMZguMsOHd3BUXIbm30F6eJqnoXQ/ahALofc
 Ju6NrOgcy9wmdPKWpbeF+aK3WnkAW9jShDd0QieVH6PkhyYyh5r11iR/EVtjjLu5
 6Teodn8fyTqn9WSDtKG15QreTCJrEasAoGFQKQDA8oiXC7zc+RSpLUkkTWD/pxyW
 zVqkGGiAUG4x6FON+X2a3QBa9WCahIgV6XzHstGLsmOECxKO/LopGR5jThuIhv9t
 Yo0wodQTKAgb9QviG6V3f2dJLQKKUVDmVEGTXv+8Hl3d8CiYBJeIh+icp+VESBo1
 m8ev0y2xbTPLwgm5v0Uj4o/G8ISZ+qmcexV2zQ9xUWUAd2AjEBzhCh9BrNXM5qSg
 o7mphH+Pt6bJXgzxb2RkYJixU11yG3yuHPOCrRGGFpVHiNYhdHuJeDZOqChWZB8x
 6kY0uvU0X0tqVfWKxMwTwuqG8mJ5BkJNvnEvYi05QEZG0dDcUhgOqYlNNaL8vGkl
 qVixOwE4aH4kscdmW2gXY1c76VSebheyN8n6Wj1zrmTw4hTJH7ZWXPtmbRqQzpB6
 U6w3NjVyopbIjuF+syWeGqitTT/8fpvgZU4E9MpKGmHX4ADgecp6YSZQzzxTJn7D
 cbVX7YQxhmsM50C1PW7A8yLCspD/uRNiKLvzb/g9gFSInk4rV+U=
 =g+ia
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs

Pull initial bcachefs updates from Kent Overstreet:
 "Here's the bcachefs filesystem pull request.

  One new patch since last week: the exportfs constants ended up
  conflicting with other filesystems that are also getting added to the
  global enum, so switched to new constants picked by Amir.

  The only new non fs/bcachefs/ patch is the objtool patch that adds
  bcachefs functions to the list of noreturns. The patch that exports
  osq_lock() has been dropped for now, per Ingo"

* tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits)
  exportfs: Change bcachefs fid_type enum to avoid conflicts
  bcachefs: Refactor memcpy into direct assignment
  bcachefs: Fix drop_alloc_keys()
  bcachefs: snapshot_create_lock
  bcachefs: Fix snapshot skiplists during snapshot deletion
  bcachefs: bch2_sb_field_get() refactoring
  bcachefs: KEY_TYPE_error now counts towards i_sectors
  bcachefs: Fix handling of unknown bkey types
  bcachefs: Switch to unsafe_memcpy() in a few places
  bcachefs: Use struct_size()
  bcachefs: Correctly initialize new buckets on device resize
  bcachefs: Fix another smatch complaint
  bcachefs: Use strsep() in split_devs()
  bcachefs: Add iops fields to bch_member
  bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1
  bcachefs: New superblock section members_v2
  bcachefs: Add new helper to retrieve bch_member from sb
  bcachefs: bucket_lock() is now a sleepable lock
  bcachefs: fix crc32c checksum merge byte order problem
  bcachefs: Fix bch2_inode_delete_keys()
  ...
2023-10-30 11:09:38 -10:00
Linus Torvalds 8b16da681e NFSD 6.7 Release Notes
This release completes the SunRPC thread scheduler work that was
 begun in v6.6. The scheduler can now find an svc thread to wake in
 constant time and without a list walk. Thanks again to Neil Brown
 for this overhaul.
 
 Lorenzo Bianconi contributed infrastructure for a netlink-based
 NFSD control plane. The long-term plan is to provide the same
 functionality as found in /proc/fs/nfsd, plus some interesting
 additions, and then migrate the NFSD user space utilities to
 netlink.
 
 A long series to overhaul NFSD's NFSv4 operation encoding was
 applied in this release. The goals are to bring this family of
 encoding functions in line with the matching NFSv4 decoding
 functions and with the NFSv2 and NFSv3 XDR functions, preparing
 the way for better memory safety and maintainability.
 
 A further improvement to NFSD's write delegation support was
 contributed by Dai Ngo. This adds a CB_GETATTR callback,
 enabling the server to retrieve cached size and mtime data from
 clients holding write delegations. If the server can retrieve
 this information, it does not have to recall the delegation in
 some cases.
 
 The usual panoply of bug fixes and minor improvements round out
 this release. As always I am grateful to all contributors,
 reviewers, and testers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmU5IuoACgkQM2qzM29m
 f5eVsg//bVp8S93ci/oDlKfzOwH2fO5e5rna91wrDpJxkd51h6KTx55dSRG5sjAZ
 EywIVOann6xCtsixAPyff5Cweg2dWvzQRsy1ZnvWQ1qZBzD5KAJY5LPkeSFUCKBo
 Zani/qTOYbxzgFMjZx+yDSXDPKG68WYZBQK59SI7mURu4SYdk8aRyNY8mjHfr0Vh
 Aqrcny4oVtXV4sL5P5G/2FUW7WKT3olA3jSYlRRNMhbs2qpEemRCCrspOEMMad+b
 t1+ZCg+U27PMranvOJnof4RU7peZbaxDWA0gyiUbivVXVtZn9uOs0ffhktkvechL
 ePc33dqdp2ITdKIPA6JlaRv5WflKXQw0YYM9Kv5mcR4A2el7owL4f/pMlPhtbYwJ
 IOJv15KdKVN979G2e6WMYiKK+iHfaUUguhMEXnfnGoAajHOZNQiUEo3iFQAD7LDc
 DvMF8d9QqYmB9IW8FOYaRRfZGJOQHf3TL79Nd08z/bn5swvlvfj77leux9Sb+0/m
 Luk2Xvz2AJVSXE31wzabaGHkizN+BtH+e4MMbXUHBPW5jE9v7XOnEUFr4UdZyr9P
 Gl87A7NcrzNjJWT5TrnzM4sOslNsx46Aeg+VuNt2fSRn2dm6iBu2B8s0N4imx6dV
 PX1y9VSLq5WRhjrFZ1qeiZdsuTaQtrEiNDoRIQR6nCJPAV80iFk=
 =B4wJ
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release completes the SunRPC thread scheduler work that was begun
  in v6.6. The scheduler can now find an svc thread to wake in constant
  time and without a list walk. Thanks again to Neil Brown for this
  overhaul.

  Lorenzo Bianconi contributed infrastructure for a netlink-based NFSD
  control plane. The long-term plan is to provide the same functionality
  as found in /proc/fs/nfsd, plus some interesting additions, and then
  migrate the NFSD user space utilities to netlink.

  A long series to overhaul NFSD's NFSv4 operation encoding was applied
  in this release. The goals are to bring this family of encoding
  functions in line with the matching NFSv4 decoding functions and with
  the NFSv2 and NFSv3 XDR functions, preparing the way for better memory
  safety and maintainability.

  A further improvement to NFSD's write delegation support was
  contributed by Dai Ngo. This adds a CB_GETATTR callback, enabling the
  server to retrieve cached size and mtime data from clients holding
  write delegations. If the server can retrieve this information, it
  does not have to recall the delegation in some cases.

  The usual panoply of bug fixes and minor improvements round out this
  release. As always I am grateful to all contributors, reviewers, and
  testers"

* tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (127 commits)
  svcrdma: Fix tracepoint printk format
  svcrdma: Drop connection after an RDMA Read error
  NFSD: clean up alloc_init_deleg()
  NFSD: Fix frame size warning in svc_export_parse()
  NFSD: Rewrite synopsis of nfsd_percpu_counters_init()
  nfsd: Clean up errors in nfs3proc.c
  nfsd: Clean up errors in nfs4state.c
  NFSD: Clean up errors in stats.c
  NFSD: simplify error paths in nfsd_svc()
  NFSD: Clean up nfsd4_encode_seek()
  NFSD: Clean up nfsd4_encode_offset_status()
  NFSD: Clean up nfsd4_encode_copy_notify()
  NFSD: Clean up nfsd4_encode_copy()
  NFSD: Clean up nfsd4_encode_test_stateid()
  NFSD: Clean up nfsd4_encode_exchange_id()
  NFSD: Clean up nfsd4_do_encode_secinfo()
  NFSD: Clean up nfsd4_encode_access()
  NFSD: Clean up nfsd4_encode_readdir()
  NFSD: Clean up nfsd4_encode_entry4()
  NFSD: Add an nfsd4_encode_nfs_cookie4() helper
  ...
2023-10-30 10:12:29 -10:00
Linus Torvalds df9c65b5fc vfs-6.7.iov_iter
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppQwAKCRCRxhvAZXjc
 om2kAP4u+eLsrhJHfUPUttGUEkSkZE+5/s/f1A/1GcV51usLSgEAu8urxAnP49GW
 INaDABXaFfKx8/KI/H2YFZPKGwlNEwY=
 =PDDE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull iov_iter updates from Christian Brauner:
 "This contain's David's iov_iter cleanup work to convert the iov_iter
  iteration macros to inline functions:

   - Remove last_offset from iov_iter as it was only used by ITER_PIPE

   - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to
     match that on powerpc and get rid of a sparse warning

   - Convert iter->user_backed to user_backed_iter() in the sound PCM
     driver

   - Convert iter->user_backed to user_backed_iter() in a couple of
     infiniband drivers

   - Renumber the type enum so that the ITER_* constants match the order
     in iterate_and_advance*()

   - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change
     user_backed_iter() to just use the type value and get rid of the
     extra flag

   - Convert the iov_iter iteration macros to always-inline functions to
     make the code easier to follow. It uses function pointers, but they
     get optimised away

   - Move the check for ->copy_mc to _copy_from_iter() and
     copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc()
     where it gets repeated for every segment. Instead, we check once
     and invoke a side function that can use iterate_bvec() rather than
     iterate_and_advance() and supply a different step function

   - Move the copy-and-csum code to net/ where it can be in proximity
     with the code that uses it

   - Fold memcpy_and_csum() in to its two users

   - Move csum_and_copy_from_iter_full() out of line and merge in
     csum_and_copy_from_iter() since the former is the only caller of
     the latter

   - Move hash_and_copy_to_iter() to net/ where it can be with its only
     caller"

* tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter, net: Move hash_and_copy_to_iter() to net/
  iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together
  iov_iter, net: Fold in csum_and_memcpy()
  iov_iter, net: Move csum_and_copy_to/from_iter() to net/
  iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
  iov_iter: Convert iterate*() to inline funcs
  iov_iter: Derive user-backedness from the iterator type
  iov_iter: Renumber ITER_* constants
  infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC
  sound: Fix snd_pcm_readv()/writev() to use iov access functions
  iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
  iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE
2023-10-30 09:24:21 -10:00
Kees Cook dcc4e5728e seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
Solve two ergonomic issues with struct seq_buf;

1) Too much boilerplate is required to initialize:

	struct seq_buf s;
	char buf[32];

	seq_buf_init(s, buf, sizeof(buf));

Instead, we can build this directly on the stack. Provide
DECLARE_SEQ_BUF() macro to do this:

	DECLARE_SEQ_BUF(s, 32);

2) %NUL termination is fragile and requires 2 steps to get a valid
   C String (and is a layering violation exposing the "internals" of
   seq_buf):

	seq_buf_terminate(s);
	do_something(s->buffer);

Instead, we can just return s->buffer directly after terminating it in
the refactored seq_buf_terminate(), now known as seq_buf_str():

	do_something(seq_buf_str(s));

Link: https://lore.kernel.org/linux-trace-kernel/20231027155634.make.260-kees@kernel.org
Link: https://lore.kernel.org/linux-trace-kernel/20231026194033.it.702-kees@kernel.org/

Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yun Zhou <yun.zhou@windriver.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-28 16:52:43 -04:00
Dave Jiang a103f46633 acpi: Move common tables helper functions to common lib
Some of the routines in ACPI driver/acpi/tables.c can be shared with
parsing CDAT. CDAT is a device-provided data structure that is formatted
similar to a platform provided ACPI table. CDAT is used by CXL and can
exist on platforms that do not use ACPI. Split out the common routine
from ACPI to accommodate platforms that do not support ACPI and move that
to /lib. The common routines can be built outside of ACPI if
FIRMWARE_TABLES is selected.

Link: https://lore.kernel.org/linux-cxl/CAJZ5v0jipbtTNnsA0-o5ozOk8ZgWnOg34m34a9pPenTyRLj=6A@mail.gmail.com/
Suggested-by: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/169713683430.2205276.17899451119920103445.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:48:03 -07:00
Jakub Kicinski ec4c20ca09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/mac80211/rx.c
  91535613b6 ("wifi: mac80211: don't drop all unprotected public action frames")
  6c02fab724 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value")

Adjacent changes:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c
  61471264c0 ("net: ethernet: apm: Convert to platform remove callback returning void")
  d2ca43f306 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF")

net/vmw_vsock/virtio_transport.c
  64c99d2d6a ("vsock/virtio: support to send non-linear skb")
  53b08c4985 ("vsock/virtio: initialize the_virtio_vsock before using VQs")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-26 13:46:28 -07:00
wuqiang.matt 4758560fa2 kprobes: unused header files removed
As kernel test robot reported, lib/test_objpool.c (trace:probes/for-next)
has linux/version.h included, but version.h is not used at all. Then more
unused headers are found in test_objpool.c and rethook.c, and all of them
should be removed.

Link: https://lore.kernel.org/all/20231023112245.6112-1-wuqiang.matt@bytedance.com/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310191512.vvypKU5Z-lkp@intel.com/
Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-24 10:04:59 +09:00
Matthew Wilcox (Oracle) d0ed46b603 tracing: Move readpos from seq_buf to trace_seq
To make seq_buf more lightweight as a string buf, move the readpos member
from seq_buf to its container, trace_seq.  That puts the responsibility
of maintaining the readpos entirely in the tracing code.  If some future
users want to package up the readpos with a seq_buf, we can define a
new struct then.

Link: https://lore.kernel.org/linux-trace-kernel/20231020033545.2587554-2-willy@infradead.org

Cc: Kees Cook <keescook@chromium.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-20 12:16:10 -04:00
Jakub Kicinski 374d345d9b netlink: add variable-length / auto integers
We currently push everyone to use padding to align 64b values
in netlink. Un-padded nla_put_u64() doesn't even exist any more.

The story behind this possibly start with this thread:
https://lore.kernel.org/netdev/20121204.130914.1457976839967676240.davem@davemloft.net/
where DaveM was concerned about the alignment of a structure
containing 64b stats. If user space tries to access such struct
directly:

	struct some_stats *stats = nla_data(attr);
	printf("A: %llu", stats->a);

lack of alignment may become problematic for some architectures.
These days we most often put every single member in a separate
attribute, meaning that the code above would use a helper like
nla_get_u64(), which can deal with alignment internally.
Even for arches which don't have good unaligned access - access
aligned to 4B should be pretty efficient.
Kernel and well known libraries deal with unaligned input already.

Padded 64b is quite space-inefficient (64b + pad means at worst 16B
per attr vs 32b which takes 8B). It is also more typing:

    if (nla_put_u64_pad(rsp, NETDEV_A_SOMETHING_SOMETHING,
                        value, NETDEV_A_SOMETHING_PAD))

Create a new attribute type which will use 32 bits at netlink
level if value is small enough (probably most of the time?),
and (4B-aligned) 64 bits otherwise. Kernel API is just:

    if (nla_put_uint(rsp, NETDEV_A_SOMETHING_SOMETHING, value))

Calling this new type "just" sint / uint with no specific size
will hopefully also make people more comfortable with using it.
Currently telling people "don't use u8, you may need the bits,
and netlink will round up to 4B, anyway" is the #1 comment
we give to newcomers.

In terms of netlink layout it looks like this:

         0       4       8       12      16
32b:     [nlattr][ u32  ]
64b:     [  pad ][nlattr][     u64      ]
uint(32) [nlattr][ u32  ]
uint(64) [nlattr][     u64      ]

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:43:35 +01:00
Kent Overstreet 73badee428 lib/generic-radix-tree.c: Add peek_prev()
This patch adds genradix_peek_prev(), genradix_iter_rewind(), and
genradix_for_each_reverse(), for iterating backwards over a generic
radix tree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-19 14:47:33 -04:00
Kent Overstreet 9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
When we started spreading new inode numbers throughout most of the 64
bit inode space, that triggered some corner case bugs, in particular
some integer overflows related to the radix tree code. Oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-19 14:47:33 -04:00
Kent Overstreet b414e8ecd4 closures: Add a missing include
Fixes building in userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-19 14:47:33 -04:00
Kent Overstreet 8c8d2d9670 bcache: move closures to lib/
Prep work for bcachefs - being a fork of bcache it also uses closures

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2023-10-19 14:47:33 -04:00
Alexey Dobriyan 68279f9c9f treewide: mark stuff as __ro_after_init
__read_mostly predates __ro_after_init. Many variables which are marked
__read_mostly should have been __ro_after_init from day 1.

Also, mark some stuff as "const" and "__init" while I'm at it.

[akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:43:23 -07:00
Hugh Dickins 1431996bf9 percpu_counter: extend _limited_add() to negative amounts
Though tmpfs does not need it, percpu_counter_limited_add() can be twice
as useful if it works sensibly with negative amounts (subs) - typically
decrements towards a limit of 0 or nearby: as suggested by Dave Chinner.

And in the course of that reworking, skip the percpu counter sum if it is
already obvious that the limit would be passed: as suggested by Tim Chen.

Extend the comment above __percpu_counter_limited_add(), defining the
behaviour with positive and negative amounts, allowing negative limits,
but not bothering about overflow beyond S64_MAX.

Link: https://lkml.kernel.org/r/8f86083b-c452-95d4-365b-f16a2e4ebcd4@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Tim Chen <tim.c.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:14 -07:00
Hugh Dickins beb9868628 shmem,percpu_counter: add _limited_add(fbc, limit, amount)
Percpu counter's compare and add are separate functions: without locking
around them (which would defeat their purpose), it has been possible to
overflow the intended limit.  Imagine all the other CPUs fallocating tmpfs
huge pages to the limit, in between this CPU's compare and its add.

I have not seen reports of that happening; but tmpfs's recent addition of
dquot_alloc_block_nodirty() in between the compare and the add makes it
even more likely, and I'd be uncomfortable to leave it unfixed.

Introduce percpu_counter_limited_add(fbc, limit, amount) to prevent it.

I believe this implementation is correct, and slightly more efficient than
the combination of compare and add (taking the lock once rather than twice
when nearing full - the last 128MiB of a tmpfs volume on a machine with
128 CPUs and 4KiB pages); but it does beg for a better design - when
nearing full, there is no new batching, but the costly percpu counter sum
across CPUs still has to be done, while locked.

Follow __percpu_counter_sum()'s example, including cpu_dying_mask as well
as cpu_online_mask: but shouldn't __percpu_counter_compare() and
__percpu_counter_limited_add() then be adding a num_dying_cpus() to
num_online_cpus(), when they calculate the maximum which could be held
across CPUs?  But the times when it matters would be vanishingly rare.

Link: https://lkml.kernel.org/r/bb817848-2d19-bcc8-39ca-ea179af0f0b4@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:14 -07:00
Liam R. Howlett 099d7439ce maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
Users complained about OOM errors during fork without triggering
compaction.  This can be fixed by modifying the flags used in
mas_expected_entries() so that the compaction will be triggered in low
memory situations.  Since mas_expected_entries() is only used during fork,
the extra argument does not need to be passed through.

Additionally, the two test_maple_tree test cases and one benchmark test
were altered to use the correct locking type so that allocations would not
trigger sleeping and thus fail.  Testing was completed with lockdep atomic
sleep detection.

The additional locking change requires rwsem support additions to the
tools/ directory through the use of pthreads pthread_rwlock_t.  With this
change test_maple_tree works in userspace, as a module, and in-kernel.

Users may notice that the system gave up early on attempting to start new
processes instead of attempting to reclaim memory.

Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4
Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <jason.sim@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 12:12:41 -07:00
Arnd Bergmann 57e06f8c1f Qualcomm driver updates for v6.7
This introduces partial support for the Qualcomm Secure Execution
 Environment SCM interface, and uses this to implement EFI variable
 access on the Windows On Snapdragon devices (for now).
 
 The 32/64-bit calling convention detector of the SCM interface is
 updated to not choose 64-bit convention when Linux is 32-bit. The
 "extern" specifier is dropped from the interface include file.
 
 The LLCC driver gains support for carrying configuration for multiple
 different system/DDR configurations for a given platform, and selecting
 between them. Support for Q[DR]U1000 is added to the driver.
 
 All exported symbols are transitioned to EXPORT_SYMBOL_GPL().
 
 The platform_drivers in the Qualcomm SoC are transitioned to the
 void-returning remove_new implementation.
 
 The rmtfs memory driver gains support for leaving guard pages around the
 used area, to avoid issues if the allocation happens to be placed
 adjacent to another protected memory region.
 
 The socinfo driver gains knowledge about IPQ8174, QCM6490, SM7150P and
 various PMICs used together with SM8550.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmUsTdwVHGFuZGVyc3Nv
 bkBrZXJuZWwub3JnAAoJEAsfOT8Nma3FffsQAMcY5NKBZqbQu5wNr2zPnKW5m/30
 lYi4bW42mlY6Mo3h97LLxjGM9bqsmPasfAxv/84viN0JYxrSVye/zNtD35qTQBtu
 t8O8vdOaOYXIc8dwqC16Fl0kI5pm9tl7p5SJmGAUZBvIGUIGngy3gZOYM+HzyMQr
 UXFO6dO0tQvjowd1t0xE51UNm4J79Vm8HjtuR1pc4i+fZhsy5XhZYq9e4531Yc7o
 lNacBugjXhurw/rz0odNuRzKn+He+7KQR/hSNv8B2MbH0ZUP+k4VUaRSD9TZyBgC
 cOahbCfVLkGwepPFGuaaJUUgo8/aa4HiAbdyOrNL2ozYkoLv/PbRrLF0R9KgXK5v
 LPXpqXt97+rWPnq2iPrfZlo4LAj6JVSCUCxIRZFcWyDFv+wHsYuRbO64s1+MC7+M
 CPpvup31PLGCeR9Joo/WTt9iAUPJ4BY9v4mkjIdmJF371WRryP4um5Ln6xMC+fIe
 U/Ss8mPNpSM+guDo8LbdCGAcxWTDk8TOXxa/888y/4Q5Mg97nn0hY2KK4JBmJ7I3
 Thd0n9vXjM3oOwPGLC7Y2zj9ZQI5k7JKn331yPS4Hb20SKGEztlaJp4B1DdA5/ul
 4BUoh3HWV7dB4l+felrAmPAzAUXvyfGrDBJhXe2dabRJjMhouQ6Wudn3OlC4VP/l
 fjszjVvPFt7CE30C
 =B1Kn
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmUv9yoACgkQYKtH/8kJ
 UifTOQ/+NUj5Y/sAZfMLY3Q3e8s2nsKZLHHUCH7O/DOR6ILc5DZcTI5a23kfIA9V
 VVluTEnRUzbSHCb3WPW3Yfgy5HpZoaLgXWXGgCsRVKRIODN745TOm2Hn/dFZF0Z7
 Fso0oPplGKrvuxSlaxg+1fbJ7c5r8kE/TDu+ok0eiZGvzwfnVas4bB+27thdH/le
 tY/wS9BhHmBrs6k7gqCfEFWF7M6kS29FOPILIlAfxgUevgSIvAUMBH/KUAGw1Qo/
 UNd372TwXkdw0pG2KOx8xR4iKFBfWX4PbHBtBbUZzAnqerDnNrh7p0qTlhfKXOpb
 jQWx3eqQIwFbbWZ39DxiLE8/h5ig3jF29u1s/vAG2Zb5o5cYwiRM7OjWW8Q8Ha1E
 ha2NSWMEaNniegnMeoy1VTN8uSYY+dUseKmLmP5vCobmYVKuX96iQx5pNBYhsrwN
 YOlkOkF1fXINFqSD5v2N4s4AdGx634GgP7azkjm6fGS+hlwKgWNzwH4HHgXNx/u/
 IPRPw7U/hji2R++oNPtEOlNHoHWM6Zx0ema38qGTC69uAFHETxbFUmOBaX6q/7Ot
 AwpP25YozehwDoUVZRwR6CpHcVbKDl4oJhWaruflkxC+6YFTGYS2hPFiF6zbnQw9
 u36QDkgNzhBG2LINlmqCAAoGWmC+mqwsqYN/DyKsUUSVEODyUng=
 =7XXT
 -----END PGP SIGNATURE-----

Merge tag 'qcom-drivers-for-6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers

Qualcomm driver updates for v6.7

This introduces partial support for the Qualcomm Secure Execution
Environment SCM interface, and uses this to implement EFI variable
access on the Windows On Snapdragon devices (for now).

The 32/64-bit calling convention detector of the SCM interface is
updated to not choose 64-bit convention when Linux is 32-bit. The
"extern" specifier is dropped from the interface include file.

The LLCC driver gains support for carrying configuration for multiple
different system/DDR configurations for a given platform, and selecting
between them. Support for Q[DR]U1000 is added to the driver.

All exported symbols are transitioned to EXPORT_SYMBOL_GPL().

The platform_drivers in the Qualcomm SoC are transitioned to the
void-returning remove_new implementation.

The rmtfs memory driver gains support for leaving guard pages around the
used area, to avoid issues if the allocation happens to be placed
adjacent to another protected memory region.

The socinfo driver gains knowledge about IPQ8174, QCM6490, SM7150P and
various PMICs used together with SM8550.

* tag 'qcom-drivers-for-6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (44 commits)
  soc: qcom: socinfo: Convert to platform remove callback returning void
  soc: qcom: smsm: Convert to platform remove callback returning void
  soc: qcom: smp2p: Convert to platform remove callback returning void
  soc: qcom: smem: Convert to platform remove callback returning void
  soc: qcom: rmtfs_mem: Convert to platform remove callback returning void
  soc: qcom: qcom_stats: Convert to platform remove callback returning void
  soc: qcom: qcom_gsbi: Convert to platform remove callback returning void
  soc: qcom: qcom_aoss: Convert to platform remove callback returning void
  soc: qcom: pmic_glink: Convert to platform remove callback returning void
  soc: qcom: ocmem: Convert to platform remove callback returning void
  soc: qcom: llcc-qcom: Convert to platform remove callback returning void
  soc: qcom: icc-bwmon: Convert to platform remove callback returning void
  firmware: qcom_scm: use 64-bit calling convention only when client is 64-bit
  soc: qcom: llcc: Handle a second device without data corruption
  soc: qcom: Switch to EXPORT_SYMBOL_GPL()
  soc: qcom: smem: Annotate struct qcom_smem with __counted_by
  soc: qcom: rmtfs: Support discarding guard pages
  dt-bindings: reserved-memory: rmtfs: Allow guard pages
  dt-bindings: firmware: qcom,scm: document IPQ5018 compatible
  firmware: qcom_scm: disable SDI if required
  ...

Link: https://lore.kernel.org/r/20231015204014.855672-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-10-18 17:18:02 +02:00
wuqiang.matt 92f90d3b0d lib: objpool test module added
The test_objpool module (test_objpool) will run several testcases
for objpool stress and performance evaluation. Each testcase will
have all available cpu cores involved to create a situation of high
parallel and high contention.

As of now there are 5 groups and 5 * 2 testcases in total:

1) group 1: synchronous mode
   objpool is managed synchronously, that is, all objects are to be
   reclaimed before objpool finalization and the objpool owner makes
   sure of it. All threads on different cores run in the same pace
2) group 2: synchronous mode + hrtimer
   this case have 2 customers: normal threads and hrtimer softirqs
3) group 3: synchronous + overrun mode
   This test group is mainly for performance evaluation of missing
   cases when pre-allocated objects are less than the requested
4) group 4: asynchronous mode
   This case is just an emulation of kretprobe, with refcount used
   to control the objpool lifecycle
5) group 5: asynchronous mode with hrtimer
   hrtimer softirq is introduced to stress async objpool operations

Link: https://lore.kernel.org/all/20231017135654.82270-3-wuqiang.matt@bytedance.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-18 22:36:03 +09:00
wuqiang.matt b4edb8d2d4 lib: objpool added: ring-array based lockless MPMC
objpool is a scalable implementation of high performance queue for
object allocation and reclamation, such as kretprobe instances.

With leveraging percpu ring-array to mitigate hot spots of memory
contention, it delivers near-linear scalability for high parallel
scenarios. The objpool is best suited for the following cases:
1) Memory allocation or reclamation are prohibited or too expensive
2) Consumers are of different priorities, such as irqs and threads

Limitations:
1) Maximum objects (capacity) is fixed after objpool creation
2) All pre-allocated objects are managed in percpu ring array,
   which consumes more memory than linked lists

Link: https://lore.kernel.org/all/20231017135654.82270-2-wuqiang.matt@bytedance.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-18 22:35:36 +09:00
Yury Norov 6cb42f91aa bitmap: move bitmap_*_region() functions to bitmap.h
Now that bitmap_*_region() functions are implemented as thin wrappers
around others, it's worth to move them to the header, as it opens room
for compile-time optimizations.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-16 16:14:45 -07:00
NeilBrown de9e82c355 lib: add light-weight queuing mechanism.
lwq is a FIFO single-linked queue that only requires a spinlock
for dequeueing, which happens in process context.  Enqueueing is atomic
with no spinlock and can happen in any context.

This is particularly useful when work items are queued from BH or IRQ
context, and when they are handled one at a time by dedicated threads.

Avoiding any locking when enqueueing means there is no need to disable
BH or interrupts, which is generally best avoided (particularly when
there are any RT tasks on the machine).

This solution is superior to using "list_head" links because we need
half as many pointers in the data structures, and because list_head
lists would need locking to add items to the queue.

This solution is superior to a bespoke solution as all locking and
container_of casting is integrated, so the interface is simple.

Despite the similar name, this solution meets a distinctly different
need to kfifo.  kfifo provides a fixed sized circular buffer to which
data can be added at one end and removed at the other, and does not
provide any locking.  lwq does not have any size limit and works with
data structures (objects?) rather than data (bytes).

A unit test for basic functionality, which runs at boot time, is included.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Gow <davidgow@google.com>
Cc: linux-kernel@vger.kernel.org
Message-Id: <20230911111333.4d1a872330e924a00acb905b@linux-foundation.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16 12:44:06 -04:00
NeilBrown 8a3e5975ed llist: add llist_del_first_this()
llist_del_first_this() deletes a specific entry from an llist, providing
it is at the head of the list.  Multiple threads can call this
concurrently providing they each offer a different entry.

This can be uses for a set of worker threads which are on the llist when
they are idle.  The head can always be woken, and when it is woken it
can remove itself, and possibly wake the next if there is an excess of
work to do.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16 12:44:06 -04:00
Yury Norov 1d4836527d bitmap: drop _reg_op() function
Now that all _reg_op() users are switched to alternative functions,
_reg_op() machinery is not needed anymore.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:23 -07:00
Yury Norov 9276819a68 bitmap: replace _reg_op(REG_OP_ISFREE) with find_next_bit()
_reg_op(REG_OP_ISFREE) can be trivially replaced with find_next_bit().
Doing that opens room for potential small_const_nbits() optimization.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov add00c76ee bitmap: replace _reg_op(REG_OP_RELEASE) with bitmap_clear()
_reg_op(REG_OP_RELEASE) duplicates bitmap_clear().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov eae5acbd75 bitmap: replace _reg_op(REG_OP_ALLOC) with bitmap_set()
_reg_op(REG_OP_ALLOC) duplicates bitmap_set().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov b085f969ed bitmap: fix opencoded bitmap_allocate_region()
bitmap_find_region() opencodes bitmap_allocate_region().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov 6d5d3a0c33 bitmap: add test for bitmap_*_region() functions
Test basic functionality of bitmap_{allocate,release,find_free}_region()
functions.

CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov 82bf9bdfbc bitmap: align __reg_op() wrappers with modern coding style
Fix comments so that scripts/kernel-doc doesn't warn, and fix for-loop
stype in bitmap_find_free_region().

CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov aae06fc1b5 lib/bitmap: split-out string-related operations to a separate files
lib/bitmap.c and corresponding include/linux/bitmap.h are intended to
hold functions related to operations on bitmaps, like bitmap_shift or
bitmap_set. Historically, some string-related operations like
bitmap_parse are also reside in lib/bitmap.c.

Now that the subsystem evolves, string-related bitmap operations became a
significant part of the file. Because they are quite different from the
other bitmap functions by nature, it's worth to split them to a separate
source/header files.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Andy Shevchenko 7733aa8938 bitmap: Remove dead code, i.e. bitmap_copy_le()
Besides the fact it's not used anywhere it should be implemented
differently, i.e. via helpers from linux/byteorder/generic.h.
Yet the helpers themselves need to be introduced first.

Also note, the function lacks of the test cases, they must be provided.

Hence, drop the current dead code for good.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Jonathan Neuschäfer 8ed13a762c bitmap: Fix a typo ("identify map")
A map in which each element is mapped to itself is called an "identity
map".

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Randy Dunlap 57f728d59f cpumask: kernel-doc cleanups and additions
Clean up some punctutation and abbreviations.
Add kernel-doc notation for one function and function return value
for 39 functions.

cpumask.h:
Fix some punctuation (plural vs. possessive).
Fix some abbreviations (ie. -> i.e., id -> ID).

Fix 35 warnings like this:
include/linux/cpumask.h:161: warning: No description found for return value of 'cpumask_first'

cpumask.c:
Add Return: value for 4 functions.
Add kernel-doc for cpumask_any_distribute().

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:21 -07:00
Uros Bizjak 4fbf8b136d locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
in rcuref_put_slowpath(). On x86 the CMPXCHG instruction returns success in the
ZF flag, so this change saves a compare after CMPXCHG.  Additionaly,
the compiler reorders some code blocks to follow likely/unlikely
annotations in the atomic_try_cmpxchg() macro, improving the code from:

  9a:	f0 0f b1 0b          	lock cmpxchg %ecx,(%rbx)
  9e:	83 f8 ff             	cmp    $0xffffffff,%eax
  a1:	74 04                	je     a7 <rcuref_put_slowpath+0x27>
  a3:	31 c0                	xor    %eax,%eax

to:

  9a:	f0 0f b1 0b          	lock cmpxchg %ecx,(%rbx)
  9e:	75 4c                	jne    ec <rcuref_put_slowpath+0x6c>
  a0:	b0 01                	mov    $0x1,%al

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20230509150255.3691-1-ubizjak@gmail.com
2023-10-10 10:14:27 +02:00
David Howells b5f0e20f44
iov_iter, net: Move hash_and_copy_to_iter() to net/
Move hash_and_copy_to_iter() to be with its only caller in networking code.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-13-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: netdev@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
David Howells 6d0d419914
iov_iter, net: Move csum_and_copy_to/from_iter() to net/
Move csum_and_copy_to/from_iter() to net code now that the iteration
framework can be #included.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-10-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: netdev@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
David Howells c9eec08bac
iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
iter->copy_mc is only used with a bvec iterator and only by
dump_emit_page() in fs/coredump.c so rather than handle this in
memcpy_from_iter_mc() where it is checked repeatedly by _copy_from_iter()
and copy_page_from_iter_atomic(),

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-9-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
Ingo Molnar 8db30574db Merge branch 'sched/urgent' into sched/core, to pick up fixes and refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-07 11:32:24 +02:00
Jakub Kicinski 2606cf059c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts (or adjacent changes of note).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05 13:16:47 -07:00
Dr. David Alan Gilbert 0ebc7feae7 powerpc: Use shared font data
PowerPC has a 'btext' font used for the console which is almost identical
to the shared font_sun8x16, so use it rather than duplicating the data.

They were actually identical until about a decade ago when
   commit bcfbeecea1 ("drivers: console: font_: Change a glyph from
                        "broken bar" to "vertical line"")

which changed the | in the shared font to be a solid
bar rather than a broken bar.  That's the only difference.

This was originally spotted by the PMF source code analyser, which
noticed that sparc does the same thing with the same data, and they
also share a bunch of functions to manipulate the data.  I've previously
posted a near identical patch for sparc.

Tested very lightly with a boot without FS in qemu.

Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230825142754.1487900-1-linux@treblig.org
2023-10-01 23:09:02 +11:00
Liam R. Howlett a8091f039c maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
When updating the maple tree iterator to avoid rewalks, an issue was
introduced when shifting beyond the limits.  This can be seen by trying to
go to the previous address of 0, which would set the maple node to
MAS_NONE and keep the range as the last entry.

Subsequent calls to mas_find() would then search upwards from mas->last
and skip the value at mas->index/mas->last.  This showed up as a bug in
mprotect which skips the actual VMA at the current range after attempting
to go to the previous VMA from 0.

Since MAS_NONE may already be set when searching for a value that isn't
contained within a node, changing the handling of MAS_NONE in mas_find()
would make the code more complicated and error prone.  Furthermore, there
was no way to tell which limit was hit, and thus which action to take
(next or the entry at the current range).

This solution is to add two states to track what happened with the
previous iterator action.  This allows for the expected behaviour of the
next command to return the correct item (either the item at the range
requested, or the next/previous).

Tests are also added and updated accordingly.

Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com
Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle.com/
Fixes: 39193685d5 ("maple_tree: try harder to keep active node with mas_prev()")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Pedro Falcato <pedro.falcato@gmail.com>
Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Closes: https://bugs.archlinux.org/task/79656
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29 17:20:46 -07:00
Jinjie Ruan 8040345fda kunit: test: Fix the possible memory leak in executor_test
When CONFIG_KUNIT_ALL_TESTS=y, making CONFIG_DEBUG_KMEMLEAK=y and
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the below memory leak is detected.

If kunit_filter_suites() succeeds, not only copy but also filtered_suite
and filtered_suite->test_cases should be freed.

So as Rae suggested, to avoid the suite set never be freed when
KUNIT_ASSERT_EQ() fails and exits after kunit_filter_suites() succeeds,
update kfree_at_end() func to free_suite_set_at_end() to use
kunit_free_suite_set() to free them as kunit_module_exit() and
kunit_run_all_tests() do it. As the second arg got of
free_suite_set_at_end() is a local variable, copy it for free to avoid
wild-memory-access. After applying this patch, the following memory leak
is never detected.

unreferenced object 0xffff8881001de400 (size 1024):
  comm "kunit_try_catch", pid 1396, jiffies 4294720452 (age 932.801s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829e961d>] kunit_filter_suites+0x44d/0xcc0
    [<ffffffff829eb69f>] filter_suites_test+0x12f/0x360
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cd388 (size 192):
  comm "kunit_try_catch", pid 1396, jiffies 4294720452 (age 932.801s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff 80 cd 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829e9651>] kunit_filter_suites+0x481/0xcc0
    [<ffffffff829eb69f>] filter_suites_test+0x12f/0x360
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20

unreferenced object 0xffff888100da8400 (size 1024):
  comm "kunit_try_catch", pid 1398, jiffies 4294720454 (age 781.945s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829e961d>] kunit_filter_suites+0x44d/0xcc0
    [<ffffffff829eb13f>] filter_suites_test_glob_test+0x12f/0x560
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888105117878 (size 96):
  comm "kunit_try_catch", pid 1398, jiffies 4294720454 (age 781.945s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff a0 ac 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829e9651>] kunit_filter_suites+0x481/0xcc0
    [<ffffffff829eb13f>] filter_suites_test_glob_test+0x12f/0x560
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888102c31c00 (size 1024):
  comm "kunit_try_catch", pid 1404, jiffies 4294720460 (age 781.948s)
  hex dump (first 32 bytes):
    6e 6f 72 6d 61 6c 5f 73 75 69 74 65 00 00 00 00  normal_suite....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829ecf17>] kunit_filter_attr_tests+0xf7/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829ea975>] filter_attr_test+0x195/0x5f0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cd250 (size 192):
  comm "kunit_try_catch", pid 1404, jiffies 4294720460 (age 781.948s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff 00 a9 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829ecfc1>] kunit_filter_attr_tests+0x1a1/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829ea975>] filter_attr_test+0x195/0x5f0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888104f4e400 (size 1024):
  comm "kunit_try_catch", pid 1408, jiffies 4294720464 (age 781.944s)
  hex dump (first 32 bytes):
    73 75 69 74 65 00 00 00 00 00 00 00 00 00 00 00  suite...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829ecf17>] kunit_filter_attr_tests+0xf7/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829e9fc3>] filter_attr_skip_test+0x133/0x6e0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cc620 (size 192):
  comm "kunit_try_catch", pid 1408, jiffies 4294720464 (age 781.944s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff c0 a8 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 02 00 00 00 02 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829ecfc1>] kunit_filter_attr_tests+0x1a1/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829e9fc3>] filter_attr_skip_test+0x133/0x6e0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20

Fixes: e5857d396f ("kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites")
Fixes: 76066f93f1 ("kunit: add tests for filtering attributes")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Suggested-by: David Gow <davidgow@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309142251.uJ8saAZv-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202309270433.wGmFRGjd-lkp@intel.com/
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:51:07 -06:00
Jinjie Ruan 24de14c98b kunit: Fix possible memory leak in kunit_filter_suites()
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the filtered_suite and filtered_suite->test_cases
allocated in the last kunit_filter_attr_tests() in last inner for loop
is leaked.

So add a new free_filtered_suite err label and free the filtered_suite
and filtered_suite->test_cases so far. And change kmalloc_array of copy
to kcalloc to Clear the copy to make the kfree safe.

Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:51:02 -06:00
Jinjie Ruan e44679515a kunit: Fix the wrong kfree of copy for kunit_filter_suites()
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the copy pointer has been moved. So it should free
the original copy's backup copy_start.

Fixes: abbf73816b ("kunit: fix possible memory leak in kunit_filter_suites()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:50:57 -06:00
Jinjie Ruan a6074cf012 kunit: Fix missed memory release in kunit_free_suite_set()
modprobe cpumask_kunit and rmmod cpumask_kunit, kmemleak detect
a suspected memory leak as below.

If kunit_filter_suites() in kunit_module_init() succeeds, the
suite_set.start will not be NULL and the kunit_free_suite_set() in
kunit_module_exit() should free all the memory which has not
been freed. However the test_cases in suites is left out.

unreferenced object 0xffff54ac47e83200 (size 512):
  comm "modprobe", pid 592, jiffies 4294913238 (age 1367.612s)
  hex dump (first 32 bytes):
    84 13 1a f0 d3 b6 ff ff 30 68 1a f0 d3 b6 ff ff  ........0h......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000008dec63a2>] slab_post_alloc_hook+0xb8/0x368
    [<00000000ec280d8e>] __kmem_cache_alloc_node+0x174/0x290
    [<00000000896c7740>] __kmalloc+0x60/0x2c0
    [<000000007a50fa06>] kunit_filter_suites+0x254/0x5b8
    [<0000000078cc98e2>] kunit_module_notify+0xf4/0x240
    [<0000000033cea952>] notifier_call_chain+0x98/0x17c
    [<00000000973d05cc>] notifier_call_chain_robust+0x4c/0xa4
    [<000000005f95895f>] blocking_notifier_call_chain_robust+0x4c/0x74
    [<0000000048e36fa7>] load_module+0x1a2c/0x1c40
    [<0000000004eb8a91>] init_module_from_file+0x94/0xcc
    [<0000000037dbba28>] idempotent_init_module+0x184/0x278
    [<00000000161b75cb>] __arm64_sys_finit_module+0x68/0xa8
    [<000000006dc1669b>] invoke_syscall+0x44/0x100
    [<00000000fa87e304>] el0_svc_common.constprop.1+0x68/0xe0
    [<000000009d8ad866>] do_el0_svc+0x1c/0x28
    [<000000005b83c607>] el0_svc+0x3c/0xc4

Fixes: a127b154a8 ("kunit: tool: allow filtering test cases via glob")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:50:51 -06:00
David Howells f1982740f5
iov_iter: Convert iterate*() to inline funcs
Convert the iov_iter iteration macros to inline functions to make the code
easier to follow.

The functions are marked __always_inline as we don't want to end up with
indirect calls in the code.  This, however, leaves dealing with ->copy_mc
in an awkard situation since the step function (memcpy_from_iter_mc())
needs to test the flag in the iterator, but isn't passed the iterator.
This will be dealt with in a follow-up patch.

The variable names in the per-type iterator functions have been harmonised
as much as possible and made clearer as to the variable purpose.

The iterator functions are also moved to a header file so that other
operations that need to scan over an iterator can be added.  For instance,
the rbd driver could use this to scan a buffer to see if it is all zeros
and libceph could use this to generate a crc.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/3710261.1691764329@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/855.1692047347@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@redhat.com/ # v3
Link: https://lore.kernel.org/r/20230925120309.1731676-8-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-25 14:30:28 +02:00
David Howells f1b4cb650b
iov_iter: Derive user-backedness from the iterator type
Use the iterator type to determine whether an iterator is user-backed or
not rather than using a special flag for it.  Now that ITER_UBUF and
ITER_IOVEC are 0 and 1, they can be checked with a single comparison.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-7-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-25 14:30:28 +02:00
Azeem Shaikh 6cd59324c6 kobject: Replace strlcpy with strscpy
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

Direct replacement is safe here since return value of -errno
is used to check for truncation instead of sizeof(dest).

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230831140104.207019-1-azeemshaikh38@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-22 09:50:56 -07:00
Randy Dunlap 36ee98b555 argv_split: fix kernel-doc warnings
Use proper kernel-doc notation to prevent build warnings:

lib/argv_split.c:36: warning: Function parameter or member 'argv' not described in 'argv_free'
lib/argv_split.c:61: warning: No description found for return value of 'argv_split'

Link: https://lkml.kernel.org/r/20230912060838.3794-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-19 13:21:33 -07:00
Randy Dunlap c80da1fb85 scatterlist: add missing function params to kernel-doc
Describe missing function parameters to prevent kernel-doc warnings:

lib/scatterlist.c:288: warning: Function parameter or member 'first_chunk' not described in '__sg_alloc_table'
lib/scatterlist.c:800: warning: Function parameter or member 'flags' not described in 'sg_miter_start'

Link: https://lkml.kernel.org/r/20230912060848.4673-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-19 13:21:33 -07:00
Michal Wajdeczko ee5f8cc277 kunit: Reset test status on each param iteration
If we skip one parametrized test case then test status remains
SKIP for all subsequent test params leading to wrong reports:

$ ./tools/testing/kunit/kunit.py run \
	--kunitconfig ./lib/kunit/.kunitconfig *.example_params*
	--raw_output \

[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
    # example: initializing suite
    KTAP version 1
    # Subtest: example
    # module: kunit_example_test
    1..1
        KTAP version 1
        # Subtest: example_params_test
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 1 example value 3 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 2 example value 2 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 3 example value 1 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 4 example value 0 # SKIP unsupported param value 0
    # example_params_test: pass:0 fail:0 skip:4 total:4
    ok 1 example_params_test # SKIP unsupported param value 0
    # example: exiting suite
ok 1 example # SKIP

Reset test status and status comment after each param iteration
to avoid using stale results.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:46:56 -06:00
Richard Fitzgerald 53568b720c kunit: string-stream: Test performance of string_stream
Add a test of the speed and memory use of string_stream.

string_stream_performance_test() doesn't actually "test" anything (it
cannot fail unless the system has run out of allocatable memory) but it
measures the speed and memory consumption of the string_stream and reports
the result.

This allows changes in the string_stream implementation to be compared.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:59 -06:00
Richard Fitzgerald 05e2006ce4 kunit: Use string_stream for test log
Replace the fixed-size log buffer with a string_stream so that the
log can grow as lines are added.

The existing kunit log tests have been updated for using a
string_stream as the log. No new test have been added because there
are already tests for the underlying string_stream.

As the log tests now depend on string_stream functions they cannot
build when kunit-test is a module. They have been surrounded by
a #if to replace them with skipping version when the test is
build as a module. Though this isn't pretty, it avoids moving
code to another file while that code is also being changed.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:53 -06:00
Richard Fitzgerald d1a0d699bf kunit: string-stream: Add tests for freeing resource-managed string_stream
string_stream_managed_free_test() allocates a resource-managed
string_stream and tests that kunit_free_string_stream() calls
string_stream_destroy().

string_stream_resource_free_test() allocates a resource-managed
string_stream and tests that string_stream_destroy() is called
when the test resources are cleaned up.

The old string_stream_init_test() has been split into two tests,
one for kunit_alloc_string_stream() and the other for
alloc_string_stream().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:46 -06:00
Richard Fitzgerald a3fdf78478 kunit: string-stream: Decouple string_stream from kunit
Re-work string_stream so that it is not tied to a struct kunit. This is
to allow using it for the log of struct kunit_suite.

Instead of resource-managing individual allocations the whole string_stream
can be resource-managed, if required.

    alloc_string_stream() now allocates a string stream that is
    not resource-managed.

    string_stream_destroy() now works on an unmanaged string_stream
    allocated by alloc_string_stream() and frees the entire
    string_stream (previously it only freed the fragments).

    string_stream_clear() has been made public for callers that
    want to free the fragments without destroying the string_stream.

For resource-managed allocations use kunit_alloc_string_stream()
and kunit_free_string_stream().

In addition to this, string_stream_get_string() now returns an
unmanaged buffer that the caller must kfree().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:40 -06:00
Richard Fitzgerald 20631e154c kunit: string-stream: Add kunit_alloc_string_stream()
Add function kunit_alloc_string_stream() to do a resource-managed
allocation of a string stream, and corresponding
kunit_free_string_stream() to free the resource-managed stream.

This is preparing for decoupling the string_stream
implementation from struct kunit, to reduce the amount of code
churn when that happens. Currently:
 - kunit_alloc_string_stream() only calls alloc_string_stream().
 - kunit_free_string_stream() takes a struct kunit* which
   isn't used yet.

Callers of the old alloc_string_stream() and
string_stream_destroy() are all requesting a managed allocation
so have been changed to use the new functions.

alloc_string_stream() has been temporarily made static because
its current behavior has been replaced with
kunit_alloc_string_stream().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:35 -06:00
Richard Fitzgerald 7b4481cbe7 kunit: Don't use a managed alloc in is_literal()
There is no need to use a test-managed alloc in is_literal().
The function frees the temporary buffer before returning.

This removes the only use of the test and gfp members of
struct string_stream outside of the string_stream implementation.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:30 -06:00
Richard Fitzgerald 1f58cdb173 kunit: string-stream-test: Add cases for string_stream newline appending
Add test cases for testing the string_stream feature that appends a
newline to strings that do not already end with a newline.

string_stream_no_auto_newline_test() tests with this feature disabled.
Newlines should not be added or dropped.

string_stream_auto_newline_test() tests with this feature enabled.
Newlines should be added to lines that do not end with a newline.

string_stream_append_auto_newline_test() tests appending the
content of one stream to another stream when the target stream
has newline appending enabled.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:23 -06:00
Richard Fitzgerald a5abe7b201 kunit: string-stream: Add option to make all lines end with newline
Add an optional feature to string_stream that will append a newline to
any added string that does not already end with a newline. The purpose
of this is so that string_stream can be used to collect log lines.

This is enabled/disabled by calling string_stream_set_append_newlines().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:16 -06:00
Richard Fitzgerald 4551caca6a kunit: string-stream: Improve testing of string_stream
Replace the minimal tests with more-thorough testing.

string_stream_init_test() tests that struct string_stream is
initialized correctly.

string_stream_line_add_test() adds a series of numbered lines and
checks that the resulting string contains all the lines.

string_stream_variable_length_line_test() adds a large number of
lines of varying length to create many fragments, then tests that all
lines are present.

string_stream_append_test() tests various cases of using
string_stream_append() to append the content of one stream to another.

Adds string_stream_append_empty_string_test() to test that adding an
empty string to a string_stream doesn't create a new empty fragment.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:11 -06:00
Richard Fitzgerald 5c54c9ebb1 kunit: string-stream: Don't create a fragment for empty strings
If the result of the formatted string is an empty string just return
instead of creating an empty fragment.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:05 -06:00
David S. Miller 685c6d5b2c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 79 files changed, 5275 insertions(+), 600 deletions(-).

The main changes are:

1) Basic BTF validation in libbpf, from Andrii Nakryiko.

2) bpf_assert(), bpf_throw(), exceptions in bpf progs, from Kumar Kartikeya Dwivedi.

3) next_thread cleanups, from Oleg Nesterov.

4) Add mcpu=v4 support to arm32, from Puranjay Mohan.

5) Add support for __percpu pointers in bpf progs, from Yonghong Song.

6) Fix bpf tailcall interaction with bpf trampoline, from Leon Hwang.

7) Raise irq_work in bpf_mem_alloc while irqs are disabled to improve refill probabablity, from Hou Tao.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Thanks a lot!

Also thanks to reporters, reviewers and testers of commits in this pull-request:

Alan Maguire, Andrey Konovalov, Dave Marchevsky, "Eric W. Biederman",
Jiri Olsa, Maciej Fijalkowski, Quentin Monnet, Russell King (Oracle),
Song Liu, Stanislav Fomichev, Yonghong Song
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-17 15:12:06 +01:00
Puranjay Mohan daabb2b098 bpf/tests: add tests for cpuv4 instructions
The BPF JITs now support cpuv4 instructions. Add tests for these new
instructions to the test suite:

1. Sign extended Load
2. Sign extended Mov
3. Unconditional byte swap
4. Unconditional jump with 32-bit offset
5. Signed division and modulo

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20230907230550.1417590-9-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-15 17:16:57 -07:00
Yury Norov 9ecea9ae4d sched/topology: Handle NUMA_NO_NODE in sched_numa_find_nth_cpu()
sched_numa_find_nth_cpu() doesn't handle NUMA_NO_NODE properly, and
may crash kernel if passed with it. On the other hand, the only user
of sched_numa_find_nth_cpu() has to check NUMA_NO_NODE case explicitly.

It would be easier for users if this logic will get moved into
sched_numa_find_nth_cpu().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20230819141239.287290-6-yury.norov@gmail.com
2023-09-15 13:48:11 +02:00
Maximilian Luz e4c89f9380 lib/ucs2_string: Add UCS-2 strscpy function
Add a ucs2_strscpy() function for UCS-2 strings. The behavior is
equivalent to the standard strscpy() function, just for 16-bit character
UCS-2 strings.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230827211408.689076-2-luzmaximilian@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-09-13 10:18:42 -07:00