Commit graph

7571 commits

Author SHA1 Message Date
Paul Menzel
633174a704 lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3
Buidling raid6test on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the
errors below:

    $ cd lib/raid6/test/
    $ make
    <stdin>:1:1: error: stray ‘\’ in program
    <stdin>:1:2: error: stray ‘#’ in program
    <stdin>:1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ \
        before ‘<’ token

    [...]

The errors come from the HAS_ALTIVEC test, which fails, and the POWER
optimized versions are not built. That’s also reason nobody noticed on the
other architectures.

GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release
announcment:

> * WARNING: Backward-incompatibility!
>   Number signs (#) appearing inside a macro reference or function invocation
>   no longer introduce comments and should not be escaped with backslashes:
>   thus a call such as:
>     foo := $(shell echo '#')
>   is legal.  Previously the number sign needed to be escaped, for example:
>     foo := $(shell echo '\#')
>   Now this latter will resolve to "\#".  If you want to write makefiles
>   portable to both versions, assign the number sign to a variable:
>     H := \#
>     foo := $(shell echo '$H')
>   This was claimed to be fixed in 3.81, but wasn't, for some reason.
>   To detect this change search for 'nocomment' in the .FEATURES variable.

So, do the same as commit 9564a8cf42 ("Kbuild: fix # escaping in .cmd
files for future Make") and commit 929bef4677 ("bpf: Use $(pound) instead
of \# in Makefiles") and define and use a $(pound) variable.

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Cc: Matt Brown <matthew.brown.dev@gmail.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-03-08 15:20:21 -08:00
Dirk Müller
a5359ddd05 lib/raid6/test: fix multiple definition linking error
GCC 10+ defaults to -fno-common, which enforces proper declaration of
external references using "extern". without this change a link would
fail with:

  lib/raid6/test/algos.c:28: multiple definition of `raid6_call';
  lib/raid6/test/test.c:22: first defined here

the pq.h header that is included already includes an extern declaration
so we can just remove the redundant one here.

Cc: <stable@vger.kernel.org>
Signed-off-by: Dirk Müller <dmueller@suse.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-03-08 15:20:21 -08:00
Keith Busch
f3813f4b28 crypto: add rocksoft 64b crc guard tag framework
Hardware specific features may be able to calculate a crc64, so provide
a framework for drivers to register their implementation. If nothing is
registered, fallback to the generic table lookup implementation. The
implementation is modeled after the crct10dif equivalent.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220303201312.3255347-7-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07 12:48:35 -07:00
Keith Busch
cbc0a40e17 lib: add rocksoft model crc64
The NVM Express specification extended data integrity fields to 64 bits
using the Rocksoft parameters. Add the poly to the crc64 table
generation, and provide a generic library routine implementing the
algorithm.

The Rocksoft 64-bit CRC model parameters are as follows:
    Poly: 0xAD93D23594C93659
    Initial value: 0xFFFFFFFFFFFFFFFF
    Reflected Input: True
    Reflected Output: True
    Xor Final: 0xFFFFFFFFFFFFFFFF

Since this model used reflected bits, the implementation generates the
reflected table so the result is ordered consistently.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220303201312.3255347-6-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07 12:48:35 -07:00
Jens Axboe
bc8419944f Merge branch 'for-5.18/block' into for-5.18/64bit-pi
* for-5.18/block: (96 commits)
  block: remove bio_devname
  ext4: stop using bio_devname
  raid5-ppl: stop using bio_devname
  raid1: stop using bio_devname
  md-multipath: stop using bio_devname
  dm-integrity: stop using bio_devname
  dm-crypt: stop using bio_devname
  pktcdvd: remove a pointless debug check in pkt_submit_bio
  block: remove handle_bad_sector
  block: fix and cleanup bio_check_ro
  bfq: fix use-after-free in bfq_dispatch_request
  blk-crypto: show crypto capabilities in sysfs
  block: don't delete queue kobject before its children
  block: simplify calling convention of elv_unregister_queue()
  block: remove redundant semicolon
  block: default BLOCK_LEGACY_AUTOLOAD to y
  block: update io_ticks when io hang
  block, bfq: don't move oom_bfqq
  block, bfq: avoid moving bfqq to it's parent bfqg
  block, bfq: cleanup bfq_bfqq_to_bfqg()
  ...
2022-03-07 12:46:13 -07:00
Jakub Kicinski
6646dc241d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-03-04

We've added 32 non-merge commits during the last 14 day(s) which contain
a total of 59 files changed, 1038 insertions(+), 473 deletions(-).

The main changes are:

1) Optimize BPF stackmap's build_id retrieval by caching last valid build_id,
   as consecutive stack frames are likely to be in the same VMA and therefore
   have the same build id, from Hao Luo.

2) Several improvements to arm64 BPF JIT, that is, support for JITing
   the atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and lastly
   atomic[64]_{xchg|cmpxchg}. Also fix the BTF line info dump for JITed
   programs, from Hou Tao.

3) Optimize generic BPF map batch deletion by only enforcing synchronize_rcu()
   barrier once upon return to user space, from Eric Dumazet.

4) For kernel build parse DWARF and generate BTF through pahole with enabled
   multithreading, from Kui-Feng Lee.

5) BPF verifier usability improvements by making log info more concise and
   replacing inv with scalar type name, from Mykola Lysenko.

6) Two follow-up fixes for BPF prog JIT pack allocator, from Song Liu.

7) Add a new Kconfig to allow for loading kernel modules with non-matching
   BTF type info; their BTF info is then removed on load, from Connor O'Brien.

8) Remove reallocarray() usage from bpftool and switch to libbpf_reallocarray()
   in order to fix compilation errors for older glibc, from Mauricio Vásquez.

9) Fix libbpf to error on conflicting name in BTF when type declaration
   appears before the definition, from Xu Kuohai.

10) Fix issue in BPF preload for in-kernel light skeleton where loaded BPF
    program fds prevent init process from setting up fd 0-2, from Yucong Sun.

11) Fix libbpf reuse of pinned perf RB map when max_entries is auto-determined
    by libbpf, from Stijn Tintel.

12) Several cleanups for libbpf and a fix to enforce perf RB map #pages to be
    non-zero, from Yuntao Wang.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (32 commits)
  bpf: Small BPF verifier log improvements
  libbpf: Add a check to ensure that page_cnt is non-zero
  bpf, x86: Set header->size properly before freeing it
  x86: Disable HAVE_ARCH_HUGE_VMALLOC on 32-bit x86
  bpf, test_run: Fix overflow in XDP frags bpf_test_finish
  selftests/bpf: Update btf_dump case for conflicting names
  libbpf: Skip forward declaration when counting duplicated type names
  bpf: Add some description about BPF_JIT_ALWAYS_ON in Kconfig
  bpf, docs: Add a missing colon in verifier.rst
  bpf: Cache the last valid build_id
  libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning
  bpf, selftests: Use raw_tp program for atomic test
  bpf, arm64: Support more atomic operations
  bpftool: Remove redundant slashes
  bpf: Add config to allow loading modules with BTF mismatches
  bpf, arm64: Feed byte-offset into bpf line info
  bpf, arm64: Call build_prologue() first in first JIT pass
  bpf: Fix issue with bpf preload module taking over stdout/stdin of kernel.
  bpftool: Bpf skeletons assert type sizes
  bpf: Cleanup comments
  ...
====================

Link: https://lore.kernel.org/r/20220304164313.31675-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-04 19:28:17 -08:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Christoph Hellwig
27674ef6c7 mm: remove the extra ZONE_DEVICE struct page refcount
ZONE_DEVICE struct pages have an extra reference count that complicates
the code for put_page() and several places in the kernel that need to
check the reference count to see that a page is not being used (gup,
compaction, migration, etc.). Clean up the code so the reference count
doesn't need to be treated specially for ZONE_DEVICE pages.

Note that this excludes the special idle page wakeup for fsdax pages,
which still happens at refcount 1.  This is a separate issue and will
be sorted out later.  Given that only fsdax pages require the
notifiacation when the refcount hits 1 now, the PAGEMAP_OPS Kconfig
symbol can go away and be replaced with a FS_DAX check for this hook
in the put_page fastpath.

Based on an earlier patch from Ralph Campbell <rcampbell@nvidia.com>.

Link: https://lkml.kernel.org/r/20220210072828.2930359-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Christoph Hellwig
dc90f0846d mm: don't include <linux/memremap.h> in <linux/mm.h>
Move the check for the actual pgmap types that need the free at refcount
one behavior into the out of line helper, and thus avoid the need to
pull memremap.h into mm.h.

Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Christoph Hellwig
730ff52194 mm: remove pointless includes from <linux/hmm.h>
hmm.h pulls in the world for no good reason at all.  Remove the
includes and push a few ones into the users instead.

Link: https://lkml.kernel.org/r/20220210072828.2930359-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Linus Torvalds
7e3d76139b ARM further fixes for 5.17-rc:
- Fix kgdb breakpoint for Thumb2
 - Fix dependency for BITREVERSE kconfig
 - Fix nommu early_params and __setup returns
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmIc4SIACgkQ9OeQG+St
 rGTlCQ//XQD4xLnvM2LScGFVOOvoQwOmv77H6jOVrfO1xs8dD0W5mBG3LdgaAKkW
 CnYRb9qF2i+lq1p3ZH9u+5bSX6ttzlRmvwUQB89YM7gkU5AY535gz1nFKScdT932
 FNftd1h4FJXvdOsVQM3MnwTNFtp3YodkkkNzKS8PkMxSuvQffMXBMo8cTpkkIF+M
 Eq/QRGIavreFqsI7UtN24j1FkDlBGYrVT8aGHwfyYRCIiFw6InaCpZ1eElJl0xdH
 v80h1ihYqIfLgkHH3Bkk8edsNoosJII5B67n1t1ZdkNBKEiPR5tLa5IMmEv2Dy07
 ufUvU1dullN5KXLQzD/8H4BZMGR1m0tDKWqCt1x1wcug/a1R0xPuO5QEOKXU0HpW
 wegV8ueYmGqAN5HN1iRpNctCJSos+qPZYuDDevJMnXjQsDRamUqUy/0V/rgc7qKE
 yzBfzgKM+Vhn5bKhtXu09Z6xAwVa0wknsJ+NF++EbukAW/WK5m3ck1Z0Ab6e3C1i
 phlnCIH083yejpxuoQMxDaGDhWwEE+a9R63BUPUxmdBIxrVc2yZLo+BUDJxaDh8n
 GcsiFnrsziwJIRlL0FsEWh1PbwWd8xhfHCBV7qbRDZ98RfDyjrajJrl9eK9u9pT+
 nUZKTC6Y+v6N3qfGJzvTHgjhOAA+crfgHcgDGZthoz3UJ0A1Tk0=
 =cSgl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - Fix kgdb breakpoint for Thumb2

 - Fix dependency for BITREVERSE kconfig

 - Fix nommu early_params and __setup returns

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
  ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
  ARM: Fix kgdb breakpoint for Thumb2
2022-03-02 16:11:56 -08:00
Nicolai Stange
81771ff241 lib/mpi: export mpi_rshift
A subsequent patch will make the crypto/dh's dh_is_pubkey_valid() to
calculate a safe-prime groups Q parameter from P: Q = (P - 1) / 2. For
implementing this, mpi_rshift() will be needed. Export it so that it's
accessible from crypto/dh.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-03-03 10:47:52 +12:00
Connor O'Brien
5e214f2e43 bpf: Add config to allow loading modules with BTF mismatches
BTF mismatch can occur for a separately-built module even when the ABI is
otherwise compatible and nothing else would prevent successfully loading.

Add a new Kconfig to control how mismatches are handled. By default, preserve
the current behavior of refusing to load the module. If MODULE_ALLOW_BTF_MISMATCH
is enabled, load the module but ignore its BTF information.

Suggested-by: Yonghong Song <yhs@fb.com>
Suggested-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20220223012814.1898677-1-connoro@google.com
2022-02-28 14:17:10 +01:00
Kees Cook
617f55e207 lib: overflow: Convert to Kunit
Convert overflow unit tests to KUnit, for better integration into the
kernel self test framework. Includes a rename of test_overflow.c to
overflow_kunit.c, and CONFIG_TEST_OVERFLOW to CONFIG_OVERFLOW_KUNIT_TEST.

$ ./tools/testing/kunit/kunit.py run overflow
...
[14:33:51] Starting KUnit Kernel (1/1)...
[14:33:51] ============================================================
[14:33:51] ================== overflow (11 subtests) ==================
[14:33:51] [PASSED] u8_overflow_test
[14:33:51] [PASSED] s8_overflow_test
[14:33:51] [PASSED] u16_overflow_test
[14:33:51] [PASSED] s16_overflow_test
[14:33:51] [PASSED] u32_overflow_test
[14:33:51] [PASSED] s32_overflow_test
[14:33:51] [PASSED] u64_overflow_test
[14:33:51] [PASSED] s64_overflow_test
[14:33:51] [PASSED] overflow_shift_test
[14:33:51] [PASSED] overflow_allocation_test
[14:33:51] [PASSED] overflow_size_helpers_test
[14:33:51] ==================== [PASSED] overflow =====================
[14:33:51] ============================================================
[14:33:51] Testing complete. Passed: 11, Failed: 0, Crashed: 0, Skipped: 0, Errors: 0
[14:33:51] Elapsed time: 12.525s total, 0.001s configuring, 12.402s building, 0.101s running

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Co-developed-by: Vitor Massaru Iha <vitor@massaru.org>
Signed-off-by: Vitor Massaru Iha <vitor@massaru.org>
Link: https://lore.kernel.org/lkml/20200720224418.200495-1-vitor@massaru.org/
Co-developed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Link: https://lore.kernel.org/linux-kselftest/20210503211536.1384578-1-dlatypov@google.com/
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/lkml/CAKwvOdm62iA1dNiC6Q11UJ-MnTqtc4kXkm-ubPaFMK824_k0nw@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/lkml/CABVgOS=TWVh649_Vjo3wnMu9gZnq66gkV-LtGgsksAWMqc+MSA@mail.gmail.com
2022-02-27 09:29:02 -08:00
Andrey Konovalov
70effdc375 kasan: test: prevent cache merging in kmem_cache_double_destroy
With HW_TAGS KASAN and kasan.stacktrace=off, the cache created in the
kmem_cache_double_destroy() test might get merged with an existing one.
Thus, the first kmem_cache_destroy() call won't actually destroy it but
will only decrease the refcount.  This causes the test to fail.

Provide an empty constructor for the created cache to prevent the cache
from getting merged.

Link: https://lkml.kernel.org/r/b597bd434c49591d8af00ee3993a42c609dc9a59.1644346040.git.andreyknvl@google.com
Fixes: f98f966cd7 ("kasan: test: add test case for double-kmem_cache_destroy()")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
David Gow
5debe5bfa0 list: test: Add a test for list_entry_is_head()
The list_entry_is_head() macro was added[1] after the list KUnit tests,
so wasn't tested. Add a new KUnit test to complete the set.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e130816164e244b692921de49771eeb28205152d

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Daniel Latypov <dlatypov@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:39:01 -07:00
David Gow
37dc573c0a list: test: Add a test for list_is_head()
list_is_head() was added recently[1], and didn't have a KUnit test. The
implementation is trivial, so it's not a particularly exciting test, but
it'd be nice to get back to full coverage of the list functions.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/list.h?id=0425473037db40d9e322631f2d4dc6ef51f97e88

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Daniel Latypov <dlatypov@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:38:55 -07:00
David Gow
d7fd696c12 list: test: Add test for list_del_init_careful()
The list_del_init_careful() function was added[1] after the list KUnit
test. Add a very basic test to cover it.

Note that this test only covers the single-threaded behaviour (which
matches list_del_init()), as is already the case with the test for
list_empty_careful().

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6fe44d96fc1536af5b11cd859686453d1b7bfd1

Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:38:48 -07:00
Arnd Bergmann
967747bbc0 uaccess: remove CONFIG_SET_FS
There are no remaining callers of set_fs(), so CONFIG_SET_FS
can be removed globally, along with the thread_info field and
any references to it.

This turns access_ok() into a cheaper check against TASK_SIZE_MAX.

As CONFIG_SET_FS is now gone, drop all remaining references to
set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().

Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes
Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:06 +01:00
Arnd Bergmann
5a06fcb15b lib/test_lockup: fix kernel pointer check for separate address spaces
test_kernel_ptr() uses access_ok() to figure out if a given address
points to user space instead of kernel space. However on architectures
that set CONFIG_ALTERNATE_USER_ADDRESS_SPACE, a pointer can be valid
for both, and the check always fails because access_ok() returns true.

Make the check for user space pointers conditional on the type of
address space layout.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:06 +01:00
Arnd Bergmann
23fc539e81 uaccess: fix type mismatch warnings from access_ok()
On some architectures, access_ok() does not do any argument type
checking, so replacing the definition with a generic one causes
a few warnings for harmless issues that were never caught before.

Fix the ones that I found either through my own test builds or
that were reported by the 0-day bot.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00
Jakub Kicinski
aaa25a2fa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/mptcp/mptcp_join.sh
  34aa6e3bcc ("selftests: mptcp: add ip mptcp wrappers")

  857898eb4b ("selftests: mptcp: add missing join check")
  6ef84b1517 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/

drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
  fb7e76ea3f ("net/mlx5e: TC, Skip redundant ct clear actions")
  c63741b426 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")

  09bf979232 ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
  84ba8062e3 ("net/mlx5e: Test CT and SAMPLE on flow attr")
  efe6f961cd ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
  3b49a7edec ("net/mlx5e: TC, Reject rules with multiple CT actions")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 17:54:25 -08:00
Christophe Leroy
8484291132 vsprintf: Fix %pK with kptr_restrict == 0
Although kptr_restrict is set to 0 and the kernel is booted with
no_hash_pointers parameter, the content of /proc/vmallocinfo is
lacking the real addresses.

  / # cat /proc/vmallocinfo
  0x(ptrval)-0x(ptrval)    8192 load_module+0xc0c/0x2c0c pages=1 vmalloc
  0x(ptrval)-0x(ptrval)   12288 start_kernel+0x4e0/0x690 pages=2 vmalloc
  0x(ptrval)-0x(ptrval)   12288 start_kernel+0x4e0/0x690 pages=2 vmalloc
  0x(ptrval)-0x(ptrval)    8192 _mpic_map_mmio.constprop.0+0x20/0x44 phys=0x80041000 ioremap
  0x(ptrval)-0x(ptrval)   12288 _mpic_map_mmio.constprop.0+0x20/0x44 phys=0x80041000 ioremap
    ...

According to the documentation for /proc/sys/kernel/, %pK is
equivalent to %p when kptr_restrict is set to 0.

Fixes: 5ead723a20 ("lib/vsprintf: no_hash_pointers prints all addresses as unhashed")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/107476128e59bff11a309b5bf7579a1753a41aca.1645087605.git.christophe.leroy@csgroup.eu
2022-02-24 10:10:31 +01:00
Linus Torvalds
917bbdb107 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ITER_PIPE fix from Al Viro:
 "Fix for old sloppiness in pipe_buffer reuse"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  lib/iov_iter: initialize "flags" in new pipe_buffer
2022-02-22 10:31:53 -08:00
Jason A. Donenfeld
14c174633f random: remove unused tracepoints
These explicit tracepoints aren't really used and show sign of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-21 21:14:00 +01:00
Max Kellermann
9d2231c5d7 lib/iov_iter: initialize "flags" in new pipe_buffer
The functions copy_page_to_iter_pipe() and push_pipe() can both
allocate a new pipe_buffer, but the "flags" member initializer is
missing.

Fixes: 241699cd72 ("new iov_iter flavour: pipe-backed")
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
To: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-02-21 10:16:39 -05:00
Julian Braha
11c57c3ba9 ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
Resending this to properly add it to the patch tracker - thanks for letting
me know, Arnd :)

When ARM is enabled, and BITREVERSE is disabled,
Kbuild gives the following warning:

WARNING: unmet direct dependencies detected for HAVE_ARCH_BITREVERSE
  Depends on [n]: BITREVERSE [=n]
  Selected by [y]:
  - ARM [=y] && (CPU_32v7M [=n] || CPU_32v7 [=y]) && !CPU_32v6 [=n]

This is because ARM selects HAVE_ARCH_BITREVERSE
without selecting BITREVERSE, despite
HAVE_ARCH_BITREVERSE depending on BITREVERSE.

This unmet dependency bug was found by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-21 14:58:34 +00:00
Kees Cook
230f6fa2c1 overflow: Provide constant expression struct_size
There have been cases where struct_size() (or flex_array_size()) needs
to be calculated for an initializer, which requires it be a constant
expression. This is possible when the "count" argument is a constant
expression, so provide this ability for the helpers.

Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Tested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20220210010407.GA701603@embeddedor
2022-02-16 14:30:37 -08:00
Kees Cook
e1be43d9b5 overflow: Implement size_t saturating arithmetic helpers
In order to perform more open-coded replacements of common allocation
size arithmetic, the kernel needs saturating (SIZE_MAX) helpers for
multiplication, addition, and subtraction. For example, it is common in
allocators, especially on realloc, to add to an existing size:

    p = krealloc(map->patch,
                 sizeof(struct reg_sequence) * (map->patch_regs + num_regs),
                 GFP_KERNEL);

There is no existing saturating replacement for this calculation, and
just leaving the addition open coded inside array_size() could
potentially overflow as well. For example, an overflow in an expression
for a size_t argument might wrap to zero:

    array_size(anything, something_at_size_max + 1) == 0

Introduce size_mul(), size_add(), and size_sub() helpers that
implicitly promote arguments to size_t and saturated calculations for
use in allocations. With these helpers it is also possible to redefine
array_size(), array3_size(), flex_array_size(), and struct_size() in
terms of the new helpers.

As with the check_*_overflow() helpers, the new helpers use __must_check,
though what is really desired is a way to make sure that assignment is
only to a size_t lvalue. Without this, it's still possible to introduce
overflow/underflow via type conversion (i.e. from size_t to int).
Enforcing this will currently need to be left to static analysis or
future use of -Wconversion.

Additionally update the overflow unit tests to force runtime evaluation
for the pathological cases.

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Len Baker <len.baker@gmx.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-16 14:29:48 -08:00
Kees Cook
28e77cc1c0 fortify: Detect struct member overflows in memset() at compile-time
As done for memcpy(), also update memset() to use the same tightened
compile-time bounds checking under CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Kees Cook
938a000e3f fortify: Detect struct member overflows in memmove() at compile-time
As done for memcpy(), also update memmove() to use the same tightened
compile-time checks under CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Kees Cook
f68f2ff915 fortify: Detect struct member overflows in memcpy() at compile-time
memcpy() is dead; long live memcpy()

tl;dr: In order to eliminate a large class of common buffer overflow
flaws that continue to persist in the kernel, have memcpy() (under
CONFIG_FORTIFY_SOURCE) perform bounds checking of the destination struct
member when they have a known size. This would have caught all of the
memcpy()-related buffer write overflow flaws identified in at least the
last three years.

Background and analysis:

While stack-based buffer overflow flaws are largely mitigated by stack
canaries (and similar) features, heap-based buffer overflow flaws continue
to regularly appear in the kernel. Many classes of heap buffer overflows
are mitigated by FORTIFY_SOURCE when using the strcpy() family of
functions, but a significant number remain exposed through the memcpy()
family of functions.

At its core, FORTIFY_SOURCE uses the compiler's __builtin_object_size()
internal[0] to determine the available size at a target address based on
the compile-time known structure layout details. It operates in two
modes: outer bounds (0) and inner bounds (1). In mode 0, the size of the
enclosing structure is used. In mode 1, the size of the specific field
is used. For example:

	struct object {
		u16 scalar1;	/* 2 bytes */
		char array[6];	/* 6 bytes */
		u64 scalar2;	/* 8 bytes */
		u32 scalar3;	/* 4 bytes */
		u32 scalar4;	/* 4 bytes */
	} instance;

__builtin_object_size(instance.array, 0) == 22, since the remaining size
of the enclosing structure starting from "array" is 22 bytes (6 + 8 +
4 + 4).

__builtin_object_size(instance.array, 1) == 6, since the remaining size
of the specific field "array" is 6 bytes.

The initial implementation of FORTIFY_SOURCE used mode 0 because there
were many cases of both strcpy() and memcpy() functions being used to
write (or read) across multiple fields in a structure. For example,
it would catch this, which is writing 2 bytes beyond the end of
"instance":

	memcpy(&instance.array, data, 25);

While this didn't protect against overwriting adjacent fields in a given
structure, it would at least stop overflows from reaching beyond the
end of the structure into neighboring memory, and provided a meaningful
mitigation of a subset of buffer overflow flaws. However, many desirable
targets remain within the enclosing structure (for example function
pointers).

As it happened, there were very few cases of strcpy() family functions
intentionally writing beyond the end of a string buffer. Once all known
cases were removed from the kernel, the strcpy() family was tightened[1]
to use mode 1, providing greater mitigation coverage.

What remains is switching memcpy() to mode 1 as well, but making the
switch is much more difficult because of how frustrating it can be to
find existing "normal" uses of memcpy() that expect to write (or read)
across multiple fields. The root cause of the problem is that the C
language lacks a common pattern to indicate the intent of an author's
use of memcpy(), and is further complicated by the available compile-time
and run-time mitigation behaviors.

The FORTIFY_SOURCE mitigation comes in two halves: the compile-time half,
when both the buffer size _and_ the length of the copy is known, and the
run-time half, when only the buffer size is known. If neither size is
known, there is no bounds checking possible. At compile-time when the
compiler sees that a length will always exceed a known buffer size,
a warning can be deterministically emitted. For the run-time half,
the length is tested against the known size of the buffer, and the
overflowing operation is detected. (The performance overhead for these
tests is virtually zero.)

It is relatively easy to find compile-time false-positives since a warning
is always generated. Fixing the false positives, however, can be very
time-consuming as there are hundreds of instances. While it's possible
some over-read conditions could lead to kernel memory exposures, the bulk
of the risk comes from the run-time flaws where the length of a write
may end up being attacker-controlled and lead to an overflow.

Many of the compile-time false-positives take a form similar to this:

	memcpy(&instance.scalar2, data, sizeof(instance.scalar2) +
					sizeof(instance.scalar3));

and the run-time ones are similar, but lack a constant expression for the
size of the copy:

	memcpy(instance.array, data, length);

The former is meant to cover multiple fields (though its style has been
frowned upon more recently), but has been technically legal. Both lack
any expressivity in the C language about the author's _intent_ in a way
that a compiler can check when the length isn't known at compile time.
A comment doesn't work well because what's needed is something a compiler
can directly reason about. Is a given memcpy() call expected to overflow
into neighbors? Is it not? By using the new struct_group() macro, this
intent can be much more easily encoded.

It is not as easy to find the run-time false-positives since the code path
to exercise a seemingly out-of-bounds condition that is actually expected
may not be trivially reachable. Tightening the restrictions to block an
operation for a false positive will either potentially create a greater
flaw (if a copy is truncated by the mitigation), or destabilize the kernel
(e.g. with a BUG()), making things completely useless for the end user.

As a result, tightening the memcpy() restriction (when there is a
reasonable level of uncertainty of the number of false positives), needs
to first WARN() with no truncation. (Though any sufficiently paranoid
end-user can always opt to set the panic_on_warn=1 sysctl.) Once enough
development time has passed, the mitigation can be further intensified.
(Note that this patch is only the compile-time checking step, which is
a prerequisite to doing run-time checking, which will come in future
patches.)

Given the potential frustrations of weeding out all the false positives
when tightening the run-time checks, it is reasonable to wonder if these
changes would actually add meaningful protection. Looking at just the
last three years, there are 23 identified flaws with a CVE that mention
"buffer overflow", and 11 are memcpy()-related buffer overflows.

(For the remaining 12: 7 are array index overflows that would be
mitigated by systems built with CONFIG_UBSAN_BOUNDS=y: CVE-2019-0145,
CVE-2019-14835, CVE-2019-14896, CVE-2019-14897, CVE-2019-14901,
CVE-2019-17666, CVE-2021-28952. 2 are miscalculated allocation
sizes which could be mitigated with memory tagging: CVE-2019-16746,
CVE-2019-2181. 1 is an iovec buffer bug maybe mitigated by memory tagging:
CVE-2020-10742. 1 is a type confusion bug mitigated by stack canaries:
CVE-2020-10942. 1 is a string handling logic bug with no mitigation I'm
aware of: CVE-2021-28972.)

At my last count on an x86_64 allmodconfig build, there are 35,294
calls to memcpy(). With callers instrumented to report all places
where the buffer size is known but the length remains unknown (i.e. a
run-time bounds check is added), we can count how many new run-time
bounds checks are added when the destination and source arguments of
memcpy() are changed to use "mode 1" bounds checking: 1,276. This means
for the future run-time checking, there is a worst-case upper bounds
of 3.6% false positives to fix. In addition, there were around 150 new
compile-time warnings to evaluate and fix (which have now been fixed).

With this instrumentation it's also possible to compare the places where
the known 11 memcpy() flaw overflows manifested against the resulting
list of potential new run-time bounds checks, as a measure of potential
efficacy of the tightened mitigation. Much to my surprise, horror, and
delight, all 11 flaws would have been detected by the newly added run-time
bounds checks, making this a distinctly clear mitigation improvement: 100%
coverage for known memcpy() flaws, with a possible 2 orders of magnitude
gain in coverage over existing but undiscovered run-time dynamic length
flaws (i.e. 1265 newly covered sites in addition to the 11 known), against
only <4% of all memcpy() callers maybe gaining a false positive run-time
check, with only about 150 new compile-time instances needing evaluation.

Specifically these would have been mitigated:
CVE-2020-24490 https://git.kernel.org/linus/a2ec905d1e160a33b2e210e45ad30445ef26ce0e
CVE-2020-12654 https://git.kernel.org/linus/3a9b153c5591548612c3955c9600a98150c81875
CVE-2020-12653 https://git.kernel.org/linus/b70261a288ea4d2f4ac7cd04be08a9f0f2de4f4d
CVE-2019-14895 https://git.kernel.org/linus/3d94a4a8373bf5f45cf5f939e88b8354dbf2311b
CVE-2019-14816 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-14815 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-14814 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-10126 https://git.kernel.org/linus/69ae4f6aac1578575126319d3f55550e7e440449
CVE-2019-9500  https://git.kernel.org/linus/1b5e2423164b3670e8bc9174e4762d297990deff
no-CVE-yet     https://git.kernel.org/linus/130f634da1af649205f4a3dd86cbe5c126b57914
no-CVE-yet     https://git.kernel.org/linus/d10a87a3535cce2b890897914f5d0d83df669c63

To accelerate the review of potential run-time false positives, it's
also worth noting that it is possible to partially automate checking
by examining the memcpy() buffer argument to check for the destination
struct member having a neighboring array member. It is reasonable to
expect that the vast majority of run-time false positives would look like
the already evaluated and fixed compile-time false positives, where the
most common pattern is neighboring arrays. (And, FWIW, many of the
compile-time fixes were actual bugs, so it is reasonable to assume we'll
have similar cases of actual bugs getting fixed for run-time checks.)

Implementation:

Tighten the memcpy() destination buffer size checking to use the actual
("mode 1") target buffer size as the bounds check instead of their
enclosing structure's ("mode 0") size. Use a common inline for memcpy()
(and memmove() in a following patch), since all the tests are the
same. All new cross-field memcpy() uses must use the struct_group() macro
or similar to target a specific range of fields, so that FORTIFY_SOURCE
can reason about the size and safety of the copy.

For now, cross-member "mode 1" _read_ detection at compile-time will be
limited to W=1 builds, since it is, unfortunately, very common. As the
priority is solving write overflows, read overflows will be part of a
future phase (and can be fixed in parallel, for anyone wanting to look
at W=1 build output).

For run-time, the "mode 0" size checking and mitigation is left unchanged,
with "mode 1" to be added in stages. In this patch, no new run-time
checks are added. Future patches will first bounds-check writes,
and only perform a WARN() for now. This way any missed run-time false
positives can be flushed out over the coming several development cycles,
but system builders who have tested their workloads to be WARN()-free
can enable the panic_on_warn=1 sysctl to immediately gain a mitigation
against this class of buffer overflows. Once that is under way, run-time
bounds-checking of reads can be similarly enabled.

Related classes of flaws that will remain unmitigated:

- memcpy() with flexible array structures, as the compiler does not
  currently have visibility into the size of the trailing flexible
  array. These can be fixed in the future by refactoring such cases
  to use a new set of flexible array structure helpers to perform the
  common serialization/deserialization code patterns doing allocation
  and/or copying.

- memcpy() with raw pointers (e.g. void *, char *, etc), or otherwise
  having their buffer size unknown at compile time, have no good
  mitigation beyond memory tagging (and even that would only protect
  against inter-object overflow, not intra-object neighboring field
  overflows), or refactoring. Some kind of "fat pointer" solution is
  likely needed to gain proper size-of-buffer awareness. (e.g. see
  struct membuf)

- type confusion where a higher level type's allocation size does
  not match the resulting cast type eventually passed to a deeper
  memcpy() call where the compiler cannot see the true type. In
  theory, greater static analysis could catch these, and the use
  of -Warray-bounds will help find some of these.

[0] https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[1] https://git.kernel.org/linus/6a39e62abbafd1d58d1722f40c7d26ef379c6a2f

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Jakub Kicinski
5b91c5cc0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 17:29:56 -08:00
Andy Shevchenko
f74a08fc61 vsprintf: Move space out of string literals in fourcc_string()
The literals "big-endian" and "little-endian" may be potentially
occurred in other places. Dropping space allows linker to
merge them by using only a single copy.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220127181233.72910-2-andriy.shevchenko@linux.intel.com
2022-02-10 13:16:50 +01:00
Andy Shevchenko
d75b26f880 vsprintf: Fix potential unaligned access
The %p4cc specifier in some cases might get an unaligned pointer.
Due to this we need to make copy to local variable once to avoid
potential crashes on some architectures due to improper access.

Fixes: af612e43de ("lib/vsprintf: Add support for printing V4L2 and DRM fourccs")
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220127181233.72910-1-andriy.shevchenko@linux.intel.com
2022-02-10 13:16:50 +01:00
Jakub Kicinski
1127170d45 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-02-09

We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).

The main changes are:

1) Add custom BPF allocator for JITs that pack multiple programs into a huge
   page to reduce iTLB pressure, from Song Liu.

2) Add __user tagging support in vmlinux BTF and utilize it from BPF
   verifier when generating loads, from Yonghong Song.

3) Add per-socket fast path check guarding from cgroup/BPF overhead when
   used by only some sockets, from Pavel Begunkov.

4) Continued libbpf deprecation work of APIs/features and removal of their
   usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
   and various others.

5) Improve BPF instruction set documentation by adding byte swap
   instructions and cleaning up load/store section, from Christoph Hellwig.

6) Switch BPF preload infra to light skeleton and remove libbpf dependency
   from it, from Alexei Starovoitov.

7) Fix architecture-agnostic macros in libbpf for accessing syscall
   arguments from BPF progs for non-x86 architectures,
   from Ilya Leoshkevich.

8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
   of 16-bit field with anonymous zero padding, from Jakub Sitnicki.

9) Add new bpf_copy_from_user_task() helper to read memory from a different
   task than current. Add ability to create sleepable BPF iterator progs,
   from Kenny Yu.

10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
    utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.

11) Generate temporary netns names for BPF selftests to avoid naming
    collisions, from Hangbin Liu.

12) Implement bpf_core_types_are_compat() with limited recursion for
    in-kernel usage, from Matteo Croce.

13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
    to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.

14) Misc minor fixes to libbpf and selftests from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
  selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
  bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
  libbpf: Fix compilation warning due to mismatched printf format
  selftests/bpf: Test BPF_KPROBE_SYSCALL macro
  libbpf: Add BPF_KPROBE_SYSCALL macro
  libbpf: Fix accessing the first syscall argument on s390
  libbpf: Fix accessing the first syscall argument on arm64
  libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
  selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
  libbpf: Fix accessing syscall arguments on riscv
  libbpf: Fix riscv register names
  libbpf: Fix accessing syscall arguments on powerpc
  selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
  libbpf: Add PT_REGS_SYSCALL_REGS macro
  selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
  bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
  bpf: Fix leftover header->pages in sparc and powerpc code.
  libbpf: Fix signedness bug in btf_dump_array_data()
  selftests/bpf: Do not export subtest as standalone test
  bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
  ...
====================

Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 18:40:56 -08:00
Kees Cook
8e7c8ca6b9 test_overflow: Regularize test reporting output
Report test run summaries more regularly, so it's easier to understand
the output:
- Remove noisy "ok" reports for shift and allocator tests.
- Reorganize per-type output to the end of each type's tests.
- Replace redundant vmalloc tests with __vmalloc so that __GFP_NO_WARN
  can be used to keep the expected failure warnings out of dmesg,
  similar to commit 8e060c21ae ("lib/test_overflow.c: avoid tainting
  the kernel and fix wrap size")

Resulting output:

  test_overflow: 18 u8 arithmetic tests finished
  test_overflow: 19 s8 arithmetic tests finished
  test_overflow: 17 u16 arithmetic tests finished
  test_overflow: 17 s16 arithmetic tests finished
  test_overflow: 17 u32 arithmetic tests finished
  test_overflow: 17 s32 arithmetic tests finished
  test_overflow: 17 u64 arithmetic tests finished
  test_overflow: 21 s64 arithmetic tests finished
  test_overflow: 113 shift tests finished
  test_overflow: 17 overflow size helper tests finished
  test_overflow: 11 allocation overflow tests finished
  test_overflow: all tests passed

Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/all/eb6d02ae-e2ed-e7bd-c700-8a6d004d84ce@rasmusvillemoes.dk/
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/all/CAKwvOdnYYa+72VhtJ4ug=SJVFn7w+n7Th+hKYE87BRDt4hvqOg@mail.gmail.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-09 14:33:41 -08:00
Dan Williams
3c5b903955 cxl: Prove CXL locking
When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets
an additional mutex that is not clobbered by
lockdep_set_novalidate_class() like the typical device_lock(). This
allows for local annotation of subsystem locks with mutex_lock_nested()
per the subsystem's object/lock hierarchy. For CXL, this primarily needs
the ability to lock ports by depth and child objects of ports by their
parent parent-port lock.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
John Garry
3f607293b7 sbitmap: Delete old sbitmap_queue_get_shallow()
Since __sbitmap_queue_get_shallow() was introduced in commit c05e667337
("sbitmap: add sbitmap_get_shallow() operation"), it has not been used.

Delete __sbitmap_queue_get_shallow() and rename public
__sbitmap_queue_get_shallow() -> sbitmap_queue_get_shallow() as it is odd
to have public __foo but no foo at all.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1644322024-105340-1-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08 06:55:49 -07:00
Ming Lei
3301bc5335 lib/sbitmap: kill 'depth' from sbitmap_word
Only the last sbitmap_word can have different depth, and all the others
must have same depth of 1U << sb->shift, so not necessary to store it in
sbitmap_word, and it can be retrieved easily and efficiently by adding
one internal helper of __map_depth(sb, index).

Remove 'depth' field from sbitmap_word, then the annotation of
____cacheline_aligned_in_smp for 'word' isn't needed any more.

Not see performance effect when running high parallel IOPS test on
null_blk.

This way saves us one cacheline(usually 64 words) per each sbitmap_word.

Cc: Martin Wilck <martin.wilck@suse.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220110072945.347535-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08 06:54:50 -07:00
Eric Dumazet
c2d1e3df4a ref_tracker: remove filter_irq_stacks() call
After commit e940066089 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-06 11:05:28 +00:00
Eric Dumazet
8fd5522f44 ref_tracker: add a count of untracked references
We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-05 15:22:44 +00:00
Eric Dumazet
e3ececfe66 ref_tracker: implement use-after-free detection
Whenever ref_tracker_dir_init() is called, mark the struct ref_tracker_dir
as dead.

Test the dead status from ref_tracker_alloc() and ref_tracker_free()

This should detect buggy dev_put()/dev_hold() happening too late
in netdevice dismantle process.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-05 15:22:44 +00:00
Jason A. Donenfeld
d2a02e3c8b lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI
blake2s_compress_generic is weakly aliased by blake2s_compress. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".

[    0.000000][    T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[    0.000000][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[    0.000000][    T0] Hardware name: MT6873 (DT)
[    0.000000][    T0] Call trace:
[    0.000000][    T0]  dump_backtrace+0xfc/0x1dc
[    0.000000][    T0]  dump_stack_lvl+0xa8/0x11c
[    0.000000][    T0]  panic+0x194/0x464
[    0.000000][    T0]  __cfi_check_fail+0x54/0x58
[    0.000000][    T0]  __cfi_slowpath_diag+0x354/0x4b0
[    0.000000][    T0]  blake2s_update+0x14c/0x178
[    0.000000][    T0]  _extract_entropy+0xf4/0x29c
[    0.000000][    T0]  crng_initialize_primary+0x24/0x94
[    0.000000][    T0]  rand_initialize+0x2c/0x6c
[    0.000000][    T0]  start_kernel+0x2f8/0x65c
[    0.000000][    T0]  __primary_switched+0xc4/0x7be4
[    0.000000][    T0] Rebooting in 5 seconds..

Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.

In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.

Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Link: https://github.com/ClangBuiltLinux/linux/issues/1567
Reported-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-04 19:22:32 +01:00
Nathan Chancellor
42d9b379e3 lib/Kconfig.debug: Allow BTF + DWARF5 with pahole 1.21+
Commit 98cd6f521f ("Kconfig: allow explicit opt in to DWARF v5")
prevented CONFIG_DEBUG_INFO_DWARF5 from being selected when
CONFIG_DEBUG_INFO_BTF is enabled because pahole had issues with clang's
DWARF5 info. This was resolved by [1], which is in pahole v1.21.

Allow DEBUG_INFO_DWARF5 to be selected with DEBUG_INFO_BTF when using
pahole v1.21 or newer.

[1]: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=7d8e829f636f47aba2e1b6eda57e74d8e31f733c

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220201205624.652313-6-nathan@kernel.org
2022-02-02 11:19:34 +01:00
Nathan Chancellor
6323c81350 lib/Kconfig.debug: Use CONFIG_PAHOLE_VERSION
Now that CONFIG_PAHOLE_VERSION exists, use it in the definition of
CONFIG_PAHOLE_HAS_SPLIT_BTF and CONFIG_PAHOLE_HAS_BTF_TAG to reduce the
amount of duplication across the tree.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220201205624.652313-5-nathan@kernel.org
2022-02-02 11:19:33 +01:00
Daniel Latypov
2b6861e237 kunit: factor out str constants from binary assertion structs
If the compiler doesn't optimize them away, each kunit assertion (use of
KUNIT_EXPECT_EQ, etc.) can use 88 bytes of stack space in the worst and
most common case. This has led to compiler warnings and a suggestion
from Linus to move data from the structs into static const's where
possible [1].

This builds upon [2] which did so for the base struct kunit_assert type.
That only reduced sizeof(struct kunit_binary_assert) from 88 to 64.

Given these are by far the most commonly used asserts, this patch
factors out the textual representations of the operands and comparator
into another static const, saving 16 more bytes.

In detail, KUNIT_EXPECT_EQ(test, 2 + 2, 5) yields the following struct
  (struct kunit_binary_assert) {
    .assert = <struct kunit_assert>,
    .operation = "==",
    .left_text = "2 + 2",
    .left_value = 4,
    .right_text = "5",
    .right_value = 5,
  }
After this change
  static const struct kunit_binary_assert_text __text = {
    .operation = "==",
    .left_text = "2 + 2",
    .right_text = "5",
  };
  (struct kunit_binary_assert) {
    .assert = <struct kunit_assert>,
    .text = &__text,
    .left_value = 4,
    .right_value = 5,
  }

This also DRYs the code a bit more since these str fields were repeated
for the string and pointer versions of kunit_binary_assert.

Note: we could name the kunit_binary_assert_text fields left/right
instead of left_text/right_text. But that would require changing the
macros a bit since they have args called "left" and "right" which would
be substituted in `.left = #left` as `.2 + 2 = \"2 + 2\"`.

[1] https://groups.google.com/g/kunit-dev/c/i3fZXgvBrfA/m/VULQg1z6BAAJ
[2] https://lore.kernel.org/linux-kselftest/20220113165931.451305-6-dlatypov@google.com/

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-31 11:55:39 -07:00
Daniel Latypov
6419abb80e kunit: remove va_format from kunit_assert
The concern is that having a lot of redundant fields in kunit_assert can
blow up stack usage if the compiler doesn't optimize them away [1].

The comment on this field implies that it was meant to be initialized
when the expect/assert was declared, but this only happens when we run
kunit_do_failed_assertion().

We don't need to access it outside of that function, so move it out of
the struct and make it a local variable there.

This change also takes the chance to reduce the number of macros by
inlining the now simplified KUNIT_INIT_ASSERT_STRUCT() macro.

[1] https://groups.google.com/g/kunit-dev/c/i3fZXgvBrfA/m/VULQg1z6BAAJ

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-31 11:55:27 -07:00
Kevin Bracey
1b3dce8b8a lib/crc32test: correct printed bytes count
crc32c_le self test had a stray multiply by two inherited from
the crc32_le+crc32_be test loop.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-31 11:21:43 +11:00
Kevin Bracey
5cb29be47d lib/crc32: Make crc32_be weak for arch override
crc32_le and __crc32c_le can be overridden - extend this to crc32_be.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-31 11:21:43 +11:00
Kevin Bracey
163a4e7fa7 lib/crc32: remove unneeded casts
Casts were added in commit 8f243af42a ("sections: fix const sections
for crc32 table") to cope with the tables not being const. They are no
longer required since commit f5e38b9284 ("lib: crc32: constify crc32
lookup table").

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-31 11:21:43 +11:00
Marco Elver
09c6304e38 kasan: test: fix compatibility with FORTIFY_SOURCE
With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform
dynamic checks using __builtin_object_size(ptr), which when failed will
panic the kernel.

Because the KASAN test deliberately performs out-of-bounds operations,
the kernel panics with FORTIFY_SOURCE, for example:

 | kernel BUG at lib/string_helpers.c:910!
 | invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
 | CPU: 1 PID: 137 Comm: kunit_try_catch Tainted: G    B             5.16.0-rc3+ #3
 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 | RIP: 0010:fortify_panic+0x19/0x1b
 | ...
 | Call Trace:
 |  kmalloc_oob_in_memset.cold+0x16/0x16
 |  ...

Fix it by also hiding `ptr` from the optimizer, which will ensure that
__builtin_object_size() does not return a valid size, preventing
fortified string functions from panicking.

Link: https://lkml.kernel.org/r/20220124160744.1244685-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Nico Pache <npache@redhat.com>
Reviewed-by: Nico Pache <npache@redhat.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30 09:56:58 +02:00
Tianjia Zhang
eb90686d5d crypto: sm3 - create SM3 stand-alone library
Stand-alone implementation of the SM3 algorithm. It is designed
to have as little dependencies as possible. In other cases you
should generally use the hash APIs from include/crypto/hash.h.
Especially when hashing large amounts of data as those APIs may
be hw-accelerated. In the new SM3 stand-alone library,
sm3_transform() has also been optimized, instead of simply using
the code in sm3_generic.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-28 16:51:10 +11:00
Yonghong Song
7472d5a642 compiler_types: define __user as __attribute__((btf_type_tag("user")))
The __user attribute is currently mainly used by sparse for type checking.
The attribute indicates whether a memory access is in user memory address
space or not. Such information is important during tracing kernel
internal functions or data structures as accessing user memory often
has different mechanisms compared to accessing kernel memory. For example,
the perf-probe needs explicit command line specification to indicate a
particular argument or string in user-space memory ([1], [2], [3]).
Currently, vmlinux BTF is available in kernel with many distributions.
If __user attribute information is available in vmlinux BTF, the explicit
user memory access information from users will not be necessary as
the kernel can figure it out by itself with vmlinux BTF.

Besides the above possible use for perf/probe, another use case is
for bpf verifier. Currently, for bpf BPF_PROG_TYPE_TRACING type of bpf
programs, users can write direct code like
  p->m1->m2
and "p" could be a function parameter. Without __user information in BTF,
the verifier will assume p->m1 accessing kernel memory and will generate
normal loads. Let us say "p" actually tagged with __user in the source
code.  In such cases, p->m1 is actually accessing user memory and direct
load is not right and may produce incorrect result. For such cases,
bpf_probe_read_user() will be the correct way to read p->m1.

To support encoding __user information in BTF, a new attribute
  __attribute__((btf_type_tag("<arbitrary_string>")))
is implemented in clang ([4]). For example, if we have
  #define __user __attribute__((btf_type_tag("user")))
during kernel compilation, the attribute "user" information will
be preserved in dwarf. After pahole converting dwarf to BTF, __user
information will be available in vmlinux BTF.

The following is an example with latest upstream clang (clang14) and
pahole 1.23:

  [$ ~] cat test.c
  #define __user __attribute__((btf_type_tag("user")))
  int foo(int __user *arg) {
          return *arg;
  }
  [$ ~] clang -O2 -g -c test.c
  [$ ~] pahole -JV test.o
  ...
  [1] INT int size=4 nr_bits=32 encoding=SIGNED
  [2] TYPE_TAG user type_id=1
  [3] PTR (anon) type_id=2
  [4] FUNC_PROTO (anon) return=1 args=(3 arg)
  [5] FUNC foo type_id=4
  [$ ~]

You can see for the function argument "int __user *arg", its type is
described as
  PTR -> TYPE_TAG(user) -> INT
The kernel can use this information for bpf verification or other
use cases.

Current btf_type_tag is only supported in clang (>= clang14) and
pahole (>= 1.23). gcc support is also proposed and under development ([5]).

  [1] http://lkml.kernel.org/r/155789874562.26965.10836126971405890891.stgit@devnote2
  [2] http://lkml.kernel.org/r/155789872187.26965.4468456816590888687.stgit@devnote2
  [3] http://lkml.kernel.org/r/155789871009.26965.14167558859557329331.stgit@devnote2
  [4] https://reviews.llvm.org/D111199
  [5] https://lore.kernel.org/bpf/0cbeb2fb-1a18-f690-e360-24b1c90c2a91@fb.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220127154600.652613-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-27 12:03:46 -08:00
Laibin Qiu
10825410b9 blk-mq: Fix wrong wakeup batch configuration which will cause hang
Commit 180dccb0db ("blk-mq: fix tag_get wait task can't be
awakened") will recalculate wake_batch when incrementing or decrementing
active_queues to avoid wake_batch > hctx_max_depth. At the same time, in
order to not affect performance as much as possible, the minimum wakeup
batch is set to 4. But when the QD is small (such as QD=1), if inc or dec
active_queues increases wakeup batch, that can lead to a hang:

Fix this problem with the following strategies:
QD          :  >= 32 | < 32
---------------------------------
wakeup batch:  8~4   | 3~1

Fixes: 180dccb0db ("blk-mq: fix tag_get wait task can't be awakened")
Link: https://lore.kernel.org/linux-block/78cafe94-a787-e006-8851-69906f0c2128@huawei.com/T/#t
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Link: https://lore.kernel.org/r/20220127100047.1763746-1-qiulaibin@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-27 10:15:32 -07:00
Daniel Latypov
21957f90b2 kunit: split out part of kunit_assert into a static const
This is per Linus's suggestion in [1].

The issue there is that every KUNIT_EXPECT/KUNIT_ASSERT puts a
kunit_assert object onto the stack. Normally we rely on compilers to
elide this, but when that doesn't work out, this blows up the stack
usage of kunit test functions.

We can move some data off the stack by making it static.
This change introduces a new `struct kunit_loc` to hold the file and
line number and then just passing assert_type (EXPECT or ASSERT) as an
argument.

In [1], it was suggested to also move out the format string as well, but
users could theoretically craft a format string at runtime, so we can't.

This change leaves a copy of `assert_type` in kunit_assert for now
because cleaning up all the macros to not pass it around is a bit more
involved.

Here's an example of the expanded code for KUNIT_FAIL():
if (__builtin_expect(!!(!(false)), 0)) {
  static const struct kunit_loc loc = { .file = ... };
  struct kunit_fail_assert __assertion = { .assert = { .type ...  };
  kunit_do_failed_assertion(test, &loc, KUNIT_EXPECTATION, &__assertion.assert, ...);
};

[1] https://groups.google.com/g/kunit-dev/c/i3fZXgvBrfA/m/VULQg1z6BAAJ

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-25 12:49:59 -07:00
Daniel Latypov
dd640d7087 kunit: factor out kunit_base_assert_format() call into kunit_fail()
We call this function first thing for all the assertion `format()`
functions.
This is the part that prints the file and line number and assertion type
(EXPECTATION, ASSERTION).

Having it as part of the format functions lets us have the flexibility
to not print that information (or print it differently) for new
assertion types, but I think this we don't need that.

And in the future, we'd like to consider factoring that data (file,
line#, type) out of the kunit_assert struct and into a `static`
variable, as Linus suggested [1], so we'd need to extract it anyways.

[1] https://groups.google.com/g/kunit-dev/c/i3fZXgvBrfA/m/VULQg1z6BAAJ

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-25 12:49:53 -07:00
Daniel Latypov
4fdacef8ac kunit: move check if assertion passed into the macros
Currently the code always calls kunit_do_assertion() even though it does
nothing when `pass` is true.

This change moves the `if(!(pass))` check into the macro instead
and renames the function to kunit_do_failed_assertion().
I feel this a bit easier to read and understand.

This has the potential upside of avoiding a function call that does
nothing most of the time (assuming your tests are passing) but comes
with the downside of generating a bit more code and branches. We try to
mitigate the branches by tagging them with `unlikely()`.

This also means we don't have to initialize structs that we don't need,
which will become a tiny bit more expensive if we switch over to using
static variables to try and reduce stack usage. (There's runtime code
to check if the variable has been initialized yet or not).

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-25 12:49:40 -07:00
Daniel Latypov
7b3391057f kunit: add example test case showing off all the expect macros
Currently, these macros are only really documented near the bottom of
https://www.kernel.org/doc/html/latest/dev-tools/kunit/api/test.html#c.KUNIT_FAIL.

E.g. it's likely someone might just not realize that
KUNIT_EXPECT_STREQ() exists and instead use KUNIT_EXPECT_FALSE(strcmp())
or similar.

This can also serve as a basic smoketest that the KUnit assert machinery
still works for all the macros.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-01-25 12:49:20 -07:00
Linus Torvalds
3689f9f8b0 bitmap patches for 5.17-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y
 b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC
 A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s
 iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE
 m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z
 9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7
 MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC
 nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY
 CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk
 5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA==
 =RKW4
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - introduce for_each_set_bitrange()

 - use find_first_*_bit() instead of find_next_*_bit() where possible

 - unify for_each_bit() macros

* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
  vsprintf: rework bitmap_list_string
  lib: bitmap: add performance test for bitmap_print_to_pagebuf
  bitmap: unify find_bit operations
  mm/percpu: micro-optimize pcpu_is_populated()
  Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
  find: micro-optimize for_each_{set,clear}_bit()
  include/linux: move for_each_bit() macros from bitops.h to find.h
  cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
  tools: sync tools/bitmap with mother linux
  all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
  cpumask: use find_first_and_bit()
  lib: add find_first_and_bit()
  arch: remove GENERIC_FIND_FIRST_BIT entirely
  include: move find.h from asm_generic to linux
  bitops: move find_bit_*_le functions from le.h to find.h
  bitops: protect find_first_{,zero}_bit properly
2022-01-23 06:20:44 +02:00
Marco Elver
e940066089 lib/stackdepot: always do filter_irq_stacks() in stack_depot_save()
The non-interrupt portion of interrupt stack traces before interrupt
entry is usually arbitrary.  Therefore, saving stack traces of
interrupts (that include entries before interrupt entry) to stack depot
leads to unbounded stackdepot growth.

As such, use of filter_irq_stacks() is a requirement to ensure
stackdepot can efficiently deduplicate interrupt stacks.

Looking through all current users of stack_depot_save(), none (except
KASAN) pass the stack trace through filter_irq_stacks() before passing
it on to stack_depot_save().

Rather than adding filter_irq_stacks() to all current users of
stack_depot_save(), it became clear that stack_depot_save() should
simply do filter_irq_stacks().

Link: https://lkml.kernel.org/r/20211130095727.2378739-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:38 +02:00
Vlastimil Babka
2dba5eb1c7 lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be
allocated from memblock, even if stack depot ends up not actually used.
The default size of stack_table is 4MB on 32-bit, 8MB on 64-bit.

This is fine for use-cases such as KASAN which is also a config option
and has overhead on its own.  But it's an issue for functionality that
has to be actually enabled on boot (page_owner) or depends on hardware
(GPU drivers) and thus the memory might be wasted.  This was raised as
an issue [1] when attempting to add stackdepot support for SLUB's debug
object tracking functionality.  It's common to build kernels with
CONFIG_SLUB_DEBUG and enable slub_debug on boot only when needed, or
create only specific kmem caches with debugging for testing purposes.

It would thus be more efficient if stackdepot's table was allocated only
when actually going to be used.  This patch thus makes the allocation
(and whole stack_depot_init() call) optional:

 - Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
   well-defined point of allocation as part of mem_init(). Make
   CONFIG_KASAN select this flag.

 - Other users have to call stack_depot_init() as part of their own init
   when it's determined that stack depot will actually be used. This may
   depend on both config and runtime conditions. Convert current users
   which are page_owner and several in the DRM subsystem. Same will be
   done for SLUB later.

 - Because the init might now be called after the boot-time memblock
   allocation has given all memory to the buddy allocator, change
   stack_depot_init() to allocate stack_table with kvmalloc() when
   memblock is no longer available. Also handle allocation failure by
   disabling stackdepot (could have theoretically happened even with
   memblock allocation previously), and don't unnecessarily align the
   memblock allocation to its own size anymore.

[1] https://lore.kernel.org/all/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/

Link: https://lkml.kernel.org/r/20211013073005.11351-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Marco Elver <elver@google.com> # stackdepot
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
From: Colin Ian King <colin.king@canonical.com>
Subject: lib/stackdepot: fix spelling mistake and grammar in pr_err message

There is a spelling mistake of the work allocation so fix this and
re-phrase the message to make it easier to read.

Link: https://lkml.kernel.org/r/20211015104159.11282-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup

On FLATMEM, we call page_ext_init_flatmem_late() just before
kmem_cache_init() which means stack_depot_init() (called by page owner
init) will not recognize properly it should use kvmalloc() and not
memblock_alloc().  memblock_alloc() will also not issue a warning and
return a block memory that can be invalid and cause kernel page fault when
saving stacks, as reported by the kernel test robot [1].

Fix this by moving page_ext_init_flatmem_late() below kmem_cache_init() so
that slab_is_available() is true during stack_depot_init().  SPARSEMEM
doesn't have this issue, as it doesn't do page_ext_init_flatmem_late(),
but a different page_ext_init() even later in the boot process.

Thanks to Mike Rapoport for pointing out the FLATMEM init ordering issue.

While at it, also actually resolve a checkpatch warning in stack_depot_init()
from DRM CI, which was supposed to be in the original patch already.

[1] https://lore.kernel.org/all/20211014085450.GC18719@xsang-OptiPlex-9020/

Link: https://lkml.kernel.org/r/6abd9213-19a9-6d58-cedc-2414386d2d81@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup3

Due to cd06ab2fd4 ("drm/locking: add backtrace for locking contended
locks without backoff") landing recently to -next adding a new stack depot
user in drivers/gpu/drm/drm_modeset_lock.c we need to add an appropriate
call to stack_depot_init() there as well.

Link: https://lkml.kernel.org/r/2a692365-cfa1-64f2-34e0-8aa5674dce5e@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup4

Due to 4e66934eaa ("lib: add reference counting tracking
infrastructure") landing recently to net-next adding a new stack depot
user in lib/ref_tracker.c we need to add an appropriate call to
stack_depot_init() there as well.

Link: https://lkml.kernel.org/r/45c1b738-1a2f-5b5f-2f6d-86fab206d01c@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Slab <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:37 +02:00
Luis Chamberlain
04bc883c98 test_sysctl: simplify subdirectory registration with register_sysctl()
There is no need to user boiler plate code to specify a set of base
directories we're going to stuff sysctls under.  Simplify this by using
register_sysctl() and specifying the directory path directly.

// pycocci sysctl-subdir-register-sysctl-simplify.cocci lib/test_sysctl.c

@c1@
expression E1;
identifier subdir, sysctls;
@@

static struct ctl_table subdir[] = {
	{
		.procname = E1,
		.maxlen = 0,
		.mode = 0555,
		.child = sysctls,
	},
	{ }
};

@c2@
identifier c1.subdir;

expression E2;
identifier base;
@@

static struct ctl_table base[] = {
	{
		.procname = E2,
		.maxlen = 0,
		.mode = 0555,
		.child = subdir,
	},
	{ }
};

@c3@
identifier c2.base;
identifier header;
@@

header = register_sysctl_table(base);

@r1 depends on c1 && c2 && c3@
expression c1.E1;
identifier c1.subdir, c1.sysctls;
@@

-static struct ctl_table subdir[] = {
-	{
-		.procname = E1,
-		.maxlen = 0,
-		.mode = 0555,
-		.child = sysctls,
-	},
-	{ }
-};

@r2 depends on c1 && c2 && c3@
identifier c1.subdir;

expression c2.E2;
identifier c2.base;
@@
-static struct ctl_table base[] = {
-	{
-		.procname = E2,
-		.maxlen = 0,
-		.mode = 0555,
-		.child = subdir,
-	},
-	{ }
-};

@initialize:python@
@@

def make_my_fresh_expression(s1, s2):
  return '"' + s1.strip('"') + "/" + s2.strip('"') + '"'

@r3 depends on c1 && c2 && c3@
expression c1.E1;
identifier c1.sysctls;
expression c2.E2;
identifier c2.base;
identifier c3.header;
fresh identifier E3 = script:python(E2, E1) { make_my_fresh_expression(E2, E1) };
@@

header =
-register_sysctl_table(base);
+register_sysctl(E3, sysctls);

Generated-by: Coccinelle SmPL
Link: https://lkml.kernel.org/r/20211123202422.819032-6-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Kees Cook <keescook@chromium.org>
Cc: Lukas Middendorf <kernel@tuxforce.de>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Qing Wang <wangqing@vivo.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 08:33:35 +02:00
Linus Torvalds
3c7c25038b block-5.17-2022-01-21
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmHqtecQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgph8iD/9nahzCdiPYRE+POHneiZbfaEnBEVFH7cz1
 rbEjiAR5EbkLxGZohEkIjbHuZyiF8cP6l8f1D5aEmqiFZfiuib8UOVURk9ZQdEMU
 lXnOhEuRopQnGGyzSs0yXdx8rZ8xvijmg2UDjwl/VZ4UMgkyD4NjFqNEjdXkmQPP
 pWWDkg4CQJIJ9jYeIKtfwijfeyi2LMkYniZFuwiYTAf+9Zt8OIrg7LtDkHulhMqk
 V/c5TSho9p22Hv0q6edQSbWhdm6QZ+MRz71Nsycr9cdvvO1jKoLKlcuXwlhqEB1q
 BMkwuJI4hhcauqKtwIqNIM+ulNj8HsPqRxP6n9b4RL017dhDLIrbeiOL0qG3PUNi
 VbC7EGvQIqTNp0zeyeIV3xM9jaBMbh+FpCqtzdT1ZKlPI4jOB89x7lXKpG30ixA2
 8nWXOiRE+UxXT96EbP6cLS/ykfvMiPqbVOSXdPl9d78R1j+xQVnBdMQoX2Yp/j1Y
 qN40Lp2mQgNJjkIiLOZxncx2xSx1/EVTDW1OPEm2Atv/NGxSK5vaN1P+X9DKB3e7
 pjpKHhvJuNy6c3yeJs5tyZrBu1zZl1dCMxC3fhK8XNTTWJ3zBiUxicDCsGN7YCwR
 5VJ+FbVATrzauBPtT7uQYRFnFePu1RxY5xTCdbg04hgGZmSSIqmJvZSpqp5Nn90s
 M0NbwyQrLg==
 =cebW
 -----END PGP SIGNATURE-----

Merge tag 'block-5.17-2022-01-21' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Various little minor fixes that should go into this release:

   - Fix issue with cloned bios and IO accounting (Christoph)

   - Remove redundant assignments (Colin, GuoYong)

   - Fix an issue with the mq-deadline async_depth sysfs interface (me)

   - Fix brd module loading race (Tetsuo)

   - Shared tag map wakeup fix (Laibin)

   - End of bdev read fix (OGAWA)

   - srcu leak fix (Ming)"

* tag 'block-5.17-2022-01-21' of git://git.kernel.dk/linux-block:
  block: fix async_depth sysfs interface for mq-deadline
  block: Fix wrong offset in bio_truncate()
  block: assign bi_bdev for cloned bios in blk_rq_prep_clone
  block: cleanup q->srcu
  block: Remove unnecessary variable assignment
  brd: remove brd_devices_mutex mutex
  aoe: remove redundant assignment on variable n
  loop: remove redundant initialization of pointer node
  blk-mq: fix tag_get wait task can't be awakened
2022-01-21 16:17:03 +02:00
Linus Torvalds
fa2e1ba3e9 Networking fixes for 5.17-rc1, including fixes from netfilter, bpf.
Current release - regressions:
 
  - fix memory leaks in the skb free deferral scheme if upper layer
    protocols are used, i.e. in-kernel TCP readers like TLS
 
 Current release - new code bugs:
 
  - nf_tables: fix NULL check typo in _clone() functions
 
  - change the default to y for Vertexcom vendor Kconfig
 
  - a couple of fixes to incorrect uses of ref tracking
 
  - two fixes for constifying netdev->dev_addr
 
 Previous releases - regressions:
 
  - bpf:
    - various verifier fixes mainly around register offset handling
      when passed to helper functions
    - fix mount source displayed for bpffs (none -> bpffs)
 
  - bonding:
    - fix extraction of ports for connection hash calculation
    - fix bond_xmit_broadcast return value when some devices are down
 
  - phy: marvell: add Marvell specific PHY loopback
 
  - sch_api: don't skip qdisc attach on ingress, prevent ref leak
 
  - htb: restore minimal packet size handling in rate control
 
  - sfp: fix high power modules without diagnostic monitoring
 
  - mscc: ocelot:
    - don't let phylink re-enable TX PAUSE on the NPI port
    - don't dereference NULL pointers with shared tc filters
 
  - smsc95xx: correct reset handling for LAN9514
 
  - cpsw: avoid alignment faults by taking NET_IP_ALIGN into account
 
  - phy: micrel: use kszphy_suspend/_resume for irq aware devices,
    avoid races with the interrupt
 
 Previous releases - always broken:
 
  - xdp: check prog type before updating BPF link
 
  - smc: resolve various races around abnormal connection termination
 
  - sit: allow encapsulated IPv6 traffic to be delivered locally
 
  - axienet: fix init/reset handling, add missing barriers,
    read the right status words, stop queues correctly
 
  - add missing dev_put() in sock_timestamping_bind_phc()
 
 Misc:
 
  - ipv4: prevent accidentally passing RTO_ONLINK to
    ip_route_output_key_hash() by sanitizing flags
 
  - ipv4: avoid quadratic behavior in netns dismantle
 
  - stmmac: dwmac-oxnas: add support for OX810SE
 
  - fsl: xgmac_mdio: add workaround for erratum A-009885
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHoS14ACgkQMUZtbf5S
 IrtMQA/6AxhWuj2JsoNhvTzBCi4vkeo53rKU941bxOaST9Ow8dqDc7yAT8YeJU2B
 lGw6/pXx+Fm9twGsRkkQ0vX7piIk25vKzEwnlCYVVXLAnE+lPu9qFH49X1HO5Fwy
 K+frGDC524MrbJFb+UbZfJG4UitsyHoqc58Mp7ZNBe2gn12DcHotsiSJikzdd02F
 rzQZhvwRKsDS2prcIHdvVAxva380cn99mvaFqIPR9MemhWKOzVa3NfkiC3tSlhW/
 OphG3UuOfKCVdofYAO5/oXlVQcDKx0OD9Sr2q8aO0mlME0p0ounKz+LDcwkofaYQ
 pGeMY2pEAHujLyRewunrfaPv8/SIB/ulSPcyreoF28TTN20M+4onvgTHvVSyzLl7
 MA4kYH7tkPgOfbW8T573OFPdrqsy4WTrFPFovGqvDuiE8h65Pll/gTcAqsWjF/xw
 CmfmtICcsBwVGMLUzpUjKAWuB0/voa/sQUuQoxvQFsgCteuslm1suLY5EfSIhdu8
 nvhySJjPXRHicZQNflIwKTiOYYWls7yYVGe76u9hqjyD36peJXYjUjyyENIfLiFA
 0XclGIfSBMGWMGmxvGYIZDwGOKK0j+s0PipliXVjP2otLrPYUjma5Co37KW8SiSV
 9TT673FAXJNB0IJ7xiT7nRUZ/fjRrweP1glte/6d148J1Lf9MTQ=
 =XM4Y
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bpf.

  Quite a handful of old regression fixes but most of those are
  pre-5.16.

  Current release - regressions:

   - fix memory leaks in the skb free deferral scheme if upper layer
     protocols are used, i.e. in-kernel TCP readers like TLS

  Current release - new code bugs:

   - nf_tables: fix NULL check typo in _clone() functions

   - change the default to y for Vertexcom vendor Kconfig

   - a couple of fixes to incorrect uses of ref tracking

   - two fixes for constifying netdev->dev_addr

  Previous releases - regressions:

   - bpf:
      - various verifier fixes mainly around register offset handling
        when passed to helper functions
      - fix mount source displayed for bpffs (none -> bpffs)

   - bonding:
      - fix extraction of ports for connection hash calculation
      - fix bond_xmit_broadcast return value when some devices are down

   - phy: marvell: add Marvell specific PHY loopback

   - sch_api: don't skip qdisc attach on ingress, prevent ref leak

   - htb: restore minimal packet size handling in rate control

   - sfp: fix high power modules without diagnostic monitoring

   - mscc: ocelot:
      - don't let phylink re-enable TX PAUSE on the NPI port
      - don't dereference NULL pointers with shared tc filters

   - smsc95xx: correct reset handling for LAN9514

   - cpsw: avoid alignment faults by taking NET_IP_ALIGN into account

   - phy: micrel: use kszphy_suspend/_resume for irq aware devices,
     avoid races with the interrupt

  Previous releases - always broken:

   - xdp: check prog type before updating BPF link

   - smc: resolve various races around abnormal connection termination

   - sit: allow encapsulated IPv6 traffic to be delivered locally

   - axienet: fix init/reset handling, add missing barriers, read the
     right status words, stop queues correctly

   - add missing dev_put() in sock_timestamping_bind_phc()

  Misc:

   - ipv4: prevent accidentally passing RTO_ONLINK to
     ip_route_output_key_hash() by sanitizing flags

   - ipv4: avoid quadratic behavior in netns dismantle

   - stmmac: dwmac-oxnas: add support for OX810SE

   - fsl: xgmac_mdio: add workaround for erratum A-009885"

* tag 'net-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
  ipv4: add net_hash_mix() dispersion to fib_info_laddrhash keys
  ipv4: avoid quadratic behavior in netns dismantle
  net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module
  powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses
  dt-bindings: net: Document fsl,erratum-a009885
  net/fsl: xgmac_mdio: Add workaround for erratum A-009885
  net: mscc: ocelot: fix using match before it is set
  net: phy: micrel: use kszphy_suspend()/kszphy_resume for irq aware devices
  net: cpsw: avoid alignment faults by taking NET_IP_ALIGN into account
  nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind()
  net: axienet: increase default TX ring size to 128
  net: axienet: fix for TX busy handling
  net: axienet: fix number of TX ring slots for available check
  net: axienet: Fix TX ring slot available check
  net: axienet: limit minimum TX ring size
  net: axienet: add missing memory barriers
  net: axienet: reset core on initialization prior to MDIO access
  net: axienet: Wait for PhyRstCmplt after core reset
  net: axienet: increase reset timeout
  bpf, selftests: Add ringbuf memory type confusion test
  ...
2022-01-20 10:57:05 +02:00
Linus Torvalds
f4484d138b Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "55 patches.

  Subsystems affected by this patch series: percpu, procfs, sysctl,
  misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2,
  hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits)
  lib: remove redundant assignment to variable ret
  ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
  kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
  lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
  btrfs: use generic Kconfig option for 256kB page size limit
  arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
  configs: introduce debug.config for CI-like setup
  delayacct: track delays from memory compact
  Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
  delayacct: cleanup flags in struct task_delay_info and functions use it
  delayacct: fix incomplete disable operation when switch enable to disable
  delayacct: support swapin delay accounting for swapping without blkio
  panic: remove oops_id
  panic: use error_report_end tracepoint on warnings
  fs/adfs: remove unneeded variable make code cleaner
  FAT: use io_schedule_timeout() instead of congestion_wait()
  hfsplus: use struct_group_attr() for memcpy() region
  nilfs2: remove redundant pointer sbufs
  fs/binfmt_elf: use PT_LOAD p_align values for static PIE
  const_structs.checkpatch: add frequently used ops structs
  ...
2022-01-20 10:41:01 +02:00
Colin Ian King
b1e78ef3be lib: remove redundant assignment to variable ret
The variable ret is being assigned a value that is never read.  If the
for-loop is entered then ret is immediately re-assigned a new value.  If
the for-loop is not executed ret is never read.  The assignment is
redundant and can be removed.

Link: https://lkml.kernel.org/r/20211230134557.83633-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:55 +02:00
Kees Cook
69d0db01e2 ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
The object-size sanitizer is redundant to -Warray-bounds, and
inappropriately performs its checks at run-time when all information
needed for the evaluation is available at compile-time, making it quite
difficult to use:

  https://bugzilla.kernel.org/show_bug.cgi?id=214861

With -Warray-bounds almost enabled globally, it doesn't make sense to
keep this around.

Link: https://lkml.kernel.org/r/20211203235346.110809-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:55 +02:00
Marco Elver
bece04b5b4 kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
Until recent versions of GCC and Clang, it was not possible to disable
KCOV instrumentation via a function attribute.  The relevant function
attribute was introduced in 540540d06e ("kcov: add
__no_sanitize_coverage to fix noinstr for all architectures").

x86 was the first architecture to want a working noinstr, and at the
time no compiler support for the attribute existed yet.  Therefore,
commit 0f1441b44e ("objtool: Fix noinstr vs KCOV") introduced the
ability to NOP __sanitizer_cov_*() calls in .noinstr.text.

However, this doesn't work for other architectures like arm64 and s390
that want a working noinstr per ARCH_WANTS_NO_INSTR.

At the time of 0f1441b44e, we didn't yet have ARCH_WANTS_NO_INSTR,
but now we can move the Kconfig dependency checks to the generic KCOV
option.  KCOV will be available if:

	- architecture does not care about noinstr, OR
	- we have objtool support (like on x86), OR
	- GCC is 12.0 or newer, OR
	- Clang is 13.0 or newer.

Link: https://lkml.kernel.org/r/20211201152604.3984495-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:55 +02:00
Nathan Chancellor
bbd2e05fad lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
Commit b05fbcc36b ("btrfs: disable build on platforms having page size
256K") disabled btrfs for configurations that used a 256kB page size.
However, it did not fully solve the problem because CONFIG_TEST_KMOD
selects CONFIG_BTRFS, which does not account for the dependency.  This
results in a Kconfig warning and the failed BUILD_BUG_ON error
returning.

  WARNING: unmet direct dependencies detected for BTRFS_FS
    Depends on [n]: BLOCK [=y] && !PPC_256K_PAGES && !PAGE_SIZE_256KB [=y]
    Selected by [m]:
    - TEST_KMOD [=m] && RUNTIME_TESTING_MENU [=y] && m && MODULES [=y] && NETDEVICES [=y] && NET_CORE [=y] && INET [=y] && BLOCK [=y]

To resolve this, add CONFIG_PAGE_SIZE_LESS_THAN_256KB as a dependency of
CONFIG_TEST_KMOD so there is no more invalid configuration or build
errors.

Link: https://lkml.kernel.org/r/20211129230141.228085-4-nathan@kernel.org
Fixes: b05fbcc36b ("btrfs: disable build on platforms having page size 256K")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:55 +02:00
Andrey Konovalov
e073e5ef90 lib/test_meminit: destroy cache in kmem_cache_alloc_bulk() test
Make do_kmem_cache_size_bulk() destroy the cache it creates.

Link: https://lkml.kernel.org/r/aced20a94bf04159a139f0846e41d38a1537debb.1640018297.git.andreyknvl@google.com
Fixes: 03a9349ac0 ("lib/test_meminit: add a kmem_cache_alloc_bulk() test")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Isabella Basso
0acc968f35 test_hash.c: refactor into kunit
Use KUnit framework to make tests more easily integrable with CIs.  Even
though these tests are not yet properly written as unit tests this
change should help in debugging.

Also remove kernel messages (i.e.  through pr_info) as KUnit handles all
debugging output and let it handle module init and exit details.

Link: https://lkml.kernel.org/r/20211208183711.390454-6-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: David Gow <davidgow@google.com>
Co-developed-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Signed-off-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Co-developed-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Isabella Basso
88168bf35c lib/Kconfig.debug: properly split hash test kernel entries
Split TEST_HASH so that each entry only has one file.

Note that there's no stringhash test file, but actually
<linux/stringhash.h> tests are performed in lib/test_hash.c.

Link: https://lkml.kernel.org/r/20211208183711.390454-5-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Enzo Ferreira <ferreiraenzoa@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: kernel test robot <lkp@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Isabella Basso
5427d3d772 test_hash.c: split test_hash_init
Split up test_hash_init so that it calls each test more explicitly
insofar it is possible without rewriting the entire file.  This aims at
improving readability.

Split tests performed on string_or as they don't interfere with those
performed in hash_or.  Also separate pr_info calls about skipped tests
as they're not part of the tests themselves, but only warn about
(un)defined arch-specific hash functions.

Link: https://lkml.kernel.org/r/20211208183711.390454-4-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Enzo Ferreira <ferreiraenzoa@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: kernel test robot <lkp@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Isabella Basso
ae7880676b test_hash.c: split test_int_hash into arch-specific functions
Split the test_int_hash function to keep its mainloop separate from
arch-specific chunks, which are only compiled as needed.  This aims at
improving readability.

Link: https://lkml.kernel.org/r/20211208183711.390454-3-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Enzo Ferreira <ferreiraenzoa@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: kernel test robot <lkp@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Isabella Basso
fd0a146240 hash.h: remove unused define directive
Patch series "test_hash.c: refactor into KUnit", v3.

We refactored the lib/test_hash.c file into KUnit as part of the student
group LKCAMP [1] introductory hackathon for kernel development.

This test was pointed to our group by Daniel Latypov [2], so its full
conversion into a pure KUnit test was our goal in this patch series, but
we ran into many problems relating to it not being split as unit tests,
which complicated matters a bit, as the reasoning behind the original
tests is quite cryptic for those unfamiliar with hash implementations.

Some interesting developments we'd like to highlight are:

 - In patch 1/5 we noticed that there was an unused define directive
   that could be removed.

 - In patch 4/5 we noticed how stringhash and hash tests are all under
   the lib/test_hash.c file, which might cause some confusion, and we
   also broke those kernel config entries up.

Overall KUnit developments have been made in the other patches in this
series:

In patches 2/5, 3/5 and 5/5 we refactored the lib/test_hash.c file so as
to make it more compatible with the KUnit style, whilst preserving the
original idea of the maintainer who designed it (i.e.  George Spelvin),
which might be undesirable for unit tests, but we assume it is enough
for a first patch.

This patch (of 5):

Currently, there exist hash_32() and __hash_32() functions, which were
introduced in a patch [1] targeting architecture specific optimizations.
These functions can be overridden on a per-architecture basis to achieve
such optimizations.  They must set their corresponding define directive
(HAVE_ARCH_HASH_32 and HAVE_ARCH__HASH_32, respectively) so that header
files can deal with these overrides properly.

As the supported 32-bit architectures that have their own hash function
implementation (i.e.  m68k, Microblaze, H8/300, pa-risc) have only been
making use of the (more general) __hash_32() function (which only lacks
a right shift operation when compared to the hash_32() function), remove
the define directive corresponding to the arch-specific hash_32()
implementation.

[1] https://lore.kernel.org/lkml/20160525073311.5600.qmail@ns.sciencehorizons.net/

[akpm@linux-foundation.org: hash_32_generic() becomes hash_32()]

Link: https://lkml.kernel.org/r/20211208183711.390454-1-isabbasso@riseup.net
Link: https://lkml.kernel.org/r/20211208183711.390454-2-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: David Gow <davidgow@google.com>
Co-developed-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Signed-off-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Co-developed-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Zhen Lei
a31f9336ed lib/list_debug.c: print more list debugging context in __list_del_entry_valid()
Currently, the entry->prev and entry->next are considered to be valid as
long as they are not LIST_POISON{1|2}.  However, the memory may be
corrupted.  The prev->next is invalid probably because 'prev' is
invalid, not because prev->next's content is illegal.

Unfortunately, the printk and its subfunctions will modify the registers
that hold the 'prev' and 'next', and we don't see this valuable
information in the BUG context.

So print the contents of 'entry->prev' and 'entry->next'.

Here's an example:
  list_del corruption. prev->next should be c0ecbf74, but was c08410dc
  kernel BUG at lib/list_debug.c:53!
  ... ...
  PC is at __list_del_entry_valid+0x58/0x98
  LR is at __list_del_entry_valid+0x58/0x98
  psr: 60000093
  sp : c0ecbf30  ip : 00000000  fp : 00000001
  r10: c08410d0  r9 : 00000001  r8 : c0825e0c
  r7 : 20000013  r6 : c08410d0  r5 : c0ecbf74  r4 : c0ecbf74
  r3 : c0825d08  r2 : 00000000  r1 : df7ce6f4  r0 : 00000044
  ... ...
  Stack: (0xc0ecbf30 to 0xc0ecc000)
  bf20:                                     c0ecbf74 c0164fd0 c0ecbf70 c0165170
  bf40: c0eca000 c0840c00 c0840c00 c0824500 c0825e0c c0189bbc c088f404 60000013
  bf60: 60000013 c0e85100 000004ec 00000000 c0ebcdc0 c0ecbf74 c0ecbf74 c0825d08
  bf80: c0e807c0 c018965c 00000000 c013f2a0 c0e807c0 c013f154 00000000 00000000
  bfa0: 00000000 00000000 00000000 c01001b0 00000000 00000000 00000000 00000000
  bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
  (__list_del_entry_valid) from (__list_del_entry+0xc/0x20)
  (__list_del_entry) from (finish_swait+0x60/0x7c)
  (finish_swait) from (rcu_gp_kthread+0x560/0xa20)
  (rcu_gp_kthread) from (kthread+0x14c/0x15c)
  (kthread) from (ret_from_fork+0x14/0x24)

At first, I thought prev->next was overwritten.  Later, I carefully
analyzed the RCU code and the disassembly code.  The error occurred when
deleting a node from the list rcu_state.gp_wq.  The System.map shows
that the address of rcu_state is c0840c00.  Then I use gdb to obtain the
offset of rcu_state.gp_wq.task_list.

  (gdb) p &((struct rcu_state *)0)->gp_wq.task_list
  $1 = (struct list_head *) 0x4dc

Again:
  list_del corruption. prev->next should be c0ecbf74, but was c08410dc

  c08410dc = c0840c00 + 0x4dc = &rcu_state.gp_wq.task_list

Because rcu_state.gp_wq has at most one node, so I can guess that "prev
= &rcu_state.gp_wq.task_list".  But for other scenes, maybe I wasn't so
lucky, I cannot figure out the value of 'prev'.

Link: https://lkml.kernel.org/r/20211207025835.1909-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:53 +02:00
Alexey Dobriyan
70ac69928e kstrtox: uninline everything
I've made a mistake of looking into lib/kstrtox.o code generation.

The only function remotely performance critical is _parse_integer()
(via /proc/*/map_files/*), everything else is not.

Uninline everything, shrink lib/kstrtox.o by ~20 % !

Space savings on x86_64:

	add/remove: 0/0 grow/shrink: 0/23 up/down: 0/-1269 (-1269 !!!)
	Function                                     old     new   delta
	kstrtoull                                     16      13      -3
	kstrtouint                                    59      48     -11
	kstrtou8                                      60      49     -11
	kstrtou16                                     61      50     -11
	_kstrtoul                                     46      35     -11
	kstrtoull_from_user                           95      83     -12
	kstrtoul_from_user                            95      83     -12
	kstrtoll                                      93      80     -13
	kstrtouint_from_user                         124      83     -41
	kstrtou8_from_user                           125      83     -42
	kstrtou16_from_user                          126      83     -43
	kstrtos8                                     101      50     -51
	kstrtos16                                    102      51     -51
	kstrtoint                                    100      49     -51
	_kstrtol                                      93      35     -58
	kstrtobool_from_user                         156      75     -81
	kstrtoll_from_user                           165      83     -82
	kstrtol_from_user                            165      83     -82
	kstrtoint_from_user                          172      83     -89
	kstrtos8_from_user                           173      83     -90
	kstrtos16_from_user                          174      83     -91
	_parse_integer                               136      10    -126
	_kstrtoull                                   308     101    -207
	Total: Before=3421236, After=3419967, chg -0.04%

Link: https://lkml.kernel.org/r/YZDsFDhHst4m2Pnt@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:53 +02:00
Andy Shevchenko
22c033989c include/linux/unaligned: replace kernel.h with the necessary inclusions
When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

The rest of the changes are induced by the above and may not be split.

Link: https://lkml.kernel.org/r/20211209123823.20425-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>	[brcmfmac]
Acked-by: Kalle Valo <kvalo@kernel.org>
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Franky Lin <franky.lin@broadcom.com>
Cc: Hante Meuleman <hante.meuleman@broadcom.com>
Cc: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Cc: Wright Feng <wright.feng@infineon.com>
Cc: Chung-hsien Hsu <chung-hsien.hsu@infineon.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:53 +02:00
Jason A. Donenfeld
9a1536b093 lib/crypto: sha1: re-roll loops to reduce code size
With SHA-1 no longer being used for anything performance oriented, and
also soon to be phased out entirely, we can make up for the space added
by unrolled BLAKE2s by simply re-rolling SHA-1. Since SHA-1 is so much
more complex, re-rolling it more or less takes care of the code size
added by BLAKE2s. And eventually, hopefully we'll see SHA-1 removed
entirely from most small kernel builds.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-18 13:03:55 +01:00
Jason A. Donenfeld
d8d83d8ab0 lib/crypto: blake2s: move hmac construction into wireguard
Basically nobody should use blake2s in an HMAC construction; it already
has a keyed variant. But unfortunately for historical reasons, Noise,
used by WireGuard, uses HKDF quite strictly, which means we have to use
this. Because this really shouldn't be used by others, this commit moves
it into wireguard's noise.c locally, so that kernels that aren't using
WireGuard don't get this superfluous code baked in. On m68k systems,
this shaves off ~314 bytes.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-18 13:03:55 +01:00
Justin M. Forbes
e56e189855 lib/crypto: add prompts back to crypto libraries
Commit 6048fdcc5f ("lib/crypto: blake2s: include as built-in") took
away a number of prompt texts from other crypto libraries. This makes
values flip from built-in to module when oldconfig runs, and causes
problems when these crypto libs need to be built in for thingslike
BIG_KEYS.

Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
[Jason: - moved menu into submenu of lib/ instead of root menu
        - fixed chacha sub-dependencies for CONFIG_CRYPTO]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-18 13:03:55 +01:00
Linus Torvalds
35ce8ae9ae Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal/exit/ptrace updates from Eric Biederman:
 "This set of changes deletes some dead code, makes a lot of cleanups
  which hopefully make the code easier to follow, and fixes bugs found
  along the way.

  The end-game which I have not yet reached yet is for fatal signals
  that generate coredumps to be short-circuit deliverable from
  complete_signal, for force_siginfo_to_task not to require changing
  userspace configured signal delivery state, and for the ptrace stops
  to always happen in locations where we can guarantee on all
  architectures that the all of the registers are saved and available on
  the stack.

  Removal of profile_task_ext, profile_munmap, and profile_handoff_task
  are the big successes for dead code removal this round.

  A bunch of small bug fixes are included, as most of the issues
  reported were small enough that they would not affect bisection so I
  simply added the fixes and did not fold the fixes into the changes
  they were fixing.

  There was a bug that broke coredumps piped to systemd-coredump. I
  dropped the change that caused that bug and replaced it entirely with
  something much more restrained. Unfortunately that required some
  rebasing.

  Some successes after this set of changes: There are few enough calls
  to do_exit to audit in a reasonable amount of time. The lifetime of
  struct kthread now matches the lifetime of struct task, and the
  pointer to struct kthread is no longer stored in set_child_tid. The
  flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
  removed. Issues where task->exit_code was examined with
  signal->group_exit_code should been examined were fixed.

  There are several loosely related changes included because I am
  cleaning up and if I don't include them they will probably get lost.

  The original postings of these changes can be found at:
     https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org

  I trimmed back the last set of changes to only the obviously correct
  once. Simply because there was less time for review than I had hoped"

* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
  ptrace/m68k: Stop open coding ptrace_report_syscall
  ptrace: Remove unused regs argument from ptrace_report_syscall
  ptrace: Remove second setting of PT_SEIZED in ptrace_attach
  taskstats: Cleanup the use of task->exit_code
  exit: Use the correct exit_code in /proc/<pid>/stat
  exit: Fix the exit_code for wait_task_zombie
  exit: Coredumps reach do_group_exit
  exit: Remove profile_handoff_task
  exit: Remove profile_task_exit & profile_munmap
  signal: clean up kernel-doc comments
  signal: Remove the helper signal_group_exit
  signal: Rename group_exit_task group_exec_task
  coredump: Stop setting signal->group_exit_task
  signal: Remove SIGNAL_GROUP_COREDUMP
  signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
  signal: Make coredump handling explicit in complete_signal
  signal: Have prepare_signal detect coredumps using signal->core_state
  signal: Have the oom killer detect coredumps using signal->core_state
  exit: Move force_uaccess back into do_exit
  exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
  ...
2022-01-17 05:49:30 +02:00
Linus Torvalds
f56caedaf9 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "146 patches.

  Subsystems affected by this patch series: kthread, ia64, scripts,
  ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak,
  dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap,
  memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb,
  userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp,
  ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and
  damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits)
  mm/damon: hide kernel pointer from tracepoint event
  mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log
  mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging
  mm/damon/dbgfs: remove an unnecessary variable
  mm/damon: move the implementation of damon_insert_region to damon.h
  mm/damon: add access checking for hugetlb pages
  Docs/admin-guide/mm/damon/usage: update for schemes statistics
  mm/damon/dbgfs: support all DAMOS stats
  Docs/admin-guide/mm/damon/reclaim: document statistics parameters
  mm/damon/reclaim: provide reclamation statistics
  mm/damon/schemes: account how many times quota limit has exceeded
  mm/damon/schemes: account scheme actions that successfully applied
  mm/damon: remove a mistakenly added comment for a future feature
  Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts
  Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning
  Docs/admin-guide/mm/damon/usage: remove redundant information
  Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks
  mm/damon: convert macro functions to static inline functions
  mm/damon: modify damon_rand() macro to static inline function
  mm/damon: move damon_rand() definition into damon.h
  ...
2022-01-15 20:37:06 +02:00
Yury Norov
15325b4f76 vsprintf: rework bitmap_list_string
bitmap_list_string() is very ineffective when printing bitmaps with long
ranges of set bits because it calls find_next_bit for each bit in the
bitmap.  We can do better by detecting ranges of set bits.

In my environment, before/after is 943008/31008 ns.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15 08:47:31 -08:00
Yury Norov
db7313005e lib: bitmap: add performance test for bitmap_print_to_pagebuf
Functional tests for bitmap_print_to_pagebuf() are provided
in lib/test_printf.c. This patch adds performance test for
a case of fully set bitmap.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15 08:47:31 -08:00
Yury Norov
b5c7e7ec7d all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
find_first{,_zero}_bit is a more effective analogue of 'next' version if
start == 0. This patch replaces 'next' with 'first' where things look
trivial.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15 08:47:31 -08:00
Yury Norov
f68edc9297 lib: add find_first_and_bit()
Currently find_first_and_bit() is an alias to find_next_and_bit(). However,
it is widely used in cpumask, so it worth to optimize it. This patch adds
its own implementation for find_first_and_bit().

On x86_64 find_bit_benchmark says:

Before (#define find_first_and_bit(...) find_next_and_bit(..., 0):
Start testing find_bit() with random-filled bitmap
[  140.291468] find_first_and_bit:           46890919 ns,  32671 iterations
Start testing find_bit() with sparse bitmap
[  140.295028] find_first_and_bit:               7103 ns,      1 iterations

After:
Start testing find_bit() with random-filled bitmap
[  162.574907] find_first_and_bit:           25045813 ns,  32846 iterations
Start testing find_bit() with sparse bitmap
[  162.578458] find_first_and_bit:               4900 ns,      1 iterations

(Thanks to Alexey Klimov for thorough testing.)

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Alexey Klimov <aklimov@redhat.com>
2022-01-15 08:47:31 -08:00
Yury Norov
c126a53c27 arch: remove GENERIC_FIND_FIRST_BIT entirely
In 5.12 cycle we enabled GENERIC_FIND_FIRST_BIT config option for ARM64
and MIPS. It increased performance and shrunk .text size; and so far
I didn't receive any negative feedback on the change.

https://lore.kernel.org/linux-arch/20210225135700.1381396-1-yury.norov@gmail.com/

Now I think it's a good time to switch all architectures to use
find_{first,last}_bit() unconditionally, and so remove corresponding
config option.

The patch does't introduce functioal changes for arc, arm, arm64, mips,
m68k, s390 and x86, for other architectures I expect improvement both in
performance and .text size.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Alexander Lobakin <alobakin@pm.me> (mips)
Reviewed-by: Alexander Lobakin <alobakin@pm.me> (mips)
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15 08:47:31 -08:00
Alistair Popple
87c01d57fa mm/hmm.c: allow VM_MIXEDMAP to work with hmm_range_fault
hmm_range_fault() can be used instead of get_user_pages() for devices
which allow faulting however unlike get_user_pages() it will return an
error when used on a VM_MIXEDMAP range.

To make hmm_range_fault() more closely match get_user_pages() remove
this restriction.  This requires dealing with the !ARCH_HAS_PTE_SPECIAL
case in hmm_vma_handle_pte().  Rather than replicating the logic of
vm_normal_page() call it directly and do a check for the zero pfn
similar to what get_user_pages() currently does.

Also add a test to hmm selftest to verify functionality.

Link: https://lkml.kernel.org/r/20211104012001.2555676-1-apopple@nvidia.com
Fixes: da4c3c735e ("mm/hmm/mirror: helper to snapshot CPU page table")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:31 +02:00
Marco Elver
f98f966cd7 kasan: test: add test case for double-kmem_cache_destroy()
Add a test case for double-kmem_cache_destroy() detection.

Link: https://lkml.kernel.org/r/20211119142219.1519617-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:26 +02:00
Marco Elver
e5f4728767 kasan: test: add globals left-out-of-bounds test
Add a test checking that KASAN generic can also detect out-of-bounds
accesses to the left of globals.

Unfortunately it seems that GCC doesn't catch this (tested GCC 10, 11).
The main difference between GCC's globals redzoning and Clang's is that
GCC relies on using increased alignment to producing padding, where
Clang's redzoning implementation actually adds real data after the
global and doesn't rely on alignment to produce padding.  I believe this
is the main reason why GCC can't reliably catch globals out-of-bounds in
this case.

Given this is now a known issue, to avoid failing the whole test suite,
skip this test case with GCC.

Link: https://lkml.kernel.org/r/20211117130714.135656-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:26 +02:00
Laibin Qiu
180dccb0db blk-mq: fix tag_get wait task can't be awakened
In case of shared tags, there might be more than one hctx which
allocates from the same tags, and each hctx is limited to allocate at
most:
        hctx_max_depth = max((bt->sb.depth + users - 1) / users, 4U);

tag idle detection is lazy, and may be delayed for 30sec, so there
could be just one real active hctx(queue) but all others are actually
idle and still accounted as active because of the lazy idle detection.
Then if wake_batch is > hctx_max_depth, driver tag allocation may wait
forever on this real active hctx.

Fix this by recalculating wake_batch when inc or dec active_queues.

Fixes: 0d2602ca30 ("blk-mq: improve support for shared tags maps")
Suggested-by: Ming Lei <ming.lei@redhat.com>
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220113025536.1479653-1-qiulaibin@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-13 12:52:14 -07:00
Linus Torvalds
6020c204be Convert much of the page cache to use folios
This patchset stops just short of actually enabling large folios.
 It converts everything that I noticed needs to be converted, but there may
 still be places I've overlooked which still have page size assumptions.
 The big change here is using large entries in the page cache XArray
 instead of many small entries.  That only affects shmem for now, but
 it's a pretty big change for shmem since it changes where memory needs
 to be allocated (at split time instead of insertion).
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmHcraoACgkQDpNsjXcp
 gj7C3wgAl0cjtdVzTpkLmbnInsicW1m3thnbkSXYbpqRccFjpu2kEBGj31PT+oGz
 dzgXP7SNZ/VkFT+qWtmHSRF/J41B6f9bFojO81B2aQdpRiziU+5QbSbXbfUjwVhE
 GJF0WGSJtVqySKynXP/iYTEt2zj6BiVperAwIqzhZpPY7gNoyDgeRD34Xy5bQqdD
 ey6/Uwkh7oFHLEDcgxsEnyF0tUR3q+gpe5XZW1fb79p3crWw44xATc3UvKv8qCLC
 Rd4oHmKkOj4MvdiUxJEfXI+XxgrkQ8XRO70B+p6ZljhDaoDZYw7ullxA0gvlSpNX
 6pnjSQlKA1VQXsi6PMSt+9vf26XxaQ==
 =KeYZ
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.17' of git://git.infradead.org/users/willy/pagecache

Pull folio conversion updates from Matthew Wilcox:
 "Convert much of the page cache to use folios

  This stops just short of actually enabling large folios. It converts
  everything that I noticed needs to be converted, but there may still
  be places I've overlooked which still have page size assumptions.

  The big change here is using large entries in the page cache XArray
  instead of many small entries. That only affects shmem for now, but
  it's a pretty big change for shmem since it changes where memory needs
  to be allocated (at split time instead of insertion)"

* tag 'folio-5.17' of git://git.infradead.org/users/willy/pagecache: (49 commits)
  mm: Use multi-index entries in the page cache
  XArray: Add xas_advance()
  truncate,shmem: Handle truncates that split large folios
  truncate: Convert invalidate_inode_pages2_range to folios
  fs: Convert vfs_dedupe_file_range_compare to folios
  mm: Remove pagevec_remove_exceptionals()
  mm: Convert find_lock_entries() to use a folio_batch
  filemap: Return only folios from find_get_entries()
  filemap: Convert filemap_get_read_batch() to use a folio_batch
  filemap: Convert filemap_read() to use a folio
  truncate: Add invalidate_complete_folio2()
  truncate: Convert invalidate_inode_pages2_range() to use a folio
  truncate: Skip known-truncated indices
  truncate,shmem: Add truncate_inode_folio()
  shmem: Convert part of shmem_undo_range() to use a folio
  mm: Add unmap_mapping_folio()
  truncate: Add truncate_cleanup_folio()
  filemap: Add filemap_release_folio()
  filemap: Use a folio in filemap_page_mkwrite
  filemap: Use a folio in filemap_map_pages
  ...
2022-01-12 12:37:02 -08:00
Linus Torvalds
6dc69d3d0d driver core changes for 5.17-rc1
Here is the set of changes for the driver core for 5.17-rc1.
 
 Lots of little things here, including:
 	- kobj_type cleanups
 	- auxiliary_bus documentation updates
 	- auxiliary_device conversions for some drivers (relevant
 	  subsystems all have provided acks for these)
 	- kernfs lock contention reduction for some workloads
 	- other tiny cleanups and changes.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYd7deA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym8ngCgw0ANwrRPE5b1dthEmfU2f8Knk5kAn0pHQv6R
 VRZJypgNfU/Pt0ykstZD
 =CO9J
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of changes for the driver core for 5.17-rc1.

  Lots of little things here, including:

   - kobj_type cleanups

   - auxiliary_bus documentation updates

   - auxiliary_device conversions for some drivers (relevant subsystems
     all have provided acks for these)

   - kernfs lock contention reduction for some workloads

   - other tiny cleanups and changes.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (43 commits)
  kobject documentation: remove default_attrs information
  drivers/firmware: Add missing platform_device_put() in sysfb_create_simplefb
  debugfs: lockdown: Allow reading debugfs files that are not world readable
  driver core: Make bus notifiers in right order in really_probe()
  driver core: Move driver_sysfs_remove() after driver_sysfs_add()
  firmware: edd: remove empty default_attrs array
  firmware: dmi-sysfs: use default_groups in kobj_type
  qemu_fw_cfg: use default_groups in kobj_type
  firmware: memmap: use default_groups in kobj_type
  sh: sq: use default_groups in kobj_type
  headers/uninline: Uninline single-use function: kobject_has_children()
  devtmpfs: mount with noexec and nosuid
  driver core: Simplify async probe test code by using ktime_ms_delta()
  nilfs2: use default_groups in kobj_type
  kobject: remove kset from struct kset_uevent_ops callbacks
  driver core: make kobj_type constant.
  driver core: platform: document registration-failure requirement
  vdpa/mlx5: Use auxiliary_device driver data helpers
  net/mlx5e: Use auxiliary_device driver data helpers
  soundwire: intel: Use auxiliary_device driver data helpers
  ...
2022-01-12 11:11:34 -08:00
Linus Torvalds
e3084ed48f Pin control bulk changes for the v5.17 kernel cycle
Core changes:
 
 - New standard enumerator and corresponding device tree bindings
   for output impedance pin configuration. (Implemented and used
   in the Renesas rzg2l driver.)
 
 - Cleanup of Kconfig and Makefile to be somewhat orderly and
   alphabetic.
 
 New drivers:
 
 - Samsung Exynos 7885 pin controller.
 
 - Ocelot LAN966x pin controller.
 
 - Qualcomm SDX65 pin controller.
 
 - Qualcomm SM8450 pin controller.
 
 - Qualcomm PM8019, PM8226 and PM2250 pin controllers.
 
 - NXP/Freescale i.MXRT1050 pin controller.
 
 - Intel Thunder Bay pin controller.
 
 Enhancements:
 
 - Introduction of the string library helper function
   "kasprintf_strarray()" and subsequent use in Rockchip, ST and
   Armada pin control drivers, as well as the GPIO mockup driver.
 
 - The Ocelot pin controller has been extensively rewritten to
   use regmap and other modern kernel infrastructure.
 
 - The Microchip SGPIO driver has been converted to use regmap.
 
 - The SPEAr driver had been converted to use regmap.
 
 - Substantial cleanups and janitorial on the Apple pin control
   driver that was merged for v5.16.
 
 - Janitorial to remove of_node assignments in the GPIO portions
   that anyway get this handled in the GPIO core.
 
 - Minor cleanups and improvements in several pin controllers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmHetMgACgkQQRCzN7AZ
 XXOXUA/+I8nEdBy8oBa+vYsJp/FwQi9oh2r488Bin7kCEwYJjPKDDjZuIQQQz34H
 DcSpzBBB/sSFiO27F27rk70vHGfZ4pVi57XfRI2IB1qSe4uCNCNEURVDSM9aY7Nl
 hR973GS5VDvmyo/7zUT7dWmG2b9lxRqwU2wCvVJ7y69gQEwT74iR8b51ycziBNWt
 AEQ+BUN9oVEIM6aHs9+jGgD843XIFZMWoKuwoD51036/wFDLO3lQNyuMytZaQtSB
 q1epb51jl4tPhybWrWc+IoVp6BshIZs1m8+LhgRqLfJEj1znTZDXvAEuTuI3Y9BY
 lyyvGuKNbe6q1aD8Hfu3qiO8PfBrI+pNpOcdw84pG6IwBz4vfLmhzyMd8vTyqoK8
 DIlfYCiGJB0aqDBWhRyql8KM04/gSlEm2eZONsudNuMugvRIxU1IOBaKFwlP5Z98
 y2/mYo/wLnVFKZE6cLp3Lxjpv4ENRJ1HkQe5JQak1ulq+XkUL9f82p7oGMJ4lvoB
 iTOPTkuhhkiUYmwbb97VoqWTYwL+EptvsWto+Mv/glHy7OGXXJFTAD+ARpZc+c5I
 f1/mzQYujmVj91XUi9xSGnL07mNNPOiX3p+9q7Fy+A3Rk1x5n0t+7hvmiuv8paLv
 KNowhECllp0lBKns39tcn8BQvRufvxv2b+QvEqgUPVI3Qj8vEc4=
 =+nxh
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control bulk updates from Linus Walleij:
 "Core changes:

   - New standard enumerator and corresponding device tree bindings for
     output impedance pin configuration. (Implemented and used in the
     Renesas rzg2l driver.)

   - Cleanup of Kconfig and Makefile to be somewhat orderly and
     alphabetic.

  New drivers:

   - Samsung Exynos 7885 pin controller.

   - Ocelot LAN966x pin controller.

   - Qualcomm SDX65 pin controller.

   - Qualcomm SM8450 pin controller.

   - Qualcomm PM8019, PM8226 and PM2250 pin controllers.

   - NXP/Freescale i.MXRT1050 pin controller.

   - Intel Thunder Bay pin controller.

  Enhancements:

   - Introduction of the string library helper function
     "kasprintf_strarray()" and subsequent use in Rockchip, ST and
     Armada pin control drivers, as well as the GPIO mockup driver.

   - The Ocelot pin controller has been extensively rewritten to use
     regmap and other modern kernel infrastructure.

   - The Microchip SGPIO driver has been converted to use regmap.

   - The SPEAr driver had been converted to use regmap.

   - Substantial cleanups and janitorial on the Apple pin control driver
     that was merged for v5.16.

   - Janitorial to remove of_node assignments in the GPIO portions that
     anyway get this handled in the GPIO core.

   - Minor cleanups and improvements in several pin controllers"

* tag 'pinctrl-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (98 commits)
  pinctrl: imx: fix assigning groups names
  dt-bindings: pinctrl: mt8195: add wrapping node of pin configurations
  pinctrl: bcm: ns: use generic groups & functions helpers
  pinctrl: imx: fix allocation result check
  pinctrl: samsung: Use platform_get_irq_optional() to get the interrupt
  pinctrl: Propagate firmware node from a parent device
  dt-bindings: pinctrl: qcom: Add SDX65 pinctrl bindings
  pinctrl: add one more "const" for generic function groups
  pinctrl: keembay: rework loops looking for groups names
  pinctrl: keembay: comment process of building functions a bit
  pinctrl: imx: prepare for making "group_names" in "function_desc" const
  ARM: dts: gpio-ranges property is now required
  pinctrl: aspeed: fix unmet dependencies on MFD_SYSCON for PINCTRL_ASPEED
  pinctrl: Get rid of duplicate of_node assignment in the drivers
  pinctrl-sunxi: don't call pinctrl_gpio_direction()
  pinctrl-bcm2835: don't call pinctrl_gpio_direction()
  pinctrl: bcm2835: Silence uninit warning
  pinctrl: Sort Kconfig and Makefile entries alphabetically
  pinctrl: Add Intel Thunder Bay pinctrl driver
  dt-bindings: pinctrl: Add bindings for Intel Thunderbay pinctrl driver
  ...
2022-01-12 10:56:08 -08:00
Linus Torvalds
c9193f48e9 for-5.17/drivers-2022-01-11
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmHd8EIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnOKEADGpxp+Vntbm8nZI/PFP5fA2gUTZWgSVB4l
 axVTYW21pjSrsrAhGg2FIgBgL0tNkgxQnIPRn50YL8jT3pTkCEcR7kLbhEU7W/Ln
 7hrsBgFnsCBoCs38LvzXHZD69jtEtNRk1ijPMLo5iCcHkAyUVKa1glfeMwefuI5/
 Rl8SoueRXppvCfwNPptaAKiDsYVN8KCJPvvhlMNoKP5n1iTsNYJ/HVsLqfRnP0oc
 CR6eHaYceWGLER8tWtBlG2Qp40+cd/A320thkIlEpEKJPWE/ce5AUp0PYxVJbwjU
 qvO1tMYSya7gPiaVWRJcUeAgRFiivM/kTdDrGwiY9hpv/BQG7EAW5D9Xecz/M4UG
 BgNLfhe0aR9QssjPxITgyiy9sRpwwpnpoVONTu3slgXVTUVlOq0QT6LOTPR1B9A4
 ZjbHVCuI3eyrAOqD4IjYSqjHa6GjFLiKTh8Q0ZB/KJGX1eItLVLVdJfcfV4RkBIf
 6RZg9+7/mXaDxU74DZ2tfUhHT0sC5RS+5VFxpkhThVk9qRbVdZGGWAHcVOkMjk9B
 L4PCpJeuaR+rzXvCDOCOI5sHraa5F/IRhMaTu5sHj/MIuEpq1fqjaB7tWRvfm6HO
 4tepUtb++rS3/zFFQlZCLyjVk2o0p2b0viwPLjvsRqsBp1bVoO9mJIiyp6POmM3G
 UjxQS0vEDw==
 =k0IZ
 -----END PGP SIGNATURE-----

Merge tag 'for-5.17/drivers-2022-01-11' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - mtip32xx pci cleanups (Bjorn)

 - mtip32xx conversion to generic power management (Vaibhav)

 - rsxx pci powermanagement cleanups (Bjorn)

 - Remove the rsxx driver. This hardware never saw much adoption, and
   it's been end of lifed for a while. (Christoph)

 - MD pull request from Song:
      - REQ_NOWAIT support (Vishal Verma)
      - raid6 benchmark optimization (Dirk Müller)
      - Fix for acct bioset (Xiao Ni)
      - Clean up max_queued_requests (Mariusz Tkaczyk)
      - PREEMPT_RT optimization (Davidlohr Bueso)
      - Use default_groups in kobj_type (Greg Kroah-Hartman)

 - Use attribute groups in pktcdvd and rnbd (Greg)

 - NVMe pull request from Christoph:
      - increment request genctr on completion (Keith Busch, Geliang
        Tang)
      - add a 'iopolicy' module parameter (Hannes Reinecke)
      - print out valid arguments when reading from /dev/nvme-fabrics
        (Hannes Reinecke)

 - Use struct_group() in drbd (Kees)

 - null_blk fixes (Ming)

 - Get rid of congestion logic in pktcdvd (Neil)

 - Floppy ejection hang fix (Tasos)

 - Floppy max user request size fix (Xiongwei)

 - Loop locking fix (Tetsuo)

* tag 'for-5.17/drivers-2022-01-11' of git://git.kernel.dk/linux-block: (32 commits)
  md: use default_groups in kobj_type
  md: Move alloc/free acct bioset in to personality
  lib/raid6: Use strict priority ranking for pq gen() benchmarking
  lib/raid6: skip benchmark of non-chosen xor_syndrome functions
  md: fix spelling of "its"
  md: raid456 add nowait support
  md: raid10 add nowait support
  md: raid1 add nowait support
  md: add support for REQ_NOWAIT
  md: drop queue limitation for RAID1 and RAID10
  md/raid5: play nice with PREEMPT_RT
  block/rnbd-clt-sysfs: use default_groups in kobj_type
  pktcdvd: convert to use attribute groups
  block: null_blk: only set set->nr_maps as 3 if active poll_queues is > 0
  nvme: add 'iopolicy' module parameter
  nvme: drop unused variable ctrl in nvme_setup_cmd
  nvme: increment request genctr on completion
  nvme-fabrics: print out valid arguments when reading from /dev/nvme-fabrics
  block: remove the rsxx driver
  rsxx: Drop PCI legacy power management
  ...
2022-01-12 10:35:23 -08:00
Eric Dumazet
c12837d1bb ref_tracker: use __GFP_NOFAIL more carefully
syzbot was able to trigger this warning from new_slab()
		/*
		 * All existing users of the __GFP_NOFAIL are blockable, so warn
		 * of any new users that actually require GFP_NOWAIT
		 */
		if (WARN_ON_ONCE(!can_direct_reclaim))
			goto fail;

Indeed, we should use __GFP_NOFAIL if direct reclaim is possible.

Hopefully in the future we will be able to use SLAB_NOFAILSLAB
option so that syzbot can benefit from full ref_tracker
even in the presence of memory fault injections.

WARNING: CPU: 0 PID: 13 at mm/page_alloc.c:5081 __alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
Modules linked in:
CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
Code: 90 08 00 00 48 81 c7 d8 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 84 f0 ea ff ff e8 3d 82 09 00 e9 e6 ea ff ff 4d 89 fd <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 8b 54 24 30 48 c1 ea 03 80
RSP: 0018:ffffc90000d272b8 EFLAGS: 00010246

RAX: 0000000000000000 RBX: ffff88813fffc300 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88813fffc348
RBP: ffff88813fffc300 R08: 00000000000013dc R09: 00000000000013c8
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc90000d274e8 R14: dffffc0000000000 R15: ffffc90000d274e8
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffefe6000f8 CR3: 000000001d21e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __alloc_pages+0x412/0x500 mm/page_alloc.c:5382 mm/page_alloc.c:5382
 alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191 mm/mempolicy.c:2191
 alloc_slab_page mm/slub.c:1793 [inline]
 allocate_slab mm/slub.c:1938 [inline]
 alloc_slab_page mm/slub.c:1793 [inline] mm/slub.c:1993
 allocate_slab mm/slub.c:1938 [inline] mm/slub.c:1993
 new_slab+0x349/0x4a0 mm/slub.c:1993 mm/slub.c:1993
 ___slab_alloc+0x918/0xfe0 mm/slub.c:3022 mm/slub.c:3022
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109 mm/slub.c:3109
 slab_alloc_node mm/slub.c:3200 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 slab_alloc_node mm/slub.c:3200 [inline] mm/slub.c:3259
 slab_alloc mm/slub.c:3242 [inline] mm/slub.c:3259
 kmem_cache_alloc_trace+0x289/0x2c0 mm/slub.c:3259 mm/slub.c:3259
 kmalloc include/linux/slab.h:590 [inline]
 kzalloc include/linux/slab.h:724 [inline]
 kmalloc include/linux/slab.h:590 [inline] lib/ref_tracker.c:74
 kzalloc include/linux/slab.h:724 [inline] lib/ref_tracker.c:74
 ref_tracker_alloc+0xe1/0x430 lib/ref_tracker.c:74 lib/ref_tracker.c:74
 netdev_tracker_alloc include/linux/netdevice.h:3855 [inline]
 dev_hold_track include/linux/netdevice.h:3872 [inline]
 netdev_tracker_alloc include/linux/netdevice.h:3855 [inline] net/core/dst.c:52
 dev_hold_track include/linux/netdevice.h:3872 [inline] net/core/dst.c:52
 dst_init+0xe0/0x520 net/core/dst.c:52 net/core/dst.c:52
 dst_alloc+0x16b/0x1f0 net/core/dst.c:96 net/core/dst.c:96
 rt_dst_alloc+0x73/0x450 net/ipv4/route.c:1614 net/ipv4/route.c:1614
 ip_route_input_mc net/ipv4/route.c:1720 [inline]
 ip_route_input_mc net/ipv4/route.c:1720 [inline] net/ipv4/route.c:2465
 ip_route_input_rcu.part.0+0x4fe/0xcc0 net/ipv4/route.c:2465 net/ipv4/route.c:2465
 ip_route_input_rcu net/ipv4/route.c:2420 [inline]
 ip_route_input_rcu net/ipv4/route.c:2420 [inline] net/ipv4/route.c:2416
 ip_route_input_noref+0x1b8/0x2a0 net/ipv4/route.c:2416 net/ipv4/route.c:2416
 ip_rcv_finish_core.constprop.0+0x288/0x1e90 net/ipv4/ip_input.c:354 net/ipv4/ip_input.c:354
 ip_rcv_finish+0x135/0x2f0 net/ipv4/ip_input.c:427 net/ipv4/ip_input.c:427
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 NF_HOOK include/linux/netfilter.h:307 [inline] net/ipv4/ip_input.c:540
 NF_HOOK include/linux/netfilter.h:301 [inline] net/ipv4/ip_input.c:540
 ip_rcv+0xaa/0xd0 net/ipv4/ip_input.c:540 net/ipv4/ip_input.c:540
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5350 net/core/dev.c:5350
 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5464 net/core/dev.c:5464
 process_backlog+0x2a5/0x6c0 net/core/dev.c:5796 net/core/dev.c:5796
 __napi_poll+0xaf/0x440 net/core/dev.c:6364 net/core/dev.c:6364
 napi_poll net/core/dev.c:6431 [inline]
 napi_poll net/core/dev.c:6431 [inline] net/core/dev.c:6518
 net_rx_action+0x801/0xb40 net/core/dev.c:6518 net/core/dev.c:6518
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 kernel/softirq.c:558
 run_ksoftirqd kernel/softirq.c:921 [inline]
 run_ksoftirqd kernel/softirq.c:921 [inline] kernel/softirq.c:913
 run_ksoftirqd+0x2d/0x60 kernel/softirq.c:913 kernel/softirq.c:913
 smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164 kernel/smpboot.c:164
 kthread+0x405/0x4f0 kernel/kthread.c:327 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 arch/x86/entry/entry_64.S:295

Fixes: 4e66934eaa ("lib: add reference counting tracking infrastructure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-12 14:29:50 +00:00
Linus Torvalds
daadb3bd0e Peter Zijlstra says:
"Lots of cleanups and preparation; highlights:
 
  - futex: Cleanup and remove runtime futex_cmpxchg detection
 
  - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure
 
  - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and
    annotate the racy owner->on_cpu access *once*.
 
  - atomic64: Dead-Code-Elemination"
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHdvssACgkQEsHwGGHe
 VUrbBg//VQvz5BwddIJDj9utt5AvSixNcTF5mJyFKCSIqO0S4J8nCNcvJjZ2bs4S
 w1YmInFbp0WFGUhaIZiw0e6KWJUoINTng4MfHDZosS1doT2of53ZaQqXs3i81jDz
 87w8ADVHL0x4+BNjdsIwbcuPSDTmJFoyFOdeXTIl9hv9ZULT8m4Mt+LJuUHNZ+vF
 rS1jyseVPWkcm5y+Yie0rhip+ygzbfbt0ArsLfRcrBJsKr6oxLxV2DDF+2djXuuP
 d2OgGT7VkbgAhoKpzVXUiHsT6ppR5Mn5TLSa4EZ4bPPCUFldOhKuCAImF3T6yVIa
 44iX5vQN9v5VHBy6ocPbdOIBuYBYVGCMurh1t7pbpB6G+mmSxMiyta5MY37POwjv
 K2JT9mC2A6a4d17gue5FT3mnJMBB4eHwVaDfAwCZs/5rRNuoTz4aY5Xy04Mq0ltI
 39uarwBd5hwSugBWg44AS5E9h52E654FQ7g6iS4NtUvJuuaXBTl43EcZWx2+mnPL
 zY+iOMVMgg33VIVcm/mlf/6zWL0LXPmILUiA1fp4Q9/n8u1EuOOyeA/GsC9Pl3wO
 HY3KpYJA5eQpIk/JEnzKm5ZE3pCrUdH6VDC/SB4owQtafQG6OxyQVP1Gj7KYxZsD
 NqqpJ4nkKooc5f5DqVEN8wrjyYsnVxEfriEG09OoR6wI3MqyUA4=
 =vrYy
 -----END PGP SIGNATURE-----

Merge tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Borislav Petkov:
 "Lots of cleanups and preparation. Highlights:

   - futex: Cleanup and remove runtime futex_cmpxchg detection

   - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure

   - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and
     annotate the racy owner->on_cpu access *once*.

   - atomic64: Dead-Code-Elemination"

[ Description above by Peter Zijlstra ]

* tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: atomic64: Remove unusable atomic ops
  futex: Fix additional regressions
  locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h
  x86/mm: Include spinlock_t definition in pgtable.
  locking: Mark racy reads of owner->on_cpu
  locking: Make owner_on_cpu() into <linux/sched.h>
  lockdep/selftests: Adapt ww-tests for PREEMPT_RT
  lockdep/selftests: Skip the softirq related tests on PREEMPT_RT
  lockdep/selftests: Unbalanced migrate_disable() & rcu_read_lock().
  lockdep/selftests: Avoid using local_lock_{acquire|release}().
  lockdep: Remove softirq accounting on PREEMPT_RT.
  locking/rtmutex: Add rt_mutex_lock_nest_lock() and rt_mutex_lock_killable().
  locking/rtmutex: Squash self-deadlock check for ww_rt_mutex.
  locking: Remove rt_rwlock_is_contended().
  sched: Trigger warning if ->migration_disabled counter underflows.
  futex: Fix sparc32/m68k/nds32 build regression
  futex: Remove futex_cmpxchg detection
  futex: Ensure futex_atomic_cmpxchg_inatomic() is present
  kernel/locking: Use a pointer in ww_mutex_trylock().
2022-01-11 17:24:45 -08:00
Linus Torvalds
f692121142 This pull request contains the following changes for UML:
- set_fs removal
 - Devicetree support
 - Many cleanups from Al
 - Various virtio and build related fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmHbPpwWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wXkcD/9UfDRFqvgtZIlacQ/0IvN24xeq
 +f4aXoXEVyVsCd02jv9pUk3IAezIQyJf3MGtNJ4D/UFXtYfjEYjK5kJpPDP6umaZ
 ZDnpTzn29HW2aGlgOxW9gU7a3Yze629QasIRP6x7Ht+Hk5eXrvRYrgcmKtw1mm04
 SA5v5ZqP3P5r623fpsFiw4Dvl7l6MhDyFeyA2tabNnmv93HgB76PHDtV2Z+SWrC+
 ubjlfBQc87QGHW+eTvce+0qw9APMoJpNFjNN4H8P/9VcDTvw+KL2JqQ02HSMWh4z
 HeHKsv6hbty+GskBhbaWDW7867fPJ3e08TFAAAjeEiBP/CDBwjOTSr3eOw1eHgzU
 xdAqC2Bz0e5G3shClmVEzzvcP6R2cgNZjeBze5m3wQ1NKHEddk6N9t5K+4NrOpgp
 gbNN5Q4FAVOBKeQsZWG81bJKGcu7SbShgiKjlxcaRpMyp6LwyD4naauGjmCzYsbf
 Pd4ilLO1Yocf7nFs2C4vWxE4iAZ6hfQtukerIxCQfb/Y2BaWT3bcWWYHFRFy6Lq+
 hTFGnjf+Ro65QCoa1idaLaUdhwAGi6U9sjjL6G/JdQCCE3ftcXLVA9TJz9CNdMb5
 98IGznxhczOZc7rHNXOF4km5+OUrU6N+C0WRp3yOoUWcI+Ms4PXHzqIwC5cde/V7
 O/o9O1BAoBP6LE1pPg==
 =5J6E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML updates from Richard Weinberger:

 - set_fs removal

 - Devicetree support

 - Many cleanups from Al

 - Various virtio and build related fixes

* tag 'for-linus-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (31 commits)
  um: virtio_uml: Allow probing from devicetree
  um: Add devicetree support
  um: Extract load file helper from initrd.c
  um: remove set_fs
  hostfs: Fix writeback of dirty pages
  um: Use swap() to make code cleaner
  um: header debriding - sigio.h
  um: header debriding - os.h
  um: header debriding - net_*.h
  um: header debriding - mem_user.h
  um: header debriding - activate_ipi()
  um: common-offsets.h debriding...
  um, x86: bury crypto_tfm_ctx_offset
  um: unexport handle_page_fault()
  um: remove a dangling extern of syscall_trace()
  um: kill unused cpu()
  uml/i386: missing include in barrier.h
  um: stop polluting the namespace with registers.h contents
  logic_io instance of iounmap() needs volatile on argument
  um: move amd64 variant of mmap(2) to arch/x86/um/syscalls_64.c
  ...
2022-01-11 15:26:52 -08:00
Linus Torvalds
dabd40ecaf tpmdd updates for Linux v5.17
-----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCYdzf7hIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9IA/AEA2sX9fNNYSYnUwvi/Ju+Y8BgW4pA+GvA0
 L8iSuUkWdssA/iQFdQ3vyDK0CI56G1jerKMyT7o8QEuJmUYogTRV7+oA
 =7q7g
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-v5.17-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull TPM updates from Jarkko Sakkinen:
 "Other than bug fixes for TPM, this includes a patch for asymmetric
  keys to allow to look up and verify with self-signed certificates
  (keys without so called AKID - Authority Key Identifier) using a new
  "dn:" prefix in the query"

* tag 'tpmdd-next-v5.17-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  lib: remove redundant assignment to variable ret
  tpm: fix NPE on probe for missing device
  tpm: fix potential NULL pointer access in tpm_del_char_device
  tpm: Add Upgrade/Reduced mode support for TPM2 modules
  char: tpm: cr50: Set TPM_FIRMWARE_POWER_MANAGED based on device property
  keys: X.509 public key issuer lookup without AKID
  tpm_tis: Fix an error handling path in 'tpm_tis_core_init()'
  tpm: tpm_tis_spi_cr50: Add default RNG quality
  tpm/st33zp24: drop unneeded over-commenting
  tpm: add request_locality before write TPM_INT_ENABLE
2022-01-11 12:58:41 -08:00
Linus Torvalds
5c947d0dba Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Algorithms:

   - Drop alignment requirement for data in aesni

   - Use synchronous seeding from the /dev/random in DRBG

   - Reseed nopr DRBGs every 5 minutes from /dev/random

   - Add KDF algorithms currently used by security/DH

   - Fix lack of entropy on some AMD CPUs with jitter RNG

  Drivers:

   - Add support for the D1 variant in sun8i-ce

   - Add SEV_INIT_EX support in ccp

   - PFVF support for GEN4 host driver in qat

   - Compression support for GEN4 devices in qat

   - Add cn10k random number generator support"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (145 commits)
  crypto: af_alg - rewrite NULL pointer check
  lib/mpi: Add the return value check of kcalloc()
  crypto: qat - fix definition of ring reset results
  crypto: hisilicon - cleanup warning in qm_get_qos_value()
  crypto: kdf - select SHA-256 required for self-test
  crypto: x86/aesni - don't require alignment of data
  crypto: ccp - remove unneeded semicolon
  crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
  crypto: s390/sha512 - Use macros instead of direct IV numbers
  crypto: sparc/sha - remove duplicate hash init function
  crypto: powerpc/sha - remove duplicate hash init function
  crypto: mips/sha - remove duplicate hash init function
  crypto: sha256 - remove duplicate generic hash init function
  crypto: jitter - add oversampling of noise source
  MAINTAINERS: update SEC2 driver maintainers list
  crypto: ux500 - Use platform_get_irq() to get the interrupt
  crypto: hisilicon/qm - disable qm clock-gating
  crypto: omap-aes - Fix broken pm_runtime_and_get() usage
  MAINTAINERS: update caam crypto driver maintainers list
  crypto: octeontx2 - prevent underflow in get_cores_bmap()
  ...
2022-01-11 10:21:35 -08:00
Linus Torvalds
1be5bdf8cd KCSAN updates for v5.17
This series provides KCSAN fixes and also the ability to take memory
 barriers into account for weakly-ordered systems.  This last can increase
 the probability of detecting certain types of data races.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmHbuRwTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jKDPEACWuzYnd/u/02AHyRd3PIF3Px9uFKlK
 TFwaXX95oYSFCXcrmO42YtDUlZm4QcbwNb85KMCu1DvckRtIsNw0rkBU7BGyqv3Z
 ZoJEfMNpmC0x9+IFBOeseBHySPVT0x7GmYus05MSh0OLfkbCfyImmxRzgoKJGL+A
 ADF9EQb4z2feWjmVEoN8uRaarCAD4f77rSXiX6oTCNDuKrHarqMld/TmoXFrJbu2
 QtfwHeyvraKBnZdUoYfVbGVenyKb1vMv4bUlvrOcuJEgIi/J/th4FupR3XCGYulI
 aWJTl2TQTGnMoE8VnFHgI27I841w3k5PVL+Y1hr/S4uN1/rIoQQuBzCtlnFeCksa
 BiBXsHIchN8N0Dwh8zD8NMd2uxV4t3fmpxXTDAwaOm7vs5hA8AJ0XNu6Sz94Lyjv
 wk2CxX41WWUNJVo3gh6SrS4mL6lC8+VvHF1hbIap++jrevj58gj1jAR1fdx4ANlT
 e2qA00EeoMngEogDNZH42/Fxs3H9zxrBta2ZbkPkwzIqTHH+4pIQDCy2xO3T3oxc
 twdGPYpjYdkf79EGsG4I4R/VA/IfcS09VIWTce8xSDeSnqkgFhcG37r1orJe8hTB
 tH+ODkNOsz5HaEoa8OoAL4ko2l0fL99p2AtMPpuQfHjRj7aorF+dJIrqCizASxwx
 37PjQgOmHeDHgQ==
 =Q5fg
 -----END PGP SIGNATURE-----

Merge tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull KCSAN updates from Paul McKenney:
 "This provides KCSAN fixes and also the ability to take memory barriers
  into account for weakly-ordered systems. This last can increase the
  probability of detecting certain types of data races"

* tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it
  kcsan: Avoid nested contexts reading inconsistent reorder_access
  kcsan: Turn barrier instrumentation into macros
  kcsan: Make barrier tests compatible with lockdep
  kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
  compiler_attributes.h: Add __disable_sanitizer_instrumentation
  objtool, kcsan: Remove memory barrier instrumentation from noinstr
  objtool, kcsan: Add memory barrier instrumentation to whitelist
  sched, kcsan: Enable memory barrier instrumentation
  mm, kcsan: Enable barrier instrumentation
  x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock()
  x86/barriers, kcsan: Use generic instrumentation for non-smp barriers
  asm-generic/bitops, kcsan: Add instrumentation for barriers
  locking/atomics, kcsan: Add instrumentation for barriers
  locking/barriers, kcsan: Support generic instrumentation
  locking/barriers, kcsan: Add instrumentation for barriers
  kcsan: selftest: Add test case to check memory barrier instrumentation
  kcsan: Ignore GCC 11+ warnings about TSan runtime support
  kcsan: test: Add test cases for memory barrier instrumentation
  kcsan: test: Match reordered or normal accesses
  ...
2022-01-11 09:51:26 -08:00
Linus Torvalds
a229327733 printk changes for 5.17
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmHcFXQACgkQUqAMR0iA
 lPIYCw/7BbaC04tqwwznlL97rDW5LpbxJp/9bheJ3tE9mjsIVm2ExU/RLwXaeAhM
 K+aT0FB9s6u03OXOV3Dc2aFzwTXUamhHwgjp4uPs3+Colin1zGW1GMNTaSxGVm8Y
 zJfdrasjhJeucxNda+VtuGBkKwzs3dZjlNOUi1RzhJNGJvQtLDAZp04P9gAv6RDD
 gA/59YOx1q8m/Mig04PVWzKwiRl/zQLgGaXrgMYID2VrR2d/ZXSVY+itpq1GVOr7
 JWycMveIhAle8aoupbacfZofX5vHShaigfkJRI6MtVHBq7dVyfpTWElqdnjOJaZy
 AxwyTbcm0USfCK2K1Yl3L6TcU1DMaryGWJpJH/HvlR7yqEa8RVIeP2rLIlBTNHlP
 Caouz0Hrvy1pbLsdd39NXZohnY50DWgml4I0wwhf03IaVU9XaGtdAaKdXL45NefI
 ghHTNLCO9e4X1/lQT39EJp33/nZmSoN5vr6+0PQqlWXNgcUbMUAnA/K01jl3NKXP
 fBcb7WigIW0CEReMqB7WyJ42at0uk8Px4JEuYGmxpDSQIj+HqaQAiruUjEP9tQ6K
 b+LwRKgE7c2eNsHMBPNQ7txjen7rN6D8tVUEgF3Vp45uCm/wdp2kIZ27nOYWYRt3
 BVY/6GOIQZ7wJQPztIIaYfd1+3m9DIoq6YsqYniknz88oU4KN5k=
 =k1A+
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Remove some twists in the console registration code. It does not
   change the existing behavior except for one corner case. The proper
   default console (with tty binding) will be registered again even when
   it has been removed in the meantime. It is actually a bug fix.
   Anyway, this modified behavior requires some manual interaction.

 - Optimize gdb extension for huge ring buffers.

 - Do not use atomic operations for a local bitmap variable.

 - Update git links in MAINTAINERS.

* tag 'printk-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  MAINTAIERS/printk: Add link to printk git
  MAINTAINERS/vsprintf: Update link to printk git tree
  scripts/gdb: lx-dmesg: read records individually
  printk/console: Clean up boot console handling in register_console()
  printk/console: Remove need_default_console variable
  printk/console: Remove unnecessary need_default_console manipulation
  printk/console: Rename has_preferred_console to need_default_console
  printk/console: Split out code that enables default console
  vsprintf: Use non-atomic bitmap API when applicable
2022-01-11 09:23:59 -08:00
Linus Torvalds
8efd0d9c31 Networking changes for 5.17.
Core
 ----
 
  - Defer freeing TCP skbs to the BH handler, whenever possible,
    or at least perform the freeing outside of the socket lock section
    to decrease cross-CPU allocator work and improve latency.
 
  - Add netdevice refcount tracking to locate sources of netdevice
    and net namespace refcount leaks.
 
  - Make Tx watchdog less intrusive - avoid pausing Tx and restarting
    all queues from a single CPU removing latency spikes.
 
  - Various small optimizations throughout the stack from Eric Dumazet.
 
  - Make netdev->dev_addr[] constant, force modifications to go via
    appropriate helpers to allow us to keep addresses in ordered data
    structures.
 
  - Replace unix_table_lock with per-hash locks, improving performance
    of bind() calls.
 
  - Extend skb drop tracepoint with a drop reason.
 
  - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.
 
 BPF
 ---
 
  - New helpers:
    - bpf_find_vma(), find and inspect VMAs for profiling use cases
    - bpf_loop(), runtime-bounded loop helper trading some execution
      time for much faster (if at all converging) verification
    - bpf_strncmp(), improve performance, avoid compiler flakiness
    - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
      for tracing programs, all inlined by the verifier
 
  - Support BPF relocations (CO-RE) in the kernel loader.
 
  - Further the support for BTF_TYPE_TAG annotations.
 
  - Allow access to local storage in sleepable helpers.
 
  - Convert verifier argument types to a composable form with different
    attributes which can be shared across types (ro, maybe-null).
 
  - Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
    creating new, extensible ones where missing and deprecating those
    to be removed.
 
 Protocols
 ---------
 
  - WiFi (mac80211/cfg80211):
    - notify user space about long "come back in N" AP responses,
      allow it to react to such temporary rejections
    - allow non-standard VHT MCS 10/11 rates
    - use coarse time in airtime fairness code to save CPU cycles
 
  - Bluetooth:
    - rework of HCI command execution serialization to use a common
      queue and work struct, and improve handling errors reported
      in the middle of a batch of commands
    - rework HCI event handling to use skb_pull_data, avoiding packet
      parsing pitfalls
    - support AOSP Bluetooth Quality Report
 
  - SMC:
    - support net namespaces, following the RDMA model
    - improve connection establishment latency by pre-clearing buffers
    - introduce TCP ULP for automatic redirection to SMC
 
  - Multi-Path TCP:
    - support ioctls: SIOCINQ, OUTQ, and OUTQNSD
    - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
      IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
    - support cmsgs: TCP_INQ
    - improvements in the data scheduler (assigning data to subflows)
    - support fastclose option (quick shutdown of the full MPTCP
      connection, similar to TCP RST in regular TCP)
 
  - MCTP (Management Component Transport) over serial, as defined by
    DMTF spec DSP0253 - "MCTP Serial Transport Binding".
 
 Driver API
 ----------
 
  - Support timestamping on bond interfaces in active/passive mode.
 
  - Introduce generic phylink link mode validation for drivers which
    don't have any quirks and where MAC capability bits fully express
    what's supported. Allow PCS layer to participate in the validation.
    Convert a number of drivers.
 
  - Add support to set/get size of buffers on the Rx rings and size of
    the tx copybreak buffer via ethtool.
 
  - Support offloading TC actions as first-class citizens rather than
    only as attributes of filters, improve sharing and device resource
    utilization.
 
  - WiFi (mac80211/cfg80211):
    - support forwarding offload (ndo_fill_forward_path)
    - support for background radar detection hardware
    - SA Query Procedures offload on the AP side
 
 New hardware / drivers
 ----------------------
 
  - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
    real-time requirements for isochronous communication with protocols
    like OPC UA Pub/Sub.
 
  - Qualcomm BAM-DMUX WWAN - driver for data channels of modems
    integrated into many older Qualcomm SoCs, e.g. MSM8916 or
    MSM8974 (qcom_bam_dmux).
 
  - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch
    driver with support for bridging, VLANs and multicast forwarding
    (lan966x).
 
  - iwlmei driver for co-operating between Intel's WiFi driver and
    Intel's Active Management Technology (AMT) devices.
 
  - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips
 
  - Bluetooth:
    - MediaTek MT7921 SDIO devices
    - Foxconn MT7922A
    - Realtek RTL8852AE
 
 Drivers
 -------
 
  - Significantly improve performance in the datapaths of:
    lan78xx, ax88179_178a, lantiq_xrx200, bnxt.
 
  - Intel Ethernet NICs:
    - igb: support PTP/time PEROUT and EXTTS SDP functions on
      82580/i354/i350 adapters
    - ixgbevf: new PF -> VF mailbox API which avoids the risk of
      mailbox corruption with ESXi
    - iavf: support configuration of VLAN features of finer granularity,
      stacked tags and filtering
    - ice: PTP support for new E822 devices with sub-ns precision
    - ice: support firmware activation without reboot
 
  - Mellanox Ethernet NICs (mlx5):
    - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
    - support TC forwarding when tunnel encap and decap happen between
      two ports of the same NIC
    - dynamically size and allow disabling various features to save
      resources for running in embedded / SmartNIC scenarios
 
  - Broadcom Ethernet NICs (bnxt):
    - use page frag allocator to improve Rx performance
    - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
 
  - Other Ethernet NICs:
    - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support
 
  - Microsoft cloud/virtual NIC (mana):
    - add XDP support (PASS, DROP, TX)
 
  - Mellanox Ethernet switches (mlxsw):
    - initial support for Spectrum-4 ASICs
    - VxLAN with IPv6 underlay
 
  - Marvell Ethernet switches (prestera):
    - support flower flow templates
    - add basic IP forwarding support
 
  - NXP embedded Ethernet switches (ocelot & felix):
    - support Per-Stream Filtering and Policing (PSFP)
    - enable cut-through forwarding between ports by default
    - support FDMA to improve packet Rx/Tx to CPU
 
  - Other embedded switches:
    - hellcreek: improve trapping management (STP and PTP) packets
    - qca8k: support link aggregation and port mirroring
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - qca6390, wcn6855: enable 802.11 power save mode in station mode
    - BSS color change support
    - WCN6855 hw2.1 support
    - 11d scan offload support
    - scan MAC address randomization support
    - full monitor mode, only supported on QCN9074
    - qca6390/wcn6855: report signal and tx bitrate
    - qca6390: rfkill support
    - qca6390/wcn6855: regdb.bin support
 
  - Intel WiFi (iwlwifi):
    - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
      in cooperation with the BIOS
    - support for Optimized Connectivity Experience (OCE) scan
    - support firmware API version 68
    - lots of preparatory work for the upcoming Bz device family
 
  - MediaTek WiFi (mt76):
    - Specific Absorption Rate (SAR) support
    - mt7921: 160 MHz channel support
 
  - RealTek WiFi (rtw88):
    - Specific Absorption Rate (SAR) support
    - scan offload
 
  - Other WiFi NICs
    - ath10k: support fetching (pre-)calibration data from nvmem
    - brcmfmac: configure keep-alive packet on suspend
    - wcn36xx: beacon filter support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHbkZAACgkQMUZtbf5S
 IruYkQ//XX7BggcwBfukPK83j0dONolClijqKcKR08g4vB5L8GXvv6OErKIWrh4k
 h8JanCH352ZkbCSw3MvFdm825UYQv8vPMd6Qks/LJ4aSKqCuy4MIlAo+yOw4Km3O
 i7++lRfma6DqHHI59wvLjWoxZSPu8lL+rI8UsZ5qMOlnNlGAOXsNrzRjaqQ3FddY
 AMxZeBUtrPqUCCQZFq3U8apkYzUp7CA/3XR9zRcja3uPbrtOV2G+4whRF90qGNWz
 Tm/QvJ9F/Ab292cbhxR4KuaQ3hUhaCQyDjbZk3+FZzZpAVhYTVqcNjny6+yXmbiP
 NXRtwemnl1NlWKMnJM8lEeY48u626tRIkxA/Wtd61uoO5uKUSxfGP+UpUi+DfXbF
 yIw50VQ7L2bpxXP/HjtmhVgZDaWKYyh22Zw4Hp/muMJz0hgUB0KODY3tf2jUWbjJ
 0oEgocWyzhhwMQKqupTDCIaRgIs2ewYr4ZrFDhI3HnHC/vv1VjoPRUPIyxwppD2N
 cXvZb3B1sWK8iX5gCbISGzyU4bB7I0rvJSTU42ueti7n6NqRFZ79qHQpYnnY+JdO
 z1qOwY/d/yWfBoXVKRtRg2qz6CdEt5BQklwAgVEBgrFpf58gp694EwGMb1htY14J
 r/k9bVpmyIFpUnBH2CPMRfBVA3tUTqzyzzFV4AMw40NYLKmhLdo=
 =KLm3
 -----END PGP SIGNATURE-----

Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Defer freeing TCP skbs to the BH handler, whenever possible, or at
     least perform the freeing outside of the socket lock section to
     decrease cross-CPU allocator work and improve latency.

   - Add netdevice refcount tracking to locate sources of netdevice and
     net namespace refcount leaks.

   - Make Tx watchdog less intrusive - avoid pausing Tx and restarting
     all queues from a single CPU removing latency spikes.

   - Various small optimizations throughout the stack from Eric Dumazet.

   - Make netdev->dev_addr[] constant, force modifications to go via
     appropriate helpers to allow us to keep addresses in ordered data
     structures.

   - Replace unix_table_lock with per-hash locks, improving performance
     of bind() calls.

   - Extend skb drop tracepoint with a drop reason.

   - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.

  BPF
  ---

   - New helpers:
      - bpf_find_vma(), find and inspect VMAs for profiling use cases
      - bpf_loop(), runtime-bounded loop helper trading some execution
        time for much faster (if at all converging) verification
      - bpf_strncmp(), improve performance, avoid compiler flakiness
      - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
        for tracing programs, all inlined by the verifier

   - Support BPF relocations (CO-RE) in the kernel loader.

   - Further the support for BTF_TYPE_TAG annotations.

   - Allow access to local storage in sleepable helpers.

   - Convert verifier argument types to a composable form with different
     attributes which can be shared across types (ro, maybe-null).

   - Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
     creating new, extensible ones where missing and deprecating those
     to be removed.

  Protocols
  ---------

   - WiFi (mac80211/cfg80211):
      - notify user space about long "come back in N" AP responses,
        allow it to react to such temporary rejections
      - allow non-standard VHT MCS 10/11 rates
      - use coarse time in airtime fairness code to save CPU cycles

   - Bluetooth:
      - rework of HCI command execution serialization to use a common
        queue and work struct, and improve handling errors reported in
        the middle of a batch of commands
      - rework HCI event handling to use skb_pull_data, avoiding packet
        parsing pitfalls
      - support AOSP Bluetooth Quality Report

   - SMC:
      - support net namespaces, following the RDMA model
      - improve connection establishment latency by pre-clearing buffers
      - introduce TCP ULP for automatic redirection to SMC

   - Multi-Path TCP:
      - support ioctls: SIOCINQ, OUTQ, and OUTQNSD
      - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
        IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
      - support cmsgs: TCP_INQ
      - improvements in the data scheduler (assigning data to subflows)
      - support fastclose option (quick shutdown of the full MPTCP
        connection, similar to TCP RST in regular TCP)

   - MCTP (Management Component Transport) over serial, as defined by
     DMTF spec DSP0253 - "MCTP Serial Transport Binding".

  Driver API
  ----------

   - Support timestamping on bond interfaces in active/passive mode.

   - Introduce generic phylink link mode validation for drivers which
     don't have any quirks and where MAC capability bits fully express
     what's supported. Allow PCS layer to participate in the validation.
     Convert a number of drivers.

   - Add support to set/get size of buffers on the Rx rings and size of
     the tx copybreak buffer via ethtool.

   - Support offloading TC actions as first-class citizens rather than
     only as attributes of filters, improve sharing and device resource
     utilization.

   - WiFi (mac80211/cfg80211):
      - support forwarding offload (ndo_fill_forward_path)
      - support for background radar detection hardware
      - SA Query Procedures offload on the AP side

  New hardware / drivers
  ----------------------

   - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
     real-time requirements for isochronous communication with protocols
     like OPC UA Pub/Sub.

   - Qualcomm BAM-DMUX WWAN - driver for data channels of modems
     integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974
     (qcom_bam_dmux).

   - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver
     with support for bridging, VLANs and multicast forwarding
     (lan966x).

   - iwlmei driver for co-operating between Intel's WiFi driver and
     Intel's Active Management Technology (AMT) devices.

   - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips

   - Bluetooth:
      - MediaTek MT7921 SDIO devices
      - Foxconn MT7922A
      - Realtek RTL8852AE

  Drivers
  -------

   - Significantly improve performance in the datapaths of: lan78xx,
     ax88179_178a, lantiq_xrx200, bnxt.

   - Intel Ethernet NICs:
      - igb: support PTP/time PEROUT and EXTTS SDP functions on
        82580/i354/i350 adapters
      - ixgbevf: new PF -> VF mailbox API which avoids the risk of
        mailbox corruption with ESXi
      - iavf: support configuration of VLAN features of finer
        granularity, stacked tags and filtering
      - ice: PTP support for new E822 devices with sub-ns precision
      - ice: support firmware activation without reboot

   - Mellanox Ethernet NICs (mlx5):
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
      - support TC forwarding when tunnel encap and decap happen between
        two ports of the same NIC
      - dynamically size and allow disabling various features to save
        resources for running in embedded / SmartNIC scenarios

   - Broadcom Ethernet NICs (bnxt):
      - use page frag allocator to improve Rx performance
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool

   - Other Ethernet NICs:
      - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support

   - Microsoft cloud/virtual NIC (mana):
      - add XDP support (PASS, DROP, TX)

   - Mellanox Ethernet switches (mlxsw):
      - initial support for Spectrum-4 ASICs
      - VxLAN with IPv6 underlay

   - Marvell Ethernet switches (prestera):
      - support flower flow templates
      - add basic IP forwarding support

   - NXP embedded Ethernet switches (ocelot & felix):
      - support Per-Stream Filtering and Policing (PSFP)
      - enable cut-through forwarding between ports by default
      - support FDMA to improve packet Rx/Tx to CPU

   - Other embedded switches:
      - hellcreek: improve trapping management (STP and PTP) packets
      - qca8k: support link aggregation and port mirroring

   - Qualcomm 802.11ax WiFi (ath11k):
      - qca6390, wcn6855: enable 802.11 power save mode in station mode
      - BSS color change support
      - WCN6855 hw2.1 support
      - 11d scan offload support
      - scan MAC address randomization support
      - full monitor mode, only supported on QCN9074
      - qca6390/wcn6855: report signal and tx bitrate
      - qca6390: rfkill support
      - qca6390/wcn6855: regdb.bin support

   - Intel WiFi (iwlwifi):
      - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
        in cooperation with the BIOS
      - support for Optimized Connectivity Experience (OCE) scan
      - support firmware API version 68
      - lots of preparatory work for the upcoming Bz device family

   - MediaTek WiFi (mt76):
      - Specific Absorption Rate (SAR) support
      - mt7921: 160 MHz channel support

   - RealTek WiFi (rtw88):
      - Specific Absorption Rate (SAR) support
      - scan offload

   - Other WiFi NICs
      - ath10k: support fetching (pre-)calibration data from nvmem
      - brcmfmac: configure keep-alive packet on suspend
      - wcn36xx: beacon filter support"

* tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits)
  tcp: tcp_send_challenge_ack delete useless param `skb`
  net/qla3xxx: Remove useless DMA-32 fallback configuration
  rocker: Remove useless DMA-32 fallback configuration
  hinic: Remove useless DMA-32 fallback configuration
  lan743x: Remove useless DMA-32 fallback configuration
  net: enetc: Remove useless DMA-32 fallback configuration
  cxgb4vf: Remove useless DMA-32 fallback configuration
  cxgb4: Remove useless DMA-32 fallback configuration
  cxgb3: Remove useless DMA-32 fallback configuration
  bnx2x: Remove useless DMA-32 fallback configuration
  et131x: Remove useless DMA-32 fallback configuration
  be2net: Remove useless DMA-32 fallback configuration
  vmxnet3: Remove useless DMA-32 fallback configuration
  bna: Simplify DMA setting
  net: alteon: Simplify DMA setting
  myri10ge: Simplify DMA setting
  qlcnic: Simplify DMA setting
  net: allwinner: Fix print format
  page_pool: remove spinlock in page_pool_refill_alloc_cache()
  amt: fix wrong return type of amt_send_membership_update()
  ...
2022-01-10 19:06:09 -08:00
Linus Torvalds
bf4eebf8cf linux-kselftest-kunit-5.17-rc1
This KUnit update for Linux 5.17-rc1 consists of several fixes and
 enhancements. A few highlights:
 
 - Option --kconfig_add option allows easily tweaking kunitconfigs
 - make build subcommand can reconfigure if needed
 - doesn't error on tests without test plans
 - doesn't crash if no parameters are generated
 - defaults --jobs to # of cups
 - reports test parameter results as (K)TAP subtests
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmHY3T4ACgkQCwJExA0N
 QxwpSA//ZuAuMvAjedj0lgCBU5ocBQAHs7RsTmo6n3ORdTgZ/hjWF9dyyAgvIcb1
 x+BW2M0KXVvpsl5UEuyWz1jQAc1aT4DCMJp/vUYeuwDXqtPxioZhJ9XeGtT+pBDy
 L6GoJeZYQXIGGnRigF0QDY9gQsmvGMQFSJ/NIADeU7XUqlyZlLMgWWa2fO3OKYw+
 33nUBFgObyElGwikyvjACiG+jSZgq9a0eWW1mdZ06sLa7Z+cZvsAyBa4bSdvoQt3
 9s+3JAEHzQEDBwwRt2na6p18m3AA5vi8xyeu7Xz/0agv17TSPuKofx0L7F60sIQW
 oAyHQkHSj9X9s67kjCobu3TlswwsOaB4TEIOolHoqHjrwRPrQGcE4gddyVPGvs52
 3Iu8lAgiCUjNbXKMcEismjrqWe8o4ICk+uVRnAOWjGT4zF/XmAtXnwM4ddZmoFZM
 mS/UmJscoTSV8wxN0QHcZw6TADvX+QNmdOMe3AlQMhhsIklmaWFg5Pf91QafbjST
 yBkXPoqbFlfpKUJ7oCzK3MvvmFzhBOTMIO2lWTSlMPR5xIw/wUR9Go0rKBCm29rf
 YPgwvM1RPkyY+37ZTbPqgpX0oIw5VTRteYdMJTDUzyO4nqSWCp8QYeIKUT/4YJqc
 mY7+wNdqhuXHdvVbsPvObeWqw7DDYZySVf2QJeta7dycBcMYKcE=
 =vGqB
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "This consists of several fixes and enhancements. A few highlights:

   - Option --kconfig_add option allows easily tweaking kunitconfigs

   - make build subcommand can reconfigure if needed

   - doesn't error on tests without test plans

   - doesn't crash if no parameters are generated

   - defaults --jobs to # of cups

   - reports test parameter results as (K)TAP subtests"

* tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Default --jobs to number of CPUs
  kunit: tool: fix newly introduced typechecker errors
  kunit: tool: make `build` subcommand also reconfigure if needed
  kunit: tool: delete kunit_parser.TestResult type
  kunit: tool: use dataclass instead of collections.namedtuple
  kunit: tool: suggest using decode_stacktrace.sh on kernel crash
  kunit: tool: reconfigure when the used kunitconfig changes
  kunit: tool: revamp message for invalid kunitconfig
  kunit: tool: add --kconfig_add to allow easily tweaking kunitconfigs
  kunit: tool: move Kconfig read_from_file/parse_from_string to package-level
  kunit: tool: print parsed test results fully incrementally
  kunit: Report test parameter results as (K)TAP subtests
  kunit: Don't crash if no parameters are generated
  kunit: tool: Report an error if any test has no subtests
  kunit: tool: Do not error on tests without test plans
  kunit: add run_checks.py script to validate kunit changes
  Documentation: kunit: remove claims that kunit is a mocking framework
  kunit: tool: fix --json output for skipped tests
2022-01-10 12:16:48 -08:00
Colin Ian King
d99a8af48a lib: remove redundant assignment to variable ret
Variable ret is being assigned a value that is never read. If the
for-loop is entered then ret is immediately re-assigned a new
value. If the for-loop is not executed ret is never read. The
assignment is redundant and can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-01-09 00:18:54 +02:00
Matthew Wilcox (Oracle)
25a8de7f8d XArray: Add xas_advance()
Add a new helper function to help iterate over multi-index entries.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08 00:28:41 -05:00
Zizhuang Deng
dd827abe29 lib/mpi: Add the return value check of kcalloc()
Add the return value check of kcalloc() to avoid potential
NULL ptr dereference.

Fixes: a8ea8bdd9d ("lib/mpi: Extend the MPI library")
Signed-off-by: Zizhuang Deng <sunsetdzz@gmail.com>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-07 14:30:01 +11:00
Jason A. Donenfeld
6048fdcc5f lib/crypto: blake2s: include as built-in
In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, we use weak symbols. And because ARM doesn't need the
generic implementation, we make the generic one default only if an arch
library doesn't need it already, and then have arch libraries that do
need it opt-in. So that the arch libraries can remain tristate rather
than bool, we then split the shash part from the glue code.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-07 00:25:25 +01:00
Dirk Müller
36dacddbf0 lib/raid6: Use strict priority ranking for pq gen() benchmarking
On x86_64, currently 3 variants of AVX512, 3 variants of AVX2
and 3 variants of SSE2 are benchmarked on initialization, taking
between 144-153 jiffies. Testing across a hardware pool of
various generations of intel cpus I could not find a single
case where SSE2 won over AVX2 or AVX512. There are cases where
AVX2 wins over AVX512 however.

Change "prefer" into an integer priority field (similar to
how recov selection works) to have more than one ranking level
available, which is backwards compatible with existing behavior.

Give AVX2/512 variants higher priority over SSE2 in order to skip
SSE testing when AVX is available. in a AVX2/x86_64/HZ=250 case this
saves in the order of 200ms of initialization time.

Signed-off-by: Dirk Müller <dmueller@suse.de>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-01-06 08:37:03 -08:00
Dirk Müller
38640c4809 lib/raid6: skip benchmark of non-chosen xor_syndrome functions
In commit fe5cbc6e06 ("md/raid6 algorithms: delta syndrome functions")
a xor_syndrome() benchmarking was added also to the raid6_choose_gen()
function. However, the results of that benchmarking were intentionally
discarded and did not influence the choice. It picked the
xor_syndrome() variant related to the best performing gen_syndrome().

Reduce runtime of raid6_choose_gen() without modifying its outcome by
only benchmarking the xor_syndrome() of the best gen_syndrome() variant.

For a HZ=250 x86_64 system with avx2 and without avx512 this removes
5 out of 6 xor() benchmarks, saving 340ms of raid6 initialization time.

Signed-off-by: Dirk Müller <dmueller@suse.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-01-06 08:37:03 -08:00
Matthew Wilcox (Oracle)
821979f509 iov_iter: Convert iter_xarray to use folios
Take advantage of how kmap_local_folio() works to simplify the loop.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-04 13:15:33 -05:00
Greg Kroah-Hartman
cf6299b610 kobject: remove kset from struct kset_uevent_ops callbacks
There is no need to pass the pointer to the kset in the struct
kset_uevent_ops callbacks as no one uses it, so just remove that pointer
entirely.

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Wedson Almeida Filho <wedsonaf@google.com>
Link: https://lore.kernel.org/r/20211227163924.3970661-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-28 11:26:18 +01:00
Wedson Almeida Filho
ee6d3dd4ed driver core: make kobj_type constant.
This way instances of kobj_type (which contain function pointers) can be
stored in .rodata, which means that they cannot be [easily/accidentally]
modified at runtime.

Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Link: https://lore.kernel.org/r/20211224231345.777370-1-wedsonaf@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-27 10:40:00 +01:00
Christophe JAILLET
7c63f26cb5 lib: objagg: Use the bitmap API when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantic and reduce
some open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f9541b085ec68e573004e1be200c11c9c901181a.1640295165.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-24 14:54:29 -08:00
Al Viro
5f174ec3c1 logic_io instance of iounmap() needs volatile on argument
... same as the rest of implementations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:31:08 +01:00
Johannes Berg
4e8a5edac5 lib/logic_iomem: Fix operation on 32-bit
On 32-bit, the first entry might be at 0/NULL, but that's
strange and leads to issues, e.g. where we check "if (ret)".
Use a IOREMAP_BIAS/IOREMAP_MASK of 0x80000000UL to avoid
this. This then requires reducing the number of areas (via
MAX_AREAS), but we still have 128 areas, which is enough.

Fixes: ca2e334232 ("lib: add iomem emulation (logic_iomem)")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:28:20 +01:00
Johannes Berg
4e84139e14 lib/logic_iomem: Fix 32-bit build
On a 32-bit build, the (unsigned long long) casts throw warnings
(or errors) due to being to a different integer size. Cast to
uintptr_t first (with the __force for sparse) and then further
to get the consistent print on 32 and 64-bit.

Fixes: ca2e334232 ("lib: add iomem emulation (logic_iomem)")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:27:21 +01:00
David Gow
44b7da5fcd kunit: Report test parameter results as (K)TAP subtests
Currently, the results for individial parameters in a parameterised test
are simply output as (K)TAP diagnostic lines.

As kunit_tool now supports nested subtests, report each parameter as its
own subtest.

For example, here's what the output now looks like:
	# Subtest: inode_test_xtimestamp_decoding
	ok 1 - 1901-12-13 Lower bound of 32bit < 0 timestamp, no extra bits
	ok 2 - 1969-12-31 Upper bound of 32bit < 0 timestamp, no extra bits
	ok 3 - 1970-01-01 Lower bound of 32bit >=0 timestamp, no extra bits
	ok 4 - 2038-01-19 Upper bound of 32bit >=0 timestamp, no extra bits
	ok 5 - 2038-01-19 Lower bound of 32bit <0 timestamp, lo extra sec bit on
	ok 6 - 2106-02-07 Upper bound of 32bit <0 timestamp, lo extra sec bit on
	ok 7 - 2106-02-07 Lower bound of 32bit >=0 timestamp, lo extra sec bit on
	ok 8 - 2174-02-25 Upper bound of 32bit >=0 timestamp, lo extra sec bit on
	ok 9 - 2174-02-25 Lower bound of 32bit <0 timestamp, hi extra sec bit on
	ok 10 - 2242-03-16 Upper bound of 32bit <0 timestamp, hi extra sec bit on
	ok 11 - 2242-03-16 Lower bound of 32bit >=0 timestamp, hi extra sec bit on
	ok 12 - 2310-04-04 Upper bound of 32bit >=0 timestamp, hi extra sec bit on
	ok 13 - 2310-04-04 Upper bound of 32bit>=0 timestamp, hi extra sec bit 1. 1 ns
	ok 14 - 2378-04-22 Lower bound of 32bit>= timestamp. Extra sec bits 1. Max ns
	ok 15 - 2378-04-22 Lower bound of 32bit >=0 timestamp. All extra sec bits on
	ok 16 - 2446-05-10 Upper bound of 32bit >=0 timestamp. All extra sec bits on
	# inode_test_xtimestamp_decoding: pass:16 fail:0 skip:0 total:16
	ok 1 - inode_test_xtimestamp_decoding

Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-13 13:36:29 -07:00
David Gow
37dbb4c7c7 kunit: Don't crash if no parameters are generated
It's possible that a parameterised test could end up with zero
parameters. At the moment, the test function will nevertheless be called
with NULL as the parameter. Instead, don't try to run the test code, and
just mark the test as SKIPped.

Reported-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-13 13:36:21 -07:00
Eric W. Biederman
cead185526 exit: Rename complete_and_exit to kthread_complete_and_exit
Update complete_and_exit to call kthread_exit instead of do_exit.

Change the name to reflect this change in functionality.  All of the
users of complete_and_exit are causing the current kthread to exit so
this change makes it clear what is happening.

Move the implementation of kthread_complete_and_exit from
kernel/exit.c to to kernel/kthread.c.  As this function is kthread
specific it makes most sense to live with the kthread functions.

There are no functional change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Mark Rutland
5fb6e8cf53 locking/atomic: atomic64: Remove unusable atomic ops
The generic atomic64 implementation provides:

* atomic64_and_return()
* atomic64_or_return()
* atomic64_xor_return()

... but none of these exist in the standard atomic64 API as described by
scripts/atomic/atomics.tbl, and none of these have prototypes exposed by
<asm-generic/atomic64.h>.

The lkp kernel test robot noted this results in warnings when building with
W=1:

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_and_return' [-Wmissing-prototypes]

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_or_return' [-Wmissing-prototypes]

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_xor_return' [-Wmissing-prototypes]

This appears to have been a thinko in commit:

  28aa2bda22 ("locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()")

... where we grouped add/sub separately from and/ox/xor, so that we could avoid
implementing _return forms for the latter group, but forgot to remove
ATOMIC64_OP_RETURN() for that group.

This doesn't cause any functional problem, but it's pointless to build code
which cannot be used. Remove the unusable code. This does not affect add/sub,
for which _return forms will still be built.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20211126115923.41489-1-mark.rutland@arm.com
2021-12-13 10:56:09 +01:00
Ingo Molnar
6773cc31a9 Linux 5.16-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3
 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0
 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem
 St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz
 iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+
 VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz
 1m2tVNY=
 =7wRd
 -----END PGP SIGNATURE-----

Merge tag 'v5.16-rc5' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-12-13 10:48:46 +01:00
Jakub Kicinski
be3158290d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-12-10 v2

We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).

The main changes are:

1) Various samples fixes, from Alexander Lobakin.

2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.

3) A batch of new unified APIs for libbpf, logging improvements, version
   querying, etc. Also a batch of old deprecations for old APIs and various
   bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.

4) BPF documentation reorganization and improvements, from Christoph Hellwig
   and Dave Tucker.

5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
   libbpf, from Hengqi Chen.

6) Verifier log fixes, from Hou Tao.

7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.

8) Extend branch record capturing to all platforms that support it,
   from Kajol Jain.

9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.

10) bpftool doc-generating script improvements, from Quentin Monnet.

11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.

12) Deprecation warning fix for perf/bpf_counter, from Song Liu.

13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
    from Tiezhu Yang.

14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song.

15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
    Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
    and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
  libbpf: Add "bool skipped" to struct bpf_map
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
  bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
  selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
  selftests/bpf: Add test for libbpf's custom log_buf behavior
  selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
  libbpf: Deprecate bpf_object__load_xattr()
  libbpf: Add per-program log buffer setter and getter
  libbpf: Preserve kernel error code and remove kprobe prog type guessing
  libbpf: Improve logging around BPF program loading
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Allow passing preallocated log_buf when loading BTF into kernel
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  samples/bpf: Remove unneeded variable
  bpf: Remove redundant assignment to pointer t
  selftests/bpf: Fix a compilation warning
  perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
  samples: bpf: Fix 'unknown warning group' build warning on Clang
  samples: bpf: Fix xdp_sample_user.o linking with Clang
  ...
====================

Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 15:56:13 -08:00
Marco Elver
bd3d5bd1a0 kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
Clang and GCC behave a little differently when it comes to the
__no_sanitize_thread attribute, which has valid reasons, and depending
on context either one could be right.

Traditionally, user space ThreadSanitizer [1] still expects instrumented
builtin atomics (to avoid false positives) and __tsan_func_{entry,exit}
(to generate meaningful stack traces), even if the function has the
attribute no_sanitize("thread").

[1] https://clang.llvm.org/docs/ThreadSanitizer.html#attribute-no-sanitize-thread

GCC doesn't follow the same policy (for better or worse), and removes
all kinds of instrumentation if no_sanitize is added. Arguably, since
this may be a problem for user space ThreadSanitizer, we expect this may
change in future.

Since KCSAN != ThreadSanitizer, the likelihood of false positives even
without barrier instrumentation everywhere, is much lower by design.

At least for Clang, however, to fully remove all sanitizer
instrumentation, we must add the disable_sanitizer_instrumentation
attribute, which is available since Clang 14.0.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:28 -08:00
Marco Elver
69562e4983 kcsan: Add core support for a subset of weak memory modeling
Add support for modeling a subset of weak memory, which will enable
detection of a subset of data races due to missing memory barriers.

KCSAN's approach to detecting missing memory barriers is based on
modeling access reordering, and enabled if `CONFIG_KCSAN_WEAK_MEMORY=y`,
which depends on `CONFIG_KCSAN_STRICT=y`. The feature can be enabled or
disabled at boot and runtime via the `kcsan.weak_memory` boot parameter.

Each memory access for which a watchpoint is set up, is also selected
for simulated reordering within the scope of its function (at most 1
in-flight access).

We are limited to modeling the effects of "buffering" (delaying the
access), since the runtime cannot "prefetch" accesses (therefore no
acquire modeling). Once an access has been selected for reordering, it
is checked along every other access until the end of the function scope.
If an appropriate memory barrier is encountered, the access will no
longer be considered for reordering.

When the result of a memory operation should be ordered by a barrier,
KCSAN can then detect data races where the conflict only occurs as a
result of a missing barrier due to reordering accesses.

Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:26 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Eric Dumazet
4d92b95ff2 net: add net device refcount tracker infrastructure
net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get precise stack traces
of the point dev_hold_track() happened, at device dismantle phase.

We will also get a stack trace if too many dev_put_track() for the same
netdevice_tracker are attempted.

This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:07 -08:00
Eric Dumazet
914a7b5000 lib: add tests for reference tracker
This module uses reference tracker, forcing two issues.

1) Double free of a tracker

2) leak of two trackers, one being allocated from softirq context.

"modprobe test_ref_tracker" would emit the following traces.
(Use scripts/decode_stacktrace.sh if necessary)

[  171.648681] reference already released.
[  171.653213] allocated in:
[  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
[  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
[  171.656528]  do_one_initcall+0x9c/0x220
[  171.656532]  do_init_module+0x60/0x240
[  171.656536]  load_module+0x32b5/0x3610
[  171.656538]  __do_sys_init_module+0x148/0x1a0
[  171.656540]  __x64_sys_init_module+0x1d/0x20
[  171.656542]  do_syscall_64+0x4a/0xb0
[  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.656549] freed in:
[  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
[  171.659523]  do_one_initcall+0x9c/0x220
[  171.659525]  do_init_module+0x60/0x240
[  171.659527]  load_module+0x32b5/0x3610
[  171.659529]  __do_sys_init_module+0x148/0x1a0
[  171.659532]  __x64_sys_init_module+0x1d/0x20
[  171.659534]  do_syscall_64+0x4a/0xb0
[  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659575] ------------[ cut here ]------------
[  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
[  171.659581] Modules linked in: test_ref_tracker(+)
[  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
[  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
[  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
[  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
[  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
[  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
[  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
[  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.659614] Call Trace:
[  171.659615]  <TASK>
[  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
[  171.659636]  ? do_one_initcall+0x9c/0x220
[  171.659638]  ? do_init_module+0x60/0x240
[  171.659641]  ? load_module+0x32b5/0x3610
[  171.659644]  ? __do_sys_init_module+0x148/0x1a0
[  171.659646]  ? __x64_sys_init_module+0x1d/0x20
[  171.659649]  ? do_syscall_64+0x4a/0xb0
[  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659656]  ? 0xffffffffc040a000
[  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
[  171.659663]  do_one_initcall+0x9c/0x220
[  171.659666]  do_init_module+0x60/0x240
[  171.659669]  load_module+0x32b5/0x3610
[  171.659672]  __do_sys_init_module+0x148/0x1a0
[  171.659676]  __x64_sys_init_module+0x1d/0x20
[  171.659678]  do_syscall_64+0x4a/0xb0
[  171.659694]  ? exc_page_fault+0x6e/0x140
[  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659698] RIP: 0033:0x7f97ea3dbe7a
[  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.659709]  </TASK>
[  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
[  171.659712] leaked reference.
[  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
[  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
[  171.663397]  call_timer_fn+0x31/0x140
[  171.663401]  expire_timers+0x46/0x110
[  171.663403]  __run_timers+0x16f/0x1b0
[  171.663404]  run_timer_softirq+0x1d/0x40
[  171.663406]  __do_softirq+0x148/0x2d3
[  171.663408] leaked reference.
[  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
[  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
[  171.667104]  do_one_initcall+0x9c/0x220
[  171.667106]  do_init_module+0x60/0x240
[  171.667108]  load_module+0x32b5/0x3610
[  171.667111]  __do_sys_init_module+0x148/0x1a0
[  171.667113]  __x64_sys_init_module+0x1d/0x20
[  171.667115]  do_syscall_64+0x4a/0xb0
[  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667131] ------------[ cut here ]------------
[  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
[  171.667136] Modules linked in: test_ref_tracker(+)
[  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
[  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
[  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
[  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
[  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
[  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
[  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
[  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.667169] Call Trace:
[  171.667170]  <TASK>
[  171.667171]  ? 0xffffffffc040a000
[  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
[  171.667175]  do_one_initcall+0x9c/0x220
[  171.667179]  do_init_module+0x60/0x240
[  171.667182]  load_module+0x32b5/0x3610
[  171.667186]  __do_sys_init_module+0x148/0x1a0
[  171.667189]  __x64_sys_init_module+0x1d/0x20
[  171.667192]  do_syscall_64+0x4a/0xb0
[  171.667194]  ? exc_page_fault+0x6e/0x140
[  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667199] RIP: 0033:0x7f97ea3dbe7a
[  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.667209]  </TASK>
[  171.667210] ---[ end trace f5dbd6afa41e60aa ]---

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Eric Dumazet
4e66934eaa lib: add reference counting tracking infrastructure
It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.

This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Christophe JAILLET
52e68cd60d vsprintf: Use non-atomic bitmap API when applicable
The 'set' bitmap is local to this function. No concurrent access to it is
possible.
So prefer the non-atomic '__[set|clear]_bit()' function to save a few
cycles.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/1abf81a5e509d372393bd22041eed4ebc07ef9f7.1638023178.git.christophe.jaillet@wanadoo.fr
2021-12-06 13:35:28 +01:00
Sebastian Andrzej Siewior
9a75bd0c52 lockdep/selftests: Adapt ww-tests for PREEMPT_RT
The ww-mutex selftest operates directly on ww_mutex::base and assumes
its type is struct mutex. This isn't true on PREEMPT_RT which turns the
mutex into a rtmutex.

Add a ww_mutex_base_ abstraction which maps to the relevant mutex_ or
rt_mutex_ function.
Change the CONFIG_DEBUG_MUTEXES ifdef to DEBUG_WW_MUTEXES. The latter is
true for the MUTEX and RTMUTEX implementation of WW-MUTEX. The
assignment is required in order to pass the tests.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-10-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
a529f8db89 lockdep/selftests: Skip the softirq related tests on PREEMPT_RT
The softirq context on PREEMPT_RT is different compared to !PREEMPT_RT.
As such lockdep_softirq_enter() is a nop and the all the "softirq safe"
tests fail on PREEMPT_RT because there is no difference.

Skip the softirq context tests on PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-9-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
512bf713cb lockdep/selftests: Unbalanced migrate_disable() & rcu_read_lock().
The tests with unbalanced lock() + unlock() operation leave a modified
preemption counter behind which is then reset to its original value
after the test.

The spin_lock() function on PREEMPT_RT does not include a
preempt_disable() statement but migrate_disable() and read_rcu_lock().
As a consequence both counter never get back to their original value
and the system explodes later after the selftest.  In the
double-unlock case on PREEMPT_RT, the migrate_disable() and RCU code
will trigger a warning which should be avoided. These counter should
not be decremented below their initial value.

Save both counters and bring them back to their original value after
the test.  In the double-unlock case, increment both counter in
advance to they become balanced after the double unlock.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-8-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
fc78dd08e6 lockdep/selftests: Avoid using local_lock_{acquire|release}().
The local_lock related functions
  local_lock_acquire()
  local_lock_release()

are part of the internal implementation and should be avoided.
Define the lock as DEFINE_PER_CPU so the normal local_lock() function
can be used.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-7-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Kumar Kartikeya Dwivedi
d9847eb8be bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Vinicius Costa Gomes reported [0] that build fails when
CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
This leads to btf.c not being compiled, and then no symbol being present
in vmlinux for the declarations in btf.h. Since BTF is not useful
without enabling BPF subsystem, disallow this combination.

However, theoretically disabling both now could still fail, as the
symbol for kfunc_btf_id_list variables is not available. This isn't a
problem as the compiler usually optimizes the whole register/unregister
call, but at lower optimization levels it can fail the build in linking
stage.

Fix that by adding dummy variables so that modules taking address of
them still work, but the whole thing is a noop.

  [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com

Fixes: 14f267d95f ("bpf: btf: Introduce helpers for dynamic BTF set registration")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com
2021-12-02 13:39:46 -08:00
Arnd Bergmann
f7e5b9bfa6 siphash: use _unaligned version by default
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
because the ordinary load/store instructions (ldr, ldrh, ldrb) can
tolerate any misalignment of the memory address. However, load/store
double and load/store multiple instructions (ldrd, ldm) may still only
be used on memory addresses that are 32-bit aligned, and so we have to
use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
may end up with a severe performance hit due to alignment traps that
require fixups by the kernel. Testing shows that this currently happens
with clang-13 but not gcc-11. In theory, any compiler version can
produce this bug or other problems, as we are dealing with undefined
behavior in C99 even on architectures that support this in hardware,
see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363.

Fortunately, the get_unaligned() accessors do the right thing: when
building for ARMv6 or later, the compiler will emit unaligned accesses
using the ordinary load/store instructions (but avoid the ones that
require 32-bit alignment). When building for older ARM, those accessors
will emit the appropriate sequence of ldrb/mov/orr instructions. And on
architectures that can truly tolerate any kind of misalignment, the
get_unaligned() accessors resolve to the leXX_to_cpup accessors that
operate on aligned addresses.

Since the compiler will in fact emit ldrd or ldm instructions when
building this code for ARM v6 or later, the solution is to use the
unaligned accessors unconditionally on architectures where this is
known to be fast. The _aligned version of the hash function is
however still needed to get the best performance on architectures
that cannot do any unaligned access in hardware.

This new version avoids the undefined behavior and should produce
the fastest hash on all architectures we support.

Link: https://lore.kernel.org/linux-arm-kernel/20181008211554.5355-4-ard.biesheuvel@linaro.org/
Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 2c956a6077 ("siphash: add cryptographically secure PRF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:50 -08:00
Linus Walleij
2448eab440 Linux 5.16-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmGavnseHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGcl4H/jyFVlHDSa+utMA5
 7PEQX0AarkBtSvKUgK/SivZxX06nYp2UU5L4Jn70O/mccXWo0ru82eDVO3nSImDR
 Mi668IqzbYfGqVL6CMztDku+XbyT3Yr/i9QILFbLWV5DhCM422GXXN8PFBibDHdI
 6Oyt1WoUh404yjVIHOCNwprfLObxREV6ARhFsIsmCRa8Hf+RkKOY5Twua6j5emm5
 aamiq6SYLtf2H5+DwkR5TnPkie6I2o8oLtA7JYiJpKh5KK75qjlpzFd3S3OWsi1H
 0g752g12r7tLh4ac3Xfgwf36pQ2CdiZ7NUOkJhZWT4aHPqPh+MVheQfpR41f5Sgc
 pvFslTo=
 =QdMf
 -----END PGP SIGNATURE-----

Merge tag 'v5.16-rc2' into devel

Linux 5.16-rc2 is needed because nonurgent fixes headed
for next are strongly textually dependent on a fix that
was applied for rc2.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-11-27 00:54:16 +01:00
Helge Deller
8d192bec53 parisc: Increase FRAME_WARN to 2048 bytes on parisc
PA-RISC uses a much bigger frame size for functions than other
architectures. So increase it to 2048 for 32- and 64-bit kernels.
This fixes e.g. a warning in lib/xxhash.c.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-22 07:37:31 +01:00
Kees Cook
cab71f7495 kasan: test: silence intentional read overflow warnings
As done in commit d73dad4eb5 ("kasan: test: bypass __alloc_size
checks") for __write_overflow warnings, also silence some more cases
that trip the __read_overflow warnings seen in 5.16-rc1[1]:

  In file included from include/linux/string.h:253,
                   from include/linux/bitmap.h:10,
                   from include/linux/cpumask.h:12,
                   from include/linux/mm_types_task.h:14,
                   from include/linux/mm_types.h:5,
                   from include/linux/page-flags.h:13,
                   from arch/arm64/include/asm/mte.h:14,
                   from arch/arm64/include/asm/pgtable.h:12,
                   from include/linux/pgtable.h:6,
                   from include/linux/kasan.h:29,
                   from lib/test_kasan.c:10:
  In function 'memcmp',
      inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
  include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    263 |                         __read_overflow();
        |                         ^~~~~~~~~~~~~~~~~
  In function 'memchr',
      inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
  include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    277 |                 __read_overflow();
        |                 ^~~~~~~~~~~~~~~~~

[1] http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/

Link: https://lkml.kernel.org/r/20211116004111.3171781-1-keescook@chromium.org
Fixes: d73dad4eb5 ("kasan: test: bypass __alloc_size checks")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Linus Torvalds
4c388a8e74 zstd fixes for v5.16-rc1
Fix stack usage on parisc & improve code size bloat
 
 This PR contains 3 commits:
 
 1. Fixes a minor unused variable warning reported by Kernel test robot [0].
 2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
    some functions that are unlikely to be used in performance sensitive
    workloads.
 3. Fixes the reported excess stack usage on parisc [2] by removing -O3
    from zstd's compilation flags. -O3 triggered bugs in the hppa-linux-gnu
    gcc-8 compiler. -O2 performance is acceptable: neutral compression,
    about -1% decompression speed. We also reduce code bloat
    (-105KB / 374KB).
 
 After this commit our code bloat is cut from 374KB to 105KB with gcc-11.
 If we wanted to cut the remaining 105KB we'd likely have to trade
 signicant performance, so I want to say that this is enough for now.
 
 We should be able to get further gains without sacrificing speed, but
 that will take some significant optimization effort, and isn't suitable
 for a quick fix. I've opened an upstream issue [3] to track the code size,
 and try to avoid future regressions, and improve it in the long term.
 
 [0] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
 [1] https://lkml.org/lkml/2021/11/15/710
 [2] https://lkml.org/lkml/2021/11/14/189
 [3] https://github.com/facebook/zstd/issues/2867
 
 Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
 Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGWw4AACgkQuzRpqaNE
 qPUXfQ/5AXp+7Ip+YD25QUa/je10OZkdGNi5/MNh1m7f6gwlOab7Pnn65mpN8qsW
 1OJbje5PAiTkC+BzJgGw6zr8JCcvgXCVVtAoPEV73uT9QLOoeEE3E2Jf4OQQxroB
 cKC+lZaxeDgqV60koIhsVBMgs4pny57ohTm4fK8yqrIi7ZV21a/FJoVxwyNLCnbU
 uRJKzN9xa3lBYESnMzlV4dF0WhKfprgI+3YXenLBjHHDhhz0nyPT7jt0sr/CoblI
 2QMq8RItlnMleV1La1v1S38ROu1E4MXvIy/MrFyu7ebBX3jDgMYtRdZxuAL/I2+1
 TfN3LfEcwjyB4ft6Ty76kk0gwEihnEORhTeRVrhqxXx8FPWgEB+tgWHo+zLd8wPp
 khqfO6gf4PZJnf6kDOlyEYF2yTuNlWNR6J41+bLW0bA104zLYjeUhejDgyh2aRR2
 WYo/xwzs2FbI4Da/rJ4iTKy4hK++AZ/Sba9b3t29Ca+TiQZJHSUp5KnjNbIW5XCr
 0jknMki6bASlG9nrg+d2EC3fIQop8nJhywNrLZV1uJYx/H5DBmIcLPmhCb4oBOSt
 AP3d/rj5EnO0+bOGGDg00qndsnuDuko7fOsAM3D9l2HoaOly7++RQtIzZqu8Y3EX
 F8L90qvg/vIWFOppnvJX+nXaWz2J55P4iooKlBKz+JQpBff7lDA=
 =kBgl
 -----END PGP SIGNATURE-----

Merge tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux

Pull zstd fixes from Nick Terrell:
 "Fix stack usage on parisc & improve code size bloat

  This contains three commits:

   1. Fixes a minor unused variable warning reported by Kernel test
      robot [0].

   2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
      some functions that are unlikely to be used in performance
      sensitive workloads.

   3. Fixes the reported excess stack usage on parisc [2] by removing
      -O3 from zstd's compilation flags. -O3 triggered bugs in the
      hppa-linux-gnu gcc-8 compiler. -O2 performance is acceptable:
      neutral compression, about -1% decompression speed. We also reduce
      code bloat (-105KB / 374KB).

  After this our code bloat is cut from 374KB to 105KB with gcc-11. If
  we wanted to cut the remaining 105KB we'd likely have to trade
  signicant performance, so I want to say that this is enough for now.

  We should be able to get further gains without sacrificing speed, but
  that will take some significant optimization effort, and isn't
  suitable for a quick fix. I've opened an upstream issue [3] to track
  the code size, and try to avoid future regressions, and improve it in
  the long term"

Link: https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/ [0]
Link: https://lkml.org/lkml/2021/11/15/710 [1]
Link: https://lkml.org/lkml/2021/11/14/189 [2]
Link: https://github.com/facebook/zstd/issues/2867 [3]
Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/

* tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux:
  lib: zstd: Don't add -O3 to cflags
  lib: zstd: Don't inline functions in zstd_opt.c
  lib: zstd: Fix unused variable warning
2021-11-18 17:09:05 -08:00
Nick Terrell
7416cdc9b9 lib: zstd: Don't add -O3 to cflags
After the update to zstd-1.4.10 passing -O3 is no longer necessary to
get good performance from zstd. Using the default optimization level -O2
is sufficient to get good performance.

I've measured no significant change to compression speed, and a ~1%
decompression speed loss, which is acceptable.

This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
stacks that are ~3KB. With -O2 these same functions generate stacks in
the < 100B, completely fixing the problem. Function size deltas are
listed below:

ZSTD_compressBlock_fast_extDict_generic: 3800 -> 68
ZSTD_compressBlock_fast: 2216 -> 40
ZSTD_compressBlock_fast_dictMatchState: 1848 ->  64
ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -> 76
ZSTD_fillDoubleHashTable: 3252 -> 0
ZSTD_compressBlock_doubleFast: 5856 -> 36
ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -> 84
ZSTD_copmressBlock_lazy2: 2420 -> 72

Additionally, this improves the reported code bloat [1]. With gcc-11
bloat-o-meter shows an 80KB code size improvement:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
Total: Before=6418562, After=6336372, chg -1.28%
```

Compared to before the zstd-1.4.10 update we see a total code size
regression of 105KB, down from 374KB at v5.16-rc1:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
Total: Before=6228850, After=6336372, chg +1.73%
```

[0] https://lkml.org/lkml/2021/11/15/710
[1] https://lkml.org/lkml/2021/11/14/189

Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:16:22 -08:00
Nick Terrell
1974990cca lib: zstd: Don't inline functions in zstd_opt.c
`zstd_opt.c` contains the match finder for the highest compression
levels. These levels are already very slow, and are unlikely to be used
in the kernel. If they are used, they shouldn't be used in latency
sensitive workloads, so slowing them down shouldn't be a big deal.

This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
I've also opened an issue upstream [1] so that we can properly tackle
the code size issue in `zstd_opt.c` for all users, and can hopefully
remove this hack in the next zstd version we import.

Bloat-o-meter output on x86-64:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
Function                                     old     new   delta
ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
ZSTD_insertBt1                                 -    1731   +1731
ZSTD_storeSeq                                  -     693    +693
ZSTD_BtGetAllMatches                           -     255    +255
ZSTD_updateRep                                 -     128    +128
ZSTD_updateTree                               96      99      +3
ZSTD_insertAndFindFirstIndexHash3             81       -     -81
ZSTD_setBasePrices.constprop                  98       -     -98
ZSTD_litLengthPrice.constprop                138       -    -138
ZSTD_count                                   362     181    -181
ZSTD_count_2segments                        1407     938    -469
ZSTD_insertBt1.constprop                    2689       -   -2689
ZSTD_compressBlock_btultra2                19990     423  -19567
ZSTD_compressBlock_btultra                 19633      15  -19618
ZSTD_initStats_ultra                       19825       -  -19825
ZSTD_compressBlock_btopt                   20374      12  -20362
ZSTD_compressBlock_btopt_extDict           29984      12  -29972
ZSTD_compressBlock_btultra_extDict         30718      15  -30703
ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
Total: Before=6611828, After=6418562, chg -2.92%
```

[0] https://lkml.org/lkml/2021/11/14/189
[1] https://github.com/facebook/zstd/issues/2862

Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:15:33 -08:00
Nick Terrell
ae8d67b211 lib: zstd: Fix unused variable warning
The variable `litLengthSum` is only used by an `assert()`, so when
asserts are disabled the compiler doesn't see any usage and warns.

This issue is already fixed upstream by PR #2838 [0]. It was reported
by the Kernel test robot in [1].

Another approach would be to change zstd's disabled `assert()`
definition to use the argument in a disabled branch, instead of
ignoring the argument. I've avoided this approach because there are
some small changes necessary to get zstd to build, and I would
want to thoroughly re-test for performance, since that is slightly
changing the code in every function in zstd. It seems like a
trivial change, but some functions are pretty sensitive to small
changes. However, I think it is a valid approach that I would
like to see upstream take, so I've opened Issue #2868 to attempt
this upstream.

Lastly, I've chosen not to use __maybe_unused because all code
in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
use __maybe_unused because it isn't portable across all compilers.

[0] https://github.com/facebook/zstd/pull/2838
[1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
[2] https://github.com/facebook/zstd/issues/2868

Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:12:26 -08:00
Linus Torvalds
7d5775d49e printk fixup for 5.16
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmGWF6YACgkQUqAMR0iA
 lPJJ+RAAm9pi/EElKKl+lOlBl+ehJlKuNLnPQWFmmaRc9xd0ruUipp1nsaktLJ8f
 R/PkSwR/YWpBWlF8P4o+x9sOFyTNyLasoHtqsinEcAJI4lb7d1KOrPliTXyr15Ai
 A303djwJmwCw5KxAPOjkG/nMBlpMIAQRee9GDWs1ykfSlIsI4jp7isVbCFNCQNVF
 auHYq1bfJ5MJYPjxIDZUt+NF7kg7dD4k4g+VCVjaH1u8pGeaCUCtnNjMFOk1XfU8
 yFQnaDcrAu4zJPq3d74z4eN9Bk+su8+DhnfrAEFjuFxGTgYc2MyRt0gGFeiUtNs4
 rvST/eHBO4zeuL18S8G+fLcig/9ZYE73xzjdOCzRzLDjn0VQr9t06ez1QqJOb4D6
 A4SSufwek5NIqYKMlhV/az2EceQYK8Wv3KAz8w98KDfwvVVhUSgE23MbTCO0hvQU
 PWF35d3hQ+9oH0ZGYRumb8OpXtKJ+2KmzyN8Z0xhivHFBIKlcW6IBGhWRANclJO8
 jNAM3jiwi8fRDVM2wI1fmgeEmMhG+WuTI3dJVu3tu4vI923FW5GdY6ev5EvH0Ts0
 khTwIjtmCHUJGSeWajy3Gi9irdyhPyPNRMqgal4GS+gGpVU2mMMKTG+NzxxtCRKR
 BUgfCjFDoDJWrNWIzzOwNqgF0Y+V9GgCZOkb73u/y+xVx0Rmc6U=
 =wbBy
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fixes from Petr Mladek:

 - Try to flush backtraces from other CPUs also on the local one. This
   was a regression caused by printk_safe buffers removal.

 - Remove header dependency warning.

* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove printk.h inclusion in percpu.h
  printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
2021-11-18 10:50:45 -08:00
Andy Shevchenko
acdb89b6c8 lib/string_helpers: Introduce managed variant of kasprintf_strarray()
Some of the users want to have easy way to allocate array of strings
that will be automatically cleaned when associated device is gone.

Introduce managed variant of kasprintf_strarray() for such use cases.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2021-11-18 18:40:08 +02:00
Andy Shevchenko
418e0a3551 lib/string_helpers: Introduce kasprintf_strarray()
We have a few users already that basically want to have array of
sequential strings to be allocated and filled.

Provide a helper for them (basically adjusted version from gpio-mockup.c).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2021-11-18 18:40:08 +02:00
Petr Mladek
bf6d0d1e1a Merge branch 'rework/printk_safe-removal' into for-linus 2021-11-18 10:03:47 +01:00
Tiezhu Yang
ebf7f6f0a6 bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33
In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36 ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit is 33 now, this is
reasonable.

After commit 874be05f52 ("bpf, tests: Add tail call test suite"), we can
see there exists failed testcase.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.

The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and
limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. For the other implementations, no function changes,
it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the related tail call testcases
in test_bpf module and selftests can work well for the interpreter and the
JIT.

Here are the test results on x86_64:

 # uname -m
 x86_64
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
 # rmmod test_bpf
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
 # rmmod test_bpf
 # ./test_progs -t tailcalls
 #142 tailcalls:OK
 Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
2021-11-16 14:03:15 +01:00
Linus Torvalds
c8c109546a Update to zstd-1.4.10
This PR includes 5 commits that update the zstd library version:
 
 1. Adds a new kernel-style wrapper around zstd. This wrapper API
    is functionally equivalent to the subset of the current zstd API that is
    currently used. The wrapper API changes to be kernel style so that the symbols
    don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same
    API and preserves the semantics, so that none of the callers need to be
    updated. All callers are updated in the commit, because there are zero
    functional changes.
 2. Adds an indirection for `lib/decompress_unzstd.c` so it
    doesn't depend on the layout of `lib/zstd/` to include every source file.
    This allows the next patch to be automatically generated.
 3. Imports the zstd-1.4.10 source code. This commit is automatically generated
    from upstream zstd (https://github.com/facebook/zstd).
 4. Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
 5. Fixes a newly added build warning for clang.
 
 The discussion around this patchset has been pretty long, so I've included a
 FAQ-style summary of the history of the patchset, and why we are taking this
 approach.
 
 Why do we need to update?
 -------------------------
 
 The zstd version in the kernel is based off of zstd-1.3.1, which is was released
 August 20, 2017. Since then zstd has seen many bug fixes and performance
 improvements. And, importantly, upstream zstd is continuously fuzzed by OSS-Fuzz,
 and bug fixes aren't backported to older versions. So the only way to sanely get
 these fixes is to keep up to date with upstream zstd. There are no known security
 issues that affect the kernel, but we need to be able to update in case there
 are. And while there are no known security issues, there are relevant bug fixes.
 For example the problem with large kernel decompression has been fixed upstream
 for over 2 years https://lkml.org/lkml/2020/9/29/27.
 
 Additionally the performance improvements for kernel use cases are significant.
 Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
 
 - BtrFS zstd compression at levels 1 and 3 is 5% faster
 - BtrFS zstd decompression+read is 15% faster
 - SquashFS zstd decompression+read is 15% faster
 - F2FS zstd compression+write at level 3 is 8% faster
 - F2FS zstd decompression+read is 20% faster
 - ZRAM decompression+read is 30% faster
 - Kernel zstd decompression is 35% faster
 - Initramfs zstd decompression+build is 5% faster
 
 On top of this, there are significant performance improvements coming down the
 line in the next zstd release, and the new automated update patch generation
 will allow us to pull them easily.
 
 How is the update patch generated?
 ----------------------------------
 
 The first two patches are preparation for updating the zstd version. Then the
 3rd patch in the series imports upstream zstd into the kernel. This patch is
 automatically generated from upstream. A script makes the necessary changes and
 imports it into the kernel. The changes are:
 
 - Replace all libc dependencies with kernel replacements and rewrite includes.
 - Remove unncessary portability macros like: #if defined(_MSC_VER).
 - Use the kernel xxhash instead of bundling it.
 
 This automation gets tested every commit by upstream's continuous integration.
 When we cut a new zstd release, we will submit a patch to the kernel to update
 the zstd version in the kernel.
 
 The automated process makes it easy to keep the kernel version of zstd up to
 date. The current zstd in the kernel shares the guts of the code, but has a lot
 of API and minor changes to work in the kernel. This is because at the time
 upstream zstd was not ready to be used in the kernel envrionment as-is. But,
 since then upstream zstd has evolved to support being used in the kernel as-is.
 
 Why are we updating in one big patch?
 -------------------------------------
 
 The 3rd patch in the series is very large. This is because it is restructuring
 the code, so it both deletes the existing zstd, and re-adds the new structure.
 Future updates will be directly proportional to the changes in upstream zstd
 since the last import. They will admittidly be large, as zstd is an actively
 developed project, and has hundreds of commits between every release. However,
 there is no other great alternative.
 
 One option ruled out is to replay every upstream zstd commit. This is not feasible
 for several reasons:
 - There are over 3500 upstream commits since the zstd version in the kernel.
 - The automation to automatically generate the kernel update was only added recently,
   so older commits cannot easily be imported.
 - Not every upstream zstd commit builds.
 - Only zstd releases are "supported", and individual commits may have bugs that were
   fixed before a release.
 
 Another option to reduce the patch size would be to first reorganize to the new
 file structure, and then apply the patch. However, the current kernel zstd is formatted
 with clang-format to be more "kernel-like". But, the new method imports zstd as-is,
 without additional formatting, to allow for closer correlation with upstream, and
 easier debugging. So the patch wouldn't be any smaller.
 
 It also doesn't make sense to import upstream zstd commit by commit going
 forward. Upstream zstd doesn't support production use cases running of the
 development branch. We have a lot of post-commit fuzzing that catches many bugs,
 so indiviudal commits may be buggy, but fixed before a release. So going forward,
 I intend to import every (important) zstd release into the Kernel.
 
 So, while it isn't ideal, updating in one big patch is the only patch I see forward.
 
 Who is responsible for this code?
 ---------------------------------
 
 I am. This patchset adds me as the maintainer for zstd. Previously, there was no tree
 for zstd patches. Because of that, there were several patches that either got ignored,
 or took a long time to merge, since it wasn't clear which tree should pick them up.
 I'm officially stepping up as maintainer, and setting up my tree as the path through
 which zstd patches get merged. I'll make sure that patches to the kernel zstd get
 ported upstream, so they aren't erased when the next version update happens.
 
 How is this code tested?
 ------------------------
 
 I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS, Kernel,
 InitRAMFS). I also tested Kernel & InitRAMFS on i386 and aarch64. I checked both
 performance and correctness.
 
 Also, thanks to many people in the community who have tested these patches locally.
 If you have tested the patches, please reply with a Tested-By so I can collect them
 for the PR I will send to Linus.
 
 Lastly, this code will bake in linux-next before being merged into v5.16.
 
 Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
 ------------------------------------------------------------
 
 This patchset has been outstanding since 2020, and zstd-1.4.10 was the latest
 release when it was created. Since the update patch is automatically generated
 from upstream, I could generate it from zstd-1.5.0. However, there were some
 large stack usage regressions in zstd-1.5.0, and are only fixed in the latest
 development branch. And the latest development branch contains some new code that
 needs to bake in the fuzzer before I would feel comfortable releasing to the
 kernel.
 
 Once this patchset has been merged, and we've released zstd-1.5.1, we can update
 the kernel to zstd-1.5.1, and exercise the update process.
 
 You may notice that zstd-1.4.10 doesn't exist upstream. This release is an
 artifical release based off of zstd-1.4.9, with some fixes for the kernel
 backported from the development branch. I will tag the zstd-1.4.10 release after
 this patchset is merged, so the Linux Kernel is running a known version of zstd
 that can be debugged upstream.
 
 Why was a wrapper API added?
 ----------------------------
 
 The first versions of this patchset migrated the kernel to the upstream zstd
 API. It first added a shim API that supported the new upstream API with the old
 code, then updated callers to use the new shim API, then transitioned to the
 new code and deleted the shim API. However, Cristoph Hellwig suggested that we
 transition to a kernel style API, and hide zstd's upstream API behind that.
 This is because zstd's upstream API is supports many other use cases, and does
 not follow the kernel style guide, while the kernel API is focused on the
 kernel's use cases, and follows the kernel style guide.
 
 Where is the previous discussion?
 ---------------------------------
 
 Links for the discussions of the previous versions of the patch set.
 The largest changes in the design of the patchset are driven by the discussions
 in V11, V5, and V1. Sorry for the mix of links, I couldn't find most of the the
 threads on lkml.org.
 
 V12: https://www.spinics.net/lists/linux-crypto/msg58189.html
 V11: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/
 V10: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/
 V9: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/
 V8: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/
 V7: https://lkml.org/lkml/2020/12/3/1195
 V6: https://lkml.org/lkml/2020/12/2/1245
 V5: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
 V4: https://www.spinics.net/lists/linux-btrfs/msg105783.html
 V3: https://lkml.org/lkml/2020/9/23/1074
 V2: https://www.spinics.net/lists/linux-btrfs/msg105505.html
 V1: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 Tested By: Paul Jones <paul@pauljones.id.au>
 Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
 Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
 Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGJyKIACgkQuzRpqaNE
 qPXnmw/+PKyCn6LvRQqNfdpF5f59j/B1Fab15tkpVyz3UWnCw+EKaPZOoTfIsjRf
 7TMUVm4iGsm+6xBO/YrGdRl4IxocNgXzsgnJ1lTGDbvfRC1tG+YNwuv+EEXwKYq5
 Yz3DRwDotgsrV0Kg05b+VIgkmAuY3ukmu2n09LnAdKkxoIgmHw3MIDCdVZW2Br4c
 sjJmYI+fiJd7nAlbDa42VOrdTiLzkl/2BsjWBqTv6zbiQ5uuJGsKb7P3kpcybWzD
 5C118pyE3qlVyvFz+UFu8WbN0NSf47DP22KV/3IrhNX7CVQxYBe+9/oVuPWTgRx0
 4Vl0G6u7rzh4wDZuGqTC3LYWwH9GfycI0fnVC0URP2XMOcGfPlGd3L0PEmmAeTmR
 fEbaGAN4dr0jNO3lmbyAGe/G8tvtXQx/4ZjS9Pa3TlQP24GARU/f78/blbKR87Vz
 BGMndmSi92AscgXb9buO3bCwAY1YtH5WiFaZT1XVk42cj4MiOLvPTvP4UMzDDxcZ
 56ahmAP/84kd6H+cv9LmgEMqcIBmxdUcO1nuAItJ4wdrMUgw3+lrbxwFkH9xPV7I
 okC1K0TIVEobADbxbdMylxClAylbuW+37Pko97NmAlnzNCPNE38f3s3gtXRrUTaR
 IP8jv5UQ7q3dFiWnNLLodx5KM6s32GVBKRLRnn/6SJB7QzlyHXU=
 =Xb18
 -----END PGP SIGNATURE-----

Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux

Pull zstd update from Nick Terrell:
 "Update to zstd-1.4.10.

  Add myself as the maintainer of zstd and update the zstd version in
  the kernel, which is now 4 years out of date, to a much more recent
  zstd release. This includes bug fixes, much more extensive fuzzing,
  and performance improvements. And generates the kernel zstd
  automatically from upstream zstd, so it is easier to keep the zstd
  verison up to date, and we don't fall so far out of date again.

  This includes 5 commits that update the zstd library version:

   - Adds a new kernel-style wrapper around zstd.

     This wrapper API is functionally equivalent to the subset of the
     current zstd API that is currently used. The wrapper API changes to
     be kernel style so that the symbols don't collide with zstd's
     symbols. The update to zstd-1.4.10 maintains the same API and
     preserves the semantics, so that none of the callers need to be
     updated. All callers are updated in the commit, because there are
     zero functional changes.

   - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
     depend on the layout of `lib/zstd/` to include every source file.
     This allows the next patch to be automatically generated.

   - Imports the zstd-1.4.10 source code. This commit is automatically
     generated from upstream zstd (https://github.com/facebook/zstd).

   - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.

   - Fixes a newly added build warning for clang.

  The discussion around this patchset has been pretty long, so I've
  included a FAQ-style summary of the history of the patchset, and why
  we are taking this approach.

  Why do we need to update?
  -------------------------

  The zstd version in the kernel is based off of zstd-1.3.1, which is
  was released August 20, 2017. Since then zstd has seen many bug fixes
  and performance improvements. And, importantly, upstream zstd is
  continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
  older versions. So the only way to sanely get these fixes is to keep
  up to date with upstream zstd.

  There are no known security issues that affect the kernel, but we need
  to be able to update in case there are. And while there are no known
  security issues, there are relevant bug fixes. For example the problem
  with large kernel decompression has been fixed upstream for over 2
  years [1]

  Additionally the performance improvements for kernel use cases are
  significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:

   - BtrFS zstd compression at levels 1 and 3 is 5% faster

   - BtrFS zstd decompression+read is 15% faster

   - SquashFS zstd decompression+read is 15% faster

   - F2FS zstd compression+write at level 3 is 8% faster

   - F2FS zstd decompression+read is 20% faster

   - ZRAM decompression+read is 30% faster

   - Kernel zstd decompression is 35% faster

   - Initramfs zstd decompression+build is 5% faster

  On top of this, there are significant performance improvements coming
  down the line in the next zstd release, and the new automated update
  patch generation will allow us to pull them easily.

  How is the update patch generated?
  ----------------------------------

  The first two patches are preparation for updating the zstd version.
  Then the 3rd patch in the series imports upstream zstd into the
  kernel. This patch is automatically generated from upstream. A script
  makes the necessary changes and imports it into the kernel. The
  changes are:

   - Replace all libc dependencies with kernel replacements and rewrite
     includes.

   - Remove unncessary portability macros like: #if defined(_MSC_VER).

   - Use the kernel xxhash instead of bundling it.

  This automation gets tested every commit by upstream's continuous
  integration. When we cut a new zstd release, we will submit a patch to
  the kernel to update the zstd version in the kernel.

  The automated process makes it easy to keep the kernel version of zstd
  up to date. The current zstd in the kernel shares the guts of the
  code, but has a lot of API and minor changes to work in the kernel.
  This is because at the time upstream zstd was not ready to be used in
  the kernel envrionment as-is. But, since then upstream zstd has
  evolved to support being used in the kernel as-is.

  Why are we updating in one big patch?
  -------------------------------------

  The 3rd patch in the series is very large. This is because it is
  restructuring the code, so it both deletes the existing zstd, and
  re-adds the new structure. Future updates will be directly
  proportional to the changes in upstream zstd since the last import.
  They will admittidly be large, as zstd is an actively developed
  project, and has hundreds of commits between every release. However,
  there is no other great alternative.

  One option ruled out is to replay every upstream zstd commit. This is
  not feasible for several reasons:

   - There are over 3500 upstream commits since the zstd version in the
     kernel.

   - The automation to automatically generate the kernel update was only
     added recently, so older commits cannot easily be imported.

   - Not every upstream zstd commit builds.

   - Only zstd releases are "supported", and individual commits may have
     bugs that were fixed before a release.

  Another option to reduce the patch size would be to first reorganize
  to the new file structure, and then apply the patch. However, the
  current kernel zstd is formatted with clang-format to be more
  "kernel-like". But, the new method imports zstd as-is, without
  additional formatting, to allow for closer correlation with upstream,
  and easier debugging. So the patch wouldn't be any smaller.

  It also doesn't make sense to import upstream zstd commit by commit
  going forward. Upstream zstd doesn't support production use cases
  running of the development branch. We have a lot of post-commit
  fuzzing that catches many bugs, so indiviudal commits may be buggy,
  but fixed before a release. So going forward, I intend to import every
  (important) zstd release into the Kernel.

  So, while it isn't ideal, updating in one big patch is the only patch
  I see forward.

  Who is responsible for this code?
  ---------------------------------

  I am. This patchset adds me as the maintainer for zstd. Previously,
  there was no tree for zstd patches. Because of that, there were
  several patches that either got ignored, or took a long time to merge,
  since it wasn't clear which tree should pick them up. I'm officially
  stepping up as maintainer, and setting up my tree as the path through
  which zstd patches get merged. I'll make sure that patches to the
  kernel zstd get ported upstream, so they aren't erased when the next
  version update happens.

  How is this code tested?
  ------------------------

  I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
  Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
  aarch64. I checked both performance and correctness.

  Also, thanks to many people in the community who have tested these
  patches locally.

  Lastly, this code will bake in linux-next before being merged into
  v5.16.

  Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
  ------------------------------------------------------------

  This patchset has been outstanding since 2020, and zstd-1.4.10 was the
  latest release when it was created. Since the update patch is
  automatically generated from upstream, I could generate it from
  zstd-1.5.0.

  However, there were some large stack usage regressions in zstd-1.5.0,
  and are only fixed in the latest development branch. And the latest
  development branch contains some new code that needs to bake in the
  fuzzer before I would feel comfortable releasing to the kernel.

  Once this patchset has been merged, and we've released zstd-1.5.1, we
  can update the kernel to zstd-1.5.1, and exercise the update process.

  You may notice that zstd-1.4.10 doesn't exist upstream. This release
  is an artifical release based off of zstd-1.4.9, with some fixes for
  the kernel backported from the development branch. I will tag the
  zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
  is running a known version of zstd that can be debugged upstream.

  Why was a wrapper API added?
  ----------------------------

  The first versions of this patchset migrated the kernel to the
  upstream zstd API. It first added a shim API that supported the new
  upstream API with the old code, then updated callers to use the new
  shim API, then transitioned to the new code and deleted the shim API.
  However, Cristoph Hellwig suggested that we transition to a kernel
  style API, and hide zstd's upstream API behind that. This is because
  zstd's upstream API is supports many other use cases, and does not
  follow the kernel style guide, while the kernel API is focused on the
  kernel's use cases, and follows the kernel style guide.

  Where is the previous discussion?
  ---------------------------------

  Links for the discussions of the previous versions of the patch set
  below. The largest changes in the design of the patchset are driven by
  the discussions in v11, v5, and v1. Sorry for the mix of links, I
  couldn't find most of the the threads on lkml.org"

Link: https://lkml.org/lkml/2020/9/29/27 [1]
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8]
Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1]
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>

* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
  lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
  MAINTAINERS: Add maintainer entry for zstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.10
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Add kernel-specific API
2021-11-13 15:32:30 -08:00
Alistair Popple
ab09243aa9 mm/migrate.c: remove MIGRATE_PFN_LOCKED
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect().  If it
wasn't then the a second attempt is made to lock the page.  However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity.  So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-11 09:34:35 -08:00
Nicholas Piggin
5d5e4522a7 printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
printk from NMI context relies on irq work being raised on the local CPU
to print to console. This can be a problem if the NMI was raised by a
lockup detector to print lockup stack and regs, because the CPU may not
enable irqs (because it is locked up).

Introduce printk_trigger_flush() that can be called another CPU to try
to get those messages to the console, call that where printk_safe_flush
was previously called.

Fixes: 93d102f094 ("printk: remove safe buffers")
Cc: stable@vger.kernel.org # 5.15
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211107045116.1754411-1-npiggin@gmail.com
2021-11-10 16:12:00 +01:00
Linus Torvalds
59a2ceeef6 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "87 patches.

  Subsystems affected by this patch series: mm (pagecache and hugetlb),
  procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
  init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
  sysvfs, kcov, gdb, resource, selftests, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
  ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  selftests/kselftest/runner/run_one(): allow running non-executable files
  virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  kernel/resource: disallow access to exclusive system RAM regions
  kernel/resource: clean up and optimize iomem_is_exclusive()
  scripts/gdb: handle split debug for vmlinux
  kcov: replace local_irq_save() with a local_lock_t
  kcov: avoid enable+disable interrupts if !in_task()
  kcov: allocate per-CPU memory on the relevant node
  Documentation/kcov: define `ip' in the example
  Documentation/kcov: include types.h in the example
  sysv: use BUILD_BUG_ON instead of runtime check
  kernel/fork.c: unshare(): use swap() to make code cleaner
  seq_file: fix passing wrong private data
  seq_file: move seq_escape() to a header
  signal: remove duplicate include in signal.h
  crash_dump: remove duplicate include in crash_dump.h
  crash_dump: fix boolreturn.cocci warning
  hfs/hfsplus: use WARN_ON for sanity check
  ...
2021-11-09 10:11:53 -08:00
Thomas Gleixner
723aca2085 mm/scatterlist: replace the !preemptible warning in sg_miter_stop()
sg_miter_stop() checks for disabled preemption before unmapping a page
via kunmap_atomic().  The kernel doc mentions under context that
preemption must be disabled if SG_MITER_ATOMIC is set.

There is no active requirement for the caller to have preemption
disabled before invoking sg_mitter_stop().  The sg_mitter_*()
implementation itself has no such requirement.

In fact, preemption is disabled by kmap_atomic() as part of
sg_miter_next() and remains disabled as long as there is an active
SG_MITER_ATOMIC mapping.  This is a consequence of kmap_atomic() and not
a requirement for sg_mitter_*() itself.

The user chooses SG_MITER_ATOMIC because it uses the API in a context
where blocking is not possible or blocking is possible but he chooses a
lower weight mapping which is not available on all CPUs and so it might
need less overhead to setup at a price that now preemption will be
disabled.

The kmap_atomic() implementation on PREEMPT_RT does not disable
preemption.  It simply disables CPU migration to ensure that the task
remains on the same CPU while the caller remains preemptible.  This in
turn triggers the warning in sg_miter_stop() because preemption is
allowed.

The PREEMPT_RT and !PREEMPT_RT implementation of kmap_atomic() disable
pagefaults as a requirement.  It is sufficient to check for this instead
of disabled preemption.

Check for disabled pagefault handler in the SG_MITER_ATOMIC case.
Remove the "preemption disabled" part from the kernel doc as the
sg_milter*() implementation does not care.

[bigeasy@linutronix.de: commit description]

Link: https://lkml.kernel.org/r/20211015211409.cqopacv3pxdwn2ty@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Alexey Dobriyan
839b395eb9 lib: uninline simple_strntoull() as well
Codegen become bloated again after simple_strntoull() introduction

	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-224 (-224)
	Function                                     old     new   delta
	simple_strtoul                                 5       2      -3
	simple_strtol                                 23      20      -3
	simple_strtoull                              119      15    -104
	simple_strtoll                               155      41    -114

Link: https://lkml.kernel.org/r/YVmlB9yY4lvbNKYt@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
0f68d45ef4 lib, stackdepot: add helper to print stack entries into buffer
To print stack entries into a buffer, users of stackdepot, first get a
list of stack entries using stack_depot_fetch and then print this list
into a buffer using stack_trace_snprint.  Provide a helper in stackdepot
for this purpose.  Also change above mentioned users to use this helper.

[imran.f.khan@oracle.com: fix build error]
  Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
  Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com

Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com>	[i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
505be48165 lib, stackdepot: add helper to print stack entries
To print a stack entries, users of stackdepot, first use stack_depot_fetch
to get a list of stack entries and then use stack_trace_print to print
this list.  Provide a helper in stackdepot to print stack entries based on
stackdepot handle.  Also change above mentioned users to use this helper.

Link: https://lkml.kernel.org/r/20210915014806.3206938-3-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
4d4712c1a4 lib, stackdepot: check stackdepot handle before accessing slabs
Patch series "lib, stackdepot: check stackdepot handle before accessing slabs", v2.

PATCH-1: Checks validity of a stackdepot handle before proceeding to
access stackdepot slab/objects.

PATCH-2: Adds a helper in stackdepot, to allow users to print stack
entries just by specifying the stackdepot handle.  It also changes such
users to use this new interface.

PATCH-3: Adds a helper in stackdepot, to allow users to print stack
entries into buffers just by specifying the stackdepot handle and
destination buffer.  It also changes such users to use this new interface.

This patch (of 3):

stack_depot_save allocates slabs that will be used for storing objects in
future.If this slab allocation fails we may get to a situation where space
allocation for a new stack_record fails, causing stack_depot_save to
return 0 as handle.  If user of this handle ends up invoking
stack_depot_fetch with this handle value, current implementation of
stack_depot_fetch will end up using slab from wrong index.  To avoid this
check handle value at the beginning.

Link: https://lkml.kernel.org/r/20210915175321.3472770-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-2-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Nathan Chancellor
0a8ea23583 lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
A new warning in clang warns that there is an instance where boolean
expressions are being used with bitwise operators instead of logical
ones:

lib/zstd/decompress/huf_decompress.c:890:25: warning: use of bitwise '&' with boolean operands [-Wbitwise-instead-of-logical]
                       (BIT_reloadDStreamFast(&bitD1) == BIT_DStream_unfinished)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstd does this frequently to help with performance, as logical operators
have branches whereas bitwise ones do not.

To fix this warning in other cases, the expressions were placed on
separate lines with the '&=' operator; however, this particular instance
was moved away from that so that it could be surrounded by LIKELY, which
is a macro for __builtin_expect(), to help with a performance
regression, according to upstream zstd pull #1973.

Aside from switching to logical operators, which is likely undesirable
in this instance, or disabling the warning outright, the solution is
casting one of the expressions to an integer type to make it clear to
clang that the author knows what they are doing. Add a cast to U32 to
silence the warning. The first U32 cast is to silence an instance of
-Wshorten-64-to-32 because __builtin_expect() returns long so it cannot
be moved.

Link: https://github.com/ClangBuiltLinux/linux/issues/1486
Link: https://github.com/facebook/zstd/pull/1973
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-08 16:55:38 -08:00
Nick Terrell
e0c1b49f5b lib: zstd: Upgrade to latest upstream zstd version 1.4.10
Upgrade to the latest upstream zstd version 1.4.10.

This patch is 100% generated from upstream zstd commit 20821a46f412 [0].

This patch is very large because it is transitioning from the custom
kernel zstd to using upstream directly. The new zstd follows upstreams
file structure which is different. Future update patches will be much
smaller because they will only contain the changes from one upstream
zstd release.

As an aid for review I've created a commit [1] that shows the diff
between upstream zstd as-is (which doesn't compile), and the zstd
code imported in this patch. The verion of zstd in this patch is
generated from upstream with changes applied by automation to replace
upstreams libc dependencies, remove unnecessary portability macros,
replace `/**` comments with `/*` comments, and use the kernel's xxhash
instead of bundling it.

The benefits of this patch are as follows:
1. Using upstream directly with automated script to generate kernel
   code. This allows us to update the kernel every upstream release, so
   the kernel gets the latest bug fixes and performance improvements,
   and doesn't get 3 years out of date again. The automation and the
   translated code are tested every upstream commit to ensure it
   continues to work.
2. Upgrades from a custom zstd based on 1.3.1 to 1.4.10, getting 3 years
   of performance improvements and bug fixes. On x86_64 I've measured
   15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
   kernel decompression, and 30% faster ZRAM decompression+read speeds.
3. Zstd-1.4.10 supports negative compression levels, which allow zstd to
   match or subsume lzo's performance.
4. Maintains the same kernel-specific wrapper API, so no callers have to
   be modified with zstd version updates.

One concern that was brought up was stack usage. Upstream zstd had
already removed most of its heavy stack usage functions, but I just
removed the last functions that allocate arrays on the stack. I've
measured the high water mark for both compression and decompression
before and after this patch. Decompression is approximately neutral,
using about 1.2KB of stack space. Compression levels up to 3 regressed
from 1.4KB -> 1.6KB, and higher compression levels regressed from 1.5KB
-> 2KB. We've added unit tests upstream to prevent further regression.
I believe that this is a reasonable increase, and if it does end up
causing problems, this commit can be cleanly reverted, because it only
touches zstd.

I chose the bulk update instead of replaying upstream commits because
there have been ~3500 upstream commits since the 1.3.1 release, zstd
wasn't ready to be used in the kernel as-is before a month ago, and not
all upstream zstd commits build. The bulk update preserves bisectablity
because bugs can be bisected to the zstd version update. At that point
the update can be reverted, and we can work with upstream to find and
fix the bug.

Note that upstream zstd release 1.4.10 doesn't exist yet. I have cut a
staging branch at 20821a46f412 [0] and will apply any changes requested
to the staging branch. Once we're ready to merge this update I will cut
a zstd release at the commit we merge, so we have a known zstd release
in the kernel.

The implementation of the kernel API is contained in
zstd_compress_module.c and zstd_decompress_module.c.

[0] 20821a46f4
[1] e0fa481d0e

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:32 -08:00
Nick Terrell
2479b52389 lib: zstd: Add decompress_sources.h for decompress_unzstd
Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:26 -08:00
Nick Terrell
cf30f6a5f0 lib: zstd: Add kernel-specific API
This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional change, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:21 -08:00
Linus Torvalds
1e9ed9360f Kbuild updates for v5.16
- Remove the global -isystem compiler flag, which was made possible by
    the introduction of <linux/stdarg.h>
 
  - Improve the Kconfig help to print the location in the top menu level
 
  - Fix "FORCE prerequisite is missing" build warning for sparc
 
  - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate
    a zstd-compressed tarball
 
  - Prevent gen_init_cpio tool from generating a corrupted cpio when
    KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later
 
  - Misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmGGkysVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGgZkQAIX4i9Tt6pyl/2xGDGkzUqjprfoH
 QUIo1DoUclLUygoakrrrX3EnZLWrslgPTKjQxdiV6RA6xHfe4cYgNTSq8zM9lsPT
 lu+B4nEDqoXQ5gyLxMlnjS3FRQTNYIeBZEhSAIiW8TENdLKlKc+NYdoj7th50dO0
 SkXRa2dpWHa6t7ZRqHIHMpUWA7gm0w22ZbgQmyUv1CDGO4IHPLqe2b2PMsrzhSZ1
 yypP1l6aQVKuP0hN9aytbTRqDxUd0uOzBf00PK5zx23hjdwZ9wmZrFTKDf9fAu/+
 nR7gBsa5YoYNQh3UkayZXjR5dClmgsCXZ25OXI7YucQp/8OJ5fadfn1NFpJHsw56
 n5cckbHIXgnFUcel5YlkR6qTHjpzdr9vHm90MmiuX99b3oy9czl6pY3qkNfRkllQ
 v7ME5L1qlw3P3ia1KA+H4zW/LIJ8p5cbKBwaY22m3kY3bTx7PiOfMlep4UVqxXSb
 0/OqxSsmYg5LlmwEQ0SSsx45hE0o9nG/cdjkHu1jUOUHxYfpt1T4MTILeGUwmjzd
 TydJym5MZyXBawu4NVB3QLoKm5Jt2BXtyaWOtq74VSrs77roNCdYuQWJ+1aBf2Pg
 0s4CVC2cC7KlxJDImoqswZATGXPMfbiVDcuVSSukYRgBMeCBPUzRhB8YP36BZyD3
 9vFYmqSujtUU7nWb
 =ATFN
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Remove the global -isystem compiler flag, which was made possible by
   the introduction of <linux/stdarg.h>

 - Improve the Kconfig help to print the location in the top menu level

 - Fix "FORCE prerequisite is missing" build warning for sparc

 - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which
   generate a zstd-compressed tarball

 - Prevent gen_init_cpio tool from generating a corrupted cpio when
   KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later

 - Misc cleanups

* tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits)
  kbuild: use more subdir- for visiting subdirectories while cleaning
  sh: remove meaningless archclean line
  initramfs: Check timestamp to prevent broken cpio archive
  kbuild: split DEBUG_CFLAGS out to scripts/Makefile.debug
  gen_init_cpio: add static const qualifiers
  kbuild: Add make tarzst-pkg build option
  scripts: update the comments of kallsyms support
  sparc: Add missing "FORCE" target when using if_changed
  kconfig: refactor conf_touch_dep()
  kconfig: refactor conf_write_dep()
  kconfig: refactor conf_write_autoconf()
  kconfig: add conf_get_autoheader_name()
  kconfig: move sym_escape_string_value() to confdata.c
  kconfig: refactor listnewconfig code
  kconfig: refactor conf_write_symbol()
  kconfig: refactor conf_write_heading()
  kconfig: remove 'const' from the return type of sym_escape_string_value()
  kconfig: rename a variable in the lexer to a clearer name
  kconfig: narrow the scope of variables in the lexer
  kconfig: Create links to main menu items in search
  ...
2021-11-08 09:15:45 -08:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Marco Elver
4f612ed3f7 kfence: default to dynamic branch instead of static keys mode
We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds.  This is due
to the required broadcast IPIs, which simply does not scale well to
hundreds of CPUs (cores).  If done too frequently, this can adversely
affect tail latencies of various workloads.

One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still
exists and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at
lower sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.

Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better.  Evaluating and observing the possible system-wide side-effects
of the static-key-switching induced broadcast IPIs, however, was a blind
spot (in particular on large machines with 100s of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, but also making conclusions about performance of new
workloads based on a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures.  As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates
exactly
2 instructions in the kmem_cache_alloc() fast-path:

 | ...
 | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0 <kfence_allocation_gate>
 | je     ffffffff812d6003      <kmem_cache_alloc+0x243>
 | ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.

Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:43 -07:00
Marco Elver
f39f21b3dd stacktrace: move filter_irq_stacks() to kernel/stacktrace.c
filter_irq_stacks() has little to do with the stackdepot implementation,
except that it is usually used by users (such as KASAN) of stackdepot to
reduce the stack trace.

However, filter_irq_stacks() itself is not useful without a stack trace
as obtained by stack_trace_save() and friends.

Therefore, move filter_irq_stacks() to kernel/stacktrace.c, so that new
users of filter_irq_stacks() do not have to start depending on
STACKDEPOT only for filter_irq_stacks().

Link: https://lkml.kernel.org/r/20210923104803.2620285-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:43 -07:00
David Hildenbrand
50f9481ed9 mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.

Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>	[kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
Mike Rapoport
4421cca0a3 memblock: use memblock_free for freeing virtual pointers
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()

The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.

    @@
    identifier vaddr;
    expression size;
    @@
    (
    - memblock_phys_free(__pa(vaddr), size);
    + memblock_free(vaddr, size);
    |
    - memblock_free_ptr(vaddr, size);
    + memblock_free(vaddr, size);
    )

[sfr@canb.auug.org.au: fixup]
  Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au

Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Mike Rapoport
3ecc68349b memblock: rename memblock_free to memblock_phys_free
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

    @@
    expression addr;
    expression size;
    @@
    - memblock_free(addr, size);
    + memblock_phys_free(addr, size);

Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Mike Rapoport
fa27717110 memblock: drop memblock_free_early_nid() and memblock_free_early()
memblock_free_early_nid() is unused and memblock_free_early() is an
alias for memblock_free().

Replace calls to memblock_free_early() with calls to memblock_free() and
remove memblock_free_early() and memblock_free_early_nid().

Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Changcheng Deng
34b46efd6e lib/test_vmalloc.c: use swap() to make code cleaner
Use swap() in order to make code cleaner.  Issue found by coccinelle.

Link: https://lkml.kernel.org/r/20211028111443.15744-1-deng.changcheng@zte.com.cn
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:37 -07:00
Kees Cook
d73dad4eb5 kasan: test: bypass __alloc_size checks
Intentional overflows, as performed by the KASAN tests, are detected at
compile time[1] (instead of only at run-time) with the addition of
__alloc_size.  Fix this by forcing the compiler into not being able to
trust the size used following the kmalloc()s.

[1] https://lore.kernel.org/lkml/20211005184717.65c6d8eb39350395e387b71f@linux-foundation.org

Link: https://lkml.kernel.org/r/20211006181544.1670992-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Peter Collingbourne
758cabae31 kasan: test: add memcpy test that avoids out-of-bounds write
With HW tag-based KASAN, error checks are performed implicitly by the
load and store instructions in the memcpy implementation.  A failed
check results in tag checks being disabled and execution will keep
going.  As a result, under HW tag-based KASAN, prior to commit
1b0668be62 ("kasan: test: disable kmalloc_memmove_invalid_size for
HW_TAGS"), this memcpy would end up corrupting memory until it hits an
inaccessible page and causes a kernel panic.

This is a pre-existing issue that was revealed by commit 285133040e
("arm64: Import latest memcpy()/memmove() implementation") which changed
the memcpy implementation from using signed comparisons (incorrectly,
resulting in the memcpy being terminated early for negative sizes) to
using unsigned comparisons.

It is unclear how this could be handled by memcpy itself in a reasonable
way.  One possibility would be to add an exception handler that would
force memcpy to return if a tag check fault is detected -- this would
make the behavior roughly similar to generic and SW tag-based KASAN.
However, this wouldn't solve the problem for asynchronous mode and also
makes memcpy behavior inconsistent with manually copying data.

This test was added as a part of a series that taught KASAN to detect
negative sizes in memory operations, see commit 8cceeff48f ("kasan:
detect negative size in memory operation function").  Therefore we
should keep testing for negative sizes with generic and SW tag-based
KASAN.  But there is some value in testing small memcpy overflows, so
let's add another test with memcpy that does not destabilize the kernel
by performing out-of-bounds writes, and run it in all modes.

Link: https://linux-review.googlesource.com/id/I048d1e6a9aff766c4a53f989fb0c83de68923882
Link: https://lkml.kernel.org/r/20210910211356.3603758-1-pcc@google.com
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Marco Elver
11ac25c62c lib/stackdepot: introduce __stack_depot_save()
Add __stack_depot_save(), which provides more fine-grained control over
stackdepot's memory allocation behaviour, in case stackdepot runs out of
"stack slabs".

Normally stackdepot uses alloc_pages() in case it runs out of space;
passing can_alloc==false to __stack_depot_save() prohibits this, at the
cost of more likely failure to record a stack trace.

Link: https://lkml.kernel.org/r/20210913112609.2651084-4-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Marco Elver
7f2b8818ea lib/stackdepot: remove unused function argument
alloc_flags in depot_alloc_stack() is no longer used; remove it.

Link: https://lkml.kernel.org/r/20210913112609.2651084-3-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Linus Torvalds
95faf6ba65 Driver core changes for 5.16-rc1
Here is the big set of driver core changes for 5.16-rc1.
 
 All of these have been in linux-next for a while now with no reported
 problems.
 
 Included in here are:
 	- big update and cleanup of the sysfs abi documentation files
 	  and scripts from Mauro.  We are almost at the place where we
 	  can properly check that the running kernel's sysfs abi is
 	  documented fully.
 	- firmware loader updates
 	- dyndbg updates
 	- kernfs cleanups and fixes from Christoph
 	- device property updates
 	- component fix
 	- other minor driver core cleanups and fixes
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPbjQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ync9gCfXKMUI1GAnCfJWAwTdTcd18q5akoAoMw32/AH
 0yh5TjAWFyFd7xz5d7qs
 =itsC
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big set of driver core changes for 5.16-rc1.

  All of these have been in linux-next for a while now with no reported
  problems.

  Included in here are:

   - big update and cleanup of the sysfs abi documentation files and
     scripts from Mauro. We are almost at the place where we can
     properly check that the running kernel's sysfs abi is documented
     fully.

   - firmware loader updates

   - dyndbg updates

   - kernfs cleanups and fixes from Christoph

   - device property updates

   - component fix

   - other minor driver core cleanups and fixes"

* tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (122 commits)
  device property: Drop redundant NULL checks
  x86/build: Tuck away built-in firmware under FW_LOADER
  vmlinux.lds.h: wrap built-in firmware support under FW_LOADER
  firmware_loader: move struct builtin_fw to the only place used
  x86/microcode: Use the firmware_loader built-in API
  firmware_loader: remove old DECLARE_BUILTIN_FIRMWARE()
  firmware_loader: formalize built-in firmware API
  component: do not leave master devres group open after bind
  dyndbg: refine verbosity 1-4 summary-detail
  gpiolib: acpi: Replace custom code with device_match_acpi_handle()
  i2c: acpi: Replace custom function with device_match_acpi_handle()
  driver core: Provide device_match_acpi_handle() helper
  dyndbg: fix spurious vNpr_info change
  dyndbg: no vpr-info on empty queries
  dyndbg: vpr-info on remove-module complete, not starting
  device property: Add missed header in fwnode.h
  Documentation: dyndbg: Improve cli param examples
  dyndbg: Remove support for ddebug_query param
  dyndbg: make dyndbg a known cli param
  dyndbg: show module in vpr-info in dd-exec-queries
  ...
2021-11-04 08:32:38 -07:00
Guenter Roeck
5c4e0a21fa string: uninline memcpy_and_pad
When building m68k:allmodconfig, recent versions of gcc generate the
following error if the length of UTS_RELEASE is less than 8 bytes.

  In function 'memcpy_and_pad',
    inlined from 'nvmet_execute_disc_identify' at
      drivers/nvme/target/discovery.c:268:2: arch/m68k/include/asm/string.h:72:25: error:
	'__builtin_memcpy' reading 8 bytes from a region of size 7

Discussions around the problem suggest that this only happens if an
architecture does not provide strlen(), if -ffreestanding is provided as
compiler option, and if CONFIG_FORTIFY_SOURCE=n. All of this is the case
for m68k. The exact reasons are unknown, but seem to be related to the
ability of the compiler to evaluate the return value of strlen() and
the resulting execution flow in memcpy_and_pad(). It would be possible
to work around the problem by using sizeof(UTS_RELEASE) instead of
strlen(UTS_RELEASE), but that would only postpone the problem until the
function is called in a similar way. Uninline memcpy_and_pad() instead
to solve the problem for good.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-03 11:41:25 -07:00
Linus Torvalds
313b6ffc8e linux-kselftest-kunit-5.16-rc1
This KUnit update for Linux 5.16-rc1 consist of several enhancements
 and fixes:
 
 - ability to run each test suite and test separately
 - support for timing test run
 - several fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmGBhpoACgkQCwJExA0N
 Qxx2xQ//bCql+nCJdowo2er7MgYV1Jw9bkjapriV7vaakHVBOQhMItMAD3lw2WNI
 SXuX4n4x4Ap4FMBd3gBQ3Flc1e7MdY/FHQIIcIE5+xDoU/ehl9ypbUMd7NrGOTI7
 KMN18TJawHHyMHjKz/fFFV5Bfi97YptQMy3VB/ujdgRCIF2/bPhra5F/rFYUwdSk
 GLql9CBPbomgaginzAuvnfPKoxWGjiiRZjNTlsXDyRihras0ezS7kfSYPLEV8f/i
 L/ydZK+Lc3QCrWCYadBeAtq/3hkb1pV7FGKfjypvEGhGcvqEQZW7irKxE4KoLB2D
 VFeVVZmK0WCfCFHtOQookTKPVwkgTzItwpXxl57ILMWo6FaB8O8tQGHqEvcF7Xfm
 NIgT/V0laiUWABpWWpQjf1jnY+X1qI+s+f8eZ2eI889AduBdlrNIfCRotYz2tB0i
 /GfnmVLp5pDIYZX0SxCERpu91297sDXSV+uw4mf9hjDs7189LmWppEpPENfjWebM
 geFlconrtrmnWx3E3Wvqxk8pEmJwChDYVAGUUhKFZXLxYO0YvWoBM4REb9v8sv3c
 0x3A1ZhpdeN5f7S6BHEAVKa3VrDHXLG9nZCK6wlaxYIiM2rOJDQN9h4BF5/RO9/2
 SebYQuMA95aQgAPXjJHYKQ9fdxrR3BOgFdn0Xfp/EShEWTl/usA=
 =UpDf
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "Several enhancements and fixes:

   - ability to run each test suite and test separately

   - support for timing test run

   - several fixes and improvements"

* tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: fix typecheck errors about loading qemu configs
  kunit: tool: continue past invalid utf-8 output
  kunit: Reset suite count after running tests
  kunit: tool: improve compatibility of kunit_parser with KTAP specification
  kunit: tool: yield output from run_kernel in real time
  kunit: tool: support running each suite/test separately
  kunit: tool: actually track how long it took to run tests
  kunit: tool: factor exec + parse steps into a function
  kunit: add 'kunit.action' param to allow listing out tests
  kunit: tool: show list of valid --arch options when invalid
  kunit: tool: misc fixes (unused vars, imports, leaked files)
  kunit: fix too small allocation when using suite-only kunit.filter_glob
  kunit: tool: allow filtering test cases via glob
  kunit: drop assumption in kunit-log-test about current suite
2021-11-02 22:06:20 -07:00
Linus Torvalds
56d3375448 drm for 5.16-rc1
core:
 - improve dma_fence, lease and resv documentation
 - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
 - sched fixes/improvements
 - allow empty drm leases
 - add dma resv iterator
 - add more DP 2.0 headers
 - DP MST helper improvements for DP2.0
 
 dma-buf:
 - avoid warnings, remove fence trace macros
 
 bridge:
 - new helper to get rid of panels
 - probe improvements for it66121
 - enable DSI EOTP for anx7625
 
 fbdev:
 - efifb: release runtime PM on destroy
 
 ttm:
 - kerneldoc switch
 - helper to clear all DMA mappings
 - pool shrinker optimizaton
 - remove ttm_tt_destroy_common
 - update ttm_move_memcpy for async use
 
 panel:
 - add new panel-edp driver
 
 amdgpu:
  - Initial DP 2.0 support
  - Initial USB4 DP tunnelling support
  - Aldebaran MCE support
  - Modifier support for DCC image stores for GFX 10.3
  - Display rework for better FP code handling
  - Yellow Carp/Cyan Skillfish updates
  - Cyan Skillfish display support
  - convert vega/navi to IP discovery asic enumeration
  - validate IP discovery table
  - RAS improvements
  - Lots of fixes
 
  i915:
  - DG1 PCI IDs + LMEM discovery/placement
  - DG1 GuC submission by default
  - ADL-S PCI IDs updated + enabled by default
  - ADL-P (XE_LPD) fixed and updates
  - DG2 display fixes
  - PXP protected object support for Gen12 integrated
  - expose multi-LRC submission interface for GuC
  - export logical engine instance to user
  - Disable engine bonding on Gen12+
  - PSR cleanup
  - PSR2 selective fetch by default
  - DP 2.0 prep work
  - VESA vendor block + MSO use of it
  - FBC refactor
  - try again to fix fast-narrow vs slow-wide eDP training
  - use THP when IOMMU enabled
  - LMEM backup/restore for suspend/resume
  - locking simplification
  - GuC major reworking
  - async flip VT-D workaround changes
  - DP link training improvements
  - misc display refactorings
 
 bochs:
 - new PCI ID
 
 rcar-du:
 - Non-contiguious buffer import support for rcar-du
 - r8a779a0 support prep
 
 omapdrm:
 - COMPILE_TEST fixes
 
 sti:
 - COMPILE_TEST fixes
 
 msm:
 - fence ordering improvements
 - eDP support in DP sub-driver
 - dpu irq handling cleanup
 - CRC support for making igt happy
 - NO_CONNECTOR bridge support
 - dsi: 14nm phy support for msm8953
 - mdp5: msm8x53, sdm450, sdm632 support
 
 stm:
 - layer alpha + zpo support
 
 v3d:
 - fix Vulkan CTS failure
 - support multiple sync objects
 
 gud:
 - add R8/RGB332/RGB888 pixel formats
 
 vc4:
 - convert to new bridge helpers
 
 vgem:
 - use shmem helpers
 
 virtio:
 - support mapping exported vram
 
 zte:
 - remove obsolete driver
 
 rockchip:
 - use bridge attach no connector for LVDS/RGB
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGByPYACgkQDHTzWXnE
 hr6fxA//cXUvTHlEtF7UJDBRAYv+9lXH39NbGYU4aLJuBNlZztCuUi5JOSyDFDH1
 N9VI5biVseev2PEnCzJUubWxTqbUO7FBQTw0TyvZ4Eqn+UZMuFeo0dvdKZRAkvjV
 VHSUc0fm0+WSYanKUK7XK0fwG8aE6JVyYngzgKPSjifhszTdiiRsbU21iTinFhkS
 rgh3HEVELp+LqfoG4qzAYqFUjYqUjvCjd/hX/UkzCII8ZXKr38/4127e95443WOk
 +jes0gWGJe9TvSDrqo9TMx4qukcOniINFUvnzoD2RhOS+Jzr/i5rBh51Xy92g3NO
 Q7hy6byZdk/ZO/MXCDQ2giUOkBiqn5fQjlRGQp4iAZYw9pb3HU+/xrTq0BWVWd8o
 /vmzZYEKKU/sCGpxVDMZxsHV3mXIuVBvuZq6bjmSGcybgOBCiDx5F/Rum4nY2yHp
 lr3cuc0HP3m3f4b/HVvACO4tGd1nDDpVcon7CuhBB7HB7t6Zl9u18qc/qFw0tCTh
 3sgAhno6XFXtPFcSX2KAeeg0mhKDKKrsOnq5y3bDRr05Z0jLocJk95aXEKs6em4j
 gbyHwNaX3CHtiCnFn2/5169+n1K7zqHBtVSGmQlmFDv55rcdx7L3Spk7tCahQeSQ
 ur24r+sEggm8d5Wjl+MYq6wW3oP31s04JFaeV6oCkaSp1wS+alg=
 =jdhH
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Summary below. i915 starts to add support for DG2 GPUs, enables DG1
  and ADL-S support by default, lots of work to enable DisplayPort 2.0
  across drivers. Lots of documentation updates and fixes across the
  board.

  core:
   - improve dma_fence, lease and resv documentation
   - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
   - sched fixes/improvements
   - allow empty drm leases
   - add dma resv iterator
   - add more DP 2.0 headers
   - DP MST helper improvements for DP2.0

  dma-buf:
   - avoid warnings, remove fence trace macros

  bridge:
   - new helper to get rid of panels
   - probe improvements for it66121
   - enable DSI EOTP for anx7625

  fbdev:
   - efifb: release runtime PM on destroy

  ttm:
   - kerneldoc switch
   - helper to clear all DMA mappings
   - pool shrinker optimizaton
   - remove ttm_tt_destroy_common
   - update ttm_move_memcpy for async use

  panel:
   - add new panel-edp driver

  amdgpu:
   - Initial DP 2.0 support
   - Initial USB4 DP tunnelling support
   - Aldebaran MCE support
   - Modifier support for DCC image stores for GFX 10.3
   - Display rework for better FP code handling
   - Yellow Carp/Cyan Skillfish updates
   - Cyan Skillfish display support
   - convert vega/navi to IP discovery asic enumeration
   - validate IP discovery table
   - RAS improvements
   - Lots of fixes

  i915:
   - DG1 PCI IDs + LMEM discovery/placement
   - DG1 GuC submission by default
   - ADL-S PCI IDs updated + enabled by default
   - ADL-P (XE_LPD) fixed and updates
   - DG2 display fixes
   - PXP protected object support for Gen12 integrated
   - expose multi-LRC submission interface for GuC
   - export logical engine instance to user
   - Disable engine bonding on Gen12+
   - PSR cleanup
   - PSR2 selective fetch by default
   - DP 2.0 prep work
   - VESA vendor block + MSO use of it
   - FBC refactor
   - try again to fix fast-narrow vs slow-wide eDP training
   - use THP when IOMMU enabled
   - LMEM backup/restore for suspend/resume
   - locking simplification
   - GuC major reworking
   - async flip VT-D workaround changes
   - DP link training improvements
   - misc display refactorings

  bochs:
   - new PCI ID

  rcar-du:
   - Non-contiguious buffer import support for rcar-du
   - r8a779a0 support prep

  omapdrm:
   - COMPILE_TEST fixes

  sti:
   - COMPILE_TEST fixes

  msm:
   - fence ordering improvements
   - eDP support in DP sub-driver
   - dpu irq handling cleanup
   - CRC support for making igt happy
   - NO_CONNECTOR bridge support
   - dsi: 14nm phy support for msm8953
   - mdp5: msm8x53, sdm450, sdm632 support

  stm:
   - layer alpha + zpo support

  v3d:
   - fix Vulkan CTS failure
   - support multiple sync objects

  gud:
   - add R8/RGB332/RGB888 pixel formats

  vc4:
   - convert to new bridge helpers

  vgem:
   - use shmem helpers

  virtio:
   - support mapping exported vram

  zte:
   - remove obsolete driver

  rockchip:
   - use bridge attach no connector for LVDS/RGB"

* tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm: (1259 commits)
  drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
  drm/amd/display: MST support for DPIA
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
  drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
  drm/amdgpu/UAPI: rearrange header to better align related items
  drm/amd/display: Enable dpia in dmub only for DCN31 B0
  drm/amd/display: Fix USB4 hot plug crash issue
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amd/display: move FPU associated DCN301 code to DML folder
  drm/amd/display: fix link training regression for 1 or 2 lane
  drm/amd/display: add two lane settings training options
  drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings
  drm/amd/display: implement decide lane settings
  drm/amd/display: adopt DP2.0 LT SCR revision 8
  drm/amd/display: FEC configuration for dpia links in MST mode
  drm/amd/display: FEC configuration for dpia links
  drm/amd/display: Add workaround flag for EDID read on certain docks
  drm/amd/display: Set phy_mux_sel bit in dmub scratch register
  ...
2021-11-02 16:47:49 -07:00
Linus Torvalds
c03098d4b9 gfs2: Fix mmap + page fault deadlocks
Functions gfs2_file_read_iter and gfs2_file_write_iter are both
 accessing the user buffer to write to or read from while holding the
 inode glock.  In the most basic scenario, that buffer will not be
 resident and it will be mapped to the same file.  Accessing the buffer
 will trigger a page fault, and gfs2 will deadlock trying to take the
 same inode glock again while trying to handle that fault.
 
 Fix that and similar, more complex scenarios by disabling page faults
 while accessing user buffers.  To make this work, introduce a small
 amount of new infrastructure and fix some bugs that didn't trigger so
 far, with page faults enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmGBPisUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTpE6A/7BezUnGuNJxJrR8pC+vcLYA7xAgUU
 6STQ6IN7w5UHRlSkNzZxZ2XPxW4uVQ4SxSEeaLqBsHZihepjcLNFZ/8MhQ6UPSD0
 8noHOi7CoIcp6IuWQtCpxRM/xjjm2SlMt2XbVJZaiJcdzCV9gB6TU9EkBRq7Zm/X
 9WFBbv1xZF0skn9ISCJvNtiiI+VyWKgMDUKxJUiTQjmJcklyyqHcVGmQi9BjqPz4
 4s3F+WH6CoGbDKlmNk/6Y9wZ/2+sbvGswVscUxPwJVPoZWsR1xBBUdAeAmEMD1P4
 BgE/Y1J8JXyVPYtyvZKq70XUhKdQkxB7RfX87YasOk9mY4Kjd5rIIGEykh+o2vC9
 kDhCHvf2Mnw5I6Rum3B7UXyB1vemY+fECIHsXhgBnS+ztabRtcAdpCuWoqb43ymw
 yEX1KwXyU4FpRYbrRvdZT42Fmh6ty8TW+N4swg8S2TrffirvgAi5yrcHZ4mPupYv
 lyzvsCW7Wv8hPXn/twNObX+okRgJnsxcCdBXARdCnRXfA8tH23xmu88u8RA1Vdxh
 nzTvv6Dx2EowwojuDWMx29Mw3fA2IqIfbOV+4FaRU7NZ2ZKtknL8yGl27qQUsMoJ
 vYsHTmagasjQr+NDJ3vQRLCw+JQ6B1hENpdkmixFD9moo7X1ZFW3HBi/UL973Bv6
 5CmgeXto8FRUFjI=
 =WeNd
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.15-rc5-mmap-fault' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 mmap + page fault deadlocks fixes from Andreas Gruenbacher:
 "Functions gfs2_file_read_iter and gfs2_file_write_iter are both
  accessing the user buffer to write to or read from while holding the
  inode glock.

  In the most basic deadlock scenario, that buffer will not be resident
  and it will be mapped to the same file. Accessing the buffer will
  trigger a page fault, and gfs2 will deadlock trying to take the same
  inode glock again while trying to handle that fault.

  Fix that and similar, more complex scenarios by disabling page faults
  while accessing user buffers. To make this work, introduce a small
  amount of new infrastructure and fix some bugs that didn't trigger so
  far, with page faults enabled"

* tag 'gfs2-v5.15-rc5-mmap-fault' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix mmap + page fault deadlocks for direct I/O
  iov_iter: Introduce nofault flag to disable page faults
  gup: Introduce FOLL_NOFAULT flag to disable page faults
  iomap: Add done_before argument to iomap_dio_rw
  iomap: Support partial direct I/O on user copy failures
  iomap: Fix iomap_dio_rw return value for user copies
  gfs2: Fix mmap + page fault deadlocks for buffered I/O
  gfs2: Eliminate ip->i_gh
  gfs2: Move the inode glock locking to gfs2_file_buffered_write
  gfs2: Introduce flag for glock holder auto-demotion
  gfs2: Clean up function may_grant
  gfs2: Add wrapper for iomap_file_buffered_write
  iov_iter: Introduce fault_in_iov_iter_writeable
  iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable
  gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable}
  powerpc/kvm: Fix kvm_use_magic_page
  iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
2021-11-02 12:25:03 -07:00
Linus Torvalds
0aaa58eca6 printk changes for 5.16
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmGBCBkACgkQUqAMR0iA
 lPLMdg/6Ag9V5Q6DPvbYe0WK8wfrrRL39Eic+K6wrYBVK/8rvMUy4Oee5tyOqCz7
 z9GM+SivWRtEdEy8X/HzoawMQEuy3jLcaFoCNxHcScmc6R5Sd8otxPU5Lo8aZPLN
 Pulni9EprysI2zhLqq5m6o/F9pMOY0y8uKbD1mgIHEV9yoLan+CZ+vahf/eFwYQu
 NtYlMoK2KbS2mChGOZuLsthhyNxcCNFWWNwpBBQz7iJ9ZvnKCZ3EwG7Nx34Rx7ZE
 TYZ2iga3TTONsoCk0IClbA6zRIowgumKQl9aY9Oci1MXdIEug42i0GEl+p4iCkrH
 VhLyPsvJG6xyE6aCg/p2SB1vPasY+pp94VfTjFfmMulYdUHK7ipfZCR3ddxayR4B
 PEsITibo/hHYEVerMMSyVXttiPS7qFhIyZkNuX/xpCMLz8RSFjgU5QhR848A4scM
 r+qv1p7xkdBRvH3jlStrpLRnGtqOucvbNQgyvQiinm0yunpJN8FZgEsHnP60E5+j
 DLpQF/bK2h7PhE2Wy8/iINi49/dZiIldZ1gZV4BxjuJ5zwSLdiuR9aP51RK4IRhV
 qraLwU6yNv0k4v6sjXV78inQQ2vkqy/MBYMe3zqnpYbJB2DZYCbeRE62whrdEd4W
 wxHxiY7r9dR6gtJB52kGepbryd3JIMdI49oFRjvGi2shaXG1AZ0=
 =t12m
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Extend %pGp print format to print hex value of the page flags

 - Use kvmalloc instead of kmalloc to allocate devkmsg buffers

 - Misc cleanup and warning fixes

* tag 'printk-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: Update %pGp documentation about that it prints hex value
  lib/vsprintf.c: Amend static asserts for format specifier flags
  vsprintf: Make %pGp print the hex value
  test_printf: Append strings more efficiently
  test_printf: Remove custom appending of '|'
  test_printf: Remove separate page_flags variable
  test_printf: Make pft array const
  ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
  printk: use gnu_printf format attribute for printk_sprint()
  printk: avoid -Wsometimes-uninitialized warning
  printk: use kvmalloc instead of kmalloc for devkmsg_user
2021-11-02 10:53:45 -07:00
Linus Torvalds
fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Petr Mladek
40e64a88da Merge branch 'for-5.16-vsprintf-pgp' into for-linus 2021-11-02 10:39:27 +01:00
Linus Torvalds
bfc484fe6a Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:

   - Delay boot-up self-test for built-in algorithms

  Algorithms:

   - Remove fallback path on arm64 as SIMD now runs with softirq off

  Drivers:

   - Add Keem Bay OCS ECC Driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (61 commits)
  crypto: testmgr - fix wrong key length for pkcs1pad
  crypto: pcrypt - Delay write to padata->info
  crypto: ccp - Make use of the helper macro kthread_run()
  crypto: sa2ul - Use the defined variable to clean code
  crypto: s5p-sss - Add error handling in s5p_aes_probe()
  crypto: keembay-ocs-ecc - Add Keem Bay OCS ECC Driver
  dt-bindings: crypto: Add Keem Bay ECC bindings
  crypto: ecc - Export additional helper functions
  crypto: ecc - Move ecc.h to include/crypto/internal
  crypto: engine - Add KPP Support to Crypto Engine
  crypto: api - Do not create test larvals if manager is disabled
  crypto: tcrypt - fix skcipher multi-buffer tests for 1420B blocks
  hwrng: s390 - replace snprintf in show functions with sysfs_emit
  crypto: octeontx2 - set assoclen in aead_do_fallback()
  crypto: ccp - Fix whitespace in sev_cmd_buffer_len()
  hwrng: mtk - Force runtime pm ops for sleep ops
  crypto: testmgr - Only disable migration in crypto_disable_simd_for_test()
  crypto: qat - share adf_enable_pf2vf_comms() from adf_pf2vf_msg.c
  crypto: qat - extract send and wait from adf_vf2pf_request_version()
  crypto: qat - add VF and PF wrappers to common send function
  ...
2021-11-01 21:24:02 -07:00
Linus Torvalds
d2fac0afe8 audit/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANdUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOmihAAgKSTv4Jf0s4yopdcxfuLweiyqHX1
 719QJzdLZohmllrJPq/83FZL9qodCzxy87nAm67Ht0baSKiEjtVgRaVCqJWEE+l6
 oQL+wUsGLP7CmExOP503Uh6tW35AhETQA4Uwu6QtiUYLYG17kAgeR3cTFuekUsJS
 iL4K65PXE2bBxMe7Ta1YIZqcxptbknMgpqYkdne7xs7RS+UiVj8TyRle6ACrfzEX
 IVy4LTk+spHCy1a494g9pt/21xOnbiLHr/FpckALscnvJiUThxbfQHGSQeMpM4uM
 BnwCqFrj860vMeh52M11/GAAXmdPh6AjoLhaSIW2I3M2GbV8ZP2hu1HYUz3osmrT
 f+aeMPJ4feX1xVj6qAC+1G83XRO83tP/YIEuocGiwyepImB25NHPin21xepf6Ru0
 wJX+aXC9O1eG6E2ghT6tBim/MpeNH5OT0hNO3uhGmEQ6xZpArRVVaBwlEdufJiCx
 ZljqEFUT7wA9nGEQif6GdLnGezGr/aNL65caTkIAzHKamd79QIr7VZXYjYIfHSqE
 p74Aro6E8qoQJjsTSkvZceM0u1LRzwS4wPRroE6eGz98oYDpiDm1RPb+9Gw5jyJf
 JN7UjJKO9+iPGAi3KivGBqpBskw4cCp2y/nHrMYmpGUPELcr5kQtDfQ6yp59tVZ8
 Dwo5GeSlG6khmiI=
 =WrEw
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "Add some additional audit logging to capture the openat2() syscall
  open_how struct info.

  Previous variations of the open()/openat() syscalls allowed audit
  admins to inspect the syscall args to get the information contained in
  the new open_how struct used in openat2()"

* tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: return early if the filter rule has a lower priority
  audit: add OPENAT2 record to list "how" info
  audit: add support for the openat2 syscall
  audit: replace magic audit syscall class numbers with macros
  lsm_audit: avoid overloading the "key" audit field
  audit: Convert to SPDX identifier
  audit: rename struct node to struct audit_node to prevent future name collisions
2021-11-01 21:17:39 -07:00
Linus Torvalds
79ef0c0014 Tracing updates for 5.16:
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
   dump happens from a kretprobe callback.
 
 - Fix to bootconfig parsing
 
 - Have tracefs allow owner and group permissions by default (only denying
   others). There's been pressure to allow non root to tracefs in a
   controlled fashion, and using groups is probably the safest.
 
 - Bootconfig memory managament updates.
 
 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.
 
 - Allow perf to be traced by function tracer.
 
 - Rewrite of function graph tracer to be a callback from the function tracer
   instead of having its own trampoline (this change will happen on an arch
   by arch basis, and currently only x86_64 implements it).
 
 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.
 
 - Allow histogram triggers to add variables that can perform calculations
   against the event's fields.
 
 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent warnings
   from the compiler.
 
 - Extend histogram triggers to key off of variables.
 
 - Have trace recursion use bit magic to determine preempt context over if
   branches.
 
 - Have trace recursion disable preemption as all use cases do anyway.
 
 - Added testing for verification of tracing utilities.
 
 - Various small clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
 Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
 =Loz8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Jakub Kicinski
b7b98f8689 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-11-01

We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).

The main changes are:

1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.

2) Parallelize bpf test_progs, from Yucong and Andrii.

3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.

4) Improve bpf selftests on s390, from Ilya.

5) bloomfilter bpf map type, from Joanne.

6) Big improvements to JIT tests especially on Mips, from Johan.

7) Support kernel module function calls from bpf, from Kumar.

8) Support typeless and weak ksym in light skeleton, from Kumar.

9) Disallow unprivileged bpf by default, from Pawan.

10) BTF_KIND_DECL_TAG support, from Yonghong.

11) Various bpftool cleanups, from Quentin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  bpf: Factor out helpers for ctx access checking
  bpf: Factor out a helper to prepare trampoline for struct_ops prog
  selftests, bpf: Fix broken riscv build
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
  tools, build: Add RISC-V to HOSTARCH parsing
  riscv, bpf: Increase the maximum number of iterations
  selftests, bpf: Add one test for sockmap with strparser
  selftests, bpf: Fix test_txmsg_ingress_parser error
  ...
====================

Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:59:46 -07:00
Linus Torvalds
bf953917be Various hardening fixes and cleanups for 5.16-rc1
Hi Linus,
 
 Please, pull the following hardening fixes and cleanups that I've
 been collecting during the last development cycle. All of them have
 been baking in linux-next.
 
 Fix -Wcast-function-type error:
 
 - firewire: Remove function callback casts (Oscar Carter)
 
 Fix application of sizeof operator:
 
 - firmware/psci: fix application of sizeof to pointer (jing yangyang)
 
 Replace open coded instances with size_t saturating arithmetic helpers:
 
 - assoc_array: Avoid open coded arithmetic in allocator arguments (Len Baker)
 - writeback: prefer struct_size over open coded arithmetic (Len Baker)
 - aio: Prefer struct_size over open coded arithmetic (Len Baker)
 - dmaengine: pxa_dma: Prefer struct_size over open coded arithmetic (Len Baker)
 
 Flexible array transformation:
 
 - KVM: PPC: Replace zero-length array with flexible array member (Len Baker)
 
 Use 2-factor argument multiplication form:
 
 - nouveau/svm: Use kvcalloc() instead of kvzalloc() (Gustavo A. R. Silva)
 - xfs: Use kvcalloc() instead of kvzalloc() (Gustavo A. R. Silva)
 
 Thanks
 --
 Gustavo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmGAWzsACgkQRwW0y0cG
 2zF55A/+PTBZKg0XLQkPZ7HFipobeZpfvM0dU4JutwN6Kts1RmMRftPn6ootY18v
 4tWR4jXcnblvEr7UTgYAl6QQdytFXZKOK+JKMWV8LXLqyNGF6sS2PmA6zk/iQoa5
 1q0IKUaaqLIXwmm3xoz+/uNHsb+kfjYOHZpHA6HhYZQFDyShW7+hhIeS1NauJo2X
 op3IWasMumrawPkCJZ0ZJJQLELtZNGt4gHnOjB1MAYhOTAokowgeeDNtyfoJ9j1L
 iL8kimphVLI35H/GERBozmqdqRGIIZLlQF4P66VfNNEXSDoKOemAKDSFrfmYoVwE
 kdh6fqeKPV/aRImrCtNthfpiEjqEpm8afQGMC5H5uPnZontUX9tcU1Qagg0vwYx0
 fLZ8mMuNQK5AZfugK+1+2ShfBYUlhvWRhQdtjC9nIAoO80NqouWB7QD0zIHC2WV7
 durdlhzxik70ISnXqKmTR6bQNcXB6kFLPR30RpcA3E6+AgwlkP0FmaD3e+sDttJ0
 vtxDMHqMMNNzOWlLW2eqEdKMEfoU0gLyRt5iM7EN6R8HUXwup5f9bu7V4LuCnR6y
 FAX4tEa8b5wg01zNfyWClCccU6tetSeXjdrhdIk7szQVsOsYXc4zxDrp6xvqsAh2
 B7GbGk5qeUzM/O7QWNIl+5s/NhUjEzQ3QiQebRDdjVyINU2OKsI=
 =Jk0U
 -----END PGP SIGNATURE-----

Merge tag 'kspp-misc-fixes-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull hardening fixes and cleanups from Gustavo A. R. Silva:
 "Various hardening fixes and cleanups that I've been collecting during
  the last development cycle:

  Fix -Wcast-function-type error:

   - firewire: Remove function callback casts (Oscar Carter)

  Fix application of sizeof operator:

   - firmware/psci: fix application of sizeof to pointer (jing yangyang)

  Replace open coded instances with size_t saturating arithmetic
  helpers:

   - assoc_array: Avoid open coded arithmetic in allocator arguments
     (Len Baker)

   - writeback: prefer struct_size over open coded arithmetic (Len
     Baker)

   - aio: Prefer struct_size over open coded arithmetic (Len Baker)

   - dmaengine: pxa_dma: Prefer struct_size over open coded arithmetic
     (Len Baker)

  Flexible array transformation:

   - KVM: PPC: Replace zero-length array with flexible array member (Len
     Baker)

  Use 2-factor argument multiplication form:

   - nouveau/svm: Use kvcalloc() instead of kvzalloc() (Gustavo A. R.
     Silva)

   - xfs: Use kvcalloc() instead of kvzalloc() (Gustavo A. R. Silva)"

* tag 'kspp-misc-fixes-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  firewire: Remove function callback casts
  nouveau/svm: Use kvcalloc() instead of kvzalloc()
  firmware/psci: fix application of sizeof to pointer
  dmaengine: pxa_dma: Prefer struct_size over open coded arithmetic
  KVM: PPC: Replace zero-length array with flexible array member
  aio: Prefer struct_size over open coded arithmetic
  writeback: prefer struct_size over open coded arithmetic
  xfs: Use kvcalloc() instead of kvzalloc()
  assoc_array: Avoid open coded arithmetic in allocator arguments
2021-11-01 17:29:10 -07:00
Linus Torvalds
2dc26d98cf overflow updates for v5.16-rc1
The end goal of the current buffer overflow detection work[0] is to gain
 full compile-time and run-time coverage of all detectable buffer overflows
 seen via array indexing or memcpy(), memmove(), and memset(). The str*()
 family of functions already have full coverage.
 
 While much of the work for these changes have been on-going for many
 releases (i.e. 0-element and 1-element array replacements, as well as
 avoiding false positives and fixing discovered overflows[1]), this series
 contains the foundational elements of several related buffer overflow
 detection improvements by providing new common helpers and FORTIFY_SOURCE
 changes needed to gain the introspection required for compiler visibility
 into array sizes. Also included are a handful of already Acked instances
 using the helpers (or related clean-ups), with many more waiting at the
 ready to be taken via subsystem-specific trees[2]. The new helpers are:
 
 - struct_group() for gaining struct member range introspection.
 - memset_after() and memset_startat() for clearing to the end of structures.
 - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
 
 Also included is the beginning of the refactoring of FORTIFY_SOURCE to
 support memcpy() introspection, fix missing and regressed coverage under
 GCC, and to prepare to fix the currently broken Clang support. Finishing
 this work is part of the larger series[0], but depends on all the false
 positives and buffer overflow bug fixes to have landed already and those
 that depend on this series to land.
 
 As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
 and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
 functions respectively. The compile time tests have found a legitimate
 (though corner-case) bug[6] already.
 
 Please note that the appearance of "panic" and "BUG" in the
 FORTIFY_SOURCE refactoring are the result of relocating existing code,
 and no new use of those code-paths are expected nor desired.
 
 Finally, there are two tree-wide conversions for 0-element arrays and
 flexible array unions to gain sane compiler introspection coverage that
 result in no known object code differences.
 
 After this series (and the changes that have now landed via netdev
 and usb), we are very close to finally being able to build with
 -Warray-bounds and -Wzero-length-bounds. However, due corner cases in
 GCC[3] and Clang[4], I have not included the last two patches that turn
 on these options, as I don't want to introduce any known warnings to
 the build. Hopefully these can be solved soon.
 
 [0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
 [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
 [2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
 [3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
 [4] https://bugs.llvm.org/show_bug.cgi?id=51682
 [5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
 [6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
 ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
 04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
 +kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
 lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
 A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
 Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
 4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
 IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
 lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
 cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
 PdA9XB82G2tmqKc9cQ==
 =9xbT
 -----END PGP SIGNATURE-----

Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull overflow updates from Kees Cook:
 "The end goal of the current buffer overflow detection work[0] is to
  gain full compile-time and run-time coverage of all detectable buffer
  overflows seen via array indexing or memcpy(), memmove(), and
  memset(). The str*() family of functions already have full coverage.

  While much of the work for these changes have been on-going for many
  releases (i.e. 0-element and 1-element array replacements, as well as
  avoiding false positives and fixing discovered overflows[1]), this
  series contains the foundational elements of several related buffer
  overflow detection improvements by providing new common helpers and
  FORTIFY_SOURCE changes needed to gain the introspection required for
  compiler visibility into array sizes. Also included are a handful of
  already Acked instances using the helpers (or related clean-ups), with
  many more waiting at the ready to be taken via subsystem-specific
  trees[2].

  The new helpers are:

   - struct_group() for gaining struct member range introspection

   - memset_after() and memset_startat() for clearing to the end of
     structures

   - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
     structs

  Also included is the beginning of the refactoring of FORTIFY_SOURCE to
  support memcpy() introspection, fix missing and regressed coverage
  under GCC, and to prepare to fix the currently broken Clang support.
  Finishing this work is part of the larger series[0], but depends on
  all the false positives and buffer overflow bug fixes to have landed
  already and those that depend on this series to land.

  As part of the FORTIFY_SOURCE refactoring, a set of both a
  compile-time and run-time tests are added for FORTIFY_SOURCE and the
  mem*()-family functions respectively. The compile time tests have
  found a legitimate (though corner-case) bug[6] already.

  Please note that the appearance of "panic" and "BUG" in the
  FORTIFY_SOURCE refactoring are the result of relocating existing code,
  and no new use of those code-paths are expected nor desired.

  Finally, there are two tree-wide conversions for 0-element arrays and
  flexible array unions to gain sane compiler introspection coverage
  that result in no known object code differences.

  After this series (and the changes that have now landed via netdev and
  usb), we are very close to finally being able to build with
  -Warray-bounds and -Wzero-length-bounds.

  However, due corner cases in GCC[3] and Clang[4], I have not included
  the last two patches that turn on these options, as I don't want to
  introduce any known warnings to the build. Hopefully these can be
  solved soon"

Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]

* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
  fortify: strlen: Avoid shadowing previous locals
  compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
  treewide: Replace 0-element memcpy() destinations with flexible arrays
  treewide: Replace open-coded flex arrays in unions
  stddef: Introduce DECLARE_FLEX_ARRAY() helper
  btrfs: Use memset_startat() to clear end of struct
  string.h: Introduce memset_startat() for wiping trailing members and padding
  xfrm: Use memset_after() to clear padding
  string.h: Introduce memset_after() for wiping trailing members/padding
  lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
  fortify: Add compile-time FORTIFY_SOURCE tests
  fortify: Allow strlen() and strnlen() to pass compile-time known lengths
  fortify: Prepare to improve strnlen() and strlen() warnings
  fortify: Fix dropped strcpy() compile-time write overflow check
  fortify: Explicitly disable Clang support
  fortify: Move remaining fortify helpers into fortify-string.h
  lib/string: Move helper functions out of string.c
  compiler_types.h: Remove __compiletime_object_size()
  cm4000_cs: Use struct_group() to zero struct cm4000_dev region
  can: flexcan: Use struct_group() to zero struct flexcan_regs regions
  ...
2021-11-01 17:12:56 -07:00
Linus Torvalds
46f8763228 arm64 updates for 5.16
- Support for the Arm8.6 timer extensions, including a self-synchronising
   view of the system registers to elide some expensive ISB instructions.
 
 - Exception table cleanup and rework so that the fixup handlers appear
   correctly in backtraces.
 
 - A handful of miscellaneous changes, the main one being selection of
   CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
 
 - More mm and pgtable cleanups.
 
 - KASAN support for "asymmetric" MTE, where tag faults are reported
   synchronously for loads (via an exception) and asynchronously for
   stores (via a register).
 
 - Support for leaving the MMU enabled during kexec relocation, which
   significantly speeds up the operation.
 
 - Minor improvements to our perf PMU drivers.
 
 - Improvements to the compat vDSO build system, particularly when
   building with LLVM=1.
 
 - Preparatory work for handling some Coresight TRBE tracing errata.
 
 - Cleanup and refactoring of the SVE code to pave the way for SME
   support in future.
 
 - Ensure SCS pages are unpoisoned immediately prior to freeing them
   when KASAN is enabled for the vmalloc area.
 
 - Try moving to the generic pfn_valid() implementation again now that
   the DMA mapping issue from last time has been resolved.
 
 - Numerous improvements and additions to our FPSIMD and SVE selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
 N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
 IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
 GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
 lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
 JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
 =3R3j
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's the usual summary below, but the highlights are support for
  the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
  ability to kexec() with the MMU enabled and a second attempt at
  switching to the generic pfn_valid() implementation.

  Summary:

   - Support for the Arm8.6 timer extensions, including a
     self-synchronising view of the system registers to elide some
     expensive ISB instructions.

   - Exception table cleanup and rework so that the fixup handlers
     appear correctly in backtraces.

   - A handful of miscellaneous changes, the main one being selection of
     CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.

   - More mm and pgtable cleanups.

   - KASAN support for "asymmetric" MTE, where tag faults are reported
     synchronously for loads (via an exception) and asynchronously for
     stores (via a register).

   - Support for leaving the MMU enabled during kexec relocation, which
     significantly speeds up the operation.

   - Minor improvements to our perf PMU drivers.

   - Improvements to the compat vDSO build system, particularly when
     building with LLVM=1.

   - Preparatory work for handling some Coresight TRBE tracing errata.

   - Cleanup and refactoring of the SVE code to pave the way for SME
     support in future.

   - Ensure SCS pages are unpoisoned immediately prior to freeing them
     when KASAN is enabled for the vmalloc area.

   - Try moving to the generic pfn_valid() implementation again now that
     the DMA mapping issue from last time has been resolved.

   - Numerous improvements and additions to our FPSIMD and SVE
     selftests"

[ armv8.6 timer updates were in a shared branch and already came in
  through -tip in the timer pull  - Linus ]

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
  arm64: Select POSIX_CPU_TIMERS_TASK_WORK
  arm64: Document boot requirements for FEAT_SME_FA64
  arm64/sve: Fix warnings when SVE is disabled
  arm64/sve: Add stub for sve_max_virtualisable_vl()
  arm64: errata: Add detection for TRBE write to out-of-range
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE overwrite in FILL mode
  arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  selftests: arm64: Factor out utility functions for assembly FP tests
  arm64: vmlinux.lds.S: remove `.fixup` section
  arm64: extable: add load_unaligned_zeropad() handler
  arm64: extable: add a dedicated uaccess handler
  arm64: extable: add `type` and `data` fields
  arm64: extable: use `ex` for `exception_table_entry`
  arm64: extable: make fixup_exception() return bool
  arm64: extable: consolidate definitions
  arm64: gpr-num: support W registers
  arm64: factor out GPR numbering helpers
  arm64: kvm: use kvm_exception_table_entry
  arm64: lib: __arch_copy_to_user(): fold fixups into body
  ...
2021-11-01 16:33:53 -07:00
Linus Torvalds
43aa0a195f objtool updates:
- Improve retpoline code patching by separating it from alternatives which
    reduces memory footprint and allows to do better optimizations in the
    actual runtime patching.
 
  - Add proper retpoline support for x86/BPF
 
  - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code
 
  - Add support to handle pv_opsindirect calls in the noinstr analysis
 
  - Classify symbols upfront and cache the result to avoid redundant
    str*cmp() invocations.
 
  - Add a CFI hash to reduce memory consumption which also reduces runtime
    on a allyesconfig by ~50%
 
  - Adjust XEN code to make objtool handling more robust and as a side
    effect to prevent text fragmentation due to placement of the hypercall
    page.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GFgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoc1JD/0Sz6seP2OUMxbMT3gCcFo9sMvYTdsM
 7WuGFbBbnCIo7g8JH7k0zRRBigptMp2eUtQXKkgaaIbWN4JbuVKf8KxN5/qXxLi4
 fJ12QnNTGH9N2jtzl5wKmpjaKJnnJMD9D10XwoR+T6gn6NHd+AgLEs7GxxuQUlgo
 eC9oEXhNHC8uNhiZc38EwfwmItI1bRgaLrnZWIL4rYGSMxfCK1/cEOpWrFfX9wmj
 /diB6oqMyPXZXMCtgpX7TniUr5XOTCcUkeO9mQv5bmyq/YM/8hrTbcVSJlsVYLvP
 EsBnUSHAcfLFiHXwa1RNiIGdbiPjbN+UYeXGAvqF58f3e5dTIHtN/UmWo7OH93If
 9rLMVNcMpsfPx7QRk2IxEPumLCkyfwjzfKrVDM6P6TKEIUzD1og4IK9gTlfykVsh
 56G5XiCOC/X2x8IMxKTLGuBiAVLFHXK/rSwoqhvNEWBFKDbP13QWs0LurBcW09Sa
 /kQI9pIBT1xFA/R+OY5Xy1cqNVVK1Gxmk8/bllCijA9pCFSCFM4hLZE5CevdrBCV
 h5SdqEK5hIlzFyypXfsCik/4p/+rfvlGfUKtFsPctxx29SPe+T0orx+l61jiWQok
 rZOflwMawK5lDuASHrvNHGJcWaTwoo3VcXMQDnQY0Wulc43J5IFBaPxkZzgyd+S1
 4lktHxatrCMUgw==
 =pfZi
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Thomas Gleixner:

 - Improve retpoline code patching by separating it from alternatives
   which reduces memory footprint and allows to do better optimizations
   in the actual runtime patching.

 - Add proper retpoline support for x86/BPF

 - Address noinstr warnings in x86/kvm, lockdep and paravirtualization
   code

 - Add support to handle pv_opsindirect calls in the noinstr analysis

 - Classify symbols upfront and cache the result to avoid redundant
   str*cmp() invocations.

 - Add a CFI hash to reduce memory consumption which also reduces
   runtime on a allyesconfig by ~50%

 - Adjust XEN code to make objtool handling more robust and as a side
   effect to prevent text fragmentation due to placement of the
   hypercall page.

* tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  bpf,x86: Respect X86_FEATURE_RETPOLINE*
  bpf,x86: Simplify computing label offsets
  x86,bugs: Unconditionally allow spectre_v2=retpoline,amd
  x86/alternative: Add debug prints to apply_retpolines()
  x86/alternative: Try inline spectre_v2=retpoline,amd
  x86/alternative: Handle Jcc __x86_indirect_thunk_\reg
  x86/alternative: Implement .retpoline_sites support
  x86/retpoline: Create a retpoline thunk array
  x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h
  x86/asm: Fixup odd GEN-for-each-reg.h usage
  x86/asm: Fix register order
  x86/retpoline: Remove unused replacement symbols
  objtool,x86: Replace alternatives with .retpoline_sites
  objtool: Shrink struct instruction
  objtool: Explicitly avoid self modifying code in .altinstr_replacement
  objtool: Classify symbols
  objtool: Support pv_opsindirect calls for noinstr
  x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays
  x86/xen: Mark xen_force_evtchn_callback() noinstr
  x86/xen: Make irq_disable() noinstr
  ...
2021-11-01 13:24:43 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Linus Torvalds
67a135b80e Changes since last update:
- support multiple devices for multi-layer container images;
 
  - support the secondary compression head;
 
  - support readmore decompression strategy;
 
  - support new LZMA algorithm (specifically called MicroLZMA);
 
  - some bugfixes & cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYX8j7hEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBE+SAQChAmAUav03OQujm8PvVNX7VUGusGNvww8E
 qu5+zasC8wEArypW2Z75ZZ3IZNPCk6QWFlaC2I5Xnz7NNl0OGPKOCAg=
 =DZQ4
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "There are some new features available for this cycle. Firstly, EROFS
  LZMA algorithm support, specifically called MicroLZMA, is available as
  an option for embedded devices, LiveCDs and/or as the secondary
  auxiliary compression algorithm besides the primary algorithm in one
  file.

  In order to better support the LZMA fixed-sized output compression,
  especially for 4KiB pcluster size (which has lowest memory pressure
  thus useful for memory-sensitive scenarios), Lasse introduced a new
  LZMA header/container format called MicroLZMA to minimize the original
  LZMA1 header (for example, we don't need to waste 4-byte dictionary
  size and another 8-byte uncompressed size, which can be calculated by
  fs directly, for each pcluster) and enable EROFS fixed-sized output
  compression.

  Note that MicroLZMA can also be later used by other things in addition
  to EROFS too where wasting minimal amount of space for headers is
  important and it can be only compiled by enabling XZ_DEC_MICROLZMA.
  MicroLZMA has been supported by the latest upstream XZ embedded [1] &
  XZ utils [2], apply the latest related XZ embedded upstream patches by
  the XZ author Lasse here.

  Secondly, multiple device is also supported in this cycle, which is
  designed for multi-layer container images. By working together with
  inter-layer data deduplication and compression, we can achieve the
  next high-performance container image solution. Our team will announce
  the new Nydus container image service [3] implementation with new RAFS
  v6 (EROFS-compatible) format in Open Source Summit 2021 China [4]
  soon.

  Besides, the secondary compression head support and readmore
  decompression strategy are also included in this cycle. There are also
  some minor bugfixes and cleanups, as always.

  Summary:

   - support multiple devices for multi-layer container images;

   - support the secondary compression head;

   - support readmore decompression strategy;

   - support new LZMA algorithm (specifically called MicroLZMA);

   - some bugfixes & cleanups"

* tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: don't trigger WARN() when decompression fails
  erofs: get rid of ->lru usage
  erofs: lzma compression support
  erofs: rename some generic methods in decompressor
  lib/xz, lib/decompress_unxz.c: Fix spelling in comments
  lib/xz: Add MicroLZMA decoder
  lib/xz: Move s->lzma.len = 0 initialization to lzma_reset()
  lib/xz: Validate the value before assigning it to an enum variable
  lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
  erofs: introduce readmore decompression strategy
  erofs: introduce the secondary compression head
  erofs: get compression algorithms directly on mapping
  erofs: add multiple device support
  erofs: decouple basic mount options from fs_context
  erofs: remove the fast path of per-CPU buffer decompression
2021-11-01 11:39:22 -07:00
Linus Torvalds
33c8846c81 for-5.16/block-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KDgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmQ2D/wO0nH3U+3+OZChi3XUwYck9Dev3o6BANCF
 ClATiK/kivZY0xY1r8J4ixirZo2gcjIMpWSC3JGYZ5LdspfmYGLUbMjfZsaeU23i
 lAKaX1IqfArmHN76k3IU1bKCg7B0/LFwC0q9QTFWTSwNSs8RK/EZLJ61U1hEXUb3
 OfIpaMmvPiMaU7yuPqhcZK14m1cg1srrLM4rFB/PqsWWStF07pHq32WeArGDAU0e
 Fe0YSnYD7qqA5Qc37KwqjCTmmxKX5YZf7etIcA6p3DNmwcuQrVNzKoCH/ZEDijaD
 E2bS/BWbN1x96+rtoEZfBYEaNIrkmJzmW6+fJ53OITbJF3KqP6V66erhqNcFYCzC
 mhFlRe7voXb/8AP7zQqSIhK529BUBM36sQ6nF7EiQcDrfLc1z39mq6eblUxbknIA
 DDPISD5Tseik9N9x0bc7vINseKyHI1E90VAU/XKADcuGbzLvehPx+2p+Iq5ch5Ah
 oa1G3RdlWWQOZxphJHWJhu1qMfo5+FP9dFZj1aoo7b8Kbc/CedyoQe71cpIE5wNh
 Jj/EpWJnuyKXwuTic2VYGC+6ezM9O5DSdqCfP3YuZky95VESyvRCKJYMMgBYRVdC
 /LuxhnBXIY2G8An7ZTnX0kLCCvLbapIwa0NyA98/xeOngO843coJ6wn8ZmE9LJNH
 kMmpCygUrA==
 =QWC+
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - mq-deadline accounting improvements (Bart)

 - blk-wbt timer fix (Andrea)

 - Untangle the block layer includes (Christoph)

 - Rework the poll support to be bio based, which will enable adding
   support for polling for bio based drivers (Christoph)

 - Block layer core support for multi-actuator drives (Damien)

 - blk-crypto improvements (Eric)

 - Batched tag allocation support (me)

 - Request completion batching support (me)

 - Plugging improvements (me)

 - Shared tag set improvements (John)

 - Concurrent queue quiesce support (Ming)

 - Cache bdev in ->private_data for block devices (Pavel)

 - bdev dio improvements (Pavel)

 - Block device invalidation and block size improvements (Xie)

 - Various cleanups, fixes, and improvements (Christoph, Jackie,
   Masahira, Tejun, Yu, Pavel, Zheng, me)

* tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block: (174 commits)
  blk-mq-debugfs: Show active requests per queue for shared tags
  block: improve readability of blk_mq_end_request_batch()
  virtio-blk: Use blk_validate_block_size() to validate block size
  loop: Use blk_validate_block_size() to validate block size
  nbd: Use blk_validate_block_size() to validate block size
  block: Add a helper to validate the block size
  block: re-flow blk_mq_rq_ctx_init()
  block: prefetch request to be initialized
  block: pass in blk_mq_tags to blk_mq_rq_ctx_init()
  block: add rq_flags to struct blk_mq_alloc_data
  block: add async version of bio_set_polled
  block: kill DIO_MULTI_BIO
  block: kill unused polling bits in __blkdev_direct_IO()
  block: avoid extra iter advance with async iocb
  block: Add independent access ranges support
  blk-mq: don't issue request directly in case that current is to be blocked
  sbitmap: silence data race warning
  blk-cgroup: synchronize blkg creation against policy deactivation
  block: refactor bio_iov_bvec_set()
  block: add single bio async direct IO helper
  ...
2021-11-01 09:19:50 -07:00
Linus Torvalds
49f8275c7d Memory folios
Add memory folios, a new type to represent either order-0 pages or
 the head page of a compound page.  This should be enough infrastructure
 to support filesystems converting from pages to folios.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmF9uI0ACgkQDpNsjXcp
 gj7MUAf/R7LCZ+xFiIedw7SAgb/DGK0C9uVjuBEIZgAw21ZUw/GuPI6cuKBMFGGf
 rRcdtlvMpwi7yZJcoNXxaqU/xPaaJMjf2XxscIvYJP1mjlZVuwmP9dOx0neNvWOc
 T+8lqR6c1TLl82lpqIjGFLwvj2eVowq2d3J5jsaIJFd4odmmYVInrhJXOzC/LQ54
 Niloj5ksehf+KUIRLDz7ycppvIHhlVsoAl0eM2dWBAtL0mvT7Nyn/3y+vnMfV2v3
 Flb4opwJUgTJleYc16oxTn9svT2yS8q2uuUemRDLW8ABghoAtH3fUUk43RN+5Krd
 LYCtbeawtkikPVXZMfWybsx5vn0c3Q==
 =7SBe
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache

Pull memory folios from Matthew Wilcox:
 "Add memory folios, a new type to represent either order-0 pages or the
  head page of a compound page. This should be enough infrastructure to
  support filesystems converting from pages to folios.

  The point of all this churn is to allow filesystems and the page cache
  to manage memory in larger chunks than PAGE_SIZE. The original plan
  was to use compound pages like THP does, but I ran into problems with
  some functions expecting only a head page while others expect the
  precise page containing a particular byte.

  The folio type allows a function to declare that it's expecting only a
  head page. Almost incidentally, this allows us to remove various calls
  to VM_BUG_ON(PageTail(page)) and compound_head().

  This converts just parts of the core MM and the page cache. For 5.17,
  we intend to convert various filesystems (XFS and AFS are ready; other
  filesystems may make it) and also convert more of the MM and page
  cache to folios. For 5.18, multi-page folios should be ready.

  The multi-page folios offer some improvement to some workloads. The
  80% win is real, but appears to be an artificial benchmark (postgres
  startup, which isn't a serious workload). Real workloads (eg building
  the kernel, running postgres in a steady state, etc) seem to benefit
  between 0-10%. I haven't heard of any performance losses as a result
  of this series. Nobody has done any serious performance tuning; I
  imagine that tweaking the readahead algorithm could provide some more
  interesting wins. There are also other places where we could choose to
  create large folios and currently do not, such as writes that are
  larger than PAGE_SIZE.

  I'd like to thank all my reviewers who've offered review/ack tags:
  Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes
  Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil
  Babka, William Kucharski, Yu Zhao and Zi Yan.

  I'd also like to thank those who gave feedback I incorporated but
  haven't offered up review tags for this part of the series: Nick
  Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard,
  Hugh Dickins, and probably a few others who I forget"

* tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits)
  mm/writeback: Add folio_write_one
  mm/filemap: Add FGP_STABLE
  mm/filemap: Add filemap_get_folio
  mm/filemap: Convert mapping_get_entry to return a folio
  mm/filemap: Add filemap_add_folio()
  mm/filemap: Add filemap_alloc_folio
  mm/page_alloc: Add folio allocation functions
  mm/lru: Add folio_add_lru()
  mm/lru: Convert __pagevec_lru_add_fn to take a folio
  mm: Add folio_evictable()
  mm/workingset: Convert workingset_refault() to take a folio
  mm/filemap: Add readahead_folio()
  mm/filemap: Add folio_mkwrite_check_truncate()
  mm/filemap: Add i_blocks_per_folio()
  mm/writeback: Add folio_redirty_for_writepage()
  mm/writeback: Add folio_account_redirty()
  mm/writeback: Add folio_clear_dirty_for_io()
  mm/writeback: Add folio_cancel_dirty()
  mm/writeback: Add folio_account_cleaned()
  mm/writeback: Add filemap_dirty_folio()
  ...
2021-11-01 08:47:59 -07:00
Tiezhu Yang
b066abba3e bpf, tests: Add module parameter test_suite to test_bpf module
After commit 9298e63eaf ("bpf/tests: Add exhaustive tests of ALU
operand magnitudes"), when modprobe test_bpf.ko with JIT on mips64,
there exists segment fault due to the following reason:

  [...]
  ALU64_MOV_X: all register value magnitudes jited:1
  Break instruction in kernel code[#1]
  [...]

It seems that the related JIT implementations of some test cases
in test_bpf() have problems. At this moment, I do not care about
the segment fault while I just want to verify the test cases of
tail calls.

Based on the above background and motivation, add the following
module parameter test_suite to the test_bpf.ko:

  test_suite=<string>: only the specified test suite will be run, the
  string can be "test_bpf", "test_tail_calls" or "test_skb_segment".

If test_suite is not specified, but test_id, test_name or test_range
is specified, set 'test_bpf' as the default test suite. This is useful
to only test the corresponding test suite when specifying the valid
test_suite string.

Any invalid test suite will result in -EINVAL being returned and no
tests being run. If the test_suite is not specified or specified as
empty string, it does not change the current logic, all of the test
cases will be run.

Here are some test results:

 # dmesg -c
 # modprobe test_bpf
 # dmesg | grep Summary
 test_bpf: Summary: 1009 PASSED, 0 FAILED, [0/997 JIT'ed]
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
 test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_bpf
 # dmesg | tail -1
 test_bpf: Summary: 1009 PASSED, 0 FAILED, [0/997 JIT'ed]

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg
 test_bpf: #0 Tail call leaf jited:0 21 PASS
 [...]
 test_bpf: #7 Tail call error path, index out of range jited:0 32 PASS
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_skb_segment
 # dmesg
 test_bpf: #0 gso_with_rx_frags PASS
 test_bpf: #1 gso_linear_no_head_frag PASS
 test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_id=1
 # dmesg
 test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
 test_bpf: #1 TXA jited:0 54 51 50 PASS
 test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_bpf test_name=TXA
 # dmesg
 test_bpf: #1 TXA jited:0 54 50 51 PASS
 test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_tail_calls test_range=6,7
 # dmesg
 test_bpf: #6 Tail call error path, NULL target jited:0 41 PASS
 test_bpf: #7 Tail call error path, index out of range jited:0 32 PASS
 test_bpf: test_tail_calls: Summary: 2 PASSED, 0 FAILED, [0/2 JIT'ed]

 # rmmod test_bpf
 # dmesg -c
 # modprobe test_bpf test_suite=test_skb_segment test_id=1
 # dmesg
 test_bpf: #1 gso_linear_no_head_frag PASS
 test_bpf: test_skb_segment: Summary: 1 PASSED, 0 FAILED

By the way, the above segment fault has been fixed in the latest bpf-next
tree which contains the mips64 JIT rework.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/1635384321-28128-1-git-send-email-yangtiezhu@loongson.cn
2021-10-28 11:41:16 +02:00
Dave Airlie
970eae1560 Linux 5.15-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF298ceHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGIJYH/1rsEFQQ6caeQdy1
 z9eFIe48DNM4l7bFk+qEj2UAbzPdahVJ299Mg5fW0n2CDemOc9/n0b9TxQ37YObi
 mOzu0xwJVupIxkyFMPQSSc2q8aLm67NSpJy08DsmaNses5hSvu8x15RPHLQTybjt
 SwtKns+jpCq79P1GWbrB5e5UkLb0VNoxNp4L1U4pMrYGcEkJUXbaxNY2V/JcXdM7
 Vtn+qN0T/J6V6QVftv0t8Ecj3bjEnmL3kZHaTaNg3dGeKRpCGyHc5lcBQ0cNFG6t
 vjZ9VbuhBzGI3TN2tHH5hpA1UXo7HPBBCwQqxF1jeGLGHULikYwZ3TAPWqL3QZqC
 9cxr9SY=
 =p75d
 -----END PGP SIGNATURE-----

BackMerge tag 'v5.15-rc7' into drm-next

The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-28 14:59:38 +10:00
Steven Rostedt (VMware)
39d9c1c103 bootconfig: Initialize ret in xbc_parse_tree()
The do while loop continues while ret is zero, but ret is never
initialized. The check for ret in the loop at the while should always be
initialized, but if an empty string were to be passed in, q would be NULL
and p would be '\0', and it would break out of the loop without ever
setting ret.

Set ret to zero, and then xbc_verify_tree() would be called and catch that
it is an empty tree and report the proper error.

Link: https://lkml.kernel.org/r/20211027105753.6ab9da5f@gandalf.local.home

Fixes: bdac5c2b24 ("bootconfig: Allocate xbc_data inside xbc_init()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-27 11:22:09 -04:00