Commit Graph

5821 Commits

Author SHA1 Message Date
Andrew Kanner 1df83cbf23 selinux: prevent KMSAN warning in selinux_inet_conn_request()
KMSAN reports the following issue:
[   81.822503] =====================================================
[   81.823222] BUG: KMSAN: uninit-value in selinux_inet_conn_request+0x2c8/0x4b0
[   81.823891]  selinux_inet_conn_request+0x2c8/0x4b0
[   81.824385]  security_inet_conn_request+0xc0/0x160
[   81.824886]  tcp_v4_route_req+0x30e/0x490
[   81.825343]  tcp_conn_request+0xdc8/0x3400
[   81.825813]  tcp_v4_conn_request+0x134/0x190
[   81.826292]  tcp_rcv_state_process+0x1f4/0x3b40
[   81.826797]  tcp_v4_do_rcv+0x9ca/0xc30
[   81.827236]  tcp_v4_rcv+0x3bf5/0x4180
[   81.827670]  ip_protocol_deliver_rcu+0x822/0x1230
[   81.828174]  ip_local_deliver_finish+0x259/0x370
[   81.828667]  ip_local_deliver+0x1c0/0x450
[   81.829105]  ip_sublist_rcv+0xdc1/0xf50
[   81.829534]  ip_list_rcv+0x72e/0x790
[   81.829941]  __netif_receive_skb_list_core+0x10d5/0x1180
[   81.830499]  netif_receive_skb_list_internal+0xc41/0x1190
[   81.831064]  napi_complete_done+0x2c4/0x8b0
[   81.831532]  e1000_clean+0x12bf/0x4d90
[   81.831983]  __napi_poll+0xa6/0x760
[   81.832391]  net_rx_action+0x84c/0x1550
[   81.832831]  __do_softirq+0x272/0xa6c
[   81.833239]  __irq_exit_rcu+0xb7/0x1a0
[   81.833654]  irq_exit_rcu+0x17/0x40
[   81.834044]  common_interrupt+0x8d/0xa0
[   81.834494]  asm_common_interrupt+0x2b/0x40
[   81.834949]  default_idle+0x17/0x20
[   81.835356]  arch_cpu_idle+0xd/0x20
[   81.835766]  default_idle_call+0x43/0x70
[   81.836210]  do_idle+0x258/0x800
[   81.836581]  cpu_startup_entry+0x26/0x30
[   81.837002]  __pfx_ap_starting+0x0/0x10
[   81.837444]  secondary_startup_64_no_verify+0x17a/0x17b
[   81.837979]
[   81.838166] Local variable nlbl_type.i created at:
[   81.838596]  selinux_inet_conn_request+0xe3/0x4b0
[   81.839078]  security_inet_conn_request+0xc0/0x160

KMSAN warning is reproducible with:
* netlabel_mgmt_protocount is 0 (e.g. netlbl_enabled() returns 0)
* CONFIG_SECURITY_NETWORK_XFRM may be set or not
* CONFIG_KMSAN=y
* `ssh USER@HOSTNAME /bin/date`

selinux_skb_peerlbl_sid() will call selinux_xfrm_skb_sid(), then fall
to selinux_netlbl_skbuff_getsid() which will not initialize nlbl_type,
but it will be passed to:

    err = security_net_peersid_resolve(nlbl_sid,
                                       nlbl_type, xfrm_sid, sid);

and checked by KMSAN, although it will not be used inside
security_net_peersid_resolve() (at least now), since this function
will check either (xfrm_sid == SECSID_NULL) or (nlbl_sid ==
SECSID_NULL) first and return before using uninitialized nlbl_type.

Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
[PM: subject line tweak, removed 'fixes' tag as code is not broken]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-15 18:23:22 -04:00
Marco Elver aa9f10d570 hardening: Move BUG_ON_DATA_CORRUPTION to hardening options
BUG_ON_DATA_CORRUPTION is turning detected corruptions of list data
structures from WARNings into BUGs. This can be useful to stop further
corruptions or even exploitation attempts.

However, the option has less to do with debugging than with hardening.
With the introduction of LIST_HARDENED, it makes more sense to move it
to the hardening options, where it selects LIST_HARDENED instead.

Without this change, combining BUG_ON_DATA_CORRUPTION with LIST_HARDENED
alone wouldn't be possible, because DEBUG_LIST would always be selected
by BUG_ON_DATA_CORRUPTION.

Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-4-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15 14:57:25 -07:00
Marco Elver aebc7b0d8d list: Introduce CONFIG_LIST_HARDENED
Numerous production kernel configs (see [1, 2]) are choosing to enable
CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened
configs [3]. The motivation behind this is that the option can be used
as a security hardening feature (e.g. CVE-2019-2215 and CVE-2019-2025
are mitigated by the option [4]).

The feature has never been designed with performance in mind, yet common
list manipulation is happening across hot paths all over the kernel.

Introduce CONFIG_LIST_HARDENED, which performs list pointer checking
inline, and only upon list corruption calls the reporting slow path.

To generate optimal machine code with CONFIG_LIST_HARDENED:

  1. Elide checking for pointer values which upon dereference would
     result in an immediate access fault (i.e. minimal hardening
     checks).  The trade-off is lower-quality error reports.

  2. Use the __preserve_most function attribute (available with Clang,
     but not yet with GCC) to minimize the code footprint for calling
     the reporting slow path. As a result, function size of callers is
     reduced by avoiding saving registers before calling the rarely
     called reporting slow path.

     Note that all TUs in lib/Makefile already disable function tracing,
     including list_debug.c, and __preserve_most's implied notrace has
     no effect in this case.

  3. Because the inline checks are a subset of the full set of checks in
     __list_*_valid_or_report(), always return false if the inline
     checks failed.  This avoids redundant compare and conditional
     branch right after return from the slow path.

As a side-effect of the checks being inline, if the compiler can prove
some condition to always be true, it can completely elide some checks.

Since DEBUG_LIST is functionally a superset of LIST_HARDENED, the
Kconfig variables are changed to reflect that: DEBUG_LIST selects
LIST_HARDENED, whereas LIST_HARDENED itself has no dependency on
DEBUG_LIST.

Running netperf with CONFIG_LIST_HARDENED (using a Clang compiler with
"preserve_most") shows throughput improvements, in my case of ~7% on
average (up to 20-30% on some test cases).

Link: https://r.android.com/1266735 [1]
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2]
Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3]
Link: https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html [4]
Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-3-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15 14:57:25 -07:00
Khadija Kamran 8e4672d6f9 lsm: constify the 'file' parameter in security_binder_transfer_file()
SELinux registers the implementation for the "binder_transfer_file"
hook. Looking at the function implementation we observe that the
parameter "file" is not changing.

Mark the "file" parameter of LSM hook security_binder_transfer_file() as
"const" since it will not be changing in the LSM hook.

Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com>
[PM: subject line whitespace fix]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-15 16:04:34 -04:00
David Howells d80a8f1b58 vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
When NFS superblocks are created by automounting, their LSM parameters
aren't set in the fs_context struct prior to sget_fc() being called,
leading to failure to match existing superblocks.

This bug leads to messages like the following appearing in dmesg when
fscache is enabled:

    NFS: Cache volume key already in use (nfs,4.2,2,108,106a8c0,1,,,,100000,100000,2ee,3a98,1d4c,3a98,1)

Fix this by adding a new LSM hook to load fc->security for submount
creation.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/165962680944.3334508.6610023900349142034.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165962729225.3357250.14350728846471527137.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/165970659095.2812394.6868894171102318796.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166133579016.3678898.6283195019480567275.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/217595.1662033775@warthog.procyon.org.uk/ # v5
Fixes: 9bc61ab18b ("vfs: Introduce fs_context, switch vfs_kern_mount() to it.")
Fixes: 779df6a548 ("NFS: Ensure security label is set for root inode")
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: "Christian Brauner (Microsoft)" <brauner@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230808-master-v9-1-e0ecde888221@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-15 08:32:30 +02:00
GONG, Ruiqi 254a8ed6aa tomoyo: remove unused function declaration
The last usage of tomoyo_check_flags() has been removed by commit
57c2590fb7 ("TOMOYO: Update profile structure."). Clean up its
residual declaration.

Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2023-08-13 22:07:15 +09:00
Jakub Kicinski 4d016ae42e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/intel/igc/igc_main.c
  06b412589e ("igc: Add lock to safeguard global Qbv variables")
  d3750076d4 ("igc: Add TransmissionOverrun counter")

drivers/net/ethernet/microsoft/mana/mana_en.c
  a7dfeda6fd ("net: mana: Fix MANA VF unload when hardware is unresponsive")
  a9ca9f9cef ("page_pool: split types and declarations from page_pool.h")
  92272ec410 ("eth: add missing xdp.h includes in drivers")

net/mptcp/protocol.h
  511b90e392 ("mptcp: fix disconnect vs accept race")
  b8dc6d6ce9 ("mptcp: fix rcv buffer auto-tuning")

tools/testing/selftests/net/mptcp/mptcp_join.sh
  c8c101ae39 ("selftests: mptcp: join: fix 'implicit EP' test")
  03668c65d1 ("selftests: mptcp: join: rework detailed report")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10 14:10:53 -07:00
Christian Göttsche e49be9bc7c selinux: use unsigned iterator in nlmsgtab code
Use an unsigned type as loop iterator.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:49 -04:00
Christian Göttsche dee1537548 selinux: avoid implicit conversions in policydb code
Use the identical type for local variables, e.g. loop counters.

Declare members of struct policydb_compat_info unsigned to consistently
use unsigned iterators.  They hold read-only non-negative numbers in the
global variable policydb_compat.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:49 -04:00
Christian Göttsche 97842c56b8 selinux: avoid implicit conversions in selinuxfs code
Use umode_t as parameter type for sel_make_inode(), which assigns the
value to the member i_mode of struct inode.

Use identical and unsigned types for loop iterators.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:48 -04:00
Christian Göttsche aa4b605182 selinux: make left shifts well defined
The loops upper bound represent the number of permissions used (for the
current class or in general).  The limit for this is 32, thus we might
left shift of one less, 31.  Shifting a base of 1 results in undefined
behavior; use (u32)1 as base.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:48 -04:00
Christian Göttsche 002903e1d1 selinux: update type for number of class permissions in services code
Security classes have only up to 32 permissions, hence using an u16 is
sufficient (while improving padding in struct selinux_mapping).

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:48 -04:00
Christian Göttsche df9d474925 selinux: avoid implicit conversions in avtab code
Return u32 from avtab_hash() instead of int, since the hashing is done
on u32 and the result is used as an index on the hash array.

Use the type of the limit in for loops.

Avoid signed to unsigned conversion of multiplication result in
avtab_hash_eval() and perform multiplication in destination type.

Use unsigned loop iterator for index operations, to avoid sign
extension.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 19:07:47 -04:00
Paul Moore 817199e006 selinux: revert SECINITSID_INIT support
This commit reverts 5b0eea835d ("selinux: introduce an initial SID
for early boot processes") as it was found to cause problems on
distros with old SELinux userspace tools/libraries, specifically
Ubuntu 16.04.

Hopefully we will be able to re-add this functionality at a later
date, but let's revert this for now to help ensure a stable and
backwards compatible SELinux tree.

Link: https://lore.kernel.org/selinux/87edkseqf8.fsf@mail.lhotse
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-09 10:51:13 -04:00
Khadija Kamran 6672efbb68 lsm: constify the 'target' parameter in security_capget()
Three LSMs register the implementations for the "capget" hook: AppArmor,
SELinux, and the normal capability code. Looking at the function
implementations we may observe that the first parameter "target" is not
changing.

Mark the first argument "target" of LSM hook security_capget() as
"const" since it will not be changing in the LSM hook.

cap_capget() LSM hook declaration exceeds the 80 characters per line
limit. Split the function declaration to multiple lines to decrease the
line length.

Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com>
Acked-by: John Johansen <john.johansen@canonical.com>
[PM: align the cap_capget() declaration, spelling fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08 16:48:47 -04:00
GONG, Ruiqi efea5b0dcc apparmor: remove unused PROF_* macros
The last usage of PROF_{ADD,REPLACE} were removed by commit 18e99f191a
("apparmor: provide finer control over policy management"). So remove
these two unused macros.

Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-08-08 13:24:48 -07:00
Xiu Jianfeng 980a580868 apparmor: cleanup unused functions in file.h
After changes in commit 33bf60cabc ("LSM: Infrastructure management of
the file security"), aa_alloc_file_ctx() and aa_free_file_ctx() are no
longer used, so remove them, and also remove aa_get_file_label() because
it seems that it's never been used before.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-08-08 13:16:13 -07:00
Xiu Jianfeng 9a0dbdbff0 apparmor: cleanup unused declarations in policy.h
The implementions of these declarations do not exist, remove them all.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-08-08 13:15:39 -07:00
John Johansen d2fe16e94c apparmor: fixup return comments for kernel doc cleanups by Gaosheng Cui
[PATCH -next 05/11] apparmor: Fix kernel-doc warnings in apparmor/label.c
missed updating the Returns comment for the new parameter names

[PATCH -next 05/11] apparmor: Fix kernel-doc warnings in apparmor/label.c
Added the @size parameter comment without mentioning it is a return
value.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-08-08 13:12:19 -07:00
Christian Göttsche 2b86e04bce selinux: use GFP_KERNEL while reading binary policy
Use GFP_KERNEL instead of GFP_ATOMIC while reading a binary policy in
sens_read() and cat_read(), similar to surrounding code.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08 13:40:53 -04:00
Xiu Jianfeng 64f18f8a8c selinux: update comment on selinux_hooks[]
After commit f22f9aaf6c ("selinux: remove the runtime disable
functionality"), the comment on selinux_hooks[] is out-of-date,
remove the last paragraph about runtime disable functionality.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08 13:28:42 -04:00
Dan Carpenter 3ad49d37cf smackfs: Prevent underflow in smk_set_cipso()
There is a upper bound to "catlen" but no lower bound to prevent
negatives.  I don't see that this necessarily causes a problem but we
may as well be safe.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-08-07 14:09:23 -07:00
Tóth János c47b658400 security: smack: smackfs: fix typo (lables->labels)
Fix a spelling error in smakcfs.

Signed-off-by: Tóth János <gomba007@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-08-07 14:09:23 -07:00
Tom Rix 0de030b308 sysctl: set variable key_sysctls storage-class-specifier to static
smatch reports
security/keys/sysctl.c:12:18: warning: symbol
  'key_sysctls' was not declared. Should it be static?

This variable is only used in its defining file, so it should be static.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07 17:55:54 +00:00
Wenyu Liu 55e2b69649 kexec_lock: Replace kexec_mutex() by kexec_lock() in two comments
kexec_mutex is replaced by an atomic variable
in 05c6257433 (panic, kexec: make __crash_kexec() NMI safe).

But there are still two comments that referenced kexec_mutex,
replace them by kexec_lock.

Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-08-07 09:55:42 -04:00
Justin Stitt 7b9ef666f2 tomoyo: refactor deprecated strncpy
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on its destination buffer argument which is
_not_ the case for `strncpy`!

It should be noted that the destination buffer is zero-initialized and
had a max length of `sizeof(dest) - 1`. There is likely _not_ a bug
present in the current implementation. However, by switching to
`strscpy` we get the benefit of no longer needing the `- 1`'s from the
string copy invocations on top of `strscpy` being a safer interface all
together.

[1]: www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
[2]: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html

Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2023-08-05 19:55:10 +09:00
Christian Göttsche c50e125d05 selinux: avoid implicit conversions in services code
Use u32 as the output parameter type in security_get_classes() and
security_get_permissions(), based on the type of the symtab nprim
member.

Declare the read-only class string parameter of
security_get_permissions() const.

Avoid several implicit conversions by using the identical type for the
destination.

Use the type identical to the source for local variables.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: cleanup extra whitespace in subject]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-03 22:19:57 -04:00
Christian Göttsche fd5a90ff1e selinux: avoid implicit conversions in mls code
Use u32 for ebitmap bits and sensitivity levels, char for the default
range of a class.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: description tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-03 22:19:57 -04:00
Christian Göttsche c17c55c2d1 selinux: use identical iterator type in hashtab_duplicate()
Use the identical type u32 for the loop iterator.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: remove extra whitespace in subject]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-03 22:19:56 -04:00
Jakub Kicinski 35b1b1fd96 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/dsa/port.c
  9945c1fb03 ("net: dsa: fix older DSA drivers using phylink")
  a88dd75384 ("net: dsa: remove legacy_pre_march2020 detection")
https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/

net/xdp/xsk.c
  3c5b4d69c3 ("net: annotate data-races around sk->sk_mark")
  b7f72a30e9 ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path")
https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  37b61cda9c ("bnxt: don't handle XDP in netpoll")
  2b56b3d992 ("eth: bnxt: handle invalid Tx completions more gracefully")
https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
  62da08331f ("net/mlx5e: Set proper IPsec source port in L4 selector")
  fbd517549c ("net/mlx5e: Add function to get IPsec offload namespace")

drivers/net/ethernet/sfc/selftest.c
  55c1528f9b ("sfc: fix field-spanning memcpy in selftest")
  ae9d445cd4 ("sfc: Miscellaneous comment removals")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03 14:34:37 -07:00
Coiby Xu 56dc986a6b ima: require signed IMA policy when UEFI secure boot is enabled
With commit 099f26f22f ("integrity: machine keyring CA
configuration"), users are able to add custom IMA CA keys via
MOK.  This allows users to sign their own IMA polices without
recompiling the kernel. For the sake of security, mandate signed IMA
policy when UEFI secure boot is enabled.

Note this change may affect existing users/tests i.e users won't be able
to load an unsigned IMA policy when the IMA architecture specific policy
is configured and UEFI secure boot is enabled.

Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-08-01 08:18:11 -04:00
Eric Snowberg f20765fdfd integrity: Always reference the blacklist keyring with appraisal
Commit 273df864cf ("ima: Check against blacklisted hashes for files with
modsig") introduced an appraise_flag option for referencing the blacklist
keyring.  Any matching binary found on this keyring fails signature
validation. This flag only works with module appended signatures.

An important part of a PKI infrastructure is to have the ability to do
revocation at a later time should a vulnerability be found.  Expand the
revocation flag usage to all appraisal functions. The flag is now
enabled by default. Setting the flag with an IMA policy has been
deprecated. Without a revocation capability like this in place, only
authenticity can be maintained. With this change, integrity can now be
achieved with digital signature based IMA appraisal.

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-08-01 08:17:25 -04:00
Nayna Jain 5087fd9e80 ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
Time to remove "IMA_TRUSTED_KEYRING".

Fixes: f4dc37785e ("integrity: define '.evm' as a builtin 'trusted' keyring") # v4.5+
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-08-01 08:16:24 -04:00
Khadija Kamran bd1f5934e4 lsm: add comment block for security_sk_classify_flow LSM hook
security_sk_classify_flow LSM hook has no comment block. Add a comment
block with a brief description of LSM hook and its function parameters.

Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com>
[PM: minor double-space fix]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-31 16:07:40 -04:00
Christian Göttsche f01dd59045 selinux: move debug functions into debug configuration
avtab_hash_eval() and hashtab_stat() are only used in policydb.c when
the configuration SECURITY_SELINUX_DEBUG is enabled.

Move the function definitions under that configuration as well and
provide empty definitions in case SECURITY_SELINUX_DEBUG is disabled, to
avoid using #ifdef in the callers.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-28 14:09:24 -04:00
Christian Göttsche 2d7f105edb security: keys: perform capable check only on privileged operations
If the current task fails the check for the queried capability via
`capable(CAP_SYS_ADMIN)` LSMs like SELinux generate a denial message.
Issuing such denial messages unnecessarily can lead to a policy author
granting more privileges to a subject than needed to silence them.

Reorder CAP_SYS_ADMIN checks after the check whether the operation is
actually privileged.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-28 18:07:41 +00:00
Christian Göttsche 19c5b015d1 selinux: log about VM being executable by default
In case virtual memory is being marked as executable by default, SELinux
checks regarding explicit potential dangerous use are disabled.

Inform the user about it.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-28 14:04:14 -04:00
Roberto Sassu faf302f5a2 security: Fix ret values doc for security_inode_init_security()
Commit 6bcdfd2cac ("security: Allow all LSMs to provide xattrs for
inode_init_security hook") unified the !initxattrs and initxattrs cases. By
doing that, security_inode_init_security() cannot return -EOPNOTSUPP
anymore, as it is always replaced with zero at the end of the function.

Also, mentioning -ENOMEM as the only possible error is not correct. For
example, evm_inode_init_security() could return -ENOKEY.

Fix these issues in the documentation of security_inode_init_security().

Fixes: 6bcdfd2cac ("security: Allow all LSMs to provide xattrs for inode_init_security hook")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-26 17:07:39 -04:00
Jeff Layton 4c1698d303 selinux: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.

Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-89-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-24 10:30:08 +02:00
Jeff Layton 428c33f285 security: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.

Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-88-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-24 10:30:08 +02:00
Jeff Layton 6ac5422617 apparmor: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-87-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-24 10:30:08 +02:00
Christian Göttsche a959dbd98d tomoyo: add format attributes to functions
Format attributes on functions taking format string can help compilers
detect argument type or count mismatches.

Please the compiler when building with W=1:

    security/tomoyo/audit.c: In function ‘tomoyo_init_log’:
    security/tomoyo/audit.c:290:9: error: function ‘tomoyo_init_log’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format]
      290 |         vsnprintf(buf + pos, len - pos, fmt, args);
          |         ^~~~~~~~~
    security/tomoyo/audit.c: In function ‘tomoyo_write_log2’:
    security/tomoyo/audit.c:376:9: error: function ‘tomoyo_write_log2’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format]
      376 |         buf = tomoyo_init_log(r, len, fmt, args);
          |         ^~~
    security/tomoyo/common.c: In function ‘tomoyo_addprintf’:
    security/tomoyo/common.c:193:9: error: function ‘tomoyo_addprintf’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format]
      193 |         vsnprintf(buffer + pos, len - pos - 1, fmt, args);
          |         ^~~~~~~~~

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2023-07-23 21:25:28 +09:00
Jakub Kicinski 59be3baa8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20 15:52:55 -07:00
Paul Moore 3876043ad9 selinux: fix a 0/NULL mistmatch in ad_net_init_from_iif()
Use a NULL instead of a zero to resolve a int/pointer mismatch.

Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307210332.4AqFZfzI-lkp@intel.com/
Fixes: dd51fcd42f ("selinux: introduce and use lsm_ad_net_init*() helpers")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-20 16:29:47 -04:00
Christian Göttsche 55a0e73806 selinux: introduce SECURITY_SELINUX_DEBUG configuration
The policy database code contains several debug output statements
related to hashtable utilization.  Those are guarded by the macro
DEBUG_HASHES, which is neither documented nor set anywhere.

Introduce a new Kconfig configuration guarding this and potential
other future debugging related code.  Disable the setting by default.

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: fixed line lengths in the help text]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-20 16:21:52 -04:00
Paolo Abeni dd51fcd42f selinux: introduce and use lsm_ad_net_init*() helpers
Perf traces of network-related workload shows a measurable overhead
inside the network-related selinux hooks while zeroing the
lsm_network_audit struct.

In most cases we can delay the initialization of such structure to the
usage point, avoiding such overhead in a few cases.

Additionally, the audit code accesses the IP address information only
for AF_INET* families, and selinux_parse_skb() will fill-out the
relevant fields in such cases. When the family field is zeroed or the
initialization is followed by the mentioned parsing, the zeroing can be
limited to the sk, family and netif fields.

By factoring out the audit-data initialization to new helpers, this
patch removes some duplicate code and gives small but measurable
performance gain under UDP flood.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19 16:10:05 -04:00
Stephen Smalley 0fe53224bf selinux: update my email address
Update my email address; MAINTAINERS was updated some time ago.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19 11:27:02 -04:00
Christian Göttsche e5faa839c3 selinux: add missing newlines in pr_err() statements
The kernel print statements do not append an implicit newline to format
strings.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19 11:12:48 -04:00
Christian Göttsche 08a12b39e2 selinux: drop avtab_search()
avtab_search() shares the same logic with avtab_search_node(), except
that it returns, if found, a pointer to the struct avtab_node member
datum instead of the node itself.  Since the member is an embedded
struct, and not a pointer, the returned value of avtab_search() and
avtab_search_node() will always in unison either be NULL or non-NULL.

Drop avtab_search() and replace its calls by avtab_search_node() to
deduplicate logic and adopt the only caller caring for the type of
the returned value accordingly.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19 11:04:28 -04:00
Stephen Smalley 90aa4f5e92 selinux: de-brand SELinux
Change "NSA SELinux" to just "SELinux" in Kconfig help text and
comments. While NSA was the original primary developer and continues to
help maintain SELinux, SELinux has long since transitioned to a wide
community of developers and maintainers. SELinux has been part of the
mainline Linux kernel for nearly 20 years now [1] and has received
contributions from many individuals and organizations.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0308082228470.1852-100000@home.osdl.org/

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:42:57 -04:00
Christian Göttsche c867248cf4 selinux: avoid implicit conversions regarding enforcing status
Use the type bool as parameter type in
selinux_status_update_setenforce().  The related function
enforcing_enabled() returns the type bool, while the struct
selinux_kernel_status member enforcing uses an u32.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:50 -04:00
Christian Göttsche 0e83c9c6fb selinux: fix implicit conversions in the symtab
hashtab_init() takes an u32 as size parameter type.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:49 -04:00
Christian Göttsche 7128578c79 selinux: use consistent type for AV rule specifier
The specifier for avtab keys is always supplied with a type of u16,
either as a macro to security_compute_sid() or the member specified of
the struct avtab_key.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:49 -04:00
Christian Göttsche a13479bb3c selinux: avoid implicit conversions in the LSM hooks
Use the identical types in assignments of local variables for the
destination.

Merge tail calls into return statements.

Avoid using leading underscores for function local variable.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:48 -04:00
Christian Göttsche 5f740953ab selinux: avoid implicit conversions in the AVC code
Use a consistent type of u32 for sequence numbers.

Use a non-negative and input parameter matching type for the hash
result.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:48 -04:00
Christian Göttsche 777ea29c57 selinux: avoid implicit conversions in the netif code
Use the identical type sel_netif_hashfn() returns.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:48 -04:00
Christian Göttsche 1f270f1c34 selinux: consistently use u32 as sequence number type in the status code
Align the type with the one used in selinux_notify_policy_change() and
the sequence member of struct selinux_kernel_status.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:47 -04:00
Christian Göttsche f785c54101 selinux: avoid avtab overflows
Prevent inserting more than the supported U32_MAX number of entries.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:47 -04:00
Christian Göttsche bbea03f474 selinux: check for multiplication overflow in put_entry()
The function is always inlined and most of the time both relevant
arguments are compile time constants, allowing compilers to elide the
check.  Also the function is part of outputting the policy, which is not
performance critical.

Also convert the type of the third parameter into a size_t, since it
should always be a non-negative number of elements.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18 18:29:46 -04:00
Jiapeng Chong 2a41527420 security: keys: Modify mismatched function name
No functional modification involved.

security/keys/trusted-keys/trusted_tpm2.c:203: warning: expecting prototype for tpm_buf_append_auth(). Prototype was for tpm2_buf_append_auth() instead.

Fixes: 2e19e10131 ("KEYS: trusted: Move TPM2 trusted keys code")
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5524
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17 19:40:27 +00:00
Petr Pavlu d55901522f keys: Fix linking a duplicate key to a keyring's assoc_array
When making a DNS query inside the kernel using dns_query(), the request
code can in rare cases end up creating a duplicate index key in the
assoc_array of the destination keyring. It is eventually found by
a BUG_ON() check in the assoc_array implementation and results in
a crash.

Example report:
[2158499.700025] kernel BUG at ../lib/assoc_array.c:652!
[2158499.700039] invalid opcode: 0000 [#1] SMP PTI
[2158499.700065] CPU: 3 PID: 31985 Comm: kworker/3:1 Kdump: loaded Not tainted 5.3.18-150300.59.90-default #1 SLE15-SP3
[2158499.700096] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[2158499.700351] Workqueue: cifsiod cifs_resolve_server [cifs]
[2158499.700380] RIP: 0010:assoc_array_insert+0x85f/0xa40
[2158499.700401] Code: ff 74 2b 48 8b 3b 49 8b 45 18 4c 89 e6 48 83 e7 fe e8 95 ec 74 00 3b 45 88 7d db 85 c0 79 d4 0f 0b 0f 0b 0f 0b e8 41 f2 be ff <0f> 0b 0f 0b 81 7d 88 ff ff ff 7f 4c 89 eb 4c 8b ad 58 ff ff ff 0f
[2158499.700448] RSP: 0018:ffffc0bd6187faf0 EFLAGS: 00010282
[2158499.700470] RAX: ffff9f1ea7da2fe8 RBX: ffff9f1ea7da2fc1 RCX: 0000000000000005
[2158499.700492] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
[2158499.700515] RBP: ffffc0bd6187fbb0 R08: ffff9f185faf1100 R09: 0000000000000000
[2158499.700538] R10: ffff9f1ea7da2cc0 R11: 000000005ed8cec8 R12: ffffc0bd6187fc28
[2158499.700561] R13: ffff9f15feb8d000 R14: ffff9f1ea7da2fc0 R15: ffff9f168dc0d740
[2158499.700585] FS:  0000000000000000(0000) GS:ffff9f185fac0000(0000) knlGS:0000000000000000
[2158499.700610] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2158499.700630] CR2: 00007fdd94fca238 CR3: 0000000809d8c006 CR4: 00000000003706e0
[2158499.700702] Call Trace:
[2158499.700741]  ? key_alloc+0x447/0x4b0
[2158499.700768]  ? __key_link_begin+0x43/0xa0
[2158499.700790]  __key_link_begin+0x43/0xa0
[2158499.700814]  request_key_and_link+0x2c7/0x730
[2158499.700847]  ? dns_resolver_read+0x20/0x20 [dns_resolver]
[2158499.700873]  ? key_default_cmp+0x20/0x20
[2158499.700898]  request_key_tag+0x43/0xa0
[2158499.700926]  dns_query+0x114/0x2ca [dns_resolver]
[2158499.701127]  dns_resolve_server_name_to_ip+0x194/0x310 [cifs]
[2158499.701164]  ? scnprintf+0x49/0x90
[2158499.701190]  ? __switch_to_asm+0x40/0x70
[2158499.701211]  ? __switch_to_asm+0x34/0x70
[2158499.701405]  reconn_set_ipaddr_from_hostname+0x81/0x2a0 [cifs]
[2158499.701603]  cifs_resolve_server+0x4b/0xd0 [cifs]
[2158499.701632]  process_one_work+0x1f8/0x3e0
[2158499.701658]  worker_thread+0x2d/0x3f0
[2158499.701682]  ? process_one_work+0x3e0/0x3e0
[2158499.701703]  kthread+0x10d/0x130
[2158499.701723]  ? kthread_park+0xb0/0xb0
[2158499.701746]  ret_from_fork+0x1f/0x40

The situation occurs as follows:
* Some kernel facility invokes dns_query() to resolve a hostname, for
  example, "abcdef". The function registers its global DNS resolver
  cache as current->cred.thread_keyring and passes the query to
  request_key_net() -> request_key_tag() -> request_key_and_link().
* Function request_key_and_link() creates a keyring_search_context
  object. Its match_data.cmp method gets set via a call to
  type->match_preparse() (resolves to dns_resolver_match_preparse()) to
  dns_resolver_cmp().
* Function request_key_and_link() continues and invokes
  search_process_keyrings_rcu() which returns that a given key was not
  found. The control is then passed to request_key_and_link() ->
  construct_alloc_key().
* Concurrently to that, a second task similarly makes a DNS query for
  "abcdef." and its result gets inserted into the DNS resolver cache.
* Back on the first task, function construct_alloc_key() first runs
  __key_link_begin() to determine an assoc_array_edit operation to
  insert a new key. Index keys in the array are compared exactly as-is,
  using keyring_compare_object(). The operation finds that "abcdef" is
  not yet present in the destination keyring.
* Function construct_alloc_key() continues and checks if a given key is
  already present on some keyring by again calling
  search_process_keyrings_rcu(). This search is done using
  dns_resolver_cmp() and "abcdef" gets matched with now present key
  "abcdef.".
* The found key is linked on the destination keyring by calling
  __key_link() and using the previously calculated assoc_array_edit
  operation. This inserts the "abcdef." key in the array but creates
  a duplicity because the same index key is already present.

Fix the problem by postponing __key_link_begin() in
construct_alloc_key() until an actual key which should be linked into
the destination keyring is determined.

[jarkko@kernel.org: added a fixes tag and cc to stable]
Cc: stable@vger.kernel.org # v5.3+
Fixes: df593ee23e ("keys: Hoist locking out of __key_link_begin()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Joey Lee <jlee@suse.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17 19:32:30 +00:00
Guillaume Nault 5b52ad34f9 security: Constify sk in the sk_getsecid hook.
The sk_getsecid hook shouldn't need to modify its socket argument.
Make it const so that callers of security_sk_classify_flow() can use a
const struct sock *.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-14 08:27:33 +01:00
Ondrej Mosnacek 5b0eea835d selinux: introduce an initial SID for early boot processes
Currently, SELinux doesn't allow distinguishing between kernel threads
and userspace processes that are started before the policy is first
loaded - both get the label corresponding to the kernel SID. The only
way a process that persists from early boot can get a meaningful label
is by doing a voluntary dyntransition or re-executing itself.

Reusing the kernel label for userspace processes is problematic for
several reasons:
1. The kernel is considered to be a privileged domain and generally
   needs to have a wide range of permissions allowed to work correctly,
   which prevents the policy writer from effectively hardening against
   early boot processes that might remain running unintentionally after
   the policy is loaded (they represent a potential extra attack surface
   that should be mitigated).
2. Despite the kernel being treated as a privileged domain, the policy
   writer may want to impose certain special limitations on kernel
   threads that may conflict with the requirements of intentional early
   boot processes. For example, it is a good hardening practice to limit
   what executables the kernel can execute as usermode helpers and to
   confine the resulting usermode helper processes. However, a
   (legitimate) process surviving from early boot may need to execute a
   different set of executables.
3. As currently implemented, overlayfs remembers the security context of
   the process that created an overlayfs mount and uses it to bound
   subsequent operations on files using this context. If an overlayfs
   mount is created before the SELinux policy is loaded, these "mounter"
   checks are made against the kernel context, which may clash with
   restrictions on the kernel domain (see 2.).

To resolve this, introduce a new initial SID (reusing the slot of the
former "init" initial SID) that will be assigned to any userspace
process started before the policy is first loaded. This is easy to do,
as we can simply label any process that goes through the
bprm_creds_for_exec LSM hook with the new init-SID instead of
propagating the kernel SID from the parent.

To provide backwards compatibility for existing policies that are
unaware of this new semantic of the "init" initial SID, introduce a new
policy capability "userspace_initial_context" and set the "init" SID to
the same context as the "kernel" SID unless this capability is set by
the policy.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 14:23:56 -04:00
Paul Moore d91c1ab470 selinux: cleanup the policycap accessor functions
In the process of reverting back to directly accessing the global
selinux_state pointer we left behind some artifacts in the
selinux_policycap_XXX() helper functions.  This patch cleans up
some of that left-behind cruft.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 14:23:56 -04:00
Roberto Sassu c31288e56c evm: Support multiple LSMs providing an xattr
Currently, evm_inode_init_security() processes a single LSM xattr from the
array passed by security_inode_init_security(), and calculates the HMAC on
it and other inode metadata.

As the LSM infrastructure now can pass to EVM an array with multiple
xattrs, scan them until the terminator (xattr name NULL), and calculate the
HMAC on all of them.

Also, double check that the xattrs array terminator is the first non-filled
slot (obtained with lsm_get_xattr_slot()). Consumers of the xattrs array,
such as the initxattrs() callbacks, rely on the terminator.

Finally, change the name of the lsm_xattr parameter of evm_init_hmac() to
xattrs, to reflect the new type of information passed.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:39 -04:00
Roberto Sassu 6db7d1dee8 evm: Align evm_inode_init_security() definition with LSM infrastructure
Change the evm_inode_init_security() definition to align with the LSM
infrastructure. Keep the existing behavior of including in the HMAC
calculation only the first xattr provided by LSMs.

Changing the evm_inode_init_security() definition requires passing the
xattr array allocated by security_inode_init_security(), and the number of
xattrs filled by previously invoked LSMs.

Use the newly introduced lsm_get_xattr_slot() to position EVM correctly in
the xattrs array, like a regular LSM, and to increment the number of filled
slots. For now, the LSM infrastructure allocates enough xattrs slots to
store the EVM xattr, without using the reservation mechanism.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:38 -04:00
Roberto Sassu baed456a6a smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security()
With the newly added ability of LSMs to supply multiple xattrs, set
SMACK64TRASMUTE in smack_inode_init_security(), instead of d_instantiate().
Do it by incrementing SMACK_INODE_INIT_XATTRS to 2 and by calling
lsm_get_xattr_slot() a second time, if the transmuting conditions are met.

The LSM infrastructure passes all xattrs provided by LSMs to the
filesystems through the initxattrs() callback, so that filesystems can
store xattrs in the disk.

After the change, the SMK_INODE_TRANSMUTE inode flag is always set by
d_instantiate() after fetching SMACK64TRANSMUTE from the disk. Before it
was done by smack_inode_post_setxattr() as result of the __vfs_setxattr()
call.

Removing __vfs_setxattr() also prevents invalidating the EVM HMAC, by
adding a new xattr without checking and updating the existing HMAC.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:38 -04:00
Roberto Sassu 6bcdfd2cac security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.

Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.

Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).

Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.

Cleanup security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.

Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.

Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.

Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.

Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:37 -04:00
Pairman Guo ff72942caa lsm: fix typo in security_file_lock() comment header
In the description of function definition security_file_lock(),
the line "@cmd: fnctl command" has a typo where "fnctl" should be
"fcntl". This patch fixes the typo.

Signed-off-by: Pairman Guo <pairmanxlr@gmail.com>
[PM: commit message cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:37 -04:00
Gaosheng Cui 25ff0ff2d6 apparmor: Fix kernel-doc warnings in apparmor/policy.c
Fix kernel-doc warnings:

security/apparmor/policy.c:294: warning: Function parameter or
member 'proxy' not described in 'aa_alloc_profile'
security/apparmor/policy.c:785: warning: Function parameter or
member 'label' not described in 'aa_policy_view_capable'
security/apparmor/policy.c:785: warning: Function parameter or
member 'ns' not described in 'aa_policy_view_capable'
security/apparmor/policy.c:847: warning: Function parameter or
member 'ns' not described in 'aa_may_manage_policy'
security/apparmor/policy.c:964: warning: Function parameter or
member 'hname' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'info' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'noreplace' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'ns' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'p' not described in '__lookup_replace'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:16:28 -07:00
Gaosheng Cui 2520d61c50 apparmor: Fix kernel-doc warnings in apparmor/policy_compat.c
Fix kernel-doc warnings:

security/apparmor/policy_compat.c:151: warning: Function parameter
or member 'size' not described in 'compute_fperms'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:16:05 -07:00
Gaosheng Cui f8fce898e5 apparmor: Fix kernel-doc warnings in apparmor/policy_unpack.c
Fix kernel-doc warnings:

security/apparmor/policy_unpack.c:1173: warning: Function parameter
or member 'table_size' not described in 'verify_dfa_accept_index'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:15:41 -07:00
Gaosheng Cui 13c1748e21 apparmor: Fix kernel-doc warnings in apparmor/resource.c
Fix kernel-doc warnings:

security/apparmor/resource.c:111: warning: Function parameter or
member 'label' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'new_rlim' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'resource' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'task' not described in 'aa_task_setrlimit'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:15:17 -07:00
Gaosheng Cui 7abbbd573c apparmor: Fix kernel-doc warnings in apparmor/match.c
Fix kernel-doc warnings:

security/apparmor/match.c:148: warning: Function parameter or member
'tables' not described in 'verify_table_headers'
security/apparmor/match.c:289: warning: Excess function parameter
'kr' description in 'aa_dfa_free_kref'
security/apparmor/match.c:289: warning: Function parameter or member
'kref' not described in 'aa_dfa_free_kref'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:14:51 -07:00
Gaosheng Cui 8921482286 apparmor: Fix kernel-doc warnings in apparmor/lib.c
Fix kernel-doc warnings:

security/apparmor/lib.c:33: warning: Excess function parameter
'str' description in 'aa_free_str_table'
security/apparmor/lib.c:33: warning: Function parameter or member
't' not described in 'aa_free_str_table'
security/apparmor/lib.c:94: warning: Function parameter or
member 'n' not described in 'skipn_spaces'
security/apparmor/lib.c:390: warning: Excess function parameter
'deny' description in 'aa_check_perms'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:13:52 -07:00
Gaosheng Cui e18573dd2b apparmor: Fix kernel-doc warnings in apparmor/label.c
Fix kernel-doc warnings:

security/apparmor/label.c:166: warning: Excess function parameter
'n' description in 'vec_cmp'
security/apparmor/label.c:166: warning: Excess function parameter
'vec' description in 'vec_cmp'
security/apparmor/label.c:166: warning: Function parameter or member
'an' not described in 'vec_cmp'
security/apparmor/label.c:166: warning: Function parameter or member
'bn' not described in 'vec_cmp'
security/apparmor/label.c:166: warning: Function parameter or member
'b' not described in 'vec_cmp'
security/apparmor/label.c:2051: warning: Function parameter or member
'label' not described in '__label_update'
security/apparmor/label.c:266: warning: Function parameter or member
'flags' not described in 'aa_vec_unique'
security/apparmor/label.c:594: warning: Excess function parameter
'l' description in '__label_remove'
security/apparmor/label.c:594: warning: Function parameter or member
'label' not described in '__label_remove'
security/apparmor/label.c:929: warning: Function parameter or member
'label' not described in 'aa_label_insert'
security/apparmor/label.c:929: warning: Function parameter or member
'ls' not described in 'aa_label_insert'
security/apparmor/label.c:1221: warning: Excess function parameter
'ls' description in 'aa_label_merge'
security/apparmor/label.c:1302: warning: Excess function parameter
'start' description in 'label_compound_match'
security/apparmor/label.c:1302: warning: Function parameter or member
'rules' not described in 'label_compound_match'
security/apparmor/label.c:1302: warning: Function parameter or member
'state' not described in 'label_compound_match'
security/apparmor/label.c:2051: warning: Function parameter or member
'label' not described in '__label_update'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:08:38 -07:00
Gaosheng Cui 3175df8032 apparmor: Fix kernel-doc warnings in apparmor/file.c
Fix kernel-doc warnings:

security/apparmor/file.c:177: warning: Excess function parameter
'dfa' description in 'aa_lookup_fperms'
security/apparmor/file.c:177: warning: Function parameter or member
'file_rules' not described in 'aa_lookup_fperms'
security/apparmor/file.c:202: warning: Excess function parameter
'dfa' description in 'aa_str_perms'
security/apparmor/file.c:202: warning: Excess function parameter
'state' description in 'aa_str_perms'
security/apparmor/file.c:202: warning: Function parameter or member
'file_rules' not described in 'aa_str_perms'
security/apparmor/file.c:202: warning: Function parameter or member
'start' not described in 'aa_str_perms'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:07:54 -07:00
Gaosheng Cui 76426c9d92 apparmor: Fix kernel-doc warnings in apparmor/domain.c
Fix kernel-doc warnings:

security/apparmor/domain.c:279: warning: Function parameter or
member 'perms' not described in 'change_profile_perms'
security/apparmor/domain.c:380: warning: Function parameter or
member 'bprm' not described in 'find_attach'
security/apparmor/domain.c:380: warning: Function parameter or
member 'head' not described in 'find_attach'
security/apparmor/domain.c:380: warning: Function parameter or
member 'info' not described in 'find_attach'
security/apparmor/domain.c:380: warning: Function parameter or
member 'name' not described in 'find_attach'
security/apparmor/domain.c:558: warning: Function parameter or
member 'info' not described in 'x_to_label'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:06:04 -07:00
Gaosheng Cui c98c8972fe apparmor: Fix kernel-doc warnings in apparmor/capability.c
Fix kernel-doc warnings:

security/apparmor/capability.c:45: warning: Function parameter
or member 'ab' not described in 'audit_cb'
security/apparmor/capability.c:45: warning: Function parameter
or member 'va' not described in 'audit_cb'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:05:41 -07:00
Gaosheng Cui 26c9ecb34f apparmor: Fix kernel-doc warnings in apparmor/audit.c
Fix kernel-doc warnings:

security/apparmor/audit.c:150: warning: Function parameter or
member 'type' not described in 'aa_audit_msg'

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-10 01:05:25 -07:00
Jeff Layton 46fc6b35a6 apparmor: update ctime whenever the mtime changes on an inode
In general, when updating the mtime on an inode, one must also update
the ctime. Add the missing ctime updates.

Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230705190309.579783-5-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-10 10:04:52 +02:00
Dan Carpenter afad53575a apparmor: use passed in gfp flags in aa_alloc_null()
These allocations should use the gfp flags from the caller instead of
GFP_KERNEL.  But from what I can see, all the callers pass in GFP_KERNEL
so this does not affect runtime.

Fixes: e31dd6e412f7 ("apparmor: fix: kzalloc perms tables for shared dfas")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-09 17:31:19 -07:00
John Johansen 180cf25799 apparmor: advertise availability of exended perms
Userspace won't load policy using extended perms unless it knows the
kernel can handle them. Advertise that extended perms are supported in
the feature set.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Jon Tourville <jontourville@me.com>
2023-07-09 17:31:11 -07:00
GONG, Ruiqi 8de4a7de19 apparmor: remove unused macro
SOCK_ctx() doesn't seem to be used anywhere in the code, so remove it.

Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-09 17:31:11 -07:00
Quanfa Fu 0897fcb1c1 apparmor: make aa_set_current_onexec return void
Change the return type to void since it always return 0, and no need
to do the checking in aa_set_current_onexec.

Signed-off-by: Quanfa Fu <quanfafu@gmail.com>
Reviewed-by: "Tyler Hicks (Microsoft)" <code@tyhicks.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-09 17:30:51 -07:00
Linus Torvalds 70806ee18a + Bug Fixes
apparmor: fix missing error check for rhashtable_insert_fast
       apparmor: add missing failure check in compute_xmatch_perms
       apparmor: fix policy_compat permission remap with extended permissions
       apparmor: fix profile verification and enable it
       apparmor: fix: kzalloc perms tables for shared dfas
       apparmor: Fix kernel-doc header for verify_dfa_accept_index
       apparmor: aa_buffer: Convert 1-element array to flexible array
       apparmor: Return directly after a failed kzalloc() in two functions
       apparmor: fix use of strcpy in policy_unpack_test
       apparmor: fix kernel-doc complaints
       AppArmor: Fix some kernel-doc comments
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAmSnBVAACgkQBS82cBjV
 w9jF3hAAp6AGXSDDie5rZsovXpwkYr4rjpt9tnN+yJBLRRjMNrmOWva85/mauq9t
 Z04U13TRPeufQziU44O9A3+2YvC7x8FOnDPsnQ00PSUmAcFNHWg1rQsmtLgn/m3z
 1/8LL8GEbd/Kl59NYyYNw+28SqpguyzB+hXyYLdbDkJ8NGaNCRYikKvVq/hDymkx
 kBw+XIifC6POKyFMOWtUDa2CIMcbr7gBx8A3sOzZimrNpoIyVCpUnve2Iyy8tda2
 CEB7xfQ7LU1+sildVCrYJ9E4ybbABsIGq9PbKYH4qezyZ3HQbsfrowU357CtwIo5
 SRNkbvMSabnuLxGX0I5Zr1O365qtxkD72bRGqhOfyP4N3N+if//99Gyp4WBH5NEP
 BEOdExWhtllt4x0WdnlbripM3YiV+pRoFfFAXxkSvxMV4wXV/pczmBb2QPG4/SzT
 E8yuqka7n9fsiGS4RUChpGb8fMK3cg2uYfSSn/QHZ0iE6fFIL1R98w6IMTSwr3gB
 K/pD6sQKNt+j4A+sAvarTME4Mgd8GPSIpy4PIoeCyV2MHP/DJPnsDXb9jhBvMXPw
 pZmWTngToZ3ozoI3yMSInfwOcCfacIGh+ahdNFq5ZKM7ssAmw68ap1eZ3BBD3Iyx
 jFcoMI4JbnUFwoaXGhrWAUNPPZQem3XQvk/qvRL/ToYtWZaZBms=
 =+Xg/
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-pr-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor updates from John Johansen:

 - fix missing error check for rhashtable_insert_fast

 - add missing failure check in compute_xmatch_perms

 - fix policy_compat permission remap with extended permissions

 - fix profile verification and enable it

 - fix kzalloc perms tables for shared dfas

 - Fix kernel-doc header for verify_dfa_accept_index

 - aa_buffer: Convert 1-element array to flexible array

 - Return directly after a failed kzalloc() in two functions

 - fix use of strcpy in policy_unpack_test

 - fix kernel-doc complaints

 - Fix some kernel-doc comments

* tag 'apparmor-pr-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix kernel-doc header for verify_dfa_accept_index
  apparmor: fix: kzalloc perms tables for shared dfas
  apparmor: fix profile verification and enable it
  apparmor: fix policy_compat permission remap with extended permissions
  apparmor: aa_buffer: Convert 1-element array to flexible array
  apparmor: add missing failure check in compute_xmatch_perms
  apparmor: fix missing error check for rhashtable_insert_fast
  apparmor: Return directly after a failed kzalloc() in two functions
  AppArmor: Fix some kernel-doc comments
  apparmor: fix use of strcpy in policy_unpack_test
  apparmor: fix kernel-doc complaints
2023-07-07 09:55:31 -07:00
John Johansen 3f069c4c64 apparmor: Fix kernel-doc header for verify_dfa_accept_index
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306141934.UKmM9bFX-lkp@intel.com/
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 11:12:10 -07:00
John Johansen ec6851ae0a apparmor: fix: kzalloc perms tables for shared dfas
Currently the permstables of the shared dfas are not shared, and need
to be allocated and copied. In the future this should be addressed
with a larger rework on dfa and pdb ref counts and structure sharing.

BugLink: http://bugs.launchpad.net/bugs/2017903
Fixes: 217af7e2f4 ("apparmor: refactor profile rules and attachments")
Cc: stable@vger.kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Jon Tourville <jontourville@me.com>
2023-07-06 11:05:58 -07:00
John Johansen 6f442d42c0 apparmor: fix profile verification and enable it
The transition table size was not being set by compat mappings
resulting in the profile verification code not being run. Unfortunately
the checks were also buggy not being correctly updated from the old
accept perms, to the new layout.

Also indicate to userspace that the kernel has the permstable verification
fixes.

BugLink: http://bugs.launchpad.net/bugs/2017903
Fixes: 670f31774a ("apparmor: verify permission table indexes")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Jon Tourville <jontourville@me.com>
2023-07-06 10:59:55 -07:00
John Johansen 0bac2002b3 apparmor: fix policy_compat permission remap with extended permissions
If the extended permission table is present we should not be attempting
to do a compat_permission remap as the compat_permissions are not
stored in the dfa accept states.

Fixes: fd1b2b95a2 ("apparmor: add the ability for policy to specify a permission table")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Jon Tourville <jontourville@me.com>
2023-07-06 10:58:49 -07:00
Kees Cook ba808cb5ed apparmor: aa_buffer: Convert 1-element array to flexible array
In the ongoing effort to convert all fake flexible arrays to proper
flexible arrays, replace aa_buffer's 1-element "buffer" member with a
flexible array.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
John Johansen 6600e9f692 apparmor: add missing failure check in compute_xmatch_perms
Add check for failure to allocate the permission table.

Fixes: caa9f579ca ("apparmor: isolate policy backwards compatibility to its own file")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
Danila Chernetsov 000518bc5a apparmor: fix missing error check for rhashtable_insert_fast
rhashtable_insert_fast() could return err value when memory allocation is
 failed. but unpack_profile() do not check values and this always returns
 success value. This patch just adds error check code.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: e025be0f26 ("apparmor: support querying extended trusted helper extra data")

Signed-off-by: Danila Chernetsov <listdansp@mail.ru>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
Markus Elfring 6d7467957e apparmor: Return directly after a failed kzalloc() in two functions
1. Return directly after a call of the function “kzalloc” failed
   at the beginning in these function implementations.

2. Omit extra initialisations (for a few local variables)
   which became unnecessary with this refactoring.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
Yang Li 755a22c743 AppArmor: Fix some kernel-doc comments
Make the description of @table to @strs in function unpack_trans_table()
to silence the warnings:

security/apparmor/policy_unpack.c:456: warning: Function parameter or member 'strs' not described in 'unpack_trans_table'
security/apparmor/policy_unpack.c:456: warning: Excess function parameter 'table' description in 'unpack_trans_table'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4332
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
Rae Moar b54aebd441 apparmor: fix use of strcpy in policy_unpack_test
Replace the use of strcpy() in build_aa_ext_struct() in
policy_unpack_test.c with strscpy().

strscpy() is the safer method to use to ensure the buffer does not
overflow. This was found by kernel test robot:
https://lore.kernel.org/all/202301040348.NbfVsXO0-lkp@intel.com/.

Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-07-06 10:58:49 -07:00
Linus Torvalds 04f2933d37 Scope-based Resource Management infrastructure
These are the first few patches in the Scope-based Resource Management
 series that introduce the infrastructure but not any conversions as of
 yet.
 
 Adding the infrastructure now allows multiple people to start using them.
 
 Of note is that Sparse will need some work since it doesn't yet
 understand this attribute and might have decl-after-stmt issues -- but I
 think that's being worked on.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEv3OU3/byMaA0LqWJdkfhpEvA5LoFAmSZehYVHHBldGVyekBp
 bmZyYWRlYWQub3JnAAoJEHZH4aRLwOS6uC0P/0pduumvCE+osm869VOroHeVqz6B
 B36oqJlD1uqIA8JWiNW8TbpUUqTvU0FyW1jU1YnUdol5Kb3XH7APQIayGj2HM+Mc
 kMyHc4/ICxBzRI7STiBtCEbHIwYAfuibCAfuq7QP4GDw4vHKvUC7BUUaUdbQSey0
 IyUsvgPqLRL1KW3tb2z7Zfz6Oo/f0A/+viTiqOKNxSctXaD7TOEUpIZ/RhIgWbIV
 as65sgzLD/87HV9LbtTyOQCxjR5lpDlTaITapMfgwC9bQ8vQwjmaX3WETIlUmufl
 8QG3wiPIDynmqjmpEeY+Cyjcu/Gbz/2XENPgHBd6g2yJSFLSyhSGcDBAZ/lFKlnB
 eIJqwIFUvpMbAskNxHooNz1zHzqHhOYcCSUITRk6PG8AS1HOD/aNij4sWHMzJSWU
 R3ZUOEf5k6FkER3eAKxEEsJ2HtIurOVfvFXwN1e0iCKvNiQ1QdP74LddcJqsrTY6
 ihXsfVbtJarM9dakuWgk3fOj79cx3BQEm82002vvNzhRRUB5Fx7v47lyB5mAcYEl
 Al9gsLg4mKkVNiRh/tcelchKPp9WOKcXFEKiqom75VNwZH4oa+KSaa4GdDUQZzNs
 73/Zj6gPzFBP1iLwqicvg5VJIzacKuH59WRRQr1IudTxYgWE67yz1OVicplD0WaR
 l0tF9LFAdBHUy8m/
 =adZ8
 -----END PGP SIGNATURE-----

Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue

Pull scope-based resource management infrastructure from Peter Zijlstra:
 "These are the first few patches in the Scope-based Resource Management
  series that introduce the infrastructure but not any conversions as of
  yet.

  Adding the infrastructure now allows multiple people to start using
  them.

  Of note is that Sparse will need some work since it doesn't yet
  understand this attribute and might have decl-after-stmt issues"

* tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue:
  kbuild: Drop -Wdeclaration-after-statement
  locking: Introduce __cleanup() based infrastructure
  apparmor: Free up __cleanup() name
  dmaengine: ioat: Free up __cleanup() name
2023-07-04 13:50:38 -07:00
Linus Torvalds d8b0bd57c2 powerpc updates for 6.5
- Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations.
 
  - Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and use
    the -mprofile-kernel option (kernel specific ftrace ABI) for big endian
    ELFv2 kernels.
 
  - Add initial Dynamic Execution Control Register (DEXCR) support, and allow
    the ROP protection instructions to be used on Power 10.
 
  - Various other small features and fixes.
 
 Thanks to: Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King,
 Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean Delvare,
 Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry, Nathan
 Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul Gortmaker, Randy
 Dunlap, Rob Herring, Rohan McLure, Russell Currey, Sachin Sant, Timothy
 Pearson, Tom Rix, Uwe Kleine-König.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmSeqQMTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKukD/sGUceX6gIc7UcjWhL1ZCVco0bsgLjT
 JrY1NenisGKjwwRd/o+2+h3ziJDoO5AsQfT72EiNLhaYJhnlb1d0vXzsvN0THc+2
 W5RrxAZUNhBy+c7gSSEJjy8+vBIwSQAliQLChHGOSejGCj94SxF5+zjUFvSX458I
 z0+ZQK+Fiw5NcpzEnBT0XPnLzap74a7TL0JcG1MLbj2QtHXhbfjIlkkPDX3kK0Gw
 xbelFy38X7KKbQsXXYSTCGqwRdJ3yqu21nEsjRuo2yT5H5rQbjCNggkMOL1DecDd
 ULGxit/z13Pt1Ad3oe+6FF17ggOiCG0F75DONASjFDthFYx6NQffkJS1n1VZauQj
 jU6LtWeD3HkgIYm6Udjq+LaSmkAmn5a+9tsElE/K+V1WG4rKyMeVmE3z/tCJG0l2
 yhLKyFs+glXN/LiWHyX0mrQIIVZdRK237X1qXJuIvAuB7Drm5duXFAHR8pdJD0dg
 H23OhoO2FvLxb9GvnzGxqjdazzattctz31wU/1RgnPxumYkJ9PlBcXn9h1uXa8/K
 rDZFJADsQhEfRCjmLG3GIaFWqZdc4Cn+ZUk4iHkjPDFDL05Fq7JYHIuwteiN6/wP
 NHRvtKdJisu583NI9RN9300JykrEqjSRbMOWlc3vuFwbRLGioXvWhWlIZ3/t58jG
 R8s+f0nKSPr+fg==
 =ssit
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations

 - Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and
   use the -mprofile-kernel option (kernel specific ftrace ABI) for big
   endian ELFv2 kernels

 - Add initial Dynamic Execution Control Register (DEXCR) support, and
   allow the ROP protection instructions to be used on Power 10

 - Various other small features and fixes

Thanks to Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King,
Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean
Delvare, Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry,
Nathan Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul
Gortmaker, Randy Dunlap, Rob Herring, Rohan McLure, Russell Currey,
Sachin Sant, Timothy Pearson, Tom Rix, and Uwe Kleine-König.

* tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (76 commits)
  powerpc: remove checks for binutils older than 2.25
  powerpc: Fail build if using recordmcount with binutils v2.37
  powerpc/iommu: TCEs are incorrectly manipulated with DLPAR add/remove of memory
  powerpc/iommu: Only build sPAPR access functions on pSeries
  powerpc: powernv: Annotate data races in opal events
  powerpc: Mark writes registering ipi to host cpu through kvm and polling
  powerpc: Annotate accesses to ipi message flags
  powerpc: powernv: Fix KCSAN datarace warnings on idle_state contention
  powerpc: Mark [h]ssr_valid accesses in check_return_regs_valid
  powerpc: qspinlock: Enforce qnode writes prior to publishing to queue
  powerpc: qspinlock: Mark accesses to qnode lock checks
  powerpc/powernv/pci: Remove last IODA1 defines
  powerpc/powernv/pci: Remove MVE code
  powerpc/powernv/pci: Remove ioda1 support
  powerpc: 52xx: Make immr_id DT match tables static
  powerpc: mpc512x: Remove open coded "ranges" parsing
  powerpc: fsl_soc: Use of_range_to_resource() for "ranges" parsing
  powerpc: fsl: Use of_property_read_reg() to parse "reg"
  powerpc: fsl_rio: Use of_range_to_resource() for "ranges" parsing
  macintosh: Use of_property_read_reg() to parse "reg"
  ...
2023-06-30 09:20:08 -07:00
Linus Torvalds 632f54b4d6 slab updates for 6.5
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmSZtjsACgkQu+CwddJF
 iJqCTwf/XVhmAD7zMOj6g1aak5oHNZDRG5jufM5UNXmiWjCWT3w4DpltrJkz0PPm
 mg3Ac5fjNUqesZ1SGtUbvoc363smroBrRudGEFrsUhqBcpR+S4fSneoDk+xqMypf
 VLXP/8kJlFEBGMiR7ouAWnR4+u6JgY4E8E8JIPNzao5KE/L1lD83nY+Usjc/01ek
 oqMyYVFRfncsGjGJXc5fOOTTCj768mRroF0sLmEegIonnwQkSHE7HWJ/nyaVraDV
 bomnTIgMdVIDqharin08ZPIM7qBIWM09Uifaf0lIs6fIA94pQP+5Ko3mum2P/S+U
 ON/qviSrlNgRXoHPJ3hvPHdfEU9cSg==
 =1d0v
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

 - SLAB deprecation:

   Following the discussion at LSF/MM 2023 [1] and no objections, the
   SLAB allocator is deprecated by renaming the config option (to make
   its users notice) to CONFIG_SLAB_DEPRECATED with updated help text.
   SLUB should be used instead. Existing defconfigs with CONFIG_SLAB are
   also updated.

 - SLAB_NO_MERGE kmem_cache flag (Jesper Dangaard Brouer):

   There are (very limited) cases where kmem_cache merging is
   undesirable, and existing ways to prevent it are hacky. Introduce a
   new flag to do that cleanly and convert the existing hacky users.
   Btrfs plans to use this for debug kernel builds (that use case is
   always fine), networking for performance reasons (that should be very
   rare).

 - Replace the usage of weak PRNGs (David Keisar Schmidt):

   In addition to using stronger RNGs for the security related features,
   the code is a bit cleaner.

 - Misc code cleanups (SeongJae Parki, Xiongwei Song, Zhen Lei, and
   zhaoxinchao)

Link: https://lwn.net/Articles/932201/ [1]

* tag 'slab-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab_common: use SLAB_NO_MERGE instead of negative refcount
  mm/slab: break up RCU readers on SLAB_TYPESAFE_BY_RCU example code
  mm/slab: add a missing semicolon on SLAB_TYPESAFE_BY_RCU example code
  mm/slab_common: reduce an if statement in create_cache()
  mm/slab: introduce kmem_cache flag SLAB_NO_MERGE
  mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATED
  mm/slab: remove HAVE_HARDENED_USERCOPY_ALLOCATOR
  mm/slab_common: Replace invocation of weak PRNG
  mm/slab: Replace invocation of weak PRNG
  slub: Don't read nr_slabs and total_objects directly
  slub: Remove slabs_node() function
  slub: Remove CONFIG_SMP defined check
  slub: Put objects_show() into CONFIG_SLUB_DEBUG enabled block
  slub: Correct the error code when slab_kset is NULL
  mm/slab: correct return values in comment for _kmem_cache_create()
2023-06-29 16:34:12 -07:00
Linus Torvalds 6a8cbd9253 v6.5-rc1-sysctl-next
The changes queued up for v6.5-rc1 for sysctl are in line with
 prior efforts to stop usage of deprecated routines which incur
 recursion and also make it hard to remove the empty array element
 in each sysctl array declaration. The most difficult user to modify
 was parport which required a bit of re-thinking of how to declare shared
 sysctls there, Joel Granados has stepped up to the plate to do most of
 this work and eventual removal of register_sysctl_table(). That work
 ended up saving us about 1465 bytes according to bloat-o-meter. Since
 we gained a few bloat-o-meter karma points I moved two rather small
 sysctl arrays from kernel/sysctl.c leaving us only two more sysctl
 arrays to move left.
 
 Most changes have been tested on linux-next for about a month. The last
 straggler patches are a minor parport fix, changes to the sysctl
 kernel selftest so to verify correctness and prevent regressions for
 the future change he made to provide an alternative solution for the
 special sysctl mount point target which was using the now deprecated
 sysctl child element.
 
 This is all prep work to now finally be able to remove the empty
 array element in all sysctl declarations / registrations which is
 expected to save us a bit of bytes all over the kernel. That work
 will be tested early after v6.5-rc1 is out.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmSceh0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinBFQQAK9WdpcU8ODoDzoSls4jsCQpZUCfZ+ED
 pbCgQqUqu9VPs6bnJ+aXVa6Fh3uCr6+TIfNFM55qI/Sbo2issZ7bm0nvKmGgc6/m
 giqDP7btvHqiAsEootci8DVdbBXKkdH4dx3pSwleyN8pdinewH0hrKImaPpahyo6
 1mB1du0iI89yjsZmheHVVSyfXXYAnP0PqRVy5Y+qxY7yYlIegQ5uAZmwRE62lfTf
 TuiV7OFuDZ2DBYOmqIhfGKGRnfOL5ZVF3iHCrfUpX3p+fEFzDmwvm3vr73PTSrFw
 /aRRLa/hOWr5ilw1bvnMcazgQzFEOlQb3DMhBKH7gLl3XHVrM+TaaqYHjUia1+6Y
 e2axz/duA2q9uLMW81daRApvHMCgy0exkpC7prfOxF5bgTe4TjA7ZWvGpqG1kPKT
 PPSxw80XvG5hLZm4tB0ZWJ5rOfFpiUGGneSeRQwyuClBt73SIO+F03jyGpt83slU
 jFE50ac14Zwh1oxpCQtYoR1+bXWdq1QwM5vQBNEuaoTSnJfVjrXqBz/BnqJChtjr
 m1vA27+4/dfki2P3gVWF1lGx43ir3uJvqk+BjWXm2CDDJqpRi3N0qcUwZwLuqAAz
 /LEgFqK61bpHi/C8c2NWAxIoeWRU4NUOaoiKmZwyt0sKAWU1Yzg70xssYeg7VYqZ
 3pvFNVBqkV+F
 =sXUU
 -----END PGP SIGNATURE-----

Merge tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "The changes for sysctl are in line with prior efforts to stop usage of
  deprecated routines which incur recursion and also make it hard to
  remove the empty array element in each sysctl array declaration.

  The most difficult user to modify was parport which required a bit of
  re-thinking of how to declare shared sysctls there, Joel Granados has
  stepped up to the plate to do most of this work and eventual removal
  of register_sysctl_table(). That work ended up saving us about 1465
  bytes according to bloat-o-meter. Since we gained a few bloat-o-meter
  karma points I moved two rather small sysctl arrays from
  kernel/sysctl.c leaving us only two more sysctl arrays to move left.

  Most changes have been tested on linux-next for about a month. The
  last straggler patches are a minor parport fix, changes to the sysctl
  kernel selftest so to verify correctness and prevent regressions for
  the future change he made to provide an alternative solution for the
  special sysctl mount point target which was using the now deprecated
  sysctl child element.

  This is all prep work to now finally be able to remove the empty array
  element in all sysctl declarations / registrations which is expected
  to save us a bit of bytes all over the kernel. That work will be
  tested early after v6.5-rc1 is out"

* tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: replace child with an enumeration
  sysctl: Remove debugging dump_stack
  test_sysclt: Test for registering a mount point
  test_sysctl: Add an option to prevent test skip
  test_sysctl: Add an unregister sysctl test
  test_sysctl: Group node sysctl test under one func
  test_sysctl: Fix test metadata getters
  parport: plug a sysctl register leak
  sysctl: move security keys sysctl registration to its own file
  sysctl: move umh sysctl registration to its own file
  signal: move show_unhandled_signals sysctl to its own file
  sysctl: remove empty dev table
  sysctl: Remove register_sysctl_table
  sysctl: Refactor base paths registrations
  sysctl: stop exporting register_sysctl_table
  parport: Removed sysctl related defines
  parport: Remove register_sysctl_table from parport_default_proc_register
  parport: Remove register_sysctl_table from parport_device_proc_register
  parport: Remove register_sysctl_table from parport_proc_register
  parport: Move magic number "15" to a define
2023-06-28 16:05:21 -07:00
Linus Torvalds 6e17c6de3d - Yosry Ahmed brought back some cgroup v1 stats in OOM logs.
- Yosry has also eliminated cgroup's atomic rstat flushing.
 
 - Nhat Pham adds the new cachestat() syscall.  It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability.
 
 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning.
 
 - Lorenzo Stoakes has done some maintanance work on the get_user_pages()
   interface.
 
 - Liam Howlett continues with cleanups and maintenance work to the maple
   tree code.  Peng Zhang also does some work on maple tree.
 
 - Johannes Weiner has done some cleanup work on the compaction code.
 
 - David Hildenbrand has contributed additional selftests for
   get_user_pages().
 
 - Thomas Gleixner has contributed some maintenance and optimization work
   for the vmalloc code.
 
 - Baolin Wang has provided some compaction cleanups,
 
 - SeongJae Park continues maintenance work on the DAMON code.
 
 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting.
 
 - Christoph Hellwig has some cleanups for the filemap/directio code.
 
 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the provided
   APIs rather than open-coding accesses.
 
 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings.
 
 - John Hubbard has a series of fixes to the MM selftesting code.
 
 - ZhangPeng continues the folio conversion campaign.
 
 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock.
 
 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
   128 to 8.
 
 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management.
 
 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code.
 
 - Vishal Moola also has done some folio conversion work.
 
 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
 joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
 J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
 =B7yQ
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
2023-06-28 10:28:11 -07:00
Linus Torvalds 98be618ad0 Two patches that improve inode attribute initialization.
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmSa9X8XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFVXRAAuxSLzbFmkWwm89tOK5YJRgnC
 xbnfqsG8/z2YRRbpg2pjcod4wvcBUjwLwg/Y727iWHqxD8/dXaZRyoIcxnWJsVCX
 tCq6puL3XF8NQKSyggMYGvNPYsWgOcHoypGbCmLDcS135GFy2redd6pEAgIokg5r
 XJbRBpmcEiQUUOqwB/gSU29EiVFMYxEmLTPnzxGz6UwU4WTq0oUicpNBIc3znozv
 hQjx7VFxMNDEv3bQupR/gI09MiYWOZGChluyyegyuW5FGTn7OfCcfXsqpP0/eqLg
 OA9JCO9scnsfss8mhO30qQmPFfh1HTbm/dN96TsRPzz9IzTCTALDx1PzbgwQeqZb
 vCeA9eucsxiUWCyNWs+Q1QM0RR7mBsSoyZc4IxJ61R9ee4uYuliBX0ipiX8gHwBK
 6HNaSDwR/gvwbQBTqXic0OV7c1IlZIQLRSbMUNi/6a3AZNkIZvLRkvo+1taKW+2Z
 VYZekvSJl0NU4a/AfYCQUFXAgga93QlegZ0AgKg5YNX9hyWblyau8Owg6DkimM6E
 grbqQ706BoEuFC3xrxHMs2rMQM7G4i9NPjmUyAMCMOJFqeUqdemPCmoGMXG9G6Yk
 F6/YC76Y+y/plLnDdqyLpTyLVmwtXYz8aWFtYHVddCOW2Um3mU1tG7sA8q8C3hxg
 0mUQKznynvk92kOFT7w=
 =KtxS
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "There are two patches, both of which change how Smack initializes the
  SMACK64TRANSMUTE extended attribute.

  The first corrects the behavior of overlayfs, which creates inodes
  differently from other filesystems. The second ensures that transmute
  attributes specified by mount options are correctly assigned"

* tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next:
  smack: Record transmuting in smk_transmuted
  smack: Retrieve transmuting information in smack_inode_getsecurity()
2023-06-27 17:58:06 -07:00
Linus Torvalds b4c7f2e6ef integrity-v6.5
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZJomKBQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5QezAQD59PM+HueH5FrziRaCrXdoSt4KK42s
 +gAmd4oUq9hm9QD9GOC6eaAUuV/uJ6UpEF/KjSGGmYSWI8iRWKWBcmDMmg0=
 =TI2r
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity subsystem updates from Mimi Zohar:
 "An i_version change, one bug fix, and three kernel doc fixes:

   - instead of IMA detecting file change by directly accesssing
     i_version, it now calls vfs_getattr_nosec().

   - fix a race condition when inserting a new node in the iint rb-tree"

* tag 'integrity-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Fix build warnings
  evm: Fix build warnings
  evm: Complete description of evm_inode_setattr()
  integrity: Fix possible multiple allocation in integrity_inode_get()
  IMA: use vfs_getattr_nosec to get the i_version
2023-06-27 17:32:34 -07:00
Linus Torvalds 21953eb16c lsm/stable-6.5 PR 20230626
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmSZuh0UHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNpuxAAxChGqME9nE7iITx1TaFRrbK49mDF
 1RZh/5cwzde72lLLFkTFKB6ErMSQkrrtA+jFH7vKsrOslBel1+yO80vkXmhYCeZU
 P3m0FeREUpuU4QV0tbQamPeR+SWohmKi2dYWd+VdpLA+1aTK3KNYsi2NFkDIreap
 BqeRq4S0Rqc4u3/5juk6JCGFhTRWaH16YJQrzIKHF/K3DK+gMhAY5sjuAWzFc6ma
 /5bbD55kdVVDfnsxNSe+lzJ7zEf7TYedLG6BN+R9cVrU+El12a38M29kASaAof5w
 vpb92a27hA9Q5EyQ2O9QXnr2L5CShT4bvAZCGkK4cmZerGNTdM0iojhYj1s7FAV/
 USkWgkDmEuSatp0+DdXlfQyUmZZWlw1W0oiEfZwR8w7TY7q9CU7aD8K7+GDSIazB
 g89nYznVjlaC/oA4/owMraoWP3eiDiAcsQdO052Vv63TVyJtTiRiKyBq5EFLrX8L
 iaUCa4cBaYFc94kN1PZeNXZKwqRc2F6oAFT1YuXnFWBGmixN0kUL023C0xjl/J7P
 02jYYSVzLm22aU39GU0DSnaLfAwl3muazOB3XuyGOhUWHFYzjkc9UhmGp0W50DkK
 qigW3ONA8s8CKUS/q7QSGq+Vf+CVZA5f+daDDPGYstPfCTk61eu0wjwfwek3W0o+
 xKzBr2Od3vTOzAs=
 =3nWy
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - A SafeSetID patch to correct what appears to be a cut-n-paste typo in
   the code causing a UID to be printed where a GID was desired.

   This is coming via the LSM tree because we haven't been able to get a
   response from the SafeSetID maintainer (Micah Morton) in several
   months. Hopefully we are able to get in touch with Micah, but until
   we do I'm going to pick them up in the LSM tree.

 - A small fix to the reiserfs LSM xattr code.

   We're continuing to work through some issues with the reiserfs code
   as we try to fixup the LSM xattr handling, but in the process we're
   uncovering some ugly problems in reiserfs and we may just end up
   removing the LSM xattr support in reiserfs prior to reiserfs'
   removal.

   For better or worse, this shouldn't impact any of the reiserfs users,
   as we discovered that LSM xattrs on reiserfs were completely broken,
   meaning no one is currently using the combo of reiserfs and a file
   labeling LSM.

 - A tweak to how the cap_user_data_t struct/typedef is declared in the
   header file to appease the Sparse gods.

 - In the process of trying to sort out the SafeSetID lost-maintainer
   problem I realized that I needed to update the labeled networking
   entry to "Supported".

 - Minor comment/documentation and spelling fixes.

* tag 'lsm-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  device_cgroup: Fix kernel-doc warnings in device_cgroup
  SafeSetID: fix UID printed instead of GID
  MAINTAINERS: move labeled networking to "supported"
  capability: erase checker warnings about struct __user_cap_data_struct
  lsm: fix a number of misspellings
  reiserfs: Initialize sec->length in reiserfs_security_init().
  capability: fix kernel-doc warnings in capability.c
2023-06-27 17:24:26 -07:00
Linus Torvalds 729b39ec1b selinux/stable-6.5 PR 20230626
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmSZucUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMoew/+IpRuIKwouAvTINC2IEuacNlCghSs
 berPaYSLF89WTgbJN6hPm9NtaPU+epm5hikYp9/Ebm1Hi/91zgZZfUAN64c4e9Mx
 0GgO4VwuEbx6pOK0CF9EEQTlOWnOOiP24pQlYtQGUcYOTY3OaxFkLjYx9BMw05Rd
 Km93eVRgJolap62ChCxdULPQQIEW0DDNGAI9TPRrPbtYRT0oSmfsMGL8Ndkui8K8
 LlUVpOO5MM5/gCJjP+5PSVoyui6++ao2AwjsFk7I3hJqm3NN5fWFzWH9axLqZEqd
 ZfGdiah48ga+eNqi6pi79pBetlvpfHshELVwKxN9ck2UjzWQe8dqfy1p/0ikHO29
 OuD+urnGTPF668GszGZgC59LoaKrHFUBjfxj3g56/BOk2aqxXKY7qeZClJ/AUEZv
 +VEa/foB0OCVxCBOcTvXB7Zgiz5isoR3hAQu2MmWzny9tCgHFYXJ1u0UhQaFjx57
 ScPxlnjvzD5pA4ts+P2ggRojQ3Xo35dUoC353kuaaCrSg9v8yfz0ex3KeS/m9uJG
 MbeOtl44Xmqzzy0EB7ycNeF96kdbvKSc5XLBZyuT5CmAMUXlL3s6OOa26aevVifj
 LwNHAc1D7oe773Ty2WpW2s82Nh4hUyYVdIKg+9RDm74mS2ftZFeeGgFVumNQ80ZH
 DGhjW2iZY+0a0EU=
 =xzMY
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Thanks to help from the MPTCP folks, it looks like we have finally
   sorted out a proper solution to the MPTCP socket labeling issue, see
   the new security_mptcp_add_subflow() LSM hook.

 - Fix the labeled NFS handling such that a labeled NFS share mounted
   prior to the initial SELinux policy load is properly labeled once a
   policy is loaded; more information in the commit description.

 - Two patches to security/selinux/Makefile, the first took the cleanups
   in v6.4 a bit further and the second removed the grouped targets
   support as that functionality doesn't appear to be properly supported
   prior to make v4.3.

 - Deprecate the "fs" object context type in SELinux policies. The fs
   object context type was an old vestige that was introduced back in
   v2.6.12-rc2 but never really used.

 - A number of small changes that remove dead code, clean up some
   awkward bits, and generally improve the quality of the code. See the
   individual commit descriptions for more information.

* tag 'selinux-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: avoid bool as identifier name
  selinux: fix Makefile for versions of make < v4.3
  selinux: make labeled NFS work when mounted before policy load
  selinux: cleanup exit_sel_fs() declaration
  selinux: deprecated fs ocon
  selinux: make header files self-including
  selinux: keep context struct members in sync
  selinux: Implement mptcp_add_subflow hook
  security, lsm: Introduce security_mptcp_add_subflow()
  selinux: small cleanups in selinux_audit_rule_init()
  selinux: declare read-only data arrays const
  selinux: retain const qualifier on string literal in avtab_hash_eval()
  selinux: drop return at end of void function avc_insert()
  selinux: avc: drop unused function avc_disable()
  selinux: adjust typos in comments
  selinux: do not leave dangling pointer behind
  selinux: more Makefile tweaks
2023-06-27 17:18:48 -07:00
Linus Torvalds 26642864f8 Landlock updates for v6.5-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYIAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZJlO/xAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbS/FwA/A5GFGtKnLzFpIAXgc1G3kr8c+J/WF7RVUD+
 PJNEsH6PAP41l3BTqVSeVZ+tdLSC7NbesoX0MTd+rCgAWnr+Pko2Cw==
 =KqpG
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock updates from Mickaël Salaün:
 "Add support for Landlock to UML.

  To do this, this fixes the way hostfs manages inodes according to the
  underlying filesystem [1]. They are now properly handled as for other
  filesystems, which enables Landlock support (and probably other
  features).

  This also extends Landlock's tests with 6 pseudo filesystems,
  including hostfs"

[1] https://lore.kernel.org/all/20230612191430.339153-1-mic@digikod.net/

* tag 'landlock-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Add hostfs tests
  selftests/landlock: Add tests for pseudo filesystems
  selftests/landlock: Make mounts configurable
  selftests/landlock: Add supports_filesystem() helper
  selftests/landlock: Don't create useless file layouts
  hostfs: Fix ephemeral inodes
2023-06-27 17:10:27 -07:00
Linus Torvalds 74774e243c fsverity updates for 6.5
Several updates for fs/verity/:
 
 - Do all hashing with the shash API instead of with the ahash API.  This
   simplifies the code and reduces API overhead.  It should also make
   things slightly easier for XFS's upcoming support for fsverity.  It
   does drop fsverity's support for off-CPU hash accelerators, but that
   support was incomplete and not known to be used.
 
 - Update and export fsverity_get_digest() so that it's ready for
   overlayfs's upcoming support for fsverity checking of lowerdata.
 
 - Improve the documentation for builtin signature support.
 
 - Fix a bug in the large folio support.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZJjsWRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK0IsAQCZ9ZP2U0DqLKV025LzcU9epUdS30xJ
 U7WOs8gP63pH4QEAqbU1O6bVhEzdFWGzq5gdzdLePWjOyHrGCVcR2u+fgw4=
 =ptAR
 -----END PGP SIGNATURE-----

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

Pull fsverity updates from Eric Biggers:
 "Several updates for fs/verity/:

   - Do all hashing with the shash API instead of with the ahash API.

     This simplifies the code and reduces API overhead. It should also
     make things slightly easier for XFS's upcoming support for
     fsverity. It does drop fsverity's support for off-CPU hash
     accelerators, but that support was incomplete and not known to be
     used

   - Update and export fsverity_get_digest() so that it's ready for
     overlayfs's upcoming support for fsverity checking of lowerdata

   - Improve the documentation for builtin signature support

   - Fix a bug in the large folio support"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: improve documentation for builtin signature support
  fsverity: rework fsverity_get_digest() again
  fsverity: simplify error handling in verify_data_block()
  fsverity: don't use bio_first_page_all() in fsverity_verify_bio()
  fsverity: constify fsverity_hash_alg
  fsverity: use shash API instead of ahash API
2023-06-26 10:56:13 -07:00
Peter Zijlstra 9a1f37ebcf apparmor: Free up __cleanup() name
In order to use __cleanup for __attribute__((__cleanup__(func))) the
name must not be used for anything else. Avoid the conflict.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Link: https://lkml.kernel.org/r/20230612093537.536441207%40infradead.org
2023-06-26 11:14:18 +02:00
Gaosheng Cui 4be22f16a4 device_cgroup: Fix kernel-doc warnings in device_cgroup
Fix kernel-doc warnings in device_cgroup:

security/device_cgroup.c:835: warning: Excess function parameter
'dev_cgroup' description in 'devcgroup_legacy_check_permission'.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-21 09:30:49 -04:00
Nayna Jain e66effaf61 security/integrity: fix pointer to ESL data and its size on pseries
On PowerVM guest, variable data is prefixed with 8 bytes of timestamp.
Extract ESL by stripping off the timestamp before passing to ESL parser.

Fixes: 4b3e71e9a3 ("integrity/powerpc: Support loading keys from PLPKS")
Cc: stable@vger.kenrnel.org # v6.3
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230608120444.382527-1-nayna@linux.ibm.com
2023-06-21 14:08:53 +10:00
Alexander Mikhalitsyn 970ebb8a26 SafeSetID: fix UID printed instead of GID
pr_warn message clearly says that GID should be printed,
but we have UID there. Let's fix that.

Found accidentally during the work on isolated user namespaces.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
[PM: fix spelling errors in description, subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-20 20:26:00 -04:00
Eric Biggers 74836ecbc5 fsverity: rework fsverity_get_digest() again
Address several issues with the calling convention and documentation of
fsverity_get_digest():

- Make it provide the hash algorithm as either a FS_VERITY_HASH_ALG_*
  value or HASH_ALGO_* value, at the caller's choice, rather than only a
  HASH_ALGO_* value as it did before.  This allows callers to work with
  the fsverity native algorithm numbers if they want to.  HASH_ALGO_* is
  what IMA uses, but other users (e.g. overlayfs) should use
  FS_VERITY_HASH_ALG_* to match fsverity-utils and the fsverity UAPI.

- Make it return the digest size so that it doesn't need to be looked up
  separately.  Use the return value for this, since 0 works nicely for
  the "file doesn't have fsverity enabled" case.  This also makes it
  clear that no other errors are possible.

- Rename the 'digest' parameter to 'raw_digest' and clearly document
  that it is only useful in combination with the algorithm ID.  This
  hopefully clears up a point of confusion.

- Export it to modules, since overlayfs will need it for checking the
  fsverity digests of lowerdata files
  (https://lore.kernel.org/r/dd294a44e8f401e6b5140029d8355f88748cd8fd.1686565330.git.alexl@redhat.com).

Acked-by: Mimi Zohar <zohar@linux.ibm.com> # for the IMA piece
Link: https://lore.kernel.org/r/20230612190047.59755-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-06-14 10:41:07 -07:00
Mickaël Salaün 74ce793bcb
hostfs: Fix ephemeral inodes
hostfs creates a new inode for each opened or created file, which
created useless inode allocations and forbade identifying a host file
with a kernel inode.

Fix this uncommon filesystem behavior by tying kernel inodes to host
file's inode and device IDs.  Even if the host filesystem inodes may be
recycled, this cannot happen while a file referencing it is opened,
which is the case with hostfs.  It should be noted that hostfs inode IDs
may not be unique for the same hostfs superblock because multiple host's
(backed) superblocks may be used.

Delete inodes when dropping them to force backed host's file descriptors
closing.

This enables to entirely remove ARCH_EPHEMERAL_INODES, and then makes
Landlock fully supported by UML.  This is very useful for testing
changes.

These changes also factor out and simplify some helpers thanks to the
new hostfs_inode_update() and the hostfs_iget() revamp: read_name(),
hostfs_create(), hostfs_lookup(), hostfs_mknod(), and
hostfs_fill_sb_common().

A following commit with new Landlock tests check this new hostfs inode
consistency.

Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20230612191430.339153-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-06-12 21:26:19 +02:00
Lorenzo Stoakes ca5e863233 mm/gup: remove vmas parameter from get_user_pages_remote()
The only instances of get_user_pages_remote() invocations which used the
vmas parameter were for a single page which can instead simply look up the
VMA directly. In particular:-

- __update_ref_ctr() looked up the VMA but did nothing with it so we simply
  remove it.

- __access_remote_vm() was already using vma_lookup() when the original
  lookup failed so by doing the lookup directly this also de-duplicates the
  code.

We are able to perform these VMA operations as we already hold the
mmap_lock in order to be able to call get_user_pages_remote().

As part of this work we add get_user_page_vma_remote() which abstracts the
VMA lookup, error handling and decrementing the page reference count should
the VMA lookup fail.

This forms part of a broader set of patches intended to eliminate the vmas
parameter altogether.

[akpm@linux-foundation.org: avoid passing NULL to PTR_ERR]
Link: https://lkml.kernel.org/r/d20128c849ecdbf4dd01cc828fcec32127ed939a.1684350871.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> (for arm64)
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com> (for s390)
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian König <christian.koenig@amd.com>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:26 -07:00
Luis Chamberlain 28898e260a sysctl: move security keys sysctl registration to its own file
The security keys sysctls are already declared on its own file,
just move the sysctl registration to its own file to help avoid
merge conflicts on sysctls.c, and help with clearing up sysctl.c
further.

This creates a small penalty of 23 bytes:

./scripts/bloat-o-meter vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function                                     old     new   delta
init_security_keys_sysctls                     -      33     +33
__pfx_init_security_keys_sysctls               -      16     +16
sysctl_init_bases                             85      59     -26
Total: Before=21256937, After=21256960, chg +0.00%

But soon we'll be saving tons of bytes anyway, as we modify the
sysctl registrations to use ARRAY_SIZE and so we get rid of all the
empty array elements so let's just clean this up now.

Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-08 15:42:02 -07:00
Roberto Sassu 95526d1303 ima: Fix build warnings
Fix build warnings (function parameters description) for
ima_collect_modsig(), ima_match_policy() and ima_parse_add_rule().

Fixes: 15588227e0 ("ima: Collect modsig") # v5.4+
Fixes: 2fe5d6def1 ("ima: integrity appraisal extension") # v5.14+
Fixes: 4af4662fa4 ("integrity: IMA policy") # v3.2+
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-06-06 09:37:23 -04:00
Roberto Sassu 996e0a97eb evm: Fix build warnings
Fix build warnings (function parameters description) for
evm_read_protected_xattrs(), evm_set_key() and evm_verifyxattr().

Fixes: 7626676320 ("evm: provide a function to set the EVM key from the kernel") # v4.5+
Fixes: 8314b6732a ("ima: Define new template fields xattrnames, xattrlengths and xattrvalues") # v5.14+
Fixes: 2960e6cb5f ("evm: additional parameter to pass integrity cache entry 'iint'") # v3.2+
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-06-06 08:51:11 -04:00
Christian Göttsche 447a568800 selinux: avoid bool as identifier name
Avoid using the identifier `bool` to improve support with future C
standards.  C23 is about to make `bool` a predefined macro (see N2654).

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-05 17:04:01 -04:00
Roberto Sassu b1de86d424 evm: Complete description of evm_inode_setattr()
Add the description for missing parameters of evm_inode_setattr() to
avoid the warning arising with W=n compile option.

Fixes: 817b54aa45 ("evm: add evm_inode_setattr to prevent updating an invalid security.evm") # v3.2+
Fixes: c1632a0f11 ("fs: port ->setattr() to pass mnt_idmap") # v6.3+
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-06-05 09:04:41 -04:00
Paul Moore ec4a491d18 selinux: fix Makefile for versions of make < v4.3
As noted in the comments of this commit, the current SELinux Makefile
requires features found in make v4.3 or later, which is problematic
as the Linux Kernel currently only requires make v3.82.  This patch
fixes the SELinux Makefile so that it works properly on these older
versions of make, and adds a couple of comments to the Makefile about
how it can be improved once make v4.3 is required by the kernel.

Fixes: 6f933aa7df ("selinux: more Makefile tweaks")
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-02 15:34:29 -04:00
Paul Moore 42c4e97e06 selinux: don't use make's grouped targets feature yet
The Linux Kernel currently only requires make v3.82 while the grouped
target functionality requires make v4.3.  Removed the grouped target
introduced in 4ce1f694eb ("selinux: ensure av_permissions.h is
built when needed") as well as the multiple header file targets in
the make rule.  This effectively reverts the problem commit.

We will revisit this change when make >= 4.3 is required by the rest
of the kernel.

Cc: stable@vger.kernel.org
Fixes: 4ce1f694eb ("selinux: ensure av_permissions.h is built when needed")
Reported-by: Erwan Velu <e.velu@criteo.com>
Reported-by: Luiz Capitulino <luizcap@amazon.com>
Tested-by: Luiz Capitulino <luizcap@amazon.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-06-01 13:56:13 -04:00
Tianjia Zhang 9df6a4870d integrity: Fix possible multiple allocation in integrity_inode_get()
When integrity_inode_get() is querying and inserting the cache, there
is a conditional race in the concurrent environment.

The race condition is the result of not properly implementing
"double-checked locking". In this case, it first checks to see if the
iint cache record exists before taking the lock, but doesn't check
again after taking the integrity_iint_lock.

Fixes: bf2276d10c ("ima: allocating iint improvements")
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-06-01 07:25:04 -04:00
Ondrej Mosnacek cec5fe7007 selinux: make labeled NFS work when mounted before policy load
Currently, when an NFS filesystem that supports passing LSM/SELinux
labels is mounted during early boot (before the SELinux policy is
loaded), it ends up mounted without the labeling support (i.e. with
Fedora policy all files get the generic NFS label
system_u:object_r:nfs_t:s0).

This is because the information that the NFS mount supports passing
labels (communicated to the LSM layer via the kern_flags argument of
security_set_mnt_opts()) gets lost and when the policy is loaded the
mount is initialized as if the passing is not supported.

Fix this by noting the "native labeling" in newsbsec->flags (using a new
SE_SBNATIVE flag) on the pre-policy-loaded call of
selinux_set_mnt_opts() and then making sure it is respected on the
second call from delayed_superblock_init().

Additionally, make inode_doinit_with_dentry() initialize the inode's
label from its extended attributes whenever it doesn't find it already
intitialized by the filesystem. This is needed to properly initialize
pre-existing inodes when delayed_superblock_init() is called. It should
not trigger in any other cases (and if it does, it's still better to
initialize the correct label instead of leaving the inode unlabeled).

Fixes: eb9ae68650 ("SELinux: Add new labeling type native labels")
Tested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: fixed 'Fixes' tag format]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-30 17:44:34 -04:00
Xiu Jianfeng 29cd55fe69 selinux: cleanup exit_sel_fs() declaration
exit_sel_fs() has been removed since commit f22f9aaf6c ("selinux:
remove the runtime disable functionality").

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-30 16:43:25 -04:00
Paul Moore 4432b50744 lsm: fix a number of misspellings
A random collection of spelling fixes for source files in the LSM
layer.

Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-25 17:52:15 -04:00
Vlastimil Babka d2e527f0d8 mm/slab: remove HAVE_HARDENED_USERCOPY_ALLOCATOR
With SLOB removed, both remaining allocators support hardened usercopy,
so remove the config and associated #ifdef.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
2023-05-24 15:38:17 +02:00
Jeff Layton db1d1e8b98 IMA: use vfs_getattr_nosec to get the i_version
IMA currently accesses the i_version out of the inode directly when it
does a measurement. This is fine for most simple filesystems, but can be
problematic with more complex setups (e.g. overlayfs).

Make IMA instead call vfs_getattr_nosec to get this info. This allows
the filesystem to determine whether and how to report the i_version, and
should allow IMA to work properly with a broader class of filesystems in
the future.

Reported-and-Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2023-05-23 18:07:34 -04:00
Christian Göttsche 8bfbd046a3 selinux: deprecated fs ocon
The object context type `fs`, not to be confused with the well used
object context type `fscon`, was introduced in the initial git commit
1da177e4c3 ("Linux-2.6.12-rc2") but never actually used since.

The paper "A Security Policy Configuration for the Security-Enhanced
Linux" [1] mentions it under `7.2 File System Contexts` but also states:

    Currently, this configuration is unused.

The policy statement defining such object contexts is `fscon`, e.g.:

    fscon 2 3 gen_context(system_u:object_r:conA_t,s0) \
        gen_context(system_u:object_r:conB_t,s0)

It is not documented at selinuxproject.org or in the SELinux notebook
and not supported by the Reference Policy buildsystem - the statement is
not properly sorted - and thus not used in the Reference or Fedora
Policy.

Print a warning message at policy load for each such object context:

    SELinux:  void and deprecated fs ocon 02:03

This topic was initially highlighted by Nicolas Iooss [2].

[1]: https://media.defense.gov/2021/Jul/29/2002815735/-1/-1/0/SELINUX-SECURITY-POLICY-CONFIGURATION-REPORT.PDF
[2]: https://lore.kernel.org/selinux/CAJfZ7=mP2eJaq2BfO3y0VnwUJaY2cS2p=HZMN71z1pKjzaT0Eg@mail.gmail.com/

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: tweaked deprecation comment, description line wrapping]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-23 15:37:56 -04:00
Christian Göttsche eb14232fb7 selinux: make header files self-including
Include all necessary headers in header files to enable third party
applications, like LSP servers, to resolve all used symbols.

ibpkey.h: include "flask.h" for SECINITSID_UNLABELED
initial_sid_to_string.h: include <linux/stddef.h> for NULL

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-18 14:12:43 -04:00
Christian Göttsche ed99135f76 selinux: keep context struct members in sync
Commit 53f3517ae0 ("selinux: do not leave dangling pointer behind")
reset the `str` field of the `context` struct in an OOM error branch.
In this struct the fields `str` and `len` are coupled and should be kept
in sync.  Set the length to zero according to the string be set to NULL.

Fixes: 53f3517ae0 ("selinux: do not leave dangling pointer behind")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-18 13:38:39 -04:00
Paolo Abeni 85c3222ddd selinux: Implement mptcp_add_subflow hook
Newly added subflows should inherit the LSM label from the associated
MPTCP socket regardless of the current context.

This patch implements the above copying sid and class from the MPTCP
socket context, deleting the existing subflow label, if any, and then
re-creating the correct one.

The new helper reuses the selinux_netlbl_sk_security_free() function,
and the latter can end-up being called multiple times with the same
argument; we additionally need to make it idempotent.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-18 13:11:10 -04:00
Paolo Abeni e3d9387f00 security, lsm: Introduce security_mptcp_add_subflow()
MPTCP can create subflows in kernel context, and later indirectly
expose them to user-space, via the owning MPTCP socket.

As discussed in the reported link, the above causes unexpected failures
for server, MPTCP-enabled applications.

Let's introduce a new LSM hook to allow the security module to relabel
the subflow according to the owning user-space process, via the MPTCP
socket owning the subflow.

Note that the new hook requires both the MPTCP socket and the new
subflow. This could allow future extensions, e.g. explicitly validating
the MPTCP <-> subflow linkage.

Link: https://lore.kernel.org/mptcp/CAHC9VhTNh-YwiyTds=P1e3rixEDqbRTFj22bpya=+qJqfcaMfg@mail.gmail.com/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-18 13:11:09 -04:00
Roberto Sassu 2c085f3a8f smack: Record transmuting in smk_transmuted
smack_dentry_create_files_as() determines whether transmuting should occur
based on the label of the parent directory the new inode will be added to,
and not the label of the directory where it is created.

This helps for example to do transmuting on overlayfs, since the latter
first creates the inode in the working directory, and then moves it to the
correct destination.

However, despite smack_dentry_create_files_as() provides the correct label,
smack_inode_init_security() does not know from passed information whether
or not transmuting occurred. Without this information,
smack_inode_init_security() cannot set SMK_INODE_CHANGED in smk_flags,
which will result in the SMACK64TRANSMUTE xattr not being set in
smack_d_instantiate().

Thus, add the smk_transmuted field to the task_smack structure, and set it
in smack_dentry_create_files_as() to smk_task if transmuting occurred. If
smk_task is equal to smk_transmuted in smack_inode_init_security(), act as
if transmuting was successful but without taking the label from the parent
directory (the inode label was already set correctly from the current
credentials in smack_inode_alloc_security()).

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-05-11 10:05:39 -07:00
Roberto Sassu 3a3d8fce31 smack: Retrieve transmuting information in smack_inode_getsecurity()
Enhance smack_inode_getsecurity() to retrieve the value for
SMACK64TRANSMUTE from the inode security blob, similarly to SMACK64.

This helps to display accurate values in the situation where the security
labels come from mount options and not from xattrs.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-05-11 10:05:38 -07:00
Paul Moore c52df19e37 selinux: small cleanups in selinux_audit_rule_init()
A few small tweaks to selinux_audit_rule_init():

- Adjust how we use the @rc variable so we are not doing any extra
  work in the common/success case.

- Related to the above, rework the 'out' jump label so that the
  success and error paths are different, simplifying both.

- Cleanup some of the vertical whitespace while we are making the
  other changes.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:53:41 -04:00
Christian Göttsche 4158cb6000 selinux: declare read-only data arrays const
The array of mount tokens in only used in match_opt_prefix() and never
modified.

The array of symtab names is never modified and only used in the
DEBUG_HASHES configuration as output.

The array of files for the SElinux filesystem sub-directory `ss` is
similar to the other `struct tree_descr` usages only read from to
construct the containing entries.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:52:05 -04:00
Christian Göttsche 4595ae8c4a selinux: retain const qualifier on string literal in avtab_hash_eval()
The second parameter `tag` of avtab_hash_eval() is only used for
printing.  In policydb_index() it is called with a string literal:

    avtab_hash_eval(&p->te_avtab, "rules");

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: slight formatting tweak in description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:49:14 -04:00
Christian Göttsche aeb060ec71 selinux: drop return at end of void function avc_insert()
Commit 539813e418 ("selinux: stop returning node from avc_insert()")
converted the return value of avc_insert() to void but left the now
unnecessary trailing return statement.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:47:32 -04:00
Christian Göttsche 757010002b selinux: avc: drop unused function avc_disable()
Since commit f22f9aaf6c ("selinux: remove the runtime disable
functionality") the function avc_disable() is no longer used.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:45:36 -04:00
Christian Göttsche 3d9047a064 selinux: adjust typos in comments
Found by codespell(1)

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:44:01 -04:00
Christian Göttsche 53f3517ae0 selinux: do not leave dangling pointer behind
In case mls_context_cpy() fails due to OOM set the free'd pointer in
context_cpy() to NULL to avoid it potentially being dereferenced or
free'd again in future.  Freeing a NULL pointer is well-defined and a
hard NULL dereference crash is at least not exploitable and should give
a workable stack trace.

Fixes: 12b29f3455 ("selinux: support deferred mapping of contexts")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:37:42 -04:00
Paul Moore 6f933aa7df selinux: more Makefile tweaks
A few small tweaks to improve the SELinux Makefile:

- Define a new variable, 'genhdrs', to represent both flask.h and
  av_permissions.h; this should help ensure consistent processing for
  both generated headers.

- Move the 'ccflags-y' variable closer to the top, just after the
  main 'obj-$(CONFIG_SECURITY_SELINUX)' definition to make it more
  visible and improve the grouping in the Makefile.

- Rework some of the vertical whitespace to improve some of the
  grouping in the Makefile.

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08 16:26:48 -04:00
Linus Torvalds febf9ee3d2 integrity-v6.4
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZEb46hQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5U+lAP9vq7PplZeQv0cGygvp+7vH3UmcANsM
 7MyyydPC7KfhNgEA7A4WKAPIdvLW7IuKxiVfkgMDxQpFCGkLRHscgbf7xgw=
 =v0fw
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity update from Mimi Zohar:
 "Just one one bug fix. Other integrity changes are being upstreamed via
  the tpm and lsm trees"

* tag 'integrity-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  IMA: allow/fix UML builds
2023-04-29 10:11:32 -07:00
Linus Torvalds 7fa8a8ee94 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
 
 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
 
 - zsmalloc performance improvements from Sergey Senozhatsky.
 
 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.
 
 - VFS rationalizations from Christoph Hellwig:
 
   - removal of most of the callers of write_one_page().
 
   - make __filemap_get_folio()'s return value more useful
 
 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing.  Use `mount -o noswap'.
 
 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.
 
 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).
 
 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.
 
 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
   than exclusive, and has fixed a bunch of errors which were caused by its
   unintuitive meaning.
 
 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.
 
 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.
 
 - Cleanups to do_fault_around() by Lorenzo Stoakes.
 
 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.
 
 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.
 
 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.
 
 - Lorenzo has also contributed further cleanups of vma_merge().
 
 - Chaitanya Prakash provides some fixes to the mmap selftesting code.
 
 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.
 
 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.
 
 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.
 
 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.
 
 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.
 
 - Yosry Ahmed has improved the performance of memcg statistics flushing.
 
 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.
 
 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.
 
 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.
 
 - Pankaj Raghav has made page_endio() unneeded and has removed it.
 
 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.
 
 - Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
 
 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.
 
 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.
 
 - Peter Xu fixes a few issues with uffd-wp at fork() time.
 
 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
 jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
 eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
 =J+Dj
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page recalim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
2023-04-27 19:42:02 -07:00
Linus Torvalds 888d3c9f7f sysctl-6.4-rc1
This pull request goes with only a few sysctl moves from the
 kernel/sysctl.c file, the rest of the work has been put towards
 deprecating two API calls which incur recursion and prevent us
 from simplifying the registration process / saving memory per
 move. Most of the changes have been soaking on linux-next since
 v6.3-rc3.
 
 I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's
 feedback that we should see if we could *save* memory with these
 moves instead of incurring more memory. We currently incur more
 memory since when we move a syctl from kernel/sysclt.c out to its
 own file we end up having to add a new empty sysctl used to register
 it. To achieve saving memory we want to allow syctls to be passed
 without requiring the end element being empty, and just have our
 registration process rely on ARRAY_SIZE(). Without this, supporting
 both styles of sysctls would make the sysctl registration pretty
 brittle, hard to read and maintain as can be seen from Meng Tang's
 efforts to do just this [0]. Fortunately, in order to use ARRAY_SIZE()
 for all sysctl registrations also implies doing the work to deprecate
 two API calls which use recursion in order to support sysctl
 declarations with subdirectories.
 
 And so during this development cycle quite a bit of effort went into
 this deprecation effort. I've annotated the following two APIs are
 deprecated and in few kernel releases we should be good to remove them:
 
   * register_sysctl_table()
   * register_sysctl_paths()
 
 During this merge window we should be able to deprecate and unexport
 register_sysctl_paths(), we can probably do that towards the end
 of this merge window.
 
 Deprecating register_sysctl_table() will take a bit more time but
 this pull request goes with a few example of how to do this.
 
 As it turns out each of the conversions to move away from either of
 these two API calls *also* saves memory. And so long term, all these
 changes *will* prove to have saved a bit of memory on boot.
 
 The way I see it then is if remove a user of one deprecated call, it
 gives us enough savings to move one kernel/sysctl.c out from the
 generic arrays as we end up with about the same amount of bytes.
 
 Since deprecating register_sysctl_table() and register_sysctl_paths()
 does not require maintainer coordination except the final unexport
 you'll see quite a bit of these changes from other pull requests, I've
 just kept the stragglers after rc3.
 
 Most of these changes have been soaking on linux-next since around rc3.
 
 [0] https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRHAjQSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinTzgQAI/uKHKi0VlUR1l2Psl0XbseUVueuyj3
 ZDxSJpbVUmsoDf2MlLjzB8mYE3ricnNTDbLr7qOyA6pXdM1N0mY5LQmRVRu8/ffd
 2T1hQ5pl7YnJdWP5dPhcF9Y+jnu1tjX1MW5DS4fzllwK7FnD86HuIruGq52RAPS/
 /FH+BD9eodLWWXk6A/o2GFqoWxPKQI0GLxEYWa7Hg7yt8E/3PQL9QsRzn8i6U+HW
 BrN/+G3YD1VCCzXu0UAeXnm+i1Z7CdvqNdZuSkvE3DObiZ5WpOS+/i7FrDB7zdiu
 zAbHaifHnDPtcK3w2ZodbLAAwEWD/mG4iwIjE2kgIMVYxBv7TFDBRREXAWYAevIT
 UUuZnWDQsGaWdjywrebaUycEfd6dytKyan0fTXgMFkcoWRjejhitfdM2iZDdQROg
 q453p4HqOw4vTrhy4ov4zOX7J3EFiBzpZdl+SmLqcXk+jbLVb/Q9snUWz1AFtHBl
 gHoP5bS82uVktGG3MsObjgTzYYMQjO9YGIrVuW1VP9uWs8WaoWx6M9FQJIIhtwE+
 h6wG2s7CjuFWnS0/IxWmDOn91QyUn1w7ohiz9TuvYj/5GLSBpBDGCJHsNB5T2WS1
 qbQRaZ2Kg3j9TeyWfXxdlxBx7bt3ni+J/IXDY0zom2sTpGHKl8D2g5AzmEXJDTpl
 kd7Z3gsmwhDh
 =0U0W
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "This only does a few sysctl moves from the kernel/sysctl.c file, the
  rest of the work has been put towards deprecating two API calls which
  incur recursion and prevent us from simplifying the registration
  process / saving memory per move. Most of the changes have been
  soaking on linux-next since v6.3-rc3.

  I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's
  feedback that we should see if we could *save* memory with these moves
  instead of incurring more memory. We currently incur more memory since
  when we move a syctl from kernel/sysclt.c out to its own file we end
  up having to add a new empty sysctl used to register it. To achieve
  saving memory we want to allow syctls to be passed without requiring
  the end element being empty, and just have our registration process
  rely on ARRAY_SIZE(). Without this, supporting both styles of sysctls
  would make the sysctl registration pretty brittle, hard to read and
  maintain as can be seen from Meng Tang's efforts to do just this [0].
  Fortunately, in order to use ARRAY_SIZE() for all sysctl registrations
  also implies doing the work to deprecate two API calls which use
  recursion in order to support sysctl declarations with subdirectories.

  And so during this development cycle quite a bit of effort went into
  this deprecation effort. I've annotated the following two APIs are
  deprecated and in few kernel releases we should be good to remove
  them:

   - register_sysctl_table()
   - register_sysctl_paths()

  During this merge window we should be able to deprecate and unexport
  register_sysctl_paths(), we can probably do that towards the end of
  this merge window.

  Deprecating register_sysctl_table() will take a bit more time but this
  pull request goes with a few example of how to do this.

  As it turns out each of the conversions to move away from either of
  these two API calls *also* saves memory. And so long term, all these
  changes *will* prove to have saved a bit of memory on boot.

  The way I see it then is if remove a user of one deprecated call, it
  gives us enough savings to move one kernel/sysctl.c out from the
  generic arrays as we end up with about the same amount of bytes.

  Since deprecating register_sysctl_table() and register_sysctl_paths()
  does not require maintainer coordination except the final unexport
  you'll see quite a bit of these changes from other pull requests, I've
  just kept the stragglers after rc3"

Link: https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org [0]

* tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (29 commits)
  fs: fix sysctls.c built
  mm: compaction: remove incorrect #ifdef checks
  mm: compaction: move compaction sysctl to its own file
  mm: memory-failure: Move memory failure sysctls to its own file
  arm: simplify two-level sysctl registration for ctl_isa_vars
  ia64: simplify one-level sysctl registration for kdump_ctl_table
  utsname: simplify one-level sysctl registration for uts_kern_table
  ntfs: simplfy one-level sysctl registration for ntfs_sysctls
  coda: simplify one-level sysctl registration for coda_table
  fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls
  xfs: simplify two-level sysctl registration for xfs_table
  nfs: simplify two-level sysctl registration for nfs_cb_sysctls
  nfs: simplify two-level sysctl registration for nfs4_cb_sysctls
  lockd: simplify two-level sysctl registration for nlm_sysctls
  proc_sysctl: enhance documentation
  xen: simplify sysctl registration for balloon
  md: simplify sysctl registration
  hv: simplify sysctl registration
  scsi: simplify sysctl registration with register_sysctl()
  csky: simplify alignment sysctl registration
  ...
2023-04-27 16:52:33 -07:00
Linus Torvalds 6e98b09da9 Networking changes for 6.4.
Core
 ----
 
  - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
    default value allows for better BIG TCP performances.
 
  - Reduce compound page head access for zero-copy data transfers.
 
  - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible.
 
  - Threaded NAPI improvements, adding defer skb free support and unneeded
    softirq avoidance.
 
  - Address dst_entry reference count scalability issues, via false
    sharing avoidance and optimize refcount tracking.
 
  - Add lockless accesses annotation to sk_err[_soft].
 
  - Optimize again the skb struct layout.
 
  - Extends the skb drop reasons to make it usable by multiple
    subsystems.
 
  - Better const qualifier awareness for socket casts.
 
 BPF
 ---
 
  - Add skb and XDP typed dynptrs which allow BPF programs for more
    ergonomic and less brittle iteration through data and variable-sized
    accesses.
 
  - Add a new BPF netfilter program type and minimal support to hook
    BPF programs to netfilter hooks such as prerouting or forward.
 
  - Add more precise memory usage reporting for all BPF map types.
 
  - Adds support for using {FOU,GUE} encap with an ipip device operating
    in collect_md mode and add a set of BPF kfuncs for controlling encap
    params.
 
  - Allow BPF programs to detect at load time whether a particular kfunc
    exists or not, and also add support for this in light skeleton.
 
  - Bigger batch of BPF verifier improvements to prepare for upcoming BPF
    open-coded iterators allowing for less restrictive looping capabilities.
 
  - Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
    programs to NULL-check before passing such pointers into kfunc.
 
  - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
    local storage maps.
 
  - Enable RCU semantics for task BPF kptrs and allow referenced kptr
    tasks to be stored in BPF maps.
 
  - Add support for refcounted local kptrs to the verifier for allowing
    shared ownership, useful for adding a node to both the BPF list and
    rbtree.
 
  - Add BPF verifier support for ST instructions in convert_ctx_access()
    which will help new -mcpu=v4 clang flag to start emitting them.
 
  - Add ARM32 USDT support to libbpf.
 
  - Improve bpftool's visual program dump which produces the control
    flow graph in a DOT format by adding C source inline annotations.
 
 Protocols
 ---------
 
  - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
    indicates the provenance of the IP address.
 
  - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition.
 
  - Add the handshake upcall mechanism, allowing the user-space
    to implement generic TLS handshake on kernel's behalf.
 
  - Bridge: support per-{Port, VLAN} neighbor suppression, increasing
    resilience to nodes failures.
 
  - SCTP: add support for Fair Capacity and Weighted Fair Queueing
    schedulers.
 
  - MPTCP: delay first subflow allocation up to its first usage. This
    will allow for later better LSM interaction.
 
  - xfrm: Remove inner/outer modes from input/output path. These are
    not needed anymore.
 
  - WiFi:
    - reduced neighbor report (RNR) handling for AP mode
    - HW timestamping support
    - support for randomized auth/deauth TA for PASN privacy
    - per-link debugfs for multi-link
    - TC offload support for mac80211 drivers
    - mac80211 mesh fast-xmit and fast-rx support
    - enable Wi-Fi 7 (EHT) mesh support
 
 Netfilter
 ---------
 
  - Add nf_tables 'brouting' support, to force a packet to be routed
    instead of being bridged.
 
  - Update bridge netfilter and ovs conntrack helpers to handle
    IPv6 Jumbo packets properly, i.e. fetch the packet length
    from hop-by-hop extension header. This is needed for BIT TCP
    support.
 
  - The iptables 32bit compat interface isn't compiled in by default
    anymore.
 
  - Move ip(6)tables builtin icmp matches to the udptcp one.
    This has the advantage that icmp/icmpv6 match doesn't load the
    iptables/ip6tables modules anymore when iptables-nft is used.
 
  - Extended netlink error report for netdevice in flowtables and
    netdev/chains. Allow for incrementally add/delete devices to netdev
    basechain. Allow to create netdev chain without device.
 
 Driver API
 ----------
 
  - Remove redundant Device Control Error Reporting Enable, as PCI core
    has already error reporting enabled at enumeration time.
 
  - Move Multicast DB netlink handlers to core, allowing devices other
    then bridge to use them.
 
  - Allow the page_pool to directly recycle the pages from safely
    localized NAPI.
 
  - Implement lockless TX queue stop/wake combo macros, allowing for
    further code de-duplication and sanitization.
 
  - Add YNL support for user headers and struct attrs.
 
  - Add partial YNL specification for devlink.
 
  - Add partial YNL specification for ethtool.
 
  - Add tc-mqprio and tc-taprio support for preemptible traffic classes.
 
  - Add tx push buf len param to ethtool, specifies the maximum number
    of bytes of a transmitted packet a driver can push directly to the
    underlying device.
 
  - Add basic LED support for switch/phy.
 
  - Add NAPI documentation, stop relaying on external links.
 
  - Convert dsa_master_ioctl() to netdev notifier. This is a preparatory
    work to make the hardware timestamping layer selectable by user
    space.
 
  - Add transceiver support and improve the error messages for CAN-FD
    controllers.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - AMD/Pensando core device support
    - MediaTek MT7981 SoC
    - MediaTek MT7988 SoC
    - Broadcom BCM53134 embedded switch
    - Texas Instruments CPSW9G ethernet switch
    - Qualcomm EMAC3 DWMAC ethernet
    - StarFive JH7110 SoC
    - NXP CBTX ethernet PHY
 
  - WiFi:
    - Apple M1 Pro/Max devices
    - RealTek rtl8710bu/rtl8188gu
    - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
 
  - Bluetooth:
    - Realtek RTL8821CS, RTL8851B, RTL8852BS
    - Mediatek MT7663, MT7922
    - NXP w8997
    - Actions Semi ATS2851
    - QTI WCN6855
    - Marvell 88W8997
 
  - Can:
    - STMicroelectronics bxcan stm32f429
 
 Drivers
 -------
  - Ethernet NICs:
    - Intel (1G, icg):
      - add tracking and reporting of QBV config errors.
      - add support for configuring max SDU for each Tx queue.
    - Intel (100G, ice):
      - refactor mailbox overflow detection to support Scalable IOV
      - GNSS interface optimization
    - Intel (i40e):
      - support XDP multi-buffer
    - nVidia/Mellanox:
      - add the support for linux bridge multicast offload
      - enable TC offload for egress and engress MACVLAN over bond
      - add support for VxLAN GBP encap/decap flows offload
      - extend packet offload to fully support libreswan
      - support tunnel mode in mlx5 IPsec packet offload
      - extend XDP multi-buffer support
      - support MACsec VLAN offload
      - add support for dynamic msix vectors allocation
      - drop RX page_cache and fully use page_pool
      - implement thermal zone to report NIC temperature
    - Netronome/Corigine:
      - add support for multi-zone conntrack offload
    - Solarflare/Xilinx:
      - support offloading TC VLAN push/pop actions to the MAE
      - support TC decap rules
      - support unicast PTP
 
  - Other NICs:
    - Broadcom (bnxt): enforce software based freq adjustments only
 		on shared PHC NIC
    - RealTek (r8169): refactor to addess ASPM issues during NAPI poll.
    - Micrel (lan8841): add support for PTP_PF_PEROUT
    - Cadence (macb): enable PTP unicast
    - Engleder (tsnep): add XDP socket zero-copy support
    - virtio-net: implement exact header length guest feature
    - veth: add page_pool support for page recycling
    - vxlan: add MDB data path support
    - gve: add XDP support for GQI-QPL format
    - geneve: accept every ethertype
    - macvlan: allow some packets to bypass broadcast queue
    - mana: add support for jumbo frame
 
  - Ethernet high-speed switches:
    - Microchip (sparx5): Add support for TC flower templates.
 
  - Ethernet embedded switches:
    - Broadcom (b54):
      - configure 6318 and 63268 RGMII ports
    - Marvell (mv88e6xxx):
      - faster C45 bus scan
    - Microchip:
      - lan966x:
        - add support for IS1 VCAP
        - better TX/RX from/to CPU performances
      - ksz9477: add ETS Qdisc support
      - ksz8: enhance static MAC table operations and error handling
      - sama7g5: add PTP capability
    - NXP (ocelot):
      - add support for external ports
      - add support for preemptible traffic classes
    - Texas Instruments:
      - add CPSWxG SGMII support for J7200 and J721E
 
  - Intel WiFi (iwlwifi):
    - preparation for Wi-Fi 7 EHT and multi-link support
    - EHT (Wi-Fi 7) sniffer support
    - hardware timestamping support for some devices/firwmares
    - TX beacon protection on newer hardware
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - MU-MIMO parameters support
    - ack signal support for management packets
 
  - RealTek WiFi (rtw88):
    - SDIO bus support
    - better support for some SDIO devices
      (e.g. MAC address from efuse)
 
  - RealTek WiFi (rtw89):
    - HW scan support for 8852b
    - better support for 6 GHz scanning
    - support for various newer firmware APIs
    - framework firmware backwards compatibility
 
  - MediaTek WiFi (mt76):
    - P2P support
    - mesh A-MSDU support
    - EHT (Wi-Fi 7) support
    - coredump support
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRI/mUSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkgO0QAJGxpuN67YgYV0BIM+/atWKEEexJYG7B
 9MMpU4jMO3EW/pUS5t7VRsBLUybLYVPmqCZoHodObDfnu59jiPOegb6SikJv/ZwJ
 Zw62PVk5MvDnQjlu4e6kDcGwkplteN08TlgI+a49BUTedpdFitrxHAYGW8f2fRO6
 cK2XSld+ZucMoym5vRwf8yWS1BwdxnslPMxDJ+/8ZbWBZv44qAnG2vMB/kIx7ObC
 Vel/4m6MzTwVsLYBsRvcwMVbNNlZ9GuhztlTzEbfGA4ZhTadIAMgb5VTWXB84Ws7
 Aic5wTdli+q+x6/2cxhbyeoVuB9HHObYmLBAciGg4GNljP5rnQBY3X3+KVZ/x9TI
 HQB7CmhxmAZVrO9pLARFV+ECrMTH2/dy3NyrZ7uYQ3WPOXJi8hJZjOTO/eeEGL7C
 eTjdz0dZBWIBK2gON/6s4nExXVQUTEF2ZsPi52jTTClKjfe5pz/ddeFQIWaY1DTm
 pInEiWPAvd28JyiFmhFNHsuIBCjX/Zqe2JuMfMBeBibDAC09o/OGdKJYUI15AiRf
 F46Pdb7use/puqfrYW44kSAfaPYoBiE+hj1RdeQfen35xD9HVE4vdnLNeuhRlFF9
 aQfyIRHYQofkumRDr5f8JEY66cl9NiKQ4IVW1xxQfYDNdC6wQqREPG1md7rJVMrJ
 vP7ugFnttneg
 =ITVa
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
     default value allows for better BIG TCP performances

   - Reduce compound page head access for zero-copy data transfers

   - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
     possible

   - Threaded NAPI improvements, adding defer skb free support and
     unneeded softirq avoidance

   - Address dst_entry reference count scalability issues, via false
     sharing avoidance and optimize refcount tracking

   - Add lockless accesses annotation to sk_err[_soft]

   - Optimize again the skb struct layout

   - Extends the skb drop reasons to make it usable by multiple
     subsystems

   - Better const qualifier awareness for socket casts

  BPF:

   - Add skb and XDP typed dynptrs which allow BPF programs for more
     ergonomic and less brittle iteration through data and
     variable-sized accesses

   - Add a new BPF netfilter program type and minimal support to hook
     BPF programs to netfilter hooks such as prerouting or forward

   - Add more precise memory usage reporting for all BPF map types

   - Adds support for using {FOU,GUE} encap with an ipip device
     operating in collect_md mode and add a set of BPF kfuncs for
     controlling encap params

   - Allow BPF programs to detect at load time whether a particular
     kfunc exists or not, and also add support for this in light
     skeleton

   - Bigger batch of BPF verifier improvements to prepare for upcoming
     BPF open-coded iterators allowing for less restrictive looping
     capabilities

   - Rework RCU enforcement in the verifier, add kptr_rcu and enforce
     BPF programs to NULL-check before passing such pointers into kfunc

   - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
     in local storage maps

   - Enable RCU semantics for task BPF kptrs and allow referenced kptr
     tasks to be stored in BPF maps

   - Add support for refcounted local kptrs to the verifier for allowing
     shared ownership, useful for adding a node to both the BPF list and
     rbtree

   - Add BPF verifier support for ST instructions in
     convert_ctx_access() which will help new -mcpu=v4 clang flag to
     start emitting them

   - Add ARM32 USDT support to libbpf

   - Improve bpftool's visual program dump which produces the control
     flow graph in a DOT format by adding C source inline annotations

  Protocols:

   - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
     indicates the provenance of the IP address

   - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition

   - Add the handshake upcall mechanism, allowing the user-space to
     implement generic TLS handshake on kernel's behalf

   - Bridge: support per-{Port, VLAN} neighbor suppression, increasing
     resilience to nodes failures

   - SCTP: add support for Fair Capacity and Weighted Fair Queueing
     schedulers

   - MPTCP: delay first subflow allocation up to its first usage. This
     will allow for later better LSM interaction

   - xfrm: Remove inner/outer modes from input/output path. These are
     not needed anymore

   - WiFi:
      - reduced neighbor report (RNR) handling for AP mode
      - HW timestamping support
      - support for randomized auth/deauth TA for PASN privacy
      - per-link debugfs for multi-link
      - TC offload support for mac80211 drivers
      - mac80211 mesh fast-xmit and fast-rx support
      - enable Wi-Fi 7 (EHT) mesh support

  Netfilter:

   - Add nf_tables 'brouting' support, to force a packet to be routed
     instead of being bridged

   - Update bridge netfilter and ovs conntrack helpers to handle IPv6
     Jumbo packets properly, i.e. fetch the packet length from
     hop-by-hop extension header. This is needed for BIT TCP support

   - The iptables 32bit compat interface isn't compiled in by default
     anymore

   - Move ip(6)tables builtin icmp matches to the udptcp one. This has
     the advantage that icmp/icmpv6 match doesn't load the
     iptables/ip6tables modules anymore when iptables-nft is used

   - Extended netlink error report for netdevice in flowtables and
     netdev/chains. Allow for incrementally add/delete devices to netdev
     basechain. Allow to create netdev chain without device

  Driver API:

   - Remove redundant Device Control Error Reporting Enable, as PCI core
     has already error reporting enabled at enumeration time

   - Move Multicast DB netlink handlers to core, allowing devices other
     then bridge to use them

   - Allow the page_pool to directly recycle the pages from safely
     localized NAPI

   - Implement lockless TX queue stop/wake combo macros, allowing for
     further code de-duplication and sanitization

   - Add YNL support for user headers and struct attrs

   - Add partial YNL specification for devlink

   - Add partial YNL specification for ethtool

   - Add tc-mqprio and tc-taprio support for preemptible traffic classes

   - Add tx push buf len param to ethtool, specifies the maximum number
     of bytes of a transmitted packet a driver can push directly to the
     underlying device

   - Add basic LED support for switch/phy

   - Add NAPI documentation, stop relaying on external links

   - Convert dsa_master_ioctl() to netdev notifier. This is a
     preparatory work to make the hardware timestamping layer selectable
     by user space

   - Add transceiver support and improve the error messages for CAN-FD
     controllers

  New hardware / drivers:

   - Ethernet:
      - AMD/Pensando core device support
      - MediaTek MT7981 SoC
      - MediaTek MT7988 SoC
      - Broadcom BCM53134 embedded switch
      - Texas Instruments CPSW9G ethernet switch
      - Qualcomm EMAC3 DWMAC ethernet
      - StarFive JH7110 SoC
      - NXP CBTX ethernet PHY

   - WiFi:
      - Apple M1 Pro/Max devices
      - RealTek rtl8710bu/rtl8188gu
      - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset

   - Bluetooth:
      - Realtek RTL8821CS, RTL8851B, RTL8852BS
      - Mediatek MT7663, MT7922
      - NXP w8997
      - Actions Semi ATS2851
      - QTI WCN6855
      - Marvell 88W8997

   - Can:
      - STMicroelectronics bxcan stm32f429

  Drivers:

   - Ethernet NICs:
      - Intel (1G, icg):
         - add tracking and reporting of QBV config errors
         - add support for configuring max SDU for each Tx queue
      - Intel (100G, ice):
         - refactor mailbox overflow detection to support Scalable IOV
         - GNSS interface optimization
      - Intel (i40e):
         - support XDP multi-buffer
      - nVidia/Mellanox:
         - add the support for linux bridge multicast offload
         - enable TC offload for egress and engress MACVLAN over bond
         - add support for VxLAN GBP encap/decap flows offload
         - extend packet offload to fully support libreswan
         - support tunnel mode in mlx5 IPsec packet offload
         - extend XDP multi-buffer support
         - support MACsec VLAN offload
         - add support for dynamic msix vectors allocation
         - drop RX page_cache and fully use page_pool
         - implement thermal zone to report NIC temperature
      - Netronome/Corigine:
         - add support for multi-zone conntrack offload
      - Solarflare/Xilinx:
         - support offloading TC VLAN push/pop actions to the MAE
         - support TC decap rules
         - support unicast PTP

   - Other NICs:
      - Broadcom (bnxt): enforce software based freq adjustments only on
        shared PHC NIC
      - RealTek (r8169): refactor to addess ASPM issues during NAPI poll
      - Micrel (lan8841): add support for PTP_PF_PEROUT
      - Cadence (macb): enable PTP unicast
      - Engleder (tsnep): add XDP socket zero-copy support
      - virtio-net: implement exact header length guest feature
      - veth: add page_pool support for page recycling
      - vxlan: add MDB data path support
      - gve: add XDP support for GQI-QPL format
      - geneve: accept every ethertype
      - macvlan: allow some packets to bypass broadcast queue
      - mana: add support for jumbo frame

   - Ethernet high-speed switches:
      - Microchip (sparx5): Add support for TC flower templates

   - Ethernet embedded switches:
      - Broadcom (b54):
         - configure 6318 and 63268 RGMII ports
      - Marvell (mv88e6xxx):
         - faster C45 bus scan
      - Microchip:
         - lan966x:
            - add support for IS1 VCAP
            - better TX/RX from/to CPU performances
         - ksz9477: add ETS Qdisc support
         - ksz8: enhance static MAC table operations and error handling
         - sama7g5: add PTP capability
      - NXP (ocelot):
         - add support for external ports
         - add support for preemptible traffic classes
      - Texas Instruments:
         - add CPSWxG SGMII support for J7200 and J721E

   - Intel WiFi (iwlwifi):
      - preparation for Wi-Fi 7 EHT and multi-link support
      - EHT (Wi-Fi 7) sniffer support
      - hardware timestamping support for some devices/firwmares
      - TX beacon protection on newer hardware

   - Qualcomm 802.11ax WiFi (ath11k):
      - MU-MIMO parameters support
      - ack signal support for management packets

   - RealTek WiFi (rtw88):
      - SDIO bus support
      - better support for some SDIO devices (e.g. MAC address from
        efuse)

   - RealTek WiFi (rtw89):
      - HW scan support for 8852b
      - better support for 6 GHz scanning
      - support for various newer firmware APIs
      - framework firmware backwards compatibility

   - MediaTek WiFi (mt76):
      - P2P support
      - mesh A-MSDU support
      - EHT (Wi-Fi 7) support
      - coredump support"

* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
  net: phy: hide the PHYLIB_LEDS knob
  net: phy: marvell-88x2222: remove unnecessary (void*) conversions
  tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
  net: amd: Fix link leak when verifying config failed
  net: phy: marvell: Fix inconsistent indenting in led_blink_set
  lan966x: Don't use xdp_frame when action is XDP_TX
  tsnep: Add XDP socket zero-copy TX support
  tsnep: Add XDP socket zero-copy RX support
  tsnep: Move skb receive action to separate function
  tsnep: Add functions for queue enable/disable
  tsnep: Rework TX/RX queue initialization
  tsnep: Replace modulo operation with mask
  net: phy: dp83867: Add led_brightness_set support
  net: phy: Fix reading LED reg property
  drivers: nfc: nfcsim: remove return value check of `dev_dir`
  net: phy: dp83867: Remove unnecessary (void*) conversions
  net: ethtool: coalesce: try to make user settings stick twice
  net: mana: Check if netdev/napi_alloc_frag returns single page
  net: mana: Rename mana_refill_rxoob and remove some empty lines
  net: veth: add page_pool stats
  ...
2023-04-26 16:07:23 -07:00
Linus Torvalds c23f28975a Commit volume in documentation is relatively low this time, but there is
still a fair amount going on, including:
 
 - Reorganizing the architecture-specific documentation under
   Documentation/arch.  This makes the structure match the source directory
   and helps to clean up the mess that is the top-level Documentation
   directory a bit.  This work creates the new directory and moves x86 and
   most of the less-active architectures there.  The current plan is to move
   the rest of the architectures in 6.5, with the patches going through the
   appropriate subsystem trees.
 
 - Some more Spanish translations and maintenance of the Italian
   translation.
 
 - A new "Kernel contribution maturity model" document from Ted.
 
 - A new tutorial on quickly building a trimmed kernel from Thorsten.
 
 Plus the usual set of updates and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmRGze0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y/VsH/RyWqinorRVFZmHqRJMRhR0j7hE2pAgK5prE
 dGXYVtHHNQ+25thNaqhZTOLYFbSX6ii2NG7sLRXmyOTGIZrhUCFFXCHkuq4ZUypR
 gJpMUiKQVT4dhln3gIZ0k09NSr60gz8UTcq895N9UFpUdY1SCDhbCcLc4uXTRajq
 NrdgFaHWRkPb+gBRbXOExYm75DmCC6Ny5AyGo2rXfItV//ETjWIJVQpJhlxKrpMZ
 3LgpdYSLhEFFnFGnXJ+EAPJ7gXDi2Tg5DuPbkvJyFOTouF3j4h8lSS9l+refMljN
 xNRessv+boge/JAQidS6u8F2m2ESSqSxisv/0irgtKIMJwXaoX4=
 =1//8
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.4' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "Commit volume in documentation is relatively low this time, but there
  is still a fair amount going on, including:

   - Reorganize the architecture-specific documentation under
     Documentation/arch

     This makes the structure match the source directory and helps to
     clean up the mess that is the top-level Documentation directory a
     bit. This work creates the new directory and moves x86 and most of
     the less-active architectures there.

     The current plan is to move the rest of the architectures in 6.5,
     with the patches going through the appropriate subsystem trees.

   - Some more Spanish translations and maintenance of the Italian
     translation

   - A new "Kernel contribution maturity model" document from Ted

   - A new tutorial on quickly building a trimmed kernel from Thorsten

  Plus the usual set of updates and fixes"

* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
  media: Adjust column width for pdfdocs
  media: Fix building pdfdocs
  docs: clk: add documentation to log which clocks have been disabled
  docs: trace: Fix typo in ftrace.rst
  Documentation/process: always CC responsible lists
  docs: kmemleak: adjust to config renaming
  ELF: document some de-facto PT_* ABI quirks
  Documentation: arm: remove stih415/stih416 related entries
  docs: turn off "smart quotes" in the HTML build
  Documentation: firmware: Clarify firmware path usage
  docs/mm: Physical Memory: Fix grammar
  Documentation: Add document for false sharing
  dma-api-howto: typo fix
  docs: move m68k architecture documentation under Documentation/arch/
  docs: move parisc documentation under Documentation/arch/
  docs: move ia64 architecture docs under Documentation/arch/
  docs: Move arc architecture docs under Documentation/arch/
  docs: move nios2 documentation under Documentation/arch/
  docs: move openrisc documentation under Documentation/arch/
  docs: move superh documentation under Documentation/arch/
  ...
2023-04-24 12:35:49 -07:00
Linus Torvalds 1a0beef98b Two major features are included into this pull request. The links for
the landed patch sets are below.
 
 The .machine keyring, used for Machine Owner Keys (MOK), acquired the
 ability to store only CA enforced keys, and put rest to the .platform
 keyring, thus separating the code signing keys from the keys that are
 used to sign certificates. This essentially unlocks the use of the
 .machine keyring as a trust anchor for IMA. It is an opt-in feature,
 meaning that the additional contraints won't brick anyone who does not
 care about them.
 
 The 2nd feature is the enablement of interrupt based transactions with
 discrete TPM chips (tpm_tis). There was code for this existing but it
 never really worked so I consider this a new feature rather than a bug
 fix. Before the driver just falled back to the polling mode.
 
 Link: https://lore.kernel.org/linux-integrity/a93b6222-edda-d43c-f010-a59701f2aeef@gmx.de/
 Link: https://lore.kernel.org/linux-integrity/20230302164652.83571-1-eric.snowberg@oracle.com/
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCZEaRpxIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9L7OwD+PBjZzXLYCMy0WK++XaVwf2ATZmoRVEKR
 FJn5hCKL0WoBAJn6rBrty7xd+5OHmoO5YddyX7UreBR1L7Zy2U5mGGcJ
 =AIR1
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:

 - The .machine keyring, used for Machine Owner Keys (MOK), acquired the
   ability to store only CA enforced keys, and put rest to the .platform
   keyring, thus separating the code signing keys from the keys that are
   used to sign certificates.

   This essentially unlocks the use of the .machine keyring as a trust
   anchor for IMA. It is an opt-in feature, meaning that the additional
   contraints won't brick anyone who does not care about them.

 - Enable interrupt based transactions with discrete TPM chips (tpm_tis).

   There was code for this existing but it never really worked so I
   consider this a new feature rather than a bug fix. Before the driver
   just fell back to the polling mode.

Link: https://lore.kernel.org/linux-integrity/a93b6222-edda-d43c-f010-a59701f2aeef@gmx.de/
Link: https://lore.kernel.org/linux-integrity/20230302164652.83571-1-eric.snowberg@oracle.com/

* tag 'tpmdd-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (29 commits)
  tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site
  tpm_tis: fix stall after iowrite*()s
  tpm/tpm_tis_synquacer: Convert to platform remove callback returning void
  tpm/tpm_tis: Convert to platform remove callback returning void
  tpm/tpm_ftpm_tee: Convert to platform remove callback returning void
  tpm: tpm_tis_spi: Mark ACPI and OF related data as maybe unused
  tpm: st33zp24: Mark ACPI and OF related data as maybe unused
  tpm, tpm_tis: Enable interrupt test
  tpm, tpm_tis: startup chip before testing for interrupts
  tpm, tpm_tis: Claim locality when interrupts are reenabled on resume
  tpm, tpm_tis: Claim locality in interrupt handler
  tpm, tpm_tis: Request threaded interrupt handler
  tpm, tpm: Implement usage counter for locality
  tpm, tpm_tis: do not check for the active locality in interrupt handler
  tpm, tpm_tis: Move interrupt mask checks into own function
  tpm, tpm_tis: Only handle supported interrupts
  tpm, tpm_tis: Claim locality before writing interrupt registers
  tpm, tpm_tis: Do not skip reset of original interrupt vector
  tpm, tpm_tis: Disable interrupts if tpm_tis_probe_irq() failed
  tpm, tpm_tis: Claim locality before writing TPM_INT_ENABLE register
  ...
2023-04-24 11:40:26 -07:00
Linus Torvalds dc7e22a368 Smack updates for v6.4
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmRGv+4XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBE4QxAAkHiCueaplFsGvYhtx6aeajNC
 0ScA84efBMhQJ/jP4FsTh893bGUkbDv+dyasAVOoAdfFPfgpecEOELzhOaaXv5l2
 8pZ1CtTPXU9h5Csg7D6idII/EyzBUkKDCLbrZexT6A6ZEl0xTqY6Pz6/3uee/W4J
 Z/84U1lX/GgI/SzV6JFcO0XYDj2yp7cfdwIzPUHRky5HgPgLm3roB+eZQwONHfYl
 qYX5xAYCxMx6Uqx3kFb+wgXEJ71lFQGBd7zAZsinGqlrH0vIA63BqpxcHPhYTJNl
 9Y/t6Mb9ds2C1CCGhQTPn/m4hcqYcA5oLuhGWNhOeXMX450XBQ4v7nRw45Dkb1Sa
 IPwJTPfuH2I5r5dOW8cGVCrDp5OT+XQJ5OrsIBtdrPxPGX8x6XyaC4DLG3mympC6
 UfBxdP60Jtm/PRuLCX3tX92zzXhFuqt63Gw87b6htlgEPpirJlhZaEiCYKGlshS1
 b6+kMn1snCxqbBvE/jI3FKHp/C8F/lKNnuVRid9J6HkoyABubWMZ3UIAY+SkVw6b
 9BuF8dn+S/HOqPiijDDnwjnnhHFJQg3F8XRCmNP9MsDqfajcwWHs9ik0NLSMfD50
 CXpp3WIZDVGllmNSeYgkkZKuYV+yNbydLU+DaMfWEkOS7euRoaDozShVJdBTRfnV
 7PYZ3V4KhWkNCWXWfbw=
 =Ynnl
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "There are two changes, one small and one more substantial:

   - Remove of an unnecessary cast

   - The mount option processing introduced with the mount rework makes
     copies of mount option values. There is no good reason to make
     copies of Smack labels, as they are maintained on a list and never
     removed.

     The code now uses pointers to entries on the list, reducing
     processing time and memory use"

* tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next:
  Smack: Improve mount process memory use
  smack_lsm: remove unnecessary type casting
2023-04-24 11:37:24 -07:00
Linus Torvalds 5af4b523ba One cleanup patch from Vlastimil Babka.
Vlastimil Babka (1):
   tomoyo: replace tomoyo_round2() with kmalloc_size_roundup()
 
  security/tomoyo/audit.c  |    6 +++---
  security/tomoyo/common.c |    2 +-
  security/tomoyo/common.h |   44 --------------------------------------------
  3 files changed, 4 insertions(+), 48 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJkRlGdAAoJEEJfEo0MZPUqwsoP/24HXdPf4TGUSPAYMjzUaaKo
 49KW7AlXlXQfiPEm2kjIsQbWHRi0cXEw7qFsw2B+V/gEBx22IVCDu6l+7im3TsZS
 CzBK3Crf8F/6FfUcSeL8WCR815iJvmPkGGwwNZlPh8o6KQLvnyZxpafjJXgHl43a
 JRux9iuKMSW76GFvuRZZjjqwzoMffEa4F4QA68d12P4JaOI2qBAUetZJo3x9fkUo
 H5UAYo5F3tKojYXKk0Hfap5J9KKGDhN9XMSePrNquoTYaxV9l12SYjFmb11mRDnU
 38XQtL+ZBqOoSPhtTMrIPNvnOwgnHMb+7C7HFspK89fKtXWU1H0BXJQSQXhuaztt
 enXHL2ORQb1UkDKZCF0SWeKYhF6cVtX34eJOJVm25sK08VeANqZgEUnjopZoGQJo
 0gx9OKlz+eixEFLtMvxBqI/J5RYyue6BwKgTI4L6ZXWL6KxUWxU6Pf6r7eZ5OE6K
 9AARggx3FfEI9Gs1+HdOEvxOqwJpUuvM7aA7bk08Bd8lU6HuF8E3bUX281yWDMwR
 TE5BqmOQCygyDMmahq+sZVE4mLIZ3xLLv9VuSHhcPprr17RAIjMzVRrdXHCM6Zus
 Z9L6jvJcYPKkjTnRETc4r1QWB0Myo27Ay7YHeShmHL/6NQV9WlCXhKm8gQq6Z1mO
 QyVvqLAEl3NYBaRdzIPt
 =I6oo
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20230424' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo update from Tetsuo Handa:
 "One cleanup patch from Vlastimil Babka"

* tag 'tomoyo-pr-20230424' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: replace tomoyo_round2() with kmalloc_size_roundup()
2023-04-24 11:33:07 -07:00