Add a test suite to test XDP bonding implementation over a pair of
veth devices.
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210731055738.16820-8-joamaki@gmail.com
The program type cannot be deduced from 'tx' which causes an invalid
argument error when trying to load xdp_tx.o using the skeleton.
Rename the section name to "xdp" so that libbpf can deduce the type.
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210731055738.16820-7-joamaki@gmail.com
We need the fixes in here as well, and resolves some merge issues with
the mhi codebase.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BPF programs for reference_tracking selftest use "fail_" prefix to notify that
they are expected to fail. This is really confusing and inconvenient when
trying to grep through test_progs output to find *actually* failed tests. So
rename the prefix from "fail_" to "err_".
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210805230734.437914-1-andrii@kernel.org
Currently, this test is incorrectly printing the destination port in
place of the destination IP.
Fixes: 2767c97765 ("selftests/bpf: Implement sample tcp/tcp6 bpf_iter programs")
Signed-off-by: Jose Blanquicet <josebl@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210805164044.527903-1-josebl@microsoft.com
Build failure in drivers/net/wwan/mhi_wwan_mbim.c:
add missing parameter (0, assuming we don't want buffer pre-alloc).
Conflict in drivers/net/dsa/sja1105/sja1105_main.c between:
589918df93 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
0fac6aa098 ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode")
Follow the instructions from the commit message of the former commit
- removed the if conditions. When looking at commit 589918df93 ("net:
dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
note that the mask_iotag fields get removed by the following patch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- sched: taprio: fix init procedure to avoid inf loop when dumping
- sctp: move the active_key update after sh_keys is added
Current release - new code bugs:
- sparx5: fix build with old GCC & bitmask on 32-bit targets
Previous releases - regressions:
- xfrm: redo the PREEMPT_RT RCU vs hash_resize_mutex deadlock fix
- xfrm: fixes for the compat netlink attribute translator
- phy: micrel: Fix detection of ksz87xx switch
Previous releases - always broken:
- gro: set inner transport header offset in tcp/udp GRO hook to avoid
crashes when such packets reach GSO
- vsock: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST, as required by spec
- dsa: sja1105: fix static FDB entries on SJA1105P/Q/R/S and SJA1110
- bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
- usb: lan78xx: don't modify phy_device state concurrently
- usb: pegasus: check for errors of IO routines
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEL/jwACgkQMUZtbf5S
IrsxmQ//Qbq6TluxzmFnWCjOGWo009GZUZ/PdlnhxnHRAk8BzzMM7209xGARTKod
t+bNl8ioDDxxiBlp57gtoe67nnnd4cwbPCGKpY49al8guvetsQq+0vlg0u3490X+
clY7Uz7G/8thf0JylhqQB1LrMXcNNHqB7ZV5CpM1cC+H/YxeHBv+LQy44S7Vz+LG
btHGQbbsnHrVF6WhohU8tr5AX7MdLaQvQ2aZ1XodEXRd9js4P4CP2Hn/cazZJBOT
rwxaFao2DWs6qaVYBFHtKyU1qvoxQ6Ngex/lMY0QQ9rOX/L+ha+ygzzUoqXjg7DX
jOFUeZIiGHcPe0a10NO8NkPCqn7bOBQ2h/BpJPF9b8VvQKbJAOV8kOdtTbGhMh28
vboensrppqW4qzWpgkoJaVbusvcNWibFspYFyrLjpKxpPmKuLJlli2mkyUbsUiCO
uxMN+IqisWiR379rWLX5tJQp6OIvWeQW3htD5ms7nIHpvL1pbRJnsekepkUjmTx9
DtvowHGpPSG4dPq7EP6LcE/1K0YQFjZQMsJkqTH7J4Qi+pmB2MoQyJzPraktiquT
2Qb/O2yZlng9sYYCs0P73TiVBef5KnIoIJXKvqkrmyN4QjyO+LDevGQyXV06B5VJ
a8duR+yWgPVQn+T7SMKhAOXoqXwSCbJlpXSG7iOFp4dQLCiBVzI=
=/rrJ
-----END PGP SIGNATURE-----
Merge tag 'net-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec.
Current release - regressions:
- sched: taprio: fix init procedure to avoid inf loop when dumping
- sctp: move the active_key update after sh_keys is added
Current release - new code bugs:
- sparx5: fix build with old GCC & bitmask on 32-bit targets
Previous releases - regressions:
- xfrm: redo the PREEMPT_RT RCU vs hash_resize_mutex deadlock fix
- xfrm: fixes for the compat netlink attribute translator
- phy: micrel: Fix detection of ksz87xx switch
Previous releases - always broken:
- gro: set inner transport header offset in tcp/udp GRO hook to avoid
crashes when such packets reach GSO
- vsock: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST, as required by spec
- dsa: sja1105: fix static FDB entries on SJA1105P/Q/R/S and SJA1110
- bridge: validate the NUD_PERMANENT bit when adding an extern_learn
FDB entry
- usb: lan78xx: don't modify phy_device state concurrently
- usb: pegasus: check for errors of IO routines"
* tag 'net-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (48 commits)
net: vxge: fix use-after-free in vxge_device_unregister
net: fec: fix use-after-free in fec_drv_remove
net: pegasus: fix uninit-value in get_interrupt_interval
net: ethernet: ti: am65-cpsw: fix crash in am65_cpsw_port_offload_fwd_mark_update()
bnx2x: fix an error code in bnx2x_nic_load()
net: wwan: iosm: fix recursive lock acquire in unregister
net: wwan: iosm: correct data protocol mask bit
net: wwan: iosm: endianness type correction
net: wwan: iosm: fix lkp buildbot warning
net: usb: lan78xx: don't modify phy_device state concurrently
docs: networking: netdevsim rules
net: usb: pegasus: Remove the changelog and DRIVER_VERSION.
net: usb: pegasus: Check the return value of get_geristers() and friends;
net/prestera: Fix devlink groups leakage in error flow
net: sched: fix lockdep_set_class() typo error for sch->seqlock
net: dsa: qca: ar9331: reorder MDIO write sequence
VSOCK: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST
mptcp: drop unused rcu member in mptcp_pm_addr_entry
net: ipv6: fix returned variable type in ip6_skb_dst_mtu
nfp: update ethtool reporting of pauseframe control
...
now obeys KVM_CAP_HYPERV_ENFORCE_CPUID. Both the XMM arguments feature
and KVM_CAP_HYPERV_ENFORCE_CPUID are new in 5.14, and each did not know
of the other.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmELlMoUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOIigf+O1JWYyfVR16qpVnkI9voa0j7aPKd
0v3Qd0ybEbBqOl2QjYFe+aJa42d8bhiMGRG7UNNxYPmW6MukxM3rWZ0wd8IvlktW
KMR/+2jPw8v8B+M44ty9y30cdW65EGxY8PXPrzSXSF9wT5v0fV8s14Yzxkc1edKM
X3OVJk1AhlRaGN9PyGl3MJxdvQG7Ci9Ep500i5vhAsdb2Azu9TzfQRVUUJ8c2BWj
PEI31Z+E0//G0oU/MiHZ5bsYCboiciiHHJxpdFpxVYN7rQ4sOxJswEsO+PpD00K3
HsZXuTXCRQyZfjbI5QlwNWpU6mZrAb3T8GVNBOP+0g3+fDNyrZzJBiCzbQ==
=bI5h
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Mostly bugfixes; plus, support for XMM arguments to Hyper-V hypercalls
now obeys KVM_CAP_HYPERV_ENFORCE_CPUID.
Both the XMM arguments feature and KVM_CAP_HYPERV_ENFORCE_CPUID are
new in 5.14, and each did not know of the other"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: Fix per-cpu counter corruption on 32-bit builds
KVM: selftests: fix hyperv_clock test
KVM: SVM: improve the code readability for ASID management
KVM: SVM: Fix off-by-one indexing when nullifying last used SEV VMCB
KVM: Do not leak memory for duplicate debugfs directories
KVM: selftests: Test access to XMM fast hypercalls
KVM: x86: hyper-v: Check if guest is allowed to use XMM registers for hypercall input
KVM: x86: Introduce trace_kvm_hv_hypercall_done()
KVM: x86: hyper-v: Check access to hypercall before reading XMM registers
KVM: x86: accept userspace interrupt only if no event is injected
To verify that this hash implements the Toeplitz hash function.
Additionally, provide a script toeplitz.sh to run the test in loopback mode
on a networking device of choice (see setup_loopback.sh). Since the
script modifies the NIC setup, it will not be run by selftests
automatically.
Tested:
./toeplitz.sh -i eth0 -irq_prefix <eth0_pattern> -t -6
carrier ready
rxq 0: cpu 14
rxq 1: cpu 20
rxq 2: cpu 17
rxq 3: cpu 23
cpu 14: rx_hash 0x69103ebc [saddr fda8::2 daddr fda8::1 sport 58938 dport 8000] OK rxq 0 (cpu 14)
...
cpu 20: rx_hash 0x257118b9 [saddr fda8::2 daddr fda8::1 sport 59258 dport 8000] OK rxq 1 (cpu 20)
count: pass=111 nohash=0 fail=0
Test Succeeded!
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement a GRO testsuite that expects Linux kernel GRO behavior.
All tests pass with the kernel software GRO stack. Run against a device
with hardware GRO to verify that it matches the software stack.
gro.c generates packets and sends them out through a packet socket. The
receiver in gro.c (run separately) receives the packets on a packet
socket, filters them by destination ports using BPF and checks the
packet geometry to see whether GRO was applied.
gro.sh provides a wrapper to run the gro.c in NIC loopback mode.
It is not included in continuous testing because it modifies network
configuration around a physical NIC: gro.sh sets the NIC in loopback
mode, creates macvlan devices on the physical device in separate
namespaces, and sends traffic generated by gro.c between the two
namespaces to observe coalescing behavior.
GRO coalescing is time sensitive.
Some tests may prove flaky on some hardware.
Note that this test suite tests for software GRO unless hardware GRO is
enabled (ethtool -K $DEV rx-gro-hw on).
To test, run ./gro.sh.
The wrapper will output success or failed test names, and generate
log.txt and stderr.
Sample log.txt result:
...
pure data packet of same size: Test succeeded
large data packets followed by a smaller one: Test succeeded
small data packets followed by a larger one: Test succeeded
...
Sample stderr result:
...
carrier ready
running test ipv4 data
Expected {200 }, Total 1 packets
Received {200 }, Total 1 packets.
...
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rewrite to skel and ASSERT macros as well while we are at it.
v3:
- replace -f with -A to make it work with busybox ping.
-A is available on both busybox and iputils, from the man page:
On networks with low RTT this mode is essentially equivalent to
flood mode.
v2:
- don't check result of bpf_map__fd (Yonghong Song)
- remove from .gitignore (Andrii Nakryiko)
- move ping_command into network_helpers (Andrii Nakryiko)
- remove assert() (Andrii Nakryiko)
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210804205524.3748709-1-sdf@google.com
The test was mistakenly using addr_gpa2hva on a gva and that happened
to work accidentally. Commit 106a2e766e ("KVM: selftests: Lower the
min virtual address for misc page allocations") revealed this bug.
Fixes: 2c7f76b4c4 ("selftests: kvm: Add basic Hyper-V clocksources tests", 2021-03-18)
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210804112057.409498-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recently we added multi-queue support to netdevsim in commit d4861fc6be
("netdevsim: Add multi-queue support"); add a few control-plane selftests
for sch_mq using this new feature.
Use nsPlugin.py to avoid network interface name collisions.
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2021-08-04
1) Fix a sysbot reported memory leak in xfrm_user_rcv_msg.
From Pavel Skripkin.
2) Revert "xfrm: policy: Read seqcount outside of rcu-read side
in xfrm_policy_lookup_bytype". This commit tried to fix a
lockin bug, but only cured some of the symptoms. A proper
fix is applied on top of this revert.
3) Fix a locking bug on xfrm state hash resize. A recent change
on sequence counters accidentally repaced a spinlock by a mutex.
Fix from Frederic Weisbecker.
4) Fix possible user-memory-access in xfrm_user_rcv_msg_compat().
From Dmitry Safonov.
5) Add initialiation sefltest fot xfrm_spdattr_type_t.
From Dmitry Safonov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds OOB support for AF_UNIX sockets.
The semantics is same as TCP.
The last byte of a message with the OOB flag is
treated as the OOB byte. The byte is separated into
a skb and a pointer to the skb is stored in unix_sock.
The pointer is used to enforce OOB semantics.
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Write down some ideas for additional coverage for floating point in case
someone feels inspired to look into them.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20210803140450.46624-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We provide interfaces for configuring the SVE vector length seen by
processes using prctl and also via /proc for configuring the default
values. Provide tests that exercise all these interfaces and verify that
they take effect as expected, though at present no test fully enumerates
all the possible vector lengths.
A subset of this is already tested via sve-probe-vls but the /proc
interfaces are not currently covered at all.
In preparation for the forthcoming support for SME, the Scalable Matrix
Extension, which has separately but similarly configured vector lengths
which we expect to offer similar userspace interfaces for, all the actual
files and prctls used are parameterised and we don't validate that the
architectural minimum vector length is the minimum we see.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20210803140450.46624-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently sve-probe-vls does not verify that the vector lengths reported
by the prctl() interface are actually what is reported by the architecture,
use the rdvl_sve() helper to validate this.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20210803140450.46624-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
SVE provides an instruction RDVL which reports the currently configured
vector length. In order to validate that our vector length configuration
interfaces are working correctly without having to build the C code for
our test programs with SVE enabled provide a trivial assembly library
with a C callable function that executes RDVL. Since these interfaces
also control behaviour on exec*() provide a trivial wrapper program which
reports the currently configured vector length on stdout, tests can use
this to verify that behaviour on exec*() is as expected.
In preparation for providing similar helper functionality for SME, the
Scalable Matrix Extension, which allows separately configured vector
lengths to be read back both the assembler function and wrapper binary
have SVE included in their name.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20210803140450.46624-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Check that #UD is raised if bit 16 is clear in
HYPERV_CPUID_FEATURES.EDX and an 'XMM fast' hypercall is issued.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Siddharth Chandrasekaran <sidcha@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210730122625.112848-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch removed the 'raw gso min size - 1' test which
always fails now:
./in_netns.sh ./psock_snd -v -c -g -l "${mss}"
raw gso min size - 1 (expected to fail)
tx: 1524
rx: 1472
OK
After commit 7c6d2ecbda ("net: be more gentle about silly
gso requests coming from user"), we relaxed the min gso_size
check in virtio_net_hdr_to_skb().
So when a packet which is smaller then the gso_size,
GSO for this packet will not be set, the packet will be
send/recv successfully.
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko says:
====================
bpf-next 2021-07-30
We've added 64 non-merge commits during the last 15 day(s) which contain
a total of 83 files changed, 5027 insertions(+), 1808 deletions(-).
The main changes are:
1) BTF-guided binary data dumping libbpf API, from Alan.
2) Internal factoring out of libbpf CO-RE relocation logic, from Alexei.
3) Ambient BPF run context and cgroup storage cleanup, from Andrii.
4) Few small API additions for libbpf 1.0 effort, from Evgeniy and Hengqi.
5) bpf_program__attach_kprobe_opts() fixes in libbpf, from Jiri.
6) bpf_{get,set}sockopt() support in BPF iterators, from Martin.
7) BPF map pinning improvements in libbpf, from Martynas.
8) Improved module BTF support in libbpf and bpftool, from Quentin.
9) Bpftool cleanups and documentation improvements, from Quentin.
10) Libbpf improvements for supporting CO-RE on old kernels, from Shuyi.
11) Increased maximum cgroup storage size, from Stanislav.
12) Small fixes and improvements to BPF tests and samples, from various folks.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits)
tools: bpftool: Complete metrics list in "bpftool prog profile" doc
tools: bpftool: Document and add bash completion for -L, -B options
selftests/bpf: Update bpftool's consistency script for checking options
tools: bpftool: Update and synchronise option list in doc and help msg
tools: bpftool: Complete and synchronise attach or map types
selftests/bpf: Check consistency between bpftool source, doc, completion
tools: bpftool: Slightly ease bash completion updates
unix_bpf: Fix a potential deadlock in unix_dgram_bpf_recvmsg()
libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
tools: bpftool: Support dumping split BTF by id
libbpf: Add split BTF support for btf__load_from_kernel_by_id()
tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
tools: Free BTF objects at various locations
libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
libbpf: Rename btf__load() as btf__load_into_kernel()
libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
tools/resolve_btfids: Emit warnings and patch zero id for missing symbols
bpf: Increase supported cgroup storage value size
libbpf: Fix race when pinning maps in parallel
...
====================
Link: https://lore.kernel.org/r/20210730225606.1897330-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Q1 and Q2 are numbers with *maximum* length of 384 bytes. If the
calculated length of Q1 and Q2 is less than 384 bytes, things will
go wrong.
E.g. if Q2 is 383 bytes, then
1. The bytes of q2 are copied to sigstruct->q2 in calc_q1q2().
2. The entire sigstruct->q2 is reversed, which results it being
256 * Q2, given that the last byte of sigstruct->q2 is added
to before the bytes given by calc_q1q2().
Either change in key or measurement can trigger the bug. E.g. an
unmeasured heap could cause a devastating change in Q1 or Q2.
Reverse exactly the bytes of Q1 and Q2 in calc_q1q2() before returning
to the caller.
Fixes: 2adcba79e6 ("selftests/x86: Add a selftest for SGX")
Link: https://lore.kernel.org/linux-sgx/20210301051836.30738-1-tianjia.zhang@linux.alibaba.com/
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
and netfilter trees.
Current release - regressions:
- mac80211: fix starting aggregation sessions on mesh interfaces
Current release - new code bugs:
- sctp: send pmtu probe only if packet loss in Search Complete state
- bnxt_en: add missing periodic PHC overflow check
- devlink: fix phys_port_name of virtual port and merge error
- hns3: change the method of obtaining default ptp cycle
- can: mcba_usb_start(): add missing urb->transfer_dma initialization
Previous releases - regressions:
- set true network header for ECN decapsulation
- mlx5e: RX, avoid possible data corruption w/ relaxed ordering and LRO
- phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811 PHY
- sctp: fix return value check in __sctp_rcv_asconf_lookup
Previous releases - always broken:
- bpf:
- more spectre corner case fixes, introduce a BPF nospec
instruction for mitigating Spectre v4
- fix OOB read when printing XDP link fdinfo
- sockmap: fix cleanup related races
- mac80211: fix enabling 4-address mode on a sta vif after assoc
- can:
- raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
- j1939: j1939_session_deactivate(): clarify lifetime of
session object, avoid UAF
- fix number of identical memory leaks in USB drivers
- tipc:
- do not blindly write skb_shinfo frags when doing decryption
- fix sleeping in tipc accept routine
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEEWm8ACgkQMUZtbf5S
Irv84A//V/nn9VRdpDpmodwBWVEc9SA00M/nmziRBLwRyG+fRMtnePY4Ha40TPbh
LL6orth08hZKOjVmMc6Ea4EjZbV5E3iAKtAnaX6wi1HpEXVxKtFYnWxu9ydwTEd9
An1fltDtWYkNi3kiq7il+Tp1/yZAQ+NYv5zQZCWJ47kkN3jkjULdAEBqODA2A6Ul
0PQgS1rKzXukE19PlXDuaNuEekhTiEfaTwzHjdBJZkj1toGJGfHsvdQ/YJjixzB9
44SjE4PfxIaMWP0BVaD6hwzaVQhaZETXhZZufdIDdQd7sDbmd6CPODX6mXfLEq4u
JaWylgobsK+5ScHE6siVI+ZlW7stq9l1Ynm10ADiwsZVzKEoP745484aEFOLO6Z+
Ln/IqDQCP/yJQmnl2i0+TfqVDh6BKYoIfUUK/+nzHw4Otycy0m3kj4P+74aYfjOv
Q+cUgbXUemcrpq6wGUK+zK0NyNHVILvdPDnHPMMypwqPk18y5ZmFvaJAVUPSavD9
N7t9LoLyGwK3i/Ir4l+JJZ1KgAv1+TbmyNBWvY1Yk/r/vHU3nBPIv26s7YarNAwD
094vJEJ0+mqO4h+Xj1Nc7HEBFi46JfpN2L8uYoM7gpwziIRMdmpXVLmpEk43WmFi
UMwWJWqabPEXaozC2UFcFLSk+jS7DiD+G5eG+Fd5HecmKzd7RI0=
=sKPI
-----END PGP SIGNATURE-----
Merge tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi
(mac80211) and netfilter trees.
Current release - regressions:
- mac80211: fix starting aggregation sessions on mesh interfaces
Current release - new code bugs:
- sctp: send pmtu probe only if packet loss in Search Complete state
- bnxt_en: add missing periodic PHC overflow check
- devlink: fix phys_port_name of virtual port and merge error
- hns3: change the method of obtaining default ptp cycle
- can: mcba_usb_start(): add missing urb->transfer_dma initialization
Previous releases - regressions:
- set true network header for ECN decapsulation
- mlx5e: RX, avoid possible data corruption w/ relaxed ordering and
LRO
- phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811
PHY
- sctp: fix return value check in __sctp_rcv_asconf_lookup
Previous releases - always broken:
- bpf:
- more spectre corner case fixes, introduce a BPF nospec
instruction for mitigating Spectre v4
- fix OOB read when printing XDP link fdinfo
- sockmap: fix cleanup related races
- mac80211: fix enabling 4-address mode on a sta vif after assoc
- can:
- raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
- j1939: j1939_session_deactivate(): clarify lifetime of session
object, avoid UAF
- fix number of identical memory leaks in USB drivers
- tipc:
- do not blindly write skb_shinfo frags when doing decryption
- fix sleeping in tipc accept routine"
* tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
gve: Update MAINTAINERS list
can: esd_usb2: fix memory leak
can: ems_usb: fix memory leak
can: usb_8dev: fix memory leak
can: mcba_usb_start(): add missing urb->transfer_dma initialization
can: hi311x: fix a signedness bug in hi3110_cmd()
MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
bpf: Fix leakage due to insufficient speculative store bypass mitigation
bpf: Introduce BPF nospec instruction for mitigating Spectre v4
sis900: Fix missing pci_disable_device() in probe and remove
net: let flow have same hash in two directions
nfc: nfcsim: fix use after free during module unload
tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
sctp: fix return value check in __sctp_rcv_asconf_lookup
nfc: s3fwrn5: fix undefined parameter values in dev_err()
net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32
net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev()
net/mlx5: Unload device upon firmware fatal error
net/mlx5e: Fix page allocation failure for ptp-RQ over SF
net/mlx5e: Fix page allocation failure for trap-RQ over SF
...
Update the script responsible for checking that the different types used
at various places in bpftool are synchronised, and extend it to check
the consistency of options between the help messages in the source code
and the manual pages.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210730215435.7095-6-quentin@isovalent.com
Whenever the eBPF subsystem gains new elements, such as new program or
map types, it is necessary to update bpftool if we want it able to
handle the new items.
In addition to the main arrays containing the names of these elements in
the source code, there are also multiple locations to update:
- The help message in the do_help() functions in bpftool's source code.
- The RST documentation files.
- The bash completion file.
This has led to omissions multiple times in the past. This patch
attempts to address this issue by adding consistency checks for all
these different locations. It also verifies that the bpf_prog_type,
bpf_map_type and bpf_attach_type enums from the UAPI BPF header have all
their members present in bpftool.
The script requires no argument to run, it reads and parses the
different files to check, and prints the mismatches, if any. It
currently reports a number of missing elements, which will be fixed in a
later patch:
$ ./test_bpftool_synctypes.py
Comparing [...]/linux/tools/bpf/bpftool/map.c (map_type_name) and [...]/linux/tools/bpf/bpftool/bash-completion/bpftool (BPFTOOL_MAP_CREATE_TYPES): {'ringbuf'}
Comparing BPF header (enum bpf_attach_type) and [...]/linux/tools/bpf/bpftool/common.c (attach_type_name): {'BPF_TRACE_ITER', 'BPF_XDP_DEVMAP', 'BPF_XDP', 'BPF_SK_REUSEPORT_SELECT', 'BPF_XDP_CPUMAP', 'BPF_SK_REUSEPORT_SELECT_OR_MIGRATE'}
Comparing [...]/linux/tools/bpf/bpftool/prog.c (attach_type_strings) and [...]/linux/tools/bpf/bpftool/prog.c (do_help() ATTACH_TYPE): {'skb_verdict'}
Comparing [...]/linux/tools/bpf/bpftool/prog.c (attach_type_strings) and [...]/linux/tools/bpf/bpftool/Documentation/bpftool-prog.rst (ATTACH_TYPE): {'skb_verdict'}
Comparing [...]/linux/tools/bpf/bpftool/prog.c (attach_type_strings) and [...]/linux/tools/bpf/bpftool/bash-completion/bpftool (BPFTOOL_PROG_ATTACH_TYPES): {'skb_verdict'}
Note that the script does NOT check for consistency between the list of
program types that bpftool claims it accepts and the actual list of
keywords that can be used. This is because bpftool does not "see" them,
they are ELF section names parsed by libbpf. It is not hard to parse the
section_defs[] array in libbpf, but some section names are associated
with program types that bpftool cannot load at the moment. For example,
some programs require a BTF target and an attach target that bpftool
cannot handle. The script may be extended to parse the array and check
only relevant values in the future.
The script is not added to the selftests' Makefile, because doing so
would require all patches with BPF UAPI change to also update bpftool.
Instead it is to be added to the CI.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210730215435.7095-3-quentin@isovalent.com
Replace the calls to function btf__get_from_id(), which we plan to
deprecate before the library reaches v1.0, with calls to
btf__load_from_kernel_by_id() in tools/ (bpftool, perf, selftests).
Update the surrounding code accordingly (instead of passing a pointer to
the btf struct, get it as a return value from the function).
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-6-quentin@isovalent.com
Make sure to call btf__free() (and not simply free(), which does not
free all pointers stored in the struct) on pointers to struct btf
objects retrieved at various locations.
These were found while updating the calls to btf__get_from_id().
Fixes: 999d82cbc0 ("tools/bpf: enhance test_btf file testing to test func info")
Fixes: 254471e57a ("tools/bpf: bpftool: add support for func types")
Fixes: 7b612e291a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Fixes: d56354dc49 ("perf tools: Save bpf_prog_info and BTF of new BPF programs")
Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Fixes: fa853c4b83 ("perf stat: Enable counting events for BPF programs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-5-quentin@isovalent.com
- Fix MTE shared page detection
- Enable selftest's use of PMU registers when asked to
s390:
- restore 5.13 debugfs names
x86:
- fix sizes for vcpu-id indexed arrays
- fixes for AMD virtualized LAPIC (AVIC)
- other small bugfixes
Generic:
- access tracking performance test
- dirty_log_perf_test command line parsing fix
- Fix selftest use of obsolete pthread_yield() in favour of sched_yield()
- use cpu_relax when halt polling
- fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmECvOwUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjuAf/ZdJx7RKRQxMHG4jHGDtOIQq3qxds
2uJsFZS3MWkphSOJ+mbomdXTOCHvhPbJlr5TXaSxGnasmAAl+mDk2qVT0tH6638m
r6M+fu4X0RYvFz54Qnf96V0/elE6ee8rtteXD8WVKQ/XzE3odk1EOqbe7CBDx7yo
A3SzO8eSBzxamKo22fmE3MR5LVVAcN9wNsCb88XGDTUkTbYl+w597r6zg83rMMlL
gwD4f9+NYX6h88BVVwLUkWotUrD/5rRGpRVVEZk5eZKvFGzpukk15dfv0PA9347O
AOM0i/PgnA+Qw6ZsTetWPjD8eFcXDBurGF1tIkyo4X8VogQG0wFIHxbezQ==
=ZgK/
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix MTE shared page detection
- Enable selftest's use of PMU registers when asked to
s390:
- restore 5.13 debugfs names
x86:
- fix sizes for vcpu-id indexed arrays
- fixes for AMD virtualized LAPIC (AVIC)
- other small bugfixes
Generic:
- access tracking performance test
- dirty_log_perf_test command line parsing fix
- Fix selftest use of obsolete pthread_yield() in favour of
sched_yield()
- use cpu_relax when halt polling
- fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: add missing compat KVM_CLEAR_DIRTY_LOG
KVM: use cpu_relax when halt polling
KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl
KVM: SVM: tweak warning about enabled AVIC on nested entry
KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated
KVM: s390: restore old debugfs names
KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized
KVM: selftests: Introduce access_tracking_perf_test
KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing
x86/kvm: fix vcpu-id indexed array sizes
KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK access
docs: virt: kvm: api.rst: replace some characters
KVM: Documentation: Fix KVM_CAP_ENFORCE_PV_FEATURE_CPUID name
KVM: nSVM: Swap the parameter order for svm_copy_vmrun_state()/svm_copy_vmloadsave_state()
KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state()
KVM: arm64: selftests: get-reg-list: actually enable pmu regs in pmu sublist
KVM: selftests: change pthread_yield to sched_yield
KVM: arm64: Fix detection of shared VMAs on guest fault
Daniel Borkmann says:
====================
pull-request: bpf 2021-07-29
The following pull-request contains BPF updates for your *net* tree.
We've added 9 non-merge commits during the last 14 day(s) which contain
a total of 20 files changed, 446 insertions(+), 138 deletions(-).
The main changes are:
1) Fix UBSAN out-of-bounds splat for showing XDP link fdinfo, from Lorenz Bauer.
2) Fix insufficient Spectre v4 mitigation in BPF runtime, from Daniel Borkmann,
Piotr Krysiuk and Benedict Schlueter.
3) Batch of fixes for BPF sockmap found under stress testing, from John Fastabend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently we added a new option, SKBMOD_F_ECN, to tc-skbmod(8). Add a
control-plane selftest for it.
Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Introduce SKBMOD_F_ECN
option".
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current max cgroup storage value size is 4k (PAGE_SIZE). The other local
storages accept up to 64k (BPF_LOCAL_STORAGE_MAX_VALUE_SIZE). Let's align
max cgroup value size with the other storages.
For percpu, the max is 32k (PCPU_MIN_UNIT_SIZE) because percpu
allocator is not happy about larger values.
netcnt test is extended to exercise those maximum values
(non-percpu max size is close to, but not real max).
v4:
* remove inner union (Andrii Nakryiko)
* keep net_cnt on the stack (Andrii Nakryiko)
v3:
* refine SIZEOF_BPF_LOCAL_STORAGE_ELEM comment (Yonghong Song)
* anonymous struct in percpu_net_cnt & net_cnt (Yonghong Song)
* reorder free (Yonghong Song)
v2:
* cap max_value_size instead of BUILD_BUG_ON (Martin KaFai Lau)
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210727222335.4029096-1-sdf@google.com
This test measures the performance effects of KVM's access tracking.
Access tracking is driven by the MMU notifiers test_young, clear_young,
and clear_flush_young. These notifiers do not have a direct userspace
API, however the clear_young notifier can be triggered by marking a
pages as idle in /sys/kernel/mm/page_idle/bitmap. This test leverages
that mechanism to enable access tracking on guest memory.
To measure performance this test runs a VM with a configurable number of
vCPUs that each touch every page in disjoint regions of memory.
Performance is measured in the time it takes all vCPUs to finish
touching their predefined region.
Example invocation:
$ ./access_tracking_perf_test -v 8
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory offset: 0xffdfffff000
Populating memory : 1.337752570s
Writing to populated memory : 0.010177640s
Reading from populated memory : 0.009548239s
Mark memory idle : 23.973131748s
Writing to idle memory : 0.063584496s
Mark memory idle : 24.924652964s
Reading from idle memory : 0.062042814s
Breaking down the results:
* "Populating memory": The time it takes for all vCPUs to perform the
first write to every page in their region.
* "Writing to populated memory" / "Reading from populated memory": The
time it takes for all vCPUs to write and read to every page in their
region after it has been populated. This serves as a control for the
later results.
* "Mark memory idle": The time it takes for every vCPU to mark every
page in their region as idle through page_idle.
* "Writing to idle memory" / "Reading from idle memory": The time it
takes for all vCPUs to write and read to every page in their region
after it has been marked idle.
This test should be portable across architectures but it is only enabled
for x86_64 since that's all I have tested.
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210713220957.3493520-7-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a missing break statement which causes a fallthrough to the
next statement where optarg will be null and a segmentation fault will
be generated.
Fixes: 9e965bb75a ("KVM: selftests: Add backing src parameter to dirty_log_perf_test")
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210713220957.3493520-6-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It turns out that certain types of early boot bugs can result in reboot
loops, even within a guest OS running under qemu/KVM. This commit
therefore upgrades the kvm-test-1-run-qemu.sh script's hang-detection
heuristics to detect such situations and to terminate the run when
they occur.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kvm-test-1-run-qemu.sh script logs the torture-test start time and
also when it starts getting impatient for the test to finish. However, it
does not timestamp these log messages, which can make debugging needlessly
challenging. This commit therefore adds timestamps to these messages.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
There was a time long ago when the "test" command's documentation
claimed that the "-a" and "-o" arguments did something useful.
But this documentation now suggests letting the shell execute
these boolean operators, so this commit applies that suggestion to
kvm-test-1-run-qemu.sh.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit causes kvm-test-1-run-batch.sh to use the new
kvm-assign-cpus.sh and kvm-get-cpus-script.sh scripts to create a
TORTURE_AFFINITY environment variable containing either an empty string
(for no affinity) or a list of CPUs to pin the scenario's vCPUs to.
The additional change to kvm-test-1-run.sh places the per-scenario
number-of-CPUs information where it can easily be found.
If there is some reason why affinity cannot be supplied, this commit
prints and logs the reason via changes to kvm-again.sh.
Finally, this commit updates the kvm-remote.sh script to copy the
qemu-affinity output files back to the host system.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
There is "qemu-affinity", "qemu-cmd", "qemu-retval", but also "qemu_pid".
This is hard to remember, not so good for bash tab completion, and just
plain inconsistent. This commit therefore renames the "qemu_pid" file to
"qemu-pid". A couple of the scripts must deal with old runs, and thus
must handle both "qemu_pid" and "qemu-pid", but new runs will produce
"qemu-pid".
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The jitter.sh script has some entertaining awk code to generate a
hex mask from a randomly selected CPU number, which is handed to the
"taskset" command. Except that this command has a "-c" parameter to
take a comma/dash-separated list of CPU numbers. This commit therefore
saves a few lines of awk by switching to a single-number CPU list.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
There is no way to place the vCPUs in a two-CPU rcutorture scenario to
get variable memory latency. This commit therefore upgrades the current
two-CPU rcutorture scenarios to four CPUs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit causes the kvm-test-1-run-qemu.sh script to check the
TORTURE_AFFINITY environment variable and to add "taskset" commands to
the qemu-cmd file. The first "taskset" command is applied only if the
TORTURE_AFFINITY environment variable is a non-empty string, and this
command pins the current scenario's guest OS to the specified CPUs.
The second "taskset" command reports the guest OS's affinity in a new
"qemu-affinity" file.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, kvm-test-1-run-qemu.sh applies redirection to each and every
line of each qemu-cmd script. Only the first line (the only one that
is not a bash comment) needs to be redirected. Although redirecting
the comments is currently harmless, just adding to the comment, it is
an accident waiting to happen. This commit therefore adjusts the "sed"
command to redirect only the qemu-system* command itself.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit causes kvm.sh to use the new kvm-assign-cpus.sh and
kvm-get-cpus-script.sh scripts to create a TORTURE_AFFINITY environment
variable containing either an empty string (for no affinity) or a list
of CPUs to pin the scenario's vCPUs to. A later commit will make
use of this information to actually pin the vCPUs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Add a simple selftest for a move_mount(MOVE_MOUNT_SET_GROUP). This tests
that one can copy sharing from one mount from nested mntns with nested
userns owner to another mount from other nested mntns with other nested
userns owner while in their parent userns.
TAP version 13
1..1
# Starting 1 tests from 2 test cases.
# RUN move_mount_set_group.complex_sharing_copying ...
# OK move_mount_set_group.complex_sharing_copying
ok 1 move_mount_set_group.complex_sharing_copying
# PASSED: 1 / 1 tests passed.
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
Link: https://lore.kernel.org/r/20210715100714.120228-2-ptikhomirov@virtuozzo.com
Cc: Shuah Khan <shuah@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Mattias Nissler <mnissler@chromium.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs. This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI
the kernel rejects tagged addresses passed to these APIs, which would
end up causing the test to fail. To make this test compatible with such
system allocators, stop using the system allocator to allocate memory in
anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b8fc241
Fixes: c47174fc36 ("userfaultfd: selftest")
Co-developed-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alistair Delva <adelva@google.com>
Cc: William McVicker <willmcvicker@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: <stable@vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a list of vmtest script dependencies to make it easier for new
contributors to get going.
Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210723223645.907802-1-evgeniyl@fb.com
This patch adds tests for the batching and bpf_(get|set)sockopt in
bpf tcp iter.
It first creates:
a) 1 non SO_REUSEPORT listener in lhash2.
b) 256 passive and active fds connected to the listener in (a).
c) 256 SO_REUSEPORT listeners in one of the lhash2 bucket.
The test sets all listeners and connections to bpf_cubic before
running the bpf iter.
The bpf iter then calls setsockopt(TCP_CONGESTION) to switch
each listener and connection from bpf_cubic to bpf_dctcp.
The bpf iter has a random_retry mode such that it can return EAGAIN
to the usespace in the middle of a batch.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210701200625.1036874-1-kafai@fb.com
Previously, the newly introduced test case in test_map_in_map(), which
checks whether the inner map is destroyed after unsuccessful creation of
the outer map, logged the following harmless and expected error:
libbpf: map 'mim': failed to create: Invalid argument(-22) libbpf:
failed to load object './test_map_in_map_invalid.o'
To avoid any possible confusion, mute the logging during loading of the
prog.
Fixes: 08f71a1e39 ("selftests/bpf: Check inner map deletion")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721140941.563175-1-m@lambda.lt
Pull networking fixes from David Miller:
1) Fix type of bind option flag in af_xdp, from Baruch Siach.
2) Fix use after free in bpf_xdp_link_release(), from Xuan Zhao.
3) PM refcnt imbakance in r8152, from Takashi Iwai.
4) Sign extension ug in liquidio, from Colin Ian King.
5) Mising range check in s390 bpf jit, from Colin Ian King.
6) Uninit value in caif_seqpkt_sendmsg(), from Ziyong Xuan.
7) Fix skb page recycling race, from Ilias Apalodimas.
8) Fix memory leak in tcindex_partial_destroy_work, from Pave Skripkin.
9) netrom timer sk refcnt issues, from Nguyen Dinh Phi.
10) Fix data races aroun tcp's tfo_active_disable_stamp, from Eric
Dumazet.
11) act_skbmod should only operate on ethernet packets, from Peilin Ye.
12) Fix slab out-of-bpunds in fib6_nh_flush_exceptions(),, from Psolo
Abeni.
13) Fix sparx5 dependencies, from Yajun Deng.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits)
dpaa2-switch: seed the buffer pool after allocating the swp
net: sched: cls_api: Fix the the wrong parameter
net: sparx5: fix unmet dependencies warning
net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum
net: dsa: ensure linearized SKBs in case of tail taggers
ravb: Remove extra TAB
ravb: Fix a typo in comment
net: dsa: sja1105: make VID 4095 a bridge VLAN too
tcp: disable TFO blackhole logic by default
sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set
net: ixp46x: fix ptp build failure
ibmvnic: Remove the proper scrq flush
selftests: net: add ESP-in-UDP PMTU test
udp: check encap socket in __udp_lib_err
sctp: update active_key for asoc when old key is being replaced
r8169: Avoid duplicate sysfs entry creation error
ixgbe: Fix packet corruption due to missing DMA sync
Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()"
ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions
fsl/fman: Add fibre support
...
The case of ESP in UDP encapsulation was not covered before. Add
cases of local changes of MTU and difference on routed path.
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
This test evaluates the IOAM insertion for IPv6 by checking the IOAM data
integrity on the receiver.
The topology is formed by 3 nodes: Alpha (sender), Beta (router in-between)
and Gamma (receiver). An IOAM domain is configured from Alpha to Gamma only,
which means not on the reverse path. When Gamma is the destination, Alpha
adds an IOAM option (Pre-allocated Trace) inside a Hop-by-hop and fills the
trace with its own IOAM data. Beta and Gamma also fill the trace. The IOAM
data integrity is checked on Gamma, by comparing with the pre-defined IOAM
configuration (see below).
+-------------------+ +-------------------+
| | | |
| alpha netns | | gamma netns |
| | | |
| +-------------+ | | +-------------+ |
| | veth0 | | | | veth0 | |
| | db01::2/64 | | | | db02::2/64 | |
| +-------------+ | | +-------------+ |
| . | | . |
+-------------------+ +-------------------+
. .
. .
. .
+----------------------------------------------------+
| . . |
| +-------------+ +-------------+ |
| | veth0 | | veth1 | |
| | db01::1/64 | ................ | db02::1/64 | |
| +-------------+ +-------------+ |
| |
| beta netns |
| |
+--------------------------+-------------------------+
~~~~~~~~~~~~~~~~~~~~~~
| IOAM configuration |
~~~~~~~~~~~~~~~~~~~~~~
Alpha
+-----------------------------------------------------------+
| Type | Value |
+-----------------------------------------------------------+
| Node ID | 1 |
+-----------------------------------------------------------+
| Node Wide ID | 11111111 |
+-----------------------------------------------------------+
| Ingress ID | 0xffff (default value) |
+-----------------------------------------------------------+
| Ingress Wide ID | 0xffffffff (default value) |
+-----------------------------------------------------------+
| Egress ID | 101 |
+-----------------------------------------------------------+
| Egress Wide ID | 101101 |
+-----------------------------------------------------------+
| Namespace Data | 0xdeadbee0 |
+-----------------------------------------------------------+
| Namespace Wide Data | 0xcafec0caf00dc0de |
+-----------------------------------------------------------+
| Schema ID | 777 |
+-----------------------------------------------------------+
| Schema Data | something that will be 4n-aligned |
+-----------------------------------------------------------+
Note: When Gamma is the destination, Alpha adds an IOAM Pre-allocated Trace
option inside a Hop-by-hop, where 164 bytes are pre-allocated for the
trace, with 123 as the IOAM-Namespace and with 0xfff00200 as the trace
type (= all available options at this time). As a result, and based on
IOAM configurations here, only both Alpha and Beta should be capable of
inserting their IOAM data while Gamma won't have enough space and will
set the overflow bit.
Beta
+-----------------------------------------------------------+
| Type | Value |
+-----------------------------------------------------------+
| Node ID | 2 |
+-----------------------------------------------------------+
| Node Wide ID | 22222222 |
+-----------------------------------------------------------+
| Ingress ID | 201 |
+-----------------------------------------------------------+
| Ingress Wide ID | 201201 |
+-----------------------------------------------------------+
| Egress ID | 202 |
+-----------------------------------------------------------+
| Egress Wide ID | 202202 |
+-----------------------------------------------------------+
| Namespace Data | 0xdeadbee1 |
+-----------------------------------------------------------+
| Namespace Wide Data | 0xcafec0caf11dc0de |
+-----------------------------------------------------------+
| Schema ID | 0xffffff (= None) |
+-----------------------------------------------------------+
| Schema Data | |
+-----------------------------------------------------------+
Gamma
+-----------------------------------------------------------+
| Type | Value |
+-----------------------------------------------------------+
| Node ID | 3 |
+-----------------------------------------------------------+
| Node Wide ID | 33333333 |
+-----------------------------------------------------------+
| Ingress ID | 301 |
+-----------------------------------------------------------+
| Ingress Wide ID | 301301 |
+-----------------------------------------------------------+
| Egress ID | 0xffff (default value) |
+-----------------------------------------------------------+
| Egress Wide ID | 0xffffffff (default value) |
+-----------------------------------------------------------+
| Namespace Data | 0xdeadbee2 |
+-----------------------------------------------------------+
| Namespace Wide Data | 0xcafec0caf22dc0de |
+-----------------------------------------------------------+
| Schema ID | 0xffffff (= None) |
+-----------------------------------------------------------+
| Schema Data | |
+-----------------------------------------------------------+
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following ingonred return val of asprintf() warn during
build:
cc -Wall -O2 fw_namespace.c -o ../tools/testing/selftests/firmware/fw_namespace
fw_namespace.c: In function ‘main’:
fw_namespace.c:132:2: warning: ignoring return value of ‘asprintf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
132 | asprintf(&fw_path, "/lib/firmware/%s", fw_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210708031827.51293-1-skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Verify that feature files are created successfully after mounting a
binderfs instance. Note that only "oneway_spam_detection" feature is
tested with this patch as it is currently the only feature listed.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20210715031805.1725878-3-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set hthresh, dump it again and verify thresh.lbits && thresh.rbits.
They are passed as attributes of xfrm_spdattr_type_t, different from
other message attributes that use xfrm_attr_type_t.
Also, test attribute that is bigger than XFRMA_SPD_MAX, currently it
should be silently ignored.
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
This commit is a first step towards pinning guest-OS vCPUs so as
to force latency differences, especially on multi-socket systems.
The kvm.sh script puts its batch-creation awk script into a temporary
file so that later commits can add the awk code needed to dole out CPUs
so as to maximize latency differences. This awk code will be used by
multiple scripts.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The last line of kvm-test-1-run-qemu.sh invokes parse-console.sh, but
kvm-test-1-run-qemu.sh is unaware of the PATH containing this script
and does not have the job title handy. This commit therefore moves
the invocation of parse-console.sh to kvm-test-1-run.sh, which has
PATH and title at hand. This commit does not add an invocation of
parse-console.sh to kvm-test-1-run-batch.sh because this latter script
is run in the background, and the information will be gathered at the
end of the full run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, kvm-recheck.sh attempts to create a kcsan.sum file even for
build-only runs. This results in false-positive bash errors due to
there being no console.log files in that case. This commit therefore
makes kvm-recheck.sh skip creating the kcsan.sum file for build-only runs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kvm-remote.sh script places the datestamped directory containing
all the build artifacts in the destination systems' /tmp directories,
where they accumulate runtime artifacts such as console.log. This works,
but some systems have a habit of removing files in /tmp that have not
been recently accessed. This commit therefore runs a simple script that
periodically accesses all files in the datestamped directory.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The qemu-cmd file can contain comments that are not relevant to the
operation of kvm-recheck-lock.sh. This commit therefore strips these
comments before looking for timing information.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The qemu-cmd file can contain comments that are not relevant to the
operation of kvm-recheck-scf.sh. This commit therefore strips these
comments before looking for timing information.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, each -kcsan run in a torture.sh group of runs has its own
kcsan.sum summary. This works, but there is usually a lot of duplication
between the runs. This commit therefore also creates an overall kcsan.sum
file for the entire torture.sh run, if there was at least one -kcsan run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kcsan-collapse.sh script assumes that it is being run over the output
of a single kvm.sh run, which is less than helpful for torture.sh runs.
This commit therefore changes the kcsan-collapse.sh script's "ls" pattern
with a "find" command to enable a KCSAN summary across all the -kcsan
runs in a full torture.sh run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, torture.sh accepts --doall on the one hand and --do-none
on the other, which is a bit inconsistent. This commit therefore adds
--do-all and --donone so that a fully consistent test may be used.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds three short tests of the clocksource-watchdog capability
to the torture.sh script, all to avoid otherwise-inevitable bitrot.
While in the area, fix an obsolete comment.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
When run test_tc_tunnel.sh, it complains following error
ipip
encap 192.168.1.1 to 192.168.1.2, type ipip, mac none len 100
test basic connectivity
nc: cannot use -p and -l
nc man page has:
-l Listen for an incoming connection rather than initiating
a connection to a remote host.Cannot be used together with
any of the options -psxz. Additionally, any timeouts specified
with the -w option are ignored.
Correct nc in server_listen().
Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210719223022.66681-1-vincent.mc.li@gmail.com
UDP socket support was added recently so testing UDP insert failure is no
longer correct and causes test_maps failure. The fix is easy though, we
simply need to test that UDP is correctly added instead of blocked.
Fixes: 122e6c79ef ("sock_map: Update sock type checks for UDP")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210720184832.452430-1-john.fastabend@gmail.com
Simple functional test for the newly exposed features.
Also add an optional stress test for the channel number
update under flood.
RFC v1 -> RFC v2:
- add the stress test
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a test case to check whether an unsuccessful creation of an outer
map of a BTF-defined map-in-map destroys the inner map.
As bpf_object__create_map() is a static function, we cannot just call it
from the test case and then check whether a map accessible via
map->inner_map_fd has been closed. Instead, we iterate over all maps and
check whether the map "$MAP_NAME.inner" does not exist.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210719173838.423148-3-m@lambda.lt
- Fix MTE shared page detection
- Fix selftest use of obsolete pthread_yield() in favour of sched_yield()
- Enable selftest's use of PMU registers when asked to
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmD0BgcPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDSWwP/1dmthmV6qh7eqlXquxq9GRFOZoFBdY+48R0
8X7eZkM+TxIFd36gOfur9v8kh3KLCZmTi4/f/0hsWpcE36SYVv8hfNBck6wRWoln
Yh4mYqW23l8nME2wwzRkOE1uZ6o3RRKSWROoMultPYrL1XdVTv+sCUmIqY/aa4vu
dgrZaFBC5sxzW6Rsiw0H7NswC/RezjIooY9qimeQpSNMOM0585FCjVTtU9/0+S4V
UpUPe/Y9g74N98NAtbJ9Wy5Q1I4xsQeYBejzgLyVYvTVnV7/6ITpQOi/EjgtvTln
j7LgN72H9BBQv7wEmnQs4AmCdPqppb0PfHKwpUQTEgK271SvROf9ssmSaaAS5rRx
5d3XCr+D93JViP6W/KUoA8Mye5en1gikfnrQeL5eLPpryOVIFsX1zeA7wOnYeRGh
r7QAboC1ScSKlTZiXaeSvxI1xC4FVljUeTNGuWJIofkyo744KX5tFktZefr0i3LL
/n/oE/aErXDRVfinAFxSRE4NUZON6vc6JnUBGFbs/pCJKj60XjVBx6IhnPnHBlv2
pMS5REZ6+LqovXoaib/BI/4LfLqk32u/fPVTEJTqs9H+brBVD7+lujJozcZ8Wes8
/fC7hZfNnvfbRgH/BdEetxVqbB68FNV3DXZ3V3MjnP+G6mI/ugLCuQ22LN3PmH9w
0BiZilau
=Ln0f
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.14, take #1
- Fix MTE shared page detection
- Fix selftest use of obsolete pthread_yield() in favour of sched_yield()
- Enable selftest's use of PMU registers when asked to
This KUnit fixes update for Linux 5.14-rc2 consists of fixes to kunit
tool and documentation:
-- asserts on older python versions.
-- fixes to misleading error messages when TAP header format is
incorrect or when file is missing.
-- fixes documentation dropping obsolete information about uml_abort
coverage.
-- removing unnecessary annotations
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmDyC6cACgkQCwJExA0N
Qxxy7w/8DN/4wMXSP+VO2ZebMLrLgU7RZYyqHHkhvs3DsIv9fKf1sY9aK+fRQiYF
1rDnv4L0MpPFVRgdx6n0r//LDwIavPHiIW3WMwRWUsTfGd0ZhEgY/8LpnR2UNsi+
VWfpxt7kp9kFWaOy75f+/ZuXLW86XNdNc7tQoY4bupI4MAuCv07zcsGxXtsfELeD
VLcjbbAckr4H1IXqltlo++3NKCBVVxI5mCbOBfMeZ9eyvdVUarlJfXR00RbcJ764
TtlsMYGcGBZi8iwd9zgc2nB6a6wCamvCbqWG39CyqFerjw8dqhSSAISs+rcx6tUS
6APRA2iFqKYc9I2bZdp0ct11FQ1gK/yMsjX7UJN6+50qRdTf2y12Qf7HqwkMPpox
CqLlhS9it9K4a1WEzl7wRqBVyQsSa7Bm40bfKOAaqQdvre/i9Qc2W1RfbTmgWG8A
ZUOuh8GdDBpsDmhYygpVuT2cuEFrMEo5kaiGQhczuLWavEwGB2cl6W+Ct/9T9iIO
dXP1f2KGldkXYoqcvBqeVF1Hog6EhIG6pMBnkMGf6uqITNUXxot8EjLv+yR2AsXf
jcW+xJLG7QL4Wtld/XkHxNzQSp0o37yoCQfhIVYLufxtXzXJTOtGBM2K/jJJ3az5
fcP6y4kv9rVk+OeyzY1mi0ijFROKOju1POAMgtbovNMZJmlYWoY=
=p0Nt
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
"Fixes to kunit tool and documentation:
- fix asserts on older python versions
- fixes to misleading error messages when TAP header format is
incorrect or when file is missing
- documentation fix: drop obsolete information about uml_abort
coverage
- remove unnecessary annotations"
* tag 'linux-kselftest-kunit-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: Assert the version requirement
kunit: tool: remove unnecessary "annotations" import
Documentation: kunit: drop obsolete note about uml_abort for coverage
kunit: tool: Fix error messages for cases of no tests and wrong TAP header
This Kselftest fixes update for Linux 5.14-rc2 consists of fix
to memory-hotplug hot-remove test to stop spamming logs with
dump_page() entries and slowing the system down to a crawl.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmDx/hcACgkQCwJExA0N
QxzwIw/+OIg2i+vYwHRaGHw2kIhz+1Sfja+sJ5KCldxxtCFI+alQnkrZwcvjjR0N
OsqQXIfp2BH2YrtMGA4EO3B7ZtgejQ90cFLhFan4VwZC7IVZc8VjTrxIiruUpq9P
XV5TtGDtGSPgMUlqS67nh2fl35GdCminADSkqJ3UETZfvyE2vPw+UEYbij+IzE9n
Ndwcaqwqcf8jFl1dkK2UhTto7Xtjrmf+AOuEZzHFXldaK9tLU0aUz17efwAgxZYx
l1Bldi8f8cqfyXLoW459pV5f13r5hurc7goQHwNft5zayB1AKXJmwIy1CCUsEAQC
P6UpU3lwBumJRZd/sDaSxAhrUk7gRITPJcfej+2tuIMdo+vIqrtl82pLOVtitJux
6INcfGMtZ6LxMWJbF35evhVOmYcpUXu3Uh6xkwjFYoqdw7+y0b+6aAFaveLAKBdE
ne/dN6T2Yo81C94Bst1XPD1z65lpNO4a/xWmt25vQuA2TEHMfMZ+Wg7Ew9zglZSo
sy6fpZnHmSiwFaowP6tfYnFrT2OD6iEaLre58chPNHr3e2alrJWm3bW59TbemSbu
ZWZMTC+NFNLVOVEUDVKXMkW5lXoPpn+NJGX50IEhMltPYVvSCVk74pEUySbzN7eo
mG+pb4Xud9JvnOj/+yxeqw7jkQWY8hknelF/kmwkcMW97HttW80=
=zpTx
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
"A fix to memory-hotplug hot-remove test to stop spamming logs with
dump_page() entries and slowing the system down to a crawl"
* tag 'linux-kselftest-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test
Test various type data dumping operations by comparing expected
format with the dumped string; an snprintf-style printf function
is used to record the string dumped. Also verify overflow handling
where the data passed does not cover the full size of a type,
such as would occur if a tracer has a portion of the 8k
"struct task_struct".
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626362126-27775-4-git-send-email-alan.maguire@oracle.com
Add several test cases for checking update_alu_sanitation_state() under
multiple paths:
# ./test_verifier
[...]
#1061/u map access: known scalar += value_ptr unknown vs const OK
#1061/p map access: known scalar += value_ptr unknown vs const OK
#1062/u map access: known scalar += value_ptr const vs unknown OK
#1062/p map access: known scalar += value_ptr const vs unknown OK
#1063/u map access: known scalar += value_ptr const vs const (ne) OK
#1063/p map access: known scalar += value_ptr const vs const (ne) OK
#1064/u map access: known scalar += value_ptr const vs const (eq) OK
#1064/p map access: known scalar += value_ptr const vs const (eq) OK
#1065/u map access: known scalar += value_ptr unknown vs unknown (eq) OK
#1065/p map access: known scalar += value_ptr unknown vs unknown (eq) OK
#1066/u map access: known scalar += value_ptr unknown vs unknown (lt) OK
#1066/p map access: known scalar += value_ptr unknown vs unknown (lt) OK
#1067/u map access: known scalar += value_ptr unknown vs unknown (gt) OK
#1067/p map access: known scalar += value_ptr unknown vs unknown (gt) OK
[...]
Summary: 1762 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2021-07-15
The following pull-request contains BPF updates for your *net-next* tree.
We've added 45 non-merge commits during the last 15 day(s) which contain
a total of 52 files changed, 3122 insertions(+), 384 deletions(-).
The main changes are:
1) Introduce bpf timers, from Alexei.
2) Add sockmap support for unix datagram socket, from Cong.
3) Fix potential memleak and UAF in the verifier, from He.
4) Add bpf_get_func_ip helper, from Jiri.
5) Improvements to generic XDP mode, from Kumar.
6) Support for passing xdp_md to XDP programs in bpf_prog_run, from Zvi.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP and other connection oriented sockets have accept()
for each incoming connection on the server side, hence
they can just insert those fd's from accept() to sockmap,
which are of course established.
Now with datagram sockets begin to support sockmap and
redirection, the restriction is no longer applicable to
them, as they have no accept(). So we have to lift this
restriction for them. This is fine, because inside
bpf_sk_redirect_map() we still have another socket status
check, sock_map_redirect_allowed(), as a guard.
This also means they do not have to be removed from
sockmap when disconnecting.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210704190252.11866-3-xiyou.wangcong@gmail.com
Adding test for bpf_get_func_ip in kprobe+ofset probe.
Because of the offset value it's arch specific, enabling
the new test only for x86_64 architecture.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-9-jolsa@kernel.org
Check that map-in-map supports bpf timers.
Check that indirect "recursion" of timer callbacks works:
timer_cb1() { bpf_timer_set_callback(timer_cb2); }
timer_cb2() { bpf_timer_set_callback(timer_cb1); }
Check that
bpf_map_release
htab_free_prealloced_timers
bpf_timer_cancel_and_free
hrtimer_cancel
works while timer cb is running.
"while true; do ./test_progs -t timer_mim; done"
is a great stress test. It caught missing timer cancel in htab->extra_elems.
timer_mim_reject.c is a negative test that checks
that timer<->map mismatch is prevented.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210715005417.78572-12-alexei.starovoitov@gmail.com
Add bpf_timer test that creates timers in preallocated and
non-preallocated hash, in array and in lru maps.
Let array timer expire once and then re-arm it for 35 seconds.
Arm lru timer into the same callback.
Then arm and re-arm hash timers 10 times each.
At the last invocation of prealloc hash timer cancel the array timer.
Force timer free via LRU eviction and direct bpf_map_delete_elem.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210715005417.78572-11-alexei.starovoitov@gmail.com
The variable buf is unused since commit 005edd1656 ("selftests/bpf:
convert bpf tunnel test to BPF_ADJ_ROOM_MAC"). Remove it to fix the
following warning:
test_tc_tunnel.c:531:7: warning: unused variable 'buf' [-Wunused-variable]
Fixes: 005edd1656 ("selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210713102719.8890-1-tklauser@distanz.ch
* Fixes for host SMIs on AMD
* Fixes for guest SMIs on AMD
* Fixes for selftests on s390 and ARM
* Fix memory leak
* Enforce no-instrumentation area on vmentry when hardware
breakpoints are in use.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDwRi4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOt4AgAl6xEkMwDC74d/QFIOA7s2GD3ugfa
z5XqGN1qz/nmEMnuIg6/tjTXDPmn/dfLMqy8RGZfyUv6xbgPcv/7JuFMRILvwGTb
SbOVrGnR/QOhMdlfWH34qDkXeEsthTXSgQgVm/iiED0TttvQYVcZ/E9mgzaWQXor
T1yTug2uAUXJ1EBxY0ZBo2kbh+BvvdmhEF0pksZOuwqZdH3zn3QCXwAwkL/OtUYE
M6nNn3j1LU38C4OK1niXOZZVOuMIdk/l7LyFpjUQTFlIqitQAPtBE5MD+K+A9oC2
Yocxyj2tId1e6o8bLic/oN8/LpdORTvA/wDMj5M1DcMzvxQuQIpGYkcVGg==
=gjVA
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- Allow again loading KVM on 32-bit non-PAE builds
- Fixes for host SMIs on AMD
- Fixes for guest SMIs on AMD
- Fixes for selftests on s390 and ARM
- Fix memory leak
- Enforce no-instrumentation area on vmentry when hardware breakpoints
are in use.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
KVM: selftests: smm_test: Test SMM enter from L2
KVM: nSVM: Restore nested control upon leaving SMM
KVM: nSVM: Fix L1 state corruption upon return from SMM
KVM: nSVM: Introduce svm_copy_vmrun_state()
KVM: nSVM: Check that VM_HSAVE_PA MSR was set before VMRUN
KVM: nSVM: Check the value written to MSR_VM_HSAVE_PA
KVM: SVM: Fix sev_pin_memory() error checks in SEV migration utilities
KVM: SVM: Return -EFAULT if copy_to_user() for SEV mig packet header fails
KVM: SVM: add module param to control the #SMI interception
KVM: SVM: remove INIT intercept handler
KVM: SVM: #SMI interception must not skip the instruction
KVM: VMX: Remove vmx_msr_index from vmx.h
KVM: X86: Disable hardware breakpoints unconditionally before kvm_x86->run()
KVM: selftests: Address extra memslot parameters in vm_vaddr_alloc
kvm: debugfs: fix memory leak in kvm_create_vm_debugfs
KVM: x86/pmu: Clear anythread deprecated bit when 0xa leaf is unsupported on the SVM
KVM: mmio: Fix use-after-free Read in kvm_vm_ioctl_unregister_coalesced_mmio
KVM: SVM: Revert clearing of C-bit on GPA in #NPF handler
KVM: x86/mmu: Do not apply HPA (memory encryption) mask to GPAs
KVM: x86: Use kernel's x86_phys_bits to handle reduced MAXPHYADDR
...
Two additional tests are added:
- SMM triggered from L2 does not currupt L1 host state.
- Save/restore during SMM triggered from L2 does not corrupt guest/host
state.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210628104425.391276-7-vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit a75a895e64 ("KVM: selftests: Unconditionally use memslot 0 for
vaddr allocations") removed the memslot parameters from vm_vaddr_alloc.
It addressed all callers except one under lib/aarch64/, due to a race
with commit e3db7579ef ("KVM: selftests: Add exception handling
support for aarch64")
Fix the vm_vaddr_alloc call in lib/aarch64/processor.c.
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Message-Id: <20210702201042.4036162-1-ricarkol@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current release - regressions:
- sock: fix parameter order in sock_setsockopt()
Current release - new code bugs:
- netfilter: nft_last:
- fix incorrect arithmetic when restoring last used
- honor NFTA_LAST_SET on restoration
Previous releases - regressions:
- udp: properly flush normal packet at GRO time
- sfc: ensure correct number of XDP queues; don't allow enabling the
feature if there isn't sufficient resources to Tx from any CPU
- dsa: sja1105: fix address learning getting disabled on the CPU port
- mptcp: addresses a rmem accounting issue that could keep packets
in subflow receive buffers longer than necessary, delaying
MPTCP-level ACKs
- ip_tunnel: fix mtu calculation for ETHER tunnel devices
- do not reuse skbs allocated from skbuff_fclone_cache in the napi
skb cache, we'd try to return them to the wrong slab cache
- tcp: consistently disable header prediction for mptcp
Previous releases - always broken:
- bpf: fix subprog poke descriptor tracking use-after-free
- ipv6:
- allocate enough headroom in ip6_finish_output2() in case
iptables TEE is used
- tcp: drop silly ICMPv6 packet too big messages to avoid
expensive and pointless lookups (which may serve as a DDOS
vector)
- make sure fwmark is copied in SYNACK packets
- fix 'disable_policy' for forwarded packets (align with IPv4)
- netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
- netfilter: conntrack: do not mark RST in the reply direction coming
after SYN packet for an out-of-sync entry
- mptcp: cleanly handle error conditions with MP_JOIN and syncookies
- mptcp: fix double free when rejecting a join due to port mismatch
- validate lwtstate->data before returning from skb_tunnel_info()
- tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path
- mt76: mt7921: continue to probe driver when fw already downloaded
- bonding: fix multiple issues with offloading IPsec to (thru?) bond
- stmmac: ptp: fix issues around Qbv support and setting time back
- bcmgenet: always clear wake-up based on energy detection
Misc:
- sctp: move 198 addresses from unusable to private scope
- ptp: support virtual clocks and timestamping
- openvswitch: optimize operation for key comparison
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDu3mMACgkQMUZtbf5S
Irsjxg//UwcPJMYFmXV+fGkEsWYe1Kf29FcUDEeANFtbltfAcIfZ0GoTbSDRnrVb
HcYAKcm4XRx5bWWdQrQsQq/yiLbnS/rSLc7VRB+uRHWRKl3eYcaUB2rnCXsxrjGw
wQJgOmztDCJS4BIky24iQpF/8lg7p/Gj2Ih532gh93XiYo612FrEJKkYb2/OQfYX
GkbnZ0kL2Y1SV+bhy6aT5azvhHKM4/3eA4fHeJ2p8e2gOZ5ni0vpX0xEzdzKOCd0
vwR/Wu3h/+2QuFYVcSsVguuM++JXACG8MAS/Tof78dtNM4a3kQxzqeh5Bv6IkfTu
rokENLq4pjNRy+nBAOeQZj8Jd0K0kkf/PN9WMdGQtplMoFhjjV25R6PeRrV9wwPo
peozIz2MuQo7Kfof1D+44h2foyLfdC28/Z0CvRbDpr5EHOfYynvBbrnhzIGdQp6V
xgftKTOdgz2Djgg8HiblZund1FA44OYerddVAASrIsnSFnIz1VLVQIsfV+GLBwwc
FawrIZ6WfIjzRSrDGOvDsbAQI47T/1jbaPJeK6XgjWkQmjEd6UtRWRZLYCxemQEw
4HP3sWC96BOehuD8ylipVE1oFqrxCiOB/fZxezXqjo8dSX3NLdak4cCHTHoW5SuZ
eEAxQRaBliKd+P7hoy9cZ57CAu3zUa8kijfM5QRlCAHF+zSxaPs=
=QFnb
-----END PGP SIGNATURE-----
Merge tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski.
"Including fixes from bpf and netfilter.
Current release - regressions:
- sock: fix parameter order in sock_setsockopt()
Current release - new code bugs:
- netfilter: nft_last:
- fix incorrect arithmetic when restoring last used
- honor NFTA_LAST_SET on restoration
Previous releases - regressions:
- udp: properly flush normal packet at GRO time
- sfc: ensure correct number of XDP queues; don't allow enabling the
feature if there isn't sufficient resources to Tx from any CPU
- dsa: sja1105: fix address learning getting disabled on the CPU port
- mptcp: addresses a rmem accounting issue that could keep packets in
subflow receive buffers longer than necessary, delaying MPTCP-level
ACKs
- ip_tunnel: fix mtu calculation for ETHER tunnel devices
- do not reuse skbs allocated from skbuff_fclone_cache in the napi
skb cache, we'd try to return them to the wrong slab cache
- tcp: consistently disable header prediction for mptcp
Previous releases - always broken:
- bpf: fix subprog poke descriptor tracking use-after-free
- ipv6:
- allocate enough headroom in ip6_finish_output2() in case
iptables TEE is used
- tcp: drop silly ICMPv6 packet too big messages to avoid
expensive and pointless lookups (which may serve as a DDOS
vector)
- make sure fwmark is copied in SYNACK packets
- fix 'disable_policy' for forwarded packets (align with IPv4)
- netfilter: conntrack:
- do not renew entry stuck in tcp SYN_SENT state
- do not mark RST in the reply direction coming after SYN packet
for an out-of-sync entry
- mptcp: cleanly handle error conditions with MP_JOIN and syncookies
- mptcp: fix double free when rejecting a join due to port mismatch
- validate lwtstate->data before returning from skb_tunnel_info()
- tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path
- mt76: mt7921: continue to probe driver when fw already downloaded
- bonding: fix multiple issues with offloading IPsec to (thru?) bond
- stmmac: ptp: fix issues around Qbv support and setting time back
- bcmgenet: always clear wake-up based on energy detection
Misc:
- sctp: move 198 addresses from unusable to private scope
- ptp: support virtual clocks and timestamping
- openvswitch: optimize operation for key comparison"
* tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits)
net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave()
sfc: add logs explaining XDP_TX/REDIRECT is not available
sfc: ensure correct number of XDP queues
sfc: fix lack of XDP TX queues - error XDP TX failed (-22)
net: fddi: fix UAF in fza_probe
net: dsa: sja1105: fix address learning getting disabled on the CPU port
net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload
net: Use nlmsg_unicast() instead of netlink_unicast()
octeontx2-pf: Fix uninitialized boolean variable pps
ipv6: allocate enough headroom in ip6_finish_output2()
net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific
net: bridge: multicast: fix MRD advertisement router port marking race
net: bridge: multicast: fix PIM hello router port marking race
net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340
dsa: fix for_each_child.cocci warnings
virtio_net: check virtqueue_add_sgs() return value
mptcp: properly account bulk freed memory
selftests: mptcp: fix case multiple subflows limited by server
mptcp: avoid processing packet if a subflow reset
mptcp: fix syncookie process if mptcp can not_accept new subflow
...
Commit b78f4a5966 ("KVM: selftests: Rename vm_handle_exception")
raced with a couple of new x86 tests, missing two vm_handle_exception
to vm_install_exception_handler conversions.
Help the two broken tests to catch up with the new world.
Cc: Andrew Jones <drjones@redhat.com>
CC: Ricardo Koller <ricarkol@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20210701071928.2971053-1-maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We reworked get-reg-list to make it easier to enable optional register
sublists by parametrizing their vcpu feature flags as well as making
other generalizations. That was all to make sure we enable the PMU
registers when we want to test them. Somehow we forgot to actually
include the PMU feature flag in the PMU sublist description though!
Do that now.
Fixes: 313673bad8 ("KVM: arm64: selftests: get-reg-list: Split base and pmu registers")
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210713203742.29680-3-drjones@redhat.com
With later GCC we get
steal_time.c: In function ‘main’:
steal_time.c:323:25: warning: ‘pthread_yield’ is deprecated: pthread_yield is deprecated, use sched_yield instead [-Wdeprecated-declarations]
Let's follow the instructions and use sched_yield instead.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210713203742.29680-2-drjones@redhat.com
While the offline memory test obey ratio limit, the same test with
error injection does not and tries to offline all the hotpluggable
memory, spamming system logs with hundreds of thousands of dump_page()
entries, slowing system down (to the point the test itself timesout and
gets terminated) and excessive fs occupation:
...
[ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
[ 9784.393355] def_blk_aops
[ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
[ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
[ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
[ 9784.393359] page dumped because: migration failure
[ 9784.393360] page->mem_cgroup:c00000000490d000
[ 9784.393416] migrating pfn 1f46d failed ret:1
...
$ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
2405558
$ ls -la /var/log/kern.log
-rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit 87c9c16317 ("kunit: tool: add support for QEMU") on the 'next'
tree adds 'from __future__ import annotations' in 'kunit_kernel.py'.
Because it is supported on only >=3.7 Python, people using older Python
will get below error:
Traceback (most recent call last):
File "./tools/testing/kunit/kunit.py", line 20, in <module>
import kunit_kernel
File "/home/sjpark/linux/tools/testing/kunit/kunit_kernel.py", line 9
from __future__ import annotations
^
SyntaxError: future feature annotations is not defined
This commit adds a version assertion in 'kunit.py', so that people get
more explicit error message like below:
Traceback (most recent call last):
File "./tools/testing/kunit/kunit.py", line 15, in <module>
assert sys.version_info >= (3, 7), "Python version is too old"
AssertionError: Python version is too old
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The import was working around the fact "tuple[T]" was used instead of
typing.Tuple[T].
Convert it to use type.Tuple to be consistent with how the rest of the
code is anotated.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This patch addresses misleading error messages reported by kunit_tool in
two cases. First, in the case of TAP output having an incorrect header
format or missing a header, the parser used to output an error message of
'no tests run!'. Now the parser outputs an error message of 'could not
parse test results!'.
As an example:
Before:
$ ./tools/testing/kunit/kunit.py parse /dev/null
[ERROR] no tests run!
...
After:
$ ./tools/testing/kunit/kunit.py parse /dev/null
[ERROR] could not parse test results!
...
Second, in the case of TAP output with the correct header but no
tests, the parser used to output an error message of 'could not parse
test results!'. Now the parser outputs an error message of 'no tests
run!'.
As an example:
Before:
$ echo -e 'TAP version 14\n1..0' | ./tools/testing/kunit/kunit.py parse
[ERROR] could not parse test results!
After:
$ echo -e 'TAP version 14\n1..0' | ./tools/testing/kunit/kunit.py parse
[ERROR] no tests run!
Additionally, this patch also corrects the tests in kunit_tool_test.py
and adds a test to check the error in the case of TAP output with the
correct header but no tests.
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
After patch "mptcp: fix syncookie process if mptcp can not_accept new
subflow", if subflow is limited, MP_JOIN SYN is dropped, and no SYN/ACK
will be replied.
So in case "multiple subflows limited by server", the expected SYN/ACK
number should be 1.
Fixes: 00587187ad ("selftests: mptcp: add test cases for mptcp join tests with syn cookies")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2021-07-09
The following pull-request contains BPF updates for your *net* tree.
We've added 9 non-merge commits during the last 9 day(s) which contain
a total of 13 files changed, 118 insertions(+), 62 deletions(-).
The main changes are:
1) Fix runqslower task->state access from BPF, from SanjayKumar Jeyakumar.
2) Fix subprog poke descriptor tracking use-after-free, from John Fastabend.
3) Fix sparse complaint from prior devmap RCU conversion, from Toke Høiland-Jørgensen.
4) Fix missing va_end in bpftool JIT json dump's error path, from Gu Shengxian.
5) Fix tools/bpf install target from missing runqslower install, from Wei Li.
6) Fix xdpsock BPF sample to unload program on shared umem option, from Wang Hai.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fixed a bug that broke the .sym-offset modifier and added a test to make
sure nothing breaks it again.
- Replace a list_del/list_add() with a list_move()
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYOhMlxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6quYRAQCbciaL9gQwk6lS4Rn3NkC/i1xsxv0A
q56XDPK3qyuHWQD/d6gcJ5n+Bl0OoNaDyG4UqnSGucZctYQeY4LQMzd4PQA=
=wPyY
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix and cleanup from Steven Rostedt:
"Tracing fix for histograms and a clean up in ftrace:
- Fixed a bug that broke the .sym-offset modifier and added a test to
make sure nothing breaks it again.
- Replace a list_del/list_add() with a list_move()"
* tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Use list_move instead of list_del/list_add
tracing/selftests: Add tests to test histogram sym and sym-offset modifiers
tracing/histograms: Fix parsing of "sym-offset" modifier
This adds some extra noise to the tailcall_bpf2bpf4 tests that will cause
verify to patch insns. This then moves around subprog start/end insn
index and poke descriptor insn index to ensure that verify and JIT will
continue to track these correctly.
If done correctly verifier should pass this program same as before and
JIT should emit tail call logic.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210707223848.14580-3-john.fastabend@gmail.com
With a large mmap map size, we can overlap with the text area and using
MAP_FIXED results in unmapping that area. Switch to MAP_FIXED_NOREPLACE
and handle the EEXIST error.
Link: https://lkml.kernel.org/r/20210616045239.370802-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mrermap fixes", v2.
This patch (of 6):
Instead of hardcoding 4K page size fetch it using sysconf(). For the
performance measurements test still assume 2M and 1G are hugepage sizes.
Link: https://lkml.kernel.org/r/20210616045239.370802-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20210616045239.370802-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The test verifies that file descriptor created with memfd_secret does not
allow read/write operations, that secret memory mappings respect
RLIMIT_MEMLOCK and that remote accesses with process_vm_read() and
ptrace() to the secret memory fail.
Link: https://lkml.kernel.org/r/20210518072034.31572-8-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Will Deacon <will@kernel.org>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a test to the tracing selftests that will catch if the .sym or
.sym-offset modifiers break in the future.
Link: https://lkml.kernel.org/r/20210707121451.101a1002@oasis.local.home
Acked-by: Tom Zanussi <zanussi@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Support for cpumap and devmap entry progs in previous commits means the
test needs to be updated for the new semantics. Also take this
opportunity to convert it from CHECK macros to the new ASSERT macros.
Since xdp_cpumap_attach has no subtest, put the sole test inside the
test_xdp_cpumap_attach function.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210702111825.491065-6-memxor@gmail.com
Add a test for using xdp_md as a context to BPF_PROG_TEST_RUN for XDP
programs.
The test uses a BPF program that takes in a return value from XDP
meta data, then reduces the size of the XDP meta data by 4 bytes.
Test cases validate the possible failure cases for passing in invalid
xdp_md contexts, that the return value is successfully passed
in, and that the adjusted meta data is successfully copied out.
Co-developed-by: Cody Haas <chaas@riotgames.com>
Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Cody Haas <chaas@riotgames.com>
Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
Signed-off-by: Zvi Effron <zeffron@riotgames.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210707221657.3985075-5-zeffron@riotgames.com
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Do not refresh timeout in SYN_SENT for syn retransmissions.
Add selftest for unreplied TCP connection, from Florian Westphal.
2) Fix null dereference from error path with hardware offload
in nftables.
3) Remove useless nf_ct_gre_keymap_flush() from netns exit path,
from Vasily Averin.
4) Missing rcu read-lock side in ctnetlink helper info dump,
also from Vasily.
5) Do not mark RST in the reply direction coming after SYN packet
for an out-of-sync entry, from Ali Abdallah and Florian Westphal.
6) Add tcp_ignore_invalid_rst sysctl to allow to disable out of
segment RSTs, from Ali.
7) KCSAN fix for nf_conntrack_all_lock(), from Manfred Spraul.
8) Honor NFTA_LAST_SET in nft_last.
9) Fix incorrect arithmetics when restore last_jiffies in nft_last.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After redirecting, it's already a new path. So the old PMTU info should
be cleared. The IPv6 test "mtu exception plus redirect" should only
has redirect info without old PMTU.
The IPv4 test can not be changed because of legacy.
Fixes: ec81053528 ("selftests: Add redirect tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the kernel doesn't enable option CONFIG_IPV6_SUBTREES, the RTA_SRC
info will not be exported to userspace in rt6_fill_node(). And ip cmd will
not print "from ::" to the route output. So remove this check.
Fixes: ec81053528 ("selftests: Add redirect tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Prevent sigaltstack out of bounds writes. The kernel unconditionally
writes the FPU state to the alternate stack without checking whether
the stack is large enough to accomodate it.
Check the alternate stack size before doing so and in case it's too
small force a SIGSEGV instead of silently corrupting user space data.
- MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never been
updated despite the fact that the FPU state which is stored on the
signal stack has grown over time which causes trouble in the field
when AVX512 is available on a CPU. The kernel does not expose the
minimum requirements for the alternate stack size depending on the
available and enabled CPU features.
ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
Add it to x86 as well
- A major cleanup of the x86 FPU code. The recent discoveries of XSTATE
related issues unearthed quite some inconsistencies, duplicated code
and other issues.
The fine granular overhaul addresses this, makes the code more robust
and maintainable, which allows to integrate upcoming XSTATE related
features in sane ways.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmDlcpETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoeP5D/4i+AgYYeiMLgGb+NS7iaKPfoWo6LIz
y3qdTSA0DQaIYbYivWwRO/g0GYdDMXDWeZalFi7eGnVI8O3eOog+22Zrf/y0UINB
KJHdYd4ApWHhs401022y5hexrWQvnV8w1yQCuj/zLm6eC+AVhdwt2AY+IBoRrdUj
wqY97B/4rJNsBvvqTDn9EeDrJA2y0y0Suc7AhIp2BGMI+dpIdxys8RJDamXNWyDL
gJf0YRgUoiIn3AHKb+fgv60AoxfC175NSg/5/y/scFNXqVlW0Up4YCb7pqG9o2Ga
f3XvtWfbw1N5PmUYjFkALwEkzGUbM3v0RA3xLY2j2WlWm9fBPPy59dt+i/h/VKyA
GrA7i7lcIqX8dfVH6XkrReZBkRDSB6t9SZTvV54jAz5fcIZO2Rg++UFUvI/R6GKK
XCcxukYaArwo+IG62iqDszS3gfLGhcor/cviOeULRC5zMUIO4Jah+IhDnifmShtC
M5s9QzrwIRD/XMewGRQmvkiN4kBfE7jFoBQr1J9leCXJKrM+2JQmMzVInuubTQIq
SdlKOaAIn7xtekz+6XdFG9Gmhck0PCLMJMOLNvQkKWI3KqGLRZ+dAWKK0vsCizAx
0BA7ZeB9w9lFT+D8mQCX77JvW9+VNwyfwIOLIrJRHk3VqVpS5qvoiFTLGJJBdZx4
/TbbRZu7nXDN2w==
=Mq1m
-----END PGP SIGNATURE-----
Merge tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Thomas Gleixner:
"Fixes and improvements for FPU handling on x86:
- Prevent sigaltstack out of bounds writes.
The kernel unconditionally writes the FPU state to the alternate
stack without checking whether the stack is large enough to
accomodate it.
Check the alternate stack size before doing so and in case it's too
small force a SIGSEGV instead of silently corrupting user space
data.
- MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never
been updated despite the fact that the FPU state which is stored on
the signal stack has grown over time which causes trouble in the
field when AVX512 is available on a CPU. The kernel does not expose
the minimum requirements for the alternate stack size depending on
the available and enabled CPU features.
ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
Add it to x86 as well.
- A major cleanup of the x86 FPU code. The recent discoveries of
XSTATE related issues unearthed quite some inconsistencies,
duplicated code and other issues.
The fine granular overhaul addresses this, makes the code more
robust and maintainable, which allows to integrate upcoming XSTATE
related features in sane ways"
* tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
x86/fpu/signal: Let xrstor handle the features to init
x86/fpu/signal: Handle #PF in the direct restore path
x86/fpu: Return proper error codes from user access functions
x86/fpu/signal: Split out the direct restore code
x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
x86/fpu/signal: Sanitize the xstate check on sigframe
x86/fpu/signal: Remove the legacy alignment check
x86/fpu/signal: Move initial checks into fpu__restore_sig()
x86/fpu: Mark init_fpstate __ro_after_init
x86/pkru: Remove xstate fiddling from write_pkru()
x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()
x86/fpu: Remove PKRU handling from switch_fpu_finish()
x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
x86/fpu: Hook up PKRU into ptrace()
x86/fpu: Add PKRU storage outside of task XSAVE buffer
x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
...
Unless the user sets overcommit_memory or has plenty of swap, the latest
changes to the testcase will result in ENOMEM failures for hosts with
less than 64GB RAM. As we do not use much of the allocated memory, we
can use MAP_NORESERVE to avoid this error.
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: vkuznets@redhat.com
Cc: wanghaibin.wang@huawei.com
Cc: stable@vger.kernel.org
Fixes: 309505dd56 ("KVM: selftests: Fix mapping length truncation in m{,un}map()")
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/kvm/20210701160425.33666-1-borntraeger@de.ibm.com/
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Older machines like z196 and zEC12 do only support 44 bits of physical
addresses. Make this the default and check via IBC if we are on a later
machine. We then add P47V64 as an additional model.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/kvm/20210701153853.33063-1-borntraeger@de.ibm.com/
Fixes: 1bc603af73 ("KVM: selftests: introduce P47V64 for s390x")
Here is the big set of char / misc and other driver subsystem updates
for 5.14-rc1. Included in here are:
- habanna driver updates
- fsl-mc driver updates
- comedi driver updates
- fpga driver updates
- extcon driver updates
- interconnect driver updates
- mei driver updates
- nvmem driver updates
- phy driver updates
- pnp driver updates
- soundwire driver updates
- lots of other tiny driver updates for char and misc drivers
This is looking more and more like the "various driver subsystems mushed
together" tree...
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYOM8jQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymECgCg0yL+8WxDKO5Gg5llM5PshvLB1rQAn0y5pDgg
nw78LV3HQ0U7qaZBtI91
=x+AR
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver updates from Greg KH:
"Here is the big set of char / misc and other driver subsystem updates
for 5.14-rc1. Included in here are:
- habanalabs driver updates
- fsl-mc driver updates
- comedi driver updates
- fpga driver updates
- extcon driver updates
- interconnect driver updates
- mei driver updates
- nvmem driver updates
- phy driver updates
- pnp driver updates
- soundwire driver updates
- lots of other tiny driver updates for char and misc drivers
This is looking more and more like the "various driver subsystems
mushed together" tree...
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
mcb: Use DEFINE_RES_MEM() helper macro and fix the end address
PNP: moved EXPORT_SYMBOL so that it immediately followed its function/variable
bus: mhi: pci-generic: Add missing 'pci_disable_pcie_error_reporting()' calls
bus: mhi: Wait for M2 state during system resume
bus: mhi: core: Fix power down latency
intel_th: Wait until port is in reset before programming it
intel_th: msu: Make contiguous buffers uncached
intel_th: Remove an unused exit point from intel_th_remove()
stm class: Spelling fix
nitro_enclaves: Set Bus Master for the NE PCI device
misc: ibmasm: Modify matricies to matrices
misc: vmw_vmci: return the correct errno code
siox: Simplify error handling via dev_err_probe()
fpga: machxo2-spi: Address warning about unused variable
lkdtm/heap: Add init_on_alloc tests
selftests/lkdtm: Enable various testable CONFIGs
lkdtm: Add CONFIG hints in errors where possible
lkdtm: Enable DOUBLE_FAULT on all architectures
lkdtm/heap: Add vmalloc linear overflow test
lkdtm/bugs: XFAIL UNALIGNED_LOAD_STORE_WRITE
...
Pull RCU updates from Paul McKenney:
- Bitmap parsing support for "all" as an alias for all bits
- Documentation updates
- Miscellaneous fixes, including some that overlap into mm and lockdep
- kvfree_rcu() updates
- mem_dump_obj() updates, with acks from one of the slab-allocator
maintainers
- RCU NOCB CPU updates, including limited deoffloading
- SRCU updates
- Tasks-RCU updates
- Torture-test updates
* 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (78 commits)
tasks-rcu: Make show_rcu_tasks_gp_kthreads() be static inline
rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states
rcu: Add missing __releases() annotation
rcu: Remove obsolete rcu_read_unlock() deadlock commentary
rcu: Improve comments describing RCU read-side critical sections
rcu: Create an unrcu_pointer() to remove __rcu from a pointer
srcu: Early test SRCU polling start
rcu: Fix various typos in comments
rcu/nocb: Unify timers
rcu/nocb: Prepare for fine-grained deferred wakeup
rcu/nocb: Only cancel nocb timer if not polling
rcu/nocb: Delete bypass_timer upon nocb_gp wakeup
rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup
rcu/nocb: Allow de-offloading rdp leader
rcu/nocb: Directly call __wake_nocb_gp() from bypass timer
rcu: Don't penalize priority boosting when there is nothing to boost
rcu: Point to documentation of ordering guarantees
rcu: Make rcu_gp_cleanup() be noinline for tracing
rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs
rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP
...
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts, softirqs
and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail what
sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event.
This has been hardcoded as "1" since 2015, no tooling should be looking
at it now. If one exists, we can revert this commit, fix that tool and
try to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes trace
events to write to console. When user space starts, this can easily live
lock the system. Having a boot option to stop just after boot up is
useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that match
the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint
without triggering a WARN*() if it already exists. BPF has a path from
user space that can do this. All other paths are considered a bug.
- Small clean ups and fixes
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYN8YPhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qhxLAP9Mo5hHv7Hg6W7Ddv77rThm+qclsMR/
yW0P+eJpMm4+xAD8Cq03oE1DimPK+9WZBKU5rSqAkqG6CjgDRw6NlIszzQQ=
=WEPR
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts,
softirqs and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail
what sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event. This has
been hardcoded as "1" since 2015, no tooling should be looking at it
now. If one exists, we can revert this commit, fix that tool and try
to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes
trace events to write to console. When user space starts, this can
easily live lock the system. Having a boot option to stop just after
boot up is useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that
match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint
without triggering a WARN*() if it already exists. BPF has a path
from user space that can do this. All other paths are considered a
bug.
- Small clean ups and fixes
* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
tracing: Simplify & fix saved_tgids logic
treewide: Add missing semicolons to __assign_str uses
tracing: Change variable type as bool for clean-up
trace/timerlat: Fix indentation on timerlat_main()
trace/osnoise: Make 'noise' variable s64 in run_osnoise()
tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
Documentation: Fix a typo on trace/osnoise-tracer
trace/osnoise: Fix return value on osnoise_init_hotplug_support
trace/osnoise: Make interval u64 on osnoise_main
trace/osnoise: Fix 'no previous prototype' warnings
tracing: Have osnoise_main() add a quiescent state for task rcu
seq_buf: Make trace_seq_putmem_hex() support data longer than 8
seq_buf: Fix overflow in seq_buf_putmem_hex()
trace/osnoise: Support hotplug operations
trace/hwlat: Support hotplug operations
trace/hwlat: Protect kdata->kthread with get/put_online_cpus
trace: Add timerlat tracer
trace: Add osnoise tracer
...
This Kselftest update for Linux 5.14-rc1 consists of fixes to
existing tests and framework:
-- migrate sgx test to kselftest harness
-- add new test cases to sgx test
-- ftrace test fix event-no-pid on 1-core machine
-- splice test adjust for handler fallback removal
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmDfQGUACgkQCwJExA0N
Qxw6hw/+OEBEs+eHlDbbxHb9fBEg8kWls0AJKLAD2DEJOt1eARzyVa4J1MUpS6yK
jJi/k0wKvqhMbdQgEEL3oSVvr9JgmOell0OkLzK9tx4HgEou89GnuO+wVRAeG87o
QGDVWOa76GZwC47rvDDt9i9io075O1fol9HPgPZIdyVFcFgu45PVmRoSK4vaKUpm
m5VtEznCxcL60vD+u2XhbBO2QMUi4OoBrNdJpnkBx/U6iCv5ZQbpN3OoDgdhltXC
raQChLmy9bi2uYXskklcY1xGftyUiiWwfaiz7w6a65C2DedsUoD1lwhnaLlsJmSh
KTOx9r5xF5bgRsdbWefIc7yRk9nVNczVSRs+x8aGlXT3p+e9jwaSXWXvCaZowYOV
M+R0nCgmpOoGl5jhgeeu+P+JyC7aRRtMhVTGOiuUzyAPZbaIur1JVjoHpRHCVZle
XjkCbq+VZ7Qb6pZ5sM+ve20GIw+J8/pDc4qtpttqb+hLtwe5rJj+Ydw2GQ/yZ30c
uFwfdvGQMVMHjvPaL8IQb6WE3tiBhvJiUYQaRdR83HaWFstI14fFHgm7Zed9I19b
TAgoDZ4P08cROtTbopbjRX1B3cnx/sUrFjwW8cTstIkedzo3rHbdf4UvV/h96O+S
BjrZfsbEQBaCLW8KNu9wO/id+sPXeYwuW4HCkE0R0H1ZJI4Fgio=
=6gOE
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-next-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest update from Shuah Khan:
"Fixes to existing tests and framework:
- migrate sgx test to kselftest harness
- add new test cases to sgx test
- ftrace test fix event-no-pid on 1-core machine
- splice test adjust for handler fallback removal"
* tag 'linux-kselftest-next-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/sgx: remove checks for file execute permissions
selftests/ftrace: fix event-no-pid on 1-core machine
selftests/sgx: Refine the test enclave to have storage
selftests/sgx: Add EXPECT_EEXIT() macro
selftests/sgx: Dump enclave memory map
selftests/sgx: Migrate to kselftest harness
selftests/sgx: Rename 'eenter' and 'sgx_call_vdso'
selftests: timers: rtcpie: skip test if default RTC device does not exist
selftests: lib.mk: Also install "config" and "settings"
selftests: splice: Adjust for handler fallback removal
selftests/tls: Add {} to avoid static checker warning
selftests/resctrl: Fix incorrect parsing of option "-t"
This KUnit update for Linux 5.14-rc1 consists of fixes and features:
-- add support for skipped tests
-- introduce kunit_kmalloc_array/kunit_kcalloc() helpers
-- add gnu_printf specifiers
-- add kunit_shutdown
-- add unit test for filtering suites by names
-- convert lib/test_list_sort.c to use KUnit
-- code organization moving default config to tools/testing/kunit
-- refactor of internal parser input handling
-- cleanups and updates to documentation
-- code cleanup related to casts
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmDfNHgACgkQCwJExA0N
QxzU5g/+ORbcbE5jhumo82xYUCGTScgIqPllNO000Vk9xydMqwx6tpY5puV0ig2z
5X7pxKEQTnype58yzY5mS1p336ryIx3q+AwEownxTW3YurXo2naQ59ZPjEvlAV+E
h+DMYLzEjrxzcyJATw02uC9YLUZ3w6FPJfiXViIv93YYrtcnM0u6JpwG0yfBlI4v
4KKB2Xu4K7T90C9/ADFYFKX3mjXQl5fQwvIdtA7wS90Cgq52LKp2mvg1XEiZE5d+
0dkTZ4Zo8TxxHt665o7vfnUjQQNmh45iGlW65wONxfAPb8BPoneGjVKWQTnN0Hor
W+93ZPbMuFMSSKJuoHY9U7sP5VySKvaiIYaGdi6prnZZu0zUabKnLZ6FOy7kEdfs
v09ulCBTVLslixVgNcp/kD9T+G/SXwCF5YBMAiMDQ0GNfUqlFtBkEA3gd44KwMI0
KwCcOgUSiaCkqyzOz/VeQsu/nhA5jdMO0KjiAs7Z3e7r7O/qKFs/ll7hZgDNCWSC
q8eIrcBkSL0EGgXR1iZ4AtGm8op6KKd4ACBM8NdtTyoGFl1npZOgZnHoIsy35G9K
9mhc7eXSoaDGqy9dONL1Tc8Neg7qLTXQNp2radqsnAAgNPUrJuC7+8YC+DdIsjBH
W7OyMjpfbwPws5rP4CS+JdwL+nQprKXZvFIhWGYhkDK44MbOngw=
=5QAv
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-fixes-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit update from Shuah Khan:
"Fixes and features:
- add support for skipped tests
- introduce kunit_kmalloc_array/kunit_kcalloc() helpers
- add gnu_printf specifiers
- add kunit_shutdown
- add unit test for filtering suites by names
- convert lib/test_list_sort.c to use KUnit
- code organization moving default config to tools/testing/kunit
- refactor of internal parser input handling
- cleanups and updates to documentation
- code cleanup related to casts"
* tag 'linux-kselftest-kunit-fixes-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits)
kunit: add unit test for filtering suites by names
kasan: test: make use of kunit_skip()
kunit: test: Add example tests which are always skipped
kunit: tool: Support skipped tests in kunit_tool
kunit: Support skipped tests
thunderbolt: test: Reinstate a few casts of bitfields
kunit: tool: internal refactor of parser input handling
lib/test: convert lib/test_list_sort.c to use KUnit
kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers
kunit: Remove the unused all_tests.config
kunit: Move default config from arch/um -> tools/testing/kunit
kunit: arch/um/configs: Enable KUNIT_ALL_TESTS by default
kunit: Add gnu_printf specifiers
lib/cmdline_kunit: Remove a cast which are no-longer required
kernel/sysctl-test: Remove some casts which are no-longer required
thunderbolt: test: Remove some casts which are no longer required
mmc: sdhci-of-aspeed: Remove some unnecessary casts from KUnit tests
iio: Remove a cast in iio-test-format which is no longer required
device property: Remove some casts in property-entry-test
Documentation: kunit: Clean up some string casts in examples
...
- A big series refactoring parts of our KVM code, and converting some to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li,
Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique
Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley,
Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun
Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain,
YueHaibing, Zhang Jianhua, Zhen Lei.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDfFS4THG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFxHEAC88NJ+Gz87LiTQFt6QjhziBaJUd+sY
uqADPRROr4P50O8PjYZbMi2qbXzOlLkZO4wJWX7jpZ1F9KmbPNqY2shD8h4ahyge
F/uqzBW1FXBJfnDEKdU2MzalkeTP+dwxLZyouUamjDCGNLFjOV4x/Fft5otOdXjO
k9uO6yoGyOkWYzjC+Y/irNPlIDDByB/+bD92Cb52Y2mXMDDEnx4JzbtkeJW+8udT
Sjn3bWzeL+dz5GehjMKwK4+SptNiyQGOgM8FwtnKUMvgzxv04DqCGjr9YC12L2Z7
VoFZc4GzVgtf8DZg4fJ3KG5aG2nH3Tui7jc9lUckdrxixDAZw5wSG7CQ39gFb/5+
7A4fEJk4Z3h5llibwxAZrC7wV8ZDDXn8oRFzRcOJjfxYaD+ohOOyWHIebwkdiXYx
nfYI7sBcScDLXeBvHtDra2GJpbFSpVL3S/QNhhi1vKVNrFSyAgbAybcVL2xPLZ6+
8Mh7A8xt+hf2bo9AXuYJDo9mwXWfg1093d0kT+AslcRhZioBk18c2AiZLIz0FzuL
Ua/e5FPb99x9LSdcZHvaAXBoHT2iTgDyCyDa3gkIesyuRX6ggHoFcVQuvdDcbJ9d
H8LK+Tahy1Y+E5b6KdtU8mDEGE+QG+CWLnwQ6YSCaL/MYgaFzNa32Jdj1fmztSBC
cttP43kHZ7ljTw==
=zo4d
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- A big series refactoring parts of our KVM code, and converting some
to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on
Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.
* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
powerpc: Only build restart_table.c for 64s
powerpc/64s: move ret_from_fork etc above __end_soft_masked
powerpc/64s/interrupt: clean up interrupt return labels
powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
powerpc/64: enable MSR[EE] in irq replay pt_regs
powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
powerpc/64s: add a table of implicit soft-masked addresses
powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
powerpc/64e: fix CONFIG_RELOCATABLE build warnings
powerpc/64s: fix hash page fault interrupt handler
powerpc/4xx: Fix setup_kuep() on SMP
powerpc/32s: Fix setup_{kuap/kuep}() on SMP
powerpc/interrupt: Use names in check_return_regs_valid()
powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
powerpc/ptrace: Refactor regs_set_return_{msr/ip}
powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
powerpc/pseries/vas: Include irqdomain.h
powerpc: mark local variables around longjmp as volatile
...
Merge more updates from Andrew Morton:
"190 patches.
Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
signals, exec, kcov, selftests, compress/decompress, and ipc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
ipc/util.c: use binary search for max_idx
ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
ipc: use kmalloc for msg_queue and shmid_kernel
ipc sem: use kvmalloc for sem_undo allocation
lib/decompressors: remove set but not used variabled 'level'
selftests/vm/pkeys: exercise x86 XSAVE init state
selftests/vm/pkeys: refill shadow register after implicit kernel write
selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
kcov: add __no_sanitize_coverage to fix noinstr for all architectures
exec: remove checks in __register_bimfmt()
x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
hfsplus: report create_date to kstat.btime
hfsplus: remove unnecessary oom message
nilfs2: remove redundant continue statement in a while-loop
kprobes: remove duplicated strong free_insn_page in x86 and s390
init: print out unknown kernel parameters
checkpatch: do not complain about positive return values starting with EPOLL
checkpatch: improve the indented label test
checkpatch: scripts/spdxcheck.py now requires python3
...
Pull cgroup updates from Tejun Heo:
- cgroup.kill is added which implements atomic killing of the whole
subtree.
Down the line, this should be able to replace the multiple userland
implementations of "keep killing till empty".
- PSI can now be turned off at boot time to avoid overhead for
configurations which don't care about PSI.
* 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: make per-cgroup pressure stall tracking configurable
cgroup: Fix kernel-doc
cgroup: inline cgroup_task_freeze()
tests/cgroup: test cgroup.kill
tests/cgroup: move cg_wait_for(), cg_prepare_for_wait()
tests/cgroup: use cgroup.kill in cg_killall()
docs/cgroup: add entry for cgroup.kill
cgroup: introduce cgroup.kill
TCP connections in UNREPLIED state (only SYN seen) can be kept alive
indefinitely, as each SYN re-sets the timeout.
This means that even if a peer has closed its socket the entry
never times out.
This also prevents re-evaluation of configured NAT rules.
Add a test case that sets SYN timeout to 10 seconds, then check
that the nat redirection added later eventually takes effect.
This is based off a repro script from Antonio Ojea.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
On x86, there is a set of instructions used to save and restore register
state collectively known as the XSAVE architecture. There are about a
dozen different features managed with XSAVE. The protection keys
register, PKRU, is one of those features.
The hardware optimizes XSAVE by tracking when the state has not changed
from its initial (init) state. In this case, it can avoid the cost of
writing state to memory (it would usually just be a bunch of 0's).
When the pkey register is 0x0 the hardware optionally choose to track the
register as being in the init state (optimize away the writes). AMD CPUs
do this more aggressively compared to Intel.
On x86, PKRU is rarely in its (very permissive) init state. Instead, the
value defaults to something very restrictive. It is not surprising that
bugs have popped up in the rare cases when PKRU reaches its init state.
Add a protection key selftest which gets the protection keys register into
its init state in a way that should work on Intel and AMD. Then, do a
bunch of pkey register reads to watch for inadvertent changes.
This adds "-mxsave" to CFLAGS for all the x86 vm selftests in order to
allow use of the XSAVE instruction __builtin functions. This will make
the builtins available on all of the vm selftests, but is expected to be
harmless.
Link: https://lkml.kernel.org/r/20210611164202.1849B712@viggo.jf.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The pkey test code keeps a "shadow" of the pkey register around. This
ensures that any bugs which might write to the register can be caught more
quickly.
Generally, userspace has a good idea when the kernel is going to write to
the register. For instance, alloc_pkey() is passed a permission mask.
The caller of alloc_pkey() can update the shadow based on the return value
and the mask.
But, the kernel can also modify the pkey register in a more sneaky way.
For mprotect(PROT_EXEC) mappings, the kernel will allocate a pkey and
write the pkey register to create an execute-only mapping. The kernel
never tells userspace what key it uses for this.
This can cause the test to fail with messages like:
protection_keys_64.2: pkey-helpers.h:132: _read_pkey_reg: Assertion `pkey_reg == shadow_pkey_reg' failed.
because the shadow was not updated with the new kernel-set value.
Forcibly update the shadow value immediately after an mprotect().
Link: https://lkml.kernel.org/r/20210611164200.EF76AB73@viggo.jf.intel.com
Fixes: 6af17cf89e ("x86/pkeys/selftests: Add PROT_EXEC test")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The alloc_pkey() sefltest function wraps the sys_pkey_alloc() system call.
On success, it updates its "shadow" register value because
sys_pkey_alloc() updates the real register.
But, the success check is wrong. pkey_alloc() considers any non-zero
return code to indicate success where the pkey register will be modified.
This fails to take negative return codes into account.
Consider only a positive return value as a successful call.
Link: https://lkml.kernel.org/r/20210611164157.87AB4246@viggo.jf.intel.com
Fixes: 5f23f6d082 ("x86/pkeys: Add self-tests")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "selftests/vm/pkeys: Bug fixes and a new test".
There has been a lot of activity on the x86 front around the XSAVE
architecture which is used to context-switch processor state (among other
things). In addition, AMD has recently joined the protection keys club by
adding processor support for PKU.
The AMD implementation helped uncover a kernel bug around the PKRU "init
state", which actually applied to Intel's implementation but was just
harder to hit. This series adds a test which is expected to help find
this class of bug both on AMD and Intel. All the work around pkeys on x86
also uncovered a few bugs in the selftest.
This patch (of 4):
The "random" pkey allocation code currently does the good old:
srand((unsigned int)time(NULL));
*But*, it unfortunately does this on every random pkey allocation.
There may be thousands of these a second. time() has a one second
resolution. So, each time alloc_random_pkey() is called, the PRNG is
*RESET* to time(). This is nasty. Normally, if you do:
srand(<ANYTHING>);
foo = rand();
bar = rand();
You'll be quite guaranteed that 'foo' and 'bar' are different. But, if
you do:
srand(1);
foo = rand();
srand(1);
bar = rand();
You are quite guaranteed that 'foo' and 'bar' are the *SAME*. The recent
"fix" effectively forced the test case to use the same "random" pkey for
the whole test, unless the test run crossed a second boundary.
Only run srand() once at program startup.
This explains some very odd and persistent test failures I've been seeing.
Link: https://lkml.kernel.org/r/20210611164153.91B76FB8@viggo.jf.intel.com
Link: https://lkml.kernel.org/r/20210611164155.192D00FF@viggo.jf.intel.com
Fixes: 6e373263ce ("selftests/vm/pkeys: fix alloc_random_pkey() to make it really random")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's add a simple test for MADV_POPULATE_READ and MADV_POPULATE_WRITE,
verifying some error handling, that population works, and that softdirty
tracking works as expected. For now, limit the test to private anonymous
memory.
Link: https://lkml.kernel.org/r/20210419135443.12822-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enable test_uffdio_minor for test_type == TEST_SHMEM, and modify the test
slightly to pass in / check for the right feature flags.
Link: https://lkml.kernel.org/r/20210503180737.2487560-11-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the context (fds, mmap-ed areas, etc.) are global. Each test
mutates this state in some way, in some cases really "clobbering it"
(e.g., the events test mremap-ing area_dst over the top of area_src, or
the minor faults tests overwriting the count_verify values in the test
areas). We run the tests in a particular order, each test is careful to
make the right assumptions about its starting state, etc.
But, this is fragile. It's better for a test's success or failure to not
depend on what some other prior test case did to the global state.
To that end, clear and reinitialize the test context at the start of each
test case, so whatever prior test cases did doesn't affect future tests.
This is particularly relevant to this series because the events test's
mremap of area_dst screws up assumptions the minor fault test was relying
on. This wasn't a problem for hugetlb, as we don't mremap in that case.
[peterx@redhat.com: fix conflict between this patch and the uffd pagemap series]
Link: https://lkml.kernel.org/r/YKQqKrl+/cQ1utrb@t490s
Link: https://lkml.kernel.org/r/20210503180737.2487560-10-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Previously, we just allocated two shm areas: area_src and area_dst. With
this commit, change this so we also allocate area_src_alias, and
area_dst_alias.
area_*_alias and area_* (respectively) point to the same underlying
physical pages, but are different VMAs. In a future commit in this
series, we'll leverage this setup to exercise minor fault handling support
for shmem, just like we do in the hugetlb_shared test.
Link: https://lkml.kernel.org/r/20210503180737.2487560-9-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a preparatory commit. In the future, we want to be able to setup
alias mappings for area_src and area_dst in the shmem test, like we do in
the hugetlb_shared test. With a VMA obtained via mmap(MAP_ANONYMOUS |
MAP_SHARED), it isn't clear how to do this.
So, mmap() with an fd, so we can create alias mappings. Use memfd_create
instead of actually passing in a tmpfs path like hugetlb does, since it's
more convenient / simpler to run, and works just as well.
Future commits will:
1. Setup the alias mappings.
2. Extend our tests to actually take advantage of this, to test new
userfaultfd behavior being introduced in this series.
Also, a small fix in the area we're changing: when the hugetlb setup fails
in main(), pass in the right argv[] so we actually print out the hugetlb
file path.
Link: https://lkml.kernel.org/r/20210503180737.2487560-8-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add one anonymous specific test to start using pagemap. With pagemap
support, we can directly read the uffd-wp bit from pgtable without
triggering any fault, so it's easier to do sanity checks in unit tests.
Meanwhile this test also leverages the newly introduced MADV_PAGEOUT
madvise function to test swap ptes with uffd-wp bit set, and across
fork()s.
Link: https://lkml.kernel.org/r/20210428225030.9708-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce err()/_err() and replace all the different ways to fail the
program, mostly "fprintf" and "perror" with tons of exit() calls. Always
stop the test program at any failure.
Link: https://lkml.kernel.org/r/20210412232753.1012412-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WP and MINOR modes are conditionally enabled on specific memory types.
This patch avoids dumping tons of zeros for those cases when the modes are
not supported at all.
Link: https://lkml.kernel.org/r/20210412232753.1012412-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It tries to check against all zeros and looped for quite a few times.
However after that we'll verify the same page with count_verify, while
count_verify can never be zero. So it means if it's a zero page we'll
detect it anyways with below code.
There's yet another place we conditionally check the fault flag - just do
it unconditionally.
Link: https://lkml.kernel.org/r/20210412232753.1012412-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There seems to have no guarantee that time() will return the same for the
two calls even if there's no delay, e.g. when a fault is accidentally
crossing the changing of a second. Meanwhile, this message is also not
helping that much since delay could happen with a lot of reasons, e.g.,
schedule latency of resolving thread. It may not mean an issue with uffd.
Neither do I saw this error triggered either in the past runs. Even if it
triggers, it'll be drown in all the rest of test logs. Remove it.
Link: https://lkml.kernel.org/r/20210412232753.1012412-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "userfaultfd/selftests: A few cleanups", v2.
I wanted to cleanup userfaultfd.c fault handling for a long time. If it's
not cleaned, when the new code grows the file it'll also grow the size
that needs to be cleaned... This is my attempt to cleanup the userfaultfd
selftest on fault handling, to use an err() macro instead of either
fprintf() or perror() then another exit() call.
The huge cleanup is done in the last patch. The first 4 patches are some
other standalone cleanups for the same file, so I put them together.
This patch (of 5):
Userfaultfd selftest does not need to handle kernel initiated fault. Set
user mode so it can be run even if unprivileged_userfaultfd=0 (which is
the default).
Link: https://lkml.kernel.org/r/20210412232753.1012412-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The debug_cow attribute had been removed since commit 4958e4d86e ("mm:
thp: remove debug_cow switch"), so remove it in selftest code too,
otherwise the khugepaged test will fail.
Link: https://lkml.kernel.org/r/20210430051117.400189-1-sunnanyong@huawei.com
Fixes: 4958e4d86e ("mm: thp: remove debug_cow switch")
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener
to another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance
(for pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require
jump labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast address
allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks
in NAPI context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
(our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm 60GHz WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S
Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX
M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE
mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS
OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/
w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc
HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/
io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y
5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF
78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh
Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu
hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4=
=LvtX
-----END PGP SIGNATURE-----
Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener to
another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance (for
pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require jump
labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast
address allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks in NAPI
context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and
NXP (our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support"
* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYNnKewAKCRCRxhvAZXjc
oo/DAQCgKsDJTSht/QXuA0bMqdsQW27AWFfKacbk5lY4EjXz1gD/ZsYU2Si1fgkB
7mEl32JsfgcIBv0VdIulAh2F29Fa0A0=
=/b8l
-----END PGP SIGNATURE-----
Merge tag 'fs.openat2.unknown_flags.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull openat2 fixes from Christian Brauner:
- Remove the unused VALID_UPGRADE_FLAGS define we carried from an
extension to openat2() that we haven't merged. Aleksa might be
getting back to it at some point but just not right now.
- openat2() used to accidently ignore unknown flag values in the upper
32 bits.
The new openat2() syscall verifies that no unknown O-flag values are
set and returns an error to userspace if they are while the older
open syscalls like open() and openat() simply ignore unknown flag
values:
#define O_FLAG_CURRENTLY_INVALID (1 << 31)
struct open_how how = {
.flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID,
.resolve = 0,
};
/* fails */
fd = openat2(-EBADF, "/dev/null", &how, sizeof(how));
/* succeeds */
fd = openat(-EBADF, "/dev/null", O_RDONLY | O_FLAG_CURRENTLY_INVALID);
However, openat2() silently truncates the upper 32 bits meaning:
#define O_FLAG_CURRENTLY_INVALID_LOWER32 (1 << 31)
#define O_FLAG_CURRENTLY_INVALID_UPPER32 (1 << 40)
struct open_how how_lowe32 = {
.flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWER32,
};
struct open_how how_upper32 = {
.flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_UPPER32,
};
/* fails */
fd = openat2(-EBADF, "/dev/null", &how_lower32, sizeof(how_lower32));
/* succeeds */
fd = openat2(-EBADF, "/dev/null", &how_upper32, sizeof(how_upper32));
Fix this by preventing the immediate truncation in build_open_flags()
and add a compile-time check to catch when we add flags in the upper
32 bit range.
* tag 'fs.openat2.unknown_flags.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
test: add openat2() test for invalid upper 32 bit flag value
open: don't silently ignore unknown O-flags in openat2()
fcntl: remove unused VALID_UPGRADE_FLAGS
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYNnKNAAKCRCRxhvAZXjc
ok/HAQDz3FK3/yeqxH6OLyCedUcD+YBFPPzrfqX+3y6q3z5tGgD9GAGxFXWcMFA2
/cbfmizwh1eJ3WMnbHUp7x6ogpQhWwQ=
=PpNs
-----END PGP SIGNATURE-----
Merge tag 'fs.mount_setattr.nosymfollow.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull mount_setattr updates from Christian Brauner:
"A few releases ago the old mount API gained support for a mount
options which prevents following symlinks on a given mount. This adds
support for it in the new mount api through the MOUNT_ATTR_NOSYMFOLLOW
flag via mount_setattr() and fsmount(). With mount_setattr() that flag
can even be applied recursively.
There's an additional ack from Ross Zwisler who originally authored
the nosymfollow patch. As I've already had the patches in my for-next
I didn't add his ack explicitly"
* tag 'fs.mount_setattr.nosymfollow.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
tests: test MOUNT_ATTR_NOSYMFOLLOW with mount_setattr()
mount: Support "nosymfollow" in new mount api
Merge misc updates from Andrew Morton:
"191 patches.
Subsystems affected by this patch series: kthread, ia64, scripts,
ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab,
slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap,
mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization,
pagealloc, and memory-failure)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits)
mm,hwpoison: make get_hwpoison_page() call get_any_page()
mm,hwpoison: send SIGBUS with error virutal address
mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes
mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
docs: remove description of DISCONTIGMEM
arch, mm: remove stale mentions of DISCONIGMEM
mm: remove CONFIG_DISCONTIGMEM
m68k: remove support for DISCONTIGMEM
arc: remove support for DISCONTIGMEM
arc: update comment about HIGHMEM implementation
alpha: remove DISCONTIGMEM and NUMA
mm/page_alloc: move free_the_page
mm/page_alloc: fix counting of managed_pages
mm/page_alloc: improve memmap_pages dbg msg
mm: drop SECTION_SHIFT in code comments
mm/page_alloc: introduce vm.percpu_pagelist_high_fraction
mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
mm/page_alloc: scale the number of pages that are batch freed
...
Trivial conflict in net/netfilter/nf_tables_api.c.
Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.
skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Consolidate the macros for .byte ... opcode sequences
- Deduplicate register offset defines in include files
- Simplify the ia32,x32 compat handling of the related syscall tables to
get rid of #ifdeffery.
- Clear all EFLAGS which are not required for syscall handling
- Consolidate the syscall tables and switch the generation over to the
generic shell script and remove the CFLAGS tweaks which are not longer
required.
- Use 'int' type for system call numbers to match the generic code.
- Add more selftests for syscalls
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmDbKzMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoae8D/9+pksdf8lE5dRLtngSeTDLiyIV+qq4
vSks7XfrTTAhOV2nRwtIulc2CO6H7jcvn6ehmiC/X0Tn9JK5brwSJJYryNEjA3cp
3p9jPrB1w1SDhx35JzILN4DDaJfI3jobLSLDq0KQzuEL0+c0R4l3WBplpCzbLjqj
NaFQgslf8RSnjha9NLTKzlzSaNNNo9Ioo6DyrsBDEdcRBtAPlFfdVtT3oJE73ANH
dK5POoVWysmAnDAwEW17j9bBJLtxeWsrhM9CrtqvcKr3HhK9WjWUFAr+diQf5GKf
BAD2A+5y8wZQXvFOuC9WZxfQwUFSLExt8BfcXblOUbf2CdlvoYVzOlvI141kA++4
q4wQ1vl6MbLCp6wLysc3bnwKUEmnf2E4Iyj5JR2aFrw096pAoZ3ZbAQi7s3Vhb16
aSbGxIw3rHRuB0f8VmOA0iEHiXlkRmE/K+nH1/uDTUZLaDpktPvpKQJsp0+9qXFk
eVtEw4bVKJ7q5ozjMzpm9aPxPp1v8MGxUOJOy80W7Ti+vBp2KmMKc1gy8QsYrTvW
Vzvpp3U+/WFh2X7AG0zlP/JEnOuJmMwMK5QhzMC2rEbaHJ66ht7SABvtSbOHHw5Z
zugxTE0lx3n7izCxW1RLEu//xtWY0FbU2L5oE2Ace27myUPeBQCDJzynUn93dMM9
9nq2TtgTCF6XvA==
=+sb9
-----END PGP SIGNATURE-----
Merge tag 'x86-entry-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 entry code related updates from Thomas Gleixner:
- Consolidate the macros for .byte ... opcode sequences
- Deduplicate register offset defines in include files
- Simplify the ia32,x32 compat handling of the related syscall tables
to get rid of #ifdeffery.
- Clear all EFLAGS which are not required for syscall handling
- Consolidate the syscall tables and switch the generation over to the
generic shell script and remove the CFLAGS tweaks which are not
longer required.
- Use 'int' type for system call numbers to match the generic code.
- Add more selftests for syscalls
* tag 'x86-entry-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/syscalls: Don't adjust CFLAGS for syscall tables
x86/syscalls: Remove -Wno-override-init for syscall tables
x86/uml/syscalls: Remove array index from syscall initializers
x86/syscalls: Clear 'offset' and 'prefix' in case they are set in env
x86/entry: Use int everywhere for system call numbers
x86/entry: Treat out of range and gap system calls the same
x86/entry/64: Sign-extend system calls on entry to int
selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64
selftests/x86/syscall: Simplify message reporting in syscall_numbering
selftests/x86/syscall: Update and extend syscall_numbering_64
x86/syscalls: Switch to generic syscallhdr.sh
x86/syscalls: Use __NR_syscalls instead of __NR_syscall_max
x86/unistd: Define X32_NR_syscalls only for 64-bit kernel
x86/syscalls: Stop filling syscall arrays with *_sys_ni_syscall
x86/syscalls: Switch to generic syscalltbl.sh
x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmDa2mEACgkQUqAMR0iA
lPKBiBAAgvhNnaRVR6/GBVrv5jYM8obJM7PHPxp8dh+ZRb1mDyL1ZDU2r7lmQjMD
ORBN5eK6pXk/gVabXR5lY0B7vQ8phJmYO98Lk2E3n9ZTgMkTHQ5WOHzBpt93gd/y
l9m00ZD0YcHrkmM1fq73FuZVEMzPk85cjTe8n6JeHJgSAdoOY/rl61Cn57ZHFIa4
DkpdNGkJaf77UIWOc8NLAXOdSD9NxSGycHXpU0q8QO9UFq+Le2qN4OPj3S1CNCO2
ciy+VcW1VQ/BesPPlBIk3ImHWPS4ty3n4EYFzNm+saElIaWxyhNBXAvcBAK/x9LK
3QibfBFwbS3sllhnk96Z24UaWWMg2AygbV2aqd3xMLpW3XD6q/MVnWGHfayhnmYj
aNcWpldIjwDH4iZJ5vnD4KewQpYp+Jc5Hqv6UyIf1O8nEvvQubrDXjSDLLcbZFI1
m2cs9DTc5ezyX/DifBsViDbw8hPjJg7QAbRjVk1EfVQrH090mRQoSoQQI4QtuMEi
pPiTALNG1HRKIoYrKxQMB43JvZ1zjaSbtNbW4JJ9Bm3kxFZ/Oa8NXzE5BOjeykZm
bCePtc018GZaGNW0z/Zr460c/Tuaj8fZSzUOj9Xnw5Hv4G3W5+5EqDy7sU/GTPjL
It9rAZYo+cM9vp1BD2343YPZgnChWHaW0BF/WDqFAhLd9av/WKI=
=Oa1c
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add %pt[RT]s modifier to vsprintf(). It overrides ISO 8601 separator
by using ' ' (space). It produces "YYYY-mm-dd HH:MM:SS" instead of
"YYYY-mm-ddTHH:MM:SS".
- Correctly parse long row of numbers by sscanf() when using the field
width. Add extensive sscanf() selftest.
- Generalize re-entrant CPU lock that has already been used to
serialize dump_stack() output. It is part of the ongoing printk
rework. It will allow to remove the obsoleted printk_safe buffers and
introduce atomic consoles.
- Some code clean up and sparse warning fixes.
* tag 'printk-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: fix cpu lock ordering
lib/dump_stack: move cpu lock to printk.c
printk: Remove trailing semicolon in macros
random32: Fix implicit truncation warning in prandom_seed_state()
lib: test_scanf: Remove pointless use of type_min() with unsigned types
selftests: lib: Add wrapper script for test_scanf
lib: test_scanf: Add tests for sscanf number conversion
lib: vsprintf: Fix handling of number field widths in vsscanf
lib: vsprintf: scanf: Negative number must have field width > 1
usb: host: xhci-tegra: Switch to use %ptTs
nilfs2: Switch to use %ptTs
kdb: Switch to use %ptTs
lib/vsprintf: Allow to override ISO 8601 date and time separator
Patch series "mm/gup: Fix pin page write cache bouncing on has_pinned", v2.
This series contains 3 patches, the 1st one enables threading for
gup_benchmark in the kselftest. The latter two patches are collected from
Andrea's local branch which can fix write cache bouncing issue with
pinning fast-gup.
To be explicit on the latter two patches:
- the 2nd patch fixes the perf degrade when introducing has_pinned, then
- the last patch tries to remove the has_pinned with a bit in mm->flags
For patch 3: originally I think we had a plan to reuse has_pinned into a
counter very soon, however that's not happening at least until today, so
maybe it proves that we can remove it until we really want such a counter
for whatever reason. As the commit message stated, it saves 4 bytes for
each mm without observable regressions.
Regarding testing: we can reference to the commit message of patch 2 for
some detailed testing with will-is-scale. Meanwhile I did patch 1 just
because then we can even easily verify the patchset using the existing
kselftest facilities or even regress test it in the future with the repo
if we want.
Below numbers are extra verification tests that I did besides commit
message of patch 2 using the new gup_benchmark and 256 cpus. Below test
is done on 40 cpus host with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz,
and I can get similar result (of course the write cache bouncing get
severe with even more cores).
After patch 1 applied (only test patch, so using old kernel):
$ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us
PIN_FAST_BENCHMARK: Time: get:461967 put:5840 us
PIN_FAST_BENCHMARK: Time: get:464521 put:6140 us
PIN_FAST_BENCHMARK: Time: get:465176 put:7100 us
PIN_FAST_BENCHMARK: Time: get:465960 put:6733 us
PIN_FAST_BENCHMARK: Time: get:465324 put:6781 us
PIN_FAST_BENCHMARK: Time: get:466018 put:7130 us
PIN_FAST_BENCHMARK: Time: get:466362 put:7118 us
PIN_FAST_BENCHMARK: Time: get:465118 put:6975 us
PIN_FAST_BENCHMARK: Time: get:466422 put:6602 us
PIN_FAST_BENCHMARK: Time: get:465791 put:6818 us
PIN_FAST_BENCHMARK: Time: get:467091 put:6298 us
PIN_FAST_BENCHMARK: Time: get:467694 put:5432 us
PIN_FAST_BENCHMARK: Time: get:469575 put:5581 us
PIN_FAST_BENCHMARK: Time: get:468124 put:6055 us
PIN_FAST_BENCHMARK: Time: get:468877 put:6720 us
PIN_FAST_BENCHMARK: Time: get:467212 put:4961 us
PIN_FAST_BENCHMARK: Time: get:467834 put:6697 us
PIN_FAST_BENCHMARK: Time: get:470778 put:6398 us
PIN_FAST_BENCHMARK: Time: get:469788 put:6310 us
PIN_FAST_BENCHMARK: Time: get:488277 put:7113 us
PIN_FAST_BENCHMARK: Time: get:486613 put:7085 us
PIN_FAST_BENCHMARK: Time: get:486940 put:7202 us
PIN_FAST_BENCHMARK: Time: get:488728 put:7101 us
PIN_FAST_BENCHMARK: Time: get:487570 put:7327 us
PIN_FAST_BENCHMARK: Time: get:489260 put:7027 us
PIN_FAST_BENCHMARK: Time: get:488846 put:6866 us
PIN_FAST_BENCHMARK: Time: get:488521 put:6745 us
PIN_FAST_BENCHMARK: Time: get:489950 put:6459 us
PIN_FAST_BENCHMARK: Time: get:489777 put:6617 us
PIN_FAST_BENCHMARK: Time: get:488224 put:6591 us
PIN_FAST_BENCHMARK: Time: get:488644 put:6477 us
PIN_FAST_BENCHMARK: Time: get:488754 put:6711 us
PIN_FAST_BENCHMARK: Time: get:488875 put:6743 us
PIN_FAST_BENCHMARK: Time: get:489290 put:6657 us
PIN_FAST_BENCHMARK: Time: get:490264 put:6684 us
PIN_FAST_BENCHMARK: Time: get:489631 put:6737 us
PIN_FAST_BENCHMARK: Time: get:488434 put:6655 us
PIN_FAST_BENCHMARK: Time: get:492213 put:6297 us
PIN_FAST_BENCHMARK: Time: get:491124 put:6173 us
After the whole series applied (new fixed kernel):
$ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us
PIN_FAST_BENCHMARK: Time: get:82144 put:6817 us
PIN_FAST_BENCHMARK: Time: get:83417 put:6674 us
PIN_FAST_BENCHMARK: Time: get:82540 put:6594 us
PIN_FAST_BENCHMARK: Time: get:83214 put:6681 us
PIN_FAST_BENCHMARK: Time: get:83444 put:6889 us
PIN_FAST_BENCHMARK: Time: get:83194 put:7499 us
PIN_FAST_BENCHMARK: Time: get:84876 put:7369 us
PIN_FAST_BENCHMARK: Time: get:86092 put:10289 us
PIN_FAST_BENCHMARK: Time: get:86153 put:10415 us
PIN_FAST_BENCHMARK: Time: get:85026 put:7751 us
PIN_FAST_BENCHMARK: Time: get:85458 put:7944 us
PIN_FAST_BENCHMARK: Time: get:85735 put:8154 us
PIN_FAST_BENCHMARK: Time: get:85851 put:8299 us
PIN_FAST_BENCHMARK: Time: get:86323 put:9617 us
PIN_FAST_BENCHMARK: Time: get:86288 put:10496 us
PIN_FAST_BENCHMARK: Time: get:87697 put:9346 us
PIN_FAST_BENCHMARK: Time: get:87980 put:8382 us
PIN_FAST_BENCHMARK: Time: get:88719 put:8400 us
PIN_FAST_BENCHMARK: Time: get:87616 put:8588 us
PIN_FAST_BENCHMARK: Time: get:86730 put:9563 us
PIN_FAST_BENCHMARK: Time: get:88167 put:8673 us
PIN_FAST_BENCHMARK: Time: get:86844 put:9777 us
PIN_FAST_BENCHMARK: Time: get:88068 put:11774 us
PIN_FAST_BENCHMARK: Time: get:86170 put:15676 us
PIN_FAST_BENCHMARK: Time: get:87967 put:12827 us
PIN_FAST_BENCHMARK: Time: get:95773 put:7652 us
PIN_FAST_BENCHMARK: Time: get:87734 put:13650 us
PIN_FAST_BENCHMARK: Time: get:89833 put:14237 us
PIN_FAST_BENCHMARK: Time: get:96186 put:8029 us
PIN_FAST_BENCHMARK: Time: get:95532 put:8886 us
PIN_FAST_BENCHMARK: Time: get:95351 put:5826 us
PIN_FAST_BENCHMARK: Time: get:96401 put:8407 us
PIN_FAST_BENCHMARK: Time: get:96473 put:8287 us
PIN_FAST_BENCHMARK: Time: get:97177 put:8430 us
PIN_FAST_BENCHMARK: Time: get:98120 put:5263 us
PIN_FAST_BENCHMARK: Time: get:96271 put:7757 us
PIN_FAST_BENCHMARK: Time: get:99628 put:10467 us
PIN_FAST_BENCHMARK: Time: get:99344 put:10045 us
PIN_FAST_BENCHMARK: Time: get:94212 put:15485 us
Summary:
Old kernel: 477729.97 (+-3.79%)
New kernel: 89144.65 (+-11.76%)
This patch (of 3):
Add a new parameter "-j N" to support concurrent gup test.
Link: https://lkml.kernel.org/r/20210507150553.208763-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210507150553.208763-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull user namespace rlimit handling update from Eric Biederman:
"This is the work mainly by Alexey Gladkov to limit rlimits to the
rlimits of the user that created a user namespace, and to allow users
to have stricter limits on the resources created within a user
namespace."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
cred: add missing return error code when set_cred_ucounts() failed
ucounts: Silence warning in dec_rlimit_ucounts
ucounts: Set ucount_max to the largest positive value the type can hold
kselftests: Add test to check for rlimit changes in different user namespaces
Reimplement RLIMIT_MEMLOCK on top of ucounts
Reimplement RLIMIT_SIGPENDING on top of ucounts
Reimplement RLIMIT_MSGQUEUE on top of ucounts
Reimplement RLIMIT_NPROC on top of ucounts
Use atomic_t for ucounts reference counting
Add a reference to ucounts for each cred
Increase size of ucounts to atomic_long_t
And thus avoid a Python stacktrace:
~/linux/tools/testing/selftests/net$ ./devlink_port_split.py
Traceback (most recent call last):
File "/home/linux/tools/testing/selftests/net/./devlink_port_split.py",
line 277, in <module> main()
File "/home/linux/tools/testing/selftests/net/./devlink_port_split.py",
line 242, in main
dev = list(devs.keys())[0]
IndexError: list index out of range
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration
and apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
PPC:
- Support for the H_RPT_INVALIDATE hypercall
- Conversion of Book3S entry/exit to C
- Bug fixes
S390:
- new HW facilities for guests
- make inline assembly more robust with KASAN and co
x86:
- Allow userspace to handle emulation errors (unknown instructions)
- Lazy allocation of the rmap (host physical -> guest physical address)
- Support for virtualizing TSC scaling on VMX machines
- Optimizations to avoid shattering huge pages at the beginning of live migration
- Support for initializing the PDPTRs without loading them from memory
- Many TLB flushing cleanups
- Refuse to load if two-stage paging is available but NX is not (this has
been a requirement in practice for over a year)
- A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
CR0/CR4/EFER, using the MMU mode everywhere once it is computed
from the CPU registers
- Use PM notifier to notify the guest about host suspend or hibernate
- Support for passing arguments to Hyper-V hypercalls using XMM registers
- Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on
AMD processors
- Hide Hyper-V hypercalls that are not included in the guest CPUID
- Fixes for live migration of virtual machines that use the Hyper-V
"enlightened VMCS" optimization of nested virtualization
- Bugfixes (not many)
Generic:
- Support for retrieving statistics without debugfs
- Cleanups for the KVM selftests API
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDV9UYUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOIRgf/XX8fKLh24RnTOs2ldIu2AfRGVrT4
QMrr8MxhmtukBAszk2xKvBt8/6gkUjdaIC3xqEnVjxaDaUvZaEtP7CQlF5JV45rn
iv1zyxUKucXrnIOr+gCioIT7qBlh207zV35ArKioP9Y83cWx9uAs22pfr6g+7RxO
h8bJZlJbSG6IGr3voANCIb9UyjU1V/l8iEHqRwhmr/A5rARPfD7g8lfMEQeGkzX6
+/UydX2fumB3tl8e2iMQj6vLVdSOsCkehvpHK+Z33EpkKhan7GwZ2sZ05WmXV/nY
QLAYfD10KegoNWl5Ay4GTp4hEAIYVrRJCLC+wnLdc0U8udbfCuTC31LK4w==
=NcRh
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"This covers all architectures (except MIPS) so I don't expect any
other feature pull requests this merge window.
ARM:
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration and
apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
PPC:
- Support for the H_RPT_INVALIDATE hypercall
- Conversion of Book3S entry/exit to C
- Bug fixes
S390:
- new HW facilities for guests
- make inline assembly more robust with KASAN and co
x86:
- Allow userspace to handle emulation errors (unknown instructions)
- Lazy allocation of the rmap (host physical -> guest physical
address)
- Support for virtualizing TSC scaling on VMX machines
- Optimizations to avoid shattering huge pages at the beginning of
live migration
- Support for initializing the PDPTRs without loading them from
memory
- Many TLB flushing cleanups
- Refuse to load if two-stage paging is available but NX is not (this
has been a requirement in practice for over a year)
- A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
CR0/CR4/EFER, using the MMU mode everywhere once it is computed
from the CPU registers
- Use PM notifier to notify the guest about host suspend or hibernate
- Support for passing arguments to Hyper-V hypercalls using XMM
registers
- Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap
on AMD processors
- Hide Hyper-V hypercalls that are not included in the guest CPUID
- Fixes for live migration of virtual machines that use the Hyper-V
"enlightened VMCS" optimization of nested virtualization
- Bugfixes (not many)
Generic:
- Support for retrieving statistics without debugfs
- Cleanups for the KVM selftests API"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits)
KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled
kvm: x86: disable the narrow guest module parameter on unload
selftests: kvm: Allows userspace to handle emulation errors.
kvm: x86: Allow userspace to handle emulation errors
KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on
KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT
KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic
KVM: x86: Enhance comments for MMU roles and nested transition trickiness
KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE
KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU
KVM: x86/mmu: Use MMU's role to determine PTTYPE
KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers
KVM: x86/mmu: Add a helper to calculate root from role_regs
KVM: x86/mmu: Add helper to update paging metadata
KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0
KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls
KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper
KVM: x86/mmu: Get nested MMU's root level from the MMU's role
...
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-06-28
The following pull-request contains BPF updates for your *net-next* tree.
We've added 37 non-merge commits during the last 12 day(s) which contain
a total of 56 files changed, 394 insertions(+), 380 deletions(-).
The main changes are:
1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney.
2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski.
3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev.
4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria.
5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards.
6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa.
7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi.
8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim.
9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young.
10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer.
11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski.
12) Fix up bpfilter startup log-level to info level, from Gary Lin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- Optimise SVE switching for CPUs with 128-bit implementations.
- Fix output format from SVE selftest.
- Add support for versions v1.2 and 1.3 of the SMC calling convention.
- Allow Pointer Authentication to be configured independently for
kernel and userspace.
- PMU driver cleanups for managing IRQ affinity and exposing event
attributes via sysfs.
- KASAN optimisations for both hardware tagging (MTE) and out-of-line
software tagging implementations.
- Relax frame record alignment requirements to facilitate 8-byte
alignment with KASAN and Clang.
- Cleanup of page-table definitions and removal of unused memory types.
- Reduction of ARCH_DMA_MINALIGN back to 64 bytes.
- Refactoring of our instruction decoding routines and addition of some
missing encodings.
- Move entry code moved into C and hardened against harmful compiler
instrumentation.
- Update booting requirements for the FEAT_HCX feature, added to v8.7
of the architecture.
- Fix resume from idle when pNMI is being used.
- Additional CPU sanity checks for MTE and preparatory changes for
systems where not all of the CPUs support 32-bit EL0.
- Update our kernel string routines to the latest Cortex Strings
implementation.
- Big cleanup of our cache maintenance routines, which were confusingly
named and inconsistent in their implementations.
- Tweak linker flags so that GDB can understand vmlinux when using RELR
relocations.
- Boot path cleanups to enable early initialisation of per-cpu
operations needed by KCSAN.
- Non-critical fixes and miscellaneous cleanup.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmDUh1YQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNDaUCAC+2Jy2Yopd94uBPYajGybM0rqCUgE7b5n1
A7UzmQ6fia2hwqCPmxGG+sRabovwN7C1bKrUCc03RIbErIa7wum1edeyqmF/Aw44
DUDY1MAOSZaFmX8L62QCvxG1hfdLPtGmHMd1hdXvxYK7PCaigEFnzbLRWTtgE+Ok
JhdvNfsoeITJObHnvYPF3rV3NAbyYni9aNJ5AC/qb3dlf6XigEraXaMj29XHKfwc
+vmn+25oqFkLHyFeguqIoK+vUQAy/8TjFfjX83eN3LZknNhDJgWS1Iq1Nm+Vxt62
RvDUUecWJjAooCWgmil6pt0enI+q6E8LcX3A3cWWrM6psbxnYzkU
=I6KS
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"There's a reasonable amount here and the juicy details are all below.
It's worth noting that the MTE/KASAN changes strayed outside of our
usual directories due to core mm changes and some associated changes
to some other architectures; Andrew asked for us to carry these [1]
rather that take them via the -mm tree.
Summary:
- Optimise SVE switching for CPUs with 128-bit implementations.
- Fix output format from SVE selftest.
- Add support for versions v1.2 and 1.3 of the SMC calling
convention.
- Allow Pointer Authentication to be configured independently for
kernel and userspace.
- PMU driver cleanups for managing IRQ affinity and exposing event
attributes via sysfs.
- KASAN optimisations for both hardware tagging (MTE) and out-of-line
software tagging implementations.
- Relax frame record alignment requirements to facilitate 8-byte
alignment with KASAN and Clang.
- Cleanup of page-table definitions and removal of unused memory
types.
- Reduction of ARCH_DMA_MINALIGN back to 64 bytes.
- Refactoring of our instruction decoding routines and addition of
some missing encodings.
- Move entry code moved into C and hardened against harmful compiler
instrumentation.
- Update booting requirements for the FEAT_HCX feature, added to v8.7
of the architecture.
- Fix resume from idle when pNMI is being used.
- Additional CPU sanity checks for MTE and preparatory changes for
systems where not all of the CPUs support 32-bit EL0.
- Update our kernel string routines to the latest Cortex Strings
implementation.
- Big cleanup of our cache maintenance routines, which were
confusingly named and inconsistent in their implementations.
- Tweak linker flags so that GDB can understand vmlinux when using
RELR relocations.
- Boot path cleanups to enable early initialisation of per-cpu
operations needed by KCSAN.
- Non-critical fixes and miscellaneous cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (150 commits)
arm64: tlb: fix the TTL value of tlb_get_level
arm64: Restrict undef hook for cpufeature registers
arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
arm64: insn: avoid circular include dependency
arm64: smp: Bump debugging information print down to KERN_DEBUG
drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
arm64: suspend: Use cpuidle context helpers in cpu_suspend()
PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter()
arm64: Convert cpu_do_idle() to using cpuidle context helpers
arm64: Add cpuidle context save/restore helpers
arm64: head: fix code comments in set_cpu_boot_mode_flag
arm64: mm: drop unused __pa(__idmap_text_start)
arm64: mm: fix the count comments in compute_indices
arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
arm64: mm: Pass original fault address to handle_mm_fault()
arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT
arm64/mm: Drop SWAPPER_INIT_MAP_SIZE
arm64: Conditionally configure PTR_AUTH key of the kernel.
...
Instead of depending on "sysctl" being installed, just use "grep -H" for
sysctl status reporting. Additionally report kernel version for easier
comparisons.
Signed-off-by: Kees Cook <keescook@chromium.org>
When running the seccomp benchmark under a test runner, it wouldn't
provide any feedback on progress. Set stdout unbuffered.
Suggested-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Since the open fds might not always start at "4" (especially when
running under kselftest, etc), start counting from the first assigned
fd, rather than using the more permissive EXPECT_GE(fd, 0).
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/lkml/20210527032948.3730953-1-keescook@chromium.org
Reviewed-by: Rodrigo Campos <rodrigo@kinvolk.io>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
This just adds a test to verify that when using the new introduced flag
to ADDFD, a valid fd is added and returned as the syscall result.
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210517193908.3113-5-sargun@sargun.me
- Changes to core scheduling facilities:
- Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
coordinated scheduling across SMT siblings. This is a much
requested feature for cloud computing platforms, to allow
the flexible utilization of SMT siblings, without exposing
untrusted domains to information leaks & side channels, plus
to ensure more deterministic computing performance on SMT
systems used by heterogenous workloads.
There's new prctls to set core scheduling groups, which
allows more flexible management of workloads that can share
siblings.
- Fix task->state access anti-patterns that may result in missed
wakeups and rename it to ->__state in the process to catch new
abuses.
- Load-balancing changes:
- Tweak newidle_balance for fair-sched, to improve
'memcache'-like workloads.
- "Age" (decay) average idle time, to better track & improve workloads
such as 'tbench'.
- Fix & improve energy-aware (EAS) balancing logic & metrics.
- Fix & improve the uclamp metrics.
- Fix task migration (taskset) corner case on !CONFIG_CPUSET.
- Fix RT and deadline utilization tracking across policy changes
- Introduce a "burstable" CFS controller via cgroups, which allows
bursty CPU-bound workloads to borrow a bit against their future
quota to improve overall latencies & batching. Can be tweaked
via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
- Rework assymetric topology/capacity detection & handling.
- Scheduler statistics & tooling:
- Disable delayacct by default, but add a sysctl to enable
it at runtime if tooling needs it. Use static keys and
other optimizations to make it more palatable.
- Use sched_clock() in delayacct, instead of ktime_get_ns().
- Misc cleanups and fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZcPoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g3yw//WfhIqy7Psa9d/MBMjQDRGbTuO4+w22Dj
vmWFU44Q4KJxQHWeIgUlrK+dzvYWvNmflUs2CUUOiDVzxFTHMIyBtL4qCBUbx4Ns
vKAcB9wsWZge2o3WzZqpProRhdoRaSKw8egUr2q7rACVBkckY7eGP/OjWxXU8BdA
b7D0LPWwuIBFfN4pFYeCDLn32Dqr9s6Chyj+ZecabdG7EE6Gu+f1diVcxy7JE/mc
4WWL0D1RqdgpGrBEuMJIxPYekdrZiuy4jtEbztz5gbTBteN1cj3BLfqn0Pc/e6rO
Vyuc5mXCAmzRVi18z6g6bsVl+IA/nrbErENB2OHOhOYtqiZxqGTd4GPWZszMyY17
5AsEO5+5pcaBsy4gyp09qURggBu9zhJnMVmOI3rIHZkmkhwzc6uUJlyhDCTiFWOz
3ZF3LjbZEyCKodMD8qMHbs3axIBpIfZqjzkvSKyFnvfXEGVytVse7NUuWtQ36u92
GnURxVeYY1TDVXvE1Y8owNKMxknKQ6YRlypP7Dtbeo/qG6hShp0xmS7qDLDi0ybZ
ZlK+bDECiVoDf3nvJo+8v5M82IJ3CBt4UYldeRJsa1YCK/FsbK8tp91fkEfnXVue
+U6LPX0AmMpXacR5HaZfb3uBIKRw/QMdP/7RFtBPhpV6jqCrEmuqHnpPQiEVtxwO
UmG7bt94Trk=
=3VDr
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler udpates from Ingo Molnar:
- Changes to core scheduling facilities:
- Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
coordinated scheduling across SMT siblings. This is a much
requested feature for cloud computing platforms, to allow the
flexible utilization of SMT siblings, without exposing untrusted
domains to information leaks & side channels, plus to ensure more
deterministic computing performance on SMT systems used by
heterogenous workloads.
There are new prctls to set core scheduling groups, which allows
more flexible management of workloads that can share siblings.
- Fix task->state access anti-patterns that may result in missed
wakeups and rename it to ->__state in the process to catch new
abuses.
- Load-balancing changes:
- Tweak newidle_balance for fair-sched, to improve 'memcache'-like
workloads.
- "Age" (decay) average idle time, to better track & improve
workloads such as 'tbench'.
- Fix & improve energy-aware (EAS) balancing logic & metrics.
- Fix & improve the uclamp metrics.
- Fix task migration (taskset) corner case on !CONFIG_CPUSET.
- Fix RT and deadline utilization tracking across policy changes
- Introduce a "burstable" CFS controller via cgroups, which allows
bursty CPU-bound workloads to borrow a bit against their future
quota to improve overall latencies & batching. Can be tweaked via
/sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
- Rework assymetric topology/capacity detection & handling.
- Scheduler statistics & tooling:
- Disable delayacct by default, but add a sysctl to enable it at
runtime if tooling needs it. Use static keys and other
optimizations to make it more palatable.
- Use sched_clock() in delayacct, instead of ktime_get_ns().
- Misc cleanups and fixes.
* tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/doc: Update the CPU capacity asymmetry bits
sched/topology: Rework CPU capacity asymmetry detection
sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag
psi: Fix race between psi_trigger_create/destroy
sched/fair: Introduce the burstable CFS controller
sched/uclamp: Fix uclamp_tg_restrict()
sched/rt: Fix Deadline utilization tracking during policy change
sched/rt: Fix RT utilization tracking during policy change
sched: Change task_struct::state
sched,arch: Remove unused TASK_STATE offsets
sched,timer: Use __set_current_state()
sched: Add get_current_state()
sched,perf,kvm: Fix preemption condition
sched: Introduce task_is_running()
sched: Unbreak wakeups
sched/fair: Age the average idle time
sched/cpufreq: Consider reduced CPU capacity in energy calculation
sched/fair: Take thermal pressure into account while estimating energy
thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
sched/fair: Return early from update_tg_cfs_load() if delta == 0
...
- Core locking & atomics:
- Convert all architectures to ARCH_ATOMIC: move every
architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC
and all the transitory facilities and #ifdefs.
Much reduction in complexity from that series:
63 files changed, 756 insertions(+), 4094 deletions(-)
- Self-test enhancements
- Futexes:
- Add the new FUTEX_LOCK_PI2 ABI, which is a variant that
doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC).
[ The temptation to repurpose FUTEX_LOCK_PI's implicit
setting of FLAGS_CLOCKRT & invert the flag's meaning
to avoid having to introduce a new variant was
resisted successfully. ]
- Enhance futex self-tests
- Lockdep:
- Fix dependency path printouts
- Optimize trace saving
- Broaden & fix wait-context checks
- Misc cleanups and fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZaEYRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hPdxAAiNCsxL6X1cZ8zqbWsvLefT9Zqhzgs5u6
gdZele7PNibvbYdON26b5RUzuKfOW/hgyX6LKqr+AiNYTT9PGhcY+tycUr2PGk5R
LMyhJWmmX5cUVPU92ky+z5hEHB2gr4XPJcvgpKKUL0XB1tBaSvy2DtgwPuhXOoT1
1sCQfy63t71snt2RfEnibVW6xovwaA2lsqL81lLHJN4iRFWvqO498/m4+PWkylsm
ig/+VT1Oz7t4wqu3NhTqNNZv+4K4W2asniyo53Dg2BnRm/NjhJtgg4jRibrb0ssb
67Xdq6y8+xNBmEAKj+Re8VpMcu4aj346Ctk7d4gst2ah/Rc0TvqfH6mezH7oq7RL
hmOrMBWtwQfKhEE/fDkng30nrVxc/98YXP0n2rCCa0ySsaF6b6T185mTcYDRDxFs
BVNS58ub+zxrF9Zd4nhIHKaEHiL2ZdDimqAicXN0RpywjIzTQ/y11uU7I1WBsKkq
WkPYs+FPHnX7aBv1MsuxHhb8sUXjG924K4JeqnjF45jC3sC1crX+N0jv4wHw+89V
h4k20s2Tw6m5XGXlgGwMJh0PCcD6X22Vd9Uyw8zb+IJfvNTGR9Rp1Ec+1gMRSll+
xsn6G6Uy9bcNU0SqKlBSfelweGKn4ZxbEPn76Jc8KWLiepuZ6vv5PBoOuaujWht9
KAeOC5XdjMk=
=tH//
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Core locking & atomics:
- Convert all architectures to ARCH_ATOMIC: move every architecture
to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the
transitory facilities and #ifdefs.
Much reduction in complexity from that series:
63 files changed, 756 insertions(+), 4094 deletions(-)
- Self-test enhancements
- Futexes:
- Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't
set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC).
[ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of
FLAGS_CLOCKRT & invert the flag's meaning to avoid having to
introduce a new variant was resisted successfully. ]
- Enhance futex self-tests
- Lockdep:
- Fix dependency path printouts
- Optimize trace saving
- Broaden & fix wait-context checks
- Misc cleanups and fixes.
* tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
locking/lockdep: Correct the description error for check_redundant()
futex: Provide FUTEX_LOCK_PI2 to support clock selection
futex: Prepare futex_lock_pi() for runtime clock selection
lockdep/selftest: Remove wait-type RCU_CALLBACK tests
lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING
lockdep: Fix wait-type for empty stack
locking/selftests: Add a selftest for check_irq_usage()
lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()
locking/lockdep: Remove the unnecessary trace saving
locking/lockdep: Fix the dep path printing for backwards BFS
selftests: futex: Add futex compare requeue test
selftests: futex: Add futex wait test
seqlock: Remove trailing semicolon in macros
locking/lockdep: Reduce LOCKDEP dependency list
locking/lockdep,doc: Improve readability of the block matrix
locking/atomics: atomic-instrumented: simplify ifdeffery
locking/atomic: delete !ARCH_ATOMIC remnants
locking/atomic: xtensa: move to ARCH_ATOMIC
locking/atomic: sparc: move to ARCH_ATOMIC
locking/atomic: sh: move to ARCH_ATOMIC
...
Add support for the SKIP directive to kunit_tool's TAP parser.
Skipped tests now show up as such in the printed summary. The number of
skipped tests is counted, and if all tests in a suite are skipped, the
suite is also marked as skipped. Otherwise, skipped tests do affect the
suite result.
Example output:
[00:22:34] ======== [SKIPPED] example_skip ========
[00:22:34] [SKIPPED] example_skip_test # SKIP this test should be skipped
[00:22:34] [SKIPPED] example_mark_skipped_test # SKIP this test should be skipped
[00:22:34] ============================================================
[00:22:34] Testing complete. 2 tests run. 0 failed. 0 crashed. 2 skipped.
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Note: this does not change the parser behavior at all (except for making
one error message more useful). This is just an internal refactor.
The TAP output parser currently operates over a List[str].
This works, but we only ever need to be able to "peek" at the current
line and the ability to "pop" it off.
Also, using a List means we need to wait for all the output before we
can start parsing. While this is not an issue for most tests which are
really lightweight, we do have some longer (~5 minutes) tests.
This patch introduces an LineStream wrapper class that
* Exposes a peek()/pop() interface instead of manipulating an array
* this allows us to more easily add debugging code [1]
* Can consume an input from a generator
* we can now parse results as tests are running (the parser code
currently doesn't print until the end, so no impact yet).
* Tracks the current line number to print better error messages
* Would allow us to add additional features more easily, e.g. storing
N previous lines so we can print out invalid lines in context, etc.
[1] The parsing logic is currently quite fragile.
E.g. it'll often say the kernel "CRASHED" if there's something slightly
wrong with the output format. When debugging a test that had some memory
corruption issues, it resulted in very misleading errors from the parser.
Now we could easily add this to trace all the lines consumed and why
+import inspect
...
def pop(self) -> str:
n = self._next
+ print(f'popping {n[0]}: {n[1].ljust(40, " ")}| caller={inspect.stack()[1].function}')
Example output:
popping 77: TAP version 14 | caller=parse_tap_header
popping 78: 1..1 | caller=parse_test_plan
popping 79: # Subtest: kunit_executor_test | caller=parse_subtest_header
popping 80: 1..2 | caller=parse_subtest_plan
popping 81: ok 1 - parse_filter_test | caller=parse_ok_not_ok_test_case
popping 82: ok 2 - filter_subsuite_test | caller=parse_ok_not_ok_test_case
popping 83: ok 1 - kunit_executor_test | caller=parse_ok_not_ok_test_suite
If we introduce an invalid line, we can see the parser go down the wrong path:
popping 77: TAP version 14 | caller=parse_tap_header
popping 78: 1..1 | caller=parse_test_plan
popping 79: # Subtest: kunit_executor_test | caller=parse_subtest_header
popping 80: 1..2 | caller=parse_subtest_plan
popping 81: 1..2 # this is invalid! | caller=parse_ok_not_ok_test_case
popping 82: ok 1 - parse_filter_test | caller=parse_ok_not_ok_test_case
popping 83: ok 2 - filter_subsuite_test | caller=parse_ok_not_ok_test_case
popping 84: ok 1 - kunit_executor_test | caller=parse_ok_not_ok_test_case
[ERROR] ran out of lines before end token
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration
and apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDV2bEPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDEr8P/ivwROx5NwGcHGmU5RfUCT3aFqhtVHHwD/lu
jPcgoO61kz9TelOu6QRaVuK+mVHxcq3iP4R8nPq/QCkUlEXTmK2xkyhXhGXSYpH4
6jM8+BbC3eG7iAxx6H0UM4JTl4Riwat6ZZtXpWEWs9TKqOHOQYFpMkxSttwVZ1CZ
SjbtFvXLEdzKn6PzUWnKdBNMV/mHsdAtohZit9oJOc4ttc8072XxETQ4TFQ+MSvA
j9zY9QPmWzgcZnotqRRu9sbTGO2vxtXuUtY3sjdD8+C9OgSe9qvpnNjymcmfwaMu
1fBkfh65oaO4ItJBdGOUOoEcFqwN5imPiI7CB/O+ZYkO9sBCuTUPSQwPkyiwXb9r
bUkTaQw2nZiNWsqR1x07fQ2sGYbMp5mnmgmqiV4MUWkLmFp9LZATCWYTTn24cBNS
6SjVP6/8S0r3EhLnYjH0Pn1we5PooU1EF6RlCAd3ewYoo+9fPnwjNYwIWH5i5wB7
+tnei44NACAw9cfbos+BYQQ/dY15OSFzLzIMomlabB7OpXOdDg3H6tJnPbFwWwXb
9nF8XdHqxeDVVVrDCAx1BSodSXm9xqgnQM2RDGTUnpVcAfqAr3MXX6VsyKQDzj8T
QXF9qOVCBAABv6BXAvSQ6mvMJZDUVbUPEPhf7kXzF46JsRd6A7wWoU/OnMGHQ/w7
wjvH8HVy
=fWBV
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for v5.14.
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration
and apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
On PowerVM, the hypervisor defines the maximum buffer length for
each NX request and the kernel exported this value via sysfs.
This patch reads this value if the sysfs entry is available and
is used to limit the request length.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ed908341b1eb7ca0183c028a4ed4a0cf48bfe0f6.camel@linux.ibm.com
This test exercises the feature KVM_CAP_EXIT_ON_EMULATION_FAILURE. When
enabled, errors in the in-kernel instruction emulator are forwarded to
userspace with the instruction bytes stored in the exit struct for
KVM_EXIT_INTERNAL_ERROR. So, when the guest attempts to emulate an
'flds' instruction, which isn't able to be emulated in KVM, instead
of failing, KVM sends the instruction to userspace to handle.
For this test to work properly the module parameter
'allow_smaller_maxphyaddr' has to be set.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20210510144834.658457-3-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bootconfig is a new feature that appends scripts onto the initrd, and the
kernel executes the scripts as an extended kernel command line.
Need to add tests to test that the happened. To test the bootconfig
properly, the initrd needs to be updated and the kernel rebooted. ktest is
the perfect solution to perform these tests.
Add a example bootconfig.conf in the tools/testing/ktest/examples/include
and example bootconfig scripts in tools/testing/ktest/examples/bootconfig
and also include verifier scripts that ktest will install on the target
and run to make sure that the bootconfig options in the scripts took place
after the target rebooted with the new initrd update.
Link: https://lkml.kernel.org/r/20210618112647.6a81dec5@oasis.local.home
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Make sure that SO_NETNS_COOKIE returns a non-zero value, and
that sockets from different namespaces have a distinct cookie
value.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an x86-only test to verify that x86's MMU reacts to CPUID updates
that impact the MMU. KVM has had multiple bugs where it fails to
reconfigure the MMU after the guest's vCPU model changes.
Sadly, this test is effectively limited to shadow paging because the
hardware page walk handler doesn't support software disabling of GBPAGES
support, and KVM doesn't manually walk the GVA->GPA on faults for
performance reasons (doing so would large defeat the benefits of TDP).
Don't require !TDP for the tests as there is still value in running the
tests with TDP, even though the tests will fail (barring KVM hacks).
E.g. KVM should not completely explode if MAXPHYADDR results in KVM using
4-level vs. 5-level paging for the guest.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-20-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add x86-64 hugepage support in the form of a x86-only variant of
virt_pg_map() that takes an explicit page size. To keep things simple,
follow the existing logic for 4k pages and disallow creating a hugepage
if the upper-level entry is present, even if the desired pfn matches.
Opportunistically fix a double "beyond beyond" reported by checkpatch.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preparation for adding hugepage support, replace "pageMapL4Entry",
"pageDirectoryPointerEntry", and "pageDirectoryEntry" with a common
"pageUpperEntry", and add a helper to create an upper level entry. All
upper level entries have the same layout, using unique structs provides
minimal value and requires a non-trivial amount of code duplication.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a helper to retrieve a PTE pointer given a PFN, address, and level
in preparation for adding hugepage support.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename the "address" field to "pfn" in x86's page table structs to match
reality.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a helper to allocate a page for use in constructing the guest's page
tables. All architectures have identical address and memslot
requirements (which appear to be arbitrary anyways).
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the EPTP memslot param from all EPT helpers and shove the hardcoded
'0' down to the vm_phy_page_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the memslot param from virt_pg_map() and virt_map() and shove the
hardcoded '0' down to the vm_phy_page_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the memslot param(s) from vm_vaddr_alloc() now that all callers
directly specific '0' as the memslot. Drop the memslot param from
virt_pgd_alloc() as well since vm_vaddr_alloc() is its only user.
I.e. shove the hardcoded '0' down to the vm_phy_pages_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add SLAB and page allocator tests for init_on_alloc. Testing for
init_on_free was already happening via the poisoning tests.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-10-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a handful of LKDTM-testable features that depend on certain CONFIGs
so that they are visible in logs for CI systems that run the selftests.
Others could be added, but may be seen as having too high a trade-off
for general testing.
Cc: kernelci@groups.io
Suggested-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-9-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For various failure conditions, try to include some details about where
to look for reasons about the failure.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-8-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Similar to the existing slab overflow and stack exhaustion tests, add
VMALLOC_LINEAR_OVERFLOW (and rename the slab test SLAB_LINEAR_OVERFLOW).
Additionally unmarks the test as destructive. (It should be safe in the
face of misbehavior.)
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-6-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Freed memory poisoning can be tested a few ways, so update the expected
text to reflect the non-Oopsing alternative.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some environments do not set $SHELL when running tests. There's no
need to use $SHELL here anyway, since "cat" can be used to receive any
delivered signals from the kernel. Additionally avoid using bash-isms
in the command, and record stderr for posterity.
Fixes: 46d1a0f03d ("selftests/lkdtm: Add tests for LKDTM targets")
Cc: stable@vger.kernel.org
Suggested-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use KVM_UTIL_MIN_ADDR as the minimum for x86-64's CPUID array. The
system page size was likely used as the minimum because _something_ had
to be provided. Increasing the min from 0x1000 to 0x2000 should have no
meaningful impact on the test, and will allow changing vm_vaddr_alloc()
to use KVM_UTIL_MIN_VADDR as the default.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-11-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the common page allocation helper for the xAPIC IPI test, effectively
raising the minimum virtual address from 0x1000 to 0x2000. Presumably
the test won't explode if it can't get a page at address 0x1000...
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>