linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-29 13:53:33 +00:00

Author	SHA1	Message	Date
Linus Torvalds	5ef3720d91	powerpc fixes for 6.7 #5 - Fix a bug where heavy VAS (accelerator) usage could race with partition migration and prevent the migration from completing. - Update MAINTAINERS to add Aneesh & Naveen. Thanks to: Haren Myneni -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmV+hQYTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgOGuD/4hYEm8kds5un1Ac7rqr9+0USpB+eB3 KFC5Z3xqUNuyh0TmRIh600gY5RZocb2gfke42RTxliNrR1e9Kft+ehZJ3Bvfrl3r 4v2sqo2HH15jKpGSuzpBhVcl+EjI3dM12F/mWcR9ROg71VE920cKxZMH7F2N3mjf rXW7z7oYwilAR89lF2au5reaLmFzZ5ikjupM2tH4n04D3k4g/3nNt4lMyJQ/qKRQ tUmzeSGisOciKVHkC0xp2jYH+LWW3frD8ot+cjf2TZBCwrfXd6HIbEUIEDCdVZtl LLnYZ24otdZO8gNRIpCQ51LDrd2usjnHQyPK9w56q3o0K/wQ3FQ0FrrjGGyEY5h2 wUvOxNTkFBrW9cM0CfYscrDE0+h19TTaXacv7CiVs4oFcCbPFet/en73D+b3EumI 16xgMoPHZx1CPbYkoM9agMryRiXW3a6rEh0ViIyvHnIL/9mjaUFqcMl0MuD1OSQB VyzPwvka/cxR3vtG58Cojj1se+Wi45HYsDebX7BXf9i0IxUhAV66T16ugUCvDNFb 9n4Q3mige//ZUEGWLzEweRp+sCNrbx6O+9XF2BGFez5ZJrXwtfup5csp3Lzgdzz/ jP7DPhsbhJEE0bevKpSyIcKi0C+p8Lbs6tKzgp5pGbh2gPAK5Vi8QMoKo6SrJzOl FctWttezuIw9XQ== =MaRL -----END PGP SIGNATURE----- Merge tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a bug where heavy VAS (accelerator) usage could race with partition migration and prevent the migration from completing. - Update MAINTAINERS to add Aneesh & Naveen. Thanks to Haren Myneni. * tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: MAINTAINERS: powerpc: Add Aneesh & Naveen powerpc/pseries/vas: Migration suspend waits for no in-progress open windows	2023-12-17 08:50:00 -08:00
Amir Goldstein	413ba91089	ovl: fix dentry reference leak after changes to underlying layers syzbot excercised the forbidden practice of moving the workdir under lowerdir while overlayfs is mounted and tripped a dentry reference leak. Fixes: `c63e56a4a6` ("ovl: do not open/llseek lower file with upper sb_writers held") Reported-and-tested-by: syzbot+8608bb4553edb8c78f41@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com>	2023-12-17 13:33:46 +02:00
Kalle Valo	c5a3f56fcd	ath.git patches for v6.8. We have new features only for ath12k but lots of small cleanup for ath10k, ath11k and ath12k. And of course smaller fixes to several drivers. Major changes: ath12k * support one MSI vector * WCN7850: support AP mode -----BEGIN PGP SIGNATURE----- iQFLBAABCgA1FiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmV8odsXHHF1aWNfa3Zh bG9AcXVpY2luYy5jb20ACgkQbhckVSbrbZu1hgf9H+4qF8EvxpDPqse5ynDfUqeq sStOSAWtNvwqTLlCeHHOc5xpuifNo9ff9J59mb2WrRo6gO0PY0meTxQYPgaq1xZF lA4tW8yVpcyO+d3CCpFPWUOx0pbJW7UTzFl5A5TDCfGiY1ypFEGUzxnztvHfdY4L rIjWGAFEsVRGDNM2QVNsfhGm3DbXIKm+9XzhhNlNspaZ0/GS+E9BcXwSyKP5G77A MG9oRTIM++tdkCiJAId8Wt0AwBTGGuik5QnRb8JVsbBeigdWuUU6otXTc3WonzJE 93OEOcYJvmQgr5KLyk9euXJ7XVmwnn2JYY9jwPRSgZKvU3PkvxAALLYwMxOW8w== =5NRj -----END PGP SIGNATURE----- Merge tag 'ath-next-20231215' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath ath.git patches for v6.8. We have new features only for ath12k but lots of small cleanup for ath10k, ath11k and ath12k. And of course smaller fixes to several drivers. Major changes: ath12k * support one MSI vector * WCN7850: support AP mode	2023-12-17 13:20:18 +02:00
Gustavo A. R. Silva	40d51f70f0	wifi: mt76: mt7996: Use DECLARE_FLEX_ARRAY() and fix -Warray-bounds warnings Transform zero-length arrays `rate`, `adm_stat` and `msdu_cnt` into proper flexible-array members in anonymous union in `struct mt7996_mcu_all_sta_info_event` via the DECLARE_FLEX_ARRAY() helper; and fix multiple -Warray-bounds warnings: drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:544:61: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:551:58: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:553:58: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:530:61: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:538:66: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:540:66: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:520:57: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=] This results in no differences in binary output, helps with the ongoing efforts to globally enable -Warray-bounds. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/ZXiU9ayVCslt3qiI@work	2023-12-17 13:17:03 +02:00
David S. Miller	3a3af3aedb	Merge branch 'skb-coalescing-page_pool' Liang Chen says: ==================== skbuff: Optimize SKB coalescing for page pool The combination of the following condition was excluded from skb coalescing: from->pp_recycle = 1 from->cloned = 1 to->pp_recycle = 1 With page pool in use, this combination can be quite common(ex. NetworkMananger may lead to the additional packet_type being registered, thus the cloning). In scenarios with a higher number of small packets, it can significantly affect the success rate of coalescing. This patchset aims to optimize this scenario and enable coalescing of this particular combination. That also involves supporting multiple users referencing the same fragment of a pp page to accomondate the need to increment the "from" SKB page's pp page reference count. Changes from v10: - re-number patches to 1/3, 2/3, 3/3 Changes from v9: - patch 1 was already applied - imporve description for patch 2 - make sure skb_pp_frag_ref only work for pp aware skbs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:56:33 +00:00
Liang Chen	f7dc3248dc	skbuff: Optimization of SKB coalescing for page pool In order to address the issues encountered with commit `1effe8ca4e` ("skbuff: fix coalescing for page_pool fragment recycling"), the combination of the following condition was excluded from skb coalescing: from->pp_recycle = 1 from->cloned = 1 to->pp_recycle = 1 However, with page pool environments, the aforementioned combination can be quite common(ex. NetworkMananger may lead to the additional packet_type being registered, thus the cloning). In scenarios with a higher number of small packets, it can significantly affect the success rate of coalescing. For example, considering packets of 256 bytes size, our comparison of coalescing success rate is as follows: Without page pool: 70% With page pool: 13% Consequently, this has an impact on performance: Without page pool: 2.57 Gbits/sec With page pool: 2.26 Gbits/sec Therefore, it seems worthwhile to optimize this scenario and enable coalescing of this particular combination. To achieve this, we need to ensure the correct increment of the "from" SKB page's page pool reference count (pp_ref_count). Following this optimization, the success rate of coalescing measured in our environment has improved as follows: With page pool: 60% This success rate is approaching the rate achieved without using page pool, and the performance has also been improved: With page pool: 2.52 Gbits/sec Below is the performance comparison for small packets before and after this optimization. We observe no impact to packets larger than 4K. packet size before after improved (bytes) (Gbits/sec) (Gbits/sec) 128 1.19 1.27 7.13% 256 2.26 2.52 11.75% 512 4.13 4.81 16.50% 1024 6.17 6.73 9.05% 2048 14.54 15.47 6.45% 4096 25.44 27.87 9.52% Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Suggested-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:56:33 +00:00
Liang Chen	8cfa2dee32	skbuff: Add a function to check if a page belongs to page_pool Wrap code for checking if a page is a page_pool page into a function for better readability and ease of reuse. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Mina Almasry <almarsymina@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:56:33 +00:00
Liang Chen	aaf153aece	page_pool: halve BIAS_MAX for multiple user references of a fragment Up to now, we were only subtracting from the number of used page fragments to figure out when a page could be freed or recycled. A following patch introduces support for multiple users referencing the same fragment. So reduce the initial page fragments value to half to avoid overflowing. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Mina Almasry <almarsymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:56:33 +00:00
David S. Miller	66fe896351	Merge branch 'tcp-ao-selftests' Dmitry Safonov says: ==================== selftests/net: Add TCP-AO tests An essential part of any big kernel submissions is selftests. At the beginning of TCP-AO project, I made patches to fcnal-test.sh and nettest.c to have the benefits of easy refactoring, early noticing breakages, putting a moat around the code, documenting and designing uAPI. While tests based on fcnal-test.sh/nettest.c provided initial testing* and were very easy to add, the pile of TCP-AO quickly grew out of one-binary + shell-script testing. The design of the TCP-AO testing is a bit different than one-big selftest binary as I did previously in net/ipsec.c. I found it beneficial to avoid implementing a tests runner/scheduler and delegate it to the user or Makefile. The approach is very influenced by CRIU/ZDTM testing[1]: it provides a static library with helper functions and selftest binaries that create specific scenarios. I also tried to utilize kselftest.h. test_init() function does all needed preparations. To not leave any traces after a selftest exists, it creates a network namespace and if the test wants to establish a TCP connection, a child netns. The parent and child netns have veth pair with proper ip addresses and routes set up. Both peers, the client and server are different pthreads. The treading model was chosen over forking mostly by easiness of cleanup on a failure: no need to search for children, handle SIGCHLD, make sure not to wait for a dead peer to perform anything, etc. Any thread that does exit() naturally kills the tests, sweet! The selftests are compiled currently in two variants: ipv4 and ipv6. Ipv4-mapped-ipv6 addresses might be a third variant to add, but it's not there in this version. As pretty much all tests are shared between two address families, most of the code can be shared, too. To differ in code what kind of test is running, Makefile supplies -DIPV6_TEST to compiler and ifdeffery in tests can do things that have to be different between address families. This is similar to TARGETS_C_BOTHBITS in x86 selftests and also to tests code sharing in CRIU/ZDTM. The total number of tests is 832. From them rst_ipv{4,6} has currently one flaky subtest, that may fail: > not ok 9 client connection was not reset: 0 I'll investigate what happens there. Also, unsigned-md5_ipv{4,6} are flaky because of netns counter checks: it doesn't expect that there may be retransmitted TCP segments from a previous sub-selftest. That will be fixed. Besides, key-management_ipv{4,6} has 3 sub-tests passing with XFAIL: > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200 > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16 > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16 ... > # Totals: pass:117 fail:0 xfail:3 xpass:0 skip:0 error:0 Those need some more kernel work to pass instead of xfail. The overview of selftests (see the diffstat at the bottom): ├── lib │ ├── aolib.h │ │ The header for all selftests to include. │ ├── kconfig.c │ │ Kernel kconfig detector to SKIP tests that depend on something. │ ├── netlink.c │ │ Netlink helper to add/modify/delete VETH/IPs/routes/VRFs │ │ I considered just using libmnl, but this is around 400 lines │ │ and avoids selftests dependency on out-of-tree sources/packets. │ ├── proc.c │ │ SNMP/netstat procfs parser and the counters comparator. │ ├── repair.c │ │ Heavily influenced by libsoccr and reduced to minimum TCP │ │ socket checkpoint/repair. Shouldn't be used out of selftests, │ │ though. │ ├── setup.c │ │ All the needed netns/veth/ips/etc preparations for test init. │ ├── sock.c │ │ Socket helpers: {s,g}etsockopt()s/connect()/listen()/etc. │ └── utils.c │ Random stuff (a pun intended). ├── bench-lookups.c │ The only benchmark in selftests currently: checks how well TCP-AO │ setsockopt()s perform, depending on the amount of keys on a socket. ├── connect.c │ Trivial sample, can be used as a boilerplate to write a new test. ├── connect-deny.c │ More-or-less what could be expected for TCP-AO in fcnal-test.sh ├── icmps-accept.c -> icmps-discard.c ├── icmps-discard.c │ Verifies RFC5925 (7.8) by checking that TCP-AO connection can be │ broken if ICMPs are accepted and survives when ::accept_icmps = 0 ├── key-management.c │ Key manipulations, rotations between randomized hashing algorithms │ and counter checks for those scenarios. ├── restore.c │ TCP_AO_REPAIR: verifies that a socket can be re-created without │ TCP-AO connection being interrupted. ├── rst.c │ As RST segments are signed on a separate code-path in kernel, │ verifies passive/active TCP send_reset(). ├── self-connect.c │ Verifies that TCP self-connect and also simultaneous open work. ├── seq-ext.c │ Utilizes TCP_AO_REPAIR to check that on SEQ roll-over SNE │ increment is performed and segments with different SNEs fail to │ pass verification. ├── setsockopt-closed.c │ Checks that {s,g}etsockopt()s are extendable syscalls and common │ error-paths for them. └── unsigned-md5.c Checks listen() socket for (non-)matching peers with: AO/MD5/none keys. As well as their interaction with VRFs and AO_REQUIRED flag. There are certainly more test scenarios that can be added, but even so, I'm pretty happy that this much of TCP-AO functionality and uAPIs got covered. These selftests were iteratively developed by me during TCP-AO kernel upstreaming and the resulting kernel patches would have been worse without having these tests. They provided the user-side perspective but also allowed safer refactoring with less possibility of introducing a regression. Now it's time to use them to dig a moat around the TCP-AO code! There are also people from other network companies that work on TCP-AO (+testing), so sharing these selftests will allow them to contribute and may benefit from their efforts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:55 +00:00
Dmitry Safonov	3c3ead5556	selftests/net: Add TCP-AO key-management test Check multiple keys on a socket: - rotation on closed socket - current/rnext operations shouldn't be possible on listen sockets - current/rnext key set should be the one, that's used on connect() - key rotations with pseudo-random generated keys - copying matching keys on connect() and on accept() At this moment there are 3 tests that are "expected" to fail: a kernel fix is needed to improve the situation, they are marked XFAIL. Sample output: > # ./key-management_ipv4 > 1..120 > # 1601[lib/setup.c:239] rand seed 1700526653 > TAP version 13 > ok 1 closed socket, delete a key: the key was deleted > ok 2 closed socket, delete all keys: the key was deleted > ok 3 closed socket, delete current key: key deletion was prevented > ok 4 closed socket, delete rnext key: key deletion was prevented > ok 5 closed socket, delete a key + set current/rnext: the key was deleted > ok 6 closed socket, force-delete current key: the key was deleted > ok 7 closed socket, force-delete rnext key: the key was deleted > ok 8 closed socket, delete current+rnext key: key deletion was prevented > ok 9 closed socket, add + change current key > ok 10 closed socket, add + change rnext key > ok 11 listen socket, delete a key: the key was deleted > ok 12 listen socket, delete all keys: the key was deleted > ok 13 listen socket, setting current key not allowed > ok 14 listen socket, setting rnext key not allowed > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200 > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16 > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16 > ok 18 listen socket, getsockopt(TCP_AO_REPAIR) is restricted > ok 19 listen socket, setsockopt(TCP_AO_REPAIR) is restricted > ok 20 listen socket, delete a key + set current/rnext: key deletion was prevented > ok 21 listen socket, force-delete current key: key deletion was prevented > ok 22 listen socket, force-delete rnext key: key deletion was prevented > ok 23 listen socket, delete a key: the key was deleted > ok 24 listen socket, add + change current key > ok 25 listen socket, add + change rnext key > ok 26 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations > ok 27 client: Check current/rnext keys unset before connect(): current key 19 as expected > ok 28 client: Check current/rnext keys unset before connect(): rnext key 146 as expected > ok 29 server: Check current/rnext keys unset before connect(): server alive > ok 30 server: Check current/rnext keys unset before connect(): passed counters checks > ok 31 client: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations > ok 32 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations > ok 33 server: Check current/rnext keys unset before connect(): passed counters checks > ok 34 client: Check current/rnext keys unset before connect(): passed counters checks > ok 35 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations > ok 36 server: Check current/rnext keys set before connect(): server alive > ok 37 server: Check current/rnext keys set before connect(): passed counters checks > ok 38 client: Check current/rnext keys set before connect(): current key 10 as expected > ok 39 client: Check current/rnext keys set before connect(): rnext key 137 as expected > ok 40 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations > ok 41 client: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations > ok 42 client: Check current/rnext keys set before connect(): passed counters checks > ok 43 server: Check current/rnext keys set before connect(): passed counters checks > ok 44 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations > ok 45 server: Check current != rnext keys set before connect(): server alive > ok 46 server: Check current != rnext keys set before connect(): passed counters checks > ok 47 client: Check current != rnext keys set before connect(): current key 10 as expected > ok 48 client: Check current != rnext keys set before connect(): rnext key 132 as expected > ok 49 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations > ok 50 client: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations > ok 51 client: Check current != rnext keys set before connect(): passed counters checks > ok 52 server: Check current != rnext keys set before connect(): passed counters checks > ok 53 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations > ok 54 server: Check current flapping back on peer's RnextKey request: server alive > ok 55 server: Check current flapping back on peer's RnextKey request: passed counters checks > ok 56 client: Check current flapping back on peer's RnextKey request: current key 10 as expected > ok 57 client: Check current flapping back on peer's RnextKey request: rnext key 132 as expected > ok 58 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations > ok 59 client: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations > ok 60 server: Check current flapping back on peer's RnextKey request: passed counters checks > ok 61 client: Check current flapping back on peer's RnextKey request: passed counters checks > ok 62 server: Rotate over all different keys: The socket keys are consistent with the expectations > ok 63 server: Rotate over all different keys: server alive > ok 64 server: Rotate over all different keys: passed counters checks > ok 65 server: Rotate over all different keys: current key 128 as expected > ok 66 client: Rotate over all different keys: rnext key 128 as expected > ok 67 server: Rotate over all different keys: current key 129 as expected > ok 68 client: Rotate over all different keys: rnext key 129 as expected > ok 69 server: Rotate over all different keys: current key 130 as expected > ok 70 client: Rotate over all different keys: rnext key 130 as expected > ok 71 server: Rotate over all different keys: current key 131 as expected > ok 72 client: Rotate over all different keys: rnext key 131 as expected > ok 73 server: Rotate over all different keys: current key 132 as expected > ok 74 client: Rotate over all different keys: rnext key 132 as expected > ok 75 server: Rotate over all different keys: current key 133 as expected > ok 76 client: Rotate over all different keys: rnext key 133 as expected > ok 77 server: Rotate over all different keys: current key 134 as expected > ok 78 client: Rotate over all different keys: rnext key 134 as expected > ok 79 server: Rotate over all different keys: current key 135 as expected > ok 80 client: Rotate over all different keys: rnext key 135 as expected > ok 81 server: Rotate over all different keys: current key 136 as expected > ok 82 client: Rotate over all different keys: rnext key 136 as expected > ok 83 server: Rotate over all different keys: current key 137 as expected > ok 84 client: Rotate over all different keys: rnext key 137 as expected > ok 85 server: Rotate over all different keys: current key 138 as expected > ok 86 client: Rotate over all different keys: rnext key 138 as expected > ok 87 server: Rotate over all different keys: current key 139 as expected > ok 88 client: Rotate over all different keys: rnext key 139 as expected > ok 89 server: Rotate over all different keys: current key 140 as expected > ok 90 client: Rotate over all different keys: rnext key 140 as expected > ok 91 server: Rotate over all different keys: current key 141 as expected > ok 92 client: Rotate over all different keys: rnext key 141 as expected > ok 93 server: Rotate over all different keys: current key 142 as expected > ok 94 client: Rotate over all different keys: rnext key 142 as expected > ok 95 server: Rotate over all different keys: current key 143 as expected > ok 96 client: Rotate over all different keys: rnext key 143 as expected > ok 97 server: Rotate over all different keys: current key 144 as expected > ok 98 client: Rotate over all different keys: rnext key 144 as expected > ok 99 server: Rotate over all different keys: current key 145 as expected > ok 100 client: Rotate over all different keys: rnext key 145 as expected > ok 101 server: Rotate over all different keys: current key 146 as expected > ok 102 client: Rotate over all different keys: rnext key 146 as expected > ok 103 server: Rotate over all different keys: current key 127 as expected > ok 104 client: Rotate over all different keys: rnext key 127 as expected > ok 105 client: Rotate over all different keys: current key 0 as expected > ok 106 client: Rotate over all different keys: rnext key 127 as expected > ok 107 server: Rotate over all different keys: The socket keys are consistent with the expectations > ok 108 client: Rotate over all different keys: The socket keys are consistent with the expectations > ok 109 client: Rotate over all different keys: passed counters checks > ok 110 server: Rotate over all different keys: passed counters checks > ok 111 server: Check accept() => established key matching: The socket keys are consistent with the expectations > ok 112 Can't add a key with non-matching ip-address for established sk > ok 113 Can't add a key with non-matching VRF for established sk > ok 114 server: Check accept() => established key matching: server alive > ok 115 server: Check accept() => established key matching: passed counters checks > ok 116 client: Check connect() => established key matching: current key 0 as expected > ok 117 client: Check connect() => established key matching: rnext key 128 as expected > ok 118 client: Check connect() => established key matching: The socket keys are consistent with the expectations > ok 119 server: Check accept() => established key matching: The socket keys are consistent with the expectations > ok 120 server: Check accept() => established key matching: passed counters checks > # Totals: pass:120 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:55 +00:00
Dmitry Safonov	8c4e8dd0c0	selftests/net: Add TCP-AO selfconnect/simultaneous connect test Check that a rare functionality of TCP named self-connect works with TCP-AO. This "under the cover" also checks TCP simultaneous connect (TCP_SYN_RECV socket state), which would be harder to check other ways. In order to verify that it's indeed TCP simultaneous connect, check the counters TCPChallengeACK and TCPSYNChallenge. Sample of the output: > # ./self-connect_ipv6 > 1..4 > # 1738[lib/setup.c:254] rand seed 1696451931 > TAP version 13 > ok 1 self-connect(same keyids): connect TCPAOGood 0 => 24 > ok 2 self-connect(different keyids): connect TCPAOGood 26 => 50 > ok 3 self-connect(restore): connect TCPAOGood 52 => 97 > ok 4 self-connect(restore, different keyids): connect TCPAOGood 99 => 144 > # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:55 +00:00
Dmitry Safonov	c6df7b2361	selftests/net: Add TCP-AO RST test Check that both active and passive reset works and correctly sign segments with TCP-AO or don't send RSTs if not possible to sign. A listening socket with backlog = 0 gets one connection in accept queue, another in syn queue. Once the server/listener socket is forcibly closed, client sockets aren't connected to anything. In regular situation they would receive RST on any segment, but with TCP-AO as there's no listener, no AO-key and unknown ISNs, no RST should be sent. And "passive" reset, where RST is sent on reply for some segment (tcp_v{4,6}_send_reset()) - there use TCP_REPAIR to corrupt SEQ numbers, which later results in TCP-AO signed RST, which will be verified and client socket will get EPIPE. No TCPAORequired/TCPAOBad segments are expected during these tests. Sample of the output: > # ./rst_ipv4 > 1..15 > # 1462[lib/setup.c:254] rand seed 1686611171 > TAP version 13 > ok 1 servered 1000 bytes > ok 2 Verified established tcp connection > ok 3 sk[0] = 7, connection was reset > ok 4 sk[1] = 8, connection was reset > ok 5 sk[2] = 9 > ok 6 MKT counters are good on server > ok 7 Verified established tcp connection > ok 8 client connection broken post-seq-adjust > ok 9 client connection was reset > ok 10 No segments without AO sign (server) > ok 11 Signed AO segments (server): 0 => 30 > ok 12 No segments with bad AO sign (server) > ok 13 No segments without AO sign (client) > ok 14 Signed AO segments (client): 0 => 30 > ok 15 No segments with bad AO sign (client) > # Totals: pass:15 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	0d16eae574	selftests/net: Add SEQ number extension test Check that on SEQ number wraparound there is no disruption or TCPAOBad segments produced. Sample of expected output: > # ./seq-ext_ipv4 > 1..7 > # 1436[lib/setup.c:254] rand seed 1686611079 > TAP version 13 > ok 1 server alive > ok 2 post-migrate connection alive > ok 3 TCPAOGood counter increased 1002 => 3002 > ok 4 TCPAOGood counter increased 1003 => 3003 > ok 5 TCPAOBad counter didn't increase > ok 6 TCPAOBad counter didn't increase > ok 7 SEQ extension incremented: 1/1999, 1/998999 > # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	3715d32dc9	selftests/net: Add TCP_REPAIR TCP-AO tests The test plan is: 1. check that TCP-AO connection may be restored on another socket 2. check restore with wrong send/recv ISN (checking that they are part of MAC generation) 3. check restore with wrong SEQ number extension (checking that high bytes of it taken into MAC generation) Sample output expected: > # ./restore_ipv4 > 1..20 > # 1412[lib/setup.c:254] rand seed 1686610825 > TAP version 13 > ok 1 TCP-AO migrate to another socket: server alive > ok 2 TCP-AO migrate to another socket: post-migrate connection is alive > ok 3 TCP-AO migrate to another socket: counter TCPAOGood increased 23 => 44 > ok 4 TCP-AO migrate to another socket: counter TCPAOGood increased 22 => 42 > ok 5 TCP-AO with wrong send ISN: server couldn't serve > ok 6 TCP-AO with wrong send ISN: post-migrate connection is broken > ok 7 TCP-AO with wrong send ISN: counter TCPAOBad increased 0 => 4 > ok 8 TCP-AO with wrong send ISN: counter TCPAOBad increased 0 => 3 > ok 9 TCP-AO with wrong receive ISN: server couldn't serve > ok 10 TCP-AO with wrong receive ISN: post-migrate connection is broken > ok 11 TCP-AO with wrong receive ISN: counter TCPAOBad increased 4 => 8 > ok 12 TCP-AO with wrong receive ISN: counter TCPAOBad increased 5 => 10 > ok 13 TCP-AO with wrong send SEQ ext number: server couldn't serve > ok 14 TCP-AO with wrong send SEQ ext number: post-migrate connection is broken > ok 15 TCP-AO with wrong send SEQ ext number: counter TCPAOBad increased 9 => 10 > ok 16 TCP-AO with wrong send SEQ ext number: counter TCPAOBad increased 11 => 19 > ok 17 TCP-AO with wrong receive SEQ ext number: post-migrate connection is broken > ok 18 TCP-AO with wrong receive SEQ ext number: server couldn't serve > ok 19 TCP-AO with wrong receive SEQ ext number: counter TCPAOBad increased 10 => 18 > ok 20 TCP-AO with wrong receive SEQ ext number: counter TCPAOBad increased 20 => 23 > # Totals: pass:20 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	d1066c9c58	selftests/net: Add test/benchmark for removing MKTs Sample output: > 1..36 > # 1106[lib/setup.c:207] rand seed 1660754406 > TAP version 13 > ok 1 Worst case connect 512 keys: min=0ms max=1ms mean=0.583329ms stddev=0.076376 > ok 2 Connect random-search 512 keys: min=0ms max=1ms mean=0.53412ms stddev=0.0516779 > ok 3 Worst case delete 512 keys: min=2ms max=11ms mean=6.04139ms stddev=0.245792 > ok 4 Add a new key 512 keys: min=0ms max=13ms mean=0.673415ms stddev=0.0820618 > ok 5 Remove random-search 512 keys: min=5ms max=9ms mean=6.65969ms stddev=0.258064 > ok 6 Remove async 512 keys: min=0ms max=0ms mean=0.041825ms stddev=0.0204512 > ok 7 Worst case connect 1024 keys: min=0ms max=2ms mean=0.520357ms stddev=0.0721358 > ok 8 Connect random-search 1024 keys: min=0ms max=2ms mean=0.535312ms stddev=0.0517355 > ok 9 Worst case delete 1024 keys: min=5ms max=9ms mean=8.27219ms stddev=0.287614 > ok 10 Add a new key 1024 keys: min=0ms max=1ms mean=0.688121ms stddev=0.0829531 > ok 11 Remove random-search 1024 keys: min=5ms max=9ms mean=8.37649ms stddev=0.289422 > ok 12 Remove async 1024 keys: min=0ms max=0ms mean=0.0457096ms stddev=0.0213798 > ok 13 Worst case connect 2048 keys: min=0ms max=2ms mean=0.748804ms stddev=0.0865335 > ok 14 Connect random-search 2048 keys: min=0ms max=2ms mean=0.782993ms stddev=0.0625697 > ok 15 Worst case delete 2048 keys: min=5ms max=10ms mean=8.23106ms stddev=0.286898 > ok 16 Add a new key 2048 keys: min=0ms max=1ms mean=0.812988ms stddev=0.0901658 > ok 17 Remove random-search 2048 keys: min=8ms max=9ms mean=8.84949ms stddev=0.297481 > ok 18 Remove async 2048 keys: min=0ms max=0ms mean=0.0297223ms stddev=0.0172402 > ok 19 Worst case connect 4096 keys: min=1ms max=5ms mean=1.53352ms stddev=0.123836 > ok 20 Connect random-search 4096 keys: min=1ms max=5ms mean=1.52226ms stddev=0.0872429 > ok 21 Worst case delete 4096 keys: min=5ms max=9ms mean=8.25874ms stddev=0.28738 > ok 22 Add a new key 4096 keys: min=0ms max=3ms mean=1.67382ms stddev=0.129376 > ok 23 Remove random-search 4096 keys: min=5ms max=10ms mean=8.26178ms stddev=0.287433 > ok 24 Remove async 4096 keys: min=0ms max=0ms mean=0.0340009ms stddev=0.0184393 > ok 25 Worst case connect 8192 keys: min=2ms max=4ms mean=2.86208ms stddev=0.169177 > ok 26 Connect random-search 8192 keys: min=2ms max=4ms mean=2.87592ms stddev=0.119915 > ok 27 Worst case delete 8192 keys: min=6ms max=11ms mean=7.55291ms stddev=0.274826 > ok 28 Add a new key 8192 keys: min=1ms max=5ms mean=2.56797ms stddev=0.160249 > ok 29 Remove random-search 8192 keys: min=5ms max=10ms mean=7.14002ms stddev=0.267208 > ok 30 Remove async 8192 keys: min=0ms max=0ms mean=0.0320066ms stddev=0.0178904 > ok 31 Worst case connect 16384 keys: min=5ms max=6ms mean=5.55334ms stddev=0.235655 > ok 32 Connect random-search 16384 keys: min=5ms max=6ms mean=5.52614ms stddev=0.166225 > ok 33 Worst case delete 16384 keys: min=5ms max=11ms mean=7.39109ms stddev=0.271866 > ok 34 Add a new key 16384 keys: min=2ms max=4ms mean=3.35799ms stddev=0.183248 > ok 35 Remove random-search 16384 keys: min=5ms max=8ms mean=6.86078ms stddev=0.261931 > ok 36 Remove async 16384 keys: min=0ms max=0ms mean=0.0302384ms stddev=0.0173892 > # Totals: pass:36 fail:0 xfail:0 xpass:0 skip:0 error:0 >From the output it's visible that the current simplified approach with linked-list of MKTs scales quite fine even for thousands of keys. And that also means that the majority of the time for delete is eaten by synchronize_rcu() [which I can confirm separately by tracing]. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	6f0c472a68	selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests The test plan was (most of tests have all 3 client types): 1. TCP-AO listen (INADDR_ANY) 2. TCP-MD5 listen (INADDR_ANY) 3. non-signed listen (INADDR_ANY) 4. TCP-AO + TCP-MD5 listen (prefix) 5. TCP-AO subprefix add failure [checked in setsockopt-closed.c] 6. TCP-AO out of prefix connect [checked in connect-deny.c] 7. TCP-AO + TCP-MD5 on connect() 8. TCP-AO intersect with TCP-MD5 failure 9. Established TCP-AO: add TCP-MD5 key 10. Established TCP-MD5: add TCP-AO key 11. Established non-signed: add TCP-AO key Output produced: > # ./unsigned-md5_ipv6 > 1..72 > # 1592[lib/setup.c:239] rand seed 1697567046 > TAP version 13 > ok 1 AO server (INADDR_ANY): AO client: counter TCPAOGood increased 0 => 2 > ok 2 AO server (INADDR_ANY): AO client: connected > ok 3 AO server (INADDR_ANY): MD5 client > ok 4 AO server (INADDR_ANY): MD5 client: counter TCPMD5Unexpected increased 0 => 1 > ok 5 AO server (INADDR_ANY): no sign client: counter TCPAORequired increased 0 => 1 > ok 6 AO server (INADDR_ANY): unsigned client > ok 7 AO server (AO_REQUIRED): AO client: connected > ok 8 AO server (AO_REQUIRED): AO client: counter TCPAOGood increased 4 => 6 > ok 9 AO server (AO_REQUIRED): unsigned client > ok 10 AO server (AO_REQUIRED): unsigned client: counter TCPAORequired increased 1 => 2 > ok 11 MD5 server (INADDR_ANY): AO client: counter TCPAOKeyNotFound increased 0 => 1 > ok 12 MD5 server (INADDR_ANY): AO client > ok 13 MD5 server (INADDR_ANY): MD5 client: connected > ok 14 MD5 server (INADDR_ANY): MD5 client: no counter checks > ok 15 MD5 server (INADDR_ANY): no sign client > ok 16 MD5 server (INADDR_ANY): no sign client: counter TCPMD5NotFound increased 0 => 1 > ok 17 no sign server: AO client > ok 18 no sign server: AO client: counter TCPAOKeyNotFound increased 1 => 2 > ok 19 no sign server: MD5 client > ok 20 no sign server: MD5 client: counter TCPMD5Unexpected increased 1 => 2 > ok 21 no sign server: no sign client: connected > ok 22 no sign server: no sign client: counter CurrEstab increased 0 => 1 > ok 23 AO+MD5 server: AO client (matching): connected > ok 24 AO+MD5 server: AO client (matching): counter TCPAOGood increased 8 => 10 > ok 25 AO+MD5 server: AO client (misconfig, matching MD5) > ok 26 AO+MD5 server: AO client (misconfig, matching MD5): counter TCPAOKeyNotFound increased 2 => 3 > ok 27 AO+MD5 server: AO client (misconfig, non-matching): counter TCPAOKeyNotFound increased 3 => 4 > ok 28 AO+MD5 server: AO client (misconfig, non-matching) > ok 29 AO+MD5 server: MD5 client (matching): connected > ok 30 AO+MD5 server: MD5 client (matching): no counter checks > ok 31 AO+MD5 server: MD5 client (misconfig, matching AO) > ok 32 AO+MD5 server: MD5 client (misconfig, matching AO): counter TCPMD5Unexpected increased 2 => 3 > ok 33 AO+MD5 server: MD5 client (misconfig, non-matching) > ok 34 AO+MD5 server: MD5 client (misconfig, non-matching): counter TCPMD5Unexpected increased 3 => 4 > ok 35 AO+MD5 server: no sign client (unmatched): connected > ok 36 AO+MD5 server: no sign client (unmatched): counter CurrEstab increased 0 => 1 > ok 37 AO+MD5 server: no sign client (misconfig, matching AO) > ok 38 AO+MD5 server: no sign client (misconfig, matching AO): counter TCPAORequired increased 2 => 3 > ok 39 AO+MD5 server: no sign client (misconfig, matching MD5) > ok 40 AO+MD5 server: no sign client (misconfig, matching MD5): counter TCPMD5NotFound increased 1 => 2 > ok 41 AO+MD5 server: client with both [TCP-MD5] and TCP-AO keys: connect() was prevented > ok 42 AO+MD5 server: client with both [TCP-MD5] and TCP-AO keys: no counter checks > ok 43 AO+MD5 server: client with both TCP-MD5 and [TCP-AO] keys: connect() was prevented > ok 44 AO+MD5 server: client with both TCP-MD5 and [TCP-AO] keys: no counter checks > ok 45 TCP-AO established: add TCP-MD5 key: postfailed as expected > ok 46 TCP-AO established: add TCP-MD5 key: counter TCPAOGood increased 12 => 14 > ok 47 TCP-MD5 established: add TCP-AO key: postfailed as expected > ok 48 TCP-MD5 established: add TCP-AO key: no counter checks > ok 49 non-signed established: add TCP-AO key: postfailed as expected > ok 50 non-signed established: add TCP-AO key: counter CurrEstab increased 0 => 1 > ok 51 TCP-AO key intersects with existing TCP-MD5 key: prefailed as expected: Key was rejected by service > ok 52 TCP-MD5 key intersects with existing TCP-AO key: prefailed as expected: Key was rejected by service > ok 53 TCP-MD5 key + TCP-AO required: prefailed as expected: Key was rejected by service > ok 54 TCP-AO required on socket + TCP-MD5 key: prefailed as expected: Key was rejected by service > ok 55 VRF: TCP-AO key (no l3index) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service > ok 56 VRF: TCP-MD5 key (no l3index) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service > ok 57 VRF: TCP-AO key (no l3index) + TCP-MD5 key (l3index=0): prefailed as expected: Key was rejected by service > ok 58 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service > ok 59 VRF: TCP-AO key (no l3index) + TCP-MD5 key (l3index=N): prefailed as expected: Key was rejected by service > ok 60 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (no l3index): prefailed as expected: Key was rejected by service > ok 61 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service > ok 62 VRF: TCP-MD5 key (no l3index) + TCP-AO key (l3index=0): prefailed as expected: Key was rejected by service > ok 63 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (l3index=0): prefailed as expected: Key was rejected by service > ok 64 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (l3index=0): prefailed as expected: Key was rejected by service > ok 65 VRF: TCP-AO key (l3index=0) + TCP-MD5 key (l3index=N) > ok 66 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (l3index=0) > ok 67 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (no l3index): prefailed as expected: Key was rejected by service > ok 68 VRF: TCP-MD5 key (no l3index) + TCP-AO key (l3index=N): prefailed as expected: Key was rejected by service > ok 69 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (l3index=0) > ok 70 VRF: TCP-MD5 key (l3index=0) + TCP-AO key (l3index=N) > ok 71 VRF: TCP-AO key (l3index=N) + TCP-MD5 key (l3index=N): prefailed as expected: Key was rejected by service > ok 72 VRF: TCP-MD5 key (l3index=N) + TCP-AO key (l3index=N): prefailed as expected: Key was rejected by service > # Totals: pass:72 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	b26660531c	selftests/net: Add test for TCP-AO add setsockopt() command Verify corner-cases for UAPI. Sample output: > # ./setsockopt-closed_ipv4 > 1..120 > # 1657[lib/setup.c:254] rand seed 1681938184 > TAP version 13 > ok 1 AO add: minimum size > ok 2 AO add: extended size > ok 3 AO add: null optval > ok 4 AO del: minimum size > ok 5 AO del: extended size > ok 6 AO del: null optval > ok 7 AO set info: minimum size > ok 8 AO set info: extended size > ok 9 AO info get: : extended size > ok 10 AO set info: null optval > ok 11 AO get info: minimum size > ok 12 AO get info: extended size > ok 13 AO get info: null optval > ok 14 AO get info: null optlen > ok 15 AO get keys: minimum size > ok 16 AO get keys: extended size > ok 17 AO get keys: null optval > ok 18 AO get keys: null optlen > ok 19 key add: too big keylen > ok 20 key add: using reserved padding > ok 21 key add: using reserved2 padding > ok 22 key add: wrong address family > ok 23 key add: port (unsupported) > ok 24 key add: no prefix, addr > ok 25 key add: no prefix, any addr > ok 26 key add: prefix, any addr > ok 27 key add: too big prefix > ok 28 key add: too short prefix > ok 29 key add: bad key flags > ok 30 key add: add current key on a listen socket > ok 31 key add: add rnext key on a listen socket > ok 32 key add: add current+rnext key on a listen socket > ok 33 key add: add key and set as current > ok 34 key add: add key and set as rnext > ok 35 key add: add key and set as current+rnext > ok 36 key add: ifindex without TCP_AO_KEYF_IFNINDEX > ok 37 key add: non-existent VRF > ok 38 optmem limit was hit on adding 69 key > ok 39 key add: maclen bigger than TCP hdr > ok 40 key add: bad algo > ok 41 key del: using reserved padding > ok 42 key del: using reserved2 padding > ok 43 key del: del and set current key on a listen socket > ok 44 key del: del and set rnext key on a listen socket > ok 45 key del: del and set current+rnext key on a listen socket > ok 46 key del: bad key flags > ok 47 key del: ifindex without TCP_AO_KEYF_IFNINDEX > ok 48 key del: non-existent VRF > ok 49 key del: set non-exising current key > ok 50 key del: set non-existing rnext key > ok 51 key del: set non-existing current+rnext key > ok 52 key del: set current key > ok 53 key del: set rnext key > ok 54 key del: set current+rnext key > ok 55 key del: set as current key to be removed > ok 56 key del: set as rnext key to be removed > ok 57 key del: set as current+rnext key to be removed > ok 58 key del: async on non-listen > ok 59 key del: non-existing sndid > ok 60 key del: non-existing rcvid > ok 61 key del: incorrect addr > ok 62 key del: correct key delete > ok 63 AO info set: set current key on a listen socket > ok 64 AO info set: set rnext key on a listen socket > ok 65 AO info set: set current+rnext key on a listen socket > ok 66 AO info set: using reserved padding > ok 67 AO info set: using reserved2 padding > ok 68 AO info set: accept_icmps > ok 69 AO info get: accept_icmps > ok 70 AO info set: ao required > ok 71 AO info get: ao required > ok 72 AO info set: ao required with MD5 key > ok 73 AO info set: set non-existing current key > ok 74 AO info set: set non-existing rnext key > ok 75 AO info set: set non-existing current+rnext key > ok 76 AO info set: set current key > ok 77 AO info get: set current key > ok 78 AO info set: set rnext key > ok 79 AO info get: set rnext key > ok 80 AO info set: set current+rnext key > ok 81 AO info get: set current+rnext key > ok 82 AO info set: set counters > ok 83 AO info get: set counters > ok 84 AO info set: no-op > ok 85 AO info get: no-op > ok 86 get keys: no ao_info > ok 87 get keys: proper tcp_ao_get_mkts() > ok 88 get keys: set out-only pkt_good counter > ok 89 get keys: set out-only pkt_bad counter > ok 90 get keys: bad keyflags > ok 91 get keys: ifindex without TCP_AO_KEYF_IFNINDEX > ok 92 get keys: using reserved field > ok 93 get keys: no prefix, addr > ok 94 get keys: no prefix, any addr > ok 95 get keys: prefix, any addr > ok 96 get keys: too big prefix > ok 97 get keys: too short prefix > ok 98 get keys: prefix + addr > ok 99 get keys: get_all + prefix > ok 100 get keys: get_all + addr > ok 101 get keys: get_all + sndid > ok 102 get keys: get_all + rcvid > ok 103 get keys: current + prefix > ok 104 get keys: current + addr > ok 105 get keys: current + sndid > ok 106 get keys: current + rcvid > ok 107 get keys: rnext + prefix > ok 108 get keys: rnext + addr > ok 109 get keys: rnext + sndid > ok 110 get keys: rnext + rcvid > ok 111 get keys: get_all + current > ok 112 get keys: get_all + rnext > ok 113 get keys: current + rnext > ok 114 key add: duplicate: full copy > ok 115 key add: duplicate: any addr key on the socket > ok 116 key add: duplicate: add any addr key > ok 117 key add: duplicate: add any addr for the same subnet > ok 118 key add: duplicate: full copy of a key > ok 119 key add: duplicate: RecvID differs > ok 120 key add: duplicate: SendID differs > # Totals: pass:120 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	ed9d09b309	selftests/net: Add a test for TCP-AO keys matching Add TCP-AO tests on connect()/accept() pair. SNMP counters exposed by kernel are very useful here to verify the expected behavior of TCP-AO. Expected output for ipv4 version: > # ./connect-deny_ipv4 > 1..19 > # 1702[lib/setup.c:254] rand seed 1680553689 > TAP version 13 > ok 1 Non-AO server + AO client > ok 2 Non-AO server + AO client: counter TCPAOKeyNotFound increased 0 => 1 > ok 3 AO server + Non-AO client > ok 4 AO server + Non-AO client: counter TCPAORequired increased 0 => 1 > ok 5 Wrong password > ok 6 Wrong password: counter TCPAOBad increased 0 => 1 > ok 7 Wrong rcv id > ok 8 Wrong rcv id: counter TCPAOKeyNotFound increased 1 => 2 > ok 9 Wrong snd id > ok 10 Wrong snd id: counter TCPAOGood increased 0 => 1 > ok 11 Server: Wrong addr: counter TCPAOKeyNotFound increased 2 => 3 > ok 12 Server: Wrong addr > ok 13 Client: Wrong addr: connect() was prevented > ok 14 rcv id != snd id: connected > ok 15 rcv id != snd id: counter TCPAOGood increased 1 => 3 > ok 16 Server: prefix match: connected > ok 17 Server: prefix match: counter TCPAOGood increased 4 => 6 > ok 18 Client: prefix match: connected > ok 19 Client: prefix match: counter TCPAOGood increased 7 => 9 > # Totals: pass:19 fail:0 xfail:0 xpass:0 skip:0 error:0 Expected output for ipv6 version: > # ./connect-deny_ipv6 > 1..19 > # 1725[lib/setup.c:254] rand seed 1680553711 > TAP version 13 > ok 1 Non-AO server + AO client > ok 2 Non-AO server + AO client: counter TCPAOKeyNotFound increased 0 => 1 > ok 3 AO server + Non-AO client: counter TCPAORequired increased 0 => 1 > ok 4 AO server + Non-AO client > ok 5 Wrong password: counter TCPAOBad increased 0 => 1 > ok 6 Wrong password > ok 7 Wrong rcv id: counter TCPAOKeyNotFound increased 1 => 2 > ok 8 Wrong rcv id > ok 9 Wrong snd id: counter TCPAOGood increased 0 => 1 > ok 10 Wrong snd id > ok 11 Server: Wrong addr > ok 12 Server: Wrong addr: counter TCPAOKeyNotFound increased 2 => 3 > ok 13 Client: Wrong addr: connect() was prevented > ok 14 rcv id != snd id: connected > ok 15 rcv id != snd id: counter TCPAOGood increased 1 => 3 > ok 16 Server: prefix match: connected > ok 17 Server: prefix match: counter TCPAOGood increased 5 => 7 > ok 18 Client: prefix match: connected > ok 19 Client: prefix match: counter TCPAOGood increased 8 => 10 > # Totals: pass:19 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	d11301f659	selftests/net: Add TCP-AO ICMPs accept test Reverse to icmps-discard test: the server accepts ICMPs, using TCP_AO_CMDF_ACCEPT_ICMP and it is expected to fail under ICMP flood from client. Test that the default pre-TCP-AO behaviour functions when TCP_AO_CMDF_ACCEPT_ICMP is set. Expected output for ipv4 version (in case it receives ICMP_PROT_UNREACH): > # ./icmps-accept_ipv4 > 1..3 > # 3209[lib/setup.c:166] rand seed 1642623870 > TAP version 13 > # 3209[lib/proc.c:207] Snmp6 Ip6InReceives: 0 => 1 > # 3209[lib/proc.c:207] Snmp6 Ip6InNoRoutes: 0 => 1 > # 3209[lib/proc.c:207] Snmp6 Ip6InOctets: 0 => 76 > # 3209[lib/proc.c:207] Snmp6 Ip6InNoECTPkts: 0 => 1 > # 3209[lib/proc.c:207] Tcp InSegs: 3 => 23 > # 3209[lib/proc.c:207] Tcp OutSegs: 2 => 22 > # 3209[lib/proc.c:207] IcmpMsg InType3: 0 => 4 > # 3209[lib/proc.c:207] Icmp InMsgs: 0 => 4 > # 3209[lib/proc.c:207] Icmp InDestUnreachs: 0 => 4 > # 3209[lib/proc.c:207] Ip InReceives: 3 => 27 > # 3209[lib/proc.c:207] Ip InDelivers: 3 => 27 > # 3209[lib/proc.c:207] Ip OutRequests: 2 => 22 > # 3209[lib/proc.c:207] IpExt InOctets: 288 => 3420 > # 3209[lib/proc.c:207] IpExt OutOctets: 124 => 3244 > # 3209[lib/proc.c:207] IpExt InNoECTPkts: 3 => 25 > # 3209[lib/proc.c:207] TcpExt TCPPureAcks: 1 => 2 > # 3209[lib/proc.c:207] TcpExt TCPOrigDataSent: 0 => 20 > # 3209[lib/proc.c:207] TcpExt TCPDelivered: 0 => 19 > # 3209[lib/proc.c:207] TcpExt TCPAOGood: 3 => 23 > ok 1 InDestUnreachs delivered 4 > ok 2 server failed with -92: Protocol not available > ok 3 TCPAODroppedIcmps counter didn't change: 0 >= 0 > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Expected output for ipv6 version (in case it receives ADM_PROHIBITED): > # ./icmps-accept_ipv6 > 1..3 > # 3277[lib/setup.c:166] rand seed 1642624035 > TAP version 13 > # 3277[lib/proc.c:207] Snmp6 Ip6InReceives: 6 => 31 > # 3277[lib/proc.c:207] Snmp6 Ip6InDelivers: 4 => 29 > # 3277[lib/proc.c:207] Snmp6 Ip6OutRequests: 4 => 24 > # 3277[lib/proc.c:207] Snmp6 Ip6InOctets: 592 => 4492 > # 3277[lib/proc.c:207] Snmp6 Ip6OutOctets: 332 => 3852 > # 3277[lib/proc.c:207] Snmp6 Ip6InNoECTPkts: 6 => 31 > # 3277[lib/proc.c:207] Snmp6 Icmp6InMsgs: 1 => 6 > # 3277[lib/proc.c:207] Snmp6 Icmp6InDestUnreachs: 0 => 5 > # 3277[lib/proc.c:207] Snmp6 Icmp6InType1: 0 => 5 > # 3277[lib/proc.c:207] Tcp InSegs: 3 => 23 > # 3277[lib/proc.c:207] Tcp OutSegs: 2 => 22 > # 3277[lib/proc.c:207] TcpExt TCPPureAcks: 1 => 2 > # 3277[lib/proc.c:207] TcpExt TCPOrigDataSent: 0 => 20 > # 3277[lib/proc.c:207] TcpExt TCPDelivered: 0 => 19 > # 3277[lib/proc.c:207] TcpExt TCPAOGood: 3 => 23 > ok 1 Icmp6InDestUnreachs delivered 5 > ok 2 server failed with -13: Permission denied > ok 3 TCPAODroppedIcmps counter didn't change: 0 >= 0 > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 With some luck the server may fail with ECONNREFUSED (depending on what icmp packet was delivered firstly). For the kernel error handlers see: tab_unreach[] and icmp_err_convert[]. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	a8fcf8ca14	selftests/net: Verify that TCP-AO complies with ignoring ICMPs Hand-crafted ICMP packets are sent to the server, the server checks for hard/soft errors and fails if any. Expected output for ipv4 version: > # ./icmps-discard_ipv4 > 1..3 > # 3164[lib/setup.c:166] rand seed 1642623745 > TAP version 13 > # 3164[lib/proc.c:207] Snmp6 Ip6InReceives: 0 => 1 > # 3164[lib/proc.c:207] Snmp6 Ip6InNoRoutes: 0 => 1 > # 3164[lib/proc.c:207] Snmp6 Ip6InOctets: 0 => 76 > # 3164[lib/proc.c:207] Snmp6 Ip6InNoECTPkts: 0 => 1 > # 3164[lib/proc.c:207] Tcp InSegs: 2 => 203 > # 3164[lib/proc.c:207] Tcp OutSegs: 1 => 202 > # 3164[lib/proc.c:207] IcmpMsg InType3: 0 => 543 > # 3164[lib/proc.c:207] Icmp InMsgs: 0 => 543 > # 3164[lib/proc.c:207] Icmp InDestUnreachs: 0 => 543 > # 3164[lib/proc.c:207] Ip InReceives: 2 => 746 > # 3164[lib/proc.c:207] Ip InDelivers: 2 => 746 > # 3164[lib/proc.c:207] Ip OutRequests: 1 => 202 > # 3164[lib/proc.c:207] IpExt InOctets: 132 => 61684 > # 3164[lib/proc.c:207] IpExt OutOctets: 68 => 31324 > # 3164[lib/proc.c:207] IpExt InNoECTPkts: 2 => 744 > # 3164[lib/proc.c:207] TcpExt TCPPureAcks: 1 => 2 > # 3164[lib/proc.c:207] TcpExt TCPOrigDataSent: 0 => 200 > # 3164[lib/proc.c:207] TcpExt TCPDelivered: 0 => 199 > # 3164[lib/proc.c:207] TcpExt TCPAOGood: 2 => 203 > # 3164[lib/proc.c:207] TcpExt TCPAODroppedIcmps: 0 => 541 > ok 1 InDestUnreachs delivered 543 > ok 2 Server survived 20000 bytes of traffic > ok 3 ICMPs ignored 541 > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Expected output for ipv6 version: > # ./icmps-discard_ipv6 > 1..3 > # 3186[lib/setup.c:166] rand seed 1642623803 > TAP version 13 > # 3186[lib/proc.c:207] Snmp6 Ip6InReceives: 4 => 568 > # 3186[lib/proc.c:207] Snmp6 Ip6InDelivers: 3 => 564 > # 3186[lib/proc.c:207] Snmp6 Ip6OutRequests: 2 => 204 > # 3186[lib/proc.c:207] Snmp6 Ip6InMcastPkts: 1 => 4 > # 3186[lib/proc.c:207] Snmp6 Ip6OutMcastPkts: 0 => 1 > # 3186[lib/proc.c:207] Snmp6 Ip6InOctets: 320 => 70420 > # 3186[lib/proc.c:207] Snmp6 Ip6OutOctets: 160 => 35512 > # 3186[lib/proc.c:207] Snmp6 Ip6InMcastOctets: 72 => 336 > # 3186[lib/proc.c:207] Snmp6 Ip6OutMcastOctets: 0 => 76 > # 3186[lib/proc.c:207] Snmp6 Ip6InNoECTPkts: 4 => 568 > # 3186[lib/proc.c:207] Snmp6 Icmp6InMsgs: 1 => 361 > # 3186[lib/proc.c:207] Snmp6 Icmp6OutMsgs: 1 => 2 > # 3186[lib/proc.c:207] Snmp6 Icmp6InDestUnreachs: 0 => 360 > # 3186[lib/proc.c:207] Snmp6 Icmp6OutMLDv2Reports: 0 => 1 > # 3186[lib/proc.c:207] Snmp6 Icmp6InType1: 0 => 360 > # 3186[lib/proc.c:207] Snmp6 Icmp6OutType143: 0 => 1 > # 3186[lib/proc.c:207] Tcp InSegs: 2 => 203 > # 3186[lib/proc.c:207] Tcp OutSegs: 1 => 202 > # 3186[lib/proc.c:207] TcpExt TCPPureAcks: 1 => 2 > # 3186[lib/proc.c:207] TcpExt TCPOrigDataSent: 0 => 200 > # 3186[lib/proc.c:207] TcpExt TCPDelivered: 0 => 199 > # 3186[lib/proc.c:207] TcpExt TCPAOGood: 2 => 203 > # 3186[lib/proc.c:207] TcpExt TCPAODroppedIcmps: 0 => 360 > ok 1 Icmp6InDestUnreachs delivered 360 > ok 2 Server survived 20000 bytes of traffic > ok 3 ICMPs ignored 360 > # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Dmitry Safonov	cfbab37b3d	selftests/net: Add TCP-AO library Provide functions to create selftests dedicated to TCP-AO. They can run in parallel, as they use temporary net namespaces. They can be very specific to the feature being tested. This will allow to create a lot of TCP-AO tests, without complicating one binary with many --options and to create scenarios, that are hard to put in bash script that uses one binary. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 10:41:54 +00:00
Vladimir Oltean	37a8997fc5	net: phylink: reimplement population of pl->supported for in-band phylink_parse_mode() populates all possible supported link modes for a given phy_interface_t, for the case where a phylib phy may be absent and we can't retrieve the supported link modes from that. Russell points out that since the introduction of the generic validation helpers phylink_get_capabilities() and phylink_caps_to_linkmodes(), we can rewrite this procedure to populate the pl->supported mask, so that instead of spelling out the link modes, we derive an intermediary mac_capabilities bit field, and we convert that to the equivalent link modes. Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-17 01:00:53 +00:00
Linus Torvalds	dde0672bfa	A handful of clk fixes, mostly in the rockchip clk driver - Fix a clk name, clk parent, and a register for a clk gate in the Rockchip rk3128 clk driver - Add a PLL frequency on Rockchip rk3568 to fix some display artifacts - Fix a kbuild dependency for Qualcomm's SM_CAMCC_8550 symbol so that it isn't possible to select the associated GCC driver -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmV+RNQRHHNib3lkQGtl cm5lbC5vcmcACgkQrQKIl8bklSXG+hAA13VoPADVhWCa9JxxmUtmFA9WQVZQLrDz 7r33ntE6u1KA1AYCIuH6/H4NCaPBXQ6+ZI4iZGSundRGvanK5Jg6WNNOZjclEPcj YUnoO10b0jHmIo7aLy9ds6d1AUJVPpZXjN/STrZ2Usx2TrZj572sO0ZdRbcHLSZB metqjlpDXvXamQn3drf9Q90uhWHMe792/Ha89qWdPHF+d85sQnSxb9OcbzVSBi5/ I9BmIFyJyT+f8+/HFz8LWiA3WGj8ikoWpaGqd6ENEWRUt/Jq6EZUfGKbvXqnpuhI mAd647AM/oGZVabauDoR65XyEhStkwdmGrQQlTdY2c5/De9nNteCGXpwsOdWZkBA pf0uN4niK5TeIF2OncQ3I3rj1AzRHqOnFzoy78oQLpMsyE6yViqBMECqm4HPMcBy rpQiK0qivQaE61kKEDAjsEbkYACE/m8UlLS/G+vpC+myusP8c3KUw8UKXDNDZO0l TUbZPAGavhvhvW7KV8+48rREXsQQglcIG2S7tkdffZKO1Hng2IxasC0k5ur7JfWf 0kwRjDvhXAEmbrwLJN7rD4LBiI6h8dNPPDYBEpnSaLTK5e8Q0FSR0fsuuVmjSfxw flhchSlH+nDI76KPpX/hDQdJxmpvE1MwX/gV1ybZV6baE5Ik+mbKApWbpKBPloLK kaGsthZ0ils= =GPdk -----END PGP SIGNATURE----- Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A handful of clk fixes, mostly in the rockchip clk driver: - Fix a clk name, clk parent, and a register for a clk gate in the Rockchip rk3128 clk driver - Add a PLL frequency on Rockchip rk3568 to fix some display artifacts - Fix a kbuild dependency for Qualcomm's SM_CAMCC_8550 symbol so that it isn't possible to select the associated GCC driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name clk: rockchip: rk3128: Fix aclk_peri_src's parent clk: qcom: Fix SM_CAMCC_8550 dependencies clk: rockchip: rk3128: Fix HCLK_OTG gate register clk: rockchip: rk3568: Add PLL rate for 292.5MHz	2023-12-16 16:57:55 -08:00
Linus Torvalds	3b8a9b2e68	Tracing fixes for v6.7-rc5: - Fix eventfs to check creating new files for events with names greater than NAME_MAX. The eventfs lookup needs to check the return result of simple_lookup(). - Fix the ring buffer to check the proper max data size. Events must be able to fit on the ring buffer sub-buffer, if it cannot, then it fails to be written and the logic to add the event is avoided. The code to check if an event can fit failed to add the possible absolute timestamp which may make the event not be able to fit. This causes the ring buffer to go into an infinite loop trying to find a sub-buffer that would fit the event. Luckily, there's a check that will bail out if it looped over a 1000 times and it also warns. The real fix is not to add the absolute timestamp to an event that is starting at the beginning of a sub-buffer because it uses the sub-buffer timestamp. By avoiding the timestamp at the start of the sub-buffer allows events that pass the first check to always find a sub-buffer that it can fit on. - Have large events that do not fit on a trace_seq to print "LINE TOO BIG" like it does for the trace_pipe instead of what it does now which is to silently drop the output. - Fix a memory leak of forgetting to free the spare page that is saved by a trace instance. - Update the size of the snapshot buffer when the main buffer is updated if the snapshot buffer is allocated. - Fix ring buffer timestamp logic by removing all the places that tried to put the before_stamp back to the write stamp so that the next event doesn't add an absolute timestamp. But each of these updates added a race where by making the two timestamp equal, it was validating the write_stamp so that it can be incorrectly used for calculating the delta of an event. - There's a temp buffer used for printing the event that was using the event data size for allocation when it needed to use the size of the entire event (meta-data and payload data) - For hardening, use "%.s" for printing the trace_marker output, to limit the amount that is printed by the size of the event. This was discovered by development that added a bug that truncated the '\0' and caused a crash. - Fix a use-after-free bug in the use of the histogram files when an instance is being removed. - Remove a useless update in the rb_try_to_discard of the write_stamp. The before_stamp was already changed to force the next event to add an absolute timestamp that the write_stamp is not used. But the write_stamp is modified again using an unneeded 64-bit cmpxchg. - Fix several races in the 32-bit implementation of the rb_time_cmpxchg() that does a 64-bit cmpxchg. - While looking at fixing the 64-bit cmpxchg, I noticed that because the ring buffer uses normal cmpxchg, and this can be done in NMI context, there's some architectures that do not have a working cmpxchg in NMI context. For these architectures, fail recording events that happen in NMI context. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZX0nChQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qlOMAQD3iegTcceQl9lAsroa3tb3xdweC1GP 51MsX5athxSyoQEAutI/2pBCtLFXgTLMHAMd5F23EM1U9rha7W0myrnvKQY= =d3bS -----END PGP SIGNATURE----- Merge tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix eventfs to check creating new files for events with names greater than NAME_MAX. The eventfs lookup needs to check the return result of simple_lookup(). - Fix the ring buffer to check the proper max data size. Events must be able to fit on the ring buffer sub-buffer, if it cannot, then it fails to be written and the logic to add the event is avoided. The code to check if an event can fit failed to add the possible absolute timestamp which may make the event not be able to fit. This causes the ring buffer to go into an infinite loop trying to find a sub-buffer that would fit the event. Luckily, there's a check that will bail out if it looped over a 1000 times and it also warns. The real fix is not to add the absolute timestamp to an event that is starting at the beginning of a sub-buffer because it uses the sub-buffer timestamp. By avoiding the timestamp at the start of the sub-buffer allows events that pass the first check to always find a sub-buffer that it can fit on. - Have large events that do not fit on a trace_seq to print "LINE TOO BIG" like it does for the trace_pipe instead of what it does now which is to silently drop the output. - Fix a memory leak of forgetting to free the spare page that is saved by a trace instance. - Update the size of the snapshot buffer when the main buffer is updated if the snapshot buffer is allocated. - Fix ring buffer timestamp logic by removing all the places that tried to put the before_stamp back to the write stamp so that the next event doesn't add an absolute timestamp. But each of these updates added a race where by making the two timestamp equal, it was validating the write_stamp so that it can be incorrectly used for calculating the delta of an event. - There's a temp buffer used for printing the event that was using the event data size for allocation when it needed to use the size of the entire event (meta-data and payload data) - For hardening, use "%.s" for printing the trace_marker output, to limit the amount that is printed by the size of the event. This was discovered by development that added a bug that truncated the '\0' and caused a crash. - Fix a use-after-free bug in the use of the histogram files when an instance is being removed. - Remove a useless update in the rb_try_to_discard of the write_stamp. The before_stamp was already changed to force the next event to add an absolute timestamp that the write_stamp is not used. But the write_stamp is modified again using an unneeded 64-bit cmpxchg. - Fix several races in the 32-bit implementation of the rb_time_cmpxchg() that does a 64-bit cmpxchg. - While looking at fixing the 64-bit cmpxchg, I noticed that because the ring buffer uses normal cmpxchg, and this can be done in NMI context, there's some architectures that do not have a working cmpxchg in NMI context. For these architectures, fail recording events that happen in NMI context. * tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI ring-buffer: Have rb_time_cmpxchg() set the msb counter too ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs ring-buffer: Remove useless update to write_stamp in rb_try_to_discard() ring-buffer: Do not try to put back write_stamp tracing: Fix uaf issue when open the hist or hist_debug file tracing: Add size check when printing trace_marker output ring-buffer: Have saved event hold the entire event ring-buffer: Do not update before stamp when switching sub-buffers tracing: Update snapshot buffer on resize if it is allocated ring-buffer: Fix memory leak of free page eventfs: Fix events beyond NAME_MAX blocking tasks tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing ring-buffer: Fix writing to the buffer with max_data_size	2023-12-16 10:40:51 -08:00
Linus Torvalds	c8e97fc6b4	arm64 fixes: - Arm CMN perf: fix the DTC allocation failure path which can end up erroneously clearing live counters - arm64/mm: fix hugetlb handling of the dirty page state leading to a continuous fault loop in user on hardware without dirty bit management (DBM). That's caused by the dirty+writeable information not being properly preserved across a series of mprotect(PROT_NONE), mprotect(PROT_READ\|PROT_WRITE) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmV85TIACgkQa9axLQDI XvFx/A/+P0PPVvIWr1VEggXeGhalhrXnn5H3TKm3F7Vz8+VH4X/z5gTdAj1FtwTu MIyjGFl3dQWEq5g4qscBNPqy045AGpRBJKmOw1V6mANuyRcg+06d9qCsosb7/pcs sMLsS34cmWPIQemd3AAyV20DeQPkYQEVEPdZ4QM0cvhCNCYspWmuqc8lEqldl25G 5AnwFPrWza5a/4bKZgVOlyXrZtUxX3uwN1/7IbMrJ6ncpsRn1QMjqRfSlrYlTbcw O0IAnLFqtXqvO7nVaBw5Jq2EYrj0oOC25Pg8fCmaLsFM2yMky4186slULHg3c63Z zGyMPOLWdFGa/Vj6yliB8xPrrJGgTfRbFk9LYa4BvJcU3nXxcMI/LXJzM7TZYMFr j1vkH4cLyf76r12xzT/UYooE+A8gMJNuns+G0RqGuPYZ7fA2ut77H1IpDxBiCCEM tB2ys8lV/GtkGqNseGNX75hNPgsykPsCi7HTnXjMFK84iP/6CFUE2V0haOV1cUDw 8r1nGe5wVJ/Yc/6/62mzCQjEluhdAn0gK1b/QQyOt0QN6maPXjIF4CILbuLwdRQh RDYNeRK+dlqLGmBwxM99zVrb3NnPH6oTm0Vq3VK+9mTQeGXNzmWfFQk8odTKiIEA oksWFX91exOXhcrqkB1GCRaLjrQWeuA2eq5RrPpmDmUsPyhoeCM= =wq2L -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Arm CMN perf: fix the DTC allocation failure path which can end up erroneously clearing live counters - arm64/mm: fix hugetlb handling of the dirty page state leading to a continuous fault loop in user on hardware without dirty bit management (DBM). That's caused by the dirty+writeable information not being properly preserved across a series of mprotect(PROT_NONE), mprotect(PROT_READ\|PROT_WRITE) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify perf/arm-cmn: Fail DTC counter allocation correctly	2023-12-15 19:59:03 -08:00
Linus Torvalds	2e3f280b24	pci-v6.7-fixes-1 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmV8vZwUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwQARAApzU4pesi9dkl/Fyv11IwayNm+gra 8dTsdT6dcnTq8DKXdTRtuyMwY+H57YC/Cxl0/Y6KPnJqUmgXEiOfe1duPvy5HJB2 YQGPszDC/yrbU/s65cWwuw+wLHk3PeoR/RNfo0PBNRb+FIoE2tV6mgAw0CR2xyhV MTDMMvdJBAQoNytmkw5ZYgdr3zUPgb80VgjBa453xGxMHlnpqhRIKNw5jXBOWCpY 1TkrDAtKzziCXVx9oqLDA46AgdVo48w+vPrC3wa/8kxv4/N0BplhiUrtAdXMrsIH wR3uX3ypfdjEWf+iX3dgGbvwoSZirlZdu9BaSTZqM/WAHdync/Hit/mF6rYpOQfJ 9WpALQkx11EvYpltZOO4JahaWueGxEK73/P43Cb9Pgj5zNiMagGwVc+1iEXMC0k8 MML/MZrQQaNcJOQL+V3rXr7pcRqV8X6H5/0K/e8M53D5U3ZkcjBc2QcqOd3/4ugf 7sa9JfXOFcBJvwUt3HQNHEyj+leJDJ09kRXSx8szfRrCGVTkNtZ9DxKePHzCk+kC kU4y+5E9iIsGwdGknnO53LbilgGtutJx+JpBPz0guXb53RIGQGcHfWVPHzB9fmJk Ty4d+zsP2OmKgMDX2FdVv4xVNYDsdOGG7PQu++pm0fBmIvBixRcQyfXrLdTMvjwX 6AOhXOppox58COs= =4BWR -----END PGP SIGNATURE----- Merge tag 'pci-v6.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Limit Max_Read_Request_Size (MRRS) on some MIPS Loongson systems because they don't all support MRRS > 256, and firmware doesn't always initialize it correctly, which meant some PCIe devices didn't work (Jiaxun Yang) - Add and use pci_enable_link_state_locked() to prevent potential deadlocks in vmd and qcom drivers (Johan Hovold) - Revert recent (v6.5) acpiphp resource assignment changes that fixed issues with hot-adding devices on a root bus or with large BARs, but introduced new issues with GPU initialization and hot-adding SCSI disks in QEMU VMs and (Bjorn Helgaas) * tag 'pci-v6.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI: acpiphp: Reassign resources on bridge if necessary" PCI/ASPM: Add pci_disable_link_state_locked() lockdep assert PCI/ASPM: Clean up __pci_disable_link_state() 'sem' parameter PCI: qcom: Clean up ASPM comment PCI: qcom: Fix potential deadlock when enabling ASPM PCI: vmd: Fix potential deadlock when enabling ASPM PCI/ASPM: Add pci_enable_link_state_locked() PCI: loongson: Limit MRRS to 256	2023-12-15 19:48:47 -08:00
Jakub Kicinski	358105ab92	Merge branch 'tcp-dccp-refine-source-port-selection' Eric Dumazet says: ==================== tcp/dccp: refine source port selection This patch series leverages IP_LOCAL_PORT_RANGE option to no longer favor even source port selection at connect() time. This should lower time taken by connect() for hosts having many active connections to the same destination. ==================== Link: https://lore.kernel.org/r/20231214192939.1962891-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 17:56:29 -08:00
Eric Dumazet	207184853d	tcp/dccp: change source port selection at connect() time In commit `1580ab63fc` ("tcp/dccp: better use of ephemeral ports in connect()") we added an heuristic to select even ports for connect() and odd ports for bind(). This was nice because no applications changes were needed. But it added more costs when all even ports are in use, when there are few listeners and many active connections. Since then, IP_LOCAL_PORT_RANGE has been added to permit an application to partition ephemeral port range at will. This patch extends the idea so that if IP_LOCAL_PORT_RANGE is set on a socket before accept(), port selection no longer favors even ports. This means that connect() can find a suitable source port faster, and applications can use a different split between connect() and bind() users. This should give more entropy to Toeplitz hash used in RSS: Using even ports was wasting one bit from the 16bit sport. A similar change can be done in inet_csk_find_open_port() if needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20231214192939.1962891-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 17:56:27 -08:00
Eric Dumazet	41db7626b7	inet: returns a bool from inet_sk_get_local_port_range() Change inet_sk_get_local_port_range() to return a boolean, telling the callers if the port range was provided by IP_LOCAL_PORT_RANGE socket option. Adds documentation while we are at it. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20231214192939.1962891-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 17:56:27 -08:00
Daniel Golle	b1dfc0f762	net: phy: skip LED triggers on PHYs on SFP modules Calling led_trigger_register() when attaching a PHY located on an SFP module potentially (and practically) leads into a deadlock. Fix this by not calling led_trigger_register() for PHYs localted on SFP modules as such modules actually never got any LEDs. ====================================================== WARNING: possible circular locking dependency detected 6.7.0-rc4-next-20231208+ #0 Tainted: G O ------------------------------------------------------ kworker/u8:2/43 is trying to acquire lock: ffffffc08108c4e8 (triggers_list_lock){++++}-{3:3}, at: led_trigger_register+0x4c/0x1a8 but task is already holding lock: ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&sfp->sm_mutex){+.+.}-{3:3}: __mutex_lock+0x88/0x7a0 mutex_lock_nested+0x20/0x28 cleanup_module+0x2ae0/0x3120 [sfp] sfp_register_bus+0x5c/0x9c sfp_register_socket+0x48/0xd4 cleanup_module+0x271c/0x3120 [sfp] platform_probe+0x64/0xb8 really_probe+0x17c/0x3c0 __driver_probe_device+0x78/0x164 driver_probe_device+0x3c/0xd4 __driver_attach+0xec/0x1f0 bus_for_each_dev+0x60/0xa0 driver_attach+0x20/0x28 bus_add_driver+0x108/0x208 driver_register+0x5c/0x118 __platform_driver_register+0x24/0x2c init_module+0x28/0xa7c [sfp] do_one_initcall+0x70/0x2ec do_init_module+0x54/0x1e4 load_module+0x1b78/0x1c8c __do_sys_init_module+0x1bc/0x2cc __arm64_sys_init_module+0x18/0x20 invoke_syscall.constprop.0+0x4c/0xdc do_el0_svc+0x3c/0xbc el0_svc+0x34/0x80 el0t_64_sync_handler+0xf8/0x124 el0t_64_sync+0x150/0x154 -> #2 (rtnl_mutex){+.+.}-{3:3}: __mutex_lock+0x88/0x7a0 mutex_lock_nested+0x20/0x28 rtnl_lock+0x18/0x20 set_device_name+0x30/0x130 netdev_trig_activate+0x13c/0x1ac led_trigger_set+0x118/0x234 led_trigger_write+0x104/0x17c sysfs_kf_bin_write+0x64/0x80 kernfs_fop_write_iter+0x128/0x1b4 vfs_write+0x178/0x2a4 ksys_write+0x58/0xd4 __arm64_sys_write+0x18/0x20 invoke_syscall.constprop.0+0x4c/0xdc do_el0_svc+0x3c/0xbc el0_svc+0x34/0x80 el0t_64_sync_handler+0xf8/0x124 el0t_64_sync+0x150/0x154 -> #1 (&led_cdev->trigger_lock){++++}-{3:3}: down_write+0x4c/0x13c led_trigger_write+0xf8/0x17c sysfs_kf_bin_write+0x64/0x80 kernfs_fop_write_iter+0x128/0x1b4 vfs_write+0x178/0x2a4 ksys_write+0x58/0xd4 __arm64_sys_write+0x18/0x20 invoke_syscall.constprop.0+0x4c/0xdc do_el0_svc+0x3c/0xbc el0_svc+0x34/0x80 el0t_64_sync_handler+0xf8/0x124 el0t_64_sync+0x150/0x154 -> #0 (triggers_list_lock){++++}-{3:3}: __lock_acquire+0x12a0/0x2014 lock_acquire+0x100/0x2ac down_write+0x4c/0x13c led_trigger_register+0x4c/0x1a8 phy_led_triggers_register+0x9c/0x214 phy_attach_direct+0x154/0x36c phylink_attach_phy+0x30/0x60 phylink_sfp_connect_phy+0x140/0x510 sfp_add_phy+0x34/0x50 init_module+0x15c/0xa7c [sfp] cleanup_module+0x1d94/0x3120 [sfp] cleanup_module+0x2bb4/0x3120 [sfp] process_one_work+0x1f8/0x4ec worker_thread+0x1e8/0x3d8 kthread+0x104/0x110 ret_from_fork+0x10/0x20 other info that might help us debug this: Chain exists of: triggers_list_lock --> rtnl_mutex --> &sfp->sm_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sfp->sm_mutex); lock(rtnl_mutex); lock(&sfp->sm_mutex); lock(triggers_list_lock); * DEADLOCK * 4 locks held by kworker/u8:2/43: #0: ffffff80c000f938 ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec #1: ffffffc08214bde8 ((work_completion)(&(&sfp->timeout)->work)){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec #2: ffffffc0810902f8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x18/0x20 #3: ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp] stack backtrace: CPU: 0 PID: 43 Comm: kworker/u8:2 Tainted: G O 6.7.0-rc4-next-20231208+ #0 Hardware name: Bananapi BPI-R4 (DT) Workqueue: events_power_efficient cleanup_module [sfp] Call trace: dump_backtrace+0xa8/0x10c show_stack+0x14/0x1c dump_stack_lvl+0x5c/0xa0 dump_stack+0x14/0x1c print_circular_bug+0x328/0x430 check_noncircular+0x124/0x134 __lock_acquire+0x12a0/0x2014 lock_acquire+0x100/0x2ac down_write+0x4c/0x13c led_trigger_register+0x4c/0x1a8 phy_led_triggers_register+0x9c/0x214 phy_attach_direct+0x154/0x36c phylink_attach_phy+0x30/0x60 phylink_sfp_connect_phy+0x140/0x510 sfp_add_phy+0x34/0x50 init_module+0x15c/0xa7c [sfp] cleanup_module+0x1d94/0x3120 [sfp] cleanup_module+0x2bb4/0x3120 [sfp] process_one_work+0x1f8/0x4ec worker_thread+0x1e8/0x3d8 kthread+0x104/0x110 ret_from_fork+0x10/0x20 Signed-off-by: Daniel Golle <daniel@makrotopia.org> Fixes: `01e5b728e9` ("net: phy: Add a binding for PHY LEDs") Link: https://lore.kernel.org/r/102a9dce38bdf00215735d04cd4704458273ad9c.1702339354.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 17:53:42 -08:00
Rob Herring	758a8d5b6a	dt-bindings: net: marvell,orion-mdio: Drop "reg" sizes schema Defining the size of register regions is not really in scope of what bindings need to cover. The schema for this is also not completely correct as a reg entry can be variable number of cells for the address and size, but the schema assumes 1 cell. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213232455.2248056-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 17:44:28 -08:00
Jiri Olsa	117211aa73	bpf: Add missing BPF_LINK_TYPE invocations Pengfei Xu reported [1] Syzkaller/KASAN issue found in bpf_link_show_fdinfo. The reason is missing BPF_LINK_TYPE invocation for uprobe multi link and for several other links, adding that. [1] https://lore.kernel.org/bpf/ZXptoKRSLspnk2ie@xpf.sh.intel.com/ Fixes: `89ae89f53d` ("bpf: Add multi uprobe link") Fixes: `e420bed025` ("bpf: Add fd-based tcx multi-prog infra with link support") Fixes: `84601d6ee6` ("bpf: add bpf_link support for BPF_NETFILTER programs") Fixes: `35dfaad718` ("netkit, bpf: Add bpf programmable net device") Reported-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231215230502.2769743-1-jolsa@kernel.org	2023-12-15 16:34:12 -08:00
Alexei Starovoitov	42d45c4562	selftests/bpf: Temporarily disable dummy_struct_ops test on s390 Temporarily disable dummy_struct_ops test on s390. The breakage is likely due to commit `2cd3e3772e` ("x86/cfi,bpf: Fix bpf_struct_ops CFI"). Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:28:25 -08:00
Alexei Starovoitov	3c302e14bd	Merge branch 'x86-cfi-bpf-fix-cfi-vs-ebpf' Peter Zijlstra says: ==================== x86/cfi,bpf: Fix CFI vs eBPF Hi! What started with the simple observation that bpf_dispatcher__func() was broken for calling CFI functions with a __nocfi calling context for FineIBT ended up with a complete BPF wide CFI fixup. With these changes on the BPF selftest suite passes without crashing -- there's still a few failures, but Alexei has graciously offered to look into those. (Alexei, I have presumed your SoB on the very last patch, please update as you see fit) Changes since v2 are numerous but include: - cfi_get_offset() -- as a means to communicate the offset (ast) - 5 new patches fixing various BPF internals to be CFI clean Note: it might* be possible to merge the bpf_bpf_tcp_ca.c:unsupported_ops[] thing into the CFI stubs, as is get_info will have a NULL stub, unlike the others. --- arch/riscv/include/asm/cfi.h \| 3 +- arch/riscv/kernel/cfi.c \| 2 +- arch/x86/include/asm/cfi.h \| 126 +++++++++++++++++++++++++++++++++++++- arch/x86/kernel/alternative.c \| 87 +++++++++++++++++++++++--- arch/x86/kernel/cfi.c \| 4 +- arch/x86/net/bpf_jit_comp.c \| 134 +++++++++++++++++++++++++++++++++++------ include/asm-generic/Kbuild \| 1 + include/linux/bpf.h \| 27 ++++++++- include/linux/cfi.h \| 12 ++++ kernel/bpf/bpf_struct_ops.c \| 16 ++--- kernel/bpf/core.c \| 25 ++++++++ kernel/bpf/cpumask.c \| 8 ++- kernel/bpf/helpers.c \| 18 +++++- net/bpf/bpf_dummy_struct_ops.c \| 31 +++++++++- net/bpf/test_run.c \| 15 ++++- net/ipv4/bpf_tcp_ca.c \| 69 +++++++++++++++++++++ 16 files changed, 528 insertions(+), 50 deletions(-) ==================== Link: https://lore.kernel.org/r/20231215091216.135791411@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:56 -08:00
Alexei Starovoitov	852486b35f	x86/cfi,bpf: Fix bpf_exception_cb() signature As per the earlier patches, BPF sub-programs have bpf_callback_t signature and CFI expects callers to have matching signature. This is violated by bpf_prog_aux::bpf_exception_cb(). [peterz: Changelog] Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAADnVQ+Z7UcXXBBhMubhcMM=R-dExk-uHtfOLtoLxQ1XxEpqEA@mail.gmail.com Link: https://lore.kernel.org/r/20231215092707.910319166@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	e4c0033989	bpf: Fix dtor CFI Ensure the various dtor functions match their prototype and retain their CFI signatures, since they don't have their address taken, they are prone to not getting CFI, making them impossible to call indirectly. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231215092707.799451071@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	e9d13b9d2f	cfi: Add CFI_NOSEAL() Add a CFI_NOSEAL() helper to mark functions that need to retain their CFI information, despite not otherwise leaking their address. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231215092707.669401084@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	2cd3e3772e	x86/cfi,bpf: Fix bpf_struct_ops CFI BPF struct_ops uses __arch_prepare_bpf_trampoline() to write trampolines for indirect function calls. These tramplines much have matching CFI. In order to obtain the correct CFI hash for the various methods, add a matching structure that contains stub functions, the compiler will generate correct CFI which we can pilfer for the trampolines. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231215092707.566977112@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	e72d88d18d	x86/cfi,bpf: Fix bpf_callback_t CFI Where the main BPF program is expected to match bpf_func_t, sub-programs are expected to match bpf_callback_t. This fixes things like: tools/testing/selftests/bpf/progs/bloom_filter_bench.c: bpf_for_each_map_elem(&array_map, bloom_callback, &data, 0); Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231215092707.451956710@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	4f9087f166	x86/cfi,bpf: Fix BPF JIT call The current BPF call convention is __nocfi, except when it calls !JIT things, then it calls regular C functions. It so happens that with FineIBT the __nocfi and C calling conventions are incompatible. Specifically __nocfi will call at func+0, while FineIBT will have endbr-poison there, which is not a valid indirect target. Causing #CP. Notably this only triggers on IBT enabled hardware, which is probably why this hasn't been reported (also, most people will have JIT on anyway). Implement proper CFI prologues for the BPF JIT codegen and drop __nocfi for x86. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231215092707.345270396@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Peter Zijlstra	4382159696	cfi: Flip headers Normal include order is that linux/foo.h should include asm/foo.h, CFI has it the wrong way around. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20231215092707.231038174@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-15 16:25:55 -08:00
Josef Bacik	a8892fd719	btrfs: do not allow non subvolume root targets for snapshot Our btrfs subvolume snapshot <source> <destination> utility enforces that <source> is the root of the subvolume, however this isn't enforced in the kernel. Update the kernel to also enforce this limitation to avoid problems with other users of this ioctl that don't have the appropriate checks in place. Reported-by: Martin Michaelis <code@mgjm.de> CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-12-15 23:46:51 +01:00
Jens Axboe	ae1914174a	cred: get rid of CONFIG_DEBUG_CREDENTIALS This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-12-15 14:19:48 -08:00
Jens Axboe	f8fa5d7692	cred: switch to using atomic_long_t There are multiple ways to grab references to credentials, and the only protection we have against overflowing it is the memory required to do so. With memory sizes only moving in one direction, let's bump the reference count to 64-bit and move it outside the realm of feasibly overflowing. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-12-15 14:08:46 -08:00
Hou Tao	1467affd16	selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment If an abnormally huge cnt is used for multi-kprobes attachment, the following warning will be reported: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 392 at mm/util.c:632 kvmalloc_node+0xd9/0xe0 Modules linked in: bpf_testmod(O) CPU: 1 PID: 392 Comm: test_progs Tainted: G ...... 6.7.0-rc3+ #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... RIP: 0010:kvmalloc_node+0xd9/0xe0 ? __warn+0x89/0x150 ? kvmalloc_node+0xd9/0xe0 bpf_kprobe_multi_link_attach+0x87/0x670 __sys_bpf+0x2a28/0x2bc0 __x64_sys_bpf+0x1a/0x30 do_syscall_64+0x36/0xb0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7fbe067f0e0d ...... </TASK> ---[ end trace 0000000000000000 ]--- So add a test to ensure the warning is fixed. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231215100708.2265609-6-houtao@huaweicloud.com	2023-12-15 22:54:55 +01:00
Hou Tao	00cdcd2900	selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test Since libbpf v1.0, libbpf doesn't return error code embedded into the pointer iteself, libbpf_get_error() is deprecated and it is basically the same as using -errno directly. So replace the invocations of libbpf_get_error() by -errno in kprobe_multi_test. For libbpf_get_error() in test_attach_api_fails(), saving -errno before invoking ASSERT_xx() macros just in case that errno is overwritten by these macros. However, the invocation of libbpf_get_error() in get_syms() should be kept intact, because hashmap__new() still returns a pointer with embedded error code. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231215100708.2265609-5-houtao@huaweicloud.com	2023-12-15 22:54:55 +01:00
Hou Tao	0d83786f56	selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment If an abnormally huge cnt is used for multi-uprobes attachment, the following warning will be reported: ------------[ cut here ]------------ WARNING: CPU: 7 PID: 406 at mm/util.c:632 kvmalloc_node+0xd9/0xe0 Modules linked in: bpf_testmod(O) CPU: 7 PID: 406 Comm: test_progs Tainted: G ...... 6.7.0-rc3+ #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... RIP: 0010:kvmalloc_node+0xd9/0xe0 ...... Call Trace: <TASK> ? __warn+0x89/0x150 ? kvmalloc_node+0xd9/0xe0 bpf_uprobe_multi_link_attach+0x14a/0x480 __sys_bpf+0x14a9/0x2bc0 do_syscall_64+0x36/0xb0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 ...... </TASK> ---[ end trace 0000000000000000 ]--- So add a test to ensure the warning is fixed. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231215100708.2265609-4-houtao@huaweicloud.com	2023-12-15 22:54:55 +01:00
Hou Tao	d6d1e6c17c	bpf: Limit the number of kprobes when attaching program to multiple kprobes An abnormally big cnt may also be assigned to kprobe_multi.cnt when attaching multiple kprobes. It will trigger the following warning in kvmalloc_node(): if (unlikely(size > INT_MAX)) { WARN_ON_ONCE(!(flags & __GFP_NOWARN)); return NULL; } Fix the warning by limiting the maximal number of kprobes in bpf_kprobe_multi_link_attach(). If the number of kprobes is greater than MAX_KPROBE_MULTI_CNT, the attachment will fail and return -E2BIG. Fixes: `0dcac27254` ("bpf: Add multi kprobe link") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231215100708.2265609-3-houtao@huaweicloud.com	2023-12-15 22:54:55 +01:00
Hou Tao	8b2efe51ba	bpf: Limit the number of uprobes when attaching program to multiple uprobes An abnormally big cnt may be passed to link_create.uprobe_multi.cnt, and it will trigger the following warning in kvmalloc_node(): if (unlikely(size > INT_MAX)) { WARN_ON_ONCE(!(flags & __GFP_NOWARN)); return NULL; } Fix the warning by limiting the maximal number of uprobes in bpf_uprobe_multi_link_attach(). If the number of uprobes is greater than MAX_UPROBE_MULTI_CNT, the attachment will return -E2BIG. Fixes: `89ae89f53d` ("bpf: Add multi uprobe link") Reported-by: Xingwei Lee <xrivendell7@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Closes: https://lore.kernel.org/bpf/CABOYnLwwJY=yFAGie59LFsUsBAgHfroVqbzZ5edAXbFE3YiNVA@mail.gmail.com Link: https://lore.kernel.org/bpf/20231215100708.2265609-2-houtao@huaweicloud.com	2023-12-15 22:54:46 +01:00
Bjorn Helgaas	5df12742b7	Revert "PCI: acpiphp: Reassign resources on bridge if necessary" This reverts commit `40613da52b` and the subsequent fix to it: `cc22522fd5` ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") `40613da52b` fixed a problem where hot-adding a device with large BARs failed if the bridge windows programmed by firmware were not large enough. `cc22522fd5` ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") fixed a problem with `40613da52b`: an ACPI hot-add of a device on a PCI root bus (common in the virt world) or firmware sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron 7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs. Unfortunately the combination of `40613da52b` and `cc22522fd5` caused other problems: - Fiona reported that hot-add of SCSI disks in QEMU virtual machine fails sometimes. - Dongli reported a similar problem with hot-add of SCSI disks. - Jonathan reported a console freeze during boot on bare metal due to an error in radeon GPU initialization. Revert both patches to avoid adding these problems. This means we will again see the problems with hot-adding devices with large BARs and the NULL pointer dereferences and suspend/resume issues that `40613da52b` and `cc22522fd5` were intended to fix. Fixes: `40613da52b` ("PCI: acpiphp: Reassign resources on bridge if necessary") Fixes: `cc22522fd5` ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") Reported-by: Fiona Ebner <f.ebner@proxmox.com> Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com Reported-by: Dongli Zhang <dongli.zhang@oracle.com> Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.com Reported-by: Jonathan Woithe <jwoithe@just42.net> Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.au Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Cc: <stable@vger.kernel.org>	2023-12-15 14:55:10 -06:00

... 4 5 6 7 8 ...

1236784 commits