linux-stable/net
Linus Torvalds 9961a78594 for-6.10/io_uring-20240511

Merge tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linux

Pull io_uring updates from Jens Axboe:

 - Greatly improve send zerocopy performance, by enabling coalescing of
   sent buffers.

   MSG_ZEROCOPY already does this with send(2) and sendmsg(2), but the
   io_uring side did not. In local testing, the crossover point for send
   zerocopy being faster is now around 3000-byte packets, and it
   performs better than the sync syscall variants as well.

   This feature relies on a shared branch with net-next, which was
   pulled into both branches.

 - Unification of how async preparation is done across opcodes.

   Previously, opcodes that required extra memory for async retry would
   allocate that as needed, using on-stack state until that was the
   case. If async retry was needed, the on-stack state was adjusted
   appropriately for a retry and then copied to the allocated memory.

   This led to some fragile and ugly code, particularly for read/write
   handling, and made storage retries more difficult than they needed to
   be. Allocate the memory upfront, as it's cheap from our pools, and
   use that state consistently both initially and also from the retry
   side.

 - Move away from using remap_pfn_range() for mapping the rings.

   This is really not the right interface to use and can cause lifetime
   issues or leaks. Additionally, it means the ring sq/cq arrays need to
   be physically contiguous, which can cause problems in production with
   larger rings when services are restarted, as memory can be very
   fragmented at that point.

   Move to using vm_insert_page(s) for the ring sq/cq arrays, and apply
   the same treatment to mapped ring provided buffers. This also helps
   unify the code we have dealing with allocating and mapping memory.

   This is hard to see in the diffstat as we're adding a few features
   too, but it removes about 400 lines of code from the codebase.

 - Add support for bundles for send/recv.

   When used with provided buffers, bundles support sending or receiving
   more than one buffer at a time, improving efficiency by only
   needing to call into the networking stack once for multiple sends or
   receives.

 - Tweaks for our accept operations, supporting both a DONTWAIT flag for
   skipping poll arm and retry if we can, and a POLLFIRST flag that the
   application can use to skip the initial accept attempt and rely
   purely on poll for triggering the operation. Both of these have
   identical flags on the receive side already.

 - Make the task_work ctx locking unconditional.

   We had various code paths here that would do a mix of lock/trylock
   and set the task_work state to whether or not it was locked. All of
   that goes away; we lock it unconditionally and get rid of the state
   flag indicating whether it's locked or not.

   The state struct still exists as an empty type; it can go away in
   the future.

 - Add support for specifying NOP completion values, allowing it to be
   used for error handling testing.

 - Use set/test bit for io-wq worker flags. Not strictly needed, but
   also doesn't hurt and helps silence a KCSAN warning.

 - Cleanups for io-wq locking and work assignments, closing a tiny race
   where cancelations would not be able to find the work item reliably.

 - Misc fixes, cleanups, and improvements
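
To illustrate the bundle item above: a single bundle completion can cover
several provided buffers, so the application has to work out which buffers
were filled. The following is a minimal userspace model in Python, not
kernel or liburing code. It assumes the completion reports a starting
buffer plus a total byte count, and that buffers are consumed in ring
order, each one filled before the next is touched. The function name
buffers_consumed and its parameters are hypothetical, and buffers are
identified here by ring position rather than by buffer ID.

```python
def buffers_consumed(buf_sizes, start, total_bytes):
    """Model which provided buffers one bundle completion covers.

    buf_sizes   -- sizes of the provided buffers, in ring order (assumed)
    start       -- ring position of the first buffer used (assumed to be
                   derivable from the completion)
    total_bytes -- total bytes reported by the completion

    Returns a list of (ring_position, bytes_used) pairs.
    """
    if any(size <= 0 for size in buf_sizes):
        raise ValueError("buffer sizes must be positive")

    used = []
    remaining = total_bytes
    pos = start
    while remaining > 0:
        if len(used) == len(buf_sizes):
            # More data than all buffers together could hold.
            raise ValueError("completion larger than the provided buffers")
        take = min(remaining, buf_sizes[pos])
        used.append((pos, take))
        remaining -= take
        pos = (pos + 1) % len(buf_sizes)  # the ring wraps around
    return used
```

For example, with four 4096-byte buffers, a completion of 6000 bytes
starting at position 1 covers all of buffer 1 and the first 1904 bytes of
buffer 2. Only one trip into the networking stack was needed to fill both.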

* tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linux: (97 commits)
  io_uring: support to inject result for NOP
  io_uring: fail NOP if non-zero op flags is passed in
  io_uring/net: add IORING_ACCEPT_POLL_FIRST flag
  io_uring/net: add IORING_ACCEPT_DONTWAIT flag
  io_uring/filetable: don't unnecessarily clear/reset bitmap
  io_uring/io-wq: Use set_bit() and test_bit() at worker->flags
  io_uring/msg_ring: cleanup posting to IOPOLL vs !IOPOLL ring
  io_uring: Require zeroed sqe->len on provided-buffers send
  io_uring/notif: disable LAZY_WAKE for linked notifs
  io_uring/net: fix sendzc lazy wake polling
  io_uring/msg_ring: reuse ctx->submitter_task read using READ_ONCE instead of re-reading it
  io_uring/rw: reinstate thread check for retries
  io_uring/notif: implement notification stacking
  io_uring/notif: simplify io_notif_flush()
  net: add callback for setting a ubuf_info to skb
  net: extend ubuf_info callback to ops structure
  io_uring/net: support bundles for recv
  io_uring/net: support bundles for send
  io_uring/kbuf: add helpers for getting/peeking multiple buffers
  io_uring/net: add provided buffer support for IORING_OP_SEND
  ...
2024-05-13 12:48:06 -07:00
6lowpan
9p netfs, 9p: Implement helpers for new write code 2024-05-01 18:07:37 +01:00
802
8021q net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb 2024-05-02 11:02:48 +02:00
appletalk appletalk: Improve handling of broadcast packets 2024-05-08 12:17:19 +01:00
atm
ax25 ax25: Fix netdev refcount issue 2024-04-23 11:35:52 +02:00
batman-adv batman-adv: Avoid infinite loop trying to resize local TT 2024-03-29 20:18:43 +01:00
bluetooth Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout 2024-05-03 13:05:54 -04:00
bpf for-netdev 2024-03-11 18:06:04 -07:00
bridge net: bridge: fix corrupted ethernet header on multicast-to-unicast 2024-05-08 10:37:57 +01:00
caif
can
ceph libceph: init the cursor when preparing sparse read in msgr2 2024-03-06 12:43:01 +01:00
core for-6.10/io_uring-20240511 2024-05-13 12:48:06 -07:00
dcb
dccp Kbuild updates for v6.9 2024-03-21 14:41:00 -07:00
devlink devlink: fix port new reply cmd type 2024-03-19 19:37:57 -07:00
dns_resolver
dsa net: dsa: Leverage core stats allocator 2024-03-07 20:37:13 -08:00
ethernet ethernet: Add helper for assigning packet type when dest address does not match device address 2024-04-25 08:20:54 -07:00
ethtool ethtool: remove ethtool_eee_use_linkmodes 2024-03-06 20:40:20 -08:00
handshake
hsr hsr: Simplify code for announcing HSR nodes timer setup 2024-05-08 18:56:30 -07:00
ieee802154 Merge tag 'ieee802154-for-net-next-2024-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next 2024-03-08 20:35:33 -08:00
ife
ipv4 ipsec-2024-05-02 2024-05-03 15:56:15 -07:00
ipv6 ipv6: prevent NULL dereference in ip6_output() 2024-05-08 18:57:12 -07:00
iucv more s390 updates for 6.9 merge window 2024-03-19 11:38:27 -07:00
kcm net: kcm: fix incorrect parameter validation in the kcm_getsockopt) function 2024-03-11 09:53:22 +00:00
key
l2tp net l2tp: drop flow hash on forward 2024-04-26 13:48:24 +02:00
l3mdev
lapb
llc
mac80211 wifi: mac80211: fix unaligned le16 access 2024-04-19 10:02:27 +02:00
mac802154 mac802154: fix llsec key resources release in mac802154_llsec_key_del 2024-03-06 21:01:26 +01:00
mctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-29 14:24:56 -08:00
mpls - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
mptcp mptcp: only allow set existing scheduler for net.mptcp.scheduler 2024-05-07 17:23:35 -07:00
ncsi
netfilter netfilter: nf_tables: honor table dormant flag from netdev release event path 2024-04-25 10:42:57 +02:00
netlabel netlabel: remove impossible return value in netlbl_bitmap_walk 2024-02-28 19:37:34 -08:00
netlink net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID 2024-03-11 15:48:34 -07:00
netrom netrom: Fix data-races around sysctl_net_busy_read 2024-03-07 10:36:58 +01:00
nfc nfc: nci: Fix kcov check in nci_rx_work() 2024-05-07 16:40:06 -07:00
nsh nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment(). 2024-04-26 12:20:01 +02:00
openvswitch net: openvswitch: Fix Use-After-Free in ovs_ct_exit 2024-04-24 17:14:24 -07:00
packet Revert "net: Re-use and set mono_delivery_time bit for userspace tstamp packets" 2024-03-18 12:29:53 +00:00
phonet phonet: fix rtm_phonet_notify() skb allocation 2024-05-06 18:30:00 -07:00
psample
qrtr
rds net/rds: fix possible cp null dereference 2024-03-29 12:04:09 -07:00
rfkill
rose
rxrpc rxrpc: Only transmit one ACK per jumbo packet received 2024-05-08 08:05:03 -07:00
sched net/sched: Fix mirred deadlock on device recursion 2024-04-17 18:22:52 -07:00
sctp net: introduce include/net/rps.h 2024-03-07 21:12:43 -08:00
smc net/smc: fix neighbour and rtable leak in smc_ib_find_route() 2024-05-09 10:03:43 +02:00
strparser
sunrpc NFS client bugfixes for Linux 6.9 2024-04-29 12:07:37 -07:00
switchdev
tipc tipc: fix a possible memleak in tipc_buf_append 2024-05-01 18:39:44 -07:00
tls tls: fix lockless read of strp->msg_ready in ->poll 2024-04-25 08:32:37 -07:00
unix af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). 2024-04-25 08:37:02 -07:00
vmw_vsock vsock/virtio: fix packet delivery to tap device 2024-04-02 18:00:24 -07:00
wireless wifi: nl80211: don't free NULL coalescing rule 2024-04-19 10:02:17 +02:00
x25 net/x25: fix incorrect parameter validation in the x25_getsockopt() function 2024-03-11 09:53:22 +00:00
xdp xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING 2024-04-05 22:47:22 -07:00
xfrm xfrm: Preserve vlan tags for transport mode software GRO 2024-04-26 06:44:33 +02:00
Kconfig
Kconfig.debug
Makefile
compat.c
devres.c
socket.c io_uring: separate header for exported net bits 2024-04-15 08:10:26 -06:00
sysctl_net.c