Commit Graph

1188942 Commits

Author SHA1 Message Date
Jakub Kicinski 61dc651cdf netfilter pull request 23-06-26
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmSZMg8ACgkQ1V2XiooU
 IOSPnQ//VBUgCxUtgCuQX4PwY+Dqr//BLD8DcA8SNMKqCpYPmXyamPS+ZhtRI1c5
 NplWcFuaER4hqiVHkbKyOzOitDXQb4s4Mn09YyLAIb2/uhxg1f79SYSGSi7H5Fay
 LElP7l84ars0GxlyUNmiwyzA4ySFyuWekT35o2A3eX5gawpf9mPpD21uOqAOKcad
 V2Z7Rz0mFcz3e400DNEx2DNehXSWZQT2O+05zIWFfpBZ7UB42GJaC+Id1RqtIX2m
 w5a2DtvWGKUcgWkA5KHqSQn0Ft21MePqL4QsS/s3z0jffPJUkoQX9pqnccFqr9LL
 0aWKOSJFZoYtnbUGkRaPY5Kdob7Wgk5px4FUUBHORb39I98w0zP5h1hFY8jgMJxn
 J4+8Ys4C7Kv3Z+vq6sEo07WnbaIhj4LNO9GRwjaO2NP/UPUqrGIuhB1elBoVL8uX
 YvoVF6oRaB4ccaH7gR/4R6liF9flsH16OYJTbHp632Ali1nVZDP1vAvNviI5V12G
 WhrPVi50Utxn9KrV6ez6JJY2ysts7tip/TVAxQN0hDIS22IOxJcuiYoCxaXOEjPJ
 8hd6jkF0NApwnlSkPmoqo+ohQ42Az2PtDvEsENw+U7XHur99Ed1ywxRG/K/Pm6gX
 QjUhoAfr9hXObwjoHKNID3VZnZpEjEsVr7CBFvj6S6FyTx3Ag5k=
 =wxop
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-23-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

1) Allow slightly larger IPVS connection table size from Kconfig for
   64-bit arch, from Abhijeet Rastogi.

2) Since IPVS connection table might be larger than 2^20 after previous
   patch, allow to limit it depending on the available memory.
   Moreover, use kvmalloc. From Julian Anastasov.

3) Do not rebuild VLAN header in nft_payload when matching source and
   destination MAC address.

4) Remove nested rcu read lock side in ip_set_test(), from Florian Westphal.

5) Allow to update set size, also from Florian.

6) Improve NAT tuple selection when connection is closing,
   from Florian Westphal.

7) Support for resetting set element stateful expression, from Phil Sutter.

8) Use NLA_POLICY_MAX to narrow down maximum attribute value in nf_tables,
   from Florian Westphal.

* tag 'nf-next-23-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: limit allowed range via nla_policy
  netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET
  netfilter: snat: evict closing tcp entries on reply tuple collision
  netfilter: nf_tables: permit update of set size
  netfilter: ipset: remove rcu_read_lock_bh pair from ip_set_test
  netfilter: nft_payload: rebuild vlan header when needed
  ipvs: dynamically limit the connection hash table
  ipvs: increase ip_vs_conn_tab_bits range for 64BIT
====================

Link: https://lore.kernel.org/r/20230626064749.75525-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-26 12:59:18 -07:00
David S. Miller 771ca3de25 Merge branch 'sfc-next'
Edward Cree says:

====================
sfc: fix unaligned access in loopback selftests

Arnd reported that the sfc drivers each define a packed loopback_payload
 structure with an ethernet header followed by an IP header, whereas the
 kernel definition of iphdr specifies that this is 4-byte aligned,
 causing a W=1 warning.
Fix this in each case by adding two bytes of leading padding to the
 struct, taking care that these are not sent on the wire.
Tested on EF10; build-tested on Siena and Falcon.

Changed in v2:
* added __aligned(4) to payload struct definitions (Arnd)
* fixed dodgy whitespace (checkpatch)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-26 10:36:48 +01:00
Edward Cree 1186c6b31e sfc: falcon: use padding to fix alignment in loopback test
Add two bytes of padding to the start of struct ef4_loopback_payload,
 which are not sent on the wire.  This ensures the 'ip' member is
 4-byte aligned, preventing the following W=1 warning:
net/ethernet/sfc/falcon/selftest.c:43:15: error: field ip within 'struct ef4_loopback_payload' is less aligned than 'struct iphdr' and is usually due to 'struct ef4_loopback_payload' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
        struct iphdr ip;

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-26 10:36:48 +01:00
Edward Cree 30c24dd87f sfc: siena: use padding to fix alignment in loopback test
Add two bytes of padding to the start of struct efx_loopback_payload,
 which are not sent on the wire.  This ensures the 'ip' member is
 4-byte aligned, preventing the following W=1 warning:
net/ethernet/sfc/siena/selftest.c:46:15: error: field ip within 'struct efx_loopback_payload' is less aligned than 'struct iphdr' and is usually due to 'struct efx_loopback_payload' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
        struct iphdr ip;

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-26 10:36:48 +01:00
Edward Cree cf60ed4696 sfc: use padding to fix alignment in loopback test
Add two bytes of padding to the start of struct efx_loopback_payload,
 which are not sent on the wire.  This ensures the 'ip' member is
 4-byte aligned, preventing the following W=1 warning:
net/ethernet/sfc/selftest.c:46:15: error: field ip within 'struct efx_loopback_payload' is less aligned than 'struct iphdr' and is usually due to 'struct efx_loopback_payload' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
        struct iphdr ip;

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-26 10:36:48 +01:00
Florian Westphal a412dbf40f netfilter: nf_tables: limit allowed range via nla_policy
These NLA_U32 types get stored in u8 fields, reject invalid values
instead of silently casting to u8.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:57 +02:00
Phil Sutter 079cd63321 netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET
Analogous to NFT_MSG_GETOBJ_RESET, but for set elements with a timeout
or attached stateful expressions like counters or quotas - reset them
all at once. Respect a per element timeout value if present to reset the
'expires' value to.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:57 +02:00
Florian Westphal 4589725502 netfilter: snat: evict closing tcp entries on reply tuple collision
When all tried source tuples are in use, the connection request (skb)
and the new conntrack will be dropped in nf_confirm() due to the
non-recoverable clash.

Make it so that the last 32 attempts are allowed to evict a colliding
entry if this connection is already closing and the new sequence number
has advanced past the old one.

Such "all tuples taken" secenario can happen with tcp-rpc workloads where
same dst:dport gets queried repeatedly.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:57 +02:00
Florian Westphal 96b2ef9b16 netfilter: nf_tables: permit update of set size
Now that set->nelems is always updated permit update of the sets max size.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:57 +02:00
Florian Westphal 78aa23d008 netfilter: ipset: remove rcu_read_lock_bh pair from ip_set_test
Callers already hold rcu_read_lock.

Prior to RCU conversion this used to be a read_lock_bh(), but now the
bh-disable isn't needed anymore.

Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:56 +02:00
Pablo Neira Ayuso de6843be30 netfilter: nft_payload: rebuild vlan header when needed
Skip rebuilding the vlan header when accessing destination and source
mac address.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 08:05:45 +02:00
Jakub Kicinski 9ae440b8fd Merge branch 'splice-net-switch-over-users-of-sendpage-and-remove-it'
David Howells says:

====================
splice, net: Switch over users of sendpage() and remove it

Here's the final set of patches towards the removal of sendpage.  All the
drivers that use sendpage() get switched over to using sendmsg() with
MSG_SPLICE_PAGES.

The following changes are made:

 (1) Make the protocol drivers behave according to MSG_MORE, not
     MSG_SENDPAGE_NOTLAST.  The latter is restricted to turning on MSG_MORE
     in the sendpage() wrappers.

 (2) Fix ocfs2 to allocate its global protocol buffers with folio_alloc()
     rather than kzalloc() so as not to invoke the !sendpage_ok warning in
     skb_splice_from_iter().

 (3) Make ceph/rds, skb_send_sock, dlm, nvme, smc, ocfs2, drbd and iscsi
     use sendmsg(), not sendpage and make them specify MSG_MORE instead of
     MSG_SENDPAGE_NOTLAST.

 (4) Kill off sendpage and clean up MSG_SENDPAGE_NOTLAST.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1
Link: https://lore.kernel.org/r/20230616161301.622169-1-dhowells@redhat.com/ # v1
Link: https://lore.kernel.org/r/20230617121146.716077-1-dhowells@redhat.com/ # v2
Link: https://lore.kernel.org/r/20230620145338.1300897-1-dhowells@redhat.com/ # v3
====================

Link: https://lore.kernel.org/r/20230623225513.2732256-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:21 -07:00
David Howells b848b26c66 net: Kill MSG_SENDPAGE_NOTLAST
Now that ->sendpage() has been removed, MSG_SENDPAGE_NOTLAST can be cleaned
up.  Things were converted to use MSG_MORE instead, but the protocol
sendpage stubs still convert MSG_SENDPAGE_NOTLAST to MSG_MORE, which is now
unnecessary.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20230623225513.2732256-17-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells dc97391e66 sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells e52828cc01 ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
Switch ocfs2 from using sendpage() to using sendmsg() + MSG_SPLICE_PAGES so
that sendpage can be phased out.

Signed-off-by: David Howells <dhowells@redhat.com>

cc: Mark Fasheh <mark@fasheh.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-15-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells 86d7bd6e66 ocfs2: Fix use of slab data with sendpage
ocfs2 uses kzalloc() to allocate buffers for o2net_hand, o2net_keep_req and
o2net_keep_resp and then passes these to sendpage.  This isn't really
allowed as the lifetime of slab objects is not controlled by page ref -
though in this case it will probably work.  sendmsg() with MSG_SPLICE_PAGES
will, however, print a warning and give an error.

Fix it to use folio_alloc() instead to allocate a buffer for the handshake
message, keepalive request and reply messages.

Fixes: 98211489d4 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mark Fasheh <mark@fasheh.com>
cc: Kurt Hackel <kurt.hackel@oracle.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-14-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells d2fe21077d scsi: target: iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage.  This allows
multiple pages and multipage folios to be passed through.

TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the
entire set of pages it's going to transfer plus two for the header and
trailer and page fragments to hold the header and trailer - and then call
sendmsg once for the entire message.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mike Christie <michael.christie@oracle.com>
cc: Maurizio Lombardi <mlombard@redhat.com>
cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
cc: "Martin K. Petersen" <martin.petersen@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: open-iscsi@googlegroups.com
Link: https://lore.kernel.org/r/20230623225513.2732256-13-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells fa8df34357 scsi: iscsi_tcp: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage.  This allows
multiple pages and multipage folios to be passed through.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
cc: Lee Duncan <lduncan@suse.com>
cc: Chris Leech <cleech@redhat.com>
cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
cc: "Martin K. Petersen" <martin.petersen@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: open-iscsi@googlegroups.com
Link: https://lore.kernel.org/r/20230623225513.2732256-12-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells eeac7405c7 drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
Use sendmsg() conditionally with MSG_SPLICE_PAGES in _drbd_send_page()
rather than calling sendpage() or _drbd_no_send_page().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Philipp Reisner <philipp.reisner@linbit.com>
cc: Lars Ellenberg <lars.ellenberg@linbit.com>
cc: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: drbd-dev@lists.linbit.com
Link: https://lore.kernel.org/r/20230623225513.2732256-11-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:13 -07:00
David Howells 2f8bc2bbb0 smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES
Drop the smc_sendpage() code as smc_sendmsg() just passes the call down to
the underlying TCP socket and smc_tx_sendpage() is just a wrapper around
its sendmsg implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Karsten Graul <kgraul@linux.ibm.com>
cc: Wenjia Zhang <wenjia@linux.ibm.com>
cc: Jan Karcher <jaka@linux.ibm.com>
cc: "D. Wythe" <alibuda@linux.alibaba.com>
cc: Tony Lu <tonylu@linux.alibaba.com>
cc: Wen Gu <guwen@linux.alibaba.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-10-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells c336a79983 nvmet-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
copied instead of calling sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Willem de Bruijn <willemb@google.com>
cc: Keith Busch <kbusch@kernel.org>
cc: Jens Axboe <axboe@fb.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Chaitanya Kulkarni <kch@nvidia.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nvme@lists.infradead.org
Link: https://lore.kernel.org/r/20230623225513.2732256-9-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells 7769887817 nvme-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
When transmitting data, call down into TCP using a sendmsg with
MSG_SPLICE_PAGES instead of sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Willem de Bruijn <willemb@google.com>
cc: Keith Busch <kbusch@kernel.org>
cc: Jens Axboe <axboe@fb.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Chaitanya Kulkarni <kch@nvidia.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nvme@lists.infradead.org
Link: https://lore.kernel.org/r/20230623225513.2732256-8-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells a1a5e87527 dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
When transmitting data, call down a layer using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather using
sendpage.  This allows ->sendpage() to be replaced by something that can
handle multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christine Caulfield <ccaulfie@redhat.com>
cc: David Teigland <teigland@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: cluster-devel@redhat.com
Link: https://lore.kernel.org/r/20230623225513.2732256-7-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells 572efade27 rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: rds-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-6-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells fa094ccae1 ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-5-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells 40a8c17aa7 ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-4-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells c729ed6f5b net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()
Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in
skb_send_sock().  This causes pages to be spliced from the source iterator
if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Note that this could perhaps be improved to fill out a bvec array with all
the frags and then make a single sendmsg call, possibly sticking the header
on the front also.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
David Howells f8dd95b29d tcp_bpf, smc, tls, espintcp, siw: Reduce MSG_SENDPAGE_NOTLAST usage
As MSG_SENDPAGE_NOTLAST is being phased out along with sendpage(), don't
use it further in than the sendpage methods, but rather translate it to
MSG_MORE and use that instead.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: Bernard Metzler <bmt@zurich.ibm.com>
cc: Jason Gunthorpe <jgg@ziepe.ca>
cc: Leon Romanovsky <leon@kernel.org>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Sitnicki <jakub@cloudflare.com>
cc: David Ahern <dsahern@kernel.org>
cc: Karsten Graul <kgraul@linux.ibm.com>
cc: Wenjia Zhang <wenjia@linux.ibm.com>
cc: Jan Karcher <jaka@linux.ibm.com>
cc: "D. Wythe" <alibuda@linux.alibaba.com>
cc: Tony Lu <tonylu@linux.alibaba.com>
cc: Wen Gu <guwen@linux.alibaba.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: Steffen Klassert <steffen.klassert@secunet.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20230623225513.2732256-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:50:12 -07:00
Jakub Kicinski b545a13ca9 mlx5-updates-2023-06-21
mlx5 driver minor cleanup and fixes to net-next
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmSV8icACgkQSD+KveBX
 +j6WgAf+Ph5w2dAfuadjlAcQlK2WKXyS1OliVpvCuglepgsP9Zree11WVoyiAKef
 70Xd5Vu4xAQTdpCsw4DGM7sh6xW1SxvZFP1A/FVk7UU1E0zL2SXzKyEHK29I1MH5
 nej2hRf/W1dwYtxfwNOTKAFco3wr0e1vLgMDEqZBZbJXzcUetDRADkgWrx9U+pno
 lBPhVMZWK5R0GzOjlWZXoedaXx2RIcm+5U02ov5S6d1y8AA+sE93tiYxrP9z/2lj
 nml1KeQQl0Ku/y+e8RMSUd9mPdomQZi6CVHMD1wV5DNv6dnrT1bPrUdt4torSXEI
 KAzkVie979XP2jvkDKY2nyXZ8dn7cw==
 =woKn
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-06-21

mlx5 driver minor cleanup and fixes to net-next

* tag 'mlx5-updates-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Remove pointless vport lookup from mlx5_esw_check_port_type()
  net/mlx5: Remove redundant check from mlx5_esw_query_vport_vhca_id()
  net/mlx5: Remove redundant is_mdev_switchdev_mode() check from is_ib_rep_supported()
  net/mlx5: Remove redundant MLX5_ESWITCH_MANAGER() check from is_ib_rep_supported()
  net/mlx5e: E-Switch, Fix shared fdb error flow
  net/mlx5e: Remove redundant comment
  net/mlx5e: E-Switch, Pass other_vport flag if vport is not 0
  net/mlx5e: E-Switch, Use xarray for devcom paired device index
  net/mlx5e: E-Switch, Add peer fdb miss rules for vport manager or ecpf
  net/mlx5e: Use vhca_id for device index in vport rx rules
  net/mlx5: Lag, Remove duplicate code checking lag is supported
  net/mlx5: Fix error code in mlx5_is_reset_now_capable()
  net/mlx5: Fix reserved at offset in hca_cap register
  net/mlx5: Fix SFs kernel documentation error
  net/mlx5: Fix UAF in mlx5_eswitch_cleanup()
====================

Link: https://lore.kernel.org/r/20230623192907.39033-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:48:04 -07:00
Jakub Kicinski 35bf34b078 Merge branch 'netlink-add-display-hint-to-ynl'
Donald Hunter says:

====================
netlink: add display-hint to ynl

Add a display-hint property to the netlink schema, to be used by generic
netlink clients as hints about how to display attribute values.

A display-hint on an attribute definition is intended for letting a
client such as ynl know that, for example, a u32 should be rendered as
an ipv4 address. The display-hint enumeration includes a small number of
networking domain-specific value types.
====================

Link: https://lore.kernel.org/r/20230623201928.14275-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:45:52 -07:00
Donald Hunter 334f39ce17 netlink: specs: add display hints to ovs_flow
Add display hints for mac, ipv4, ipv6, hex and uuid to the ovs_flow
schema.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:45:49 -07:00
Donald Hunter d8eea68d91 tools: ynl: add display-hint support to ynl
Add support to the ynl tool for rendering output based on display-hint
properties.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:45:49 -07:00
Donald Hunter 737eab775d netlink: specs: add display-hint to schema definitions
Add a display-hint property to the netlink schema that is for providing
optional hints to generic netlink clients about how to display attribute
values. A display-hint on an attribute definition is intended for
letting a client such as ynl know that, for example, a u32 should be
rendered as an ipv4 address. The display-hint enumeration includes a
small number of networking domain-specific value types.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:45:49 -07:00
Jakub Kicinski 2ffecf1a42 Core WPAN changes:
* Support for active scans
 * Support for answering BEACON_REQ
 * Specific MLME handling for limited devices
 
 WPAN driver changes:
 * ca8210:
   - Flag the devices as limited
   - Remove stray gpiod_unexport() call
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAmSV2A0ACgkQJWrqGEe9
 VoSFxgf+MH6U+kmOU9rAikG12t+qTVBKzUkxUgCXcNKSb8oo2/5xF4jWpPd0Oa0W
 Ztneu7N/1o8FFTLJiI5l0KkidU2Nq9jBgYJPZPBen1TucrDapyMvQ1UzfA3LF8Vj
 rnBWXiz9oDWAg33KuSGZwgagvYe+UawQTzsvM+HKfai5VMyCzno0oJ8RXAJ/6jgY
 GkB9e3I5nkZUT9wVzeFKQXEnqGRjZRS2WoC/a76fRz0larGoVeHY3IWWCkQYamU6
 vro18m2yiv/DkZaT5L0YQlYJ8LWojkaCgUVmJlP/9ngMO9WXqriVytMJBpcmFIhp
 1us6PVS5ue65C1aVS3uHQSe47WM0SA==
 =sP6V
 -----END PGP SIGNATURE-----

Merge tag 'ieee802154-for-net-next-2023-06-23' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next

Miquel Raynal says:

====================
Core WPAN changes:
 - Support for active scans
 - Support for answering BEACON_REQ
 - Specific MLME handling for limited devices

WPAN driver changes:
 - ca8210:
   - Flag the devices as limited
   - Remove stray gpiod_unexport() call

* tag 'ieee802154-for-net-next-2023-06-23' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next:
  ieee802154: ca8210: Remove stray gpiod_unexport() call
  ieee802154: ca8210: Flag the driver as being limited
  net: ieee802154: Handle limited devices with only datagram support
  mac802154: Handle received BEACON_REQ
  ieee802154: Add support for allowing to answer BEACON_REQ
  mac802154: Handle active scanning
  ieee802154: Add support for user active scan requests
====================

Link: https://lore.kernel.org/r/20230623195506.40b87b5f@xps-13
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:41:46 -07:00
Jakub Kicinski 14fd5e0d48 Merge branch 'selftests-mptcp-refactoring-and-minor-fixes'
Mat Martineau says:

====================
selftests: mptcp: Refactoring and minor fixes

Patch 1 moves code around for clarity and improved code reuse.

Patch 2 makes use of new MPTCP info that consolidates MPTCP-level and
subflow-level information.

Patches 3-7 refactor code to favor limited-scope environment vars over
optional parameters.

Patch 8: typo fix
====================

Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-0-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:38:02 -07:00
Yueh-Shun Li e6b8a78ea2 selftests: mptcp: connect: fix comment typo
Spell "transmissions" properly.

Found by searching for keyword "tranm".

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Yueh-Shun Li <shamrocklee@posteo.net>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-8-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang 9e9d176df8 selftests: mptcp: add pm_nl_set_endpoint helper
This patch moves endpoint settings out of do_transfer() into a new
helper pm_nl_set_endpoint(). And invoke this helper in do_transfer().
This makes the code much more clearer.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-7-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang 1534f87ee0 selftests: mptcp: drop sflags parameter
run_tests() accepts too many optional parameters. Before this modification,
it was required to set all of then when only the last one had to be
changed. That's not clear to see all these 0 and it makes the maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "sflags=*" instead of positional parameter
sflags of do_transfer() and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-6-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang 595ef566a2 selftests: mptcp: drop addr_nr_ns1/2 parameters
run_tests() accepts too many optional parameters. Before this modification,
it was required to set all of then when only the last one had to be
changed. That's not clear to see all these 0 and it makes the maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "addr_nr_ns1=*, addr_nr_ns2=*" instead
of positional parameters addr_nr_ns1 and addr_nr_ns2 of do_transfer()
and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-5-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang 0c93af1f89 selftests: mptcp: drop test_linkfail parameter
run_tests() accepts too many optional parameters. Before this modification,
it was required to set all of then when only the last one had to be
changed. That's not clear to see all these 0 and it makes the maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "test_linkfail=*" instead of positional
parameter test_linkfail of do_transfer() and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-4-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang be7e9786c9 selftests: mptcp: set FAILING_LINKS in run_tests
Set FAILING_LINKS as an env var with a limited scope only when calling
run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-3-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang d7ced753aa selftests: mptcp: check subflow and addr infos
New MPTCP info are being checked in multiple places to improve the code
coverage when using the userspace PM.

This patch makes chk_mptcp_info() more generic to be able to check
subflows, add_addr_signal and add_addr_accepted info (and even more
later). New arguments are now required to get different infos from the
two namespaces because some counters are specific to the client or the
server.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-2-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Geliang Tang 4369c198e5 selftests: mptcp: test userspace pm out of transfer
This patch moves userspace pm tests out of do_transfer(). Move add address
test into a new function userspace_pm_add_addr(), and remove address test
into userspace_pm_rm_sf_addr_ns1(). Move add subflow test into
userspace_pm_add_sf() and remove subflow into
userspace_pm_rm_sf_addr_ns2().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-1-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:37:57 -07:00
Jakub Kicinski c4015bbee9 Merge branch 'net-stmmac-introduce-devres-helpers-for-stmmac-platform-drivers'
Bartosz Golaszewski says:

====================
net: stmmac: introduce devres helpers for stmmac platform drivers

The goal of this series is two-fold: to make the API for stmmac platforms more
logically correct (by providing functions that acquire resources with release
counterparts that undo only their actions and nothing more) and to provide
devres variants of commonly use registration functions that allows to
significantly simplify the platform drivers.

The current pattern for stmmac platform drivers is to call
stmmac_probe_config_dt(), possibly the platform's init() callback and then
call stmmac_drv_probe(). The resources allocated by these calls will then
be released by calling stmmac_pltfr_remove(). This goes against the commonly
accepted way of providing each function that allocated a resource with a
function that frees it.

First: provide wrappers around platform's init() and exit() callbacks that
allow users to skip checking if the callbacks exist manually.

Second: provide stmmac_pltfr_probe() which calls the platform init() callback
and then calls stmmac_drv_probe() together with a variant of
stmmac_pltfr_remove() that DOES NOT call stmmac_remove_config_dt(). For now
this variant is called stmmac_pltfr_remove_no_dt() but once all users of
the old stmmac_pltfr_remove() are converted to the devres helper, it will be
renamed back to stmmac_pltfr_remove() and the no_dt function removed.

Finally use the devres helpers in dwmac-qco-ethqos to show how much simplier
the driver's probe() becomes.

This series obviously just starts the conversion process and other platform
drivers will need to be converted once the helpers land in net/.
====================

Link: https://lore.kernel.org/r/20230623100417.93592-1-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:36:06 -07:00
Bartosz Golaszewski 4194f32a4b net: stmmac: dwmac-qcom-ethqos: use devm_stmmac_pltfr_probe()
Use the devres variant of stmmac_pltfr_probe() and finally drop the
remove() callback entirely.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-12-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:36:00 -07:00
Bartosz Golaszewski fc9ee2ac4f net: stmmac: platform: provide devm_stmmac_pltfr_probe()
Provide a devres variant of stmmac_pltfr_probe() which allows users to
skip calling stmmac_pltfr_remove() at driver detach.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-11-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:36:00 -07:00
Bartosz Golaszewski 061425d933 net: stmmac: dwmac-qco-ethqos: use devm_stmmac_probe_config_dt()
Significantly simplify the driver's probe() function by using the devres
variant of stmmac_probe_config_dt(). This allows to drop the goto jumps
entirely.

The remove_new() callback now needs to be switched to
stmmac_pltfr_remove_no_dt().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-10-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:36:00 -07:00
Bartosz Golaszewski d740654273 net: stmmac: platform: provide devm_stmmac_probe_config_dt()
Provide a devres variant of stmmac_probe_config_dt() that allows users to
skip calling stmmac_remove_config_dt() at driver detach.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-9-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:36:00 -07:00
Bartosz Golaszewski 1be0c9d65e net: stmmac: platform: provide stmmac_pltfr_remove_no_dt()
Add a variant of stmmac_pltfr_remove() that only frees resources
allocated by stmmac_pltfr_probe() and - unlike stmmac_pltfr_remove() -
does not call stmmac_remove_config_dt().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-8-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:35:59 -07:00
Bartosz Golaszewski 0a68a59493 net: stmmac: dwmac-generic: use stmmac_pltfr_probe()
Shrink the code and remove labels by using the new stmmac_pltfr_probe()
function.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-7-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 15:35:59 -07:00