linux-stable/net
Florian Westphal df5042f4c5 sk_buff: add skb extension infrastructure
This adds an optional extension infrastructure, with ispec (xfrm) and
bridge netfilter as first users.
objdiff shows no changes if kernel is built without xfrm and br_netfilter
support.

The third (planned future) user is Multipath TCP which is still
out-of-tree.
MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
numbers used by individual subflows.

This DSS mapping is read/written from tcp option space on receive and
written to tcp option space on transmitted tcp packets that are part of
and MPTCP connection.

Extending skb_shared_info or adding a private data field to skb fclones
doesn't work for incoming skb, so a different DSS propagation method would
be required for the receive side.

mptcp has same requirements as secpath/bridge netfilter:

1. extension memory is released when the sk_buff is free'd.
2. data is shared after cloning an skb (clone inherits extension)
3. adding extension to an skb will COW the extension buffer if needed.

The "MPTCP upstreaming" effort adds SKB_EXT_MPTCP extension to store the
mapping for tx and rx processing.

Two new members are added to sk_buff:
1. 'active_extensions' byte (filling a hole), telling which extensions
   are available for this skb.
   This has two purposes.
   a) avoids the need to initialize the pointer.
   b) allows to "delete" an extension by clearing its bit
   value in ->active_extensions.

   While it would be possible to store the active_extensions byte
   in the extension struct instead of sk_buff, there is one problem
   with this:
    When an extension has to be disabled, we can always clear the
    bit in skb->active_extensions.  But in case it would be stored in the
    extension buffer itself, we might have to COW it first, if
    we are dealing with a cloned skb.  On kmalloc failure we would
    be unable to turn an extension off.

2. extension pointer, located at the end of the sk_buff.
   If the active_extensions byte is 0, the pointer is undefined,
   it is not initialized on skb allocation.

This adds extra code to skb clone and free paths (to deal with
refcount/free of extension area) but this replaces similar code that
manages skb->nf_bridge and skb->sp structs in the followup patches of
the series.

It is possible to add support for extensions that are not preseved on
clones/copies.

To do this, it would be needed to define a bitmask of all extensions that
need copy/cow semantics, and change __skb_ext_copy() to check
->active_extensions & SKB_EXT_PRESERVE_ON_CLONE, then just set
->active_extensions to 0 on the new clone.

This isn't done here because all extensions that get added here
need the copy/cow semantics.

v2:
Allocate entire extension space using kmem_cache.
Upside is that this allows better tracking of used memory,
downside is that we will allocate more space than strictly needed in
most cases (its unlikely that all extensions are active/needed at same
time for same skb).
The allocated memory (except the small extension header) is not cleared,
so no additonal overhead aside from memory usage.

Avoid atomic_dec_and_test operation on skb_ext_put()
by using similar trick as kfree_skbmem() does with fclone_ref:
If recount is 1, there is no concurrent user and we can free right away.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
..
6lowpan 6lowpan: convert to DEFINE_SHOW_ATTRIBUTE 2018-12-19 00:28:05 +01:00
9p Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-03 10:35:52 -07:00
802
8021q net: core: dev: Add extack argument to dev_change_flags() 2018-12-06 13:26:07 -08:00
appletalk
atm Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
ax25
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-11-19 10:55:00 -08:00
bluetooth Bluetooth: Fix unnecessary error message for HCI request completion 2018-12-19 14:37:03 +01:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-12-10 18:00:43 -08:00
bpfilter net: bpfilter: Set user mode helper's command line 2018-10-22 19:37:36 -07:00
bridge netfilter: avoid using skb->nf_bridge directly 2018-12-19 11:21:37 -08:00
caif Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
can can: raw: check for CAN FD capable netdev in raw_sendmsg() 2018-11-09 17:19:34 +01:00
ceph libceph: fall back to sendmsg for slab pages 2018-11-19 17:59:47 +01:00
core sk_buff: add skb extension infrastructure 2018-12-19 11:21:37 -08:00
dcb
dccp net: dccp: initialize (addr,port) listening hashtable 2018-12-17 23:11:48 -08:00
decnet net/decnet: add missing indentation 2018-11-16 19:42:49 -08:00
dns_resolver dns: Allow the dns resolver to retrieve a server set 2018-10-04 09:40:52 -07:00
dsa net: dsa: ksz: Add STP multicast handling 2018-12-16 14:23:33 -08:00
ethernet net: ethernet: provide nvmem_get_mac_address() 2018-12-03 15:40:30 -08:00
hsr
ieee802154 net: dev: Add extack argument to dev_set_mac_address() 2018-12-13 18:41:38 -08:00
ife
ipv4 sk_buff: add skb extension infrastructure 2018-12-19 11:21:37 -08:00
ipv6 sk_buff: add skb extension infrastructure 2018-12-19 11:21:37 -08:00
iucv iucv: Remove SKB list assumptions. 2018-11-10 16:55:11 -08:00
kcm Revert "kcm: remove any offset before parsing messages" 2018-09-17 18:43:42 -07:00
key af_key: fix indentation on declaration statement 2018-11-15 18:09:32 +01:00
l2tp l2tp: Add protocol field decompression 2018-12-15 23:28:19 -08:00
l3mdev l3mdev: add function to retreive upper master 2018-12-03 14:15:26 -08:00
lapb
llc llc: do not use sk_eat_skb() 2018-10-22 19:59:20 -07:00
mac80211 This time we have too many changes to list, highlights: 2018-12-19 08:36:18 -08:00
mac802154 mac802154: Remove VLA usage of skcipher 2018-09-28 12:46:07 +08:00
mpls net/mpls: Handle kernel side filtering of route dumps 2018-10-16 00:14:07 -07:00
ncsi net/ncsi: Add NCSI Mellanox OEM command 2018-11-27 16:37:20 -08:00
netfilter netfilter: avoid using skb->nf_bridge directly 2018-12-19 11:21:37 -08:00
netlabel netlabel: check for IPV4MASK in addrinfo_get 2018-09-21 18:58:34 -07:00
netlink netlink: Add answer_flags to netlink_callback 2018-10-16 00:13:12 -07:00
netrom
nfc Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-09 21:43:31 -08:00
packet packet: copy user buffers before orphan or clone 2018-11-23 11:08:03 -08:00
phonet
psample
qrtr
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-12 21:38:46 -07:00
rfkill rfkill: gpio: Remove unused include 2018-12-18 13:13:56 +01:00
rose
rxrpc rxrpc: Fix life check 2018-11-15 11:35:40 -08:00
sched net: sched: simplify the qdisc_leaf code 2018-12-15 11:37:32 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-09 21:43:31 -08:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-11-24 17:01:43 -08:00
strparser bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-09 21:43:31 -08:00
switchdev net: switchdev: Add extack to switchdev_handle_port_obj_add() callback 2018-12-12 16:34:22 -08:00
tipc tipc: handle broadcast NAME_DISTRIBUTOR packet when receiving it 2018-12-18 21:50:48 -08:00
tls bpf: helper to pop data from messages 2018-11-28 22:07:57 +01:00
unix Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
vmw_vsock
wimax
wireless This time we have too many changes to list, highlights: 2018-12-19 08:36:18 -08:00
x25 net/x25: handle call collisions 2018-11-29 14:25:36 -08:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-19 11:03:06 -07:00
xfrm xfrm: policy: fix policy hash rebuild 2018-11-28 07:05:48 +01:00
compat.c y2038: socket: Change recvmmsg to use __kernel_timespec 2018-08-29 15:42:24 +02:00
Kconfig sk_buff: add skb extension infrastructure 2018-12-19 11:21:37 -08:00
Makefile
socket.c socket: do a generic_file_splice_read when proto_ops has no splice_read 2018-11-17 21:34:11 -08:00
sysctl_net.c