linux-stable/net/core
Toke Høiland-Jørgensen 8f22873582 core: Don't skip generic XDP program execution for cloned SKBs
[ Upstream commit ad1e03b2b3 ]

The current generic XDP handler skips execution of XDP programs entirely if
an SKB is marked as cloned. This leads to some surprising behaviour, as
packets can end up being cloned in various ways, which will make an XDP
program not see all the traffic on an interface.

This was discovered by a simple test case where an XDP program that always
returns XDP_DROP is installed on a veth device. When combining this with
the Scapy packet sniffer (which uses an AF_PACKET socket) on the sending
side, SKBs reliably end up in the cloned state, causing them to be passed
through to the receiving interface instead of being dropped. A minimal
reproducer script for this is included below.

This patch fixes the issue by simply triggering the existing linearisation
code for cloned SKBs instead of skipping the XDP program execution. This
behaviour is in line with the behaviour of the native XDP implementation
for the veth driver, which will reallocate and copy the SKB data if the SKB
is marked as shared.
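
The change in control flow described above can be illustrated with a toy
model. This is only a sketch under stated assumptions, not the kernel code:
ToySkb, make_private_linear, generic_xdp_old and generic_xdp_new are
hypothetical names invented for this illustration, and details such as
headroom checks are ignored. It shows only that the old flow bypasses the
program for cloned SKBs, while the new flow copies first and then runs it.

```python
from dataclasses import dataclass

XDP_PASS, XDP_DROP = "XDP_PASS", "XDP_DROP"


@dataclass
class ToySkb:
    """Hypothetical stand-in for an SKB; only the flags relevant here."""
    cloned: bool = False
    nonlinear: bool = False
    copies_made: int = 0


def make_private_linear(skb):
    # Stand-in for the reallocate-and-linearise step: afterwards the
    # buffer is private (not cloned) and linear.
    skb.cloned = False
    skb.nonlinear = False
    skb.copies_made += 1


def generic_xdp_old(skb, prog):
    # Old behaviour: a cloned SKB bypasses the XDP program entirely.
    if skb.cloned:
        return XDP_PASS
    if skb.nonlinear:
        make_private_linear(skb)
    return prog(skb)


def generic_xdp_new(skb, prog):
    # New behaviour: a cloned SKB takes the same copy path as a
    # non-linear one, and the program then always runs.
    if skb.cloned or skb.nonlinear:
        make_private_linear(skb)
    return prog(skb)


def always_drop(skb):
    # Toy equivalent of an XDP program that returns XDP_DROP.
    return XDP_DROP


if __name__ == "__main__":
    print(generic_xdp_old(ToySkb(cloned=True), always_drop))  # XDP_PASS
    print(generic_xdp_new(ToySkb(cloned=True), always_drop))  # XDP_DROP
```

In this model the cost of the new behaviour is one extra copy per cloned
SKB, in exchange for the program seeing every packet.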

Reproducer Python script (requires BCC and Scapy):

from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP
from bcc import BPF
import time, sys, subprocess, shlex

SKB_MODE = (1 << 1)
DRV_MODE = (1 << 2)
PYTHON=sys.executable

def client():
    time.sleep(2)
    # Sniffing on the sender causes skb_cloned() to be set
    s = AsyncSniffer()
    s.start()

    for p in range(10):
        sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"),
              verbose=False)
        time.sleep(0.1)

    s.stop()
    return 0

def server(mode):
    prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}")
    func = prog.load_func("dummy_drop", BPF.XDP)
    prog.attach_xdp("a_to_b", func, mode)

    time.sleep(1)

    s = sniff(iface="a_to_b", count=10, timeout=15)
    if len(s):
        print(f"Got {len(s)} packets - should have gotten 0")
        return 1
    else:
        print("Got no packets - as expected")
        return 0

if len(sys.argv) < 2:
    print(f"Usage: {sys.argv[0]} <skb|drv>")
    sys.exit(1)

if sys.argv[1] == "client":
    sys.exit(client())
elif sys.argv[1] == "server":
    mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE
    sys.exit(server(mode))
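else:
    # Validate the mode before entering the try block, so that a usage
    # error does not trigger the namespace cleanup in the finally clause.
    mode = sys.argv[1]
    if mode not in ('skb', 'drv'):
        print(f"Usage: {sys.argv[0]} <skb|drv>")
        sys.exit(1)
    print(f"Running in {mode} mode")

    try: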

        for cmd in [
                'ip netns add netns_a',
                'ip netns add netns_b',
                'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b',
                # Disable ipv6 to make sure there's no address autoconf traffic
                'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1',
                'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1',
                'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa',
                'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc',
                'ip -n netns_a link set dev a_to_b up',
                'ip -n netns_b link set dev b_to_a up']:
            subprocess.check_call(shlex.split(cmd))

        server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}"))
        client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client"))

        client.wait()
        server.wait()
        sys.exit(server.returncode)

    finally:
        subprocess.run(shlex.split("ip netns delete netns_a"))
        subprocess.run(shlex.split("ip netns delete netns_b"))

Fixes: d445516966 ("net: xdp: support xdp generic on virtual devices")
Reported-by: Stepan Horacek <shoracek@redhat.com>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24 08:36:21 +01:00
bpf_sk_storage.c bpf: Improve bucket_log calculation logic 2020-02-14 16:34:10 -05:00
datagram.c net: add READ_ONCE() annotation in __skb_wait_for_more_packets() 2019-10-28 13:33:41 -07:00
datagram.h net/core: Allow the compiler to verify declaration and definition consistency 2019-03-27 13:49:44 -07:00
dev.c core: Don't skip generic XDP program execution for cloned SKBs 2020-02-24 08:36:21 +01:00
dev_addr_lists.c net: remove unnecessary variables and callback 2019-10-24 14:53:49 -07:00
dev_ioctl.c net/core: Document all dev_ioctl() arguments 2019-03-27 13:49:43 -07:00
devlink.c devlink: report 0 after hitting end in region read 2020-02-11 04:35:48 -08:00
drop_monitor.c drop_monitor: Do not cancel uninitialized work item 2020-02-11 04:35:51 -08:00
dst.c net: print proper warning on dst underflow 2019-09-26 09:05:56 +02:00
dst_cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
ethtool.c net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol() 2019-10-26 11:20:10 -07:00
failover.c failover: allow name change on IFF_UP slave interfaces 2019-04-10 22:12:26 -07:00
fib_notifier.c net: fib_notifier: move fib_notifier_ops from struct net into per-net struct 2019-09-07 17:28:22 +02:00
fib_rules.c SPDX update for 5.2-rc4 2019-06-08 12:52:42 -07:00
filter.c net: bpf: Don't leak time wait and request sockets 2020-01-23 08:22:49 +01:00
flow_dissector.c flow_dissector: Fix to use new variables for port ranges in bpf hook 2020-02-05 21:22:52 +00:00
flow_offload.c net: core: rename indirect block ingress cb function 2019-12-18 16:08:47 +01:00
gen_estimator.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
gen_stats.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
gro_cells.c gro_cells: make sure device is up in gro_cells_receive() 2019-03-10 11:07:14 -07:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c net: link_watch: prevent starvation when processing linkwatch wq 2019-07-01 19:02:47 -07:00
lwt_bpf.c net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-18 16:08:42 +01:00
lwtunnel.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
Makefile bpf: Introduce bpf sk local storage 2019-04-27 09:07:04 -07:00
neighbour.c net: neigh: use long type to store jiffies delta 2020-01-26 10:01:06 +01:00
net-procfs.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
net-sysfs.c net-sysfs: Call dev_hold always in netdev_queue_add_kobject 2020-01-26 10:01:09 +01:00
net-sysfs.h
net-traces.c page_pool: add tracepoints for page_pool with details need by XDP 2019-06-19 11:23:13 -04:00
net_namespace.c netns: fix GFP flags in rtnl_net_notifyid() 2019-10-25 20:14:42 -07:00
netclassid_cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netpoll.c net: fix skb use after free in netpoll 2019-08-27 20:52:02 -07:00
netprio_cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
page_pool.c page_pool: do not release pool until inflight == 0. 2019-12-18 16:09:07 +01:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2019-09-18 12:34:53 -07:00
ptp_classifier.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() 2020-01-29 16:45:21 +01:00
scm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
secure_seq.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
skbuff.c net: Fixed updating of ethertype in skb_mpls_push() 2019-12-18 16:08:56 +01:00
skmsg.c net, sk_msg: Don't check if sock is locked when tearing down psock 2020-01-29 16:45:31 +01:00
sock.c net: annotate lockless accesses to sk->sk_pacing_shift 2020-01-09 10:20:07 +01:00
sock_diag.c sock: make cookie generation global instead of per netns 2019-08-09 13:14:46 -07:00
sock_map.c bpf, sockmap: Check update requirements after locking 2020-02-14 16:34:10 -05:00
sock_reuseport.c udp: correct reuseport selection with connected sockets 2019-09-16 09:02:18 +02:00
stream.c tcp: make sure EPOLLOUT wont be missed 2019-08-19 13:07:43 -07:00
sysctl_net_core.c net, sysctl: Fix compiler warning when only cBPF is present 2020-01-09 10:20:03 +01:00
timestamping.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
tso.c net: Use skb accessors in network core 2019-07-22 20:47:56 -07:00
utils.c net: Fix skb->csum update in inet_proto_csum_replace16(). 2020-02-05 21:22:52 +00:00
xdp.c xdp: obtain the mem_id mutex before trying to remove an entry. 2019-12-18 16:09:10 +01:00