linux-stable/net/ipv4
Gilad Naaman acb97d4b2d net: Set true network header for ECN decapsulation
[ Upstream commit 227adfb2b1 ]

In cases where the header straight after the tunnel header was
another ethernet header (TEB), instead of the network header,
the ECN decapsulation code would treat the ethernet header as if
it was an IP header, resulting in mishandling and possible
wrong drops or corruption of the IP header.

In this case, ECT(1) is sent, so IP_ECN_decapsulate tries to copy it to the
inner IPv4 header, and correct its checksum.

The offset of the ECT bits in an IPv4 header corresponds to the
lower 2 bits of the second octet of the destination MAC address
in the ethernet header.
The IPv4 checksum corresponds to end of the source address.

In order to reproduce:

    $ ip netns add A
    $ ip netns add B
    $ ip -n A link add _v0 type veth peer name _v1 netns B
    $ ip -n A link set _v0 up
    $ ip -n A addr add dev _v0 10.254.3.1/24
    $ ip -n A route add default dev _v0 scope global
    $ ip -n B link set _v1 up
    $ ip -n B addr add dev _v1 10.254.1.6/24
    $ ip -n B route add default dev _v1 scope global
    $ ip -n B link add gre1 type gretap local 10.254.1.6 remote 10.254.3.1 key 0x49000000
    $ ip -n B link set gre1 up

    # Now send an IPv4/GRE/Eth/IPv4 frame where the outer header has ECT(1),
    # and the inner header has no ECT bits set:

    $ cat send_pkt.py
        #!/usr/bin/env python3
        from scapy.all import *

        pkt = IP(b'E\x01\x00\xa7\x00\x00\x00\x00@/`%\n\xfe\x03\x01\n\xfe\x01\x06 \x00eXI\x00'
                 b'\x00\x00\x18\xbe\x92\xa0\xee&\x18\xb0\x92\xa0l&\x08\x00E\x00\x00}\x8b\x85'
                 b'@\x00\x01\x01\xe4\xf2\x82\x82\x82\x01\x82\x82\x82\x02\x08\x00d\x11\xa6\xeb'
                 b'3\x1e\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12'
                 b'\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234'
                 b'56789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ')

        send(pkt)
    $ sudo ip netns exec B tcpdump -neqlllvi gre1 icmp & ; sleep 1
    $ sudo ip netns exec A python3 send_pkt.py

In the original packet, the source/destinatio MAC addresses are
dst=18:be:92:a0:ee:26 src=18:b0:92:a0:6c:26

In the received packet, they are
dst=18:bd:92:a0:ee:26 src=18:b0:92:a0:6c:27

Thanks to Lahav Schlesinger <lschlesinger@drivenets.com> and Isaac Garzon <isaac@speed.io>
for helping me pinpoint the origin.

Fixes: b723748750 ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
Cc: David S. Miller <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-04 12:27:39 +02:00
..
bpfilter SPDX update for 5.2-rc2, round 1 2019-05-21 12:33:38 -07:00
netfilter netfilter: arp_tables: add pre_exit hook for table unregister 2021-04-21 12:56:16 +02:00
af_inet.c net: remove empty inet_exit_net 2019-08-19 18:22:54 -07:00
ah4.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
arp.c Exempt multicast addresses from five-second neighbor lifetime 2020-11-24 13:28:56 +01:00
cipso_ipv4.c net: ipv4: fix memory leak in netlbl_cipsov4_add_std 2021-06-23 14:41:24 +02:00
datagram.c inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
devinet.c net: ipv4: Remove unneed BUG() function 2021-06-30 08:47:46 -04:00
esp4.c xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-07-14 16:53:26 +02:00
esp4_offload.c esp: delete NETIF_F_SCTP_CRC bit from features for esp offload 2021-04-14 08:24:13 +02:00
fib_frontend.c net/ipv4: swap flow ports when validating source 2021-07-14 16:53:31 +02:00
fib_lookup.h ipv4: Use accessors for fib_info nexthop data 2019-06-04 19:26:49 -07:00
fib_notifier.c
fib_rules.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
fib_semantics.c net: Fix the arp error in some cases 2020-06-30 15:36:44 -04:00
fib_trie.c ipv4: Silence suspicious RCU usage warning 2020-09-12 14:18:54 +02:00
fou.c fou: Fix IPv6 netlink policy 2020-01-29 16:45:22 +01:00
gre_demux.c erspan: fix version 1 check in gre_parse_header() 2021-01-12 20:16:15 +01:00
gre_offload.c net: gre: recompute gre csum for sctp over gre tunnels 2020-08-11 15:33:40 +02:00
icmp.c icmp: don't send out ICMP messages with a source address of 0.0.0.0 2021-06-23 14:41:27 +02:00
igmp.c net: ipv4: fix memory leak in ip_mc_add1_src 2021-06-23 14:41:26 +02:00
inet_connection_sock.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:35:26 +02:00
inet_diag.c inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill() 2020-11-24 13:28:56 +01:00
inet_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
inet_hashtables.c net: initialize fastreuse on inet_inherit_port 2020-08-19 08:16:23 +02:00
inet_timewait_sock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
inetpeer.c inetpeer: fix data-race in inet_putpeer / inet_putpeer 2020-01-04 19:18:39 +01:00
ip_forward.c ipv4: Revert removal of rt_uses_gateway 2019-09-20 18:23:33 -07:00
ip_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
ip_gre.c ip_gre: set dev->hard_header_len and dev->needed_headroom properly 2020-10-29 09:58:04 +01:00
ip_input.c netfilter: drop bridge nf reset from nf_reset 2019-10-01 18:42:15 +02:00
ip_options.c netfilter: nf_tables: add support for matching IPv4 options 2019-06-21 18:35:51 +02:00
ip_output.c net: ip: avoid OOM kills with large UDP sends over loopback 2021-07-19 08:53:13 +02:00
ip_sockglue.c ip_sockglue: Fix missing-check bug in ip_ra_control() 2019-05-25 11:00:50 -07:00
ip_tunnel.c net: Set true network header for ECN decapsulation 2021-08-04 12:27:39 +02:00
ip_tunnel_core.c ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL 2019-06-18 20:48:45 -04:00
ip_vti.c ip_vti: receive ipip packet by calling ip_tunnel_rcv 2020-06-03 08:21:34 +02:00
ipcomp.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2019-07-05 15:01:15 -07:00
ipconfig.c net: ipconfig: Don't override command-line hostnames or domains 2021-06-18 09:58:59 +02:00
ipip.c net: ipip: fix wrong address family in init error path 2020-06-03 08:20:52 +02:00
ipmr.c net: don't return invalid table id error when we fall back to PF_UNSPEC 2020-06-03 08:20:41 +02:00
ipmr_base.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-07 17:22:09 -07:00
Kconfig vti[6]: fix packet tx through bpf_redirect() in XinY cases 2020-04-01 11:02:05 +02:00
Makefile net: Initial nexthop code 2019-05-28 21:37:30 -07:00
metrics.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-11-18 19:20:17 +01:00
netlink.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
nexthop.c nexthop: Do not flush blackhole nexthops when loopback goes down 2021-03-17 17:03:35 +01:00
ping.c ping: Check return value of function 'ping_queue_rcv_skb' 2021-06-30 08:47:47 -04:00
proc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 20:20:36 -07:00
protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
raw.c netfilter: drop bridge nf reset from nf_reset 2019-10-01 18:42:15 +02:00
raw_diag.c inet_diag: return classid for all socket types 2020-03-18 07:17:38 +01:00
route.c net: lwtunnel: handle MTU calculation in forwading 2021-07-14 16:53:35 +02:00
syncookies.c net: Update window_clamp if SOCK_RCVBUF is set 2020-11-18 19:20:32 +01:00
sysctl_net_ipv4.c tcp: correct read of TFO keys on big endian systems 2020-08-19 08:16:23 +02:00
tcp.c tcp: add sanity tests to TCP_QUEUE_SEQ 2021-03-17 17:03:32 +01:00
tcp_bbr.c tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate 2020-11-24 13:28:59 +01:00
tcp_bic.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_bpf.c bpf, sockmap, tcp: sk_prot needs inuse_idx set for proc stats 2021-07-28 13:30:55 +02:00
tcp_cdg.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_cong.c net: Only allow init netns to set default tcp cong to a restricted algo 2021-05-14 09:44:33 +02:00
tcp_cubic.c tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT 2020-06-30 15:36:47 -04:00
tcp_dctcp.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_dctcp.h
tcp_diag.c tcp: annotate tp->write_seq lockless reads 2019-10-13 10:13:08 -07:00
tcp_fastopen.c net/tcp_fastopen: fix data races around tfo_active_disable_stamp 2021-07-28 13:30:57 +02:00
tcp_highspeed.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_htcp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_hybla.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_illinois.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_input.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-07 15:35:47 +01:00
tcp_ipv4.c tcp: annotate data races around tp->mtu_info 2021-07-25 14:35:15 +02:00
tcp_lp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_metrics.c tcp: refactor setting the initial congestion window 2019-05-01 11:47:54 -04:00
tcp_minisocks.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:35:26 +02:00
tcp_nv.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_offload.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_output.c ipv6: tcp: drop silly ICMPv6 packet too big messages 2021-07-25 14:35:15 +02:00
tcp_rate.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_recovery.c tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN 2021-02-03 23:26:02 +01:00
tcp_scalable.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_timer.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-07 15:35:47 +01:00
tcp_ulp.c bpf: Sockmap/tls, push write_space updates through ulp updates 2020-01-23 08:22:45 +01:00
tcp_vegas.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_vegas.h
tcp_veno.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_westwood.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_yeah.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tunnel4.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udp.c udp: annotate data races around unix_sk(sk)->gso_size 2021-07-25 14:35:15 +02:00
udp_diag.c inet_diag: return classid for all socket types 2020-03-18 07:17:38 +01:00
udp_impl.h udp: add missing rehash callback to udplite 2019-01-17 15:01:08 -08:00
udp_offload.c net: Fix gro aggregation for udp encaps with zero csum 2021-03-17 17:03:30 +01:00
udp_tunnel.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udplite.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
xfrm4_input.c
xfrm4_output.c xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish 2020-04-29 16:33:11 +02:00
xfrm4_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:18:58 +01:00
xfrm4_protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
xfrm4_state.c xfrm: remove eth_proto value from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
xfrm4_tunnel.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00