linux-stable/net
Florian Westphal 843c2fdf7a net: dctcp: loosen requirement to assert ECT(0) during 3WHS
One deployment requirement of DCTCP is to be able to run
in a DC setting along with TCP traffic. As Glenn Judd's
NSDI'15 paper "Attaining the Promise and Avoiding the Pitfalls
of TCP in the Datacenter" [1] (tba) explains, one way to
solve this on switch side is to split DCTCP and TCP traffic
in two queues per switch port based on the DSCP: one queue
soley intended for DCTCP traffic and one for non-DCTCP traffic.

For the DCTCP queue, there's the marking threshold K as
explained in commit e3118e8359 ("net: tcp: add DCTCP congestion
control algorithm") for RED marking ECT(0) packets with CE.
For the non-DCTCP queue, there's f.e. a classic tail drop queue.
As already explained in e3118e8359, running DCTCP at scale
when not marking SYN/SYN-ACK packets with ECT(0) has severe
consequences as for non-ECT(0) packets, traversing the RED
marking DCTCP queue will result in a severe reduction of
connection probability.

This is due to the DCTCP queue being dominated by ECT(0) traffic
and switches handle non-ECT traffic in the RED marking queue
after passing K as drops, where K is usually a low watermark
in order to leave enough tailroom for bursts. Splitting DCTCP
traffic among several queues (ECN and non-ECN queue) is being
considered a terrible idea in the network community as it
splits single flows across multiple network paths.

Therefore, commit e3118e8359 implements this on Linux as
ECT(0) marked traffic, as we argue that marking all packets
of a DCTCP flow is the only viable solution and also doesn't
speak against the draft.

However, recently, a DCTCP implementation for FreeBSD hit also
their mainline kernel [2]. In order to let them play well
together with Linux' DCTCP, we would need to loosen the
requirement that ECT(0) has to be asserted during the 3WHS as
not implemented in FreeBSD. This simplifies the ECN test and
lets DCTCP work together with FreeBSD.

Joint work with Daniel Borkmann.

  [1] https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/judd
  [2] 8ad8794452

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:48:55 -08:00
..
6lowpan net/6lowpan: Remove FSF address from GPL statement. 2014-12-05 12:43:04 +01:00
9p
802
8021q vlan: advertise link netns via netlink 2015-01-23 17:51:15 -08:00
appletalk new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
atm put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
ax25 new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
batman-adv batman-adv: Kconfig, Add missing DEBUG_FS dependency 2015-01-07 22:17:11 +01:00
bluetooth Bluetooth: Remove unused function 2015-01-16 13:06:38 +02:00
bridge bridge: offload bridge port attributes to switch asic if feature flag set 2015-02-01 23:16:34 -08:00
caif put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
can netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
ceph libceph: fix sparse endianness warnings 2015-01-08 20:36:57 +03:00
core net-timestamp: no-payload only sysctl 2015-02-02 18:46:51 -08:00
dcb dcbnl : Disable software interrupts before taking dcb_lock 2014-11-16 14:50:52 -05:00
dccp net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
decnet netlink: Fix bugs in nlmsg_end() conversions. 2015-01-18 23:36:08 -05:00
dns_resolver
dsa net: dsa: set slave MII bus PHY mask 2015-01-25 16:00:54 -08:00
ethernet net: Add Transparent Ethernet Bridging GRO support. 2015-01-02 15:46:41 -05:00
hsr
ieee802154 netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
ipv4 net: dctcp: loosen requirement to assert ECT(0) during 3WHS 2015-02-02 18:48:55 -08:00
ipv6 net-timestamp: no-payload option 2015-02-02 18:46:51 -08:00
ipx switch ipxrtr_route_packet() from iovec to msghdr 2014-11-24 04:28:49 -05:00
irda irda: use msecs_to_jiffies for conversions 2015-01-30 18:08:25 -08:00
iucv net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
key new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
l2tp netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
lapb
llc net: llc: use correct size for sysctl timeout entries 2015-01-25 00:23:21 -08:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-01-27 16:59:56 -08:00
mac802154 mac802154: fix kbuild test robot warning 2015-01-03 01:51:51 +01:00
mpls net: mark some potential candidates __read_mostly 2015-01-30 17:58:39 -08:00
netfilter netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
netlabel netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
netlink net: remove sock_iocb 2015-01-28 23:15:07 -08:00
netrom new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
nfc NFC: hci: Remove nfc_hci_pipe2gate function 2015-01-28 00:03:36 +01:00
openvswitch openvswitch: Add support for checksums on UDP tunnels. 2015-01-28 23:04:15 -08:00
packet netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
phonet phonet netlink: allow multiple messages per skb in route dump 2015-01-19 16:20:17 -05:00
rds rds: Fix min() warning in rds_message_inc_copy_to_user() 2014-12-15 11:49:09 -05:00
rfkill Driver core patches for 3.19-rc1 2014-12-14 16:10:09 -08:00
rose new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
rxrpc net-timestamp: no-payload option 2015-02-02 18:46:51 -08:00
sched pkt_sched: fq: remove useless TIME_WAIT check 2015-01-28 23:23:57 -08:00
sctp net: sctp: fix slab corruption from use after free on INIT collisions 2015-01-26 17:02:05 -08:00
sunrpc rpc: fix xdr_truncate_encode to handle buffer ending on page boundary 2015-01-07 14:03:58 -05:00
switchdev swdevice: add new apis to set and del bridge port attributes 2015-02-01 23:16:34 -08:00
tipc tipc: fix excessive network event logging 2015-01-26 16:58:08 -08:00
unix net: remove sock_iocb 2015-01-28 23:15:07 -08:00
vmw_vsock put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-01-27 16:59:56 -08:00
x25 new helper: memcpy_from_msg() 2014-11-24 04:28:48 -05:00
xfrm netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
compat.c put iov_iter into msghdr 2014-12-09 16:29:03 -05:00
Kconfig net: introduce generic switch devices support 2014-12-02 20:01:20 -08:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-16 15:53:03 -08:00
socket.c net: remove sock_iocb 2015-01-28 23:15:07 -08:00
sysctl_net.c