linux-stable/net/ipv4
Willem de Bruijn f2b3ee9e42 ipv6: Fix ip_gre lockless xmits.
Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK. Sit and
ipip set this unconditionally in ops->setup, but gre enables it
conditionally after parameter passing in ops->newlink. That callback
is not invoked when a tunnel is set up as below, however, so GRE
tunnels still take the lock.

modprobe ip_gre
ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
ip link set test0 up
ip addr add 10.6.0.1 dev test0
 # cat /sys/class/net/test0/features
 # $DIR/test_tunnel_xmit 10 10.5.2.1
ip route add 10.5.2.0/24 dev test0
ip tunnel del test0

The newlink callback is only called in rtnl_newlink, and only if
the device is new, as it calls register_netdevice internally. GRE
tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL,
which calls ipgre_tunnel_locate, which calls register_netdev.
rtnl_newlink is called at 'ip link set', but skips ops->newlink
because the device already exists, so the device comes up with
locking still enabled. The equivalent ipip tunnel works fine
(just substitute 'mode ipip' for 'mode gre').

On kernels before /sys/class/net/*/features was removed [1],
the first commented-out line returns 0x6000 with mode gre,
which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip,
it reports 0x7000. This test cannot be used on recent kernels where
the sysfs file is removed (and ETHTOOL_GFEATURES does not currently
work for tunnel devices, because they lack dev->ethtool_ops).

The second commented-out line calls a simple transmission test [2]
that sends on 24 cores at maximum rate. Results of a single run:

ipip:			19,372,306
gre before patch:	 4,839,753
gre after patch:	19,133,873

This patch copies the condition check from ipgre_newlink into
ipgre_tunnel_locate. It works for me, both with oseq on and off.
This is the first time I have looked at rtnetlink and iproute2 code,
though, so someone more knowledgeable should probably check the
patch. Thanks.

The tail of both functions is now identical, by the way. To avoid
code duplication, I'll be happy to rework this and merge the two.

[1] http://patchwork.ozlabs.org/patch/104610/
[2] http://kernel.googlecode.com/files/xmit_udp_parallel.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-26 16:34:08 -05:00
..
netfilter Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
af_inet.c per-netns ipv4 sysctl_tcp_mem 2011-12-12 19:04:11 -05:00
ah4.c ah: Don't return NET_XMIT_DROP on input. 2011-11-12 18:13:32 -05:00
arp.c ipv6: Use universal hash for NDISC. 2011-12-28 15:06:58 -05:00
cipso_ipv4.c cipso: remove an unneeded NULL check in cipso_v4_doi_add() 2011-10-11 18:43:53 -04:00
datagram.c ipv4: Lock socket and use cork flow in ip4_datagram_connect(). 2011-05-08 13:48:57 -07:00
devinet.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
esp4.c inet: constify ip headers and in6_addr 2011-04-22 11:04:14 -07:00
fib_frontend.c rtnetlink: Compute and store minimum ifinfo dump size 2011-06-09 20:38:07 -07:00
fib_lookup.h ipv4: Fix nexthop caching wrt. scoping. 2011-03-24 18:06:47 -07:00
fib_rules.c net: ipv4: export fib_lookup and fib_table_lookup 2011-12-04 22:43:33 +01:00
fib_semantics.c ipv4: Fix fib_info->fib_metrics leak 2011-09-16 17:42:26 -04:00
fib_trie.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
gre.c rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER 2011-08-02 04:29:23 -07:00
icmp.c net: more accurate skb truesize 2011-10-13 16:05:07 -04:00
igmp.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
inet_connection_sock.c tcp: bind() optimize port allocation 2012-01-25 21:50:43 -05:00
inet_diag.c inet_diag: Rename inet_diag_req_compat into inet_diag_req 2012-01-11 12:56:06 -08:00
inet_fragment.c
inet_hashtables.c net: Compute protocol sequence numbers and fragment IDs using MD5. 2011-08-06 18:33:19 -07:00
inet_lro.c net: add skb frag size accessors 2011-10-19 03:10:46 -04:00
inet_timewait_sock.c net: Fix files explicitly needing to include module.h 2011-10-31 19:30:28 -04:00
inetpeer.c inetpeer: initialize ->redirect_genid in inet_getpeer() 2012-01-17 15:52:12 -05:00
ip_forward.c ipv4: Save nexthop address of LSRR/SSRR option to IPCB. 2011-11-23 19:19:32 -05:00
ip_fragment.c treewide: Fix typos in various parts of the kernel, and fix some comments. 2011-12-02 14:57:31 +01:00
ip_gre.c ipv6: Fix ip_gre lockless xmits. 2012-01-26 16:34:08 -05:00
ip_input.c ip: introduce ip_is_fragment helper inline function 2011-06-21 20:33:34 -07:00
ip_options.c ipv4: Save nexthop address of LSRR/SSRR option to IPCB. 2011-11-23 19:19:32 -05:00
ip_output.c net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. 2011-12-05 15:20:19 -05:00
ip_sockglue.c net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
ipcomp.c inet: constify ip headers and in6_addr 2011-04-22 11:04:14 -07:00
ipconfig.c net: fix some sparse errors 2012-01-17 10:31:12 -05:00
ipip.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
ipmr.c net: reintroduce missing rcu_assign_pointer() calls 2012-01-12 12:26:56 -08:00
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-01-09 14:46:52 -08:00
Makefile tcp memory pressure controls 2011-12-12 19:04:10 -05:00
netfilter.c netfilter: possible unaligned packet header in ip_route_me_harder 2011-11-21 18:46:18 +01:00
ping.c net: fix some sparse errors 2012-01-17 10:31:12 -05:00
proc.c tcp: detect loss above high_seq in recovery 2012-01-22 15:08:44 -05:00
protocol.c net: add __rcu annotations to protocol 2010-10-27 11:37:31 -07:00
raw.c ipv4: Remove all uses of LL_ALLOCATED_SPACE 2011-11-18 14:37:08 -05:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-23 17:13:56 -05:00
syncookies.c tcp: Replace constants with #define macros 2011-12-21 01:03:23 -05:00
sysctl_net_ipv4.c net: ping: remove some sparse errors 2011-12-14 13:34:55 -05:00
tcp.c per-netns ipv4 sysctl_tcp_mem 2011-12-12 19:04:11 -05:00
tcp_bic.c tcp: fix undo after RTO for BIC 2012-01-20 14:17:26 -05:00
tcp_cong.c tcp: do not scale TSO segment size with reordering degree 2011-11-29 00:29:41 -05:00
tcp_cubic.c tcp: fix undo after RTO for CUBIC 2012-01-20 14:17:26 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_highspeed.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_htcp.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_hybla.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_illinois.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_input.c tcp: detect loss above high_seq in recovery 2012-01-22 15:08:44 -05:00
tcp_ipv4.c tcp: md5: using remote adress for md5 lookup in rst packet 2012-01-22 15:08:45 -05:00
tcp_lp.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tcp_memcontrol.c net: decrement memcg jump label when limit, not usage, is changed 2012-01-12 12:27:59 -08:00
tcp_minisocks.c net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
tcp_output.c foundations of per-cgroup memory pressure controlling. 2011-12-12 19:04:10 -05:00
tcp_probe.c net: ipv4: tcp_probe: cleanup snprintf() use 2010-11-17 12:27:46 -08:00
tcp_scalable.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_timer.c net: fix assignment of 0/1 to bool variables. 2011-12-19 22:27:29 -05:00
tcp_vegas.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_vegas.h
tcp_veno.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_westwood.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_yeah.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tunnel4.c net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
udp.c udp: Export code sk lookup routines 2011-12-09 14:14:08 -05:00
udp_diag.c net: kill duplicate included header 2012-01-17 10:31:12 -05:00
udp_impl.h
udplite.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
xfrm4_input.c
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c ipv4: Don't pre-seed hoplimit metric. 2010-12-12 22:08:17 -08:00
xfrm4_output.c xfrm4: Don't call icmp_send on local error 2011-07-01 17:33:19 -07:00
xfrm4_policy.c ipv4: fix ipsec forward performance regression 2011-10-24 03:01:22 -04:00
xfrm4_state.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
xfrm4_tunnel.c net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00