linux-stable/net/core
Ido Schimmel 3c71006d15 ipv4: fib_rules: Check if rule is a default rule
Currently, when non-default (custom) FIB rules are used, devices capable
of layer 3 offloading flush their tables and let the kernel do the
forwarding instead.

When these devices' drivers are loaded they register to the FIB
notification chain, which lets them know about the existence of any
custom FIB rules. This is done by sending a RULE_ADD notification based
on the value of 'net->ipv4.fib_has_custom_rules'.

This approach is problematic when VRF offload is taken into account, as
upon the creation of the first VRF netdev, a l3mdev rule is programmed
to direct skbs to the VRF's table.

Instead of merely reading the above value and sending a single RULE_ADD
notification, we should iterate over all the FIB rules and send a
detailed notification for each, thereby allowing offloading drivers to
sanitize the rules they don't support and potentially flush their
tables.

While l3mdev rules are uniquely marked, the default rules are not.
Therefore, when they are being notified they might invoke offloading
drivers to unnecessarily flush their tables.

Solve this by adding an helper to check if a FIB rule is a default rule.
Namely, its selector should match all packets and its action should
point to the local, main or default tables.

As noted by David Ahern, uniquely marking the default rules is
insufficient. When using VRFs, it's common to avoid false hits by moving
the rule for the local table to just before the main table:

Default configuration:
$ ip rule show
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

Common configuration with VRFs:
$ ip rule show
1000:   from all lookup [l3mdev-table]
32765:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16 10:18:33 -07:00
..
datagram.c udp: properly cope with csum errors 2017-02-07 11:19:00 -05:00
dev.c net: Resend IGMP memberships upon peer notification. 2017-03-14 11:33:44 -07:00
dev_addr_lists.c
dev_ioctl.c
devlink.c devlink: allow to fillup eswitch attrs even if mode_get op does not exist 2017-02-10 14:43:00 -05:00
drop_monitor.c drop_monitor: use setup_timer 2017-03-12 23:47:16 -07:00
dst.c net: pending_confirm is not used anymore 2017-02-07 13:07:47 -05:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
ethtool.c ethtool: add CRC32 as an RSS hash function 2017-03-09 16:39:58 -08:00
fib_rules.c ipv4: fib_rules: Check if rule is a default rule 2017-03-16 10:18:33 -07:00
filter.c bpf: Fix bpf_xdp_event_output 2017-02-23 13:53:42 -05:00
flow.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:25:04 -08:00
flow_dissector.c flow_dissector: Move GRE dissection into a separate function 2017-03-08 23:08:57 -08:00
gen_estimator.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
gen_stats.c net_sched: gen_estimator: complete rewrite of rate estimators 2016-12-05 15:21:59 -05:00
gro_cells.c gro_cells: move to net/core/gro_cells.c 2017-02-08 14:38:18 -05:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c
lwt_bpf.c lwtunnel: remove device arg to lwtunnel_build_state 2017-01-30 15:14:22 -05:00
lwtunnel.c lwtunnel: remove unused but set variable 2017-03-13 23:55:45 -07:00
Makefile gro_cells: move to net/core/gro_cells.c 2017-02-08 14:38:18 -05:00
neighbour.c net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification 2017-02-15 12:38:43 -05:00
net-procfs.c net: remove NETDEV_TX_LOCKED support 2016-04-26 15:53:05 -04:00
net-sysfs.c net: use net->count to check whether a netns is alive or not 2017-03-13 16:02:27 -07:00
net-sysfs.h
net-traces.c
net_namespace.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
netclassid_cgroup.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
netevent.c
netpoll.c netpoll: more efficient locking 2016-11-16 18:32:02 -05:00
netprio_cgroup.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
pktgen.c net-tc: convert tc_verd to integer bitfields 2017-01-08 20:58:52 -05:00
ptp_classifier.c
request_sock.c ipv4: Namespaceify tcp_max_syn_backlog knob 2016-12-29 11:38:31 -05:00
rtnetlink.c rtnl: simplify error return path in rtnl_create_link() 2017-02-21 12:17:43 -05:00
scm.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
secure_seq.c tcp: rename *_sequence_number() to *_seq_and_tsoff() 2017-03-09 18:25:34 -08:00
skbuff.c net: fix socket refcounting in skb_complete_tx_timestamp() 2017-03-07 14:06:15 -08:00
sock.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-03-15 11:59:10 -07:00
sock_diag.c sock_diag: align nlattr properly when needed 2016-04-26 12:00:48 -04:00
sock_reuseport.c soreuseport: do not export reuseport_add_sock() 2016-10-18 14:18:23 -04:00
stream.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
sysctl_net_core.c bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
timestamping.c
tso.c
utils.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00