linux-stable/net
Daniel Borkmann 109980b894 bpf: don't select potentially stale ri->map from buggy xdp progs
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:

i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.

ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.

iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.

Fix thus needs to avoid accessing stale ri->map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.

The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.

Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.

Fixes: 97f91a7cf0 ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08 20:58:09 -07:00
..
6lowpan
9p Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-15 12:00:42 -07:00
802
8021q
appletalk
atm net: atm: make atmdev_ops const 2017-08-09 22:43:50 -07:00
ax25 net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t 2017-07-04 22:35:19 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-09 16:28:45 -07:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-09-06 14:45:08 -07:00
bpf
bridge bridge: switchdev: Use an helper to clear forward mark 2017-09-05 11:51:47 -07:00
caif
can rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
ceph libceph: make RECOVERY_DELETES feature create a new interval 2017-08-01 16:46:45 +02:00
core bpf: don't select potentially stale ri->map from buggy xdp progs 2017-09-08 20:58:09 -07:00
dcb rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
dccp net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv() 2017-08-31 11:43:47 -07:00
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-09-03 17:08:42 -07:00
dns_resolver
dsa net: dsa: tag_brcm: Set output queue from skb queue mapping 2017-09-05 11:53:34 -07:00
ethernet
hsr net/hsr: Check skb_put_padto() return value 2017-08-22 13:40:23 -07:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-05 20:03:35 -07:00
ife
ipv4 ip_tunnel: fix setting ttl and tos value in collect_md mode 2017-09-08 20:47:10 -07:00
ipv6 ip6_tunnel: fix setting hop_limit value for ipv6 tunnel 2017-09-08 20:47:10 -07:00
ipx net, ipx: convert ipx_route.refcnt from atomic_t to refcount_t 2017-07-04 22:35:17 +01:00
iucv
kcm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-01 17:42:05 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
l2tp l2tp: pass tunnel pointer to ->session_create() 2017-09-03 11:04:21 -07:00
l3mdev
lapb net, lapb: convert lapb_cb.refcnt from atomic_t to refcount_t 2017-07-04 22:35:16 +01:00
llc net, llc: convert llc_sap.refcnt from atomic_t to refcount_t 2017-07-04 22:35:15 +01:00
mac80211 mac80211: fix deadlock in driver-managed RX BA session start 2017-09-06 15:22:02 +02:00
mac802154
mpls rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
ncsi net/ncsi: fix ncsi_vlan_rx_{add,kill}_vid references 2017-09-05 09:11:45 -07:00
netfilter netfilter: xt_hashlimit: fix build error caused by 64bit division 2017-09-08 18:55:53 +02:00
netlabel
netlink netlink: access nlk groups safely in netlink bind and getname 2017-09-06 21:22:54 -07:00
netrom net, netrom: convert nr_node.refcount from atomic_t to refcount_t 2017-07-04 22:35:17 +01:00
nfc
nsh nsh: add GSO support 2017-08-29 15:16:52 -07:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-09-03 17:08:42 -07:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-01 17:42:05 -07:00
phonet rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
psample
qrtr rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
rds rds: Fix incorrect statistics counting 2017-09-07 20:07:13 -07:00
rfkill
rose
rxrpc rxrpc: Make service connection lookup always check for retry 2017-09-05 14:39:17 -07:00
sched net: sched: fix memleak for chain zero 2017-09-07 19:17:20 -07:00
sctp sctp: fix missing wake ups in some situations 2017-09-08 10:02:47 -07:00
smc net/smc: synchronize buffer usage with device 2017-07-29 11:22:58 -07:00
strparser strparser: initialize all callbacks 2017-08-24 21:57:50 -07:00
sunrpc Two nfsd bugfixes, neither 4.13 regressions, but both potentially 2017-08-25 17:27:26 -07:00
switchdev net: switchdev: Remove bridge bypass support from switchdev 2017-08-07 14:48:48 -07:00
tipc tipc: remove unnecessary call to dev_net() 2017-09-06 21:25:52 -07:00
tls TLS: Fix length check in do_tls_getsockopt_tx() 2017-07-06 10:58:19 +01:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-21 17:06:42 -07:00
vmw_vsock hv_sock: implements Hyper-V transport for Virtual Sockets (AF_VSOCK) 2017-08-28 15:38:18 -07:00
wimax
wireless cfg80211: honor NL80211_RRF_NO_HT40{MINUS,PLUS} 2017-09-06 12:56:31 +02:00
x25 X25: constify null_x25_address 2017-08-03 09:13:51 -07:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-01 17:42:05 -07:00
compat.c get_compat_bpf_fprog(): don't copyin field-by-field 2017-07-04 13:14:34 -04:00
Kconfig net: Remove CONFIG_NETFILTER_DEBUG and _ASSERT() macros. 2017-09-04 13:25:20 +02:00
Makefile nsh: add GSO support 2017-08-29 15:16:52 -07:00
socket.c net: fixes for skb_send_sock 2017-08-16 11:27:52 -07:00
sysctl_net.c