linux-stable/net
Maciej Fijalkowski 9f0c8a9d4e xsk: Fix possible crash when multiple sockets are created
commit ba3beec2ec upstream.

Fix a crash that happens if an Rx only socket is created first, then a
second socket is created that is Tx only and bound to the same umem as
the first socket and also the same netdev and queue_id together with the
XDP_SHARED_UMEM flag. In this specific case, the tx_descs array page
pool was not created by the first socket as it was an Rx only socket.
When the second socket is bound it needs this tx_descs array of this
shared page pool as it has a Tx component, but unfortunately it was
never allocated, leading to a crash. Note that this array is only used
for zero-copy drivers using the batched Tx APIs, currently only ice and
i40e.

[ 5511.150360] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 5511.158419] #PF: supervisor write access in kernel mode
[ 5511.164472] #PF: error_code(0x0002) - not-present page
[ 5511.170416] PGD 0 P4D 0
[ 5511.173347] Oops: 0002 [#1] PREEMPT SMP PTI
[ 5511.178186] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E     5.18.0-rc1+ #97
[ 5511.187245] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 5511.198418] RIP: 0010:xsk_tx_peek_release_desc_batch+0x198/0x310
[ 5511.205375] Code: c0 83 c6 01 84 c2 74 6d 8d 46 ff 23 07 44 89 e1 48 83 c0 14 48 c1 e1 04 48 c1 e0 04 48 03 47 10 4c 01 c1 48 8b 50 08 48 8b 00 <48> 89 51 08 48 89 01 41 80 bd d7 00 00 00 00 75 82 48 8b 19 49 8b
[ 5511.227091] RSP: 0018:ffffc90000003dd0 EFLAGS: 00010246
[ 5511.233135] RAX: 0000000000000000 RBX: ffff88810c8da600 RCX: 0000000000000000
[ 5511.241384] RDX: 000000000000003c RSI: 0000000000000001 RDI: ffff888115f555c0
[ 5511.249634] RBP: ffffc90000003e08 R08: 0000000000000000 R09: ffff889092296b48
[ 5511.257886] R10: 0000ffffffffffff R11: ffff889092296800 R12: 0000000000000000
[ 5511.266138] R13: ffff88810c8db500 R14: 0000000000000040 R15: 0000000000000100
[ 5511.274387] FS:  0000000000000000(0000) GS:ffff88903f800000(0000) knlGS:0000000000000000
[ 5511.283746] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5511.290389] CR2: 0000000000000008 CR3: 00000001046e2001 CR4: 00000000003706f0
[ 5511.298640] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5511.306892] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5511.315142] Call Trace:
[ 5511.317972]  <IRQ>
[ 5511.320301]  ice_xmit_zc+0x68/0x2f0 [ice]
[ 5511.324977]  ? ktime_get+0x38/0xa0
[ 5511.328913]  ice_napi_poll+0x7a/0x6a0 [ice]
[ 5511.333784]  __napi_poll+0x2c/0x160
[ 5511.337821]  net_rx_action+0xdd/0x200
[ 5511.342058]  __do_softirq+0xe6/0x2dd
[ 5511.346198]  irq_exit_rcu+0xb5/0x100
[ 5511.350339]  common_interrupt+0xa4/0xc0
[ 5511.354777]  </IRQ>
[ 5511.357201]  <TASK>
[ 5511.359625]  asm_common_interrupt+0x1e/0x40
[ 5511.364466] RIP: 0010:cpuidle_enter_state+0xd2/0x360
[ 5511.370211] Code: 49 89 c5 0f 1f 44 00 00 31 ff e8 e9 00 7b ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 72 02 00 00 31 ff e8 02 0c 80 ff fb 45 85 f6 <0f> 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d 14 90 49
[ 5511.391921] RSP: 0018:ffffffff82a03e60 EFLAGS: 00000202
[ 5511.397962] RAX: ffff88903f800000 RBX: 0000000000000001 RCX: 000000000000001f
[ 5511.406214] RDX: 0000000000000000 RSI: ffffffff823400b9 RDI: ffffffff8234c046
[ 5511.424646] RBP: ffff88810a384800 R08: 000005032a28c046 R09: 0000000000000008
[ 5511.443233] R10: 000000000000000b R11: 0000000000000006 R12: ffffffff82bcf700
[ 5511.461922] R13: 000005032a28c046 R14: 0000000000000001 R15: 0000000000000000
[ 5511.480300]  cpuidle_enter+0x29/0x40
[ 5511.494329]  do_idle+0x1c7/0x250
[ 5511.507610]  cpu_startup_entry+0x19/0x20
[ 5511.521394]  start_kernel+0x649/0x66e
[ 5511.534626]  secondary_startup_64_no_verify+0xc3/0xcb
[ 5511.549230]  </TASK>

Detect such case during bind() and allocate this memory region via newly
introduced xp_alloc_tx_descs(). Also, use kvcalloc instead of kcalloc as
for other buffer pool allocations, so that it matches the kvfree() from
xp_destroy().

Fixes: d1bc532e99 ("i40e: xsk: Move tmp desc array from driver to pool")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220425153745.481322-1-maciej.fijalkowski@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-14 18:41:54 +02:00
..
6lowpan
9p xen/9p: use alloc/free_pages_exact() 2022-03-07 09:48:55 +01:00
802
8021q vlan: move dev_put into vlan_dev_uninit 2022-02-09 13:33:39 +00:00
appletalk
atm proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
ax25 ax25: Fix ax25 session cleanup problems 2022-06-14 18:41:25 +02:00
batman-adv batman-adv: Don't skb_split skbuffs with frag_list 2022-05-18 10:28:11 +02:00
bluetooth bluetooth: don't use bitmaps for random flag accesses 2022-06-14 18:41:26 +02:00
bpf bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide 2022-04-13 19:27:40 +02:00
bpfilter
bridge net: bridge: Clear offload_fwd_mark when passing frame up bridge interface. 2022-05-25 09:59:11 +02:00
caif Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
can can: isotp: remove re-binding of bound socket 2022-05-12 12:32:25 +02:00
ceph libceph: fix potential use-after-free on linger ping and resends 2022-05-25 09:59:04 +02:00
core net, neigh: Set lower cap for neigh_managed_work rearming 2022-06-14 18:41:37 +02:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
dns_resolver
dsa net: dsa: flush switchdev workqueue on bridge join error path 2022-05-18 10:28:14 +02:00
ethernet
ethtool ethtool: use phydev variable 2022-01-06 12:33:35 +00:00
hsr net: Write lock dev_base_lock without disabling bottom halves. 2021-11-29 12:12:36 +00:00
ieee802154 net: ieee802154: Return meaningful error codes from the netlink helpers 2022-01-27 08:20:47 +01:00
ife
ipv4 tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd 2022-06-14 18:41:54 +02:00
ipv6 net: ipv6: unexport __init-annotated seg6_hmac_init() 2022-06-14 18:41:31 +02:00
iucv net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
kcm net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
key Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process" 2022-06-14 18:41:37 +02:00
l2tp l2tp: add netns refcount tracker to l2tp_dfs_seq_data 2021-12-10 06:38:27 -08:00
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-27 14:41:01 +02:00
lapb
llc llc: only change llc->dev when bind() succeeds 2022-03-28 10:03:22 +02:00
mac80211 mac80211: upgrade passive scan to active scan on DFS channels after beacon rx 2022-06-09 10:26:26 +02:00
mac802154
mctp mctp: defer the kfree of object mdev->addrs 2022-05-09 09:16:24 +02:00
mpls net: mpls: Fix GCC 12 warning 2022-02-10 15:29:39 +00:00
mptcp mptcp: reset the packet scheduler on PRIO change 2022-06-09 10:25:39 +02:00
ncsi all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate 2022-01-15 08:47:31 -08:00
netfilter netfilter: nf_tables: bail out early if hardware offload is not supported 2022-06-14 18:41:30 +02:00
netlabel netlabel: fix out-of-bounds memory accesses 2022-04-13 19:27:22 +02:00
netlink netlink: do not reset transport header in netlink_recvmsg() 2022-05-18 10:28:13 +02:00
netrom netrom: fix api breakage in nr_setsockopt() 2022-01-07 14:11:05 +00:00
nfc NFC: NULL out the dev->rfkill to prevent UAF 2022-06-09 10:25:39 +02:00
nsh
openvswitch net: openvswitch: fix misuse of the cached connection on tuple changes 2022-06-14 18:41:47 +02:00
packet net/packet: fix packet_sock xmit return value checking 2022-04-27 14:41:00 +02:00
phonet phonet/pep: refuse to enable an unbound pipe 2021-12-20 11:49:51 +00:00
psample
qrtr bus: mhi: core: Add an API for auto queueing buffers for DL channel 2021-12-17 17:17:14 +01:00
rds net: rds: use maybe_get_net() when acquiring refcount on TCP sockets 2022-05-18 10:28:12 +02:00
rfkill rfkill: make new event layout opt-in 2022-04-08 13:57:27 +02:00
rose net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
rxrpc rxrpc: Fix decision on when to generate an IDLE ACK 2022-06-09 10:25:58 +02:00
sched net/sched: act_police: more accurate MTU policing 2022-06-14 18:41:52 +02:00
sctp sctp: read sk->sk_bound_dev_if once in sctp_rcv() 2022-06-09 10:25:53 +02:00
smc net/smc: fixes for converting from "struct smc_cdc_tx_pend **" to "struct smc_wr_tx_pend_priv *" 2022-06-14 18:41:22 +02:00
strparser
sunrpc SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() 2022-06-14 18:41:31 +02:00
switchdev net: switchdev: add net device refcount tracker 2021-12-07 20:44:58 -08:00
tipc tipc: check attribute length for bearer name 2022-06-14 18:41:25 +02:00
tls tls: Fix context leak on tls_device_down 2022-05-18 10:28:16 +02:00
unix af_unix: Fix a data-race in unix_dgram_peer_wake_me(). 2022-06-14 18:41:30 +02:00
vmw_vsock vsock/virtio: enable VQs early on probe 2022-04-08 13:58:32 +02:00
wireless cfg80211: declare MODULE_FIRMWARE for regulatory.db 2022-06-09 10:26:26 +02:00
x25 net/x25: Fix null-ptr-deref caused by x25_disconnect 2022-04-08 13:58:34 +02:00
xdp xsk: Fix possible crash when multiple sockets are created 2022-06-14 18:41:54 +02:00
xfrm xfrm: rework default policy structure 2022-05-25 09:59:06 +02:00
Kconfig
Kconfig.debug net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
Makefile
compat.c
devres.c
socket.c net: fix documentation for kernel_getsockname 2022-02-14 14:01:19 +00:00
sysctl_net.c