linux-stable/net
Eric Dumazet 2e43d8eba6 tcp: properly terminate timers for kernel sockets
[ Upstream commit 151c9c724d ]

We had various syzbot reports about tcp timers firing after
the corresponding netns has been dismantled.

Fortunately Josef Bacik could trigger the issue more often,
and could test a patch I wrote two years ago.

When TCP sockets are closed, we call inet_csk_clear_xmit_timers()
to 'stop' the timers.

inet_csk_clear_xmit_timers() can be called from any context,
including when the socket lock is held.
This is why it uses sk_stop_timer(), aka del_timer(),
which does not wait for a handler that is already running.
This means that ongoing timers might finish much later.

For user sockets, this is fine because each running timer
holds a reference on the socket, and the user socket holds
a reference on the netns.

For kernel sockets, we risk that the netns is freed before
a timer can complete, because kernel sockets do not hold
a reference on the netns.
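
The reference chain described above can be illustrated with a small
userspace model. This is only a sketch: the toy_* types and functions
are hypothetical stand-ins, not kernel APIs, and the model assumes the
netns itself starts with a single reference that is dropped at teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only; none of these are kernel types. */
struct toy_netns {
	int refcnt;
	bool freed;
};

struct toy_sock {
	struct toy_netns *net;
	bool kern;		/* kernel socket: does not pin the netns */
	int refcnt;
	bool timer_pending;
};

static void toy_put_net(struct toy_netns *net)
{
	if (--net->refcnt == 0)
		net->freed = true;
}

static void toy_sock_init(struct toy_sock *sk, struct toy_netns *net, bool kern)
{
	sk->net = net;
	sk->kern = kern;
	sk->refcnt = 1;
	sk->timer_pending = false;
	if (!kern)
		net->refcnt++;	/* only user sockets hold a netns reference */
}

static void toy_sock_put(struct toy_sock *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0 && !sk->kern)
		toy_put_net(sk->net);
}

/* An armed timer holds a reference on the socket. */
static void toy_arm_timer(struct toy_sock *sk)
{
	sk->timer_pending = true;
	sk->refcnt++;
}

/* Returns true if the handler would run against a freed netns. */
static bool toy_timer_fires(struct toy_sock *sk)
{
	bool netns_gone = sk->net->freed;

	sk->timer_pending = false;
	toy_sock_put(sk);
	return netns_gone;
}

/* Close the socket, dismantle the netns, then let the late timer fire. */
static bool toy_scenario(bool kern)
{
	struct toy_netns net = { .refcnt = 1, .freed = false };
	struct toy_sock sk;

	toy_sock_init(&sk, &net, kern);
	toy_arm_timer(&sk);
	toy_sock_put(&sk);	/* close: drops the owner's reference */
	toy_put_net(&net);	/* netns teardown drops its own reference */
	return toy_timer_fires(&sk);
}
```

In this model toy_scenario(false) reports no problem, because the
pending timer keeps the user socket alive and the socket keeps the
netns alive. toy_scenario(true) reports that the handler ran after the
netns was freed.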

This patch adds an inet_csk_clear_xmit_timers_sync() function
that uses sk_stop_timer_sync() to make sure all timers
are terminated before the kernel socket is released.
Modules using kernel sockets close them in their netns exit()
handler.

Also add a sock_not_owned_by_me() helper to get LOCKDEP
support: inet_csk_clear_xmit_timers_sync() must not be called
while the socket lock is held.
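
The difference between the two stop variants, and the locking rule, can
be sketched with a single-threaded toy model. Everything named mini_*
is hypothetical and is not the kernel implementation. "Waiting" for a
handler is modelled by simply marking it finished, and the lock check
returns an error where the kernel helper would rely on LOCKDEP.

```c
#include <stdbool.h>
#include <stddef.h>

/* TCP uses three such timers: retransmit, delayed ACK and keepalive. */
enum { MINI_NTIMERS = 3 };

struct mini_timer {
	bool pending;		/* queued, handler not started yet */
	bool handler_running;	/* handler currently executing */
};

struct mini_sock {
	bool lock_held;		/* socket lock owned by the caller */
	struct mini_timer timers[MINI_NTIMERS];
};

/* del_timer()-like: dequeue only, never wait for a running handler. */
static void mini_clear_timers(struct mini_sock *sk)
{
	for (size_t i = 0; i < MINI_NTIMERS; i++)
		sk->timers[i].pending = false;
}

/*
 * Synchronous variant: refuse to run with the socket lock held, since a
 * running handler needs that lock and waiting for it would deadlock.
 */
static int mini_clear_timers_sync(struct mini_sock *sk)
{
	if (sk->lock_held)
		return -1;

	for (size_t i = 0; i < MINI_NTIMERS; i++) {
		sk->timers[i].pending = false;
		/* Models blocking until the handler has returned. */
		sk->timers[i].handler_running = false;
	}
	return 0;
}

static bool mini_any_handler_running(const struct mini_sock *sk)
{
	for (size_t i = 0; i < MINI_NTIMERS; i++)
		if (sk->timers[i].handler_running)
			return true;
	return false;
}
```

With one handler marked as running, mini_clear_timers() leaves it
running, while mini_clear_timers_sync() returns 0 only once no handler
remains. With lock_held set, the synchronous variant returns -1 instead
of waiting.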

It is quite possible that we can, in the future, revert commit
3a58f13a88 ("net: rds: acquire refcount on TCP sockets"),
which attempted to solve the issue in rds only.
(net/smc/af_smc.c and net/mptcp/subflow.c have similar code)

We probably can remove the check_net() tests from
tcp_out_of_resources() and __tcp_close() in the future.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/
Fixes: 26abe14379 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Fixes: 8a68173691 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Josef Bacik <josef@toxicpanda.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10 16:19:35 +02:00
..
6lowpan
9p net: 9p: avoid freeing uninit memory in p9pdu_vreadf 2024-01-05 15:13:34 +01:00
802
8021q vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING 2024-02-23 08:54:27 +01:00
appletalk appletalk: Fix Use-After-Free in atalk_ioctl 2023-12-20 15:17:37 +01:00
atm atm: Fix Use-After-Free in do_vcc_ioctl 2023-12-20 15:17:35 +01:00
ax25
batman-adv net: vlan: introduce skb_vlan_eth_hdr() 2023-12-20 15:17:35 +01:00
bluetooth exit: Rename module_put_and_exit to module_put_and_kthread_exit 2024-04-10 16:18:55 +02:00
bpf
bpfilter
bridge netfilter: bridge: confirm multicast packets before passing them up the stack 2024-03-06 14:38:46 +00:00
caif
can can: j1939: Fix UAF in j1939_sk_match_filter during setsockopt(SO_J1939_FILTER) 2024-02-23 08:55:10 +01:00
ceph libceph: use kernel_connect() 2023-10-19 23:05:36 +02:00
core net: report RCU QS on threaded NAPI repolling 2024-03-26 18:21:37 -04:00
dcb
dccp net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
dns_resolver keys, dns: Fix size check of V1 server-list header 2024-01-25 14:52:46 -08:00
dsa
ethernet
ethtool ethtool: netlink: Add missing ethnl_ops_begin/complete 2024-01-25 14:52:54 -08:00
hsr hsr: Handle failures in module init 2024-03-26 18:21:36 -04:00
ieee802154
ife net: sched: ife: fix potential use-after-free 2024-01-05 15:13:29 +01:00
ipv4 tcp: properly terminate timers for kernel sockets 2024-04-10 16:19:35 +02:00
ipv6 ipv6: fib6_rules: flush route cache when rule is changed 2024-03-26 18:21:22 -04:00
iucv net/iucv: fix the allocation size of iucv_path_table array 2024-03-26 18:21:13 -04:00
kcm net: kcm: fix incorrect parameter validation in the kcm_getsockopt) function 2024-03-26 18:21:23 -04:00
key
l2tp l2tp: fix incorrect parameter validation in the pppol2tp_getsockopt() function 2024-03-26 18:21:23 -04:00
l3mdev
lapb
llc llc: call sock_orphan() at release time 2024-02-23 08:54:54 +01:00
mac80211 wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes 2024-04-10 16:19:31 +02:00
mac802154 mac802154: fix llsec key resources release in mac802154_llsec_key_del 2024-04-10 16:18:39 +02:00
mctp mctp: perform route lookups under a RCU read-side lock 2023-10-25 11:58:59 +02:00
mpls
mptcp mptcp: fix double-free on socket dismantle 2024-03-06 14:38:51 +00:00
ncsi net/ncsi: Fix netlink major/minor version numbers 2024-01-25 14:52:36 -08:00
netfilter netfilter: nf_tables: reject constant set with timeout 2024-04-10 16:18:44 +02:00
netlabel calipso: fix memory leak in netlbl_calipso_add_pass() 2024-01-25 14:52:33 -08:00
netlink netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter 2024-03-06 14:38:45 +00:00
netrom netrom: Fix data-races around sysctl_net_busy_read 2024-03-15 10:48:17 -04:00
nfc nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet 2024-04-10 16:19:35 +02:00
nsh
openvswitch net: openvswitch: limit the number of recursions from action sets 2024-02-23 08:55:02 +01:00
packet packet: annotate data-races around ignore_outgoing 2024-03-26 18:21:35 -04:00
phonet
psample psample: Require 'CAP_NET_ADMIN' when joining "packets" group 2023-12-13 18:36:37 +01:00
qrtr net: qrtr: ns: Return 0 if server port is not present 2024-01-25 14:52:30 -08:00
rds rds: introduce acquire/release ordering in acquire/release_in_xmit() 2024-03-26 18:21:36 -04:00
rfkill net: rfkill: gpio: set GPIO direction 2024-01-05 15:13:34 +01:00
rose net/rose: fix races in rose_kill_by_device() 2024-01-05 15:13:29 +01:00
rxrpc rxrpc: Fix response to PING RESPONSE ACKs to a dead call 2024-02-23 08:54:58 +01:00
sched net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs 2024-03-01 13:21:54 +01:00
sctp sctp: update hb timer immediately after users change hb_interval 2023-10-10 21:59:08 +02:00
smc net/smc: fix illegal rmb_desc access in SMC-D connection dump 2024-02-23 08:54:27 +01:00
strparser
sunrpc nfsd: fix double fget() bug in __write_ports_addfd() 2024-04-10 16:19:28 +02:00
switchdev
tipc tipc: Check the bearer type before calling tipc_udp_nl_bearer_add() 2024-02-23 08:54:58 +01:00
tls Revert "tls: rx: move counting TlsDecryptErrors for sync" 2024-03-06 14:38:51 +00:00
unix af_unix: Annotate data-race of gc_in_progress in wait_for_unix_gc(). 2024-03-26 18:21:17 -04:00
vmw_vsock virtio/vsock: fix logic which reduces credit update messages 2024-01-25 14:52:38 -08:00
wireless wifi: nl80211: reject iftype change with mesh ID change 2024-03-06 14:38:48 +00:00
x25 net/x25: fix incorrect parameter validation in the x25_getsockopt() function 2024-03-26 18:21:23 -04:00
xdp xsk: Fix xsk_diag use-after-free error during socket cleanup 2023-09-19 12:22:58 +02:00
xfrm xfrm: Avoid clang fortify warning in copy_to_user_tmpl() 2024-04-10 16:18:45 +02:00
compat.c
devres.c
Kconfig
Makefile
socket.c net: Save and restore msg_namelen in sock_sendmsg 2024-01-15 18:51:16 +01:00
sysctl_net.c