linux-stable/include
Eric Dumazet 2e43d8eba6 tcp: properly terminate timers for kernel sockets
[ Upstream commit 151c9c724d ]

We had various syzbot reports about tcp timers firing after
the corresponding netns has been dismantled.

Fortunately Josef Bacik could trigger the issue more often,
and could test a patch I wrote two years ago.

When TCP sockets are closed, we call inet_csk_clear_xmit_timers()
to 'stop' the timers.

inet_csk_clear_xmit_timers() can be called from any context,
including when socket lock is held.
This is the reason it uses sk_stop_timer(), aka del_timer().
This means that ongoing timers might finish much later.

For user sockets, this is fine because each running timer
holds a reference on the socket, and the user socket holds
a reference on the netns.

For kernel sockets, we risk that the netns is freed before
the timer can complete, because kernel sockets do not hold
a reference on the netns.

This patch adds inet_csk_clear_xmit_timers_sync() function
that uses sk_stop_timer_sync() to make sure all timers
are terminated before the kernel socket is released.
Modules using kernel sockets close them in their netns exit()
handler.
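
The reasoning above can be sketched as a small, single-threaded
userspace model. This is not kernel code: every model_* name below is
hypothetical and invented for illustration, and the "timer" is reduced
to a flag saying whether a handler may still be running. It only shows
why a non-waiting stop is safe for user sockets (which pin the netns)
but not for kernel sockets, and why a waiting stop closes the gap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are NOT the kernel structures. */
struct model_netns {
	int refcnt;		/* references held by user sockets */
	bool freed;		/* set once the netns is dismantled */
};

struct model_sock {
	struct model_netns *net;
	bool timer_running;	/* a timer handler may still be executing */
};

static void model_sock_init(struct model_sock *sk, struct model_netns *net,
			    bool kernel)
{
	sk->net = net;
	sk->timer_running = true;
	if (!kernel)
		net->refcnt++;	/* only user sockets pin the netns */
}

/* Models del_timer(): returns at once, a running handler keeps going. */
static void model_stop_timer(struct model_sock *sk)
{
	(void)sk;
}

/* Models the _sync variant: returns only after the handler finished. */
static void model_stop_timer_sync(struct model_sock *sk)
{
	sk->timer_running = false;
}

/* Models netns teardown: the netns goes away once nothing pins it. */
static void model_netns_exit(struct model_netns *net)
{
	if (net->refcnt == 0)
		net->freed = true;
}

/* True if a handler could still run after its netns has been freed. */
static bool model_use_after_free(bool kernel, bool sync)
{
	struct model_netns net = { 0, false };
	struct model_sock sk;

	model_sock_init(&sk, &net, kernel);
	if (sync)
		model_stop_timer_sync(&sk);
	else
		model_stop_timer(&sk);
	model_netns_exit(&net);

	return sk.timer_running && net.freed;
}

/* Walks the three cases discussed in the commit message. */
static void model_self_check(void)
{
	assert(!model_use_after_free(false, false)); /* user socket: safe */
	assert(model_use_after_free(true, false));   /* kernel socket: race */
	assert(!model_use_after_free(true, true));   /* kernel + sync: safe */
}
```

In the model, the only unsafe combination is a kernel socket stopped
without waiting, which is the case the new _sync helper is meant to
cover.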

Also add a sock_not_owned_by_me() helper to get LOCKDEP
support: inet_csk_clear_xmit_timers_sync() must not be called
while the socket lock is held.

It is very possible that we can, in the future, revert commit
3a58f13a88 ("net: rds: acquire refcount on TCP sockets"),
which attempted to solve the issue in rds only.
(net/smc/af_smc.c and net/mptcp/subflow.c have similar code)

We probably can remove the check_net() tests from
tcp_out_of_resources() and __tcp_close() in the future.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/
Fixes: 26abe14379 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Fixes: 8a68173691 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Josef Bacik <josef@toxicpanda.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10 16:19:35 +02:00
..
acpi ACPI: utils: Fix acpi_evaluate_dsm_typed() redefinition error 2023-07-23 13:47:18 +02:00
asm-generic arch: Introduce CONFIG_FUNCTION_ALIGNMENT 2024-04-10 16:18:49 +02:00
clocksource
crypto crypto: af_alg - Disallow multiple in-flight AIO requests 2024-01-25 14:52:34 -08:00
drm drm/ttm: add ttm_resource_fini v2 2024-03-26 18:21:25 -04:00
dt-bindings dt-bindings: clocks: imx8mp: Add ID for usb suspend clock 2024-03-01 13:21:53 +01:00
keys KEYS: trusted: allow use of kernel RNG for key material 2023-10-19 23:05:33 +02:00
kunit
kvm
linux vfio: Introduce interface to flush virqfd inject workqueue 2024-04-10 16:19:30 +02:00
math-emu
media media: v4l2-mem2mem: add lock to protect parameter num_rdy 2023-08-26 14:23:23 +02:00
memory
misc
net tcp: properly terminate timers for kernel sockets 2024-04-10 16:19:35 +02:00
pcmcia
ras
rdma RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz 2023-12-13 18:36:40 +01:00
scsi scsi: core: Rename scsi_mq_done() into scsi_done() and export it 2023-10-19 23:05:32 +02:00
soc soc: fsl: qbman: Add CGR update function 2024-04-10 16:18:42 +02:00
sound ASoC: soc-card: Add storage for PCI SSID 2023-11-28 16:56:17 +00:00
target scsi: target: Fix multiple LUN_RESET handling 2023-05-11 23:00:26 +09:00
trace NFSD: add CB_RECALL_ANY tracepoints 2024-04-10 16:19:24 +02:00
uapi fanotify: introduce FAN_MARK_IGNORE 2024-04-10 16:19:07 +02:00
vdso
video
xen ACPI: processor: Fix evaluating _PDC method when running as Xen dom0 2023-05-11 23:00:22 +09:00