linux-stable/include
Florian Westphal f487722531 inet: inet_defrag: prevent sk release while still in use
[ Upstream commit 18685451fc ]

ip_local_out() and other functions can pass skb->sk as function argument.

If the skb is a fragment and reassembly happens before such function call
returns, the sk must not be released.

This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.

Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
  Calling ip_defrag() in output path is also implying skb_orphan(),
  which is buggy because output path relies on sk not disappearing.

  A relevant old patch about the issue was :
  8282f27449 ("inet: frag: Always orphan skbs inside ip_defrag()")

  [..]

  net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
  inet socket, not an arbitrary one.

  If we orphan the packet in ipvlan, then downstream things like FQ
  packet scheduler will not work properly.

  We need to change ip_defrag() to only use skb_orphan() when really
  needed, ie whenever frag_list is going to be used.

Eric suggested to stash sk in fragment queue and made an initial patch.
However there is a problem with this:

If skb is refragmented again right after, ip_do_fragment() will copy
head->sk to the new fragments, and sets up destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accouting to reflect the
fully reassembled skb, else wmem will underflow.

This change moves the orphan down into the core, to last possible moment.
As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
offset into the FRAG_CB, else skb->sk gets clobbered.

This allows to delay the orphaning long enough to learn if the skb has
to be queued or if the skb is completing the reasm queue.

In the former case, things work as before, skb is orphaned.  This is
safe because skb gets queued/stolen and won't continue past reasm engine.

In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accouting when inet_frag inflates truesize.

Fixes: 7026b1ddb6 ("netfilter: Pass socket pointer down through okfn().")
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240326101845.30836-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10 16:35:44 +02:00
..
acpi ACPI: PM: Add acpi_device_fix_up_power_children() function 2023-12-03 07:33:07 +01:00
asm-generic linux/init: remove __memexit* annotations 2024-02-23 09:25:03 +01:00
clocksource
crypto crypto: af_alg - Disallow multiple in-flight AIO requests 2024-01-25 15:35:16 -08:00
drm drm/bridge: add ->edid_read hook and drm_bridge_edid_read() 2024-04-03 15:28:38 +02:00
dt-bindings clk: renesas: r8a779g0: Correct PFC/GPIO parent clocks 2024-03-26 18:19:47 -04:00
keys
kunit
kvm
linux inet: inet_defrag: prevent sk release while still in use 2024-04-10 16:35:44 +02:00
math-emu
media media: mc: Add num_links flag to media_pad 2024-04-03 15:28:17 +02:00
memory
misc
net tcp: properly terminate timers for kernel sockets 2024-04-10 16:35:42 +02:00
pcmcia
ras
rdma RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz 2023-12-13 18:45:16 +01:00
rv
scsi scsi: sd: Fix TCG OPAL unlock on system resume 2024-04-03 15:28:59 +02:00
soc soc: qcom: socinfo: rename PM2250 to PM4125 2024-03-26 18:19:23 -04:00
sound ALSA: hda/tas2781: add ptrs to calibration functions 2024-03-26 18:19:57 -04:00
target
trace tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string 2024-03-15 10:48:16 -04:00
uapi net: fix IPSTATS_MIB_OUTPKGS increment in OutForwDatagrams. 2024-04-03 15:28:39 +02:00
ufs
vdso
video fbdev: stifb: Make the STI next font pointer a 32-bit signed offset 2023-11-28 17:19:58 +00:00
xen xen/events: reduce externally visible helper functions 2024-03-01 13:34:57 +01:00