No description
Find a file
Talal Ahmad f1a456f8f3 net: avoid double accounting for pure zerocopy skbs
Track skbs with only zerocopy data and avoid charging them to kernel
memory to correctly account the memory utilization for msg_zerocopy.
All of the data in such skbs is held in user pages which are already
accounted to user. Before this change, they are charged again in
kernel in __zerocopy_sg_from_iter. The charging in kernel is
excessive because data is not being copied into skb frags. This
excessive charging can lead to kernel going into memory pressure
state which impacts all sockets in the system adversely. Mark pure
zerocopy skbs with a SKBFL_PURE_ZEROCOPY flag and remove
charge/uncharge for data in such skbs.

Initially, an skb is marked pure zerocopy when it is empty and in
zerocopy path. skb can then change from a pure zerocopy skb to mixed
data skb (zerocopy and copy data) if it is at tail of write queue and
there is room available in it and non-zerocopy data is being sent in
the next sendmsg call. At this time sk_mem_charge is done for the pure
zerocopied data and the pure zerocopy flag is unmarked. We found that
this happens very rarely on workloads that pass MSG_ZEROCOPY.

A pure zerocopy skb can later be coalesced into normal skb if they are
next to each other in queue but this patch prevents coalescing from
happening. This avoids complexity of charging when skb downgrades from
pure zerocopy to mixed. This is also rare.

In sk_wmem_free_skb, if it is a pure zerocopy skb, an sk_mem_uncharge
for SKB_TRUESIZE(MAX_TCP_HEADER) is done for sk_mem_charge in
tcp_skb_entail for an skb without data.

Testing with the msg_zerocopy.c benchmark between two hosts(100G nics)
with zerocopy showed that before this patch the 'sock' variable in
memory.stat for cgroup2 that tracks sum of sk_forward_alloc,
sk_rmem_alloc and sk_wmem_queued is around 1822720 and with this
change it is 0. This is due to no charge to sk_forward_alloc for
zerocopy data and shows memory utilization for kernel is lowered.

Signed-off-by: Talal Ahmad <talalahmad@google.com>
Acked-by: Arjun Roy <arjunroy@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 16:33:27 -07:00
arch net: xtensa: use eth_hw_addr_set() 2021-10-29 13:17:21 +01:00
block block-5.15-2021-10-22 2021-10-22 17:42:13 -10:00
certs
crypto
Documentation dt-bindings: net: lantiq-xrx200-net: Remove the burst length properties 2021-10-29 12:15:35 +01:00
drivers netdevsim: fix uninit value in nsim_drv_configure_vfs() 2021-11-01 16:29:56 -07:00
fs Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-10-24 09:36:06 -10:00
include net: avoid double accounting for pure zerocopy skbs 2021-11-01 16:33:27 -07:00
init
ipc
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-10-28 10:43:58 -07:00
lib lib: bitmap: Introduce node-aware alloc API 2021-10-26 19:30:38 -07:00
LICENSES
mm secretmem: Prevent secretmem_users from wrapping to zero 2021-10-25 11:27:31 -07:00
net net: avoid double accounting for pure zerocopy skbs 2021-11-01 16:33:27 -07:00
samples
scripts Tracing fixes for 5.15: 2021-10-16 10:51:41 -07:00
security Merge branch 'ucount-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-10-21 17:27:17 -10:00
sound ALSA: usb-audio: Fix microphone sound on Jieli webcam. 2021-10-19 08:07:01 +02:00
tools selftests: add amt interface selftest script 2021-11-01 13:36:09 +00:00
usr
virt
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS amt: add control plane of amt interface 2021-11-01 13:36:08 +00:00
Makefile Linux 5.15-rc7 2021-10-25 11:30:31 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.