No description
Find a file
Eric Dumazet 05255b823a tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
When adding tcp mmap() implementation, I forgot that socket lock
had to be taken before current->mm->mmap_sem. syzbot eventually caught
the bug.

Since we can not lock the socket in tcp mmap() handler we have to
split the operation in two phases.

1) mmap() on a tcp socket simply reserves VMA space, and nothing else.
  This operation does not involve any TCP locking.

2) getsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) implements
 the transfert of pages from skbs to one VMA.
  This operation only uses down_read(&current->mm->mmap_sem) after
  holding TCP lock, thus solving the lockdep issue.

This new implementation was suggested by Andy Lutomirski with great details.

Benefits are :

- Better scalability, in case multiple threads reuse VMAS
   (without mmap()/munmap() calls) since mmap_sem wont be write locked.

- Better error recovery.
   The previous mmap() model had to provide the expected size of the
   mapping. If for some reason one part could not be mapped (partial MSS),
   the whole operation had to be aborted.
   With the tcp_zerocopy_receive struct, kernel can report how
   many bytes were successfuly mapped, and how many bytes should
   be read to skip the problematic sequence.

- No more memory allocation to hold an array of page pointers.
  16 MB mappings needed 32 KB for this array, potentially using vmalloc() :/

- skbs are freed while mmap_sem has been released

Following patch makes the change in tcp_mmap tool to demonstrate
one possible use of mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...)

Note that memcg might require additional changes.

Fixes: 93ab6cc691 ("tcp: implement mmap() for zero copy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: linux-mm@kvack.org
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-29 21:29:55 -04:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2018-04-25 22:55:33 -04:00
block blk-mq: Revert "blk-mq: reimplement blk_mq_hw_queue_mapped" 2018-04-11 07:59:15 -06:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
Documentation ipv6: sr: Add documentation for seg_flowlabel sysctl 2018-04-27 20:23:56 -04:00
drivers net: dsa: mv88e6xxx: remove Global 2 setup 2018-04-29 20:36:49 -04:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs various SMB3/CIFS fixes for stable 4.17-rc1 2018-04-22 12:13:04 -07:00
include tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive 2018-04-29 21:29:55 -04:00
init Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-04-15 16:12:35 -07:00
ipc ipc/shm: fix use-after-free of shm file via remap_file_pages() 2018-04-13 17:10:27 -07:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-04-26 21:19:50 -04:00
lib rhashtable: improve rhashtable_walk stability when stop/start used. 2018-04-24 13:21:46 -04:00
LICENSES LICENSES: Add MPL-1.1 license 2018-01-06 10:59:44 -07:00
mm mm/filemap.c: fix NULL pointer in page_cache_tree_insert() 2018-04-20 17:18:36 -07:00
net tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive 2018-04-29 21:29:55 -04:00
samples samples, bpf: remove redundant ret assignment in bpf_load_program() 2018-04-27 00:47:06 +02:00
scripts bpf: add script and prepare bpf.h for new helpers documentation 2018-04-27 00:21:58 +02:00
security Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-24 17:58:51 -07:00
sound sound fixes for 4.17-rc2 2018-04-21 10:32:16 -07:00
tools selftests: forwarding: Test changes in mirror-to-gretap 2018-04-27 14:57:50 -04:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt KVM/ARM updates for v4.17 2018-03-28 16:09:09 +02:00
.clang-format clang-format: add configuration file 2018-04-11 10:28:35 -07:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap Merge candidates for 4.17 merge window 2018-04-06 17:35:43 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-04-24 23:59:11 -04:00
Makefile Linux 4.17-rc2 2018-04-22 19:20:09 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.