linux-stable/net/ipv4
John Fastabend c9c89dcd87 bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made
If copy_page_to_iter() fails or even partially completes, but with fewer
bytes copied than expected we currently reset sg.start and return EFAULT.
This proves problematic if we already copied data into the user buffer
before we return an error. Because we leave the copied data in the user
buffer and fail to unwind the scatterlist so kernel side believes data
has been copied and user side believes data has _not_ been received.

Expected behavior should be to return number of bytes copied and then
on the next read we need to return the error assuming its still there. This
can happen if we have a copy length spanning multiple scatterlist elements
and one or more complete before the error is hit.

The error is rare enough though that my normal testing with server side
programs, such as nginx, httpd, envoy, etc., I have never seen this. The
only reliable way to reproduce that I've found is to stream movies over
my browser for a day or so and wait for it to hang. Not very scientific,
but with a few extra WARN_ON()s in the code the bug was obvious.

When we review the errors from copy_page_to_iter() it seems we are hitting
a page fault from copy_page_to_iter_iovec() where the code checks
fault_in_pages_writeable(buf, copy) where buf is the user buffer. It
also seems typical server applications don't hit this case.

The other way to try and reproduce this is run the sockmap selftest tool
test_sockmap with data verification enabled, but it doesn't reproduce the
fault. Perhaps we can trigger this case artificially somehow from the
test tools. I haven't sorted out a way to do that yet though.

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/160556566659.73229.15694973114605301063.stgit@john-XPS-13-9370
2020-11-18 00:12:10 +01:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
af_inet.c io_uring: allow tcp ancillary data for __sys_recvmsg_sock() 2020-08-24 16:16:06 -07:00
ah4.c inet: Use fallthrough; 2020-03-12 15:55:00 -07:00
arp.c inet: Use fallthrough; 2020-03-12 15:55:00 -07:00
bpf_tcp_ca.c bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
cipso_ipv4.c cipso: fix 'audit_secid' kernel-doc warning in cipso_ipv4.c 2020-09-08 20:03:36 -07:00
datagram.c
devinet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-31 17:48:46 -07:00
esp4.c ESP: Export esp_output_fill_trailer function 2020-02-19 13:52:32 +01:00
esp4_offload.c net: Add MODULE_DESCRIPTION entries to network modules 2020-06-20 21:33:57 -07:00
fib_frontend.c ipv4: Initialize flowi4_multipath_hash in data path 2020-09-14 14:54:56 -07:00
fib_lookup.h net: add net available in build_state 2020-03-29 22:30:57 -07:00
fib_notifier.c
fib_rules.c fib: use indirect call wrappers in the most common fib_rules_ops 2020-07-28 17:42:31 -07:00
fib_semantics.c net: Fix the arp error in some cases 2020-06-18 20:21:51 -07:00
fib_trie.c ipv4: Silence suspicious RCU usage warning 2020-08-26 15:58:48 -07:00
fou.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
gre_demux.c gre: fix uninit-value in __iptunnel_pull_header 2020-03-08 21:25:37 -07:00
gre_offload.c net: gre: recompute gre csum for sctp over gre tunnels 2020-08-03 15:29:44 -07:00
icmp.c icmp: randomize the global rate limiter 2020-10-16 16:47:09 -07:00
igmp.c ip*_mc_gsfget(): lift copyout of struct group_filter into callers 2020-05-20 20:31:27 -04:00
inet_connection_sock.c inet: remove icsk_ack.blocked 2020-09-30 14:21:30 -07:00
inet_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
inet_fragment.c
inet_hashtables.c net: ipv4: remove unused arg exact_dif in compute_score 2020-08-31 13:08:29 -07:00
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c
ip_gre.c ip_gre: set dev->hard_header_len and dev->needed_headroom properly 2020-10-13 18:35:29 -07:00
ip_input.c bpf: Add socket assign support 2020-03-30 13:45:04 -07:00
ip_options.c net: clean up codestyle for net/ipv4 2020-08-25 06:28:02 -07:00
ip_output.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
ip_sockglue.c net: Remove duplicated midx check against 0 2020-08-25 06:23:59 -07:00
ip_tunnel.c ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags 2020-10-31 17:19:02 -07:00
ip_tunnel_core.c iptunnel: use new function dev_fetch_sw_netstats 2020-10-13 17:33:49 -07:00
ip_vti.c ipv4: use dev_sw_netstats_rx_add() 2020-10-06 06:23:21 -07:00
ipcomp.c ipcomp: assign if_id to child tunnel from parent tunnel 2020-07-09 12:55:37 +02:00
ipconfig.c Documentation: nfsroot.rst: Fix references to nfsroot.rst 2020-03-02 13:11:46 -07:00
ipip.c net: ipip: implement header_ops->parse_protocol for AF_PACKET 2020-06-30 12:29:39 -07:00
ipmr.c ipmr: Use full VIF ID in netlink cache reports 2020-09-10 12:25:51 -07:00
ipmr_base.c
Kconfig net: ipv4: remove duplicate "the the" phrase in Kconfig text 2020-08-18 16:02:16 -07:00
Makefile udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
metrics.c
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
netlink.c
nexthop.c nexthop: Fix performance regression in nexthop deletion 2020-10-19 20:07:15 -07:00
ping.c net: clean up codestyle 2020-08-31 12:33:34 -07:00
proc.c tcp: skip DSACKs with dubious sequence ranges 2020-09-24 20:15:45 -07:00
protocol.c
raw.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
raw_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
syncookies.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
sysctl_net_ipv4.c tcp: reflect tos value received in SYN to the socket 2020-09-10 13:15:40 -07:00
tcp.c tcp: Prevent low rmem stalls with SO_RCVLOWAT. 2020-10-23 19:11:20 -07:00
tcp_bbr.c tcp_bbr: improve arithmetic division in bbr_update_bw() 2020-01-21 10:45:49 +01:00
tcp_bic.c tcp: fix stretch ACK bugs in BIC 2020-03-16 18:26:54 -07:00
tcp_bpf.c bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made 2020-11-18 00:12:10 +01:00
tcp_cdg.c
tcp_cong.c tcp: Simplify tcp_set_congestion_control() load=false case 2020-09-10 20:53:01 -07:00
tcp_cubic.c tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT 2020-06-25 16:08:47 -07:00
tcp_dctcp.c
tcp_dctcp.h
tcp_diag.c inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data 2020-02-27 18:50:19 -08:00
tcp_fastopen.c bpf: tcp: Add bpf_skops_established() 2020-08-24 14:35:00 -07:00
tcp_highspeed.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_htcp.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: Prevent low rmem stalls with SO_RCVLOWAT. 2020-10-23 19:11:20 -07:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-08 15:44:50 -07:00
tcp_lp.c
tcp_metrics.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
tcp_minisocks.c bpf: tcp: Do not limit cb_flags when creating child sk from listen sk 2020-10-02 11:34:48 -07:00
tcp_nv.c
tcp_offload.c
tcp_output.c tcp: add exponential backoff in __tcp_send_ack() 2020-09-30 14:21:30 -07:00
tcp_rate.c
tcp_recovery.c tcp: move tcp_mark_skb_lost 2020-09-25 17:17:14 -07:00
tcp_scalable.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
tcp_timer.c inet: remove icsk_ack.blocked 2020-09-30 14:21:30 -07:00
tcp_ulp.c bpf: sockmap: Only check ULP for TCP sockets 2020-03-09 22:34:58 +01:00
tcp_vegas.c tcp: use semicolons rather than commas to separate statements 2020-10-13 17:11:52 -07:00
tcp_vegas.h
tcp_veno.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_westwood.c
tcp_yeah.c tcp: fix stretch ACK bugs in Yeah 2020-03-16 18:26:55 -07:00
tunnel4.c tunnel4: add cb_handler to struct xfrm_tunnel 2020-07-09 12:51:36 +02:00
udp.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
udp_bpf.c net: sk_msg: Simplify sk_psock initialization 2020-08-21 15:16:11 -07:00
udp_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c udp: initialize is_flist with 0 in udp_gro_receive 2020-03-30 10:35:03 -07:00
udp_tunnel_core.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp_tunnel_nic.c udp_tunnel: add the ability to share port tables 2020-09-28 12:50:12 -07:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udplite.c net/ipv4: remove compat_ip_{get,set}sockopt 2020-07-19 18:16:41 -07:00
xfrm4_input.c xfrm: state: remove extract_input indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_output.c xfrm: fix unused variable warning if CONFIG_NETFILTER=n 2020-05-11 15:12:27 +02:00
xfrm4_policy.c
xfrm4_protocol.c
xfrm4_state.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_tunnel.c xfrm: interface: fix the priorities for ipip and ipv6 tunnels 2020-10-09 12:29:48 +02:00