linux-stable/net
Taehee Yoo 9d51378a68 netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine
[ Upstream commit 5a86d68bcf ]

When network namespace is destroyed, cleanup_net() is called.
cleanup_net() holds pernet_ops_rwsem then calls each ->exit callback.
So that clusterip_tg_destroy() is called by cleanup_net().
And clusterip_tg_destroy() calls unregister_netdevice_notifier().

But both cleanup_net() and clusterip_tg_destroy() hold same
lock(pernet_ops_rwsem). hence deadlock occurrs.

After this patch, only 1 notifier is registered when module is inserted.
And all of configs are added to per-net list.

test commands:
   %ip netns add vm1
   %ip netns exec vm1 bash
   %ip link set lo up
   %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
	-j CLUSTERIP --new --hashmode sourceip \
	--clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
   %exit
   %ip netns del vm1

splat looks like:
[  341.809674] ============================================
[  341.809674] WARNING: possible recursive locking detected
[  341.809674] 4.19.0-rc5+ #16 Tainted: G        W
[  341.809674] --------------------------------------------
[  341.809674] kworker/u4:2/87 is trying to acquire lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460
[  341.809674]
[  341.809674] but task is already holding lock:
[  341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] other info that might help us debug this:
[  341.809674]  Possible unsafe locking scenario:
[  341.809674]
[  341.809674]        CPU0
[  341.809674]        ----
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]   lock(pernet_ops_rwsem);
[  341.809674]
[  341.809674]  *** DEADLOCK ***
[  341.809674]
[  341.809674]  May be due to missing lock nesting notation
[  341.809674]
[  341.809674] 3 locks held by kworker/u4:2/87:
[  341.809674]  #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0
[  341.809674]  #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0
[  341.809674]  #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900
[  341.809674]
[  341.809674] stack backtrace:
[  341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G        W         4.19.0-rc5+ #16
[  341.809674] Workqueue: netns cleanup_net
[  341.809674] Call Trace:
[ ... ]
[  342.070196]  down_write+0x93/0x160
[  342.070196]  ? unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? down_read+0x1e0/0x1e0
[  342.070196]  ? sched_clock_cpu+0x126/0x170
[  342.070196]  ? find_held_lock+0x39/0x1c0
[  342.070196]  unregister_netdevice_notifier+0x8c/0x460
[  342.070196]  ? register_netdevice_notifier+0x790/0x790
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? __local_bh_enable_ip+0xe9/0x1b0
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.070196]  ? trace_hardirqs_on+0x93/0x210
[  342.070196]  ? __bpf_trace_preemptirq_template+0x10/0x10
[  342.070196]  ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP]
[  342.123094]  clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP]
[  342.123094]  ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP]
[  342.123094]  ? cleanup_match+0x17d/0x200 [ip_tables]
[  342.123094]  ? xt_unregister_table+0x215/0x300 [x_tables]
[  342.123094]  ? kfree+0xe2/0x2a0
[  342.123094]  cleanup_entry+0x1d5/0x2f0 [ip_tables]
[  342.123094]  ? cleanup_match+0x200/0x200 [ip_tables]
[  342.123094]  __ipt_unregister_table+0x9b/0x1a0 [ip_tables]
[  342.123094]  iptable_filter_net_exit+0x43/0x80 [iptable_filter]
[  342.123094]  ops_exit_list.isra.10+0x94/0x140
[  342.123094]  cleanup_net+0x45b/0x900
[ ... ]

Fixes: 202f59afd4 ("netfilter: ipt_CLUSTERIP: do not hold dev")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-26 09:32:40 +01:00
..
6lowpan
9p 9p/net: put a lower bound on msize 2019-01-13 09:51:08 +01:00
802
8021q net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
appletalk
atm Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
ax25 ax25: fix a use-after-free in ax25_fillin_cb() 2019-01-09 17:38:30 +01:00
batman-adv batman-adv: Expand merged fragment buffer for full packet 2018-12-13 09:16:10 +01:00
bluetooth Bluetooth: SMP: fix crash in unpairing 2018-09-26 12:39:32 +03:00
bpf bpf/test_run: support cgroup local storage 2018-08-03 00:47:32 +02:00
bpfilter net: bpfilter: use get_pid_task instead of pid_task 2018-10-17 22:03:40 -07:00
bridge net: clear skb->tstamp in bridge forwarding path 2019-01-26 09:32:33 +01:00
caif Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
can can: gw: ensure DLC boundaries after CAN frame modification 2019-01-22 21:40:28 +01:00
ceph libceph: fall back to sendmsg for slab pages 2018-11-27 16:13:11 +01:00
core net: call sk_dst_reset when set SO_DONTROUTE 2019-01-26 09:32:38 +01:00
dcb net: dcb: Add priority-to-DSCP map getters 2018-07-27 13:17:50 -07:00
dccp Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
decnet decnet: fix using plain integer as NULL warning 2018-08-09 14:11:24 -07:00
dns_resolver net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
dsa net: dsa: Drop GPIO includes 2018-08-27 15:24:33 -07:00
ethernet
hsr
ieee802154 ieee802154: lowpan_header_create check must check daddr 2019-01-09 17:38:31 +01:00
ife
ipv4 netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine 2019-01-26 09:32:40 +01:00
ipv6 ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses 2019-01-26 09:32:33 +01:00
iucv Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
kcm Revert "kcm: remove any offset before parsing messages" 2018-09-17 18:43:42 -07:00
key Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2018-07-27 09:33:37 -07:00
l2tp l2tp: fix a sock refcnt leak in l2tp_tunnel_register 2018-11-23 08:17:05 +01:00
l3mdev
lapb
llc llc: do not use sk_eat_skb() 2018-12-01 09:37:27 +01:00
mac80211 mac80211: free skb fraglist before freeing the skb 2019-01-13 09:51:02 +01:00
mac802154 net: mac802154: tx: expand tailroom if necessary 2018-08-06 11:21:37 +02:00
mpls mpls: allow routes on ip6gre devices 2018-09-24 12:19:27 -07:00
ncsi net/ncsi: Fixup .dumpit message flags and ID check in Netlink handler 2018-08-22 21:39:08 -07:00
netfilter netfilter: ipset: Allow matching on destination MAC address for mac and ipmac sets 2019-01-26 09:32:33 +01:00
netlabel netlabel: check for IPV4MASK in addrinfo_get 2018-09-21 18:58:34 -07:00
netlink Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-08-05 13:04:31 -07:00
netrom netrom: fix locking in nr_find_socket() 2019-01-09 17:38:32 +01:00
nfc Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
nsh nsh: set mac len based on inner packet 2018-07-12 16:55:29 -07:00
openvswitch openvswitch: Fix push/pop ethernet validation 2018-11-04 14:50:52 +01:00
packet packet: Do not leak dev refcounts on error exit 2019-01-22 21:40:30 +01:00
phonet
psample
qrtr
rds rds: RDS (tcp) hangs on sendto() to unresponding address 2018-10-10 22:19:52 -07:00
rfkill Here are quite a large number of fixes, notably: 2018-09-03 22:12:02 -07:00
rose
rxrpc rxrpc: Fix lockup due to no error backoff after ack transmit error 2018-11-23 08:17:07 +01:00
sched net: Prevent invalid access to skb->prev in __qdisc_drop_all 2018-12-17 09:24:27 +01:00
sctp sctp: allocate sctp_sockaddr_entry with kzalloc 2019-01-22 21:40:36 +01:00
smc smc: move unhash as early as possible in smc_release() 2019-01-22 21:40:31 +01:00
strparser strparser: remove redundant variable 'rd_desc' 2018-08-01 10:00:06 -07:00
sunrpc sunrpc: handle ENOMEM in rpcb_getport_async 2019-01-22 21:40:35 +01:00
switchdev
tipc tipc: fix uninit-value in tipc_nl_compat_doit 2019-01-22 21:40:36 +01:00
tls net/tls: Init routines in create_ctx 2019-01-13 09:51:00 +01:00
unix Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
vmw_vsock VSOCK: Send reset control packet when socket is partially bound 2019-01-09 17:38:34 +01:00
wimax wimax: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
wireless nl80211: fix memory leak if validate_pae_over_nl80211() fails 2019-01-13 09:51:02 +01:00
x25 x25: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
xdp xsk: do not call synchronize_net() under RCU read lock 2018-10-11 10:19:01 +02:00
xfrm xfrm: Fix NULL pointer dereference in xfrm_input when skb_dst_force clears the dst_entry. 2019-01-13 09:50:57 +01:00
compat.c sock: Make sock->sk_stamp thread-safe 2019-01-09 17:38:33 +01:00
Kconfig net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
Makefile
socket.c net: socket: fix a missing-check bug 2018-10-18 16:43:06 -07:00
sysctl_net.c