linux-stable/net
Jon Maloy 4e3fbd7433 tipc: fix lockdep warning during node delete
[ Upstream commit ec835f8912 ]

We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]                                lock((&n->timer)#2);
[ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]   lock((&n->timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365b ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05 19:32:00 +01:00
..
6lowpan 6lowpan: iphc: reset mac_header after decompress to fix panic 2018-07-06 12:32:12 +02:00
9p 9p: clear dangling pointers in p9stat_free 2018-11-21 09:19:12 +01:00
802
8021q net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
appletalk Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
atm Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
ax25 ax25: remove blank line at EOF 2018-07-24 14:10:42 -07:00
batman-adv batman-adv: Increase version number to 2018.3 2018-09-14 17:59:20 +02:00
bluetooth Bluetooth: SMP: fix crash in unpairing 2018-09-26 12:39:32 +03:00
bpf bpf/test_run: support cgroup local storage 2018-08-03 00:47:32 +02:00
bpfilter net: bpfilter: use get_pid_task instead of pid_task 2018-10-17 22:03:40 -07:00
bridge net: bridge: remove ipv6 zero address check in mcast queries 2018-11-04 14:50:54 +01:00
caif Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
can can: raw: check for CAN FD capable netdev in raw_sendmsg() 2018-12-01 09:37:30 +01:00
ceph libceph: fall back to sendmsg for slab pages 2018-11-27 16:13:11 +01:00
core net: skb_scrub_packet(): Scrub offload_fwd_mark 2018-12-05 19:31:59 +01:00
dcb net: dcb: Add priority-to-DSCP map getters 2018-07-27 13:17:50 -07:00
dccp Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
decnet decnet: fix using plain integer as NULL warning 2018-08-09 14:11:24 -07:00
dns_resolver net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
dsa net: dsa: Drop GPIO includes 2018-08-27 15:24:33 -07:00
ethernet net: Convert GRO SKB handling to list_head. 2018-06-26 11:33:04 +09:00
hsr
ieee802154 net: ieee802154: 6lowpan: remove redundant pointers 'fq' and 'net' 2018-08-06 11:21:15 +02:00
ife
ipv4 tcp: defer SACK compression after DupThresh 2018-12-05 19:32:00 +01:00
ipv6 netfilter: ipv6: fix oops when defragmenting locally generated fragments 2018-11-27 16:13:01 +01:00
iucv Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
kcm Revert "kcm: remove any offset before parsing messages" 2018-09-17 18:43:42 -07:00
key Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2018-07-27 09:33:37 -07:00
l2tp l2tp: fix a sock refcnt leak in l2tp_tunnel_register 2018-11-23 08:17:05 +01:00
l3mdev
lapb
llc llc: do not use sk_eat_skb() 2018-12-01 09:37:27 +01:00
mac80211 mac80211: fix setting IEEE80211_KEY_FLAG_RX_MGMT for AP mode keys 2018-10-01 09:13:48 +02:00
mac802154 net: mac802154: tx: expand tailroom if necessary 2018-08-06 11:21:37 +02:00
mpls mpls: allow routes on ip6gre devices 2018-09-24 12:19:27 -07:00
ncsi net/ncsi: Fixup .dumpit message flags and ID check in Netlink handler 2018-08-22 21:39:08 -07:00
netfilter netfilter: nft_compat: ebtables 'nat' table is normal chain type 2018-11-27 16:13:03 +01:00
netlabel netlabel: check for IPV4MASK in addrinfo_get 2018-09-21 18:58:34 -07:00
netlink Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-08-05 13:04:31 -07:00
netrom Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
nfc Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
nsh nsh: set mac len based on inner packet 2018-07-12 16:55:29 -07:00
openvswitch openvswitch: Fix push/pop ethernet validation 2018-11-04 14:50:52 +01:00
packet packet: copy user buffers before orphan or clone 2018-12-05 19:31:58 +01:00
phonet Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
psample
qrtr net: qrtr: Reset the node and port ID of broadcast messages 2018-07-05 20:20:03 +09:00
rds rds: RDS (tcp) hangs on sendto() to unresponding address 2018-10-10 22:19:52 -07:00
rfkill Here are quite a large number of fixes, notably: 2018-09-03 22:12:02 -07:00
rose Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
rxrpc rxrpc: Fix lockup due to no error backoff after ack transmit error 2018-11-23 08:17:07 +01:00
sched net: sched: cls_flower: validate nested enc_opts_policy to avoid warning 2018-11-23 08:17:04 +01:00
sctp sctp: clear the transport of some out_chunk_list chunks in sctp_assoc_rm_peer 2018-12-01 09:37:27 +01:00
smc net/smc: fix smc_buf_unuse to use the lgr pointer 2018-11-04 14:50:52 +01:00
strparser strparser: remove redundant variable 'rd_desc' 2018-08-01 10:00:06 -07:00
sunrpc SUNRPC: Fix a bogus get/put in generic_key_to_expire() 2018-12-01 09:37:33 +01:00
switchdev
tipc tipc: fix lockdep warning during node delete 2018-12-05 19:32:00 +01:00
tls tls: fix currently broken MSG_PEEK behavior 2018-09-17 08:03:09 -07:00
unix Revert "net: simplify sock_poll_wait" 2018-11-04 14:50:51 +01:00
vmw_vsock vsock: split dwork to avoid reinitializations 2018-08-07 12:39:13 -07:00
wimax wimax: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
wireless cfg80211: fix use-after-free in reg_process_hint() 2018-10-01 09:14:03 +02:00
x25 x25: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
xdp xsk: do not call synchronize_net() under RCU read lock 2018-10-11 10:19:01 +02:00
xfrm xfrm: policy: use hlist rcu variants on insert 2018-10-11 13:24:46 +02:00
compat.c net: avoid unnecessary sock_flag() check when enable timestamp 2018-08-06 10:42:48 -07:00
Kconfig net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
Makefile bpfilter: check compiler capability in Kconfig 2018-06-28 13:36:39 +09:00
socket.c net: socket: fix a missing-check bug 2018-10-18 16:43:06 -07:00
sysctl_net.c