linux-stable/net
Dust Li b85f751d71 net/smc: fix kernel panic caused by race of smc_sock
[ Upstream commit 349d43127d ]

A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
but smc_release() has already freed it.

[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
<...>
[ 4570.711446] Call Trace:
[ 4570.711746]  <IRQ>
[ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083]  __do_softirq+0x123/0x2f4
[ 4570.714521]  irq_exit_rcu+0xc4/0xf0
[ 4570.714934]  common_interrupt+0xba/0xe0

Though smc_cdc_tx_handler() checked the existence of smc connection,
smc_release() may have already dismissed and released the smc socket
before smc_cdc_tx_handler() further visits it.

smc_cdc_tx_handler()           |smc_release()
if (!conn)                     |
                               |
                               |smc_cdc_tx_dismiss_slots()
                               |      smc_cdc_tx_dismisser()
                               |
                               |sock_put(&smc->sk) <- last sock_put,
                               |                      smc_sock freed
bh_lock_sock(&smc->sk) (panic) |

To make sure we won't receive any CDC messages after we free the
smc_sock, add a refcount on the smc_connection for inflight CDC
message(posted to the QP but haven't received related CQE), and
don't release the smc_connection until all the inflight CDC messages
haven been done, for both success or failed ones.

Using refcount on CDC messages brings another problem: when the link
is going to be destroyed, smcr_link_clear() will reset the QP, which
then remove all the pending CQEs related to the QP in the CQ. To make
sure all the CQEs will always come back so the refcount on the
smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
by smc_ib_modify_qp_error().
And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
need to wait for all pending WQEs done, or we may encounter use-after-
free when handling CQEs.

For IB device removal routine, we need to wait for all the QPs on that
device been destroyed before we can destroy CQs on the device, or
the refcount on smc_connection won't reach 0 and smc_sock cannot be
released.

Fixes: 5f08318f61 ("smc: connection data control (CDC)")
Reported-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-05 12:42:36 +01:00
..
6lowpan 6lowpan: iphc: Fix an off-by-one check of array index 2021-07-22 16:19:03 +02:00
9p 9p/net: fix missing error check in p9_check_errors 2021-11-18 19:17:16 +01:00
802 net: 802: remove dead leftover after ipx driver removal 2021-08-13 16:30:35 -07:00
8021q net: vlan: fix underflow for the real_dev refcnt 2021-12-01 09:04:53 +01:00
appletalk net: socket: rework compat_ifreq_ioctl() 2021-07-23 14:20:25 +01:00
atm
ax25 ax25: NPD bug when detaching AX25 device 2021-12-29 12:29:02 +01:00
batman-adv net: batman-adv: fix error handling 2021-10-26 14:47:12 +01:00
bluetooth Bluetooth: fix init and cleanup of sco_conn.timeout_work 2021-11-18 19:16:22 +01:00
bpf bpf, test, cgroup: Use sk_{alloc,free} for test cases 2021-09-28 09:29:28 +02:00
bpfilter bpfilter: Specify the log level for the kmsg message 2021-06-25 13:13:50 +02:00
bridge net: bridge: fix ioctl old_deviceless bridge argument 2021-12-29 12:28:46 +01:00
caif net-caif: avoid user-triggerable WARN_ON(1) 2021-09-14 12:51:15 +01:00
can can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM 2021-11-18 19:16:03 +01:00
ceph Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
core net/sched: Extend qdisc control block with tc control block 2022-01-05 12:42:33 +01:00
dcb
dccp tcp: switch orphan_count to bare per-cpu counters 2021-11-18 19:16:33 +01:00
decnet net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
dns_resolver
dsa net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge 2021-11-18 19:17:06 +01:00
ethernet move netdev_boot_setup into Space.c 2021-08-03 13:05:26 +01:00
ethtool ethtool: do not perform operations on net devices being unregistered 2021-12-14 10:57:09 +01:00
hsr
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-08-13 06:41:22 -07:00
ife
ipv4 inet: fully convert sk->sk_rx_dst to RCU rules 2021-12-29 12:28:42 +01:00
ipv6 udp: using datalen to cap ipv6 udp max gso segments 2022-01-05 12:42:35 +01:00
iucv net/iucv: Replace deprecated CPU-hotplug functions. 2021-08-09 10:13:32 +01:00
kcm net: sock: introduce sk_error_report 2021-06-29 11:28:21 -07:00
key
l2tp net/l2tp: Fix reference count leak in l2tp_udp_recv_core 2021-09-09 11:00:20 +01:00
l3mdev
lapb
llc net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
mac80211 mac80211: fix locking in ieee80211_start_ap error path 2021-12-29 12:28:58 +01:00
mac802154 ieee802154: Remove redundant initialization of variable ret 2021-09-07 14:06:08 +01:00
mctp mctp: Don't let RTM_DELROUTE delete local routes 2021-12-08 09:04:53 +01:00
mpls net: mpls: Fix notifications when deleting a device 2021-12-08 09:04:47 +01:00
mptcp mptcp: fix deadlock in __mptcp_push_pending() 2021-12-22 09:32:43 +01:00
ncsi net/ncsi : Add payload to be 32-bit aligned to fix dropped packets 2021-12-01 09:04:51 +01:00
netfilter netfilter: fix regression in looped (broad|multi)cast's MAC handling 2021-12-29 12:28:40 +01:00
netlabel net: fix NULL pointer reference in cipso_v4_doi_free 2021-08-30 12:23:18 +01:00
netlink net: netlink: af_netlink: Prevent empty skb by adding a check on len. 2021-12-17 10:30:15 +01:00
netrom net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
nfc nfc: fix segfault in nfc_genl_dump_devices_done 2021-12-17 10:30:12 +01:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-08-19 18:09:18 -07:00
packet net/packet: rx_owner_map depends on pg_vec 2021-12-22 09:32:44 +01:00
phonet phonet/pep: refuse to enable an unbound pipe 2021-12-29 12:29:03 +01:00
psample
qrtr net: qrtr: revert check in qrtr_endpoint_post() 2021-09-02 11:37:02 +01:00
rds rds: memory leak in __rds_conn_create() 2021-12-22 09:32:42 +01:00
rfkill
rose
rxrpc rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer() 2021-12-08 09:04:49 +01:00
sched net/sched: Extend qdisc control block with tc control block 2022-01-05 12:42:33 +01:00
sctp sctp: use call_rcu to free endpoint 2022-01-05 12:42:35 +01:00
smc net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
strparser bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding 2021-11-18 19:17:11 +01:00
sunrpc SUNRPC: Partial revert of commit 6f9f17287e 2021-11-18 19:17:20 +01:00
switchdev net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge 2021-08-04 12:35:07 +01:00
tipc tipc: check for null after calling kmemdup 2021-11-25 09:48:42 +01:00
tls net/tls: Fix authentication failure in CCM mode 2021-12-08 09:04:41 +01:00
unix af_unix: fix regression in read after shutdown 2021-12-01 09:04:49 +01:00
vmw_vsock virtio/vsock: fix the transport to work with VMADDR_CID_ANY 2021-12-22 09:32:39 +01:00
wireless cfg80211: Acquire wiphy mutex on regulatory work 2021-12-22 09:32:42 +01:00
x25
xdp Revert "xsk: Do not sleep in poll() when need_wakeup set" 2021-12-22 09:32:51 +01:00
xfrm xfrm: fix rcu lock in xfrm_notify_userpolicy() 2021-09-23 10:11:12 +02:00
compat.c
devres.c
Kconfig mctp: Add MCTP base 2021-07-29 15:06:49 +01:00
Makefile mctp: Add MCTP base 2021-07-29 15:06:49 +01:00
socket.c Core: 2021-08-31 16:43:06 -07:00
sysctl_net.c