linux-stable/net/smc
Wen Gu 0ef6049f66 net/smc: Forward wakeup to smc socket waitqueue after fallback
commit 341adeec9a upstream.

When we replace TCP with SMC and a fallback occurs, there may be
some socket waitqueue entries remaining in smc socket->wq, such
as eppoll_entries inserted by userspace applications.

After the fallback, data flows over TCP/IP and only clcsock->wq
will be woken up. Applications can't be notified through the entries
that were inserted in smc socket->wq before the fallback. So we need
a mechanism to wake up smc socket->wq at the same time if any
entries remain in it.

The current workaround is to transfer the entries from smc socket->wq
to clcsock->wq during the fallback. But this may cause a crash
like this:

 general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP PTI
 CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G E     5.16.0+ #107
 RIP: 0010:__wake_up_common+0x65/0x170
 Call Trace:
  <IRQ>
  __wake_up_common_lock+0x7a/0xc0
  sock_def_readable+0x3c/0x70
  tcp_data_queue+0x4a7/0xc40
  tcp_rcv_established+0x32f/0x660
  ? sk_filter_trim_cap+0xcb/0x2e0
  tcp_v4_do_rcv+0x10b/0x260
  tcp_v4_rcv+0xd2a/0xde0
  ip_protocol_deliver_rcu+0x3b/0x1d0
  ip_local_deliver_finish+0x54/0x60
  ip_local_deliver+0x6a/0x110
  ? tcp_v4_early_demux+0xa2/0x140
  ? tcp_v4_early_demux+0x10d/0x140
  ip_sublist_rcv_finish+0x49/0x60
  ip_sublist_rcv+0x19d/0x230
  ip_list_rcv+0x13e/0x170
  __netif_receive_skb_list_core+0x1c2/0x240
  netif_receive_skb_list_internal+0x1e6/0x320
  napi_complete_done+0x11d/0x190
  mlx5e_napi_poll+0x163/0x6b0 [mlx5_core]
  __napi_poll+0x3c/0x1b0
  net_rx_action+0x27c/0x300
  __do_softirq+0x114/0x2d2
  irq_exit_rcu+0xb4/0xe0
  common_interrupt+0xba/0xe0
  </IRQ>
  <TASK>

The crash is caused by privately transferring waitqueue entries from
smc socket->wq to clcsock->wq. The owners of these entries, such as
epoll, do not know that the entries have been moved to a different
socket wait queue, so they still take the original waitqueue spinlock
(smc socket->wq.wait.lock) to serialize operations on the entries.
That lock no longer protects the list the entries are on. An operation
on an entry, such as removing it from its waitqueue (now clcsock->wq
after fallback), may therefore cause a crash if the clcsock waitqueue
is being iterated over at the same moment.
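
The faulting address in the trace above, 0xdead000000000100, matches
the poison value that the kernel's list_del() writes into the next
pointer of a removed entry on x86_64, which is consistent with a
walker following an entry that was unlinked underneath it. The sketch
below is a single-threaded userspace model of that interleaving, not
kernel code. All names in it (model_list_del, MODEL_POISON_NEXT,
walker_sees_poison) are hypothetical, and the poison constant is only
a stand-in for the kernel's value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; these names are illustrative, not kernel symbols. */
struct list_node {
	struct list_node *next, *prev;
};

/* Stand-in for the kernel's poison value (0x100 plus a per-arch offset). */
#define MODEL_POISON_NEXT ((struct list_node *)(uintptr_t)0x100)

static void model_list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void model_list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and poison its next pointer, as a removed entry would be. */
static void model_list_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = MODEL_POISON_NEXT;
	n->prev = NULL;
}

/*
 * Models the race as one fixed interleaving: the waker is positioned on
 * entry 'a' of the clcsock queue when the entry's owner, holding only the
 * smc queue's lock, removes 'a'. Returns true if the waker's next step
 * would follow the poisoned pointer.
 */
bool walker_sees_poison(void)
{
	struct list_node clcsock_wq, a, b;
	struct list_node *cursor;

	model_list_init(&clcsock_wq);
	model_list_add_tail(&a, &clcsock_wq);
	model_list_add_tail(&b, &clcsock_wq);

	cursor = clcsock_wq.next;	/* waker is visiting 'a' */
	model_list_del(&a);		/* owner removes 'a' without the right lock */

	assert(clcsock_wq.next == &b);	/* the list itself stays consistent */
	return cursor->next == MODEL_POISON_NEXT;
}
```

In the kernel the two steps run on different CPUs, and the waker
dereferences the poisoned pointer instead of comparing against it,
which is where a general protection fault would be raised.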

This patch tries to fix this by no longer transferring wait queue
entries privately. Instead, it introduces SMC's own implementations of
clcsock's callback functions for the fallback case. These callbacks
forward the wakeup to smc socket->wq if clcsock->wq is actually woken
up and smc socket->wq has remaining entries.
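
The forwarding idea described above can be modelled in userspace as
follows. This is a simplified sketch under stated assumptions, not the
code in af_smc.c: struct sock_model, fallback_data_ready() and
switch_to_fallback() are hypothetical names, only one callback is
modelled, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these types and names are not kernel symbols. */
struct waiter {
	struct waiter *next;
	int woken;			/* times this waiter was woken */
};

struct waitqueue {
	struct waiter *head;
	unsigned long wakeups;		/* times the queue was woken */
};

struct sock_model {
	struct waitqueue wq;
	void (*data_ready)(struct sock_model *sk);
	void (*saved_data_ready)(struct sock_model *sk);
	struct sock_model *smc;		/* set on the clcsock after fallback */
};

static void wq_add(struct waitqueue *wq, struct waiter *w)
{
	w->next = wq->head;
	wq->head = w;
}

static void wq_wake_all(struct waitqueue *wq)
{
	struct waiter *w;

	wq->wakeups++;
	for (w = wq->head; w; w = w->next)
		w->woken++;
}

/* Stands in for the clcsock's original callback. */
static void default_data_ready(struct sock_model *sk)
{
	wq_wake_all(&sk->wq);
}

/*
 * Replacement callback: run the saved original, then forward the wakeup
 * to the smc queue only if the clcsock queue was really woken and the
 * smc queue still holds entries. No entry ever changes queues.
 */
static void fallback_data_ready(struct sock_model *clc)
{
	unsigned long before = clc->wq.wakeups;

	clc->saved_data_ready(clc);
	if (clc->wq.wakeups != before && clc->smc->wq.head)
		wq_wake_all(&clc->smc->wq);
}

static void switch_to_fallback(struct sock_model *smc, struct sock_model *clc)
{
	assert(clc->data_ready != NULL);
	clc->smc = smc;
	clc->saved_data_ready = clc->data_ready;
	clc->data_ready = fallback_data_ready;
}

/* Returns how often a waiter left on the smc queue is woken by TCP data. */
int demo_forwarding(void)
{
	struct sock_model smc = {0}, clc = {0};
	struct waiter epoll_entry = {0};

	clc.data_ready = default_data_ready;
	wq_add(&smc.wq, &epoll_entry);	/* registered before the fallback */
	switch_to_fallback(&smc, &clc);
	clc.data_ready(&clc);		/* data arrives on the TCP socket */
	return epoll_entry.woken;
}
```

Because the waiter stays on the queue it was registered on, its owner
keeps using the lock that actually protects it, which is the property
the transfer-based workaround lost.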

Fixes: 2153bd1e3d ("net/smc: Transfer remaining wait queue entries during fallback")
Suggested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08 18:34:09 +01:00
af_smc.c net/smc: Forward wakeup to smc socket waitqueue after fallback 2022-02-08 18:34:09 +01:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile net/smc: Add SMC statistics support 2021-06-16 12:54:02 -07:00
smc.h net/smc: Forward wakeup to smc socket waitqueue after fallback 2022-02-08 18:34:09 +01:00
smc_cdc.c net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
smc_cdc.h net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
smc_clc.c net/smc: add missing error check in smc_clc_prfx_set() 2021-09-21 10:54:16 +01:00
smc_clc.h net/smc: Add support for obtaining system information 2020-12-01 17:56:13 -08:00
smc_close.c net/smc: Keep smc_close_final rc during active close 2021-12-08 09:04:50 +01:00
smc_close.h net/smc: remove close abort worker 2019-10-22 11:23:44 -07:00
smc_core.c net/smc: Fix hung_task when removing SMC-R devices 2022-01-27 11:05:31 +01:00
smc_core.h net/smc: Reset conn->lgr when link group registration fails 2022-01-27 11:03:53 +01:00
smc_diag.c net/smc: Introduce SMCR get link command 2020-12-01 17:56:13 -08:00
smc_ib.c net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
smc_ib.h net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
smc_ism.c net/smc: no need to flush smcd_dev's event_wq before destroying it 2021-06-03 13:54:49 -07:00
smc_ism.h net/smc: Add support for obtaining SMCD device list 2020-12-01 17:56:13 -08:00
smc_llc.c net/smc: don't send CDC/LLC message if link not ready 2022-01-05 12:42:36 +01:00
smc_llc.h net/smc: move add link processing for new device into llc layer 2020-07-19 15:30:22 -07:00
smc_netlink.c net/smc: Add netlink support for SMC fallback statistics 2021-06-16 12:54:02 -07:00
smc_netlink.h net/smc: Add netlink support for SMC fallback statistics 2021-06-16 12:54:02 -07:00
smc_netns.h net/smc: introduce list of pnetids for Ethernet devices 2020-09-28 15:19:03 -07:00
smc_pnet.c net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
smc_pnet.h net/smc: determine proposed ISM devices 2020-09-28 15:19:03 -07:00
smc_rx.c net/smc: Make SMC statistics network namespace aware 2021-06-16 12:54:02 -07:00
smc_rx.h smc: add support for splice() 2018-05-04 11:45:06 -04:00
smc_stats.c net/smc: Fix ENODATA tests in smc_nl_get_fback_stats() 2021-06-21 12:16:58 -07:00
smc_stats.h net/smc: Make SMC statistics network namespace aware 2021-06-16 12:54:02 -07:00
smc_tx.c net/smc: improved fix wait on already cleared link 2021-10-08 17:00:16 +01:00
smc_tx.h net/smc: eliminate cursor read and write calls 2018-07-23 10:57:14 -07:00
smc_wr.c net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00
smc_wr.h net/smc: fix kernel panic caused by race of smc_sock 2022-01-05 12:42:36 +01:00