linux-stable/drivers/infiniband/hw/hfi1
Mike Marciniszyn 9a293d1e21 IB/hfi1: Ensure pq is not left on waitlist
The following warning can occur when a pq is left on the dmawait list and
the pq is then freed:

  WARNING: CPU: 47 PID: 3546 at lib/list_debug.c:29 __list_add+0x65/0xc0
  list_add corruption. next->prev should be prev (ffff939228da1880), but was ffff939cabb52230. (next=ffff939cabb52230).
  Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev drm_panel_orientation_quirks i2c_i801 mei_me lpc_ich mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
  nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci libahci i2c_algo_bit dca libata ptp pps_core crc32c_intel [last unloaded: i2c_algo_bit]
  CPU: 47 PID: 3546 Comm: wrf.exe Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.41.1.el7.x86_64 #1
  Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
  Call Trace:
  [<ffffffff91f65ac0>] dump_stack+0x19/0x1b
  [<ffffffff91898b78>] __warn+0xd8/0x100
  [<ffffffff91898bff>] warn_slowpath_fmt+0x5f/0x80
  [<ffffffff91a1dabe>] ? ___slab_alloc+0x24e/0x4f0
  [<ffffffff91b97025>] __list_add+0x65/0xc0
  [<ffffffffc03926a5>] defer_packet_queue+0x145/0x1a0 [hfi1]
  [<ffffffffc0372987>] sdma_check_progress+0x67/0xa0 [hfi1]
  [<ffffffffc03779d2>] sdma_send_txlist+0x432/0x550 [hfi1]
  [<ffffffff91a20009>] ? kmem_cache_alloc+0x179/0x1f0
  [<ffffffffc0392973>] ? user_sdma_send_pkts+0xc3/0x1990 [hfi1]
  [<ffffffffc0393e3a>] user_sdma_send_pkts+0x158a/0x1990 [hfi1]
  [<ffffffff918ab65e>] ? try_to_del_timer_sync+0x5e/0x90
  [<ffffffff91a3fe1a>] ? __check_object_size+0x1ca/0x250
  [<ffffffffc0395546>] hfi1_user_sdma_process_request+0xd66/0x1280 [hfi1]
  [<ffffffffc034e0da>] hfi1_aio_write+0xca/0x120 [hfi1]
  [<ffffffff91a4245b>] do_sync_readv_writev+0x7b/0xd0
  [<ffffffff91a4409e>] do_readv_writev+0xce/0x260
  [<ffffffff918df69f>] ? pick_next_task_fair+0x5f/0x1b0
  [<ffffffff918db535>] ? sched_clock_cpu+0x85/0xc0
  [<ffffffff91f6b16a>] ? __schedule+0x13a/0x860
  [<ffffffff91a442c5>] vfs_writev+0x35/0x60
  [<ffffffff91a4447f>] SyS_writev+0x7f/0x110
  [<ffffffff91f78ddb>] system_call_fastpath+0x22/0x27

The issue happens when wait_event_interruptible_timeout() returns a value
<= 0.

In that case, the pq is left on the list. The code continues sending
packets and, provided no descriptor shortage is seen, can complete the
current request with the pq still on the dmawait list.

If the pq is torn down in that state, the sdma interrupt handler could
find the now-freed pq on the list, resulting in list corruption or
memory corruption.

Fix by adding a flush routine to ensure that the pq is never on a list
after processing a request.
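
The idea behind the flush can be sketched in userspace C. This is an
illustrative model only, not the driver code: the list helpers, struct
pkt_queue, defer_pq(), flush_pq() and process_request() are hypothetical
names, and locking is omitted (the real driver has to take the wait
list's lock around the list check and removal).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail_entry(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Unlink the entry and reinitialize it so it reads as "not on a list". */
static void list_del_init_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical packet queue: only the wait-list linkage is modeled. */
struct pkt_queue { struct list_head wait_entry; };

/* Queue the pq on the engine's wait list on a descriptor shortage. */
static void defer_pq(struct pkt_queue *pq, struct list_head *dmawait)
{
	if (list_is_empty(&pq->wait_entry))
		list_add_tail_entry(&pq->wait_entry, dmawait);
}

/* Flush: make sure the pq is off the wait list, whatever happened before. */
static void flush_pq(struct pkt_queue *pq)
{
	if (!list_is_empty(&pq->wait_entry))
		list_del_init_entry(&pq->wait_entry);
}

/*
 * Model one request. wait_ret <= 0 stands for a timed-out or interrupted
 * wait, in which case the wakeup path never removed the pq from the list.
 * Returns true when the pq is off the list and therefore safe to free.
 */
static bool process_request(struct pkt_queue *pq, struct list_head *dmawait,
			    long wait_ret, bool use_flush)
{
	defer_pq(pq, dmawait);
	if (wait_ret > 0)
		list_del_init_entry(&pq->wait_entry); /* normal wakeup */
	if (use_flush)
		flush_pq(pq);
	return list_is_empty(&pq->wait_entry);
}
```

Without the flush, a wait that returns <= 0 leaves the entry linked, so
freeing the pq would leave a dangling node on the wait list. With the
flush, the entry is unlinked on every exit path.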

A follow-up patch series will address issues with seqlock surfaced in:
https://lore.kernel.org/r/20200320003129.GP20941@ziepe.ca

The seqlock use for sdma will then be converted to a spin lock, since
list_empty() doesn't need the protection afforded by the sequence lock
currently in use.

Fixes: a0d406934a ("staging/rdma/hfi1: Add page lock limit check for SDMA requests")
Link: https://lore.kernel.org/r/20200320200200.23203.37777.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-23 21:57:57 -03:00
affinity.c RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create 2020-02-11 11:35:45 -04:00
affinity.h
aspm.c IB/hfi1: Reduce excessive aspm inlines 2019-06-28 22:34:26 -03:00
aspm.h IB/hfi1: Reduce excessive aspm inlines 2019-06-28 22:34:26 -03:00
chip.c IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats 2020-01-10 10:57:17 -04:00
chip.h IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats 2020-01-10 10:57:17 -04:00
chip_registers.h IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats 2020-01-10 10:57:17 -04:00
common.h IB/hfi1: Add accessor API routines to access context members 2020-01-03 16:44:49 -04:00
debugfs.c IB/hfi1: List all receive contexts from debugfs 2020-01-03 16:44:50 -04:00
debugfs.h infiniband: hfi1: drop crazy DEBUGFS_SEQ_FILE_CREATE() macro 2019-01-24 09:22:29 -07:00
device.c
device.h
driver.c IB/hfi1: Add software counter for ctxt0 seq drop 2020-01-10 10:57:17 -04:00
efivar.c
efivar.h
eprom.c
eprom.h
exp_rcv.c IB/hfi1: Remove WARN_ON when freeing expected receive groups 2019-04-03 15:27:30 -03:00
exp_rcv.h
fault.c infiniband: hfi1: fix memory leaks 2019-08-20 13:44:45 -04:00
fault.h
file_ops.c IB/hfi1: Close window for pq and request coliding 2020-02-11 11:41:31 -04:00
firmware.c
hfi.h IB/hfi1: Close window for pq and request coliding 2020-02-11 11:41:31 -04:00
init.c IB/hfi1: IB/hfi1: Add an API to handle special case drop 2020-01-10 10:57:16 -04:00
intr.c
iowait.c IB/hfi1: Don't cancel unused work item 2020-01-03 16:41:51 -04:00
iowait.h IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
mad.c RDMA: Change MAD processing function to remove extra casting and parameter 2019-11-12 20:20:15 -04:00
mad.h
Makefile IB/hfi1: Reduce excessive aspm inlines 2019-06-28 22:34:26 -03:00
mmu_rb.c mm/mmu_notifier: use structure for invalidate_range_start/end callback 2018-12-28 12:11:50 -08:00
mmu_rb.h
msix.c IB/hfi1: Fix logical condition in msix_request_irq 2020-01-25 15:33:53 -04:00
msix.h IB/hfi1: Decouple IRQ name from type 2020-01-10 10:57:17 -04:00
opa_compat.h
opfn.c IB/hfi1: Add TID RDMA retry timer 2019-02-05 18:07:43 -05:00
opfn.h IB/hfi1: Make opfn.h self sufficient 2019-04-24 11:31:49 -03:00
pcie.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
pio.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
pio.h IB/hfi1: Reduce lock contention on iowait_lock for sdma and pio 2018-12-06 20:15:36 -07:00
pio_copy.c
platform.c IB/hfi1: remove redundant assignment to variable ret 2019-11-25 10:31:47 -04:00
platform.h
qp.c IB/{rdmavt, hfi1, qib}: Add helpers to hide SWQE WR details 2019-06-28 22:34:26 -03:00
qp.h IB/hfi1: Add the dual leg code 2019-02-05 18:07:44 -05:00
qsfp.c
qsfp.h
rc.c IB/hfi1: use true,false for bool variable 2020-01-03 19:13:59 -04:00
rc.h IB/hfi1: Delay the release of destination mr for TID RDMA WRITE DATA 2019-04-03 15:27:30 -03:00
ruc.c IB/{rdmavt, hfi1): Miscellaneous comment fixes 2019-04-24 11:31:48 -03:00
sdma.c treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
sdma.h IB/hfi1: Reduce lock contention on iowait_lock for sdma and pio 2018-12-06 20:15:36 -07:00
sdma_txreq.h IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
sysfs.c RDMA: Introduce and use rdma_device_to_ibdev() 2019-01-14 13:12:03 -07:00
tid_rdma.c IB/hfi1: Adjust flow PSN with the correct resync_psn 2020-01-03 16:48:01 -04:00
tid_rdma.h IB/hfi1: Calculate flow weight based on QP MTU for TID RDMA 2019-11-06 13:15:36 -04:00
trace.c IB/hfi1: Add static trace for TID RDMA WRITE protocol 2019-02-05 18:07:44 -05:00
trace.h IB/hfi1: Add static trace for OPFN 2019-01-31 11:37:40 -05:00
trace_ctxts.h IB/hfi1: Add accessor API routines to access context members 2020-01-03 16:44:49 -04:00
trace_dbg.h IB/hfi1: Fix two format strings 2019-03-28 11:03:49 -03:00
trace_ibhdrs.h IB/hfi1: Add missing INVALIDATE opcodes for trace 2019-06-28 22:34:26 -03:00
trace_iowait.h IB/hfi1: Add static trace for iowait 2018-09-30 19:21:12 -06:00
trace_misc.h
trace_mmu.h
trace_rc.h IB/hfi1: Add static trace for TID RDMA READ protocol 2019-02-05 17:53:56 -05:00
trace_rx.h IB/hfi1: Add fast and slow handlers for receive context 2020-01-10 10:57:16 -04:00
trace_tid.h ftrace: Rework event_create_dir() 2019-11-27 07:44:25 +01:00
trace_tx.h ftrace: Rework event_create_dir() 2019-11-27 07:44:25 +01:00
uc.c IB/{hfi1, qib, rdmavt}: Put qp in error state when cq is full 2019-06-28 22:34:26 -03:00
ud.c IB/{rdmavt, hfi1, qib}: Add helpers to hide SWQE WR details 2019-06-28 22:34:26 -03:00
user_exp_rcv.c IB/hfi1: Close window for pq and request coliding 2020-02-11 11:41:31 -04:00
user_exp_rcv.h RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv 2019-11-23 19:56:44 -04:00
user_pages.c mm, tree-wide: rename put_user_page*() to unpin_user_page*() 2020-01-31 10:30:38 -08:00
user_sdma.c IB/hfi1: Ensure pq is not left on waitlist 2020-03-23 21:57:57 -03:00
user_sdma.h IB/hfi1: Remove unused define 2019-07-22 16:10:48 -03:00
verbs.c IB/hfi1, qib: Ensure RCU is locked when accessing list 2020-03-02 11:10:21 -04:00
verbs.h treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
verbs_txreq.c IB/hfi1: Silence txreq allocation warnings 2019-06-17 21:15:40 -04:00
verbs_txreq.h IB/hfi1: Silence txreq allocation warnings 2019-06-17 21:15:40 -04:00
vnic.h
vnic_main.c IB/hfi1: Add accessor API routines to access context members 2020-01-03 16:44:49 -04:00
vnic_sdma.c net: Use skb_frag_off accessors 2019-07-30 14:21:32 -07:00