Commit graph

106 commits

Author SHA1 Message Date
James Smart
84f2ddf8cf scsi: lpfc: Fix hang when downloading fw on port enabled for nvme
As part of firmware download, the adapter is reset. On the adapter the
reset causes the function to stop and all outstanding io is terminated
(without responses). The reset path then starts teardown of the adapter,
starting with deregistration of the remote ports with the nvme-fc
transport. The local port is then deregistered and the driver waits for
local port deregistration. This never finishes.

The remote port deregistrations terminated the nvme controllers, causing
them to send aborts for all the outstanding io. The aborts were serviced in
the driver, but stalled due to its state. The nvme layer then stops to
reclaim it's outstanding io before continuing.  The io must be returned
before the reset on the controller is deemed complete and the controller
delete performed.  The remote port deregistration won't complete until all
the controllers are terminated. And the local port deregistration won't
complete until all controllers and remote ports are terminated. Thus things
hang.

The issue is the reset which stopped the adapter also stopped all the
responses that would drive i/o completions, and the aborts were also
stopped that stopped i/o completions. The driver, when resetting the
adapter like this, needs to be generating the completions as part of the
adapter reset so that I/O complete (in error), and any aborts are not
queued.

Fix by adding flush routines whenever the adapter port has been reset or
discovered in error. The flush routines will generate the completions for
the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
and flushed as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
6825b7bd32 scsi: lpfc: Fix error in remote port address change
In a test with high nvme remote port counts connected via a multi-hop FC
switch config where switches were systematically reset (e.g. fabric
partitioning and re-establishment), the nvme remote ports would switch
addresses based on the switch reconfiguration events. The driver would get
into a situation where the nvme port changed address, PLOGI and PRLI would
succeed nvme transport registration occurred, but subsequent LS requests by
the nvme subsystem failed due to a bad ndlp state and connectivity to the
device failed.

The driver hit a race condition on multiple devices that address swapped
simultaneously. In cases where the driver notices the remote port structure
came back as the same value as previously (meaning a nvme_rport structure
was re-enabled and did not go through devloss_tmo/connect_tmo_failures on
all controllers) the driver would unconditionally exit assuming the ndlp
information was correct. But, if the ndlp's had been swapped, the ndlp had
stale port state information, which when used by the LS request commands,
would fail the commands.

Fix by checking whether a node swap had occurred, and only exit if no ndlp
swap had occurred.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
a6d10f24a0 scsi: lpfc: Fix driver nvme rescan logging
In situations where zoning is not being used, thus NVMe initiators see
other NVMe initiators as well as NVMe targets, a link bounce on an
initiator will cause the NVMe initiators to spew "6169" State Error
messages.

The driver is not qualifying whether the remote port is a NVMe targer or
not before calling the lpfc_nvme_rescan_port(), which validates the role
and prints the message if its only an NVMe initiator.

Fix by the following:

 - Before calling lpfc_nvme_rescan_port() ensure that the node is a NVMe
   storage target or a NVMe discovery controller.

 - Clean up implementation of lpfc_nvme_rescan_port. remoteport pointer
   will always be NULL if a NVMe initiator only. But, grabbing of
   remoteport pointer should be done under lock to coincide with the
   registering of the remote port with the fc transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
61184f1742 scsi: lpfc: Fix Oops in nvme_register with target logout/login
lpfc_nvme_register_port hit a null prev_ndlp pointer in a test with lots of
target ports swapping addresses. The oldport value was stale, thus it's
ndlp (prev_ndlp set to it) was used.

Fix by moving oldrport pointer checks, and if used prev_ndlp pointer
assignment, to be done while the lock is held.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:09 -04:00
Linus Torvalds
ba6d10ab80 SCSI misc on 20190709
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
 mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
 removal of the osst driver (I heard from Willem privately that he
 would like the driver removed because all his test hardware has
 failed).  Plus number of minor changes, spelling fixes and other
 trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
 fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
 fjy17dQLvsJ4GsidHy8=
 =aS5B
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
  mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
  removal of the osst driver (I heard from Willem privately that he
  would like the driver removed because all his test hardware has
  failed). Plus number of minor changes, spelling fixes and other
  trivia.

  The big merge conflict this time around is the SPDX licence tags.
  Following discussion on linux-next, we believe our version to be more
  accurate than the one in the tree, so the resolution is to take our
  version for all the SPDX conflicts"

Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.

In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").

In these cases I picked the new-style one.

In a few cases, the SPDX conversion was actually different, though.  As
explained by James above, and in more detail in a pre-pull-request
thread:

 "The other problem is actually substantive: In the libsas code Luben
  Tuikov originally specified gpl 2.0 only by dint of stating:

  * This file is licensed under GPLv2.

  In all the libsas files, but then muddied the water by quoting GPLv2
  verbatim (which includes the or later than language). So for these
  files Christoph did the conversion to v2 only SPDX tags and Thomas
  converted to v2 or later tags"

So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.

Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.

Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style.  The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
  scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
  scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
  scsi: qla2xxx: on session delete, return nvme cmd
  scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
  scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
  scsi: megaraid_sas: Introduce various Aero performance modes
  scsi: megaraid_sas: Use high IOPS queues based on IO workload
  scsi: megaraid_sas: Set affinity for high IOPS reply queues
  scsi: megaraid_sas: Enable coalescing for high IOPS queues
  scsi: megaraid_sas: Add support for High IOPS queues
  scsi: megaraid_sas: Add support for MPI toolbox commands
  scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
  scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
  scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
  scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
  scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
  scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
  scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
  scsi: megaraid_sas: Call disable_irq from process IRQ poll
  scsi: megaraid_sas: Remove few debug counters from IO path
  ...
2019-07-11 15:14:01 -07:00
James Smart
6f2589f478 lpfc: add support for translating an RSCN rcv into a discovery rescan
This patch updates RSCN receive processing to check for the remote
port being an NVME port, and if so, invoke the nvme_fc callback to
rescan the remote port.  The rescan will generate a discovery udev
event.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-21 11:08:38 +02:00
James Smart
2ab70c2106 scsi: lpfc: Revise message when stuck due to unresponsive adapter
Revise a stalled adapter message to also include the number of jobs that
are stalling the thread.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:21 -04:00
Bart Van Assche
d964b3e534 scsi: lpfc: Fix a recently introduced compiler warning
This patch avoids that the following compiler warning is reported with
CONFIG_NVME_FC=n:

drivers/scsi/lpfc/lpfc_nvme.c:2140:1: warning: 'lpfc_nvme_lport_unreg_wait' defined but not used [-Wunused-function]
 lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport,
 ^~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 3999df75bc ("scsi: lpfc: Declare local functions static")
Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-08 21:26:11 -04:00
Bart Van Assche
d6d189ceab scsi: lpfc: Change smp_processor_id() into raw_smp_processor_id()
This patch avoids that a kernel warning appears when smp_processor_id() is
called with preempt debugging enabled.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:36 -04:00
Bart Van Assche
cd05c155d7 scsi: lpfc: Annotate switch/case fall-through
This patch avoids that the compiler warns about missing fall-through
annotation when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
Bart Van Assche
3999df75bc scsi: lpfc: Declare local functions static
This patch avoids that the compiler complains about missing declarations
when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
Arnd Bergmann
faf5a744f4 scsi: lpfc: avoid uninitialized variable warning
clang -Wuninitialized incorrectly sees a variable being used without
initialization:

drivers/scsi/lpfc/lpfc_nvme.c:2102:37: error: variable 'localport' is uninitialized when used here
      [-Werror,-Wuninitialized]
                lport = (struct lpfc_nvme_lport *)localport->private;
                                                  ^~~~~~~~~
drivers/scsi/lpfc/lpfc_nvme.c:2059:38: note: initialize the variable 'localport' to silence this warning
        struct nvme_fc_local_port *localport;
                                            ^
                                             = NULL
1 error generated.

This is clearly in dead code, as the condition leading up to it is always
false when CONFIG_NVME_FC is disabled, and the variable is always
initialized when nvme_fc_register_localport() got called successfully.

Change the preprocessor conditional to the equivalent C construct, which
makes the code more readable and gets rid of the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-25 22:17:26 -04:00
James Smart
2c013a3a6b scsi: lpfc: Enhance 6072 log string
Update the 6072 log message string to print the whole 32 bits of the
extended status.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 13:15:10 -04:00
Linus Torvalds
477558d7e8 SCSI misc on 20190315
This is the final round of mostly small fixes and performance
 improvements to our initial submit.  The main regression fix is the
 ia64 simscsi build failure which was missed in the serial number
 elimination conversion.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIxBayYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisherpAP4rxLpX
 bcUnQnEsvoxys/JyoK08Qfv1JebZo1B2MAZ62wD/VZ7LpOuzVLhsM2KhLFGRrs1/
 7D2K4tgtO2dQsFix7H0=
 =pcHl
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is the final round of mostly small fixes and performance
  improvements to our initial submit.

  The main regression fix is the ia64 simscsi build failure which was
  missed in the serial number elimination conversion"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: ia64: simscsi: use request tag instead of serial_number
  scsi: aacraid: Fix performance issue on logical drives
  scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
  scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
  scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
  scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
  scsi: hisi_sas: Set PHY linkrate when disconnected
  scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
  scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
  scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
  scsi: qla2xxx: check for kstrtol() failure
  scsi: lpfc: fix 32-bit format string warning
  scsi: lpfc: fix unused variable warning
  scsi: target: tcmu: Switch to bitmap_zalloc()
  scsi: libiscsi: fall back to sendmsg for slab pages
  scsi: qla2xxx: avoid printf format warning
  scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
  scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
  scsi: ufs: hisi: fix ufs_hba_variant_ops passing
  scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
  ...
2019-03-16 12:51:50 -07:00
Linus Torvalds
92fff53b71 SCSI misc on 20190306
This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
 hisi_sas, target/iscsi and target/core.  Additionally Christoph
 refactored gdth as part of the dma changes.  The major mid-layer
 change this time is the removal of bidi commands and with them the
 whole of the osd/exofs driver and filesystem.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23
 ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z
 f69LpXGmMpuagKGvvd4=
 =Fhy1
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
  hisi_sas, target/iscsi and target/core.

  Additionally Christoph refactored gdth as part of the dma changes. The
  major mid-layer change this time is the removal of bidi commands and
  with them the whole of the osd/exofs driver and filesystem. This is a
  major simplification for block and mq in particular"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
  scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
  scsi: cxgb4i: get pf number from lldi->pf
  scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
  scsi: mpt3sas: Add missing breaks in switch statements
  scsi: aacraid: Fix missing break in switch statement
  scsi: kill command serial number
  scsi: csiostor: drop serial_number usage
  scsi: mvumi: use request tag instead of serial_number
  scsi: dpt_i2o: remove serial number usage
  scsi: st: osst: Remove negative constant left-shifts
  scsi: ufs-bsg: Allow reading descriptors
  scsi: ufs: Allow reading descriptor via raw upiu
  scsi: ufs-bsg: Change the calling convention for write descriptor
  scsi: ufs: Remove unused device quirks
  Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
  scsi: megaraid_sas: Remove a bunch of set but not used variables
  scsi: clean obsolete return values of eh_timed_out
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: MAINTAINERS: SCSI initiator and target tweaks
  scsi: fcoe: make use of fip_mode enum complete
  ...
2019-03-09 16:53:47 -08:00
Arnd Bergmann
352b205a3b scsi: lpfc: fix unused variable warning
The newly introduced 'cpu' variable is only used inside of an optional
block, so we get a warning without CONFIG_SCSI_LPFC_DEBUG_FS:

drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_nvme_io_cmd_wqe_cmpl':
drivers/scsi/lpfc/lpfc_nvme.c:968:30: error: unused variable 'cpu' [-Werror=unused-variable]
  uint32_t code, status, idx, cpu;

Move the declaration into the same block to avoid the warning.

Fixes: 63df6d637e ("scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
James Smart
0d041215f0 scsi: lpfc: Update 12.2.0.0 file copyrights to 2019
For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
c2017260ee scsi: lpfc: Rework locking on SCSI io completion
A scsi host lock is taken on every io completion to check whether the abort
handler is waiting on the io completion. This is an expensive lock to take
on all completion when rarely in an abort condition.

Replace scsi host lock with command-specific lock. Synchronize completion
and abort paths by new cmd lock. Ensure all flag changing and nulling of
context pointers taken under lock.  When adding lock to task management
abort, realized it was missing other synchronization locks. Added that
synchronization to match normal paths.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
6a828b0f61 scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues
So far MSIX vector allocation assumed it would be 1:1 with hardware
queues. However, there are several reasons why fewer MSIX vectors may be
allocated than hardware queues such as the platform being out of vectors or
adapter limits being less than cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu hardware
queues so they can function independently. MSIX vectors will be equitably
split been cpu sockets/cores and then the per-cpu hardware queues will be
mapped to the vectors most efficient for them.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
45aa312e21 scsi: lpfc: Allow override of hardware queue selection policies
Default behavior is to use the information from the upper IO stacks to
select the hardware queue to use for IO submission.  Which typically has
good cpu affinity.

However, the driver, when used on some variants of the upstream kernel, has
found queuing information to be suboptimal for FCP or IO completion locked
on particular cpus.

For command submission situations, the lpfc_fcp_io_sched module parameter
can be set to specify a hardware queue selection policy that overrides the
os stack information.

For IO completion situations, rather than queing cq processing based on the
cpu servicing the interrupting event, schedule the cq processing on the cpu
associated with the hardware queue's cq.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
c490850a09 scsi: lpfc: Adapt partitioned XRI lists to efficient sharing
The XRI get/put lists were partitioned per hardware queue. However, the
adapter rarely had sufficient resources to give a large number of resources
per queue. As such, it became common for a cpu to encounter a lack of XRI
resource and request the upper io stack to retry after returning a BUSY
condition. This occurred even though other cpus were idle and not using
their resources.

Create as efficient a scheme as possible to move resources to the cpus that
need them. Each cpu maintains a small private pool which it allocates from
for io. There is a watermark that the cpu attempts to keep in the private
pool.  The private pool, when empty, pulls from a global pool from the
cpu. When the cpu's global pool is empty it will pull from other cpu's
global pool. As there many cpu global pools (1 per cpu or hardware queue
count) and as each cpu selects what cpu to pull from at different rates and
at different times, it creates a radomizing effect that minimizes the
number of cpu's that will contend with each other when the steal XRI's from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool.  A
watermark level is maintained for the private pool such that when it is
exceeded it will move XRI's to the CPU global pool so that other cpu's may
allocate them.

On NVME, as heartbeat commands are critical to get placed on the wire, a
single expedite pool is maintained. When a heartbeat is to be sent, it will
allocate an XRI from the expedite pool rather than the normal cpu
private/global pools. On any io completion, if a reduction in the expedite
pools is seen, it will be replenished before the XRI is placed on the cpu
private pool.

Statistics are added to aid understanding the XRI levels on each cpu and
their behaviors.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
1fbf974250 scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting.
SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
hardware. This should be indicating the hardware queue to use, not the ring
number.

Replace ring number with the hardware queue that should be used.

Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb
routine that properly adapts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
4c47efc140 scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures
Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures.  Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:08 -05:00
James Smart
63df6d637e scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues
Similar to the io execution path that reports cpu context information, the
debugfs routines for cpu information needs to be aligned with new hardware
queue implementation.

Convert debugfs cnd nvme cpucheck statistics to report information per
Hardware Queue.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:28:11 -05:00
James Smart
5e5b511d8b scsi: lpfc: Partition XRI buffer list across Hardware Queues
Once the IO buff allocations were made shared, there was a single XRI
buffer list shared by all hardware queues.  A single list isn't great for
performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
SGLs and associated IO buffers get allocated/posted to the firmware; round
robin their assignment across all available hardware Queues so that there
is an equitable assignment.

Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
for XRI allocation.

Add a debugfs interface to display hardware queue statistics

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart
cdb42becdd scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu
Currently, both nvme and fcp each have their own concept of an io_channel,
which is a combination wq/cq and associated msix.  Different cpus would
share an io_channel.

The driver is now moving to per-cpu wq/cq pairs and msix vectors.  The
driver will still use separate wq/cq pairs per protocol on each cpu, but
the protocols will share the msix vector.

Given the elimination of the nvme and fcp io channels, the module
parameters will be removed.  A new parameter, lpfc_hdw_queue is added which
allows the wq/cq pair allocation per cpu to be overridden and allocated to
lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will
be based on the number of cpus. If non-zero, the parameter specifies the
number of queues to allocate. At this time, the maximum non-zero value is
64.

To manage this new paradigm, a new hardware queue structure is created to
track queue activity and relationships.

As MSIX vector allocation must be known before setting up the
relationships, msix allocation now occurs before queue datastructures are
allocated. If the number of vectors allocated is less than the desired
hardware queues, the hardware queue counts will be reduced to the number of
vectors

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
0794d601d1 scsi: lpfc: Implement common IO buffers between NVME and SCSI
Currently, both NVME and SCSI get their IO buffers from separate
pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split
between protocols.

Eliminate the independent pools and use a single pool. Each buffer
structure now has a common section and a protocol section. Per protocol
routines for SGL initialization are removed and replaced by common
routines. Initialization of the buffers is only done on the common area.
All other fields, which are protocol specific, are initialized when the
buffer is allocated for use in the per-protocol allocation routine.

In the past, the SCSI side allocated IO buffers as part of slave_alloc
calls until the maximum XRIs for SCSI was reached. As all XRIs are now
common and may be used for either protocol, allocation for everything is
done as part of adapter initialization and the scsi side has no action in
slave alloc.

As XRI's are no longer split, the lpfc_xri_split module parameter is
removed.

Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put
routines.  All SLI4 adapters utilize the new IO buffer scheme

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
e960f5ab40 scsi: lpfc: cleanup: Remove excess check on NVME io submit code path
lpfc_nvme_prep_io_cmd() checks for null pnode, but caller
lpfc_nvme_fcp_io_submit() has already ensured it's non-null.

Remove the pnode null check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
0b05e9fe1f scsi: lpfc: cleanup: remove nrport from nvme command structure
An hba-wide lock is taken in the nvme io completion routine. The lock
covers null'ing of the nrport pointer in the cmd structure.

The nrport member isn't necessary. After extracting the pointer from the
command, the pointer was dereferenced to get the fc discovery node
pointer. But the fc discovery node pointer is alrady in the command
structure so the dereferrence was unnecessary.

Eliminated the nrport structure member and its use, which also eliminates
the port-wide lock.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
Ewan D. Milne
7961cba6f7 scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport
We cannot wait on a completion object in the lpfc_nvme_lport structure in
the _destroy_localport() code path because the NVMe/fc transport will free
that structure immediately after the .localport_delete() callback.  This
results in a use-after-free, and a hang if slub_debug=FZPU is enabled.

Fix this by putting the completion on the stack.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-22 20:40:59 -05:00
James Smart
1c36833d82 scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE
Driver is setting bits in word 10 of the SLI4 ABORT WQE (the wqid).  The
field was a carry over from a prior SLI revision. The field does not exist
in SLI4, and the action may result in an overlap with future definition of
the WQE.

Remove the setting of WQID in the ABORT WQE.

Also cleaned up WQE field settings - initialize to zero, don't bother to
set fields to zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
Linus Torvalds
d49f8a52b1 SCSI misc on 20181024
This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380,
 qla2xxx, lpfc, libsas, hisi_sas.  In addition there's a set of mostly
 small updates to the target subsystem a set of conversions to the
 generic DMA API, which do have some potential for issues in the older
 drivers but we'll handle those as case by case fixes. A new myrs for
 the DAC960/mylex raid controllers to replace the block based DAC960
 which is also being removed by Jens in this merge window. Plus the
 usual slew of trivial changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW9BQJSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU3MAP41T8yW
 UJQDCprj65pCR+9mOUWzgMvgAW/15ouK89x/7AD/XAEQZqoAgpFUbgnoZWGddZkS
 LykIzSiLHP4qeDOh1TQ=
 =2JMU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380,
  qla2xxx, lpfc, libsas, hisi_sas.

  In addition there's a set of mostly small updates to the target
  subsystem a set of conversions to the generic DMA API, which do have
  some potential for issues in the older drivers but we'll handle those
  as case by case fixes.

  A new myrs driver for the DAC960/mylex raid controllers to replace the
  block based DAC960 which is also being removed by Jens in this merge
  window.

  Plus the usual slew of trivial changes"

[ "myrs" stands for "MYlex Raid Scsi". Obviously. Silly of me to even
  wonder. There's also a "myrb" driver, where the 'b' stands for
  'block'. Truly, somebody has got mad naming skillz. - Linus ]

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (237 commits)
  scsi: myrs: Fix the processor absent message in processor_show()
  scsi: myrs: Fix a logical vs bitwise bug
  scsi: hisi_sas: Fix NULL pointer dereference
  scsi: myrs: fix build failure on 32 bit
  scsi: fnic: replace gross legacy tag hack with blk-mq hack
  scsi: mesh: switch to generic DMA API
  scsi: ips: switch to generic DMA API
  scsi: smartpqi: fully convert to the generic DMA API
  scsi: vmw_pscsi: switch to generic DMA API
  scsi: snic: switch to generic DMA API
  scsi: qla4xxx: fully convert to the generic DMA API
  scsi: qla2xxx: fully convert to the generic DMA API
  scsi: qla1280: switch to generic DMA API
  scsi: qedi: fully convert to the generic DMA API
  scsi: qedf: fully convert to the generic DMA API
  scsi: pm8001: switch to generic DMA API
  scsi: nsp32: switch to generic DMA API
  scsi: mvsas: fully convert to the generic DMA API
  scsi: mvumi: switch to generic DMA API
  scsi: mpt3sas: switch to generic DMA API
  ...
2018-10-25 07:40:30 -07:00
James Smart
9e21017826 scsi: lpfc: Synchronize access to remoteport via rport
The driver currently uses the ndlp to get the local rport which is then used
to get the nvme transport remoteport pointer. There can be cases where a stale
remoteport pointer is obtained as synchronization isn't done through the
different dereferences.

Correct by using locks to synchronize the dereferences.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-20 22:02:36 -04:00
YueHaibing
a63eba9efd scsi: lpfc: Remove set but not used variable 'sgl_size'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_new_nvme_buf':
drivers/scsi/lpfc/lpfc_nvme.c:2238:24: warning:
 variable 'sgl_size' set but not used [-Wunused-but-set-variable]
  int bcnt, num_posted, sgl_size;
                        ^
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-17 03:01:43 -04:00
James Smart
2879265f51 scsi: lpfc: Fix errors in log messages.
Message 6408 is displayed for each entry in an array, but the cpu and queue
numbers were incorrect for the entry.  Message 6001 includes an extraneous
character.

Resolve both issues

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:33 -04:00
James Smart
5b9e70b22c scsi: lpfc: raise sg count for nvme to use available sg resources
The driver allocates a sg list per io struture based on a fixed maximum
size. When it registers with the protocol transports and indicates the max sg
list size it supports, the driver manipulates the fixed value to report a
lesser amount so that it has reserved space for sg elements that are used for
DIF.

The driver initialization path sets the cfg_sg_seg_cnt field to the
manipulated value for scsi. NVME initialization ran afterward and capped it's
maximum by the manipulated value for SCSI. This erroneously made NVME report
the SCSI-reduce-for-DIF value that reduced the max io size for nvme and wasted
sg elements.

Rework the driver so that cfg_sg_seg_cnt becomes the overall maximum size and
allow the max size to be tunable.  A separate (new) scsi sg count is then
setup with the scsi-modified reduced value. NVME then initializes based off
the overall maximum.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:33 -04:00
James Smart
2a5b7d626e scsi: lpfc: Limit tracking of tgt queue depth in fast path
Performance is affected when target queue depth is tracked.  An atomic
counter is incremented on the submission path which competes with it being
decremented on the completion path.  In addition, multiple CPUs can
simultaniously be manipulating this counter for the same ndlp.

Reduce the overhead by only performing the target increment/decrement when
the target queue depth is less than the overall adapter depth, thus is
actually meaningful.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02 15:45:19 -04:00
James Smart
93a3922da4 scsi: lpfc: Fix driver crash when re-registering NVME rports.
During remote port loss fault testing, the driver crashed with the
following trace:

general protection fault: 0000 [#1] SMP
RIP: ... lpfc_nvme_register_port+0x250/0x480 [lpfc]
Call Trace:
 lpfc_nlp_state_cleanup+0x1b3/0x7a0 [lpfc]
 lpfc_nlp_set_state+0xa6/0x1d0 [lpfc]
 lpfc_cmpl_prli_prli_issue+0x213/0x440
 lpfc_disc_state_machine+0x7e/0x1e0 [lpfc]
 lpfc_cmpl_els_prli+0x18a/0x200 [lpfc]
 lpfc_sli_sp_handle_rspiocb+0x3b5/0x6f0 [lpfc]
 lpfc_sli_handle_slow_ring_event_s4+0x161/0x240 [lpfc]
 lpfc_work_done+0x948/0x14c0 [lpfc]
 lpfc_do_work+0x16f/0x180 [lpfc]
 kthread+0xc9/0xe0
 ret_from_fork+0x55/0x80

After registering a new remoteport, the driver is pulling an ndlp pointer
from the lpfc rport associated with the private area of a newly registered
remoteport. The private area is uninitialized, so it's garbage.

Correct by pulling the the lpfc rport pointer from the entering ndlp point,
then ndlp value from at rport. Note the entering ndlp may be replacing by
the rport->ndlp due to an address change swap.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02 15:45:19 -04:00
James Smart
414abe0ab6 scsi: lpfc: Make PBDE optimizations configurable
The PBDE optimizations aren't supported in all firmware revs.

Make optimizations configurable in case there's a side effect on old
firmware.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:09 -04:00
James Smart
d580c61374 scsi: lpfc: Fix panic if driver unloaded when port is offline
System crashes when the lpfc module is unloaded after making the port
offline

The nvme queue pointers were freed during port offline, but were later
accessed in pci remove path.

Validate the pointers in pci remove path before accessing them.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:08 -04:00
James Smart
7438273fa2 scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
modprobe -r lpfc produces the following:

Call Trace:
 __blk_mq_run_hw_queue+0xa2/0xb0
 __blk_mq_delay_run_hw_queue+0x9d/0xb0
 ? blk_mq_hctx_has_pending+0x32/0x80
 blk_mq_run_hw_queue+0x50/0xd0
 blk_mq_sched_insert_request+0x110/0x1b0
 blk_execute_rq_nowait+0x76/0x180
 nvme_keep_alive_work+0x8a/0xd0 [nvme_core]
 process_one_work+0x17f/0x440
 worker_thread+0x126/0x3c0
 ? manage_workers.isra.24+0x2a0/0x2a0
 kthread+0xd1/0xe0
 ? insert_kthread_work+0x40/0x40
 ret_from_fork_nospec_begin+0x21/0x21
 ? insert_kthread_work+0x40/0x40

However, rmmod lpfc would run correctly.

When an nvme remoteport is unregistered with the host nvme transport, it
needs to set the remoteport->dev_loss_tmo value 0 to indicate an immediate
termination of device loss and prevent any further keep alives to that
rport.  The driver was never setting dev_loss_tmo causing the nvme
transport to continue to send the keep alive.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
4d5e789a2e scsi: lpfc: correct oversubscription of nvme io requests for an adapter
Under large configurations, the driver would start to log message 6065 -
NVME out of buffers (exchanges).

The driver is using the ndlp cmd_qdepth value when determining the max
outstanding ios for an adapter. This value, by default, is set to 65536,
which exceeds the maximum exchange counts supported on an adapter. The ndlp
cmd_qdepth has no relevance and outstanding io count should be capped at
the max exchange count with IO requests beyond that level getting bounced
back with an EBUSY status so that they are retried by the block layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
3e21d1cb0f scsi: lpfc: Comment cleanup regarding Broadcom copyright header
Fix small formatting and wording nits in Broadcom copyright header

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
44c2757b76 scsi: lpfc: Fix up log messages and stats counters in IO submit code path
Fix up log messages and add an fcp error stat counter in the IO submit
code path to make diagnosing problems easier

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
cd2400715c scsi: lpfc: Change IO submit return to EBUSY if remote port is recovering
I/O submission paths in the lpfc nvme path are rejecting the io with an
error code that reflects back to the callee as a hard io failure. Many
of these conditions are transient and would likely resolve if retried.

Correct by returning -EBUSY, which the FC transport triggers off of to
return busy status codes to the blk-mq layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:15 -04:00
James Smart
66a85155d4 scsi: lpfc: Fix NULL pointer reference when resetting adapter
Points referencing local port structures didn't accommodate cases where
the localport may not be registered yet.

Add NULL pointer checks to logic.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:06 -04:00
James Smart
b15bd3e621 scsi: lpfc: Fix nvme remoteport registration race conditions
On tests adding and removing a remote port, calls to nvme_info would
eventually show fewer target ports discovered than were present in the
san. Additionally, the following error messages were seen:

  6031 RemotePort Registration failed err: -116, DID x471301

There is a race condition that exists between the driver and the nvme
transport on remote port unregister vs the confirmed deletion. It's
possible that the driver may rediscover the remote port and reregister
the remote port before a prior unregister delete callback was made (as
it rebinded to the prior remoteport structure). However, the driver was
coded to expect the callback before seeing the remote port again thus a
new registration. The logic results in the driver having an invalid
remoteport pointer set.

Correct by tracking when waiting for the delete callback. In cases where
the ndlp remoteport pointer is updated, it is only cleared when the wait
has not been superceded by a prior registration.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:05 -04:00
James Smart
b04744ce52 scsi: lpfc: Fix driver not recovering NVME rports during target link faults
During target-side port faults, the driver would not recover all target
port logins. This resulted in a loss of nvme device discovery.

The driver is coded to wait for all GID_FT requests to complete before
restarting discovery. A fault is seen where the outstanding GIT_FT
counts are not properly decremented, thus discovery would never
start. Another fault was found in the clearing of the gidft_inp counter
that would be skipped in this condition. And a third fault found with
lpfc_nvme_register_port that would remove a reverence on the ndlp which
then allows a node swap on a port address change to prematurely remove
the reference and release the ndlp.

The following changes are made:

 - Correct the decrementing of the outstanding GID_FT counters.

 - In RSCN handling, no longer zero the counter before calling to issue
   another GID_FT.

 - No longer remove the reference on the dlp when the ndlp->nrport value
   is not yet null.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:05 -04:00
James Smart
01466024d2 scsi: lpfc: Fix NULL pointer access in lpfc_nvme_info_show
After making remoteport unregister requests, the ndlp nrport pointer was
stale.

Track when waiting for waiting for unregister completion callback and
adjust nldp pointer assignment.  Add a few safety checks for NULL
pointer values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:04 -04:00
James Smart
66a210ffb8 scsi: lpfc: Add per io channel NVME IO statistics
When debugging various issues, per IO channel IO statistics were useful
to understand what was happening. However, many of the stats were on a
port basis rather than an io channel basis.

Move statistics to an io channel basis.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:01 -04:00