Commit graph

434 commits

Author SHA1 Message Date
Xiang Chen
1e82e4627a scsi: libsas: Resume SAS host for phy reset or enable via sysfs
Currently if a phy reset or enable phy is issued via sysfs when controller
is suspended, those operations will be ignored as SAS_HA_REGISTERED is
cleared. If RPM is enabled then we may aggressively suspend automatically.
In this case it may be difficult to enable or reset a phy via sysfs, so
resume the host in these scenarios.

Link: https://lore.kernel.org/r/1657823002-139010-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-18 23:04:12 -04:00
Damien Le Moal
3dafe0648d scsi: libsas: Introduce struct smp_rps_resp
Similarly to sas report general and discovery responses, define the
structure struct smp_rps_resp to handle SATA PHY report responses using a
structure with a size that is exactly equal to the sas defined response
size.

With this change, struct smp_resp becomes unused and is removed.

Link: https://lore.kernel.org/r/20220609022456.409087-4-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-10 13:08:06 -04:00
Damien Le Moal
44f2bfe9ef scsi: libsas: Introduce struct smp_rg_resp
When compiling with gcc 12, several warnings are thrown by gcc when
compiling drivers/scsi/libsas/sas_expander.c, e.g.:

In function ‘sas_get_ex_change_count’,
    inlined from ‘sas_find_bcast_dev’ at
    drivers/scsi/libsas/sas_expander.c:1816:8:
drivers/scsi/libsas/sas_expander.c:1781:20: warning: array subscript
‘struct smp_resp[0]’ is partly outside array bounds of ‘unsigned
char[32]’ [-Warray-bounds]
 1781 |         if (rg_resp->result != SMP_RESP_FUNC_ACC) {
      |             ~~~~~~~^~~~~~~~

This is due to the use of the struct smp_resp to aggregate all possible
response types using a union but allocating a response buffer with a size
exactly equal to the size of the response type needed. This leads to access
to fields of struct smp_resp from an allocated memory area that is smaller
than the size of struct smp_resp.

Fix this by defining struct smp_rg_resp for sas report general responses.

Link: https://lore.kernel.org/r/20220609022456.409087-3-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-10 13:08:06 -04:00
Damien Le Moal
c3752f4460 scsi: libsas: Introduce struct smp_disc_resp
When compiling with gcc 12, several warnings are thrown by gcc when
compiling drivers/scsi/libsas/sas_expander.c, e.g.:

In function ‘sas_get_phy_change_count’,
    inlined from ‘sas_find_bcast_phy.constprop’ at
drivers/scsi/libsas/sas_expander.c:1737:9:
drivers/scsi/libsas/sas_expander.c:1697:39: warning: array subscript
‘struct smp_resp[0]’ is partly outside array bounds of ‘unsigned
char[56]’ [-Warray-bounds]
 1697 |                 *pcc = disc_resp->disc.change_count;
      |                        ~~~~~~~~~~~~~~~^~~~~~~~~~~~~

This is due to the use of the struct smp_resp to aggregate all possible
response types using a union but allocating a response buffer with a size
exactly equal to the size of the response type needed. This leads to access
to fields of struct smp_resp from an allocated memory area that is smaller
than the size of struct smp_resp.

Fix this by defining struct smp_disc_resp for sas discovery operations.
Since this structure and the generic struct smp_resp are identical for
the little endian and big endian archs, move the definition of these
structures at the end of include/scsi/sas.h to avoid repeating their
definition.

Link: https://lore.kernel.org/r/20220609022456.409087-2-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-10 13:08:06 -04:00
John Garry
057e5fc033 scsi: libsas: Refactor sas_ata_hard_reset()
Create function sas_ata_wait_after_reset() from sas_ata_hard_reset() as
some LLDDs may want to check for a remote ATA phy is up after reset.

Link: https://lore.kernel.org/r/1652354134-171343-2-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19 20:16:25 -04:00
John Garry
095478a6e5 scsi: hisi_sas: Use libsas internal abort support
Use the common libsas internal abort functionality.

In addition, this driver has special handling for internal abort timeouts -
specifically whether to reset the controller in that instance, so extend
the API for that.

Timeout is now increased to 20 * Hz from 6 * Hz.

We also retry for failure now, but this should not make a difference.

Link: https://lore.kernel.org/r/1647001432-239276-5-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14 23:33:24 -04:00
John Garry
6a91c3e315 scsi: libsas: Add sas_execute_internal_abort_dev()
Add support for a "device" variant of internal abort, which will abort all
pending I/Os for a specific device.

Link: https://lore.kernel.org/r/1647001432-239276-3-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14 23:33:23 -04:00
John Garry
5c9bf3635b scsi: libsas: Add sas_execute_internal_abort_single()
The internal abort feature is common to hisi_sas and pm8001 HBAs, and the
driver support is similar also, so add a common handler.

Two modes of operation will be supported:

 - single: Abort a single tagged command

 - device: Abort all commands associated with a specific domain device

A new protocol is added, SAS_PROTOCOL_INTERNAL_ABORT, so the common queue
command API may be re-used.

Only add "single" support as a first step.

Link: https://lore.kernel.org/r/1647001432-239276-2-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14 23:33:23 -04:00
Damien Le Moal
32698c9552 scsi: libsas: Clean up sas_form_port()
Sparse throws a warning about context imbalance ("different lock contexts
for basic block") in sas_form_port() as it gets confused with the fact that
a port is locked within one of the two search loops and unlocked afterward
outside of the search loops once the phy is added to the port. Since this
code is not easy to follow, improve it by factoring out the code adding the
phy to the port once the port is locked into the helper function
sas_form_port_add_phy(). This helper can then be called directly within the
port search loops, avoiding confusion and clearing the sparse warning.

Link: https://lore.kernel.org/r/20220228094857.557329-1-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01 23:56:27 -05:00
John Garry
a2a59faa35 scsi: libsas: Use bool for queue_work() return code
Function queue_work() returns a bool, so use a bool to hold this value for
the return code from callers, which should make the code a tiny bit more
clear.

Also take this opportunity to condense the code of the those callers, such
as sas_queue_work(), as suggested by Damien.

Link: https://lore.kernel.org/r/1645786656-221630-3-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27 21:48:30 -05:00
John Garry
f1834fd163 scsi: libsas: Make sas_notify_{phy,port}_event() return void
Nobody checks the return codes, so make them return void. Indeed, if the
LLDD cannot send an event, nothing much can be done in the LLDD about it.

Also remove prototype for sas_notify_phy_event() in sas_internal.h, which
should not be there.

Link: https://lore.kernel.org/r/1645786656-221630-2-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27 21:48:29 -05:00
Damien Le Moal
a1e7c79919 scsi: libsas: Simplify sas_ata_qc_issue() detection of NCQ commands
To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.

Link: https://lore.kernel.org/r/20220220031810.738362-24-damien.lemoal@opensource.wdc.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-22 21:31:20 -05:00
Damien Le Moal
8454563e4c scsi: libsas: Fix sas_ata_qc_issue() handling of NCQ NON DATA commands
To detect for the DMA_NONE (no data transfer) DMA direction,
sas_ata_qc_issue() tests if the command protocol is ATA_PROT_NODATA.  This
test does not include the ATA_CMD_NCQ_NON_DATA command as this command
protocol is defined as ATA_PROT_NCQ_NODATA (equal to ATA_PROT_FLAG_NCQ) and
not as ATA_PROT_NODATA.

To include both NCQ and non-NCQ commands when testing for the DMA_NONE DMA
direction, use "!ata_is_data()".

Link: https://lore.kernel.org/r/20220220031810.738362-2-damien.lemoal@opensource.wdc.com
Fixes: 176ddd8917 ("scsi: libsas: Reset num_scatter if libata marks qc as NODATA")
Cc: stable@vger.kernel.org
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-22 21:31:16 -05:00
John Garry
3f2e252ef7 scsi: libsas: Add sas_execute_ata_cmd()
Add a function to execute an ATA command using the TMF code, and use in the
hisi_sas driver. That driver needs to be able to issue the command on a
specific phy, so add an interface for that.

With that, hisi_sas_exec_internal_tmf_task() may be deleted.

Link: https://lore.kernel.org/r/1645534259-27068-19-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-22 21:11:02 -05:00
John Garry
4fea759edf scsi: libsas: Add sas_abort_task()
Add a generic implementation of abort task TMF handler, and use in LLDDs.

With that, some LLDDs custom TMF functions can now be deleted.

Link: https://lore.kernel.org/r/1645112566-115804-18-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:36 -05:00
John Garry
72f8810e1f scsi: libsas: Add sas_query_task()
Add a generic implementation of query task TMF handler, and use in LLDDs.

Link: https://lore.kernel.org/r/1645112566-115804-17-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:36 -05:00
John Garry
29d7769055 scsi: libsas: Add sas_lu_reset()
Add a generic implementation of LU reset TMF handler, and use in LLDDs.

Link: https://lore.kernel.org/r/1645112566-115804-16-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:36 -05:00
John Garry
e858545295 scsi: libsas: Add sas_clear_task_set()
Add a generic implementation of clear task set TMF handler, and use in
LLDDs.

Link: https://lore.kernel.org/r/1645112566-115804-15-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:36 -05:00
John Garry
69b80a0ed0 scsi: libsas: Add sas_abort_task_set()
Add a generic implementation of abort task set TMF handler, and use in
LLDDs.

Link: https://lore.kernel.org/r/1645112566-115804-14-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:36 -05:00
John Garry
693e66a0a6 scsi: libsas: Add TMF handler aborted callback
The hisi_sas and pm8001 TMF handlers have some special processing for when
the TMF is aborted, so add a callback and fill it in for those drivers.

Link: https://lore.kernel.org/r/1645112566-115804-13-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:35 -05:00
John Garry
2037a34031 scsi: libsas: Add TMF handler exec complete callback
The pm8001 TMF handler has some special processing when the TMF completes,
so add a callback and fill it in for the pm8001 driver.

Link: https://lore.kernel.org/r/1645112566-115804-12-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:35 -05:00
John Garry
350d85ba5b scsi: libsas: Add sas_execute_ssp_tmf()
Add a function to issue an SSP TMF.

Add a temp prototype to keep make W=1 happy.

Link: https://lore.kernel.org/r/1645112566-115804-11-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:35 -05:00
John Garry
001ec7f89b scsi: libsas: Add sas_execute_tmf()
Drivers using libsas have to issue their own TMFs, and the code to do this
is duplicated between drivers.

Add a common function to handle TMFs. This will be also used for sending
ATA commands, but use name "tmf" as the purpose is similar between SCSI and
ATA in the usage.

The force_phy_id argument will be used for a hisi_sas HW bug workaround.

We check the sas task status stat field against TMF codes, as according to
the SAS v1.1 spec 9.2.2.5.3, if response_data is in datapres, then the
response code should be a TMF code - see table 128.

Link: https://lore.kernel.org/r/1645112566-115804-10-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:35 -05:00
John Garry
4aef43b25d scsi: libsas: Move SMP task handlers to core
Move the SMP task handlers to the core host code as they will be re-used
for executing internal abort and TMF tasks.

Link: https://lore.kernel.org/r/1645112566-115804-7-git-send-email-john.garry@huawei.com
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:34 -05:00
John Garry
2dd6801a67 scsi: libsas: Delete SAS_SG_ERR
No LLDD sets exec status as SAS_SG_ERR, so remove support.

Link: https://lore.kernel.org/r/1645112566-115804-5-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:34 -05:00
John Garry
1d6049a3b1 scsi: libsas: Use enum for response frame DATAPRES field
As defined in table 126 of the SAS spec 1.1, use an enum for the DATAPRES
field, which makes reading the code easier.

Also change sas_ssp_task_response() to use a switch statement, which is
more suitable (than if-else), as suggested by Christoph.

Link: https://lore.kernel.org/r/1645112566-115804-3-git-send-email-john.garry@huawei.com
Suggested-by: Xiang Chen <chenxiang66@hisilicon.com>
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:34 -05:00
John Garry
9aacf6fe90 scsi: libsas: Handle non-TMF codes in sas_scsi_find_task()
LLDD TMF callbacks may return linux or other error codes instead of TMF
codes. This may cause problems in sas_scsi_find_task() ->
.lldd_query_task(), as only TMF codes are handled there. As such, we may
not return a task_disposition type from sas_scsi_find_task(). Function
sas_eh_handle_sas_errors() only handles that type, and will only progress
error handling for those recognised types.

Return TASK_ABORT_FAILED upon exit on the assumption that the command may
still be alive and error handling should be escalated.

Link: https://lore.kernel.org/r/1645112566-115804-2-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-19 15:59:34 -05:00
Xiang Chen
3a20e64281 scsi: libsas: Remove unused parameter for function sas_ata_eh()
Input parameter work_q is not unused in function sas_ata_eh(), so remove
it.

Link: https://lore.kernel.org/r/1644561778-183074-4-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-11 17:13:29 -05:00
Xiang Chen
59803ccb65 scsi: libsas: Remove duplicated setting for task->task_state_flags
Task->task_state_flags is already set in function sas_alloc_task(), so
remove duplicated setting.

Link: https://lore.kernel.org/r/1644561778-183074-3-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-11 17:13:29 -05:00
Xiang Chen
26d4a969dd scsi: libsas: Use void for sas_discover_event() return code
The callers of function sas_discover_event() do not check its return value.
The function also only ever returns 0, so use void instead.

Link: https://lore.kernel.org/r/1644561778-183074-2-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-11 17:13:29 -05:00
Xiang Chen
307d9f49cc scsi: libsas: Keep host active while processing events
Processing events such as PORTE_BROADCAST_RCVD may cause dependency issues
for runtime power management support.  Such a problem would be that
handling a PORTE_BROADCAST_RCVD event requires that the host is resumed to
send SMP commands. However, in resuming the host, the phyup events
generated from re-enabling the phys are processed in the same workqueue as
the original PORTE_BROADCAST_RCVD event. As such, the host will never
finish resuming (as it waits for the phyup event processing), and then the
PORTE_BROADCAST_RCVD event can't be processed as the SMP commands are
blocked, and so we have a deadlock.  Solve this problem by ensuring that
libsas keeps the host active until completely finished phy or port events,
such as PORTE_BYTES_DMAED. As such, we don't have to worry about resuming
the host for processing individual SMP commands in this example.

Link: https://lore.kernel.org/r/1639999298-244569-15-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:31 -05:00
Xiang Chen
bf19aea460 scsi: libsas: Defer works of new phys during suspend
During the processing of event PORT_BYTES_DMAED, the driver queues work
DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q.  If a new
phyup event occurs during resuming the controller, the work
PORTE_BYTES_DMAED of new phy occurs before suspended phy's. The work
DISCE_DISCOVER_DOMAIN of new phy requires an active SAS controller (it
needs to resume SAS controller by function scsi_sysfs_add_sdev() and some
other functions such as function add_device_link()). However, the
activation of the SAS controller requires completion of work
PORTE_BYTES_DMAED of suspended phys while it is blocked by new phy's work
on ha->event_q. So there is a deadlock and it is released only after resume
timeout.

To solve the issue, defer works of new phys during suspend and queue those
defer works after SAS controller becomes active.

Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
1bc35475c6 scsi: libsas: Refactor sas_queue_deferred_work()
In the second part of function __sas_drain_work(), deferred work is queued.
This functionality is required other places so factor it out into the
function sas_queue_deferred_work().

Link: https://lore.kernel.org/r/1639999298-244569-12-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
4ea775abbb scsi: libsas: Add flag SAS_HA_RESUMING
Add a flag SAS_HA_RESUMING and use it to indicate the state of resuming the
host controller.

Link: https://lore.kernel.org/r/1639999298-244569-11-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
0da7ca4c4f scsi: libsas: Resume host while sending SMP I/Os
When sending SMP I/Os to the host we need to ensure that the host is not
suspended and can process the commands. This is a better approach than
replying on the host to resume itself to handle such commands. Use
pm_runtime_get_sync() and pm_runtime_put_sync() calls for the host when
executing SMP I/Os.

Link: https://lore.kernel.org/r/1639999298-244569-10-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
e31e18128e scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host
If a new disk is inserted through an expander when the host was suspended,
it will not necessarily be detected as the topology is not re-scanned
during resume.  To detect possible changes in topology during suspension,
insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a
revalidation.

Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
42159d3c8d scsi: libsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list
Most places that use asd_sas_port->phy_list in libsas are protected by
spinlock asd_sas_port->phy_list_lock. However, there are still a few places
which miss the lock. Add it in those places.

Link: https://lore.kernel.org/r/1639999298-244569-5-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:29 -05:00
John Garry
fbefe22811 scsi: libsas: Don't always drain event workqueue for HA resume
For the hisi_sas driver, if a directly attached disk is removed during
suspend, a hang will occur in the resume process:

The background is that in commit 16fd4a7c59 ("scsi: hisi_sas: Add device
link between SCSI devices and hisi_hba"), it is ensured that the HBA device
cannot be runtime suspended when any SCSI device associated is active.

Other drivers which use libsas don't worry about this as none support
runtime suspend.

The mentioned hang occurs when an disk is removed during suspend. In the
removal process - from PHYE_RESUME_TIMEOUT event processing - we call into
scsi_remove_device(), which is being processed in the HA event workqueue.
Here we wait for all suppliers of the SCSI device to resume, which includes
the HBA device (from the above commit). However the HBA device cannot
resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from
calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.

There does not appear to be any need for the sas_drain_work() to be called
at all in sas_resume_ha() as it is not syncing against anything, so allow
LLDDs to avoid this by providing a variant of sas_resume_ha() which does
"sync", i.e. doesn't drain the event workqueue.

Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:29 -05:00
John Garry
4be6181fea scsi: libsas: Decode SAM status and host byte codes
Value 0 is used for SAM status and libsas exec_status bytes codes in
sas_end_task() - use defined macros instead. In addition, change to proper
enum types.

Also replace SAM_STAT_CHECK_CONDITION with SAS_SAM_STAT_CHECK_CONDITION,
the former being a proper member of enum exec_status.

Link: https://lore.kernel.org/r/1639579061-179473-9-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:58 -05:00
Bart Van Assche
db33028647 scsi: Remove superfluous #include <linux/async.h> directives
Remove this include directive from code that does not use any functionality
from kernel/async.c.

Link: https://lore.kernel.org/r/20211129194609.3466071-13-bvanassche@acm.org
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-29 23:02:15 -05:00
Bart Van Assche
e803bc52b0 scsi: libsas: Call scsi_done() directly
Conditional statements are faster than indirect calls. Hence call
scsi_done() directly.

Link: https://lore.kernel.org/r/20211007202923.2174984-46-bvanassche@acm.org
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:28:48 -04:00
Luo Jiaxing
00aeaf329a scsi: libsas: Export sas_phy_enable()
Export sas_phy_enable() so LLDDs can directly use it to control remote
phys.

We already do this for companion function sas_phy_reset().

Link: https://lore.kernel.org/r/1634041588-74824-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:46:06 -04:00
John Garry
ce4fc333e5 scsi: libsas: Co-locate exports with symbols
It is standard practice to co-locate export declarations with the symbol
which is being exported. Or at least in the same file - see
sas_phy_reset().

Modify libsas to follow this practice consistently.

Link: https://lore.kernel.org/r/1631530296-32358-1-git-send-email-john.garry@huawei.com
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:57:39 -04:00
Bart Van Assche
cad1a780e0 scsi: libsas: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.

Link: https://lore.kernel.org/r/20210809230355.8186-27-bvanassche@acm.org
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-11 22:25:39 -04:00
Guoqing Jiang
0525265e43 scsi: libsas: Drop BLK_DEV_BSGLIB selection
SCSI_SAS_ATTRS already selects BLK_DEV_BSGLIB in drivers/scsi/Kconfig.
Remove selection in libsas/Kconfig.

Link: https://lore.kernel.org/r/20210723084624.2596297-1-guoqing.jiang@linux.dev
Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:42 -04:00
Jason Yan
e15f669cd9 scsi: libsas: Allow libsas to include SCSI header files directly
libsas needs to include some header files in the scsi directory. However
these are currently hardcoded with the path "../" in the C files. Do this
in the Makefile to avoid hardcoding the path.

Link: https://lore.kernel.org/r/20210716074551.771312-1-yanaijie@huawei.com
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-20 23:11:17 -04:00
Gustavo A. R. Silva
223fa873fa scsi: libsas: Fix fall-through warning for Clang
Fix the following fallthrough warning (arm64-randconfig with Clang):

drivers/scsi/libsas/sas_discover.c:467:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-13 13:58:56 -05:00
Linus Torvalds
8b9cc17a46 SCSI misc on 20210711
This is a set of minor fixes and clean ups in the core and various
 drivers.  The only core change in behaviour is the I/O retry for
 spinup notify, but that shouldn't impact anything other than the
 failing case.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYOqWWyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYBtAQCpqVdl
 Axi1SpD6/UuKOgRmboWscoKD8FLHwvLDMRyCRQEAnLu3XdB9HcQrwZOkTG14vrfB
 q2XB5cP4XAITxFLN1qo=
 =9AO9
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is a set of minor fixes and clean ups in the core and various
  drivers.

  The only core change in behaviour is the I/O retry for spinup notify,
  but that shouldn't impact anything other than the failing case"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits)
  scsi: virtio_scsi: Add validation for residual bytes from response
  scsi: ipr: System crashes when seeing type 20 error
  scsi: core: Retry I/O for Notify (Enable Spinup) Required error
  scsi: mpi3mr: Fix warnings reported by smatch
  scsi: qedf: Add check to synchronize abort and flush
  scsi: MAINTAINERS: Add mpi3mr driver maintainers
  scsi: libfc: Fix array index out of bound exception
  scsi: mvsas: Use DEVICE_ATTR_RO()/RW() macro
  scsi: megaraid_mbox: Use DEVICE_ATTR_ADMIN_RO() macro
  scsi: qedf: Use DEVICE_ATTR_RO() macro
  scsi: qedi: Use DEVICE_ATTR_RO() macro
  scsi: message: mptfc: Switch from pci_ to dma_ API
  scsi: be2iscsi: Fix some missing space in some messages
  scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe()
  scsi: ufs: Fix build warning without CONFIG_PM
  scsi: bnx2fc: Remove meaningless bnx2fc_abts_cleanup() return value assignment
  scsi: qla2xxx: Add heartbeat check
  scsi: virtio_scsi: Do not overwrite SCSI status
  scsi: libsas: Add LUN number check in .slave_alloc callback
  scsi: core: Inline scsi_mq_alloc_queue()
  ...
2021-07-11 10:59:53 -07:00
Linus Torvalds
bd31b9efbf SCSI misc on 20210702
This series consists of the usual driver updates (ufs, ibmvfc,
 megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
 elx and mpi3mr being new drivers.  The major core change is a rework
 to drop the status byte handling macros and the old bit shifted
 definitions and the rest of the updates are minor fixes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
 35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
 H0hkJN/Io7enFs5v3WA=
 =zwIa
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series consists of the usual driver updates (ufs, ibmvfc,
  megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
  elx and mpi3mr being new drivers.

  The major core change is a rework to drop the status byte handling
  macros and the old bit shifted definitions and the rest of the updates
  are minor fixes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
  scsi: aha1740: Avoid over-read of sense buffer
  scsi: arcmsr: Avoid over-read of sense buffer
  scsi: ips: Avoid over-read of sense buffer
  scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
  scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
  scsi: elx: libefc: Fix less than zero comparison of a unsigned int
  scsi: elx: efct: Fix pointer error checking in debugfs init
  scsi: elx: efct: Fix is_originator return code type
  scsi: elx: efct: Fix link error for _bad_cmpxchg
  scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
  scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
  scsi: elx: efct: Fix error handling in efct_hw_init()
  scsi: elx: efct: Remove redundant initialization of variable lun
  scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
  scsi: lpfc: Fix build error in lpfc_scsi.c
  scsi: target: iscsi: Remove redundant continue statement
  scsi: qla4xxx: Remove redundant continue statement
  scsi: ppa: Switch to use module_parport_driver()
  scsi: imm: Switch to use module_parport_driver()
  scsi: mpt3sas: Fix error return value in _scsih_expander_add()
  ...
2021-07-02 15:14:36 -07:00
Yufen Yu
49da96d779 scsi: libsas: Add LUN number check in .slave_alloc callback
Offlining a SATA device connected to a hisi SAS controller and then
scanning the host will result in detecting 255 non-existent devices:

  # lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
  # echo "offline" > /sys/block/sdb/device/state
  # echo "- - -" > /sys/class/scsi_host/host2/scan
  # lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
  ...
  [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb

After a REPORT LUN command issued to the offline device fails, the SCSI
midlayer tries to do a sequential scan of all devices whose LUN number is
not 0. However, SATA does not support LUN numbers at all.

Introduce a generic sas_slave_alloc() handler which will return -ENXIO for
SATA devices if the requested LUN number is larger than 0 and make libsas
drivers use this function as their .slave_alloc callback.

Link: https://lore.kernel.org/r/20210622034037.1467088-1-yuyufen@huawei.com
Reported-by: Wu Bo <wubo40@huawei.com>
Suggested-by: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-22 21:33:33 -04:00