Commit graph

4551 commits

Author SHA1 Message Date
Linus Torvalds
39714efc23 ATA changes for 6.7-rc1
- Modify the AHCI driver to print the link power management policy used
    on scan, to help with debugging issues (Niklas).
 
  - Add support for the ASM2116 series adapters to the AHCI driver
    (Szuying).
 
  - Prepare libata for the coming gcc and Clang __counted_by attribute
    (Kees).
 
  - Following the recent estensive fixing of libata suspend/resume
    handling, several patches further cleanup and improve disk power state
    management (from me).
 
  - Reduce the verbosity of some error messages for non-fatal temporary
    errors, e.g. slow response to device reset when scanning a port, and
    warning messages that are in fact normal, e.g. disabling a device
    on suspend or when removing it (from me).
 
  - Cleanup DMA helper functions (from me).
 
  - Fix sata_mv drive handling of potential errors durring probe (Ma).
 
  - Cleanup the xgene and imx drivers using the functions
    of_device_get_match_data() and device_get_match_data() (Rob).
 
  - Improve the tegra driver device tree (Rob).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZT9PtwAKCRDdoc3SxdoY
 diCGAQCbSOB3PCpsm1cCMevZeEbt+zUSx+sj6twR12RohtQK6wEA1rEytQllZyqa
 Pk/jIH2UYZaf+lVuV3Vbey4rwDYXTQs=
 =V53T
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA updates from Damien Le Moal:

 - Modify the AHCI driver to print the link power management policy used
   on scan, to help with debugging issues (Niklas)

 - Add support for the ASM2116 series adapters to the AHCI driver
   (Szuying)

 - Prepare libata for the coming gcc and Clang __counted_by attribute
   (Kees)

 - Following the recent estensive fixing of libata suspend/resume
   handling, several patches further cleanup and improve disk power
   state management (me)

 - Reduce the verbosity of some error messages for non-fatal temporary
   errors, e.g. slow response to device reset when scanning a port, and
   warning messages that are in fact normal, e.g. disabling a device on
   suspend or when removing it (me)

 - Cleanup DMA helper functions (me)

 - Fix sata_mv drive handling of potential errors durring probe (Ma)

 - Cleanup the xgene and imx drivers using the functions
   of_device_get_match_data() and device_get_match_data() (Rob)

 - Improve the tegra driver device tree (Rob)

* tag 'ata-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (22 commits)
  dt-bindings: ata: tegra: Disallow undefined properties
  ata: libata-core: Improve ata_dev_power_set_active()
  ata: libata-eh: Spinup disk on resume after revalidation
  ata: imx: Use device_get_match_data()
  ata: xgene: Use of_device_get_match_data()
  ata: sata_mv: aspeed: fix value check in mv_platform_probe()
  ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
  ata: libata: Cleanup inline DMA helper functions
  ata: libata-eh: Reduce "disable device" message verbosity
  ata: libata-eh: Improve reset error messages
  ata: libata-sata: Improve ata_sas_slave_configure()
  ata: libata-core: Do not resume runtime suspended ports
  ata: libata-core: Do not poweroff runtime suspended ports
  ata: libata-core: Remove ata_port_resume_async()
  ata: libata-core: Remove ata_port_suspend_async()
  ata: libata-core: Detach a port devices on shutdown
  ata: libata-core: Synchronize ata_port_detach() with hotplug
  ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
  scsi: Remove scsi device no_start_on_resume flag
  ata: libata: Annotate struct ata_cpr_log with __counted_by
  ...
2023-11-01 12:50:12 -10:00
Damien Le Moal
24eca2dce0 scsi: sd: Introduce manage_shutdown device flag
Commit aa3998dbeb ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") change setting the manage_system_start_stop
flag to false for libata managed disks to enable libata internal
management of disk suspend/resume. However, a side effect of this change
is that on system shutdown, disks are no longer being stopped (set to
standby mode with the heads unloaded). While this is not a critical
issue, this unclean shutdown is not recommended and shows up with
increased smart counters (e.g. the unexpected power loss counter
"Unexpect_Power_Loss_Ct").

Instead of defining a shutdown driver method for all ATA adapter
drivers (not all of them define that operation), this patch resolves
this issue by further refining the sd driver start/stop control of disks
using the new flag manage_shutdown. If this new flag is set to true by
a low level driver, the function sd_shutdown() will issue a
START STOP UNIT command with the start argument set to 0 when a disk
needs to be powered off (suspended) on system power off, that is, when
system_state is equal to SYSTEM_POWER_OFF.

Similarly to the other manage_xxx flags, the new manage_shutdown flag is
exposed through sysfs as a read-write device attribute.

To avoid any confusion between manage_shutdown and
manage_system_start_stop, the comments describing these flags in
include/scsi/scsi.h are also improved.

Fixes: aa3998dbeb ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038
Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-27 10:00:19 +09:00
Damien Le Moal
2da4c5e24e ata: libata-core: Improve ata_dev_power_set_active()
Improve the function ata_dev_power_set_active() by having it do nothing
for a disk that is already in the active power state. To do that,
introduce the function ata_dev_power_is_active() to test the current
power state of the disk and return true if the disk is in the PM0:
active or PM1: idle state (0xff value for the count field of the CHECK
POWER MODE command output).

To preserve the existing behavior, if the CHECK POWER MODE command
issued in ata_dev_power_is_active() fails, the drive is assumed to be in
standby mode and false is returned.

With this change, issuing the VERIFY command to access the disk media to
spin it up becomes unnecessary most of the time during system resume as
the port reset done by libata-eh on resume often result in the drive to
spin-up (this behavior is not clearly defined by the ACS specifications
and may thus vary between disk models).

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
2023-10-16 07:03:02 +09:00
Damien Le Moal
54d7211da7 ata: libata-eh: Spinup disk on resume after revalidation
Move the call to ata_dev_power_set_active() to transition a disk in
standby power mode to the active power mode from
ata_eh_revalidate_and_attach() before doing revalidation to the end of
ata_eh_recover(), after the link speed for the device is reconfigured
(if that was necessary). This is safer as this ensure that the VERIFY
command executed to spinup the disk is executed with the drive properly
reconfigured first.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
2023-10-13 12:46:03 +09:00
Rob Herring
0e7ad4bba2 ata: imx: Use device_get_match_data()
Use preferred device_get_match_data() instead of of_match_device() to
get the driver match data. With this, adjust the includes to explicitly
include the correct headers.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-11 17:54:05 +09:00
Rob Herring
0ecbef3125 ata: xgene: Use of_device_get_match_data()
Use preferred of_device_get_match_data() instead of of_match_device() to
get the driver match data. With this, adjust the includes to explicitly
include the correct headers.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-11 17:53:37 +09:00
Ma Ke
2267d5a146 ata: sata_mv: aspeed: fix value check in mv_platform_probe()
In mv_platform_probe(), check the return value of clk_prepare_enable()
and return the error code if clk_prepare_enable() returns an
unexpected value.

Signed-off-by: Ma Ke <make_ruc2021@163.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-11 17:51:53 +09:00
Ondrej Zary
0c1e81d0b5 ata: pata_parport: fit3: implement IDE command set registers
fit3 protocol driver does not support accessing IDE control registers
(device control/altstatus). The DOS driver does not use these registers
either (as observed from DOSEMU trace). But the HW seems to be capable
of accessing these registers - I simply tried bit 3 and it works!

The control register is required to properly reset ATAPI devices or
they will be detected only once (after a power cycle).

Tested with EXP Computer CD-865 with MC-1285B EPP cable and
TransDisk 3000.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-10 14:42:22 +09:00
Ondrej Zary
f343e578fe ata: pata_parport: add custom version of wait_after_reset
Some parallel adapters (e.g. EXP Computer MC-1285B EPP Cable) return
bogus values when there's no master device present. This can cause
reset to fail, preventing the lone slave device (such as EXP Computer
CD-865) from working.

Add custom version of wait_after_reset that ignores master failure when
a slave device is present. The custom version is also needed because
the generic ata_sff_wait_after_reset uses direct port I/O for slave
device detection.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-10 14:42:22 +09:00
Ondrej Zary
d2302427c1 ata: pata_parport: implement set_devctl
Add missing ops->sff_set_devctl implementation.

Fixes: 246a1c4c6b ("ata: pata_parport: add driver (PARIDE replacement)")
Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-10 14:42:22 +09:00
Ondrej Zary
b555aa6676 ata: pata_parport: fix pata_parport_devchk
There's a 'x' missing in 0x55 in pata_parport_devchk(), causing the
detection to always fail. Fix it.

Fixes: 246a1c4c6b ("ata: pata_parport: add driver (PARIDE replacement)")
Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-10 14:42:22 +09:00
Mika Westerberg
b8b8b4e0c0 ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
Intel Alder Lake-P AHCI controller needs to be added to the mobile
chipsets list in order to have link power management enabled. Without
this the CPU cannot enter lower power C-states making idle power
consumption high.

Cc: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-03 09:59:15 +09:00
Damien Le Moal
0fecb50891 ata: libata-eh: Reduce "disable device" message verbosity
There is no point in warning about a device being disabled when we
expect it to be, that is, on suspend, shutdown or when detaching the
device.

Suppress the message "disable device" for these cases by introducing the
EH static function ata_eh_dev_disable() and by using it in
ata_eh_unload() and ata_eh_detach_dev(). ata_dev_disable() code is
modified to call this new function after printing the "disable device"
message.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:50 +09:00
Damien Le Moal
7f95731c74 ata: libata-eh: Improve reset error messages
Some drives are really slow to spinup on resume, resulting is a very
slow response to COMRESET and to error messages such as:

ata1: COMRESET failed (errno=-16)
ata1: link is slow to respond, please be patient (ready=0)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: configured for UDMA/133

Given that the slowness of the response is indicated with the message
"link is slow to respond..." and that resets are retried until the
device is detected as online after up to 1min (ata_eh_reset_timeouts),
there is no point in printing the "COMRESET failed" error message. Let's
not scare the user with non fatal errors and only warn about reset
failures in ata_eh_reset() when all reset retries have been exhausted.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:50 +09:00
Damien Le Moal
88b9f89286 ata: libata-sata: Improve ata_sas_slave_configure()
Change ata_sas_slave_configure() to return the return value of
ata_scsi_dev_config() to ensure that any error from that function is
propagated to libsas.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:50 +09:00
Damien Le Moal
3341b82368 ata: libata-core: Do not resume runtime suspended ports
The scsi disk driver does not resume disks that have been runtime
suspended by the user. To be consistent with this behavior, do the same
for ata ports and skip the PM request in ata_port_pm_resume() if the
port was already runtime suspended. With this change, it is no longer
necessary to force the PM state of the port to ACTIVE as the PM core
code will take care of that when handling runtime resume.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:50 +09:00
Damien Le Moal
3a94af2488 ata: libata-core: Do not poweroff runtime suspended ports
When powering off, there is no need to suspend a port that has already
been runtime suspended. Skip the EH PM request in ata_port_pm_poweroff()
in this case.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Damien Le Moal
09b055cfb0 ata: libata-core: Remove ata_port_resume_async()
Remove ata_port_resume_async() and replace it with a modified
ata_port_resume() taking an additional bool argument indicating if
ata EH resume operation should be executed synchronously or
asynchronously. With this change, the variable ata_port_resume_ehi is
not longer necessary and its value (ATA_EHI_XXX flags) passed directly
to ata_port_request_pm().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Damien Le Moal
6702255d70 ata: libata-core: Remove ata_port_suspend_async()
ata_port_suspend_async() is only called by ata_sas_port_suspend().
Modify ata_port_suspend() with an additional bool argument indicating an
asynchronous or synchronous suspend to allow removing that helper
function. With this change, the variable ata_port_resume_ehi can also be
removed and its value (ATA_EHI_XXX flags passed directly to
ata_port_request_pm().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Damien Le Moal
5b6fba546d ata: libata-core: Detach a port devices on shutdown
Modify ata_pci_shutdown_one() to schedule EH to unload a port devices
before freezing and thawing the port. This ensures that drives are
cleanly disabled and transitioned to standby power mode when
a PCI adapter is removed or the system is powered off.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Damien Le Moal
cfead0dd81 ata: libata-core: Synchronize ata_port_detach() with hotplug
The call to async_synchronize_cookie() to synchronize a port removal
and hotplug probe is done in ata_host_detach() right before calling
ata_port_detach(). Move this call at the beginning of ata_port_detach()
to ensure that this operation is always synchronized with probe.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Damien Le Moal
8c1f081706 ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
Now that libata does its own internal device power mode management
through libata EH, the scsi disk driver will not issue START STOP UNIT
commands anymore. We can receive this command only from user passthrough
operations. So there is no need to consider the system state and ATA
port flags for suspend to translate the command.

Since setting up the taskfile for the verify and standby
immediate commands is the same as done in ata_dev_power_set_active()
and ata_dev_power_set_standby(), factor out this code into the helper
function ata_dev_power_init_tf() to simplify ata_scsi_start_stop_xlat()
as well as ata_dev_power_set_active() and ata_dev_power_set_standby().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-03 09:39:49 +09:00
Szuying Chen
3bf6141060 ata: ahci: add identifiers for ASM2116 series adapters
Add support for PCIe SATA adapter cards based on Asmedia 2116 controllers.
These cards can provide up to 10 SATA ports on PCIe card.

Signed-off-by: Szuying Chen <Chloe_Chen@asmedia.com.tw>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-03 09:39:49 +09:00
Niklas Cassel
affccb16c1 ata: ahci: print the lpm policy on boot
The target LPM policy can be set using either a Kconfig or a kernel module
parameter.

However, if the board type is set to anything but board_ahci_low_power,
then the LPM policy will overridden and set to ATA_LPM_UNKNOWN.

Additionally, if the default suspend is suspend to idle, depending on the
hardware capabilities of the HBA, ahci_update_initial_lpm_policy() might
override the LPM policy to either ATA_LPM_MIN_POWER_WITH_PARTIAL or
ATA_LPM_MIN_POWER.

All this means that it is very hard to know which LPM policy a user will
actually be using on a given system.

In order to make it easier to debug LPM related issues, print the LPM
policy on boot.

One common LPM related issue is that the device fails to link up.
Because of that, we cannot add this print to ata_dev_configure(), as that
function is only called after a successful link up. Instead, add the info
using ata_port_desc(), with the help of a new ata_port_desc_misc() helper.
The port description is printed once per port during boot.

Before changes:
ata1: SATA max UDMA/133 abar m524288@0xa5780000 port 0xa5780100 irq 170
ata2: SATA max UDMA/133 abar m524288@0xa5780000 port 0xa5780180 irq 170

After changes:
ata1: SATA max UDMA/133 abar m524288@0xa5780000 port 0xa5780100 irq 170 lpm-pol 4
ata2: SATA max UDMA/133 abar m524288@0xa5780000 port 0xa5780180 irq 170 lpm-pol 4

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-10-03 09:39:49 +09:00
Linus Torvalds
95289e49f0 ATA fixes for 6.6-rc4
A larger than usual set of fixes for 6.6-rc4 due to the unexpected
 number of fixes needed to address ATA disks suspend/resume issues.
 In more details:
 
  - Add missing additionalProperties on child nodes to the pata-common DT
    bindings (Rob).
 
  - Fix handling of the REPORT SUPPORTED OPERATION CODES command to
    ignore reserved bits (Niklas).
 
  - Increase port multiplier soft reset timeout to accomodate slow
    devices and avoid issues on wakeup (Matthias).
 
  - A couple of minor code fixes to avoid compilation warnings in
    libata-core and libata-eh (me).
 
  - Many patches from me to address suspend/resume issues, and in
    particular a potential deadlock on resume due to the SCSI disk driver
    resume operation not being synchronized with libata EH port resume
    handling.  This is addressed by changing the scsi disk driver disk
    start/stop control to allow libata to execute disk suspend (spin
    down) and resume (spin up) on its own during system suspend/resume.
    Runtime suspend/resume control remains with the SCSI disk driver.
    Other fixes include:
     - Fix libata power management request issuing to avoid races.
     - Establish a link between ATA ports and SCSI devices to order PM
       operations.
     - Fix device removal to avoid issues with driver rmmod removal.
     - Fix synchronization of libata device rescan and SCSI disk resume
       operation.
     - Remove libsas PM operations as suspend/resume is handled directly
       by the sas controller resume.
     - Fix the SCSI disk driver to not issue commands to suspended disks,
       thus avoiding potential system lock-up on resume.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZRbR0gAKCRDdoc3SxdoY
 dkArAP9PFTRgsXEwfE7arBXCwQqXj/W0R2KgKug7Fno+SoQLnAD/ZKe2TR50uwxr
 9mwYROdMgi50T9ax1RX1jWA0npGXmQg=
 =cFzG
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA fixes from Damien Le Moal:
 "A larger than usual set of fixes for 6.6-rc4 due to the unexpected
  number of fixes needed to address ATA disks suspend/resume issues.

  In more detail:

   - Add missing additionalProperties on child nodes to the pata-common
     DT bindings (Rob)

   - Fix handling of the REPORT SUPPORTED OPERATION CODES command to
     ignore reserved bits (Niklas)

   - Increase port multiplier soft reset timeout to accomodate slow
     devices and avoid issues on wakeup (Matthias)

   - A couple of minor code fixes to avoid compilation warnings in
     libata-core and libata-eh (me)

   - Many patches from me to address suspend/resume issues, and in
     particular a potential deadlock on resume due to the SCSI disk
     driver resume operation not being synchronized with libata EH port
     resume handling.

     This is addressed by changing the scsi disk driver disk start/stop
     control to allow libata to execute disk suspend (spin down) and
     resume (spin up) on its own during system suspend/resume. Runtime
     suspend/resume control remains with the SCSI disk driver.

     Other fixes include:
      - Fix libata power management request issuing to avoid races
      - Establish a link between ATA ports and SCSI devices to order PM
        operations
      - Fix device removal to avoid issues with driver rmmod removal
      - Fix synchronization of libata device rescan and SCSI disk resume
        operation
      - Remove libsas PM operations as suspend/resume is handled
        directly by the sas controller resume
      - Fix the SCSI disk driver to not issue commands to suspended
        disks, thus avoiding potential system lock-up on resume"

* tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata-eh: Fix compilation warning in ata_eh_link_report()
  ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
  scsi: sd: Do not issue commands to suspended disks on shutdown
  ata: libata-core: Do not register PM operations for SAS ports
  ata: libata-scsi: Fix delayed scsi_rescan_device() execution
  scsi: Do not attempt to rescan suspended devices
  ata: libata-scsi: Disable scsi device manage_system_start_stop
  scsi: sd: Differentiate system and runtime start/stop management
  ata: libata-scsi: link ata port and scsi device
  ata: libata-core: Fix port and device removal
  ata: libata-core: Fix ata_port_request_pm() locking
  ata: libata-sata: increase PMP SRST timeout to 10s
  ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
  dt-bindings: ata: pata-common: Add missing additionalProperties on child nodes
2023-09-29 13:38:34 -07:00
Damien Le Moal
49728bdc70 ata: libata-eh: Fix compilation warning in ata_eh_link_report()
The 6 bytes length of the tries_buf string in ata_eh_link_report() is
too short and results in a gcc compilation warning with W-!:

drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’:
drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=]
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                                                           ^~
drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4]
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                                                        ^~~~~~
drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2372 |                          ap->eh_tries);
      |                          ~~~~~~~~~~~~~

Avoid this warning by increasing the string size to 16B.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:24:18 +09:00
Damien Le Moal
ed518d9ba9 ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
The 24 bytes length allocated to the ncq_desc string in
ata_dev_config_lba() for ata_dev_config_ncq() to use is too short,
causing the following gcc compilation warnings when compiling with W=1:

drivers/ata/libata-core.c: In function ‘ata_dev_configure’:
drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=]
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                                                        ^~
In function ‘ata_dev_config_ncq’,
    inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8,
    inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9:
drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32]
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                                         ^~~~~~~~~~~~~~~~~~~~~
drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2379 |                         ddepth, aa_desc);
      |                         ~~~~~~~~~~~~~~~~

Avoid these warnings and the potential truncation by changing the size
of the ncq_desc string to 32 characters.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:24:10 +09:00
Damien Le Moal
75e2bd5f1e ata: libata-core: Do not register PM operations for SAS ports
libsas does its own domain based power management of ports. For such
ports, libata should not use a device type defining power management
operations as executing these operations for suspend/resume in addition
to libsas calls to ata_sas_port_suspend() and ata_sas_port_resume() is
not necessary (and likely dangerous to do, even though problems are not
seen currently).

Introduce the new ata_port_sas_type device_type for ports managed by
libsas. This new device type is used in ata_tport_add() and is defined
without power management operations.

Fixes: 2fcbdcb4c8 ("[SCSI] libata: export ata_port suspend/resume infrastructure for sas")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:23:14 +09:00
Damien Le Moal
8b4d9469d0 ata: libata-scsi: Fix delayed scsi_rescan_device() execution
Commit 6aa0365a3c ("ata: libata-scsi: Avoid deadlock on rescan after
device resume") modified ata_scsi_dev_rescan() to check the scsi device
"is_suspended" power field to ensure that the scsi device associated
with an ATA device is fully resumed when scsi_rescan_device() is
executed. However, this fix is problematic as:
1) It relies on a PM internal field that should not be used without PM
   device locking protection.
2) The check for is_suspended and the call to scsi_rescan_device() are
   not atomic and a suspend PM event may be triggered between them,
   casuing scsi_rescan_device() to be called on a suspended device and
   in that function blocking while holding the scsi device lock. This
   would deadlock a following resume operation.
These problems can trigger PM deadlocks on resume, especially with
resume operations triggered quickly after or during suspend operations.
E.g., a simple bash script like:

for (( i=0; i<10; i++ )); do
	echo "+2 > /sys/class/rtc/rtc0/wakealarm
	echo mem > /sys/power/state
done

that triggers a resume 2 seconds after starting suspending a system can
quickly lead to a PM deadlock preventing the system from correctly
resuming.

Fix this by replacing the check on is_suspended with a check on the
return value given by scsi_rescan_device() as that function will fail if
called against a suspended device. Also make sure rescan tasks already
scheduled are first cancelled before suspending an ata port.

Fixes: 6aa0365a3c ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:23:10 +09:00
Damien Le Moal
aa3998dbeb ata: libata-scsi: Disable scsi device manage_system_start_stop
The introduction of a device link to create a consumer/supplier
relationship between the scsi device of an ATA device and the ATA port
of that ATA device fixes the ordering of system suspend and resume
operations. For suspend, the scsi device is suspended first and the ata
port after it. This is fine as this allows the synchronize cache and
START STOP UNIT commands issued by the scsi disk driver to be executed
before the ata port is disabled.

For resume operations, the ata port is resumed first, followed
by the scsi device. This allows having the request queue of the scsi
device to be unfrozen after the ata port resume is scheduled in EH,
thus avoiding to see new requests prematurely issued to the ATA device.
Since libata sets manage_system_start_stop to 1, the scsi disk resume
operation also results in issuing a START STOP UNIT command to the
device being resumed so that the device exits standby power mode.

However, restoring the ATA device to the active power mode must be
synchronized with libata EH processing of the port resume operation to
avoid either 1) seeing the start stop unit command being received too
early when the port is not yet resumed and ready to accept commands, or
after the port resume process issues commands such as IDENTIFY to
revalidate the device. In this last case, the risk is that the device
revalidation fails with timeout errors as the drive is still spun down.

Commit 0a85890559 ("ata,scsi: do not issue START STOP UNIT on resume")
disabled issuing the START STOP UNIT command to avoid issues with it.
But this is incorrect as transitioning a device to the active power
mode from the standby power mode set on suspend requires a media access
command. The IDENTIFY, READ LOG and SET FEATURES commands executed in
libata EH context triggered by the ata port resume operation may thus
fail.

Fix these synchronization issues is by handling a device power mode
transitions for system suspend and resume directly in libata EH context,
without relying on the scsi disk driver management triggered with the
manage_system_start_stop flag.

To do this, the following libata helper functions are introduced:

1) ata_dev_power_set_standby():

This function issues a STANDBY IMMEDIATE command to transitiom a device
to the standby power mode. For HDDs, this spins down the disks. This
function applies only to ATA and ZAC devices and does nothing otherwise.
This function also does nothing for devices that have the
ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag
set.

For suspend, call ata_dev_power_set_standby() in
ata_eh_handle_port_suspend() before the port is disabled and frozen.
ata_eh_unload() is also modified to transition all enabled devices to
the standby power mode when the system is shutdown or devices removed.

2) ata_dev_power_set_active() and

This function applies to ATA or ZAC devices and issues a VERIFY command
for 1 sector at LBA 0 to transition the device to the active power mode.
For HDDs, since this function will complete only once the disk spin up.
Its execution uses the same timeouts as for reset, to give the drive
enough time to complete spinup without triggering a command timeout.

For resume, call ata_dev_power_set_active() in
ata_eh_revalidate_and_attach() after the port has been enabled and
before any other command is issued to the device.

With these changes, the manage_system_start_stop and no_start_on_resume
scsi device flags do not need to be set in ata_scsi_dev_config(). The
flag manage_runtime_start_stop is still set to allow the sd driver to
spinup/spindown a disk through the sd runtime operations.

Fixes: 0a85890559 ("ata,scsi: do not issue START STOP UNIT on resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:23:03 +09:00
Damien Le Moal
3cc2ffe5c1 scsi: sd: Differentiate system and runtime start/stop management
The underlying device and driver of a SCSI disk may have different
system and runtime power mode control requirements. This is because
runtime power management affects only the SCSI disk, while system level
power management affects all devices, including the controller for the
SCSI disk.

For instance, issuing a START STOP UNIT command when a SCSI disk is
runtime suspended and resumed is fine: the command is translated to a
STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY
command to wake it up. The SCSI disk runtime operations have no effect
on the ata port device used to connect the ATA disk. However, for
system suspend/resume operations, the ATA port used to connect the
device will also be suspended and resumed, with the resume operation
requiring re-validating the device link and the device itself. In this
case, issuing a VERIFY command to spinup the disk must be done before
starting to revalidate the device, when the ata port is being resumed.
In such case, we must not allow the SCSI disk driver to issue START STOP
UNIT commands.

Allow a low level driver to refine the SCSI disk start/stop management
by differentiating system and runtime cases with two new SCSI device
flags: manage_system_start_stop and manage_runtime_start_stop. These new
flags replace the current manage_start_stop flag. Drivers setting the
manage_start_stop are modifed to set both new flags, thus preserving the
existing start/stop management behavior. For backward compatibility, the
old manage_start_stop sysfs device attribute is kept as a read-only
attribute showing a value of 1 for devices enabling both new flags and 0
otherwise.

Fixes: 0a85890559 ("ata,scsi: do not issue START STOP UNIT on resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:23:00 +09:00
Damien Le Moal
fb99ef1786 ata: libata-scsi: link ata port and scsi device
There is no direct device ancestry defined between an ata_device and
its scsi device which prevents the power management code from correctly
ordering suspend and resume operations. Create such ancestry with the
ata device as the parent to ensure that the scsi device (child) is
suspended before the ata device and that resume handles the ata device
before the scsi device.

The parent-child (supplier-consumer) relationship is established between
the ata_port (parent) and the scsi device (child) with the function
device_add_link(). The parent used is not the ata_device as the PM
operations are defined per port and the status of all devices connected
through that port is controlled from the port operations.

The device link is established with the new function
ata_scsi_slave_alloc(), and this function is used to define the
->slave_alloc callback of the scsi host template of all ata drivers.

Fixes: a19a93e4c6 ("scsi: core: pm: Rely on the device driver core for async power management")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
2023-09-28 21:22:57 +09:00
Damien Le Moal
84d76529c6 ata: libata-core: Fix port and device removal
Whenever an ATA adapter driver is removed (e.g. rmmod),
ata_port_detach() is called repeatedly for all the adapter ports to
remove (unload) the devices attached to the port and delete the port
device itself. Removing of devices is done using libata EH with the
ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute
ata_eh_unload() which disables all devices attached to the port.

ata_port_detach() finishes by calling scsi_remove_host() to remove the
scsi host associated with the port. This function will trigger the
removal of all scsi devices attached to the host and in the case of
disks, calls to sd_shutdown() which will flush the device write cache
and stop the device. However, given that the devices were already
disabled by ata_eh_unload(), the synchronize write cache command and
start stop unit commands fail. E.g. running "rmmod ahci" with first
removing sd_mod results in error messages like:

ata13.00: disable device
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] Stopping disk
sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

Fix this by removing all scsi devices of the ata devices connected to
the port before scheduling libata EH to disable the ATA devices.

Fixes: 720ba12620 ("[PATCH] libata-hp: update unload-unplug")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28 21:22:53 +09:00
Damien Le Moal
3b8e0af4a7 ata: libata-core: Fix ata_port_request_pm() locking
The function ata_port_request_pm() checks the port flag
ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to
ensure that power management operations for a port are not scheduled
simultaneously. However, this flag check is done without holding the
port lock.

Fix this by taking the port lock on entry to the function and checking
the flag under this lock. The lock is released and re-taken if
ata_port_wait_eh() needs to be called. The two WARN_ON() macros checking
that the ATA_PFLAG_PM_PENDING flag was cleared are removed as the first
call is racy and the second one done without holding the port lock.

Fixes: 5ef4108291 ("ata: add ata port system PM callbacks")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
2023-09-28 21:22:50 +09:00
Linus Torvalds
633b47cb00 SCSI fixes on 20230927
Single fix for libata: older devices don't support command duration
 limits (CDL) and some don't support report opcodes, meaning there's no
 way to tell if they support the command or not. Reduce the problems of
 incorrectly using CDL commands on older devices by checking SCSI spec
 compliance at SPC-5 (the spec which introduced the command) before
 turning on CDL.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZRQglCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYJdAP4rjE8a
 /X3Vs7C0PoFDl6HlkN3w4Eeq54vLMmxNez2tywEA6cB3RdwG58g34p8wBt7Lb6UI
 1HAIhRub2mpHZyQH0/U=
 =qq6Q
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "A single fix for libata: older devices don't support command duration
  limits (CDL) and some don't support report opcodes, meaning there's no
  way to tell if they support the command or not.

  Reduce the problems of incorrectly using CDL commands on older devices
  by checking SCSI spec compliance at SPC-5 (the spec which introduced
  the command) before turning on CDL"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: ata: Do no try to probe for CDL on old drives
2023-09-27 09:58:02 -07:00
Niklas Cassel
3ef6009235 ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
For REPORT SUPPORTED OPERATION CODES command, the service action field is
defined as bits 0-4 in the second byte in the CDB. Bits 5-7 in the second
byte are reserved.

Only look at the service action field in the second byte when determining
if the MAINTENANCE IN opcode is a REPORT SUPPORTED OPERATION CODES command.

This matches how we only look at the service action field in the second
byte when determining if the SERVICE ACTION IN(16) opcode is a READ
CAPACITY(16) command (reserved bits 5-7 in the second byte are ignored).

Fixes: 7b20309428 ("libata: Add support for SCT Write Same")
Cc: stable@vger.kernel.org
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-25 15:51:16 +09:00
Damien Le Moal
2132df16f5 scsi: core: ata: Do no try to probe for CDL on old drives
Some old drives (e.g. an Ultra320 SCSI disk as reported by John) do not
seem to execute MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
commands correctly and hang when a non-zero service action is specified
(one command format with service action case in scsi_report_opcode()).

Currently, CDL probing with scsi_cdl_check_cmd() is the only caller using a
non zero service action for scsi_report_opcode(). To avoid issues with
these old drives, do not attempt CDL probe if the device reports support
for an SPC version lower than 5 (CDL was introduced in SPC-5). To keep
things working with ATA devices which probe for the CDL T2A and T2B pages
introduced with SPC-6, modify ata_scsiop_inq_std() to claim SPC-6 version
compatibility for ATA drives supporting CDL.

SPC-6 standard version number is defined as Dh (= 13) in SPC-6 r09. Fix
scsi_probe_lun() to correctly capture this value by changing the bit mask
for the second byte of the INQUIRY response from 0x7 to 0xf.
include/scsi/scsi.h is modified to add the definition SCSI_SPC_6 with the
value 14 (Dh + 1). The missing definitions for the SCSI_SPC_4 and
SCSI_SPC_5 versions are also added.

Reported-by: John David Anglin <dave.anglin@bell.net>
Fixes: 624885209f ("scsi: core: Detect support for command duration limits")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230915022034.678121-1-dlemoal@kernel.org
Tested-by: David Gow <david@davidgow.net>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-21 21:07:23 -04:00
Niklas Cassel
5e35a9ac3f ata: libata-core: fetch sense data for successful commands iff CDL enabled
Currently, we fetch sense data for a _successful_ command if either:
1) Command was NCQ and ATA_DFLAG_CDL_ENABLED flag set (flag
   ATA_DFLAG_CDL_ENABLED will only be set if the Successful NCQ command
   sense data supported bit is set); or
2) Command was non-NCQ and regular sense data reporting is enabled.

This means that case 2) will trigger for a non-NCQ command which has
ATA_SENSE bit set, regardless if CDL is enabled or not.

This decision was by design. If the device reports that it has sense data
available, it makes sense to fetch that sense data, since the sk/asc/ascq
could be important information regardless if CDL is enabled or not.

However, the fetching of sense data for a successful command is done via
ATA EH. Considering how intricate the ATA EH is, we really do not want to
invoke ATA EH unless absolutely needed.

Before commit 18bd7718b5 ("scsi: ata: libata: Handle completion of CDL
commands using policy 0xD") we never fetched sense data for successful
commands.

In order to not invoke the ATA EH unless absolutely necessary, even if the
device claims support for sense data reporting, only fetch sense data for
successful (NCQ and non-NCQ commands) commands that are using CDL.

[Damien] Modified the check to test the qc flag ATA_QCFLAG_HAS_CDL
instead of the device support for CDL, which is implied for commands
using CDL.

Fixes: 3ac873c76d ("ata: libata-core: fix when to fetch sense data for successful commands")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-16 21:12:19 +09:00
Niklas Cassel
7a3bc2b398 ata: libata-eh: do not thaw the port twice in ata_eh_reset()
commit 1e641060c4 ("libata: clear eh_info on reset completion") added
a workaround that broke the retry mechanism in ATA EH.

Tejun himself suggested to remove this workaround when it was identified
to cause additional problems:
https://lore.kernel.org/linux-ide/20110426135027.GI878@htj.dyndns.org/

He even said:
"Hmm... it seems I wasn't thinking straight when I added that work around."
https://lore.kernel.org/linux-ide/20110426155229.GM878@htj.dyndns.org/

While removing the workaround solved the issue, however, the workaround was
kept to avoid "spurious hotplug events during reset", and instead another
workaround was added on top of the existing workaround in commit
8c56cacc72 ("libata: fix unexpectedly frozen port after ata_eh_reset()").

Because these IRQs happened when the port was frozen, we know that they
were actually a side effect of PxIS and IS.IPS(x) not being cleared before
the COMRESET. This is now done in commit 94152042eaa9 ("ata: libahci: clear
pending interrupt status"), so these workarounds can now be removed.

Since commit 1e641060c4 ("libata: clear eh_info on reset completion") has
now been reverted, the ATA EH retry mechanism is functional again, so there
is once again no need to thaw the port more than once in ata_eh_reset().

This reverts "the workaround on top of the workaround" introduced in commit
8c56cacc72 ("libata: fix unexpectedly frozen port after ata_eh_reset()").

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-16 21:11:28 +09:00
Niklas Cassel
80cc944eca ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING,
before calling ap->ops->error_handler() (without holding the ap->lock).

If an error IRQ is received while ap->ops->error_handler() is running,
the irq handler will set ATA_PFLAG_EH_PENDING.

Once ap->ops->error_handler() returns, ata_scsi_port_error_handler()
checks if ATA_PFLAG_EH_PENDING is set, and if it is, another iteration
of ATA EH is performed.

The problem is that ATA_PFLAG_EH_PENDING is not only cleared by
ata_scsi_port_error_handler(), it is also cleared by ata_eh_reset().

ata_eh_reset() is called by ap->ops->error_handler(). This additional
clearing done by ata_eh_reset() breaks the whole retry logic in
ata_scsi_port_error_handler(). Thus, if an error IRQ is received while
ap->ops->error_handler() is running, the port will currently remain
frozen and will never get re-enabled.

The additional clearing in ata_eh_reset() was introduced in commit
1e641060c4 ("libata: clear eh_info on reset completion").

Looking at the original error report:
https://marc.info/?l=linux-ide&m=124765325828495&w=2

We can see the following happening:
[    1.074659] ata3: XXX port freeze
[    1.074700] ata3: XXX hardresetting link, stopping engine
[    1.074746] ata3: XXX flipping SControl

[    1.411471] ata3: XXX irq_stat=400040 CONN|PHY
[    1.411475] ata3: XXX port freeze

[    1.420049] ata3: XXX starting engine
[    1.420096] ata3: XXX rc=0, class=1
[    1.420142] ata3: XXX clearing IRQs for thawing
[    1.420188] ata3: XXX port thawed
[    1.420234] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

We are not supposed to be able to receive an error IRQ while the port is
frozen (PxIE is set to 0, i.e. all IRQs for the port are disabled).

AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) states:
"Each bit location can be thought of as reporting a '1' if the virtual
"interrupt line" for that port is indicating it wishes to generate an
interrupt. That is, if a port has one or more interrupt status bit set,
and the enables for those status bits are set, then this bit shall be set."

Additionally, AHCI state P:ComInit clearly shows that the state machine
will only jump to P:ComInitSetIS (which sets IS.IPS(x) to '1'), if PxIE.PCE
is set to '1'. In our case, PxIE is set to 0, so IS.IPS(x) won't get set.

So IS.IPS(x) only gets set if PxIS and PxIE is set.

AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) also states:
"The bits in this register are read/write clear. It is set by the level of
the virtual interrupt line being a set, and cleared by a write of '1' from
the software."

So if IS.IPS(x) is set, you need to explicitly clear it by writing a 1 to
IS.IPS(x) for that port.

Since PxIE is cleared, the only way to get an interrupt while the port is
frozen, is if IS.IPS(x) is set, and the only way IS.IPS(x) can be set when
the port is frozen, is if it was set before the port was frozen.

However, since commit 737dd811a3 ("ata: libahci: clear pending interrupt
status"), we clear both PxIS and IS.IPS(x) after freezing the port, but
before the COMRESET, so the problem that commit 1e641060c4 ("libata:
clear eh_info on reset completion") fixed can no longer happen.

Thus, revert commit 1e641060c4 ("libata: clear eh_info on reset
completion"), so that the retry logic in ata_scsi_port_error_handler()
works once again. (The retry logic is still needed, since we can still
get an error IRQ _after_ the port has been thawed, but before
ata_scsi_port_error_handler() takes the ap->lock in order to check
if ATA_PFLAG_EH_PENDING is set.)

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-16 21:10:37 +09:00
Damien Le Moal
e3da4c401f ata: pata_parport: Fix code style issues
Fix indentation and other code style issues in the comm.c file.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309150646.n3iBvbPj-lkp@intel.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-15 11:37:30 +09:00
Szuying Chen
737dd811a3 ata: libahci: clear pending interrupt status
When a CRC error occurs, the HBA asserts an interrupt to indicate an
interface fatal error (PxIS.IFS). The ISR clears PxIE and PxIS, then
does error recovery. If the adapter receives another SDB FIS
with an error (PxIS.TFES) from the device before the start of the EH
recovery process, the interrupt signaling the new SDB cannot be
serviced as PxIE was cleared already. This in turn results in the HBA
inability to issue any command during the error recovery process after
setting PxCMD.ST to 1 because PxIS.TFES is still set.

According to AHCI 1.3.1 specifications section 6.2.2, fatal errors
notified by setting PxIS.HBFS, PxIS.HBDS, PxIS.IFS or PxIS.TFES will
cause the HBA to enter the ERR:Fatal state. In this state, the HBA
shall not issue any new commands.

To avoid this situation, introduce the function
ahci_port_clear_pending_irq() to clear pending interrupts before
executing a COMRESET. This follows the AHCI 1.3.1 - section 6.2.2.2
specification.

Signed-off-by: Szuying Chen <Chloe_Chen@asmedia.com.tw>
Fixes: e0bfd14997 ("[PATCH] ahci: stop engine during hard reset")
Cc: stable@vger.kernel.org
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-15 11:34:31 +09:00
Christophe JAILLET
e97eb65dd4 ata: sata_mv: Fix incorrect string length computation in mv_dump_mem()
snprintf() returns the "number of characters which *would* be generated for
the given input", not the size *really* generated.

In order to avoid too large values for 'o' (and potential negative values
for "sizeof(linebuf) o") use scnprintf() instead of snprintf().

Note that given the "w < 4" in the for loop, the buffer can NOT
overflow, but using the *right* function is always better.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-11 15:13:35 +09:00
Niklas Cassel
24e0e61db3 ata: libata: disallow dev-initiated LPM transitions to unsupported states
In AHCI 1.3.1, the register description for CAP.SSC:
"When cleared to ‘0’, software must not allow the HBA to initiate
transitions to the Slumber state via agressive link power management nor
the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port
must be programmed to disallow device initiated Slumber requests."

In AHCI 1.3.1, the register description for CAP.PSC:
"When cleared to ‘0’, software must not allow the HBA to initiate
transitions to the Partial state via agressive link power management nor
the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port
must be programmed to disallow device initiated Partial requests."

Ensure that we always set the corresponding bits in PxSCTL.IPM, such that
a device is not allowed to initiate transitions to power states which are
unsupported by the HBA.

DevSleep is always initiated by the HBA, however, for completeness, set the
corresponding bit in PxSCTL.IPM such that agressive link power management
cannot transition to DevSleep if DevSleep is not supported.

sata_link_scr_lpm() is used by libahci, ata_piix and libata-pmp.
However, only libahci has the ability to read the CAP/CAP2 register to see
if these features are supported. Therefore, in order to not introduce any
regressions on ata_piix or libata-pmp, create flags that indicate that the
respective feature is NOT supported. This way, the behavior for ata_piix
and libata-pmp should remain unchanged.

This change is based on a patch originally submitted by Runa Guo-oc.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Fixes: 1152b2617a ("libata: implement sata_link_scr_lpm() and make ata_dev_set_feature() global")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-11 15:09:11 +09:00
Linus Torvalds
2a5a4326e5 SCSI misc on 20230909
Mostly small stragglers that missed the initial merge.  Driver updates
 are qla2xxx and smartpqi (mp3sas has a high diffstat due to the
 volatile qualifier removal, fnic due to unused function removal and
 sd.c has a lot of code shuffling to remove forward declarations).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZPyD4CYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishZWGAQDlh/3q
 5YJp7f8sIqmgdOiKl3bln3API9Y0MPsC3z5TsAEAv9LYQZH3He4XvxMy/v5FioEs
 8IWIoBsUZtcgoK6mI4w=
 =8Aj8
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "Mostly small stragglers that missed the initial merge.

  Driver updates are qla2xxx and smartpqi (mp3sas has a high diffstat
  due to the volatile qualifier removal, fnic due to unused function
  removal and sd.c has a lot of code shuffling to remove forward
  declarations)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (38 commits)
  scsi: ufs: core: No need to update UPIU.header.flags and lun in advanced RPMB handler
  scsi: ufs: core: Add advanced RPMB support where UFSHCI 4.0 does not support EHS length in UTRD
  scsi: mpt3sas: Remove volatile qualifier
  scsi: mpt3sas: Perform additional retries if doorbell read returns 0
  scsi: libsas: Simplify sas_queue_reset() and remove unused code
  scsi: ufs: Fix the build for the old ARM OABI
  scsi: qla2xxx: Fix unused variable warning in qla2xxx_process_purls_pkt()
  scsi: fnic: Remove unused functions fnic_scsi_host_start/end_tag()
  scsi: qla2xxx: Fix spelling mistake "tranport" -> "transport"
  scsi: fnic: Replace sgreset tag with max_tag_id
  scsi: qla2xxx: Remove unused variables in qla24xx_build_scsi_type_6_iocbs()
  scsi: qla2xxx: Fix nvme_fc_rcv_ls_req() undefined error
  scsi: smartpqi: Change driver version to 2.1.24-046
  scsi: smartpqi: Enhance error messages
  scsi: smartpqi: Enhance controller offline notification
  scsi: smartpqi: Enhance shutdown notification
  scsi: smartpqi: Simplify lun_number assignment
  scsi: smartpqi: Rename pciinfo to pci_info
  scsi: smartpqi: Rename MACRO to clarify purpose
  scsi: smartpqi: Add abort handler
  ...
2023-09-09 12:01:33 -07:00
Linus Torvalds
4b3d6e0c6c ata changes for 6.6
- Fix OF include file for ata platform drivers (Rob).
 
  - Simplify various ahci, sata and pata platform drivers using the
    function devm_platform_ioremap_resource() (Yangtao).
 
  - Cleanup libata time related argument types (e.g. timeouts values)
    (Sergey).
 
  - Cleanup libata code around error handling as all ata drivers now
    define a error_handler operation (Hannes and Niklas).
 
  - Remove functions intended for libsas that are in fact unused
    (Niklas).
 
  - Change the remove device callback of platform drivers to a null
    function (Uwe).
 
  - Simplify the pata_imx driver using devm_clk_get_enabled() (Li).
 
  - Remove old and uinused remnants of the ide code in arm, parisc,
    powerpc, sparc and m68k architectures and associated drivers
    (pata_buddha, pata_falcon and pata_gayle) (Geert).
 
  - Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010
    drivers (me).
 
  - Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita,
    Michael).
 
  - Add Elkhart Lake AHCI controller support to the ahci driver (Werner).
 
  - Disable NCQ trim on Micron 1100 drives (Pawel).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZPbUdQAKCRDdoc3SxdoY
 du6NAP9G10EhjgXDNIM8f1AN8gHrZk2RGSxSuqIKKROxl2i+ugD+Mt8/UjRpf0JO
 nVdKTfgeXaLfCXHN/fmkIYNmk+I7Twg=
 =9Ety
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ata updates from Damien Le Moal:

 - Fix OF include file for ata platform drivers (Rob)

 - Simplify various ahci, sata and pata platform drivers using the
   function devm_platform_ioremap_resource() (Yangtao)

 - Cleanup libata time related argument types (e.g. timeouts values)
   (Sergey)

 - Cleanup libata code around error handling as all ata drivers now
   define a error_handler operation (Hannes and Niklas)

 - Remove functions intended for libsas that are in fact unused (Niklas)

 - Change the remove device callback of platform drivers to a null
   function (Uwe)

 - Simplify the pata_imx driver using devm_clk_get_enabled() (Li)

 - Remove old and uinused remnants of the ide code in arm, parisc,
   powerpc, sparc and m68k architectures and associated drivers
   (pata_buddha, pata_falcon and pata_gayle) (Geert)

 - Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010
   drivers (me)

 - Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita,
   Michael)

 - Add Elkhart Lake AHCI controller support to the ahci driver (Werner)

 - Disable NCQ trim on Micron 1100 drives (Pawel)

* tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (60 commits)
  ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives
  ata: ahci: Add Elkhart Lake AHCI controller
  ata: pata_falcon: add data_swab option to byte-swap disk data
  ata: pata_falcon: fix IO base selection for Q40
  ata: pata_ep93xx: use soc_device_match for UDMA modes
  ata: pata_ep93xx: fix error return code in probe
  ata: sata_gemini: Add missing MODULE_DESCRIPTION
  ata: pata_ftide010: Add missing MODULE_DESCRIPTION
  m68k: Remove <asm/ide.h>
  ata: pata_gayle: Remove #include <asm/ide.h>
  ata: pata_falcon: Remove #include <asm/ide.h>
  ata: pata_buddha: Remove #include <asm/ide.h>
  asm-generic: Remove ide_iops.h
  sparc: Remove <asm/ide.h>
  powerpc: Remove <asm/ide.h>
  parisc: Remove <asm/ide.h>
  ARM: Remove <asm/ide.h>
  ata: pata_imx: Use helper function devm_clk_get_enabled()
  ata: sata_rcar: Convert to platform remove callback returning void
  ata: sata_mv: Convert to platform remove callback returning void
  ...
2023-09-05 12:37:28 -07:00
Pawel Zmarzly
27fd071040 ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives
Micron 1100 drives lock up when encountering queued TRIM command. It is
a quite old hardware series, for past years we have been running our
machines with these drives using libata.force=noncqtrim.

[Damien] Move the "Crucial_CT*M500*" entry to keep Micron and Crucial
entries together.

Signed-off-by: Pawel Zmarzly <pzmarzly@meta.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-02 12:04:51 +09:00
Werner Fischer
2a2df98ec5 ata: ahci: Add Elkhart Lake AHCI controller
Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These
CPUs and their PCHs are used in mobile and embedded environments.

With this patch I suggest that Elkhart Lake SATA controllers [1] should
use the default LPM policy for mobile chipsets.
The disadvantage of missing hot-plug support with this setting should
not be an issue, as those CPUs are used in embedded environments and
not in servers with hot-plug backplanes.

We discovered that the Elkhart Lake SATA controllers have been missing
in ahci.c after a customer reported the throttling of his SATA SSD
after a short period of higher I/O. We determined the high temperature
of the SSD controller in idle mode as the root cause for that.

Depending on the used SSD, we have seen up to 1.8 Watt lower system
idle power usage and up to 30°C lower SSD controller temperatures in
our tests, when we set med_power_with_dipm manually. I have provided a
table showing seven different SATA SSDs from ATP, Intel/Solidigm and
Samsung [2].

Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1]
for those mobile PCHs.
This commit just adds 0x4b63 as I do not have test systems with 0x4b60
and 0x4b62 SATA controllers.
I have tested this patch with a system which uses 0x4b63 as SATA
controller.

[1] https://sata-io.org/product/8803
[2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4

Signed-off-by: Werner Fischer <devlists@wefi.net>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-02 11:56:25 +09:00
Michael Schmitz
8847d42d7a ata: pata_falcon: add data_swab option to byte-swap disk data
Some users of pata_falcon on Q40 have IDE disks in default
IDE little endian byte order, whereas legacy disks use
host-native big-endian byte order as on the Atari Falcon.

Add module parameter 'data_swab' to allow connecting drives
with non-native data byte order. Drives selected by the
data_swap bit mask will have their user data byte-swapped to
host byte order, i.e. 'pata_falcon.data_swab=2' will byte-swap
all user data on drive B, leaving data on drive A in native
byte order. On Q40, drives on a second IDE interface may be
added to the bit mask as bits 2 and 3.

Default setting is no byte swapping, i.e. compatibility with
the native Falcon or Q40 operating system disk format.

Cc: William R Sowerbutts <will@sowerbutts.com>
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: William R Sowerbutts <will@sowerbutts.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-08-28 19:14:41 +09:00
Michael Schmitz
8a1f00b753 ata: pata_falcon: fix IO base selection for Q40
With commit 44b1fbc0f5 ("m68k/q40: Replace q40ide driver
with pata_falcon and falconide"), the Q40 IDE driver was
replaced by pata_falcon.c.

Both IO and memory resources were defined for the Q40 IDE
platform device, but definition of the IDE register addresses
was modeled after the Falcon case, both in use of the memory
resources and in including register shift and byte vs. word
offset in the address.

This was correct for the Falcon case, which does not apply
any address translation to the register addresses. In the
Q40 case, all of device base address, byte access offset
and register shift is included in the platform specific
ISA access translation (in asm/mm_io.h).

As a consequence, such address translation gets applied
twice, and register addresses are mangled.

Use the device base address from the platform IO resource
for Q40 (the IO address translation will then add the correct
ISA window base address and byte access offset), with register
shift 1. Use MMIO base address and register shift 2 as before
for Falcon.

Encode PIO_OFFSET into IO port addresses for all registers
for Q40 except the data transfer register. Encode the MMIO
offset there (pata_falcon_data_xfer() directly uses raw IO
with no address translation).

Reported-by: William R Sowerbutts <will@sowerbutts.com>
Closes: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
Link: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
Fixes: 44b1fbc0f5 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide")
Cc: stable@vger.kernel.org
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: William R Sowerbutts <will@sowerbutts.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-08-28 19:14:39 +09:00