We treat system suspend of SCSI devices pretty much the same as runtime
suspend. If the device is already runtime suspended we leave it to that
state during system suspend. On resume from system sleep we then resume the
device and correct the runtime PM status back to "active".
There is a problem with this because runtime PM status of the request queue
in question is not changed (it will be in "suspended" state). When SCSI
disk driver (sd.c) resumes the disk it sends START message to the device
and because the request queue is still in "suspended" state
blk_pm_peek_request() returns NULL preventing resume of the disk.
The issue can be reproduced with following commands:
# echo auto > /sys/block/sda/device/power/control
# echo 15000 > /sys/block/sda/device/power/autosuspend_delay_ms
[ 57.191706] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 57.380015] sd 0:0:0:0: [sda] Stopping disk
Now suspend the machine:
# rtcwake -s10 -mmem
This ends up in soft lockup because resume is not proceeding accordingly
and userspace is never restarted. Also there is nothing printed to the
console.
Fix this by forcing request queue status to "active" before the disk is
resumed.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
This reverts commit 49718f0fb8 ("SCSI: Fix NULL pointer dereference in
runtime PM")
The old commit may lead to a issue that blk_{pre|post}_runtime_suspend and
blk_{pre|post}_runtime_resume may not be called in pairs.
Take sr device as example, when sr device goes to runtime suspend,
blk_{pre|post}_runtime_suspend will be called since sr device defined
pm->runtime_suspend. But blk_{pre|post}_runtime_resume will not be called
since sr device doesn't have pm->runtime_resume. so, sr device can not
resume correctly anymore.
More discussion can be found from below link.
http://marc.info/?l=linux-scsi&m=144163730531875&w=2
Signed-off-by: Ken Xue <Ken.Xue@amd.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Xiangliang Yu <Xiangliang.Yu@amd.com>
Cc: James E.J. Bottomley <JBottomley@odin.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Terry <Michael.terry@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME may now be changed to depend on
CONFIG_PM.
Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under
drivers/scsi/ and in include/scsi/scsi_device.h.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
async_schedule() sd resume work to allow disks and other devices to
resume in parallel.
This moves the entirety of scsi_device resume to an async context to
ensure that scsi_device_resume() remains ordered with respect to the
completion of the start/stop command. For the duration of the resume,
new command submissions (that do not originate from the scsi-core) will
be deferred (BLKPREP_DEFER).
It adds a new ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain) as a container
of these operations. Like scsi_sd_probe_domain it is flushed at
sd_remove() time to ensure async ops do not continue past the
end-of-life of the sdev. The implementation explicitly refrains from
reusing scsi_sd_probe_domain directly for this purpose as it is flushed
at the end of dpm_resume(), potentially defeating some of the benefit.
Given sdevs are quiesced it is permissible for these resume operations
to bleed past the async_synchronize_full() calls made by the driver
core.
We defer the resolution of which pm callback to call until
scsi_dev_type_{suspend|resume} time and guarantee that the callback
parameter is never NULL. With this in place the type of resume
operation is encoded in the async function identifier.
There is a concern that async resume could trigger PSU overload. In the
enterprise, storage enclosures enforce staggered spin-up regardless of
what the kernel does making async scanning safe by default. Outside of
that context a user can disable asynchronous scanning via a kernel
command line or CONFIG_SCSI_SCAN_ASYNC. Honor that setting when
deciding whether to do resume asynchronously.
Inspired by Todd's analysis and initial proposal [2]:
https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach
Cc: Len Brown <len.brown@intel.com>
Cc: Phillip Susi <psusi@ubuntu.com>
[alan: bug fix and clean up suggestion]
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
[djbw: kick all resume work to the async queue]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Migrate sr to make use of block layer runtime PM. Accordingly, the
SCSI bus layer runtime PM callback is simplified as all SCSI drivers
implementing runtime PM now use the block layer's request-based
mechanism.
Note that due to the device will be polled by kernel at a constant
interval, if the autosuspend delay is set longer than the polling
interval then the device will never suspend.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It makes no sense to flush the cache of a device without medium.
Errors during suspend must be handled according to their causes.
Errors due to missing media or unplugged devices must be ignored.
Errors due to devices being offlined must also be ignored.
The error returns must be modified so that the generic layer
understands them.
[jejb: fix up whitespace and other formatting problems]
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The "runtime idle" helper routine, rpm_idle(), currently ignores
return values from .runtime_idle() callbacks executed by it.
However, it turns out that many subsystems use
pm_generic_runtime_idle() which checks the return value of the
driver's callback and executes pm_runtime_suspend() for the device
unless that value is not 0. If that logic is moved to rpm_idle()
instead, pm_generic_runtime_idle() can be dropped and its users
will not need any .runtime_idle() callbacks any more.
Moreover, the PCI, SCSI, and SATA subsystems' .runtime_idle()
routines, pci_pm_runtime_idle(), scsi_runtime_idle(), and
ata_port_runtime_idle(), respectively, as well as a few drivers'
ones may be simplified if rpm_idle() calls rpm_suspend() after 0 has
been returned by the .runtime_idle() callback executed by it.
To reduce overall code bloat, make the changes described above.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Uses block layer runtime pm helper functions in
scsi_runtime_suspend/resume for devices that take advantage of it.
Remove scsi_autopm_* from sd open/release path and check_events path.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Use of pm_message_t is deprecated and device driver is not supposed
to use that. This patch migrates the SCSI bus level pm callbacks
to call device's pm callbacks defined in its driver's dev_pm_ops.
This is achieved by finding out which device pm callback should be used
in bus callback function, and then pass that callback function pointer
as a param to the scsi_bus_{suspend,resume}_common routine, which will
further pass that callback to scsi_dev_type_{suspend,resume} after
proper handling.
The special case for freeze in scsi_bus_suspend_common is not necessary
since there is no high level SCSI driver has implemented freeze, so no
need to runtime resume the device if it is in runtime suspended state
for system freeze, just return like the system suspend/hibernate case.
Since only sd has implemented drv->suspend/drv->resume, and I'll update
sd driver to use the new callbacks in the following patch, there is no
need to fallback to call drv->suspend/drv->resume if dev_pm_ops is NULL.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This reverts commit 28fd00d42c.
With commit 88d26136a2 (PM: Prevent
runtime suspend during system resume), this patch is no longer needed.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This reverts commit 33a2285d96.
With commit 88d26136a2 (PM: Prevent
runtime suspend during system resume), this patch is no longer needed.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There is a race in scsi_bus_resume_common when set device's runtime
state to active after pm_runtime_put_sync(dev->parent).
Parent device may have been suspended so pm_runtime_set_active(dev) will
fail with -EBUSY.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
For scsi devices which use scsi bus runtime callback, runtime suspend
will call scsi_dev_type_suspend, and if the drv->suspend failed, the
device will still be in active state. But since scsi_device_quiesce is
called, the device will not be able to respond any more commands.
So add a check here to see if err occured, if so, bring the device back
to normal state with scsi_device_resume.
Signed-off-by: Aaron Lu <aaron.lu@amd.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch (as1520) fixes a bug in the SCSI layer's power management
implementation.
LUN scanning can be carried out asynchronously in do_scan_async(), and
sd uses an asynchronous thread for the time-consuming parts of disk
probing in sd_probe_async(). Currently nothing coordinates these
async threads with system sleep transitions; they can and do attempt
to continue scanning/probing SCSI devices even after the host adapter
has been suspended. As one might expect, the outcome is not ideal.
This is what the "prepare" stage of system suspend was created for.
After the prepare callback has been called for a host, target, or
device, drivers are not allowed to register any children underneath
them. Currently the SCSI prepare callback is not implemented; this
patch rectifies that omission.
For SCSI hosts, the prepare routine calls scsi_complete_async_scans()
to wait until async scanning is finished. It might be slightly more
efficient to wait only until the host in question has been scanned,
but there's currently no way to do that. Besides, during a sleep
transition we will ultimately have to wait until all the host scanning
has finished anyway.
For SCSI devices, the prepare routine calls async_synchronize_full()
to wait until sd probing is finished. The routine does nothing for
SCSI targets, because asynchronous target scanning is done only as
part of host scanning.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
[Patch description from Alan Stern]
If a child device was runtime-suspended when a system suspend began,
then there will be nothing to prevent its parent from
runtime-suspending as soon as it is woken up during the system resume.
Then when the time comes to resume the child, the resume will fail
because the parent is already back at low power.
On the other hand, there are some devices which should remain at low
power across an entire suspend-resume cycle. The details depend on the
device and the platform.
This suggests that the PM core is not the right place to solve the
problem. One possible solution is for the subsystem or device driver
to call pm_runtime_get_sync(dev->parent) at the start of the
system-resume procedure and pm_runtime_put_sync(dev->parent) at the
end.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The only high-level SCSI driver that currently implements runtime PM is
sd, and sd treats runtime suspend exactly the same as the SUSPEND and
HIBERNATE stages of system sleep, but not the same as the FREEZE stage.
Therefore, when entering the SUSPEND or HIBERNATE stages of system
sleep, we can skip the callback to the driver if the device is already
in runtime suspend. When entering the FREEZE stage, however, we should
first issue a runtime resume. The overhead of doing this is
negligible, because a suspended drive would be spun up during the THAW
stage of hibernation anyway.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
For the basic SCSI infrastructure files that are exporting symbols
but not modules themselves, add in the basic export.h header file
to allow the exports.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Some callers of pm_runtime_get_sync() and other runtime PM helper
functions, scsi_autopm_get_host() and scsi_autopm_get_device() in
particular, need to distinguish error codes returned when runtime PM
is disabled (i.e. power.disable_depth is nonzero for the given
device) from error codes returned in other situations. For this
reason, make the runtime PM helper functions return -EACCES when
power.disable_depth is nonzero and ensure that this error code
won't be returned by them in any other circumstances. Modify
scsi_autopm_get_host() and scsi_autopm_get_device() to check the
error code returned by pm_runtime_get_sync() and ignore -EACCES.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1398b) adds runtime PM support to the SCSI layer. Only
the machanism is provided; use of it is up to the various high-level
drivers, and the patch doesn't change any of them. Except for sg --
the patch expicitly prevents a device from being runtime-suspended
while its sg device file is open.
The implementation is simplistic. In general, hosts and targets are
automatically suspended when all their children are asleep, but for
them the runtime-suspend code doesn't actually do anything. (A host's
runtime PM status is propagated up the device tree, though, so a
runtime-PM-aware lower-level driver could power down the host adapter
hardware at the appropriate times.) There are comments indicating
where a transport class might be notified or some other hooks added.
LUNs are runtime-suspended by calling the drivers' existing suspend
handlers (and likewise for runtime-resume). Somewhat arbitrarily, the
implementation delays for 100 ms before suspending an eligible LUN.
This is because there typically are occasions during bootup when the
same device file is opened and closed several times in quick
succession.
The way this all works is that the SCSI core increments a device's
PM-usage count when it is registered. If a high-level driver does
nothing then the device will not be eligible for runtime-suspend
because of the elevated usage count. If a high-level driver wants to
use runtime PM then it can call scsi_autopm_put_device() in its probe
routine to decrement the usage count and scsi_autopm_get_device() in
its remove routine to restore the original count.
Hosts, targets, and LUNs are not suspended while they are being probed
or removed, or while the error handler is running. In fact, a fairly
large part of the patch consists of code to make sure that things
aren't suspended at such times.
[jejb: fix up compile issues in PM config variations]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch (as1397b) converts the SCSI midlayer to use the new PM
callbacks (struct dev_pm_ops). A new source file, scsi_pm.c, is
created to hold the new callback routines, and the existing
suspend/resume code is moved there.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>