Commit graph

197 commits

Author SHA1 Message Date
Linus Torvalds
1ff7bc3ba7 More power management updates for 5.19-rc1
- Add Tegra234 cpufreq support (Sumit Gupta).
 
  - Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
    Rex-BC Chen, and Jia-Wei Chang).
 
  - Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
    Pierre Gondois).
 
  - Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
    Oudjana).
 
  - Use list iterator only inside the list_for_each_entry loop (Xiaomeng
    Tong, and Jakob Koschel).
 
  - New APIs related to finding OPP based on interconnect bandwidth
    (Krzysztof Kozlowski).
 
  - Fix the missing of_node_put() in _bandwidth_supported() (Dan
    Carpenter).
 
  - Cleanups (Krzysztof Kozlowski, and Viresh Kumar).
 
  - Add Out of Band mode description to the intel-speed-select utility
    documentation (Srinivas Pandruvada).
 
  - Add power sequences support to the system reboot and power off
    code and make related platform-specific changes for multiple
    platforms (Dmitry Osipenko, Geert Uytterhoeven).
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKU8lESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxVz0P91LNCbkDSt60jzNkXdEjsvUnI/YjJ+QJ
 /+ta7iCwf90obb6s9soBkTyU8Ia7hJ/IWDJW/5xhdG0ySYF17hGNIGKK9xKGsJFK
 tzzWtjFsvT3PeUZQERekqWp8OYskHYmQMj8o4jqqFF7DZD/AswTgkVLALUd7YhVL
 UvLmcKsUA7eXy3ZrhtrGSzVSEbKOGXBLFyjy3IuWjfz6Uk/nGQRNKGf7byRWLM44
 y7zb75/5+p4MPyyJP8M/uiXzEYDKuubRtfx9PdmLgBUSMbtho6eB1x47dZWooaxe
 YKmcFjF80AmnwxHb+Te2rZHPeIYr+5hLBaEq7xaLQf/nAS3y5z1PIfI2wVQ5mXPz
 D599jHHda/6oSAKCVTq2fKfnlR6fetm5j66xOQINpD+G5b5tNSpllXJDamFZxFgP
 DiQAOFzdnRYnK7yTiLWVl1q76SVRxqsGz7/5Ak+NRj2OQK2wRkLzHuZfiV/8r0pk
 ksi6Ew9TerXkstoTQsSToPQxB2VvosSajNU3Oy27pmM0oal1XxP0LIPz9sMor5/g
 tfk5f6Yz/+FFIfXj3cZffZNdhsJgejmcqPdrSdCOV3sBrblnIMQNpHiYg4jGztoj
 IjYKYPVpSaWiSZLQOaK2moTEvm9CfQz1TQCF+/Kz88LX6/7ZaDJFxHG2FDEob0sg
 6KVbrZWweLI=
 =PAh+
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These update the ARM cpufreq drivers and fix up the CPPC cpufreq
  driver after recent changes, update the OPP code and PM documentation
  and add power sequences support to the system reboot and power off
  code.

  Specifics:

   - Add Tegra234 cpufreq support (Sumit Gupta)

   - Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
     Rex-BC Chen, and Jia-Wei Chang)

   - Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
     Pierre Gondois)

   - Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
     Oudjana)

   - Use list iterator only inside the list_for_each_entry loop
     (Xiaomeng Tong, and Jakob Koschel)

   - New APIs related to finding OPP based on interconnect bandwidth
     (Krzysztof Kozlowski)

   - Fix the missing of_node_put() in _bandwidth_supported() (Dan
     Carpenter)

   - Cleanups (Krzysztof Kozlowski, and Viresh Kumar)

   - Add Out of Band mode description to the intel-speed-select utility
     documentation (Srinivas Pandruvada)

   - Add power sequences support to the system reboot and power off code
     and make related platform-specific changes for multiple platforms
     (Dmitry Osipenko, Geert Uytterhoeven)"

* tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (60 commits)
  cpufreq: CPPC: Fix unused-function warning
  cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIE
  Documentation: admin-guide: PM: Add Out of Band mode
  kernel/reboot: Change registration order of legacy power-off handler
  m68k: virt: Switch to new sys-off handler API
  kernel/reboot: Add devm_register_restart_handler()
  kernel/reboot: Add devm_register_power_off_handler()
  soc/tegra: pmc: Use sys-off handler API to power off Nexus 7 properly
  reboot: Remove pm_power_off_prepare()
  regulator: pfuze100: Use devm_register_sys_off_handler()
  ACPI: power: Switch to sys-off handler API
  memory: emif: Use kernel_can_power_off()
  mips: Use do_kernel_power_off()
  ia64: Use do_kernel_power_off()
  x86: Use do_kernel_power_off()
  sh: Use do_kernel_power_off()
  m68k: Switch to new sys-off handler API
  powerpc: Use do_kernel_power_off()
  xen/x86: Use do_kernel_power_off()
  parisc: Use do_kernel_power_off()
  ...
2022-05-30 11:37:26 -07:00
Linus Torvalds
9d004b2f4f cxl for 5.19
- Add driver-core infrastructure for lockdep validation of
   device_lock(), and fixup a deadlock report that was previously hidden
   behind the 'lockdep no validate' policy.
 
 - Add CXL _OSC support for claiming native control of CXL hotplug and
   error handling.
 
 - Disable suspend in the presence of CXL memory unless and until a
   protocol is identified for restoring PCI device context from memory
   hosted on CXL PCI devices.
 
 - Add support for snooping CXL mailbox commands to protect against
   inopportune changes, like set-partition with the 'immediate' flag set.
 
 - Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC
   / 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL
   HDM Capability).
 
 - Miscellaneous cleanups and fixes from -next exposure.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYpFUogAKCRDfioYZHlFs
 Zz+VAP9o/NkYhbaM2Ne9ImgsdJii96gA8nN7q/q/ZoXjsSx2WQD+NRC5d3ZwZDCa
 9YKEkntnvbnAZOCs+ZUuyZBgNh6vsgU=
 =p92w
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for this cycle.

  The highlight is new driver-core infrastructure and CXL subsystem
  changes for allowing lockdep to validate device_lock() usage. Thanks
  to PeterZ for setting me straight on the current capabilities of the
  lockdep API, and Greg acked it as well.

  On the CXL ACPI side this update adds support for CXL _OSC so that
  platform firmware knows that it is safe to still grant Linux native
  control of PCIe hotplug and error handling in the presence of CXL
  devices. A circular dependency problem was discovered between suspend
  and CXL memory for cases where the suspend image might be stored in
  CXL memory where that image also contains the PCI register state to
  restore to re-enable the device. Disable suspend for now until an
  architecture is defined to clarify that conflict.

  Lastly a collection of reworks, fixes, and cleanups to the CXL
  subsystem where support for snooping mailbox commands and properly
  handling the "mem_enable" flow are the highlights.

  Summary:

   - Add driver-core infrastructure for lockdep validation of
     device_lock(), and fixup a deadlock report that was previously
     hidden behind the 'lockdep no validate' policy.

   - Add CXL _OSC support for claiming native control of CXL hotplug and
     error handling.

   - Disable suspend in the presence of CXL memory unless and until a
     protocol is identified for restoring PCI device context from memory
     hosted on CXL PCI devices.

   - Add support for snooping CXL mailbox commands to protect against
     inopportune changes, like set-partition with the 'immediate' flag
     set.

   - Rework how the driver detects legacy CXL 1.1 configurations (CXL
     DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
     configurations (CXL HDM Capability).

   - Miscellaneous cleanups and fixes from -next exposure"

* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
  cxl/port: Enable HDM Capability after validating DVSEC Ranges
  cxl/port: Reuse 'struct cxl_hdm' context for hdm init
  cxl/port: Move endpoint HDM Decoder Capability init to port driver
  cxl/pci: Drop @info argument to cxl_hdm_decode_init()
  cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
  cxl/mem: Skip range enumeration if mem_enable clear
  cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
  cxl/pci: Move cxl_await_media_ready() to the core
  cxl/mem: Validate port connectivity before dvsec ranges
  cxl/mem: Fix cxl_mem_probe() error exit
  cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
  cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
  cxl/mem: Drop mem_enabled check from wait_for_media()
  nvdimm: Fix firmware activation deadlock scenarios
  device-core: Kill the lockdep_mutex
  nvdimm: Drop nd_device_lock()
  ACPI: NFIT: Drop nfit_device_lock()
  nvdimm: Replace lockdep_mutex with local lock classes
  cxl: Drop cxl_device_lock()
  cxl/acpi: Add root device lockdep validation
  ...
2022-05-27 21:24:19 -07:00
Rafael J. Wysocki
14c03a4a75 Merge back reboot/poweroff notifiers rework for 5.19-rc1. 2022-05-25 14:38:29 +02:00
Dmitry Osipenko
5b71808eb7 reboot: Remove pm_power_off_prepare()
All pm_power_off_prepare() users were converted to sys-off handler API.
Remove the obsolete global callback variable.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:30:31 +02:00
Dan Williams
9ea4dcf498 PM: CXL: Disable suspend
The CXL specification claims S3 support at a hardware level, but at a
system software level there are some missing pieces. Section 9.4 (CXL
2.0) rightly claims that "CXL mem adapters may need aux power to retain
memory context across S3", but there is no enumeration mechanism for the
OS to determine if a given adapter has that support. Moreover the save
state and resume image for the system may inadvertantly end up in a CXL
device that needs to be restored before the save state is recoverable.
I.e. a circular dependency that is not resolvable without a third party
save-area.

Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly
allows for suspend, but requires unbinding all CXL memory devices before
the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind
flow is intended to also tear down all CXL memory regions associated
with a given cxl_memdev.

It is reasonable to assume that any device participating in a System RAM
range published in the EFI memory map is covered by aux power and
save-area outside the device itself. So this restriction can be
minimized in the future once pre-existing region enumeration support
arrives, and perhaps a spec update to clarify if the EFI memory map is
sufficent for determining the range of devices managed by
platform-firmware for S3 support.

Per Rafael, if the CXL configuration prevents suspend then it should
fail early before tasks are frozen, and mem_sleep should stop showing
'mem' as an option [1]. Effectively CXL augments the platform suspend
->valid() op since, for example, the ACPI ops are not aware of the CXL /
PCI dependencies. Given the split role of platform firmware vs OS
provisioned CXL memory it is up to the cxl_mem driver to determine if
the CXL configuration has elements that platform firmware may not be
prepared to restore.

Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1]
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22 16:09:42 -07:00
Jonathan Cameron
a8e2512efc PM: core: Add NS varients of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and runtime pm equiv
As more drivers start to use namespaces, we need to have varients of these
useful macros that allow the export to be in a particular namespace.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05 20:16:34 +02:00
Youngjin Jang
a759de6991 PM: sleep: Add device name to suspend_report_result()
Currently, suspend_report_result() prints only function information.

If any driver uses a common PM function, nobody knows who exactly
called the failing function.

A device pinter is needed to recognize the failing device.

For example:

 PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
 PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0

become after the change:

 serial 00:05: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
 pci 0000:00:01.3: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0

Signed-off-by: Youngjin Jang <yj84.jang@samsung.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:57:01 +01:00
Paul Cercueil
9d86191900 PM: runtime: Add DEFINE_RUNTIME_DEV_PM_OPS() macro
A lot of drivers create a dev_pm_ops struct with the system sleep
suspend/resume callbacks set to pm_runtime_force_suspend() and
pm_runtime_force_resume().

These drivers can now use the DEFINE_RUNTIME_DEV_PM_OPS() macro, which
will use pm_runtime_force_{suspend,resume}() as the system sleep
callbacks, while having the same dead code removal characteristic that
is already provided by DEFINE_SIMPLE_DEV_PM_OPS().

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-12 19:59:05 +01:00
Paul Cercueil
0ae101fdd3 PM: core: Add EXPORT[_GPL]_SIMPLE_DEV_PM_OPS macros
These macros are defined conditionally, according to CONFIG_PM:
- if CONFIG_PM is enabled, these macros resolve to
  DEFINE_SIMPLE_DEV_PM_OPS(), and the dev_pm_ops symbol will be
  exported.

- if CONFIG_PM is disabled, these macros will result in a dummy static
  dev_pm_ops to be created with the __maybe_unused flag. The dev_pm_ops
  will then be discarded by the compiler, along with the provided
  callback functions if they are not used anywhere else.

In the second case, the symbol is not exported, which should be
perfectly fine - users of the symbol should all use the pm_ptr() or
pm_sleep_ptr() macro, so the dev_pm_ops marked as "extern" in the
client's code will never be accessed.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-12 19:59:05 +01:00
Paul Cercueil
52cc1d7f97 PM: core: Remove static qualifier in DEFINE_SIMPLE_DEV_PM_OPS macro
Keep this macro in line with the other ones. This makes it possible to
use them in the cases where the underlying dev_pm_ops structure is
exported.

Restore the "static" qualifier in the two drivers where the
DEFINE_SIMPLE_DEV_PM_OPS macro was used.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-12 19:59:05 +01:00
Paul Cercueil
3f4b32511a PM: core: Remove DEFINE_UNIVERSAL_DEV_PM_OPS() macro
The deprecated UNIVERSAL_DEV_PM_OPS() macro uses the provided callbacks
for both runtime PM and system sleep, which is very likely to be a
mistake, as a system sleep can be triggered while a given device is
already PM-suspended, which would cause the suspend callback to be
called twice.

The amount of users of UNIVERSAL_DEV_PM_OPS() is also tiny (16
occurences) compared to the number of places where
SET_SYSTEM_SLEEP_PM_OPS() is used with pm_runtime_force_suspend() and
pm_runtime_force_resume(), which makes me think that none of these cases
are actually valid.

As the new macro DEFINE_UNIVERSAL_DEV_PM_OPS() which was introduced to
replace UNIVERSAL_DEV_PM_OPS() is currently unused, remove it before
someone starts to use it in yet another invalid case.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-12 19:59:05 +01:00
Rafael J. Wysocki
c24efa6732 PM: runtime: Capture device status before disabling runtime PM
In some cases (for example, during system-wide suspend and resume of
devices) it is useful to know whether or not runtime PM has ever been
enabled for a given device and, if so, what the runtime PM status of
it had been right before runtime PM was disabled for it last time.

For this reason, introduce a new struct dev_pm_info field called
last_status that will be used for capturing the runtime PM status of
the device when its power.disable_depth counter changes from 0 to 1.

The new field will be set to RPM_INVALID to start with and whenever
power.disable_depth changes from 1 to 0, so it will be valid only
when runtime PM of the device is currently disabled, but it has been
enabled at least once.

Immediately use power.last_status in rpm_resume() to make it handle
the case when PM runtime is disabled for the device, but its runtime
PM status is RPM_ACTIVE more consistently.  Namely, make it return 1
if power.last_status is also equal to RPM_ACTIVE in that case (the
idea being that if the status was RPM_ACTIVE last time when
power.disable_depth was changing from 0 to 1 and it is still
RPM_ACTIVE, it can be assumed to reflect what happened to the device
last time when it was using runtime PM) and -EACCES otherwise.

Update the documentation to provide a description of last_status and
change the description of pm_runtime_resume() in it to reflect the
new behavior of rpm_active().

While at it, rearrange the code in pm_runtime_enable() to be more
straightforward and replace the WARN() macro in it with a pr_warn()
invocation which is less disruptive.

Link: https://lore.kernel.org/linux-pm/20211026222626.39222-1-ulf.hansson@linaro.org/t/#u
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17 16:16:40 +01:00
Paul Cercueil
1a3c7bb088 PM: core: Add new *_PM_OPS macros, deprecate old ones
This commit introduces the following macros:

SYSTEM_SLEEP_PM_OPS()
LATE_SYSTEM_SLEEP_PM_OPS()
NOIRQ_SYSTEM_SLEEP_PM_OPS()
RUNTIME_PM_OPS()

These new macros are very similar to their SET_*_PM_OPS() equivalent.
They however differ in the fact that the callbacks they set will always
be seen as referenced by the compiler. This means that the callback
functions don't need to be wrapped with a #ifdef CONFIG_PM guard, or
tagged with __maybe_unused, to prevent the compiler from complaining
about unused static symbols. The compiler will then simply evaluate at
compile time whether or not these symbols are dead code.

The callbacks that are only useful with CONFIG_PM_SLEEP is enabled, are
now also wrapped with a new pm_sleep_ptr() macro, which is inspired from
pm_ptr(). This is needed for drivers that use different callbacks for
sleep and runtime PM, to handle the case where CONFIG_PM is set and
CONFIG_PM_SLEEP is not.

This commit also deprecates the following macros:

SIMPLE_DEV_PM_OPS()
UNIVERSAL_DEV_PM_OPS()

And introduces the following macros:

DEFINE_SIMPLE_DEV_PM_OPS()
DEFINE_UNIVERSAL_DEV_PM_OPS()

These macros are similar to the functions they were created to replace,
with the following differences:

 - They use the new macros introduced above, and as such always
   reference the provided callback functions.

 - They are not tagged with __maybe_unused. They are meant to be used
   with pm_ptr() or pm_sleep_ptr() for DEFINE_UNIVERSAL_DEV_PM_OPS()
   and DEFINE_SIMPLE_DEV_PM_OPS() respectively.

 - They declare the symbol static, since every driver seems to do that
   anyway; and if a non-static use-case is needed an indirection pointer
   could be used.

The point of this change, is to progressively switch from a code model
where PM callbacks are all protected behind CONFIG_PM guards, to a code
model where the PM callbacks are always seen by the compiler, but
discarded if not used.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17 16:04:14 +01:00
Paul Cercueil
c06ef740d4 PM: core: Redefine pm_ptr() macro
The pm_ptr() macro was previously conditionally defined, according to
the value of the CONFIG_PM option. This meant that the pointed structure
was either referenced (if CONFIG_PM was set), or never referenced (if
CONFIG_PM was not set), causing it to be detected as unused by the
compiler.

This worked fine, but required the __maybe_unused compiler attribute to
be used to every symbol pointed to by a pointer wrapped with pm_ptr().

We can do better. With this change, the pm_ptr() is now defined the
same, independently of the value of CONFIG_PM. It now uses the (?:)
ternary operator to conditionally resolve to its argument. Since the
condition is known at compile time, the compiler will then choose to
discard the unused symbols, which won't need to be tagged with
__maybe_unused anymore.

This pm_ptr() macro is usually used with pointers to dev_pm_ops
structures created with SIMPLE_DEV_PM_OPS() or similar macros. These do
use a __maybe_unused flag, which is now useless with this change, so it
later can be removed. However in the meantime it causes no harm, and all
the drivers still compile fine with the new pm_ptr() macro.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-17 16:04:14 +01:00
Tony Lindgren
c745253e2a PM: runtime: Fix unpaired parent child_count for force_resume
As pm_runtime_need_not_resume() relies also on usage_count, it can return
a different value in pm_runtime_force_suspend() compared to when called in
pm_runtime_force_resume(). Different return values can happen if anything
calls PM runtime functions in between, and causes the parent child_count
to increase on every resume.

So far I've seen the issue only for omapdrm that does complicated things
with PM runtime calls during system suspend for legacy reasons:

omap_atomic_commit_tail() for omapdrm.0
 dispc_runtime_get()
  wakes up 58000000.dss as it's the dispc parent
   dispc_runtime_resume()
    rpm_resume() increases parent child_count
 dispc_runtime_put() won't idle, PM runtime suspend blocked
pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume()
 __update_runtime_status()
system suspended
pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume()
 pm_runtime_enable() only called because of pm_runtime_need_not_resume()
omap_atomic_commit_tail() for omapdrm.0
 dispc_runtime_get()
  wakes up 58000000.dss as it's the dispc parent
   dispc_runtime_resume()
    rpm_resume() increases parent child_count
 dispc_runtime_put() won't idle, PM runtime suspend blocked
...
rpm_suspend for 58000000.dss but parent child_count is now unbalanced

Let's fix the issue by adding a flag for needs_force_resume and use it in
pm_runtime_force_resume() instead of pm_runtime_need_not_resume().

Additionally omapdrm system suspend could be simplified later on to avoid
lots of unnecessary PM runtime calls and the complexity it adds. The
driver can just use internal functions that are shared between the PM
runtime and system suspend related functions.

Fixes: 4918e1f87c ("PM / runtime: Rework pm_runtime_force_suspend/resume()")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-10 19:14:01 +02:00
Wan Jiabing
e84dff1bf0 PM: core: Remove duplicate declaration from header file
struct device is declared twice, so remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-07 19:18:23 +02:00
Nicolas Pitre
0bfa0820c2 PM: clk: make PM clock layer compatible with clocks that must sleep
The clock API splits its interface into sleepable ant atomic contexts:

 - clk_prepare/clk_unprepare for stuff that might sleep

 - clk_enable_clk_disable for anything that may be done in atomic context

The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.

Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.

Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.

For clocks that do implement the enable and disable methods then
everything just works as before.

A note on sparse:
According to https://lwn.net/Articles/109066/ there are things
that sparse can't cope with. In particular, pm_clk_op_lock() and
pm_clk_op_unlock() may or may not lock/unlock psd->lock depending on
some runtime condition. To work around that we tell it the lock is
always untaken for the purpose of static analisys.

Thanks to Naresh Kamboju for reporting issues with the initial patch.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-27 19:29:32 +01:00
Grygorii Strashko
6b61d49a55 PM: runtime: Fix timer_expires data type on 32-bit arches
Commit 8234f6734c ("PM-runtime: Switch autosuspend over to using
hrtimers") switched PM runtime autosuspend to use hrtimers and all
related time accounting in ns, but missed to update the timer_expires
data type in struct dev_pm_info to u64.

This causes the timer_expires value to be truncated on 32-bit
architectures when assignment is done from u64 values:

rpm_suspend()
|- dev->power.timer_expires = expires;

Fix it by changing the timer_expires type to u64.

Fixes: 8234f6734c ("PM-runtime: Switch autosuspend over to using hrtimers")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: 5.0+ <stable@vger.kernel.org> # 5.0+
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-28 16:38:11 +02:00
Paul Cercueil
756a64ce34 PM: Make *_DEV_PM_OPS macros use __maybe_unused
This way, when the dev_pm_ops instance is not referenced anywhere, it
will simply be dropped by the compiler without a warning.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:52:36 +02:00
Paul Cercueil
7a82e97a11 PM: core: introduce pm_ptr() macro
This macro is analogous to the infamous of_match_ptr(). If CONFIG_PM
is enabled, this macro will resolve to its argument, otherwise to NULL.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 13:52:36 +02:00
Rafael J. Wysocki
2fff3f73e8 Documentation: PM: sleep: Update driver flags documentation
Update the documentation of the driver flags for system-wide power
management to reflect the current code flows and be more consistent.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-24 21:35:11 +02:00
Rafael J. Wysocki
2a3f34750b PM: sleep: core: Rename DPM_FLAG_LEAVE_SUSPENDED
Rename DPM_FLAG_LEAVE_SUSPENDED to DPM_FLAG_MAY_SKIP_RESUME which
matches its purpose more closely.

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de> # for I2C
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2020-04-24 21:34:22 +02:00
Rafael J. Wysocki
e07515563d PM: sleep: core: Rename DPM_FLAG_NEVER_SKIP
Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which
matches its purpose more closely.

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for PCI parts
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24 21:33:09 +02:00
Rafael J. Wysocki
fa2bfead91 PM: sleep: core: Rename dev_pm_smart_suspend_and_suspended()
Because all callers of dev_pm_smart_suspend_and_suspended use it only
for checking whether or not to skip driver suspend callbacks for a
device, rename it to dev_pm_skip_suspend() in analogy with
dev_pm_skip_resume().

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2020-04-24 21:32:41 +02:00
Rafael J. Wysocki
76c70cb58c PM: sleep: core: Rename dev_pm_may_skip_resume()
The name of dev_pm_may_skip_resume() may be easily confused with the
power.may_skip_resume flag which is not checked by that function, so
rename the former as dev_pm_skip_resume().

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2020-04-24 21:32:11 +02:00
Ulf Hansson
ca765a8cfe PM / Domains: Introduce dev_pm_domain_start()
For a subsystem/driver that either doesn't support runtime PM or makes use
of pm_runtime_set_active() during ->probe(), may try to access its device
when probing, even if it may not be fully powered on from the PM domain's
point of view. This may be the case when the used PM domain is a genpd
provider, that implements genpd's ->start|stop() device callbacks.

There are cases where the subsystem/driver managed to avoid the above
problem, simply by calling pm_runtime_enable() and pm_runtime_get_sync()
during ->probe(). However, this approach comes with a drawback, especially
if the subsystem/driver implements a ->runtime_resume() callback.

More precisely, the subsystem/driver then needs to use a device flag, which
is checked in its ->runtime_resume() callback, as to avoid powering on its
resources the first time the callback is invoked. This is needed because
the subsystem/driver has already powered on the resources for the device,
during ->probe() and before it called pm_runtime_get_sync().

In a way to avoid this boilerplate code and the inefficient check for "if
(first_time_suspend)" in the ->runtime_resume() callback for these
subsystems/drivers, let's introduce and export a dev_pm_domain_start()
function, that may be called during ->probe() instead.

Moreover, let the dev_pm_domain_start() invoke an optional ->start()
callback, added to the struct dev_pm_domain, as to allow a PM domain
specific implementation.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-13 11:41:50 +01:00
Rafael J. Wysocki
b605c44c30 PM: sleep: Drop dpm_noirq_begin() and dpm_noirq_end()
Note that after previous changes dpm_noirq_begin() and
dpm_noirq_end() each have only one caller, so move the code from
them to their respective callers and drop them.

Also note that dpm_noirq_resume_devices() and
dpm_noirq_suspend_devices() need not be exported any more, so make
them both static.

This change is not expected to alter functionality.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-23 09:46:56 +02:00
Linus Torvalds
fb4da215ed pci-v5.3-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl0siFoUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vzi9A//S4jRyyZrgUr88Az0GbgMhE4b3yqc
 uL7om/Sf+443gG6C+aKkZSM/IE9hrbyIKuYq7GGxDkzZ/HkucZo2yIuAHkPgG4ik
 QQYJ8fJsmMq1bUht87c1ZZwGP0++Deq/Ns2+VNy/WBYqKLulnV0DvEEaJgPs9C5D
 ppwccGdo6UghiujBTpE4ddUBjFjjURWqT6wSnMRDQ4EGwfUhG0MWwwHKI4hbBuaL
 N6refuggdYyUUX5FeUOHa6VF6uTnSSAQ75k+40n4nljdayqoumHLskst77o9q5ZI
 oXjdpwgmuEqYhfp03HEA4Xo/bBxiRj76NuTiEMKvPokxjpanwbLrdV0GhF0OIlM0
 rp1NOI1w+vppFrU+rc2gtq+7hYXFmvdhjS29hFLeD91PP36N5d29jW5NVFpm7GCm
 n4TMGAOsu8RB+bNua6ZbZVcDk2EnPgQeIcM0ZPoBtPK19Fg/rScdEU4u/aFE1Y0Q
 C+Ks7D1qCvFpHzl/xAg0oo9v/jFsWef3qnQWOzot964Zz4W4NSVvB9Ox6Vbfj6C4
 v331LJmlPxG8fxBNA3q28FrTxcG1NW6sgo3WY9VoSp/vc0aqaPKhm7sbraTt5IrI
 TwqA/WhnAHv90MQCGFcofANyYTkjPkKk2QBFK6b0suoAmVdwVWWELi1WaZ+HdvgQ
 JP7YpmC2cXcQBPk=
 =ZGxL
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration changes:

   - Evaluate PCI Boot Configuration _DSM to learn if firmware wants us
     to preserve its resource assignments (Benjamin Herrenschmidt)

   - Simplify resource distribution (Nicholas Johnson)

   - Decode 32 GT/s link speed (Gustavo Pimentel)

  Virtualization:

   - Fix incorrect caching of VF config space size (Alex Williamson)

   - Fix VF driver probing sysfs knobs (Alex Williamson)

  Peer-to-peer DMA:

   - Fix dma_virt_ops check (Logan Gunthorpe)

  Altera host bridge driver:

   - Allow building as module (Ley Foon Tan)

  Armada 8K host bridge driver:

   - add PHYs support (Miquel Raynal)

  DesignWare host bridge driver:

   - Export APIs to support removable loadable module (Vidya Sagar)

   - Enable Relaxed Ordering erratum workaround only on Tegra20 &
     Tegra30 (Vidya Sagar)

  Hyper-V host bridge driver:

   - Fix use-after-free in eject (Dexuan Cui)

  Mobiveil host bridge driver:

   - Clean up and fix many issues, including non-identify mapped
     windows, 64-bit windows, multi-MSI, class code, INTx clearing (Hou
     Zhiqiang)

  Qualcomm host bridge driver:

   - Use clk bulk API for 2.4.0 controllers (Bjorn Andersson)

   - Add QCS404 support (Bjorn Andersson)

   - Assert PERST for at least 100ms (Niklas Cassel)

  R-Car host bridge driver:

   - Add r8a774a1 DT support (Biju Das)

  Tegra host bridge driver:

   - Add support for Gen2, opportunistic UpdateFC and ACK (PCIe protocol
     details) AER, GPIO-based PERST# (Manikanta Maddireddy)

   - Fix many issues, including power-on failure cases, interrupt
     masking in suspend, UPHY settings, AFI dynamic clock gating,
     pending DLL transactions (Manikanta Maddireddy)

  Xilinx host bridge driver:

   - Fix NWL Multi-MSI programming (Bharat Kumar Gogada)

  Endpoint support:

   - Fix 64bit BAR support (Alan Mikhak)

   - Fix pcitest build issues (Alan Mikhak, Andy Shevchenko)

  Bug fixes:

   - Fix NVIDIA GPU multi-function power dependencies (Abhishek Sahu)

   - Fix NVIDIA GPU HDA enablement issue (Lukas Wunner)

   - Ignore lockdep for sysfs "remove" (Marek Vasut)

  Misc:

   - Convert docs to reST (Changbin Du, Mauro Carvalho Chehab)"

* tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (107 commits)
  PCI: Enable NVIDIA HDA controllers
  tools: PCI: Fix installation when `make tools/pci_install`
  PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB
  PCI: Fix typos and whitespace errors
  PCI: mobiveil: Fix INTx interrupt clearing in mobiveil_pcie_isr()
  PCI: mobiveil: Fix infinite-loop in the INTx handling function
  PCI: mobiveil: Move PCIe PIO enablement out of inbound window routine
  PCI: mobiveil: Add upper 32-bit PCI base address setup in inbound window
  PCI: mobiveil: Add upper 32-bit CPU base address setup in outbound window
  PCI: mobiveil: Mask out hardcoded bits in inbound/outbound windows setup
  PCI: mobiveil: Clear the control fields before updating it
  PCI: mobiveil: Add configured inbound windows counter
  PCI: mobiveil: Fix the valid check for inbound and outbound windows
  PCI: mobiveil: Clean-up program_{ib/ob}_windows()
  PCI: mobiveil: Remove an unnecessary return value check
  PCI: mobiveil: Fix error return values
  PCI: mobiveil: Refactor the MEM/IO outbound window initialization
  PCI: mobiveil: Make some register updates more readable
  PCI: mobiveil: Reformat the code for readability
  dt-bindings: PCI: mobiveil: Change gpio_slave and apb_csr to optional
  ...
2019-07-15 20:44:49 -07:00
Rafael J. Wysocki
02bd45a28b PM: sleep: Drop dev_pm_skip_next_resume_phases()
After recent hibernation-related changes, there are no more callers
of dev_pm_skip_next_resume_phases() except for the PM core itself
in which it is more straightforward to run the statements from
that function directly, so do that and drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-07-04 10:50:40 +02:00
Mauro Carvalho Chehab
151f4e2bdc docs: power: convert docs to ReST and rename to *.rst
Convert the PM documents to ReST, in order to allow them to
build with Sphinx.

The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
2019-06-14 16:08:36 -05:00
Thomas Gleixner
1a59d1b8e0 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  59 temple place suite 330 boston ma 02111 1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1334 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:35 -07:00
Ulf Hansson
0996584b30 PM-runtime: Call pm_runtime_active|suspended_time() from sysfs
Avoid the open-coding of the accounted time acquisition in
runtime_active|suspend_time_show() and make them call
pm_runtime_active|suspended_time() instead.

Note that this change also indirectly avoids holding dev->power.lock
around the do_div() computation and the sprintf() call which is an
additional improvement.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-03-07 11:23:17 +01:00
Sudeep Holla
85945c28b5 PM / core: Add support to skip power management in device/driver model
All device objects in the driver model contain fields that control the
handling of various power management activities. However, it's not
always useful. There are few instances where pseudo devices are added
to the model just to take advantage of many other features like
kobjects, udev events, and so on. One such example is cpu devices and
their caches.

The sysfs for the cpu caches are managed by adding devices with cpu
as the parent in cpu_device_create() when secondary cpu is brought
online. Generally when the secondary CPUs are hotplugged back in as part
of resume from suspend-to-ram, we call cpu_device_create() from the cpu
hotplug state machine while the cpu device associated with that CPU is
not yet ready to be resumed as the device_resume() call happens bit
later. It's not really needed to set the flag is_prepared for cpu
devices as they are mostly pseudo device and hotplug framework deals
with state machine and not managed through the cpu device.

This often results in annoying warning when resuming:
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
 cache: parent cpu1 should not be sleeping
CPU1 is up
CPU2: Booted secondary processor
 cache: parent cpu2 should not be sleeping
CPU2 is up
.... and so on.

So in order to fix these kind of errors, we could just completely avoid
doing any power management related initialisations and operations if
they are not used by these devices.

Add no_pm flags to indicate that the device doesn't require any sort of
PM activities and all of them can be completely skipped. We can use the
same flag to also avoid adding not used *power* sysfs entries for these
devices. For now, lets use this for cpu cache devices.

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-19 10:42:43 +01:00
Thara Gopinath
a08c2a5a31 PM-runtime: Replace jiffies-based accounting with ktime-based accounting
Replace jiffies-based accounting for runtime_active_time and
runtime_suspended_time with ktime-based accounting. This makes the
runtime debug counters inline with genpd and other PM subsytems which
use ktime-based accounting.

Timekeeping is initialized before driver_init(). It's only at that time
that PM-runtime can be enabled.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
[switch from ktime to raw nsec]
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-31 10:45:10 +01:00
Vincent Guittot
8234f6734c PM-runtime: Switch autosuspend over to using hrtimers
PM-runtime uses the timer infrastructure for autosuspend. This implies
that the minimum time before autosuspending a device is in the range
of 1 tick included to 2 ticks excluded
 -On arm64 this means between 4ms and 8ms with default jiffies
  configuration
 -And on arm, it is between 10ms and 20ms

These values are quite high for embedded systems which sometimes want
the duration to be in the range of 1 ms.

It is possible to switch autosuspend over to using hrtimers to get
finer granularity for short durations and take advantage of slack to
retain some margins and get long timeouts with minimum wakeups.

On an arm64 platform that uses 1ms for autosuspending timeout of its
GPU, idle power is reduced by 10% with hrtimer.

The latency impact on arm64 hikey octo cores is:
 - mark_last_busy: from 1.11 us to 1.25 us
 - rpm_suspend: from 15.54 us to 15.38 us
[Only the code path of rpm_suspend() that starts hrtimer has been
measured.]

arm64 image (arm64 default defconfig) decreases by around 3KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
12034646	6869268	 386840	19290754	1265a82	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
12030550	6870164	 387032	19287746	1264ec2	vmlinux

The latency impact on arm 32bits snowball dual cores is :
 - mark_last_busy: from 0.31 us usec to 0.77 us
 - rpm_suspend: from 6.83 us to 6.67 usec

The increase of the image for snowball platform that I used for
testing performance impact, is neglictable (244B).

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
7157961	2119580	 264120	9541661	 91981d	build-ux500/vmlinux

size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
7157773	2119884	 264248	9541905	 919911	vmlinux-hrtimer

And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
13304443	6803420	 402768	20510631	138f7a7	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
13304299	6805276	 402768	20512343	138fe57	vmlinux

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-19 10:31:50 +01:00
Rafael J. Wysocki
c51a024e39 Merge back PM core material for v4.16. 2017-12-16 02:05:48 +01:00
Rafael J. Wysocki
3487972d7f PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume()
Middle-layer code doing suspend-time optimizations for devices with
the DPM_FLAG_SMART_SUSPEND flag set (currently, the PCI bus type and
the ACPI PM domain) needs to make the core skip ->thaw_early and
->thaw callbacks for those devices in some cases and it sets the
power.direct_complete flag for them for this purpose.

However, it turns out that setting power.direct_complete outside of
the PM core is a bad idea as it triggers an excess invocation of
pm_runtime_enable() in device_resume().

For this reason, provide a helper to clear power.is_late_suspended
and power.is_suspended to be invoked by the middle-layer code in
question instead of setting power.direct_complete and make that code
call the new helper.

Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Fixes: 05087360fd (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-12-11 14:32:56 +01:00
Rafael J. Wysocki
0d4b54c6fe PM / core: Add LEAVE_SUSPENDED driver flag
Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
instruct the PM core and middle-layer (bus type, PM domain, etc.)
code that it is desirable to leave the device in runtime suspend
after system-wide transitions to the working state (for example,
the device may be slow to resume and it may be better to avoid
resuming it right away).

Generally, the middle-layer code involved in the handling of the
device is expected to indicate to the PM core whether or not the
device may be left in suspend with the help of the device's
power.may_skip_resume status bit.  That has to happen in the "noirq"
phase of the preceding system suspend (or analogous) transition.
The middle layer is then responsible for handling the device as
appropriate in its "noirq" resume callback which is executed
regardless of whether or not the device may be left suspended, but
the other resume callbacks (except for ->complete) will be skipped
automatically by the core if the device really can be left in
suspend.

The additional power.must_resume status bit introduced for the
implementation of this mechanisn is used internally by the PM core
to track the requirement to resume the device (which may depend on
its children etc).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-27 01:20:59 +01:00
Rafael J. Wysocki
c4b65157ae PCI / PM: Take SMART_SUSPEND driver flag into account
Make the PCI bus type take DPM_FLAG_SMART_SUSPEND into account in its
system-wide PM callbacks and make sure that all code that should not
run in parallel with pci_pm_runtime_resume() is executed in the "late"
phases of system suspend, freeze and poweroff transitions.

[Note that the pm_runtime_suspended() check in pci_dev_keep_suspended()
is an optimization, because if is not passed, all of the subsequent
checks may be skipped and some of them are much more overhead in
general.]

Also use the observation that if the device is in runtime suspend
at the beginning of the "late" phase of a system-wide suspend-like
transition, its state cannot change going forward (runtime PM is
disabled for it at that time) until the transition is over and the
subsequent system-wide PM callbacks should be skipped for it (as
they generally assume the device to not be suspended), so add checks
for that in pci_pm_suspend_late/noirq(), pci_pm_freeze_late/noirq()
and pci_pm_poweroff_late/noirq().

Moreover, if pci_pm_resume_noirq() or pci_pm_restore_noirq() is
called during the subsequent system-wide resume transition and if
the device was left in runtime suspend previously, its runtime PM
status needs to be changed to "active" as it is going to be put
into the full-power state, so add checks for that too to these
functions.

In turn, if pci_pm_thaw_noirq() runs after the device has been
left in runtime suspend, the subsequent "thaw" callbacks need
to be skipped for it (as they may not work correctly with a
suspended device), so set the power.direct_complete flag for the
device then to make the PM core skip those callbacks.

In addition to the above add a core helper for checking if
DPM_FLAG_SMART_SUSPEND is set and the device runtime PM status is
"suspended" at the same time, which is done quite often in the new
code (and will be done elsewhere going forward too).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-06 13:57:46 +01:00
Rafael J. Wysocki
0eab11c9ae PM / core: Add SMART_SUSPEND driver flag
Define and document a SMART_SUSPEND flag to instruct bus types and PM
domains that the system suspend callbacks provided by the driver can
cope with runtime-suspended devices, so from the driver's perspective
it should be safe to leave devices in runtime suspend during system
suspend.

Setting that flag may also cause middle-layer code (bus types,
PM domains etc.) to skip invocations of the ->suspend_late and
->suspend_noirq callbacks provided by the driver if the device
is in runtime suspend at the beginning of the "late" phase of
the system-wide suspend transition, in which case the driver's
system-wide resume callbacks may be invoked back-to-back with
its ->runtime_suspend callback, so the driver has to be able to
cope with that too.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-06 13:57:01 +01:00
Rafael J. Wysocki
08810a4119 PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
The motivation for this change is to provide a way to work around
a problem with the direct-complete mechanism used for avoiding
system suspend/resume handling for devices in runtime suspend.

The problem is that some middle layer code (the PCI bus type and
the ACPI PM domain in particular) returns positive values from its
system suspend ->prepare callbacks regardless of whether the driver's
->prepare returns a positive value or 0, which effectively prevents
drivers from being able to control the direct-complete feature.
Some drivers need that control, however, and the PCI bus type has
grown its own flag to deal with this issue, but since it is not
limited to PCI, it is better to address it by adding driver flags at
the core level.

To that end, add a driver_flags field to struct dev_pm_info for flags
that can be set by device drivers at the probe time to inform the PM
core and/or bus types, PM domains and so on on the capabilities and/or
preferences of device drivers.  Also add two static inline helpers
for setting that field and testing it against a given set of flags
and make the driver core clear it automatically on driver remove
and probe failures.

Define and document two PM driver flags related to the direct-
complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
respectively, to indicate to the PM core that the direct-complete
mechanism should never be used for the device and to inform the
middle layer code (bus types, PM domains etc) that it can only
request the PM core to use the direct-complete mechanism for
the device (by returning a positive value from its ->prepare
callback) if it also has been requested by the driver.

While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-06 13:55:30 +01:00
Ulf Hansson
0e708fc602 PM / sleep: Remove pm_complete_with_resume_check()
According to recent changes for ACPI, the are longer any users of
pm_complete_with_resume_check(), thus let's drop it.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-11 15:40:29 +02:00
Rafael J. Wysocki
786f41fb6b PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()
Put the device interrupts disabling and enabling as well as
cpuidle_pause() and cpuidle_resume() called during the "noirq"
stages of system suspend into separate functions to allow the
core suspend-to-idle code to be optimized (later).

The only functional difference this makes is that debug facilities
and diagnostic tools will not include the above operations into the
"noirq" device suspend/resume duration measurements.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24 23:53:45 +02:00
Rafael J. Wysocki
de3ef1eb1c PM / core: Drop run_wake flag from struct dev_pm_info
The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial.  The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.

For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28 01:52:52 +02:00
Rafael J. Wysocki
2728b2d2e5 PM / core / docs: Convert sleep states API document to reST
Move the document describing the system sleep state transitions API
for devices to Documentation/driver-api/pm/, convert it to reST and
update it to use current terminology.  Also remove the remaining
reference to the old version of it from pm.h.

The new document still contains references to some documents in the
.txt format that will be converted later.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-02-06 11:25:55 -07:00
Rafael J. Wysocki
4d29b2e5ad PM / core: Update kerneldoc comments in pm.h
Refresh the struct dev_pm_ops kerneldoc comment, so that it looks
better and is more readable after processing by Sphinx, and drop
the kerneldoc marker from a few other comments ("PM_EVENT_ messages"
and a couple of enum types declarations) which are not proper
kerneldoc and generally confuse Sphinx.

Also change the comment describing struct dev_pm_domain into
a kerneldoc one.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-02-06 11:11:19 -07:00
Linus Torvalds
098c30557a Driver core patches for 4.10-rc1
Here's the new driver core patches for 4.10-rc1.
 
 Big thing here is the nice addition of "functional dependencies" to the
 driver core.  The idea has been talked about for a very long time, great
 job to Rafael for stepping up and implementing it. It's been tested for
 longer than the 4.9-rc1 date, we held off on merging it earlier in order
 to feel more comfortable about it.
 
 Other than that, it's just a handful of small other patches, some good
 cleanups to the mess that is the firmware class code, and we have a test
 driver for the deferred probe logic.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWFAvPQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym3NgCgmhFeWEkp9SDt17YGGavmnzQUlBQAoJlUipJp
 PHeQkq15ZWw3wWC9FEvM
 =91M1
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here's the new driver core patches for 4.10-rc1.

  Big thing here is the nice addition of "functional dependencies" to
  the driver core. The idea has been talked about for a very long time,
  great job to Rafael for stepping up and implementing it. It's been
  tested for longer than the 4.9-rc1 date, we held off on merging it
  earlier in order to feel more comfortable about it.

  Other than that, it's just a handful of small other patches, some good
  cleanups to the mess that is the firmware class code, and we have a
  test driver for the deferred probe logic.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits)
  firmware: Correct handling of fw_state_wait() return value
  driver core: Silence device links sphinx warning
  firmware: remove warning at documentation generation time
  drivers: base: dma-mapping: Fix typo in dmam_alloc_non_coherent comments
  driver core: test_async: fix up typo found by 0-day
  firmware: move fw_state_is_done() into UHM section
  firmware: do not use fw_lock for fw_state protection
  firmware: drop bit ops in favor of simple state machine
  firmware: refactor loading status
  firmware: fix usermode helper fallback loading
  driver core: firmware_class: convert to use class_groups
  driver core: devcoredump: convert to use class_groups
  driver core: class: add class_groups support
  kernfs: Declare two local data structures static
  driver-core: fix platform_no_drv_owner.cocci warnings
  drivers/base/memory.c: Remove unused 'first_page' variable
  driver core: add CLASS_ATTR_WO()
  drivers: base: cacheinfo: support DT overrides for cache properties
  drivers: base: cacheinfo: add pr_fmt logging
  drivers: base: cacheinfo: fix boot error message when acpi is enabled
  ...
2016-12-13 11:42:18 -08:00
Rafael J. Wysocki
baa8809f60 PM / runtime: Optimize the use of device links
If the device has no links to suppliers that should be used for
runtime PM (links with DEVICE_LINK_PM_RUNTIME set), there is no
reason to walk the list of suppliers for that device during
runtime suspend and resume.

Add a simple mechanism to detect that case and possibly avoid the
extra unnecessary overhead.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 11:42:51 -06:00
Rafael J. Wysocki
9ed9895370 driver core: Functional dependencies tracking support
Currently, there is a problem with taking functional dependencies
between devices into account.

What I mean by a "functional dependency" is when the driver of device
B needs device A to be functional and (generally) its driver to be
present in order to work properly.  This has certain consequences
for power management (suspend/resume and runtime PM ordering) and
shutdown ordering of these devices.  In general, it also implies that
the driver of A needs to be working for B to be probed successfully
and it cannot be unbound from the device before the B's driver.

Support for representing those functional dependencies between
devices is added here to allow the driver core to track them and act
on them in certain cases where applicable.

The argument for doing that in the driver core is that there are
quite a few distinct use cases involving device dependencies, they
are relatively hard to get right in a driver (if one wants to
address all of them properly) and it only gets worse if multiplied
by the number of drivers potentially needing to do it.  Morever, at
least one case (asynchronous system suspend/resume) cannot be handled
in a single driver at all, because it requires the driver of A to
wait for B to suspend (during system suspend) and the driver of B to
wait for A to resume (during system resume).

For this reason, represent dependencies between devices as "links",
with the help of struct device_link objects each containing pointers
to the "linked" devices, a list node for each of them, status
information, flags, and an RCU head for synchronization.

Also add two new list heads, representing the lists of links to the
devices that depend on the given one (consumers) and to the devices
depended on by it (suppliers), and a "driver presence status" field
(needed for figuring out initial states of device links) to struct
device.

The entire data structure consisting of all of the lists of link
objects for all devices is protected by a mutex (for link object
addition/removal and for list walks during device driver probing
and removal) and by SRCU (for list walking in other case that will
be introduced by subsequent change sets).  If CONFIG_SRCU is not
selected, however, an rwsem is used for protecting the entire data
structure.

In addition, each link object has an internal status field whose
value reflects whether or not drivers are bound to the devices
pointed to by the link or probing/removal of their drivers is in
progress etc.  That field is only modified under the device links
mutex, but it may be read outside of it in some cases (introduced by
subsequent change sets), so modifications of it are annotated with
WRITE_ONCE().

New links are added by calling device_link_add() which takes three
arguments: pointers to the devices in question and flags.  In
particular, if DL_FLAG_STATELESS is set in the flags, the link status
is not to be taken into account for this link and the driver core
will not manage it.  In turn, if DL_FLAG_AUTOREMOVE is set in the
flags, the driver core will remove the link automatically when the
consumer device driver unbinds from it.

One of the actions carried out by device_link_add() is to reorder
the lists used for device shutdown and system suspend/resume to
put the consumer device along with all of its children and all of
its consumers (and so on, recursively) to the ends of those lists
in order to ensure the right ordering between all of the supplier
and consumer devices.

For this reason, it is not possible to create a link between two
devices if the would-be supplier device already depends on the
would-be consumer device as either a direct descendant of it or a
consumer of one of its direct descendants or one of its consumers
and so on.

There are two types of link objects, persistent and non-persistent.
The persistent ones stay around until one of the target devices is
deleted, while the non-persistent ones are removed automatically when
the consumer driver unbinds from its device (ie. they are assumed to
be valid only as long as the consumer device has a driver bound to
it).  Persistent links are created by default and non-persistent
links are created when the DL_FLAG_AUTOREMOVE flag is passed
to device_link_add().

Both persistent and non-persistent device links can be deleted
with an explicit call to device_link_del().

Links created without the DL_FLAG_STATELESS flag set are managed
by the driver core using a simple state machine.  There are 5 states
each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
is present and functional), CONSUMER_PROBE (the consumer driver is
probing), ACTIVE (both supplier and consumer drivers are present and
functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
The driver core updates the link state automatically depending on
what happens to the linked devices and for each link state specific
actions are taken in addition to that.

For example, if the supplier driver unbinds from its device, the
driver core will also unbind the drivers of all of its consumers
automatically under the assumption that they cannot function
properly without the supplier.  Analogously, the driver core will
only allow the consumer driver to bind to its device if the
supplier driver is present and functional (ie. the link is in
the AVAILABLE state).  If that's not the case, it will rely on
the existing deferred probing mechanism to wait for the supplier
driver to become available.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 11:36:20 -06:00
Mauro Carvalho Chehab
8c27ceff36 docs: fix locations of several documents that got moved
The previous patch renamed several files that are cross-referenced
along the Kernel documentation. Adjust the links to point to
the right places.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-10-24 08:12:35 -02:00