linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-11-01 17:08:10 +00:00

Author	SHA1	Message	Date
Valentin Schneider	ecddc3a0d5	arch_topology, cpufreq: constify arch_* cpumasks The passed cpumask arguments to arch_set_freq_scale() and arch_freq_counters_available() are only iterated over, so reflect this in the prototype. This also allows to pass system cpumasks like cpu_online_mask without getting a warning. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-18 19:11:04 +02:00
Ionela Voinescu	874f635310	cpufreq: report whether cpufreq supports Frequency Invariance (FI) Now that the update of the FI scale factor is done in cpufreq core for selected functions - target(), target_index() and fast_switch(), we can provide feedback to the task scheduler and architecture code on whether cpufreq supports FI. For this purpose provide an external function to expose whether the cpufreq drivers support FI, by using a static key. The logic behind the enablement of cpufreq-based invariance is as follows: - cpufreq-based invariance is disabled by default - cpufreq-based invariance is enabled if any of the callbacks above is implemented while the unsupported setpolicy() is not The cpufreq_supports_freq_invariance() function only returns whether cpufreq is instrumented with the arch_set_freq_scale() calls that result in support for frequency invariance. Due to the lack of knowledge on whether the implementation of arch_set_freq_scale() actually results in the setting of a scale factor based on cpufreq information, it is up to the architecture code to ensure the setting and provision of the scale factor to the scheduler. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-18 19:10:56 +02:00
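The enablement rule described in commit 874f635310 can be sketched in plain C. This is only an illustration: `struct demo_driver` and `demo_supports_freq_invariance()` are hypothetical names, and a plain boolean check stands in for the static key the commit message mentions.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, trimmed-down stand-in for a cpufreq driver description.
 * Only the four callbacks named in the commit message are modelled.
 */
struct demo_driver {
	int (*target)(void);
	int (*target_index)(void);
	int (*fast_switch)(void);
	int (*setpolicy)(void);
};

/* Placeholder callback so that a pointer can be marked "implemented". */
static int demo_noop(void)
{
	return 0;
}

/*
 * Rule from the commit message: invariance is disabled by default and
 * enabled if any of target(), target_index() or fast_switch() is
 * implemented while the unsupported setpolicy() is not.
 */
static bool demo_supports_freq_invariance(const struct demo_driver *drv)
{
	if (drv == NULL || drv->setpolicy != NULL)
		return false;

	return drv->target != NULL ||
	       drv->target_index != NULL ||
	       drv->fast_switch != NULL;
}
```

As the commit message stresses, a positive answer only says that cpufreq is instrumented with the setter calls. Whether a scale factor actually reaches the scheduler is still up to the architecture code.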
Ionela Voinescu	1a0419b0db	cpufreq: move invariance setter calls in cpufreq core To properly scale its per-entity load-tracking signals, the task scheduler needs to be given a frequency scale factor, i.e. some image of the current frequency the CPU is running at. Currently, this scale can be computed either by using counters (APERF/MPERF on x86, AMU on arm64), or by piggy-backing on the frequency selection done by cpufreq. For the latter, drivers have to explicitly set the scale factor themselves, despite it being purely boiler-plate code: the required information depends entirely on the kind of frequency switch callback implemented by the driver, i.e. either of: target_index(), target(), fast_switch() and setpolicy(). The fitness of those callbacks with regard to driving the Frequency Invariance Engine (FIE) is studied below: target_index() ============== Documentation states that the chosen frequency "must be determined by freq_table[index].frequency". It isn't clear if it has to be that frequency, or if it can use that frequency value to do some computation that ultimately leads to a different frequency selection. All drivers go for the former, while the vexpress-spc-cpufreq has an atypical implementation which is handled separately. Therefore, the hook works on the assumption the core can use freq_table[index].frequency. 
target() ======= This has been flagged as deprecated since: commit `9c0ebcf78f` ("cpufreq: Implement light weight ->target_index() routine") It also doesn't have that many users: gx-suspmod.c:439: .target = cpufreq_gx_target, s3c24xx-cpufreq.c:428: .target = s3c_cpufreq_target, intel_pstate.c:2528: .target = intel_cpufreq_target, cppc_cpufreq.c:401: .target = cppc_cpufreq_set_target, cpufreq-nforce2.c:371: .target = nforce2_target, sh-cpufreq.c:163: .target = sh_cpufreq_target, pcc-cpufreq.c:573: .target = pcc_cpufreq_target, Similarly to the path taken for target_index() calls in the cpufreq core during a frequency change, all of the drivers above will mark the end of a frequency change by a call to cpufreq_freq_transition_end(). Therefore, cpufreq_freq_transition_end() can be used as the location for the arch_set_freq_scale() call to potentially inform the scheduler of the frequency change. This change maintains the previous functionality for the drivers that implement the target_index() callback, while also adding support for the few drivers that implement the deprecated target() callback. fast_switch() ============= This callback has to return the frequency that was selected. setpolicy() =========== This callback does not have any designated way of informing what was the end choice. But there are only two drivers using setpolicy(), and none of them have current FIE support: drivers/cpufreq/longrun.c:281: .setpolicy = longrun_set_policy, drivers/cpufreq/intel_pstate.c:2215: .setpolicy = intel_pstate_set_policy, The intel_pstate is known to use counter-driven frequency invariance. Conclusion ========== Given that the significant majority of current FIE enabled drivers use callbacks that lend themselves to triggering the setting of the FIE scale factor in a generic way, move the invariance setter calls to cpufreq core. 
As a result of setting the frequency scale factor in cpufreq core, after callbacks that lend themselves to trigger it, remove this functionality from the driver side. Note that despite marking a successful frequency change, many cpufreq drivers will consider the new frequency as the requested frequency, although this might not be the one granted by the hardware. Therefore, the call to arch_set_freq_scale() is a "best effort" one, and it is up to the architecture if the new frequency is used in the new frequency scale factor setting (determined by the implementation of arch_set_freq_scale()) or eventually used by the scheduler (determined by the implementation of arch_scale_freq_capacity()). The architecture is in a better position to decide if it has better methods to obtain more accurate information regarding the current frequency and use that information instead (for example, the use of counters). Also, the implementation of arch_set_freq_scale() will now have to handle error conditions (current frequency == 0) in order to prevent the overhead in cpufreq core when the default arch_set_freq_scale() implementation is used. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Suggested-by: Valentin Schneider <valentin.schneider@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-18 19:10:42 +02:00
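A minimal sketch of what an architecture-side setter might compute for commit 1a0419b0db, assuming the scale factor is the ratio of current to maximum frequency in fixed point, with 1024 meaning "running at maximum frequency". The names `DEMO_CAPACITY_SHIFT` and `demo_freq_scale()` are hypothetical, and keeping the previous factor is only one possible way to handle the "current frequency == 0" error condition the commit message refers to.

```c
#include <stdint.h>

/* Assumption: fixed-point scale where (1 << 10) == 1024 means full speed. */
#define DEMO_CAPACITY_SHIFT 10

/*
 * Return the new frequency scale factor for a CPU.
 *
 * cur_freq and max_freq are in the same unit (for example kHz).
 * If either value is 0 the request is treated as an error and the
 * previous factor is kept, so the caller never divides by zero and
 * never publishes a bogus factor.
 */
static uint64_t demo_freq_scale(uint64_t cur_freq, uint64_t max_freq,
				uint64_t prev_scale)
{
	if (cur_freq == 0 || max_freq == 0)
		return prev_scale;

	return (cur_freq << DEMO_CAPACITY_SHIFT) / max_freq;
}
```

Because the frequency passed in is the requested one and not necessarily the one granted by the hardware, the result is best effort, which is why an architecture with counters may prefer those instead.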
Viresh Kumar	681fe68448	cpufreq: No need to verify cpufreq_driver in show_scaling_cur_freq() "cpufreq_driver" is guaranteed to be valid here, no need to check it here. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-08-27 12:51:25 +02:00
Rafael J. Wysocki	f6ebbcf08f	cpufreq: intel_pstate: Implement passive mode with HWP enabled Allow intel_pstate to work in the passive mode with HWP enabled and make it set the HWP minimum performance limit (HWP floor) to the P-state value given by the target frequency supplied by the cpufreq governor, so as to prevent the HWP algorithm and the CPU scheduler from working against each other, at least when the schedutil governor is in use, and update the intel_pstate documentation accordingly. Among other things, this allows utilization clamps to be taken into account, at least to a certain extent, when intel_pstate is in use and makes it more likely that sufficient capacity for deadline tasks will be provided. After this change, the resulting behavior of an HWP system with intel_pstate in the passive mode should be close to the behavior of the analogous non-HWP system with intel_pstate in the passive mode, except that the HWP algorithm is generally allowed to make the CPU run at a frequency above the floor P-state set by intel_pstate in the entire available range of P-states, while without HWP a CPU can run in a P-state above the requested one if the latter falls into the range of turbo P-states (referred to as the turbo range) or if the P-states of all CPUs in one package are coordinated with each other at the hardware level. [Note that in principle the HWP floor may not be taken into account by the processor if it falls into the turbo range, in which case the processor has a license to choose any P-state, either below or above the HWP floor, just like a non-HWP processor in the case when the target P-state falls into the turbo range.] With this change applied, intel_pstate in the passive mode assumes complete control over the HWP request MSR and concurrent changes of that MSR (eg. via the direct MSR access interface) are overridden by it. Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2020-08-11 17:29:45 +02:00
Rafael J. Wysocki	9ac1fb156a	Merge branch 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq driver changes for v5.9-rc1 from Viresh Kumar: "Here are the details: - Adaptive voltage scaling (AVS) support and minor cleanups for brcmstb driver (Florian Fainelli and Markus Mayer). - A new tegra driver and cleanup for the existing one (Sumit Gupta and Jon Hunter). - Bandwidth level support for Qcom driver along with OPP changes (Sibi Sankar). - Cleanups to sti, cpufreq-dt, ap806, CPPC drivers (Viresh Kumar, Lee Jones, Ivan Kokshaysky, Sven Auhagen, and Xin Hao). - Make schedutil default governor for ARM (Valentin Schneider). - Fix dependency issues for imx (Walter Lozano). - Cleanup around cached_resolved_idx in cpufreq core (Viresh Kumar)." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: make schedutil the default for arm and arm64 cpufreq: cached_resolved_idx can not be negative cpufreq: Add Tegra194 cpufreq driver dt-bindings: arm: Add NVIDIA Tegra194 CPU Complex binding cpufreq: imx: Select NVMEM_IMX_OCOTP cpufreq: sti-cpufreq: Fix some formatting and misspelling issues cpufreq: tegra186: Simplify probe return path cpufreq: CPPC: Reuse caps variable in few routines cpufreq: ap806: fix cpufreq driver needs ap cpu clk cpufreq: cppc: Reorder code and remove apply_hisi_workaround variable cpufreq: dt: fix oops on armada37xx cpufreq: brcmstb-avs-cpufreq: send S2_ENTER / S2_EXIT commands to AVS cpufreq: brcmstb-avs-cpufreq: Support polling AVS firmware cpufreq: brcmstb-avs-cpufreq: more flexible interface for __issue_avs_command() cpufreq: qcom: Disable fast switch when scaling DDR/L3 cpufreq: qcom: Update the bandwidth levels on frequency change OPP: Add and export helper to set bandwidth cpufreq: blacklist SC7180 in cpufreq-dt-platdev cpufreq: blacklist SDM845 in cpufreq-dt-platdev	2020-08-04 12:44:53 +02:00
Viresh Kumar	292072c387	cpufreq: cached_resolved_idx can not be negative It is not possible for cached_resolved_idx to be invalid here as the cpufreq core always sets index to a positive value. Change its type to unsigned int and fix qcom usage a bit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>	2020-07-30 11:40:16 +05:30
Lee Jones	a9909c217f	cpufreq: cpufreq: Demote lots of function headers unworthy of kerneldoc status Also provide missing function parameter description for 'cpu' and 'policy'. Fixes the following W=1 kernel build warning(s): drivers/cpufreq/cpufreq.c:60: warning: cannot understand function prototype: 'struct cpufreq_driver *cpufreq_driver; ' drivers/cpufreq/cpufreq.c:90: warning: Function parameter or member 'cpufreq_policy_notifier_list' not described in 'BLOCKING_NOTIFIER_HEAD' drivers/cpufreq/cpufreq.c:312: warning: Function parameter or member 'val' not described in 'adjust_jiffies' drivers/cpufreq/cpufreq.c:312: warning: Function parameter or member 'ci' not described in 'adjust_jiffies' drivers/cpufreq/cpufreq.c:538: warning: Function parameter or member 'policy' not described in 'cpufreq_driver_resolve_freq' drivers/cpufreq/cpufreq.c:686: warning: Function parameter or member 'file_name' not described in 'show_one' drivers/cpufreq/cpufreq.c:686: warning: Function parameter or member 'object' not described in 'show_one' drivers/cpufreq/cpufreq.c:731: warning: Function parameter or member 'file_name' not described in 'store_one' drivers/cpufreq/cpufreq.c:731: warning: Function parameter or member 'object' not described in 'store_one' drivers/cpufreq/cpufreq.c:741: warning: Function parameter or member 'policy' not described in 'show_cpuinfo_cur_freq' drivers/cpufreq/cpufreq.c:741: warning: Function parameter or member 'buf' not described in 'show_cpuinfo_cur_freq' drivers/cpufreq/cpufreq.c:754: warning: Function parameter or member 'policy' not described in 'show_scaling_governor' drivers/cpufreq/cpufreq.c:754: warning: Function parameter or member 'buf' not described in 'show_scaling_governor' drivers/cpufreq/cpufreq.c:770: warning: Function parameter or member 'policy' not described in 'store_scaling_governor' drivers/cpufreq/cpufreq.c:770: warning: Function parameter or member 'buf' not described in 'store_scaling_governor' drivers/cpufreq/cpufreq.c:770: 
warning: Function parameter or member 'count' not described in 'store_scaling_governor' drivers/cpufreq/cpufreq.c:806: warning: Function parameter or member 'policy' not described in 'show_scaling_driver' drivers/cpufreq/cpufreq.c:806: warning: Function parameter or member 'buf' not described in 'show_scaling_driver' drivers/cpufreq/cpufreq.c:815: warning: Function parameter or member 'policy' not described in 'show_scaling_available_governors' drivers/cpufreq/cpufreq.c:815: warning: Function parameter or member 'buf' not described in 'show_scaling_available_governors' drivers/cpufreq/cpufreq.c:859: warning: Function parameter or member 'policy' not described in 'show_related_cpus' drivers/cpufreq/cpufreq.c:859: warning: Function parameter or member 'buf' not described in 'show_related_cpus' drivers/cpufreq/cpufreq.c:867: warning: Function parameter or member 'policy' not described in 'show_affected_cpus' drivers/cpufreq/cpufreq.c:867: warning: Function parameter or member 'buf' not described in 'show_affected_cpus' drivers/cpufreq/cpufreq.c:901: warning: Function parameter or member 'policy' not described in 'show_bios_limit' drivers/cpufreq/cpufreq.c:901: warning: Function parameter or member 'buf' not described in 'show_bios_limit' drivers/cpufreq/cpufreq.c:1625: warning: Function parameter or member 'dev' not described in 'cpufreq_remove_dev' drivers/cpufreq/cpufreq.c:1625: warning: Function parameter or member 'sif' not described in 'cpufreq_remove_dev' drivers/cpufreq/cpufreq.c:2380: warning: Function parameter or member 'cpu' not described in 'cpufreq_get_policy' drivers/cpufreq/cpufreq.c:2771: warning: Function parameter or member 'driver' not described in 'cpufreq_unregister_driver' Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-15 15:17:06 +02:00
Viresh Kumar	3a7e4fbbfd	cpufreq: Remove the weakly defined cpufreq_default_governor() The default cpufreq governor is chosen with the help of a "choice" option in the Kconfig which will always end up selecting one of the governors and so the weakly defined definition of cpufreq_default_governor() will never get called. Moreover, this makes us skip the checking of the return value of that routine as it will always be non NULL. If the Kconfig option changes in future, then we will start getting a link error instead (and it won't go unnoticed as in the case of the weak definition). Suggested-by: Quentin Perret <qperret@google.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-02 13:03:31 +02:00
Quentin Perret	8412b4563e	cpufreq: Specify default governor on command line Currently, the only way to specify the default CPUfreq governor is via Kconfig options, which suits users who can build the kernel themselves perfectly. However, for those who use a distro-like kernel (such as Android, with the Generic Kernel Image project), the only way to use a non-default governor is to boot to userspace, and to then switch using the sysfs interface. Being able to specify the default governor on the command line, as is the case for cpuidle, would allow those users to specify their governor of choice earlier on, and to simplify the userspace boot procedure slightly. To support this use-case, add a kernel command line parameter allowing the default governor for CPUfreq to be specified, which takes precedence over the built-in default. This implementation has one notable limitation: the default governor must be registered before the driver. This is solved for builtin governors and drivers using appropriate *_initcall() functions. And in the modular case, this must be reflected as a constraint on the module loading order. Signed-off-by: Quentin Perret <qperret@google.com> [ Viresh: Converted 'default_governor' to a string and parsing it only at initcall level, and several updates to cpufreq_init_policy(). ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-02 13:03:30 +02:00
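The precedence described in commit 8412b4563e (command-line choice first, built-in default otherwise) might look roughly like the sketch below. `demo_pick_governor()` and its parameters are hypothetical. Falling back to the built-in default when the named governor is not yet registered is an assumption, made here to illustrate the "must be registered before the driver" limitation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Pick the governor name to use for a new policy.
 *
 * cmdline_name:    name given on the kernel command line, or NULL/"" if none
 * registered:      names of the governors registered so far
 * n:               number of entries in registered
 * builtin_default: name selected at build time
 *
 * The command-line name wins only if a governor with that name is
 * already registered. Otherwise the built-in default is returned
 * (assumed fallback behaviour).
 */
static const char *demo_pick_governor(const char *cmdline_name,
				      const char *const *registered,
				      size_t n,
				      const char *builtin_default)
{
	if (cmdline_name != NULL && cmdline_name[0] != '\0') {
		for (size_t i = 0; i < n; i++) {
			if (strcmp(registered[i], cmdline_name) == 0)
				return registered[i];
		}
	}

	return builtin_default;
}
```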
Viresh Kumar	8cc46ae565	cpufreq: Fix locking issues with governors The locking around governors handling isn't adequate currently. The list of governors should never be traversed without the locking in place. Also governor modules must not be removed while the code in them is still in use. Reported-by: Quentin Perret <qperret@google.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: All applicable <stable@vger.kernel.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-02 13:02:55 +02:00
Xiongfeng Wang	cf6fada715	cpufreq: change '.set_boost' to act on one policy Macro 'for_each_active_policy()' is defined internally. To avoid some cpufreq driver needing this macro to iterate over all the policies in '.set_boost' callback, we redefine '.set_boost' to act on only one policy and pass the policy as an argument. 'cpufreq_boost_trigger_state()' iterates over all the policies to set boost for the system. This is preparation for adding SW BOOST support for CPPC. To protect Boost enable/disable by sysfs from CPU online/offline, add 'cpu_hotplug_lock' before calling '.set_boost' for each CPU. Also move the lock from 'set_boost()' to 'store_cpb()' in acpi_cpufreq. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-06-05 14:20:02 +02:00
Rafael J. Wysocki	552abb884e	cpufreq: Fix up cpufreq_boost_set_sw() After commit `18c49926c4` ("cpufreq: Add QoS requests for userspace constraints") the return value of freq_qos_update_request(), that can be 1, passed by cpufreq_boost_set_sw() to its caller sometimes confuses the latter, which only expects to see 0 or negative error codes, so notice that cpufreq_boost_set_sw() can return an error code (which should not be -EINVAL for that matter) as soon as the first policy without a frequency table is found (because either all policies have a frequency table or none of them have it) and rework it to meet its caller's expectations. Fixes: `18c49926c4` ("cpufreq: Add QoS requests for userspace constraints") Reported-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reported-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 5.3+ <stable@vger.kernel.org> # 5.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-18 13:01:28 +02:00
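The return-value problem fixed by commit 552abb884e can be illustrated with a small model. `struct demo_policy` and `demo_boost_set_sw()` are hypothetical, and the choice of `-ENXIO` for "no frequency table" is an assumption (the commit message only says the code should not be `-EINVAL`).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one policy as seen by the boost code. */
struct demo_policy {
	int has_freq_table; /* non-zero if the policy has a frequency table */
	int qos_ret;        /* what a QoS update would return: <0, 0 or 1 */
};

/*
 * Apply a software boost change to every policy.
 *
 * The underlying QoS update may return 1 on success, which the caller
 * of this function does not expect. Only 0 or a negative error code is
 * ever returned from here.
 */
static int demo_boost_set_sw(const struct demo_policy *policies, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Either all policies have a table or none do: bail out early. */
		if (!policies[i].has_freq_table)
			return -ENXIO; /* assumed error code */

		if (policies[i].qos_ret < 0)
			return policies[i].qos_ret;

		/* 0 and 1 both mean success, so neither is passed on as-is. */
	}

	return 0;
}
```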
Linus Torvalds	3cd86a58f7	arm64 updates for 5.7: - In-kernel Pointer Authentication support (previously only offered to user space). - ARM Activity Monitors (AMU) extension support allowing better CPU utilisation numbers for the scheduler (frequency invariance). - Memory hot-remove support for arm64. - Lots of asm annotations (SYM_) in preparation for the in-kernel Branch Target Identification (BTI) support. - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU init callbacks, support for new DT compatibles. - IPv6 header checksum optimisation. - Fixes: SDEI (software delegated exception interface) double-lock on hibernate with shared events. - Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm() converted to C, cpufeature finalisation helper. - sys_mremap() comment explaining the asymmetric address untagging behaviour. Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The bulk is in-kernel pointer authentication, activity monitors and lots of asm symbol annotations. 
I also queued the sys_mremap() patch commenting the asymmetry in the address untagging. Summary: - In-kernel Pointer Authentication support (previously only offered to user space). - ARM Activity Monitors (AMU) extension support allowing better CPU utilisation numbers for the scheduler (frequency invariance). - Memory hot-remove support for arm64. - Lots of asm annotations (SYM_) in preparation for the in-kernel Branch Target Identification (BTI) support. - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU init callbacks, support for new DT compatibles. - IPv6 header checksum optimisation. - Fixes: SDEI (software delegated exception interface) double-lock on hibernate with shared events. - Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm() converted to C, cpufeature finalisation helper. - sys_mremap() comment explaining the asymmetric address untagging behaviour" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits) mm/mremap: Add comment explaining the untagging behaviour of mremap() arm64: head: Convert install_el2_stub to SYM_INNER_LABEL arm64: Introduce get_cpu_ops() helper function arm64: Rename cpu_read_ops() to init_cpu_ops() arm64: Declare ACPI parking protocol CPU operation if needed arm64: move kimage_vaddr to .rodata arm64: use mov_q instead of literal ldr arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH lkdtm: arm64: test kernel pointer authentication arm64: compile the kernel with ptrauth return address signing kconfig: Add support for 'as-option' arm64: suspend: restore the kernel ptrauth keys arm64: __show_regs: strip PAC from lr in printk arm64: unwind: strip PAC from kernel addresses arm64: mask PAC bits of __builtin_return_address arm64: initialize ptrauth keys for kernel booting task arm64: initialize and switch ptrauth kernel keys arm64: enable ptrauth earlier arm64: cpufeature: handle conflicts based on capability arm64: cpufeature: Move cpu capability 
helpers inside C file ...	2020-03-31 10:05:01 -07:00
Ionela Voinescu	bbce8eaa60	cpufreq: add function to get the hardware max frequency Add weak function to return the hardware maximum frequency of a CPU, with the default implementation returning cpuinfo.max_freq, which is the best information we can generically get from the cpufreq framework. The default can be overwritten by a strong function in platforms that want to provide an alternative implementation, with more accurate information, obtained either from hardware or firmware. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-03-06 16:02:50 +00:00
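The weak-default pattern used by commit bbce8eaa60 can be sketched with the GCC/Clang `weak` attribute. The names `demo_get_hw_max_freq()` and `demo_cpuinfo_max_freq` are hypothetical, and the frequency table is made-up illustration data.

```c
#define DEMO_NR_CPUS 4

/* Illustrative per-CPU cpuinfo.max_freq values in kHz (made-up numbers). */
static const unsigned int demo_cpuinfo_max_freq[DEMO_NR_CPUS] = {
	1800000, 1800000, 2400000, 2400000,
};

/*
 * Weak default: report the generic cpuinfo maximum for the CPU.
 *
 * A platform that knows better (from hardware or firmware) can provide
 * a non-weak function with the same name in another object file, and
 * the linker will then use that definition in place of this one.
 */
__attribute__((weak)) unsigned int demo_get_hw_max_freq(unsigned int cpu)
{
	if (cpu >= DEMO_NR_CPUS)
		return 0;

	return demo_cpuinfo_max_freq[cpu];
}
```

Callers do not need to know which definition they get, which is what lets the generic value serve as a fallback.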
Rafael J. Wysocki	f5739cb0b5	cpufreq: Fix policy initialization for internal governor drivers Before commit `1e4f63aecb` ("cpufreq: Avoid creating excessively large stack frames") the initial value of the policy field in struct cpufreq_policy set by the driver's ->init() callback was implicitly passed from cpufreq_init_policy() to cpufreq_set_policy() if the default governor was neither "performance" nor "powersave". After that commit, however, cpufreq_init_policy() must take that case into consideration explicitly and handle it as appropriate, so make that happen. Fixes: `1e4f63aecb` ("cpufreq: Avoid creating excessively large stack frames") Link: https://lore.kernel.org/linux-pm/39fb762880c27da110086741315ca8b111d781cd.camel@gmail.com/ Reported-by: Artem Bityutskiy <dedekind1@gmail.com> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2020-02-27 08:57:48 +01:00
Yangtao Li	183edb20e6	cpufreq: Make cpufreq_global_kobject static The cpufreq_global_kobject is only used internally by cpufreq.c after commit `2361be2366` ("cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory"). Make it static. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> [ rjw: Add empty line after cpufreq_global_kobject definition ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-02-03 16:56:48 +01:00
Rafael J. Wysocki	1e4f63aecb	cpufreq: Avoid creating excessively large stack frames In the process of modifying a cpufreq policy, the cpufreq core makes a copy of it including all of the internals which is stored on the CPU stack. Because struct cpufreq_policy is relatively large, this may cause the size of the stack frame to exceed the 2 KB limit and so the GCC complains when -Wframe-larger-than= is used. In fact, it is not necessary to copy the entire policy structure in order to modify it, however. First, because cpufreq_set_policy() obtains the min and max policy limits from frequency QoS now, it is not necessary to pass the limits to it from the callers. The only things that need to be passed to it from there are the new governor pointer or (if there is a built-in governor in the driver) the "policy" value representing the governor choice. They both can be passed as individual arguments, though, so make cpufreq_set_policy() take them this way and rework its callers accordingly. This avoids making copies of cpufreq policies in the callers of cpufreq_set_policy(). Second, cpufreq_set_policy() still needs to pass the new policy data to the ->verify() callback of the cpufreq driver whose task is to sanitize the min and max policy limits. It still does not need to make a full copy of struct cpufreq_policy for this purpose, but it needs to pass a few items from it to the driver in case they are needed (different drivers have different needs in that respect and all of them have to be covered). For this reason, introduce struct cpufreq_policy_data to hold copies of the members of struct cpufreq_policy used by the existing ->verify() driver callbacks and pass a pointer to a temporary structure of that type to ->verify() (instead of passing a pointer to full struct cpufreq_policy to it). 
While at it, notice that intel_pstate and longrun don't really need to verify the "policy" value in struct cpufreq_policy, so drop those checks from them to avoid copying "policy" into struct cpufreq_policy_data (which allows it to be slightly smaller). Also while at it fix up white space in a couple of places and make cpufreq_set_policy() static (as it can be so). Fixes: `3000ce3c52` ("cpufreq: Use per-policy frequency QoS") Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.com Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2020-01-27 10:33:33 +01:00
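The idea behind commit 1e4f63aecb, passing a small temporary structure to the verification step instead of copying the whole policy onto the stack, can be sketched as follows. `struct demo_policy_data` and its field names are hypothetical and do not claim to match the members of the real struct cpufreq_policy_data.

```c
/*
 * Hypothetical small structure holding only what a verify step needs:
 * the hardware limits and the requested min/max to sanitize.
 */
struct demo_policy_data {
	unsigned int cpuinfo_min_freq; /* lowest frequency the CPU supports */
	unsigned int cpuinfo_max_freq; /* highest frequency the CPU supports */
	unsigned int min;              /* requested lower limit, sanitized in place */
	unsigned int max;              /* requested upper limit, sanitized in place */
};

/* Clamp a value into the inclusive range [lo, hi]. */
static unsigned int demo_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Sanitize the requested limits: both must lie within the hardware
 * range, and min must not exceed max.
 */
static void demo_verify(struct demo_policy_data *data)
{
	data->min = demo_clamp(data->min, data->cpuinfo_min_freq,
			       data->cpuinfo_max_freq);
	data->max = demo_clamp(data->max, data->cpuinfo_min_freq,
			       data->cpuinfo_max_freq);
	if (data->min > data->max)
		data->min = data->max;
}
```

A caller would fill such a structure from the policy and the QoS-derived limits, pass its address to the verify step, and copy the sanitized `min` and `max` back, so only a few words of stack are used.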
Linus Torvalds	9e7a03233e	Merge tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These include cpuidle changes to use nanoseconds (instead of microseconds) as the unit of time and to simplify checks for disabled idle states in the idle loop, some cpuidle fixes and governor updates, assorted cpufreq updates (driver updates mostly and a few core fixes and cleanups), devfreq updates (dominated by the tegra30 driver changes), new CPU IDs for the RAPL power capping driver, relatively minor updates of the generic power domains (genpd) and operation performance points (OPP) frameworks, and assorted fixes and cleanups. There are also two maintainer information updates: Chanwoo Choi will be maintaining the devfreq subsystem going forward and Todd Brandt is going to maintain the pm-graph utility (created by him). 
Specifics: - Use nanoseconds (instead of microseconds) as the unit of time in the cpuidle core and simplify checks for disabled idle states in the idle loop (Rafael Wysocki) - Fix and clean up the teo cpuidle governor (Rafael Wysocki) - Fix the cpuidle registration error code path (Zhenzhong Duan) - Avoid excessive vmexits in the ACPI cpuidle driver (Yin Fengwei) - Extend the idle injection infrastructure to be able to measure the requested duration in nanoseconds and to allow an exit latency limit for idle states to be specified (Daniel Lezcano) - Fix cpufreq driver registration and clarify a comment in the cpufreq core (Viresh Kumar) - Add NULL checks to the show() and store() methods of sysfs attributes exposed by cpufreq (Kai Shen) - Update cpufreq drivers: * Fix for a plain int as pointer warning from sparse in intel_pstate (Jamal Shareef) * Fix for a hardcoded number of CPUs and stack bloat in the powernv driver (John Hubbard) * Updates to the ti-cpufreq driver and DT files to support new platforms and migrate bindings from opp-v1 to opp-v2 (Adam Ford, H. Nikolaus Schaller) * Merging of the arm_big_little and vexpress-spc drivers and related cleanup (Sudeep Holla) * Fix for imx's default speed grade value (Anson Huang) * Minor cleanup of the s3c64xx driver (Nathan Chancellor) * CPU speed bin detection fix for sun50i (Ondrej Jirman) - Appoint Chanwoo Choi as the new devfreq maintainer. 
- Update the devfreq core: * Check NULL governor in available_governors_show sysfs to prevent showing wrong governor information and fix a race condition between devfreq_update_status() and trans_stat_show() (Leonard Crestez) * Add new 'interrupt-driven' flag for devfreq governors to allow interrupt-driven governors to prevent the devfreq core from polling devices for status (Dmitry Osipenko) * Improve an error message in devfreq_add_device() (Matthias Kaehlcke) - Update devfreq drivers: * tegra30 driver fixes and cleanups (Dmitry Osipenko) * Removal of unused property from dt-binding documentation for the exynos-bus driver (Kamil Konieczny) * exynos-ppmu cleanup and DT bindings update (Lukasz Luba, Marek Szyprowski) - Add new CPU IDs for CometLake Mobile and Desktop to the Intel RAPL power capping driver (Zhang Rui) - Allow device initialization in the generic power domains (genpd) framework to be more straightforward and clean it up (Ulf Hansson) - Add support for adjusting OPP voltages at run time to the OPP framework (Stephen Boyd) - Avoid freeing memory that has never been allocated in the hibernation core (Andy Whitcroft) - Clean up function headers in a header file and coding style in the wakeup IRQs handling code (Ulf Hansson, Xiaofei Tan) - Clean up the SmartReflex adaptive voltage scaling (AVS) driver for ARM (Ben Dooks, Geert Uytterhoeven) - Wrap power management documentation to fit in 80 columns (Bjorn Helgaas) - Add pm-graph utility entry to MAINTAINERS (Todd Brandt) - Update the cpupower utility: * Fix the handling of set and info subcommands (Abhishek Goel) * Fix build warnings (Nathan Chancellor) * Improve mperf_monitor handling (Janakarajan Natarajan)" * tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits) PM: Wrap documentation to fit in 80 columns cpuidle: Pass exit latency limit to cpuidle_use_deepest_state() cpuidle: Allow idle injection to apply exit latency limit cpuidle: Introduce 
cpuidle_driver_state_disabled() for driver quirks cpuidle: teo: Avoid code duplication in conditionals cpufreq: Register drivers only after CPU devices have been registered cpuidle: teo: Avoid using "early hits" incorrectly cpuidle: teo: Exclude cpuidle overhead from computations PM / Domains: Convert to dev_to_genpd_safe() in genpd_syscore_switch() mmc: tmio: Avoid boilerplate code in ->runtime_suspend() PM / Domains: Implement the ->start() callback for genpd PM / Domains: Introduce dev_pm_domain_start() ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition PM / wakeirq: remove unnecessary parentheses power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call cpuidle: Use nanoseconds as the unit of time PM / OPP: Support adjusting OPP voltages at runtime PM / core: Clean up some function headers in power.h cpufreq: Add NULL checks to show() and store() methods of cpufreq cpufreq: intel_pstate: Fix plain int as pointer warning from sparse ...	2019-11-26 19:06:44 -08:00
Frederic Weisbecker	5720821ba1	cpufreq: Use vtime aware kcpustat accessors for user time We can now safely read user and guest kcpustat fields on nohz_full CPUs. Use the appropriate accessors. Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wanpeng Li <wanpengli@tencent.com> Link: https://lkml.kernel.org/r/20191121024430.19938-5-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-21 07:33:25 +01:00
Viresh Kumar	46770be0cf	cpufreq: Register drivers only after CPU devices have been registered The cpufreq core heavily depends on the availability of the struct device for CPUs and if they aren't available at the time the cpufreq driver is registered, we will never succeed in making cpufreq work. This happens due to the following sequence of events: - cpufreq_register_driver() - subsys_interface_register() - return 0; //successful registration of driver ... at a later point of time - register_cpu(); - device_register(); - bus_probe_device(); - sif->add_dev(); - cpufreq_add_dev(); - get_cpu_device(); //FAILS - per_cpu(cpu_sys_devices, num) = &cpu->dev; //used by get_cpu_device() - return 0; //CPU registered successfully Because the per-cpu variable cpu_sys_devices is set only after the CPU device is registered, cpufreq will never be able to get it when cpufreq_add_dev() is called. This patch avoids this failure by making sure the device structure of at least CPU0 is available when the cpufreq driver is registered, and returning -EPROBE_DEFER otherwise. Reported-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-14 09:37:14 +01:00
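The ordering problem described in the commit above can be illustrated with a small userspace model. This is only a sketch: every name ending in `_model` is hypothetical and merely stands in for the kernel's per-CPU device bookkeeping, and `EPROBE_DEFER` is defined locally because it is a kernel-internal error code that userspace headers do not provide.

```c
#include <stddef.h>

#define EPROBE_DEFER  517  /* kernel-internal errno, defined here for the model */
#define NR_CPUS_MODEL 4    /* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for struct device. */
struct device_model {
    int id;
};

static struct device_model cpu_devs[NR_CPUS_MODEL];

/* Models the per-CPU cpu_sys_devices pointer: NULL until the CPU is registered. */
static struct device_model *cpu_sys_devices_model[NR_CPUS_MODEL];

static struct device_model *get_cpu_device_model(unsigned int cpu)
{
    return cpu < NR_CPUS_MODEL ? cpu_sys_devices_model[cpu] : NULL;
}

/* Models CPU registration: the pointer becomes visible only after this runs. */
static void register_cpu_model(unsigned int cpu)
{
    if (cpu >= NR_CPUS_MODEL)
        return;
    cpu_devs[cpu].id = (int)cpu;
    cpu_sys_devices_model[cpu] = &cpu_devs[cpu];
}

/* Models the idea of the fix: defer driver registration until CPU0's device exists. */
static int register_driver_model(void)
{
    if (!get_cpu_device_model(0))
        return -EPROBE_DEFER;
    return 0;
}
```

In this model, calling `register_driver_model()` before `register_cpu_model(0)` yields `-EPROBE_DEFER`, and calling it afterwards succeeds, which mirrors the retry behaviour the commit message relies on.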
Ingo Molnar	6d5a763c30	Merge tag 'v5.4-rc7' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-11 08:34:59 +01:00
Kai Shen	e6e8df0726	cpufreq: Add NULL checks to show() and store() methods of cpufreq Add NULL checks to show() and store() in cpufreq.c to avoid attempts to invoke a NULL callback. Though some interfaces of cpufreq are set as read-only, users can still get write permission using chmod which can lead to a kernel crash, as follows: chmod +w /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq echo 1 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq This bug was found in linux 4.19. Signed-off-by: Kai Shen <shenkai8@huawei.com> Reported-by: Feilong Lin <linfeilong@huawei.com> Reviewed-by: Feilong Lin <linfeilong@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-08 11:49:15 +01:00
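The NULL-callback guard described in the commit above can be sketched in userspace as follows. This is an illustrative model, not the kernel code: `struct attr_model` and the helper names are hypothetical, and the choice of `-EIO` as the error code is an assumption, since the commit message does not name one.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a sysfs attribute with optional callbacks. */
struct attr_model {
    const char *name;
    ssize_t (*show)(char *buf);
    ssize_t (*store)(const char *buf, size_t count);
};

/* Dispatcher for reads: refuse to call a missing show() callback. */
static ssize_t attr_show(const struct attr_model *attr, char *buf)
{
    if (!attr->show)
        return -EIO;  /* assumed error code */
    return attr->show(buf);
}

/* Dispatcher for writes: a read-only attribute made writable via chmod lands here. */
static ssize_t attr_store(const struct attr_model *attr, const char *buf, size_t count)
{
    if (!attr->store)
        return -EIO;  /* assumed error code */
    return attr->store(buf, count);
}

/* Example read-only attribute: it provides show() but no store(). */
static ssize_t cur_freq_show(char *buf)
{
    buf[0] = '0';
    buf[1] = '\0';
    return 1;  /* number of characters written */
}

static const struct attr_model cur_freq_attr = {
    .name  = "scaling_cur_freq",
    .show  = cur_freq_show,
    .store = NULL,
};
```

With the guard in place, a write to `cur_freq_attr` returns an error instead of jumping through a NULL function pointer.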
Viresh Kumar	737ffb27f2	cpufreq: Clarify the comment in cpufreq_set_policy() One of the responsibilities of the ->verify() callback is to make sure that the policy's min frequency is <= max frequency as this isn't guaranteed by the QoS framework which gave us those values. Update the comment in cpufreq_set_policy() to clarify that. Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Minor changes of the new comment ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-04 11:44:26 +01:00
Frederic Weisbecker	49bb001e24	cpufreq: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM Now that we have a vtime safe kcpustat accessor for CPUTIME_SYSTEM, use it to start fixing frozen kcpustat values on nohz_full CPUs. Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpengli@tencent.com> Link: https://lkml.kernel.org/r/20191016025700.31277-14-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-29 10:01:18 +01:00
Sudeep Holla	6941051d30	cpufreq: Cancel policy update work scheduled before freeing Scheduled policy update work may end up racing with the freeing of the policy and unregistering the driver. One possible race is shown below, where the cpufreq_driver is unregistered, but the scheduled work gets executed at a later stage when cpufreq_driver is NULL (i.e. after freeing the policy and driver). Unable to handle kernel NULL pointer dereference at virtual address 0000001c pgd = (ptrval) [0000001c] pgd=80000080204003, pmd=00000000 Internal error: Oops: 206 [#1] SMP THUMB2 Modules linked in: CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 5.4.0-rc3-00006-g67f5a8081a4b #86 Hardware name: ARM-Versatile Express Workqueue: events handle_update PC is at cpufreq_set_policy+0x58/0x228 LR is at dev_pm_qos_read_value+0x77/0xac Control: 70c5387d Table: 80203000 DAC: fffffffd Process kworker/0:1 (pid: 34, stack limit = 0x(ptrval)) (cpufreq_set_policy) from (refresh_frequency_limits.part.24+0x37/0x48) (refresh_frequency_limits.part.24) from (handle_update+0x2f/0x38) (handle_update) from (process_one_work+0x16d/0x3cc) (process_one_work) from (worker_thread+0xff/0x414) (worker_thread) from (kthread+0xff/0x100) (kthread) from (ret_from_fork+0x11/0x28) Fixes: `67d874c3b2` ("cpufreq: Register notifiers with the PM QoS framework") Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> [ rjw: Cancel the work before dropping the QoS requests ] Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-10-22 18:07:30 +02:00
Rafael J. Wysocki	3000ce3c52	cpufreq: Use per-policy frequency QoS Replace the CPU device PM QoS used for the management of min and max frequency constraints in cpufreq (and its users) with per-policy frequency QoS to avoid problems with cpufreq policies covering more than one CPU. Namely, a cpufreq driver is registered with the subsys interface which calls cpufreq_add_dev() for each CPU, starting from CPU0, so currently the PM QoS notifiers are added to the first CPU in the policy (i.e. CPU0 in the majority of cases). In turn, when the cpufreq driver is unregistered, the subsys interface doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0, and the PM QoS notifiers are only removed when cpufreq_remove_dev() is called for the last CPU in the policy, say CPUx, which as a rule is not CPU0 if the policy covers more than one CPU. Then, the PM QoS notifiers cannot be removed, because CPUx does not have them, and they are still there in the device PM QoS notifiers list of CPU0, which prevents new PM QoS notifiers from being registered for CPU0 on the next attempt to register the cpufreq driver. The same issue occurs when the first CPU in the policy goes offline before unregistering the driver. After this change it does not matter which CPU is the policy CPU at the driver registration time and whether or not it is online all the time, because the frequency QoS is per policy and not per CPU. 
Fixes: `67d874c3b2` ("cpufreq: Register notifiers with the PM QoS framework") Reported-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Diagnosed-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-10-21 02:05:21 +02:00
Rafael J. Wysocki	65650b3513	cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown It is incorrect to set the cpufreq syscore shutdown callback pointer to cpufreq_suspend(), because that function cannot be run in the syscore stage of system shutdown for two reasons: (a) it may attempt to carry out actions depending on devices that have already been shut down at that point and (b) the RCU synchronization carried out by it may not be able to make progress then. The latter issue has been present since commit `45975c7d21` ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"), but the former one has been there since commit `90de2a4aa9` ("cpufreq: suspend cpufreq governors on shutdown") regardless. Fix that by dropping cpufreq_syscore_ops altogether and making device_shutdown() call cpufreq_suspend() directly before shutting down devices, which is along the lines of what system-wide power management does. Fixes: `45975c7d21` ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds") Fixes: `90de2a4aa9` ("cpufreq: suspend cpufreq governors on shutdown") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.0+ <stable@vger.kernel.org> # 4.0+	2019-10-10 11:11:17 +02:00
Rafael J. Wysocki	beb4e08e21	Merge branch 'pm-cpufreq-qos' * pm-cpufreq-qos: Documentation: cpufreq: Update policy notifier documentation cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events ACPI: cpufreq: Switch to QoS requests instead of cpufreq notifier video: pxafb: Remove cpufreq policy notifier video: sa1100fb: Remove cpufreq policy notifier arch_topology: Use CPUFREQ_CREATE_POLICY instead of CPUFREQ_NOTIFY cpufreq: powerpc_cbe: Switch to QoS requests for freq limits cpufreq: powerpc: macintosh: Switch to QoS requests for freq limits thermal: cpu_cooling: Switch to QoS requests for freq limits cpufreq: Add policy create/remove notifiers back	2019-09-05 09:01:26 +02:00
Viresh Kumar	df0eea4488	cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events No driver makes reference to these events now, remove them and the code related to them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-02 22:44:04 +02:00
Florian Fainelli	e9a7cc1d97	cpufreq: Print driver name if cpufreq_suspend() fails Instead of printing the policy, which is incidentally a kernel pointer and so of limited interest, print the name of the cpufreq driver that failed to suspend, which is more useful for debugging. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-22 09:27:25 +02:00
Colin Ian King	62c23a89fd	cpufreq: remove redundant assignment to ret Variable ret is initialized to a value that is never read and it is re-assigned later. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-21 00:28:30 +02:00
Viresh Kumar	6a1490367c	cpufreq: Add policy create/remove notifiers back This effectively reverts some changes made by commit `f9f41e3ef9` ("cpufreq: Remove policy create/remove notifiers"). We have a new use case for policy create/remove notifiers (for allocating/freeing QoS requests per policy), so add them back. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-10 14:05:48 +02:00
Viresh Kumar	e61a41256e	cpufreq: dev_pm_qos_update_request() can return 1 on success dev_pm_qos_update_request() can return 1 on success, so don't treat it as an error. Fixes: `18c49926c4` ("cpufreq: Add QoS requests for userspace constraints") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-10 13:39:47 +02:00
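The return-value handling described in the commit above can be sketched as follows. This is a userspace model under stated assumptions: the `_model` names are hypothetical, and the convention that the update call returns 1 when the value changed, 0 when it did not, and a negative error code on failure is an assumption about the QoS helper rather than something the commit message spells out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a QoS request holding one frequency limit. */
struct qos_request_model {
    int value;
};

/*
 * Models an update call with a three-way result (assumed convention):
 * 1 if the stored value changed, 0 if it was already equal,
 * negative errno on failure.
 */
static int qos_update_request_model(struct qos_request_model *req, int new_value)
{
    if (!req || new_value < 0)
        return -EINVAL;
    if (req->value == new_value)
        return 0;
    req->value = new_value;
    return 1;
}

/*
 * Models a sysfs store handler: only negative results are errors, so a
 * return value of 1 must be reported to the caller as success (count).
 */
static long store_limit_model(struct qos_request_model *req, int val, long count)
{
    int ret = qos_update_request_model(req, val);

    return ret >= 0 ? count : ret;
}
```

A handler written as `ret ? ret : count` would wrongly pass the positive value 1 back to the caller, which is the class of bug the commit addresses.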
Rafael J. Wysocki	918e162e6a	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: Make cpufreq_generic_init() return void cpufreq: imx-cpufreq-dt: Add i.MX8MN support cpufreq: Add QoS requests for userspace constraints cpufreq: intel_pstate: Reuse refresh_frequency_limits() cpufreq: Register notifiers with the PM QoS framework PM / QoS: Add support for MIN/MAX frequency constraints PM / QOS: Pass request type to dev_pm_qos_read_value() PM / QOS: Rename __dev_pm_qos_read_value() and dev_pm_qos_raw_read_value() PM / QOS: Pass request type to dev_pm_qos_{add\|remove}_notifier()	2019-07-18 09:49:30 +02:00
Viresh Kumar	c4dcc8a162	cpufreq: Make cpufreq_generic_init() return void It always returns 0 (success) and its return type should really be void. On top of that, many drivers have added error handling code based on its return value, which is not required at all. Change its return type to void and update all the callers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-16 10:20:11 +02:00
Viresh Kumar	18c49926c4	cpufreq: Add QoS requests for userspace constraints This implements QoS requests to manage userspace configuration of min and max frequency. Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: syzbot <syzbot+de771ae9390dffed7266@syzkaller.appspotmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-08 23:56:39 +02:00
Viresh Kumar	c57b25bdf7	cpufreq: intel_pstate: Reuse refresh_frequency_limits() The implementation of intel_pstate_update_max_freq() is quite similar to refresh_frequency_limits(), so let's reuse it. Finding the minimum of policy->user_policy.max and policy->cpuinfo.max_freq in intel_pstate_update_max_freq() is redundant, as cpufreq_set_policy() will call the ->verify() callback of the intel-pstate driver, which will do this comparison anyway, so dropping it from intel_pstate_update_max_freq() does no harm. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-08 23:56:39 +02:00
Viresh Kumar	67d874c3b2	cpufreq: Register notifiers with the PM QoS framework Register notifiers for min/max frequency constraints with the PM QoS framework. The constraints are also taken into consideration in cpufreq_set_policy(). This also relocates cpufreq_policy_put_kobj() as it is required to be called from cpufreq_policy_alloc() now. refresh_frequency_limits() is updated to avoid calling cpufreq_set_policy() for inactive policies and handle_update() is updated to have proper locking in place. No constraints are added until now though. Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-08 23:56:13 +02:00
Rafael J. Wysocki	586a07dca8	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update() cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get() cpufreq: Don't skip frequency validation for has_target() drivers cpufreq: Use has_target() instead of !setpolicy cpufreq: Remove redundant !setpolicy check cpufreq: Move the IS_ENABLED(CPU_THERMAL) macro into a stub cpufreq: s5pv210: Don't flood kernel log after cpufreq change cpufreq: pcc-cpufreq: Fail initialization if driver cannot be registered cpufreq: add driver for Raspberry Pi cpufreq: Switch imx7d to imx-cpufreq-dt for speed grading cpufreq: imx-cpufreq-dt: Remove global platform match list cpufreq: brcmstb-avs-cpufreq: Fix types for voltage/frequency cpufreq: brcmstb-avs-cpufreq: Fix initial command check cpufreq: armada-37xx: Remove set but not used variable 'freq' cpufreq: imx-cpufreq-dt: Fix no OPPs available on unfused parts dt-bindings: imx-cpufreq-dt: Document opp-supported-hw usage cpufreq: Add imx-cpufreq-dt driver	2019-07-08 11:00:02 +02:00
Viresh Kumar	70a59fde6e	cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update() On some occasions cpufreq_verify_current_freq() schedules a work whose callback is handle_update(), which further calls cpufreq_update_policy() which may end up calling cpufreq_verify_current_freq() again. On the other hand, when cpufreq_update_policy() is called from handle_update(), the pointer to the cpufreq policy is already available, but cpufreq_cpu_acquire() is still called to get it in cpufreq_update_policy(), which should be avoided as well. To fix these issues, create a new helper, refresh_frequency_limits(), and make both handle_update() and cpufreq_update_policy() call it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rename reeval_frequency_limits() as refresh_frequency_limits() ] [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-28 11:24:56 +02:00
Viresh Kumar	5980752e6e	cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get() Their implementations are quite similar, so modify cpufreq_update_current_freq() somewhat and call it from __cpufreq_get(). Also rename cpufreq_update_current_freq() to cpufreq_verify_current_freq(), as that's what it is doing. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-28 11:17:12 +02:00
Viresh Kumar	9801522840	cpufreq: Don't skip frequency validation for has_target() drivers CPUFREQ_CONST_LOOPS was introduced in a pre-2.6 kernel release by commit 6a4a93f9c0d5 ("[CPUFREQ] Fix 'out of sync' issue"). Basically, that commit does two things: - It adds the frequency verification code (which is quite similar to what we have today as well). - And it sets the CPUFREQ_CONST_LOOPS flag only for setpolicy drivers, rightly so based on the code we had then. The idea was to avoid frequency validation for setpolicy drivers, as the cpufreq core doesn't know what frequency the hardware is running at and so there is no point in doing frequency verification. The problem happened when we started to use the same CPUFREQ_CONST_LOOPS flag for the constant loops-per-jiffy case as well, and many has_target() drivers started using the same flag and unknowingly skipped the verification of frequency. There is otherwise no logical reason for skipping frequency validation because of the presence of the CPUFREQ_CONST_LOOPS flag. Fix this issue by skipping frequency validation only for setpolicy drivers and always doing it for has_target() drivers, irrespective of the presence or absence of the CPUFREQ_CONST_LOOPS flag. cpufreq_notify_transition() is only called for has_target() drivers and not for setpolicy drivers, so the check there is simply redundant. Remove it as well. Also remove the () around the freq comparison statement as they aren't required and checkpatch also warns about them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-28 10:16:14 +02:00
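The change in the validation condition described in the commit above can be modelled roughly like this. All names are hypothetical, the flag value is arbitrary, and `has_target_model()` is a simplification that only looks at one callback; the sketch shows the before and after decision logic as the commit message describes it, not the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

#define CONST_LOOPS_MODEL (1u << 0)  /* stands in for CPUFREQ_CONST_LOOPS */

/* Hypothetical stand-in for a cpufreq driver description. */
struct driver_model {
    unsigned int flags;
    int (*setpolicy)(void);                   /* set for setpolicy drivers */
    int (*target_index)(unsigned int index);  /* set for has_target() drivers */
};

static bool has_target_model(const struct driver_model *drv)
{
    return drv->target_index != NULL;
}

/* Before: validation was skipped whenever the flag was set, whatever the driver type. */
static bool verify_freq_old(const struct driver_model *drv)
{
    return !(drv->flags & CONST_LOOPS_MODEL);
}

/* After: validation depends only on the driver type, not on the flag. */
static bool verify_freq_new(const struct driver_model *drv)
{
    return has_target_model(drv);
}

/* Dummy callbacks so example drivers can be built. */
static int dummy_setpolicy(void)
{
    return 0;
}

static int dummy_target_index(unsigned int index)
{
    (void)index;
    return 0;
}
```

For a has_target() style driver that also sets the flag, `verify_freq_old()` returns false while `verify_freq_new()` returns true; for a setpolicy driver both skip validation.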
Viresh Kumar	5ddc6d4e30	cpufreq: Use has_target() instead of !setpolicy For code consistency, use has_target() instead of !setpolicy everywhere, as it is already done at several places. Maybe we should also use "!has_target()" instead of "cpufreq_driver->setpolicy" where we need to check if the driver supports setpolicy, so to use only one expression for this kind of differentiation. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-26 11:41:04 +02:00
Viresh Kumar	407d0fff22	cpufreq: Remove redundant !setpolicy check cpufreq_start_governor() is only called for !setpolicy case, checking it again is not required. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-26 11:36:55 +02:00
Daniel Lezcano	bcc6156999	cpufreq: Move the IS_ENABLED(CPU_THERMAL) macro into a stub cpufreq_online() and cpufreq_offline() [un]register the driver as a cooling device. This is done if the driver is flagged as a cooling device, in addition to an IS_ENABLED() check to compile out the branching code. Group this test in a stub function added in the cpufreq header instead of having the IS_ENABLED() in the code. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-06-26 10:59:57 +02:00
Thomas Gleixner	d2912cb15b	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-19 17:09:55 +02:00
Linus Torvalds	bfbfbf7368	Merge tag 'pm-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These fix a recent regression causing kernels built with CONFIG_PM unset to crash on systems that support the Performance and Energy Bias Hint (EPB), 
clean up the cpufreq core and some users of transition notifiers and introduce a new power domain flag into the generic power domains framework (genpd). Specifics: - Fix recent regression causing kernels built with CONFIG_PM unset to crash on systems that support the Performance and Energy Bias Hint (EPB) by avoiding to compile the EPB-related code depending on CONFIG_PM when it is unset (Rafael Wysocki). - Clean up the transition notifier invocation code in the cpufreq core and change some users of cpufreq transition notifiers accordingly (Viresh Kumar). - Change MAINTAINERS to cover the schedutil governor as part of cpufreq (Viresh Kumar). - Simplify cpufreq_init_policy() to avoid redundant computations (Yue Hu). - Add explanatory comment to the cpufreq core (Rafael Wysocki). - Introduce a new flag, GENPD_FLAG_RPM_ALWAYS_ON, to the generic power domains (genpd) framework along with the first user of it (Leonard Crestez)" * tag 'pm-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: soc: imx: gpc: Use GENPD_FLAG_RPM_ALWAYS_ON for ERR009619 PM / Domains: Add GENPD_FLAG_RPM_ALWAYS_ON flag cpufreq: Update MAINTAINERS to include schedutil governor cpufreq: Don't find governor for setpolicy drivers in cpufreq_init_policy() cpufreq: Explain the kobject_put() in cpufreq_policy_alloc() cpufreq: Call transition notifier only once for each policy x86: intel_epb: Take CONFIG_PM into account	2019-05-15 08:46:44 -07:00
Yue Hu	ab05d97a37	cpufreq: Don't find governor for setpolicy drivers in cpufreq_init_policy() In cpufreq_init_policy() we will check if there's last_governor for target and setpolicy type. However last_governor is set only if has_target() is true in cpufreq_offline(). That means finding last_governor for setpolicy type is pointless. Also new_policy.governor will not be used if ->setpolicy callback is set in cpufreq_set_policy(). Moreover, there's a duplicate ->setpolicy check in the default policy path. Let's add a new helper function to avoid it. Also update comments. Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-05-13 10:46:24 +02:00
Rafael J. Wysocki	2acb9bdae9	cpufreq: Explain the kobject_put() in cpufreq_policy_alloc() It may not be particularly clear why the kobject_put() after failing kobject_init_and_add() in cpufreq_policy_alloc() is not redundant, so add a comment to explain that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-05-13 10:43:53 +02:00
Viresh Kumar	df24014abe	cpufreq: Call transition notifier only once for each policy Currently, the notifiers are called once for each CPU of the policy->cpus cpumask. It would be more optimal if the notifier can be called only once and all the relevant information be provided to it. Out of the 23 drivers that register for the transition notifiers today, only 4 of them do per-cpu updates and the callback for the rest can be called only once for the policy without any impact. This would also avoid multiple function calls to the notifier callbacks and reduce multiple iterations of notifier core's code (which does locking as well). This patch adds pointer to the cpufreq policy to the struct cpufreq_freqs, so the notifier callback has all the information available to it with a single call. The five drivers which perform per-cpu updates are updated to use the cpufreq policy. The freqs->cpu field is redundant now and is removed. Acked-by: David S. Miller <davem@davemloft.net> (sparc) Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-05-10 12:20:36 +02:00
Linus Torvalds	0968621917	Printk changes for 5.2 Merge tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Allow state reset of printk_once() calls. - Prevent crashes when dereferencing invalid pointers in vsprintf(). Only the first byte is checked for simplicity. - Make vsprintf warnings consistent and inlined. - Treewide conversion of obsolete %pf, %pF to %ps, %pS printf modifiers. - Some clean up of vsprintf and test_printf code. * tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: lib/vsprintf: Make function pointer_string static vsprintf: Limit the length of inlined error messages vsprintf: Avoid confusion between invalid address and value vsprintf: Prevent crash when dereferencing invalid pointers vsprintf: Consolidate handling of unknown pointer specifiers vsprintf: Factor out %pO handler as kobject_string() vsprintf: Factor out %pV handler as va_format() vsprintf: Factor out %p[iI] handler as ip_addr_string() vsprintf: Do not check address of well-known strings vsprintf: Consistent %pK handling for kptr_restrict == 0 vsprintf: Shuffle restricted_pointer() printk: Tie printk_once / printk_deferred_once into .data.once for reset treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively lib/test_printf: Switch to bitmap_zalloc()	2019-05-07 09:18:12 -07:00
Viresh Kumar	4ebe36c94a	cpufreq: Fix kobject memleak Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Tobin C. Harding <tobin@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-30 10:54:23 +02:00
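The leak described in the entry above can be illustrated with a toy reference-counted object in userspace C. This is only a sketch under stated assumptions: `toy_kobj`, `toy_init_and_add()`, `toy_put()` and `toy_setup()` are hypothetical stand-ins invented for illustration, not the kernel's kobject API. The point it shows is that a failed "init and add" step still leaves a reference (and owned memory) behind, so the error path needs a matching put.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kobject: a refcount plus an owned name. */
struct toy_kobj {
	int refcount;
	char *name;
};

/* Number of names currently allocated and not yet freed. */
static int toy_live_names;

/* Drop one reference; the last put frees what the object owns. */
static void toy_put(struct toy_kobj *k)
{
	if (--k->refcount == 0 && k->name) {
		free(k->name);
		k->name = NULL;
		toy_live_names--;
	}
}

/*
 * Take the initial reference and allocate the name, then "add" the
 * object. If the add step fails, the reference and the name are still
 * held, so the caller is expected to call toy_put().
 */
static int toy_init_and_add(struct toy_kobj *k, const char *name, bool fail_add)
{
	k->refcount = 1;
	k->name = malloc(strlen(name) + 1);
	if (!k->name)
		return -1;
	strcpy(k->name, name);
	toy_live_names++;
	return fail_add ? -1 : 0;
}

/* Caller following the fixed pattern: put the object on the error path. */
static int toy_setup(struct toy_kobj *k, bool fail_add)
{
	int ret = toy_init_and_add(k, "policy0", fail_add);

	if (ret)
		toy_put(k); /* without this put, the name would leak */
	return ret;
}
```

On the success path the caller keeps the reference and drops it later with its own `toy_put()`.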
Yue Hu	4db7c34cb4	cpufreq: Move ->get callback check outside of __cpufreq_get() Currently, __cpufreq_get() called by show_cpuinfo_cur_freq() will check ->get callback. That is needless since cpuinfo_cur_freq attribute will not be created if ->get is not set. So let's drop it in __cpufreq_get(). Also keep this check in cpufreq_get(). Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-23 10:58:43 +02:00
Yue Hu	b23aa311fa	cpufreq: Remove needless bios_limit check in show_bios_limit() Initially, bios_limit attribute will be created if driver->bios_limit is set in cpufreq_add_dev_interface(). So remove the redundant check for latter show operation. Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-16 23:10:42 +02:00
Sakari Ailus	d75f773c86	treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively %pF and %pf are functionally equivalent to %pS and %ps conversion specifiers. The former are deprecated, therefore switch the current users to use the preferred variant. The changes have been produced by the following command: git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done And verifying the result. Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: xen-devel@lists.xenproject.org Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: drbd-dev@lists.linbit.com Cc: linux-block@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvdimm@lists.01.org Cc: linux-pci@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: linux-btrfs@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: linux-mm@kvack.org Cc: ceph-devel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: David Sterba <dsterba@suse.com> (for btrfs) Acked-by: Mike Rapoport <rppt@linux.ibm.com> (for mm/memblock.c) Acked-by: Bjorn Helgaas <bhelgaas@google.com> (for drivers/pci) Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>	2019-04-09 14:19:06 +02:00
Yue Hu	89f98d7e5f	cpufreq: Remove cpufreq_driver check in cpufreq_boost_supported() Currently there are three calling paths for cpufreq_boost_supported() in all as below, we can see the cpufreq_driver null check is needless since it is already checked before. <path1> cpufreq_enable_boost_support() |-> if (!cpufreq_driver) |-> cpufreq_boost_supported() <path2> cpufreq_register_driver() |-> if (!driver_data ... |-> cpufreq_driver = driver_data |-> cpufreq_boost_supported() |-> remove_boost_sysfs_file() |-> cpufreq_boost_supported() <path3> cpufreq_unregister_driver() |-> if (!cpufreq_driver ... |-> remove_boost_sysfs_file() |-> cpufreq_boost_supported() Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-04-09 09:57:16 +02:00
Rafael J. Wysocki	9083e49861	cpufreq: intel_pstate: Update max frequency on global turbo changes While the cpuinfo.max_freq value doesn't really matter for intel_pstate in the active mode, in the passive mode it is used by governors as the maximum physical frequency of the CPU and the results of governor computations generally depend on it. Also it is made available to user space via sysfs and it should match the current HW configuration. For this reason, make intel_pstate update cpuinfo.max_freq for all CPUs if it detects a global change of turbo frequency settings from "disable" to "enable" or the other way associated with a _PPC change notification from the platform firmware. Note that policy_is_inactive(), cpufreq_cpu_acquire(), cpufreq_cpu_release(), and cpufreq_set_policy() need to be made available to it for this purpose. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759 Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-04-08 11:26:09 +02:00
Rafael J. Wysocki	540a375822	cpufreq: Add cpufreq_cpu_acquire() and cpufreq_cpu_release() It sometimes is necessary to find a cpufreq policy for a given CPU and acquire its rwsem (for writing) immediately after that, so introduce cpufreq_cpu_acquire() as a helper for that and the complementary cpufreq_cpu_release(). Make cpufreq_update_policy() use the new functions. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-04-01 23:43:05 +02:00
Rafael J. Wysocki	5a25e3f7cc	cpufreq: intel_pstate: Driver-specific handling of _PPC updates In some cases, the platform firmware disables or enables turbo frequencies for all CPUs globally before triggering a _PPC change notification for one of them. Obviously, that global change affects all CPUs, not just the notified one, and it needs to be acted upon by cpufreq. The intel_pstate driver is able to detect such global changes of the settings, but it also needs to update policy limits for all CPUs if that happens, in particular if turbo frequencies are enabled globally - to allow them to be used. For this reason, introduce a new cpufreq driver callback to be invoked on _PPC notifications, if present, instead of simply calling cpufreq_update_policy() for the notified CPU and make intel_pstate use it to trigger policy updates for all CPUs in the system if global settings change. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759 Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-04-01 23:43:05 +02:00
Rafael J. Wysocki	5d094fea14	cpufreq: Improve kerneldoc comments for cpufreq_cpu_get/put() Fix the formatting of the cpufreq_cpu_get() and cpufreq_cpu_put() kerneldoc comments and rework them to be somewhat easier to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-03-07 10:55:29 +01:00
Rafael J. Wysocki	167a38dcd5	cpufreq: Pass updated policy to driver ->setpolicy() callback The invocation of the ->setpolicy() cpufreq driver callback should be equivalent to calling cpufreq_governor_limits(policy) for drivers with internal governors, but in fact it isn't so, because the temporary new_policy object is passed to it instead of the updated policy. That is a bit confusing, so make cpufreq_set_policy() pass the updated policy to the driver ->setpolicy() callback. No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-02-20 10:18:37 +01:00
Rafael J. Wysocki	2bb4059e07	cpufreq: Fix two debug messages in cpufreq_set_policy() Remove the redundant "cpufreq:" prefix from two debug messages in cpufreq_set_policy(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-02-20 10:18:37 +01:00
Rafael J. Wysocki	348a2ec5f5	cpufreq: Reorder and simplify cpufreq_update_policy() In cpufreq_update_policy(), instead of updating new_policy.cur separately, which is kind of confusing, because cpufreq_set_policy() doesn't take that value into account directly anyway, make the copy of the existing policy after calling cpufreq_update_current_freq(). No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-02-20 10:18:37 +01:00
Rafael J. Wysocki	a0dbb819b8	cpufreq: Add kerneldoc comments for two core functions Add kerneldoc comments describing cpufreq_set_policy() and cpufreq_update_policy() as they have not been properly documented so far and they really need to be documented. While at it, fix white space around the cpufreq_set_policy() header. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2019-02-20 10:18:37 +01:00
Viresh Kumar	a9a22b570b	cpufreq: Replace double NOT (!!) with single NOT (!) Double NOT (!!) operation is normally done to convert a non-zero value to 1 and keep zero as is, but that isn't the requirement in this case. All we wanted was to make sure that only one of the two routines isn't set, i.e. either both function pointers are set or both are unset. This can be done with a single NOT (!) operation as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-14 12:06:38 +01:00
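The equivalence claimed in the entry above can be checked with a small standalone C sketch. The callback type `toy_cb` and both helper names are hypothetical, chosen only for illustration; the comparison expressions mirror the "both set or both unset" check the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type standing in for a driver function pointer. */
typedef int (*toy_cb)(void);

/* Old form: normalize each pointer to 0 or 1 first, then compare. */
static bool mismatch_double_not(toy_cb online, toy_cb offline)
{
	return !!online != !!offline;
}

/*
 * New form: a single NOT already yields 0 or 1, so comparing the two
 * negated values detects the same "exactly one is set" condition.
 */
static bool mismatch_single_not(toy_cb online, toy_cb offline)
{
	return !online != !offline;
}

/* Dummy callback used only to obtain a non-NULL function pointer. */
static int toy_noop(void)
{
	return 0;
}
```

Both helpers return true only when exactly one of the two pointers is set, which is the invalid configuration.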
Viresh Kumar	91a12e91dc	cpufreq: Allow light-weight tear down and bring up of CPUs The cpufreq core doesn't remove the cpufreq policy anymore on CPU offline operation, rather that happens when the CPU device gets unregistered from the kernel. This allows faster recovery when the CPU comes back online. This is also very useful during system wide suspend/resume where we offline all non-boot CPUs during suspend and then bring them back on resume. This commit takes the same idea a step ahead to allow drivers to do light weight tear-down and bring-up during CPU offline and online operations. A new set of callbacks is introduced, online/offline(). online() gets called when the first CPU of an inactive policy is brought up and offline() gets called when all the CPUs of a policy are offlined. The existing init/exit() callback get called on policy creation/destruction. They also get called instead of online/offline() callbacks if the online/offline() callbacks aren't provided. This also moves around some code to get executed only for the new-policy case going forward. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-12 23:47:42 +01:00
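The callback selection described in the entry above can be sketched as a pair of small decision functions. This is a simplified model, not the cpufreq core code: `toy_driver`, the `toy_cb_kind` enum and both helpers are hypothetical names, and the sketch only captures which callback would be chosen, assuming a driver either provides the light-weight pair or does not.

```c
#include <stdbool.h>

/* Which driver callback the (toy) core would invoke. */
enum toy_cb_kind { TOY_INIT, TOY_EXIT, TOY_ONLINE, TOY_OFFLINE };

/* Hypothetical driver description: only records which callbacks exist. */
struct toy_driver {
	bool has_online;
	bool has_offline;
};

/*
 * Bring-up: a newly created policy always goes through init(); an
 * existing but inactive policy uses online() when the driver has it
 * and falls back to init() otherwise.
 */
static enum toy_cb_kind toy_pick_bring_up(const struct toy_driver *drv,
					  bool new_policy)
{
	if (!new_policy && drv->has_online)
		return TOY_ONLINE;
	return TOY_INIT;
}

/*
 * Tear-down: destroying the policy always goes through exit(); merely
 * offlining the last CPU uses offline() when available, else exit().
 */
static enum toy_cb_kind toy_pick_tear_down(const struct toy_driver *drv,
					   bool destroy_policy)
{
	if (!destroy_policy && drv->has_offline)
		return TOY_OFFLINE;
	return TOY_EXIT;
}
```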
Amit Kucheria	5c238a8b59	cpufreq: Auto-register the driver as a thermal cooling device if asked All cpufreq drivers do similar things to register as a cooling device. Provide a cpufreq driver flag so drivers can just ask the cpufreq core to register the cooling device on their behalf. This allows us to get rid of duplicated code in the drivers. In order to allow this, we add a struct thermal_cooling_device pointer to struct cpufreq_policy so that drivers don't need to store it in a private data structure. Suggested-by: Stephen Boyd <swboyd@chromium.org> Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-30 23:02:26 +01:00
Viresh Kumar	625c85a62c	cpufreq: Use struct kobj_attribute instead of struct global_attr The cpufreq_global_kobject is created using kobject_create_and_add() helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store routines are set to kobj_attr_show() and kobj_attr_store(). These routines pass struct kobj_attribute as an argument to the show/store callbacks. But all the cpufreq files created using the cpufreq_global_kobject expect the argument to be of type struct attribute. Things work fine currently as no one accesses the "attr" argument. We may not see issues even if the argument is used, as struct kobj_attribute has struct attribute as its first element and so they will both get same address. But this is logically incorrect and we should rather use struct kobj_attribute instead of struct global_attr in the cpufreq core and drivers and the show/store callbacks should take struct kobj_attribute as argument instead. This bug is caught using CFI CLANG builds in android kernel which catches mismatch in function prototypes for such callbacks. Reported-by: Donghee Han <dh.han@samsung.com> Reported-by: Sangkyu Kim <skwith.kim@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-29 11:44:30 +01:00
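The "same address" observation in the entry above follows from C's layout rule that a pointer to a struct, suitably converted, points to its first member. A minimal userspace sketch, assuming simplified hypothetical types (`toy_attribute`, `toy_kobj_attribute`) and a local `toy_container_of()` macro rather than the kernel's definitions:

```c
#include <stddef.h>

/* Simplified stand-in for the inner attribute type. */
struct toy_attribute {
	const char *name;
};

/* Simplified stand-in for the wrapper; the attribute is its first member. */
struct toy_kobj_attribute {
	struct toy_attribute attr;
	int (*show)(struct toy_kobj_attribute *kattr);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Type-correct way to get from the inner attribute back to the wrapper. */
static struct toy_kobj_attribute *toy_to_kobj_attr(struct toy_attribute *a)
{
	return toy_container_of(a, struct toy_kobj_attribute, attr);
}
```

Because the member offset is zero, a callback declared with the wrong parameter type would still receive a usable address, which is why the mismatch went unnoticed until prototype checking flagged it.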
Viresh Kumar	21469df467	cpufreq: Don't update new_policy on failures The local variable "new_policy" hasn't been used in the error path of cpufreq_online() since commit `f9f41e3ef9` (cpufreq: Remove policy create/remove notifiers). Don't update it in that error path. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-15 22:57:04 +01:00
Rafael J. Wysocki	343e60e52a	Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-sleep' * pm-cpuidle: doc: trace: fix reference to cpuidle documentation file cpuidle / Documentation: Update cpuidle MAINTAINERS entry * pm-cpufreq: cpufreq: scmi: Fix frequency invariance in slow path cpufreq: check if policy is inactive early in __cpufreq_get() cpufreq: scpi/scmi: Fix freeing of dynamic OPPs cpufreq / Documentation: Update cpufreq MAINTAINERS entry * pm-sleep: PM: sleep: call devfreq suspend/resume	2019-01-11 10:09:51 +01:00
Sudeep Holla	2f66196208	cpufreq: check if policy is inactive early in __cpufreq_get() cpuinfo_cur_freq gets current CPU frequency as detected by hardware while scaling_cur_freq reports the last known CPU frequency. Some platforms may not allow checking the CPU frequency of an offline CPU or the associated resources may have been released via cpufreq_exit when the CPU gets offlined, in which case the policy would have been invalidated already. If we attempt to get current frequency from the hardware, it may result in hang or crash. For example on Juno, I see: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188 [0000000000000188] pgd=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 5 PID: 4202 Comm: cat Not tainted 4.20.0-08251-ga0f2c0318a15-dirty #87 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform pstate: 40000005 (nZcv daif -PAN -UAO) pc : scmi_cpufreq_get_rate+0x34/0xb0 lr : scmi_cpufreq_get_rate+0x34/0xb0 Call trace: scmi_cpufreq_get_rate+0x34/0xb0 __cpufreq_get+0x34/0xc0 show_cpuinfo_cur_freq+0x24/0x78 show+0x40/0x60 sysfs_kf_seq_show+0xc0/0x148 kernfs_seq_show+0x44/0x50 seq_read+0xd4/0x480 kernfs_fop_read+0x15c/0x208 __vfs_read+0x60/0x188 vfs_read+0x94/0x150 ksys_read+0x6c/0xd8 __arm64_sys_read+0x24/0x30 el0_svc_common+0x78/0x100 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc ---[ end trace 3d1024e58f77f6b2 ]--- So fix the issue by checking if the policy is invalid early in __cpufreq_get before attempting to get the current frequency. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-08 10:49:50 +01:00
Quentin Perret	531b5c9f5c	sched/topology: Make Energy Aware Scheduling depend on schedutil Energy Aware Scheduling (EAS) is designed with the assumption that frequencies of CPUs follow their utilization value. When using a CPUFreq governor other than schedutil, the chances of this assumption being true are small, if any. When schedutil is being used, EAS' predictions are at least consistent with the frequency requests. Although those requests have no guarantees to be honored by the hardware, they should at least guide DVFS in the right direction and provide some hope in regards to the EAS model being accurate. To make sure EAS is only used in a sane configuration, create a strong dependency on schedutil being used. Since having sugov compiled-in does not provide that guarantee, make CPUFreq call a scheduler function on governor changes hence letting it rebuild the scheduling domains, check the governors of the online CPUs, and enable/disable EAS accordingly. Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-9-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-12-11 15:17:00 +01:00
Igor Stoppa	0e7ea2f3b0	cpufreq: remove unnecessary unlikely() WARN_ON() already contains an unlikely(), so it's not necessary to wrap it into another. Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-09-10 12:04:42 +02:00
Waiman Long	9b3d9bb3e4	cpufreq: Fix a circular lock dependency problem With lockdep turned on, the following circular lock dependency problem was reported: [ 57.470040] ====================================================== [ 57.502900] WARNING: possible circular locking dependency detected [ 57.535208] 4.18.0-0.rc3.1.el8+7.x86_64+debug #1 Tainted: G [ 57.577761] ------------------------------------------------------ [ 57.609714] tuned/1505 is trying to acquire lock: [ 57.633808] 00000000559deec5 (cpu_hotplug_lock.rw_sem){++++}, at: store+0x27/0x120 [ 57.672880] [ 57.672880] but task is already holding lock: [ 57.702184] 000000002136ca64 (kn->count#118){++++}, at: kernfs_fop_write+0x1d0/0x410 [ 57.742176] [ 57.742176] which lock already depends on the new lock. [ 57.742176] [ 57.785220] [ 57.785220] the existing dependency chain (in reverse order) is: : [ 58.932512] other info that might help us debug this: [ 58.932512] [ 58.973344] Chain exists of: [ 58.973344] cpu_hotplug_lock.rw_sem --> subsys mutex#5 --> kn->count#118 [ 58.973344] [ 59.030795] Possible unsafe locking scenario: [ 59.030795] [ 59.061248] CPU0 CPU1 [ 59.085377] ---- ---- [ 59.108160] lock(kn->count#118); [ 59.124935] lock(subsys mutex#5); [ 59.156330] lock(kn->count#118); [ 59.186088] lock(cpu_hotplug_lock.rw_sem); [ 59.208541] [ 59.208541] * DEADLOCK * In the cpufreq_register_driver() function, the lock sequence is: cpus_read_lock --> kn->count For the cpufreq sysfs store method, the lock sequence is: kn->count --> cpus_read_lock These sequences are actually safe as they are taking a shared lock on cpu_hotplug_lock. However, the current lockdep code doesn't check for shared locking when detecting circular lock dependency. Fixing that could be a substantial effort. Instead, we can work around this problem by using cpus_read_trylock() in the store method which is much simpler. The chance of not getting the read lock is very small. If that happens, the userspace application that writes the sysfs file will get an error. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-26 10:37:36 +02:00
Ruchi Kandoi	601b218568	cpufreq: trace frequency limits change systrace, used for tracing on Android systems, has carried a patch for many years in the Android tree that traces when the cpufreq limits change. With the help of this information, systrace can know when the policy limits change and can visually display the data. Let's add upstream support for the same. Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-26 10:17:47 +02:00
Sebastian Andrzej Siewior	cc85de361d	cpufreq: Use static SRCU initializer Use the static SRCU initializer for `cpufreq_transition_notifier_list'. This avoids the init_cpufreq_transition_notifier_list() initcall. Its only purpose is to initialize the SRCU notifier once during boot and set another variable which is used as an indicator whether the init was performed before cpufreq_register_notifier() was used. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-05-30 11:16:59 +02:00
Tao Wang	c7d1f119c4	cpufreq: Fix new policy initialization during limits updates via sysfs If the policy limits are updated via cpufreq_update_policy() and subsequently via sysfs, the limits stored in user_policy may be set incorrectly. For example, if both min and max are set via sysfs to the maximum available frequency, user_policy.min and user_policy.max will also be the maximum. If a policy notifier triggered by cpufreq_update_policy() lowers both the min and the max at this point, that change is not reflected by the user_policy limits, so if the max is updated again via sysfs to the same lower value, then user_policy.max will be lower than user_policy.min which shouldn't happen. In particular, if one of the policy CPUs is then taken offline and back online, cpufreq_set_policy() will fail for it due to a failing limits check. To prevent that from happening, initialize the min and max fields of the new_policy object to the ones stored in user_policy that were previously set via sysfs. Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-05-30 10:11:34 +02:00
Viresh Kumar	20b5324d83	cpufreq: optimize cpufreq_notify_transition() cpufreq_notify_transition() calls __cpufreq_notify_transition() for each CPU of a policy. There is a lot of code in __cpufreq_notify_transition() though which isn't required to be executed for each CPU, like checking about disabled cpufreq or irqs, adjusting jiffies, updating cpufreq stats and some debug print messages. This commit merges __cpufreq_notify_transition() into cpufreq_notify_transition() and modifies cpufreq_notify_transition() to execute minimum amount of code for each CPU. Also fix the kerneldoc for cpufreq_notify_transition() while at it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-05-13 11:09:00 +02:00
Viresh Kumar	92c99d159c	cpufreq: Don't validate cpufreq table from cpufreq_generic_init() The cpufreq table is already validated by the cpufreq core and none of the users of cpufreq_generic_init() have any dependency on it to validate the table as well. Don't validate the cpufreq table anymore from cpufreq_generic_init(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-03-20 12:07:51 +01:00
Viresh Kumar	d417e0691a	cpufreq: Validate frequency table in the core By design, cpufreq drivers are responsible for calling cpufreq_frequency_table_cpuinfo() from their ->init() callbacks to validate the frequency table. However, if a cpufreq driver is buggy and fails to do so properly, it can lead to unexpected behavior of the driver or the cpufreq core at a later point in time. It would be better if the core could validate the frequency table during driver initialization. To that end, introduce cpufreq_table_validate_and_sort() and make the cpufreq core call it right after invoking the ->init() callback of the driver and destroy the cpufreq policy if the table is invalid. For the time being the validation of the table happens twice, once from the driver and then from the core. The individual drivers will be updated separately to drop table validation if they don't need it for other reasons. The frequency table is marked "sorted" or "unsorted" by the new helper now instead of in cpufreq_table_validate_and_show(), as it should only be done after validating the table (which the drivers won't do going forward). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject/changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-02-27 18:22:12 +01:00
Viresh Kumar	b24b6478e6	cpufreq: Reorder cpufreq_online() error code path Ideally the de-allocation of resources should happen in the exact opposite order in which they were allocated. It helps maintain the code in long term, even if nothing really breaks with incorrect ordering. That wasn't followed in cpufreq_online() and it has some inconsistencies. For example, the symlinks were created from within the locked region while they are removed only after putting the locks. Also ->exit() should have been called only after the symlinks are removed and the lock is dropped, as that was the case when ->init() was first called. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-02-27 18:22:12 +01:00
Bo Yan	703cbaa601	cpufreq: Skip cpufreq resume if it's not suspended cpufreq_resume can be called even without preceding cpufreq_suspend. This can happen in following scenario: suspend_devices_and_enter --> dpm_suspend_start --> dpm_prepare --> device_prepare : this function errors out --> dpm_suspend: this is skipped due to dpm_prepare failure this means cpufreq_suspend is skipped over --> goto Recover_platform, due to previous error --> goto Resume_devices --> dpm_resume_end --> dpm_resume --> cpufreq_resume In case schedutil is used as frequency governor, cpufreq_resume will eventually call sugov_start, which does following: memset(sg_cpu, 0, sizeof(*sg_cpu)); .... This effectively erases function pointer for frequency update, causing crash later on. The function pointer would have been set correctly if subsequent cpufreq_add_update_util_hook runs successfully, but that function returns earlier because cpufreq_suspend was not called: if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu))) return; The fix is to check cpufreq_suspended first, if it's false, that means cpufreq_suspend was not called in the first place, so do not resume cpufreq. Signed-off-by: Bo Yan <byan@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Dropped printing a message ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-02-05 11:03:33 +01:00
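The guard described in the entry above (resume does nothing unless a suspend actually happened) can be sketched with a single flag. This is a toy userspace model under stated assumptions: `toy_suspended`, `toy_governor_starts`, `toy_suspend()` and `toy_resume()` are hypothetical names, and the counter merely stands in for the governor restart that the real resume path would trigger.

```c
#include <stdbool.h>

/* Set by the suspend path, cleared by a completed resume. */
static bool toy_suspended;

/* Counts how often the (toy) governor was restarted on resume. */
static int toy_governor_starts;

/* Suspend path: record that a suspend really took place. */
static void toy_suspend(void)
{
	toy_suspended = true;
}

/*
 * Resume path: bail out early if no suspend preceded this call, so a
 * running governor is never restarted (and its state never reset)
 * behind its back.
 */
static void toy_resume(void)
{
	if (!toy_suspended)
		return;

	toy_suspended = false;
	toy_governor_starts++;
}
```

A resume that follows a failed prepare phase, where suspend was skipped, therefore becomes a no-op.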
Rafael J. Wysocki	a8b149d32b	cpufreq: Fix governor module removal race It is possible to remove a cpufreq governor module after cpufreq_parse_governor() has returned success in store_scaling_governor() and before cpufreq_set_policy() acquires a reference to it, because the governor list is not protected during that period and nothing prevents the governor from being unregistered then. Prevent that from happening by acquiring an extra reference to the governor module temporarily in cpufreq_parse_governor(), under cpufreq_governor_mutex, and dropping it in store_scaling_governor(), when cpufreq_set_policy() returns. Note that the second cpufreq_parse_governor() call site is fine, because it only cares about the policy member of new_policy. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-12-04 15:35:41 +01:00
Rafael J. Wysocki	70d1ff7116	cpufreq: Drop pointless return statement Drop a pointless return statement from cpufreq_unregister_governor(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-12-04 15:35:41 +01:00
Rafael J. Wysocki	ae0ff89f36	cpufreq: Pass policy pointer to cpufreq_parse_governor() Pass policy pointer to cpufreq_parse_governor() instead of passing pointers to two members of it so as to make the code slightly more straightforward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-12-04 15:35:41 +01:00
Rafael J. Wysocki	045149e6a2	cpufreq: Clean up cpufreq_parse_governor() Drop an unnecessary local variable from cpufreq_parse_governor() and rearrange the code in there to make it easier to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-12-04 15:35:40 +01:00
Dietmar Eggemann	e7d5459dfa	cpufreq: provide default frequency-invariance setter function Frequency-invariant accounting support based on the ratio of current frequency and maximum supported frequency is an optional feature an arch can implement. Since there are cpufreq drivers (e.g. cpufreq-dt) which can be built for different architectures, a default implementation of the frequency-invariance setter function arch_set_freq_scale() is needed. This default implementation is an empty weak function which will be overridden by a strong function in case the arch provides one. The setter function passes the cpumask of related (to the frequency change) cpus (online and offline cpus), the (new) current frequency and the maximum supported frequency. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-10-03 02:37:53 +02:00
Rafael J. Wysocki	08a10002be	Merge branch 'pm-cpufreq-sched' * pm-cpufreq-sched: cpufreq: schedutil: Always process remote callback with slow switching cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily cpufreq: Return 0 from ->fast_switch() on errors cpufreq: Simplify cpufreq_can_do_remote_dvfs() cpufreq: Process remote callbacks from any CPU if the platform permits sched: cpufreq: Allow remote cpufreq callbacks cpufreq: schedutil: Use unsigned int for iowait boost cpufreq: schedutil: Make iowait boost more energy efficient	2017-09-04 00:05:22 +02:00
Viresh Kumar	e948bc8fbe	cpufreq: Cap the default transition delay value to 10 ms If transition_delay_us isn't defined by the cpufreq driver, the default value of transition delay (time after which the cpufreq governor will try updating the frequency again) is currently calculated by multiplying transition_latency (nsec) with LATENCY_MULTIPLIER (1000) and then converting this time to usec. That gives the exact same value as transition_latency, just that the time unit is usec instead of nsec. With acpi-cpufreq for example, transition_latency is set to around 10 usec and we get transition delay as 10 ms. Which seems to be a reasonable amount of time to reevaluate the frequency again. But for platforms where frequency switching isn't that fast (like ARM), the transition_latency varies from 500 usec to 3 ms, and the transition delay becomes 500 ms to 3 seconds. Of course, that is a pretty bad default value to start with. We can try to come across a better formula (instead of multiplying with LATENCY_MULTIPLIER) to solve this problem, but will that be worth it ? This patch tries a simple approach and caps the maximum value of default transition delay to 10 ms. Of course, userspace can still come in and change this value anytime or individual drivers can rather provide transition_delay_us instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-22 15:50:03 +02:00
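The arithmetic described in e948bc8fbe can be sketched as follows. This is a minimal illustration, not the kernel's code: the function name `default_transition_delay_us` and the constant `MAX_DEFAULT_DELAY_US` are hypothetical, and only the multiplier (1000), the nsec-to-usec conversion and the 10 ms cap are taken from the commit message.

```c
#include <stdint.h>

#define NSEC_PER_USEC        1000u
#define LATENCY_MULTIPLIER   1000u   /* value quoted in the commit message */
#define MAX_DEFAULT_DELAY_US 10000u  /* hypothetical name for the 10 ms cap */

/*
 * Hypothetical helper: returns the delay (in usec) a governor should wait
 * before re-evaluating the frequency.
 *
 * driver_delay_us        - value supplied by the driver, 0 if not defined
 * transition_latency_ns  - driver-reported transition latency in nsec
 */
unsigned int default_transition_delay_us(unsigned int driver_delay_us,
                                         unsigned int transition_latency_ns)
{
    uint64_t delay_us;

    /* A driver-provided delay always wins. */
    if (driver_delay_us != 0)
        return driver_delay_us;

    /* latency (ns) * 1000, converted to usec: numerically the same value. */
    delay_us = (uint64_t)transition_latency_ns * LATENCY_MULTIPLIER
               / NSEC_PER_USEC;

    /* Cap the default so slow-switching platforms do not get 0.5 s to 3 s. */
    if (delay_us > MAX_DEFAULT_DELAY_US)
        return MAX_DEFAULT_DELAY_US;

    return (unsigned int)delay_us;
}
```

With a 10 usec latency (10000 ns) the result is 10000 usec, matching the acpi-cpufreq case in the message; a 500 usec latency would give 500 ms uncapped and is reduced to 10 ms.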
Viresh Kumar	209887e6b9	cpufreq: Return 0 from ->fast_switch() on errors CPUFREQ_ENTRY_INVALID is a special symbol which is used to specify that an entry in the cpufreq table is invalid. But using it outside of the scope of the cpufreq table looks a bit incorrect. We can represent an invalid frequency by writing it as 0 instead if we need. Note that it is already done that way for the return value of the ->get() callback. Lets do the same for ->fast_switch() and not use CPUFREQ_ENTRY_INVALID outside of the scope of cpufreq table. Also update the comment over cpufreq_driver_fast_switch() to clearly mention what this returns. None of the drivers return CPUFREQ_ENTRY_INVALID as of now from ->fast_switch() callback and so we don't need to update any of those. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-10 01:26:35 +02:00
Viresh Kumar	fc4c709fc8	cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency With the recent updates, CPUFREQ_ETERNAL is only used by the drivers which don't know their transition latency but want to use dynamic switching. Anyway, the routine cpufreq_policy_transition_delay_us() caps the value of transition latency to 10 ms now and that can be used safely with such platforms. Remove the check from cpufreq_init_governor() and allow dynamic switching for such configurations as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 00:15:47 +02:00
Viresh Kumar	fe829ed8ef	cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag The policy->transition_latency field is used for multiple purposes today and it's not straightforward at all. This is how it is used: A. Set the correct transition_latency value. B. Set it to CPUFREQ_ETERNAL because: 1. We don't want automatic dynamic switching (with ondemand/conservative) to happen at all. 2. We don't know the transition latency. This patch handles the B.1. case in a more readable way. A new flag for the cpufreq drivers is added to disallow use of cpufreq governors which have dynamic_switching flag set. All the current cpufreq drivers which are setting transition_latency unconditionally to CPUFREQ_ETERNAL are updated to use it. They don't need to set transition_latency anymore. There shouldn't be any functional change after this patch. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 00:15:46 +02:00
Viresh Kumar	ed4676e254	cpufreq: Replace "max_transition_latency" with "dynamic_switching" There is no limitation in the ondemand or conservative governors which disallows the transition_latency to be greater than 10 ms. The max_transition_latency field is rather used to disallow automatic dynamic frequency switching for platforms which didn't want these governors to run. Replace max_transition_latency with a boolean (dynamic_switching) and check for transition_latency == CPUFREQ_ETERNAL along with that. This makes it pretty straightforward to read/understand now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 00:15:45 +02:00
Viresh Kumar	aa7519af45	cpufreq: Use transition_delay_us for legacy governors as well The policy->transition_delay_us field is used only by the schedutil governor currently, and this field describes how fast the driver wants the cpufreq governor to change CPUs frequency. It should rather be a common thing across all governors, as it doesn't have any schedutil dependency here. Create a new helper cpufreq_policy_transition_delay_us() to get the transition delay across all governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-22 02:25:20 +02:00
Linus Torvalds	408c9861c6	Power management updates for v4.13-rc1 Merge tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The big ticket items here are the rework of suspend-to-idle in order to add proper support for power button wakeup from it on recent Dell laptops and the rework of interfaces exporting the current CPU frequency on x86. In addition to that, support for a few new pieces of hardware is added, the PCI/ACPI device wakeup infrastructure is simplified significantly and the wakeup IRQ framework is fixed to unbreak the IRQ bus locking infrastructure. Also, there are some functional improvements for intel_pstate, tools updates and small fixes and cleanups all over. Specifics: - Rework suspend-to-idle to allow it to take wakeup events signaled by the EC into account on ACPI-based platforms in order to properly support power button wakeup from suspend-to-idle on recent Dell laptops (Rafael Wysocki).
That includes the core suspend-to-idle code rework, support for the Low Power S0 _DSM interface, and support for the ACPI INT0002 Virtual GPIO device from Hans de Goede (required for USB keyboard wakeup from suspend-to-idle to work on some machines). - Stop trying to export the current CPU frequency via /proc/cpuinfo on x86 as that is inaccurate and confusing (Len Brown). - Rework the way in which the current CPU frequency is exported by the kernel (over the cpufreq sysfs interface) on x86 systems with the APERF and MPERF registers by always using values read from these registers, when available, to compute the current frequency regardless of which cpufreq driver is in use (Len Brown). - Rework the PCI/ACPI device wakeup infrastructure to remove the questionable and artificial distinction between "devices that can wake up the system from sleep states" and "devices that can generate wakeup signals in the working state" from it, which allows the code to be simplified quite a bit (Rafael Wysocki). - Fix the wakeup IRQ framework by making it use SRCU instead of RCU which doesn't allow sleeping in the read-side critical sections, but which in turn is expected to be allowed by the IRQ bus locking infrastructure (Thomas Gleixner). - Modify some computations in the intel_pstate driver to avoid rounding errors resulting from them (Srinivas Pandruvada). - Reduce the overhead of the intel_pstate driver in the HWP (hardware-managed P-states) mode and when the "performance" P-state selection algorithm is in use by making it avoid registering scheduler callbacks in those cases (Len Brown). - Rework the energy_performance_preference sysfs knob in intel_pstate by changing the values that correspond to different symbolic hint names used by it (Len Brown). - Make it possible to use more than one cpuidle driver at the same time on ARM (Daniel Lezcano). - Make it possible to prevent the cpuidle menu governor from using the 0 state by disabling it via sysfs (Nicholas Piggin). 
- Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1 on AMD systems (Yazen Ghannam). - Make the CPPC cpufreq driver take the lowest nonlinear performance information into account (Prashanth Prakash). - Add support for hi3660 to the cpufreq-dt driver, fix the imx6q driver and clean up the sfi, exynos5440 and intel_pstate drivers (Colin Ian King, Krzysztof Kozlowski, Octavian Purdila, Rafael Wysocki, Tao Wang). - Fix a few minor issues in the generic power domains (genpd) framework and clean it up somewhat (Krzysztof Kozlowski, Mikko Perttunen, Viresh Kumar). - Fix a couple of minor issues in the operating performance points (OPP) framework and clean it up somewhat (Viresh Kumar). - Fix a CONFIG dependency in the hibernation core and clean it up slightly (Balbir Singh, Arvind Yadav, BaoJun Luo). - Add rk3228 support to the rockchip-io adaptive voltage scaling (AVS) driver (David Wu). - Fix an incorrect bit shift operation in the RAPL power capping driver (Adam Lessnau). - Add support for the EPP field in the HWP (hardware managed P-states) control register, HWP.EPP, to the x86_energy_perf_policy tool and update msr-index.h with HWP.EPP values (Len Brown). - Fix some minor issues in the turbostat tool (Len Brown). - Add support for AMD family 0x17 CPUs to the cpupower tool and fix a minor issue in it (Sherry Hurwitz). - Assorted cleanups, mostly related to the constification of some data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof Kozlowski)" * tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (69 commits) cpufreq: Update scaling_cur_freq documentation cpufreq: intel_pstate: Clean up after performance governor changes PM: hibernate: constify attribute_group structures. 
cpuidle: menu: allow state 0 to be disabled intel_idle: Use more common logging style PM / Domains: Fix missing default_power_down_ok comment PM / Domains: Fix unsafe iteration over modified list of domains PM / Domains: Fix unsafe iteration over modified list of domain providers PM / Domains: Fix unsafe iteration over modified list of device links PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device PM / Domains: Call driver's noirq callbacks PM / core: Drop run_wake flag from struct dev_pm_info PCI / PM: Simplify device wakeup settings code PCI / PM: Drop pme_interrupt flag from struct pci_dev ACPI / PM: Consolidate device wakeup settings code ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags PM / QoS: constify *_attribute_group. PM / AVS: rockchip-io: add io selectors and supplies for rk3228 powercap/RAPL: prevent overridding bits outside of the mask PM / sysfs: Constify attribute groups ...	2017-07-04 13:39:41 -07:00
Linus Torvalds	9a9594efe5	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...	2017-07-03 18:08:06 -07:00
Len Brown	f8475cef90	x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF The goal of this change is to give users a uniform and meaningful result when they read /sys/...cpufreq/scaling_cur_freq on modern x86 hardware, as compared to what they get today. Modern x86 processors include the hardware needed to accurately calculate frequency over an interval -- APERF, MPERF, and the TSC. Here we provide an x86 routine to make this calculation on supported hardware, and use it in preference to any driver-specific cpufreq_driver.get() routine. MHz is computed like so: MHz = base_MHz * delta_APERF / delta_MPERF MHz is the average frequency of the busy processor over a measurement interval. The interval is defined to be the time between successive invocations of aperfmperf_khz_on_cpu(), which are expected to happen on-demand when users read sysfs attribute cpufreq/scaling_cur_freq. As with previous methods of calculating MHz, idle time is excluded. base_MHz above is from TSC calibration global "cpu_khz". This x86 native method to calculate MHz returns a meaningful result no matter if P-states are controlled by hardware or firmware and/or if the Linux cpufreq sub-system is or is-not installed. When this routine is invoked more frequently, the measurement interval becomes shorter. However, the code limits re-computation to 10ms intervals so that average frequency remains meaningful. Discerning users are encouraged to take advantage of the turbostat(8) utility, which can gracefully handle concurrent measurement intervals of arbitrary length. Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-06-27 01:47:32 +02:00
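The formula quoted in f8475cef90 (frequency = base * delta_APERF / delta_MPERF) can be illustrated with a small sketch. The function name `avg_busy_khz` and its parameters are hypothetical; reading the APERF/MPERF counters and the 10 ms rate limiting mentioned in the message are deliberately left out, and returning 0 for an empty interval is an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: average frequency of the busy CPU over one
 * measurement interval, in kHz.
 *
 * base_khz     - base frequency (the message takes it from "cpu_khz")
 * delta_aperf  - APERF counter increase over the interval
 * delta_mperf  - MPERF counter increase over the interval
 */
unsigned int avg_busy_khz(unsigned int base_khz,
                          uint64_t delta_aperf,
                          uint64_t delta_mperf)
{
    /* No MPERF progress means no busy time was observed; assumed result 0. */
    if (delta_mperf == 0)
        return 0;

    /*
     * Multiply before dividing to keep precision. The 64-bit product is
     * assumed not to overflow for the short intervals considered here.
     */
    return (unsigned int)((uint64_t)base_khz * delta_aperf / delta_mperf);
}
```

For example, a 2.4 GHz base with APERF advancing 1.5 times as fast as MPERF yields 3.6 GHz, i.e. the CPU ran above its base frequency while busy.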
David Arcari	6c77003677	cpufreq: cpufreq_register_driver() should return -ENODEV if init fails For a driver that does not set the CPUFREQ_STICKY flag, if all of the ->init() calls fail, cpufreq_register_driver() should return an error. This will prevent the driver from loading. Fixes: `ce1bcfe94d` (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs) Cc: 4.0+ <stable@vger.kernel.org> # 4.0+ Signed-off-by: David Arcari <darcari@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-05-30 00:07:20 +02:00
Sebastian Andrzej Siewior	a92551e41d	cpufreq: Use cpuhp_setup_state_nocalls_cpuslocked() cpufreq holds get_online_cpus() while invoking cpuhp_setup_state_nocalls() to make subsys_interface_register() and the registration of hotplug calls atomic versus cpu hotplug. cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup/remove_state_nocalls_cpuslocked() to avoid the nested call. Convert *_online_cpus() to the new interfaces while at it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170524081547.731628408@linutronix.de	2017-05-26 10:10:38 +02:00

641 commits