linux-stable/drivers/thermal/thermal_core.c

1501 lines
38 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0
/*
* thermal.c - Generic Thermal Management Sysfs support.
*
* Copyright (C) 2008 Intel Corp
* Copyright (C) 2008 Zhang Rui <rui.zhang@intel.com>
* Copyright (C) 2008 Sujith Thomas <sujith.thomas@intel.com>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/device.h>
#include <linux/err.h>
#include <linux/export.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 08:04:11 +00:00
#include <linux/slab.h>
#include <linux/kdev_t.h>
#include <linux/idr.h>
#include <linux/thermal.h>
#include <linux/reboot.h>
#include <linux/string.h>
#include <linux/of.h>
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
#include <linux/suspend.h>
#define CREATE_TRACE_POINTS
#include <trace/events/thermal.h>
#include "thermal_core.h"
#include "thermal_hwmon.h"
static DEFINE_IDA(thermal_tz_ida);
static DEFINE_IDA(thermal_cdev_ida);
static LIST_HEAD(thermal_tz_list);
static LIST_HEAD(thermal_cdev_list);
static LIST_HEAD(thermal_governor_list);
static DEFINE_MUTEX(thermal_list_lock);
static DEFINE_MUTEX(thermal_governor_lock);
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
static atomic_t in_suspend;
static struct thermal_governor *def_governor;
/*
* Governor section: set of functions to handle thermal governors
*
* Functions to help in the life cycle of thermal governors within
* the thermal core and by the thermal governor code.
*/
static struct thermal_governor *__find_governor(const char *name)
{
struct thermal_governor *pos;
if (!name || !name[0])
return def_governor;
list_for_each_entry(pos, &thermal_governor_list, governor_list)
if (!strncasecmp(name, pos->name, THERMAL_NAME_LENGTH))
return pos;
return NULL;
}
/**
* bind_previous_governor() - bind the previous governor of the thermal zone
* @tz: a valid pointer to a struct thermal_zone_device
* @failed_gov_name: the name of the governor that failed to register
*
* Register the previous governor of the thermal zone after a new
* governor has failed to be bound.
*/
static void bind_previous_governor(struct thermal_zone_device *tz,
const char *failed_gov_name)
{
if (tz->governor && tz->governor->bind_to_tz) {
if (tz->governor->bind_to_tz(tz)) {
dev_err(&tz->device,
"governor %s failed to bind and the previous one (%s) failed to bind again, thermal zone %s has no governor\n",
failed_gov_name, tz->governor->name, tz->type);
tz->governor = NULL;
}
}
}
/**
* thermal_set_governor() - Switch to another governor
* @tz: a valid pointer to a struct thermal_zone_device
* @new_gov: pointer to the new governor
*
* Change the governor of thermal zone @tz.
*
* Return: 0 on success, an error if the new governor's bind_to_tz() failed.
*/
static int thermal_set_governor(struct thermal_zone_device *tz,
struct thermal_governor *new_gov)
{
int ret = 0;
if (tz->governor && tz->governor->unbind_from_tz)
tz->governor->unbind_from_tz(tz);
if (new_gov && new_gov->bind_to_tz) {
ret = new_gov->bind_to_tz(tz);
if (ret) {
bind_previous_governor(tz, new_gov->name);
return ret;
}
}
tz->governor = new_gov;
return ret;
}
int thermal_register_governor(struct thermal_governor *governor)
{
int err;
const char *name;
struct thermal_zone_device *pos;
if (!governor)
return -EINVAL;
mutex_lock(&thermal_governor_lock);
err = -EBUSY;
if (!__find_governor(governor->name)) {
bool match_default;
err = 0;
list_add(&governor->governor_list, &thermal_governor_list);
match_default = !strncmp(governor->name,
DEFAULT_THERMAL_GOVERNOR,
THERMAL_NAME_LENGTH);
if (!def_governor && match_default)
def_governor = governor;
}
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node) {
/*
* only thermal zones with specified tz->tzp->governor_name
* may run with tz->govenor unset
*/
if (pos->governor)
continue;
name = pos->tzp->governor_name;
if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH)) {
int ret;
ret = thermal_set_governor(pos, governor);
if (ret)
dev_err(&pos->device,
"Failed to set governor %s for thermal zone %s: %d\n",
governor->name, pos->type, ret);
}
}
mutex_unlock(&thermal_list_lock);
mutex_unlock(&thermal_governor_lock);
return err;
}
void thermal_unregister_governor(struct thermal_governor *governor)
{
struct thermal_zone_device *pos;
if (!governor)
return;
mutex_lock(&thermal_governor_lock);
if (!__find_governor(governor->name))
goto exit;
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!strncasecmp(pos->governor->name, governor->name,
THERMAL_NAME_LENGTH))
thermal_set_governor(pos, NULL);
}
mutex_unlock(&thermal_list_lock);
list_del(&governor->governor_list);
exit:
mutex_unlock(&thermal_governor_lock);
}
int thermal_zone_device_set_policy(struct thermal_zone_device *tz,
char *policy)
{
struct thermal_governor *gov;
int ret = -EINVAL;
mutex_lock(&thermal_governor_lock);
mutex_lock(&tz->lock);
gov = __find_governor(strim(policy));
if (!gov)
goto exit;
ret = thermal_set_governor(tz, gov);
exit:
mutex_unlock(&tz->lock);
mutex_unlock(&thermal_governor_lock);
thermal_notify_tz_gov_change(tz->id, policy);
return ret;
}
int thermal_build_list_of_policies(char *buf)
{
struct thermal_governor *pos;
ssize_t count = 0;
mutex_lock(&thermal_governor_lock);
list_for_each_entry(pos, &thermal_governor_list, governor_list) {
count += scnprintf(buf + count, PAGE_SIZE - count, "%s ",
pos->name);
}
count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
mutex_unlock(&thermal_governor_lock);
return count;
}
static void __init thermal_unregister_governors(void)
{
struct thermal_governor **governor;
for_each_governor_table(governor)
thermal_unregister_governor(*governor);
}
static int __init thermal_register_governors(void)
{
int ret = 0;
struct thermal_governor **governor;
for_each_governor_table(governor) {
ret = thermal_register_governor(*governor);
if (ret) {
pr_err("Failed to register governor: '%s'",
(*governor)->name);
break;
}
pr_info("Registered thermal governor '%s'",
(*governor)->name);
}
if (ret) {
struct thermal_governor **gov;
for_each_governor_table(gov) {
if (gov == governor)
break;
thermal_unregister_governor(*gov);
}
}
return ret;
}
/*
* Zone update section: main control loop applied to each zone while monitoring
*
* in polling mode. The monitoring is done using a workqueue.
* Same update may be done on a zone by calling thermal_zone_device_update().
*
* An update means:
* - Non-critical trips will invoke the governor responsible for that zone;
* - Hot trips will produce a notification to userspace;
* - Critical trip point will cause a system shutdown.
*/
static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
unsigned long delay)
{
if (delay)
mod_delayed_work(system_freezable_power_efficient_wq,
&tz->poll_queue, delay);
else
thermal: Fix deadlock in thermal thermal_zone_device_check 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid a use-after-free issue. However, cancel_delayed_work_sync could be called insides the WQ causing deadlock. [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000 [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.642447] c0 1162 Call trace: [54109.642456] c0 1162 __switch_to+0x138/0x158 [54109.642467] c0 1162 __schedule+0xba4/0x1434 [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28 [54109.642492] c0 1162 wait_for_common+0x138/0x2e8 [54109.642511] c0 1162 flush_work+0x348/0x40c [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218 [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.642574] c0 1162 process_one_work+0x3cc/0x69c [54109.642583] c0 1162 worker_thread+0x49c/0x7c0 [54109.642593] c0 1162 kthread+0x17c/0x1b0 [54109.642602] c0 1162 ret_from_fork+0x10/0x18 [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000 [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.643077] c0 1162 Call trace: [54109.643085] c0 1162 __switch_to+0x138/0x158 [54109.643095] c0 1162 __schedule+0xba4/0x1434 [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28 [54109.643114] c0 1162 wait_for_common+0x138/0x2e8 [54109.643122] c0 1162 flush_work+0x348/0x40c [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218 [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.643167] c0 1162 process_one_work+0x3cc/0x69c [54109.643177] c0 1162 worker_thread+0x49c/0x7c0 [54109.643186] c0 1162 kthread+0x17c/0x1b0 [54109.643195] c0 1162 ret_from_fork+0x10/0x18 [54109.644500] c0 1162 cat D 0 7766 1 0x00000001 [54109.644515] c0 1162 Call trace: [54109.644524] c0 1162 __switch_to+0x138/0x158 [54109.644536] c0 1162 __schedule+0xba4/0x1434 [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0 [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0 [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20 [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360 [54109.644586] c0 1162 temp_show+0x30/0x78 [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0 [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4 [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88 [54109.644656] c0 1162 seq_read+0x1f4/0x73c [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318 [54109.644683] c0 1162 __vfs_read+0x50/0x1bc [54109.644692] c0 1162 vfs_read+0xa4/0x140 [54109.644701] c0 1162 SyS_read+0xbc/0x144 [54109.644708] c0 1162 el0_svc_naked+0x34/0x38 [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic] Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") Cc: stable@vger.kernel.org Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-11-12 20:42:23 +00:00
cancel_delayed_work(&tz->poll_queue);
}
static inline bool should_stop_polling(struct thermal_zone_device *tz)
{
return !thermal_zone_device_is_enabled(tz);
}
static void monitor_thermal_zone(struct thermal_zone_device *tz)
{
bool stop;
stop = should_stop_polling(tz);
mutex_lock(&tz->lock);
if (!stop && tz->passive)
thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
else if (!stop && tz->polling_delay_jiffies)
thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
else
thermal_zone_device_set_polling(tz, 0);
mutex_unlock(&tz->lock);
}
static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
{
tz->governor ? tz->governor->throttle(tz, trip) :
def_governor->throttle(tz, trip);
}
void thermal_zone_device_critical(struct thermal_zone_device *tz)
{
/*
* poweroff_delay_ms must be a carefully profiled positive value.
* Its a must for forced_emergency_poweroff_work to be scheduled.
*/
int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
dev_emerg(&tz->device, "%s: critical temperature reached, "
"shutting down\n", tz->type);
hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
}
EXPORT_SYMBOL(thermal_zone_device_critical);
static void handle_critical_trips(struct thermal_zone_device *tz,
int trip, enum thermal_trip_type trip_type)
{
thermal: consistently use int for temperatures The thermal code uses int, long and unsigned long for temperatures in different places. Using an unsigned type limits the thermal framework to positive temperatures without need. Also several drivers currently will report temperatures near UINT_MAX for temperatures below 0°C. This will probably immediately shut the machine down due to overtemperature if started below 0°C. 'long' is 64bit on several architectures. This is not needed since INT_MAX °mC is above the melting point of all known materials. Consistently use a plain 'int' for temperatures throughout the thermal code and the drivers. This only changes the places in the drivers where the temperature is passed around as pointer, when drivers internally use another type this is not changed. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com> Reviewed-by: Darren Hart <dvhart@linux.intel.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Peter Feuerer <peter@piie.net> Cc: Punit Agrawal <punit.agrawal@arm.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Jean Delvare <jdelvare@suse.de> Cc: Peter Feuerer <peter@piie.net> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Lukasz Majewski <l.majewski@samsung.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-acpi@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-omap@vger.kernel.org Cc: linux-samsung-soc@vger.kernel.org Cc: Guenter Roeck <linux@roeck-us.net> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: Darren Hart <dvhart@infradead.org> Cc: lm-sensors@lm-sensors.org Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-07-24 06:12:54 +00:00
int trip_temp;
tz->ops->get_trip_temp(tz, trip, &trip_temp);
/* If we have not crossed the trip_temp, we do not care. */
if (trip_temp <= 0 || tz->temperature < trip_temp)
return;
trace_thermal_zone_trip(tz, trip, trip_type);
if (trip_type == THERMAL_TRIP_HOT && tz->ops->hot)
tz->ops->hot(tz);
else if (trip_type == THERMAL_TRIP_CRITICAL)
tz->ops->critical(tz);
}
static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
{
enum thermal_trip_type type;
int trip_temp, hyst = 0;
2016-03-18 02:03:24 +00:00
/* Ignore disabled trip points */
if (test_bit(trip, &tz->trips_disabled))
return;
tz->ops->get_trip_temp(tz, trip, &trip_temp);
tz->ops->get_trip_type(tz, trip, &type);
if (tz->ops->get_trip_hyst)
tz->ops->get_trip_hyst(tz, trip, &hyst);
if (tz->last_temperature != THERMAL_TEMP_INVALID) {
if (tz->last_temperature < trip_temp &&
tz->temperature >= trip_temp)
thermal_notify_tz_trip_up(tz->id, trip,
tz->temperature);
if (tz->last_temperature >= trip_temp &&
tz->temperature < (trip_temp - hyst))
thermal_notify_tz_trip_down(tz->id, trip,
tz->temperature);
}
if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT)
handle_critical_trips(tz, trip, type);
else
handle_non_critical_trips(tz, trip);
/*
* Alright, we handled this trip successfully.
* So, start monitoring again.
*/
monitor_thermal_zone(tz);
}
static void update_temperature(struct thermal_zone_device *tz)
{
thermal: consistently use int for temperatures The thermal code uses int, long and unsigned long for temperatures in different places. Using an unsigned type limits the thermal framework to positive temperatures without need. Also several drivers currently will report temperatures near UINT_MAX for temperatures below 0°C. This will probably immediately shut the machine down due to overtemperature if started below 0°C. 'long' is 64bit on several architectures. This is not needed since INT_MAX °mC is above the melting point of all known materials. Consistently use a plain 'int' for temperatures throughout the thermal code and the drivers. This only changes the places in the drivers where the temperature is passed around as pointer, when drivers internally use another type this is not changed. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com> Reviewed-by: Darren Hart <dvhart@linux.intel.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Peter Feuerer <peter@piie.net> Cc: Punit Agrawal <punit.agrawal@arm.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Jean Delvare <jdelvare@suse.de> Cc: Peter Feuerer <peter@piie.net> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Lukasz Majewski <l.majewski@samsung.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-acpi@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-omap@vger.kernel.org Cc: linux-samsung-soc@vger.kernel.org Cc: Guenter Roeck <linux@roeck-us.net> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: Darren Hart <dvhart@infradead.org> Cc: lm-sensors@lm-sensors.org Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-07-24 06:12:54 +00:00
int temp, ret;
ret = thermal_zone_get_temp(tz, &temp);
if (ret) {
if (ret != -EAGAIN)
dev_warn(&tz->device,
"failed to read out thermal zone (%d)\n",
ret);
return;
}
mutex_lock(&tz->lock);
tz->last_temperature = tz->temperature;
tz->temperature = temp;
mutex_unlock(&tz->lock);
thermal: debug: add debug statement for core and step_wise To ease debugging thermal problem, add these dynamic debug statements so that user do not need rebuild kernel to see these info. Based on a patch from Zhang Rui for debugging on bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=98671 A sample output after we turn on dynamic debug with the following cmd: # echo 'module thermal_sys +fp' > /sys/kernel/debug/dynamic_debug/control is like: [ 355.147627] update_temperature: thermal thermal_zone0: last_temperature=52000, current_temperature=55000 [ 355.147636] thermal_zone_trip_update: thermal thermal_zone0: Trip1[type=1,temp=79000]:trend=2,throttle=0 [ 355.147644] get_target_state: thermal cooling_device8: cur_state=0 [ 355.147647] thermal_zone_trip_update: thermal cooling_device8: old_target=-1, target=-1 [ 355.147652] get_target_state: thermal cooling_device7: cur_state=0 [ 355.147655] thermal_zone_trip_update: thermal cooling_device7: old_target=-1, target=-1 [ 355.147660] get_target_state: thermal cooling_device6: cur_state=0 [ 355.147663] thermal_zone_trip_update: thermal cooling_device6: old_target=-1, target=-1 [ 355.147668] get_target_state: thermal cooling_device5: cur_state=0 [ 355.147671] thermal_zone_trip_update: thermal cooling_device5: old_target=-1, target=-1 [ 355.147678] thermal_zone_trip_update: thermal thermal_zone0: Trip2[type=0,temp=90000]:trend=1,throttle=0 [ 355.147776] get_target_state: thermal cooling_device0: cur_state=0 [ 355.147783] thermal_zone_trip_update: thermal cooling_device0: old_target=-1, target=-1 [ 355.147792] thermal_zone_trip_update: thermal thermal_zone0: Trip3[type=0,temp=80000]:trend=1,throttle=0 [ 355.147845] get_target_state: thermal cooling_device1: cur_state=0 [ 355.147849] thermal_zone_trip_update: thermal cooling_device1: old_target=-1, target=-1 [ 355.147856] thermal_zone_trip_update: thermal thermal_zone0: Trip4[type=0,temp=70000]:trend=1,throttle=0 [ 355.147904] get_target_state: thermal cooling_device2: cur_state=0 [ 355.147908] thermal_zone_trip_update: thermal cooling_device2: old_target=-1, target=-1 [ 355.147915] thermal_zone_trip_update: thermal thermal_zone0: Trip5[type=0,temp=60000]:trend=1,throttle=0 [ 355.147963] get_target_state: thermal cooling_device3: cur_state=0 [ 355.147967] thermal_zone_trip_update: thermal cooling_device3: old_target=-1, target=-1 [ 355.147973] thermal_zone_trip_update: thermal thermal_zone0: Trip6[type=0,temp=55000]:trend=1,throttle=1 [ 355.148022] get_target_state: thermal cooling_device4: cur_state=0 [ 355.148025] thermal_zone_trip_update: thermal cooling_device4: old_target=-1, target=1 [ 355.148036] thermal_cdev_update: thermal cooling_device4: zone0->target=1 [ 355.169279] thermal_cdev_update: thermal cooling_device4: set to state 1 Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Eduardo Valentin <eduardo.valentin@ti.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-12-02 05:54:26 +00:00
trace_thermal_temperature(tz);
thermal_genl_sampling_temp(tz->id, temp);
}
static void thermal_zone_device_init(struct thermal_zone_device *tz)
{
struct thermal_instance *pos;
tz->temperature = THERMAL_TEMP_INVALID;
tz->prev_low_trip = -INT_MAX;
tz->prev_high_trip = INT_MAX;
list_for_each_entry(pos, &tz->thermal_instances, tz_node)
pos->initialized = false;
}
static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
enum thermal_device_mode mode)
{
int ret = 0;
mutex_lock(&tz->lock);
/* do nothing if mode isn't changing */
if (mode == tz->mode) {
mutex_unlock(&tz->lock);
return ret;
}
if (tz->ops->change_mode)
ret = tz->ops->change_mode(tz, mode);
if (!ret)
tz->mode = mode;
mutex_unlock(&tz->lock);
thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
if (mode == THERMAL_DEVICE_ENABLED)
thermal_notify_tz_enable(tz->id);
else
thermal_notify_tz_disable(tz->id);
return ret;
}
int thermal_zone_device_enable(struct thermal_zone_device *tz)
{
return thermal_zone_device_set_mode(tz, THERMAL_DEVICE_ENABLED);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_enable);
int thermal_zone_device_disable(struct thermal_zone_device *tz)
{
return thermal_zone_device_set_mode(tz, THERMAL_DEVICE_DISABLED);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
{
enum thermal_device_mode mode;
mutex_lock(&tz->lock);
mode = tz->mode;
mutex_unlock(&tz->lock);
return mode == THERMAL_DEVICE_ENABLED;
}
void thermal_zone_device_update(struct thermal_zone_device *tz,
enum thermal_notify_event event)
{
int count;
if (should_stop_polling(tz))
return;
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
if (atomic_read(&in_suspend))
return;
if (WARN_ONCE(!tz->ops->get_temp, "'%s' must not be called without "
"'get_temp' ops set\n", __func__))
return;
update_temperature(tz);
thermal_zone_set_trips(tz);
tz->notify_event = event;
for (count = 0; count < tz->trips; count++)
handle_thermal_trip(tz, count);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_update);
static void thermal_zone_device_check(struct work_struct *work)
{
struct thermal_zone_device *tz = container_of(work, struct
thermal_zone_device,
poll_queue.work);
thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
}
int for_each_thermal_governor(int (*cb)(struct thermal_governor *, void *),
void *data)
{
struct thermal_governor *gov;
int ret = 0;
mutex_lock(&thermal_governor_lock);
list_for_each_entry(gov, &thermal_governor_list, governor_list) {
ret = cb(gov, data);
if (ret)
break;
}
mutex_unlock(&thermal_governor_lock);
return ret;
}
int for_each_thermal_cooling_device(int (*cb)(struct thermal_cooling_device *,
void *), void *data)
{
struct thermal_cooling_device *cdev;
int ret = 0;
mutex_lock(&thermal_list_lock);
list_for_each_entry(cdev, &thermal_cdev_list, node) {
ret = cb(cdev, data);
if (ret)
break;
}
mutex_unlock(&thermal_list_lock);
return ret;
}
int for_each_thermal_zone(int (*cb)(struct thermal_zone_device *, void *),
void *data)
{
struct thermal_zone_device *tz;
int ret = 0;
mutex_lock(&thermal_list_lock);
list_for_each_entry(tz, &thermal_tz_list, node) {
ret = cb(tz, data);
if (ret)
break;
}
mutex_unlock(&thermal_list_lock);
return ret;
}
struct thermal_zone_device *thermal_zone_get_by_id(int id)
{
struct thermal_zone_device *tz, *match = NULL;
mutex_lock(&thermal_list_lock);
list_for_each_entry(tz, &thermal_tz_list, node) {
if (tz->id == id) {
match = tz;
break;
}
}
mutex_unlock(&thermal_list_lock);
return match;
}
/*
* Device management section: cooling devices, zones devices, and binding
*
* Set of functions provided by the thermal core for:
* - cooling devices lifecycle: registration, unregistration,
* binding, and unbinding.
* - thermal zone devices lifecycle: registration, unregistration,
* binding, and unbinding.
*/
/**
* thermal_zone_bind_cooling_device() - bind a cooling device to a thermal zone
* @tz: pointer to struct thermal_zone_device
* @trip: indicates which trip point the cooling devices is
* associated with in this thermal zone.
* @cdev: pointer to struct thermal_cooling_device
* @upper: the Maximum cooling state for this trip point.
* THERMAL_NO_LIMIT means no upper limit,
* and the cooling device can be in max_state.
* @lower: the Minimum cooling state can be used for this trip point.
* THERMAL_NO_LIMIT means no lower limit,
* and the cooling device can be in cooling state 0.
* @weight: The weight of the cooling device to be bound to the
* thermal zone. Use THERMAL_WEIGHT_DEFAULT for the
* default value
*
* This interface function bind a thermal cooling device to the certain trip
* point of a thermal zone device.
* This function is usually called in the thermal zone device .bind callback.
*
* Return: 0 on success, the proper error value otherwise.
*/
int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
int trip,
struct thermal_cooling_device *cdev,
unsigned long upper, unsigned long lower,
unsigned int weight)
{
struct thermal_instance *dev;
struct thermal_instance *pos;
struct thermal_zone_device *pos1;
struct thermal_cooling_device *pos2;
unsigned long max_state;
int result, ret;
if (trip >= tz->trips || trip < 0)
return -EINVAL;
list_for_each_entry(pos1, &thermal_tz_list, node) {
if (pos1 == tz)
break;
}
list_for_each_entry(pos2, &thermal_cdev_list, node) {
if (pos2 == cdev)
break;
}
if (tz != pos1 || cdev != pos2)
return -EINVAL;
ret = cdev->ops->get_max_state(cdev, &max_state);
if (ret)
return ret;
/* lower default 0, upper default max_state */
lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
upper = upper == THERMAL_NO_LIMIT ? max_state : upper;
if (lower > upper || upper > max_state)
return -EINVAL;
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
return -ENOMEM;
dev->tz = tz;
dev->cdev = cdev;
dev->trip = trip;
dev->upper = upper;
dev->lower = lower;
dev->target = THERMAL_NO_TARGET;
dev->weight = weight;
result = ida_simple_get(&tz->ida, 0, 0, GFP_KERNEL);
if (result < 0)
goto free_mem;
dev->id = result;
sprintf(dev->name, "cdev%d", dev->id);
result =
sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name);
if (result)
goto release_ida;
sprintf(dev->attr_name, "cdev%d_trip_point", dev->id);
sysfs_attr_init(&dev->attr.attr);
dev->attr.attr.name = dev->attr_name;
dev->attr.attr.mode = 0444;
dev->attr.show = trip_point_show;
result = device_create_file(&tz->device, &dev->attr);
if (result)
goto remove_symbol_link;
sprintf(dev->weight_attr_name, "cdev%d_weight", dev->id);
sysfs_attr_init(&dev->weight_attr.attr);
dev->weight_attr.attr.name = dev->weight_attr_name;
dev->weight_attr.attr.mode = S_IWUSR | S_IRUGO;
dev->weight_attr.show = weight_show;
dev->weight_attr.store = weight_store;
result = device_create_file(&tz->device, &dev->weight_attr);
if (result)
goto remove_trip_file;
mutex_lock(&tz->lock);
mutex_lock(&cdev->lock);
list_for_each_entry(pos, &tz->thermal_instances, tz_node)
if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
result = -EEXIST;
break;
}
if (!result) {
list_add_tail(&dev->tz_node, &tz->thermal_instances);
list_add_tail(&dev->cdev_node, &cdev->thermal_instances);
atomic_set(&tz->need_update, 1);
}
mutex_unlock(&cdev->lock);
mutex_unlock(&tz->lock);
if (!result)
return 0;
device_remove_file(&tz->device, &dev->weight_attr);
remove_trip_file:
device_remove_file(&tz->device, &dev->attr);
remove_symbol_link:
sysfs_remove_link(&tz->device.kobj, dev->name);
release_ida:
ida_simple_remove(&tz->ida, dev->id);
free_mem:
kfree(dev);
return result;
}
EXPORT_SYMBOL_GPL(thermal_zone_bind_cooling_device);
/**
* thermal_zone_unbind_cooling_device() - unbind a cooling device from a
* thermal zone.
* @tz: pointer to a struct thermal_zone_device.
* @trip: indicates which trip point the cooling devices is
* associated with in this thermal zone.
* @cdev: pointer to a struct thermal_cooling_device.
*
* This interface function unbind a thermal cooling device from the certain
* trip point of a thermal zone device.
* This function is usually called in the thermal zone device .unbind callback.
*
* Return: 0 on success, the proper error value otherwise.
*/
int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
int trip,
struct thermal_cooling_device *cdev)
{
struct thermal_instance *pos, *next;
mutex_lock(&tz->lock);
mutex_lock(&cdev->lock);
list_for_each_entry_safe(pos, next, &tz->thermal_instances, tz_node) {
if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
list_del(&pos->tz_node);
list_del(&pos->cdev_node);
mutex_unlock(&cdev->lock);
mutex_unlock(&tz->lock);
goto unbind;
}
}
mutex_unlock(&cdev->lock);
mutex_unlock(&tz->lock);
return -ENODEV;
unbind:
thermal: remove dangling 'weight_attr' device file This file isn't getting removed while we unbind a device from thermal zone. And this causes following messages when the device is registered again: WARNING: CPU: 0 PID: 2228 at /home/viresh/linux/fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x70() sysfs: cannot create duplicate filename '/devices/virtual/thermal/thermal_zone0/cdev0_weight' Modules linked in: cpufreq_dt(+) [last unloaded: cpufreq_dt] CPU: 0 PID: 2228 Comm: insmod Not tainted 4.2.0-rc3-00059-g44fffd9473eb #272 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c00153e8>] (unwind_backtrace) from [<c0012368>] (show_stack+0x10/0x14) [<c0012368>] (show_stack) from [<c053a684>] (dump_stack+0x84/0xc4) [<c053a684>] (dump_stack) from [<c002284c>] (warn_slowpath_common+0x80/0xb0) [<c002284c>] (warn_slowpath_common) from [<c00228ac>] (warn_slowpath_fmt+0x30/0x40) [<c00228ac>] (warn_slowpath_fmt) from [<c012d524>] (sysfs_warn_dup+0x60/0x70) [<c012d524>] (sysfs_warn_dup) from [<c012d244>] (sysfs_add_file_mode_ns+0x13c/0x190) [<c012d244>] (sysfs_add_file_mode_ns) from [<c012d2d4>] (sysfs_create_file_ns+0x3c/0x48) [<c012d2d4>] (sysfs_create_file_ns) from [<c03c04a8>] (thermal_zone_bind_cooling_device+0x260/0x358) [<c03c04a8>] (thermal_zone_bind_cooling_device) from [<c03c2e70>] (of_thermal_bind+0x88/0xb4) [<c03c2e70>] (of_thermal_bind) from [<c03c10d0>] (__thermal_cooling_device_register+0x17c/0x2e0) [<c03c10d0>] (__thermal_cooling_device_register) from [<c03c3f50>] (__cpufreq_cooling_register+0x3a0/0x51c) [<c03c3f50>] (__cpufreq_cooling_register) from [<bf00505c>] (cpufreq_ready+0x44/0x88 [cpufreq_dt]) [<bf00505c>] (cpufreq_ready [cpufreq_dt]) from [<c03d6c30>] (cpufreq_add_dev+0x4a0/0x7dc) [<c03d6c30>] (cpufreq_add_dev) from [<c02cd3ec>] (subsys_interface_register+0x94/0xd8) [<c02cd3ec>] (subsys_interface_register) from [<c03d785c>] (cpufreq_register_driver+0x10c/0x1f0) [<c03d785c>] (cpufreq_register_driver) from [<bf0057d4>] (dt_cpufreq_probe+0x60/0x8c [cpufreq_dt]) [<bf0057d4>] (dt_cpufreq_probe [cpufreq_dt]) from [<c02d03e4>] (platform_drv_probe+0x44/0xa4) [<c02d03e4>] (platform_drv_probe) from [<c02cead8>] (driver_probe_device+0x174/0x2b4) [<c02cead8>] (driver_probe_device) from [<c02ceca4>] (__driver_attach+0x8c/0x90) [<c02ceca4>] (__driver_attach) from [<c02cd078>] (bus_for_each_dev+0x68/0x9c) [<c02cd078>] (bus_for_each_dev) from [<c02ce2f0>] (bus_add_driver+0x19c/0x214) [<c02ce2f0>] (bus_add_driver) from [<c02cf490>] (driver_register+0x78/0xf8) [<c02cf490>] (driver_register) from [<c0009710>] (do_one_initcall+0x8c/0x1d4) [<c0009710>] (do_one_initcall) from [<c05396b0>] (do_init_module+0x5c/0x1b8) [<c05396b0>] (do_init_module) from [<c0086490>] (load_module+0xd34/0xed8) [<c0086490>] (load_module) from [<c0086704>] (SyS_init_module+0xd0/0x120) [<c0086704>] (SyS_init_module) from [<c000f480>] (ret_fast_syscall+0x0/0x3c) ---[ end trace 3be0e7b7dc6e3c4f ]--- Fixes: db91651311c8 ("thermal: export weight to sysfs") Acked-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-07-23 09:02:32 +00:00
device_remove_file(&tz->device, &pos->weight_attr);
device_remove_file(&tz->device, &pos->attr);
sysfs_remove_link(&tz->device.kobj, pos->name);
ida_simple_remove(&tz->ida, pos->id);
kfree(pos);
return 0;
}
EXPORT_SYMBOL_GPL(thermal_zone_unbind_cooling_device);
static void thermal_release(struct device *dev)
{
struct thermal_zone_device *tz;
struct thermal_cooling_device *cdev;
if (!strncmp(dev_name(dev), "thermal_zone",
sizeof("thermal_zone") - 1)) {
tz = to_thermal_zone(dev);
thermal_zone_destroy_device_groups(tz);
kfree(tz);
} else if (!strncmp(dev_name(dev), "cooling_device",
sizeof("cooling_device") - 1)) {
cdev = to_cooling_device(dev);
kfree(cdev);
}
}
static struct class thermal_class = {
.name = "thermal",
.dev_release = thermal_release,
};
static inline
void print_bind_err_msg(struct thermal_zone_device *tz,
struct thermal_cooling_device *cdev, int ret)
{
dev_err(&tz->device, "binding zone %s with cdev %s failed:%d\n",
tz->type, cdev->type, ret);
}
static void __bind(struct thermal_zone_device *tz, int mask,
struct thermal_cooling_device *cdev,
unsigned long *limits,
unsigned int weight)
{
int i, ret;
for (i = 0; i < tz->trips; i++) {
if (mask & (1 << i)) {
unsigned long upper, lower;
upper = THERMAL_NO_LIMIT;
lower = THERMAL_NO_LIMIT;
if (limits) {
lower = limits[i * 2];
upper = limits[i * 2 + 1];
}
ret = thermal_zone_bind_cooling_device(tz, i, cdev,
upper, lower,
weight);
if (ret)
print_bind_err_msg(tz, cdev, ret);
}
}
}
static void bind_cdev(struct thermal_cooling_device *cdev)
{
int i, ret;
const struct thermal_zone_params *tzp;
struct thermal_zone_device *pos = NULL;
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!pos->tzp && !pos->ops->bind)
continue;
if (pos->ops->bind) {
ret = pos->ops->bind(pos, cdev);
if (ret)
print_bind_err_msg(pos, cdev, ret);
continue;
}
tzp = pos->tzp;
if (!tzp || !tzp->tbp)
continue;
for (i = 0; i < tzp->num_tbps; i++) {
if (tzp->tbp[i].cdev || !tzp->tbp[i].match)
continue;
if (tzp->tbp[i].match(pos, cdev))
continue;
tzp->tbp[i].cdev = cdev;
__bind(pos, tzp->tbp[i].trip_mask, cdev,
tzp->tbp[i].binding_limits,
tzp->tbp[i].weight);
}
}
mutex_unlock(&thermal_list_lock);
}
/**
* __thermal_cooling_device_register() - register a new thermal cooling device
* @np: a pointer to a device tree node.
* @type: the thermal cooling device type.
* @devdata: device private data.
* @ops: standard thermal cooling devices callbacks.
*
* This interface function adds a new thermal cooling device (fan/processor/...)
* to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
* to all the thermal zone devices registered at the same time.
* It also gives the opportunity to link the cooling device to a device tree
* node, so that it can be bound to a thermal zone created out of device tree.
*
* Return: a pointer to the created struct thermal_cooling_device or an
* ERR_PTR. Caller must check return value with IS_ERR*() helpers.
*/
static struct thermal_cooling_device *
__thermal_cooling_device_register(struct device_node *np,
const char *type, void *devdata,
const struct thermal_cooling_device_ops *ops)
{
struct thermal_cooling_device *cdev;
struct thermal_zone_device *pos = NULL;
thermal/core: fix a UAF bug in __thermal_cooling_device_register() When device_register() return failed, program will goto out_kfree_type to release 'cdev->device' by put_device(). That will call thermal_release() to free 'cdev'. But the follow-up processes access 'cdev' continually. That trggers the UAF bug. ==================================================================== BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? __thermal_cooling_device_register+0x75b/0xa90 kasan_report.cold+0x7f/0x11b ? __thermal_cooling_device_register+0x75b/0xa90 __thermal_cooling_device_register+0x75b/0xa90 ? memset+0x20/0x40 ? __sanitizer_cov_trace_pc+0x1d/0x50 ? __devres_alloc_node+0x130/0x180 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa ...... Freed by task 258: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 thermal_release+0xa0/0x110 device_release+0xa7/0x240 kobject_put+0x1ce/0x540 put_device+0x20/0x30 __thermal_cooling_device_register+0x731/0xa90 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa [max6650] Do not use 'cdev' again after put_device() to fix the problem like doing in thermal_zone_device_register(). [dlezcano]: as requested by Rafael, change the affectation into two statements. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 02:45:04 +00:00
int id, ret;
if (!ops || !ops->get_max_state || !ops->get_cur_state ||
!ops->set_cur_state)
return ERR_PTR(-EINVAL);
cdev = kzalloc(sizeof(*cdev), GFP_KERNEL);
if (!cdev)
return ERR_PTR(-ENOMEM);
ret = ida_simple_get(&thermal_cdev_ida, 0, 0, GFP_KERNEL);
if (ret < 0)
goto out_kfree_cdev;
cdev->id = ret;
thermal/core: fix a UAF bug in __thermal_cooling_device_register() When device_register() return failed, program will goto out_kfree_type to release 'cdev->device' by put_device(). That will call thermal_release() to free 'cdev'. But the follow-up processes access 'cdev' continually. That trggers the UAF bug. ==================================================================== BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? __thermal_cooling_device_register+0x75b/0xa90 kasan_report.cold+0x7f/0x11b ? __thermal_cooling_device_register+0x75b/0xa90 __thermal_cooling_device_register+0x75b/0xa90 ? memset+0x20/0x40 ? __sanitizer_cov_trace_pc+0x1d/0x50 ? __devres_alloc_node+0x130/0x180 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa ...... Freed by task 258: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 thermal_release+0xa0/0x110 device_release+0xa7/0x240 kobject_put+0x1ce/0x540 put_device+0x20/0x30 __thermal_cooling_device_register+0x731/0xa90 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa [max6650] Do not use 'cdev' again after put_device() to fix the problem like doing in thermal_zone_device_register(). [dlezcano]: as requested by Rafael, change the affectation into two statements. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 02:45:04 +00:00
id = ret;
ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
if (ret)
goto out_ida_remove;
cdev->type = kstrdup(type ? type : "", GFP_KERNEL);
if (!cdev->type) {
ret = -ENOMEM;
goto out_ida_remove;
}
mutex_init(&cdev->lock);
INIT_LIST_HEAD(&cdev->thermal_instances);
cdev->np = np;
cdev->ops = ops;
cdev->updated = false;
cdev->device.class = &thermal_class;
cdev->devdata = devdata;
thermal: Add cooling device's statistics in sysfs This extends the sysfs interface for thermal cooling devices and exposes some pretty useful statistics. These statistics have proven to be quite useful specially while doing benchmarks related to the task scheduler, where we want to make sure that nothing has disrupted the test, specially the cooling device which may have put constraints on the CPUs. The information exposed here tells us to what extent the CPUs were constrained by the thermal framework. The write-only "reset" file is used to reset the statistics. The read-only "time_in_state_ms" file shows the time (in msec) spent by the device in the respective cooling states, and it prints one line per cooling state. The read-only "total_trans" file shows single positive integer value showing the total number of cooling state transitions the device has gone through since the time the cooling device is registered or the time when statistics were reset last. The read-only "trans_table" file shows a two dimensional matrix, where an entry <i,j> (row i, column j) represents the number of transitions from State_i to State_j. This is how the directory structure looks like for a single cooling device: $ ls -R /sys/class/thermal/cooling_device0/ /sys/class/thermal/cooling_device0/: cur_state max_state power stats subsystem type uevent /sys/class/thermal/cooling_device0/power: autosuspend_delay_ms runtime_active_time runtime_suspended_time control runtime_status /sys/class/thermal/cooling_device0/stats: reset time_in_state_ms total_trans trans_table This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and ARM 64-bit Hisilicon hikey960 board running Android. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-04-02 10:56:25 +00:00
thermal_cooling_device_setup_sysfs(cdev);
ret = device_register(&cdev->device);
if (ret)
goto out_kfree_type;
/* Add 'this' new cdev to the global cdev list */
mutex_lock(&thermal_list_lock);
list_add(&cdev->node, &thermal_cdev_list);
mutex_unlock(&thermal_list_lock);
/* Update binding information for 'this' new cdev */
bind_cdev(cdev);
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node)
if (atomic_cmpxchg(&pos->need_update, 1, 0))
thermal_zone_device_update(pos,
THERMAL_EVENT_UNSPECIFIED);
mutex_unlock(&thermal_list_lock);
return cdev;
out_kfree_type:
kfree(cdev->type);
put_device(&cdev->device);
thermal/core: fix a UAF bug in __thermal_cooling_device_register() When device_register() return failed, program will goto out_kfree_type to release 'cdev->device' by put_device(). That will call thermal_release() to free 'cdev'. But the follow-up processes access 'cdev' continually. That trggers the UAF bug. ==================================================================== BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? __thermal_cooling_device_register+0x75b/0xa90 kasan_report.cold+0x7f/0x11b ? __thermal_cooling_device_register+0x75b/0xa90 __thermal_cooling_device_register+0x75b/0xa90 ? memset+0x20/0x40 ? __sanitizer_cov_trace_pc+0x1d/0x50 ? __devres_alloc_node+0x130/0x180 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa ...... Freed by task 258: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 thermal_release+0xa0/0x110 device_release+0xa7/0x240 kobject_put+0x1ce/0x540 put_device+0x20/0x30 __thermal_cooling_device_register+0x731/0xa90 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa [max6650] Do not use 'cdev' again after put_device() to fix the problem like doing in thermal_zone_device_register(). [dlezcano]: as requested by Rafael, change the affectation into two statements. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 02:45:04 +00:00
cdev = NULL;
out_ida_remove:
thermal/core: fix a UAF bug in __thermal_cooling_device_register() When device_register() return failed, program will goto out_kfree_type to release 'cdev->device' by put_device(). That will call thermal_release() to free 'cdev'. But the follow-up processes access 'cdev' continually. That trggers the UAF bug. ==================================================================== BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? __thermal_cooling_device_register+0x75b/0xa90 kasan_report.cold+0x7f/0x11b ? __thermal_cooling_device_register+0x75b/0xa90 __thermal_cooling_device_register+0x75b/0xa90 ? memset+0x20/0x40 ? __sanitizer_cov_trace_pc+0x1d/0x50 ? __devres_alloc_node+0x130/0x180 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa ...... Freed by task 258: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 thermal_release+0xa0/0x110 device_release+0xa7/0x240 kobject_put+0x1ce/0x540 put_device+0x20/0x30 __thermal_cooling_device_register+0x731/0xa90 devm_thermal_of_cooling_device_register+0x67/0xf0 max6650_probe.cold+0x557/0x6aa [max6650] Do not use 'cdev' again after put_device() to fix the problem like doing in thermal_zone_device_register(). [dlezcano]: as requested by Rafael, change the affectation into two statements. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-15 02:45:04 +00:00
ida_simple_remove(&thermal_cdev_ida, id);
out_kfree_cdev:
kfree(cdev);
return ERR_PTR(ret);
}
/**
* thermal_cooling_device_register() - register a new thermal cooling device
* @type: the thermal cooling device type.
* @devdata: device private data.
* @ops: standard thermal cooling devices callbacks.
*
* This interface function adds a new thermal cooling device (fan/processor/...)
* to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
* to all the thermal zone devices registered at the same time.
*
* Return: a pointer to the created struct thermal_cooling_device or an
* ERR_PTR. Caller must check return value with IS_ERR*() helpers.
*/
struct thermal_cooling_device *
thermal_cooling_device_register(const char *type, void *devdata,
const struct thermal_cooling_device_ops *ops)
{
return __thermal_cooling_device_register(NULL, type, devdata, ops);
}
EXPORT_SYMBOL_GPL(thermal_cooling_device_register);
/**
* thermal_of_cooling_device_register() - register an OF thermal cooling device
* @np: a pointer to a device tree node.
* @type: the thermal cooling device type.
* @devdata: device private data.
* @ops: standard thermal cooling devices callbacks.
*
* This function will register a cooling device with device tree node reference.
* This interface function adds a new thermal cooling device (fan/processor/...)
* to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
* to all the thermal zone devices registered at the same time.
*
* Return: a pointer to the created struct thermal_cooling_device or an
* ERR_PTR. Caller must check return value with IS_ERR*() helpers.
*/
struct thermal_cooling_device *
thermal_of_cooling_device_register(struct device_node *np,
const char *type, void *devdata,
const struct thermal_cooling_device_ops *ops)
{
return __thermal_cooling_device_register(np, type, devdata, ops);
}
EXPORT_SYMBOL_GPL(thermal_of_cooling_device_register);
static void thermal_cooling_device_release(struct device *dev, void *res)
{
thermal_cooling_device_unregister(
*(struct thermal_cooling_device **)res);
}
/**
* devm_thermal_of_cooling_device_register() - register an OF thermal cooling
* device
* @dev: a valid struct device pointer of a sensor device.
* @np: a pointer to a device tree node.
* @type: the thermal cooling device type.
* @devdata: device private data.
* @ops: standard thermal cooling devices callbacks.
*
* This function will register a cooling device with device tree node reference.
* This interface function adds a new thermal cooling device (fan/processor/...)
* to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
* to all the thermal zone devices registered at the same time.
*
* Return: a pointer to the created struct thermal_cooling_device or an
* ERR_PTR. Caller must check return value with IS_ERR*() helpers.
*/
struct thermal_cooling_device *
devm_thermal_of_cooling_device_register(struct device *dev,
struct device_node *np,
char *type, void *devdata,
const struct thermal_cooling_device_ops *ops)
{
struct thermal_cooling_device **ptr, *tcd;
ptr = devres_alloc(thermal_cooling_device_release, sizeof(*ptr),
GFP_KERNEL);
if (!ptr)
return ERR_PTR(-ENOMEM);
tcd = __thermal_cooling_device_register(np, type, devdata, ops);
if (IS_ERR(tcd)) {
devres_free(ptr);
return tcd;
}
*ptr = tcd;
devres_add(dev, ptr);
return tcd;
}
EXPORT_SYMBOL_GPL(devm_thermal_of_cooling_device_register);
static void __unbind(struct thermal_zone_device *tz, int mask,
struct thermal_cooling_device *cdev)
{
int i;
for (i = 0; i < tz->trips; i++)
if (mask & (1 << i))
thermal_zone_unbind_cooling_device(tz, i, cdev);
}
/**
* thermal_cooling_device_unregister - removes a thermal cooling device
* @cdev: the thermal cooling device to remove.
*
* thermal_cooling_device_unregister() must be called when a registered
* thermal cooling device is no longer needed.
*/
void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
{
int i;
const struct thermal_zone_params *tzp;
struct thermal_zone_device *tz;
struct thermal_cooling_device *pos = NULL;
if (!cdev)
return;
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_cdev_list, node)
if (pos == cdev)
break;
if (pos != cdev) {
/* thermal cooling device not found */
mutex_unlock(&thermal_list_lock);
return;
}
list_del(&cdev->node);
/* Unbind all thermal zones associated with 'this' cdev */
list_for_each_entry(tz, &thermal_tz_list, node) {
if (tz->ops->unbind) {
tz->ops->unbind(tz, cdev);
continue;
}
if (!tz->tzp || !tz->tzp->tbp)
continue;
tzp = tz->tzp;
for (i = 0; i < tzp->num_tbps; i++) {
if (tzp->tbp[i].cdev == cdev) {
__unbind(tz, tzp->tbp[i].trip_mask, cdev);
tzp->tbp[i].cdev = NULL;
}
}
}
mutex_unlock(&thermal_list_lock);
ida_simple_remove(&thermal_cdev_ida, cdev->id);
device_del(&cdev->device);
thermal: Add cooling device's statistics in sysfs This extends the sysfs interface for thermal cooling devices and exposes some pretty useful statistics. These statistics have proven to be quite useful specially while doing benchmarks related to the task scheduler, where we want to make sure that nothing has disrupted the test, specially the cooling device which may have put constraints on the CPUs. The information exposed here tells us to what extent the CPUs were constrained by the thermal framework. The write-only "reset" file is used to reset the statistics. The read-only "time_in_state_ms" file shows the time (in msec) spent by the device in the respective cooling states, and it prints one line per cooling state. The read-only "total_trans" file shows single positive integer value showing the total number of cooling state transitions the device has gone through since the time the cooling device is registered or the time when statistics were reset last. The read-only "trans_table" file shows a two dimensional matrix, where an entry <i,j> (row i, column j) represents the number of transitions from State_i to State_j. This is how the directory structure looks like for a single cooling device: $ ls -R /sys/class/thermal/cooling_device0/ /sys/class/thermal/cooling_device0/: cur_state max_state power stats subsystem type uevent /sys/class/thermal/cooling_device0/power: autosuspend_delay_ms runtime_active_time runtime_suspended_time control runtime_status /sys/class/thermal/cooling_device0/stats: reset time_in_state_ms total_trans trans_table This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and ARM 64-bit Hisilicon hikey960 board running Android. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-04-02 10:56:25 +00:00
thermal_cooling_device_destroy_sysfs(cdev);
kfree(cdev->type);
put_device(&cdev->device);
}
EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
static void bind_tz(struct thermal_zone_device *tz)
{
int i, ret;
struct thermal_cooling_device *pos = NULL;
const struct thermal_zone_params *tzp = tz->tzp;
if (!tzp && !tz->ops->bind)
return;
mutex_lock(&thermal_list_lock);
/* If there is ops->bind, try to use ops->bind */
if (tz->ops->bind) {
list_for_each_entry(pos, &thermal_cdev_list, node) {
ret = tz->ops->bind(tz, pos);
if (ret)
print_bind_err_msg(tz, pos, ret);
}
goto exit;
}
if (!tzp || !tzp->tbp)
goto exit;
list_for_each_entry(pos, &thermal_cdev_list, node) {
for (i = 0; i < tzp->num_tbps; i++) {
if (tzp->tbp[i].cdev || !tzp->tbp[i].match)
continue;
if (tzp->tbp[i].match(tz, pos))
continue;
tzp->tbp[i].cdev = pos;
__bind(tz, tzp->tbp[i].trip_mask, pos,
tzp->tbp[i].binding_limits,
tzp->tbp[i].weight);
}
}
exit:
mutex_unlock(&thermal_list_lock);
}
/**
* thermal_zone_device_register() - register a new thermal zone device
* @type: the thermal zone device type
* @trips: the number of trip points the thermal zone support
* @mask: a bit string indicating the writeablility of trip points
* @devdata: private device data
* @ops: standard thermal zone device callbacks
* @tzp: thermal zone platform parameters
* @passive_delay: number of milliseconds to wait between polls when
* performing passive cooling
* @polling_delay: number of milliseconds to wait between polls when checking
* whether trip points have been crossed (0 for interrupt
* driven systems)
*
* This interface function adds a new thermal zone device (sensor) to
* /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the
* thermal cooling devices registered at the same time.
* thermal_zone_device_unregister() must be called when the device is no
* longer needed. The passive cooling depends on the .get_trend() return value.
*
* Return: a pointer to the created struct thermal_zone_device or an
* in case of error, an ERR_PTR. Caller must check return value with
* IS_ERR*() helpers.
*/
struct thermal_zone_device *
thermal_zone_device_register(const char *type, int trips, int mask,
void *devdata, struct thermal_zone_device_ops *ops,
struct thermal_zone_params *tzp, int passive_delay,
int polling_delay)
{
struct thermal_zone_device *tz;
enum thermal_trip_type trip_type;
2016-03-18 02:03:24 +00:00
int trip_temp;
int id;
int result;
int count;
struct thermal_governor *governor;
if (!type || strlen(type) == 0) {
pr_err("Error: No thermal zone type defined\n");
return ERR_PTR(-EINVAL);
}
if (type && strlen(type) >= THERMAL_NAME_LENGTH) {
pr_err("Error: Thermal zone name (%s) too long, should be under %d chars\n",
type, THERMAL_NAME_LENGTH);
return ERR_PTR(-EINVAL);
}
if (trips > THERMAL_MAX_TRIPS || trips < 0 || mask >> trips) {
pr_err("Error: Incorrect number of thermal trips\n");
return ERR_PTR(-EINVAL);
}
if (!ops) {
pr_err("Error: Thermal zone device ops not defined\n");
return ERR_PTR(-EINVAL);
}
if (trips > 0 && (!ops->get_trip_type || !ops->get_trip_temp))
return ERR_PTR(-EINVAL);
tz = kzalloc(sizeof(*tz), GFP_KERNEL);
if (!tz)
return ERR_PTR(-ENOMEM);
INIT_LIST_HEAD(&tz->thermal_instances);
ida_init(&tz->ida);
mutex_init(&tz->lock);
id = ida_simple_get(&thermal_tz_ida, 0, 0, GFP_KERNEL);
if (id < 0) {
result = id;
goto free_tz;
}
tz->id = id;
strlcpy(tz->type, type, sizeof(tz->type));
result = dev_set_name(&tz->device, "thermal_zone%d", tz->id);
if (result)
goto remove_id;
if (!ops->critical)
ops->critical = thermal_zone_device_critical;
tz->ops = ops;
tz->tzp = tzp;
tz->device.class = &thermal_class;
tz->devdata = devdata;
tz->trips = trips;
thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
/* sys I/F */
/* Add nodes that are always present via .groups */
result = thermal_zone_create_device_groups(tz, mask);
if (result)
goto remove_id;
/* A new thermal zone needs to be updated anyway. */
atomic_set(&tz->need_update, 1);
result = device_register(&tz->device);
if (result)
goto release_device;
for (count = 0; count < trips; count++) {
if (tz->ops->get_trip_type(tz, count, &trip_type) ||
tz->ops->get_trip_temp(tz, count, &trip_temp) ||
!trip_temp)
2016-03-18 02:03:24 +00:00
set_bit(count, &tz->trips_disabled);
}
/* Update 'this' zone's governor information */
mutex_lock(&thermal_governor_lock);
if (tz->tzp)
governor = __find_governor(tz->tzp->governor_name);
else
governor = def_governor;
result = thermal_set_governor(tz, governor);
if (result) {
mutex_unlock(&thermal_governor_lock);
goto unregister;
}
mutex_unlock(&thermal_governor_lock);
if (!tz->tzp || !tz->tzp->no_hwmon) {
result = thermal_add_hwmon_sysfs(tz);
if (result)
goto unregister;
}
mutex_lock(&thermal_list_lock);
list_add_tail(&tz->node, &thermal_tz_list);
mutex_unlock(&thermal_list_lock);
/* Bind cooling devices for this zone */
bind_tz(tz);
INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_check);
thermal_zone_device_init(tz);
/* Update the new thermal zone and mark it as already updated. */
if (atomic_cmpxchg(&tz->need_update, 1, 0))
thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
thermal_notify_tz_create(tz->id, tz->type);
return tz;
unregister:
device_del(&tz->device);
release_device:
put_device(&tz->device);
tz = NULL;
remove_id:
ida_simple_remove(&thermal_tz_ida, id);
free_tz:
kfree(tz);
return ERR_PTR(result);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_register);
/**
* thermal_zone_device_unregister - removes the registered thermal zone device
* @tz: the thermal zone device to remove
*/
void thermal_zone_device_unregister(struct thermal_zone_device *tz)
{
int i, tz_id;
const struct thermal_zone_params *tzp;
struct thermal_cooling_device *cdev;
struct thermal_zone_device *pos = NULL;
if (!tz)
return;
tzp = tz->tzp;
tz_id = tz->id;
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node)
if (pos == tz)
break;
if (pos != tz) {
/* thermal zone device not found */
mutex_unlock(&thermal_list_lock);
return;
}
list_del(&tz->node);
/* Unbind all cdevs associated with 'this' thermal zone */
list_for_each_entry(cdev, &thermal_cdev_list, node) {
if (tz->ops->unbind) {
tz->ops->unbind(tz, cdev);
continue;
}
if (!tzp || !tzp->tbp)
break;
for (i = 0; i < tzp->num_tbps; i++) {
if (tzp->tbp[i].cdev == cdev) {
__unbind(tz, tzp->tbp[i].trip_mask, cdev);
tzp->tbp[i].cdev = NULL;
}
}
}
mutex_unlock(&thermal_list_lock);
thermal: Fix deadlock in thermal thermal_zone_device_check 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid a use-after-free issue. However, cancel_delayed_work_sync could be called insides the WQ causing deadlock. [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000 [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.642447] c0 1162 Call trace: [54109.642456] c0 1162 __switch_to+0x138/0x158 [54109.642467] c0 1162 __schedule+0xba4/0x1434 [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28 [54109.642492] c0 1162 wait_for_common+0x138/0x2e8 [54109.642511] c0 1162 flush_work+0x348/0x40c [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218 [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.642574] c0 1162 process_one_work+0x3cc/0x69c [54109.642583] c0 1162 worker_thread+0x49c/0x7c0 [54109.642593] c0 1162 kthread+0x17c/0x1b0 [54109.642602] c0 1162 ret_from_fork+0x10/0x18 [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000 [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.643077] c0 1162 Call trace: [54109.643085] c0 1162 __switch_to+0x138/0x158 [54109.643095] c0 1162 __schedule+0xba4/0x1434 [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28 [54109.643114] c0 1162 wait_for_common+0x138/0x2e8 [54109.643122] c0 1162 flush_work+0x348/0x40c [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218 [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.643167] c0 1162 process_one_work+0x3cc/0x69c [54109.643177] c0 1162 worker_thread+0x49c/0x7c0 [54109.643186] c0 1162 kthread+0x17c/0x1b0 [54109.643195] c0 1162 ret_from_fork+0x10/0x18 [54109.644500] c0 1162 cat D 0 7766 1 0x00000001 [54109.644515] c0 1162 Call trace: [54109.644524] c0 1162 __switch_to+0x138/0x158 [54109.644536] c0 1162 __schedule+0xba4/0x1434 [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0 [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0 [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20 [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360 [54109.644586] c0 1162 temp_show+0x30/0x78 [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0 [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4 [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88 [54109.644656] c0 1162 seq_read+0x1f4/0x73c [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318 [54109.644683] c0 1162 __vfs_read+0x50/0x1bc [54109.644692] c0 1162 vfs_read+0xa4/0x140 [54109.644701] c0 1162 SyS_read+0xbc/0x144 [54109.644708] c0 1162 el0_svc_naked+0x34/0x38 [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic] Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") Cc: stable@vger.kernel.org Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-11-12 20:42:23 +00:00
cancel_delayed_work_sync(&tz->poll_queue);
thermal_set_governor(tz, NULL);
thermal_remove_hwmon_sysfs(tz);
ida_simple_remove(&thermal_tz_ida, tz->id);
ida_destroy(&tz->ida);
mutex_destroy(&tz->lock);
device_unregister(&tz->device);
thermal_notify_tz_delete(tz_id);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_unregister);
/**
* thermal_zone_get_zone_by_name() - search for a zone and returns its ref
* @name: thermal zone name to fetch the temperature
*
* When only one zone is found with the passed name, returns a reference to it.
*
* Return: On success returns a reference to an unique thermal zone with
* matching name equals to @name, an ERR_PTR otherwise (-EINVAL for invalid
* paramenters, -ENODEV for not found and -EEXIST for multiple matches).
*/
struct thermal_zone_device *thermal_zone_get_zone_by_name(const char *name)
{
struct thermal_zone_device *pos = NULL, *ref = ERR_PTR(-EINVAL);
unsigned int found = 0;
if (!name)
goto exit;
mutex_lock(&thermal_list_lock);
list_for_each_entry(pos, &thermal_tz_list, node)
if (!strncasecmp(name, pos->type, THERMAL_NAME_LENGTH)) {
found++;
ref = pos;
}
mutex_unlock(&thermal_list_lock);
/* nothing has been found, thus an error code for it */
if (found == 0)
ref = ERR_PTR(-ENODEV);
else if (found > 1)
/* Success only when an unique zone is found */
ref = ERR_PTR(-EEXIST);
exit:
return ref;
}
EXPORT_SYMBOL_GPL(thermal_zone_get_zone_by_name);
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
static int thermal_pm_notify(struct notifier_block *nb,
unsigned long mode, void *_unused)
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
{
struct thermal_zone_device *tz;
switch (mode) {
case PM_HIBERNATION_PREPARE:
case PM_RESTORE_PREPARE:
case PM_SUSPEND_PREPARE:
atomic_set(&in_suspend, 1);
break;
case PM_POST_HIBERNATION:
case PM_POST_RESTORE:
case PM_POST_SUSPEND:
atomic_set(&in_suspend, 0);
list_for_each_entry(tz, &thermal_tz_list, node) {
thermal: Use mode helpers in drivers Use thermal_zone_device_{en|dis}able() and thermal_zone_device_is_enabled(). Consequently, all set_mode() implementations in drivers: - can stop modifying tzd's "mode" member, - shall stop taking tzd's lock, as it is taken in the helpers - shall stop calling thermal_zone_device_update() as it is called in the helpers - can assume they are called when the mode truly changes, so checks to verify that can be dropped Not providing set_mode() by a driver no longer prevents the core from being able to set tzd's mode, so the relevant check in mode_store() is removed. Other comments: - acpi/thermal.c: tz->thermal_zone->mode will be updated only after we return from set_mode(), so use function parameter in thermal_set_mode() instead, no need to call acpi_thermal_check() in set_mode() - thermal/imx_thermal.c: regmap writes and mode assignment are done in thermal_zone_device_{en|dis}able() and set_mode() callback - thermal/intel/intel_quark_dts_thermal.c: soc_dts_{en|dis}able() are a part of set_mode() callback, so they don't need to modify tzd->mode, and don't need to fall back to the opposite mode if unsuccessful, as the return value will be propagated to thermal_zone_device_{en|dis}able() and ultimately tzd's member will not be changed in thermal_zone_device_set_mode(). - thermal/of-thermal.c: no need to set zone->mode to DISABLED in of_parse_thermal_zones() as a tzd is kzalloc'ed so mode is DISABLED anyway Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> [for acerhdf] Acked-by: Peter Kaestle <peter@piie.net> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-8-andrzej.p@collabora.com
2020-06-29 12:29:21 +00:00
if (!thermal_zone_device_is_enabled(tz))
continue;
thermal_zone_device_init(tz);
thermal_zone_device_update(tz,
THERMAL_EVENT_UNSPECIFIED);
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
}
break;
default:
break;
}
return 0;
}
static struct notifier_block thermal_pm_nb = {
.notifier_call = thermal_pm_notify,
};
static int __init thermal_init(void)
{
int result;
result = thermal_netlink_init();
if (result)
goto error;
result = thermal_register_governors();
if (result)
goto error;
result = class_register(&thermal_class);
if (result)
goto unregister_governors;
result = of_parse_thermal_zones();
if (result)
goto unregister_class;
Thermal: handle thermal zone device properly during system sleep Current thermal code does not handle system sleep well because 1. the cooling device cooling state may be changed during suspend 2. the previous temperature reading becomes invalid after resumed because it is got before system sleep 3. updating thermal zone device during suspending/resuming is wrong because some devices may have already been suspended or may have not been resumed. Thus, the proper way to do this is to cancel all thermal zone device update requirements during suspend/resume, and after all the devices have been resumed, reset and update every registered thermal zone devices. This also fixes a regression introduced by: Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver") Because, with above commit applied, all the fan devices are attached to the acpi_general_pm_domain, and they are turned on by the pm_domain automatically after resume, without the awareness of thermal core. CC: <stable@vger.kernel.org> #3.18+ Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411 Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-10-30 08:31:58 +00:00
result = register_pm_notifier(&thermal_pm_nb);
if (result)
pr_warn("Thermal: Can not register suspend notifier, return %d\n",
result);
return 0;
unregister_class:
class_unregister(&thermal_class);
unregister_governors:
thermal_unregister_governors();
error:
ida_destroy(&thermal_tz_ida);
ida_destroy(&thermal_cdev_ida);
mutex_destroy(&thermal_list_lock);
mutex_destroy(&thermal_governor_lock);
return result;
}
postcore_initcall(thermal_init);