Thermal control fixes for 6.10-rc4

  - Prevent the thermal core from failing the registration of a cooling
    device if its .get_cur_state() reports an incorrect state to start
    with, which may happen for fans handled through firmware-supplied AML
    in ACPI tables (Rafael Wysocki).
 
  - Make the ACPI thermal zone driver initialize all trip points with
    temperature of 0 centigrade and below as invalid because such trip
    point temperatures do not make sense on systems with ACPI thermal
    control and they cause performance regressions by triggering
    permanent thermal mitigation (Rafael Wysocki).
 
  - Restore passive polling management in the Step-Wise thermal governor
    that uses it to ensure that all cooling devices used for thermal
    mitigation will go back to their initial states eventually (Rafael
    Wysocki).

Merge tag 'thermal-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These fix three issues introduced recently, two related to defects in
  ACPI tables supplied by the platform firmware and one caused by a
  thermal core change that went too far:

   - Prevent the thermal core from failing the registration of a cooling
     device if its .get_cur_state() reports an incorrect state to start
     with, which may happen for fans handled through firmware-supplied
     AML in ACPI tables (Rafael Wysocki)

   - Make the ACPI thermal zone driver initialize all trip points with
     temperature of 0 centigrade and below as invalid because such trip
     point temperatures do not make sense on systems with ACPI thermal
     control and they cause performance regressions by triggering
     permanent thermal mitigation (Rafael Wysocki)

   - Restore passive polling management in the Step-Wise thermal
     governor that uses it to ensure that all cooling devices used for
     thermal mitigation will go back to their initial states eventually
     (Rafael Wysocki)"

* tag 'thermal-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: gov_step_wise: Restore passive polling management
  thermal: ACPI: Invalidate trip points with temperature of 0 or below
  thermal: core: Do not fail cdev registration because of invalid initial state
Linus Torvalds 2024-06-14 09:28:56 -07:00
commit cee84c0b00
3 changed files with 35 additions and 3 deletions

@@ -168,11 +168,17 @@ static int acpi_thermal_get_polling_frequency(struct acpi_thermal *tz)
 
 static int acpi_thermal_temp(struct acpi_thermal *tz, int temp_deci_k)
 {
+	int temp;
+
 	if (temp_deci_k == THERMAL_TEMP_INVALID)
 		return THERMAL_TEMP_INVALID;
 
-	return deci_kelvin_to_millicelsius_with_offset(temp_deci_k,
+	temp = deci_kelvin_to_millicelsius_with_offset(temp_deci_k,
 						       tz->kelvin_offset);
+	if (temp <= 0)
+		return THERMAL_TEMP_INVALID;
+
+	return temp;
 }
 
 static bool acpi_thermal_trip_valid(struct acpi_thermal_trip *acpi_trip)
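
The effect of the acpi_thermal_temp() change can be tried outside the kernel with a minimal user-space sketch. Everything named here is an illustration rather than the driver's code: TEMP_INVALID stands in for THERMAL_TEMP_INVALID (its value is an assumption), and deci_k_to_milli_c() is a hypothetical replacement for deci_kelvin_to_millicelsius_with_offset() that assumes the offset is given in millikelvin.

```c
/* Stand-in for the kernel's THERMAL_TEMP_INVALID sentinel (assumed value). */
#define TEMP_INVALID (-274000)

/*
 * Hypothetical helper: convert tenths of a kelvin to millidegrees Celsius
 * by subtracting an offset expressed in millikelvin (an assumption).
 */
static int deci_k_to_milli_c(int temp_deci_k, int offset_milli_k)
{
	return temp_deci_k * 100 - offset_milli_k;
}

/*
 * Mirrors the patched logic: an already-invalid input stays invalid, and
 * any converted temperature at or below 0 C is reported as invalid too.
 */
static int trip_temp(int temp_deci_k, int offset_milli_k)
{
	int temp;

	if (temp_deci_k == TEMP_INVALID)
		return TEMP_INVALID;

	temp = deci_k_to_milli_c(temp_deci_k, offset_milli_k);
	if (temp <= 0)
		return TEMP_INVALID;

	return temp;
}
```

With an offset of 273200, a firmware value of 3732 (373.2 K) yields 100000 millidegrees Celsius, while 2732 (exactly 0 C) and 0 (a typical uninitialized table entry) are both turned into TEMP_INVALID, so such trip points can no longer trigger mitigation.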

@@ -93,6 +93,23 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz,
 		if (instance->initialized && old_target == instance->target)
 			continue;
 
+		if (trip->type == THERMAL_TRIP_PASSIVE) {
+			/*
+			 * If the target state for this thermal instance
+			 * changes from THERMAL_NO_TARGET to something else,
+			 * ensure that the zone temperature will be updated
+			 * (assuming enabled passive cooling) until it becomes
+			 * THERMAL_NO_TARGET again, or the cooling device may
+			 * not be reset to its initial state.
+			 */
+			if (old_target == THERMAL_NO_TARGET &&
+			    instance->target != THERMAL_NO_TARGET)
+				tz->passive++;
+			else if (old_target != THERMAL_NO_TARGET &&
+				 instance->target == THERMAL_NO_TARGET)
+				tz->passive--;
+		}
+
 		instance->initialized = true;
 
 		mutex_lock(&instance->cdev->lock);
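
The counting rule restored in the Step-Wise governor can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the governor itself: struct zone, account_passive() and NO_TARGET are hypothetical stand-ins for the thermal zone, the inline code in thermal_zone_trip_update() and THERMAL_NO_TARGET, respectively.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for THERMAL_NO_TARGET (assumed to be an all-ones value). */
#define NO_TARGET ULONG_MAX

/* Hypothetical, stripped-down thermal zone holding only the counter. */
struct zone {
	int passive;	/* number of instances currently being mitigated */
};

/*
 * Count an instance while it has a real target state, so that passive
 * polling keeps running until every cooling device is back at NO_TARGET.
 */
static void account_passive(struct zone *tz, bool passive_trip,
			    unsigned long old_target,
			    unsigned long new_target)
{
	if (!passive_trip)
		return;

	if (old_target == NO_TARGET && new_target != NO_TARGET)
		tz->passive++;
	else if (old_target != NO_TARGET && new_target == NO_TARGET)
		tz->passive--;
}
```

Only the two edges matter: moving between two real states (say, from 1 to 2) leaves the counter alone, so each instance contributes at most one to tz->passive, and the counter returns to zero once all instances have been released.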

@@ -999,9 +999,17 @@ __thermal_cooling_device_register(struct device_node *np,
 	if (ret)
 		goto out_cdev_type;
 
+	/*
+	 * The cooling device's current state is only needed for debug
+	 * initialization below, so a failure to get it does not cause
+	 * the entire cooling device initialization to fail. However,
+	 * the debug will not work for the device if its initial state
+	 * cannot be determined and drivers are responsible for ensuring
+	 * that this will not happen.
+	 */
 	ret = cdev->ops->get_cur_state(cdev, &current_state);
 	if (ret)
-		goto out_cdev_type;
+		current_state = ULONG_MAX;
 
 	thermal_cooling_device_setup_sysfs(cdev);
 
@@ -1016,7 +1024,8 @@ __thermal_cooling_device_register(struct device_node *np,
 		return ERR_PTR(ret);
 	}
 
-	thermal_debug_cdev_add(cdev, current_state);
+	if (current_state <= cdev->max_state)
+		thermal_debug_cdev_add(cdev, current_state);
 
 	/* Add 'this' new cdev to the global cdev list */
 	mutex_lock(&thermal_list_lock);
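
The registration change amounts to treating an unreadable or out-of-range initial state as "unknown" instead of as an error. A minimal sketch of that pattern follows, assuming a hypothetical struct cdev with a get_cur_state callback and a debug_enabled flag that stands in for the call to thermal_debug_cdev_add(); the three callbacks are made-up examples of driver behavior.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, reduced cooling device: only what the sketch needs. */
struct cdev {
	unsigned long max_state;
	int (*get_cur_state)(struct cdev *cdev, unsigned long *state);
	bool debug_enabled;	/* stands in for thermal_debug_cdev_add() */
};

/* Example callbacks: a sane one, a failing one, and one reporting junk. */
static int ok_state(struct cdev *cdev, unsigned long *state)
{
	(void)cdev;
	*state = 2;
	return 0;
}

static int failing_state(struct cdev *cdev, unsigned long *state)
{
	(void)cdev;
	(void)state;
	return -1;
}

static int bogus_state(struct cdev *cdev, unsigned long *state)
{
	(void)cdev;
	*state = 99;
	return 0;
}

/* Registration always succeeds; only the debug setup is conditional. */
static int register_cdev(struct cdev *cdev)
{
	unsigned long current_state;

	/* A failed read is mapped to a value above any valid max_state. */
	if (cdev->get_cur_state(cdev, &current_state))
		current_state = ULONG_MAX;

	cdev->debug_enabled = current_state <= cdev->max_state;
	return 0;
}
```

In this model a device whose callback fails, or reports a state above max_state, is still registered and merely loses debug statistics, which matches the intent described in the comment added by the patch.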