9 commits

Author SHA1 Message Date
Vadim Pasternak
e6d9a876d9 hwmon: (mlxreg-fan) Return zero speed for broken fan
[ Upstream commit a1ffd3c462 ]

Currently, for a broken fan the driver returns a value calculated from
the error code (0xFF) in the related fan speed register.
Thus, for such a fan the user sees fan{n}_fault set to 1 and
fan{n}_input with a misleading value.

Add a check for fan fault before returning the speed value, and return
zero if a fault is detected.
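
The check described above can be sketched in user-space C roughly as
follows. This is a minimal illustration, not the driver code: the names
rpm_from_reg() and fan_input(), the conversion formula, and the numerator
constant are assumptions made for the example. Only the 0xFF error code
comes from the commit message.

```c
#include <stdint.h>

/* Error code the speed register holds for a broken fan (from the commit message). */
#define FAN_FAULT_CODE 0xFFu
/* Hypothetical numerator for the register-to-RPM conversion (illustrative only). */
#define RPM_NUMERATOR 1500000000ULL

/* Hypothetical conversion of a raw register value to RPM. */
static long rpm_from_reg(uint32_t regval, uint32_t divider, uint32_t samples)
{
    unsigned long long denom = ((unsigned long long)regval + samples) * divider;

    /* Guard against division by zero for degenerate parameters. */
    return denom ? (long)(RPM_NUMERATOR / denom) : 0;
}

/* Report zero speed when the register holds the fault code. */
static long fan_input(uint32_t regval, uint32_t divider, uint32_t samples)
{
    if (regval == FAN_FAULT_CODE)
        return 0;   /* fan is broken: do not convert the error code */
    return rpm_from_reg(regval, divider, samples);
}
```

Without the early return, the 0xFF error code would be fed through the
conversion and show up as a plausible-looking non-zero speed.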

Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20230212145730.24247-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 16:43:47 +01:00
Vadim Pasternak
a6c42ae153 hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
[ Upstream commit e6fab7af6b ]

The minimum fan speed can be enforced from sysfs. For example, setting
the current fan speed to 20 enforces a fan speed of 100%, setting it to
19 enforces a speed not below 90%, and so on. This feature provides the
ability to limit fan speed according to system-wide considerations,
such as the absence of some replaceable units or a high system ambient
temperature.

A request to change the minimum fan speed is a configuration request
and can be made only through a 'sysfs' write. In this situation the
value of the argument 'state' is above the nominal fan speed maximum.

Return a non-zero code in this case to avoid the
thermal_cooling_device_stats_update() call, because in this case the
statistics update violates the thermal statistics table range.
The issue is observed when the kernel is configured with the option
CONFIG_THERMAL_STATISTICS.
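
The behaviour described above can be modelled in user-space C roughly as
follows. This is a sketch under stated assumptions, not the kernel code:
struct fan_model, fan_set_cur_state(), and sysfs_store_state() are
hypothetical names, and the nominal maximum of 10 is inferred from the
"20 means 100%, 19 means 90%" example. The statistics table has one slot
per nominal state, so a configuration request must not reach the update.

```c
#include <stddef.h>

/* Assumed nominal maximum state, inferred from the 20 -> 100% example. */
#define FAN_MAX_STATE 10UL

/* Hypothetical model of the cooling device and its statistics table. */
struct fan_model {
    unsigned long cur_state;
    unsigned long min_state;
    unsigned long stats[FAN_MAX_STATE + 1];  /* one slot per nominal state */
};

/* Returns 0 for a regular state change, 1 for a configuration request,
 * and -1 for an out-of-range value. */
static int fan_set_cur_state(struct fan_model *fan, unsigned long state)
{
    if (state > 2 * FAN_MAX_STATE)
        return -1;
    if (state > FAN_MAX_STATE) {
        /* Configuration request: 19 -> minimum 9 (90%), 20 -> minimum 10 (100%). */
        fan->min_state = state - FAN_MAX_STATE;
        if (fan->cur_state < fan->min_state)
            fan->cur_state = fan->min_state;
        return 1;  /* non-zero: the caller must skip the statistics update */
    }
    if (state < fan->min_state)
        state = fan->min_state;
    fan->cur_state = state;
    return 0;
}

/* Hypothetical caller: update statistics only after a regular state change. */
static int sysfs_store_state(struct fan_model *fan, unsigned long state)
{
    int ret = fan_set_cur_state(fan, state);

    if (ret)
        return ret;
    fan->stats[state]++;  /* in range: state <= FAN_MAX_STATE here */
    return 0;
}
```

If fan_set_cur_state() returned 0 for state 19, the caller would index
the table at 19 and run past its end, which is the kind of out-of-bounds
access reported in the trace.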

Here is the trace from KASAN:
[  159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
[  159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
[  159.545625] Call Trace:
[  159.548366]  dump_stack+0x92/0xc1
[  159.552084]  ? thermal_cooling_device_stats_update+0x7d/0xb0
[  159.635869]  thermal_zone_device_update+0x345/0x780
[  159.688711]  thermal_zone_device_set_mode+0x7d/0xc0
[  159.694174]  mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
[  159.700972]  ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
[  159.731827]  mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
[  160.070233] RIP: 0033:0x7fd995909970
[  160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
[  160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
[  160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
[  160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
[  160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
[  160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
[  160.143671]
[  160.145338] Allocated by task 2924:
[  160.149242]  kasan_save_stack+0x19/0x40
[  160.153541]  __kasan_kmalloc+0x7f/0xa0
[  160.157743]  __kmalloc+0x1a2/0x2b0
[  160.161552]  thermal_cooling_device_setup_sysfs+0xf9/0x1a0
[  160.167687]  __thermal_cooling_device_register+0x1b5/0x500
[  160.173833]  devm_thermal_of_cooling_device_register+0x60/0xa0
[  160.180356]  mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
[  160.248140]
[  160.249807] The buggy address belongs to the object at ffff888116163400
[  160.249807]  which belongs to the cache kmalloc-1k of size 1024
[  160.263814] The buggy address is located 64 bytes to the right of
[  160.263814]  1024-byte region [ffff888116163400, ffff888116163800)
[  160.277536] The buggy address belongs to the page:
[  160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
[  160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
[  160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
[  160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
[  160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
[  160.327033] page dumped because: kasan: bad access detected
[  160.333270]
[  160.334937] Memory state around the buggy address:
[  160.356469] >ffff888116163800: fc ..

Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-06 15:42:32 +02:00
Linus Torvalds
a455eda33f Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal soc updates from Eduardo Valentin:

 - thermal core has a new devm_* API for registering cooling devices. I
   took the entire series, that is why you see changes on drivers/hwmon
   in this pull (Guenter Roeck)

 - rockchip thermal driver gains support to PX30 SoC (Elaine Zhang)

 - the generic-adc thermal driver now considers the lookup table DT
   property as optional (Jean-Francois Dagenais)

 - Refactoring of tsens thermal driver (Amit Kucheria)

 - Cleanups on cpu cooling driver (Daniel Lezcano)

 - broadcom thermal driver dropped support to ACPI (Srinath Mannam)

 - tegra thermal driver gains support for OC hw throttle and GPU
   throttle (Wei Ni)

 - Fixes in several thermal drivers.

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: (59 commits)
  hwmon: (pwm-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (npcm750-pwm-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (gpio-fan) Use devm_thermal_of_cooling_device_register
  hwmon: (aspeed-pwm-tacho) Use devm_thermal_of_cooling_device_register
  thermal: rcar_gen3_thermal: Fix to show correct trip points number
  thermal: rcar_thermal: update calculation formula for R-Car Gen3 SoCs
  thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
  thermal: rockchip: Support the PX30 SoC in thermal driver
  dt-bindings: rockchip-thermal: Support the PX30 SoC compatible
  thermal: rockchip: fix up the tsadc pinctrl setting error
  thermal: broadcom: Remove ACPI support
  thermal: Fix build error of missing devm_ioremap_resource on UM
  thermal/drivers/cpu_cooling: Remove pointless field
  thermal/drivers/cpu_cooling: Add Software Package Data Exchange (SPDX)
  thermal/drivers/cpu_cooling: Fixup the header and copyright
  thermal/drivers/cpu_cooling: Remove pointless test in power2state()
  thermal: rcar_gen3_thermal: disable interrupt in .remove
  thermal: rcar_gen3_thermal: fix interrupt type
  thermal: Introduce devm_thermal_of_cooling_device_register
  ...
2019-05-16 07:56:57 -07:00
Guenter Roeck
9ebe010e56 hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register
Call devm_thermal_of_cooling_device_register() to register the cooling
device. Also introduce struct device *dev = &pdev->dev; to make the code
easier to read.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-14 07:00:46 -07:00
Guenter Roeck
725dcf082c hwmon: (mlxreg-fan) Use HWMON_CHANNEL_INFO macro
The HWMON_CHANNEL_INFO macro simplifies the code, reduces the likelihood
of errors, and makes the code easier to read.
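
The effect of such a macro can be sketched in user-space C as follows.
The types and names here (struct channel_info, CHANNEL_INFO, FAN_INPUT,
FAN_FAULT) are stand-ins invented for the example, not the kernel
definitions. The sketch only shows how a variadic macro with compound
literals can replace a hand-written, zero-terminated config array plus a
separate descriptor.

```c
#include <stdint.h>

/* Stand-in types and flags for illustration; not the kernel definitions. */
enum sensor_type { sensor_fan, sensor_pwm };

struct channel_info {
    enum sensor_type type;
    const uint32_t *config;   /* zero-terminated list of per-channel flags */
};

#define FAN_INPUT (1u << 0)
#define FAN_FAULT (1u << 1)

/* Before: separate array with a manual terminating 0, plus a descriptor. */
static const uint32_t fan_config_old[] = {
    FAN_INPUT | FAN_FAULT,
    FAN_INPUT | FAN_FAULT,
    0
};

static const struct channel_info fan_old = {
    .type = sensor_fan,
    .config = fan_config_old,
};

/* After: one macro builds both objects and appends the terminator itself. */
#define CHANNEL_INFO(stype, ...)                      \
    (&(struct channel_info){                          \
        .type = sensor_##stype,                       \
        .config = (uint32_t[]){ __VA_ARGS__, 0 }      \
    })

/* File-scope compound literals have static storage duration. */
static const struct channel_info *const fan_new =
    CHANNEL_INFO(fan, FAN_INPUT | FAN_FAULT, FAN_INPUT | FAN_FAULT);

/* Count channels up to the terminating 0. */
static unsigned int channel_count(const struct channel_info *info)
{
    unsigned int n = 0;

    while (info->config[n])
        n++;
    return n;
}
```

Because the macro adds the trailing 0 itself, a forgotten terminator is
no longer possible in the converted form.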

The conversion was done automatically with coccinelle. The semantic patch
used to make this change is as follows.

@r@
initializer list elements;
identifier i;
@@

-u32 i[] = {
-  elements,
-  0
-};

@s@
identifier r.i,j,ty;
@@

-struct hwmon_channel_info j = {
-       .type = ty,
-       .config = i,
-};

@script:ocaml t@
ty << s.ty;
elements << r.elements;
shorter;
elems;
@@

shorter :=
   make_ident (List.hd(List.rev (Str.split (Str.regexp "_") ty)));
elems :=
   make_ident
    (String.concat ","
     (List.map (fun x -> Printf.sprintf "\n\t\t\t   %s" x)
       (Str.split (Str.regexp " , ") elements)))

@@
identifier s.j,t.shorter;
identifier t.elems;
@@

- &j
+ HWMON_CHANNEL_INFO(shorter,elems)

This patch does not introduce functional changes. Many thanks to
Julia Lawall for providing the semantic patch.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-04-15 17:19:53 -07:00
Vadim Pasternak
b429ebc86f hwmon: (mlxreg-fan) Add support for fan capability registers
Add support for fan capability registers in order to distinguish between
systems that have minor fan configuration differences. This reduces the
amount of code used to describe such systems.
The capability registers provide system-specific information about the
number of physically connected tachometers and a system-specific fan
speed scale parameter.
For example, one system can be equipped with twelve fan tachometers,
while another has eight or six. Or one system should use the default
fan speed divider value, while another has a scale parameter defined in
hardware, which should be used for the divider setting.
Reading this information from the capability registers allows the same
fan structure to be used for systems with such differences.
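
A rough user-space sketch of how such capability values could be consumed
is shown below. Everything in it is an assumption made for illustration:
that the tachometer capability is a bitmask with one bit per connected
tachometer, that a zero scale value means "use the default divider", and
the names and constants connected_tachos(), tacho_divider(),
TACHO_DIV_DEFAULT, and TACHO_DIV_UNIT.

```c
#include <stdint.h>

/* Hypothetical constants; the values are illustrative only. */
#define TACHO_DIV_DEFAULT 1132u
#define TACHO_DIV_UNIT    64u

/* Assumption: one bit per physically connected tachometer. */
static unsigned int connected_tachos(uint32_t cap)
{
    unsigned int n = 0;

    for (; cap; cap &= cap - 1)  /* clear the lowest set bit each pass */
        n++;
    return n;
}

/* Assumption: a zero scale value selects the default divider. */
static unsigned int tacho_divider(uint32_t scale)
{
    return scale ? scale * TACHO_DIV_UNIT : TACHO_DIV_DEFAULT;
}
```

With this shape, one fan description can serve systems with twelve,
eight, or six tachometers: only the values read from the registers
differ.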

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-04-15 17:19:53 -07:00
Vadim Pasternak
3f9ffa5c3a hwmon: (mlxreg-fan) Modify macros for tachometer fault status reading
Modify the macros for tachometer fault status reading to make them
simpler and clearer.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-12-16 15:13:16 -08:00
Vadim Pasternak
243cfe3fb8 hwmon: (mlxreg-fan) Fix macros for tacho fault reading
Fix the macros for tachometer fault reading.
This fix is relevant for three Mellanox systems, MQMB7, MSN37 and MSN34,
which are about to be released to customers.
At the moment, none of them is at customer sites.

Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-11-16 08:10:23 -08:00
Vadim Pasternak
65afb4c8e7 hwmon: (mlxreg-fan) Add support for Mellanox FAN driver
The driver obtains the PWM and tachometer register locations according
to the system configuration and creates FAN/PWM hwmon objects and a
cooling device. PWM and tachometers are controlled through the on-board
programmable device, which exports its register map. This device can be
attached to any bus type for which register mapping is supported. A
single instance is created with one PWM control, up to 12 tachometers
and one cooling device. There can be as many instances as the
programmable device supports.

Currently the driver is activated from the Mellanox platform driver:
drivers/platform/x86/mlx-platform.c.
For future ARM-based systems it could be activated from the ARM
platform module.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-07-08 20:08:13 -07:00