Commit graph

48 commits

Author SHA1 Message Date
Thomas Weißschuh
5e720f8c8c cpufreq: amd-pstate: fix global sysfs attribute type
In commit 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
the "amd_pstate" attributes were moved from a dedicated kobject to the
cpu root kobject.

While the dedicated kobject expects to contain kobj_attributes the root
kobject needs device_attributes.

As the changed arguments are not used by the callbacks it works most of
the time.
However CFI will detect this issue:

[ 4947.849350] CFI failure at dev_attr_show+0x24/0x60 (target: show_status+0x0/0x70; expected type: 0x8651b1de)
...
[ 4947.849409] Call Trace:
[ 4947.849410]  <TASK>
[ 4947.849411]  ? __warn+0xcf/0x1c0
[ 4947.849414]  ? dev_attr_show+0x24/0x60
[ 4947.849415]  ? report_cfi_failure+0x4e/0x60
[ 4947.849417]  ? handle_cfi_failure+0x14c/0x1d0
[ 4947.849419]  ? __cfi_show_status+0x10/0x10
[ 4947.849420]  ? handle_bug+0x4f/0x90
[ 4947.849421]  ? exc_invalid_op+0x1a/0x60
[ 4947.849422]  ? asm_exc_invalid_op+0x1a/0x20
[ 4947.849424]  ? __cfi_show_status+0x10/0x10
[ 4947.849425]  ? dev_attr_show+0x24/0x60
[ 4947.849426]  sysfs_kf_seq_show+0xa6/0x110
[ 4947.849433]  seq_read_iter+0x16c/0x4b0
[ 4947.849436]  vfs_read+0x272/0x2d0
[ 4947.849438]  ksys_read+0x72/0xe0
[ 4947.849439]  do_syscall_64+0x76/0xb0
[ 4947.849440]  ? do_user_addr_fault+0x252/0x650
[ 4947.849442]  ? exc_page_fault+0x7a/0x1b0
[ 4947.849443]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixes: 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
Reported-by: Jannik Glückert <jannik.glueckert@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217765
Link: https://lore.kernel.org/lkml/c7f1bf9b-b183-bf6e-1cbb-d43f72494083@gmail.com/
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-07 19:41:48 +02:00
Mario Limonciello
c88ad30e3f cpufreq: amd-pstate: Add a kernel config option to set default mode
Users are having more success with amd-pstate since the introduction
of EPP and Guided modes.  To expose the driver to more users by default
introduce a kernel configuration option for setting the default mode.

Users can use an integer to map out which default mode they want to use
in lieu of a kernel command line option.

This will default to EPP, but only if:
 1) The CPU supports an MSR.
 2) The system profile is identified
 3) The system profile is identified as a non-server by the FADT.
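
As a rough illustration of the conditions listed above, the check can be sketched as a small predicate. The function and parameter names are hypothetical, and the integer-to-mode mapping (1 to 4) is an assumption about the config option, not something stated in this message.

```python
from enum import IntEnum


class DefaultMode(IntEnum):
    # Assumed integer mapping for the config option (not stated in the commit).
    DISABLE = 1
    PASSIVE = 2
    ACTIVE = 3   # EPP
    GUIDED = 4


def epp_default_applies(has_cppc_msr, profile_identified, profile_is_server):
    """Return True only when all three conditions from the list hold."""
    return bool(has_cppc_msr and profile_identified and not profile_is_server)


def mode_from_config(value):
    """Map the configured integer to a mode; raises ValueError if out of range."""
    return DefaultMode(value)
```

A desktop with the MSR and a recognised profile would get the EPP default under this sketch, while a server or a shared-memory system would not.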

Link: https://gitlab.freedesktop.org/hadess/power-profiles-daemon/-/merge_requests/121
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Co-developed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-21 18:44:56 +02:00
Mario Limonciello
32f80b9adf cpufreq: amd-pstate: Set a fallback policy based on preferred_profile
If a user's configuration doesn't explicitly specify the cpufreq
scaling governor then the code currently explicitly falls back to
'powersave'. This default is fine for notebooks and desktops, but
servers and undefined machines should default to 'performance'.

Look at the 'preferred_profile' field from the FADT to set this
policy accordingly.
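
A minimal sketch of the fallback described above, assuming the Preferred_PM_Profile numbering from the ACPI FADT definition. The helper name `fallback_policy` is hypothetical, and treating out-of-range values as undefined is an assumption.

```python
from enum import IntEnum


class PmProfile(IntEnum):
    # Preferred_PM_Profile values as listed in the ACPI FADT description.
    UNSPECIFIED = 0
    DESKTOP = 1
    MOBILE = 2
    WORKSTATION = 3
    ENTERPRISE_SERVER = 4
    SOHO_SERVER = 5
    APPLIANCE_PC = 6
    PERFORMANCE_SERVER = 7
    TABLET = 8


SERVER_PROFILES = {
    PmProfile.ENTERPRISE_SERVER,
    PmProfile.SOHO_SERVER,
    PmProfile.PERFORMANCE_SERVER,
}


def fallback_policy(preferred_profile):
    """Pick the fallback governor when the user did not specify one."""
    try:
        profile = PmProfile(preferred_profile)
    except ValueError:
        # Assumption: unknown values count as "undefined machines".
        return "performance"
    if profile is PmProfile.UNSPECIFIED or profile in SERVER_PROFILES:
        return "performance"
    return "powersave"
```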

Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt
Acked-by: Huang Rui <ray.huang@amd.com>
Suggested-by: Wyes Karny <Wyes.Karny@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-21 18:44:56 +02:00
Wyes Karny
f4aad63930 cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated
The amd-pstate passive mode driver name is hyphenated. To make the
amd-pstate active mode driver consistent with it, rename "amd_pstate_epp"
to "amd-pstate-epp".

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-16 19:37:13 +02:00
Wyes Karny
217e67784e cpufreq: amd-pstate: Write CPPC enable bit per-socket
Currently amd_pstate sets the CPPC enable bit in MSR_AMD_CPPC_ENABLE only
for the CPU where module_init happened. But MSR_AMD_CPPC_ENABLE is
per-socket, so on servers with more than one physical package the CPPC
enable bit gets set for only one socket. To fix this, write
MSR_AMD_CPPC_ENABLE per-socket.

Also, handle duplicate calls for cppc_enable, because it's called from
per-policy/per-core callbacks and can result in duplicate MSR writes.

Before the fix:
amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
    192 0
    192 1

After the fix:
amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
    384 1
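
The before/after counts can be reproduced with a toy model. This is only a simulation under stated assumptions: two sockets of 192 CPUs each, contiguous CPU numbering per socket, and hypothetical helper names. It is not the driver code.

```python
from collections import Counter

CPUS_PER_SOCKET = 192  # assumed topology matching the 384-CPU output above
NUM_SOCKETS = 2
TOTAL_CPUS = CPUS_PER_SOCKET * NUM_SOCKETS


def socket_of(cpu):
    # Assumption: CPUs are numbered contiguously within each socket.
    return cpu // CPUS_PER_SOCKET


def read_all(enable_bits):
    """Per-CPU view of the per-socket enable bit, like `rdmsr -a`."""
    return [enable_bits[socket_of(cpu)] for cpu in range(TOTAL_CPUS)]


def enable_on_init_cpu_only(init_cpu):
    """Old behaviour: only the socket of the init CPU gets the bit."""
    bits = [0] * NUM_SOCKETS
    bits[socket_of(init_cpu)] = 1
    return bits


def enable_per_socket():
    """New behaviour: one write per socket, duplicate requests skipped."""
    bits = [0] * NUM_SOCKETS
    writes = 0
    for cpu in range(TOTAL_CPUS):      # called from per-CPU callbacks
        sock = socket_of(cpu)
        if bits[sock]:                 # already enabled, skip the MSR write
            continue
        bits[sock] = 1
        writes += 1
    return bits, writes


before = Counter(read_all(enable_on_init_cpu_only(init_cpu=200)))
after_bits, msr_writes = enable_per_socket()
after = Counter(read_all(after_bits))
```

In this model `before` counts 192 CPUs reading 0 and 192 reading 1, while `after` counts all 384 reading 1 with only two MSR writes issued.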

Suggested-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-16 19:30:47 +02:00
Wyes Karny
3bf8c6307b cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
The driver should update policy->cur after updating the frequency.
Currently amd_pstate doesn't update policy->cur when `adjust_perf`
is used, which causes /proc/cpuinfo to show the wrong CPU frequency.
Fix this by updating policy->cur with the correct frequency value in
the adjust_perf callback.

- Before the fix: (setting min freq to 1.5 GHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1777.016
      1 cpu MHz         : 1797.160
      1 cpu MHz         : 1797.270
    189 cpu MHz         : 400.000

- After the fix: (setting min freq to 1.5 GHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1753.353
      1 cpu MHz         : 1756.838
      1 cpu MHz         : 1776.466
      1 cpu MHz         : 1776.873
      1 cpu MHz         : 1777.308
      1 cpu MHz         : 1779.900
    183 cpu MHz         : 1805.231
      1 cpu MHz         : 1956.815
      1 cpu MHz         : 2246.203
      1 cpu MHz         : 2259.984

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
[ rjw: Subject edits ]
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-25 19:35:13 +02:00
Wyes Karny
249b62c448 cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
amd_pstate active mode driver is only compatible with static governors.
Therefore it doesn't need fast_switch functionality. Remove
fast_switch_possible flag from amd_pstate active mode driver.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24 19:39:16 +02:00
Gautham R. Shenoy
4badf2eb1e cpufreq: amd-pstate: Add ->fast_switch() callback
Schedutil normally calls the adjust_perf callback for drivers that
provide it and have the fast_switch_possible flag set. However, when
frequency invariance is disabled, schedutil tries to invoke fast_switch
instead, so there is a chance of a kernel crash if this function
pointer is not set. To protect against this scenario, add a
fast_switch callback to the amd_pstate driver.

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24 19:39:16 +02:00
Linus Torvalds
556eb8b791 Driver core changes for 6.4-rc1

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firmware loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Tom Rix
11fa52fe61 cpufreq: amd-pstate: Make variable mode_state_machine static
smatch reports
drivers/cpufreq/amd-pstate.c:907:25: warning: symbol
  'mode_state_machine' was not declared. Should it be static?

This variable is only used in one file so it should be static.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-11 20:47:20 +02:00
Wyes Karny
3ca7bc818d cpufreq: amd-pstate: Add guided mode control support via sysfs
The amd_pstate driver's `status` sysfs entry lets the user change the
driver's mode dynamically. With the addition of guided mode, the number
of possible mode transitions has grown to 16. Therefore, optimise the
amd_pstate_update_status function by implementing a state transition
table.

There are 4 states amd_pstate supports, namely: 'disable', 'passive',
'active', and 'guided'.  The transition from any state to any other
state is possible after this change.

Sysfs interface:

To disable amd_pstate driver:
 # echo disable > /sys/devices/system/cpu/amd_pstate/status

To enable passive mode:
 # echo passive > /sys/devices/system/cpu/amd_pstate/status

To change mode to active:
 # echo active > /sys/devices/system/cpu/amd_pstate/status

To change mode to guided:
 # echo guided > /sys/devices/system/cpu/amd_pstate/status
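
The idea of a state transition table can be sketched as a lookup keyed by (current, requested) mode. The handler functions here are hypothetical placeholders; the real table dispatches to driver register/unregister/change routines that are not shown in this message.

```python
MODES = ("disable", "passive", "active", "guided")


def keep_mode(current, requested):
    """Hypothetical handler: same state requested, nothing to do."""
    return current


def change_mode(current, requested):
    """Hypothetical handler standing in for the real transition work."""
    return requested


# One entry per (current, requested) pair: 4 x 4 = 16 combinations.
TRANSITIONS = {
    (cur, new): (keep_mode if cur == new else change_mode)
    for cur in MODES
    for new in MODES
}


def update_status(current, requested):
    """Look up and run the handler for the requested transition."""
    if requested not in MODES:
        raise ValueError(f"unknown mode: {requested!r}")
    return TRANSITIONS[(current, requested)](current, requested)
```

A table like this replaces nested conditionals with a single lookup, which is what keeps the 16 combinations manageable.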

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-17 19:06:23 +01:00
Wyes Karny
2dd6d0ebf7 cpufreq: amd-pstate: Add guided autonomous mode
According to the ACPI spec, the following 3 modes can be defined for CPPC:

 1. Non autonomous: OS scaling governor specifies operating frequency/
    performance level through `Desired Performance` register and platform
    follows that.

 2. Guided autonomous: OS scaling governor specifies min and max
    frequencies/ performance levels through `Minimum Performance` and
    `Maximum Performance` register, and platform can autonomously select an
    operating frequency in this range.

 3. Fully autonomous: OS only hints (via EPP) to platform for the required
    energy performance preference for the workload and platform autonomously
    scales the frequency.

Currently (1) is supported by amd_pstate as passive mode, and (3) is
implemented by EPP support. This change is to support (2).

In guided autonomous mode the min_perf is based on the input from the
scaling governor. For example, in case of schedutil this value depends
on the current utilization. And max_perf is set to max capacity.

To activate guided auto mode, the ``amd_pstate=guided`` command line
parameter has to be passed to the kernel.
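
A simplified sketch of how the governor's input might map onto the CPPC performance fields in modes (1) and (2). The names are hypothetical, and writing 0 as the desired performance in guided mode (to leave the choice to the platform) is an assumption rather than something this message states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfRequest:
    min_perf: int
    des_perf: int
    max_perf: int


def clamp(value, low, high):
    return max(low, min(value, high))


def build_request(mode, governor_perf, lowest_perf, highest_perf):
    """Translate the governor's requested level into CPPC request fields."""
    target = clamp(governor_perf, lowest_perf, highest_perf)
    if mode == "passive":
        # Non autonomous: the OS names the operating point directly.
        return PerfRequest(lowest_perf, target, highest_perf)
    if mode == "guided":
        # Guided autonomous: the OS sets the floor, max stays at max capacity,
        # and the platform picks a level inside [min_perf, max_perf].
        return PerfRequest(target, 0, highest_perf)
    raise ValueError(f"mode not covered by this sketch: {mode!r}")
```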

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-17 19:06:23 +01:00
Greg Kroah-Hartman
3666062b87 cpufreq: amd-pstate: move to use bus_get_dev_root()
Direct access to the struct bus_type dev_root pointer is going away soon
so replace that with a call to bus_get_dev_root() instead, which is what
it is there for.

In doing so, remove the unneeded kobject structure that was only being
created to cause a subdirectory for the attributes.  The name of the
attribute group is the correct way to do this, saving code and
complexity as well as allowing the attributes to properly show up to
userspace tools (the raw kobject would not allow that.)

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Huang Rui <ray.huang@amd.com>
Link: https://lore.kernel.org/r/20230313182918.1312597-20-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17 15:30:07 +01:00
Nick Alcock
fa0746b11b cpufreq: amd-pstate: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in amd-pstate.c which cannot be built as a module.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
[ rjw: Subject and changelog adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-23 20:01:23 +01:00
Kai-Heng Feng
7af78020e2 cpufreq: amd-pstate: Let user know amd-pstate is disabled
Commit 202e683df3 ("cpufreq: amd-pstate: add amd-pstate driver
parameter for mode selection") changed the driver to be disabled by
default, and this can surprise users.

Let users know what happened so they can decide what to do next.

Link: https://bugs.launchpad.net/bugs/2006942
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Yuan Perry <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-23 19:56:08 +01:00
Wyes Karny
6e9d12125f cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
`amd_pstate_set_epp` function uses `cppc_req_cached` and `epp` variable
to update the MSR_AMD_CPPC_REQ register for AMD MSR systems. The recent
commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized variable
use") changed the sequence of updating cppc_req_cached and writing the
MSR_AMD_CPPC_REQ. Therefore while switching from powersave to
performance governor and vice-versa in active mode MSR_AMD_CPPC_REQ is
set with the previous cached value. To fix this: first update the
`cppc_req_cached` variable and then call `amd_pstate_set_epp` function.

 - Before commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized
   variable use"):

With powersave governor:
[    1.652743] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff
[    1.652744] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[    1.652746] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0

Changing to performance governor:
[  300.493842] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff
[  300.493846] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[  300.493847] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0

 - After commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized
   variable use"):

With powersave governor:
[    1.646037] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[    1.646038] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0
[    1.646042] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff

Changing to performance governor:
[  687.117401] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[  687.117405] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0
[  687.117419] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff

 - After this fix:

With powersave governor:
[    2.525717] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff
[    2.525720] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[    2.525722] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0

Changing to performance governor:
[ 3440.152468] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff
[ 3440.152473] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[ 3440.152474] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0
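
The logged values are consistent with max_perf in bits 0-7 and min_perf in bits 8-15 of the request (0x1eff gives min 30, max 255). The sketch below packs the fields that way and models the ordering problem. The positions of des_perf and epp (bits 16-23 and 24-31) are an assumption, since the logs only show zeros there, and the `CpuData` class is a hypothetical stand-in for the driver state.

```python
def pack_cppc_req(min_perf, des_perf, max_perf, epp):
    """Pack four 8-bit fields into one request value."""
    for field in (min_perf, des_perf, max_perf, epp):
        if not 0 <= field <= 0xFF:
            raise ValueError("fields must fit in 8 bits")
    return max_perf | (min_perf << 8) | (des_perf << 16) | (epp << 24)


def unpack_cppc_req(value):
    return {
        "max_perf": value & 0xFF,
        "min_perf": (value >> 8) & 0xFF,
        "des_perf": (value >> 16) & 0xFF,
        "epp": (value >> 24) & 0xFF,
    }


class CpuData:
    """Hypothetical stand-in for the per-CPU driver state."""

    def __init__(self, cached):
        self.cppc_req_cached = cached
        self.msr = cached

    def set_epp(self):
        # Simplified: the MSR write uses whatever is cached at call time.
        self.msr = self.cppc_req_cached


def switch_buggy(cpu, new_req):
    cpu.set_epp()                   # writes the stale cached value
    cpu.cppc_req_cached = new_req   # cache updated too late


def switch_fixed(cpu, new_req):
    cpu.cppc_req_cached = new_req   # update the cache first
    cpu.set_epp()                   # then write the MSR from it


POWERSAVE = pack_cppc_req(min_perf=30, des_perf=0, max_perf=255, epp=0)
PERFORMANCE = pack_cppc_req(min_perf=255, des_perf=0, max_perf=255, epp=0)
```

With the buggy order, switching from performance to powersave leaves the modelled MSR at the old value; with the fixed order the MSR matches the new cached request.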

Fixes: 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized variable use")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-15 15:58:07 +01:00
Arnd Bergmann
7cca9a9851 cpufreq: amd-pstate: avoid uninitialized variable use
The new epp support causes warnings about three separate
but related bugs:

1) failing before allocation should just return an error:

drivers/cpufreq/amd-pstate.c:951:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (!dev)
            ^~~~
drivers/cpufreq/amd-pstate.c:1018:9: note: uninitialized use occurs here
        return ret;
               ^~~

2) wrong variable to store return code:

drivers/cpufreq/amd-pstate.c:963:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (rc)
            ^~
drivers/cpufreq/amd-pstate.c:1019:9: note: uninitialized use occurs here
        return ret;
               ^~~
drivers/cpufreq/amd-pstate.c:963:2: note: remove the 'if' if its condition is always false
        if (rc)
        ^~~~~~~

3) calling amd_pstate_set_epp() in cleanup path after determining
that it should not be called:

drivers/cpufreq/amd-pstate.c:1055:6: error: variable 'epp' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (cpudata->epp_policy == cpudata->policy)
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:1080:30: note: uninitialized use occurs here
        amd_pstate_set_epp(cpudata, epp);
                                    ^~~

All three are trivial to fix, but most likely there are additional bugs
in this function when the error handling was not really tested.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Yuan Perry <Perry.Yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:21:42 +01:00
Uwe Kleine-König
dd329e1e21 cpufreq: Make cpufreq_unregister_driver() return void
All but a few drivers ignore the return value of
cpufreq_unregister_driver(). Those few that don't only call it after
cpufreq_register_driver() succeeded, in which case the call doesn't
fail.

Make the function return no value and add a WARN_ON for the case that
the function is called in an invalid situation (i.e. without a previous
successful call to cpufreq_register_driver()).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com> # brcmstb-avs-cpufreq.c
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:19:18 +01:00
Perry Yuan
3ec32b6d17 cpufreq: amd-pstate: convert sprintf with sysfs_emit()
Replace sprintf() with the more generic sysfs_emit() function.

No functional impact is intended.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan
abd61c08ef cpufreq: amd-pstate: add driver working mode switch support
When the amd-pstate driver is loaded in a specific driver mode, users
need a way to check which mode is enabled. Add a sysfs entry that shows
the current status:

$ cat /sys/devices/system/cpu/amd_pstate/status
active

Users can also switch the driver mode by writing a mode string to the
sysfs entry, as below.

Enable passive mode:
$ sudo bash -c "echo passive > /sys/devices/system/cpu/amd_pstate/status"

Enable active mode (EPP driver mode):
$ sudo bash -c "echo active > /sys/devices/system/cpu/amd_pstate/status"

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan
50ddd2f782 cpufreq: amd-pstate: implement suspend and resume callbacks
Add suspend and resume support for AMD processors to the amd_pstate_epp
driver instance.

When the CPPC is suspended, the EPP driver sets the EPP profile to the
'power' profile and sets max/min perf to the lowest perf value.
On resume, it restores the MSR registers from the previously cached
values.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan
d4da12f803 cpufreq: amd-pstate: implement amd pstate cpu online and offline callback
Add online and offline driver callbacks so that CPU cores can go
offline and have their previous working state restored when they come
back online later in EPP driver mode.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:41 +01:00
Perry Yuan
ffa5096a7c cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors
Add EPP driver support for AMD SoCs which support a dedicated MSR for
CPPC.  EPP is used by the DPM controller to configure the frequency that
a core operates at during short periods of activity.

The SoC EPP targets are configured on a scale from 0 to 255 where 0
represents maximum performance and 255 represents maximum efficiency.

The amd-pstate driver exports profile string names to userspace that are
tied to specific EPP values.

The balance_performance string (0x80) provides the best balance for
efficiency versus power on most systems, but users can choose other
strings to meet their needs as well.

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
default performance balance_performance balance_power power

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
balance_performance
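
A small sketch of the string-to-value mapping on the 0 to 255 scale. Only balance_performance (0x80) and the two endpoints are given in this message; the value used for balance_power is an assumption, and `epp_for` is a hypothetical helper. The `default` string is left out because it does not name a fixed value here.

```python
EPP_VALUES = {
    "performance": 0x00,          # 0 = maximum performance
    "balance_performance": 0x80,  # value stated in the message above
    "balance_power": 0xBF,        # assumed value, not stated in the message
    "power": 0xFF,                # 255 = maximum efficiency
}


def epp_for(preference):
    """Return the EPP value for a profile string, or raise ValueError."""
    try:
        return EPP_VALUES[preference]
    except KeyError:
        raise ValueError(f"unknown preference: {preference!r}") from None
```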

To enable the driver, add `amd_pstate=active` to the kernel
command line and the kernel will load the active mode EPP driver.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:41 +01:00
Wyes Karny
36c5014e54 cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()
The amd-pstate driver may support multiple working modes.
Introduce a variable to keep track of which mode is currently enabled.
Here the cppc_state variable indicates which mode is enabled.
This change helps simplify amd_pstate_param() when choosing
which mode to use for the subsequent driver registration.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:41 +01:00
Perry Yuan
4f3085f87b cpufreq: amd-pstate: fix kernel hang issue while amd-pstate unregistering
In amd_pstate_adjust_perf(), there is a cpufreq_cpu_get() call that
increments the kobject reference count of the policy and marks it as
busy. Therefore, a corresponding call to cpufreq_cpu_put() is needed to
decrement the kobject reference count again. This resolves the kernel
hang when unregistering the amd-pstate driver and registering the
`amd_pstate_epp` driver instance.

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-10 20:31:08 +01:00
Perry Yuan
202e683df3 cpufreq: amd-pstate: add amd-pstate driver parameter for mode selection
When the amd_pstate driver is built-in, users still need a way to
enable or disable it depending on their circumstances.
Add support for an early parameter to do this.

There is some performance degradation on a number of ASICs in the
passive mode. This performance issue was originally discovered in
shared memory systems but it has been proven that certain workloads
on MSR systems also suffer performance issues.
Set the amd-pstate driver as disabled by default to temporarily
mitigate the performance problem.

 1) with `amd_pstate=disable`, the pstate driver will not be loaded at
    kernel boot.

 2) with `amd_pstate=passive`, the pstate driver will be enabled and
    loaded in the non-autonomous working mode supported by the low-level
    power management firmware.

 3) If neither parameter is specified, the driver will be disabled by
    default to avoid triggering performance regressions in certain ASICs
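
The three cases above amount to a small parsing rule. The sketch below works on a command-line string in userspace terms; the function name is hypothetical, and rejecting unknown values with an error is an assumption about behaviour this message does not describe.

```python
VALID_MODES = ("disable", "passive")


def parse_amd_pstate_param(cmdline):
    """Return the driver mode selected by a kernel command line string."""
    mode = "disable"  # case 3: disabled when the parameter is absent
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != "amd_pstate" or not sep:
            continue
        if value not in VALID_MODES:
            # Assumption: unknown values are rejected in this sketch.
            raise ValueError(f"unsupported amd_pstate value: {value!r}")
        mode = value
    return mode
```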

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-22 19:57:15 +01:00
Perry Yuan
456ca88d8a cpufreq: amd-pstate: change amd-pstate driver to be built-in type
Currently, when amd-pstate and acpi_cpufreq are both built as
modules, amd-pstate will not be loaded by default.

Change the amd-pstate driver to a built-in type. This resolves the
loading sequence problem and allows users to make amd-pstate the
default cpufreq scaling driver.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Fixes: ec437d71db ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-22 19:57:15 +01:00
Wyes Karny
919f455769 cpufreq: amd-pstate: reset MSR_AMD_PERF_CTL register at init
MSR_AMD_PERF_CTL is guaranteed to be 0 on a cold boot. However, on a
kexec boot, for instance, it may have a non-zero value (if the cpu was
in a non-P0 Pstate).  In such cases, the cores with non-P0 Pstates at
boot will never be pushed to P0, let alone boost frequencies.

Kexec is a common workflow for reboot on Linux and this creates a
regression in performance. Fix it by explicitly setting the
MSR_AMD_PERF_CTL to 0 during amd_pstate driver init.

Cc: All applicable <stable@vger.kernel.org>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-22 19:57:14 +01:00
Linus Torvalds
dd42d9c3f4 linux-kselftest-next-6.1-rc1
This Kselftest update for Linux 6.1-rc1 consists of fixes and new tests.
 
 - Adds an amd-pstate-ut test module, this module is used by kselftest
   to unit test amd-pstate functionality
 - Fixes and cleanups to cpu-hotplug to delete the fault injection
   test code
 - Improvements to vm test to use top_srcdir for builds

Merge tag 'linux-kselftest-next-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "Fixes and new tests:

   - Add an amd-pstate-ut test module, used by kselftest to unit test
     amd-pstate functionality

   - Fixes and cleanups to cpu-hotplug to delete the fault injection
     test code

   - Improvements to vm test to use top_srcdir for builds"

* tag 'linux-kselftest-next-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  docs:kselftest: fix kselftest_module.h path of example module
  cpufreq: amd-pstate: Add explanation for X86_AMD_PSTATE_UT
  selftests/cpu-hotplug: Add log info when test success
  selftests/cpu-hotplug: Reserve one cpu online at least
  selftests/cpu-hotplug: Delete fault injection related code
  selftests/cpu-hotplug: Use return instead of exit
  selftests/cpu-hotplug: Correct log info
  cpufreq: amd-pstate: modify type in argument 2 for filp_open
  Documentation: amd-pstate: Add unit test introduction
  selftests: amd-pstate: Add test trigger for amd-pstate driver
  cpufreq: amd-pstate: Add test module for amd-pstate driver
  cpufreq: amd-pstate: Expose struct amd_cpudata
  selftests/vm: use top_srcdir instead of recomputing relative paths
2022-10-06 12:53:15 -07:00
Meng Li
f1375ec1df cpufreq: amd-pstate: Expose struct amd_cpudata
Expose struct amd_cpudata to AMD P-State unit test module.

This data struct will be used by the upcoming AMD P-State unit test
(amd-pstate-ut) module, which reads AMD-specific information from it,
for example: highest perf, nominal perf, boost supported, etc.

Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-05 11:05:17 -06:00
Linus Torvalds
c79e6fa98c Power management updates for 6.1-rc1
- Add support for Tiger Lake in no-HWP mode to intel_pstate (Doug
    Smythies).
 
  - Update the AMD P-state driver (Perry Yuan):
    * Fix wrong lowest perf fetch.
    * Map desired perf into pstate scope for powersave governor.
    * Update pstate frequency transition delay time.
    * Fix initial highest_perf value.
    * Clean up.
 
  - Move max CPU capacity to sugov_policy in the schedutil cpufreq
    governor (Lukasz Luba).
 
  - Add SM6115 to cpufreq-dt blocklist (Adam Skladowski).
 
  - Add support for Tegra239 and minor cleanups (Sumit Gupta, ye xingchen,
    and Yang Yingliang).
 
  - Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen Yan,
    and Viresh Kumar).
 
  - Minor cleanups around functions called at module_init() (Xiu Jianfeng).
 
  - Use module_init and add module_exit for bmips driver (Zhang Jianhua).
 
  - Add AlderLake-N support to intel_idle (Zhang Rui).
 
  - Replace strlcpy() with unused retval with strscpy() in intel_idle
    (Wolfram Sang).
 
  - Remove redundant check from cpuidle_switch_governor() (Yu Liao).
 
  - Replace strlcpy() with unused retval with strscpy() in the powernv
    cpuidle driver (Wolfram Sang).
 
  - Drop duplicate word from a comment in the coupled cpuidle driver
    (Jason Wang).
 
  - Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
    in the flags and the device is about to resume (Rafael Wysocki).
 
  - Add extra debugging statement for multiple active IRQs to system
    wakeup handling code (Mario Limonciello).
 
  - Replace strlcpy() with unused retval with strscpy() in the core
    system suspend support code (Wolfram Sang).
 
  - Update the intel_rapl power capping driver:
    * Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
    * Add support for RAPTORLAKE_S (Zhang Rui).
    * Fix UBSAN shift-out-of-bounds issue (Chao Qin).
 
  - Handle -EPROBE_DEFER when regulator is not probed on
    mtk-cci-devfreq.c (AngeloGioacchino Del Regno).
 
  - Fix message typo and use dev_err_probe() in rockchip-dfi.c
    (Christophe JAILLET).

Merge tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add support for some new hardware, extend the existing hardware
  support, fix some issues and clean up code

  Specifics:

   - Add support for Tiger Lake in no-HWP mode to intel_pstate (Doug
     Smythies)

   - Update the AMD P-state driver (Perry Yuan):
      - Fix wrong lowest perf fetch
      - Map desired perf into pstate scope for powersave governor
      - Update pstate frequency transition delay time
      - Fix initial highest_perf value
      - Clean up

   - Move max CPU capacity to sugov_policy in the schedutil cpufreq
     governor (Lukasz Luba)

   - Add SM6115 to cpufreq-dt blocklist (Adam Skladowski)

   - Add support for Tegra239 and minor cleanups (Sumit Gupta, ye
     xingchen, and Yang Yingliang)

   - Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen
     Yan, and Viresh Kumar)

   - Minor cleanups around functions called at module_init() (Xiu
     Jianfeng)

   - Use module_init and add module_exit for bmips driver (Zhang
     Jianhua)

   - Add AlderLake-N support to intel_idle (Zhang Rui)

   - Replace strlcpy() with unused retval with strscpy() in intel_idle
     (Wolfram Sang)

   - Remove redundant check from cpuidle_switch_governor() (Yu Liao)

   - Replace strlcpy() with unused retval with strscpy() in the powernv
     cpuidle driver (Wolfram Sang)

   - Drop duplicate word from a comment in the coupled cpuidle driver
     (Jason Wang)

   - Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
     in the flags and the device is about to resume (Rafael Wysocki)

   - Add extra debugging statement for multiple active IRQs to system
     wakeup handling code (Mario Limonciello)

   - Replace strlcpy() with unused retval with strscpy() in the core
     system suspend support code (Wolfram Sang)

   - Update the intel_rapl power capping driver:
      - Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
      - Add support for RAPTORLAKE_S (Zhang Rui).
      - Fix UBSAN shift-out-of-bounds issue (Chao Qin)

   - Handle -EPROBE_DEFER when regulator is not probed on
     mtk-cci-devfreq.c (AngeloGioacchino Del Regno)

   - Fix message typo and use dev_err_probe() in rockchip-dfi.c
     (Christophe JAILLET)"

* tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
  cpufreq: qcom-cpufreq-hw: Add cpufreq qos for LMh
  cpufreq: Add __init annotation to module init funcs
  cpufreq: tegra194: change tegra239_cpufreq_soc to static
  PM / devfreq: rockchip-dfi: Fix an error message
  PM / devfreq: mtk-cci: Handle sram regulator probe deferral
  powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
  PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
  intel_idle: Add AlderLake-N support
  powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
  cpufreq: tegra194: Add support for Tegra239
  cpufreq: qcom-cpufreq-hw: Fix uninitialized throttled_freq warning
  cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
  powercap: intel_rapl: Add support for RAPTORLAKE_S
  cpufreq: amd-pstate: Fix initial highest_perf value
  cpuidle: Remove redundant check in cpuidle_switch_governor()
  PM: wakeup: Add extra debugging statement for multiple active IRQs
  cpufreq: tegra194: Remove the unneeded result variable
  PM: suspend: move from strlcpy() with unused retval to strscpy()
  intel_idle: move from strlcpy() with unused retval to strscpy()
  cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
  ...
2022-10-03 13:26:47 -07:00
Perry Yuan
bedadcfb01 cpufreq: amd-pstate: Fix initial highest_perf value
Some new AMD processors may use a wrong highest perf value when the
amd-pstate driver is loaded. To avoid this, first query the highest perf
from the MSR_AMD_CPPC_CAP1 register and the cppc_acpi interface, then
compare it with the value returned by amd_get_highest_perf().

The lower of the two values is the correct highest perf to use.
Otherwise the CPU max MHz will be incorrect if amd_get_highest_perf()
does not cover the new processor family and model ID.

For example, in this lscpu output the max frequency is incorrect:

Vendor ID:               AuthenticAMD
    Socket(s):           1
    Stepping:            2
    CPU max MHz:         5410.0000
    CPU min MHz:         400.0000
    BogoMIPS:            5600.54
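
The "take the lower value" rule can be sketched in user-space C as
below. The function names are hypothetical, and the bit position of the
highest perf field (bits 31:24 of MSR_AMD_CPPC_CAP1) is stated here as
an assumption rather than taken from this changelog.

```c
#include <stdint.h>

/* Assumed layout: highest perf in bits 31:24 of MSR_AMD_CPPC_CAP1. */
uint32_t cap1_highest_perf(uint64_t cap1)
{
	return (uint32_t)((cap1 >> 24) & 0xff);
}

/*
 * Pick the lower of the hardware-reported value and the value a
 * family/model lookup (amd_get_highest_perf() in the kernel) returns.
 */
uint32_t pick_highest_perf(uint64_t cap1, uint32_t table_highest)
{
	uint32_t hw_highest = cap1_highest_perf(cap1);

	return hw_highest < table_highest ? hw_highest : table_highest;
}
```

With a lookup that falls back to 255 for an unknown model, a hardware
value of 166 wins, so the derived max frequency is no longer inflated.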

Fixes: 3743d55b28 (x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations)
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-03 20:17:32 +02:00
Perry Yuan
ca08e46d42 cpufreq: amd-pstate: update pstate frequency transition delay time
Change the default transition latency to 20ms, which is a more
reasonable value for AMD processors in non-EPP driver mode.

Update the transition delay time to 1ms: in both the autonomous and
non-autonomous modes of AMD CPUs, the CPPC firmware decides the
frequency at a 1ms timescale based on the workload utilization.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:05:38 +02:00
Perry Yuan
0e9a86386b cpufreq: amd_pstate: map desired perf into pstate scope for powersave governor
Fix the invalid desired perf value for the powersave governor. The
issue was found when testing on an AMD EPYC system: the actual des_perf
was smaller than min_perf, which is invalid because min_perf is the
lowest_perf the system can support in idle state.
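
A minimal user-space sketch of keeping the desired value inside the
supported range is shown below. The function name is hypothetical and
the upper bound is an assumption added for symmetry; the changelog only
describes the violation of the lower bound.

```c
/*
 * Clamp a desired perf value into [min_perf, max_perf].
 * Hypothetical helper for illustration only.
 */
unsigned long clamp_des_perf(unsigned long des_perf,
			     unsigned long min_perf,
			     unsigned long max_perf)
{
	if (des_perf < min_perf)
		return min_perf;	/* the invalid case described above */
	if (des_perf > max_perf)
		return max_perf;
	return des_perf;
}
```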

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:05:38 +02:00
Perry Yuan
b185c5053c cpufreq: amd_pstate: fix wrong lowest perf fetch
Fix the wrong lowest perf value reading that is used to calculate the
new des_perf requested by the governor. An incorrect min_perf leads to
an incorrect des_perf being set, which causes the system frequency to
change unexpectedly.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Su Jinzhou <jinzhou.su@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:05:37 +02:00
Perry Yuan
d8bee41db8 cpufreq: amd-pstate: fix white-space
Remove stray white space and correct mixed-up indentation.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:05:37 +02:00
Perry Yuan
4f59540c3c cpufreq: amd-pstate: simplify cpudata pointer assignment
Move the cpudata assignment to the cpudata declaration, which
simplifies the functions.

No functional change intended.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31 21:05:37 +02:00
Perry Yuan
a2a9d18500 ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers
don't need to check that separately.  This will also cause the AMD
pstate driver to refuse to load right away when ACPI is disabled.

Also update the warning message in amd_pstate_init() to mention the
ACPI disabled case for completeness.

Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-25 13:55:17 +02:00
Jinzhou Su
b376471fb4 cpufreq: amd-pstate: Add resume and suspend callbacks
When the system resumes from S3, the CPPC enable register is cleared
and reset to 0.

So enable the CPPC interface by writing 1 to this register on
system resume and disable it during system suspend.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject and changelog edits ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-23 21:19:52 +02:00
Jinzhou Su
23c296fb7e cpufreq: amd-pstate: Add more tracepoint for AMD P-State module
Add frequency, mperf, aperf and tsc in the trace. This can be used
to debug and tune the performance of AMD P-state driver.

Use the time difference between successive amd_pstate_update() calls to
calculate the CPU frequency. arch_freq_get_on_cpu() may sleep, so do
not use it here.
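
One common way to derive an effective frequency from two counter
samples is the APERF/MPERF ratio, sketched below in user-space C. The
struct, the function name and the base frequency parameter are
hypothetical, and this is not necessarily the exact computation the
driver performs.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two counters at one update. */
struct perf_sample {
	uint64_t aperf;
	uint64_t mperf;
};

/*
 * Effective frequency over the interval between two samples:
 * base_khz * delta(aperf) / delta(mperf). Returns 0 when mperf did
 * not advance. The product may overflow for very long intervals;
 * this sketch does not guard against that.
 */
uint64_t effective_khz(struct perf_sample prev, struct perf_sample cur,
		       uint64_t base_khz)
{
	uint64_t d_aperf = cur.aperf - prev.aperf;
	uint64_t d_mperf = cur.mperf - prev.mperf;

	if (d_mperf == 0)
		return 0;

	return d_aperf * base_khz / d_mperf;
}
```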

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Co-developed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-09 19:53:01 +01:00
Yang Li
bdc4fd3d48 cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment
Add the description of @req and @boost_supported in struct amd_cpudata
kernel-doc comment to remove warnings found by running scripts/kernel-doc,
which is caused by using 'make W=1'.

drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member
'req' not described in 'amd_cpudata'
drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member
'boost_supported' not described in 'amd_cpudata'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-06 18:28:26 +01:00
Huang Rui
3ad7fde16a cpufreq: amd-pstate: Add AMD P-State performance attributes
Introduce sysfs attributes that expose the AMD P-State performance
levels.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:40 +01:00
Huang Rui
ec4e3326a9 cpufreq: amd-pstate: Add AMD P-State frequencies attributes
Introduce sysfs attributes that expose the processor frequency
levels.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:40 +01:00
Huang Rui
41271016df cpufreq: amd-pstate: Add boost mode support for AMD P-State
If the SBIOS supports the boost mode of AMD P-State, enable boost by
default.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:39 +01:00
Huang Rui
60e10f896d cpufreq: amd-pstate: Add trace for AMD P-State module
Add a trace event to monitor the performance value changes, which are
controlled by the cpu governors.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:39 +01:00
Huang Rui
e059c184da cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution
Some Zen2 and Zen3 based processors use shared memory exposed by the
ACPI SBIOS. These processors have no MSR support, so add the ACPI CPPC
functions as the backend for them.

A module param (shared_mem) is used to enable the related processors
manually. It will be enabled by default once the performance issue
with this solution is addressed.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:39 +01:00
Huang Rui
1d215f0319 cpufreq: amd-pstate: Add fast switch function for AMD P-State
Introduce the fast switch function for AMD P-State on the AMD processors
which support the full MSR register control. This decreases latency in
interrupt context.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:39 +01:00
Huang Rui
ec437d71db cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors
AMD P-State is the AMD CPU performance scaling driver that introduces a
new CPU frequency control mechanism for the AMD Zen based CPU series in
the Linux kernel. The new mechanism is based on Collaborative Processor
Performance Control (CPPC), which provides finer-grained frequency
management than legacy ACPI hardware P-States. Current AMD CPU platforms
use the ACPI P-states driver to manage CPU frequency and clocks,
switching among only 3 P-states. AMD P-State replaces the ACPI P-states
controls and provides a flexible, low-latency interface for the Linux
kernel to communicate performance hints directly to the hardware.

AMD P-State leverages the Linux kernel governors such as *schedutil*,
*ondemand*, etc. to manage the performance hints which are provided by CPPC
hardware functionality. The first version of AMD P-State supports one of
the Zen3 processors; more will be supported in the future after the
hardware and SBIOS functionalities are verified.

There are two types of hardware implementations for AMD P-State: one is full
MSR support and the other is shared memory support. The X86_FEATURE_CPPC
feature flag can be used to distinguish the two types.
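
From user space, the same distinction can be approximated by looking
for the corresponding flag in a "flags" line of /proc/cpuinfo. The
sketch below assumes the flag is spelled "cppc" there; both function
names are hypothetical. Note that a missing flag only means there is no
MSR interface; it does not by itself prove that the shared memory
interface is usable.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if flag appears as a whole word in a flags string. */
bool has_cpu_flag(const char *flags, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/* "cppc" present: MSR backend; otherwise fall back to shared memory. */
const char *cppc_backend(const char *flags)
{
	return has_cpu_flag(flags, "cppc") ? "msr" : "shared-memory";
}
```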

Using the new AMD P-State method together with the kernel governors
(*schedutil*, *ondemand*, ...) to manage frequency updates is the most
appropriate bridge between AMD Zen based hardware and the Linux kernel:
the processor is able to adjust to the most efficient frequency
according to the kernel scheduler load.

Please check the detailed CPU feature and MSR register description in
Processor Programming Reference (PPR) for AMD Family 19h Model 51h,
Revision A1 Processors:

https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30 18:51:39 +01:00