2023-01-06 06:17:47

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 00/12] Implement AMD Pstate EPP Driver

Hi all,

This patchset implements one new AMD CPU frequency driver
`amd-pstate-epp` instance for better performance and power control.
CPPC has a parameter called energy preference performance (EPP).
The EPP is used in the CCLK DPM controller to drive the frequency that a core
is going to operate during short periods of activity.
EPP values will be utilized for different OS profiles (balanced, performance, power savings).

AMD Energy Performance Preference (EPP) provides a hint to the hardware
if software wants to bias toward performance (0x0) or energy efficiency (0xff)
The lowlevel power firmware will calculate the runtime frequency according to the EPP preference
value. So the EPP hint will impact the CPU cores frequency responsiveness.

We use the RAPL interface with "perf" tool to get the energy data of the package power.
Performance Per Watt (PPW) Calculation:

The PPW calculation is referred by below paper:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Fperformance-per-what-paper.pdf&data=04%7C01%7CPerry.Yuan%40amd.com%7Cac66e8ce98044e9b062708d9ab47c8d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637729147708574423%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=TPOvCE%2Frbb0ptBreWNxHqOi9YnVhcHGKG88vviDLb00%3D&reserved=0

Below formula is referred from below spec to measure the PPW:

(F / t) / P = F * t / (t * E) = F / E,

"F" is the number of frames per second.
"P" is power measured in watts.
"E" is energy measured in joules.

Gitsouce Benchmark Data on ROME Server CPU
+------------------------------+------------------------------+------------+------------------+
| Kernel Module | PPW (1 / s * J) |Energy(J) | PPW Improvement (%)|
+==============================+==============================+============+==================+
| acpi-cpufreq:schedutil | 5.85658E-05 | 17074.8 | base |
+------------------------------+------------------------------+------------+------------------+
| acpi-cpufreq:ondemand | 5.03079E-05 | 19877.6 | -14.10% |
+------------------------------+------------------------------+------------+------------------+
| acpi-cpufreq:performance | 5.88132E-05 | 17003 | 0.42% |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:ondemand | 4.60295E-05 | 21725.2 | -21.41% |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:schedutil | 4.70026E-05 | 21275.4 | -19.7% |
+------------------------------+------------------------------+------------+------------------+
| amd-pstate:performance | 5.80094E-05 | 17238.6 | -0.95% |
+------------------------------+------------------------------+------------+------------------+
| EPP:performance | 5.8292E-05 | 17155 | -0.47% |
+------------------------------+------------------------------+------------+------------------+
| EPP: balance performance: | 6.71709E-05 | 14887.4 | 14.69% |
+------------------------------+------------------------------+------------+------------------+
| EPP:power | 6.66951E-05 | 4993.6 | 13.88% |
+------------------------------+------------------------------+------------+------------------+

Tbench Benchmark Data on ROME Server CPU
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| Kernel Module | PPW MB / (s * J) |Throughput(MB/s)| Energy (J)|PPW Improvement(%)|
+=============================================+===================+==============+=============+==================+
| acpi_cpufreq: schedutil | 46.39 | 17191 | 37057.3 | base |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| acpi_cpufreq: ondemand | 51.51 | 19269.5 | 37406.5 | 11.04 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| acpi_cpufreq: performance | 45.96 | 17063.7 | 37123.7 | -0.74 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: performance(0) | 54.46 | 20263.1 | 37205 | 17.87 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: balance performance | 55.03 | 20481.9 | 37221.5 | 19.14 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: balance_power | 54.43 | 20245.9 | 37194.2 | 17.77 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| EPP:powersave: power(255) | 54.26 | 20181.7 | 37197.4 | 17.40 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: schedutil | 48.22 | 17844.9 | 37006.6 | 3.80 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: ondemand | 61.30 | 22988 | 37503.4 | 33.72 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+
| amd-pstate: performance | 54.52 | 20252.6 | 37147.8 | 17.81 % |
+---------------------------------------------+-------------------+--------------+-------------+------------------+

changes from v9:
* pick up R-B flas added by Mario
* pick up R-B flag added by Wyes
* rename "default_pstate_driver" to "current_pstate_driver"
* update status string to "disable" when driver disabled and update size check
* move the common macro definition of EPP to cppc library files for intel_pstate and amd_pstate common using
* fix some doc words for new global sysfs patch

changes from v8:
* drive all the feedbacks from Mario and change the codes in this
version
* drive all the feedbacks from Ray and change the codes in this
version
* pick up all the R-B flags from Mario
* pick up all the R-B flags from Ray
* drop boost/refresh_freq_limits callback
* reuse policy->driver_data to store amd_cpudata struct
* use switch-case in the driver mode switching function
* add Kconfig dependency the INTEL_PSTATE for AMD_PSTATE build
* fix some other code format and typos

changes from v7:
* remove iowait boost functions code
* pick up ack by flag from Huang Ray.
* add one new patch to support multiple working modes in the amd_pstate_param(),aligned with Wyse
* drop the patch "[v7 08/13] cpufreq: amd-pstate: add frequency dynamic boost sysfs control"
* replace the cppc_get_epp_caps() with new cppc_get_epp_perf() wihch is
more simple to use
* remove I/O wait boost code from amd_pstate_update_status()
* replace cppc_active var with enum type AMD_PSTATE_ACTIVE
* squash amd_pstate_epp_verify_policy() into sigle function
* remove "amd pstate" string from the pr_err, pr_debug logs info
* rework patch [v7 03/13], move the common EPP profiles declaration
into cpufreq.h which will be used by amd-pstate and intel-pstate
* combine amd psate init functions.
* remove epp_powersave from amd-pstate.h and dropping the codes.
* move amd_pstate_params{} from amd-pstate.h into amd-pstate.c
* drive some other feedbacks from huang ray

changes from v6:
* fix one legacy kernel hang issue when amd-pstate driver unregistering
* add new documentation to introduce new global sysfs attributes
* use sysfs_emit_at() to print epp profiles array
* update commit info for patch v6 patch 1/11 as Mario sugguested.
* trying to add the EPP profiles into cpufreq.h, but it will cause lots
of build failues,continue to keep cpufreq_common.h used in v7
* update commit info using amd-pstate as prefix same as before.
* remove CONFIG_ACPI for the header as Ray suggested.
* move amd_pstate_kobj to where it is used in patch "add frequency dynamic boost sysfs control"
* drive feedback removing X86_FEATURE_CPPC check for the epp init from Huang Ray
* drive feedback from Mario

change from v5:
* add one common header `cpufreq_commoncpufreq_common` to extract EPP profiles
definition for amd and intel pstate driver.
* remove the epp_off value to avoid confusion.
* convert some other sysfs sprintf() function with sysfs_emit() and add onew new patch
* add acpi pm server priofile detection to enable dynamic boost control
* fix some code format with checkpatch script
* move the EPP profile declaration into common header file `cpufreq_common.h`
* fix commit typos

changes from v4:
* rebase driver based on the v6.1-rc7
* remove the builtin changes patch because pstate driver has been
changed to builtin type by another thread patch
* update Documentation: amd-pstate: add amd pstate driver mode introduction
* replace sprintf with sysfs_emit() instead.
* fix typo for cppc_set_epp_perf() in cppc_acpi.h header

changes from v3:
* add one more document update patch for the active and passive mode
introducion.
* drive most of the feedbacks from Mario
* drive feedback from Rafael for the cppc_acpi driver.
* remove the epp raw data set/get function
* set the amd-pstate drive by passing kernel parameter
* set amd-pstate driver disabled by default if no kernel parameter
input from booting
* get cppc_set_auto_epp and cppc_set_epp_perf combined
* pick up reviewed by flag from Mario

changes from v2:
* change pstate driver as builtin type from module
* drop patch "export cpufreq cpu release and acquire"
* squash patch of shared mem into single patch of epp implementation
* add one new patch to support frequency boost control
* add patch to expose driver working status checking
* rebase driver into v6.1-rc4 kernel release
* move some declaration to amd-pstate.h
* drive feedback from Mario for the online/offline patch
* drive feedback from Mario for the suspend/resume patch
* drive feedback from Ray for the cppc_acpi and some other patches
* drive feedback from Nathan for the epp patch

changes from v1:
* rebased to v6.0
* drive feedbacks from Mario for the suspend/resume patch
* drive feedbacks from Nathan for the EPP support on msr type
* fix some typos and code style indent problems
* update commit comments for patch 4/7
* change the `epp_enabled` module param name to `epp`
* set the default epp mode to be false
* add testing for the x86_energy_perf_policy utility patchset(will
send that utility patchset with another thread)

v9: https://lore.kernel.org/lkml/[email protected]/
v8: https://lore.kernel.org/lkml/[email protected]/
v7: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v3: https://lore.kernel.org/all/[email protected]/
v2: https://lore.kernel.org/all/[email protected]/
v1: https://lore.kernel.org/all/[email protected]/

Perry Yuan (11):
ACPI: CPPC: Add AMD pstate energy performance preference cppc control
Documentation: amd-pstate: add EPP profiles introduction
cpufreq: intel_pstate: use common macro definition for Energy
Preference Performance(EPP)
cpufreq: amd-pstate: implement Pstate EPP support for the AMD
processors
cpufreq: amd-pstate: implement amd pstate cpu online and offline
callback
cpufreq: amd-pstate: implement suspend and resume callbacks
cpufreq: amd-pstate: add driver working mode switch support
Documentation: amd-pstate: add amd pstate driver mode introduction
Documentation: introduce amd pstate active mode kernel command line
options
cpufreq: amd-pstate: convert sprintf with sysfs_emit()
Documentation: amd-pstate: introduce new global sysfs attributes

Wyes Karny (1):
cpufreq: amd-pstate: optimize driver working mode selection in
amd_pstate_param()

.../admin-guide/kernel-parameters.txt | 7 +
Documentation/admin-guide/pm/amd-pstate.rst | 74 +-
drivers/acpi/cppc_acpi.c | 96 +++
drivers/cpufreq/amd-pstate.c | 659 +++++++++++++++++-
drivers/cpufreq/intel_pstate.c | 36 -
include/acpi/cppc_acpi.h | 22 +
include/linux/amd-pstate.h | 28 +
7 files changed, 865 insertions(+), 57 deletions(-)

--
2.34.1


2023-01-06 06:17:50

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 02/12] Documentation: amd-pstate: add EPP profiles introduction

From: Perry Yuan <[email protected]>

The amd-pstate driver supports a feature called energy performance
preference (EPP). Add information to the documentation to explain
how users can interact with the sysfs files for this feature.

1) See all EPP profiles
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

2) Check current EPP profile
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance

3) Set new EPP profile
$ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 06e23538f79c..33ab8ec8fc2f 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.

+``energy_performance_available_preferences``
+
+A list of all the supported EPP preferences that could be used for
+``energy_performance_preference`` on this system.
+These profiles represent different hints that are provided
+to the low-level firmware about the user's desired energy vs efficiency
+tradeoff. ``default`` represents the epp value is set by platform
+firmware. This attribute is read-only.
+
+``energy_performance_preference``
+
+The current energy performance preference can be read from this attribute.
+and user can change current preference according to energy or performance needs
+Please get all support profiles list from
+``energy_performance_available_preferences`` attribute, all the profiles are
+integer values defined between 0 to 255 when EPP feature is enabled by platform
+firmware, if EPP feature is disabled, driver will ignore the written value
+This attribute is read-write.
+
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.

--
2.34.1

2023-01-06 06:18:11

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 04/12] cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()

From: Wyes Karny <[email protected]>

The amd-pstate driver may support multiple working modes.
Introduce a variable to keep track of which mode is currently enabled.
Here we use cppc_state var to indicate which mode is enabled.
This change will help to simplify the the amd_pstate_param() to choose
which mode used for the following driver registration.

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
Signed-off-by: Wyes Karny <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 39 +++++++++++++++++++++++++++---------
include/linux/amd-pstate.h | 17 ++++++++++++++++
2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 204e39006dda..1a8b31277620 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -60,7 +60,18 @@
* module parameter to be able to enable it manually for debugging.
*/
static struct cpufreq_driver amd_pstate_driver;
-static int cppc_load __initdata;
+static int cppc_state = AMD_PSTATE_DISABLE;
+
+static inline int get_mode_idx_from_str(const char *str, size_t size)
+{
+ int i;
+
+ for (i=0; i < AMD_PSTATE_MAX; i++) {
+ if (!strncmp(str, amd_pstate_mode_string[i], size))
+ return i;
+ }
+ return -EINVAL;
+}

static inline int pstate_enable(bool enable)
{
@@ -625,10 +636,10 @@ static int __init amd_pstate_init(void)
/*
* by default the pstate driver is disabled to load
* enable the amd_pstate passive mode driver explicitly
- * with amd_pstate=passive in kernel command line
+ * with amd_pstate=passive or other modes in kernel command line
*/
- if (!cppc_load) {
- pr_debug("driver load is disabled, boot with amd_pstate=passive to enable this\n");
+ if (cppc_state == AMD_PSTATE_DISABLE) {
+ pr_debug("driver load is disabled, boot with specific mode to enable this\n");
return -ENODEV;
}

@@ -670,16 +681,24 @@ device_initcall(amd_pstate_init);

static int __init amd_pstate_param(char *str)
{
+ size_t size;
+ int mode_idx;
+
if (!str)
return -EINVAL;

- if (!strcmp(str, "disable")) {
- cppc_load = 0;
- pr_info("driver is explicitly disabled\n");
- } else if (!strcmp(str, "passive"))
- cppc_load = 1;
+ size = strlen(str);
+ mode_idx = get_mode_idx_from_str(str, size);

- return 0;
+ if (mode_idx >= AMD_PSTATE_DISABLE && mode_idx < AMD_PSTATE_MAX) {
+ cppc_state = mode_idx;
+ if (cppc_state == AMD_PSTATE_DISABLE)
+ pr_info("driver is explicitly disabled\n");
+
+ return 0;
+ }
+
+ return -EINVAL;
}
early_param("amd_pstate", amd_pstate_param);

diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 1c4b8659f171..dae2ce0f6735 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -74,4 +74,21 @@ struct amd_cpudata {
bool boost_supported;
};

+/*
+ * enum amd_pstate_mode - driver working mode of amd pstate
+ */
+enum amd_pstate_mode {
+ AMD_PSTATE_DISABLE = 0,
+ AMD_PSTATE_PASSIVE,
+ AMD_PSTATE_ACTIVE,
+ AMD_PSTATE_MAX,
+};
+
+static const char * const amd_pstate_mode_string[] = {
+ [AMD_PSTATE_DISABLE] = "disable",
+ [AMD_PSTATE_PASSIVE] = "passive",
+ [AMD_PSTATE_ACTIVE] = "active",
+ NULL,
+};
+
#endif /* _LINUX_AMD_PSTATE_H */
--
2.34.1

2023-01-06 06:18:19

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors

From: Perry Yuan <[email protected]>

Add EPP driver support for AMD SoCs which support a dedicated MSR for
CPPC. EPP is used by the DPM controller to configure the frequency that
a core operates at during short periods of activity.

The SoC EPP targets are configured on a scale from 0 to 255 where 0
represents maximum performance and 255 represents maximum efficiency.

The amd-pstate driver exports profile string names to userspace that are
tied to specific EPP values.

The balance_performance string (0x80) provides the best balance for
efficiency versus power on most systems, but users can choose other
strings to meet their needs as well.

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
default performance balance_performance balance_power power

$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
balance_performance

To enable the driver,it needs to add `amd_pstate=active` to kernel
command line and kernel will load the active mode epp driver

Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 374 ++++++++++++++++++++++++++++++++++-
include/linux/amd-pstate.h | 10 +
2 files changed, 378 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 1a8b31277620..9a92c71af80f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -59,7 +59,9 @@
* we disable it by default to go acpi-cpufreq on these processors and add a
* module parameter to be able to enable it manually for debugging.
*/
+static struct cpufreq_driver *current_pstate_driver;
static struct cpufreq_driver amd_pstate_driver;
+static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_DISABLE;

static inline int get_mode_idx_from_str(const char *str, size_t size)
@@ -73,6 +75,114 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
return -EINVAL;
}

+static DEFINE_MUTEX(amd_pstate_limits_lock);
+static DEFINE_MUTEX(amd_pstate_driver_lock);
+
+static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
+{
+ u64 epp;
+ int ret;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ if (!cppc_req_cached) {
+ epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
+ &cppc_req_cached);
+ if (epp)
+ return epp;
+ }
+ epp = (cppc_req_cached >> 24) & 0xFF;
+ } else {
+ ret = cppc_get_epp_perf(cpudata->cpu, &epp);
+ if (ret < 0) {
+ pr_debug("Could not retrieve energy perf value (%d)\n", ret);
+ return -EIO;
+ }
+ }
+
+ return (s16)(epp & 0xff);
+}
+
+static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
+{
+ s16 epp;
+ int index = -EINVAL;
+
+ epp = amd_pstate_get_epp(cpudata, 0);
+ if (epp < 0)
+ return epp;
+
+ switch (epp) {
+ case HWP_EPP_PERFORMANCE:
+ index = EPP_INDEX_PERFORMANCE;
+ break;
+ case HWP_EPP_BALANCE_PERFORMANCE:
+ index = EPP_INDEX_BALANCE_PERFORMANCE;
+ break;
+ case HWP_EPP_BALANCE_POWERSAVE:
+ index = EPP_INDEX_BALANCE_POWERSAVE;
+ break;
+ case HWP_EPP_POWERSAVE:
+ index = EPP_INDEX_POWERSAVE;
+ break;
+ default:
+ break;
+ }
+
+ return index;
+}
+
+static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
+{
+ int ret;
+ struct cppc_perf_ctrls perf_ctrls;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ u64 value = READ_ONCE(cpudata->cppc_req_cached);
+
+ value &= ~GENMASK_ULL(31, 24);
+ value |= (u64)epp << 24;
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+ ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+ if (!ret)
+ cpudata->epp_cached = epp;
+ } else {
+ perf_ctrls.energy_perf = epp;
+ ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
+ if (ret) {
+ pr_debug("failed to set energy perf value (%d)\n", ret);
+ return ret;
+ }
+ cpudata->epp_cached = epp;
+ }
+
+ return ret;
+}
+
+static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
+ int pref_index)
+{
+ int epp = -EINVAL;
+ int ret;
+
+ if (!pref_index) {
+ pr_debug("EPP pref_index is invalid\n");
+ return -EINVAL;
+ }
+
+ if (epp == -EINVAL)
+ epp = epp_values[pref_index];
+
+ if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ pr_debug("EPP cannot be set under performance policy\n");
+ return -EBUSY;
+ }
+
+ ret = amd_pstate_set_epp(cpudata, epp);
+
+ return ret;
+}
+
static inline int pstate_enable(bool enable)
{
return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
@@ -81,11 +191,21 @@ static inline int pstate_enable(bool enable)
static int cppc_enable(bool enable)
{
int cpu, ret = 0;
+ struct cppc_perf_ctrls perf_ctrls;

for_each_present_cpu(cpu) {
ret = cppc_set_enable(cpu, enable);
if (ret)
return ret;
+
+ /* Enable autonomous mode for EPP */
+ if (cppc_state == AMD_PSTATE_ACTIVE) {
+ /* Set desired perf as zero to allow EPP firmware control */
+ perf_ctrls.desired_perf = 0;
+ ret = cppc_set_perf(cpu, &perf_ctrls);
+ if (ret)
+ return ret;
+ }
}

return ret;
@@ -428,7 +548,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
return;

cpudata->boost_supported = true;
- amd_pstate_driver.boost_enabled = true;
+ current_pstate_driver->boost_enabled = true;
}

static void amd_perf_ctl_reset(unsigned int cpu)
@@ -602,10 +722,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
return sprintf(&buf[0], "%u\n", perf);
}

+static ssize_t show_energy_performance_available_preferences(
+ struct cpufreq_policy *policy, char *buf)
+{
+ int i = 0;
+ int offset = 0;
+
+ while (energy_perf_strings[i] != NULL)
+ offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
+
+ sysfs_emit_at(buf, offset, "\n");
+
+ return offset;
+}
+
+static ssize_t store_energy_performance_preference(
+ struct cpufreq_policy *policy, const char *buf, size_t count)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ char str_preference[21];
+ ssize_t ret;
+
+ ret = sscanf(buf, "%20s", str_preference);
+ if (ret != 1)
+ return -EINVAL;
+
+ ret = match_string(energy_perf_strings, -1, str_preference);
+ if (ret < 0)
+ return -EINVAL;
+
+ mutex_lock(&amd_pstate_limits_lock);
+ ret = amd_pstate_set_energy_pref_index(cpudata, ret);
+ mutex_unlock(&amd_pstate_limits_lock);
+
+ return ret ?: count;
+}
+
+static ssize_t show_energy_performance_preference(
+ struct cpufreq_policy *policy, char *buf)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ int preference;
+
+ preference = amd_pstate_get_energy_pref_index(cpudata);
+ if (preference < 0)
+ return preference;
+
+ return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
+}
+
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);

cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_rw(energy_performance_preference);
+cpufreq_freq_attr_ro(energy_performance_available_preferences);

static struct freq_attr *amd_pstate_attr[] = {
&amd_pstate_max_freq,
@@ -614,6 +785,181 @@ static struct freq_attr *amd_pstate_attr[] = {
NULL,
};

+static struct freq_attr *amd_pstate_epp_attr[] = {
+ &amd_pstate_max_freq,
+ &amd_pstate_lowest_nonlinear_freq,
+ &amd_pstate_highest_perf,
+ &energy_performance_preference,
+ &energy_performance_available_preferences,
+ NULL,
+};
+
+static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
+{
+ int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
+ struct amd_cpudata *cpudata;
+ struct device *dev;
+ int rc;
+ u64 value;
+
+ dev = get_cpu_device(policy->cpu);
+ if (!dev)
+ goto free_cpudata1;
+
+ cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
+ if (!cpudata)
+ return -ENOMEM;
+
+ cpudata->cpu = policy->cpu;
+ cpudata->epp_policy = 0;
+
+ rc = amd_pstate_init_perf(cpudata);
+ if (rc)
+ goto free_cpudata1;
+
+ min_freq = amd_get_min_freq(cpudata);
+ max_freq = amd_get_max_freq(cpudata);
+ nominal_freq = amd_get_nominal_freq(cpudata);
+ lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
+ dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
+ min_freq, max_freq);
+ ret = -EINVAL;
+ goto free_cpudata1;
+ }
+
+ policy->cpuinfo.min_freq = min_freq;
+ policy->cpuinfo.max_freq = max_freq;
+ /* It will be updated by governor */
+ policy->cur = policy->cpuinfo.min_freq;
+
+ /* Initial processor data capability frequencies */
+ cpudata->max_freq = max_freq;
+ cpudata->min_freq = min_freq;
+ cpudata->nominal_freq = nominal_freq;
+ cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
+
+ policy->driver_data = cpudata;
+
+ cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
+
+ policy->min = policy->cpuinfo.min_freq;
+ policy->max = policy->cpuinfo.max_freq;
+
+ /*
+ * Set the policy to powersave to provide a valid fallback value in case
+ * the default cpufreq governor is neither powersave nor performance.
+ */
+ policy->policy = CPUFREQ_POLICY_POWERSAVE;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ policy->fast_switch_possible = true;
+ ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
+ if (ret)
+ return ret;
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+ ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
+ if (ret)
+ return ret;
+ WRITE_ONCE(cpudata->cppc_cap1_cached, value);
+ }
+ amd_pstate_boost_init(cpudata);
+
+ return 0;
+
+free_cpudata1:
+ kfree(cpudata);
+ return ret;
+}
+
+static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
+{
+ pr_debug("CPU %d exiting\n", policy->cpu);
+ policy->fast_switch_possible = false;
+ return 0;
+}
+
+static void amd_pstate_epp_init(unsigned int cpu)
+{
+ struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+ struct amd_cpudata *cpudata = policy->driver_data;
+ u32 max_perf, min_perf;
+ u64 value;
+ s16 epp;
+
+ max_perf = READ_ONCE(cpudata->highest_perf);
+ min_perf = READ_ONCE(cpudata->lowest_perf);
+
+ value = READ_ONCE(cpudata->cppc_req_cached);
+
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ min_perf = max_perf;
+
+ /* Initial min/max values for CPPC Performance Controls Register */
+ value &= ~AMD_CPPC_MIN_PERF(~0L);
+ value |= AMD_CPPC_MIN_PERF(min_perf);
+
+ value &= ~AMD_CPPC_MAX_PERF(~0L);
+ value |= AMD_CPPC_MAX_PERF(max_perf);
+
+ /* CPPC EPP feature require to set zero to the desire perf bit */
+ value &= ~AMD_CPPC_DES_PERF(~0L);
+ value |= AMD_CPPC_DES_PERF(0);
+
+ if (cpudata->epp_policy == cpudata->policy)
+ goto skip_epp;
+
+ cpudata->epp_policy = cpudata->policy;
+
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ epp = amd_pstate_get_epp(cpudata, value);
+ if (epp < 0)
+ goto skip_epp;
+ /* force the epp value to be zero for performance policy */
+ epp = 0;
+ } else {
+ /* Get BIOS pre-defined epp value */
+ epp = amd_pstate_get_epp(cpudata, value);
+ if (epp)
+ goto skip_epp;
+ }
+ /* Set initial EPP value */
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ value &= ~GENMASK_ULL(31, 24);
+ value |= (u64)epp << 24;
+ }
+
+skip_epp:
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+ amd_pstate_set_epp(cpudata, epp);
+ cpufreq_cpu_put(policy);
+}
+
+static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ if (!policy->cpuinfo.max_freq)
+ return -ENODEV;
+
+ pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
+ policy->cpuinfo.max_freq, policy->max);
+
+ cpudata->policy = policy->policy;
+
+ amd_pstate_epp_init(policy->cpu);
+
+ return 0;
+}
+
+static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
+{
+ cpufreq_verify_within_cpu_limits(policy);
+ pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
+ return 0;
+}
+
static struct cpufreq_driver amd_pstate_driver = {
.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
.verify = amd_pstate_verify,
@@ -627,6 +973,16 @@ static struct cpufreq_driver amd_pstate_driver = {
.attr = amd_pstate_attr,
};

+static struct cpufreq_driver amd_pstate_epp_driver = {
+ .flags = CPUFREQ_CONST_LOOPS,
+ .verify = amd_pstate_epp_verify_policy,
+ .setpolicy = amd_pstate_epp_set_policy,
+ .init = amd_pstate_epp_cpu_init,
+ .exit = amd_pstate_epp_cpu_exit,
+ .name = "amd_pstate_epp",
+ .attr = amd_pstate_epp_attr,
+};
+
static int __init amd_pstate_init(void)
{
int ret;
@@ -655,7 +1011,8 @@ static int __init amd_pstate_init(void)
/* capability check */
if (boot_cpu_has(X86_FEATURE_CPPC)) {
pr_debug("AMD CPPC MSR based functionality is supported\n");
- amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
+ if (cppc_state == AMD_PSTATE_PASSIVE)
+ current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
} else {
pr_debug("AMD CPPC shared memory based functionality is supported\n");
static_call_update(amd_pstate_enable, cppc_enable);
@@ -666,14 +1023,13 @@ static int __init amd_pstate_init(void)
/* enable amd pstate feature */
ret = amd_pstate_enable(true);
if (ret) {
- pr_err("failed to enable amd-pstate with return %d\n", ret);
+ pr_err("failed to enable with return %d\n", ret);
return ret;
}

- ret = cpufreq_register_driver(&amd_pstate_driver);
+ ret = cpufreq_register_driver(current_pstate_driver);
if (ret)
- pr_err("failed to register amd_pstate_driver with return %d\n",
- ret);
+ pr_err("failed to register with return %d\n", ret);

return ret;
}
@@ -695,6 +1051,12 @@ static int __init amd_pstate_param(char *str)
if (cppc_state == AMD_PSTATE_DISABLE)
pr_info("driver is explicitly disabled\n");

+ if (cppc_state == AMD_PSTATE_ACTIVE)
+ current_pstate_driver = &amd_pstate_epp_driver;
+
+ if (cppc_state == AMD_PSTATE_PASSIVE)
+ current_pstate_driver = &amd_pstate_driver;
+
return 0;
}

diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index dae2ce0f6735..8341a2a2948a 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -47,6 +47,10 @@ struct amd_aperf_mperf {
* @prev: Last Aperf/Mperf/tsc count value read from register
* @freq: current cpu frequency value
* @boost_supported: check whether the Processor or SBIOS supports boost mode
+ * @epp_policy: Last saved policy used to set energy-performance preference
+ * @epp_cached: Cached CPPC energy-performance preference value
+ * @policy: Cpufreq policy value
+ * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
*
* The amd_cpudata is key private data for each CPU thread in AMD P-State, and
* represents all the attributes and goals that AMD P-State requests at runtime.
@@ -72,6 +76,12 @@ struct amd_cpudata {

u64 freq;
bool boost_supported;
+
+ /* EPP feature related attributes*/
+ s16 epp_policy;
+ s16 epp_cached;
+ u32 policy;
+ u64 cppc_cap1_cached;
};

/*
--
2.34.1

2023-01-06 06:25:02

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 09/12] Documentation: amd-pstate: add amd pstate driver mode introduction

From: Perry Yuan <[email protected]>

The amd-pstate driver has two operation modes supported:
* CPPC Autonomous (active) mode
* CPPC non-autonomous (passive) mode.
active mode and passive mode can be chosen by different kernel parameters.

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/pm/amd-pstate.rst | 26 +++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 33ab8ec8fc2f..62744dae3c5f 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -299,8 +299,30 @@ module which supports the new AMD P-States mechanism on most of the future AMD
platforms. The AMD P-States mechanism is the more performance and energy
efficiency frequency management method on AMD processors.

-Kernel Module Options for ``amd-pstate``
-=========================================
+
+AMD Pstate Driver Operation Modes
+=================================
+
+``amd_pstate`` CPPC has two operation modes: CPPC Autonomous(active) mode and
+CPPC non-autonomous(passive) mode.
+active mode and passive mode can be chosen by different kernel parameters.
+When in Autonomous mode, CPPC ignores requests done in the Desired Performance
+Target register and takes into account only the values set to the Minimum requested
+performance, Maximum requested performance, and Energy Performance Preference
+registers. When Autonomous is disabled, it only considers the Desired Performance Target.
+
+Active Mode
+------------
+
+``amd_pstate=active``
+
+This is the low-level firmware control mode which is implemented by ``amd_pstate_epp``
+driver with ``amd_pstate=active`` passed to the kernel in the command line.
+In this mode, ``amd_pstate_epp`` driver provides a hint to the hardware if software
+wants to bias toward performance (0x0) or energy efficiency (0xff) to the CPPC firmware.
+then CPPC power algorithm will calculate the runtime workload and adjust the realtime
+cores frequency according to the power supply and thermal, core voltage and some other
+hardware conditions.

Passive Mode
------------
--
2.34.1

2023-01-06 06:30:32

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 07/12] cpufreq: amd-pstate: implement suspend and resume callbacks

From: Perry Yuan <[email protected]>

add suspend and resume support for the AMD processors by amd_pstate_epp
driver instance.

When the CPPC is suspended, EPP driver will set EPP profile to 'power'
profile and set max/min perf to lowest perf value.
When resume happens, it will restore the MSR registers with
previous cached value.

Acked-by: Huang Rui <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 40 ++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ae6d2bf18e16..2c8b126cec5a 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1040,6 +1040,44 @@ static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
return 0;
}

+static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ int ret;
+
+ /* avoid suspending when EPP is not enabled */
+ if (cppc_state != AMD_PSTATE_ACTIVE)
+ return 0;
+
+ /* set this flag to avoid setting core offline*/
+ cpudata->suspended = true;
+
+ /* disable CPPC in lowlevel firmware */
+ ret = amd_pstate_enable(false);
+ if (ret)
+ pr_err("failed to suspend, return %d\n", ret);
+
+ return 0;
+}
+
+static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ if (cpudata->suspended) {
+ mutex_lock(&amd_pstate_limits_lock);
+
+ /* enable amd pstate from suspend state*/
+ amd_pstate_epp_reenable(cpudata);
+
+ mutex_unlock(&amd_pstate_limits_lock);
+
+ cpudata->suspended = false;
+ }
+
+ return 0;
+}
+
static struct cpufreq_driver amd_pstate_driver = {
.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
.verify = amd_pstate_verify,
@@ -1061,6 +1099,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.exit = amd_pstate_epp_cpu_exit,
.offline = amd_pstate_epp_cpu_offline,
.online = amd_pstate_epp_cpu_online,
+ .suspend = amd_pstate_epp_suspend,
+ .resume = amd_pstate_epp_resume,
.name = "amd_pstate_epp",
.attr = amd_pstate_epp_attr,
};
--
2.34.1

2023-01-06 06:32:15

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 06/12] cpufreq: amd-pstate: implement amd pstate cpu online and offline callback

From: Perry Yuan <[email protected]>

Adds online and offline driver callback support to allow cpu cores go
offline and help to restore the previous working states when core goes
back online later for EPP driver mode.

Acked-by: Huang Rui <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 82 ++++++++++++++++++++++++++++++++++++
include/linux/amd-pstate.h | 1 +
2 files changed, 83 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 9a92c71af80f..ae6d2bf18e16 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -953,6 +953,86 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
return 0;
}

+static void amd_pstate_epp_reenable(struct amd_cpudata *cpudata)
+{
+ struct cppc_perf_ctrls perf_ctrls;
+ u64 value, max_perf;
+ int ret;
+
+ ret = amd_pstate_enable(true);
+ if (ret)
+ pr_err("failed to enable amd pstate during resume, return %d\n", ret);
+
+ value = READ_ONCE(cpudata->cppc_req_cached);
+ max_perf = READ_ONCE(cpudata->highest_perf);
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+ } else {
+ perf_ctrls.max_perf = max_perf;
+ perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(cpudata->epp_cached);
+ cppc_set_perf(cpudata->cpu, &perf_ctrls);
+ }
+}
+
+static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
+
+ if (cppc_state == AMD_PSTATE_ACTIVE) {
+ amd_pstate_epp_reenable(cpudata);
+ cpudata->suspended = false;
+ }
+
+ return 0;
+}
+
+static void amd_pstate_epp_offline(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ struct cppc_perf_ctrls perf_ctrls;
+ int min_perf;
+ u64 value;
+
+ min_perf = READ_ONCE(cpudata->lowest_perf);
+ value = READ_ONCE(cpudata->cppc_req_cached);
+
+ mutex_lock(&amd_pstate_limits_lock);
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ cpudata->epp_policy = CPUFREQ_POLICY_UNKNOWN;
+
+ /* Set max perf same as min perf */
+ value &= ~AMD_CPPC_MAX_PERF(~0L);
+ value |= AMD_CPPC_MAX_PERF(min_perf);
+ value &= ~AMD_CPPC_MIN_PERF(~0L);
+ value |= AMD_CPPC_MIN_PERF(min_perf);
+ wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+ } else {
+ perf_ctrls.desired_perf = 0;
+ perf_ctrls.max_perf = min_perf;
+ perf_ctrls.energy_perf = AMD_CPPC_ENERGY_PERF_PREF(HWP_EPP_BALANCE_POWERSAVE);
+ cppc_set_perf(cpudata->cpu, &perf_ctrls);
+ }
+ mutex_unlock(&amd_pstate_limits_lock);
+}
+
+static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ pr_debug("AMD CPU Core %d going offline\n", cpudata->cpu);
+
+ if (cpudata->suspended)
+ return 0;
+
+ if (cppc_state == AMD_PSTATE_ACTIVE)
+ amd_pstate_epp_offline(policy);
+
+ return 0;
+}
+
static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
{
cpufreq_verify_within_cpu_limits(policy);
@@ -979,6 +1059,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.setpolicy = amd_pstate_epp_set_policy,
.init = amd_pstate_epp_cpu_init,
.exit = amd_pstate_epp_cpu_exit,
+ .offline = amd_pstate_epp_cpu_offline,
+ .online = amd_pstate_epp_cpu_online,
.name = "amd_pstate_epp",
.attr = amd_pstate_epp_attr,
};
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 8341a2a2948a..15761a581e82 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -82,6 +82,7 @@ struct amd_cpudata {
s16 epp_cached;
u32 policy;
u64 cppc_cap1_cached;
+ bool suspended;
};

/*
--
2.34.1

2023-01-06 06:56:25

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 12/12] Documentation: amd-pstate: introduce new global sysfs attributes

The amd-pstate driver supports switching working modes at runtime.
Users can view and change modes by interacting with the "status" sysfs
attribute.

1) check driver mode:
$ cat /sys/devices/system/cpu/amd-pstate/status

2) switch mode:
`# echo "passive" | sudo tee /sys/devices/system/cpu/amd-pstate/status`
or
`# echo "active" | sudo tee /sys/devices/system/cpu/amd-pstate/status`

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/pm/amd-pstate.rst | 29 +++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 62744dae3c5f..df0637a49f47 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -339,6 +339,35 @@ processor must provide at least nominal performance requested and go higher if c
operating conditions allow.


+User Space Interface in ``sysfs``
+=================================
+
+Global Attributes
+-----------------
+
+``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
+control its functionality at the system level. They are located in the
+``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.
+
+``status``
+ Operation mode of the driver: "active", "passive" or "disable".
+
+ "active"
+ The driver is functional and in the ``active mode``
+
+ "passive"
+ The driver is functional and in the ``passive mode``
+
+ "disable"
+ The driver is unregistered and not functional now.
+
+ This attribute can be written to in order to change the driver's
+ operation mode or to unregister it. The string written to it must be
+ one of the possible values of it and, if successful, writing one of
+ these values to the sysfs file will cause the driver to switch over
+ to the operation mode represented by that string - or to be
+ unregistered in the "disable" case.
+
``cpupower`` tool support for ``amd-pstate``
===============================================

--
2.34.1

2023-01-06 07:01:18

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 08/12] cpufreq: amd-pstate: add driver working mode switch support

From: Perry Yuan <[email protected]>

While amd-pstate driver was loaded with specific driver mode, it will
need to check which mode is enabled for the pstate driver,add this sysfs
entry to show the current status

$ cat /sys/devices/system/cpu/amd-pstate/status
active

Meanwhile, user can switch the pstate driver mode with writing mode
string to sysfs entry as below.

Enable passive mode:
$ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"

Enable active mode (EPP driver mode):
$ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 118 +++++++++++++++++++++++++++++++++++
1 file changed, 118 insertions(+)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 2c8b126cec5a..d3f4cdd803f3 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -63,6 +63,7 @@ static struct cpufreq_driver *current_pstate_driver;
static struct cpufreq_driver amd_pstate_driver;
static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_DISABLE;
+struct kobject *amd_pstate_kobj;

static inline int get_mode_idx_from_str(const char *str, size_t size)
{
@@ -631,6 +632,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
policy->driver_data = cpudata;

amd_pstate_boost_init(cpudata);
+ if (!current_pstate_driver->adjust_perf)
+ current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;

return 0;

@@ -771,12 +774,99 @@ static ssize_t show_energy_performance_preference(
return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
}

+static ssize_t amd_pstate_show_status(char *buf)
+{
+ if (!current_pstate_driver)
+ return sysfs_emit(buf, "disable\n");
+
+ return sysfs_emit(buf, "%s\n", amd_pstate_mode_string[cppc_state]);
+}
+
+static void amd_pstate_driver_cleanup(void)
+{
+ current_pstate_driver = NULL;
+}
+
+static int amd_pstate_update_status(const char *buf, size_t size)
+{
+ int ret;
+ int mode_idx;
+
+ if (size > 7 || size < 6)
+ return -EINVAL;
+ mode_idx = get_mode_idx_from_str(buf, size);
+
+ switch(mode_idx) {
+ case AMD_PSTATE_DISABLE:
+ if (!current_pstate_driver)
+ return -EINVAL;
+ if (cppc_state == AMD_PSTATE_ACTIVE)
+ return -EBUSY;
+ ret = cpufreq_unregister_driver(current_pstate_driver);
+ amd_pstate_driver_cleanup();
+ break;
+ case AMD_PSTATE_PASSIVE:
+ if (current_pstate_driver) {
+ if (current_pstate_driver == &amd_pstate_driver)
+ return 0;
+ cpufreq_unregister_driver(current_pstate_driver);
+ cppc_state = AMD_PSTATE_PASSIVE;
+ current_pstate_driver = &amd_pstate_driver;
+ }
+
+ ret = cpufreq_register_driver(current_pstate_driver);
+ break;
+ case AMD_PSTATE_ACTIVE:
+ if (current_pstate_driver) {
+ if (current_pstate_driver == &amd_pstate_epp_driver)
+ return 0;
+ cpufreq_unregister_driver(current_pstate_driver);
+ current_pstate_driver = &amd_pstate_epp_driver;
+ cppc_state = AMD_PSTATE_ACTIVE;
+ }
+
+ ret = cpufreq_register_driver(current_pstate_driver);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+static ssize_t show_status(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ ssize_t ret;
+
+ mutex_lock(&amd_pstate_driver_lock);
+ ret = amd_pstate_show_status(buf);
+ mutex_unlock(&amd_pstate_driver_lock);
+
+ return ret;
+}
+
+static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
+ const char *buf, size_t count)
+{
+ char *p = memchr(buf, '\n', count);
+ int ret;
+
+ mutex_lock(&amd_pstate_driver_lock);
+ ret = amd_pstate_update_status(buf, p ? p - buf : count);
+ mutex_unlock(&amd_pstate_driver_lock);
+
+ return ret < 0 ? ret : count;
+}
+
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);

cpufreq_freq_attr_ro(amd_pstate_highest_perf);
cpufreq_freq_attr_rw(energy_performance_preference);
cpufreq_freq_attr_ro(energy_performance_available_preferences);
+define_one_global_rw(status);

static struct freq_attr *amd_pstate_attr[] = {
&amd_pstate_max_freq,
@@ -794,6 +884,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
NULL,
};

+static struct attribute *pstate_global_attributes[] = {
+ &status.attr,
+ NULL
+};
+
+static const struct attribute_group amd_pstate_global_attr_group = {
+ .attrs = pstate_global_attributes,
+};
+
static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -1153,6 +1252,25 @@ static int __init amd_pstate_init(void)
if (ret)
pr_err("failed to register with return %d\n", ret);

+ amd_pstate_kobj = kobject_create_and_add("amd_pstate", &cpu_subsys.dev_root->kobj);
+ if (!amd_pstate_kobj) {
+ ret = -EINVAL;
+ pr_err("global sysfs registration failed.\n");
+ goto kobject_free;
+ }
+
+ ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
+ if (ret) {
+ pr_err("sysfs attribute export failed with error %d.\n", ret);
+ goto global_attr_free;
+ }
+
+ return ret;
+
+global_attr_free:
+ kobject_put(amd_pstate_kobj);
+kobject_free:
+ cpufreq_unregister_driver(current_pstate_driver);
return ret;
}
device_initcall(amd_pstate_init);
--
2.34.1

2023-01-06 07:03:42

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 11/12] cpufreq: amd-pstate: convert sprintf with sysfs_emit()

replace the sprintf with a more generic sysfs_emit function

No intended potential function impact

Acked-by: Huang Rui <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index d3f4cdd803f3..ea4a4a0c80be 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -694,7 +694,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
if (max_freq < 0)
return max_freq;

- return sprintf(&buf[0], "%u\n", max_freq);
+ return sysfs_emit(buf, "%u\n", max_freq);
}

static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
@@ -707,7 +707,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
if (freq < 0)
return freq;

- return sprintf(&buf[0], "%u\n", freq);
+ return sysfs_emit(buf, "%u\n", freq);
}

/*
@@ -722,7 +722,7 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,

perf = READ_ONCE(cpudata->highest_perf);

- return sprintf(&buf[0], "%u\n", perf);
+ return sysfs_emit(buf, "%u\n", perf);
}

static ssize_t show_energy_performance_available_preferences(
--
2.34.1

2023-01-06 07:17:49

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 10/12] Documentation: introduce amd pstate active mode kernel command line options

AMD Pstate driver support another firmware based autonomous mode
with "amd_pstate=active" added to the kernel command line.
In autonomous mode SMU firmware decides frequencies at runtime
based on workload utilization, usage in other IPs, infrastructure
limits such as power, thermals and so on.

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 42af9ca0127e..73a02816f6f8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6970,3 +6970,10 @@
management firmware translates the requests into actual
hardware states (core frequency, data fabric and memory
clocks etc.)
+ active
+ Use amd_pstate_epp driver instance as the scaling driver,
+ driver provides a hint to the hardware if software wants
+ to bias toward performance (0x0) or energy efficiency (0xff)
+ to the CPPC firmware. then CPPC power algorithm will
+ calculate the runtime workload and adjust the realtime cores
+ frequency.
--
2.34.1

2023-01-06 07:19:23

by Yuan, Perry

[permalink] [raw]
Subject: [PATCH v10 03/12] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)

make the energy preference performance strings and profiles using one
common header for intel_pstate driver, then the amd_pstate epp driver can
use the common header as well. This will simpify the intel_pstate and
amd_pstate driver.

Signed-off-by: Perry Yuan <[email protected]>
---
drivers/acpi/cppc_acpi.c | 29 +++++++++++++++++++++++++++
drivers/cpufreq/intel_pstate.c | 36 ----------------------------------
include/acpi/cppc_acpi.h | 10 ++++++++++
3 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 0ce6c55a76ca..c7aa190ec2d2 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -42,6 +42,35 @@

#include <acpi/cppc_acpi.h>

+/*
+ * EPP/EPB display strings corresponding to EPP index in the
+ * energy_perf_strings[]
+ * index String
+ *-------------------------------------
+ * 0 default
+ * 1 performance
+ * 2 balance_performance
+ * 3 balance_power
+ * 4 power
+ */
+
+const char * const energy_perf_strings[] = {
+ [EPP_INDEX_DEFAULT] = "default",
+ [EPP_INDEX_PERFORMANCE] = "performance",
+ [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
+ [EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
+ [EPP_INDEX_POWERSAVE] = "power",
+ NULL
+};
+
+unsigned int epp_values[] = {
+ [EPP_INDEX_DEFAULT] = 0, /* Unused index */
+ [EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
+ [EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
+ [EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
+ [EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
+};
+
struct cppc_pcc_data {
struct pcc_mbox_chan *pcc_channel;
void __iomem *pcc_comm_addr;
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ad9be31753b6..01431e279be6 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -628,42 +628,6 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
return 0;
}

-/*
- * EPP/EPB display strings corresponding to EPP index in the
- * energy_perf_strings[]
- * index String
- *-------------------------------------
- * 0 default
- * 1 performance
- * 2 balance_performance
- * 3 balance_power
- * 4 power
- */
-
-enum energy_perf_value_index {
- EPP_INDEX_DEFAULT = 0,
- EPP_INDEX_PERFORMANCE,
- EPP_INDEX_BALANCE_PERFORMANCE,
- EPP_INDEX_BALANCE_POWERSAVE,
- EPP_INDEX_POWERSAVE,
-};
-
-static const char * const energy_perf_strings[] = {
- [EPP_INDEX_DEFAULT] = "default",
- [EPP_INDEX_PERFORMANCE] = "performance",
- [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
- [EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
- [EPP_INDEX_POWERSAVE] = "power",
- NULL
-};
-static unsigned int epp_values[] = {
- [EPP_INDEX_DEFAULT] = 0, /* Unused index */
- [EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
- [EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
- [EPP_INDEX_BALANCE_POWERSAVE] = HWP_EPP_BALANCE_POWERSAVE,
- [EPP_INDEX_POWERSAVE] = HWP_EPP_POWERSAVE,
-};
-
static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw_epp)
{
s16 epp;
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 6b487a5bd638..891e5add0e49 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -135,6 +135,16 @@ struct cppc_cpudata {
cpumask_var_t shared_cpu_map;
};

+enum energy_perf_value_index {
+ EPP_INDEX_DEFAULT = 0,
+ EPP_INDEX_PERFORMANCE,
+ EPP_INDEX_BALANCE_PERFORMANCE,
+ EPP_INDEX_BALANCE_POWERSAVE,
+ EPP_INDEX_POWERSAVE,
+};
+extern const char * const energy_perf_strings[];
+extern unsigned int epp_values[];
+
#ifdef CONFIG_ACPI_CPPC_LIB
extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
--
2.34.1

2023-01-06 10:25:48

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 03/12] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)

Hi Perry,

I love your patch! Yet something to improve:

[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on linus/master v6.2-rc2 next-20230106]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230106061420.95715-4-perry.yuan%40amd.com
patch subject: [PATCH v10 03/12] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
config: x86_64-randconfig-a004
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/f43a51c9a7f805b5fd089878fafe7cf9b06b68f0
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
git checkout f43a51c9a7f805b5fd089878fafe7cf9b06b68f0
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 olddefconfig
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

ld: vmlinux.o: in function `intel_pstate_update_epp_defaults':
>> drivers/cpufreq/intel_pstate.c:1629: undefined reference to `epp_values'
>> ld: drivers/cpufreq/intel_pstate.c:1629: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:1640: undefined reference to `epp_values'
>> ld: drivers/cpufreq/intel_pstate.c:1629: undefined reference to `epp_values'
ld: vmlinux.o: in function `show_energy_performance_preference':
>> drivers/cpufreq/intel_pstate.c:812: undefined reference to `energy_perf_strings'
>> ld: drivers/cpufreq/intel_pstate.c:812: undefined reference to `energy_perf_strings'
ld: vmlinux.o: in function `intel_pstate_get_energy_pref_index':
drivers/cpufreq/intel_pstate.c:628: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:628: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:630: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:630: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:632: undefined reference to `epp_values'
ld: vmlinux.o:drivers/cpufreq/intel_pstate.c:632: more undefined references to `epp_values' follow
ld: vmlinux.o: in function `show_energy_performance_available_preferences':
drivers/cpufreq/intel_pstate.c:721: undefined reference to `energy_perf_strings'
ld: drivers/cpufreq/intel_pstate.c:721: undefined reference to `energy_perf_strings'
ld: drivers/cpufreq/intel_pstate.c:721: undefined reference to `energy_perf_strings'
ld: drivers/cpufreq/intel_pstate.c:721: undefined reference to `energy_perf_strings'
ld: drivers/cpufreq/intel_pstate.c:721: undefined reference to `energy_perf_strings'
ld: vmlinux.o:drivers/cpufreq/intel_pstate.c:746: more undefined references to `energy_perf_strings' follow
ld: vmlinux.o: in function `store_energy_performance_preference':
drivers/cpufreq/intel_pstate.c:781: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:781: undefined reference to `epp_values'
ld: vmlinux.o: in function `intel_pstate_set_energy_pref_index':
drivers/cpufreq/intel_pstate.c:695: undefined reference to `epp_values'
ld: drivers/cpufreq/intel_pstate.c:695: undefined reference to `epp_values'
ld: vmlinux.o: in function `intel_pstate_init':
drivers/cpufreq/intel_pstate.c:3421: undefined reference to `epp_values'
ld: vmlinux.o:drivers/cpufreq/intel_pstate.c:3421: more undefined references to `epp_values' follow


vim +1629 drivers/cpufreq/intel_pstate.c

57577c996d731c Srinivas Pandruvada 2021-09-28 1620
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1621 static void intel_pstate_update_epp_defaults(struct cpudata *cpudata)
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1622 {
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1623 cpudata->epp_default = intel_pstate_get_epp(cpudata, 0);
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1624
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1625 /*
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1626 * If this CPU gen doesn't call for change in balance_perf
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1627 * EPP return.
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1628 */
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 @1629 if (epp_values[EPP_INDEX_BALANCE_PERFORMANCE] == HWP_EPP_BALANCE_PERFORMANCE)
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1630 return;
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1631
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1632 /*
21cdb6c18f85fe Srinivas Pandruvada 2022-10-27 1633 * If the EPP is set by firmware, which means that firmware enabled HWP
21cdb6c18f85fe Srinivas Pandruvada 2022-10-27 1634 * - Is equal or less than 0x80 (default balance_perf EPP)
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1635 * - But less performance oriented than performance EPP
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1636 * then use this as new balance_perf EPP.
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1637 */
21cdb6c18f85fe Srinivas Pandruvada 2022-10-27 1638 if (hwp_forced && cpudata->epp_default <= HWP_EPP_BALANCE_PERFORMANCE &&
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1639 cpudata->epp_default > HWP_EPP_PERFORMANCE) {
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1640 epp_values[EPP_INDEX_BALANCE_PERFORMANCE] = cpudata->epp_default;
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1641 return;
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1642 }
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1643
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1644 /*
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1645 * Use hard coded value per gen to update the balance_perf
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1646 * and default EPP.
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1647 */
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1648 cpudata->epp_default = epp_values[EPP_INDEX_BALANCE_PERFORMANCE];
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1649 intel_pstate_set_epp(cpudata, cpudata->epp_default);
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1650 }
3d13058ed2a6d0 Srinivas Pandruvada 2022-03-10 1651

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Attachments:
(No filename) (6.85 kB)
config (142.86 kB)
Download all attachments

2023-01-06 10:39:36

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 03/12] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)

Hi Perry,

I love your patch! Yet something to improve:

[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on linus/master v6.2-rc2 next-20230106]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230106061420.95715-4-perry.yuan%40amd.com
patch subject: [PATCH v10 03/12] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
config: i386-randconfig-a002
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/f43a51c9a7f805b5fd089878fafe7cf9b06b68f0
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
git checkout f43a51c9a7f805b5fd089878fafe7cf9b06b68f0
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: epp_values
>>> referenced by intel_pstate.c:628 (drivers/cpufreq/intel_pstate.c:628)
>>> drivers/cpufreq/intel_pstate.o:(show_energy_performance_preference) in archive vmlinux.a
>>> referenced by intel_pstate.c:630 (drivers/cpufreq/intel_pstate.c:630)
>>> drivers/cpufreq/intel_pstate.o:(show_energy_performance_preference) in archive vmlinux.a
>>> referenced by intel_pstate.c:632 (drivers/cpufreq/intel_pstate.c:632)
>>> drivers/cpufreq/intel_pstate.o:(show_energy_performance_preference) in archive vmlinux.a
>>> referenced 6 more times
--
>> ld.lld: error: undefined symbol: energy_perf_strings
>>> referenced by intel_pstate.c:812 (drivers/cpufreq/intel_pstate.c:812)
>>> drivers/cpufreq/intel_pstate.o:(show_energy_performance_preference) in archive vmlinux.a
>>> referenced by intel_pstate.c:746 (drivers/cpufreq/intel_pstate.c:746)
>>> drivers/cpufreq/intel_pstate.o:(store_energy_performance_preference) in archive vmlinux.a
>>> referenced by intel_pstate.c:721 (drivers/cpufreq/intel_pstate.c:721)
>>> drivers/cpufreq/intel_pstate.o:(show_energy_performance_available_preferences) in archive vmlinux.a
>>> referenced 1 more times

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Attachments:
(No filename) (3.31 kB)
config (171.96 kB)
Download all attachments

2023-01-06 11:44:27

by Wyes Karny

[permalink] [raw]
Subject: Re: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors

Hi Perry,

On 1/6/2023 11:44 AM, Perry Yuan wrote:
----------------------------------->8------------------------------------------
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> + struct amd_cpudata *cpudata;
> + struct device *dev;
> + int rc;
> + u64 value;

Please call `amd_perf_ctl_reset` function here, otherwise amd_pstate would
behave unexpectedly.

> +
> + dev = get_cpu_device(policy->cpu);
> + if (!dev)
> + goto free_cpudata1;
> +
> + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> + if (!cpudata)
> + return -ENOMEM;
> +
> + cpudata->cpu = policy->cpu;
> + cpudata->epp_policy = 0;
> +
> + rc = amd_pstate_init_perf(cpudata);
> + if (rc)
> + goto free_cpudata1;
> +
> + min_freq = amd_get_min_freq(cpudata);
> + max_freq = amd_get_max_freq(cpudata);
> + nominal_freq = amd_get_nominal_freq(cpudata);
> + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> + min_freq, max_freq);
> + ret = -EINVAL;
> + goto free_cpudata1;
> + }
> +
> + policy->cpuinfo.min_freq = min_freq;
> + policy->cpuinfo.max_freq = max_freq;
> + /* It will be updated by governor */
> + policy->cur = policy->cpuinfo.min_freq;
> +
> + /* Initial processor data capability frequencies */
> + cpudata->max_freq = max_freq;
> + cpudata->min_freq = min_freq;
> + cpudata->nominal_freq = nominal_freq;
> + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> + policy->driver_data = cpudata;
> +
> + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> + policy->min = policy->cpuinfo.min_freq;
> + policy->max = policy->cpuinfo.max_freq;
> +
> + /*
> + * Set the policy to powersave to provide a valid fallback value in case
> + * the default cpufreq governor is neither powersave nor performance.
> + */
> + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + policy->fast_switch_possible = true;
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> + }
> + amd_pstate_boost_init(cpudata);
> +
> + return 0;
> +
> +free_cpudata1:
> + kfree(cpudata);
> + return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> + pr_debug("CPU %d exiting\n", policy->cpu);
> + policy->fast_switch_possible = false;
> + return 0;
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> + struct amd_cpudata *cpudata = policy->driver_data;
> + u32 max_perf, min_perf;
> + u64 value;
> + s16 epp;
> +
> + max_perf = READ_ONCE(cpudata->highest_perf);
> + min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> + value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + min_perf = max_perf;
> +
> + /* Initial min/max values for CPPC Performance Controls Register */
> + value &= ~AMD_CPPC_MIN_PERF(~0L);
> + value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> + value &= ~AMD_CPPC_MAX_PERF(~0L);
> + value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> + /* CPPC EPP feature require to set zero to the desire perf bit */
> + value &= ~AMD_CPPC_DES_PERF(~0L);
> + value |= AMD_CPPC_DES_PERF(0);
> +
> + if (cpudata->epp_policy == cpudata->policy)
> + goto skip_epp;
> +
> + cpudata->epp_policy = cpudata->policy;
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp < 0)
> + goto skip_epp;
> + /* force the epp value to be zero for performance policy */
> + epp = 0;
> + } else {
> + /* Get BIOS pre-defined epp value */
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp)
> + goto skip_epp;
> + }
> + /* Set initial EPP value */
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + }
> +
> +skip_epp:
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> + amd_pstate_set_epp(cpudata, epp);
> + cpufreq_cpu_put(policy);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> +
> + if (!policy->cpuinfo.max_freq)
> + return -ENODEV;
> +
> + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> + policy->cpuinfo.max_freq, policy->max);
> +
> + cpudata->policy = policy->policy;
> +
> + amd_pstate_epp_init(policy->cpu);
> +
> + return 0;
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> + cpufreq_verify_within_cpu_limits(policy);
> + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> + return 0;
> +}
> +
> static struct cpufreq_driver amd_pstate_driver = {
> .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
> .verify = amd_pstate_verify,
> @@ -627,6 +973,16 @@ static struct cpufreq_driver amd_pstate_driver = {
> .attr = amd_pstate_attr,
> };
>
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> + .flags = CPUFREQ_CONST_LOOPS,
> + .verify = amd_pstate_epp_verify_policy,
> + .setpolicy = amd_pstate_epp_set_policy,
> + .init = amd_pstate_epp_cpu_init,
> + .exit = amd_pstate_epp_cpu_exit,
> + .name = "amd_pstate_epp",
> + .attr = amd_pstate_epp_attr,
> +};
> +
> static int __init amd_pstate_init(void)
> {
> int ret;
> @@ -655,7 +1011,8 @@ static int __init amd_pstate_init(void)
> /* capability check */
> if (boot_cpu_has(X86_FEATURE_CPPC)) {
> pr_debug("AMD CPPC MSR based functionality is supported\n");
> - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> } else {
> pr_debug("AMD CPPC shared memory based functionality is supported\n");
> static_call_update(amd_pstate_enable, cppc_enable);
> @@ -666,14 +1023,13 @@ static int __init amd_pstate_init(void)
> /* enable amd pstate feature */
> ret = amd_pstate_enable(true);
> if (ret) {
> - pr_err("failed to enable amd-pstate with return %d\n", ret);
> + pr_err("failed to enable with return %d\n", ret);
> return ret;
> }
>
> - ret = cpufreq_register_driver(&amd_pstate_driver);
> + ret = cpufreq_register_driver(current_pstate_driver);
> if (ret)
> - pr_err("failed to register amd_pstate_driver with return %d\n",
> - ret);
> + pr_err("failed to register with return %d\n", ret);
>
> return ret;
> }
> @@ -695,6 +1051,12 @@ static int __init amd_pstate_param(char *str)
> if (cppc_state == AMD_PSTATE_DISABLE)
> pr_info("driver is explicitly disabled\n");
>
> + if (cppc_state == AMD_PSTATE_ACTIVE)
> + current_pstate_driver = &amd_pstate_epp_driver;
> +
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + current_pstate_driver = &amd_pstate_driver;
> +
> return 0;
> }
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index dae2ce0f6735..8341a2a2948a 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> * @prev: Last Aperf/Mperf/tsc count value read from register
> * @freq: current cpu frequency value
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
> *
> * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +76,12 @@ struct amd_cpudata {
>
> u64 freq;
> bool boost_supported;
> +
> + /* EPP feature related attributes*/
> + s16 epp_policy;
> + s16 epp_cached;
> + u32 policy;
> + u64 cppc_cap1_cached;
> };
>
> /*

--
Thanks & Regards,
Wyes

2023-01-10 15:36:29

by Yuan, Perry

[permalink] [raw]
Subject: RE: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors

[AMD Official Use Only - General]

Hi Wyes.

> -----Original Message-----
> From: Karny, Wyes <[email protected]>
> Sent: Friday, January 6, 2023 7:19 PM
> To: Yuan, Perry <[email protected]>; [email protected];
> Limonciello, Mario <[email protected]>; Huang, Ray
> <[email protected]>; [email protected]
> Cc: Sharma, Deepak <[email protected]>; Fontenot, Nathan
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP
> support for the AMD processors
>
> Hi Perry,
>
> On 1/6/2023 11:44 AM, Perry Yuan wrote:
> ----------------------------------->8-----------------------------------
> ----------------------------------->-------
> > +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) {
> > + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > + struct amd_cpudata *cpudata;
> > + struct device *dev;
> > + int rc;
> > + u64 value;
>
> Please call `amd_perf_ctl_reset` function here, otherwise amd_pstate would
> behave unexpectedly.

Let me add this reset function for epp driver in v11 with another Kconfig fix together.
Thank you.

Perry.

>
> > +
> > + dev = get_cpu_device(policy->cpu);
> > + if (!dev)
> > + goto free_cpudata1;
> > +
> > + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> > + if (!cpudata)
> > + return -ENOMEM;
> > +
> > + cpudata->cpu = policy->cpu;
> > + cpudata->epp_policy = 0;
> > +
> > + rc = amd_pstate_init_perf(cpudata);
> > + if (rc)
> > + goto free_cpudata1;
> > +
> > + min_freq = amd_get_min_freq(cpudata);
> > + max_freq = amd_get_max_freq(cpudata);
> > + nominal_freq = amd_get_nominal_freq(cpudata);
> > + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> > + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > + dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > + min_freq, max_freq);
> > + ret = -EINVAL;
> > + goto free_cpudata1;
> > + }
> > +
> > + policy->cpuinfo.min_freq = min_freq;
> > + policy->cpuinfo.max_freq = max_freq;
> > + /* It will be updated by governor */
> > + policy->cur = policy->cpuinfo.min_freq;
> > +
> > + /* Initial processor data capability frequencies */
> > + cpudata->max_freq = max_freq;
> > + cpudata->min_freq = min_freq;
> > + cpudata->nominal_freq = nominal_freq;
> > + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > +
> > + policy->driver_data = cpudata;
> > +
> > + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> > +
> > + policy->min = policy->cpuinfo.min_freq;
> > + policy->max = policy->cpuinfo.max_freq;
> > +
> > + /*
> > + * Set the policy to powersave to provide a valid fallback value in case
> > + * the default cpufreq governor is neither powersave nor
> performance.
> > + */
> > + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> > +
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + policy->fast_switch_possible = true;
> > + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> &value);
> > + if (ret)
> > + return ret;
> > + WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &value);
> > + if (ret)
> > + return ret;
> > + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> > + }
> > + amd_pstate_boost_init(cpudata);
> > +
> > + return 0;
> > +
> > +free_cpudata1:
> > + kfree(cpudata);
> > + return ret;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy) {
> > + pr_debug("CPU %d exiting\n", policy->cpu);
> > + policy->fast_switch_possible = false;
> > + return 0;
> > +}
> > +
> > +static void amd_pstate_epp_init(unsigned int cpu) {
> > + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > + struct amd_cpudata *cpudata = policy->driver_data;
> > + u32 max_perf, min_perf;
> > + u64 value;
> > + s16 epp;
> > +
> > + max_perf = READ_ONCE(cpudata->highest_perf);
> > + min_perf = READ_ONCE(cpudata->lowest_perf);
> > +
> > + value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> > + min_perf = max_perf;
> > +
> > + /* Initial min/max values for CPPC Performance Controls Register */
> > + value &= ~AMD_CPPC_MIN_PERF(~0L);
> > + value |= AMD_CPPC_MIN_PERF(min_perf);
> > +
> > + value &= ~AMD_CPPC_MAX_PERF(~0L);
> > + value |= AMD_CPPC_MAX_PERF(max_perf);
> > +
> > + /* CPPC EPP feature require to set zero to the desire perf bit */
> > + value &= ~AMD_CPPC_DES_PERF(~0L);
> > + value |= AMD_CPPC_DES_PERF(0);
> > +
> > + if (cpudata->epp_policy == cpudata->policy)
> > + goto skip_epp;
> > +
> > + cpudata->epp_policy = cpudata->policy;
> > +
> > + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > + epp = amd_pstate_get_epp(cpudata, value);
> > + if (epp < 0)
> > + goto skip_epp;
> > + /* force the epp value to be zero for performance policy */
> > + epp = 0;
> > + } else {
> > + /* Get BIOS pre-defined epp value */
> > + epp = amd_pstate_get_epp(cpudata, value);
> > + if (epp)
> > + goto skip_epp;
> > + }
> > + /* Set initial EPP value */
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + value &= ~GENMASK_ULL(31, 24);
> > + value |= (u64)epp << 24;
> > + }
> > +
> > +skip_epp:
> > + WRITE_ONCE(cpudata->cppc_req_cached, value);
> > + amd_pstate_set_epp(cpudata, epp);
> > + cpufreq_cpu_put(policy);
> > +}
> > +
> > +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy) {
> > + struct amd_cpudata *cpudata = policy->driver_data;
> > +
> > + if (!policy->cpuinfo.max_freq)
> > + return -ENODEV;
> > +
> > + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> > + policy->cpuinfo.max_freq, policy->max);
> > +
> > + cpudata->policy = policy->policy;
> > +
> > + amd_pstate_epp_init(policy->cpu);
> > +
> > + return 0;
> > +}
> > +
> > +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data
> > +*policy) {
> > + cpufreq_verify_within_cpu_limits(policy);
> > + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy-
> >min);
> > + return 0;
> > +}
> > +
> > static struct cpufreq_driver amd_pstate_driver = {
> > .flags = CPUFREQ_CONST_LOOPS |
> CPUFREQ_NEED_UPDATE_LIMITS,
> > .verify = amd_pstate_verify,
> > @@ -627,6 +973,16 @@ static struct cpufreq_driver amd_pstate_driver = {
> > .attr = amd_pstate_attr,
> > };
> >
> > +static struct cpufreq_driver amd_pstate_epp_driver = {
> > + .flags = CPUFREQ_CONST_LOOPS,
> > + .verify = amd_pstate_epp_verify_policy,
> > + .setpolicy = amd_pstate_epp_set_policy,
> > + .init = amd_pstate_epp_cpu_init,
> > + .exit = amd_pstate_epp_cpu_exit,
> > + .name = "amd_pstate_epp",
> > + .attr = amd_pstate_epp_attr,
> > +};
> > +
> > static int __init amd_pstate_init(void) {
> > int ret;
> > @@ -655,7 +1011,8 @@ static int __init amd_pstate_init(void)
> > /* capability check */
> > if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > pr_debug("AMD CPPC MSR based functionality is
> supported\n");
> > - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> > + if (cppc_state == AMD_PSTATE_PASSIVE)
> > + current_pstate_driver->adjust_perf =
> amd_pstate_adjust_perf;
> > } else {
> > pr_debug("AMD CPPC shared memory based functionality is
> supported\n");
> > static_call_update(amd_pstate_enable, cppc_enable); @@ -
> 666,14
> > +1023,13 @@ static int __init amd_pstate_init(void)
> > /* enable amd pstate feature */
> > ret = amd_pstate_enable(true);
> > if (ret) {
> > - pr_err("failed to enable amd-pstate with return %d\n", ret);
> > + pr_err("failed to enable with return %d\n", ret);
> > return ret;
> > }
> >
> > - ret = cpufreq_register_driver(&amd_pstate_driver);
> > + ret = cpufreq_register_driver(current_pstate_driver);
> > if (ret)
> > - pr_err("failed to register amd_pstate_driver with
> return %d\n",
> > - ret);
> > + pr_err("failed to register with return %d\n", ret);
> >
> > return ret;
> > }
> > @@ -695,6 +1051,12 @@ static int __init amd_pstate_param(char *str)
> > if (cppc_state == AMD_PSTATE_DISABLE)
> > pr_info("driver is explicitly disabled\n");
> >
> > + if (cppc_state == AMD_PSTATE_ACTIVE)
> > + current_pstate_driver = &amd_pstate_epp_driver;
> > +
> > + if (cppc_state == AMD_PSTATE_PASSIVE)
> > + current_pstate_driver = &amd_pstate_driver;
> > +
> > return 0;
> > }
> >
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index dae2ce0f6735..8341a2a2948a 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> > * @prev: Last Aperf/Mperf/tsc count value read from register
> > * @freq: current cpu frequency value
> > * @boost_supported: check whether the Processor or SBIOS supports
> > boost mode
> > + * @epp_policy: Last saved policy used to set energy-performance
> > + preference
> > + * @epp_cached: Cached CPPC energy-performance preference value
> > + * @policy: Cpufreq policy value
> > + * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
> > *
> > * The amd_cpudata is key private data for each CPU thread in AMD P-
> State, and
> > * represents all the attributes and goals that AMD P-State requests at
> runtime.
> > @@ -72,6 +76,12 @@ struct amd_cpudata {
> >
> > u64 freq;
> > bool boost_supported;
> > +
> > + /* EPP feature related attributes*/
> > + s16 epp_policy;
> > + s16 epp_cached;
> > + u32 policy;
> > + u64 cppc_cap1_cached;
> > };
> >
> > /*
>
> --
> Thanks & Regards,
> Wyes

2023-01-10 18:12:11

by Wyes Karny

[permalink] [raw]
Subject: Re: [PATCH v10 00/12] Implement AMD Pstate EPP Driver



On 1/6/2023 11:44 AM, Perry Yuan wrote:
> Hi all,
>
> This patchset implements one new AMD CPU frequency driver
> `amd-pstate-epp` instance for better performance and power control.
> CPPC has a parameter called energy preference performance (EPP).
> The EPP is used in the CCLK DPM controller to drive the frequency that a core
> is going to operate during short periods of activity.
> EPP values will be utilized for different OS profiles (balanced, performance, power savings).
>
> AMD Energy Performance Preference (EPP) provides a hint to the hardware
> if software wants to bias toward performance (0x0) or energy efficiency (0xff)
> The lowlevel power firmware will calculate the runtime frequency according to the EPP preference
> value. So the EPP hint will impact the CPU cores frequency responsiveness.
>
> We use the RAPL interface with "perf" tool to get the energy data of the package power.
> Performance Per Watt (PPW) Calculation:
>
> The PPW calculation is referred by below paper:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Fperformance-per-what-paper.pdf&amp;data=04%7C01%7CPerry.Yuan%40amd.com%7Cac66e8ce98044e9b062708d9ab47c8d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637729147708574423%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=TPOvCE%2Frbb0ptBreWNxHqOi9YnVhcHGKG88vviDLb00%3D&amp;reserved=0
>
> Below formula is referred from below spec to measure the PPW:
>
> (F / t) / P = F * t / (t * E) = F / E,
>
> "F" is the number of frames per second.
> "P" is power measured in watts.
> "E" is energy measured in joules.
>
> Gitsouce Benchmark Data on ROME Server CPU
> +------------------------------+------------------------------+------------+------------------+
> | Kernel Module | PPW (1 / s * J) |Energy(J) | PPW Improvement (%)|
> +==============================+==============================+============+==================+
> | acpi-cpufreq:schedutil | 5.85658E-05 | 17074.8 | base |
> +------------------------------+------------------------------+------------+------------------+
> | acpi-cpufreq:ondemand | 5.03079E-05 | 19877.6 | -14.10% |
> +------------------------------+------------------------------+------------+------------------+
> | acpi-cpufreq:performance | 5.88132E-05 | 17003 | 0.42% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:ondemand | 4.60295E-05 | 21725.2 | -21.41% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:schedutil | 4.70026E-05 | 21275.4 | -19.7% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:performance | 5.80094E-05 | 17238.6 | -0.95% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP:performance | 5.8292E-05 | 17155 | -0.47% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP: balance performance: | 6.71709E-05 | 14887.4 | 14.69% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP:power | 6.66951E-05 | 4993.6 | 13.88% |
> +------------------------------+------------------------------+------------+------------------+
>
> Tbench Benchmark Data on ROME Server CPU
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | Kernel Module | PPW MB / (s * J) |Throughput(MB/s)| Energy (J)|PPW Improvement(%)|
> +=============================================+===================+==============+=============+==================+
> | acpi_cpufreq: schedutil | 46.39 | 17191 | 37057.3 | base |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | acpi_cpufreq: ondemand | 51.51 | 19269.5 | 37406.5 | 11.04 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | acpi_cpufreq: performance | 45.96 | 17063.7 | 37123.7 | -0.74 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: performance(0) | 54.46 | 20263.1 | 37205 | 17.87 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: balance performance | 55.03 | 20481.9 | 37221.5 | 19.14 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: balance_power | 54.43 | 20245.9 | 37194.2 | 17.77 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: power(255) | 54.26 | 20181.7 | 37197.4 | 17.40 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: schedutil | 48.22 | 17844.9 | 37006.6 | 3.80 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: ondemand | 61.30 | 22988 | 37503.4 | 33.72 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: performance | 54.52 | 20252.6 | 37147.8 | 17.81 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+

This series + fix mentioned patch 5/12 is:

Tested-by: Wyes Karny <[email protected]>

Test system: AMD Genoa 96C 192T
Kernel: 6.2-rc2 + patch + perf_ctl_reset_fix
Scaling driver: amd_pstate/amd_pstate_epp
(Results are normalized)

================ dbench ================
dbench result comparison:
Here results are throughput (MB/s)
Clients: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
1 1.00 (0.00 pct) 0.92 (-8.00 pct) 0.93 (-7.00 pct) 0.93 (-7.00 pct) 0.91 (-9.00 pct) 0.89 (-11.00 pct) 0.85 (-15.00 pct)
2 1.03 (0.00 pct) 0.96 (-6.79 pct) 0.97 (-5.82 pct) 0.97 (-5.82 pct) 0.97 (-5.82 pct) 0.96 (-6.79 pct) 0.95 (-7.76 pct)
4 1.93 (0.00 pct) 1.54 (-20.20 pct) 1.57 (-18.65 pct) 1.56 (-19.17 pct) 1.53 (-20.72 pct) 1.52 (-21.24 pct) 1.50 (-22.27 pct)
8 3.48 (0.00 pct) 2.43 (-30.17 pct) 2.56 (-26.43 pct) 2.47 (-29.02 pct) 2.48 (-28.73 pct) 2.42 (-30.45 pct) 2.44 (-29.88 pct)
16 6.08 (0.00 pct) 3.84 (-36.84 pct) 4.05 (-33.38 pct) 3.93 (-35.36 pct) 4.06 (-33.22 pct) 3.99 (-34.37 pct) 3.86 (-36.51 pct)
32 9.25 (0.00 pct) 5.59 (-39.56 pct) 6.63 (-28.32 pct) 6.18 (-33.18 pct) 6.50 (-29.72 pct) 6.41 (-30.70 pct) 5.69 (-38.48 pct)
64 13.07 (0.00 pct) 8.15 (-37.64 pct) 9.82 (-24.86 pct) 8.91 (-31.82 pct) 9.76 (-25.32 pct) 8.98 (-31.29 pct) 8.01 (-38.71 pct)
128 15.02 (0.00 pct) 11.35 (-24.43 pct) 13.47 (-10.31 pct) 12.30 (-18.10 pct) 13.32 (-11.31 pct) 11.38 (-24.23 pct) 11.13 (-25.89 pct)
256 16.37 (0.00 pct) 14.61 (-10.75 pct) 16.32 (-0.30 pct) 15.63 (-4.52 pct) 16.09 (-1.71 pct) 14.83 (-9.40 pct) 14.70 (-10.20 pct)
512 15.00 (0.00 pct) 14.79 (-1.40 pct) 15.06 (0.40 pct) 14.96 (-0.26 pct) 14.95 (-0.33 pct) 14.64 (-2.40 pct) 14.76 (-1.60 pct)
dbench power comparison:
Clients: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
1 1.00 (0.00 pct) 0.99 (-1.00 pct) 1.00 (0.00 pct) 1.00 (0.00 pct) 0.97 (-3.00 pct) 0.95 (-5.00 pct) 0.95 (-5.00 pct)
2 1.00 (0.00 pct) 0.96 (-4.00 pct) 1.00 (0.00 pct) 1.00 (0.00 pct) 0.95 (-5.00 pct) 0.94 (-6.00 pct) 0.93 (-7.00 pct)
4 1.01 (0.00 pct) 0.97 (-3.96 pct) 1.01 (0.00 pct) 1.01 (0.00 pct) 0.95 (-5.94 pct) 0.94 (-6.93 pct) 0.93 (-7.92 pct)
8 1.03 (0.00 pct) 0.98 (-4.85 pct) 1.01 (-1.94 pct) 1.01 (-1.94 pct) 0.95 (-7.76 pct) 0.94 (-8.73 pct) 0.93 (-9.70 pct)
16 1.05 (0.00 pct) 0.99 (-5.71 pct) 1.03 (-1.90 pct) 1.03 (-1.90 pct) 0.96 (-8.57 pct) 0.94 (-10.47 pct) 0.93 (-11.42 pct)
32 1.08 (0.00 pct) 1.01 (-6.48 pct) 1.05 (-2.77 pct) 1.05 (-2.77 pct) 0.98 (-9.25 pct) 0.96 (-11.11 pct) 0.95 (-12.03 pct)
64 1.12 (0.00 pct) 1.04 (-7.14 pct) 1.09 (-2.67 pct) 1.09 (-2.67 pct) 1.00 (-10.71 pct) 0.98 (-12.50 pct) 0.97 (-13.39 pct)
128 1.15 (0.00 pct) 1.06 (-7.82 pct) 1.14 (-0.86 pct) 1.13 (-1.73 pct) 1.04 (-9.56 pct) 1.00 (-13.04 pct) 0.98 (-14.78 pct)
256 1.18 (0.00 pct) 1.10 (-6.77 pct) 1.18 (0.00 pct) 1.19 (0.84 pct) 1.06 (-10.16 pct) 1.03 (-12.71 pct) 1.02 (-13.55 pct)
512 1.22 (0.00 pct) 1.15 (-5.73 pct) 1.23 (0.81 pct) 1.23 (0.81 pct) 1.10 (-9.83 pct) 1.06 (-13.11 pct) 1.05 (-13.93 pct)

================ git-source ================
git-source result comparison:
Here results are throughput (compilations per 1000 sec)
Threads: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
192 1.00 (0.00 pct) 0.93 (-7.00 pct) 1.00 (0.00 pct) 1.00 (0.00 pct) 0.87 (-13.00 pct) 0.81 (-19.00 pct) 0.71 (-29.00 pct)
git-source power comparison:
Threads: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
192 1.00 (0.00 pct) 1.03 (3.00 pct) 1.03 (3.00 pct) 1.03 (3.00 pct) 1.02 (2.00 pct) 1.06 (6.00 pct) 1.13 (13.00 pct)

================ kernbench ================
kernbench result comparison:
Here results are throughput (compilations per 1000 sec)
Load: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
32 1.00 (0.00 pct) 0.94 (-6.00 pct) 0.97 (-3.00 pct) 1.01 (1.00 pct) 0.77 (-23.00 pct) 0.64 (-36.00 pct) 0.52 (-48.00 pct)
48 1.25 (0.00 pct) 1.20 (-4.00 pct) 1.25 (0.00 pct) 1.24 (-0.80 pct) 0.96 (-23.20 pct) 0.80 (-36.00 pct) 0.66 (-47.20 pct)
64 1.41 (0.00 pct) 1.37 (-2.83 pct) 1.40 (-0.70 pct) 1.43 (1.41 pct) 1.09 (-22.69 pct) 0.92 (-34.75 pct) 0.74 (-47.51 pct)
96 1.45 (0.00 pct) 1.40 (-3.44 pct) 1.44 (-0.68 pct) 1.44 (-0.68 pct) 1.22 (-15.86 pct) 1.05 (-27.58 pct) 0.93 (-35.86 pct)
128 1.36 (0.00 pct) 1.23 (-9.55 pct) 1.33 (-2.20 pct) 1.35 (-0.73 pct) 1.12 (-17.64 pct) 0.98 (-27.94 pct) 0.84 (-38.23 pct)
192 1.22 (0.00 pct) 1.12 (-8.19 pct) 1.21 (-0.81 pct) 1.21 (-0.81 pct) 1.02 (-16.39 pct) 0.91 (-25.40 pct) 0.79 (-35.24 pct)
256 1.21 (0.00 pct) 1.10 (-9.09 pct) 1.21 (0.00 pct) 1.21 (0.00 pct) 1.01 (-16.52 pct) 0.90 (-25.61 pct) 0.78 (-35.53 pct)
384 1.22 (0.00 pct) 1.09 (-10.65 pct) 1.21 (-0.81 pct) 1.21 (-0.81 pct) 1.01 (-17.21 pct) 0.90 (-26.22 pct) 0.79 (-35.24 pct)
git-source power comparison:
Clients: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
32 1.00 (0.00 pct) 1.06 (6.00 pct) 1.02 (2.00 pct) 1.00 (0.00 pct) 1.03 (3.00 pct) 1.16 (16.00 pct) 1.39 (39.00 pct)
48 0.87 (0.00 pct) 0.88 (1.14 pct) 0.87 (0.00 pct) 0.87 (0.00 pct) 0.89 (2.29 pct) 0.98 (12.64 pct) 1.11 (27.58 pct)
64 0.80 (0.00 pct) 0.79 (-1.25 pct) 0.80 (0.00 pct) 0.80 (0.00 pct) 0.78 (-2.50 pct) 0.86 (7.50 pct) 1.03 (28.75 pct)
96 0.79 (0.00 pct) 0.81 (2.53 pct) 0.79 (0.00 pct) 0.79 (0.00 pct) 0.78 (-1.26 pct) 0.87 (10.12 pct) 0.95 (20.25 pct)
128 0.80 (0.00 pct) 0.86 (7.50 pct) 0.82 (2.50 pct) 0.80 (0.00 pct) 0.86 (7.50 pct) 0.91 (13.75 pct) 1.03 (28.75 pct)
192 0.89 (0.00 pct) 0.96 (7.86 pct) 0.89 (0.00 pct) 0.89 (0.00 pct) 0.96 (7.86 pct) 1.00 (12.35 pct) 1.12 (25.84 pct)
256 0.89 (0.00 pct) 0.95 (6.74 pct) 0.89 (0.00 pct) 0.89 (0.00 pct) 0.96 (7.86 pct) 1.00 (12.35 pct) 1.12 (25.84 pct)
384 0.89 (0.00 pct) 0.95 (6.74 pct) 0.90 (1.12 pct) 0.90 (1.12 pct) 0.95 (6.74 pct) 1.00 (12.35 pct) 1.11 (24.71 pct)

================ tbench ================
tbench result comparison:
Here results are throughput (MB/s)
Clients: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
1 1.00 (0.00 pct) 0.52 (-48.00 pct) 1.00 (0.00 pct) 1.00 (0.00 pct) 0.76 (-24.00 pct) 0.64 (-36.00 pct) 0.51 (-49.00 pct)
2 1.97 (0.00 pct) 1.03 (-47.71 pct) 2.01 (2.03 pct) 2.00 (1.52 pct) 1.52 (-22.84 pct) 1.27 (-35.53 pct) 1.02 (-48.22 pct)
4 3.85 (0.00 pct) 2.03 (-47.27 pct) 3.91 (1.55 pct) 3.91 (1.55 pct) 2.95 (-23.37 pct) 2.49 (-35.32 pct) 2.01 (-47.79 pct)
8 7.47 (0.00 pct) 4.01 (-46.31 pct) 7.48 (0.13 pct) 7.56 (1.20 pct) 5.73 (-23.29 pct) 4.88 (-34.67 pct) 3.93 (-47.38 pct)
16 14.12 (0.00 pct) 7.71 (-45.39 pct) 14.75 (4.46 pct) 14.49 (2.62 pct) 11.13 (-21.17 pct) 9.37 (-33.64 pct) 7.69 (-45.53 pct)
32 27.87 (0.00 pct) 14.96 (-46.32 pct) 27.86 (-0.03 pct) 27.31 (-2.00 pct) 21.32 (-23.50 pct) 17.82 (-36.06 pct) 14.45 (-48.15 pct)
64 46.26 (0.00 pct) 31.86 (-31.12 pct) 45.56 (-1.51 pct) 45.99 (-0.58 pct) 36.14 (-21.87 pct) 31.36 (-32.20 pct) 24.86 (-46.26 pct)
128 50.68 (0.00 pct) 51.09 (0.80 pct) 50.68 (0.00 pct) 50.54 (-0.27 pct) 50.63 (-0.09 pct) 50.34 (-0.67 pct) 50.69 (0.01 pct)
256 119.57 (0.00 pct) 120.07 (0.41 pct) 119.35 (-0.18 pct) 118.02 (-1.29 pct) 114.58 (-4.17 pct) 117.70 (-1.56 pct) 114.71 (-4.06 pct)
512 115.18 (0.00 pct) 115.93 (0.65 pct) 114.90 (-0.24 pct) 115.69 (0.44 pct) 115.02 (-0.13 pct) 115.71 (0.46 pct) 115.21 (0.02 pct)
tbench power comparison:
Clients: passive+perf passive+sug active+perf+perf active+ps+perf active+ps+bln_perf active+ps+bln_pwr active+ps+pwr
1 1.00 (0.00 pct) 0.90 (-10.00 pct) 1.00 (0.00 pct) 1.00 (0.00 pct) 0.92 (-8.00 pct) 0.90 (-10.00 pct) 0.89 (-11.00 pct)
2 1.02 (0.00 pct) 0.90 (-11.76 pct) 1.02 (0.00 pct) 1.03 (0.98 pct) 0.93 (-8.82 pct) 0.91 (-10.78 pct) 0.90 (-11.76 pct)
4 1.08 (0.00 pct) 0.92 (-14.81 pct) 1.09 (0.92 pct) 1.09 (0.92 pct) 0.97 (-10.18 pct) 0.94 (-12.96 pct) 0.92 (-14.81 pct)
8 1.21 (0.00 pct) 0.96 (-20.66 pct) 1.21 (0.00 pct) 1.22 (0.82 pct) 1.03 (-14.87 pct) 0.98 (-19.00 pct) 0.96 (-20.66 pct)
16 1.44 (0.00 pct) 1.04 (-27.77 pct) 1.46 (1.38 pct) 1.46 (1.38 pct) 1.16 (-19.44 pct) 1.08 (-25.00 pct) 1.03 (-28.47 pct)
32 1.87 (0.00 pct) 1.26 (-32.62 pct) 1.88 (0.53 pct) 1.87 (0.00 pct) 1.39 (-25.66 pct) 1.24 (-33.68 pct) 1.16 (-37.96 pct)
64 2.41 (0.00 pct) 1.98 (-17.84 pct) 2.39 (-0.82 pct) 2.40 (-0.41 pct) 1.69 (-29.87 pct) 1.47 (-39.00 pct) 1.35 (-43.98 pct)
128 2.59 (0.00 pct) 2.60 (0.38 pct) 2.59 (0.00 pct) 2.60 (0.38 pct) 2.57 (-0.77 pct) 2.56 (-1.15 pct) 2.57 (-0.77 pct)
256 3.17 (0.00 pct) 3.17 (0.00 pct) 3.17 (0.00 pct) 3.17 (0.00 pct) 3.12 (-1.57 pct) 3.15 (-0.63 pct) 3.12 (-1.57 pct)
512 3.18 (0.00 pct) 3.18 (0.00 pct) 3.18 (0.00 pct) 3.18 (0.00 pct) 3.16 (-0.62 pct) 3.16 (-0.62 pct) 3.16 (-0.62 pct)

>
> changes from v9:
> * pick up R-B flas added by Mario
> * pick up R-B flag added by Wyes
> * rename "default_pstate_driver" to "current_pstate_driver"
> * update status string to "disable" when driver disabled and update size check
> * move the common macro definition of EPP to cppc library files for intel_pstate and amd_pstate common using
> * fix some doc words for new global sysfs patch
>
> changes from v8:
> * drive all the feedbacks from Mario and change the codes in this
> version
> * drive all the feedbacks from Ray and change the codes in this
> version
> * pick up all the R-B flags from Mario
> * pick up all the R-B flags from Ray
> * drop boost/refresh_freq_limits callback
> * reuse policy->driver_data to store amd_cpudata struct
> * use switch-case in the driver mode switching function
> * add Kconfig dependency the INTEL_PSTATE for AMD_PSTATE build
> * fix some other code format and typos
>
> changes from v7:
> * remove iowait boost functions code
> * pick up ack by flag from Huang Ray.
> * add one new patch to support multiple working modes in the amd_pstate_param(),aligned with Wyse
> * drop the patch "[v7 08/13] cpufreq: amd-pstate: add frequency dynamic boost sysfs control"
> * replace the cppc_get_epp_caps() with new cppc_get_epp_perf() wihch is
> more simple to use
> * remove I/O wait boost code from amd_pstate_update_status()
> * replace cppc_active var with enum type AMD_PSTATE_ACTIVE
> * squash amd_pstate_epp_verify_policy() into sigle function
> * remove "amd pstate" string from the pr_err, pr_debug logs info
> * rework patch [v7 03/13], move the common EPP profiles declaration
> into cpufreq.h which will be used by amd-pstate and intel-pstate
> * combine amd psate init functions.
> * remove epp_powersave from amd-pstate.h and dropping the codes.
> * move amd_pstate_params{} from amd-pstate.h into amd-pstate.c
> * drive some other feedbacks from huang ray
>
> changes from v6:
> * fix one legacy kernel hang issue when amd-pstate driver unregistering
> * add new documentation to introduce new global sysfs attributes
> * use sysfs_emit_at() to print epp profiles array
> * update commit info for patch v6 patch 1/11 as Mario sugguested.
> * trying to add the EPP profiles into cpufreq.h, but it will cause lots
> of build failues,continue to keep cpufreq_common.h used in v7
> * update commit info using amd-pstate as prefix same as before.
> * remove CONFIG_ACPI for the header as Ray suggested.
> * move amd_pstate_kobj to where it is used in patch "add frequency dynamic boost sysfs control"
> * drive feedback removing X86_FEATURE_CPPC check for the epp init from Huang Ray
> * drive feedback from Mario
>
> change from v5:
> * add one common header `cpufreq_commoncpufreq_common` to extract EPP profiles
> definition for amd and intel pstate driver.
> * remove the epp_off value to avoid confusion.
> * convert some other sysfs sprintf() function with sysfs_emit() and add onew new patch
> * add acpi pm server priofile detection to enable dynamic boost control
> * fix some code format with checkpatch script
> * move the EPP profile declaration into common header file `cpufreq_common.h`
> * fix commit typos
>
> changes from v4:
> * rebase driver based on the v6.1-rc7
> * remove the builtin changes patch because pstate driver has been
> changed to builtin type by another thread patch
> * update Documentation: amd-pstate: add amd pstate driver mode introduction
> * replace sprintf with sysfs_emit() instead.
> * fix typo for cppc_set_epp_perf() in cppc_acpi.h header
>
> changes from v3:
> * add one more document update patch for the active and passive mode
> introducion.
> * drive most of the feedbacks from Mario
> * drive feedback from Rafael for the cppc_acpi driver.
> * remove the epp raw data set/get function
> * set the amd-pstate drive by passing kernel parameter
> * set amd-pstate driver disabled by default if no kernel parameter
> input from booting
> * get cppc_set_auto_epp and cppc_set_epp_perf combined
> * pick up reviewed by flag from Mario
>
> changes from v2:
> * change pstate driver as builtin type from module
> * drop patch "export cpufreq cpu release and acquire"
> * squash patch of shared mem into single patch of epp implementation
> * add one new patch to support frequency boost control
> * add patch to expose driver working status checking
> * rebase driver into v6.1-rc4 kernel release
> * move some declaration to amd-pstate.h
> * drive feedback from Mario for the online/offline patch
> * drive feedback from Mario for the suspend/resume patch
> * drive feedback from Ray for the cppc_acpi and some other patches
> * drive feedback from Nathan for the epp patch
>
> changes from v1:
> * rebased to v6.0
> * drive feedbacks from Mario for the suspend/resume patch
> * drive feedbacks from Nathan for the EPP support on msr type
> * fix some typos and code style indent problems
> * update commit comments for patch 4/7
> * change the `epp_enabled` module param name to `epp`
> * set the default epp mode to be false
> * add testing for the x86_energy_perf_policy utility patchset(will
> send that utility patchset with another thread)
>
> v9: https://lore.kernel.org/lkml/[email protected]/
> v8: https://lore.kernel.org/lkml/[email protected]/
> v7: https://lore.kernel.org/lkml/[email protected]/
> v6: https://lore.kernel.org/lkml/[email protected]/
> v5: https://lore.kernel.org/lkml/[email protected]/
> v4: https://lore.kernel.org/lkml/[email protected]/
> v3: https://lore.kernel.org/all/[email protected]/
> v2: https://lore.kernel.org/all/[email protected]/
> v1: https://lore.kernel.org/all/[email protected]/
>
> Perry Yuan (11):
> ACPI: CPPC: Add AMD pstate energy performance preference cppc control
> Documentation: amd-pstate: add EPP profiles introduction
> cpufreq: intel_pstate: use common macro definition for Energy
> Preference Performance(EPP)
> cpufreq: amd-pstate: implement Pstate EPP support for the AMD
> processors
> cpufreq: amd-pstate: implement amd pstate cpu online and offline
> callback
> cpufreq: amd-pstate: implement suspend and resume callbacks
> cpufreq: amd-pstate: add driver working mode switch support
> Documentation: amd-pstate: add amd pstate driver mode introduction
> Documentation: introduce amd pstate active mode kernel command line
> options
> cpufreq: amd-pstate: convert sprintf with sysfs_emit()
> Documentation: amd-pstate: introduce new global sysfs attributes
>
> Wyes Karny (1):
> cpufreq: amd-pstate: optimize driver working mode selection in
> amd_pstate_param()
>
> .../admin-guide/kernel-parameters.txt | 7 +
> Documentation/admin-guide/pm/amd-pstate.rst | 74 +-
> drivers/acpi/cppc_acpi.c | 96 +++
> drivers/cpufreq/amd-pstate.c | 659 +++++++++++++++++-
> drivers/cpufreq/intel_pstate.c | 36 -
> include/acpi/cppc_acpi.h | 22 +
> include/linux/amd-pstate.h | 28 +
> 7 files changed, 865 insertions(+), 57 deletions(-)
>

--
Thanks & Regards,
Wyes

2023-01-10 18:44:45

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors

Hi Perry,

I love your patch! Perhaps something to improve:

[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master v6.2-rc3 next-20230110]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230106061420.95715-6-perry.yuan%40amd.com
patch subject: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors
config: x86_64-buildonly-randconfig-r002-20230109
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/70cada4b76e142d3786f958a894dab9f46ce8b7f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Perry-Yuan/ACPI-CPPC-Add-AMD-pstate-energy-performance-preference-cppc-control/20230106-141924
git checkout 70cada4b76e142d3786f958a894dab9f46ce8b7f
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/cpufreq/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/cpufreq/amd-pstate.c:817:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (rc)
^~
drivers/cpufreq/amd-pstate.c:873:9: note: uninitialized use occurs here
return ret;
^~~
drivers/cpufreq/amd-pstate.c:817:2: note: remove the 'if' if its condition is always false
if (rc)
^~~~~~~
drivers/cpufreq/amd-pstate.c:806:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (!dev)
^~~~
drivers/cpufreq/amd-pstate.c:873:9: note: uninitialized use occurs here
return ret;
^~~
drivers/cpufreq/amd-pstate.c:806:2: note: remove the 'if' if its condition is always false
if (!dev)
^~~~~~~~~
drivers/cpufreq/amd-pstate.c:799:66: note: initialize the variable 'ret' to silence this warning
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
^
= 0
>> drivers/cpufreq/amd-pstate.c:844:52: warning: variable 'value' is uninitialized when used here [-Wuninitialized]
cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
^~~~~
drivers/cpufreq/amd-pstate.c:803:11: note: initialize the variable 'value' to silence this warning
u64 value;
^
= 0
>> drivers/cpufreq/amd-pstate.c:910:6: warning: variable 'epp' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (cpudata->epp_policy == cpudata->policy)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:935:30: note: uninitialized use occurs here
amd_pstate_set_epp(cpudata, epp);
^~~
drivers/cpufreq/amd-pstate.c:910:2: note: remove the 'if' if its condition is always false
if (cpudata->epp_policy == cpudata->policy)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:889:9: note: initialize the variable 'epp' to silence this warning
s16 epp;
^
= 0
4 warnings generated.


vim +817 drivers/cpufreq/amd-pstate.c

796
797 static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
798 {
799 int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
800 struct amd_cpudata *cpudata;
801 struct device *dev;
802 int rc;
803 u64 value;
804
805 dev = get_cpu_device(policy->cpu);
> 806 if (!dev)
807 goto free_cpudata1;
808
809 cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
810 if (!cpudata)
811 return -ENOMEM;
812
813 cpudata->cpu = policy->cpu;
814 cpudata->epp_policy = 0;
815
816 rc = amd_pstate_init_perf(cpudata);
> 817 if (rc)
818 goto free_cpudata1;
819
820 min_freq = amd_get_min_freq(cpudata);
821 max_freq = amd_get_max_freq(cpudata);
822 nominal_freq = amd_get_nominal_freq(cpudata);
823 lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
824 if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
825 dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
826 min_freq, max_freq);
827 ret = -EINVAL;
828 goto free_cpudata1;
829 }
830
831 policy->cpuinfo.min_freq = min_freq;
832 policy->cpuinfo.max_freq = max_freq;
833 /* It will be updated by governor */
834 policy->cur = policy->cpuinfo.min_freq;
835
836 /* Initial processor data capability frequencies */
837 cpudata->max_freq = max_freq;
838 cpudata->min_freq = min_freq;
839 cpudata->nominal_freq = nominal_freq;
840 cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
841
842 policy->driver_data = cpudata;
843
> 844 cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
845
846 policy->min = policy->cpuinfo.min_freq;
847 policy->max = policy->cpuinfo.max_freq;
848
849 /*
850 * Set the policy to powersave to provide a valid fallback value in case
851 * the default cpufreq governor is neither powersave nor performance.
852 */
853 policy->policy = CPUFREQ_POLICY_POWERSAVE;
854
855 if (boot_cpu_has(X86_FEATURE_CPPC)) {
856 policy->fast_switch_possible = true;
857 ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
858 if (ret)
859 return ret;
860 WRITE_ONCE(cpudata->cppc_req_cached, value);
861
862 ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
863 if (ret)
864 return ret;
865 WRITE_ONCE(cpudata->cppc_cap1_cached, value);
866 }
867 amd_pstate_boost_init(cpudata);
868
869 return 0;
870
871 free_cpudata1:
872 kfree(cpudata);
873 return ret;
874 }
875
876 static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
877 {
878 pr_debug("CPU %d exiting\n", policy->cpu);
879 policy->fast_switch_possible = false;
880 return 0;
881 }
882
883 static void amd_pstate_epp_init(unsigned int cpu)
884 {
885 struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
886 struct amd_cpudata *cpudata = policy->driver_data;
887 u32 max_perf, min_perf;
888 u64 value;
889 s16 epp;
890
891 max_perf = READ_ONCE(cpudata->highest_perf);
892 min_perf = READ_ONCE(cpudata->lowest_perf);
893
894 value = READ_ONCE(cpudata->cppc_req_cached);
895
896 if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
897 min_perf = max_perf;
898
899 /* Initial min/max values for CPPC Performance Controls Register */
900 value &= ~AMD_CPPC_MIN_PERF(~0L);
901 value |= AMD_CPPC_MIN_PERF(min_perf);
902
903 value &= ~AMD_CPPC_MAX_PERF(~0L);
904 value |= AMD_CPPC_MAX_PERF(max_perf);
905
906 /* CPPC EPP feature require to set zero to the desire perf bit */
907 value &= ~AMD_CPPC_DES_PERF(~0L);
908 value |= AMD_CPPC_DES_PERF(0);
909
> 910 if (cpudata->epp_policy == cpudata->policy)
911 goto skip_epp;
912
913 cpudata->epp_policy = cpudata->policy;
914
915 if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
916 epp = amd_pstate_get_epp(cpudata, value);
917 if (epp < 0)
918 goto skip_epp;
919 /* force the epp value to be zero for performance policy */
920 epp = 0;
921 } else {
922 /* Get BIOS pre-defined epp value */
923 epp = amd_pstate_get_epp(cpudata, value);
924 if (epp)
925 goto skip_epp;
926 }
927 /* Set initial EPP value */
928 if (boot_cpu_has(X86_FEATURE_CPPC)) {
929 value &= ~GENMASK_ULL(31, 24);
930 value |= (u64)epp << 24;
931 }
932
933 skip_epp:
934 WRITE_ONCE(cpudata->cppc_req_cached, value);
935 amd_pstate_set_epp(cpudata, epp);
936 cpufreq_cpu_put(policy);
937 }
938

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Attachments:
(No filename) (9.25 kB)
config (189.91 kB)
Download all attachments

2023-01-17 12:17:41

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 04/12] cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()

On Fri, Jan 06, 2023 at 02:14:12PM +0800, Yuan, Perry wrote:
> From: Wyes Karny <[email protected]>
>
> The amd-pstate driver may support multiple working modes.
> Introduce a variable to keep track of which mode is currently enabled.
> Here we use cppc_state var to indicate which mode is enabled.
> This change will help to simplify the the amd_pstate_param() to choose
> which mode used for the following driver registration.
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>
> Signed-off-by: Wyes Karny <[email protected]>

Acked-by: Huang Rui <[email protected]>

> ---
> drivers/cpufreq/amd-pstate.c | 39 +++++++++++++++++++++++++++---------
> include/linux/amd-pstate.h | 17 ++++++++++++++++
> 2 files changed, 46 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 204e39006dda..1a8b31277620 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -60,7 +60,18 @@
> * module parameter to be able to enable it manually for debugging.
> */
> static struct cpufreq_driver amd_pstate_driver;
> -static int cppc_load __initdata;
> +static int cppc_state = AMD_PSTATE_DISABLE;
> +
> +static inline int get_mode_idx_from_str(const char *str, size_t size)
> +{
> + int i;
> +
> + for (i=0; i < AMD_PSTATE_MAX; i++) {
> + if (!strncmp(str, amd_pstate_mode_string[i], size))
> + return i;
> + }
> + return -EINVAL;
> +}
>
> static inline int pstate_enable(bool enable)
> {
> @@ -625,10 +636,10 @@ static int __init amd_pstate_init(void)
> /*
> * by default the pstate driver is disabled to load
> * enable the amd_pstate passive mode driver explicitly
> - * with amd_pstate=passive in kernel command line
> + * with amd_pstate=passive or other modes in kernel command line
> */
> - if (!cppc_load) {
> - pr_debug("driver load is disabled, boot with amd_pstate=passive to enable this\n");
> + if (cppc_state == AMD_PSTATE_DISABLE) {
> + pr_debug("driver load is disabled, boot with specific mode to enable this\n");
> return -ENODEV;
> }
>
> @@ -670,16 +681,24 @@ device_initcall(amd_pstate_init);
>
> static int __init amd_pstate_param(char *str)
> {
> + size_t size;
> + int mode_idx;
> +
> if (!str)
> return -EINVAL;
>
> - if (!strcmp(str, "disable")) {
> - cppc_load = 0;
> - pr_info("driver is explicitly disabled\n");
> - } else if (!strcmp(str, "passive"))
> - cppc_load = 1;
> + size = strlen(str);
> + mode_idx = get_mode_idx_from_str(str, size);
>
> - return 0;
> + if (mode_idx >= AMD_PSTATE_DISABLE && mode_idx < AMD_PSTATE_MAX) {
> + cppc_state = mode_idx;
> + if (cppc_state == AMD_PSTATE_DISABLE)
> + pr_info("driver is explicitly disabled\n");
> +
> + return 0;
> + }
> +
> + return -EINVAL;
> }
> early_param("amd_pstate", amd_pstate_param);
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 1c4b8659f171..dae2ce0f6735 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -74,4 +74,21 @@ struct amd_cpudata {
> bool boost_supported;
> };
>
> +/*
> + * enum amd_pstate_mode - driver working mode of amd pstate
> + */
> +enum amd_pstate_mode {
> + AMD_PSTATE_DISABLE = 0,
> + AMD_PSTATE_PASSIVE,
> + AMD_PSTATE_ACTIVE,
> + AMD_PSTATE_MAX,
> +};
> +
> +static const char * const amd_pstate_mode_string[] = {
> + [AMD_PSTATE_DISABLE] = "disable",
> + [AMD_PSTATE_PASSIVE] = "passive",
> + [AMD_PSTATE_ACTIVE] = "active",
> + NULL,
> +};
> +
> #endif /* _LINUX_AMD_PSTATE_H */
> --
> 2.34.1
>

2023-01-17 12:23:09

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 02/12] Documentation: amd-pstate: add EPP profiles introduction

On Fri, Jan 06, 2023 at 02:14:10PM +0800, Yuan, Perry wrote:
> From: Perry Yuan <[email protected]>
>
> The amd-pstate driver supports a feature called energy performance
> preference (EPP). Add information to the documentation to explain
> how users can interact with the sysfs files for this feature.
>
> 1) See all EPP profiles
> $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
> default performance balance_performance balance_power power
>
> 2) Check current EPP profile
> $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
> performance
>
> 3) Set new EPP profile
> $ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Acked-by: Huang Rui <[email protected]>

> ---
> Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
> index 06e23538f79c..33ab8ec8fc2f 100644
> --- a/Documentation/admin-guide/pm/amd-pstate.rst
> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
> @@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
> <perf_cap_>`_.)
> This attribute is read-only.
>
> +``energy_performance_available_preferences``
> +
> +A list of all the supported EPP preferences that could be used for
> +``energy_performance_preference`` on this system.
> +These profiles represent different hints that are provided
> +to the low-level firmware about the user's desired energy vs efficiency
> +tradeoff. ``default`` represents the epp value is set by platform
> +firmware. This attribute is read-only.
> +
> +``energy_performance_preference``
> +
> +The current energy performance preference can be read from this attribute.
> +and user can change current preference according to energy or performance needs
> +Please get all support profiles list from
> +``energy_performance_available_preferences`` attribute, all the profiles are
> +integer values defined between 0 to 255 when EPP feature is enabled by platform
> +firmware, if EPP feature is disabled, driver will ignore the written value
> +This attribute is read-write.
> +
> Other performance and frequency values can be read back from
> ``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
>
> --
> 2.34.1
>

2023-01-17 13:41:09

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 05/12] cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors

On Fri, Jan 06, 2023 at 02:14:13PM +0800, Yuan, Perry wrote:
> From: Perry Yuan <[email protected]>
>
> Add EPP driver support for AMD SoCs which support a dedicated MSR for
> CPPC. EPP is used by the DPM controller to configure the frequency that
> a core operates at during short periods of activity.
>
> The SoC EPP targets are configured on a scale from 0 to 255 where 0
> represents maximum performance and 255 represents maximum efficiency.
>
> The amd-pstate driver exports profile string names to userspace that are
> tied to specific EPP values.
>
> The balance_performance string (0x80) provides the best balance for
> efficiency versus power on most systems, but users can choose other
> strings to meet their needs as well.
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance
>
> To enable the driver,it needs to add `amd_pstate=active` to kernel
> command line and kernel will load the active mode epp driver
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 374 ++++++++++++++++++++++++++++++++++-
> include/linux/amd-pstate.h | 10 +
> 2 files changed, 378 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 1a8b31277620..9a92c71af80f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -59,7 +59,9 @@
> * we disable it by default to go acpi-cpufreq on these processors and add a
> * module parameter to be able to enable it manually for debugging.
> */
> +static struct cpufreq_driver *current_pstate_driver;
> static struct cpufreq_driver amd_pstate_driver;
> +static struct cpufreq_driver amd_pstate_epp_driver;
> static int cppc_state = AMD_PSTATE_DISABLE;
>
> static inline int get_mode_idx_from_str(const char *str, size_t size)
> @@ -73,6 +75,114 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
> return -EINVAL;
> }
>
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +
> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> + u64 epp;
> + int ret;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + if (!cppc_req_cached) {
> + epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> + &cppc_req_cached);
> + if (epp)
> + return epp;
> + }
> + epp = (cppc_req_cached >> 24) & 0xFF;
> + } else {
> + ret = cppc_get_epp_perf(cpudata->cpu, &epp);
> + if (ret < 0) {
> + pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> + return -EIO;
> + }
> + }
> +
> + return (s16)(epp & 0xff);
> +}
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> +{
> + s16 epp;
> + int index = -EINVAL;
> +
> + epp = amd_pstate_get_epp(cpudata, 0);
> + if (epp < 0)
> + return epp;
> +
> + switch (epp) {
> + case HWP_EPP_PERFORMANCE:
> + index = EPP_INDEX_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_PERFORMANCE:
> + index = EPP_INDEX_BALANCE_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_POWERSAVE:
> + index = EPP_INDEX_BALANCE_POWERSAVE;
> + break;
> + case HWP_EPP_POWERSAVE:
> + index = EPP_INDEX_POWERSAVE;
> + break;
> + default:
> + break;
> + }
> +
> + return index;
> +}
> +
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> + int ret;
> + struct cppc_perf_ctrls perf_ctrls;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> + if (!ret)
> + cpudata->epp_cached = epp;
> + } else {
> + perf_ctrls.energy_perf = epp;
> + ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> + if (ret) {
> + pr_debug("failed to set energy perf value (%d)\n", ret);
> + return ret;
> + }
> + cpudata->epp_cached = epp;
> + }
> +
> + return ret;
> +}
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> + int pref_index)
> +{
> + int epp = -EINVAL;
> + int ret;
> +
> + if (!pref_index) {
> + pr_debug("EPP pref_index is invalid\n");
> + return -EINVAL;
> + }
> +
> + if (epp == -EINVAL)
> + epp = epp_values[pref_index];
> +
> + if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + pr_debug("EPP cannot be set under performance policy\n");
> + return -EBUSY;
> + }
> +
> + ret = amd_pstate_set_epp(cpudata, epp);
> +
> + return ret;
> +}
> +
> static inline int pstate_enable(bool enable)
> {
> return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -81,11 +191,21 @@ static inline int pstate_enable(bool enable)
> static int cppc_enable(bool enable)
> {
> int cpu, ret = 0;
> + struct cppc_perf_ctrls perf_ctrls;
>
> for_each_present_cpu(cpu) {
> ret = cppc_set_enable(cpu, enable);
> if (ret)
> return ret;
> +
> + /* Enable autonomous mode for EPP */
> + if (cppc_state == AMD_PSTATE_ACTIVE) {
> + /* Set desired perf as zero to allow EPP firmware control */
> + perf_ctrls.desired_perf = 0;
> + ret = cppc_set_perf(cpu, &perf_ctrls);
> + if (ret)
> + return ret;
> + }
> }
>
> return ret;
> @@ -428,7 +548,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
> return;
>
> cpudata->boost_supported = true;
> - amd_pstate_driver.boost_enabled = true;
> + current_pstate_driver->boost_enabled = true;
> }
>
> static void amd_perf_ctl_reset(unsigned int cpu)
> @@ -602,10 +722,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> return sprintf(&buf[0], "%u\n", perf);
> }
>
> +static ssize_t show_energy_performance_available_preferences(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + int i = 0;
> + int offset = 0;
> +
> + while (energy_perf_strings[i] != NULL)
> + offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
> +
> + sysfs_emit_at(buf, offset, "\n");
> +
> + return offset;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> + struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + char str_preference[21];
> + ssize_t ret;
> +
> + ret = sscanf(buf, "%20s", str_preference);
> + if (ret != 1)
> + return -EINVAL;
> +
> + ret = match_string(energy_perf_strings, -1, str_preference);
> + if (ret < 0)
> + return -EINVAL;
> +
> + mutex_lock(&amd_pstate_limits_lock);
> + ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> + mutex_unlock(&amd_pstate_limits_lock);
> +
> + return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + int preference;
> +
> + preference = amd_pstate_get_energy_pref_index(cpudata);
> + if (preference < 0)
> + return preference;
> +
> + return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> +}
> +
> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
>
> static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> @@ -614,6 +785,181 @@ static struct freq_attr *amd_pstate_attr[] = {
> NULL,
> };
>
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> + &amd_pstate_max_freq,
> + &amd_pstate_lowest_nonlinear_freq,
> + &amd_pstate_highest_perf,
> + &energy_performance_preference,
> + &energy_performance_available_preferences,
> + NULL,
> +};
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> + struct amd_cpudata *cpudata;
> + struct device *dev;
> + int rc;
> + u64 value;
> +
> + dev = get_cpu_device(policy->cpu);
> + if (!dev)
> + goto free_cpudata1;
> +
> + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> + if (!cpudata)
> + return -ENOMEM;
> +
> + cpudata->cpu = policy->cpu;
> + cpudata->epp_policy = 0;
> +
> + rc = amd_pstate_init_perf(cpudata);
> + if (rc)
> + goto free_cpudata1;
> +
> + min_freq = amd_get_min_freq(cpudata);
> + max_freq = amd_get_max_freq(cpudata);
> + nominal_freq = amd_get_nominal_freq(cpudata);
> + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> + min_freq, max_freq);
> + ret = -EINVAL;
> + goto free_cpudata1;
> + }
> +
> + policy->cpuinfo.min_freq = min_freq;
> + policy->cpuinfo.max_freq = max_freq;
> + /* It will be updated by governor */
> + policy->cur = policy->cpuinfo.min_freq;
> +
> + /* Initial processor data capability frequencies */
> + cpudata->max_freq = max_freq;
> + cpudata->min_freq = min_freq;
> + cpudata->nominal_freq = nominal_freq;
> + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> + policy->driver_data = cpudata;
> +
> + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);

As the warning which reminded by 0day, should be amd_pstate_get_epp(cpudata, 0).

Others look good for me, with those fixed, please feel free to add

Acked-by: Huang Rui <[email protected]>

> +
> + policy->min = policy->cpuinfo.min_freq;
> + policy->max = policy->cpuinfo.max_freq;
> +
> + /*
> + * Set the policy to powersave to provide a valid fallback value in case
> + * the default cpufreq governor is neither powersave nor performance.
> + */
> + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + policy->fast_switch_possible = true;
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> + }
> + amd_pstate_boost_init(cpudata);
> +
> + return 0;
> +
> +free_cpudata1:
> + kfree(cpudata);
> + return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> + pr_debug("CPU %d exiting\n", policy->cpu);
> + policy->fast_switch_possible = false;
> + return 0;
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> + struct amd_cpudata *cpudata = policy->driver_data;
> + u32 max_perf, min_perf;
> + u64 value;
> + s16 epp;
> +
> + max_perf = READ_ONCE(cpudata->highest_perf);
> + min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> + value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + min_perf = max_perf;
> +
> + /* Initial min/max values for CPPC Performance Controls Register */
> + value &= ~AMD_CPPC_MIN_PERF(~0L);
> + value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> + value &= ~AMD_CPPC_MAX_PERF(~0L);
> + value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> + /* CPPC EPP feature require to set zero to the desire perf bit */
> + value &= ~AMD_CPPC_DES_PERF(~0L);
> + value |= AMD_CPPC_DES_PERF(0);
> +
> + if (cpudata->epp_policy == cpudata->policy)
> + goto skip_epp;
> +
> + cpudata->epp_policy = cpudata->policy;
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp < 0)
> + goto skip_epp;
> + /* force the epp value to be zero for performance policy */
> + epp = 0;
> + } else {
> + /* Get BIOS pre-defined epp value */
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp)
> + goto skip_epp;
> + }
> + /* Set initial EPP value */
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + }
> +
> +skip_epp:
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> + amd_pstate_set_epp(cpudata, epp);
> + cpufreq_cpu_put(policy);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> +
> + if (!policy->cpuinfo.max_freq)
> + return -ENODEV;
> +
> + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> + policy->cpuinfo.max_freq, policy->max);
> +
> + cpudata->policy = policy->policy;
> +
> + amd_pstate_epp_init(policy->cpu);
> +
> + return 0;
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> + cpufreq_verify_within_cpu_limits(policy);
> + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> + return 0;
> +}
> +
> static struct cpufreq_driver amd_pstate_driver = {
> .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
> .verify = amd_pstate_verify,
> @@ -627,6 +973,16 @@ static struct cpufreq_driver amd_pstate_driver = {
> .attr = amd_pstate_attr,
> };
>
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> + .flags = CPUFREQ_CONST_LOOPS,
> + .verify = amd_pstate_epp_verify_policy,
> + .setpolicy = amd_pstate_epp_set_policy,
> + .init = amd_pstate_epp_cpu_init,
> + .exit = amd_pstate_epp_cpu_exit,
> + .name = "amd_pstate_epp",
> + .attr = amd_pstate_epp_attr,
> +};
> +
> static int __init amd_pstate_init(void)
> {
> int ret;
> @@ -655,7 +1011,8 @@ static int __init amd_pstate_init(void)
> /* capability check */
> if (boot_cpu_has(X86_FEATURE_CPPC)) {
> pr_debug("AMD CPPC MSR based functionality is supported\n");
> - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> } else {
> pr_debug("AMD CPPC shared memory based functionality is supported\n");
> static_call_update(amd_pstate_enable, cppc_enable);
> @@ -666,14 +1023,13 @@ static int __init amd_pstate_init(void)
> /* enable amd pstate feature */
> ret = amd_pstate_enable(true);
> if (ret) {
> - pr_err("failed to enable amd-pstate with return %d\n", ret);
> + pr_err("failed to enable with return %d\n", ret);
> return ret;
> }
>
> - ret = cpufreq_register_driver(&amd_pstate_driver);
> + ret = cpufreq_register_driver(current_pstate_driver);
> if (ret)
> - pr_err("failed to register amd_pstate_driver with return %d\n",
> - ret);
> + pr_err("failed to register with return %d\n", ret);
>
> return ret;
> }
> @@ -695,6 +1051,12 @@ static int __init amd_pstate_param(char *str)
> if (cppc_state == AMD_PSTATE_DISABLE)
> pr_info("driver is explicitly disabled\n");
>
> + if (cppc_state == AMD_PSTATE_ACTIVE)
> + current_pstate_driver = &amd_pstate_epp_driver;
> +
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + current_pstate_driver = &amd_pstate_driver;
> +
> return 0;
> }
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index dae2ce0f6735..8341a2a2948a 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> * @prev: Last Aperf/Mperf/tsc count value read from register
> * @freq: current cpu frequency value
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
> *
> * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +76,12 @@ struct amd_cpudata {
>
> u64 freq;
> bool boost_supported;
> +
> + /* EPP feature related attributes*/
> + s16 epp_policy;
> + s16 epp_cached;
> + u32 policy;
> + u64 cppc_cap1_cached;
> };
>
> /*
> --
> 2.34.1
>

2023-01-17 14:26:04

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 00/12] Implement AMD Pstate EPP Driver

On Fri, Jan 06, 2023 at 02:14:08PM +0800, Yuan, Perry wrote:
> Hi all,
>
> This patchset implements one new AMD CPU frequency driver
> `amd-pstate-epp` instance for better performance and power control.
> CPPC has a parameter called energy preference performance (EPP).
> The EPP is used in the CCLK DPM controller to drive the frequency that a core
> is going to operate during short periods of activity.
> EPP values will be utilized for different OS profiles (balanced, performance, power savings).
>
> AMD Energy Performance Preference (EPP) provides a hint to the hardware
> if software wants to bias toward performance (0x0) or energy efficiency (0xff)
> The lowlevel power firmware will calculate the runtime frequency according to the EPP preference
> value. So the EPP hint will impact the CPU cores frequency responsiveness.
>
> We use the RAPL interface with "perf" tool to get the energy data of the package power.
> Performance Per Watt (PPW) Calculation:
>
> The PPW calculation is referred by below paper:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Fperformance-per-what-paper.pdf&amp;data=04%7C01%7CPerry.Yuan%40amd.com%7Cac66e8ce98044e9b062708d9ab47c8d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637729147708574423%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=TPOvCE%2Frbb0ptBreWNxHqOi9YnVhcHGKG88vviDLb00%3D&amp;reserved=0
>
> Below formula is referred from below spec to measure the PPW:
>
> (F / t) / P = F * t / (t * E) = F / E,
>
> "F" is the number of frames per second.
> "P" is power measured in watts.
> "E" is energy measured in joules.
>
> Gitsouce Benchmark Data on ROME Server CPU
> +------------------------------+------------------------------+------------+------------------+
> | Kernel Module | PPW (1 / s * J) |Energy(J) | PPW Improvement (%)|
> +==============================+==============================+============+==================+
> | acpi-cpufreq:schedutil | 5.85658E-05 | 17074.8 | base |
> +------------------------------+------------------------------+------------+------------------+
> | acpi-cpufreq:ondemand | 5.03079E-05 | 19877.6 | -14.10% |
> +------------------------------+------------------------------+------------+------------------+
> | acpi-cpufreq:performance | 5.88132E-05 | 17003 | 0.42% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:ondemand | 4.60295E-05 | 21725.2 | -21.41% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:schedutil | 4.70026E-05 | 21275.4 | -19.7% |
> +------------------------------+------------------------------+------------+------------------+
> | amd-pstate:performance | 5.80094E-05 | 17238.6 | -0.95% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP:performance | 5.8292E-05 | 17155 | -0.47% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP: balance performance: | 6.71709E-05 | 14887.4 | 14.69% |
> +------------------------------+------------------------------+------------+------------------+
> | EPP:power | 6.66951E-05 | 4993.6 | 13.88% |
> +------------------------------+------------------------------+------------+------------------+
>
> Tbench Benchmark Data on ROME Server CPU
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | Kernel Module | PPW MB / (s * J) |Throughput(MB/s)| Energy (J)|PPW Improvement(%)|
> +=============================================+===================+==============+=============+==================+
> | acpi_cpufreq: schedutil | 46.39 | 17191 | 37057.3 | base |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | acpi_cpufreq: ondemand | 51.51 | 19269.5 | 37406.5 | 11.04 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | acpi_cpufreq: performance | 45.96 | 17063.7 | 37123.7 | -0.74 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: performance(0) | 54.46 | 20263.1 | 37205 | 17.87 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: balance performance | 55.03 | 20481.9 | 37221.5 | 19.14 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: balance_power | 54.43 | 20245.9 | 37194.2 | 17.77 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | EPP:powersave: power(255) | 54.26 | 20181.7 | 37197.4 | 17.40 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: schedutil | 48.22 | 17844.9 | 37006.6 | 3.80 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: ondemand | 61.30 | 22988 | 37503.4 | 33.72 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
> | amd-pstate: performance | 54.52 | 20252.6 | 37147.8 | 17.81 % |
> +---------------------------------------------+-------------------+--------------+-------------+------------------+
>
> changes from v9:
> * pick up R-B flas added by Mario
> * pick up R-B flag added by Wyes
> * rename "default_pstate_driver" to "current_pstate_driver"
> * update status string to "disable" when driver disabled and update size check
> * move the common macro definition of EPP to cppc library files for intel_pstate and amd_pstate common using
> * fix some doc words for new global sysfs patch
>
> changes from v8:
> * drive all the feedbacks from Mario and change the codes in this
> version
> * drive all the feedbacks from Ray and change the codes in this
> version
> * pick up all the R-B flags from Mario
> * pick up all the R-B flags from Ray
> * drop boost/refresh_freq_limits callback
> * reuse policy->driver_data to store amd_cpudata struct
> * use switch-case in the driver mode switching function
> * add Kconfig dependency the INTEL_PSTATE for AMD_PSTATE build
> * fix some other code format and typos
>
> changes from v7:
> * remove iowait boost functions code
> * pick up ack by flag from Huang Ray.
> * add one new patch to support multiple working modes in the amd_pstate_param(),aligned with Wyse
> * drop the patch "[v7 08/13] cpufreq: amd-pstate: add frequency dynamic boost sysfs control"
> * replace the cppc_get_epp_caps() with new cppc_get_epp_perf() wihch is
> more simple to use
> * remove I/O wait boost code from amd_pstate_update_status()
> * replace cppc_active var with enum type AMD_PSTATE_ACTIVE
> * squash amd_pstate_epp_verify_policy() into sigle function
> * remove "amd pstate" string from the pr_err, pr_debug logs info
> * rework patch [v7 03/13], move the common EPP profiles declaration
> into cpufreq.h which will be used by amd-pstate and intel-pstate
> * combine amd psate init functions.
> * remove epp_powersave from amd-pstate.h and dropping the codes.
> * move amd_pstate_params{} from amd-pstate.h into amd-pstate.c
> * drive some other feedbacks from huang ray
>
> changes from v6:
> * fix one legacy kernel hang issue when amd-pstate driver unregistering
> * add new documentation to introduce new global sysfs attributes
> * use sysfs_emit_at() to print epp profiles array
> * update commit info for patch v6 patch 1/11 as Mario sugguested.
> * trying to add the EPP profiles into cpufreq.h, but it will cause lots
> of build failues,continue to keep cpufreq_common.h used in v7
> * update commit info using amd-pstate as prefix same as before.
> * remove CONFIG_ACPI for the header as Ray suggested.
> * move amd_pstate_kobj to where it is used in patch "add frequency dynamic boost sysfs control"
> * drive feedback removing X86_FEATURE_CPPC check for the epp init from Huang Ray
> * drive feedback from Mario
>
> change from v5:
> * add one common header `cpufreq_commoncpufreq_common` to extract EPP profiles
> definition for amd and intel pstate driver.
> * remove the epp_off value to avoid confusion.
> * convert some other sysfs sprintf() function with sysfs_emit() and add onew new patch
> * add acpi pm server priofile detection to enable dynamic boost control
> * fix some code format with checkpatch script
> * move the EPP profile declaration into common header file `cpufreq_common.h`
> * fix commit typos
>
> changes from v4:
> * rebase driver based on the v6.1-rc7
> * remove the builtin changes patch because pstate driver has been
> changed to builtin type by another thread patch
> * update Documentation: amd-pstate: add amd pstate driver mode introduction
> * replace sprintf with sysfs_emit() instead.
> * fix typo for cppc_set_epp_perf() in cppc_acpi.h header
>
> changes from v3:
> * add one more document update patch for the active and passive mode
> introducion.
> * drive most of the feedbacks from Mario
> * drive feedback from Rafael for the cppc_acpi driver.
> * remove the epp raw data set/get function
> * set the amd-pstate drive by passing kernel parameter
> * set amd-pstate driver disabled by default if no kernel parameter
> input from booting
> * get cppc_set_auto_epp and cppc_set_epp_perf combined
> * pick up reviewed by flag from Mario
>
> changes from v2:
> * change pstate driver as builtin type from module
> * drop patch "export cpufreq cpu release and acquire"
> * squash patch of shared mem into single patch of epp implementation
> * add one new patch to support frequency boost control
> * add patch to expose driver working status checking
> * rebase driver into v6.1-rc4 kernel release
> * move some declaration to amd-pstate.h
> * drive feedback from Mario for the online/offline patch
> * drive feedback from Mario for the suspend/resume patch
> * drive feedback from Ray for the cppc_acpi and some other patches
> * drive feedback from Nathan for the epp patch
>
> changes from v1:
> * rebased to v6.0
> * drive feedbacks from Mario for the suspend/resume patch
> * drive feedbacks from Nathan for the EPP support on msr type
> * fix some typos and code style indent problems
> * update commit comments for patch 4/7
> * change the `epp_enabled` module param name to `epp`
> * set the default epp mode to be false
> * add testing for the x86_energy_perf_policy utility patchset(will
> send that utility patchset with another thread)

Most of the patches look good for me, thanks!

Only patch 3, as Intel P-State actually won't use CPPC (won't expose _CPC
object in SBIOS), it may not be good to align the common definition in
acpi_cppc.h.

Thanks,
Ray

>
> v9: https://lore.kernel.org/lkml/[email protected]/
> v8: https://lore.kernel.org/lkml/[email protected]/
> v7: https://lore.kernel.org/lkml/[email protected]/
> v6: https://lore.kernel.org/lkml/[email protected]/
> v5: https://lore.kernel.org/lkml/[email protected]/
> v4: https://lore.kernel.org/lkml/[email protected]/
> v3: https://lore.kernel.org/all/[email protected]/
> v2: https://lore.kernel.org/all/[email protected]/
> v1: https://lore.kernel.org/all/[email protected]/
>
> Perry Yuan (11):
> ACPI: CPPC: Add AMD pstate energy performance preference cppc control
> Documentation: amd-pstate: add EPP profiles introduction
> cpufreq: intel_pstate: use common macro definition for Energy
> Preference Performance(EPP)
> cpufreq: amd-pstate: implement Pstate EPP support for the AMD
> processors
> cpufreq: amd-pstate: implement amd pstate cpu online and offline
> callback
> cpufreq: amd-pstate: implement suspend and resume callbacks
> cpufreq: amd-pstate: add driver working mode switch support
> Documentation: amd-pstate: add amd pstate driver mode introduction
> Documentation: introduce amd pstate active mode kernel command line
> options
> cpufreq: amd-pstate: convert sprintf with sysfs_emit()
> Documentation: amd-pstate: introduce new global sysfs attributes
>
> Wyes Karny (1):
> cpufreq: amd-pstate: optimize driver working mode selection in
> amd_pstate_param()
>
> .../admin-guide/kernel-parameters.txt | 7 +
> Documentation/admin-guide/pm/amd-pstate.rst | 74 +-
> drivers/acpi/cppc_acpi.c | 96 +++
> drivers/cpufreq/amd-pstate.c | 659 +++++++++++++++++-
> drivers/cpufreq/intel_pstate.c | 36 -
> include/acpi/cppc_acpi.h | 22 +
> include/linux/amd-pstate.h | 28 +
> 7 files changed, 865 insertions(+), 57 deletions(-)
>
> --
> 2.34.1
>

2023-01-17 14:32:47

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 08/12] cpufreq: amd-pstate: add driver working mode switch support

On Fri, Jan 06, 2023 at 02:14:16PM +0800, Yuan, Perry wrote:
> From: Perry Yuan <[email protected]>
>
> While amd-pstate driver was loaded with specific driver mode, it will
> need to check which mode is enabled for the pstate driver,add this sysfs
> entry to show the current status
>
> $ cat /sys/devices/system/cpu/amd-pstate/status
> active
>
> Meanwhile, user can switch the pstate driver mode with writing mode
> string to sysfs entry as below.
>
> Enable passive mode:
> $ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"
>
> Enable active mode (EPP driver mode):
> $ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Patch is Acked-by: Huang Rui <[email protected]>

Wyes, may we have a ack or any comment from you of this patch, as you are
working on the new guided autonomous mode?

Thanks,
Ray

> ---
> drivers/cpufreq/amd-pstate.c | 118 +++++++++++++++++++++++++++++++++++
> 1 file changed, 118 insertions(+)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 2c8b126cec5a..d3f4cdd803f3 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -63,6 +63,7 @@ static struct cpufreq_driver *current_pstate_driver;
> static struct cpufreq_driver amd_pstate_driver;
> static struct cpufreq_driver amd_pstate_epp_driver;
> static int cppc_state = AMD_PSTATE_DISABLE;
> +struct kobject *amd_pstate_kobj;
>
> static inline int get_mode_idx_from_str(const char *str, size_t size)
> {
> @@ -631,6 +632,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> policy->driver_data = cpudata;
>
> amd_pstate_boost_init(cpudata);
> + if (!current_pstate_driver->adjust_perf)
> + current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
>
> return 0;
>
> @@ -771,12 +774,99 @@ static ssize_t show_energy_performance_preference(
> return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> }
>
> +static ssize_t amd_pstate_show_status(char *buf)
> +{
> + if (!current_pstate_driver)
> + return sysfs_emit(buf, "disable\n");
> +
> + return sysfs_emit(buf, "%s\n", amd_pstate_mode_string[cppc_state]);
> +}
> +
> +static void amd_pstate_driver_cleanup(void)
> +{
> + current_pstate_driver = NULL;
> +}
> +
> +static int amd_pstate_update_status(const char *buf, size_t size)
> +{
> + int ret;
> + int mode_idx;
> +
> + if (size > 7 || size < 6)
> + return -EINVAL;
> + mode_idx = get_mode_idx_from_str(buf, size);
> +
> + switch(mode_idx) {
> + case AMD_PSTATE_DISABLE:
> + if (!current_pstate_driver)
> + return -EINVAL;
> + if (cppc_state == AMD_PSTATE_ACTIVE)
> + return -EBUSY;
> + ret = cpufreq_unregister_driver(current_pstate_driver);
> + amd_pstate_driver_cleanup();
> + break;
> + case AMD_PSTATE_PASSIVE:
> + if (current_pstate_driver) {
> + if (current_pstate_driver == &amd_pstate_driver)
> + return 0;
> + cpufreq_unregister_driver(current_pstate_driver);
> + cppc_state = AMD_PSTATE_PASSIVE;
> + current_pstate_driver = &amd_pstate_driver;
> + }
> +
> + ret = cpufreq_register_driver(current_pstate_driver);
> + break;
> + case AMD_PSTATE_ACTIVE:
> + if (current_pstate_driver) {
> + if (current_pstate_driver == &amd_pstate_epp_driver)
> + return 0;
> + cpufreq_unregister_driver(current_pstate_driver);
> + current_pstate_driver = &amd_pstate_epp_driver;
> + cppc_state = AMD_PSTATE_ACTIVE;
> + }
> +
> + ret = cpufreq_register_driver(current_pstate_driver);
> + break;
> + default:
> + ret = -EINVAL;
> + break;
> + }
> +
> + return ret;
> +}
> +
> +static ssize_t show_status(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + ssize_t ret;
> +
> + mutex_lock(&amd_pstate_driver_lock);
> + ret = amd_pstate_show_status(buf);
> + mutex_unlock(&amd_pstate_driver_lock);
> +
> + return ret;
> +}
> +
> +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
> + const char *buf, size_t count)
> +{
> + char *p = memchr(buf, '\n', count);
> + int ret;
> +
> + mutex_lock(&amd_pstate_driver_lock);
> + ret = amd_pstate_update_status(buf, p ? p - buf : count);
> + mutex_unlock(&amd_pstate_driver_lock);
> +
> + return ret < 0 ? ret : count;
> +}
> +
> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> cpufreq_freq_attr_rw(energy_performance_preference);
> cpufreq_freq_attr_ro(energy_performance_available_preferences);
> +define_one_global_rw(status);
>
> static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> @@ -794,6 +884,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> NULL,
> };
>
> +static struct attribute *pstate_global_attributes[] = {
> + &status.attr,
> + NULL
> +};
> +
> +static const struct attribute_group amd_pstate_global_attr_group = {
> + .attrs = pstate_global_attributes,
> +};
> +
> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> {
> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@ -1153,6 +1252,25 @@ static int __init amd_pstate_init(void)
> if (ret)
> pr_err("failed to register with return %d\n", ret);
>
> + amd_pstate_kobj = kobject_create_and_add("amd_pstate", &cpu_subsys.dev_root->kobj);
> + if (!amd_pstate_kobj) {
> + ret = -EINVAL;
> + pr_err("global sysfs registration failed.\n");
> + goto kobject_free;
> + }
> +
> + ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
> + if (ret) {
> + pr_err("sysfs attribute export failed with error %d.\n", ret);
> + goto global_attr_free;
> + }
> +
> + return ret;
> +
> +global_attr_free:
> + kobject_put(amd_pstate_kobj);
> +kobject_free:
> + cpufreq_unregister_driver(current_pstate_driver);
> return ret;
> }
> device_initcall(amd_pstate_init);
> --
> 2.34.1
>

2023-01-17 14:58:27

by Huang Rui

[permalink] [raw]
Subject: Re: [PATCH v10 12/12] Documentation: amd-pstate: introduce new global sysfs attributes

On Fri, Jan 06, 2023 at 02:14:20PM +0800, Yuan, Perry wrote:
> The amd-pstate driver supports switching working modes at runtime.
> Users can view and change modes by interacting with the "status" sysfs
> attribute.
>
> 1) check driver mode:
> $ cat /sys/devices/system/cpu/amd-pstate/status
>
> 2) switch mode:
> `# echo "passive" | sudo tee /sys/devices/system/cpu/amd-pstate/status`
> or
> `# echo "active" | sudo tee /sys/devices/system/cpu/amd-pstate/status`
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Patch 9, 10, and 12 are Acked-by: Huang Rui <[email protected]>

> ---
> Documentation/admin-guide/pm/amd-pstate.rst | 29 +++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
> index 62744dae3c5f..df0637a49f47 100644
> --- a/Documentation/admin-guide/pm/amd-pstate.rst
> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
> @@ -339,6 +339,35 @@ processor must provide at least nominal performance requested and go higher if c
> operating conditions allow.
>
>
> +User Space Interface in ``sysfs``
> +=================================
> +
> +Global Attributes
> +-----------------
> +
> +``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
> +control its functionality at the system level. They are located in the
> +``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.
> +
> +``status``
> + Operation mode of the driver: "active", "passive" or "disable".
> +
> + "active"
> + The driver is functional and in the ``active mode``
> +
> + "passive"
> + The driver is functional and in the ``passive mode``
> +
> + "disable"
> + The driver is unregistered and not functional now.
> +
> + This attribute can be written to in order to change the driver's
> + operation mode or to unregister it. The string written to it must be
> + one of the possible values of it and, if successful, writing one of
> + these values to the sysfs file will cause the driver to switch over
> + to the operation mode represented by that string - or to be
> + unregistered in the "disable" case.
> +
> ``cpupower`` tool support for ``amd-pstate``
> ===============================================
>
> --
> 2.34.1
>

2023-01-18 07:49:13

by Wyes Karny

[permalink] [raw]
Subject: Re: [PATCH v10 08/12] cpufreq: amd-pstate: add driver working mode switch support

Hi Ray, Perry,

On 1/17/2023 7:26 PM, Huang Rui wrote:
> On Fri, Jan 06, 2023 at 02:14:16PM +0800, Yuan, Perry wrote:
>> From: Perry Yuan <[email protected]>
>>
>> While amd-pstate driver was loaded with specific driver mode, it will
>> need to check which mode is enabled for the pstate driver,add this sysfs
>> entry to show the current status
>>
>> $ cat /sys/devices/system/cpu/amd-pstate/status
>> active
>>
>> Meanwhile, user can switch the pstate driver mode with writing mode
>> string to sysfs entry as below.
>>
>> Enable passive mode:
>> $ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"
>>
>> Enable active mode (EPP driver mode):
>> $ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"
>>
>> Reviewed-by: Mario Limonciello <[email protected]>
>> Signed-off-by: Perry Yuan <[email protected]>
>
> Patch is Acked-by: Huang Rui <[email protected]>
>
> Wyes, may we have a ack or any comment from you of this patch, as you are
> working on the new guided autonomous mode?

This looks good to me.

Reviewed-by: Wyes Karny <[email protected]>

>
> Thanks,
> Ray
>
>> ---
>> drivers/cpufreq/amd-pstate.c | 118 +++++++++++++++++++++++++++++++++++
>> 1 file changed, 118 insertions(+)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 2c8b126cec5a..d3f4cdd803f3 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -63,6 +63,7 @@ static struct cpufreq_driver *current_pstate_driver;
>> static struct cpufreq_driver amd_pstate_driver;
>> static struct cpufreq_driver amd_pstate_epp_driver;
>> static int cppc_state = AMD_PSTATE_DISABLE;
>> +struct kobject *amd_pstate_kobj;
>>
>> static inline int get_mode_idx_from_str(const char *str, size_t size)
>> {
>> @@ -631,6 +632,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>> policy->driver_data = cpudata;
>>
>> amd_pstate_boost_init(cpudata);
>> + if (!current_pstate_driver->adjust_perf)
>> + current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
>>
>> return 0;
>>
>> @@ -771,12 +774,99 @@ static ssize_t show_energy_performance_preference(
>> return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
>> }
>>
>> +static ssize_t amd_pstate_show_status(char *buf)
>> +{
>> + if (!current_pstate_driver)
>> + return sysfs_emit(buf, "disable\n");
>> +
>> + return sysfs_emit(buf, "%s\n", amd_pstate_mode_string[cppc_state]);
>> +}
>> +
>> +static void amd_pstate_driver_cleanup(void)
>> +{
>> + current_pstate_driver = NULL;
>> +}
>> +
>> +static int amd_pstate_update_status(const char *buf, size_t size)
>> +{
>> + int ret;
>> + int mode_idx;
>> +
>> + if (size > 7 || size < 6)
>> + return -EINVAL;
>> + mode_idx = get_mode_idx_from_str(buf, size);
>> +
>> + switch(mode_idx) {
>> + case AMD_PSTATE_DISABLE:
>> + if (!current_pstate_driver)
>> + return -EINVAL;
>> + if (cppc_state == AMD_PSTATE_ACTIVE)
>> + return -EBUSY;
>> + ret = cpufreq_unregister_driver(current_pstate_driver);
>> + amd_pstate_driver_cleanup();
>> + break;
>> + case AMD_PSTATE_PASSIVE:
>> + if (current_pstate_driver) {
>> + if (current_pstate_driver == &amd_pstate_driver)
>> + return 0;
>> + cpufreq_unregister_driver(current_pstate_driver);
>> + cppc_state = AMD_PSTATE_PASSIVE;
>> + current_pstate_driver = &amd_pstate_driver;
>> + }
>> +
>> + ret = cpufreq_register_driver(current_pstate_driver);
>> + break;
>> + case AMD_PSTATE_ACTIVE:
>> + if (current_pstate_driver) {
>> + if (current_pstate_driver == &amd_pstate_epp_driver)
>> + return 0;
>> + cpufreq_unregister_driver(current_pstate_driver);
>> + current_pstate_driver = &amd_pstate_epp_driver;
>> + cppc_state = AMD_PSTATE_ACTIVE;
>> + }
>> +
>> + ret = cpufreq_register_driver(current_pstate_driver);
>> + break;
>> + default:
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static ssize_t show_status(struct kobject *kobj,
>> + struct kobj_attribute *attr, char *buf)
>> +{
>> + ssize_t ret;
>> +
>> + mutex_lock(&amd_pstate_driver_lock);
>> + ret = amd_pstate_show_status(buf);
>> + mutex_unlock(&amd_pstate_driver_lock);
>> +
>> + return ret;
>> +}
>> +
>> +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
>> + const char *buf, size_t count)
>> +{
>> + char *p = memchr(buf, '\n', count);
>> + int ret;
>> +
>> + mutex_lock(&amd_pstate_driver_lock);
>> + ret = amd_pstate_update_status(buf, p ? p - buf : count);
>> + mutex_unlock(&amd_pstate_driver_lock);
>> +
>> + return ret < 0 ? ret : count;
>> +}
>> +
>> cpufreq_freq_attr_ro(amd_pstate_max_freq);
>> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>>
>> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
>> cpufreq_freq_attr_rw(energy_performance_preference);
>> cpufreq_freq_attr_ro(energy_performance_available_preferences);
>> +define_one_global_rw(status);
>>
>> static struct freq_attr *amd_pstate_attr[] = {
>> &amd_pstate_max_freq,
>> @@ -794,6 +884,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
>> NULL,
>> };
>>
>> +static struct attribute *pstate_global_attributes[] = {
>> + &status.attr,
>> + NULL
>> +};
>> +
>> +static const struct attribute_group amd_pstate_global_attr_group = {
>> + .attrs = pstate_global_attributes,
>> +};
>> +
>> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>> {
>> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
>> @@ -1153,6 +1252,25 @@ static int __init amd_pstate_init(void)
>> if (ret)
>> pr_err("failed to register with return %d\n", ret);
>>
>> + amd_pstate_kobj = kobject_create_and_add("amd_pstate", &cpu_subsys.dev_root->kobj);
>> + if (!amd_pstate_kobj) {
>> + ret = -EINVAL;
>> + pr_err("global sysfs registration failed.\n");
>> + goto kobject_free;
>> + }
>> +
>> + ret = sysfs_create_group(amd_pstate_kobj, &amd_pstate_global_attr_group);
>> + if (ret) {
>> + pr_err("sysfs attribute export failed with error %d.\n", ret);
>> + goto global_attr_free;
>> + }
>> +
>> + return ret;
>> +
>> +global_attr_free:
>> + kobject_put(amd_pstate_kobj);
>> +kobject_free:
>> + cpufreq_unregister_driver(current_pstate_driver);
>> return ret;
>> }
>> device_initcall(amd_pstate_init);
>> --
>> 2.34.1
>>

--
Thanks & Regards,
Wyes

2023-01-18 08:19:34

by Yuan, Perry

[permalink] [raw]
Subject: RE: [PATCH v10 08/12] cpufreq: amd-pstate: add driver working mode switch support

[AMD Official Use Only - General]

Hi Ray, Wyse.

> -----Original Message-----
> From: Karny, Wyes <[email protected]>
> Sent: Wednesday, January 18, 2023 2:35 PM
> To: Huang, Ray <[email protected]>; Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Sharma, Deepak
> <[email protected]>; Fontenot, Nathan
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng, Li
> (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v10 08/12] cpufreq: amd-pstate: add driver working mode
> switch support
>
> Hi Ray, Perry,
>
> On 1/17/2023 7:26 PM, Huang Rui wrote:
> > On Fri, Jan 06, 2023 at 02:14:16PM +0800, Yuan, Perry wrote:
> >> From: Perry Yuan <[email protected]>
> >>
> >> While amd-pstate driver was loaded with specific driver mode, it will
> >> need to check which mode is enabled for the pstate driver,add this
> >> sysfs entry to show the current status
> >>
> >> $ cat /sys/devices/system/cpu/amd-pstate/status
> >> active
> >>
> >> Meanwhile, user can switch the pstate driver mode with writing mode
> >> string to sysfs entry as below.
> >>
> >> Enable passive mode:
> >> $ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"
> >>
> >> Enable active mode (EPP driver mode):
> >> $ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"
> >>
> >> Reviewed-by: Mario Limonciello <[email protected]>
> >> Signed-off-by: Perry Yuan <[email protected]>
> >
> > Patch is Acked-by: Huang Rui <[email protected]>
> >
> > Wyes, may we have a ack or any comment from you of this patch, as you
> > are working on the new guided autonomous mode?
>
> This looks good to me.
>
> Reviewed-by: Wyes Karny <[email protected]>
>
> >
> > Thanks,
> > Ray

Thanks a lot for the ack and review flags, I will pick up all the new flags in v11.

Perry.

> >
> >> ---
> >> drivers/cpufreq/amd-pstate.c | 118
> >> +++++++++++++++++++++++++++++++++++
> >> 1 file changed, 118 insertions(+)
> >>
> >> diff --git a/drivers/cpufreq/amd-pstate.c
> >> b/drivers/cpufreq/amd-pstate.c index 2c8b126cec5a..d3f4cdd803f3
> >> 100644
> >> --- a/drivers/cpufreq/amd-pstate.c
> >> +++ b/drivers/cpufreq/amd-pstate.c
> >> @@ -63,6 +63,7 @@ static struct cpufreq_driver
> >> *current_pstate_driver; static struct cpufreq_driver
> >> amd_pstate_driver; static struct cpufreq_driver
> >> amd_pstate_epp_driver; static int cppc_state = AMD_PSTATE_DISABLE;
> >> +struct kobject *amd_pstate_kobj;
> >>
> >> static inline int get_mode_idx_from_str(const char *str, size_t
> >> size) { @@ -631,6 +632,8 @@ static int amd_pstate_cpu_init(struct
> >> cpufreq_policy *policy)
> >> policy->driver_data = cpudata;
> >>
> >> amd_pstate_boost_init(cpudata);
> >> + if (!current_pstate_driver->adjust_perf)
> >> + current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> >>
> >> return 0;
> >>
> >> @@ -771,12 +774,99 @@ static ssize_t
> show_energy_performance_preference(
> >> return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]); }
> >>
> >> +static ssize_t amd_pstate_show_status(char *buf) {
> >> + if (!current_pstate_driver)
> >> + return sysfs_emit(buf, "disable\n");
> >> +
> >> + return sysfs_emit(buf, "%s\n", amd_pstate_mode_string[cppc_state]);
> >> +}
> >> +
> >> +static void amd_pstate_driver_cleanup(void) {
> >> + current_pstate_driver = NULL;
> >> +}
> >> +
> >> +static int amd_pstate_update_status(const char *buf, size_t size) {
> >> + int ret;
> >> + int mode_idx;
> >> +
> >> + if (size > 7 || size < 6)
> >> + return -EINVAL;
> >> + mode_idx = get_mode_idx_from_str(buf, size);
> >> +
> >> + switch(mode_idx) {
> >> + case AMD_PSTATE_DISABLE:
> >> + if (!current_pstate_driver)
> >> + return -EINVAL;
> >> + if (cppc_state == AMD_PSTATE_ACTIVE)
> >> + return -EBUSY;
> >> + ret = cpufreq_unregister_driver(current_pstate_driver);
> >> + amd_pstate_driver_cleanup();
> >> + break;
> >> + case AMD_PSTATE_PASSIVE:
> >> + if (current_pstate_driver) {
> >> + if (current_pstate_driver == &amd_pstate_driver)
> >> + return 0;
> >> + cpufreq_unregister_driver(current_pstate_driver);
> >> + cppc_state = AMD_PSTATE_PASSIVE;
> >> + current_pstate_driver = &amd_pstate_driver;
> >> + }
> >> +
> >> + ret = cpufreq_register_driver(current_pstate_driver);
> >> + break;
> >> + case AMD_PSTATE_ACTIVE:
> >> + if (current_pstate_driver) {
> >> + if (current_pstate_driver == &amd_pstate_epp_driver)
> >> + return 0;
> >> + cpufreq_unregister_driver(current_pstate_driver);
> >> + current_pstate_driver = &amd_pstate_epp_driver;
> >> + cppc_state = AMD_PSTATE_ACTIVE;
> >> + }
> >> +
> >> + ret = cpufreq_register_driver(current_pstate_driver);
> >> + break;
> >> + default:
> >> + ret = -EINVAL;
> >> + break;
> >> + }
> >> +
> >> + return ret;
> >> +}
> >> +
> >> +static ssize_t show_status(struct kobject *kobj,
> >> + struct kobj_attribute *attr, char *buf) {
> >> + ssize_t ret;
> >> +
> >> + mutex_lock(&amd_pstate_driver_lock);
> >> + ret = amd_pstate_show_status(buf);
> >> + mutex_unlock(&amd_pstate_driver_lock);
> >> +
> >> + return ret;
> >> +}
> >> +
> >> +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
> >> + const char *buf, size_t count) {
> >> + char *p = memchr(buf, '\n', count);
> >> + int ret;
> >> +
> >> + mutex_lock(&amd_pstate_driver_lock);
> >> + ret = amd_pstate_update_status(buf, p ? p - buf : count);
> >> + mutex_unlock(&amd_pstate_driver_lock);
> >> +
> >> + return ret < 0 ? ret : count;
> >> +}
> >> +
> >> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >>
> >> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> >> cpufreq_freq_attr_rw(energy_performance_preference);
> >> cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >> +define_one_global_rw(status);
> >>
> >> static struct freq_attr *amd_pstate_attr[] = {
> >> &amd_pstate_max_freq,
> >> @@ -794,6 +884,15 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> >> NULL,
> >> };
> >>
> >> +static struct attribute *pstate_global_attributes[] = {
> >> + &status.attr,
> >> + NULL
> >> +};
> >> +
> >> +static const struct attribute_group amd_pstate_global_attr_group = {
> >> + .attrs = pstate_global_attributes,
> >> +};
> >> +
> >> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) {
> >> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> >> @@ -1153,6 +1252,25 @@ static int __init amd_pstate_init(void)
> >> if (ret)
> >> pr_err("failed to register with return %d\n", ret);
> >>
> >> + amd_pstate_kobj = kobject_create_and_add("amd_pstate",
> &cpu_subsys.dev_root->kobj);
> >> + if (!amd_pstate_kobj) {
> >> + ret = -EINVAL;
> >> + pr_err("global sysfs registration failed.\n");
> >> + goto kobject_free;
> >> + }
> >> +
> >> + ret = sysfs_create_group(amd_pstate_kobj,
> &amd_pstate_global_attr_group);
> >> + if (ret) {
> >> + pr_err("sysfs attribute export failed with error %d.\n", ret);
> >> + goto global_attr_free;
> >> + }
> >> +
> >> + return ret;
> >> +
> >> +global_attr_free:
> >> + kobject_put(amd_pstate_kobj);
> >> +kobject_free:
> >> + cpufreq_unregister_driver(current_pstate_driver);
> >> return ret;
> >> }
> >> device_initcall(amd_pstate_init);
> >> --
> >> 2.34.1
> >>
>
> --
> Thanks & Regards,
> Wyes