Hi all,
This patchset implements a new AMD CPU frequency driver instance,
`amd-pstate-epp`, for better performance and power control.
CPPC has a parameter called Energy Performance Preference (EPP).
The CCLK DPM controller uses the EPP to drive the frequency at which a core
operates during short periods of activity.
Different OS profiles (balanced, performance, power savings) use different EPP values.
The AMD EPP gives the hardware a hint about whether software wants
to bias toward performance (0x0) or energy efficiency (0xff).
The low-level power firmware calculates the runtime frequency according to the EPP
value, so the EPP hint affects how quickly the CPU core frequency responds.
We use the RAPL interface with the "perf" tool to measure the package energy.
Performance Per Watt (PPW) Calculation:
The PPW calculation follows the paper below:
https://software.intel.com/content/dam/develop/external/us/en/documents/performance-per-what-paper.pdf
The formula below is taken from that paper and is used to measure the PPW:
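(F / t) / P = (F / t) * (t / E) = F / E,
"F" is the number of frames rendered (the work done), so "F / t" is frames per second.
"t" is the elapsed time measured in seconds.
"P" is power measured in watts (P = E / t).
"E" is energy measured in joules.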
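As a worked illustration of the formula (not part of this series), the sketch below computes PPW = F / E and the relative improvement against a baseline. The helper names ppw() and ppw_improvement_pct() are hypothetical, and one benchmark run is treated as one unit of work, which is an assumption that matches the Gitsource PPW column (1 / Energy).

```c
#include <assert.h>

/* PPW = (work / t) / P = work / E, because P = E / t. */
static double ppw(double work, double energy_joules)
{
	assert(energy_joules > 0.0);
	return work / energy_joules;
}

/* Relative PPW change versus a baseline, in percent. */
static double ppw_improvement_pct(double ppw_new, double ppw_base)
{
	assert(ppw_base > 0.0);
	return (ppw_new / ppw_base - 1.0) * 100.0;
}
```

For example, ppw(1.0, 17074.8) gives the acpi-cpufreq:schedutil baseline, and comparing ppw(1.0, 19877.6) against it gives roughly the -14.10% listed for acpi-cpufreq:ondemand.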
Gitsource Benchmark Data on ROME Server CPU
+------------------------------+---------------------+------------+---------------------+
| Kernel Module                | PPW (1 / (s * J))   | Energy (J) | PPW Improvement (%) |
+==============================+=====================+============+=====================+
| acpi-cpufreq:schedutil       | 5.85658E-05         | 17074.8    | base                |
+------------------------------+---------------------+------------+---------------------+
| acpi-cpufreq:ondemand        | 5.03079E-05         | 19877.6    | -14.10%             |
+------------------------------+---------------------+------------+---------------------+
| acpi-cpufreq:performance     | 5.88132E-05         | 17003      | 0.42%               |
+------------------------------+---------------------+------------+---------------------+
| amd-pstate:ondemand          | 4.60295E-05         | 21725.2    | -21.41%             |
+------------------------------+---------------------+------------+---------------------+
| amd-pstate:schedutil         | 4.70026E-05         | 21275.4    | -19.7%              |
+------------------------------+---------------------+------------+---------------------+
| amd-pstate:performance       | 5.80094E-05         | 17238.6    | -0.95%              |
+------------------------------+---------------------+------------+---------------------+
| EPP:performance              | 5.8292E-05          | 17155      | -0.47%              |
+------------------------------+---------------------+------------+---------------------+
| EPP:balance_performance      | 6.71709E-05         | 14887.4    | 14.69%              |
+------------------------------+---------------------+------------+---------------------+
| EPP:power                    | 6.66951E-05         | 14993.6    | 13.88%              |
+------------------------------+---------------------+------------+---------------------+
Tbench Benchmark Data on ROME Server CPU
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| Kernel Module                       | PPW (MB / (s * J)) | Throughput (MB/s) | Energy (J) | PPW Improvement (%) |
+=====================================+====================+===================+============+=====================+
| acpi_cpufreq: schedutil             | 46.39              | 17191             | 37057.3    | base                |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| acpi_cpufreq: ondemand              | 51.51              | 19269.5           | 37406.5    | 11.04%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| acpi_cpufreq: performance           | 45.96              | 17063.7           | 37123.7    | -0.74%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| EPP:powersave: performance(0)       | 54.46              | 20263.1           | 37205      | 17.87%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| EPP:powersave: balance_performance  | 55.03              | 20481.9           | 37221.5    | 19.14%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| EPP:powersave: balance_power        | 54.43              | 20245.9           | 37194.2    | 17.77%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| EPP:powersave: power(255)           | 54.26              | 20181.7           | 37197.4    | 17.40%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| amd-pstate: schedutil               | 48.22              | 17844.9           | 37006.6    | 3.80%               |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| amd-pstate: ondemand                | 61.30              | 22988             | 37503.4    | 33.72%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
| amd-pstate: performance             | 54.52              | 20252.6           | 37147.8    | 17.81%              |
+-------------------------------------+--------------------+-------------------+------------+---------------------+
changes from v7:
* remove iowait boost functions code
* pick up Acked-by flag from Huang Ray
* add one new patch to support multiple working modes in amd_pstate_param(), aligned with Wyse
* drop the patch "[v7 08/13] cpufreq: amd-pstate: add frequency dynamic boost sysfs control"
* replace cppc_get_epp_caps() with the new cppc_get_epp_perf(), which is
  simpler to use
* remove I/O wait boost code from amd_pstate_update_status()
* replace the cppc_active variable with the enum type AMD_PSTATE_ACTIVE
* squash amd_pstate_epp_verify_policy() into a single function
* remove the "amd pstate" string from the pr_err and pr_debug log messages
* rework patch [v7 03/13], move the common EPP profiles declaration
  into cpufreq.h, which will be used by amd-pstate and intel-pstate
* combine the amd-pstate init functions
* remove epp_powersave from amd-pstate.h and drop the related code
* move amd_pstate_params{} from amd-pstate.h into amd-pstate.c
* address some other feedback from Huang Ray
changes from v6:
* fix one legacy kernel hang issue when unregistering the amd-pstate driver
* add new documentation to introduce new global sysfs attributes
* use sysfs_emit_at() to print the EPP profiles array
* update commit info for v6 patch 1/11 as Mario suggested
* tried to add the EPP profiles into cpufreq.h, but it caused lots
  of build failures, so cpufreq_common.h is still used in v7
* update commit info using amd-pstate as prefix, same as before
* remove CONFIG_ACPI for the header as Ray suggested
* move amd_pstate_kobj to where it is used in patch "add frequency dynamic boost sysfs control"
* address feedback from Huang Ray: remove the X86_FEATURE_CPPC check from the EPP init
* address feedback from Mario
changes from v5:
* add one common header `cpufreq_common.h` to extract the EPP profiles
  definition for the amd and intel pstate drivers
* remove the epp_off value to avoid confusion
* convert some other sysfs sprintf() calls to sysfs_emit() and add one new patch
* add ACPI PM server profile detection to enable dynamic boost control
* fix some code formatting reported by the checkpatch script
* move the EPP profile declaration into the common header file `cpufreq_common.h`
* fix commit typos
changes from v4:
* rebase driver based on the v6.1-rc7
* remove the builtin changes patch because the pstate driver has been
  changed to builtin type by a patch in another thread
* update Documentation: amd-pstate: add amd pstate driver mode introduction
* replace sprintf() with sysfs_emit()
* fix typo for cppc_set_epp_perf() in cppc_acpi.h header
changes from v3:
* add one more documentation update patch for the active and passive mode
  introduction
* address most of the feedback from Mario
* address feedback from Rafael for the cppc_acpi driver
* remove the EPP raw data set/get functions
* select the amd-pstate driver by passing a kernel parameter
* keep the amd-pstate driver disabled by default if no kernel parameter
  is passed at boot
* combine cppc_set_auto_epp and cppc_set_epp_perf
* pick up Reviewed-by flag from Mario
changes from v2:
* change the pstate driver from module to builtin type
* drop patch "export cpufreq cpu release and acquire"
* squash the shared memory patch into the single EPP implementation patch
* add one new patch to support frequency boost control
* add a patch to expose driver working status checking
* rebase the driver onto the v6.1-rc4 kernel release
* move some declarations to amd-pstate.h
* address feedback from Mario for the online/offline patch
* address feedback from Mario for the suspend/resume patch
* address feedback from Ray for the cppc_acpi and some other patches
* address feedback from Nathan for the EPP patch
changes from v1:
* rebased to v6.0
* address feedback from Mario for the suspend/resume patch
* address feedback from Nathan for the EPP support on MSR type
* fix some typos and code style indentation problems
* update commit comments for patch 4/7
* change the `epp_enabled` module param name to `epp`
* set the default epp mode to false
* add testing for the x86_energy_perf_policy utility patchset (that
  utility patchset will be sent in another thread)
v7: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v3: https://lore.kernel.org/all/[email protected]/
v2: https://lore.kernel.org/all/[email protected]/
v1: https://lore.kernel.org/all/[email protected]/
Perry Yuan (13):
ACPI: CPPC: Add AMD pstate energy performance preference cppc control
Documentation: amd-pstate: add EPP profiles introduction
cpufreq: intel_pstate: use common macro definition for Energy
Preference Performance(EPP)
cpufreq: amd-pstate: fix kernel hang issue while amd-pstate
unregistering
cpufreq: amd-pstate: optimize driver working mode selection in
amd_pstate_param()
cpufreq: amd-pstate: implement Pstate EPP support for the AMD
processors
cpufreq: amd-pstate: implement amd pstate cpu online and offline
callback
cpufreq: amd-pstate: implement suspend and resume callbacks
cpufreq: amd-pstate: add driver working mode switch support
Documentation: amd-pstate: add amd pstate driver mode introduction
Documentation: introduce amd pstate active mode kernel command line
options
cpufreq: amd-pstate: convert sprintf with sysfs_emit()
Documentation: amd-pstate: introduce new global sysfs attributes
.../admin-guide/kernel-parameters.txt | 7 +
Documentation/admin-guide/pm/amd-pstate.rst | 73 +-
drivers/acpi/cppc_acpi.c | 76 +-
drivers/cpufreq/amd-pstate.c | 740 +++++++++++++++++-
drivers/cpufreq/intel_pstate.c | 13 +-
include/acpi/cppc_acpi.h | 12 +
include/linux/amd-pstate.h | 40 +
include/linux/cpufreq.h | 11 +
8 files changed, 938 insertions(+), 34 deletions(-)
--
2.34.1
Make the energy performance preference strings and profiles available
through one common header for the intel_pstate driver, so that the
amd_pstate EPP driver can use the common header as well. This simplifies
both the intel_pstate and amd_pstate drivers.
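To illustrate how a second driver could consume the shared tables, here is a minimal userspace sketch (not kernel code, not part of this patch). The enum and the profile strings mirror the patch; the numeric EPP values are assumptions standing in for the HWP_EPP_* macros (only 0x0, 0x80 and 0xff are stated in this series), and epp_index_from_name() is a hypothetical stand-in for the kernel's match_string().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum energy_perf_value_index {
	EPP_INDEX_DEFAULT = 0,
	EPP_INDEX_PERFORMANCE,
	EPP_INDEX_BALANCE_PERFORMANCE,
	EPP_INDEX_BALANCE_POWERSAVE,
	EPP_INDEX_POWERSAVE,
};

/* Profile names as exported to userspace; the array is NULL-terminated. */
static const char * const energy_perf_strings[] = {
	[EPP_INDEX_DEFAULT] = "default",
	[EPP_INDEX_PERFORMANCE] = "performance",
	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
	[EPP_INDEX_POWERSAVE] = "power",
	NULL
};

/* Assumed numeric values; the patch uses the HWP_EPP_* macros instead. */
static const unsigned int epp_values[] = {
	[EPP_INDEX_DEFAULT] = 0,		/* unused index */
	[EPP_INDEX_PERFORMANCE] = 0x00,
	[EPP_INDEX_BALANCE_PERFORMANCE] = 0x80,
	[EPP_INDEX_BALANCE_POWERSAVE] = 0xC0,	/* assumption */
	[EPP_INDEX_POWERSAVE] = 0xFF,
};

/* Hypothetical helper: return the profile index for name, or -1. */
static int epp_index_from_name(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; energy_perf_strings[i] != NULL; i++)
		if (strcmp(energy_perf_strings[i], name) == 0)
			return i;
	return -1;
}
```

With both tables indexed by the same enum, a driver only needs the index to go from a user-visible string to the raw EPP value.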
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 13 +++----------
include/linux/cpufreq.h | 11 +++++++++++
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ad9be31753b6..93a60fdac0fc 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -640,15 +640,7 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
* 4 power
*/
-enum energy_perf_value_index {
- EPP_INDEX_DEFAULT = 0,
- EPP_INDEX_PERFORMANCE,
- EPP_INDEX_BALANCE_PERFORMANCE,
- EPP_INDEX_BALANCE_POWERSAVE,
- EPP_INDEX_POWERSAVE,
-};
-
-static const char * const energy_perf_strings[] = {
+const char * const energy_perf_strings[] = {
[EPP_INDEX_DEFAULT] = "default",
[EPP_INDEX_PERFORMANCE] = "performance",
[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
@@ -656,7 +648,8 @@ static const char * const energy_perf_strings[] = {
[EPP_INDEX_POWERSAVE] = "power",
NULL
};
-static unsigned int epp_values[] = {
+
+unsigned int epp_values[] = {
[EPP_INDEX_DEFAULT] = 0, /* Unused index */
[EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
[EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index d5595d57f4e5..e63309d497fe 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -20,6 +20,7 @@
#include <linux/pm_qos.h>
#include <linux/spinlock.h>
#include <linux/sysfs.h>
+#include <asm/msr.h>
/*********************************************************************
* CPUFREQ INTERFACE *
@@ -185,6 +186,16 @@ struct cpufreq_freqs {
u8 flags; /* flags of cpufreq_driver, see below. */
};
+enum energy_perf_value_index {
+ EPP_INDEX_DEFAULT = 0,
+ EPP_INDEX_PERFORMANCE,
+ EPP_INDEX_BALANCE_PERFORMANCE,
+ EPP_INDEX_BALANCE_POWERSAVE,
+ EPP_INDEX_POWERSAVE,
+};
+extern const char * const energy_perf_strings[];
+extern unsigned int epp_values[];
+
/* Only for ACPI */
#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
#define CPUFREQ_SHARED_TYPE_HW (1) /* HW does needed coordination */
--
2.34.1
The AMD P-State driver supports another firmware-based autonomous mode,
enabled by adding "amd_pstate=active" to the kernel command line.
In autonomous mode the SMU firmware decides frequencies at a 1 ms timescale
based on workload utilization, usage in other IPs, and infrastructure
limits such as power and thermals.
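For illustration, a minimal userspace sketch of mapping the "amd_pstate=" argument to a working mode is shown below. It is not the driver code: the "disable" string, the enum ordering and the helper names are assumptions, and it uses an exact strcmp() match rather than the strncmp() used by get_mode_idx_from_str() in the driver.

```c
#include <assert.h>
#include <string.h>

/* Working modes; the names follow the series, the ordering is assumed. */
enum amd_pstate_mode {
	AMD_PSTATE_DISABLE = 0,
	AMD_PSTATE_PASSIVE,
	AMD_PSTATE_ACTIVE,
	AMD_PSTATE_MAX,
};

static const char * const mode_strings[AMD_PSTATE_MAX] = {
	[AMD_PSTATE_DISABLE] = "disable",	/* assumed spelling */
	[AMD_PSTATE_PASSIVE] = "passive",
	[AMD_PSTATE_ACTIVE]  = "active",
};

/* Hypothetical helper: return the mode for str, or -1 if unknown. */
static int mode_from_str(const char *str)
{
	int i;

	assert(str != NULL);
	for (i = 0; i < AMD_PSTATE_MAX; i++)
		if (strcmp(str, mode_strings[i]) == 0)
			return i;
	return -1;
}

/* Only active mode selects the autonomous (EPP) driver instance. */
static int mode_uses_epp_driver(int mode)
{
	return mode == AMD_PSTATE_ACTIVE;
}
```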
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 42af9ca0127e..73a02816f6f8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6970,3 +6970,10 @@
management firmware translates the requests into actual
hardware states (core frequency, data fabric and memory
clocks etc.)
+ active
+ Use the amd_pstate_epp driver instance as the scaling driver.
+ The driver provides a hint to the CPPC firmware about whether
+ software wants to bias toward performance (0x0) or energy
+ efficiency (0xff). The CPPC power algorithm then evaluates
+ the runtime workload and adjusts the core frequency in
+ real time.
--
2.34.1
From: Perry Yuan <[email protected]>
This patch adds an introduction to the AMD P-State EPP feature and lists
the EPP preferences supported on AMD processors.
Users can read the supported list from the
energy_performance_available_preferences attribute file, or update the
current profile by writing to the energy_performance_preference file.
1) See all EPP profiles
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power
2) Check current EPP profile
$ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance
3) Set new EPP profile
$ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"
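The same checks can be done from a program. The sketch below is a hedged userspace example, not part of the patch: it reads one line from an attribute file and tests whether a profile name appears in the space-separated list printed by energy_performance_available_preferences. All three helper names are hypothetical, and the caller supplies the sysfs path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of an open attribute file and strip the newline. */
static int read_attr_line(FILE *fp, char *buf, size_t len)
{
	assert(fp != NULL && buf != NULL && len > 0);
	if (!fgets(buf, (int)len, fp))
		return -1;
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Convenience wrapper: open a sysfs path given by the caller. */
static int read_sysfs_attr(const char *path, char *buf, size_t len)
{
	FILE *fp = fopen(path, "r");
	int ret;

	if (!fp)
		return -1;
	ret = read_attr_line(fp, buf, len);
	fclose(fp);
	return ret;
}

/* Return 1 if name is a whole space-separated token in list, else 0. */
static int profile_is_available(const char *list, const char *name)
{
	size_t n = strlen(name);
	const char *p = list;

	if (n == 0)
		return 0;
	while (*p) {
		size_t tok;

		p += strspn(p, " ");	/* skip separators */
		tok = strcspn(p, " ");	/* length of the next token */
		if (tok == n && strncmp(p, name, n) == 0)
			return 1;
		p += tok;
	}
	return 0;
}
```

Matching whole tokens matters here because "power" is also a suffix of "balance_power".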
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 06e23538f79c..33ab8ec8fc2f 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.
+``energy_performance_available_preferences``
+
+A list of all the supported EPP preferences that could be used for
+``energy_performance_preference`` on this system.
+These profiles represent different hints that are provided
+to the low-level firmware about the user's desired performance vs energy-efficiency
+tradeoff. ``default`` means that the EPP value is set by the platform
+firmware. This attribute is read-only.
+
+``energy_performance_preference``
+
+The current energy performance preference can be read from this attribute,
+and the user can change it according to energy or performance needs.
+The list of supported profiles is available from the
+``energy_performance_available_preferences`` attribute. Each profile maps to
+an integer value between 0 and 255 when the EPP feature is enabled by platform
+firmware; if the EPP feature is disabled, the driver ignores the written value.
+This attribute is read-write.
+
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
--
2.34.1
From: Perry Yuan <[email protected]>
Add EPP driver support for AMD SoCs which support a dedicated MSR for
CPPC. EPP is used by the DPM controller to configure the frequency that
a core operates at during short periods of activity.
The SoC EPP targets are configured on a scale from 0 to 255 where 0
represents maximum performance and 255 represents maximum efficiency.
The amd-pstate driver exports profile string names to userspace that are
tied to specific EPP values.
The balance_performance string (0x80) provides the best balance between
performance and power efficiency on most systems, but users can choose
other strings to meet their needs as well.
$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
default performance balance_performance balance_power power
$ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
balance_performance
To enable the driver, add `amd_pstate=active` to the kernel command line;
the kernel will then load the active mode EPP driver.
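As a standalone illustration of the field layout used in the diff below (EPP in bits 31:24 of the cached CPPC request value), here is a minimal userspace sketch. The macro and function names are hypothetical; only the bit positions are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EPP_SHIFT	24
#define EPP_MASK	(UINT64_C(0xff) << EPP_SHIFT)	/* bits 31:24 */

/* Replace the EPP field of a request value, leaving other bits intact. */
static uint64_t req_set_epp(uint64_t req, unsigned int epp)
{
	assert(epp <= 0xff);	/* 0 = max performance, 255 = max efficiency */
	req &= ~EPP_MASK;
	req |= (uint64_t)epp << EPP_SHIFT;
	return req;
}

/* Extract the EPP field from a request value. */
static unsigned int req_get_epp(uint64_t req)
{
	return (unsigned int)((req & EPP_MASK) >> EPP_SHIFT);
}
```

Clearing the field before OR-ing in the new value is what keeps a previous, larger EPP from leaking into the result.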
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 447 ++++++++++++++++++++++++++++++++++-
include/linux/amd-pstate.h | 10 +
2 files changed, 451 insertions(+), 6 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 861a905f9324..66b39457a312 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -59,7 +59,10 @@
* we disable it by default to go acpi-cpufreq on these processors and add a
* module parameter to be able to enable it manually for debugging.
*/
+static struct cpufreq_driver *default_pstate_driver;
static struct cpufreq_driver amd_pstate_driver;
+static struct cpufreq_driver amd_pstate_epp_driver;
+static struct amd_cpudata **all_cpu_data;
static int cppc_state = AMD_PSTATE_DISABLE;
static inline int get_mode_idx_from_str(const char *str, size_t size)
@@ -70,9 +73,128 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
if (!strncmp(str, amd_pstate_mode_string[i], size))
return i;
}
+
return -EINVAL;
}
+/**
+ * struct amd_pstate_params - global parameters for the performance control
+ * @cppc_boost_disabled: whether the core performance boost is disabled
+ */
+struct amd_pstate_params {
+ bool cppc_boost_disabled;
+};
+
+static struct amd_pstate_params global_params;
+
+static DEFINE_MUTEX(amd_pstate_limits_lock);
+static DEFINE_MUTEX(amd_pstate_driver_lock);
+
+static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
+{
+ u64 epp;
+ int ret;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ if (!cppc_req_cached) {
+ epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
+ &cppc_req_cached);
+ if (epp)
+ return epp;
+ }
+ epp = (cppc_req_cached >> 24) & 0xFF;
+ } else {
+ ret = cppc_get_epp_perf(cpudata->cpu, &epp);
+ if (ret < 0) {
+ pr_debug("Could not retrieve energy perf value (%d)\n", ret);
+ return -EIO;
+ }
+ }
+
+ return (s16)(epp & 0xff);
+}
+
+static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
+{
+ s16 epp;
+ int index = -EINVAL;
+
+ epp = amd_pstate_get_epp(cpudata, 0);
+ if (epp < 0)
+ return epp;
+
+ switch (epp) {
+ case HWP_EPP_PERFORMANCE:
+ index = EPP_INDEX_PERFORMANCE;
+ break;
+ case HWP_EPP_BALANCE_PERFORMANCE:
+ index = EPP_INDEX_BALANCE_PERFORMANCE;
+ break;
+ case HWP_EPP_BALANCE_POWERSAVE:
+ index = EPP_INDEX_BALANCE_POWERSAVE;
+ break;
+ case HWP_EPP_POWERSAVE:
+ index = EPP_INDEX_POWERSAVE;
+ break;
+ default:
+ break;
+ }
+
+ return index;
+}
+
+static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
+{
+ int ret;
+ struct cppc_perf_ctrls perf_ctrls;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ u64 value = READ_ONCE(cpudata->cppc_req_cached);
+
+ value &= ~GENMASK_ULL(31, 24);
+ value |= (u64)epp << 24;
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+ ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
+ if (!ret)
+ cpudata->epp_cached = epp;
+ } else {
+ perf_ctrls.energy_perf = epp;
+ ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
+ if (ret) {
+ pr_debug("failed to set energy perf value (%d)\n", ret);
+ return ret;
+ }
+ cpudata->epp_cached = epp;
+ }
+
+ return ret;
+}
+
+static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
+ int pref_index)
+{
+ int epp = -EINVAL;
+ int ret;
+
+ if (!pref_index) {
+ pr_debug("EPP pref_index is invalid\n");
+ return -EINVAL;
+ }
+
+ if (epp == -EINVAL)
+ epp = epp_values[pref_index];
+
+ if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ pr_debug("EPP cannot be set under performance policy\n");
+ return -EBUSY;
+ }
+
+ ret = amd_pstate_set_epp(cpudata, epp);
+
+ return ret;
+}
+
static inline int pstate_enable(bool enable)
{
return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
@@ -81,11 +203,21 @@ static inline int pstate_enable(bool enable)
static int cppc_enable(bool enable)
{
int cpu, ret = 0;
+ struct cppc_perf_ctrls perf_ctrls;
for_each_present_cpu(cpu) {
ret = cppc_set_enable(cpu, enable);
if (ret)
return ret;
+
+ /* Enable autonomous mode for EPP */
+ if (cppc_state == AMD_PSTATE_ACTIVE) {
+ /* Set desired perf as zero to allow EPP firmware control */
+ perf_ctrls.desired_perf = 0;
+ ret = cppc_set_perf(cpu, &perf_ctrls);
+ if (ret)
+ return ret;
+ }
}
return ret;
@@ -429,7 +561,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
return;
cpudata->boost_supported = true;
- amd_pstate_driver.boost_enabled = true;
+ default_pstate_driver->boost_enabled = true;
}
static void amd_perf_ctl_reset(unsigned int cpu)
@@ -603,10 +735,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
return sprintf(&buf[0], "%u\n", perf);
}
+static ssize_t show_energy_performance_available_preferences(
+ struct cpufreq_policy *policy, char *buf)
+{
+ int i = 0;
+ int offset = 0;
+
+ while (energy_perf_strings[i] != NULL)
+ offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
+
+ sysfs_emit_at(buf, offset, "\n");
+
+ return offset;
+}
+
+static ssize_t store_energy_performance_preference(
+ struct cpufreq_policy *policy, const char *buf, size_t count)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ char str_preference[21];
+ ssize_t ret;
+
+ ret = sscanf(buf, "%20s", str_preference);
+ if (ret != 1)
+ return -EINVAL;
+
+ ret = match_string(energy_perf_strings, -1, str_preference);
+ if (ret < 0)
+ return -EINVAL;
+
+ mutex_lock(&amd_pstate_limits_lock);
+ ret = amd_pstate_set_energy_pref_index(cpudata, ret);
+ mutex_unlock(&amd_pstate_limits_lock);
+
+ return ret ?: count;
+}
+
+static ssize_t show_energy_performance_preference(
+ struct cpufreq_policy *policy, char *buf)
+{
+ struct amd_cpudata *cpudata = policy->driver_data;
+ int preference;
+
+ preference = amd_pstate_get_energy_pref_index(cpudata);
+ if (preference < 0)
+ return preference;
+
+ return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
+}
+
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+cpufreq_freq_attr_rw(energy_performance_preference);
+cpufreq_freq_attr_ro(energy_performance_available_preferences);
static struct freq_attr *amd_pstate_attr[] = {
&amd_pstate_max_freq,
@@ -615,6 +798,235 @@ static struct freq_attr *amd_pstate_attr[] = {
NULL,
};
+static struct freq_attr *amd_pstate_epp_attr[] = {
+ &amd_pstate_max_freq,
+ &amd_pstate_lowest_nonlinear_freq,
+ &amd_pstate_highest_perf,
+ &energy_performance_preference,
+ &energy_performance_available_preferences,
+ NULL,
+};
+
+static inline void update_boost_state(void)
+{
+ u64 misc_en;
+ struct amd_cpudata *cpudata;
+
+ cpudata = all_cpu_data[0];
+ rdmsrl(MSR_K7_HWCR, misc_en);
+ global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
+}
+
+static int amd_pstate_init_cpu(unsigned int cpunum)
+{
+ struct amd_cpudata *cpudata;
+
+ cpudata = all_cpu_data[cpunum];
+ if (!cpudata) {
+ cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
+ if (!cpudata)
+ return -ENOMEM;
+ WRITE_ONCE(all_cpu_data[cpunum], cpudata);
+
+ cpudata->cpu = cpunum;
+ }
+
+ cpudata->epp_policy = 0;
+ pr_debug("controlling: cpu %d\n", cpunum);
+ return 0;
+}
+
+static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
+{
+ int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
+ struct amd_cpudata *cpudata;
+ struct device *dev;
+ int rc;
+ u64 value;
+
+ rc = amd_pstate_init_cpu(policy->cpu);
+ if (rc)
+ return rc;
+
+ cpudata = all_cpu_data[policy->cpu];
+
+ dev = get_cpu_device(policy->cpu);
+ if (!dev)
+ goto free_cpudata1;
+
+ rc = amd_pstate_init_perf(cpudata);
+ if (rc)
+ goto free_cpudata1;
+
+ min_freq = amd_get_min_freq(cpudata);
+ max_freq = amd_get_max_freq(cpudata);
+ nominal_freq = amd_get_nominal_freq(cpudata);
+ lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
+ dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
+ min_freq, max_freq);
+ ret = -EINVAL;
+ goto free_cpudata1;
+ }
+
+ policy->min = min_freq;
+ policy->max = max_freq;
+
+ policy->cpuinfo.min_freq = min_freq;
+ policy->cpuinfo.max_freq = max_freq;
+ /* It will be updated by governor */
+ policy->cur = policy->cpuinfo.min_freq;
+
+ /* Initial processor data capability frequencies */
+ cpudata->max_freq = max_freq;
+ cpudata->min_freq = min_freq;
+ cpudata->nominal_freq = nominal_freq;
+ cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
+
+ policy->driver_data = cpudata;
+
+ cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
+
+ policy->min = policy->cpuinfo.min_freq;
+ policy->max = policy->cpuinfo.max_freq;
+
+ /*
+ * Set the policy to powersave to provide a valid fallback value in case
+ * the default cpufreq governor is neither powersave nor performance.
+ */
+ policy->policy = CPUFREQ_POLICY_POWERSAVE;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ policy->fast_switch_possible = true;
+ ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
+ if (ret)
+ return ret;
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+ ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
+ if (ret)
+ return ret;
+ WRITE_ONCE(cpudata->cppc_cap1_cached, value);
+ }
+ amd_pstate_boost_init(cpudata);
+
+ return 0;
+
+free_cpudata1:
+ kfree(cpudata);
+ return ret;
+}
+
+static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
+{
+ pr_debug("CPU %d exiting\n", policy->cpu);
+ policy->fast_switch_possible = false;
+ return 0;
+}
+
+static void amd_pstate_update_max_freq(unsigned int cpu)
+{
+ struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+
+ if (!policy)
+ return;
+
+ refresh_frequency_limits(policy);
+ cpufreq_cpu_put(policy);
+}
+
+static void amd_pstate_epp_update_limits(unsigned int cpu)
+{
+ mutex_lock(&amd_pstate_driver_lock);
+ update_boost_state();
+ if (global_params.cppc_boost_disabled) {
+ for_each_possible_cpu(cpu)
+ amd_pstate_update_max_freq(cpu);
+ } else {
+ cpufreq_update_policy(cpu);
+ }
+ mutex_unlock(&amd_pstate_driver_lock);
+}
+
+static void amd_pstate_epp_init(unsigned int cpu)
+{
+ struct amd_cpudata *cpudata = all_cpu_data[cpu];
+ u32 max_perf, min_perf;
+ u64 value;
+ s16 epp;
+
+ max_perf = READ_ONCE(cpudata->highest_perf);
+ min_perf = READ_ONCE(cpudata->lowest_perf);
+
+ value = READ_ONCE(cpudata->cppc_req_cached);
+
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ min_perf = max_perf;
+
+ /* Initial min/max values for CPPC Performance Controls Register */
+ value &= ~AMD_CPPC_MIN_PERF(~0L);
+ value |= AMD_CPPC_MIN_PERF(min_perf);
+
+ value &= ~AMD_CPPC_MAX_PERF(~0L);
+ value |= AMD_CPPC_MAX_PERF(max_perf);
+
+ /* The CPPC EPP feature requires the desired perf field to be zero */
+ value &= ~AMD_CPPC_DES_PERF(~0L);
+ value |= AMD_CPPC_DES_PERF(0);
+
+ if (cpudata->epp_policy == cpudata->policy)
+ goto skip_epp;
+
+ cpudata->epp_policy = cpudata->policy;
+
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ epp = amd_pstate_get_epp(cpudata, value);
+ if (epp < 0)
+ goto skip_epp;
+ /* force the epp value to be zero for performance policy */
+ epp = 0;
+ } else {
+ /* Get BIOS pre-defined epp value */
+ epp = amd_pstate_get_epp(cpudata, value);
+ if (epp)
+ goto skip_epp;
+ }
+ /* Set initial EPP value */
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ value &= ~GENMASK_ULL(31, 24);
+ value |= (u64)epp << 24;
+ }
+
+skip_epp:
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+ amd_pstate_set_epp(cpudata, epp);
+}
+
+static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
+{
+ struct amd_cpudata *cpudata;
+
+ if (!policy->cpuinfo.max_freq)
+ return -ENODEV;
+
+ pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
+ policy->cpuinfo.max_freq, policy->max);
+
+ cpudata = all_cpu_data[policy->cpu];
+ cpudata->policy = policy->policy;
+
+ amd_pstate_epp_init(policy->cpu);
+
+ return 0;
+}
+
+static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
+{
+ cpufreq_verify_within_cpu_limits(policy);
+ pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
+ return 0;
+}
+
static struct cpufreq_driver amd_pstate_driver = {
.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
.verify = amd_pstate_verify,
@@ -628,8 +1040,20 @@ static struct cpufreq_driver amd_pstate_driver = {
.attr = amd_pstate_attr,
};
+static struct cpufreq_driver amd_pstate_epp_driver = {
+ .flags = CPUFREQ_CONST_LOOPS,
+ .verify = amd_pstate_epp_verify_policy,
+ .setpolicy = amd_pstate_epp_set_policy,
+ .init = amd_pstate_epp_cpu_init,
+ .exit = amd_pstate_epp_cpu_exit,
+ .update_limits = amd_pstate_epp_update_limits,
+ .name = "amd_pstate_epp",
+ .attr = amd_pstate_epp_attr,
+};
+
static int __init amd_pstate_init(void)
{
+ static struct amd_cpudata **cpudata;
int ret;
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
@@ -656,7 +1080,8 @@ static int __init amd_pstate_init(void)
/* capability check */
if (boot_cpu_has(X86_FEATURE_CPPC)) {
pr_debug("AMD CPPC MSR based functionality is supported\n");
- amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
+ if (cppc_state == AMD_PSTATE_PASSIVE)
+ default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
} else {
pr_debug("AMD CPPC shared memory based functionality is supported\n");
static_call_update(amd_pstate_enable, cppc_enable);
@@ -664,17 +1089,21 @@ static int __init amd_pstate_init(void)
static_call_update(amd_pstate_update_perf, cppc_update_perf);
}
+ cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
+ if (!cpudata)
+ return -ENOMEM;
+ WRITE_ONCE(all_cpu_data, cpudata);
+
/* enable amd pstate feature */
ret = amd_pstate_enable(true);
if (ret) {
- pr_err("failed to enable amd-pstate with return %d\n", ret);
+ pr_err("failed to enable with return %d\n", ret);
return ret;
}
- ret = cpufreq_register_driver(&amd_pstate_driver);
+ ret = cpufreq_register_driver(default_pstate_driver);
if (ret)
- pr_err("failed to register amd_pstate_driver with return %d\n",
- ret);
+ pr_err("failed to register with return %d\n", ret);
return ret;
}
@@ -696,6 +1125,12 @@ static int __init amd_pstate_param(char *str)
if (cppc_state == AMD_PSTATE_DISABLE)
pr_info("driver is explicitly disabled\n");
+ if (cppc_state == AMD_PSTATE_ACTIVE)
+ default_pstate_driver = &amd_pstate_epp_driver;
+
+ if (cppc_state == AMD_PSTATE_PASSIVE)
+ default_pstate_driver = &amd_pstate_driver;
+
return 0;
}
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 922d05a13902..fe1aef743c09 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -47,6 +47,10 @@ struct amd_aperf_mperf {
* @prev: Last Aperf/Mperf/tsc count value read from register
* @freq: current cpu frequency value
* @boost_supported: check whether the Processor or SBIOS supports boost mode
+ * @epp_policy: Last saved policy used to set energy-performance preference
+ * @epp_cached: Cached CPPC energy-performance preference value
+ * @policy: Cpufreq policy value
+ * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
*
* The amd_cpudata is key private data for each CPU thread in AMD P-State, and
* represents all the attributes and goals that AMD P-State requests at runtime.
@@ -72,6 +76,12 @@ struct amd_cpudata {
u64 freq;
bool boost_supported;
+
+ /* EPP feature related attributes*/
+ s16 epp_policy;
+ s16 epp_cached;
+ u32 policy;
+ u64 cppc_cap1_cached;
};
/**
--
2.34.1
The new amd-pstate driver support to switch the driver working mode and
use can switch the driver mode within the sysfs attributes in the below
path and check current mode
$ cd /sys/devices/system/cpu/amd-pstate
check driver mode:
$ cat /sys/devices/system/cpu/amd-pstate/status
switch mode:
$ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"
or
$ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"
Signed-off-by: Perry Yuan <[email protected]>
---
Documentation/admin-guide/pm/amd-pstate.rst | 28 +++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 62744dae3c5f..f3a8f8a66783 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -339,6 +339,34 @@ processor must provide at least nominal performance requested and go higher if c
operating conditions allow.
+User Space Interface in ``sysfs``
+=================================
+
+Global Attributes
+-----------------
+
+``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
+control its functionality at the system level. They are located in the
+``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.
+
+``status``
+ Operation mode of the driver: "active", "passive" or "off".
+
+ "active"
+ The driver is functional and in the ``active mode``
+
+ "passive"
+ The driver is functional and in the ``passive mode``
+
+ "off"
+ The driver is unregistered and not functional now.
+
+ This attribute can be written to in order to change the driver's
+ operation mode or to unregister it. The string written to it must be
+ one of the possible values of it and, if successful, the write will
+ cause the driver to switch over to the operation mode represented by
+ that string - or to be unregistered in the "off" case.
+
``cpupower`` tool support for ``amd-pstate``
===============================================
--
2.34.1
Hi Perry,
I love your patch! Yet something to improve:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on linus/master v6.1 next-20221219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Perry-Yuan/Implement-AMD-Pstate-EPP-Driver/20221219-144514
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20221219064042.661122-4-perry.yuan%40amd.com
patch subject: [PATCH v8 03/13] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
config: arm-randconfig-r046-20221218
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 98b13979fb05f3ed288a900deb843e7b27589e58)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm cross compiling tool for clang build
# apt-get install binutils-arm-linux-gnueabi
# https://github.com/intel-lab-lkp/linux/commit/98c25b38af82eff7e9652a58b4a9d1c1c933ec80
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Perry-Yuan/Implement-AMD-Pstate-EPP-Driver/20221219-144514
git checkout 98c25b38af82eff7e9652a58b4a9d1c1c933ec80
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/devfreq/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from drivers/devfreq/governor_passive.c:12:
>> include/linux/cpufreq.h:23:10: fatal error: 'asm/msr.h' file not found
#include <asm/msr.h>
^~~~~~~~~~~
1 error generated.
vim +23 include/linux/cpufreq.h
10
11 #include <linux/clk.h>
12 #include <linux/cpu.h>
13 #include <linux/cpumask.h>
14 #include <linux/completion.h>
15 #include <linux/kobject.h>
16 #include <linux/notifier.h>
17 #include <linux/of.h>
18 #include <linux/of_device.h>
19 #include <linux/pm_opp.h>
20 #include <linux/pm_qos.h>
21 #include <linux/spinlock.h>
22 #include <linux/sysfs.h>
> 23 #include <asm/msr.h>
24
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Hi Perry,
I love your patch! Yet something to improve:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on linus/master v6.1 next-20221219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Perry-Yuan/Implement-AMD-Pstate-EPP-Driver/20221219-144514
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20221219064042.661122-4-perry.yuan%40amd.com
patch subject: [PATCH v8 03/13] cpufreq: intel_pstate: use common macro definition for Energy Preference Performance(EPP)
config: alpha-randconfig-r031-20221219
compiler: alpha-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/98c25b38af82eff7e9652a58b4a9d1c1c933ec80
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Perry-Yuan/Implement-AMD-Pstate-EPP-Driver/20221219-144514
git checkout 98c25b38af82eff7e9652a58b4a9d1c1c933ec80
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/base/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from drivers/base/core.c:12:
>> include/linux/cpufreq.h:23:10: fatal error: asm/msr.h: No such file or directory
23 | #include <asm/msr.h>
| ^~~~~~~~~~~
compilation terminated.
vim +23 include/linux/cpufreq.h
10
11 #include <linux/clk.h>
12 #include <linux/cpu.h>
13 #include <linux/cpumask.h>
14 #include <linux/completion.h>
15 #include <linux/kobject.h>
16 #include <linux/notifier.h>
17 #include <linux/of.h>
18 #include <linux/of_device.h>
19 #include <linux/pm_opp.h>
20 #include <linux/pm_qos.h>
21 #include <linux/spinlock.h>
22 #include <linux/sysfs.h>
> 23 #include <asm/msr.h>
24
--
0-DAY CI Kernel Test Service
https://01.org/lkp
On 12/19/2022 00:40, Perry Yuan wrote:
> From: Perry Yuan <[email protected]>
>
> The patch add AMD pstate EPP feature introduction and what EPP
> preference supported for AMD processors.
Don't use "the patch" or "this patch" in the commit message.
I would propose something like this:
>
> User can get supported list from
> energy_performance_available_preferences attribute file, or update
> current profile to energy_performance_preference file
The amd-pstate driver supports a feature called energy performance
preference (EPP). Add information to the documentation to explain
how users can interact with the sysfs files for this feature.
>
> 1) See all EPP profiles
> $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
> default performance balance_performance balance_power power
>
> 2) Check current EPP profile
> $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
> performance
>
> 3) Set new EPP profile
> $ sudo bash -c "echo power > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference"
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> Documentation/admin-guide/pm/amd-pstate.rst | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
> index 06e23538f79c..33ab8ec8fc2f 100644
> --- a/Documentation/admin-guide/pm/amd-pstate.rst
> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
> @@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
> <perf_cap_>`_.)
> This attribute is read-only.
>
> +``energy_performance_available_preferences``
> +
> +A list of all the supported EPP preferences that could be used for
> +``energy_performance_preference`` on this system.
> +These profiles represent different hints that are provided
> +to the low-level firmware about the user's desired energy vs efficiency
> +tradeoff. ``default`` represents the epp value is set by platform
> +firmware. This attribute is read-only.
> +
> +``energy_performance_preference``
> +
> +The current energy performance preference can be read from this attribute.
> +and user can change current preference according to energy or performance needs
> +Please get all support profiles list from
> +``energy_performance_available_preferences`` attribute, all the profiles are
> +integer values defined between 0 to 255 when EPP feature is enabled by platform
> +firmware, if EPP feature is disabled, driver will ignore the written value
> +This attribute is read-write.
> +
> Other performance and frequency values can be read back from
> ``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
>
On 12/19/2022 00:40, Perry Yuan wrote:
> The new amd-pstate driver support to switch the driver working mode and
Words like "new" don't age well.
> use can switch the driver mode within the sysfs attributes in the below
> path and check current mode
Perhaps "The amd-pstate driver supports switching working modes at runtime.
Users can view and change modes by interacting with the "status" sysfs
attribute."
>
> $ cd /sys/devices/system/cpu/amd-pstate
>
No need to change directories; you're demonstrating below using full paths.
> check driver mode:
> $ cat /sys/devices/system/cpu/amd-pstate/status
>
> switch mode:
> $ sudo bash -c "echo passive > /sys/devices/system/cpu/amd-pstate/status"
> or
> $ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"
Another way you could suggest this:
# echo "passive" | sudo tee /sys/devices/system/cpu/amd-pstate/status
or
# echo "active" | sudo tee /sys/devices/system/cpu/amd-pstate/status
I don't feel strongly which way to suggest though.
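For tooling that would rather not shell out, the same write can be done from C. This is only a rough user-space sketch: `write_mode()` and the `AMD_PSTATE_STATUS_PATH` macro are hypothetical names, and the accepted strings are simply the three values listed in the documentation hunk below ("active", "passive", "off").

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Path taken from the commit message; the macro name is hypothetical. */
#define AMD_PSTATE_STATUS_PATH "/sys/devices/system/cpu/amd-pstate/status"

/*
 * Hypothetical helper: write one of the documented mode strings to the
 * given file. Returns 0 on success or a negative errno-style value.
 */
static int write_mode(const char *path, const char *mode)
{
	FILE *f;
	int ret = 0;

	/* Reject anything that is not one of the documented values. */
	if (strcmp(mode, "active") && strcmp(mode, "passive") &&
	    strcmp(mode, "off"))
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fputs(mode, f) == EOF)
		ret = -EIO;

	/* sysfs reports store errors when the buffer is flushed. */
	if (fclose(f) == EOF && !ret)
		ret = -errno;

	return ret;
}
```

Calling `write_mode(AMD_PSTATE_STATUS_PATH, "passive")` as root would be the equivalent of the `tee` line above; the path is a parameter only so the helper can be exercised against an ordinary file.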
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> Documentation/admin-guide/pm/amd-pstate.rst | 28 +++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
> index 62744dae3c5f..f3a8f8a66783 100644
> --- a/Documentation/admin-guide/pm/amd-pstate.rst
> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
> @@ -339,6 +339,34 @@ processor must provide at least nominal performance requested and go higher if c
> operating conditions allow.
>
>
> +User Space Interface in ``sysfs``
> +=================================
> +
> +Global Attributes
> +-----------------
> +
> +``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
> +control its functionality at the system level. They are located in the
> +``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.
> +
> +``status``
> + Operation mode of the driver: "active", "passive" or "off".
> +
> + "active"
> + The driver is functional and in the ``active mode``
> +
> + "passive"
> + The driver is functional and in the ``passive mode``
> +
> + "off"
> + The driver is unregistered and not functional now.
> +
> + This attribute can be written to in order to change the driver's
> + operation mode or to unregister it. The string written to it must be
> + one of the possible values of it and, if successful, the write will
I think it is implied that if you write a valid value and the write
succeeds, something happens. That's how all sysfs files work, so this is
needlessly wordy.
> + cause the driver to switch over to the operation mode represented by
> + that string - or to be unregistered in the "off" case.
Considering the implication of my above comment I think you can reword
this as:
"Writing one of these values to the sysfs file will cause the driver to..."
> +
> ``cpupower`` tool support for ``amd-pstate``
> ===============================================
>
On 12/19/2022 00:40, Perry Yuan wrote:
> AMD Pstate driver support another firmware based autonomous mode
> with "amd_pstate=active" added to the kernel command line.
> In autonomous mode SMU firmware decides frequencies at 1 ms timescale
This might be the case right now, but I don't know that it will always
be this timescale. I suggest dropping "at 1 ms timescale".
> based on workload utilization, usage in other IPs, infrastructure
> limits such as power, thermals and so on.
>
> Signed-off-by: Perry Yuan <[email protected]>
With nit fixed:
Reviewed-by: Mario Limonciello <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 42af9ca0127e..73a02816f6f8 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6970,3 +6970,10 @@
> management firmware translates the requests into actual
> hardware states (core frequency, data fabric and memory
> clocks etc.)
> + active
> + Use amd_pstate_epp driver instance as the scaling driver,
> + driver provides a hint to the hardware if software wants
> + to bias toward performance (0x0) or energy efficiency (0xff)
> + to the CPPC firmware. then CPPC power algorithm will
> + calculate the runtime workload and adjust the realtime cores
> + frequency.
On 12/19/22 00:40, Perry Yuan wrote:
> make the energy preference performance strings and profiles using one
> common header for intel_pstate driver, then the amd_pstate epp driver can
> use the common header as well. This will simpify the intel_pstate and
> amd_pstate driver.
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 13 +++----------
> include/linux/cpufreq.h | 11 +++++++++++
> 2 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index ad9be31753b6..93a60fdac0fc 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -640,15 +640,7 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> * 4 power
> */
>
> -enum energy_perf_value_index {
> - EPP_INDEX_DEFAULT = 0,
> - EPP_INDEX_PERFORMANCE,
> - EPP_INDEX_BALANCE_PERFORMANCE,
> - EPP_INDEX_BALANCE_POWERSAVE,
> - EPP_INDEX_POWERSAVE,
> -};
> -
> -static const char * const energy_perf_strings[] = {
> +const char * const energy_perf_strings[] = {
> [EPP_INDEX_DEFAULT] = "default",
> [EPP_INDEX_PERFORMANCE] = "performance",
> [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> @@ -656,7 +648,8 @@ static const char * const energy_perf_strings[] = {
> [EPP_INDEX_POWERSAVE] = "power",
> NULL
> };
> -static unsigned int epp_values[] = {
> +
> +unsigned int epp_values[] = {
> [EPP_INDEX_DEFAULT] = 0, /* Unused index */
> [EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> [EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
I think this is going to make CONFIG_AMD_PSTATE depend on
CONFIG_INTEL_PSTATE. What you'll want to do is put these symbols in a
"common" C file used by both. How about in the cppc lib stuff?
Please make sure that you test that v9 compiles and links with both
CONFIG_AMD_PSTATE and CONFIG_INTEL_PSTATE set, and with only one or the
other set.
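To make the suggestion concrete, here is a rough sketch of what a shared definition could look like, written as plain C so it builds outside the kernel. The `EPP_INDEX_COUNT` sentinel and the `epp_index_from_name()` helper are hypothetical, and the numeric values are written out from my understanding of the `HWP_EPP_*` macros (0x00, 0x80, 0xC0, 0xFF), so please check them against the tree rather than trusting this copy.

```c
#include <string.h>

enum energy_perf_value_index {
	EPP_INDEX_DEFAULT = 0,
	EPP_INDEX_PERFORMANCE,
	EPP_INDEX_BALANCE_PERFORMANCE,
	EPP_INDEX_BALANCE_POWERSAVE,
	EPP_INDEX_POWERSAVE,
	EPP_INDEX_COUNT,	/* hypothetical sentinel, not in the patch */
};

/* One definition, owned by a common file instead of intel_pstate.c. */
static const char * const energy_perf_strings[] = {
	[EPP_INDEX_DEFAULT] = "default",
	[EPP_INDEX_PERFORMANCE] = "performance",
	[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
	[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
	[EPP_INDEX_POWERSAVE] = "power",
	NULL
};

/* Values assumed to mirror HWP_EPP_*; index 0 is unused as in the patch. */
static const unsigned int epp_values[EPP_INDEX_COUNT] = {
	[EPP_INDEX_DEFAULT] = 0,
	[EPP_INDEX_PERFORMANCE] = 0x00,
	[EPP_INDEX_BALANCE_PERFORMANCE] = 0x80,
	[EPP_INDEX_BALANCE_POWERSAVE] = 0xC0,
	[EPP_INDEX_POWERSAVE] = 0xFF,
};

/* Hypothetical lookup: profile name to table index, or -1 if unknown. */
static int epp_index_from_name(const char *name)
{
	int i;

	for (i = 0; energy_perf_strings[i]; i++) {
		if (!strcmp(name, energy_perf_strings[i]))
			return i;
	}

	return -1;
}
```

With the tables living in one object that both drivers link against, neither driver has to be enabled for the other to resolve the symbols.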
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index d5595d57f4e5..e63309d497fe 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -20,6 +20,7 @@
> #include <linux/pm_qos.h>
> #include <linux/spinlock.h>
> #include <linux/sysfs.h>
> +#include <asm/msr.h>
>
> /*********************************************************************
> * CPUFREQ INTERFACE *
> @@ -185,6 +186,16 @@ struct cpufreq_freqs {
> u8 flags; /* flags of cpufreq_driver, see below. */
> };
>
> +enum energy_perf_value_index {
> + EPP_INDEX_DEFAULT = 0,
> + EPP_INDEX_PERFORMANCE,
> + EPP_INDEX_BALANCE_PERFORMANCE,
> + EPP_INDEX_BALANCE_POWERSAVE,
> + EPP_INDEX_POWERSAVE,
> +};
> +extern const char * const energy_perf_strings[];
> +extern unsigned int epp_values[];
> +
> /* Only for ACPI */
> #define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
> #define CPUFREQ_SHARED_TYPE_HW (1) /* HW does needed coordination */
On 12/19/22 00:40, Perry Yuan wrote:
> From: Perry Yuan <[email protected]>
>
> Add EPP driver support for AMD SoCs which support a dedicated MSR for
> CPPC. EPP is used by the DPM controller to configure the frequency that
> a core operates at during short periods of activity.
>
> The SoC EPP targets are configured on a scale from 0 to 255 where 0
> represents maximum performance and 255 represents maximum efficiency.
>
> The amd-pstate driver exports profile string names to userspace that are
> tied to specific EPP values.
>
> The balance_performance string (0x80) provides the best balance for
> efficiency versus power on most systems, but users can choose other
> strings to meet their needs as well.
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance
>
> To enable the driver,it needs to add `amd_pstate=active` to kernel
> command line and kernel will load the active mode epp driver
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 447 ++++++++++++++++++++++++++++++++++-
> include/linux/amd-pstate.h | 10 +
> 2 files changed, 451 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 861a905f9324..66b39457a312 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -59,7 +59,10 @@
> * we disable it by default to go acpi-cpufreq on these processors and add a
> * module parameter to be able to enable it manually for debugging.
> */
> +static struct cpufreq_driver *default_pstate_driver;
> static struct cpufreq_driver amd_pstate_driver;
> +static struct cpufreq_driver amd_pstate_epp_driver;
> +static struct amd_cpudata **all_cpu_data;
> static int cppc_state = AMD_PSTATE_DISABLE;
>
> static inline int get_mode_idx_from_str(const char *str, size_t size)
> @@ -70,9 +73,128 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
> if (!strncmp(str, amd_pstate_mode_string[i], size))
> return i;
> }
> +
Unrelated whitespace change.
> return -EINVAL;
> }
>
> +/**
> + * struct amd_pstate_params - global parameters for the performance control
> + * @ cppc_boost_disabled wheher the core performance boost disabled
> + */
> +struct amd_pstate_params {
> + bool cppc_boost_disabled;
> +};
> +
> +static struct amd_pstate_params global_params;
> +
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +
> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> + u64 epp;
> + int ret;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + if (!cppc_req_cached) {
> + epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> + &cppc_req_cached);
> + if (epp)
> + return epp;
> + }
> + epp = (cppc_req_cached >> 24) & 0xFF;
> + } else {
> + ret = cppc_get_epp_perf(cpudata->cpu, &epp);
> + if (ret < 0) {
> + pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> + return -EIO;
> + }
> + }
> +
> + return (s16)(epp & 0xff);
> +}
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> +{
> + s16 epp;
> + int index = -EINVAL;
> +
> + epp = amd_pstate_get_epp(cpudata, 0);
> + if (epp < 0)
> + return epp;
> +
> + switch (epp) {
> + case HWP_EPP_PERFORMANCE:
> + index = EPP_INDEX_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_PERFORMANCE:
> + index = EPP_INDEX_BALANCE_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_POWERSAVE:
> + index = EPP_INDEX_BALANCE_POWERSAVE;
> + break;
> + case HWP_EPP_POWERSAVE:
> + index = EPP_INDEX_POWERSAVE;
> + break;
> + default:
> + break;
Extra tab here
> + }
> +
> + return index;
> +}
> +
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> + int ret;
> + struct cppc_perf_ctrls perf_ctrls;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> + if (!ret)
> + cpudata->epp_cached = epp;
> + } else {
> + perf_ctrls.energy_perf = epp;
> + ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> + if (ret) {
> + pr_debug("failed to set energy perf value (%d)\n", ret);
> + return ret;
> + }
> + cpudata->epp_cached = epp;
> + }
> +
> + return ret;
> +}
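For readers following the bit manipulation in the two functions above: the EPP hint is handled as bits 31:24 of the cached CPPC request value. A small stand-alone sketch of that packing follows; the `EPP_SHIFT`/`EPP_MASK` macros and both helper names are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical names for the field the patch addresses as bits 31:24. */
#define EPP_SHIFT	24
#define EPP_MASK	(UINT64_C(0xff) << EPP_SHIFT)

/* Replace only the EPP byte and leave every other field untouched. */
static inline uint64_t cppc_req_set_epp(uint64_t req, uint8_t epp)
{
	req &= ~EPP_MASK;
	req |= (uint64_t)epp << EPP_SHIFT;
	return req;
}

/* Extract the EPP byte from a request value. */
static inline uint8_t cppc_req_get_epp(uint64_t req)
{
	return (uint8_t)((req & EPP_MASK) >> EPP_SHIFT);
}
```

Clearing the field before OR-ing in the new value is what keeps a previous, larger EPP from leaking into the result.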
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> + int pref_index)
> +{
> + int epp = -EINVAL;
> + int ret;
> +
> + if (!pref_index) {
> + pr_debug("EPP pref_index is invalid\n");
> + return -EINVAL;
> + }
> +
> + if (epp == -EINVAL)
> + epp = epp_values[pref_index];
Didn't you just hardcode epp to -EINVAL at the beginning of function?
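Something along these lines would avoid the dead check. This is only a user-space sketch with stand-ins: `struct amd_cpudata_stub`, `set_epp_stub()` and the local `epp_values[]` copy are hypothetical replacements for the kernel pieces, the policy constant is redefined locally, and the upper bound check on `pref_index` is my addition rather than something in the patch.

```c
#include <errno.h>

/* Local stand-in for the cpufreq policy constant. */
#define CPUFREQ_POLICY_PERFORMANCE	2

/* Hypothetical cut-down version of struct amd_cpudata. */
struct amd_cpudata_stub {
	unsigned int policy;
	short epp_cached;
};

/* Assumed copy of the shared table; index 0 ("default") is unused. */
static const unsigned int epp_values[] = { 0, 0x00, 0x80, 0xC0, 0xFF };

/* Stand-in for amd_pstate_set_epp(): only records the value. */
static int set_epp_stub(struct amd_cpudata_stub *cpudata, unsigned int epp)
{
	cpudata->epp_cached = (short)epp;
	return 0;
}

static int set_energy_pref_index(struct amd_cpudata_stub *cpudata,
				 int pref_index)
{
	const int count = sizeof(epp_values) / sizeof(epp_values[0]);
	unsigned int epp;

	/* Index 0 is rejected as in the patch; the range check is added. */
	if (pref_index <= 0 || pref_index >= count)
		return -EINVAL;

	/* Look the value up directly, no -EINVAL placeholder needed. */
	epp = epp_values[pref_index];

	if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
		return -EBUSY;

	return set_epp_stub(cpudata, epp);
}
```

The behaviour is intended to match the quoted hunk for valid indexes; only the redundant `epp == -EINVAL` test and the extra `ret` variable go away.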
> +
> + if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + pr_debug("EPP cannot be set under performance policy\n");
> + return -EBUSY;
> + }
> +
> + ret = amd_pstate_set_epp(cpudata, epp);
> +
> + return ret;
> +}
> +
> static inline int pstate_enable(bool enable)
> {
> return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -81,11 +203,21 @@ static inline int pstate_enable(bool enable)
> static int cppc_enable(bool enable)
> {
> int cpu, ret = 0;
> + struct cppc_perf_ctrls perf_ctrls;
>
> for_each_present_cpu(cpu) {
> ret = cppc_set_enable(cpu, enable);
> if (ret)
> return ret;
> +
> + /* Enable autonomous mode for EPP */
> + if (cppc_state == AMD_PSTATE_ACTIVE) {
> + /* Set desired perf as zero to allow EPP firmware control */
> + perf_ctrls.desired_perf = 0;
> + ret = cppc_set_perf(cpu, &perf_ctrls);
> + if (ret)
> + return ret;
> + }
> }
>
> return ret;
> @@ -429,7 +561,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
> return;
>
> cpudata->boost_supported = true;
> - amd_pstate_driver.boost_enabled = true;
> + default_pstate_driver->boost_enabled = true;
> }
>
> static void amd_perf_ctl_reset(unsigned int cpu)
> @@ -603,10 +735,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> return sprintf(&buf[0], "%u\n", perf);
> }
>
> +static ssize_t show_energy_performance_available_preferences(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + int i = 0;
> + int offset = 0;
> +
> + while (energy_perf_strings[i] != NULL)
> + offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
> +
> + sysfs_emit_at(buf, offset, "\n");
> +
> + return offset;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> + struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + char str_preference[21];
> + ssize_t ret;
> +
> + ret = sscanf(buf, "%20s", str_preference);
> + if (ret != 1)
> + return -EINVAL;
> +
> + ret = match_string(energy_perf_strings, -1, str_preference);
> + if (ret < 0)
> + return -EINVAL;
> +
> + mutex_lock(&amd_pstate_limits_lock);
> + ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> + mutex_unlock(&amd_pstate_limits_lock);
> +
> + return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + int preference;
> +
> + preference = amd_pstate_get_energy_pref_index(cpudata);
> + if (preference < 0)
> + return preference;
> +
> + return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> +}
> +
> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
>
> static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> @@ -615,6 +798,235 @@ static struct freq_attr *amd_pstate_attr[] = {
> NULL,
> };
>
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> + &amd_pstate_max_freq,
> + &amd_pstate_lowest_nonlinear_freq,
> + &amd_pstate_highest_perf,
> + &energy_performance_preference,
> + &energy_performance_available_preferences,
> + NULL,
> +};
> +
> +static inline void update_boost_state(void)
> +{
> + u64 misc_en;
> + struct amd_cpudata *cpudata;
> +
> + cpudata = all_cpu_data[0];
> + rdmsrl(MSR_K7_HWCR, misc_en);
> + global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
> +}
> +
> +static int amd_pstate_init_cpu(unsigned int cpunum)
> +{
> + struct amd_cpudata *cpudata;
> +
> + cpudata = all_cpu_data[cpunum];
> + if (!cpudata) {
> + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> + if (!cpudata)
> + return -ENOMEM;
> + WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> +
> + cpudata->cpu = cpunum;
> + }
> +
> + cpudata->epp_policy = 0;
> + pr_debug("controlling: cpu %d\n", cpunum);
> + return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> + struct amd_cpudata *cpudata;
> + struct device *dev;
> + int rc;
> + u64 value;
> +
> + rc = amd_pstate_init_cpu(policy->cpu);
> + if (rc)
> + return rc;
> +
> + cpudata = all_cpu_data[policy->cpu];
> +
> + dev = get_cpu_device(policy->cpu);
> + if (!dev)
> + goto free_cpudata1;
> +
> + rc = amd_pstate_init_perf(cpudata);
> + if (rc)
> + goto free_cpudata1;
> +
> + min_freq = amd_get_min_freq(cpudata);
> + max_freq = amd_get_max_freq(cpudata);
> + nominal_freq = amd_get_nominal_freq(cpudata);
> + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> + min_freq, max_freq);
> + ret = -EINVAL;
> + goto free_cpudata1;
> + }
> +
> + policy->min = min_freq;
> + policy->max = max_freq;
> +
> + policy->cpuinfo.min_freq = min_freq;
> + policy->cpuinfo.max_freq = max_freq;
> + /* It will be updated by governor */
> + policy->cur = policy->cpuinfo.min_freq;
> +
> + /* Initial processor data capability frequencies */
> + cpudata->max_freq = max_freq;
> + cpudata->min_freq = min_freq;
> + cpudata->nominal_freq = nominal_freq;
> + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> + policy->driver_data = cpudata;
> +
> + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> + policy->min = policy->cpuinfo.min_freq;
> + policy->max = policy->cpuinfo.max_freq;
> +
> + /*
> + * Set the policy to powersave to provide a valid fallback value in case
> + * the default cpufreq governor is neither powersave nor performance.
> + */
> + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + policy->fast_switch_possible = true;
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> + }
> + amd_pstate_boost_init(cpudata);
> +
> + return 0;
> +
> +free_cpudata1:
> + kfree(cpudata);
> + return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> + pr_debug("CPU %d exiting\n", policy->cpu);
> + policy->fast_switch_possible = false;
> + return 0;
> +}
> +
> +static void amd_pstate_update_max_freq(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +
> + if (!policy)
> + return;
> +
> + refresh_frequency_limits(policy);
> + cpufreq_cpu_put(policy);
> +}
> +
> +static void amd_pstate_epp_update_limits(unsigned int cpu)
> +{
> + mutex_lock(&amd_pstate_driver_lock);
> + update_boost_state();
> + if (global_params.cppc_boost_disabled) {
> + for_each_possible_cpu(cpu)
> + amd_pstate_update_max_freq(cpu);
> + } else {
> + cpufreq_update_policy(cpu);
> + }
> + mutex_unlock(&amd_pstate_driver_lock);
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> + struct amd_cpudata *cpudata = all_cpu_data[cpu];
> + u32 max_perf, min_perf;
> + u64 value;
> + s16 epp;
> +
> + max_perf = READ_ONCE(cpudata->highest_perf);
> + min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> + value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + min_perf = max_perf;
> +
> + /* Initial min/max values for CPPC Performance Controls Register */
> + value &= ~AMD_CPPC_MIN_PERF(~0L);
> + value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> + value &= ~AMD_CPPC_MAX_PERF(~0L);
> + value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> + /* CPPC EPP feature require to set zero to the desire perf bit */
> + value &= ~AMD_CPPC_DES_PERF(~0L);
> + value |= AMD_CPPC_DES_PERF(0);
> +
> + if (cpudata->epp_policy == cpudata->policy)
> + goto skip_epp;
> +
> + cpudata->epp_policy = cpudata->policy;
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp < 0)
> + goto skip_epp;
> + /* force the epp value to be zero for performance policy */
> + epp = 0;
> + } else {
> + /* Get BIOS pre-defined epp value */
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp)
> + goto skip_epp;
> + }
> + /* Set initial EPP value */
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + }
> +
> +skip_epp:
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> + amd_pstate_set_epp(cpudata, epp);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> + struct amd_cpudata *cpudata;
> +
> + if (!policy->cpuinfo.max_freq)
> + return -ENODEV;
> +
> + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> + policy->cpuinfo.max_freq, policy->max);
> +
> + cpudata = all_cpu_data[policy->cpu];
> + cpudata->policy = policy->policy;
> +
> + amd_pstate_epp_init(policy->cpu);
> +
> + return 0;
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> + cpufreq_verify_within_cpu_limits(policy);
> + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> + return 0;
> +}
> +
> static struct cpufreq_driver amd_pstate_driver = {
> .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
> .verify = amd_pstate_verify,
> @@ -628,8 +1040,20 @@ static struct cpufreq_driver amd_pstate_driver = {
> .attr = amd_pstate_attr,
> };
>
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> + .flags = CPUFREQ_CONST_LOOPS,
> + .verify = amd_pstate_epp_verify_policy,
> + .setpolicy = amd_pstate_epp_set_policy,
> + .init = amd_pstate_epp_cpu_init,
> + .exit = amd_pstate_epp_cpu_exit,
> + .update_limits = amd_pstate_epp_update_limits,
> + .name = "amd_pstate_epp",
> + .attr = amd_pstate_epp_attr,
> +};
> +
> static int __init amd_pstate_init(void)
> {
> + static struct amd_cpudata **cpudata;
> int ret;
>
> if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> @@ -656,7 +1080,8 @@ static int __init amd_pstate_init(void)
> /* capability check */
> if (boot_cpu_has(X86_FEATURE_CPPC)) {
> pr_debug("AMD CPPC MSR based functionality is supported\n");
> - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> } else {
> pr_debug("AMD CPPC shared memory based functionality is supported\n");
> static_call_update(amd_pstate_enable, cppc_enable);
> @@ -664,17 +1089,21 @@ static int __init amd_pstate_init(void)
> static_call_update(amd_pstate_update_perf, cppc_update_perf);
> }
>
> + cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> + if (!cpudata)
> + return -ENOMEM;
> + WRITE_ONCE(all_cpu_data, cpudata);
> +
> /* enable amd pstate feature */
> ret = amd_pstate_enable(true);
> if (ret) {
> - pr_err("failed to enable amd-pstate with return %d\n", ret);
> + pr_err("failed to enable with return %d\n", ret);
> return ret;
> }
>
> - ret = cpufreq_register_driver(&amd_pstate_driver);
> + ret = cpufreq_register_driver(default_pstate_driver);
> if (ret)
> - pr_err("failed to register amd_pstate_driver with return %d\n",
> - ret);
> + pr_err("failed to register with return %d\n", ret);
>
> return ret;
> }
> @@ -696,6 +1125,12 @@ static int __init amd_pstate_param(char *str)
> if (cppc_state == AMD_PSTATE_DISABLE)
> pr_info("driver is explicitly disabled\n");
>
> + if (cppc_state == AMD_PSTATE_ACTIVE)
> + default_pstate_driver = &amd_pstate_epp_driver;
> +
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + default_pstate_driver = &amd_pstate_driver;
> +
> return 0;
> }
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 922d05a13902..fe1aef743c09 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> * @prev: Last Aperf/Mperf/tsc count value read from register
> * @freq: current cpu frequency value
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @cppc_cap1_cached: Cached MSR_AMD_CPPC_CAP1 register value
> *
> * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +76,12 @@ struct amd_cpudata {
>
> u64 freq;
> bool boost_supported;
> +
> + /* EPP feature related attributes */
> + s16 epp_policy;
> + s16 epp_cached;
> + u32 policy;
> + u64 cppc_cap1_cached;
> };
>
> /**
On 19.12.22 06:40, Perry Yuan wrote:
> Hi all,
>
> This patchset implements one new AMD CPU frequency driver
> `amd-pstate-epp` instance for better performance and power control.
> CPPC has a parameter called energy preference performance (EPP).
> The EPP is used in the CCLK DPM controller to drive the frequency that a core
> is going to operate during short periods of activity.
> EPP values will be utilized for different OS profiles (balanced, performance, power savings).
>
Using v8 and clang-15 on 6.1 I get:
---
ld.lld: error: undefined symbol: energy_perf_strings
>>> referenced by amd-pstate.c:789
(/tmp/makepkg/linux61-vd/src/linux-stable/drivers/cpufreq/amd-pstate.c:789)
>>> vmlinux.o:(show_energy_performance_preference)
>>> referenced by amd-pstate.c:768
(/tmp/makepkg/linux61-vd/src/linux-stable/drivers/cpufreq/amd-pstate.c:768)
>>> vmlinux.o:(store_energy_performance_preference)
>>> referenced by amd-pstate.c:749
(/tmp/makepkg/linux61-vd/src/linux-stable/drivers/cpufreq/amd-pstate.c:749)
>>> vmlinux.o:(show_energy_performance_available_preferences)
>>> referenced 1 more times
>>> did you mean: energy_perf_strings
>>> defined in: vmlinux.o
ld.lld: error: undefined symbol: epp_values
>>> referenced by amd-pstate.c:189
(/tmp/makepkg/linux61-vd/src/linux-stable/drivers/cpufreq/amd-pstate.c:189)
>>> vmlinux.o:(store_energy_performance_preference)
---
and a few warnings:
---
drivers/cpufreq/amd-pstate.c:966:6: warning: variable 'ret' is used
uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (rc)
^~
drivers/cpufreq/amd-pstate.c:1025:9: note: uninitialized use occurs here
return ret;
^~~
drivers/cpufreq/amd-pstate.c:966:2: note: remove the 'if' if its
condition is always false
if (rc)
^~~~~~~
drivers/cpufreq/amd-pstate.c:962:6: warning: variable 'ret' is used
uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (!dev)
^~~~
drivers/cpufreq/amd-pstate.c:1025:9: note: uninitialized use occurs here
return ret;
^~~
drivers/cpufreq/amd-pstate.c:962:2: note: remove the 'if' if its
condition is always false
if (!dev)
^~~~~~~~~
drivers/cpufreq/amd-pstate.c:949:66: note: initialize the variable 'ret'
to silence this warning
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
^
= 0
drivers/cpufreq/amd-pstate.c:996:52: warning: variable 'value' is
uninitialized when used here [-Wuninitialized]
cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
^~~~~
drivers/cpufreq/amd-pstate.c:953:11: note: initialize the variable
'value' to silence this warning
u64 value;
^
= 0
drivers/cpufreq/amd-pstate.c:1085:6: warning: variable 'epp' is used
uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (cpudata->epp_policy == cpudata->policy)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:1110:30: note: uninitialized use occurs here
amd_pstate_set_epp(cpudata, epp);
^~~
drivers/cpufreq/amd-pstate.c:1085:2: note: remove the 'if' if its
condition is always false
if (cpudata->epp_policy == cpudata->policy)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:1064:9: note: initialize the variable 'epp'
to silence this warning
s16 epp;
^
= 0
---
Cheers,
Tor Vic
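For reference, the last warning (the 'epp' one) can be modelled outside the kernel. The sketch below is a userspace model, not the driver code: `struct cpudata_model`, `epp_init_model()`, `epp_from_req()` and the policy constants are hypothetical stand-ins. Only the layout (EPP in bits 31:24 of the cached request value) and the early `goto skip_epp` are taken from the patch. Seeding `epp` from the cached value before the early exit is one possible way to keep the final write defined; it is an assumption, not necessarily what a later revision will do.

```c
#include <stdint.h>

#define EPP_SHIFT 24
#define EPP_MASK  (0xffULL << EPP_SHIFT)

/* Hypothetical stand-ins for the cpufreq policy constants. */
enum policy_model { POLICY_PERFORMANCE = 1, POLICY_POWERSAVE = 2 };

/* Hypothetical, reduced model of the per-CPU driver data. */
struct cpudata_model {
	uint64_t cppc_req_cached; /* cached request word */
	int policy;               /* currently requested policy */
	int epp_policy;           /* policy the EPP was last set for */
	int16_t epp_written;      /* last value handed to the "set EPP" step */
};

/* Extract the EPP byte (bits 31:24) from a request word. */
static int16_t epp_from_req(uint64_t req)
{
	return (int16_t)((req & EPP_MASK) >> EPP_SHIFT);
}

static void epp_init_model(struct cpudata_model *d)
{
	uint64_t value = d->cppc_req_cached;
	/* Initialised up front, so the skip_epp path never reads garbage. */
	int16_t epp = epp_from_req(value);

	if (d->epp_policy == d->policy)
		goto skip_epp;

	d->epp_policy = d->policy;
	if (d->policy == POLICY_PERFORMANCE)
		epp = 0; /* performance policy forces EPP to zero */

	value &= ~EPP_MASK;
	value |= (uint64_t)epp << EPP_SHIFT;

skip_epp:
	d->cppc_req_cached = value;
	d->epp_written = epp;
}
```

With this shape, a call that takes the early exit simply rewrites the EPP that is already cached instead of an indeterminate value.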
On 20.12.22 18:13, Tor Vic wrote:
>
> On 19.12.22 06:40, Perry Yuan wrote:
>> Hi all,
>>
>> This patchset implements one new AMD CPU frequency driver
>> `amd-pstate-epp` instance for better performance and power control.
>> CPPC has a parameter called energy preference performance (EPP).
>> The EPP is used in the CCLK DPM controller to drive the frequency that
>> a core
>> is going to operate during short periods of activity.
>> EPP values will be utilized for different OS profiles (balanced,
>> performance, power savings).
>>
>
> Using v8 and clang-15 on 6.1 I get:
>
Got it.
Mario was right: INTEL_PSTATE must be selected; it has become a dependency.
That dependency doesn't seem correct.
With it selected, the kernel builds just fine. I have not tested it, though.
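To illustrate why the dependency appears and what an alternative could look like: the two undefined symbols are the tables that patch 03 leaves defined in intel_pstate.c, so amd-pstate.c only links when that file is built. A driver-local copy of the tables would remove the link-time dependency. The sketch below is plain userspace C under that assumption. The `amd_epp_*` names and `amd_epp_lookup()` are hypothetical, and the numeric values are assumed to match the HWP_EPP_* constants (0x00, 0x80, 0xC0, 0xFF); the strings are the ones listed in the patch 06 commit message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver-local copy of the EPP index enum. */
enum amd_epp_index {
	AMD_EPP_INDEX_DEFAULT = 0,
	AMD_EPP_INDEX_PERFORMANCE,
	AMD_EPP_INDEX_BALANCE_PERFORMANCE,
	AMD_EPP_INDEX_BALANCE_POWERSAVE,
	AMD_EPP_INDEX_POWERSAVE,
};

/* Profile names exported to userspace, NULL terminated. */
static const char * const amd_epp_strings[] = {
	[AMD_EPP_INDEX_DEFAULT] = "default",
	[AMD_EPP_INDEX_PERFORMANCE] = "performance",
	[AMD_EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
	[AMD_EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
	[AMD_EPP_INDEX_POWERSAVE] = "power",
	NULL
};

/* EPP byte per profile; values assumed equal to HWP_EPP_*. */
static const unsigned int amd_epp_values[] = {
	[AMD_EPP_INDEX_DEFAULT] = 0, /* unused index */
	[AMD_EPP_INDEX_PERFORMANCE] = 0x00,
	[AMD_EPP_INDEX_BALANCE_PERFORMANCE] = 0x80,
	[AMD_EPP_INDEX_BALANCE_POWERSAVE] = 0xC0,
	[AMD_EPP_INDEX_POWERSAVE] = 0xFF,
};

/*
 * Map a profile name to its EPP byte (0..255).
 * Returns -1 for an unknown name and for "default", which has no
 * fixed EPP value of its own.
 */
static int amd_epp_lookup(const char *name)
{
	for (size_t i = 0; amd_epp_strings[i] != NULL; i++) {
		if (strcmp(name, amd_epp_strings[i]) != 0)
			continue;
		if (i == AMD_EPP_INDEX_DEFAULT)
			return -1;
		return (int)amd_epp_values[i];
	}
	return -1;
}
```

Because both tables are `static` here, nothing needs to be exported from intel_pstate.c and no Kconfig dependency is implied.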
[AMD Official Use Only - General]
> -----Original Message-----
> From: Tor Vic <[email protected]>
> Sent: Wednesday, December 21, 2022 2:53 AM
> To: Yuan, Perry <[email protected]>; [email protected];
> Limonciello, Mario <[email protected]>; Huang, Ray
> <[email protected]>; [email protected]
> Cc: Sharma, Deepak <[email protected]>; Fontenot, Nathan
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; Karny, Wyes <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 00/13] Implement AMD Pstate EPP Driver
>
>
> On 20.12.22 18:13, Tor Vic wrote:
> >
> > On 19.12.22 06:40, Perry Yuan wrote:
> >> Hi all,
> >>
> >> This patchset implements one new AMD CPU frequency driver
> >> `amd-pstate-epp` instance for better performance and power control.
> >> CPPC has a parameter called energy preference performance (EPP).
> >> The EPP is used in the CCLK DPM controller to drive the frequency
> >> that a core is going to operate during short periods of activity.
> >> EPP values will be utilized for different OS profiles (balanced,
> >> performance, power savings).
> >>
> >
> > Using v8 and clang-15 on 6.1 I get:
> >
>
> Got it.
> Mario was right. INTEL_PSTATE must be selected, it has become a
> dependency.
>
> That doesn't seem correct.
>
> With it selected, it builds just fine. Not tested though.
Yes, I will fix this in v9.
Thanks for your feedback!
> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Friday, December 23, 2022 11:10 AM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Sharma, Deepak
> <[email protected]>; Fontenot, Nathan
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; Karny, Wyes <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 03/13] cpufreq: intel_pstate: use common macro
> definition for Energy Preference Performance(EPP)
>
> On Mon, Dec 19, 2022 at 02:40:32PM +0800, Yuan, Perry wrote:
> > Make the energy performance preference strings and profiles use one
> > common header for the intel_pstate driver, so that the amd_pstate EPP
> > driver can use the common header as well. This will simplify the
> > intel_pstate and amd_pstate drivers.
> >
> > Signed-off-by: Perry Yuan <[email protected]>
> > ---
> > drivers/cpufreq/intel_pstate.c | 13 +++----------
> > include/linux/cpufreq.h | 11 +++++++++++
> > 2 files changed, 14 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > index ad9be31753b6..93a60fdac0fc 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -640,15 +640,7 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> > * 4 power
> > */
> >
> > -enum energy_perf_value_index {
> > - EPP_INDEX_DEFAULT = 0,
> > - EPP_INDEX_PERFORMANCE,
> > - EPP_INDEX_BALANCE_PERFORMANCE,
> > - EPP_INDEX_BALANCE_POWERSAVE,
> > - EPP_INDEX_POWERSAVE,
> > -};
> > -
> > -static const char * const energy_perf_strings[] = {
> > +const char * const energy_perf_strings[] = {
> > [EPP_INDEX_DEFAULT] = "default",
> > [EPP_INDEX_PERFORMANCE] = "performance",
> > [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> > @@ -656,7 +648,8 @@ static const char * const energy_perf_strings[] = {
> > [EPP_INDEX_POWERSAVE] = "power",
> > NULL
> > };
> > -static unsigned int epp_values[] = {
> > +
> > +unsigned int epp_values[] = {
> > [EPP_INDEX_DEFAULT] = 0, /* Unused index */
> > [EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> > [EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
> > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> > index d5595d57f4e5..e63309d497fe 100644
> > --- a/include/linux/cpufreq.h
> > +++ b/include/linux/cpufreq.h
> > @@ -20,6 +20,7 @@
> > #include <linux/pm_qos.h>
> > #include <linux/spinlock.h>
> > #include <linux/sysfs.h>
> > +#include <asm/msr.h>
>
> Please don't include the msr header in the common cpufreq header; we
> already include it in amd-pstate.c, and that is enough.
>
> Thanks,
> Ray
Good point, I will remove msr.h from this file.
Thank you.
Perry.
>
> >
> >
> > /*********************************************************************
> > * CPUFREQ INTERFACE *
> > @@ -185,6 +186,16 @@ struct cpufreq_freqs {
> > u8 flags; /* flags of cpufreq_driver, see below. */
> > };
> >
> > +enum energy_perf_value_index {
> > + EPP_INDEX_DEFAULT = 0,
> > + EPP_INDEX_PERFORMANCE,
> > + EPP_INDEX_BALANCE_PERFORMANCE,
> > + EPP_INDEX_BALANCE_POWERSAVE,
> > + EPP_INDEX_POWERSAVE,
> > +};
> > +extern const char * const energy_perf_strings[];
> > +extern unsigned int epp_values[];
> > +
> > /* Only for ACPI */
> > #define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
> > #define CPUFREQ_SHARED_TYPE_HW (1) /* HW does needed coordination */
> > --
> > 2.34.1
> >
On Mon, Dec 19, 2022 at 02:40:32PM +0800, Yuan, Perry wrote:
> Make the energy performance preference strings and profiles use one
> common header for the intel_pstate driver, so that the amd_pstate EPP
> driver can use the common header as well. This will simplify the
> intel_pstate and amd_pstate drivers.
>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 13 +++----------
> include/linux/cpufreq.h | 11 +++++++++++
> 2 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index ad9be31753b6..93a60fdac0fc 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -640,15 +640,7 @@ static int intel_pstate_set_epb(int cpu, s16 pref)
> * 4 power
> */
>
> -enum energy_perf_value_index {
> - EPP_INDEX_DEFAULT = 0,
> - EPP_INDEX_PERFORMANCE,
> - EPP_INDEX_BALANCE_PERFORMANCE,
> - EPP_INDEX_BALANCE_POWERSAVE,
> - EPP_INDEX_POWERSAVE,
> -};
> -
> -static const char * const energy_perf_strings[] = {
> +const char * const energy_perf_strings[] = {
> [EPP_INDEX_DEFAULT] = "default",
> [EPP_INDEX_PERFORMANCE] = "performance",
> [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
> @@ -656,7 +648,8 @@ static const char * const energy_perf_strings[] = {
> [EPP_INDEX_POWERSAVE] = "power",
> NULL
> };
> -static unsigned int epp_values[] = {
> +
> +unsigned int epp_values[] = {
> [EPP_INDEX_DEFAULT] = 0, /* Unused index */
> [EPP_INDEX_PERFORMANCE] = HWP_EPP_PERFORMANCE,
> [EPP_INDEX_BALANCE_PERFORMANCE] = HWP_EPP_BALANCE_PERFORMANCE,
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index d5595d57f4e5..e63309d497fe 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -20,6 +20,7 @@
> #include <linux/pm_qos.h>
> #include <linux/spinlock.h>
> #include <linux/sysfs.h>
> +#include <asm/msr.h>
Please don't include the msr header in the common cpufreq header; we
already include it in amd-pstate.c, and that is enough.
Thanks,
Ray
>
> /*********************************************************************
> * CPUFREQ INTERFACE *
> @@ -185,6 +186,16 @@ struct cpufreq_freqs {
> u8 flags; /* flags of cpufreq_driver, see below. */
> };
>
> +enum energy_perf_value_index {
> + EPP_INDEX_DEFAULT = 0,
> + EPP_INDEX_PERFORMANCE,
> + EPP_INDEX_BALANCE_PERFORMANCE,
> + EPP_INDEX_BALANCE_POWERSAVE,
> + EPP_INDEX_POWERSAVE,
> +};
> +extern const char * const energy_perf_strings[];
> +extern unsigned int epp_values[];
> +
> /* Only for ACPI */
> #define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
> #define CPUFREQ_SHARED_TYPE_HW (1) /* HW does needed coordination */
> --
> 2.34.1
>
Hi Ray.
> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Friday, December 23, 2022 3:43 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Sharma, Deepak
> <[email protected]>; Fontenot, Nathan
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; Karny, Wyes <[email protected]>;
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 06/13] cpufreq: amd-pstate: implement Pstate EPP
> support for the AMD processors
>
> On Mon, Dec 19, 2022 at 02:40:35PM +0800, Yuan, Perry wrote:
> > From: Perry Yuan <[email protected]>
> >
> > Add EPP driver support for AMD SoCs which support a dedicated MSR for
> > CPPC. EPP is used by the DPM controller to configure the frequency
> > that a core operates at during short periods of activity.
> >
> > The SoC EPP targets are configured on a scale from 0 to 255 where 0
> > represents maximum performance and 255 represents maximum efficiency.
> >
> > The amd-pstate driver exports profile string names to userspace that
> > are tied to specific EPP values.
> >
> > The balance_performance string (0x80) provides the best balance for
> > efficiency versus power on most systems, but users can choose other
> > strings to meet their needs as well.
> >
> > $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> > default performance balance_performance balance_power power
> >
> > $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> > balance_performance
> >
> > To enable the driver, add `amd_pstate=active` to the kernel command
> > line; the kernel will then load the active mode EPP driver.
> >
>
> Please check the comments in V7's reply:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> I think the static call is not strictly required at this moment.
>
> But the boost/refresh_freq_limits stuff and cpudata may still need some
> enhancement. Otherwise, it looks good to me right now.
>
> Thanks,
> Ray
Thanks for your quick review during this difficult time.
I will rework the patch as you suggested in v9.
Perry.
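For readers following the quoted patch, the request word that amd_pstate_epp_init() builds can be modelled in a few lines of userspace C. This is only a sketch: the `REQ_*` macros and `build_cppc_req()` are hypothetical names, and the byte positions of the min, max and desired perf fields are an assumption about the AMD_CPPC_* macros. The patch itself only shows that EPP occupies bits 31:24 and that desired perf is cleared in EPP mode.

```c
#include <stdint.h>

/* Assumed layout of the CPPC request word, one byte per field. */
#define REQ_MAX_PERF(x) (((uint64_t)(x) & 0xff) << 0)
#define REQ_MIN_PERF(x) (((uint64_t)(x) & 0xff) << 8)
#define REQ_DES_PERF(x) (((uint64_t)(x) & 0xff) << 16)
#define REQ_EPP(x)      (((uint64_t)(x) & 0xff) << 24)

/*
 * Update the four low bytes of an existing request word and leave
 * the upper 32 bits untouched.
 */
static uint64_t build_cppc_req(uint64_t old, uint8_t min_perf,
			       uint8_t max_perf, uint8_t epp)
{
	uint64_t value = old;

	value &= ~REQ_MIN_PERF(0xff);
	value |= REQ_MIN_PERF(min_perf);

	value &= ~REQ_MAX_PERF(0xff);
	value |= REQ_MAX_PERF(max_perf);

	/* Desired perf stays zero so the firmware picks the frequency. */
	value &= ~REQ_DES_PERF(0xff);

	value &= ~REQ_EPP(0xff);
	value |= REQ_EPP(epp);

	return value;
}
```

Under the performance policy the patch sets min_perf equal to max_perf and forces EPP to 0, which in this model is `build_cppc_req(old, max, max, 0)`.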
>
> > Signed-off-by: Perry Yuan <[email protected]>
> > ---
> > drivers/cpufreq/amd-pstate.c | 447 ++++++++++++++++++++++++++++++++++-
> > include/linux/amd-pstate.h | 10 +
> > 2 files changed, 451 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> > index 861a905f9324..66b39457a312 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -59,7 +59,10 @@
> > * we disable it by default to go acpi-cpufreq on these processors and add a
> > * module parameter to be able to enable it manually for debugging.
> > */
> > +static struct cpufreq_driver *default_pstate_driver;
> > static struct cpufreq_driver amd_pstate_driver;
> > +static struct cpufreq_driver amd_pstate_epp_driver;
> > +static struct amd_cpudata **all_cpu_data;
> > static int cppc_state = AMD_PSTATE_DISABLE;
> >
> > static inline int get_mode_idx_from_str(const char *str, size_t size)
> > @@ -70,9 +73,128 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
> > if (!strncmp(str, amd_pstate_mode_string[i], size))
> > return i;
> > }
> > +
> > return -EINVAL;
> > }
> >
> > +/**
> > + * struct amd_pstate_params - global parameters for the performance control
> > + * @cppc_boost_disabled: whether the core performance boost is disabled
> > + */
> > +struct amd_pstate_params {
> > + bool cppc_boost_disabled;
> > +};
> > +
> > +static struct amd_pstate_params global_params;
> > +
> > +static DEFINE_MUTEX(amd_pstate_limits_lock);
> > +static DEFINE_MUTEX(amd_pstate_driver_lock);
> > +
> > +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> > +{
> > + u64 epp;
> > + int ret;
> > +
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + if (!cppc_req_cached) {
> > + epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> > + &cppc_req_cached);
> > + if (epp)
> > + return epp;
> > + }
> > + epp = (cppc_req_cached >> 24) & 0xFF;
> > + } else {
> > + ret = cppc_get_epp_perf(cpudata->cpu, &epp);
> > + if (ret < 0) {
> > + pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> > + return -EIO;
> > + }
> > + }
> > +
> > + return (s16)(epp & 0xff);
> > +}
> > +
> > +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> > +{
> > + s16 epp;
> > + int index = -EINVAL;
> > +
> > + epp = amd_pstate_get_epp(cpudata, 0);
> > + if (epp < 0)
> > + return epp;
> > +
> > + switch (epp) {
> > + case HWP_EPP_PERFORMANCE:
> > + index = EPP_INDEX_PERFORMANCE;
> > + break;
> > + case HWP_EPP_BALANCE_PERFORMANCE:
> > + index = EPP_INDEX_BALANCE_PERFORMANCE;
> > + break;
> > + case HWP_EPP_BALANCE_POWERSAVE:
> > + index = EPP_INDEX_BALANCE_POWERSAVE;
> > + break;
> > + case HWP_EPP_POWERSAVE:
> > + index = EPP_INDEX_POWERSAVE;
> > + break;
> > + default:
> > + break;
> > + }
> > +
> > + return index;
> > +}
> > +
> > +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> > +{
> > + int ret;
> > + struct cppc_perf_ctrls perf_ctrls;
> > +
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + u64 value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > + value &= ~GENMASK_ULL(31, 24);
> > + value |= (u64)epp << 24;
> > + WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > + ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> > + if (!ret)
> > + cpudata->epp_cached = epp;
> > + } else {
> > + perf_ctrls.energy_perf = epp;
> > + ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> > + if (ret) {
> > + pr_debug("failed to set energy perf value (%d)\n", ret);
> > + return ret;
> > + }
> > + cpudata->epp_cached = epp;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> > + int pref_index)
> > +{
> > + int epp = -EINVAL;
> > + int ret;
> > +
> > + if (!pref_index) {
> > + pr_debug("EPP pref_index is invalid\n");
> > + return -EINVAL;
> > + }
> > +
> > + if (epp == -EINVAL)
> > + epp = epp_values[pref_index];
> > +
> > + if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > + pr_debug("EPP cannot be set under performance policy\n");
> > + return -EBUSY;
> > + }
> > +
> > + ret = amd_pstate_set_epp(cpudata, epp);
> > +
> > + return ret;
> > +}
> > +
> > static inline int pstate_enable(bool enable)
> > {
> > return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> > @@ -81,11 +203,21 @@ static inline int pstate_enable(bool enable)
> > static int cppc_enable(bool enable)
> > {
> > int cpu, ret = 0;
> > + struct cppc_perf_ctrls perf_ctrls;
> >
> > for_each_present_cpu(cpu) {
> > ret = cppc_set_enable(cpu, enable);
> > if (ret)
> > return ret;
> > +
> > + /* Enable autonomous mode for EPP */
> > + if (cppc_state == AMD_PSTATE_ACTIVE) {
> > + /* Set desired perf as zero to allow EPP firmware control */
> > + perf_ctrls.desired_perf = 0;
> > + ret = cppc_set_perf(cpu, &perf_ctrls);
> > + if (ret)
> > + return ret;
> > + }
> > }
> >
> > return ret;
> > @@ -429,7 +561,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
> > return;
> >
> > cpudata->boost_supported = true;
> > - amd_pstate_driver.boost_enabled = true;
> > + default_pstate_driver->boost_enabled = true;
> > }
> >
> > static void amd_perf_ctl_reset(unsigned int cpu)
> > @@ -603,10 +735,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> > return sprintf(&buf[0], "%u\n", perf);
> > }
> >
> > +static ssize_t show_energy_performance_available_preferences(
> > + struct cpufreq_policy *policy, char *buf)
> > +{
> > + int i = 0;
> > + int offset = 0;
> > +
> > + while (energy_perf_strings[i] != NULL)
> > + offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
> > +
> > + sysfs_emit_at(buf, offset, "\n");
> > +
> > + return offset;
> > +}
> > +
> > +static ssize_t store_energy_performance_preference(
> > + struct cpufreq_policy *policy, const char *buf, size_t count)
> > +{
> > + struct amd_cpudata *cpudata = policy->driver_data;
> > + char str_preference[21];
> > + ssize_t ret;
> > +
> > + ret = sscanf(buf, "%20s", str_preference);
> > + if (ret != 1)
> > + return -EINVAL;
> > +
> > + ret = match_string(energy_perf_strings, -1, str_preference);
> > + if (ret < 0)
> > + return -EINVAL;
> > +
> > + mutex_lock(&amd_pstate_limits_lock);
> > + ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> > + mutex_unlock(&amd_pstate_limits_lock);
> > +
> > + return ret ?: count;
> > +}
> > +
> > +static ssize_t show_energy_performance_preference(
> > + struct cpufreq_policy *policy, char *buf)
> > +{
> > + struct amd_cpudata *cpudata = policy->driver_data;
> > + int preference;
> > +
> > + preference = amd_pstate_get_energy_pref_index(cpudata);
> > + if (preference < 0)
> > + return preference;
> > +
> > + return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> > +}
> > +
> > cpufreq_freq_attr_ro(amd_pstate_max_freq);
> > cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> > cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> > +cpufreq_freq_attr_rw(energy_performance_preference);
> > +cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >
> > static struct freq_attr *amd_pstate_attr[] = {
> > &amd_pstate_max_freq,
> > @@ -615,6 +798,235 @@ static struct freq_attr *amd_pstate_attr[] = {
> > NULL,
> > };
> >
> > +static struct freq_attr *amd_pstate_epp_attr[] = {
> > + &amd_pstate_max_freq,
> > + &amd_pstate_lowest_nonlinear_freq,
> > + &amd_pstate_highest_perf,
> > + &energy_performance_preference,
> > + &energy_performance_available_preferences,
> > + NULL,
> > +};
> > +
> > +static inline void update_boost_state(void)
> > +{
> > + u64 misc_en;
> > + struct amd_cpudata *cpudata;
> > +
> > + cpudata = all_cpu_data[0];
> > + rdmsrl(MSR_K7_HWCR, misc_en);
> > + global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
> > +}
> > +
> > +static int amd_pstate_init_cpu(unsigned int cpunum)
> > +{
> > + struct amd_cpudata *cpudata;
> > +
> > + cpudata = all_cpu_data[cpunum];
> > + if (!cpudata) {
> > + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> > + if (!cpudata)
> > + return -ENOMEM;
> > + WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> > +
> > + cpudata->cpu = cpunum;
> > + }
> > +
> > + cpudata->epp_policy = 0;
> > + pr_debug("controlling: cpu %d\n", cpunum);
> > + return 0;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> > +{
> > + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > + struct amd_cpudata *cpudata;
> > + struct device *dev;
> > + int rc;
> > + u64 value;
> > +
> > + rc = amd_pstate_init_cpu(policy->cpu);
> > + if (rc)
> > + return rc;
> > +
> > + cpudata = all_cpu_data[policy->cpu];
> > +
> > + dev = get_cpu_device(policy->cpu);
> > + if (!dev)
> > + goto free_cpudata1;
> > +
> > + rc = amd_pstate_init_perf(cpudata);
> > + if (rc)
> > + goto free_cpudata1;
> > +
> > + min_freq = amd_get_min_freq(cpudata);
> > + max_freq = amd_get_max_freq(cpudata);
> > + nominal_freq = amd_get_nominal_freq(cpudata);
> > + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> > + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> > + min_freq, max_freq);
> > + ret = -EINVAL;
> > + goto free_cpudata1;
> > + }
> > +
> > + policy->min = min_freq;
> > + policy->max = max_freq;
> > +
> > + policy->cpuinfo.min_freq = min_freq;
> > + policy->cpuinfo.max_freq = max_freq;
> > + /* It will be updated by governor */
> > + policy->cur = policy->cpuinfo.min_freq;
> > +
> > + /* Initial processor data capability frequencies */
> > + cpudata->max_freq = max_freq;
> > + cpudata->min_freq = min_freq;
> > + cpudata->nominal_freq = nominal_freq;
> > + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > +
> > + policy->driver_data = cpudata;
> > +
> > + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> > +
> > + policy->min = policy->cpuinfo.min_freq;
> > + policy->max = policy->cpuinfo.max_freq;
> > +
> > + /*
> > + * Set the policy to powersave to provide a valid fallback value in case
> > + * the default cpufreq governor is neither powersave nor performance.
> > + */
> > + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> > +
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + policy->fast_switch_possible = true;
> > + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> > + if (ret)
> > + return ret;
> > + WRITE_ONCE(cpudata->cppc_req_cached, value);
> > +
> > + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> > + if (ret)
> > + return ret;
> > + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> > + }
> > + amd_pstate_boost_init(cpudata);
> > +
> > + return 0;
> > +
> > +free_cpudata1:
> > + kfree(cpudata);
> > + return ret;
> > +}
> > +
> > +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> > +{
> > + pr_debug("CPU %d exiting\n", policy->cpu);
> > + policy->fast_switch_possible = false;
> > + return 0;
> > +}
> > +
> > +static void amd_pstate_update_max_freq(unsigned int cpu)
> > +{
> > + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > +
> > + if (!policy)
> > + return;
> > +
> > + refresh_frequency_limits(policy);
> > + cpufreq_cpu_put(policy);
> > +}
> > +
> > +static void amd_pstate_epp_update_limits(unsigned int cpu)
> > +{
> > + mutex_lock(&amd_pstate_driver_lock);
> > + update_boost_state();
> > + if (global_params.cppc_boost_disabled) {
> > + for_each_possible_cpu(cpu)
> > + amd_pstate_update_max_freq(cpu);
> > + } else {
> > + cpufreq_update_policy(cpu);
> > + }
> > + mutex_unlock(&amd_pstate_driver_lock);
> > +}
> > +
> > +static void amd_pstate_epp_init(unsigned int cpu)
> > +{
> > + struct amd_cpudata *cpudata = all_cpu_data[cpu];
> > + u32 max_perf, min_perf;
> > + u64 value;
> > + s16 epp;
> > +
> > + max_perf = READ_ONCE(cpudata->highest_perf);
> > + min_perf = READ_ONCE(cpudata->lowest_perf);
> > +
> > + value = READ_ONCE(cpudata->cppc_req_cached);
> > +
> > + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> > + min_perf = max_perf;
> > +
> > + /* Initial min/max values for CPPC Performance Controls Register */
> > + value &= ~AMD_CPPC_MIN_PERF(~0L);
> > + value |= AMD_CPPC_MIN_PERF(min_perf);
> > +
> > + value &= ~AMD_CPPC_MAX_PERF(~0L);
> > + value |= AMD_CPPC_MAX_PERF(max_perf);
> > +
> > + /* The CPPC EPP feature requires the desired perf field to be zero */
> > + value &= ~AMD_CPPC_DES_PERF(~0L);
> > + value |= AMD_CPPC_DES_PERF(0);
> > +
> > + if (cpudata->epp_policy == cpudata->policy)
> > + goto skip_epp;
> > +
> > + cpudata->epp_policy = cpudata->policy;
> > +
> > + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> > + epp = amd_pstate_get_epp(cpudata, value);
> > + if (epp < 0)
> > + goto skip_epp;
> > + /* force the epp value to be zero for performance policy */
> > + epp = 0;
> > + } else {
> > + /* Get BIOS pre-defined epp value */
> > + epp = amd_pstate_get_epp(cpudata, value);
> > + if (epp)
> > + goto skip_epp;
> > + }
> > + /* Set initial EPP value */
> > + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > + value &= ~GENMASK_ULL(31, 24);
> > + value |= (u64)epp << 24;
> > + }
> > +
> > +skip_epp:
> > + WRITE_ONCE(cpudata->cppc_req_cached, value);
> > + amd_pstate_set_epp(cpudata, epp);
> > +}
> > +
> > +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> > +{
> > + struct amd_cpudata *cpudata;
> > +
> > + if (!policy->cpuinfo.max_freq)
> > + return -ENODEV;
> > +
> > + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> > + policy->cpuinfo.max_freq, policy->max);
> > +
> > + cpudata = all_cpu_data[policy->cpu];
> > + cpudata->policy = policy->policy;
> > +
> > + amd_pstate_epp_init(policy->cpu);
> > +
> > + return 0;
> > +}
> > +
> > +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> > +{
> > + cpufreq_verify_within_cpu_limits(policy);
> > + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> > + return 0;
> > +}
> > +
> > static struct cpufreq_driver amd_pstate_driver = {
> > .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
> > .verify = amd_pstate_verify,
> > @@ -628,8 +1040,20 @@ static struct cpufreq_driver amd_pstate_driver = {
> > .attr = amd_pstate_attr,
> > };
> >
> > +static struct cpufreq_driver amd_pstate_epp_driver = {
> > + .flags = CPUFREQ_CONST_LOOPS,
> > + .verify = amd_pstate_epp_verify_policy,
> > + .setpolicy = amd_pstate_epp_set_policy,
> > + .init = amd_pstate_epp_cpu_init,
> > + .exit = amd_pstate_epp_cpu_exit,
> > + .update_limits = amd_pstate_epp_update_limits,
> > + .name = "amd_pstate_epp",
> > + .attr = amd_pstate_epp_attr,
> > +};
> > +
> > static int __init amd_pstate_init(void)
> > {
> > + static struct amd_cpudata **cpudata;
> > int ret;
> >
> > if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> > @@ -656,7 +1080,8 @@ static int __init amd_pstate_init(void)
> > /* capability check */
> > if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > pr_debug("AMD CPPC MSR based functionality is supported\n");
> > - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> > + if (cppc_state == AMD_PSTATE_PASSIVE)
> > + default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> > } else {
> > pr_debug("AMD CPPC shared memory based functionality is supported\n");
> > static_call_update(amd_pstate_enable, cppc_enable);
> > @@ -664,17 +1089,21 @@ static int __init amd_pstate_init(void)
> > static_call_update(amd_pstate_update_perf, cppc_update_perf);
> > }
> >
> > + cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> > + if (!cpudata)
> > + return -ENOMEM;
> > + WRITE_ONCE(all_cpu_data, cpudata);
> > +
> > /* enable amd pstate feature */
> > ret = amd_pstate_enable(true);
> > if (ret) {
> > - pr_err("failed to enable amd-pstate with return %d\n", ret);
> > + pr_err("failed to enable with return %d\n", ret);
> > return ret;
> > }
> >
> > - ret = cpufreq_register_driver(&amd_pstate_driver);
> > + ret = cpufreq_register_driver(default_pstate_driver);
> > if (ret)
> > - pr_err("failed to register amd_pstate_driver with return %d\n",
> > - ret);
> > + pr_err("failed to register with return %d\n", ret);
> >
> > return ret;
> > }
> > @@ -696,6 +1125,12 @@ static int __init amd_pstate_param(char *str)
> > if (cppc_state == AMD_PSTATE_DISABLE)
> > pr_info("driver is explicitly disabled\n");
> >
> > + if (cppc_state == AMD_PSTATE_ACTIVE)
> > + default_pstate_driver = &amd_pstate_epp_driver;
> > +
> > + if (cppc_state == AMD_PSTATE_PASSIVE)
> > + default_pstate_driver = &amd_pstate_driver;
> > +
> > return 0;
> > }
> >
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index 922d05a13902..fe1aef743c09 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> > * @prev: Last Aperf/Mperf/tsc count value read from register
> > * @freq: current cpu frequency value
> > * @boost_supported: check whether the Processor or SBIOS supports boost mode
> > + * @epp_policy: Last saved policy used to set energy-performance preference
> > + * @epp_cached: Cached CPPC energy-performance preference value
> > + * @policy: Cpufreq policy value
> > + * @cppc_cap1_cached: Cached MSR_AMD_CPPC_CAP1 register value
> > *
> > * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> > * represents all the attributes and goals that AMD P-State requests at runtime.
> > @@ -72,6 +76,12 @@ struct amd_cpudata {
> >
> > u64 freq;
> > bool boost_supported;
> > +
> > + /* EPP feature related attributes */
> > + s16 epp_policy;
> > + s16 epp_cached;
> > + u32 policy;
> > + u64 cppc_cap1_cached;
> > };
> >
> > /**
> > --
> > 2.34.1
> >
On Mon, Dec 19, 2022 at 02:40:35PM +0800, Yuan, Perry wrote:
> From: Perry Yuan <[email protected]>
>
> Add EPP driver support for AMD SoCs which support a dedicated MSR for
> CPPC. EPP is used by the DPM controller to configure the frequency that
> a core operates at during short periods of activity.
>
> The SoC EPP targets are configured on a scale from 0 to 255 where 0
> represents maximum performance and 255 represents maximum efficiency.
>
> The amd-pstate driver exports profile string names to userspace that are
> tied to specific EPP values.
>
> The balance_performance string (0x80) provides the best balance for
> efficiency versus power on most systems, but users can choose other
> strings to meet their needs as well.
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_available_preferences
> default performance balance_performance balance_power power
>
> $ cat /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference
> balance_performance
>
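For illustration, the relationship between the profile strings and the 0..255 EPP scale described above can be sketched in plain userspace C. This is only a sketch, not the driver's table: the names `epp_profile`, `epp_profiles` and `epp_from_profile` are hypothetical, the value for balance_power is an assumption (the commit message only states 0x80 for balance_performance and the two endpoints), and "default" is left out because it does not name a fixed EPP value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup entry: sysfs profile string -> EPP hint. */
struct epp_profile {
	const char *name;
	int epp;	/* 0x00 = maximum performance, 0xff = maximum efficiency */
};

static const struct epp_profile epp_profiles[] = {
	{ "performance",         0x00 },
	{ "balance_performance", 0x80 },	/* value stated in the commit message */
	{ "balance_power",       0xc0 },	/* assumed value, not stated in the commit message */
	{ "power",               0xff },
};

/* Return the EPP hint for a profile name, or -1 if the name is not in the table. */
int epp_from_profile(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(epp_profiles) / sizeof(epp_profiles[0]); i++) {
		if (strcmp(name, epp_profiles[i].name) == 0)
			return epp_profiles[i].epp;
	}
	return -1;
}
```

With this table, "balance_performance" resolves to 0x80 and an unlisted string such as "default" yields -1.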
> To enable the driver, add `amd_pstate=active` to the kernel command line;
> the kernel will then load the active-mode EPP driver.
>
Please check the comments in V7's reply:
https://lore.kernel.org/lkml/[email protected]/
I don't think the static call is strictly required at this point.
However, the boost/refresh_freq_limits handling and the cpudata management may
still need some improvement. Everything else looks good to me for now.
Thanks,
Ray
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 447 ++++++++++++++++++++++++++++++++++-
> include/linux/amd-pstate.h | 10 +
> 2 files changed, 451 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 861a905f9324..66b39457a312 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -59,7 +59,10 @@
> * we disable it by default to go acpi-cpufreq on these processors and add a
> * module parameter to be able to enable it manually for debugging.
> */
> +static struct cpufreq_driver *default_pstate_driver;
> static struct cpufreq_driver amd_pstate_driver;
> +static struct cpufreq_driver amd_pstate_epp_driver;
> +static struct amd_cpudata **all_cpu_data;
> static int cppc_state = AMD_PSTATE_DISABLE;
>
> static inline int get_mode_idx_from_str(const char *str, size_t size)
> @@ -70,9 +73,128 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
> if (!strncmp(str, amd_pstate_mode_string[i], size))
> return i;
> }
> +
> return -EINVAL;
> }
>
> +/**
> + * struct amd_pstate_params - global parameters for the performance control
> + * @cppc_boost_disabled: whether core performance boost is disabled
> + */
> +struct amd_pstate_params {
> + bool cppc_boost_disabled;
> +};
> +
> +static struct amd_pstate_params global_params;
> +
> +static DEFINE_MUTEX(amd_pstate_limits_lock);
> +static DEFINE_MUTEX(amd_pstate_driver_lock);
> +
> +static s16 amd_pstate_get_epp(struct amd_cpudata *cpudata, u64 cppc_req_cached)
> +{
> + u64 epp;
> + int ret;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + if (!cppc_req_cached) {
> + epp = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ,
> + &cppc_req_cached);
> + if (epp)
> + return epp;
> + }
> + epp = (cppc_req_cached >> 24) & 0xFF;
> + } else {
> + ret = cppc_get_epp_perf(cpudata->cpu, &epp);
> + if (ret < 0) {
> + pr_debug("Could not retrieve energy perf value (%d)\n", ret);
> + return -EIO;
> + }
> + }
> +
> + return (s16)(epp & 0xff);
> +}
> +
> +static int amd_pstate_get_energy_pref_index(struct amd_cpudata *cpudata)
> +{
> + s16 epp;
> + int index = -EINVAL;
> +
> + epp = amd_pstate_get_epp(cpudata, 0);
> + if (epp < 0)
> + return epp;
> +
> + switch (epp) {
> + case HWP_EPP_PERFORMANCE:
> + index = EPP_INDEX_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_PERFORMANCE:
> + index = EPP_INDEX_BALANCE_PERFORMANCE;
> + break;
> + case HWP_EPP_BALANCE_POWERSAVE:
> + index = EPP_INDEX_BALANCE_POWERSAVE;
> + break;
> + case HWP_EPP_POWERSAVE:
> + index = EPP_INDEX_POWERSAVE;
> + break;
> + default:
> + break;
> + }
> +
> + return index;
> +}
> +
> +static int amd_pstate_set_epp(struct amd_cpudata *cpudata, u32 epp)
> +{
> + int ret;
> + struct cppc_perf_ctrls perf_ctrls;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + u64 value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
> + if (!ret)
> + cpudata->epp_cached = epp;
> + } else {
> + perf_ctrls.energy_perf = epp;
> + ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
> + if (ret) {
> + pr_debug("failed to set energy perf value (%d)\n", ret);
> + return ret;
> + }
> + cpudata->epp_cached = epp;
> + }
> +
> + return ret;
> +}
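The bit manipulation used in the two helpers above (EPP lives in bits 31:24 of the cached MSR_AMD_CPPC_REQ value) can be tried out in isolation. The following is a minimal userspace sketch under that assumption only; `EPP_SHIFT`, `EPP_MASK`, `cppc_req_get_epp` and `cppc_req_set_epp` are hypothetical names and do not exist in the driver.

```c
#include <stdint.h>

#define EPP_SHIFT 24
#define EPP_MASK  (UINT64_C(0xff) << EPP_SHIFT)	/* bits 31:24 */

/* Extract the EPP field from a cached request value. */
uint8_t cppc_req_get_epp(uint64_t req)
{
	return (uint8_t)((req & EPP_MASK) >> EPP_SHIFT);
}

/* Return req with its EPP field replaced; all other bits are preserved. */
uint64_t cppc_req_set_epp(uint64_t req, uint8_t epp)
{
	return (req & ~EPP_MASK) | ((uint64_t)epp << EPP_SHIFT);
}
```

Clearing the field before OR-ing in the new value is what keeps a previously stored EPP from leaking into the result; the min, max and desired perf fields in the lower 24 bits are left untouched.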
> +
> +static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
> + int pref_index)
> +{
> + int epp = -EINVAL;
> + int ret;
> +
> + if (!pref_index) {
> + pr_debug("EPP pref_index is invalid\n");
> + return -EINVAL;
> + }
> +
> + if (epp == -EINVAL)
> + epp = epp_values[pref_index];
> +
> + if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + pr_debug("EPP cannot be set under performance policy\n");
> + return -EBUSY;
> + }
> +
> + ret = amd_pstate_set_epp(cpudata, epp);
> +
> + return ret;
> +}
> +
> static inline int pstate_enable(bool enable)
> {
> return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
> @@ -81,11 +203,21 @@ static inline int pstate_enable(bool enable)
> static int cppc_enable(bool enable)
> {
> int cpu, ret = 0;
> + struct cppc_perf_ctrls perf_ctrls;
>
> for_each_present_cpu(cpu) {
> ret = cppc_set_enable(cpu, enable);
> if (ret)
> return ret;
> +
> + /* Enable autonomous mode for EPP */
> + if (cppc_state == AMD_PSTATE_ACTIVE) {
> + /* Set desired perf as zero to allow EPP firmware control */
> + perf_ctrls.desired_perf = 0;
> + ret = cppc_set_perf(cpu, &perf_ctrls);
> + if (ret)
> + return ret;
> + }
> }
>
> return ret;
> @@ -429,7 +561,7 @@ static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
> return;
>
> cpudata->boost_supported = true;
> - amd_pstate_driver.boost_enabled = true;
> + default_pstate_driver->boost_enabled = true;
> }
>
> static void amd_perf_ctl_reset(unsigned int cpu)
> @@ -603,10 +735,61 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> return sprintf(&buf[0], "%u\n", perf);
> }
>
> +static ssize_t show_energy_performance_available_preferences(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + int i = 0;
> + int offset = 0;
> +
> + while (energy_perf_strings[i] != NULL)
> + offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
> +
> + sysfs_emit_at(buf, offset, "\n");
> +
> + return offset;
> +}
> +
> +static ssize_t store_energy_performance_preference(
> + struct cpufreq_policy *policy, const char *buf, size_t count)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + char str_preference[21];
> + ssize_t ret;
> +
> + ret = sscanf(buf, "%20s", str_preference);
> + if (ret != 1)
> + return -EINVAL;
> +
> + ret = match_string(energy_perf_strings, -1, str_preference);
> + if (ret < 0)
> + return -EINVAL;
> +
> + mutex_lock(&amd_pstate_limits_lock);
> + ret = amd_pstate_set_energy_pref_index(cpudata, ret);
> + mutex_unlock(&amd_pstate_limits_lock);
> +
> + return ret ?: count;
> +}
> +
> +static ssize_t show_energy_performance_preference(
> + struct cpufreq_policy *policy, char *buf)
> +{
> + struct amd_cpudata *cpudata = policy->driver_data;
> + int preference;
> +
> + preference = amd_pstate_get_energy_pref_index(cpudata);
> + if (preference < 0)
> + return preference;
> +
> + return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
> +}
> +
> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_rw(energy_performance_preference);
> +cpufreq_freq_attr_ro(energy_performance_available_preferences);
>
> static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> @@ -615,6 +798,235 @@ static struct freq_attr *amd_pstate_attr[] = {
> NULL,
> };
>
> +static struct freq_attr *amd_pstate_epp_attr[] = {
> + &amd_pstate_max_freq,
> + &amd_pstate_lowest_nonlinear_freq,
> + &amd_pstate_highest_perf,
> + &energy_performance_preference,
> + &energy_performance_available_preferences,
> + NULL,
> +};
> +
> +static inline void update_boost_state(void)
> +{
> + u64 misc_en;
> + struct amd_cpudata *cpudata;
> +
> + cpudata = all_cpu_data[0];
> + rdmsrl(MSR_K7_HWCR, misc_en);
> + global_params.cppc_boost_disabled = misc_en & BIT_ULL(25);
> +}
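As a small illustration of the check in update_boost_state(): the function reads the HWCR MSR and treats bit 25 as the "boost disabled" flag. The sketch below decodes that bit from an already-read 64-bit value in userspace; `HWCR_BOOST_DIS_BIT` and `hwcr_boost_disabled` are hypothetical names, and the bit position is taken from the BIT_ULL(25) test in the patch rather than from a hardware manual.

```c
#include <stdbool.h>
#include <stdint.h>

#define HWCR_BOOST_DIS_BIT 25	/* bit tested by update_boost_state() in the patch */

/* Return true when the boost-disable bit is set in a raw HWCR value. */
bool hwcr_boost_disabled(uint64_t hwcr)
{
	return ((hwcr >> HWCR_BOOST_DIS_BIT) & 1) != 0;
}
```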
> +
> +static int amd_pstate_init_cpu(unsigned int cpunum)
> +{
> + struct amd_cpudata *cpudata;
> +
> + cpudata = all_cpu_data[cpunum];
> + if (!cpudata) {
> + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL);
> + if (!cpudata)
> + return -ENOMEM;
> + WRITE_ONCE(all_cpu_data[cpunum], cpudata);
> +
> + cpudata->cpu = cpunum;
> + }
> +
> + cpudata->epp_policy = 0;
> + pr_debug("controlling: cpu %d\n", cpunum);
> + return 0;
> +}
> +
> +static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> +{
> + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> + struct amd_cpudata *cpudata;
> + struct device *dev;
> + int rc;
> + u64 value;
> +
> + rc = amd_pstate_init_cpu(policy->cpu);
> + if (rc)
> + return rc;
> +
> + cpudata = all_cpu_data[policy->cpu];
> +
> + dev = get_cpu_device(policy->cpu);
> + if (!dev)
> + goto free_cpudata1;
> +
> + rc = amd_pstate_init_perf(cpudata);
> + if (rc)
> + goto free_cpudata1;
> +
> + min_freq = amd_get_min_freq(cpudata);
> + max_freq = amd_get_max_freq(cpudata);
> + nominal_freq = amd_get_nominal_freq(cpudata);
> + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> + min_freq, max_freq);
> + ret = -EINVAL;
> + goto free_cpudata1;
> + }
> +
> + policy->min = min_freq;
> + policy->max = max_freq;
> +
> + policy->cpuinfo.min_freq = min_freq;
> + policy->cpuinfo.max_freq = max_freq;
> + /* It will be updated by governor */
> + policy->cur = policy->cpuinfo.min_freq;
> +
> + /* Initial processor data capability frequencies */
> + cpudata->max_freq = max_freq;
> + cpudata->min_freq = min_freq;
> + cpudata->nominal_freq = nominal_freq;
> + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> +
> + policy->driver_data = cpudata;
> +
> + cpudata->epp_cached = amd_pstate_get_epp(cpudata, value);
> +
> + policy->min = policy->cpuinfo.min_freq;
> + policy->max = policy->cpuinfo.max_freq;
> +
> + /*
> + * Set the policy to powersave to provide a valid fallback value in case
> + * the default cpufreq governor is neither powersave nor performance.
> + */
> + policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + policy->fast_switch_possible = true;
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> + if (ret)
> + return ret;
> + WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> + }
> + amd_pstate_boost_init(cpudata);
> +
> + return 0;
> +
> +free_cpudata1:
> + kfree(cpudata);
> + return ret;
> +}
> +
> +static int amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> +{
> + pr_debug("CPU %d exiting\n", policy->cpu);
> + policy->fast_switch_possible = false;
> + return 0;
> +}
> +
> +static void amd_pstate_update_max_freq(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +
> + if (!policy)
> + return;
> +
> + refresh_frequency_limits(policy);
> + cpufreq_cpu_put(policy);
> +}
> +
> +static void amd_pstate_epp_update_limits(unsigned int cpu)
> +{
> + mutex_lock(&amd_pstate_driver_lock);
> + update_boost_state();
> + if (global_params.cppc_boost_disabled) {
> + for_each_possible_cpu(cpu)
> + amd_pstate_update_max_freq(cpu);
> + } else {
> + cpufreq_update_policy(cpu);
> + }
> + mutex_unlock(&amd_pstate_driver_lock);
> +}
> +
> +static void amd_pstate_epp_init(unsigned int cpu)
> +{
> + struct amd_cpudata *cpudata = all_cpu_data[cpu];
> + u32 max_perf, min_perf;
> + u64 value;
> + s16 epp;
> +
> + max_perf = READ_ONCE(cpudata->highest_perf);
> + min_perf = READ_ONCE(cpudata->lowest_perf);
> +
> + value = READ_ONCE(cpudata->cppc_req_cached);
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + min_perf = max_perf;
> +
> + /* Initial min/max values for CPPC Performance Controls Register */
> + value &= ~AMD_CPPC_MIN_PERF(~0L);
> + value |= AMD_CPPC_MIN_PERF(min_perf);
> +
> + value &= ~AMD_CPPC_MAX_PERF(~0L);
> + value |= AMD_CPPC_MAX_PERF(max_perf);
> +
> + /* The CPPC EPP feature requires the desired perf field to be zero */
> + value &= ~AMD_CPPC_DES_PERF(~0L);
> + value |= AMD_CPPC_DES_PERF(0);
> +
> + if (cpudata->epp_policy == cpudata->policy)
> + goto skip_epp;
> +
> + cpudata->epp_policy = cpudata->policy;
> +
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp < 0)
> + goto skip_epp;
> + /* force the epp value to be zero for performance policy */
> + epp = 0;
> + } else {
> + /* Get BIOS pre-defined epp value */
> + epp = amd_pstate_get_epp(cpudata, value);
> + if (epp)
> + goto skip_epp;
> + }
> + /* Set initial EPP value */
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + value &= ~GENMASK_ULL(31, 24);
> + value |= (u64)epp << 24;
> + }
> +
> +skip_epp:
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> + amd_pstate_set_epp(cpudata, epp);
> +}
> +
> +static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> +{
> + struct amd_cpudata *cpudata;
> +
> + if (!policy->cpuinfo.max_freq)
> + return -ENODEV;
> +
> + pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> + policy->cpuinfo.max_freq, policy->max);
> +
> + cpudata = all_cpu_data[policy->cpu];
> + cpudata->policy = policy->policy;
> +
> + amd_pstate_epp_init(policy->cpu);
> +
> + return 0;
> +}
> +
> +static int amd_pstate_epp_verify_policy(struct cpufreq_policy_data *policy)
> +{
> + cpufreq_verify_within_cpu_limits(policy);
> + pr_debug("policy_max =%d, policy_min=%d\n", policy->max, policy->min);
> + return 0;
> +}
> +
> static struct cpufreq_driver amd_pstate_driver = {
> .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
> .verify = amd_pstate_verify,
> @@ -628,8 +1040,20 @@ static struct cpufreq_driver amd_pstate_driver = {
> .attr = amd_pstate_attr,
> };
>
> +static struct cpufreq_driver amd_pstate_epp_driver = {
> + .flags = CPUFREQ_CONST_LOOPS,
> + .verify = amd_pstate_epp_verify_policy,
> + .setpolicy = amd_pstate_epp_set_policy,
> + .init = amd_pstate_epp_cpu_init,
> + .exit = amd_pstate_epp_cpu_exit,
> + .update_limits = amd_pstate_epp_update_limits,
> + .name = "amd_pstate_epp",
> + .attr = amd_pstate_epp_attr,
> +};
> +
> static int __init amd_pstate_init(void)
> {
> + static struct amd_cpudata **cpudata;
> int ret;
>
> if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> @@ -656,7 +1080,8 @@ static int __init amd_pstate_init(void)
> /* capability check */
> if (boot_cpu_has(X86_FEATURE_CPPC)) {
> pr_debug("AMD CPPC MSR based functionality is supported\n");
> - amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + default_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
> } else {
> pr_debug("AMD CPPC shared memory based functionality is supported\n");
> static_call_update(amd_pstate_enable, cppc_enable);
> @@ -664,17 +1089,21 @@ static int __init amd_pstate_init(void)
> static_call_update(amd_pstate_update_perf, cppc_update_perf);
> }
>
> + cpudata = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
> + if (!cpudata)
> + return -ENOMEM;
> + WRITE_ONCE(all_cpu_data, cpudata);
> +
> /* enable amd pstate feature */
> ret = amd_pstate_enable(true);
> if (ret) {
> - pr_err("failed to enable amd-pstate with return %d\n", ret);
> + pr_err("failed to enable with return %d\n", ret);
> return ret;
> }
>
> - ret = cpufreq_register_driver(&amd_pstate_driver);
> + ret = cpufreq_register_driver(default_pstate_driver);
> if (ret)
> - pr_err("failed to register amd_pstate_driver with return %d\n",
> - ret);
> + pr_err("failed to register with return %d\n", ret);
>
> return ret;
> }
> @@ -696,6 +1125,12 @@ static int __init amd_pstate_param(char *str)
> if (cppc_state == AMD_PSTATE_DISABLE)
> pr_info("driver is explicitly disabled\n");
>
> + if (cppc_state == AMD_PSTATE_ACTIVE)
> + default_pstate_driver = &amd_pstate_epp_driver;
> +
> + if (cppc_state == AMD_PSTATE_PASSIVE)
> + default_pstate_driver = &amd_pstate_driver;
> +
> return 0;
> }
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 922d05a13902..fe1aef743c09 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -47,6 +47,10 @@ struct amd_aperf_mperf {
> * @prev: Last Aperf/Mperf/tsc count value read from register
> * @freq: current cpu frequency value
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> + * @epp_policy: Last saved policy used to set energy-performance preference
> + * @epp_cached: Cached CPPC energy-performance preference value
> + * @policy: Cpufreq policy value
> + * @cppc_cap1_cached: Cached MSR_AMD_CPPC_CAP1 register value
> *
> * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -72,6 +76,12 @@ struct amd_cpudata {
>
> u64 freq;
> bool boost_supported;
> +
> + /* EPP feature related attributes */
> + s16 epp_policy;
> + s16 epp_cached;
> + u32 policy;
> + u64 cppc_cap1_cached;
> };
>
> /**
> --
> 2.34.1
>