This patch series adds fixes and enhancements to the AMD pstate driver.
It enables CPPC v2 for certain family 17h processors, as requested by
TR40 processor users who expect improved performance and lower system
temperature.
Additionally, it fixes the initialization of nominal_freq for each
cpudata, and it reads the latency and delay values from platform
firmware first for more accurate timing.
A new quirk is also added for legacy processors that lack CPPC
capabilities, which caused the pstate driver to fail to load.
Tested on one APU system with CPB boost enabled:
amd_pstate_lowest_nonlinear_freq:1701000
amd_pstate_max_freq:3501000
cpuinfo_max_freq:3501000
cpuinfo_min_freq:400000
scaling_cur_freq:3084836
scaling_max_freq:3501000
scaling_min_freq:400000
analyzing CPU 6:
driver: amd-pstate-epp
CPUs which run at the same hardware frequency: 6
CPUs which need to have their frequency coordinated by software: 6
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 3.50 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 3.50 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 3.50 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 255. Maximum Frequency: 3.50 GHz.
AMD PSTATE Nominal Performance: 204. Nominal Frequency: 2.80 GHz.
AMD PSTATE Lowest Non-linear Performance: 124. Lowest Non-linear Frequency: 1.70 GHz.
AMD PSTATE Lowest Performance: 30. Lowest Frequency: 400 MHz.
To test this patchset, please also apply the following patchset on top
of it, in case some unexpected issue is found:
https://lore.kernel.org/lkml/[email protected]/
It implements the amd-pstate CPB boost feature.
The patch link above points to an old version; please apply the latest
version when you start testing.
I would greatly appreciate any feedback.
Thank you!
Changes from v7:
* Gautham helped to include some new, improved patches in the patchset.
* Adds comments for cpudata->{min,max}_limit_{perf,freq} variables [New Patch].
* Clarifies via comments that the units for cpudata->*_freq are khz [New Patch].
* Implements the unified computation of all cpudata->*_freq
* v7 Patch 2/6 was dropped as it is not needed any more
* moved the quirk check to the amd_pstate_get_freq() function
* pick up RB flags from Gautham
* After the cleanup in patch 3, we don't need the helpers
amd_get_{min,max,nominal,lowest_nonlinear}_freq(). This
patch removes them [New Patch].
* testing done on APU system as well, no regression found.
Changes from v6:
* add one new patch to initialize capabilities in
amd_pstate_init_perf, which avoids duplicate reads of the cppc
capabilities; the change has been tested on an APU system.
* pick up RB flags from Gautham
* drop the patch 1/6 which has been merged by Rafael
Changes from v5:
* rebased to linux-pm v6.8
* pick up RB flag for patch 6 (Mario)
Changes from v4:
* improve the dmi matching rule with zen2 flag only
Changes from v3:
* change quirk matching broken BIOS with family/model ID and Zen2
flag to fix the CPPC definition issue
* fix typo in quirk
Changes from v2:
* change quirk matching to BIOS version and release (Mario)
* pick up RB flag from Mario
Changes from v1:
* pick up the RB flags from Mario
* address review comment of patch #6 for amd_get_nominal_freq()
* rebased the series to linux-pm/bleeding-edge v6.8.0-rc2
* update debug log for patch #5 as Mario suggested.
* fix some typos and format problems
* tested on 7950X platform
V1: https://lore.kernel.org/lkml/[email protected]/
V2: https://lore.kernel.org/all/[email protected]/
v3: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/
v7: https://lore.kernel.org/lkml/[email protected]/
Gautham R. Shenoy (3):
cpufreq: amd-pstate: Document *_limit_* fields in struct amd_cpudata
cpufreq: amd-pstate: Document the units for freq variables in
amd_cpudata
cpufreq: amd-pstate: Remove
amd_get_{min,max,nominal,lowest_nonlinear}_freq()
Perry Yuan (5):
cpufreq: amd-pstate: Unify computation of
{max,min,nominal,lowest_nonlinear}_freq
cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
cpufreq: amd-pstate: get transition delay and latency value from ACPI
tables
cppc_acpi: print error message if CPPC is unsupported
cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities
missing
drivers/acpi/cppc_acpi.c | 4 +-
drivers/cpufreq/amd-pstate.c | 257 +++++++++++++++++++++--------------
include/linux/amd-pstate.h | 20 ++-
3 files changed, 174 insertions(+), 107 deletions(-)
--
2.34.1
From: "Gautham R. Shenoy" <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
include/linux/amd-pstate.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index d21838835abd..212f377d615b 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -49,6 +49,10 @@ struct amd_aperf_mperf {
* @lowest_perf: the absolute lowest performance level of the processor
* @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
* priority.
+ * @min_limit_perf: Cached value of the perf corresponding to policy->min
+ * @max_limit_perf: Cached value of the perf corresponding to policy->max
+ * @min_limit_freq: Cached value of policy->min
+ * @max_limit_freq: Cached value of policy->max
* @max_freq: the frequency that mapped to highest_perf
* @min_freq: the frequency that mapped to lowest_perf
* @nominal_freq: the frequency that mapped to nominal_perf
--
2.34.1
From: "Gautham R. Shenoy" <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
include/linux/amd-pstate.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 212f377d615b..ab7e82533718 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -51,15 +51,15 @@ struct amd_aperf_mperf {
* priority.
* @min_limit_perf: Cached value of the perf corresponding to policy->min
* @max_limit_perf: Cached value of the perf corresponding to policy->max
- * @min_limit_freq: Cached value of policy->min
- * @max_limit_freq: Cached value of policy->max
- * @max_freq: the frequency that mapped to highest_perf
- * @min_freq: the frequency that mapped to lowest_perf
- * @nominal_freq: the frequency that mapped to nominal_perf
- * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
+ * @min_limit_freq: Cached value of policy->min (in khz)
+ * @max_limit_freq: Cached value of policy->max (in khz)
+ * @max_freq: the frequency (in khz) that mapped to highest_perf
+ * @min_freq: the frequency (in khz) that mapped to lowest_perf
+ * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
+ * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
* @prev: Last Aperf/Mperf/tsc count value read from register
- * @freq: current cpu frequency value
+ * @freq: current cpu frequency value (in khz)
* @boost_supported: check whether the Processor or SBIOS supports boost mode
* @hw_prefcore: check whether HW supports preferred core featue.
* Only when hw_prefcore and early prefcore param are true,
--
2.34.1
Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
helpers compute the values of min_freq, max_freq, nominal_freq and
lowest_nonlinear_freq respectively afresh from
cppc_get_perf_caps(). This is not necessary as there are fields in
cpudata to cache these values.
To simplify this, add a single helper function named
amd_pstate_init_freq() which computes all these frequencies at once, and
caches them in cpudata.
Use the cached values everywhere else in the code.
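As a rough illustration of the arithmetic described above, here is a
minimal userspace sketch, not the driver code itself. The names
freq_set, scale_khz() and init_freq() are hypothetical, CAPACITY_SHIFT
stands in for the kernel's SCHED_CAPACITY_SHIFT, and the 64-bit
intermediates are an assumption of this sketch (the patch itself uses
u32 variables).

```c
#include <stdint.h>

/* Stands in for the kernel's SCHED_CAPACITY_SHIFT (assumed to be 10). */
#define CAPACITY_SHIFT 10

/* Hypothetical container for the four cached frequencies, all in khz. */
struct freq_set {
	uint32_t min_khz;
	uint32_t lowest_nonlinear_khz;
	uint32_t nominal_khz;
	uint32_t max_khz;
};

/*
 * Scale nominal_khz by perf / nominal_perf using a fixed-point ratio.
 * 64-bit intermediates are used here because a large nominal frequency
 * multiplied by the ratio can exceed 32 bits.
 */
static uint32_t scale_khz(uint32_t nominal_khz, uint32_t perf,
			  uint32_t nominal_perf)
{
	uint64_t ratio = ((uint64_t)perf << CAPACITY_SHIFT) / nominal_perf;

	return (uint32_t)(((uint64_t)nominal_khz * ratio) >> CAPACITY_SHIFT);
}

/*
 * Compute all four frequencies at once from the firmware values (MHz)
 * and the perf levels. Returns 0 on success, -1 if nominal_perf is 0.
 */
static int init_freq(struct freq_set *out,
		     uint32_t lowest_mhz, uint32_t nominal_mhz,
		     uint32_t highest_perf, uint32_t nominal_perf,
		     uint32_t lowest_nonlinear_perf)
{
	if (nominal_perf == 0)
		return -1;

	out->min_khz = lowest_mhz * 1000;
	out->nominal_khz = nominal_mhz * 1000;
	out->max_khz = scale_khz(out->nominal_khz, highest_perf, nominal_perf);
	out->lowest_nonlinear_khz = scale_khz(out->nominal_khz,
					      lowest_nonlinear_perf,
					      nominal_perf);
	return 0;
}
```

With illustrative inputs of 400 MHz lowest, 2800 MHz nominal and perf
levels 255/204/124, init_freq() fills in all four fields in one call,
which is the point of the unified helper: callers read the cached
values instead of recomputing them.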
Co-developed-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 126 ++++++++++++++++-------------------
1 file changed, 59 insertions(+), 67 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 2015c9fcc3c9..ba1baa6733e6 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -606,74 +606,22 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
static int amd_get_min_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- /* Switch to khz */
- return cppc_perf.lowest_freq * 1000;
+ return READ_ONCE(cpudata->min_freq);
}
static int amd_get_max_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
- u32 max_perf, max_freq, nominal_freq, nominal_perf;
- u64 boost_ratio;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- nominal_freq = cppc_perf.nominal_freq;
- nominal_perf = READ_ONCE(cpudata->nominal_perf);
- max_perf = READ_ONCE(cpudata->highest_perf);
-
- boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT,
- nominal_perf);
-
- max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
-
- /* Switch to khz */
- return max_freq * 1000;
+ return READ_ONCE(cpudata->max_freq);
}
static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- /* Switch to khz */
- return cppc_perf.nominal_freq * 1000;
+ return READ_ONCE(cpudata->nominal_freq);
}
static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
- u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
- nominal_freq, nominal_perf;
- u64 lowest_nonlinear_ratio;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- nominal_freq = cppc_perf.nominal_freq;
- nominal_perf = READ_ONCE(cpudata->nominal_perf);
-
- lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
-
- lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
- nominal_perf);
-
- lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
-
- /* Switch to khz */
- return lowest_nonlinear_freq * 1000;
+ return READ_ONCE(cpudata->lowest_nonlinear_freq);
}
static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
@@ -828,6 +776,53 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}
+/**
+ * amd_pstate_init_freq: Initialize the max_freq, min_freq,
+ * nominal_freq and lowest_nonlinear_freq for
+ * the @cpudata object.
+ *
+ * Requires: highest_perf, lowest_perf, nominal_perf and
+ * lowest_nonlinear_perf members of @cpudata to be
+ * initialized.
+ *
+ * Returns 0 on success, non-zero value on failure.
+ */
+static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
+{
+ int ret;
+ u32 min_freq;
+ u32 highest_perf, max_freq;
+ u32 nominal_perf, nominal_freq;
+ u32 lowest_nonlinear_perf, lowest_nonlinear_freq;
+ u32 boost_ratio, lowest_nonlinear_ratio;
+ struct cppc_perf_caps cppc_perf;
+
+
+ ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
+ if (ret)
+ return ret;
+
+ min_freq = cppc_perf.lowest_freq * 1000;
+ nominal_freq = cppc_perf.nominal_freq * 1000;
+ nominal_perf = READ_ONCE(cpudata->nominal_perf);
+
+ highest_perf = READ_ONCE(cpudata->highest_perf);
+ boost_ratio = div_u64(highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);
+ max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
+
+ lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
+ lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
+ nominal_perf);
+ lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
+
+ WRITE_ONCE(cpudata->min_freq, min_freq);
+ WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
+ WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
+ WRITE_ONCE(cpudata->max_freq, max_freq);
+
+ return 0;
+}
+
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -855,6 +850,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
+ ret = amd_pstate_init_freq(cpudata);
+ if (ret)
+ goto free_cpudata1;
+
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
@@ -896,13 +895,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata2;
}
- /* Initial processor data capability frequencies */
- cpudata->max_freq = max_freq;
- cpudata->min_freq = min_freq;
cpudata->max_limit_freq = max_freq;
cpudata->min_limit_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
- cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
policy->driver_data = cpudata;
@@ -1317,6 +1311,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
+ ret = amd_pstate_init_freq(cpudata);
+ if (ret)
+ goto free_cpudata1;
+
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
@@ -1333,12 +1331,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
- /* Initial processor data capability frequencies */
- cpudata->max_freq = max_freq;
- cpudata->min_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
- cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
-
policy->driver_data = cpudata;
cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
--
2.34.1
From: "Gautham R. Shenoy" <[email protected]>
amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.
There is no loss in readability in replacing their invocations by
accesses to the corresponding members of cpudata.
Do so and remove these helper functions.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 40 +++++++++---------------------------
1 file changed, 10 insertions(+), 30 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ba1baa6733e6..132330b4942f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -604,26 +604,6 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
cpufreq_cpu_put(policy);
}
-static int amd_get_min_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->min_freq);
-}
-
-static int amd_get_max_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->max_freq);
-}
-
-static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->nominal_freq);
-}
-
-static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->lowest_nonlinear_freq);
-}
-
static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
{
struct amd_cpudata *cpudata = policy->driver_data;
@@ -854,10 +834,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
- min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
- nominal_freq = amd_get_nominal_freq(cpudata);
- lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ min_freq = READ_ONCE(cpudata->min_freq);
+ max_freq = READ_ONCE(cpudata->max_freq);
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
+ lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
@@ -960,7 +940,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
int max_freq;
struct amd_cpudata *cpudata = policy->driver_data;
- max_freq = amd_get_max_freq(cpudata);
+ max_freq = READ_ONCE(cpudata->max_freq);
if (max_freq < 0)
return max_freq;
@@ -973,7 +953,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
int freq;
struct amd_cpudata *cpudata = policy->driver_data;
- freq = amd_get_lowest_nonlinear_freq(cpudata);
+ freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (freq < 0)
return freq;
@@ -1315,10 +1295,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
- min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
- nominal_freq = amd_get_nominal_freq(cpudata);
- lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ min_freq = READ_ONCE(cpudata->min_freq);
+ max_freq = READ_ONCE(cpudata->max_freq);
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
+ lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
min_freq, max_freq);
--
2.34.1
The amd-pstate driver cannot work when the min_freq, nominal_freq or
the max_freq is zero. When this happens it is prudent to error out
early on rather than waiting to fail at the time of the governor
initialization.
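The check described above can be sketched in isolation as follows. This
is a simplified userspace illustration with a hypothetical helper name
(check_freqs); it uses unsigned values, so the "<= 0" comparisons of the
patch reduce to "== 0" here.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Reject frequency sets the driver cannot work with: any of the three
 * values being zero, or a minimum above the maximum (all in khz).
 * Returns 0 if the values are usable, -EINVAL otherwise.
 */
static int check_freqs(uint32_t min_khz, uint32_t nominal_khz,
		       uint32_t max_khz)
{
	if (min_khz == 0 || nominal_khz == 0 || max_khz == 0)
		return -EINVAL;

	if (min_khz > max_khz)
		return -EINVAL;

	return 0;
}
```

Failing here, during per-CPU init, reports the bad values directly
instead of surfacing later as a less obvious governor failure.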
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 132330b4942f..6708c436e1a2 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -839,9 +839,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
nominal_freq = READ_ONCE(cpudata->nominal_freq);
lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq <= 0 || max_freq <= 0 ||
+ nominal_freq <= 0 || min_freq > max_freq) {
+ dev_err(dev,
+ "min_freq(%d) or max_freq(%d) or nominal_freq (%d) value is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
@@ -1299,9 +1301,11 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
max_freq = READ_ONCE(cpudata->max_freq);
nominal_freq = READ_ONCE(cpudata->nominal_freq);
lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq <= 0 || max_freq <= 0 ||
+ nominal_freq <= 0 || min_freq > max_freq) {
+ dev_err(dev,
+ "min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
--
2.34.1
Make it clearer what is wrong with CPPC when the pstate driver, which
depends on the CPPC capabilities, fails to load.
Add one more debug message to notify the user if CPPC is not supported
by the CPU, so that it is easy to find out what needs to be fixed for
the pstate driver loading issue.
[ 0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled
The above message alone is not clear enough to tell whether CPPC is unsupported.
Reviewed-by: Mario Limonciello <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/acpi/cppc_acpi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..3134101f31b6 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -686,8 +686,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
if (!osc_sb_cppc2_support_acked) {
pr_debug("CPPC v2 _OSC not acked\n");
- if (!cpc_supported_by_cpu())
+ if (!cpc_supported_by_cpu()) {
+ pr_debug("CPPC is not supported by the CPU\n");
return -ENODEV;
+ }
}
/* Parse the ACPI _CPC table for this CPU. */
--
2.34.1
Make the pstate driver first retrieve the P-state transition delay and
latency values from the BIOS ACPI tables, which have more reasonable
delay and latency values according to the platform design and
requirements.
Previously these values were hardcoded to specific values, which may
have conflicted with the platform and might not reflect the most
accurate or optimized setting for the processor.
[054h 0084 8] Preserve Mask : FFFFFFFF00000000
[05Ch 0092 8] Write Mask : 0000000000000001
[064h 0100 4] Command Latency : 00000FA0
[068h 0104 4] Maximum Access Rate : 0000EA60
[06Ch 0108 2] Minimum Turnaround Time : 0000
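The fallback logic described above can be sketched in userspace as
follows. All names and constants here are assumptions for illustration:
ETERNAL_NS stands in for CPUFREQ_ETERNAL as seen through a u32, and the
two defaults stand in for AMD_PSTATE_TRANSITION_DELAY and
AMD_PSTATE_TRANSITION_LATENCY, whose actual values should be checked in
the driver source.

```c
#include <stdint.h>

/* Assumed stand-in for CPUFREQ_ETERNAL when compared against a u32. */
#define ETERNAL_NS		UINT32_MAX
/* Assumed stand-ins for the driver's hardcoded defaults. */
#define DEFAULT_DELAY_US	1000u
#define DEFAULT_LATENCY_NS	20000u
#define NS_PER_US		1000u

/*
 * Convert the firmware-reported latency (ns) into a transition delay
 * in microseconds, falling back to the default when firmware does not
 * provide a usable value.
 */
static uint32_t transition_delay_us(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == ETERNAL_NS)
		return DEFAULT_DELAY_US;

	return fw_latency_ns / NS_PER_US;
}

/* Same fallback, but the latency is kept in nanoseconds. */
static uint32_t transition_latency_ns(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == ETERNAL_NS)
		return DEFAULT_LATENCY_NS;

	return fw_latency_ns;
}
```

Both helpers take the same firmware value as input; only the unit of
the result and the fallback constant differ.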
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
1 file changed, 32 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 6708c436e1a2..ec049b62b366 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -756,6 +756,36 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}
+/**
+ * Get pstate transition delay time from ACPI tables that firmware set
+ * instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
+{
+ u32 transition_delay_ns;
+
+ transition_delay_ns = cppc_get_transition_latency(cpu);
+ if (transition_delay_ns == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_DELAY;
+
+ return transition_delay_ns / NSEC_PER_USEC;
+}
+
+/**
+ * Get pstate transition latency value from ACPI tables that firmware
+ * set instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_latency(unsigned int cpu)
+{
+ u32 transition_latency;
+
+ transition_latency = cppc_get_transition_latency(cpu);
+ if (transition_latency == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_LATENCY;
+
+ return transition_latency;
+}
+
/**
* amd_pstate_init_freq: Initialize the max_freq, min_freq,
* nominal_freq and lowest_nonlinear_freq for
@@ -848,8 +878,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata1;
}
- policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
- policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
+ policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
+ policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
policy->min = min_freq;
policy->max = max_freq;
--
2.34.1
Add a quirks table to fix CPPC capabilities issues by providing
correct perf or frequency values while the driver is loading.
If CPPC capabilities are not defined in the ACPI tables, or are wrongly
defined by platform firmware, a quirk is needed to supply correct
workaround values so that the pstate driver can be loaded even though
there are CPPC capabilities errors.
The workaround matches the broken BIOS that lacks the CPPC capabilities
nominal_freq and lowest_freq definitions in the ACPI table.
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0
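The override rule described above (prefer a non-zero quirk value,
otherwise use what firmware reports) can be sketched in userspace as
follows. The struct mirrors the quirk_entry added by this patch; the
helper names pick_lowest_khz() and pick_nominal_khz() are hypothetical
and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the struct added by the patch; values are in MHz. */
struct quirk_entry {
	uint32_t nominal_freq;
	uint32_t lowest_freq;
};

/*
 * Pick the lowest frequency in khz: a non-zero quirk value wins,
 * otherwise fall back to the firmware-reported value (MHz).
 */
static uint32_t pick_lowest_khz(const struct quirk_entry *q, uint32_t fw_mhz)
{
	uint32_t mhz = (q && q->lowest_freq) ? q->lowest_freq : fw_mhz;

	return mhz * 1000;
}

/* Same rule for the nominal frequency. */
static uint32_t pick_nominal_khz(const struct quirk_entry *q, uint32_t fw_mhz)
{
	uint32_t mhz = (q && q->nominal_freq) ? q->nominal_freq : fw_mhz;

	return mhz * 1000;
}
```

On a machine whose firmware reports 0 for both fields, as in the sysfs
output above, a matched quirk with 2600 and 550 would yield 2600000 and
550000 khz; with no quirk matched (NULL), the firmware values pass
through unchanged.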
Reviewed-by: Mario Limonciello <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 53 ++++++++++++++++++++++++++++++++++--
include/linux/amd-pstate.h | 6 ++++
2 files changed, 57 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ec049b62b366..59a2db225d98 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -67,6 +67,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
+static struct quirk_entry *quirks;
/*
* AMD Energy Preference Performance (EPP)
@@ -111,6 +112,41 @@ static unsigned int epp_values[] = {
typedef int (*cppc_mode_transition_fn)(int);
+static struct quirk_entry quirk_amd_7k62 = {
+ .nominal_freq = 2600,
+ .lowest_freq = 550,
+};
+
+static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
+{
+ /**
+ * match the broken bios for family 17h processor support CPPC V2
+ * broken BIOS lack of nominal_freq and lowest_freq capabilities
+ * definition in ACPI tables
+ */
+ if (boot_cpu_has(X86_FEATURE_ZEN2)) {
+ quirks = dmi->driver_data;
+ pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
+ return 1;
+ }
+
+ return 0;
+}
+
+static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
+ {
+ .callback = dmi_matched_7k62_bios_bug,
+ .ident = "AMD EPYC 7K62",
+ .matches = {
+ DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
+ DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
+ },
+ .driver_data = &quirk_amd_7k62,
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
+
static inline int get_mode_idx_from_str(const char *str, size_t size)
{
int i;
@@ -812,8 +848,16 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
if (ret)
return ret;
- min_freq = cppc_perf.lowest_freq * 1000;
- nominal_freq = cppc_perf.nominal_freq * 1000;
+ if (quirks && quirks->lowest_freq)
+ min_freq = quirks->lowest_freq * 1000;
+ else
+ min_freq = cppc_perf.lowest_freq * 1000;
+
+ if (quirks && quirks->nominal_freq)
+ nominal_freq = quirks->nominal_freq * 1000;
+ else
+ nominal_freq = cppc_perf.nominal_freq * 1000;
+
nominal_perf = READ_ONCE(cpudata->nominal_perf);
highest_perf = READ_ONCE(cpudata->highest_perf);
@@ -1662,6 +1706,11 @@ static int __init amd_pstate_init(void)
if (cpufreq_get_current_driver())
return -EEXIST;
+ quirks = NULL;
+
+ /* check if this machine need CPPC quirks */
+ dmi_check_system(amd_pstate_quirks_table);
+
switch (cppc_state) {
case AMD_PSTATE_UNDEFINED:
/* Disable on the following configs by default:
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index ab7e82533718..6b832153a126 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -128,4 +128,10 @@ static const char * const amd_pstate_mode_string[] = {
[AMD_PSTATE_GUIDED] = "guided",
NULL,
};
+
+struct quirk_entry {
+ u32 nominal_freq;
+ u32 lowest_freq;
+};
+
#endif /* _LINUX_AMD_PSTATE_H */
--
2.34.1
On Mon, Mar 18, 2024 at 10:48 AM Perry Yuan <[email protected]> wrote:
>
> From: "Gautham R. Shenoy" <[email protected]>
No changelog.
> Signed-off-by: Gautham R. Shenoy <[email protected]>
Sender sign-off missing (when sending somebody else's patch, you need
to add your S-o-b tag to it).
> ---
> include/linux/amd-pstate.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index d21838835abd..212f377d615b 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -49,6 +49,10 @@ struct amd_aperf_mperf {
> * @lowest_perf: the absolute lowest performance level of the processor
> * @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
> * priority.
> + * @min_limit_perf: Cached value of the perf corresponding to policy->min
> + * @max_limit_perf: Cached value of the perf corresponding to policy->max
> + * @min_limit_freq: Cached value of policy->min
> + * @max_limit_freq: Cached value of policy->max
> * @max_freq: the frequency that mapped to highest_perf
> * @min_freq: the frequency that mapped to lowest_perf
> * @nominal_freq: the frequency that mapped to nominal_perf
> --
> 2.34.1
>
>
On Mon, Mar 18, 2024 at 10:49 AM Perry Yuan <[email protected]> wrote:
>
> From: "Gautham R. Shenoy" <[email protected]>
No changelog.
> Signed-off-by: Gautham R. Shenoy <[email protected]>
Sender sign-off missing (when sending somebody else's patch, you need
to add your S-o-b tag to it).
> ---
> include/linux/amd-pstate.h | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 212f377d615b..ab7e82533718 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -51,15 +51,15 @@ struct amd_aperf_mperf {
> * priority.
> * @min_limit_perf: Cached value of the perf corresponding to policy->min
> * @max_limit_perf: Cached value of the perf corresponding to policy->max
> - * @min_limit_freq: Cached value of policy->min
> - * @max_limit_freq: Cached value of policy->max
> - * @max_freq: the frequency that mapped to highest_perf
> - * @min_freq: the frequency that mapped to lowest_perf
> - * @nominal_freq: the frequency that mapped to nominal_perf
> - * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
> + * @min_limit_freq: Cached value of policy->min (in khz)
> + * @max_limit_freq: Cached value of policy->max (in khz)
> + * @max_freq: the frequency (in khz) that mapped to highest_perf
> + * @min_freq: the frequency (in khz) that mapped to lowest_perf
> + * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
> + * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
> * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
> * @prev: Last Aperf/Mperf/tsc count value read from register
> - * @freq: current cpu frequency value
> + * @freq: current cpu frequency value (in khz)
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> * @hw_prefcore: check whether HW supports preferred core featue.
> * Only when hw_prefcore and early prefcore param are true,
> --
> 2.34.1
>
>
On Mon, Mar 18, 2024 at 10:49 AM Perry Yuan <[email protected]> wrote:
>
> From: "Gautham R. Shenoy" <[email protected]>
>
> amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
> return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.
>
> There is no loss in readability in replacing their invocations by
> accesses to the corresponding members of cpudata.
>
> Do so and remove these helper functions.
>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
Sender sign-off missing (when sending somebody else's patch, you need
to add your S-o-b tag to it).
> ---
> drivers/cpufreq/amd-pstate.c | 40 +++++++++---------------------------
> 1 file changed, 10 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index ba1baa6733e6..132330b4942f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -604,26 +604,6 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
> cpufreq_cpu_put(policy);
> }
>
> -static int amd_get_min_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->min_freq);
> -}
> -
> -static int amd_get_max_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->max_freq);
> -}
> -
> -static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->nominal_freq);
> -}
> -
> -static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->lowest_nonlinear_freq);
> -}
> -
> static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> @@ -854,10 +834,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = amd_get_min_freq(cpudata);
> - max_freq = amd_get_max_freq(cpudata);
> - nominal_freq = amd_get_nominal_freq(cpudata);
> - lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + min_freq = READ_ONCE(cpudata->min_freq);
> + max_freq = READ_ONCE(cpudata->max_freq);
> + nominal_freq = READ_ONCE(cpudata->nominal_freq);
> + lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
>
> if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> @@ -960,7 +940,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
> int max_freq;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - max_freq = amd_get_max_freq(cpudata);
> + max_freq = READ_ONCE(cpudata->max_freq);
> if (max_freq < 0)
> return max_freq;
>
> @@ -973,7 +953,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
> int freq;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - freq = amd_get_lowest_nonlinear_freq(cpudata);
> + freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> if (freq < 0)
> return freq;
>
> @@ -1315,10 +1295,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = amd_get_min_freq(cpudata);
> - max_freq = amd_get_max_freq(cpudata);
> - nominal_freq = amd_get_nominal_freq(cpudata);
> - lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + min_freq = READ_ONCE(cpudata->min_freq);
> + max_freq = READ_ONCE(cpudata->max_freq);
> + nominal_freq = READ_ONCE(cpudata->nominal_freq);
> + lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> min_freq, max_freq);
> --
> 2.34.1
>
>
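For readers following the change itself, here is a minimal userspace sketch of why dropping the getters does not change behaviour: each removed helper only wrapped a single READ_ONCE() of a cpudata member. READ_ONCE_SKETCH, struct cpudata_sketch and both function names are hypothetical stand-ins (not the kernel's definitions), and the macro assumes the GCC/Clang __typeof__ extension.

```c
/* Userspace stand-in for the kernel's READ_ONCE(); an assumption for
 * illustration only. It forces a single volatile load of x. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down stand-in for struct amd_cpudata. */
struct cpudata_sketch {
	int min_freq;	/* kHz */
	int max_freq;	/* kHz */
};

/* Old style: a one-line getter wrapping the member read. */
static int sketch_get_max_freq(struct cpudata_sketch *cpudata)
{
	return READ_ONCE_SKETCH(cpudata->max_freq);
}

/* New style: the caller performs the same read directly. */
static int sketch_read_max_freq_directly(struct cpudata_sketch *cpudata)
{
	int max_freq;

	max_freq = READ_ONCE_SKETCH(cpudata->max_freq);
	return max_freq;
}
```

Both functions perform the same single load, so the callers in the diff above keep their existing `< 0` checks unchanged.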
On Mon, Mar 18, 2024 at 10:48 AM Perry Yuan <[email protected]> wrote:
>
> The patch series adds some fixes and enhancements to the AMD pstate
> driver.
> It enables CPPC v2 for certain processors in the family 17H, as
> requested by TR40 processor users who expect improved performance and
> lower system temperature.
>
> Additionally, it fixes the initialization of nominal_freq for each
> cpudata and changes the latency and delay values to be read from
> platform firmware first, for more accurate timing.
>
> A new quirk is also added for legacy processors that lack CPPC
> capabilities, which caused the pstate driver to fail to load.
>
> Testing done with one APU system while cpb boost on:
>
> amd_pstate_lowest_nonlinear_freq:1701000
> amd_pstate_max_freq:3501000
> cpuinfo_max_freq:3501000
> cpuinfo_min_freq:400000
> scaling_cur_freq:3084836
> scaling_max_freq:3501000
> scaling_min_freq:400000
>
> analyzing CPU 6:
> driver: amd-pstate-epp
> CPUs which run at the same hardware frequency: 6
> CPUs which need to have their frequency coordinated by software: 6
> maximum transition latency: Cannot determine or is not supported.
> hardware limits: 400 MHz - 3.50 GHz
> available cpufreq governors: performance powersave
> current policy: frequency should be within 400 MHz and 3.50 GHz.
> The governor "powersave" may decide which speed to use
> within this range.
> current CPU frequency: Unable to call hardware
> current CPU frequency: 3.50 GHz (asserted by call to kernel)
> boost state support:
> Supported: yes
> Active: yes
> AMD PSTATE Highest Performance: 255. Maximum Frequency: 3.50 GHz.
> AMD PSTATE Nominal Performance: 204. Nominal Frequency: 2.80 GHz.
> AMD PSTATE Lowest Non-linear Performance: 124. Lowest Non-linear Frequency: 1.70 GHz.
> AMD PSTATE Lowest Performance: 30. Lowest Frequency: 400 MHz.
>
> To test this patchset, you will also need to apply another patchset
> on top of it, in case unexpected issues are found:
>
> https://lore.kernel.org/lkml/[email protected]/
> It implements the amd-pstate CPB boost feature.
> The linked patch is an old version; please apply the latest version
> when you start testing.
>
> I would greatly appreciate any feedback.
There are missing changelogs and S-o-b tags in a few messages in this series.
Overall, I would like someone, preferably at AMD, to take
responsibility for the amd-pstate driver, review patches modifying it
and ACK the approved ones.
Huang Rui, who is listed in MAINTAINERS as the official maintainer of
it, does not seem to be interested in it any more.
Can this be addressed, please?
On Mon, Mar 18, 2024 at 08:49:55PM +0800, Rafael J. Wysocki wrote:
> There are missing changelogs and S-o-b tags in a few messages in this series.
>
> Overall, I would like someone, preferably at AMD, to take
> responsibility for the amd-pstate driver, review patches modifying it
> and ACK the approved ones.
>
> Huang Rui, who is listed in MAINTAINERS as the official maintainer of
> it, does not seem to be interested in it any more.
>
> Can this be addressed, please?
Hi Rafael,
Sorry, I was occupied with other tasks a couple of months ago. I will talk
with AMD colleagues and figure out a way to make sure the amd-pstate
patches are actively reviewed and tested. I will get back to you with feedback.
Thanks,
Ray
Hi Rafael,
> -----Original Message-----
> From: Rafael J. Wysocki <[email protected]>
> Sent: Monday, March 18, 2024 8:35 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Huang, Ray
> <[email protected]>; Shenoy, Gautham Ranjal
> <[email protected]>; Petkov, Borislav <[email protected]>;
> Deucher, Alexander <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v8 1/8] cpufreq: amd-pstate: Document *_limit_* fields in
> struct amd_cpudata
>
> On Mon, Mar 18, 2024 at 10:48 AM Perry Yuan <[email protected]>
> wrote:
> >
> > From: "Gautham R. Shenoy" <[email protected]>
>
> No changelog.
>
> > Signed-off-by: Gautham R. Shenoy <[email protected]>
>
> Sender sign-off missing (when sending somebody else's patch, you need to
> add your S-o-b tag to it).
Got it. I will fix this, along with the missing tags in the other two
patches, in v9.
Thank you for the feedback.
Perry.
>
> > ---
> > include/linux/amd-pstate.h | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index d21838835abd..212f377d615b 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -49,6 +49,10 @@ struct amd_aperf_mperf {
> > * @lowest_perf: the absolute lowest performance level of the processor
> > * @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
> > * priority.
> > + * @min_limit_perf: Cached value of the perf corresponding to policy->min
> > + * @max_limit_perf: Cached value of the perf corresponding to policy->max
> > + * @min_limit_freq: Cached value of policy->min
> > + * @max_limit_freq: Cached value of policy->max
> > * @max_freq: the frequency that mapped to highest_perf
> > * @min_freq: the frequency that mapped to lowest_perf
> > * @nominal_freq: the frequency that mapped to nominal_perf
> > --
> > 2.34.1
> >
> >
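To make the new @min_limit_perf / @max_limit_perf descriptions more concrete, here is a rough sketch of how a policy frequency limit could be turned into a cached perf value. It assumes a simple linear scaling against highest_perf and max_freq; that scaling and the name sketch_freq_to_perf are assumptions for illustration, not something this patch states.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a frequency limit (kHz) to a perf level by
 * scaling linearly against the highest perf level and its frequency.
 * The linear relation is an assumption made for this sketch.
 */
static uint32_t sketch_freq_to_perf(uint32_t freq_khz, uint32_t highest_perf,
				    uint32_t max_freq_khz)
{
	if (max_freq_khz == 0)
		return 0;	/* avoid dividing by zero on bad data */

	/* Widen to 64 bits so the multiplication cannot overflow. */
	return (uint32_t)(((uint64_t)freq_khz * highest_perf) / max_freq_khz);
}
```

With the figures quoted in this thread (highest perf 255 at 3501000 kHz), a policy->max of 3501000 kHz maps to 255 and a policy->min of 400000 kHz maps to 29 under this assumption. That is close to, but not the same as, the reported lowest performance of 30, so the driver may well round or clamp differently.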