2024-03-25 11:40:11

by Yuan, Perry

Subject: [PATCH v10 0/8] AMD Pstate Fixes And Enhancements

The patch series adds some fixes and enhancements to the AMD pstate
driver.

It enables CPPC v2 for certain processors in the family 17H, as
requested by TR40 processor users who expect improved performance and lower system
temperature.

It also changes the latency and delay values to be read from platform
firmware first, for more accurate timing.

A new quirk is introduced to support amd-pstate on legacy processors which
either lack CPPC capability or only have CPPC v2 capability.

Testing was done on one APU system with CPB boost on:

amd_pstate_lowest_nonlinear_freq:1701000
amd_pstate_max_freq:3501000
cpuinfo_max_freq:3501000
cpuinfo_min_freq:400000
scaling_cur_freq:3084836
scaling_max_freq:3501000
scaling_min_freq:400000

analyzing CPU 6:
driver: amd-pstate-epp
CPUs which run at the same hardware frequency: 6
CPUs which need to have their frequency coordinated by software: 6
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 3.50 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 3.50 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 3.50 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 255. Maximum Frequency: 3.50 GHz.
AMD PSTATE Nominal Performance: 204. Nominal Frequency: 2.80 GHz.
AMD PSTATE Lowest Non-linear Performance: 124. Lowest Non-linear Frequency: 1.70 GHz.
AMD PSTATE Lowest Performance: 30. Lowest Frequency: 400 MHz.


I would greatly appreciate any feedback.
Thank you!
Perry.

Changes from v9:
* pick up Reviewed-by flag from Meng Li
* pick up Tested-by flag from Ugwekar Dhananjay
* pick up Reviewed-by flag from Gautham R. Shenoy

Changes from v8:
* add commit log for patch 1 and patch 2 (Rafael)
* add missing Perry signed-off-by for new patches #1,#2,#4 (Rafael)
* rebased to latest linux-pm/bleeding-edge

Changes from v7:
* Gautham helped to bring some new improved patches into the patchset.
* Adds comments for cpudata->{min,max}_limit_{perf,freq}, variables [New Patch].
* Clarifies that the units for cpudata->*_freq is in khz via comments [New Patch].
* Implements the unified computation of all cpudata->*_freq
* v7 Patch 2/6 was dropped as it is not needed any more
* moved the quirk check to the amd_pstate_get_freq() function
* pick up RB flags from Gautham
* After the cleanup in patch 3, we don't need the helpers
amd_get_{min,max,nominal,lowest_nonlinear}_freq(). This
patch removes it [New Patch].
* testing done on APU system as well, no regression found.

Changes from v6:
* add one new patch to initialize capabilities in
amd_pstate_init_perf which can avoid duplicate cppc capabilities read
the change has been tested on APU system.
* pick up RB flags from Gautham
* drop the patch 1/6 which has been merged by Rafael

Changes from v5:
* rebased to linux-pm v6.8
* pick up RB flag for patch 6 (Mario)

Changes from v4:
* improve the dmi matching rule with zen2 flag only

Changes from v3:
* change quirk matching broken BIOS with family/model ID and Zen2
flag to fix the CPPC definition issue
* fix typo in quirk

Changes from v2:
* change quirk matching to BIOS version and release (Mario)
* pick up RB flag from Mario

Changes from v1:
* pick up the RB flags from Mario
* address review comment of patch #6 for amd_get_nominal_freq()
* rebased the series to linux-pm/bleeding-edge v6.8.0-rc2
* update debug log for patch #5 as Mario suggested.
* fix some typos and format problems
* tested on 7950X platform


V1: https://lore.kernel.org/lkml/[email protected]/
V2: https://lore.kernel.org/all/[email protected]/
v3: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/
v7: https://lore.kernel.org/lkml/[email protected]/
v8: https://lore.kernel.org/lkml/[email protected]/
v9: https://lore.kernel.org/lkml/[email protected]/

Gautham R. Shenoy (3):
cpufreq: amd-pstate: Document *_limit_* fields in struct amd_cpudata
cpufreq: amd-pstate: Document the units for freq variables in
amd_cpudata
cpufreq: amd-pstate: Remove
amd_get_{min,max,nominal,lowest_nonlinear}_freq()

Perry Yuan (5):
cpufreq: amd-pstate: Unify computation of
{max,min,nominal,lowest_nonlinear}_freq
cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
cpufreq: amd-pstate: get transition delay and latency value from ACPI
tables
cppc_acpi: print error message if CPPC is unsupported
cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities
missing

drivers/acpi/cppc_acpi.c | 4 +-
drivers/cpufreq/amd-pstate.c | 257 +++++++++++++++++++++--------------
include/linux/amd-pstate.h | 20 ++-
3 files changed, 174 insertions(+), 107 deletions(-)

--
2.34.1



2024-03-25 11:41:16

by Yuan, Perry

Subject: [PATCH v10 5/8] cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0

The amd-pstate driver cannot work when the min_freq, nominal_freq or
the max_freq is zero. When this happens it is prudent to error out
early on rather than waiting to fail at the time of the governor
initialization.
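
As an illustration only, the check described above can be sketched in
userspace C. freqs_are_valid() is a hypothetical helper that does not
exist in the driver; it just mirrors the condition used in the diff
below, with all frequencies in kHz.

```c
#include <stdbool.h>

/*
 * Userspace sketch of the sanity check added by this patch.
 * freqs_are_valid() is a hypothetical name, not a driver function.
 * All frequencies are in kHz.
 */
static bool freqs_are_valid(int min_freq, int max_freq, int nominal_freq)
{
	/* Zero or negative means the value is missing or failed to compute. */
	if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0)
		return false;

	/* The range must not be inverted. */
	return min_freq <= max_freq;
}
```

In the driver, the equivalent of a false result is reporting the three
values with dev_err() and returning -EINVAL from the cpu_init callback.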

Reviewed-by: Gautham R. Shenoy <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 132330b4942f..6708c436e1a2 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -839,9 +839,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
nominal_freq = READ_ONCE(cpudata->nominal_freq);
lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);

- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq <= 0 || max_freq <= 0 ||
+ nominal_freq <= 0 || min_freq > max_freq) {
+ dev_err(dev,
+ "min_freq(%d) or max_freq(%d) or nominal_freq (%d) value is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
@@ -1299,9 +1301,11 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
max_freq = READ_ONCE(cpudata->max_freq);
nominal_freq = READ_ONCE(cpudata->nominal_freq);
lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq <= 0 || max_freq <= 0 ||
+ nominal_freq <= 0 || min_freq > max_freq) {
+ dev_err(dev,
+ "min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
--
2.34.1


2024-03-25 11:48:49

by Yuan, Perry

Subject: [PATCH v10 4/8] cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()

From: "Gautham R. Shenoy" <[email protected]>

amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.

There is no loss in readability in replacing their invocations by
accesses to the corresponding members of cpudata.

Do so and remove these helper functions.

Reviewed-by: Li Meng <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 40 +++++++++---------------------------
1 file changed, 10 insertions(+), 30 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ba1baa6733e6..132330b4942f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -604,26 +604,6 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
cpufreq_cpu_put(policy);
}

-static int amd_get_min_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->min_freq);
-}
-
-static int amd_get_max_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->max_freq);
-}
-
-static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->nominal_freq);
-}
-
-static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
-{
- return READ_ONCE(cpudata->lowest_nonlinear_freq);
-}
-
static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
{
struct amd_cpudata *cpudata = policy->driver_data;
@@ -854,10 +834,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;

- min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
- nominal_freq = amd_get_nominal_freq(cpudata);
- lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ min_freq = READ_ONCE(cpudata->min_freq);
+ max_freq = READ_ONCE(cpudata->max_freq);
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
+ lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);

if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
@@ -960,7 +940,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
int max_freq;
struct amd_cpudata *cpudata = policy->driver_data;

- max_freq = amd_get_max_freq(cpudata);
+ max_freq = READ_ONCE(cpudata->max_freq);
if (max_freq < 0)
return max_freq;

@@ -973,7 +953,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
int freq;
struct amd_cpudata *cpudata = policy->driver_data;

- freq = amd_get_lowest_nonlinear_freq(cpudata);
+ freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (freq < 0)
return freq;

@@ -1315,10 +1295,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;

- min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
- nominal_freq = amd_get_nominal_freq(cpudata);
- lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
+ min_freq = READ_ONCE(cpudata->min_freq);
+ max_freq = READ_ONCE(cpudata->max_freq);
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
+ lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
min_freq, max_freq);
--
2.34.1


2024-03-25 11:49:04

by Yuan, Perry

Subject: [PATCH v10 6/8] cpufreq: amd-pstate: get transition delay and latency value from ACPI tables

Make the pstate driver initially retrieve the P-state transition delay and
latency values from the BIOS ACPI tables, which have more reasonable
delay and latency values according to the platform design and
requirements.

Previously these values were hardcoded to specific values, which may
have conflicted with the platform and might not reflect the most
accurate or optimized setting for the processor.

[054h 0084 8] Preserve Mask : FFFFFFFF00000000
[05Ch 0092 8] Write Mask : 0000000000000001
[064h 0100 4] Command Latency : 00000FA0
[068h 0104 4] Maximum Access Rate : 0000EA60
[06Ch 0108 2] Minimum Turnaround Time : 0000
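
The fallback behaviour can be sketched in userspace C as follows. This
is a rough illustration, not the driver code: every SKETCH_* constant
and sketch_* function is a hypothetical stand-in, and the default
values are placeholders chosen for the example. The firmware latency is
assumed to arrive in nanoseconds, with a sentinel meaning "unknown".

```c
#include <stdint.h>

/* Placeholder values for this sketch; the driver has its own constants. */
#define SKETCH_LATENCY_UNKNOWN     UINT32_MAX /* stands in for CPUFREQ_ETERNAL */
#define SKETCH_DEFAULT_LATENCY_NS  20000u
#define SKETCH_DEFAULT_DELAY_US    1000u
#define SKETCH_NSEC_PER_USEC       1000u

/* Latency in ns: use the firmware value unless it is unknown. */
static uint32_t sketch_transition_latency_ns(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == SKETCH_LATENCY_UNKNOWN)
		return SKETCH_DEFAULT_LATENCY_NS;

	return fw_latency_ns;
}

/* Delay in us: convert the firmware value from ns unless it is unknown. */
static uint32_t sketch_transition_delay_us(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == SKETCH_LATENCY_UNKNOWN)
		return SKETCH_DEFAULT_DELAY_US;

	return fw_latency_ns / SKETCH_NSEC_PER_USEC;
}
```

For reference, the hexadecimal fields in the dump above are 0x0FA0 =
4000 and 0xEA60 = 60000 in decimal.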

Reviewed-by: Gautham R. Shenoy <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 6708c436e1a2..ec049b62b366 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -756,6 +756,36 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}

+/**
+ * Get pstate transition delay time from ACPI tables that firmware set
+ * instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
+{
+ u32 transition_delay_ns;
+
+ transition_delay_ns = cppc_get_transition_latency(cpu);
+ if (transition_delay_ns == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_DELAY;
+
+ return transition_delay_ns / NSEC_PER_USEC;
+}
+
+/**
+ * Get pstate transition latency value from ACPI tables that firmware
+ * set instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_latency(unsigned int cpu)
+{
+ u32 transition_latency;
+
+ transition_latency = cppc_get_transition_latency(cpu);
+ if (transition_latency == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_LATENCY;
+
+ return transition_latency;
+}
+
/**
* amd_pstate_init_freq: Initialize the max_freq, min_freq,
* nominal_freq and lowest_nonlinear_freq for
@@ -848,8 +878,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata1;
}

- policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
- policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
+ policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
+ policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);

policy->min = min_freq;
policy->max = max_freq;
--
2.34.1


2024-03-25 12:07:33

by Yuan, Perry

Subject: [PATCH v10 3/8] cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq

Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
helpers compute the values of min_freq, max_freq, nominal_freq and
lowest_nonlinear_freq respectively afresh from
cppc_get_perf_caps(). This is not necessary as there are fields in
cpudata to cache these values.

To simplify this, add a single helper function named
amd_pstate_init_freq() which computes all these frequencies at once, and
caches them in cpudata.

Use the cached values everywhere else in the code.
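
For readers following the arithmetic, the perf-to-frequency scaling
used by the helper can be sketched in userspace C. scale_freq() and
SKETCH_CAPACITY_SHIFT are hypothetical names for this example only. The
sketch deliberately uses 64-bit intermediates and guards against a zero
nominal_perf; both are choices made for the illustration and differ
from the u32 arithmetic in the diff below.

```c
#include <stdint.h>

/* Same fixed-point shift as the kernel's SCHED_CAPACITY_SHIFT (10). */
#define SKETCH_CAPACITY_SHIFT 10

/*
 * Scale the nominal frequency (kHz) by perf / nominal_perf using a
 * fixed-point ratio. Returns 0 if nominal_perf is 0.
 */
static uint32_t scale_freq(uint32_t nominal_freq_khz, uint32_t perf,
			   uint32_t nominal_perf)
{
	uint64_t ratio;

	if (nominal_perf == 0)
		return 0;

	/* ratio = perf / nominal_perf, scaled by 2^10 and truncated */
	ratio = ((uint64_t)perf << SKETCH_CAPACITY_SHIFT) / nominal_perf;

	/* 64-bit multiply so that kHz * ratio cannot wrap before the shift */
	return (uint32_t)(((uint64_t)nominal_freq_khz * ratio) >>
			  SKETCH_CAPACITY_SHIFT);
}
```

With the perf values from the cover letter (highest 255, nominal 204,
lowest non-linear 124) and an assumed nominal frequency of exactly
2800000 kHz, this gives 3500000 kHz for max_freq and 1700781 kHz for
lowest_nonlinear_freq.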

Reviewed-by: Li Meng <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Co-developed-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 126 ++++++++++++++++-------------------
1 file changed, 59 insertions(+), 67 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 2015c9fcc3c9..ba1baa6733e6 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -606,74 +606,22 @@ static void amd_pstate_adjust_perf(unsigned int cpu,

static int amd_get_min_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- /* Switch to khz */
- return cppc_perf.lowest_freq * 1000;
+ return READ_ONCE(cpudata->min_freq);
}

static int amd_get_max_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
- u32 max_perf, max_freq, nominal_freq, nominal_perf;
- u64 boost_ratio;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- nominal_freq = cppc_perf.nominal_freq;
- nominal_perf = READ_ONCE(cpudata->nominal_perf);
- max_perf = READ_ONCE(cpudata->highest_perf);
-
- boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT,
- nominal_perf);
-
- max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
-
- /* Switch to khz */
- return max_freq * 1000;
+ return READ_ONCE(cpudata->max_freq);
}

static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- /* Switch to khz */
- return cppc_perf.nominal_freq * 1000;
+ return READ_ONCE(cpudata->nominal_freq);
}

static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
- u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
- nominal_freq, nominal_perf;
- u64 lowest_nonlinear_ratio;
-
- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
- nominal_freq = cppc_perf.nominal_freq;
- nominal_perf = READ_ONCE(cpudata->nominal_perf);
-
- lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
-
- lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
- nominal_perf);
-
- lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
-
- /* Switch to khz */
- return lowest_nonlinear_freq * 1000;
+ return READ_ONCE(cpudata->lowest_nonlinear_freq);
}

static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
@@ -828,6 +776,53 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}

+/**
+ * amd_pstate_init_freq: Initialize the max_freq, min_freq,
+ * nominal_freq and lowest_nonlinear_freq for
+ * the @cpudata object.
+ *
+ * Requires: highest_perf, lowest_perf, nominal_perf and
+ * lowest_nonlinear_perf members of @cpudata to be
+ * initialized.
+ *
+ * Returns 0 on success, non-zero value on failure.
+ */
+static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
+{
+ int ret;
+ u32 min_freq;
+ u32 highest_perf, max_freq;
+ u32 nominal_perf, nominal_freq;
+ u32 lowest_nonlinear_perf, lowest_nonlinear_freq;
+ u32 boost_ratio, lowest_nonlinear_ratio;
+ struct cppc_perf_caps cppc_perf;
+
+
+ ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
+ if (ret)
+ return ret;
+
+ min_freq = cppc_perf.lowest_freq * 1000;
+ nominal_freq = cppc_perf.nominal_freq * 1000;
+ nominal_perf = READ_ONCE(cpudata->nominal_perf);
+
+ highest_perf = READ_ONCE(cpudata->highest_perf);
+ boost_ratio = div_u64(highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);
+ max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
+
+ lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
+ lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
+ nominal_perf);
+ lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
+
+ WRITE_ONCE(cpudata->min_freq, min_freq);
+ WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
+ WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
+ WRITE_ONCE(cpudata->max_freq, max_freq);
+
+ return 0;
+}
+
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -855,6 +850,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;

+ ret = amd_pstate_init_freq(cpudata);
+ if (ret)
+ goto free_cpudata1;
+
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
@@ -896,13 +895,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata2;
}

- /* Initial processor data capability frequencies */
- cpudata->max_freq = max_freq;
- cpudata->min_freq = min_freq;
cpudata->max_limit_freq = max_freq;
cpudata->min_limit_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
- cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;

policy->driver_data = cpudata;

@@ -1317,6 +1311,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;

+ ret = amd_pstate_init_freq(cpudata);
+ if (ret)
+ goto free_cpudata1;
+
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
@@ -1333,12 +1331,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;

- /* Initial processor data capability frequencies */
- cpudata->max_freq = max_freq;
- cpudata->min_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
- cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
-
policy->driver_data = cpudata;

cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
--
2.34.1


2024-03-25 12:33:59

by Yuan, Perry

Subject: [PATCH v10 8/8] cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing

Add a quirks table to work around CPPC capability issues by providing
correct perf or frequency values while the driver is loading.

If CPPC capabilities are not defined in the ACPI tables, or are wrongly
defined by platform firmware, a quirk is needed to supply correct
workaround values so that the pstate driver can be loaded despite the
CPPC capability errors.

The workaround matches the broken BIOS which lacks the CPPC capabilities
nominal_freq and lowest_freq definitions in the ACPI table.

$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0
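
The override rule can be sketched in userspace C as below. The sketch_*
names are hypothetical and exist only for this example; the struct
mirrors the two fields of the quirk entry added by the patch, with
values in MHz, and the results are in kHz.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two fields of the quirk entry; values are in MHz. */
struct sketch_quirk {
	uint32_t nominal_freq;
	uint32_t lowest_freq;
};

/* Prefer a non-zero quirk value, otherwise use the firmware value. */
static uint32_t sketch_min_freq_khz(const struct sketch_quirk *q,
				    uint32_t fw_lowest_mhz)
{
	if (q && q->lowest_freq)
		return q->lowest_freq * 1000;

	return fw_lowest_mhz * 1000;
}

static uint32_t sketch_nominal_freq_khz(const struct sketch_quirk *q,
					uint32_t fw_nominal_mhz)
{
	if (q && q->nominal_freq)
		return q->nominal_freq * 1000;

	return fw_nominal_mhz * 1000;
}
```

On a machine that reports 0 for both values, as in the output above,
a quirk entry of 2600 and 550 yields 2600000 kHz and 550000 kHz. With
no quirk matched (a NULL pointer), the firmware values are used
unchanged.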

Reviewed-by: Mario Limonciello <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 53 ++++++++++++++++++++++++++++++++++--
include/linux/amd-pstate.h | 6 ++++
2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ec049b62b366..59a2db225d98 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -67,6 +67,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
+static struct quirk_entry *quirks;

/*
* AMD Energy Preference Performance (EPP)
@@ -111,6 +112,41 @@ static unsigned int epp_values[] = {

typedef int (*cppc_mode_transition_fn)(int);

+static struct quirk_entry quirk_amd_7k62 = {
+ .nominal_freq = 2600,
+ .lowest_freq = 550,
+};
+
+static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
+{
+ /**
+ * match the broken bios for family 17h processor support CPPC V2
+ * broken BIOS lack of nominal_freq and lowest_freq capabilities
+ * definition in ACPI tables
+ */
+ if (boot_cpu_has(X86_FEATURE_ZEN2)) {
+ quirks = dmi->driver_data;
+ pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
+ return 1;
+ }
+
+ return 0;
+}
+
+static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
+ {
+ .callback = dmi_matched_7k62_bios_bug,
+ .ident = "AMD EPYC 7K62",
+ .matches = {
+ DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
+ DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
+ },
+ .driver_data = &quirk_amd_7k62,
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
+
static inline int get_mode_idx_from_str(const char *str, size_t size)
{
int i;
@@ -812,8 +848,16 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
if (ret)
return ret;

- min_freq = cppc_perf.lowest_freq * 1000;
- nominal_freq = cppc_perf.nominal_freq * 1000;
+ if (quirks && quirks->lowest_freq)
+ min_freq = quirks->lowest_freq * 1000;
+ else
+ min_freq = cppc_perf.lowest_freq * 1000;
+
+ if (quirks && quirks->nominal_freq)
+ nominal_freq = quirks->nominal_freq * 1000;
+ else
+ nominal_freq = cppc_perf.nominal_freq * 1000;
+
nominal_perf = READ_ONCE(cpudata->nominal_perf);

highest_perf = READ_ONCE(cpudata->highest_perf);
@@ -1662,6 +1706,11 @@ static int __init amd_pstate_init(void)
if (cpufreq_get_current_driver())
return -EEXIST;

+ quirks = NULL;
+
+ /* check if this machine need CPPC quirks */
+ dmi_check_system(amd_pstate_quirks_table);
+
switch (cppc_state) {
case AMD_PSTATE_UNDEFINED:
/* Disable on the following configs by default:
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index ab7e82533718..6b832153a126 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -128,4 +128,10 @@ static const char * const amd_pstate_mode_string[] = {
[AMD_PSTATE_GUIDED] = "guided",
NULL,
};
+
+struct quirk_entry {
+ u32 nominal_freq;
+ u32 lowest_freq;
+};
+
#endif /* _LINUX_AMD_PSTATE_H */
--
2.34.1


2024-03-25 12:34:03

by Yuan, Perry

Subject: [PATCH v10 2/8] cpufreq: amd-pstate: Document the units for freq variables in amd_cpudata

From: "Gautham R. Shenoy" <[email protected]>

The min_limit_freq, max_limit_freq, min_freq, max_freq, nominal_freq
and the lowest_nonlinear_freq members of struct amd_cpudata store the
frequency value in khz to be consistent with the cpufreq
core. Update the comment to document this.

Reviewed-by: Li Meng <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
include/linux/amd-pstate.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 212f377d615b..ab7e82533718 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -51,15 +51,15 @@ struct amd_aperf_mperf {
* priority.
* @min_limit_perf: Cached value of the perf corresponding to policy->min
* @max_limit_perf: Cached value of the perf corresponding to policy->max
- * @min_limit_freq: Cached value of policy->min
- * @max_limit_freq: Cached value of policy->max
- * @max_freq: the frequency that mapped to highest_perf
- * @min_freq: the frequency that mapped to lowest_perf
- * @nominal_freq: the frequency that mapped to nominal_perf
- * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
+ * @min_limit_freq: Cached value of policy->min (in khz)
+ * @max_limit_freq: Cached value of policy->max (in khz)
+ * @max_freq: the frequency (in khz) that mapped to highest_perf
+ * @min_freq: the frequency (in khz) that mapped to lowest_perf
+ * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
+ * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
* @prev: Last Aperf/Mperf/tsc count value read from register
- * @freq: current cpu frequency value
+ * @freq: current cpu frequency value (in khz)
* @boost_supported: check whether the Processor or SBIOS supports boost mode
* @hw_prefcore: check whether HW supports preferred core featue.
* Only when hw_prefcore and early prefcore param are true,
--
2.34.1


2024-03-25 13:49:27

by Yuan, Perry

Subject: [PATCH v10 7/8] cppc_acpi: print error message if CPPC is unsupported

The amd-pstate driver can fail when _CPC objects are not supported by
the CPU. However, the current error message is ambiguous (see below) and
there is no clear way to attribute the failure of the amd-pstate
driver to the lack of CPPC support.

[ 0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

Fix this by adding a debug message to notify the user if the amd-pstate
driver failed to load due to CPPC not being supported by the CPU.

Reviewed-by: Mario Limonciello <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
Tested-by: Dhananjay Ugwekar <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/acpi/cppc_acpi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..3134101f31b6 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -686,8 +686,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)

if (!osc_sb_cppc2_support_acked) {
pr_debug("CPPC v2 _OSC not acked\n");
- if (!cpc_supported_by_cpu())
+ if (!cpc_supported_by_cpu()) {
+ pr_debug("CPPC is not supported by the CPU\n");
return -ENODEV;
+ }
}

/* Parse the ACPI _CPC table for this CPU. */
--
2.34.1


2024-04-15 01:39:33

by Huang Rui

Subject: Re: [PATCH v10 2/8] cpufreq: amd-pstate: Document the units for freq variables in amd_cpudata

On Mon, Mar 25, 2024 at 11:03:22AM +0800, Yuan, Perry wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> The min_limit_freq, max_limit_freq, min_freq, max_freq, nominal_freq
> and the lowest_nominal_freq members of struct cpudata store the
> frequency value in khz to be consistent with the cpufreq
> core. Update the comment to document this.
>
> Reviewed-by: Li Meng <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>
> ---

Acked-by: Huang Rui <[email protected]>

> include/linux/amd-pstate.h | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 212f377d615b..ab7e82533718 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -51,15 +51,15 @@ struct amd_aperf_mperf {
> * priority.
> * @min_limit_perf: Cached value of the perf corresponding to policy->min
> * @max_limit_perf: Cached value of the perf corresponding to policy->max
> - * @min_limit_freq: Cached value of policy->min
> - * @max_limit_freq: Cached value of policy->max
> - * @max_freq: the frequency that mapped to highest_perf
> - * @min_freq: the frequency that mapped to lowest_perf
> - * @nominal_freq: the frequency that mapped to nominal_perf
> - * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
> + * @min_limit_freq: Cached value of policy->min (in khz)
> + * @max_limit_freq: Cached value of policy->max (in khz)
> + * @max_freq: the frequency (in khz) that mapped to highest_perf
> + * @min_freq: the frequency (in khz) that mapped to lowest_perf
> + * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
> + * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
> * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
> * @prev: Last Aperf/Mperf/tsc count value read from register
> - * @freq: current cpu frequency value
> + * @freq: current cpu frequency value (in khz)
> * @boost_supported: check whether the Processor or SBIOS supports boost mode
> * @hw_prefcore: check whether HW supports preferred core featue.
> * Only when hw_prefcore and early prefcore param are true,
> --
> 2.34.1
>

2024-04-15 14:55:06

by Huang Rui

Subject: Re: [PATCH v10 3/8] cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq

On Mon, Mar 25, 2024 at 11:03:23AM +0800, Yuan, Perry wrote:
> Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
> helpers computes the values of min_freq, max_freq, nominal_freq and
> lowest_nominal_freq respectively afresh from
> cppc_get_perf_caps(). This is not necessary as there are fields in
> cpudata to cache these values.
>
> To simplify this, add a single helper function named
> amd_pstate_init_freq() which computes all these frequencies at once, and
> caches it in cpudata.
>
> Use the cached values everywhere else in the code.
>
> Reviewed-by: Li Meng <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Co-developed-by: Gautham R. Shenoy <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

I am thinking patches 3 and 4 should be squashed together, because they
both refine the frequencies in cpudata. But I am fine if you want to
keep them separate.

> ---
> drivers/cpufreq/amd-pstate.c | 126 ++++++++++++++++-------------------
> 1 file changed, 59 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 2015c9fcc3c9..ba1baa6733e6 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -606,74 +606,22 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>
> static int amd_get_min_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> -
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> - /* Switch to khz */
> - return cppc_perf.lowest_freq * 1000;
> + return READ_ONCE(cpudata->min_freq);
> }
>
> static int amd_get_max_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> - u32 max_perf, max_freq, nominal_freq, nominal_perf;
> - u64 boost_ratio;
> -
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> - nominal_freq = cppc_perf.nominal_freq;
> - nominal_perf = READ_ONCE(cpudata->nominal_perf);
> - max_perf = READ_ONCE(cpudata->highest_perf);
> -
> - boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT,
> - nominal_perf);
> -
> - max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
> -
> - /* Switch to khz */
> - return max_freq * 1000;
> + return READ_ONCE(cpudata->max_freq);
> }
>
> static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> -
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> - /* Switch to khz */
> - return cppc_perf.nominal_freq * 1000;
> + return READ_ONCE(cpudata->nominal_freq);
> }
>
> static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> - u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
> - nominal_freq, nominal_perf;
> - u64 lowest_nonlinear_ratio;
> -
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> - nominal_freq = cppc_perf.nominal_freq;
> - nominal_perf = READ_ONCE(cpudata->nominal_perf);
> -
> - lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
> -
> - lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
> - nominal_perf);
> -
> - lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
> -
> - /* Switch to khz */
> - return lowest_nonlinear_freq * 1000;
> + return READ_ONCE(cpudata->lowest_nonlinear_freq);
> }
>
> static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
> @@ -828,6 +776,53 @@ static void amd_pstate_update_limits(unsigned int cpu)
> mutex_unlock(&amd_pstate_driver_lock);
> }
>
> +/**
> + * amd_pstate_init_freq: Initialize the max_freq, min_freq,
> + * nominal_freq and lowest_nonlinear_freq for
> + * the @cpudata object.
> + *
> + * Requires: highest_perf, lowest_perf, nominal_perf and
> + * lowest_nonlinear_perf members of @cpudata to be
> + * initialized.
> + *
> + * Returns 0 on success, non-zero value on failure.
> + */
> +static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> +{
> + int ret;
> + u32 min_freq;
> + u32 highest_perf, max_freq;
> + u32 nominal_perf, nominal_freq;
> + u32 lowest_nonlinear_perf, lowest_nonlinear_freq;
> + u32 boost_ratio, lowest_nonlinear_ratio;
> + struct cppc_perf_caps cppc_perf;
> +
> +
> + ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> + if (ret)
> + return ret;
> +
> + min_freq = cppc_perf.lowest_freq * 1000;
> + nominal_freq = cppc_perf.nominal_freq * 1000;
> + nominal_perf = READ_ONCE(cpudata->nominal_perf);
> +
> + highest_perf = READ_ONCE(cpudata->highest_perf);
> + boost_ratio = div_u64(highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);
> + max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
> +
> + lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
> + lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
> + nominal_perf);
> + lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
> +
> + WRITE_ONCE(cpudata->min_freq, min_freq);
> + WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
> + WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
> + WRITE_ONCE(cpudata->max_freq, max_freq);
> +
> + return 0;
> +}
> +
> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> {
> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@ -855,6 +850,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> + ret = amd_pstate_init_freq(cpudata);
> + if (ret)
> + goto free_cpudata1;
> +
> min_freq = amd_get_min_freq(cpudata);
> max_freq = amd_get_max_freq(cpudata);
> nominal_freq = amd_get_nominal_freq(cpudata);
> @@ -896,13 +895,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> goto free_cpudata2;
> }
>
> - /* Initial processor data capability frequencies */
> - cpudata->max_freq = max_freq;
> - cpudata->min_freq = min_freq;
> cpudata->max_limit_freq = max_freq;
> cpudata->min_limit_freq = min_freq;
> - cpudata->nominal_freq = nominal_freq;
> - cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
>
> policy->driver_data = cpudata;
>
> @@ -1317,6 +1311,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> + ret = amd_pstate_init_freq(cpudata);
> + if (ret)
> + goto free_cpudata1;
> +
> min_freq = amd_get_min_freq(cpudata);
> max_freq = amd_get_max_freq(cpudata);
> nominal_freq = amd_get_nominal_freq(cpudata);
> @@ -1333,12 +1331,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> /* It will be updated by governor */
> policy->cur = policy->cpuinfo.min_freq;
>
> - /* Initial processor data capability frequencies */
> - cpudata->max_freq = max_freq;
> - cpudata->min_freq = min_freq;
> - cpudata->nominal_freq = nominal_freq;
> - cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> -
> policy->driver_data = cpudata;
>
> cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
> --
> 2.34.1
>
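
The frequency derivation in amd_pstate_init_freq() above is a 10-bit fixed-point scaling of nominal_freq by a perf ratio. A minimal userspace sketch of that arithmetic follows. The helper name scale_freq_khz() and the CAPACITY_SHIFT macro are hypothetical stand-ins, and doing the multiply in 64 bits is an assumption of this sketch rather than what the quoted patch does (there the product is computed in u32, so a nominal_freq of a few million kHz multiplied by a ratio above 1024 could exceed 32 bits).

```c
#include <stdint.h>

/* Stand-in for SCHED_CAPACITY_SHIFT, which is 10 in the kernel. */
#define CAPACITY_SHIFT 10

/*
 * Hypothetical helper: scale nominal_freq_khz by perf / nominal_perf
 * using a fixed-point ratio with CAPACITY_SHIFT fractional bits.
 * The intermediate product is kept in 64 bits (assumption of this
 * sketch) so it cannot wrap. Returns 0 if nominal_perf is 0.
 */
static uint32_t scale_freq_khz(uint32_t nominal_freq_khz, uint32_t perf,
			       uint32_t nominal_perf)
{
	uint64_t ratio;

	if (nominal_perf == 0)
		return 0;

	ratio = ((uint64_t)perf << CAPACITY_SHIFT) / nominal_perf;

	return (uint32_t)(((uint64_t)nominal_freq_khz * ratio) >> CAPACITY_SHIFT);
}
```

With the perf values from the cover letter (highest 255, nominal 204, lowest non-linear 124) and an assumed nominal frequency of exactly 2800000 kHz, scale_freq_khz(2800000, 255, 204) gives 3500000 kHz, and scale_freq_khz(2800000, 124, 204) gives 1700781 kHz because the ratio is truncated to 622/1024.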

2024-04-15 14:57:53

by Huang Rui

Subject: Re: [PATCH v10 4/8] cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()

On Mon, Mar 25, 2024 at 11:03:24AM +0800, Yuan, Perry wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
> return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.
>
> There is no loss in readability in replacing their invocations by
> accesses to the corresponding members of cpudata.
>
> Do so and remove these helper functions.
>
> Reviewed-by: Li Meng <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Same comment as for patch 3.

> ---
> drivers/cpufreq/amd-pstate.c | 40 +++++++++---------------------------
> 1 file changed, 10 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index ba1baa6733e6..132330b4942f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -604,26 +604,6 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
> cpufreq_cpu_put(policy);
> }
>
> -static int amd_get_min_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->min_freq);
> -}
> -
> -static int amd_get_max_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->max_freq);
> -}
> -
> -static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->nominal_freq);
> -}
> -
> -static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> -{
> - return READ_ONCE(cpudata->lowest_nonlinear_freq);
> -}
> -
> static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> @@ -854,10 +834,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = amd_get_min_freq(cpudata);
> - max_freq = amd_get_max_freq(cpudata);
> - nominal_freq = amd_get_nominal_freq(cpudata);
> - lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + min_freq = READ_ONCE(cpudata->min_freq);
> + max_freq = READ_ONCE(cpudata->max_freq);
> + nominal_freq = READ_ONCE(cpudata->nominal_freq);
> + lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
>
> if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> @@ -960,7 +940,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
> int max_freq;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - max_freq = amd_get_max_freq(cpudata);
> + max_freq = READ_ONCE(cpudata->max_freq);
> if (max_freq < 0)
> return max_freq;
>
> @@ -973,7 +953,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
> int freq;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - freq = amd_get_lowest_nonlinear_freq(cpudata);
> + freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> if (freq < 0)
> return freq;
>
> @@ -1315,10 +1295,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = amd_get_min_freq(cpudata);
> - max_freq = amd_get_max_freq(cpudata);
> - nominal_freq = amd_get_nominal_freq(cpudata);
> - lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
> + min_freq = READ_ONCE(cpudata->min_freq);
> + max_freq = READ_ONCE(cpudata->max_freq);
> + nominal_freq = READ_ONCE(cpudata->nominal_freq);
> + lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> min_freq, max_freq);
> --
> 2.34.1
>

2024-04-15 15:02:26

by Huang Rui

Subject: Re: [PATCH v10 6/8] cpufreq: amd-pstate: get transition delay and latency value from ACPI tables

On Mon, Mar 25, 2024 at 11:03:26AM +0800, Yuan, Perry wrote:
> Make the pstate driver first retrieve the P-state transition delay and
> latency values from the BIOS ACPI tables, which have more reasonable
> delay and latency values according to the platform design and
> requirements.
>
> Previously these values were hardcoded to specific values, which may
> have conflicted with the platform and might not reflect the most
> accurate or optimized setting for the processor.
>
> [054h 0084 8] Preserve Mask : FFFFFFFF00000000
> [05Ch 0092 8] Write Mask : 0000000000000001
> [064h 0100 4] Command Latency : 00000FA0
> [068h 0104 4] Maximum Access Rate : 0000EA60
> [06Ch 0108 2] Minimum Turnaround Time : 0000
>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Reviewed-by: Mario Limonciello <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Acked-by: Huang Rui <[email protected]>

> ---
> drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 6708c436e1a2..ec049b62b366 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -756,6 +756,36 @@ static void amd_pstate_update_limits(unsigned int cpu)
> mutex_unlock(&amd_pstate_driver_lock);
> }
>
> +/**
> + * Get pstate transition delay time from ACPI tables that firmware set
> + * instead of using hardcode value directly.
> + */
> +static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
> +{
> + u32 transition_delay_ns;
> +
> + transition_delay_ns = cppc_get_transition_latency(cpu);
> + if (transition_delay_ns == CPUFREQ_ETERNAL)
> + return AMD_PSTATE_TRANSITION_DELAY;
> +
> + return transition_delay_ns / NSEC_PER_USEC;
> +}
> +
> +/**
> + * Get pstate transition latency value from ACPI tables that firmware
> + * set instead of using hardcode value directly.
> + */
> +static u32 amd_pstate_get_transition_latency(unsigned int cpu)
> +{
> + u32 transition_latency;
> +
> + transition_latency = cppc_get_transition_latency(cpu);
> + if (transition_latency == CPUFREQ_ETERNAL)
> + return AMD_PSTATE_TRANSITION_LATENCY;
> +
> + return transition_latency;
> +}
> +
> /**
> * amd_pstate_init_freq: Initialize the max_freq, min_freq,
> * nominal_freq and lowest_nonlinear_freq for
> @@ -848,8 +878,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> goto free_cpudata1;
> }
>
> - policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
> - policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
> + policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
> + policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>
> policy->min = min_freq;
> policy->max = max_freq;
> --
> 2.34.1
>
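
The fallback logic in the two helpers above can be sketched in plain C. The function names and the LATENCY_UNKNOWN, FALLBACK_DELAY_US and FALLBACK_LATENCY_NS constants below are hypothetical stand-ins for CPUFREQ_ETERNAL, AMD_PSTATE_TRANSITION_DELAY and AMD_PSTATE_TRANSITION_LATENCY, and their numeric values are assumptions for illustration only. The firmware value is taken to be in nanoseconds, as the transition_delay_ns variable name in the patch suggests.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000U

/* Assumed sentinel standing in for CPUFREQ_ETERNAL ("latency unknown"). */
#define LATENCY_UNKNOWN UINT32_MAX

/* Assumed fallback values, for illustration only. */
#define FALLBACK_DELAY_US   1000U
#define FALLBACK_LATENCY_NS 20000U

/* Delay in microseconds: firmware value if known, else the fallback. */
static uint32_t pick_transition_delay_us(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == LATENCY_UNKNOWN)
		return FALLBACK_DELAY_US;

	return fw_latency_ns / NSEC_PER_USEC;
}

/* Latency in nanoseconds: firmware value if known, else the fallback. */
static uint32_t pick_transition_latency_ns(uint32_t fw_latency_ns)
{
	if (fw_latency_ns == LATENCY_UNKNOWN)
		return FALLBACK_LATENCY_NS;

	return fw_latency_ns;
}
```

In the table dump quoted in the commit message, Command Latency 0x0FA0 is 4000 and Maximum Access Rate 0xEA60 is 60000 in decimal. Note that in this sketch, as in the quoted helper, a firmware latency below 1000 ns truncates to a delay of 0 us.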

2024-04-15 15:05:35

by Huang Rui

Subject: Re: [PATCH v10 7/8] cppc_acpi: print error message if CPPC is unsupported

On Mon, Mar 25, 2024 at 11:03:27AM +0800, Yuan, Perry wrote:
> The amd-pstate driver can fail when _CPC objects are not supported by
> the CPU. However, the current error message is ambiguous (see below) and
> there is no clear way for attributing the failure of the amd-pstate
> driver to the lack of CPPC support.
>
> [ 0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled
>
> Fix this by adding a debug message to notify the user if the amd-pstate
> driver failed to load because CPPC is not supported by the CPU.
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Acked-by: Huang Rui <[email protected]>

> ---
> drivers/acpi/cppc_acpi.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 4bfbe55553f4..3134101f31b6 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -686,8 +686,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
>
> if (!osc_sb_cppc2_support_acked) {
> pr_debug("CPPC v2 _OSC not acked\n");
> - if (!cpc_supported_by_cpu())
> + if (!cpc_supported_by_cpu()) {
> + pr_debug("CPPC is not supported by the CPU\n");
> return -ENODEV;
> + }
> }
>
> /* Parse the ACPI _CPC table for this CPU. */
> --
> 2.34.1
>

2024-04-15 15:06:00

by Huang Rui

Subject: Re: [PATCH v10 5/8] cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0

On Mon, Mar 25, 2024 at 11:03:25AM +0800, Yuan, Perry wrote:
> The amd-pstate driver cannot work when the min_freq, nominal_freq or
> the max_freq is zero. When this happens it is prudent to error out
> early on rather than failing later, at the time of governor
> initialization.
>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 132330b4942f..6708c436e1a2 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -839,9 +839,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
>
> - if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> - dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> - min_freq, max_freq);
> + if (min_freq <= 0 || max_freq <= 0 ||
> + nominal_freq <= 0 || min_freq > max_freq) {
> + dev_err(dev,
> + "min_freq(%d) or max_freq(%d) or nominal_freq (%d) value is incorrect\n",
> + min_freq, max_freq, nominal_freq);

I suggest adding a comment here to note that this condition most likely
indicates an error in the ACPI tables or the BIOS.

> ret = -EINVAL;
> goto free_cpudata1;
> }
> @@ -1299,9 +1301,11 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> max_freq = READ_ONCE(cpudata->max_freq);
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> lowest_nonlinear_freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> - if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> - dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
> - min_freq, max_freq);
> + if (min_freq <= 0 || max_freq <= 0 ||
> + nominal_freq <= 0 || min_freq > max_freq) {
> + dev_err(dev,
> + "min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
> + min_freq, max_freq, nominal_freq);

Same as above.

With that fixed, patch is Acked-by: Huang Rui <[email protected]>

Thanks,
Ray
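
The tightened sanity check discussed above can be sketched as a standalone predicate. The function name freqs_are_valid() is hypothetical. It only mirrors the condition in the quoted hunks: any zero or negative frequency, or a minimum above the maximum, is rejected, which per the review comment would most likely point at broken ACPI tables or BIOS.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate mirroring the check in the quoted patch:
 * all three frequencies must be strictly positive and min_freq must
 * not exceed max_freq.
 */
static bool freqs_are_valid(int min_freq, int max_freq, int nominal_freq)
{
	if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0)
		return false;

	return min_freq <= max_freq;
}
```

The difference from the old check is the switch from "< 0" to "<= 0" and the inclusion of nominal_freq, so a firmware table that reports 0 for any of these values is now caught at init time.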

2024-04-15 15:17:03

by Huang Rui

Subject: Re: [PATCH v10 8/8] cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing

On Mon, Mar 25, 2024 at 11:03:28AM +0800, Yuan, Perry wrote:
> Add a quirks table to work around CPPC capability issues by providing
> correct perf or frequency values while the driver is loading.
>
> If CPPC capabilities are not defined in the ACPI tables, or are wrongly
> defined by platform firmware, a quirk is needed to supply correct
> workaround values so that the pstate driver can be loaded despite the
> CPPC capability errors.
>
> The workaround matches broken BIOSes that lack the CPPC nominal_freq
> and lowest_freq capability definitions in the ACPI table.
>
> $ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
> 0
> $ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
> 0
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Tested-by: Dhananjay Ugwekar <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 53 ++++++++++++++++++++++++++++++++++--
> include/linux/amd-pstate.h | 6 ++++
> 2 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index ec049b62b366..59a2db225d98 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -67,6 +67,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
> static int cppc_state = AMD_PSTATE_UNDEFINED;
> static bool cppc_enabled;
> static bool amd_pstate_prefcore = true;
> +static struct quirk_entry *quirks;

If quirks is a global pointer, should we also free it when the amd-pstate
driver is unloaded?

Thanks,
Ray

>
> /*
> * AMD Energy Preference Performance (EPP)
> @@ -111,6 +112,41 @@ static unsigned int epp_values[] = {
>
> typedef int (*cppc_mode_transition_fn)(int);
>
> +static struct quirk_entry quirk_amd_7k62 = {
> + .nominal_freq = 2600,
> + .lowest_freq = 550,
> +};
> +
> +static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
> +{
> + /**
> + * match the broken bios for family 17h processor support CPPC V2
> + * broken BIOS lack of nominal_freq and lowest_freq capabilities
> + * definition in ACPI tables
> + */
> + if (boot_cpu_has(X86_FEATURE_ZEN2)) {
> + quirks = dmi->driver_data;
> + pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
> + {
> + .callback = dmi_matched_7k62_bios_bug,
> + .ident = "AMD EPYC 7K62",
> + .matches = {
> + DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
> + DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
> + },
> + .driver_data = &quirk_amd_7k62,
> + },
> + {}
> +};
> +MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
> +
> static inline int get_mode_idx_from_str(const char *str, size_t size)
> {
> int i;
> @@ -812,8 +848,16 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> if (ret)
> return ret;
>
> - min_freq = cppc_perf.lowest_freq * 1000;
> - nominal_freq = cppc_perf.nominal_freq * 1000;
> + if (quirks && quirks->lowest_freq)
> + min_freq = quirks->lowest_freq * 1000;
> + else
> + min_freq = cppc_perf.lowest_freq * 1000;
> +
> + if (quirks && quirks->nominal_freq)
> + nominal_freq = quirks->nominal_freq * 1000;
> + else
> + nominal_freq = cppc_perf.nominal_freq * 1000;
> +
> nominal_perf = READ_ONCE(cpudata->nominal_perf);
>
> highest_perf = READ_ONCE(cpudata->highest_perf);
> @@ -1662,6 +1706,11 @@ static int __init amd_pstate_init(void)
> if (cpufreq_get_current_driver())
> return -EEXIST;
>
> + quirks = NULL;
> +
> + /* check if this machine need CPPC quirks */
> + dmi_check_system(amd_pstate_quirks_table);
> +
> switch (cppc_state) {
> case AMD_PSTATE_UNDEFINED:
> /* Disable on the following configs by default:
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index ab7e82533718..6b832153a126 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -128,4 +128,10 @@ static const char * const amd_pstate_mode_string[] = {
> [AMD_PSTATE_GUIDED] = "guided",
> NULL,
> };
> +
> +struct quirk_entry {
> + u32 nominal_freq;
> + u32 lowest_freq;
> +};
> +
> #endif /* _LINUX_AMD_PSTATE_H */
> --
> 2.34.1
>
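
The override rule added to amd_pstate_init_freq() in this patch can be sketched in userspace C: a non-zero quirk value wins, otherwise the firmware-reported value is used. The struct freq_quirk type and the resolve_*() helper names are hypothetical. Only the 2600 and 550 MHz values come from the quoted quirk_amd_7k62 entry. As in the quoted patch, where quirks is set to the address of a static object, the quirk here lives in static storage and nothing is heap-allocated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct quirk_entry; values are in MHz. */
struct freq_quirk {
	uint32_t nominal_freq_mhz;
	uint32_t lowest_freq_mhz;
};

/* Values taken from the quirk_amd_7k62 entry in the quoted patch. */
static const struct freq_quirk quirk_7k62 = {
	.nominal_freq_mhz = 2600,
	.lowest_freq_mhz = 550,
};

/* Minimum frequency in kHz: quirk value if present and non-zero. */
static uint32_t resolve_min_freq_khz(const struct freq_quirk *q,
				     uint32_t fw_lowest_mhz)
{
	if (q && q->lowest_freq_mhz)
		return q->lowest_freq_mhz * 1000;

	return fw_lowest_mhz * 1000;
}

/* Nominal frequency in kHz: quirk value if present and non-zero. */
static uint32_t resolve_nominal_freq_khz(const struct freq_quirk *q,
					 uint32_t fw_nominal_mhz)
{
	if (q && q->nominal_freq_mhz)
		return q->nominal_freq_mhz * 1000;

	return fw_nominal_mhz * 1000;
}
```

On a machine matching the quirk, where firmware reports 0 for both fields, resolve_min_freq_khz(&quirk_7k62, 0) yields 550000 and resolve_nominal_freq_khz(&quirk_7k62, 0) yields 2600000. With a NULL quirk pointer the firmware values pass through unchanged.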

2024-04-18 09:13:23

by Yuan, Perry

Subject: RE: [PATCH v10 5/8] cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0

[AMD Official Use Only - General]

Regards.
Perry

> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Monday, April 15, 2024 10:59 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Shenoy, Gautham
> Ranjal <[email protected]>; Petkov, Borislav
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; [email protected]; Du, Xiaojian
> <[email protected]>; Meng, Li (Jassmine) <[email protected]>; linux-
> [email protected]; [email protected]
> Subject: Re: [PATCH v10 5/8] cpufreq: amd-pstate: Bail out if
> min/max/nominal_freq is 0
>
> On Mon, Mar 25, 2024 at 11:03:25AM +0800, Yuan, Perry wrote:
> > The amd-pstate driver cannot work when the min_freq, nominal_freq or
> > the max_freq is zero. When this happens it is prudent to error out
> > early on rather than failing later, at the time of governor
> > initialization.
> >
> > Reviewed-by: Gautham R. Shenoy <[email protected]>
> > Tested-by: Dhananjay Ugwekar <[email protected]>
> > Signed-off-by: Perry Yuan <[email protected]>
> > ---
> > drivers/cpufreq/amd-pstate.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 132330b4942f..6708c436e1a2
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -839,9 +839,11 @@ static int amd_pstate_cpu_init(struct
> cpufreq_policy *policy)
> > nominal_freq = READ_ONCE(cpudata->nominal_freq);
> > lowest_nonlinear_freq = READ_ONCE(cpudata-
> >lowest_nonlinear_freq);
> >
> > - if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > - dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > - min_freq, max_freq);
> > + if (min_freq <= 0 || max_freq <= 0 ||
> > + nominal_freq <= 0 || min_freq > max_freq) {
> > + dev_err(dev,
> > + "min_freq(%d) or max_freq(%d) or nominal_freq
> (%d) value is incorrect\n",
> > + min_freq, max_freq, nominal_freq);
>
> I suggest that we add one comment to remind that should be the error of
> ACPI table or BIOS.

OK, thanks for the comment. I will add a code comment for this in v11.

>
> > ret = -EINVAL;
> > goto free_cpudata1;
> > }
> > @@ -1299,9 +1301,11 @@ static int amd_pstate_epp_cpu_init(struct
> cpufreq_policy *policy)
> > max_freq = READ_ONCE(cpudata->max_freq);
> > nominal_freq = READ_ONCE(cpudata->nominal_freq);
> > lowest_nonlinear_freq = READ_ONCE(cpudata-
> >lowest_nonlinear_freq);
> > - if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
> > - dev_err(dev, "min_freq(%d) or max_freq(%d) value is
> incorrect\n",
> > - min_freq, max_freq);
> > + if (min_freq <= 0 || max_freq <= 0 ||
> > + nominal_freq <= 0 || min_freq > max_freq) {
> > + dev_err(dev,
> > + "min_freq(%d) or max_freq(%d) or nominal_freq(%d)
> value is incorrect\n",
> > + min_freq, max_freq, nominal_freq);
>
> The same with above.
>
> With that fixed, patch is Acked-by: Huang Rui <[email protected]>
>
> Thanks,
> Ray

Thanks for the ACK.

Perry.


2024-04-18 09:15:46

by Yuan, Perry

Subject: RE: [PATCH v10 3/8] cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq

[AMD Official Use Only - General]

> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Monday, April 15, 2024 10:55 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Shenoy, Gautham
> Ranjal <[email protected]>; Petkov, Borislav
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer
> <[email protected]>; [email protected]; Du, Xiaojian
> <[email protected]>; Meng, Li (Jassmine) <[email protected]>; linux-
> [email protected]; [email protected]
> Subject: Re: [PATCH v10 3/8] cpufreq: amd-pstate: Unify computation of
> {max,min,nominal,lowest_nonlinear}_freq
>
> On Mon, Mar 25, 2024 at 11:03:23AM +0800, Yuan, Perry wrote:
> > Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
> > helpers compute the values of min_freq, max_freq, nominal_freq and
> > lowest_nonlinear_freq respectively afresh from cppc_get_perf_caps().
> > This is not necessary as there are fields in cpudata to cache these
> > values.
> >
> > To simplify this, add a single helper function named
> > amd_pstate_init_freq() which computes all these frequencies at once,
> > and caches them in cpudata.
> >
> > Use the cached values everywhere else in the code.
> >
> > Reviewed-by: Li Meng <[email protected]>
> > Tested-by: Dhananjay Ugwekar <[email protected]>
> > Co-developed-by: Gautham R. Shenoy <[email protected]>
> > Signed-off-by: Gautham R. Shenoy <[email protected]>
> > Signed-off-by: Perry Yuan <[email protected]>
>
> I think patches 3 and 4 should be squashed together, because both of them
> refine the frequencies cached in cpudata. But I am fine if you want to
> keep them separate.

Yes, we would like to keep the changes in two patches, so that they do not
need to be reviewed again in v11.
Thanks.

Perry.

>
> > ---
> > drivers/cpufreq/amd-pstate.c | 126
> > ++++++++++++++++-------------------
> > 1 file changed, 59 insertions(+), 67 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 2015c9fcc3c9..ba1baa6733e6
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -606,74 +606,22 @@ static void amd_pstate_adjust_perf(unsigned int
> > cpu,
> >
> > static int amd_get_min_freq(struct amd_cpudata *cpudata) {
> > - struct cppc_perf_caps cppc_perf;
> > -
> > - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> > - if (ret)
> > - return ret;
> > -
> > - /* Switch to khz */
> > - return cppc_perf.lowest_freq * 1000;
> > + return READ_ONCE(cpudata->min_freq);
> > }
> >
> > static int amd_get_max_freq(struct amd_cpudata *cpudata) {
> > - struct cppc_perf_caps cppc_perf;
> > - u32 max_perf, max_freq, nominal_freq, nominal_perf;
> > - u64 boost_ratio;
> > -
> > - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> > - if (ret)
> > - return ret;
> > -
> > - nominal_freq = cppc_perf.nominal_freq;
> > - nominal_perf = READ_ONCE(cpudata->nominal_perf);
> > - max_perf = READ_ONCE(cpudata->highest_perf);
> > -
> > - boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT,
> > - nominal_perf);
> > -
> > - max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
> > -
> > - /* Switch to khz */
> > - return max_freq * 1000;
> > + return READ_ONCE(cpudata->max_freq);
> > }
> >
> > static int amd_get_nominal_freq(struct amd_cpudata *cpudata) {
> > - struct cppc_perf_caps cppc_perf;
> > -
> > - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> > - if (ret)
> > - return ret;
> > -
> > - /* Switch to khz */
> > - return cppc_perf.nominal_freq * 1000;
> > + return READ_ONCE(cpudata->nominal_freq);
> > }
> >
> > static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> > {
> > - struct cppc_perf_caps cppc_perf;
> > - u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
> > - nominal_freq, nominal_perf;
> > - u64 lowest_nonlinear_ratio;
> > -
> > - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> > - if (ret)
> > - return ret;
> > -
> > - nominal_freq = cppc_perf.nominal_freq;
> > - nominal_perf = READ_ONCE(cpudata->nominal_perf);
> > -
> > - lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
> > -
> > - lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf <<
> SCHED_CAPACITY_SHIFT,
> > - nominal_perf);
> > -
> > - lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >>
> SCHED_CAPACITY_SHIFT;
> > -
> > - /* Switch to khz */
> > - return lowest_nonlinear_freq * 1000;
> > + return READ_ONCE(cpudata->lowest_nonlinear_freq);
> > }
> >
> > static int amd_pstate_set_boost(struct cpufreq_policy *policy, int
> > state) @@ -828,6 +776,53 @@ static void
> amd_pstate_update_limits(unsigned int cpu)
> > mutex_unlock(&amd_pstate_driver_lock);
> > }
> >
> > +/**
> > + * amd_pstate_init_freq: Initialize the max_freq, min_freq,
> > + * nominal_freq and lowest_nonlinear_freq for
> > + * the @cpudata object.
> > + *
> > + * Requires: highest_perf, lowest_perf, nominal_perf and
> > + * lowest_nonlinear_perf members of @cpudata to be
> > + * initialized.
> > + *
> > + * Returns 0 on success, non-zero value on failure.
> > + */
> > +static int amd_pstate_init_freq(struct amd_cpudata *cpudata) {
> > + int ret;
> > + u32 min_freq;
> > + u32 highest_perf, max_freq;
> > + u32 nominal_perf, nominal_freq;
> > + u32 lowest_nonlinear_perf, lowest_nonlinear_freq;
> > + u32 boost_ratio, lowest_nonlinear_ratio;
> > + struct cppc_perf_caps cppc_perf;
> > +
> > +
> > + ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> > + if (ret)
> > + return ret;
> > +
> > + min_freq = cppc_perf.lowest_freq * 1000;
> > + nominal_freq = cppc_perf.nominal_freq * 1000;
> > + nominal_perf = READ_ONCE(cpudata->nominal_perf);
> > +
> > + highest_perf = READ_ONCE(cpudata->highest_perf);
> > + boost_ratio = div_u64(highest_perf << SCHED_CAPACITY_SHIFT,
> nominal_perf);
> > + max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
> > +
> > + lowest_nonlinear_perf = READ_ONCE(cpudata-
> >lowest_nonlinear_perf);
> > + lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf <<
> SCHED_CAPACITY_SHIFT,
> > + nominal_perf);
> > + lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >>
> > +SCHED_CAPACITY_SHIFT;
> > +
> > + WRITE_ONCE(cpudata->min_freq, min_freq);
> > + WRITE_ONCE(cpudata->lowest_nonlinear_freq,
> lowest_nonlinear_freq);
> > + WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
> > + WRITE_ONCE(cpudata->max_freq, max_freq);
> > +
> > + return 0;
> > +}
> > +
> > static int amd_pstate_cpu_init(struct cpufreq_policy *policy) {
> > int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@
> > -855,6 +850,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy
> *policy)
> > if (ret)
> > goto free_cpudata1;
> >
> > + ret = amd_pstate_init_freq(cpudata);
> > + if (ret)
> > + goto free_cpudata1;
> > +
> > min_freq = amd_get_min_freq(cpudata);
> > max_freq = amd_get_max_freq(cpudata);
> > nominal_freq = amd_get_nominal_freq(cpudata); @@ -896,13
> +895,8 @@
> > static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> > goto free_cpudata2;
> > }
> >
> > - /* Initial processor data capability frequencies */
> > - cpudata->max_freq = max_freq;
> > - cpudata->min_freq = min_freq;
> > cpudata->max_limit_freq = max_freq;
> > cpudata->min_limit_freq = min_freq;
> > - cpudata->nominal_freq = nominal_freq;
> > - cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> >
> > policy->driver_data = cpudata;
> >
> > @@ -1317,6 +1311,10 @@ static int amd_pstate_epp_cpu_init(struct
> cpufreq_policy *policy)
> > if (ret)
> > goto free_cpudata1;
> >
> > + ret = amd_pstate_init_freq(cpudata);
> > + if (ret)
> > + goto free_cpudata1;
> > +
> > min_freq = amd_get_min_freq(cpudata);
> > max_freq = amd_get_max_freq(cpudata);
> > nominal_freq = amd_get_nominal_freq(cpudata); @@ -1333,12
> +1331,6 @@
> > static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> > /* It will be updated by governor */
> > policy->cur = policy->cpuinfo.min_freq;
> >
> > - /* Initial processor data capability frequencies */
> > - cpudata->max_freq = max_freq;
> > - cpudata->min_freq = min_freq;
> > - cpudata->nominal_freq = nominal_freq;
> > - cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
> > -
> > policy->driver_data = cpudata;
> >
> > cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
> > --
> > 2.34.1
> >
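
For readers following the arithmetic in amd_pstate_init_freq() quoted above, here is a minimal userspace sketch of the fixed-point scaling used for the lowest non-linear frequency. The helper name lowest_nonlinear_freq_khz() and the CAPACITY_SHIFT macro are hypothetical stand-ins (the kernel uses div_u64() and SCHED_CAPACITY_SHIFT); this is an illustration under those assumptions, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's SCHED_CAPACITY_SHIFT: 1 << 10 represents 1.0. */
#define CAPACITY_SHIFT 10

/*
 * Hypothetical helper: scale the nominal frequency (kHz) by
 * lowest_nonlinear_perf / nominal_perf using integer fixed-point math,
 * following the two steps in the quoted patch.
 */
static uint64_t lowest_nonlinear_freq_khz(uint64_t nominal_freq_khz,
                                          uint32_t lowest_nonlinear_perf,
                                          uint32_t nominal_perf)
{
    uint64_t ratio;

    /* A zero nominal_perf would divide by zero. */
    assert(nominal_perf != 0);

    /* ratio = (lowest_nonlinear_perf / nominal_perf) in 10-bit fixed point */
    ratio = ((uint64_t)lowest_nonlinear_perf << CAPACITY_SHIFT) / nominal_perf;

    /* Multiply first, then shift back down; both steps truncate. */
    return (nominal_freq_khz * ratio) >> CAPACITY_SHIFT;
}
```

With the values from the cover letter's test output (nominal perf 204, nominal frequency 2.80 GHz, lowest non-linear perf 124), lowest_nonlinear_freq_khz(2800000, 124, 204) evaluates to 1700781 kHz, roughly 1.70 GHz. The result truncates twice, so it is slightly below the exact ratio.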

2024-04-22 09:50:58

by Yuan, Perry

[permalink] [raw]
Subject: RE: [PATCH v10 8/8] cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing

[AMD Official Use Only - General]

Hi Ray,


> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Monday, April 15, 2024 11:16 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Shenoy, Gautham
> Ranjal <[email protected]>; Petkov, Borislav
> <[email protected]>; Deucher, Alexander
> <[email protected]>; Huang, Shimmer <[email protected]>;
> [email protected]; Du, Xiaojian <[email protected]>; Meng, Li
> (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v10 8/8] cpufreq: amd-pstate: Add quirk for the pstate CPPC
> capabilities missing
>
> On Mon, Mar 25, 2024 at 11:03:28AM +0800, Yuan, Perry wrote:
> > Add a quirks table so that CPPC capability issues can be fixed by
> > providing correct perf or frequency values while the driver is loading.
> >
> > If CPPC capabilities are not defined in the ACPI tables, or are wrongly
> > defined by platform firmware, a quirk is needed to supply correct
> > workaround values so that the pstate driver can be loaded even though
> > the CPPC capabilities are in error.
> >
> > The workaround matches the broken BIOS, which lacks the CPPC
> > nominal_freq and lowest_freq capability definitions in the ACPI table.
> >
> > $ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
> > 0
> > $ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
> > 0
> >
> > Reviewed-by: Mario Limonciello <[email protected]>
> > Reviewed-by: Gautham R. Shenoy <[email protected]>
> > Tested-by: Dhananjay Ugwekar <[email protected]>
> > Signed-off-by: Perry Yuan <[email protected]>
> > ---
> > drivers/cpufreq/amd-pstate.c | 53 ++++++++++++++++++++++++++++++++++--
> > include/linux/amd-pstate.h | 6 ++++
> > 2 files changed, 57 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> > index ec049b62b366..59a2db225d98 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -67,6 +67,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
> > static int cppc_state = AMD_PSTATE_UNDEFINED;
> > static bool cppc_enabled;
> > static bool amd_pstate_prefcore = true;
> > +static struct quirk_entry *quirks;
>
> If we set quirks as global pointer, while the amd-pstate is uninstalling, should we
> free the quirks as well?
>
> Thanks,
> Ray

In general, if `quirks` were dynamically allocated while the driver runs, it would have to be freed when the driver is unloaded to avoid a memory leak. A pointer to a static or constant object does not need to be freed explicitly.
In this patch, `quirks` only stores the pointer taken from "dmi->driver_data", which refers to the statically defined quirk_amd_7k62 entry, so there is nothing to free. I will add your ack flag to this patch if you have no further concerns.

+ quirks = dmi->driver_data;
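
To make the lifetime point concrete, here is a small userspace sketch, not the driver itself. The struct and the quirk_amd_7k62 values are copied from the patch; match_quirk() and driver_teardown() are hypothetical stand-ins for the DMI callback and the unload path.

```c
#include <stddef.h>
#include <stdint.h>

struct quirk_entry {
    uint32_t nominal_freq; /* MHz */
    uint32_t lowest_freq;  /* MHz */
};

/* Static storage duration: never allocated at run time, so never freed. */
static struct quirk_entry quirk_amd_7k62 = {
    .nominal_freq = 2600,
    .lowest_freq = 550,
};

/* Global pointer that only borrows the address of a static object. */
static struct quirk_entry *quirks;

/* Hypothetical stand-in for the DMI callback: record the matched entry. */
static int match_quirk(void *driver_data)
{
    quirks = driver_data;
    return 1;
}

/*
 * Hypothetical stand-in for the unload path: dropping the pointer is
 * enough, because the object it refers to was never heap-allocated.
 */
static void driver_teardown(void)
{
    quirks = NULL;
}
```

Calling free() or kfree() on such a pointer would be a bug, since the object does not come from an allocator.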

Perry.

>
> >
> > /*
> > * AMD Energy Preference Performance (EPP)
> > @@ -111,6 +112,41 @@ static unsigned int epp_values[] = {
> >
> > typedef int (*cppc_mode_transition_fn)(int);
> >
> > +static struct quirk_entry quirk_amd_7k62 = {
> > + .nominal_freq = 2600,
> > + .lowest_freq = 550,
> > +};
> > +
> > +static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
> > +{
> > + /**
> > + * Match the broken BIOS for family 17h processors that support CPPC v2.
> > + * The broken BIOS lacks the nominal_freq and lowest_freq capability
> > + * definitions in the ACPI tables.
> > + */
> > + if (boot_cpu_has(X86_FEATURE_ZEN2)) {
> > + quirks = dmi->driver_data;
> > + pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
> > + return 1;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
> > + {
> > + .callback = dmi_matched_7k62_bios_bug,
> > + .ident = "AMD EPYC 7K62",
> > + .matches = {
> > + DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
> > + DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
> > + },
> > + .driver_data = &quirk_amd_7k62,
> > + },
> > + {}
> > +};
> > +MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
> > +
> > static inline int get_mode_idx_from_str(const char *str, size_t size)
> > {
> > int i;
> > @@ -812,8 +848,16 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> > if (ret)
> > return ret;
> >
> > - min_freq = cppc_perf.lowest_freq * 1000;
> > - nominal_freq = cppc_perf.nominal_freq * 1000;
> > + if (quirks && quirks->lowest_freq)
> > + min_freq = quirks->lowest_freq * 1000;
> > + else
> > + min_freq = cppc_perf.lowest_freq * 1000;
> > +
> > + if (quirks && quirks->nominal_freq)
> > + nominal_freq = quirks->nominal_freq * 1000;
> > + else
> > + nominal_freq = cppc_perf.nominal_freq * 1000;
> > +
> > nominal_perf = READ_ONCE(cpudata->nominal_perf);
> >
> > highest_perf = READ_ONCE(cpudata->highest_perf);
> > @@ -1662,6 +1706,11 @@ static int __init amd_pstate_init(void)
> > if (cpufreq_get_current_driver())
> > return -EEXIST;
> >
> > + quirks = NULL;
> > +
> > + /* check if this machine need CPPC quirks */
> > + dmi_check_system(amd_pstate_quirks_table);
> > +
> > switch (cppc_state) {
> > case AMD_PSTATE_UNDEFINED:
> > /* Disable on the following configs by default:
> > diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> > index ab7e82533718..6b832153a126 100644
> > --- a/include/linux/amd-pstate.h
> > +++ b/include/linux/amd-pstate.h
> > @@ -128,4 +128,10 @@ static const char * const amd_pstate_mode_string[] = {
> > [AMD_PSTATE_GUIDED] = "guided",
> > NULL,
> > };
> > +
> > +struct quirk_entry {
> > + u32 nominal_freq;
> > + u32 lowest_freq;
> > +};
> > +
> > #endif /* _LINUX_AMD_PSTATE_H */
> > --
> > 2.34.1
> >
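
The override in the amd_pstate_init_freq() hunk quoted above takes the quirk value when a quirk matched and its field is non-zero, and otherwise falls back to the firmware-reported CPPC value. A minimal sketch of that selection follows; the helper names pick_min_freq_khz() and pick_nominal_freq_khz() are hypothetical, and the firmware values are passed in as plain MHz integers instead of being read from cppc_perf.

```c
#include <stddef.h>
#include <stdint.h>

struct quirk_entry {
    uint32_t nominal_freq; /* MHz */
    uint32_t lowest_freq;  /* MHz */
};

/* Hypothetical helper: prefer the quirk's lowest_freq, else firmware's. */
static uint32_t pick_min_freq_khz(const struct quirk_entry *q,
                                  uint32_t cppc_lowest_mhz)
{
    if (q && q->lowest_freq)
        return q->lowest_freq * 1000;
    return cppc_lowest_mhz * 1000;
}

/* Hypothetical helper: prefer the quirk's nominal_freq, else firmware's. */
static uint32_t pick_nominal_freq_khz(const struct quirk_entry *q,
                                      uint32_t cppc_nominal_mhz)
{
    if (q && q->nominal_freq)
        return q->nominal_freq * 1000;
    return cppc_nominal_mhz * 1000;
}
```

Because each field is checked separately, a quirk entry can override only one of the two frequencies and leave the other to firmware. On the broken BIOS described in the commit message, firmware reports 0 for both, so both values come from the quirk.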