2024-03-13 10:00:59

by Yuan, Perry

Subject: [PATCH v7 0/6] AMD Pstate Fixes And Enhancements

This patch series adds some fixes and enhancements to the AMD pstate
driver.
It enables CPPC v2 for certain family 17h processors, as requested by
TR40 processor users who expect improved performance and lower system
temperature.

Additionally, it fixes the initialization of nominal_freq for each
cpudata and changes the latency and delay values to be read from
platform firmware first, for more accurate timing.

A new quirk is also added for legacy processors that lack CPPC
capabilities, which caused the pstate driver to fail to load.

Tested on one APU system with cpb boost enabled:

amd_pstate_lowest_nonlinear_freq:1701000
amd_pstate_max_freq:3501000
cpuinfo_max_freq:3501000
cpuinfo_min_freq:400000
scaling_cur_freq:3084836
scaling_max_freq:3501000
scaling_min_freq:400000

analyzing CPU 6:
driver: amd-pstate-epp
CPUs which run at the same hardware frequency: 6
CPUs which need to have their frequency coordinated by software: 6
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 3.50 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 3.50 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 3.50 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 255. Maximum Frequency: 3.50 GHz.
AMD PSTATE Nominal Performance: 204. Nominal Frequency: 2.80 GHz.
AMD PSTATE Lowest Non-linear Performance: 124. Lowest Non-linear Frequency: 1.70 GHz.
AMD PSTATE Lowest Performance: 30. Lowest Frequency: 400 MHz.
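
The frequencies above can be cross-checked against the perf values with
a little fixed-point arithmetic. The sketch below is a standalone
user-space approximation, not the driver code. It assumes the driver
scales nominal_freq by a perf ratio shifted by SCHED_CAPACITY_SHIFT
(taken here as 10), as the lowest-nonlinear ratio computation in patch 6
suggests, and it assumes a raw nominal frequency of 2801 MHz, which
cpupower would display as 2.80 GHz. The helper name perf_to_khz and the
CAPACITY_SHIFT macro are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the kernel's SCHED_CAPACITY_SHIFT. */
#define CAPACITY_SHIFT 10

/*
 * Hypothetical helper: scale a nominal frequency (MHz) by
 * perf / nominal_perf using a 10-bit fixed-point ratio, then
 * convert the result to kHz.
 */
static uint32_t perf_to_khz(uint32_t perf, uint32_t nominal_perf,
			    uint32_t nominal_freq_mhz)
{
	uint64_t ratio;
	uint32_t freq_mhz;

	/* A zero nominal_perf would mean broken CPPC data. */
	assert(nominal_perf != 0);

	ratio = ((uint64_t)perf << CAPACITY_SHIFT) / nominal_perf;
	freq_mhz = (uint32_t)((nominal_freq_mhz * ratio) >> CAPACITY_SHIFT);

	return freq_mhz * 1000;	/* MHz -> kHz */
}
```

Under these assumptions, perf 255 gives 3501000 kHz and perf 124 gives
1701000 kHz, which matches amd_pstate_max_freq and
amd_pstate_lowest_nonlinear_freq in the output above.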

Anyone who would like to test this patchset also needs to apply another
patchset on top of it, to avoid running into unexpected issues.

https://lore.kernel.org/lkml/[email protected]/
That patchset implements the amd pstate cpb boost feature.
The linked version is old; please apply the latest version
when you start testing.

I would greatly appreciate any feedback.


Thank you!

Changes from v6:
* add one new patch to initialize capabilities in
amd_pstate_init_perf, which avoids duplicate reads of the CPPC
capabilities; the change has been tested on an APU system.
* pick up RB flags from Gautham
* drop the patch 1/6 which has been merged by Rafael

Changes from v5:
* rebased to linux-pm v6.8
* pick up RB flag from Mario for patch 6

Changes from v4:
* improve the dmi matching rule with zen2 flag only

Changes from v3:
* change quirk matching broken BIOS with family/model ID and Zen2
flag to fix the CPPC definition issue
* fix typo in quirk

Changes from v2:
* change quirk matching to BIOS version and release (Mario)
* pick up RB flag from Mario

Changes from v1:
* pick up the RB flags from Mario
* address review comment of patch #6 for amd_get_nominal_freq()
* rebased the series to linux-pm/bleeding-edge v6.8.0-rc2
* update debug log for patch #5 as Mario suggested.
* fix some typos and format problems
* tested on 7950X platform


v1: https://lore.kernel.org/lkml/[email protected]/
v2: https://lore.kernel.org/all/[email protected]/
v3: https://lore.kernel.org/lkml/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/
v6: https://lore.kernel.org/lkml/[email protected]/

Perry Yuan (6):
cpufreq:amd-pstate: fix the nominal freq value set
cpufreq:amd-pstate: initialize nominal_freq of each cpudata
cpufreq:amd-pstate: get pstate transition delay and latency value from
ACPI tables
cppc_acpi: print error message if CPPC is unsupported
cpufreq:amd-pstate: add quirk for the pstate CPPC capabilities missing
cpufreq:amd-pstate: initialize capabilities in amd_pstate_init_perf

drivers/acpi/cppc_acpi.c | 4 +-
drivers/cpufreq/amd-pstate.c | 151 ++++++++++++++++++++++++++---------
include/linux/amd-pstate.h | 7 ++
3 files changed, 122 insertions(+), 40 deletions(-)

--
2.34.1



2024-03-13 10:01:07

by Yuan, Perry

Subject: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set

Address an untested error where nominal_freq was returned in kHz
instead of the correct MHz units. This oversight led to a wrong
nominal_freq being set and reused, which causes the max frequency of
the core to be initialized with a wrong frequency value.

Cc: [email protected]
Fixes: ec437d71db7 ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 2015c9fcc3c9..3faa895b77b7 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -647,8 +647,7 @@ static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
if (ret)
return ret;

- /* Switch to khz */
- return cppc_perf.nominal_freq * 1000;
+ return cppc_perf.nominal_freq;
}

static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
--
2.34.1


2024-03-13 10:01:25

by Yuan, Perry

Subject: [PATCH v7 2/6] cpufreq:amd-pstate: initialize nominal_freq of each cpudata

Optimize the retrieval of the nominal frequency by using
'cpudata->nominal_freq' instead of repeatedly accessing the cppc_acpi interface.

This allows the nominal frequency to be read directly from the data
cached in the 'cpudata' of each CPU, which improves efficiency and
reduces CPU load.
It also slightly reduces the frequency change latency when the pstate
driver runs in passive mode.

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 3faa895b77b7..6db9256f42c0 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -626,7 +626,7 @@ static int amd_get_max_freq(struct amd_cpudata *cpudata)
if (ret)
return ret;

- nominal_freq = cppc_perf.nominal_freq;
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
nominal_perf = READ_ONCE(cpudata->nominal_perf);
max_perf = READ_ONCE(cpudata->highest_perf);

@@ -661,7 +661,7 @@ static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
if (ret)
return ret;

- nominal_freq = cppc_perf.nominal_freq;
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);
nominal_perf = READ_ONCE(cpudata->nominal_perf);

lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
@@ -855,13 +855,14 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata1;

min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
+ cpudata->nominal_freq = nominal_freq;
+ max_freq = amd_get_max_freq(cpudata);
lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);

- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq < 0 || max_freq < 0 || min_freq > max_freq || nominal_freq == 0) {
+ dev_err(dev, "min_freq(%d) or max_freq(%d) or nominal_freq(%d) is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
@@ -900,7 +901,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
cpudata->min_freq = min_freq;
cpudata->max_limit_freq = max_freq;
cpudata->min_limit_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;

policy->driver_data = cpudata;
@@ -1317,12 +1317,13 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata1;

min_freq = amd_get_min_freq(cpudata);
- max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
+ cpudata->nominal_freq = nominal_freq;
+ max_freq = amd_get_max_freq(cpudata);
lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
- if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
- dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
- min_freq, max_freq);
+ if (min_freq < 0 || max_freq < 0 || min_freq > max_freq || nominal_freq == 0) {
+ dev_err(dev, "min_freq(%d) or max_freq(%d) or nominal_freq(%d) is incorrect\n",
+ min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
@@ -1335,7 +1336,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
/* Initial processor data capability frequencies */
cpudata->max_freq = max_freq;
cpudata->min_freq = min_freq;
- cpudata->nominal_freq = nominal_freq;
cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;

policy->driver_data = cpudata;
--
2.34.1


2024-03-13 10:01:44

by Yuan, Perry

Subject: [PATCH v7 3/6] cpufreq:amd-pstate: get pstate transition delay and latency value from ACPI tables

Make the pstate driver first retrieve the P-state transition delay and latency
values from the BIOS ACPI tables, which have more reasonable delay and latency
values according to the platform design and requirements.

Previously these values were hardcoded to specific values, which may
have conflicted with the platform and might not reflect the most accurate or
optimized setting for the processor.

[054h 0084 8] Preserve Mask : FFFFFFFF00000000
[05Ch 0092 8] Write Mask : 0000000000000001
[064h 0100 4] Command Latency : 00000FA0
[068h 0104 4] Maximum Access Rate : 0000EA60
[06Ch 0108 2] Minimum Turnaround Time : 0000

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 6db9256f42c0..ec6259957d25 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -827,6 +827,36 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}

+/**
+ * Get pstate transition delay time from ACPI tables that firmware set
+ * instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
+{
+ u32 transition_delay_ns;
+
+ transition_delay_ns = cppc_get_transition_latency(cpu);
+ if (transition_delay_ns == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_DELAY;
+
+ return transition_delay_ns / NSEC_PER_USEC;
+}
+
+/**
+ * Get pstate transition latency value from ACPI tables that firmware set
+ * instead of using hardcode value directly.
+ */
+static u32 amd_pstate_get_transition_latency(unsigned int cpu)
+{
+ u32 transition_latency;
+
+ transition_latency = cppc_get_transition_latency(cpu);
+ if (transition_latency == CPUFREQ_ETERNAL)
+ return AMD_PSTATE_TRANSITION_LATENCY;
+
+ return transition_latency;
+}
+
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -867,8 +897,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata1;
}

- policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
- policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
+ policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
+ policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);

policy->min = min_freq;
policy->max = max_freq;
--
2.34.1
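
The fallback logic added by this patch can be sketched in user space as
follows. This is an illustration under stated assumptions, not the
kernel implementation: LATENCY_UNKNOWN stands in for CPUFREQ_ETERNAL,
DEFAULT_DELAY_US stands in for AMD_PSTATE_TRANSITION_DELAY with an
assumed value, and command_latency_to_ns is a hypothetical helper. The
ACPI PCCT expresses Command Latency in microseconds, so the 00000FA0
value in the table dump decodes to 4000 us. How
cppc_get_transition_latency() combines the PCCT fields is not shown in
this thread, so the sketch simply feeds that one value through.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC		1000u
/* Assumed sentinel and default; the kernel defines its own values. */
#define LATENCY_UNKNOWN		UINT32_MAX
#define DEFAULT_DELAY_US	1000u

/* Hypothetical: convert a PCCT Command Latency (us) to nanoseconds. */
static uint32_t command_latency_to_ns(uint32_t latency_us)
{
	/* Guard against overflow of the 32-bit result. */
	assert(latency_us <= UINT32_MAX / NSEC_PER_USEC);

	return latency_us * NSEC_PER_USEC;
}

/* Stand-in for amd_pstate_get_transition_delay_us() from the patch. */
static uint32_t transition_delay_us(uint32_t latency_ns)
{
	if (latency_ns == LATENCY_UNKNOWN)
		return DEFAULT_DELAY_US;	/* firmware gave no usable value */

	return latency_ns / NSEC_PER_USEC;
}
```

With these stand-ins, a firmware latency of 0xFA0 us yields a delay of
4000 us, while an unknown latency falls back to the default.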


2024-03-13 10:01:49

by Yuan, Perry

Subject: [PATCH v7 4/6] cppc_acpi: print error message if CPPC is unsupported

Make it clearer what is wrong with CPPC when the pstate driver, which
depends on the CPPC capabilities, fails to load.

Add one more debug message to notify the user if CPPC is not supported by
the CPU, so that it is easy to find out what needs to be fixed for a
pstate driver loading issue.

[ 0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

The above message is not clear enough to verify whether CPPC is unsupported.

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
---
drivers/acpi/cppc_acpi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..3134101f31b6 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -686,8 +686,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)

if (!osc_sb_cppc2_support_acked) {
pr_debug("CPPC v2 _OSC not acked\n");
- if (!cpc_supported_by_cpu())
+ if (!cpc_supported_by_cpu()) {
+ pr_debug("CPPC is not supported by the CPU\n");
return -ENODEV;
+ }
}

/* Parse the ACPI _CPC table for this CPU. */
--
2.34.1


2024-03-13 10:01:52

by Yuan, Perry

Subject: [PATCH v7 5/6] cpufreq:amd-pstate: add quirk for the pstate CPPC capabilities missing

Add a quirks table to fix CPPC capabilities issues by providing
correct perf or frequency values while the driver is loading.

If the CPPC capabilities are not defined in the ACPI tables, or are
wrongly defined by the platform firmware, a quirk is needed to supply
correct workaround values so that the pstate driver can be loaded
despite the CPPC capabilities errors.

The workaround matches the broken BIOS, which lacks the nominal_freq
and lowest_freq CPPC capabilities definitions in the ACPI table.

$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0

Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
Reviewed-by: Gautham R. Shenoy <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 57 ++++++++++++++++++++++++++++++++++--
include/linux/amd-pstate.h | 6 ++++
2 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index ec6259957d25..59bcdf829c93 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -67,6 +67,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
+static struct quirk_entry *quirks;

/*
* AMD Energy Preference Performance (EPP)
@@ -111,6 +112,41 @@ static unsigned int epp_values[] = {

typedef int (*cppc_mode_transition_fn)(int);

+static struct quirk_entry quirk_amd_7k62 = {
+ .nominal_freq = 2600,
+ .lowest_freq = 550,
+};
+
+static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
+{
+ /**
+ * match the broken bios for family 17h processor support CPPC V2
+ * broken BIOS lack of nominal_freq and lowest_freq capabilities
+ * definition in ACPI tables
+ */
+ if (boot_cpu_has(X86_FEATURE_ZEN2)) {
+ quirks = dmi->driver_data;
+ pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
+ return 1;
+ }
+
+ return 0;
+}
+
+static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
+ {
+ .callback = dmi_matched_7k62_bios_bug,
+ .ident = "AMD EPYC 7K62",
+ .matches = {
+ DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
+ DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
+ },
+ .driver_data = &quirk_amd_7k62,
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
+
static inline int get_mode_idx_from_str(const char *str, size_t size)
{
int i;
@@ -607,13 +643,19 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
static int amd_get_min_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
+ u32 lowest_freq;

int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;

+ if (quirks && quirks->lowest_freq)
+ lowest_freq = quirks->lowest_freq;
+ else
+ lowest_freq = cppc_perf.lowest_freq;
+
/* Switch to khz */
- return cppc_perf.lowest_freq * 1000;
+ return lowest_freq * 1000;
}

static int amd_get_max_freq(struct amd_cpudata *cpudata)
@@ -642,12 +684,18 @@ static int amd_get_max_freq(struct amd_cpudata *cpudata)
static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
+ u32 nominal_freq;

int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;

- return cppc_perf.nominal_freq;
+ if (quirks && quirks->nominal_freq)
+ nominal_freq = quirks->nominal_freq;
+ else
+ nominal_freq = cppc_perf.nominal_freq;
+
+ return nominal_freq;
}

static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
@@ -1685,6 +1733,11 @@ static int __init amd_pstate_init(void)
if (cpufreq_get_current_driver())
return -EEXIST;

+ quirks = NULL;
+
+ /* check if this machine need CPPC quirks */
+ dmi_check_system(amd_pstate_quirks_table);
+
switch (cppc_state) {
case AMD_PSTATE_UNDEFINED:
/* Disable on the following configs by default:
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index d21838835abd..7b2cbb892fd9 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -124,4 +124,10 @@ static const char * const amd_pstate_mode_string[] = {
[AMD_PSTATE_GUIDED] = "guided",
NULL,
};
+
+struct quirk_entry {
+ u32 nominal_freq;
+ u32 lowest_freq;
+};
+
#endif /* _LINUX_AMD_PSTATE_H */
--
2.34.1
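
The override rule this patch introduces (use the quirk value when a
quirk matched and its field is non-zero, otherwise use the firmware
value) can be illustrated in isolation. The sketch below is a user-space
approximation: struct quirk_entry mirrors the one added to
amd-pstate.h, while pick_freq_mhz, nominal_freq_mhz and min_freq_khz are
hypothetical names, and the firmware value is passed in as a plain
argument instead of being read through cppc_get_perf_caps().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct quirk_entry from the patch; 0 means "no override". */
struct quirk_entry {
	uint32_t nominal_freq;	/* MHz */
	uint32_t lowest_freq;	/* MHz */
};

/* Hypothetical helper: prefer the quirk value, else the firmware value. */
static uint32_t pick_freq_mhz(uint32_t quirk_mhz, uint32_t firmware_mhz)
{
	return quirk_mhz ? quirk_mhz : firmware_mhz;
}

/* Nominal frequency stays in MHz, as in amd_get_nominal_freq(). */
static uint32_t nominal_freq_mhz(const struct quirk_entry *quirks,
				 uint32_t firmware_mhz)
{
	return pick_freq_mhz(quirks ? quirks->nominal_freq : 0, firmware_mhz);
}

/* Minimum frequency is converted to kHz, as in amd_get_min_freq(). */
static uint32_t min_freq_khz(const struct quirk_entry *quirks,
			     uint32_t firmware_mhz)
{
	uint32_t mhz = pick_freq_mhz(quirks ? quirks->lowest_freq : 0,
				     firmware_mhz);

	assert(mhz <= UINT32_MAX / 1000);	/* avoid overflow in kHz */

	return mhz * 1000;
}
```

With the 7K62 quirk values from the patch (2600 and 550) and firmware
values of 0, as in the sysfs output above, the helpers return 2600 MHz
and 550000 kHz. With no quirk matched, the firmware values pass through
unchanged.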


2024-03-13 10:02:21

by Yuan, Perry

Subject: [PATCH v7 6/6] cpufreq:amd-pstate: initialize capabilities in amd_pstate_init_perf

Move the initialization of some perf and frequency values related
to cpudata into the amd_pstate_init_perf and cppc_init_perf functions.
This avoids duplicate calls to the cppc_get_perf_caps function.

Signed-off-by: Perry Yuan <[email protected]>
---
drivers/cpufreq/amd-pstate.c | 43 ++++++++++++++----------------------
include/linux/amd-pstate.h | 1 +
2 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 59bcdf829c93..3877d4ecb5d4 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -330,12 +330,18 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
{
u64 cap1;
u32 highest_perf;
+ struct cppc_perf_caps cppc_perf;
+ int ret;

- int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
+ ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
&cap1);
if (ret)
return ret;

+ ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
+ if (ret)
+ return ret;
+
/* For platforms that do not support the preferred core feature, the
* highest_pef may be configured with 166 or 255, to avoid max frequency
* calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
@@ -353,6 +359,9 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
+ WRITE_ONCE(cpudata->lowest_freq, cppc_perf.lowest_freq);
+ WRITE_ONCE(cpudata->nominal_freq, cppc_perf.nominal_freq);
+
return 0;
}

@@ -360,8 +369,9 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
u32 highest_perf;
+ int ret;

- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
+ ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;

@@ -378,6 +388,8 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
+ WRITE_ONCE(cpudata->lowest_freq, cppc_perf.lowest_freq);
+ WRITE_ONCE(cpudata->nominal_freq, cppc_perf.nominal_freq);

if (cppc_state == AMD_PSTATE_ACTIVE)
return 0;
@@ -642,17 +654,12 @@ static void amd_pstate_adjust_perf(unsigned int cpu,

static int amd_get_min_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
u32 lowest_freq;

- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
if (quirks && quirks->lowest_freq)
lowest_freq = quirks->lowest_freq;
else
- lowest_freq = cppc_perf.lowest_freq;
+ lowest_freq = READ_ONCE(cpudata->lowest_freq);

/* Switch to khz */
return lowest_freq * 1000;
@@ -660,14 +667,9 @@ static int amd_get_min_freq(struct amd_cpudata *cpudata)

static int amd_get_max_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
u32 max_perf, max_freq, nominal_freq, nominal_perf;
u64 boost_ratio;

- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
nominal_freq = READ_ONCE(cpudata->nominal_freq);
nominal_perf = READ_ONCE(cpudata->nominal_perf);
max_perf = READ_ONCE(cpudata->highest_perf);
@@ -683,36 +685,25 @@ static int amd_get_max_freq(struct amd_cpudata *cpudata)

static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
u32 nominal_freq;

- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
if (quirks && quirks->nominal_freq)
nominal_freq = quirks->nominal_freq;
else
- nominal_freq = cppc_perf.nominal_freq;
+ nominal_freq = READ_ONCE(cpudata->nominal_freq);

return nominal_freq;
}

static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
{
- struct cppc_perf_caps cppc_perf;
u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
nominal_freq, nominal_perf;
u64 lowest_nonlinear_ratio;

- int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
- if (ret)
- return ret;
-
nominal_freq = READ_ONCE(cpudata->nominal_freq);
nominal_perf = READ_ONCE(cpudata->nominal_perf);
-
- lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
+ lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);

lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
nominal_perf);
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 7b2cbb892fd9..1fbbe75c3dcc 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -88,6 +88,7 @@ struct amd_cpudata {
u32 min_freq;
u32 nominal_freq;
u32 lowest_nonlinear_freq;
+ u32 lowest_freq;

struct amd_aperf_mperf cur;
struct amd_aperf_mperf prev;
--
2.34.1
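
The read-once-then-cache pattern this patch applies can be shown with a
small user-space model. Everything here is a hypothetical stand-in:
read_perf_caps() plays the role of cppc_get_perf_caps() and returns
made-up capability values (400 MHz lowest, 2801 MHz nominal), and the
firmware_reads counter exists only to make the number of firmware
accesses observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct perf_caps {
	uint32_t lowest_freq;	/* MHz */
	uint32_t nominal_freq;	/* MHz */
};

struct cpudata {
	uint32_t lowest_freq;	/* MHz, cached at init */
	uint32_t nominal_freq;	/* MHz, cached at init */
};

/* Counts simulated firmware accesses. */
static unsigned int firmware_reads;

/* Hypothetical stand-in for cppc_get_perf_caps(); values are made up. */
static int read_perf_caps(struct perf_caps *caps)
{
	assert(caps != NULL);

	firmware_reads++;
	caps->lowest_freq = 400;
	caps->nominal_freq = 2801;

	return 0;
}

/* Read the capabilities once and cache them, as the init functions do. */
static int init_perf(struct cpudata *data)
{
	struct perf_caps caps;
	int ret;

	ret = read_perf_caps(&caps);
	if (ret)
		return ret;

	data->lowest_freq = caps.lowest_freq;
	data->nominal_freq = caps.nominal_freq;

	return 0;
}

/* Getters use only the cache and never touch the firmware again. */
static uint32_t get_min_freq_khz(const struct cpudata *data)
{
	return data->lowest_freq * 1000;
}

static uint32_t get_nominal_freq_mhz(const struct cpudata *data)
{
	return data->nominal_freq;
}
```

After one init_perf() call, any number of getter calls leaves
firmware_reads at 1, which is the saving the commit message describes.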


2024-03-14 05:49:24

by Gautham R. Shenoy

Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set

Hello Perry,

On Wed, Mar 13, 2024 at 05:59:13PM +0800, Perry Yuan wrote:
> Address an untested error where the nominal_freq was returned in KHz
> instead of the correct MHz units, this oversight led to a wrong
> nominal_freq set and resued, it will cause the max frequency of core to
> be initialized with a wrong frequency value.

As I had mentioned in my review comment to v6 [1], cpudata->max_freq,
cpudata->min_freq, cpudata->lowest_non_linear_freq are all in
khz. With this patch, cpudata->nominal_freq will be in mhz.

As Dhananjay confirmed [2], this patch breaks the reporting in
/sys/devices/system/cpu/cpufreq/policyX/*_freq as some of them will be
reported in mhz while some others in khz which breaks the expectation
that all these sysfs values should be reported in khz.

[cpufreq]# grep . *freq
amd_pstate_lowest_nonlinear_freq:1804000 <----- in khz
amd_pstate_max_freq:3514000 <----- in khz
cpuinfo_max_freq:2151 <----- in mhz
cpuinfo_min_freq:400000 <----- in khz
scaling_cur_freq:2151 <----- in mhz
scaling_max_freq:2151 <----- in mhz
scaling_min_freq:2151 <----- in mhz
[cpufreq]# pwd
/sys/devices/system/cpu/cpu0/cpufreq

What am I missing ?

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/

>
> Cc: [email protected]
> Fixes: ec437d71db7 ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

--
Thanks and Regards
gautham.


> ---
> drivers/cpufreq/amd-pstate.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 2015c9fcc3c9..3faa895b77b7 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -647,8 +647,7 @@ static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
> if (ret)
> return ret;
>
> - /* Switch to khz */
> - return cppc_perf.nominal_freq * 1000;
> + return cppc_perf.nominal_freq;
> }
>
> static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> --
> 2.34.1
>

2024-03-14 06:15:12

by Yuan, Perry

Subject: RE: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set

[AMD Official Use Only - General]

Hi Gautham

> -----Original Message-----
> From: Shenoy, Gautham Ranjal <[email protected]>
> Sent: Thursday, March 14, 2024 1:49 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Huang, Ray
> <[email protected]>; Petkov, Borislav <[email protected]>;
> Deucher, Alexander <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set
>
> Hello Perry,
>
> On Wed, Mar 13, 2024 at 05:59:13PM +0800, Perry Yuan wrote:
> > Address an untested error where the nominal_freq was returned in KHz
> > instead of the correct MHz units, this oversight led to a wrong
> > nominal_freq set and resued, it will cause the max frequency of core
> > to be initialized with a wrong frequency value.
>
> As I had mentioned in my review comment to v6 [1], cpudata->max_freq,
> cpudata->min_freq, cpudata->lowest_non_linear_freq are all in
> khz. With this patch, cpudata->nominal_freq will be in mhz.
>
> As Dhananjay confirmed [2], this patch breaks the reporting in
> /sys/devices/system/cpu/cpufreq/policyX/*_freq as some of them will be
> reported in mhz while some others in khz which breaks the expectation that all
> these sysfs values should be reported in khz.
>
> [cpufreq]# grep . *freq
> amd_pstate_lowest_nonlinear_freq:1804000 <----- in khz
> amd_pstate_max_freq:3514000 <----- in khz
> cpuinfo_max_freq:2151 <----- in mhz
> cpuinfo_min_freq:400000 <----- in khz
> scaling_cur_freq:2151 <----- in mhz
> scaling_max_freq:2151 <----- in mhz
> scaling_min_freq:2151 <----- in mhz
> [cpufreq]# pwd
> /sys/devices/system/cpu/cpu0/cpufreq
>
> What am I missing ?

https://lore.kernel.org/lkml/42a36c7f788e0fb77d4be7575aab9c937e1773de.1710322310.git.perry.yuan@amd.com/
Changes from v3:
* fix the max frequency value to be KHz when cpb boost disabled(Gautham R. Shenoy)


The previous problem has been resolved by the new patchset that adds cpb boost support:

+ if (on)
+ policy->cpuinfo.max_freq = cpudata->max_freq;
+ else
+ policy->cpuinfo.max_freq = cpudata->nominal_freq * 1000;


The frequency values of cpuinfo are correct on my system.

amd_pstate_lowest_nonlinear_freq:1701000
amd_pstate_max_freq:3501000
cpuinfo_max_freq:3501000
cpuinfo_min_freq:400000
scaling_cur_freq:400000
scaling_max_freq:3501000
scaling_min_freq:400000
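
The unit handling under discussion can be sketched as follows: keep
nominal_freq in MHz internally and convert to kHz at the point where the
value is handed to cpufreq. This is a user-space illustration only. The
struct and function names are hypothetical, and the numbers are taken
from Gautham's report (max 3514000 kHz, nominal 2151 MHz).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the per-CPU data, with units in the names. */
struct freqs {
	uint32_t max_freq_khz;
	uint32_t nominal_freq_mhz;
};

/*
 * Pick the value reported as cpuinfo max frequency, always in kHz:
 * the boosted maximum when boost is on, otherwise the nominal
 * frequency converted from MHz.
 */
static uint32_t cpuinfo_max_freq_khz(const struct freqs *f, bool boost_on)
{
	assert(f != NULL);

	if (boost_on)
		return f->max_freq_khz;

	return f->nominal_freq_mhz * 1000;	/* MHz -> kHz */
}
```

With boost off this returns 2151000 rather than the bare 2151 seen in
the earlier sysfs listing, so every reported value stays in kHz.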

Perry.

>
> [1] https://lore.kernel.org/lkml/ZcRvoYZKdUEjBUHp@BLR-
> 5CG11610CF.amd.com/)
> [2] https://lore.kernel.org/lkml/1aecf2fc-2ea4-46ec-aaf2-
> [email protected]/
>
> >
> > Cc: [email protected]
> > Fixes: ec437d71db7 ("cpufreq: amd-pstate: Introduce a new AMD P-State
> > driver to support future processors")
> > Reviewed-by: Mario Limonciello <[email protected]>
> > Signed-off-by: Perry Yuan <[email protected]>
>
> --
> Thanks and Regards
> gautham.
>
>
> > ---
> > drivers/cpufreq/amd-pstate.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c
> > b/drivers/cpufreq/amd-pstate.c index 2015c9fcc3c9..3faa895b77b7
> 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -647,8 +647,7 @@ static int amd_get_nominal_freq(struct
> amd_cpudata *cpudata)
> > if (ret)
> > return ret;
> >
> > - /* Switch to khz */
> > - return cppc_perf.nominal_freq * 1000;
> > + return cppc_perf.nominal_freq;
> > }
> >
> > static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> > --
> > 2.34.1
> >

2024-03-14 06:24:10

by Gautham R. Shenoy

Subject: Re: [PATCH v7 3/6] cpufreq:amd-pstate: get pstate transition delay and latency value from ACPI tables


On Wed, Mar 13, 2024 at 05:59:15PM +0800, Perry Yuan wrote:
> make pstate driver initially retrieve the P-state transition delay and latency
> values from the BIOS ACPI tables which has more reasonable delay and latency
> values according to the platform design and requirements.
>
> Previously there values were hardcoded at specific value which may
> have conflicted with platform and it might not reflect the most accurate or
> optimized setting for the processor.
>
> [054h 0084 8] Preserve Mask : FFFFFFFF00000000
> [05Ch 0092 8] Write Mask : 0000000000000001
> [064h 0100 4] Command Latency : 00000FA0
> [068h 0104 4] Maximum Access Rate : 0000EA60
> [06Ch 0108 2] Minimum Turnaround Time : 0000
>
> Reviewed-by: Mario Limonciello <[email protected]>
> Signed-off-by: Perry Yuan <[email protected]>

Looks good to me.

Reviewed-by: Gautham R. Shenoy <[email protected]>

> ---
> drivers/cpufreq/amd-pstate.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 6db9256f42c0..ec6259957d25 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -827,6 +827,36 @@ static void amd_pstate_update_limits(unsigned int cpu)
> mutex_unlock(&amd_pstate_driver_lock);
> }
>
> +/**
> + * Get pstate transition delay time from ACPI tables that firmware set
> + * instead of using hardcode value directly.
> + */
> +static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
> +{
> + u32 transition_delay_ns;
> +
> + transition_delay_ns = cppc_get_transition_latency(cpu);
> + if (transition_delay_ns == CPUFREQ_ETERNAL)
> + return AMD_PSTATE_TRANSITION_DELAY;
> +
> + return transition_delay_ns / NSEC_PER_USEC;
> +}
> +
> +/**
> + * Get pstate transition latency value from ACPI tables that firmware set
> + * instead of using hardcode value directly.
> + */
> +static u32 amd_pstate_get_transition_latency(unsigned int cpu)
> +{
> + u32 transition_latency;
> +
> + transition_latency = cppc_get_transition_latency(cpu);
> + if (transition_latency == CPUFREQ_ETERNAL)
> + return AMD_PSTATE_TRANSITION_LATENCY;
> +
> + return transition_latency;
> +}
> +
> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> {
> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@ -867,8 +897,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> goto free_cpudata1;
> }
>
> - policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
> - policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
> + policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
> + policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>
> policy->min = min_freq;
> policy->max = max_freq;
> --
> 2.34.1
>

2024-03-14 06:39:02

by Gautham R. Shenoy

Subject: Re: [PATCH v7 6/6] cpufreq:amd-pstate: initialize capabilities in amd_pstate_init_perf

Hello Perry,

On Wed, Mar 13, 2024 at 05:59:18PM +0800, Perry Yuan wrote:
> Moved the initialization of some perf and frequency values related
> to cpudata to the amd_pstate_init_perf and cppc_init_perf functions.
> It can avoid duplicate calls to cppc_get_perf_caps function.

Does it make sense to fold this into Patch 2 where you are caching the
nominal frequency for later use ?

Otherwise, this patch looks good to me.

--
Thanks and Regards
gautham.

>
> Signed-off-by: Perry Yuan <[email protected]>
> ---
> drivers/cpufreq/amd-pstate.c | 43 ++++++++++++++----------------------
> include/linux/amd-pstate.h | 1 +
> 2 files changed, 18 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 59bcdf829c93..3877d4ecb5d4 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -330,12 +330,18 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
> {
> u64 cap1;
> u32 highest_perf;
> + struct cppc_perf_caps cppc_perf;
> + int ret;
>
> - int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> + ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &cap1);
> if (ret)
> return ret;
>
> + ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> + if (ret)
> + return ret;
> +
> /* For platforms that do not support the preferred core feature, the
> * highest_pef may be configured with 166 or 255, to avoid max frequency
> * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> @@ -353,6 +359,9 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
> WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
> WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
> WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
> + WRITE_ONCE(cpudata->lowest_freq, cppc_perf.lowest_freq);
> + WRITE_ONCE(cpudata->nominal_freq, cppc_perf.nominal_freq);
> +
> return 0;
> }
>
> @@ -360,8 +369,9 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
> {
> struct cppc_perf_caps cppc_perf;
> u32 highest_perf;
> + int ret;
>
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> + ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> if (ret)
> return ret;
>
> @@ -378,6 +388,8 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
> WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
> WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
> WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
> + WRITE_ONCE(cpudata->lowest_freq, cppc_perf.lowest_freq);
> + WRITE_ONCE(cpudata->nominal_freq, cppc_perf.nominal_freq);
>
> if (cppc_state == AMD_PSTATE_ACTIVE)
> return 0;
> @@ -642,17 +654,12 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>
> static int amd_get_min_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> u32 lowest_freq;
>
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> if (quirks && quirks->lowest_freq)
> lowest_freq = quirks->lowest_freq;
> else
> - lowest_freq = cppc_perf.lowest_freq;
> + lowest_freq = READ_ONCE(cpudata->lowest_freq);
>
> /* Switch to khz */
> return lowest_freq * 1000;
> @@ -660,14 +667,9 @@ static int amd_get_min_freq(struct amd_cpudata *cpudata)
>
> static int amd_get_max_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> u32 max_perf, max_freq, nominal_freq, nominal_perf;
> u64 boost_ratio;
>
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> nominal_perf = READ_ONCE(cpudata->nominal_perf);
> max_perf = READ_ONCE(cpudata->highest_perf);
> @@ -683,36 +685,25 @@ static int amd_get_max_freq(struct amd_cpudata *cpudata)
>
> static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> u32 nominal_freq;
>
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> if (quirks && quirks->nominal_freq)
> nominal_freq = quirks->nominal_freq;
> else
> - nominal_freq = cppc_perf.nominal_freq;
> + nominal_freq = READ_ONCE(cpudata->nominal_freq);
>
> return nominal_freq;
> }
>
> static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> {
> - struct cppc_perf_caps cppc_perf;
> u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
> nominal_freq, nominal_perf;
> u64 lowest_nonlinear_ratio;
>
> - int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> - if (ret)
> - return ret;
> -
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> nominal_perf = READ_ONCE(cpudata->nominal_perf);
> -
> - lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
> + lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
>
> lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
> nominal_perf);
> diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
> index 7b2cbb892fd9..1fbbe75c3dcc 100644
> --- a/include/linux/amd-pstate.h
> +++ b/include/linux/amd-pstate.h
> @@ -88,6 +88,7 @@ struct amd_cpudata {
> u32 min_freq;
> u32 nominal_freq;
> u32 lowest_nonlinear_freq;
> + u32 lowest_freq;
>
> struct amd_aperf_mperf cur;
> struct amd_aperf_mperf prev;
> --
> 2.34.1
>

2024-03-14 08:24:57

by Yuan, Perry

Subject: RE: [PATCH v7 6/6] cpufreq:amd-pstate: initialize capabilities in amd_pstate_init_perf

[AMD Official Use Only - General]

Hi Gautham,

> -----Original Message-----
> From: Shenoy, Gautham Ranjal <[email protected]>
> Sent: Thursday, March 14, 2024 2:39 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Huang, Ray
> <[email protected]>; Petkov, Borislav <[email protected]>;
> Deucher, Alexander <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> Li (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v7 6/6] cpufreq:amd-pstate: initialize capabilities in
> amd_pstate_init_perf
>
> Hello Perry,
>
> On Wed, Mar 13, 2024 at 05:59:18PM +0800, Perry Yuan wrote:
> > Moved the initialization of some perf and frequency values related to
> > cpudata to the amd_pstate_init_perf and cppc_init_perf functions.
> > It can avoid duplicate calls to cppc_get_perf_caps function.
>
> Does it make sense to fold this into Patch 2 where you are caching the nominal
> frequency for later use ?
>
> Otherwise, this patch looks good to me.

The nominal perf change has already been reviewed by Mario.
This patch can be reviewed separately, and the whole series can be applied afterwards without functional impact.
Keeping it separate also makes it simpler to see what this one changes.

Thanks for your review efforts!

Perry.

2024-03-14 09:34:13

by Gautham R. Shenoy

Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set

Hello Perry,

On Thu, Mar 14, 2024 at 11:39:20AM +0530, Yuan, Perry wrote:
> [AMD Official Use Only - General]
>
> Hi Gautham
>
> > -----Original Message-----
> > From: Shenoy, Gautham Ranjal <[email protected]>
> > Sent: Thursday, March 14, 2024 1:49 PM
> > To: Yuan, Perry <[email protected]>
> > Cc: [email protected]; Limonciello, Mario
> > <[email protected]>; [email protected]; Huang, Ray
> > <[email protected]>; Petkov, Borislav <[email protected]>;
> > Deucher, Alexander <[email protected]>; Huang, Shimmer
> > <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> > Li (Jassmine) <[email protected]>; [email protected]; linux-
> > [email protected]
> > Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set
> >
> > Hello Perry,
> >
> > On Wed, Mar 13, 2024 at 05:59:13PM +0800, Perry Yuan wrote:
> > > Address an untested error where the nominal_freq was returned in KHz
> > > instead of the correct MHz units, this oversight led to a wrong
> > > nominal_freq set and resued, it will cause the max frequency of core
> > > to be initialized with a wrong frequency value.

What is still not clear from this commit log or the rest of the patch
is, which part of the kernel code expects nominal_freq to be in MHz,
when all the other freqs in cpudata are in KHz units.

If nominal_freq stays in KHz, as it is currently, how does that cause the
max frequency to be initialized to the wrong value? Could you please
elaborate on this?

> >
> > As I had mentioned in my review comment to v6 [1], cpudata->max_freq,
> > cpudata->min_freq, cpudata->lowest_non_linear_freq are all in
> > khz. With this patch, cpudata->nominal_freq will be in mhz.
> >
> > As Dhananjay confirmed [2], this patch breaks the reporting in
> > /sys/devices/system/cpu/cpufreq/policyX/*_freq as some of them will be
> > reported in mhz while some others in khz which breaks the expectation that all
> > these sysfs values should be reported in khz.
> >
> > [cpufreq]# grep . *freq
> > amd_pstate_lowest_nonlinear_freq:1804000 <----- in khz
> > amd_pstate_max_freq:3514000 <----- in khz
> > cpuinfo_max_freq:2151 <----- in mhz
> > cpuinfo_min_freq:400000 <----- in khz
> > scaling_cur_freq:2151 <----- in mhz
> > scaling_max_freq:2151 <----- in mhz
> > scaling_min_freq:2151 <----- in mhz
> > [cpufreq]# pwd
> > /sys/devices/system/cpu/cpu0/cpufreq
> >
> > What am I missing ?
>
> https://lore.kernel.org/lkml/42a36c7f788e0fb77d4be7575aab9c937e1773de.1710322310.git.perry.yuan@amd.com/
> Changes from v3:
> * fix the max frequency value to be KHz when cpb boost disabled(Gautham R. Shenoy)

This CPB boost change assumes that cpudata->nominal_freq is in MHz,
which is not the case until this patch. So is the CPB patchset
dependent on this patch?
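
To make the dependency concrete, here is a minimal user-space sketch of the selection done by the CPB snippet quoted further down in this message. The struct and function names are hypothetical, and the 2801 used in the checks is an assumed nominal value in MHz (roughly the 2.80 GHz from the cover letter); only the "on -> max_freq, off -> nominal_freq * 1000" rule comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two cpudata fields involved. */
struct freq_info {
	uint32_t max_freq_khz;  /* like cpudata->max_freq, always in kHz */
	uint32_t nominal_freq;  /* like cpudata->nominal_freq, unit under discussion */
};

/* Boost on: report max_freq. Boost off: report nominal_freq scaled by 1000. */
static uint32_t cpuinfo_max_freq_khz(const struct freq_info *f, bool boost_on)
{
	return boost_on ? f->max_freq_khz : f->nominal_freq * 1000u;
}

/* Self-check: the result is only in kHz if nominal_freq is stored in MHz. */
void cpb_unit_examples(void)
{
	struct freq_info mhz = { 3501000u, 2801u };     /* nominal in MHz */
	struct freq_info khz = { 3501000u, 2801000u };  /* nominal in kHz */

	assert(cpuinfo_max_freq_khz(&mhz, false) == 2801000u);
	assert(cpuinfo_max_freq_khz(&khz, false) == 2801000000u); /* 1000x too large */
}
```

In other words, the multiplication by 1000 gives a kHz value only when nominal_freq is kept in MHz, which is why the ordering of the two patchsets matters.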

--
Thanks and Regards
gautham.

>
> The previous problem has been resolved by the new patchset of cpb boost support
>
> + if (on)
> + policy->cpuinfo.max_freq = cpudata->max_freq;
> + else
> + policy->cpuinfo.max_freq = cpudata->nominal_freq * 1000;
>
>
> The frequency values of cpuinfo are correct on my system.
>
> amd_pstate_lowest_nonlinear_freq:1701000
> amd_pstate_max_freq:3501000
> cpuinfo_max_freq:3501000
> cpuinfo_min_freq:400000
> scaling_cur_freq:400000
> scaling_max_freq:3501000
> scaling_min_freq:400000
>
> Perry.
>
> >
> > [1] https://lore.kernel.org/lkml/ZcRvoYZKdUEjBUHp@BLR-
> > 5CG11610CF.amd.com/)
> > [2] https://lore.kernel.org/lkml/1aecf2fc-2ea4-46ec-aaf2-
> > [email protected]/
> >
> > >
> > > Cc: [email protected]
> > > Fixes: ec437d71db7 ("cpufreq: amd-pstate: Introduce a new AMD P-State
> > > driver to support future processors")
> > > Reviewed-by: Mario Limonciello <[email protected]>
> > > Signed-off-by: Perry Yuan <[email protected]>
> >
> > --
> > Thanks and Regards
> > gautham.
> >
> >
> > > ---
> > > drivers/cpufreq/amd-pstate.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/cpufreq/amd-pstate.c
> > > b/drivers/cpufreq/amd-pstate.c index 2015c9fcc3c9..3faa895b77b7
> > 100644
> > > --- a/drivers/cpufreq/amd-pstate.c
> > > +++ b/drivers/cpufreq/amd-pstate.c
> > > @@ -647,8 +647,7 @@ static int amd_get_nominal_freq(struct
> > amd_cpudata *cpudata)
> > > if (ret)
> > > return ret;
> > >
> > > - /* Switch to khz */
> > > - return cppc_perf.nominal_freq * 1000;
> > > + return cppc_perf.nominal_freq;
> > > }
> > >
> > > static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
> > > --
> > > 2.34.1
> > >

2024-03-14 10:12:41

by Yuan, Perry

Subject: RE: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set

> -----Original Message-----
> From: Shenoy, Gautham Ranjal <[email protected]>
> Sent: Thursday, March 14, 2024 5:32 PM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Limonciello, Mario
> <[email protected]>; [email protected]; Huang, Ray
> <[email protected]>; Petkov, Borislav <[email protected]>; Deucher,
> Alexander <[email protected]>; Huang, Shimmer
> <[email protected]>; Du, Xiaojian <[email protected]>; Meng, Li
> (Jassmine) <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq value set
>
> Hello Perry,
>
> On Thu, Mar 14, 2024 at 11:39:20AM +0530, Yuan, Perry wrote:
> > [AMD Official Use Only - General]
> >
> > Hi Gautham
> >
> > > -----Original Message-----
> > > From: Shenoy, Gautham Ranjal <[email protected]>
> > > Sent: Thursday, March 14, 2024 1:49 PM
> > > To: Yuan, Perry <[email protected]>
> > > Cc: [email protected]; Limonciello, Mario
> > > <[email protected]>; [email protected]; Huang, Ray
> > > <[email protected]>; Petkov, Borislav <[email protected]>;
> > > Deucher, Alexander <[email protected]>; Huang, Shimmer
> > > <[email protected]>; Du, Xiaojian <[email protected]>; Meng,
> > > Li (Jassmine) <[email protected]>; [email protected]; linux-
> > > [email protected]
> > > Subject: Re: [PATCH v7 1/6] cpufreq:amd-pstate: fix the nominal freq
> > > value set
> > >
> > > Hello Perry,
> > >
> > > On Wed, Mar 13, 2024 at 05:59:13PM +0800, Perry Yuan wrote:
> > > > Address an untested error where the nominal_freq was returned in
> > > > KHz instead of the correct MHz units, this oversight led to a
> > > > wrong nominal_freq set and resued, it will cause the max frequency
> > > > of core to be initialized with a wrong frequency value.
>
> What is still not clear from this commit log or the rest of the patch is, which part
> of the kernel code expects nominal_freq to be in MHz, when all the other freqs in
> cpudata are in KHz units.
>
> If nominal_freq is in KHz as it is currently, how does it cause the max frequency to
> be initialized to the wrong value ? Could you please elaborate this ?

OK, here is the story.
The original capability values are in MHz, as shown below, so the driver needs to initialize nominal_freq
with the value as it is; the pstate driver then calculates the max frequency as needed.

feedback_ctrs:ref:103751311076 del:87445442175
highest_perf:255
lowest_freq:400
lowest_nonlinear_perf:124
lowest_perf:30
nominal_freq:2801
nominal_perf:204
reference_perf:204
wraparound_time:18446744073709551615

The previous driver did not use READ_ONCE(cpudata->nominal_freq) at all.
We now initialize all the freq and perf values in the init functions, as you suggested in the other patchset.
If the driver still stored nominal_freq in KHz, the code below would go wrong, because it multiplies by 1000 again:

nominal_freq = READ_ONCE(cpudata->nominal_freq);
lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;

/* Switch to khz */
return lowest_nonlinear_freq * 1000;


Now we can use READ_ONCE(cpudata->nominal_freq) without reading the CPPC ACPI data again.
nominal_freq must stay in MHz, as it is reported.

Perry.
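
As a worked check of the arithmetic above, the following user-space sketch plugs the capability values listed in this message (nominal_freq 2801, nominal_perf 204, lowest_nonlinear_perf 124, highest_perf 255) into the same fixed-point scaling. The function names are made up for illustration, and CAPACITY_SHIFT is assumed to equal SCHED_CAPACITY_SHIFT (10); it is not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>

#define CAPACITY_SHIFT 10  /* assumed to match SCHED_CAPACITY_SHIFT */

/* Scale nominal_freq by perf / nominal_perf using a fixed-point ratio. */
static uint64_t scale_freq(uint32_t nominal_freq, uint32_t perf,
			   uint32_t nominal_perf)
{
	uint64_t ratio;

	assert(nominal_perf != 0);
	ratio = ((uint64_t)perf << CAPACITY_SHIFT) / nominal_perf;
	return ((uint64_t)nominal_freq * ratio) >> CAPACITY_SHIFT;
}

/* Final "switch to khz" step; only correct if nominal_freq is in MHz. */
static uint64_t scaled_freq_khz(uint32_t nominal_freq, uint32_t perf,
				uint32_t nominal_perf)
{
	return scale_freq(nominal_freq, perf, nominal_perf) * 1000;
}

/* Self-check against the sysfs values reported earlier in the thread. */
void nominal_unit_examples(void)
{
	assert(scaled_freq_khz(2801, 124, 204) == 1701000);  /* lowest nonlinear */
	assert(scaled_freq_khz(2801, 255, 204) == 3501000);  /* max */
}
```

With nominal_freq in MHz the results match amd_pstate_lowest_nonlinear_freq (1701000) and amd_pstate_max_freq (3501000). Feeding 2801000 (kHz) into the same function instead would yield 1701388000, that is, a value about 1000 times too large.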
