2022-02-10 05:43:08

by Bjorn Andersson

Subject: [PATCH] cpufreq: qcom-hw: Add support for per-core-dcvs

The OSM and EPSS hardware controls the frequency of each cluster in the
system based on requests from the OS and various limiting factors, such
as input from LMH.

In most systems the vote from the OS is done using a single register per
cluster, but some systems are configured to instead take one request per
core. In this configuration a set of consecutive registers is used for
the OS to request the frequency of each of the cores within the cluster.
The information is then aggregated in the hardware and the frequency for
the cluster is determined.

As the current implementation ends up only requesting a frequency for
the first core in each cluster, and only the votes of non-idle cores are
considered, the cluster is often clocked (much) lower than expected.

It's possible that there are benefits to performing per-core requests
from the OS, but more investigation of the outcome is needed before
introducing such support. As such, this patch writes the cluster's
request to the register of every core in the cluster.

The weight of the policy's related_cpus is used to determine how many
cores, and hence consecutive registers, each cluster has.
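
As a rough illustration of the write pattern described above, here is a
minimal user-space sketch. It is not the driver code: struct fake_domain,
MAX_CORES and fake_set_perf_state() are hypothetical names, and a plain
array stands in for the memory-mapped registers. It only assumes what the
patch itself shows, namely one 32-bit register per core at consecutive
offsets, with the core count taken from cpumask_weight(related_cpus).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound, chosen only for this sketch. */
#define MAX_CORES 8

/* Hypothetical stand-in for one frequency domain (cluster). */
struct fake_domain {
	uint32_t perf_state[MAX_CORES];	/* one 32-bit request register per core */
	unsigned int num_cores;		/* plays the role of cpumask_weight(related_cpus) */
	int per_core_dcvs;		/* non-zero if the hardware expects per-core votes */
};

/* Request performance state 'index' for the whole cluster. */
static void fake_set_perf_state(struct fake_domain *d, uint32_t index)
{
	unsigned int i;

	assert(d->num_cores >= 1 && d->num_cores <= MAX_CORES);

	/* The first register is always written, as before the patch. */
	d->perf_state[0] = index;

	/* With per-core DCVS, repeat the same vote for the remaining cores. */
	if (d->per_core_dcvs)
		for (i = 1; i < d->num_cores; i++)
			d->perf_state[i] = index;
}
```

Calling fake_set_perf_state() on a four-core domain with per_core_dcvs
set fills the first four entries with the same index; with it cleared,
only entry 0 changes.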

The OS is not permitted to disable the per-core dcvs feature.
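
Since the feature cannot be turned off, the driver only reads the control
register and adapts to what it finds. A minimal sketch of that check,
assuming (as the patch does) that bit 0 of the DCVS control register
signals per-core mode. FAKE_DCVS_CTRL_PER_CORE and
fake_per_core_dcvs_enabled() are hypothetical names; the patch itself
uses the literal 0x1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 0 of the DCVS control register. */
#define FAKE_DCVS_CTRL_PER_CORE 0x1u

/* Return true if the register value reports per-core DCVS as enabled. */
static bool fake_per_core_dcvs_enabled(uint32_t dcvs_ctrl)
{
	return (dcvs_ctrl & FAKE_DCVS_CTRL_PER_CORE) != 0;
}
```

The value passed in would come from a single read of the control
register at policy init; nothing is ever written back to it.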

Signed-off-by: Bjorn Andersson <[email protected]>
---
drivers/cpufreq/qcom-cpufreq-hw.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 05f3d7876e44..8d2b65c782e6 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -28,6 +28,7 @@

 struct qcom_cpufreq_soc_data {
 	u32 reg_enable;
+	u32 reg_dcvs_ctrl;
 	u32 reg_freq_lut;
 	u32 reg_volt_lut;
 	u32 reg_current_vote;
@@ -50,6 +51,8 @@ struct qcom_cpufreq_data {
 	bool cancel_throttle;
 	struct delayed_work throttle_work;
 	struct cpufreq_policy *policy;
+
+	bool per_core_dcvs;
 };

static unsigned long cpu_hw_rate, xo_rate;
@@ -102,9 +105,14 @@ static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy,
 	struct qcom_cpufreq_data *data = policy->driver_data;
 	const struct qcom_cpufreq_soc_data *soc_data = data->soc_data;
 	unsigned long freq = policy->freq_table[index].frequency;
+	unsigned int i;

 	writel_relaxed(index, data->base + soc_data->reg_perf_state);

+	if (data->per_core_dcvs)
+		for (i = 1; i < cpumask_weight(policy->related_cpus); i++)
+			writel_relaxed(index, data->base + soc_data->reg_perf_state + i * 4);
+
 	if (icc_scaling_enabled)
 		qcom_cpufreq_set_bw(policy, freq);

@@ -137,10 +145,15 @@ static unsigned int qcom_cpufreq_hw_fast_switch(struct cpufreq_policy *policy,
 	struct qcom_cpufreq_data *data = policy->driver_data;
 	const struct qcom_cpufreq_soc_data *soc_data = data->soc_data;
 	unsigned int index;
+	unsigned int i;

 	index = policy->cached_resolved_idx;
 	writel_relaxed(index, data->base + soc_data->reg_perf_state);

+	if (data->per_core_dcvs)
+		for (i = 1; i < cpumask_weight(policy->related_cpus); i++)
+			writel_relaxed(index, data->base + soc_data->reg_perf_state + i * 4);
+
 	return policy->freq_table[index].frequency;
 }

@@ -342,6 +355,7 @@ static irqreturn_t qcom_lmh_dcvs_handle_irq(int irq, void *data)

 static const struct qcom_cpufreq_soc_data qcom_soc_data = {
 	.reg_enable = 0x0,
+	.reg_dcvs_ctrl = 0xbc,
 	.reg_freq_lut = 0x110,
 	.reg_volt_lut = 0x114,
 	.reg_current_vote = 0x704,
@@ -351,6 +365,7 @@ static const struct qcom_cpufreq_soc_data qcom_soc_data = {

 static const struct qcom_cpufreq_soc_data epss_soc_data = {
 	.reg_enable = 0x0,
+	.reg_dcvs_ctrl = 0xb0,
 	.reg_freq_lut = 0x100,
 	.reg_volt_lut = 0x200,
 	.reg_perf_state = 0x320,
@@ -481,6 +496,9 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 		goto error;
 	}

+	if (readl_relaxed(base + data->soc_data->reg_dcvs_ctrl) & 0x1)
+		data->per_core_dcvs = true;
+
 	qcom_get_related_cpus(index, policy->cpus);
 	if (!cpumask_weight(policy->cpus)) {
 		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
--
2.33.1



2022-02-24 05:18:58

by Viresh Kumar

Subject: Re: [PATCH] cpufreq: qcom-hw: Add support for per-core-dcvs

On 09-02-22, 21:01, Bjorn Andersson wrote:
> The OSM and EPSS hardware controls the frequency of each cluster in the
> system based on requests from the OS and various limiting factors, such
> as input from LMH.
>
> In most systems the vote from the OS is done using a single register per
> cluster, but some systems are configured to instead take one request per
> core. In this configuration a set of consecutive registers is used for
> the OS to request the frequency of each of the cores within the cluster.
> The information is then aggregated in the hardware and the frequency for
> the cluster is determined.
>
> As the current implementation ends up only requesting a frequency for
> the first core in each cluster, and only the votes of non-idle cores are
> considered, the cluster is often clocked (much) lower than expected.
>
> It's possible that there are benefits to performing per-core requests
> from the OS, but more investigation of the outcome is needed before
> introducing such support. As such, this patch writes the cluster's
> request to the register of every core in the cluster.
>
> The weight of the policy's related_cpus is used to determine how many
> cores, and hence consecutive registers, each cluster has.
>
> The OS is not permitted to disable the per-core dcvs feature.
>
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
> drivers/cpufreq/qcom-cpufreq-hw.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)

Applied. Thanks.

--
viresh