This series adds provision to mark dynamic opps as boost capable and adds
boost frequency support to the scmi cpufreq driver.
Depends on:
HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
Sibi Sankar (3):
OPP: Extend dev_pm_opp_data with turbo support
firmware: arm_scmi: Add support for marking certain frequencies as
boost
cpufreq: scmi: Enable boost support
drivers/cpufreq/scmi-cpufreq.c | 20 +++++++++++++++++++-
drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
drivers/opp/core.c | 1 +
include/linux/pm_opp.h | 1 +
4 files changed, 31 insertions(+), 2 deletions(-)
--
2.34.1
All opps above the sustained level/frequency are treated as boost, so mark
them accordingly.
Suggested-by: Sudeep Holla <[email protected]>
Signed-off-by: Sibi Sankar <[email protected]>
---
drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
index e286f04ee6e3..d3fb8c804b3d 100644
--- a/drivers/firmware/arm_scmi/perf.c
+++ b/drivers/firmware/arm_scmi/perf.c
@@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
struct device *dev, u32 domain)
{
int idx, ret;
- unsigned long freq;
+ unsigned long freq, sustained_freq;
struct dev_pm_opp_data data = {};
struct perf_dom_info *dom;
@@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
if (IS_ERR(dom))
return PTR_ERR(dom);
+ if (!dom->level_indexing_mode)
+ sustained_freq = dom->sustained_perf_level * dom->mult_factor;
+ else
+ sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
+
for (idx = 0; idx < dom->opp_count; idx++) {
if (!dom->level_indexing_mode)
freq = dom->opp[idx].perf * dom->mult_factor;
else
freq = dom->opp[idx].indicative_freq * dom->mult_factor;
+ /* All opps above the sustained level/frequency are treated as boost */
+ if (sustained_freq && freq > sustained_freq)
+ data.turbo = true;
+
data.level = dom->opp[idx].perf;
data.freq = freq;
--
2.34.1
The X1E80100 SoC hosts a number cpu boost frequencies, so let's enable
boost support if the freq_table has any opps marked as turbo in it.
Signed-off-by: Sibi Sankar <[email protected]>
---
drivers/cpufreq/scmi-cpufreq.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index e0aa85764451..4355ec73502e 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -34,6 +34,7 @@ const struct scmi_handle *handle;
static struct scmi_device *scmi_dev;
static struct scmi_protocol_handle *ph;
static const struct scmi_perf_proto_ops *perf_ops;
+static struct cpufreq_driver scmi_cpufreq_driver;
static unsigned int scmi_cpufreq_get_rate(unsigned int cpu)
{
@@ -148,6 +149,12 @@ scmi_get_cpu_power(struct device *cpu_dev, unsigned long *power,
return 0;
}
+static struct freq_attr *scmi_cpufreq_hw_attr[] = {
+ &cpufreq_freq_attr_scaling_available_freqs,
+ NULL,
+ NULL,
+};
+
static int scmi_limit_notify_cb(struct notifier_block *nb, unsigned long event, void *data)
{
unsigned long freq_hz;
@@ -271,6 +278,17 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
policy->fast_switch_possible =
perf_ops->fast_switch_possible(ph, domain);
+ if (policy_has_boost_freq(policy)) {
+ ret = cpufreq_enable_boost_support();
+ if (ret) {
+ dev_warn(cpu_dev, "failed to enable boost: %d\n", ret);
+ goto out_free_opp;
+ } else {
+ scmi_cpufreq_hw_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
+ scmi_cpufreq_driver.boost_enabled = true;
+ }
+ }
+
ret = perf_ops->perf_notify_support(ph, domain, &info);
if (ret)
dev_warn(cpu_dev, "failed to get supported notifications: %d\n", ret);
@@ -348,7 +366,7 @@ static struct cpufreq_driver scmi_cpufreq_driver = {
CPUFREQ_NEED_INITIAL_FREQ_CHECK |
CPUFREQ_IS_COOLING_DEV,
.verify = cpufreq_generic_frequency_table_verify,
- .attr = cpufreq_generic_attr,
+ .attr = scmi_cpufreq_hw_attr,
.target_index = scmi_cpufreq_set_target,
.fast_switch = scmi_cpufreq_fast_switch,
.get = scmi_cpufreq_get_rate,
--
2.34.1
Let's extend the dev_pm_opp_data with a turbo variable, to allow users to
specify if it's a boost frequency for a dynamically added OPP.
Signed-off-by: Sibi Sankar <[email protected]>
---
drivers/opp/core.c | 1 +
include/linux/pm_opp.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index c4e0432ae42a..e233734b7220 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2065,6 +2065,7 @@ int _opp_add_v1(struct opp_table *opp_table, struct device *dev,
/* populate the opp table */
new_opp->rates[0] = data->freq;
new_opp->level = data->level;
+ new_opp->turbo = data->turbo;
tol = u_volt * opp_table->voltage_tolerance_v1 / 100;
new_opp->supplies[0].u_volt = u_volt;
new_opp->supplies[0].u_volt_min = u_volt - tol;
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 76dcb7f37bcd..a08a1fb1ca2a 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -93,6 +93,7 @@ struct dev_pm_opp_config {
* @u_volt: The voltage in uV for the OPP.
*/
struct dev_pm_opp_data {
+ bool turbo;
unsigned int level;
unsigned long freq;
unsigned long u_volt;
--
2.34.1
On 17-01-24, 16:34, Sibi Sankar wrote:
> This series adds provision to mark dynamic opps as boost capable and adds
> boost frequency support to the scmi cpufreq driver.
>
> Depends on:
> HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
> scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
>
> Sibi Sankar (3):
> OPP: Extend dev_pm_opp_data with turbo support
> firmware: arm_scmi: Add support for marking certain frequencies as
> boost
> cpufreq: scmi: Enable boost support
Sudeep, please lemme know if you are okay with the changes. Will apply
them.
--
viresh
On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
> On 17-01-24, 16:34, Sibi Sankar wrote:
> > This series adds provision to mark dynamic opps as boost capable and adds
> > boost frequency support to the scmi cpufreq driver.
> >
> > Depends on:
> > HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
> > scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
@Sudeep, I am going to start review on this series and the related one
these next few days.
Thanks,
Cristian
On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
> On 17-01-24, 16:34, Sibi Sankar wrote:
> > This series adds provision to mark dynamic opps as boost capable and adds
> > boost frequency support to the scmi cpufreq driver.
> >
> > Depends on:
> > HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
> > scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
> >
> > Sibi Sankar (3):
> > OPP: Extend dev_pm_opp_data with turbo support
> > firmware: arm_scmi: Add support for marking certain frequencies as
> > boost
> > cpufreq: scmi: Enable boost support
>
> Sudeep, please lemme know if you are okay with the changes. Will apply
> them.
I was planning to look at it once Lukasz/Dietmar confirm that this concept
doesn't change anything fundamental in the way EAS related changes work
today. I know I suggested the change as that seem to be right way to do
but I haven't analysed if this has any negative impact on the existing
features as this change will impact all the existing platform with OPPs
above sustained performance/frequency advertised from the SCMI platform
firmware.
--
Regards,
Sudeep
On Wed, Jan 17, 2024 at 04:34:42PM +0530, Sibi Sankar wrote:
> All opps above the sustained level/frequency are treated as boost, so mark
> them accordingly.
>
> Suggested-by: Sudeep Holla <[email protected]>
> Signed-off-by: Sibi Sankar <[email protected]>
> ---
> drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
> index e286f04ee6e3..d3fb8c804b3d 100644
> --- a/drivers/firmware/arm_scmi/perf.c
> +++ b/drivers/firmware/arm_scmi/perf.c
> @@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> struct device *dev, u32 domain)
> {
> int idx, ret;
> - unsigned long freq;
> + unsigned long freq, sustained_freq;
> struct dev_pm_opp_data data = {};
> struct perf_dom_info *dom;
>
> @@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> if (IS_ERR(dom))
> return PTR_ERR(dom);
>
> + if (!dom->level_indexing_mode)
> + sustained_freq = dom->sustained_perf_level * dom->mult_factor;
> + else
> + sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
> +
Can't we just use dom->sustained_freq_khz * 1000UL in both the cases ?
Other than that this series looks good to me but it would be good to
explain how you would use this. Since it is enabled by default, do you
plan to disable boost at time and when ? If it is for thermal reasons,
why your other series handling thermal and limits notification from the
firmware not sufficient.
--
Regards,
Sudeep
Hello Sibi,
On 1/17/24 12:04, Sibi Sankar wrote:
> All opps above the sustained level/frequency are treated as boost, so mark
> them accordingly.
>
> Suggested-by: Sudeep Holla <[email protected]>
> Signed-off-by: Sibi Sankar <[email protected]>
> ---
> drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
> index e286f04ee6e3..d3fb8c804b3d 100644
> --- a/drivers/firmware/arm_scmi/perf.c
> +++ b/drivers/firmware/arm_scmi/perf.c
> @@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> struct device *dev, u32 domain)
> {
> int idx, ret;
> - unsigned long freq;
> + unsigned long freq, sustained_freq;
> struct dev_pm_opp_data data = {};
> struct perf_dom_info *dom;
>
> @@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> if (IS_ERR(dom))
> return PTR_ERR(dom);
>
> + if (!dom->level_indexing_mode)
> + sustained_freq = dom->sustained_perf_level * dom->mult_factor;
> + else
> + sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
> +
> for (idx = 0; idx < dom->opp_count; idx++) {
> if (!dom->level_indexing_mode)
> freq = dom->opp[idx].perf * dom->mult_factor;
> else
> freq = dom->opp[idx].indicative_freq * dom->mult_factor;
>
> + /* All opps above the sustained level/frequency are treated as boost */
> + if (sustained_freq && freq > sustained_freq)
It seems the sustained_freq is not optional since SCMI v1.0,
is it necessary to check that (sustained_freq != 0) ?
> + data.turbo = true;
> +
> data.level = dom->opp[idx].perf;
> data.freq = freq;
>
On 23/01/2024 11:15, Sudeep Holla wrote:
> On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
>> On 17-01-24, 16:34, Sibi Sankar wrote:
>>> This series adds provision to mark dynamic opps as boost capable and adds
>>> boost frequency support to the scmi cpufreq driver.
>>>
>>> Depends on:
>>> HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
>>> scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
>>>
>>> Sibi Sankar (3):
>>> OPP: Extend dev_pm_opp_data with turbo support
>>> firmware: arm_scmi: Add support for marking certain frequencies as
>>> boost
>>> cpufreq: scmi: Enable boost support
>>
>> Sudeep, please lemme know if you are okay with the changes. Will apply
>> them.
>
> I was planning to look at it once Lukasz/Dietmar confirm that this concept
> doesn't change anything fundamental in the way EAS related changes work
> today. I know I suggested the change as that seem to be right way to do
> but I haven't analysed if this has any negative impact on the existing
> features as this change will impact all the existing platform with OPPs
> above sustained performance/frequency advertised from the SCMI platform
> firmware.
I was mostly concerned about the settings for the CPU frequency
invariance implementation in [drivers/base/arch_topology.c]:
#define arch_scale_freq_capacity topology_get_freq_scale
But per_cpu(capacity_freq_ref, cpu) is still set to
'policy->cpuinfo.max_freq' in init_cpu_capacity_callback()
which stays the same.
With some extra debugging I get the following on Juno-r0 [L b b L L L]:
root@juno:~# dmesg -w | grep -i "freq\|boost\|noti\|OPP\|cap"
[ 1.768414] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
[ 1.793084] [1][LITTLE_CPU]:: Registered OPP[0] 450000000
[ 1.798624] [1][LITTLE_CPU]:: Registered OPP[1] 575000000
[ 1.804131] [1][LITTLE_CPU]:: Registered OPP[2] 700000000
[ 1.809552] scmi_dvfs_device_opps_add() sustained_freq=700000000 freq=775000000
[ 1.816971] [1][LITTLE_CPU]:: Registered OPP[3] 775000000
[ 1.822392] scmi_dvfs_device_opps_add() sustained_freq=700000000 freq=850000000
[ 1.829800] [1][LITTLE_CPU]:: Registered OPP[4] 850000000
[ 1.835268] enabled boost: 0
[ 1.838173] init_cpu_capacity_callback() cpu=0 max_freq=850000
[ 1.844032] init_cpu_capacity_callback() cpu=3 max_freq=850000
[ 1.849886] init_cpu_capacity_callback() cpu=4 max_freq=850000
[ 1.855743] init_cpu_capacity_callback() cpu=5 max_freq=850000
[ 1.866324] cpufreq_update_pressure() cpu=0 cpufreq_pressure=0
[ 1.872178] cpufreq_update_pressure() cpu=3 cpufreq_pressure=0
[ 1.878026] cpufreq_update_pressure() cpu=4 cpufreq_pressure=0
[ 1.883874] cpufreq_update_pressure() cpu=5 cpufreq_pressure=0
[ 1.890633] [0][BIG_CPU]:: Registered OPP[0] 450000000
[ 1.895892] [0][BIG_CPU]:: Registered OPP[1] 625000000
[ 1.901129] [0][BIG_CPU]:: Registered OPP[2] 800000000
[ 1.906286] scmi_dvfs_device_opps_add() sustained_freq=800000000 freq=950000000
[ 1.906381] [0][BIG_CPU]:: Registered OPP[3] 950000000
[ 1.917377] scmi_dvfs_device_opps_add() sustained_freq=800000000 freq=1100000000
[ 1.917468] [0][BIG_CPU]:: Registered OPP[4] 1100000000
[ 1.939237] enabled boost: 0
[ 1.942134] init_cpu_capacity_callback() cpu=1 max_freq=1100000
[ 1.948078] init_cpu_capacity_callback() cpu=2 max_freq=1100000
[ 1.959003] cpufreq_update_pressure() cpu=1 cpufreq_pressure=0
[ 1.964853] cpufreq_update_pressure() cpu=2 cpufreq_pressure=0
root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
1
0
0
root@juno:/sys/devices/system/cpu/cpufreq# cat policy*/scaling_available_frequencies policy*/scaling_boost_frequencies
450000 575000 700000
450000 625000 800000
775000 850000
950000 1100000
If I disable system-wide boost I see the correct influence on
'cpufreq_pressure':
root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > boost
[ 439.466682] cpufreq_update_pressure() cpu=1 cpufreq_pressure=280
[ 439.472797] cpufreq_update_pressure() cpu=2 cpufreq_pressure=280
[ 439.478889] cpufreq_update_pressure() cpu=0 cpufreq_pressure=79
[ 439.484852] cpufreq_update_pressure() cpu=3 cpufreq_pressure=79
[ 439.490843] cpufreq_update_pressure() cpu=4 cpufreq_pressure=79
[ 439.499621] cpufreq_update_pressure() cpu=5 cpufreq_pressure=79
reflecting the max frequency change from '1100000 to 800000' on CPU1,2
and from '850000 to 700000' on CPU0,3-5.
root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
[ 2722.693113] cpufreq_update_pressure() cpu=1 cpufreq_pressure=0
[ 2722.699041] cpufreq_update_pressure() cpu=2 cpufreq_pressure=0
[ 2722.704962] cpufreq_update_pressure() cpu=0 cpufreq_pressure=0
[ 2722.710842] cpufreq_update_pressure() cpu=3 cpufreq_pressure=0
[ 2722.719644] cpufreq_update_pressure() cpu=4 cpufreq_pressure=0
[ 2722.728224] cpufreq_update_pressure() cpu=5 cpufreq_pressure=0
What doesn't work for me is to disable boost per policy:
root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy0/boost
root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy1/boost
Here I don't see 'cpufreq_pressure' changes.
BTW, what's the use case you have in mind for this feature? Is it to cap
high OPPs for CPUs in a certain CPUfreq policy?
On Wed, Jan 31, 2024 at 03:29:43PM +0100, Pierre Gondois wrote:
> Hello Sibi,
>
> On 1/17/24 12:04, Sibi Sankar wrote:
> > All opps above the sustained level/frequency are treated as boost, so mark
> > them accordingly.
> >
> > Suggested-by: Sudeep Holla <[email protected]>
> > Signed-off-by: Sibi Sankar <[email protected]>
> > ---
> > drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
> > index e286f04ee6e3..d3fb8c804b3d 100644
> > --- a/drivers/firmware/arm_scmi/perf.c
> > +++ b/drivers/firmware/arm_scmi/perf.c
> > @@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> > struct device *dev, u32 domain)
> > {
> > int idx, ret;
> > - unsigned long freq;
> > + unsigned long freq, sustained_freq;
> > struct dev_pm_opp_data data = {};
> > struct perf_dom_info *dom;
> > @@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
> > if (IS_ERR(dom))
> > return PTR_ERR(dom);
> > + if (!dom->level_indexing_mode)
> > + sustained_freq = dom->sustained_perf_level * dom->mult_factor;
> > + else
> > + sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
> > +
> > for (idx = 0; idx < dom->opp_count; idx++) {
> > if (!dom->level_indexing_mode)
> > freq = dom->opp[idx].perf * dom->mult_factor;
> > else
> > freq = dom->opp[idx].indicative_freq * dom->mult_factor;
> > + /* All opps above the sustained level/frequency are treated as boost */
> > + if (sustained_freq && freq > sustained_freq)
>
> It seems the sustained_freq is not optional since SCMI v1.0,
> is it necessary to check that (sustained_freq != 0) ?
>
Technically correct, we don't have to. But since day 1, we checked and
handled 0 for perf_level specifically to avoid division by zero. I am
just worried if there are any platforms in the wild with these values as
0. We can start without the check and add it if someone complains perhaps ?
--
Regards,
Sudeep
On Wed, Jan 31, 2024 at 04:07:48PM +0100, Dietmar Eggemann wrote:
[...]
>
> root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
> 1
> 0
> 0
OK I admit, I wasn't aware of this per policy boost flag.
It must be enabled by default if it has any effect. Otherwise
it breaks the existing behaviour on all the SCMI based platforms.
--
Regards,
Sudeep
On 1/31/24 21:38, Sudeep Holla wrote:
> On Wed, Jan 31, 2024 at 03:29:43PM +0100, Pierre Gondois wrote:
>> Hello Sibi,
>>
>> On 1/17/24 12:04, Sibi Sankar wrote:
>>> All opps above the sustained level/frequency are treated as boost, so mark
>>> them accordingly.
>>>
Sudeep/Pierre,
Thanks for taking time to review the series.
>>> Suggested-by: Sudeep Holla <[email protected]>
>>> Signed-off-by: Sibi Sankar <[email protected]>
>>> ---
>>> drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
>>> index e286f04ee6e3..d3fb8c804b3d 100644
>>> --- a/drivers/firmware/arm_scmi/perf.c
>>> +++ b/drivers/firmware/arm_scmi/perf.c
>>> @@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
>>> struct device *dev, u32 domain)
>>> {
>>> int idx, ret;
>>> - unsigned long freq;
>>> + unsigned long freq, sustained_freq;
>>> struct dev_pm_opp_data data = {};
>>> struct perf_dom_info *dom;
>>> @@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
>>> if (IS_ERR(dom))
>>> return PTR_ERR(dom);
>>> + if (!dom->level_indexing_mode)
>>> + sustained_freq = dom->sustained_perf_level * dom->mult_factor;
>>> + else
>>> + sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
>>> +
>>> for (idx = 0; idx < dom->opp_count; idx++) {
>>> if (!dom->level_indexing_mode)
>>> freq = dom->opp[idx].perf * dom->mult_factor;
>>> else
>>> freq = dom->opp[idx].indicative_freq * dom->mult_factor;
>>> + /* All opps above the sustained level/frequency are treated as boost */
>>> + if (sustained_freq && freq > sustained_freq)
>>
>> It seems the sustained_freq is not optional since SCMI v1.0,
>> is it necessary to check that (sustained_freq != 0) ?
>>
>
> Technically correct, we don't have to. But since day 1, we checked and
> handled 0 for perf_level specifically to avoid division by zero. I am
> just worried if there are any platforms in the wild with these values as
> 0. We can start without the check and add it if someone complains perhaps ?
sure will drop the check in the re-spin.
-Sibi
>
On 1/31/24 16:55, Sudeep Holla wrote:
> On Wed, Jan 17, 2024 at 04:34:42PM +0530, Sibi Sankar wrote:
>> All opps above the sustained level/frequency are treated as boost, so mark
>> them accordingly.
>>
>> Suggested-by: Sudeep Holla <[email protected]>
>> Signed-off-by: Sibi Sankar <[email protected]>
>> ---
>> drivers/firmware/arm_scmi/perf.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/firmware/arm_scmi/perf.c b/drivers/firmware/arm_scmi/perf.c
>> index e286f04ee6e3..d3fb8c804b3d 100644
>> --- a/drivers/firmware/arm_scmi/perf.c
>> +++ b/drivers/firmware/arm_scmi/perf.c
>> @@ -811,7 +811,7 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
>> struct device *dev, u32 domain)
>> {
>> int idx, ret;
>> - unsigned long freq;
>> + unsigned long freq, sustained_freq;
>> struct dev_pm_opp_data data = {};
>> struct perf_dom_info *dom;
>>
>> @@ -819,12 +819,21 @@ static int scmi_dvfs_device_opps_add(const struct scmi_protocol_handle *ph,
>> if (IS_ERR(dom))
>> return PTR_ERR(dom);
>>
>> + if (!dom->level_indexing_mode)
>> + sustained_freq = dom->sustained_perf_level * dom->mult_factor;
>> + else
>> + sustained_freq = dom->sustained_freq_khz * dom->mult_factor;
>> +
>
> Can't we just use dom->sustained_freq_khz * 1000UL in both the cases ?
sure, I retained sustained_perf_level because I wasn't sure how the
sustained_perf_level would be populated in systems not using level
indexing mode.
>
> Other than that this series looks good to me but it would be good to
> explain how you would use this. Since it is enabled by default, do you
> plan to disable boost at time and when ? If it is for thermal reasons,
> why your other series handling thermal and limits notification from the
> firmware not sufficient.
Boost frequencies defined in X1E are achievable by only one CPU in a
cluster i.e. either the other CPUs in the same cluster should be in low
power mode or offline. So it's mostly for book keeping i.e. we wouldn't
to intimate incorrectly that the CPUs are running at max possible
frequency when it's actually running at a lower frequency.
-Sibi
>
On 1/31/24 20:37, Dietmar Eggemann wrote:
> On 23/01/2024 11:15, Sudeep Holla wrote:
>> On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
>>> On 17-01-24, 16:34, Sibi Sankar wrote:
>>>> This series adds provision to mark dynamic opps as boost capable and adds
>>>> boost frequency support to the scmi cpufreq driver.
>>>>
>>>> Depends on:
>>>> HW pressure v4: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
>>>> scmi notification v2: https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
>>>>
>>>> Sibi Sankar (3):
>>>> OPP: Extend dev_pm_opp_data with turbo support
>>>> firmware: arm_scmi: Add support for marking certain frequencies as
>>>> boost
>>>> cpufreq: scmi: Enable boost support
>>>
>>> Sudeep, please lemme know if you are okay with the changes. Will apply
>>> them.
>>
>> I was planning to look at it once Lukasz/Dietmar confirm that this concept
>> doesn't change anything fundamental in the way EAS related changes work
>> today. I know I suggested the change as that seem to be right way to do
>> but I haven't analysed if this has any negative impact on the existing
>> features as this change will impact all the existing platform with OPPs
>> above sustained performance/frequency advertised from the SCMI platform
>> firmware.
>
> I was mostly concerned about the settings for the CPU frequency
> invariance implementation in [drivers/base/arch_topology.c]:
>
> #define arch_scale_freq_capacity topology_get_freq_scale
>
> But per_cpu(capacity_freq_ref, cpu) is still set to
> 'policy->cpuinfo.max_freq' in init_cpu_capacity_callback()
> which stays the same.
>
> With some extra debugging I get the following on Juno-r0 [L b b L L L]:
>
> root@juno:~# dmesg -w | grep -i "freq\|boost\|noti\|OPP\|cap"
>
> [ 1.768414] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
> [ 1.793084] [1][LITTLE_CPU]:: Registered OPP[0] 450000000
> [ 1.798624] [1][LITTLE_CPU]:: Registered OPP[1] 575000000
> [ 1.804131] [1][LITTLE_CPU]:: Registered OPP[2] 700000000
> [ 1.809552] scmi_dvfs_device_opps_add() sustained_freq=700000000 freq=775000000
> [ 1.816971] [1][LITTLE_CPU]:: Registered OPP[3] 775000000
> [ 1.822392] scmi_dvfs_device_opps_add() sustained_freq=700000000 freq=850000000
> [ 1.829800] [1][LITTLE_CPU]:: Registered OPP[4] 850000000
> [ 1.835268] enabled boost: 0
> [ 1.838173] init_cpu_capacity_callback() cpu=0 max_freq=850000
> [ 1.844032] init_cpu_capacity_callback() cpu=3 max_freq=850000
> [ 1.849886] init_cpu_capacity_callback() cpu=4 max_freq=850000
> [ 1.855743] init_cpu_capacity_callback() cpu=5 max_freq=850000
> [ 1.866324] cpufreq_update_pressure() cpu=0 cpufreq_pressure=0
> [ 1.872178] cpufreq_update_pressure() cpu=3 cpufreq_pressure=0
> [ 1.878026] cpufreq_update_pressure() cpu=4 cpufreq_pressure=0
> [ 1.883874] cpufreq_update_pressure() cpu=5 cpufreq_pressure=0
> [ 1.890633] [0][BIG_CPU]:: Registered OPP[0] 450000000
> [ 1.895892] [0][BIG_CPU]:: Registered OPP[1] 625000000
> [ 1.901129] [0][BIG_CPU]:: Registered OPP[2] 800000000
> [ 1.906286] scmi_dvfs_device_opps_add() sustained_freq=800000000 freq=950000000
> [ 1.906381] [0][BIG_CPU]:: Registered OPP[3] 950000000
> [ 1.917377] scmi_dvfs_device_opps_add() sustained_freq=800000000 freq=1100000000
> [ 1.917468] [0][BIG_CPU]:: Registered OPP[4] 1100000000
> [ 1.939237] enabled boost: 0
> [ 1.942134] init_cpu_capacity_callback() cpu=1 max_freq=1100000
> [ 1.948078] init_cpu_capacity_callback() cpu=2 max_freq=1100000
> [ 1.959003] cpufreq_update_pressure() cpu=1 cpufreq_pressure=0
> [ 1.964853] cpufreq_update_pressure() cpu=2 cpufreq_pressure=0
>
> root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
> 1
> 0
> 0
>
> root@juno:/sys/devices/system/cpu/cpufreq# cat policy*/scaling_available_frequencies policy*/scaling_boost_frequencies
> 450000 575000 700000
> 450000 625000 800000
> 775000 850000
> 950000 1100000
>
> If I disable system-wide boost I see the correct influence on
> 'cpufreq_pressure':
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > boost
>
> [ 439.466682] cpufreq_update_pressure() cpu=1 cpufreq_pressure=280
> [ 439.472797] cpufreq_update_pressure() cpu=2 cpufreq_pressure=280
> [ 439.478889] cpufreq_update_pressure() cpu=0 cpufreq_pressure=79
> [ 439.484852] cpufreq_update_pressure() cpu=3 cpufreq_pressure=79
> [ 439.490843] cpufreq_update_pressure() cpu=4 cpufreq_pressure=79
> [ 439.499621] cpufreq_update_pressure() cpu=5 cpufreq_pressure=79
>
> reflecting the max frequency change from '1100000 to 800000' on CPU1,2
> and from '850000 to 700000' on CPU0,3-5.
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
>
> [ 2722.693113] cpufreq_update_pressure() cpu=1 cpufreq_pressure=0
> [ 2722.699041] cpufreq_update_pressure() cpu=2 cpufreq_pressure=0
> [ 2722.704962] cpufreq_update_pressure() cpu=0 cpufreq_pressure=0
> [ 2722.710842] cpufreq_update_pressure() cpu=3 cpufreq_pressure=0
> [ 2722.719644] cpufreq_update_pressure() cpu=4 cpufreq_pressure=0
> [ 2722.728224] cpufreq_update_pressure() cpu=5 cpufreq_pressure=0
>
> What doesn't work for me is to disable boost per policy:
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy0/boost
> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy1/boost
>
> Here I don't see 'cpufreq_pressure' changes.
>
> BTW, what's the use case you have in mind for this feature? Is it to cap
> high OPPs for CPUs in a certain CPUfreq policy?
Yeah, that's exactly the use case for X1E. Boost frequencies defined in
the SoC are achievable by only one CPU in a cluster i.e. either the
other CPUs in the same cluster should be in low power mode or offline.
So it's mostly for book keeping i.e. we wouldn't to intimate incorrectly
that the CPUs are running at max possible frequency when it's actually
running at a lower frequency.
-Sibi
>
>
On 13/02/2024 08:35, Sibi Sankar wrote:
>
>
> On 1/31/24 20:37, Dietmar Eggemann wrote:
>> On 23/01/2024 11:15, Sudeep Holla wrote:
>>> On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
>>>> On 17-01-24, 16:34, Sibi Sankar wrote:
[...]
>> root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
>> 1
>> 0
>> 0
>>
>> root@juno:/sys/devices/system/cpu/cpufreq# cat
>> policy*/scaling_available_frequencies policy*/scaling_boost_frequencies
>> 450000 575000 700000
>> 450000 625000 800000
>> 775000 850000
>> 950000 1100000
>>
>> If I disable system-wide boost I see the correct influence on
>> 'cpufreq_pressure':
>>
>> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > boost
>>
>> [ 439.466682] cpufreq_update_pressure() cpu=1 cpufreq_pressure=280
>> [ 439.472797] cpufreq_update_pressure() cpu=2 cpufreq_pressure=280
>> [ 439.478889] cpufreq_update_pressure() cpu=0 cpufreq_pressure=79
>> [ 439.484852] cpufreq_update_pressure() cpu=3 cpufreq_pressure=79
>> [ 439.490843] cpufreq_update_pressure() cpu=4 cpufreq_pressure=79
>> [ 439.499621] cpufreq_update_pressure() cpu=5 cpufreq_pressure=79
>>
>> reflecting the max frequency change from '1100000 to 800000' on CPU1,2
>> and from '850000 to 700000' on CPU0,3-5.
>>
>> root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
>>
>> [ 2722.693113] cpufreq_update_pressure() cpu=1 cpufreq_pressure=0
>> [ 2722.699041] cpufreq_update_pressure() cpu=2 cpufreq_pressure=0
>> [ 2722.704962] cpufreq_update_pressure() cpu=0 cpufreq_pressure=0
>> [ 2722.710842] cpufreq_update_pressure() cpu=3 cpufreq_pressure=0
>> [ 2722.719644] cpufreq_update_pressure() cpu=4 cpufreq_pressure=0
>> [ 2722.728224] cpufreq_update_pressure() cpu=5 cpufreq_pressure=0
>>
>> What doesn't work for me is to disable boost per policy:
>>
>> root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
>> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy0/boost
>> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy1/boost
>>
>> Here I don't see 'cpufreq_pressure' changes.
>>
>> BTW, what's the use case you have in mind for this feature? Is it to cap
>> high OPPs for CPUs in a certain CPUfreq policy?
>
> Yeah, that's exactly the use case for X1E. Boost frequencies defined in
> the SoC are achievable by only one CPU in a cluster i.e. either the
> other CPUs in the same cluster should be in low power mode or offline.
> So it's mostly for book keeping i.e. we wouldn't to intimate incorrectly
> that the CPUs are running at max possible frequency when it's actually
> running at a lower frequency.
I see.
What about the issue with the settings of the global and the per-policy
'boost' file?
On my Juno-r0 the initial boost values are:
(1) Initial setting:
root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
1
0
0
Should they not all be 1 ?
(2) Disabling system-wide boost
root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > boost
Here I see 'cpufreq_pressure > 0' for all CPUs.
(3) Enabling system-wide boost
root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
And here 'cpufreq_pressure == 0' for all CPUs.
(4) Disabling boost for policy0.
root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy0/boost
root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
1
0
1
Here nothing happened. But I was expecting to see 'cpufreq_pressure > 0'
for CPUs of policy0:
root@juno:/sys/devices/system/cpu/cpufreq# cat policy0/affected_cpus
0 3 4 5
On 1/17/24 11:04, Sibi Sankar wrote:
> Let's extend the dev_pm_opp_data with a turbo variable, to allow users to
> specify if it's a boost frequency for a dynamically added OPP.
>
> Signed-off-by: Sibi Sankar <[email protected]>
> ---
> drivers/opp/core.c | 1 +
> include/linux/pm_opp.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index c4e0432ae42a..e233734b7220 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -2065,6 +2065,7 @@ int _opp_add_v1(struct opp_table *opp_table, struct device *dev,
> /* populate the opp table */
> new_opp->rates[0] = data->freq;
> new_opp->level = data->level;
> + new_opp->turbo = data->turbo;
> tol = u_volt * opp_table->voltage_tolerance_v1 / 100;
> new_opp->supplies[0].u_volt = u_volt;
> new_opp->supplies[0].u_volt_min = u_volt - tol;
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 76dcb7f37bcd..a08a1fb1ca2a 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -93,6 +93,7 @@ struct dev_pm_opp_config {
> * @u_volt: The voltage in uV for the OPP.
> */
> struct dev_pm_opp_data {
> + bool turbo;
Please add description of that new field, like other
fields in the comment above.
> unsigned int level;
> unsigned long freq;
> unsigned long u_volt;
On 2/15/24 20:27, Dietmar Eggemann wrote:
> On 13/02/2024 08:35, Sibi Sankar wrote:
>>
>>
>> On 1/31/24 20:37, Dietmar Eggemann wrote:
>>> On 23/01/2024 11:15, Sudeep Holla wrote:
>>>> On Tue, Jan 23, 2024 at 11:38:27AM +0530, Viresh Kumar wrote:
>>>>> On 17-01-24, 16:34, Sibi Sankar wrote:
>
> [...]
>
[...]
>>> BTW, what's the use case you have in mind for this feature? Is it to cap
>>> high OPPs for CPUs in a certain CPUfreq policy?
>>
>> Yeah, that's exactly the use case for X1E. Boost frequencies defined in
>> the SoC are achievable by only one CPU in a cluster i.e. either the
>> other CPUs in the same cluster should be in low power mode or offline.
>> So it's mostly for book keeping i.e. we wouldn't to intimate incorrectly
>> that the CPUs are running at max possible frequency when it's actually
>> running at a lower frequency.
>
> I see.
>
> What about the issue with the settings of the global and the per-policy
> 'boost' file?
>
> On my Juno-r0 the initial boost values are:
>
> (1) Initial setting:
>
> root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
> 1
> 0
> 0
>
> Should they not all be 1 ?
>
>
> (2) Disabling system-wide boost
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > boost
>
> Here I see 'cpufreq_pressure > 0' for all CPUs.
>
>
> (3) Enabling system-wide boost
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 1 > boost
>
> And here 'cpufreq_pressure == 0' for all CPUs.
>
>
> (4) Disabling boost for policy0.
>
> root@juno:/sys/devices/system/cpu/cpufreq# echo 0 > policy0/boost
>
> root@juno:/sys/devices/system/cpu/cpufreq# cat boost policy*/boost
> 1
> 0
> 1
>
> Here nothing happened. But I was expecting to see 'cpufreq_pressure > 0'
> for CPUs of policy0:
>
https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]/
Finally got some time to fix this, I've posted out the fix and re-spun
the series as well. This should fix the default values of per-policy
boost flags as well.
-Sibi
> root@juno:/sys/devices/system/cpu/cpufreq# cat policy0/affected_cpus
> 0 3 4 5