2024-02-26 06:52:10

by Nick Hu

Subject: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

When all CPUs in the same cluster are in the idle state, the kernel
may put the cluster into a deeper low-power state. Call
cpu_cluster_pm_enter() before entering the low-power state and
cpu_cluster_pm_exit() after the cluster has woken up.

Signed-off-by: Nick Hu <[email protected]>
---
 drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
index e8094fc92491..298dc76a00cf 100644
--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
+++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
@@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 {
 	struct genpd_power_state *state = &pd->states[pd->state_idx];
 	u32 *pd_state;
+	int ret;
 
 	if (!state->data)
 		return 0;
@@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 	if (!sbi_cpuidle_pd_allow_domain_state)
 		return -EBUSY;
 
+	ret = cpu_cluster_pm_enter();
+	if (ret)
+		return ret;
+
 	/* OSI mode is enabled, set the corresponding domain state. */
 	pd_state = state->data;
 	sbi_set_domain_state(*pd_state);
@@ -408,6 +413,19 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 	return 0;
 }
 
+static int sbi_cpuidle_pd_power_on(struct generic_pm_domain *pd)
+{
+	struct genpd_power_state *state = &pd->states[pd->state_idx];
+
+	if (!state->data)
+		return 0;
+
+	if (!sbi_cpuidle_pd_allow_domain_state)
+		return -EBUSY;
+
+	return cpu_cluster_pm_exit();
+}
+
 struct sbi_pd_provider {
 	struct list_head link;
 	struct device_node *node;
@@ -433,10 +451,12 @@ static int sbi_pd_init(struct device_node *np)
 	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
 
 	/* Allow power off when OSI is available. */
-	if (sbi_cpuidle_use_osi)
+	if (sbi_cpuidle_use_osi) {
 		pd->power_off = sbi_cpuidle_pd_power_off;
-	else
+		pd->power_on = sbi_cpuidle_pd_power_on;
+	} else {
 		pd->flags |= GENPD_FLAG_ALWAYS_ON;
+	}
 
 	/* Use governor for CPU PM domains if it has some states to manage. */
 	pd_gov = pd->states ? &pm_domain_cpu_gov : NULL;
--
2.34.1



Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <[email protected]>:

On Mon, 26 Feb 2024 14:51:13 +0800 you wrote:
> When the cpus in the same cluster are all in the idle state, the kernel
> might put the cluster into a deeper low power state. Call the
> cluster_pm_enter() before entering the low power state and call the
> cluster_pm_exit() after the cluster woken up.
>
> Signed-off-by: Nick Hu <[email protected]>
>
> [...]

Here is the summary with links:
- cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()
https://git.kernel.org/riscv/c/3fd665f2854d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2024-04-29 14:33:08

by Ulf Hansson

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
>
> When the cpus in the same cluster are all in the idle state, the kernel
> might put the cluster into a deeper low power state. Call the
> cluster_pm_enter() before entering the low power state and call the
> cluster_pm_exit() after the cluster woken up.
>
> Signed-off-by: Nick Hu <[email protected]>

I was not cc'ed on this patch, but noticed that it got queued up
recently. Sorry for not noticing earlier.

If not too late, can you please drop/revert it? We should really move
away from the CPU cluster notifiers. See more information below.

> ---
> drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> index e8094fc92491..298dc76a00cf 100644
> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> {
> struct genpd_power_state *state = &pd->states[pd->state_idx];
> u32 *pd_state;
> + int ret;
>
> if (!state->data)
> return 0;
> @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> if (!sbi_cpuidle_pd_allow_domain_state)
> return -EBUSY;
>
> + ret = cpu_cluster_pm_enter();
> + if (ret)
> + return ret;

Rather than using the CPU cluster notifiers, consumers of the genpd
can register themselves to receive genpd on/off notifiers.

In other words, none of this should be needed, right?
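
For reference, a minimal sketch of the suggested approach (hypothetical
consumer code; the callback and notifier names are illustrative and
error handling is elided):

#include <linux/notifier.h>
#include <linux/pm_domain.h>

/* Called by genpd when the attached power domain changes state. */
static int my_genpd_notifier(struct notifier_block *nb,
			     unsigned long action, void *data)
{
	switch (action) {
	case GENPD_NOTIFY_PRE_OFF:
		/* The domain is about to power off: save state here. */
		break;
	case GENPD_NOTIFY_ON:
		/* The domain has powered back on: restore state here. */
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block my_genpd_nb = {
	.notifier_call = my_genpd_notifier,
};

/* In the consumer's probe, with "dev" attached to the genpd: */
ret = dev_pm_genpd_add_notifier(dev, &my_genpd_nb);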

[...]

Kind regards
Uffe

2024-04-29 16:26:31

by Nick Hu

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
>
> Hi Ulf
>
> On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> >
> > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > >
> > > When the cpus in the same cluster are all in the idle state, the kernel
> > > might put the cluster into a deeper low power state. Call the
> > > cluster_pm_enter() before entering the low power state and call the
> > > cluster_pm_exit() after the cluster woken up.
> > >
> > > Signed-off-by: Nick Hu <[email protected]>
> >
> > I was not cced this patch, but noticed that this patch got queued up
> > recently. Sorry for not noticing earlier.
> >
> > If not too late, can you please drop/revert it? We should really move
> > away from the CPU cluster notifiers. See more information below.
> >
> > > ---
> > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > index e8094fc92491..298dc76a00cf 100644
> > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > {
> > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > u32 *pd_state;
> > > + int ret;
> > >
> > > if (!state->data)
> > > return 0;
> > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > return -EBUSY;
> > >
> > > + ret = cpu_cluster_pm_enter();
> > > + if (ret)
> > > + return ret;
> >
> > Rather than using the CPU cluster notifiers, consumers of the genpd
> > can register themselves to receive genpd on/off notifiers.
> >
> > In other words, none of this should be needed, right?
> >
> Thanks for the feedback!
> Maybe I miss something, I'm wondering about a case like below:
> If we have a shared L2 cache controller inside the cpu cluster power
> domain and we add this controller to be a consumer of the power
> domain, Shouldn't the genpd invoke the domain idle only after the
> shared L2 cache controller is suspended?
> Is there a way that we can put the L2 cache down while all cpus in the
> same cluster are idle?
> > [...]
Sorry, I made a mistake in my second question.
Here is the updated question:
Is there a way to save the L2 cache state while all CPUs in the
same cluster are idle and the cluster can be powered down?
> >
> > Kind regards
> > Uffe

2024-04-29 16:33:39

by Nick Hu

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hi Ulf

On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
>
> On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> >
> > When the cpus in the same cluster are all in the idle state, the kernel
> > might put the cluster into a deeper low power state. Call the
> > cluster_pm_enter() before entering the low power state and call the
> > cluster_pm_exit() after the cluster woken up.
> >
> > Signed-off-by: Nick Hu <[email protected]>
>
> I was not cced this patch, but noticed that this patch got queued up
> recently. Sorry for not noticing earlier.
>
> If not too late, can you please drop/revert it? We should really move
> away from the CPU cluster notifiers. See more information below.
>
> > ---
> > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > 1 file changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > index e8094fc92491..298dc76a00cf 100644
> > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > {
> > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > u32 *pd_state;
> > + int ret;
> >
> > if (!state->data)
> > return 0;
> > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > if (!sbi_cpuidle_pd_allow_domain_state)
> > return -EBUSY;
> >
> > + ret = cpu_cluster_pm_enter();
> > + if (ret)
> > + return ret;
>
> Rather than using the CPU cluster notifiers, consumers of the genpd
> can register themselves to receive genpd on/off notifiers.
>
> In other words, none of this should be needed, right?
>
Thanks for the feedback!
Maybe I'm missing something, but I'm wondering about a case like the
one below: if we have a shared L2 cache controller inside the CPU
cluster power domain and we make this controller a consumer of the
power domain, shouldn't the genpd invoke the domain idle state only
after the shared L2 cache controller is suspended?
Is there a way to power down the L2 cache while all CPUs in the
same cluster are idle?
> [...]
>
> Kind regards
> Uffe

2024-04-29 21:00:02

by Palmer Dabbelt

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Mon, 29 Apr 2024 07:32:12 PDT (-0700), [email protected] wrote:
> On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
>>
>> When the cpus in the same cluster are all in the idle state, the kernel
>> might put the cluster into a deeper low power state. Call the
>> cluster_pm_enter() before entering the low power state and call the
>> cluster_pm_exit() after the cluster woken up.
>>
>> Signed-off-by: Nick Hu <[email protected]>
>
> I was not cced this patch, but noticed that this patch got queued up
> recently. Sorry for not noticing earlier.
>
> If not too late, can you please drop/revert it? We should really move
> away from the CPU cluster notifiers. See more information below.

Sorry about that, I'll toss it. I'm testing some other stuff right
now, so it might miss today's linux-next.

>> ---
>> drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
>> 1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
>> index e8094fc92491..298dc76a00cf 100644
>> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
>> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
>> @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
>> {
>> struct genpd_power_state *state = &pd->states[pd->state_idx];
>> u32 *pd_state;
>> + int ret;
>>
>> if (!state->data)
>> return 0;
>> @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
>> if (!sbi_cpuidle_pd_allow_domain_state)
>> return -EBUSY;
>>
>> + ret = cpu_cluster_pm_enter();
>> + if (ret)
>> + return ret;
>
> Rather than using the CPU cluster notifiers, consumers of the genpd
> can register themselves to receive genpd on/off notifiers.
>
> In other words, none of this should be needed, right?
>
> [...]
>
> Kind regards
> Uffe

2024-04-30 08:13:48

by Ulf Hansson

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
>
> On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> >
> > Hi Ulf
> >
> > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > >
> > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > >
> > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > might put the cluster into a deeper low power state. Call the
> > > > cluster_pm_enter() before entering the low power state and call the
> > > > cluster_pm_exit() after the cluster woken up.
> > > >
> > > > Signed-off-by: Nick Hu <[email protected]>
> > >
> > > I was not cced this patch, but noticed that this patch got queued up
> > > recently. Sorry for not noticing earlier.
> > >
> > > If not too late, can you please drop/revert it? We should really move
> > > away from the CPU cluster notifiers. See more information below.
> > >
> > > > ---
> > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > index e8094fc92491..298dc76a00cf 100644
> > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > {
> > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > u32 *pd_state;
> > > > + int ret;
> > > >
> > > > if (!state->data)
> > > > return 0;
> > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > return -EBUSY;
> > > >
> > > > + ret = cpu_cluster_pm_enter();
> > > > + if (ret)
> > > > + return ret;
> > >
> > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > can register themselves to receive genpd on/off notifiers.
> > >
> > > In other words, none of this should be needed, right?
> > >
> > Thanks for the feedback!
> > Maybe I miss something, I'm wondering about a case like below:
> > If we have a shared L2 cache controller inside the cpu cluster power
> > domain and we add this controller to be a consumer of the power
> > domain, Shouldn't the genpd invoke the domain idle only after the
> > shared L2 cache controller is suspended?
> > Is there a way that we can put the L2 cache down while all cpus in the
> > same cluster are idle?
> > > [...]
> Sorry, I made some mistake in my second question.
> Update the question here:
> Is there a way that we can save the L2 cache states while all cpus in the
> same cluster are idle and the cluster could be powered down?

If the L2 cache is a consumer of the cluster, the consumer driver for
the L2 cache should register for genpd on/off notifiers.

The device representing the L2 cache needs to be enabled for runtime
PM, to be taken into account correctly by the cluster genpd. In this
case, the device should most likely remain runtime suspended, but
instead rely on the genpd on/off notifiers to understand when
save/restore of the cache states should be done.
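
As a rough sketch, a consumer driver for the L2 cache could do
something like this in its probe (hypothetical names; assumes the
device references the cluster domain via a power-domains phandle and
that l2_genpd_nb is a notifier_block handling GENPD_NOTIFY_PRE_OFF
and GENPD_NOTIFY_ON):

static int l2_cache_probe(struct platform_device *pdev)
{
	int ret;

	/* Enable runtime PM so the cluster genpd accounts for this
	 * device; it can remain runtime suspended the whole time. */
	pm_runtime_enable(&pdev->dev);

	/* Save/restore of the cache state is driven by the genpd
	 * on/off notifications, not by runtime PM callbacks. */
	ret = dev_pm_genpd_add_notifier(&pdev->dev, &l2_genpd_nb);
	if (ret)
		pm_runtime_disable(&pdev->dev);

	return ret;
}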

Kind regards
Uffe

2024-05-14 10:32:05

by Nick Hu

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hi Ulf,

Thank you for your valuable suggestion.
I sincerely apologize for the delay in responding to your message. We
have diligently worked on experimenting with the suggestion you
provided.

As per your recommendation, we have incorporated the "power-domains=<>
property" into the consumer's node, resulting in modifications to the
DTS as illustrated below:

cpus {
	...
	domain-idle-states {
		CLUSTER_SLEEP: cluster-sleep {
			compatible = "domain-idle-state";
			...
		};
	};
	power-domains {
		...
		CLUSTER_PD: clusterpd {
			domain-idle-states = <&CLUSTER_SLEEP>;
		};
	};
};

soc {
	deviceA@xxx {
		...
		power-domains = <&CLUSTER_PD>;
		...
	};
};

However, this adjustment leads to an issue where the probe of
'deviceA' is deferred by device_links_check_suppliers() within
really_probe(). As a workaround, we experimented with adding
status = "disabled" to the 'CLUSTER_PD' node, to prevent the
creation of a device link between 'deviceA' and 'CLUSTER_PD'.
Nevertheless, we remain uncertain whether this is an appropriate
solution.

Do you have suggestions on how to effectively address this issue?

Regards,
Nick

On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <[email protected]> wrote:
>
> On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
> >
> > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> > >
> > > Hi Ulf
> > >
> > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > > >
> > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > > >
> > > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > > might put the cluster into a deeper low power state. Call the
> > > > > cluster_pm_enter() before entering the low power state and call the
> > > > > cluster_pm_exit() after the cluster woken up.
> > > > >
> > > > > Signed-off-by: Nick Hu <[email protected]>
> > > >
> > > > I was not cced this patch, but noticed that this patch got queued up
> > > > recently. Sorry for not noticing earlier.
> > > >
> > > > If not too late, can you please drop/revert it? We should really move
> > > > away from the CPU cluster notifiers. See more information below.
> > > >
> > > > > ---
> > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > index e8094fc92491..298dc76a00cf 100644
> > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > {
> > > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > > u32 *pd_state;
> > > > > + int ret;
> > > > >
> > > > > if (!state->data)
> > > > > return 0;
> > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > > return -EBUSY;
> > > > >
> > > > > + ret = cpu_cluster_pm_enter();
> > > > > + if (ret)
> > > > > + return ret;
> > > >
> > > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > > can register themselves to receive genpd on/off notifiers.
> > > >
> > > > In other words, none of this should be needed, right?
> > > >
> > > Thanks for the feedback!
> > > Maybe I miss something, I'm wondering about a case like below:
> > > If we have a shared L2 cache controller inside the cpu cluster power
> > > domain and we add this controller to be a consumer of the power
> > > domain, Shouldn't the genpd invoke the domain idle only after the
> > > shared L2 cache controller is suspended?
> > > Is there a way that we can put the L2 cache down while all cpus in the
> > > same cluster are idle?
> > > > [...]
> > Sorry, I made some mistake in my second question.
> > Update the question here:
> > Is there a way that we can save the L2 cache states while all cpus in the
> > same cluster are idle and the cluster could be powered down?
>
> If the L2 cache is a consumer of the cluster, the consumer driver for
> the L2 cache should register for genpd on/off notifiers.
>
> The device representing the L2 cache needs to be enabled for runtime
> PM, to be taken into account correctly by the cluster genpd. In this
> case, the device should most likely remain runtime suspended, but
> instead rely on the genpd on/off notifiers to understand when
> save/restore of the cache states should be done.
>
> Kind regards
> Uffe

2024-05-14 14:24:20

by Anup Patel

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hi Nick,

On Tue, May 14, 2024 at 3:20 PM Nick Hu <[email protected]> wrote:
>
> Hi Ulf,
>
> Thank you for your valuable suggestion.
> I sincerely apologize for the delay in responding to your message. We
> have diligently worked on experimenting with the suggestion you
> provided.
>
> As per your recommendation, we have incorporated the "power-domains=<>
> property" into the consumer's node, resulting in modifications to the
> DTS as illustrated below:
>
> cpus {
> ...
> domain-idle-states {
> CLUSTER_SLEEP:cluster-sleep {
> compatible = "domain-idle-state";
> ...
> }
> }
> power-domains {
> ...
> ...
> CLUSTER_PD: clusterpd {
> domain-idle-states = <&CLUSTER_SLEEP>;
> };
> }
> }
> soc {
> deviceA@xxx{
> ...
> power-domains = <&CLUSTER_PD>;
> ...
> }
> }
>
> However, this adjustment has led to an issue where the probe for
> 'deviceA' is deferred by 'device_links_check_suppliers()' within
> 'really_probe()'. In an attempt to mitigate this issue, we
> experimented with a workaround by adding the attribute
> "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to
> prevent the creation of a device link between 'deviceA' and
> 'CLUSTER_PD'. Nevertheless, we remain uncertain about the
> appropriateness of this solution.
>
> Do you have suggestions on how to effectively address this issue?

I totally missed this email since I was not CC'ed, sorry about that.
Please use get_maintainers.pl when sending patches.

The genpd_add_provider() (called by of_genpd_add_provider_simple())
does mark the power-domain DT node as initialized
(fwnode_dev_initialized()), so after the cpuidle-riscv-sbi driver is
probed, the 'deviceA' dependency is resolved and 'deviceA' should be
probed unless there are other unmet dependencies.

Try adding "#define DEBUG" before all includes in drivers/base/core.c
and add "loglevel=8" to the kernel parameters; this will print the
producer-consumer linkage of all devices.
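
Concretely, that debug recipe is (a sketch; file path as in the
mainline tree):

/* At the very top of drivers/base/core.c, before any #include: */
#define DEBUG

plus the extra kernel command-line parameter:

loglevel=8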

Marking the power-domain DT node as "disabled" is certainly not the
right way.

Regards,
Anup

>
> Regards,
> Nick
>
> On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <[email protected]> wrote:
> >
> > On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
> > >
> > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> > > >
> > > > Hi Ulf
> > > >
> > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > > > >
> > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > > > >
> > > > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > > > might put the cluster into a deeper low power state. Call the
> > > > > > cluster_pm_enter() before entering the low power state and call the
> > > > > > cluster_pm_exit() after the cluster woken up.
> > > > > >
> > > > > > Signed-off-by: Nick Hu <[email protected]>
> > > > >
> > > > > I was not cced this patch, but noticed that this patch got queued up
> > > > > recently. Sorry for not noticing earlier.
> > > > >
> > > > > If not too late, can you please drop/revert it? We should really move
> > > > > away from the CPU cluster notifiers. See more information below.
> > > > >
> > > > > > ---
> > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > index e8094fc92491..298dc76a00cf 100644
> > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > {
> > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > > > u32 *pd_state;
> > > > > > + int ret;
> > > > > >
> > > > > > if (!state->data)
> > > > > > return 0;
> > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > > > return -EBUSY;
> > > > > >
> > > > > > + ret = cpu_cluster_pm_enter();
> > > > > > + if (ret)
> > > > > > + return ret;
> > > > >
> > > > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > > > can register themselves to receive genpd on/off notifiers.
> > > > >
> > > > > In other words, none of this should be needed, right?
> > > > >
> > > > Thanks for the feedback!
> > > > Maybe I miss something, I'm wondering about a case like below:
> > > > If we have a shared L2 cache controller inside the cpu cluster power
> > > > domain and we add this controller to be a consumer of the power
> > > > domain, Shouldn't the genpd invoke the domain idle only after the
> > > > shared L2 cache controller is suspended?
> > > > Is there a way that we can put the L2 cache down while all cpus in the
> > > > same cluster are idle?
> > > > > [...]
> > > Sorry, I made some mistake in my second question.
> > > Update the question here:
> > > Is there a way that we can save the L2 cache states while all cpus in the
> > > same cluster are idle and the cluster could be powered down?
> >
> > If the L2 cache is a consumer of the cluster, the consumer driver for
> > the L2 cache should register for genpd on/off notifiers.
> >
> > The device representing the L2 cache needs to be enabled for runtime
> > PM, to be taken into account correctly by the cluster genpd. In this
> > case, the device should most likely remain runtime suspended, but
> > instead rely on the genpd on/off notifiers to understand when
> > save/restore of the cache states should be done.
> >
> > Kind regards
> > Uffe

2024-05-14 14:54:38

by Anup Patel

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Tue, May 14, 2024 at 7:53 PM Anup Patel <[email protected]> wrote:
>
> Hi Nick,
>
> On Tue, May 14, 2024 at 3:20 PM Nick Hu <[email protected]> wrote:
> >
> > Hi Ulf,
> >
> > Thank you for your valuable suggestion.
> > I sincerely apologize for the delay in responding to your message. We
> > have diligently worked on experimenting with the suggestion you
> > provided.
> >
> > As per your recommendation, we have incorporated the "power-domains=<>
> > property" into the consumer's node, resulting in modifications to the
> > DTS as illustrated below:
> >
> > cpus {
> > ...
> > domain-idle-states {
> > CLUSTER_SLEEP:cluster-sleep {
> > compatible = "domain-idle-state";
> > ...
> > }
> > }
> > power-domains {
> > ...
> > ...
> > CLUSTER_PD: clusterpd {
> > domain-idle-states = <&CLUSTER_SLEEP>;
> > };
> > }
> > }
> > soc {
> > deviceA@xxx{
> > ...
> > power-domains = <&CLUSTER_PD>;
> > ...
> > }
> > }
> >
> > However, this adjustment has led to an issue where the probe for
> > 'deviceA' is deferred by 'device_links_check_suppliers()' within
> > 'really_probe()'. In an attempt to mitigate this issue, we
> > experimented with a workaround by adding the attribute
> > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to
> > prevent the creation of a device link between 'deviceA' and
> > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the
> > appropriateness of this solution.
> >
> > Do you have suggestions on how to effectively address this issue?
>
> I totally missed this email since I was not CC'ed sorry about that. Please
> use get_maintainers.pl when sending patches.

I stand corrected. This patch had landed in the "spam" folder. I don't know why.

Regards,
Anup

>
> The genpd_add_provider() (called by of_genpd_add_provider_simple())
> does mark the power-domain DT node as initialized (fwnode_dev_initialized())
> so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is
> resolved and 'deviceA' should be probed unless there are other unmet
> dependencies.
>
> > Try adding "#define DEBUG" before all includes in drivers/base/core.c
> and add "loglevel=8" in kernel parameters, this will print producer-consumer
> linkage of all devices.
>
> Marking the power-domain DT node as "disabled" is certainly not the
> right way.
>
> Regards,
> Anup
>
> >
> > Regards,
> > Nick
> >
> > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <[email protected]> wrote:
> > >
> > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
> > > >
> > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> > > > >
> > > > > Hi Ulf
> > > > >
> > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > > > > >
> > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > > > > >
> > > > > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > > > > might put the cluster into a deeper low power state. Call the
> > > > > > > cluster_pm_enter() before entering the low power state and call the
> > > > > > > cluster_pm_exit() after the cluster woken up.
> > > > > > >
> > > > > > > Signed-off-by: Nick Hu <[email protected]>
> > > > > >
> > > > > > I was not cced this patch, but noticed that this patch got queued up
> > > > > > recently. Sorry for not noticing earlier.
> > > > > >
> > > > > > If not too late, can you please drop/revert it? We should really move
> > > > > > away from the CPU cluster notifiers. See more information below.
> > > > > >
> > > > > > > ---
> > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > index e8094fc92491..298dc76a00cf 100644
> > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > {
> > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > > > > u32 *pd_state;
> > > > > > > + int ret;
> > > > > > >
> > > > > > > if (!state->data)
> > > > > > > return 0;
> > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > > > > return -EBUSY;
> > > > > > >
> > > > > > > + ret = cpu_cluster_pm_enter();
> > > > > > > + if (ret)
> > > > > > > + return ret;
> > > > > >
> > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > > > > can register themselves to receive genpd on/off notifiers.
> > > > > >
> > > > > > In other words, none of this should be needed, right?
> > > > > >
> > > > > Thanks for the feedback!
> > > > > Maybe I miss something, I'm wondering about a case like below:
> > > > > If we have a shared L2 cache controller inside the cpu cluster power
> > > > > domain and we add this controller to be a consumer of the power
> > > > > domain, Shouldn't the genpd invoke the domain idle only after the
> > > > > shared L2 cache controller is suspended?
> > > > > Is there a way that we can put the L2 cache down while all cpus in the
> > > > > same cluster are idle?
> > > > > > [...]
> > > > Sorry, I made some mistake in my second question.
> > > > Update the question here:
> > > > Is there a way that we can save the L2 cache states while all cpus in the
> > > > same cluster are idle and the cluster could be powered down?
> > >
> > > If the L2 cache is a consumer of the cluster, the consumer driver for
> > > the L2 cache should register for genpd on/off notifiers.
> > >
> > > The device representing the L2 cache needs to be enabled for runtime
> > > PM, to be taken into account correctly by the cluster genpd. In this
> > > case, the device should most likely remain runtime suspended, but
> > > instead rely on the genpd on/off notifiers to understand when
> > > save/restore of the cache states should be done.
> > >
> > > Kind regards
> > > Uffe

2024-05-15 12:15:46

by Nick Hu

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hi Anup,

Thank you for your guidance.
After enabling the debug messages, we found a way to solve the problem
with the following steps:
1. Add a compatible string to the 'power-domains' node, otherwise it
won't be the supplier of the consumers. (See of_link_to_phandle().)
2. Move the 'power-domains' node outside the 'cpus' node, otherwise it
won't be added to the device hierarchy by device_add().
3. Update the cpuidle-riscv-sbi driver to get the pds_node from
'/power-domains' (a sketch of this change follows the DTS below).

So the DTS will be like:
cpus {
	...
	domain-idle-states {
		CLUSTER_SLEEP: cluster-sleep {
			compatible = "domain-idle-state";
			...
		};
	};
};

power-domains {
	compatible = "riscv,sbi-power-domains";
	...
	CLUSTER_PD: clusterpd {
		domain-idle-states = <&CLUSTER_SLEEP>;
	};
};

soc {
	deviceA@xxx {
		...
		power-domains = <&CLUSTER_PD>;
		...
	};
};
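
For reference, step 3 above amounts to a change along these lines in
the driver (a sketch; assuming the node is currently resolved with
of_find_node_by_path()):

-	pds_node = of_find_node_by_path("/cpus/power-domains");
+	pds_node = of_find_node_by_path("/power-domains");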

Regards,
Nick

On Tue, May 14, 2024 at 10:54 PM Anup Patel <[email protected]> wrote:
>
> On Tue, May 14, 2024 at 7:53 PM Anup Patel <[email protected]> wrote:
> >
> > Hi Nick,
> >
> > On Tue, May 14, 2024 at 3:20 PM Nick Hu <[email protected]> wrote:
> > >
> > > Hi Ulf,
> > >
> > > Thank you for your valuable suggestion.
> > > I sincerely apologize for the delay in responding to your message. We
> > > have diligently worked on experimenting with the suggestion you
> > > provided.
> > >
> > > As per your recommendation, we have incorporated the "power-domains=<>
> > > property" into the consumer's node, resulting in modifications to the
> > > DTS as illustrated below:
> > >
> > > cpus {
> > > ...
> > > domain-idle-states {
> > > CLUSTER_SLEEP:cluster-sleep {
> > > compatible = "domain-idle-state";
> > > ...
> > > }
> > > }
> > > power-domains {
> > > ...
> > > ...
> > > CLUSTER_PD: clusterpd {
> > > domain-idle-states = <&CLUSTER_SLEEP>;
> > > };
> > > }
> > > }
> > > soc {
> > > deviceA@xxx{
> > > ...
> > > power-domains = <&CLUSTER_PD>;
> > > ...
> > > }
> > > }
> > >
> > > However, this adjustment has led to an issue where the probe for
> > > 'deviceA' is deferred by 'device_links_check_suppliers()' within
> > > 'really_probe()'. In an attempt to mitigate this issue, we
> > > experimented with a workaround by adding the attribute
> > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to
> > > prevent the creation of a device link between 'deviceA' and
> > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the
> > > appropriateness of this solution.
> > >
> > > Do you have suggestions on how to effectively address this issue?
> >
> > I totally missed this email since I was not CC'ed sorry about that. Please
> > use get_maintainers.pl when sending patches.
>
> I stand corrected. This patch had landed in the "spam" folder. I don't know why.
>
> Regards,
> Anup
>
> >
> > The genpd_add_provider() (called by of_genpd_add_provider_simple())
> > does mark the power-domain DT node as initialized (fwnode_dev_initialized())
> > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is
> > resolved and 'deviceA' should be probed unless there are other unmet
> > dependencies.
> >
> > > Try adding "#define DEBUG" before all includes in drivers/base/core.c
> > and add "loglevel=8" in kernel parameters, this will print producer-consumer
> > linkage of all devices.
> >
> > Marking the power-domain DT node as "disabled" is certainly not the
> > right way.
> >
> > Regards,
> > Anup
> >
> > >
> > > Regards,
> > > Nick
> > >
> > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <[email protected]> wrote:
> > > >
> > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
> > > > >
> > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> > > > > >
> > > > > > Hi Ulf
> > > > > >
> > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > > > > > >
> > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > > > > > might put the cluster into a deeper low power state. Call the
> > > > > > > > cluster_pm_enter() before entering the low power state and call the
> > > > > > > > cluster_pm_exit() after the cluster woken up.
> > > > > > > >
> > > > > > > > Signed-off-by: Nick Hu <[email protected]>
> > > > > > >
> > > > > > > I was not cced this patch, but noticed that this patch got queued up
> > > > > > > recently. Sorry for not noticing earlier.
> > > > > > >
> > > > > > > If not too late, can you please drop/revert it? We should really move
> > > > > > > away from the CPU cluster notifiers. See more information below.
> > > > > > >
> > > > > > > > ---
> > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > index e8094fc92491..298dc76a00cf 100644
> > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > > {
> > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > > > > > u32 *pd_state;
> > > > > > > > + int ret;
> > > > > > > >
> > > > > > > > if (!state->data)
> > > > > > > > return 0;
> > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > > > > > return -EBUSY;
> > > > > > > >
> > > > > > > > + ret = cpu_cluster_pm_enter();
> > > > > > > > + if (ret)
> > > > > > > > + return ret;
> > > > > > >
> > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > > > > > can register themselves to receive genpd on/off notifiers.
> > > > > > >
> > > > > > > In other words, none of this should be needed, right?
> > > > > > >
> > > > > > Thanks for the feedback!
> > > > > > Maybe I miss something, I'm wondering about a case like below:
> > > > > > If we have a shared L2 cache controller inside the cpu cluster power
> > > > > > domain and we add this controller to be a consumer of the power
> > > > > > domain, Shouldn't the genpd invoke the domain idle only after the
> > > > > > shared L2 cache controller is suspended?
> > > > > > Is there a way that we can put the L2 cache down while all cpus in the
> > > > > > same cluster are idle?
> > > > > > > [...]
> > > > > Sorry, I made some mistake in my second question.
> > > > > Update the question here:
> > > > > Is there a way that we can save the L2 cache states while all cpus in the
> > > > > same cluster are idle and the cluster could be powered down?
> > > >
> > > > If the L2 cache is a consumer of the cluster, the consumer driver for
> > > > the L2 cache should register for genpd on/off notifiers.
> > > >
> > > > The device representing the L2 cache needs to be enabled for runtime
> > > > PM, to be taken into account correctly by the cluster genpd. In this
> > > > case, the device should most likely remain runtime suspended, but
> > > > instead rely on the genpd on/off notifiers to understand when
> > > > save/restore of the cache states should be done.
> > > >
> > > > Kind regards
> > > > Uffe

2024-05-15 13:46:19

by Anup Patel

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

Hi Nick,

On Wed, May 15, 2024 at 5:45 PM Nick Hu <[email protected]> wrote:
>
> Hi Anup,
>
> Thank you for your guidance.
> After enabling the debug message, we found a way to solve the problem
> by the following steps:
> 1. Add a compatible string in 'power-domains' node otherwise it won't
> be the supplier of the consumers. (See of_link_to_phandle())

Hmm, requiring a compatible string is odd. Where should we document
this compatible string?

> 2. Move the 'power-domains' node outside the 'cpus' node otherwise it
> won't be added to the device hierarchy by device_add().
> 3. Update the cpuidle-riscv-sbi driver to get the pds_node from
> '/power-domains'.

By adding a compatible string and moving the "power-domains" node
outside, you are simply forcing the OF framework to populate devices.

How about manually creating a platform_device for each power-domain
DT node using of_platform_device_create() in sbi_pd_init()?
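
A minimal sketch of that suggestion (error handling trimmed; the NULL
bus_id and parent arguments are illustrative defaults):

#include <linux/of_platform.h>

/* In sbi_pd_init(), after the genpd provider for "np" has been
 * registered: create a platform_device for the power-domain DT node
 * so the OF framework can link consumers to it. */
struct platform_device *pdev;

pdev = of_platform_device_create(np, NULL, NULL);
if (!pdev)
	return -ENODEV;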

>
> So the DTS will be like:
> cpus {
> ...
> domain-idle-states {
> CLUSTER_SLEEP:cluster-sleep {
> compatible = "domain-idle-state";
> ...
> }
> }
> }
> power-domains {
> compatible = "riscv,sbi-power-domains"
> ...
> ...
> CLUSTER_PD: clusterpd {
> domain-idle-states = <&CLUSTER_SLEEP>;
> };
> }
> soc {
> deviceA@xxx{
> ...
> power-domains = <&CLUSTER_PD>;
> ...
> }
> }

Regards,
Anup

>
> Regards,
> Nick
>
> On Tue, May 14, 2024 at 10:54 PM Anup Patel <[email protected]> wrote:
> >
> > On Tue, May 14, 2024 at 7:53 PM Anup Patel <[email protected]> wrote:
> > >
> > > Hi Nick,
> > >
> > > On Tue, May 14, 2024 at 3:20 PM Nick Hu <[email protected]> wrote:
> > > >
> > > > Hi Ulf,
> > > >
> > > > Thank you for your valuable suggestion.
> > > > I sincerely apologize for the delay in responding to your message. We
> > > > have diligently worked on experimenting with the suggestion you
> > > > provided.
> > > >
> > > > As per your recommendation, we have incorporated the "power-domains=<>
> > > > property" into the consumer's node, resulting in modifications to the
> > > > DTS as illustrated below:
> > > >
> > > > cpus {
> > > > ...
> > > > domain-idle-states {
> > > > CLUSTER_SLEEP:cluster-sleep {
> > > > compatible = "domain-idle-state";
> > > > ...
> > > > }
> > > > }
> > > > power-domains {
> > > > ...
> > > > ...
> > > > CLUSTER_PD: clusterpd {
> > > > domain-idle-states = <&CLUSTER_SLEEP>;
> > > > };
> > > > }
> > > > }
> > > > soc {
> > > > deviceA@xxx{
> > > > ...
> > > > power-domains = <&CLUSTER_PD>;
> > > > ...
> > > > }
> > > > }
> > > >
> > > > However, this adjustment has led to an issue where the probe for
> > > > 'deviceA' is deferred by 'device_links_check_suppliers()' within
> > > > 'really_probe()'. In an attempt to mitigate this issue, we
> > > > experimented with a workaround by adding the attribute
> > > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to
> > > > prevent the creation of a device link between 'deviceA' and
> > > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the
> > > > appropriateness of this solution.
> > > >
> > > > Do you have suggestions on how to effectively address this issue?
> > >
> > > I totally missed this email since I was not CC'ed sorry about that. Please
> > > use get_maintainers.pl when sending patches.
> >
> > I stand corrected. This patch had landed in the "spam" folder. I don't know why.
> >
> > Regards,
> > Anup
> >
> > >
> > > The genpd_add_provider() (called by of_genpd_add_provider_simple())
> > > does mark the power-domain DT node as initialized (fwnode_dev_initialized())
> > > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is
> > > resolved and 'deviceA' should be probed unless there are other unmet
> > > dependencies.
> > >
> > > Try adding "#define DEBUG" before all includes in drivers/core/base.c
> > > and add "loglevel=8" in kernel parameters, this will print producer-consumer
> > > linkage of all devices.
> > >
> > > Marking the power-domain DT node as "disabled" is certainly not the
> > > right way.
> > >
> > > Regards,
> > > Anup
> > >
> > > >
> > > > Regards,
> > > > Nick
> > > >
> > > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <[email protected]> wrote:
> > > > >
> > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi Ulf
> > > > > > >
> > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel
> > > > > > > > > might put the cluster into a deeper low power state. Call the
> > > > > > > > > cluster_pm_enter() before entering the low power state and call the
> > > > > > > > > cluster_pm_exit() after the cluster woken up.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Nick Hu <[email protected]>
> > > > > > > >
> > > > > > > > I was not cced this patch, but noticed that this patch got queued up
> > > > > > > > recently. Sorry for not noticing earlier.
> > > > > > > >
> > > > > > > > If not too late, can you please drop/revert it? We should really move
> > > > > > > > away from the CPU cluster notifiers. See more information below.
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
> > > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > > index e8094fc92491..298dc76a00cf 100644
> > > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> > > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > > > {
> > > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx];
> > > > > > > > > u32 *pd_state;
> > > > > > > > > + int ret;
> > > > > > > > >
> > > > > > > > > if (!state->data)
> > > > > > > > > return 0;
> > > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
> > > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state)
> > > > > > > > > return -EBUSY;
> > > > > > > > >
> > > > > > > > > + ret = cpu_cluster_pm_enter();
> > > > > > > > > + if (ret)
> > > > > > > > > + return ret;
> > > > > > > >
> > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd
> > > > > > > > can register themselves to receive genpd on/off notifiers.
> > > > > > > >
> > > > > > > > In other words, none of this should be needed, right?
> > > > > > > >
> > > > > > > Thanks for the feedback!
> > > > > > > Maybe I miss something, I'm wondering about a case like below:
> > > > > > > If we have a shared L2 cache controller inside the cpu cluster power
> > > > > > > domain and we add this controller to be a consumer of the power
> > > > > > > domain, Shouldn't the genpd invoke the domain idle only after the
> > > > > > > shared L2 cache controller is suspended?
> > > > > > > Is there a way that we can put the L2 cache down while all cpus in the
> > > > > > > same cluster are idle?
> > > > > > > > [...]
> > > > > > Sorry, I made some mistake in my second question.
> > > > > > Update the question here:
> > > > > > Is there a way that we can save the L2 cache states while all cpus in the
> > > > > > same cluster are idle and the cluster could be powered down?
> > > > >
> > > > > If the L2 cache is a consumer of the cluster, the consumer driver for
> > > > > the L2 cache should register for genpd on/off notifiers.
> > > > >
> > > > > The device representing the L2 cache needs to be enabled for runtime
> > > > > PM, to be taken into account correctly by the cluster genpd. In this
> > > > > case, the device should most likely remain runtime suspended, but
> > > > > instead rely on the genpd on/off notifiers to understand when
> > > > > save/restore of the cache states should be done.
> > > > >
> > > > > Kind regards
> > > > > Uffe

2024-05-24 10:41:51

by Ulf Hansson

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Fri, 17 May 2024 at 06:39, Anup Patel <[email protected]> wrote:
>
> On Thu, May 16, 2024 at 9:40 AM Nick Hu <[email protected]> wrote:
> >
> > Hi Anup
> >
> > On Wed, May 15, 2024 at 9:46 PM Anup Patel <[email protected]> wrote:
> > >
> > > Hi Nick,
> > >
> > > On Wed, May 15, 2024 at 5:45 PM Nick Hu <[email protected]> wrote:
> > > >
> > > > Hi Anup,
> > > >
> > > > Thank you for your guidance.
> > > > After enabling the debug message, we found a way to solve the problem
> > > > by the following steps:
> > > > 1. Add a compatible string in 'power-domains' node otherwise it won't
> > > > be the supplier of the consumers. (See of_link_to_phandle())
> > >
> > > Hmm, requiring a compatible string is odd. Where should we document
> > > this compatible string ?
> > >
> > Sorry, this is my fault. I didn't include some updates in
> > of_link_to_phandle(). This led to some misunderstanding here.
> > You are right, we don't need it.
> > The supplier will be linked to the CLUSTER_PD node.
> >
> > > > 2. Move the 'power-domains' node outside the 'cpus' node otherwise it
> > > > won't be added to the device hierarchy by device_add().
> > > > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from
> > > > '/power-domains'.
> > >
> > > By adding a compatible string and moving the "power-domains" node
> > > outside, you are simply forcing the OF framework to populate devices.
> > >
> > > How about manually creating platform_device for each power-domain
> > > DT node using of_platform_device_create() in sbi_pd_init() ?
> > >
> > Thanks for the suggestion! We have tested the solution and it works.
> > We were wondering if it's feasible for us to relocate the
> > 'power-domains' node outside of /cpus? The CLUSTER_PD might
> > encompass not only the CPUs but also other components within the
> > cluster.
>
> The cpuidle-riscv-sbi driver expects the "power-domains" DT node
> under the "/cpus" DT node because this driver only deals with power
> domains related to the CPU cluster or CPU cache-hierarchy. It does
> make sense to define L2/L3 power domains under
> "/cpus/power-domains" since these are related to CPUs.
>
> Moving the CPU "power-domains" DT node directly under "/" or
> somewhere else would mean that it covers system-wide power
> domains, which is not true.

I understand your point, but I am not convinced that the power-domains
need to belong to the "cpus" node. Ideally, the power-domain describes
the power-rail and the interface to manage the CPUs; this can surely
be described outside the "cpus" node, even if only CPUs are using it.

Moreover, moving forward, one should not be surprised if it turns out
that a platform has devices other than the CPUs sharing the same
power-rail as the CPU cluster. At least, we have that for arm/psci
[1].

>
> I suggest we continue using the "/cpus/power-domains" DT node
> only for power domains related to CPU clusters or the CPU
> cache-hierarchy.
>
> For system-wide power domains of SoC devices, we can either:
> 1) Use device power domains through the SBI MPXY extension
> via a different driver
> 2) Use a platform-specific driver
>
> >
> > We also looked at the cpuidle-psci-domain driver and it seems Arm
> > doesn't create devices for each subnode of the PSCI domain.
> > Is there any reason they don't need it?

We don't need it for arm as we have a separate node for PSCI and its
power-domains [2]. Moreover, we have a separate driver that manages
the power-domain (cpuidle-psci-domain).

[...]

[1] arch/arm64/boot/dts/qcom/sc7280.dtsi (search for "CLUSTER_PD")
[2] Documentation/devicetree/bindings/arm/psci.yaml

Kind regards
Uffe

2024-05-24 12:54:43

by Anup Patel

Subject: Re: [PATCH] cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

On Fri, May 24, 2024 at 4:11 PM Ulf Hansson <[email protected]> wrote:
>
> On Fri, 17 May 2024 at 06:39, Anup Patel <[email protected]> wrote:
> >
> > On Thu, May 16, 2024 at 9:40 AM Nick Hu <[email protected]> wrote:
> > >
> > > Hi Anup
> > >
> > > On Wed, May 15, 2024 at 9:46 PM Anup Patel <[email protected]> wrote:
> > > >
> > > > Hi Nick,
> > > >
> > > > On Wed, May 15, 2024 at 5:45 PM Nick Hu <[email protected]> wrote:
> > > > >
> > > > > Hi Anup,
> > > > >
> > > > > Thank you for your guidance.
> > > > > After enabling the debug message, we found a way to solve the problem
> > > > > by the following steps:
> > > > > 1. Add a compatible string in 'power-domains' node otherwise it won't
> > > > > be the supplier of the consumers. (See of_link_to_phandle())
> > > >
> > > > Hmm, requiring a compatible string is odd. Where should we document
> > > > this compatible string ?
> > > >
> > > Sorry, this is my fault. I didn't include some updates in
> > > of_link_to_phandle(). This led to some misunderstanding here.
> > > You are right, we don't need it.
> > > The supplier will be linked to the CLUSTER_PD node.
> > >
> > > > > 2. Move the 'power-domains' node outside the 'cpus' node otherwise it
> > > > > won't be added to the device hierarchy by device_add().
> > > > > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from
> > > > > '/power-domains'.
> > > >
> > > > By adding a compatible string and moving the "power-domains" node
> > > > outside, you are simply forcing the OF framework to populate devices.
> > > >
> > > > How about manually creating platform_device for each power-domain
> > > > DT node using of_platform_device_create() in sbi_pd_init() ?
> > > >
> > > Thanks for the suggestion! We have tested the solution and it works.
> > > We were wondering if it's feasible for us to relocate the
> > > 'power-domains' node outside of /cpus? The CLUSTER_PD might
> > > encompass not only the CPUs but also other components within the
> > > cluster.
> >
> > The cpuidle-riscv-sbi driver expects the "power-domains" DT node
> > under the "/cpus" DT node because this driver only deals with power
> > domains related to the CPU cluster or CPU cache-hierarchy. It does
> > make sense to define L2/L3 power domains under
> > "/cpus/power-domains" since these are related to CPUs.
> >
> > Moving the CPU "power-domains" DT node directly under "/" or
> > somewhere else would mean that it covers system-wide power
> > domains, which is not true.
>
> I understand your point, but I am not convinced that the power-domains
> need to belong to the "cpus" node. Ideally, the power-domain describes
> the power-rail and the interface to manage the CPUs, this can surely
> be described outside the "cpus" node - even if there are only CPUs
> that are using it.
>
> Moreover, moving forward, one should not be surprised if it turns out
> that a platform has other devices than the CPUs, sharing the same
> power-rail as the CPU cluster. At least, we have that for arm/psci
> [1].

For non-CPU power domains, we are working on a messaging
specification (RPMI) [1]. The supervisor software might have
direct access to an RPMI transport, or it can send RPMI messages
via the SBI MPXY extension [2].

If power rails on a platform are shared between CPUs and
devices, then the platform can:

1) Use SBI HSM for CPUs and use RPMI for devices. The
DT bindings for device power-domains based on RPMI are
still work-in-progress. If there are multiple supervisor domains
(aka system-level partitions) created by the SBI implementation or
a partitioning hypervisor, then the RPMI messages can be
arbitrated by the SBI implementation using the SBI MPXY extension.
The SBI MPXY extension also allows sharing the same RPMI
transport between machine-mode (firmware) and supervisor-mode.

2) Use its own platform-specific power-domain driver for both
CPUs and devices (basically, don't use SBI HSM and RPMI).
In this case, there won't be any controlled access (or arbitration)
to power rails across supervisor domains.

>
> >
> > I suggest we continue using the "/cpus/power-domains" DT node
> > only for power domains related to CPU clusters or the CPU
> > cache-hierarchy.
> >
> > For system-wide power domains of SoC devices, we can either:
> > 1) Use device power domains through the SBI MPXY extension
> > via a different driver
> > 2) Use a platform-specific driver
> >
> > >
> > > We also looked at the cpuidle-psci-domain driver and it seems Arm
> > > doesn't create devices for each subnode of the PSCI domain.
> > > Is there any reason they don't need it?
>
> We don't need it for arm as we have a separate node for PSCI and its
> power-domains [2]. Moreover, we have a separate driver that manages
> the power-domain (cpuidle-psci-domain).

Unlike the ARM world, we don't have any DT node for SBI in
the RISC-V world because the SBI is always there. Due to this,
the SBI HSM CPU idle driver (this driver) currently looks for
the CPU "power-domains" node under the "/cpus" DT node, because
the SBI HSM extension only deals with CPU states.
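
Concretely, that lookup amounts to something like the following (a
sketch; the exact call site in the driver may differ):

	pds_node = of_find_node_by_path("/cpus/power-domains");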

>
> [...]
>
> [1] arch/arm64/boot/dts/qcom/sc7280.dtsi (search for "CLUSTER_PD")
> [2] Documentation/devicetree/bindings/arm/psci.yaml
>
> Kind regards
> Uffe

[1] https://github.com/riscv-non-isa/riscv-rpmi
[2] https://docs.google.com/document/d/1Ivej3u6uQgVdJHnjrbqgUwE1Juy75d4uYCjWrdNjeAg/edit?usp=sharing

Best regards,
Anup