2019-01-18 05:48:24

by Pramod Kumar

[permalink] [raw]
Subject: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
call to stop the CPU. This call ensures L1/L2 cache flush,
CPUs cache-cohenrecy setting w.r.to interconnect.

Apart from this, this gives control to f/w to reduce power consumption
by take appropriate decesion on power rails for plugging-out core.

Signed-off-by: Pramod Kumar <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Reviewed-by: Scott Branden <[email protected]>
---
arch/arm64/kernel/smp.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1598d6f..360e52b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -822,8 +822,13 @@ static void ipi_cpu_stop(unsigned int cpu)
local_daif_mask();
sdei_mask_local_cpu();

+#ifdef CONFIG_HOTPLUG_CPU
+ if (cpu_ops[cpu]->cpu_die)
+ cpu_ops[cpu]->cpu_die(cpu);
+#else
while (1)
cpu_relax();
+#endif
}

#ifdef CONFIG_KEXEC_CORE
--
1.9.1



2019-01-18 11:34:15

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Fri, Jan 18, 2019 at 11:16:20AM +0530, Pramod Kumar wrote:
> If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
> call to stop the CPU. This call ensures L1/L2 cache flush,
> CPUs cache-cohenrecy setting w.r.to interconnect.
>

Firstly, this is not specific to PSCI and I don't see any PSCI calls as
$subject claims. Next, you fail to explain why do you have to ensure
caches are cleaned and why do you need that in ipi_cpu_stop ?
What's the use case ?

> Apart from this, this gives control to f/w to reduce power consumption
> by take appropriate decesion on power rails for plugging-out core.
>

May be, but ipi_cpu_stop is used is machine reboot/poweroff/halt which
may restart or poweroff the system, powering down individual CPUs is not
necessary and may consume lot of time in systems with large number of CPUs.
It would be good to know the use-case in case I am missing to consider
that.

--
Regards,
Sudeep

2019-01-21 08:04:20

by Pramod Kumar

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Mon, Jan 21, 2019 at 11:30 AM Pramod Kumar <[email protected]> wrote:
>
>
>
> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]> wrote:
>>
>> Hi Sudeep,
>>
>> Thanks for having a glance on the patch. Please see my reply in line.
>>
>> Regards,
>> Pramod
>>
>> On Fri, Jan 18, 2019 at 5:02 PM Sudeep Holla <[email protected]> wrote:
>>>
>>> On Fri, Jan 18, 2019 at 11:16:20AM +0530, Pramod Kumar wrote:
>>> > If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
>>> > call to stop the CPU. This call ensures L1/L2 cache flush,
>>> > CPUs cache-cohenrecy setting w.r.to interconnect.
>>> >
>>>
>>> Firstly, this is not specific to PSCI and I don't see any PSCI calls as
>>> $subject claims.
>>
>>
>> I had seen all the other cpu ops methods where only PSCI supports only cpu_die features hence
>> used PSCI reference in my subject line to be more clear where this die gets converted in last.
>>
>>>
>>> Next, you fail to explain why do you have to ensure
>>> caches are cleaned and why do you need that in ipi_cpu_stop ?
>>> What's the use case ?
>>>
>>
>> Need comes from a specific use case where one Accelerator card(SoC) is plugged in a sever over a PCIe interface. This Card gets supply from a battery, which could provide very less power for a very small time, in case of any power loss. Once Card switches to battery, this has to reduce its power consumption to its lowest point and back-up the DDR contents asap before battery gets fully drained off.
>>
>> Since battery can provide limited power for a very short time hence need to transition to lowest power. As per the transition process , CPUs power domain has to be off but before that it needs to flush out its content to system memory(L3) so that content could be backed-up by a MCU, a controller consuming very less power. Since we can not afford plugging-out every individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs which ultimately switch to ATF to flush out all the CPUs caches and comes out of coherency domain so that its power rails could be switched-off.
>>
>>>
>>> > Apart from this, this gives control to f/w to reduce power consumption
>>> > by take appropriate decesion on power rails for plugging-out core.
>>> >
>>>
>>> May be, but ipi_cpu_stop is used is machine reboot/poweroff/halt which
>>> may restart or poweroff the system, powering down individual CPUs is not
>>> necessary and may consume lot of time in systems with large number of CPUs.
>>> It would be good to know the use-case in case I am missing to consider
>>> that.
>>
>>
>> Use case as explained above.
>>
>>>
>>> --
>>> Regards,
>>> Sudeep

2019-01-21 11:24:59

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Mon, Jan 21, 2019 at 11:28:27AM +0530, Pramod Kumar wrote:
> On Fri, Jan 18, 2019 at 5:02 PM Sudeep Holla <[email protected]> wrote:
> > On Fri, Jan 18, 2019 at 11:16:20AM +0530, Pramod Kumar wrote:
> > > If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
> > > call to stop the CPU. This call ensures L1/L2 cache flush,
> > > CPUs cache-cohenrecy setting w.r.to interconnect.
> >
> > Firstly, this is not specific to PSCI and I don't see any PSCI calls as
> > $subject claims.
>
> I had seen all the other cpu ops methods where only PSCI supports only
> cpu_die features hence used PSCI reference in my subject line to be more
> clear where this die gets converted in last.
>

OK, but since you are not directly dealing with PSCI APIs, it's misleading.

> > Next, you fail to explain why do you have to ensure
> > caches are cleaned and why do you need that in ipi_cpu_stop ?
> > What's the use case ?
> >
> Need comes from a specific use case where one Accelerator card(SoC) is
> plugged in a sever over a PCIe interface. This Card gets supply from a
> battery, which could provide very less power for a very small time, in case
> of any power loss. Once Card switches to battery, this has to reduce its
> power consumption to its lowest point and back-up the DDR contents asap
> before battery gets fully drained off.
>

OK, but I was expecting the complete call path for this use-case.

> Since battery can provide limited power for a very short time hence need to
> transition to lowest power. As per the transition process , CPUs power
> domain has to be off but before that it needs to flush out its content to
> system memory(L3) so that content could be backed-up by a MCU, a controller
> consuming very less power. Since we can not afford plugging-out every
> individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs

So, you are randomly using ipi_cpu_stop for something it's not supposed to
be used. What do you exactly want to achieve in that context where you need
to save power ? Why system off or reset not sufficient ?

--
Regards,
Sudeep


2019-01-23 04:52:47

by Pramod Kumar

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

Thanks Sudeep for reviewing. Please see my comment inline below.

On Mon, Jan 21, 2019 at 4:52 PM Sudeep Holla <[email protected]> wrote:
>
> On Mon, Jan 21, 2019 at 11:28:27AM +0530, Pramod Kumar wrote:
> > On Fri, Jan 18, 2019 at 5:02 PM Sudeep Holla <[email protected]> wrote:
> > > On Fri, Jan 18, 2019 at 11:16:20AM +0530, Pramod Kumar wrote:
> > > > If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
> > > > call to stop the CPU. This call ensures L1/L2 cache flush,
> > > > CPUs cache-cohenrecy setting w.r.to interconnect.
> > >
> > > Firstly, this is not specific to PSCI and I don't see any PSCI calls as
> > > $subject claims.
> >
> > I had seen all the other cpu ops methods where only PSCI supports only
> > cpu_die features hence used PSCI reference in my subject line to be more
> > clear where this die gets converted in last.
> >
>
> OK, but since you are not directly dealing with PSCI APIs, it's misleading.
>

I can re-phrase my commit message which links explanation to PSCI.

> > > Next, you fail to explain why do you have to ensure
> > > caches are cleaned and why do you need that in ipi_cpu_stop ?
> > > What's the use case ?
> > >
> > Need comes from a specific use case where one Accelerator card(SoC) is
> > plugged in a sever over a PCIe interface. This Card gets supply from a
> > battery, which could provide very less power for a very small time, in case
> > of any power loss. Once Card switches to battery, this has to reduce its
> > power consumption to its lowest point and back-up the DDR contents asap
> > before battery gets fully drained off.
> >
>
> OK, but I was expecting the complete call path for this use-case.

OK. Once power loss happens, Application processor gets interrupt and
it has to switch to MCU asap so that it can switch-off all irrelevant
power domain to reduce battery drain.
flow is like this-
Power loss interrupt --> interrupt handler -> send IPI to stop all
secondary CPUs-> switch primary CPU to ATF which ultimately transition
to MCU where all power off action is taken.

>
> > Since battery can provide limited power for a very short time hence need to
> > transition to lowest power. As per the transition process , CPUs power
> > domain has to be off but before that it needs to flush out its content to
> > system memory(L3) so that content could be backed-up by a MCU, a controller
> > consuming very less power. Since we can not afford plugging-out every
> > individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
>
> So, you are randomly using ipi_cpu_stop for something it's not supposed to
> be used. What do you exactly want to achieve in that context where you need
> to save power ?

Need to back-up DDR from MCU which will require L1/L2 content gets
flushed before CPUs getting shutdown.

>Why system off or reset not sufficient ?
System off or reset is not sufficient as we can not afford that much
delay in transition.

Flushing L1/L2 contents to System memory and Bringing out CPUs
Clusters out of coherency domain before shutting down Clusters puts
rest of system in sane state. If cluster are not being taken out
properly from coherency domain before shutting down, it could lead
system un-reliable for rest parts.

> --
> Regards,
> Sudeep
>

2019-01-23 16:28:46

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Wed, Jan 23, 2019 at 10:21:05AM +0530, Pramod Kumar wrote:
> Thanks Sudeep for reviewing. Please see my comment inline below.
>
> On Mon, Jan 21, 2019 at 4:52 PM Sudeep Holla <[email protected]> wrote:

[...]

> >
> > OK, but I was expecting the complete call path for this use-case.
>
> OK. Once power loss happens, Application processor gets interrupt and
> it has to switch to MCU asap so that it can switch-off all irrelevant
> power domain to reduce battery drain.
> flow is like this-
> Power loss interrupt --> interrupt handler -> send IPI to stop all
> secondary CPUs-> switch primary CPU to ATF which ultimately transition
> to MCU where all power off action is taken.
>

So does the entire AP domain get powered off or is that the intention ?
If so, just stopping secondaries through IPI is not sufficient. What
happens to all the running tasks and other operations going on in the
system.

> >
> > > Since battery can provide limited power for a very short time hence need to
> > > transition to lowest power. As per the transition process , CPUs power
> > > domain has to be off but before that it needs to flush out its content to
> > > system memory(L3) so that content could be backed-up by a MCU, a controller
> > > consuming very less power. Since we can not afford plugging-out every
> > > individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
> >
> > So, you are randomly using ipi_cpu_stop for something it's not supposed to
> > be used. What do you exactly want to achieve in that context where you need
> > to save power ?
>
> Need to back-up DDR from MCU which will require L1/L2 content gets
> flushed before CPUs getting shutdown.
>

Understood.

> >Why system off or reset not sufficient ?
> System off or reset is not sufficient as we can not afford that much
> delay in transition.
>

But who will save the system state ? You may corrupt data at the cost
of saving transition delay. So IMO, you don't have a valid use-case or
you are failing to present one properly.

> Flushing L1/L2 contents to System memory and Bringing out CPUs
> Clusters out of coherency domain before shutting down Clusters puts
> rest of system in sane state. If cluster are not being taken out
> properly from coherency domain before shutting down, it could lead
> system un-reliable for rest parts.
>

Looks like you are concerned only with CPUs in this scenario, but what
about other devices in the domain. So suspend-to-{ram,idle} are only
sane options to make sure you don't loose/corrupt any data/state.

Further point me to the code either part of this series or upstream
handling the "Power loss interrupt" explained above.

--
Regards,
Sudeep

2019-01-23 16:49:35

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
> wrote:
>
> Need comes from a specific use case where one Accelerator card(SoC) is
> plugged in a sever over a PCIe interface. This Card gets supply from a
> battery, which could provide very less power for a very small time, in case
> of any power loss. Once Card switches to battery, this has to reduce its
> power consumption to its lowest point and back-up the DDR contents asap
> before battery gets fully drained off.

In this example is Linux running on the server, or on the accelerator?

What precisely are you trying to back up from DDR, and why?

What is responsible for backing up that contents?

> Since battery can provide limited power for a very short time hence need to
> transition to lowest power. As per the transition process , CPUs power
> domain has to be off but before that it needs to flush out its content to
> system memory(L3) so that content could be backed-up by a MCU, a controller
> consuming very less power. Since we can not afford plugging-out every
> individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
> which ultimately switch to ATF to flush out all the CPUs caches and comes
> out of coherency domain so that its power rails could be switched-off.

If you're stopping CPUs from completely arbitrary states, what is the
benefit of saving the RAM contents?

CPUs might be running with IRQs disabled for an arbitrarily long time,
so there's no guarantee that all of them will be turned off before power
is lost.

I'm very confused as to what you're trying to achieve here.

Thanks,
Mark.

2019-01-23 16:59:03

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Fri, Jan 18, 2019 at 11:16:20AM +0530, Pramod Kumar wrote:
> If CPU hotplug is supported, ipi_cpu_stop should use PSCI cpudie
> call to stop the CPU. This call ensures L1/L2 cache flush,
> CPUs cache-cohenrecy setting w.r.to interconnect.
>
> Apart from this, this gives control to f/w to reduce power consumption
> by take appropriate decesion on power rails for plugging-out core.
>
> Signed-off-by: Pramod Kumar <[email protected]>
> Reviewed-by: Ray Jui <[email protected]>
> Reviewed-by: Scott Branden <[email protected]>
> ---
> arch/arm64/kernel/smp.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 1598d6f..360e52b 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -822,8 +822,13 @@ static void ipi_cpu_stop(unsigned int cpu)
> local_daif_mask();
> sdei_mask_local_cpu();
>
> +#ifdef CONFIG_HOTPLUG_CPU
> + if (cpu_ops[cpu]->cpu_die)
> + cpu_ops[cpu]->cpu_die(cpu);
> +#else
> while (1)
> cpu_relax();
> +#endif

If cpu_ops[cpu]->cpu_die is NULL, this change makes ipi_cpu_stop()
return, which is not correct.

Regardless, I don't think that there is sufficient rationale for this
change, especially given that your commit message describes
platform-specific assumptions which do not hold in general.

Thanks,
Mark.

2019-01-23 17:08:48

by Scott Branden

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

Hi Mark,

Hopefully I can shed some light on the use case inline.

On 2019-01-23 8:48 a.m., Mark Rutland wrote:
> On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
>> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
>> wrote:
>>
>> Need comes from a specific use case where one Accelerator card(SoC) is
>> plugged in a sever over a PCIe interface. This Card gets supply from a
>> battery, which could provide very less power for a very small time, in case
>> of any power loss. Once Card switches to battery, this has to reduce its
>> power consumption to its lowest point and back-up the DDR contents asap
>> before battery gets fully drained off.
> In this example is Linux running on the server, or on the accelerator?
Accelerator
>
> What precisely are you trying to back up from DDR, and why?
Data in DDR is being written to disk at this time (disk is connected to
accelerator)
>
> What is responsible for backing up that contents?

A low power M-class processor and DMA engine which continues necessary
operations to transfer DDR memory to disk.

The high power processors on the accelerator running linux needed to be
halted ASAP on this power loss event and M0 take over. Graceful shutdown
of linux and other peripherals is unnecessary (and we don't have the
power necessary to do so).

>
>> Since battery can provide limited power for a very short time hence need to
>> transition to lowest power. As per the transition process , CPUs power
>> domain has to be off but before that it needs to flush out its content to
>> system memory(L3) so that content could be backed-up by a MCU, a controller
>> consuming very less power. Since we can not afford plugging-out every
>> individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
>> which ultimately switch to ATF to flush out all the CPUs caches and comes
>> out of coherency domain so that its power rails could be switched-off.
> If you're stopping CPUs from completely arbitrary states, what is the
> benefit of saving the RAM contents?

Some of the RAM contains data that was in the process of being written
to disk by the accelerator.

This data must be saved to disk and the high power CPUs consume too much
power to continue performing this operation.

>
> CPUs might be running with IRQs disabled for an arbitrarily long time,

In an embedded linux system we control everything running.

But good point - we'll need to ensure interrupts are not disabled for
too long.

> so there's no guarantee that all of them will be turned off before power
> is lost.
>
> I'm very confused as to what you're trying to achieve here.
>
> Thanks,
> Mark.

Regards,

 Scott


2019-01-23 17:24:57

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
> Hi Mark,
>
> Hopefully I can shed some light on the use case inline.
>
> On 2019-01-23 8:48 a.m., Mark Rutland wrote:
> > On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
> > > On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
> > > wrote:
> > >
> > > Need comes from a specific use case where one Accelerator card(SoC) is
> > > plugged in a sever over a PCIe interface. This Card gets supply from a
> > > battery, which could provide very less power for a very small time, in case
> > > of any power loss. Once Card switches to battery, this has to reduce its
> > > power consumption to its lowest point and back-up the DDR contents asap
> > > before battery gets fully drained off.
> > In this example is Linux running on the server, or on the accelerator?
> Accelerator
> >
> > What precisely are you trying to back up from DDR, and why?
> Data in DDR is being written to disk at this time (disk is connected to
> accelerator)
> >
> > What is responsible for backing up that contents?
>
> A low power M-class processor and DMA engine which continues necessary
> operations to transfer DDR memory to disk.
>
> The high power processors on the accelerator running linux needed to be
> halted ASAP on this power loss event and M0 take over. Graceful shutdown of
> linux and other peripherals is unnecessary (and we don't have the power
> necessary to do so).
>

It may be unnecessary for your use-case, but not recommended.

> >
> > > Since battery can provide limited power for a very short time hence need to
> > > transition to lowest power. As per the transition process , CPUs power
> > > domain has to be off but before that it needs to flush out its content to
> > > system memory(L3) so that content could be backed-up by a MCU, a controller
> > > consuming very less power. Since we can not afford plugging-out every
> > > individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
> > > which ultimately switch to ATF to flush out all the CPUs caches and comes
> > > out of coherency domain so that its power rails could be switched-off.
> > If you're stopping CPUs from completely arbitrary states, what is the
> > benefit of saving the RAM contents?
>
> Some of the RAM contains data that was in the process of being written to
> disk by the accelerator.
>
> This data must be saved to disk and the high power CPUs consume too much
> power to continue performing this operation.
>

Why will suspend to ram or idle not work ? It will power off the secondaries
which this patch is trying to achieve, but in more sane way so that no
data/state is lost/corrupted as I stated earlier.

> >
> > CPUs might be running with IRQs disabled for an arbitrarily long time,
>
> In an embedded linux system we control everything running.
>

By which I assume you have patches to do all sorts of things to make this
work and this patch standalone is of no use :)

I don't like this as it's not scalable to big systems as this is in the
same code path as system off/reset.

--
Regards,
Sudeep

2019-01-23 17:36:07

by Scott Branden

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

Hi Sudeep,

On 2019-01-23 9:21 a.m., Sudeep Holla wrote:
> On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
>> Hi Mark,
>>
>> Hopefully I can shed some light on the use case inline.
>>
>> On 2019-01-23 8:48 a.m., Mark Rutland wrote:
>>> On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
>>>> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
>>>> wrote:
>>>>
>>>> Need comes from a specific use case where one Accelerator card(SoC) is
>>>> plugged in a sever over a PCIe interface. This Card gets supply from a
>>>> battery, which could provide very less power for a very small time, in case
>>>> of any power loss. Once Card switches to battery, this has to reduce its
>>>> power consumption to its lowest point and back-up the DDR contents asap
>>>> before battery gets fully drained off.
>>> In this example is Linux running on the server, or on the accelerator?
>> Accelerator
>>> What precisely are you trying to back up from DDR, and why?
>> Data in DDR is being written to disk at this time (disk is connected to
>> accelerator)
>>> What is responsible for backing up that contents?
>> A low power M-class processor and DMA engine which continues necessary
>> operations to transfer DDR memory to disk.
>>
>> The high power processors on the accelerator running linux needed to be
>> halted ASAP on this power loss event and M0 take over. Graceful shutdown of
>> linux and other peripherals is unnecessary (and we don't have the power
>> necessary to do so).
>>
> It may be unnecessary for your use-case, but not recommended.
No choice - we don't have the time/power for a graceful shutdown.
>
>>>> Since battery can provide limited power for a very short time hence need to
>>>> transition to lowest power. As per the transition process , CPUs power
>>>> domain has to be off but before that it needs to flush out its content to
>>>> system memory(L3) so that content could be backed-up by a MCU, a controller
>>>> consuming very less power. Since we can not afford plugging-out every
>>>> individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
>>>> which ultimately switch to ATF to flush out all the CPUs caches and comes
>>>> out of coherency domain so that its power rails could be switched-off.
>>> If you're stopping CPUs from completely arbitrary states, what is the
>>> benefit of saving the RAM contents?
>> Some of the RAM contains data that was in the process of being written to
>> disk by the accelerator.
>>
>> This data must be saved to disk and the high power CPUs consume too much
>> power to continue performing this operation.
>>
> Why will suspend to ram or idle not work ? It will power off the secondaries
> which this patch is trying to achieve, but in more sane way so that no
> data/state is lost/corrupted as I stated earlier.
We need to take over control of the disk write operations.  There is no
time/power available to leave linux running.
>
>>> CPUs might be running with IRQs disabled for an arbitrarily long time,
>> In an embedded linux system we control everything running.
>>
> By which I assume you have patches to do all sorts of things to make this
> work and this patch standalone is of no use :)

I believe other patch is in a standalone driver to be upstreamed.

Remainder of code is on a standalone M0 processor not running linux.

>
> I don't like this as it's not scalable to big systems as this is in the
> same code path as system off/reset.
Many things are not used by big systems and vice versa.  What do you
suggest is done otherwise?
>
> --
> Regards,
> Sudeep

Regards,

 Scott


2019-01-23 17:37:23

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
> Hi Mark,
>
> Hopefully I can shed some light on the use case inline.
>
> On 2019-01-23 8:48 a.m., Mark Rutland wrote:
> > On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
> > > On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
> > > wrote:
> > >
> > > Need comes from a specific use case where one Accelerator card(SoC) is
> > > plugged in a sever over a PCIe interface. This Card gets supply from a
> > > battery, which could provide very less power for a very small time, in case
> > > of any power loss. Once Card switches to battery, this has to reduce its
> > > power consumption to its lowest point and back-up the DDR contents asap
> > > before battery gets fully drained off.
> > In this example is Linux running on the server, or on the accelerator?
> Accelerator
> >
> > What precisely are you trying to back up from DDR, and why?
> Data in DDR is being written to disk at this time (disk is connected to
> accelerator)
> >
> > What is responsible for backing up that contents?
>
> A low power M-class processor and DMA engine which continues necessary
> operations to transfer DDR memory to disk.
>
> The high power processors on the accelerator running linux needed to be
> halted ASAP on this power loss event and M0 take over. Graceful shutdown of
> linux and other peripherals is unnecessary (and we don't have the power
> necessary to do so).

If graceful shutdown of Linux is not required (and is in fact
undesireable), why is Linux involved at all in this shutdown process?

For example, why is this not a secure interrupt taken to EL3, which can
(gracefully) shut down the CPUs regardless?

> > > Since battery can provide limited power for a very short time hence need to
> > > transition to lowest power. As per the transition process , CPUs power
> > > domain has to be off but before that it needs to flush out its content to
> > > system memory(L3) so that content could be backed-up by a MCU, a controller
> > > consuming very less power. Since we can not afford plugging-out every
> > > individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
> > > which ultimately switch to ATF to flush out all the CPUs caches and comes
> > > out of coherency domain so that its power rails could be switched-off.
> > If you're stopping CPUs from completely arbitrary states, what is the
> > benefit of saving the RAM contents?
>
> Some of the RAM contains data that was in the process of being written to
> disk by the accelerator.

Ok, so this isn't actually about backing up RAM contents; it's about
completing pending I/O.

I'm still confused as to how that works. How do you avoid leaving the
disk in some corrupt state if data runs out partway through?

> This data must be saved to disk and the high power CPUs consume too much
> power to continue performing this operation.
>
> > CPUs might be running with IRQs disabled for an arbitrarily long time,
>
> In an embedded linux system we control everything running.

Sure, and that complete control allows you to do something better than
this RFC, AFAICT.

Thanks,
Mark.

2019-01-23 17:47:31

by Scott Branden

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On 2019-01-23 9:33 a.m., Mark Rutland wrote:
> On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
>> Hi Mark,
>>
>> Hopefully I can shed some light on the use case inline.
>>
>> On 2019-01-23 8:48 a.m., Mark Rutland wrote:
>>> On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
>>>> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
>>>> wrote:
>>>>
>>>> Need comes from a specific use case where one Accelerator card(SoC) is
>>>> plugged in a sever over a PCIe interface. This Card gets supply from a
>>>> battery, which could provide very less power for a very small time, in case
>>>> of any power loss. Once Card switches to battery, this has to reduce its
>>>> power consumption to its lowest point and back-up the DDR contents asap
>>>> before battery gets fully drained off.
>>> In this example is Linux running on the server, or on the accelerator?
>> Accelerator
>>> What precisely are you trying to back up from DDR, and why?
>> Data in DDR is being written to disk at this time (disk is connected to
>> accelerator)
>>> What is responsible for backing up that contents?
>> A low power M-class processor and DMA engine which continues necessary
>> operations to transfer DDR memory to disk.
>>
>> The high power processors on the accelerator running linux needed to be
>> halted ASAP on this power loss event and M0 take over. Graceful shutdown of
>> linux and other peripherals is unnecessary (and we don't have the power
>> necessary to do so).
> If graceful shutdown of Linux is not required (and is in fact
> undesireable), why is Linux involved at all in this shutdown process?
>
> For example, why is this not a secure interrupt taken to EL3, which can
> (gracefully) shut down the CPUs regardless?
Will need Pramod to explain the detailed rationale here.
>>>> Since battery can provide limited power for a very short time hence need to
>>>> transition to lowest power. As per the transition process , CPUs power
>>>> domain has to be off but before that it needs to flush out its content to
>>>> system memory(L3) so that content could be backed-up by a MCU, a controller
>>>> consuming very less power. Since we can not afford plugging-out every
>>>> individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
>>>> which ultimately switch to ATF to flush out all the CPUs caches and comes
>>>> out of coherency domain so that its power rails could be switched-off.
>>> If you're stopping CPUs from completely arbitrary states, what is the
>>> benefit of saving the RAM contents?
>> Some of the RAM contains data that was in the process of being written to
>> disk by the accelerator.
> Ok, so this isn't actually about backing up RAM contents; it's about
> completing pending I/O.
>
> I'm still confused as to how that works. How do you avoid leaving the
> disk in some corrupt state if data runs out partway through?

Some additional flags and details are saved to disk with the "pending i/o".

On next power up an app runs which recovers the data and recovers it and
completes processing.

Of course, if the store doesn't succeed properly portions of the
recovery are discarded.

>
>> This data must be saved to disk and the high power CPUs consume too much
>> power to continue performing this operation.
>>
>>> CPUs might be running with IRQs disabled for an arbitrarily long time,
>> In an embedded linux system we control everything running.
> Sure, and that complete control allows you to do something better than
> this RFC, AFAICT.
If possible that would be great.  Need Pramod to comment whether the
direct EL3 will solve all issues.
>
> Thanks,
> Mark.

Thanks for input Mark.

Scott


2019-01-23 18:09:53

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Wed, Jan 23, 2019 at 09:46:22AM -0800, Scott Branden wrote:

[...]

>
> Some additional flags and details are saved to disk with the "pending i/o".
>

I am getting a sense that this is some raw partition on a disk(flash/eMMC/UFS)
the M0 can write to and Linux can read. It's can't be full fledge file
system like ext4 were are talking here or are we ?

--
Regards,
Sudeep

2019-01-25 07:05:39

by Pramod Kumar

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Wed, Jan 23, 2019 at 11:03 PM Mark Rutland <[email protected]> wrote:
>
> On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
> > Hi Mark,
> >
> > Hopefully I can shed some light on the use case inline.
> >
> > On 2019-01-23 8:48 a.m., Mark Rutland wrote:
> > > On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
> > > > On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
> > > > wrote:
> > > >
> > > > Need comes from a specific use case where one Accelerator card(SoC) is
> > > > plugged in a sever over a PCIe interface. This Card gets supply from a
> > > > battery, which could provide very less power for a very small time, in case
> > > > of any power loss. Once Card switches to battery, this has to reduce its
> > > > power consumption to its lowest point and back-up the DDR contents asap
> > > > before battery gets fully drained off.
> > > In this example is Linux running on the server, or on the accelerator?
> > Accelerator
> > >
> > > What precisely are you trying to back up from DDR, and why?
> > Data in DDR is being written to disk at this time (disk is connected to
> > accelerator)
> > >
> > > What is responsible for backing up that contents?
> >
> > A low power M-class processor and DMA engine which continues necessary
> > operations to transfer DDR memory to disk.
> >
> > The high power processors on the accelerator running linux needed to be
> > halted ASAP on this power loss event and M0 take over. Graceful shutdown of
> > linux and other peripherals is unnecessary (and we don't have the power
> > necessary to do so).
>
> If graceful shutdown of Linux is not required (and is in fact
> undesireable), why is Linux involved at all in this shutdown process?
>
> For example, why is this not a secure interrupt taken to EL3, which can
> (gracefully) shut down the CPUs regardless?
>

This is an GPIO interrupt. This can not be marked secure as for that
we need to mark whole GPIO controller as secure which is not possible
as GPIO controller is meant for non-secure world having more than 100
lines connected.

I agree we have work around where we invoke handler in Linux and
switch to ATF via SMC and from ATF we need bring all secondary CPU to
ATF via sending SGI and and then respective core flushes the L1/L2 and
bring himself out of coherency domain and cluster and MCU shutdowns
the CPU subsystem gracefully. This could work for our requirement.
Need to check ATF support for that.

But What about generic system? This patch address the generic
multi-master system's requirement. Consider system where shutting down
the linux does not mean shutting down the complete system. Lets take
an example of smartnic case Where NIC master and CPUs access cachable
DDR. In smarnic its quite common to bring CPUs on demand means when
needed via MCU help.
Now in full-fledged system. if CPU subsystem is shutdown via poweroff
command which does not bring secondary CPUs out of coherency domain,
it will bring the complete system unstable when NIC master tries to
access DDR and snoop is send to CPUs as well which is not available.
Fabric/System hangs...

I feel While shutting down the CPUs subsystem or powering off, All
secondary CPUs must be shutdown properly by bring-out of coherency
domain to remain rest of subsystem usable. I agree that introducing
PSCI call introduce delay for shutdown/reboot case but stability
matter than little delay.

> > > > Since battery can provide limited power for a very short time hence need to
> > > > transition to lowest power. As per the transition process , CPUs power
> > > > domain has to be off but before that it needs to flush out its content to
> > > > system memory(L3) so that content could be backed-up by a MCU, a controller
> > > > consuming very less power. Since we can not afford plugging-out every
> > > > individual CPUs in sequence hence uses ipi_cpu_stop for all other CPUs
> > > > which ultimately switch to ATF to flush out all the CPUs caches and comes
> > > > out of coherency domain so that its power rails could be switched-off.
> > > If you're stopping CPUs from completely arbitrary states, what is the
> > > benefit of saving the RAM contents?
> >
> > Some of the RAM contains data that was in the process of being written to
> > disk by the accelerator.
>
> Ok, so this isn't actually about backing up RAM contents; it's about
> completing pending I/O.
>
> I'm still confused as to how that works. How do you avoid leaving the
> disk in some corrupt state if data runs out partway through?
>
> > This data must be saved to disk and the high power CPUs consume too much
> > power to continue performing this operation.
> >
> > > CPUs might be running with IRQs disabled for an arbitrarily long time,
> >
> > In an embedded linux system we control everything running.
>
> Sure, and that complete control allows you to do something better than
> this RFC, AFAICT.
>
> Thanks,
> Mark.

2019-01-25 15:58:29

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On 25/01/2019 07:03, Pramod Kumar wrote:
> On Wed, Jan 23, 2019 at 11:03 PM Mark Rutland <[email protected]> wrote:
>>
>> On Wed, Jan 23, 2019 at 09:05:26AM -0800, Scott Branden wrote:
>>> Hi Mark,
>>>
>>> Hopefully I can shed some light on the use case inline.
>>>
>>> On 2019-01-23 8:48 a.m., Mark Rutland wrote:
>>>> On Mon, Jan 21, 2019 at 11:30:02AM +0530, Pramod Kumar wrote:
>>>>> On Mon, Jan 21, 2019 at 11:28 AM Pramod Kumar <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Need comes from a specific use case where one Accelerator card(SoC) is
>>>>> plugged in a sever over a PCIe interface. This Card gets supply from a
>>>>> battery, which could provide very less power for a very small time, in case
>>>>> of any power loss. Once Card switches to battery, this has to reduce its
>>>>> power consumption to its lowest point and back-up the DDR contents asap
>>>>> before battery gets fully drained off.
>>>> In this example is Linux running on the server, or on the accelerator?
>>> Accelerator
>>>>
>>>> What precisely are you trying to back up from DDR, and why?
>>> Data in DDR is being written to disk at this time (disk is connected to
>>> accelerator)
>>>>
>>>> What is responsible for backing up that contents?
>>>
>>> A low power M-class processor and DMA engine which continues necessary
>>> operations to transfer DDR memory to disk.
>>>
>>> The high power processors on the accelerator running linux needed to be
>>> halted ASAP on this power loss event and M0 take over. Graceful shutdown of
>>> linux and other peripherals is unnecessary (and we don't have the power
>>> necessary to do so).
>>
>> If graceful shutdown of Linux is not required (and is in fact
>> undesireable), why is Linux involved at all in this shutdown process?
>>
>> For example, why is this not a secure interrupt taken to EL3, which can
>> (gracefully) shut down the CPUs regardless?
>>
>
> This is an GPIO interrupt. This can not be marked secure as for that
> we need to mark whole GPIO controller as secure which is not possible
> as GPIO controller is meant for non-secure world having more than 100
> lines connected.
>
> I agree we have work around where we invoke handler in Linux and
> switch to ATF via SMC and from ATF we need bring all secondary CPU to
> ATF via sending SGI and and then respective core flushes the L1/L2 and
> bring himself out of coherency domain and cluster and MCU shutdowns
> the CPU subsystem gracefully. This could work for our requirement.
> Need to check ATF support for that.

Right, SMCCC has whole spaces for SoC-specific and platform-specific
service calls. If your system has a need to power off as fast as
possible under system-specific constraints, it seems much more sensible
to immediately tell the firmware "power off as fast as possible under
the system-specific constraints that you have full knowledge of,
please", rather than trying to coax the generic kernel_halt() (or
whatever) infrastructure to sort-of-do-what-you-want.

> But What about generic system? This patch address the generic
> multi-master system's requirement. Consider system where shutting down
> the linux does not mean shutting down the complete system. Lets take
> an example of smartnic case Where NIC master and CPUs access cachable
> DDR. In smarnic its quite common to bring CPUs on demand means when
> needed via MCU help.
> Now in full-fledged system. if CPU subsystem is shutdown via poweroff
> command which does not bring secondary CPUs out of coherency domain,
> it will bring the complete system unstable when NIC master tries to
> access DDR and snoop is send to CPUs as well which is not available.
> Fabric/System hangs...

Not sure that's really relevant here... If platform firmware is able to
power things off in a way that breaks the platform, surely that's
entirely the firmware's own fault.

> I feel While shutting down the CPUs subsystem or powering off, All
> secondary CPUs must be shutdown properly by bring-out of coherency
> domain to remain rest of subsystem usable. I agree that introducing
> PSCI call introduce delay for shutdown/reboot case but stability
> matter than little delay.

Again, if you don't trust the firmware to implement SYSTEM_OFF
appropriately for the platform, can you really assume its CPU_OFF
implementation is safe either?

People already complain today about how long CPU bringup takes on
certain systems. Extending their reboot cycle by a similar degree for
reasons that are entirely irrelevant to those systems is hardly going to
make those users any happier.

Robin.

2019-01-25 16:54:13

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH RFC 1/1] arm64: Use PSCI calls for CPU stop when hotplug is supported

On Fri, Jan 25, 2019 at 12:33:56PM +0530, Pramod Kumar wrote:

[...]

>
> This is an GPIO interrupt. This can not be marked secure as for that
> we need to mark whole GPIO controller as secure which is not possible
> as GPIO controller is meant for non-secure world having more than 100
> lines connected.
>
> I agree we have work around where we invoke handler in Linux and
> switch to ATF via SMC and from ATF we need bring all secondary CPU to
> ATF via sending SGI and and then respective core flushes the L1/L2 and
> bring himself out of coherency domain and cluster and MCU shutdowns
> the CPU subsystem gracefully. This could work for our requirement.
> Need to check ATF support for that.
>

Yes platform specific requirement and platform specific solution,
happy ending :).

> But What about generic system? This patch address the generic
> multi-master system's requirement.

Why do we need to address this in generic system ? And what exactly
is this multi-master system's requirement ? Linux runs on one or more
masters and will own it completely. Shutdown must involve everything
it owns if it needs to be graceful.

> Consider system where shutting down the linux does not mean shutting
> down the complete system.

Why is that the case ? Is it forceful shutdown ? Does Linux just own
CPUs and don't care about other blocks in the system ? Irrespective
of what it owns, system shutdown will take care.

> Lets take an example of smartnic case Where NIC master and CPUs access
> cachable DDR. In smarnic its quite common to bring CPUs on demand means
> when needed via MCU help.

Yes you are talking about CPU hotplug or system here ? The above
indicates, it's just CPU hotplug and a solution already exists for that.

> Now in full-fledged system. if CPU subsystem is shutdown via poweroff
> command which does not bring secondary CPUs out of coherency domain,
> it will bring the complete system unstable when NIC master tries to
> access DDR and snoop is send to CPUs as well which is not available.
> Fabric/System hangs...
>

That's because your custom solution just sends ipi to stop CPUs. If you
shutdown the system, all the required information is save to non-volatile
memory and system is powered off gracefully.

> I feel While shutting down the CPUs subsystem or powering off, All
> secondary CPUs must be shutdown properly by bring-out of coherency
> domain to remain rest of subsystem usable. I agree that introducing
> PSCI call introduce delay for shutdown/reboot case but stability
> matter than little delay.

IPI_STOP is not designed to do a graceful shutdown of CPU subsystem.
Use CPU hotplug. You are trying to make you custom requirement a generic
one. We have CPU hotplug framework to do what you want, you just have to
use it. Infact you are doing almost the same with you patch, I don't see
any point as why CPU hotplug can't be used.

--
Regards,
Sudeep