LinuxLists.cc - [PATCH v3 0/3] ARM: rockchip: fix the SMP

2015-06-05 15:12:42

Subject: [PATCH v3 0/3] ARM: rockchip: fix the SMP

Verified on url =
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-3.14

Caesar Wang (3):
ARM: rockchip: fix the CPU soft reset
ARM: rockchip: ensure CPU to enter WFI/WFE state
ARM: rockchip: fix the SMP code style

arch/arm/mach-rockchip/platsmp.c | 36 +++++++++++++++++++++++-------------
1 file changed, 23 insertions(+), 13 deletions(-)

--
1.9.1

2015-06-05 15:12:52

by Caesar Wang

[permalink] [raw]

Subject: [PATCH v3 1/3] ARM: rockchip: fix the CPU soft reset

We need different orderings when turning a core on and turning a core
off. In one case we need to assert reset before turning power off.
In ther other case we need to turn power on and the deassert reset.

In general, the correct flow is:

CPU off:
reset_control_assert
regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
CPU on:
regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
reset_control_deassert

This is needed for stressing CPU up/down, as per:
cd /sys/devices/system/cpu/
for i in $(seq 1000); do
echo "================= $i ============"
for j in $(seq 100); do
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000"" ]]
echo 0 > cpu1/online
echo 0 > cpu2/online
echo 0 > cpu3/online
done
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
echo 1 > cpu1/online
echo 1 > cpu2/online
echo 1 > cpu3/online
done
done
done

The following is reproducile log:
[34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
[34466.186824] Disabling non-boot CPUs ...
[34466.187509] CPU1: shutdown
[34466.188672] CPU2: shutdown
[34473.736627] Kernel panic - not syncing:Watchdog detected hard LOCKUP on cpu 0
.......

Signed-off-by: Caesar Wang <[email protected]>
---

arch/arm/mach-rockchip/platsmp.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 5b4ca3c..25da16f 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -88,18 +88,24 @@ static int pmu_set_power_domain(int pd, bool on)
return PTR_ERR(rstc);
}

- if (on)
- reset_control_deassert(rstc);
- else
+ if (!on)
reset_control_assert(rstc);

- reset_control_put(rstc);
- }
+ ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
+ if (ret < 0) {
+ pr_err("%s: could not update power domain\n", __func__);
+ reset_control_put(rstc);
+ return ret;
+ }

- ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
- if (ret < 0) {
- pr_err("%s: could not update power domain\n", __func__);
- return ret;
+ if (on)
+ reset_control_deassert(rstc);
+ } else {
+ ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
+ if (ret < 0) {
+ pr_err("%s: could not update power domain\n", __func__);
+ return ret;
+ }
}

ret = -1;
--
1.9.1

2015-06-05 15:13:02

by Caesar Wang

[permalink] [raw]

Subject: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
WFI/WFE state.
we can delay 1ms to ensure the CPU enter WFI/WFE state.

Signed-off-by: Caesar Wang <[email protected]>
---

arch/arm/mach-rockchip/platsmp.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 25da16f..6672fdd 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -325,6 +325,9 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
#ifdef CONFIG_HOTPLUG_CPU
static int rockchip_cpu_kill(unsigned int cpu)
{
+ /* ensure CPU can enter the WFI/WFE state */
+ mdelay(1);
+
pmu_set_power_domain(0 + cpu, false);
return 1;
}
--
1.9.1

2015-06-05 15:13:08

by Caesar Wang

[permalink] [raw]

Subject: [PATCH v3 3/3] ARM: rockchip: fix the SMP code style

Use the below scripts to check:
scripts/checkpatch.pl -f --subject arch/arm/mach-rockchip/platsmp.c
Although there is a check, it's no matter.

CHECK: usleep_range is preferred over udelay; see
Documentation/timers/timers-howto.txt
+167udelay(10);
total: 0 errors, 0 warnings, 1 checks, 362 lines checked

Changes in v3:
- FIx the PATCH v2, it doesn't work on chromium 3.14.

Changes in v2:
- As Kever points out, Fix the subject typo WIF/WFI in PATCH [2/3].
- As Heiko suggestion, re-adjust the cpu on/off flow in PATCH [1/3].
- Use the checkpatch.pl -f --subjective to check in PATCH [3/3].

Signed-off-by: Caesar Wang <[email protected]>
---

arch/arm/mach-rockchip/platsmp.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 6672fdd..ac9173e 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -113,7 +113,7 @@ static int pmu_set_power_domain(int pd, bool on)
ret = pmu_power_domain_is_on(pd);
if (ret < 0) {
pr_err("%s: could not read power domain state\n",
- __func__);
+ __func__);
return ret;
}
}
@@ -137,7 +137,7 @@ static int __cpuinit rockchip_boot_secondary(unsigned int cpu,

if (cpu >= ncores) {
pr_err("%s: cpu %d outside maximum number of cpus %d\n",
- __func__, cpu, ncores);
+ __func__, cpu, ncores);
return -ENXIO;
}

@@ -156,7 +156,7 @@ static int __cpuinit rockchip_boot_secondary(unsigned int cpu,
* */
udelay(10);
writel(virt_to_phys(rockchip_secondary_startup),
- sram_base_addr + 8);
+ sram_base_addr + 8);
writel(0xDEADBEAF, sram_base_addr + 4);
dsb_sev();
}
@@ -335,7 +335,7 @@ static int rockchip_cpu_kill(unsigned int cpu)
static void rockchip_cpu_die(unsigned int cpu)
{
v7_exit_coherency_flush(louis);
- while(1)
+ while (1)
cpu_do_idle();
}
#endif
@@ -348,4 +348,5 @@ static struct smp_operations rockchip_smp_ops __initdata = {
.cpu_die = rockchip_cpu_die,
#endif
};
+
CPU_METHOD_OF_DECLARE(rk3066_smp, "rockchip,rk3066-smp", &rockchip_smp_ops);
--
1.9.1

2015-06-05 16:09:44

by Heiko Stuebner

[permalink] [raw]

Subject: Re: [PATCH v3 1/3] ARM: rockchip: fix the CPU soft reset

Hi Caesar,

Am Freitag, 5. Juni 2015, 23:11:42 schrieb Caesar Wang:
> We need different orderings when turning a core on and turning a core
> off. In one case we need to assert reset before turning power off.
> In ther other case we need to turn power on and the deassert reset.
>
> In general, the correct flow is:
>
> CPU off:
> reset_control_assert
> regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> CPU on:
> regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> reset_control_deassert
>
> This is needed for stressing CPU up/down, as per:
> cd /sys/devices/system/cpu/
> for i in $(seq 1000); do
> echo "================= $i ============"
> for j in $(seq 100); do
> while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> cpu3/online)" != "000"" ]] echo 0 > cpu1/online
> echo 0 > cpu2/online
> echo 0 > cpu3/online
> done
> while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> cpu3/online)" != "111" ]]; do echo 1 > cpu1/online
> echo 1 > cpu2/online
> echo 1 > cpu3/online
> done
> done
> done
>
> The following is reproducile log:
> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> [34466.186824] Disabling non-boot CPUs ...
> [34466.187509] CPU1: shutdown
> [34466.188672] CPU2: shutdown
> [34473.736627] Kernel panic - not syncing:Watchdog detected hard LOCKUP
> on cpu 0 .......
>
> Signed-off-by: Caesar Wang <[email protected]>
> ---

could we do this something like the shown below instead?
Here the deassertion of the reset really happens after we are sure the
power-domain is on

-------------- 8< ---------------
diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 5b4ca3c..ee5dbad6 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -72,6 +72,7 @@ static struct reset_control *rockchip_get_core_reset(int cpu)
static int pmu_set_power_domain(int pd, bool on)
{
u32 val = (on) ? 0 : BIT(pd);
+ struct reset_control *rstc = rockchip_get_core_reset(pd);
int ret;

/*
@@ -80,20 +81,16 @@ static int pmu_set_power_domain(int pd, bool on)
* processor is powered down.
*/
if (read_cpuid_part() != ARM_CPU_PART_CORTEX_A9) {
- struct reset_control *rstc = rockchip_get_core_reset(pd);
-
+ /* We only require the reset on the RK3288 at the moment */
if (IS_ERR(rstc)) {
pr_err("%s: could not get reset control for core %d\n",
__func__, pd);
return PTR_ERR(rstc);
}

- if (on)
- reset_control_deassert(rstc);
- else
+ if (!on)
reset_control_assert(rstc);

- reset_control_put(rstc);
}

ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
@@ -112,6 +109,12 @@ static int pmu_set_power_domain(int pd, bool on)
}
}

+ if (read_cpuid_part() != ARM_CPU_PART_CORTEX_A9 && on)
+ reset_control_deassert(rstc);
+
+ if (!IS_ERR(rstc))
+ reset_control_put(rstc);
+
return 0;
}

-------------- 8< ---------------

2015-06-05 16:17:21

by Heiko Stuebner

[permalink] [raw]

Subject: Re: [PATCH v3 1/3] ARM: rockchip: fix the CPU soft reset

Hi Caesar,

Am Freitag, 5. Juni 2015, 17:52:48 schrieb Heiko St?bner:
> could we do this something like the shown below instead?
> Here the deassertion of the reset really happens after we are sure the
> power-domain is on

As you mention in the gerrit cl that the hang does happen when you
deassert the reset before the power-domain wait, ignore my last message please
:-)

But please make sure to also provide a changelog to the list submissions in
the future.

Thanks
Heiko

2015-06-05 16:28:45

by Doug Anderson

[permalink] [raw]

Subject: Re: [PATCH v3 1/3] ARM: rockchip: fix the CPU soft reset

Caesar,

On Fri, Jun 5, 2015 at 8:11 AM, Caesar Wang <[email protected]> wrote:
> We need different orderings when turning a core on and turning a core
> off. In one case we need to assert reset before turning power off.
> In ther other case we need to turn power on and the deassert reset.
>
> In general, the correct flow is:
>
> CPU off:
> reset_control_assert
> regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))

Add: "ensure power domain is on" to this list.

> CPU on:
> regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> reset_control_deassert

Add: "ensure power domain is on" to this list.

Adding the "ensure power domain is on" step helps document that patch
set version 2 is not what you want and that you thought about it.

> @@ -88,18 +88,24 @@ static int pmu_set_power_domain(int pd, bool on)
> return PTR_ERR(rstc);
> }
>
> - if (on)
> - reset_control_deassert(rstc);
> - else
> + if (!on)
> reset_control_assert(rstc);
>
> - reset_control_put(rstc);
> - }
> + ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> + if (ret < 0) {
> + pr_err("%s: could not update power domain\n", __func__);
> + reset_control_put(rstc);
> + return ret;
> + }
>
> - ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> - if (ret < 0) {
> - pr_err("%s: could not update power domain\n", __func__);
> - return ret;
> + if (on)
> + reset_control_deassert(rstc);

I think you need a "reset_control_put(rstc);" here in the non-error case.

Otherwise this looks reasonable to me and you can add my Reviewed-by
tag. I'll also kick off some tests with this series today to confirm
as well.

-Doug

2015-06-05 17:49:18

by Doug Anderson

[permalink] [raw]

Subject: Re: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

Caesar,

On Fri, Jun 5, 2015 at 8:11 AM, Caesar Wang <[email protected]> wrote:
> In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
> WFI/WFE state.
> we can delay 1ms to ensure the CPU enter WFI/WFE state.
>
> Signed-off-by: Caesar Wang <[email protected]>
> ---
>
> arch/arm/mach-rockchip/platsmp.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
> index 25da16f..6672fdd 100644
> --- a/arch/arm/mach-rockchip/platsmp.c
> +++ b/arch/arm/mach-rockchip/platsmp.c
> @@ -325,6 +325,9 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
> #ifdef CONFIG_HOTPLUG_CPU
> static int rockchip_cpu_kill(unsigned int cpu)
> {
> + /* ensure CPU can enter the WFI/WFE state */
> + mdelay(1);

This is a pretty weak assurance. Is there any stronger assurance you
can give that we're in WFI/WFE state and won't come out of it?

Do you actually see problems if you power off a CPU when it's not in
WFI/WFE state?

...so I _think_ I see the path that is happening here and what you're
trying to handle. Specifically, I see:

On dying CPU:
1. cpu_die() calls 'complete(&cpu_died)'
2. cpu_die() calls 'smp_ops.cpu_die(cpu)' AKA rockchip_cpu_die()
3. rockchip_cpu_die() does a bit more cache flushing before looping in
cpu_do_idle()

The problem is that the moment the completion happens in step #1 above
the dying CPU can be killed. ...so you're trying to make sure the
dying CPU makes it to cpu_do_idle(). In that case a fixed mdelay(1)
might be OK since the time that the CPU takes to run through a few
instructions (with no interrupts) is pretty predictable. It would be
really nice if the commit message went through all this, though.

...but is there any chance that cpu_do_idle() could somehow return?
We shouldn't send any events since we've marked the core offline, but
perhaps some per-core interrupt (arch timer?) that didn't get
migrated?

-Doug

2015-06-05 18:30:06

by Russell King - ARM Linux

[permalink] [raw]

Subject: Re: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

On Fri, Jun 05, 2015 at 10:49:14AM -0700, Doug Anderson wrote:
> On Fri, Jun 5, 2015 at 8:11 AM, Caesar Wang <[email protected]> wrote:
> > In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
> > WFI/WFE state.
> > we can delay 1ms to ensure the CPU enter WFI/WFE state.
> >
> > Signed-off-by: Caesar Wang <[email protected]>
> > ---
> >
> > arch/arm/mach-rockchip/platsmp.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
> > index 25da16f..6672fdd 100644
> > --- a/arch/arm/mach-rockchip/platsmp.c
> > +++ b/arch/arm/mach-rockchip/platsmp.c
> > @@ -325,6 +325,9 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
> > #ifdef CONFIG_HOTPLUG_CPU
> > static int rockchip_cpu_kill(unsigned int cpu)
> > {
> > + /* ensure CPU can enter the WFI/WFE state */
> > + mdelay(1);
>
> This is a pretty weak assurance. Is there any stronger assurance you
> can give that we're in WFI/WFE state and won't come out of it?
>
> Do you actually see problems if you power off a CPU when it's not in
> WFI/WFE state?

I really don't like to see platforms pulling crap tricks in their
CPU hotunplug code like this. If there's something wrong with the
generic code, then the generic code needs to be fixed, not worked
around in platform code.

> ...so I _think_ I see the path that is happening here and what you're
> trying to handle. Specifically, I see:
>
> On dying CPU:
> 1. cpu_die() calls 'complete(&cpu_died)'
> 2. cpu_die() calls 'smp_ops.cpu_die(cpu)' AKA rockchip_cpu_die()
> 3. rockchip_cpu_die() does a bit more cache flushing before looping in
> cpu_do_idle()
>
> The problem is that the moment the completion happens in step #1 above
> the dying CPU can be killed. ...so you're trying to make sure the
> dying CPU makes it to cpu_do_idle(). In that case a fixed mdelay(1)
> might be OK since the time that the CPU takes to run through a few
> instructions (with no interrupts) is pretty predictable. It would be
> really nice if the commit message went through all this, though.
>
> ...but is there any chance that cpu_do_idle() could somehow return?
> We shouldn't send any events since we've marked the core offline, but
> perhaps some per-core interrupt (arch timer?) that didn't get
> migrated?

How this is supposed to work is:

CPU requesting death CPU dying
stop_machine(take_down_cpu)
take_down_cpu()
-__cpu_disable()
- platform_cpu_disable()
- marks CPU offline
- migrates IRQs away
- caches flushed
- tlbs flushed
__cpu_die() - processes migrated away
-wait_for_completion_timeout() -timekeeping migrated away
returns to idle loop
arch_cpu_idle_dead()
-cpu_die()
- idle_task_exit()
- flush_cache_louis()

At this point, dirty cache lines which matter are flushed from the
dying CPUs cache. However, we still need the dying CPU to be
coherent for the next step, which is to issue the completion.

- complete()
- flush_cache_louis()

At some point during the above, the completion becomes visible to
the other CPU, and it continues its execution.

-pr_notice()
-platform_cpu_kill() - smp_ops.cpu_die()

The precise ordering of smp_ops.cpu_die() vs platform_cpu_kill() is
pretty much indeterminant, and is subject to scheduling effects which
could delay the requesting CPU.

Now, the problem is, from what ARM has been saying, that cache lines
can get migrated to the dying CPU which may contain the "master" copy
of the data, which means that when the dying CPU actually exits
coherency, that data is lost to the rest of the system.

Ideally, what we would like to do is to exit coherency earlier, and
then have some notification mechanism between the dying CPU and the
rest of the system to say that it's finished exiting coherency.
However, we face several issues.

1) v7_coherency_exit() is specific to v7 CPUs and can't be used by
generic code.

2) we have no way to perform that notification reliably once
coherency has been exited.

There's other reasons we need to change the notification mechanism,
one of them is that completions use RCU, and RCU isn't actually valid
in this context - and we can't make it valid because the dying CPU
might loose power at any moment after it's called complete().

Paul McKenny created what was a generic infrastructure for this
notification, but I object to it on principle using atomic_t... and
if we've exited coherency (as we would want to), it won't work
because it makes use of atomic_cmpxchg() on the dying CPU.

I've been working on a solution to use an IPI to notify from the
dying CPU, but that's fraught because of the separation of the IRQ
controller code from the architecture code, and we now have spinlocks
in the IPI-raising path due to the big.LITTLE stuff - and spinlocks
need coherency to be working.

So, we're actually in a very sticky position over taking CPUs offline.
It seems to be something that the ARM architecture and kernel
architecture doesn't actually allow to be done safely. So much so,
that in a similar way to the original Keystone 2 physical address
switch, I'm tempted to make taking a CPU offline taint the kernel!

We're going to end up with tainted kernels through this path when Paul
pushes his RCU correctness patches anyway anyway, so it's just a
matter of the inevitable taint happening sooner or later.

--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

2015-06-05 20:24:44

by Doug Anderson

[permalink] [raw]

Subject: Re: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

Russell,

On Fri, Jun 5, 2015 at 11:29 AM, Russell King - ARM Linux
<[email protected]> wrote:
> 1) v7_coherency_exit() is specific to v7 CPUs and can't be used by
> generic code.

Oh, I see. So (I think) you're saying that perhaps the reason that
Caesar needed his patch was that he needed the dying processor to
execute v7_exit_coherency_flush(), NOT that he needed the dying
processor to be in WFI/WFE. That actually makes a lot more sense to
me! :) Thanks a lot for pointing that out, it's very helpful.

> So, we're actually in a very sticky position over taking CPUs offline.
> It seems to be something that the ARM architecture and kernel
> architecture doesn't actually allow to be done safely. So much so,
> that in a similar way to the original Keystone 2 physical address
> switch, I'm tempted to make taking a CPU offline taint the kernel!

Wow, that's going to suck. So if you want to suspend / resume you
need to taint your kernel. So much for saving the planet by going
into suspend... ...or are you thinking that it won't taint the kernel
when the kernel takes CPUs offline for suspend/resume purposes? ...or
are you thinking you've some solution that works for suspend/resume
that doesn't work for the general cpu offlining problem? I'd be very
interested to hear...

I know I'm not a maintainer, but if I were and I knew that lots of
smart people had thought about the problem of CPU offlining and they
didn't have a solution and I could make my platform 99.99999999%
reliable by allowing a very safe mdelay(1) where I had a pretty strong
guarantee that the 1ms was enough time, I would probably accept that
code...

So since I'm not a maintainer and I certainly couldn't ack such code,
I would certainly be happy to add my Reviewed-by to Caesar's patch if
he changed it mention that he needed to make sure that
v7_exit_coherency_flush() in rockchip_cpu_die() executed in time.

-Doug

2015-06-08 04:56:39

by Caesar Wang

[permalink] [raw]

Subject: Re: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

在 2015年06月06日 04:24, Doug Anderson 写道:
> Russell,
>
> On Fri, Jun 5, 2015 at 11:29 AM, Russell King - ARM Linux
> <[email protected]> wrote:
>> 1) v7_coherency_exit() is specific to v7 CPUs and can't be used by
>> generic code.
> Oh, I see. So (I think) you're saying that perhaps the reason that
> Caesar needed his patch was that he needed the dying processor to
> execute v7_exit_coherency_flush(), NOT that he needed the dying
> processor to be in WFI/WFE. That actually makes a lot more sense to
> me! :) Thanks a lot for pointing that out, it's very helpful.
>
>
>> So, we're actually in a very sticky position over taking CPUs offline.
>> It seems to be something that the ARM architecture and kernel
>> architecture doesn't actually allow to be done safely. So much so,
>> that in a similar way to the original Keystone 2 physical address
>> switch, I'm tempted to make taking a CPU offline taint the kernel!
> Wow, that's going to suck. So if you want to suspend / resume you
> need to taint your kernel. So much for saving the planet by going
> into suspend... ...or are you thinking that it won't taint the kernel
> when the kernel takes CPUs offline for suspend/resume purposes? ...or
> are you thinking you've some solution that works for suspend/resume
> that doesn't work for the general cpu offlining problem? I'd be very
> interested to hear...
>
>
> I know I'm not a maintainer, but if I were and I knew that lots of
> smart people had thought about the problem of CPU offlining and they
> didn't have a solution and I could make my platform 99.99999999%
> reliable by allowing a very safe mdelay(1) where I had a pretty strong
> guarantee that the 1ms was enough time, I would probably accept that
> code...
>
>
> So since I'm not a maintainer and I certainly couldn't ack such code,
> I would certainly be happy to add my Reviewed-by to Caesar's patch if
> he changed it mention that he needed to make sure that
> v7_exit_coherency_flush() in rockchip_cpu_die() executed in time.

OK.
The dying processor to execute v7_exit_coherency_flush(),not that the
dying processor to be in WFI/WFE.

It's needed to enter WFI/WFE state from the ARM refer document when CPU
down.

But......

Here is my test: (won't to enter the WFI state)
@@ -331,8 +331,8 @@ static int rockchip_cpu_kill(unsigned int cpu)
static void rockchip_cpu_die(unsigned int cpu)
{
v7_exit_coherency_flush(louis);
- while (1)
- cpu_do_idle();
+ while (1);
+ //cpu_do_idle();
}

echo 0 > /sys/module/printk/parameters/console_suspend
echo 1 > /sys/power/pm_print_times
echo mem > sys/power/state

You can play anything
or do some test for CPU up/down:

cd /sys/devices/system/cpu/
for i in $(seq 10000); do
echo "================= $i ============"
for j in $(seq 100); do
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)"
!= "000" ]]; do

echo 0 > cpu1/online
echo 0 > cpu2/online
echo 0 > cpu3/online
done
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)"
!= "111" ]]; do
echo 1 > cpu1/online
echo 1 > cpu2/online
echo 1 > cpu3/online
done
done
done
Sometimes,the system will be restart when do the about test.
I'm no sure what's happen, That maybe abnormal won't to enter the WFI
state.

>
> -Doug
>
>
>

--
Thanks,
- Caesar

2015-06-08 20:52:33

by Doug Anderson

[permalink] [raw]

Subject: Re: [PATCH v3 2/3] ARM: rockchip: ensure CPU to enter WFI/WFE state

Caesar,

On Sun, Jun 7, 2015 at 9:56 PM, Caesar Wang <[email protected]> wrote:
> OK.
> The dying processor to execute v7_exit_coherency_flush(),not that the dying
> processor to be in WFI/WFE.
>
> It's needed to enter WFI/WFE state from the ARM refer document when CPU
> down.
>
> But......
>
> Here is my test: (won't to enter the WFI state)
> @@ -331,8 +331,8 @@ static int rockchip_cpu_kill(unsigned int cpu)
> static void rockchip_cpu_die(unsigned int cpu)
> {
> v7_exit_coherency_flush(louis);
> - while (1)
> - cpu_do_idle();
> + while (1);
> + //cpu_do_idle();
> }
>
> echo 0 > /sys/module/printk/parameters/console_suspend
> echo 1 > /sys/power/pm_print_times
> echo mem > sys/power/state
>
> You can play anything
> or do some test for CPU up/down:
>
> cd /sys/devices/system/cpu/
> for i in $(seq 10000); do
> echo "================= $i ============"
> for j in $(seq 100); do
> while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" !=
> "000" ]]; do
>
> echo 0 > cpu1/online
> echo 0 > cpu2/online
> echo 0 > cpu3/online
> done
> while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" !=
> "111" ]]; do
> echo 1 > cpu1/online
> echo 1 > cpu2/online
> echo 1 > cpu3/online
> done
> done
> done
> Sometimes,the system will be restart when do the about test.

I assume you meant the _above_ test. So it sometimes works and
sometimes doesn't? Strange...

> I'm no sure what's happen, That maybe abnormal won't to enter the WFI state.

Good test. I think I understood what you said above: basically if you
don't get to the WFI state then the system will sometimes restart.

I guess that seems a little strange to me since I would have thought
that since you assert "reset" to the CPU before you power it off it
wouldn't matter a whole lot what state the system was in. The
processor has exited concurrency and isn't touching any memory, so
nothing it does should be hurting anyone. ...but I also am only
guessing about how all this works / trying to use common sense. If we
really need to be in WFI/WFE then I guess that's what we need to do.

It still makes me nervous to say both that we "need to be in WFI" and
that we have a loop around cpu_do_idle(). The loop implies that
cpu_do_idle() might return sometimes. That would be OK, except now
we've learned that if we happen to kill the CPU while we're executing
the loop that it might crash the system. That's pretty non-ideal. I
know the chances are incredibly unlikely, but still something that
worries me...

-Doug