2023-02-23 04:46:02

by Xueqin Luo

Subject: [PATCH -next] PM: tools: add "CPU killed" timeline on arm64 platform

On the arm64 platform, the core log of cpu offline is as follows:

[ 100.431501] CPU1: shutdown
[ 100.454820] psci: CPU1 killed (polled 20 ms)
[ 100.459266] CPU2: shutdown
[ 100.482575] psci: CPU2 killed (polled 20 ms)
[ 100.486057] CPU3: shutdown
[ 100.513974] psci: CPU3 killed (polled 28 ms)
[ 100.518068] CPU4: shutdown
[ 100.541481] psci: CPU4 killed (polled 24 ms)

'smpboot: CPU (?P<cpu>[0-9]*) is now offline' cannot be applied
to the arm64 platform, which caused the loss of the suspend
machine stage in S3. Here I added core code to fix this issue.

Signed-off-by: Xueqin Luo <[email protected]>
---
tools/power/pm-graph/sleepgraph.py | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/power/pm-graph/sleepgraph.py b/tools/power/pm-graph/sleepgraph.py
index 82c09cd25cc2..d816970b0a3d 100755
--- a/tools/power/pm-graph/sleepgraph.py
+++ b/tools/power/pm-graph/sleepgraph.py
@@ -4132,9 +4132,12 @@ def parseKernelLog(data):
elif(re.match('Enabling non-boot CPUs .*', msg)):
# start of first cpu resume
cpu_start = ktime
- elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)):
+ elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg) \
+ or re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)):
# end of a cpu suspend, start of the next
m = re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)
+ if(not m):
+ m = re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)
cpu = 'CPU'+m.group('cpu')
if(cpu not in actions):
actions[cpu] = []
--
2.25.1
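
The matching logic in the patch can be sketched in isolation. This is a minimal illustration, not the code from sleepgraph.py: the two regular expressions are copied from the diff above, but the helper name `cpu_offline_name`, the `OFFLINE_PATTERNS` tuple, and the assumption that `msg` is the log text with the `[ timestamp ]` prefix already stripped are hypothetical. The `smpboot` sample line is constructed from the pattern, since the log excerpt above only shows arm64 output.

```python
import re

# Patterns copied from the patch; everything else here is illustrative.
OFFLINE_PATTERNS = (
    re.compile(r'smpboot: CPU (?P<cpu>[0-9]*) is now offline'),
    re.compile(r'psci: CPU(?P<cpu>[0-9]*) killed.*'),
)


def cpu_offline_name(msg):
    """Return 'CPU<n>' if msg reports a CPU going offline, else None.

    msg is assumed to be the kernel log text without its timestamp.
    """
    for pattern in OFFLINE_PATTERNS:
        m = pattern.match(msg)
        if m:
            return 'CPU' + m.group('cpu')
    return None


if __name__ == '__main__':
    samples = (
        'CPU1: shutdown',                    # arm64, not an offline marker
        'psci: CPU1 killed (polled 20 ms)',  # arm64 offline marker
        'smpboot: CPU 1 is now offline',     # built from the first pattern
    )
    for msg in samples:
        print(repr(msg), '->', cpu_offline_name(msg))
```

Trying each pattern once and returning on the first hit avoids running every regular expression twice, as the `elif` test plus the later `re.match` calls in the patch do.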



2023-02-23 18:12:35

by Rafael J. Wysocki

Subject: Re: [PATCH -next] PM: tools: add "CPU killed" timeline on arm64 platform

On Thu, Feb 23, 2023 at 5:46 AM Xueqin Luo <[email protected]> wrote:
>
> On the arm64 platform, the core log of cpu offline is as follows:

Please spell CPU in capitals.

> [ 100.431501] CPU1: shutdown
> [ 100.454820] psci: CPU1 killed (polled 20 ms)
> [ 100.459266] CPU2: shutdown
> [ 100.482575] psci: CPU2 killed (polled 20 ms)
> [ 100.486057] CPU3: shutdown
> [ 100.513974] psci: CPU3 killed (polled 28 ms)
> [ 100.518068] CPU4: shutdown
> [ 100.541481] psci: CPU4 killed (polled 24 ms)
>
> 'smpboot: CPU (?P<cpu>[0-9]*) is now offline' cannot be applied
> to the arm64 platform, which caused the loss of the suspend
> machine stage in S3.

I'm not exactly sure what you mean by "loss of the suspend machine stage in S3".

> Here I added core code to fix this issue.
>
> Signed-off-by: Xueqin Luo <[email protected]>
> ---
> tools/power/pm-graph/sleepgraph.py | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tools/power/pm-graph/sleepgraph.py b/tools/power/pm-graph/sleepgraph.py
> index 82c09cd25cc2..d816970b0a3d 100755
> --- a/tools/power/pm-graph/sleepgraph.py
> +++ b/tools/power/pm-graph/sleepgraph.py
> @@ -4132,9 +4132,12 @@ def parseKernelLog(data):
> elif(re.match('Enabling non-boot CPUs .*', msg)):
> # start of first cpu resume
> cpu_start = ktime
> - elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)):
> + elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg) \
> + or re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)):
> # end of a cpu suspend, start of the next
> m = re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)
> + if(not m):
> + m = re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)
> cpu = 'CPU'+m.group('cpu')
> if(cpu not in actions):
> actions[cpu] = []
> --

The changes look reasonable to me, though.

Todd, any comments?

2023-02-27 02:27:54

by Xueqin Luo

Subject: Re: [PATCH -next] PM: tools: add "CPU killed" timeline on arm64 platform

On 2023/2/24 02:11, Rafael J. Wysocki wrote:
> On Thu, Feb 23, 2023 at 5:46 AM Xueqin Luo <[email protected]> wrote:
>>
>> On the arm64 platform, the core log of cpu offline is as follows:
>
> Please spell CPU in capitals.

Thanks for pointing out my mistake.

>
>> [ 100.431501] CPU1: shutdown
>> [ 100.454820] psci: CPU1 killed (polled 20 ms)
>> [ 100.459266] CPU2: shutdown
>> [ 100.482575] psci: CPU2 killed (polled 20 ms)
>> [ 100.486057] CPU3: shutdown
>> [ 100.513974] psci: CPU3 killed (polled 28 ms)
>> [ 100.518068] CPU4: shutdown
>> [ 100.541481] psci: CPU4 killed (polled 24 ms)
>>
>> 'smpboot: CPU (?P<cpu>[0-9]*) is now offline' cannot be applied
>> to the arm64 platform, which caused the loss of the suspend
>> machine stage in S3.
>
> I'm not exactly sure what you mean by "loss of the suspend machine stage in S3".

I made a mistake in saying "loss of the suspend machine stage in S3".
Please allow me to correct it. Because the original program only
recognized the "CPU up" action on the arm64 platform, in output.html,
"CPU up" was classified as the "suspend machine" stage. Adding this code
can put "CPU killed" and "CPU up" in the correct position.

>
>> Here I added core code to fix this issue.
>>
>> Signed-off-by: Xueqin Luo <[email protected]>
>> ---
>> tools/power/pm-graph/sleepgraph.py | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/power/pm-graph/sleepgraph.py b/tools/power/pm-graph/sleepgraph.py
>> index 82c09cd25cc2..d816970b0a3d 100755
>> --- a/tools/power/pm-graph/sleepgraph.py
>> +++ b/tools/power/pm-graph/sleepgraph.py
>> @@ -4132,9 +4132,12 @@ def parseKernelLog(data):
>> elif(re.match('Enabling non-boot CPUs .*', msg)):
>> # start of first cpu resume
>> cpu_start = ktime
>> - elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)):
>> + elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg) \
>> + or re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)):
>> # end of a cpu suspend, start of the next
>> m = re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)
>> + if(not m):
>> + m = re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)
>> cpu = 'CPU'+m.group('cpu')
>> if(cpu not in actions):
>> actions[cpu] = []
>> --
>
> The changes look reasonable to me, though.
>
> Todd, any comments?


2023-03-07 12:55:24

by Rafael J. Wysocki

Subject: Re: [PATCH -next] PM: tools: add "CPU killed" timeline on arm64 platform

On Mon, Feb 27, 2023 at 3:27 AM luoxueqin <[email protected]> wrote:
>
> On 2023/2/24 02:11, Rafael J. Wysocki wrote:
> > On Thu, Feb 23, 2023 at 5:46 AM Xueqin Luo <[email protected]> wrote:
> >>
> >> On the arm64 platform, the core log of cpu offline is as follows:
> >
> > Please spell CPU in capitals.
>
> Thanks for pointing out my mistake.
>
> >
> >> [ 100.431501] CPU1: shutdown
> >> [ 100.454820] psci: CPU1 killed (polled 20 ms)
> >> [ 100.459266] CPU2: shutdown
> >> [ 100.482575] psci: CPU2 killed (polled 20 ms)
> >> [ 100.486057] CPU3: shutdown
> >> [ 100.513974] psci: CPU3 killed (polled 28 ms)
> >> [ 100.518068] CPU4: shutdown
> >> [ 100.541481] psci: CPU4 killed (polled 24 ms)
> >>
> >> 'smpboot: CPU (?P<cpu>[0-9]*) is now offline' cannot be applied
> >> to the arm64 platform, which caused the loss of the suspend
> >> machine stage in S3.
> >
> > I'm not exactly sure what you mean by "loss of the suspend machine stage in S3".
>
> I made a mistake in saying "loss of the suspend machine stage in S3".
> Please allow me to correct it. Because the original program only
> recognized the "CPU up" action on the arm64 platform, in output.html,
> "CPU up" was classified as the "suspend machine" stage. Adding this code
> can put "CPU killed" and "CPU up" in the correct position.

It is still somewhat unclear, to be honest.

What does "the original program" above mean? sleepgraph.py before the patch?

And IIUC the goal of the patch is to prevent sleepgraph from
mistakenly treating the "CPU up" message as part of the suspend flow
(because it should be regarded as part of the resume flow).

If my understanding above is correct, please update the patch
changelog accordingly and resubmit the patch.