2021-06-07 01:36:02

by Codyyao-oc

[permalink] [raw]
Subject: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

From: CodyYao-oc <[email protected]>

The Nano processor may not fully support the rdpmc instruction: it works
for reading the general PMC counters, but leads to a GP (general protection)
fault when accessing a fixed PMC counter. Furthermore, the family/model
information is the same for the Nano and ZX-C processors, so the zhaoxin
PMU driver is wrongly loaded on the Nano processor, which causes the
kernel to fail to boot.

To solve this problem, check the stepping information to distinguish
the Nano processor from the ZX-C processor.

[https://bugzilla.kernel.org/show_bug.cgi?id=212389]

Reported-by: Arjan <[email protected]>
Signed-off-by: CodyYao-oc <[email protected]>
---
arch/x86/events/zhaoxin/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index 949d845c922b..cef1de251613 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void)
 
 	switch (boot_cpu_data.x86) {
 	case 0x06:
-		if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
+		if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
+			boot_cpu_data.x86_model == 0x19) {
 
 			x86_pmu.max_period = x86_pmu.cntval_mask >> 1;

--
2.17.1
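
For readers who want to try the new condition outside the kernel, the check added under "case 0x06" can be sketched as a small standalone predicate. This is only an illustration under stated assumptions: zx_family6_pmu_ok() and zx_gate_demo() are hypothetical names that do not exist in the kernel, the two parameters stand in for boot_cpu_data.x86_model and boot_cpu_data.x86_stepping, and the stepping value 0x0d used for the "rejected" case is an arbitrary value below the threshold, not a documented Nano stepping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patched condition for family 0x06:
 *   - model 0x0f is accepted only from stepping 0x0e upward
 *     (the patch uses this to tell ZX-C apart from Nano),
 *   - model 0x19 is accepted regardless of stepping.
 * "model" and "stepping" stand in for boot_cpu_data.x86_model and
 * boot_cpu_data.x86_stepping.
 */
bool zx_family6_pmu_ok(uint8_t model, uint8_t stepping)
{
	return (model == 0x0f && stepping >= 0x0e) || model == 0x19;
}

/* Small self-check; the stepping values below are illustrative only. */
void zx_gate_demo(void)
{
	assert(zx_family6_pmu_ok(0x0f, 0x0e));   /* model 0x0f at the threshold */
	assert(!zx_family6_pmu_ok(0x0f, 0x0d));  /* model 0x0f below the threshold */
	assert(zx_family6_pmu_ok(0x19, 0x00));   /* model 0x19, any stepping */
}
```

Before the patch, the equivalent predicate would have returned true for every stepping of model 0x0f, which is how the driver ended up being loaded on the Nano processor.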


2021-06-30 04:57:05

by Codyyao-oc

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

Dear Mingo and Peter,

Thank you for taking the time to read this letter; I am very
grateful.

Last month, I fixed the bug that made boot fail on the Nano processor, which
was introduced by

"Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support
for Zhaoxin CPU.")"

with Arjan's help, and submitted this patch. But I have not heard back.
I would greatly appreciate it if you could kindly check it and reply at
your convenience.

Many Thanks!

Cody

On 2021/6/7 9:31 AM, Cody Yao-oc wrote:
> From: CodyYao-oc <[email protected]>
>
> The Nano processor may not fully support the rdpmc instruction: it works
> for reading the general PMC counters, but leads to a GP (general protection)
> fault when accessing a fixed PMC counter. Furthermore, the family/model
> information is the same for the Nano and ZX-C processors, so the zhaoxin
> PMU driver is wrongly loaded on the Nano processor, which causes the
> kernel to fail to boot.
>
> To solve this problem, check the stepping information to distinguish
> the Nano processor from the ZX-C processor.
>
> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
>
> Reported-by: Arjan <[email protected]>
> Signed-off-by: CodyYao-oc <[email protected]>
> ---
> arch/x86/events/zhaoxin/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
> index 949d845c922b..cef1de251613 100644
> --- a/arch/x86/events/zhaoxin/core.c
> +++ b/arch/x86/events/zhaoxin/core.c
> @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void)
>
>  	switch (boot_cpu_data.x86) {
>  	case 0x06:
> -		if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
> +		if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
> +			boot_cpu_data.x86_model == 0x19) {
>
>  			x86_pmu.max_period = x86_pmu.cntval_mask >> 1;
>

2022-10-13 14:22:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

On Mon, Jun 07, 2021 at 09:31:09AM +0800, Cody Yao-oc wrote:
> From: CodyYao-oc <[email protected]>
>
> The Nano processor may not fully support the rdpmc instruction: it works
> for reading the general PMC counters, but leads to a GP (general protection)
> fault when accessing a fixed PMC counter. Furthermore, the family/model
> information is the same for the Nano and ZX-C processors, so the zhaoxin
> PMU driver is wrongly loaded on the Nano processor, which causes the
> kernel to fail to boot.
>
> To solve this problem, check the stepping information to distinguish
> the Nano processor from the ZX-C processor.
>
> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
>
> Reported-by: Arjan <[email protected]>
> Signed-off-by: CodyYao-oc <[email protected]>

I suppose I'll queue it up for perf/urgent post -rc1

2022-10-13 14:54:28

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

Hi perf maintainers and Codyyao-oc! What happened to the patch below, which
was posted many moons ago? It wasn't merged afaics. Did it fall through
the cracks, or is there something wrong with it?

I'm asking because a user who reported this regression asked what's up:
https://bugzilla.kernel.org/show_bug.cgi?id=212389

On 30.06.21 06:38, Codyyao-oc wrote:
>
> Thank you for taking the time to read this letter; I am very
> grateful.
>
> Last month, I fixed the bug that made boot fail on the Nano processor, which
> was introduced by
>
> "Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support
> for Zhaoxin CPU.")"

Just BTW: You want to add that tag to your patch description.

> with Arjan's help, and submitted this patch. But I have not heard back.
> I would greatly appreciate it if you could kindly check it and reply at
> your convenience.
>
> Many Thanks!
>
> Cody
>
> On 2021/6/7 9:31 AM, Cody Yao-oc wrote:
>> From: CodyYao-oc <[email protected]>
>>
>> The Nano processor may not fully support the rdpmc instruction: it works
>> for reading the general PMC counters, but leads to a GP (general protection)
>> fault when accessing a fixed PMC counter. Furthermore, the family/model
>> information is the same for the Nano and ZX-C processors, so the zhaoxin
>> PMU driver is wrongly loaded on the Nano processor, which causes the
>> kernel to fail to boot.
>>
>> To solve this problem, check the stepping information to distinguish
>> the Nano processor from the ZX-C processor.

And this...

>> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]

...should look like this:

Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389

Ohh, and you might want to add this to ensure backporting:

Cc: <[email protected]> # 5.10.x

Guess adding those and submitting it again might be wise and help to
finally get this regression resolved.

Ciao, Thorsten

>> Reported-by: Arjan <[email protected]>
>> Signed-off-by: CodyYao-oc <[email protected]>
>> ---
>>   arch/x86/events/zhaoxin/core.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
>> index 949d845c922b..cef1de251613 100644
>> --- a/arch/x86/events/zhaoxin/core.c
>> +++ b/arch/x86/events/zhaoxin/core.c
>> @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void)
>>  	switch (boot_cpu_data.x86) {
>>  	case 0x06:
>> -		if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
>> +		if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
>> +			boot_cpu_data.x86_model == 0x19) {
>>  			x86_pmu.max_period = x86_pmu.cntval_mask >> 1;
>>

2022-10-13 15:58:27

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

On Mon, Jun 07, 2021 at 09:31:09AM +0800, Cody Yao-oc wrote:
> From: CodyYao-oc <[email protected]>
>
> The Nano processor may not fully support the rdpmc instruction: it works
> for reading the general PMC counters, but leads to a GP (general protection)
> fault when accessing a fixed PMC counter. Furthermore, the family/model
> information is the same for the Nano and ZX-C processors, so the zhaoxin
> PMU driver is wrongly loaded on the Nano processor, which causes the
> kernel to fail to boot.
>
> To solve this problem, check the stepping information to distinguish
> the Nano processor from the ZX-C processor.
>
> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
>
> Reported-by: Arjan <[email protected]>
> Signed-off-by: CodyYao-oc <[email protected]>

*sigh*.. so this email address doesn't exist, as such I can't apply this
patch. Consider it dropped.

2022-10-16 10:24:08

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

On Sun, Oct 16, 2022 at 11:53:14AM +0200, Arjan wrote:
> On 13-10-2022 17:07, Peter Zijlstra wrote:
> > On Mon, Jun 07, 2021 at 09:31:09AM +0800, Cody Yao-oc wrote:
> > > From: CodyYao-oc <[email protected]>
> > >
> > > > The Nano processor may not fully support the rdpmc instruction: it works
> > > > for reading the general PMC counters, but leads to a GP (general protection)
> > > > fault when accessing a fixed PMC counter. Furthermore, the family/model
> > > > information is the same for the Nano and ZX-C processors, so the zhaoxin
> > > > PMU driver is wrongly loaded on the Nano processor, which causes the
> > > > kernel to fail to boot.
> > > >
> > > > To solve this problem, check the stepping information to distinguish
> > > > the Nano processor from the ZX-C processor.
> > >
> > > [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
> > >
> > > Reported-by: Arjan <[email protected]>
> > > Signed-off-by: CodyYao-oc <[email protected]>
> >
> > *sigh*.. so this email address doesn't exist, as such I can't apply this
> > patch. Consider it dropped.
>
> If it's about my email address: The address exists and works.
> If the nospam part bothers you, that part can be left out. You may leave the reported-by line out if you want to.

The SoB address ([email protected]) bounced for me -- since that's
the patch author that is somewhat important.

2022-10-16 10:37:05

by Arjan

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

On 13-10-2022 17:07, Peter Zijlstra wrote:
> On Mon, Jun 07, 2021 at 09:31:09AM +0800, Cody Yao-oc wrote:
>> From: CodyYao-oc <[email protected]>
>>
>> The Nano processor may not fully support the rdpmc instruction: it works
>> for reading the general PMC counters, but leads to a GP (general protection)
>> fault when accessing a fixed PMC counter. Furthermore, the family/model
>> information is the same for the Nano and ZX-C processors, so the zhaoxin
>> PMU driver is wrongly loaded on the Nano processor, which causes the
>> kernel to fail to boot.
>>
>> To solve this problem, check the stepping information to distinguish
>> the Nano processor from the ZX-C processor.
>>
>> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
>>
>> Reported-by: Arjan <[email protected]>
>> Signed-off-by: CodyYao-oc <[email protected]>
>
> *sigh*.. so this email address doesn't exist, as such I can't apply this
> patch. Consider it dropped.

If it's about my email address: The address exists and works.
If the nospam part bothers you, that part can be left out. You may leave the reported-by line out if you want to.

2022-10-16 11:07:18

by Arjan

[permalink] [raw]
Subject: Re: [PATCH] x86/perf: Fixed kernel panic during boot on Nano processor.

On 16-10-2022 11:59, Peter Zijlstra wrote:
> On Sun, Oct 16, 2022 at 11:53:14AM +0200, Arjan wrote:
>> On 13-10-2022 17:07, Peter Zijlstra wrote:
>>> On Mon, Jun 07, 2021 at 09:31:09AM +0800, Cody Yao-oc wrote:
>>>> From: CodyYao-oc <[email protected]>
>>>>
>>>> The Nano processor may not fully support the rdpmc instruction: it works
>>>> for reading the general PMC counters, but leads to a GP (general protection)
>>>> fault when accessing a fixed PMC counter. Furthermore, the family/model
>>>> information is the same for the Nano and ZX-C processors, so the zhaoxin
>>>> PMU driver is wrongly loaded on the Nano processor, which causes the
>>>> kernel to fail to boot.
>>>>
>>>> To solve this problem, check the stepping information to distinguish
>>>> the Nano processor from the ZX-C processor.
>>>>
>>>> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
>>>>
>>>> Reported-by: Arjan <[email protected]>
>>>> Signed-off-by: CodyYao-oc <[email protected]>
>>>
>>> *sigh*.. so this email address doesn't exist, as such I can't apply this
>>> patch. Consider it dropped.
>>
>> If it's about my email address: The address exists and works.
>> If the nospam part bothers you, that part can be left out. You may leave the reported-by line out if you want to.
>
> The SoB address ([email protected]) bounced for me -- since that's
> the patch author that is somewhat important.

It now bounced for me too.
It was still valid when Cody submitted the patch in 2021, because we
exchanged messages while debugging and testing the patch.