2019-03-22 11:14:33

by Wen Pu

[permalink] [raw]
Subject: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

For Hygon family 18h multi-die processor platform, which support 1-Die/
2-Die/4-Die per socket, the system view is shown in the following system
topology.

System View (with 1-Die 2-Socket):
|------------|
------ -----
SOCKET0 | D0 | | D1 | SOCKET1
------ -----

System View (with 2-Die 2-socket):
--------------------
| -------------|------
| | | |
------------ ------------
SOCKET0 | D1 -- D0 | | D3 -- D2 | SOCKET1
------------ ------------

System View (with 4-Die 2-Socket) :
--------------------
| -------------|------
| | | |
------------ ------------
| D1 -- D0 | | D7 -- D6 |
| | \/ | | | | \/ | |
SOCKET0 | | /\ | | | | /\ | | SOCKET1
| D2 -- D3 | | D4 -- D5 |
------------ ------------
| | | |
------|------------| |
--------------------

Current codes direct use "phys_proc_id = initial_apicid >> bits", which
calc phys_proc_id from initial_apicid by shifting *bits*, will get
wrong phys_proc_id for Hygon family 18h 1-Die/2-Die 2-socket system.

According to document [1] section 2.1.11.1, the *bits* is the value of
CPUID_Fn80000008_ECX[12:15]. The value may be 4, 5 or 6. The definitions
are: 4 - 1 die, 5 - 2 die, 6 - 3/4 die.

ApicId definition is as below (see section 2.1.10.2.1.3 of [1]):
-------------------------------------------------
Bit | 6 | 5 4 | 3 | 2 1 0 |
|-----------|---------|--------|----------------|
IDs | Socket ID | Node ID | CCX ID | Core/Thread ID |
-------------------------------------------------

So for 3/4-Die CPU the *bits* is 6, which is same to ApicID definition
field, which still can get correct socket ID for 2-socket system. But
for 1-Die or 2-Die CPU the *bits* is 4 or 5, which will cause the right
shifted result is not exactly the value of socket ID.

For Hygon family 18h CPU the socket ID should be obtained from ApicId[6].
To fix the problem and match ApicID field definition, adjust the shift
bits to 6 for all Hygon family 18h multi-die CPUs.

Reference:
[1] https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Signed-off-by: Pu Wen <[email protected]>
---
arch/x86/kernel/cpu/hygon.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index cf25405..2df6dd9 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -87,6 +87,9 @@ static void hygon_get_topology(struct cpuinfo_x86 *c)
if (!err)
c->x86_coreid_bits = get_count_order(c->x86_max_cores);

+ /* Socket ID is ApicId[6] for these processors. */
+ c->phys_proc_id = c->apicid >> 6;
+
cacheinfo_hygon_init_llc_id(c, cpu, node_id);
} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
u64 value;
--
2.7.4



2019-03-22 15:55:44

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Fri, Mar 22, 2019 at 06:43:00PM +0800, Pu Wen wrote:
> For Hygon family 18h multi-die processor platform, which support 1-Die/
> 2-Die/4-Die per socket, the system view is shown in the following system
> topology.
>
> System View (with 1-Die 2-Socket):
> |------------|
> ------ -----
> SOCKET0 | D0 | | D1 | SOCKET1
> ------ -----
>
> System View (with 2-Die 2-socket):
> --------------------
> | -------------|------
> | | | |
> ------------ ------------
> SOCKET0 | D1 -- D0 | | D3 -- D2 | SOCKET1
> ------------ ------------
>
> System View (with 4-Die 2-Socket) :
> --------------------
> | -------------|------
> | | | |
> ------------ ------------
> | D1 -- D0 | | D7 -- D6 |
> | | \/ | | | | \/ | |
> SOCKET0 | | /\ | | | | /\ | | SOCKET1
> | D2 -- D3 | | D4 -- D5 |
> ------------ ------------
> | | | |
> ------|------------| |
> --------------------
>
> Current codes direct use "phys_proc_id = initial_apicid >> bits", which

Use proper english please - there's no "codes"

> calc phys_proc_id from initial_apicid by shifting *bits*, will get

... or "calc"

> wrong phys_proc_id for Hygon family 18h 1-Die/2-Die 2-socket system.
>
> According to document [1] section 2.1.11.1, the *bits* is the value of
> CPUID_Fn80000008_ECX[12:15]. The value may be 4, 5 or 6. The definitions
> are: 4 - 1 die, 5 - 2 die, 6 - 3/4 die.
>
> ApicId definition is as below (see section 2.1.10.2.1.3 of [1]):
> -------------------------------------------------
> Bit | 6 | 5 4 | 3 | 2 1 0 |
> |-----------|---------|--------|----------------|
> IDs | Socket ID | Node ID | CCX ID | Core/Thread ID |
> -------------------------------------------------
>
> So for 3/4-Die CPU the *bits* is 6, which is same to ApicID definition
> field, which still can get correct socket ID for 2-socket system. But
> for 1-Die or 2-Die CPU the *bits* is 4 or 5, which will cause the right
> shifted result is not exactly the value of socket ID.
>
> For Hygon family 18h CPU the socket ID should be obtained from ApicId[6].
> To fix the problem and match ApicID field definition, adjust the shift
> bits to 6 for all Hygon family 18h multi-die CPUs.

Sounds to me like you're programming the initial APIC ID not
the same way as AMD do...

> Reference:
> [1] https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
>
> Signed-off-by: Pu Wen <[email protected]>
> ---
> arch/x86/kernel/cpu/hygon.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
> index cf25405..2df6dd9 100644
> --- a/arch/x86/kernel/cpu/hygon.c
> +++ b/arch/x86/kernel/cpu/hygon.c
> @@ -87,6 +87,9 @@ static void hygon_get_topology(struct cpuinfo_x86 *c)
> if (!err)
> c->x86_coreid_bits = get_count_order(c->x86_max_cores);
>
> + /* Socket ID is ApicId[6] for these processors. */
> + c->phys_proc_id = c->apicid >> 6;

That 6 is a magic number, I assume?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-22 16:22:42

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/22 23:55, Borislav Petkov wrote:
> On Fri, Mar 22, 2019 at 06:43:00PM +0800, Pu Wen wrote:
>> Current codes direct use "phys_proc_id = initial_apicid >> bits", which
>
> Use proper english please - there's no "codes"
>
>> calc phys_proc_id from initial_apicid by shifting *bits*, will get
>
> ... or "calc"

Okay, will correct the spelling.

>> For Hygon family 18h CPU the socket ID should be obtained from ApicId[6].
>> To fix the problem and match ApicID field definition, adjust the shift
>> bits to 6 for all Hygon family 18h multi-die CPUs.
>
> Sounds to me like you're programming the initial APIC ID not
> the same way as AMD do...

In the same way.

>> Reference:
>> [1] https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
>>
>> Signed-off-by: Pu Wen <[email protected]>
>> ---
>> arch/x86/kernel/cpu/hygon.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
>> index cf25405..2df6dd9 100644
>> --- a/arch/x86/kernel/cpu/hygon.c
>> +++ b/arch/x86/kernel/cpu/hygon.c
>> @@ -87,6 +87,9 @@ static void hygon_get_topology(struct cpuinfo_x86 *c)
>> if (!err)
>> c->x86_coreid_bits = get_count_order(c->x86_max_cores);
>>
>> + /* Socket ID is ApicId[6] for these processors. */
>> + c->phys_proc_id = c->apicid >> 6;
>
> That 6 is a magic number, I assume?

That 6 is not a magic number. It indicate bit 6 of ApicId. This calculation
is the same as the LLC ID calculation in cacheinfo_hygon_init_llc_id().

--
Regards,
Pu Wen

2019-03-22 16:45:30

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
> That 6 is not a magic number.

Well, if I see a naked 6, then it is only magic to me. Now if it were a
proper define with a descriptive name...

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-22 16:53:50

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Fri, Mar 22, 2019 at 05:44:45PM +0100, Borislav Petkov wrote:
> On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
> > That 6 is not a magic number.
>
> Well, if I see a naked 6, then it is only magic to me. Now if it were a
> proper define with a descriptive name...

Does AMD/Hygon not have a CPUID leaf to read useful things like this
from?

2019-03-22 17:17:42

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
> > Sounds to me like you're programming the initial APIC ID not
> > the same way as AMD do...
>
> In the same way.

So why do you need to do something different than what AMD does?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-23 01:58:32

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/23 0:44, Borislav Petkov wrote:
> On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
>> That 6 is not a magic number.
>
> Well, if I see a naked 6, then it is only magic to me. Now if it were a
> proper define with a descriptive name...

So maybe define it as:
#define APICID_SOCKET_ID BIT(6)

--
Regards,
Pu Wen

2019-03-23 02:17:08

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/23 1:16, Borislav Petkov wrote:
> On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
>>> Sounds to me like you're programming the initial APIC ID not
>>> the same way as AMD do...
>>
>> In the same way.
>
> So why do you need to do something different than what AMD does?

Current physical id is computed via "phys_proc_id = initial_apicid >>
bits".

For 4-Die 2 socket system, the physical id of socket 2 is:
initial_apicid >> bits = 0b1xxxxxx >> 6 = 1.
The result is true.

But for 2-Die 2 socket system, the physical id of socket 2 is:
initial_apicid >> bits = 0b10xxxxx >> 5 = 2,
and for 1-Die 2 socket system, the physical id of socket 2 is:
initial_apicid >> bits = 0b100xxxx >> 4 = 4.
The results are not correct any more.

So the adjustment for the 1-Die/2-Die 2 socket system is needed.
And just use ApicId[6], which already defined the right thing, as the
socket ID.

--
Regards,
Pu Wen

2019-03-23 03:07:12

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/23 0:52, Peter Zijlstra wrote:
> On Fri, Mar 22, 2019 at 05:44:45PM +0100, Borislav Petkov wrote:
>> On Sat, Mar 23, 2019 at 12:19:01AM +0800, Pu Wen wrote:
>>> That 6 is not a magic number.
>>
>> Well, if I see a naked 6, then it is only magic to me. Now if it were a
>> proper define with a descriptive name...
>
> Does AMD/Hygon not have a CPUID leaf to read useful things like this
> from?

ApicId[6] is read from CPUID_Fn00000001_EBX[30].
Please see section 2.1.11.1 and 2.1.10.2.1.3 of referrence [1].

Reference:
[1] https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

--
Regards,
Pu Wen

2019-03-23 08:58:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Sat, Mar 23, 2019 at 09:48:52AM +0800, Pu Wen wrote:
> So maybe define it as:
> #define APICID_SOCKET_ID BIT(6)

Sure, if you define it locally in hygon.c. Otherwise, it needs to have
"HYGON" in the name or so.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-23 09:00:31

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Sat, Mar 23, 2019 at 10:13:39AM +0800, Pu Wen wrote:
> Current physical id is computed via "phys_proc_id = initial_apicid >>
> bits".
>
> For 4-Die 2 socket system, the physical id of socket 2 is:
> initial_apicid >> bits = 0b1xxxxxx >> 6 = 1.
> The result is true.
>
> But for 2-Die 2 socket system, the physical id of socket 2 is:
> initial_apicid >> bits = 0b10xxxxx >> 5 = 2,
> and for 1-Die 2 socket system, the physical id of socket 2 is:
> initial_apicid >> bits = 0b100xxxx >> 4 = 4.
> The results are not correct any more.
>
> So the adjustment for the 1-Die/2-Die 2 socket system is needed.
> And just use ApicId[6], which already defined the right thing, as the
> socket ID.

I understand all that. But let me repeat my question:

So why do you need to do something different than what AMD does?

You said you're programming the initial APIC ID the same as AMD. So why
doesn't this need to be changed in AMD code too but only for hygon?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-23 11:01:13

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/23 16:59, Borislav Petkov wrote:
> On Sat, Mar 23, 2019 at 10:13:39AM +0800, Pu Wen wrote:
>> Current physical id is computed via "phys_proc_id = initial_apicid >>
>> bits".
>>
>> For 4-Die 2 socket system, the physical id of socket 2 is:
>> initial_apicid >> bits = 0b1xxxxxx >> 6 = 1.
>> The result is true.
>>
>> But for 2-Die 2 socket system, the physical id of socket 2 is:
>> initial_apicid >> bits = 0b10xxxxx >> 5 = 2,
>> and for 1-Die 2 socket system, the physical id of socket 2 is:
>> initial_apicid >> bits = 0b100xxxx >> 4 = 4.
>> The results are not correct any more.
>>
>> So the adjustment for the 1-Die/2-Die 2 socket system is needed.
>> And just use ApicId[6], which already defined the right thing, as the
>> socket ID.
>
> I understand all that. But let me repeat my question:
>
> So why do you need to do something different than what AMD does?
>
> You said you're programming the initial APIC ID the same as AMD. So why
> doesn't this need to be changed in AMD code too but only for hygon?

Because AMD doesn't have 2-Socket system with 1-Die/2-Die processors(see
reference [2]). So it will return the right result when getting physical
ID in AMD system.

[2] https://www.amd.com/en/products/specifications/processors

--
Regards,
Pu Wen

2019-03-23 11:08:58

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On Sat, Mar 23, 2019 at 06:56:52PM +0800, Pu Wen wrote:
> Because AMD doesn't have 2-Socket system with 1-Die/2-Die processors(see
> reference [2]). So it will return the right result when getting physical
> ID in AMD system.

There it is! Now add that to the commit message please.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-03-23 11:14:38

by Wen Pu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86/cpu/hygon: Fix phys_proc_id calculation logic for multi-die processor

On 2019/3/23 19:08, Borislav Petkov wrote:
> On Sat, Mar 23, 2019 at 06:56:52PM +0800, Pu Wen wrote:
>> Because AMD doesn't have 2-Socket system with 1-Die/2-Die processors(see
>> reference [2]). So it will return the right result when getting physical
>> ID in AMD system.
>
> There it is! Now add that to the commit message please.

Okay! :)

--
Regards,
Pu Wen