Date: Mon, 24 Jul 2017 13:14:08 +0200
From: Borislav Petkov <bp@suse.de>
To: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de,
        mingo@redhat.com, hpa@zytor.com, peterz@infradead.org,
        Yazen.Ghannam@amd.com
Subject: Re: [PATCH v4 2/2] x86/amd: Fixup cpu_core_id for family17h downcore
 configuration
Message-ID: <20170724111408.GC28024@nazgul.tnic>
References: <1500888165-2345-1-git-send-email-suravee.suthikulpanit@amd.com>
 <1500888165-2345-3-git-send-email-suravee.suthikulpanit@amd.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1500888165-2345-3-git-send-email-suravee.suthikulpanit@amd.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4205
Lines: 136

On Mon, Jul 24, 2017 at 04:22:45AM -0500, Suravee Suthikulpanit wrote:
> For family17h, current cpu_core_id is directly taken from the value
> CPUID_Fn8000001E_EBX[7:0] (CoreId), which is the physical ID of the
> core within a die. However, on system with downcore configuration
> (where not all physical cores within a die are available), this could
> result in the case where cpu_core_id > (cores_per_node - 1).

And that is a problem because...?

> Fix up the cpu_core_id by breaking down the bitfields of CoreId,
> and calculate relative ID using available topology information.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/kernel/cpu/amd.c | 74 ++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 54 insertions(+), 20 deletions(-)

Btw, I have to say, it is much easier to review now.

> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index a2a52b5..b5ea28f 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -305,37 +305,71 @@ static void __get_topoext(struct cpuinfo_x86 *c, int cpu)
>  {
>  	u32 eax, ebx, ecx, edx;
>  	u8 node_id;
> +	u16 l3_nshared = 0;
> +
> +	if (cpuid_edx(0x80000006)) {

Ok, so Janakarajan did some L3 detection:

https://lkml.kernel.org/r/cover.1497452002.git.Janakarajan.Natarajan@amd.com

and now that you need it too, I'd like you to refactor and unify that
L3 detection code and put it in arch/x86/kernel/cpu/intel_cacheinfo.c
(yes, we will rename that file one fine day :-)) along with accessors
for other users to call. init_amd_cacheinfo() looks good to me right
now.

> +		cpuid_count(0x8000001d, 3, &eax, &ebx, &ecx, &edx);
> +		l3_nshared = ((eax >> 14) & 0xfff) + 1;
> +	}
>  
>  	cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
>  
>  	node_id  = ecx & 0xff;
>  	smp_num_siblings = ((ebx >> 8) & 0xff) + 1;
>  
> -	if (c->x86 == 0x15)
> -		c->cu_id = ebx & 0xff;
> -
> -	if (c->x86 >= 0x17) {
> -		c->cpu_core_id = ebx & 0xff;
> -
> -		if (smp_num_siblings > 1)
> -			c->x86_max_cores /= smp_num_siblings;
> -	}
> +	switch (c->x86) {
> +	case 0x17: {
> +		u32 tmp, ccx_offset, cpu_offset;
>  
> -	/*
> -	 * We may have multiple LLCs if L3 caches exist, so check if we
> -	 * have an L3 cache by looking at the L3 cache CPUID leaf.
> -	 */
> -	if (cpuid_edx(0x80000006)) {
> -		if (c->x86 == 0x17) {
> +		/*
> +		 * In family 17h, the CPUID_Fn8000001E_EBX[7:0] (CoreId)
> +		 * is non-contiguous in downcore and non-SMT cases.
> +		 * Fixup the cpu_core_id to be contiguous for cores within
> +		 * the die.

Why do we need it to be contiguous? It is not contiguous on Intel too.

> +		 */
> +		tmp = ebx & 0xff;
> +		if (smp_num_siblings == 1) {
>  			/*
> -			 * LLC is at the core complex level.
> -			 * Core complex id is ApicId[3].
> +			 * CoreId bit-encoding for SMT-disabled
> +			 * [7:4] : die
> +			 * [3]   : ccx
> +			 * [2:0] : core
>  			 */
> -			per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
> +			ccx_offset = ((tmp >> 3) & 1) * l3_nshared;
> +			cpu_offset = tmp & 7;
>  		} else {
> -			/* LLC is at the node level. */
> -			per_cpu(cpu_llc_id, cpu) = node_id;
> +			/*
> +			 * CoreId bit-encoding for SMT-enabled
> +			 * [7:3] : die
> +			 * [2]   : ccx
> +			 * [1:0] : core
> +			 */
> +			ccx_offset = ((tmp >> 2) & 1) * l3_nshared /
> +				       smp_num_siblings;
> +			cpu_offset = tmp & 3;
> +			c->x86_max_cores /= smp_num_siblings;
> +
>  		}
> +		c->cpu_core_id = ccx_offset + cpu_offset;
> +
> +		/*
> +		 * Family17h L3 cache (LLC) is at Core Complex (CCX).
> +		 * There could be multiple CCXs in a node.
> +		 * CCX ID is ApicId[3].
> +		 */
> +		per_cpu(cpu_llc_id, cpu) = c->apicid >> 3;
> +
> +		pr_debug("Fixup coreid:%#x to cpu_core_id:%#x\n",
> +			 tmp, c->cpu_core_id);
> +		break;
> +	}
> +	case 0x15:
> +		c->cu_id = ebx & 0xff;
> +		/* Follow through */
> +	default:
> +		/* LLC is default to L3, which generally per-node */
> +		if (l3_nshared > 0)
> +			per_cpu(cpu_llc_id, cpu) = node_id;

If this needs to be executed unconditionally, just move it out of the
switch-case.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--