Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752217AbcKGUXq (ORCPT ); Mon, 7 Nov 2016 15:23:46 -0500 Received: from Galois.linutronix.de ([146.0.238.70]:53127 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751618AbcKGUXn (ORCPT ); Mon, 7 Nov 2016 15:23:43 -0500 Date: Mon, 7 Nov 2016 21:20:53 +0100 (CET) From: Thomas Gleixner To: "Charles (Chas) Williams" cc: Sebastian Andrzej Siewior , x86@kernel.org, linux-kernel@vger.kernel.org, "M. Vefa Bicakci" Subject: Re: [RFC PATCH] perf/x86/intel/rapl: avoid access unallocate memory In-Reply-To: <58e229e2-91f4-a97f-1b9f-089f48ef994a@brocade.com> Message-ID: References: <20161102122557.qs4rl6mb7n7l7j7p@linutronix.de> <24e69019-60d0-29e7-e31f-c6f00f9ed98a@brocade.com> <58e229e2-91f4-a97f-1b9f-089f48ef994a@brocade.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2577 Lines: 82 On Mon, 7 Nov 2016, Charles (Chas) Williams wrote: > On 11/07/2016 11:19 AM, Thomas Gleixner wrote: > > On Wed, 2 Nov 2016, Charles (Chas) Williams wrote: > > > I don't know why the CPU's phys_proc_id is 2. > > > > max_physical_pkg_id gets initialized via: > > > > cpus = boot_cpu_data.x86_max_cores; > > max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus); > > > > What's the value of boot_cpu_data.x86_max_cores and MAX_LOCAL_APIC? > > I have discovered that that is not the problem. smp_init_package_map() > is calculating the physical core id using the following: > > for_each_present_cpu(cpu) { > unsigned int apicid = apic->cpu_present_to_apicid(cpu); > > ... > if (!topology_update_package_map(apicid, cpu)) > continue; > > ... > int topology_update_package_map(unsigned int apicid, unsigned int cpu) > { > unsigned int new, pkg = apicid >> > boot_cpu_data.x86_coreid_bits; > > But later when the secondary CPU's are identified they use a different > calculation using the local APIC ID from the CPU's registers: > > static void generic_identify(struct cpuinfo_x86 *c) > ... > if (c->cpuid_level >= 0x00000001) { > c->initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF; > ... > c->phys_proc_id = c->initial_apicid; > > So at the end of identify_cpu() when the boot/hotplug assignment is > put back: > > c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id); > > topology_phys_to_logical_pkg() is returning an invalid logical processor > since one isn't configured. > > It's not clear to me what the right thing to do is or which is right. Nice detective work! So the issue is that the package mapping code honours boot_cpu_data.x86_coreid_bit, while generic_identify does not. boot_cpu_data.x86_coreid_bit is obviously 1 in your case. Tentative fix below. I still need to gow through that maze and figure out what could go wrong with that :( Thanks, tglx 8<------------------------ --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -905,6 +905,8 @@ static void detect_null_seg_behavior(str static void generic_identify(struct cpuinfo_x86 *c) { + unsigned int pkg; + c->extended_cpuid_level = 0; if (!have_cpuid_p()) @@ -929,7 +931,8 @@ static void generic_identify(struct cpui c->apicid = c->initial_apicid; # endif #endif - c->phys_proc_id = c->initial_apicid; + pkg = c->initial_apicid >> boot_cpu_data.x86_coreid_bits; + c->phys_proc_id = pkg; } get_model_name(c); /* Default name */