Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753087AbcJDGCG (ORCPT ); Tue, 4 Oct 2016 02:02:06 -0400
Received: from mail-vk0-f67.google.com ([209.85.213.67]:36413 "EHLO
        mail-vk0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751597AbcJDGCF (ORCPT );
        Tue, 4 Oct 2016 02:02:05 -0400
MIME-Version: 1.0
References: <1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com>
From: Yinghai Lu
Date: Mon, 3 Oct 2016 23:02:03 -0700
X-Google-Sender-Auth: _4QPlV2JMpMyV5rZxUO-_gwEezc
Subject: Re: [tip:x86/apic] x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping
To: Linux Kernel Mailing List, Gu Zheng, Tang Chen, Ingo Molnar,
        douly.fnst@cn.fujitsu.com, zhugh.fnst@cn.fujitsu.com,
        Thomas Gleixner, "H. Peter Anvin"
Cc: "linux-tip-commits@vger.kernel.org", Tony Luck
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5413
Lines: 141

On Thu, Sep 22, 2016 at 12:10 PM, tip-bot for Gu Zheng wrote:
>
> x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping
>
> The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
> when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
> wq_numa_possible_cpumask will not cause any problem.
> It contains 4 steps:
> 1. Enable apic registration flow to handle both enabled and disabled cpus.
> 2. Introduce a new array storing all possible cpuid <-> apicid mapping.
> 3. Enable _MAT and MADT related APIs to return non-present or disabled cpus' apicid.
> 4. Establish all possible cpuid <-> nodeid mapping.
>
> This patch finishes step 2.
>
> In this patch, we introduce a new static array named cpuid_to_apicid[],
> which is large enough to store info for all possible cpus.
>
> And then, we modify the cpuid calculation.
> In generic_processor_info(), it simply finds the next unused cpuid.
> And it is also why the cpuid <-> nodeid mapping changes with node hotplug.
>
> After this patch, we find the next unused cpuid, map it to an apicid,
> and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
> mapping will be persistent.
>
> And finally we will use this array to make cpuid <-> nodeid persistent.
>
> cpuid <-> apicid mapping is established at local apic registration time.
> But non-present or disabled cpus are ignored.
>
> In this patch, we establish all possible cpuid <-> apicid mapping when
> registering local apic.

Hi,

This patch causes a regression on an 8-socket system: MLC from Intel
no longer runs.

The root cause: the cpu index used to be 0-447; with this patch, the
cpu index changes to 0, 2-448.

The MADT from the system looks like:

[   42.107902] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   42.120125] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   42.132361] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   42.144598] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   42.156836] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
...
[   47.552852] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   47.565088] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   47.577322] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[   47.589561] ACPI: X2APIC (uid[0x00] apic_id[0x00] enabled)
[   47.600899] ACPI: X2APIC (uid[0x02] apic_id[0x02] enabled)
[   47.612234] ACPI: X2APIC (uid[0x04] apic_id[0x04] enabled)
...

The init_cpu_to_node output becomes:

[   55.477160] init_cpu_to_node:
[   55.483280] cpu 0 -> apicid 0x0 -> node 0
[   55.491558] cpu 1 -> apicid 0xff -> node 1
[   55.500017] cpu 2 -> apicid 0x2 -> node 0
[   55.508296] cpu 3 -> apicid 0x4 -> node 0
[   55.516575] cpu 4 -> apicid 0x6 -> node 0
...

It looks like the problem is in
acpi_parse_lapic ==> acpi_register_lapic ==> __generic_processor_info ==> allocate_logical_cpuid:
the disabled entries with lapic_id[0xff] take cpu index 1.
Then there is no /dev/cpu/1/msr, which makes MLC unhappy.

The following change could work around the problem for now.

Index: linux-2.6/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
+++ linux-2.6/arch/x86/kernel/acpi/boot.c
@@ -163,10 +163,11 @@ static int __init acpi_parse_madt(struct
  * @id: local apic id to register
  * @acpiid: ACPI id to register
  * @enabled: this cpu is enabled or not
+ * @disabled_id: not used apic id
  *
  * Returns the logic cpu number which maps to the local apic
  */
-static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
+static int acpi_register_lapic(int id, u32 acpiid, u8 enabled, int disabled_id)
 {
 	unsigned int ver = 0;
 	int cpu;
@@ -176,6 +177,11 @@ static int acpi_register_lapic(int id, u
 		return -EINVAL;
 	}
 
+	if (!enabled && (id == disabled_id)) {
+		++disabled_cpus;
+		return -EINVAL;
+	}
+
 	if (boot_cpu_physical_apicid != -1U)
 		ver = boot_cpu_apic_version;
 
@@ -213,7 +219,7 @@ acpi_parse_x2apic(struct acpi_subtable_h
 	if (!apic->apic_id_valid(apic_id) && enabled)
 		printk(KERN_WARNING PREFIX "x2apic entry ignored\n");
 	else
-		acpi_register_lapic(apic_id, processor->uid, enabled);
+		acpi_register_lapic(apic_id, processor->uid, enabled, -1);
 #else
 	printk(KERN_WARNING PREFIX "x2apic entry ignored\n");
 #endif
@@ -242,7 +248,7 @@ acpi_parse_lapic(struct acpi_subtable_he
 	 */
 	acpi_register_lapic(processor->id,	/* APIC ID */
 			    processor->processor_id, /* ACPI ID */
-			    processor->lapic_flags & ACPI_MADT_ENABLED);
+			    processor->lapic_flags & ACPI_MADT_ENABLED, 0xff);
 
 	return 0;
 }
@@ -261,7 +267,7 @@ acpi_parse_sapic(struct acpi_subtable_he
 
 	acpi_register_lapic((processor->id << 8) | processor->eid,/* APIC ID */
 			    processor->processor_id, /* ACPI ID */
-			    processor->lapic_flags & ACPI_MADT_ENABLED);
+			    processor->lapic_flags & ACPI_MADT_ENABLED, -1);
 
 	return 0;
 }
@@ -725,7 +731,7 @@ int acpi_map_cpu(acpi_handle handle, phy
 {
 	int cpu;
 
-	cpu = acpi_register_lapic(physid, U32_MAX, ACPI_MADT_ENABLED);
+	cpu = acpi_register_lapic(physid, U32_MAX, ACPI_MADT_ENABLED, -1);
 	if (cpu < 0) {
 		pr_info(PREFIX "Unable to map lapic to logical cpu number\n");
 		return cpu;
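
For illustration only, here is a small user-space model of the allocation
order described above. It is a sketch, not kernel code: struct madt_entry,
struct cpu_map, the model_* helpers and MODEL_MAX_CPUS are hypothetical
names, and it assumes that cpu 0 is pre-assigned to the boot CPU and that
an apicid seen before gets its existing cpuid back. As a simplification, it
applies a single disabled_id filter to every entry, whereas the change
above passes 0xff only from the LAPIC parser.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CPUS 16   /* hypothetical limit, far smaller than a real system */
#define NO_DISABLED_ID (-1) /* like the -1 passed when no APIC ID is filtered */

/* Hypothetical, simplified stand-in for one MADT processor entry. */
struct madt_entry {
	int apic_id;
	int enabled;
};

/* Model state: the persistent cpuid -> apicid table plus counters. */
struct cpu_map {
	int cpuid_to_apicid[MODEL_MAX_CPUS];
	int nr_logical_cpuids;
	int disabled_cpus;
};

/* Reuse the cpuid already mapped to this apicid, else take the next free one. */
static int model_allocate_logical_cpuid(struct cpu_map *map, int apicid)
{
	int i;

	for (i = 0; i < map->nr_logical_cpuids; i++) {
		if (map->cpuid_to_apicid[i] == apicid)
			return i;
	}
	if (map->nr_logical_cpuids >= MODEL_MAX_CPUS)
		return -1;

	map->cpuid_to_apicid[map->nr_logical_cpuids] = apicid;
	return map->nr_logical_cpuids++;
}

/* Mirrors the added check: skip disabled entries carrying the filtered APIC ID. */
static int model_register_lapic(struct cpu_map *map, struct madt_entry e,
				int disabled_id)
{
	if (!e.enabled && e.apic_id == disabled_id) {
		map->disabled_cpus++;
		return -1;
	}
	return model_allocate_logical_cpuid(map, e.apic_id);
}

/* Walk the entries in table order; cpu 0 is pre-assigned to the boot CPU. */
static void model_parse_madt(struct cpu_map *map, int boot_apicid,
			     const struct madt_entry *entries, size_t n,
			     int disabled_id)
{
	size_t i;

	assert(map != NULL && entries != NULL);

	for (i = 0; i < MODEL_MAX_CPUS; i++)
		map->cpuid_to_apicid[i] = -1;
	map->cpuid_to_apicid[0] = boot_apicid;
	map->nr_logical_cpuids = 1;
	map->disabled_cpus = 0;

	for (i = 0; i < n; i++)
		model_register_lapic(map, entries[i], disabled_id);
}
```

Feeding the model two disabled 0xff entries followed by enabled APIC IDs
0x00, 0x02 and 0x04 (boot APIC ID 0x00) puts 0xff at cpu 1 when nothing is
filtered, which matches the init_cpu_to_node output above. With 0xff
filtered, cpu 1 maps to 0x02 and the two disabled entries are only counted.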