From: Thomas Gleixner
To: Carsten Tolkmit
Cc: LKML, x86@kernel.org
Subject: x86/topology: Handle bogus ACPI tables correctly
Date: Fri, 17 May 2024 16:40:36 +0200
Message-ID: <87le48jycb.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

The ACPI specification clearly states how the processors should be
enumerated in the MADT:

 "To ensure that the boot processor is supported post initialization,
  two guidelines should be followed. The first is that OSPM should
  initialize processors in the order that they appear in the MADT. The
  second is that platform firmware should list the boot processor as the
  first processor entry in the MADT.
  ...
  Failure of OSPM implementations and platform firmware to abide by
  these guidelines can result in both unpredictable and non optimal
  platform operation."

The kernel relies on that ordering to detect the real BSP on crash kernels,
which is important to avoid sending an INIT IPI to it as that would cause a
full machine reset.

On a Dell XPS 16 9640 the BIOS ignores this rule and enumerates the CPUs in
the wrong order. As a consequence the kernel falsely detects a crash kernel
and disables the corresponding CPU.

Prevent this by checking the IA32_APICBASE MSR for the BSP bit on the boot
CPU. If that bit is set, then the MADT based BSP detection can be safely
ignored. If the kernel detects a mismatch between the BSP bit and the first
enumerated MADT entry then emit a firmware bug message.

This obviously also has to be taken into account when the boot APIC ID and
the first enumerated APIC ID match. If the boot CPU does not have the BSP
bit set in the APICBASE MSR then there is no way for the boot CPU to
determine which of the CPUs is the real BSP. Sending an INIT to the real
BSP would reset the machine so the only sane way to deal with that is to
limit the number of CPUs to one and emit a corresponding warning message.

Fixes: 5c5682b9f87a ("x86/cpu: Detect real BSP on crash kernels")
Reported-by: Carsten Tolkmit
Signed-off-by: Thomas Gleixner
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218837
---

This is a slightly different solution than the initial patch I provided in
the bugzilla. Carsten, can you please test that again?

---
 arch/x86/kernel/cpu/topology.c |   43 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -128,6 +128,9 @@ static void topo_set_cpuids(unsigned int
 
 static __init bool check_for_real_bsp(u32 apic_id)
 {
+	bool is_bsp = false, has_apic_base = boot_cpu_data.x86 >= 6;
+	u64 msr;
+
 	/*
 	 * There is no real good way to detect whether this a kdump()
 	 * kernel, but except on the Voyager SMP monstrosity which is not
@@ -144,17 +147,51 @@ static __init bool check_for_real_bsp(u3
 	if (topo_info.real_bsp_apic_id != BAD_APICID)
 		return false;
 
+	/*
+	 * Check whether the enumeration order is broken by evaluating the
+	 * BSP bit in the APICBASE MSR. If the CPU does not have the
+	 * APICBASE MSR then the BSP detection is not possible and the
+	 * kernel must rely on the firmware enumeration order.
+	 */
+	if (has_apic_base) {
+		rdmsrl(MSR_IA32_APICBASE, msr);
+		is_bsp = !!(msr & MSR_IA32_APICBASE_BSP);
+	}
+
 	if (apic_id == topo_info.boot_cpu_apic_id) {
-		topo_info.real_bsp_apic_id = apic_id;
-		return false;
+		if (is_bsp || !has_apic_base) {
+			topo_info.real_bsp_apic_id = apic_id;
+			return false;
+		}
+		/*
+		 * If the boot APIC is enumerated first, but the APICBASE
+		 * MSR does not have the BSP bit set, then there is no way
+		 * to discover the real BSP here. Assume a crash kernel and
+		 * limit the number of CPUs to 1 as an INIT to the real BSP
+		 * would reset the machine.
+		 */
+		pr_warn("Enumerated BSP APIC %x is not marked in APICBASE MSR\n", apic_id);
+		pr_warn("Assuming crash kernel. Limiting to one CPU to prevent machine INIT\n");
+		set_nr_cpu_ids(1);
+		goto fwbug;
 	}
 
-	pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x > %x\n",
+	pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x != %x\n",
 		topo_info.boot_cpu_apic_id, apic_id);
+
+	if (is_bsp && has_apic_base) {
+		topo_info.real_bsp_apic_id = topo_info.boot_cpu_apic_id;
+		goto fwbug;
+	}
+
 	pr_warn("Crash kernel detected. Disabling real BSP to prevent machine INIT\n");
 
 	topo_info.real_bsp_apic_id = apic_id;
 	return true;
+
+fwbug:
+	pr_warn(FW_BUG "APIC enumeration order not specification compliant\n");
+	return false;
 }
 
 static unsigned int topo_unit_count(u32 lvlid, enum x86_topology_domains at_level,
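
For anyone who wants to reason about the four outcomes without booting a
kernel, the decision logic of the patch above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct
bsp_model, enum bsp_verdict, MODEL_BAD_APICID and model_check_first_entry()
are hypothetical names made up for illustration, the MSR read is replaced by
two booleans supplied by the caller, and the warnings and set_nr_cpu_ids(1)
are reduced to a return code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BAD_APICID; illustration only. */
#define MODEL_BAD_APICID 0xFFFFFFFFu

/* Hypothetical verdicts for the first enumerated MADT entry. */
enum bsp_verdict {
	BSP_OK,			/* first entry is the boot CPU; nothing to reject */
	BSP_FWBUG,		/* firmware order is wrong; boot CPU is the real BSP */
	BSP_FWBUG_ONE_CPU,	/* real BSP unknown; the patch limits to one CPU */
	BSP_CRASH_KERNEL,	/* first entry is the real BSP; it must stay offline */
};

/* Hypothetical model state; the booleans replace the APICBASE MSR read. */
struct bsp_model {
	uint32_t boot_cpu_apic_id;	/* APIC ID of the CPU running this code */
	uint32_t real_bsp_apic_id;	/* MODEL_BAD_APICID until decided */
	bool has_apic_base;		/* CPU has the APICBASE MSR */
	bool bsp_bit;			/* BSP bit as it would be read from the MSR */
};

/* Mirrors the branch structure of the patched check_for_real_bsp(). */
static enum bsp_verdict model_check_first_entry(struct bsp_model *m,
						uint32_t first_apic_id)
{
	bool is_bsp;

	assert(m != NULL);

	/* Without the MSR the BSP bit cannot be evaluated at all. */
	is_bsp = m->has_apic_base && m->bsp_bit;

	if (first_apic_id == m->boot_cpu_apic_id) {
		if (is_bsp || !m->has_apic_base) {
			m->real_bsp_apic_id = first_apic_id;
			return BSP_OK;
		}
		/* Enumerated first, but not marked as BSP: real BSP unknown. */
		return BSP_FWBUG_ONE_CPU;
	}

	if (is_bsp) {
		/* The MSR says this CPU is the BSP; the table order is bogus. */
		m->real_bsp_apic_id = m->boot_cpu_apic_id;
		return BSP_FWBUG;
	}

	/* Not the BSP and not enumerated first: treat as a crash kernel. */
	m->real_bsp_apic_id = first_apic_id;
	return BSP_CRASH_KERNEL;
}
```

The case reported against the Dell XPS 16 9640 corresponds to
has_apic_base = true, bsp_bit = true and a first MADT entry whose APIC ID
differs from the boot CPU's: the model returns BSP_FWBUG and records the
boot CPU as the real BSP, so no CPU is disabled.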