From: Thomas Gleixner
To: "Michael Kelley (LINUX)", Peter Zijlstra
Cc: LKML, "x86@kernel.org", Tom Lendacky, Andrew Cooper, Arjan van de Ven,
    "James E.J. Bottomley", Dick Kennedy, James Smart, "Martin K. Petersen",
    "linux-scsi@vger.kernel.org", Guenter Roeck, "linux-hwmon@vger.kernel.org",
    Jean Delvare, Huang Rui, Juergen Gross, Steve Wahl, Mike Travis,
    Dimitri Sivanich, Russ Anderson, linux-hyperv@vger.kernel.org,
    Linus Torvalds, Greg Kroah-Hartman
Subject: RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
In-Reply-To: <873513n31m.ffs@tglx>
References: <20230728105650.565799744@linutronix.de>
    <20230728120930.839913695@linutronix.de> <871qgop8dc.ffs@tglx>
    <20230731132714.GH29590@hirez.programming.kicks-ass.net>
    <87sf94nlaq.ffs@tglx> <87fs53n6xd.ffs@tglx> <873513n31m.ffs@tglx>
Date: Wed, 02 Aug 2023 00:25:07 +0200
Message-ID: <87r0omjt8c.ffs@tglx>

Michael!

On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> Clearly the hyper-v BIOS people put a lot of thoughts into this:
>
>>   x2APIC features / processor topology (0xb):
>>      extended APIC ID                      = 0
>>      --- level 0 ---
>>      level number                          = 0x0 (0)
>>      level type                            = thread (1)
>>      bit width of level                    = 0x1 (1)
>>      number of logical processors at level = 0x2 (2)
>>      --- level 1 ---
>>      level number                          = 0x1 (1)
>>      level type                            = core (2)
>>      bit width of level                    = 0x6 (6)
>>      number of logical processors at level = 0x40 (64)
>
> FAIL:                                         ^^^^^
>
> While that field is not meant for topology evaluation it is at least
> expected to tell the actual number of logical processors at that level
> which are actually available.
>
> The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> 36 logical CPUs in package 0 and 36 in package 1 according to your
> table.
>
> On real hardware this looks like this:
>
>      --- level 1 ---
>      level number                          = 0x1 (1)
>      level type                            = core (2)
>      bit width of level                    = 0x6 (6)
>      number of logical processors at level = 0x38 (56)
>
> Which corresponds to reality and is consistent. But sure, consistency is
> overrated.

So I looked really hard to find some hint how to detect this situation
on the boot CPU, which would allow us to mitigate it, but there is none
at all.

So we are caught between a rock and a hard place, which leaves us two
mutually exclusive options to choose from:

  1) Have a sane topology evaluation mechanism which solves the known
     problems of hybrid systems, wrong sizing estimates and other
     unpleasantries.

  2) Support the Hyper-V BIOS trainwreck forever.

Unsurprisingly #2 is not really an option as #1 is a crucial issue for
the kernel and we need it resolved urgently as of yesterday.

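As an illustration of how the "bit width of level" fields in the quoted
leaf 0xB output are meant to be used: each width is a shift which splits
the APIC ID into thread, core and package numbers. The following is a
minimal userspace sketch, assuming the widths from the quoted output (1
for the thread level, 6 for the core level). The struct and function
names are hypothetical and this is not the kernel's parser:

```c
#include <assert.h>
#include <stdint.h>

/* Shift widths as reported by CPUID leaf 0xB ("bit width of level"). */
struct topo_shifts {
	unsigned int smt_shift;		/* level 0 (thread) width, e.g. 1 */
	unsigned int core_shift;	/* level 1 (core) width, e.g. 6 */
};

/* Thread number within its core: the low smt_shift bits. */
uint32_t topo_smt_id(uint32_t apicid, struct topo_shifts s)
{
	assert(s.smt_shift < 32);
	return apicid & ((1u << s.smt_shift) - 1);
}

/* Core number within its package: the bits between the two shifts. */
uint32_t topo_core_id(uint32_t apicid, struct topo_shifts s)
{
	assert(s.smt_shift <= s.core_shift && s.core_shift < 32);
	return (apicid & ((1u << s.core_shift) - 1)) >> s.smt_shift;
}

/* Package number: everything above the core level width. */
uint32_t topo_pkg_id(uint32_t apicid, struct topo_shifts s)
{
	assert(s.core_shift < 32);
	return apicid >> s.core_shift;
}
```

With widths 1 and 6, APIC ID 0x24 decodes to package 0 and APIC ID 0x40
to package 1. A 6 bit wide core level leaves room for at most 64 APIC
IDs per package, independent of how many logical processors are really
there.
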
So while I'm definitely a strong supporter of the no-regression policy,
I have to make an argument here why this particular issue is _not_
covered:

  1) Hyper-V BIOS/firmware violates the firmware specification and
     requirements which are clearly spelled out in the SDM.

  2) This violation is reported on every boot with one prominent
     message per brought up AP where the initial APIC ID as provided by
     CPUID leaf 0xB deviates from the APIC ID read from "hardware",
     which is also provided by MADT, starting with CPU 36 in the
     provided example:

       "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"

     repeating itself up to CPU71 with the relevant diverging APIC IDs.

     At least that's what the upstream kernel produces according to
     validate_apic_and_package_id() in such a situation.

  3) This has been known for years and the Hyper-V Linux team tried to
     get this resolved, but obviously their arguments fell on deaf ears.

     IOW, the firmware BUG message has been ignored willfully for years
     due to a "works for me, why should I care?" attitude.

Seriously, kernel development cannot be held hostage forever by the
willful ignorance of a BIOS team, which refuses to adhere to
specifications and defines their own world order.

The x86 maintainer team is choosing the lesser of two evils and lets
those who created the problem and refused to resolve it deal with the
outcome.

Just to clarify: this is not preventing affected guests from booting.
The worst consequence is a slight performance regression because the
firmware provided topology information does not match reality and
therefore the scheduler placement vs. L3 affinity sucks. That's clearly
not a kernel problem.

I'm happy to aid accelerating this thought process by elevating the
existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->topo.apicid) {
+	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}
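
For readers not familiar with WARN_ON_ONCE(): it evaluates the condition
every time and returns its truth value, but it emits the warning only
the first time the condition is true. So with the change above the
pr_err() line is still printed for every mismatching CPU, while the
warning shows up once. A rough userspace imitation of that behaviour
follows. warn_on_once(), validate_apicid() and the two counters are
hypothetical stand-ins for illustration, not kernel code:

```c
#include <stdbool.h>
#include <stdio.h>

static bool warned;	/* set once the warning has been emitted */
int warn_count;		/* how often the warning was emitted */
int err_count;		/* how often the per CPU error was printed */

/*
 * Stand-in for WARN_ON_ONCE(): warn only on the first true condition,
 * but always hand the condition back to the caller.
 */
bool warn_on_once(bool cond)
{
	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: condition hit (reported once)\n");
	}
	return cond;
}

/* Mirrors the shape of the patched check for a single CPU. */
void validate_apicid(unsigned int cpu, unsigned int fw_apicid,
		     unsigned int cpuid_apicid)
{
	if (warn_on_once(fw_apicid != cpuid_apicid)) {
		err_count++;
		fprintf(stderr,
			"[FIRMWARE BUG] CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
			cpu, fw_apicid, cpuid_apicid);
	}
}
```

Calling validate_apicid(36, 0x40, 0x24) reproduces the message quoted
earlier in this mail. Further mismatching CPUs add one error line each,
but no second warning.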