Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp2782781rdb; Tue, 12 Sep 2023 11:55:36 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE7OIU6m7byBY7kViDVdSTu9v2w4PWbViNsMyKfJu/nv96aTWBvu7klY3IkSjpQ3m3hiHeD X-Received: by 2002:a05:6a20:3956:b0:14c:5dc3:f1c9 with SMTP id r22-20020a056a20395600b0014c5dc3f1c9mr315847pzg.49.1694544936392; Tue, 12 Sep 2023 11:55:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694544936; cv=none; d=google.com; s=arc-20160816; b=uz9GoLtWd/SQB3P2C7yvsMq6mrbSPDMrE1CJevnXae4kyduqXfer3/CN8GbzP7cb3c S2nTgdGlay3/5/Y0yJELCoUjATrTqxvNBsaG+r3kF/PfqkbGVtAvob4SZa/Mjvl0W+9b PKUQRJ7M3vOC79FnoKEk1xh6M2BBnau7z8Gz9O66lDHfJy1+7NEMZPOf48pF9WiWsizx 4vsmFvZohwKujetZdVSyJaVMuupC3iO5Bta6PJeG3kDL2Id2BK9yLRpzJw7mDjkrQqE/ viNtlrKoa4jTrQji+SvkVAZ2pzJUk6Bm6yqRyThRfNdvzuTHShaDnS5Npn6y4kErto1m 4uRQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:date:mime-version:references:subject:cc:to:from :dkim-signature:dkim-signature:message-id; bh=2kqLUw1a4gV9FrIDWZ/O+PSOOuBOB0W9vRdpe8yDcXg=; fh=u57tXYamzTrJA+Ht8n1u7SfTMptrQaIb6LVW+jsaYf4=; b=IylhhTZxjo36nbWdeWZDmIAASX7CLWCQR6R0ySvAMm2Xx8rdTk06UGV/kqbew01WrR 1ganpTRmG4H1ZHgERsu3CzP7Hs8HxVMa/XK7QJ+bqGjJF1EcZUO53Kn+siJ5HmppYaD4 fU++cmxhjd6zmDsefZh6BjIhL+JISGf/AiB4BALWugS46ydseUj2swYwtGiV23D7OZsu 6MWGLk2pK9GDVtAa4Jw8fjnwLjkWinpJz3HzvzaaIQX/MDTeb5zSSOG/DkBx0z+1AZ04 F5ysW0cB/idg8uqV9+POufOgwDYZQtDW3T5Vyqd+P+HXA43zgRMXm9g5nNO8wkFrt/kY UEnA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=QqaPuXD8; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=PqPSfxGD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from howler.vger.email (howler.vger.email. [23.128.96.34]) by mx.google.com with ESMTPS id h64-20020a636c43000000b00565db8bec39si8499214pgc.173.2023.09.12.11.55.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Sep 2023 11:55:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) client-ip=23.128.96.34; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=QqaPuXD8; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=PqPSfxGD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 27FAE851B9E0; Tue, 12 Sep 2023 01:00:39 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232076AbjILIA2 (ORCPT + 99 others); Tue, 12 Sep 2023 04:00:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232211AbjILH6y (ORCPT ); Tue, 12 Sep 2023 03:58:54 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3E3610F3 for ; Tue, 12 Sep 2023 00:58:18 -0700 (PDT) Message-ID: <20230912065502.082789879@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1694505497; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=2kqLUw1a4gV9FrIDWZ/O+PSOOuBOB0W9vRdpe8yDcXg=; b=QqaPuXD8otO8MOEIXvyvoMFbgqrpc8ZVKklazBuJHJC9cWBbTtTw1x3eGLytyNxgdmimyp WBzksNyIlnKhE12fHgbaM8Z7xBsWve9SDi2jxGaJmk38DpyRNkAVnQw4QV02fS183isYgS s74li/BXboVIoqEr9LtkbE0rLqY8+Aqec3jyTu+eaJ6gs65O5US7LuWGV98RfmTz/K6N+l 6gorRnyMPBhtqD6COqK2L6x8iHs14w28TNmytckJckq9A5p0idz43iIZweDXN8DnAGJbos CA4rvHzMHsJV8U3RfJAGjOpyk5ofnU5eQDYoMDuHlqRZqCWiyRWr5tbHAQt9nQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1694505497; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=2kqLUw1a4gV9FrIDWZ/O+PSOOuBOB0W9vRdpe8yDcXg=; b=PqPSfxGDhKa1TXBJ4lTbeEw9jb3ASVQZbMMAeaWiCeX7/ccIi0d6SHszRnlHLIpKrgPJVG fTrFWjT0+pB6r5BQ== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Borislav Petkov , "Chang S. Bae" , Arjan van de Ven , Nikolay Borisov Subject: [patch V3 21/30] x86/microcode: Add per CPU result state References: <20230912065249.695681286@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Date: Tue, 12 Sep 2023 09:58:16 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Tue, 12 Sep 2023 01:00:39 -0700 (PDT) From: Thomas Gleixner The microcode rendevouz is purely acting on global state, which does not allow to analyze fails in a coherent way. Introduce per CPU state where the results are written into, which allows to analyze the return codes of the individual CPUs. Initialize the state when walking the cpu_present_mask in the online check to avoid another for_each_cpu() loop. Enhance the result print out with that. The structure is intentionally named ucode_ctrl as it will gain control fields in subsequent changes. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/microcode/core.c | 108 ++++++++++++++++++------------- arch/x86/kernel/cpu/microcode/internal.h | 1 2 files changed, 65 insertions(+), 44 deletions(-) --- --- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -324,6 +324,11 @@ static struct platform_device *microcode * requirement can be relaxed in the future. Right now, this is conservative * and good. */ +struct ucode_ctrl { + enum ucode_state result; +}; + +static DEFINE_PER_CPU(struct ucode_ctrl, ucode_ctrl); static atomic_t late_cpus_in, late_cpus_out; static bool wait_for_cpus(atomic_t *cnt) @@ -344,23 +349,19 @@ static bool wait_for_cpus(atomic_t *cnt) return false; } -/* - * Returns: - * < 0 - on error - * 0 - success (no update done or microcode was updated) - */ -static int __reload_late(void *info) +static int ucode_load_cpus_stopped(void *unused) { int cpu = smp_processor_id(); - enum ucode_state err; - int ret = 0; + enum ucode_state ret; /* * Wait for all CPUs to arrive. A load will not be attempted unless all * CPUs show up. * */ - if (!wait_for_cpus(&late_cpus_in)) - return -1; + if (!wait_for_cpus(&late_cpus_in)) { + this_cpu_write(ucode_ctrl.result, UCODE_TIMEOUT); + return 0; + } /* * On an SMT system, it suffices to load the microcode on one sibling of @@ -369,17 +370,11 @@ static int __reload_late(void *info) * loading attempts happen on multiple threads of an SMT core. See * below. */ - if (cpumask_first(topology_sibling_cpumask(cpu)) == cpu) - err = microcode_ops->apply_microcode(cpu); - else + if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu) goto wait_for_siblings; - if (err >= UCODE_NFOUND) { - if (err == UCODE_ERROR) { - pr_warn("Error reloading microcode on CPU %d\n", cpu); - ret = -1; - } - } + ret = microcode_ops->apply_microcode(cpu); + this_cpu_write(ucode_ctrl.result, ret); wait_for_siblings: if (!wait_for_cpus(&late_cpus_out)) @@ -391,19 +386,18 @@ static int __reload_late(void *info) * per-cpu cpuinfo can be updated with right microcode * revision. */ - if (cpumask_first(topology_sibling_cpumask(cpu)) != cpu) - err = microcode_ops->apply_microcode(cpu); + if (cpumask_first(topology_sibling_cpumask(cpu)) == cpu) + return 0; - return ret; + ret = microcode_ops->apply_microcode(cpu); + this_cpu_write(ucode_ctrl.result, ret); + return 0; } -/* - * Reload microcode late on all CPUs. Wait for a sec until they - * all gather together. - */ -static int microcode_reload_late(void) +static int ucode_load_late_stop_cpus(void) { - int old = boot_cpu_data.microcode, ret; + unsigned int cpu, updated = 0, failed = 0, timedout = 0, siblings = 0; + int old_rev = boot_cpu_data.microcode; struct cpuinfo_x86 prev_info; pr_err("Attempting late microcode loading - it is dangerous and taints the kernel.\n"); @@ -418,26 +412,47 @@ static int microcode_reload_late(void) */ store_cpu_caps(&prev_info); - ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask); + stop_machine_cpuslocked(ucode_load_cpus_stopped, NULL, cpu_online_mask); + + /* Analyze the results */ + for_each_cpu_and(cpu, cpu_present_mask, &cpus_booted_once_mask) { + switch (per_cpu(ucode_ctrl.result, cpu)) { + case UCODE_UPDATED: updated++; break; + case UCODE_TIMEOUT: timedout++; break; + case UCODE_OK: siblings++; break; + default: failed++; break; + } + } if (microcode_ops->finalize_late_load) - microcode_ops->finalize_late_load(ret); + microcode_ops->finalize_late_load(!updated); - if (!ret) { - pr_info("Reload succeeded, microcode revision: 0x%x -> 0x%x\n", - old, boot_cpu_data.microcode); - microcode_check(&prev_info); - add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK); - } else { - pr_info("Reload failed, current microcode revision: 0x%x\n", - boot_cpu_data.microcode); + if (!updated) { + /* Nothing changed. */ + if (!failed && !timedout) + return 0; + pr_err("Microcode update failed: %u CPUs failed %u CPUs timed out\n", + failed, timedout); + return -EIO; } - return ret; + + add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK); + pr_info("Microcode load: updated on %u primary CPUs with %u siblings\n", updated, siblings); + if (failed || timedout) { + pr_err("Microcode load incomplete. %u CPUs timed out or failed\n", + num_online_cpus() - (updated + siblings)); + } + pr_info("Microcode revision: 0x%x -> 0x%x\n", old_rev, boot_cpu_data.microcode); + microcode_check(&prev_info); + + return updated + siblings == num_online_cpus() ? 0 : -EIO; } /* - * Ensure that all required CPUs which are present and have been booted - * once are online. + * This function does two things: + * + * 1) Ensure that all required CPUs which are present and have been booted + * once are online. * * To pass this check, all primary threads must be online. * @@ -448,9 +463,12 @@ static int microcode_reload_late(void) * behaviour is undefined. The default play_dead() implementation on * modern CPUs is using MWAIT, which is also not guaranteed to be safe * against a microcode update which affects MWAIT. + * + * 2) Initialize the per CPU control structure */ -static bool ensure_cpus_are_online(void) +static bool ucode_setup_cpus(void) { + struct ucode_ctrl ctrl = { .result = -1, }; unsigned int cpu; for_each_cpu_and(cpu, cpu_present_mask, &cpus_booted_once_mask) { @@ -460,6 +478,8 @@ static bool ensure_cpus_are_online(void) return false; } } + /* Initialize the per CPU state */ + per_cpu(ucode_ctrl, cpu) = ctrl; } return true; } @@ -468,13 +488,13 @@ static int ucode_load_late_locked(void) { int ret; - if (!ensure_cpus_are_online()) + if (!ucode_setup_cpus()) return -EBUSY; ret = microcode_ops->request_microcode_fw(0, µcode_pdev->dev); if (ret != UCODE_NEW) return ret == UCODE_NFOUND ? -ENOENT : -EBADFD; - return microcode_reload_late(); + return ucode_load_late_stop_cpus(); } static ssize_t reload_store(struct device *dev, --- a/arch/x86/kernel/cpu/microcode/internal.h +++ b/arch/x86/kernel/cpu/microcode/internal.h @@ -16,6 +16,7 @@ enum ucode_state { UCODE_UPDATED, UCODE_NFOUND, UCODE_ERROR, + UCODE_TIMEOUT, }; struct microcode_ops {