Date: Wed, 1 Feb 2023 23:40:58 +0100
From: Borislav Petkov
To: Dave Hansen
Cc: Ashok Raj, Thomas Gleixner, LKML, x86, Ingo Molnar, Tony Luck,
    Alison Schofield, Reinette Chatre, Tom Lendacky, Stefan Talpalaru,
    David Woodhouse, Benjamin Herrenschmidt, Jonathan Corbet,
    "Rafael J. Wysocki", Peter Zilstra, Andy Lutomirski, Andrew Cooper,
    Boris Ostrovsky, Martin Pohlack
Subject: Re: [Patch v3 Part2 4/9] x86/microcode: Do not call
    apply_microcode() on sibling threads
References: <20230130213955.6046-1-ashok.raj@intel.com>
    <20230130213955.6046-5-ashok.raj@intel.com>

On Wed, Feb 01, 2023 at 02:21:18PM -0800, Dave Hansen wrote:
> That works great, unless T0 experiences an error.  In that case, T0 will
> jump out of __reload_late() after failing to do the update.  T1 will
> come bumbling along after it and will enter ->apply_microcode(),
> blissfully unaware of T0's failure.  T1 will assume that it is supposed
> to do T0's job, noting "rev < mc->hdr.rev".  T1 will write the MSR while
> T0 is off doing god knows what.
>
> T1 should not even be attempting to do ->apply_microcode() because T0 is
> not quiescent.

Yah, thanks for explaining properly.

So, if T0 fails, then we will say that it failed. The ->apply_microcode()
call on T1 was never meant to apply any microcode - just to update the
cached data.

Now, if T0 fails, then it doesn't matter what T1 does - you have a bigger
problem: a subset of the cores is running with the new microcode while the
other subset is still running with the old one.

Now this is a shit situation I don't want to be in. And I don't have a
good way out of it. Revert to the old patch? Maybe...

Retry the application on all of them again, in the hope that it works this
time?

What if some core touches an MSR being added with the new microcode patch?

Late loading is a big PITA. As we've been preaching for a while now.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette