Date: Fri, 6 Sep 2019 18:17:35 +0200
From: Borislav Petkov
To: Johannes Erdfelt
Cc: Thomas Gleixner, "Raj, Ashok", Boris Ostrovsky, Mihai
    Carabas, "H. Peter Anvin", Ingo Molnar, Jon Grimm,
    kanth.ghatraju@oracle.com, konrad.wilk@oracle.com,
    patrick.colp@oracle.com, Tom Lendacky, x86-ml,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/microcode: Add an option to reload microcode even
    if revision is unchanged
Message-ID: <20190906161735.GH19008@zn.tnic>
References: <20190905002132.GA26568@otc-nc-03>
    <20190905072029.GB19246@zn.tnic>
    <20190905194044.GA3663@otc-nc-03>
    <20190905222706.GA4422@otc-nc-03>
    <20190906144039.GA29569@sventech.com>
    <20190906151617.GE19008@zn.tnic>
    <20190906154618.GB29569@sventech.com>
In-Reply-To: <20190906154618.GB29569@sventech.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 06, 2019 at 08:46:18AM -0700, Johannes Erdfelt wrote:
> That said, we very much rely on late microcode loading and it has helped
> us and our customers significantly.

You do realize that you rely on an update method which *won't* work in
all possible cases and then you *will* have to reboot if the microcode
patching *must* happen early, do you?

> It's really easy to say "fix your infrastructure" when you're not
> running that infrastructure.

I'm not saying you should fix your infrastructure now - I'm saying you
should keep that in mind when thinking whether to rely more on late
loading or not. Who knows, maybe newer generation machines in the fleet
could do load balancing, live migration, whatever fancy new cloud stuff
it is, to facilitate a proper reboot.

Or someone could rewrite arch/x86/ to rediscover new features upon a
microcode reload or a feature disabling. And do that in a clean way. Who
knows...

> Reboots suck. Customers hate it. Operations hates it. When you get into
> the number of hosts we have, you run into all kinds of weird failure
> scenarios. (What do you mean that the NIC that was working just fine
> before the reboot is no longer seen on the PCI bus?)

Yeah, I've heard all the stories.

> The more reboots we can avoid, the better it is for us and our
> customers.

So how do you update the kernels on those machines? Or you live-patch in
the new functionality too?

> I understand that it could be unsafe to late load some rare microcode
> updates (theoretical or not). However, that is certainly the exception.
> We have done this multiple times on our fleet and we plan to continue
> doing so in the future.

The fact that it has worked for you does not make it right. It won't
magically become safe, as tglx said. But since you do custom
development, you should be fine, it seems.

Practically speaking, late loading probably won't disappear as it is
being used apparently. Just don't expect that it will get "extended" if
that extension brings with itself fallout and duct tape fixes left and
right.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette