Date: Thu, 19 Apr 2018 14:02:39 +0200
From: Vitezslav Samel
To: Borislav Petkov
Cc: "Raj, Ashok", Greg Kroah-Hartman, linux-kernel@vger.kernel.org
Subject: Re: 4.15.17 regression: bisected: timeout during microcode update
Message-ID: <20180419120239.GA2377@pc11.op.pod.cz>
In-Reply-To: <20180419104829.GE3896@pd.tnic>

On Thu, Apr 19, 2018 at 12:48:29PM +0200, Borislav Petkov
wrote:
> On Thu, Apr 19, 2018 at 07:35:31AM +0200, Vitezslav Samel wrote:
> > > - Can you remove your builtin microcode,
> > > - rename the /lib/firmware/intel-ucode so we don't find it during late loading.
> > > - let the system boot completely
> > > - then rename the intel-ucode back for this test.
> > > - write 1 to reload and see if that update succeeds or fails?
> >
> >   Just tested, it fails.
>
> Can you apply the below patch, do the exact same exercise and catch the
> output? Over serial console or netconsole or if nothing else, do a video
> of the screen with a phone and upload it somewhere?

  Here it is:

-------------------------------------------------------------
microcode: __reload_late: CPU1
microcode: __reload_late: CPU3
microcode: __reload_late: CPU2
microcode: __reload_late: CPU0
microcode: __reload_late: CPU1 reloading
microcode: __reload_late: CPU3 reloading
microcode: __reload_late: CPU2 reloading
microcode: __reload_late: CPU0 reloading
microcode: __reload_late: CPU3 returning 0x0
microcode: __reload_late: CPU2 returning 0x0
microcode: updated to revision 0x24, date = 2018-01-21
microcode: __reload_late: CPU0 waiting to exit
microcode: __reload_late: CPU1 returning 0x0
microcode: Timeout while waiting for CPUs rendezvous, remaining: 3
Kernel panic - not syncing: Timeout during microcode update!
CPU: 0 PID: 11 Comm: migration/0 Not tainted 4.16.3+ #1
Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 2.2 02/05/2015
Call Trace:
 dump_stack+0x46/0x65
 panic+0xca/0x208
 __reload_late+0x11e/0x120
 multi_cpu_stop+0x55/0xa0
 ? cpu_stop_queue_work+0x80/0x80
 cpu_stopper_thread+0x7d/0x100
 ? sort_range+0x20/0x20
 smpboot_thread_fn+0x11f/0x1e0
 kthread+0x101/0x120
 ? __kthread_create_on_node+0x150/0x150
 ? __kthread_create_on_node+0xf0/0x150
 ret_from_fork+0x35/0x40
Shutting down cpus with NMI
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Timeout during microcode update!
-------------------------------------------------------------

	Vita

>
> Thx.
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 10c4fc2c91f8..374ec1d75d89 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -553,6 +553,8 @@ static int __reload_late(void *info)
>  	enum ucode_state err;
>  	int ret = 0;
>
> +	pr_info("%s: CPU%d\n", __func__, cpu);
> +
>  	/*
>  	 * Wait for all CPUs to arrive. A load will not be attempted unless all
>  	 * CPUs show up.
> @@ -560,6 +562,8 @@ static int __reload_late(void *info)
>  	if (__wait_for_cpus(&late_cpus_in, NSEC_PER_SEC))
>  		return -1;
>
> +	pr_info("%s: CPU%d reloading\n", __func__, cpu);
> +
>  	spin_lock(&update_lock);
>  	apply_microcode_local(&err);
>  	spin_unlock(&update_lock);
> @@ -571,9 +575,12 @@ static int __reload_late(void *info)
>  	} else if (err == UCODE_UPDATED || err == UCODE_OK) {
>  		ret = 1;
>  	} else {
> +		pr_info("%s: CPU%d returning 0x%x\n", __func__, cpu, ret);
>  		return ret;
>  	}
>
> +	pr_info("%s: CPU%d waiting to exit\n", __func__, cpu);
> +
>  	/*
>  	 * Increase the wait timeout to a safe value here since we're
>  	 * serializing the microcode update and that could take a while on a
>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> --