Date: Tue, 10 Jan 2023 13:01:06 +0100
From: Ingo Molnar
To: Zeng Heng
Cc: michael.roth@amd.com, bp@alien8.de, hpa@zytor.com, tglx@linutronix.de,
	sathyanarayanan.kuppuswamy@linux.intel.com,
	kirill.shutemov@linux.intel.com, jroedel@suse.de,
	keescook@chromium.org, mingo@redhat.com,
	dave.hansen@linux.intel.com, brijesh.singh@amd.com,
	linux-kernel@vger.kernel.org, x86@kernel.org, liwei391@huawei.com
Subject: [PATCH -v2] x86/boot/compressed: Register dummy NMI handler in EFI boot loader, to avoid kdump crashes
References: <20230110102745.2514694-1-zengheng4@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Ingo Molnar wrote:

>
> * Zeng Heng wrote:
>
> > +void do_boot_nmi_fault(struct pt_regs *regs, unsigned long error_code)
> > +{
> > +	/* ignore */
> > +}
> > diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
> > index 6debb816e83d..b169c9728d52 100644
> > --- a/arch/x86/boot/compressed/idt_64.c
> > +++ b/arch/x86/boot/compressed/idt_64.c
> > @@ -60,6 +60,7 @@ void load_stage2_idt(void)
> >  {
> >  	boot_idt_desc.address = (unsigned long)boot_idt;
> >
> > +	set_idt_entry(X86_TRAP_NMI, boot_nmi_fault);
> >  	set_idt_entry(X86_TRAP_PF, boot_page_fault);
>
> So it's a bit sad to install a dummy handler that does nothing, while
> something clearly sent an NMI and expects an intelligent reaction - OTOH
> the unexpected NMIs from the watchdog during a kdump clearly make things
> worse by crashing the bootup.
>
> Anyway, I cannot think of a better response here that the boot loading code
> could do either, so I've applied your fix to tip:x86/boot.

BTW., the changelog had very poor quality, and the patch added no comments
to explain the presence of the dummy NMI.

The -v2 version below should address most of those problems.

Thanks,

	Ingo

=============>
From: Zeng Heng
Date: Tue, 10 Jan 2023 18:27:45 +0800
Subject: [PATCH] x86/boot/compressed: Register dummy NMI handler in EFI boot loader, to avoid kdump crashes

If kdump is enabled, when using mce_inject to inject errors, the EFI boot
loader decompresses and loads the second kernel to save the vmcore file.
For normal errors that is fine.
However, in the MCE case, the panic CPU that first enters mce_panic() is
running in NMI context, and the processor blocks delivery of subsequent
NMIs until the next execution of the IRET instruction.

When the panic CPU spends a long time in the panic path and causes a
watchdog timeout, an NMI is by then already pending in the background.

In the reproducer sequence below, the panic CPU enters the EFI loader and
raises a page fault exception (for example when accessing the `vidmem`
variable while calling debug_putstr()), and the CPU executes an IRET
instruction when it exits the page fault handler.

That IRET unblocks NMI delivery, but the loader never registers a handler
for the NMI vector in the IDT. The missing handler causes a reboot, which
interrupts the kdump procedure, so the vmcore file is never saved.

Here are the steps to reproduce the above issue (it's sporadic):

 1. # cat uncorrected
    CPU 1 BANK 4
    STATUS uncorrected 0xc0
    MCGSTATUS EIPV MCIP
    ADDR 0x1234
    RIP 0xdeadbabe
    RAISINGCPU 0
    MCGCAP SER CMCI TES 0x6

 2. # modprobe mce_inject

 3. # mce-inject uncorrected

There are two ways to increase the probability of reproducing this issue:

 1. modify the threshold value of the watchdog (increase NMI frequency);

 2. and/or add delays before panic() in mce_panic() and modify the
    PANIC_TIMEOUT macro;

Fixes: ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table")
Signed-off-by: Zeng Heng
[ Tidy up changelog, add comments. ]
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20230110102745.2514694-1-zengheng4@huawei.com
---
 arch/x86/boot/compressed/ident_map_64.c    | 12 ++++++++++++
 arch/x86/boot/compressed/idt_64.c          |  1 +
 arch/x86/boot/compressed/idt_handlers_64.S |  1 +
 arch/x86/boot/compressed/misc.h            |  1 +
 4 files changed, 15 insertions(+)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index d4a314cc50d6..cbfdefcf9657 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -379,3 +379,15 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
 	 */
 	kernel_add_identity_map(address, end);
 }
+
+void do_boot_nmi_fault(struct pt_regs *regs, unsigned long error_code)
+{
+	/*
+	 * Default boot loader placeholder fault handler - there's no real
+	 * kernel running yet, so there's not much we can do - but NMIs
+	 * can arrive in a kdump scenario, for example by the NMI watchdog.
+	 *
+	 * Not having any handler would cause the CPU to silently reboot,
+	 * so we do the second-worst thing here and ignore the NMI.
+	 */
+}
diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index 6debb816e83d..b169c9728d52 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -60,6 +60,7 @@ void load_stage2_idt(void)
 {
 	boot_idt_desc.address = (unsigned long)boot_idt;
 
+	set_idt_entry(X86_TRAP_NMI, boot_nmi_fault);
 	set_idt_entry(X86_TRAP_PF, boot_page_fault);
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
diff --git a/arch/x86/boot/compressed/idt_handlers_64.S b/arch/x86/boot/compressed/idt_handlers_64.S
index 22890e199f5b..2aef8e1b515b 100644
--- a/arch/x86/boot/compressed/idt_handlers_64.S
+++ b/arch/x86/boot/compressed/idt_handlers_64.S
@@ -69,6 +69,7 @@ SYM_FUNC_END(\name)
 	.text
 	.code64
 
+EXCEPTION_HANDLER	boot_nmi_fault do_boot_nmi_fault error_code=0
 EXCEPTION_HANDLER	boot_page_fault do_boot_page_fault error_code=1
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 62208ec04ca4..d89d3f8417f6 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -187,6 +187,7 @@ static inline void cleanup_exception_handling(void) { }
 #endif
 
 /* IDT Entry Points */
+void boot_nmi_fault(void);
 void boot_page_fault(void);
 void boot_stage1_vc(void);
 void boot_stage2_vc(void);