Message-ID: <81f5d521-bc8a-4d1a-fe7e-55487f3d25b3@huawei.com>
Date: Thu, 23 Feb 2023 10:29:33 +0800
Subject: Re: [RFC PATCH v4] x86/kdump: terminate watchdog NMI interrupt to avoid kdump crashes
From: Zeng Heng
To: "Eric W. Biederman", Peter Zijlstra
References: <20230217120604.435608-1-zengheng4@huawei.com> <87r0uh5yud.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <87r0uh5yud.fsf@email.froward.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/2/23 2:39, Eric W. Biederman wrote:
> Peter Zijlstra writes:
>
>> On Fri, Feb 17, 2023 at 08:06:04PM +0800, Zeng Heng wrote:
>>> If the cpu panics within the NMI interrupt context, there could be
>>> unhandled NMI interrupts in the background which are blocked by processor
>>> until next IRET instruction executes. Since that, it prevents nested
>>> NMI handler execution.
>>>
>>> In case of IRET execution during kdump reboot and no proper NMIs handler
>>> registered at that point (such as during EFI loader)
> EFI loader? kexec on panic is supposed to be kernel to kernel.
> If someone is getting EFI involved that is a bug.

In the kdump path, kexec starts purgatory to verify the second kernel
by sha256. If verification passes, control is handed to the EFI loader,
which calls the second kernel to capture the environment as a vmcore
file.

As the mail said, if the panic occurs within NMI context, we never leave
that context until the EFI loader handles a page fault exception and
executes an IRET instruction on return from the page fault. At that
moment, the processor allows the blocked NMI to be raised.

>> This kills all of perf, including but not limited to the hardware
>> watchdog. However, it does nothing to external NMI sources like the NMI
>> button found on some HP machines.
>>
>> Still I suppose it is sufficient for the normal case.
> I can't think of one why we don't just leave
> NMIs deliberately disabled

How can NMIs simply be left disabled? Could you explain that in more
detail?

Zeng Heng

> until the crash recover kernel figured out how to enable them safely.
>
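
The sequence discussed in this thread (panic inside an NMI handler, further
NMIs held back until the next IRET, then delivery at a point where no usable
handler exists) can be illustrated with a small user-space toy model in C.
This is only a sketch under stated assumptions, not kernel code and not the
patch itself: `struct cpu_model`, `raise_nmi()`, `iret()` and `run_scenario()`
are hypothetical names invented for illustration, the model latches at most
one NMI while delivery is blocked, and "stopping the watchdog" is reduced to
simply not raising a second NMI.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model with hypothetical names; it does not mirror any kernel structure. */
struct cpu_model {
    bool nmi_blocked;   /* set when an NMI is delivered, cleared by IRET */
    bool nmi_pending;   /* an NMI latched while delivery was blocked */
    bool handler_ready; /* does the running code have a usable NMI handler? */
    int handled;        /* NMIs delivered to a usable handler */
    int unhandled;      /* NMIs delivered with no usable handler */
};

/* Deliver an NMI: further NMIs stay blocked until the next IRET. */
void deliver_nmi(struct cpu_model *cpu)
{
    assert(!cpu->nmi_blocked);
    cpu->nmi_blocked = true;
    if (cpu->handler_ready)
        cpu->handled++;
    else
        cpu->unhandled++;
}

/* An NMI source fires: deliver now, or latch it if delivery is blocked. */
void raise_nmi(struct cpu_model *cpu)
{
    if (cpu->nmi_blocked)
        cpu->nmi_pending = true;
    else
        deliver_nmi(cpu);
}

/* Any IRET unblocks NMIs, so a latched NMI is delivered right away. */
void iret(struct cpu_model *cpu)
{
    cpu->nmi_blocked = false;
    if (cpu->nmi_pending) {
        cpu->nmi_pending = false;
        deliver_nmi(cpu);
    }
}

/* Replay the scenario, with or without stopping the watchdog first. */
struct cpu_model run_scenario(bool stop_watchdog_before_kexec)
{
    struct cpu_model cpu = { .handler_ready = true };

    raise_nmi(&cpu);            /* watchdog NMI; the panic happens in its handler */
    /* The handler never returns, so no IRET runs and NMIs stay blocked. */
    if (!stop_watchdog_before_kexec)
        raise_nmi(&cpu);        /* watchdog fires again and is latched */
    cpu.handler_ready = false;  /* control leaves the crashed kernel */
    iret(&cpu);                 /* first IRET, e.g. on return from a page fault */
    return cpu;
}
```

In this model, `run_scenario(false)` ends with one NMI delivered while no
handler is available, whereas `run_scenario(true)` ends with none, which is
the effect the patch under discussion aims for with perf-driven watchdog
NMIs. External NMI sources, as noted in the quoted text, are outside what
this model (or the patch) addresses.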