Date: Mon, 25 Jan 2021 12:52:36 -0500
From: Steven Rostedt
To: Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Lai Jiangshan,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86@kernel.org, "H. Peter Anvin"
Subject: Re: [PATCH V2] x86/entry/64: De-Xen-ify our NMI code further
Message-ID: <20210125125236.14b295a5@gandalf.local.home>
In-Reply-To: <20210125123859.39b244ca@gandalf.local.home>
References: <20210125074506.15064-1-jiangshanlai@gmail.com>
	<20210125123859.39b244ca@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 25 Jan 2021 12:38:59 -0500
Steven Rostedt wrote:

> On triggering an NMI from user space, I see the switch to the thread stack
> is done, and "exc_nmi" is called.
>
> The problem I see with this is that exc_nmi is called with the thread
> stack, if it were to take an exception, NMIs would be enabled allowing for
> a nested NMI to run. From what I can tell, I don't see anything stopping
> that NMI from executing over the currently running NMI. That is, this means
> that NMI handlers are now re-entrant.
>
> Yes, the stack issue is not a problem here, but NMI handlers are not
> allowed to be re-entrant. For example, we have spin locks in NMI handlers
> that are considered fine if they are only used in NMI handlers. But because
> there's a possible way to make NMI handlers re-entrant then these spin
> locks can deadlock.
>
> I'm guessing that we need to add some tricks to the user space path to
> set and clear the "NMI executing" variable, but the return may become a bit
> complex in clearing that without races.

I think this may work if we wrap the exc_nmi call with the following:

Overwrite the NMI HW stack frame on the NMI stack as if an NMI came in at
the return back to the user space path of the NMI handler. Set the stack
pointer to the NMI stack just after the first frame that was updated. Then
jump to asm_exc_nmi.

Then the code would act like it came in from kernel mode, and execute the
NMI nesting code normally.
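
To illustrate why going through the nesting code matters, here is a rough
user-space model in C. It is only a sketch, not kernel code: every name in
it (nmi_model, model_handler, model_nmi, run_model) is made up for this
example, the "lock" is a plain flag instead of a real spin lock, and the
nesting code is modeled loosely as "latch the nested NMI and replay it
after the current handler finishes".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct nmi_model {
	bool executing;  /* models the "NMI executing" variable */
	bool latched;    /* a nested NMI arrived and must be replayed */
	bool lock_held;  /* models a spin lock taken only by NMI handlers */
	int pending;     /* nested NMIs still to be injected mid-handler */
	int handled;     /* handler runs that completed */
	int reentered;   /* handler entries while it was already running */
};

static void model_nmi(struct nmi_model *m, bool guarded);

/* Models exc_nmi: takes the NMI-only lock, then "takes an exception". */
static void model_handler(struct nmi_model *m, bool guarded)
{
	if (m->lock_held) {
		/* A real spin lock would deadlock here; just count it. */
		m->reentered++;
		return;
	}
	m->lock_held = true;

	/* The exception's iret re-enables NMIs, so a nested NMI can fire. */
	if (m->pending > 0) {
		m->pending--;
		model_nmi(m, guarded);
	}

	m->lock_held = false;
	m->handled++;
}

/*
 * guarded == false: the handler is called directly, as in the user space
 * path described above. guarded == true: entry goes through a model of
 * the nesting code, which latches a nested NMI instead of running it.
 */
static void model_nmi(struct nmi_model *m, bool guarded)
{
	if (!guarded) {
		model_handler(m, guarded);
		return;
	}
	if (m->executing) {
		m->latched = true;  /* replay later, do not run now */
		return;
	}
	m->executing = true;
	do {
		m->latched = false;
		model_handler(m, guarded);
	} while (m->latched);
	m->executing = false;
	assert(!m->lock_held);
}

/* Runs one NMI from user space with one nested NMI injected mid-handler. */
static struct nmi_model run_model(bool guarded)
{
	struct nmi_model m = { .pending = 1 };

	model_nmi(&m, guarded);
	return m;
}
```

In this model, run_model(false) re-enters the handler once while the lock
is held, which is where a real spin lock would deadlock. run_model(true)
replays the nested NMI afterwards, so the handler runs twice and is never
re-entered.
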
When it finishes and does the iretq, it will return to the NMI handler for
the user space return with the kernel thread stack, and then the special
code for returning to user space can be called.

The exc_nmi C code will need to handle this case to update pt_regs to make
sure the registered NMI handlers still see the pt_regs from user space.

But I think something like this may be the easiest way to handle this
without dealing with more NMI stack nesting races.

I could try to write something up to implement this.

-- Steve