From: Lai Jiangshan
To: linux-kernel@vger.kernel.org
Cc: Steven Rostedt, Peter Zijlstra, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH V2] x86/entry/64: De-Xen-ify our NMI code further
Date: Mon, 25 Jan 2021 15:45:06 +0800
Message-Id: <20210125074506.15064-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Commit 929bacec21478 ("x86/entry/64: De-Xen-ify our NMI code")
simplified the NMI code by changing paravirt code into native code
and left a comment about "inspecting RIP instead". Until now,
"inspecting RIP instead" has not been done, and this patch tries to
complete it.

The comments in the code are from Andy Lutomirski. Thanks!

Signed-off-by: Lai Jiangshan
---
 arch/x86/entry/entry_64.S | 44 ++++++++++----------------------------
 1 file changed, 11 insertions(+), 33 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index cad08703c4ad..21f67ea62341 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1268,32 +1268,14 @@ SYM_CODE_START(asm_exc_nmi)
 	je	nested_nmi
 
 	/*
-	 * Now test if the previous stack was an NMI stack.  This covers
-	 * the case where we interrupt an outer NMI after it clears
-	 * "NMI executing" but before IRET.  We need to be careful, though:
-	 * there is one case in which RSP could point to the NMI stack
-	 * despite there being no NMI active: naughty userspace controls
-	 * RSP at the very beginning of the SYSCALL targets.  We can
-	 * pull a fast one on naughty userspace, though: we program
-	 * SYSCALL to mask DF, so userspace cannot cause DF to be set
-	 * if it controls the kernel's RSP.  We set DF before we clear
-	 * "NMI executing".
+	 * Now test if we interrupted an outer NMI that just cleared "NMI
+	 * executing" and is about to IRET.  This is a single-instruction
+	 * window.  This check does not handle the case in which we get a
+	 * nested interrupt (#MC, #VE, #VC, etc.) after clearing
+	 * "NMI executing" but before the outer NMI executes IRET.
 	 */
-	lea	6*8(%rsp), %rdx
-	/* Compare the NMI stack (rdx) with the stack we came from (4*8(%rsp)) */
-	cmpq	%rdx, 4*8(%rsp)
-	/* If the stack pointer is above the NMI stack, this is a normal NMI */
-	ja	first_nmi
-
-	subq	$EXCEPTION_STKSZ, %rdx
-	cmpq	%rdx, 4*8(%rsp)
-	/* If it is below the NMI stack, it is a normal NMI */
-	jb	first_nmi
-
-	/* Ah, it is within the NMI stack. */
-
-	testb	$(X86_EFLAGS_DF >> 8), (3*8 + 1)(%rsp)
-	jz	first_nmi	/* RSP was user controlled. */
+	cmpq	$.Lnmi_iret, 8(%rsp)
+	jne	first_nmi
 
 	/* This is a nested NMI. */
 
@@ -1438,17 +1420,13 @@ nmi_restore:
 	addq	$6*8, %rsp
 
 	/*
-	 * Clear "NMI executing".  Set DF first so that we can easily
-	 * distinguish the remaining code between here and IRET from
-	 * the SYSCALL entry and exit paths.
-	 *
-	 * We arguably should just inspect RIP instead, but I (Andy) wrote
-	 * this code when I had the misapprehension that Xen PV supported
-	 * NMIs, and Xen PV would break that approach.
+	 * Clear "NMI executing".  This leaves a window in which a nested NMI
+	 * could observe "NMI executing" cleared, and a nested NMI will detect
+	 * this by inspecting RIP.
 	 */
-	std
 	movq	$0, 5*8(%rsp)		/* clear "NMI executing" */
 
+.Lnmi_iret: /* must be immediately after clearing "NMI executing" */
 	/*
 	 * iretq reads the "iret" frame and exits the NMI stack in a
 	 * single instruction.  We are returning to kernel mode, so this
-- 
2.19.1.6.gb485710b