From: Xin Li
To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org,
	andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com,
	ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com
Subject: [PATCH v7 27/33] x86/fred: fixup fault on ERETU by jumping to fred_entrypoint_user
Date: Tue, 4 Apr 2023 03:27:10 -0700
Message-Id: <20230404102716.1795-28-xin3.li@intel.com>
In-Reply-To: <20230404102716.1795-1-xin3.li@intel.com>
References: <20230404102716.1795-1-xin3.li@intel.com>

If the stack frame contains an invalid user context (e.g. due to invalid SS,
a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP).

From a Linux point of view, this really should be considered a user space
failure, so use the standard fault fixup mechanism to intercept the fault,
fix up the exception frame, and redirect execution to fred_entrypoint_user.
The end result is that it appears just as if the hardware had taken the
exception immediately after completing the transition to user space.

Suggested-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---

Changes since v6:
* Add a comment to explain why it is safe to write to a previous FRED stack
  frame. (Lai Jiangshan).

Changes since v5:
* Move the NMI bit from an invalid stack frame, which caused ERETU to fault,
  to the fault handler's stack frame, thus to unblock NMI ASAP if NMI is
  blocked (Lai Jiangshan).
---
 arch/x86/entry/entry_64_fred.S             |  8 ++-
 arch/x86/include/asm/extable_fixup_types.h |  4 +-
 arch/x86/mm/extable.c                      | 76 ++++++++++++++++++++++
 3 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index d975cacd060f..efe2bcd11273 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -5,8 +5,10 @@
  * The actual FRED entry points.
  */
 #include
-#include
+#include
 #include
+#include
+#include
 #include
 
 #include "calling.h"
@@ -38,7 +40,9 @@ SYM_CODE_START_NOALIGN(fred_entrypoint_user)
 	call	fred_entry_from_user
 SYM_INNER_LABEL(fred_exit_user, SYM_L_GLOBAL)
 	FRED_EXIT
-	ERETU
+1:	ERETU
+
+	_ASM_EXTABLE_TYPE(1b, fred_entrypoint_user, EX_TYPE_ERETU)
 SYM_CODE_END(fred_entrypoint_user)
 
 .fill fred_entrypoint_kernel - ., 1, 0xcc
diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h
index 991e31cfde94..1585c798a02f 100644
--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -64,6 +64,8 @@
 #define EX_TYPE_UCOPY_LEN4		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4))
 #define EX_TYPE_UCOPY_LEN8		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8))
 
-#define EX_TYPE_ZEROPAD		20 /* longword load with zeropad on fault */
+#define EX_TYPE_ZEROPAD		20 /* longword load with zeropad on fault */
+
+#define EX_TYPE_ERETU		21
 
 #endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 60814e110a54..9d82193adf3c 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -6,6 +6,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
@@ -195,6 +196,77 @@ static bool ex_handler_ucopy_len(const struct exception_table_entry *fixup,
 	return ex_handler_uaccess(fixup, regs, trapnr);
 }
 
+#ifdef CONFIG_X86_FRED
+static bool ex_handler_eretu(const struct exception_table_entry *fixup,
+			     struct pt_regs *regs, unsigned long error_code)
+{
+	struct pt_regs *uregs = (struct pt_regs *)(regs->sp - offsetof(struct pt_regs, ip));
+	unsigned short ss = uregs->ss;
+	unsigned short cs = uregs->cs;
+
+	/*
+	 * Move the NMI bit from the invalid stack frame, which caused ERETU
+	 * to fault, to the fault handler's stack frame, thus to unblock NMI
+	 * with the fault handler's ERETS instruction ASAP if NMI is blocked.
+	 */
+	regs->nmi = uregs->nmi;
+
+	fred_info(uregs)->edata = fred_event_data(regs);
+	uregs->ssx = regs->ssx;
+	uregs->ss = ss;
+	uregs->csx = regs->csx;
+	uregs->nmi = 0; /* The NMI bit was moved away above */
+	uregs->current_stack_level = 0;
+	uregs->cs = cs;
+
+	/*
+	 * Copy error code to uregs and adjust stack pointer accordingly.
+	 *
+	 * The RSP used by FRED to push a stack frame is not the value in %rsp,
+	 * it is calculated from %rsp with the following 2 steps:
+	 * 1) RSP = %rsp - (IA32_FRED_CONFIG & 0x1c0)	// Reserve N*64 bytes
+	 * 2) RSP = RSP & ~0x3f				// Align to a 64-byte cache line
+	 * when the event delivery doesn't trigger a stack level change.
+	 *
+	 * Here is an example with N*64 (N=1) bytes reserved:
+	 *
+	 *  64-byte cache line ==>  ______________
+	 *                         |___Reserved___|
+	 *                         |__Event_data__|
+	 *                         |_____SS_______|
+	 *                         |_____RSP______|
+	 *                         |_____FLAGS____|
+	 *                         |_____CS_______|
+	 *                         |_____IP_______| <== ERETU stack frame
+	 *  64-byte cache line ==> |__Error_code__|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________| <== RSP after step 1)
+	 *  64-byte cache line ==> |______________| <== RSP after step 2)
+	 *                         |___Reserved___|
+	 *                         |__Event_data__|
+	 *                         |_____SS_______|
+	 *                         |_____RSP______|
+	 *                         |_____FLAGS____|
+	 *                         |_____CS_______|
+	 *                         |_____IP_______| <== ERETS stack frame
+	 *  64-byte cache line ==> |__Error_code__|
+	 *
+	 * Thus a new FRED stack frame will always be pushed below a previous
+	 * FRED stack frame ((N*64) bytes may be reserved between), and it is
+	 * safe to write to a previous FRED stack frame as they never overlap.
+	 */
+	uregs->orig_ax = error_code;
+	regs->sp -= 8;
+
+	return ex_handler_default(fixup, regs);
+}
+#endif
+
 int ex_get_fixup_type(unsigned long ip)
 {
 	const struct exception_table_entry *e = search_exception_tables(ip);
@@ -272,6 +344,10 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code,
 		return ex_handler_ucopy_len(e, regs, trapnr, reg, imm);
 	case EX_TYPE_ZEROPAD:
 		return ex_handler_zeropad(e, regs, fault_addr);
+#ifdef CONFIG_X86_FRED
+	case EX_TYPE_ERETU:
+		return ex_handler_eretu(e, regs, error_code);
+#endif
 	}
 	BUG();
 }
-- 
2.34.1