From: Sasha Levin
To: linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de, luto@kernel.org
Cc: hpa@zytor.com, dave.hansen@intel.com, tony.luck@intel.com, ak@linux.intel.com, ravi.v.shankar@intel.com, chang.seok.bae@intel.com, Tom Lendacky, Vegard Nossum, Sasha Levin
Subject: [PATCH v10 07/18] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
Date: Thu, 23 Apr 2020 19:21:56
 -0400
Message-Id: <20200423232207.5797-8-sashal@kernel.org>
In-Reply-To: <20200423232207.5797-1-sashal@kernel.org>
References: <20200423232207.5797-1-sashal@kernel.org>

From: "Chang S. Bae"

Without FSGSBASE, user space cannot change GS base other than through
arch_prctl(). The kernel enforces that the user space GS base value is
positive as negative values are used for detecting the kernel space GS
base value in the paranoid entry code.

If FSGSBASE is enabled, user space can set arbitrary GS base values
without kernel intervention, including negative ones, which breaks the
paranoid entry assumptions.

To avoid this, paranoid entry needs to unconditionally save the current
GS base value independent of the interrupted context, retrieve and write
the kernel GS base and unconditionally restore the saved value on exit.
The restore happens either in paranoid exit or in the special exit path
of the NMI low level code.

All other entry code paths which use unconditional SWAPGS are not
affected as they do not depend on the actual content.

The new logic for paranoid entry, when FSGSBASE is enabled, removes
SWAPGS and replaces it with an unconditional WRGSBASE. Hence no fences
are needed.

Suggested-by: H. Peter Anvin
Suggested-by: Andy Lutomirski
Suggested-by: Thomas Gleixner
Signed-off-by: Chang S. Bae
Reviewed-by: Tony Luck
Acked-by: Tom Lendacky
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Dave Hansen
Cc: Tony Luck
Cc: Andi Kleen
Cc: Tom Lendacky
Cc: Vegard Nossum
Signed-off-by: Sasha Levin
---
 arch/x86/entry/calling.h  |  6 +++
 arch/x86/entry/entry_64.S | 78 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 0eb134e18b7a9..5f3a8ecaddc2d 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -340,6 +340,12 @@ For 32-bit we have the following conventions - kernel is built with
 #endif
 .endm
 
+.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req
+	rdgsbase \save_reg
+	GET_PERCPU_BASE \scratch_reg
+	wrgsbase \scratch_reg
+.endm
+
 #endif /* CONFIG_X86_64 */
 
 .macro STACKLEAK_ERASE
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 7f27626f8426f..a4fd01c8f2970 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "calling.h"
@@ -1211,9 +1212,14 @@ idtentry machine_check do_mce has_error_code=0 paranoid=1
 #endif
 
 /*
- * Save all registers in pt_regs, and switch gs if needed.
- * Use slow, but surefire "are we in kernel?" check.
- * Return: ebx=0: need swapgs on exit, ebx=1: otherwise
+ * Save all registers in pt_regs. Return GS base related information
+ * in EBX depending on the availability of the FSGSBASE instructions:
+ *
+ * FSGSBASE    R/EBX
+ *     N        0 -> SWAPGS on exit
+ *              1 -> no SWAPGS on exit
+ *
+ *     Y        GS base value at entry, must be restored in paranoid_exit
  */
 SYM_CODE_START_LOCAL(paranoid_entry)
 	UNWIND_HINT_FUNC
@@ -1238,7 +1244,29 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	 */
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
 
-	/* EBX = 1 -> kernel GSBASE active, no restore required */
+	/*
+	 * Handling GS base depends on the availability of FSGSBASE.
+	 *
+	 * Without FSGSBASE the kernel enforces that negative GS base
+	 * values indicate kernel GS base. With FSGSBASE no assumptions
+	 * can be made about the GS base value when entering from user
+	 * space.
+	 */
+	ALTERNATIVE "jmp .Lparanoid_entry_checkgs", "", X86_FEATURE_FSGSBASE
+
+	/*
+	 * Read the current GS base and store it in %rbx unconditionally,
+	 * retrieve and set the current CPU's kernel GS base. The stored value
+	 * has to be restored in paranoid_exit unconditionally.
+	 *
+	 * This unconditional write of GS base ensures no subsequent load
+	 * based on a mispredicted GS base.
+	 */
+	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
+	ret
+
+.Lparanoid_entry_checkgs:
+	/* EBX = 1 -> kernel GS base active, no restore required */
 	movl	$1, %ebx
 	/*
 	 * The kernel-enforced convention is a negative GS base indicates
@@ -1265,10 +1293,17 @@ SYM_CODE_END(paranoid_entry)
  *
  * We may be returning to very strange contexts (e.g. very early
  * in syscall entry), so checking for preemption here would
- * be complicated.  Fortunately, we there's no good reason
- * to try to handle preemption here.
+ * be complicated.  Fortunately, there's no good reason to try
+ * to handle preemption here.
+ *
+ * R/EBX contains the GS base related information depending on the
+ * availability of the FSGSBASE instructions:
+ *
+ * FSGSBASE    R/EBX
+ *     N        0 -> SWAPGS on exit
+ *              1 -> no SWAPGS on exit
  *
- * On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
+ *     Y        User space GS base, must be restored unconditionally
  */
 SYM_CODE_START_LOCAL(paranoid_exit)
 	UNWIND_HINT_REGS
@@ -1285,7 +1320,15 @@ SYM_CODE_START_LOCAL(paranoid_exit)
 	TRACE_IRQS_OFF_DEBUG
 	RESTORE_CR3	scratch_reg=%rax save_reg=%r14
 
-	/* If EBX is 0, SWAPGS is required */
+	/* Handle the three GS base cases */
+	ALTERNATIVE "jmp .Lparanoid_exit_checkgs", "", X86_FEATURE_FSGSBASE
+
+	/* With FSGSBASE enabled, unconditionally restore GS base */
+	wrgsbase	%rbx
+	jmp	restore_regs_and_return_to_kernel
+
+.Lparanoid_exit_checkgs:
+	/* On non-FSGSBASE systems, conditionally do SWAPGS */
 	testl	%ebx, %ebx
 	jnz	restore_regs_and_return_to_kernel
 
@@ -1699,10 +1742,27 @@ end_repeat_nmi:
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
 
-	testl	%ebx, %ebx			/* swapgs needed? */
+	/*
+	 * The above invocation of paranoid_entry stored the GS base
+	 * related information in R/EBX depending on the availability
+	 * of FSGSBASE.
+	 *
+	 * If FSGSBASE is enabled, restore the saved GS base value
+	 * unconditionally, otherwise take the conditional SWAPGS path.
+	 */
+	ALTERNATIVE "jmp nmi_no_fsgsbase", "", X86_FEATURE_FSGSBASE
+
+	wrgsbase	%rbx
+	jmp	nmi_restore
+
+nmi_no_fsgsbase:
+	/* EBX == 0 -> invoke SWAPGS */
+	testl	%ebx, %ebx
 	jnz	nmi_restore
+
 nmi_swapgs:
 	SWAPGS_UNSAFE_STACK
+
 nmi_restore:
 	POP_REGS
 
-- 
2.20.1
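
For readers who want to see why the sign check stops being reliable, the
changelog's argument can be modelled in user space. The following is only a
rough sketch in C and not kernel code: struct cpu_model, every function name
and both addresses are hypothetical and chosen for illustration, SWAPGS is
modelled as a plain swap of two fields, and the speculation aspects are left
out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU's GS state; none of this is kernel API. */
struct cpu_model {
	uint64_t gs_base;        /* active GS base, what RDGSBASE would read */
	uint64_t kernel_gs_base; /* inactive value that SWAPGS exchanges     */
	uint64_t percpu_base;    /* this CPU's real kernel GS base           */
};

/* SWAPGS modelled as an exchange of the active and inactive values. */
void swapgs(struct cpu_model *c)
{
	uint64_t tmp = c->gs_base;

	c->gs_base = c->kernel_gs_base;
	c->kernel_gs_base = tmp;
}

/*
 * Non-FSGSBASE convention: a negative GS base is taken to mean "kernel GS
 * base already active". Returns the EBX flag: 1 -> no SWAPGS on exit,
 * 0 -> SWAPGS on exit.
 */
uint32_t legacy_entry(struct cpu_model *c)
{
	if ((int64_t)c->gs_base < 0)
		return 1;
	swapgs(c);
	return 0;
}

void legacy_exit(struct cpu_model *c, uint32_t ebx)
{
	if (ebx == 0)
		swapgs(c);
}

/* FSGSBASE convention: save unconditionally, write the kernel base. */
uint64_t fsgsbase_entry(struct cpu_model *c)
{
	uint64_t saved = c->gs_base;   /* rdgsbase \save_reg     */

	c->gs_base = c->percpu_base;   /* wrgsbase \scratch_reg  */
	return saved;
}

void fsgsbase_exit(struct cpu_model *c, uint64_t rbx)
{
	c->gs_base = rbx;              /* wrgsbase %rbx */
}

/* Walk through the case the changelog warns about. */
void check_model(void)
{
	const uint64_t percpu = 0xffff888000010000ULL;   /* made-up value */
	const uint64_t user_neg = 0xffff800000001000ULL; /* made-up value */
	struct cpu_model c = { user_neg, percpu, percpu };

	/* Sign check misreads the negative user value as "kernel active". */
	assert(legacy_entry(&c) == 1);
	assert(c.gs_base != c.percpu_base);

	/* Unconditional save/set/restore handles the same input correctly. */
	c = (struct cpu_model){ user_neg, percpu, percpu };
	uint64_t rbx = fsgsbase_entry(&c);
	assert(c.gs_base == c.percpu_base);
	fsgsbase_exit(&c, rbx);
	assert(c.gs_base == user_neg);
}
```

Calling check_model() from a test harness exercises both conventions with
the same negative user value. In this model the legacy path leaves the
"kernel" running on the user-supplied base, while the FSGSBASE path never
inspects the value and so cannot be fooled by it.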