Subject: Re: [Xen-devel] [v3 04/12] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions
From: Andrew Cooper
To: Andy Lutomirski
CC: Juergen Gross, "Bae, Chang Seok", Boris Ostrovsky, xen-devel, "Ravi V. Shankar", Andi Kleen, Dave Hansen, LKML, "Metzger, Markus T", "H. Peter Anvin", Thomas Gleixner, Ingo Molnar
Date: Fri, 26 Oct 2018 00:14:48 +0100
References: <20181023184234.14025-1-chang.seok.bae@intel.com> <20181023184234.14025-5-chang.seok.bae@intel.com> <0d64fe9d-0cc3-5901-0d6f-4bcb94aa9ee4@citrix.com> <9b0a8b86-6949-837e-8a20-a5e934ed2b63@suse.com> <9b8b66e4-6a47-7320-be00-c75fed725878@citrix.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 26/10/2018 00:11, Andy Lutomirski wrote:
> On Thu, Oct 25, 2018 at 4:09 PM Andrew Cooper wrote:
>> On 25/10/2018 07:09, Juergen Gross wrote:
>>> On 24/10/2018 21:41, Andrew Cooper wrote:
>>>> On 24/10/18 20:16, Andy Lutomirski wrote:
>>>>> On Tue, Oct 23, 2018 at 11:43 AM Chang S. Bae wrote:
>>>>>> The helper functions will switch on faster accesses to FSBASE and GSBASE
>>>>>> when the FSGSBASE feature is enabled.
>>>>>>
>>>>>> Accessing user GSBASE needs a couple of SWAPGS operations. It is avoidable
>>>>>> if the user GSBASE is saved at kernel entry, being updated as changes, and
>>>>>> restored back at kernel exit. However, it seems to spend more cycles for
>>>>>> savings and restorations. Little or no benefit was measured from
>>>>>> experiments.
>>>>>>
>>>>>> Signed-off-by: Chang S. Bae
>>>>>> Reviewed-by: Andi Kleen
>>>>>> Cc: Any Lutomirski
>>>>>> Cc: H. Peter Anvin
>>>>>> Cc: Thomas Gleixner
>>>>>> Cc: Ingo Molnar
>>>>>> Cc: Dave Hansen
>>>>>> ---
>>>>>>  arch/x86/include/asm/fsgsbase.h | 17 +++----
>>>>>>  arch/x86/kernel/process_64.c    | 82 +++++++++++++++++++++++++++------
>>>>>>  2 files changed, 75 insertions(+), 24 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
>>>>>> index b4d4509b786c..e500d771155f 100644
>>>>>> --- a/arch/x86/include/asm/fsgsbase.h
>>>>>> +++ b/arch/x86/include/asm/fsgsbase.h
>>>>>> @@ -57,26 +57,23 @@ static __always_inline void wrgsbase(unsigned long gsbase)
>>>>>>                  : "memory");
>>>>>>  }
>>>>>>
>>>>>> +#include
>>>>>> +
>>>>>>  /* Helper functions for reading/writing FS/GS base */
>>>>>>
>>>>>>  static inline unsigned long x86_fsbase_read_cpu(void)
>>>>>>  {
>>>>>>          unsigned long fsbase;
>>>>>>
>>>>>> -        rdmsrl(MSR_FS_BASE, fsbase);
>>>>>> +        if (static_cpu_has(X86_FEATURE_FSGSBASE))
>>>>>> +                fsbase = rdfsbase();
>>>>>> +        else
>>>>>> +                rdmsrl(MSR_FS_BASE, fsbase);
>>>>>>
>>>>>>          return fsbase;
>>>>>>  }
>>>>>>
>>>>>> -static inline unsigned long x86_gsbase_read_cpu_inactive(void)
>>>>>> -{
>>>>>> -        unsigned long gsbase;
>>>>>> -
>>>>>> -        rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
>>>>>> -
>>>>>> -        return gsbase;
>>>>>> -}
>>>>>> -
>>>>>> +extern unsigned long x86_gsbase_read_cpu_inactive(void);
>>>>>>  extern void x86_fsbase_write_cpu(unsigned long fsbase);
>>>>>>  extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>>>>> index 31b4755369f0..fcf18046c3d6 100644
>>>>>> --- a/arch/x86/kernel/process_64.c
>>>>>> +++ b/arch/x86/kernel/process_64.c
>>>>>> @@ -159,6 +159,36 @@ enum which_selector {
>>>>>>          GS
>>>>>>  };
>>>>>>
>>>>>> +/*
>>>>>> + * Interrupts are disabled here. Out of line to be protected from kprobes.
>>>>>> + */
>>>>>> +static noinline __kprobes unsigned long rd_inactive_gsbase(void)
>>>>>> +{
>>>>>> +        unsigned long gsbase, flags;
>>>>>> +
>>>>>> +        local_irq_save(flags);
>>>>>> +        native_swapgs();
>>>>>> +        gsbase = rdgsbase();
>>>>>> +        native_swapgs();
>>>>>> +        local_irq_restore(flags);
>>>>>> +
>>>>>> +        return gsbase;
>>>>>> +}
>>>>> Please fold this into its only caller and make *that* noinline.
>>>>>
>>>>> Also, this function, and its "write" equivalent, will access the
>>>>> *active* gsbase. So it either needs to be fixed for Xen PV or some
>>>>> clear comment and careful auditing needs to be added to ensure that
>>>>> it's not used on Xen PV. Or it needs to be renamed
>>>>> native_x86_fsgsbase_... and add paravirt hooks, since Xen PV allows a
>>>>> very efficient but different implementation, I think. The latter is
>>>>> probably the right solution.
>>>>>
>>>>> (Hi Xen people -- how does CR4.FSGSBASE work on Xen? Is it always
>>>>> set? Never set? Set only if the guest tries to set it?)
>>>> FML. Seriously - whoever put this code into the hypervisor in the past
>>>> did an atrocious job. After some experimentation, you're going to be
>>>> sad and I'm declaring this borderline unusable.
>>>>
>>>> Looks like Xen unconditionally enabled CR4.FSGSBASE if it is available.
>>>> Therefore, PV guests can use the instructions, even if the bit is clear
>>>> in vCR4.
>>>>
>>>> The CPUID bits are exposed to guests by default, and Xen will emulate
>>>> vCR4.FSGSBASE being set and cleared.
>>>>
>>>> We don't however emulate swapgs (which is a cpl0 instruction). The
>>>> guest gets handed a #GP[0] instead.
>>>>
>>>> The Linux WRMSR PVop uses the set_segment_base() hypercall instead of
>>>> going through the full wrmsr emulation path.
>>>>
>>>> There is no equivalent get hypercall, so the only way I can see of
>>>> getting the value is to actually read MSR_KERNEL_GS_BASE and take the
>>>> full rdmsr emulation path.
>>> Or shadow the value in a percpu variable.
>> Hmm true, so long as no paths try to use native_rd{fs,gs}base() to
>> bypass the PVop.
> But *user* code can change the base. How is the kernel supposed to
> context-switch the user gsbase?

User code can change the user gs base. Xen will switch user/kernel base
as appropriate on context switch so the kernel is entered on the kernel
gs base.

But you are right - there is no way for Linux to peek at the current
user gs base without reading MSR_GS_SHADOW. (The user gs base can be
set via a hypercall, but not obtained).

~Andrew
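
To make the objection to the per-CPU shadow idea concrete, here is a toy userspace C model. It is not kernel or Xen code. Every name in it (toy_vcpu, toy_kernel_set_user_gsbase, toy_user_wrgsbase and so on) is hypothetical and exists only for illustration. It assumes, as described in the thread, that the guest kernel's write path updates both the real base and a cached copy, while a user-mode WRGSBASE changes the real base without the kernel seeing it.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one PV vCPU. All names are hypothetical, for illustration only. */
struct toy_vcpu {
    uint64_t hw_user_gsbase;  /* the real user GS base, held outside the guest kernel */
    uint64_t shadow_gsbase;   /* the guest kernel's cached (per-CPU style) copy */
};

/* Stand-in for the kernel's write path: it sets the real base and the shadow together. */
static inline void toy_kernel_set_user_gsbase(struct toy_vcpu *v, uint64_t base)
{
    v->hw_user_gsbase = base;
    v->shadow_gsbase = base;
}

/* Stand-in for a user-mode WRGSBASE: only the real base changes, the shadow is untouched. */
static inline void toy_user_wrgsbase(struct toy_vcpu *v, uint64_t base)
{
    v->hw_user_gsbase = base;
}

/* Cheap read: trusts the cached copy. */
static inline uint64_t toy_read_shadow(const struct toy_vcpu *v)
{
    return v->shadow_gsbase;
}

/* Slow read: stands in for going through an emulated MSR read. */
static inline uint64_t toy_read_emulated(const struct toy_vcpu *v)
{
    return v->hw_user_gsbase;
}

/* Returns 1 when the cached copy still matches the real base, 0 otherwise. */
static inline int toy_shadow_is_current(const struct toy_vcpu *v)
{
    return v->shadow_gsbase == v->hw_user_gsbase;
}

/* Walks through the sequence discussed in the thread. */
static inline void toy_demo(void)
{
    struct toy_vcpu v = { 0, 0 };

    toy_kernel_set_user_gsbase(&v, 0x1000);
    assert(toy_shadow_is_current(&v));        /* kernel-made change: shadow is valid */

    toy_user_wrgsbase(&v, 0x2000);
    assert(!toy_shadow_is_current(&v));       /* user-made change: shadow is stale */
    assert(toy_read_emulated(&v) == 0x2000);  /* only the slow path sees the new value */
}
```

In this model the shadow stays correct only while every change goes through the kernel's write path. Once user code can write the base directly, the cached copy cannot be trusted at context switch, which is why the slow read is the remaining option in the toy.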