Subject: Re: [PATCH 12/31] x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit points
To: Joerg Roedel, Thomas Gleixner, Ingo Molnar, "H . Peter Anvin"
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Linus Torvalds, Andy Lutomirski, Dave Hansen, Josh Poimboeuf,
    Juergen Gross, Peter Zijlstra, Borislav Petkov, Jiri Kosina,
    Boris Ostrovsky, Brian Gerst, David Laight, Denys Vlasenko,
    Eduardo Valentin, Greg KH, Will Deacon, aliguori@amazon.com,
    daniel.gruss@iaik.tugraz.at, hughd@google.com, keescook@google.com,
    Andrea Arcangeli, Waiman Long, Pavel Machek, jroedel@suse.de
References: <1518168340-9392-1-git-send-email-joro@8bytes.org>
    <1518168340-9392-13-git-send-email-joro@8bytes.org>
From: Waiman Long
Organization: Red Hat
Date: Tue, 27 Feb 2018 14:18:36 -0500
In-Reply-To: <1518168340-9392-13-git-send-email-joro@8bytes.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/09/2018 04:25 AM, Joerg Roedel wrote:
> From: Joerg Roedel
>
> Add unconditional cr3 switches between user and kernel cr3
> to all non-NMI entry and exit points.
>
> Signed-off-by: Joerg Roedel
> ---
>  arch/x86/entry/entry_32.S | 59 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 9693485..b5ef003 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -328,6 +328,25 @@
>  #endif /* CONFIG_X86_ESPFIX32 */
>  .endm
>
> +/* Unconditionally switch to user cr3 */
> +.macro SWITCH_TO_USER_CR3 scratch_reg:req
> +	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> +
> +	movl %cr3, \scratch_reg
> +	orl $PTI_SWITCH_MASK, \scratch_reg
> +	movl \scratch_reg, %cr3
> +.Lend_\@:
> +.endm
> +
> +/* Unconditionally switch to kernel cr3 */
> +.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
> +	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> +	movl %cr3, \scratch_reg
> +	andl $(~PTI_SWITCH_MASK), \scratch_reg
> +	movl \scratch_reg, %cr3
> +.Lend_\@:
> +.endm
> +
>
>  /*
>   * Called with pt_regs fully populated and kernel segments loaded,
> @@ -343,6 +362,8 @@
>
>  	ALTERNATIVE "", "jmp .Lend_\@", X86_FEATURE_XENPV
>
> +	SWITCH_TO_KERNEL_CR3 scratch_reg=%eax
> +
>  	/* Are we on the entry stack? Bail out if not! */
>  	movl PER_CPU_VAR(cpu_entry_area), %edi
>  	addl $CPU_ENTRY_AREA_entry_stack, %edi
> @@ -637,6 +658,18 @@ ENTRY(xen_sysenter_target)
>   * 0(%ebp) arg6
>   */
>  ENTRY(entry_SYSENTER_32)
> +	/*
> +	 * On entry-stack with all userspace-regs live - save and
> +	 * restore eflags and %eax to use it as scratch-reg for the cr3
> +	 * switch.
> +	 */
> +	pushfl
> +	pushl %eax
> +	SWITCH_TO_KERNEL_CR3 scratch_reg=%eax
> +	popl %eax
> +	popfl
> +
> +	/* Stack empty again, switch to task stack */
>  	movl TSS_entry_stack(%esp), %esp
>
>  .Lsysenter_past_esp:
> @@ -691,6 +724,10 @@ ENTRY(entry_SYSENTER_32)
>  	movl PT_OLDESP(%esp), %ecx /* pt_regs->sp */
>  1:	mov PT_FS(%esp), %fs
>  	PTGS_TO_GS
> +
> +	/* Segments are restored - switch to user cr3 */
> +	SWITCH_TO_USER_CR3 scratch_reg=%eax
> +
>  	popl %ebx /* pt_regs->bx */
>  	addl $2*4, %esp /* skip pt_regs->cx and pt_regs->dx */
>  	popl %esi /* pt_regs->si */
> @@ -778,7 +815,23 @@ restore_all:
>  .Lrestore_all_notrace:
>  	CHECK_AND_APPLY_ESPFIX
>  .Lrestore_nocheck:
> -	RESTORE_REGS 4 # skip orig_eax/error_code
> +	/*
> +	 * First restore user segments. This can cause exceptions, so we
> +	 * run it with kernel cr3.
> +	 */
> +	RESTORE_SEGMENTS
> +
> +	/*
> +	 * Segments are restored - no more exceptions from here on except on
> +	 * iret, but that handled safely.
> +	 */
> +	SWITCH_TO_USER_CR3 scratch_reg=%eax
> +
> +	/* Restore rest */
> +	RESTORE_INT_REGS
> +
> +	/* Unwind stack to the iret frame */
> +	RESTORE_SKIP_SEGMENTS 4 # skip orig_eax/error_code
>  .Lirq_return:
>  	INTERRUPT_RETURN
>
> @@ -1139,6 +1192,10 @@ ENTRY(debug)
>
>  	SAVE_ALL
>  	ENCODE_FRAME_POINTER
> +
> +	/* Make sure we are running on kernel cr3 */
> +	SWITCH_TO_KERNEL_CR3 scratch_reg=%eax
> +
>  	xorl %edx, %edx # error code 0
>  	movl %esp, %eax # pt_regs pointer
>

The debug exception calls ret_from_exception on exit. If it came from
userspace, the C function prepare_exit_to_usermode() will be called.
With the PTI-32 code, that means the function will be called on the
entry stack instead of the task stack. This can be problematic, as
macros like current won't work anymore.

-Longman
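
For reference, the bit manipulation performed by the quoted
SWITCH_TO_USER_CR3 and SWITCH_TO_KERNEL_CR3 macros can be modeled in
plain C. This is only a sketch: the value given to PTI_SWITCH_MASK
below is an assumption for illustration (the quoted patch does not show
its definition), and the function names are hypothetical, not kernel
APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: illustrative value only. The quoted patch uses
 * PTI_SWITCH_MASK but does not show how it is defined.
 */
#define PTI_SWITCH_MASK (UINT32_C(1) << 12)

/* Models "orl $PTI_SWITCH_MASK, \scratch_reg" from SWITCH_TO_USER_CR3. */
static inline uint32_t switch_to_user_cr3(uint32_t cr3)
{
	return cr3 | PTI_SWITCH_MASK;
}

/* Models "andl $(~PTI_SWITCH_MASK), \scratch_reg" from SWITCH_TO_KERNEL_CR3. */
static inline uint32_t switch_to_kernel_cr3(uint32_t cr3)
{
	return cr3 & ~PTI_SWITCH_MASK;
}

/*
 * Hypothetical self-check: switching to user and back yields the kernel
 * value, and both operations are idempotent.
 */
static inline void check_cr3_roundtrip(uint32_t kernel_cr3)
{
	uint32_t user_cr3 = switch_to_user_cr3(kernel_cr3);

	assert(switch_to_kernel_cr3(user_cr3) == switch_to_kernel_cr3(kernel_cr3));
	assert(switch_to_user_cr3(user_cr3) == user_cr3);
}
```

Because both operations are idempotent, applying the kernel-side switch
when already running on the kernel cr3 leaves the value unchanged, which
is consistent with the macros being described as unconditional in the
patch.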