Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755278AbbGZX2N (ORCPT ); Sun, 26 Jul 2015 19:28:13 -0400 Received: from mail-la0-f53.google.com ([209.85.215.53]:33633 "EHLO mail-la0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755033AbbGZX2L (ORCPT ); Sun, 26 Jul 2015 19:28:11 -0400 MIME-Version: 1.0 In-Reply-To: <55B567C1.3050709@citrix.com> References: <55B53636.80304@citrix.com> <55B567C1.3050709@citrix.com> From: Andy Lutomirski Date: Sun, 26 Jul 2015 16:27:50 -0700 Message-ID: Subject: Re: Getting rid of invalid SYSCALL RSP under Xen? To: Andrew Cooper Cc: X86 ML , Boris Ostrovsky , "linux-kernel@vger.kernel.org" , Borislav Petkov , Steven Rostedt , "xen-devel@lists.xen.org" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4003 Lines: 114 On Sun, Jul 26, 2015 at 4:05 PM, Andrew Cooper wrote: > On 26/07/2015 23:08, Andy Lutomirski wrote: >> >>>> If so, can we just >>>> enter later on: >>>> >>>> pushq %r11 /* pt_regs->flags */ >>>> pushq $__USER_CS /* pt_regs->cs */ >>>> pushq %rcx /* pt_regs->ip */ >>>> >>>> <-- Xen enters here >>>> >>>> pushq %rax /* pt_regs->orig_ax */ >>>> pushq %rdi /* pt_regs->di */ >>>> pushq %rsi /* pt_regs->si */ >>>> pushq %rdx /* pt_regs->dx */ >>> This looks plausible, and indeed preferable to the current doublestep >>> with undo_xen_syscall. >>> >>> One slight complication is that xen_enable_syscall() will want to >>> special case register_callback() to not set CALLBACKF_mask_events, as >>> the entry point is now after re-enabling interrupts. >> I wouldn't do that. Let's just move the ENABLE_INTERRUPTS a few >> instructions later even on native -- I want to do that anyway. > > That would also work. > >> >>>> For SYSRET, I think the way to go is to force Xen to always use the >>>> syscall slow path. Instead, Xen could hook into >>>> syscall_return_via_sysret or even right before the opportunistic >>>> sysret stuff. Then we could remove the USERGS_SYSRET hooks entirely. >>>> >>>> Would this work? >>> None of the opportunistic sysret stuff makes sense under Xen. The path >>> will inevitably end up in xen_iret making a hypercall. Short circuiting >>> all of this seems like a good idea, especially if it allows for the >>> removal of the UERGS_SYSRET. >> Doesn't Xen decide what to do based on VGCF_IN_SYSCALL? Maybe Xen >> should have its own opportunistic VGCF_IN_SYSCALL logic. > > VGCF_in_syscall affects whether the extra r11/rcx get restored or not, > as the hypercall itself is implemented using syscall. As the extra > r11/rcx (and rax for that matter) are unconditionally saved in the > hypercall stub, I can't see anything Linux could usefully do, > opportunistically speaking. Xen does: /* %rbx: struct vcpu, interrupts disabled */ restore_all_guest: ASSERT_INTERRUPTS_DISABLED RESTORE_ALL testw $TRAP_syscall,4(%rsp) jz iret_exit_to_guest /* Don't use SYSRET path if the return address is not canonical. */ movq 8(%rsp),%rcx sarq $47,%rcx incl %ecx cmpl $1,%ecx ja .Lforce_iret cmpw $FLAT_USER_CS32,16(%rsp)# CS movq 8(%rsp),%rcx # RIP movq 24(%rsp),%r11 # RFLAGS movq 32(%rsp),%rsp # RSP je 1f sysretq 1: sysretl That's essentially the same thing as opportunistic sysret. If Linux stops setting VGCF_in_syscall, though, I think we'll bypass that code, which will hurt performance. Whether this should be fixed in the hypervisor or in the guest kernel hooks, I don't know, but it would be easy to have a very simple xen_opportunistic_sysret path that checks rcx==rip and r11==rflags and, if so, sets VGCF_in_syscall. > >> >> Hmm, maybe some of this would be easier to think about if, rather than >> having a paravirt op, we could have: >> >> ALTERNATIVE "", "jmp xen_pop_things_and_iret", X86_FEATURE_XEN >> >> Or just IF_XEN("jmp ..."); >> >> As a practical matter, x86_64 has native and Xen -- I don't think >> there's any other paravirt platform that needs the asm hooks. > > It would certainly seem so. A careful use of IF_XEN() or two would make > the code far clearer to read, and to drop the hooks. > Want to add an IF_XEN macro? I'm about to send patches for the SYSCALL bit. --Andy > ~Andrew > -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/