Subject: Re: Getting rid of invalid SYSCALL RSP under Xen?
To: Andy Lutomirski, X86 ML, Boris Ostrovsky,
 "linux-kernel@vger.kernel.org", Borislav Petkov, Steven Rostedt,
 "xen-devel@lists.xen.org"
From: Andrew Cooper
Message-ID: <55B53636.80304@citrix.com>
Date: Sun, 26 Jul 2015 20:34:14 +0100

On 23/07/2015 17:49, Andy Lutomirski wrote:
> Hi-

Hi.  Apologies for the delay.  I have been out of the office for a few days.

>
> In entry_64.S, we have:
>
> ENTRY(entry_SYSCALL_64)
>     /*
>      * Interrupts are off on entry.
>      * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
>      * it is too small to ever cause noticeable irq latency.
>      */
>     SWAPGS_UNSAFE_STACK
>     /*
>      * A hypervisor implementation might want to use a label
>      * after the swapgs, so that it can do the swapgs
>      * for the guest and jump here on syscall.
>      */
> GLOBAL(entry_SYSCALL_64_after_swapgs)
>
>     movq    %rsp, PER_CPU_VAR(rsp_scratch)
>     movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> It would be really, really nice if Xen entered the SYSCALL path
> *after* the mov to %rsp.
>
> Similarly, we have:
>
>     movq    RSP(%rsp), %rsp
>     /* big comment */
>     USERGS_SYSRET64
>
> It would be really nice if we could just mov to %rsp, swapgs, and
> sysret, without worrying that the sysret is actually a jump on Xen.
>
> I suspect that making Xen stop using these code paths would actually
> be a simplification.  On SYSCALL entry, Xen lands in
> xen_syscall_target (AFAICT) and clearly has rsp pointing somewhere
> valid.  Xen obligingly shoves the user RSP into the hardware RSP
> register and jumps into the entry code.
>
> Is that stuff running on the normal kernel stack?

Yes.  The Xen ABI takes what is essentially tss->esp0 and uses that
stack for all "switch to kernel" actions, including syscall and sysenter.

> If so, can we just
> enter later on:
>
>     pushq    %r11            /* pt_regs->flags */
>     pushq    $__USER_CS      /* pt_regs->cs */
>     pushq    %rcx            /* pt_regs->ip */
>
> <-- Xen enters here
>
>     pushq    %rax            /* pt_regs->orig_ax */
>     pushq    %rdi            /* pt_regs->di */
>     pushq    %rsi            /* pt_regs->si */
>     pushq    %rdx            /* pt_regs->dx */

This looks plausible, and indeed preferable to the current double step
with undo_xen_syscall.

One slight complication is that xen_enable_syscall() will want to
special-case register_callback() to not set CALLBACKF_mask_events, as
the entry point is now after re-enabling interrupts.

>
> For SYSRET, I think the way to go is to force Xen to always use the
> syscall slow path.  Instead, Xen could hook into
> syscall_return_via_sysret or even right before the opportunistic
> sysret stuff.  Then we could remove the USERGS_SYSRET hooks entirely.
>
> Would this work?

None of the opportunistic sysret stuff makes sense under Xen.  The path
will inevitably end up in xen_iret making a hypercall.  Short-circuiting
all of this seems like a good idea, especially if it allows for the
removal of the USERGS_SYSRET hooks.
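
To make the register_callback() point above concrete, here is a rough
sketch.  It is a standalone userspace model, not kernel code and not a
proposed patch.  The constant values and the struct layout are written
from memory of Xen's public callback.h and should be treated as
assumptions.  stub_callback_op() stands in for the real
HYPERVISOR_callback_op() hypercall, and register_callback_flags() is a
hypothetical helper that does not exist in the tree.

```c
#include <stdint.h>

/* Assumed values, modelled on Xen's public callback.h; verify before use. */
#define CALLBACKTYPE_syscall    2
#define CALLBACKTYPE_sysenter   5
#define CALLBACKF_mask_events   (1U << 0)

/* Simplified stand-in for Xen's struct callback_register. */
struct callback_register {
    uint16_t  type;
    uint16_t  flags;
    uintptr_t address;
};

/* Records the most recent registration so the effect can be inspected. */
static struct callback_register last_registered;

/* Stub standing in for HYPERVISOR_callback_op(CALLBACKOP_register, ...). */
static int stub_callback_op(const struct callback_register *cb)
{
    last_registered = *cb;
    return 0;
}

/* Hypothetical helper: like register_callback(), but the caller picks flags. */
static int register_callback_flags(unsigned type, uintptr_t func, uint16_t flags)
{
    struct callback_register cb = {
        .type    = (uint16_t)type,
        .flags   = flags,
        .address = func,
    };

    return stub_callback_op(&cb);
}

/* Existing behaviour: every callback is entered with events masked. */
static int register_callback(unsigned type, uintptr_t func)
{
    return register_callback_flags(type, func, CALLBACKF_mask_events);
}

/*
 * Special case for the later SYSCALL entry point: it sits after the
 * place where interrupts are re-enabled, so events are not masked.
 */
static int register_syscall_callback(uintptr_t func)
{
    return register_callback_flags(CALLBACKTYPE_syscall, func, 0);
}
```

The only behavioural difference is the flags word handed to Xen for the
syscall callback.  Every other callback keeps CALLBACKF_mask_events.
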

~Andrew