From: Andy Lutomirski
Date: Thu, 23 Jul 2015 09:49:23 -0700
Subject: Getting rid of invalid SYSCALL RSP under Xen?
To: X86 ML, Boris Ostrovsky, Andrew Cooper, "linux-kernel@vger.kernel.org", Borislav Petkov, Steven Rostedt
X-Mailing-List: linux-kernel@vger.kernel.org

Hi-

In entry_64.S, we have:

ENTRY(entry_SYSCALL_64)
    /*
     * Interrupts are off on entry.
     * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
     * it is too small to ever cause noticeable irq latency.
     */
    SWAPGS_UNSAFE_STACK
    /*
     * A hypervisor implementation might want to use a label
     * after the swapgs, so that it can do the swapgs
     * for the guest and jump here on syscall.
     */
GLOBAL(entry_SYSCALL_64_after_swapgs)

    movq    %rsp, PER_CPU_VAR(rsp_scratch)
    movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp

It would be really, really nice if Xen entered the SYSCALL path
*after* the mov to %rsp.

Similarly, we have:

    movq    RSP(%rsp), %rsp
    /* big comment */
    USERGS_SYSRET64

It would be really nice if we could just mov to %rsp, swapgs, and
sysret, without worrying that the sysret is actually a jump on Xen.

I suspect that making Xen stop using these code paths would actually
be a simplification.  On SYSCALL entry, Xen lands in
xen_syscall_target (AFAICT) and clearly has rsp pointing somewhere
valid.  Xen obligingly shoves the user RSP into the hardware RSP
register and jumps into the entry code.  Is that stuff running on the
normal kernel stack?
If so, can we just enter later on:

    pushq    %r11                /* pt_regs->flags */
    pushq    $__USER_CS          /* pt_regs->cs */
    pushq    %rcx                /* pt_regs->ip */

<-- Xen enters here

    pushq    %rax                /* pt_regs->orig_ax */
    pushq    %rdi                /* pt_regs->di */
    pushq    %rsi                /* pt_regs->si */
    pushq    %rdx                /* pt_regs->dx */

For SYSRET, I think the way to go is to force Xen to always use the
syscall slow path.  Instead, Xen could hook into
syscall_return_via_sysret or even right before the opportunistic
sysret stuff.  Then we could remove the USERGS_SYSRET hooks entirely.

Would this work?

--Andy