Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751678AbdGZOJ1 (ORCPT ); Wed, 26 Jul 2017 10:09:27 -0400 Received: from mail-vk0-f53.google.com ([209.85.213.53]:37733 "EHLO mail-vk0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750783AbdGZOJZ (ORCPT ); Wed, 26 Jul 2017 10:09:25 -0400 MIME-Version: 1.0 In-Reply-To: <45d0e5c7-a946-d7be-20e5-6965cb9f6629@citrix.com> References: <20170724142853.26448-1-jgross@suse.com> <45d0e5c7-a946-d7be-20e5-6965cb9f6629@citrix.com> From: Andy Lutomirski Date: Wed, 26 Jul 2017 07:09:03 -0700 Message-ID: Subject: Re: [Xen-devel] [PATCH v1] xen: get rid of paravirt op adjust_exception_frame To: Andrew Cooper Cc: Juergen Gross , X86 ML , "linux-kernel@vger.kernel.org" , Ingo Molnar , "H. Peter Anvin" , "xen-devel@lists.xenproject.org" , Boris Ostrovsky , Thomas Gleixner Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2624 Lines: 70 On Wed, Jul 26, 2017 at 7:01 AM, Andrew Cooper wrote: > On 26/07/17 14:48, Andy Lutomirski wrote: >> >>> /* Runs on exception stack */ >>> -ENTRY(nmi) >>> - /* >>> - * Fix up the exception frame if we're on Xen. >>> - * PARAVIRT_ADJUST_EXCEPTION_FRAME is guaranteed to push at most >>> - * one value to the stack on native, so it may clobber the rdx >>> - * scratch slot, but it won't clobber any of the important >>> - * slots past it. >>> - * >>> - * Xen is a different story, because the Xen frame itself overlaps >>> - * the "NMI executing" variable. >>> - */ >> I would keep this comment. The Xen frame really is in the way AFAICT. > > (For reasons best explained by the original authors) there is only ever > a single stack which a PV guest registers with Xen, which functions > equivalently to tss.sp0. There is no support for stack switching via > task switch or IST. > > Therefore, nested NMIs won't clobber the top of this stack. > Does that mean that nested NMIs on Xen just nest normally without clobbering each other? If so, that's neat, although I wonder how we don't get crashes due to this: /* Normal 64-bit system call target */ ENTRY(xen_syscall_target) undo_xen_syscall <-- NMI right here jmp entry_SYSCALL_64_after_swapgs ENDPROC(xen_syscall_target) I think it would be nice if Xen could instead enter the native syscall path a bit later like this: ENTRY(entry_SYSCALL_64) /* * Interrupts are off on entry. * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON, * it is too small to ever cause noticeable irq latency. */ SWAPGS_UNSAFE_STACK /* * A hypervisor implementation might want to use a label * after the swapgs, so that it can do the swapgs * for the guest and jump here on syscall. */ GLOBAL(entry_SYSCALL_64_after_swapgs) movq %rsp, PER_CPU_VAR(rsp_scratch) movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp TRACE_IRQS_OFF /* Construct struct pt_regs on stack */ pushq $__USER_DS /* pt_regs->ss */ pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */ pushq %r11 /* pt_regs->flags */ pushq $__USER_CS /* pt_regs->cs */ pushq %rcx /* pt_regs->ip */ <-- Xen enters here then we wouldn't have all this funny business. And Xen could completely skip the nmi nesting code.