Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752339AbdGZOdK (ORCPT ); Wed, 26 Jul 2017 10:33:10 -0400 Received: from smtp.eu.citrix.com ([185.25.65.24]:4882 "EHLO SMTP.EU.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752279AbdGZOdI (ORCPT ); Wed, 26 Jul 2017 10:33:08 -0400 X-IronPort-AV: E=Sophos;i="5.40,415,1496102400"; d="scan'208";a="49979835" Subject: Re: [Xen-devel] [PATCH v1] xen: get rid of paravirt op adjust_exception_frame To: Andy Lutomirski References: <20170724142853.26448-1-jgross@suse.com> <45d0e5c7-a946-d7be-20e5-6965cb9f6629@citrix.com> CC: Juergen Gross , X86 ML , "linux-kernel@vger.kernel.org" , Ingo Molnar , "H. Peter Anvin" , "xen-devel@lists.xenproject.org" , Boris Ostrovsky , Thomas Gleixner From: Andrew Cooper Message-ID: <53f936d4-c6e0-3e9b-4291-8f16cbb87779@citrix.com> Date: Wed, 26 Jul 2017 15:31:27 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-ClientProxiedBy: AMSPEX02CAS01.citrite.net (10.69.22.112) To AMSPEX02CL02.citrite.net (10.69.22.126) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3160 Lines: 85 On 26/07/17 15:09, Andy Lutomirski wrote: > On Wed, Jul 26, 2017 at 7:01 AM, Andrew Cooper > wrote: >> On 26/07/17 14:48, Andy Lutomirski wrote: >>>> /* Runs on exception stack */ >>>> -ENTRY(nmi) >>>> - /* >>>> - * Fix up the exception frame if we're on Xen. >>>> - * PARAVIRT_ADJUST_EXCEPTION_FRAME is guaranteed to push at most >>>> - * one value to the stack on native, so it may clobber the rdx >>>> - * scratch slot, but it won't clobber any of the important >>>> - * slots past it. >>>> - * >>>> - * Xen is a different story, because the Xen frame itself overlaps >>>> - * the "NMI executing" variable. >>>> - */ >>> I would keep this comment. The Xen frame really is in the way AFAICT. >> (For reasons best explained by the original authors) there is only ever >> a single stack which a PV guest registers with Xen, which functions >> equivalently to tss.sp0. There is no support for stack switching via >> task switch or IST. >> >> Therefore, nested NMIs won't clobber the top of this stack. >> > Does that mean that nested NMIs on Xen just nest normally without > clobbering each other? Yes. > If so, that's neat, although I wonder how we > don't get crashes due to this: > > /* Normal 64-bit system call target */ > ENTRY(xen_syscall_target) > undo_xen_syscall > <-- NMI right here > jmp entry_SYSCALL_64_after_swapgs > ENDPROC(xen_syscall_target) I'm going to go out on a limb here and say "because no has hit that condition yet" (or at least managed to diagnose such a crash). PV domU's don't get given NMIs. PV dom0 might, depending on how Xen is handling the NMI itself. On XenServer at least, dom0 never gets handed an NMI. I expect is a sufficiently rarely used path that noone has noticed if it is indeed broken. ~Andrew > > I think it would be nice if Xen could instead enter the native syscall > path a bit later like this: > > ENTRY(entry_SYSCALL_64) > /* > * Interrupts are off on entry. > * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON, > * it is too small to ever cause noticeable irq latency. > */ > SWAPGS_UNSAFE_STACK > /* > * A hypervisor implementation might want to use a label > * after the swapgs, so that it can do the swapgs > * for the guest and jump here on syscall. > */ > GLOBAL(entry_SYSCALL_64_after_swapgs) > > movq %rsp, PER_CPU_VAR(rsp_scratch) > movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp > > TRACE_IRQS_OFF > > /* Construct struct pt_regs on stack */ > pushq $__USER_DS /* pt_regs->ss */ > pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */ > pushq %r11 /* pt_regs->flags */ > pushq $__USER_CS /* pt_regs->cs */ > pushq %rcx /* pt_regs->ip */ > > <-- Xen enters here > > then we wouldn't have all this funny business. And Xen could > completely skip the nmi nesting code.