Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp3748574ybk; Tue, 19 May 2020 12:01:05 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxLREvUb5d6QaJVhQPxNMgAiyzAWDCE39Yq5OZK/4Ct3YORYtMNzDmJ1+jRPZUbMJAe1yoR X-Received: by 2002:a05:6402:cb4:: with SMTP id cn20mr339502edb.150.1589914865211; Tue, 19 May 2020 12:01:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589914865; cv=none; d=google.com; s=arc-20160816; b=szewmWAyB+G/iB1M8d5SPhGozPIi06V+RpYozLxqlnDKqBj8dzA73L87N1g/Dpq9T9 tsqWlf9tLQskhASRtYAVtyKEjiEHTE+tJwqU12M8vtnoEaULq5pVIGyXkqPioxOQry8P xhpfBe+lKS6uQwSnH7XKdJ6lKBFSY3KB4lAprOT+0DX2fmcFTWaLceJ7+HftxrwmPfn9 1r4bhqcIDddjItpwtV4rsGTLZyBT5p+lMr8cDjFQUUYkY9/wgevi/05DduQZHCVkXaXd YR1biJsi21TfmBL+eJuHxjDD5i3xZSjKdy/4V6byNaaHHSLiL6fzV2yDjPT8IwPFGqLx q2gA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from; bh=57TdH2XQU4MmIchLzVfgZKKTECAQjjXbIqid7B/eaHg=; b=L8ntPHUsZL4aqYX6Awf4I2F2i851KYSNHtHAnWYTPb5EtgRTFsvPg8gQc2IsBaPy2T kuE5DsBpXUeiIqiL4WTwTKcM904Uuuc8ReeTEZoKdEXj83Q0Y9Ox+15Xpx1nbz+j6MJ4 rBsepQ37jPPscHfa3UkKQd+5tCTl4u/GDPH8OpUVsxW8JTyNVJR1EU/+9gLOJCeRw0NE Ueg0SinnFkXAyV8RK/hrCm9XousMTGEPRFs4tvGboeiLsGmORIMPMnVzC+08rGhUE0v6 oeoOL1HyQoYWV7GABuriwnJoxSuVrO5TyNPvQ02k8j8EN3gYxB9uhoqb5TGOaqXEoVp8 +u7A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l26si368218ejx.390.2020.05.19.12.00.40; Tue, 19 May 2020 12:01:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726988AbgESS64 (ORCPT + 99 others); Tue, 19 May 2020 14:58:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726407AbgESS64 (ORCPT ); Tue, 19 May 2020 14:58:56 -0400 Received: from Galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21A30C08C5C0 for ; Tue, 19 May 2020 11:58:56 -0700 (PDT) Received: from p5de0bf0b.dip0.t-ipconnect.de ([93.224.191.11] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1jb7R8-0006vJ-Jc; Tue, 19 May 2020 20:57:58 +0200 Received: by nanos.tec.linutronix.de (Postfix, from userid 1000) id 187F7100D01; Tue, 19 May 2020 20:57:58 +0200 (CEST) From: Thomas Gleixner To: Andy Lutomirski , Andrew Cooper Cc: LKML , X86 ML , "Paul E. McKenney" , Andy Lutomirski , Alexandre Chartre , Frederic Weisbecker , Paolo Bonzini , Sean Christopherson , Masami Hiramatsu , Petr Mladek , Steven Rostedt , Joel Fernandes , Boris Ostrovsky , Juergen Gross , Brian Gerst , Mathieu Desnoyers , Josh Poimboeuf , Will Deacon , Tom Lendacky , Wei Liu , Michael Kelley , Jason Chen CJ , Zhao Yakui , "Peter Zijlstra \(Intel\)" Subject: Re: [patch V6 10/37] x86/entry: Switch XEN/PV hypercall entry to IDTENTRY In-Reply-To: References: <20200515234547.710474468@linutronix.de> <20200515235125.425810667@linutronix.de> Date: Tue, 19 May 2020 20:57:58 +0200 Message-ID: <87imgr7nwp.fsf@nanos.tec.linutronix.de> MIME-Version: 1.0 Content-Type: text/plain X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Andy Lutomirski writes: > On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner wrote: >> @@ -573,6 +578,16 @@ static __always_inline void __idtentry_exit(struct pt_regs *regs) >> instrumentation_end(); >> return; >> } >> + } else if (IS_ENABLED(CONFIG_XEN_PV)) { >> + if (preempt_hcall) { >> + /* See CONFIG_PREEMPTION above */ >> + instrumentation_begin(); >> + rcu_irq_exit_preempt(); >> + xen_maybe_preempt_hcall(); >> + trace_hardirqs_on(); >> + instrumentation_end(); >> + return; >> + } > > Ewwwww! This shouldn't be taken as a NAK -- it's just an expression > of disgust. I'm really not proud of it, but that was the least horrible thing I could come up with. > Shouldn't this be: > > instrumentation_begin(); > if (!irq_needs_irq_stack(...)) > __blah(); > else > run_on_irqstack(__blah, NULL); > instrumentation_end(); > > or even: > > instrumentation_begin(); > run_on_irqstack_if_needed(__blah, NULL); > instrumentation_end(); Yeah. In that case the instrumentation markers are not required as they will be inside the run....() function. > ****** BUT ******* > > I think this is all arse-backwards. This is a giant mess designed to > pretend we support preemption and to emulate normal preemption in a > non-preemptible kernel. I propose one to two massive cleanups: > > A: Just delete all of this code. Preemptible hypercalls on > non-preempt kernels will still process interrupts but won't get > preempted. If you want preemption, compile with preemption. I'm happy to do so, but the XEN folks might have opinions on that :) > B: Turn this thing around. Specifically, in the one and only case we > care about, we know pretty much exactly what context we got this entry > in: we're running in a schedulable context doing an explicitly > preemptible hypercall, and we have RIP pointing at a SYSCALL > instruction (presumably, but we shouldn't bet on it) in the hypercall > page. Ideally we would change the Xen PV ABI so the hypercall would > return something like EAGAIN instead of auto-restarting and we could > ditch this mess entirely. But the ABI seems to be set in stone or at > least in molasses, so how about just: > > idt_entry(exit(regs)); > if (inhcall && need_resched()) > schedule(); Which brings you into the situation that you call schedule() from the point where we just moved it out. If we would go there we'd need to ensure that RCU is watching as well. idtentry_exit() might have it turned off .... That's why I did it this way to keep the code flow exactly the same for all these exit variants. Thanks, tglx