Received: by 2002:a05:6358:e9c4:b0:b2:91dc:71ab with SMTP id hc4csp6472139rwb; Tue, 9 Aug 2022 16:23:48 -0700 (PDT) X-Google-Smtp-Source: AA6agR5Zdb8RP55WSgHmYOJzE2zIvFRrNBoc68UWUy/rBSPnMVwcInuvYZtLr0VCgL+SgmxV3bAJ X-Received: by 2002:a17:907:6eac:b0:730:a07f:38bb with SMTP id sh44-20020a1709076eac00b00730a07f38bbmr19247998ejc.750.1660087428065; Tue, 09 Aug 2022 16:23:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660087428; cv=none; d=google.com; s=arc-20160816; b=mERUzPp7z97UYnT8eMXj1wmMsAusRKr14sXHo8D15K3EMLNSHGR4AFp5TykyLm7F6f 4RM+EbngFDWjfx0CNQrmR6CcYc6+0goap48XUIcohMhrvbft+0hunUmIsFHUHyyQ9HqY 7ZtuHrpNHnICAOraGUS884NuoCa6BhiFuXMOAPeGoDY+GxzZzhF3vYx+lgl4Idq+XhqC XNm3BucnEBx3EpkPdMh5S1yHv87k7888OorjQpYpKBIH2ilQcVmk1DQdo1swhCf52Jfl tWqCUthmCkKRvMcQOar1gvdc6KVxSWEqVsn6EtPtCKcfxiIwjsJVxFMxpgvFYcLnj+fo NJvQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:dkim-signature:dkim-signature:from; bh=49BR0x796LXoJ0VhQzJr4EQaUub9aq/xCwbr1Uy7JZk=; b=Fy68zD74kWT2Cw32VqlBP+CPvBEFoHsvLaV2I5LPX40S12K4Xk78tWNQGc5giYRCer tH6t7xcoGZmemPJX8jnJfW8XLnOTMA3Sp1/UC79vIvC1NuSpb1mV5EwG2sgePjWyZkYX VhjIYU2dTuldCu+YguSTDnLma1fEobUrKGtRg9gafkwrhfJQqpOBL6DIxyaa2LkUvEgR nO8VaHMWzvf9LOFrvz1bbSsiQPfhe6D6epcdBsyXHZHXRktrCn5hOvrHsvObZy+bN7SW J3nRcQvycEXx9of/WkL7I6mBcYJoQzLXAM3WxRLEmjatAwvaaFkZ0CFh8BUlTSWkuMUo A3JA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=upJCVu7v; dkim=neutral (no key) header.i=@linutronix.de header.b=bQ8bUaL+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id js20-20020a17090797d400b0072b4063eca2si3418664ejc.559.2022.08.09.16.23.23; Tue, 09 Aug 2022 16:23:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=upJCVu7v; dkim=neutral (no key) header.i=@linutronix.de header.b=bQ8bUaL+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229526AbiHIXSi (ORCPT + 99 others); Tue, 9 Aug 2022 19:18:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229449AbiHIXSe (ORCPT ); Tue, 9 Aug 2022 19:18:34 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E939461DBE for ; Tue, 9 Aug 2022 16:18:33 -0700 (PDT) From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1660087111; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=49BR0x796LXoJ0VhQzJr4EQaUub9aq/xCwbr1Uy7JZk=; b=upJCVu7v+BJugzpD9dLKrlHgXZINP3F4S+n0ysjUqHvUEydbmm0eIPKWsP5p0UMnA3TYyH +TFpc35CmLXz7Zg8bPXdY2HNbvHv9V5cG8hlJq810fL5ht2agyxpPQUAvFIz1Urjb+xdns CKIiL30f1q5Yqqfx0BuWx/kq1HtO4dzYhG+is8WFtJd4T/iZNvM37U0aLrYWSyzHFyqbu3 ubnIRrHTbmv4USJZcZOrKP+xNMNthjC9I+IAgustXu0ndWmegF2hm0cf3lRQORkohEC/bP 60h3nPQznbkW82NmMmu3eft4UnNaRPF3b5nbNgw8rYyprug5eWwbRIM9V+W1Ig== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1660087111; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=49BR0x796LXoJ0VhQzJr4EQaUub9aq/xCwbr1Uy7JZk=; b=bQ8bUaL+l2tqSf1Y2TuxxI63mBSDGryCDh4JjNRVoN/gw99rBKqmksnMfsljxFlPQnp9+R 0/q9xr9y5tx3tKAw== To: Borislav Petkov , ira.weiny@intel.com Cc: Rik van Riel , Dave Hansen , Dave Hansen , x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, Frederic Weisbecker , Juergen Gross , Mark Rutland , Andrew Cooper Subject: Re: [RFC PATCH 1/5] entry: Pass pt_regs to irqentry_exit_cond_resched() In-Reply-To: References: <20220805173009.3128098-1-ira.weiny@intel.com> <20220805173009.3128098-2-ira.weiny@intel.com> Date: Wed, 10 Aug 2022 01:18:30 +0200 Message-ID: <87fsi5qa49.ffs@tglx> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 08 2022 at 12:38, Borislav Petkov wrote: > On Fri, Aug 05, 2022 at 10:30:05AM -0700, ira.weiny@intel.com wrote: >> --- >> arch/arm64/include/asm/preempt.h | 2 +- >> arch/arm64/kernel/entry-common.c | 4 ++-- >> arch/x86/entry/common.c | 2 +- >> include/linux/entry-common.h | 17 ++++++++------ >> kernel/entry/common.c | 13 +++++++---- >> kernel/sched/core.c | 40 ++++++++++++++++---------------- >> 6 files changed, 43 insertions(+), 35 deletions(-) > > Why all this churn? > > Why can't you add a parameter to irqentry_exit(): > > noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state, bool cond_resched); > > and then have all callers except xen_pv_evtchn_do_upcall() pass in false > and this way have all exit paths end up in irqentry_exit()? > > And, ofc, move the true case which is the body of > raw_irqentry_exit_cond_resched() to irqentry_exit() and then get rid of > former. > > xen_pv_evtchn_do_upcall() will, ofc, do: > > if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) { > irqentry_exit(regs, state, true); > instrumentation_end(); > restore_inhcall(inhcall); > } else { > instrumentation_end(); > irqentry_exit(regs, state, false); > How is that less churn? Care to do: git grep 'irqentry_exit(' arch/ The real question is: Why is it required that irqentry_exit_cond_resched() handles any of the auxiliary pt regs space? That's completely unanswered by the changelog and absolutely irrelevant for the problem at hand, i.e. storing the CPU number on irq/exception entry. So why is this patch in this series at all? But even for future purposes it is more than questionable. Why? Contrary to the claim of the changelog xen_pv_evtchn_do_upcall() is not really a bypass of irqentry_exit(). The point is: The hypercall is issued by the kernel from privcmd_ioctl_hypercall() which does: xen_preemptible_hcall_begin(); hypercall(); xen_preemptible_hcall_end(); So any upcall from the hypervisor to the guest will semantically hit regular interrupt enabled guest kernel space which means that if that code would call irqentry_exit() then it would run through the kernel exit code path: } else if (!regs_irqs_disabled(regs)) { instrumentation_begin(); if (IS_ENABLED(CONFIG_PREEMPTION)) irqentry_exit_cond_resched(); /* Covers both tracing and lockdep */ trace_hardirqs_on(); instrumentation_end(); } .... Which would fail to invoke irqentry_exit_cond_resched() if CONFIG_PREEMPTION=n. But the whole point of this exercise is to allow preemption from the upcall even when the kernel has CONFIG_PREEMPTION=n. Staring at this XENPV code there are two issues: 1) That code lacks a trace_hardirqs_on() after the call to irqentry_exit_cond_resched(). My bad. 2) CONFIG_PREEMPT_DYNAMIC broke this mechanism. If the static call/key is disabled then the call becomes a NOP. Frederic? Both clearly demonstrate how well tested this XEN_PV muck is which means we should just delete it. Anyway. This wants the fix below. If there is still a need to do anything about this XEN cond_resched() muck for the PREEMPTION=n case for future auxregs usage then this can be simply hidden in this new XEN helper and does not need any change to the rest of the code. I doubt that this is required, but if so then there needs to be a very concise explanation and not just uncomprehensible hand waving word salad. Thanks, tglx --- --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -283,9 +283,18 @@ static __always_inline void restore_inhc { __this_cpu_write(xen_in_preemptible_hcall, inhcall); } + +static __always_inline void xenpv_irqentry_exit_cond_resched(void) +{ + instrumentation_begin(); + raw_irqentry_exit_cond_resched(); + trace_hardirqs_on(); + instrumentation_end(); +} #else static __always_inline bool get_and_clear_inhcall(void) { return false; } static __always_inline void restore_inhcall(bool inhcall) { } +static __always_inline void xenpv_irqentry_exit_cond_resched(void) { } #endif static void __xen_pv_evtchn_do_upcall(struct pt_regs *regs) @@ -306,11 +315,11 @@ static void __xen_pv_evtchn_do_upcall(st instrumentation_begin(); run_sysvec_on_irqstack_cond(__xen_pv_evtchn_do_upcall, regs); + instrumentation_end(); inhcall = get_and_clear_inhcall(); if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) { - irqentry_exit_cond_resched(); - instrumentation_end(); + xenpv_irqentry_exit_cond_resched(); restore_inhcall(inhcall); } else { instrumentation_end();