Date: Sat, 2 Apr 2022 20:40:14 -0700
From: "Paul E. McKenney"
To: "Zhang, Qiang1"
Cc: "frederic@kernel.org", "rcu@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] rcu: Put the irq work into hard interrupt context for execution
Message-ID: <20220403034014.GL4285@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20220330060012.2470054-1-qiang1.zhang@intel.com>
 <20220330201620.GM4285@paulmck-ThinkPad-P17-Gen-1>
 <20220331172943.GV4285@paulmck-ThinkPad-P17-Gen-1>
 <20220401165524.GF4285@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Apr 02, 2022 at 06:29:04AM +0000, Zhang, Qiang1 wrote:
>
> On Fri, Apr 01, 2022 at 01:55:51AM +0000, Zhang, Qiang1 wrote:
> >
> > On Wed, Mar 30, 2022 at 10:47:05PM +0000, Zhang, Qiang1 wrote:
> > > On Wed, Mar 30, 2022 at 02:00:12PM +0800, Zqiang wrote:
> > > > In PREEMPT_RT kernel, if irq work flags is not set, it will be
> > > > executed in per-CPU irq_work kthreads.
> > > > set IRQ_WORK_HARD_IRQ flags
> > > > to irq work, put it in the context of hard interrupt execution,
> > > > accelerate scheduler to re-evaluate.
> > > >
> > > > Signed-off-by: Zqiang
> > > > ---
> > > >  kernel/rcu/tree.c        | 2 +-
> > > >  kernel/rcu/tree_plugin.h | 2 +-
> > > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index e2ffbeceba69..a69587773a85 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -678,7 +678,7 @@ static void late_wakeup_func(struct irq_work *work)
> > > >  }
> > > >
> > > >  static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
> > > > -	IRQ_WORK_INIT(late_wakeup_func);
> > > > +	IRQ_WORK_INIT_HARD(late_wakeup_func);
> > > >
> > > >This is used only by rcu_irq_work_resched(), which is invoked only by
> > > >rcu_user_enter(), which is never invoked until userspace is enabled,
> > > >by which time all of the various kthreads will have been spawned,
> > > >correct?
> > > >
> > > >Either way, please show me the exact sequence of events that lead to
> > > >a problem with the current IRQ_WORK_INIT().
> > > >
> > > >  /*
> > > >   * If either:
> > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > index 3037c2536e1f..cf7bd28af8ef 100644
> > > > --- a/kernel/rcu/tree_plugin.h
> > > > +++ b/kernel/rcu/tree_plugin.h
> > > > @@ -661,7 +661,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
> > > >  	    expboost && !rdp->defer_qs_iw_pending && cpu_online(rdp->cpu)) {
> > > >  		// Get scheduler to re-evaluate and call hooks.
> > > >  		// If !IRQ_WORK, FQS scan will eventually IPI.
> > > > -		init_irq_work(&rdp->defer_qs_iw, rcu_preempt_deferred_qs_handler);
> > > > +		rdp->defer_qs_iw = IRQ_WORK_INIT_HARD(rcu_preempt_deferred_qs_handler);
> > > >  		rdp->defer_qs_iw_pending = true;
> > > >  		irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
> > > >  	}
> > > >
> > > >OK, in theory, rcu_read_unlock() could get to this point before all of
> > > >the various kthreads were spawned. In practice, the next time that the
> > > >boot CPU went idle, the end of the quiescent state would be noticed.
> > >
> > > As I understand it, irq_work is used here so that the quiescent state
> > > is noticed earlier. Because irq_work normally executes in interrupt
> > > context, it can run promptly, but in an RT kernel the irq_work is
> > > handed to a kthread for execution, so it is subject to scheduling delay.
> > > Is there anything I missed?
> >
> > >Yes, in that I am not seeing any actual data showing that this fix
> > >really makes things better. Please keep in mind that IRQ_WORK_INIT_HARD
> > >does have performance disadvantages of its own. So although I agree
> > >with your words saying that IRQ_WORK_INIT_HARD -might- be helpful,
> > >those words are not sufficient.
> > >
> > >So, can you show a statistically significant benefit on a real system?
> > >For example, by measuring the time required for an expedited grace
> > >period to complete? That would argue for this change, though it
> > >would need to be conditional, so that systems that don't care that
> > >much about the latency of expedited RCU grace periods don't need to
> > >pay the IRQ_WORK_INIT_HARD performance penalties. Or you would need
> > >to demonstrate that these performance penalties don't cause problems.
> > >(But such a demonstration is not easy given the wide variety of
> > >systems that Linux supports.)
> > >
> > >Now, I could imagine that the current code could cause problems
> > >during boot on CONFIG_PREEMPT_RT kernels. But, believe me, I can
> > >imagine all sorts of horrible problems.
> > >But we should fix those that
> > >happen not just in my imagination, but also in the real world. ;-)
> >
> > Thanks, agree. I'll test it according to your suggestion.
> >
> >Very good! I am looking forward to seeing what you come up with.
>
> Hello, Paul
>
> I used the v5.17.1-rt16 PREEMPT_RT kernel with
> CONFIG_RCU_STRICT_GRACE_PERIOD enabled. During boot, I found that many
> 'defer_qs_iw' irq-work items are handled in 'irq_work/0'. Because
> 'irq_work/0' is an RT FIFO kthread, it occupies the boot CPU for a long
> time, so the kernel_init kthread cannot get the boot CPU to bring up the
> other CPUs, and the boot process hangs.
> Replacing init_irq_work(&rdp->defer_qs_iw) with IRQ_WORK_INIT_HARD fixes
> the hang.

Very good on making it happen! But please make the fix use
IRQ_WORK_INIT_HARD only for PREEMPT_RT kernels. That way the performance
of other systems won't be degraded by this fix for a PREEMPT_RT bug.

							Thanx, Paul

> Thanks
> Zqiang
>
> >
> >							Thanx, Paul
>
> > >So if you can make such a problem happen in real life, then I would be
> > >happy to take a patch that fixed this on CONFIG_PREEMPT_RT but kept
> > >the current code otherwise.
> > >
> > >							Thanx, Paul
> >
> > > Thanks
> > > Zqiang
> > >
> > > >
> > > >Or has this been failing in some other manner? If so, please let me
> > > >know the exact sequence of events.
> > > >
> > > >							Thanx, Paul