Received: by 10.192.165.148 with SMTP id m20csp58686imm; Tue, 24 Apr 2018 17:11:04 -0700 (PDT) X-Google-Smtp-Source: AIpwx49jlvrsx6hwG2JgFyMD5G1cgogQ9IZgzqt/6deIePdULonpTvDf13aYQjr9zP2CKnWAnpjQ X-Received: by 10.98.24.214 with SMTP id 205mr25227657pfy.242.1524615064032; Tue, 24 Apr 2018 17:11:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1524615063; cv=none; d=google.com; s=arc-20160816; b=A3R5lsmH+mRsEE99ahPLtwCIvnSGYwYPMzBN68RJoa7o1U3S5ox/gPo35E8odHD1Dt FCjOwS4CsoH2Cmzz3J1AKcKSAOP6uYpZjq3VLE6OxYhI1ZSMK/oclZoZ29m4yyNECUlZ wZYLdXQbpDJS0BVvi+hin1BsNrzkaMHO8wFc6b0Hf9sLIb37i2imi7CVeIJyTKYUAzOR RPRbsJ497ngcq4zSsQNeBkLi34cCKWWvSljOog4aEOfdC6u2hnfQoNh/llvpWKK8nsZt LuKT3TwmrX7FMNaVdRGtfmnIcgJkEvQbBeEuMfJJwXlUlOIlxdGukTrx+RnYspkytyo9 olAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:user-agent:in-reply-to :content-disposition:mime-version:references:reply-to:subject:cc:to :from:date:arc-authentication-results; bh=x8Dj3knrzoxtMcxb4KP4/PzhS4PTa8/RYN0zHqsipwY=; b=rtqR8bI4zmCA3k/FimGaLAag89axv3aLnrW29ZyF4hZKxWF3NB95trn8Rnk/oLqrri +yl4nZ1U+sNrRMFfVIGeVZV4+FavyVzV8Z6S5tRZcotnGnHQxDpEllKFzAlIvumK/rkr Pz3xkR4CkBBkNfYNneZvcgRZO6hxMEjKs/wwoBx2hEAqqIOob8nUxivomi+dvQrNjc6h ZX3ktFeYDl8i5gIAtEoZbvVB+mXwCvi/9u3LjfqVjjBnHTCg7mc6y1fXDhwZZhMERjiD Re3OH3gbletRuBN8dQGx5aneqpxXdGBM84v4xOJHFmC+b1sBLNN/4P8MExnxV059vHmW Y1+w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j63si12187277pgc.474.2018.04.24.17.10.49; Tue, 24 Apr 2018 17:11:03 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751247AbeDYAJo (ORCPT + 99 others); Tue, 24 Apr 2018 20:09:44 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:47344 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751179AbeDYAJm (ORCPT ); Tue, 24 Apr 2018 20:09:42 -0400 Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w3P09emj123762 for ; Tue, 24 Apr 2018 20:09:42 -0400 Received: from e14.ny.us.ibm.com (e14.ny.us.ibm.com [129.33.205.204]) by mx0a-001b2d01.pphosted.com with ESMTP id 2hjd5budt5-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 24 Apr 2018 20:09:41 -0400 Received: from localhost by e14.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 24 Apr 2018 20:09:40 -0400 Received: from b01cxnp23032.gho.pok.ibm.com (9.57.198.27) by e14.ny.us.ibm.com (146.89.104.201) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 24 Apr 2018 20:09:35 -0400 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp23032.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w3P09Z8949152238; Wed, 25 Apr 2018 00:09:35 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6DC56B2058; Tue, 24 Apr 2018 21:11:37 -0400 (EDT) Received: from paulmck-ThinkPad-W541 (unknown [9.70.82.108]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP id 247C8B2046; Tue, 24 Apr 2018 21:11:37 -0400 (EDT) Received: by paulmck-ThinkPad-W541 (Postfix, from userid 1000) id C165416C0722; Tue, 24 Apr 2018 17:10:49 -0700 (PDT) Date: Tue, 24 Apr 2018 17:10:49 -0700 From: "Paul E. McKenney" To: Joel Fernandes Cc: Mathieu Desnoyers , rostedt , Namhyung Kim , Masami Hiramatsu , linux-kernel , linux-rt-users , Peter Zijlstra , Ingo Molnar , Tom Zanussi , Thomas Gleixner , Boqun Feng , fweisbec , Randy Dunlap , kbuild test robot , baohong liu , vedang patel , kernel-team Subject: Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can Reply-To: paulmck@linux.vnet.ibm.com References: <409016827.14587.1524493888181.JavaMail.zimbra@efficios.com> <20180423172244.694dbc9d@gandalf.local.home> <20180424155655.GA820@linux.vnet.ibm.com> <20180424172658.GT26088@linux.vnet.ibm.com> <20180424182302.GA404@linux.vnet.ibm.com> <20180424182623.GA1357@linux.vnet.ibm.com> <849066633.939.1524612064698.JavaMail.zimbra@efficios.com> <68e4c123-a223-5e26-e57a-da2515041bf3@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <68e4c123-a223-5e26-e57a-da2515041bf3@google.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 18042500-0052-0000-0000-000002E1A578 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00008915; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000257; SDB=6.01022799; UDB=6.00522080; IPR=6.00802034; MB=3.00020757; MTD=3.00000008; XFM=3.00000015; UTC=2018-04-25 00:09:39 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18042500-0053-0000-0000-00005C72FE3D Message-Id: <20180425001049.GX26088@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2018-04-24_06:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1804250000 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 24, 2018 at 04:46:54PM -0700, Joel Fernandes wrote: > > > On 04/24/2018 04:21 PM, Mathieu Desnoyers wrote: > >----- On Apr 24, 2018, at 2:59 PM, Joel Fernandes joelaf@google.com wrote: > >>On Tue, Apr 24, 2018 at 11:26 AM, Paul E. McKenney > >> wrote: > >>>On Tue, Apr 24, 2018 at 11:23:02AM -0700, Paul E. McKenney wrote: > >>>>On Tue, Apr 24, 2018 at 10:26:58AM -0700, Paul E. McKenney wrote: > >>>>>On Tue, Apr 24, 2018 at 09:01:34AM -0700, Joel Fernandes wrote: > >>>>>>On Tue, Apr 24, 2018 at 8:56 AM, Paul E. McKenney > >>>>>> wrote: > >>>>>>>On Mon, Apr 23, 2018 at 05:22:44PM -0400, Steven Rostedt wrote: > >>>>>>>>On Mon, 23 Apr 2018 13:12:21 -0400 (EDT) > >>>>>>>>Mathieu Desnoyers wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>>>I'm inclined to explicitly declare the tracepoints with their given > >>>>>>>>>synchronization method. Tracepoint probe callback functions for currently > >>>>>>>>>existing tracepoints expect to have preemption disabled when invoked. > >>>>>>>>>This assumption will not be true anymore for srcu-tracepoints. > >>>>>>>> > >>>>>>>>Actually, why not have a flag attached to the tracepoint_func that > >>>>>>>>states if it expects preemption to be enabled or not? If a > >>>>>>>>trace_##event##_srcu() is called, then simply disable preemption before > >>>>>>>>calling the callbacks for it. That way if a callback is fine for use > >>>>>>>>with srcu, then it would require calling > >>>>>>>> > >>>>>>>> register_trace_##event##_may_sleep(); > >>>>>>>> > >>>>>>>>Then if someone uses this on a tracepoint where preemption is disabled, > >>>>>>>>we simply do not call it. > >>>>>>> > >>>>>>>One more stupid question... If we are having to trace so much stuff > >>>>>>>in the idle loop, are we perhaps grossly overstating the extent of that > >>>>>>>"idle" loop? For being called "idle", this code seems quite busy! > >>>>>> > >>>>>>;-) > >>>>>>The performance hit I am observing is when running a heavy workload, > >>>>>>like hackbench or something like that. That's what I am trying to > >>>>>>correct. > >>>>>>By the way is there any limitation on using SRCU too early during > >>>>>>boot? I backported Mathieu's srcu tracepoint patches but the kernel > >>>>>>hangs pretty early in the boot. I register lockdep probes in > >>>>>>start_kernel. I am hoping that's not why. > >>>>>> > >>>>>>I could also have just screwed up the backporting... may be for my > >>>>>>testing, I will just replace the rcu API with the srcu instead of all > >>>>>>of Mathieu's new TRACE_EVENT macros for SRCU, since all I am trying to > >>>>>>do right now is measure the performance of my patches with SRCU. > >>>>> > >>>>>Gah, yes, there is an entry on my capacious todo list on making SRCU > >>>>>grace periods work during early boot and mid-boot. Let me see what > >>>>>I can do... > >>>> > >>>>OK, just need to verify that you are OK with call_srcu()'s callbacks > >>>>not being invoked until sometime during core_initcall() time. (If you > >>>>really do need them to be invoked before that, in theory it is possible, > >>>>but in practice it is weird, even for RCU.) > >>> > >>>Oh, and that early at boot, you will need to use DEFINE_SRCU() or > >>>DEFINE_STATIC_SRCU() rather than dynamic allocation and initialization. > >>> > >>> Thanx, Paul > >>> > >> > >>Oh ok. > >> > >>About call_rcu, calling it later may be an issue since we register the > >>probes in start_kernel, for the first probe call_rcu will be sched, > >>but for the second one I think it'll try to call_rcu to get rid of the > >>first one. > >> > >>This is the relevant code that gets called when probes are added: > >> > >>static inline void release_probes(struct tracepoint_func *old) > >>{ > >> if (old) { > >> struct tp_probes *tp_probes = container_of(old, > >> struct tp_probes, probes[0]); > >> call_rcu_sched(&tp_probes->rcu, rcu_free_old_probes); > >> } > >>} > >> > >>Maybe we can somehow defer the call_srcu until later? Would that be possible? > >> > >>also Mathieu, you didn't modify the call_rcu_sched in your prototype > >>to be changed to use call_srcu, should you be doing that? > > > >You're right, I think I should have introduced a call_srcu in there. > >It's missing in my prototype. > > > >However, in the prototype I did, we need to wait for *both* sched-rcu > >and SRCU grace periods, because we don't track which site is using which > >rcu flavor. > > > >So you could achieve this relatively easily by means of two chained > >RCU callbacks, e.g.: > > > >release_probes() calls call_rcu_sched(... , rcu_free_old_probes) > > > >and then in rcu_free_old_probes() do: > > > >call_srcu(... , srcu_free_old_probes) > > > >and perform kfree(container_of(head, struct tp_probes, rcu)); > >within srcu_free_old_probes. > > > >It is somewhat a hack, but should work. > > Sounds good, thanks. > > Also I found the reason for my boot issue. It was because the > init_srcu_struct in the prototype was being done in an initcall. > Instead if I do it in start_kernel before the tracepoint is used, it > fixes it (although I don't know if this is dangerous to do like this > but I can get it to boot atleast.. Let me know if this isn't the > right way to do it, or if something else could go wrong) > > diff --git a/init/main.c b/init/main.c > index 34823072ef9e..ecc88319c6da 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -631,6 +631,7 @@ asmlinkage __visible void __init start_kernel(void) > WARN(!irqs_disabled(), "Interrupts were enabled early\n"); > early_boot_irqs_disabled = false; > > + init_srcu_struct(&tracepoint_srcu); > lockdep_init_early(); > > local_irq_enable(); > -- > > I benchmarked it and the performance also looks quite good compared > to the rcu tracepoint version. > > If you, Paul and other think doing the init_srcu_struct like this > should be Ok, then I can try to work more on your srcu prototype and > roll into my series and post them in the next RFC series (or let me > know if you wanted to work your srcu stuff in a separate series..). That is definitely not what I was expecting, but let's see if it works anyway... ;-) But first, I was instead expecting something like this: DEFINE_SRCU(tracepoint_srcu); With this approach, some of the initialization happens at compile time and the rest happens at the first call_srcu(). This will work -only- if the first call_srcu() doesn't happen until after workqueue_init_early() has been invoked. Which I believe must have been the case in your testing, because otherwise it looks like __call_srcu() would have complained bitterly. On the other hand, if you need to invoke call_srcu() before the call to workqueue_init_early(), then you need the patch that I am beating into shape. Plus you would need to use DEFINE_SRCU() and to avoid invoking init_srcu_struct(). Thanx, Paul