Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp584271imm; Sat, 14 Jul 2018 07:34:09 -0700 (PDT) X-Google-Smtp-Source: AAOMgpetcqRKfz25Cw2McEhiZiN/z81Zgedra3OLqYmBnYAEDsVhFdglIPDx2QaeXJg8nAUSvkVt X-Received: by 2002:a17:902:205:: with SMTP id 5-v6mr10194100plc.301.1531578849408; Sat, 14 Jul 2018 07:34:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1531578849; cv=none; d=google.com; s=arc-20160816; b=ejYt/Oxwjn+kgdS5o9fd/FYlZdXSer0UqCaUBJVZd3oVIsKCAqla8A67kLI0UKyDkm IXkHFRb6jdr8+jD9x3vaQ7/6ioG+H17gTM3PtSY4F0aD0HDfYycYc3QPXjmrCQPnMqje kVdAakGGmA9earnW+Ytb58lE0fqBdzbI+3mhC8pNNRib6Z9rVAaScjj9ItMkY4pXcp4E JpYeJf4rIRpQPa8pJJZbfVLy46rUlGhdb+l2DH569btLy3B5qgHZ4Q0ce9k3TeRVptNO DECCtQFjxTVZqUSKMxOr665FtiQZGQj9bY66S3kk9m0uXv1ucTV2cp8IihjSuQhvso3e GVxw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:thread-index:thread-topic :content-transfer-encoding:mime-version:subject:references :in-reply-to:message-id:cc:to:from:date:dkim-signature:dkim-filter :arc-authentication-results; bh=NIIGnCSzNoiKQRSVNNsaa2AuVcz7lrGDO8e6bUGG/M8=; b=cTTKPrhn9l8iqkcSJClONdeEtR6314wTOb6vS+CKnDCmZE8jReK18XJpqhper+aN6E P8tlrYQn8t1LoG8kyb+tIFgKAW9PXNL39JW9WLlfP22tO9mkPrLqFhcWkV957+XsU3ut CpHqonBZhY5wsk0AQJwzFb6rPZa891HU1gvhFno6CWiH9jZCoZ+JVNhzW+PcmhhZuJDU wDpHarth20B1V1lF++Bj4csAMc+hl6ZojnrVXPEgT//KjU2gkySkF+tepCp5/Oxt+zYJ n6Jg1kHViMnBTs3X/aOIplsi/iuwdKG8xGM3f0NPupV5BIAe04LjS8TnC1xPT2V/YvDs t0mg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b=dcl48MzP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m37-v6si26636606plg.491.2018.07.14.07.33.54; Sat, 14 Jul 2018 07:34:09 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b=dcl48MzP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728061AbeGNOvR (ORCPT + 99 others); Sat, 14 Jul 2018 10:51:17 -0400 Received: from mail.efficios.com ([167.114.142.138]:35458 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726793AbeGNOvR (ORCPT ); Sat, 14 Jul 2018 10:51:17 -0400 Received: from localhost (ip6-localhost [IPv6:::1]) by mail.efficios.com (Postfix) with ESMTP id CA9211B74CB; Sat, 14 Jul 2018 10:31:59 -0400 (EDT) Received: from mail.efficios.com ([IPv6:::1]) by localhost (mail02.efficios.com [IPv6:::1]) (amavisd-new, port 10032) with ESMTP id w_4vqp72Tjnx; Sat, 14 Jul 2018 10:31:58 -0400 (EDT) Received: from localhost (ip6-localhost [IPv6:::1]) by mail.efficios.com (Postfix) with ESMTP id 4C6D71B74C4; Sat, 14 Jul 2018 10:31:58 -0400 (EDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 4C6D71B74C4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficios.com; s=default; t=1531578718; bh=NIIGnCSzNoiKQRSVNNsaa2AuVcz7lrGDO8e6bUGG/M8=; h=Date:From:To:Message-ID:MIME-Version; b=dcl48MzPcAwnkFHJ724B3eHKrJsyOf/8SrkBS7bnfbN6zKC9X5414QmDHrJ2SpIxM me+aRzhdXx4/9Klek5a5GaRbAWTR+69D+IJinzaxijOcQ0RHzEGnmPEcVgl6lwCLkU lQUQIB+tMEg8HTdrZ/GNNGh5vvJgP0HisJRLAxvmHafZvUhXNDtvxJQN/f88g4bK44 j1TsZasFXpPTrKe/B5vw3d+OZ2CRTmoc/cQXMB0EHE+f1WOsg81uJocTpHE6pWBORU QUUJQFH7w+Z+NzjSAjRXbd9neFBekmwUngHgNFB6Yz+6CV+kJv7LaRQMkzsc+2rJIm yVHVgDJ0r+ZoQ== X-Virus-Scanned: amavisd-new at efficios.com Received: from mail.efficios.com ([IPv6:::1]) by localhost (mail02.efficios.com [IPv6:::1]) (amavisd-new, port 10026) with ESMTP id Vo3EzGxCtvEs; Sat, 14 Jul 2018 10:31:58 -0400 (EDT) Received: from mail02.efficios.com (mail02.efficios.com [167.114.142.138]) by mail.efficios.com (Postfix) with ESMTP id 31BE01B74BD; Sat, 14 Jul 2018 10:31:58 -0400 (EDT) Date: Sat, 14 Jul 2018 10:31:57 -0400 (EDT) From: Mathieu Desnoyers To: "Joel Fernandes, Google" Cc: linux-kernel , kernel-team , Boqun Feng , Byungchul Park , Erick Reyes , Ingo Molnar , Julia Cartwright , Masami Hiramatsu , Namhyung Kim , "Paul E. McKenney" , Peter Zijlstra , rostedt , Thomas Gleixner , Todd Kjos , Tom Zanussi , Will Deacon Message-ID: <1420383125.4964.1531578717837.JavaMail.zimbra@efficios.com> In-Reply-To: <20180713215547.255620-3-joel@joelfernandes.org> References: <20180713215547.255620-1-joel@joelfernandes.org> <20180713215547.255620-3-joel@joelfernandes.org> Subject: Re: [PATCH v10 2/3] tracepoint: Make rcuidle tracepoint callers use SRCU MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [167.114.142.138] X-Mailer: Zimbra 8.8.8_GA_2096 (ZimbraWebClient - FF52 (Linux)/8.8.8_GA_1703) Thread-Topic: tracepoint: Make rcuidle tracepoint callers use SRCU Thread-Index: GAYR78LA0J4BTegV3PVmbdxukz+zXg== Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- On Jul 13, 2018, at 5:55 PM, Joel Fernandes, Google joel@joelfernandes.org wrote: > From: "Joel Fernandes (Google)" > > In recent tests with IRQ on/off tracepoints, a large performance > overhead ~10% is noticed when running hackbench. This is root caused to > calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the > tracepoint code. Following a long discussion on the list [1] about this, > we concluded that srcu is a better alternative for use during rcu idle. > Although it does involve extra barriers, its lighter than the sched-rcu > version which has to do additional RCU calls to notify RCU idle about > entry into RCU sections. > > In this patch, we change the underlying implementation of the > trace_*_rcuidle API to use SRCU. This has shown to improve performance > alot for the high frequency irq enable/disable tracepoints. > > Test: Tested idle and preempt/irq tracepoints. > > Here are some performance numbers: > > With a run of the following 30 times on a single core x86 Qemu instance > with 1GB memory: > hackbench -g 4 -f 2 -l 3000 > > Completion times in seconds. CONFIG_PROVE_LOCKING=y. > > No patches (without this series) > Mean: 3.048 > Median: 3.025 > Std Dev: 0.064 > > With Lockdep using irq tracepoints with RCU implementation: > Mean: 3.451 (-11.66 %) > Median: 3.447 (-12.22%) > Std Dev: 0.049 > > With Lockdep using irq tracepoints with SRCU implementation (this series): > Mean: 3.020 (I would consider the improvement against the "without > this series" case as just noise). > Median: 3.013 > Std Dev: 0.033 > > [1] https://patchwork.kernel.org/patch/10344297/ > > Reviewed-by: Mathieu Desnoyers I'm fine with the changes done since last iteration. Acked-by: Mathieu Desnoyers Thanks! Mathieu > Cleaned-up-by: Peter Zijlstra > Signed-off-by: Joel Fernandes (Google) > --- > include/linux/tracepoint.h | 45 +++++++++++++++++++++++++++++++------- > kernel/tracepoint.c | 16 +++++++++++++- > 2 files changed, 52 insertions(+), 9 deletions(-) > > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index 19a690b559ca..97e1d365a817 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -15,6 +15,7 @@ > */ > > #include > +#include > #include > #include > #include > @@ -33,6 +34,8 @@ struct trace_eval_map { > > #define TRACEPOINT_DEFAULT_PRIO 10 > > +extern struct srcu_struct tracepoint_srcu; > + > extern int > tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data); > extern int > @@ -75,10 +78,16 @@ int unregister_tracepoint_module_notifier(struct > notifier_block *nb) > * probe unregistration and the end of module exit to make sure there is no > * caller executing a probe when it is freed. > */ > +#ifdef CONFIG_TRACEPOINTS > static inline void tracepoint_synchronize_unregister(void) > { > + synchronize_srcu(&tracepoint_srcu); > synchronize_sched(); > } > +#else > +static inline void tracepoint_synchronize_unregister(void) > +{ } > +#endif > > #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS > extern int syscall_regfunc(void); > @@ -129,18 +138,34 @@ extern void syscall_unregfunc(void); > * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just > * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto". > */ > -#define __DO_TRACE(tp, proto, args, cond, rcucheck) \ > +#define __DO_TRACE(tp, proto, args, cond, rcuidle) \ > do { \ > struct tracepoint_func *it_func_ptr; \ > void *it_func; \ > void *__data; \ > + int __maybe_unused idx = 0; \ > \ > if (!(cond)) \ > return; \ > - if (rcucheck) \ > - rcu_irq_enter_irqson(); \ > - rcu_read_lock_sched_notrace(); \ > - it_func_ptr = rcu_dereference_sched((tp)->funcs); \ > + \ > + /* srcu can't be used from NMI */ \ > + if (rcuidle && in_nmi()) \ > + WARN_ON_ONCE(1); \ > + \ > + /* keep srcu and sched-rcu usage consistent */ \ > + preempt_disable_notrace(); \ > + \ > + /* \ > + * For rcuidle callers, use srcu since sched-rcu \ > + * doesn't work from the idle path. \ > + */ \ > + if (rcuidle) \ > + idx = srcu_read_lock_notrace(&tracepoint_srcu); \ > + else \ > + rcu_read_lock_sched_notrace(); \ > + \ > + it_func_ptr = rcu_dereference_raw((tp)->funcs); \ > + \ > if (it_func_ptr) { \ > do { \ > it_func = (it_func_ptr)->func; \ > @@ -148,9 +173,13 @@ extern void syscall_unregfunc(void); > ((void(*)(proto))(it_func))(args); \ > } while ((++it_func_ptr)->func); \ > } \ > - rcu_read_unlock_sched_notrace(); \ > - if (rcucheck) \ > - rcu_irq_exit_irqson(); \ > + \ > + if (rcuidle) \ > + srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ > + else \ > + rcu_read_unlock_sched_notrace(); \ > + \ > + preempt_enable_notrace(); \ > } while (0) > > #ifndef MODULE > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 6dc6356c3327..955148d91b74 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -31,6 +31,9 @@ > extern struct tracepoint * const __start___tracepoints_ptrs[]; > extern struct tracepoint * const __stop___tracepoints_ptrs[]; > > +DEFINE_SRCU(tracepoint_srcu); > +EXPORT_SYMBOL_GPL(tracepoint_srcu); > + > /* Set to 1 to enable tracepoint debug output */ > static const int tracepoint_debug; > > @@ -67,16 +70,27 @@ static inline void *allocate_probes(int count) > return p == NULL ? NULL : p->probes; > } > > -static void rcu_free_old_probes(struct rcu_head *head) > +static void srcu_free_old_probes(struct rcu_head *head) > { > kfree(container_of(head, struct tp_probes, rcu)); > } > > +static void rcu_free_old_probes(struct rcu_head *head) > +{ > + call_srcu(&tracepoint_srcu, head, srcu_free_old_probes); > +} > + > static inline void release_probes(struct tracepoint_func *old) > { > if (old) { > struct tp_probes *tp_probes = container_of(old, > struct tp_probes, probes[0]); > + /* > + * Tracepoint probes are protected by both sched RCU and SRCU, > + * by calling the SRCU callback in the sched RCU callback we > + * cover both cases. So let us chain the SRCU and sched RCU > + * callbacks to wait for both grace periods. > + */ > call_rcu_sched(&tp_probes->rcu, rcu_free_old_probes); > } > } > -- > 2.18.0.203.gfac676dfb9-goog -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com