Received: by 10.192.165.148 with SMTP id m20csp3547325imm; Mon, 7 May 2018 14:18:50 -0700 (PDT) X-Google-Smtp-Source: AB8JxZrGCNEMNW4ELE4+NeYbtmc55rk1AGi6Kqx3UtLGIeVyDQLGCDnsAVyGRmnZYxjeG2FBRXN5 X-Received: by 10.98.206.78 with SMTP id y75mr36761743pfg.175.1525727930091; Mon, 07 May 2018 14:18:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1525727930; cv=none; d=google.com; s=arc-20160816; b=tKJowEp7ghT7xQMBQJl1xYdJiHfaAqHd6cwK3Gcejcni4ZTxOYeOCvWH04Z8T52IJk 8kJVGANN+KJJp01AtHJEOiS/Qojh/VVcI1WJgTBQnFSp5Ef8qEi7ObVhUugVaOfyY+/s 67D5bZRM+gT0qG1as3HBkITr2QosPjqU92Aa5AFCbJ/XxKyFrzieVUG+kQp7WqrcXnzT 35RRmbRg/n8HlvjiUmmlTIcH0S1bbb8Dil9wFH0gUa3sbtiS6I3onOJ96dFoCuJzaJtq vDfsWwTiTXfYQxjXFMRgEu9OXjk0ya9kFkhinjEEPkRc+O2LLB9ajtjMjdNo+kQ54fwl HAQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:thread-index:thread-topic :content-transfer-encoding:mime-version:subject:references :in-reply-to:message-id:cc:to:from:date:dkim-signature:dkim-filter :arc-authentication-results; bh=UwNQ/cOkOcmHUHfb9TVycD8DK1vgtPKRM99o9BhjgzI=; b=uGMuF5bMwDh3mIFf/6GahQeHX+aK0bdAjnjNpd4Yf+ql4h//0dID3yz6LQnKDK0p+p kKQhI08GCGa3WHx5iqRTih2QWcZvDHDzrikD/wpfd/4gDD0m3/BBbujn96AGE8mfWYo6 IHAoW+Nx36XU8hbR7R2UIZ86B17CGUqXsPlj+IYiVOtiZZnWRI4zUN8mlmJcgvJOsf+Z YErtPBpB+uyKDfmNOBTKLbevMCDTSVf7PieG2Nr+tUdZ3DFR8frlAEwyD5DcuCPi2NMk BgdmBgL5XQ83vTgHH33WRq+gg1vB4/EqVL/yeI3ZqRJ3uhKdP58hOsIDlrBx4PN46Xp5 Igng== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b=XYARXIQM; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u75si11730271pfd.328.2018.05.07.14.18.35; Mon, 07 May 2018 14:18:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b=XYARXIQM; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753462AbeEGVR4 (ORCPT + 99 others); Mon, 7 May 2018 17:17:56 -0400 Received: from mail.efficios.com ([167.114.142.138]:55992 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753002AbeEGVRz (ORCPT ); Mon, 7 May 2018 17:17:55 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id 668B71B1A17; Mon, 7 May 2018 17:17:54 -0400 (EDT) Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail02.efficios.com [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id dttMZ-yETbT6; Mon, 7 May 2018 17:17:53 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by mail.efficios.com (Postfix) with ESMTP id 6E2C51B1A10; Mon, 7 May 2018 17:17:53 -0400 (EDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 6E2C51B1A10 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficios.com; s=default; t=1525727873; bh=UwNQ/cOkOcmHUHfb9TVycD8DK1vgtPKRM99o9BhjgzI=; h=Date:From:To:Message-ID:MIME-Version; b=XYARXIQMJgWDuEXYyGZmnmxe8T5pyisWxriUDDfI87C61Ti/IMWZ3CLT8GIEJ7X4I Ql955poRXhZZtOsnuxIDBx9VAs2BBTHwugXfB5CnNpDqxj66B34j5xW1XUf8x5G/X4 a58DTA3TwYqceKOtAIqXAHk8xm6XtG+AVY1RHaraqi0lw1uGDwWBE8turKjuhuFGEw Oo6ACY/c0+L6d5vqLzG1eB09NppuEsYyhqsncBKBm55v4uIsmeqaJOtS/MPu8SQazy kgRiKfwOcywlnjb6vaMuMWmphcVhR3UVqmLW3XONiT4EeKtnW0Ekf+5ekPDCIY8g4j 5C5bChX1kjhvg== X-Virus-Scanned: amavisd-new at efficios.com Received: from mail.efficios.com ([127.0.0.1]) by localhost (mail02.efficios.com [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id IEVHwB3RsXeC; Mon, 7 May 2018 17:17:53 -0400 (EDT) Received: from mail02.efficios.com (mail02.efficios.com [167.114.142.138]) by mail.efficios.com (Postfix) with ESMTP id 545BC1B1A04; Mon, 7 May 2018 17:17:53 -0400 (EDT) Date: Mon, 7 May 2018 17:17:53 -0400 (EDT) From: Mathieu Desnoyers To: "Paul E. McKenney" Cc: Joel Fernandes , linux-kernel , "Joel Fernandes, Google" , rostedt , Peter Zijlstra , Ingo Molnar , Tom Zanussi , Namhyung Kim , Thomas Gleixner , Boqun Feng , fweisbec , Randy Dunlap , Masami Hiramatsu , kbuild test robot , baohong liu , vedang patel , kernel-team Message-ID: <2001761952.217.1525727873194.JavaMail.zimbra@efficios.com> In-Reply-To: <20180507210801.GZ26088@linux.vnet.ibm.com> References: <20180507204143.13061-1-joelaf@google.com> <20180507204143.13061-5-joelaf@google.com> <20180507210801.GZ26088@linux.vnet.ibm.com> Subject: Re: [PATCH RFC v6 4/5] tracepoint: Make rcuidle tracepoint callers use SRCU MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [167.114.142.138] X-Mailer: Zimbra 8.8.8_GA_2026 (ZimbraWebClient - FF52 (Linux)/8.8.8_GA_2031) Thread-Topic: tracepoint: Make rcuidle tracepoint callers use SRCU Thread-Index: nnlZJ4fddsYlY5dHLZYvhuQmw8obAA== Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- On May 7, 2018, at 5:08 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote: > On Mon, May 07, 2018 at 01:41:42PM -0700, Joel Fernandes wrote: >> From: "Joel Fernandes (Google)" >> >> In recent tests with IRQ on/off tracepoints, a large performance >> overhead ~10% is noticed when running hackbench. This is root caused to >> calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the >> tracepoint code. Following a long discussion on the list [1] about this, >> we concluded that srcu is a better alternative for use during rcu idle. >> Although it does involve extra barriers, its lighter than the sched-rcu >> version which has to do additional RCU calls to notify RCU idle about >> entry into RCU sections. >> >> In this patch, we change the underlying implementation of the >> trace_*_rcuidle API to use SRCU. This has shown to improve performance >> alot for the high frequency irq enable/disable tracepoints. >> >> Test: Tested idle and preempt/irq tracepoints. >> >> Here are some performance numbers: >> >> With a run of the following 30 times on a single core x86 Qemu instance >> with 1GB memory: >> hackbench -g 4 -f 2 -l 3000 >> >> Completion times in seconds. CONFIG_PROVE_LOCKING=y. >> >> No patches (without this series) >> Mean: 3.048 >> Median: 3.025 >> Std Dev: 0.064 >> >> With Lockdep using irq tracepoints with RCU implementation: >> Mean: 3.451 (-11.66 %) >> Median: 3.447 (-12.22%) >> Std Dev: 0.049 >> >> With Lockdep using irq tracepoints with SRCU implementation (this series): >> Mean: 3.020 (I would consider the improvement against the "without >> this series" case as just noise). >> Median: 3.013 >> Std Dev: 0.033 >> >> [1] https://patchwork.kernel.org/patch/10344297/ >> >> Cc: Steven Rostedt >> Cc: Peter Zilstra >> Cc: Ingo Molnar >> Cc: Mathieu Desnoyers >> Cc: Tom Zanussi >> Cc: Namhyung Kim >> Cc: Thomas Glexiner >> Cc: Boqun Feng >> Cc: Paul McKenney >> Cc: Frederic Weisbecker >> Cc: Randy Dunlap >> Cc: Masami Hiramatsu >> Cc: Fenguang Wu >> Cc: Baohong Liu >> Cc: Vedang Patel >> Cc: kernel-team@android.com >> Signed-off-by: Joel Fernandes (Google) >> --- >> include/linux/tracepoint.h | 46 +++++++++++++++++++++++++++++++------- >> kernel/tracepoint.c | 15 ++++++++++++- >> 2 files changed, 52 insertions(+), 9 deletions(-) >> >> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h >> index c94f466d57ef..f56f290cf8eb 100644 >> --- a/include/linux/tracepoint.h >> +++ b/include/linux/tracepoint.h >> @@ -15,6 +15,7 @@ >> */ >> >> #include >> +#include >> #include >> #include >> #include >> @@ -33,6 +34,8 @@ struct trace_eval_map { >> >> #define TRACEPOINT_DEFAULT_PRIO 10 >> >> +extern struct srcu_struct tracepoint_srcu; >> + >> extern int >> tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data); >> extern int >> @@ -77,6 +80,9 @@ int unregister_tracepoint_module_notifier(struct >> notifier_block *nb) >> */ >> static inline void tracepoint_synchronize_unregister(void) >> { >> +#ifdef CONFIG_TRACEPOINTS >> + synchronize_srcu(&tracepoint_srcu); >> +#endif >> synchronize_sched(); >> } >> >> @@ -129,18 +135,38 @@ extern void syscall_unregfunc(void); >> * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just >> * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto". >> */ >> -#define __DO_TRACE(tp, proto, args, cond, rcucheck) \ >> +#define __DO_TRACE(tp, proto, args, cond, rcuidle) \ >> do { \ >> struct tracepoint_func *it_func_ptr; \ >> void *it_func; \ >> void *__data; \ >> + int __maybe_unused idx = 0; \ >> \ >> if (!(cond)) \ >> return; \ >> - if (rcucheck) \ >> - rcu_irq_enter_irqson(); \ >> - rcu_read_lock_sched_notrace(); \ >> - it_func_ptr = rcu_dereference_sched((tp)->funcs); \ >> + \ >> + /* \ >> + * For rcuidle callers, use srcu since sched-rcu \ >> + * doesn't work from the idle path. \ >> + */ \ >> + if (rcuidle) { \ >> + if (in_nmi()) { \ >> + WARN_ON_ONCE(1); \ >> + return; /* no srcu from nmi */ \ >> + } \ >> + \ >> + idx = srcu_read_lock_notrace(&tracepoint_srcu); \ >> + it_func_ptr = \ >> + srcu_dereference_notrace((tp)->funcs, \ >> + &tracepoint_srcu); \ >> + /* To keep it consistent with !rcuidle path */ \ >> + preempt_disable_notrace(); \ >> + } else { \ >> + rcu_read_lock_sched_notrace(); \ >> + it_func_ptr = \ >> + rcu_dereference_sched((tp)->funcs); \ >> + } \ >> + \ >> if (it_func_ptr) { \ >> do { \ >> it_func = (it_func_ptr)->func; \ >> @@ -148,9 +174,13 @@ extern void syscall_unregfunc(void); >> ((void(*)(proto))(it_func))(args); \ >> } while ((++it_func_ptr)->func); \ >> } \ >> - rcu_read_unlock_sched_notrace(); \ >> - if (rcucheck) \ >> - rcu_irq_exit_irqson(); \ >> + \ >> + if (rcuidle) { \ > > Don't we also need an in_nmi() check here in order to avoid unbalanced > srcu_read_unlock_notrace() calls? AFAIU the "return;" in the if (in_nmi()) branch above takes care of never executing the following code. diff appears to be a bit confused by the preprocessor macros, but in reality this is all part of the same static inline function. Thanks, Mathieu > > Thanx, Paul > >> + preempt_enable_notrace(); \ >> + srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ >> + } else { \ >> + rcu_read_unlock_sched_notrace(); \ >> + } \ >> } while (0) >> >> #ifndef MODULE >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c >> index 671b13457387..2089f579f790 100644 >> --- a/kernel/tracepoint.c >> +++ b/kernel/tracepoint.c >> @@ -31,6 +31,9 @@ >> extern struct tracepoint * const __start___tracepoints_ptrs[]; >> extern struct tracepoint * const __stop___tracepoints_ptrs[]; >> >> +DEFINE_SRCU(tracepoint_srcu); >> +EXPORT_SYMBOL_GPL(tracepoint_srcu); >> + >> /* Set to 1 to enable tracepoint debug output */ >> static const int tracepoint_debug; >> >> @@ -67,16 +70,26 @@ static inline void *allocate_probes(int count) >> return p == NULL ? NULL : p->probes; >> } >> >> -static void rcu_free_old_probes(struct rcu_head *head) >> +static void srcu_free_old_probes(struct rcu_head *head) >> { >> kfree(container_of(head, struct tp_probes, rcu)); >> } >> >> +static void rcu_free_old_probes(struct rcu_head *head) >> +{ >> + call_srcu(&tracepoint_srcu, head, srcu_free_old_probes); >> +} >> + >> static inline void release_probes(struct tracepoint_func *old) >> { >> if (old) { >> struct tp_probes *tp_probes = container_of(old, >> struct tp_probes, probes[0]); >> + /* >> + * Tracepoint probes are protected by both sched RCU and SRCU, >> + * by calling the SRCU callback in the sched RCU callback we >> + * cover both cases. So lets chain the SRCU and RCU callbacks. >> + */ >> call_rcu_sched(&tp_probes->rcu, rcu_free_old_probes); >> } >> } >> -- >> 2.17.0.441.gb46fe60e1d-goog -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com