Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp105061imu; Mon, 26 Nov 2018 08:29:30 -0800 (PST) X-Google-Smtp-Source: AFSGD/XfEQgctOsxfJ9UHVsSl3FtONDz8iHAjSjYfMOu1GNXJbaIShOlEcO7E6ySdVuNlaZfmJi2 X-Received: by 2002:a63:a002:: with SMTP id r2mr25130725pge.212.1543249770675; Mon, 26 Nov 2018 08:29:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543249770; cv=none; d=google.com; s=arc-20160816; b=XdjTMfyZTtLgnapNZGirhd5/8WldSTjQLeW9U6d3GQqsfYctLm6ptrpnMsxtnpAYe4 aEsfrEcE0huSmDm5G3gkcW4xu+RhMW2Gh27/3/IqjyNSVzlkWOaAFiVP91JKEi2YGMgq P98Dm4SN4CBWbmSX3HqAfempjrZj7f1yzTGpOMdEM7W2Fv6g9vArTG9iXjgSVAIMnWp+ A91Sjk00xGRPqy0A617MUr4kDZ4MvZ3in6E9/TLbIgdsvXDIwcM0ikI4PjjQktfJ8KMU 3anJUv2s6iz3vWCWC6w5V8aI33vKV7a3h75GWFMI0HkPlJS5W9BOFkGcpiFbBDJQtfOM /5Zg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=puKZ1K0y+PJC7EjfvZFPypMjMgaA8TU3JoMWKcEge08=; b=WNjLtfE5uoAhGWm9LaSAPUU0NzSLb9JiAng0+xyvHKFZ5qq+yCRJRPEYDbRBQ1z+T4 wWTq+uSXeFry4Gc+899Ih3Ja/Q2Zz8gHwHxwCv0FKZ/gqWIoSt7j+kKD964NO+nP15ij kuQ5R/2v3auGD7LKQ44BwQSY8Q3GMUdRdItOyN5Xi30a56SRLP6jnivtn6DUMZtRGTzu v8uxss+w+VJswIR80W+ALrUMhp7SU1EAsLNdlqB9pmV4q3Mrz4KxB8rWLIDpozDYO8/A SvkGSrqsb145gOwFTS/XQgKOHC0K1Nc+TxHpU4rpwnouONNybOkMFftN2kCfOKDDhzdX APnA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y6si757834pfi.228.2018.11.26.08.28.33; Mon, 26 Nov 2018 08:29:30 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726548AbeK0DUn (ORCPT + 99 others); Mon, 26 Nov 2018 22:20:43 -0500 Received: from mail.kernel.org ([198.145.29.99]:42576 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726248AbeK0DUm (ORCPT ); Mon, 26 Nov 2018 22:20:42 -0500 Received: from gandalf.local.home (cpe-66-24-56-78.stny.res.rr.com [66.24.56.78]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 57BA420664; Mon, 26 Nov 2018 16:26:05 +0000 (UTC) Date: Mon, 26 Nov 2018 11:26:03 -0500 From: Steven Rostedt To: Masami Hiramatsu Cc: Joel Fernandes , linux-kernel@vger.kernel.org, Ingo Molnar , Andrew Morton , Thomas Gleixner , Peter Zijlstra , Josh Poimboeuf , Frederic Weisbecker , Andy Lutomirski , Mark Rutland Subject: Re: [RFC][PATCH 11/14] function_graph: Convert ret_stack to a series of longs Message-ID: <20181126112603.6c5519dd@gandalf.local.home> In-Reply-To: <20181127010755.0f897c13a57315a3859d225b@kernel.org> References: <20181122012708.491151844@goodmis.org> <20181122012804.122411256@goodmis.org> <20181124053138.GA242510@google.com> <20181127010755.0f897c13a57315a3859d225b@kernel.org> X-Mailer: Claws Mail 3.16.0 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 27 Nov 2018 01:07:55 +0900 Masami Hiramatsu wrote: > > > --- a/include/linux/sched.h > > > +++ b/include/linux/sched.h > > > @@ -1119,7 +1119,7 @@ struct task_struct { > > > int curr_ret_depth; > > > > > > /* Stack of return addresses for return function tracing: */ > > > - struct ftrace_ret_stack *ret_stack; > > > + unsigned long *ret_stack; > > > > > > /* Timestamp for last schedule: */ > > > unsigned long long ftrace_timestamp; > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > > index 9b85638ecded..1389fe39f64c 100644 > > > --- a/kernel/trace/fgraph.c > > > +++ b/kernel/trace/fgraph.c > > > @@ -23,6 +23,17 @@ > > > #define ASSIGN_OPS_HASH(opsname, val) > > > #endif > > > > > > +#define FGRAPH_RET_SIZE (sizeof(struct ftrace_ret_stack)) > > > +#define FGRAPH_RET_INDEX (ALIGN(FGRAPH_RET_SIZE, sizeof(long)) / sizeof(long)) > > > +#define SHADOW_STACK_SIZE (FTRACE_RETFUNC_DEPTH * FGRAPH_RET_SIZE) > > > +#define SHADOW_STACK_INDEX \ > > > + (ALIGN(SHADOW_STACK_SIZE, sizeof(long)) / sizeof(long)) > > > +#define SHADOW_STACK_MAX_INDEX (SHADOW_STACK_INDEX - FGRAPH_RET_INDEX) > > > + > > > +#define RET_STACK(t, index) ((struct ftrace_ret_stack *)(&(t)->ret_stack[index])) > > > +#define RET_STACK_INC(c) ({ c += FGRAPH_RET_INDEX; }) > > > +#define RET_STACK_DEC(c) ({ c -= FGRAPH_RET_INDEX; }) > > > + > > [...] > > > @@ -514,7 +531,7 @@ void ftrace_graph_init_task(struct task_struct *t) > > > > > > void ftrace_graph_exit_task(struct task_struct *t) > > > { > > > - struct ftrace_ret_stack *ret_stack = t->ret_stack; > > > + unsigned long *ret_stack = t->ret_stack; > > > > > > t->ret_stack = NULL; > > > /* NULL must become visible to IRQs before we free it: */ > > > @@ -526,12 +543,10 @@ void ftrace_graph_exit_task(struct task_struct *t) > > > /* Allocate a return stack for each task */ > > > static int start_graph_tracing(void) > > > { > > > - struct ftrace_ret_stack **ret_stack_list; > > > + unsigned long **ret_stack_list; > > > int ret, cpu; > > > > > > - ret_stack_list = kmalloc_array(FTRACE_RETSTACK_ALLOC_SIZE, > > > - sizeof(struct ftrace_ret_stack *), > > > - GFP_KERNEL); > > > + ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL); > > > > > > > I had dumped the fgraph size related macros to understand the patch better, I > > got: > > [ 0.909528] val of FGRAPH_RET_SIZE is 40 > > [ 0.910250] val of FGRAPH_RET_INDEX is 5 > > [ 0.910866] val of FGRAPH_ARRAY_SIZE is 16 > > [ 0.911488] val of FGRAPH_ARRAY_MASK is 255 > > [ 0.912134] val of FGRAPH_MAX_INDEX is 16 > > [ 0.912751] val of FGRAPH_INDEX_SHIFT is 8 > > [ 0.913382] val of FGRAPH_FRAME_SIZE is 168 > > [ 0.914033] val of FGRAPH_FRAME_INDEX is 21 > > FTRACE_RETFUNC_DEPTH is 50 > > [ 0.914686] val of SHADOW_STACK_SIZE is 8400 > > > > I had a concern about memory overhead per-task. It seems the total memory > > needed per task for the stack is 8400 bytes (with my configuration with > > FUNCTION_PROFILE > > turned off). > > > > Where as before it would be 32 * 40 = 1280 bytes. That looks like ~7 times > > more than before. > > Hmm, this seems too big... I thought the shadow-stack size should be > smaller than 1 page (4kB). Steve, can we give a 4k page for shadow stack > and define FTRACE_RETFUNC_DEPTH = 4096 / FGRAPH_RET_SIZE ? For the first pass, I decided not to worry about the size. It made the code less complex :-) Yes, I plan on working on making the size of the stack smaller, but that will probably be added on patches to do so. > > > On my system with ~4000 threads, that becomes ~32MB which seems a bit > > wasteful especially if there was only one or 2 function graph callbacks > > registered and most of the callback array in the stack isn't used. Note, all 4000 threads could be doing those trace backs, and if you are doing full function graph tracing, it will use a lot. > > > > Could we make the array size configurable at compile time and start it with a > > small number like 4 or 6? > > Or, we can introduce online setting :) Yes, that can easily be added. I didn't try to make this into the perfect solution, I wanted a solid one first, and then massage it into something that is more efficient, both with memory consumption and performance. Joel and Masami, thanks for the feedback. -- Steve