Received: by 2002:a89:d88:0:b0:1fa:5c73:8e2d with SMTP id eb8csp1786876lqb; Sun, 26 May 2024 18:16:40 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUViIRCZAJo/PN0OUpPhP0jf25CtAmHs83Lw4ucX7WmBwXpurg1mmTY0kvbqAZpyFwB0irTjJsRim4bkBp5JqxsU518/0KIkAsqpISvSQ== X-Google-Smtp-Source: AGHT+IG2dYh9WQw+urC8HLUc7zGDUxw4DXb64G+pbCimzPEr3kWNhPah2L3wtVXnT4xwxQXiWMkU X-Received: by 2002:a81:6d82:0:b0:61b:2b7:27d8 with SMTP id 00721157ae682-62a08dd8cedmr74666657b3.23.1716772600620; Sun, 26 May 2024 18:16:40 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1716772600; cv=pass; d=google.com; s=arc-20160816; b=tSC3WVEyqlaDxv4DFw1Jt1UuVY84MRzr+iSVJe+ztC9eLOMfNpg9NLYVyGUU4ulOKJ 2a6rSmSfy3wNISJ24dHYcj7xDql6Yhaq4yUeXl0OG7DOc5tnbT1FnSd/ccIu3RKue3Pn vVUiVFSD30GZ/lLn7J7gANMS7O9TQFKcXLkfvCtRGsIqs0vvy92W/v2UsdNusqjmX5hX xrvqPWmQrv1t7k3HN1FzLicCVqFzyLO/7sY+CFSr10E3xEgOOJwgI90lnQmhjQbuXXhB g//0lmggNqTqZ/kq/aGzCqfxhT+jGyVOiwDbtDwsjk9+nCtHgDsk9By5Seab7z+6uHq4 F2kA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :subject:cc:to:from:date; bh=n7ZlfxTwSewQa8S+I9ouVqK7m/AsdOqX6rfv3nDx2zY=; fh=CONbk9IM1KeoX7M3ojYzAuH3rM9roTZPm7jJ9UBDXA8=; b=FTVVTgR+mMlZG4SfJYKX/yG4jWTh48PtRYsGXMyq4927fmWNeiusp903ztz8lTUcTR a1cAl4HWfRjSKDGeVYkocAD0tHO0vbbt01Z2+j1HaG+C7d1FxzxfGNxIM3BydX8z+Npt ud4GcVAln9pCHdm5Bb1wa+cUXlcD9WsgXhVNzX8YJt2CK6D3SmNjy8SUJPqwszbYf0tB drrzGlPo7kvl3/Odo8QLuCkY+7giB9cGYcVZ6tBeqoWkHuCqIpj0oGoVt0QtMRQYE0Yr aBl4OD0MgQ8qw3ozq3YUX+B8GPx1ICaMzm+VHwOrMMg+ihVUcxoOR9q0gGFndC4pyi9O 4rsw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of linux-kernel+bounces-189925-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-189925-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id 6a1803df08f44-6ac1123aa8csi71143876d6.323.2024.05.26.18.16.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 May 2024 18:16:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-189925-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of linux-kernel+bounces-189925-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-189925-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 4E1411C20E5E for ; Mon, 27 May 2024 01:16:40 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2C64633C8; Mon, 27 May 2024 01:16:31 +0000 (UTC) Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 642C038C; Mon, 27 May 2024 01:16:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716772590; cv=none; b=ZRqYFAlwl0+GtTa3jJUtIBUGA2nplMz7RVFi/FmnSY11EvIjRXS42THK5jmzbxKOo5eiCMG6viApPGcI1xG45m23/lgoLBFwU/eWtx+IahqOhbMQfq3gRM8+/RGV2j4ZM5AYTU7oc+FXcKOUzkawf46ozCS+osLEf2JJEBrD89g= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716772590; c=relaxed/simple; bh=hgb0zfnLV7BoxmFrRxX+n/XHKuKXwMbq4hFC46FTxtc=; h=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=YmEKI64PTyMaqduvJwWyWcfoKSAq2UCHzx+hw3BbS8ui+2h1XI5M+ytrv10NbnwJ3TZ8WFDYxHTpnUE3oUvalum0iE/cGOyoqePkHXNTTClhLIXca5ofJ5wVIaXD5n4t6LtHrn11sLaw+qjCh+keiJXHaoeARoGc+m3JY44Zx88= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id A8F26C2BD10; Mon, 27 May 2024 01:16:27 +0000 (UTC) Date: Sun, 26 May 2024 21:17:19 -0400 From: Steven Rostedt To: "Masami Hiramatsu (Google)" Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, Mark Rutland , Mathieu Desnoyers , Andrew Morton , Alexei Starovoitov , Florent Revest , Martin KaFai Lau , bpf , Sven Schnelle , Alexei Starovoitov , Jiri Olsa , Arnaldo Carvalho de Melo , Daniel Borkmann , Alan Maguire , Peter Zijlstra , Thomas Gleixner , Guo Ren Subject: Re: [PATCH 04/20] function_graph: Allow multiple users to attach to function graph Message-ID: <20240526211719.0c4c2835@gandalf.local.home> In-Reply-To: <20240527093436.7060d358a64cc2ea3213b07b@kernel.org> References: <20240525023652.903909489@goodmis.org> <20240525023741.836661178@goodmis.org> <20240527093436.7060d358a64cc2ea3213b07b@kernel.org> X-Mailer: Claws Mail 3.20.0git84 (GTK+ 2.24.33; x86_64-pc-linux-gnu) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit On Mon, 27 May 2024 09:34:36 +0900 Masami Hiramatsu (Google) wrote: > > @@ -110,11 +253,13 @@ void ftrace_graph_stop(void) > > /* Add a function return address to the trace stack on thread info.*/ > > static int > > ftrace_push_return_trace(unsigned long ret, unsigned long func, > > - unsigned long frame_pointer, unsigned long *retp) > > + unsigned long frame_pointer, unsigned long *retp, > > + int fgraph_idx) > > We do not need this fgraph_idx parameter anymore because this removed > reuse-frame check. Agreed. Will remove. > > > { > > struct ftrace_ret_stack *ret_stack; > > unsigned long long calltime; > > - int index; > > + unsigned long val; > > + int offset; > > > > if (unlikely(ftrace_graph_is_dead())) > > return -EBUSY; > > @@ -124,24 +269,57 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, > > > > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long)); > > > > + /* Set val to "reserved" with the delta to the new fgraph frame */ > > + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; > > + > > /* > > * We must make sure the ret_stack is tested before we read > > * anything else. > > */ > > smp_rmb(); > > > > - /* The return trace stack is full */ > > - if (current->curr_ret_stack >= SHADOW_STACK_MAX_INDEX) { > > + /* > > + * Check if there's room on the shadow stack to fit a fraph frame > > + * and a bitmap word. > > + */ > > + if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= SHADOW_STACK_MAX_OFFSET) { > > atomic_inc(¤t->trace_overrun); > > return -EBUSY; > > } > > > > calltime = trace_clock_local(); > > > > - index = current->curr_ret_stack; > > - RET_STACK_INC(current->curr_ret_stack); > > - ret_stack = RET_STACK(current, index); > > + offset = READ_ONCE(current->curr_ret_stack); > > + ret_stack = RET_STACK(current, offset); > > + offset += FGRAPH_FRAME_OFFSET; > > + > > + /* ret offset = FGRAPH_FRAME_OFFSET ; type = reserved */ > > + current->ret_stack[offset] = val; > > + ret_stack->ret = ret; > > + /* > > + * The unwinders expect curr_ret_stack to point to either zero > > + * or an offset where to find the next ret_stack. Even though the > > + * ret stack might be bogus, we want to write the ret and the > > + * offset to find the ret_stack before we increment the stack point. > > + * If an interrupt comes in now before we increment the curr_ret_stack > > + * it may blow away what we wrote. But that's fine, because the > > + * offset will still be correct (even though the 'ret' won't be). > > + * What we worry about is the offset being correct after we increment > > + * the curr_ret_stack and before we update that offset, as if an > > + * interrupt comes in and does an unwind stack dump, it will need > > + * at least a correct offset! > > + */ > > barrier(); > > + WRITE_ONCE(current->curr_ret_stack, offset + 1); > > + /* > > + * This next barrier is to ensure that an interrupt coming in > > + * will not corrupt what we are about to write. > > + */ > > + barrier(); > > + > > + /* Still keep it reserved even if an interrupt came in */ > > + current->ret_stack[offset] = val; > > + > > ret_stack->ret = ret; > > ret_stack->func = func; > > ret_stack->calltime = calltime; > > @@ -151,7 +329,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, > > #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR > > ret_stack->retp = retp; > > #endif > > - return 0; > > + return offset; > > } > > > > /* > > @@ -168,49 +346,67 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, > > # define MCOUNT_INSN_SIZE 0 > > #endif > > > > +/* If the caller does not use ftrace, call this function. */ > > int function_graph_enter(unsigned long ret, unsigned long func, > > unsigned long frame_pointer, unsigned long *retp) > > { > > struct ftrace_graph_ent trace; > > + unsigned long bitmap = 0; > > + int offset; > > + int i; > > > > trace.func = func; > > trace.depth = ++current->curr_ret_depth; > > > > - if (ftrace_push_return_trace(ret, func, frame_pointer, retp)) > > + offset = ftrace_push_return_trace(ret, func, frame_pointer, retp, 0); > > + if (offset < 0) > > goto out; > > > > - /* Only trace if the calling function expects to */ > > - if (!fgraph_array[0]->entryfunc(&trace)) > > + for (i = 0; i < fgraph_array_cnt; i++) { > > + struct fgraph_ops *gops = fgraph_array[i]; > > + > > + if (gops == &fgraph_stub) > > + continue; > > + > > + if (gops->entryfunc(&trace)) > > + bitmap |= BIT(i); > > + } > > + > > + if (!bitmap) > > goto out_ret; > > > > + /* > > + * Since this function uses fgraph_idx = 0 as a tail-call checking > > + * flag, set that bit always. > > + */ > > This comment is also out-of-date. > > > + set_bitmap(current, offset, bitmap | BIT(0)); > > And we do not need to set BIT(0) anymore. Right. When looking at your first comment, I did a search for fgraph_idx and noticed this code, and realized it should be removed too. And of course, you noticed it too ;-) > > > + > > return 0; > > out_ret: > > - RET_STACK_DEC(current->curr_ret_stack); > > + current->curr_ret_stack -= FGRAPH_FRAME_OFFSET + 1; > > out: > > current->curr_ret_depth--; > > return -EBUSY; > > } > > > > /* Retrieve a function return address to the trace stack on thread info.*/ > > -static void > > +static struct ftrace_ret_stack * > > ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret, > > - unsigned long frame_pointer) > > + unsigned long frame_pointer, int *offset) > > { > > struct ftrace_ret_stack *ret_stack; > > - int index; > > > > - index = current->curr_ret_stack; > > - RET_STACK_DEC(index); > > + ret_stack = get_ret_stack(current, current->curr_ret_stack, offset); > > > > - if (unlikely(index < 0 || index > SHADOW_STACK_MAX_INDEX)) { > > + if (unlikely(!ret_stack)) { > > ftrace_graph_stop(); > > - WARN_ON(1); > > + WARN(1, "Bad function graph ret_stack pointer: %d", > > + current->curr_ret_stack); > > /* Might as well panic, otherwise we have no where to go */ > > *ret = (unsigned long)panic; > > - return; > > + return NULL; > > } > > > > - ret_stack = RET_STACK(current, index); > > #ifdef HAVE_FUNCTION_GRAPH_FP_TEST > > /* > > * The arch may choose to record the frame pointer used > > @@ -230,26 +426,29 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret, > > ftrace_graph_stop(); > > WARN(1, "Bad frame pointer: expected %lx, received %lx\n" > > " from func %ps return to %lx\n", > > - current->ret_stack[index].fp, > > + ret_stack->fp, > > frame_pointer, > > (void *)ret_stack->func, > > ret_stack->ret); > > *ret = (unsigned long)panic; > > - return; > > + return NULL; > > } > > #endif > > > > + *offset += FGRAPH_FRAME_OFFSET; > > *ret = ret_stack->ret; > > trace->func = ret_stack->func; > > trace->calltime = ret_stack->calltime; > > trace->overrun = atomic_read(¤t->trace_overrun); > > - trace->depth = current->curr_ret_depth--; > > + trace->depth = current->curr_ret_depth; > > /* > > * We still want to trace interrupts coming in if > > * max_depth is set to 1. Make sure the decrement is > > * seen before ftrace_graph_return. > > */ > > barrier(); > > + > > + return ret_stack; > > } > > > > /* > > @@ -287,30 +486,47 @@ struct fgraph_ret_regs; > > static unsigned long __ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs, > > unsigned long frame_pointer) > > { > > + struct ftrace_ret_stack *ret_stack; > > struct ftrace_graph_ret trace; > > + unsigned long bitmap; > > unsigned long ret; > > + int offset; > > + int i; > > + > > + ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > + > > + if (unlikely(!ret_stack)) { > > + ftrace_graph_stop(); > > + WARN_ON(1); > > + /* Might as well panic. What else to do? */ > > + return (unsigned long)panic; > > + } > > > > - ftrace_pop_return_trace(&trace, &ret, frame_pointer); > > + trace.rettime = trace_clock_local(); > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = fgraph_ret_regs_return_value(ret_regs); > > #endif > > - trace.rettime = trace_clock_local(); > > - fgraph_array[0]->retfunc(&trace); > > + > > + bitmap = get_bitmap_bits(current, offset); > > + for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) { > > + struct fgraph_ops *gops = fgraph_array[i]; > > + > > + if (!(bitmap & BIT(i))) > > + continue; > > + if (gops == &fgraph_stub) > > nit: here, we can make this check unlikely() because the above > bitmap check already filtered. (Some sleepable functions leave > the return frame on shadow stack after gops is unregistered. But it > also rare compared with living time.) Sure. Thanks for the review. -- Steve