Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933560AbcDESRB (ORCPT ); Tue, 5 Apr 2016 14:17:01 -0400 Received: from casper.infradead.org ([85.118.1.10]:43980 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758499AbcDESRA (ORCPT ); Tue, 5 Apr 2016 14:17:00 -0400 Date: Tue, 5 Apr 2016 20:16:54 +0200 From: Peter Zijlstra To: Alexei Starovoitov Cc: Steven Rostedt , "David S . Miller" , Ingo Molnar , Daniel Borkmann , Arnaldo Carvalho de Melo , Wang Nan , Josef Bacik , Brendan Gregg , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH net-next 2/8] perf, bpf: allow bpf programs attach to tracepoints Message-ID: <20160405181654.GR3408@twins.programming.kicks-ass.net> References: <1459831974-2891931-1-git-send-email-ast@fb.com> <1459831974-2891931-3-git-send-email-ast@fb.com> <20160405141857.GN3448@twins.programming.kicks-ass.net> <5703FF5A.1040707@fb.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5703FF5A.1040707@fb.com> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1525 Lines: 34 On Tue, Apr 05, 2016 at 11:09:30AM -0700, Alexei Starovoitov wrote: > >>@@ -67,6 +69,14 @@ perf_trace_##call(void *__data, proto) \ > >> \ > >> { assign; } \ > >> \ > >>+ if (prog) { \ > >>+ *(struct pt_regs **)entry = __regs; \ > >>+ if (!trace_call_bpf(prog, entry) || hlist_empty(head)) { \ > >>+ perf_swevent_put_recursion_context(rctx); \ > >>+ return; \ > >>+ } \ > >>+ memset(&entry->ent, 0, sizeof(entry->ent)); \ > > > >But if not, you destroy it and then feed it to perf? > > yes. If bpf prog returns 1 the buffer goes into normal ring-buffer > with all perf_event attributes and so on. > So far there wasn't a single real use case where we went this path. > Programs always do aggregation inside and pass stuff to user space > either via bpf maps or via bpf_perf_event_output() helper. > I wanted to keep perf_trace_xx() calls to be minimal in .text size > so memset above is one x86 instruction, but I don't mind > replacing this memset with a call to a helper function that will do: > local_save_flags(flags); > tracing_generic_entry_update(entry, flags, preempt_count()); > entry->type = type; > Then whether bpf attached or not the ring buffer will see the same > raw tracepoint entry. You think it's cleaner? Yeah, otherwise you get very weird and surprising behaviour. Also, one possible use-case is dynamic filters where the BPF program is basically used to filter events, although I suppose we already have a hook for that elsewhere.