From: Jinshan Xiong <[email protected]>
Invoke the BPF program only if a kprobe-based perf_event has been added to
the per-cpu list. This is essential to make event tracing for cgroups work
properly: a cgroup-scoped perf_event sits on a CPU's list only while a task
from that cgroup is running there, so performing this check before calling
the BPF program confines tracing to the cgroup of interest.
Signed-off-by: Jinshan Xiong <[email protected]>
---
kernel/trace/trace_kprobe.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 9d483ad9bb6c..40ef0f1945f7 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1171,11 +1171,18 @@ static int
kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
struct trace_event_call *call = trace_probe_event_call(&tk->tp);
+ struct hlist_head *head = this_cpu_ptr(call->perf_events);
struct kprobe_trace_entry_head *entry;
- struct hlist_head *head;
int size, __size, dsize;
int rctx;
+ /*
+	 * If head is empty, the process currently running on this cpu is
+	 * not interested in this kprobe perf event.
+ */
+ if (hlist_empty(head))
+ return 0;
+
if (bpf_prog_array_valid(call)) {
unsigned long orig_ip = instruction_pointer(regs);
int ret;
@@ -1193,10 +1200,6 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
return 0;
}
- head = this_cpu_ptr(call->perf_events);
- if (hlist_empty(head))
- return 0;
-
dsize = __get_data_size(&tk->tp, regs);
__size = sizeof(*entry) + tk->tp.size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
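
For context, a minimal user-space sketch of how a cgroup-scoped kprobe
event ends up on the per-cpu list this patch checks (illustrative only;
error paths are trimmed, and reading the tracefs "id" file for the kprobe
event is assumed to have been done by the caller). With
PERF_FLAG_PID_CGROUP the kernel schedules the event onto a CPU only while
a task from that cgroup is running there, so call->perf_events is
non-empty exactly then, which is what the reordered hlist_empty() check
relies on.

#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/*
 * tracepoint_id: value read from .../tracing/events/kprobes/<event>/id
 * cgroup_fd:     fd of the target cgroup v2 directory
 * cpu:           cgroup-scoped events must be bound to a CPU (cpu >= 0)
 */
static int attach_cgroup_kprobe_bpf(int tracepoint_id, int cgroup_fd,
				    int cpu, int bpf_prog_fd)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_TRACEPOINT;
	attr.size = sizeof(attr);
	attr.config = tracepoint_id;
	attr.sample_period = 1;

	/* with PERF_FLAG_PID_CGROUP, the "pid" argument is a cgroup fd */
	fd = syscall(__NR_perf_event_open, &attr, cgroup_fd, cpu, -1,
		     PERF_FLAG_PID_CGROUP);
	if (fd < 0)
		return -1;

	if (ioctl(fd, PERF_EVENT_IOC_SET_BPF, bpf_prog_fd) < 0 ||
	    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}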
That's bloody true. Thanks for your insights.
I will make an example program and commit it to the bcc repository.
Jinshan
On Wed, Sep 18, 2019 at 1:22 PM Alexei Starovoitov
<[email protected]> wrote:
>
> On Wed, Sep 18, 2019 at 8:13 AM Jinshan Xiong <[email protected]> wrote:
> >
> > The problem with the current approach is that it would be difficult to filter by cgroup, especially when the cgroup in question has descendants and may spawn new ones after the BPF program is installed. It's hard to do that filtering inside a BPF program.
>
> Why is that?
> bpf_current_task_under_cgroup() fits exactly that purpose.
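
For reference, a minimal sketch of the filter Alexei is pointing at (the
probe point, map name, and section names are illustrative, not from this
thread): user space stores an fd of the target cgroup v2 directory at
index 0 of a BPF_MAP_TYPE_CGROUP_ARRAY map, and
bpf_current_task_under_cgroup() then matches that cgroup and every
descendant, including ones created after the program is attached.

#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_CGROUP_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} target_cgroup SEC(".maps");

SEC("kprobe/__x64_sys_openat")	/* placeholder probe point */
int trace_openat(struct pt_regs *ctx)
{
	/*
	 * Returns 1 when current runs in the cgroup at index 0 or any of
	 * its descendants; skip everything else (0, or a negative error
	 * if the map slot is empty).
	 */
	if (bpf_current_task_under_cgroup(&target_cgroup, 0) != 1)
		return 0;

	bpf_printk("hit from target cgroup");
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

Because the helper walks the cgroup hierarchy at run time, no
re-attachment or map update is needed when new child cgroups appear.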