Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp7154441ybi; Mon, 8 Jul 2019 15:45:01 -0700 (PDT) X-Google-Smtp-Source: APXvYqz4MbaVZUKBRrXW9TkVuD0FF2x88exbpuljV394nbGsVYQ8lDvfE7w6F2V6WQNr+kjBdDvv X-Received: by 2002:a17:902:9a95:: with SMTP id w21mr27719633plp.126.1562625901896; Mon, 08 Jul 2019 15:45:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562625901; cv=none; d=google.com; s=arc-20160816; b=b1wzdDKIANcBfHADGqQf9BJcIQuMz/zftL5G/hHghjap/d+NweTndft1k8DbVi1BKk ui8R/KYpR9pWFeqe82APfRzcK/5YfcY4wAUBhp2DIRbEW+QvgL/L/0F+6NV4ALJJhTkp PUT4zJs6rSR3qDrfU1IMHXzX3Vp0MSJcNDAnciTgZSFIiBbgtyKZY4bxxAd7QOUkdnrV j6QRuHKhUKd0Hjoks/iiK132euVxYpnvUab6hL86JsFcBiF4xjUPmGf88+cs5tqbOIiE +uzgEJSDF9B9O0A0fVlzd5GbVxIbKC4SRExmU33FqQXKbQ7w/71H8LfSPMHhitge+PHJ 67FQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=JiyfBxDrUpJwKDj5qRxHMNcBMneqQ8dyhv4/1WSz/nM=; b=haZKhRfD1q0wQxiDUNZ09El9aebCUPLKLA4caX42B0HKbnpJiLyLfjJrMvjEN0ADgm raJ4CV3KTNQke/7zcieREHTVPjAt0Aclq28+IYnDzG5oc9BjiDvZqosB0vqPQxhOfm/9 2ATo8l8BEDUhClKnKa7QX7cYBlISxb0icXdyLAkZcX1jsd+vhsJJtjDmY7Ono0ECx40H za7HUpP0Cqxh52F1non4Oioehruqk0FBIfi4ETlSDT33dVabv56cx3Ori3cfi5lGgJuS XmfgYiXviCz39EiIOEV+XuEN8g1wBKfMVgcY3mPEcMjzCY3j63TPGQGDu3iKC7Sbl4S6 ogYA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kinvolk.io header.s=google header.b=IvFIKtCh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e2si11079477pfn.32.2019.07.08.15.44.47; Mon, 08 Jul 2019 15:45:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kinvolk.io header.s=google header.b=IvFIKtCh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730628AbfGHQvw (ORCPT + 99 others); Mon, 8 Jul 2019 12:51:52 -0400 Received: from mail-lj1-f196.google.com ([209.85.208.196]:32836 "EHLO mail-lj1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390072AbfGHQvv (ORCPT ); Mon, 8 Jul 2019 12:51:51 -0400 Received: by mail-lj1-f196.google.com with SMTP id h10so16645963ljg.0 for ; Mon, 08 Jul 2019 09:51:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kinvolk.io; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=JiyfBxDrUpJwKDj5qRxHMNcBMneqQ8dyhv4/1WSz/nM=; b=IvFIKtChvtNohLTnMgGzxoRkIk+WcVdktFHEk4p20NBTSlwCluMT2Rg8RV1y5cfgKC 9F77cauh2pamKTgc+4qnvO5FE/9K9JuiJ70HP4vepUvSpemgQtlNIiT5WYkuY+ZGKgj5 U4KBeOYrlbAFPAHcM1nfSm7ICNujoPJyL32us= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=JiyfBxDrUpJwKDj5qRxHMNcBMneqQ8dyhv4/1WSz/nM=; b=R0axs3ynKyHo9Y+YrONwy4dN6rMHPI2oOW4rv5jDMgp7mNsbhdOVkTXbPLClCgnGM7 aEk+X7bkwSjg3cRH8c0Lw5B8WKeIixxasq+W0mPOW7Og5pXuZNvRlH519f8hC9Ugxm/+ 4qYjAjkCoI36Z4Vy3G5TK17Zi95Ex4s2kRPfCDEWMmbIvM3OXlBMJ+Og+yj7E1MK+N4s 51Y3dPauNYgWMe7Fa5+fb5dZuI9N1uKII9Vi1p8ZuGh3tKI0IbmV0A+GNbgKfB/oPvZz 3Ig6mzXU8NMzMCBR6Uo30jfLrMMbPE+90wrPAiB29SDYiSLDRqIiRVq2Laj/LeMyY87G 2TPA== X-Gm-Message-State: APjAAAWHY3RCY1lm2BNqDkqDLxNwHBMP4ettCUcbA2gDw2O9Vks8FpOX Xr/nvJy/U3QOcko2L10BM9wK73FFJxyERXPY2oSmTg== X-Received: by 2002:a2e:9754:: with SMTP id f20mr10715874ljj.151.1562604708948; Mon, 08 Jul 2019 09:51:48 -0700 (PDT) MIME-Version: 1.0 References: <20190625194215.14927-1-krzesimir@kinvolk.io> <20190625194215.14927-9-krzesimir@kinvolk.io> <20190625201220.GC10487@mini-arch> <20190626161231.GA4866@mini-arch> In-Reply-To: <20190626161231.GA4866@mini-arch> From: Krzesimir Nowak Date: Mon, 8 Jul 2019 18:51:38 +0200 Message-ID: Subject: Re: [bpf-next v2 08/10] bpf: Implement bpf_prog_test_run for perf event programs To: Stanislav Fomichev Cc: netdev@vger.kernel.org, Alban Crequy , =?UTF-8?Q?Iago_L=C3=B3pez_Galeiras?= , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , linux-kernel@vger.kernel.org, bpf@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 26, 2019 at 6:12 PM Stanislav Fomichev wrote: > > On 06/26, Krzesimir Nowak wrote: > > On Tue, Jun 25, 2019 at 10:12 PM Stanislav Fomichev w= rote: > > > > > > On 06/25, Krzesimir Nowak wrote: > > > > As an input, test run for perf event program takes struct > > > > bpf_perf_event_data as ctx_in and struct bpf_perf_event_value as > > > > data_in. For an output, it basically ignores ctx_out and data_out. > > > > > > > > The implementation sets an instance of struct bpf_perf_event_data_k= ern > > > > in such a way that the BPF program reading data from context will > > > > receive what we passed to the bpf prog test run in ctx_in. Also BPF > > > > program can call bpf_perf_prog_read_value to receive what was passe= d > > > > in data_in. > > > > > > > > Signed-off-by: Krzesimir Nowak > > > > --- > > > > kernel/trace/bpf_trace.c | 107 ++++++++++++++= ++++ > > > > .../bpf/verifier/perf_event_sample_period.c | 8 ++ > > > > 2 files changed, 115 insertions(+) > > > > > > > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c > > > > index c102c240bb0b..2fa49ea8a475 100644 > > > > --- a/kernel/trace/bpf_trace.c > > > > +++ b/kernel/trace/bpf_trace.c > > > > @@ -16,6 +16,8 @@ > > > > > > > > #include > > > > > > > > +#include > > > > + > > > > #include "trace_probe.h" > > > > #include "trace.h" > > > > > > > > @@ -1160,7 +1162,112 @@ const struct bpf_verifier_ops perf_event_ve= rifier_ops =3D { > > > > .convert_ctx_access =3D pe_prog_convert_ctx_access, > > > > }; > > > > > > > > +static int pe_prog_test_run(struct bpf_prog *prog, > > > > + const union bpf_attr *kattr, > > > > + union bpf_attr __user *uattr) > > > > +{ > > > > + void __user *ctx_in =3D u64_to_user_ptr(kattr->test.ctx_in); > > > > + void __user *data_in =3D u64_to_user_ptr(kattr->test.data_in)= ; > > > > + u32 data_size_in =3D kattr->test.data_size_in; > > > > + u32 ctx_size_in =3D kattr->test.ctx_size_in; > > > > + u32 repeat =3D kattr->test.repeat; > > > > + u32 retval =3D 0, duration =3D 0; > > > > + int err =3D -EINVAL; > > > > + u64 time_start, time_spent =3D 0; > > > > + int i; > > > > + struct perf_sample_data sample_data =3D {0, }; > > > > + struct perf_event event =3D {0, }; > > > > + struct bpf_perf_event_data_kern real_ctx =3D {0, }; > > > > + struct bpf_perf_event_data fake_ctx =3D {0, }; > > > > + struct bpf_perf_event_value value =3D {0, }; > > > > + > > > > + if (ctx_size_in !=3D sizeof(fake_ctx)) > > > > + goto out; > > > > + if (data_size_in !=3D sizeof(value)) > > > > + goto out; > > > > + > > > > + if (copy_from_user(&fake_ctx, ctx_in, ctx_size_in)) { > > > > + err =3D -EFAULT; > > > > + goto out; > > > > + } > > > Move this to net/bpf/test_run.c? I have a bpf_ctx_init helper to deal > > > with ctx input, might save you some code above wrt ctx size/etc. > > > > My impression about net/bpf/test_run.c was that it was a collection of > > helpers for test runs of the network-related BPF programs, because > > they are so similar to each other. So kernel/trace/bpf_trace.c looked > > like an obvious place for the test_run implementation since other perf > > trace BPF stuff was already there. > Maybe net/bpf/test_run.c should be renamed to kernel/bpf/test_run.c? Just sent another version of this patch series. I went with slightly different approach - moved some functions to kernel/bpf/test_run.c and left the network specific stuff in net/bpf/test_run.c. > > > And about bpf_ctx_init - looks useful as it seems to me that it > > handles the scenario where the size of the ctx struct grows, but still > > allows passing older version of the struct (thus smaller) from > > userspace for compatibility. Maybe that checking and copying part of > > the function could be moved into some non-static helper function, so I > > could use it and still skip the need for allocating memory for the > > context? > You can always make bpf_ctx_init non-static and export it. > But, again, consider adding your stuff to the net/bpf/test_run.c > and exporting only pe_prog_test_run. That way you can reuse > bpf_ctx_init and bpf_test_run. > > Why do you care about memory allocation though? It's a one time > operation and doesn't affect the performance measurements. > > > > > + if (copy_from_user(&value, data_in, data_size_in)) { > > > > + err =3D -EFAULT; > > > > + goto out; > > > > + } > > > > + > > > > + real_ctx.regs =3D &fake_ctx.regs; > > > > + real_ctx.data =3D &sample_data; > > > > + real_ctx.event =3D &event; > > > > + perf_sample_data_init(&sample_data, fake_ctx.addr, > > > > + fake_ctx.sample_period); > > > > + event.cpu =3D smp_processor_id(); > > > > + event.oncpu =3D -1; > > > > + event.state =3D PERF_EVENT_STATE_OFF; > > > > + local64_set(&event.count, value.counter); > > > > + event.total_time_enabled =3D value.enabled; > > > > + event.total_time_running =3D value.running; > > > > + /* make self as a leader - it is used only for checking the > > > > + * state field > > > > + */ > > > > + event.group_leader =3D &event; > > > > + > > > > + /* slightly changed copy pasta from bpf_test_run() in > > > > + * net/bpf/test_run.c > > > > + */ > > > > + if (!repeat) > > > > + repeat =3D 1; > > > > + > > > > + rcu_read_lock(); > > > > + preempt_disable(); > > > > + time_start =3D ktime_get_ns(); > > > > + for (i =3D 0; i < repeat; i++) { > > > Any reason for not using bpf_test_run? > > > > Two, mostly. One was that it is a static function and my code was > > elsewhere. Second was that it does some cgroup storage setup and I'm > > not sure if the perf event BPF program needs that. > You can always make it non-static. > > Regarding cgroup storage: do we care? If you can see it affecting > your performance numbers, then yes, but you can try to measure to see > if it gives you any noticeable overhead. Maybe add an argument to > bpf_test_run to skip cgroup storage stuff? > > > > > + retval =3D BPF_PROG_RUN(prog, &real_ctx); > > > > + > > > > + if (signal_pending(current)) { > > > > + err =3D -EINTR; > > > > + preempt_enable(); > > > > + rcu_read_unlock(); > > > > + goto out; > > > > + } > > > > + > > > > + if (need_resched()) { > > > > + time_spent +=3D ktime_get_ns() - time_start; > > > > + preempt_enable(); > > > > + rcu_read_unlock(); > > > > + > > > > + cond_resched(); > > > > + > > > > + rcu_read_lock(); > > > > + preempt_disable(); > > > > + time_start =3D ktime_get_ns(); > > > > + } > > > > + } > > > > + time_spent +=3D ktime_get_ns() - time_start; > > > > + preempt_enable(); > > > > + rcu_read_unlock(); > > > > + > > > > + do_div(time_spent, repeat); > > > > + duration =3D time_spent > U32_MAX ? U32_MAX : (u32)time_spent= ; > > > > + /* end of slightly changed copy pasta from bpf_test_run() in > > > > + * net/bpf/test_run.c > > > > + */ > > > > + > > > > + if (copy_to_user(&uattr->test.retval, &retval, sizeof(retval)= )) { > > > > + err =3D -EFAULT; > > > > + goto out; > > > > + } > > > > + if (copy_to_user(&uattr->test.duration, &duration, sizeof(dur= ation))) { > > > > + err =3D -EFAULT; > > > > + goto out; > > > > + } > > > Can BPF program modify fake_ctx? Do we need/want to copy it back? > > > > Reading the pe_prog_is_valid_access function tells me that it's not > > possible - the only type of valid access is read. So maybe I should be > > stricter about the requirements for the data_out and ctx_out sizes > > (should be zero or return -EINVAL). > Yes, better to explicitly prohibit anything that we don't support. > > > > > + err =3D 0; > > > > +out: > > > > + trace_bpf_test_finish(&err); > > > > + return err; > > > > +} > > > > + > > > > const struct bpf_prog_ops perf_event_prog_ops =3D { > > > > + .test_run =3D pe_prog_test_run, > > > > }; > > > > > > > > static DEFINE_MUTEX(bpf_event_mutex); > > > > diff --git a/tools/testing/selftests/bpf/verifier/perf_event_sample= _period.c b/tools/testing/selftests/bpf/verifier/perf_event_sample_period.c > > > > index 471c1a5950d8..16e9e5824d14 100644 > > > > --- a/tools/testing/selftests/bpf/verifier/perf_event_sample_period= .c > > > > +++ b/tools/testing/selftests/bpf/verifier/perf_event_sample_period= .c > > > This should probably go in another patch. > > > > Yeah, I was wondering about it. These changes are here to avoid > > breaking the tests, since perf event program can actually be run now > > and the test_run for perf event required certain sizes for ctx and > > data. > You need to make sure the context is optional, that way you don't break > any existing tests out in the wild and can move those changes to > another patch. > > > So, I will either move them to a separate patch or rework the test_run > > for perf event to accept the size between 0 and sizeof(struct > > something), so the changes in tests maybe will not be necessary. > > > > > > > > > @@ -13,6 +13,8 @@ > > > > }, > > > > .result =3D ACCEPT, > > > > .prog_type =3D BPF_PROG_TYPE_PERF_EVENT, > > > > + .ctx_len =3D sizeof(struct bpf_perf_event_data), > > > > + .data_len =3D sizeof(struct bpf_perf_event_value), > > > > }, > > > > { > > > > "check bpf_perf_event_data->sample_period half load permitted= ", > > > > @@ -29,6 +31,8 @@ > > > > }, > > > > .result =3D ACCEPT, > > > > .prog_type =3D BPF_PROG_TYPE_PERF_EVENT, > > > > + .ctx_len =3D sizeof(struct bpf_perf_event_data), > > > > + .data_len =3D sizeof(struct bpf_perf_event_value), > > > > }, > > > > { > > > > "check bpf_perf_event_data->sample_period word load permitted= ", > > > > @@ -45,6 +49,8 @@ > > > > }, > > > > .result =3D ACCEPT, > > > > .prog_type =3D BPF_PROG_TYPE_PERF_EVENT, > > > > + .ctx_len =3D sizeof(struct bpf_perf_event_data), > > > > + .data_len =3D sizeof(struct bpf_perf_event_value), > > > > }, > > > > { > > > > "check bpf_perf_event_data->sample_period dword load permitte= d", > > > > @@ -56,4 +62,6 @@ > > > > }, > > > > .result =3D ACCEPT, > > > > .prog_type =3D BPF_PROG_TYPE_PERF_EVENT, > > > > + .ctx_len =3D sizeof(struct bpf_perf_event_data), > > > > + .data_len =3D sizeof(struct bpf_perf_event_value), > > > > }, > > > > -- > > > > 2.20.1 > > > > > > > > > > > > -- > > Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364 > > Gesch=C3=A4ftsf=C3=BChrer/Directors: Alban Crequy, Chris K=C3=BChl, Iag= o L=C3=B3pez Galeiras > > Registergericht/Court of registration: Amtsgericht Charlottenburg > > Registernummer/Registration number: HRB 171414 B > > Ust-ID-Nummer/VAT ID number: DE302207000 --=20 Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364 Gesch=C3=A4ftsf=C3=BChrer/Directors: Alban Crequy, Chris K=C3=BChl, Iago L= =C3=B3pez Galeiras Registergericht/Court of registration: Amtsgericht Charlottenburg Registernummer/Registration number: HRB 171414 B Ust-ID-Nummer/VAT ID number: DE302207000