Date: Mon, 2 Jun 2014 12:02:03 -0700
Subject: Re: [PATCH v2 net-next 0/2] split BPF out of core networking
From: Alexei Starovoitov
To: Daniel Borkmann
Cc: "David S. Miller", Ingo Molnar, Steven Rostedt, Chema Gonzalez,
    Eric Dumazet, Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa,
    Thomas Gleixner, "H. Peter Anvin", Andrew Morton, Kees Cook,
    Network Development, LKML
In-Reply-To: <538CAEA6.4060307@redhat.com>
References: <1401692506-7796-1-git-send-email-ast@plumgrid.com>
    <538C3C94.3080206@redhat.com> <538CAEA6.4060307@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 2, 2014 at 10:04 AM, Daniel Borkmann wrote:
> On 06/02/2014 05:41 PM, Alexei Starovoitov wrote:
> ...
>
>> Glad you brought up this point :)
>> 100% agree that current double verification done by seccomp is far from
>> being generic and quite hard to maintain, since any change done to
>> classic BPF verifier needs to be thought through from
>> seccomp_check_filter() perspective as well.
>
> Glad we're on the same page.
>
>> BPF's input context, set of allowed calls need to be expressed in a
>> generic way.
>> Obviously this split by itself won't make classic BPF all of a sudden
>> generic.
>> It rather defines a boundary of eBPF core.
>
> Note, I'm not at all against using it in tracing, I think it's probably
> a good idea, but shouldn't we _first_ think about how to overcome such
> deficits as above by improving upon its in-kernel API design, thus to
> better prepare it to be generic?
> I feel this step is otherwise just skipped and quickly 'hacked'
> around ... ;)

Are you talking about classic 'deficit' or eBPF 'deficit'?
Classic has all sorts of hard-coded assumptions. The whole concept of
'load from magic constant' to mean different things is flawed.
We all got used to it and now think that it's normal for
"ld_abs -4056" to mean "a ^= x".

This split is not trying to make classic easier to hack.
With eBPF underneath classic, it got a lot easier to add extensions to
classic, but we shouldn't be doing it.
Classic BPF is not generic and cannot become generic. That's eBPF's job.

The split mainly helps to clearly see the boundary of eBPF core
vs its socket use case. It doesn't change or add any API.

We need to carefully design eBPF APIs when we expose them to user space.
I have a proposal for that too, but that's a separate discussion.

In terms of in-kernel eBPF API there is nothing to be done.
eBPF program 'prog' is generated by whatever means and then:

  struct sk_filter *fp;

  fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL);
  memcpy(fp->insnsi, prog, prog_len * sizeof(fp->insnsi[0]));
  fp->len = prog_len;

  sk_filter_select_runtime(fp); // select interpreter or JIT
  SK_RUN_FILTER(fp, ctx);       // run the program
  sk_filter_free(fp);           // free program

That's how sockets, testsuite, seccomp and tracing are doing it.
All have different ways of producing 'prog' and 'prog_len'.
This in-kernel API cleanup was done in commit 5fe821a9dee2.
You even acked it back then :)

If you're referring to an eBPF verifier in-kernel API then yeah, it's
missing, just like the whole eBPF verifier :)
Ideally any kernel component that generates eBPF on the fly sends the
eBPF program to the verifier first, just to double-check that the
generated program is valid.
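
To make the 'load from magic constant' point above concrete, here is a
small user-space sketch of the arithmetic behind "ld_abs -4056". The two
constants mirror the values in the uapi header <linux/filter.h>; the enum
and classify_ld_abs() are hypothetical names used only for illustration
and are not kernel code. Only the one ancillary slot mentioned in the
mail is decoded.

```c
#include <assert.h>

/* Values copied from the uapi header <linux/filter.h> so that this
 * sketch builds without kernel headers installed. */
#define SKF_AD_OFF        (-0x1000) /* base of the special "ancillary" range */
#define SKF_AD_ALU_XOR_X  40        /* ancillary slot that means "A ^= X" */

/* -4096 + 40 == -4056: the magic constant quoted in the mail. */
static_assert(SKF_AD_OFF + SKF_AD_ALU_XOR_X == -4056,
	      "ld_abs -4056 is the XOR_X ancillary slot");

/* Hypothetical classification of what a classic "ld_abs k" does. */
enum ld_abs_kind {
	LD_ABS_PACKET,        /* ordinary load from packet offset k */
	LD_ABS_XOR_X,         /* not a load at all: a ^= x */
	LD_ABS_OTHER_SPECIAL, /* some other negative, special-cased offset */
};

/* Illustrative helper (assumption, not a kernel function): the same
 * instruction changes meaning purely based on the value of k. */
enum ld_abs_kind classify_ld_abs(int k)
{
	if (k >= 0)
		return LD_ABS_PACKET;
	if (k == SKF_AD_OFF + SKF_AD_ALU_XOR_X)
		return LD_ABS_XOR_X;
	return LD_ABS_OTHER_SPECIAL;
}
```

For example, classify_ld_abs(14) is an ordinary packet load, while
classify_ld_abs(-4056) yields LD_ABS_XOR_X, even though both would be
encoded with the same "ld_abs" opcode.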