From: Andy Lutomirski
Date: Mon, 30 Jun 2014 15:09:17 -0700
Subject: Re: [PATCH RFC net-next 03/14] bpf: introduce syscall(BPF, ...) and BPF maps
To: Alexei Starovoitov
Cc: "David S. Miller", Ingo Molnar, Linus Torvalds, Steven Rostedt, Daniel Borkmann, Chema Gonzalez, Eric Dumazet, Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa, Thomas Gleixner, "H. Peter Anvin", Andrew Morton, Kees Cook, Linux API, Network Development, "linux-kernel@vger.kernel.org"
References: <1403913966-4927-1-git-send-email-ast@plumgrid.com> <1403913966-4927-4-git-send-email-ast@plumgrid.com>

On Sat, Jun 28, 2014 at 11:36 PM, Alexei Starovoitov wrote:
> On Sat, Jun 28, 2014 at 6:52 PM, Andy Lutomirski wrote:
>> On Sat, Jun 28, 2014 at 1:49 PM, Alexei Starovoitov wrote:
>>>
>>> Sorry I don't like 'fd' direction at all.
>>> 1. it will make the whole thing very socket specific and 'net' dependent.
>>> but the goal here is to be able to use eBPF for tracing in embedded
>>> setups. So it's gotta be net independent.
>>> 2. sockets are already overloaded with all sorts of stuff. Adding more
>>> types of sockets will complicate it a lot.
>>> 3. and most important. read/write operations on sockets are not
>>> done every nanosecond, whereas lookup operations on bpf maps
>>> are done every dozen instructions, so we cannot have any overhead
>>> when accessing maps.
>>> In other words the verifier is done as static analyzer. I moved all
>>> the complexity to verify time, so at run-time the programs are as
>>> fast as possible. I'm strongly against run-time checks in critical path,
>>> since they kill performance and make the whole approach a lot less usable.
>>
>> I may have described my suggestion poorly. I'm suggesting that all of
>> these global ids be replaced *for userspace's benefit* with fds. That
>> is, a map would have an associated struct inode, and, when you load an
>> eBPF program, you'd pass fds into the kernel instead of global ids.
>> The kernel would still compile the eBPF program to use the global ids,
>> though.
>
> Hmm. If I understood you correctly, you're suggesting to do it similar
> to ipc/mqueue, shmem, sockets do. By registering and mounting
> a file system and providing all superblock and inode hooks… and
> probably have its own namespace type… hmm… may be. That's
> quite a bit of work to put lightly. As I said in the other email the first
> step is root only and all these complexity just not worth doing
> at this stage.

The downside of not doing it right away is that it's harder to
retrofit in without breaking early users.

You might be able to get away with using anon_inodes. That will
prevent reopening via /proc/self/fd from working (I think), but that's
a good thing until someone fixes the /proc reopen hole. Sigh.

--Andy