From: Andy Lutomirski
Date: Mon, 19 Dec 2016 17:43:43 -0800
Subject: Re: Potential issues (security and otherwise) with the current cgroup-bpf API
To: Alexei Starovoitov
Cc: Andy Lutomirski, Daniel Mack, Mickaël Salaün, Kees Cook, Jann Horn,
    Tejun Heo, David Ahern, "David S. Miller", Thomas Graf, Michael Kerrisk,
    Peter Zijlstra, Linux API, "linux-kernel@vger.kernel.org",
    Network Development

On Mon, Dec 19, 2016 at 4:25 PM, Andy Lutomirski wrote:
> On Mon, Dec 19, 2016 at 4:02 PM, Alexei Starovoitov wrote:
>> you're ignoring use cases I described earlier.
>> In vrf case there is only one ifindex it needs to bind to.
>
> I'm totally lost.  Can you explain what this has to do with the cgroup
> hierarchy?
>

Okay, I figured out what you mean, I think.  You have a handful of vrf
devices.  Let's say they have ifindexes 1 and 2 (and maybe more).

The interesting case happens when you set up /cgroup/a with a bpf
program that binds new sockets to ifindex 1 and /cgroup/a/b with a bpf
program that binds new sockets to ifindex 2.  The question is: what
should happen if you're in /cgroup/a/b?  Presumably, if you do this,
you wanted to end up with ifindex 2.

I think the way it should actually work is that the kernel evaluates
/cgroup/a/b's hook and then /cgroup/a's hook.
Then /cgroup/a (which is the more privileged hook) gets to make the
choice.  If it wants ifindex 2 to win, it can do (pseudocode):

if (!sk->sk_bound_if)
        sk->sk_bound_if = 1;
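
To make the proposed ordering concrete, here is a minimal userspace sketch, not kernel code.  It assumes hooks run from the innermost cgroup outward, so /cgroup/a sees whatever /cgroup/a/b already chose.  The struct, the hook typedef, and the names hook_b, hook_a_defer, hook_a_force and run_hooks are all hypothetical and exist only for illustration; the field name sk_bound_if is taken from the pseudocode above, with 0 meaning "not bound".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket: only the field used in the
 * pseudocode is modelled.  0 means "not bound to any device". */
struct sock {
	int sk_bound_if;
};

/* A hook is a function that may inspect and modify the new socket. */
typedef void (*cgroup_hook)(struct sock *sk);

/* Hook on /cgroup/a/b: binds every new socket to ifindex 2. */
static void hook_b(struct sock *sk)
{
	sk->sk_bound_if = 2;
}

/* Hook on /cgroup/a that defers to a descendant's choice:
 * it only binds to ifindex 1 if nothing was chosen yet. */
static void hook_a_defer(struct sock *sk)
{
	if (!sk->sk_bound_if)
		sk->sk_bound_if = 1;
}

/* Hook on /cgroup/a that overrides any descendant's choice. */
static void hook_a_force(struct sock *sk)
{
	sk->sk_bound_if = 1;
}

/* Run the hooks for a new socket, leaf cgroup first and ancestors
 * after it (hooks[0] is the innermost cgroup).  A NULL entry models
 * a cgroup with no program attached.  Returns the final ifindex. */
static int run_hooks(const cgroup_hook *hooks, size_t n)
{
	struct sock sk = { .sk_bound_if = 0 };

	assert(hooks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (hooks[i])
			hooks[i](&sk);
	return sk.sk_bound_if;
}
```

With the chain { hook_b, hook_a_defer } a task in /cgroup/a/b ends up on ifindex 2, while a task directly in /cgroup/a (chain { hook_a_defer }) ends up on ifindex 1.  Swapping in hook_a_force shows the other half of the argument: because the ancestor runs last, it can always have the final word if it wants it.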