Subject: Re: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of
 'xt_bpf_info_v1'
To: Al Viro <viro@ZenIV.linux.org.uk>, Kees Cook <keescook@chromium.org>
Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com>,
        Willem de Bruijn <willemb@google.com>,
        Pablo Neira Ayuso <pablo@netfilter.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        David Miller <davem@davemloft.net>,
        LKML <linux-kernel@vger.kernel.org>,
        Network Development <netdev@vger.kernel.org>,
        Christoph Hellwig <hch@infradead.org>,
        Thomas Garnier <thgarnie@google.com>, Jann Horn <jannh@google.com>
References: <CAGXu5j+ZCJ8rRK=DAKaeuNmGzVY7xo1o9uJyWPEijmg9gw4fKA@mail.gmail.com>
 <20171201013304.GM21978@ZenIV.linux.org.uk>
 <20171201034859.GN21978@ZenIV.linux.org.uk>
From: Daniel Borkmann <daniel@iogearbox.net>
Message-ID: <13ec5d38-22d7-7b0d-abbd-5a1746291afe@iogearbox.net>
Date: Fri, 1 Dec 2017 22:34:48 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <20171201034859.GN21978@ZenIV.linux.org.uk>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2623
Lines: 79

On 12/01/2017 04:48 AM, Al Viro wrote:
> On Fri, Dec 01, 2017 at 01:33:04AM +0000, Al Viro wrote:
> 
>> Use of file descriptors should be limited to "got a number from userland,
>> convert to struct file *" on the way in and "install struct file * into
>> descriptor table and return the descriptor to userland" on the way out.
>> And the latter - *ONLY* after the last possible point of failure.  Once
>> a file reference is inserted into descriptor table, that's it - you
>> can't undo that.
>>
>> The only way to use bpf_obj_get_user() is to pass its return value to
>> userland.  As return value of syscall - not even put_user() (for that
>> you'd need to reserve the descriptor, copy it to userland and only
>> then attach struct file * to it).
>>
>> The whole approach stinks - what it needs is something that would
>> take struct filename * and return struct bpf_prog * or struct file *
>> reference.  With bpf_obj_get_user() and this thing implemented
>> via that.

Agree, the "fix" is completely buggy due to fd being exposed to user
space during that period of time ...

>> I'm looking into that thing...
> 
> What it tries to pull off is something not far from
> 
> static struct bpf_prog *__get_prog(struct inode *inode, enum bpf_prog_type type)
> {
> 	struct bpf_prog *prog;
> 	int err = inode_permission(inode, FMODE_READ | FMODE_WRITE);
> 	if (err)
> 		return ERR_PTR(err);
> 
> 	if (inode->i_op == &bpf_map_iops)
> 		return ERR_PTR(-EINVAL);
> 
> 	if (inode->i_op != &bpf_prog_iops)
> 		return ERR_PTR(-EACCES);
> 
> 	prog = inode->i_private;
> 	err = security_bpf_prog(prog);
> 	if (err < 0)
> 		return ERR_PTR(err);
> 
> 	if (!bpf_prog_get_ok(prog, &type, false))
> 		return ERR_PTR(-EINVAL);
> 
> 	return bpf_prog_inc(prog);
> }
> 
> struct bpf_prog *get_prog_path_type(const char *name, enum bpf_prog_type type)
> {
> 	struct path path;
> 	struct bpf_prog *prog;
> 	int err = kern_path(name, LOOKUP_FOLLOW, &path);
> 	if (err)
> 		return ERR_PTR(err);
> 	prog = __get_prog(d_backing_inode(path.dentry), type);
> 	if (!IS_ERR(prog))
> 		touch_atime(&path);
> 	path_put(&path);
> 	return prog;
> }
> 
> static int __bpf_mt_check_path(const char *path, struct bpf_prog **ret)
> {
> 	*ret = get_prog_path_type(path, BPF_PROG_TYPE_SOCKET_FILTER);
>         return PTR_ERR_OR_ZERO(*ret);
> }
> 
> That skips all tracepoint random shite (pardon the triple redundance) and makes
> a somewhat arbitrary change for touch_atime() logics.  And, of course, it is
> not even compile-tested.
> 
> Something similar to get_prog_path_type() above might make for a usable
> primitive, IMO...

The above looks good to me!