From: Song Liu
Date: Thu, 18 Apr 2019 14:41:41 -0700
Subject: Re: [PATCH bpf-next v2 1/3] bpf: sock ops: add netns ino and dev in bpf context
To: Alban Crequy
Cc: John Fastabend, Alexei Starovoitov, Daniel Borkmann, bpf, Networking, open list, "Alban Crequy (Kinvolk)", Iago López Galeiras
In-Reply-To: <20190418155652.22181-1-alban@kinvolk.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 18, 2019 at 8:59 AM Alban Crequy wrote:
>
> From: Alban Crequy
>
> sockops programs can now access the network namespace inode and device
> via (struct bpf_sock_ops)->netns_ino and ->netns_dev. This can be useful
> to apply different policies on different network namespaces.
>
> In the unlikely case where network namespaces are not compiled in
> (CONFIG_NET_NS=n), the verifier will not allow access to ->netns_*.
>
> The generated BPF bytecode for netns_ino is loading the correct inode
> number at the time of execution.
>
> However, the generated BPF bytecode for netns_dev is loading an
> immediate value determined at BPF-load-time by looking at the initial
> network namespace. In practice, this works because all netns currently
> use the same virtual device. If this was to change, this code would need
> to be updated too.
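
For readers following the thread: a policy keyed on these fields needs the
matching (dev, ino) pair from userspace. A minimal sketch of one way to get it
is to stat() the nsfs file for a namespace, e.g. /proc/self/ns/net. The struct
and function names below are hypothetical and not part of the patch, and it is
an assumption (to be checked on the target architecture) that st_dev as seen
by userspace uses the same encoding as the netns_dev value.

```c
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical container for the pair a sockops policy would match on. */
struct netns_id {
	uint64_t dev;	/* device of the nsfs mount backing the namespace file */
	uint64_t ino;	/* inode number identifying the network namespace */
};

/*
 * Fill *id from an nsfs path such as "/proc/self/ns/net".
 * Returns 0 on success, -1 if the path cannot be stat()ed
 * (in which case *id is left untouched).
 */
static int netns_id_from_path(const char *path, struct netns_id *id)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	id->dev = (uint64_t)st.st_dev;
	id->ino = (uint64_t)st.st_ino;
	return 0;
}
```

The resulting pair could then be placed in a BPF map by the loader and
compared against ->netns_dev and ->netns_ino in the program.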
>
> Signed-off-by: Alban Crequy

Acked-by: Song Liu

>
> ---
>
> Changes since v1:
> - add netns_dev (review from Alexei)
> ---
>  include/uapi/linux/bpf.h |  2 ++
>  net/core/filter.c        | 70 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 72 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index eaf2d3284248..f4f841dde42c 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3213,6 +3213,8 @@ struct bpf_sock_ops {
>  	__u32 sk_txhash;
>  	__u64 bytes_received;
>  	__u64 bytes_acked;
> +	__u64 netns_dev;
> +	__u64 netns_ino;
>  };
>
>  /* Definitions for bpf_sock_ops_cb_flags */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1833926a63fc..93e3429603d7 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -75,6 +75,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>
>  /**
>   * sk_filter_trim_cap - run a packet through a socket filter
> @@ -6774,6 +6776,15 @@ static bool sock_ops_is_valid_access(int off, int size,
>  		}
>  	} else {
>  		switch (off) {
> +		case offsetof(struct bpf_sock_ops, netns_dev):
> +		case offsetof(struct bpf_sock_ops, netns_ino):
> +#ifdef CONFIG_NET_NS
> +			if (size != sizeof(__u64))
> +				return false;
> +#else
> +			return false;
> +#endif
> +			break;
>  		case bpf_ctx_range_till(struct bpf_sock_ops, bytes_received,
>  					bytes_acked):
>  			if (size != sizeof(__u64))
> @@ -7660,6 +7671,11 @@ static u32 sock_addr_convert_ctx_access(enum bpf_access_type type,
>  	return insn - insn_buf;
>  }
>
> +static struct ns_common *sockops_netns_cb(void *private_data)
> +{
> +	return &init_net.ns;
> +}
> +
>  static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
>  				       const struct bpf_insn *si,
>  				       struct bpf_insn *insn_buf,
> @@ -7668,6 +7684,10 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
>  {
>  	struct bpf_insn *insn = insn_buf;
>  	int off;
> +	struct inode *ns_inode;
> +	struct path ns_path;
> +	__u64 netns_dev;
> +	void *res;
>
>  /* Helper macro for adding read access to tcp_sock or sock fields. */
>  #define SOCK_OPS_GET_FIELD(BPF_FIELD, OBJ_FIELD, OBJ) \
> @@ -7914,6 +7934,56 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
>  		SOCK_OPS_GET_OR_SET_FIELD(sk_txhash, sk_txhash,
>  					  struct sock, type);
>  		break;
> +
> +	case offsetof(struct bpf_sock_ops, netns_dev):
> +#ifdef CONFIG_NET_NS
> +		/* We get the netns_dev at BPF-load-time and not at
> +		 * BPF-exec-time. We assume that netns_dev is a constant.
> +		 */
> +		res = ns_get_path_cb(&ns_path, sockops_netns_cb, NULL);
> +		if (IS_ERR(res)) {
> +			netns_dev = 0;
> +		} else {
> +			ns_inode = ns_path.dentry->d_inode;
> +			netns_dev = new_encode_dev(ns_inode->i_sb->s_dev);
> +		}
> +#else
> +		netns_dev = 0;
> +#endif
> +		*insn++ = BPF_MOV64_IMM(si->dst_reg, netns_dev);
> +		break;
> +
> +	case offsetof(struct bpf_sock_ops, netns_ino):
> +#ifdef CONFIG_NET_NS
> +		/* Loading: sk_ops->sk->__sk_common.skc_net.net->ns.inum
> +		 * Type: (struct bpf_sock_ops_kern *)
> +		 *       ->(struct sock *)
> +		 *       ->(struct sock_common)
> +		 *       .possible_net_t
> +		 *       .(struct net *)
> +		 *       ->(struct ns_common)
> +		 *       .(unsigned int)
> +		 */
> +		BUILD_BUG_ON(offsetof(struct sock, __sk_common) != 0);
> +		BUILD_BUG_ON(offsetof(possible_net_t, net) != 0);
> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
> +					struct bpf_sock_ops_kern, sk),
> +				      si->dst_reg, si->src_reg,
> +				      offsetof(struct bpf_sock_ops_kern, sk));
> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
> +					possible_net_t, net),
> +				      si->dst_reg, si->dst_reg,
> +				      offsetof(struct sock_common, skc_net));
> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
> +					struct ns_common, inum),
> +				      si->dst_reg, si->dst_reg,
> +				      offsetof(struct net, ns) +
> +				      offsetof(struct ns_common, inum));
> +#else
> +		*insn++ = BPF_MOV64_IMM(si->dst_reg, 0);
> +#endif
> +		break;
> +
>  	}
>  	return insn - insn_buf;
>  }
> --
> 2.20.1
>
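
Since the netns_dev case above passes the device through new_encode_dev(), it
may help to see what that encoding does to a (major, minor) pair. The sketch
below is a userspace re-implementation of the layout as I understand it from
include/linux/kdev_t.h (12-bit major, 20-bit minor); the function names are
made up for illustration and the asserts are an addition of this sketch, not
something the kernel helper does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch of the new_encode_dev() bit layout:
 *   bits  0-7  : low 8 bits of the minor
 *   bits  8-19 : major (12 bits)
 *   bits 20-31 : high 12 bits of the minor
 */
static uint32_t encode_dev(uint32_t major, uint32_t minor)
{
	/* Guard the ranges assumed by this sketch. */
	assert(major < (1u << 12));
	assert(minor < (1u << 20));

	return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Inverse of encode_dev(): recover the major number. */
static uint32_t decode_major(uint32_t dev)
{
	return (dev & 0xfff00u) >> 8;
}

/* Inverse of encode_dev(): recover the minor number. */
static uint32_t decode_minor(uint32_t dev)
{
	return (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
}
```

For an anonymous device with major 0 and a small minor, the encoded value is
simply the minor, which is the kind of value one would expect to see for the
nsfs device.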