From: Alban Crequy
To: john.fastabend@gmail.com, ast@kernel.org, daniel@iogearbox.net
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, alban@kinvolk.io, iago@kinvolk.io
Subject: [PATCH bpf-next v3 1/4] bpf: sock ops: add netns ino and dev in bpf context
Date: Fri, 26 Apr 2019 17:48:45 +0200
Message-Id: <20190426154848.23490-1-alban@kinvolk.io>
X-Mailer: git-send-email 2.20.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Alban Crequy

sockops programs can now access the network namespace inode and device
via (struct bpf_sock_ops)->netns_ino and ->netns_dev. This can be useful
to apply different policies on different network namespaces.

In the unlikely case where network namespaces are not compiled in
(CONFIG_NET_NS=n), the verifier will not allow access to ->netns_*.

The generated BPF bytecode for netns_ino loads the correct inode number
at execution time.

However, the generated BPF bytecode for netns_dev loads an immediate
value determined at BPF-load-time by looking at the initial network
namespace.
In practice, this works because all netns currently use the same virtual
device. If this were to change, this code would need to be updated too.

Signed-off-by: Alban Crequy
---

Changes since v1:
- add netns_dev (review from Alexei)

Changes since v2:
- replace __u64 by u64 in kernel code (review from Y Song)
- remove unneeded #else branch: program would be rejected in
  is_valid_access (review from Y Song)
- allow partial reads (

 #include
 #include
+#include
+#include
 
 /**
  *	sk_filter_trim_cap - run a packet through a socket filter
@@ -6810,6 +6812,24 @@ static bool sock_ops_is_valid_access(int off, int size,
 		}
 	} else {
 		switch (off) {
+		case offsetof(struct bpf_sock_ops, netns_dev) ...
+		     offsetof(struct bpf_sock_ops, netns_dev) + sizeof(u64) - 1:
+#ifdef CONFIG_NET_NS
+			if (off - offsetof(struct bpf_sock_ops, netns_dev)
+			    + size > sizeof(u64))
+				return false;
+#else
+			return false;
+#endif
+			break;
+		case offsetof(struct bpf_sock_ops, netns_ino):
+#ifdef CONFIG_NET_NS
+			if (size != sizeof(u64))
+				return false;
+#else
+			return false;
+#endif
+			break;
 		case bpf_ctx_range_till(struct bpf_sock_ops, bytes_received,
 					bytes_acked):
 			if (size != sizeof(__u64))
@@ -7727,6 +7747,11 @@ static u32 sock_addr_convert_ctx_access(enum bpf_access_type type,
 	return insn - insn_buf;
 }
 
+static struct ns_common *sockops_netns_cb(void *private_data)
+{
+	return &init_net.ns;
+}
+
 static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 				       const struct bpf_insn *si,
 				       struct bpf_insn *insn_buf,
@@ -7735,6 +7760,10 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 {
 	struct bpf_insn *insn = insn_buf;
 	int off;
+	struct inode *ns_inode;
+	struct path ns_path;
+	u64 netns_dev;
+	void *res;
 
 /* Helper macro for adding read access to tcp_sock or sock fields. */
 #define SOCK_OPS_GET_FIELD(BPF_FIELD, OBJ_FIELD, OBJ)			      \
@@ -7981,6 +8010,71 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 		SOCK_OPS_GET_OR_SET_FIELD(sk_txhash, sk_txhash,
 					  struct sock, type);
 		break;
+
+	case offsetof(struct bpf_sock_ops, netns_dev) ...
+	     offsetof(struct bpf_sock_ops, netns_dev) + sizeof(u64) - 1:
+#ifdef CONFIG_NET_NS
+		/* We get the netns_dev at BPF-load-time and not at
+		 * BPF-exec-time. We assume that netns_dev is a constant.
+		 */
+		res = ns_get_path_cb(&ns_path, sockops_netns_cb, NULL);
+		if (IS_ERR(res)) {
+			netns_dev = 0;
+		} else {
+			ns_inode = ns_path.dentry->d_inode;
+			netns_dev = new_encode_dev(ns_inode->i_sb->s_dev);
+		}
+		off = si->off;
+		off -= offsetof(struct bpf_sock_ops, netns_dev);
+		switch (BPF_LDST_BYTES(si)) {
+		case sizeof(u64):
+			*insn++ = BPF_MOV64_IMM(si->dst_reg, netns_dev);
+			break;
+		case sizeof(u32):
+			netns_dev = *(u32 *)(((char *)&netns_dev) + off);
+			*insn++ = BPF_MOV32_IMM(si->dst_reg, netns_dev);
+			break;
+		case sizeof(u16):
+			netns_dev = *(u16 *)(((char *)&netns_dev) + off);
+			*insn++ = BPF_MOV32_IMM(si->dst_reg, netns_dev);
+			break;
+		case sizeof(u8):
+			netns_dev = *(u8 *)(((char *)&netns_dev) + off);
+			*insn++ = BPF_MOV32_IMM(si->dst_reg, netns_dev);
+			break;
+		}
+#endif
+		break;
+
+	case offsetof(struct bpf_sock_ops, netns_ino):
+#ifdef CONFIG_NET_NS
+		/* Loading: sk_ops->sk->__sk_common.skc_net.net->ns.inum
+		 * Type: (struct bpf_sock_ops_kern *)
+		 *       ->(struct sock *)
+		 *       ->(struct sock_common)
+		 *       .possible_net_t
+		 *       .(struct net *)
+		 *       ->(struct ns_common)
+		 *       .(unsigned int)
+		 */
+		BUILD_BUG_ON(offsetof(struct sock, __sk_common) != 0);
+		BUILD_BUG_ON(offsetof(possible_net_t, net) != 0);
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						struct bpf_sock_ops_kern, sk),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern, sk));
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						possible_net_t, net),
+				      si->dst_reg, si->dst_reg,
+				      offsetof(struct sock_common, skc_net));
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						struct ns_common, inum),
+				      si->dst_reg, si->dst_reg,
+				      offsetof(struct net, ns) +
+				      offsetof(struct ns_common, inum));
+#endif
+		break;
+
 	}
 	return insn - insn_buf;
 }
-- 
2.20.1
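
The partial-read handling for netns_dev in the patch above (the bound in sock_ops_is_valid_access and the narrow loads in sock_ops_convert_ctx_access) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not kernel code: netns_dev_access_ok and netns_dev_narrow_read are hypothetical names, restricting loads to 1, 2, 4 or 8 bytes stands in for checks the verifier performs elsewhere, and memcpy is used in place of the pointer casts in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 'off' is the byte offset of the load relative to the
 * start of the 8-byte netns_dev field, 'size' is the load width in bytes.
 */
static bool netns_dev_access_ok(size_t off, size_t size)
{
	/* Assumption: only 1, 2, 4 and 8 byte loads are considered. */
	if (size != 1 && size != 2 && size != 4 && size != 8)
		return false;

	/* Same bound as the patch: the load must end inside the u64. */
	return off < sizeof(uint64_t) && off + size <= sizeof(uint64_t);
}

/* Hypothetical helper: the value a load of 'size' bytes at 'off' would
 * yield, in host byte order. Mirrors the switch on the load width in the
 * patch; the access must already have passed netns_dev_access_ok().
 */
static uint64_t netns_dev_narrow_read(uint64_t netns_dev, size_t off,
				      size_t size)
{
	const unsigned char *p = (const unsigned char *)&netns_dev + off;
	uint32_t v32;
	uint16_t v16;
	uint8_t v8;

	assert(netns_dev_access_ok(off, size));

	switch (size) {
	case sizeof(uint32_t):
		memcpy(&v32, p, sizeof(v32));
		return v32;
	case sizeof(uint16_t):
		memcpy(&v16, p, sizeof(v16));
		return v16;
	case sizeof(uint8_t):
		memcpy(&v8, p, sizeof(v8));
		return v8;
	default:
		return netns_dev;	/* full 8-byte load */
	}
}
```

With these helpers, netns_dev_access_ok(4, 4) accepts a read of the second 32-bit word, while netns_dev_access_ok(4, 8) rejects a read that would run past the end of the field. Which bytes of the encoded device number a narrow read returns depends on host byte order, as it does with the casts in the patch.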