From: Aaron Conole
To: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org,
    coreteam@netfilter.org, Alexei Starovoitov, Daniel Borkmann,
    Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
    John Fastabend, Jesper Brouer, "David S . Miller",
    Andy Gospodarek, Rony Efraim, Simon Horman, Marcelo Leitner
Subject: [RFC -next v0 3/3] netfilter: nf_flow_table_bpf_map: introduce new loadable bpf map
Date: Sun, 25 Nov 2018 13:09:19 -0500
Message-Id: <20181125180919.13996-4-aconole@bytheb.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181125180919.13996-1-aconole@bytheb.org>
References: <20181125180919.13996-1-aconole@bytheb.org>
X-Mailing-List: linux-kernel@vger.kernel.org

This commit introduces a new loadable map that allows an eBPF program to
query the flow offload tables for specific flow information. For now, that
information is limited to the input and output interface indexes.
Future enhancements could include connection tracking details, such as
state and metadata, and allow for window validation.

Signed-off-by: Aaron Conole
---
 include/linux/bpf_types.h                 |   2 +
 include/uapi/linux/bpf.h                  |   7 +
 net/netfilter/Kconfig                     |   9 +
 net/netfilter/Makefile                    |   1 +
 net/netfilter/nf_flow_table_bpf_flowmap.c | 202 ++++++++++++++++++++++
 5 files changed, 221 insertions(+)
 create mode 100644 net/netfilter/nf_flow_table_bpf_flowmap.c

diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 44d9ab4809bd..82d3038cf6c3 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -71,3 +71,5 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, reuseport_array_ops)
 #endif
 BPF_MAP_TYPE(BPF_MAP_TYPE_QUEUE, queue_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_STACK, stack_map_ops)
+
+BPF_MAP_TYPE(BPF_MAP_TYPE_FLOWMAP, loadable_map)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 852dc17ab47a..fb77c8c5c209 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -131,6 +131,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE,
 	BPF_MAP_TYPE_QUEUE,
 	BPF_MAP_TYPE_STACK,
+	BPF_MAP_TYPE_FLOWMAP,
 };
 
 enum bpf_prog_type {
@@ -2942,4 +2943,10 @@ struct bpf_flow_keys {
 	};
 };
 
+struct bpf_flow_map {
+	struct bpf_flow_keys flow;
+	__u32 iifindex;
+	__u32 oifindex;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 2ab870ef233a..30f1bc9084be 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -709,6 +709,15 @@ config NF_FLOW_TABLE
 
 	  To compile it as a module, choose M here.
 
+config NF_FLOW_TABLE_BPF
+	tristate "Netfilter flowtable BPF map"
+	depends on NF_FLOW_TABLE
+	depends on BPF_LOADABLE_MAPS
+	help
+	  This option adds support for retrieving flow table entries
+	  via a loadable BPF map.
+	  To compile it as a module, choose M here.
+
 config NETFILTER_XTABLES
 	tristate "Netfilter Xtables support (required for ip_tables)"
 	default m if NETFILTER_ADVANCED=n
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 4ddf3ef51ece..8dba928a03fd 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -121,6 +121,7 @@ obj-$(CONFIG_NFT_FWD_NETDEV) += nft_fwd_netdev.o
 
 # flow table infrastructure
 obj-$(CONFIG_NF_FLOW_TABLE) += nf_flow_table.o
+obj-$(CONFIG_NF_FLOW_TABLE_BPF) += nf_flow_table_bpf_flowmap.o
 nf_flow_table-objs := nf_flow_table_core.o nf_flow_table_ip.o
 
 obj-$(CONFIG_NF_FLOW_TABLE_INET) += nf_flow_table_inet.o
diff --git a/net/netfilter/nf_flow_table_bpf_flowmap.c b/net/netfilter/nf_flow_table_bpf_flowmap.c
new file mode 100644
index 000000000000..577985560883
--- /dev/null
+++ b/net/netfilter/nf_flow_table_bpf_flowmap.c
@@ -0,0 +1,202 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (c) 2018, Aaron Conole
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct flow_map_internal {
+	struct bpf_map map;
+	struct nf_flowtable net_flow_table;
+};
+
+static void flow_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
+{
+	map->map_type = attr->map_type;
+	map->key_size = attr->key_size;
+	map->value_size = attr->value_size;
+	map->max_entries = attr->max_entries;
+	map->map_flags = attr->map_flags;
+	map->numa_node = bpf_map_attr_numa_node(attr);
+}
+
+static struct bpf_map *flow_map_alloc(union bpf_attr *attr)
+{
+	struct flow_map_internal *fmap_ret;
+	u64 cost;
+	int err;
+
+	if (!capable(CAP_NET_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	if (attr->max_entries == 0 ||
+	    attr->key_size != sizeof(struct bpf_flow_map) ||
+	    attr->value_size != sizeof(struct bpf_flow_map))
+		return ERR_PTR(-EINVAL);
+
+	fmap_ret = kzalloc(sizeof(*fmap_ret), GFP_USER);
+	if (!fmap_ret)
+		return ERR_PTR(-ENOMEM);
+
+	flow_map_init_from_attr(&fmap_ret->map, attr);
+	cost = (u64)fmap_ret->map.max_entries * sizeof(struct flow_offload);
+	if (cost >= U32_MAX - PAGE_SIZE) {
+		kfree(fmap_ret);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	fmap_ret->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
+
+	/* if map size is larger than memlock limit, reject it early */
+	if ((err = bpf_map_precharge_memlock(fmap_ret->map.pages))) {
+		kfree(fmap_ret);
+		return ERR_PTR(err);
+	}
+
+	memset(&fmap_ret->net_flow_table, 0, sizeof(fmap_ret->net_flow_table));
+	fmap_ret->net_flow_table.flags |= NF_FLOWTABLE_F_SNOOP;
+	nf_flow_table_init(&fmap_ret->net_flow_table);
+
+	return &fmap_ret->map;
+}
+
+static void flow_map_free(struct bpf_map *map)
+{
+	struct flow_map_internal *fmap = container_of(map,
+						      struct flow_map_internal,
+						      map);
+
+	nf_flow_table_free(&fmap->net_flow_table);
+	synchronize_rcu();
+	kfree(fmap);
+}
+
+static void flow_walk(struct flow_offload *flow, void *data)
+{
+	printk("Flow offload dir0: %x:%d -> %x:%d, %u, %u, %d, %u\n",
+	       flow->tuplehash[0].tuple.src_v4.s_addr,
+	       flow->tuplehash[0].tuple.src_port,
+	       flow->tuplehash[0].tuple.dst_v4.s_addr,
+	       flow->tuplehash[0].tuple.dst_port,
+	       flow->tuplehash[0].tuple.l3proto,
+	       flow->tuplehash[0].tuple.l4proto,
+	       flow->tuplehash[0].tuple.iifidx,
+	       flow->tuplehash[0].tuple.dir
+	       );
+
+	printk("Flow offload dir1: %x:%d -> %x:%d, %u, %u, %d, %u\n",
+	       flow->tuplehash[1].tuple.src_v4.s_addr,
+	       flow->tuplehash[1].tuple.src_port,
+	       flow->tuplehash[1].tuple.dst_v4.s_addr,
+	       flow->tuplehash[1].tuple.dst_port,
+	       flow->tuplehash[1].tuple.l3proto,
+	       flow->tuplehash[1].tuple.l4proto,
+	       flow->tuplehash[1].tuple.iifidx,
+	       flow->tuplehash[1].tuple.dir
+	       );
+}
+
+static void *flow_map_lookup_elem(struct bpf_map *map, void *key)
+{
+	struct flow_map_internal *fmap = container_of(map,
+						      struct flow_map_internal, map);
+	struct bpf_flow_map *internal_key = (struct bpf_flow_map *)key;
+	struct flow_offload_tuple_rhash *hash_ret;
+	struct flow_offload_tuple lookup_key;
+
+	memset(&lookup_key, 0, sizeof(lookup_key));
+	lookup_key.src_port = ntohs(internal_key->flow.sport);
+	lookup_key.dst_port = ntohs(internal_key->flow.dport);
+	lookup_key.dir = 0;
+
+	if (internal_key->flow.addr_proto == htons(ETH_P_IP)) {
+		lookup_key.l3proto = AF_INET;
+		lookup_key.src_v4.s_addr = ntohl(internal_key->flow.ipv4_src);
+		lookup_key.dst_v4.s_addr = ntohl(internal_key->flow.ipv4_dst);
+	} else if (internal_key->flow.addr_proto == htons(ETH_P_IPV6)) {
+		lookup_key.l3proto = AF_INET6;
+		memcpy(&lookup_key.src_v6,
+		       internal_key->flow.ipv6_src,
+		       sizeof(lookup_key.src_v6));
+		memcpy(&lookup_key.dst_v6,
+		       internal_key->flow.ipv6_dst,
+		       sizeof(lookup_key.dst_v6));
+	} else
+		return NULL;
+
+	lookup_key.l4proto = (u8)internal_key->flow.ip_proto;
+	lookup_key.iifidx = internal_key->iifindex;
+
+	printk("Flow offload lookup: %x:%d -> %x:%d, %u, %u, %d, %u\n",
+	       lookup_key.src_v4.s_addr, lookup_key.src_port,
+	       lookup_key.dst_v4.s_addr, lookup_key.dst_port,
+	       lookup_key.l3proto, lookup_key.l4proto,
+	       lookup_key.iifidx, lookup_key.dir);
+	hash_ret = flow_offload_lookup(&fmap->net_flow_table, &lookup_key);
+	if (!hash_ret) {
+		memcpy(&lookup_key.src_v6, internal_key->flow.ipv6_src,
+		       sizeof(lookup_key.src_v6));
+		memcpy(&lookup_key.dst_v6, internal_key->flow.ipv6_dst,
+		       sizeof(lookup_key.dst_v6));
+		lookup_key.src_port = internal_key->flow.dport;
+		lookup_key.dst_port = internal_key->flow.sport;
+		lookup_key.dir = 1;
+		hash_ret = flow_offload_lookup(&fmap->net_flow_table,
+					       &lookup_key);
+	}
+
+	if (!hash_ret) {
+		printk("No flow found, but table is: %d\n",
+		       atomic_read(&fmap->net_flow_table.rhashtable.nelems));
+		nf_flow_table_iterate(&fmap->net_flow_table, flow_walk, NULL);
+		return NULL;
+	}
+
+	printk("Flow matched!\n");
+	return key;
+}
+
+static int flow_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
+{
+	return 0;
+}
+
+static int flow_map_check_no_btf(const struct bpf_map *map,
+				 const struct btf_type *key_type,
+				 const struct btf_type *value_type)
+{
+	return -ENOTSUPP;
+}
+
+const struct bpf_map_ops flow_map_ops = {
+	.map_alloc = flow_map_alloc,
+	.map_free = flow_map_free,
+	.map_get_next_key = flow_map_get_next_key,
+	.map_lookup_elem = flow_map_lookup_elem,
+	.map_check_btf = flow_map_check_no_btf,
+};
+
+static int __init flow_map_init(void)
+{
+	bpf_map_insert_ops(BPF_MAP_TYPE_FLOWMAP, &flow_map_ops);
+	return 0;
+}
+
+module_init(flow_map_init);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Aaron Conole ");
-- 
2.19.1
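
The memory accounting in flow_map_alloc() can be checked in isolation. The
userspace sketch below mirrors only the cost calculation from the patch: the
entry count is multiplied by a per-entry size, rejected when the result gets
within one page of U32_MAX, and otherwise rounded up to whole pages. The
helper name flow_map_cost_pages() and the SKETCH_PAGE_* constants are
hypothetical and not part of the patch, and a 4 KiB page size is assumed; in
the kernel the entry size is sizeof(struct flow_offload) and the page size
comes from PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel uses PAGE_SIZE and PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper mirroring the accounting in flow_map_alloc().
 * Returns 0 and stores the number of pages to charge in *pages, or
 * returns -1 when the cost is too large (the patch returns -ENOMEM).
 */
static int flow_map_cost_pages(uint32_t max_entries, size_t entry_size,
			       uint32_t *pages)
{
	uint64_t cost;

	assert(pages != NULL);

	/* Widen before multiplying so the product cannot wrap in 32 bits. */
	cost = (uint64_t)max_entries * entry_size;
	if (cost >= UINT32_MAX - SKETCH_PAGE_SIZE)
		return -1;

	/* Equivalent of round_up(cost, PAGE_SIZE) >> PAGE_SHIFT. */
	*pages = (uint32_t)((cost + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT);
	return 0;
}
```

Under these assumptions, 4097 one-byte entries are charged as two pages,
while UINT32_MAX two-byte entries are rejected before any page count is
computed.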