Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp3569734ybc; Mon, 18 Nov 2019 17:45:38 -0800 (PST) X-Google-Smtp-Source: APXvYqxqFfCcE+H8TXptvAFzB25eDL35aA0pIYc9fixn3tDBczhLBhfY0hVN4+6ZICR3Y+05Zm9a X-Received: by 2002:a5d:66cf:: with SMTP id k15mr33164352wrw.38.1574127938171; Mon, 18 Nov 2019 17:45:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1574127938; cv=none; d=google.com; s=arc-20160816; b=qtTkRvkQHsK4gAcqg5qQf8nlmPSxAKXoq22ehDcQFZobd3soYlspjXysedtpJvcRvf 4x7vZl9T8DpjT7VsQi/rPl4dil+d3mo5vRm4j7G2G38gpz8fkIjdxIlC/8qnEeA7wT4Y g8q/r/sVJqXys2pb2GsQLk8BCr/AecIiLiJK7TeB2wVhSrvfbpZHYOyfUa617tlrLkWg W1jOO66X53O0wQJI0apb1Mv/xYkSGCl0uODI7c6+/pUU51Z2LiHkyj3Fipu1+9nn+ofx 8NR1J9qN1SqYCDFskQ42guq3Gz4xklBOenYEXMZwEtZCiWDy8QMr9MScmBk75TX6Q4wz W+nA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:from:subject:references :mime-version:message-id:in-reply-to:date:dkim-signature; bh=SLNOUoddbyuUDBBjWCQQAd7kt0eAjd+1Ui6xbGCwfdw=; b=qm3uWcIryRQDvWTnGxoaPGiIAyJ3r5Vlnl1mm8GIW2xAehSmJWpnKSZiw43j/h8ZKC /IOlFhQp39HiSja7b2f3m22CMaF6r9HPT/Ho8biQrOHLDQQyxn8pAVUmcWGkNH9TZLMG 1zcKCck3WaAIxM5dqunzi1SpGcVX31leWzPMCowLB7jKLyBdGEXQsAaCNhI4xfbMP2Nd Nayag4rTCSt+Xal1SHnlJPN7AQOUerx5y/HyAp1DAohZbsAR9pvDIk9d/tg0S6fbtJvy TTGcYiZV+kNMvHDZavbfZm8OLftuF4Un73TH89iO9BAyZGqwxHFm0zc4W7ZU0p92C/Sr yw/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=gvhe1YOS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d10si10587075edo.298.2019.11.18.17.45.14; Mon, 18 Nov 2019 17:45:38 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=gvhe1YOS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727116AbfKSBoM (ORCPT + 99 others); Mon, 18 Nov 2019 20:44:12 -0500 Received: from mail-yb1-f202.google.com ([209.85.219.202]:40199 "EHLO mail-yb1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727004AbfKSBoM (ORCPT ); Mon, 18 Nov 2019 20:44:12 -0500 Received: by mail-yb1-f202.google.com with SMTP id p4so14909052ybp.7 for ; Mon, 18 Nov 2019 17:44:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=SLNOUoddbyuUDBBjWCQQAd7kt0eAjd+1Ui6xbGCwfdw=; b=gvhe1YOSDMDK90LbW8YD0cZud0AK4Jb3FYjuMRxcJk4R5hhePZBT6wYnAnNCX7yAOt Cbf4zPTF8rco5uSz8bjK/HcQWJF9WLc+5Yuwdz6H39C6nNz0t6MQh3UMwlPGl11pNzy2 az8kLkYQjD3xvFaFyDUTPUdzbHyfR0opsjDBKSJK83QxRWn3xLRUSHdMcPmnVlfXzbxG +bcFgqlPcawvaYVVwmbCJoJmfdKb1vwb525Id6qIKZUH3mU6ndI5fcetxMNIWfkm9z5F ICfEYuBtxuYw8DuaWvoMnUo6APelRQm9WQzlpR4db4kGU8oiSyhaO/2O8IBBbhKKn3sy T8IQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=SLNOUoddbyuUDBBjWCQQAd7kt0eAjd+1Ui6xbGCwfdw=; b=EJXGjMoKTCXWWbk12EKX5T70CoJEoWazWIdW2qg33QE2cyyx6JF0ORfu7gkSVNxHK0 IXiQaK6jB4OMjmV5z0EpOUibE3B8RwO0RwzhVnMCbWEF5GE+ezW7qOrWx5XlltDN40Al Y4Wgdv2E3rhpUpZ8qZZRu31fBKr4JrECdqhaqk8jhvvA1mqvyavItvChdBAc31uXqyLM Su35Wuwniy9XX36cjFsTGlHzzbDIyBLSitdAWmzlAOLV+Yo4hAuNuRflLDVAA5cdiPFM YbU8PrluNfsHHJxbuWOouEBO+uSvHNyx2YM3Vp7LH70PUq/389XukD+gVtB5qwmMxZV2 EP4A== X-Gm-Message-State: APjAAAU5SeG4pzqG5qkWKSJgTCYJCHorQZlLX40Y5zyItB8l5SxdrhV9 TOjao9E2gY4d56ydLJ5DJrRazFDxfLz5 X-Received: by 2002:a81:d54a:: with SMTP id l10mr21741065ywj.0.1574127851200; Mon, 18 Nov 2019 17:44:11 -0800 (PST) Date: Mon, 18 Nov 2019 17:43:49 -0800 In-Reply-To: <20191119014357.98465-1-brianvv@google.com> Message-Id: <20191119014357.98465-2-brianvv@google.com> Mime-Version: 1.0 References: <20191119014357.98465-1-brianvv@google.com> X-Mailer: git-send-email 2.24.0.432.g9d3f5f5b63-goog Subject: [PATCH bpf-next 1/9] bpf: add bpf_map_{value_size,update_value,map_copy_value} functions From: Brian Vazquez To: Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, Brian Vazquez Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This commit moves reusable code from map_lookup_elem and map_update_elem to avoid code duplication in kernel/bpf/syscall.c. Signed-off-by: Brian Vazquez --- kernel/bpf/syscall.c | 271 ++++++++++++++++++++++++------------------- 1 file changed, 151 insertions(+), 120 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index c88c815c2154d..cc714c9d5b4cc 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -127,6 +127,153 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) return map; } +static u32 bpf_map_value_size(struct bpf_map *map) +{ + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY || + map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) + return round_up(map->value_size, 8) * num_possible_cpus(); + else if (IS_FD_MAP(map)) + return sizeof(u32); + else + return map->value_size; +} + +static void maybe_wait_bpf_programs(struct bpf_map *map) +{ + /* Wait for any running BPF programs to complete so that + * userspace, when we return to it, knows that all programs + * that could be running use the new map value. + */ + if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS || + map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) + synchronize_rcu(); +} + +static int bpf_map_update_value(struct bpf_map *map, struct fd f, void *key, + void *value, __u64 flags) +{ + int err; + /* Need to create a kthread, thus must support schedule */ + if (bpf_map_is_dev_bound(map)) { + return bpf_map_offload_update_elem(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || + map->map_type == BPF_MAP_TYPE_SOCKHASH || + map->map_type == BPF_MAP_TYPE_SOCKMAP) { + return map->ops->map_update_elem(map, key, value, flags); + } + + /* must increment bpf_prog_active to avoid kprobe+bpf triggering from + * inside bpf map update or delete otherwise deadlocks are possible + */ + preempt_disable(); + __this_cpu_inc(bpf_prog_active); + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { + err = bpf_percpu_hash_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { + err = bpf_percpu_array_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { + err = bpf_percpu_cgroup_storage_update(map, key, value, + flags); + } else if (IS_FD_ARRAY(map)) { + rcu_read_lock(); + err = bpf_fd_array_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { + rcu_read_lock(); + err = bpf_fd_htab_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { + /* rcu_read_lock() is not needed */ + err = bpf_fd_reuseport_array_update_elem(map, key, value, + flags); + } else if (map->map_type == BPF_MAP_TYPE_QUEUE || + map->map_type == BPF_MAP_TYPE_STACK) { + err = map->ops->map_push_elem(map, value, flags); + } else { + rcu_read_lock(); + err = map->ops->map_update_elem(map, key, value, flags); + rcu_read_unlock(); + } + __this_cpu_dec(bpf_prog_active); + preempt_enable(); + maybe_wait_bpf_programs(map); + + return err; +} + +static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value, + __u64 flags, bool do_delete) +{ + void *ptr; + int err; + + + if (bpf_map_is_dev_bound(map)) { + err = bpf_map_offload_lookup_elem(map, key, value); + + if (!err && do_delete) + err = bpf_map_offload_delete_elem(map, key); + + return err; + } + + preempt_disable(); + this_cpu_inc(bpf_prog_active); + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { + err = bpf_percpu_hash_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { + err = bpf_percpu_array_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { + err = bpf_percpu_cgroup_storage_copy(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) { + err = bpf_stackmap_copy(map, key, value); + } else if (IS_FD_ARRAY(map)) { + err = bpf_fd_array_map_lookup_elem(map, key, value); + } else if (IS_FD_HASH(map)) { + err = bpf_fd_htab_map_lookup_elem(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { + err = bpf_fd_reuseport_array_lookup_elem(map, key, value); + } else if (map->map_type == BPF_MAP_TYPE_QUEUE || + map->map_type == BPF_MAP_TYPE_STACK) { + err = map->ops->map_peek_elem(map, value); + } else { + rcu_read_lock(); + if (map->ops->map_lookup_elem_sys_only) + ptr = map->ops->map_lookup_elem_sys_only(map, key); + else + ptr = map->ops->map_lookup_elem(map, key); + if (IS_ERR(ptr)) { + err = PTR_ERR(ptr); + } else if (!ptr) { + err = -ENOENT; + } else { + err = 0; + if (flags & BPF_F_LOCK) + /* lock 'ptr' and copy everything but lock */ + copy_map_value_locked(map, value, ptr, true); + else + copy_map_value(map, value, ptr); + /* mask lock, since value wasn't zero inited */ + check_and_init_map_lock(map, value); + } + rcu_read_unlock(); + } + if (do_delete) + err = err ? err : map->ops->map_delete_elem(map, key); + + this_cpu_dec(bpf_prog_active); + preempt_enable(); + maybe_wait_bpf_programs(map); + + return err; +} + void *bpf_map_area_alloc(size_t size, int numa_node) { /* We really just want to fail instead of triggering OOM killer @@ -740,7 +887,7 @@ static int map_lookup_elem(union bpf_attr *attr) void __user *uvalue = u64_to_user_ptr(attr->value); int ufd = attr->map_fd; struct bpf_map *map; - void *key, *value, *ptr; + void *key, *value; u32 value_size; struct fd f; int err; @@ -772,72 +919,14 @@ static int map_lookup_elem(union bpf_attr *attr) goto err_put; } - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY || - map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) - value_size = round_up(map->value_size, 8) * num_possible_cpus(); - else if (IS_FD_MAP(map)) - value_size = sizeof(u32); - else - value_size = map->value_size; + value_size = bpf_map_value_size(map); err = -ENOMEM; value = kmalloc(value_size, GFP_USER | __GFP_NOWARN); if (!value) goto free_key; - if (bpf_map_is_dev_bound(map)) { - err = bpf_map_offload_lookup_elem(map, key, value); - goto done; - } - - preempt_disable(); - this_cpu_inc(bpf_prog_active); - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { - err = bpf_percpu_hash_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { - err = bpf_percpu_array_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { - err = bpf_percpu_cgroup_storage_copy(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) { - err = bpf_stackmap_copy(map, key, value); - } else if (IS_FD_ARRAY(map)) { - err = bpf_fd_array_map_lookup_elem(map, key, value); - } else if (IS_FD_HASH(map)) { - err = bpf_fd_htab_map_lookup_elem(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { - err = bpf_fd_reuseport_array_lookup_elem(map, key, value); - } else if (map->map_type == BPF_MAP_TYPE_QUEUE || - map->map_type == BPF_MAP_TYPE_STACK) { - err = map->ops->map_peek_elem(map, value); - } else { - rcu_read_lock(); - if (map->ops->map_lookup_elem_sys_only) - ptr = map->ops->map_lookup_elem_sys_only(map, key); - else - ptr = map->ops->map_lookup_elem(map, key); - if (IS_ERR(ptr)) { - err = PTR_ERR(ptr); - } else if (!ptr) { - err = -ENOENT; - } else { - err = 0; - if (attr->flags & BPF_F_LOCK) - /* lock 'ptr' and copy everything but lock */ - copy_map_value_locked(map, value, ptr, true); - else - copy_map_value(map, value, ptr); - /* mask lock, since value wasn't zero inited */ - check_and_init_map_lock(map, value); - } - rcu_read_unlock(); - } - this_cpu_dec(bpf_prog_active); - preempt_enable(); - -done: + err = bpf_map_copy_value(map, key, value, attr->flags, false); if (err) goto free_value; @@ -856,16 +945,6 @@ static int map_lookup_elem(union bpf_attr *attr) return err; } -static void maybe_wait_bpf_programs(struct bpf_map *map) -{ - /* Wait for any running BPF programs to complete so that - * userspace, when we return to it, knows that all programs - * that could be running use the new map value. - */ - if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS || - map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) - synchronize_rcu(); -} #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags @@ -921,56 +1000,8 @@ static int map_update_elem(union bpf_attr *attr) if (copy_from_user(value, uvalue, value_size) != 0) goto free_value; - /* Need to create a kthread, thus must support schedule */ - if (bpf_map_is_dev_bound(map)) { - err = bpf_map_offload_update_elem(map, key, value, attr->flags); - goto out; - } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || - map->map_type == BPF_MAP_TYPE_SOCKHASH || - map->map_type == BPF_MAP_TYPE_SOCKMAP) { - err = map->ops->map_update_elem(map, key, value, attr->flags); - goto out; - } + err = bpf_map_update_value(map, f, key, value, attr->flags); - /* must increment bpf_prog_active to avoid kprobe+bpf triggering from - * inside bpf map update or delete otherwise deadlocks are possible - */ - preempt_disable(); - __this_cpu_inc(bpf_prog_active); - if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || - map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { - err = bpf_percpu_hash_update(map, key, value, attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { - err = bpf_percpu_array_update(map, key, value, attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { - err = bpf_percpu_cgroup_storage_update(map, key, value, - attr->flags); - } else if (IS_FD_ARRAY(map)) { - rcu_read_lock(); - err = bpf_fd_array_map_update_elem(map, f.file, key, value, - attr->flags); - rcu_read_unlock(); - } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { - rcu_read_lock(); - err = bpf_fd_htab_map_update_elem(map, f.file, key, value, - attr->flags); - rcu_read_unlock(); - } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { - /* rcu_read_lock() is not needed */ - err = bpf_fd_reuseport_array_update_elem(map, key, value, - attr->flags); - } else if (map->map_type == BPF_MAP_TYPE_QUEUE || - map->map_type == BPF_MAP_TYPE_STACK) { - err = map->ops->map_push_elem(map, value, attr->flags); - } else { - rcu_read_lock(); - err = map->ops->map_update_elem(map, key, value, attr->flags); - rcu_read_unlock(); - } - __this_cpu_dec(bpf_prog_active); - preempt_enable(); - maybe_wait_bpf_programs(map); -out: free_value: kfree(value); free_key: -- 2.24.0.432.g9d3f5f5b63-goog