Subject: Re: [PATCH v3 bpf-next 01/14] bpf: add ability to charge bpf maps memory dynamically
To: Roman Gushchin, netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, Alexei Starovoitov
References: <20180720174558.5829-1-guro@fb.com> <20180720174558.5829-2-guro@fb.com>
From: Daniel Borkmann
Message-ID: <4f96442f-d9bc-c366-1dbd-17d4cc47783a@iogearbox.net>
Date: Fri, 27 Jul 2018 20:01:05 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <20180720174558.5829-2-guro@fb.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Authenticated-Sender: daniel@iogearbox.net
X-Virus-Scanned: Clear (ClamAV 0.100.0/24788/Fri Jul 27 18:45:51 2018)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/20/2018 07:45 PM, Roman Gushchin wrote:
> This commits extends existing bpf maps memory charging API
> to support dynamic charging/uncharging.
>
> This is required to account memory used by maps,
> if all entries are created dynamically after
> the map initialization.
>
> Signed-off-by: Roman Gushchin
> Cc: Alexei Starovoitov
> Cc: Daniel Borkmann
> Acked-by: Martin KaFai Lau
> ---
>  include/linux/bpf.h  |  2 ++
>  kernel/bpf/syscall.c | 53 +++++++++++++++++++++++++++++++++++++---------------
>  2 files changed, 40 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 5b5ad95cf339..5a4a256473c3 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -435,6 +435,8 @@ struct bpf_map * __must_check bpf_map_inc(struct bpf_map *map, bool uref);
>  void bpf_map_put_with_uref(struct bpf_map *map);
>  void bpf_map_put(struct bpf_map *map);
>  int bpf_map_precharge_memlock(u32 pages);
> +int bpf_map_charge_memlock(struct bpf_map *map, u32 pages);
> +void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages);
>  void *bpf_map_area_alloc(size_t size, int numa_node);
>  void bpf_map_area_free(void *base);
>  void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index d10ecd78105f..cee452a19538 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -181,32 +181,55 @@ int bpf_map_precharge_memlock(u32 pages)
>  	return 0;
>  }
>
> -static int bpf_map_charge_memlock(struct bpf_map *map)
> +static int bpf_charge_memlock(struct user_struct *user, u32 pages)
>  {
> -	struct user_struct *user = get_current_user();
> -	unsigned long memlock_limit;
> +	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>
> -	memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> +	if (atomic_long_add_return(pages, &user->locked_vm) > memlock_limit) {
> +		atomic_long_sub(pages, &user->locked_vm);
> +		return -EPERM;
> +	}
> +	return 0;
> +}
>
> -	atomic_long_add(map->pages, &user->locked_vm);
> +static int bpf_map_init_memlock(struct bpf_map *map)
> +{
> +	struct user_struct *user = get_current_user();
> +	int ret;
>
> -	if (atomic_long_read(&user->locked_vm) > memlock_limit) {
> -		atomic_long_sub(map->pages, &user->locked_vm);
> +	ret = bpf_charge_memlock(user, map->pages);
> +	if (ret) {
>  		free_uid(user);
> -		return -EPERM;
> +		return ret;
>  	}
>  	map->user = user;
> -	return 0;
> +	return ret;
>  }
>
> -static void bpf_map_uncharge_memlock(struct bpf_map *map)
> +static void bpf_map_release_memlock(struct bpf_map *map)
>  {
>  	struct user_struct *user = map->user;
> -
> -	atomic_long_sub(map->pages, &user->locked_vm);
> +	atomic_long_sub(map->pages, &map->user->locked_vm);

Just a small nit since you're respinning anyway, could you also make a
proper destructor for the bpf_charge_memlock(), so we have a
bpf_uncharge_memlock() which this one here would be calling as well as ...

>  	free_uid(user);
>  }
>
> +int bpf_map_charge_memlock(struct bpf_map *map, u32 pages)
> +{
> +	int ret;
> +
> +	ret = bpf_charge_memlock(map->user, pages);
> +	if (ret)
> +		return ret;
> +	map->pages += pages;
> +	return ret;
> +}
> +
> +void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages)
> +{
> +	atomic_long_sub(pages, &map->user->locked_vm);

... this one, so we hide the details there.

> +	map->pages -= pages;
> +}
> +
>  static int bpf_map_alloc_id(struct bpf_map *map)
>  {
>  	int id;
> @@ -256,7 +279,7 @@ static void bpf_map_free_deferred(struct work_struct *work)
>  {
>  	struct bpf_map *map = container_of(work, struct bpf_map, work);
>
> -	bpf_map_uncharge_memlock(map);
> +	bpf_map_release_memlock(map);
>  	security_bpf_map_free(map);
>  	/* implementation dependent freeing */
>  	map->ops->map_free(map);
> @@ -492,7 +515,7 @@ static int map_create(union bpf_attr *attr)
>  	if (err)
>  		goto free_map_nouncharge;
>
> -	err = bpf_map_charge_memlock(map);
> +	err = bpf_map_init_memlock(map);
>  	if (err)
>  		goto free_map_sec;
>
> @@ -515,7 +538,7 @@ static int map_create(union bpf_attr *attr)
>  	return err;
>
>  free_map:
> -	bpf_map_uncharge_memlock(map);
> +	bpf_map_release_memlock(map);
>  free_map_sec:
>  	security_bpf_map_free(map);
>  free_map_nouncharge:
>
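
For illustration only, the requested split could look roughly like the user-space sketch below. It is not the respun patch. The assumptions are: struct user_struct and struct bpf_map are cut-down stand-ins for the kernel types, C11 atomics replace the kernel's atomic_long_*() helpers, memlock_limit is a made-up constant standing in for rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT, and the body of bpf_uncharge_memlock() is a guess at what the nit asks for. get_current_user() and free_uid() are left out.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct user_struct {
	atomic_long locked_vm;	/* pages currently charged to this user */
};

struct bpf_map {
	struct user_struct *user;
	uint32_t pages;
};

/* Assumed limit in pages; the kernel derives it from RLIMIT_MEMLOCK. */
static const long memlock_limit = 16;

/* Charge pages to the user, rolling back if the limit would be exceeded. */
static int bpf_charge_memlock(struct user_struct *user, uint32_t pages)
{
	/* atomic_fetch_add() returns the old value, so add pages to get the new total. */
	long total = atomic_fetch_add(&user->locked_vm, (long)pages) + (long)pages;

	if (total > memlock_limit) {
		atomic_fetch_sub(&user->locked_vm, (long)pages);
		return -EPERM;
	}
	return 0;
}

/* The destructor asked for in the review: the only place that subtracts. */
static void bpf_uncharge_memlock(struct user_struct *user, uint32_t pages)
{
	atomic_fetch_sub(&user->locked_vm, (long)pages);
}

int bpf_map_charge_memlock(struct bpf_map *map, uint32_t pages)
{
	int ret = bpf_charge_memlock(map->user, pages);

	if (ret)
		return ret;
	map->pages += pages;
	return 0;
}

void bpf_map_uncharge_memlock(struct bpf_map *map, uint32_t pages)
{
	bpf_uncharge_memlock(map->user, pages);
	map->pages -= pages;
}

void bpf_map_release_memlock(struct bpf_map *map)
{
	/* The kernel version would also drop the user reference with free_uid(). */
	bpf_uncharge_memlock(map->user, map->pages);
}
```

With this shape, both bpf_map_release_memlock() and bpf_map_uncharge_memlock() go through the same helper, so the locked_vm arithmetic lives in one charge/uncharge pair.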