From: Anton Protopopov
Date: Tue, 16 Jul 2019 13:14:26 -0400
Subject: Re: [PATCH bpf-next 1/2] bpf, libbpf: add a new API bpf_object__reuse_maps()
To: Andrii Nakryiko
Cc: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song, Networking, bpf, open list, Andrii Nakryiko
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 9, 2019 at 13:40, Andrii Nakryiko wrote:
>
> On Mon, Jul 8, 2019 at 1:37 PM Anton Protopopov wrote:
> >
> > On Mon, Jul 8, 2019 at 13:54, Andrii Nakryiko wrote:
> > >
> > > On Fri, Jul 5, 2019 at 2:53 PM Daniel Borkmann wrote:
> > > >
> > > > On 07/05/2019 10:44 PM, Anton Protopopov wrote:
> > > > > Add a new API bpf_object__reuse_maps() which can be used to replace all maps in
> > > > > an object by maps pinned to a directory provided in the path argument. Namely,
> > > > > each map M in the object will be replaced by a map pinned to path/M.name.
> > > > >
> > > > > Signed-off-by: Anton Protopopov
> > > > > ---
> > > > >  tools/lib/bpf/libbpf.c   | 34 ++++++++++++++++++++++++++++++++++
> > > > >  tools/lib/bpf/libbpf.h   |  2 ++
> > > > >  tools/lib/bpf/libbpf.map |  1 +
> > > > >  3 files changed, 37 insertions(+)
> > > > >
> > > > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > > > index 4907997289e9..84c9e8f7bfd3 100644
> > > > > --- a/tools/lib/bpf/libbpf.c
> > > > > +++ b/tools/lib/bpf/libbpf.c
> > > > > @@ -3144,6 +3144,40 @@ int bpf_object__unpin_maps(struct bpf_object *obj, const char *path)
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > +int bpf_object__reuse_maps(struct bpf_object *obj, const char *path)
> > >
> > > As is, bpf_object__reuse_maps() can be easily implemented by user
> > > applications, as it's only using public libbpf APIs, so I'm not 100%
> > > sure we need to add a method like that to libbpf.
> >
> > The bpf_object__reuse_maps() can definitely be implemented by user
> > applications, however, to use it a user also needs to re-implement the
> > bpf_prog_load_xattr function, so it seemed to me that adding this
> > functionality to the library is a better way.
>
> I'm still not convinced. Looking at bpf_prog_load_xattr, I think some
> of what it's doing should be part of bpf_object__object_xattr anyway
> (all the expected type setting for programs).
>
> Besides that, there isn't much more than just bpf_object__open and
> bpf_object__load, to be honest. By doing open and load explicitly,
> user gets an opportunity to do whatever adjustment they need: reuse
> maps, adjust map sizes, etc. So I think we should improve
> bpf_object__open to "guess" program attach types and add map
> definition flags to allow reuse declaratively.
>
> >
> > >
> > > > > +{
> > > > > +	struct bpf_map *map;
> > > > > +
> > > > > +	if (!obj)
> > > > > +		return -ENOENT;
> > > > > +
> > > > > +	if (!path)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	bpf_object__for_each_map(map, obj) {
> > > > > +		int len, err;
> > > > > +		int pinned_map_fd;
> > > > > +		char buf[PATH_MAX];
> > > >
> > > > We'd need to skip the case of bpf_map__is_internal(map) since they are always
> > > > recreated for the given object.
> > > >
> > > > > +		len = snprintf(buf, PATH_MAX, "%s/%s", path, bpf_map__name(map));
> > > > > +		if (len < 0) {
> > > > > +			return -EINVAL;
> > > > > +		} else if (len >= PATH_MAX) {
> > > > > +			return -ENAMETOOLONG;
> > > > > +		}
> > > > > +
> > > > > +		pinned_map_fd = bpf_obj_get(buf);
> > > > > +		if (pinned_map_fd < 0)
> > > > > +			return pinned_map_fd;
> > > >
> > > > Should we rather have a new map definition attribute that tells to reuse
> > > > the map if it's pinned in bpf fs, and if not, we create it and later on
> > > > pin it? This is what iproute2 is doing and which we're making use of heavily.
> > >
> > > I'd like something like that as well. This would play nicely with
> > > recently added BTF-defined maps as well.
> > >
> > > I think it should be not just pin/don't pin flag, but rather pinning
> > > strategy, to accommodate various typical strategies of handling maps
> > > that are already pinned. So something like this:
> > >
> > > 1. BPF_PIN_NOTHING - default, don't pin;
> > > 2. BPF_PIN_EXCLUSIVE - pin, but if map is already pinned - fail;
> > > 3. BPF_PIN_SET - pin; if existing map exists, reset its state to be
> > > exact state of object's map;
> > > 4. BPF_PIN_MERGE - pin, if map exists, fill in NULL entries only (this
> > > is how Cilium is pinning PROG_ARRAY maps, if I understand correctly);
> > > 5. BPF_PIN_MERGE_OVERWRITE - pin, if map exists, overwrite non-NULL values.
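
To make the strategies listed above a bit more concrete, here is a rough
sketch of how a loader might dispatch on them. Everything in it is
hypothetical: the BPF_PIN_* names are the illustrative ones from the list,
the pin_action values and pin_decide() are made up for this example, and
none of it exists in libbpf. It only models the decision, not the actual
map handling.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative strategy names from the list above; not a libbpf API. */
enum pin_strategy {
	BPF_PIN_NOTHING,
	BPF_PIN_EXCLUSIVE,
	BPF_PIN_SET,
	BPF_PIN_MERGE,
	BPF_PIN_MERGE_OVERWRITE,
};

/* Hypothetical actions a loader could take for a single map. */
enum pin_action {
	PIN_ACT_CREATE_ONLY,	/* create the map, do not pin it */
	PIN_ACT_CREATE_AND_PIN,	/* create the map and pin it */
	PIN_ACT_RESET_EXISTING,	/* reuse pinned map, reset it to the object's state */
	PIN_ACT_FILL_EMPTY,	/* reuse pinned map, fill in NULL entries only */
	PIN_ACT_OVERWRITE,	/* reuse pinned map, overwrite non-NULL values */
};

/*
 * Map a strategy plus "is a map already pinned at this path?" to an action.
 * Returns 0 and sets *action on success, -EEXIST when an exclusive pin
 * collides with an existing one, -EINVAL on bad input.
 */
static int pin_decide(enum pin_strategy strategy, bool already_pinned,
		      enum pin_action *action)
{
	if (!action)
		return -EINVAL;

	switch (strategy) {
	case BPF_PIN_NOTHING:
		*action = PIN_ACT_CREATE_ONLY;
		return 0;
	case BPF_PIN_EXCLUSIVE:
		if (already_pinned)
			return -EEXIST;
		*action = PIN_ACT_CREATE_AND_PIN;
		return 0;
	case BPF_PIN_SET:
		*action = already_pinned ? PIN_ACT_RESET_EXISTING
					 : PIN_ACT_CREATE_AND_PIN;
		return 0;
	case BPF_PIN_MERGE:
		*action = already_pinned ? PIN_ACT_FILL_EMPTY
					 : PIN_ACT_CREATE_AND_PIN;
		return 0;
	case BPF_PIN_MERGE_OVERWRITE:
		*action = already_pinned ? PIN_ACT_OVERWRITE
					 : PIN_ACT_CREATE_AND_PIN;
		return 0;
	}
	return -EINVAL;
}
```

My own use case (reuse existing maps when reloading a program) would only
need the "already pinned" branches of this table.
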
> > >
> > > This list is only for illustrative purposes, ideally people that have
> > > a lot of experience using pinning for real-world use cases would chime
> > > in on what strategies are useful and make sense.
> >
> > My case was simply to reuse existing maps when reloading a program.
> > Does it make sense for you to add only the simplest of the cases listed above?
>
> Of course, it's an enum, so we can start with a few clearly useful ones and
> then expand more if we ever have a need. But I think we still need a
> bit wider discussion and let people who use pinning chime in.
>
> >
> > Also, libbpf doesn't use standard naming conventions for pinning maps.
>
> We talked about this in another thread related to BTF-defined maps. I
> think the way to go with this is to actually define a default pinning
> root path, but allow overriding it on bpf_object__open, if user needs
> a different one.
>
> > Does it make sense to provide a list of already open maps to the
> > bpf_prog_load_xattr function as an attribute? In this case a user
> > can execute his own policy on pinning, but still will have an option
> > to reuse, reset, and merge maps.
>
> As explained above, I don't think there is much added value in
> bpf_prog_load, so I'd advise to just switch to explicit
> bpf_object__open + bpf_object__load and get maximum control and
> flexibility.

Thanks for your comments. I can see now that using
bpf_object__open/bpf_object__load makes better sense.

>
> >
> > >
> > > > In bpf_object__reuse_maps() bailing out if bpf_obj_get() fails is perhaps
> > > > too limiting for a generic API as new version of an object file may contain
> > > > new maps which are not yet present in bpf fs at that point.
> > > >
> > > > > +		err = bpf_map__reuse_fd(map, pinned_map_fd);
> > > > > +		if (err)
> > > > > +			return err;
> > > > > +	}
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >  int bpf_object__pin_programs(struct bpf_object *obj, const char *path)
> > > > >  {
> > > > >  	struct bpf_program *prog;
> > > > > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> > > > > index d639f47e3110..7fe465a1be76 100644
> > > > > --- a/tools/lib/bpf/libbpf.h
> > > > > +++ b/tools/lib/bpf/libbpf.h
> > > > > @@ -82,6 +82,8 @@ int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
> > > > >  LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path);
> > > > >  LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj,
> > > > >  				      const char *path);
> > > > > +LIBBPF_API int bpf_object__reuse_maps(struct bpf_object *obj,
> > > > > +				      const char *path);
> > > > >  LIBBPF_API int bpf_object__pin_programs(struct bpf_object *obj,
> > > > >  					const char *path);
> > > > >  LIBBPF_API int bpf_object__unpin_programs(struct bpf_object *obj,
> > > > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > > > > index 2c6d835620d2..66a30be6696c 100644
> > > > > --- a/tools/lib/bpf/libbpf.map
> > > > > +++ b/tools/lib/bpf/libbpf.map
> > > > > @@ -172,5 +172,6 @@ LIBBPF_0.0.4 {
> > > > >  		btf_dump__new;
> > > > >  		btf__parse_elf;
> > > > >  		bpf_object__load_xattr;
> > > > > +		bpf_object__reuse_maps;
> > > > >  		libbpf_num_possible_cpus;
> > > > >  } LIBBPF_0.0.3;
> > > > >
> > > >
> > >