Received: by 2002:a05:7412:3784:b0:e2:908c:2ebd with SMTP id jk4csp63044rdb; Fri, 29 Sep 2023 16:59:46 -0700 (PDT) X-Google-Smtp-Source: AGHT+IF4TzSgLjvB3tzlqcsqgvNXcRe5JH+SYJmgjlMQStTe6F/iDxXSFx/fPE6qjmoY+v8EGCna X-Received: by 2002:a17:90a:db56:b0:279:856:b036 with SMTP id u22-20020a17090adb5600b002790856b036mr5217676pjx.6.1696031986152; Fri, 29 Sep 2023 16:59:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696031986; cv=none; d=google.com; s=arc-20160816; b=ZYQasiXwfIljnJOI4YlFVKTvdAN+1o07guEx07O6gydfD+mgEN3n0L/JJs/omLDtiD wwgaN3IYZE1lhjB9d9PgXF0IAfGj0i7NQDA9KffF+Rs6FGjiIni0oOYRSUT261UbBn3D 7Zic+U5VmPYoi9ayk2L4Vlm/hicH4h4VZIgXL6zDzGi1NFXbmWtL9ifkRGxtmZlaOj8c ztcJ6lK4uJsVsUOUqYg3HCxpkz+AXpagN6f0a03UK7+xIf2/A330ZW0G+iqYiScJsMHi VuebcYIfztaWFWq5lk0p8u1igFIIDO9xm5E2Ijbm82wbDV4RAJAeoFlr9P5+kut04Jfq IW6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=Ulc+lDz0cmLjLptsbHUWL0O/uVNoSf1Xc1Or5n5xvRo=; fh=luN1T+TZmJdWrv2MwLqOcKf0btUR/zj1LntAQkciNa8=; b=CB1BZirPIBKHkVLLFl1ecwnCmiShXWTuUS+GpIxxgjZWut5hM7xlIfr/SNr9jOxqer D6rXDkfQcZzk814gmP1oUHmEp02rlbTJ2YKHMnkoxoU+c1ndWwG2L83Vmjbi8RBHypKw ki8MVFem+b7Qc1Tn2yBYe+AWs2IguHJhYr4r2c4xxBBeL9CVlmD+73Rzk8fk/W1Jvmiu 1eoLxxjrFzbjPqcoA9uMfi+ICwyr9CsYvH0dW9vgyxr60NMID3hZW8iKxwCZWDAGHQ/C aAgZZKdVVvFhB03Ws171sy+ilexAhRK2xuGmwxIM9XuH2My1AK8AUrfHxXuLPejCx3cd ljVA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=ZDR4Qq2p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from agentk.vger.email (agentk.vger.email. [23.128.96.32]) by mx.google.com with ESMTPS id my11-20020a17090b4c8b00b002777d51e89dsi2757486pjb.128.2023.09.29.16.59.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 29 Sep 2023 16:59:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=ZDR4Qq2p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id D06D1818F644; Fri, 29 Sep 2023 14:29:31 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232878AbjI2V3Z (ORCPT + 99 others); Fri, 29 Sep 2023 17:29:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229508AbjI2V3Y (ORCPT ); Fri, 29 Sep 2023 17:29:24 -0400 Received: from mail-ed1-x532.google.com (mail-ed1-x532.google.com [IPv6:2a00:1450:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E17EF1AB; Fri, 29 Sep 2023 14:29:21 -0700 (PDT) Received: by mail-ed1-x532.google.com with SMTP id 4fb4d7f45d1cf-533cbbd0153so15659975a12.0; Fri, 29 Sep 2023 14:29:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696022960; x=1696627760; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Ulc+lDz0cmLjLptsbHUWL0O/uVNoSf1Xc1Or5n5xvRo=; b=ZDR4Qq2pe40f0N34LNncp+9fMDtAh1Kxz3D1LbpJeLH8GEPtCRlp0hHfvess7gxAcx oPb5NFOuoaca3pW91SvI53x86/d7phv6sNCAJ9gkRlQNp36IIc9gbtaxDprBibpnf4Vm SQh209rtfNdbXJyKLhdP8rf50I22Yw40DfQWDC8lZ/LYn/ovW8myqkyW5nofSR1STujt Dxv/cWYnqNPX7WzHAUHzU3XV2/NyFeAbp1A4VjvgBwEAjzLMBVtBSzOYq/QWU4gGcDGX AE3eE5TUa4+Bdqof53vbEx+kPqGlxS/sLyPVg17e2TJd1axb6SVuuYRh8PwoJXEKQQdU qbFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696022960; x=1696627760; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ulc+lDz0cmLjLptsbHUWL0O/uVNoSf1Xc1Or5n5xvRo=; b=IgdhL2Fn2v6KIhEJjRRCaMvKloA/QdVGS30Oam+EGEeyrXuLXuuL/uoqTE4CJ3qtzd /sFRnK1cgq1uuApIrEno4ZjNOnFRsYNpp7XQePwixJsKNgrNEhk1aZKjVV2fwHxpXQju u1JpdC2Pma0HzXJUTw4X7gRqWv6Lf0fXWCyahIufnZbkU37kSwO6loUXxOL2IF+qOccQ 0pndltEMXa2JWrbPiQep9+Rrer1QQ4Lj3ACJ/aX9VUg0CBUgudzIqI31hKTZ1Ikpe71t Bm3WeuixSYXmew7KfNRnqUR42jPHSnMp16mZVXqqggRT3YItkm5RkoZEVaecq2H1jjEM g7lQ== X-Gm-Message-State: AOJu0YwFeERGoNkEf+PZlbeYPYR//5F0QQ5M4/mKrzBstWe7eGaMjj6j oE5rKXtK2JMEnSwFqOW8UUUbKLjT5/hCnmmZE5w= X-Received: by 2002:aa7:d648:0:b0:534:2e79:6b04 with SMTP id v8-20020aa7d648000000b005342e796b04mr4794500edr.14.1696022960015; Fri, 29 Sep 2023 14:29:20 -0700 (PDT) MIME-Version: 1.0 References: <20230925105552.817513-1-zhouchuyi@bytedance.com> <20230925105552.817513-5-zhouchuyi@bytedance.com> <27b57638-48db-7082-2b53-93d84e423350@bytedance.com> In-Reply-To: <27b57638-48db-7082-2b53-93d84e423350@bytedance.com> From: Andrii Nakryiko Date: Fri, 29 Sep 2023 14:29:08 -0700 Message-ID: Subject: Re: [PATCH bpf-next v3 4/7] bpf: Introduce css open-coded iterator kfuncs To: Chuyi Zhou Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-0.6 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Fri, 29 Sep 2023 14:29:32 -0700 (PDT) On Wed, Sep 27, 2023 at 7:51=E2=80=AFPM Chuyi Zhou wrote: > > Hello, > > =E5=9C=A8 2023/9/28 07:24, Andrii Nakryiko =E5=86=99=E9=81=93: > > On Mon, Sep 25, 2023 at 3:56=E2=80=AFAM Chuyi Zhou wrote: > >> > >> This Patch adds kfuncs bpf_iter_css_{new,next,destroy} which allow > >> creation and manipulation of struct bpf_iter_css in open-coded iterato= r > >> style. These kfuncs actually wrapps css_next_descendant_{pre, post}. > >> css_iter can be used to: > >> > >> 1) iterating a sepcific cgroup tree with pre/post/up order > >> > >> 2) iterating cgroup_subsystem in BPF Prog, like > >> for_each_mem_cgroup_tree/cpuset_for_each_descendant_pre in kernel. > >> > >> The API design is consistent with cgroup_iter. bpf_iter_css_new accept= s > >> parameters defining iteration order and starting css. Here we also reu= se > >> BPF_CGROUP_ITER_DESCENDANTS_PRE, BPF_CGROUP_ITER_DESCENDANTS_POST, > >> BPF_CGROUP_ITER_ANCESTORS_UP enums. > >> > >> Signed-off-by: Chuyi Zhou > >> --- > >> kernel/bpf/cgroup_iter.c | 57 +++++++++++++++++= ++ > >> kernel/bpf/helpers.c | 3 + > >> .../testing/selftests/bpf/bpf_experimental.h | 6 ++ > >> 3 files changed, 66 insertions(+) > >> > >> diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c > >> index 810378f04fbc..ebc3d9471f52 100644 > >> --- a/kernel/bpf/cgroup_iter.c > >> +++ b/kernel/bpf/cgroup_iter.c > >> @@ -294,3 +294,60 @@ static int __init bpf_cgroup_iter_init(void) > >> } > >> > >> late_initcall(bpf_cgroup_iter_init); > >> + > >> +struct bpf_iter_css { > >> + __u64 __opaque[2]; > >> + __u32 __opaque_int[1]; > >> +} __attribute__((aligned(8))); > >> + > > > > same as before, __opaque[3] only > > > > > >> +struct bpf_iter_css_kern { > >> + struct cgroup_subsys_state *start; > >> + struct cgroup_subsys_state *pos; > >> + int order; > >> +} __attribute__((aligned(8))); > >> + > >> +__bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css *it, > >> + struct cgroup_subsys_state *start, enum bpf_cgroup_ite= r_order order) > > > > Similarly, I wonder if we should go for a more generic "flags" argument= ? > > > >> +{ > >> + struct bpf_iter_css_kern *kit =3D (void *)it; > > > > empty line > > > >> + kit->start =3D NULL; > >> + BUILD_BUG_ON(sizeof(struct bpf_iter_css_kern) !=3D sizeof(stru= ct bpf_iter_css)); > >> + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_kern) !=3D __alig= nof__(struct bpf_iter_css)); > > > > please move this up before kit->start assignment, and separate by empty= lines > > > >> + switch (order) { > >> + case BPF_CGROUP_ITER_DESCENDANTS_PRE: > >> + case BPF_CGROUP_ITER_DESCENDANTS_POST: > >> + case BPF_CGROUP_ITER_ANCESTORS_UP: > >> + break; > >> + default: > >> + return -EINVAL; > >> + } > >> + > >> + kit->start =3D start; > >> + kit->pos =3D NULL; > >> + kit->order =3D order; > >> + return 0; > >> +} > >> + > >> +__bpf_kfunc struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_= iter_css *it) > >> +{ > >> + struct bpf_iter_css_kern *kit =3D (void *)it; > > > > empty line > > > >> + if (!kit->start) > >> + return NULL; > >> + > >> + switch (kit->order) { > >> + case BPF_CGROUP_ITER_DESCENDANTS_PRE: > >> + kit->pos =3D css_next_descendant_pre(kit->pos, kit->st= art); > >> + break; > >> + case BPF_CGROUP_ITER_DESCENDANTS_POST: > >> + kit->pos =3D css_next_descendant_post(kit->pos, kit->s= tart); > >> + break; > >> + default: > > > > we know it's BPF_CGROUP_ITER_ANCESTORS_UP, so why not have that here ex= plicitly? > > > >> + kit->pos =3D kit->pos ? kit->pos->parent : kit->start; > >> + } > >> + > >> + return kit->pos; > > > > wouldn't this implementation never return the "start" css? is that inte= ntional? > > > > Thanks for the review. > > This implementation actually would return the "start" css. > > 1. BPF_CGROUP_ITER_DESCENDANTS_PRE: > 1.1 when we first call next(), css_next_descendant_pre(NULL, kit->start) > will return kit->start. > 1.2 second call next(), css_next_descendant_pre(kit->start, kit->start) > would return a first valid child under kit->start with pre-order > 1.3 third call next, css_next_descendant_pre(last_valid_child, > kit->start) would return the next valid child > ... > util css_next_descendant_pre return a NULL pointer, which means we have > visited all valid child including "start" css itself. > > The above logic is equal to macro 'css_for_each_descendant_pre' in kernel= . > > Same, BPF_CGROUP_ITER_DESCENDANTS_POST is equal to macro > 'css_for_each_descendant_post' which would return 'start' css when we > have visited all valid child. > > 2. BPF_CGROUP_ITER_ANCESTORS_UP > 2.1 when we fisrt call next(), kit->pos is NULL, and we would return > kit->start. > > > The selftest in patch7 whould check: > 1. when we use BPF_CGROUP_ITER_DESCENDANTS_PRE to iterate a cgroup tree, > the first cgroup we visted should be root('start') cgroup. > 2. when we use BPF_CGROUP_ITER_DESCENDANTS_POST to iterate a cgroup > tree, the last cgroup we visited should be root('start') cgroup. > > > Am I miss something important? > No, again, my bad, I didn't trace the logic completely before asking. All makes sense with kit->pos being initialized to NULL. Thanks for elaborating! > > Thanks. > > >