Received: by 2002:a05:7412:2a8c:b0:e2:908c:2ebd with SMTP id u12csp2987822rdh; Wed, 27 Sep 2023 21:36:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEOQ7R6gsytpDsRBey0eeLgPEPTdcoVdJX1nf7aKcnb/7V5V5xXdrVrKbDaHNPB0Vhsp544 X-Received: by 2002:aca:2811:0:b0:3ab:8574:e8ab with SMTP id 17-20020aca2811000000b003ab8574e8abmr182996oix.21.1695875784281; Wed, 27 Sep 2023 21:36:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695875784; cv=none; d=google.com; s=arc-20160816; b=ZvjZEdxoZWKSUTk4UwGzgLW4ua8a85coTHQeWATocy2r6VUikb6zvM2BHb/G9BJ2ti SkHcjdphoBWQFLBVQZezGESm4tU6clAFtCeFGPDB/e9rx4yQ6PnMBJY5VzS34vSn6o29 5Df+9pIBQV0iaNtGSfZoCzw97xafi1BuW75m8IyBqKWAtgNowB8wBnShECAmLQFltTFz xSi4M7mLefqsAoTDd3LF4o6IQr4VOL+MOfCYPXQ2u+BeCp5EsAvRhG5JIQbtM01VYn2u e8iAltw0zfUtRcMONUSeeIg9bKMllAd9nvltnaGwKsoSZKXk+wZWMQrYp7VOfvdej68c lq7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=dmrgC3kiMjmQ6SkAMsL82ZSNHcLV8JzvXRoTFPoIZ0s=; fh=luN1T+TZmJdWrv2MwLqOcKf0btUR/zj1LntAQkciNa8=; b=SItiU70MtW+EsCKBLntC2viZt2B1OoEWvER8AMotiIPkLdfu4ElRbC6JVfyhu+bnCS llTlcbYRA3ytdhj0rMc7hM5VOWez5Ai1BnwM+UwwSTs8te65BFIQEWFT97CligddfAkx RhqNbHm9XAee0lN3iQ1mgCONKy6kUGszDYo1MqUW3wByKPxN+O9IZmTet79TZlcsJjrl sex09OCV34nkspGHkGus4zpHnBH3h5gxC47JQxCTDxTdaDo01N8JIT8IrNaq6VwU1CgJ 7KF26hhWfTNIYPZvrj33WxIdkueRJQUV8W2e/GJoPyACgxv9iyXbYX5WQ7RiVeNUFKhx q5Sg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=af6hQN49; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id s21-20020a056a00179500b0069026254582si18707078pfg.98.2023.09.27.21.36.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Sep 2023 21:36:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=af6hQN49; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 70C6E80725E2; Wed, 27 Sep 2023 16:21:25 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229945AbjI0XVA (ORCPT + 99 others); Wed, 27 Sep 2023 19:21:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229816AbjI0XU7 (ORCPT ); Wed, 27 Sep 2023 19:20:59 -0400 Received: from mail-ej1-x632.google.com (mail-ej1-x632.google.com [IPv6:2a00:1450:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6251AF4; Wed, 27 Sep 2023 16:20:57 -0700 (PDT) Received: by mail-ej1-x632.google.com with SMTP id a640c23a62f3a-9adca291f99so1526335666b.2; Wed, 27 Sep 2023 16:20:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695856856; x=1696461656; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=dmrgC3kiMjmQ6SkAMsL82ZSNHcLV8JzvXRoTFPoIZ0s=; b=af6hQN497eFjOAH2zCY3sSZfJcDpODKOsZxce/KVIea5zO5pYF5AGnW+Kn60N/94Gc GVBDXzdsX6O/0u+MQE7tRIJOwIZU++aXzsbUdW89NkqqdwBLwqQ+ycXJt/YepDMjUWO3 9lw3mN2VM/mGU1CizWbQuLOFZlSD+cxostN14wiW2z96zdfk9+1iNmcf1kUHaMicIrhB 4ZVC9V4B4trwMFG1IwTSDl6P9tiTDA0ntSe5F7FTd2ogejFTQQZlvvLNkxF8OLAgnbYg 4heit1Z6zdju7Zg3HFLkwsLz61+GO+QG/HMTqXuklSqOQLbVZzQwWsO1mCSi4H2Epbw3 f9NA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695856856; x=1696461656; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dmrgC3kiMjmQ6SkAMsL82ZSNHcLV8JzvXRoTFPoIZ0s=; b=BfGPXgT5OR8A0vdOuDYEDekzXUCpOuqEbc2/1VthqgJJKZ4mnaX1jLTOUaPsINRyw8 WhVXgD8AySGwYYyidi69Vn0TECPV2cLY0CN4r6ViIA6AsIN6DOL7Q7SUteY8DWv+D9nK ktnHciRehm93e4RLGpcOJ0zqENKbmBVCpgtwIpLtjRHwZo0dHS4DNzJMhr0T6HnQKTTR fVBAKUNDSsNPg9q0Lh2xA3gu+r6ieFMFQwVgoNOvLS8SwNYca7u5MOOUrKpab8vmd/gY BiIpWSm2XmtuOXZPGsLdmpNpfA0ym9DAt0GajGmpat0JpcGCTqqLK35G1ef4twpSnhbj i5vA== X-Gm-Message-State: AOJu0YyR/dWiVIP8FywEhG35qmz2RDindp3B+uUUuzbeWCCIV4We5sze +VT5CtJViRGqIHqNLEhJQIxVT1v6SX0vuhR5NZw= X-Received: by 2002:a17:906:847c:b0:9a5:d095:a8e1 with SMTP id hx28-20020a170906847c00b009a5d095a8e1mr2621922ejc.11.1695856855470; Wed, 27 Sep 2023 16:20:55 -0700 (PDT) MIME-Version: 1.0 References: <20230925105552.817513-1-zhouchuyi@bytedance.com> <20230925105552.817513-4-zhouchuyi@bytedance.com> In-Reply-To: <20230925105552.817513-4-zhouchuyi@bytedance.com> From: Andrii Nakryiko Date: Wed, 27 Sep 2023 16:20:43 -0700 Message-ID: Subject: Re: [PATCH bpf-next v3 3/7] bpf: Introduce task open coded iterator kfuncs To: Chuyi Zhou Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-0.6 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Wed, 27 Sep 2023 16:21:25 -0700 (PDT) On Mon, Sep 25, 2023 at 3:56=E2=80=AFAM Chuyi Zhou wrote: > > This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow > creation and manipulation of struct bpf_iter_task in open-coded iterator > style. BPF programs can use these kfuncs or through bpf_for_each macro to > iterate all processes in the system. > > The API design keep consistent with SEC("iter/task"). bpf_iter_task_new() > accepts a specific task and iterating type which allows: > 1. iterating all process in the system > > 2. iterating all threads in the system > > 3. iterating all threads of a specific task > Here we also resuse enum bpf_iter_task_type and rename BPF_TASK_ITER_TID > to BPF_TASK_ITER_THREAD, rename BPF_TASK_ITER_TGID to BPF_TASK_ITER_PROC. > > The newly-added struct bpf_iter_task has a name collision with a selftest > for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is > renamed in order to avoid the collision. > > Signed-off-by: Chuyi Zhou > --- > include/linux/bpf.h | 8 +- > kernel/bpf/helpers.c | 3 + > kernel/bpf/task_iter.c | 96 ++++++++++++++++--- > .../testing/selftests/bpf/bpf_experimental.h | 5 + > .../selftests/bpf/prog_tests/bpf_iter.c | 18 ++-- > .../{bpf_iter_task.c =3D> bpf_iter_tasks.c} | 0 > 6 files changed, 106 insertions(+), 24 deletions(-) > rename tools/testing/selftests/bpf/progs/{bpf_iter_task.c =3D> bpf_iter_= tasks.c} (100%) > [...] > @@ -692,9 +692,9 @@ static int bpf_iter_fill_link_info(const struct bpf_i= ter_aux_info *aux, struct b > static void bpf_iter_task_show_fdinfo(const struct bpf_iter_aux_info *au= x, struct seq_file *seq) > { > seq_printf(seq, "task_type:\t%s\n", iter_task_type_names[aux->tas= k.type]); > - if (aux->task.type =3D=3D BPF_TASK_ITER_TID) > + if (aux->task.type =3D=3D BPF_TASK_ITER_THREAD) > seq_printf(seq, "tid:\t%u\n", aux->task.pid); > - else if (aux->task.type =3D=3D BPF_TASK_ITER_TGID) > + else if (aux->task.type =3D=3D BPF_TASK_ITER_PROC) > seq_printf(seq, "pid:\t%u\n", aux->task.pid); > } > > @@ -856,6 +856,80 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bp= f_iter_css_task *it) > bpf_mem_free(&bpf_global_ma, kit->css_it); > } > > +struct bpf_iter_task { > + __u64 __opaque[2]; > + __u32 __opaque_int[1]; this should be __u64 __opaque[3], because struct takes full 24 bytes > +} __attribute__((aligned(8))); > + > +struct bpf_iter_task_kern { > + struct task_struct *task; > + struct task_struct *pos; > + unsigned int type; > +} __attribute__((aligned(8))); > + > +__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_= struct *task, unsigned int type) nit: type -> flags, so we can add a bit more stuff, if necessary > +{ > + struct bpf_iter_task_kern *kit =3D (void *)it; empty line after variable declarations > + BUILD_BUG_ON(sizeof(struct bpf_iter_task_kern) !=3D sizeof(struct= bpf_iter_task)); > + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) !=3D > + __alignof__(struct bpf_iter_task)= ); and I'd add empty line here to keep BUILD_BUG_ON block separate > + kit->task =3D kit->pos =3D NULL; > + switch (type) { > + case BPF_TASK_ITER_ALL: > + case BPF_TASK_ITER_PROC: > + case BPF_TASK_ITER_THREAD: > + break; > + default: > + return -EINVAL; > + } > + > + if (type =3D=3D BPF_TASK_ITER_THREAD) > + kit->task =3D task; > + else > + kit->task =3D &init_task; > + kit->pos =3D kit->task; > + kit->type =3D type; > + return 0; > +} > + > +__bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task = *it) > +{ > + struct bpf_iter_task_kern *kit =3D (void *)it; > + struct task_struct *pos; > + unsigned int type; > + > + type =3D kit->type; > + pos =3D kit->pos; > + > + if (!pos) > + goto out; > + > + if (type =3D=3D BPF_TASK_ITER_PROC) > + goto get_next_task; > + > + kit->pos =3D next_thread(kit->pos); > + if (kit->pos =3D=3D kit->task) { > + if (type =3D=3D BPF_TASK_ITER_THREAD) { > + kit->pos =3D NULL; > + goto out; > + } > + } else > + goto out; > + > +get_next_task: > + kit->pos =3D next_task(kit->pos); > + kit->task =3D kit->pos; > + if (kit->pos =3D=3D &init_task) > + kit->pos =3D NULL; I can't say I completely follow the logic (e.g., for BPF_TASK_ITER_PROC, why do we do next_task() on first next() call)? Can you elabore the expected behavior for various combinations of types and starting task argument? > + > +out: > + return pos; > +} > + > +__bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it) > +{ > +} > + > DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); > > static void do_mmap_read_unlock(struct irq_work *entry) > diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testi= ng/selftests/bpf/bpf_experimental.h > index d3ea90f0e142..d989775dbdb5 100644 > --- a/tools/testing/selftests/bpf/bpf_experimental.h > +++ b/tools/testing/selftests/bpf/bpf_experimental.h > @@ -169,4 +169,9 @@ extern int bpf_iter_css_task_new(struct bpf_iter_css_= task *it, > extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_ta= sk *it) __weak __ksym; > extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __we= ak __ksym; > > +struct bpf_iter_task; > +extern int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struc= t *task, unsigned int type) __weak __ksym; > +extern struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it) = __weak __ksym; > +extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksy= m; > + > #endif > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/te= sting/selftests/bpf/prog_tests/bpf_iter.c please split out selftests changes from kernel-side changes. We only combine them if kernel changes break selftests, preventing bisection. [...]