From: Andrii Nakryiko
Date: Thu, 14 Sep 2023 16:26:32 -0700
Subject: Re: [PATCH bpf-next v2 3/6] bpf: Introduce process open coded iterator kfuncs
To: Chuyi Zhou
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230912070149.969939-4-zhouchuyi@bytedance.com>
References: <20230912070149.969939-1-zhouchuyi@bytedance.com> <20230912070149.969939-4-zhouchuyi@bytedance.com>

On Tue, Sep 12, 2023 at 12:02 AM Chuyi Zhou wrote:
>
> This patch adds kfuncs bpf_iter_process_{new,next,destroy} which allow
> creation and manipulation of struct bpf_iter_process in open-coded iterator
> style. BPF programs can use these kfuncs or through bpf_for_each macro to
> iterate all processes in the system.
>
> Signed-off-by: Chuyi Zhou
> ---
>  include/uapi/linux/bpf.h       |  4 ++++
>  kernel/bpf/helpers.c           |  3 +++
>  kernel/bpf/task_iter.c         | 29 +++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |  4 ++++
>  tools/lib/bpf/bpf_helpers.h    |  5 +++++
>  5 files changed, 45 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index de02c0971428..befa55b52e29 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -7322,4 +7322,8 @@ struct bpf_iter_css_task {
>         __u64 __opaque[1];
>  } __attribute__((aligned(8)));
>
> +struct bpf_iter_process {
> +       __u64 __opaque[1];
> +} __attribute__((aligned(8)));
> +
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index d6a16becfbb9..9b7d2c6f99d1 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2507,6 +2507,9 @@ BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
>  BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW)
>  BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_iter_process_new, KF_ITER_NEW)
> +BTF_ID_FLAGS(func, bpf_iter_process_next, KF_ITER_NEXT | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_iter_process_destroy, KF_ITER_DESTROY)
>  BTF_ID_FLAGS(func, bpf_dynptr_adjust)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_null)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index d8539cc05ffd..9d1927dc3a06 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -851,6 +851,35 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it)
>         kfree(kit->css_it);
>  }
>
> +struct bpf_iter_process_kern {
> +       struct task_struct *tsk;
> +} __attribute__((aligned(8)));
> +

A few high-level thoughts. I think it would be good to follow the
SEC("iter/task") naming and approach. Open-coded iterators are in many
ways the in-kernel counterpart to iterator programs, so keeping them
close enough, within reason, is useful for knowledge transfer.

SEC("iter/task") allows you to:
  a) iterate all threads in the system
  b) iterate all threads for a given TGID
  c) "iterate" a single thread or process; that's a bit less relevant
     for an in-kernel iterator, but we can still support it, why not?

I'm not sure if it supports iterating all processes (as in, the group
leader of each thread group) in the system, but if that's possible I
think we should support it, at least for the open-coded iterator; it
seems like very useful functionality.

So to that end, let's design a small set of input arguments for
bpf_iter_process_new() that would allow specifying this: flags plus
either an (optional) struct task_struct * pointer to represent the
task/process, or a PID/TGID.

> +__bpf_kfunc int bpf_iter_process_new(struct bpf_iter_process *it)

Also, given that the iterator in the previous patch is called css_task,
and that we have iter/task, iter/task_vma, etc. iterator programs,
shouldn't this one be called bpf_iter_task_new()? That would also be
consistent with Dave's task_vma open-coded iterator.

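To make the "flags + optional task pointer" idea a bit more concrete, here
is a rough userspace mock of the new/next/destroy shape. Everything in it
is made up for illustration: the mock_* names, the flag values, and the
static task table are assumptions, not a proposal for the actual kfunc
signatures or UAPI, and none of the kernel's locking or RCU concerns are
modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock only; none of these names exist in the kernel. */
struct mock_task {
	int pid;	/* thread ID */
	int tgid;	/* thread group (process) ID */
};

/* Hypothetical iteration modes selected at construction time. */
enum mock_iter_task_flags {
	MOCK_ITER_ALL_THREADS  = 0,	/* every thread in the system */
	MOCK_ITER_ALL_PROCS    = 1,	/* group leaders only (pid == tgid) */
	MOCK_ITER_PROC_THREADS = 2,	/* threads in the given task's group */
};

/* Stand-in for the system task list: three processes, six threads. */
static const struct mock_task mock_tasks[] = {
	{ 1, 1 }, { 10, 10 }, { 11, 10 }, { 12, 10 }, { 20, 20 }, { 21, 20 },
};
#define MOCK_NR_TASKS (sizeof(mock_tasks) / sizeof(mock_tasks[0]))

struct mock_iter_task {
	size_t pos;		/* index of the next candidate task */
	unsigned int flags;	/* one of enum mock_iter_task_flags */
	int tgid;		/* filter used by MOCK_ITER_PROC_THREADS */
};

/* Returns 0 on success, -1 on an invalid flags/task combination. */
static int mock_iter_task_new(struct mock_iter_task *it,
			      const struct mock_task *task, unsigned int flags)
{
	if (flags > MOCK_ITER_PROC_THREADS)
		return -1;
	if (flags == MOCK_ITER_PROC_THREADS && !task)
		return -1;	/* this mode needs a task to scope to */

	it->pos = 0;
	it->flags = flags;
	it->tgid = task ? task->tgid : 0;
	return 0;
}

/* Returns the next task matching the mode, or NULL when exhausted. */
static const struct mock_task *mock_iter_task_next(struct mock_iter_task *it)
{
	assert(it->pos <= MOCK_NR_TASKS);

	while (it->pos < MOCK_NR_TASKS) {
		const struct mock_task *t = &mock_tasks[it->pos++];

		if (it->flags == MOCK_ITER_ALL_PROCS && t->pid != t->tgid)
			continue;
		if (it->flags == MOCK_ITER_PROC_THREADS && t->tgid != it->tgid)
			continue;
		return t;
	}
	return NULL;
}

static void mock_iter_task_destroy(struct mock_iter_task *it)
{
	(void)it;	/* nothing to release in the mock */
}

/* Helper: count how many tasks one mode visits, or -1 if new() fails. */
static int mock_count(const struct mock_task *task, unsigned int flags)
{
	struct mock_iter_task it;
	int n = 0;

	if (mock_iter_task_new(&it, task, flags))
		return -1;
	while (mock_iter_task_next(&it))
		n++;
	mock_iter_task_destroy(&it);
	return n;
}
```

With the table above, mock_count(NULL, MOCK_ITER_ALL_PROCS) visits only the
three group leaders, while passing &mock_tasks[2] with
MOCK_ITER_PROC_THREADS walks the three threads of TGID 10. The point is
only that one constructor plus a mode flag can cover the cases listed
above without needing a separate iterator type per case.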
> +{
> +       struct bpf_iter_process_kern *kit = (void *)it;
> +
> +       BUILD_BUG_ON(sizeof(struct bpf_iter_process_kern) != sizeof(struct bpf_iter_process));
> +       BUILD_BUG_ON(__alignof__(struct bpf_iter_process_kern) !=
> +                                       __alignof__(struct bpf_iter_process));
> +
> +       kit->tsk = &init_task;
> +       return 0;
> +}
> +
> +__bpf_kfunc struct task_struct *bpf_iter_process_next(struct bpf_iter_process *it)
> +{
> +       struct bpf_iter_process_kern *kit = (void *)it;
> +
> +       kit->tsk = next_task(kit->tsk);
> +
> +       return kit->tsk == &init_task ? NULL : kit->tsk;
> +}
> +
> +__bpf_kfunc void bpf_iter_process_destroy(struct bpf_iter_process *it)
> +{
> +}
> +
>  DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work);
>
>  static void do_mmap_read_unlock(struct irq_work *entry)
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index de02c0971428..befa55b52e29 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -7322,4 +7322,8 @@ struct bpf_iter_css_task {
>         __u64 __opaque[1];
>  } __attribute__((aligned(8)));
>
> +struct bpf_iter_process {
> +       __u64 __opaque[1];
> +} __attribute__((aligned(8)));
> +
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
> index f48723c6c593..858252c2641c 100644
> --- a/tools/lib/bpf/bpf_helpers.h
> +++ b/tools/lib/bpf/bpf_helpers.h
> @@ -310,6 +310,11 @@ extern int bpf_iter_css_task_new(struct bpf_iter_css_task *it,
>  extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) __weak __ksym;
>  extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __weak __ksym;
>
> +struct bpf_iter_process;
> +extern int bpf_iter_process_new(struct bpf_iter_process *it) __weak __ksym;
> +extern struct task_struct *bpf_iter_process_next(struct bpf_iter_process *it) __weak __ksym;
> +extern void bpf_iter_process_destroy(struct bpf_iter_process *it) __weak __ksym;
> +

Same here: please add this to bpf_experimental.h, not bpf_helpers.h.

>  #ifndef bpf_for_each
>  /* bpf_for_each(iter_type, cur_elem, args...) provides generic construct for
>   * using BPF open-coded iterators without having to write mundane explicit
> --
> 2.20.1
>