Received: by 2002:a05:7412:2a8c:b0:e2:908c:2ebd with SMTP id u12csp3003989rdh; Wed, 27 Sep 2023 22:28:06 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH3pbhKvZ9cU1/KJDGrDUVB+ZYncNYfmOHN7IjDj4i1HSLaSApAbWhd4v4CYrZ6mRGIL/W1 X-Received: by 2002:a17:903:493:b0:1c6:1d52:2778 with SMTP id jj19-20020a170903049300b001c61d522778mr200607plb.1.1695878885738; Wed, 27 Sep 2023 22:28:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695878885; cv=none; d=google.com; s=arc-20160816; b=fVCiOaEn2itWZvHC04ocmQeFy94iDudGO8p2yvQfzweh+vC+jgvW0aylMYVnAO3FAe PU1394ujVY4UnuhrTcko6yWgD70EK1nuaMtjoq5qLE5dfi8GApY5mBg9fm4otrMld27S w2RQQMGmmEn4qQtZz98TvnjKheVKu/soVczDan8MxCnG6v2bS7rbCIkPTOBgDKRhQIpQ DJfDyTHFMJmqkozM3k8W/OIArKYhdclNZddH9bph6AcDAsjh243b9XGFPcP6DUhZUU/M 0qqrE/GpPqEqZVGzpQ0g/DqRKzfzOywewk2Boxbwt6EQ49FyGReyCMSOvIKABbGbR9f2 +v5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:subject:user-agent:mime-version:date:message-id :dkim-signature; bh=KuGpzWUz7WllqNPIiFw60y239AVC+GmSn3mdK28s17Q=; fh=v95LKH/FC/ox3RbaAOhxEcPZKseyeUOjmVin6/WheA4=; b=LYP1Vw/TZFNXNOLjATF0tHNE7+v8z+W1TiKf9Oze3NZAsQNKK0mIR7ymu6Feu16pjL zQaaXR2kIDfj5cZzDH9hOHxiSEIDH72rsp0jbw0YgDFSdI69e3htDuoPXOrZoFEbNop0 Qxb2lE5YiV3PkQJPkQnP25fXyxXeRl7AAoU3+GwF34JtjxFXCdLe5w39HZCX6k2Rc585 dl1XTpmJ5If3AyRO8qwd3bETr3FTyhUuZ8pls5ZcVx36e2hJCf66qoS9qaxrRd1UtxR4 CyvU1d3RlQ57aEqJIf9oEGSheQ74boJjcV8ZyrVpH6nkb6zz9R5aNSS8TvCnk9y0vw/f vvEw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=MrUXHCgF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Return-Path: Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id u7-20020a170902e5c700b001c61073b06bsi11713818plf.427.2023.09.27.22.28.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Sep 2023 22:28:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=MrUXHCgF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 9FB2080763C5; Wed, 27 Sep 2023 20:30:17 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230123AbjI1DaC (ORCPT + 99 others); Wed, 27 Sep 2023 23:30:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229854AbjI1D37 (ORCPT ); Wed, 27 Sep 2023 23:29:59 -0400 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 25F24122 for ; Wed, 27 Sep 2023 20:29:58 -0700 (PDT) Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-7742da399a2so536436785a.0 for ; Wed, 27 Sep 2023 20:29:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1695871797; x=1696476597; darn=vger.kernel.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=KuGpzWUz7WllqNPIiFw60y239AVC+GmSn3mdK28s17Q=; b=MrUXHCgF6csYJaZtGtqH9CzwM3OyIVjou75BAKkQ7RG2fYn2Cc9EUExWhuAg9AaqqR IxcdMpfsdTMEwpxA6MyixWsZJz396U9tPOM1dg5AT7p36t+J4XmG0BUqLjmNsQ5fOcrC PYRbQsy5jvQJs8oKb3IbAn8fEgTdHwkaHo/Z8JLNd/eFE1Yzofz5R0O5MzyTgRL7WN6K lr/Xk7dE+nbHRB9CUNW7Lca8J0UfWHSp8NmbIVx+ZkSn0D1CNyAKiM5P/M3NzJm08L1y +MlOaRW4Baqwy/LOo+9oMwJnDCGh1Nj1bXVswS3HcFUPWDrD+6TDC6HeEZWdt7deUhYP y1wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695871797; x=1696476597; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=KuGpzWUz7WllqNPIiFw60y239AVC+GmSn3mdK28s17Q=; b=DSSir+JGmu0hZiD2LCm66yYJfIiMnOgK3qqhlXhLnMfAyZlXvwGcIzWZ31Mk6ug17v 6Ua3vRC8u2JBdKU2Ok9kZTO3PnujD/dznBB1xLVxce7J4iFOteMzzPgCTxD6gfNwuCZd GDKoz4zEKF1xUVVKvIvyGVEMtrT0bMGouO0sa9NM1mog5/80BFePnDO2T+i8ZRT1h8R6 9e+QPM45f+nP87pLwDVnKnoXy02i98hPtdO9OxMy40acXTabkONvWHzJWLZoBFHnaKdq rr704EOyCcLLrI02Ej9HbYFc7Ct5ZdJHU3c8Hyd2G2Wa7r76TF6wsYLKyVyBKFynLxAP pGGw== X-Gm-Message-State: AOJu0YxSALd5s1u5Kfo7mk9pn6Y9B7X1P8jUgNdHPJpH/v7JO5XHHEwW OthzRPV35vngeGWYq8jzVnKG76iDj/cAw59wjic= X-Received: by 2002:a05:620a:221a:b0:774:f7b:f0a with SMTP id m26-20020a05620a221a00b007740f7b0f0amr44046qkh.76.1695871797184; Wed, 27 Sep 2023 20:29:57 -0700 (PDT) Received: from [10.255.173.165] ([139.177.225.224]) by smtp.gmail.com with ESMTPSA id j17-20020aa78dd1000000b0068c1ac1784csm12541842pfr.59.2023.09.27.20.29.53 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 27 Sep 2023 20:29:56 -0700 (PDT) Message-ID: <716adfa5-bd5d-3fe2-108c-ff24b2e81420@bytedance.com> Date: Thu, 28 Sep 2023 11:29:51 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: Re: [PATCH bpf-next v3 3/7] bpf: Introduce task open coded iterator kfuncs To: Andrii Nakryiko Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org References: <20230925105552.817513-1-zhouchuyi@bytedance.com> <20230925105552.817513-4-zhouchuyi@bytedance.com> From: Chuyi Zhou In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.3 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Wed, 27 Sep 2023 20:30:17 -0700 (PDT) Hello, 在 2023/9/28 07:20, Andrii Nakryiko 写道: > On Mon, Sep 25, 2023 at 3:56 AM Chuyi Zhou wrote: >> >> This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow >> creation and manipulation of struct bpf_iter_task in open-coded iterator >> style. BPF programs can use these kfuncs or through bpf_for_each macro to >> iterate all processes in the system. >> >> The API design keep consistent with SEC("iter/task"). bpf_iter_task_new() >> accepts a specific task and iterating type which allows: >> 1. iterating all process in the system >> >> 2. iterating all threads in the system >> >> 3. iterating all threads of a specific task >> Here we also resuse enum bpf_iter_task_type and rename BPF_TASK_ITER_TID >> to BPF_TASK_ITER_THREAD, rename BPF_TASK_ITER_TGID to BPF_TASK_ITER_PROC. >> >> The newly-added struct bpf_iter_task has a name collision with a selftest >> for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is >> renamed in order to avoid the collision. >> >> Signed-off-by: Chuyi Zhou >> --- >> include/linux/bpf.h | 8 +- >> kernel/bpf/helpers.c | 3 + >> kernel/bpf/task_iter.c | 96 ++++++++++++++++--- >> .../testing/selftests/bpf/bpf_experimental.h | 5 + >> .../selftests/bpf/prog_tests/bpf_iter.c | 18 ++-- >> .../{bpf_iter_task.c => bpf_iter_tasks.c} | 0 >> 6 files changed, 106 insertions(+), 24 deletions(-) >> rename tools/testing/selftests/bpf/progs/{bpf_iter_task.c => bpf_iter_tasks.c} (100%) >> > > [...] > >> @@ -692,9 +692,9 @@ static int bpf_iter_fill_link_info(const struct bpf_iter_aux_info *aux, struct b >> static void bpf_iter_task_show_fdinfo(const struct bpf_iter_aux_info *aux, struct seq_file *seq) >> { >> seq_printf(seq, "task_type:\t%s\n", iter_task_type_names[aux->task.type]); >> - if (aux->task.type == BPF_TASK_ITER_TID) >> + if (aux->task.type == BPF_TASK_ITER_THREAD) >> seq_printf(seq, "tid:\t%u\n", aux->task.pid); >> - else if (aux->task.type == BPF_TASK_ITER_TGID) >> + else if (aux->task.type == BPF_TASK_ITER_PROC) >> seq_printf(seq, "pid:\t%u\n", aux->task.pid); >> } >> >> @@ -856,6 +856,80 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) >> bpf_mem_free(&bpf_global_ma, kit->css_it); >> } >> >> +struct bpf_iter_task { >> + __u64 __opaque[2]; >> + __u32 __opaque_int[1]; > > this should be __u64 __opaque[3], because struct takes full 24 bytes > >> +} __attribute__((aligned(8))); >> + >> +struct bpf_iter_task_kern { >> + struct task_struct *task; >> + struct task_struct *pos; >> + unsigned int type; >> +} __attribute__((aligned(8))); >> + >> +__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task, unsigned int type) > > nit: type -> flags, so we can add a bit more stuff, if necessary > >> +{ >> + struct bpf_iter_task_kern *kit = (void *)it; > > empty line after variable declarations > >> + BUILD_BUG_ON(sizeof(struct bpf_iter_task_kern) != sizeof(struct bpf_iter_task)); >> + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) != >> + __alignof__(struct bpf_iter_task)); > > and I'd add empty line here to keep BUILD_BUG_ON block separate > >> + kit->task = kit->pos = NULL; >> + switch (type) { >> + case BPF_TASK_ITER_ALL: >> + case BPF_TASK_ITER_PROC: >> + case BPF_TASK_ITER_THREAD: >> + break; >> + default: >> + return -EINVAL; >> + } >> + >> + if (type == BPF_TASK_ITER_THREAD) >> + kit->task = task; >> + else >> + kit->task = &init_task; >> + kit->pos = kit->task; >> + kit->type = type; >> + return 0; >> +} >> + >> +__bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it) >> +{ >> + struct bpf_iter_task_kern *kit = (void *)it; >> + struct task_struct *pos; >> + unsigned int type; >> + >> + type = kit->type; >> + pos = kit->pos; >> + >> + if (!pos) >> + goto out; >> + >> + if (type == BPF_TASK_ITER_PROC) >> + goto get_next_task; >> + >> + kit->pos = next_thread(kit->pos); >> + if (kit->pos == kit->task) { >> + if (type == BPF_TASK_ITER_THREAD) { >> + kit->pos = NULL; >> + goto out; >> + } >> + } else >> + goto out; >> + >> +get_next_task: >> + kit->pos = next_task(kit->pos); >> + kit->task = kit->pos; >> + if (kit->pos == &init_task) >> + kit->pos = NULL; > > I can't say I completely follow the logic (e.g., for > BPF_TASK_ITER_PROC, why do we do next_task() on first next() call)? > Can you elabore the expected behavior for various combinations of > types and starting task argument? > Thanks for the review. The expected behavior of current implementation is: BPF_TASK_ITER_PROC: init_task->first_process->second_process->...->last_process->init_task We would exit before visiting init_task again. BPF_TASK_ITER_THREAD: group_task->first_thread->second_thread->...->last_thread->group_task We would exit before visiting group_task again. BPF_TASK_ITER_ALL: init_task -> first_process -> second_process -> ... | | -> first_thread.. | -> first_thread Actually, every next() call, we would return the "pos" which was prepared by previous next() call, and use next_task()/next_thread() to update kit->pos. Once we meet the exit condition (next_task() return init_task or next_thread() return group_task), we would update kit->pos to NULL. In this way, when next() is called again, we will terminate the iteration. Here "kit->pos = NULL;" means we would return the last valid "pos" and will return NULL in next call to exit from the iteration. Am I miss something important? Thanks.