References: <20220916230853.49056-1-ivan@cloudflare.com> <20220916170115.35932cba34e2cc2d923b03b5@linux-foundation.org>
From: Ivan Babrou
Date: Sat, 17 Sep 2022 11:32:02 -0700
Subject: Re: [RFC] proc: report open files as size in stat() for /proc/pid/fd
To: Alexey Dobriyan
Cc: Andrew Morton, linux-fsdevel@vger.kernel.org, linux-kernel, kernel-team, Kalesh Singh, Al Viro
X-Mailing-List: linux-kernel@vger.kernel.org

> > > * Make fd count acces O(1) and expose it in /proc/pid/status
>
> This is doable, next to FDSize.

It feels like a better solution, but maybe I'm missing some context
here. Let me know whether this is preferred.

That said, I've tried doing it, but failed. There's a noticeable
mismatch in the numbers:

* systemd:

ivan@vm:~$ sudo ls -l /proc/1/fd | wc -l
66
ivan@vm:~$ cat /proc/1/status | fgrep FD
FDSize: 256
FDUsed: 71

* journald:

ivan@vm:~$ sudo ls -l /proc/803/fd | wc -l
29
ivan@vm:~$ cat /proc/803/status | fgrep FD
FDSize: 128
FDUsed: 37

I'll see if I can make it work next week. I'm happy to receive tips as well.

Below is my attempt (link in case gmail breaks patch formatting):

* https://gist.githubusercontent.com/bobrik/acce40881d629d8cce2e55966b31a0a2/raw/716eb4724a8fe3afeeb76fd2a7a47ee13790a9e9/fdused.patch

diff --git a/fs/file.c b/fs/file.c
index 3bcc1ecc314a..8bc0741cabf1 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -85,6 +85,8 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
 	memset((char *)nfdt->fd + cpy, 0, set);
 
 	copy_fd_bitmaps(nfdt, ofdt, ofdt->max_fds);
+
+	atomic_set(&nfdt->count, atomic_read(&ofdt->count));
 }
 
 /*
@@ -105,6 +107,7 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
 static struct fdtable * alloc_fdtable(unsigned int nr)
 {
 	struct fdtable *fdt;
+	atomic_t count = ATOMIC_INIT(0);
 	void *data;
 
 	/*
@@ -148,6 +151,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
 	fdt->close_on_exec = data;
 	data += nr / BITS_PER_BYTE;
 	fdt->full_fds_bits = data;
+	fdt->count = count;
 
 	return fdt;
 
@@ -399,6 +403,8 @@ struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int
 	/* clear the remainder */
 	memset(new_fds, 0, (new_fdt->max_fds - open_files) * sizeof(struct file *));
 
+	atomic_set(&new_fdt->count, atomic_read(&old_fdt->count));
+
 	rcu_assign_pointer(newf->fdt, new_fdt);
 
 	return newf;
@@ -474,6 +480,7 @@ struct files_struct init_files = {
 		.close_on_exec = init_files.close_on_exec_init,
 		.open_fds = init_files.open_fds_init,
 		.full_fds_bits = init_files.full_fds_bits_init,
+		.count = ATOMIC_INIT(0),
 	},
 	.file_lock = __SPIN_LOCK_UNLOCKED(init_files.file_lock),
 	.resize_wait = __WAIT_QUEUE_HEAD_INITIALIZER(init_files.resize_wait),
@@ -613,6 +620,7 @@ void fd_install(unsigned int fd, struct file *file)
 		BUG_ON(fdt->fd[fd] != NULL);
 		rcu_assign_pointer(fdt->fd[fd], file);
 		spin_unlock(&files->file_lock);
+		atomic_inc(&fdt->count);
 		return;
 	}
 	/* coupled with smp_wmb() in expand_fdtable() */
@@ -621,6 +629,7 @@ void fd_install(unsigned int fd, struct file *file)
 	BUG_ON(fdt->fd[fd] != NULL);
 	rcu_assign_pointer(fdt->fd[fd], file);
 	rcu_read_unlock_sched();
+	atomic_inc(&fdt->count);
 }
 
 EXPORT_SYMBOL(fd_install);
@@ -646,6 +655,7 @@ static struct file *pick_file(struct files_struct *files, unsigned fd)
 	if (file) {
 		rcu_assign_pointer(fdt->fd[fd], NULL);
 		__put_unused_fd(files, fd);
+		atomic_dec(&fdt->count);
 	}
 	return file;
 }
@@ -844,6 +854,7 @@ void do_close_on_exec(struct files_struct *files)
 			filp_close(file, files);
 			cond_resched();
 			spin_lock(&files->file_lock);
+			atomic_dec(&fdt->count);
 		}
 
 	}
@@ -1108,6 +1119,7 @@ __releases(&files->file_lock)
 	else
 		__clear_close_on_exec(fd, fdt);
 	spin_unlock(&files->file_lock);
+	atomic_inc(&fdt->count);
 
 	if (tofree)
 		filp_close(tofree, files);
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 99fcbfda8e25..5847f077bfc3 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -153,7 +153,8 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	struct task_struct *tracer;
 	const struct cred *cred;
 	pid_t ppid, tpid = 0, tgid, ngid;
-	unsigned int max_fds = 0;
+	struct fdtable *fdt;
+	unsigned int max_fds = 0, open_fds = 0;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -170,8 +171,11 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	task_lock(p);
 	if (p->fs)
 		umask = p->fs->umask;
-	if (p->files)
-		max_fds = files_fdtable(p->files)->max_fds;
+	if (p->files) {
+		fdt = files_fdtable(p->files);
+		max_fds = fdt->max_fds;
+		open_fds = atomic_read(&fdt->count);
+	}
 	task_unlock(p);
 	rcu_read_unlock();
 
@@ -194,6 +198,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
 	seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
+	seq_put_decimal_ull(m, "\nFDUsed:\t", open_fds);
 	seq_puts(m, "\nGroups:\t");
 
 	group_info = cred->group_info;
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index e066816f3519..59aceb1e4bc6 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -31,6 +31,7 @@ struct fdtable {
 	unsigned long *open_fds;
 	unsigned long *full_fds_bits;
 	struct rcu_head rcu;
+	atomic_t count;
 };
 
 static inline bool close_on_exec(unsigned int fd, const struct fdtable *fdt)

> > > +
> > > +    generic_fillattr(&init_user_ns, inode, stat);
>                            ^^^^^^^^^^^^^
>
> Is this correct? I'm not userns guy at all.

I mostly copied from here:

* https://elixir.bootlin.com/linux/v6.0-rc5/source/fs/proc/generic.c#L150

Maybe it can be simplified even further to match this one:

* https://elixir.bootlin.com/linux/v6.0-rc5/source/fs/proc/root.c#L317
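
A note on reproducing the comparison above: GNU `ls -l` prints a leading `total` line when listing a directory, so `ls -l /proc/<pid>/fd | wc -l` reports one more than the number of entries. The counts of 66 and 29 therefore likely correspond to 65 and 28 descriptors. Below is a minimal sketch of a counter that avoids this by counting the symlinks directly. The helper names `count_links`, `count_fds` and `fd_used` are hypothetical, and `fd_used` assumes a kernel with the FDUsed patch above applied; on an unpatched kernel it prints nothing.

```shell
# Count the symlinks directly inside a directory (hypothetical helper).
count_links() {
    dir=$1
    n=0
    for entry in "$dir"/*; do
        # An unmatched glob stays literal, so check every candidate.
        if [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Count the open descriptors of a PID by walking /proc/<pid>/fd.
# Reading another user's process needs root, as in the sudo calls above.
count_fds() {
    count_links "/proc/$1/fd"
}

# Print the FDUsed value from /proc/<pid>/status.
# Assumption: the field exists only with the patch from this email applied.
fd_used() {
    awk '/^FDUsed:/ { print $2 }' "/proc/$1/status"
}
```

Running `count_fds 1` and `fd_used 1` back to back (as root) gives the two numbers to compare. The process may open or close descriptors between the two reads, so a small transient difference is possible even with correct accounting.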