2009-06-09 09:02:19

by Alexander Shishkin

[permalink] [raw]
Subject: [PATCH] [RESEND] RFC: List per-process file descriptor consumption when hitting file-max

From: Alexander Shishkin <[email protected]>

[resending to aim at wider audience]
When a file descriptor limit is hit, it might be useful to see all the
users in order to identify those that leak descriptors.

Signed-off-by: Alexander Shishkin <[email protected]>
---
fs/file_table.c | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 54018fe..9e53167 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -136,8 +136,35 @@ struct file *get_empty_filp(void)
over:
/* Ran out of filps - report that */
if (get_nr_files() > old_max) {
+ struct task_struct *p;
+ struct files_struct *files;
+ struct fdtable *fdt;
+ int i, count = 0;
+
printk(KERN_INFO "VFS: file-max limit %d reached\n",
get_max_files());
+
+ read_lock(&tasklist_lock);
+ for_each_process(p) {
+ files = get_files_struct(p);
+ if (!files)
+ continue;
+
+ spin_lock(&files->file_lock);
+ fdt = files_fdtable(files);
+
+ /* we have to actually *count* the fds */
+ for (count = i = 0; i < fdt->max_fds; i++)
+ count += !!fcheck_files(files, i);
+
+ printk(KERN_INFO "=> %s [%d]: %d\n", p->comm,
+ p->pid, count);
+
+ spin_unlock(&files->file_lock);
+ put_files_struct(files);
+ }
+ read_unlock(&tasklist_lock);
+
old_max = get_nr_files();
}
goto fail;
--
1.6.1.3
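
With the patch applied, hitting file-max would produce kernel log output of
roughly this shape. The process names and numbers below are purely
illustrative; only the line formats come from the printk() calls in the
patch:

```
VFS: file-max limit 8192 reached
=> leaky-daemon [1234]: 7942
=> sshd [890]: 12
=> bash [901]: 9
```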


2009-07-30 00:13:01

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] [RESEND] RFC: List per-process file descriptor consumption when hitting file-max

On Tue, 9 Jun 2009 12:01:11 +0300
[email protected] wrote:

> When a file descriptor limit is hit, it might be useful to see all the
> users in order to identify those that leak descriptors.

Well... maybe.

There is no end to the amount of code which we could add to the kernel
to help userspace developers and administrators resolve userspace bugs.
But that doesn't mean that we should add all these things to the
kernel.

If there's some reason why the problem is particularly severe and
particularly hard to resolve by other means then sure, perhaps explicit
kernel support is justified. But is that the case with this specific
userspace bug?

2009-10-11 12:09:26

by Alexander Shishkin

[permalink] [raw]
Subject: Re: [PATCH] [RESEND] RFC: List per-process file descriptor consumption when hitting file-max

2009/7/30 Andrew Morton <[email protected]>:
> If there's some reason why the problem is particularly severe and
> particularly hard to resolve by other means then sure, perhaps explicit
> kernel support is justified.  But is that the case with this specific
> userspace bug?

Well, this can be figured out by userspace by traversing procfs and
counting the entries under fd/ for each process, but that traversal is
itself likely to require more file descriptors, and since we are already
at the point where the limit has been hit, it may simply fail. There is,
of course, a good chance that the process that tried to open the
one-too-many descriptor will crash upon failing to do so (and thus free a
bunch of descriptors), but that only creates more confusion: most of the
time, the application that crashes when file-max is reached is not the
one that ate them all.
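
For reference, the userspace traversal described above might look like the
sketch below (in Python for illustration). The function name and the
proc_root parameter are assumptions added here so the logic can be pointed
at a test tree; on a real system it would read /proc. Note that each
os.listdir() call consumes a file descriptor, which is exactly why this
approach can fail once file-max has been hit:

```python
import os

def count_fds_per_process(proc_root="/proc"):
    """Return a {pid: open_fd_count} mapping for numeric procfs entries."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            counts[int(entry)] = len(os.listdir(fd_dir))
        except OSError:
            # Process exited, permission denied, or EMFILE/ENFILE --
            # the latter being the failure mode discussed above.
            continue
    return counts

if __name__ == "__main__" and os.path.isdir("/proc"):
    # Print the ten biggest descriptor consumers, largest first.
    top = sorted(count_fds_per_process().items(),
                 key=lambda kv: kv[1], reverse=True)[:10]
    for pid, n in top:
        print("=> [%d]: %d" % (pid, n))
```

The in-kernel report in the patch avoids this chicken-and-egg problem
entirely, since it counts the fd tables directly instead of opening
directories.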

So, all in all, in certain cases there's no other way to figure out
who was leaking descriptors.

Regards,
--
Alex