2007-01-31 14:34:58

by Eddie Pettis

Subject: How to locate struct file * from a bio?

Short question: Is it possible to locate the struct file * associated
with a bio? If so, how?

Longer version: I am working on a project that requires measuring the
popularity of each file in a filesystem. I have made several attempts
to locate all the file reads by grepping for ->readpage() and
->readpages() calls, but I am still missing several file reads. I
have not yet looked for file writes.

I *have* been able to catch every bio access, which leads to my
question. Is it possible to locate the struct file * associated with
a bio? If so, how? Efficiency is not my primary issue right now.
I'm mainly concerned with it "just working."

Thanks in advance!

--

Eddie Pettis
Electrical and Computer Engineering
Purdue University


2007-01-31 14:44:25

by Al Viro

Subject: Re: How to locate struct file * from a bio?

On Wed, Jan 31, 2007 at 09:34:54AM -0500, Eddie Pettis wrote:
> Short question: Is it possible to locate the struct file * associated
> with a bio? If so, how?

Obviously impossible. For one thing, there might very well be no inode,
let alone struct file, associated with the bio in question (e.g. for any
filesystem metadata). Moreover, the same on-disk object may get IO
without any struct file at all (e.g. a directory) or with many struct
files (e.g. any file independently opened by several processes; no matter
how many of them do reads, we'll get stuff pulled into page cache the
same way, and once, not once per struct file).
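
The many-struct-file case can be seen from userspace. The sketch below
is an illustration only, not part of the original message: it opens one
path twice and compares what fstat() reports. Each successful open()
creates its own open file description (a struct file in the kernel), yet
both descriptors name the same device and inode. The helper name
two_opens_share_one_inode is hypothetical.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` twice and report whether the two
 * distinct descriptors refer to the same (st_dev, st_ino) pair.
 * Returns false if either open() or fstat() fails.
 */
static bool two_opens_share_one_inode(const char *path)
{
    struct stat a, b;
    bool same = false;
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    if (fd1 >= 0 && fd2 >= 0 &&
        fstat(fd1, &a) == 0 && fstat(fd2, &b) == 0) {
        /* Two descriptors, one underlying object. */
        same = fd1 != fd2 &&
               a.st_dev == b.st_dev &&
               a.st_ino == b.st_ino;
    }

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return same;
}
```

Since the data read through either descriptor comes from the same cached
pages, a bio issued to fill those pages cannot be attributed to one
opener rather than the other.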

2007-01-31 14:51:57

by Al Viro

Subject: Re: How to locate struct file * from a bio?

On Wed, Jan 31, 2007 at 02:44:23PM +0000, Al Viro wrote:
> On Wed, Jan 31, 2007 at 09:34:54AM -0500, Eddie Pettis wrote:
> > Short question: Is it possible to locate the struct file * associated
> > with a bio? If so, how?
>
> Obviously impossible. For one thing, there might very well be no inode,
> let alone struct file, associated with the bio in question (e.g. for any
> filesystem metadata). Moreover, the same on-disk object may get IO
> without any struct file at all (e.g. a directory) or with many struct
> files (e.g. any file independently opened by several processes; no matter
> how many of them do reads, we'll get stuff pulled into page cache the
> same way, and once, not once per struct file).

BTW, here's a good testcase for you: /etc/ld.so.cache; it's accessed at
practically any execve(), so it should be very close to top of the
popularity list (right there with /lib/libc.so.6)...

2007-01-31 15:21:05

by Helge Hafting

Subject: Re: How to locate struct file * from a bio?

Eddie Pettis wrote:
> Short question: Is it possible to locate the struct file * associated
> with a bio? If so, how?
>
> Longer version: I am working on a project that requires measuring the
> popularity of each file in a filesystem. I have made several attempts
> to locate all the file reads by grepping for ->readpage() and
> ->readpages() calls, but I am still missing several file reads. I
> have not yet looked for file writes.
>
> I *have* been able to catch every bio access, which leads to my
> question. Is it possible to locate the struct file * associated with
> a bio? If so, how? Efficiency is not my primary issue right now.
> I'm mainly concerned with it "just working."
Looks like you do this the wrong way.

Why don't you tap into "open" instead?
Here you can note who opens the file and if they open it for
reading or writing. If you really need the amount of data
transferred, consider trapping the read and write syscalls too.
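
A rough userspace sketch of the bookkeeping such an "open" tap might
feed, assuming the hook is handed the same path and flags that open()
received. It is not kernel code: record_open, struct file_stats, and the
fixed-size table are all hypothetical names invented for this
illustration. Counts are keyed by (device, inode) rather than by path, so
hard links and repeated opens land in the same entry.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_FILES 1024  /* arbitrary table size for this sketch */

/* Per-file popularity counters, keyed by (dev, ino). */
struct file_stats {
    dev_t dev;
    ino_t ino;
    unsigned long opens_for_read;
    unsigned long opens_for_write;
};

static struct file_stats table[MAX_FILES];
static size_t table_used;

/* Find the entry for (dev, ino), creating it if needed; NULL if full. */
static struct file_stats *lookup(dev_t dev, ino_t ino)
{
    size_t i;

    for (i = 0; i < table_used; i++)
        if (table[i].dev == dev && table[i].ino == ino)
            return &table[i];
    if (table_used == MAX_FILES)
        return NULL;
    table[table_used].dev = dev;
    table[table_used].ino = ino;
    table[table_used].opens_for_read = 0;
    table[table_used].opens_for_write = 0;
    return &table[table_used++];
}

/*
 * Hypothetical hook: note one open of `path` with open()-style `flags`.
 * Returns 0 on success, -1 if the file cannot be stat'ed or the table
 * is full.
 */
static int record_open(const char *path, int flags)
{
    struct stat st;
    struct file_stats *fs;
    int acc = flags & O_ACCMODE;

    if (stat(path, &st) != 0)
        return -1;
    fs = lookup(st.st_dev, st.st_ino);
    if (fs == NULL)
        return -1;
    if (acc == O_RDONLY || acc == O_RDWR)
        fs->opens_for_read++;
    if (acc == O_WRONLY || acc == O_RDWR)
        fs->opens_for_write++;
    return 0;
}
```

Byte counts from read and write taps could be added as two more fields
in the same entry.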

Helge Hafting

2007-01-31 18:46:00

by Jan Engelhardt

Subject: Re: How to locate struct file * from a bio?


On Jan 31 2007 16:18, Helge Hafting wrote:
> Eddie Pettis wrote:
>> Longer version: I am working on a project that requires measuring the
>> popularity of each file in a filesystem. I have made several attempts
>> to locate all the file reads by grepping for ->readpage() and
>> ->readpages() calls, but I am still missing several file reads. I
>> have not yet looked for file writes.
>
> Why don't you tap into "open" instead?
> Here you can note who opens the file and if they open it for
> reading or writing. If you really need the amount of data
> transferred, consider trapping the read and write syscalls too.

And to add the sugar on top: in case you can live without tracing / (root
filesystem), you can write your very own fuse filesystem layer in a few
minutes and trace every small thing. Or perhaps take an existing project
(aufs/unionfs) and enhance the module with some of the wanted hooks, etc.


Jan
--

2007-01-31 18:55:18

by Eddie Pettis

Subject: Re: How to locate struct file * from a bio?

On 1/31/07, Helge Hafting <[email protected]> wrote:
> Eddie Pettis wrote:
> > Short question: Is it possible to locate the struct file * associated
> > with a bio? If so, how?
> >
>
> Looks like you do this the wrong way.

Agreed. It was a bad hack based on something I had done previously.

>
> Why don't you tap into "open" instead?
> Here you can note who opens the file and if they open it for
> reading or writing. If you really need the amount of data
> transferred, consider trapping the read and write syscalls too.

I added hooks to the sys_* system calls for read/write/open/close from
userspace, and that seems to work.

Thanks!

>
> Helge Hafting
>