2003-02-13 17:19:21

by Bruno Diniz de Paula

Subject: How to bypass buffer caches?

Hi,

I've sent some messages about using O_DIRECT to read files, but I
suppose that is not possible using a 2.4 kernel and ext2. So I was
wondering which other alternatives I have to bypass the buffer cache of
the kernel. One option would be to create a raw device on top of my disk
partition, but in this case I would have to learn how to map a logical
file name (/var/tmp/myfile) to a set of disk blocks. Is there any other
solution? Can I disable buffer caches or at least limit their memory
utilization?

Thanks,

Bruno.
--
Bruno Diniz de Paula <[email protected]>
Rutgers University



2003-02-13 17:27:34

by Valdis Klētnieks

Subject: Re: How to bypass buffer caches?

On Thu, 13 Feb 2003 12:29:12 EST, Bruno Diniz de Paula <[email protected]> said:

> the kernel. One option would be to create a raw device on top of my disk
> partition, but in this case I would have to learn how to map a logical
> file name (/var/tmp/myfile) to a set of disk blocks. Is there any other

What's wrong with this?

fd = open("/dev/hda7", your_flags_here);



2003-02-13 17:41:29

by Bruno Diniz de Paula

Subject: Re: How to bypass buffer caches?

But what if "/dev/hda7" already has an ext2 fs set up? How am I supposed
to know which physical blocks on the disk correspond to each of my files
in the ext2 mapping, that is, "/var/somefile" or "/usr/local/otherfile"?

Thanks,
Bruno.

On Thu, 2003-02-13 at 12:37, [email protected] wrote:
> On Thu, 13 Feb 2003 12:29:12 EST, Bruno Diniz de Paula <[email protected]> said:
>
> > the kernel. One option would be to create a raw device on top of my disk
> > partition, but in this case I would have to learn how to map a logical
> > file name (/var/tmp/myfile) to a set of disk blocks. Is there any other
>
> What's wrong with this?
>
> fd = open("/dev/hda7", your_flags_here);
--
Bruno Diniz de Paula <[email protected]>
Rutgers University



2003-02-13 18:10:57

by Valdis Klētnieks

Subject: Re: How to bypass buffer caches?

On Thu, 13 Feb 2003 12:51:19 EST, Bruno Diniz de Paula said:

> But what if "/dev/hda7" already has an ext2 fs set up? How am I supposed
> to know which physical blocks on the disk correspond to each of my files
> in the ext2 mapping, that is, "/var/somefile" or "/usr/local/otherfile"?

The quick answer: Don't do that. ;)

Usually, this would be done by using /dev/hda7 as somefile and hda8 as
otherfile, or you'd create your own "filesystem" by saying "data for
somefile is in the first 2,000 blocks and otherfile is in blocks 2001+"
and so on. In other words, if you want a *raw partition*, you use one,
with *NO* filesystem involved.
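
The do-it-yourself layout described above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions: the 4096-byte
block size, the struct region type, the layout table, and the helper names
region_offset() and region_read_block() are all hypothetical and do not come
from the thread. Blocks are numbered from 0 here, so "the first 2,000 blocks"
are 0..1999 and otherfile starts at block 2000; the length given for
otherfile is made up.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096     /* assumed block size; the thread does not fix one */

/* One entry of a hand-rolled layout: a named run of blocks on the partition. */
struct region {
    const char *name;       /* hypothetical label */
    uint64_t first_block;   /* first block of the region (0-based) */
    uint64_t block_count;   /* number of blocks in the region */
};

/* somefile in the first 2,000 blocks, otherfile right after it.
 * The size of otherfile is an assumption made for this sketch. */
static const struct region layout[] = {
    { "somefile",  0,    2000 },
    { "otherfile", 2000, 2000 },
};

/* Byte offset of block `index` inside region `r`, or -1 if out of range. */
static off_t region_offset(const struct region *r, uint64_t index)
{
    if (index >= r->block_count)
        return -1;
    return (off_t)((r->first_block + index) * BLOCK_SIZE);
}

/* Read one whole block of a region from an already opened partition.
 * Returns 0 on success, -1 on a bad index, an I/O error, or a short read. */
static int region_read_block(int fd, const struct region *r,
                             uint64_t index, void *buf)
{
    off_t off = region_offset(r, index);

    assert(buf != NULL);
    if (off < 0)
        return -1;
    return pread(fd, buf, BLOCK_SIZE, off) == BLOCK_SIZE ? 0 : -1;
}
```

A caller would open the partition once, for example with
open("/dev/hda7", O_RDONLY), and then call
region_read_block(fd, &layout[1], 0, buf) to fetch the first block of
otherfile. The application, not the kernel, owns the table.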

Consider a product like Oracle (yes, I know I'm oversimplifying here).
If you have a database that takes 250M, it doesn't really care if it's
a 250M disk partition called /dev/hdc4 or a 250M file in a filesystem -
it just wants 250M of disk that *it* will worry about what goes in what
block.

The whole point of using a raw disk partition instead of a file is so that
you *DON'T* have to worry about what the in-kernel filesystem cache is doing
to you, or what other files on the partition are doing, etc. Note that most
of the problems (such as "do we need to fsync() here because of fs dain bramage"
or "do we need to worry about flushing the cache") arise because your code
is trying to second-guess the filesystem - so if you scribble directly
on the partition and bypass the filesystem, life gets easier (or at least then
all the bugs are self-inflicted, anyhow..)

So conceptually, you have an ext2/ext3 partition where allocation/management is done
by the ext2/3 filesystem code, a swap partition handled by the VM code,
a database partition that's run by the database code... and so on.



2003-02-13 23:26:54

by Andries Brouwer

Subject: Re: How to bypass buffer caches?


> On Thu, 13 Feb 2003 12:51:19 EST, Bruno Diniz de Paula said:
>
> > But what if "/dev/hda7" already has an ext2 fs set up? How am I supposed
> > to know which physical blocks on the disk correspond to each of my files
> > in the ext2 mapping, that is, "/var/somefile" or "/usr/local/otherfile"?
>
> The quick answer: Don't do that. ;)

But if you insist, there is the FIBMAP ioctl.
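
A minimal sketch of how the FIBMAP ioctl might be called from C, for
illustration only. The helper names file_block_to_disk_block() and
file_block_size() are hypothetical. FIBMAP and FIGETBSZ come from
<linux/fs.h>; FIBMAP normally needs root privileges, and the number it
returns is counted in filesystem blocks (the size reported by FIGETBSZ)
on the device that holds the filesystem, with 0 meaning the logical block
is a hole. Not every filesystem supports it.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>     /* FIBMAP, FIGETBSZ */
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the filesystem which device block backs logical block `logical`
 * of the file at `path`. Returns the block number (0 means a hole),
 * or -1 if the file cannot be opened or the ioctl fails. */
static long file_block_to_disk_block(const char *path, int logical)
{
    int fd, rc;
    int block = logical;  /* in: logical block, out: block on the device */

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, FIBMAP, &block);
    close(fd);
    return rc < 0 ? -1 : (long)block;
}

/* Block size the filesystem uses for this file, or -1 on error. */
static int file_block_size(const char *path)
{
    int fd, rc;
    int size = 0;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, FIGETBSZ, &size);
    close(fd);
    return rc < 0 ? -1 : size;
}
```

Multiplying the returned block number by the block size gives a byte offset
into the partition. The mapping can change whenever the filesystem rewrites
or moves the file, so reading those blocks behind a mounted filesystem's
back is fragile, which is the reason for the "don't do that" above.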

2003-02-14 08:25:38

by Helge Hafting

Subject: Re: How to bypass buffer caches?

Bruno Diniz de Paula wrote:
>
> Hi,
>
> I've sent some messages about using O_DIRECT to read files, but I
> suppose that is not possible using a 2.4 kernel and ext2. So I was
> wondering which other alternatives I have to bypass the buffer cache of
> the kernel.

You don't say why you need this. I recommend that you
simply don't use a filesystem - use a partition like
/dev/hda5 without a filesystem and read/write disk blocks
to and from it.

Without a filesystem you decide what data goes in what disk block,
and of course no fs cache gets in the way.

Transferring data between a range of blocks on a partition
and an ordinary file is easy - use the dd command.

file->partition
dd if=yourfile of=/dev/hdaX bs=4096 seek=<number of first block you want to use>

partition->file
dd if=/dev/hdaX of=yourfile bs=4096 skip=<number of first disk block you want copied> count=<total number of blocks>
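
For a program that wants to do the file->partition copy itself rather than
shelling out to dd, a rough C equivalent of the bs=4096 seek=N case might
look like the sketch below. The function name copy_to_blocks() and the fixed
4096-byte block size are assumptions made for the example; the two file
descriptors are expected to be opened by the caller.

```c
#define _XOPEN_SOURCE 700   /* for pwrite() */
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096     /* matches bs=4096 in the dd commands */

/* Copy everything readable from src_fd into dst_fd, starting at block
 * `first_block` of the destination (like dd's seek=). Returns the number
 * of bytes copied, or -1 on a read or write error. */
static long long copy_to_blocks(int src_fd, int dst_fd, long long first_block)
{
    char buf[BLOCK_SIZE];
    off_t off;
    long long total = 0;
    ssize_t n;

    assert(first_block >= 0);
    off = (off_t)first_block * BLOCK_SIZE;

    while ((n = read(src_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {   /* pwrite() may write less than requested */
            ssize_t w = pwrite(dst_fd, buf + done, (size_t)(n - done),
                               off + done);
            if (w <= 0)
                return -1;
            done += w;
        }
        off += n;
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

The reverse direction (skip= and count=) follows the same pattern with
pread() on the partition and write() on the file. Nothing here records how
long the data is, so the application has to keep track of the byte count
itself, just as it has to remember which blocks it used.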


Helge Hafting