2003-08-05 16:28:19

by Andries E. Brouwer

Subject: Re: i_blksize

> Looks like I got myself confused

Yes. But nevertheless, now that you brought this up,
we might consider throwing out i_blksize.

I am not aware of anybody who actually uses this to give
per-file advice. So, it could be in the superblock.
There is no reason why it would be a power of two -
the case mentioned yesterday or so was cifs, with

    inode->i_blksize =
        (pTcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE) & 0xFFFFFE00;

I see no reason not to replace i_blksize by i_sb->s_optimal_io_size.
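
A minimal sketch of why the cifs value need not be a power of two: the mask 0xFFFFFE00 only clears the low nine bits, so the result is a multiple of 512 and nothing more. The helper names and the buffer and header sizes used in demo_cifs_hint() are made up for illustration; they are not taken from the cifs source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round down to a multiple of 512 by clearing the low 9 bits,
 * which is what masking with 0xFFFFFE00 does. */
uint32_t round_down_512(uint32_t n)
{
    return n & 0xFFFFFE00u;
}

/* True if n is a nonzero power of two. */
bool is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical numbers: a 16384-byte buffer minus a 100-byte header. */
void demo_cifs_hint(void)
{
    uint32_t hint = round_down_512(16384u - 100u);

    assert(hint == 15872u);          /* 31 * 512 */
    assert(hint % 512u == 0);        /* sector-aligned ... */
    assert(!is_power_of_two(hint));  /* ... but not a power of two */
}
```

Any consumer of the hint that assumes a power of two (for example, one that derives a shift count from it) would go wrong on such a value.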

Any objections?

If sizeof(struct inode) decreases by 1% then we can keep 1% more inodes.

That reminds me - I threw out i_dev and i_cdev, but Al reintroduced i_cdev.
We should do as some comment says and make a union with i_bdev and i_pipe.
Another 8 bytes gone.
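
The saving can be sketched with two cut-down structures. These are hypothetical stand-ins, not the real struct inode; only the member names i_pipe, i_bdev and i_cdev come from the discussion, and the pointed-to types are left opaque. An inode is at most one of pipe, block device or character device, so the three pointers can share one slot.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the kernel types. */
struct pipe_inode_info;
struct block_device;
struct cdev;

/* Hypothetical cut-down inode with three separate pointers. */
struct inode_separate {
    unsigned long i_ino;
    struct pipe_inode_info *i_pipe;
    struct block_device *i_bdev;
    struct cdev *i_cdev;
};

/* Same members, but the three pointers overlap in one slot. */
struct inode_shared {
    unsigned long i_ino;
    union {
        struct pipe_inode_info *i_pipe;
        struct block_device *i_bdev;
        struct cdev *i_cdev;
    } u;
};

/* Bytes saved by the union: two pointers' worth. */
size_t union_saving(void)
{
    return sizeof(struct inode_separate) - sizeof(struct inode_shared);
}
```

With 4-byte pointers, as on i386, union_saving() comes to 8 bytes; with 8-byte pointers it would be 16.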

Andries


2003-08-05 17:48:57

by Andrew Morton

Subject: Re: i_blksize

Andries E. Brouwer wrote:
>
> > Looks like I got myself confused
>
> Yes. But nevertheless, now that you brought this up,
> we might consider throwing out i_blksize.
>
> I am not aware of anybody who actually uses this to give
> per-file advice. So, it could be in the superblock.

I suppose so. reiserfs plays with it.

I can't really see that anyone would want to set the I/O size hint on a
per-inode basis, especially as the readahead and writebehind code will
cheerfully ignore it.

> Any objections?

I don't think it's worth fiddling with at this time, really.

> If sizeof(struct inode) decreases by 1% then we can keep 1% more inodes.
>
> That reminds me - I threw out i_dev and i_cdev, but Al reintroduced i_cdev.
> We should do as some comment says and make a union with i_bdev and i_pipe.
> Another 8 bytes gone.

Well all the inode slab caches are using SLAB_HWCACHE_ALIGN at present, so
it's a little moot. Especially on a pentium4-compiled kernel.

But I expect most distributed 2.6 kernels will be pII or pIII-compiled.
Let's look:

SMP:
sizeof(struct ext2_inode_info) = 0x1d0
sizeof(struct ext3_inode_info) = 0x1e0

Both of these pack eight-per-page. Need to get them to 0x1c4 (and remove
SLAB_HWCACHE_ALIGN) to get to nine-per-page.

UP:
sizeof(struct ext3_inode_info) = 0x1c4 (whew!)
sizeof(struct ext2_inode_info) = 0x1b4

So for these filesystems at least, we need to remove SLAB_HWCACHE_ALIGN and
we will get a 12% improvement in packing density on uniprocessor.
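
The packing arithmetic can be sketched as follows. This is an idealised model, not the slab allocator's own calculation: it assumes a 4096-byte page, ignores whatever space the slab's management data takes, and treats SLAB_HWCACHE_ALIGN as simply rounding each object up to a cache-line multiple. The names PAGE_BYTES, align_up and objs_per_page are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u  /* assumed 4 KiB page, as on i386 */

/* Round size up to a multiple of align (align must be nonzero). */
size_t align_up(size_t size, size_t align)
{
    assert(align != 0);
    return (size + align - 1) / align * align;
}

/* Objects of `size` bytes that fit in one page when each object is
 * padded to a multiple of `align`.  Pass align = 1 for no padding.
 * Idealised: slab management overhead is not accounted for. */
size_t objs_per_page(size_t size, size_t align)
{
    return PAGE_BYTES / align_up(size, align);
}
```

In this model objs_per_page(0x1d0, 1) and objs_per_page(0x1e0, 1) both give 8, while objs_per_page(0x1c4, 1) gives 9; going from 8 to 9 is the 9/8 = 1.125 ratio behind the roughly 12% figure. Assuming a 128-byte cache line for a pentium4 build, objs_per_page(0x1c4, 128) falls back to 8 because each object is padded to 512 bytes.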

unionification of i_[bcp]dev sounds like a good idea to give us a little
margin there.

2003-08-05 18:10:16

by Andreas Dilger

Subject: Re: i_blksize

On Aug 05, 2003 10:50 -0700, Andrew Morton wrote:
> Andries E. Brouwer wrote:
> > Yes. But nevertheless, now that you brought this up,
> > we might consider throwing out i_blksize.
> >
> > I am not aware of anybody who actually uses this to give
> > per-file advice. So, it could be in the superblock.
>
> I suppose so. reiserfs plays with it.
>
> I can't really see that anyone would want to set the I/O size hint on a
> per-inode basis, especially as the readahead and writebehind code will
> cheerfully ignore it.

Actually, Lustre uses this, because each file can be striped over a
different number of storage targets, and you want read and write requests
large enough to try and write to all of the targets at one time.
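
A minimal sketch of how such a per-file hint could be derived, assuming the hint is simply one full stripe across every target. The struct and function below are hypothetical and are not Lustre's actual data structures or code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-file layout: not Lustre's real structures. */
struct stripe_layout {
    uint32_t stripe_size;   /* bytes sent to one target before moving on */
    uint32_t stripe_count;  /* number of storage targets the file spans */
};

/* Smallest request size that covers one stripe on every target. */
uint64_t full_stripe_io_size(const struct stripe_layout *l)
{
    /* Widen before multiplying so large layouts do not overflow. */
    return (uint64_t)l->stripe_size * l->stripe_count;
}
```

Two files on the same filesystem with, say, 1 MiB stripes over 1 and over 4 targets would get hints of 1 MiB and 4 MiB respectively, which a single superblock-wide value could not express.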

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/