2006-11-06 21:50:08

by Eric Sandeen

[permalink] [raw]
Subject: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

The inode diet patches first did this to generic_fillattr():

- stat->blksize = inode->i_blksize;
+ stat->blksize = PAGE_CACHE_SIZE;

but by 2.6.19-rc3 this was changed again:

- stat->blksize = PAGE_CACHE_SIZE;
+ stat->blksize = (1 << inode->i_blkbits);

I can't find for sure why this was done; perhaps because it exposed a bug
in cifs[1]

However, if we are going to pick a default for generic_fillattr, it should probably
be what most filesystems were using before, and let filesystems which need something
else re-set it to according to their needs. As it stands today, doing readdirs and
the like in block-sized chunks rather than page sized will probably not be the
best thing for performance.

A bit of quick parsing of the original inode diet patch shows that by far most
filesystems were using the page size:

(count) (line which sets i_blksize)
1 cifs_inode->vfs_inode.i_blksize = CIFS_MAX_MSGSIZE;
1 i->i_blksize = 512;
1 inode->i_blksize = attr->va_blocksize;
1 inode->i_blksize = befs_sb->block_size;
1 inode->i_blksize = HFSPLUS_SB(inode->i_sb).alloc_blksz;
1 inode->i_blksize = HFSPLUS_SB(sb).alloc_blksz;
1 inode->i_blksize = HFS_SB(inode->i_sb)->alloc_blksz;
1 inode->i_blksize = HFS_SB(sb)->alloc_blksz;
1 inode->i_blksize = HPAGE_SIZE;
1 inode->i_blksize = NCP_BLOCK_SIZE;
1 inode->i_blksize = QNX4_DIR_ENTRY_SIZE;
1 inode->i_blksize = (u32)osb->s_clustersize;
1 inode->i_blksize = xfs_preferred_iosize(mp);
1 ino->i_blksize = i_blksize;
1 ino->i_blksize = proc_ino->i_blksize;
1 ip->i_blksize = ip->i_sb->s_blocksize;
1 ip->i_blksize = PAGE_SIZE;
2 inode->i_blksize = fattr->du.nfs2.blocksize;
2 inode->i_blksize = inode->i_sb->s_blocksize;
2 inode->i_blksize = reiserfs_default_io_size;
2 inode->i_blksize = sbi->cluster_size;
2 vi->i_blksize = PAGE_CACHE_SIZE;
3 inode->i_blksize = sb->s_blocksize;
3 ret->i_blksize = PAGE_CACHE_SIZE;
5 inode->i_blksize = 1024;
7 inode->i_blksize = 0;
12 inode->i_blksize = PAGE_SIZE;
22 inode->i_blksize = PAGE_CACHE_SIZE;

so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
and let filesystems which need something -else- do that on their own.

[1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html

Thanks,
-Eric

Signed-off-by: Eric Sandeen <[email protected]>

--- linux.orig/fs/stat.c 2006-11-05 23:27:04.962482569 -0600
+++ linux/fs/stat.c 2006-11-05 23:27:29.394396050 -0600
@@ -33,7 +33,7 @@
stat->ctime = inode->i_ctime;
stat->size = i_size_read(inode);
stat->blocks = inode->i_blocks;
- stat->blksize = (1 << inode->i_blkbits);
+ stat->blksize = PAGE_CACHE_SIZE;
}

EXPORT_SYMBOL(generic_fillattr);


2006-11-06 22:27:59

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Mon, 2006-11-06 at 15:50 -0600, Eric Sandeen wrote:
> The inode diet patches first did this to generic_fillattr():
>
> - stat->blksize = inode->i_blksize;
> + stat->blksize = PAGE_CACHE_SIZE;
>
> but by 2.6.19-rc3 this was changed again:
>
> - stat->blksize = PAGE_CACHE_SIZE;
> + stat->blksize = (1 << inode->i_blkbits);
>
> I can't find for sure why this was done; perhaps because it exposed a bug
> in cifs[1]

Steve has fixed the bug in cifs_readdir().

> However, if we are going to pick a default for generic_fillattr, it should probably
> be what most filesystems were using before, and let filesystems which need something
> else re-set it to according to their needs. As it stands today, doing readdirs and
> the like in block-sized chunks rather than page sized will probably not be the
> best thing for performance.

I agree.

>
> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
> and let filesystems which need something -else- do that on their own.
>
> [1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html
>
> Thanks,
> -Eric
>
> Signed-off-by: Eric Sandeen <[email protected]>
>
> --- linux.orig/fs/stat.c 2006-11-05 23:27:04.962482569 -0600
> +++ linux/fs/stat.c 2006-11-05 23:27:29.394396050 -0600
> @@ -33,7 +33,7 @@
> stat->ctime = inode->i_ctime;
> stat->size = i_size_read(inode);
> stat->blocks = inode->i_blocks;
> - stat->blksize = (1 << inode->i_blkbits);
> + stat->blksize = PAGE_CACHE_SIZE;
> }
>
> EXPORT_SYMBOL(generic_fillattr);

Looks good. Since cifs is affected by this patch, I propose that cifs
explicitly set stat->blksize:

CIFS: Explicitly set stat->blksize

CIFS may perform I/O over the network in larger chunks than the page size,
so it should explicitly set stat->blksize to ensure optimal I/O bandwidth

Signed-off-by: Dave Kleikamp <[email protected]>

diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
--- linux.orig/fs/cifs/inode.c 2006-11-03 13:44:04.000000000 -0600
+++ linux/fs/cifs/inode.c 2006-11-06 16:11:21.000000000 -0600
@@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
struct kstat *stat)
{
int err = cifs_revalidate(dentry);
- if (!err)
+ if (!err) {
generic_fillattr(dentry->d_inode, stat);
+ stat->blksize = CIFS_MAX_MSGSIZE;
+ }
return err;
}


--
David Kleikamp
IBM Linux Technology Center

2006-11-06 22:37:33

by Eric Sandeen

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Dave Kleikamp wrote:
ooks good. Since cifs is affected by this patch, I propose that cifs
> explicitly set stat->blksize:
>
> CIFS: Explicitly set stat->blksize
>
> CIFS may perform I/O over the network in larger chunks than the page size,
> so it should explicitly set stat->blksize to ensure optimal I/O bandwidth
>
> Signed-off-by: Dave Kleikamp <[email protected]>
>
> diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
> --- linux.orig/fs/cifs/inode.c 2006-11-03 13:44:04.000000000 -0600
> +++ linux/fs/cifs/inode.c 2006-11-06 16:11:21.000000000 -0600
> @@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
> struct kstat *stat)
> {
> int err = cifs_revalidate(dentry);
> - if (!err)
> + if (!err) {
> generic_fillattr(dentry->d_inode, stat);
> + stat->blksize = CIFS_MAX_MSGSIZE;
> + }
> return err;
> }

Yep, I agree that this should go in too. Other filesystems probably
need to recover from the crash diet as well :)

Thanks,
-Eric

2006-11-06 23:05:50

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
> and let filesystems which need something -else- do that on their own.

I agree with the conclusion, but the patch is incomplete. You went down
all the way to find out what the fileystems do in this messages, so add
the hunks to override the defaults for non-standard filesystems to the
patch aswell to restore the pre-inode diet state.

2006-11-06 23:15:37

by Eric Sandeen

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Christoph Hellwig wrote:
> On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
>> and let filesystems which need something -else- do that on their own.
>
> I agree with the conclusion, but the patch is incomplete. You went down
> all the way to find out what the fileystems do in this messages, so add
> the hunks to override the defaults for non-standard filesystems to the
> patch aswell to restore the pre-inode diet state.

Well, agreed. I put 80% or more back to pre-patch state, but not all.
:) So it's less broken with my patch than without, so at least it's
moving forward. So... Ted's patches get in w/o fixing up all the other
filesystems (left as an exercise to the patch reader) but mine can't? :)

-Eric

2006-11-06 23:17:52

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
> > I agree with the conclusion, but the patch is incomplete. You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
>
> Well, agreed. I put 80% or more back to pre-patch state, but not all.
> :) So it's less broken with my patch than without, so at least it's
> moving forward. So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

It's because no reviwer noticed the breakage Ted put in ;-) I think if
you really refused to do the remaining 20% we can try to pressured Ted to
do it and if that doesn't work I'll just do it myself because I'll be
tired of arguing at that point..

2006-11-06 23:28:29

by Eric Sandeen

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Christoph Hellwig wrote:
> On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
>>> I agree with the conclusion, but the patch is incomplete. You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed. I put 80% or more back to pre-patch state, but not all.
>> :) So it's less broken with my patch than without, so at least it's
>> moving forward. So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
>
> It's because no reviwer noticed the breakage Ted put in ;-) I think if
> you really refused to do the remaining 20% we can try to pressured Ted to
> do it and if that doesn't work I'll just do it myself because I'll be
> tired of arguing at that point..

eh, I'll sign up to do it when I get a moment, I won't refuse :)

Just had to notice that 2 patches already went in without the
requirement to "fix everything else" :)

It would be nice to get the "fix most of it" patch in sooner than later,
though.

-Eric

2006-11-07 00:08:43

by Andreas Dilger

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Nov 06, 2006 17:15 -0600, Eric Sandeen wrote:
> Christoph Hellwig wrote:
> > I agree with the conclusion, but the patch is incomplete. You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
>
> Well, agreed. I put 80% or more back to pre-patch state, but not all.
> :) So it's less broken with my patch than without, so at least it's
> moving forward. So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

Actually, rather than blindly revert to pre-patch behaviour it would be
worthwhile to determine if PAGE_SIZE isn't the better value. In some
cases people don't understand that i_blksize is the "optimal IO size"
and instead assume it is the filesystem blocksize. I saw a few that were
e.g. 512 and that can't be very useful.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2006-11-07 00:13:47

by Hua Zhong

[permalink] [raw]
Subject: RE: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

> Actually, rather than blindly revert to pre-patch behaviour
> it would be worthwhile to determine if PAGE_SIZE isn't the
> better value. In some cases people don't understand that
> i_blksize is the "optimal IO size"
> and instead assume it is the filesystem blocksize. I saw a
> few that were e.g. 512 and that can't be very useful.

Of course the name i_blksize is very clear on that. :-)

> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
>
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in the body of a message to
> [email protected] More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2006-11-07 00:25:39

by Steve French (smfltc)

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Dave Kleikamp wrote:
> On Mon, 2006-11-06 at 15:50 -0600, Eric Sandeen wrote:
>
>> The inode diet patches first did this to generic_fillattr():
>>
>> - stat->blksize = inode->i_blksize;
>> + stat->blksize = PAGE_CACHE_SIZE;
>>
>> but by 2.6.19-rc3 this was changed again:
>>
>> - stat->blksize = PAGE_CACHE_SIZE;
>> + stat->blksize = (1 << inode->i_blkbits);
>>
>> I can't find for sure why this was done; perhaps because it exposed a bug
>> in cifs[1]
>>
>
> Steve has fixed the bug in cifs_readdir().
>
>
>> However, if we are going to pick a default for generic_fillattr, it should probably
>> be what most filesystems were using before, and let filesystems which need something
>> else re-set it to according to their needs. As it stands today, doing readdirs and
>> the like in block-sized chunks rather than page sized will probably not be the
>> best thing for performance.
>>
>
> I agree.
>
>
>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
>> and let filesystems which need something -else- do that on their own.
>>
>> [1] http://lists.samba.org/archive/linux-cifs-client/2006-September/001481.html
>>
>> Thanks,
>> -Eric
>>
>> Signed-off-by: Eric Sandeen <[email protected]>
>>
>> --- linux.orig/fs/stat.c 2006-11-05 23:27:04.962482569 -0600
>> +++ linux/fs/stat.c 2006-11-05 23:27:29.394396050 -0600
>> @@ -33,7 +33,7 @@
>> stat->ctime = inode->i_ctime;
>> stat->size = i_size_read(inode);
>> stat->blocks = inode->i_blocks;
>> - stat->blksize = (1 << inode->i_blkbits);
>> + stat->blksize = PAGE_CACHE_SIZE;
>> }
>>
>> EXPORT_SYMBOL(generic_fillattr);
>>
>
> Looks good. Since cifs is affected by this patch, I propose that cifs
> explicitly set stat->blksize:
>
> CIFS: Explicitly set stat->blksize
>
> CIFS may perform I/O over the network in larger chunks than the page size,
> so it should explicitly set stat->blksize to ensure optimal I/O bandwidth
>
> Signed-off-by: Dave Kleikamp <[email protected]>
>
> diff -Nurp linux.orig/fs/cifs/inode.c linux/fs/cifs/inode.c
> --- linux.orig/fs/cifs/inode.c 2006-11-03 13:44:04.000000000 -0600
> +++ linux/fs/cifs/inode.c 2006-11-06 16:11:21.000000000 -0600
> @@ -1089,8 +1089,10 @@ int cifs_getattr(struct vfsmount *mnt, s
> struct kstat *stat)
> {
> int err = cifs_revalidate(dentry);
> - if (!err)
> + if (!err) {
> generic_fillattr(dentry->d_inode, stat);
> + stat->blksize = CIFS_MAX_MSGSIZE;
> + }
> return err;
> }
>
>
>
I assumed that the original intent of the "inode diet patch" was to
remove fields in the inode,
which most filesystems can get out of the superblock. If
inode->blksize and inode->blkbits were
related (2**blkbits == blksize) , it also makes sense to me that someone
(Ted?) removed one and left the other
as one would be redundant, but some filesystems like cifs have a large
recommended i/o size (16K),
but if someone wants to remove both from the inode that is fine by me,
as long as cifs
stat->blksize is set as you suggested on the way out of cifs_getattr.
Eventually cifs should set
the stat->blksize to a smaller value for one rarer case. For the most
common case
cifs should still set it to 16K (CIFS_MAX_MSGSIZE) as that is the most
common negotiated
buffer size but if the server does not negotiate large read support, and
the server also is so old
that it negotiates a buffer size smaller than 16K (e.g. Windows95
negotiates 2K IIRC)
then we could set stat-blksize to the smaller negotiated buffer size -
but since those servers are
getting rarer it is probably not that important. More interesting
will be the future cases in
which we will be able to set this value larger to more servers, as in
general for modern
network adapters, the larger network i/o size the better.

2006-11-07 01:39:18

by Eric Sandeen

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Andreas Dilger wrote:
> On Nov 06, 2006 17:15 -0600, Eric Sandeen wrote:
>> Christoph Hellwig wrote:
>>> I agree with the conclusion, but the patch is incomplete. You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed. I put 80% or more back to pre-patch state, but not all.
>> :) So it's less broken with my patch than without, so at least it's
>> moving forward. So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
>
> Actually, rather than blindly revert to pre-patch behaviour it would be
> worthwhile to determine if PAGE_SIZE isn't the better value. In some
> cases people don't understand that i_blksize is the "optimal IO size"
> and instead assume it is the filesystem blocksize. I saw a few that were
> e.g. 512 and that can't be very useful.

I'm willing to either revert everyting to pre-inode-diet behavior, or leave it
at the (newly re-proposed) page size default and let the other fs maintainers
sort it out for their own codebase, but I don't pretend to know what is best
for, say, qnx4 etc... I'd be willing to cc: all maintainers asking them to take
another long hard look at their code. :)

As we saw with cifs, these changes can have unintended consequences (not picking
on cifs, it's just one that ran into issues with the broad-stroke change).

-Eric

2006-11-07 03:44:19

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
> Christoph Hellwig wrote:
> > On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
> >> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
> >> and let filesystems which need something -else- do that on their own.
> >
> > I agree with the conclusion, but the patch is incomplete. You went down
> > all the way to find out what the fileystems do in this messages, so add
> > the hunks to override the defaults for non-standard filesystems to the
> > patch aswell to restore the pre-inode diet state.
>
> Well, agreed. I put 80% or more back to pre-patch state, but not all.
> :) So it's less broken with my patch than without, so at least it's
> moving forward. So... Ted's patches get in w/o fixing up all the other
> filesystems (left as an exercise to the patch reader) but mine can't? :)

Note that *I* wasn't the one who changed it from PAGE_CACHE_SIZE to (1
<< inode->i_blkbits). This was done by Andrew. (See below, from an
e-mail dated September 19th).

Given that Steve French was cc'ed, I assume this was done as a hack to
fix CIFS, but it was a bad idea; I agree that PAGE_CACHE_SIZE is a way
better default than (1 << inode->i_blkbits).

As far as fixing all of the other filesystems, I did *try*; I know I
screwed up with XFS, but that's because I still think the code is a
screaming horror of indirections that make it impossible to understand
what the heck is going on, and I guess I screwed up with CIFS. Some
of the changes away from "pre inode diet" state were deliberate,
though, since some filesystems had very clearly broken "optimal I/O
sizes" of 512, and one even had something incredibly bogus that was
something like 96 bytes (!) if I remember correctly.

- Ted


Subject: inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fix
From: Andrew Morton <[email protected]>

Cc: <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Steven French <[email protected]>,
Signed-off-by: Andrew Morton <[email protected]>
---

fs/stat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN
fs/stat.c~inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fi
x fs/stat.c
---
a/fs/stat.c~inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-
fix
+++ a/fs/stat.c
@@ -33,7 +33,7 @@ void generic_fillattr(struct inode *inod
stat->ctime = inode->i_ctime;
stat->size = i_size_read(inode);
stat->blocks = inode->i_blocks;
- stat->blksize = PAGE_CACHE_SIZE;
+ stat->blksize = (1 << inode->i_blkbits);
}

EXPORT_SYMBOL(generic_fillattr);
_

2006-11-07 04:02:23

by Eric Sandeen

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Theodore Tso wrote:
> On Mon, Nov 06, 2006 at 05:15:27PM -0600, Eric Sandeen wrote:
>> Christoph Hellwig wrote:
>>> On Mon, Nov 06, 2006 at 03:50:02PM -0600, Eric Sandeen wrote:
>>>> so I would propose the following patch to make PAGE_CACHE_SIZE the default (again),
>>>> and let filesystems which need something -else- do that on their own.
>>> I agree with the conclusion, but the patch is incomplete. You went down
>>> all the way to find out what the fileystems do in this messages, so add
>>> the hunks to override the defaults for non-standard filesystems to the
>>> patch aswell to restore the pre-inode diet state.
>> Well, agreed. I put 80% or more back to pre-patch state, but not all.
>> :) So it's less broken with my patch than without, so at least it's
>> moving forward. So... Ted's patches get in w/o fixing up all the other
>> filesystems (left as an exercise to the patch reader) but mine can't? :)
>
> Note that *I* wasn't the one who changed it from PAGE_CACHE_SIZE to (1
> << inode->i_blkbits). This was done by Andrew. (See below, from an
> e-mail dated September 19th).
>
> Given that Steve French was cc'ed, I assume this was done as a hack to
> fix CIFS, but it was a bad idea; I agree that PAGE_CACHE_SIZE is a way
> better default than (1 << inode->i_blkbits).
>
> As far as fixing all of the other filesystems, I did *try*; I know I
> screwed up with XFS, but that's because I still think the code is a
> screaming horror of indirections that make it impossible to understand
> what the heck is going on, and I guess I screwed up with CIFS. Some
> of the changes away from "pre inode diet" state were deliberate,
> though, since some filesystems had very clearly broken "optimal I/O
> sizes" of 512, and one even had something incredibly bogus that was
> something like 96 bytes (!) if I remember correctly.

In that case can we just go with my original third-round patch to put it back to
PAGE_SIZE, and let the other filesystems tidy up if necessary. This -is- just
supposed to be a hint...

Sorry Ted, "broken" was too strong a word... I just read:

"Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function."

in the patch, and figured that you were leaving it up to the filesystems to
re-evaluate whether they needed something other than a page size. Which I'd be
happy to do, too. :)

-Eric

2006-11-07 13:40:50

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Mon, 2006-11-06 at 18:26 -0600, Steve French wrote:
> I assumed that the original intent of the "inode diet patch" was to
> remove fields in the inode,
> which most filesystems can get out of the superblock.

I think I may have planted the idea that you could get i_blkbits from
sb->s_blocksize_bits, but I was wrong. Consider a block device. It's
blocksize is not related to the superblock of its containing filesystem.

> If
> inode->blksize and inode->blkbits were
> related (2**blkbits == blksize) , it also makes sense to me that someone
> (Ted?) removed one and left the other
> as one would be redundant,

They were never really related. Some filesystems treated them as if
they were, but the vfs always used i_blkbits for the block size.
i_blksize was only really used for returning stat->blksize.

> but some filesystems like cifs have a large
> recommended i/o size (16K),
> but if someone wants to remove both from the inode that is fine by me,
> as long as cifs
> stat->blksize is set as you suggested on the way out of cifs_getattr.
> Eventually cifs should set
> the stat->blksize to a smaller value for one rarer case. For the most
> common case
> cifs should still set it to 16K (CIFS_MAX_MSGSIZE) as that is the most
> common negotiated
> buffer size but if the server does not negotiate large read support, and
> the server also is so old
> that it negotiates a buffer size smaller than 16K (e.g. Windows95
> negotiates 2K IIRC)
> then we could set stat-blksize to the smaller negotiated buffer size -
> but since those servers are
> getting rarer it is probably not that important. More interesting
> will be the future cases in
> which we will be able to set this value larger to more servers, as in
> general for modern
> network adapters, the larger network i/o size the better.

It would probably be best to just set stat->blksize to the negotiated
buffer size.
--
David Kleikamp
IBM Linux Technology Center

2006-11-07 15:34:23

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Tue, 2006-11-07 at 07:40 -0600, Dave Kleikamp wrote:
> It would probably be best to just set stat->blksize to the negotiated
> buffer size.

But be careful here. I don't know how applications/glibc may behave if
stat->blksize is not a power of 2.
--
David Kleikamp
IBM Linux Technology Center

2006-11-07 15:49:33

by Steve French (smfltc)

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

Dave Kleikamp wrote:
> On Tue, 2006-11-07 at 07:40 -0600, Dave Kleikamp wrote:
>
>> It would probably be best to just set stat->blksize to the negotiated
>> buffer size.
>>
>
> But be careful here. I don't know how applications/glibc may behave if
> stat->blksize is not a power of 2.
>
The man page is not particularly helpful either as it simply indicates:
"The st_blksize field gives the preferred blocksize for efficient
file system I/O. "
but it appears that blksize would affects readdir performance more than
read/write
(since read/write go through the pagecache and thus readpages/writepages
will request readahead/writebehind for many pages at a time) unless the
application
opens the file direct i/o.

2006-12-03 13:38:48

by col-pepper

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

I am using a 2.6.18.2 based kernel and see lots of broken fs due to this
"diet". eg cloop

I hope some general lessons can be drawn about the necessity and
desirablility of such changes that (predictably) invoke broadband breakage.

This kind of change and the breakage and dependancy issues they create are
what makes linux a nightmare to maintain.

While it seems some improvement and clean up may result from this getting
attention, it appears that the inode structure is back to it's original
size. Which is quite probably the way it should have stayed all along.

Hopefully this has now stablised.


What kernel release contains code where all this calms down and I dont
need to search patches and updates for modules in order to get basics to
work again?

Alternatively can I simply revert the original diet patch on my 2.6.18.2
to maintain working fs modules?

Thanks for your replys.

2006-12-03 13:48:18

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC/PATCH] - revert generic_fillattr stat->blksize to PAGE_CACHE_SIZE

On Sun, Dec 03, 2006 at 02:21:09PM +0100, [email protected] wrote:

> I am using a 2.6.18.2 based kernel and see lots of broken fs due to this
> "diet". eg cloop
>
> I hope some general lessons can be drawn about the necessity and
> desirablility of such changes that (predictably) invoke broadband breakage.

Lessons 1-99:
Get your modules included in the kernel.

> This kind of change and the breakage and dependancy issues they create are
> what makes linux a nightmare to maintain.
>...

s/linux/external modules not submitted for inclusion in the kernel/

> What kernel release contains code where all this calms down and I dont
> need to search patches and updates for modules in order to get basics to
> work again?
>...

None, the Linux development model is based on the fact that such changes
are considered perfectly OK as long as all in-kernel users are being
fixed.

The solution for your problem is that the authors of the modules you
are using should get their modules included in the Linux kernel - and
they'll be automatically fixed when someone changes an in-kernel API.

> Thanks for your replys.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed