2006-10-03 21:23:42

by Chris Wedgwood

[permalink] [raw]
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees

On Tue, Oct 03, 2006 at 04:06:10PM +1000, David Chinner wrote:

> Overall, the patchset removes more than 200 lines of code from the
> xfs inode caching and lookup code and provides more consistent
> scalability for large numbers of cached inodes. The only down side
> is that it limits us to 32 bit inode numbers of 32 bit platforms due
> to the way the radix tree uses unsigned longs for it's indexes

commit afefdbb28a0a2af689926c30b94a14aea6036719
tree 6ee500575cac928cd90045bcf5b691cf2b8daa09
parent 1d32849b14bc8792e6f35ab27dd990d74b16126c
author David Howells <[email protected]> 1159863226 -0700
committer Linus Torvalds <[email protected]> 1159887820 -0700

[PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers

These patches make the kernel pass 64-bit inode numbers internally when
communicating to userspace, even on a 32-bit system. They are required
because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
for example. The 64-bit inode numbers are then propagated to userspace
automatically where the arch supports it.
[...]

Doing this will mean XFS won't be able to support 32-bit inodes on
32-bit platforms the above (merged) patch --- though given that cheap
64-bit systems are now abundant does anyone really care?


2006-10-03 22:23:52

by David Chinner

[permalink] [raw]
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees

On Tue, Oct 03, 2006 at 02:23:35PM -0700, Chris Wedgwood wrote:
> On Tue, Oct 03, 2006 at 04:06:10PM +1000, David Chinner wrote:
> > Overall, the patchset removes more than 200 lines of code from the
> > xfs inode caching and lookup code and provides more consistent
> > scalability for large numbers of cached inodes. The only down side
> > is that it limits us to 32 bit inode numbers of 32 bit platforms due
> > to the way the radix tree uses unsigned longs for it's indexes
>
> commit afefdbb28a0a2af689926c30b94a14aea6036719
> tree 6ee500575cac928cd90045bcf5b691cf2b8daa09
> parent 1d32849b14bc8792e6f35ab27dd990d74b16126c
> author David Howells <[email protected]> 1159863226 -0700
> committer Linus Torvalds <[email protected]> 1159887820 -0700
>
> [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers
>
> These patches make the kernel pass 64-bit inode numbers internally when
> communicating to userspace, even on a 32-bit system. They are required
> because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
> for example. The 64-bit inode numbers are then propagated to userspace
> automatically where the arch supports it.
> [...]
>
> Doing this will mean XFS won't be able to support 32-bit inodes on
> 32-bit platforms the above (merged) patch --- though given that cheap
> 64-bit systems are now abundant does anyone really care?

That's a good question. In a recent thread on linux-fsdevel about
these patches Christoph Hellwig pointed out that 32bit user space is
not ready for 64 bit inodes, so it's probably going to be a while
before the second half of this mod is ready (which exports 64 bit
inodes ito userspace on 32bit platforms).

http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115946211808497&w=2
http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115948836023569&w=2

ISTR someone else also menitoning that 64bit inodes on 32-bit machines
also breaks the dynamic linker, but I can't find a reference to that
atm.

As it stands, there's still a few barriers to getting 64 bit inodes
on 32 bit platforms and I can't see them going away quickly. Right
now I see little reason in moving to 64 bit inodes for 32 bit
platforms for XFS because of the 16TB filesystem size limit (that
only needs 33-36 bit inodes depending on the inode size) and no
32bit platform is currently able to repair a filesystem of that
size.

And yes, 64 bit systems are cheap, cheap, cheap so IMO this
functionality is really irrelevant moving forward. If it had come
along a couple of years ago then it would be different, but I think
mainstream technology is finally catching up with XFS so it's not a
critical issue anymore... ;)

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2006-10-04 00:47:14

by Chris Wedgwood

[permalink] [raw]
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees

On Wed, Oct 04, 2006 at 08:22:56AM +1000, David Chinner wrote:

> That's a good question. In a recent thread on linux-fsdevel about
> these patches Christoph Hellwig pointed out that 32bit user space is
> not ready for 64 bit inodes, so it's probably going to be a while
> before the second half of this mod is ready (which exports 64 bit
> inodes ito userspace on 32bit platforms).

yes a patch changing struct kstat and filldir* was merged...

> http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115946211808497&w=2
> http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115948836023569&w=2

> As it stands, there's still a few barriers to getting 64 bit inodes
> on 32 bit platforms and I can't see them going away quickly. Right
> now I see little reason in moving to 64 bit inodes for 32 bit
> platforms for XFS because of the 16TB filesystem size limit (that
> only needs 33-36 bit inodes depending on the inode size) and no
> 32bit platform is currently able to repair a filesystem of that
> size.

so that leaves NFS3+

is it really worth the pain then?

2006-10-04 01:44:13

by David Chinner

[permalink] [raw]
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees

On Wed, Oct 04, 2006 at 08:22:56AM +1000, David Chinner wrote:
> On Tue, Oct 03, 2006 at 02:23:35PM -0700, Chris Wedgwood wrote:
> > On Tue, Oct 03, 2006 at 04:06:10PM +1000, David Chinner wrote:
> > > Overall, the patchset removes more than 200 lines of code from the
> > > xfs inode caching and lookup code and provides more consistent
> > > scalability for large numbers of cached inodes. The only down side
> > > is that it limits us to 32 bit inode numbers of 32 bit platforms due
> > > to the way the radix tree uses unsigned longs for it's indexes
> >
> > commit afefdbb28a0a2af689926c30b94a14aea6036719
> > tree 6ee500575cac928cd90045bcf5b691cf2b8daa09
> > parent 1d32849b14bc8792e6f35ab27dd990d74b16126c
> > author David Howells <[email protected]> 1159863226 -0700
> > committer Linus Torvalds <[email protected]> 1159887820 -0700
> >
> > [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers
> >
> > These patches make the kernel pass 64-bit inode numbers internally when
> > communicating to userspace, even on a 32-bit system. They are required
> > because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
> > for example. The 64-bit inode numbers are then propagated to userspace
> > automatically where the arch supports it.
> > [...]
> >
> > Doing this will mean XFS won't be able to support 32-bit inodes on
> > 32-bit platforms the above (merged) patch --- though given that cheap
> > 64-bit systems are now abundant does anyone really care?
>
> That's a good question. In a recent thread on linux-fsdevel about
> these patches Christoph Hellwig pointed out that 32bit user space is
> not ready for 64 bit inodes, so it's probably going to be a while
> before the second half of this mod is ready (which exports 64 bit
> inodes ito userspace on 32bit platforms).

Ahhh.... I think I misread what Chris wrote here - _32_ bit inodes on
32 bit platforms not working? I can't see how this would be the
case with the mods I posted given that they are entirely internal to
XFS and don't change any external inode number interfaces. And the
above commit shouldn't break XFS either.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2006-10-04 19:22:57

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees

On Wed, 2006-10-04 at 08:22 +1000, David Chinner wrote:
> On Tue, Oct 03, 2006 at 02:23:35PM -0700, Chris Wedgwood wrote:
> > On Tue, Oct 03, 2006 at 04:06:10PM +1000, David Chinner wrote:
> > > Overall, the patchset removes more than 200 lines of code from the
> > > xfs inode caching and lookup code and provides more consistent
> > > scalability for large numbers of cached inodes. The only down side
> > > is that it limits us to 32 bit inode numbers of 32 bit platforms due
> > > to the way the radix tree uses unsigned longs for it's indexes
> >
> > commit afefdbb28a0a2af689926c30b94a14aea6036719
> > tree 6ee500575cac928cd90045bcf5b691cf2b8daa09
> > parent 1d32849b14bc8792e6f35ab27dd990d74b16126c
> > author David Howells <[email protected]> 1159863226 -0700
> > committer Linus Torvalds <[email protected]> 1159887820 -0700
> >
> > [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers
> >
> > These patches make the kernel pass 64-bit inode numbers internally when
> > communicating to userspace, even on a 32-bit system. They are required
> > because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
> > for example. The 64-bit inode numbers are then propagated to userspace
> > automatically where the arch supports it.
> > [...]
> >
> > Doing this will mean XFS won't be able to support 32-bit inodes on
> > 32-bit platforms the above (merged) patch --- though given that cheap
> > 64-bit systems are now abundant does anyone really care?

Which completely ignored the fact that NFS systems are already having to
truncate 64-bit inode numbers to 32-bits and pass these truncated values
up to userspace. Collisions have been observed in the wild, and I've
already had to change the 64-bit->32-bit hashing algorithm on at least
one occasion.

By moving that truncation into userspace, we will at least give 64-bit
standards-compliant programs a chance to work correctly.

Trond