2002-02-18 21:09:03

by Maciej W. Rozycki

Subject: [patches] RFC: Export inode generations to the userland

Hello,

As you may know, there are serious problems with creating unique file
handles in the userland NFS server. They exist because the inode
generation number, which makes it possible to tell whether an inode was deleted and
recreated, is currently only available to the kernel -- it's by no means
exported to user programs[1]. I've been working to remove this limitation
recently and here I am presenting the results.

1. Linux was modified to add another member of "struct stat" and "struct
stat64". The member provides the value of the inode generation at the
time one of the stat syscalls is invoked. It is named "st_gen" as it is
the name other systems give it (it seems DEC OSF/1 and IBM AIX define this
member currently). New syscalls have been defined wherever spare space
was not available in "struct stat" or "struct stat64", otherwise only the
semantics of old ones was extended. Due to its moderate size the patch is
not attached. It's available at:
'ftp://ftp.ds2.pg.gda.pl/pub/macro/st_gen/patches/patch-2.4.16-stat-st_gen-26.gz'.

2. Glibc was updated to make the Linux change usable. This is an example
implementation and may seriously differ from what might go into glibc
finally. The patch is available at:
'ftp://ftp.ds2.pg.gda.pl/pub/macro/st_gen/patches/glibc-2.2.5-stat-st_gen.patch.gz'.

3. The userland NFS server was changed to embed the inode generation into
file handles. The patch is available at:
'ftp://ftp.ds2.pg.gda.pl/pub/macro/st_gen/patches/nfs-server-2.2beta50-stat.patch.gz'.
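To illustrate why the generation matters for file handles, here is a minimal sketch. It is not taken from the nfs-server patch: the "demo_fh" layout and both helper names are hypothetical, and the "gen" argument merely stands in for the value the proposed "st_gen" member would carry.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical handle layout, for illustration only.  The real
 * nfs-server patch may encode its handles quite differently. */
struct demo_fh {
    uint64_t dev;   /* device the inode lives on */
    uint64_t ino;   /* inode number */
    uint32_t gen;   /* inode generation (what "st_gen" would report) */
};

/* Build a handle from the values a stat-like call would return. */
static struct demo_fh demo_fh_make(uint64_t dev, uint64_t ino, uint32_t gen)
{
    struct demo_fh fh;

    memset(&fh, 0, sizeof fh);
    fh.dev = dev;
    fh.ino = ino;
    fh.gen = gen;
    return fh;
}

/* A handle is stale when it names the same device and inode number,
 * but the inode has been deleted and recreated since the handle was
 * issued, i.e. the current generation differs from the stored one. */
static bool demo_fh_is_stale(const struct demo_fh *fh,
                             uint64_t dev, uint64_t ino, uint32_t gen)
{
    return fh->dev == dev && fh->ino == ino && fh->gen != gen;
}
```

Without the generation field, a handle issued for a deleted file would silently match whatever new file later reuses the same inode number.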

Patches were tested against versions embedded in their names. Tests were
successful on an i386 and a mipsel system. Additionally the Linux patch
was tested as is with Linux 2.4.17 (a 2.4.17 snapshot taken on Jan 29th
from oss.sgi.com for mipsel; a slightly modified patch is available at the
site as well). Glibc and nfs-server RPM packages are also available at the
site.

I'm looking forward to any constructive comments, whether positive or
critical. The intended target of the changes is obviously Linux 2.6 and
glibc 2.3; testing of such changes is better with stable versions, though.
I believe the changes may be useful to other software dealing with
filesystems as well, not only to the NFS server.

Maciej

[1] There is that weird EXT2_IOC_GETVERSION ioctl, but it's neither
portable nor usable for anything but maybe debugging.

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +


2002-02-19 10:44:04

by Andreas Dilger

Subject: Re: [patches] RFC: Export inode generations to the userland

On Feb 18, 2002 22:06 +0100, Maciej W. Rozycki wrote:
> As you may know, there are serious problems with creating unique file
> handles in the userland NFS server. They exist because the inode
> generation number, which makes it possible to tell whether an inode was deleted and
> recreated, is currently only available to the kernel -- it's by no means
> exported to user programs[1].

Well, I don't see what's so bad with EXT2_IOC_GETVERSION? It's not like
many Linux filesystems have inode generation numbers in the first place.
It may even be that reiserfs does/would implement the EXT2_IOC_GETVERSION
ioctl also (they implemented EXT2_IOC_GETATTR compatible with ext2/ext3).
You can wrap this inside glibc if you really want to, and that has the
added benefit of working with all kernels in existence. That's not to
say this ioctl is the best interface...

> 1. Linux was modified to add another member of "struct stat" and "struct
> stat64". The member provides the value of the inode generation at the
> time one of the stat syscalls is invoked. It is named "st_gen" as it is
> the name other systems give it (it seems DEC OSF/1 and IBM AIX define this
> member currently). New syscalls have been defined wherever spare space
> was not available in "struct stat" or "struct stat64", otherwise only the
> semantics of old ones was extended.

IIRC, there are several other desirable changes to struct stat/stat64
(64-bit timestamps, 32-bit UIDs/GIDs, and others I believe, some searching
should show up complainants) so if there is really a need to add yet
_another_ stat struct/syscall we may as well do it right _this_ time
(like we've said every other time we change this interface).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-02-19 11:53:49

by Maciej W. Rozycki

Subject: Re: [patches] RFC: Export inode generations to the userland

On Tue, 19 Feb 2002, Andreas Dilger wrote:

> Well, I don't see what's so bad with EXT2_IOC_GETVERSION? It's not like
> many Linux filesystems have inode generation numbers in the first place.

1. You need permissions to open a file.

2. Opening may cause undesired side effects (think "/dev/st0").

3. You can't open a symlink.
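The three points above show up directly in a sketch of the ioctl route. This is only an illustration: "demo_get_generation" is a hypothetical helper, and it uses FS_IOC_GETVERSION from present-day <linux/fs.h>, which is assumed here to be the same request as EXT2_IOC_GETVERSION.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FS_IOC_GETVERSION */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: stores the inode generation of "path" in *gen
 * and returns 0, or returns -1 with errno set on failure. */
static int demo_get_generation(const char *path, int *gen)
{
    int fd, rc, saved_errno;

    /* Points 1 and 2: a descriptor is required, so the caller needs
     * permission to open the file, and the open itself may have side
     * effects on device nodes.  O_NONBLOCK only softens that; it does
     * not remove it.
     * Point 3: open() follows symlinks, so the generation of the link
     * itself cannot be queried this way. */
    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* ext2-family filesystems store an int-sized value here; other
     * filesystems may not implement the request at all (ENOTTY). */
    rc = ioctl(fd, FS_IOC_GETVERSION, gen);

    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    return rc < 0 ? -1 : 0;
}
```

A stat-based interface avoids all three problems, because it needs no descriptor and lstat() reports on the link itself.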

> It may even be that reiserfs does/would implement the EXT2_IOC_GETVERSION
> ioctl also (they implemented EXT2_IOC_GETATTR compatible with ext2/ext3).
> You can wrap this inside glibc if you really want to, and that has the
> added benefit of working with all kernels in existence. That's not to
> say this ioctl is the best interface...

Due to the limitations quoted above the ioctl is unsuitable as an
underlying way to retrieve "st_gen" for any of stat(), stat64(),
lstat() or lstat64().

> IIRC, there are several other desirable changes to struct stat/stat64
> (64-bit timestamps, 32-bit UIDs/GIDs, and others I believe, some searching
> should show up complainants) so if there is really a need to add yet
> _another_ stat struct/syscall we may as well do it right _this_ time
> (like we've said every other time we change this interface).

Fully agreed. It might be desirable to keep a few spare bytes at the
end of the new "struct stat64" as well (like it's already done for "struct
stat"), so there is no need to add syscalls each time a new member is
added.
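A minimal sketch of the spare-bytes idea follows. Both structures are hypothetical and far smaller than any real "struct stat64"; they only show how a new member can be carved out of reserved space without changing the size or the layout of the existing members.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: reserved words at the end. */
struct demo_stat_v1 {
    uint64_t st_ino;
    uint64_t st_size;
    uint32_t reserved[4];   /* spare space for future members */
};

/* Hypothetical "after" layout: one reserved word becomes st_gen. */
struct demo_stat_v2 {
    uint64_t st_ino;
    uint64_t st_size;
    uint32_t st_gen;        /* new member, taken from the spare space */
    uint32_t reserved[3];
};

/* The size and the offsets of the old members are unchanged, so the
 * same syscall can fill either structure. */
_Static_assert(sizeof(struct demo_stat_v1) == sizeof(struct demo_stat_v2),
               "adding st_gen must not change the structure size");
_Static_assert(offsetof(struct demo_stat_v1, reserved) ==
               offsetof(struct demo_stat_v2, st_gen),
               "st_gen must occupy former reserved space");
```

This only works cleanly if the kernel always zero-fills the reserved words, so that new programs running on an old kernel read a well-defined value.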

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +