On Mon 27-05-24 04:47:56, Christoph Hellwig wrote:
> On Sun, May 26, 2024 at 12:01:08PM -0700, Aleksa Sarai wrote:
> > The existing interface already provides a mount ID which is not even
> > safe without rebooting.
>
> And that seems to be a big part of the problem where the Linux by handle
> syscall API deviated from all known precedent for no good reason. NFS
> file handles which were the start of this do (and have to) encode a
> persistent file system identifier. As do the xfs handles (although they
> do the decoding in the userspace library on Linux for historic reasons),
> as do the FreeBSD equivalents to these syscalls.
So I was wondering how this actually works in practice. Checking the
code, the NFS server uses (depending on the configuration in /etc/exports)
either the device number as the filesystem identifier or the fsid / uuid
specified in /etc/exports.
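
To make the two cases concrete, here is a toy model of that choice. It is
only a sketch of the behaviour described above, not the nfsd or mountd
code: the function name choose_fs_identifier is made up for illustration,
and the only thing taken from the real configuration format is the
"fsid=" export option.

```python
import os


def choose_fs_identifier(export_options, st_dev):
    """Toy model: prefer an explicit fsid= export option, otherwise
    fall back to the (non-persistent) device number.

    export_options is a comma-separated option string as it would appear
    in /etc/exports; st_dev is a device number as returned by os.stat().
    """
    for opt in export_options.split(","):
        key, sep, value = opt.strip().partition("=")
        if key == "fsid" and sep and value:
            # Administrator-configured identifier, stable across reboots.
            return ("fsid", value)
    # No configured identifier: only the device number is left, and that
    # may change when devices are enumerated in a different order.
    return ("dev", (os.major(st_dev), os.minor(st_dev)))
```

With "rw,sync,fsid=42" the configured value wins; with plain "rw,sync" the
result depends entirely on the device number the filesystem happens to get.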
> > An alternative would be to return something unique to the filesystem
> > superblock, but as far as I can tell there is no guarantee that every
> > Linux filesystem's fsid is sufficiently unique to act as a globally
> > unique identifier. At least with a 64-bit mount ID and statmount(2),
> > userspace can decide what information is needed to get sufficiently
> > unique information about the source filesystem.
>
> Well, every file system that supports export ops already needs a
> globally unique ID for NFS to work properly. We might not have good
> enough interfaces for that, but that shouldn't be too hard.
So as my research above shows, this ID is either manually configured in
/etc/exports or the NFS server uses the device number, which is not
guaranteed to be persistent. Filesystems, at least currently, have no
obligation to provide anything (and some of them indeed don't provide any
uuid or persistent fsid). I guess that's the reason why the mount ID is
reported by name_to_handle_at().

Don't get me wrong, I agree with your reservations about the mount ID
(per-mount instead of per-sb, non-persistent) and I agree the properties
you describe are the gold standard we should strive for. I mainly wanted
to point out that this is not reality today and, in particular, providing
the "persistency" guarantee for the filesystem ID may require on-disk
format changes for some filesystems.

So returning the 64-bit mount ID from name_to_handle_at() weasels out of
these "how to identify an arbitrary superblock" problems by giving userspace
a reasonably reliable way to generate this superblock identifier itself.
I'm fully open to a less error-prone API for this, but at this point I
don't see one, so changing the mount ID returned from name_to_handle_at()
to a 64-bit unique one seems like a sane practical choice to me...
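
As a rough illustration of what "generate the identifier itself" means for
userspace, the sketch below maps a mount ID to some superblock-level
information by parsing text in the /proc/self/mountinfo format. This is an
assumption-laden example: the helper names are invented, the sample line
follows the format documented in proc(5), and mountinfo exposes the
existing reusable mount ID. With a 64-bit unique mount ID the lookup would
presumably go through statmount(2) instead, which is not shown here.

```python
# Sample line in the /proc/self/mountinfo format (see proc(5)).
SAMPLE_MOUNTINFO = (
    "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue\n"
)


def parse_mountinfo_line(line):
    """Split one mountinfo line into the fields used below."""
    # Everything before the lone "-" field is per-mount data,
    # everything after it describes the filesystem / superblock.
    left, sep, right = line.rstrip("\n").partition(" - ")
    if not sep:
        raise ValueError("not a mountinfo line: %r" % line)
    fields = left.split()
    fstype, source = right.split()[:2]
    return {
        "mount_id": int(fields[0]),
        "dev": fields[2],        # major:minor of the superblock
        "fstype": fstype,
        "source": source,
    }


def superblock_key(mount_id, mountinfo_text):
    """Hypothetical helper: return (dev, fstype, source) for a mount ID,
    or None if the mount is not listed (e.g. it has gone away)."""
    for line in mountinfo_text.splitlines():
        if not line.strip():
            continue
        entry = parse_mountinfo_line(line)
        if entry["mount_id"] == mount_id:
            return (entry["dev"], entry["fstype"], entry["source"])
    return None
```

Which of these fields is "unique enough" is exactly the decision that is
left to userspace here; the key above is only one possible choice.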
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Mon, May 27, 2024 at 03:34:30PM +0200, Jan Kara wrote:
> So I was wondering how this is actually working in practice. Checking the
> code, NFS server is (based on configuration in /etc/exports) either using
> device number as the filesystem identifier or fsid / uuid as specified in
> /etc/exports.
Yes, it's a rather suboptimal implementation.
> So returning the 64-bit mount ID from name_to_handle_at() weasels out of
> these "how to identify arbitrary superblock" problems by giving userspace a
> reasonably reliable way to generate this superblock identifier itself. I'm
> fully open to less error-prone API for this but at this point I don't see it
> so changing the mount ID returned from name_to_handle_at() to 64-bit unique
> one seems like a sane practical choice to me...
Well, how about we fix the thing for real:

 - allow file systems to provide a unique identifier of at least
   uuid size (16 bytes) in the superblock or through an export operation
 - for non-persistent file systems allow generating one at boot time
   using the normal uuid generation helpers
 - add a new flag to name_to_handle_at/open_by_handle_at to include it
   in the file handle, and thus make the file handle work more like
   the normal file handle
 - add code to nfsd to directly make use of this

This would solve all the problems in this proposal as well as all the
obvious ones it doesn't solve.
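
To show the shape of such a handle, here is a minimal userspace sketch
of a file handle that carries a 16-byte filesystem identifier in front of
the opaque per-file part. Everything in it is hypothetical: the layout
(uuid, handle type, payload length, payload), the function names, and the
use of a random uuid to stand in for the "generated at boot time" case.
It is not an existing kernel or nfsd format.

```python
import struct
import uuid

# Hypothetical header: 16-byte fs uuid, signed handle type, payload length.
_HEADER = struct.Struct("<16siI")


def pack_handle(fs_uuid, handle_type, payload):
    """Prefix the opaque per-file handle bytes with the filesystem uuid."""
    return _HEADER.pack(fs_uuid.bytes, handle_type, len(payload)) + payload


def unpack_handle(blob):
    """Recover (fs_uuid, handle_type, payload) from a packed handle."""
    raw_uuid, handle_type, length = _HEADER.unpack_from(blob)
    payload = blob[_HEADER.size:_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated handle")
    return uuid.UUID(bytes=raw_uuid), handle_type, payload


# A non-persistent file system would get a fresh identifier each boot;
# uuid4() stands in for the kernel's uuid generation helpers here.
boot_time_uuid = uuid.uuid4()
```

A consumer holding such a handle can first match the uuid against the
filesystems it knows about and only then hand the remaining bytes to the
filesystem, without needing a mount ID at all.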