2009-01-19 17:25:02

by Coly Li

Subject: [PATCH 0/20] return f_fsid for statfs(2)

Currently, many file systems in the Linux kernel do not return f_fsid in the statfs info; the value
is set to 0 in the VFS layer. In some situations, however, f_fsid from statfs(2) is useful,
especially when used as part of an (f_fsid, ino) pair to uniquely identify a file.
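
As an illustration of the (f_fsid, ino) use case, here is a minimal user-space sketch for Linux. The names struct file_key, file_key_get() and file_key_equal() are hypothetical and are not part of the patches; the sketch only shows how a caller might combine statfs(2) and stat(2).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/vfs.h>	/* Linux-specific header for statfs(2) */

/* Hypothetical helper type: an (f_fsid, ino) pair used as a file key. */
struct file_key {
	uint32_t fsid[2];
	uint64_t ino;
};

/* Fill *key for path. Returns 0 on success, -1 on error (errno is set). */
static int file_key_get(const char *path, struct file_key *key)
{
	struct statfs sfs;
	struct stat st;

	assert(path != NULL && key != NULL);

	if (statfs(path, &sfs) != 0 || stat(path, &st) != 0)
		return -1;

	/* fsid_t is opaque, so copy its two 32-bit words without naming members. */
	memcpy(key->fsid, &sfs.f_fsid, sizeof(key->fsid));
	key->ino = (uint64_t)st.st_ino;
	return 0;
}

/* Two keys name the same file only if both the fsid and the inode match. */
static int file_key_equal(const struct file_key *a, const struct file_key *b)
{
	return a->fsid[0] == b->fsid[0] &&
	       a->fsid[1] == b->fsid[1] &&
	       a->ino == b->ino;
}
```

On a file system that still reports f_fsid as 0, such a key degrades to the inode number alone, which is the gap the patches aim to close.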

The basic idea of the patches is to generate a unique fs ID with huge_encode_dev(sb->s_bdev->bd_dev)
that is valid for the lifetime of the mount (no endianness issue). sb is a pointer to the struct
super_block of the mounted file system being accessed by statfs(2).
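
A rough user-space sketch of that idea follows. encode_dev_sketch() is a stand-in that follows, to the best of my understanding, the bit layout of the kernel's device-number encoding (12-bit major, 20-bit minor); splitting the result into val[0] and val[1] is an assumption about how the ID would be stored, since this cover letter does not show it. All names in the sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for the kernel encoding of (major, minor) into one value:
 * low 8 bits of minor, then major, then the remaining minor bits.
 */
static uint64_t encode_dev_sketch(uint32_t major, uint32_t minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return (uint64_t)(minor & 0xffu) |
	       ((uint64_t)major << 8) |
	       ((uint64_t)(minor & ~0xffu) << 12);
}

/* Hypothetical mirror of the two 32-bit words of f_fsid. */
struct dev_fsid {
	uint32_t val[2];
};

/* Assumed storage: low 32 bits in val[0], high 32 bits in val[1]. */
static struct dev_fsid fsid_from_dev(uint32_t major, uint32_t minor)
{
	uint64_t id = encode_dev_sketch(major, minor);
	struct dev_fsid fsid = { { (uint32_t)id, (uint32_t)(id >> 32) } };

	return fsid;
}
```

For example, major 8 and minor 1 encode to 0x801. Because the value is computed from the in-memory device number rather than read from disk, it involves no byte-order conversion, but it is only as stable as the device numbering itself.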

The patches are quite simple; any feedback or patch review is welcome.

Thanks.
--
Coly Li
SuSE Labs





2009-01-19 19:29:22

by Dave Kleikamp

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

These patches probably belong in linux-fsdevel, but I hesitate to ask
you to repost all twenty.

On Tue, 2009-01-20 at 01:30 +0800, Coly Li wrote:
> Currently many file systems in Linux kernel do not return f_fsid in statfs info, the value is set as
> 0 in vfs layer. Anyway, in some conditions, f_fsid from statfs(2) is useful, especially being used
> as (f_fsid, ino) pair to uniquely identify a file.
>
> Basic idea of the patches is generating a unique fs ID by huge_encode_dev(sb->s_bdev->bd_dev) during
> file system mounting life time (no endian consistent issue). sb is a point of struct super_block of
> current mounted file system being accessed by statfs(2).

ext[234] return a portion of the uuid in f_fsid. There is a theoretical
chance of those values being non-unique. Since there doesn't appear to
be any case for the fsid to be persistent between boots, I guess
huge_encode_dev() is probably a better choice. In practice it probably
makes no difference.

> The patches are quite simple, any feedback or patch review is welcome.

They look reasonable to me.

Shaggy
--
David Kleikamp
IBM Linux Technology Center

2009-01-19 23:37:51

by Andreas Dilger

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

On Jan 19, 2009 13:28 -0600, Dave Kleikamp wrote:
> ext[234] return a portion of the uuid in f_fsid. There is a theoretical
> chance of those values being non-unique. Since there doesn't appear to
> be any case for the fsid to be persistent between boots, I guess
> huge_encode_dev() is probably a better choice. In practice it probably
> makes no difference.

I'm not sure what you mean about "no case for fsid to be persistent"?
The whole point of fsid (for NFS) is that this identifies the filesystem
over reboot, even if the block device ID changes, or if the filesystem
doesn't have a block device at all (e.g. cluster filesystem).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-01-20 02:39:27

by Dave Kleikamp

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

On Tue, 2009-01-20 at 07:36 +0800, Andreas Dilger wrote:
> On Jan 19, 2009 13:28 -0600, Dave Kleikamp wrote:
> > ext[234] return a portion of the uuid in f_fsid. There is a theoretical
> > chance of those values being non-unique. Since there doesn't appear to
> > be any case for the fsid to be persistent between boots, I guess
> > huge_encode_dev() is probably a better choice. In practice it probably
> > makes no difference.
>
> I'm not sure what you mean about "no case for fsid to be persistent"?
> The whole point of fsid (for NFS) is that this identifies the filesystem
> over reboot, even if the block device ID changes, or if the filesystem
> doesn't have a block device at all (e.g. cluster filesystem).

I guess that just demonstrates how little I know about what the fsid is
about. Would it be preferable for file systems that have a uuid to use
that instead? Of course anything is an improvement over zeroes.

Shaggy
--
David Kleikamp
IBM Linux Technology Center

2009-01-20 04:14:21

by Andreas Dilger

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

On Jan 19, 2009 20:39 -0600, Dave Kleikamp wrote:
> On Tue, 2009-01-20 at 07:36 +0800, Andreas Dilger wrote:
> > The whole point of fsid (for NFS) is that this identifies the filesystem
> > over reboot, even if the block device ID changes, or if the filesystem
> > doesn't have a block device at all (e.g. cluster filesystem).
>
> I guess that just demonstrates how little I know about what the fsid is
> about. Would it be preferable for file systems that have a uuid to use
> that instead? Of course anything is an improvement over zeroes.

Yes, that is what the ext* patches do - fold the 128-bit UUID into a 64-bit
fsid so that it is constant across reboots. The chance of UUID collision
is about 1/2^32 due to birthday paradox, which is fairly low, and in case
this happens one of the filesystem UUIDs can be regenerated.
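
As a rough check on the birthday estimate, assume the folded 64-bit fsid behaves like a uniformly random value (an assumption, not something the patches guarantee). For n filesystems the probability of at least one collision is then approximately:

```latex
P(n) \;=\; 1 - \prod_{k=1}^{n-1}\left(1 - \frac{k}{2^{64}}\right)
     \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2^{65}}\right)
     \;\approx\; \frac{n^{2}}{2^{65}} \quad (n \ll 2^{32})

% Worked values:
%   n = 10^{3}:  P \approx 10^{6} / 2^{65} \approx 2.7 \times 10^{-14}
%   n = 2^{32}:  P \approx 1 - e^{-1/2} \approx 0.39
```

Under this assumption, a collision only becomes likely once on the order of 2^32 filesystems are compared, while any two given filesystems collide with probability 2^-64.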

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-01-20 04:25:10

by Coly Li

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)



Andreas Dilger Wrote:
> On Jan 19, 2009 20:39 -0600, Dave Kleikamp wrote:
>> On Tue, 2009-01-20 at 07:36 +0800, Andreas Dilger wrote:
>>> The whole point of fsid (for NFS) is that this identifies the filesystem
>>> over reboot, even if the block device ID changes, or if the filesystem
>>> doesn't have a block device at all (e.g. cluster filesystem).
>> I guess that just demonstrates how little I know about what the fsid is
>> about. Would it be preferable for file systems that have a uuid to use
>> that instead? Of course anything is an improvement over zeroes.
>
> Yes, that is what the ext* patches do - fold the 128-bit UUID into a 64-bit
> fsid so that it is constant across reboots. The chance of UUID collision
> is about 1/2^32 due to birthday paradox, which is fairly low, and in case
> this happens one of the filesystem UUIDs can be regenerated.
>
Ext[234] is sophisticated enough to have an on-disk uuid record. Most file systems in the patches
(except jfs and reiser3) do not have a persistent uuid, so a reasonable and feasible solution that
avoids changing the on-disk format is an fsid that is valid for the boot/mount life cycle. That's
why huge_encode_dev(sb->s_bdev->bd_dev) is used here.
For jfs and reiserfs3, is there any use case for an fsid that persists across boots?

Thanks for your reviews.
--
Coly Li
SuSE Labs

2009-01-20 04:45:24

by Andreas Dilger

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

On Jan 20, 2009 12:30 +0800, Coly Li wrote:
> Ext[234] is sophisticated to have on-disk uuid record. Most file systems
> in the patches (except jfs and reiser3) do not have a persistent uuid,
> a reasonable/feasible solution without media format modification is fsid
> in boot/mount life cycle. That's why huge_encode_dev(sb->s_bdev->bd_dev)
> is used here. For jfs and reiserfs3, is there any use case for
> persistent fsid cross boots ?

I would say yes, this is worthwhile to do, or the fsid can change between
boots unnecessarily.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-01-20 06:58:13

by Coly Li

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)



Andreas Dilger Wrote:
> On Jan 20, 2009 12:30 +0800, Coly Li wrote:
>> Ext[234] is sophisticated to have on-disk uuid record. Most file systems
>> in the patches (except jfs and reiser3) do not have a persistent uuid,
>> a reasonable/feasible solution without media format modification is fsid
>> in boot/mount life cycle. That's why huge_encode_dev(sb->s_bdev->bd_dev)
>> is used here. For jfs and reiserfs3, is there any use case for
>> persistent fsid cross boots ?
>
> I would say yes, this is worthwhile to do, or the fsid can change between
> boots unnecessarily.
>
If no repartitioning happens between boots/mounts, the fsid from huge_encode_dev() should be
identical. For file systems without a uuid, IMHO the huge_encode_dev() method is acceptable.
But yes, for jfs and reiserfs3 it is possible to provide an fsid that persists across boots. Here
are examples:
- in fs/jfs/super.c:jfs_statfs(), generate f_fsid by:
	buf->f_fsid.val[0] = crc32_le(0, sbi->uuid, sizeof(sbi->uuid)/2);
	buf->f_fsid.val[1] = crc32_le(0, sbi->uuid + sizeof(sbi->uuid)/2,
				      sizeof(sbi->uuid)/2);
- in fs/reiserfs/super.c:reiserfs_statfs(), generate f_fsid by:
	buf->f_fsid.val[0] = (u32)crc32_le(0, rs->s_uuid, sizeof(rs->s_uuid)/2);
	buf->f_fsid.val[1] = (u32)crc32_le(0, rs->s_uuid + sizeof(rs->s_uuid)/2,
					   sizeof(rs->s_uuid)/2);

I will update the corresponding patches with this implementation. Thanks for your comments.
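
The uuid-based scheme in the examples above can be tried out in user space with the sketch below. crc32_le_sketch() is a slow bitwise stand-in for the kernel's crc32_le() (reflected polynomial 0xEDB88320, no pre- or post-inversion); struct uuid_fsid and fsid_from_uuid() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise little-endian CRC-32; the caller supplies the seed. */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
	}
	return crc;
}

/* Hypothetical mirror of the two 32-bit words of f_fsid. */
struct uuid_fsid {
	uint32_t val[2];
};

/* Hash each 8-byte half of a 16-byte uuid into one 32-bit word. */
static struct uuid_fsid fsid_from_uuid(const unsigned char uuid[16])
{
	struct uuid_fsid fsid;

	assert(uuid != NULL);
	fsid.val[0] = crc32_le_sketch(0, uuid, 8);
	fsid.val[1] = crc32_le_sketch(0, uuid + 8, 8);
	return fsid;
}
```

Since the result depends only on the on-disk uuid, it stays the same across reboots and device renumbering. One property worth noting: with a seed of 0, an all-zero uuid hashes to an fsid of 0, which is indistinguishable from the current "not set" value.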

--
Coly Li
SuSE Labs

2009-01-20 18:49:27

by Jamie Lokier

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)

Andreas Dilger wrote:
> On Jan 20, 2009 12:30 +0800, Coly Li wrote:
> > Ext[234] is sophisticated to have on-disk uuid record. Most file systems
> > in the patches (except jfs and reiser3) do not have a persistent uuid,
> > a reasonable/feasible solution without media format modification is fsid
> > in boot/mount life cycle. That's why huge_encode_dev(sb->s_bdev->bd_dev)
> > is used here. For jfs and reiserfs3, is there any use case for
> > persistent fsid cross boots ?
>
> I would say yes, this is worthwhile to do, or the fsid can change between
> boots unnecessarily.

Even FAT has a volume id which should probably be used.
I'm guessing NTFS does too.

-- Jamie

2009-01-22 19:31:21

by Coly Li

Subject: Re: [PATCH 0/20] return f_fsid for statfs(2)



Jamie Lokier Wrote:
> Andreas Dilger wrote:
>> On Jan 20, 2009 12:30 +0800, Coly Li wrote:
>>> Ext[234] is sophisticated to have on-disk uuid record. Most file systems
>>> in the patches (except jfs and reiser3) do not have a persistent uuid,
>>> a reasonable/feasible solution without media format modification is fsid
>>> in boot/mount life cycle. That's why huge_encode_dev(sb->s_bdev->bd_dev)
>>> is used here. For jfs and reiserfs3, is there any use case for
>>> persistent fsid cross boots ?
>> I would say yes, this is worthwhile to do, or the fsid can change between
>> boots unnecessarily.
>
> Even FAT has a volume id which should probably be used.
> I'm guessing NTFS does too.
>
The vfat volume id can be assigned manually in mkfs.vfat, so vfat volumes on one machine can end up
with the same volume id; even worse, the volume id can be modified while the volume is mounted.
Therefore, IMHO it is not suitable for use as an fsid.

--
Coly Li
SuSE Labs