2009-05-05 20:25:44

by Andreas Dilger

Subject: [RFC] new INCOMPAT flag for extended directory data

Ted,
we're looking to store some extended data in each directory entry
for Lustre, to hold a 128-bit filesystem-unique file identifier
in the dirent. If we ever wanted to look at 64-bit or larger
inode numbers we would need to do the same.

There are a couple of approaches to this: either put the extra data
beyond name_len but within rec_len, or put it within name_len but
after a NUL terminator. Keeping the extra dirent data within name_len
is somewhat easier to implement, since only the few places that do
filename comparisons/hashing need to be changed.
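
As an illustration of the data-in-name_len option, here is a minimal sketch
of how a filename comparison could ignore anything after the NUL terminator.
The struct and every sketch_* name are hypothetical simplifications for this
note, not the ext4 on-disk definitions or kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, fixed-size stand-in for a directory entry (assumption:
 * real ext4 dirents are variable-length and little-endian on disk). */
struct sketch_dirent {
    uint32_t inode;
    uint16_t rec_len;
    uint8_t  name_len;   /* covers name + NUL + extra data in this scheme */
    uint8_t  file_type;
    char     name[255];
};

/* Length of the filename proper: stop at the first NUL within name_len. */
static size_t sketch_real_name_len(const struct sketch_dirent *de)
{
    const void *nul = memchr(de->name, '\0', de->name_len);
    return nul ? (size_t)((const char *)nul - de->name) : de->name_len;
}

/* Compare a lookup name against the filename portion only. */
static int sketch_name_match(const struct sketch_dirent *de,
                             const char *name, size_t len)
{
    return sketch_real_name_len(de) == len &&
           memcmp(de->name, name, len) == 0;
}

/* Extra data, if any, starts just past the NUL; returns its length. */
static size_t sketch_extra_data(const struct sketch_dirent *de,
                                const uint8_t **data)
{
    size_t n = sketch_real_name_len(de);

    if (n >= de->name_len) {
        *data = NULL;
        return 0;
    }
    *data = (const uint8_t *)de->name + n + 1;
    return de->name_len - n - 1;
}
```

With this layout, hashing would use sketch_real_name_len() in the same way,
so the extra bytes never influence lookup.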

In order to detect the presence of this extra data in the dirent, we
would want to use the high bits in d_type (say bits 0xf0). This part
of d_type could either be a set of flags for the presence of 4
different kinds of data (which limits the number of different kinds of
data), or it could be the length of the extra data (which means there
is no way to identify the type of data being stored there). get_dtype()
would mask off the high bits of d_type so as not to confuse filldir.
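
The masking described above can be sketched as follows. The macro and
function names are hypothetical, chosen for this note; only the 0xf0 split
comes from the text. Both readings of the high nibble (flags or length) are
shown, since the choice between them is left open.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FT_MASK      0x0f  /* low nibble: ordinary file type */
#define SKETCH_DIRDATA_MASK 0xf0  /* high nibble: proposed extra-data bits */

/* What a get_dtype()-style helper would hand on to filldir. */
static uint8_t sketch_file_type(uint8_t raw)
{
    return raw & SKETCH_FT_MASK;
}

/* Reading 1: the high nibble is a set of up to 4 presence flags. */
static uint8_t sketch_dirdata_flags(uint8_t raw)
{
    return raw & SKETCH_DIRDATA_MASK;
}

/* Reading 2: the high nibble is a length (units are an assumption here). */
static uint8_t sketch_dirdata_len(uint8_t raw)
{
    return (uint8_t)((raw & SKETCH_DIRDATA_MASK) >> 4);
}
```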

If e2fsck detected these bits set in d_type and name_len != strlen(name),
it would either ask to set the INCOMPAT_DIRDATA feature or, failing
that, clear the flag in d_type and set name_len = strlen(name).
I don't think there are any valid name encodings that have an embedded NUL
byte.
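
The e2fsck rule above could look roughly like the following. This is only a
sketch of the decision logic under the assumptions stated in the comments;
the names are hypothetical and none of this is e2fsprogs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DIRDATA_MASK 0xf0

enum sketch_action { SKETCH_OK, SKETCH_ASK_SET_FEATURE };

/* strlen() bounded by name_len, since dirent names need not be terminated. */
static uint8_t sketch_strlen_in(const char *name, uint8_t name_len)
{
    const char *nul = memchr(name, '\0', name_len);
    return nul ? (uint8_t)(nul - name) : name_len;
}

/* Decide what to do with one dirent. The case "high bits set but no data
 * after the name" is not covered by the proposal and is passed as OK here. */
static enum sketch_action sketch_check(uint8_t file_type, const char *name,
                                       uint8_t name_len, int has_dirdata)
{
    if (!(file_type & SKETCH_DIRDATA_MASK))
        return SKETCH_OK;
    if (name_len == sketch_strlen_in(name, name_len))
        return SKETCH_OK;
    return has_dirdata ? SKETCH_OK : SKETCH_ASK_SET_FEATURE;
}

/* Fallback if the feature is not set: drop the flag bits and the data. */
static void sketch_clear_dirdata(uint8_t *file_type, const char *name,
                                 uint8_t *name_len)
{
    *file_type &= (uint8_t)~SKETCH_DIRDATA_MASK;
    *name_len = sketch_strlen_in(name, *name_len);
}
```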

So, the questions:
- do you have a strong objection to this?
- do you prefer data-in-name_len or data-in-rec_len?
- can we get an EXT4_FEATURE_INCOMPAT_DIRDATA = 0x200 flag for this?
- can we reserve the high 4 bits of d_type, and use EXT4_FT_DIRDATA = 0x20
for our 128-bit identifier? 0x20 would match both the length of the
identifier in 4-byte words, or be a flag indicating this FID is present.
We can keep 0x10 for the inode_hi field (which will also match the length
of a 32-bit inode_hi field and/or the presence of inode_hi) and we can
defer the decision on whether this is the length or the type of the extra
data.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



2009-08-20 01:40:57

by Andreas Dilger

Subject: Re: [RFC] new INCOMPAT flag for extended directory data

On May 05, 2009 14:25 -0600, Andreas Dilger wrote:
> we're looking to store some extended data in each directory entry
> for Lustre, to hold a 128-bit filesystem-unique file identifier
> in the dirent. If we ever wanted to look at 64-bit or larger
> inode numbers we would need to do the same.
>
> In order to detect the presence of this extra data in the dirent, we
> would want to use the high bits in d_type (say bits 0xf0). This part
> of d_type could be a flag for the presence of 4 different types of
> data (which limits the number of different kinds of data). The d_type
> would mask off the high bits in get_dtype() so as not to confuse filldir.

Ted,
could we at least reserve the INCOMPAT_DIRDATA flag to avoid conflicts:

#define EXT4_FEATURE_INCOMPAT_DIRDATA 0x1000

and also reserve the high 4 bits of d_type:

#define EXT4_DIRENT_LUFID 0x10 /* Lustre 128-bit unique file identifier */

0x20, 0x40, and 0x80 would be available for future expansion (e.g. high
32 bits of inode number). If needed, type 0x80 can be extended by adding
a 1-byte subtype after the length byte, though I doubt this would be needed.
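
To make the flag scheme concrete, here is a sketch of walking the extra data
when each high bit of d_type marks one field. It assumes that fields appear
in bit order (0x10 first) and that each field starts with a length byte that
counts itself; the message only implies that a length byte exists, so both
points are assumptions, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT4_DIRENT_LUFID 0x10  /* value taken from the message above */

/* Sum the lengths of all extra-data fields flagged in file_type.
 * data points just past the name's NUL; avail is the bytes left there.
 * Returns the total length, or -1 if a field is malformed or truncated. */
static int sketch_dirdata_total_len(uint8_t file_type,
                                    const uint8_t *data, size_t avail)
{
    size_t off = 0;

    for (unsigned bit = 0x10; bit <= 0x80; bit <<= 1) {
        uint8_t len;

        if (!(file_type & bit))
            continue;
        if (off >= avail)
            return -1;          /* flag set but no bytes left */
        len = data[off];        /* assumed: length byte includes itself */
        if (len == 0 || len > avail - off)
            return -1;
        off += len;
    }
    return (int)off;
}
```

Under these assumptions a 128-bit identifier would occupy 17 bytes: one
length byte followed by 16 bytes of FID.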

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-08-28 06:03:07

by pravin shelar

Subject: Re: [RFC] new INCOMPAT flag for extended directory data

Hi,
Here is the code for the ext4 extended dirent data feature.

These RFC patches add a facility to ext4 for storing user data in every ext4
dirent. Lustre assigns a cluster-wide unique ID to every inode in the system,
so we have added a user data field to the ext4 dirent to map a filename to
its ID efficiently.

I would like to get feedback on this, as it might conflict with the 64-bit
inode work (which is still in the discussion phase).

patch[1] removes the dx_root struct so that the "." and ".." dirents can have
extra data.
patch[2] adds a user data field to the ext4 dirent.
patch[3] has the e2fs package changes for the dirdata feature.

Thanks,
Pravin.

Andreas Dilger wrote:
> On May 05, 2009 14:25 -0600, Andreas Dilger wrote:
>> we're looking to store some extended data in each directory entry
>> for Lustre, to hold a 128-bit filesystem-unique file identifier
>> in the dirent. If we ever wanted to look at 64-bit or larger
>> inode numbers we would need to do the same.
>>
>> In order to detect the presence of this extra data in the dirent, we
>> would want to use the high bits in d_type (say bits 0xf0). This part
>> of d_type could be a flag for the presence of 4 different types of
>> data (which limits the number of different kinds of data). The d_type
>> would mask off the high bits in get_dtype() so as not to confuse filldir.
>
> Ted,
> could we at least reserve the INCOMPAT_DIRDATA flag to avoid conflicts:
>
> #define EXT4_FEATURE_INCOMPAT_DIRDATA 0x1000
>
> reserve the high 4 bits of d_type:
>
> #define EXT4_DIRENT_LUFID 0x10 /* Lustre 128-bit unique file identifier */
>
> 0x20, 0x40, and 0x80 would be available for future expansion (e.g. high
> 32 bits of inode number). If needed, type 0x80 can be extended by adding
> a 1-byte subtype after the length byte, though I doubt this would be needed.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>


Attachments:
ext4-kill-dx_root.patch (7.40 kB)