2005-03-30 19:44:08

by David Malone

Subject: Directory link count wrapping on Linux/XFS/i386?

I was looking around to see how Linux handles directories with a
high link count (i.e. when they have many subdirectories) and I think
I have stumbled across a bug in the Linux xfs glue.

It seems that internally xfs uses a 32 bit field for the link count,
and the stat64 syscalls use a 32 bit field. These fields are copied
via the vattr structure in xfs_vnode.h, which uses a nlink_t for
the link count. However, in the kernel, I think this field is
actually of type __kernel_nlink_t which seems to be 16 bits on many
platforms.
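
A minimal userland sketch of the suspected truncation, assuming the
intermediate type really is 16 bits wide. The names narrow_nlink_t,
dir_link_count and copy_via_narrow are made up for illustration and are
not kernel identifiers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit __kernel_nlink_t. */
typedef uint16_t narrow_nlink_t;

/*
 * Link count a directory is expected to report: its own "." entry,
 * its name in the parent, and one ".." entry per subdirectory.
 */
uint32_t dir_link_count(uint32_t subdirs)
{
    assert(subdirs <= UINT32_MAX - 2);  /* keep the 32-bit sum from overflowing */
    return subdirs + 2;
}

/*
 * Copy a 32-bit count through a 16-bit intermediate, which is what the
 * copy via struct vattr is suspected of doing. The high 16 bits are lost.
 */
uint32_t copy_via_narrow(uint32_t nlink)
{
    narrow_nlink_t tmp = (narrow_nlink_t)nlink;
    return tmp;
}
```

Under that assumption, copy_via_narrow(dir_link_count(65536)) gives 2,
the same value as for a directory with no subdirectories at all.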

I've tested this on an i386 2.6.11 kernel and it seems that the
link count presented to userland wraps after 65536 subdirectories.
This naturally doesn't let you screw up the filesystem or anything,
but it does let you hide files from find/fts, as demonstrated
below.

I guess to fix it you'd change the type of nlink in struct vattr
so that it is the same type (unsigned int) as the type in struct
kstat. I've included the obvious patch, but I don't have a machine
that I can test it on right now.

David.

turing 2% mkdir testdir
turing 3% cd testdir
turing 4% ls -ld .
drwxr-xr-x 2 dwmalone dwmalone 6 Mar 30 12:18 .
turing 5% perl ../mk65536dirs.pl
turing 6% ls -ld .
drwxr-xr-x 2 dwmalone dwmalone 1056768 Mar 30 12:19 .
turing 7% mkdir .hidden
turing 8% touch .hidden/secret
turing 9% find . -name secret -print


--- /usr/src/linux-2.6.11/fs/xfs/linux-2.6/xfs_vnode.h 2005-03-02 07:38:33.000000000 +0000
+++ /tmp/xfs_vnode.h 2005-03-30 18:49:22.000000000 +0100
@@ -409,7 +409,7 @@
 	int		va_mask;	/* bit-mask of attributes present */
 	enum vtype	va_type;	/* vnode type (for create) */
 	mode_t		va_mode;	/* file access mode and type */
-	nlink_t		va_nlink;	/* number of references to file */
+	unsigned int	va_nlink;	/* number of references to file */
 	uid_t		va_uid;		/* owner user id */
 	gid_t		va_gid;		/* owner group id */
 	xfs_ino_t	va_nodeid;	/* file id */


2005-03-30 20:08:40

by Andreas Dilger

Subject: Re: Directory link count wrapping on Linux/XFS/i386?

On Mar 30, 2005 20:43 +0100, David Malone wrote:
> It seems that internally xfs uses a 32 bit field for the link count,
> and the stat64 syscalls use a 32 bit field. These fields are copied
> via the vattr structure in xfs_vnode.h, which uses a nlink_t for
> the link count. However, in the kernel, I think this field is
> actually of type __kernel_nlink_t which seems to be 16 bits on many
> platforms.
>
> I've tested this on an i386 2.6.11 kernel and it seems that the
> link count presented to userland wraps after 65536 subdirectories.
> This naturally doesn't let you screw up the filesystem or anything,
> but it does let you hide files from find/fts, as demonstrated
> below.

The correct fix, used for reiserfs (and a patch for ext3 also) is to
set i_nlink = 1 in case the filesystem count has wrapped. When nlink==1
the fts/find code no longer optimizes subdirectory traversal and checks
each entry's filetype to see if it should recurse.
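
A rough sketch of the idea, not the actual reiserfs or ext3 code: once
the count would exceed what the filesystem can store, it is pinned at 1
and left there. SKETCH_LINK_MAX, dir_nlink_inc and dir_nlink_dec are
hypothetical names, and the limit of 65535 is only an assumption for a
16-bit on-disk field:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for a 16-bit on-disk link count. */
#define SKETCH_LINK_MAX 65535u

/* Called when a subdirectory is created. */
void dir_nlink_inc(uint32_t *nlink)
{
    assert(nlink != NULL);
    if (*nlink == 1)
        return;                 /* already marked "unknown": stay there */
    if (++*nlink >= SKETCH_LINK_MAX)
        *nlink = 1;             /* too many to count: report 1 instead of wrapping */
}

/* Called when a subdirectory is removed. */
void dir_nlink_dec(uint32_t *nlink)
{
    assert(nlink != NULL);
    if (*nlink != 1)            /* an "unknown" count is never decremented */
        --*nlink;
}
```

In this sketch the count never recovers on its own once it has been
pinned at 1; whether a real filesystem recomputes it later is outside
what the sketch tries to show.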

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



2005-03-31 00:48:37

by Nathan Scott

Subject: Re: Directory link count wrapping on Linux/XFS/i386?

On Wed, Mar 30, 2005 at 01:06:01PM -0700, Andreas Dilger wrote:
> On Mar 30, 2005 20:43 +0100, David Malone wrote:
> > It seems that internally xfs uses a 32 bit field for the link count,
> > and the stat64 syscalls use a 32 bit field. These fields are copied
> > via the vattr structure in xfs_vnode.h, which uses a nlink_t for
> > the link count. However, in the kernel, I think this field is
> > actually of type __kernel_nlink_t which seems to be 16 bits on many
> > platforms.

Yes, use of nlink_t looks wrong there, thanks. There's one or two other
uses of it in XFS as well; I'll audit those.

> The correct fix, used for reiserfs (and a patch for ext3 also) is to
> set i_nlink = 1 in case the filesystem count has wrapped. When nlink==1
> the fts/find code no longer optimizes subdirectory traversal and checks
> each entry's filetype to see if it should recurse.

Ah, I see - the INC_DIR_INODE_NLINK/DEC_DIR_INODE_NLINK macros, right.
I'll look into that too, thanks.

cheers.

--
Nathan

2005-03-31 02:31:52

by Nathan Scott

Subject: Re: Directory link count wrapping on Linux/XFS/i386?

On Thu, Mar 31, 2005 at 10:42:58AM +1000, Nathan Scott wrote:
> On Wed, Mar 30, 2005 at 01:06:01PM -0700, Andreas Dilger wrote:
> > The correct fix, used for reiserfs (and a patch for ext3 also) is to
> > set i_nlink = 1 in case the filesystem count has wrapped. When nlink==1
> > the fts/find code no longer optimizes subdirectory traversal and checks
> > each entry's filetype to see if it should recurse.
>
> Ah, I see - the INC_DIR_INODE_NLINK/DEC_DIR_INODE_NLINK macros, right.
> I'll look into that too, thanks.

Hmm, since struct inode has an unsigned int as nlink, it'd
seem doing this sort of thing is only useful for filesystems
where the ondisk nlink is a 16 bit value (and ext2/3/reiserfs
do seem to be in that category, afaict).

So, David's patch (and those one/two other cases) would seem
to be enough to get this resolved. There isn't much using
nlink_t - looks like mainly the non-stat64 'stat' calls and
one or two other (possibly accidental?) uses like we had in
XFS.

cheers.

--
Nathan

2005-03-31 06:25:03

by Jan Engelhardt

Subject: Re: Directory link count wrapping on Linux/XFS/i386?


>but it does let you hide files from find/fts, as demonstrated
>below.

That's because `find` optimizes its searching by looking at the link count.
IIRC, the -noleaf option should make it visible again.
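
To illustrate, here is a simplified model of that optimization. It is a
sketch and not find's actual code: dirs_descended and its parameters are
hypothetical, and it only counts how many entries a traversal would
descend into. It assumes the traversal trusts a link count of 2 or more
to mean "nlink - 2 subdirectories" and stops checking entries once that
many have been seen:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * is_dir[i] says whether entry i of the directory is itself a directory.
 * noleaf models find's -noleaf option: never trust the link count.
 * Returns the number of entries the traversal descends into.
 */
size_t dirs_descended(uint32_t nlink, const bool *is_dir, size_t n, bool noleaf)
{
    assert(is_dir != NULL || n == 0);

    /* A count below 2 cannot be "2 + subdirectories", so it is not trusted. */
    bool trust = !noleaf && nlink >= 2;
    uint32_t remaining = trust ? nlink - 2 : 0;
    size_t descended = 0;

    for (size_t i = 0; i < n; i++) {
        if (trust && remaining == 0)
            break;              /* everything left is assumed to be a leaf */
        if (is_dir[i]) {
            descended++;
            if (trust)
                remaining--;
        }
    }
    return descended;
}
```

With a wrapped count of 2 the loop stops before looking at any entry, so
a directory such as .hidden is never entered. With noleaf set, or with a
link count of 1, every entry is checked.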

>turing 7% mkdir .hidden
>turing 8% touch .hidden/secret
>turing 9% find . -name secret -print



Jan Engelhardt
--
No TOFU for me, please.