2007-01-30 22:41:40

by Bodo Eggert

[permalink] [raw]
Subject: [PATCH] pipefs unique inode numbers

change pipefs to use a unique inode number equal to the memory
address unless it would be truncated.

Signed-Off-By: Bodo Eggert <[email protected]>
---
Tested on i386.

--- 2.6.19/fs/pipe.c.ori 2007-01-30 22:02:46.000000000 +0100
+++ 2.6.19/fs/pipe.c 2007-01-30 23:22:27.000000000 +0100
@@ -864,6 +864,10 @@ static struct inode * get_pipe_inode(voi
inode->i_uid = current->fsuid;
inode->i_gid = current->fsgid;
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+ /* The address of *inode is unique, so we'll get an unique inode number.
+ * Off cause this will not work for 32 bit inodes on 64 bit systems. */
+ if (sizeof(inode->i_ino) >= sizeof(struct inode*))
+ inode->i_ino = (unsigned int) inode;

return inode;


2007-01-30 22:56:46

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers



On Tue, 30 Jan 2007, Bodo Eggert wrote:
>
> change pipefs to use a unique inode number equal to the memory
> address unless it would be truncated.

I *really* wouldn't want to expose kernel addresses to user space, it just
ends up being a piece of data that they shouldn't have. If we have some
security issue, this is just too much kernel information that a bad user
could get at.

Linus

2007-01-31 00:13:39

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers

Linus Torvalds wrote:
>
> On Tue, 30 Jan 2007, Bodo Eggert wrote:
>> change pipefs to use a unique inode number equal to the memory
>> address unless it would be truncated.
>
> I *really* wouldn't want to expose kernel addresses to user space, it just
> ends up being a piece of data that they shouldn't have. If we have some
> security issue, this is just too much kernel information that a bad user
> could get at.
>
> Linus

Agreed. That was my reasoning for proposing the earlier patch that xor'ed
it with a random value (though that's pretty thin protection too).

I think in hindsight though, just pulling the patch that hashes pipefs
inodes is probably the best thing for now. At some point in the future,
if we decide it's enough of a problem, we can always revisit it.

I'm still planning to look over other callers of new_inode to make a
determination about them wrt to i_ino uniqueness. Many of them are not
as performance sensitive as pipefs, and it might not be such a big deal
to just hash those.

-- Jeff

2007-01-31 01:19:46

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers

Bodo Eggert wrote:
> change pipefs to use a unique inode number equal to the memory
> address unless it would be truncated.
>
> Signed-Off-By: Bodo Eggert <[email protected]>
> ---
> Tested on i386.
>
> --- 2.6.19/fs/pipe.c.ori 2007-01-30 22:02:46.000000000 +0100
> +++ 2.6.19/fs/pipe.c 2007-01-30 23:22:27.000000000 +0100
> @@ -864,6 +864,10 @@ static struct inode * get_pipe_inode(voi
> inode->i_uid = current->fsuid;
> inode->i_gid = current->fsgid;
> inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> + /* The address of *inode is unique, so we'll get an unique inode number.
> + * Off cause this will not work for 32 bit inodes on 64 bit systems. */
> + if (sizeof(inode->i_ino) >= sizeof(struct inode*))
> + inode->i_ino = (unsigned int) inode;
>
> return inode;
>

Also, that patch would break many 32-bit programs not compiled with large
offsets when run in compatibility mode on a 64-bit kernel. If they were to
do a stat on this inode, it would likely generate an EOVERFLOW error since
the pointer address would probably not fit in a 32 bit field.

That problem was the whole impetus for this set of patches.

-- Jeff

2007-01-31 01:29:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers



On Tue, 30 Jan 2007, Jeff Layton wrote:
>
> Also, that patch would break many 32-bit programs not compiled with large
> offsets when run in compatibility mode on a 64-bit kernel. If they were to
> do a stat on this inode, it would likely generate an EOVERFLOW error since
> the pointer address would probably not fit in a 32 bit field.
>
> That problem was the whole impetus for this set of patches.

Well, we have that problem with the slowly incrementing "last_ino" too.

Should we make "last_ino" be "static unsigned int" instead of "long"?

Does anybody actually even use the old stat() with 32-bit interfaces? We
warn for it, and we've done so for a long time.. I dont' remember people
even complaining about the warning, so..

Linus

2007-01-31 01:38:06

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers

Jeff Layton wrote:
> Bodo Eggert wrote:
> > change pipefs to use a unique inode number equal to the memory
> > address unless it would be truncated.
> >
> > Signed-Off-By: Bodo Eggert <[email protected]>
> > ---
> > Tested on i386.
> >
> > --- 2.6.19/fs/pipe.c.ori 2007-01-30 22:02:46.000000000 +0100
> > +++ 2.6.19/fs/pipe.c 2007-01-30 23:22:27.000000000 +0100
> > @@ -864,6 +864,10 @@ static struct inode * get_pipe_inode(voi
> > inode->i_uid = current->fsuid;
> > inode->i_gid = current->fsgid;
> > inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> > + /* The address of *inode is unique, so we'll get an unique inode
> number.
> > + * Off cause this will not work for 32 bit inodes on 64 bit
> systems. */
> > + if (sizeof(inode->i_ino) >= sizeof(struct inode*))
> > + inode->i_ino = (unsigned int) inode;
> >
> > return inode;
> >
>
> Also, that patch would break many 32-bit programs not compiled with large
> offsets when run in compatibility mode on a 64-bit kernel. If they were to
> do a stat on this inode, it would likely generate an EOVERFLOW error since
> the pointer address would probably not fit in a 32 bit field.
>
> That problem was the whole impetus for this set of patches.
>

Actually, sorry...I misread the patch. It wouldn't have that problem. My
mistake.

Still though, I considered an approach somewhat similar to this early on.
I was thinking of taking a bit-shifted inode address and hashing it to
give a unique value. If you do the math, you can discard the lower 9 bits
of the pointer, so you end up being able to use the lower 41 bits of the
pointer. So a scheme like that could work if you could guarantee that
all inode addresses wouldn't be > 2^41 apart.

The problem is, you can't guarantee that, especially in a NUMA situation.

See the thread entitled:

[RFC][PATCH] ensure i_ino uniqueness in filesystems without
permanent inode numbers (via pointer conversion)

in linux-fsdevel, ~Nov 17th for more info.

-- Jeff

2007-01-31 02:02:52

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers

Linus Torvalds wrote:
>
> On Tue, 30 Jan 2007, Jeff Layton wrote:
>> Also, that patch would break many 32-bit programs not compiled with large
>> offsets when run in compatibility mode on a 64-bit kernel. If they were to
>> do a stat on this inode, it would likely generate an EOVERFLOW error since
>> the pointer address would probably not fit in a 32 bit field.
>>
>> That problem was the whole impetus for this set of patches.
>
> Well, we have that problem with the slowly incrementing "last_ino" too.
>
> Should we make "last_ino" be "static unsigned int" instead of "long"?
>
> Does anybody actually even use the old stat() with 32-bit interfaces? We
> warn for it, and we've done so for a long time.. I dont' remember people
> even complaining about the warning, so..
>
> Linus

I've actually sent Andrew a patch that does that and the same thing to
the counter in iunique as well. It's in -mm now, but I think it's pretty
safe and can probably go into your tree any time you're ready for it.

It's been quite a while since I looked at the original problem, but I
believe glibc actually uses stat64 to make the call, so it doesn't
throw the warning. The EOVERFLOW comes from glibc when it gets back an
st_ino value that won't fit in the 32 bit buffer provided by the program.

Obviously, we can't do anything for filesystems with permanent inodes larger
than 32 bits, but when generating them on the fly via new_inode or iunique,
we ought to try and have them fit in 32 bits if possible (at least as long
as we can).

-- Jeff

2007-01-31 09:08:29

by Kirill Korotaev

[permalink] [raw]
Subject: Re: [PATCH] pipefs unique inode numbers

Linus,

>>Also, that patch would break many 32-bit programs not compiled with large
>>offsets when run in compatibility mode on a 64-bit kernel. If they were to
>>do a stat on this inode, it would likely generate an EOVERFLOW error since
>>the pointer address would probably not fit in a 32 bit field.
>>
>>That problem was the whole impetus for this set of patches.
>
>
> Well, we have that problem with the slowly incrementing "last_ino" too.
>
> Should we make "last_ino" be "static unsigned int" instead of "long"?
yes. it is what the 1st patch (AFAIK commited to -mm) from Jeff.

> Does anybody actually even use the old stat() with 32-bit interfaces? We
> warn for it, and we've done so for a long time.. I dont' remember people
> even complaining about the warning, so..
Yes, we saw this happening multiple times already.
e.g. many OpenVZ users run 32bit VEs on 64bit host OS.
due to this bug some apps (e.g. postfix) complain about EOVERFLOW returned by stat()
and fail to work.
I also observed similar bug on Debian 3.0 which has 'find' compiled w/o FILE_OFFSETS=64,
but NFS partition used 64bit inode numbers.

Thanks,
Kirill