2006-11-06 18:12:19

by Jeff Layton

[permalink] [raw]
Subject: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

Some (generally not disk-based) filesystems use the new_inode() function
to allocate and populate inode structs. This function uses a static
unsigned long counter to populate i_ino. On most 64-bit platforms, this
is a 64-bit integer, but on ia32, this is a 32-bit int.

If stat() is called from a 32-bit program that was not compiled with
-D_FILE_OFFSET_BITS=64 on one of these inodes, then glibc will generate
an EOVERFLOW error in userspace (since the st_ino returned by the stat64
call won't fit in the old-style stat struct).

This creates a situation where these programs will run just fine on a
32-bit kernel, but will eventually start falling down with EOVERFLOW
errors on a 64-bit kernel. One way to reproduce this is to write a
program that repeatedly calls pipe() and then does an fstat() on one of
the pipe filehandles, and compile the program as a 32-bit app w/o
-D_FILE_OFFSET_BITS=64. Eventually, the fstat will consistently fail
with an EOVERFLOW.

The attached patch remedies this by making the last_inode counter be an
unsigned int on kernels that have ia32 compatability mode enabled.

My rationale is that we're eventually going to overflow this counter
regardless. This does make it happen sooner, but this is no worse that
what already happens on ia32.

Signed-off-by: Jeff Layton <[email protected]>

--- linux-2.6/fs/inode.c.lastino
+++ linux-2.6/fs/inode.c
@@ -524,7 +524,11 @@ repeat:
*/
struct inode *new_inode(struct super_block *sb)
{
+#if (defined CONFIG_IA32_EMULATION || defined CONFIG_IA32_SUPPORT)
+ static unsigned int last_ino;
+#else
static unsigned long last_ino;
+#endif
struct inode * inode;

spin_lock_prefetch(&inode_lock);



2006-11-06 18:22:24

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
> The attached patch remedies this by making the last_inode counter be an
> unsigned int on kernels that have ia32 compatability mode enabled.

... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
s390x, parisc64 or mips64?

> {
> +#if (defined CONFIG_IA32_EMULATION || defined CONFIG_IA32_SUPPORT)
> + static unsigned int last_ino;
> +#else
> static unsigned long last_ino;
> +#endif
> struct inode * inode;

I suspect what you really want there is CONFIG_COMPAT.

2006-11-06 18:31:17

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 2006-11-06 at 11:22 -0700, Matthew Wilcox wrote:
> On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
> > The attached patch remedies this by making the last_inode counter be an
> > unsigned int on kernels that have ia32 compatability mode enabled.
>
> ... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
> s390x, parisc64 or mips64?
>

Yeah, that was my big question. I'd only seen this on ia32 compatability
modes, but clearly its a problem where unsigned long is a different size
between a 64-bit kernel and its 32-bit compatability mode.

I'll have a look at CONFIG_COMPAT and likely respin.

Thanks,
Jeff


2006-11-06 18:47:26

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 2006-11-06 at 11:22 -0700, Matthew Wilcox wrote:
> On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
> > The attached patch remedies this by making the last_inode counter be an
> > unsigned int on kernels that have ia32 compatability mode enabled.
>
> ... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
> s390x, parisc64 or mips64?

Here's a new (untested) patch that replaces the ia32 specific
compatability mode defines with CONFIG_COMPAT, as suggested by Matthew.

Signed-off-by: Jeff Layton <[email protected]>

--- linux-2.6/fs/inode.c.lastino
+++ linux-2.6/fs/inode.c
@@ -524,7 +524,11 @@ repeat:
*/
struct inode *new_inode(struct super_block *sb)
{
+#ifdef CONFIG_COMPAT
+ static unsigned int last_ino;
+#else
static unsigned long last_ino;
+#endif
struct inode * inode;

spin_lock_prefetch(&inode_lock);


2006-11-06 20:23:29

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 6 November 2006 13:47:23 -0500, Jeff Layton wrote:
> On Mon, 2006-11-06 at 11:22 -0700, Matthew Wilcox wrote:
> > On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
> > > The attached patch remedies this by making the last_inode counter be an
> > > unsigned int on kernels that have ia32 compatability mode enabled.
> >
> > ... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
> > s390x, parisc64 or mips64?
>
> Here's a new (untested) patch that replaces the ia32 specific
> compatability mode defines with CONFIG_COMPAT, as suggested by Matthew.

While you're at it, how about making last_ino per-sb instead of
system-wide? ino collisions after a wrap are just as bad as inos
beyond 32bit. And this should be a fairly simple method to reduce the
risk.

Also, do you have a testcase that can actually force the wrap?

J?rn

--
Don't patch bad code, rewrite it.
-- Kernigham and Pike, according to Rusty

2006-11-06 20:40:17

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

J?rn Engel a ?crit :
> On Mon, 6 November 2006 13:47:23 -0500, Jeff Layton wrote:
>> On Mon, 2006-11-06 at 11:22 -0700, Matthew Wilcox wrote:
>>> On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
>>>> The attached patch remedies this by making the last_inode counter be an
>>>> unsigned int on kernels that have ia32 compatability mode enabled.
>>> ... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
>>> s390x, parisc64 or mips64?
>> Here's a new (untested) patch that replaces the ia32 specific
>> compatability mode defines with CONFIG_COMPAT, as suggested by Matthew.
>
> While you're at it, how about making last_ino per-sb instead of
> system-wide? ino collisions after a wrap are just as bad as inos
> beyond 32bit. And this should be a fairly simple method to reduce the
> risk.
>
> Also, do you have a testcase that can actually force the wrap?

while (1) {
int fd[2];
pipe(fd);
close(fd[0]);
close(fd[1]);
}


2006-11-06 20:51:03

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

J?rn Engel wrote:
> On Mon, 6 November 2006 13:47:23 -0500, Jeff Layton wrote:
>> On Mon, 2006-11-06 at 11:22 -0700, Matthew Wilcox wrote:
>>> On Mon, Nov 06, 2006 at 01:12:05PM -0500, Jeff Layton wrote:
>>>> The attached patch remedies this by making the last_inode counter be an
>>>> unsigned int on kernels that have ia32 compatability mode enabled.
>>> ... and this only happens on ia64/x86_64 kernels, not sparc64, ppc64,
>>> s390x, parisc64 or mips64?
>> Here's a new (untested) patch that replaces the ia32 specific
>> compatability mode defines with CONFIG_COMPAT, as suggested by Matthew.
>
> While you're at it, how about making last_ino per-sb instead of
> system-wide? ino collisions after a wrap are just as bad as inos
> beyond 32bit. And this should be a fairly simple method to reduce the
> risk.

Using a global counter for multiple filesystems should actually -reduce-
the chance of a collision on the same filesystem, since after you wrap the
recycled number may go to a different filesystem.

Simply making it a per-sb counter makes it worse, because wrapped inodes
will always go to the same filesystem.

To fix this properly, we'd need some sort of checking that the inode number
isn't currently being used on the filesystem in question before it's
assigned to the new inode.

-Eric

2006-11-06 20:56:22

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 2006-11-06 at 21:31 +0100, Eric Dumazet wrote:
> >
> > Also, do you have a testcase that can actually force the wrap?
>
> while (1) {
> int fd[2];
> pipe(fd);
> close(fd[0]);
> close(fd[1]);
> }
>

Yep, add an fstat(fd[0]) before the closes and you essentially have the
reproducer I was given for this.
-- Jeff


2006-11-06 21:12:11

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 6 November 2006 14:50:58 -0600, Eric Sandeen wrote:
> >
> > While you're at it, how about making last_ino per-sb instead of
> > system-wide? ino collisions after a wrap are just as bad as inos
> > beyond 32bit. And this should be a fairly simple method to reduce the
> > risk.
>
> Using a global counter for multiple filesystems should actually -reduce-
> the chance of a collision on the same filesystem, since after you wrap the
> recycled number may go to a different filesystem.

You're missing something. The chance for a collision _per wrap_ is
reduced, as you said. But the number of wraps goes up. Overall and
for large numbers, the two effects compensate each other.

For not-so-large numbers, you can get by without the wrap by having
this per-sb. And if you have just one or two wrapping filesystems, at
least the others are protected. It's not much, but it is a simple
thing to do.

> To fix this properly, we'd need some sort of checking that the inode number
> isn't currently being used on the filesystem in question before it's
> assigned to the new inode.

Absolutely. Thinking about it, iget() already has a lot of what is
needed - except that it can block and has side effects we don't really
want. Sounds more complicated, but I would love to be proven wrong
here. :)

J?rn

--
Joern's library part 7:
http://www.usenix.org/publications/library/proceedings/neworl/full_papers/mckusick.a

2006-11-06 21:36:59

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

J?rn Engel wrote:
> On Mon, 6 November 2006 14:50:58 -0600, Eric Sandeen wrote:
>>> While you're at it, how about making last_ino per-sb instead of
>>> system-wide? ino collisions after a wrap are just as bad as inos
>>> beyond 32bit. And this should be a fairly simple method to reduce the
>>> risk.
>> Using a global counter for multiple filesystems should actually -reduce-
>> the chance of a collision on the same filesystem, since after you wrap the
>> recycled number may go to a different filesystem.
>
> You're missing something. The chance for a collision _per wrap_ is
> reduced, as you said. But the number of wraps goes up. Overall and
> for large numbers, the two effects compensate each other.

Well, one concern is an intentional exploit. In which case "longer
time" and "shorter time" don't matter -so- much. If it can happen at
all, it's bad, period.

OTOH if one filesystem (say, pipes) can wrap the numbers very quickly,
while other spaces are otherwise more immune, then having it global puts
everything using it at a bit more risk.

*shrug* I dunno, it's probably not worth arguing this point, it needs to
be fixed properly in any case. :)

-Eric

2006-11-06 22:01:48

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Nov 06, 2006 15:36 -0600, Eric Sandeen wrote:
> OTOH if one filesystem (say, pipes) can wrap the numbers very quickly,
> while other spaces are otherwise more immune, then having it global puts
> everything using it at a bit more risk.

One option is having a per-sb counter (to avoid wraps on not-heavily-used
filesystems), and also a per-sb flag that indicates if the counter has
wrapped. If that happens, it would be possible to do a lookup in the
inode hash for a conflicting inode number, and skip those. It is more
overhead, but only hit in the case where there is danger (i.e. post wrap).

There should also be a flag indicating if the caller is actually using
the inum supplied by new_inode or not, to avoid the overhead if it just
replaces i_ino with its own value.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2006-11-07 15:56:28

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Mon, 2006-11-06 at 15:36 -0600, Eric Sandeen wrote:
> *shrug* I dunno, it's probably not worth arguing this point, it needs to
> be fixed properly in any case. :)
>
> -Eric

Here's a more involved (again, untested) patch. I turned new_inode into
a wrapper for a new function (new_inode_autonum). If the autonum arg is
set, then we should get back a unique i_ino.

This also converts the iunique function to use the per sb s_lastino
counter. I think that's the right thing to do here, but I'm new to the
hlist stuff and so feedback would be appreciated there.

Current behavior for new_inode shouldn't be changed (aside from the
per-sb s_lastino counter). The idea here would be to move filesystems
that don't later overwrite the i_ino field, to use new_inode_autonum
rather than new_inode and set autonum to true.

Not sure how bad performance with new inode creation will be when that
is set to true, but this is the best way I could see to do this without
adding something like an radix tree to track used inodes.

Signed-off-by: Jeff Layton <[email protected]>

diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..83b9bab 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -517,14 +517,14 @@ repeat:
}

/**
- * new_inode - obtain an inode
+ * new_inode_autonum - obtain an inode
* @sb: superblock
+ * @autonum: if true, make sure that i_ino is unique
*
* Allocates a new inode for given superblock.
*/
-struct inode *new_inode(struct super_block *sb)
+struct inode *new_inode_autonum(struct super_block *sb, int autonum)
{
- static unsigned long last_ino;
struct inode * inode;

spin_lock_prefetch(&inode_lock);
@@ -535,13 +535,26 @@ struct inode *new_inode(struct super_blo
inodes_stat.nr_inodes++;
list_add(&inode->i_list, &inode_in_use);
list_add(&inode->i_sb_list, &sb->s_inodes);
- inode->i_ino = ++last_ino;
+ if (autonum)
+ inode->i_ino = iunique(sb, 0);
+ else
+ inode->i_ino = ++sb->s_lastino;
inode->i_state = 0;
spin_unlock(&inode_lock);
}
return inode;
}

+/**
+ * new_inode - obtain an inode
+ * @sb: superblock
+ *
+ * Allocates a new inode for given superblock.
+ */
+struct inode *new_inode(struct super_block *sb)
+{
+ new_inode_autonum(sb, 0);
+}
EXPORT_SYMBOL(new_inode);

void unlock_new_inode(struct inode *inode)
@@ -683,22 +696,21 @@ static unsigned long hash(struct super_b
*/
ino_t iunique(struct super_block *sb, ino_t max_reserved)
{
- static ino_t counter;
struct inode *inode;
struct hlist_head * head;
ino_t res;
spin_lock(&inode_lock);
retry:
- if (counter > max_reserved) {
- head = inode_hashtable + hash(sb,counter);
- res = counter++;
+ if (sb->s_lastino >= max_reserved) {
+ head = inode_hashtable + hash(sb,++sb->s_lastino);
+ res = sb->s_lastino;
inode = find_inode_fast(sb, head, res);
if (!inode) {
spin_unlock(&inode_lock);
return res;
}
} else {
- counter = max_reserved + 1;
+ sb->s_lastino = max_reserved;
}
goto retry;

diff --git a/fs/super.c b/fs/super.c
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..1010e64 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -961,6 +961,14 @@ #endif
/* Granularity of c/m/atime in ns.
Cannot be worse than a second */
u32 s_time_gran;
+
+ /* per-sb inode counter for new_inode. Make it a 32-bit counter when
+ we have the possibility of dealing with 32-bit apps */
+#ifdef CONFIG_COMPAT
+ unsigned int s_lastino;
+#else
+ unsigned long s_lastino;
+#endif
};

extern struct timespec current_fs_time(struct super_block *sb);


2006-11-07 16:07:21

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 10:56 -0500, Jeff Layton wrote:
>
> Here's a more involved (again, untested) patch. I turned new_inode into
> a wrapper for a new function (new_inode_autonum). If the autonum arg is
> set, then we should get back a unique i_ino.
>
> This also converts the iunique function to use the per sb s_lastino
> counter. I think that's the right thing to do here, but I'm new to the
> hlist stuff and so feedback would be appreciated there.
>
> Current behavior for new_inode shouldn't be changed (aside from the
> per-sb s_lastino counter). The idea here would be to move filesystems
> that don't later overwrite the i_ino field, to use new_inode_autonum
> rather than new_inode and set autonum to true.
>
> Not sure how bad performance with new inode creation will be when that
> is set to true, but this is the best way I could see to do this without
> adding something like an radix tree to track used inodes.

Doh! Here is a revised patch that fixes a couple of silly errors that
would make this not compile. Also export new_inode_autonum.

Signed-off-by: Jeff Layton <[email protected]>

diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..d96d23e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -517,14 +517,14 @@ repeat:
}

/**
- * new_inode - obtain an inode
+ * new_inode_autonum - obtain an inode
* @sb: superblock
+ * @autonum: if true, make sure that i_ino is unique
*
* Allocates a new inode for given superblock.
*/
-struct inode *new_inode(struct super_block *sb)
+struct inode *new_inode_autonum(struct super_block *sb, int autonum)
{
- static unsigned long last_ino;
struct inode * inode;

spin_lock_prefetch(&inode_lock);
@@ -535,13 +535,29 @@ struct inode *new_inode(struct super_blo
inodes_stat.nr_inodes++;
list_add(&inode->i_list, &inode_in_use);
list_add(&inode->i_sb_list, &sb->s_inodes);
- inode->i_ino = ++last_ino;
+ if (autonum)
+ inode->i_ino = iunique(sb, 0);
+ else
+ inode->i_ino = ++sb->s_lastino;
inode->i_state = 0;
spin_unlock(&inode_lock);
}
return inode;
}

+EXPORT_SYMBOL(new_inode_autonum);
+
+/**
+ * new_inode - obtain an inode
+ * @sb: superblock
+ *
+ * Allocates a new inode for given superblock.
+ */
+struct inode *new_inode(struct super_block *sb)
+{
+ return new_inode_autonum(sb, 0);
+}
+
EXPORT_SYMBOL(new_inode);

void unlock_new_inode(struct inode *inode)
@@ -683,22 +699,21 @@ static unsigned long hash(struct super_b
*/
ino_t iunique(struct super_block *sb, ino_t max_reserved)
{
- static ino_t counter;
struct inode *inode;
struct hlist_head * head;
ino_t res;
spin_lock(&inode_lock);
retry:
- if (counter > max_reserved) {
- head = inode_hashtable + hash(sb,counter);
- res = counter++;
+ if (sb->s_lastino >= max_reserved) {
+ head = inode_hashtable + hash(sb,++sb->s_lastino);
+ res = sb->s_lastino;
inode = find_inode_fast(sb, head, res);
if (!inode) {
spin_unlock(&inode_lock);
return res;
}
} else {
- counter = max_reserved + 1;
+ sb->s_lastino = max_reserved;
}
goto retry;

diff --git a/fs/super.c b/fs/super.c
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..1010e64 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -961,6 +961,14 @@ #endif
/* Granularity of c/m/atime in ns.
Cannot be worse than a second */
u32 s_time_gran;
+
+ /* per-sb inode counter for new_inode. Make it a 32-bit counter when
+ we have the possibility of dealing with 32-bit apps */
+#ifdef CONFIG_COMPAT
+ unsigned int s_lastino;
+#else
+ unsigned long s_lastino;
+#endif
};

extern struct timespec current_fs_time(struct super_block *sb);


2006-11-07 16:10:08

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, Nov 07, 2006 at 10:56:06AM -0500, Jeff Layton wrote:
> retry:
> - if (counter > max_reserved) {
> - head = inode_hashtable + hash(sb,counter);
> - res = counter++;
> + if (sb->s_lastino >= max_reserved) {
> + head = inode_hashtable + hash(sb,++sb->s_lastino);
> + res = sb->s_lastino;

I think it'd be clearer to write this as:

res = ++sb->s_lastino;
head = inode_hashtable + hash(sb, res);

My eye skipped over the preincrement entirely the way it's currently
written.

> inode = find_inode_fast(sb, head, res);
> if (!inode) {
> spin_unlock(&inode_lock);
> return res;
> }
> } else {
> - counter = max_reserved + 1;
> + sb->s_lastino = max_reserved;
> }
> goto retry;
>

2006-11-07 17:04:56

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 10:56:06 -0500, Jeff Layton wrote:
>
> Here's a more involved (again, untested) patch. I turned new_inode into
> a wrapper for a new function (new_inode_autonum). If the autonum arg is
> set, then we should get back a unique i_ino.

A good start. You didn't convert any callers yet, so I did a quick
poll. Turned out that out of 76 callers 37 explicitly set i_ino, 1
(fat) calls iunique itself and 38 look at first glance as if they
didn't set i_ino. xfs and jfs were in the latter category, which just
proved I didn't look too closely.

Anyway, the distribution is fairly even, so I agree with your choice
of default behaviour.

Jörn

--
Schr?dinger's cat is <BLINK>not</BLINK> dead.
-- Illiad

2006-11-07 17:28:42

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 10:56:06 -0500, Jeff Layton wrote:
> -struct inode *new_inode(struct super_block *sb)
> +struct inode *new_inode_autonum(struct super_block *sb, int autonum)

Looking at the callers, it seemed a bit more natural to me to call
new_inode_autonum(sb) than new_inode_autonum(sb, 1). Would you mind
turning new_inode_autonum() in a wrapper like new_inode() and putting
the actual code into a static helper?

Anyway, here is a first patch converting some callers that looked
obvious.

Jörn

--
Man darf nicht das, was uns unwahrscheinlich und unnat?rlich erscheint,
mit dem verwechseln, was absolut unm?glich ist.
-- Carl Friedrich Gau?


Signed-off-by: J?rn Engel <[email protected]>
---

arch/ia64/kernel/perfmon.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 2 +-
arch/s390/hypfs/inode.c | 2 +-
drivers/infiniband/hw/ipath/ipath_fs.c | 2 +-
drivers/misc/ibmasm/ibmasmfs.c | 2 +-
drivers/oprofile/oprofilefs.c | 2 +-
drivers/usb/core/inode.c | 2 +-
drivers/usb/gadget/inode.c | 2 +-
fs/autofs4/inode.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/configfs/inode.c | 2 +-
fs/debugfs/inode.c | 2 +-
fs/eventpoll.c | 2 +-
fs/freevxfs/vxfs_inode.c | 2 +-
fs/fuse/control.c | 2 +-
fs/hugetlbfs/inode.c | 2 +-
fs/ocfs2/dlm/dlmfs.c | 4 ++--
fs/pipe.c | 2 +-
fs/ramfs/inode.c | 2 +-
fs/sysfs/inode.c | 2 +-
ipc/mqueue.c | 2 +-
kernel/cpuset.c | 2 +-
mm/shmem.c | 2 +-
net/socket.c | 2 +-
security/inode.c | 2 +-
security/selinux/selinuxfs.c | 2 +-
26 files changed, 27 insertions(+), 27 deletions(-)

--- iunique/arch/ia64/kernel/perfmon.c~iunique 2006-10-13 15:50:36.000000000 +0200
+++ iunique/arch/ia64/kernel/perfmon.c 2006-11-07 18:06:51.000000000 +0100
@@ -2167,7 +2167,7 @@ pfm_alloc_fd(struct file **cfile)
/*
* allocate a new inode
*/
- inode = new_inode(pfmfs_mnt->mnt_sb);
+ inode = new_inode_autonum(pfmfs_mnt->mnt_sb);
if (!inode) goto out;

DPRINT(("new inode ino=%ld @%p\n", inode->i_ino, inode));
--- iunique/arch/powerpc/platforms/cell/spufs/inode.c~iunique 2006-10-13 15:51:29.000000000 +0200
+++ iunique/arch/powerpc/platforms/cell/spufs/inode.c 2006-11-07 18:07:18.000000000 +0100
@@ -75,7 +75,7 @@ spufs_new_inode(struct super_block *sb,
{
struct inode *inode;

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (!inode)
goto out;

--- iunique/arch/s390/hypfs/inode.c~iunique 2006-10-13 15:51:52.000000000 +0200
+++ iunique/arch/s390/hypfs/inode.c 2006-11-07 18:07:37.000000000 +0100
@@ -84,7 +84,7 @@ static void hypfs_delete_tree(struct den

static struct inode *hypfs_make_inode(struct super_block *sb, int mode)
{
- struct inode *ret = new_inode(sb);
+ struct inode *ret = new_inode_autonum(sb);

if (ret) {
struct hypfs_sb_info *hypfs_info = sb->s_fs_info;
--- iunique/drivers/infiniband/hw/ipath/ipath_fs.c~iunique 2006-10-13 15:53:19.000000000 +0200
+++ iunique/drivers/infiniband/hw/ipath/ipath_fs.c 2006-11-07 18:08:06.000000000 +0100
@@ -51,7 +51,7 @@ static int ipathfs_mknod(struct inode *d
void *data)
{
int error;
- struct inode *inode = new_inode(dir->i_sb);
+ struct inode *inode = new_inode_autonum(dir->i_sb);

if (!inode) {
error = -EPERM;
--- iunique/drivers/misc/ibmasm/ibmasmfs.c~iunique 2006-10-13 15:53:59.000000000 +0200
+++ iunique/drivers/misc/ibmasm/ibmasmfs.c 2006-11-07 18:08:26.000000000 +0100
@@ -142,7 +142,7 @@ static int ibmasmfs_fill_super (struct s

static struct inode *ibmasmfs_make_inode(struct super_block *sb, int mode)
{
- struct inode *ret = new_inode(sb);
+ struct inode *ret = new_inode_autonum(sb);

if (ret) {
ret->i_mode = mode;
--- iunique/drivers/oprofile/oprofilefs.c~iunique 2006-10-13 15:54:40.000000000 +0200
+++ iunique/drivers/oprofile/oprofilefs.c 2006-11-07 18:08:45.000000000 +0100
@@ -25,7 +25,7 @@ DEFINE_SPINLOCK(oprofilefs_lock);

static struct inode * oprofilefs_get_inode(struct super_block * sb, int mode)
{
- struct inode * inode = new_inode(sb);
+ struct inode * inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/drivers/usb/core/inode.c~iunique 2006-10-13 15:55:22.000000000 +0200
+++ iunique/drivers/usb/core/inode.c 2006-11-07 18:09:03.000000000 +0100
@@ -243,7 +243,7 @@ static int remount(struct super_block *s

static struct inode *usbfs_get_inode (struct super_block *sb, int mode, dev_t dev)
{
- struct inode *inode = new_inode(sb);
+ struct inode *inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/drivers/usb/gadget/inode.c~iunique 2006-10-13 15:55:23.000000000 +0200
+++ iunique/drivers/usb/gadget/inode.c 2006-11-07 18:10:41.000000000 +0100
@@ -1960,7 +1960,7 @@ gadgetfs_make_inode (struct super_block
void *data, const struct file_operations *fops,
int mode)
{
- struct inode *inode = new_inode (sb);
+ struct inode *inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/fs/autofs4/inode.c~iunique 2006-10-13 15:55:48.000000000 +0200
+++ iunique/fs/autofs4/inode.c 2006-11-07 18:11:28.000000000 +0100
@@ -432,7 +432,7 @@ fail_unlock:
struct inode *autofs4_get_inode(struct super_block *sb,
struct autofs_info *inf)
{
- struct inode *inode = new_inode(sb);
+ struct inode *inode = new_inode_autonum(sb);

if (inode == NULL)
return NULL;
--- iunique/fs/binfmt_misc.c~iunique 2006-10-13 15:55:49.000000000 +0200
+++ iunique/fs/binfmt_misc.c 2006-11-07 18:11:50.000000000 +0100
@@ -501,7 +501,7 @@ static void entry_status(Node *e, char *

static struct inode *bm_get_inode(struct super_block *sb, int mode)
{
- struct inode * inode = new_inode(sb);
+ struct inode * inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/fs/configfs/inode.c~iunique 2006-10-13 15:55:51.000000000 +0200
+++ iunique/fs/configfs/inode.c 2006-11-07 18:13:09.000000000 +0100
@@ -134,7 +134,7 @@ static inline void set_inode_attr(struct

struct inode * configfs_new_inode(mode_t mode, struct configfs_dirent * sd)
{
- struct inode * inode = new_inode(configfs_sb);
+ struct inode * inode = new_inode_autonum(configfs_sb);
if (inode) {
inode->i_blksize = PAGE_CACHE_SIZE;
inode->i_blocks = 0;
--- iunique/fs/debugfs/inode.c~iunique 2006-10-13 15:55:51.000000000 +0200
+++ iunique/fs/debugfs/inode.c 2006-11-07 18:13:27.000000000 +0100
@@ -34,7 +34,7 @@ static int debugfs_mount_count;

static struct inode *debugfs_get_inode(struct super_block *sb, int mode, dev_t dev)
{
- struct inode *inode = new_inode(sb);
+ struct inode *inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/fs/eventpoll.c~iunique 2006-10-13 15:55:53.000000000 +0200
+++ iunique/fs/eventpoll.c 2006-11-07 18:14:11.000000000 +0100
@@ -1572,7 +1572,7 @@ static int eventpollfs_delete_dentry(str
static struct inode *ep_eventpoll_inode(void)
{
int error = -ENOMEM;
- struct inode *inode = new_inode(eventpoll_mnt->mnt_sb);
+ struct inode *inode = new_inode_autonum(eventpoll_mnt->mnt_sb);

if (!inode)
goto eexit_1;
--- iunique/fs/freevxfs/vxfs_inode.c~iunique 2006-10-13 15:55:58.000000000 +0200
+++ iunique/fs/freevxfs/vxfs_inode.c 2006-11-07 18:15:48.000000000 +0100
@@ -262,7 +262,7 @@ vxfs_get_fake_inode(struct super_block *
{
struct inode *ip = NULL;

- if ((ip = new_inode(sbp))) {
+ if ((ip = new_inode_autonum(sbp))) {
vxfs_iinit(ip, vip);
ip->i_mapping->a_ops = &vxfs_aops;
}
--- iunique/fs/fuse/control.c~iunique 2006-10-13 15:55:59.000000000 +0200
+++ iunique/fs/fuse/control.c 2006-11-07 18:15:58.000000000 +0100
@@ -85,7 +85,7 @@ static struct dentry *fuse_ctl_add_dentr
return NULL;

fc->ctl_dentry[fc->ctl_ndents++] = dentry;
- inode = new_inode(fuse_control_sb);
+ inode = new_inode_autonum(fuse_control_sb);
if (!inode)
return NULL;

--- iunique/fs/hugetlbfs/inode.c~iunique 2006-10-13 15:56:01.000000000 +0200
+++ iunique/fs/hugetlbfs/inode.c 2006-11-07 18:16:30.000000000 +0100
@@ -351,7 +351,7 @@ static struct inode *hugetlbfs_get_inode
{
struct inode *inode;

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (inode) {
struct hugetlbfs_inode_info *info;
inode->i_mode = mode;
--- iunique/fs/ocfs2/dlm/dlmfs.c~iunique 2006-10-13 15:56:17.000000000 +0200
+++ iunique/fs/ocfs2/dlm/dlmfs.c 2006-11-07 18:19:11.000000000 +0100
@@ -325,7 +325,7 @@ static struct backing_dev_info dlmfs_bac

static struct inode *dlmfs_get_root_inode(struct super_block *sb)
{
- struct inode *inode = new_inode(sb);
+ struct inode *inode = new_inode_autonum(sb);
int mode = S_IFDIR | 0755;
struct dlmfs_inode_private *ip;

@@ -353,7 +353,7 @@ static struct inode *dlmfs_get_inode(str
int mode)
{
struct super_block *sb = parent->i_sb;
- struct inode * inode = new_inode(sb);
+ struct inode * inode = new_inode_autonum(sb);
struct dlmfs_inode_private *ip;

if (!inode)
--- iunique/fs/pipe.c~iunique 2006-10-13 15:56:19.000000000 +0200
+++ iunique/fs/pipe.c 2006-11-07 18:19:21.000000000 +0100
@@ -854,7 +854,7 @@ static struct dentry_operations pipefs_d

static struct inode * get_pipe_inode(void)
{
- struct inode *inode = new_inode(pipe_mnt->mnt_sb);
+ struct inode *inode = new_inode_autonum(pipe_mnt->mnt_sb);
struct pipe_inode_info *pipe;

if (!inode)
--- iunique/fs/ramfs/inode.c~iunique 2006-10-13 15:56:23.000000000 +0200
+++ iunique/fs/ramfs/inode.c 2006-11-07 18:19:36.000000000 +0100
@@ -52,7 +52,7 @@ static struct backing_dev_info ramfs_bac

struct inode *ramfs_get_inode(struct super_block *sb, int mode, dev_t dev)
{
- struct inode * inode = new_inode(sb);
+ struct inode * inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/fs/sysfs/inode.c~iunique 2006-10-13 15:56:26.000000000 +0200
+++ iunique/fs/sysfs/inode.c 2006-11-07 18:19:51.000000000 +0100
@@ -122,7 +122,7 @@ static struct lock_class_key sysfs_inode

struct inode * sysfs_new_inode(mode_t mode, struct sysfs_dirent * sd)
{
- struct inode * inode = new_inode(sysfs_sb);
+ struct inode * inode = new_inode_autonum(sysfs_sb);
if (inode) {
inode->i_blksize = PAGE_CACHE_SIZE;
inode->i_blocks = 0;
--- iunique/ipc/mqueue.c~iunique 2006-10-13 15:59:58.000000000 +0200
+++ iunique/ipc/mqueue.c 2006-11-07 18:20:13.000000000 +0100
@@ -110,7 +110,7 @@ static struct inode *mqueue_get_inode(st
{
struct inode *inode;

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (inode) {
inode->i_mode = mode;
inode->i_uid = current->fsuid;
--- iunique/kernel/cpuset.c~iunique 2006-10-13 15:59:59.000000000 +0200
+++ iunique/kernel/cpuset.c 2006-11-07 18:20:22.000000000 +0100
@@ -283,7 +283,7 @@ static struct backing_dev_info cpuset_ba

static struct inode *cpuset_new_inode(mode_t mode)
{
- struct inode *inode = new_inode(cpuset_sb);
+ struct inode *inode = new_inode_autonum(cpuset_sb);

if (inode) {
inode->i_mode = mode;
--- iunique/mm/shmem.c~iunique 2006-10-13 16:00:14.000000000 +0200
+++ iunique/mm/shmem.c 2006-11-07 18:20:32.000000000 +0100
@@ -1345,7 +1345,7 @@ shmem_get_inode(struct super_block *sb,
spin_unlock(&sbinfo->stat_lock);
}

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (inode) {
inode->i_mode = mode;
inode->i_uid = current->fsuid;
--- iunique/net/socket.c~iunique 2006-10-13 16:00:48.000000000 +0200
+++ iunique/net/socket.c 2006-11-07 18:20:45.000000000 +0100
@@ -516,7 +516,7 @@ static struct socket *sock_alloc(void)
struct inode * inode;
struct socket * sock;

- inode = new_inode(sock_mnt->mnt_sb);
+ inode = new_inode_autonum(sock_mnt->mnt_sb);
if (!inode)
return NULL;

--- iunique/security/inode.c~iunique 2006-10-13 16:01:00.000000000 +0200
+++ iunique/security/inode.c 2006-11-07 18:23:06.000000000 +0100
@@ -58,7 +58,7 @@ static struct file_operations default_fi

static struct inode *get_inode(struct super_block *sb, int mode, dev_t dev)
{
- struct inode *inode = new_inode(sb);
+ struct inode *inode = new_inode_autonum(sb);

if (inode) {
inode->i_mode = mode;
--- iunique/security/selinux/selinuxfs.c~iunique 2006-10-13 16:01:01.000000000 +0200
+++ iunique/security/selinux/selinuxfs.c 2006-11-07 18:23:27.000000000 +0100
@@ -766,7 +766,7 @@ out:

static struct inode *sel_make_inode(struct super_block *sb, int mode)
{
- struct inode *ret = new_inode(sb);
+ struct inode *ret = new_inode_autonum(sb);

if (ret) {
ret->i_mode = mode;

2006-11-07 17:42:53

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 18:28:35 +0100, J?rn Engel wrote:
>
> Anyway, here is a first patch converting some callers that looked
> obvious.

Next patch with the not-so-obvious ones. I believe this patch is
correct, but someone should double-check it.

Jfs really surprised me. It appears as if it just takes the number
returned from new_inode in some cases - unbelievable.

J?rn

--
"[One] doesn't need to know [...] how to cause a headache in order
to take an aspirin."
-- Scott Culp, Manager of the Microsoft Security Response Center, 2001


Signed-off-by: J?rn Engel <[email protected]>
---

fs/9p/vfs_inode.c | 2 +-
fs/cifs/inode.c | 8 +++++---
fs/jfs/jfs_inode.c | 2 +-
3 files changed, 7 insertions(+), 5 deletions(-)

--- iunique/fs/9p/vfs_inode.c~iunique_nonobvious 2006-10-13 15:55:45.000000000 +0200
+++ iunique/fs/9p/vfs_inode.c 2006-11-07 18:30:59.000000000 +0100
@@ -199,7 +199,7 @@ struct inode *v9fs_get_inode(struct supe

dprintk(DEBUG_VFS, "super block: %p mode: %o\n", sb, mode);

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (inode) {
inode->i_mode = mode;
inode->i_uid = current->fsuid;
--- iunique/fs/cifs/inode.c~iunique_nonobvious 2006-10-13 15:55:50.000000000 +0200
+++ iunique/fs/cifs/inode.c 2006-11-07 18:33:53.000000000 +0100
@@ -90,7 +90,9 @@ int cifs_get_inode_info_unix(struct inod
if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM) {
(*pinode)->i_ino =
(unsigned long)findData.UniqueId;
- } /* note ino incremented to unique num in new_inode */
+ } else {
+ (*pinode)->i_ino = iunique(sb, 0);
+ }
insert_inode_hash(*pinode);
}

@@ -384,7 +386,7 @@ int cifs_get_inode_info(struct inode **p

/* get new inode */
if (*pinode == NULL) {
- *pinode = new_inode(sb);
+ *pinode = new_inode_autonum(sb);
if (*pinode == NULL)
return -ENOMEM;
/* Is an i_ino of zero legal? Can we use that to check
@@ -416,7 +418,7 @@ int cifs_get_inode_info(struct inode **p
/* BB EOPNOSUPP disable SERVER_INUM? */
} else /* do we need cast or hash to ino? */
(*pinode)->i_ino = inode_num;
- } /* else ino incremented to unique num in new_inode*/
+ } /* else ino incremented to unique num in new_inode_autonum*/
insert_inode_hash(*pinode);
}
inode = *pinode;
--- iunique/fs/jfs/jfs_inode.c~iunique_nonobvious 2006-10-13 15:56:05.000000000 +0200
+++ iunique/fs/jfs/jfs_inode.c 2006-11-07 18:36:55.000000000 +0100
@@ -58,7 +58,7 @@ struct inode *ialloc(struct inode *paren
struct jfs_inode_info *jfs_inode;
int rc;

- inode = new_inode(sb);
+ inode = new_inode_autonum(sb);
if (!inode) {
jfs_warn("ialloc: new_inode returned NULL!");
return inode;

2006-11-07 17:53:10

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 18:42 +0100, J?rn Engel wrote:
> On Tue, 7 November 2006 18:28:35 +0100, J?rn Engel wrote:
> >
> > Anyway, here is a first patch converting some callers that looked
> > obvious.
>
> Next patch with the not-so-obvious ones. I believe this patch is
> correct, but someone should double-check it.
>
> Jfs really surprised me. It appears as if it just takes the number
> returned from new_inode in some cases - unbelievable.

jfs set it in diInitInode() (pardon the uglyMixedCaps), which is called
in several places under diAlloc(). diAlloc() is called after
new_inode() for most inodes. The exceptions are for special inodes,
which also initialize i_ino in some manner.

Shaggy
--
David Kleikamp
IBM Linux Technology Center

2006-11-07 17:56:36

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

Things are getting more interesting. simple_fill_super() looked like
a bug waiting to happen. It is fairly hard to trigger, but still.
This should fix it, although in a fairly crude manner.

The other callers were save - it is hard to have the root inode
collide with anything existing.

J?rn

--
Good warriors cause others to come to them and do not go to others.
-- Sun Tzu


Signed-off-by: J?rn Engel <[email protected]>
---

fs/libfs.c | 2 ++
1 file changed, 2 insertions(+)

--- iunique/fs/libfs.c~iunique_libfs 2006-10-13 15:56:01.000000000 +0200
+++ iunique/fs/libfs.c 2006-11-07 18:54:21.000000000 +0100
@@ -381,6 +381,8 @@ int simple_fill_super(struct super_block
inode = new_inode(s);
if (!inode)
return -ENOMEM;
+ /* ino must not collide with any ino assigned in the loop below */
+ inode->i_ino = 0x8000000;
inode->i_mode = S_IFDIR | 0755;
inode->i_uid = inode->i_gid = 0;
inode->i_blksize = PAGE_CACHE_SIZE;

2006-11-07 18:02:11

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

And the last one is xfs_mapping_buftarg(). I am completely at a loss.
As far as I can tell, inode allocated, partially initiated and -
leaked. Am I missing something?

STATIC int
xfs_mapping_buftarg(
xfs_buftarg_t *btp,
struct block_device *bdev)
{
struct backing_dev_info *bdi;
struct inode *inode;
struct address_space *mapping;
static const struct address_space_operations mapping_aops = {
.sync_page = block_sync_page,
.migratepage = fail_migrate_page,
};

inode = new_inode(bdev->bd_inode->i_sb);
if (!inode) {
printk(KERN_WARNING
"XFS: Cannot allocate mapping inode for device %s\n",
XFS_BUFTARG_NAME(btp));
return ENOMEM;
}
inode->i_mode = S_IFBLK;
inode->i_bdev = bdev;
inode->i_rdev = bdev->bd_dev;
bdi = blk_get_backing_dev_info(bdev);
if (!bdi)
bdi = &default_backing_dev_info;
mapping = &inode->i_data;
mapping->a_ops = &mapping_aops;
mapping->backing_dev_info = bdi;
mapping_set_gfp_mask(mapping, GFP_NOFS);
btp->bt_mapping = mapping;
return 0;
}


J?rn

--
Those who come seeking peace without a treaty are plotting.
-- Sun Tzu

2006-11-07 18:01:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 18:28 +0100, Jörn Engel wrote:

> Looking at the callers, it seemed a bit more natural to me to call
> new_inode_autonum(sb) than new_inode_autonum(sb, 1). Would you mind
> turning new_inode_autonum() in a wrapper like new_inode() and putting
> the actual code into a static helper?
>
> Anyway, here is a first patch converting some callers that looked
> obvious.
>
> Jörn

Thanks for the feedback. Yeah, I held back on converting any filesystems
until I had some comments. Thanks for doing the legwork on that part.
Here's a respun patch with the suggested modification to
new_inode_autonum. This also adds it to fs.h.

Signed-off-by: Jeff Layton <[email protected]>

diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..dd96799 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -517,14 +517,14 @@ repeat:
}

/**
- * new_inode - obtain an inode
+ * __new_inode - obtain an inode
* @sb: superblock
+ * @autonum: if true, make sure that i_ino is unique
*
* Allocates a new inode for given superblock.
*/
-struct inode *new_inode(struct super_block *sb)
+static struct inode *__new_inode(struct super_block *sb, int autonum)
{
- static unsigned long last_ino;
struct inode * inode;

spin_lock_prefetch(&inode_lock);
@@ -535,13 +535,42 @@ struct inode *new_inode(struct super_blo
inodes_stat.nr_inodes++;
list_add(&inode->i_list, &inode_in_use);
list_add(&inode->i_sb_list, &sb->s_inodes);
- inode->i_ino = ++last_ino;
+ if (autonum)
+ inode->i_ino = iunique(sb, 0);
+ else
+ inode->i_ino = ++sb->s_lastino;
inode->i_state = 0;
spin_unlock(&inode_lock);
}
return inode;
}

+/**
+ * new_inode_autonum - obtain an inode with a unique i_ino value
+ * @sb: superblock
+ *
+ * Allocates a new inode for given superblock. Ensures that i_ino is
+ * unique on the filesystem.
+ */
+struct inode *new_inode_autonum(struct super_block *sb)
+{
+ return __new_inode(sb, 1);
+}
+
+EXPORT_SYMBOL(new_inode_autonum);
+
+/**
+ * new_inode - obtain an inode -- i_ino not guaranteed unique
+ * @sb: superblock
+ *
+ * Allocates a new inode for given superblock. i_ino is not guaranteed to
+ * be unique. Should only be used when i_ino is going to be clobbered.
+ */
+struct inode *new_inode(struct super_block *sb)
+{
+ return __new_inode(sb, 0);
+}
+
EXPORT_SYMBOL(new_inode);

void unlock_new_inode(struct inode *inode)
@@ -683,22 +712,21 @@ static unsigned long hash(struct super_b
*/
ino_t iunique(struct super_block *sb, ino_t max_reserved)
{
- static ino_t counter;
struct inode *inode;
struct hlist_head * head;
ino_t res;
spin_lock(&inode_lock);
retry:
- if (counter > max_reserved) {
- head = inode_hashtable + hash(sb,counter);
- res = counter++;
+ if (sb->s_lastino >= max_reserved) {
+ res = ++sb->s_lastino;
+ head = inode_hashtable + hash(sb,res);
inode = find_inode_fast(sb, head, res);
if (!inode) {
spin_unlock(&inode_lock);
return res;
}
} else {
- counter = max_reserved + 1;
+ sb->s_lastino = max_reserved;
}
goto retry;

diff --git a/fs/super.c b/fs/super.c
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..d85c7ba 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -961,6 +961,14 @@ #endif
/* Granularity of c/m/atime in ns.
Cannot be worse than a second */
u32 s_time_gran;
+
+ /* per-sb inode counter for new_inode. Make it a 32-bit counter when
+ we have the possibility of dealing with 32-bit apps */
+#ifdef CONFIG_COMPAT
+ unsigned int s_lastino;
+#else
+ unsigned long s_lastino;
+#endif
};

extern struct timespec current_fs_time(struct super_block *sb);
@@ -1712,6 +1720,7 @@ extern void __iget(struct inode * inode)
extern void clear_inode(struct inode *);
extern void destroy_inode(struct inode *);
extern struct inode *new_inode(struct super_block *);
+extern struct inode *new_inode_autonum(struct super_block *);
extern int __remove_suid(struct dentry *, int);
extern int should_remove_suid(struct dentry *);
extern int remove_suid(struct dentry *);



2006-11-07 18:07:24

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 11:53:03 -0600, Dave Kleikamp wrote:
> On Tue, 2006-11-07 at 18:42 +0100, J?rn Engel wrote:
> >
> > Jfs really surprised me. It appears as if it just takes the number
> > returned from new_inode in some cases - unbelievable.
>
> jfs set it in diInitInode() (pardon the uglyMixedCaps), which is called
> in several places under diAlloc(). diAlloc() is called after
> new_inode() for most inodes. The exceptions are for special inodes,
> which also initialize i_ino in some manner.

Yes, even I see it now. Thanks for clearing that up.

J?rn

--
Joern's library part 10:
http://blogs.msdn.com/David_Gristwood/archive/2004/06/24/164849.aspx

2006-11-07 18:10:45

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

J?rn Engel wrote:
> And the last one is xfs_mapping_buftarg(). I am completely at a loss.
> As far as I can tell, inode allocated, partially initiated and -
> leaked. Am I missing something?

It's not leaked... the new inode's mapping is saved in the buftarg:

btp->bt_mapping = mapping;

and then eventually the inode is put/freed in xfs_free_buftarg():

iput(btp->bt_mapping->host);

The inode's mapping itself is used in several places:

File Function Line
0 xfs/linux-2.6/xfs_buf.h <global> 84 struct address_space *bt_mapping;
1 xfs/linux-2.6/xfs_buf.c _xfs_buf_lookup_pages 345 struct address_space *mapping = bp->b_target->bt_mapping;
2 xfs/linux-2.6/xfs_buf.c xfs_buf_readahead 671 bdi = target->bt_mapping->backing_dev_info;
3 xfs/linux-2.6/xfs_buf.c xfs_buf_lock 906 blk_run_address_space(bp->b_target->bt_mapping);
4 xfs/linux-2.6/xfs_buf.c xfs_buf_wait_unpin 978 blk_run_address_space(bp->b_target->bt_mapping);
5 xfs/linux-2.6/xfs_buf.c xfs_buf_iowait 1291 blk_run_address_space(bp->b_target->bt_mapping);
6 xfs/linux-2.6/xfs_buf.c xfs_free_buftarg 1451 iput(btp->bt_mapping->host);
7 xfs/linux-2.6/xfs_buf.c xfs_mapping_buftarg 1545 btp->bt_mapping = mapping;
8 xfs/linux-2.6/xfs_buf.c xfsbufd 1728 blk_run_address_space(target->bt_mapping);
9 xfs/linux-2.6/xfs_buf.c xfs_flush_buftarg 1801 blk_run_address_space(target->bt_mapping);

-Eric

2006-11-07 18:15:17

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 13:01:28 -0500, Jeff Layton wrote:
>
> Thanks for the feedback. Yeah, I held back on converting any filesystems
> until I had some comments. Thanks for doing the legwork on that part.
> Here's a respun patch with the suggested modification to
> new_inode_autonum. This also adds it to fs.h.
>
> Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Joern Engel <[email protected]>

Looks good to me.

Jeff, would you mind taking my patches and putting them into a decent
patch series? I really should be working on other things.

J?rn

--
Time? What's that? Time is only worth what you do with it.
-- Theo de Raadt

2006-11-07 18:23:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 19:14 +0100, Jörn Engel wrote:
> Acked-by: Joern Engel <[email protected]>
>
> Looks good to me.
>
> Jeff, would you mind taking my patches and putting them into a decent
> patch series? I really should be working on other things.
>
> Jörn
>

Me too, but sure. I'll try to put together a good patch set, and even
test to make sure it builds ;-). I'll probably have something together
by tomorrow sometime.

-- Jeff


2006-11-07 19:41:08

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 18:56 +0100, Jörn Engel wrote:
> Things are getting more interesting. simple_fill_super() looked like
> a bug waiting to happen. It is fairly hard to trigger, but still.
> This should fix it, although in a fairly crude manner.
>
> The other callers were save - it is hard to have the root inode
> collide with anything existing.
>
> Jörn
>

Jörn,
How about this patch instead here? I don't think anything depends on
i_ino being any certain value for these files, and this seems less
"magic-numbery". This should also mostly prevent us from assigning out
i_ino=0.

Signed-off-by: Jeff Layton <[email protected]>

diff --git a/fs/libfs.c b/fs/libfs.c
index bd08e0e..506268e 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -373,6 +373,8 @@ int simple_fill_super(struct super_block
inode = new_inode(s);
if (!inode)
return -ENOMEM;
+ /* ino must not collide with any ino assigned in the loop below */
+ inode->i_ino = 1;
inode->i_mode = S_IFDIR | 0755;
inode->i_uid = inode->i_gid = 0;
inode->i_blocks = 0;
@@ -385,7 +387,7 @@ int simple_fill_super(struct super_block
iput(inode);
return -ENOMEM;
}
- for (i = 0; !files->name || files->name[0]; i++, files++) {
+ for (i = 2; !files->name || files->name[0]; i++, files++) {
if (!files->name)
continue;
dentry = d_alloc_name(root, files->name);


2006-11-07 20:41:42

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 7 November 2006 14:41:04 -0500, Jeff Layton wrote:
>
> How about this patch instead here? I don't think anything depends on
> i_ino being any certain value for these files, and this seems less
> "magic-numbery". This should also mostly prevent us from assigning out
> i_ino=0.

nfsctl_transaction_write() appears to depend on i_ino.

J?rn

--
Invincibility is in oneself, vulnerability is in the opponent.
-- Sun Tzu

2006-11-07 21:15:31

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 21:41 +0100, Jörn Engel wrote:
> On Tue, 7 November 2006 14:41:04 -0500, Jeff Layton wrote:
> >
> > How about this patch instead here? I don't think anything depends on
> > i_ino being any certain value for these files, and this seems less
> > "magic-numbery". This should also mostly prevent us from assigning out
> > i_ino=0.
>
> nfsctl_transaction_write() appears to depend on i_ino.
>
> Jörn
>

Ahh, correct. That could probably stand to be a bit more robust, but I
don't want to tackle it now. How about this instead, which should just
set it to the maximum allowed value of s->s_lastino? It might throw a
compile-time warning about overflowing i_ino, but I think it'll still
work.

Signed-off-by: Jeff Layton <[email protected]>

diff --git a/fs/libfs.c b/fs/libfs.c
index bd08e0e..d9f4e3e 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -373,6 +373,9 @@ int simple_fill_super(struct super_block
inode = new_inode(s);
if (!inode)
return -ENOMEM;
+ /* ino must not collide with any ino assigned in the loop below. Set
+ it to the highest possible inode number */
+ inode->i_ino = (1 << (sizeof(s->s_lastino) * 8)) - 1;
inode->i_mode = S_IFDIR | 0755;
inode->i_uid = inode->i_gid = 0;
inode->i_blocks = 0;


2006-11-07 21:20:15

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, Nov 07, 2006 at 04:13:00PM -0500, Jeff Layton wrote:
> + /* ino must not collide with any ino assigned in the loop below. Set
> + it to the highest possible inode number */
> + inode->i_ino = (1 << (sizeof(s->s_lastino) * 8)) - 1;

This really isn't a good idiom to be using; GCC now takes this to mean
"I can reformat your hard drive because you did something outside the
spec".

Try instead:
+ inode->i_ino = -1;

2006-11-07 22:11:37

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] make last_inode counter in new_inode 32-bit on kernels that offer x86 compatability

On Tue, 2006-11-07 at 14:20 -0700, Matthew Wilcox wrote:
> On Tue, Nov 07, 2006 at 04:13:00PM -0500, Jeff Layton wrote:
> > + /* ino must not collide with any ino assigned in the loop below. Set
> > + it to the highest possible inode number */
> > + inode->i_ino = (1 << (sizeof(s->s_lastino) * 8)) - 1;
>
> This really isn't a good idiom to be using; GCC now takes this to mean
> "I can reformat your hard drive because you did something outside the
> spec".
>
> Try instead:
> + inode->i_ino = -1;
>

The problem there is that on platforms with a 64-bit ino_t, this will be
too large to fit in a 32-bit field and we'll end up with the same
EOVERFLOW problem. Is there a more correct way to make it size
appropriately given the different possible sizes of s_lastino?

I suppose we could just set it to 0xffffffff and hope that that is "big
enough" for most cases.

-- Jeff