2011-11-21 07:45:23

by Chris Dunlop

[permalink] [raw]
Subject: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

can't blindly check nd->flags in ->d_revalidate()

Has been previously done for various other file systems:

git blame -L 1768,+1 fs/proc/base.c
34286d66 (Nick Piggin 2011-01-07 17:49:57 +1100 1768) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 645,+1 fs/cifs/dir.c
3ca30d40 (Pavel Shilovsky 2011-07-25 17:59:10 +0400 645) if (nd && (nd->flags & LOOKUP_RCU))
git blame -L 46,+1 fs/fat/namei_vfat.c
9177ada9 (Al Viro 2011-03-10 03:45:49 -0500 46) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 57,+1 fs/fat/namei_vfat.c
9177ada9 (Al Viro 2011-03-10 03:45:49 -0500 57) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 177,+1 fs/fuse/dir.c
d2433905 (Miklos Szeredi 2011-05-10 17:35:58 +0200 177) if (nd && (nd->flags & LOOKUP_RCU))
git blame -L 1036,+1 fs/ceph/dir.c
0eb980e3 (Al Viro 2011-03-10 03:44:05 -0500 1036) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 47,+1 fs/gfs2/dentry.c
53fe9241 (Al Viro 2011-03-10 03:44:48 -0500 47) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 53,+1 fs/ecryptfs/dentry.c
70b89021 (Tyler Hicks 2011-02-17 17:35:20 -0600 53) if (nd && nd->flags & LOOKUP_RCU)
git blame -L 59,+1 fs/ocfs2/dcache.c
4714e637 (Al Viro 2011-03-10 03:45:07 -0500 59) if (nd && nd->flags & LOOKUP_RCU)

Signed-off-by: Chris Dunlop <[email protected]>
---
fs/9p/vfs_dentry.c | 2 +-
fs/afs/dir.c | 2 +-
fs/coda/dir.c | 2 +-
fs/hfs/sysdep.c | 2 +-
fs/jfs/namei.c | 3 +++
fs/ncpfs/dir.c | 2 +-
fs/nfs/dir.c | 2 +-
fs/proc/proc_sysctl.c | 2 +-
fs/sysfs/dir.c | 2 +-
9 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index e022890..2be4b91 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -106,7 +106,7 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
struct inode *inode;
struct v9fs_inode *v9inode;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

inode = dentry->d_inode;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1b0b195..4112d68 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -607,7 +607,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
void *dir_version;
int ret;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

vnode = AFS_FS_I(dentry->d_inode);
diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index 28e7e13..0cf4a68 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -544,7 +544,7 @@ static int coda_dentry_revalidate(struct dentry *de, struct nameidata *nd)
struct inode *inode;
struct coda_inode_info *cii;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

inode = de->d_inode;
diff --git a/fs/hfs/sysdep.c b/fs/hfs/sysdep.c
index 19cf291..1127ea3 100644
--- a/fs/hfs/sysdep.c
+++ b/fs/hfs/sysdep.c
@@ -18,7 +18,7 @@ static int hfs_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
struct inode *inode;
int diff;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

inode = dentry->d_inode;
diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
index a112ad9..e3b65d6 100644
--- a/fs/jfs/namei.c
+++ b/fs/jfs/namei.c
@@ -1585,6 +1585,9 @@ out:

static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
{
+ if (nd && (nd->flags & LOOKUP_RCU))
+ return -ECHILD;
+
/*
* This is not negative dentry. Always valid.
*
diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
index 9c51f62..1fb3de3 100644
--- a/fs/ncpfs/dir.c
+++ b/fs/ncpfs/dir.c
@@ -302,7 +302,7 @@ ncp_lookup_validate(struct dentry *dentry, struct nameidata *nd)
if (dentry == dentry->d_sb->s_root)
return 1;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

parent = dget_parent(dentry);
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b238d95..88c0f6e 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1103,7 +1103,7 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
struct nfs_fattr *fattr = NULL;
int error;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

parent = dget_parent(dentry);
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index a6b6217..0437e4a 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -435,7 +435,7 @@ static const struct inode_operations proc_sys_dir_operations = {

static int proc_sys_revalidate(struct dentry *dentry, struct nameidata *nd)
{
- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;
return !PROC_I(dentry->d_inode)->sysctl->unregistering;
}
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7fdf6a7..8f36a13 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -261,7 +261,7 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
struct sysfs_dirent *sd;
int is_dir;

- if (nd->flags & LOOKUP_RCU)
+ if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

sd = dentry->d_fsdata;
--
1.7.0.4



2011-11-30 07:13:25

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Tue, Nov 29, 2011 at 03:58:07AM -0800, Myklebust, Trond wrote:
>> -----Original Message-----
>> From: Chris Dunlop [mailto:[email protected]]
>> Sent: Tuesday, November 29, 2011 3:25 AM
>>
>> I haven't seen any response to this patch which fixes an Oops in
>> d_revalidate. I hit this using NFS, but various other file systems look to be
>> likewise vulnerable, hence the broadness of the patch. The sequence leading
>> to the Oops is:
>>
>> lookup_one_len() [fs/namei.c]
>> calls __lookup_hash() [fs/namei.c] with nd == NULL,
>> which can then call the file system specific d_revalidate(), passing in nd == NULL
>> which will then Oops if nd is used without checking
>
> That's because you are "fixing" the wrong bug and if you'd checked the
> list archives, you'd know that this has already been discussed several
> times...

Augh! Apologies.

> By allowing stacked filesystems to pass nd==NULL (the VFS doesn't do
> this), you're circumventing the lookup intent mechanisms and will hit
> all sorts of problems further down the road. If you want to fix the
> problem, then please fix the broken stacking filesystems to stop using
> lookup_one_len...

OK.

To avoid other people further wasting their and your time on
exactly the same thing future, how something like the following
patch, based on your comment in:

http://article.gmane.org/gmane.linux.nfs/40370

...and, if that's acceptable, is it worthwhile doing for the
other file systems which are likewise currently vulnerable when
abused by broken layered file systems?

Chris

----------------------------------------------------------------------
Don't oops when abused by broken layered file systems

Signed-off-by: Chris Dunlop <[email protected]>
---
fs/nfs/dir.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b238d95..f872f29 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1103,6 +1103,12 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
struct nfs_fattr *fattr = NULL;
int error;

+ /*
+ * We don't support layered filesystems that don't do intents
+ */
+ if (nd == NULL)
+ return -EIO;
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

--
1.7.0.4
----------------------------------------------------------------------


2011-11-29 08:25:03

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

Hi,

I haven't seen any response to this patch which fixes an Oops in
d_revalidate. I hit this using NFS, but various other file
systems look to be likewise vulnerable, hence the broadness of
the patch. The sequence leading to the Oops is:

lookup_one_len() [fs/namei.c]
calls __lookup_hash() [fs/namei.c] with nd == NULL,
which can then call the file system specific d_revalidate(), passing in nd == NULL
which will then Oops if nd is used without checking

Any suggestions on how to better bring this to the attention of
the relevant people? E.g. be patient, and/or provide separate
patches for each file system, and/or provide a better
explanation of the issue in the patch?

Note: in the patch version provided I missed the d_revalidate for
NFSv4, I'll respin that once I can get an idea of how to better
provide the patches.

Cheers,

Chris.

On Mon, Nov 21, 2011 at 06:36:48PM +1100, Chris Dunlop wrote:
> can't blindly check nd->flags in ->d_revalidate()
>
> Has been previously done for various other file systems:
>
> git blame -L 1768,+1 fs/proc/base.c
> 34286d66 (Nick Piggin 2011-01-07 17:49:57 +1100 1768) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 645,+1 fs/cifs/dir.c
> 3ca30d40 (Pavel Shilovsky 2011-07-25 17:59:10 +0400 645) if (nd && (nd->flags & LOOKUP_RCU))
> git blame -L 46,+1 fs/fat/namei_vfat.c
> 9177ada9 (Al Viro 2011-03-10 03:45:49 -0500 46) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 57,+1 fs/fat/namei_vfat.c
> 9177ada9 (Al Viro 2011-03-10 03:45:49 -0500 57) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 177,+1 fs/fuse/dir.c
> d2433905 (Miklos Szeredi 2011-05-10 17:35:58 +0200 177) if (nd && (nd->flags & LOOKUP_RCU))
> git blame -L 1036,+1 fs/ceph/dir.c
> 0eb980e3 (Al Viro 2011-03-10 03:44:05 -0500 1036) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 47,+1 fs/gfs2/dentry.c
> 53fe9241 (Al Viro 2011-03-10 03:44:48 -0500 47) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 53,+1 fs/ecryptfs/dentry.c
> 70b89021 (Tyler Hicks 2011-02-17 17:35:20 -0600 53) if (nd && nd->flags & LOOKUP_RCU)
> git blame -L 59,+1 fs/ocfs2/dcache.c
> 4714e637 (Al Viro 2011-03-10 03:45:07 -0500 59) if (nd && nd->flags & LOOKUP_RCU)
>
> Signed-off-by: Chris Dunlop <[email protected]>
> ---
> fs/9p/vfs_dentry.c | 2 +-
> fs/afs/dir.c | 2 +-
> fs/coda/dir.c | 2 +-
> fs/hfs/sysdep.c | 2 +-
> fs/jfs/namei.c | 3 +++
> fs/ncpfs/dir.c | 2 +-
> fs/nfs/dir.c | 2 +-
> fs/proc/proc_sysctl.c | 2 +-
> fs/sysfs/dir.c | 2 +-
> 9 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
> index e022890..2be4b91 100644
> --- a/fs/9p/vfs_dentry.c
> +++ b/fs/9p/vfs_dentry.c
> @@ -106,7 +106,7 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> struct v9fs_inode *v9inode;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> inode = dentry->d_inode;
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 1b0b195..4112d68 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -607,7 +607,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
> void *dir_version;
> int ret;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> vnode = AFS_FS_I(dentry->d_inode);
> diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> index 28e7e13..0cf4a68 100644
> --- a/fs/coda/dir.c
> +++ b/fs/coda/dir.c
> @@ -544,7 +544,7 @@ static int coda_dentry_revalidate(struct dentry *de, struct nameidata *nd)
> struct inode *inode;
> struct coda_inode_info *cii;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> inode = de->d_inode;
> diff --git a/fs/hfs/sysdep.c b/fs/hfs/sysdep.c
> index 19cf291..1127ea3 100644
> --- a/fs/hfs/sysdep.c
> +++ b/fs/hfs/sysdep.c
> @@ -18,7 +18,7 @@ static int hfs_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> int diff;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> inode = dentry->d_inode;
> diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
> index a112ad9..e3b65d6 100644
> --- a/fs/jfs/namei.c
> +++ b/fs/jfs/namei.c
> @@ -1585,6 +1585,9 @@ out:
>
> static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> + if (nd && (nd->flags & LOOKUP_RCU))
> + return -ECHILD;
> +
> /*
> * This is not negative dentry. Always valid.
> *
> diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
> index 9c51f62..1fb3de3 100644
> --- a/fs/ncpfs/dir.c
> +++ b/fs/ncpfs/dir.c
> @@ -302,7 +302,7 @@ ncp_lookup_validate(struct dentry *dentry, struct nameidata *nd)
> if (dentry == dentry->d_sb->s_root)
> return 1;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> parent = dget_parent(dentry);
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index b238d95..88c0f6e 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1103,7 +1103,7 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct nfs_fattr *fattr = NULL;
> int error;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> parent = dget_parent(dentry);
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index a6b6217..0437e4a 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -435,7 +435,7 @@ static const struct inode_operations proc_sys_dir_operations = {
>
> static int proc_sys_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
> return !PROC_I(dentry->d_inode)->sysctl->unregistering;
> }
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index 7fdf6a7..8f36a13 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -261,7 +261,7 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct sysfs_dirent *sd;
> int is_dir;
>
> - if (nd->flags & LOOKUP_RCU)
> + if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> sd = dentry->d_fsdata;
> --
> 1.7.0.4
>

2011-11-29 11:58:29

by Myklebust, Trond

[permalink] [raw]
Subject: RE: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

> -----Original Message-----
> From: Chris Dunlop [mailto:[email protected]]
> Sent: Tuesday, November 29, 2011 3:25 AM
> To: [email protected]; [email protected]; Eric
Van
> Hensbergen; Ron Minnich; Latchesar Ionkov; David Howells; Jan Harkes;
> maintainer:CODA FILE SYSTEM; Dave Kleikamp; Petr Vandrovec; Myklebust,
> Trond; Greg Kroah-Hartman; Al Viro;
[email protected];
> [email protected]; [email protected]; jfs-
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports
>
> Hi,
>
> I haven't seen any response to this patch which fixes an Oops in
> d_revalidate. I hit this using NFS, but various other file systems
look to be
> likewise vulnerable, hence the broadness of the patch. The sequence
leading
> to the Oops is:
>
> lookup_one_len() [fs/namei.c]
> calls __lookup_hash() [fs/namei.c] with nd == NULL,
> which can then call the file system specific d_revalidate(),
passing in nd ==
> NULL
> which will then Oops if nd is used without checking

That's because you are "fixing" the wrong bug and if you'd checked the
list archives, you'd know that this has already been discussed several
times...

By allowing stacked filesystems to pass nd==NULL (the VFS doesn't do
this), you're circumventing the lookup intent mechanisms and will hit
all sorts of problems further down the road. If you want to fix the
problem, then please fix the broken stacking filesystems to stop using
lookup_one_len...

Trond

2011-12-01 03:54:04

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports



On 11/30/2011 09:33 PM, Chris Dunlop wrote:
> On Wed, Nov 30, 2011 at 08:22:39PM -0600, Dave Kleikamp wrote:
>> On 11/30/2011 06:47 PM, Chris Dunlop wrote:
>>>> It's also worth printing a message - this *is* a kernel bug of some description
>>>> if it happens.
>>>
>>> Like the below? This covers the d_revalidate for 9p, afs, coda,
>>> hfs, ncpfs, proc, sysfs.
>>>
>>> Note: jfs isn't susceptible to this problem, but the resolution
>>> doesn't look like the other file systems, and from the comment
>>> I'm not sure if the problem was really understood and if it's
>>> doing the right thing:
>>
>> This code, as well as the comments, were copied from vfat. It seems
>> reasonable for case-insensitive but case-preserving behavior (not jfs's
>> default). The safe thing is to drop the negative dentry if we don't know
>> the operation.
>
> In that case, it looks like the thing to do might be to add the
> "protection" to the start of jfs_ci_revaliate(), per how the
> original has been changed in vfat:

The LOOKUP_RCU check had previously been there, but Al Viro removed it:

commit 5c0f360b083fb33d05d1bff4b138b82d715eb419
Author: Al Viro <[email protected]>
Date: Sat Jun 25 21:41:09 2011 -0400

jfs_ci_revalidate() is safe from RCU mode

I'm not sure what it takes to be "safe", but this is a simple function
that doesn't block, take locks, or do much of anything. You shouldn't
need to do anything with jfs.

Shaggy

>
> fs/fat/namei_vfat.c:
> static int vfat_revalidate_ci(struct dentry *dentry, struct nameidata *nd)
> {
> if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;
> ...
> }
>
> E.g.:
>
> ----------------------------------------------------------------------
> Don't oops when abused by broken layered file systems
>
> Signed-off-by: Chris Dunlop <[email protected]>
> ---
> fs/jfs/namei.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
> index e17545e..5504f6e 100644
> --- a/fs/jfs/namei.c
> +++ b/fs/jfs/namei.c
> @@ -1585,6 +1585,9 @@ out:
>
> static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> + if (nd && nd->flags & LOOKUP_RCU)
> + return -ECHILD;
> +
> /*
> * This is not negative dentry. Always valid.
> *

2011-11-30 08:55:45

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

Chris Dunlop <[email protected]> wrote:

> To avoid other people further wasting their and your time on
> exactly the same thing future, how something like the following
> patch, based on your comment in:
>
> http://article.gmane.org/gmane.linux.nfs/40370
>
> ...and, if that's acceptable, is it worthwhile doing for the
> other file systems which are likewise currently vulnerable when
> abused by broken layered file systems?

Also, this may get fixed by Al's atomic open patches - but obviously it hasn't
been yet...

> Don't oops when abused by broken layered file systems
>
> Signed-off-by: Chris Dunlop <[email protected]>

Acked-by: David Howells <[email protected]>

It's also worth printing a message - this *is* a kernel bug of some description
if it happens.

David

2011-12-01 00:47:15

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Wed, Nov 30, 2011 at 08:54:43AM +0000, David Howells wrote:
> Chris Dunlop <[email protected]> wrote:
>
>> To avoid other people further wasting their and your time on
>> exactly the same thing future, how something like the following
>> patch, based on your comment in:
>>
>> http://article.gmane.org/gmane.linux.nfs/40370
>>
>> ...and, if that's acceptable, is it worthwhile doing for the
>> other file systems which are likewise currently vulnerable when
>> abused by broken layered file systems?
>
> Also, this may get fixed by Al's atomic open patches - but obviously it hasn't
> been yet...
>
>> Don't oops when abused by broken layered file systems
>>
>> Signed-off-by: Chris Dunlop <[email protected]>
>
> Acked-by: David Howells <[email protected]>
>
> It's also worth printing a message - this *is* a kernel bug of some description
> if it happens.

Like the below? This covers the d_revalidate for 9p, afs, coda,
hfs, ncpfs, proc, sysfs.

Note: jfs isn't susceptible to this problem, but the resolution
doesn't look like the other file systems, and from the comment
I'm not sure if the problem was really understood and if it's
doing the right thing:

static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
{
...
/*
* This may be nfsd (or something), anyway, we can't see the
* intent of this. So, since this can be for creation, drop it.
*/
if (!nd)
return 0;

/*
* Drop the negative dentry, in order to make sure to use the
* case sensitive name which is specified by user if this is
* for creation.
*/
if (nd->flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET))
return 0;
...
}

Chris.

----------------------------------------------------------------------
Don't oops when abused by broken layered file systems

Signed-off-by: Chris Dunlop <[email protected]>
---
fs/9p/vfs_dentry.c | 6 ++++++
fs/afs/dir.c | 6 ++++++
fs/coda/dir.c | 6 ++++++
fs/hfs/sysdep.c | 6 ++++++
fs/ncpfs/dir.c | 6 ++++++
fs/nfs/dir.c | 12 ++++++++++++
fs/proc/proc_sysctl.c | 5 +++++
fs/sysfs/dir.c | 6 ++++++
8 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index e022890..3b082dc 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -106,6 +106,12 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
struct inode *inode;
struct v9fs_inode *v9inode;

+ if (!nd) {
+ printk(KERN_ERR "v9fs_lookup_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1b0b195..4003d76 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -607,6 +607,12 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
void *dir_version;
int ret;

+ if (!nd) {
+ printk(KERN_ERR "afs_d_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index 0239433..ede8e77 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -544,6 +544,12 @@ static int coda_dentry_revalidate(struct dentry *de, struct nameidata *nd)
struct inode *inode;
struct coda_inode_info *cii;

+ if (!nd) {
+ printk(KERN_ERR "coda_dentry_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/hfs/sysdep.c b/fs/hfs/sysdep.c
index 19cf291..b130d91 100644
--- a/fs/hfs/sysdep.c
+++ b/fs/hfs/sysdep.c
@@ -18,6 +18,12 @@ static int hfs_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
struct inode *inode;
int diff;

+ if (!nd) {
+ printk(KERN_ERR "hfs_revalidate_dentry:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
index 9c51f62..6580d1d 100644
--- a/fs/ncpfs/dir.c
+++ b/fs/ncpfs/dir.c
@@ -302,6 +302,12 @@ ncp_lookup_validate(struct dentry *dentry, struct nameidata *nd)
if (dentry == dentry->d_sb->s_root)
return 1;

+ if (!nd) {
+ printk(KERN_ERR "ncp_lookup_validate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b238d95..51b3d54 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1103,6 +1103,12 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
struct nfs_fattr *fattr = NULL;
int error;

+ if (!nd) {
+ printk(KERN_ERR "nfs_lookup_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

@@ -1508,6 +1514,12 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
struct nfs_open_context *ctx;
int openflags, ret = 0;

+ if (!nd) {
+ printk(KERN_ERR "nfs_open_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 1a77dbe..20ef3ab 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -389,6 +389,11 @@ static const struct inode_operations proc_sys_dir_operations = {

static int proc_sys_revalidate(struct dentry *dentry, struct nameidata *nd)
{
+ if (!nd) {
+ printk(KERN_ERR "proc_sys_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
if (nd->flags & LOOKUP_RCU)
return -ECHILD;
return !PROC_I(dentry->d_inode)->sysctl->unregistering;
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ea9120a..6373450 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -242,6 +242,12 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
struct sysfs_dirent *sd;
int is_dir;

+ if (!nd) {
+ printk(KERN_ERR "sysfs_dentry_revalidate:"
+ " called from layered filesystem without intents\n");
+ return -EIO;
+ }
+
if (nd->flags & LOOKUP_RCU)
return -ECHILD;

--
1.7.0.4

----------------------------------------------------------------------

2011-12-06 11:43:20

by Jacek Luczak

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

2011/12/1 Chris Dunlop <[email protected]>:
[!snip]
>> I'm also not sure about the printk in the NFS case. Instead of littering
>> the logs, we should probably just disallow the stacked filesystem (are
>> we talking about eCryptfs here?) from mounting on top of NFS in the
>> first place.
>
> See other reply: it wasn't a stacked file system.

I've actually hit the same think with NFS on vanilla 2.6.39.4. There
was no stacked fs on top as this host is only running Jenkins master
instance where few jar are on NFS homedir mounted by autofs.

As there's no way I could actually wait for Al patches what is then
the best way to limit this issue (BTW: I was not able to reproduce
this anywhere yet)? The gentle msg info and -EIO sound like good idea
until we will be suck up by black hole.

-Jacek

2011-12-01 05:35:01

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Thu, Dec 01, 2011 at 11:47:09AM +1100, Chris Dunlop wrote:
> On Wed, Nov 30, 2011 at 08:54:43AM +0000, David Howells wrote:
>> Chris Dunlop <[email protected]> wrote:
>>
>>> To avoid other people further wasting their and your time on
>>> exactly the same thing future, how something like the following
>>> patch, based on your comment in:
>>>
>>> http://article.gmane.org/gmane.linux.nfs/40370
>>>
>>> ...and, if that's acceptable, is it worthwhile doing for the
>>> other file systems which are likewise currently vulnerable when
>>> abused by broken layered file systems?
>>
>> It's also worth printing a message - this *is* a kernel bug of some description
>> if it happens.
>
> Like the below? This covers the d_revalidate for 9p, afs, coda,
> hfs, ncpfs, proc, sysfs.

...and nfs.

> ----------------------------------------------------------------------
> Don't oops when abused by broken layered file systems
>
> Signed-off-by: Chris Dunlop <[email protected]>
> ---
> fs/9p/vfs_dentry.c | 6 ++++++
> fs/afs/dir.c | 6 ++++++
> fs/coda/dir.c | 6 ++++++
> fs/hfs/sysdep.c | 6 ++++++
> fs/ncpfs/dir.c | 6 ++++++
> fs/nfs/dir.c | 12 ++++++++++++
> fs/proc/proc_sysctl.c | 5 +++++
> fs/sysfs/dir.c | 6 ++++++
> 8 files changed, 53 insertions(+), 0 deletions(-)
>
> diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
> index e022890..3b082dc 100644
> --- a/fs/9p/vfs_dentry.c
> +++ b/fs/9p/vfs_dentry.c
> @@ -106,6 +106,12 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> struct v9fs_inode *v9inode;
>
> + if (!nd) {
> + printk(KERN_ERR "v9fs_lookup_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 1b0b195..4003d76 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -607,6 +607,12 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
> void *dir_version;
> int ret;
>
> + if (!nd) {
> + printk(KERN_ERR "afs_d_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> index 0239433..ede8e77 100644
> --- a/fs/coda/dir.c
> +++ b/fs/coda/dir.c
> @@ -544,6 +544,12 @@ static int coda_dentry_revalidate(struct dentry *de, struct nameidata *nd)
> struct inode *inode;
> struct coda_inode_info *cii;
>
> + if (!nd) {
> + printk(KERN_ERR "coda_dentry_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/hfs/sysdep.c b/fs/hfs/sysdep.c
> index 19cf291..b130d91 100644
> --- a/fs/hfs/sysdep.c
> +++ b/fs/hfs/sysdep.c
> @@ -18,6 +18,12 @@ static int hfs_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> int diff;
>
> + if (!nd) {
> + printk(KERN_ERR "hfs_revalidate_dentry:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
> index 9c51f62..6580d1d 100644
> --- a/fs/ncpfs/dir.c
> +++ b/fs/ncpfs/dir.c
> @@ -302,6 +302,12 @@ ncp_lookup_validate(struct dentry *dentry, struct nameidata *nd)
> if (dentry == dentry->d_sb->s_root)
> return 1;
>
> + if (!nd) {
> + printk(KERN_ERR "ncp_lookup_validate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index b238d95..51b3d54 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1103,6 +1103,12 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct nfs_fattr *fattr = NULL;
> int error;
>
> + if (!nd) {
> + printk(KERN_ERR "nfs_lookup_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> @@ -1508,6 +1514,12 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct nfs_open_context *ctx;
> int openflags, ret = 0;
>
> + if (!nd) {
> + printk(KERN_ERR "nfs_open_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 1a77dbe..20ef3ab 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -389,6 +389,11 @@ static const struct inode_operations proc_sys_dir_operations = {
>
> static int proc_sys_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> + if (!nd) {
> + printk(KERN_ERR "proc_sys_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
> return !PROC_I(dentry->d_inode)->sysctl->unregistering;
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index ea9120a..6373450 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -242,6 +242,12 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct sysfs_dirent *sd;
> int is_dir;
>
> + if (!nd) {
> + printk(KERN_ERR "sysfs_dentry_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> --
> 1.7.0.4
>
> ----------------------------------------------------------------------

2011-12-01 06:50:32

by Tyler Hicks

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On 2011-11-29 19:25:01, Chris Dunlop wrote:
> Hi,
>
> I haven't seen any response to this patch which fixes an Oops in
> d_revalidate. I hit this using NFS, but various other file
> systems look to be likewise vulnerable, hence the broadness of
> the patch. The sequence leading to the Oops is:
>
> lookup_one_len() [fs/namei.c]
> calls __lookup_hash() [fs/namei.c] with nd == NULL,
> which can then call the file system specific d_revalidate(), passing in nd == NULL
> which will then Oops if nd is used without checking

Hey Chris - Can you share what you were trying to do when you hit this?
Were you stacking eCryptfs on top of NFS? Another stacked filesystem on
top of NFS?

Do you *need* a stacked filesystem to work on top of NFS? If so, we'll
need to discuss a way forward. Al has previously shown a dislike of
eCryptfs passing around nameidata (for good reason), but that is what
NFS currently requires. I looked at doing this a few months back, but
never got to the implementation stage.

As David mentioned, Al's atomic open patches might solve all of this in
the future, but I don't know much about that patchset. Is there any
relevant info you could provide about those patches, Al?

Tyler


Attachments:
(No filename) (1.19 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2011-12-01 08:02:37

by Tyler Hicks

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On 2011-12-01 00:50:25, Tyler Hicks wrote:
> As David mentioned, Al's atomic open patches might solve all of this in
> the future, but I don't know much about that patchset. Is there any
> relevant info you could provide about those patches, Al?

Sorry, Al. I see that you've already provided those details here:

https://lkml.org/lkml/2011/5/12/337

I was unaware of that thread until now.

Tyler



Attachments:
(No filename) (400.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2011-12-01 07:29:50

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Thu, Dec 01, 2011 at 12:31:58AM -0600, Tyler Hicks wrote:
> On 2011-12-01 11:47:09, Chris Dunlop wrote:
>> On Wed, Nov 30, 2011 at 08:54:43AM +0000, David Howells wrote:
>>> Chris Dunlop <[email protected]> wrote:
>>>
>>>> To avoid other people further wasting their and your time on
>>>> exactly the same thing future, how something like the following
>>>> patch, based on your comment in:
>>>>
>>>> http://article.gmane.org/gmane.linux.nfs/40370
>>>>
>>>> ...and, if that's acceptable, is it worthwhile doing for the
>>>> other file systems which are likewise currently vulnerable when
>>>> abused by broken layered file systems?
>>>
>>> Also, this may get fixed by Al's atomic open patches - but obviously it hasn't
>>> been yet...
>>>
>>>> Don't oops when abused by broken layered file systems
>>>>
>>>> Signed-off-by: Chris Dunlop <[email protected]>
>>>
>>> Acked-by: David Howells <[email protected]>
>>>
>>> It's also worth printing a message - this *is* a kernel bug of some description
>>> if it happens.
>>
>> Like the below? This covers the d_revalidate for 9p, afs, coda,
>> hfs, ncpfs, proc, sysfs.
>
> I don't like the looks of this patch. It makes sense for NFS to error
> out of d_revalidate() when passed a NULL nameidata pointer because NFS
> actually uses the nameidata to do something useful. That can't be said
> about the other filesystems in this patch.

I can see nd is used in nfs_open_revalidate(), but is it
necessarily used in nfs_lookup_revalidate()? I'm way out of my
depth here, but everywhere it's used in nfs_lookup_revalidate()
(nfs_neg_need_reval(), nfs_is_exclusive_create(),
nfs_lookup_verify_inode()) there are also checks for nd != NULL.

> Why not handle the other filesystems like the previous fixes you
> referenced in your original email by checking for a non-NULL nd like
> this:
>
> if (nd && nd->flags & LOOKUP_RCU)
> return -ECHILD;

'Cos Trond scared me into it! ;-)

But mostly because I don't really know what I'm doing. The
original patch came about because I was tracking down the Oops
in the NFS code and it seemed such an obvious fix that
lookup_one_len() passes down a hard-coded NULL and that NULL
isn't checked in all the d_revalidate routines. I thought I'd do
the right thing and make sure it was checked everywhere. Little
did I know there's "history" behind it! I'm afraid I don't know
anywhere near enough to argue about the right way to deal with it.

> I'm also not sure about the printk in the NFS case. Instead of littering
> the logs, we should probably just disallow the stacked filesystem (are
> we talking about eCryptfs here?) from mounting on top of NFS in the
> first place.

See other reply: it wasn't a stacked file system.

But it seems useful to have the d_revalidate routines indicate
via the log that they're being abused.

Chris

2011-12-01 07:23:11

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Thu, Dec 01, 2011 at 12:50:25AM -0600, Tyler Hicks wrote:
> On 2011-11-29 19:25:01, Chris Dunlop wrote:
>> I haven't seen any response to this patch which fixes an Oops in
>> d_revalidate. I hit this using NFS, but various other file
>> systems look to be likewise vulnerable, hence the broadness of
>> the patch. The sequence leading to the Oops is:
>>
>> lookup_one_len() [fs/namei.c]
>> calls __lookup_hash() [fs/namei.c] with nd == NULL,
>> which can then call the file system specific d_revalidate(), passing in nd == NULL
>> which will then Oops if nd is used without checking
>
> Hey Chris - Can you share what you were trying to do when you hit this?
> Were you stacking eCryptfs on top of NFS? Another stacked filesystem on
> top of NFS?
>
> Do you *need* a stacked filesystem to work on top of NFS? If so, we'll
> need to discuss a way forward. Al has previously shown a dislike of
> eCryptfs passing around nameidata (for good reason), but that is what
> NFS currently requires. I looked at doing this a few months back, but
> never got to the implementation stage.

Actually, no, it wasn't eCryptfs or another stacked filesystem.
It seems my dirty little secret must come out: I hit the problem
when trying to use the (necessarily) out-of-tree zfsonlinux
(ZoL) [1], on an NFS root machine.

I don't know exactly what ZoL is using lookup_one_len() for, nor
how to fix it so it isn't, but I've given them the heads up that
it's not supposed to be used outside of original file system [2].

Chris.

[1] http://zfsonlinux.org/
[2] https://github.com/zfsonlinux/zfs/issues/456

2011-12-01 06:32:14

by Tyler Hicks

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On 2011-12-01 11:47:09, Chris Dunlop wrote:
> On Wed, Nov 30, 2011 at 08:54:43AM +0000, David Howells wrote:
> > Chris Dunlop <[email protected]> wrote:
> >
> >> To avoid other people further wasting their and your time on
> >> exactly the same thing future, how something like the following
> >> patch, based on your comment in:
> >>
> >> http://article.gmane.org/gmane.linux.nfs/40370
> >>
> >> ...and, if that's acceptable, is it worthwhile doing for the
> >> other file systems which are likewise currently vulnerable when
> >> abused by broken layered file systems?
> >
> > Also, this may get fixed by Al's atomic open patches - but obviously it hasn't
> > been yet...
> >
> >> Don't oops when abused by broken layered file systems
> >>
> >> Signed-off-by: Chris Dunlop <[email protected]>
> >
> > Acked-by: David Howells <[email protected]>
> >
> > It's also worth printing a message - this *is* a kernel bug of some description
> > if it happens.
>
> Like the below? This covers the d_revalidate for 9p, afs, coda,
> hfs, ncpfs, proc, sysfs.

I don't like the looks of this patch. It makes sense for NFS to error
out of d_revalidate() when passed a NULL nameidata pointer because NFS
actually uses the nameidata to do something useful. That can't be said
about the other filesystems in this patch.

Why not handle the other filesystems like the previous fixes you
referenced in your original email by checking for a non-NULL nd like
this:

if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;

I'm also not sure about the printk in the NFS case. Instead of littering
the logs, we should probably just disallow the stacked filesystem (are
we talking about eCryptfs here?) from mounting on top of NFS in the
first place.

Tyler

>
> Note: jfs isn't susceptible to this problem, but the resolution
> doesn't look like the other file systems, and from the comment
> I'm not sure if the problem was really understood and if it's
> doing the right thing:
>
> static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> ...
> /*
> * This may be nfsd (or something), anyway, we can't see the
> * intent of this. So, since this can be for creation, drop it.
> */
> if (!nd)
> return 0;
>
> /*
> * Drop the negative dentry, in order to make sure to use the
> * case sensitive name which is specified by user if this is
> * for creation.
> */
> if (nd->flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET))
> return 0;
> ...
> }
>
> Chris.
>
> ----------------------------------------------------------------------
> Don't oops when abused by broken layered file systems
>
> Signed-off-by: Chris Dunlop <[email protected]>
> ---
> fs/9p/vfs_dentry.c | 6 ++++++
> fs/afs/dir.c | 6 ++++++
> fs/coda/dir.c | 6 ++++++
> fs/hfs/sysdep.c | 6 ++++++
> fs/ncpfs/dir.c | 6 ++++++
> fs/nfs/dir.c | 12 ++++++++++++
> fs/proc/proc_sysctl.c | 5 +++++
> fs/sysfs/dir.c | 6 ++++++
> 8 files changed, 53 insertions(+), 0 deletions(-)
>
> diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
> index e022890..3b082dc 100644
> --- a/fs/9p/vfs_dentry.c
> +++ b/fs/9p/vfs_dentry.c
> @@ -106,6 +106,12 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> struct v9fs_inode *v9inode;
>
> + if (!nd) {
> + printk(KERN_ERR "v9fs_lookup_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 1b0b195..4003d76 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -607,6 +607,12 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
> void *dir_version;
> int ret;
>
> + if (!nd) {
> + printk(KERN_ERR "afs_d_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/coda/dir.c b/fs/coda/dir.c
> index 0239433..ede8e77 100644
> --- a/fs/coda/dir.c
> +++ b/fs/coda/dir.c
> @@ -544,6 +544,12 @@ static int coda_dentry_revalidate(struct dentry *de, struct nameidata *nd)
> struct inode *inode;
> struct coda_inode_info *cii;
>
> + if (!nd) {
> + printk(KERN_ERR "coda_dentry_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/hfs/sysdep.c b/fs/hfs/sysdep.c
> index 19cf291..b130d91 100644
> --- a/fs/hfs/sysdep.c
> +++ b/fs/hfs/sysdep.c
> @@ -18,6 +18,12 @@ static int hfs_revalidate_dentry(struct dentry *dentry, struct nameidata *nd)
> struct inode *inode;
> int diff;
>
> + if (!nd) {
> + printk(KERN_ERR "hfs_revalidate_dentry:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
> index 9c51f62..6580d1d 100644
> --- a/fs/ncpfs/dir.c
> +++ b/fs/ncpfs/dir.c
> @@ -302,6 +302,12 @@ ncp_lookup_validate(struct dentry *dentry, struct nameidata *nd)
> if (dentry == dentry->d_sb->s_root)
> return 1;
>
> + if (!nd) {
> + printk(KERN_ERR "ncp_lookup_validate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index b238d95..51b3d54 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1103,6 +1103,12 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct nfs_fattr *fattr = NULL;
> int error;
>
> + if (!nd) {
> + printk(KERN_ERR "nfs_lookup_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> @@ -1508,6 +1514,12 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct nfs_open_context *ctx;
> int openflags, ret = 0;
>
> + if (!nd) {
> + printk(KERN_ERR "nfs_open_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 1a77dbe..20ef3ab 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -389,6 +389,11 @@ static const struct inode_operations proc_sys_dir_operations = {
>
> static int proc_sys_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> + if (!nd) {
> + printk(KERN_ERR "proc_sys_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
> return !PROC_I(dentry->d_inode)->sysctl->unregistering;
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index ea9120a..6373450 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -242,6 +242,12 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> struct sysfs_dirent *sd;
> int is_dir;
>
> + if (!nd) {
> + printk(KERN_ERR "sysfs_dentry_revalidate:"
> + " called from layered filesystem without intents\n");
> + return -EIO;
> + }
> +
> if (nd->flags & LOOKUP_RCU)
> return -ECHILD;
>
> --
> 1.7.0.4
>
> ----------------------------------------------------------------------
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Attachments:
(No filename) (7.67 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2011-12-01 05:32:59

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Wed, Nov 30, 2011 at 09:53:16PM -0600, Dave Kleikamp wrote:
> On 11/30/2011 09:33 PM, Chris Dunlop wrote:
>> On Wed, Nov 30, 2011 at 08:22:39PM -0600, Dave Kleikamp wrote:
>>> On 11/30/2011 06:47 PM, Chris Dunlop wrote:
>>>>> It's also worth printing a message - this *is* a kernel bug of some description
>>>>> if it happens.
>>>>
>>>> Like the below? This covers the d_revalidate for 9p, afs, coda,
>>>> hfs, ncpfs, proc, sysfs.
>>>>
>>>> Note: jfs isn't susceptible to this problem, but the resolution
>>>> doesn't look like the other file systems, and from the comment
>>>> I'm not sure if the problem was really understood and if it's
>>>> doing the right thing:
>>>
>>> This code, as well as the comments, were copied from vfat. It seems
>>> reasonable for case-insensitive but case-preserving behavior (not jfs's
>>> default). The safe thing is to drop the negative dentry if we don't know
>>> the operation.
>>
>> In that case, it looks like the thing to do might be to add the
>> "protection" to the start of jfs_ci_revaliate(), per how the
>> original has been changed in vfat:
>
> The LOOKUP_RCU check had previously been there, but Al Viro removed it:
>
> commit 5c0f360b083fb33d05d1bff4b138b82d715eb419
> Author: Al Viro <[email protected]>
> Date: Sat Jun 25 21:41:09 2011 -0400
>
> jfs_ci_revalidate() is safe from RCU mode
>
> I'm not sure what it takes to be "safe", but this is a simple function
> that doesn't block, take locks, or do much of anything. You shouldn't
> need to do anything with jfs.
>
> Shaggy

OK, thanks.

Chris

2011-12-01 02:22:39

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On 11/30/2011 06:47 PM, Chris Dunlop wrote:
> On Wed, Nov 30, 2011 at 08:54:43AM +0000, David Howells wrote:

...

>>
>> Acked-by: David Howells <[email protected]>
>>
>> It's also worth printing a message - this *is* a kernel bug of some description
>> if it happens.
>
> Like the below? This covers the d_revalidate for 9p, afs, coda,
> hfs, ncpfs, proc, sysfs.
>
> Note: jfs isn't susceptible to this problem, but the resolution
> doesn't look like the other file systems, and from the comment
> I'm not sure if the problem was really understood and if it's
> doing the right thing:

This code, as well as the comments, were copied from vfat. It seems
reasonable for case-insensitive but case-preserving behavior (not jfs's
default). The safe thing is to drop the negative dentry if we don't know
the operation.

>
> static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
> {
> ...
> /*
> * This may be nfsd (or something), anyway, we can't see the
> * intent of this. So, since this can be for creation, drop it.
> */
> if (!nd)
> return 0;
>
> /*
> * Drop the negative dentry, in order to make sure to use the
> * case sensitive name which is specified by user if this is
> * for creation.
> */
> if (nd->flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET))
> return 0;
> ...
> }
>
> Chris.
>

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure
contains a definitive record of customers, application performance,
security threats, fraudulent activity, and more. Splunk takes this
data and makes sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-novd2d

2011-12-01 03:33:47

by Chris Dunlop

[permalink] [raw]
Subject: Re: [PATCH 1/1] fix d_revalidate oopsen on NFS exports

On Wed, Nov 30, 2011 at 08:22:39PM -0600, Dave Kleikamp wrote:
> On 11/30/2011 06:47 PM, Chris Dunlop wrote:
>>> It's also worth printing a message - this *is* a kernel bug of some description
>>> if it happens.
>>
>> Like the below? This covers the d_revalidate for 9p, afs, coda,
>> hfs, ncpfs, proc, sysfs.
>>
>> Note: jfs isn't susceptible to this problem, but the resolution
>> doesn't look like the other file systems, and from the comment
>> I'm not sure if the problem was really understood and if it's
>> doing the right thing:
>
> This code, as well as the comments, were copied from vfat. It seems
> reasonable for case-insensitive but case-preserving behavior (not jfs's
> default). The safe thing is to drop the negative dentry if we don't know
> the operation.

In that case, it looks like the thing to do might be to add the
"protection" to the start of jfs_ci_revaliate(), per how the
original has been changed in vfat:

fs/fat/namei_vfat.c:
static int vfat_revalidate_ci(struct dentry *dentry, struct nameidata *nd)
{
if (nd && nd->flags & LOOKUP_RCU)
return -ECHILD;
...
}

E.g.:

----------------------------------------------------------------------
Don't oops when abused by broken layered file systems

Signed-off-by: Chris Dunlop <[email protected]>
---
fs/jfs/namei.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
index e17545e..5504f6e 100644
--- a/fs/jfs/namei.c
+++ b/fs/jfs/namei.c
@@ -1585,6 +1585,9 @@ out:

static int jfs_ci_revalidate(struct dentry *dentry, struct nameidata *nd)
{
+ if (nd && nd->flags & LOOKUP_RCU)
+ return -ECHILD;
+
/*
* This is not negative dentry. Always valid.
*
--
1.7.0.4

----------------------------------------------------------------------

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure
contains a definitive record of customers, application performance,
security threats, fraudulent activity, and more. Splunk takes this
data and makes sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-novd2d