2004-03-27 22:25:12

by Alan Stern

[permalink] [raw]
Subject: BUG: Problem with your patches for sysfs from 2 weeks ago

Maneesh:

I've been tracing a problem with sysfs that starting showing up just
recently, and it seems likely to have originated with your patches from a
couple of weeks ago. I can't tell exactly what's wrong or fix it because
I don't understand the filesystem layer well enough.

The problem I found (there may be others) shows up when trying to delete a
nonexistent symlink -- presumably trying to delete a nonexistent file
would have a similar result. The code in sysfs_hash_and_remove() does a
lookup on the nonexistent name and sysfs_get_dentry() returns a
newly-allocated dentry. Creating the new entry increments the parent
directory's d_count, of course. But at the end of the routine, when
dput() is called for the new dentry, the parent's d_count does _not_ get
decremented. The new dentry is placed on the dentry_unused list and the
parent is left with an anomalously large d_count. This doesn't ever seem
to get resolved, and when the directory's kobject is deleted the reference
you added doesn't get dropped.

Alan Stern


2004-03-29 06:48:40

by Maneesh Soni

[permalink] [raw]
Subject: Re: BUG: Problem with your patches for sysfs from 2 weeks ago

On Sat, Mar 27, 2004 at 05:25:07PM -0500, Alan Stern wrote:
> Maneesh:
>
> I've been tracing a problem with sysfs that starting showing up just
> recently, and it seems likely to have originated with your patches from a
> couple of weeks ago. I can't tell exactly what's wrong or fix it because
> I don't understand the filesystem layer well enough.
>
> The problem I found (there may be others) shows up when trying to delete a
> nonexistent symlink -- presumably trying to delete a nonexistent file
> would have a similar result. The code in sysfs_hash_and_remove() does a
> lookup on the nonexistent name and sysfs_get_dentry() returns a
> newly-allocated dentry. Creating the new entry increments the parent
> directory's d_count, of course. But at the end of the routine, when
> dput() is called for the new dentry, the parent's d_count does _not_ get
> decremented. The new dentry is placed on the dentry_unused list and the
> parent is left with an anomalously large d_count. This doesn't ever seem
> to get resolved, and when the directory's kobject is deleted the reference
> you added doesn't get dropped.
>

I see the problem here. Probably same will happen even if a non-existent
regular file is deleted. Negative dentry and the ref. taken on the parent
should go away eventually when there is a memory pressure. But the situation
here needs immediate removal of the negative dentry.

This could be solved partially if sysfs_hash_and_remove() handles negative
dentries as well by unhashing them before calling dput().

Partially because sysfs_hash_and_remove() is called only if some kernel
level code (drivers) issues sysfs_remove_file() or sysfs_create_file().
But if a user just goes to the corresponding directory and does rm
for a non-existent file, I guess we will have the same situation. This needs
some solution.

The patch (sysfs-pin-kobject.patch) I did was required for oops
(Use after free) situations like this

=============================================================================
Unable to handle kernel paging request at virtual address db368dc0
printing eip:
c01f8690
*pde = 5a5a5a5a
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c01f8690>] Not tainted
EFLAGS: 00010282
EIP is at kobject_get+0x10/0x50
eax: db368da8 ebx: db368da8 ecx: 00000001 edx: d6d0ff68
esi: db69ee88 edi: db69ee88 ebp: d7ed3f0c esp: d7ed3ef8
ds: 007b es: 007b ss: 0068
Process cat (pid: 30394, threadinfo=d7ed2000 task=d63dc9b0)
Stack: c014a447 d6d0ff68 c1fb9aec 00000001 d6d0ff68 d7ed3f30 c019f500 db368da8
d7ed3f30 c016601b d6d0ff68 d6d0ff68 db69ee88 c1fa9c44 d7ed3f50 c01640ec
db69ee88 d6d0ff68 ffffffe9 00008000 00008000 d975e000 d7ed3f9c c0163fa6
Call Trace:
[<c014a447>] kmem_cache_alloc+0x107/0x240
[<c019f500>] check_perm+0x20/0x190
[<c016601b>] get_empty_filp+0x7b/0x110
[<c01640ec>] dentry_open+0x13c/0x1f0
[<c0163fa6>] filp_open+0x66/0x70
[<c0164545>] sys_open+0x55/0x90
[<c010b651>] sysenter_past_esp+0x52/0x71

Code: 8b 43 18 85 c0 74 0d f0 ff 43 18 89 d8 8b 5d fc 89 ec 5d c3


--
Maneesh Soni
Linux Technology Center,
IBM Software Lab, Bangalore, India
email: [email protected]
Phone: 91-80-25044999 Fax: 91-80-25268553
T/L : 9243696

2004-03-29 21:05:13

by Greg KH

[permalink] [raw]
Subject: Re: BUG: Problem with your patches for sysfs from 2 weeks ago

On Mon, Mar 29, 2004 at 12:23:09PM +0530, Maneesh Soni wrote:
> On Sat, Mar 27, 2004 at 05:25:07PM -0500, Alan Stern wrote:
> > Maneesh:
> >
> > I've been tracing a problem with sysfs that starting showing up just
> > recently, and it seems likely to have originated with your patches from a
> > couple of weeks ago. I can't tell exactly what's wrong or fix it because
> > I don't understand the filesystem layer well enough.
> >
> > The problem I found (there may be others) shows up when trying to delete a
> > nonexistent symlink -- presumably trying to delete a nonexistent file
> > would have a similar result. The code in sysfs_hash_and_remove() does a
> > lookup on the nonexistent name and sysfs_get_dentry() returns a
> > newly-allocated dentry. Creating the new entry increments the parent
> > directory's d_count, of course. But at the end of the routine, when
> > dput() is called for the new dentry, the parent's d_count does _not_ get
> > decremented. The new dentry is placed on the dentry_unused list and the
> > parent is left with an anomalously large d_count. This doesn't ever seem
> > to get resolved, and when the directory's kobject is deleted the reference
> > you added doesn't get dropped.
> >
>
> I see the problem here. Probably same will happen even if a non-existent
> regular file is deleted. Negative dentry and the ref. taken on the parent
> should go away eventually when there is a memory pressure. But the situation
> here needs immediate removal of the negative dentry.

Yes.

> This could be solved partially if sysfs_hash_and_remove() handles negative
> dentries as well by unhashing them before calling dput().

Care to fix this?

> Partially because sysfs_hash_and_remove() is called only if some kernel
> level code (drivers) issues sysfs_remove_file() or sysfs_create_file().
> But if a user just goes to the corresponding directory and does rm
> for a non-existent file, I guess we will have the same situation. This needs
> some solution.

Agreed.

> The patch (sysfs-pin-kobject.patch) I did was required for oops
> (Use after free) situations like this

Yes, I agree that it was needed, just that your fix now causes other
problems. Actually these were hinted at in the previous objections to
this patch (the PCMCIA issues no one could figure out.) Thanks to Alan,
we have now pinpointed the true cause.

So, as your fix for one problem caused another one, care to fix this
too? :)

thanks,

greg k-h