2003-09-14 13:38:02

by Andrey Borzenkov

[permalink] [raw]
Subject: [PATCH][2.6.0-test5] fix oopses is kobject parent is removed before child

It is possible that parent is removed before child when child is in use.
Trivial example is mounted USB storage when you unplug it. The kobject for
USB device is removed but subordinate SCSI device remains. Then kernel oopses
on attempt to release child e.g. umount removed USB storage. This patch fixes
two problems:

- kset_hotplug. It oopses in get_kobj_path_length because child->parent points
to nowhere - even if parent has not yet been overwritten, its name is already
freed. Common oops I get is

Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01caf50
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c01caf50>] Tainted: P
EFLAGS: 00010246
EIP is at get_kobj_path_length+0x40/0x60
eax: 00000000 ebx: c87db0fc ecx: ffffffff edx: 00000001
esi: ffffffff edi: 00000000 ebp: c6a7fe14 esp: c6a7fdf8
ds: 007b es: 007b ss: 0068
Process umount (pid: 2961, threadinfo=c6a7e000 task=c6f25000)
Stack: c02ecd9e c02ce570 00000000 00000017 c6a7e000 c6cd66a8 000000a7 c6a7fe64
c01cb175 c0346ec0 c8ae51b8 000000a7 c6a7fe40 d1c62e6f c7351d34 c8ae5194
d1c64e64 c70a2004 00000000 c70a201d c03131c0 d1943dbb 00000000 d19543ac
Call Trace:
[<c01cb175>] kset_hotplug+0x195/0x360
[<d1c62e6f>] sd_remove+0x7f/0xb0 [sd_mod]
[<c01cb768>] kobject_del+0x68/0x80
[<c0213f03>] device_del+0x83/0xb0
[<d193739c>] scsi_device_put+0xdc/0x100 [scsi_mod]
[<d1c615e4>] sd_release+0x64/0xb0 [sd_mod]
[<d1c61580>] sd_release+0x0/0xb0 [sd_mod]
[<c0178e46>] blkdev_put+0x236/0x270
[<c0178e04>] blkdev_put+0x1f4/0x270
[<c017716c>] kill_block_super+0x3c/0x50
[<c0175903>] deactivate_super+0xa3/0x170
[<c0191bcc>] sys_umount+0x3c/0x80
[<c0191c29>] sys_oldumount+0x19/0x20
[<c010c9ef>] syscall_call+0x7/0xb

the patch moves kobject_put for parent from unlink() into kobject_cleanup for
child making sure reference to parents exists for as long as child is there
and may use it.

- after this oops has been fixed I got next one now in sysfs. The problem is
sysfs_remove_dir would unlink all children including directories for
subordinate kobjects. Resulting in dget/dput mismatch. I usually got oops due
to the fact that d_delete in remove_dir would free inode and then
simple_rmdir would try to access it like in:

Unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip:
c01959b3
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c01959b3>] Tainted: P
EFLAGS: 00010202
EIP is at simple_rmdir+0x33/0x50
eax: 00000000 ebx: c71e8004 ecx: c71e800c edx: ffffffd9
esi: c766e004 edi: c76d4004 ebp: c7115e20 esp: c7115e10
ds: 007b es: 007b ss: 0068
Process umount (pid: 2703, threadinfo=c7114000 task=c93e4000)
Stack: c71e8004 c71df004 c766e074 c766e004 c7115e40 c01b4510 c766e004
c71e8004
c71e8004 c71e1004 c7114000 c71e8004 c7115e64 c01b4661 c71e8004 c71e1004
c71e8028 c7114000 c8a8e1b8 c0346f08 c86220bc c7115e7c c01cb733 c8a8e1b8
Call Trace:
[<c01b4510>] remove_dir+0x70/0xa0
[<c01b4661>] sysfs_remove_dir+0x111/0x1f0
[<c01cb733>] kobject_del+0x43/0x80
[<c0213f03>] device_del+0x83/0xb0
[<d193739c>] scsi_device_put+0xdc/0x100 [scsi_mod]
[<d1c615e4>] sd_release+0x64/0xb0 [sd_mod]
[<d1c61580>] sd_release+0x0/0xb0 [sd_mod]
[<c0178e46>] blkdev_put+0x236/0x270
[<c0178e04>] blkdev_put+0x1f4/0x270
[<c017716c>] kill_block_super+0x3c/0x50
[<c0175903>] deactivate_super+0xa3/0x170
[<c0191bcc>] sys_umount+0x3c/0x80
[<c0191c29>] sys_oldumount+0x19/0x20
[<c010c9ef>] syscall_call+0x7/0xb

The patch avoids calling extra d_delete/unlink on already-deleted dentry. I
hate this patch but anything better apparently requires complete redesign of
sysfs implementation. Unlinking busy directory is otherweise impossible and I
am afraid it will show itself somewhere else.

-andrey


Attachments:
(No filename) (3.78 kB)
2.6.0-test5-kobj-parent-ref-count.patch (1.32 kB)
Download all attachments

2003-09-16 17:56:16

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH][2.6.0-test5] fix oopses is kobject parent is removed before child

On Sun, Sep 14, 2003 at 05:37:04PM +0400, Andrey Borzenkov wrote:
> It is possible that parent is removed before child when child is in use.
> Trivial example is mounted USB storage when you unplug it. The kobject for
> USB device is removed but subordinate SCSI device remains. Then kernel oopses
> on attempt to release child e.g. umount removed USB storage. This patch fixes
> two problems:
>
> - kset_hotplug. It oopses in get_kobj_path_length because child->parent points
> to nowhere - even if parent has not yet been overwritten, its name is already
> freed. Common oops I get is

No, the scsi code should be fixed to prevent this from happening. This
used to happen in the past, but I thought the scsi people fixed it up.
The SCSI code should grab a reference on the parent device which will
prevent it from going away until the SCSI device does, preventing all of
these oopes.

thanks,

greg k-h

2003-09-19 22:51:09

by Mike Anderson

[permalink] [raw]
Subject: Re: [PATCH][2.6.0-test5] fix oopses is kobject parent is removed before child

Greg KH [[email protected]] wrote:
> On Sun, Sep 14, 2003 at 05:37:04PM +0400, Andrey Borzenkov wrote:
> > It is possible that parent is removed before child when child is in use.
> > Trivial example is mounted USB storage when you unplug it. The kobject for
> > USB device is removed but subordinate SCSI device remains. Then kernel oopses
> > on attempt to release child e.g. umount removed USB storage. This patch fixes
> > two problems:
> >
> > - kset_hotplug. It oopses in get_kobj_path_length because child->parent points
> > to nowhere - even if parent has not yet been overwritten, its name is already
> > freed. Common oops I get is
>
> No, the scsi code should be fixed to prevent this from happening. This
> used to happen in the past, but I thought the scsi people fixed it up.
> The SCSI code should grab a reference on the parent device which will
> prevent it from going away until the SCSI device does, preventing all of
> these oopes.
>
> thanks,
>
> greg k-h

Sorry it took long to reply. I setup both a uml system and another
system using the scsi_debug driver so I could step through the scsi ref
counts. The ref counts look ok. I attached a small cut prior to the
unmount. I believe the issue is that in scsi we are using the device_del
and device_put instead of calling device_register. This in itself will
not generate this problem, but to avoid an blk layer cleanup issue the
device_del for sd is being called when ref counts go to zero. This could
be some time after the device_del on the host was called. The kobject
put / release calls follow a standard cleanup path with the parent
staying in place until the last child goes away.

I did use the patch provided by Andrey for our current method and the
oops I was getting was gone which allowed me to get a non-garbled ref
count output.

While the patch seems to fix the problem, the better answer is to look
into what it would take have sd call the blk layer cleanup functions
during the remove even through there are still openers (having this
change in remove_dir may still be a good idea). When I looked at
this last I thought there was a need for a release function in the block
layer instead of calling del_gendisk directly. I will look into this.

-andmike
--
Michael Anderson
[email protected]

FYI ref count during umount

# umount /dev/sdd
SCSI error : <2 0 0 0> return code = 0x10000
end_request: I/O error, dev sdd, sector 2
Buffer I/O error on device sdd, logical block 1
lost page write due to I/O error on sdd
kobject get scsi_device: ref 14
kobject put queue: ref 3
kobject put iosched: ref 1
kobject put iosched: ref 0
kobject iosched: cleaning up
kobject put queue: ref 2
kobject put sdd: ref 4
kobject put queue: ref 1
kobject put queue: ref 0
kobject queue: cleaning up
kobject put sdd: ref 3
kobject put 2:0:0:0: ref 2
kobject put block: ref 42
kobject put sdd: ref 2
kobject put sdd: ref 1
kobject put scsi: ref 13
kobject put host2: ref 2
kobject put 2:0:0:0: ref 1
kobject put host2: ref 1
kobject put 2:0:0:0: ref 0
kobject 2:0:0:0: cleaning up
kobject put host2: ref 0
kobject host2: cleaning up
kobject put adapter1: ref 0
kobject adapter1: cleaning up
kobject put devices: ref 39
kobject put devices: ref 38
kobject put devices: ref 37
kobject put scsi_device: ref 13
kobject put sdd: ref 0
kobject sdd: cleaning up
kobject put block: ref 41