Hello!
Problem code is in register_disk, and though it was included at the end of
November, but problems started to appear very recently.
At the beginning of register_disk() we see this snippet:
s = strchr(disk->kobj.name, '/');
if (s)
*s = '!';
kobject_register(&disk->kobj);
disk_sysfs_symlinks(disk);
And now kobject_register() frees disk structure if registration was
unsuccesful, so doing anything else with this structure is not very
nice thing. This is especially observable if you turn on SLAB poisoning,
then it will die during disk_sysfs_symlinks (actually two levels deeper).
I think here we should at least check return value of kobject_register
and print something useful. This won't help much against crashing since
accessing /proc/partition afterwards will still kill the box (in case
of SLAB poisoning, otherwise some other unpredictable stuff will happen).
It seems that making register_disk() to return error back to caller
does not makes much sence either, since struct gendisk is already destroyed
and we cannot get it off the lists that easily.
May be it was just a bad decision to destroy whole structure of underlying
object when kobject creation have failed?
Anyway at least below patch is a good thing to easy problem diagnostics
when kobject_register() fails during registering block devices (e.g. because
some of them have same base name).
Bye,
Oleg
===== fs/partitions/check.c 1.88 vs edited =====
--- 1.88/fs/partitions/check.c Thu Nov 21 06:08:39 2002
+++ edited/fs/partitions/check.c Sat Nov 23 15:17:28 2002
@@ -400,13 +400,23 @@
struct block_device *bdev;
char *s;
int j;
+ int error;
+ /* In case of error disk structure will be freed, so we cannot
+ dereference it. Let's save some useful info so that we can
+ print meaningful error message. */
+ int minor=disk->first_minor, major=disk->major;
strncpy(disk->kobj.name,disk->disk_name,KOBJ_NAME_LEN);
/* ewww... some of these buggers have / in name... */
s = strchr(disk->kobj.name, '/');
if (s)
*s = '!';
- kobject_register(&disk->kobj);
+
+ error = kobject_register(&disk->kobj);
+ if ( error ) {
+ printk(KERN_ERR "%s: Failed to create kernel object for disk major %d, first minor %d. Reason: %d\n", __FUNCTION__, major, minor, error );
+ return;
+ }
disk_sysfs_symlinks(disk);
if (disk->flags & GENHD_FL_CD)
> Problem code is in register_disk, and though it was included at the end of
> November, but problems started to appear very recently.
>
> At the beginning of register_disk() we see this snippet:
>
> s = strchr(disk->kobj.name, '/');
> if (s)
> *s = '!';
> kobject_register(&disk->kobj);
> disk_sysfs_symlinks(disk);
>
> And now kobject_register() frees disk structure if registration was
> unsuccesful, so doing anything else with this structure is not very
> nice thing. This is especially observable if you turn on SLAB poisoning,
> then it will die during disk_sysfs_symlinks (actually two levels deeper).
>
> I think here we should at least check return value of kobject_register
> and print something useful. This won't help much against crashing since
> accessing /proc/partition afterwards will still kill the box (in case
> of SLAB poisoning, otherwise some other unpredictable stuff will happen).
>
> It seems that making register_disk() to return error back to caller
> does not makes much sence either, since struct gendisk is already destroyed
> and we cannot get it off the lists that easily.
>
> May be it was just a bad decision to destroy whole structure of underlying
> object when kobject creation have failed?
Thanks for catching this. There are two things wrong: register_disk() is
calling kobject_register() instead of kobject_add(). This is fixed with a
patch I posted earlier today.
Also, though it is now irrelevant in this case, kobject_register() was
causing the object to be freed if kobject_add() failed. This is a bit
rude, though kobject_add() really shoudln't fail. It can fail because of
these reasons:
- sysfs isn't mounted yet.
- we couldn't get an inode.
- the file or directory already existed.
In each case, we probably want to know about it. So, the patch below
changes the call to kobject_cleanup() to a WARN_ON(error) to flag it, but
continue going..
Thanks,
-pat
===== lib/kobject.c 1.8 vs edited =====
--- 1.8/lib/kobject.c Thu Nov 21 17:01:53 2002
+++ edited/lib/kobject.c Mon Nov 25 16:01:53 2002
@@ -110,8 +110,7 @@
if (kobj) {
kobject_init(kobj);
error = kobject_add(kobj);
- if (error)
- kobject_cleanup(kobj);
+ WARN_ON(error);
} else
error = -EINVAL;
return error;