2008-10-29 15:13:37

by Peter Korsgaard

Subject: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

Hi,

I'm seeing what looks like a kobject reference count issue with
mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.

The issue is that the devices/virtual/bdi/31:<index> entry doesn't get
released on unbind, so the next bind BUGs with:

mtd_dataflash spi28672.1: AT45DB041x (528 KBytes) pagesize 264 bytes
Creating 2 MTD partitions on "xdb":
0x00000000-0x00000108 : "xdb_id"
blktrans_notify_add(c35258f0), type = 6
sysfs: duplicate filename '31:12' can not be created
------------[ cut here ]------------
Badness at fs/sysfs/dir.c:462
NIP: c0095e74 LR: c0095e74 CTR: c00ed798
REGS: c35c1b00 TRAP: 0700 Not tainted (2.6.28-rc2)
MSR: 00029032 <EE,ME,IR,DR> CR: 22002082 XER: 20000000
TASK = c34f1000[310] 'sh' THREAD: c35c0000
GPR00: c0095e74 c35c1bb0 c34f1000 00000038 00001c5f ffffffff c00eddd0 c0249b58
GPR08: c024a004 c023e52c 00001c5f 00000001 44002084 efffffff c00fe4bc c00fe408
GPR16: c00fecb0 c0100000 c00fe618 c00fe3a8 c00fe348 c00fe570 c00febdc 00000000
GPR24: c35bc20c c35c1cd8 c01e145c c35c1bf8 c387ca28 c35c1bc8 c355cacc ffffffef
NIP [c0095e74] sysfs_add_one+0x34/0x50
LR [c0095e74] sysfs_add_one+0x34/0x50
Call Trace:
[c35c1bb0] [c0095e74] sysfs_add_one+0x34/0x50 (unreliable)
[c35c1bc0] [c0096464] create_dir+0x58/0xac
[c35c1bf0] [c0096504] sysfs_create_dir+0x4c/0x70
[c35c1c10] [c00cb748] kobject_add_internal+0xc4/0x1b4
[c35c1c30] [c00cbd40] kobject_add+0x80/0x98
[c35c1c60] [c00f1adc] device_add+0x8c/0x490
[c35c1ca0] [c00f1fa4] device_create_vargs+0x94/0xcc
[c35c1cd0] [c004d328] bdi_register+0x64/0x8c
[c35c1d10] [c00c82b4] add_disk+0xe0/0x118
[c35c1d40] [c0101254] add_mtd_blktrans_dev+0x258/0x278
[c35c1d60] [c0101468] mtdblock_add_mtd+0x5c/0x70
[c35c1d70] [c01008f8] blktrans_notify_add+0x58/0x98
[c35c1d90] [c00fde90] add_mtd_device+0xdc/0x140
[c35c1db0] [c00ff3d0] add_mtd_partitions+0x594/0x5d0
[c35c1e00] [c01afd38] add_dataflash_otp+0x190/0x1cc
[c35c1e30] [c01b0000] dataflash_probe+0x28c/0x2ac
[c35c1e60] [c01056f0] spi_drv_probe+0x2c/0x3c
[c35c1e70] [c00f43b4] driver_probe_device+0xe8/0x18c
[c35c1e90] [c00f36b8] driver_bind+0x74/0xd4
[c35c1eb0] [c00f2d10] drv_attr_store+0x34/0x44
[c35c1ec0] [c0095310] sysfs_write_file+0x130/0x1a0
[c35c1ef0] [c005c9d8] vfs_write+0xb8/0x104
[c35c1f10] [c005ce18] sys_write+0x4c/0x8c
[c35c1f40] [c000f3bc] ret_from_syscall+0x0/0x38

I don't see the problem without mtd partitions or with only 1
partition, so it seems related to the partition code.

If I enable kobject debugging I see the following:

bind:

mtd_dataflash spi28672.1: AT45DB041x (528 KBytes) pagesize 264 bytes
Creating 2 MTD partitions on "xdb":
0x00000000-0x00000108 : "xdb_id"
blktrans_notify_add(c35aba44), type = 6
kobject: 'mtdblock12' (c35d6af4): kobject_add_internal: parent: 'block', set: 'devices'
kobject: 'mtdblock12' (c35d6af4): kobject_uevent_env
kobject: 'mtdblock12' (c35d6af4): kobject_uevent_env: filter function caused the event to drop!
kobject: 'holders' (c35abe58): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'slaves' (c35abec4): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'mtdblock12' (c35d6af4): kobject_uevent_env
kobject: 'mtdblock12' (c35d6af4): fill_kobj_path: path = '/devices/virtual/block/mtdblock12'
kobject: 'queue' (c3477a44): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: filter function caused the event to drop!
kobject: 'iosched' (c3856498): kobject_add_internal: parent: 'queue', set: '<NULL>'
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: filter function caused the event to drop!
kobject: '31:12' (c35d6ca8): kobject_add_internal: parent: 'bdi', set: 'devices'
kobject: '31:12' (c35d6ca8): kobject_uevent_env
kobject: '31:12' (c35d6ca8): fill_kobj_path: path = '/devices/virtual/bdi/31:12'
kobject: 'mtd12' (c35d6ddc): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd12' (c35d6ddc): kobject_uevent_env
kobject: 'mtd12' (c35d6ddc): fill_kobj_path: path = '/devices/virtual/mtd/mtd12'
kobject: 'mtd12ro' (c35d6f10): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd12ro' (c35d6f10): kobject_uevent_env
kobject: 'mtd12ro' (c35d6f10): fill_kobj_path: path = '/devices/virtual/mtd/mtd12ro'
0x00000108-0x00084000 : "xdb_fpga"
blktrans_notify_add(c3506730), type = 6
kobject: 'mtdblock13' (c35dc284): kobject_add_internal: parent: 'block', set: 'devices'
kobject: 'mtdblock13' (c35dc284): kobject_uevent_env
kobject: 'mtdblock13' (c35dc284): kobject_uevent_env: filter function caused the event to drop!
kobject: 'holders' (c3543cc0): kobject_add_internal: parent: 'mtdblock13', set: '<NULL>'
kobject: 'slaves' (c35a62f0): kobject_add_internal: parent: 'mtdblock13', set: '<NULL>'
kobject: 'mtdblock13' (c35dc284): kobject_uevent_env
kobject: 'mtdblock13' (c35dc284): fill_kobj_path: path = '/devices/virtual/block/mtdblock13'
kobject: 'queue' (c3477a44): kobject_add_internal: parent: 'mtdblock13', set: '<NULL>'
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: filter function caused the event to drop!
kobject: 'iosched' (c3856498): kobject_add_internal: parent: 'queue', set: '<NULL>'
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: filter function caused the event to drop!
kobject: '31:13' (c35dc06c): kobject_add_internal: parent: 'bdi', set: 'devices'
kobject: '31:13' (c35dc06c): kobject_uevent_env
kobject: '31:13' (c35dc06c): fill_kobj_path: path = '/devices/virtual/bdi/31:13'
kobject: 'mtd13' (c35dc438): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd13' (c35dc438): kobject_uevent_env
kobject: 'mtd13' (c35dc438): fill_kobj_path: path = '/devices/virtual/mtd/mtd13'
kobject: 'mtd13ro' (c35dc56c): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd13ro' (c35dc56c): kobject_uevent_env
kobject: 'mtd13ro' (c35dc56c): fill_kobj_path: path = '/devices/virtual/mtd/mtd13ro'

unbind:

blktrans_notify_remove(c3506730), type = 6
kobject: '31:13' (c35dc06c): kobject_uevent_env
kobject: '31:13' (c35dc06c): fill_kobj_path: path = '/devices/virtual/bdi/31:13'
kobject: '31:13' (c35dc06c): kobject_cleanup
kobject: '31:13' (c35dc06c): calling ktype release
kobject: '31:13': free name
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: filter function caused the event to drop!
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: filter function caused the event to drop!
kobject: 'holders' (c3543cc0): kobject_cleanup
kobject: 'holders' (c3543cc0): auto cleanup kobject_del
kobject: 'holders' (c3543cc0): calling ktype release
kobject: (c3543cc0): dynamic_kobj_release
kobject: 'holders': free name
kobject: 'slaves' (c35a62f0): kobject_cleanup
kobject: 'slaves' (c35a62f0): auto cleanup kobject_del
kobject: 'slaves' (c35a62f0): calling ktype release
kobject: (c35a62f0): dynamic_kobj_release
kobject: 'slaves': free name
kobject: 'mtdblock13' (c35dc284): kobject_uevent_env
kobject: 'mtdblock13' (c35dc284): fill_kobj_path: path = '/devices/virtual/block/mtdblock13'
kobject: 'mtdblock13' (c35dc284): kobject_cleanup
kobject: 'mtdblock13' (c35dc284): calling ktype release
kobject: 'mtdblock13': free name
kobject: 'mtd13' (c35dc438): kobject_uevent_env
kobject: 'mtd13' (c35dc438): fill_kobj_path: path = '/devices/virtual/mtd/mtd13'
kobject: 'mtd13' (c35dc438): kobject_cleanup
kobject: 'mtd13' (c35dc438): calling ktype release
kobject: 'mtd13': free name
kobject: 'mtd13ro' (c35dc56c): kobject_uevent_env
kobject: 'mtd13ro' (c35dc56c): fill_kobj_path: path = '/devices/virtual/mtd/mtd13ro'
kobject: 'mtd13ro' (c35dc56c): kobject_cleanup
kobject: 'mtd13ro' (c35dc56c): calling ktype release
kobject: 'mtd13ro': free name
blktrans_notify_remove(c35aba44), type = 6
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: attempted to send uevent without kset!
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: attempted to send uevent without kset!
kobject: 'holders' (c35abe58): kobject_cleanup
kobject: 'holders' (c35abe58): auto cleanup kobject_del
kobject: 'holders' (c35abe58): calling ktype release
kobject: (c35abe58): dynamic_kobj_release
kobject: 'holders': free name
kobject: 'slaves' (c35abec4): kobject_cleanup
kobject: 'slaves' (c35abec4): auto cleanup kobject_del
kobject: 'slaves' (c35abec4): calling ktype release
kobject: (c35abec4): dynamic_kobj_release
kobject: 'slaves': free name
kobject: 'mtdblock12' (c35d6af4): kobject_uevent_env
kobject: 'mtdblock12' (c35d6af4): fill_kobj_path: path = '/devices/virtual/block/mtdblock12'
kobject: 'mtd12' (c35d6ddc): kobject_uevent_env
kobject: 'mtd12' (c35d6ddc): fill_kobj_path: path = '/devices/virtual/mtd/mtd12'
kobject: 'mtd12' (c35d6ddc): kobject_cleanup
kobject: 'mtd12' (c35d6ddc): calling ktype release
kobject: 'mtd12': free name
kobject: 'mtd12ro' (c35d6f10): kobject_uevent_env
kobject: 'mtd12ro' (c35d6f10): fill_kobj_path: path = '/devices/virtual/mtd/mtd12ro'
kobject: 'mtd12ro' (c35d6f10): kobject_cleanup
kobject: 'mtd12ro' (c35d6f10): calling ktype release
kobject: 'mtd12ro': free name

The 'attempted to send uevent without kset!' message seems odd, as does
the fact that 31:12 isn't removed and release isn't called for mtdblock12.

To compare, here's the output with only 1 partition:

bind:

mtd_dataflash spi28672.1: AT45DB041x (528 KBytes) pagesize 264 bytes
Creating 1 MTD partitions on "xdb":
0x00000000-0x00000108 : "xdb_id"
blktrans_notify_add(c354823c), type = 6
kobject: 'mtdblock12' (c35c40a4): kobject_add_internal: parent: 'block', set: 'devices'
kobject: 'mtdblock12' (c35c40a4): kobject_uevent_env
kobject: 'mtdblock12' (c35c40a4): kobject_uevent_env: filter function caused the event to drop!
kobject: 'holders' (c355ef34): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'slaves' (c3522130): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'mtdblock12' (c35c40a4): kobject_uevent_env
kobject: 'mtdblock12' (c35c40a4): fill_kobj_path: path = '/devices/virtual/block/mtdblock12'
kobject: 'queue' (c3477a44): kobject_add_internal: parent: 'mtdblock12', set: '<NULL>'
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: filter function caused the event to drop!
kobject: 'iosched' (c3856498): kobject_add_internal: parent: 'queue', set: '<NULL>'
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: filter function caused the event to drop!
kobject: '31:12' (c35c4258): kobject_add_internal: parent: 'bdi', set: 'devices'
kobject: '31:12' (c35c4258): kobject_uevent_env
kobject: '31:12' (c35c4258): fill_kobj_path: path = '/devices/virtual/bdi/31:12'
kobject: 'mtd12' (c35c438c): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd12' (c35c438c): kobject_uevent_env
kobject: 'mtd12' (c35c438c): fill_kobj_path: path = '/devices/virtual/mtd/mtd12'
kobject: 'mtd12ro' (c35c44c0): kobject_add_internal: parent: 'mtd', set: 'devices'
kobject: 'mtd12ro' (c35c44c0): kobject_uevent_env
kobject: 'mtd12ro' (c35c44c0): fill_kobj_path: path = '/devices/virtual/mtd/mtd12ro'

unbind:

blktrans_notify_remove(c354823c), type = 6
kobject: '31:12' (c35c4258): kobject_uevent_env
kobject: '31:12' (c35c4258): fill_kobj_path: path = '/devices/virtual/bdi/31:12'
kobject: '31:12' (c35c4258): kobject_cleanup
kobject: '31:12' (c35c4258): calling ktype release
kobject: '31:12': free name
kobject: 'iosched' (c3856498): kobject_uevent_env
kobject: 'iosched' (c3856498): kobject_uevent_env: filter function caused the event to drop!
kobject: 'queue' (c3477a44): kobject_uevent_env
kobject: 'queue' (c3477a44): kobject_uevent_env: filter function caused the event to drop!
kobject: 'holders' (c355ef34): kobject_cleanup
kobject: 'holders' (c355ef34): auto cleanup kobject_del
kobject: 'holders' (c355ef34): calling ktype release
kobject: (c355ef34): dynamic_kobj_release
kobject: 'holders': free name
kobject: 'slaves' (c3522130): kobject_cleanup
kobject: 'slaves' (c3522130): auto cleanup kobject_del
kobject: 'slaves' (c3522130): calling ktype release
kobject: (c3522130): dynamic_kobj_release
kobject: 'slaves': free name
kobject: 'mtdblock12' (c35c40a4): kobject_uevent_env
kobject: 'mtdblock12' (c35c40a4): fill_kobj_path: path = '/devices/virtual/block/mtdblock12'
kobject: 'mtdblock12' (c35c40a4): kobject_cleanup
kobject: 'mtdblock12' (c35c40a4): calling ktype release
kobject: 'mtdblock12': free name
kobject: 'mtd12' (c35c438c): kobject_uevent_env
kobject: 'mtd12' (c35c438c): fill_kobj_path: path = '/devices/virtual/mtd/mtd12'
kobject: 'mtd12' (c35c438c): kobject_cleanup
kobject: 'mtd12' (c35c438c): calling ktype release
kobject: 'mtd12': free name
kobject: 'mtd12ro' (c35c44c0): kobject_uevent_env
kobject: 'mtd12ro' (c35c44c0): fill_kobj_path: path = '/devices/virtual/mtd/mtd12ro'
kobject: 'mtd12ro' (c35c44c0): kobject_cleanup
kobject: 'mtd12ro' (c35c44c0): calling ktype release
kobject: 'mtd12ro': free name

I don't know the mtd and/or block subsystems well enough to know
where to look. Any ideas?

--
Bye, Peter Korsgaard


2008-10-29 23:22:55

by Rafael J. Wysocki

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
> Hi,
>
> I'm seeing what looks like a kobject reference count issue with
> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.

Is it reproducible with 2.6.26 too?

Rafael

2008-10-29 23:28:40

by Peter Korsgaard

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:

Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
>> Hi,
>>
>> I'm seeing what looks like a kobject reference count issue with
>> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
>> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.

Rafael> Is it reproducible with 2.6.26 too?

Sorry, I haven't backported my platform support code to such an "old"
kernel. I can do it though, if you think it will help pinpoint the
issue.

--
Bye, Peter Korsgaard

2008-10-30 21:51:53

by Kay Sievers

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Thu, Oct 30, 2008 at 00:28, Peter Korsgaard <[email protected]> wrote:
>>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:
>
> Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
> >> Hi,
> >>
> >> I'm seeing what looks like a kobject reference count issue with
> >> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
> >> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.
>
> Rafael> Is it reproducible with 2.6.26 too?
>
> Sorry, I haven't backported my platform support code to such "old"
> kernel. I can do it though, if you think it will help pinpoint the
> issue.

This sounds like a possible reason for the problem:
"After digging into the mtd code, this bug is not related to our driver. It
should be a subtle bug in mtd core code.

In add_mtd_partition, for 2 partitions, 2 gendisk structures will be
allocated. But these 2 gendisk->queue will be set to the same
request_queue. Then when unregistering the 1st partition, from the
same request_queue->backing_dev_info, the bdi struct will be set to
NULL. So for the 2nd partition (bdi == NULL), the sysfs dir of 2nd
partition will not be removed. Finally, when modprobe the module
again, the 2nd partition won't be added"
https://blackfin.uclinux.org/gf/tracker/4463
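
To make the quoted explanation concrete, here is a small user-space model of the sequence. It is only a sketch: the toy_* names and structures are invented for illustration and are not the kernel's definitions, and the teardown behaviour (removal only unregisters whatever the shared bdi currently points at, then clears it) is an assumption taken from the quoted description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8

/* Toy stand-ins, not the real kernel definitions. */
struct toy_bdi   { int dev; };             /* minor of registered entry, 0 = none */
struct toy_queue { struct toy_bdi bdi; };  /* one bdi per request queue */
struct toy_disk  { int minor; struct toy_queue *queue; };

static int toy_sysfs[TOY_MAX_ENTRIES];     /* live "31:<minor>" bdi entries */
static size_t toy_sysfs_len;

static bool toy_sysfs_has(int minor)
{
	for (size_t i = 0; i < toy_sysfs_len; i++)
		if (toy_sysfs[i] == minor)
			return true;
	return false;
}

static void toy_sysfs_remove(int minor)
{
	for (size_t i = 0; i < toy_sysfs_len; i++) {
		if (toy_sysfs[i] == minor) {
			/* Fill the hole with the last entry. */
			toy_sysfs[i] = toy_sysfs[--toy_sysfs_len];
			return;
		}
	}
}

/* Models the unpatched add_disk(): always registers, overwriting bdi->dev. */
static bool toy_add_disk(struct toy_disk *disk)
{
	if (toy_sysfs_has(disk->minor))
		return false;                  /* "duplicate filename" */
	assert(toy_sysfs_len < TOY_MAX_ENTRIES);
	toy_sysfs[toy_sysfs_len++] = disk->minor;
	disk->queue->bdi.dev = disk->minor;
	return true;
}

/* Removal only tears down whatever bdi->dev currently points at. */
static void toy_del_disk(struct toy_disk *disk)
{
	struct toy_bdi *bdi = &disk->queue->bdi;

	if (bdi->dev) {
		toy_sysfs_remove(bdi->dev);
		bdi->dev = 0;
	}
}

/* Returns true if the second bind hits the duplicate-name failure. */
bool toy_rebind_fails(void)
{
	struct toy_queue q = { .bdi = { .dev = 0 } };
	struct toy_disk d12 = { 12, &q }, d13 = { 13, &q };

	toy_sysfs_len = 0;
	toy_add_disk(&d12);          /* bind: both disks share q */
	toy_add_disk(&d13);
	toy_del_disk(&d13);          /* unbind, same order as in the log */
	toy_del_disk(&d12);          /* bdi.dev already cleared: entry 12 leaks */
	return !toy_add_disk(&d12);  /* bind again */
}
```

In this model the entry for minor 12 is still present after both disks are removed, which corresponds to the stale 31:12 entry in Peter's log.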

Kay

2008-10-30 23:47:37

by Kay Sievers

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Thu, Oct 30, 2008 at 22:51, Kay Sievers <[email protected]> wrote:
> On Thu, Oct 30, 2008 at 00:28, Peter Korsgaard <[email protected]> wrote:
>>>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:
>>
>> Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
>> >> Hi,
>> >>
>> >> I'm seeing what looks like a kobject reference count issue with
>> >> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
>> >> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.
>>
>> Rafael> Is it reproducible with 2.6.26 too?
>>
>> Sorry, I haven't backported my platform support code to such "old"
>> kernel. I can do it though, if you think it will help pinpoint the
>> issue.
>
> This sounds like a possible reason for the problem:
> "After digging into the mtd code, this bug is not related to our driver. It
> should be a subtle bug in mtd core code.
>
> In add_mtd_partition, for 2 partitions, 2 gendisk structures will be
> allocated. But these 2 gendisk->queue will be set to the same
> request_queue. Then when unregistering the 1st partition, from the
> same request_queue->backing_dev_info, the bdi struct will be set to
> NULL. So for the 2nd partition (bdi == NULL), the sysfs dir of 2nd
> partition will not be removed. Finally, when modprobe the module
> again, the 2nd partition won't be added"
> https://blackfin.uclinux.org/gf/tracker/4463

Looks like a bdi issue:
http://lkml.org/lkml/2008/10/30/519

Peter, if I do this (whitespace mangled, just pasted in here), the
error goes away for me. Can you try this?

Thanks,
Kay

--- a/block/genhd.c
+++ b/block/genhd.c
@@ -515,7 +515,8 @@ void add_disk(struct gendisk *disk)
 	blk_register_queue(disk);

 	bdi = &disk->queue->backing_dev_info;
-	bdi_register_dev(bdi, disk_devt(disk));
+	if (!bdi->dev)
+		bdi_register_dev(bdi, disk_devt(disk));
 	retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
 				   "bdi");
 	WARN_ON(retval);

2008-10-31 04:22:44

by Bryan Wu

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Fri, Oct 31, 2008 at 5:51 AM, Kay Sievers <[email protected]> wrote:
> On Thu, Oct 30, 2008 at 00:28, Peter Korsgaard <[email protected]> wrote:
>>>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:
>>
>> Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
>> >> Hi,
>> >>
>> >> I'm seeing what looks like a kobject reference count issue with
>> >> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
>> >> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.
>>
>> Rafael> Is it reproducible with 2.6.26 too?
>>
>> Sorry, I haven't backported my platform support code to such "old"
>> kernel. I can do it though, if you think it will help pinpoint the
>> issue.
>
> This sounds like a possible reason for the problem:
> "After digging into the mtd code, this bug is not related to our driver. It
> should be a subtle bug in mtd core code.
>
> In add_mtd_partition, for 2 partitions, 2 gendisk structures will be
> allocated. But these 2 gendisk->queue will be set to the same
> request_queue. Then when unregistering the 1st partition, from the
> same request_queue->backing_dev_info, the bdi struct will be set to
> NULL. So for the 2nd partition (bdi == NULL), the sysfs dir of 2nd
> partition will not be removed. Finally, when modprobe the module
> again, the 2nd partition won't be added"
> https://blackfin.uclinux.org/gf/tracker/4463
>

Yes, I found a similar issue on Blackfin. Kernels 2.6.26, 2.6.27
and the latest 2.6.28-rc2 all have this bug.

-Bryan

2008-10-31 04:25:15

by Bryan Wu

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Fri, Oct 31, 2008 at 7:47 AM, Kay Sievers <[email protected]> wrote:
> On Thu, Oct 30, 2008 at 22:51, Kay Sievers <[email protected]> wrote:
>> On Thu, Oct 30, 2008 at 00:28, Peter Korsgaard <[email protected]> wrote:
>>>>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:
>>>
>>> Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
>>> >> Hi,
>>> >>
>>> >> I'm seeing what looks like a kobject reference count issue with
>>> >> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
>>> >> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.
>>>
>>> Rafael> Is it reproducible with 2.6.26 too?
>>>
>>> Sorry, I haven't backported my platform support code to such "old"
>>> kernel. I can do it though, if you think it will help pinpoint the
>>> issue.
>>
>> This sounds like a possible reason for the problem:
>> "After digging into the mtd code, this bug is not related to our driver. It
>> should be a subtle bug in mtd core code.
>>
>> In add_mtd_partition, for 2 partitions, 2 gendisk structures will be
>> allocated. But these 2 gendisk->queue will be set to the same
>> request_queue. Then when unregistering the 1st partition, from the
>> same request_queue->backing_dev_info, the bdi struct will be set to
>> NULL. So for the 2nd partition (bdi == NULL), the sysfs dir of 2nd
>> partition will not be removed. Finally, when modprobe the module
>> again, the 2nd partition won't be added"
>> https://blackfin.uclinux.org/gf/tracker/4463
>
> Looks like a bdi issue:
> http://lkml.org/lkml/2008/10/30/519
>
> Peter, if I do this (whitespace mangled, just pasted in here), the
> error goes away for me. Can you try this?
>
> Thanks,
> Kay
>
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -515,7 +515,8 @@ void add_disk(struct gendisk *disk)
> blk_register_queue(disk);
>
> bdi = &disk->queue->backing_dev_info;
> - bdi_register_dev(bdi, disk_devt(disk));
> + if (!bdi->dev)
> + bdi_register_dev(bdi, disk_devt(disk));
> retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
> "bdi");
> WARN_ON(retval);

IMHO, this is a workaround, right? I think the final solution should
give every 'gendisk' a dedicated 'bdi', so that they don't interfere
with and overwrite each other.

-Bryan

2008-10-31 06:27:08

by Kay Sievers

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

On Fri, Oct 31, 2008 at 05:24, Bryan Wu <[email protected]> wrote:
> On Fri, Oct 31, 2008 at 7:47 AM, Kay Sievers <[email protected]> wrote:
>> On Thu, Oct 30, 2008 at 22:51, Kay Sievers <[email protected]> wrote:
>>> On Thu, Oct 30, 2008 at 00:28, Peter Korsgaard <[email protected]> wrote:
>>>>>>>>> "Rafael" == Rafael J Wysocki <[email protected]> writes:
>>>>
>>>> Rafael> On Wednesday, 29 of October 2008, Peter Korsgaard wrote:
>>>> >> Hi,
>>>> >>
>>>> >> I'm seeing what looks like a kobject reference count issue with
>>>> >> mtdblock_ro + mtd_dataflash + mtd partitions and repeated unbind/bind.
>>>> >> I'm on 2.6.28-rc2, but I can reproduce the problem on 2.6.27 as well.
>>>>
>>>> Rafael> Is it reproducible with 2.6.26 too?
>>>>
>>>> Sorry, I haven't backported my platform support code to such "old"
>>>> kernel. I can do it though, if you think it will help pinpoint the
>>>> issue.
>>>
>>> This sounds like a possible reason for the problem:
>>> "After digging into the mtd code, this bug is not related to our driver. It
>>> should be a subtle bug in mtd core code.
>>>
>>> In add_mtd_partition, for 2 partitions, 2 gendisk structures will be
>>> allocated. But these 2 gendisk->queue will be set to the same
>>> request_queue. Then when unregistering the 1st partition, from the
>>> same request_queue->backing_dev_info, the bdi struct will be set to
>>> NULL. So for the 2nd partition (bdi == NULL), the sysfs dir of 2nd
>>> partition will not be removed. Finally, when modprobe the module
>>> again, the 2nd partition won't be added"
>>> https://blackfin.uclinux.org/gf/tracker/4463
>>
>> Looks like a bdi issue:
>> http://lkml.org/lkml/2008/10/30/519
>>
>> Peter, if I do this (whitespace mangled, just pasted in here), the
>> error goes away for me. Can you try this?

>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -515,7 +515,8 @@ void add_disk(struct gendisk *disk)
>> blk_register_queue(disk);
>>
>> bdi = &disk->queue->backing_dev_info;
>> - bdi_register_dev(bdi, disk_devt(disk));
>> + if (!bdi->dev)
>> + bdi_register_dev(bdi, disk_devt(disk));
>> retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
>> "bdi");
>> WARN_ON(retval);
>
> IMHO, this is a workaround, right? I think the final solution should
> provide every 'gendisk' a dedicated 'bdi'.
> So they won't mess up and overwrite.

The bdi is a per-queue property, and the queue is shared by multiple
devices. The "bdi" symlink of each blockdev still points to the actual
(shared) bdi device. It's just that only the first device in the pool of
devices sharing a queue creates the bdi entry.

Not sure what the final fix should be, but this sounds better than
duplicating identical information and/or leaking the duplicates. :)
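
As a rough illustration of that behaviour, the following user-space sketch models a registration path with the `if (!bdi->dev)` guard. The toy_* names, the bool array standing in for sysfs, and the teardown logic are all invented for illustration; none of this is the real block-layer code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MINORS 32

/* Toy stand-ins, not the real kernel definitions. */
struct toy_bdi   { int dev; };             /* minor owning the entry, -1 = none */
struct toy_queue { struct toy_bdi bdi; };  /* one bdi per request queue */
struct toy_disk  { int minor; struct toy_queue *queue; };

static bool toy_entry[TOY_MINORS];         /* toy_entry[n]: "31:n" exists */

/* Models the patched add_disk(): register only if the shared bdi has no device. */
static bool toy_add_disk_patched(struct toy_disk *disk)
{
	struct toy_bdi *bdi = &disk->queue->bdi;

	assert(disk->minor >= 0 && disk->minor < TOY_MINORS);
	if (bdi->dev < 0) {
		if (toy_entry[disk->minor])
			return false;          /* duplicate name */
		toy_entry[disk->minor] = true;
		bdi->dev = disk->minor;
	}
	return true;                       /* the "bdi" link would point at bdi->dev */
}

/* Removal tears down whatever bdi->dev currently points at (assumed). */
static void toy_del_disk(struct toy_disk *disk)
{
	struct toy_bdi *bdi = &disk->queue->bdi;

	if (bdi->dev >= 0) {
		toy_entry[bdi->dev] = false;
		bdi->dev = -1;
	}
}

/*
 * Runs bind, unbind, bind on two disks sharing one queue.
 * Returns the number of entries left after the unbind,
 * or -1 if either bind fails.
 */
int toy_cycle_patched(void)
{
	struct toy_queue q = { .bdi = { .dev = -1 } };
	struct toy_disk d12 = { 12, &q }, d13 = { 13, &q };
	int live = 0;

	for (int i = 0; i < TOY_MINORS; i++)
		toy_entry[i] = false;

	if (!toy_add_disk_patched(&d12) || !toy_add_disk_patched(&d13))
		return -1;
	toy_del_disk(&d13);
	toy_del_disk(&d12);
	for (int i = 0; i < TOY_MINORS; i++)
		live += toy_entry[i];
	if (!toy_add_disk_patched(&d12) || !toy_add_disk_patched(&d13))
		return -1;
	return live;
}
```

With the guard, only one entry is ever created per queue, so the single teardown removes it and nothing is left behind for the next bind to collide with.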

Kay

2008-11-03 15:14:23

by Peter Korsgaard

Subject: Re: 2.6.28-rc2: (mtd)block/partitions BUG with kobject reference count

>>>>> "Kay" == Kay Sievers <[email protected]> writes:

Hi,

Kay> Looks like a bdi issue:
Kay> http://lkml.org/lkml/2008/10/30/519

Kay> Peter, if I do this (whitespace mangled, just pasted in here), the
Kay> error goes away for me. Can you try this?

Sorry for the slow response. Your patch
(http://lkml.org/lkml/2008/10/30/558) fixes the issue for me. Thanks!

--
Bye, Peter Korsgaard