2011-07-07 15:25:19

by Rafał Miłecki

Subject: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

I have a problem with bcma and the bus subsystem.

This works fine:
modprobe bcma; rmmod bcma

This:
modprobe bcma; modprobe b43; rmmod b43; rmmod bcma
causes:
BUG: unable to handle kernel NULL pointer dereference at (null)

My BCMA has only 3 cores, of which only 1 is registered as a device:
bcma: Core 0 found: ChipCommon (manuf 0x4BF, id 0x800, rev 0x22, class 0x0)
bcma: Core 1 found: IEEE 802.11 (manuf 0x4BF, id 0x812, rev 0x17, class 0x0)
bcma: Core 2 found: PCIe (manuf 0x4BF, id 0x820, rev 0x0F, class 0x0)

The dereference comes from
static void bcma_unregister_cores(struct bcma_bus *bus)

There is a simple loop:
list_for_each_entry(core, &bus->cores, list) {
	if (core->dev_registered)
		device_unregister(&core->dev);
}


So when I unload bcma after a driver (b43) has been loaded for the 0x812
core, I get a NULL pointer dereference.

Any tips as to why this happens?

--
Rafał


2011-07-07 15:27:26

by Rafał Miłecki

Subject: Re: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

On 7 July 2011 at 17:25, Rafał Miłecki <[email protected]> wrote:
> I've problem with bcma and bus subsystem.
>
> This works fine:
> modprobe bcma; rmmod bcma
>
> This:
> modprobe bcma; modprobe b43; rmmod b43; rmmod bcma
> causes:
> BUG: unable to handle kernel NULL pointer dereference at   (null)
>
> My BCMA has only 3 cores, out of them only 1 is registered as device:
> bcma: Core 0 found: ChipCommon (manuf 0x4BF, id 0x800, rev 0x22, class 0x0)
> bcma: Core 1 found: IEEE 802.11 (manuf 0x4BF, id 0x812, rev 0x17, class 0x0)
> bcma: Core 2 found: PCIe (manuf 0x4BF, id 0x820, rev 0x0F, class 0x0)
>
> The dereference comes out from
> static void bcma_unregister_cores(struct bcma_bus *bus)
>
> There is a simple loop:
> list_for_each_entry(core, &bus->cores, list) {
>        if (core->dev_registered)
>                device_unregister(&core->dev);
> }
>
>
> So when I unload bcma after I got driver (b43) for 0x812 core, I get
> NULL pointer dereference.
>
> Any tip, why does it happen?

The dmesg log is attached.

--
Rafał


Attachments:
dmesg.log (3.45 kB)

2011-07-21 07:18:18

by Rafał Miłecki

Subject: Re: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

On 7 July 2011 at 17:25, Rafał Miłecki <[email protected]> wrote:
> I've problem with bcma and bus subsystem.
>
> This works fine:
> modprobe bcma; rmmod bcma
>
> This:
> modprobe bcma; modprobe b43; rmmod b43; rmmod bcma
> causes:
> BUG: unable to handle kernel NULL pointer dereference at   (null)
>
> My BCMA has only 3 cores, out of them only 1 is registered as device:
> bcma: Core 0 found: ChipCommon (manuf 0x4BF, id 0x800, rev 0x22, class 0x0)
> bcma: Core 1 found: IEEE 802.11 (manuf 0x4BF, id 0x812, rev 0x17, class 0x0)
> bcma: Core 2 found: PCIe (manuf 0x4BF, id 0x820, rev 0x0F, class 0x0)
>
> The dereference comes out from
> static void bcma_unregister_cores(struct bcma_bus *bus)
>
> There is a simple loop:
> list_for_each_entry(core, &bus->cores, list) {
>        if (core->dev_registered)
>                device_unregister(&core->dev);
> }
>
>
> So when I unload bcma after I got driver (b43) for 0x812 core, I get
> NULL pointer dereference.
>
> Any tip, why does it happen?

I've tracked down where the crash really happens (kobject_del alone does
not say much). The actual call chain is:
device_unregister → device_del → kobject_del → kobj_kset_leave →
list_del_init

If you take a look at list_del_init, it touches "prev" and "next". So
I've added some debugging:
pr_info("core->dev.kobj.entry.prev: 0x%p\n", core->dev.kobj.entry.prev);
pr_info("core->dev.kobj.entry.next: 0x%p\n", core->dev.kobj.entry.next);
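
To make the point about "prev" and "next" concrete, here is a minimal userspace sketch of what a list_del_init-style unlink does. It is an illustration only, not the kernel source: struct list_head is re-declared locally, and entry_is_linked() is a hypothetical helper added for this example. Since "next" is the first member of struct list_head, the write through a NULL "prev" lands at address 0, which would be consistent with a fault reported "at (null)".

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Hypothetical helper (not a kernel API): true if unlinking the entry
 * would not dereference a NULL neighbour pointer.
 */
static bool entry_is_linked(const struct list_head *entry)
{
	return entry->prev != NULL && entry->next != NULL;
}

/* Unlink the entry and re-initialise it; both neighbours are written to. */
static void list_del_init(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;	/* faults if entry->prev is NULL */
	INIT_LIST_HEAD(entry);
}
```

Calling entry_is_linked() before list_del_init() in a test program is enough to tell a healthy entry from one whose "prev" has been cleared.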

modprobe bcma && rmmod bcma:
[ 342.866366] bcma: Unregistering device for core 0x812
[ 342.866380] bcma: core->dev.kobj.entry.prev: 0xdb82780c
[ 342.866382] bcma: core->dev.kobj.entry.next: 0xda044980

modprobe bcma && modprobe b43 && rmmod b43 && rmmod bcma:
[ 612.819306] bcma: Unregistering device for core 0x812
[ 612.819320] bcma: core->dev.kobj.entry.prev: 0x (null)
[ 612.819322] bcma: core->dev.kobj.entry.next: 0xd7fe6614
[ 612.819971] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 612.819989] IP: [<c03dcfbe>] kobject_del+0x2e/0x60

I have no idea why the kobj entry list gets corrupted after loading the
b43 driver, which supports the device for core 0x812.

Any help now, maybe?

--
Rafał

2011-07-21 16:33:44

by Pavel Roskin

Subject: Re: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

On 07/21/2011 03:18 AM, Rafał Miłecki wrote:

>> So when I unload bcma after I got driver (b43) for 0x812 core, I get
>> NULL pointer dereference.
>>
>> Any tip, why does it happen?
>
> I've tracked where does crash really happen (kobject_del does not
> really say much). The real forwardtrace is:
> device_unregister → device_del → kobject_del → kobj_kset_leave →
> kobj_kset_leave → list_del_init
>
> If you take a look at list_del_init, it touches "prev" and "next". So
> I've added some debugging:
> pr_info("core->dev.kobj.entry.prev: 0x%p\n", core->dev.kobj.entry.prev);
> pr_info("core->dev.kobj.entry.next: 0x%p\n", core->dev.kobj.entry.next);

There are options for debugging that you may want to enable:

CONFIG_DEBUG_LIST
CONFIG_DEBUG_OBJECTS
CONFIG_DEBUG_KOBJECT

Actually, consider enabling as many debug options as possible, except
perhaps the most time-consuming ones (such as CONFIG_DEBUG_KMEMLEAK).
Maybe you are passing a freed pointer or something.

Print the pointers you are passing to device_register() and
device_unregister().

> [ 612.819320] bcma: core->dev.kobj.entry.prev: 0x (null)

You may want to make it a macro and print it in most bcma functions.
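
A rough userspace sketch of such a macro follows. All names here (CORE_DUMP_KOBJ, core_null_links and the *_stub structs) are made up for illustration and only mimic the core->dev.kobj.entry layout; in the driver the macro body would use pr_info() on the real struct bcma_device instead of fprintf().

```c
#include <stdio.h>

/* Stand-ins that mirror only the core->dev.kobj.entry nesting. */
struct list_head { struct list_head *next, *prev; };
struct kobject_stub { struct list_head entry; };
struct device_stub { struct kobject_stub kobj; };
struct core_stub {
	unsigned int id;
	struct device_stub dev;
};

/* Hypothetical helper: number of NULL links (0, 1 or 2) in the kobj entry. */
static int core_null_links(const struct core_stub *core)
{
	return (core->dev.kobj.entry.prev == NULL) +
	       (core->dev.kobj.entry.next == NULL);
}

/* Hypothetical macro: log both link pointers together with the call site. */
#define CORE_DUMP_KOBJ(core)						\
	do {								\
		fprintf(stderr,						\
			"%s:%d: core 0x%03X kobj.entry prev=%p next=%p\n", \
			__func__, __LINE__, (core)->id,			\
			(void *)(core)->dev.kobj.entry.prev,		\
			(void *)(core)->dev.kobj.entry.next);		\
	} while (0)
```

Because the macro records __func__ and __LINE__, dropping it into several functions should show the first call site at which "prev" is already NULL.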

--
Regards,
Pavel Roskin

2011-07-21 07:14:06

by Rafał Miłecki

Subject: Re: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

On 14 July 2011 at 16:45, Francois Romieu
<[email protected]> wrote:
> Rafał Miłecki <[email protected]> :
> [...]
>> Any tip, why does it happen?
>
> bcma_release_core_dev kfrees core while its list_head is still used ?
>
> May be something like this :
>
> diff --git a/drivers/bcma/main.c b/drivers/bcma/main.c
> index be52344..85fb3aa 100644
> --- a/drivers/bcma/main.c
> +++ b/drivers/bcma/main.c
> @@ -110,11 +110,14 @@ static int bcma_register_cores(struct bcma_bus *bus)
>
>  static void bcma_unregister_cores(struct bcma_bus *bus)
>  {
> -       struct bcma_device *core;
> +       struct bcma_device *core, *next;
>
> -       list_for_each_entry(core, &bus->cores, list) {
> +       list_for_each_entry_safe(core, next, &bus->cores, list) {
> +               list_del(&core->list);
>                if (core->dev_registered)
>                        device_unregister(&core->dev);
> +               else
> +                       kfree(core);
>        }
>  }

Thanks for your help, but I'm afraid the crash happens in a totally
different place. Have you taken a look at dmesg.log from my second
e-mail? The NULL pointer exception happens in kobject_del.

--
Rafał

2011-07-14 15:42:41

by Francois Romieu

Subject: Re: Bug in BCMA: device_unregister causing "NULL pointer dereference at"

Rafał Miłecki <[email protected]> :
[...]
> Any tip, why does it happen?

bcma_release_core_dev kfrees core while its list_head is still in use?

Maybe something like this:

diff --git a/drivers/bcma/main.c b/drivers/bcma/main.c
index be52344..85fb3aa 100644
--- a/drivers/bcma/main.c
+++ b/drivers/bcma/main.c
@@ -110,11 +110,14 @@ static int bcma_register_cores(struct bcma_bus *bus)

 static void bcma_unregister_cores(struct bcma_bus *bus)
 {
-	struct bcma_device *core;
+	struct bcma_device *core, *next;

-	list_for_each_entry(core, &bus->cores, list) {
+	list_for_each_entry_safe(core, next, &bus->cores, list) {
+		list_del(&core->list);
 		if (core->dev_registered)
 			device_unregister(&core->dev);
+		else
+			kfree(core);
 	}
 }
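
For readers unfamiliar with the "_safe" iteration pattern used in the patch, here is a self-contained userspace sketch. It is not bcma code: struct core, core_of() and free_all_cores() are hypothetical names, the loop is written out by hand instead of using the kernel macro, and list_del() clears the pointers to NULL where the kernel would poison them. The point is only that the next pointer is saved before the current entry is unlinked and freed.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for a bus core that embeds its list node. */
struct core {
	int id;
	struct list_head list;
};

/* Recover the containing struct core from a pointer to its list member. */
#define core_of(ptr) \
	((struct core *)((char *)(ptr) - offsetof(struct core, list)))

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = NULL;	/* the kernel poisons these pointers instead */
	entry->prev = NULL;
}

/* Unlink and free every core on the list; returns how many were freed. */
static int free_all_cores(struct list_head *head)
{
	struct list_head *pos, *next;
	int freed = 0;

	for (pos = head->next; pos != head; pos = next) {
		next = pos->next;	/* saved before pos is unlinked and freed */
		list_del(pos);
		free(core_of(pos));
		freed++;
	}
	return freed;
}
```

A plain loop that advanced with pos = pos->next after free() would read freed memory, which is the class of bug the patch above is guarding against.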