2013-05-09 10:58:04

by Borislav Petkov

[permalink] [raw]
Subject: lockdep: BUG: key ffff880436f00330 not in .data!

This is on current linus:

[ 3.806733] EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
[ 3.807455] EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
[ 3.808317] BUG: key ffff880436f00330 not in .data!
[ 3.808843] ------------[ cut here ]------------
[ 3.809571] WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x555/0x590()
[ 3.810274] DEBUG_LOCKS_WARN_ON(1)
[ 3.810279] Modules linked in: sb_edac(+) edac_core acpi_cpufreq mperf ehci_hcd gf128mul ablk_helper iTCO_wdt iTCO_vendor_support evdev l
pc_ich button pcspkr usbcore snd cryptd mfd_core usb_common soundcore processor microcode
[ 3.811776] CPU: 7 PID: 718 Comm: modprobe Not tainted 3.9.0+ #5
[ 3.812516] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
[ 3.813271] 0000000000000009 ffff880434421858 ffffffff815e97c6 ffff880434421898
[ 3.814009] ffffffff8103e210 ffff880434421888 ffff88043b200110 ffffffff8181062e
[ 3.814758] 0000000000000000 ffff880436f00330 0000000000000000 ffff8804344218f8
[ 3.815518] Call Trace:
[ 3.816249] [<ffffffff815e97c6>] dump_stack+0x19/0x1b
[ 3.816977] [<ffffffff8103e210>] warn_slowpath_common+0x70/0xa0
[ 3.817675] [<ffffffff8103e2f6>] warn_slowpath_fmt+0x46/0x50
[ 3.818394] [<ffffffff810954c5>] lockdep_init_map+0x555/0x590
[ 3.819119] [<ffffffff810998ed>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 3.819837] [<ffffffff810999bd>] ? trace_hardirqs_on+0xd/0x10
[ 3.820536] [<ffffffff81093ea8>] debug_mutex_init+0x38/0x50
[ 3.821223] [<ffffffff81068b91>] __mutex_init+0x61/0x70
[ 3.821916] [<ffffffff8142a495>] bus_register+0x175/0x2e0
[ 3.822579] [<ffffffffa00ce645>] edac_create_sysfs_mci_device+0x55/0x540 [edac_core]
[ 3.823249] [<ffffffff810999bd>] ? trace_hardirqs_on+0xd/0x10
[ 3.823938] [<ffffffffa00cc5e3>] edac_mc_add_mc+0x103/0x270 [edac_core]
[ 3.824600] [<ffffffffa00b2d12>] sbridge_probe+0xc12/0xee0 [sb_edac]
[ 3.825307] [<ffffffff812abbe6>] pci_device_probe+0x136/0x150
[ 3.825996] [<ffffffff8142bb96>] driver_probe_device+0x76/0x230
[ 3.826663] [<ffffffff8142bdfb>] __driver_attach+0xab/0xb0
[ 3.827342] [<ffffffff8142bd50>] ? driver_probe_device+0x230/0x230
[ 3.828018] [<ffffffff81429ced>] bus_for_each_dev+0x5d/0xa0
[ 3.828672] [<ffffffff8142b67e>] driver_attach+0x1e/0x20
[ 3.829329] [<ffffffff8142b151>] bus_add_driver+0x101/0x260
[ 3.829957] [<ffffffff8142c4d7>] driver_register+0x77/0x170
[ 3.830559] [<ffffffff812aab74>] __pci_register_driver+0x64/0x70
[ 3.831156] [<ffffffffa006a000>] ? 0xffffffffa0069fff
[ 3.831737] [<ffffffffa006a033>] sbridge_init+0x33/0x1000 [sb_edac]
[ 3.832290] [<ffffffffa006a000>] ? 0xffffffffa0069fff
[ 3.832868] [<ffffffff81000312>] do_one_initcall+0x112/0x170
[ 3.833392] [<ffffffff810a5bb9>] load_module+0x1be9/0x25a0
[ 3.833918] [<ffffffff810a18e0>] ? unset_module_init_ro_nx+0x80/0x80
[ 3.834426] [<ffffffff810a662f>] SyS_init_module+0xbf/0xf0
[ 3.834975] [<ffffffff815f8ec6>] system_call_fastpath+0x1a/0x1f
[ 3.835471] ---[ end trace 4ad6d866311d6e2b ]---
[ 3.836069] EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


2013-05-09 14:28:22

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: lockdep: BUG: key ffff880436f00330 not in .data!

Hi Boris,

Em 09-05-2013 08:01, Borislav Petkov escreveu:
> This is on current linus:
>
> [ 3.806733] EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
> [ 3.807455] EDAC sbridge: Seeking for: dev 0d.6 PCI ID 8086:3cf5
> [ 3.808317] BUG: key ffff880436f00330 not in .data!
> [ 3.808843] ------------[ cut here ]------------
> [ 3.809571] WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x555/0x590()
> [ 3.810274] DEBUG_LOCKS_WARN_ON(1)
> [ 3.810279] Modules linked in: sb_edac(+) edac_core acpi_cpufreq mperf ehci_hcd gf128mul ablk_helper iTCO_wdt iTCO_vendor_support evdev l
> pc_ich button pcspkr usbcore snd cryptd mfd_core usb_common soundcore processor microcode
> [ 3.811776] CPU: 7 PID: 718 Comm: modprobe Not tainted 3.9.0+ #5
> [ 3.812516] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 3.813271] 0000000000000009 ffff880434421858 ffffffff815e97c6 ffff880434421898
> [ 3.814009] ffffffff8103e210 ffff880434421888 ffff88043b200110 ffffffff8181062e
> [ 3.814758] 0000000000000000 ffff880436f00330 0000000000000000 ffff8804344218f8
> [ 3.815518] Call Trace:
> [ 3.816249] [<ffffffff815e97c6>] dump_stack+0x19/0x1b
> [ 3.816977] [<ffffffff8103e210>] warn_slowpath_common+0x70/0xa0
> [ 3.817675] [<ffffffff8103e2f6>] warn_slowpath_fmt+0x46/0x50
> [ 3.818394] [<ffffffff810954c5>] lockdep_init_map+0x555/0x590
> [ 3.819119] [<ffffffff810998ed>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> [ 3.819837] [<ffffffff810999bd>] ? trace_hardirqs_on+0xd/0x10
> [ 3.820536] [<ffffffff81093ea8>] debug_mutex_init+0x38/0x50
> [ 3.821223] [<ffffffff81068b91>] __mutex_init+0x61/0x70
> [ 3.821916] [<ffffffff8142a495>] bus_register+0x175/0x2e0
> [ 3.822579] [<ffffffffa00ce645>] edac_create_sysfs_mci_device+0x55/0x540 [edac_core]

This is likely on this part of edac_mc_sysfs.c:
mci->bus.name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
if (!mci->bus.name)
return -ENOMEM;
edac_dbg(0, "creating bus %s\n", mci->bus.name);
err = bus_register(&mci->bus);
if (err < 0)
return err;


> [ 3.823249] [<ffffffff810999bd>] ? trace_hardirqs_on+0xd/0x10
> [ 3.823938] [<ffffffffa00cc5e3>] edac_mc_add_mc+0x103/0x270 [edac_core]

This is likely at this part of sb_edac.c driver:

/* add this new MC control structure to EDAC's list of MCs */
if (unlikely(edac_mc_add_mc(mci))) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
rc = -EINVAL;
goto fail0;
}

What I suspect is that it is trying to create more than one bus with the
same name.

So, either there are two EDAC drivers trying to register at the edac core,
or sb_edac is not incrementing mci->mc_idx.

I suspect that it is the former condition, like having both ghes_edac
and sb_edac trying to register.

In thesis, changeset 80cc7d8 should be preventing it to happen.

Could you please check if ghes_edac also got registered there?

Thanks!
Mauro

2013-05-09 19:56:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: lockdep: BUG: key ffff880436f00330 not in .data!

On Thu, May 09, 2013 at 11:28:08AM -0300, Mauro Carvalho Chehab wrote:
> Could you please check if ghes_edac also got registered there?

Nope, CONFIG_EDAC_GHES is not set.

.config is attached.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Attachments:
(No filename) (286.00 B)
config.gz (20.37 kB)
Download all attachments

2013-05-12 15:37:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: lockdep: BUG: key ffff880436f00330 not in .data!

On Thu, May 09, 2013 at 10:01:08PM +0200, Borislav Petkov wrote:
> On Thu, May 09, 2013 at 11:28:08AM -0300, Mauro Carvalho Chehab wrote:
> > Could you please check if ghes_edac also got registered there?

Ok, I got it:

[ 5.430408] BUG: key ffff88043c320330 not in .data!
[ 5.431043] ------------[ cut here ]------------
[ 5.431776] WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x555/0x590()
[ 5.432512] DEBUG_LOCKS_WARN_ON(1)

basically means we're using dynamically allocated memory for the
lock_key. And the last can thus disappear, which is Not Good(tm).

This happens because the EDAC instance probing functions do
edac_mc_alloc() to allocate the mci which also contains embedded in it:

struct bus_type bus;

and the bus_register() call in edac_create_sysfs_mci_device() hands down
this struct bus_type to lockdep.

The patch which introduced the bus_register() call in
edac_create_sysfs_mci_device() is

--
commit de3910eb79ac8c0f29a11224661c0ebaaf813039
Author: Mauro Carvalho Chehab <[email protected]>
Date: Tue Apr 24 15:05:43 2012 -0300

edac: change the mem allocation scheme to make Documentation/kobject.txt happy
--

which means we need a fix since this affects all edac drivers, not only
sb_edac, as initially surmised.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--