When booting 2.6.27-rc4 with SLUB and debug_objects=1, we see (after
some activity) lock ordering issues with obj_hash.lock and SLUB's
list_lock [1].
Thanks,
Daniel
--- [1]
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.27-rc4-224c #1
-------------------------------------------------------
hald/4680 is trying to acquire lock:
(&n->list_lock){++..}, at: [<ffffffff802bfa26>] add_partial+0x26/0x80
but task is already holding lock:
(&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>] debug_object_free+0x5c/0x120
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&obj_hash[i].lock){++..}:
[<ffffffff8026b011>] __lock_acquire+0xdb1/0x1150
[<ffffffff8026b441>] lock_acquire+0x91/0xc0
[<ffffffff80641b29>] _spin_lock_irqsave+0x49/0x90
[<ffffffff8041cd0e>] __debug_check_no_obj_freed+0x6e/0x170
[<ffffffff8041ce25>] debug_check_no_obj_freed+0x15/0x20
[<ffffffff8029accf>] free_hot_cold_page+0x11f/0x240
[<ffffffff8029ae80>] free_hot_page+0x10/0x20
[<ffffffff8029b36d>] __free_pages+0x3d/0x50
[<ffffffff802bfffe>] __free_slab+0x7e/0x90
[<ffffffff802c0028>] discard_slab+0x18/0x40
[<ffffffff802c1d2b>] kmem_cache_shrink+0x17b/0x220
[<ffffffff80442521>] acpi_os_purge_cache+0xe/0x12
[<ffffffff80459db9>] acpi_purge_cached_objects+0x15/0x3d
[<ffffffff80459e49>] acpi_initialize_objects+0x4e/0x59
[<ffffffff808a64ea>] acpi_init+0x91/0x226
[<ffffffff80209265>] do_one_initcall+0x45/0x190
[<ffffffff80887c53>] kernel_init+0x145/0x1a2
[<ffffffff8020d749>] child_rip+0xa/0x11
[<ffffffffffffffff>] 0xffffffffffffffff
-> #0 (&n->list_lock){++..}:
[<ffffffff8026b0f5>] __lock_acquire+0xe95/0x1150
[<ffffffff8026b441>] lock_acquire+0x91/0xc0
[<ffffffff806419c6>] _spin_lock+0x36/0x70
[<ffffffff802bfa26>] add_partial+0x26/0x80
[<ffffffff802c0156>] __slab_free+0x106/0x110
[<ffffffff802c0507>] kmem_cache_free+0xa7/0x110
[<ffffffff8041cc48>] free_object+0x68/0xc0
[<ffffffff8041d033>] debug_object_free+0xb3/0x120
[<ffffffff8063f35e>] schedule_timeout+0x7e/0xe0
[<ffffffff802d5439>] do_sys_poll+0x3b9/0x440
[<ffffffff802d57d8>] sys_poll+0x38/0xa0
[<ffffffff8020c74b>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
other info that might help us debug this:
1 lock held by hald/4680:
#0: (&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>] debug_object_free+0x5c/0x120
stack backtrace:
Pid: 4680, comm: hald Not tainted 2.6.27-rc4-224c #1
Call Trace:
[<ffffffff80268d1f>] print_circular_bug_tail+0x9f/0xe0
[<ffffffff8026b0f5>] __lock_acquire+0xe95/0x1150
[<ffffffff8026b441>] lock_acquire+0x91/0xc0
[<ffffffff802bfa26>] ? add_partial+0x26/0x80
[<ffffffff806419c6>] _spin_lock+0x36/0x70
[<ffffffff802bfa26>] ? add_partial+0x26/0x80
[<ffffffff802bfa26>] add_partial+0x26/0x80
[<ffffffff802c0156>] __slab_free+0x106/0x110
[<ffffffff802c0507>] kmem_cache_free+0xa7/0x110
[<ffffffff8041cc48>] ? free_object+0x68/0xc0
[<ffffffff8041cc48>] free_object+0x68/0xc0
[<ffffffff8041d033>] debug_object_free+0xb3/0x120
[<ffffffff8063f35e>] schedule_timeout+0x7e/0xe0
[<ffffffff8024c200>] ? process_timeout+0x0/0x10
[<ffffffff8063f342>] ? schedule_timeout+0x62/0xe0
[<ffffffff802d5439>] do_sys_poll+0x3b9/0x440
[<ffffffff802d5f00>] ? __pollwait+0x0/0x120
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8023a720>] ? default_wake_function+0x0/0x10
[<ffffffff8064128e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff802d57d8>] sys_poll+0x38/0xa0
[<ffffffff8020c74b>] system_call_fastpath+0x16/0x1b
--
Daniel J Blueman
On Fri, Aug 22, 2008 at 11:48 PM, Daniel J Blueman <[email protected]> wrote:
> When booting 2.6.27-rc4 with SLUB and debug_objects=1, we see (after
> some activity) lock ordering issues with obj_hash.lock and SLUB's
> list_lock [1].
That's quite funny :-)
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.27-rc4-224c #1
> -------------------------------------------------------
> hald/4680 is trying to acquire lock:
> (&n->list_lock){++..}, at: [<ffffffff802bfa26>] add_partial+0x26/0x80
>
> but task is already holding lock:
> (&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>] debug_object_free+0x5c/0x120
It seems to be the same kind of problem as before, except that this time it's in the free path rather than the alloc path.
You might try the appended patch -- but it's completely untested on my end!
What it does is move the actual freeing until after we've released the hash-bucket lock. When multiple objects are freed, we first collect them all on a local list.
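For illustration, here is a minimal sketch of that pattern: unhook the victims onto a local hlist while the bucket lock is held, and only call free_object() once the lock has been dropped. The helper name drain_bucket() is made up for this sketch; the types and free_object() are the ones from lib/debugobjects.c.

/*
 * Hypothetical helper, not part of the patch below: objects are
 * unhooked under db->lock, but only handed to free_object() -- and
 * thus to SLUB -- after the lock is released, so obj_hash[i].lock is
 * never held while n->list_lock is taken.
 */
static void drain_bucket(struct debug_bucket *db)
{
	struct hlist_node *node, *tmp;
	HLIST_HEAD(freelist);
	struct debug_obj *obj;
	unsigned long flags;

	spin_lock_irqsave(&db->lock, flags);
	hlist_for_each_entry_safe(obj, node, tmp, &db->list, node) {
		/* unhook from the bucket, park on the private list */
		hlist_del(&obj->node);
		hlist_add_head(&obj->node, &freelist);
	}
	spin_unlock_irqrestore(&db->lock, flags);

	/* now free them, with no bucket lock held */
	hlist_for_each_entry_safe(obj, node, tmp, &freelist, node) {
		hlist_del(&obj->node);
		free_object(obj);
	}
}

debug_object_free() only ever deals with a single object, so the patch simply drops the bucket lock there before calling free_object() and returns early instead of falling through to the unlock path.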
Also added Thomas to Cc.
Vegard
PS: Why are you the only one seeing these things? Does debug_objects=1 need to be passed to activate this code, or can a CONFIG also be used?
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 45a6bde..9d8161f 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -171,19 +171,28 @@ static void debug_objects_oom(void)
 {
 	struct debug_bucket *db = obj_hash;
 	struct hlist_node *node, *tmp;
+	HLIST_HEAD(freelist);
 	struct debug_obj *obj;
 	unsigned long flags;
 	int i;
 
 	printk(KERN_WARNING "ODEBUG: Out of memory. ODEBUG disabled\n");
 
+	/* XXX: Could probably be optimized by transplantation of more than
+	 * one entry at a time. */
 	for (i = 0; i < ODEBUG_HASH_SIZE; i++, db++) {
 		spin_lock_irqsave(&db->lock, flags);
 		hlist_for_each_entry_safe(obj, node, tmp, &db->list, node) {
 			hlist_del(&obj->node);
-			free_object(obj);
+			hlist_add_head(&obj->node, &freelist);
 		}
 		spin_unlock_irqrestore(&db->lock, flags);
+
+		/* Now free them */
+		hlist_for_each_entry_safe(obj, node, tmp, &freelist, node) {
+			hlist_del(&obj->node);
+			free_object(obj);
+		}
 	}
 }
 
@@ -498,8 +507,9 @@ void debug_object_free(void *addr, struct debug_obj_descr *descr)
 		return;
 	default:
 		hlist_del(&obj->node);
+		spin_unlock_irqrestore(&db->lock, flags);
 		free_object(obj);
-		break;
+		return;
 	}
 out_unlock:
 	spin_unlock_irqrestore(&db->lock, flags);
@@ -510,6 +520,7 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 {
 	unsigned long flags, oaddr, saddr, eaddr, paddr, chunks;
 	struct hlist_node *node, *tmp;
+	HLIST_HEAD(freelist);
 	struct debug_obj_descr *descr;
 	enum debug_obj_state state;
 	struct debug_bucket *db;
@@ -545,11 +556,18 @@ repeat:
 				goto repeat;
 			default:
 				hlist_del(&obj->node);
-				free_object(obj);
+				hlist_add_head(&obj->node, &freelist);
 				break;
 			}
 		}
 		spin_unlock_irqrestore(&db->lock, flags);
+
+		/* Now free them */
+		hlist_for_each_entry_safe(obj, node, tmp, &freelist, node) {
+			hlist_del(&obj->node);
+			free_object(obj);
+		}
+
 		if (cnt > debug_objects_maxchain)
 			debug_objects_maxchain = cnt;
 	}