2006-01-31 22:51:18

by Alexey Dobriyan

Subject: Badness in local_bh_enable by [PATCH] fix uidhash_lock <-> RCU deadlock

Flooding boot logs with

Badness in local_bh_enable at kernel/softirq.c:140
[<c0114272>] local_bh_enable+0x25/0x68
[<c0118015>] collect_signal+0x9e/0xf4
[<c01f3867>] drm_notifier+0x0/0x41
[<c01180c4>] __dequeue_signal+0x59/0x70
[<c0118108>] dequeue_signal+0x2d/0x98
[<c01196bd>] get_signal_to_deliver+0x5b/0x256
[<c0102226>] do_signal+0x51/0xf4
[<c014d4d1>] __pollwait+0x0/0x94
[<c01a8e59>] copy_to_user+0x2d/0x35
[<c014dce4>] sys_select+0x115/0x133
[<c01022f2>] do_notify_resume+0x29/0x37
[<c01024ba>] work_notifysig+0x13/0x19

4021cb279a532728c3208a16b9b09b0ca8016850 is first bad commit
diff-tree 4021cb279a532728c3208a16b9b09b0ca8016850 (from d5bee775137c56ed993f1b3c9d66c268b3525d7d)
Author: Ingo Molnar <[email protected]>
Date: Wed Jan 25 15:23:07 2006 +0100

[PATCH] fix uidhash_lock <-> RCU deadlock

RCU task-struct freeing can call free_uid(), which is taking
uidhash_lock - while other users of uidhash_lock are softirq-unsafe.

The fix is to always take the uidhash_spinlock in a softirq-safe manner.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

:040000 040000 98d3bd6cebd288defc2bee44054d252629e6c020 9319c402e05522b655c2445e9f19cf394b006e02 M kernel
diff --git a/kernel/user.c b/kernel/user.c
index 89e562f..d1ae234 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -13,6 +13,7 @@
#include <linux/slab.h>
#include <linux/bitops.h>
#include <linux/key.h>
+#include <linux/interrupt.h>

/*
* UID task count cache, to get fast user lookup in "alloc_uid"
@@ -27,6 +28,12 @@

static kmem_cache_t *uid_cachep;
static struct list_head uidhash_table[UIDHASH_SZ];
+
+/*
+ * The uidhash_lock is mostly taken from process context, but it is
+ * occasionally also taken from softirq/tasklet context, when
+ * task-structs get RCU-freed. Hence all locking must be softirq-safe.
+ */
static DEFINE_SPINLOCK(uidhash_lock);

struct user_struct root_user = {
@@ -83,14 +90,15 @@ struct user_struct *find_user(uid_t uid)
{
struct user_struct *ret;

- spin_lock(&uidhash_lock);
+ spin_lock_bh(&uidhash_lock);
ret = uid_hash_find(uid, uidhashentry(uid));
- spin_unlock(&uidhash_lock);
+ spin_unlock_bh(&uidhash_lock);
return ret;
}

void free_uid(struct user_struct *up)
{
+ local_bh_disable();
if (up && atomic_dec_and_lock(&up->__count, &uidhash_lock)) {
uid_hash_remove(up);
key_put(up->uid_keyring);
@@ -98,6 +106,7 @@ void free_uid(struct user_struct *up)
kmem_cache_free(uid_cachep, up);
spin_unlock(&uidhash_lock);
}
+ local_bh_enable();
}

struct user_struct * alloc_uid(uid_t uid)
@@ -105,9 +114,9 @@ struct user_struct * alloc_uid(uid_t uid
struct list_head *hashent = uidhashentry(uid);
struct user_struct *up;

- spin_lock(&uidhash_lock);
+ spin_lock_bh(&uidhash_lock);
up = uid_hash_find(uid, hashent);
- spin_unlock(&uidhash_lock);
+ spin_unlock_bh(&uidhash_lock);

if (!up) {
struct user_struct *new;
@@ -137,7 +146,7 @@ struct user_struct * alloc_uid(uid_t uid
* Before adding this, check whether we raced
* on adding the same user already..
*/
- spin_lock(&uidhash_lock);
+ spin_lock_bh(&uidhash_lock);
up = uid_hash_find(uid, hashent);
if (up) {
key_put(new->uid_keyring);
@@ -147,7 +156,7 @@ struct user_struct * alloc_uid(uid_t uid
uid_hash_insert(new, hashent);
up = new;
}
- spin_unlock(&uidhash_lock);
+ spin_unlock_bh(&uidhash_lock);

}
return up;
@@ -183,9 +192,9 @@ static int __init uid_cache_init(void)
INIT_LIST_HEAD(uidhash_table + n);

/* Insert the root user immediately (init already runs as root) */
- spin_lock(&uidhash_lock);
+ spin_lock_bh(&uidhash_lock);
uid_hash_insert(&root_user, uidhashentry(0));
- spin_unlock(&uidhash_lock);
+ spin_unlock_bh(&uidhash_lock);

return 0;
}
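
The deadlock described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_` name is hypothetical, a boolean stands in for uidhash_lock, and a counter stands in for the softirq-disable depth. It assumes a single CPU on which a softirq may run at any point where bottom halves are enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; all model_ names are hypothetical. */
static bool model_lock_held;   /* stands in for uidhash_lock */
static int  model_bh_depth;    /* > 0 means softirqs are disabled */
static int  model_deadlocks;   /* times the softirq would spin forever */

/* Softirq-side work: RCU frees a task struct and calls free_uid(). */
static void model_softirq_free_uid(void)
{
    if (model_lock_held) {
        /* The interrupted context owns the lock and cannot run again
         * until the softirq returns, so a real spinlock would hang. */
        model_deadlocks++;
        return;
    }
    model_lock_held = true;
    model_lock_held = false;
}

/* A softirq is only allowed to run while bottom halves are enabled. */
static void model_maybe_run_softirq(void)
{
    if (model_bh_depth == 0)
        model_softirq_free_uid();
}

/* Process-context user of the lock, e.g. find_user().
 * bh_safe == false models spin_lock(), true models spin_lock_bh(). */
static int model_find_user(bool bh_safe)
{
    model_deadlocks = 0;
    if (bh_safe)
        model_bh_depth++;
    model_lock_held = true;

    model_maybe_run_softirq();   /* interrupt arrives mid critical section */

    model_lock_held = false;
    if (bh_safe) {
        assert(model_bh_depth > 0);
        model_bh_depth--;
    }
    return model_deadlocks;
}
```

Calling `model_find_user(false)` counts one would-be deadlock, while `model_find_user(true)` counts none, which is the property the spin_lock_bh() conversion above is after.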


2006-01-31 23:20:28

by Linus Torvalds

Subject: Re: Badness in local_bh_enable by [PATCH] fix uidhash_lock <-> RCU deadlock



On Wed, 1 Feb 2006, Alexey Dobriyan wrote:
>
> Flooding boot logs with
>
> Badness in local_bh_enable at kernel/softirq.c:140

Ok, looks bad. It's through

__dequeue_signal():
collect_signal():
__sigqueue_free():
free_uid()

where we hold the sigqueue lock. We do _not_ want to do BH processing
there with the lock held and interrupts disabled, so the warning is
correct, and that uidhash_lock patch potentially causes more problems than
it fixes.

Perhaps the easiest solution is to just make them irq-safe instead
of bh-safe? An alternative might be to make __sigqueue_free() do its work
through RCU callbacks too, but that seems wrong.

Comments? Ingo?

Linus
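
The difference between the two approaches can be sketched with a toy userspace model. This is not kernel code: the `model_` names are hypothetical, and the check in `model_local_bh_enable()` is only assumed to mirror whatever test fires at kernel/softirq.c:140 (presumably a warning when interrupts are disabled). The irq-safe variant is one possible reading of the suggestion above, not the fix that was actually merged.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; all model_ names are hypothetical. */
static bool model_irqs_off;   /* stands in for irqs_disabled() */
static int  model_bh_depth;   /* stands in for the softirq-disable count */
static int  model_warnings;   /* times "Badness" would be printed */

static void model_local_bh_disable(void)
{
    model_bh_depth++;
}

static void model_local_bh_enable(void)
{
    /* Assumption: the real function warns when called with IRQs off,
     * because it may go on to run pending softirqs. */
    if (model_irqs_off)
        model_warnings++;
    assert(model_bh_depth > 0);
    model_bh_depth--;
}

/* free_uid() as changed by the patch: BH toggled around the body. */
static void model_free_uid_bh(void)
{
    model_local_bh_disable();
    /* ... drop the reference, possibly take uidhash_lock ... */
    model_local_bh_enable();
}

/* irq-safe variant: save the IRQ state, disable, then restore it. */
static void model_free_uid_irq(void)
{
    bool saved = model_irqs_off;   /* like local_irq_save(flags) */
    model_irqs_off = true;
    /* ... drop the reference, possibly take uidhash_lock ... */
    model_irqs_off = saved;        /* like local_irq_restore(flags) */
}

/* Caller that, like the signal dequeue path, runs with IRQs disabled.
 * Returns how many warnings the chosen free_uid variant triggered. */
static int model_dequeue_signal(void (*free_uid_variant)(void))
{
    model_warnings = 0;
    model_irqs_off = true;    /* sigqueue lock taken with IRQs disabled */
    free_uid_variant();
    model_irqs_off = false;   /* lock dropped, IRQs back on */
    return model_warnings;
}
```

In this model `model_dequeue_signal(model_free_uid_bh)` reports one warning and `model_dequeue_signal(model_free_uid_irq)` reports none, because restoring a saved IRQ state never re-enables anything the caller had disabled.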

2006-02-01 02:07:42

by Kumar Gala

Subject: Re: Badness in local_bh_enable by [PATCH] fix uidhash_lock <-> RCU deadlock

I'm also seeing a similar issue on an embedded PowerPC. It makes my serial
port seem to lose characters or require extra ones. The warning is reached
through a slightly different path than Alexey's:

Badness in local_bh_enable at kernel/softirq.c:140
Call Trace:
[C0673D10] [C0007620] show_stack+0x40/0x194 (unreliable)
[C0673D40] [C000C21C] program_check_exception+0x3c4/0x518
[C0673D80] [C000D588] ret_from_except_full+0x0/0x4c
--- Exception: 700 at local_bh_enable+0x1c/0x84
LR = free_uid+0x5c/0xb4
[C0673E40] [00800144] 0x800144 (unreliable)
[C0673E50] [C0025540] free_uid+0x5c/0xb4
[C0673E60] [C00259FC] flush_sigqueue+0x84/0xb0
[C0673E80] [C0025E10] __exit_signal+0x214/0x250
[C0673EA0] [C001A6A0] release_task+0x94/0x1d0
[C0673EC0] [C001D164] do_wait+0xc44/0xff4
[C0673F40] [C000CEE8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe68680
LR = 0x10023c28

- kumar



2006-02-01 02:41:27

by Peter Williams

Subject: Re: Badness in local_bh_enable by [PATCH] fix uidhash_lock <-> RCU deadlock

Kumar Gala wrote:
> I'm also seeing a similar issue on an embedded PowerPC. Makes my serial
> port seem to lose characters or require extra ones. Through a slightly
> different path than Alexey:
>
> Badness in local_bh_enable at kernel/softirq.c:140
> Call Trace:
> [C0673D10] [C0007620] show_stack+0x40/0x194 (unreliable)
> [C0673D40] [C000C21C] program_check_exception+0x3c4/0x518
> [C0673D80] [C000D588] ret_from_except_full+0x0/0x4c
> --- Exception: 700 at local_bh_enable+0x1c/0x84
> LR = free_uid+0x5c/0xb4
> [C0673E40] [00800144] 0x800144 (unreliable)
> [C0673E50] [C0025540] free_uid+0x5c/0xb4
> [C0673E60] [C00259FC] flush_sigqueue+0x84/0xb0
> [C0673E80] [C0025E10] __exit_signal+0x214/0x250
> [C0673EA0] [C001A6A0] release_task+0x94/0x1d0
> [C0673EC0] [C001D164] do_wait+0xc44/0xff4
> [C0673F40] [C000CEE8] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfe68680
> LR = 0x10023c28
>
> - kumar

More data points. On a Pentium 4 I'm seeing two variants of this problem.
They differ slightly from each other (one has a preempt_schedule() in the
middle of the call sequence) and from the traces already reported:

Badness in local_bh_enable at kernel/softirq.c:140
[<c011928f>] local_bh_enable+0x7e/0x88
[<c011f01a>] __dequeue_signal+0x164/0x1d6
[<c011f104>] dequeue_signal+0x78/0xc1
[<c011ffc3>] get_signal_to_deliver+0x5c/0x55e
[<c0148b88>] do_wp_page+0x20e/0x353
[<c01023e5>] do_notify_resume+0x8e/0x6a1
[<c036fdd7>] notifier_call_chain+0x27/0x40
[<c011ea10>] sigprocmask+0x62/0xd8
[<c011eb59>] sys_rt_sigprocmask+0xd3/0xf8
[<c0102bb2>] work_notifysig+0x13/0x19
Badness in local_bh_enable at kernel/softirq.c:140
[<c011928f>] local_bh_enable+0x7e/0x88
[<c011f01a>] __dequeue_signal+0x164/0x1d6
[<c011f104>] dequeue_signal+0x78/0xc1
[<c011ffc3>] get_signal_to_deliver+0x5c/0x55e
[<c0148b88>] do_wp_page+0x20e/0x353
[<c01023e5>] do_notify_resume+0x8e/0x6a1
[<c036cc83>] preempt_schedule+0x4a/0x56
[<c036fdd7>] notifier_call_chain+0x27/0x40
[<c011ea10>] sigprocmask+0x62/0xd8
[<c011eb59>] sys_rt_sigprocmask+0xd3/0xf8
[<c0102bb2>] work_notifysig+0x13/0x19

--
Peter Williams [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce