2003-09-29 09:23:49

by kladit

[permalink] [raw]
Subject: 2.6.0-test6 oops futex"

System SMP, Dual-Xeon, glibc-cvs with nptl-0.59, gcc-3.3.1
MozillaFirebird-0.61

With linux-2.6.0-test6 I got several times this oops.
I had no problems with linux-2.6.0-test5 before. (absolute stable for me)

Sep 29 11:06:36 xeon2 kernel: Unable to handle kernel paging request at virtual address 00100104
Sep 29 11:06:36 xeon2 kernel: printing eip:
Sep 29 11:06:36 xeon2 kernel: c013522d
Sep 29 11:06:36 xeon2 kernel: *pde = 00000000
Sep 29 11:06:36 xeon2 kernel: Oops: 0002 [#1]
Sep 29 11:06:36 xeon2 kernel: CPU: 3
Sep 29 11:06:36 xeon2 kernel: EIP: 0060:[<c013522d>] Not tainted
Sep 29 11:06:36 xeon2 kernel: EFLAGS: 00010246
Sep 29 11:06:36 xeon2 kernel: EIP is at futex_wake+0xe4/0x140
Sep 29 11:06:36 xeon2 kernel: eax: 00200200 ebx: f6e4deec ecx: 00000000 edx: 00100100
Sep 29 11:06:36 xeon2 kernel: esi: c054cbc0 edi: 00000000 ebp: c054cbc0 esp: f626df3c
Sep 29 11:06:36 xeon2 kernel: ds: 007b es: 007b ss: 0068
Sep 29 11:06:36 xeon2 kernel: Process MozillaFirebird (pid: 967, threadinfo=f626c000 task=f6cdc080)
Sep 29 11:06:36 xeon2 kernel: Stack: f626df4c f626df4c 00000fe8 c054cbbc 0819c000 f7472c80 00000fe8 e75d6008
Sep 29 11:06:36 xeon2 kernel: 0819cfe8 00000001 0819cfe8 00000001 c0135c77 0819cfe8 00000001 00000000
Sep 29 11:06:36 xeon2 kernel: 00000001 c0135d85 0819cfe8 00000001 00000001 7fffffff 0819cfe8 00000000
Sep 29 11:06:36 xeon2 kernel: Call Trace:
Sep 29 11:06:36 xeon2 kernel: [<c0135c77>] do_futex+0x7e/0x80
Sep 29 11:06:36 xeon2 kernel: [<c0135d85>] sys_futex+0x10c/0x124
Sep 29 11:06:36 xeon2 kernel: [<c0124d8b>] sys_gettimeofday+0x53/0xa8
Sep 29 11:06:36 xeon2 kernel: [<c010aba9>] sysenter_past_esp+0x52/0x71
Sep 29 11:06:36 xeon2 kernel:
Sep 29 11:06:36 xeon2 kernel: Code: 89 42 04 89 10 89 5b 04 8d 43 08 89 1b ba 03 00 00 00 e8 3b
Sep 29 11:06:36 xeon2 kernel: <6>note: MozillaFirebird[967] exited with preempt_count 1

--
Klaus


2003-09-30 08:49:22

by Jamie Lokier

[permalink] [raw]
Subject: Re: 2.6.0-test6 oops futex"

Klaus Dittrich wrote:
> With linux-2.6.0-test6 I got several times this oops.
> I had no problems with linux-2.6.0-test5 before. (absolute stable for me)

This oops is in the first store in the __list_del() of
list_del_init(), inside the loop of futex_wake.

The "next" link of the current futex in the loop is corrupt. More
precisely, the iterator list node contains { LIST_POISON1, LIST_POISON2 }
which means it must have been earlier deleted using list_del().

The only call to list_del() in the futex code is in unqueue_me(),
where it is protected against the same spinlock as all the other
queuing and dequeuing operations.

The one possibility I can think of is that the hash function isn't
consistent, either because of a hashing bug or because the hash
function argument is being changed. Then different locks would be
used between one place and another, permitting the lists to become
corrupt. That would explain the oops appearing only in test6, which
is when the new hashing scheme and split locks appeared.

I don't see any obvious flaw. Rusty, Hugh, anyone else see it?

Ulrich, could you try a favourite futex stress test on vanilla
2.6.0-test6, to see if you can reliably trigger this oops?

Thanks,
-- Jamie

> Sep 29 11:06:36 xeon2 kernel: Unable to handle kernel paging request at virtual address 00100104
> Sep 29 11:06:36 xeon2 kernel: printing eip:
> Sep 29 11:06:36 xeon2 kernel: c013522d
> Sep 29 11:06:36 xeon2 kernel: *pde = 00000000
> Sep 29 11:06:36 xeon2 kernel: Oops: 0002 [#1]
> Sep 29 11:06:36 xeon2 kernel: CPU: 3
> Sep 29 11:06:36 xeon2 kernel: EIP: 0060:[<c013522d>] Not tainted
> Sep 29 11:06:36 xeon2 kernel: EFLAGS: 00010246
> Sep 29 11:06:36 xeon2 kernel: EIP is at futex_wake+0xe4/0x140
> Sep 29 11:06:36 xeon2 kernel: eax: 00200200 ebx: f6e4deec ecx: 00000000 edx: 00100100
> Sep 29 11:06:36 xeon2 kernel: esi: c054cbc0 edi: 00000000 ebp: c054cbc0 esp: f626df3c
> Sep 29 11:06:36 xeon2 kernel: ds: 007b es: 007b ss: 0068
> Sep 29 11:06:36 xeon2 kernel: Process MozillaFirebird (pid: 967, threadinfo=f626c000 task=f6cdc080)
> Sep 29 11:06:36 xeon2 kernel: Stack: f626df4c f626df4c 00000fe8 c054cbbc 0819c000 f7472c80 00000fe8 e75d6008
> Sep 29 11:06:36 xeon2 kernel: 0819cfe8 00000001 0819cfe8 00000001 c0135c77 0819cfe8 00000001 00000000
> Sep 29 11:06:36 xeon2 kernel: 00000001 c0135d85 0819cfe8 00000001 00000001 7fffffff 0819cfe8 00000000
> Sep 29 11:06:36 xeon2 kernel: Call Trace:
> Sep 29 11:06:36 xeon2 kernel: [<c0135c77>] do_futex+0x7e/0x80
> Sep 29 11:06:36 xeon2 kernel: [<c0135d85>] sys_futex+0x10c/0x124
> Sep 29 11:06:36 xeon2 kernel: [<c0124d8b>] sys_gettimeofday+0x53/0xa8
> Sep 29 11:06:36 xeon2 kernel: [<c010aba9>] sysenter_past_esp+0x52/0x71
> Sep 29 11:06:36 xeon2 kernel:
> Sep 29 11:06:36 xeon2 kernel: Code: 89 42 04 89 10 89 5b 04 8d 43 08 89 1b ba 03 00 00 00 e8 3b
> Sep 29 11:06:36 xeon2 kernel: <6>note: MozillaFirebird[967] exited with preempt_count 1
>
> --
> Klaus
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2003-09-30 21:37:52

by Hugh Dickins

[permalink] [raw]
Subject: Re: 2.6.0-test6 oops futex"

On Tue, 30 Sep 2003, Jamie Lokier wrote:
> Klaus Dittrich wrote:
> > With linux-2.6.0-test6 I got several times this oops.
> > I had no problems with linux-2.6.0-test5 before. (absolute stable for me)
>
> This oops is in the first store in the __list_del() of
> list_del_init(), inside the loop of futex_wake.
>
> The "next" link of the current futex in the loop is corrupt. More
> precisely, the iterator list node contains { LIST_POISON1, LIST_POISON2 }
> which means it must have been earlier deleted using list_del().
>
> The only call to list_del() in the futex code is in unqueue_me(),
> where it is protected against the same spinlock as all the other
> queuing and dequeuing operations.
>
> The one possibility I can think of is that the hash function isn't
> consistent, either because of a hashing bug or because the hash
> function argument is being changed. Then different locks would be
> used between one place and another, permitting the lists to become
> corrupt. That would explain the oops appearing only in test6, which
> is when the new hashing scheme and split locks appeared.
>
> I don't see any obvious flaw. Rusty, Hugh, anyone else see it?

Excellent (hah, not again! what a cosy club we share) analysis.

I can't quite make the flaw I do see, fit the facts Klaus sees.

Consider unqueue_me on one cpu racing against this->key = key2 in
futex_requeue on another. unqueue_me finds the right bh to lock
from q->key, but futex_requeue is shifting q->key beneath it.
Although futex_requeue holds both the relevant spinlocks, during
reassignment of key an irrelevant hashqueue may get calculated
and locked instead.

Whether or not this is relevant to Klaus' oops, it does need to be
fixed, doesn't it? And your new key_refs version even more exposed?
I'm giving up on it, but expect you or Rusty will be ingenious enough
to rescue the split locks somehow.

(Oh, while you're there, be nice to fix nr_requeue 0.)

Hugh

2003-09-30 21:48:45

by Ulrich Drepper

[permalink] [raw]
Subject: Re: 2.6.0-test6 oops futex"

Hugh Dickins wrote:

> (Oh, while you're there, be nice to fix nr_requeue 0.)

Not necessary. FUTEX_REQUEUE with nr_requeue == 0 is the same as
FUTEX_WAKE. When we wrote the code we decided that it is better to
optimize the requeue as much as possible and leave worrying about
invalid parameters to the user. Not that the code will not cause any
crashes or so.

--
--------------. ,-. 444 Castro Street
Ulrich Drepper \ ,-----------------' \ Mountain View, CA 94041 USA
Red Hat `--' drepper at redhat.com `---------------------------

2003-10-01 01:01:51

by Jamie Lokier

[permalink] [raw]
Subject: Re: 2.6.0-test6 oops futex"

Hugh Dickins wrote:
> I can't quite make the flaw I do see, fit the facts Klaus sees.
>
> Consider unqueue_me on one cpu racing against this->key = key2 in
> futex_requeue on another. unqueue_me finds the right bh to lock
> from q->key, but futex_requeue is shifting q->key beneath it.
> Although futex_requeue holds both the relevant spinlocks, during
> reassignment of key an irrelevant hashqueue may get calculated
> and locked instead.
>
> Whether or not this is relevant to Klaus' oops, it does need to be
> fixed, doesn't it? And your new key_refs version even more exposed?
> I'm giving up on it, but expect you or Rusty will be ingenious enough
> to rescue the split locks somehow.

Superb analysis!

It does indeed explain Klaus' oops. unqueue_me() and futex_poll()
hash using &q->key outside any spinlocks, so if there's a concurrent
futex_requeue() the hash can be corrupt. Result: wrong lock taken;
list manipulation races in test6, not in test5. (Provided Klaus has
SMP or pre-empt).

Solutions are to call hash_futex() inside the lock, or store the
hash result inside futex_q, and read it inside the lock.

You're right, the key_refs version is more exposed, but because of the
change to futux_requeue logic rather than anything to do with key_refs.

> (Oh, while you're there, be nice to fix nr_requeue 0.)

Logically, zero num in FUTEX_QUEUE shouldn't wake anything either, but
it's understood that at least one is always woken. It's a bit of a
wart I agree, but I'll go along with Ulrich's view.

Thanks,
-- Jamie

2003-10-01 02:42:12

by Jamie Lokier

[permalink] [raw]
Subject: Re: 2.6.0-test6 oops futex"

Jamie Lokier wrote:
> Solutions are to call hash_futex() inside the lock, or store the
> hash result inside futex_q, and read it inside the lock.

What _am_ I talking about. Can't call hash_futex() inside the lock,
it's needed to choose the lock ;)

Keeping the split locks is going to be tricky - all due to futex_requeue.

-- Jamie