Was the badness in futex_wait problem ever resolved?
-- Jamie
No, I still get them in 2.6.4
To my knowledge, they only happen when one thread is blocked in a system
call
and another thread makes a system("foo bar") call.
The blocked thread will return with EINTR.
I see this usually in sem_wait, but also in epoll_wait sometimes.
Is it possible system() is causing the wrong process to get woken up?
Let me know if you need any further information, I can reproduce it
consistently.
Jamie Lokier wrote:
>Was the badness in futex_wait problem ever resolved?
>
>-- Jamie
>
>
Dirk Morris wrote:
> Let me know if you need any further information, I can reproduce it
> consistently.
If you have a small test program (or pair of programs) that
consistently triggers this message on any machine running 2.6.4, that
would be very helpful indeed.
-- Jamie
Jamie Lokier wrote:
>If you have a small test program (or pair of programs) that
>consistently triggers this message on any machine running 2.6.4, that
>would be very helpful indeed.
>
>
Here ya go. :)
* $Id: foo.c,v 1.00 2004/03/31 10:51:19 dmorris Exp $ */
/* gcc foo.c -o foo -pthread */
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <semaphore.h>
sem_t sem;
void* make_system_call (void* arg)
{
while(1) {
sleep(1);
system("/bin/true");
}
return NULL;
}
int main(void)
{
pthread_t id;
sem_init(&sem,0,0);
pthread_create(&id,NULL,make_system_call,NULL);
while (sem_wait(&sem)<0)
perror("sem_wait");
return 0;
}
Output for me is :
~ # uname
-a
[dmorris @ gobbles]
Linux gobbles 2.6.4 #2 SMP Mon Mar 29 17:15:08 PST 2004 i686 GNU/Linux
~ #
./foo
[dmorris @ gobbles]
sem_wait: Interrupted system call
sem_wait: Interrupted system call
sem_wait: Interrupted system call
sem_wait: Interrupted system call
...
meanwhile kern.log spits out:
Mar 30 16:40:23 gobbles kernel: Badness in futex_wait at kernel/futex.c:508
Mar 30 16:40:23 gobbles kernel: Call Trace:
Mar 30 16:40:23 gobbles kernel: [<c0139a28>] futex_wait+0x1a8/0x1c0
Mar 30 16:40:23 gobbles kernel: [<c011efc0>] default_wake_function+0x0/0x20
Mar 30 16:40:23 gobbles kernel: [<c011efc0>] default_wake_function+0x0/0x20
Mar 30 16:40:23 gobbles kernel: [<c012d020>] do_timer+0xc0/0xd0
Mar 30 16:40:23 gobbles kernel: [<c0139d30>] do_futex+0x70/0x80
Mar 30 16:40:23 gobbles kernel: [<c0139e64>] sys_futex+0x124/0x140
Mar 30 16:40:23 gobbles kernel: [<c015ef11>] sys_write+0x61/0x70
Mar 30 16:40:23 gobbles kernel: [<c01095db>] syscall_call+0x7/0xb
Mar 30 16:40:23 gobbles kernel:
I can do this on multiple machines, so I won't include all the box info.
If you need it let me know.
Thanks! :)
-Dirk
On Thu, 2004-04-01 at 02:56, Jamie Lokier wrote:
> Was the badness in futex_wait problem ever resolved?
No: Andrew misapplied the patches for ages and ended up ditching it. We
tried a different version but there was a bug which meant real wakups
triggered it...
Here is the current version...
Name: Who's Spuriously Waking Futexes?
Author: Andrew Morton, Rusty Russell
Status: Tested on 2.6.3-bk1
Someone is triggering the WARN_ON() in futex.c. We know that software
suspend could do it, in theory. But noone else should be.
This code adds a PF_FUTEX_DEBUG flag, which is set in the futex code
when we sleep, and also when we wake up. If a task with
PF_FUTEX_DEBUG is woken by a task without PF_FUTEX_DEBUG, we have
found our culprit.
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/include/linux/sched.h .31103-linux-2.6.3-rc4.updated/include/linux/sched.h
--- .31103-linux-2.6.3-rc4/include/linux/sched.h 2004-02-17 17:54:03.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/include/linux/sched.h 2004-02-18 14:55:05.000000000 +1100
@@ -500,6 +500,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
#define PF_SWAPOFF 0x00080000 /* I am in swapoff */
#define PF_LESS_THROTTLE 0x00100000 /* Throttle me less: I clean memory */
#define PF_SYNCWRITE 0x00200000 /* I am doing a sync write */
+#define PF_FUTEX_DEBUG 0x00400000
#ifdef CONFIG_SMP
extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/futex.c .31103-linux-2.6.3-rc4.updated/kernel/futex.c
--- .31103-linux-2.6.3-rc4/kernel/futex.c 2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/futex.c 2004-02-18 15:09:15.000000000 +1100
@@ -269,7 +269,11 @@ static void wake_futex(struct futex_q *q
* The lock in wake_up_all() is a crucial memory barrier after the
* list_del_init() and also before assigning to q->lock_ptr.
*/
+
+ current->flags |= PF_FUTEX_DEBUG;
wake_up_all(&q->waiters);
+ current->flags &= ~PF_FUTEX_DEBUG;
+
/*
* The waiting task can free the futex_q as soon as this is written,
* without taking any locks. This must come last.
@@ -490,8 +494,11 @@ static int futex_wait(unsigned long uadd
* !list_empty() is safe here without any lock.
* q.lock_ptr != 0 is not safe, because of ordering against wakeup.
*/
- if (likely(!list_empty(&q.list)))
+ if (likely(!list_empty(&q.list))) {
+ current->flags |= PF_FUTEX_DEBUG;
time = schedule_timeout(time);
+ current->flags &= ~PF_FUTEX_DEBUG;
+ }
__set_current_state(TASK_RUNNING);
/*
@@ -505,7 +512,11 @@ static int futex_wait(unsigned long uadd
if (time == 0)
return -ETIMEDOUT;
/* A spurious wakeup should never happen. */
- WARN_ON(!signal_pending(current));
+ if (!signal_pending(current)) {
+ printk("futex_wait woken: %lu %i %lu\n",
+ uaddr, val, time);
+ WARN_ON(1);
+ }
return -EINTR;
out_unqueue:
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/sched.c .31103-linux-2.6.3-rc4.updated/kernel/sched.c
--- .31103-linux-2.6.3-rc4/kernel/sched.c 2004-02-17 17:54:04.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/sched.c 2004-02-18 14:55:05.000000000 +1100
@@ -658,6 +658,14 @@ static int try_to_wake_up(task_t * p, un
long old_state;
runqueue_t *rq;
+ if ((p->flags & PF_FUTEX_DEBUG)
+ && !(current->flags & PF_FUTEX_DEBUG)) {
+ printk("%s %i waking %s: %i %i\n",
+ current->comm, (int)in_interrupt(),
+ p->comm, p->tgid, p->pid);
+ WARN_ON(1);
+ }
+
repeat_lock_task:
rq = task_rq_lock(p, &flags);
old_state = p->state;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .31103-linux-2.6.3-rc4/kernel/timer.c .31103-linux-2.6.3-rc4.updated/kernel/timer.c
--- .31103-linux-2.6.3-rc4/kernel/timer.c 2004-02-04 15:39:15.000000000 +1100
+++ .31103-linux-2.6.3-rc4.updated/kernel/timer.c 2004-02-18 14:55:05.000000000 +1100
@@ -971,6 +971,13 @@ static void process_timeout(unsigned lon
wake_up_process((task_t *)__data);
}
+static void futex_timeout(unsigned long __data)
+{
+ current->flags |= PF_FUTEX_DEBUG;
+ wake_up_process((task_t *)__data);
+ current->flags &= ~PF_FUTEX_DEBUG;
+}
+
/**
* schedule_timeout - sleep until timeout
* @timeout: timeout value in jiffies
@@ -1037,7 +1044,10 @@ signed long schedule_timeout(signed long
init_timer(&timer);
timer.expires = expire;
timer.data = (unsigned long) current;
- timer.function = process_timeout;
+ if (current->flags & PF_FUTEX_DEBUG)
+ timer.function = futex_timeout;
+ else
+ timer.function = process_timeout;
add_timer(&timer);
schedule();
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell
On Thu, 2004-04-01 at 04:59, Dirk Morris wrote:
> Jamie Lokier wrote:
>
> >If you have a small test program (or pair of programs) that
> >consistently triggers this message on any machine running 2.6.4, that
> >would be very helpful indeed.
> >
> >
> Here ya go. :)
Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).
strace -f output attached below.
rusty@mingo:/tmp$ strace -f -o /tmp/foo.out ./foo
Process 3993 attached
Process 3994 attached
Process 3993 suspended
Process 3993 resumed
Process 3994 detached
sem_wait: Interrupted system call
Process 3995 attached
Process 3993 suspended
Process 3993 resumed
Process 3995 detached
sem_wait: Interrupted system call
Process 3996 attached
Process 3993 suspended
Process 3993 resumed
Process 3996 detached
sem_wait: Interrupted system call
PANIC: handle_group_exit: 3993 leader 3992
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell
Rusty Russell <[email protected]> wrote:
>
> On Thu, 2004-04-01 at 04:59, Dirk Morris wrote:
> > Jamie Lokier wrote:
> >
> > >If you have a small test program (or pair of programs) that
> > >consistently triggers this message on any machine running 2.6.4, that
> > >would be very helpful indeed.
> > >
> > >
> > Here ya go. :)
Dirk, thanks for that. It's worth its weight.
>
> Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).
It works here. You're probably using some steam-driven debian userspace.
There seem to be two call traces according to dmesg:
true 0 waking futex-prob: 1732 1732
Badness in task_rq_lock at kernel/sched.c:277
Call Trace:
[<c011c678>] task_rq_lock+0x78/0xf4
[<c011ccbd>] try_to_wake_up+0x1d/0x2c0
[<c012d35b>] send_signal+0xdb/0x128
[<c011cf80>] wake_up_state+0xc/0x10
[<c012ea0d>] do_notify_parent+0x499/0x54c
[<c0178454>] dput+0x1c/0x274
[<c016226c>] __fput+0xfc/0x124
[<c0160b6e>] filp_close+0x62/0x70
[<c0125c1e>] exit_notify+0x61e/0x660
[<c0126102>] do_exit+0x4a2/0x4b4
[<c012625c>] sys_exit_group+0x0/0x14
[<c012626c>] sys_exit_group+0x10/0x14
[<c038d3f2>] sysenter_past_esp+0x43/0x65
sshd 0 waking futex-prob: 1732 1732
Badness in task_rq_lock at kernel/sched.c:277
Call Trace:
[<c011c678>] task_rq_lock+0x78/0xf4
[<c011ccbd>] try_to_wake_up+0x1d/0x2c0
[<c014a86c>] kmem_cache_alloc+0x20/0x68
[<c012d2be>] send_signal+0x3e/0x128
[<c011cf80>] wake_up_state+0xc/0x10
[<c012d97c>] group_send_sig_info+0x2cc/0x3f0
[<c012dafe>] __kill_pg_info+0x5e/0x90
[<c012db62>] kill_pg_info+0x32/0x5c
[<c012df03>] kill_pg+0x1b/0x20
[<c02536e0>] n_tty_receive_buf+0x530/0x11bc
[<c0334a85>] release_sock+0x75/0x7f
[<c0355fee>] tcp_recvmsg+0x656/0x694
[<c0202141>] avc_has_perm+0x41/0x4d
[<c02561f3>] pty_write+0x12b/0x198
[<c02550dd>] write_chan+0x195/0x1f4
[<c011f088>] default_wake_function+0x0/0x1c
[<c011f088>] default_wake_function+0x0/0x1c
[<c024f982>] tty_write+0x22a/0x2e4
[<c0254f48>] write_chan+0x0/0x1f4
[<c0161517>] vfs_write+0xb7/0xf0
[<c01615d0>] sys_write+0x30/0x50
[<c038d3f2>] sysenter_past_esp+0x43/0x65
But setting a breakpoint in task_rq_lock(), kgdb only ever seems to find
one:
Breakpoint 1, task_rq_lock (p=0xcf3a8330, flags=0xcf105e88) at kernel/sched.c:274
274 printk("%s %i waking %s: %i %i\n",
(gdb) bt
#0 task_rq_lock (p=0xcf3a8330, flags=0xcf105e88) at kernel/sched.c:274
#1 0xc011ce7d in try_to_wake_up (p=0xcf3a8330, state=1, sync=0) at kernel/sched.c:792
#2 0xc011cf80 in wake_up_state (p=0xcf29d420, state=3473952768) at kernel/sched.c:849
#3 0xc012ea0d in do_notify_parent (tsk=0xcf29d420, sig=17) at kernel/signal.c:552
#4 0xc0125c1e in exit_notify (tsk=0xcf29d420) at kernel/exit.c:716
#5 0xc0126102 in do_exit (code=0) at kernel/exit.c:790
#6 0xc012625c in do_group_exit (exit_code=0) at kernel/exit.c:861
#7 0xc012626c in sys_exit_group (error_code=0) at kernel/exit.c:872
#8 0xc038d3f2 in sysenter_past_esp () at net/key/af_key.c:2834
Looks like breakage in the signal code.
Andrew Morton <[email protected]> wrote:
>
> > > Here ya go. :)
>
> Dirk, thanks for that. It's worth its weight.
>
> >
> > Doesn't work for me (2.6.5-rc1 Debian unstable 2xi686).
>
> It works here.
Then again, I only saw the debug messages on the wakeup path. I'm unable
to trigger the warning in futex_wait().