Hello,
I noticed that the behaviour of FUTEX_WAIT changed between 2.6.37 and
2.6.38. The error was initially found in a java program where a
Thread.sleep never returned after resuming from a suspend to ram.
Thread.sleep is implemented using pthread_cond_timedwait which itself
uses futex with the op FUTEX_WAIT.
The error can also be triggered with a simple test program (attached as
test-futex.c) which calls FUTEX_WAIT with a timeout of 200ms in a loop.
While running the test program the machine is suspended using "echo mem
> /sys/power/state".
After resume the futex syscall never returns. The return can be provoked
by sending the process a combination of SIGSTOP and SIGCONT.
The bug didn't occur in 2.6.37.
I found this bug report
https://bugzilla.kernel.org/show_bug.cgi?id=32922
which describes a related problem and presented a patch. This patch
(adding the FLAGS_HAS_TIMEOUT in futex_wait to the restart_block) fixes
the problem for my initial java problem and the test program.
I found the following pull request which probably introduced the
problem: https://lkml.org/lkml/2011/1/6/62
Thanks,
Torsten