2011-04-13 07:04:47

by Torsten Hilbrich

Subject: [futex] Regression in 2.6.38 regarding FLAGS_HAS_TIMEOUT

Hello,

I noticed that the behaviour of FUTEX_WAIT changed between 2.6.37 and
2.6.38. The error was initially found in a Java program where a
Thread.sleep never returned after resuming from suspend to RAM.
Thread.sleep is implemented using pthread_cond_timedwait, which in turn
uses the futex syscall with the op FUTEX_WAIT.

The error can also be triggered with a simple test program (attached as
test-futex.c) which calls FUTEX_WAIT with a timeout of 200ms in a loop.
While running the test program the machine is suspended using "echo mem
> /sys/power/state".
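
For readers without the attachment, a minimal sketch of such a wait loop
might look like the following. This is not the attached test-futex.c; the
helper names futex_wait_ms and run_wait_loop are made up for illustration,
and the loop is bounded so that it terminates on a healthy kernel.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block on *addr while it still equals val, for at
 * most timeout_ms (FUTEX_WAIT takes a relative timeout).
 * Returns 0 on a wakeup, otherwise the errno value of the failed call. */
static int futex_wait_ms(int *addr, int val, long timeout_ms)
{
    struct timespec ts = {
        .tv_sec  = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    if (syscall(SYS_futex, addr, FUTEX_WAIT, val, &ts, NULL, 0) == -1)
        return errno;
    return 0;
}

/* Hypothetical driver: wait `iterations` times with a 200ms timeout.
 * Nobody ever wakes the futex, so each wait is expected to end with
 * ETIMEDOUT. Returns the number of waits that timed out. */
int run_wait_loop(int iterations)
{
    int futex_word = 0;
    int timeouts = 0;

    for (int i = 0; i < iterations; i++) {
        int rc = futex_wait_ms(&futex_word, 0, 200);

        if (rc == ETIMEDOUT)
            timeouts++;
        printf("iteration %d: %s\n", i, rc ? strerror(rc) : "woken");
        fflush(stdout);
    }
    return timeouts;
}
```

Calling run_wait_loop() from main() with a large iteration count and
suspending the machine while it runs should show whether the output stops
after resume.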

After resume the futex syscall never returns. The return can be provoked
by sending the process a combination of SIGSTOP and SIGCONT.
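
The same workaround can be scripted. A small sketch, assuming the pid of
the stuck process is known (stop_and_continue is a hypothetical helper
name, and the 100ms pause between the two signals is an arbitrary choice):

```c
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: send SIGSTOP followed by SIGCONT to pid.
 * Returns 0 if both signals were sent, -1 otherwise (errno is set
 * by kill). */
int stop_and_continue(pid_t pid)
{
    /* Arbitrary short pause so the target is actually stopped first. */
    struct timespec pause_time = { .tv_sec = 0, .tv_nsec = 100 * 1000000L };

    if (kill(pid, SIGSTOP) == -1)
        return -1;
    nanosleep(&pause_time, NULL);
    if (kill(pid, SIGCONT) == -1)
        return -1;
    return 0;
}
```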

The bug didn't occur in 2.6.37.

I found this bug report

https://bugzilla.kernel.org/show_bug.cgi?id=32922

which describes a related problem and presents a patch. This patch
(adding FLAGS_HAS_TIMEOUT to the restart_block flags in futex_wait) fixes
both my original Java problem and the test program.

I found the following pull request which probably introduced the
problem: https://lkml.org/lkml/2011/1/6/62

Thanks,

Torsten



Attachments:
test-futex.c (759.00 B)