2017-04-28 09:34:38

by Luchezar Belev

Subject: futex race condition

hello,

Consider a situation where the "atomic check and suspend" actually
comes into play, that is, a call to futex_wait is canceled because
another thread managed to change the variable just before the
suspend.
Since the exact time at which the change occurs cannot be relied on
in any way, it is quite possible that the other thread misses the
suspend by a tiny margin and changes the variable _after_ the
futex_wait-er is already suspended.
It then turns out that the "atomic check and suspend" actually relies
on unreliable inter-thread timing! (i.e. a race condition)

Unless I got it all wrong, this means that the futex_wait function is
no better than a simple unconditional suspend for the purpose of
implementing inter-thread synchronization primitives (like mutexes),
and any futex-based mutex/whatever implementation does not really rely
on that "atomic" mechanism but manages to do its job without it
(even though the people who implemented them may not realize it).
Perhaps that's why some people say that "futex is tricky".


Subject: Re: futex race condition

On 2017-04-28 12:34:23 [+0300], Luchezar Belev wrote:
> hello,
>
> Consider a situation where the "atomic check and suspend" actually
> comes into play, that is, a call to futex_wait is canceled because
> another thread managed to change the variable just before the
> suspend.
> Since the exact time at which the change occurs cannot be relied on
> in any way, it is quite possible that the other thread misses the
> suspend by a tiny margin and changes the variable _after_ the
> futex_wait-er is already suspended.
> It then turns out that the "atomic check and suspend" actually relies
> on unreliable inter-thread timing! (i.e. a race condition)

You give futex_wait() the value you expect the futex word to hold. If
another thread changes it before the suspend, the kernel returns an
error instead of putting you to sleep. In that case you have to redo
your atomic operation in userland.
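
A minimal sketch of that pattern in C, assuming Linux and C11 atomics.
The names futex_call, wait_for_flag and set_flag are made up for this
illustration, and passing an _Atomic uint32_t to the syscall assumes it
has the same layout as a plain uint32_t (true on common Linux targets).
The writer changes the word first and only then issues the wake. So a
change that lands before the suspend makes the wait fail with EAGAIN,
and a change that lands after the suspend is followed by a wake:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper; glibc does not export a futex() function. */
static long futex_call(_Atomic uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, (uint32_t *)uaddr, op, val, NULL, NULL, 0);
}

/* Waiter: block until *flag becomes nonzero. */
static void wait_for_flag(_Atomic uint32_t *flag)
{
    while (atomic_load(flag) == 0) {
        /*
         * The kernel suspends us only if *flag is still 0 when it
         * checks. Otherwise the call fails with EAGAIN. The return
         * value is ignored here because the loop re-checks the flag
         * after every return (EAGAIN, EINTR, or a wakeup).
         */
        futex_call(flag, FUTEX_WAIT_PRIVATE, 0);
    }
}

/* Writer: publish the new value first, then wake any sleepers. */
static void set_flag(_Atomic uint32_t *flag)
{
    atomic_store(flag, 1);
    futex_call(flag, FUTEX_WAKE_PRIVATE, INT_MAX);
}
```

One thread would call wait_for_flag(&flag) and another set_flag(&flag).
The order "store, then wake" in the writer is what the sketch relies
on. A real implementation would also want to handle unexpected errors
from the syscall.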

> Unless I got it all wrong, this means that the futex_wait function is
> no better than a simple unconditional suspend for the purpose of
> implementing inter-thread synchronization primitives (like mutexes),
> and any futex-based mutex/whatever implementation does not really rely
> on that "atomic" mechanism but manages to do its job without it
> (even though the people who implemented them may not realize it).
> Perhaps that's why some people say that "futex is tricky".

The wait/wake operations of the futex syscall can be used to implement
a mutex in user space without spinning until the mutex can be acquired.
Besides that, there are also LOCK and UNLOCK primitives. The tricky part
is that the kernel cannot rely on a lot of things the user says are
true and has to deal with race conditions like signals or swap, which
could change values and so on.
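
To illustrate the wait/wake case, here is a rough sketch of a
three-state mutex on top of the futex syscall, assuming Linux and C11
atomics. The names futex_call, demo_lock and demo_unlock and the state
encoding (0 = unlocked, 1 = locked, 2 = locked with possible waiters)
are choices made for this example, not something the kernel prescribes,
and the sketch does no error handling:

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper; glibc does not export a futex() function. */
static long futex_call(_Atomic uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, (uint32_t *)uaddr, op, val, NULL, NULL, 0);
}

static void demo_lock(_Atomic uint32_t *m)
{
    uint32_t c = 0;

    /* Fast path: 0 -> 1 without entering the kernel. */
    if (atomic_compare_exchange_strong(m, &c, 1))
        return;

    /* Contended: mark the lock as "has waiters" unless it already is. */
    if (c != 2)
        c = atomic_exchange(m, 2);

    while (c != 0) {
        /* Sleep only while the word still reads 2; else retry at once. */
        futex_call(m, FUTEX_WAIT_PRIVATE, 2);
        /* Try to take the lock, conservatively keeping state 2. */
        c = atomic_exchange(m, 2);
    }
}

static void demo_unlock(_Atomic uint32_t *m)
{
    /* Release first; enter the kernel only if someone may be asleep. */
    if (atomic_exchange(m, 0) == 2)
        futex_call(m, FUTEX_WAKE_PRIVATE, 1);
}
```

The value check in the wait call matters here: if the owner unlocks
between the exchange and the wait, the word no longer reads 2, the wait
returns immediately and the loop retries instead of sleeping through
the wakeup.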

Sebastian