Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756543AbdD1Lnn (ORCPT ); Fri, 28 Apr 2017 07:43:43 -0400
Received: from Chamillionaire.breakpoint.cc ([146.0.238.67]:35558 "EHLO
        Chamillionaire.breakpoint.cc" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1753904AbdD1Lng (ORCPT );
        Fri, 28 Apr 2017 07:43:36 -0400
Date: Fri, 28 Apr 2017 13:43:33 +0200
From: Sebastian Andrzej Siewior
To: Luchezar Belev
Cc: linux-kernel@vger.kernel.org
Subject: Re: futex race condition
Message-ID: <20170428114332.mcow24365cfdizmw@breakpoint.cc>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
User-Agent: NeoMutt/20170306 (1.8.0)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017-04-28 12:34:23 [+0300], Luchezar Belev wrote:
> hello,
>
> Consider a situation where the "atomic check and suspend" actually
> comes into action, that is, a call to futex_wait is canceled because
> another thread managed to change the variable just before the
> suspending.
> Since the exact time when the change occurred cannot be relied on in
> any way, it is quite possible that the other thread might have missed
> the suspending by a tiny while and changed the variable _after_ the
> futex_wait-er was already suspended.
> Then it turns out the "atomic check and suspend" actually relies on
> the unreliable inter-thread timing! (i.e. a race condition)

You give futex_wait() the value you expect the futex word to have. If
another thread changes it before the suspend, the kernel returns an
error. In that case you have to redo your atomic operation in userland.

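To illustrate the "redo it in userland" part, here is a minimal sketch in
C. The helper names (futex_wait, futex_wake, wait_for_flag, set_flag) are
made up for this example; only the futex(2) syscall and its
FUTEX_WAIT_PRIVATE / FUTEX_WAKE_PRIVATE operations are real. It also
covers the other ordering from the question: a thread that changes the
value _after_ the waiter went to sleep follows the store with a wake, so
the waiter is woken either way.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc exports no futex() wrapper, so go through syscall(2).
 * These helper names are hypothetical. */
static long futex_wait(_Atomic uint32_t *addr, uint32_t expected)
{
    /* Sleeps only if *addr == expected at the time the kernel checks;
     * otherwise returns -1 with errno set to EAGAIN. */
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
                   NULL, NULL, 0);
}

static long futex_wake(_Atomic uint32_t *addr, int nwaiters)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, nwaiters,
                   NULL, NULL, 0);
}

/* Waiter: block until *flag becomes non-zero. */
static void wait_for_flag(_Atomic uint32_t *flag)
{
    while (atomic_load(flag) == 0) {
        /* The return value is deliberately ignored: whether the call
         * slept and was woken, failed with EAGAIN because the value
         * had already changed, or was interrupted (EINTR), the loop
         * re-checks the flag in userland. */
        futex_wait(flag, 0);
    }
}

/* Waker: change the value first, then wake anyone already asleep. */
static void set_flag(_Atomic uint32_t *flag)
{
    atomic_store(flag, 1);
    futex_wake(flag, INT_MAX);
}
```

If set_flag() runs before the waiter enters the kernel, futex_wait() fails
with EAGAIN and the loop exits on the next check. If it runs afterwards,
the wake ends the sleep. In neither ordering is the wakeup lost.
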
> Unless I got it all wrong, this means that the futex_wait function is
> as good as simple unconditional suspend for the purpose of
> implementing inter-thread synchronization primitives (like mutexes),
> and any futex-based mutex/whatever implementation does not really rely
> on that "atomic" mechanism but manages to achieve its job without it
> (even though the people that implemented them may not realize it).
> Perhaps that's why some people say that "futex is tricky".

The wait/wake operations of the futex syscall can be used to implement a
mutex in user space without spinning until the mutex can be acquired.
Besides that, there are also LOCK and UNLOCK primitives.
The tricky part is that the kernel cannot rely on a lot of things the
user says are true and has to deal with race conditions like signals or
swap, which could change values, and so on.

Sebastian
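
As a rough sketch of the wait/wake based mutex mentioned above, the
following C code uses the well-known three-state scheme (0 = unlocked,
1 = locked, 2 = locked with possible waiters). It is an illustration under
those assumptions, not the code glibc or the kernel uses; the fmutex type
and the function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type: 0 = unlocked, 1 = locked, 2 = locked, maybe waiters. */
typedef struct {
    _Atomic uint32_t state;
} fmutex;

static long fmutex_futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void fmutex_lock(fmutex *m)
{
    uint32_t c = 0;

    /* Fast path: uncontended 0 -> 1, no syscall at all. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: advertise that there may be waiters. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);

    while (c != 0) {
        /* Sleep only while the state is still 2. If the owner unlocked
         * in the meantime, the kernel sees a different value and
         * returns immediately instead of suspending us. */
        fmutex_futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void fmutex_unlock(fmutex *m)
{
    uint32_t prev = atomic_exchange(&m->state, 0);

    assert(prev != 0);          /* unlocking an unlocked mutex is a bug */
    if (prev == 2)              /* someone may be asleep: wake one */
        fmutex_futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}
```

The value check in FUTEX_WAIT is what closes the window between "I saw the
mutex locked" and "I am asleep": an unlock that happens inside that window
changes the state away from 2, so the wait is refused rather than missing
the wake.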