Message-ID: <1450437714.26597.53.camel@localhost.localdomain>
Subject: Re: futex(3) man page, final draft for pre-release review
From: Torvald Riegel
To: Darren Hart
Cc: "Michael Kerrisk (man-pages)", Thomas Gleixner, lkml, libc-alpha,
    linux-man, "Carlos O'Donell", Roland McGrath, Davidlohr Bueso,
    Jakub Jelinek, Ingo Molnar, bill o gallmeister, bert hubert, Jan Kiszka,
    Eric Dumazet, Arnd Bergmann, Rusty Russell, Heinrich Schuchardt,
    Andy Lutomirski, Daniel Wagner, Anton Blanchard, Steven Rostedt,
    Rich Felker, Jonathan Wakely, Mike Frysinger
Date: Fri, 18 Dec 2015 12:21:54 +0100
In-Reply-To: <20151215211816.GR11972@malice.jf.intel.com>
References: <56701916.4090203@gmail.com> <20151215211816.GR11972@malice.jf.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2015-12-15 at 13:18 -0800, Darren Hart wrote:
> On Tue, Dec 15, 2015 at 02:43:50PM +0100, Michael Kerrisk (man-pages) wrote:
> >
> >        When executing a futex operation that requests to block a thread,
> >        the kernel will block only if the futex word has the value that
> >        the calling thread supplied (as one of the arguments of the
> >        futex() call) as the expected value of the futex word. The load‐
> >        ing of the futex word's value, the comparison of that value with
> >        the expected value, and the actual blocking will happen atomi‐
> >
> >        FIXME: for next line, it would be good to have an explanation of
> >        "totally ordered" somewhere around here.
> >
> >        cally and totally ordered with respect to concurrently executing
>
> Totally ordered with respect to futex operations refers to semantics of the
> ACQUIRE/RELEASE operations and how they impact ordering of memory reads and
> writes. The kernel futex operations are protected by spinlocks, which ensure
> that all operations are serialized with respect to one another.
>
> This is a lot to attempt to define in this document. Perhaps a reference to
> linux/Documentation/memory-barriers.txt as a footnote would be sufficient? Or
> perhaps for this manual, "serialized" would be sufficient, with a footnote
> regarding "totally ordered" and a pointer to the memory-barrier documentation?

I'd strongly prefer to document the semantics for users here. And I
don't think users use the kernel's memory model -- instead, if we assume
that most users will call futex ops from C or C++, then the best we have
is the C11 / C++11 memory model. Therefore, if we want to expand that,
we should specify semantics in terms of as-if equivalence to C11 pseudo
code. I had proposed that in the past but, IIRC, Michael didn't want to
add a C11 "dependency" in the semantics back then, at least for the
initial release.

Here's what I wrote back then (atomic_*_relaxed() is like C11
atomic_*(..., memory_order_relaxed), lock/unlock have normal C11 mutex
semantics):

========================

For example, we could say that futex_wait is, in terms of
synchronization semantics, *as if* we'd execute a piece of C11 code.
Here's a part of the docs for a glibc-internal futex wrapper that I'm
working on; this is futex_wait ... :

/* Atomically wrt other futex operations, this blocks iff the value at
   *FUTEX matches the expected value.
   This is semantically equivalent to:
     l = (FUTEX);
     wait_flag = (FUTEX);
     lock (l);
     val = atomic_load_relaxed (FUTEX);
     if (val != expected) { unlock (l); return EAGAIN; }
     atomic_store_relaxed (wait_flag, 1);
     unlock (l);
     // Now block; can time out in futex_time_wait (see below)
     while (atomic_load_relaxed(wait_flag));

   Note that no guarantee of a happens-before relation between a woken
   futex_wait and a futex_wake is documented; however, this does not matter
   in practice because we have to consider spurious wake-ups (see below),
   and thus would not be able to reason which futex_wake woke us anyway.

... and this is futex_wake:

/* Atomically wrt other futex operations, this unblocks the specified
   number of processes, or all processes blocked on this futex if there are
   fewer than the specified number.  Semantically, this is equivalent to:
     l = (futex);
     lock (l);
     for (res = 0; processes_to_wake > 0; processes_to_wake--, res++) {
       if () break;
       wf = (futex);
       // No happens-before guarantee with woken futex_wait (see above)
       atomic_store_relaxed (wf, 0);
     }
     unlock (l);
     return res;

This allows a programmer to really infer the guarantees he/she can get
from a futex in terms of synchronization, without the docs having to use
prose to describe that. This should also not constrain the kernel in
terms of how to implement it, because it is a conceptual as-if relation
(e.g., the kernel won't spin-wait the whole time, and we might want to
make this clear for the PI case).

Of course, there are several as-if representations we could use, and we
might want to be a bit more pseudo-code-ish to make this also easy to
understand for people not familiar with C11 (e.g., using mutex + condvar
with some relaxation of condvar guarantees).

=========================

I will go through the discussion pointed out by Davidlohr next.
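
For readers who want something that compiles, the as-if model for
futex_wait and futex_wake above could be sketched roughly as follows.
This is only an illustration under several assumptions that are not in
the text above: every model_* name is hypothetical (none of them exist
in glibc or the kernel), a pthread mutex stands in for the C11 mutex, the
wait flags live in a small fixed table inside the futex object, and a
third flag state (SLOT_WOKEN) is added so that a slot is not reused
before its waiter has noticed the wake-up. The ENOMEM return is a
limitation of the sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_WAITERS 16  /* arbitrary capacity of this sketch */

/* Slot states; SLOT_WOKEN is an addition of this sketch (see lead-in). */
enum { SLOT_FREE = 0, SLOT_WAITING = 1, SLOT_WOKEN = 2 };

/* Hypothetical model of one futex; not a real glibc or kernel type. */
struct model_futex {
    atomic_int word;                          /* the futex word */
    pthread_mutex_t lock;                     /* the lock "l" */
    atomic_int wait_flag[MODEL_MAX_WAITERS];  /* one flag per waiter */
};

void model_futex_init(struct model_futex *f, int value)
{
    atomic_init(&f->word, value);
    pthread_mutex_init(&f->lock, NULL);
    for (int i = 0; i < MODEL_MAX_WAITERS; i++)
        atomic_init(&f->wait_flag[i], SLOT_FREE);
}

/* Blocks iff the futex word equals expected; otherwise returns EAGAIN. */
int model_futex_wait(struct model_futex *f, int expected)
{
    atomic_int *wf = NULL;

    pthread_mutex_lock(&f->lock);
    if (atomic_load_explicit(&f->word, memory_order_relaxed) != expected) {
        pthread_mutex_unlock(&f->lock);
        return EAGAIN;
    }
    /* Claim a free wait flag while still holding the lock. */
    for (int i = 0; i < MODEL_MAX_WAITERS && wf == NULL; i++) {
        atomic_int *slot = &f->wait_flag[i];
        if (atomic_load_explicit(slot, memory_order_relaxed) == SLOT_FREE)
            wf = slot;
    }
    if (wf == NULL) {
        pthread_mutex_unlock(&f->lock);
        return ENOMEM;  /* table full: limitation of this sketch only */
    }
    atomic_store_explicit(wf, SLOT_WAITING, memory_order_relaxed);
    pthread_mutex_unlock(&f->lock);

    /* Conceptual blocking only; a real kernel would not spin here. */
    while (atomic_load_explicit(wf, memory_order_relaxed) == SLOT_WAITING)
        sched_yield();
    atomic_store_explicit(wf, SLOT_FREE, memory_order_relaxed);
    return 0;
}

/* Unblocks up to to_wake waiters and returns how many were unblocked. */
int model_futex_wake(struct model_futex *f, int to_wake)
{
    int res = 0;

    assert(to_wake >= 0);
    pthread_mutex_lock(&f->lock);
    for (int i = 0; i < MODEL_MAX_WAITERS && res < to_wake; i++) {
        atomic_int *slot = &f->wait_flag[i];
        if (atomic_load_explicit(slot, memory_order_relaxed) == SLOT_WAITING) {
            /* Relaxed store: no happens-before with the woken waiter. */
            atomic_store_explicit(slot, SLOT_WOKEN, memory_order_relaxed);
            res++;
        }
    }
    pthread_mutex_unlock(&f->lock);
    return res;
}
```

One thread would call model_futex_wait (&f, 0) while another changes
f.word and then calls model_futex_wake (&f, 1). All accesses to the wait
flags are relaxed on purpose, so the sketch, like the pseudo code, gives
no happens-before relation between the waker and the woken thread.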