Date: Mon, 4 Apr 2016 20:48:23 +0000 (UTC)
From: Mathieu Desnoyers
To: "H. Peter Anvin"
Cc: Andrew Morton, Russell King, Thomas Gleixner, Ingo Molnar,
    linux-kernel@vger.kernel.org, linux-api, Paul Turner, Andrew Hunter,
    Peter Zijlstra, Andy Lutomirski, Andi Kleen, Dave Watson,
    Chris Lameter, Ben Maurer, rostedt, "Paul E. McKenney", Josh Triplett,
    Linus Torvalds, Catalin Marinas, Will Deacon, Michael Kerrisk,
    Boqun Feng
Message-ID: <856357054.45028.1459802903401.JavaMail.zimbra@efficios.com>
In-Reply-To: <492303698.44994.1459799188052.JavaMail.zimbra@efficios.com>
References: <1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com>
    <1459789313-4917-2-git-send-email-mathieu.desnoyers@efficios.com>
    <5702A037.60200@zytor.com>
    <492303698.44994.1459799188052.JavaMail.zimbra@efficios.com>
Subject: Re: [RFC PATCH v6 1/5] Thread-local ABI system call: cache CPU number of running thread
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Apr 4, 2016, at 3:46 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Apr 4, 2016, at 1:11 PM, H.
Peter Anvin hpa@zytor.com wrote:
>
>> On 04/04/16 10:01, Mathieu Desnoyers wrote:
>>>
>>> Changes since v5:
>>> - Rename "getcpu_cache" to "thread_local_abi", allowing to extend
>>>   this system call to cover future features such as restartable critical
>>>   sections. Generalizing this system call ensures that we can add
>>>   features similar to the cpu_id field within the same cache-line
>>>   without having to track one pointer per feature within the task
>>>   struct.
>>> - Add a tlabi_nr parameter to the system call, thus allowing to extend
>>>   the ABI beyond the initial 64-byte structure by registering structures
>>>   with tlabi_nr greater than 0. The initial ABI structure is associated
>>>   with tlabi_nr 0.
>>> - Rebased on kernel v4.5.
>>>
>>
>> This seems absolutely insanely complex, both for the kernel and for
>> userspace.
>>
>> A much saner way would be for userspace to query the kernel for the size
>> of the structure; userspace then allocates the maximum of what it knows
>> and what the kernel knows. That way, the kernel doesn't need to
>> conditionalize its accesses to user space, and libc doesn't need to
>> conditionalize its accesses either.
>
> If we go down the route of having user-space dynamically allocating
> the structure, my understanding is that we need to associate the
> user-space TLS symbol with a pointer to the structure, and test for
> NULL each time, thus requiring user-space to touch one more cache-line
> (read the pointer), and add one conditional per user-space fast-path,
> compared to a statically-sized definition approach. Or perhaps you have
> some clever trick in mind for "allocation by user-space" that I'm missing ?
>
> Besides the NULL pointer check, another issue is feature detection.
> As we extend the feature set, my proposal has a 32-bit features
> mask at the beginning of the TLS structure, within the same
> cache-line containing the structure fields, so user-space can quickly
> check whether the required feature is enabled (adds one conditional
> on the user-space fast path, but does not require to touch another
> cache-line). This allows adding new features without requiring to
> reserve the value "0" within each field of the structure to mean
> "feature unavailable", which I find terminally unaesthetic.
>
> I propose here a fixed-size 64 bytes layout for the first structure,
> for which a 32-bit feature mask should be enough. If we ever fill
> up these 64 bytes, we can then use the following tlabi_nr number (1),
> which will define its own structure size and feature mask. This
> seems like a good compromise between fast-path speed, feature detection
> flexibility, optimal use of cache-lines, and extensibility.

Moreover, the feature sets that the application, glibc, and the kernel
each know about are three different things. My intent here is to have
glibc stay out of the way as much as possible, since this is really an
interface between various applications/libraries and the kernel.

Even if glibc allocates a structure large enough for the union of the
features it knows about and the features the kernel implements, the
application could be built against kernel headers that expose more
features than glibc knows about. With a dynamically allocated tlabi
structure, the application would therefore need a structure length
check, which adds one more branch to the fast path.

A statically-sized structure allows applications and libraries to skip
the pointer load, the NULL check, and the structure length check on the
user-space fast path.

Thanks,

Mathieu

>
> Thanks,
>
> Mathieu
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
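
For illustration only, the fast-path difference discussed in this
message can be sketched in C as follows. This is a minimal sketch under
stated assumptions, not the layout or API from the patch: every name
here (tlabi_sketch, TLABI_SKETCH_FEATURE_CPU_ID, tlabi_dyn, and both
accessors) is hypothetical, and the kernel side that would fill in the
fields is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical and chosen for illustration. */
#define TLABI_SKETCH_LEN            64
#define TLABI_SKETCH_FEATURE_CPU_ID (1u << 0)

/* Fixed-size, cache-line-aligned area: feature mask first, then fields. */
struct tlabi_sketch {
	_Alignas(64) uint32_t features;  /* bitmask of enabled features */
	int32_t cpu_id;                  /* assumed to be kept up to date by the kernel */
	uint8_t reserved[TLABI_SKETCH_LEN - 8];
};
static_assert(sizeof(struct tlabi_sketch) == TLABI_SKETCH_LEN,
	      "sketch structure must fill exactly 64 bytes");

/* Variant A: statically-sized TLS structure. */
static _Thread_local struct tlabi_sketch tlabi_static;

static inline int32_t cpu_id_static(void)
{
	/* One conditional; mask and field sit in the same cache line. */
	if (!(tlabi_static.features & TLABI_SKETCH_FEATURE_CPU_ID))
		return -1;
	return tlabi_static.cpu_id;
}

/* Variant B: dynamically allocated area reached through a TLS pointer. */
struct tlabi_dyn {
	size_t len;                 /* size actually allocated */
	struct tlabi_sketch *area;  /* NULL until registered */
};
static _Thread_local struct tlabi_dyn tlabi_dynamic;

static inline int32_t cpu_id_dynamic(void)
{
	struct tlabi_sketch *p = tlabi_dynamic.area;  /* extra pointer load */

	if (!p)                                       /* NULL check */
		return -1;
	if (tlabi_dynamic.len <
	    offsetof(struct tlabi_sketch, cpu_id) + sizeof(p->cpu_id))
		return -1;                            /* length check */
	return p->cpu_id;
}
```

In this sketch, variant A costs one branch on a cache line the caller
already touches, whereas variant B adds a pointer load and two more
branches before the field can be read.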