Date: Thu, 7 Apr 2016 18:43:12 +0000 (UTC)
From: Mathieu Desnoyers
To: Linus Torvalds
Cc: Peter Zijlstra, Florian Weimer, "H. Peter Anvin", Andrew Morton,
 Russell King, Thomas Gleixner, Ingo Molnar, Linux Kernel Mailing List,
 linux-api, Paul Turner, Andrew Hunter, Andy Lutomirski, Andi Kleen,
 Dave Watson, Chris Lameter, Ben Maurer, rostedt, "Paul E. McKenney",
 Josh Triplett, Catalin Marinas, Will Deacon, Michael Kerrisk, Boqun Feng
Message-ID: <1025228632.49344.1460054592801.JavaMail.zimbra@efficios.com>
References: <1459789313-4917-1-git-send-email-mathieu.desnoyers@efficios.com>
 <20160405164722.GB3430@twins.programming.kicks-ass.net>
 <570621E5.7060306@redhat.com>
 <20160407103158.GP3430@twins.programming.kicks-ass.net>
 <570638D9.7010108@redhat.com>
 <20160407111938.GR3430@twins.programming.kicks-ass.net>
Subject: Re: [RFC PATCH v6 1/5] Thread-local ABI system call: cache CPU number of running thread

----- On Apr 7, 2016, at 12:52 PM, Linus Torvalds torvalds@linux-foundation.org wrote:

> On Thu, Apr 7, 2016 at 9:39 AM, Linus Torvalds wrote:
>>
>> Because if not, then this discussion is done for.
>> Stop with the f*cking idiotic "let's look at some kernel size and
>> user-space size and try to match them up". The kernel doesn't care.
>> The kernel MUST NOT care. The kernel will touch one single word, and
>> that's all the kernel does, and user space had better be able make up
>> their own semantics around that.
>
> .. and btw - if people aren't sure that that is a "good enough"
> interface, then I'm sure as hell not going to merge that patch anyway.
> Andy mentions rseq. Yeah, I'm not going to merge anything where part
> of the discussion is "and we might want to do something else for X".
>
> Either the suggested patches are useful and generic enough that people
> can do this, or they aren't.
>
> If people can agree that "yes, this whole cpu id cache is a great
> interface that we can build up interesting user-space constructs
> around", then great. Such a new kernel interface may be worth merging.

One basic use of the cpu id cache is to speed up the sched_getcpu(3)
implementation in glibc. This is why I'm proposing it as a stand-alone
feature that does not require the restartable sequences. It can also
be used directly from applications to remove the function call overhead
of sched_getcpu, which further accelerates this operation.

>
> But if people cannot be convinced that it is sufficient, then I don't
> want to merge some half-arsed interface that generates these kinds of
> discussions.
>
> So the fact that currently makes me go "no way will I merge any of
> this" is the very fact that these discussions continue and are still
> going on.

The intent of this RFC patchset is to get people to agree on the proper
way to introduce both the "cpu id" and the "rseq (restartable critical
section)" features.

I have so far proposed two ways of doing it: one system call per
feature, or one system call to register all the features.

My previous patch rounds added a system call specific to the cpu_id
field, registering a pointer to a 32-bit per-thread integer.
(getcpu_cache system call)

Based on prior email exchanges I had with you on other topics, I was
inclined to go the specific getcpu_cache system call route, and to add
future features as separate system calls.

hpa pointed out that this would mean keeping track of one pointer per
task-struct for cpu_id, and eventually another pointer per task-struct
for the rseq fields, thus degrading cache locality. To address his
concerns, I proposed this "thread local ABI" system call, which
registers a fixed-size 64-byte structure that starts with a feature
mask.

The other route we could take is to implement just one "rseq" system
call, which would contain all the fields needed for the rseq feature,
which happen to include the cpu_id. The main downside of this approach
is that whenever we want to port the cpu_id feature to another
architecture, it _needs_ to come with an implementation of the "rseq"
feature too, which is considerably more complex. I don't mind going
that way either if that's preferred.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
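
For readers following the thread, here is a rough user-space sketch of the "thread local ABI" idea described above: a fixed-size 64-byte per-thread area that starts with a feature mask and carries the cached cpu_id. It is not taken from the patchset. The struct layout, the names thread_local_abi, TLABI_FEATURE_CPU_ID, cached_getcpu and simulate_kernel_update are all assumptions made for illustration, and no registration system call is invoked: the kernel side is stood in for by a plain function.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical feature bit; the name and value are illustrative only. */
#define TLABI_FEATURE_CPU_ID (1u << 0)

/*
 * Hypothetical layout of the fixed-size, 64-byte per-thread area.
 * It starts with a feature mask, as the email describes; every other
 * field name and offset is an assumption.
 */
struct thread_local_abi {
	uint32_t features;    /* bitmask of features enabled for this thread */
	int32_t cpu_id;       /* the single word the kernel would update */
	uint8_t reserved[56]; /* room for future features such as rseq */
};

_Static_assert(sizeof(struct thread_local_abi) == 64,
	       "the area must stay exactly 64 bytes");

/*
 * Stand-in for the kernel side, for demonstration only. In the proposed
 * design the kernel would store the current CPU number into the
 * registered area; here a plain function does it.
 */
static void simulate_kernel_update(struct thread_local_abi *area, int32_t cpu)
{
	assert(cpu >= 0);
	area->cpu_id = cpu;
	area->features |= TLABI_FEATURE_CPU_ID;
}

/*
 * Read the cached CPU number when the feature bit is set; otherwise
 * fall back to the regular sched_getcpu(3) call.
 */
static int cached_getcpu(const volatile struct thread_local_abi *area)
{
	if (area->features & TLABI_FEATURE_CPU_ID)
		return area->cpu_id;
	return sched_getcpu();
}
```

With such a layout, the fast path of cached_getcpu() is a load and a branch on memory the thread already owns, which is where the saving over a sched_getcpu function call would come from. Keeping the feature mask and cpu_id in one area also reflects the cache-locality argument above: one registered pointer per thread rather than one per feature.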