Date: Fri, 15 Dec 2017 16:52:22 +0000 (UTC)
From: Mathieu Desnoyers
To: Chris Lameter
Cc: Peter Zijlstra, "Paul E. McKenney", Boqun Feng, Andy Lutomirski,
    Dave Watson, linux-kernel, linux-api, Paul Turner, Andrew Morton,
    Russell King, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
    Andrew Hunter, Andi Kleen, Ben Maurer, rostedt, Josh Triplett,
    Linus Torvalds, Catalin Marinas, Will Deacon, Michael Kerrisk,
    Alexander Viro
Message-ID: <729438855.35910.1513356742518.JavaMail.zimbra@efficios.com>
References: <20171214161403.30643-1-mathieu.desnoyers@efficios.com>
    <12046460.34426.1513275177081.JavaMail.zimbra@efficios.com>
    <1537392285.34532.1513279478488.JavaMail.zimbra@efficios.com>
    <20171214212023.GJ3326@worktop>
Subject: Re: [RFC PATCH for 4.16 02/21] rseq: Introduce restartable sequences system call (v12)
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Dec 15, 2017, at 10:05 AM, Chris Lameter cl@linux.com wrote:

> On Thu, 14 Dec 2017, Peter Zijlstra wrote:
>
>> > But my company has extensive user space code that maintains a lot of
>> > counters and does other tricks to get full performance out of the
>> > hardware. Such a mechanism would also be good from user space. Why keep
>> > the good stuff only inside the kernel?
>>
>> Mathieu's proposal is for userspace, _only_ userspace.
>
> But what we were talking about are instructions that work effectively in
> kernel space whose efficiency restartable sequences could bring to user
> space.

It can be worthwhile to recap my understanding of this thread so far:
AFAIU, Chris' proposal is to use the "gs" segment selector as instruction
prefix on x86 rather than explicitly loading CPU number and calculating
offsets.

This can turn sequences of rseq operations like this cmpxchg:

Registers:
  R1: return value
  R2: expected value
  R3: new value
  R4: cpu_id

rseq cmpxchg:
  load TLS::cpu_id_start into R4
  calculate offset of v
  fs:mov (store rseq descriptor address into TLS::rseq_cs)
  compare R4 against TLS::cpu_id
  jne abort
  mov (load v into R1)
  compare R1 against R2
  jne cmpfail
  mov (store R3 into *v)

into:

  fs:mov (store rseq descriptor address into TLS::rseq_cs)
  gs:mov (load *v+off into R1)
  compare R1 against R2
  jne cmpfail
  gs:mov (store R3 into *v+off)

My first concern with this approach is the lack of flexibility of the
segment selector method wrt variety of schemes user-space has to deal
with for memory allocation. In the kernel, this is achieved by ensuring
that all per-cpu data layout is segment-selector-prefix friendly.

Another aspect that worries me is applications using the gs segment
selector for other purposes. Suddenly reserving the gs segment selector
for use by a library like glibc may lead to incompatibilities with
applications already using it.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
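
For illustration only, the two cmpxchg sequences in the message above can be modeled in plain C. This is a rough sketch under stated assumptions, not the rseq ABI and not code from the patch set: the names `percpu_area`, `cmpxchg_explicit`, `cmpxchg_segment`, `NR_CPUS` and the `RSEQ_*` result codes are all hypothetical, the store of the rseq descriptor into TLS::rseq_cs is left out, and nothing here is atomic or restartable. The `gs_base` argument merely stands in for a gs segment base that already points at the current CPU's data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical per-cpu data area; "v" is the variable being updated. */
struct percpu_area {
    intptr_t pad;
    intptr_t v;
};

static struct percpu_area areas[NR_CPUS];

enum { RSEQ_OK = 0, RSEQ_CMPFAIL = 1, RSEQ_ABORT = -1 };

/*
 * Model of the first sequence: explicit CPU number and offset calculation.
 * cpu_id_start stands in for TLS::cpu_id_start, cpu_id for TLS::cpu_id.
 */
static int cmpxchg_explicit(int cpu_id_start, int cpu_id,
                            intptr_t expect, intptr_t newv, intptr_t *old)
{
    assert(cpu_id_start >= 0 && cpu_id_start < NR_CPUS);
    intptr_t *v = &areas[cpu_id_start].v; /* calculate offset of v */

    if (cpu_id_start != cpu_id)
        return RSEQ_ABORT;                /* compare R4 / jne abort */
    *old = *v;                            /* load v into R1 */
    if (*old != expect)
        return RSEQ_CMPFAIL;              /* compare R1, R2 / jne cmpfail */
    *v = newv;                            /* store R3 into *v */
    return RSEQ_OK;
}

/*
 * Model of the second sequence: the base already selects the current
 * CPU's area, so there is no CPU number load and no cpu_id comparison.
 */
static int cmpxchg_segment(char *gs_base, size_t off,
                           intptr_t expect, intptr_t newv, intptr_t *old)
{
    intptr_t *v = (intptr_t *)(gs_base + off); /* address is base + off */

    *old = *v;                            /* gs:mov load */
    if (*old != expect)
        return RSEQ_CMPFAIL;
    *v = newv;                            /* gs:mov store */
    return RSEQ_OK;
}
```

In this model a caller would pass `(char *)&areas[cpu]` and `offsetof(struct percpu_area, v)` to `cmpxchg_segment`. That only works because every per-cpu area shares one layout, which is the layout constraint the first concern in the message is about.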