Date: Thu, 14 Dec 2017 12:50:13 -0600 (CST)
From: Christopher Lameter
To: Mathieu Desnoyers
cc: Peter Zijlstra, "Paul E. McKenney", Boqun Feng, Andy Lutomirski, Dave Watson, linux-kernel, linux-api, Paul Turner, Andrew Morton, Russell King, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Andrew Hunter, Andi Kleen, Ben Maurer, rostedt, Josh Triplett, Linus Torvalds, Catalin Marinas, Will Deacon, Michael Kerrisk, Alexander Viro
Subject: Re: [RFC PATCH for 4.16 02/21] rseq: Introduce restartable sequences system call (v12)
In-Reply-To: <12046460.34426.1513275177081.JavaMail.zimbra@efficios.com>
References: <20171214161403.30643-1-mathieu.desnoyers@efficios.com> <20171214161403.30643-3-mathieu.desnoyers@efficios.com> <12046460.34426.1513275177081.JavaMail.zimbra@efficios.com>

On Thu, 14 Dec 2017, Mathieu Desnoyers wrote:

> > I think the proper way to think about gs and fs on x86 is as base
> > registers. They are essentially values in registers added to the address
> > generated in an instruction. As such the approach is transferable to other
> > processor architectures. Many support base registers and base register
> > relative processing. If a processor can do RMW instructions base register
> > relative then you have something similar.
>
> How would you do it on ARM32 ?

Actually you do not really need RMW instructions. The data is cpu specific,
so within a restartable sequence you would have exclusive access, right?

F.e. an increment would be

1. Load base register relative
2. add 1
3. Store base register relative

The main overhead would be the registration of the sequence.

The advantage on x86 is that you do not need a restartable sequence, since a
single lockless RMW instruction can do this (this_cpu_inc f.e.)

> One benefit of your proposal is to lessen the number of retired instructions,
> but if we take the IPC into account, it is slower than rseq in my benchmark. What
> benefits do you expect from using segment selectors and non-lock-prefixed atomic
> instructions on the fast-path ?

Ultimately I wish fast increments like those done by this_cpu_inc() could be
implemented in an efficient way on non-x86 platforms that do not have cheap
instructions like that.

If cmpxchg local is slower than a group of instructions doing the same thing,
then there is an obvious question for the cpu architects: why would we need the
instruction at all (aside from the fact that we do not need a restartable
sequence for these instructions)?
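
For illustration only, the three-step increment described above (load base
register relative, add 1, store base register relative) might look roughly
like the C sketch below. This is not code from the rseq patch set. The names
percpu_area, percpu_base, percpu_inc and SKETCH_NR_CPUS are hypothetical, the
per-cpu area is a plain array rather than something addressed through gs/fs
or a dedicated base register, and nothing here registers a restartable
sequence. On its own the sequence is therefore not safe against preemption or
migration; it only shows the shape of the load/add/store that would sit
inside such a sequence.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4	/* hypothetical cpu count, for illustration only */

/* Hypothetical per-cpu area; padded so each cpu's slot is 64 bytes. */
struct percpu_area {
	uint64_t counter;
	char pad[56];
};

static struct percpu_area percpu[SKETCH_NR_CPUS];

/*
 * Stand-in for the base register: returns the start of the given cpu's
 * area. On real hardware this would be gs/fs on x86 or some other base
 * register, not a function call.
 */
static struct percpu_area *percpu_base(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return &percpu[cpu];
}

/*
 * The three steps from the mail. Assumption: the caller runs this inside
 * a registered restartable sequence, so that it is restarted if the task
 * is preempted or migrated between the load and the store.
 */
static uint64_t percpu_inc(unsigned int cpu)
{
	struct percpu_area *base = percpu_base(cpu);
	uint64_t v;

	v = base->counter;	/* 1. load base register relative */
	v = v + 1;		/* 2. add 1 */
	base->counter = v;	/* 3. store base register relative */
	return v;
}
```

Calling percpu_inc(0) twice and percpu_inc(1) once leaves cpu 0's counter at 2
and cpu 1's at 1, since each cpu only ever touches its own slot. A compiler is
free to fuse the three steps into a single instruction where the target has
one, which is the x86 case discussed above.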