Date: Thu, 7 Apr 2016 17:53:12 +0200
From: Peter Zijlstra
To: Andy Lutomirski
Cc: Mathieu Desnoyers, "Paul E. McKenney", Ingo Molnar, Paul Turner, Andi Kleen, Chris Lameter, Dave Watson, Josh Triplett, Linux API, "linux-kernel@vger.kernel.org", Andrew Hunter, Linus Torvalds
Subject: Re: [RFC PATCH 0/3] restartable sequences v2: fast user-space percpu critical sections
Message-ID: <20160407155312.GA3448@twins.programming.kicks-ass.net>
References: <20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com> <20160407120254.GY3448@twins.programming.kicks-ass.net> <20160407152432.GZ3448@twins.programming.kicks-ass.net>

On Thu, Apr 07, 2016 at 08:44:38AM -0700, Andy Lutomirski wrote:
> On Thu, Apr 7, 2016 at 8:24 AM, Peter Zijlstra wrote:
> > On Thu, Apr 07, 2016 at 07:35:26AM -0700, Andy Lutomirski wrote:
> >> What I meant was: rather than shoving individual values into the TLABI
> >> thing, shove in a pointer:
> >>
> >> struct commit_info {
> >>   u64 post_commit_rip;
> >>   u32 cpu;
> >>   u64 *event;
> >>   // whatever else;
> >> };
> >>
> >> and then put a commit_info* in TLABI.
> >>
> >> This would save some bytes in the TLABI structure.
> >
> > But would cost us extra indirections. The whole point was getting this
> > stuff at a constant offset from the TLS segment register.
>
> I don't think the extra indirections would matter much.
> The kernel would have to chase the pointer, but only in the very rare
> case where it resumes userspace during a commit or on the immediately
> following instruction.

It's about userspace finding these values, not the kernel.

> At the very least, post_commit_rip and the abort address (which I
> forgot about) could both live in a static structure,

Paul keeps the abort address in rcx.

> and shoving a
> pointer to *that* into TLABI space is one store instead of two.

> > Ah, so what happens if the signal happens before the commit but after
> > the load of the seqcount?
> >
> > Then, even if the signal modifies the count, we'll not observe.
>
> Where exactly?
>
> In my scheme, nothing except the kernel ever loads the seqcount. The
> user code generates a fresh value, writes it to memory, and then, just
> before commit, writes that same value to the TLABI area and then
> double-checks that the value it wrote at the beginning is still there.
>
> If the signal modifies the count, then the user code won't directly
> notice, but prepare_exit_to_usermode on the way out of the signal will
> notice that the (restored) TLABI state doesn't match the counter that
> the signal handler changed and will jump to the abort address.

OK, you lost me.. commit looks like:

+	__asm__ __volatile__ goto (
+		"movq $%l[failed], %%rcx\n"
+		"movq $1f, %[commit_instr]\n"
+		"cmpq %[start_value], %[current_value]\n"

If we get preempted/signaled here without the preemption/signal entry
checking for the post_commit_instr, we'll fail hard.

+		"jnz %l[failed]\n"
+		"movq %[to_write], (%[target])\n"
+		"1: movq $0, %[commit_instr]\n"
+		: /* no outputs */
+		: [start_value]"d"(start_value.storage),
+		  [current_value]"m"(__rseq_state),
+		  [to_write]"r"(to_write),
+		  [target]"r"(p),
+		  [commit_instr]"m"(__rseq_state.post_commit_instr)
+		: "rcx", "memory"
+		: failed
+	);
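
To make the window above concrete, here is a rough userspace model of the entry-side check, not kernel code and not taken from the patch. The names struct rseq_model and model_interrupt are made up for illustration, and the rule "restart at the abort address if a commit is in flight and ip is still before post_commit_instr" is my reading of the scheme under discussion, so treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-thread state; field names are illustrative. */
struct rseq_model {
	uint64_t event_counter;     /* bumped on every preemption/signal */
	uint64_t post_commit_instr; /* nonzero while a commit is in flight */
};

/*
 * Models a preemption/signal arriving while userspace sits at 'ip'.
 * Returns the address userspace should resume at.
 * Assumption: the abort address is passed in 'abort_ip' (the thread
 * above says Paul keeps it in rcx).
 */
static uint64_t model_interrupt(struct rseq_model *s, uint64_t ip,
				uint64_t abort_ip)
{
	assert(s != NULL);

	/* The event makes any earlier compare result stale. */
	s->event_counter++;

	/* Commit in flight and the store not yet passed: restart. */
	if (s->post_commit_instr != 0 && ip < s->post_commit_instr)
		return abort_ip;

	/* No commit in flight, or already at/after the commit point. */
	return ip;
}
```

With post_commit_instr set to 200, an event at ip 150 (between the cmpq and the jnz, say) resumes at the abort address, while an event at ip 200 resumes in place. Without the ip check, the first case would resume at 150 with the flags from the stale compare and go on to do the store.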