Date: Thu, 29 Jun 2017 11:59:27 -0400 (EDT)
From: Alan Stern
To: Will Deacon
cc: Linus Torvalds, Paul McKenney, Andrea Parri, Linux Kernel Mailing List, Stanisław Drozd, Arnd Bergmann, Thomas Gleixner, Peter Zijlstra, Josh Triplett, Nicolas Pitre, Krister Johansen, Vegard Nossum, Wu Fengguang, Frederic Weisbecker, Rik van Riel, Steven Rostedt, Ingo Molnar, Luc Maranget, Jade Alglave
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
In-Reply-To: <20170629113848.GA18630@arm.com>

On Thu, 29 Jun 2017, Will Deacon wrote:

> [turns out I've not been on cc for this thread, but Jade pointed me to it
> and I see my name came up at some point!]
>
> On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> > On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney wrote:
> > >
> > > Linus, are you dead-set against defining spin_unlock_wait() to be
> > > spin_lock + spin_unlock?  For example, is the current x86 implementation
> > > of spin_unlock_wait() really a non-negotiable hard requirement?  Or
> > > would you be willing to live with the spin_lock + spin_unlock semantics?
> >
> > So I think the "same as spin_lock + spin_unlock" semantics are kind of insane.
> >
> > One of the issues is that the same as "spin_lock + spin_unlock" is
> > basically now architecture-dependent.
> > Is it really the
> > architecture-dependent ordering you want to define this as?
> >
> > So I just think it's a *bad* definition. If somebody wants something
> > that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> > do *THAT*. It's completely pointless to me to define
> > spin_unlock_wait() in those terms.
> >
> > And if it's not equivalent to the *architecture* behavior of
> > spin_lock+spin_unlock, then I think it should be described in terms
> > that aren't about the architecture implementation (so you shouldn't
> > describe it as "spin_lock+spin_unlock"; you should describe it in
> > terms of memory barrier semantics).
> >
> > And if we really have to use the spin_lock+spin_unlock semantics for
> > this, then what is the advantage of spin_unlock_wait at all, if it
> > doesn't fundamentally avoid some locking overhead of just taking the
> > spinlock in the first place?
>
> Just on this point -- the arm64 code provides the same ordering semantics
> as you would get from a lock;unlock sequence, but we can optimise that
> when compared to an actual lock;unlock sequence because we don't need to
> wait in turn for our ticket. I suspect something similar could be done
> if/when we move to qspinlocks.
>
> Whether or not this is actually worth optimising is another question, but
> it is worth noting that unlock_wait can be implemented more cheaply than
> lock;unlock, whilst providing the same ordering guarantees (if that's
> really what we want -- see my reply to Paul).
>
> Simplicity tends to be my preference, so ripping this out would suit me
> best ;)

It would be best to know:

(1). How spin_unlock_wait() is currently being used.

(2). What it was originally intended for.

Paul has done some research into (1). He can correct me if I get this
wrong...  Only a few (i.e., around one or two) of the usages don't seem
to require the full spin_lock+spin_unlock semantics.
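For concreteness, the "spin_lock + spin_unlock" definition Paul asked
about can be modeled in user space. This is only an illustrative
sketch: a pthread_mutex_t stands in for the kernel's spinlock_t, and
spin_unlock_wait_model() is a made-up name, not the kernel's code.

```c
#include <pthread.h>

/*
 * Hypothetical user-space model of the proposed definition:
 * spin_unlock_wait(l) behaves as spin_lock(l) immediately followed
 * by spin_unlock(l).  The mutex is a stand-in for spinlock_t.
 */
static void spin_unlock_wait_model(pthread_mutex_t *lock)
{
	/* Wait until any current holder releases the lock... */
	pthread_mutex_lock(lock);
	/* ...then drop it at once: the critical section is empty,
	 * but the acquire and release ordering of a real
	 * lock;unlock pair is preserved. */
	pthread_mutex_unlock(lock);
}
```

On this model the caller is ordered after every critical section that
completed before the call returned, which is the "full
spin_lock+spin_unlock semantics" in question.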
I go along with Linus; the places which really do want it to behave
like spin_lock+spin_unlock should simply use spin_lock+spin_unlock.
There hasn't been any indication so far that the possible efficiency
improvement Will mentions is at all important.

According to Paul, most of the other places don't need anything more
than the acquire guarantee (any changes made in earlier critical
sections will be visible to the code following spin_unlock_wait).  In
which case, the semantics of spin_unlock_wait could be redefined in
this simpler form.

Or we could literally replace all the existing definitions with
spin_lock+spin_unlock.  Would that be so terrible?

As regards (2), I did a little digging.  spin_unlock_wait was
introduced in the 2.1.36 kernel, in mid-April 1997.  I wasn't able to
find a specific patch for it in the LKML archives.  At the time it was
used in only one place in the entire kernel (in kernel/exit.c):

void release(struct task_struct * p)
{
	int i;

	if (!p)
		return;
	if (p == current) {
		printk("task releasing itself\n");
		return;
	}
	for (i=1 ; i<NR_TASKS ; i++)
		if (task[i] == p) {
#ifdef __SMP__
			while (p->processor != NO_PROC_ID)
				barrier();
			spin_unlock_wait(&scheduler_lock);
#endif
			nr_tasks--;
			task[i] = NULL;
			REMOVE_LINKS(p);
			release_thread(p);
			if (STACK_MAGIC != *(unsigned long *)p->kernel_stack_page)
				printk(KERN_ALERT "release: %s kernel stack corruption. Aiee\n", p->comm);
			free_kernel_stack(p->kernel_stack_page);
			current->cmin_flt += p->min_flt + p->cmin_flt;
			current->cmaj_flt += p->maj_flt + p->cmaj_flt;
			current->cnswap += p->nswap + p->cnswap;
			free_task_struct(p);
			return;
		}
	panic("trying to release non-existent task");
}

I'm not entirely clear on the point of this call.  It looks like it
wanted to wait until p was guaranteed not to be running on any
processor ever again.  (I don't see why it couldn't have just acquired
the scheduler_lock -- was release() a particularly hot path?)

Although it doesn't matter now, this would mean that the original
semantics of spin_unlock_wait were different from what we are
discussing.
It apparently was meant to provide the release guarantee: any future
critical sections would see the values that were visible before the
call.  Ironic.

Alan Stern