Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751644AbbBKBTN (ORCPT ); Tue, 10 Feb 2015 20:19:13 -0500 Received: from claw.goop.org ([74.207.240.146]:54188 "EHLO claw.goop.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750861AbbBKBTL (ORCPT ); Tue, 10 Feb 2015 20:19:11 -0500 Message-ID: <54DAADEE.6070506@goop.org> Date: Tue, 10 Feb 2015 17:18:38 -0800 From: Jeremy Fitzhardinge User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Oleg Nesterov , Raghavendra K T CC: Linus Torvalds , Sasha Levin , Davidlohr Bueso , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Peter Anvin , Konrad Rzeszutek Wilk , Paolo Bonzini , Paul McKenney , Waiman Long , Dave Jones , the arch/x86 maintainers , Paul Gortmaker , Andi Kleen , Jason Wang , Linux Kernel Mailing List , KVM list , virtualization , xen-devel@lists.xenproject.org, Rik van Riel , Christian Borntraeger , Andrew Morton , Andrey Ryabinin Subject: Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions References: <1423234148-13886-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com> <54D7D19B.1000103@goop.org> <54D87F1E.9060307@linux.vnet.ibm.com> <20150209120227.GT21418@twins.programming.kicks-ass.net> <54D9CFC7.5020007@linux.vnet.ibm.com> <20150210132634.GA30380@redhat.com> In-Reply-To: <20150210132634.GA30380@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2875 Lines: 73 On 02/10/2015 05:26 AM, Oleg Nesterov wrote: > On 02/10, Raghavendra K T wrote: >> On 02/10/2015 06:23 AM, Linus Torvalds wrote: >> >>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >>> if (READ_ONCE(lock->tickets.tail) & TICKET_SLOWPATH_FLAG) .. >>> >>> into something like >>> >>> val = xadd((&lock->ticket.head_tail, TICKET_LOCK_INC << TICKET_SHIFT); >>> if (unlikely(val & TICKET_SLOWPATH_FLAG)) ... >>> >>> would be the right thing to do. Somebody should just check that I got >>> that shift right, and that the tail is in the high bytes (head really >>> needs to be high to work, if it's in the low byte(s) the xadd would >>> overflow from head into tail which would be wrong). >> Unfortunately xadd could result in head overflow as tail is high. >> >> The other option was repeated cmpxchg which is bad I believe. >> Any suggestions? > Stupid question... what if we simply move SLOWPATH from .tail to .head? > In this case arch_spin_unlock() could do xadd(tickets.head) and check > the result Well, right now, "tail" is manipulated by locked instructions by CPUs who are contending for the ticketlock, but head can be manipulated unlocked by the CPU which currently owns the ticketlock. If SLOWPATH moved into head, then non-owner CPUs would be touching head, requiring everyone to use locked instructions on it. That's the theory, but I don't see much (any?) code which depends on that. Ideally we could find a way so that pv ticketlocks could use a plain unlocked add for the unlock like the non-pv case, but I just don't see a way to do it. > In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg > the whole .head_tail. Plus obviously more boring changes. This needs a > separate patch even _if_ this can work. Definitely. > BTW. If we move "clear slowpath" into "lock" path, then probably trylock > should be changed too? Something like below, we just need to clear SLOWPATH > before cmpxchg. How important / widely used is trylock these days? J > > Oleg. > > --- x/arch/x86/include/asm/spinlock.h > +++ x/arch/x86/include/asm/spinlock.h > @@ -109,7 +109,8 @@ static __always_inline int arch_spin_try > if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG)) > return 0; > > - new.head_tail = old.head_tail + (TICKET_LOCK_INC << TICKET_SHIFT); > + new.tickets.head = old.tickets.head; > + new.tickets.tail = (old.tickets.tail & ~TICKET_SLOWPATH_FLAG) + TICKET_LOCK_INC; > > /* cmpxchg is a full barrier, so nothing can move before it */ > return cmpxchg(&lock->head_tail, old.head_tail, new.head_tail) == old.head_tail; > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/