Date: Thu, 12 Feb 2015 15:18:19 +0100
From: Oleg Nesterov
To: Jeremy Fitzhardinge
Cc: Raghavendra K T, Linus Torvalds, Sasha Levin, Davidlohr Bueso,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Peter Anvin,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Paul McKenney, Waiman Long,
	Dave Jones, the arch/x86 maintainers, Paul Gortmaker, Andi Kleen,
	Jason Wang, Linux Kernel Mailing List, KVM list, virtualization,
	xen-devel@lists.xenproject.org, Rik van Riel, Christian Borntraeger,
	Andrew Morton, Andrey Ryabinin
Subject: Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions
Message-ID: <20150212141819.GA11633@redhat.com>
In-Reply-To: <54DBE27C.8050105@goop.org>
References: <1423234148-13886-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com>
	<54D7D19B.1000103@goop.org>
	<54D87F1E.9060307@linux.vnet.ibm.com>
	<20150209120227.GT21418@twins.programming.kicks-ass.net>
	<54D9CFC7.5020007@linux.vnet.ibm.com>
	<20150210132634.GA30380@redhat.com>
	<54DAADEE.6070506@goop.org>
	<20150211172434.GA28689@redhat.com>
	<54DBE27C.8050105@goop.org>

On 02/11, Jeremy Fitzhardinge wrote:
>
> On 02/11/2015 09:24 AM, Oleg Nesterov wrote:
> > I agree, and I have to admit I am not sure I fully understand why
> > unlock uses the locked add. Except we need a barrier to avoid the race
> > with the enter_slowpath() users, of course. Perhaps this is the only
> > reason?
>
> Right now it needs to be a locked operation to prevent read-reordering.
> x86 memory ordering rules state that all writes are seen in a globally
> consistent order, and are globally ordered wrt reads *on the same
> addresses*, but reads to different addresses can be reordered wrt writes.
>
> So, if the unlocking add were not a locked operation:
>
>         __add(&lock->tickets.head, TICKET_LOCK_INC);    /* not locked */
>
>         if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
>                 __ticket_unlock_slowpath(lock, prev);
>
> Then the read of lock->tickets.tail can be reordered before the unlock,
> which introduces a race:

Yes, yes, thanks, but this is what I meant. We need a barrier. Even if
"Every store is a release" as Linus mentioned. (The problematic
interleaving is spelled out below.)

> This *might* be OK, but I think it's on dubious ground:
>
>         __add(&lock->tickets.head, TICKET_LOCK_INC);    /* not locked */
>
>         /* read overlaps write, and so is ordered */
>         if (unlikely(lock->head_tail & (TICKET_SLOWPATH_FLAG << TICKET_SHIFT)))
>                 __ticket_unlock_slowpath(lock, prev);
>
> because I think Intel and AMD differed in interpretation about how
> overlapping but different-sized reads & writes are ordered (or it simply
> isn't architecturally defined).

Can't comment, I simply do not know how the hardware works.

> If the slowpath flag is moved to head, then it would always have to be
> locked anyway, because it needs to be atomic against other CPUs' RMW
> operations setting the flag.

Yes, this is true.
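To spell out the problematic interleaving from Jeremy's first snippet
above (my own illustration of the scenario, assuming the usual kvm/xen
slowpath which parks the vCPU after setting the flag; this is not text
from the patch):

	CPU0 (unlock, plain __add)	CPU1 (lock slowpath)
	--------------------------	--------------------
	read tickets.tail		  (the read is hoisted above the
					   store; SLOWPATH not yet set)
					locked-or: set SLOWPATH in tail
					re-check head: lock still held
					halt, wait for the kick
	store tickets.head += TICKET_LOCK_INC
	stale tail had no SLOWPATH
	  => __ticket_unlock_slowpath() is skipped

CPU1 now sleeps while the lock is free, a lost wakeup. The locked add
(or any full barrier between the store to head and the read of tail)
closes this window.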
But again, if we want to avoid the read-after-unlock, we need to update
the lock and read SLOWPATH atomically; it seems that we can't avoid the
locked insn.

Oleg.
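P.S. For concreteness, a minimal sketch of what "update the lock and read
SLOWPATH atomically" could look like, assuming the flag sits in ->head as
Jeremy mentions above. This only illustrates the single-locked-insn idea,
it is not the patch under discussion, and the function name is made up;
xadd() and __ticket_unlock_kick() are the existing helpers:

	static __always_inline void __ticket_unlock_sketch(arch_spinlock_t *lock)
	{
		/*
		 * One locked RMW both releases the lock and returns the
		 * old head, so the lock word is never read again after
		 * another CPU may have taken (and even freed) it.
		 * TICKET_LOCK_INC does not touch the SLOWPATH bit.
		 */
		__ticket_t old = xadd(&lock->tickets.head, TICKET_LOCK_INC);

		if (unlikely(old & TICKET_SLOWPATH_FLAG))
			/* kick the new owner's ticket */
			__ticket_unlock_kick(lock, (old & ~TICKET_SLOWPATH_FLAG) +
						   TICKET_LOCK_INC);
	}

If the flag stays in ->tail, the same idea needs the RMW to cover the
whole ->head_tail word (say, a cmpxchg loop) rather than ->head alone.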