Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932236AbaFBVNU (ORCPT ); Mon, 2 Jun 2014 17:13:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:63589 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752169AbaFBVNT (ORCPT ); Mon, 2 Jun 2014 17:13:19 -0400 Date: Mon, 2 Jun 2014 17:12:16 -0400 (EDT) From: Mikulas Patocka X-X-Sender: mpatocka@file01.intranet.prod.int.rdu2.redhat.com To: Linus Torvalds cc: Peter Zijlstra , "James E.J. Bottomley" , Helge Deller , John David Anglin , Parisc List , Linux Kernel Mailing List , Paul McKenney , "Vinod, Chegu" , Waiman Long , Thomas Gleixner , Rik van Riel , Andrew Morton , Davidlohr Bueso , Peter Anvin , Andi Kleen , "Chandramouleeswaran, Aswin" , "Norton, Scott J" , Jason Low Subject: Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks In-Reply-To: Message-ID: References: <20140602162525.GH16155@laptop.programming.kicks-ass.net> User-Agent: Alpine 2.02 (LRH 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2 Jun 2014, Linus Torvalds wrote: > On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka wrote: > > > > And what else do you want to do? > > > > Peter Zijlstra said "I've been using xchg() and cmpxchg() without such > > consideration for quite a while." - so it basically implies that the > > kernel is full of such races, mcs_spinlock is just the most visible one > > that crashes the kernel first. > > .. so your whole argument is bogus, because it doesn't actually fix > anything else. > > Now, something that *would* fix something else is (for example) to > just make "ACCESS_ONCE()" a rvalue so that you cannot use it for > assignments, and then trying to sort out what happens then. It's > possible that the "atomic_pointer_t" would be a part of the solution > to that "what happens then", but THERE IS NO WAY IN HELL we're adding > it for just one architecture and one use that doesn't warrant even > _existing_ on that architecture. The patch adds atomic_pointer_t for all architectures - it is in the common code and it is backed by atomic_long_t (that already exists for all architectures). There is no new arch-specific code at all. When we have atomic_pointer_t, we can find the instances of xchg() and cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t types). When we convert them all, we can drop xchg() and cmpxchg() at all (at least from architecture-neutral code). The problem with xchg() and cmpxchg() is that they are very easy to misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal stores, a lot of other people don't know it too - and if we allow these functions to be used, this race condition will reappear in the future again and again. That's why I'm proposing atomic_pointer_t - it guarantees that this race condition can't be made. > See what I'm saying? > > You're not fixing the problem, you're fixing one unimportant detail > that isn't worth fixing that way. > > Linus Regarding reworking ACCESS_ONCE() for reads and writes - the problem is - how do you make people use it? ACCESS_ONCE() is already missing at a lot of places (it doesn't cause any visible bug on the condition that the compiler doesn't split the load or store to multiple accesses), I can assume that people will omit ATOMIC_ONCE_STORE() too even if we make it. Mikulas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/