Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932802AbcCBCcH (ORCPT ); Tue, 1 Mar 2016 21:32:07 -0500 Received: from mail-ob0-f174.google.com ([209.85.214.174]:35353 "EHLO mail-ob0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757230AbcCBCb7 (ORCPT ); Tue, 1 Mar 2016 21:31:59 -0500 MIME-Version: 1.0 In-Reply-To: <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com> References: <1456746093-10578-1-git-send-email-nasa4836@gmail.com> <20160229222209.GD7499@dvhart-mobl5.amr.corp.intel.com> <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com> From: Jianyu Zhan Date: Wed, 2 Mar 2016 10:31:18 +0800 Message-ID: Subject: Re: [PATCH] futex: replace bare barrier() with a READ_ONCE() To: Darren Hart Cc: LKML , Thomas Gleixner , dave@stgolabs.net, Peter Zijlstra , Andrew Morton , Ingo Molnar , dvhart@linux.intel.com, bigeasy@linutronix.de, Paul McKenney Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5169 Lines: 146 On Wed, Mar 2, 2016 at 9:37 AM, Darren Hart wrote: > read_once will use a *volatile assignment instead of calling barrier() > directly for a word size argument. > > With weak statements like "apparently" (above) and "could be" (from the original > post: This patch replaces this bare barrier() with a READ_ONCE(), a weaker form > of barrier(), which could be more informative.)" I do not see a compelling > reason to change what is notoriously fragile code with respect to barriers. > Fair enough. The "apparently" wording is lame without assembly code evidence. I do not have such a s390 machine, so I have cc''ed the original author incorporating this barrier(), hopefully he could help test this. By "could be more informative" I mean a READ_ONCE has explanatory effect by its name, instead of a bare barrier() without more inline comment for why. As I said the retry code logic is effectively the same as while (lock_ptr = READ_ONCE(q->lock_ptr)) { spin_lock(lock_ptr) if (unlikely(lock_ptr != q->lock_ptr)) { spin_unlock(lock_ptr); } ... } in which case READ_ONCE could perfectly fit, in that it will make compiler only read it once in every testing condition, which will eliminate the original problem that commit e91467ecd1ef addressed, though assembly code proof is needed. Actually, the quotation I argued in previous mail could be used again here, from memory-barriers.txt: (*) The compiler is within its rights to merge successive loads from the same variable. Such merging can cause the compiler to "optimize" the following code: while (tmp = a) do_something_with(tmp); into the following code, which, although in some sense legitimate for single-threaded code, is almost certainly not what the developer intended: if (tmp = a) for (;;) do_something_with(tmp); Use READ_ONCE() to prevent the compiler from doing this to you: while (tmp = READ_ONCE(a)) do_something_with(tmp); I might be wrong, so I have cc'ed Paul, and Peter, I wish they give comment :-) > As for #2... > >> 2. For the second problem I memtioned, yes, it is theoretical, and >> it is also due to q->lock_ptr can change between >> the test of nullness of q->lock_ptr and the lock on q->lock_ptr. >> >> the code is >> >> retry: >> lock_ptr = q->lock_ptr; >> if (lock_ptr != 0) { >> spin_lock(lock_ptr) >> if (unlikely(lock_ptr != q->lock_ptr)) { >> spin_unlock(lock_ptr); >> goto retry; >> } >> ... >> } >> >> which is effectively the same as : >> >> while (lock_ptr = q->lock_ptr) { >> spin_lock(lock_ptr) >> if (unlikely(lock_ptr != q->lock_ptr)) { >> spin_unlock(lock_ptr); >> goto retry; >> } >> ... >> } >> >> This might cause the compiler load the q->lock_ptr once and use it >> many times, quoted from >> memory-barriers.txt: >> >> >> (*) The compiler is within its rights to reload a variable, for example, >> in cases where high register pressure prevents the compiler from >> keeping all data of interest in registers. The compiler might >> therefore optimize the variable 'tmp' out of our previous example: >> >> while (tmp = a) >> do_something_with(tmp); >> >> This could result in the following code, which is perfectly safe in >> single-threaded code, but can be fatal in concurrent code: >> >> while (a) >> do_something_with(a); >> >> For example, the optimized version of this code could result in >> passing a zero to do_something_with() in the case where the variable >> a was modified by some other CPU between the "while" statement and >> the call to do_something_with(). >> >> Again, use READ_ONCE() to prevent the compiler from doing this: >> >> while (tmp = READ_ONCE(a)) >> do_something_with(tmp); >> > > OK, so this is really the meat of the argument for the patch. You are looking to > add a compiler barrier instead of a CPU memory barrier. This should be what your > commit message is focused on and it should provide compelling evidence to > justify risking the change. > > Compelling evidence for a compiler barrier would be a disassembly of the code > block before and after, demonstrating the compiler generating broken code and the > compiler generating correct code. > > In addition to this, however, I would want to see a strong convincing argument > that the READ_ONCE volatile cast is sufficient to cover the original case which > motivated the addition of the barrier() (not "apparently" and "could be"). As for #2, this is pure theoretical induction from this quotation, I do have no convincing argument and again I would like Paul to help clarify this. Thanks very much! Regards, Jianyu Zhan