Date: Tue, 1 Mar 2016 22:39:14 -0800
From: Darren Hart <dvhart@infradead.org>
To: Jianyu Zhan <nasa4836@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
        dave@stgolabs.net, Peter Zijlstra <peterz@infradead.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Ingo Molnar <mingo@kernel.org>, dvhart@linux.intel.com,
        bigeasy@linutronix.de, Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] futex: replace bare barrier() with a READ_ONCE()
Message-ID: <20160302063914.GA19608@dvhart-mobl5.amr.corp.intel.com>
References: <1456746093-10578-1-git-send-email-nasa4836@gmail.com>
 <20160229222209.GD7499@dvhart-mobl5.amr.corp.intel.com>
 <CAHz2CGXV+vhLYqAScaspqQbXQ6DJsw9MCE+Vkbt0ch0J=y5M0w@mail.gmail.com>
 <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com>
 <CAHz2CGVjUmXKDmJ5xC=mgF6GoS=F_NPNr_F4G46vbJb0Yy2Esg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAHz2CGVjUmXKDmJ5xC=mgF6GoS=F_NPNr_F4G46vbJb0Yy2Esg@mail.gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 6537
Lines: 176

On Wed, Mar 02, 2016 at 10:31:18AM +0800, Jianyu Zhan wrote:
> On Wed, Mar 2, 2016 at 9:37 AM, Darren Hart <dvhart@infradead.org> wrote:
> > read_once will use a *volatile assignment instead of calling barrier()
> > directly for a word size argument.
> >
> > With weak statements like "apparently" (above) and "could be" (from the original
> > post: This patch replaces this bare barrier() with a READ_ONCE(), a weaker form
> > of barrier(), which could be more informative.)" I do not see a compelling
> > reason to change what is notoriously fragile code with respect to barriers.
> >
> 
> 
> Fair enough.  The "apparently" wording is lame without assembly code evidence.
> I do not have such a s390 machine, so I have cc''ed the original
> author incorporating this barrier(),
> hopefully he could help test  this.
> 
> By "could be more informative"  I mean a READ_ONCE has explanatory
> effect by its name, instead of
> a bare barrier() without more inline comment for why.
> 
> As I said the retry code logic is effectively the same as
> 
> while (lock_ptr = READ_ONCE(q->lock_ptr)) {
>                    spin_lock(lock_ptr)
>                    if (unlikely(lock_ptr != q->lock_ptr)) {
>                          spin_unlock(lock_ptr);
>                    }
> 
>                    ...
> }
> 
> in which case READ_ONCE could perfectly fit, in that it will make
> compiler only read it once in every testing condition,
> which will eliminate the original problem that commit e91467ecd1ef
> addressed, though assembly code proof is needed.
> 
> Actually, the quotation I argued in previous mail could be used again
> here, from memory-barriers.txt:
> 
>  (*) The compiler is within its rights to merge successive loads from
>      the same variable.  Such merging can cause the compiler to "optimize"
>      the following code:
> 
>         while (tmp = a)
>                 do_something_with(tmp);
> 
>      into the following code, which, although in some sense legitimate
>      for single-threaded code, is almost certainly not what the developer
>      intended:
> 
>         if (tmp = a)
>                 for (;;)
>                         do_something_with(tmp);
> 
>      Use READ_ONCE() to prevent the compiler from doing this to you:
> 
>         while (tmp = READ_ONCE(a))
>                 do_something_with(tmp);
> 
> 
> I might be wrong, so I have cc'ed Paul, and Peter,  I wish they give comment :-)

I believe you are correct with respect to the retry and while condition being an
appropriate place for the application of READ_ONCE. The question is why is this
preferred to the existing barrier()? I suggest:

While barrier() is a fairly brute force general application of a compiler
barrier, READ_ONCE() is very specific, it targets only operations dealing with
the specified variable. As such, its application both clearly identifies the
volatile variable and frees the compiler to make optimizations a more general
barrier would forbid.


> 
> 
> > As for #2...
> >
> >> 2.  For the second problem I memtioned,  yes, it is theoretical,  and
> >> it is also due to  q->lock_ptr can change between
> >> the test of nullness of q->lock_ptr and the lock on q->lock_ptr.
> >>
> >> the code is
> >>
> >> retry:
> >>        lock_ptr = q->lock_ptr;
> >>        if (lock_ptr != 0)  {
> >>                   spin_lock(lock_ptr)
> >>                   if (unlikely(lock_ptr != q->lock_ptr)) {
> >>                         spin_unlock(lock_ptr);
> >>                          goto retry;
> >>                   }
> >>                    ...
> >>        }
> >>
> >> which is effectively the same as :
> >>
> >>  while (lock_ptr = q->lock_ptr) {
> >>                   spin_lock(lock_ptr)
> >>                   if (unlikely(lock_ptr != q->lock_ptr)) {
> >>                         spin_unlock(lock_ptr);
> >>                          goto retry;
> >>                   }
> >>                    ...
> >> }
> >>
> >> This might cause the compiler load the q->lock_ptr once and use it
> >> many times,  quoted from

Which is already covered by the barrier() in place today as a more general
compiler barrier.

Your argument is then simply that READ_ONCE is a more specific solution to the
problem.

> >> memory-barriers.txt:
> >>
> >>
> >>  (*) The compiler is within its rights to reload a variable, for example,
> >>     in cases where high register pressure prevents the compiler from
> >>     keeping all data of interest in registers.  The compiler might
> >>     therefore optimize the variable 'tmp' out of our previous example:
> >>
> >>        while (tmp = a)
> >>                do_something_with(tmp);
> >>
> >>     This could result in the following code, which is perfectly safe in
> >>     single-threaded code, but can be fatal in concurrent code:
> >>
> >>        while (a)
> >>                do_something_with(a);
> >>
> >>     For example, the optimized version of this code could result in
> >>     passing a zero to do_something_with() in the case where the variable
> >>     a was modified by some other CPU between the "while" statement and
> >>     the call to do_something_with().
> >>
> >>     Again, use READ_ONCE() to prevent the compiler from doing this:
> >>
> >>        while (tmp = READ_ONCE(a))
> >>                do_something_with(tmp);
> >>
> >
> > OK, so this is really the meat of the argument for the patch. You are looking to
> > add a compiler barrier instead of a CPU memory barrier. This should be what your

I was incorrect in my statement here. barrier() is a general compiler barrier and
READ_ONCE is a targeted compiler barrier (impacting only the operations dealing
with the specified variable). This helps the argument considerably.

> > commit message is focused on and it should provide compelling evidence to
> > justify risking the change.
> >
> > Compelling evidence for a compiler barrier would be a disassembly of the code
> > block before and after, demonstrating the compiler generating broken code and the
> > compiler generating correct code.
> >
> > In addition to this, however, I would want to see a strong convincing argument
> > that the READ_ONCE volatile cast is sufficient to cover the original case which
> > motivated the addition of the barrier() (not "apparently" and "could be").
> 
> As for #2, this is pure theoretical induction from this quotation, I
> do have no convincing argument
> and again I would like Paul to help clarify this.
> 
> 
> Thanks very much!
> 
> 
> Regards,
> Jianyu Zhan
> 

-- 
Darren Hart
Intel Open Source Technology Center