MIME-Version: 1.0
In-Reply-To: <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com>
References: <1456746093-10578-1-git-send-email-nasa4836@gmail.com>
 <20160229222209.GD7499@dvhart-mobl5.amr.corp.intel.com> <CAHz2CGXV+vhLYqAScaspqQbXQ6DJsw9MCE+Vkbt0ch0J=y5M0w@mail.gmail.com>
 <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com>
From: Jianyu Zhan <nasa4836@gmail.com>
Date: Wed, 2 Mar 2016 10:31:18 +0800
Message-ID: <CAHz2CGVjUmXKDmJ5xC=mgF6GoS=F_NPNr_F4G46vbJb0Yy2Esg@mail.gmail.com>
Subject: Re: [PATCH] futex: replace bare barrier() with a READ_ONCE()
To: Darren Hart <dvhart@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
        dave@stgolabs.net, Peter Zijlstra <peterz@infradead.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Ingo Molnar <mingo@kernel.org>, dvhart@linux.intel.com,
        bigeasy@linutronix.de, Paul McKenney <paulmck@linux.vnet.ibm.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5169
Lines: 146

On Wed, Mar 2, 2016 at 9:37 AM, Darren Hart <dvhart@infradead.org> wrote:
> read_once will use a *volatile assignment instead of calling barrier()
> directly for a word size argument.
>
> With weak statements like "apparently" (above) and "could be" (from the original
> post: This patch replaces this bare barrier() with a READ_ONCE(), a weaker form
> of barrier(), which could be more informative.)" I do not see a compelling
> reason to change what is notoriously fragile code with respect to barriers.
>


Fair enough.  The "apparently" wording is lame without assembly code evidence.
I do not have such a s390 machine, so I have cc''ed the original
author incorporating this barrier(),
hopefully he could help test  this.

By "could be more informative"  I mean a READ_ONCE has explanatory
effect by its name, instead of
a bare barrier() without more inline comment for why.

As I said the retry code logic is effectively the same as

while (lock_ptr = READ_ONCE(q->lock_ptr)) {
                   spin_lock(lock_ptr)
                   if (unlikely(lock_ptr != q->lock_ptr)) {
                         spin_unlock(lock_ptr);
                   }

                   ...
}

in which case READ_ONCE could perfectly fit, in that it will make
compiler only read it once in every testing condition,
which will eliminate the original problem that commit e91467ecd1ef
addressed, though assembly code proof is needed.

Actually, the quotation I argued in previous mail could be used again
here, from memory-barriers.txt:

 (*) The compiler is within its rights to merge successive loads from
     the same variable.  Such merging can cause the compiler to "optimize"
     the following code:

        while (tmp = a)
                do_something_with(tmp);

     into the following code, which, although in some sense legitimate
     for single-threaded code, is almost certainly not what the developer
     intended:

        if (tmp = a)
                for (;;)
                        do_something_with(tmp);

     Use READ_ONCE() to prevent the compiler from doing this to you:

        while (tmp = READ_ONCE(a))
                do_something_with(tmp);


I might be wrong, so I have cc'ed Paul, and Peter,  I wish they give comment :-)


> As for #2...
>
>> 2.  For the second problem I memtioned,  yes, it is theoretical,  and
>> it is also due to  q->lock_ptr can change between
>> the test of nullness of q->lock_ptr and the lock on q->lock_ptr.
>>
>> the code is
>>
>> retry:
>>        lock_ptr = q->lock_ptr;
>>        if (lock_ptr != 0)  {
>>                   spin_lock(lock_ptr)
>>                   if (unlikely(lock_ptr != q->lock_ptr)) {
>>                         spin_unlock(lock_ptr);
>>                          goto retry;
>>                   }
>>                    ...
>>        }
>>
>> which is effectively the same as :
>>
>>  while (lock_ptr = q->lock_ptr) {
>>                   spin_lock(lock_ptr)
>>                   if (unlikely(lock_ptr != q->lock_ptr)) {
>>                         spin_unlock(lock_ptr);
>>                          goto retry;
>>                   }
>>                    ...
>> }
>>
>> This might cause the compiler load the q->lock_ptr once and use it
>> many times,  quoted from
>> memory-barriers.txt:
>>
>>
>>  (*) The compiler is within its rights to reload a variable, for example,
>>     in cases where high register pressure prevents the compiler from
>>     keeping all data of interest in registers.  The compiler might
>>     therefore optimize the variable 'tmp' out of our previous example:
>>
>>        while (tmp = a)
>>                do_something_with(tmp);
>>
>>     This could result in the following code, which is perfectly safe in
>>     single-threaded code, but can be fatal in concurrent code:
>>
>>        while (a)
>>                do_something_with(a);
>>
>>     For example, the optimized version of this code could result in
>>     passing a zero to do_something_with() in the case where the variable
>>     a was modified by some other CPU between the "while" statement and
>>     the call to do_something_with().
>>
>>     Again, use READ_ONCE() to prevent the compiler from doing this:
>>
>>        while (tmp = READ_ONCE(a))
>>                do_something_with(tmp);
>>
>
> OK, so this is really the meat of the argument for the patch. You are looking to
> add a compiler barrier instead of a CPU memory barrier. This should be what your
> commit message is focused on and it should provide compelling evidence to
> justify risking the change.
>
> Compelling evidence for a compiler barrier would be a disassembly of the code
> block before and after, demonstrating the compiler generating broken code and the
> compiler generating correct code.
>
> In addition to this, however, I would want to see a strong convincing argument
> that the READ_ONCE volatile cast is sufficient to cover the original case which
> motivated the addition of the barrier() (not "apparently" and "could be").

As for #2, this is pure theoretical induction from this quotation, I
do have no convincing argument
and again I would like Paul to help clarify this.


Thanks very much!


Regards,
Jianyu Zhan