Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932465AbdIFOCC (ORCPT ); Wed, 6 Sep 2017 10:02:02 -0400 Received: from iolanthe.rowland.org ([192.131.102.54]:39914 "HELO iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S932354AbdIFOCB (ORCPT ); Wed, 6 Sep 2017 10:02:01 -0400 Date: Wed, 6 Sep 2017 10:02:00 -0400 (EDT) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: Peter Zijlstra cc: Jens Axboe , , , , , , , Subject: Re: [PATCH -v2] blk-mq: Start to fix memory ordering... In-Reply-To: <20170906080022.sotl4pvihs2nvg4m@hirez.programming.kicks-ass.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3925 Lines: 96 On Wed, 6 Sep 2017, Peter Zijlstra wrote: > Attempt to untangle the ordering in blk-mq. The patch introducing the > single smp_mb__before_atomic() is obviously broken in that it doesn't > clearly specify a pairing barrier and an obtained guarantee. > > The comment is further misleading in that it hints that the > deadline store and the COMPLETE store also need to be ordered, but > AFAICT there is no such dependency. However what does appear to be > important is the clear happening _after_ the store, and that worked by > pure accident. > > This clarifies blk_mq_start_request() -- we should not get there with > STARTING set -- this simplifies the code and makes the barrier usage > sane (the old code could be read to allow not having _any_ atomic after > the barrier, in which case the barrier hasn't got anything to order). We > then also introduce the missing pairing barrier for it. > > Also down-grade the barrier to smp_wmb(), this is cheaper for > PowerPC/ARM and doesn't cost anything extra on x86. > > And it documents the STARTING vs COMPLETE ordering. Although I've not > been entirely successful in reverse engineering the blk-mq state > machine so there might still be more funnies around timeout vs > requeue. > > If I got anything wrong, feel free to educate me by adding comments to > clarify things ;-) > > Cc: Alan Stern > Cc: Will Deacon > Cc: Ming Lei > Cc: Christoph Hellwig > Cc: Jens Axboe > Cc: Andrea Parri > Cc: Boqun Feng > Cc: Bart Van Assche > Cc: "Paul E. McKenney" > Fixes: 538b75341835 ("blk-mq: request deadline must be visible before marking rq as started") > Signed-off-by: Peter Zijlstra (Intel) > --- > - spelling; Andrea and Bart > - compiles (urgh!) > - smp_wmb(); Adrea > > > block/blk-mq.c | 52 ++++++++++++++++++++++++++++++++++++++++------------ > block/blk-timeout.c | 2 +- > 2 files changed, 41 insertions(+), 13 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 4603b115e234..506a0f355117 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -558,22 +558,32 @@ void blk_mq_start_request(struct request *rq) > > blk_add_timer(rq); > > - /* > - * Ensure that ->deadline is visible before set the started > - * flag and clear the completed flag. > - */ > - smp_mb__before_atomic(); > + WARN_ON_ONCE(test_bit(REQ_ATOM_STARTED, &rq->atomic_flags)); > > /* > * Mark us as started and clear complete. Complete might have been > * set if requeue raced with timeout, which then marked it as > * complete. So be sure to clear complete again when we start > * the request, otherwise we'll ignore the completion event. > + * > + * Ensure that ->deadline is visible before we set STARTED, such that > + * blk_mq_check_expired() is guaranteed to observe our ->deadline when > + * it observes STARTED. > */ > - if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) > - set_bit(REQ_ATOM_STARTED, &rq->atomic_flags); > - if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) > + smp_wmb(); > + set_bit(REQ_ATOM_STARTED, &rq->atomic_flags); > + if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) { > + /* > + * Coherence order guarantees these consecutive stores to a > + * single variable propagate in the specified order. Thus the > + * clear_bit() is ordered _after_ the set bit. See > + * blk_mq_check_expired(). > + * > + * (the bits must be part of the same byte for this to be > + * true). Adding this comment is well and good, but for more security you should also add a comment (maybe even a compile-time check) to the place where REQ_ATOM_STARTED and REQ_ATOM_COMPLETE are defined. Otherwise they might eventually get moved into separate bytes. Alan Stern