Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751749AbdIEOwQ (ORCPT ); Tue, 5 Sep 2017 10:52:16 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:15652 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751332AbdIEOwO (ORCPT ); Tue, 5 Sep 2017 10:52:14 -0400 X-IronPort-AV: E=Sophos;i="5.41,480,1498492800"; d="scan'208";a="48913024" From: Bart Van Assche To: "peterz@infradead.org" , "axboe@fb.com" CC: "parri.andrea@gmail.com" , "linux-kernel@vger.kernel.org" , "tom.leiming@gmail.com" , "hch@lst.de" , "paulmck@linux.vnet.ibm.com" , "will.deacon@arm.com" , "boqun.feng@gmail.com" , "stern@rowland.harvard.edu" Subject: Re: [PATCH] blk-mq: Start to fix memory ordering... Thread-Topic: [PATCH] blk-mq: Start to fix memory ordering... Thread-Index: AQHTJV2SVHzl+/irYU24JEa9ufXxRKKmYigA Date: Tue, 5 Sep 2017 14:51:25 +0000 Message-ID: <1504623084.4135.27.camel@wdc.com> References: <20170904090932.ngfefw2clot3sbqi@hirez.programming.kicks-ass.net> In-Reply-To: <20170904090932.ngfefw2clot3sbqi@hirez.programming.kicks-ass.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [73.223.62.206] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1;CY1PR0401MB1772;20:A7kCqEcd9XosRH9adeKGEYsS9H3ePp00On16n284GAfQbJRaH7OgVYIc5ILf9t4emBINlEKiLamKqkh3wnwSHlLbaSXyiNGNGAOF6B9FW6uaf77+U38hvQfEfbiksFdsgcZF8Dxr2GQWZwT71vRVrSYbqyaGxWy2SamdZuZ5YXg= x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ms-office365-filtering-correlation-id: a2f26c3d-fe3d-4406-d056-08d4f46d9517 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(300000500095)(300135000095)(300000501095)(300135300095)(300000502095)(300135100095)(22001)(2017030254152)(300000503095)(300135400095)(48565401081)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);SRVR:CY1PR0401MB1772; x-ms-traffictypediagnostic: CY1PR0401MB1772: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Bart.VanAssche@wdc.com; wdcipoutbound: EOP-TRUE x-exchange-antispam-report-test: UriScan:(190756311086443); x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(5005006)(8121501046)(100000703101)(100105400095)(3002001)(10201501046)(93006095)(93001095)(6055026)(6041248)(20161123560025)(20161123558100)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123555025)(20161123562025)(20161123564025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);SRVR:CY1PR0401MB1772;BCL:0;PCL:0;RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);SRVR:CY1PR0401MB1772; x-forefront-prvs: 0421BF7135 x-forefront-antispam-report: SFV:NSPM;SFS:(10019020)(6009001)(39860400002)(199003)(377424004)(24454002)(189002)(3280700002)(106356001)(2950100002)(68736007)(33646002)(50986999)(76176999)(54356999)(36756003)(103116003)(101416001)(102836003)(305945005)(86362001)(3660700001)(6116002)(105586002)(14454004)(7736002)(189998001)(3846002)(6246003)(4326008)(39060400002)(478600001)(8936002)(6486002)(72206003)(97736004)(8676002)(81166006)(77096006)(229853002)(6506006)(6436002)(81156014)(99286003)(2900100001)(6512007)(2501003)(53936002)(5660300001)(2906002)(66066001)(25786009)(54906002)(7416002);DIR:OUT;SFP:1102;SCL:1;SRVR:CY1PR0401MB1772;H:CY1PR0401MB1536.namprd04.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;A:1;MX:1;LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="utf-8" Content-ID: <0826C1DE68D43E49B14570E2CD4CF7EE@namprd04.prod.outlook.com> MIME-Version: 1.0 X-OriginatorOrg: wdc.com X-MS-Exchange-CrossTenant-originalarrivaltime: 05 Sep 2017 14:51:25.4342 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: b61c8803-16f3-4c35-9b17-6f65f441df86 X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY1PR0401MB1772 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by nfs id v85EqPse009989 Content-Length: 3226 Lines: 88 On Mon, 2017-09-04 at 11:09 +0200, Peter Zijlstra wrote: > /* > * Mark us as started and clear complete. Complete might have been > * set if requeue raced with timeout, which then marked it as > * complete. So be sure to clear complete again when we start > * the request, otherwise we'll ignore the completion event. > + * > + * Ensure that ->deadline is visible before set STARTED, such that It seems like there is something wrong with the grammar in the above sentence? Did you perhaps mean "before we set STARTED"? > + * blk_mq_check_expired() is guaranteed to observe our ->deadline > + * when it observes STARTED. > */ > - if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) > - set_bit(REQ_ATOM_STARTED, &rq->atomic_flags); > - if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) > + smp_mb__before_atomic(); > + set_bit(REQ_ATOM_STARTED, &rq->atomic_flags); > + if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) { > + /* > + * Coherence order guarantees these consequtive stores to a > + * singe variable propagate in the specified order. Thus the > + * clear_bit() is ordered _after_ the set bit. See > + * blk_mq_check_expired(). > + */ > clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags); > + } Is this new comment really useful? If you want to keep it please spell "consecutive" correctly. > if (q->dma_drain_size && blk_rq_bytes(rq)) { > /* > @@ -744,11 +751,20 @@ static void blk_mq_check_expired(struct > struct request *rq, void *priv, bool reserved) > { > struct blk_mq_timeout_data *data = priv; > + unsigned long deadline; > > if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) > return; > > /* > + * Ensures that if we see STARTED we must also see our > + * up-to-date deadline, see blk_mq_start_request(). > + */ > + smp_rmb(); > + > + deadline = READ_ONCE(rq->deaedline); "deaedline" is a spelling error. Has this patch been tested? > + /* > * The rq being checked may have been freed and reallocated > * out already here, we avoid this race by checking rq->deadline > * and REQ_ATOM_COMPLETE flag together: > @@ -761,10 +777,20 @@ static void blk_mq_check_expired(struct > * and clearing the flag in blk_mq_start_request(), so > * this rq won't be timed out too. > */ > - if (time_after_eq(jiffies, rq->deadline)) { > - if (!blk_mark_rq_complete(rq)) > + if (time_after_eq(jiffies, deadline)) { > + if (!blk_mark_rq_complete(rq)) { > + /* > + * Relies on the implied MB from test_and_clear() to > + * order the COMPLETE load against the STARTED load. > + * Orders against the coherence order in > + * blk_mq_start_request(). > + * > + * This ensures that if we see !COMPLETE we must see > + * STARTED and ignore this timeout. > + */ > blk_mq_rq_timed_out(rq, reserved); > - } else if (!data->next_set || time_after(data->next, rq->deadline)) { > + } > + } else if (!data->next_set || time_after(data->next, deadline)) { > data->next = rq->deadline; > data->next_set = 1; > } Apparently a READ_ONCE(rq->deadline) statement has been added but not all rq->deadline reads have been changed into reads of the local variable "deadline"? Was that really your intention? Bart.