Date: Mon, 4 May 2015 12:51:48 -0700
From: Shaohua Li
To: Jens Axboe
Subject: Re: [PATCH] blk-mq: don't lose requests if a stopped queue restarts
Message-ID: <20150504195137.GB3365187@devbig257.prn2.facebook.com>
References: <430595847fe98f99e1c8ea79b018003d9315f905.1430613052.git.shli@fb.com> <5547C5BF.2030703@fb.com>
In-Reply-To: <5547C5BF.2030703@fb.com>

On Mon, May 04, 2015 at 01:17:19PM -0600, Jens Axboe wrote:
> On 05/02/2015 06:31 PM, Shaohua Li wrote:
> > Normally, if the driver is too busy to dispatch a request, the logic
> > is like below:
> >
> > block layer:                          driver:
> >         __blk_mq_run_hw_queue
> > a.                                    blk_mq_stop_hw_queue
> > b. rq add to ctx->dispatch
> >
> > later:
> > 1. blk_mq_start_hw_queue
> > 2. __blk_mq_run_hw_queue
> >
> > But it's possible steps 1-2 run between a and b. And since the rq
> > isn't in ctx->dispatch yet, step 2 will not run it. The rq might get
> > lost if no subsequent requests come in.
>
> Good catch! But the patch introduces a potentially never-ending loop
> in __blk_mq_run_hw_queue(). Not sure how we can fully close it, but
> it might be better to punt the re-run, after adding the requests back,
> to the worker. That would turn a potential busy loop (until requests
> complete) into something with nicer behavior, at least. A la
>
>     if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
>             kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>                                              &hctx->run_work, 0);

My first version of the patch was like this, but I changed my mind later.
The assumption is that the driver will stop the queue when it's too busy
to dispatch requests; only a buggy driver would leave us in the endless
loop here. Should we assume drivers will not do the right thing?

Thanks,
Shaohua
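
For readers following the thread: below is a rough sketch of where Jens's
snippet would land, in the dispatch-failure path at the end of
__blk_mq_run_hw_queue(). The surrounding lines (the local rq_list of
undispatched requests, the splice onto hctx->dispatch under hctx->lock)
follow the blk-mq code of that era, but the exact shape is illustrative,
not the actual patch:

        /*
         * Any requests the driver refused (returned BUSY) go back on
         * the hctx dispatch list so a later queue run will see them.
         */
        if (!list_empty(&rq_list)) {
                spin_lock(&hctx->lock);
                list_splice(&rq_list, &hctx->dispatch);
                spin_unlock(&hctx->lock);

                /*
                 * A blk_mq_start_hw_queue()/run may have raced into the
                 * window between the driver returning BUSY and the
                 * splice above; that run saw an empty dispatch list.
                 * Re-run the queue, but from the kblockd worker rather
                 * than inline.
                 */
                if (!test_bit(BLK_MQ_S_STOPPED, &hctx->state))
                        kblockd_schedule_delayed_work_on(
                                        blk_mq_hctx_next_cpu(hctx),
                                        &hctx->run_work, 0);
        }

With this shape, a driver that correctly stopped its queue triggers no
re-run at all, a restart that raced into the window does trigger one, and
a buggy driver that reports busy without ever stopping its queue spins in
the kblockd worker instead of livelocking the submitting context.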