Subject: Re: [PATCH 8/8] writeback: throttle buffered writeback
To: xiakaixu
References: <1460953487-3430-1-git-send-email-axboe@fb.com> <1460953487-3430-9-git-send-email-axboe@fb.com> <571B3073.2010206@huawei.com> <571BEB03.5060906@fb.com> <571E024C.2020307@huawei.com>
CC: "miaoxie (A)", Bintian, Huxinwei
From: Jens Axboe
Message-ID: <571E2BBD.7040804@fb.com>
Date: Mon, 25 Apr 2016 08:37:49 -0600
In-Reply-To: <571E024C.2020307@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/25/2016 05:41 AM, xiakaixu wrote:
> On 2016/4/24 5:37, Jens Axboe wrote:
>> On 04/23/2016 02:21 AM, xiakaixu wrote:
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 40b57bf4852c..d941f69dfb4b 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  #include "blk.h"
>>>>  #include "blk-mq.h"
>>>> +#include "blk-wb.h"
>>>>
>>>>  EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
>>>>  EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
>>>> @@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>>>>
>>>>  fail:
>>>>  	blk_free_flush_queue(q->fq);
>>>> +	blk_wb_exit(q);
>>>>  	return NULL;
>>>>  }
>>>>  EXPORT_SYMBOL(blk_init_allocated_queue);
>>>> @@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
>>>>  	blk_delete_timer(rq);
>>>>  	blk_clear_rq_complete(rq);
>>>>  	trace_block_rq_requeue(q, rq);
>>>> +	blk_wb_requeue(q->rq_wb, rq);
>>>>
>>>>  	if (rq->cmd_flags & REQ_QUEUED)
>>>>  		blk_queue_end_tag(q, rq);
>>>> @@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
>>>>  	/* this is a bio leak */
>>>>  	WARN_ON(req->bio != NULL);
>>>>
>>>> +	blk_wb_done(q->rq_wb, req);
>>>> +
>>>>  	/*
>>>>  	 * Request may not have originated from ll_rw_blk. if not,
>>>>  	 * it didn't come out of our reserved rq pools
>>>> @@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>>>  	int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
>>>>  	struct request *req;
>>>>  	unsigned int request_count = 0;
>>>> +	bool wb_acct;
>>>>
>>>>  	/*
>>>>  	 * low level driver can indicate that it wants pages above a
>>>> @@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>>>  	}
>>>>
>>>>  get_rq:
>>>> +	wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
>>>> +
>>>>  	/*
>>>>  	 * This sync check and mask will be re-done in init_request_from_bio(),
>>>>  	 * but we need to set it earlier to expose the sync flag to the
>>>> @@ -1781,11 +1789,16 @@ get_rq:
>>>>  	 */
>>>>  	req = get_request(q, rw_flags, bio, GFP_NOIO);
>>>>  	if (IS_ERR(req)) {
>>>> +		if (wb_acct)
>>>> +			__blk_wb_done(q->rq_wb);
>>>>  		bio->bi_error = PTR_ERR(req);
>>>>  		bio_endio(bio);
>>>>  		goto out_unlock;
>>>>  	}
>>>>
>>>> +	if (wb_acct)
>>>> +		req->cmd_flags |= REQ_BUF_INFLIGHT;
>>>> +
>>>>  	/*
>>>>  	 * After dropping the lock and possibly sleeping here, our request
>>>>  	 * may now be mergeable after it had proven unmergeable (above).
>>>> @@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
>>>>  	blk_dequeue_request(req);
>>>>
>>>>  	req->issue_time = ktime_to_ns(ktime_get());
>>>> +	blk_wb_issue(req->q->rq_wb, req);
>>>>
>>>>  	/*
>>>>  	 * We are now handing the request to the hardware, initialize
>>>> @@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
>>>>  		blk_unprep_request(req);
>>>>
>>>>  	blk_account_io_done(req);
>>>> +	blk_wb_done(req->q->rq_wb, req);
>>>
>>> Hi Jens,
>>>
>>> It seems the function blk_wb_done() will be executed twice even if the
>>> end_io callback is set. Maybe the same thing would happen in blk-mq.c.
>>
>> Yeah, that was a mistake; the current version has it fixed. It was
>> inadvertently added when I discovered that the flush request didn't work
>> properly. Now it just duplicates the call inside the check for whether
>> the request has an ->end_io() defined, since we don't use the normal
>> completion path for that.
>>
> Hi Jens,
>
> I have checked the wb-buf-throttle branch in your block git repo, and I am
> not sure it is the complete version. It seems the problem is only fixed in
> blk-mq.c; blk_wb_done() would still be executed twice in blk-core.c, once
> in blk_finish_request() and once in __blk_put_request().
> Maybe we can add a flag to mark whether blk_wb_done() has already been done.

Good catch, looks like I only patched up the mq bits. It's still not
perfect, since we could potentially double account a request that has a
private ->end_io(), if it was allocated through the normal block rq
allocator. That will skew the unrelated-io timestamp a bit, but it's not a
big deal; the inflight count will stay consistent, which is the important
part.

We currently have just one bit to tell whether a request is tracked, so we
can't tell if a tracked request has already been seen. I'll fix up the
blk-core part to be identical to the blk-mq fix.
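
Your flag suggestion could actually reuse the bit we already have: if
blk_wb_done() clears REQ_BUF_INFLIGHT on the first call, the second call
turns into a no-op. Roughly like the below, completely untested, and the
exact shape of blk_wb_done() in blk-wb.c may end up different:

static void blk_wb_done(struct rq_wb *rwb, struct request *rq)
{
	if (!rwb)
		return;

	/*
	 * On the legacy path a request can get here twice, once from
	 * blk_finish_request() and once from __blk_put_request(). Clear
	 * the tracking bit on the first call, so the second call sees
	 * an untracked request and leaves the inflight count alone.
	 */
	if (rq->cmd_flags & REQ_BUF_INFLIGHT) {
		rq->cmd_flags &= ~REQ_BUF_INFLIGHT;
		__blk_wb_done(rwb);
	}
}

That way the inflight count is only dropped once no matter how many times
the completion paths run, and we don't have to grow another request flag.

--
Jens Axboe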