Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752430AbbEAUQP (ORCPT );
        Fri, 1 May 2015 16:16:15 -0400
Received: from mx1.redhat.com ([209.132.183.28]:55690 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750844AbbEAUQM (ORCPT );
        Fri, 1 May 2015 16:16:12 -0400
From: Jeff Moyer
To: Shaohua Li
Cc: , , ,
Subject: Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case
References:
X-PGP-KeyID: 1F78E1B4
X-PGP-CertKey: F6FE 280D 8293 F72C 65FD 5A58 1FF8 A7CA 1F78 E1B4
X-PCLoadLetter: What the f**k does that mean?
Date: Fri, 01 May 2015 16:16:04 -0400
In-Reply-To: (Shaohua Li's message of "Thu, 30 Apr 2015 10:45:17 -0700")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4119
Lines: 119

Shaohua Li writes:

> plug is still helpful for workload with IO merge, but it can be harmful
> otherwise especially with multiple hardware queues, as there is
> (supposed) no lock contention in this case and plug can introduce
> latency. For multiple queues, we do limited plug, eg plug only if there
> is request merge. If a request doesn't have merge with following
> request, the requet will be dispatched immediately.
>
> This also fixes a bug. If we directly issue a request and it fails, we
> use blk_mq_merge_queue_io(). But we already assigned bio to a request in
> blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
> blk_mq_bio_to_request again.

Good catch. Might've been better to split that out first for easy
backport to stable kernels, but I won't hold you to that.
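
To make the "limited plug" behaviour from the description concrete, here
is a rough userspace sketch, not kernel code. All of the names in it
(toy_request, toy_plug, toy_submit, toy_dispatch, toy_finish_plug) are
made up for illustration, and "can be merged" is modelled as plain sector
contiguity, which is an assumption rather than what
blk_attempt_plug_merge actually checks:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct request: a contiguous sector range. */
struct toy_request {
	unsigned long sector;	/* first sector */
	unsigned long nr;	/* number of sectors */
};

/* Hypothetical one-slot plug: it holds at most one pending request. */
struct toy_plug {
	struct toy_request held;
	bool has_held;
	unsigned int dispatched;	/* requests handed to the "driver" */
	unsigned long last_nr;		/* size of the last dispatched request */
};

/* Model of issuing a request directly to the hardware queue. */
static void toy_dispatch(struct toy_plug *plug, const struct toy_request *rq)
{
	assert(rq->nr > 0);
	plug->dispatched++;
	plug->last_nr = rq->nr;
}

/*
 * Submit one request under limited plugging:
 *  - if it directly follows the held request, merge it (back merge only);
 *  - otherwise issue the held request right away and hold the new one.
 */
static void toy_submit(struct toy_plug *plug, struct toy_request rq)
{
	if (plug->has_held &&
	    plug->held.sector + plug->held.nr == rq.sector) {
		plug->held.nr += rq.nr;
		return;
	}
	if (plug->has_held)
		toy_dispatch(plug, &plug->held);
	plug->held = rq;
	plug->has_held = true;
}

/* Model of finishing the plug: flush whatever is still held. */
static void toy_finish_plug(struct toy_plug *plug)
{
	if (plug->has_held) {
		toy_dispatch(plug, &plug->held);
		plug->has_held = false;
	}
}
```

In this model, two contiguous submissions leave nothing dispatched; the
first non-mergeable submission pushes the merged request out immediately,
so the plug never holds more than one request.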

> @@ -1243,6 +1277,10 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		return;
>  	}
>  
> +	if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
> +	    blk_attempt_plug_merge(q, bio, &request_count))
> +		return;
> +
>  	rq = blk_mq_map_request(q, bio, &data);
>  	if (unlikely(!rq))
>  		return;

After this patch, everything up to this point in blk_mq_make_request and
blk_sq_make_request is the same. This can be factored out (in another
patch) to a common function.

> @@ -1253,38 +1291,38 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		goto run_queue;
>  	}
>  
> +	plug = current->plug;
>  	/*
>  	 * If the driver supports defer issued based on 'last', then
>  	 * queue it up like normal since we can potentially save some
>  	 * CPU this way.
>  	 */
> -	if (is_sync && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> -		struct blk_mq_queue_data bd = {
> -			.rq = rq,
> -			.list = NULL,
> -			.last = 1
> -		};
> -		int ret;
> +	if ((plug || is_sync) && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> +		struct request *old_rq = NULL;

I would add a !blk_queue_nomerges(q) to that conditional. There's no
point holding back an I/O when we won't merge it anyway.

That brings up another quirk of the current implementation (not your
patches) that bugs me.

BLK_MQ_F_SHOULD_MERGE
QUEUE_FLAG_NOMERGES

Those two flags are set independently, one via the driver and the other
via a sysfs file. So the user could set the nomerges flag to 1 or 2,
and still potentially get merges (see blk_mq_merge_queue_io). That's
something that should be fixed, though it can wait.

>  		blk_mq_bio_to_request(rq, bio);
>  
>  		/*
> -		 * For OK queue, we are done. For error, kill it. Any other
> -		 * error (busy), just add it to our list as we previously
> -		 * would have done
> +		 * we do limited pluging. If bio can be merged, do merge.
> +		 * Otherwise the existing request in the plug list will be
> +		 * issued. So the plug list will have one request at most
>  		 */
> -		ret = q->mq_ops->queue_rq(data.hctx, &bd);
> -		if (ret == BLK_MQ_RQ_QUEUE_OK)
> -			goto done;
> -		else {
> -			__blk_mq_requeue_request(rq);
> -
> -			if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
> -				rq->errors = -EIO;
> -				blk_mq_end_request(rq, rq->errors);
> -				goto done;
> +		if (plug) {
> +			if (!list_empty(&plug->mq_list)) {
> +				old_rq = list_first_entry(&plug->mq_list,
> +					struct request, queuelist);
> +				list_del_init(&old_rq->queuelist);
>  			}
> -		}
> +			list_add_tail(&rq->queuelist, &plug->mq_list);
> +		} else /* is_sync */
> +			old_rq = rq;
> +		blk_mq_put_ctx(data.ctx);
> +		if (!old_rq)
> +			return;
> +		if (!blk_mq_direct_issue_request(old_rq))
> +			return;
> +		blk_mq_insert_request(old_rq, false, true, true);
> +		return;
>  	}

Now there is no way to exit that if block; we always return. It may be
worth considering moving that block to its own function, if you can think
of a good name for it.

Other than those minor issues, this looks good to me.

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/