2017-11-27 17:15:59

by Bart Van Assche

Subject: Re: [PATCH V2 1/5] dm-mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

On Mon, 2017-11-27 at 13:07 +0800, Ming Lei wrote:
> If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun the queue in
> the following three situations:
>
> 1) if BLK_MQ_S_SCHED_RESTART is set
> - queue is rerun after one rq is completed, see blk_mq_sched_restart()
> which is run from blk_mq_free_request()
>
> 2) run out of driver tag
> - queue is rerun after one tag is freed
>
> 3) otherwise
> - queue is run immediately in blk_mq_dispatch_rq_list()
>
> This random delay in running the hw queue was introduced by commit 6077c2d706097c0
> (dm rq: Avoid that request processing stalls sporadically), which claimed
> to fix a request processing stall but never explained the idea behind
> it, and it is a workaround at most. Nobody has explained the issue in
> recent discussion either.
>
> Also calling blk_mq_delay_run_hw_queue() inside .queue_rq() is a horrible
> hack because it stops BLK_MQ_S_SCHED_RESTART from working, and degrades I/O
> performance a lot.
>
> Finally this patch makes sure that dm-rq returns BLK_STS_RESOURCE to blk-mq
> only when the underlying queue is out of resources, so we switch to returning
> DM_MAPIO_DELAY_REQUEUE if either MPATHF_QUEUE_IO or MPATHF_PG_INIT_REQUIRED
> is set in multipath_clone_and_map().
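
For reference, the three rerun cases listed in the quoted description can be
modeled roughly as below. This is a toy userspace sketch of that description
only, not blk-mq code: the enum, rerun_trigger_for() and its two boolean
parameters are hypothetical names invented for illustration, and the order of
the checks simply follows the order of the list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the quoted description; none of this is kernel API. */
enum rerun_trigger {
	RERUN_ON_REQUEST_COMPLETION,	/* case 1: BLK_MQ_S_SCHED_RESTART is set */
	RERUN_ON_TAG_FREE,		/* case 2: out of driver tags */
	RERUN_IMMEDIATELY,		/* case 3: everything else */
};

/*
 * Decide what triggers the next queue run after .queue_rq() has returned
 * BLK_STS_RESOURCE, following the three cases in the order they are listed.
 */
static enum rerun_trigger rerun_trigger_for(bool sched_restart_set,
					    bool out_of_driver_tags)
{
	if (sched_restart_set)
		return RERUN_ON_REQUEST_COMPLETION;
	if (out_of_driver_tags)
		return RERUN_ON_TAG_FREE;
	return RERUN_IMMEDIATELY;
}

/* Small self-check that exercises each of the three cases once. */
static void check_rerun_cases(void)
{
	assert(rerun_trigger_for(true, false) == RERUN_ON_REQUEST_COMPLETION);
	assert(rerun_trigger_for(false, true) == RERUN_ON_TAG_FREE);
	assert(rerun_trigger_for(false, false) == RERUN_IMMEDIATELY);
}
```

In this model none of the three cases involves a timed delay, which is the
point the quoted description is making.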

Sorry, but in my opinion the above description shows that you don't fully
understand the dm-mpath driver.

> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index c8faa2b85842..8fe3f45407ce 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -484,9 +484,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  		return DM_MAPIO_KILL;
>  	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
>  		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
> -		if (pg_init_all_paths(m))
> -			return DM_MAPIO_DELAY_REQUEUE;
> -		return DM_MAPIO_REQUEUE;
> +		return DM_MAPIO_DELAY_REQUEUE;
>  	}

This patch removes a pg_init_all_paths() call but you don't explain why you
think it is allowed to remove that call. Did you perhaps remove that call by
mistake?
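
To make the question concrete, here is a toy userspace model of the branch
touched by the hunk above. It is only a sketch: struct mpath_model,
pg_init_all_paths_stub() and the MAPIO_* enum are hypothetical stand-ins, and
the stub does not model what the real pg_init_all_paths() does internally. It
only records that it was called and returns a preset value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DM_MAPIO_REQUEUE / DM_MAPIO_DELAY_REQUEUE. */
enum map_result { MAPIO_REQUEUE, MAPIO_DELAY_REQUEUE };

struct mpath_model {
	int pg_init_calls;	/* how often the stub below has been called */
	bool pg_init_result;	/* value the stub returns (an assumption) */
};

/* Stub for pg_init_all_paths(): counts the call, returns a preset value. */
static int pg_init_all_paths_stub(struct mpath_model *m)
{
	m->pg_init_calls++;
	return m->pg_init_result;
}

/* Branch body before the patch: the call happens and selects the result. */
static enum map_result map_before_patch(struct mpath_model *m)
{
	if (pg_init_all_paths_stub(m))
		return MAPIO_DELAY_REQUEUE;
	return MAPIO_REQUEUE;
}

/* Branch body after the patch: no call, always a delayed requeue. */
static enum map_result map_after_patch(struct mpath_model *m)
{
	(void)m;	/* the model state is no longer touched on this path */
	return MAPIO_DELAY_REQUEUE;
}
```

Under these assumptions the patch changes two things on this path: the
pg_init_all_paths() call no longer happens at all, and DM_MAPIO_REQUEUE can no
longer be returned. The patch description only talks about the return value.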

Bart.