2018-11-14 08:46:51

by jianchao.wang

Subject: [PATCH V7 0/4] blk-mq: refactor and fix on issue request directly


Hi Jens

Please consider this patchset for 4.21.
It refactors the code that issues requests directly, unifying the interface
and making the code clearer and more readable, and it also fixes a defect
there.

The 1st patch refactors the direct-issue code into a single helper
interface that can handle all the cases.

The 2nd patch fixes the issue that, when the queue is stopped or quiesced,
a request may pass through the bottom device's potential io scheduler.

The 3rd patch makes blk_mq_sched_insert_requests issue requests directly
with 'bypass' false, so it no longer needs to handle the non-issued
requests.

The 4th patch replaces and kills blk_mq_request_issue_directly.

V7:
- drop the original 3rd patch, which tried to ensure hctx is run on its
mapped cpu, as it adds get/put_cpu and a cpumask test in the hot path and
it is not necessary to provide this guarantee for drivers.

V6:
- drop original 1st patch to address Jens' comment
- discard the enum mq_issue_decision and blk_mq_make_decision and use
BLK_STS_* return values directly to address Jens' comment. (1/5)
- add 'unlikely' in blk_mq_try_issue_directly (1/5)
- refactor the 2nd and 3rd patch based on the new 1st patch.
- reserve the unused_cookie in 4th and 5th patch

V5:
- rebase against Jens' for-4.21/block branch
- adjust the order of patch04 and patch05
- add patch06 to replace and kill the one line blk_mq_request_bypass_insert
- comment changes

V4:
- split the original patch 1 into two patches, 1st and 2nd patch currently
- rename the mq_decision to mq_issue_decision
- comment changes

V3:
- Correct the code for the case where bypass_insert is true and an io
scheduler is attached. The request still needs to be issued in that case. (1/4)
- Refactor the code to make it clearer. blk_mq_make_request is introduced
to decide whether to insert, end or just return based on the return value
of .queue_rq and bypass_insert (1/4)
- Add the 2nd patch. It introduces a new decision result which indicates
that the request should be inserted with blk_mq_request_bypass_insert.
- Modify the code to adapt the new patch 1.

V2:
- Add 1st and 2nd patch to refactor the code.

Jianchao Wang (4):
blk-mq: refactor the code of issue request directly
blk-mq: fix issue directly case when q is stopped or quiesced
blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
blk-mq: replace and kill blk_mq_request_issue_directly

block/blk-core.c | 4 +-
block/blk-mq-sched.c | 8 ++--
block/blk-mq.c | 112 ++++++++++++++++++++++-----------------------------
block/blk-mq.h | 7 ++--
4 files changed, 59 insertions(+), 72 deletions(-)

Thanks
Jianchao






2018-11-14 08:46:57

by jianchao.wang

Subject: [PATCH V7 4/4] blk-mq: replace and kill blk_mq_request_issue_directly

Replace blk_mq_request_issue_directly with blk_mq_try_issue_directly
in blk_insert_cloned_request and kill it as nobody uses it any more.

Signed-off-by: Jianchao Wang <[email protected]>
---
block/blk-core.c | 4 +++-
block/blk-mq.c | 9 +--------
block/blk-mq.h | 7 ++++---
3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdc0ad2..e4eedc7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1421,6 +1421,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
*/
blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
{
+ blk_qc_t unused_cookie;
+
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;

@@ -1436,7 +1438,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
* bypass a potential scheduler on the bottom device for
* insert.
*/
- return blk_mq_request_issue_directly(rq);
+ return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
}
EXPORT_SYMBOL_GPL(blk_insert_cloned_request);

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 049fd47..3fcf2f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,7 +1766,7 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
}

-static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
bool bypass)
@@ -1833,13 +1833,6 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
}

-blk_status_t blk_mq_request_issue_directly(struct request *rq)
-{
- blk_qc_t unused_cookie;
-
- return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
-}
-
void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
{
diff --git a/block/blk-mq.h b/block/blk-mq.h
index facb6e9..f18c27c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -61,9 +61,10 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
-
-/* Used by blk_insert_cloned_request() to issue request directly */
-blk_status_t blk_mq_request_issue_directly(struct request *rq);
+blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+ struct request *rq,
+ blk_qc_t *cookie,
+ bool bypass);
void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);

--
2.7.4


2018-11-14 08:47:00

by jianchao.wang

Subject: [PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced

When trying to issue a request directly, if the queue is stopped or
quiesced, 'bypass' is ignored and BLK_STS_OK is returned to the caller
so that it does not dispatch the request again. The request is then
inserted with blk_mq_sched_insert_request. This is not correct
for the dm-rq case, where we should avoid passing through the underlying
path's io scheduler.

To fix it, use blk_mq_request_bypass_insert to insert the request
into hctx->dispatch when we cannot pass through the io scheduler but
have to insert.
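
As a rough user-space illustration of the behaviour change described above
(not kernel code), the placement of a request that could not be issued can
be modelled as below. All sketch_* names are made up for this example; they
only stand in for blk_mq_sched_insert_request() and
blk_mq_request_bypass_insert(), and the model covers only the
BLK_STS_RESOURCE arm.

```c
#include <stdbool.h>

/* Hypothetical model of where a non-issued request ends up. */
enum sketch_where {
	SKETCH_NOT_INSERTED,	/* BLK_STS_RESOURCE goes back to the caller */
	SKETCH_SCHED_QUEUE,	/* stands in for blk_mq_sched_insert_request() */
	SKETCH_DISPATCH_LIST,	/* stands in for blk_mq_request_bypass_insert() */
};

/* Behaviour before this patch: a stopped/quiesced queue clears 'bypass'. */
enum sketch_where sketch_place_request_old(bool stopped_or_quiesced, bool bypass)
{
	if (stopped_or_quiesced)
		bypass = false;

	return bypass ? SKETCH_NOT_INSERTED : SKETCH_SCHED_QUEUE;
}

/* Behaviour with this patch: 'bypass' is kept and 'force' is set instead. */
enum sketch_where sketch_place_request(bool stopped_or_quiesced, bool bypass)
{
	bool force = stopped_or_quiesced;

	if (!bypass)
		return SKETCH_SCHED_QUEUE;
	if (force)
		return SKETCH_DISPATCH_LIST;
	return SKETCH_NOT_INSERTED;
}
```

In this model the two versions differ only for a bypass caller (such as
dm-rq) hitting a stopped or quiesced queue: the old version lands in the
scheduler queue, the new one in the dispatch list.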

Signed-off-by: Jianchao Wang <[email protected]>
---
block/blk-mq.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 14b4d06..11c52bb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1772,7 +1772,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
bool bypass)
{
struct request_queue *q = rq->q;
- bool run_queue = true;
+ bool run_queue = true, force = false;
blk_status_t ret = BLK_STS_RESOURCE;
int srcu_idx;

@@ -1786,7 +1786,7 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
*/
if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
- bypass = false;
+ force = true;
goto out_unlock;
}

@@ -1817,6 +1817,9 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (!bypass) {
blk_mq_sched_insert_request(rq, false, run_queue, false);
ret = BLK_STS_OK;
+ } else if (force) {
+ blk_mq_request_bypass_insert(rq, run_queue);
+ ret = BLK_STS_OK;
}
break;
default:
--
2.7.4


2018-11-14 08:47:24

by jianchao.wang

Subject: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly

Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
into one interface to unify the interfaces for issuing requests
directly. The merged interface takes over the request completely:
it inserts, ends or does nothing based on the return value of
.queue_rq and the 'bypass' parameter. The caller then needs no
further handling.
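
The decision described above can be sketched in plain user-space C roughly
as follows. This is only a model of the switch at the end of the merged
helper, not the kernel implementation; the model_* names and the
simplified status enum are assumptions made for the example.

```c
#include <stdbool.h>

/* Simplified stand-ins for blk_status_t values (not the kernel's). */
enum model_status {
	MODEL_STS_OK,
	MODEL_STS_RESOURCE,
	MODEL_STS_DEV_RESOURCE,
	MODEL_STS_IOERR,
};

/* What the helper did with the request after the issue attempt. */
enum model_action {
	MODEL_ACT_NONE,		/* issued, or handed back to the caller */
	MODEL_ACT_INSERT,	/* stands in for blk_mq_sched_insert_request() */
	MODEL_ACT_END,		/* stands in for blk_mq_end_request() */
};

struct model_result {
	enum model_status ret;		/* value the caller sees */
	enum model_action action;	/* what happened to the request */
};

/* Map the issue result and 'bypass' to an action and a return value. */
struct model_result model_handle_issue(enum model_status issue_ret, bool bypass)
{
	struct model_result res = { .ret = issue_ret, .action = MODEL_ACT_NONE };

	switch (issue_ret) {
	case MODEL_STS_OK:
		break;
	case MODEL_STS_DEV_RESOURCE:
	case MODEL_STS_RESOURCE:
		if (!bypass) {
			res.action = MODEL_ACT_INSERT;
			res.ret = MODEL_STS_OK;
		}
		break;
	default:
		if (!bypass) {
			res.action = MODEL_ACT_END;
			res.ret = MODEL_STS_OK;
		}
		break;
	}

	return res;
}
```

With 'bypass' false the caller always sees an OK status, because the helper
has already inserted or ended the request; with 'bypass' true the raw status
is passed back and the caller decides what to do.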

Signed-off-by: Jianchao Wang <[email protected]>
---
block/blk-mq.c | 93 ++++++++++++++++++++++++++++------------------------------
1 file changed, 45 insertions(+), 48 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 411be60..14b4d06 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
return ret;
}

-static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
- bool bypass_insert)
+ bool bypass)
{
struct request_queue *q = rq->q;
bool run_queue = true;
+ blk_status_t ret = BLK_STS_RESOURCE;
+ int srcu_idx;

+ hctx_lock(hctx, &srcu_idx);
/*
- * RCU or SRCU read lock is needed before checking quiesced flag.
+ * hctx_lock is needed before checking quiesced flag.
*
- * When queue is stopped or quiesced, ignore 'bypass_insert' from
- * blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
- * and avoid driver to try to dispatch again.
+ * When queue is stopped or quiesced, ignore 'bypass', insert
+ * and return BLK_STS_OK to caller, and avoid driver to try to
+ * dispatch again.
*/
- if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
+ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
run_queue = false;
- bypass_insert = false;
- goto insert;
+ bypass = false;
+ goto out_unlock;
}

- if (q->elevator && !bypass_insert)
- goto insert;
+ /*
+ * Bypass the potential scheduler on the bottom device.
+ */
+ if (unlikely(q->elevator && !bypass))
+ goto out_unlock;

- if (!blk_mq_get_dispatch_budget(hctx))
- goto insert;
+ if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
+ goto out_unlock;

- if (!blk_mq_get_driver_tag(rq)) {
+ if (unlikely(!blk_mq_get_driver_tag(rq))) {
blk_mq_put_dispatch_budget(hctx);
- goto insert;
+ goto out_unlock;
}

- return __blk_mq_issue_directly(hctx, rq, cookie);
-insert:
- if (bypass_insert)
- return BLK_STS_RESOURCE;
-
- blk_mq_sched_insert_request(rq, false, run_queue, false);
- return BLK_STS_OK;
-}
-
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
- struct request *rq, blk_qc_t *cookie)
-{
- blk_status_t ret;
- int srcu_idx;
-
- might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
+ ret = __blk_mq_issue_directly(hctx, rq, cookie);

- hctx_lock(hctx, &srcu_idx);
+out_unlock:
+ hctx_unlock(hctx, srcu_idx);

- ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
- if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
- blk_mq_sched_insert_request(rq, false, true, false);
- else if (ret != BLK_STS_OK)
- blk_mq_end_request(rq, ret);
+ switch (ret) {
+ case BLK_STS_OK:
+ break;
+ case BLK_STS_DEV_RESOURCE:
+ case BLK_STS_RESOURCE:
+ if (!bypass) {
+ blk_mq_sched_insert_request(rq, false, run_queue, false);
+ ret = BLK_STS_OK;
+ }
+ break;
+ default:
+ if (!bypass) {
+ blk_mq_end_request(rq, ret);
+ ret = BLK_STS_OK;
+ }
+ break;
+ }

- hctx_unlock(hctx, srcu_idx);
+ return ret;
}

blk_status_t blk_mq_request_issue_directly(struct request *rq)
{
- blk_status_t ret;
- int srcu_idx;
blk_qc_t unused_cookie;
- struct blk_mq_hw_ctx *hctx = rq->mq_hctx;

- hctx_lock(hctx, &srcu_idx);
- ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
- hctx_unlock(hctx, srcu_idx);
-
- return ret;
+ return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused_cookie, true);
}

void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
@@ -1958,13 +1955,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
if (same_queue_rq) {
data.hctx = same_queue_rq->mq_hctx;
blk_mq_try_issue_directly(data.hctx, same_queue_rq,
- &cookie);
+ &cookie, false);
}
} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
!data.hctx->dispatch_busy)) {
blk_mq_put_ctx(data.ctx);
blk_mq_bio_to_request(rq, bio);
- blk_mq_try_issue_directly(data.hctx, rq, &cookie);
+ blk_mq_try_issue_directly(data.hctx, rq, &cookie, false);
} else {
blk_mq_put_ctx(data.ctx);
blk_mq_bio_to_request(rq, bio);
--
2.7.4


2018-11-14 08:49:20

by jianchao.wang

Subject: [PATCH V7 3/4] blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests

It is not necessary to issue requests directly with bypass 'true'
in blk_mq_sched_insert_requests and have it handle the non-issued
requests itself. Just set bypass to 'false' and let
blk_mq_try_issue_directly handle them completely.

Signed-off-by: Jianchao Wang <[email protected]>
---
block/blk-mq-sched.c | 8 +++-----
block/blk-mq.c | 13 +++----------
2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 66fda19..9af57c8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -410,12 +410,10 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
* busy in case of 'none' scheduler, and this way may save
* us one extra enqueue & dequeue to sw queue.
*/
- if (!hctx->dispatch_busy && !e && !run_queue_async) {
+ if (!hctx->dispatch_busy && !e && !run_queue_async)
blk_mq_try_issue_list_directly(hctx, list);
- if (list_empty(list))
- return;
- }
- blk_mq_insert_requests(hctx, ctx, list);
+ else
+ blk_mq_insert_requests(hctx, ctx, list);
}

blk_mq_run_hw_queue(hctx, run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 11c52bb..049fd47 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1843,21 +1843,14 @@ blk_status_t blk_mq_request_issue_directly(struct request *rq)
void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
{
+ blk_qc_t unused_cookie;
+
while (!list_empty(list)) {
- blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);

list_del_init(&rq->queuelist);
- ret = blk_mq_request_issue_directly(rq);
- if (ret != BLK_STS_OK) {
- if (ret == BLK_STS_RESOURCE ||
- ret == BLK_STS_DEV_RESOURCE) {
- list_add(&rq->queuelist, list);
- break;
- }
- blk_mq_end_request(rq, ret);
- }
+ blk_mq_try_issue_directly(hctx, rq, &unused_cookie, false);
}
}

--
2.7.4


2018-11-14 09:13:10

by Ming Lei

Subject: Re: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly

On Wed, Nov 14, 2018 at 04:45:28PM +0800, Jianchao Wang wrote:
> Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
> into one interface to unify the interfaces to issue requests
> directly. The merged interface takes over the requests totally,
> it could insert, end or do nothing based on the return value of
> .queue_rq and 'bypass' parameter. Then caller needn't any other
> handling any more.
>
> Signed-off-by: Jianchao Wang <[email protected]>
> ---
> block/blk-mq.c | 93 ++++++++++++++++++++++++++++------------------------------
> 1 file changed, 45 insertions(+), 48 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 411be60..14b4d06 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1766,78 +1766,75 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
> return ret;
> }
>
> -static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> struct request *rq,
> blk_qc_t *cookie,
> - bool bypass_insert)
> + bool bypass)
> {
> struct request_queue *q = rq->q;
> bool run_queue = true;
> + blk_status_t ret = BLK_STS_RESOURCE;
> + int srcu_idx;
>
> + hctx_lock(hctx, &srcu_idx);
> /*
> - * RCU or SRCU read lock is needed before checking quiesced flag.
> + * hctx_lock is needed before checking quiesced flag.
> *
> - * When queue is stopped or quiesced, ignore 'bypass_insert' from
> - * blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
> - * and avoid driver to try to dispatch again.
> + * When queue is stopped or quiesced, ignore 'bypass', insert
> + * and return BLK_STS_OK to caller, and avoid driver to try to
> + * dispatch again.
> */
> - if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
> + if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
> run_queue = false;
> - bypass_insert = false;
> - goto insert;
> + bypass = false;
> + goto out_unlock;
> }
>
> - if (q->elevator && !bypass_insert)
> - goto insert;
> + /*
> + * Bypass the potential scheduler on the bottom device.
> + */
> + if (unlikely(q->elevator && !bypass))
> + goto out_unlock;
>
> - if (!blk_mq_get_dispatch_budget(hctx))
> - goto insert;
> + if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
> + goto out_unlock;

The unlikely annotation is a bit misleading, since out-of-budget can
happen frequently in case of low queue depth, and there are lots of
such examples.

>
> - if (!blk_mq_get_driver_tag(rq)) {
> + if (unlikely(!blk_mq_get_driver_tag(rq))) {

Same as above.

Thanks,
Ming

2018-11-14 09:23:18

by Ming Lei

Subject: Re: [PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced

On Wed, Nov 14, 2018 at 04:45:29PM +0800, Jianchao Wang wrote:
> When try to issue request directly, if the queue is stopped or
> quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller
> to avoid it dispatch request again. Then the request will be
> inserted with blk_mq_sched_insert_request. This is not correct
> for dm-rq case where we should avoid to pass through the underlying
> path's io scheduler.
>
> To fix it, use blk_mq_request_bypass_insert to insert the request
> to hctx->dispatch when we cannot pass through io scheduler but have
> to insert.

Not sure if the current behaviour is wrong, or worth a fix.

Bypassing the io scheduler for dm-rq is only for the sake of performance,
because there is already an io scheduler for the dm device, and we
just don't want to schedule these requests twice.

Given that it can be thought of as error handling, it shouldn't make a
big difference whether the request is put in the scheduler queue or the
->dispatch list when the queue is quiesced or stopped. What matters is
that all these requests can be dispatched again after the queue switches
back.


thanks,
Ming

2018-11-14 09:24:37

by jianchao.wang

Subject: Re: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly

Hi Ming

On 11/14/18 5:11 PM, Ming Lei wrote:
>>
>> - if (!blk_mq_get_dispatch_budget(hctx))
>> - goto insert;
>> + if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
>> + goto out_unlock;
> The unlikely annotation is a bit misleading, since out-of-budget can
> happen frequently in case of low queue depth, and there are lots of
> such examples.
>

This could be good for the cases where there is no .get_budget or where
getting the budget succeeds. In the out-of-budget case, we insert the
request, which is the slow path.
It should be OK. Maybe a comment should be added for this.

Thanks
Jianchao



2018-11-14 09:30:23

by jianchao.wang

Subject: Re: [PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced

Hi Ming

On 11/14/18 5:20 PM, Ming Lei wrote:
> On Wed, Nov 14, 2018 at 04:45:29PM +0800, Jianchao Wang wrote:
>> When try to issue request directly, if the queue is stopped or
>> quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller
>> to avoid it dispatch request again. Then the request will be
>> inserted with blk_mq_sched_insert_request. This is not correct
>> for dm-rq case where we should avoid to pass through the underlying
>> path's io scheduler.
>>
>> To fix it, use blk_mq_request_bypass_insert to insert the request
>> to hctx->dispatch when we cannot pass through io scheduler but have
>> to insert.
>
> Not sure if the current behaviour is wrong, or worth of a fix.
>
> Bypassing io scheduler for dm-rq is only for sake of performance
> because there has been io scheduler for dm device already, and we
> just don't want to schedule these requests twice.

As the commit message of commit 157f377beb710e84bd8bc7a3c4475c0674ebebd7
(block: directly insert blk-mq request from blk_insert_cloned_request())
says:

All said, a request-based DM multipath device's IO scheduler should be
the only one used -- when the original requests are issued to the
underlying paths as cloned requests they are inserted directly in the
underlying dispatch queue(s) rather than through an additional elevator.

But commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO
schedulers") switched blk_insert_cloned_request() from using
blk_mq_insert_request() to blk_mq_sched_insert_request(). Which
incorrectly added elevator machinery into a call chain that isn't
supposed to have any.

It sounds like a wrong action.

Thanks
Jianchao

>
> thanks,
> Ming
>

2018-11-14 09:36:34

by Ming Lei

Subject: Re: [PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced

On Wed, Nov 14, 2018 at 05:29:54PM +0800, jianchao.wang wrote:
> Hi Ming
>
> On 11/14/18 5:20 PM, Ming Lei wrote:
> > On Wed, Nov 14, 2018 at 04:45:29PM +0800, Jianchao Wang wrote:
> >> When try to issue request directly, if the queue is stopped or
> >> quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller
> >> to avoid it dispatch request again. Then the request will be
> >> inserted with blk_mq_sched_insert_request. This is not correct
> >> for dm-rq case where we should avoid to pass through the underlying
> >> path's io scheduler.
> >>
> >> To fix it, use blk_mq_request_bypass_insert to insert the request
> >> to hctx->dispatch when we cannot pass through io scheduler but have
> >> to insert.
> >
> > Not sure if the current behaviour is wrong, or worth of a fix.
> >
> > Bypassing io scheduler for dm-rq is only for sake of performance
> > because there has been io scheduler for dm device already, and we
> > just don't want to schedule these requests twice.
>
> As comment of commit 157f377beb710e84bd8bc7a3c4475c0674ebebd7
> (block: directly insert blk-mq request from blk_insert_cloned_request())
>
> All said, a request-based DM multipath device's IO scheduler should be
> the only one used -- when the original requests are issued to the
> underlying paths as cloned requests they are inserted directly in the
> underlying dispatch queue(s) rather than through an additional elevator.
>
> But commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO
> schedulers") switched blk_insert_cloned_request() from using
> blk_mq_insert_request() to blk_mq_sched_insert_request(). Which
> incorrectly added elevator machinery into a call chain that isn't
> supposed to have any.
>
> It sounds like a wrong action.

As I mentioned, it is only for the sake of performance, and an IO
scheduler has to be supported on these devices too. For example, one
partition may be under dm-rq while another partition is accessed directly.

However, you are fixing the handling when the queue is quiesced or stopped.
In this situation, it is fine to put requests into the scheduler queue,
given that there is no performance concern.

Thanks,
Ming

2018-11-14 09:44:32

by Ming Lei

Subject: Re: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly

On Wed, Nov 14, 2018 at 05:23:48PM +0800, jianchao.wang wrote:
> Hi Ming
>
> On 11/14/18 5:11 PM, Ming Lei wrote:
> >>
> >> - if (!blk_mq_get_dispatch_budget(hctx))
> >> - goto insert;
> >> + if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
> >> + goto out_unlock;
> > The unlikely annotation is a bit misleading, since out-of-budget can
> > happen frequently in case of low queue depth, and there are lots of
> > such examples.
> >
>
> This could be good for the case for no .get_budget and getting budget success.
> In case of out-of-budget, we insert the request which is slow path.

In case of low queue depth, it is hard to say that 'insert request' is
done in the slow path, because it happens quite frequently.

I suggest removing these two unlikely() since modern CPUs' branch
prediction should work well enough.

In particular, the unlikely() annotation usually means that the branch is
not taken most of the time in all settings, which is obviously not true
in this case.


thanks,
Ming

2018-11-14 15:23:32

by Jens Axboe

Subject: Re: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly

On 11/14/18 2:43 AM, Ming Lei wrote:
> On Wed, Nov 14, 2018 at 05:23:48PM +0800, jianchao.wang wrote:
>> Hi Ming
>>
>> On 11/14/18 5:11 PM, Ming Lei wrote:
>>>>
>>>> - if (!blk_mq_get_dispatch_budget(hctx))
>>>> - goto insert;
>>>> + if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
>>>> + goto out_unlock;
>>> The unlikely annotation is a bit misleading, since out-of-budget can
>>> happen frequently in case of low queue depth, and there are lots of
>>> such examples.
>>>
>>
>> This could be good for the case for no .get_budget and getting budget success.
>> In case of out-of-budget, we insert the request which is slow path.
>
> In case of low queue depth, it is hard to say that 'insert request' is
> done in slow path, cause it happens quite frequently.
>
> I suggest to remove these two unlikely() since modern CPU's branch prediction
> should work well enough.
>
> Especially the annotation of unlikely() often means that this branch is
> missed in most of times for all settings, and it is obviously not true
> in this case.

Agree, unlikely() should only be used for the error handling case or
similar that does indeed almost never trigger. It should not be used
for cases that don't trigger a lot in "most" circumstances.
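
As a small illustration of that rule of thumb, the hint macros can be
sketched in user space with the GCC/Clang builtin __builtin_expect. The
sketch_* macros and both functions are hypothetical and exist only for this
example: the budget check can fail routinely at low queue depth, so it
carries no hint, while the NULL check guards a genuine error path.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space hint macros built on the GCC/Clang builtin. */
#define sketch_likely(x)	__builtin_expect(!!(x), 1)
#define sketch_unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Hypothetical budget check. With a low queue depth this fails
 * routinely, so the failure branch gets no unlikely() hint.
 */
bool sketch_get_budget(int in_flight, int queue_depth)
{
	if (in_flight >= queue_depth)
		return false;
	return true;
}

/*
 * Hypothetical sanity check. A NULL request is a real error that
 * should almost never happen, so a hint is reasonable here.
 */
int sketch_check_request(const void *rq)
{
	if (sketch_unlikely(rq == NULL))
		return -1;
	return 0;
}
```

The hint only influences code layout and static prediction; it does not
change the value of the condition.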


--
Jens Axboe


2018-11-15 01:37:48

by jianchao.wang

Subject: Re: [PATCH V7 1/4] blk-mq: refactor the code of issue request directly



On 11/14/18 11:22 PM, Jens Axboe wrote:
> On 11/14/18 2:43 AM, Ming Lei wrote:
>> On Wed, Nov 14, 2018 at 05:23:48PM +0800, jianchao.wang wrote:
>>> Hi Ming
>>>
>>> On 11/14/18 5:11 PM, Ming Lei wrote:
>>>>>
>>>>> - if (!blk_mq_get_dispatch_budget(hctx))
>>>>> - goto insert;
>>>>> + if (unlikely(!blk_mq_get_dispatch_budget(hctx)))
>>>>> + goto out_unlock;
>>>> The unlikely annotation is a bit misleading, since out-of-budget can
>>>> happen frequently in case of low queue depth, and there are lots of
>>>> such examples.
>>>>
>>>
>>> This could be good for the case for no .get_budget and getting budget success.
>>> In case of out-of-budget, we insert the request which is slow path.
>>
>> In case of low queue depth, it is hard to say that 'insert request' is
>> done in slow path, cause it happens quite frequently.
>>
>> I suggest to remove these two unlikely() since modern CPU's branch prediction
>> should work well enough.
>>
>> Especially the annotation of unlikely() often means that this branch is
>> missed in most of times for all settings, and it is obviously not true
>> in this case.
>
> Agree, unlikely() should only be used for the error handling case or
> similar that does indeed almost never trigger. It should not be used
> for cases that don't trigger a lot in "most" circumstances.
>

Thank you all for your kind responses.
Fair enough regarding 'unlikely'.
I will remove these two incorrect 'unlikely' annotations in the next version.

Thanks
Jianchao


2018-11-15 01:38:32

by jianchao.wang

Subject: Re: [PATCH V7 2/4] blk-mq: fix issue directly case when q is stopped or quiesced



On 11/14/18 5:35 PM, Ming Lei wrote:
> On Wed, Nov 14, 2018 at 05:29:54PM +0800, jianchao.wang wrote:
>> Hi Ming
>>
>> On 11/14/18 5:20 PM, Ming Lei wrote:
>>> On Wed, Nov 14, 2018 at 04:45:29PM +0800, Jianchao Wang wrote:
>>>> When try to issue request directly, if the queue is stopped or
>>>> quiesced, 'bypass' will be ignored and return BLK_STS_OK to caller
>>>> to avoid it dispatch request again. Then the request will be
>>>> inserted with blk_mq_sched_insert_request. This is not correct
>>>> for dm-rq case where we should avoid to pass through the underlying
>>>> path's io scheduler.
>>>>
>>>> To fix it, use blk_mq_request_bypass_insert to insert the request
>>>> to hctx->dispatch when we cannot pass through io scheduler but have
>>>> to insert.
>>>
>>> Not sure if the current behaviour is wrong, or worth of a fix.
>>>
>>> Bypassing io scheduler for dm-rq is only for sake of performance
>>> because there has been io scheduler for dm device already, and we
>>> just don't want to schedule these requests twice.
>>
>> As comment of commit 157f377beb710e84bd8bc7a3c4475c0674ebebd7
>> (block: directly insert blk-mq request from blk_insert_cloned_request())
>>
>> All said, a request-based DM multipath device's IO scheduler should be
>> the only one used -- when the original requests are issued to the
>> underlying paths as cloned requests they are inserted directly in the
>> underlying dispatch queue(s) rather than through an additional elevator.
>>
>> But commit bd166ef18 ("blk-mq-sched: add framework for MQ capable IO
>> schedulers") switched blk_insert_cloned_request() from using
>> blk_mq_insert_request() to blk_mq_sched_insert_request(). Which
>> incorrectly added elevator machinery into a call chain that isn't
>> supposed to have any.
>>
>> It sounds like a wrong action.
>
> As I mentioned, it is only for the sake of performance, and IO scheduler
> has to be supported on these devices too, for example, one partition may
> be under dm-rq, and another partition can be accessed directly.
>
> However, you are fixing the handling when queue is quiesced or stopped.
> Under this situation, it is fine to put requests into scheduler queue,
> given no performance need to be worried.
>

OK, I will drop this one.

Thanks
Jianchao