2018-08-02 18:57:44

by Ming Lei

[permalink] [raw]
Subject: [PATCH V2 0/2] blk-mq: fix fix blk_mq_tagset_busy_iter

Hi Jens,

The two patches fixes boot hang issue in lots of test vm system.
And the cause is in blk_mq_tagset_busy_iter().

V2:
- move blk_mq_request_started() into header file as suggested by
James

Ming Lei (2):
blk-mq: move blk_mq_request_started() into header file
blk-mq: fix blk_mq_tagset_busy_iter

block/blk-mq-tag.c | 2 +-
block/blk-mq.c | 6 ------
block/blk-mq.h | 9 ---------
include/linux/blk-mq.h | 15 ++++++++++++++-
4 files changed, 15 insertions(+), 17 deletions(-)

Cc: Josef Bacik <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Matt Hart <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: John Garry <[email protected]>
Cc: Hannes Reinecke <[email protected]>,
Cc: "Martin K. Petersen" <[email protected]>,
Cc: James Bottomley <[email protected]>
Cc: [email protected]
Cc: [email protected]

--
2.9.5



2018-08-02 18:33:39

by Ming Lei

[permalink] [raw]
Subject: [PATCH V2 2/2] blk-mq: fix blk_mq_tagset_busy_iter

Commit d250bf4e776ff09d5("blk-mq: only iterate over inflight requests
in blk_mq_tagset_busy_iter") uses 'blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT'
to replace 'blk_mq_request_started(req)', this way is wrong, and causes
lots of test system hang during booting.

Fix the issue by using blk_mq_request_started(req) inside bt_tags_iter().

Fixes: d250bf4e776ff09d5 ("blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter")
Cc: Josef Bacik <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Matt Hart <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: John Garry <[email protected]>
Cc: Hannes Reinecke <[email protected]>,
Cc: "Martin K. Petersen" <[email protected]>,
Cc: James Bottomley <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reported-by: Mark Brown <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
block/blk-mq-tag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 09b2ee6694fb..3de0836163c2 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -271,7 +271,7 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
* test and set the bit before assining ->rqs[].
*/
rq = tags->rqs[bitnr];
- if (rq && blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
+ if (rq && blk_mq_request_started(rq))
iter_data->fn(rq, iter_data->data, reserved);

return true;
--
2.9.5


2018-08-02 18:50:21

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH V2 1/2] blk-mq: move blk_mq_request_started() into header file

On Fri, 2018-08-03 at 01:49 +-0800, Ming Lei wrote:
+AD4- Define it as static inline given it might be called in fast path.

Reviewed-by: Bart Van Assche +ADw-bart.vanassche+AEA-wdc.com+AD4-



2018-08-02 18:51:33

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] blk-mq: fix blk_mq_tagset_busy_iter

On Fri, 2018-08-03 at 01:49 +-0800, Ming Lei wrote:
+AD4- Commit d250bf4e776ff09d5(+ACI-blk-mq: only iterate over inflight requests
+AD4- in blk+AF8-mq+AF8-tagset+AF8-busy+AF8-iter+ACI-) uses 'blk+AF8-mq+AF8-rq+AF8-state(rq) +AD0APQ- MQ+AF8-RQ+AF8-IN+AF8-FLIGHT'
+AD4- to replace 'blk+AF8-mq+AF8-request+AF8-started(req)', this way is wrong, and causes
+AD4- lots of test system hang during booting.
+AD4-
+AD4- Fix the issue by using blk+AF8-mq+AF8-request+AF8-started(req) inside bt+AF8-tags+AF8-iter().

That's a good catch. Hence:

Reviewed-by: Bart Van Assche +ADw-bart.vanassche+AEA-wdc.com+AD4-



2018-08-02 18:58:27

by Ming Lei

[permalink] [raw]
Subject: [PATCH V2 1/2] blk-mq: move blk_mq_request_started() into header file

Define it as static inline given it might be called in fast path.

Suggested-by: James Bottomley <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Matt Hart <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: John Garry <[email protected]>
Cc: Hannes Reinecke <[email protected]>,
Cc: "Martin K. Petersen" <[email protected]>,
Cc: James Bottomley <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
block/blk-mq.c | 6 ------
block/blk-mq.h | 9 ---------
include/linux/blk-mq.h | 15 ++++++++++++++-
3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c92ce06fd565..7afd1c925fa8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -619,12 +619,6 @@ void blk_mq_complete_request(struct request *rq)
}
EXPORT_SYMBOL(blk_mq_complete_request);

-int blk_mq_request_started(struct request *rq)
-{
- return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
-}
-EXPORT_SYMBOL_GPL(blk_mq_request_started);
-
void blk_mq_start_request(struct request *rq)
{
struct request_queue *q = rq->q;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9497b47e2526..18bc448b61a3 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -90,15 +90,6 @@ extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);

void blk_mq_release(struct request_queue *q);

-/**
- * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
- * @rq: target request.
- */
-static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
-{
- return READ_ONCE(rq->state);
-}
-
static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
unsigned int cpu)
{
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1da59c16f637..cc0359b83fb8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -250,7 +250,6 @@ static inline u16 blk_mq_unique_tag_to_tag(u32 unique_tag)
}


-int blk_mq_request_started(struct request *rq);
void blk_mq_start_request(struct request *rq);
void blk_mq_end_request(struct request *rq, blk_status_t error);
void __blk_mq_end_request(struct request *rq, blk_status_t error);
@@ -303,6 +302,20 @@ static inline bool blk_mq_mark_complete(struct request *rq)
MQ_RQ_IN_FLIGHT;
}

+/**
+ * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
+ * @rq: target request.
+ */
+static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
+{
+ return READ_ONCE(rq->state);
+}
+
+static inline int blk_mq_request_started(struct request *rq)
+{
+ return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
+}
+
/*
* Driver command data is immediately after the request. So subtract request
* size to get back to the original request, add request size to get the PDU.
--
2.9.5


2018-08-02 19:00:10

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] blk-mq: fix blk_mq_tagset_busy_iter

On Fri, Aug 03, 2018 at 01:49:37AM +0800, Ming Lei wrote:
> Commit d250bf4e776ff09d5("blk-mq: only iterate over inflight requests
> in blk_mq_tagset_busy_iter") uses 'blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT'
> to replace 'blk_mq_request_started(req)', this way is wrong, and causes
> lots of test system hang during booting.
>
> Fix the issue by using blk_mq_request_started(req) inside bt_tags_iter().
>
> Fixes: d250bf4e776ff09d5 ("blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter")
> Cc: Josef Bacik <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: Mark Brown <[email protected]>
> Cc: Matt Hart <[email protected]>
> Cc: Johannes Thumshirn <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Hannes Reinecke <[email protected]>,
> Cc: "Martin K. Petersen" <[email protected]>,
> Cc: James Bottomley <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Reported-by: Mark Brown <[email protected]>
> Reported-by: Guenter Roeck <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>

Tested-by: Guenter Roeck <[email protected]>

Test builds (with both patches applied):

Building mips:malta_defconfig:nosmp ... running ..... passed
Building mips:malta_defconfig:smp ... running ..... passed
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:sata:rootfs ... running ..... passed
Building x86_64:q35:IvyBridge:defconfig:smp:nvme:rootfs ... running ..... passed
Building x86_64:q35:SandyBridge:defconfig:smp:usb:rootfs ... running ...... passed
Building x86_64:q35:Haswell:defconfig:smp:usb-uas:rootfs ... running ...... passed
Building x86_64:q35:Skylake-Client:defconfig:smp:mmc:rootfs ... running ...... passed
Building x86_64:q35:Conroe:defconfig:smp:scsi[DC395]:rootfs ... running ....... passed
Building x86_64:q35:Nehalem:defconfig:smp:scsi[AM53C974]:rootfs ... running ....... passed
Building x86_64:q35:Westmere-IBRS:defconfig:smp:scsi[53C810]:rootfs ... running ...... passed
Building x86_64:q35:Skylake-Server:defconfig:smp:scsi[53C895A]:rootfs ... running ...... passed
Building x86_64:pc:EPYC:defconfig:smp:scsi[MEGASAS]:rootfs ... running ...... passed
Building x86_64:q35:EPYC-IBPB:defconfig:smp:scsi[MEGASAS2]:rootfs ... running ..... passed
Building x86_64:q35:Opteron_G5:defconfig:smp:scsi[FUSION]:rootfs ... running ..... passed
Building x86_64:pc:phenom:defconfig:smp:initrd ... running ..... passed
Building x86_64:q35:Opteron_G1:defconfig:smp:initrd ... running ..... passed
Building x86_64:pc:Opteron_G2:defconfig:smp:sata:rootfs ... running ..... passed
Building x86_64:q35:core2duo:defconfig:smp:usb:rootfs ... running ...... passed
Building x86_64:pc:Opteron_G3:defconfig:nosmp:usb:rootfs ... running ....... passed
Building x86_64:q35:Opteron_G4:defconfig:nosmp:sata:rootfs ... running ...... passed

Guenter

2018-08-02 20:49:39

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH V2 0/2] blk-mq: fix fix blk_mq_tagset_busy_iter

On 8/2/18 11:49 AM, Ming Lei wrote:
> Hi Jens,
>
> The two patches fixes boot hang issue in lots of test vm system.
> And the cause is in blk_mq_tagset_busy_iter().
>
> V2:
> - move blk_mq_request_started() into header file as suggested by
> James

I'd be fine with moving that into a header, if the target of this
was 4.19. But it's not, so there's absolutely no point to include
that in this series.

I'm going to apply 2/2 for 4.18.

--
Jens Axboe