2017-07-07 13:46:13

by Stefan Haberland

Subject: blk-mq timeout question

While changing the DASD device driver to use the blk-mq interfaces I
found the following unexpected behavior:

In case of a timeout, our complete callback is never called. Here is the
sequence of events as I understand it:

- timeout occurs
- blk_mq_check_expired() calls blk_mark_rq_complete(), which marks the
  request as complete
- our .timeout callback is called; it schedules delayed work and returns
  BLK_EH_NOT_HANDLED
- our worker calls blk_mq_complete_request()
- this also checks blk_mark_rq_complete(), finds the request already
  marked, and therefore never calls our complete callback

Question:
Should blk_clear_rq_complete() also be called for the BLK_EH_NOT_HANDLED
case?

Regards,
Stefan
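
The sequence described in the message above can be modelled in plain
user-space C. This is only a sketch under stated assumptions, not kernel
code: struct request, mark_rq_complete(), check_expired() and the other
names are simplified stand-ins invented for illustration, and a single
bool plays the role of the per-request "complete" marker that
blk_mark_rq_complete() tests and sets.

```c
/* User-space model of the completion marker; not kernel code. */
#include <assert.h>
#include <stdbool.h>

enum eh_return { EH_NOT_HANDLED, EH_HANDLED, EH_RESET_TIMER };

struct request {
	bool complete_flag;   /* stand-in for the "complete" marker */
	int  complete_calls;  /* how often the complete callback ran */
};

/* Like blk_mark_rq_complete(): set the flag, return its old value. */
static bool mark_rq_complete(struct request *rq)
{
	bool was_set = rq->complete_flag;

	rq->complete_flag = true;
	return was_set;
}

static void clear_rq_complete(struct request *rq)
{
	rq->complete_flag = false;
}

/* Stand-in for the driver's complete callback. */
static void driver_complete(struct request *rq)
{
	rq->complete_calls++;
}

/* Completion path: only the caller that wins the flag completes. */
static void complete_request(struct request *rq)
{
	if (!mark_rq_complete(rq))
		driver_complete(rq);
}

/* Timeout path: claim the request, then act on the driver's verdict. */
static void check_expired(struct request *rq, enum eh_return verdict)
{
	if (mark_rq_complete(rq))
		return;                 /* already completed normally */

	switch (verdict) {
	case EH_HANDLED:
		driver_complete(rq);
		break;
	case EH_RESET_TIMER:
		clear_rq_complete(rq);  /* request is live again */
		break;
	case EH_NOT_HANDLED:
		break;                  /* flag stays set */
	}
}

/* Timeout fires first, then the driver's delayed worker completes. */
static int complete_calls_after_timeout(enum eh_return verdict)
{
	struct request rq = { false, 0 };

	check_expired(&rq, verdict);
	complete_request(&rq);          /* the delayed worker */
	assert(rq.complete_calls <= 1); /* never completed twice */
	return rq.complete_calls;
}
```

In this model, complete_calls_after_timeout(EH_NOT_HANDLED) returns 0:
the timeout path leaves the flag set, so the worker's later completion
is silently dropped. With EH_RESET_TIMER the flag is cleared and the
worker's completion goes through.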


2017-07-10 02:24:02

by Ming Lei

Subject: Re: blk-mq timeout question

On Fri, Jul 7, 2017 at 9:45 PM, Stefan Haberland <[email protected]> wrote:
> While changing the DASD device driver to use the blk-mq interfaces I found
> the following unexpected behavior:
>
> In case of a timeout, our complete callback is never called. Here is the
> sequence of events as I understand it:
>
> - timeout occurs
> - blk_mq_check_expired() calls blk_mark_rq_complete(), which marks the
>   request as complete
> - our .timeout callback is called; it schedules delayed work and returns
>   BLK_EH_NOT_HANDLED
> - our worker calls blk_mq_complete_request()
> - this also checks blk_mark_rq_complete(), finds the request already
>   marked, and therefore never calls our complete callback
>
> Question:
> Should blk_clear_rq_complete() also be called for the BLK_EH_NOT_HANDLED
> case?

From the comment in blk_rq_timed_out():

	case BLK_EH_NOT_HANDLED:
		/*
		 * LLD handles this for now but in the future
		 * we can send a request msg to abort the command
		 * and we can move more of the generic scsi eh code to
		 * the blk layer.
		 */

Looks like you can/should handle this case inside DASD rather than in the blk layer.


--
Ming Lei
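
One possible way for a driver to handle the case itself, sketched as a
plain user-space C model rather than kernel code, is to have its timeout
handler ask for the timer to be re-armed while its own recovery is still
running, so that the request stays live and a later completion is not
dropped. This is an assumption made for illustration, not something the
thread states about DASD; driver_timeout(), check_expired(),
complete_request() and run_recovery() are hypothetical names, and the
model assumes that a "reset timer" verdict clears the complete marker
again.

```c
/* User-space model of driver-side timeout handling; not kernel code. */
#include <assert.h>
#include <stdbool.h>

enum eh_return { EH_NOT_HANDLED, EH_RESET_TIMER };

struct request {
	bool complete_flag;   /* stand-in for the "complete" marker */
	int  timeouts;        /* how often the timeout handler ran */
	int  complete_calls;  /* how often the complete callback ran */
};

/* Hypothetical driver handler: start recovery, keep the request live. */
static enum eh_return driver_timeout(struct request *rq)
{
	rq->timeouts++;
	/* a real driver would schedule its recovery work here */
	return EH_RESET_TIMER;
}

/* Timeout path: claim the request, release it again on RESET_TIMER. */
static void check_expired(struct request *rq)
{
	if (rq->complete_flag)
		return;                    /* already completed */
	rq->complete_flag = true;
	if (driver_timeout(rq) == EH_RESET_TIMER)
		rq->complete_flag = false; /* request is live again */
}

/* Completion path used by the driver's recovery worker. */
static void complete_request(struct request *rq)
{
	if (rq->complete_flag)
		return;
	rq->complete_flag = true;
	rq->complete_calls++;
}

/* The timer fires several times before recovery finally completes. */
static struct request run_recovery(int timeouts_before_done)
{
	struct request rq = { false, 0, 0 };

	assert(timeouts_before_done >= 0);
	for (int i = 0; i < timeouts_before_done; i++)
		check_expired(&rq);
	complete_request(&rq);  /* recovery worker finishes */
	check_expired(&rq);     /* a late timer is ignored */
	return rq;
}
```

The trade-off in this model is that the timeout handler can run more
than once for the same request, so it has to tolerate being called
again while recovery is already under way.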