2014-06-04 15:35:50

by KY Srinivasan

[permalink] [raw]
Subject: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
basic I/O timeout of the device. Fix this bug.

Signed-off-by: K. Y. Srinivasan <[email protected]>
---
drivers/scsi/sd.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e9689d5..54150b1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq)

static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq)
{
- rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
+ int timeout = sdp->request_queue->rq_timeout;
+
+ rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
rq->retries = SD_MAX_RETRIES;
rq->cmd[0] = SYNCHRONIZE_CACHE;
rq->cmd_len = 10;
--
1.7.4.1


2014-06-04 17:02:19

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
> FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
> basic I/O timeout of the device. Fix this bug.
>
> Signed-off-by: K. Y. Srinivasan <[email protected]>
> ---
> drivers/scsi/sd.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index e9689d5..54150b1 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq)
>
> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq)
> {
> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> + int timeout = sdp->request_queue->rq_timeout;
> +
> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);

Could you share where you found this to be a problem? It looks like a
bug in block because all inbound requests being prepared should have a
timeout set, so block would be the place to fix it.

I can't see how this can happen because we definitely add the timer
after the request is prepared in my reading of the block code, unless
I'm missing some path in block that violates this.

James

2014-06-04 17:15:21

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: James Bottomley [mailto:[email protected]]
> Sent: Wednesday, June 4, 2014 10:02 AM
> To: KY Srinivasan
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
> > derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
> > patch did not use the basic I/O timeout of the device. Fix this bug.
> >
> > Signed-off-by: K. Y. Srinivasan <[email protected]>
> > ---
> > drivers/scsi/sd.c | 4 +++-
> > 1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
> > e9689d5..54150b1 100644
> > --- a/drivers/scsi/sd.c
> > +++ b/drivers/scsi/sd.c
> > @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
> > scsi_device *sdp, struct request *rq)
> >
> > static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
> > request *rq) {
> > - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> > + int timeout = sdp->request_queue->rq_timeout;
> > +
> > + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
>
> Could you share where you found this to be a problem? It looks like a bug in
> block because all inbound requests being prepared should have a timeout
> set, so block would be the place to fix it.

Perhaps; what I found was that the value in rq->timeout was 0 coming into
this function and thus multiplying obviously has no effect.

>
> I can't see how this can happen because we definitely add the timer after the
> request is prepared in my reading of the block code, unless I'm missing some
> path in block that violates this.
>
> James

K. Y

2014-06-06 01:42:03

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On 06/04/2014 12:15 PM, KY Srinivasan wrote:
>
>
>> -----Original Message-----
>> From: James Bottomley [mailto:[email protected]]
>> Sent: Wednesday, June 4, 2014 10:02 AM
>> To: KY Srinivasan
>> Cc: [email protected]; [email protected];
>> [email protected]; [email protected]; linux-
>> [email protected]; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
>> from the basic I/O timeout
>>
>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
>>> patch did not use the basic I/O timeout of the device. Fix this bug.
>>>
>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
>>> ---
>>> drivers/scsi/sd.c | 4 +++-
>>> 1 files changed, 3 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
>>> e9689d5..54150b1 100644
>>> --- a/drivers/scsi/sd.c
>>> +++ b/drivers/scsi/sd.c
>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
>>> scsi_device *sdp, struct request *rq)
>>>
>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
>>> request *rq) {
>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
>>> + int timeout = sdp->request_queue->rq_timeout;
>>> +
>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
>>
>> Could you share where you found this to be a problem? It looks like a bug in
>> block because all inbound requests being prepared should have a timeout
>> set, so block would be the place to fix it.
>
> Perhaps; what I found was that the value in rq->timeout was 0 coming into
> this function and thus multiplying obviously has no effect.
>

I think you are right. We hit this problem because we are doing:

scsi_request_fn -> blk_peek_request -> sd_prep_fn -> scsi_setup_flush_cmnd.

At this time request->timeout is zero so the multiplication does
nothing. See how sd_setup_write_same_cmnd will set the request->timeout
at this time.

Then in scsi_request_fn we do:

scsi_request_fn -> blk_start_request -> blk_add_timer.

At this time it will set the request->timeout if something like req
block pc users (like scsi_execute() or block/scsi_ioctl.c) or the write
same code mentioned above have not set the timeout.


2014-06-06 02:53:21

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: Mike Christie [mailto:[email protected]]
> Sent: Thursday, June 5, 2014 6:33 PM
> To: KY Srinivasan
> Cc: James Bottomley; [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
> >
> >
> >> -----Original Message-----
> >> From: James Bottomley [mailto:[email protected]]
> >> Sent: Wednesday, June 4, 2014 10:02 AM
> >> To: KY Srinivasan
> >> Cc: [email protected]; [email protected];
> >> [email protected]; [email protected]; linux-
> >> [email protected]; [email protected]; [email protected];
> >> [email protected]
> >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> >> FLUSH_TIMEOUT from the basic I/O timeout
> >>
> >> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> >>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
> >>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
> >>> patch did not use the basic I/O timeout of the device. Fix this bug.
> >>>
> >>> Signed-off-by: K. Y. Srinivasan <[email protected]>
> >>> ---
> >>> drivers/scsi/sd.c | 4 +++-
> >>> 1 files changed, 3 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
> >>> e9689d5..54150b1 100644
> >>> --- a/drivers/scsi/sd.c
> >>> +++ b/drivers/scsi/sd.c
> >>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
> >>> scsi_device *sdp, struct request *rq)
> >>>
> >>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
> >>> request *rq) {
> >>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> >>> + int timeout = sdp->request_queue->rq_timeout;
> >>> +
> >>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
> >>
> >> Could you share where you found this to be a problem? It looks like
> >> a bug in block because all inbound requests being prepared should
> >> have a timeout set, so block would be the place to fix it.
> >
> > Perhaps; what I found was that the value in rq->timeout was 0 coming
> > into this function and thus multiplying obviously has no effect.
> >
>
> I think you are right. We hit this problem because we are doing:
>
> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
> scsi_setup_flush_cmnd.
>
> At this time request->timeout is zero so the multiplication does nothing. See
> how sd_setup_write_same_cmnd will set the request->timeout at this time.
>
> Then in scsi_request_fn we do:
>
> scsi_request_fn -> blk_start_request -> blk_add_timer.
>
> At this time it will set the request->timeout if something like req block pc
> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code
> mentioned above have not set the timeout.

I don't think this is a recent change. Prior to this commit, we were setting the timeout
value in this function; it just happened to be a different constant unrelated to the I/O
timeout.

K. Y
>
>

2014-06-06 17:23:33

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On 6/5/14, 9:53 PM, KY Srinivasan wrote:
>
>
>> -----Original Message-----
>> From: Mike Christie [mailto:[email protected]]
>> Sent: Thursday, June 5, 2014 6:33 PM
>> To: KY Srinivasan
>> Cc: James Bottomley; [email protected]; [email protected];
>> [email protected]; [email protected]; linux-
>> [email protected]; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
>> from the basic I/O timeout
>>
>> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: James Bottomley [mailto:[email protected]]
>>>> Sent: Wednesday, June 4, 2014 10:02 AM
>>>> To: KY Srinivasan
>>>> Cc: [email protected]; [email protected];
>>>> [email protected]; [email protected]; linux-
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]
>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
>>>> FLUSH_TIMEOUT from the basic I/O timeout
>>>>
>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
>>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
>>>>> patch did not use the basic I/O timeout of the device. Fix this bug.
>>>>>
>>>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
>>>>> ---
>>>>> drivers/scsi/sd.c | 4 +++-
>>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
>>>>> e9689d5..54150b1 100644
>>>>> --- a/drivers/scsi/sd.c
>>>>> +++ b/drivers/scsi/sd.c
>>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
>>>>> scsi_device *sdp, struct request *rq)
>>>>>
>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
>>>>> request *rq) {
>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
>>>>> + int timeout = sdp->request_queue->rq_timeout;
>>>>> +
>>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
>>>>
>>>> Could you share where you found this to be a problem? It looks like
>>>> a bug in block because all inbound requests being prepared should
>>>> have a timeout set, so block would be the place to fix it.
>>>
>>> Perhaps; what I found was that the value in rq->timeout was 0 coming
>>> into this function and thus multiplying obviously has no effect.
>>>
>>
>> I think you are right. We hit this problem because we are doing:
>>
>> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
>> scsi_setup_flush_cmnd.
>>
>> At this time request->timeout is zero so the multiplication does nothing. See
>> how sd_setup_write_same_cmnd will set the request->timeout at this time.
>>
>> Then in scsi_request_fn we do:
>>
>> scsi_request_fn -> blk_start_request -> blk_add_timer.
>>
>> At this time it will set the request->timeout if something like req block pc
>> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code
>> mentioned above have not set the timeout.
>
> I don't think this is a recent change. Prior to this commit, we were setting the timeout
> value in this function; it just happened to be a different constant unrelated to the I/O
> timeout.
>

Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was
merged we were supposed to initialize it like in your patch in this thread.

I guess we could do your patch in this thread, or if we want the block
layer to initialize the timeout before the prep_fn callout is called
then we would need to have the blk-flush.c code to that when it sets up
the request. If we do the latter, do we want the discard and write same
code to initialize the request's timeout before the prep_fn callout is
called too?

2014-06-06 17:52:59

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote:
> On 6/5/14, 9:53 PM, KY Srinivasan wrote:
> >
> >
> >> -----Original Message-----
> >> From: Mike Christie [mailto:[email protected]]
> >> Sent: Thursday, June 5, 2014 6:33 PM
> >> To: KY Srinivasan
> >> Cc: James Bottomley; [email protected]; [email protected];
> >> [email protected]; [email protected]; linux-
> >> [email protected]; [email protected]; [email protected];
> >> [email protected]
> >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> >> from the basic I/O timeout
> >>
> >> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: James Bottomley [mailto:[email protected]]
> >>>> Sent: Wednesday, June 4, 2014 10:02 AM
> >>>> To: KY Srinivasan
> >>>> Cc: [email protected]; [email protected];
> >>>> [email protected]; [email protected]; linux-
> >>>> [email protected]; [email protected]; [email protected];
> >>>> [email protected]
> >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> >>>> FLUSH_TIMEOUT from the basic I/O timeout
> >>>>
> >>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> >>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
> >>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
> >>>>> patch did not use the basic I/O timeout of the device. Fix this bug.
> >>>>>
> >>>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
> >>>>> ---
> >>>>> drivers/scsi/sd.c | 4 +++-
> >>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
> >>>>> e9689d5..54150b1 100644
> >>>>> --- a/drivers/scsi/sd.c
> >>>>> +++ b/drivers/scsi/sd.c
> >>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
> >>>>> scsi_device *sdp, struct request *rq)
> >>>>>
> >>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
> >>>>> request *rq) {
> >>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> >>>>> + int timeout = sdp->request_queue->rq_timeout;
> >>>>> +
> >>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
> >>>>
> >>>> Could you share where you found this to be a problem? It looks like
> >>>> a bug in block because all inbound requests being prepared should
> >>>> have a timeout set, so block would be the place to fix it.
> >>>
> >>> Perhaps; what I found was that the value in rq->timeout was 0 coming
> >>> into this function and thus multiplying obviously has no effect.
> >>>
> >>
> >> I think you are right. We hit this problem because we are doing:
> >>
> >> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
> >> scsi_setup_flush_cmnd.
> >>
> >> At this time request->timeout is zero so the multiplication does nothing. See
> >> how sd_setup_write_same_cmnd will set the request->timeout at this time.
> >>
> >> Then in scsi_request_fn we do:
> >>
> >> scsi_request_fn -> blk_start_request -> blk_add_timer.
> >>
> >> At this time it will set the request->timeout if something like req block pc
> >> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code
> >> mentioned above have not set the timeout.
> >
> > I don't think this is a recent change. Prior to this commit, we were setting the timeout
> > value in this function; it just happened to be a different constant unrelated to the I/O
> > timeout.
> >
>
> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was
> merged we were supposed to initialize it like in your patch in this thread.
>
> I guess we could do your patch in this thread, or if we want the block
> layer to initialize the timeout before the prep_fn callout is called
> then we would need to have the blk-flush.c code to that when it sets up
> the request. If we do the latter, do we want the discard and write same
> code to initialize the request's timeout before the prep_fn callout is
> called too?

I looked through the call chain; it seems to be intentional behaviour on
the part of block. Just from an mq point of view, it would make better
code if we unconditionally initialised rq->timeout early and allowed
prep to modify it and then dumped the if(!req->timeout) in
blk_add_timer(), but it's a marginal if condition that would compile to
a conditional store on sensible architectures, so losing the conditional
probably isn't worth worrying about.

Cc'd Jens for his opinion with the block patch

James

---

diff --git a/block/blk-core.c b/block/blk-core.c
index a0e3096..cad6b2a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -111,6 +111,7 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
rq->cmd = rq->__cmd;
rq->cmd_len = BLK_MAX_CDB;
rq->tag = -1;
+ rq->timeout = q->rq_timeout;
rq->start_time = jiffies;
set_start_time_ns(rq);
rq->part = NULL;
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index d96f706..9063ade 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -180,13 +180,6 @@ void __blk_add_timer(struct request *req, struct list_head *timeout_list)

BUG_ON(!list_empty(&req->timeout_list));

- /*
- * Some LLDs, like scsi, peek at the timeout to prevent a
- * command from being retried forever.
- */
- if (!req->timeout)
- req->timeout = q->rq_timeout;
-
req->deadline = jiffies + req->timeout;
if (timeout_list)
list_add_tail(&req->timeout_list, timeout_list);

2014-06-06 18:22:55

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On 2014-06-06 11:52, James Bottomley wrote:
> On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote:
>> On 6/5/14, 9:53 PM, KY Srinivasan wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Mike Christie [mailto:[email protected]]
>>>> Sent: Thursday, June 5, 2014 6:33 PM
>>>> To: KY Srinivasan
>>>> Cc: James Bottomley; [email protected]; [email protected];
>>>> [email protected]; [email protected]; linux-
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]
>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
>>>> from the basic I/O timeout
>>>>
>>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: James Bottomley [mailto:[email protected]]
>>>>>> Sent: Wednesday, June 4, 2014 10:02 AM
>>>>>> To: KY Srinivasan
>>>>>> Cc: [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; linux-
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]
>>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
>>>>>> FLUSH_TIMEOUT from the basic I/O timeout
>>>>>>
>>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
>>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to
>>>>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this
>>>>>>> patch did not use the basic I/O timeout of the device. Fix this bug.
>>>>>>>
>>>>>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
>>>>>>> ---
>>>>>>> drivers/scsi/sd.c | 4 +++-
>>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
>>>>>>> e9689d5..54150b1 100644
>>>>>>> --- a/drivers/scsi/sd.c
>>>>>>> +++ b/drivers/scsi/sd.c
>>>>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct
>>>>>>> scsi_device *sdp, struct request *rq)
>>>>>>>
>>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct
>>>>>>> request *rq) {
>>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
>>>>>>> + int timeout = sdp->request_queue->rq_timeout;
>>>>>>> +
>>>>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
>>>>>>
>>>>>> Could you share where you found this to be a problem? It looks like
>>>>>> a bug in block because all inbound requests being prepared should
>>>>>> have a timeout set, so block would be the place to fix it.
>>>>>
>>>>> Perhaps; what I found was that the value in rq->timeout was 0 coming
>>>>> into this function and thus multiplying obviously has no effect.
>>>>>
>>>>
>>>> I think you are right. We hit this problem because we are doing:
>>>>
>>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
>>>> scsi_setup_flush_cmnd.
>>>>
>>>> At this time request->timeout is zero so the multiplication does nothing. See
>>>> how sd_setup_write_same_cmnd will set the request->timeout at this time.
>>>>
>>>> Then in scsi_request_fn we do:
>>>>
>>>> scsi_request_fn -> blk_start_request -> blk_add_timer.
>>>>
>>>> At this time it will set the request->timeout if something like req block pc
>>>> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code
>>>> mentioned above have not set the timeout.
>>>
>>> I don't think this is a recent change. Prior to this commit, we were setting the timeout
>>> value in this function; it just happened to be a different constant unrelated to the I/O
>>> timeout.
>>>
>>
>> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was
>> merged we were supposed to initialize it like in your patch in this thread.
>>
>> I guess we could do your patch in this thread, or if we want the block
>> layer to initialize the timeout before the prep_fn callout is called
>> then we would need to have the blk-flush.c code to that when it sets up
>> the request. If we do the latter, do we want the discard and write same
>> code to initialize the request's timeout before the prep_fn callout is
>> called too?
>
> I looked through the call chain; it seems to be intentional behaviour on
> the part of block. Just from an mq point of view, it would make better
> code if we unconditionally initialised rq->timeout early and allowed
> prep to modify it and then dumped the if(!req->timeout) in
> blk_add_timer(), but it's a marginal if condition that would compile to
> a conditional store on sensible architectures, so losing the conditional
> probably isn't worth worrying about.
>
> Cc'd Jens for his opinion with the block patch

I just committed this one earlier today:

http://git.kernel.dk/?p=linux-block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f

since I ran into the same thing on nvme. Either approach is fine with
me, as they both allow override of the timeout before insertion. But
we've always done the rq->timeout = 0 init, so I think we should just
reinstate that behavior.

--
Jens Axboe

2014-06-20 21:37:10

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: Jens Axboe [mailto:[email protected]]
> Sent: Friday, June 6, 2014 11:23 AM
> To: James Bottomley; [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; KY Srinivasan; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
> On 2014-06-06 11:52, James Bottomley wrote:
> > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote:
> >> On 6/5/14, 9:53 PM, KY Srinivasan wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Mike Christie [mailto:[email protected]]
> >>>> Sent: Thursday, June 5, 2014 6:33 PM
> >>>> To: KY Srinivasan
> >>>> Cc: James Bottomley; [email protected];
> >>>> [email protected]; [email protected];
> [email protected];
> >>>> linux- [email protected]; [email protected];
> >>>> [email protected]; [email protected]
> >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> >>>> FLUSH_TIMEOUT from the basic I/O timeout
> >>>>
> >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: James Bottomley [mailto:[email protected]]
> >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM
> >>>>>> To: KY Srinivasan
> >>>>>> Cc: [email protected]; [email protected];
> >>>>>> [email protected]; [email protected]; linux-
> >>>>>> [email protected]; [email protected];
> >>>>>> [email protected]; [email protected]
> >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> >>>>>> FLUSH_TIMEOUT from the basic I/O timeout
> >>>>>>
> >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added
> code
> >>>>>>> to derive the FLUSH_TIMEOUT from the basic I/O timeout.
> However,
> >>>>>>> this patch did not use the basic I/O timeout of the device. Fix this
> bug.
> >>>>>>>
> >>>>>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
> >>>>>>> ---
> >>>>>>> drivers/scsi/sd.c | 4 +++-
> >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
> >>>>>>> e9689d5..54150b1 100644
> >>>>>>> --- a/drivers/scsi/sd.c
> >>>>>>> +++ b/drivers/scsi/sd.c
> >>>>>>> @@ -832,7 +832,9 @@ static int
> sd_setup_write_same_cmnd(struct
> >>>>>>> scsi_device *sdp, struct request *rq)
> >>>>>>>
> >>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp,
> >>>>>>> struct request *rq) {
> >>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> >>>>>>> + int timeout = sdp->request_queue->rq_timeout;
> >>>>>>> +
> >>>>>>> + rq->timeout = (timeout *
> SD_FLUSH_TIMEOUT_MULTIPLIER);
> >>>>>>
> >>>>>> Could you share where you found this to be a problem? It looks
> >>>>>> like a bug in block because all inbound requests being prepared
> >>>>>> should have a timeout set, so block would be the place to fix it.
> >>>>>
> >>>>> Perhaps; what I found was that the value in rq->timeout was 0
> >>>>> coming into this function and thus multiplying obviously has no effect.
> >>>>>
> >>>>
> >>>> I think you are right. We hit this problem because we are doing:
> >>>>
> >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
> >>>> scsi_setup_flush_cmnd.
> >>>>
> >>>> At this time request->timeout is zero so the multiplication does
> >>>> nothing. See how sd_setup_write_same_cmnd will set the request-
> >timeout at this time.
> >>>>
> >>>> Then in scsi_request_fn we do:
> >>>>
> >>>> scsi_request_fn -> blk_start_request -> blk_add_timer.
> >>>>
> >>>> At this time it will set the request->timeout if something like req
> >>>> block pc users (like scsi_execute() or block/scsi_ioctl.c) or the
> >>>> write same code mentioned above have not set the timeout.
> >>>
> >>> I don't think this is a recent change. Prior to this commit, we were
> >>> setting the timeout value in this function; it just happened to be a
> >>> different constant unrelated to the I/O timeout.
> >>>
> >>
> >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was
> >> merged we were supposed to initialize it like in your patch in this thread.
> >>
> >> I guess we could do your patch in this thread, or if we want the
> >> block layer to initialize the timeout before the prep_fn callout is
> >> called then we would need to have the blk-flush.c code to that when
> >> it sets up the request. If we do the latter, do we want the discard
> >> and write same code to initialize the request's timeout before the
> >> prep_fn callout is called too?
> >
> > I looked through the call chain; it seems to be intentional behaviour
> > on the part of block. Just from an mq point of view, it would make
> > better code if we unconditionally initialised rq->timeout early and
> > allowed prep to modify it and then dumped the if(!req->timeout) in
> > blk_add_timer(), but it's a marginal if condition that would compile
> > to a conditional store on sensible architectures, so losing the
> > conditional probably isn't worth worrying about.
> >
> > Cc'd Jens for his opinion with the block patch
>
> I just committed this one earlier today:
>
> http://git.kernel.dk/?p=linux-
> block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f
>
> since I ran into the same thing on nvme. Either approach is fine with me, as
> they both allow override of the timeout before insertion. But we've always
> done the rq->timeout = 0 init, so I think we should just reinstate that
> behavior.

James,

How is this being fixed now.

Regards,

K. Y

2014-07-17 23:53:58

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: [email protected] [mailto:driverdev-
> [email protected]] On Behalf Of KY Srinivasan
> Sent: Friday, June 20, 2014 2:37 PM
> To: Jens Axboe; James Bottomley; [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
>
>
> > -----Original Message-----
> > From: Jens Axboe [mailto:[email protected]]
> > Sent: Friday, June 6, 2014 11:23 AM
> > To: James Bottomley; [email protected]
> > Cc: [email protected]; [email protected];
> > [email protected]; [email protected]; KY Srinivasan; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > FLUSH_TIMEOUT from the basic I/O timeout
> >
> > On 2014-06-06 11:52, James Bottomley wrote:
> > > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote:
> > >> On 6/5/14, 9:53 PM, KY Srinivasan wrote:
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Mike Christie [mailto:[email protected]]
> > >>>> Sent: Thursday, June 5, 2014 6:33 PM
> > >>>> To: KY Srinivasan
> > >>>> Cc: James Bottomley; [email protected];
> > >>>> [email protected]; [email protected];
> > [email protected];
> > >>>> linux- [email protected]; [email protected];
> > >>>> [email protected]; [email protected]
> > >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > >>>> FLUSH_TIMEOUT from the basic I/O timeout
> > >>>>
> > >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
> > >>>>>
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: James Bottomley [mailto:[email protected]]
> > >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM
> > >>>>>> To: KY Srinivasan
> > >>>>>> Cc: [email protected]; [email protected];
> > >>>>>> [email protected]; [email protected]; linux-
> > >>>>>> [email protected]; [email protected];
> > >>>>>> [email protected]; [email protected]
> > >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > >>>>>> FLUSH_TIMEOUT from the basic I/O timeout
> > >>>>>>
> > >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> > >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added
> > code
> > >>>>>>> to derive the FLUSH_TIMEOUT from the basic I/O timeout.
> > However,
> > >>>>>>> this patch did not use the basic I/O timeout of the device.
> > >>>>>>> Fix this
> > bug.
> > >>>>>>>
> > >>>>>>> Signed-off-by: K. Y. Srinivasan <[email protected]>
> > >>>>>>> ---
> > >>>>>>> drivers/scsi/sd.c | 4 +++-
> > >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index
> > >>>>>>> e9689d5..54150b1 100644
> > >>>>>>> --- a/drivers/scsi/sd.c
> > >>>>>>> +++ b/drivers/scsi/sd.c
> > >>>>>>> @@ -832,7 +832,9 @@ static int
> > sd_setup_write_same_cmnd(struct
> > >>>>>>> scsi_device *sdp, struct request *rq)
> > >>>>>>>
> > >>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp,
> > >>>>>>> struct request *rq) {
> > >>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> > >>>>>>> + int timeout = sdp->request_queue->rq_timeout;
> > >>>>>>> +
> > >>>>>>> + rq->timeout = (timeout *
> > SD_FLUSH_TIMEOUT_MULTIPLIER);
> > >>>>>>
> > >>>>>> Could you share where you found this to be a problem? It looks
> > >>>>>> like a bug in block because all inbound requests being prepared
> > >>>>>> should have a timeout set, so block would be the place to fix it.
> > >>>>>
> > >>>>> Perhaps; what I found was that the value in rq->timeout was 0
> > >>>>> coming into this function and thus multiplying obviously has no
> effect.
> > >>>>>
> > >>>>
> > >>>> I think you are right. We hit this problem because we are doing:
> > >>>>
> > >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
> > >>>> scsi_setup_flush_cmnd.
> > >>>>
> > >>>> At this time request->timeout is zero so the multiplication does
> > >>>> nothing. See how sd_setup_write_same_cmnd will set the request-
> > >timeout at this time.
> > >>>>
> > >>>> Then in scsi_request_fn we do:
> > >>>>
> > >>>> scsi_request_fn -> blk_start_request -> blk_add_timer.
> > >>>>
> > >>>> At this time it will set the request->timeout if something like
> > >>>> req block pc users (like scsi_execute() or block/scsi_ioctl.c) or
> > >>>> the write same code mentioned above have not set the timeout.
> > >>>
> > >>> I don't think this is a recent change. Prior to this commit, we
> > >>> were setting the timeout value in this function; it just happened
> > >>> to be a different constant unrelated to the I/O timeout.
> > >>>
> > >>
> > >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd
> > >> was merged we were supposed to initialize it like in your patch in this
> thread.
> > >>
> > >> I guess we could do your patch in this thread, or if we want the
> > >> block layer to initialize the timeout before the prep_fn callout is
> > >> called then we would need to have the blk-flush.c code to that when
> > >> it sets up the request. If we do the latter, do we want the discard
> > >> and write same code to initialize the request's timeout before the
> > >> prep_fn callout is called too?
> > >
> > > I looked through the call chain; it seems to be intentional
> > > behaviour on the part of block. Just from an mq point of view, it
> > > would make better code if we unconditionally initialised rq->timeout
> > > early and allowed prep to modify it and then dumped the
> > > if(!req->timeout) in blk_add_timer(), but it's a marginal if
> > > condition that would compile to a conditional store on sensible
> > > architectures, so losing the conditional probably isn't worth worrying
> about.
> > >
> > > Cc'd Jens for his opinion with the block patch
> >
> > I just committed this one earlier today:
> >
> > http://git.kernel.dk/?p=linux-
> > block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f
> >
> > since I ran into the same thing on nvme. Either approach is fine with
> > me, as they both allow override of the timeout before insertion. But
> > we've always done the rq->timeout = 0 init, so I think we should just
> > reinstate that behavior.
>
> James,
>
> How is this being fixed now.
>
> Regards,
>
> K. Y

I still see this problem. There was talk of fixing it elsewhere.

Regards,

K. Y
>
> _______________________________________________
> devel mailing list
> [email protected]
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

In sd_sync_cache:
rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;

Regardless of the baseline for the multiplication, a magic
number of 2 is too arbitrary. That might work for an
individual drive, but could be far too short for a RAID
controller that runs into worst case error handling for
the drives to which it is flushing (e.g., if its cache
is volatile and the drives all have recoverable errors
during writes). That time goes up with a bigger cache and
with more fragmented write data in the cache requiring
more individual WRITE commands.

A better value would be the Recommended Command Timeout field
value reported in the REPORT SUPPORTED OPERATION CODES command,
if reported by the device server. That is supposed to account
for the worst case.

For cases where that is not reported, exposing the multiplier
in sysfs would let the timeout for simple devices be set to
smaller values than complex devices.

Also, in both sd_setup_flush_cmnd and sd_sync_cache:
cmd->cmnd[0] = SYNCHRONIZE_CACHE;
cmd->cmd_len = 10;

SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.

---
Rob Elliott HP Server Storage


2014-07-18 15:10:51

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote:
> I still see this problem. There was talk of fixing it elsewhere.

Well, what we have right not is entirely broken, given that the
block layer doesn't initialize ->timeout on TYPE_FS requeuests.

We either need to revert that initial commit or apply something like
the attached patch as a quick fix.


Attachments:
(No filename) (365.00 B)
0001-sd-set-a-non-zero-timeout-for-flush-requests.patch (836.00 B)
Download all attachments

2014-07-18 15:11:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, Jul 18, 2014 at 12:51:06AM +0000, Elliott, Robert (Server Storage) wrote:
> SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
> CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.

I gues you mean (16) for the last occurance? What's the benefit of
using SYNCHRONIZE CACHE (16) if we don't pass a LBA range?

2014-07-18 15:39:40

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage)
wrote:
> In sd_sync_cache:
> rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
>
> Regardless of the baseline for the multiplication, a magic
> number of 2 is too arbitrary. That might work for an
> individual drive, but could be far too short for a RAID
> controller that runs into worst case error handling for
> the drives to which it is flushing (e.g., if its cache
> is volatile and the drives all have recoverable errors
> during writes). That time goes up with a bigger cache and
> with more fragmented write data in the cache requiring
> more individual WRITE commands.

RAID devices with gigabytes of cache are usually battery backed and lie
about their cache type (they usually claim write through). This
behaviour is exactly what we want because the flush is used to enforce
write barriers for filesystem consistency, so we only want to flush
volatile caches.

The rare case you cite, where the RAID device is volatile and having
difficulty flushing, we do want a failure because otherwise the
filesystem will become unusable and the system will live lock ...
barriers are sent down frequently under normal filesystem operation.

The SCSI standards committees have been struggling with defining and
detecting the difference between volatile and non-volatile cache and the
heuristics we'd have to use to avoid annoying USB devices with detecting
this don't look pretty. We always zero the SYNC_NV bit anyway, so even
if the devices stopped lying, we'd be safe.

> A better value would be the Recommended Command Timeout field
> value reported in the REPORT SUPPORTED OPERATION CODES command,
> if reported by the device server. That is supposed to account
> for the worst case.
>
> For cases where that is not reported, exposing the multiplier
> in sysfs would let the timeout for simple devices be set to
> smaller values than complex devices.
>
> Also, in both sd_setup_flush_cmnd and sd_sync_cache:
> cmd->cmnd[0] = SYNCHRONIZE_CACHE;
> cmd->cmd_len = 10;
>
> SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
> CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.

For what reason. We usually go for the safe alternatives, which is 10
byte commands because they have the widest testing and greatest level of
support. We don't do range flushes currently, so there doesn't seem to
be a practical different. If we did support range flushes, we'd likely
only use SC(16) on >2TB devices.

James

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-07-18 16:44:34

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: Christoph Hellwig ([email protected]) [mailto:[email protected]]
> Sent: Friday, July 18, 2014 8:11 AM
> To: KY Srinivasan
> Cc: Jens Axboe; James Bottomley; [email protected]; Christoph Hellwig
> ([email protected]); [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
> On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote:
> > I still see this problem. There was talk of fixing it elsewhere.
>
> Well, what we have right not is entirely broken, given that the block layer
> doesn't initialize ->timeout on TYPE_FS requeuests.
>
> We either need to revert that initial commit or apply something like the
> attached patch as a quick fix.
I had sent this exact patch sometime back:

http://www.spinics.net/lists/linux-scsi/msg75385.html

K. Y

2014-07-18 16:57:17

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, 2014-07-18 at 16:44 +0000, KY Srinivasan wrote:
>
> > -----Original Message-----
> > From: Christoph Hellwig ([email protected]) [mailto:[email protected]]
> > Sent: Friday, July 18, 2014 8:11 AM
> > To: KY Srinivasan
> > Cc: Jens Axboe; James Bottomley; [email protected]; Christoph Hellwig
> > ([email protected]); [email protected];
> > [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> > from the basic I/O timeout
> >
> > On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote:
> > > I still see this problem. There was talk of fixing it elsewhere.
> >
> > Well, what we have right not is entirely broken, given that the block layer
> > doesn't initialize ->timeout on TYPE_FS requeuests.
> >
> > We either need to revert that initial commit or apply something like the
> > attached patch as a quick fix.
> I had sent this exact patch sometime back:
>
> http://www.spinics.net/lists/linux-scsi/msg75385.html

Actually, no you didn't. The difference is in the derivation of the
timeout. Christoph's patch is absolute in terms of SD_TIMEOUT; yours is
relative to the queue timeout setting ... I thought there was a reason
for preferring the relative version.

James

2014-07-18 17:00:34

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Wed, Jun 04, 2014 at 09:33:43AM -0700, K. Y. Srinivasan wrote:
> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
> FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
> basic I/O timeout of the device. Fix this bug.
>
> Signed-off-by: K. Y. Srinivasan <[email protected]>

Looks good to me,

Reviewed-by: Christoph Hellwig <[email protected]>

(it will need some changes to apply to my tree, but I'm happy to do that
if I can get another review).

James, does this look fine to you?

2014-07-18 17:01:11

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, Jul 18, 2014 at 04:57:13PM +0000, James Bottomley wrote:
> Actually, no you didn't. The difference is in the derivation of the
> timeout. Christoph's patch is absolute in terms of SD_TIMEOUT; yours is
> relative to the queue timeout setting ... I thought there was a reason
> for preferring the relative version.

Yes, KYs version is better. It takes the base timeout drivers set
on the request queue into account.

2014-07-18 17:03:51

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, 2014-07-18 at 10:00 -0700, Christoph Hellwig wrote:
> On Wed, Jun 04, 2014 at 09:33:43AM -0700, K. Y. Srinivasan wrote:
> > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
> > FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
> > basic I/O timeout of the device. Fix this bug.
> >
> > Signed-off-by: K. Y. Srinivasan <[email protected]>
>
> Looks good to me,
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>
> (it will need some changes to apply to my tree, but I'm happy to do that
> if I can get another review).
>
> James, does this look fine to you?

Yes:

Reviewed-by: James Bottomley <[email protected]>

James

2014-07-18 17:05:50

by KY Srinivasan

[permalink] [raw]
Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> -----Original Message-----
> From: James Bottomley [mailto:[email protected]]
> Sent: Friday, July 18, 2014 9:57 AM
> To: KY Srinivasan
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
> On Fri, 2014-07-18 at 16:44 +0000, KY Srinivasan wrote:
> >
> > > -----Original Message-----
> > > From: Christoph Hellwig ([email protected])
> > > [mailto:[email protected]]
> > > Sent: Friday, July 18, 2014 8:11 AM
> > > To: KY Srinivasan
> > > Cc: Jens Axboe; James Bottomley; [email protected]; Christoph
> > > Hellwig ([email protected]); [email protected];
> > > [email protected]; [email protected]; linux-
> > > [email protected]; [email protected]; [email protected];
> > > [email protected]
> > > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > > FLUSH_TIMEOUT from the basic I/O timeout
> > >
> > > On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote:
> > > > I still see this problem. There was talk of fixing it elsewhere.
> > >
> > > Well, what we have right not is entirely broken, given that the
> > > block layer doesn't initialize ->timeout on TYPE_FS requeuests.
> > >
> > > We either need to revert that initial commit or apply something like
> > > the attached patch as a quick fix.
> > I had sent this exact patch sometime back:
> >
> > http://www.spinics.net/lists/linux-scsi/msg75385.html
>
> Actually, no you didn't. The difference is in the derivation of the timeout.
> Christoph's patch is absolute in terms of SD_TIMEOUT; yours is relative to the
> queue timeout setting ... I thought there was a reason for preferring the
> relative version.

You are right; sorry about that. I think my version is better since we do want to base the
flush timeout relative to the basic timeout.

K. Y
>
> James

2014-07-18 17:12:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

This is what I plan to put in after it passes basic testing:

---
>From bb617c9465b839d70ecbbc69002a20ccf5f935bd Mon Sep 17 00:00:00 2001
From: "K. Y. Srinivasan" <[email protected]>
Date: Fri, 18 Jul 2014 19:12:58 +0200
Subject: sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O
timeout

Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
basic I/O timeout of the device. Fix this bug.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Reviewed-by: James Bottomley <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/scsi/sd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bef4e78..9ffb393 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -880,7 +880,7 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
cmd->transfersize = 0;
cmd->allowed = SD_MAX_RETRIES;

- rq->timeout = SD_TIMEOUT * SD_FLUSH_TIMEOUT_MULTIPLIER;
+ rq->timeout = rq->q->rq_timeout * SD_FLUSH_TIMEOUT_MULTIPLIER;
return BLKPREP_OK;
}

--
1.9.1

2014-07-18 17:15:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, Jul 18, 2014 at 10:12:38AM -0700, [email protected] wrote:
> This is what I plan to put in after it passes basic testing:

And that one was on top of my previous version. One that applies
against core-for-3.17 below:

---
>From 8a79783e5f72ec034a724e16c1f46604bd97bf68 Mon Sep 17 00:00:00 2001
From: "K. Y. Srinivasan" <[email protected]>
Date: Fri, 18 Jul 2014 17:11:27 +0200
Subject: sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O
timeout

Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the
FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the
basic I/O timeout of the device. Fix this bug.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Reviewed-by: James Bottomley <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/scsi/sd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 377a520..9ffb393 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -880,7 +880,7 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
cmd->transfersize = 0;
cmd->allowed = SD_MAX_RETRIES;

- rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
+ rq->timeout = rq->q->rq_timeout * SD_FLUSH_TIMEOUT_MULTIPLIER;
return BLKPREP_OK;
}

--
1.9.1

Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout



> From: James Bottomley [mailto:[email protected]]
>
> On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage)
> wrote:
...
> >
> > Also, in both sd_setup_flush_cmnd and sd_sync_cache:
> > cmd->cmnd[0] = SYNCHRONIZE_CACHE;
> > cmd->cmd_len = 10;
> >
> > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
> > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.

(sorry - meant "unless ... 16 is not supported")

> For what reason. We usually go for the safe alternatives, which is 10
> byte commands because they have the widest testing and greatest level of
> support. We don't do range flushes currently, so there doesn't seem to
> be a practical different. If we did support range flushes, we'd likely
> only use SC(16) on >2TB devices.
>
> James

A goal of the simplified SCSI feature set idea is to drop all the
short CDBs that have larger, more capable equivalents - don't carry
READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte
versions. With modern serial IU-based protocols, short CDBs don't
save any transfer time. This will simplify design and testing on
both initiator and target sides. Competing command sets like NVMe
rightly point out that SCSI has too much legacy baggage - all you
need for IO is one READ, one WRITE, and one FLUSH command.

That's why SBC-3 ended up with these warning notes for all the
non-16 byte CDBs:

NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to
the SYNCHRONIZE CACHE (16) command is recommended for all
implementations.

If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe
SYNCHRONIZE CACHE (10) would be kept instead of (16), but that
field is still present. So, (16) is the likely survivor.

---
Rob Elliott HP Server Storage


????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-07-18 17:41:11

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On Fri, 2014-07-18 at 17:17 +0000, Elliott, Robert (Server Storage)
wrote:
>
>
> > From: James Bottomley [mailto:[email protected]]
> >
> > On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage)
> > wrote:
> ...
> > >
> > > Also, in both sd_setup_flush_cmnd and sd_sync_cache:
> > > cmd->cmnd[0] = SYNCHRONIZE_CACHE;
> > > cmd->cmd_len = 10;
> > >
> > > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
> > > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.
>
> (sorry - meant "unless ... 16 is not supported")

Yes, I guessed that?

> > For what reason. We usually go for the safe alternatives, which is 10
> > byte commands because they have the widest testing and greatest level of
> > support. We don't do range flushes currently, so there doesn't seem to
> > be a practical different. If we did support range flushes, we'd likely
> > only use SC(16) on >2TB devices.
> >
> > James
>
> A goal of the simplified SCSI feature set idea is to drop all the
> short CDBs that have larger, more capable equivalents - don't carry
> READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte
> versions. With modern serial IU-based protocols, short CDBs don't
> save any transfer time. This will simplify design and testing on
> both initiator and target sides. Competing command sets like NVMe
> rightly point out that SCSI has too much legacy baggage - all you
> need for IO is one READ, one WRITE, and one FLUSH command.

But that's not relevant to us. This is the problem of practical vs
standards approaches. We have to support older and buggy devices. Most
small USB storage sticks die if they see 16 byte CDB commands because
their interpreters. The more "legacy baggage" the standards committee
dumps, the greater the number of heuristics OSs have to have to cope
with the plethora of odd devices.

> That's why SBC-3 ended up with these warning notes for all the
> non-16 byte CDBs:
>
> NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to
> the SYNCHRONIZE CACHE (16) command is recommended for all
> implementations.
>
> If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe
> SYNCHRONIZE CACHE (10) would be kept instead of (16), but that
> field is still present. So, (16) is the likely survivor.

OK, but look at it from our point of view: The reply above justifies
why we prefer 10 byte CDBs over 16. If the standards body ever did
remove SC(10) completely, then we'd have to have yet another heuristic
to recognise devices that don't support SC(10), but until that day,
using SC(10) alone works in all cases, so is the better path for the OS.

If you could, please get the standards body to recognise that the more
command churn they introduce (in the name of rationalisation or
whatever), the more problems they introduce for Operating Systems and
the more likelihood (because of different people reading different
revisions of standards) that we end up with compliance bugs in devices.

James

2014-07-18 18:17:00

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

On 14-07-18 07:41 AM, James Bottomley wrote:
> On Fri, 2014-07-18 at 17:17 +0000, Elliott, Robert (Server Storage)
> wrote:
>>
>>
>>> From: James Bottomley [mailto:[email protected]]
>>>
>>> On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage)
>>> wrote:
>> ...
>>>>
>>>> Also, in both sd_setup_flush_cmnd and sd_sync_cache:
>>>> cmd->cmnd[0] = SYNCHRONIZE_CACHE;
>>>> cmd->cmd_len = 10;
>>>>
>>>> SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
>>>> CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.
>>
>> (sorry - meant "unless ... 16 is not supported")
>
> Yes, I guessed that?
>
>>> For what reason. We usually go for the safe alternatives, which is 10
>>> byte commands because they have the widest testing and greatest level of
>>> support. We don't do range flushes currently, so there doesn't seem to
>>> be a practical different. If we did support range flushes, we'd likely
>>> only use SC(16) on >2TB devices.
>>>
>>> James
>>
>> A goal of the simplified SCSI feature set idea is to drop all the
>> short CDBs that have larger, more capable equivalents - don't carry
>> READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte
>> versions. With modern serial IU-based protocols, short CDBs don't
>> save any transfer time. This will simplify design and testing on
>> both initiator and target sides. Competing command sets like NVMe
>> rightly point out that SCSI has too much legacy baggage - all you
>> need for IO is one READ, one WRITE, and one FLUSH command.
>
> But that's not relevant to us. This is the problem of practical vs
> standards approaches. We have to support older and buggy devices. Most
> small USB storage sticks die if they see 16 byte CDB commands because
> their interpreters. The more "legacy baggage" the standards committee
> dumps, the greater the number of heuristics OSs have to have to cope
> with the plethora of odd devices.
>
>> That's why SBC-3 ended up with these warning notes for all the
>> non-16 byte CDBs:
>>
>> NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to
>> the SYNCHRONIZE CACHE (16) command is recommended for all
>> implementations.
>>
>> If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe
>> SYNCHRONIZE CACHE (10) would be kept instead of (16), but that
>> field is still present. So, (16) is the likely survivor.
>
> OK, but look at it from our point of view: The reply above justifies
> why we prefer 10 byte CDBs over 16. If the standards body ever did
> remove SC(10) completely, then we'd have to have yet another heuristic
> to recognise devices that don't support SC(10), but until that day,
> using SC(10) alone works in all cases, so is the better path for the OS.
>
> If you could, please get the standards body to recognise that the more
> command churn they introduce (in the name of rationalisation or
> whatever), the more problems they introduce for Operating Systems and
> the more likelihood (because of different people reading different
> revisions of standards) that we end up with compliance bugs in devices.

From the term: "feature sets" I'm guessing T10 will follow what T13
does and have something like a VPD page with descriptors of feature
sets supported. Each set has mandatory and optional commands,
perhaps a similar categorization of mode and VPD pages as well. Such
a "clean slate" for SCSI would make it simpler in the future, at
least for what to put on the fast path. Perhaps some legacy
support could be pushed to the user space.

For many technical areas "legacy" is a derogatory term, but
not necessarily for storage!

Doug Gilbert