2022-08-17 09:26:41

by Michael Wu

[permalink] [raw]
Subject: [PATCH] scsi: core: Fix block I/O error of USB card reader during resume

When accessing storage device via an USB card reader, a block I/O error
occurs during resume:

PM: suspend exit
sd 0: 0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte
=0x08
sd 0: 0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
sd 0: 0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
sd 0: 0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 17 ce e1 00 00 f0 00
blk_update_request: I/O error, dev sda, sector 1560289 op 0x0:(READ) flags
0x84700 phys_seg 19 prio class 0
sd 0: 0:0:0: [sda] tag#0 device offline or changed

Fix it by changing the action in scsi_io_completion_action() from
ACTION_FAIL to ACTION_RETRY by adding the condition `cmd->device->
lockable`.

Signed-off-by: Michael Wu <[email protected]>
---
drivers/scsi/scsi_lib.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4dbd29ab1dcc..4bc480721947 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -704,7 +704,8 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
} else if (sense_valid && sense_current) {
switch (sshdr.sense_key) {
case UNIT_ATTENTION:
- if (cmd->device->removable) {
+ if (cmd->device->removable &&
+ cmd->device->lockable) {
/* Detected disc change. Set a bit
* and quietly refuse further access.
*/
--
2.29.0


2022-08-17 14:23:36

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] scsi: core: Fix block I/O error of USB card reader during resume

On 8/17/22 01:34, Michael Wu wrote:
> When accessing storage device via an USB card reader, a block I/O error
> occurs during resume:
>
> PM: suspend exit
> sd 0: 0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte
> =0x08
> sd 0: 0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> sd 0: 0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> sd 0: 0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 17 ce e1 00 00 f0 00
> blk_update_request: I/O error, dev sda, sector 1560289 op 0x0:(READ) flags
> 0x84700 phys_seg 19 prio class 0
> sd 0: 0:0:0: [sda] tag#0 device offline or changed
>
> Fix it by changing the action in scsi_io_completion_action() from
> ACTION_FAIL to ACTION_RETRY by adding the condition `cmd->device->
> lockable`.
>
> Signed-off-by: Michael Wu <[email protected]>
> ---
> drivers/scsi/scsi_lib.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 4dbd29ab1dcc..4bc480721947 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -704,7 +704,8 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
> } else if (sense_valid && sense_current) {
> switch (sshdr.sense_key) {
> case UNIT_ATTENTION:
> - if (cmd->device->removable) {
> + if (cmd->device->removable &&
> + cmd->device->lockable) {
> /* Detected disc change. Set a bit
> * and quietly refuse further access.
> */

To me the above doesn't look like a good way to address this. I don't
see why a device being lockable should control whether or not a unit
attention results in a retry? Shouldn't the decision taken by
scsi_io_completion_action() depend on the ASC and ASCQ codes rather than
on whether a device is removable and/or lockable?

BTW, the code modified by the above patch is old. This is what I found
in the 2002 version of scsi_lib.c:

if ((SCpnt->sense_buffer[0] & 0x7f) == 0x70
&& (SCpnt->sense_buffer[2] & 0xf) == UNIT_ATTENTION) {
if (SCpnt->device->removable) {
/* detected disc change. set a bit and quietly refuse
* further access.
*/
SCpnt->device->changed = 1;
SCpnt = scsi_end_request(SCpnt, 0, this_count);
return;
} else {
/*
* Must have been a power glitch, or a
* bus reset. Could not have been a
* media change, so we just retry the
* request and see what happens.
*/
scsi_queue_next_request(q, SCpnt);
return;
}

Bart.

Thanks,

Bart.

2022-08-23 13:53:29

by Michael Wu

[permalink] [raw]
Subject: Re: [PATCH] scsi: core: Fix block I/O error of USB card reader during resume

On 8/17/2022 9:52 PM, Bart Van Assche wrote:
> On 8/17/22 01:34, Michael Wu wrote:
>> When accessing storage device via an USB card reader, a block I/O error
>> occurs during resume:
>>
>> PM: suspend exit
>> sd 0: 0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte
>> =0x08
>> sd 0: 0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>> sd 0: 0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>> sd 0: 0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 17 ce e1 00 00 f0 00
>> blk_update_request: I/O error, dev sda, sector 1560289 op 0x0:(READ)
>> flags
>> 0x84700 phys_seg 19 prio class 0
>> sd 0: 0:0:0: [sda] tag#0 device offline or changed
>>
>> Fix it by changing the action in scsi_io_completion_action() from
>> ACTION_FAIL to ACTION_RETRY by adding the condition `cmd->device->
>> lockable`.
>>
>> Signed-off-by: Michael Wu <[email protected]>
>> ---
>>   drivers/scsi/scsi_lib.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 4dbd29ab1dcc..4bc480721947 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -704,7 +704,8 @@ static void scsi_io_completion_action(struct
>> scsi_cmnd *cmd, int result)
>>       } else if (sense_valid && sense_current) {
>>           switch (sshdr.sense_key) {
>>           case UNIT_ATTENTION:
>> -            if (cmd->device->removable) {
>> +            if (cmd->device->removable &&
>> +                cmd->device->lockable) {
>>                   /* Detected disc change.  Set a bit
>>                    * and quietly refuse further access.
>>                    */
>
> To me the above doesn't look like a good way to address this. I don't
> see why a device being lockable should control whether or not a unit
> attention results in a retry? Shouldn't the decision taken by
> scsi_io_completion_action() depend on the ASC and ASCQ codes rather than
> on whether a device is removable and/or lockable?
>
Dear Bart,
Yes... My patch did seem suspicious. Here's the scene about the block
I/O error: Some card reader does not respond the command 'MEDIUM REMOVAL
PREVENT' correctly, as a result, the host does not send subsequent cmd
'MEDIUM REMOVAL ALLOW'/'MEDIUM REMOVAL PREVENT' before/after sleep,
which leads to a enumeration failure after system resume.
I wonder, without changing the behavior of the device, is there's a
better way to solve this? -- Modifying the scsi core should not be a
good idea though :(

> BTW, the code modified by the above patch is old. This is what I found
> in the 2002 version of scsi_lib.c:
>
> if ((SCpnt->sense_buffer[0] & 0x7f) == 0x70
>     && (SCpnt->sense_buffer[2] & 0xf) == UNIT_ATTENTION) {
>         if (SCpnt->device->removable) {
>                 /* detected disc change.  set a bit and quietly refuse
>                  * further access.
>                  */
>                 SCpnt->device->changed = 1;
>                 SCpnt = scsi_end_request(SCpnt, 0, this_count);
>                 return;
>         } else {
>                 /*
>                  * Must have been a power glitch, or a
>                  * bus reset.  Could not have been a
>                  * media change, so we just retry the
>                  * request and see what happens.
>                  */
>                 scsi_queue_next_request(q, SCpnt);
>                 return;
> }
>
> Bart.
>
> Thanks,
Thanks for your kindly notice. I looked around in the latest linux
mainline repo, but could not find this code. Where can I get this 2002
version of scsi_lib.c? Thank you.

--
Regards,
Michael Wu

2022-08-26 23:31:35

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] scsi: core: Fix block I/O error of USB card reader during resume

On 8/23/22 03:16, Michael Wu wrote:
> Yes... My patch did seem suspicious. Here's the scene about the block
> I/O error: Some card reader does not respond the command 'MEDIUM REMOVAL
> PREVENT' correctly, as a result, the host does not send subsequent cmd
> 'MEDIUM REMOVAL ALLOW'/'MEDIUM REMOVAL PREVENT' before/after sleep,
> which leads to a enumeration failure after system resume.
> I wonder, without changing the behavior of the device, is there's a
> better way to solve this? -- Modifying the scsi core should not be a
> good idea though :(

The above is not clear to me. My understanding is that "MEDIUM REMOVAL
PREVENT" is a sense code instead of a SCSI command?

> Thanks for your kindly notice. I looked around in the latest linux
> mainline repo, but could not find this code. Where can I get this 2002
> version of scsi_lib.c? Thank you.

Please take a look at
https://stackoverflow.com/questions/3264283/linux-kernel-historical-git-repository-with-full-history.
That web page has instructions for how to configure a git repository
such that history goes back before the time when Linus started using git.

Bart.


2022-08-29 09:09:16

by Michael Wu

[permalink] [raw]
Subject: Re: [PATCH] scsi: core: Fix block I/O error of USB card reader during resume

On 8/27/2022 6:05 AM, Bart Van Assche wrote:
> On 8/23/22 03:16, Michael Wu wrote:
>> Yes... My patch did seem suspicious. Here's the scene about the block
>> I/O error: Some card reader does not respond the command 'MEDIUM
>> REMOVAL PREVENT' correctly, as a result, the host does not send
>> subsequent cmd 'MEDIUM REMOVAL ALLOW'/'MEDIUM REMOVAL PREVENT'
>> before/after sleep, which leads to a enumeration failure after system
>> resume.
>> I wonder, without changing the behavior of the device, is there's a
>> better way to solve this? -- Modifying the scsi core should not be a
>> good idea though :(
>
> The above is not clear to me. My understanding is that "MEDIUM REMOVAL
> PREVENT" is a sense code instead of a SCSI command?
>
>> Thanks for your kindly notice. I looked around in the latest linux
>> mainline repo, but could not find this code. Where can I get this 2002
>> version of scsi_lib.c? Thank you.
>
> Please take a look at
> https://stackoverflow.com/questions/3264283/linux-kernel-historical-git-repository-with-full-history.
> That web page has instructions for how to configure a git repository
> such that history goes back before the time when Linus started using git.
>
> Bart.
>
Dear Bart,
Thank you. I'll try to figure it out and sync to you later.

--
Regards,
Michael Wu