2015-06-04 18:40:47

by Rajat Jain

[permalink] [raw]
Subject: [PATCH] scsi: Avoid potential infinite eh_timeout_handler() loop

Each cmd timeout should result in scmd->retries++. Currently it happens
just only before a command is requeued back. However, if the LLD
eh_timed_out() handler asks to reset timer back again, then also it should
be incremented because effectively LLD will be given a full time period
(SD_TIMEOUT = 30 secs!) to attempt to complete the command.

Why this is a problem:

=> Currently the SCSI low level transport drivers can provide
eh_timeout_handler() calls (for e.g. iscsi provides this) to deal
with command timeouts.

=> The eh_timeout_handler() can return BLK_EH_RESET_TIMER that causes the
SCSI / block layer to reset the timer, thus giving more time to the
LLD.

=> Currently a LLD can potentially loop infinitely on a command if it
always keeps on returning BLK_EH_RESET_TIMER.

* => Other than choking its own devices, if the command that is stuck is a
command issued during sd_probe_async() (e.g. a partition table scan),
then it impacts all the disks because no other disks can be removed
from the system until sd_probe_async() returns. (sd_remove waits on
async_synchronize_full_domain(...))

=> This problem actually resulted in the situation mentioned above,
whereby no disks in the system (on other scsi hosts) could be removed,
because of a stuck scsi command to read the partition tables of an
unrelated problematic disk during probe. The threads were stuck at:

schedule+0x312/0x7a0
async_synchronize_cookie_domain+0xb8/0x115
? __wake_up_bit+0x40/0x40
async_synchronize_full_domain+0x15/0x17
sd_remove+0x5f/0x135
__device_release_driver+0x8a/0xe0
device_release_driver+0x23/0x30
bus_remove_device+0x10f/0x123
device_del+0x132/0x18e
__scsi_remove_device+0x56/0xb6
scsi_remove_device+0x26/0x33
scsi_remove_target+0x12d/0x1a0
...

What this patch does:
=> Ensure that any quests to reset the timer are accounted for, so that
there is a finite upper bound on the time that a command is tried.
Once allowed number of retries is reached, we proceed to standard
error handling procedure (abort etc.) by scheduling the command
for EH.

Signed-off-by: Rajat Jain <[email protected]>
---
drivers/scsi/scsi_error.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c95a4e9..9671ec5 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -283,6 +283,17 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
else if (host->hostt->eh_timed_out)
rtn = host->hostt->eh_timed_out(scmd);

+ /*
+ * If a scmd times out because LLD failed to complete it, make sure that
+ * LLD can ask for more time only finite number of times. Also each such
+ * request must account towards the time the LLD has been spent on that
+ * cmd. Thus each timeout attempt by an LLD to complete a scmd must be
+ * treated as a retry since it involves waiting for another whole period
+ * of time before it times out again.
+ */
+ if (rtn == BLK_EH_RESET_TIMER && (++scmd->retries > scmd->allowed))
+ rtn = BLK_EH_NOT_HANDLED;
+
if (rtn == BLK_EH_NOT_HANDLED) {
if (!host->hostt->no_async_abort &&
scsi_abort_command(scmd) == SUCCESS)
--
2.2.0.rc0.207.ga3a616c


2015-06-04 20:27:28

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] scsi: Avoid potential infinite eh_timeout_handler() loop

On Thu, 2015-06-04 at 11:40 -0700, Rajat Jain wrote:
> Each cmd timeout should result in scmd->retries++. Currently it happens
> just only before a command is requeued back. However, if the LLD
> eh_timed_out() handler asks to reset timer back again, then also it should
> be incremented because effectively LLD will be given a full time period
> (SD_TIMEOUT = 30 secs!) to attempt to complete the command.
>
> Why this is a problem:
>
> => Currently the SCSI low level transport drivers can provide
> eh_timeout_handler() calls (for e.g. iscsi provides this) to deal
> with command timeouts.
>
> => The eh_timeout_handler() can return BLK_EH_RESET_TIMER that causes the
> SCSI / block layer to reset the timer, thus giving more time to the
> LLD.
>
> => Currently a LLD can potentially loop infinitely on a command if it
> always keeps on returning BLK_EH_RESET_TIMER.
>
> * => Other than choking its own devices, if the command that is stuck is a
> command issued during sd_probe_async() (e.g. a partition table scan),
> then it impacts all the disks because no other disks can be removed
> from the system until sd_probe_async() returns. (sd_remove waits on
> async_synchronize_full_domain(...))
>
> => This problem actually resulted in the situation mentioned above,
> whereby no disks in the system (on other scsi hosts) could be removed,
> because of a stuck scsi command to read the partition tables of an
> unrelated problematic disk during probe. The threads were stuck at:
>
> schedule+0x312/0x7a0
> async_synchronize_cookie_domain+0xb8/0x115
> ? __wake_up_bit+0x40/0x40
> async_synchronize_full_domain+0x15/0x17
> sd_remove+0x5f/0x135
> __device_release_driver+0x8a/0xe0
> device_release_driver+0x23/0x30
> bus_remove_device+0x10f/0x123
> device_del+0x132/0x18e
> __scsi_remove_device+0x56/0xb6
> scsi_remove_device+0x26/0x33
> scsi_remove_target+0x12d/0x1a0
> ...
>
> What this patch does:
> => Ensure that any quests to reset the timer are accounted for, so that
> there is a finite upper bound on the time that a command is tried.
> Once allowed number of retries is reached, we proceed to standard
> error handling procedure (abort etc.) by scheduling the command
> for EH.
>
> Signed-off-by: Rajat Jain <[email protected]>

This is actually wrong. Originally the code you're suggesting did exist
and it used to cause us to time out far too early on some conditions.
Now scmd->retries is for specific things that shouldn't be retried too
often. Anything else appears to retry forever but in fact there's a
specific check (in the softirq and io_completion) to check that a
retryable failure hasn't taken longer than (cmd->allowed + 1) *
req->timeout.

This means effectively that nothing in SCSI is allowed to retry forever.

James

2015-06-04 21:49:18

by Rajat Jain

[permalink] [raw]
Subject: Re: [PATCH] scsi: Avoid potential infinite eh_timeout_handler() loop

Hello James,

On Thu, Jun 4, 2015 at 1:27 PM, James Bottomley
<[email protected]> wrote:
> On Thu, 2015-06-04 at 11:40 -0700, Rajat Jain wrote:
>> Each cmd timeout should result in scmd->retries++. Currently it happens
>> just only before a command is requeued back. However, if the LLD
>> eh_timed_out() handler asks to reset timer back again, then also it should
>> be incremented because effectively LLD will be given a full time period
>> (SD_TIMEOUT = 30 secs!) to attempt to complete the command.
>>
>> Why this is a problem:
>>
>> => Currently the SCSI low level transport drivers can provide
>> eh_timeout_handler() calls (for e.g. iscsi provides this) to deal
>> with command timeouts.
>>
>> => The eh_timeout_handler() can return BLK_EH_RESET_TIMER that causes the
>> SCSI / block layer to reset the timer, thus giving more time to the
>> LLD.
>>
>> => Currently a LLD can potentially loop infinitely on a command if it
>> always keeps on returning BLK_EH_RESET_TIMER.
>>
>> * => Other than choking its own devices, if the command that is stuck is a
>> command issued during sd_probe_async() (e.g. a partition table scan),
>> then it impacts all the disks because no other disks can be removed
>> from the system until sd_probe_async() returns. (sd_remove waits on
>> async_synchronize_full_domain(...))
>>
>> => This problem actually resulted in the situation mentioned above,
>> whereby no disks in the system (on other scsi hosts) could be removed,
>> because of a stuck scsi command to read the partition tables of an
>> unrelated problematic disk during probe. The threads were stuck at:
>>
>> schedule+0x312/0x7a0
>> async_synchronize_cookie_domain+0xb8/0x115
>> ? __wake_up_bit+0x40/0x40
>> async_synchronize_full_domain+0x15/0x17
>> sd_remove+0x5f/0x135
>> __device_release_driver+0x8a/0xe0
>> device_release_driver+0x23/0x30
>> bus_remove_device+0x10f/0x123
>> device_del+0x132/0x18e
>> __scsi_remove_device+0x56/0xb6
>> scsi_remove_device+0x26/0x33
>> scsi_remove_target+0x12d/0x1a0
>> ...
>>
>> What this patch does:
>> => Ensure that any quests to reset the timer are accounted for, so that
>> there is a finite upper bound on the time that a command is tried.
>> Once allowed number of retries is reached, we proceed to standard
>> error handling procedure (abort etc.) by scheduling the command
>> for EH.
>>
>> Signed-off-by: Rajat Jain <[email protected]>
>
> This is actually wrong. Originally the code you're suggesting did exist
> and it used to cause us to time out far too early on some conditions.
> Now scmd->retries is for specific things that shouldn't be retried too
> often. Anything else appears to retry forever but in fact there's a
> specific check (in the softirq and io_completion) to check that a
> retryable failure hasn't taken longer than (cmd->allowed + 1) *
> req->timeout.
>
> This means effectively that nothing in SCSI is allowed to retry forever.

Thanks for the review. I'm not sure if I understood completely though.
I see the check you mention in softirq_done and in the
scsi_io_completion, however, I'm not sure if I see that in the
situation I mentioned above (eh_timed_out() always returning returning
BLK_EH_RESET_TIMER), how would the command ever end up in softirq_done
or io_completion (instead of going on infinitely)?

In my experiment, I actually instrumented the SCSI LLD to always ask
for more time (BLK_EH_RESET_TIMER) for 1 of the disks, and I actually
ended up with a system where I couldn't remove ANY of the disks in the
system (for the reasons mentioned in the commit log sd_remove()
waiting infinitely). I'm sure I'm missing something, but I'd
appreciate if you could please help me understand?

Thanks,

Rajat