2023-03-21 08:44:10

by Ye Bin

[permalink] [raw]
Subject: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs

From: Ye Bin <[email protected]>

When do follow test:
Step1: echo "recovery" > /sys/class/scsi_host/host0/state
Step2: dd if=/dev/sda of=/dev/null count=1 &
Step3: echo "running" > /sys/class/scsi_host/host0/state
Got issue as follows:
INFO: task dd:14545 blocked for more than 143 seconds.
Not tainted 6.3.0-rc2-next-20230315-dirty #406
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:dd state:D stack:23376 pid:14545 ppid:14439 flags:0x00000000
Call Trace:
<TASK>
__schedule+0x232e/0x55a0
schedule+0xde/0x1a0
scsi_block_when_processing_errors+0x2e9/0x350
sd_open+0x10c/0x6d0
blkdev_get_whole+0x99/0x260
blkdev_get_by_dev+0x556/0xbe0
blkdev_open+0x140/0x2c0
do_dentry_open+0x6cc/0x13f0
path_openat+0x1b3b/0x26b0
do_filp_open+0x1ce/0x2a0
do_sys_openat2+0x61b/0x990
do_sys_open+0xc7/0x150
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Above issue happens as when change host state by sysfs, there isn't wakeup
waiter.
To solve above issue, just wakeup waiter when change state success. There is
no additional judgment here because modifying the host state is more used in
testing.

Signed-off-by: Ye Bin <[email protected]>
---
drivers/scsi/scsi_sysfs.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ee28f73af4d4..ae6b1476b869 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -216,6 +216,9 @@ store_shost_state(struct device *dev, struct device_attribute *attr,

if (scsi_host_set_state(shost, state))
return -EINVAL;
+ else
+ wake_up(&shost->host_wait);
+
return count;
}

--
2.31.1



2023-03-21 14:23:28

by Benjamin Block

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs

On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
> From: Ye Bin <[email protected]>
>
> When do follow test:
> Step1: echo "recovery" > /sys/class/scsi_host/host0/state

Hmm, that make me wonder, what potential use-case this is for? Just
testing?

For SDEVs we explicitly filter what states can be set from user-space.
Only `SDEV_RUNNING` and `SDEV_OFFLINE` can be set in
`store_state_field()`.
There is probably quite a few other bad things you can do with this
interface by using any of the other states used for device destruction
or EH, and then trigger I/O or said destruction/EH otherwise.
Not sure handling this one special case of `SHOST_RECOVERY` is quite
enough.


> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index ee28f73af4d4..ae6b1476b869 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -216,6 +216,9 @@ store_shost_state(struct device *dev, struct device_attribute *attr,
>
> if (scsi_host_set_state(shost, state))
> return -EINVAL;
> + else
> + wake_up(&shost->host_wait);
> +
> return count;
> }
>
> --
> 2.31.1
>

--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Gregor Pillen / Gesch?ftsf?hrung: David Faller
Sitz der Ges.: B?blingen / Registergericht: AmtsG Stuttgart, HRB 243294

2023-03-22 01:28:53

by yebin (H)

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs



On 2023/3/21 22:22, Benjamin Block wrote:
> On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
>> From: Ye Bin <[email protected]>
>>
>> When do follow test:
>> Step1: echo "recovery" > /sys/class/scsi_host/host0/state
> Hmm, that make me wonder, what potential use-case this is for? Just
> testing?
Thank you for your reply.
Actually, I'm looking for a way to temporarily stop sending IO to the
driver.
Setting the state of the host to recovery can do this, but I changed the
state to
running and found that the process could not be woken up.
I don't know what the purpose of designing this sysfs interface was. But
this
modification can solve the effect I want to achieve.
> For SDEVs we explicitly filter what states can be set from user-space.
> Only `SDEV_RUNNING` and `SDEV_OFFLINE` can be set in
> `store_state_field()`.
> There is probably quite a few other bad things you can do with this
> interface by using any of the other states used for device destruction
> or EH, and then trigger I/O or said destruction/EH otherwise.
> Not sure handling this one special case of `SHOST_RECOVERY` is quite
> enough.
>
>
>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>> index ee28f73af4d4..ae6b1476b869 100644
>> --- a/drivers/scsi/scsi_sysfs.c
>> +++ b/drivers/scsi/scsi_sysfs.c
>> @@ -216,6 +216,9 @@ store_shost_state(struct device *dev, struct device_attribute *attr,
>>
>> if (scsi_host_set_state(shost, state))
>> return -EINVAL;
>> + else
>> + wake_up(&shost->host_wait);
>> +
>> return count;
>> }
>>
>> --
>> 2.31.1
>>

2023-03-23 10:24:17

by Benjamin Block

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs

On Wed, Mar 22, 2023 at 09:24:32AM +0800, yebin (H) wrote:
> On 2023/3/21 22:22, Benjamin Block wrote:
> > On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
> >> From: Ye Bin <[email protected]>
> >>
> >> When do follow test:
> >> Step1: echo "recovery" > /sys/class/scsi_host/host0/state
> >
> > Hmm, that make me wonder, what potential use-case this is for? Just
> > testing?
>
> Thank you for your reply.
> Actually, I'm looking for a way to temporarily stop sending IO to the
> driver.
> Setting the state of the host to recovery can do this, but I changed
> the state to running and found that the process could not be woken up.
> I don't know what the purpose of designing this sysfs interface was.
> But this modification can solve the effect I want to achieve.

My first thought when seeing this was that maybe we should also limit
this interface to say `SHOST_RUNNING` and `SHOST_RECOVERY` (similar to
what is done in `store_state_field()`).
That would limit the amount of corner cases drastically.

And in case of setting `SHOST_RUNNING` after the scsi host was set to
`SHOST_RECOVERY`, we could also make use of the already existing
function `scsi_restart_operations()` to handle the restart in the same
way as EH does.

But it's not up to me, to make that call.

> > For SDEVs we explicitly filter what states can be set from user-space.
> > Only `SDEV_RUNNING` and `SDEV_OFFLINE` can be set in
> > `store_state_field()`.
> > There is probably quite a few other bad things you can do with this
> > interface by using any of the other states used for device destruction
> > or EH, and then trigger I/O or said destruction/EH otherwise.
> > Not sure handling this one special case of `SHOST_RECOVERY` is quite
> > enough.
> >
> >> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> >> index ee28f73af4d4..ae6b1476b869 100644
> >> --- a/drivers/scsi/scsi_sysfs.c
> >> +++ b/drivers/scsi/scsi_sysfs.c
> >> @@ -216,6 +216,9 @@ store_shost_state(struct device *dev, struct device_attribute *attr,
> >>
> >> if (scsi_host_set_state(shost, state))
> >> return -EINVAL;
> >> + else
> >> + wake_up(&shost->host_wait);

In the very least, this should first check whether we really just made
the transition from `SHOST_RECOVERY` to `SHOST_RUNNING` before calling
this `wake_up()`.
And for that - first get old state, then set the new state - we
probably would also need to grab the `host_lock` to make that race free.

Just calling `wake_up()` without knowing what state transition we just
made doesn't sound right to me.

> >> +
> >> return count;
> >> }
> >>
> >> --
> >> 2.31.1
> >>
>

--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Gregor Pillen / Gesch?ftsf?hrung: David Faller
Sitz der Ges.: B?blingen / Registergericht: AmtsG Stuttgart, HRB 243294

2023-03-23 16:06:22

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs

On 3/23/23 5:21 AM, Benjamin Block wrote:
> On Wed, Mar 22, 2023 at 09:24:32AM +0800, yebin (H) wrote:
>> On 2023/3/21 22:22, Benjamin Block wrote:
>>> On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
>>>> From: Ye Bin <[email protected]>
>>>>
>>>> When do follow test:
>>>> Step1: echo "recovery" > /sys/class/scsi_host/host0/state
>>>
>>> Hmm, that make me wonder, what potential use-case this is for? Just
>>> testing?
>>
>> Thank you for your reply.
>> Actually, I'm looking for a way to temporarily stop sending IO to the
>> driver.
>> Setting the state of the host to recovery can do this, but I changed
>> the state to running and found that the process could not be woken up.
>> I don't know what the purpose of designing this sysfs interface was.
>> But this modification can solve the effect I want to achieve.
>
> My first thought when seeing this was that maybe we should also limit
> this interface to say `SHOST_RUNNING` and `SHOST_RECOVERY` (similar to
> what is done in `store_state_field()`).
> That would limit the amount of corner cases drastically.
>
> And in case of setting `SHOST_RUNNING` after the scsi host was set to
> `SHOST_RECOVERY`, we could also make use of the already existing
> function `scsi_restart_operations()` to handle the restart in the same
> way as EH does.
>

I agree we should limit the states we can set. It doesn't make sense for
userspace to be able to set states like SHOST_CANCEL and I think it would
later break functions like scsi_remove_host.

Maybe instead of allowing SHOST_RECOVERY to be used by userspace we want
a new state SHOST_BLOCKED which just does the specific operation we want.
If we re-use SHOST_RECOVERY for userspace blocking IO there could be issues
later on because that state also means we are going to be performing the eh
callouts and not just stopping IO. Or maybe instead of a different state
we just add another shost field similar to tmf_in_progress which forces
scsi_queue_rq to not queue IO to the drivers.

2023-03-23 16:29:06

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs

On 3/21/23 8:24 PM, yebin (H) wrote:
>
>
> On 2023/3/21 22:22, Benjamin Block wrote:
>> On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
>>> From: Ye Bin <[email protected]>
>>>
>>> When do follow test:
>>> Step1: echo  "recovery" > /sys/class/scsi_host/host0/state
>> Hmm, that make me wonder, what potential use-case this is for? Just
>> testing?
> Thank you for your reply.
> Actually, I'm looking for a way to temporarily stop sending IO to the driver.

Is this just for testing something or does a user/app need this
functionality for something?

We used to be able to block specific devices but we removed that.
It was useful for people like us where we need to do some low kernel
testing like testing for how upper layers handle IO hangs, but I
think it was not useful for other users so it was removed.

2023-03-24 01:44:23

by yebin (H)

[permalink] [raw]
Subject: Re: [PATCH] scsi: fix hung_task when change host from recovery to running via sysfs



On 2023/3/24 0:12, Mike Christie wrote:
> On 3/21/23 8:24 PM, yebin (H) wrote:
>>
>> On 2023/3/21 22:22, Benjamin Block wrote:
>>> On Tue, Mar 21, 2023 at 04:42:04PM +0800, Ye Bin wrote:
>>>> From: Ye Bin <[email protected]>
>>>>
>>>> When do follow test:
>>>> Step1: echo "recovery" > /sys/class/scsi_host/host0/state
>>> Hmm, that make me wonder, what potential use-case this is for? Just
>>> testing?
>> Thank you for your reply.
>> Actually, I'm looking for a way to temporarily stop sending IO to the driver.
> Is this just for testing something or does a user/app need this
> functionality for something?

This can be used to store IO in the block layer, enabling some fault recovery
that is insensitive to the upper layer.Also want to use this state to test the
block layer.

> We used to be able to block specific devices but we removed that.
> It was useful for people like us where we need to do some low kernel
> testing like testing for how upper layers handle IO hangs, but I
> think it was not useful for other users so it was removed.
>
> .
>