2021-05-14 09:18:32

by Can Guo

Subject: Re: [PATCH v1 6/6] scsi: ufs: Update the fast abort path in ufshcd_abort() for PM requests

On 2021-05-14 12:05, Bart Van Assche wrote:
> On 5/12/21 10:55 PM, Can Guo wrote:
>> If PM requests fail during runtime suspend/resume, RPM framework saves the
>> error to dev->power.runtime_error. Before the runtime_error gets cleared,
>> runtime PM on this specific device won't work again, leaving the device
>> in either suspended or active state permanently.
>>
>> When task abort happens to a PM request sent during runtime suspend/resume,
>> even if it can be successfully aborted, RPM framework anyways saves the
>> (TIMEOUT) error. But we want more and we can do better - let error handling
>> recover and clear the runtime_error. So, let PM requests take the fast
>> abort path in ufshcd_abort().
>
> The only RQF_PM requests I know of are START STOP UNIT and SYNCHRONIZE
> CACHE. Are there devices for which these commands can time out or do
> these commands perhaps only time out as the result of error injection?

There are also REQUEST SENSE requests sent with the RQF_PM flag set from
the PM ops. They do time out (the device does not respond within 60s) in
real cases; I have seen quite a lot of related issues reported by
customers over the past few years.

>
>> -	if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN) {
>> +	if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN ||
>> +	    (cmd->request->rq_flags & RQF_PM)) {
>
> Which are the RQF_PM commands that are not sent to a WLUN? Are these
> START STOP UNIT and SYNCHRONIZE CACHE only?
>

There are also REQUEST SENSE cmds sent to the RPMB W-LU, in
ufshcd_add_wlus(), ufshcd_err_handler() and ufshcd_rpmb_resume() and/or
ufshcd_wl_resume().

The SYNCHRONIZE CACHE cmd is sent only to general LUs, not to W-LUs.
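
To make the cases concrete, here is a minimal user-space sketch of the
condition in the hunk above. It is not the driver code: struct abort_ctx,
takes_fast_abort_path() and fast_abort_examples() are made-up names, and
the constant values are stand-ins for the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration; the real ones come from kernel headers. */
#define UFS_UPIU_UFS_DEVICE_WLUN 0xD0u
#define UFS_UPIU_RPMB_WLUN       0xC4u
#define RQF_PM                   (1u << 15)

/* Hypothetical, stripped-down view of the two fields the check reads. */
struct abort_ctx {
	uint8_t  lun;      /* models lrbp->lun */
	uint32_t rq_flags; /* models cmd->request->rq_flags */
};

/* Mirrors the condition proposed in the patch hunk. */
static bool takes_fast_abort_path(const struct abort_ctx *ctx)
{
	return ctx->lun == UFS_UPIU_UFS_DEVICE_WLUN ||
	       (ctx->rq_flags & RQF_PM);
}

/* Walks through the cases discussed in this thread. */
static void fast_abort_examples(void)
{
	/* Any command to the UFS device W-LU: fast path (old and new check). */
	struct abort_ctx dev_wlun = { .lun = UFS_UPIU_UFS_DEVICE_WLUN, .rq_flags = 0 };
	/* REQUEST SENSE with RQF_PM to the RPMB W-LU: fast path only with the patch. */
	struct abort_ctx rpmb_pm = { .lun = UFS_UPIU_RPMB_WLUN, .rq_flags = RQF_PM };
	/* SYNCHRONIZE CACHE with RQF_PM to a general LU: fast path only with the patch. */
	struct abort_ctx lu_pm = { .lun = 0, .rq_flags = RQF_PM };
	/* Ordinary I/O to a general LU: regular abort path. */
	struct abort_ctx lu_io = { .lun = 0, .rq_flags = 0 };

	assert(takes_fast_abort_path(&dev_wlun));
	assert(takes_fast_abort_path(&rpmb_pm));
	assert(takes_fast_abort_path(&lu_pm));
	assert(!takes_fast_abort_path(&lu_io));
}
```

The second and third cases are the ones the old LUN-only check misses.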

Thanks,
Can Guo.

> Thanks,
>
> Bart.