2020-11-03 06:26:54

by Can Guo

[permalink] [raw]
Subject: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
decreased back in ufshcd_ungate_work() in a paired way. However, if
specific ufshcd_hold/release sequences are met, it is possible that
scsi_block_reqs_cnt is increased twice but only one ungate work is
queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and
ufshcd_ungate_work() in a paired way, increase it only if queue_work()
returns true.

Signed-off-by: Can Guo <[email protected]>
Reviewed-by: Hongwu Su <[email protected]>
---
drivers/scsi/ufs/ufshcd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 847f355..efa7d86 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
*/
/* fallthrough */
case CLKS_OFF:
- ufshcd_scsi_block_requests(hba);
hba->clk_gating.state = REQ_CLKS_ON;
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
- queue_work(hba->clk_gating.clk_gating_workq,
- &hba->clk_gating.ungate_work);
+ if (queue_work(hba->clk_gating.clk_gating_workq,
+ &hba->clk_gating.ungate_work))
+ ufshcd_scsi_block_requests(hba);
/*
* fall through to check if we should wait for this
* work to be done or not.
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


2020-11-03 07:11:47

by Stanley Chu

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

Hi Can,

On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:
> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
> decreased back in ufshcd_ungate_work() in a paired way. However, if
> specific ufshcd_hold/release sequences are met, it is possible that
> scsi_block_reqs_cnt is increased twice but only one ungate work is
> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and

Just curious that how could this be possible? Would you have some failed
examples?

> ufshcd_ungate_work() in a paired way, increase it only if queue_work()
> returns true.
>
> Signed-off-by: Can Guo <[email protected]>
> Reviewed-by: Hongwu Su <[email protected]>
> ---
> drivers/scsi/ufs/ufshcd.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 847f355..efa7d86 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
> */
> /* fallthrough */
> case CLKS_OFF:
> - ufshcd_scsi_block_requests(hba);
> hba->clk_gating.state = REQ_CLKS_ON;
> trace_ufshcd_clk_gating(dev_name(hba->dev),
> hba->clk_gating.state);
> - queue_work(hba->clk_gating.clk_gating_workq,
> - &hba->clk_gating.ungate_work);
> + if (queue_work(hba->clk_gating.clk_gating_workq,
> + &hba->clk_gating.ungate_work))
> + ufshcd_scsi_block_requests(hba);
> /*
> * fall through to check if we should wait for this
> * work to be done or not.

Thanks,
Stanley Chu

2020-11-03 13:30:13

by Can Guo

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

On 2020-11-03 15:07, Stanley Chu wrote:
> Hi Can,
>
> On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:
>> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
>> decreased back in ufshcd_ungate_work() in a paired way. However, if
>> specific ufshcd_hold/release sequences are met, it is possible that
>> scsi_block_reqs_cnt is increased twice but only one ungate work is
>> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold()
>> and
>
> Just curious that how could this be possible? Would you have some
> failed
> examples?
>

[1] One gate_work() is in the workqueue, not yet executed, now clk state
== REQ_CLKS_OFF.
[2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) ->
active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate
work -> active_req-- -> return -EAGAIN.
[3] Now gate_work() starts to run, but since the clk state is
REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail.
[3] Someone calls ufshcd_hold(async == false) -> do something ->
ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that,
till now, ungate_work() is still in the work queue, not executed yet.
[4] Now, if someone calls ufshcd_hold(), we will hit the issue.

Above sequence is a very common clk gate/ungate sequence. The issue
is because ungate_work is queued but cannot be executed in time. In my
case, I see the ungate_work is somehow delayed for about 150ms. This
change has been tested by customers on multiple platforms. And you
can tell from the code that it won't break anything. :)

Thanks,

Can Guo.

>> ufshcd_ungate_work() in a paired way, increase it only if queue_work()
>> returns true.
>>
>> Signed-off-by: Can Guo <[email protected]>
>> Reviewed-by: Hongwu Su <[email protected]>
>> ---
>> drivers/scsi/ufs/ufshcd.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 847f355..efa7d86 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool
>> async)
>> */
>> /* fallthrough */
>> case CLKS_OFF:
>> - ufshcd_scsi_block_requests(hba);
>> hba->clk_gating.state = REQ_CLKS_ON;
>> trace_ufshcd_clk_gating(dev_name(hba->dev),
>> hba->clk_gating.state);
>> - queue_work(hba->clk_gating.clk_gating_workq,
>> - &hba->clk_gating.ungate_work);
>> + if (queue_work(hba->clk_gating.clk_gating_workq,
>> + &hba->clk_gating.ungate_work))
>> + ufshcd_scsi_block_requests(hba);
>> /*
>> * fall through to check if we should wait for this
>> * work to be done or not.
>
> Thanks,
> Stanley Chu

2020-11-03 14:06:43

by Stanley Chu

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

Hi Can,

On Tue, 2020-11-03 at 18:01 +0800, Can Guo wrote:
> On 2020-11-03 15:07, Stanley Chu wrote:
> > Hi Can,
> >
> > On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:
> >> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
> >> decreased back in ufshcd_ungate_work() in a paired way. However, if
> >> specific ufshcd_hold/release sequences are met, it is possible that
> >> scsi_block_reqs_cnt is increased twice but only one ungate work is
> >> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold()
> >> and
> >
> > Just curious that how could this be possible? Would you have some
> > failed
> > examples?
> >
>
> [1] One gate_work() is in the workqueue, not yet executed, now clk state
> == REQ_CLKS_OFF.
> [2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) ->
> active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate
> work -> active_req-- -> return -EAGAIN.
> [3] Now gate_work() starts to run, but since the clk state is
> REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail.
> [3] Someone calls ufshcd_hold(async == false) -> do something ->
> ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that,
> till now, ungate_work() is still in the work queue, not executed yet.
> [4] Now, if someone calls ufshcd_hold(), we will hit the issue.
>
> Above sequence is a very common clk gate/ungate sequence. The issue
> is because ungate_work is queued but cannot be executed in time. In my
> case, I see the ungate_work is somehow delayed for about 150ms. This
> change has been tested by customers on multiple platforms. And you
> can tell from the code that it won't break anything. :)

Thanks so much for the details. Looks good to me.

Reviewed-by: Stanley Chu <[email protected]>

>
> Thanks,
>
> Can Guo.
>
> >> ufshcd_ungate_work() in a paired way, increase it only if queue_work()
> >> returns true.
> >>
> >> Signed-off-by: Can Guo <[email protected]>
> >> Reviewed-by: Hongwu Su <[email protected]>
> >> ---
> >> drivers/scsi/ufs/ufshcd.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >> index 847f355..efa7d86 100644
> >> --- a/drivers/scsi/ufs/ufshcd.c
> >> +++ b/drivers/scsi/ufs/ufshcd.c
> >> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool
> >> async)
> >> */
> >> /* fallthrough */
> >> case CLKS_OFF:
> >> - ufshcd_scsi_block_requests(hba);
> >> hba->clk_gating.state = REQ_CLKS_ON;
> >> trace_ufshcd_clk_gating(dev_name(hba->dev),
> >> hba->clk_gating.state);
> >> - queue_work(hba->clk_gating.clk_gating_workq,
> >> - &hba->clk_gating.ungate_work);
> >> + if (queue_work(hba->clk_gating.clk_gating_workq,
> >> + &hba->clk_gating.ungate_work))
> >> + ufshcd_scsi_block_requests(hba);
> >> /*
> >> * fall through to check if we should wait for this
> >> * work to be done or not.
> >
> > Thanks,
> > Stanley Chu

2020-11-11 18:13:44

by Asutosh Das (asd)

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

On 11/2/2020 10:24 PM, Can Guo wrote:
> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
> decreased back in ufshcd_ungate_work() in a paired way. However, if
> specific ufshcd_hold/release sequences are met, it is possible that
> scsi_block_reqs_cnt is increased twice but only one ungate work is
> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and
> ufshcd_ungate_work() in a paired way, increase it only if queue_work()
> returns true.
>
> Signed-off-by: Can Guo <[email protected]>
> Reviewed-by: Hongwu Su <[email protected]>
> ---

Reviewed-by: Asutosh Das <[email protected]>

> drivers/scsi/ufs/ufshcd.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 847f355..efa7d86 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
> */
> /* fallthrough */
> case CLKS_OFF:
> - ufshcd_scsi_block_requests(hba);
> hba->clk_gating.state = REQ_CLKS_ON;
> trace_ufshcd_clk_gating(dev_name(hba->dev),
> hba->clk_gating.state);
> - queue_work(hba->clk_gating.clk_gating_workq,
> - &hba->clk_gating.ungate_work);
> + if (queue_work(hba->clk_gating.clk_gating_workq,
> + &hba->clk_gating.ungate_work))
> + ufshcd_scsi_block_requests(hba);
> /*
> * fall through to check if we should wait for this
> * work to be done or not.
>


--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project