2018-04-24 00:22:33

by Bjorn Andersson

Subject: [PATCH 0/3] Fix UFS and devfreq interaction

With the introduction of f1d981eaecf8 ("PM / devfreq: Use the available min/max
frequency") the UFS host controller driver (UFSHCD) stopped probing on
platforms that support frequency scaling, e.g. all modern Qualcomm platforms.

The cause of this was UFSHCD's reliance on not registering any frequencies and
then being called by devfreq to switch between the frequencies 0 and UINT_MAX.

The devfreq code implies that the client is able to pass the frequency table,
instead of relying on opp tables, so the first patch makes this actually work.
The second patch extracts the devfreq registration in the UFSHCD driver, both
to facilitate the third patch and to remove a dereference of an ERR_PTR() in
the case that devfreq registration fails. Finally, the third patch picks the
two frequencies from the freq-table provided in UFSHCD and passes these to
devfreq, as well as mapping these frequencies back to the step up/down actions.
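
As an illustration of the mapping described for the third patch, here is a
minimal userspace sketch in C. It is not the driver code: struct model_clk,
model_fill_freq_table() and model_is_scale_up() are hypothetical names. Only
the idea is taken from the patch: a two-entry table built from the first
clock's min/max rates, with a request equal to the max rate treated as
"scale up".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the min/max rates of the first UFS clock. */
struct model_clk {
	unsigned long min_freq;	/* Hz */
	unsigned long max_freq;	/* Hz */
};

/* Fill the two-entry table handed to devfreq: lowest rate first. */
static void model_fill_freq_table(const struct model_clk *clk,
				  unsigned long table[2])
{
	table[0] = clk->min_freq;
	table[1] = clk->max_freq;
}

/* A request for the max rate means "scale up"; anything else "scale down". */
static bool model_is_scale_up(const struct model_clk *clk, unsigned long freq)
{
	return freq == clk->max_freq;
}
```

Note that the exact comparison treats any intermediate value as a request to
scale down, which mirrors the `*freq == clki->max_freq` test in the patch.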


With this, UFS is once again functional on the db820c. The series is also needed
to get UFS working on SDM845 (both tested).

Bjorn Andersson (3):
PM / devfreq: Actually support providing freq_table
scsi: ufs: Extract devfreq registration
scsi: ufs: Use freq table with devfreq

drivers/devfreq/devfreq.c | 22 +++------------
drivers/scsi/ufs/ufshcd.c | 68 ++++++++++++++++++++++++++++++++++++-----------
2 files changed, 57 insertions(+), 33 deletions(-)

--
2.16.2



2018-04-24 00:21:55

by Bjorn Andersson

Subject: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

The code in devfreq_add_device() handles the case where a freq_table is
passed by the client, but then requests min and max frequencies from
the, in this case absent, opp tables.

Read the min and max frequencies from the frequency table, which has
been built from the opp table if one exists, instead of querying the
opp table.

Signed-off-by: Bjorn Andersson <[email protected]>
---

An alternative approach is to clarify in the devfreq code that it's not
possible to pass a freq_table and then in patch 3 create an opp table for the
device at runtime, although the error handling of this becomes non-trivial.

Transitioning the UFSHCD to use opp tables directly is hindered by the fact
that the Qualcomm UFS hardware has two different clocks that need to be
running at different rates, so we would need a way to describe the two rates in
the opp table. (And would force us to change the DT binding)

drivers/devfreq/devfreq.c | 22 ++++------------------
1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index fe2af6aa88fc..086ced50a13d 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)

 static unsigned long find_available_min_freq(struct devfreq *devfreq)
 {
-	struct dev_pm_opp *opp;
-	unsigned long min_freq = 0;
-
-	opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
-	if (IS_ERR(opp))
-		min_freq = 0;
-	else
-		dev_pm_opp_put(opp);
+	struct devfreq_dev_profile *profile = devfreq->profile;

-	return min_freq;
+	return profile->freq_table[0];
 }

 static unsigned long find_available_max_freq(struct devfreq *devfreq)
 {
-	struct dev_pm_opp *opp;
-	unsigned long max_freq = ULONG_MAX;
-
-	opp = dev_pm_opp_find_freq_floor(devfreq->dev.parent, &max_freq);
-	if (IS_ERR(opp))
-		max_freq = 0;
-	else
-		dev_pm_opp_put(opp);
+	struct devfreq_dev_profile *profile = devfreq->profile;

-	return max_freq;
+	return profile->freq_table[profile->max_state - 1];
 }

 /**
--
2.16.2


2018-04-24 00:22:17

by Bjorn Andersson

Subject: [PATCH 3/3] scsi: ufs: Use freq table with devfreq

devfreq requires that the client operates on actual frequencies, not
only 0 and UINT_MAX, and as such UFS broke with the introduction of
f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").

This patch registers the frequencies of the first clock with devfreq and
uses these to determine if we're trying to step up or down.

Signed-off-by: Bjorn Andersson <[email protected]>
---
drivers/scsi/ufs/ufshcd.c | 39 ++++++++++++++++++++++++++++++++-------
1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2253f24309ec..07b1f3c7bd2d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1168,16 +1168,13 @@ static int ufshcd_devfreq_target(struct device *dev,
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 	ktime_t start;
 	bool scale_up, sched_clk_scaling_suspend_work = false;
+	struct list_head *clk_list = &hba->clk_list_head;
+	struct ufs_clk_info *clki;
 	unsigned long irq_flags;

 	if (!ufshcd_is_clkscaling_supported(hba))
 		return -EINVAL;

-	if ((*freq > 0) && (*freq < UINT_MAX)) {
-		dev_err(hba->dev, "%s: invalid freq = %lu\n", __func__, *freq);
-		return -EINVAL;
-	}
-
 	spin_lock_irqsave(hba->host->host_lock, irq_flags);
 	if (ufshcd_eh_in_progress(hba)) {
 		spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
@@ -1187,7 +1184,13 @@ static int ufshcd_devfreq_target(struct device *dev,
 	if (!hba->clk_scaling.active_reqs)
 		sched_clk_scaling_suspend_work = true;

-	scale_up = (*freq == UINT_MAX) ? true : false;
+	if (list_empty(clk_list)) {
+		spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
+		goto out;
+	}
+
+	clki = list_first_entry(&hba->clk_list_head, struct ufs_clk_info, list);
+	scale_up = (*freq == clki->max_freq) ? true : false;
 	if (!ufshcd_is_devfreq_scaling_required(hba, scale_up)) {
 		spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
 		ret = 0;
@@ -1257,11 +1260,33 @@ static struct devfreq_dev_profile ufs_devfreq_profile = {

 static int ufshcd_devfreq_init(struct ufs_hba *hba)
 {
+	struct devfreq_dev_profile *profile;
+	struct list_head *clk_list = &hba->clk_list_head;
+	struct ufs_clk_info *clki;
 	struct devfreq *devfreq;
 	int ret;

+	/* Skip devfreq if we don't have any clocks in the list */
+	if (list_empty(clk_list))
+		return 0;
+
+	profile = devm_kmemdup(hba->dev, &ufs_devfreq_profile,
+			       sizeof(ufs_devfreq_profile), GFP_KERNEL);
+	if (!profile)
+		return -ENOMEM;
+
+	profile->max_state = 2;
+	profile->freq_table = devm_kcalloc(hba->dev, profile->max_state,
+					   sizeof(unsigned long), GFP_KERNEL);
+	if (!profile->freq_table)
+		return -ENOMEM;
+
+	clki = list_first_entry(&hba->clk_list_head, struct ufs_clk_info, list);
+	profile->freq_table[0] = clki->min_freq;
+	profile->freq_table[1] = clki->max_freq;
+
 	devfreq = devm_devfreq_add_device(hba->dev,
-			&ufs_devfreq_profile,
+			profile,
 			"simple_ondemand",
 			NULL);
 	if (IS_ERR(devfreq)) {
--
2.16.2


2018-04-24 00:23:58

by Bjorn Andersson

Subject: [PATCH 2/3] scsi: ufs: Extract devfreq registration

Failing to register with devfreq leaves hba->devfreq assigned, which
causes the error path to dereference the ERR_PTR(). Rather than bolting
on more conditionals, move the call of devm_devfreq_add_device() into
its own function and only update hba->devfreq once it's successfully
registered.

The subsequent patch builds upon this to make UFS actually work again,
as it's been broken since f1d981eaecf8 ("PM / devfreq: Use the available
min/max frequency")

Signed-off-by: Bjorn Andersson <[email protected]>
---
drivers/scsi/ufs/ufshcd.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 8f22a980b1a7..2253f24309ec 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1255,6 +1255,26 @@ static struct devfreq_dev_profile ufs_devfreq_profile = {
 	.get_dev_status = ufshcd_devfreq_get_dev_status,
 };

+static int ufshcd_devfreq_init(struct ufs_hba *hba)
+{
+	struct devfreq *devfreq;
+	int ret;
+
+	devfreq = devm_devfreq_add_device(hba->dev,
+			&ufs_devfreq_profile,
+			"simple_ondemand",
+			NULL);
+	if (IS_ERR(devfreq)) {
+		ret = PTR_ERR(devfreq);
+		dev_err(hba->dev, "Unable to register with devfreq %d\n", ret);
+		return ret;
+	}
+
+	hba->devfreq = devfreq;
+
+	return 0;
+}
+
 static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba)
 {
 	unsigned long flags;
@@ -6399,16 +6419,9 @@ static int ufshcd_probe_hba(struct ufs_hba *hba)
 				sizeof(struct ufs_pa_layer_attr));
 			hba->clk_scaling.saved_pwr_info.is_valid = true;
 			if (!hba->devfreq) {
-				hba->devfreq = devm_devfreq_add_device(hba->dev,
-							&ufs_devfreq_profile,
-							"simple_ondemand",
-							NULL);
-				if (IS_ERR(hba->devfreq)) {
-					ret = PTR_ERR(hba->devfreq);
-					dev_err(hba->dev, "Unable to register with devfreq %d\n",
-							ret);
+				ret = ufshcd_devfreq_init(hba);
+				if (ret)
 					goto out;
-				}
 			}
 			hba->clk_scaling.is_allowed = true;
 		}
--
2.16.2


2018-04-24 03:03:13

by Chanwoo Choi

Subject: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

Hi,

On 2018-04-24 09:20, Bjorn Andersson wrote:
> The code in devfreq_add_device() handles the case where a freq_table is
> passed by the client, but then requests min and max frequencies from
> the, in this case absent, opp tables.
>
> Read the min and max frequencies from the frequency table, which has
> been built from the opp table if one exists, instead of querying the
> opp table.
>
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
>
> An alternative approach is to clarify in the devfreq code that it's not
> possible to pass a freq_table and then in patch 3 create an opp table for the
> device in runtime; although the error handling of this becomes non-trivial.
>
> Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> that the Qualcomm UFS hardware has two different clocks that need to be
> running at different rates, so we would need a way to describe the two rates in
> the opp table. (And would force us to change the DT binding)
>
> drivers/devfreq/devfreq.c | 22 ++++------------------
> 1 file changed, 4 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index fe2af6aa88fc..086ced50a13d 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
>
> static unsigned long find_available_min_freq(struct devfreq *devfreq)
> {
> - struct dev_pm_opp *opp;
> - unsigned long min_freq = 0;
> -
> - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
> - if (IS_ERR(opp))
> - min_freq = 0;
> - else
> - dev_pm_opp_put(opp);
> + struct devfreq_dev_profile *profile = devfreq->profile;
>
> - return min_freq;
> + return profile->freq_table[0];

This is wrong. The thermal framework supports the devfreq-cooling device,
which uses dev_pm_opp_enable()/dev_pm_opp_disable().

In order to find the correct available min frequency, devfreq has to use
the OPP functions instead of the first entry of the freq_table array.
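
To illustrate the objection, here is a minimal userspace sketch in C. It only
models the idea of operating points that a cooling device can disable:
struct model_opp and the model_*() helpers are hypothetical and are not the
kernel OPP API. Once an entry is disabled, the lowest or highest *available*
frequency no longer has to match the first or last entry of a static table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one operating point (not struct dev_pm_opp). */
struct model_opp {
	unsigned long freq;	/* Hz */
	bool available;		/* cleared when the entry is disabled */
};

/* Lowest enabled frequency in an ascending table, or 0 if none is enabled. */
static unsigned long model_available_min(const struct model_opp *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].available)
			return tbl[i].freq;
	return 0;
}

/* Highest enabled frequency in an ascending table, or 0 if none is enabled. */
static unsigned long model_available_max(const struct model_opp *tbl, size_t n)
{
	for (size_t i = n; i > 0; i--)
		if (tbl[i - 1].available)
			return tbl[i - 1].freq;
	return 0;
}
```

With a three-entry table whose top entry has been disabled, for example,
model_available_max() returns the middle frequency, whereas indexing the last
table entry would still return the disabled one.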

> }
>
> static unsigned long find_available_max_freq(struct devfreq *devfreq)
> {
> - struct dev_pm_opp *opp;
> - unsigned long max_freq = ULONG_MAX;
> -
> - opp = dev_pm_opp_find_freq_floor(devfreq->dev.parent, &max_freq);
> - if (IS_ERR(opp))
> - max_freq = 0;
> - else
> - dev_pm_opp_put(opp);
> + struct devfreq_dev_profile *profile = devfreq->profile;
>
> - return max_freq;
> + return profile->freq_table[profile->max_state - 1];
> }

ditto.

>
> /**
>


--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-04-24 06:23:06

by Bjorn Andersson

Subject: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:

> Hi,
>
> On 2018-04-24 09:20, Bjorn Andersson wrote:
> > The code in devfreq_add_device() handles the case where a freq_table is
> > passed by the client, but then requests min and max frequencies from
> > the, in this case absent, opp tables.
> >
> > Read the min and max frequencies from the frequency table, which has
> > been built from the opp table if one exists, instead of querying the
> > opp table.
> >
> > Signed-off-by: Bjorn Andersson <[email protected]>
> > ---
> >
> > An alternative approach is to clarify in the devfreq code that it's not
> > possible to pass a freq_table and then in patch 3 create an opp table for the
> > device in runtime; although the error handling of this becomes non-trivial.
> >
> > Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> > that the Qualcomm UFS hardware has two different clocks that need to be
> > running at different rates, so we would need a way to describe the two rates in
> > the opp table. (And would force us to change the DT binding)
> >
> > drivers/devfreq/devfreq.c | 22 ++++------------------
> > 1 file changed, 4 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index fe2af6aa88fc..086ced50a13d 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
> >
> > static unsigned long find_available_min_freq(struct devfreq *devfreq)
> > {
> > - struct dev_pm_opp *opp;
> > - unsigned long min_freq = 0;
> > -
> > - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
> > - if (IS_ERR(opp))
> > - min_freq = 0;
> > - else
> > - dev_pm_opp_put(opp);
> > + struct devfreq_dev_profile *profile = devfreq->profile;
> >
> > - return min_freq;
> > + return profile->freq_table[0];
>
> It is wrong. The thermal framework support the devfreq-cooling device
> which uses the dev_pm_opp_enable/disable().
>

Okay, that makes sense. So rather than registering a custom freq_table I
should register the min and max frequency using dev_pm_opp_add().

> In order to find the correct available min frequency,
> the devfreq have to use the OPP function instead of using the first entry
> of the freq_table array.
>

Based on this there seems to be room for cleaning out the freq_table
from devfreq, to reduce the confusion. I will review this further.

Thanks,
Bjorn

2018-04-24 06:28:48

by MyungJoo Ham

Subject: RE: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

>On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
>
>> Hi,
>>
>> On 2018-04-24 09:20, Bjorn Andersson wrote:
>> > The code in devfreq_add_device() handles the case where a freq_table is
>> > passed by the client, but then requests min and max frequencies from
>> > the, in this case absent, opp tables.
>> >
>> > Read the min and max frequencies from the frequency table, which has
>> > been built from the opp table if one exists, instead of querying the
>> > opp table.
>> >
>> > Signed-off-by: Bjorn Andersson <[email protected]>
>> > ---
>> >
>> > An alternative approach is to clarify in the devfreq code that it's not
>> > possible to pass a freq_table and then in patch 3 create an opp table for the
>> > device in runtime; although the error handling of this becomes non-trivial.
>> >
>> > Transitioning the UFSHCD to use opp tables directly is hindered by the fact
>> > that the Qualcomm UFS hardware has two different clocks that need to be
>> > running at different rates, so we would need a way to describe the two rates in
>> > the opp table. (And would force us to change the DT binding)
>> >
>> > drivers/devfreq/devfreq.c | 22 ++++------------------
>> > 1 file changed, 4 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>> > index fe2af6aa88fc..086ced50a13d 100644
>> > --- a/drivers/devfreq/devfreq.c
>> > +++ b/drivers/devfreq/devfreq.c
>> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
>> >
>> > static unsigned long find_available_min_freq(struct devfreq *devfreq)
>> > {
>> > - struct dev_pm_opp *opp;
>> > - unsigned long min_freq = 0;
>> > -
>> > - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
>> > - if (IS_ERR(opp))
>> > - min_freq = 0;
>> > - else
>> > - dev_pm_opp_put(opp);
>> > + struct devfreq_dev_profile *profile = devfreq->profile;
>> >
>> > - return min_freq;
>> > + return profile->freq_table[0];
>>
>> It is wrong. The thermal framework support the devfreq-cooling device
>> which uses the dev_pm_opp_enable/disable().
>>
>
>Okay, that makes sense. So rather than registering a custom freq_table I
>should register the min and max frequency using dev_pm_opp_add().
>
>> In order to find the correct available min frequency,
>> the devfreq have to use the OPP function instead of using the first entry
>> of the freq_table array.
>>
>
>Based on this there seems to be room for cleaning out the freq_table
>from devfreq, to reduce the confusion. I will review this further.

Could you please check whether the bug you are hitting is resolved by
replacing 0 with ULONG_MAX in the function find_available_max_freq?

- max_freq = 0;
+ max_freq = ULONG_MAX;

Even if you are not using OPP, these functions should provide somewhat
"compatible" values.

Cheers,
MyungJoo


>
>Thanks,
>Bjorn
>


2018-04-24 08:23:53

by Chanwoo Choi

Subject: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

Hi,

On 2018-04-24 14:29, Bjorn Andersson wrote:
> On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
>
>> Hi,
>>
>> On 2018-04-24 09:20, Bjorn Andersson wrote:
>>> The code in devfreq_add_device() handles the case where a freq_table is
>>> passed by the client, but then requests min and max frequencies from
>>> the, in this case absent, opp tables.
>>>
>>> Read the min and max frequencies from the frequency table, which has
>>> been built from the opp table if one exists, instead of querying the
>>> opp table.
>>>
>>> Signed-off-by: Bjorn Andersson <[email protected]>
>>> ---
>>>
>>> An alternative approach is to clarify in the devfreq code that it's not
>>> possible to pass a freq_table and then in patch 3 create an opp table for the
>>> device in runtime; although the error handling of this becomes non-trivial.
>>>
>>> Transitioning the UFSHCD to use opp tables directly is hindered by the fact
>>> that the Qualcomm UFS hardware has two different clocks that need to be
>>> running at different rates, so we would need a way to describe the two rates in
>>> the opp table. (And would force us to change the DT binding)
>>>
>>> drivers/devfreq/devfreq.c | 22 ++++------------------
>>> 1 file changed, 4 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>>> index fe2af6aa88fc..086ced50a13d 100644
>>> --- a/drivers/devfreq/devfreq.c
>>> +++ b/drivers/devfreq/devfreq.c
>>> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
>>>
>>> static unsigned long find_available_min_freq(struct devfreq *devfreq)
>>> {
>>> - struct dev_pm_opp *opp;
>>> - unsigned long min_freq = 0;
>>> -
>>> - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
>>> - if (IS_ERR(opp))
>>> - min_freq = 0;
>>> - else
>>> - dev_pm_opp_put(opp);
>>> + struct devfreq_dev_profile *profile = devfreq->profile;
>>>
>>> - return min_freq;
>>> + return profile->freq_table[0];
>>
>> It is wrong. The thermal framework support the devfreq-cooling device
>> which uses the dev_pm_opp_enable/disable().
>>
>
> Okay, that makes sense. So rather than registering a custom freq_table I
> should register the min and max frequency using dev_pm_opp_add().

Thanks.

>
>> In order to find the correct available min frequency,
>> the devfreq have to use the OPP function instead of using the first entry
>> of the freq_table array.
>>
>
> Based on this there seems to be room for cleaning out the freq_table
> from devfreq, to reduce the confusion. I will review this further.

Actually, devfreq does need the freq_table[] array, but the array should be
handled inside the devfreq core. Currently, devfreq device drivers can touch
the freq_table directly, which I think is not good.

There is a reason why we have to maintain freq_table[] as an internal variable:
OPP doesn't provide an API that returns all registered frequencies.
If the devfreq-cooling device disables a specific frequency using dev_pm_opp_disable(),
a user of the OPP interface cannot get the list of disabled frequencies.
So I maintain the freq_table even when the OPP interface is used.

Also, the devfreq-cooling device uses the freq_table directly, because the released
Mali driver from ARM initializes the freq_table list directly.

I have no objection to refactoring. I'm just sharing the issue and the current status.

>
> Thanks,
> Bjorn
>
>
>


--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-04-24 18:40:29

by Bjorn Andersson

Subject: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

On Tue 24 Apr 00:26 PDT 2018, Chanwoo Choi wrote:

> Hi,
>
> On 2018-04-24 14:29, Bjorn Andersson wrote:
> > On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
> >
> >> Hi,
> >>
> >> On 2018-04-24 09:20, Bjorn Andersson wrote:
> >>> The code in devfreq_add_device() handles the case where a freq_table is
> >>> passed by the client, but then requests min and max frequencies from
> >>> the, in this case absent, opp tables.
> >>>
> >>> Read the min and max frequencies from the frequency table, which has
> >>> been built from the opp table if one exists, instead of querying the
> >>> opp table.
> >>>
> >>> Signed-off-by: Bjorn Andersson <[email protected]>
> >>> ---
> >>>
> >>> An alternative approach is to clarify in the devfreq code that it's not
> >>> possible to pass a freq_table and then in patch 3 create an opp table for the
> >>> device in runtime; although the error handling of this becomes non-trivial.
> >>>
> >>> Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> >>> that the Qualcomm UFS hardware has two different clocks that need to be
> >>> running at different rates, so we would need a way to describe the two rates in
> >>> the opp table. (And would force us to change the DT binding)
> >>>
> >>> drivers/devfreq/devfreq.c | 22 ++++------------------
> >>> 1 file changed, 4 insertions(+), 18 deletions(-)
> >>>
> >>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> >>> index fe2af6aa88fc..086ced50a13d 100644
> >>> --- a/drivers/devfreq/devfreq.c
> >>> +++ b/drivers/devfreq/devfreq.c
> >>> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
> >>>
> >>> static unsigned long find_available_min_freq(struct devfreq *devfreq)
> >>> {
> >>> - struct dev_pm_opp *opp;
> >>> - unsigned long min_freq = 0;
> >>> -
> >>> - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
> >>> - if (IS_ERR(opp))
> >>> - min_freq = 0;
> >>> - else
> >>> - dev_pm_opp_put(opp);
> >>> + struct devfreq_dev_profile *profile = devfreq->profile;
> >>>
> >>> - return min_freq;
> >>> + return profile->freq_table[0];
> >>
> >> It is wrong. The thermal framework support the devfreq-cooling device
> >> which uses the dev_pm_opp_enable/disable().
> >>
> >
> > Okay, that makes sense. So rather than registering a custom freq_table I
> > should register the min and max frequency using dev_pm_opp_add().
>
> Thanks.
>
> >
> >> In order to find the correct available min frequency,
> >> the devfreq have to use the OPP function instead of using the first entry
> >> of the freq_table array.
> >>
> >
> > Based on this there seems to be room for cleaning out the freq_table
> > from devfreq, to reduce the confusion. I will review this further.
>
> Actually, devfreq must need to have the freq_table[] array. But, freq_table[]
> array should be handled in the devfreq core. Now, the devfreq device drivers can
> touch the freq_table. I think it is not good.
>
> There is a reason why we have to maintain the freq_table[] as the internal variable.
> OPP doesn't provide the OPP API which get the all registered frequencies.
> If devfreq-cooling device disables the specific frequency by using dev_pm_opp_disable(),
> the user of OPP interface can not get the disabled frequency list.
> So, I maintain the freq_table even if using the OPP interface.
>

Thanks for the clarification, I see some possibilities for improving
this but it makes sense.

> And, devfreq-cooling device uses the freq_table directly because
> released Mali driver from ARM initializes the freq_table list
> directly.
>

Forgive me if I misunderstand this, but isn't this exactly what I'm
trying to do in patch 3? Which stopped working back in v4.15-rc1, with
the introduction of f1d981eaecf8 ("PM / devfreq: Use the available
min/max frequency").

> I have no any objection for refactoring. Just I'm sharing the issue
> and current status.
>

Thanks for sharing the current status and helping me understand how to
properly use devfreq.

Regards,
Bjorn

2018-04-24 18:50:15

by Bjorn Andersson

Subject: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

On Mon 23 Apr 23:09 PDT 2018, MyungJoo Ham wrote:

> >On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
> >
> >> Hi,
> >>
> >> On 2018-04-24 09:20, Bjorn Andersson wrote:
> >> > The code in devfreq_add_device() handles the case where a freq_table is
> >> > passed by the client, but then requests min and max frequencies from
> >> > the, in this case absent, opp tables.
> >> >
> >> > Read the min and max frequencies from the frequency table, which has
> >> > been built from the opp table if one exists, instead of querying the
> >> > opp table.
> >> >
> >> > Signed-off-by: Bjorn Andersson <[email protected]>
> >> > ---
> >> >
> >> > An alternative approach is to clarify in the devfreq code that it's not
> >> > possible to pass a freq_table and then in patch 3 create an opp table for the
> >> > device in runtime; although the error handling of this becomes non-trivial.
> >> >
> >> > Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> >> > that the Qualcomm UFS hardware has two different clocks that need to be
> >> > running at different rates, so we would need a way to describe the two rates in
> >> > the opp table. (And would force us to change the DT binding)
> >> >
> >> > drivers/devfreq/devfreq.c | 22 ++++------------------
> >> > 1 file changed, 4 insertions(+), 18 deletions(-)
> >> >
> >> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> >> > index fe2af6aa88fc..086ced50a13d 100644
> >> > --- a/drivers/devfreq/devfreq.c
> >> > +++ b/drivers/devfreq/devfreq.c
> >> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device *dev)
> >> >
> >> > static unsigned long find_available_min_freq(struct devfreq *devfreq)
> >> > {
> >> > - struct dev_pm_opp *opp;
> >> > - unsigned long min_freq = 0;
> >> > -
> >> > - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &min_freq);
> >> > - if (IS_ERR(opp))
> >> > - min_freq = 0;
> >> > - else
> >> > - dev_pm_opp_put(opp);
> >> > + struct devfreq_dev_profile *profile = devfreq->profile;
> >> >
> >> > - return min_freq;
> >> > + return profile->freq_table[0];
> >>
> >> It is wrong. The thermal framework support the devfreq-cooling device
> >> which uses the dev_pm_opp_enable/disable().
> >>
> >
> >Okay, that makes sense. So rather than registering a custom freq_table I
> >should register the min and max frequency using dev_pm_opp_add().
> >
> >> In order to find the correct available min frequency,
> >> the devfreq have to use the OPP function instead of using the first entry
> >> of the freq_table array.
> >>
> >
> >Based on this there seems to be room for cleaning out the freq_table
> >from devfreq, to reduce the confusion. I will review this further.
>
> Could you please check if the bug suffering you gets resolved by
> replacing 0 with ULONG_MAX in the function find_available_max_freq?
>
> - max_freq = 0;
> + max_freq = ULONG_MAX;
>
> Even if you are not using OPP, these functions should provide somewhat
> "compatible" values.
>

I also need to make set_freq_table() handle the case where there is no
opp table and change a min_freq of 0 from being treated as an error.

With this I think we're back at supporting using devfreq without
specifying any available frequencies. I am however uncertain if this
should be considered valid use of devfreq.
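
The interaction described above can be sketched with a small userspace model
in C. Everything here is hypothetical (model_min(), model_max(),
model_register(), MODEL_EINVAL), and the assumption that registration rejects
a min or max frequency of 0 is inferred from this thread rather than quoted
from devfreq_add_device(). The sketch only shows why changing the max fallback
to ULONG_MAX is not enough on its own when no frequencies are registered.

```c
#include <limits.h>	/* ULONG_MAX, used by callers as the max fallback */
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EINVAL 22	/* hypothetical error code for this model */

/* Lowest registered frequency, or 0 when nothing is registered. */
static unsigned long model_min(const unsigned long *freqs, size_t n)
{
	return n ? freqs[0] : 0;
}

/* Highest registered frequency, or the given fallback when empty. */
static unsigned long model_max(const unsigned long *freqs, size_t n,
			       unsigned long fallback)
{
	return n ? freqs[n - 1] : fallback;
}

/*
 * Assumed registration check: a max of 0 is always an error, and a min
 * of 0 is an error only while zero_min_is_error is set.
 */
static int model_register(const unsigned long *freqs, size_t n,
			  unsigned long max_fallback, bool zero_min_is_error)
{
	unsigned long min = model_min(freqs, n);
	unsigned long max = model_max(freqs, n, max_fallback);

	if (zero_min_is_error && min == 0)
		return -MODEL_EINVAL;
	if (max == 0)
		return -MODEL_EINVAL;
	return 0;
}
```

In this model an empty table is rejected with either fallback while a zero min
is treated as an error, and is accepted only once that check is relaxed and
the max fallback is ULONG_MAX.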

Regards,
Bjorn

2018-04-24 21:11:32

by Subhash Jadavani

Subject: Re: [PATCH 2/3] scsi: ufs: Extract devfreq registration

On 2018-04-23 17:20, Bjorn Andersson wrote:
> Failing to register with devfreq leaves hba->devfreq assigned, which
> causes the error path to dereference the ERR_PTR(). Rather than bolting
> on more conditionals, move the call of devm_devfreq_add_device() into
> its own function and only update hba->devfreq once it's successfully
> registered.
>
> The subsequent patch builds upon this to make UFS actually work again,
> as it's been broken since f1d981eaecf8 ("PM / devfreq: Use the
> available
> min/max frequency")
>
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
> drivers/scsi/ufs/ufshcd.c | 31 ++++++++++++++++++++++---------
> 1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 8f22a980b1a7..2253f24309ec 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -1255,6 +1255,26 @@ static struct devfreq_dev_profile
> ufs_devfreq_profile = {
> .get_dev_status = ufshcd_devfreq_get_dev_status,
> };
>
> +static int ufshcd_devfreq_init(struct ufs_hba *hba)
> +{
> + struct devfreq *devfreq;
> + int ret;
> +
> + devfreq = devm_devfreq_add_device(hba->dev,
> + &ufs_devfreq_profile,
> + "simple_ondemand",
> + NULL);
> + if (IS_ERR(devfreq)) {
> + ret = PTR_ERR(devfreq);
> + dev_err(hba->dev, "Unable to register with devfreq %d\n", ret);
> + return ret;
> + }
> +
> + hba->devfreq = devfreq;
> +
> + return 0;
> +}
> +
> static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba)
> {
> unsigned long flags;
> @@ -6399,16 +6419,9 @@ static int ufshcd_probe_hba(struct ufs_hba *hba)
> sizeof(struct ufs_pa_layer_attr));
> hba->clk_scaling.saved_pwr_info.is_valid = true;
> if (!hba->devfreq) {
> - hba->devfreq = devm_devfreq_add_device(hba->dev,
> - &ufs_devfreq_profile,
> - "simple_ondemand",
> - NULL);
> - if (IS_ERR(hba->devfreq)) {
> - ret = PTR_ERR(hba->devfreq);
> - dev_err(hba->dev, "Unable to register with devfreq %d\n",
> - ret);
> + ret = ufshcd_devfreq_init(hba);
> + if (ret)
> goto out;
> - }
> }
> hba->clk_scaling.is_allowed = true;
> }

Looks good to me.
Reviewed-by: Subhash Jadavani <[email protected]>

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

2018-04-24 22:10:27

by Subhash Jadavani

Subject: Re: [PATCH 3/3] scsi: ufs: Use freq table with devfreq

On 2018-04-23 17:20, Bjorn Andersson wrote:
> devfreq requires that the client operates on actual frequencies, not
> only 0 and UINT_MAX, and as such UFS broke with the introduction of
> f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").
>
> This patch registers the frequencies of the first clock with devfreq
> and
> use these to determine if we're trying to step up or down.
>
> Signed-off-by: Bjorn Andersson <[email protected]>
> ---
> drivers/scsi/ufs/ufshcd.c | 39 ++++++++++++++++++++++++++++++++-------
> 1 file changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 2253f24309ec..07b1f3c7bd2d 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -1168,16 +1168,13 @@ static int ufshcd_devfreq_target(struct device
> *dev,
> struct ufs_hba *hba = dev_get_drvdata(dev);
> ktime_t start;
> bool scale_up, sched_clk_scaling_suspend_work = false;
> + struct list_head *clk_list = &hba->clk_list_head;
> + struct ufs_clk_info *clki;
> unsigned long irq_flags;
>
> if (!ufshcd_is_clkscaling_supported(hba))
> return -EINVAL;
>
> - if ((*freq > 0) && (*freq < UINT_MAX)) {
> - dev_err(hba->dev, "%s: invalid freq = %lu\n", __func__, *freq);
> - return -EINVAL;
> - }
> -
> spin_lock_irqsave(hba->host->host_lock, irq_flags);
> if (ufshcd_eh_in_progress(hba)) {
> spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
> @@ -1187,7 +1184,13 @@ static int ufshcd_devfreq_target(struct device
> *dev,
> if (!hba->clk_scaling.active_reqs)
> sched_clk_scaling_suspend_work = true;
>
> - scale_up = (*freq == UINT_MAX) ? true : false;
> + if (list_empty(clk_list)) {
> + spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
> + goto out;
> + }
> +
> + clki = list_first_entry(&hba->clk_list_head, struct ufs_clk_info,
> list);
> + scale_up = (*freq == clki->max_freq) ? true : false;
> if (!ufshcd_is_devfreq_scaling_required(hba, scale_up)) {
> spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
> ret = 0;
> @@ -1257,11 +1260,33 @@ static struct devfreq_dev_profile
> ufs_devfreq_profile = {
>
> static int ufshcd_devfreq_init(struct ufs_hba *hba)
> {
> + struct devfreq_dev_profile *profile;
> + struct list_head *clk_list = &hba->clk_list_head;
> + struct ufs_clk_info *clki;
> struct devfreq *devfreq;
> int ret;
>
> + /* Skip devfreq if we don't have any clocks in the list */
> + if (list_empty(clk_list))
> + return 0;
> +
> + profile = devm_kmemdup(hba->dev, &ufs_devfreq_profile,
> + sizeof(ufs_devfreq_profile), GFP_KERNEL);
> + if (!profile)
> + return -ENOMEM;
> +
> + profile->max_state = 2;
> + profile->freq_table = devm_kcalloc(hba->dev, profile->max_state,
> + sizeof(unsigned long), GFP_KERNEL);
> + if (!profile->freq_table)
> + return -ENOMEM;
> +
> + clki = list_first_entry(&hba->clk_list_head, struct ufs_clk_info,
> list);
> + profile->freq_table[0] = clki->min_freq;
> + profile->freq_table[1] = clki->max_freq;
> +
> devfreq = devm_devfreq_add_device(hba->dev,
> - &ufs_devfreq_profile,
> + profile,
> "simple_ondemand",
> NULL);
> if (IS_ERR(devfreq)) {

Looks good to me.
Reviewed-by: Subhash Jadavani <[email protected]>

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

2018-04-24 22:16:01

by Bjorn Andersson

Subject: Re: [PATCH 3/3] scsi: ufs: Use freq table with devfreq

On Tue 24 Apr 15:08 PDT 2018, Subhash Jadavani wrote:

> On 2018-04-23 17:20, Bjorn Andersson wrote:
> > devfreq requires that the client operates on actual frequencies, not
> > only 0 and UINT_MAX, and as such UFS broke with the introduction of
> > f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").
> >
> > This patch registers the frequencies of the first clock with devfreq and
> > use these to determine if we're trying to step up or down.
> >
[..]
>
> Looks good to me.
> Reviewed-by: Subhash Jadavani <[email protected]>
>

Thanks Subhash. Unfortunately I need to respin this to register the opp
table based on our freq table, so there will be a v2 of this soon.

Regards,
Bjorn