When all CPUs of a policy are hot-unplugged, we EXIT the governor but
don't mark policy->governor as NULL. This was done in order to keep last
used governor's information intact in sysfs, while the CPUs are offline.
We also missed marking policy->governor as NULL while restoring the
policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
for an uninitialized policy. Which eventually returns -EBUSY.
Fix this by setting policy->governor to NULL while restoring the policy.
Reported-by: Pi-Cheng Chen <[email protected]>
Reported-by: "Jon Medhurst (Tixy)" <[email protected]>
Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
Signed-off-by: Viresh Kumar <[email protected]>
---
For 4.2-rc
drivers/cpufreq/cpufreq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b612411655f9..2c22e3902e72 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
down_write(&policy->rwsem);
policy->cpu = cpu;
+ policy->governor = NULL;
up_write(&policy->rwsem);
}
--
2.4.0
On Wed, Jul 8, 2015 at 1:53 PM, Viresh Kumar <[email protected]> wrote:
> When all CPUs of a policy are hot-unplugged, we EXIT the governor but
> don't mark policy->governor as NULL. This was done in order to keep last
> used governor's information intact in sysfs, while the CPUs are offline.
>
> We also missed marking policy->governor as NULL while restoring the
> policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
> for an uninitialized policy. Which eventually returns -EBUSY.
>
> Fix this by setting policy->governor to NULL while restoring the policy.
Tested-by: "Pi-Cheng Chen <[email protected]>"
Thanks for your fix.
>
> Reported-by: Pi-Cheng Chen <[email protected]>
> Reported-by: "Jon Medhurst (Tixy)" <[email protected]>
> Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> For 4.2-rc
>
> drivers/cpufreq/cpufreq.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..2c22e3902e72 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
>
> down_write(&policy->rwsem);
> policy->cpu = cpu;
> + policy->governor = NULL;
> up_write(&policy->rwsem);
> }
>
> --
> 2.4.0
>
On Wed, 2015-07-08 at 11:23 +0530, Viresh Kumar wrote:
> When all CPUs of a policy are hot-unplugged, we EXIT the governor but
> don't mark policy->governor as NULL. This was done in order to keep last
> used governor's information intact in sysfs, while the CPUs are offline.
>
> We also missed marking policy->governor as NULL while restoring the
> policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
> for an uninitialized policy. Which eventually returns -EBUSY.
>
> Fix this by setting policy->governor to NULL while restoring the policy.
>
> Reported-by: Pi-Cheng Chen <[email protected]>
> Reported-by: "Jon Medhurst (Tixy)" <[email protected]>
> Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
Tested-by: Jon Medhurst <[email protected]>
Thanks for that.
I believe this also fixes the other issue I mentioned (nullptr deref in
in arm_big_little driver). To test that, after applying this patch, I
modified the code to force __cpufreq_governor to still return an error
when a cpu is hotpluged back in. Now the arm_big_little driver doesn't
get called when I manually poke scaling_setspeed, presumably because
policy->governor==NULL prevents that from reaching the driver?
> For 4.2-rc
>
> drivers/cpufreq/cpufreq.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..2c22e3902e72 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
>
> down_write(&policy->rwsem);
> policy->cpu = cpu;
> + policy->governor = NULL;
> up_write(&policy->rwsem);
> }
>
On 08-07-15, 10:27, Jon Medhurst (Tixy) wrote:
> I believe this also fixes the other issue I mentioned (nullptr deref in
> in arm_big_little driver). To test that, after applying this patch, I
> modified the code to force __cpufreq_governor to still return an error
> when a cpu is hotpluged back in. Now the arm_big_little driver doesn't
> get called when I manually poke scaling_setspeed, presumably because
> policy->governor==NULL prevents that from reaching the driver?
I would like to fix that issue without using this patch as we aren't
cleaning up things properly on errors today. I am almost done with the
patches, and will send them to you shortly. Please give them a try
without this patch.
--
viresh
On Wednesday, July 08, 2015 11:23:58 AM Viresh Kumar wrote:
> When all CPUs of a policy are hot-unplugged, we EXIT the governor but
> don't mark policy->governor as NULL. This was done in order to keep last
> used governor's information intact in sysfs, while the CPUs are offline.
>
> We also missed marking policy->governor as NULL while restoring the
> policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
How exactly does that happen?
> for an uninitialized policy. Which eventually returns -EBUSY.
>
> Fix this by setting policy->governor to NULL while restoring the policy.
>
> Reported-by: Pi-Cheng Chen <[email protected]>
> Reported-by: "Jon Medhurst (Tixy)" <[email protected]>
> Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> For 4.2-rc
>
> drivers/cpufreq/cpufreq.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..2c22e3902e72 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
>
> down_write(&policy->rwsem);
> policy->cpu = cpu;
> + policy->governor = NULL;
> up_write(&policy->rwsem);
> }
>
>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 09-07-15, 02:33, Rafael J. Wysocki wrote:
> > We also missed marking policy->governor as NULL while restoring the
> > policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
>
> How exactly does that happen?
Should have mentioned that in detail, sorry for being lazy. Hopefully
this will look better:
---------------------------8<---------------------------
Message-Id: <5f17361741c009a7f0d8488f7f94bab80d9317fd.1436418101.git.viresh.kumar@linaro.org>
From: Viresh Kumar <[email protected]>
Date: Wed, 8 Jul 2015 10:45:53 +0530
Subject: [PATCH V2] cpufreq: Initialize the governor again while restoring policy
When all CPUs of a policy are hot-unplugged, we EXIT the governor but
don't mark policy->governor as NULL. This was done in order to keep last
used governor's information intact in sysfs, while the CPUs are offline.
But we also marking policy->governor as NULL while restoring the policy.
Because policy->governor still points to the last governor while policy
is restored, following sequence of event happens:
- cpufreq_init_policy() called while restoring policy
- find_governor() matches last_governor string for present governors and
returns last used governor's pointer, say ondemand. policy->governor
already has the same address, unless the governor was removed in
between.
- cpufreq_set_policy() is called with both old/new policies governor set
as ondemand.
- Because governors matched, we skip governor initialization and return
after calling __cpufreq_governor(CPUFREQ_GOV_LIMITS). Because the
governor wasn't initialized for this policy, it returned -EBUSY.
- cpufreq_init_policy() exits the policy on this error, but doesn't
destroy it properly (should be fixed separately).
- And so we enter a scenario where the policy isn't completely
initialized but used.
Fix this by setting policy->governor to NULL while restoring the policy.
Reported-and-tested-by: Pi-Cheng Chen <[email protected]>
Reported-and-tested-by: "Jon Medhurst (Tixy)" <[email protected]>
Tested-by: Steven Rostedt <[email protected]>
Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
Signed-off-by: Viresh Kumar <[email protected]>
---
V2: Detailed changelog
drivers/cpufreq/cpufreq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b612411655f9..2c22e3902e72 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
down_write(&policy->rwsem);
policy->cpu = cpu;
+ policy->governor = NULL;
up_write(&policy->rwsem);
}
On Thursday, July 09, 2015 10:40:24 AM Viresh Kumar wrote:
> On 09-07-15, 02:33, Rafael J. Wysocki wrote:
> > > We also missed marking policy->governor as NULL while restoring the
> > > policy. Because of that, we call __cpufreq_governor(CPUFREQ_GOV_LIMITS)
> >
> > How exactly does that happen?
>
> Should have mentioned that in detail, sorry for being lazy. Hopefully
> this will look better:
OK, applied (with some minor changelog fixup), thanks!
> ---------------------------8<---------------------------
>
> Message-Id: <5f17361741c009a7f0d8488f7f94bab80d9317fd.1436418101.git.viresh.kumar@linaro.org>
> From: Viresh Kumar <[email protected]>
> Date: Wed, 8 Jul 2015 10:45:53 +0530
> Subject: [PATCH V2] cpufreq: Initialize the governor again while restoring policy
>
> When all CPUs of a policy are hot-unplugged, we EXIT the governor but
> don't mark policy->governor as NULL. This was done in order to keep last
> used governor's information intact in sysfs, while the CPUs are offline.
>
> But we also marking policy->governor as NULL while restoring the policy.
>
> Because policy->governor still points to the last governor while policy
> is restored, following sequence of event happens:
> - cpufreq_init_policy() called while restoring policy
> - find_governor() matches last_governor string for present governors and
> returns last used governor's pointer, say ondemand. policy->governor
> already has the same address, unless the governor was removed in
> between.
> - cpufreq_set_policy() is called with both old/new policies governor set
> as ondemand.
> - Because governors matched, we skip governor initialization and return
> after calling __cpufreq_governor(CPUFREQ_GOV_LIMITS).
But this sounds fragile in principle. What's the benefit from skipping the
governor initialization in that case?
> Because the
> governor wasn't initialized for this policy, it returned -EBUSY.
> - cpufreq_init_policy() exits the policy on this error, but doesn't
> destroy it properly (should be fixed separately).
> - And so we enter a scenario where the policy isn't completely
> initialized but used.
>
> Fix this by setting policy->governor to NULL while restoring the policy.
>
> Reported-and-tested-by: Pi-Cheng Chen <[email protected]>
> Reported-and-tested-by: "Jon Medhurst (Tixy)" <[email protected]>
> Tested-by: Steven Rostedt <[email protected]>
> Fixes: 18bf3a124ef8 ("cpufreq: Mark policy->governor = NULL for inactive policies")
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> V2: Detailed changelog
>
> drivers/cpufreq/cpufreq.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..2c22e3902e72 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1132,6 +1132,7 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
>
> down_write(&policy->rwsem);
> policy->cpu = cpu;
> + policy->governor = NULL;
> up_write(&policy->rwsem);
> }
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 10 July 2015 at 05:35, Rafael J. Wysocki <[email protected]> wrote:
>> - Because governors matched, we skip governor initialization and return
>> after calling __cpufreq_governor(CPUFREQ_GOV_LIMITS).
>
> But this sounds fragile in principle. What's the benefit from skipping the
> governor initialization in that case?
So this is the case where we have changed some property of the governor,
and the governor is already initialised. We need to exit the earlier governor
and initialize the new one only when the governor is actually switched.