2009-07-06 11:18:28

by Thomas Renninger

[permalink] [raw]
Subject: cpufreq cleanups - .30 vs .31

Hi Dave,

this is about Venki's and Mathieu's recently sent cleanups.
I'd like to summarize this to help finding a solution:

IMO Venki's approach (making .governor() always be called with
rwsem held) is the cleaner one and this should be the way to
go for .31 and future. This better separates locking responsibilities
between cpufreq core and governors and brings back "design" into this.

One could argue that for .30 Mathieu's is better, because less
intrusive. It's up to Dave in the end, but:
[patch 2.6.30 1/4] remove rwsem lock from CPUFREQ_GOV_STOP call (second call
site)
should not be the way to go for .31 and I'd vote for Venki's
approach concerning locking .governor() against multiple calls (done by
rwsem) and governor() vs do_dbs_timer calls (governor's job with a governor
specific sem).

So if not find too intrusive, I'd say:
Venkatesh's whole series of:
[patch 0/4] Take care of cpufreq lockdep issues (take 2)
should be seen in .31.

Depending on how intrusive this is seen, Venki's first patch:
[patch 1/4] cpufreq: Eliminate the recent lockdep warnings in cpufreq
should then go to .30 (after still waiting a bit?)
or Mathieu's approach (I'd vote for Venki's to be consistent for .30 and .31).

The one patch from Mathieu:
[patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
is a separate, general cleanup which should show up in .31.



I still have two patch specific questions:
about Mathieu's (it's a minor issue in the error path):
[patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess

+???????????????????????if (lock_policy_rwsem_write(cpu) < 0) {
+???????????????????????????????/* Should not go through policy unlock path */
+???????????????????????????????if (cpufreq_driver->exit)
+???????????????????????????????????????cpufreq_driver->exit(policy);
+???????????????????????????????ret = -EBUSY;
+???????????????????????????????cpufreq_cpu_put(managed_policy);
Shouldn't:
cpufreq_cpu_put(managed_policy);
be called before:
cpufreq_driver->exit(policy);
Just in case the driver itself wants to grab the policy of the
managed cpu?


about Venki's:
[patch 3/4] cpufreq: Cleanup locking in ondemand governor
Isn't it possible to use only one mutex(timer_mutex) to protect do_dbs_timer
against governor start, stop, limit?
Then dbs_mutex would only be used to protect against concurrent sysfs access
and can be thrown away as soon as ondemand only provides global sysfs files,
not per cpu ones.

Hmm, maybe this should just go in? It eases up things, but it's still hard
to follow up each detail. Fixes/enhancements can still be put on top
for .31.

Thomas


2009-07-06 13:45:57

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

* Thomas Renninger ([email protected]) wrote:
> Hi Dave,
>
> this is about Venki's and Mathieu's recently sent cleanups.
> I'd like to summarize this to help finding a solution:
>
> IMO Venki's approach (making .governor() always be called with
> rwsem held) is the cleaner one and this should be the way to
> go for .31 and future. This better separates locking responsibilities
> between cpufreq core and governors and brings back "design" into this.
>
> One could argue that for .30 Mathieu's is better, because less
> intrusive. It's up to Dave in the end, but:
> [patch 2.6.30 1/4] remove rwsem lock from CPUFREQ_GOV_STOP call (second call
> site)
> should not be the way to go for .31 and I'd vote for Venki's
> approach concerning locking .governor() against multiple calls (done by
> rwsem) and governor() vs do_dbs_timer calls (governor's job with a governor
> specific sem).
>
> So if not find too intrusive, I'd say:
> Venkatesh's whole series of:
> [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> should be seen in .31.
>
> Depending on how intrusive this is seen, Venki's first patch:
> [patch 1/4] cpufreq: Eliminate the recent lockdep warnings in cpufreq
> should then go to .30 (after still waiting a bit?)
> or Mathieu's approach (I'd vote for Venki's to be consistent for .30 and .31).
>
> The one patch from Mathieu:
> [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> is a separate, general cleanup which should show up in .31.
>
>
>
> I still have two patch specific questions:
> about Mathieu's (it's a minor issue in the error path):
> [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
>
> +???????????????????????if (lock_policy_rwsem_write(cpu) < 0) {
> +???????????????????????????????/* Should not go through policy unlock path */
> +???????????????????????????????if (cpufreq_driver->exit)
> +???????????????????????????????????????cpufreq_driver->exit(policy);
> +???????????????????????????????ret = -EBUSY;
> +???????????????????????????????cpufreq_cpu_put(managed_policy);
> Shouldn't:
> cpufreq_cpu_put(managed_policy);
> be called before:
> cpufreq_driver->exit(policy);
> Just in case the driver itself wants to grab the policy of the
> managed cpu?
>

If we want to make it perfectly similar to the error path:

(note : this is a success path if ret = 0, and error path if ret != 0)

ret != 0 :
if (!ret)
cpufreq_cpu_put(managed_policy);
/*
* Success. We only needed to be added to the mask.
* Call driver->exit() because only the cpu parent of
* the kobj needed to call init().
*/
goto out_driver_exit; /* call driver->exit() */

Then yes, we might be tempted to flip the cpu_put and exit.

But given cpu_put just decrements a reference counter (and performs
cleanup if needed), then even if ->exit() takes a reference count
somehow, this will just make the code hold 2 refcounts temporarily.

And I prefer to keep this refcount thorough ->exit() call, because we
don't hold any policy_rwsem at this point : we are in a lock acquisition
error path. It's therefore safer to keep the refcount to ensure that
data won't vanish.

Mathieu

>
> about Venki's:
> [patch 3/4] cpufreq: Cleanup locking in ondemand governor
> Isn't it possible to use only one mutex(timer_mutex) to protect do_dbs_timer
> against governor start, stop, limit?
> Then dbs_mutex would only be used to protect against concurrent sysfs access
> and can be thrown away as soon as ondemand only provides global sysfs files,
> not per cpu ones.
>
> Hmm, maybe this should just go in? It eases up things, but it's still hard
> to follow up each detail. Fixes/enhancements can still be put on top
> for .31.
>
> Thomas
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-07-06 17:22:26

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

On Mon, 2009-07-06 at 04:18 -0700, Thomas Renninger wrote:
> Hi Dave,
>
> this is about Venki's and Mathieu's recently sent cleanups.
> I'd like to summarize this to help finding a solution:
>
> IMO Venki's approach (making .governor() always be called with
> rwsem held) is the cleaner one and this should be the way to
> go for .31 and future. This better separates locking responsibilities
> between cpufreq core and governors and brings back "design" into this.
>
> One could argue that for .30 Mathieu's is better, because less
> intrusive. It's up to Dave in the end, but:
> [patch 2.6.30 1/4] remove rwsem lock from CPUFREQ_GOV_STOP call (second call
> site)
> should not be the way to go for .31 and I'd vote for Venki's
> approach concerning locking .governor() against multiple calls (done by
> rwsem) and governor() vs do_dbs_timer calls (governor's job with a governor
> specific sem).
>
> So if not find too intrusive, I'd say:
> Venkatesh's whole series of:
> [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> should be seen in .31.
>
> Depending on how intrusive this is seen, Venki's first patch:
> [patch 1/4] cpufreq: Eliminate the recent lockdep warnings in cpufreq
> should then go to .30 (after still waiting a bit?)
> or Mathieu's approach (I'd vote for Venki's to be consistent for .30 and .31).
>
> The one patch from Mathieu:
> [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> is a separate, general cleanup which should show up in .31.
>
>
>
> I still have two patch specific questions:
> about Mathieu's (it's a minor issue in the error path):
> [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
>
> + if (lock_policy_rwsem_write(cpu) < 0) {
> + /* Should not go through policy unlock path */
> + if (cpufreq_driver->exit)
> + cpufreq_driver->exit(policy);
> + ret = -EBUSY;
> + cpufreq_cpu_put(managed_policy);
> Shouldn't:
> cpufreq_cpu_put(managed_policy);
> be called before:
> cpufreq_driver->exit(policy);
> Just in case the driver itself wants to grab the policy of the
> managed cpu?
>
>
> about Venki's:
> [patch 3/4] cpufreq: Cleanup locking in ondemand governor
> Isn't it possible to use only one mutex(timer_mutex) to protect do_dbs_timer
> against governor start, stop, limit?
> Then dbs_mutex would only be used to protect against concurrent sysfs access
> and can be thrown away as soon as ondemand only provides global sysfs files,
> not per cpu ones.
>

dbs_mutex (or some other global lock) is also needed at the places where
dbs_enable is changed and used. Yes having dbs_mutex exclusively for
dbs_tuners makes code cleaner. I would say, making ondemand providing
global sysfs/debugfs files is a better thing to do sooner.

Thanks,
Venki

2009-07-07 01:52:15

by Dave Jones

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

On Mon, Jul 06, 2009 at 01:18:18PM +0200, Thomas Renninger wrote:

> So if not find too intrusive, I'd say:
> Venkatesh's whole series of:
> [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> should be seen in .31.
> ...
> The one patch from Mathieu:
> [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> is a separate, general cleanup which should show up in .31.

I came to the same conclusion after reading the thread, and looking
over the patches. I merged the above, and sent Linus a pull request
a few minutes ago.

Thanks Mathieu and Venki for chasing this down.

Dave

2009-07-27 14:25:40

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

* Dave Jones ([email protected]) wrote:
> On Mon, Jul 06, 2009 at 01:18:18PM +0200, Thomas Renninger wrote:
>
> > So if not find too intrusive, I'd say:
> > Venkatesh's whole series of:
> > [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> > should be seen in .31.
> > ...
> > The one patch from Mathieu:
> > [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> > is a separate, general cleanup which should show up in .31.
>
> I came to the same conclusion after reading the thread, and looking
> over the patches. I merged the above, and sent Linus a pull request
> a few minutes ago.
>
> Thanks Mathieu and Venki for chasing this down.
>
> Dave

Given I never got an answer to this question, I'm re-asking a question I
asked in a previous thread about Venki's patchset:

[CPUFREQ] Cleanup locking in ondemand governor
commit 5a75c82828e7c088ca6e7b4827911dc29cc8e774

>From the earlier thread:
Subject: Re: [patch 2.6.30 3/4] cpufreq add gov mutex

I am worried about potential races between add_dev/remove_dev, which
currently lock the rwsem as mean of protection, and execution of timer
handler that would not take the rwsem to protect itself anymore, due to
your changes.

I'm especially worried about the call to

__cpufreq_driver_target(dbs_info->cur_policy,
dbs_info->freq_lo, CPUFREQ_RELATION_H);

which seems to depend on policy-level information, protected at the
rwsem-level.

By removing the rwsem from the timer handler, I don't see how you plan
to protect this information from add_dev/remove_dev execution.


Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-07-27 16:33:29

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

On Mon, 2009-07-27 at 07:25 -0700, Mathieu Desnoyers wrote:
> * Dave Jones ([email protected]) wrote:
> > On Mon, Jul 06, 2009 at 01:18:18PM +0200, Thomas Renninger wrote:
> >
> > > So if not find too intrusive, I'd say:
> > > Venkatesh's whole series of:
> > > [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> > > should be seen in .31.
> > > ...
> > > The one patch from Mathieu:
> > > [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> > > is a separate, general cleanup which should show up in .31.
> >
> > I came to the same conclusion after reading the thread, and looking
> > over the patches. I merged the above, and sent Linus a pull request
> > a few minutes ago.
> >
> > Thanks Mathieu and Venki for chasing this down.
> >
> > Dave
>
> Given I never got an answer to this question, I'm re-asking a question I
> asked in a previous thread about Venki's patchset:
>
> [CPUFREQ] Cleanup locking in ondemand governor
> commit 5a75c82828e7c088ca6e7b4827911dc29cc8e774
>
> From the earlier thread:
> Subject: Re: [patch 2.6.30 3/4] cpufreq add gov mutex
>
> I am worried about potential races between add_dev/remove_dev, which
> currently lock the rwsem as mean of protection, and execution of timer
> handler that would not take the rwsem to protect itself anymore, due to
> your changes.
>
> I'm especially worried about the call to
>
> __cpufreq_driver_target(dbs_info->cur_policy,
> dbs_info->freq_lo, CPUFREQ_RELATION_H);
>
> which seems to depend on policy-level information, protected at the
> rwsem-level.
>
> By removing the rwsem from the timer handler, I don't see how you plan
> to protect this information from add_dev/remove_dev execution.
>

Sorry I missed the question earlier.

The invariant here is that the timer routine will not be running while
policy is inconsistent due to add/remove. The cpufreq layer calls START
at the end of add_dev when all policy stuff has been setup, which starts
the timer. And STOP along remove_dev before cleaning up policy which
stops the timer.

If you are thinking of races with other cpufreq sysfs interfaces, they
go through the per cpu rwsem along with add/remove.

Thanks,
Venki

2009-07-27 16:37:44

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: cpufreq cleanups - .30 vs .31

* Pallipadi, Venkatesh ([email protected]) wrote:
> On Mon, 2009-07-27 at 07:25 -0700, Mathieu Desnoyers wrote:
> > * Dave Jones ([email protected]) wrote:
> > > On Mon, Jul 06, 2009 at 01:18:18PM +0200, Thomas Renninger wrote:
> > >
> > > > So if not find too intrusive, I'd say:
> > > > Venkatesh's whole series of:
> > > > [patch 0/4] Take care of cpufreq lockdep issues (take 2)
> > > > should be seen in .31.
> > > > ...
> > > > The one patch from Mathieu:
> > > > [patch 2.6.30 2/4] CPUFREQ: fix (utter) cpufreq_add_dev mess
> > > > is a separate, general cleanup which should show up in .31.
> > >
> > > I came to the same conclusion after reading the thread, and looking
> > > over the patches. I merged the above, and sent Linus a pull request
> > > a few minutes ago.
> > >
> > > Thanks Mathieu and Venki for chasing this down.
> > >
> > > Dave
> >
> > Given I never got an answer to this question, I'm re-asking a question I
> > asked in a previous thread about Venki's patchset:
> >
> > [CPUFREQ] Cleanup locking in ondemand governor
> > commit 5a75c82828e7c088ca6e7b4827911dc29cc8e774
> >
> > From the earlier thread:
> > Subject: Re: [patch 2.6.30 3/4] cpufreq add gov mutex
> >
> > I am worried about potential races between add_dev/remove_dev, which
> > currently lock the rwsem as mean of protection, and execution of timer
> > handler that would not take the rwsem to protect itself anymore, due to
> > your changes.
> >
> > I'm especially worried about the call to
> >
> > __cpufreq_driver_target(dbs_info->cur_policy,
> > dbs_info->freq_lo, CPUFREQ_RELATION_H);
> >
> > which seems to depend on policy-level information, protected at the
> > rwsem-level.
> >
> > By removing the rwsem from the timer handler, I don't see how you plan
> > to protect this information from add_dev/remove_dev execution.
> >
>
> Sorry I missed the question earlier.
>
> The invariant here is that the timer routine will not be running while
> policy is inconsistent due to add/remove. The cpufreq layer calls START
> at the end of add_dev when all policy stuff has been setup, which starts
> the timer. And STOP along remove_dev before cleaning up policy which
> stops the timer.
>
> If you are thinking of races with other cpufreq sysfs interfaces, they
> go through the per cpu rwsem along with add/remove.
>

Great, that makes sense.

Thanks !

Mathieu

> Thanks,
> Venki
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68