LinuxLists.cc - Re: [linux-pm] [PATCH] PowerOP, PowerOP Core, 1/2

2006-09-19 21:37:21

Subject: Re: [linux-pm] [PATCH] PowerOP, PowerOP Core, 1/2

From: "Eugeny S. Mints" <[email protected]>

| Eugeny S. Mints wrote:
| > Pavel Machek wrote:
| >> Hi!
| >>
| [skip]
| >> How is it going to work on 8cpu box? will
| >> you have states like cpu1_800MHz_cpu2_1600MHz_cpu3_800MHz_... ?
|
| basically I guess you are asking about what the names of operating
| points are and how to distinguish between operating points from
| userspace on 8cpu box.
|
| An advantage of PowerOP approach is that operating point name is used as
| a _handle_ and may or may not be meaningful. The idea is that if a
| policy manager needs to make a decision and needs to distinguish between
| operating points it can check value of any power parameters of operating
| points in question. Power parameter values may be obtained under
| <op_name> dir name.
|
| With such an approach a policy manager may compare operating points at
| runtime and should not rely on compile time knowledge about what name
| corresponds to what set of power parameter values. It uses name as a
| handle.
---
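
A minimal sketch of the "name as a handle" idea quoted above, assuming a policy manager that selects an operating point purely by inspecting parameter values. The handles (`op0`, `op1`, `op2`), the parameter names (`cpu_freq_mhz`, `cpu_voltage_mv`) and the helper `pick_op` are hypothetical and are not taken from the PowerOP patch; a real policy manager would presumably read the values from the per-OP directory rather than from a dictionary.

```python
# Hypothetical operating points keyed by opaque handles; the parameter names
# and values are invented for illustration, not taken from the PowerOP patch.
OPERATING_POINTS = {
    "op0": {"cpu_freq_mhz": 800, "cpu_voltage_mv": 1000},
    "op1": {"cpu_freq_mhz": 1600, "cpu_voltage_mv": 1300},
    "op2": {"cpu_freq_mhz": 1200, "cpu_voltage_mv": 1150},
}


def pick_op(ops, min_freq_mhz):
    """Return the handle of the lowest-voltage OP that meets the frequency floor."""
    candidates = [
        (params["cpu_voltage_mv"], name)
        for name, params in ops.items()
        if params["cpu_freq_mhz"] >= min_freq_mhz
    ]
    if not candidates:
        return None  # no operating point satisfies the request
    return min(candidates)[1]


print(pick_op(OPERATING_POINTS, 1000))
```

The selection never looks at what a handle is called, only at the parameter values stored under it, so renaming the operating points would not change the result.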

Hmm. If you assume the CPUs in an SMP system can be in different
operating points, this would (as Pavel pointed out) result in an
explosion of operating points.
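
To make the size of that explosion concrete, here is a rough sketch that enumerates system-level operating points when every CPU chooses its frequency independently. The two frequency steps echo the 800MHz/1600MHz example name quoted earlier; the helpers `system_level_ops` and `op_name` are hypothetical.

```python
from itertools import product

# Hypothetical per-CPU frequency steps in MHz (the 800/1600 values echo the
# example state name quoted earlier in the thread).
CPU_FREQS_MHZ = (800, 1600)


def system_level_ops(n_cpus, freqs=CPU_FREQS_MHZ):
    """Enumerate every system-wide OP when each CPU picks a frequency independently."""
    return list(product(freqs, repeat=n_cpus))


def op_name(op):
    """Build a descriptive name in the style cpu1_800MHz_cpu2_1600MHz_..."""
    return "_".join(f"cpu{i}_{mhz}MHz" for i, mhz in enumerate(op, start=1))


ops = system_level_ops(8)
print(len(ops))          # 2 ** 8 combinations for 8 CPUs with 2 steps each
print(op_name(ops[1]))   # one of the generated names
```

With k frequency steps per CPU and n CPUs the count is k ** n, so even two steps on eight CPUs already give 256 system-level operating points.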

I see several possible responses to this problem:

(1) Make operating points a CPU-level abstraction, rather than a
system-level abstraction, with the set of OPs for each CPU defined
individually. This allows for non-symmetric CPUs. Each CPU would have
its own policy manager driving OP selection for that CPU (the set of
policy managers could be shared, as I believe cpufreq shares governors
among CPUs).

(2) Create CPU-group operating points that capture the power
parameters for each of a set of CPUs (that is, a group OP would be a
vector of CPU OPs).

(3) Create CPU-group operating points that vary a group of CPUs
symmetrically (that is, one set of CPU-level power parameters shared
across a set of CPUs), with group-level policy managers that control
transitions for the group in lockstep.

(4) Create CPU-group operating points that vary a group of CPUs
symmetrically (as in (3)), and have both group-level policy managers and
a system-level policy manager that moves CPUs from one group to
another.
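
The four options above could be modelled roughly as follows. This is only a sketch of the data shapes involved, not a proposal for the kernel interface: every class, field and function name here (`CpuOp`, `VectorGroupOp`, `SymmetricGroup`, `move_cpu`) and every frequency or voltage value is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CpuOp:
    """Option (1): an operating point scoped to a single CPU."""
    name: str
    freq_mhz: int
    voltage_mv: int


@dataclass(frozen=True)
class VectorGroupOp:
    """Option (2): a group OP is a vector with one CpuOp per CPU in the group."""
    name: str
    per_cpu: tuple  # tuple of CpuOp, indexed by position in the group


@dataclass
class SymmetricGroup:
    """Options (3) and (4): every CPU in the group shares one CpuOp."""
    cpus: set = field(default_factory=set)
    current: CpuOp = None

    def set_op(self, op):
        # Group-level policy manager: all members transition in lockstep.
        self.current = op


def move_cpu(cpu, src, dst):
    """Option (4): a system-level manager migrates a CPU between groups."""
    src.cpus.remove(cpu)
    dst.cpus.add(cpu)


slow = CpuOp("slow", 800, 1000)
fast = CpuOp("fast", 1600, 1300)
mixed = VectorGroupOp("mixed", (slow, fast))  # option (2): CPU 0 slow, CPU 1 fast
low_group = SymmetricGroup({0, 1, 2, 3}, slow)
high_group = SymmetricGroup({4, 5, 6, 7}, fast)
move_cpu(3, low_group, high_group)  # CPU 3 now follows the "fast" group's OP
```

In this shape, option (2) still needs one named vector per combination, while options (3) and (4) keep the number of OPs per group small and express asymmetry through group membership instead.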

scott
--
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il 61820
e-mail: [email protected] fax: +1-217-384-8550
phone: +1-217-384-8589 cell: +1-217-433-6114 pager: [email protected]

2006-09-22 14:11:31

by Pavel Machek

Hi!

> Hmm. If you assume the CPUs in an SMP system can be in different
> operating points, this would (as Pavel pointed out) result in an
> explosion of operating points.

Problem is not only CPUs, devices are mostly independent in PC
case... it would be nice to solve that problem with same approach.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-22 14:49:51

by Igor Stoppa

On Fri, 2006-09-22 at 16:11 +0200, ext Pavel Machek wrote:
> Hi!
>
> > Hmm. If you assume the CPUs in an SMP system can be in different
> > operating points, this would (as Pavel pointed out) result in an
> > explosion of operating points.
>
> Problem is not only CPUs, devices are mostly independent in PC
> case... it would be nice to solve that problem with same approach.
>
> Pavel

This whole discussion is, imho, very misleading.

The number of CPUs in a box or the number of cores in a chip is not a
significant element, per se.

What is really important is how interdependent they are.
In the case of a board with 2, 4 or 8 CPUs, the decision whether their states
are tied together or not is not based on the packaging, but rather on
the (possibly suboptimal) HW design: shared clock or power sources
impose constraints and correlations.

Correlations lead to the multiplication of subsystem states, while
constraints curb the number, because if CPU1 and CPU2 share the same
voltage source, then of all the possible states, only those where this
constraint is satisfied are possible.
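
A small sketch of how such a constraint prunes the joint state space. It assumes a deliberately simplified model in which each frequency is tied to exactly one voltage; the state table and the helper `joint_states` are hypothetical.

```python
from itertools import product

# Hypothetical (freq_mhz, voltage_mv) pairs available to each CPU. In this
# simplified model every frequency requires exactly one voltage.
CPU_STATES = ((800, 1000), (1600, 1300))


def joint_states(shared_voltage):
    """Enumerate joint (CPU1, CPU2) states, optionally with one shared voltage rail."""
    states = []
    for s1, s2 in product(CPU_STATES, repeat=2):
        if shared_voltage and s1[1] != s2[1]:
            continue  # impossible: one rail cannot supply two voltages at once
        states.append((s1, s2))
    return states


unconstrained = joint_states(shared_voltage=False)
constrained = joint_states(shared_voltage=True)
print(len(unconstrained), len(constrained))
```

Under these assumptions the shared rail leaves only the states where both CPUs sit at the same voltage, so the two CPUs effectively behave as one subsystem.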

Remember what an OP is:
a set of values that exhaustively and uniquely defines the state of a
system.

So if your box has 256 CPUs, I bet that they are not all on the same
board and probably you also have several independently programmable
power sources.
If every power source feeds, say, 8 CPUs, then the box contains 32
independent subsystems.

Of course one probably would like to orchestrate all of them, but that's
a 2nd level problem, that could be addressed by a power/workload
manager.

However, even starting with localised dynamic power management would
yield a significant improvement.

About other devices within a PC: SW design cannot really change whatever
constraints the HW design is imposing: if 2 devices are sharing a
programmable v/f source, the source defines a subsystem which
comprises both devices and it has to be addressed as such.
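
As a rough illustration of that point, devices can be grouped by the source that feeds them, and each resulting group is the unit that has to be managed together. The board description and the helper `subsystems` below are hypothetical.

```python
from collections import defaultdict

# Hypothetical board description: device name -> the programmable
# voltage/frequency source that feeds it.
DEVICE_SOURCE = {
    "cpu0": "vf_src_a",
    "cpu1": "vf_src_a",
    "gpu": "vf_src_b",
    "dsp": "vf_src_b",
    "wlan": "vf_src_c",
}


def subsystems(device_source):
    """Group devices that share a source; each group must be managed as one unit."""
    groups = defaultdict(list)
    for device, source in device_source.items():
        groups[source].append(device)
    return {source: sorted(devs) for source, devs in groups.items()}


for source, devices in sorted(subsystems(DEVICE_SOURCE).items()):
    print(source, devices)
```

Here five devices collapse into three subsystems, and only the devices inside one subsystem need operating points that are defined jointly.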

The interdependency could be masked at some high abstract level, but
then going down, close to HW, it has to be explicitly dealt with.

--
Cheers,
Igor

Igor Stoppa (Nokia M - OSSO / Tampere)