2006-09-11 22:56:23

by Pavel Machek

Subject: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

Hi!

Just for the record... this goes out to the lkml. This discussion was
internal for way too long. (For interested lkml readers: I'm sure the
linux-pm mailing list has a public archive somewhere.)

On Tue 2006-09-12 02:05:26, Eugeny S. Mints wrote:
> Pavel Machek wrote:
> >>>>- PowerOP is only one layer (towards the bottom) in a power management
> >>>>solution.
> >>>>- PowerOP does *not* replace cpufreq
> >>>PowerOP provides userland interface for changing processor
> >>>frequency. That's bad -- duplicate interface.
> >>Basically the biggest problem with cpufreq interface is that cpufreq has
> >>"chose
> >>predefined closest to a given frequency" functionality implemented in the
> >>kernel while there is _no_ any reason to have this functionality
> >>implemented in
> >>the kernel if we have sysfs interface exported by PowerOP in place - you
> >>just
> >
> >No, there is reason to keep that in kernel -- so that cpufreq
> >userspace interface can be kept, and so that resulting kernel<->user
> >interface is not ugly.
> Cpufreq defines cpufreq_frequency_table structure in arch independent
> header while it's arch dependent data structure. A lot of code is built
> around this invalid basic brick and therefore is invalid: cpufreq tables,
> interface with cpu freq drivers, etc. Notion of transition latency as it
> defined by cpufreq is wrong because it's not a function of cpu type but
> function of current and next operating point. no runtime control on
> operating points set. it's always the same set of operating points for all
> system cpus in smp case and no way to define different sets or track any
> dependencies in case say multi core cpus. insufficient kernel<->user space
> interface to handle embedded requirements and no way to extend it within
> current design. Shall I continue? Why should then anyone want to keep
> cpufreq userspace interface just due to keep it?

Yes, please continue. I do not think we can just rip cpufreq interface
out of kernel, and I do not think it is as broken as you claim it
is. Ripping interface out of kernel takes years.

I'm sure cpufreq_frequency_table could be moved to asm/ header if you
felt strongly about that.

I believe we need to fix cpufreq if it is broken for embedded
cases.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


2006-09-12 00:20:00

by mark gross

Subject: Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On Tue, Sep 12, 2006 at 12:56:17AM +0200, Pavel Machek wrote:
> Hi!
>
> Just for the record... this goes out to the lkml. This discussion was
> internal for way too long. (for interested lkml readers, I'm sure
> linux-pm mailing list has public archive somewhere).
>

This was rude.

> On Tue 2006-09-12 02:05:26, Eugeny S. Mints wrote:
> > Pavel Machek wrote:
> > >>>>- PowerOP is only one layer (towards the bottom) in a power management
> > >>>>solution.
> > >>>>- PowerOP does *not* replace cpufreq
> > >>>PowerOP provides userland interface for changing processor
> > >>>frequency. That's bad -- duplicate interface.
> > >>Basically the biggest problem with cpufreq interface is that cpufreq has
> > >>"chose
> > >>predefined closest to a given frequency" functionality implemented in the
> > >>kernel while there is _no_ any reason to have this functionality
> > >>implemented in
> > >>the kernel if we have sysfs interface exported by PowerOP in place - you
> > >>just
> > >
> > >No, there is reason to keep that in kernel -- so that cpufreq
> > >userspace interface can be kept, and so that resulting kernel<->user
> > >interface is not ugly.
> > Cpufreq defines cpufreq_frequency_table structure in arch independent
> > header while it's arch dependent data structure. A lot of code is built
> > around this invalid basic brick and therefore is invalid: cpufreq tables,
> > interface with cpu freq drivers, etc. Notion of transition latency as it
> > defined by cpufreq is wrong because it's not a function of cpu type but
> > function of current and next operating point. no runtime control on
> > operating points set. it's always the same set of operating points for all
> > system cpus in smp case and no way to define different sets or track any
> > dependencies in case say multi core cpus. insufficient kernel<->user space
> > interface to handle embedded requirements and no way to extend it within
> > current design. Shall I continue? Why should then anyone want to keep
> > cpufreq userspace interface just due to keep it?
>
> Yes, please continue. I do not think we can just rip cpufreq interface
> out of kernel, and I do not think it is as broken as you claim it
> is. Ripping interface out of kernel takes years.
>
> I'm sure cpufreq_frequency_table could be moved to asm/ header if you
> felt strongly about that.
>
> I believe we need to fix cpufreq if it is broken for embedded
> cases.

cpufreq is broken at the cpufreq_driver interface for embedded
applications needing control over more than one control variable at a
time.

That interface only supports setting target frequencies, and expanding it
to set target frequencies and voltages is not possible without something
like PowerOP. Adding the types of parameters to cpufreq would likely
make cpufreq a mess. I think we would be better off with something that
coexists with cpufreq, like the powerop patch from Eugeny.
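As a rough illustration of the point above, here is a minimal user-space C sketch. The struct, field and function names are hypothetical; they come neither from cpufreq nor from the PowerOP patches. It only shows why a frequency-only target cannot tell apart two points that share a CPU clock but differ in voltage:

```c
#include <stddef.h>

/* Hypothetical operating point: several control variables that must be
 * set together. Field names are illustrative only. */
struct op_point {
	const char *name;
	unsigned int cpu_khz;	/* CPU clock in kHz */
	unsigned int bus_khz;	/* bus clock in kHz */
	unsigned int core_mv;	/* core voltage in mV */
};

/* Frequency-only lookup: returns the first entry with a matching CPU
 * clock. Entries that differ only in bus clock or voltage cannot be
 * told apart through this interface. */
const struct op_point *op_find_by_freq(const struct op_point *tbl, size_t n,
				       unsigned int cpu_khz)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].cpu_khz == cpu_khz)
			return &tbl[i];
	return NULL;
}

/* Full lookup: every control variable takes part in the match. */
const struct op_point *op_find_exact(const struct op_point *tbl, size_t n,
				     unsigned int cpu_khz,
				     unsigned int bus_khz,
				     unsigned int core_mv)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].cpu_khz == cpu_khz &&
		    tbl[i].bus_khz == bus_khz &&
		    tbl[i].core_mv == core_mv)
			return &tbl[i];
	return NULL;
}
```

With two table entries at the same CPU clock, op_find_by_freq() always yields the first one, while op_find_exact() can select either.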

God help you if you try to use cpufreq on a complex non-PC platform with
multiple power and clock domains that need to be tweaked to squeeze out
competitive battery life.

Because of the existing user base of cpufreq, removing cpufreq will never
happen. No one supporting the PowerOP patch has ever recommended
such a thing. However, holding back innovation because of an existing
solution that doesn't support a large class of users seems dumb.

--mgross

2006-09-12 03:57:17

by Greg KH

Subject: Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On Mon, Sep 11, 2006 at 05:17:01PM -0700, Mark Gross wrote:
>
> cpufreq is broken at the cpufreq_driver interface for embedded
> applications needing control over more than one control variable at a
> time.
>
> That interface only supports setting target frequencies, and expanding it
> to set target frequencies and voltages is not possible without something
> like PowerOP. Adding the types of parameters to cpufreq would likely
> make cpufreq a mess. I think we would be better off with something that
> coexists with cpufreq, like the powerop patch from Eugeny.
>
> God help you if you try to use cpufreq on a complex non-PC platform with
> multiple power and clock domains that need to be tweaked to squeeze out
> competitive battery life.
>
> Because of the existing user base of cpufreq, removing cpufreq will never
> happen. No one supporting the PowerOP patch has ever recommended
> such a thing. However, holding back innovation because of an existing
> solution that doesn't support a large class of users seems dumb.

But you can't break the existing stuff, and it seems that some of these
proposals are doing just that. :(

thanks,

greg k-h

2006-09-12 08:33:28

by Pavel Machek

Subject: Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

Hi!

> > > >No, there is reason to keep that in kernel -- so that cpufreq
> > > >userspace interface can be kept, and so that resulting kernel<->user
> > > >interface is not ugly.
> > > Cpufreq defines cpufreq_frequency_table structure in arch independent
> > > header while it's arch dependent data structure. A lot of code is built
> > > around this invalid basic brick and therefore is invalid: cpufreq tables,
> > > interface with cpu freq drivers, etc. Notion of transition latency as it
> > > defined by cpufreq is wrong because it's not a function of cpu type but
> > > function of current and next operating point. no runtime control on
> > > operating points set. it's always the same set of operating points for all
> > > system cpus in smp case and no way to define different sets or track any
> > > dependencies in case say multi core cpus. insufficient kernel<->user space
> > > interface to handle embedded requirements and no way to extend it within
> > > current design. Shall I continue? Why should then anyone want to keep
> > > cpufreq userspace interface just due to keep it?
> >
> > Yes, please continue. I do not think we can just rip cpufreq interface
> > out of kernel, and I do not think it is as broken as you claim it
> > is. Ripping interface out of kernel takes years.
> >
> > I'm sure cpufreq_frequency_table could be moved to asm/ header if you
> > felt strongly about that.
> >
> > I believe we need to fix cpufreq if it is broken for embedded
> > cases.
>
> cpufreq is broken at the cpufreq_driver interface for embedded
> applications needing control over more than one control variable at a
> time.
>
> That interface only supports setting target frequencies, and expanding it
> to set target frequencies and voltages is not possible without something
> like PowerOP. Adding the types of parameters to cpufreq would likely
> make cpufreq a mess.

Can we at least try adding that, before deciding cpufreq is unsuitable
and starting new interface? I do not think issues are nearly as big as
you think.. (at user<->kernel interface level, anyway; you'll need big
changes under the hood).
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-12 09:10:25

by Vitaly Wool

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

Pavel,

On 9/12/06, Pavel Machek <[email protected]> wrote:
> Can we at least try adding that, before deciding cpufreq is unsuitable
> and starting new interface? I do not think issues are nearly as big as
> you think.. (at user<->kernel interface level, anyway; you'll need big
> changes under the hood).

who talks about user <-> kernel interface level changes at the moment?!

Vitaly

2006-09-12 09:16:18

by Pavel Machek

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On Tue 2006-09-12 13:10:24, Vitaly Wool wrote:
> Pavel,
>
> On 9/12/06, Pavel Machek <[email protected]> wrote:
> >Can we at least try adding that, before deciding cpufreq is unsuitable
> >and starting new interface? I do not think issues are nearly as big as
> >you think.. (at user<->kernel interface level, anyway; you'll need big
> >changes under the hood).
>
> who talks about user <-> kernel interface level changes at the moment?!

Eugeny?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-12 09:23:12

by Vitaly Wool

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On 9/12/06, Pavel Machek <[email protected]> wrote:
> > who talks about user <-> kernel interface level changes at the moment?!
> Eugeny?

Well, as far as I understood, both parties are ready to talk about
_kernel_ interface at the moment. Let's try to look at it from this
very point of view.
Eugeny, please correct me if my understanding is wrong.

Vitaly

2006-09-13 23:50:45

by David Singleton

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

Greg,
here are a few paragraphs about the power management code I'm working on.
The OpPoint patch set is a fully functional power management solution,
from kernel operating state support to userland power manager.

OpPoint constructs operating points for all supported frequency, voltage
and suspend states for PC and SoC solutions running Linux. OpPoint
does not break or replace cpufreq. It leverages cpufreq code to provide
the same driver scaling functions when cpu frequency changes affect drivers.

(The ARM pxa27x patch uses the cpufreq scaling routines to scale the LCD
when frequencies are changed and works well when playing mpeg movies on
the LCD during frequency scaling operations).

The Operating Points in OpPoint are simply created at compile time, in
the same manner cpufreq tables are, and registered in
/sys/power/operating_states directory when the cpu is identified at boot time.

The states are ordered, by name, by their power consumption level from
lowest to highest, so the power manager can operate correctly regardless
of what frequencies or voltages are associated with the lowest or highest
operating points.
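A small user-space C sketch of what this ordering allows, under the assumption that the state names are chosen so that plain string order equals power order. The names and helper functions below are hypothetical and not taken from the OpPoint patches:

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C strings. */
int op_name_cmp(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return strcmp(*sa, *sb);
}

/* Sort state names so that index 0 is the lowest-power state. This
 * relies on the assumption that names sort in power order. */
void op_sort_names(const char **names, size_t count)
{
	qsort(names, count, sizeof(names[0]), op_name_cmp);
}

/* Pick the next state index: one step up when busy, one step down when
 * idle, clamped at both ends. No frequency or voltage is consulted. */
size_t op_step(size_t cur, size_t count, int busy)
{
	if (busy)
		return (cur + 1 < count) ? cur + 1 : cur;
	return (cur > 0) ? cur - 1 : cur;
}
```

A policy daemon built this way only moves along the sorted list and never needs to know what frequency or voltage a given state stands for.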

There is no kernel interface to parse and create all the parameters
needed to create an operating point. Platform specific information
is supplied to the oppoint structure through an ops vector of three
routines and a void * pointer to supply the platform specific data,
in the same manner drivers have a void * for their private data.

The ops vectors provide operating point specific functions to prepare
to change to a new operating point, to transition to the target operating
point, and a finish transition routine to notify drivers that
the clocks have scaled and that bus and DMA traffic may continue.
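The three routines described above can be sketched in user-space C roughly as follows. The callback names follow the ones used in the inlined patch (prepare_transition, transition, finish_transition); the md_data field name, the change_state() helper and the op_ok()/op_fail() stubs are assumptions made for this illustration, and the struct is reduced to what the sketch needs:

```c
#include <stddef.h>

/* Reduced, hypothetical view of an operating point with its ops vector. */
struct oppoint {
	const char *name;
	int (*prepare_transition)(struct oppoint *cur, struct oppoint *next);
	int (*transition)(struct oppoint *cur, struct oppoint *next);
	int (*finish_transition)(struct oppoint *cur, struct oppoint *next);
	void *md_data;	/* platform specific data (field name assumed) */
};

/* Stub callbacks for experimenting: one succeeds, one fails. */
int op_ok(struct oppoint *cur, struct oppoint *next)
{
	(void)cur;
	(void)next;
	return 0;
}

int op_fail(struct oppoint *cur, struct oppoint *next)
{
	(void)cur;
	(void)next;
	return -1;
}

/* Move *cur to target. If prepare fails nothing changes. If the
 * transition itself fails, finish is called with the old point as the
 * new one, so the system settles back where it was. */
int change_state(struct oppoint **cur, struct oppoint *target)
{
	int error;

	if (*cur == target)
		return 0;

	error = target->prepare_transition(*cur, target);
	if (error)
		return error;

	error = target->transition(*cur, target);
	if (error)
		target = *cur;

	if (target->finish_transition(*cur, target) == 0)
		*cur = target;

	return error;
}
```

Keeping the platform details behind the three callbacks and an opaque pointer means the generic code never has to parse frequency or voltage parameters itself.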

OpPoint draws the line about what's needed in the kernel a bit differently
than Matt's PowerOp code. OpPoint only puts operating point support in
the kernel. Policies for operating states and classes of operating states
are left to the power manager, in userland. This simplifies the
kernel code, no string parsers for operating point parameter construction,
and makes it easier to customize a solution by customizing the power
manager.

A power manager is supplied with the OpPoint patch set as well. I borrowed
the cpuspeed daemon and made a simple patch that uses the new OpPoint
sysfs interface. The daemon can be compiled as the original cpuspeed or
oppointd daemon depending on the user's choice. The daemon provides the
same functions as the cpuspeed daemon.

OpPoint is a fully functional solution ready for testing and evaluation
in Andrew's or your tree.

The kernel patches are available at:

http://source.mvista.com/~dsingleton/2.6.1-rc6

the power manager source code and patch is available at:

http://source.mvista.com/~dsingleton/oppointd


David

2006-09-14 05:30:35

by Vitaly Wool

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On 9/14/06, David Singleton <[email protected]> wrote:

> OpPoint constructs operating points for all supported frequency, voltage
> and suspend states for PC and SoC solutions running Linux.

...producing very hard-to-manage and long-to-search lists (did you
consider using trees for that BTW?)
Also, you'll have to maintain different lists for different
modifications of the same SoC. That will complicate the code pretty
much.

> (The ARM pxa27x patch uses the cpufreq scaling routines to scale the LCD
> when frequencies are changed and works well when playing mpeg movies on
> the LCD during frequency scaling operations).

PXA is high-end hardware for embedded, so that might not be very illustrative.

> The Operating Points in OpPoint are simply created at compile time, in
> the same manner cpufreq tables are, and registered in
> /sys/power/operating_states directory when the cpu is identified at boot time.

So are you saying there's not even a *kernel* interface to create OPs?
Assume you've got a device which needs scaling and whose driver may be
compiled as a module. Then you'll have to include the OPs that reflect
this driver's clock scaling at kernel compile time. Which is
nonsense in a way, because the clock itself should be switched on
during the probe() operation of the device driver.
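To make the concern concrete, a runtime registration interface of the kind asked about might look like the following user-space C sketch. The op_node type and both functions are hypothetical; they only illustrate a driver adding its operating points at probe time and removing them again on unload:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list node for one registered operating point. */
struct op_node {
	const char *name;
	struct op_node *next;
};

/* Add op at the head of the list. Names must be unique.
 * Returns 0 on success, -1 if the name is already registered. */
int op_register(struct op_node **head, struct op_node *op)
{
	struct op_node *it;

	for (it = *head; it; it = it->next)
		if (strcmp(it->name, op->name) == 0)
			return -1;

	op->next = *head;
	*head = op;
	return 0;
}

/* Remove op from the list.
 * Returns 0 on success, -1 if op was not registered. */
int op_unregister(struct op_node **head, struct op_node *op)
{
	struct op_node **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == op) {
			*pp = op->next;
			op->next = NULL;
			return 0;
		}
	}
	return -1;
}
```

A modular driver would call something like op_register() from its probe() routine and op_unregister() from its remove path, so the set of operating points no longer has to be fixed at kernel compile time.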

> OpPoint draws the line about what's needed in the kernel a bit differently
> than Matt's PowerOp code. OpPoint only puts operating point support in
> the kernel. Policies for operating states and classes of operating states
> are left to the power manager, in userland. This simplifies the
> kernel code, no string parsers for operating point parameter construction,
> and makes it easier to customize a solution by customizing the power
> manager.

That sounds nice :)

> OpPoint is a fully functional solution ready for testing and evaluation
> in Andrew's or your tree.

Can you please list the SoCs which this solution has been tested on?

Thanks,
Vitaly

2006-09-14 06:14:36

by Greg KH

Subject: OpPoint summary

On Wed, Sep 13, 2006 at 04:50:43PM -0700, David Singleton wrote:
> Greg,
> here are a few paragraphs about the power management code I'm working on.
> The OpPoint patch set is a fully functional power management solution,
> from kernel operating state support to userland power manager.

Thanks for the summary, but it was a bit longer than just "one
paragraph" :)

> OpPoint constructs operating points for all supported frequency, voltage
> and suspend states for PC and SoC solutions running Linux. OpPoint
> does not break or replace cpufreq. It leverages cpufreq code to provide
> the same driver scaling functions when cpu frequency changes affect drivers.

So it works with cpufreq? That's a good thing, as it is a requirement.
We can't just break current usages.

> OpPoint is a fully functional solution ready for testing and evaluation
> in Andrew's or your tree.
>
> The kernel patches are available at:
>
> http://source.mvista.com/~dsingleton/2.6.1-rc6

I get a 404 with that link :(

Care to resend your patches in the proper format, through email so that
we can see them, and possibly get some testing in -mm if they look sane?

thanks,

greg k-h

2006-09-14 07:36:00

by Vitaly Wool

Subject: Re: [linux-pm] OpPoint summary

Hi Greg,

On 9/14/06, Greg KH <[email protected]> wrote:
> >
> > The kernel patches are available at:
> >
> > http://source.mvista.com/~dsingleton/2.6.1-rc6
>
> I get a 404 with that link :(

I bet David meant http://source.mvista.com/~dsingleton/2.6.18-rc6/

Vitaly

2006-09-14 15:00:12

by mark gross

Subject: Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On Tue, Sep 12, 2006 at 10:33:28AM +0200, Pavel Machek wrote:
> Hi!
>
> > > > >No, there is reason to keep that in kernel -- so that cpufreq
> > > > >userspace interface can be kept, and so that resulting kernel<->user
> > > > >interface is not ugly.
> > > > Cpufreq defines cpufreq_frequency_table structure in arch independent
> > > > header while it's arch dependent data structure. A lot of code is built
> > > > around this invalid basic brick and therefore is invalid: cpufreq tables,
> > > > interface with cpu freq drivers, etc. Notion of transition latency as it
> > > > defined by cpufreq is wrong because it's not a function of cpu type but
> > > > function of current and next operating point. no runtime control on
> > > > operating points set. it's always the same set of operating points for all
> > > > system cpus in smp case and no way to define different sets or track any
> > > > dependencies in case say multi core cpus. insufficient kernel<->user space
> > > > interface to handle embedded requirements and no way to extend it within
> > > > current design. Shall I continue? Why should then anyone want to keep
> > > > cpufreq userspace interface just due to keep it?
> > >
> > > Yes, please continue. I do not think we can just rip cpufreq interface
> > > out of kernel, and I do not think it is as broken as you claim it
> > > is. Ripping interface out of kernel takes years.
> > >
> > > I'm sure cpufreq_frequency_table could be moved to asm/ header if you
> > > felt strongly about that.
> > >
> > > I believe we need to fix cpufreq if it is broken for embedded
> > > cases.
> >
> > cpufreq is broken at the cpufreq_driver interface for embedded
> > applications needing control over more than one control variable at a
> > time.
> >
> > That interface only supports setting target frequencies, and expanding it
> > to set target frequencies and voltages is not possible without something
> > like PowerOP. Adding the types of parameters to cpufreq would likely
> > make cpufreq a mess.
>
> Can we at least try adding that, before deciding cpufreq is unsuitable
> and starting new interface? I do not think issues are nearly as big as
> you think.. (at user<->kernel interface level, anyway; you'll need big
> changes under the hood).

We are trying. The PowerOP patches from Matt and Eugeny start to put
into place some of the kernel mode plumbing for this in a way that
avoids thrashing the existing models, and they address the needs of the
operating point PM community, which is large in the CE and embedded
camps.

--mgross

2006-09-14 15:07:17

by mark gross

Subject: Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP]

On Tue, Sep 12, 2006 at 01:10:24PM +0400, Vitaly Wool wrote:
> Pavel,
>
> On 9/12/06, Pavel Machek <[email protected]> wrote:
> >Can we at least try adding that, before deciding cpufreq is unsuitable
> >and starting new interface? I do not think issues are nearly as big as
> >you think.. (at user<->kernel interface level, anyway; you'll need big
> >changes under the hood).
>
> who talks about user <-> kernel interface level changes at the moment?!
>

Mostly Pavel is.

There are questions on how to set/get operating points between the
platform and user space, and some questions on how to make the space of
operating points nicely extensible beyond compile-time deployment.
But these questions aren't the ones I see folks fussing about.

--mgross

2006-09-14 16:55:20

by David Singleton

Subject: Re: OpPoint summary

On 9/13/06, Greg KH <[email protected]> wrote:
> On Wed, Sep 13, 2006 at 04:50:43PM -0700, David Singleton wrote:
> > Greg,
> > here are a few paragraphs about the power management code I'm working on.
> > The OpPoint patch set is a fully functional power management solution,
> > from kernel operating state support to userland power manager.
>
> Thanks for the summary, but it was a bit longer than just "one
> paragraph" :)
>
> > OpPoint constructs operating points for all supported frequency, voltage
> > and suspend states for PC and SoC solutions running Linux. OpPoint
> > does not break or replace cpufreq. It leverages cpufreq code to provide
> > the same driver scaling functions when cpu frequency changes affect drivers.
>
> So it works with cpufreq? That's a good thing, as it is a requirement.
> We can't just break current usages.
>
> > OpPoint is a fully functional solution ready for testing and evaluation
> > in Andrew's or your tree.
> >
> > The kernel patches are available at:
> >
> > http://source.mvista.com/~dsingleton/2.6.1-rc6
>
> I get a 404 with that link :(
>
> Care to resend your patches in the proper format, through email so that
> we can see them, and possibly get some testing in -mm if they look sane?

Whoops, they are in the 2.6.18-rc6 directory. Here's the core patch inlined:


Signed-Off-by: David Singleton <[email protected]>

drivers/base/driver.c | 1
drivers/base/power/Makefile | 2
drivers/base/power/oppoint.c | 74 +++++++++
include/linux/pm.h | 34 ++++
kernel/power/main.c | 328 +++++++++++++++++++++++++++++++++++++------
kernel/power/power.h | 2
6 files changed, 397 insertions(+), 44 deletions(-)

Index: linux-2.6.17/kernel/power/main.c
===================================================================
--- linux-2.6.17.orig/kernel/power/main.c
+++ linux-2.6.17/kernel/power/main.c
@@ -16,6 +16,7 @@
#include <linux/init.h>
#include <linux/pm.h>
#include <linux/console.h>
+#include <linux/module.h>

#include "power.h"

@@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops)
* the platform can enter the requested state.
*/

-static int suspend_prepare(suspend_state_t state)
+static int suspend_prepare(struct oppoint * state)
{
int error = 0;
unsigned int free_pages;
@@ -82,7 +83,7 @@ static int suspend_prepare(suspend_state
}

if (pm_ops->prepare) {
- if ((error = pm_ops->prepare(state)))
+ if ((error = pm_ops->prepare(state->type)))
goto Thaw;
}

@@ -94,7 +95,7 @@ static int suspend_prepare(suspend_state
return 0;
Finish:
if (pm_ops->finish)
- pm_ops->finish(state);
+ pm_ops->finish(state->type);
Thaw:
thaw_processes();
Enable_cpu:
@@ -104,7 +105,7 @@ static int suspend_prepare(suspend_state
}


-int suspend_enter(suspend_state_t state)
+int suspend_enter(struct oppoint * state)
{
int error = 0;
unsigned long flags;
@@ -115,7 +116,7 @@ int suspend_enter(suspend_state_t state)
printk(KERN_ERR "Some devices failed to power down\n");
goto Done;
}
- error = pm_ops->enter(state);
+ error = pm_ops->enter(state->type);
device_power_up();
Done:
local_irq_restore(flags);
@@ -131,36 +132,98 @@ int suspend_enter(suspend_state_t state)
* console that we've allocated. This is not called for suspend-to-disk.
*/

-static void suspend_finish(suspend_state_t state)
+static void suspend_finish(struct oppoint * state)
{
device_resume();
resume_console();
thaw_processes();
enable_nonboot_cpus();
if (pm_ops && pm_ops->finish)
- pm_ops->finish(state);
+ pm_ops->finish(state->type);
pm_restore_console();
}


+struct oppoint *current_state;
+struct oppoint pm_states = {
+ .name = "default",
+ .type = PM_SUSPEND_ON,
+};
+
+static struct oppoint standby = {
+ .name = "standby",
+ .type = PM_SUSPEND_STANDBY,
+};
+struct oppoint *oppoint_standby;

+static struct oppoint mem = {
+ .name = "mem",
+ .type = PM_SUSPEND_MEM,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 150,
+};
+struct oppoint *oppoint_mem;

-static const char * const pm_states[PM_SUSPEND_MAX] = {
- [PM_SUSPEND_STANDBY] = "standby",
- [PM_SUSPEND_MEM] = "mem",
#ifdef CONFIG_SOFTWARE_SUSPEND
- [PM_SUSPEND_DISK] = "disk",
-#endif
+struct oppoint disk = {
+ .name = "disk",
+ .type = PM_SUSPEND_DISK,
};
+#endif

-static inline int valid_state(suspend_state_t state)
+/*
+ *
+ */
+static int pm_change_state(struct oppoint *state)
+{
+ int error = 0;
+
+ printk("OpPoint: changing from %s to %s\n", current_state->name,
+ state->name);
+ /*
+ * compare to current operating point.
+ * if different change to new operating point.
+ */
+ if (current_state == state)
+ goto out;
+
+ /*
+ * prepare_transition does device constraint checking. If
+ * a new operating point will put a device in an unsupported
+ * state, lcd clock too low, NIC bus too low, etc. the new state
+ * cannot be entered (until the constrained device is suspended).
+ * If prepare_transition fails we don't go to the new operating
+ * point.
+ */
+ if ((error = state->prepare_transition(current_state, state)))
+ goto out;
+
+ /*
+ * if the transition fails we call the finish transition
+ * with the current state as the new state, causing
+ * the finish to return to the current_state.
+ */
+
+ if ((error = state->transition(current_state, state)))
+ state = current_state;
+
+ if ((state->finish_transition(current_state, state)) == 0)
+ current_state = state;
+
+out:
+ printk("OpPoint: State change returned %d\n", error);
+ return error;
+}
+
+static inline int valid_state(struct oppoint * state)
{
/* Suspend-to-disk does not really need low-level support.
* It can work with reboot if needed. */
- if (state == PM_SUSPEND_DISK)
+ if (state->type == PM_SUSPEND_DISK)
return 1;

- if (pm_ops && pm_ops->valid && !pm_ops->valid(state))
+ if (pm_ops && pm_ops->valid && !pm_ops->valid(state->type))
return 0;
return 1;
}
@@ -168,7 +231,7 @@ static inline int valid_state(suspend_st

/**
* enter_state - Do common work of entering low-power state.
- * @state: pm_state structure for state we're entering.
+ * @state: oppoint structure for state we're entering.
*
* Make sure we're the only ones trying to enter a sleep state. Fail
* if someone has beat us to it, since we don't want anything weird to
@@ -177,7 +240,7 @@ static inline int valid_state(suspend_st
* we've woken up).
*/

-static int enter_state(suspend_state_t state)
+static int enter_state(struct oppoint *state)
{
int error;

@@ -186,16 +249,21 @@ static int enter_state(suspend_state_t s
if (down_trylock(&pm_sem))
return -EBUSY;

- if (state == PM_SUSPEND_DISK) {
+ if (state->type == PM_SUSPEND_DISK) {
error = pm_suspend_disk();
goto Unlock;
}

- pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
+ if (state->type == PM_FREQ_CHANGE || state->type == PM_VOLT_CHANGE) {
+ error = pm_change_state(state);
+ goto Unlock;
+ }
+
+ pr_debug("PM: Preparing system for %s sleep\n", state->name);
if ((error = suspend_prepare(state)))
goto Unlock;

- pr_debug("PM: Entering %s sleep\n", pm_states[state]);
+ pr_debug("PM: Entering %s sleep\n", state->name);
error = suspend_enter(state);

pr_debug("PM: Finishing wakeup.\n");
@@ -211,7 +279,15 @@ static int enter_state(suspend_state_t s
*/
int software_suspend(void)
{
- return enter_state(PM_SUSPEND_DISK);
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_states.list;
+ int error = 0;
+
+ list_for_each_entry_safe(this, next, head, list) {
+ if (this->type == PM_SUSPEND_DISK)
+ error= enter_state(this);
+ }
+ return error;
}


@@ -223,9 +299,9 @@ int software_suspend(void)
* structure, and enter (above).
*/

-int pm_suspend(suspend_state_t state)
+int pm_suspend(struct oppoint * state)
{
- if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
+ if (state->type > PM_SUSPEND_ON && state->type <= PM_SUSPEND_MAX)
return enter_state(state);
return -EINVAL;
}
@@ -248,36 +324,29 @@ decl_subsys(power,NULL,NULL);

static ssize_t state_show(struct subsystem * subsys, char * buf)
{
- int i;
char * s = buf;

- for (i = 0; i < PM_SUSPEND_MAX; i++) {
- if (pm_states[i] && valid_state(i))
- s += sprintf(s,"%s ", pm_states[i]);
- }
- s += sprintf(s,"\n");
+ s += sprintf(s,"%s\n", current_state->name);
return (s - buf);
}

static ssize_t state_store(struct subsystem * subsys, const char * buf, size_t n)
{
- suspend_state_t state = PM_SUSPEND_STANDBY;
- const char * const *s;
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_states.list;
char *p;
- int error;
+ int error = -EINVAL;
int len;

p = memchr(buf, '\n', n);
len = p ? p - buf : n;
-
- for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
- if (*s && !strncmp(buf, *s, len))
+ list_for_each_entry_safe(this, next, head, list) {
+ if ((strlen(this->name) == len) &&
+ (!strncmp(this->name, buf, len))) {
+ error = enter_state(this);
break;
+ }
}
- if (state < PM_SUSPEND_MAX && *s)
- error = enter_state(state);
- else
- error = -EINVAL;
return error ? error : n;
}

@@ -292,12 +361,191 @@ static struct attribute_group attr_group
.attrs = g,
};

+static struct kobject oppoint_kobj = {
+ .kset = &power_subsys.kset,
+};
+
+struct oppoint_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject * kobj, char * buf);
+ ssize_t (*store)(struct kobject * kobj, const char * buf, size_t count);
+};
+
+#define to_oppoint(obj) container_of(obj,struct oppoint,kobj)
+#define to_oppoint_attr(_attr) container_of(_attr,struct oppoint_attribute,attr)
+/*
+ * the frequency, voltage and latency files are readonly
+ */
+
+static ssize_t oppoint_voltage_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->voltage);
+
+ return len;
+}
+
+static ssize_t oppoint_voltage_store(struct kobject * kobj, const char * buf,
+ size_t n)
+{
+ return -EINVAL;
+
+}
+
+static ssize_t oppoint_frequency_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->frequency);
+
+ return len;
+}
+
+static ssize_t oppoint_frequency_store(struct kobject * kobj,
+ const char * buf, size_t n)
+{
+ return -EINVAL;
+
+}
+
+static ssize_t oppoint_latency_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->latency);
+
+ return len;
+}
+
+static ssize_t oppoint_latency_store(struct kobject * kobj,
+ const char * buf, size_t n)
+{
+ return -EINVAL;
+
+}
+
+static struct oppoint_attribute frequency_attr = {
+ .attr = {
+ .name = "frequency",
+ .mode = 0400,
+ },
+ .show = oppoint_frequency_show,
+ .store = oppoint_frequency_store,
+};
+
+static struct oppoint_attribute voltage_attr = {
+ .attr = {
+ .name = "voltage",
+ .mode = 0400,
+ },
+ .show = oppoint_voltage_show,
+ .store = oppoint_voltage_store,
+};
+
+static struct oppoint_attribute latency_attr = {
+ .attr = {
+ .name = "latency",
+ .mode = 0400,
+ },
+ .show = oppoint_latency_show,
+ .store = oppoint_latency_store,
+};
+
+static ssize_t
+oppoint_attr_show(struct kobject * kobj, struct attribute * attr, char * buf)
+{
+ struct oppoint_attribute * opt_attr = to_oppoint_attr(attr);
+ ssize_t ret = 0;
+
+ if (opt_attr->show)
+ ret = opt_attr->show(kobj,buf);
+ return ret;
+}
+
+static ssize_t
+oppoint_attr_store(struct kobject * kobj, struct attribute * attr,
+ const char * buf, size_t count)
+{
+ return -EINVAL;
+}
+
+static void oppoint_kobj_release(struct kobject *kobj)
+{
+ return;
+}
+
+static struct sysfs_ops oppoint_sysfs_ops = {
+ .show = oppoint_attr_show,
+ .store = oppoint_attr_store,
+};
+
+static struct attribute * oppoint_default_attrs[] = {
+ &frequency_attr.attr,
+ &voltage_attr.attr,
+ &latency_attr.attr,
+ NULL,
+};
+
+static struct kobj_type ktype_operating_point = {
+ .release = oppoint_kobj_release,
+ .sysfs_ops = &oppoint_sysfs_ops,
+ .default_attrs = oppoint_default_attrs,
+};
+
+int unregister_operating_point(struct oppoint *opt)
+{
+ down(&pm_sem);
+ list_del_init(&opt->list);
+ sysfs_remove_file(&opt->kobj, &frequency_attr.attr);
+ sysfs_remove_file(&opt->kobj, &voltage_attr.attr);
+ sysfs_remove_file(&opt->kobj, &latency_attr.attr);
+ up(&pm_sem);
+ return 0;
+}
+EXPORT_SYMBOL(unregister_operating_point);
+
+int register_operating_point(struct oppoint *opt)
+{
+ down(&pm_sem);
+ kobject_set_name(&opt->kobj, opt->name);
+ opt->kobj.kset = &power_subsys.kset;
+ opt->kobj.parent = &oppoint_kobj;
+ opt->kobj.ktype = &ktype_operating_point;
+ kobject_register(&opt->kobj);
+
+ sysfs_create_file(&opt->kobj, &frequency_attr.attr);
+ sysfs_create_file(&opt->kobj, &voltage_attr.attr);
+ sysfs_create_file(&opt->kobj, &latency_attr.attr);
+
+ list_add_tail(&opt->list, &pm_states.list);
+ up(&pm_sem);
+ return 0;
+}
+EXPORT_SYMBOL(register_operating_point);

static int __init pm_init(void)
{
+
int error = subsystem_register(&power_subsys);
- if (!error)
+ if (!error) {
error = sysfs_create_group(&power_subsys.kset.kobj,&attr_group);
+ kobject_set_name(&oppoint_kobj, "operating_points");
+ kobject_register(&oppoint_kobj);
+ }
+
+
+ INIT_LIST_HEAD(&pm_states.list);
+
+#ifdef CONFIG_SOFTWARE_SUSPEND
+ register_operating_point(&disk);
+#endif
+ register_operating_point(&mem);
+ register_operating_point(&standby);
+ current_state = &pm_states;
+
return error;
}

Index: linux-2.6.17/include/linux/pm.h
===================================================================
--- linux-2.6.17.orig/include/linux/pm.h
+++ linux-2.6.17/include/linux/pm.h
@@ -24,6 +24,7 @@
#ifdef __KERNEL__

#include <linux/list.h>
+#include <linux/kobject.h>
#include <asm/atomic.h>

/*
@@ -108,7 +109,36 @@ typedef int __bitwise suspend_state_t;
#define PM_SUSPEND_STANDBY ((__force suspend_state_t) 1)
#define PM_SUSPEND_MEM ((__force suspend_state_t) 3)
#define PM_SUSPEND_DISK ((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX ((__force suspend_state_t) 5)
+#define PM_FREQ_CHANGE ((__force suspend_state_t) 5)
+#define PM_VOLT_CHANGE ((__force suspend_state_t) 6)
+#define PM_SUSPEND_MAX ((__force suspend_state_t) 7)
+
+struct oppoint {
+ struct list_head list;
+ suspend_state_t type;
+ unsigned int flags;
+ char *name;
+ unsigned int frequency; /* in KHz */
+ unsigned int voltage; /* mV */
+ unsigned int latency; /* transition latency in us */
+ int (*prepare_transition)(struct oppoint *cur, struct oppoint *new);
+ int (*transition)(struct oppoint *cur, struct oppoint *new);
+ int (*finish_transition)(struct oppoint *cur, struct oppoint *new);
+
+ void *md_data; /* arch dependent data */
+ struct kobject kobj;
+};
+
+
+extern struct oppoint pm_states;
+extern struct oppoint *current_state;
+extern unsigned long oppoint_compute_lpj(unsigned long ref, u_int div, u_int mult);
+extern int register_operating_point(struct oppoint *opt);
+extern int unregister_operating_point(struct oppoint *opt);
+struct notifier_block;
+extern void oppoint_register_scale(struct notifier_block *nb, int level);
+extern void oppoint_unregister_scale(struct notifier_block *nb, int level);
+extern int oppoint_driver_scale(int level, struct oppoint *new);

typedef int __bitwise suspend_disk_method_t;

@@ -128,7 +158,7 @@ struct pm_ops {

extern void pm_set_ops(struct pm_ops *);
extern struct pm_ops *pm_ops;
-extern int pm_suspend(suspend_state_t state);
+extern int pm_suspend(struct oppoint *state);


/*
Index: linux-2.6.17/kernel/power/power.h
===================================================================
--- linux-2.6.17.orig/kernel/power/power.h
+++ linux-2.6.17/kernel/power/power.h
@@ -113,4 +113,4 @@ extern int swsusp_resume(void);
extern int swsusp_read(void);
extern int swsusp_write(void);
extern void swsusp_close(void);
-extern int suspend_enter(suspend_state_t state);
+extern int suspend_enter(struct oppoint * state);
Index: linux-2.6.17/drivers/base/driver.c
===================================================================
--- linux-2.6.17.orig/drivers/base/driver.c
+++ linux-2.6.17/drivers/base/driver.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/errno.h>
#include <linux/string.h>
+#include <linux/pm.h>
#include "base.h"

#define to_dev(node) container_of(node, struct device, driver_list)
Index: linux-2.6.17/drivers/base/power/Makefile
===================================================================
--- linux-2.6.17.orig/drivers/base/power/Makefile
+++ linux-2.6.17/drivers/base/power/Makefile
@@ -1,4 +1,4 @@
-obj-y := shutdown.o
+obj-y := shutdown.o oppoint.o
obj-$(CONFIG_PM) += main.o suspend.o resume.o runtime.o sysfs.o
obj-$(CONFIG_PM_TRACE) += trace.o

Index: linux-2.6.17/drivers/base/power/oppoint.c
===================================================================
--- /dev/null
+++ linux-2.6.17/drivers/base/power/oppoint.c
@@ -0,0 +1,74 @@
+/*
+ * oppoint.c -- OpPoint Power Management support (hotplug events and device
+ * scaling).
+ *
+ * (c) 2006 MontaVista Software, Inc. This file is licensed under the
+ * terms of the GNU General Public License version 2. This program is
+ * licensed "as is" without any warranty of any kind, whether express or
+ * implied.
+ */
+
+#include <linux/device.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/notifier.h>
+
+#include "power.h"
+static RAW_NOTIFIER_HEAD(oppoint_scale_notifier);
+static DECLARE_MUTEX(oppoint_scale_sem);
+
+/* This function may be called by the platform frequency scaler before
+ or after a frequency change, in order to let drivers adjust any
+ clocks or calculations for the new frequency. */
+
+int oppoint_driver_scale(int level, struct oppoint *newop)
+{
+ if (down_trylock(&oppoint_scale_sem))
+ return 1;
+
+ raw_notifier_call_chain(&oppoint_scale_notifier, level, newop);
+ up(&oppoint_scale_sem);
+ return 0;
+}
+
+void oppoint_register_scale(struct notifier_block *nb, int level)
+{
+ down(&oppoint_scale_sem);
+ raw_notifier_chain_register(&oppoint_scale_notifier, nb);
+ up(&oppoint_scale_sem);
+}
+
+void oppoint_unregister_scale(struct notifier_block *nb, int level)
+{
+ down(&oppoint_scale_sem);
+ raw_notifier_chain_unregister(&oppoint_scale_notifier, nb);
+ up(&oppoint_scale_sem);
+}
+
+EXPORT_SYMBOL(oppoint_driver_scale);
+EXPORT_SYMBOL(oppoint_register_scale);
+EXPORT_SYMBOL(oppoint_unregister_scale);
+
+unsigned long oppoint_compute_lpj(unsigned long ref, u_int div, u_int mult)
+{
+ unsigned long new_jiffy_l, new_jiffy_h;
+
+ /*
+ * Recalculate loops_per_jiffy. We do it this way to
+ * avoid math overflow on 32-bit machines. Maybe we
+ * should make this architecture dependent? If you have
+ * a better way of doing this, please replace!
+ *
+ * new = old * mult / div
+ */
+ new_jiffy_h = ref / div;
+ new_jiffy_l = (ref % div) / 100;
+ new_jiffy_h *= mult;
+ new_jiffy_l = new_jiffy_l * mult / div;
+
+ return new_jiffy_h + new_jiffy_l * 100;
+}
+EXPORT_SYMBOL(oppoint_compute_lpj);
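
The comment in oppoint_compute_lpj() says the split is there to avoid math overflow on 32-bit machines. A minimal userspace sketch of that arithmetic follows. The function names and the use of uint32_t are assumptions made for illustration; only the sequence of steps is taken from the patch.

```c
/* Userspace sketch of the scaling arithmetic; not the kernel function. */
#include <stdint.h>

/* Same steps as oppoint_compute_lpj(), with 32-bit types made explicit. */
static uint32_t scale_lpj_split(uint32_t ref, uint32_t div, uint32_t mult)
{
	uint32_t high = ref / div;		/* whole multiples of div */
	uint32_t low = (ref % div) / 100;	/* remainder, scaled down by 100 */

	high *= mult;
	low = low * mult / div;

	return high + low * 100;
}

/* Naive form: the product ref * mult wraps modulo 2^32 before the divide. */
static uint32_t scale_lpj_naive(uint32_t ref, uint32_t div, uint32_t mult)
{
	return ref * mult / div;
}

/* Reference result computed with a 64-bit intermediate. */
static uint64_t scale_lpj_exact(uint32_t ref, uint32_t div, uint32_t mult)
{
	return (uint64_t)ref * mult / div;
}
```

For ref = 2000000, div = 1600000 and mult = 600000 the split form and the 64-bit reference agree, while the naive 32-bit product wraps. The split form is not overflow-proof either: the intermediate low * mult can still wrap for large enough inputs.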


David
>
> thanks,
>
> greg k-h
>

2006-09-14 17:03:06

by David Singleton

Subject: Re: OpPoint summary

> Care to resend your patches in the proper format, through email so that
> we can see them, and possibly get some testing in -mm if they look sane?

Greg,
here's the patch that leverages the cpufreq notifier lists for driver
PRE and POST change functions. I'm also rebasing to 2.6.18-rc7 and making
the changes Pavel suggested: keeping only suspend states in
/sys/power/state and moving the operating point control file down under
the /sys/power/operating_states directory.
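
The PRE and POST hooks can be pictured with a small userspace sketch. It assumes the three oppoint callbacks run in the order prepare, transition, finish, and that a failure stops the sequence. Every name in it is illustrative and none of it is kernel API.

```c
/* Userspace sketch; every name below is illustrative, not a kernel API. */
#include <stddef.h>

struct op_sketch {
	const char *name;
	unsigned int frequency;	/* kHz */
	int (*prepare_transition)(struct op_sketch *cur, struct op_sketch *next);
	int (*transition)(struct op_sketch *cur, struct op_sketch *next);
	int (*finish_transition)(struct op_sketch *cur, struct op_sketch *next);
};

/* Records the order in which the phases ran: 'P', 'T', 'F'. */
static char phase_log[4];
static size_t phase_count;

static int record_phase(char phase)
{
	if (phase_count < sizeof(phase_log) - 1)
		phase_log[phase_count++] = phase;
	return 0;
}

/* Stands in for the PRECHANGE notifier call. */
static int sketch_prepare(struct op_sketch *cur, struct op_sketch *next)
{
	(void)cur; (void)next;
	return record_phase('P');
}

/* Stands in for the hardware-specific frequency switch. */
static int sketch_transition(struct op_sketch *cur, struct op_sketch *next)
{
	(void)cur; (void)next;
	return record_phase('T');
}

/* Stands in for the POSTCHANGE notifier call. */
static int sketch_finish(struct op_sketch *cur, struct op_sketch *next)
{
	(void)cur; (void)next;
	return record_phase('F');
}

/* Assumed calling order: prepare, then transition, then finish. */
static int switch_op(struct op_sketch *cur, struct op_sketch *next)
{
	int err;

	if (cur == next)
		return 0;	/* nothing to do */
	if (next->prepare_transition) {
		err = next->prepare_transition(cur, next);
		if (err)
			return err;
	}
	if (next->transition) {
		err = next->transition(cur, next);
		if (err)
			return err;
	}
	if (next->finish_transition)
		return next->finish_transition(cur, next);
	return 0;
}

static struct op_sketch sketch_high = {
	"high", 1400000, sketch_prepare, sketch_transition, sketch_finish
};
static struct op_sketch sketch_low = {
	"low", 600000, sketch_prepare, sketch_transition, sketch_finish
};
```

Calling switch_op(&sketch_high, &sketch_low) runs the three phases once. Switching to the operating point that is already current is a no-op, which mirrors the cur == new check in centrino_transition().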


Signed-Off-by: David Singleton <[email protected]>

drivers/cpufreq/cpufreq.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/cpufreq.h | 2 ++
2 files changed, 38 insertions(+)

Index: linux-2.6.17/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-2.6.17.orig/drivers/cpufreq/cpufreq.c
+++ linux-2.6.17/drivers/cpufreq/cpufreq.c
@@ -226,6 +226,35 @@ static void adjust_jiffies(unsigned long
static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci) { return; }
#endif

+int cpufreq_prepare_transition(struct oppoint *cur, struct oppoint *new)
+{
+ struct cpufreq_freqs freqs;
+
+ freqs.old = cur->frequency;
+ freqs.new = new->frequency;
+ freqs.cpu = 0;
+ freqs.flags = new->flags;
+ blocking_notifier_call_chain(&cpufreq_transition_notifier_list,
+ CPUFREQ_PRECHANGE, &freqs);
+ adjust_jiffies(CPUFREQ_PRECHANGE, &freqs);
+ return 0;
+}
+EXPORT_SYMBOL(cpufreq_prepare_transition);
+
+int cpufreq_finish_transition(struct oppoint *cur, struct oppoint *new)
+{
+ struct cpufreq_freqs freqs;
+
+ freqs.old = cur->frequency;
+ freqs.new = new->frequency;
+ freqs.cpu = 0;
+ freqs.flags = new->flags;
+ adjust_jiffies(CPUFREQ_POSTCHANGE, &freqs);
+ blocking_notifier_call_chain(&cpufreq_transition_notifier_list,
+ CPUFREQ_POSTCHANGE, &freqs);
+ return 0;
+}
+EXPORT_SYMBOL(cpufreq_finish_transition);

/**
* cpufreq_notify_transition - call notifier chain and adjust_jiffies
@@ -920,6 +949,12 @@ static void cpufreq_out_of_sync(unsigned
}


+#ifdef CONFIG_PM
+unsigned int cpufreq_quick_get(unsigned int cpu)
+{
+ return (current_state->frequency);
+}
+#else
/**
* cpufreq_quick_get - get the CPU frequency (in kHz) frpm policy->cur
* @cpu: CPU number
@@ -941,6 +976,7 @@ unsigned int cpufreq_quick_get(unsigned

return (ret);
}
+#endif
EXPORT_SYMBOL(cpufreq_quick_get);


Index: linux-2.6.17/include/linux/cpufreq.h
===================================================================
--- linux-2.6.17.orig/include/linux/cpufreq.h
+++ linux-2.6.17/include/linux/cpufreq.h
@@ -268,6 +268,8 @@ static inline unsigned int cpufreq_quick
return 0;
}
#endif
+int cpufreq_prepare_transition(struct oppoint *cur, struct oppoint *new);
+int cpufreq_finish_transition(struct oppoint *cur, struct oppoint *new);


/*********************************************************************



>
> thanks,
>
> greg k-h
>

2006-09-14 17:07:12

by David Singleton

Subject: Re: OpPoint summary

>
> Care to resend your patches in the proper format, through email so that
> we can see them, and possibly get some testing in -mm if they look sane?

Greg,
here's the patch that implements operating points for different frequencies
for the speedstep-centrino line of processors. Operating points are created
in much the same manner as cpufreq tables. This works both for simple
implementations like the centrino and for more complex SoC systems like the
arm-pxa27x, which has several clocks to control and different clock
divisors and multipliers.
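
For the centrino case, each operating point carries a packed frequency/voltage value in md_data. The userspace sketch below copies the encoding expression from centrino_set_frequency() and the 16-bit merge from centrino_transition(). The decode helpers and all function names are assumptions added for illustration, and the decode is exact only for voltages of the form 700 + 16*n mV.

```c
/* Userspace sketch of the md_data packing; helper names are illustrative. */
#include <stdint.h>

/* Same expression as centrino_set_frequency(): MHz/100 in bits 8..15,
 * (mV - 700) / 16 in bits 0..7. */
static uint32_t perf_ctl_encode(unsigned int freq_mhz, unsigned int volt_mv)
{
	return ((freq_mhz / 100) << 8) | ((volt_mv - 700) / 16);
}

/* Assumed inverse of the encoding above. */
static unsigned int perf_ctl_freq_mhz(uint32_t value)
{
	return ((value >> 8) & 0xff) * 100;
}

static unsigned int perf_ctl_volt_mv(uint32_t value)
{
	return (value & 0xff) * 16 + 700;
}

/* Same masking as centrino_transition(): keep the upper bits of the old
 * register value and replace only the low 16 bits. */
static uint32_t perf_ctl_merge(uint32_t oldmsr, uint32_t newval)
{
	return (oldmsr & ~0xffffu) | (newval & 0xffffu);
}
```

With the Banias 900MHz table entry (600 MHz, 844 mV) this packs to 0x609, and the module default of 1000 MHz at 1308 mV packs to 0xA26.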

David


Signed-Off-by: David Singleton <[email protected]>

arch/i386/Kconfig | 2
arch/i386/kernel/cpu/Makefile | 1
arch/i386/kernel/cpu/power/Kconfig | 168 ++++++++++
arch/i386/kernel/cpu/power/Makefile | 2
arch/i386/kernel/cpu/power/centrino-on-the-fly.c | 72 ++++
arch/i386/kernel/cpu/power/centrino-speedstep.c | 368 +++++++++++++++++++++++
arch/i386/kernel/i386_ksyms.c | 4
7 files changed, 617 insertions(+)

Index: linux-2.6.17/arch/i386/kernel/cpu/Makefile
===================================================================
--- linux-2.6.17.orig/arch/i386/kernel/cpu/Makefile
+++ linux-2.6.17/arch/i386/kernel/cpu/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_X86_MCE) += mcheck/

obj-$(CONFIG_MTRR) += mtrr/
obj-$(CONFIG_CPU_FREQ) += cpufreq/
+obj-$(CONFIG_PM) += power/
Index: linux-2.6.17/arch/i386/kernel/i386_ksyms.c
===================================================================
--- linux-2.6.17.orig/arch/i386/kernel/i386_ksyms.c
+++ linux-2.6.17/arch/i386/kernel/i386_ksyms.c
@@ -28,3 +28,7 @@ EXPORT_SYMBOL(__read_lock_failed);
#endif

EXPORT_SYMBOL(csum_partial);
+#ifdef CONFIG_PM
+#include <linux/pm.h>
+EXPORT_SYMBOL(pm_states);
+#endif
Index: linux-2.6.17/arch/i386/Kconfig
===================================================================
--- linux-2.6.17.orig/arch/i386/Kconfig
+++ linux-2.6.17/arch/i386/Kconfig
@@ -964,6 +964,8 @@ config APM_REAL_MODE_POWER_OFF

endmenu

+source "arch/i386/kernel/cpu/power/Kconfig"
+
source "arch/i386/kernel/cpu/cpufreq/Kconfig"

endmenu
Index: linux-2.6.17/arch/i386/kernel/cpu/power/Makefile
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/i386/kernel/cpu/power/Makefile
@@ -0,0 +1,2 @@
+obj-m += centrino-on-the-fly.o
+obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO) += centrino-speedstep.o
Index: linux-2.6.17/arch/i386/kernel/cpu/power/centrino-speedstep.c
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/i386/kernel/cpu/power/centrino-speedstep.c
@@ -0,0 +1,368 @@
+/*
+ * OpPoint support for Enhanced SpeedStep, as found in Intel's Pentium
+ * M (part of the Centrino chipset).
+ *
+ * Modelled on speedstep-centrino.c
+ *
+ * Author: David Singleton [email protected] MontaVista Software, Inc.
+ *
+ * 2006 (c) MontaVista Software, Inc. This file is licensed under
+ * the terms of the GNU General Public License version 2. This program
+ * is licensed "as is" without any warranty of any kind, whether express
+ * or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/pm.h>
+#include <linux/delay.h>
+#include <linux/cpufreq.h>
+#include <linux/moduleparam.h>
+#include <linux/moduleloader.h>
+
+struct cpu_id
+{
+ __u8 x86; /* CPU family */
+ __u8 x86_model; /* model */
+ __u8 x86_mask; /* stepping */
+};
+
+enum {
+ CPU_BANIAS,
+ CPU_DOTHAN_A1,
+ CPU_DOTHAN_A2,
+ CPU_DOTHAN_B0,
+ CPU_MP4HT_D0,
+ CPU_MP4HT_E0,
+};
+
+static const struct cpu_id cpu_ids[] = {
+ [CPU_BANIAS] = { 6, 9, 5 },
+ [CPU_DOTHAN_A1] = { 6, 13, 1 },
+ [CPU_DOTHAN_A2] = { 6, 13, 2 },
+ [CPU_DOTHAN_B0] = { 6, 13, 6 },
+ [CPU_MP4HT_D0] = {15, 3, 4 },
+ [CPU_MP4HT_E0] = {15, 4, 1 },
+};
+#define N_IDS ARRAY_SIZE(cpu_ids)
+
+struct cpu_model
+{
+ const struct cpu_id *cpu_id;
+ const char *model_name;
+ unsigned max_freq; /* max clock in kHz */
+
+};
+static int centrino_verify_cpu_id(const struct cpuinfo_x86 *c, const struct cpu_id *x);
+
+void centrino_set_frequency(struct oppoint *op, uint freq, uint volt)
+{
+ op->frequency = freq * 1000;
+ op->voltage = volt;
+ op->md_data = (void *)(((freq / 100) << 8) | (volt - 700) / 16);
+}
+EXPORT_SYMBOL(centrino_set_frequency);
+
+int centrino_transition(struct oppoint *cur, struct oppoint *new)
+{
+ unsigned int msr, oldmsr = 0, h = 0;
+
+ if (cur == new)
+ return 0;
+
+ msr = (unsigned int)new->md_data;
+ rdmsr(MSR_IA32_PERF_CTL, oldmsr, h);
+
+ /* all but 16 LSB are reserved, treat them with care */
+ oldmsr &= ~0xffff;
+ msr &= 0xffff;
+ oldmsr |= msr;
+
+ wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
+
+ udelay(new->latency);
+
+ return 0;
+}
+EXPORT_SYMBOL(centrino_transition);
+
+#define _BANIAS(cpuid, max, name) \
+{ .cpu_id = cpuid, \
+ .model_name = "Intel(R) Pentium(R) M processor " name "MHz", \
+ .max_freq = (max)*1000, \
+}
+#define BANIAS(max) _BANIAS(&cpu_ids[CPU_BANIAS], max, #max)
+
+/*
+ * CPU models, their operating frequency range, and freq/voltage
+ * operating points
+ */
+static struct cpu_model models[] =
+{
+ _BANIAS(&cpu_ids[CPU_BANIAS], 900, " 900"),
+ BANIAS(1000),
+ BANIAS(1100),
+ BANIAS(1200),
+ BANIAS(1300),
+ BANIAS(1400),
+ BANIAS(1500),
+ BANIAS(1600),
+ BANIAS(1700),
+
+ /* NULL model_name is a wildcard */
+ { &cpu_ids[CPU_DOTHAN_A1], NULL, 0},
+ { &cpu_ids[CPU_DOTHAN_A2], NULL, 0},
+ { &cpu_ids[CPU_DOTHAN_B0], NULL, 0},
+ { &cpu_ids[CPU_MP4HT_D0], NULL, 0},
+ { &cpu_ids[CPU_MP4HT_E0], NULL, 0},
+
+ { NULL, }
+};
+#undef _BANIAS
+#undef BANIAS
+
+static struct oppoint lowest = {
+ .name = "lowest",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint low = {
+ .name = "low",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint mediumlow = {
+ .name = "mediumlow",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint medium = {
+ .name = "medium",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint mediumhigh = {
+ .name = "mediumhigh",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint high = {
+ .name = "high",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint highest = {
+ .name = "highest",
+ .type = PM_FREQ_CHANGE,
+ .latency = 15,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static int __init centrino_init_oppoint(void)
+{
+ struct cpuinfo_x86 *cpu = &cpu_data[0];
+ struct cpu_model *model;
+
+ for(model = models; model->cpu_id != NULL; model++) {
+ if (centrino_verify_cpu_id(cpu, model->cpu_id) &&
+ (model->model_name == NULL ||
+ strcmp(cpu->x86_model_id, model->model_name) == 0))
+ break;
+ }
+
+ if (model->cpu_id == NULL) {
+ /* No match at all */
+ printk("OpPoint: no support for CPU model %s\n",
+ cpu->x86_model_id);
+ return -ENOENT;
+ }
+
+ switch (model->max_freq) {
+ /* Ultra Low Voltage Intel Pentium M processor 900MHz (Banias) */
+ case (900000) :
+ {
+ centrino_set_frequency(&low, 600, 844);
+ centrino_set_frequency(&medium, 800, 988);
+ centrino_set_frequency(&high, 900, 1004);
+ break;
+ }
+ /* Ultra Low Voltage Intel Pentium M processor 1.00GHz (Banias) */
+ case (1000000) :
+ {
+ centrino_set_frequency(&low, 600, 844);
+ centrino_set_frequency(&medium, 800, 972);
+ centrino_set_frequency(&high, 900, 988);
+ centrino_set_frequency(&highest, 1000, 1004);
+ break;
+ }
+ /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */
+ case (1100000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1020);
+ centrino_set_frequency(&medium, 900, 1100);
+ centrino_set_frequency(&high, 1000, 1164);
+ centrino_set_frequency(&highest, 1100, 1180);
+ break;
+ }
+ /* Low Voltage Intel Pentium M processor 1.20GHz (Banias) */
+ case (1200000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1004);
+ centrino_set_frequency(&medium, 900, 1020);
+ centrino_set_frequency(&mediumhigh, 1000, 1100);
+ centrino_set_frequency(&high, 1100, 1164);
+ centrino_set_frequency(&highest, 1200, 1180);
+ break;
+ }
+ /* Intel Pentium M processor 1.30GHz (Banias) */
+ case (1300000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1260);
+ centrino_set_frequency(&medium, 1000, 1292);
+ centrino_set_frequency(&high, 1200, 1356);
+ centrino_set_frequency(&highest, 1300, 1388);
+ break;
+ }
+ /* Intel Pentium M processor 1.40GHz (Banias) */
+ case (1400000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1180);
+ centrino_set_frequency(&medium, 1000, 1308);
+ centrino_set_frequency(&high, 1200, 1436);
+ centrino_set_frequency(&highest, 1400, 1484);
+ break;
+ }
+ /* Intel Pentium M processor 1.50GHz (Banias) */
+ case (1500000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1116);
+ centrino_set_frequency(&medium, 1000, 1228);
+ centrino_set_frequency(&mediumhigh, 1200, 1356);
+ centrino_set_frequency(&high, 1400, 1452);
+ centrino_set_frequency(&highest, 1500, 1484);
+ break;
+ }
+ /* Intel Pentium M processor 1.60GHz (Banias) */
+ case (1600000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1036);
+ centrino_set_frequency(&medium, 1000, 1164);
+ centrino_set_frequency(&mediumhigh, 1200, 1276);
+ centrino_set_frequency(&high, 1400, 1420);
+ centrino_set_frequency(&highest, 1600, 1484);
+ break;
+ }
+ /* Intel Pentium M processor 1.70GHz (Banias) */
+ case (1700000) :
+ {
+ centrino_set_frequency(&lowest, 600, 956);
+ centrino_set_frequency(&low, 800, 1004);
+ centrino_set_frequency(&medium, 1000, 1116);
+ centrino_set_frequency(&mediumhigh, 1200, 1228);
+ centrino_set_frequency(&high, 1400, 1308);
+ centrino_set_frequency(&highest, 1700, 1484);
+ break;
+ }
+ }
+ if (lowest.frequency) {
+ register_operating_point(&lowest);
+ list_add_tail(&lowest.list, &pm_states.list);
+ }
+ if (low.frequency) {
+ register_operating_point(&low);
+ list_add_tail(&low.list, &pm_states.list);
+ }
+ if (mediumlow.frequency) {
+ register_operating_point(&mediumlow);
+ list_add_tail(&mediumlow.list, &pm_states.list);
+ }
+ if (medium.frequency) {
+ register_operating_point(&medium);
+ list_add_tail(&medium.list, &pm_states.list);
+ }
+ if (mediumhigh.frequency) {
+ register_operating_point(&mediumhigh);
+ list_add_tail(&mediumhigh.list, &pm_states.list);
+ }
+ if (high.frequency) {
+ register_operating_point(&high);
+ list_add_tail(&high.list, &pm_states.list);
+ current_state = &high;
+ }
+ if (highest.frequency) {
+ register_operating_point(&highest);
+ list_add_tail(&highest.list, &pm_states.list);
+ current_state = &highest;
+ }
+ return 0;
+}
+
+static void centrino_exit_oppoint(void)
+{
+ if (lowest.frequency)
+ list_del_init(&lowest.list);
+ if (low.frequency)
+ list_del_init(&low.list);
+ if (mediumlow.frequency)
+ list_del_init(&mediumlow.list);
+ if (medium.frequency)
+ list_del_init(&medium.list);
+ if (mediumhigh.frequency)
+ list_del_init(&mediumhigh.list);
+ if (high.frequency)
+ list_del_init(&high.list);
+ if (highest.frequency)
+ list_del_init(&highest.list);
+ return;
+}
+
+static int centrino_verify_cpu_id(const struct cpuinfo_x86 *c, const struct cpu_id *x)
+{
+ if ((c->x86 == x->x86) &&
+ (c->x86_model == x->x86_model) &&
+ (c->x86_mask == x->x86_mask))
+ return 1;
+ return 0;
+}
+
+MODULE_AUTHOR ("David Singleton <[email protected]>");
+MODULE_DESCRIPTION ("OpPoint operating points for Intel Pentium M processors.");
+MODULE_LICENSE ("GPL");
+
+late_initcall(centrino_init_oppoint);
+module_exit(centrino_exit_oppoint);
Index: linux-2.6.17/arch/i386/kernel/cpu/power/Kconfig
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/i386/kernel/cpu/power/Kconfig
@@ -0,0 +1,168 @@
+#
+# Operating Point support for frequency/voltage scaling
+#
+
+menu "CPU Frequency/Voltage scaling"
+
+if PM
+
+comment "OpPoint processor support"
+
+config ELAN_OPPOINT
+ tristate "AMD Elan SC400 and SC410"
+ depends on X86_ELAN
+ ---help---
+ This adds the OpPoint support for AMD Elan SC400 and SC410
+ processors.
+
+ You need to specify the processor maximum speed as boot
+ parameter: elanfreq=maxspeed (in kHz) or as module
+ parameter "max_freq".
+
+ If in doubt, say N.
+
+config SC520_OPPOINT
+ tristate "AMD Elan SC520"
+ depends on X86_ELAN
+ ---help---
+ This adds OpPoint support for AMD Elan SC520 processor.
+
+ If in doubt, say N.
+
+
+config X86_POWERNOW_K6
+ tristate "AMD Mobile K6-2/K6-3 PowerNow!"
+ help
+ This adds OpPoint support for mobile AMD K6-2+ and mobile
+ AMD K6-3+ processors.
+
+ If in doubt, say N.
+
+config X86_POWERNOW_K7
+ tristate "AMD Mobile Athlon/Duron PowerNow!"
+ help
+ This adds OpPoint support for mobile AMD K7 processors.
+
+ If in doubt, say N.
+
+config X86_POWERNOW_K7_ACPI
+ bool
+ depends on X86_POWERNOW_K7 && ACPI_PROCESSOR
+ depends on !(X86_POWERNOW_K7 = y && ACPI_PROCESSOR = m)
+ default y
+
+config X86_POWERNOW_K8
+ tristate "AMD Opteron/Athlon64 PowerNow!"
+ depends on EXPERIMENTAL
+ help
+ This adds OpPoint support for mobile AMD Opteron/Athlon64 processors.
+
+ If in doubt, say N.
+
+config X86_POWERNOW_K8_ACPI
+ bool
+ depends on X86_POWERNOW_K8 && ACPI_PROCESSOR
+ depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
+ default y
+
+config X86_GX_SUSPMOD
+ tristate "Cyrix MediaGX/NatSemi Geode Suspend Modulation"
+ depends on PCI
+ help
+ This adds OpPoint support for NatSemi Geode processors which
+ support suspend modulation.
+
+ If in doubt, say N.
+
+config X86_SPEEDSTEP_CENTRINO
+ tristate "Intel Enhanced SpeedStep"
+ select X86_SPEEDSTEP_CENTRINO_TABLE if (!X86_SPEEDSTEP_CENTRINO_ACPI)
+ help
+ This adds OpPoint support for Enhanced SpeedStep enabled
+ mobile CPUs. This means Intel Pentium M (Centrino) CPUs. However,
+ you also need to say Y to "Use ACPI tables to decode..." below
+ [which might imply enabling ACPI] if you want to use this driver
+ on non-Banias CPUs.
+
+ If in doubt, say N.
+
+config X86_SPEEDSTEP_CENTRINO_ACPI
+ bool "Use ACPI tables to decode valid frequency/voltage pairs"
+ depends on X86_SPEEDSTEP_CENTRINO && ACPI_PROCESSOR
+ depends on !(X86_SPEEDSTEP_CENTRINO = y && ACPI_PROCESSOR = m)
+ default y
+ help
+ Use primarily the information provided in the BIOS ACPI tables
+ to determine valid CPU frequency and voltage pairings. It is
+ required for the driver to work on non-Banias CPUs.
+
+ If in doubt, say Y.
+
+config X86_SPEEDSTEP_CENTRINO_TABLE
+ bool "Built-in tables for Banias CPUs"
+ depends on X86_SPEEDSTEP_CENTRINO
+ default y
+ help
+ Use built-in tables for Banias CPUs if ACPI encoding
+ is not available.
+
+ If in doubt, say N.
+
+config X86_SPEEDSTEP_ICH
+ tristate "Intel Speedstep on ICH-M chipsets (ioport interface)"
+ help
+ This adds the OpPoint support for certain mobile Intel Pentium III
+ (Coppermine), all mobile Intel Pentium III-M (Tualatin) and all
+ mobile Intel Pentium 4 P4-M on systems which have an Intel ICH2,
+ ICH3 or ICH4 southbridge.
+
+ If in doubt, say N.
+
+config X86_SPEEDSTEP_SMI
+ tristate "Intel SpeedStep on 440BX/ZX/MX chipsets (SMI interface)"
+ depends on EXPERIMENTAL
+ help
+ This adds OpPoint support for certain mobile Intel Pentium III
+ (Coppermine), all mobile Intel Pentium III-M (Tualatin)
+ on systems which have an Intel 440BX/ZX/MX southbridge.
+
+ If in doubt, say N.
+
+config X86_P4_CLOCKMOD
+ tristate "Intel Pentium 4 clock modulation"
+ help
+ This adds OpPoint support for Intel Pentium 4 / XEON
+ processors.
+
+ If in doubt, say N.
+
+config X86_OPPOINT_NFORCE2
+ tristate "nVidia nForce2 FSB changing"
+ depends on EXPERIMENTAL
+ help
+ This adds OpPoint support for FSB changing on nVidia nForce2
+ platforms.
+
+ If in doubt, say N.
+
+config X86_LONGRUN
+ tristate "Transmeta LongRun"
+ help
+ This adds OpPoint support for Transmeta Crusoe and Efficeon processors
+ which support LongRun.
+
+ If in doubt, say N.
+
+config X86_LONGHAUL
+ tristate "VIA Cyrix III Longhaul"
+ depends on ACPI_PROCESSOR
+ help
+ This adds OpPoint support for VIA Samuel/CyrixIII,
+ VIA Cyrix Samuel/C3, VIA Cyrix Ezra and VIA Cyrix Ezra-T
+ processors.
+
+ If in doubt, say N.
+
+endif # CONFIG_PM
+
+endmenu
Index: linux-2.6.17/arch/i386/kernel/cpu/power/centrino-on-the-fly.c
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/i386/kernel/cpu/power/centrino-on-the-fly.c
@@ -0,0 +1,72 @@
+/*
+ * power/centrino-on-the-fly.c
+ *
+ * This is the template to create a dynamic operating point.
+ *
+ * Author: David Singleton [email protected] MontaVista Software, Inc.
+ *
+ * 2006 (c) MontaVista Software, Inc. This file is licensed under
+ * the terms of the GNU General Public License version 2. This program
+ * is licensed "as is" without any warranty of any kind, whether express
+ * or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/pm.h>
+#include <linux/cpufreq.h>
+#include <linux/moduleparam.h>
+#include <linux/moduleloader.h>
+
+int centrino_transition(struct oppoint *cur, struct oppoint *new);
+
+static unsigned int frequency = 1000;
+static unsigned int voltage = 1308;
+static unsigned int latency = 100;
+module_param_named(frequency, frequency, uint, 0);
+module_param_named(voltage, voltage, uint, 0);
+module_param_named(latency, latency, uint, 0);
+MODULE_PARM_DESC(frequency, "cpu frequency in kHz");
+MODULE_PARM_DESC(voltage, "cpu voltage in mV");
+MODULE_PARM_DESC(latency, "transition latency in us");
+
+/* Register both the driver and the device */
+
+static struct oppoint dynamic_op = {
+ .type = PM_FREQ_CHANGE,
+ .name = "dynamic",
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = centrino_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+extern void centrino_set_frequency(struct oppoint *op, uint freq, uint volt);
+
+int __init dynamic_oppoint_init(void)
+{
+
+ printk("Dynamic OpPoint operating point for speedstep centrino\n");
+ dynamic_op.frequency = frequency;
+ dynamic_op.voltage = voltage;
+ dynamic_op.latency = latency;
+ centrino_set_frequency(&dynamic_op, frequency / 1000, voltage);
+ register_operating_point(&dynamic_op);
+ printk("OpPoint: freq %d volt %d msr 0x%x\n", dynamic_op.frequency,
+ dynamic_op.voltage, (unsigned int)dynamic_op.md_data);
+ list_add_tail(&dynamic_op.list, &pm_states.list);
+ return 0;
+}
+
+void __exit dynamic_oppoint_cleanup(void)
+{
+ unregister_operating_point(&dynamic_op);
+ list_del_init(&dynamic_op.list);
+}
+
+module_init(dynamic_oppoint_init);
+module_exit(dynamic_oppoint_cleanup);
+
+MODULE_AUTHOR("David Singleton <[email protected]>");
+MODULE_DESCRIPTION("Dynamic OpPoint for Intel Pentium M processor module");
+MODULE_LICENSE("GPL");

>
> thanks,
>
> greg k-h
>

2006-09-14 17:11:53

by David Singleton

Subject: Re: OpPoint summary

>
> Care to resend your patches in the proper format, through email so that
> we can see them, and possibly get some testing in -mm if they look sane?
>
> thanks,
>
> greg k-h
>

Greg,
here's the arm-pxa27x patch, which creates operating points for the much more
complex ARM platform. It illustrates how straightforward it is to create
operating points, both frequency and sleep states, for different and more
complex architectures, and it eliminates the need to have users pass in all
the parameters needed to create an operating point.


Signed-off-by: David Singleton <[email protected]>

arch/arm/Kconfig | 2
arch/arm/mach-pxa/Makefile | 3
arch/arm/mach-pxa/mainstone_freq.c | 211 +++
arch/arm/mach-pxa/mainstone_oppoint.c | 1910 ++++++++++++++++++++++++++++++++++
arch/arm/mach-pxa/mainstone_volt.c | 363 ++++++
include/asm-arm/arch-pxa/oppoint.h | 119 ++
include/asm-arm/arch-pxa/pxa-regs.h | 8
7 files changed, 2614 insertions(+), 2 deletions(-)

Index: linux-2.6.17/arch/arm/mach-pxa/Makefile
===================================================================
--- linux-2.6.17.orig/arch/arm/mach-pxa/Makefile
+++ linux-2.6.17/arch/arm/mach-pxa/Makefile
@@ -10,7 +10,8 @@ obj-$(CONFIG_PXA27x) += pxa27x.o
# Specific board support
obj-$(CONFIG_ARCH_LUBBOCK) += lubbock.o
obj-$(CONFIG_MACH_LOGICPD_PXA270) += lpd270.o
-obj-$(CONFIG_MACH_MAINSTONE) += mainstone.o
+obj-$(CONFIG_MACH_MAINSTONE) += mainstone.o mainstone_oppoint.o \
+mainstone_freq.o mainstone_volt.o
obj-$(CONFIG_ARCH_PXA_IDP) += idp.o
obj-$(CONFIG_MACH_TRIZEPS4) += trizeps4.o
obj-$(CONFIG_PXA_SHARP_C7xx) += corgi.o corgi_ssp.o corgi_lcd.o sharpsl_pm.o corgi_pm.o
Index: linux-2.6.17/arch/arm/mach-pxa/mainstone_freq.c
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/arm/mach-pxa/mainstone_freq.c
@@ -0,0 +1,211 @@
+/*
+ * linux/arch/arm/mach-pxa/mainstone_freq.c
+ *
+ * Functions to change CPU frequencies on the Bulverde processor
+ * adapted from Intel code for MontaVista Linux.
+ *
+ * Author: David Singleton <[email protected]>
+ *
+ * 2006 (c) MontaVista Software, Inc. This file is licensed under the
+ * terms of the GNU General Public License version 2. This program is
+ * licensed "as is" without any warranty of any kind, whether express
+ * or implied.
+ *
+ */
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/hardirq.h>
+#include <asm/io.h>
+
+#include <asm/hardware.h>
+#include <asm/arch/oppoint.h>
+#include <asm/arch/pxa-regs.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+
+#include <asm/arch/mainstone.h>
+
+/*
+ * Since CPDIS and PPDIS are always the same, we use only one definition here.
+ */
+#define PDIS 0 /* Core PLL and Peripheral PLL are enabled after FCS. */
+
+/*
+ * Available CPU frequency list for Bulverde.
+ */
+static unsigned int cpufreq_matrix[N_NUM][L_NUM + 1];
+static void mainstone_freq_debug_info(void);
+static volatile int *ramstart;
+
+/*
+ * Init according to mainstone manual.
+ */
+static void mainstone_initialize_freq_matrix(void)
+{
+ int n, l;
+
+ memset(&cpufreq_matrix, 0, sizeof(cpufreq_matrix));
+
+ for (n = 2; n < N_NUM + 2; n++) {
+ for (l = 2; l <= L_NUM; l++) {
+ cpufreq_matrix[n - 2][l - 2] = (13 * n * l / 2) * 1000;
+ if (cpufreq_matrix[n - 2][l - 2] > BLVD_MAX_FREQ)
+ cpufreq_matrix[n - 2][l - 2] = 0;
+ }
+ }
+}
+
+/*
+ * This should be called with a valid freq point that was
+ * obtained via mainstone_validate_speed
+ */
+void mainstone_set_freq(unsigned int CLKCFGValue)
+{
+ unsigned long flags;
+ unsigned int unused;
+ volatile int v;
+
+ local_irq_save(flags);
+
+ /*
+ * force a tlb fault to get the mapping into the tlb
+ * (otherwise this will occur below when the sdram is turned off and
+ * something-bad(tm) will happen)
+ */
+ v = *(volatile unsigned long *)ramstart;
+ *(volatile unsigned long *)ramstart = v;
+
+ MST_LEDDAT1 = CLKCFGValue;
+
+ __asm__ __volatile__(" \n\
+ ldr r4, [%1] @load MDREFR \n\
+ mcr p14, 0, %2, c6, c0, 0 @ set CCLKCFG[FCS] \n\
+ ldr r5, =0xe3dfefff \n\
+ and r4, r4, r5 \n\
+ str r4, [%1] @restore \n\
+ ":"=&r"(unused)
+ :"r"(&MDREFR), "r"(CLKCFGValue), "r"(ramstart)
+ :"r4", "r5");
+
+ MST_LEDDAT1 = 0x0002;
+ /*
+ NOTE: if we don't turn off IRQs up top, there is no point
+ to restoring them here.
+ */
+ local_irq_restore(flags);
+
+ /* spit out some info about what happened */
+ mainstone_freq_debug_info();
+
+}
+
+extern void mainstone_get_current_info(struct md_opt *);
+
+static void mainstone_freq_debug_info(void)
+{
+ unsigned int sysbus, run, t, turbo, mem, m = 1;
+ struct md_opt opt;
+
+ mainstone_get_current_info(&opt);
+
+ run = 13000 * opt.l;
+ turbo = (13000 * opt.l * opt.n) >> 1;
+ sysbus = (opt.b) ? run : (run / 2);
+ t = opt.regs.clkcfg & 0x1;
+
+ /* If CCCR[A] is on */
+ if (opt.cccra) {
+ mem = sysbus;
+ } else {
+ /* If A=0
+ m initialized to 1 (for l=2-10)
+ */
+ if (opt.l > 10)
+ m = 2; /* for l=11-20 */
+ if (opt.l > 20)
+ m = 4; /* for l=21-31 */
+ mem = run / m;
+ }
+}
+
+int mainstone_get_freq(void)
+{
+ unsigned int freq, n, l, ccsr;
+
+ ccsr = CCSR;
+
+ l = ccsr & CCCR_L_MASK; /* Get L */
+ n = (ccsr & CCCR_N_MASK) >> 7; /* Get 2N */
+
+ if (n < 2)
+ n = 2;
+
+ /* Shift to divide by 2 because N is really 2N */
+ freq = (13000 * l * n) >> 1; /* in kHz */
+
+ return freq;
+}
+
+unsigned int mainstone_read_clkcfg(void)
+{
+ unsigned int value = 0;
+ unsigned int un_used;
+
+ __asm__ __volatile__("mrc p14, 0, %0, c6, c0, 0": "=r"(value) : "r"(un_used) );
+
+ return value;
+}
+
+static int mainstone_init_freqs(void)
+{
+ int cpu_ver;
+
+ asm volatile ("mrc%? p15, 0, %0, c0, c0":"=r" (cpu_ver));
+
+ /*
+ Bulverde A0: 0x69054110,
+ A1 : 0x69054111
+ */
+ if ((cpu_ver & 0x0000f000) >> 12 == 4 &&
+ (cpu_ver & 0xffff0000) >> 16 == 0x6905) {
+ /* It is an xscale core bulverde chip. */
+ return 1;
+ }
+
+ return 0;
+}
+
+int mainstone_clk_init(void)
+{
+ unsigned int freq;
+
+ /*
+ * In order to turn the sdram back on (see below) we need to
+ * r/w the sdram. We need to do this without the cache and
+ * write buffer in the way. So, we temporarily ioremap the
+ * first page of sdram as uncached i/o memory and use the
+ * aliased address
+ */
+
+ /* map the first page of sdram to an uncached virtual page */
+ ramstart = (int *)ioremap(PHYS_OFFSET, 4096);
+
+ freq = mainstone_get_freq(); /* in kHz */
+ printk(KERN_INFO "Init freq: %dkHz.\n", freq);
+
+ mainstone_initialize_freq_matrix();
+
+ if (mainstone_init_freqs()) {
+ printk(KERN_INFO "CPU frequency change initialized.\n");
+ }
+ return 0;
+}
+
+void mainstone_freq_cleanup(void)
+{
+ /* unmap the page we used*/
+ iounmap((void *)ramstart);
+}
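
For readers following the arithmetic in mainstone_get_freq() and
mainstone_freq_debug_info() above, the clock derivation can be sketched in
standalone C. This is only an illustration under assumptions: the helper
names pxa27x_turbo_khz and pxa27x_mem_khz are hypothetical and not part of
the patch, the 13 MHz oscillator constant is taken from the code above, and
the memory-clock branches follow the debug helper rather than the processor
manual.

```c
/* Hypothetical helper, not part of the patch: turbo-mode core clock in kHz.
 * l is the run-mode multiplier (the L field); n2 is the 2N field, so the
 * product is halved, as in mainstone_get_freq(). */
static unsigned int pxa27x_turbo_khz(unsigned int l, unsigned int n2)
{
	if (n2 < 2)
		n2 = 2;		/* same clamp as mainstone_get_freq() */
	return (13000u * l * n2) >> 1;
}

/* Hypothetical helper: memory clock in kHz, following the branches in
 * mainstone_freq_debug_info(). a is the CCCR[A] bit, b the fast-bus bit. */
static unsigned int pxa27x_mem_khz(unsigned int l, int a, int b)
{
	unsigned int run = 13000u * l;	/* run-mode clock */
	unsigned int m = 1;		/* divider for l = 2..10 */

	if (a)
		return b ? run : run / 2;	/* memory follows the system bus */
	if (l > 10)
		m = 2;				/* l = 11..20 */
	if (l > 20)
		m = 4;				/* l = 21..31 */
	return run / m;
}
```

With l = 16 and 2N = 5 the first helper yields 520000 kHz, and with l = 16,
A = 1, B = 1 the second yields 208000 kHz.
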
Index: linux-2.6.17/arch/arm/mach-pxa/mainstone_volt.c
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/arm/mach-pxa/mainstone_volt.c
@@ -0,0 +1,363 @@
+/*
+ * linux/arch/arm/mach-pxa/mainstone_volt.c
+ *
+ * Bulverde voltage change driver.
+ *
+ * Author: David Singleton [email protected] MontaVista Software, Inc.
+ *
+ * 2006 (c) MontaVista Software, Inc. This file is licensed under
+ * the terms of the GNU General Public License version 2. This program
+ * is licensed "as is" without any warranty of any kind, whether express
+ * or implied.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <asm/hardware.h>
+#include <asm/arch/oppoint.h>
+#include <asm/arch/pxa-regs.h>
+
+/* For ioremap */
+#include <asm/io.h>
+
+#define CP15R0_REV_MASK 0x0000000f
+#define PXA270_C5 0x7
+
+static u32 chiprev;
+static unsigned int blvd_min_vol, blvd_max_vol;
+static int mvdt_size;
+
+static volatile int *ramstart;
+
+struct MvDAC {
+ unsigned int mv;
+ unsigned int DACIn;
+} *mvDACtable;
+
+/*
+ * Transfer desired mv to required DAC value.
+ * Vcore = 1.3v - ( 712uv * DACIn )
+ */
+static struct MvDAC table_c0[] = {
+ {1425, 0},
+ {1400, 69},
+ {1300, 248},
+ {1200, 428},
+ {1100, 601},
+ {1000, 777},
+ {950, 872},
+ {868, 1010},
+ {861, 0xFFFFFFFF},
+};
+
+/*
+ * Transfer desired mv to required DAC value, update for new boards,
+ * according to "Intel PXA27x Processor Developer's Kit User's Guide,
+ * April 2004, Revision 4.001"
+ * Vcore = 1.5V - (587uV * DAC(input)).
+ */
+static struct MvDAC table_c5[] = {
+ {1500, 0},
+ {1484,25},
+ {1471,50},
+ {1456,75},
+ {1441,100},
+ {1427,125},
+ {1412,150},
+ {1397,175},
+ {1383,200},
+ {1368,225},
+ {1353,250},
+ {1339,275},
+ {1323,300},
+ {1309,325},
+ {1294,350},
+ {1280,375},
+ {1265,400},
+ {1251,425},
+ {1236,450},
+ {1221,475},
+ {1207,500},
+ {1192,525},
+ {1177,550},
+ {1162,575},
+ {1148,600},
+ {1133,625},
+ {1118,650},
+ {1104,675},
+ {1089,700},
+ {1074,725},
+ {1060,750},
+ {1045,775},
+ {1030,800},
+ {1016,825},
+ {1001,850},
+ {986,875},
+ {972,900},
+ {957,925},
+ {942,950},
+ {928,975},
+ {913,1000},
+ {899, 1023},
+};
+
+unsigned int mainstone_validate_voltage(unsigned int mv)
+{
+ /*
+ * Just to check whether user specified mv
+ * can be set to the CPU.
+ */
+ if ((mv >= blvd_min_vol) && (mv <= blvd_max_vol)) {
+ return mv;
+ } else {
+ return 0;
+ }
+}
+
+/*
+ * Prepare for a voltage change, possibly coupled with a frequency
+ * change
+ */
+static void power_change_cmd(unsigned int DACValue, int coupled);
+void mainstone_prep_set_voltage(unsigned int mv)
+{
+ power_change_cmd(mv2DAC(mv), 1 /* coupled */ );
+}
+
+
+unsigned int mv2DAC(unsigned int mv)
+{
+ int i, num = mvdt_size;
+
+ if (mvDACtable[0].mv <= mv) { /* Max or bigger */
+ /* Return the first one */
+ return mvDACtable[0].DACIn;
+ }
+
+ if (mvDACtable[num - 1].mv >= mv) { /* Min or smaller */
+ /* Return the last one */
+ return mvDACtable[num - 1].DACIn;
+ }
+
+ /*
+ * Values at or beyond either end of the table are handled above;
+ * scan adjacent pairs for the entry at or just above mv.
+ */
+ for (i = 0; i < (num - 1); i++) {
+ if ((mvDACtable[i].mv >= mv) && (mvDACtable[i + 1].mv < mv)) {
+ return mvDACtable[i].DACIn;
+ }
+ }
+
+ /* Should never get here */
+ return 0;
+}
+extern void mainstone_change_voltage(void);
+void vm_setvoltage(unsigned int DACValue)
+{
+ power_change_cmd(DACValue, 0 /* not-coupled */ );
+ /* Execute voltage change sequence */
+ mainstone_change_voltage(); /* set VC on the PWRMODE on CP14 */
+}
+/*
+ * According to bulverde's manual, set the core's voltage.
+ */
+void mainstone_set_voltage(unsigned int mv)
+{
+ vm_setvoltage(mv2DAC(mv));
+}
+
+/*
+ * Functionality: Initialize PWR I2C.
+ * Argument: None
+ * Return: 0 (this function cannot currently fail)
+*/
+int mainstone_vcs_init(void)
+{
+ /*
+ * we distinguish new and old boards by proc chip
+ * revision, we assume new boards have C5 proc
+ * revision and we use the new table (table_c5) for them,
+ * for all other boards we use the old table (table_c0).
+ * Note: this logic breaks, and an inaccurate voltage
+ * will be set, if a C5 processor is installed on an
+ * old board or vice versa.
+ */
+
+ asm("mrc%? p15, 0, %0, c0, c0" : "=r" (chiprev));
+
+ chiprev &= CP15R0_REV_MASK;
+
+ if ( chiprev == PXA270_C5 ) {
+ mvDACtable = table_c5;
+ mvdt_size = sizeof(table_c5) / sizeof(struct MvDAC);
+ blvd_min_vol = BLVD_MIN_VOL_C5;
+ blvd_max_vol = BLVD_MAX_VOL_C5;
+ } else {
+ mvDACtable = table_c0;
+ mvdt_size = sizeof(table_c0) / sizeof(struct MvDAC);
+ blvd_min_vol = BLVD_MIN_VOL_C0;
+ blvd_max_vol = BLVD_MAX_VOL_C0;
+ }
+
+ CKEN |= 0x1 << 15;
+ CKEN |= 0x1 << 14;
+ PCFR |= 0x60;
+
+ /* map the first page of sdram to an uncached virtual page */
+ ramstart = (int *)ioremap(PHYS_OFFSET, 4096);
+
+ printk(KERN_INFO "CPU voltage change initialized.\n");
+
+ return 0;
+}
+
+void mainstone_voltage_cleanup(void)
+{
+ /* unmap the page we used*/
+ iounmap((void *)ramstart);
+}
+
+
+void mainstone_change_voltage(void)
+{
+ unsigned long flags;
+ unsigned int unused;
+
+
+ local_irq_save(flags);
+
+ __asm__ __volatile__("\n\
+ @ BV B0 WORKAROUND - Core hangs on voltage change at different\n\
+ @ alignments and at different core clock frequencies\n\
+ @ To ensure that no external fetches occur, we want to store the next\n\
+ @ several instructions that occur after the voltage change inside\n\
+ @ the cache. The load dependency stall near the retry label ensures \n\
+ @ that any outstanding instruction cacheline loads are complete before \n\
+ @ the mcr instruction is executed on the 2nd pass. This procedure \n\
+ @ ensures us that the internal bus will not be busy. \n\
+ \n\
+ b 2f \n\
+ nop \n\
+ .align 5 \n\
+2: \n\
+ ldr r0, [%1] @ APB register read and compare \n\
+ cmp r0, #0 @ fence for pending slow apb reads \n\
+ \n\
+ mov r0, #8 @ VC bit for PWRMODE \n\
+ movs r1, #1 @ don't execute mcr on 1st pass \n\
+ \n\
+ @ %2 points to uncacheable memory to force memory read \n\
+ \n\
+retry: \n\
+ ldreq r3, [%2] @ only stall on the 2nd pass\n\
+ cmpeq r3, #0 @ cmp causes fence on mem transfers\n\
+ cmp r1, #0 @ is this the 2nd pass? \n\
+ mcreq p14, 0, r0, c7, c0, 0 @ write to PWRMODE on 2nd pass only \n\
+ \n\
+ @ Read VC bit until it is 0, indicates that the VoltageChange is done.\n\
+ @ On first pass, we never set the VC bit, so it will be clear already.\n\
+ \n\
+VoltageChange_loop: \n\
+ mrc p14, 0, r3, c7, c0, 0 \n\
+ tst r3, #0x8 \n\
+ bne VoltageChange_loop \n\
+ \n\
+ subs r1, r1, #1 @ update conditional execution counter\n\
+ beq retry":"=&r"(unused)
+ :"r"(&CCCR), "r"(ramstart)
+ :"r0", "r1", "r3");
+
+ local_irq_restore(flags);
+
+}
+
+static void clr_all_sqc(void)
+{
+ int i = 0;
+ for (i = 0; i < 32; i++)
+ PCMD(i) &= ~PCMD_SQC;
+}
+
+static void clr_all_mbc(void)
+{
+ int i = 0;
+ for (i = 0; i < 32; i++)
+ PCMD(i) &= ~PCMD_MBC;
+}
+
+static void clr_all_dce(void)
+{
+ int i = 0;
+ for (i = 0; i < 32; i++)
+ PCMD(i) &= ~PCMD_DCE;
+}
+
+static void set_mbc_bit(int ReadPointer, int NumOfBytes)
+{
+ PCMD0 |= PCMD_MBC;
+ PCMD1 |= PCMD_MBC;
+}
+
+static void set_lc_bit(int ReadPointer, int NumOfBytes)
+{
+ PCMD0 |= PCMD_LC;
+ PCMD1 |= PCMD_LC;
+ PCMD2 |= PCMD_LC;
+}
+
+static void set_cmd_data(unsigned char *DataArray, int StartPoint, int size)
+{
+ PCMD0 &= 0xFFFFFF00;
+ PCMD0 |= DataArray[0];
+ PCMD1 &= 0xFFFFFF00;
+ PCMD1 |= DataArray[1];
+ PCMD2 &= 0xFFFFFF00;
+ PCMD2 |= DataArray[2];
+}
+
+/* coupled indicates that this VCS is to be coupled with a FCS */
+static void power_change_cmd(unsigned int DACValue, int coupled)
+{
+ unsigned char dataArray[3];
+
+ dataArray[0] = 0; /* Command 0 */
+ dataArray[1] = (DACValue & 0x000000FF); /* data LSB */
+ dataArray[2] = (DACValue & 0x0000FF00) >> 8; /* data MSB */
+
+ PVCR = 0;
+
+ PCFR &= ~PCFR_FVC;
+ PVCR &= 0xFFFFF07F; /* no delay is necessary */
+ PVCR &= 0xFFFFFF80; /* clear slave address */
+ PVCR |= 0x20; /* set slave address */
+
+ PVCR &= 0xFE0FFFFF; /* clear read pointer 0 */
+ PVCR |= 0;
+
+ /* DCE and SQC are not necessary for single command */
+ clr_all_sqc();
+ clr_all_dce();
+
+ clr_all_mbc();
+ set_mbc_bit(0, 2);
+
+ /* indicate the last byte of this command is held in this register */
+ PCMD2 &= ~PCMD_MBC;
+
+ /* indicate this is the first command and last command also */
+ set_lc_bit(0, 3);
+
+ /* programming the command data bit */
+ set_cmd_data(dataArray, 0, 2);
+
+ /* Enable Power I2C */
+ PCFR |= PCFR_PI2CEN;
+
+ if (coupled) {
+ /* Enable Power I2C and FVC */
+ PCFR |= PCFR_FVC;
+ }
+}
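
The millivolt-to-DAC lookup in mainstone_volt.c can be sketched as a small
standalone function. This is a minimal sketch, not the patch's code: the
names struct mv_dac and mv_to_dac are hypothetical, and it assumes the table
is sorted by descending voltage, as table_c0 and table_c5 are. The scan
covers every adjacent pair of entries and never reads past the last one.

```c
#include <stddef.h>

struct mv_dac {
	unsigned int mv;	/* core voltage in mV */
	unsigned int dac;	/* DAC input that produces it */
};

/* Hypothetical stand-in for mv2DAC(): return the DAC value of the lowest
 * table entry whose voltage is still >= mv, clamping requests that fall
 * beyond either end of the table. */
static unsigned int mv_to_dac(const struct mv_dac *tbl, size_t n,
			      unsigned int mv)
{
	size_t i;

	if (mv >= tbl[0].mv)		/* max or bigger */
		return tbl[0].dac;
	if (mv <= tbl[n - 1].mv)	/* min or smaller */
		return tbl[n - 1].dac;

	for (i = 0; i + 1 < n; i++) {
		if (tbl[i].mv >= mv && tbl[i + 1].mv < mv)
			return tbl[i].dac;
	}
	return tbl[n - 1].dac;	/* not reached for a sorted table */
}
```

Choosing the entry at or above the requested value means the core is never
given less voltage than was asked for; whether that is the right policy for
a given board is a design decision this sketch does not settle.
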
Index: linux-2.6.17/include/asm-arm/arch-pxa/pxa-regs.h
===================================================================
--- linux-2.6.17.orig/include/asm-arm/arch-pxa/pxa-regs.h
+++ linux-2.6.17/include/asm-arm/arch-pxa/pxa-regs.h
@@ -2042,6 +2042,14 @@
#define MDMRS __REG(0x48000040) /* MRS value to be written to SDRAM */
#define BOOT_DEF __REG(0x48000044) /* Read-Only Boot-Time Register. Contains BOOT_SEL and PKG_SEL */

+#define MDREFR_DRI 0xFFF
+#define MSC0_RDF (0xF << 20)
+#define MSC0_RDN (0xF << 24)
+#define MSC0_RRR (0x7 << 12)
+#define MDREFR_RFU 0xC0200000
+#define MDCNFG_DTC0 (0x3 << 8)
+#define MDCNFG_DTC2 (0x3 << 24)
+
/*
* More handy macros for PCMCIA
*
Index: linux-2.6.17/include/asm-arm/arch-pxa/oppoint.h
===================================================================
--- /dev/null
+++ linux-2.6.17/include/asm-arm/arch-pxa/oppoint.h
@@ -0,0 +1,119 @@
+/*
+ * include/asm-arm/arch-pxa/oppoint.h
+ *
+ * Bulverde-specific definitions for OpPoint. If further PXA boards are
+ * supported in the future, will split into board-specific files.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright (C) 2002, 2006 MontaVista Software <[email protected]>
+ *
+ * Based on arch/ppc/platforms/ibm405lp_dpm.h by Bishop Brock.
+ */
+
+#ifndef __ASM_ARM_PXA_POWEROP_H__
+#define __ASM_ARM_PXA_POWEROP_H__
+
+#ifdef __KERNEL__
+#ifndef __ASSEMBLER__
+#include <linux/config.h>
+#include <linux/notifier.h>
+#include <linux/ioctl.h>
+#include <linux/miscdevice.h>
+
+#define CCLKCFG_TURBO 0x1
+#define CCLKCFG_FCS 0x2
+
+#define L_NUM 31 /* 30 different L numbers. */
+#define N_NUM 7 /* 7 N numbers. */
+
+#define BLVD_MIN_FREQ 13000
+/* latest PowerPoint documentation indicates 624000, not 540000*/
+#define BLVD_MAX_FREQ 624000
+
+void mainstone_set_freq(unsigned int);
+
+int mainstone_clk_init(void);
+unsigned int mainstone_read_clkcfg(void);
+void mainstone_freq_cleanup(void);
+
+#define BLVD_MAX_VOL_C5 1500 /* in mV. */
+#define BLVD_MIN_VOL_C5 899 /* in mV. */
+#define BLVD_MAX_VOL_C0 1400 /* in mV. */
+#define BLVD_MIN_VOL_C0 850 /* in mV. */
+
+unsigned int mv2DAC(unsigned int mv);
+void vm_setvoltage(unsigned int);
+unsigned int mainstone_validate_voltage(unsigned int mv);
+
+void mainstone_set_voltage(unsigned int mv);
+void mainstone_prep_set_voltage(unsigned int mv);
+int mainstone_vcs_init(void);
+
+void mainstone_set_freq(unsigned int);
+
+int mainstone_clk_init(void);
+unsigned int mainstone_read_clkcfg(void);
+void mainstone_freq_cleanup(void);
+
+enum {
+ CPUMODE_RUN,
+ CPUMODE_IDLE,
+ CPUMODE_STANDBY,
+ CPUMODE_SLEEP,
+ CPUMODE_RESERVED,
+ CPUMODE_SENSE,
+ CPUMODE_RESERVED2,
+ CPUMODE_DEEPSLEEP,
+};
+
+#define PM_SUSPEND_DEEP ((__force suspend_state_t) 2)
+
+struct oppoint_regs {
+ unsigned int cccr;
+ unsigned int clkcfg;
+ unsigned int voltage; /*This is not a register.*/
+};
+
+/*
+ * Instances of this structure define valid Bulverde operating points for DPM.
+ * Voltages are represented in mV, and frequencies are represented in KHz.
+ */
+
+struct md_opt {
+ /* Specified values */
+ int v; /* Target voltage in mV*/
+ int l; /* Run Mode to Oscillator ratio */
+ int n; /* Turbo-Mode to Run-Mode ratio */
+ int b; /* Fast Bus Mode */
+ int half_turbo;/* Half Turbo bit */
+ int cccra; /* the 'A' bit of the CCCR register,
+ alternate MEMC clock */
+ int cpll_enabled; /* core PLL is ON? (Bulverde >="C0" feature)*/
+ int ppll_enabled; /* peripheral PLL is ON? (Bulverde >="C0" feature)*/
+
+ int sleep_mode;
+ /*Calculated values*/
+ unsigned int lcd; /*in KHz */
+ unsigned int lpj; /*New value for loops_per_jiffy */
+ unsigned int cpu; /*CPU frequency in KHz */
+ unsigned int turbo; /* Turbo bit in clkcfg */
+
+ struct oppoint_regs regs; /* Register values */
+};
+
+#endif /* __ASSEMBLER__ */
+#endif /* __KERNEL__ */
+#endif
Index: linux-2.6.17/arch/arm/mach-pxa/mainstone_oppoint.c
===================================================================
--- /dev/null
+++ linux-2.6.17/arch/arm/mach-pxa/mainstone_oppoint.c
@@ -0,0 +1,1910 @@
+/*
+ * arch/arm/mach-pxa/mainstone_oppoint.c PM support for Intel PXA27x
+ *
+ * Author: <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright (C) 2002, 2005, 2006 MontaVista Software <[email protected]>.
+ *
+ * Based on code by Matthew Locke, Dmitry Chigirev, and Bishop Brock.
+ */
+#define DEBUG
+#include <linux/config.h>
+
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/pm.h>
+
+#include <linux/proc_fs.h>
+#include <linux/delay.h>
+#include <linux/cpufreq.h>
+
+#include <asm/uaccess.h>
+
+#include <asm/hardware.h>
+#include <asm/arch/oppoint.h>
+#include <asm/arch/pxa-regs.h>
+
+static int saved_loops_per_jiffy = 0;
+static int saved_cpu_freq = 0;
+
+#define BULVERDE_DEFAULT_VOLTAGE 1400
+
+#define FSCALER_NOP 0
+#define FSCALER_CPUFREQ (1 << 0)
+#define FSCALER_SLEEP (1 << 1)
+#define FSCALER_STANDBY (1 << 2)
+#define FSCALER_DEEPSLEEP (1 << 3)
+#define FSCALER_WAKEUP (1 << 4)
+#define FSCALER_VOLTAGE (1 << 5)
+#define FSCALER_XPLLON (1 << 6)
+#define FSCALER_HALFTURBO_ON (1 << 7)
+#define FSCALER_HALFTURBO_OFF (1 << 8)
+#define FSCALER_TURBO_ON (1 << 9)
+#define FSCALER_TURBO_OFF (1 << 10)
+
+#define FSCALER_TURBO (FSCALER_TURBO_ON | FSCALER_TURBO_OFF)
+
+#define FSCALER_ANY_SLEEPMODE (FSCALER_SLEEP | \
+ FSCALER_STANDBY | \
+ FSCALER_DEEPSLEEP)
+
+#define CCCR_CPDIS_BIT_ON (1 << 31)
+#define CCCR_PPDIS_BIT_ON (1 << 30)
+#define CCCR_CPDIS_BIT_OFF (0 << 31)
+#define CCCR_PPDIS_BIT_OFF (0 << 30)
+#define CCCR_PLL_EARLY_EN_BIT_ON (1 << 26)
+#define CCSR_CPLL_LOCKED (1 << 29)
+#define CCSR_PPLL_LOCKED (1 << 28)
+
+/* CLKCFG
+ | 31------------------------------------------- | 3 | 2 | 1 | 0 |
+ | --------------------------------------------- | B | HT | F | T |
+*/
+#define CLKCFG_B_BIT (1 << 3)
+#define CLKCFG_HT_BIT (1 << 2)
+#define CLKCFG_F_BIT (1 << 1)
+#define CLKCFG_T_BIT 1
+
+/* Initialize the machine-dependent operating point from a list of parameters,
+ which has already been installed in the pp field of the operating point.
+ Some of the parameters may be specified with a value of -1 to indicate a
+ default value. */
+
+#define PLL_L_MAX 31
+#define PLL_N_MAX 8
+
+/* The MIN for L is 2 in the Yellow Book tables, but L=1 really means
+ 13M mode, so L min includes 1 */
+#define PLL_L_MIN 1
+#define PLL_N_MIN 2
+
+/* memory timing (MSC0,DTC,DRI) constants (see Blob and Intel BBU sources) */
+#define XLLI_MSC0_13 0x11101110
+#define XLLI_MSC0_19 0x11101110
+#define XLLI_MSC0_26 0x11201120 /* 26 MHz setting */
+#define XLLI_MSC0_32 0x11201120
+#define XLLI_MSC0_39 0x11301130 /* 39 MHz setting */
+#define XLLI_MSC0_45 0x11301130
+#define XLLI_MSC0_52 0x11401140 /* @ 52 MHz setting */
+#define XLLI_MSC0_58 0x11401140
+#define XLLI_MSC0_65 0x11501150 /* @ 65 MHz setting */
+#define XLLI_MSC0_68 0x11501150
+#define XLLI_MSC0_71 0x11501150 /* @ 71.5 MHz setting */
+#define XLLI_MSC0_74 0x11601160
+#define XLLI_MSC0_78 0x12601260 /* @ 78 MHz setting */
+#define XLLI_MSC0_81 0x12601260
+#define XLLI_MSC0_84 0x12601260 /* @ 84.5 MHz setting */
+#define XLLI_MSC0_87 0x12701270
+#define XLLI_MSC0_91 0x12701270 /* 91 MHz setting */
+#define XLLI_MSC0_94 0x12701270 /* 94.2 MHz setting */
+#define XLLI_MSC0_97 0x12701270 /* 97.5 MHz setting */
+#define XLLI_MSC0_100 0x12801280 /* 100.7 MHz setting */
+#define XLLI_MSC0_104 0x12801280 /* 104 MHz setting */
+#define XLLI_MSC0_110 0x12901290
+#define XLLI_MSC0_117 0x13901390 /* 117 MHz setting */
+#define XLLI_MSC0_124 0x13A013A0
+#define XLLI_MSC0_130 0x13A013A0 /* 130 MHz setting */
+#define XLLI_MSC0_136 0x13B013B0
+#define XLLI_MSC0_143 0x13B013B0
+#define XLLI_MSC0_149 0x13C013C0
+#define XLLI_MSC0_156 0x14C014C0
+#define XLLI_MSC0_162 0x14C014C0
+#define XLLI_MSC0_169 0x14C014C0
+#define XLLI_MSC0_175 0x14C014C0
+#define XLLI_MSC0_182 0x14C014C0
+#define XLLI_MSC0_188 0x14C014C0
+#define XLLI_MSC0_195 0x15C015C0
+#define XLLI_MSC0_201 0x15D015D0
+#define XLLI_MSC0_208 0x15D015D0
+
+/* DTC settings depend on 16/32 bit SDRAM we have (32 is chosen) */
+#define XLLI_DTC_13 0x00000000
+#define XLLI_DTC_19 0x00000000
+#define XLLI_DTC_26 0x00000000
+#define XLLI_DTC_32 0x00000000
+#define XLLI_DTC_39 0x00000000
+#define XLLI_DTC_45 0x00000000
+#define XLLI_DTC_52 0x00000000
+#define XLLI_DTC_58 0x01000100
+#define XLLI_DTC_65 0x01000100
+#define XLLI_DTC_68 0x01000100
+#define XLLI_DTC_71 0x01000100
+#define XLLI_DTC_74 0x01000100
+#define XLLI_DTC_78 0x01000100
+#define XLLI_DTC_81 0x01000100
+#define XLLI_DTC_84 0x01000100
+#define XLLI_DTC_87 0x01000100
+#define XLLI_DTC_91 0x02000200
+#define XLLI_DTC_94 0x02000200
+#define XLLI_DTC_97 0x02000200
+#define XLLI_DTC_100 0x02000200
+#define XLLI_DTC_104 0x02000200
+/* 110-208 MHz setting - SDCLK Halved*/
+#define XLLI_DTC_110 0x01000100
+#define XLLI_DTC_117 0x01000100
+#define XLLI_DTC_124 0x01000100
+#define XLLI_DTC_130 0x01000100
+#define XLLI_DTC_136 0x01000100
+#define XLLI_DTC_143 0x01000100
+#define XLLI_DTC_149 0x01000100
+#define XLLI_DTC_156 0x01000100
+#define XLLI_DTC_162 0x01000100
+#define XLLI_DTC_169 0x01000100
+#define XLLI_DTC_175 0x01000100
+/* 182-208 MHz setting - SDCLK Halved - Close to edge, so bump up */
+#define XLLI_DTC_182 0x02000200
+#define XLLI_DTC_188 0x02000200
+#define XLLI_DTC_195 0x02000200
+#define XLLI_DTC_201 0x02000200
+#define XLLI_DTC_208 0x02000200
+
+/* Optimal values for DRI (refresh interval) settings for
+ * various MemClk settings (MDREFR)
+ */
+#define XLLI_DRI_13 0x002
+#define XLLI_DRI_19 0x003
+#define XLLI_DRI_26 0x005
+#define XLLI_DRI_32 0x006
+#define XLLI_DRI_39 0x008
+#define XLLI_DRI_45 0x00A
+#define XLLI_DRI_52 0x00B
+#define XLLI_DRI_58 0x00D
+#define XLLI_DRI_65 0x00E
+#define XLLI_DRI_68 0x00F
+#define XLLI_DRI_71 0x010
+#define XLLI_DRI_74 0x011
+#define XLLI_DRI_78 0x012
+#define XLLI_DRI_81 0x012
+#define XLLI_DRI_84 0x013
+#define XLLI_DRI_87 0x014
+#define XLLI_DRI_91 0x015
+#define XLLI_DRI_94 0x016
+#define XLLI_DRI_97 0x016
+#define XLLI_DRI_100 0x017
+#define XLLI_DRI_104 0x018
+#define XLLI_DRI_110 0x01A
+#define XLLI_DRI_117 0x01B
+#define XLLI_DRI_124 0x01D
+#define XLLI_DRI_130 0x01E
+#define XLLI_DRI_136 0x020
+#define XLLI_DRI_143 0x021
+#define XLLI_DRI_149 0x023
+#define XLLI_DRI_156 0x025
+#define XLLI_DRI_162 0x026
+#define XLLI_DRI_169 0x028
+#define XLLI_DRI_175 0x029
+#define XLLI_DRI_182 0x02B
+#define XLLI_DRI_188 0x02D
+#define XLLI_DRI_195 0x02E
+#define XLLI_DRI_201 0x030
+#define XLLI_DRI_208 0x031
+
+
+
+/* timings for memory controller set up (masked values) */
+struct mem_timings{
+ unsigned int msc0; /* for MSC0 */
+ unsigned int dtc; /* for MDCNFG */
+ unsigned int dri; /* for MDREFR */
+};
+
+static int pxa27x_transition(struct oppoint *cur, struct oppoint *new);
+
+static struct oppoint m13 = {
+ .name = "13m",
+ .type = PM_FREQ_CHANGE,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint mext13 = {
+ .name = "13mext",
+ .type = PM_FREQ_CHANGE,
+ .voltage = 1500,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint sleep = {
+ .name = "sleep",
+ .type = PM_FREQ_CHANGE,
+ .voltage = 1500,
+ .frequency = 0,
+ .latency = 150,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint deepsleep = {
+ .name = "deepsleep",
+ .type = PM_FREQ_CHANGE,
+ .voltage = 1500,
+ .frequency = 0,
+ .latency = 1000,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint standby = {
+ .name = "pxastandby",
+ .type = PM_FREQ_CHANGE,
+ .voltage = 1500,
+ .frequency = 0,
+ .latency = 150,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint lowest = {
+ .name = "lowest",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint low = {
+ .name = "low",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint mediumlow = {
+ .name = "mediumlow",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint medium = {
+ .name = "medium",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint mediumhigh = {
+ .name = "mediumhigh",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint high = {
+ .name = "high",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct oppoint highest = {
+ .name = "highest",
+ .type = PM_FREQ_CHANGE,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 100,
+ .prepare_transition = cpufreq_prepare_transition,
+ .transition = pxa27x_transition,
+ .finish_transition = cpufreq_finish_transition,
+};
+
+static struct md_opt mhz104 = {
+ .v = 900,
+ .l = 8,
+ .n = 2,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 1,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz208 = {
+ .v = 1050,
+ .l = 16,
+ .n = 2,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 0,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz312 = {
+ .v = 1250,
+ .l = 16,
+ .n = 3,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 0,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz208hi = {
+ .v = 1150,
+ .l = 16,
+ .n = 2,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 1,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz312hi = {
+ .v = 1250,
+ .l = 16,
+ .n = 3,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 1,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz416hi = {
+ .v = 1350,
+ .l = 16,
+ .n = 4,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 1,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt mhz520hi = {
+ .v = 1450,
+ .l = 16,
+ .n = 5,
+ .b = 1,
+ .half_turbo = 0,
+ .cccra = 1,
+ .cpll_enabled = 1,
+ .ppll_enabled = 1,
+ .sleep_mode = 0,
+};
+
+static struct md_opt md_sleep = {
+ .v = 1500,
+ .l = 0,
+ .n = 0,
+ .b = 0,
+ .half_turbo = -1,
+ .cccra = 0,
+ .cpll_enabled = 0,
+ .ppll_enabled = 0,
+ .sleep_mode = 3,
+};
+
+static struct md_opt md_deepsleep = {
+ .v = 1500,
+ .l = 0,
+ .n = 0,
+ .b = 0,
+ .half_turbo = -1,
+ .cccra = 0,
+ .cpll_enabled = 0,
+ .ppll_enabled = 0,
+ .sleep_mode = 7,
+};
+
+static struct md_opt md_standby = {
+ .v = 1500,
+ .l = 1,
+ .n = 2,
+ .b = 0,
+ .half_turbo = -1,
+ .cccra = 0,
+ .cpll_enabled = 0,
+ .ppll_enabled = 0,
+ .sleep_mode = -1,
+};
+
+static struct md_opt md_13m = {
+ .v = 1500,
+ .l = 1,
+ .n = 2,
+ .b = 0,
+ .half_turbo = -1,
+ .cccra = 0,
+ .cpll_enabled = 0,
+ .ppll_enabled = 0,
+ .sleep_mode = -1,
+};
+
+static struct md_opt md_13mext = {
+ .v = 1500,
+ .l = 1,
+ .n = 2,
+ .b = 0,
+ .half_turbo = -1,
+ .cccra = 0,
+ .cpll_enabled = 0,
+ .ppll_enabled = 0,
+ .sleep_mode = -1,
+};
+
+static void
+mainstone_setup_opt(struct oppoint *op, uint freq, struct md_opt *md)
+{
+ op->frequency = freq * 1000;
+ op->md_data = (void *)md;
+ op->voltage = md->v;
+}
+
+static unsigned int fscaler_flags = 0;
+
+void mainstone_fully_define_opt(struct oppoint *cur,
+ struct oppoint *new);
+static int mainstone_fscale(struct oppoint *cur, struct oppoint *new);
+static void mainstone_fscaler(struct oppoint_regs *regs);
+
+extern void mainstone_set_voltage(unsigned int mv);
+extern void mainstone_prep_set_voltage(unsigned int mv);
+extern int mainstone_vcs_init(void);
+extern void mainstone_voltage_cleanup(void);
+
+static unsigned long
+calculate_memclk(unsigned long cccr, unsigned long clkcfg)
+{
+ unsigned long M, memclk;
+ u32 L;
+
+ L = cccr & 0x1f;
+ if (cccr & (1 << 25)) {
+ if (clkcfg & CLKCFG_B_BIT)
+ memclk = (L*13);
+ else
+ memclk = (L*13)/2;
+ }
+ else {
+ if (L <= 10) M = 1;
+ else if (L <= 20) M = 2;
+ else M = 4;
+
+ memclk = (L*13)/M;
+ }
+
+ return memclk;
+}
+
+static unsigned long
+calculate_new_memclk(struct oppoint_regs *regs)
+{
+ return calculate_memclk(regs->cccr, regs->clkcfg);
+}
+
+static unsigned long
+calculate_cur_memclk(void)
+{
+ unsigned long cccr = CCCR;
+ return calculate_memclk(cccr, mainstone_read_clkcfg());
+}
+
+/* Returns optimal timings for memory controller
+ * a - [A]
+ * b - [B]
+ * l - value of L
+ */
+static struct mem_timings get_optimal_mem_timings(int a, int b, int l){
+ struct mem_timings ret = {
+ .msc0 = 0,
+ .dtc = 0,
+ .dri = 0,
+ };
+
+ if (a!=0 && b==0) {
+ switch (l) {
+ case 2:
+ ret.msc0 = XLLI_MSC0_13;
+ ret.dtc = XLLI_DTC_13;
+ ret.dri = XLLI_DRI_13;
+ break;
+ case 3:
+ ret.msc0 = XLLI_MSC0_19;
+ ret.dtc = XLLI_DTC_19;
+ ret.dri = XLLI_DRI_19;
+ break;
+ case 4:
+ ret.msc0 = XLLI_MSC0_26;
+ ret.dtc = XLLI_DTC_26;
+ ret.dri = XLLI_DRI_26;
+ break;
+ case 5:
+ ret.msc0 = XLLI_MSC0_32;
+ ret.dtc = XLLI_DTC_32;
+ ret.dri = XLLI_DRI_32;
+ break;
+ case 6:
+ ret.msc0 = XLLI_MSC0_39;
+ ret.dtc = XLLI_DTC_39;
+ ret.dri = XLLI_DRI_39;
+ break;
+ case 7:
+ ret.msc0 = XLLI_MSC0_45;
+ ret.dtc = XLLI_DTC_45;
+ ret.dri = XLLI_DRI_45;
+ break;
+ case 8:
+ ret.msc0 = XLLI_MSC0_52;
+ ret.dtc = XLLI_DTC_52;
+ ret.dri = XLLI_DRI_52;
+ break;
+ case 9:
+ ret.msc0 = XLLI_MSC0_58;
+ ret.dtc = XLLI_DTC_58;
+ ret.dri = XLLI_DRI_58;
+ break;
+ case 10:
+ ret.msc0 = XLLI_MSC0_65;
+ ret.dtc = XLLI_DTC_65;
+ ret.dri = XLLI_DRI_65;
+ break;
+ /*
+ * L11 - L20 ARE THE SAME for A0Bx
+ */
+ case 11:
+ ret.msc0 = XLLI_MSC0_71;
+ ret.dtc = XLLI_DTC_71;
+ ret.dri = XLLI_DRI_71;
+ break;
+ case 12:
+ ret.msc0 = XLLI_MSC0_78;
+ ret.dtc = XLLI_DTC_78;
+ ret.dri = XLLI_DRI_78;
+ break;
+ case 13:
+ ret.msc0 = XLLI_MSC0_84;
+ ret.dtc = XLLI_DTC_84;
+ ret.dri = XLLI_DRI_84;
+ break;
+ case 14:
+ ret.msc0 = XLLI_MSC0_91;
+ ret.dtc = XLLI_DTC_91;
+ ret.dri = XLLI_DRI_91;
+ break;
+ case 15:
+ ret.msc0 = XLLI_MSC0_97;
+ ret.dtc = XLLI_DTC_97;
+ ret.dri = XLLI_DRI_97;
+ break;
+ case 16:
+ ret.msc0 = XLLI_MSC0_104;
+ ret.dtc = XLLI_DTC_104;
+ ret.dri = XLLI_DRI_104;
+ break;
+ case 17:
+ ret.msc0 = XLLI_MSC0_110;
+ ret.dtc = XLLI_DTC_110;
+ ret.dri = XLLI_DRI_110;
+ break;
+ case 18:
+ ret.msc0 = XLLI_MSC0_117;
+ ret.dtc = XLLI_DTC_117;
+ ret.dri = XLLI_DRI_117;
+ break;
+ case 19:
+ ret.msc0 = XLLI_MSC0_124;
+ ret.dtc = XLLI_DTC_124;
+ ret.dri = XLLI_DRI_124;
+ break;
+ case 20:
+ ret.msc0 = XLLI_MSC0_130;
+ ret.dtc = XLLI_DTC_130;
+ ret.dri = XLLI_DRI_130;
+ break;
+ case 21:
+ ret.msc0 = XLLI_MSC0_136;
+ ret.dtc = XLLI_DTC_136;
+ ret.dri = XLLI_DRI_136;
+ break;
+ case 22:
+ ret.msc0 = XLLI_MSC0_143;
+ ret.dtc = XLLI_DTC_143;
+ ret.dri = XLLI_DRI_143;
+ break;
+ case 23:
+ ret.msc0 = XLLI_MSC0_149;
+ ret.dtc = XLLI_DTC_149;
+ ret.dri = XLLI_DRI_149;
+ break;
+ case 24:
+ ret.msc0 = XLLI_MSC0_156;
+ ret.dtc = XLLI_DTC_156;
+ ret.dri = XLLI_DRI_156;
+ break;
+ case 25:
+ ret.msc0 = XLLI_MSC0_162;
+ ret.dtc = XLLI_DTC_162;
+ ret.dri = XLLI_DRI_162;
+ break;
+ case 26:
+ ret.msc0 = XLLI_MSC0_169;
+ ret.dtc = XLLI_DTC_169;
+ ret.dri = XLLI_DRI_169;
+ break;
+ case 27:
+ ret.msc0 = XLLI_MSC0_175;
+ ret.dtc = XLLI_DTC_175;
+ ret.dri = XLLI_DRI_175;
+ break;
+ case 28:
+ ret.msc0 = XLLI_MSC0_182;
+ ret.dtc = XLLI_DTC_182;
+ ret.dri = XLLI_DRI_182;
+ break;
+ case 29:
+ ret.msc0 = XLLI_MSC0_188;
+ ret.dtc = XLLI_DTC_188;
+ ret.dri = XLLI_DRI_188;
+ break;
+ case 30:
+ ret.msc0 = XLLI_MSC0_195;
+ ret.dtc = XLLI_DTC_195;
+ ret.dri = XLLI_DRI_195;
+ break;
+ case 31:
+ ret.msc0 = XLLI_MSC0_201;
+ ret.dtc = XLLI_DTC_201;
+ ret.dri = XLLI_DRI_201;
+ }
+
+ } else if (a!=0 && b!=0) {
+ switch (l) {
+ case 2:
+ ret.msc0 = XLLI_MSC0_26;
+ ret.dtc = XLLI_DTC_26;
+ ret.dri = XLLI_DRI_26;
+ break;
+ case 3:
+ ret.msc0 = XLLI_MSC0_39;
+ ret.dtc = XLLI_DTC_39;
+ ret.dri = XLLI_DRI_39;
+ break;
+ case 4:
+ ret.msc0 = XLLI_MSC0_52;
+ ret.dtc = XLLI_DTC_52;
+ ret.dri = XLLI_DRI_52;
+ break;
+ case 5:
+ ret.msc0 = XLLI_MSC0_65;
+ ret.dtc = XLLI_DTC_65;
+ ret.dri = XLLI_DRI_65;
+ break;
+ case 6:
+ ret.msc0 = XLLI_MSC0_78;
+ ret.dtc = XLLI_DTC_78;
+ ret.dri = XLLI_DRI_78;
+ break;
+ case 7:
+ ret.msc0 = XLLI_MSC0_91;
+ ret.dtc = XLLI_DTC_91;
+ ret.dri = XLLI_DRI_91;
+ break;
+ case 8:
+ ret.msc0 = XLLI_MSC0_104;
+ ret.dtc = XLLI_DTC_104;
+ ret.dri = XLLI_DRI_104;
+ break;
+ case 9:
+ ret.msc0 = XLLI_MSC0_117;
+ ret.dtc = XLLI_DTC_117;
+ ret.dri = XLLI_DRI_117;
+ break;
+ case 10:
+ ret.msc0 = XLLI_MSC0_130;
+ ret.dtc = XLLI_DTC_130;
+ ret.dri = XLLI_DRI_130;
+ break;
+ case 11:
+ ret.msc0 = XLLI_MSC0_143;
+ ret.dtc = XLLI_DTC_143;
+ ret.dri = XLLI_DRI_143;
+ break;
+ case 12:
+ ret.msc0 = XLLI_MSC0_156;
+ ret.dtc = XLLI_DTC_156;
+ ret.dri = XLLI_DRI_156;
+ break;
+ case 13:
+ ret.msc0 = XLLI_MSC0_169;
+ ret.dtc = XLLI_DTC_169;
+ ret.dri = XLLI_DRI_169;
+ break;
+ case 14:
+ ret.msc0 = XLLI_MSC0_182;
+ ret.dtc = XLLI_DTC_182;
+ ret.dri = XLLI_DRI_182;
+ break;
+ case 15:
+ ret.msc0 = XLLI_MSC0_195;
+ ret.dtc = XLLI_DTC_195;
+ ret.dri = XLLI_DRI_195;
+ break;
+ case 16:
+ ret.msc0 = XLLI_MSC0_208;
+ ret.dtc = XLLI_DTC_208;
+ ret.dri = XLLI_DRI_208;
+ }
+ } else {
+ /* A0Bx */
+ switch (l) {
+ case 2:
+ ret.msc0 = XLLI_MSC0_26;
+ ret.dtc = XLLI_DTC_26;
+ ret.dri = XLLI_DRI_26;
+ break;
+ case 3:
+ ret.msc0 = XLLI_MSC0_39;
+ ret.dtc = XLLI_DTC_39;
+ ret.dri = XLLI_DRI_39;
+ break;
+ case 4:
+ ret.msc0 = XLLI_MSC0_52;
+ ret.dtc = XLLI_DTC_52;
+ ret.dri = XLLI_DRI_52;
+ break;
+ case 5:
+ ret.msc0 = XLLI_MSC0_65;
+ ret.dtc = XLLI_DTC_65;
+ ret.dri = XLLI_DRI_65;
+ break;
+ case 6:
+ ret.msc0 = XLLI_MSC0_78;
+ ret.dtc = XLLI_DTC_78;
+ ret.dri = XLLI_DRI_78;
+ break;
+ case 7:
+ ret.msc0 = XLLI_MSC0_91;
+ ret.dtc = XLLI_DTC_91;
+ ret.dri = XLLI_DRI_91;
+ break;
+ case 8:
+ ret.msc0 = XLLI_MSC0_104;
+ ret.dtc = XLLI_DTC_104;
+ ret.dri = XLLI_DRI_104;
+ break;
+ case 9:
+ ret.msc0 = XLLI_MSC0_117;
+ ret.dtc = XLLI_DTC_117;
+ ret.dri = XLLI_DRI_117;
+ break;
+ case 10:
+ ret.msc0 = XLLI_MSC0_130;
+ ret.dtc = XLLI_DTC_130;
+ ret.dri = XLLI_DRI_130;
+ break;
+ case 11:
+ ret.msc0 = XLLI_MSC0_71;
+ ret.dtc = XLLI_DTC_71;
+ ret.dri = XLLI_DRI_71;
+ break;
+ case 12:
+ ret.msc0 = XLLI_MSC0_78;
+ ret.dtc = XLLI_DTC_78;
+ ret.dri = XLLI_DRI_78;
+ break;
+ case 13:
+ ret.msc0 = XLLI_MSC0_84;
+ ret.dtc = XLLI_DTC_84;
+ ret.dri = XLLI_DRI_84;
+ break;
+ case 14:
+ ret.msc0 = XLLI_MSC0_91;
+ ret.dtc = XLLI_DTC_91;
+ ret.dri = XLLI_DRI_91;
+ break;
+ case 15:
+ ret.msc0 = XLLI_MSC0_97;
+ ret.dtc = XLLI_DTC_97;
+ ret.dri = XLLI_DRI_97;
+ break;
+ case 16:
+ ret.msc0 = XLLI_MSC0_104;
+ ret.dtc = XLLI_DTC_104;
+ ret.dri = XLLI_DRI_104;
+ break;
+ case 17:
+ ret.msc0 = XLLI_MSC0_110;
+ ret.dtc = XLLI_DTC_110;
+ ret.dri = XLLI_DRI_110;
+ break;
+ case 18:
+ ret.msc0 = XLLI_MSC0_117;
+ ret.dtc = XLLI_DTC_117;
+ ret.dri = XLLI_DRI_117;
+ break;
+ case 19:
+ ret.msc0 = XLLI_MSC0_124;
+ ret.dtc = XLLI_DTC_124;
+ ret.dri = XLLI_DRI_124;
+ break;
+ case 20:
+ ret.msc0 = XLLI_MSC0_130;
+ ret.dtc = XLLI_DTC_130;
+ ret.dri = XLLI_DRI_130;
+ break;
+ case 21:
+ ret.msc0 = XLLI_MSC0_68;
+ ret.dtc = XLLI_DTC_68;
+ ret.dri = XLLI_DRI_68;
+ break;
+ case 22:
+ ret.msc0 = XLLI_MSC0_71;
+ ret.dtc = XLLI_DTC_71;
+ ret.dri = XLLI_DRI_71;
+ break;
+ case 23:
+ ret.msc0 = XLLI_MSC0_74;
+ ret.dtc = XLLI_DTC_74;
+ ret.dri = XLLI_DRI_74;
+ break;
+ case 24:
+ ret.msc0 = XLLI_MSC0_78;
+ ret.dtc = XLLI_DTC_78;
+ ret.dri = XLLI_DRI_78;
+ break;
+ case 25:
+ ret.msc0 = XLLI_MSC0_81;
+ ret.dtc = XLLI_DTC_81;
+ ret.dri = XLLI_DRI_81;
+ break;
+ case 26:
+ ret.msc0 = XLLI_MSC0_84;
+ ret.dtc = XLLI_DTC_84;
+ ret.dri = XLLI_DRI_84;
+ break;
+ case 27:
+ ret.msc0 = XLLI_MSC0_87;
+ ret.dtc = XLLI_DTC_87;
+ ret.dri = XLLI_DRI_87;
+ break;
+ case 28:
+ ret.msc0 = XLLI_MSC0_91;
+ ret.dtc = XLLI_DTC_91;
+ ret.dri = XLLI_DRI_91;
+ break;
+ case 29:
+ ret.msc0 = XLLI_MSC0_94;
+ ret.dtc = XLLI_DTC_94;
+ ret.dri = XLLI_DRI_94;
+ break;
+ case 30:
+ ret.msc0 = XLLI_MSC0_97;
+ ret.dtc = XLLI_DTC_97;
+ ret.dri = XLLI_DRI_97;
+ break;
+ case 31:
+ ret.msc0 = XLLI_MSC0_100;
+ ret.dtc = XLLI_DTC_100;
+ ret.dri = XLLI_DRI_100;
+ }
+ }
+
+ return ret;
+}
+
+static void assign_optimal_mem_timings(
+ unsigned int* msc0_reg,
+ unsigned int* mdrefr_reg,
+ unsigned int* mdcnfg_reg,
+ int a, int b, int l
+ )
+{
+ unsigned int msc0_reg_tmp = (*msc0_reg);
+ unsigned int mdrefr_reg_tmp = (*mdrefr_reg);
+ unsigned int mdcnfg_reg_tmp = (*mdcnfg_reg);
+ struct mem_timings timings = get_optimal_mem_timings(a,b,l);
+
+ /* clear the bit fields that get_optimal_mem_timings() sets */
+ msc0_reg_tmp &= ~(MSC0_RDF | MSC0_RDN | MSC0_RRR);
+ mdrefr_reg_tmp &= ~(MDREFR_RFU | MDREFR_DRI);
+ mdcnfg_reg_tmp &= ~(MDCNFG_DTC0 | MDCNFG_DTC2);
+
+ /* prepare appropriate timings */
+ msc0_reg_tmp |= timings.msc0;
+ mdrefr_reg_tmp |= timings.dri;
+ mdcnfg_reg_tmp |= timings.dtc;
+
+ /* set timings (all bits one time) */
+ (*msc0_reg) = msc0_reg_tmp;
+ (*mdrefr_reg) = mdrefr_reg_tmp;
+ (*mdcnfg_reg) = mdcnfg_reg_tmp;
+}
+
+static void set_mdrefr_value(u32 new_mdrefr)
+{
+ unsigned long s, old_mdrefr, errata62;
+ old_mdrefr = MDREFR;
+ /* E62 (28007106.pdf): Memory controller may hang while clearing
+ * MDREFR[K1DB2] or MDREFR[K2DB2]
+ */
+ errata62 = (((old_mdrefr & MDREFR_K1DB2) != 0) &&
+ ((new_mdrefr & MDREFR_K1DB2) == 0)) ||
+ (((old_mdrefr & MDREFR_K2DB2) != 0) &&
+ ((new_mdrefr & MDREFR_K2DB2) == 0));
+
+ if (errata62) {
+ unsigned long oscr_0 = OSCR;
+ unsigned long oscr_1 = oscr_0;
+ /* Step 1 - disable interrupts */
+ local_irq_save(s);
+ /* Step 2 - leave KxDB2, but set MDREFR[DRI] (bits 0-11) to
+ * 0xFFF
+ */
+ MDREFR = MDREFR | MDREFR_DRI;
+ /* Step 3 - read MDREFR one time */
+ MDREFR;
+ /* Step 4 - wait 1.6167us
+ * (3.25MHz clock increments OSCR0 7 times)
+ */
+ while (oscr_1-oscr_0 < 7) {
+ cpu_relax();
+ oscr_1 = OSCR;
+ }
+
+ }
+
+ /* Step 5 - clear K1DB2 and/or K2DB2, and set MDREFR[DRI] to
+ * proper value at the same time
+ */
+
+ /*Set MDREFR as if no errata workaround is needed*/
+ MDREFR = new_mdrefr;
+
+ if (errata62) {
+ /* Step 6 - read MDREFR one time*/
+ MDREFR;
+ /* Step 7 - enable interrupts*/
+ local_irq_restore(s);
+ }
+}
+
+static void mainstone_scale_cpufreq(struct oppoint_regs *regs)
+{
+ unsigned long new_memclk, cur_memclk;
+ u32 new_mdrefr, cur_mdrefr, read_mdrefr;
+ u32 new_msc0, new_mdcnfg;
+ int set_mdrefr = 0, scaling_up = 0;
+ int l, a, b;
+
+ l = regs->cccr & CCCR_L_MASK; /* Get L */
+ b = (regs->clkcfg >> 3) & 0x1;
+ a = (regs->cccr >> 25) & 0x1; /* cccr[A]: bit 25 */
+
+ cur_memclk = calculate_cur_memclk();
+ new_memclk = calculate_new_memclk(regs);
+
+ new_mdrefr = cur_mdrefr = MDREFR;
+ new_msc0 = MSC0;
+ new_mdcnfg = MDCNFG;
+
+ if (new_memclk != cur_memclk) {
+ /* SDCLK0,SDCLK1,SDCLK2 = MEMCLK - by default (<=52MHz) */
+ new_mdrefr &= ~( MDREFR_K0DB2 | MDREFR_K0DB4 |
+ MDREFR_K1DB2 | MDREFR_K2DB2 );
+
+ if ((new_memclk > 52) && (new_memclk <= 104)) {
+ /* SDCLK0 = MEMCLK/2, SDCLK1,SDCLK2 = MEMCLK */
+ new_mdrefr |= MDREFR_K0DB2;
+ }
+ else if (new_memclk > 104){
+ /* SDCLK0 = MEMCLK/4, SDCLK1 and SDCLK2 = MEMCLK/2 */
+ new_mdrefr |= (MDREFR_K0DB4 | MDREFR_K1DB2 |
+ MDREFR_K2DB2);
+ }
+
+ /* clock increasing or decreasing? */
+ if (new_memclk > cur_memclk) scaling_up = 1;
+ }
+
+ /* set MDREFR if necessary */
+ if (new_mdrefr != cur_mdrefr){
+ set_mdrefr = 1;
+ /* also adjust timings as long as we change MDREFR value */
+ assign_optimal_mem_timings(
+ &new_msc0,
+ &new_mdrefr,
+ &new_mdcnfg,
+ a,b,l
+ );
+ }
+
+ /* if memclk is scaling up, set MDREFR before freq change
+ * (2800002.pdf:6.5.1.4)
+ */
+ if (set_mdrefr && scaling_up) {
+ MSC0 = new_msc0;
+ set_mdrefr_value(new_mdrefr);
+ MDCNFG = new_mdcnfg;
+ read_mdrefr = MDREFR;
+ }
+
+ CCCR = regs->cccr;
+ mainstone_set_freq(regs->clkcfg);
+
+ /* if memclk is scaling down, set MDREFR after freq change
+ * (2800002.pdf:6.5.1.4)
+ */
+ if (set_mdrefr && !scaling_up) {
+ MSC0 = new_msc0;
+ set_mdrefr_value(new_mdrefr);
+ MDCNFG = new_mdcnfg;
+ read_mdrefr = MDREFR;
+ }
+}
+
+static void mainstone_scale_voltage(struct oppoint_regs *regs)
+{
+ mainstone_set_voltage(regs->voltage);
+}
+
+static void mainstone_scale_voltage_coupled(struct oppoint_regs *regs)
+{
+ mainstone_prep_set_voltage(regs->voltage);
+}
+
+static void calculate_lcd_freq(struct md_opt *opt)
+{
+ int k = 1; /* lcd divisor */
+
+ /* L is verified to be between PLL_L_MAX and PLL_L_MIN in */
+ if (opt->l == -1) {
+ opt->lcd = -1;
+ return;
+ }
+
+ if (opt->l > 16) {
+ /* When L=17-31, K=4 */
+ k = 4;
+ } else if (opt->l > 7) {
+ /* When L=8-16, K=2 */
+ k = 2;
+ }
+
+ /* Else, when L=2-7, K=1 */
+
+ opt->lcd = 13000 * opt->l / k;
+}
+
+static void calculate_reg_values(struct md_opt *opt)
+{
+ int f = 0; /* frequency change bit */
+ int turbo = 0; /* turbo mode bit; depends on N value */
+
+ opt->regs.voltage = opt->v;
+/*
+ CCCR
+ | 31| 30|29-28| 27| 26| 25|24-11| 10| 9 | 8 | 7 |6-5 | 4 | 3 | 2 | 1 | 0 |
+ | C | P | | L | P | | | | | |
+ | P | P | | C | L | | | | | |
+ | D | D |resrv| D | L | A |resrv| 2 * N |resrv| L |
+ | I | I | | 2 | . | | | | | |
+ | S | S | | 6 | . | | | | | |
+
+ A: Alternate setting for MEMC clock
+ 0 = MEM clock frequency as specified in YB's table 3-7
+ 1 = MEM clock frq = System Bus Frequency
+
+
+ CLKCFG
+ | 31------------------------------------------- | 3 | 2 | 1 | 0 |
+ | --------------------------------------------- | B | HT | F | T |
+
+ B = Fast-Bus Mode 0: System Bus is half of run-mode
+ 1: System Bus is equal to run-mode
+ NOTE: only allowed when L <= 16
+
+ HT = Half-Turbo 0: core frequency = run or turbo, depending on T bit
+ 1: core frequency = turbo frequency / 2
+ NOTE: only allowed when 2N = 6 or 2N = 8
+
+ F = Frequency change
+ 0: No frequency change is performed
+ 1: Do frequency-change
+
+ T = Turbo Mode 0: CPU operates at run Frequency
+ 1: CPU operates at Turbo Frequency (when n2 > 2)
+
+*/
+ /* Set the CLKCFG with B, T, and HT */
+ if (opt->b != -1 && opt->n != -1) {
+ f = 1;
+
+ /*When N2=2, Turbo Mode equals Run Mode, so it
+ does not really matter if this is >2 or >=2
+ */
+ if (opt->n > 2) {
+ turbo = 0x1;
+ }
+ opt->regs.clkcfg = (opt->b << 3) + (f << 1) + turbo;
+ } else {
+ f = 0x1;
+ opt->regs.clkcfg = (f << 1);
+ }
+
+ /*
+ What about when n2=0 ... it is not defined by the yellow
+ book
+ */
+ if (opt->n != -1) {
+ /* N2 is 4 bits, L is 5 bits */
+ opt->regs.cccr = ((opt->n & 0xF) << 7) + (opt->l & 0x1F);
+ }
+
+ if (opt->cccra > 0) {
+ /* Turn on the CCCR[A] bit */
+ opt->regs.cccr |= (1 << 25);
+ }
+
+ /* 13M Mode */
+ if (opt->l == 1) {
+ }
+
+ if ( (opt->l > 1) && (opt->cpll_enabled == 0) ) {
+ printk(KERN_WARNING
+ "DPM: internal error if l>1 CPLL must be On\n");
+ }
+ if( (opt->cpll_enabled == 1) && (opt->ppll_enabled == 0) ){
+ printk(KERN_WARNING
+ "DPM: internal error CPLL=On PPLL=Off is NOT supported in hardware\n");
+ }
+ if(opt->cpll_enabled == 0) {
+ opt->regs.cccr |= (CCCR_CPDIS_BIT_ON);
+ }
+ if(opt->ppll_enabled == 0) {
+ opt->regs.cccr |= (CCCR_PPDIS_BIT_ON);
+ }
+
+}
+
+/* This routine computes the "forward" frequency scaler flags
+ * for moving the system
+ * from the current operating point to the new operating point. The resulting
+ * fscaler is applied to the registers of the new operating point.
+ */
+void compute_fscaler_flags(struct md_opt *cur, struct md_opt *new)
+{
+ int current_n, ccsr;
+
+ ccsr = CCSR;
+ current_n = (ccsr & CCCR_N_MASK) >> 7;
+ fscaler_flags = FSCALER_NOP;
+ /* If new CPU is 0, that means sleep, we do NOT switch PLLs
+ if going to sleep.
+ */
+ if (!new->cpu) {
+ if (new->sleep_mode == CPUMODE_DEEPSLEEP) {
+ fscaler_flags |= FSCALER_DEEPSLEEP;
+ } else if (new->sleep_mode == CPUMODE_STANDBY) {
+ fscaler_flags |= FSCALER_STANDBY;
+ } else {
+ fscaler_flags |= FSCALER_SLEEP;
+ }
+ } else {
+
+ /*
+ * If exiting 13M mode, set the flag so we can do the extra
+ * work to get out before the frequency change
+ */
+ if( ((cur->cpll_enabled == 0) && (new->cpll_enabled ==1)) ||
+ ((cur->ppll_enabled == 0) && (new->ppll_enabled ==1)) ){
+ fscaler_flags |= FSCALER_XPLLON;
+ }
+
+ }
+
+
+ /* if CPU is *something*, it means we are not going to sleep */
+ if ((new->cpu) &&
+ /* And something has indeed changed */
+ ((new->regs.cccr != cur->regs.cccr) ||
+ (new->regs.clkcfg != cur->regs.clkcfg))) {
+
+ /* Find out if it is *just* a turbo bit change */
+ if ((cur->l == new->l) &&
+ (cur->cccra == new->cccra) &&
+ (cur->b == new->b) &&
+ (cur->half_turbo == new->half_turbo)) {
+ /*
+ * If the real, current N is a turbo freq and
+ * the new N is not a turbo freq, then set
+ * TURBO_OFF and do not change N
+ */
+ if ((cur->n > 1) && (new->n == 2)) {
+ fscaler_flags |= FSCALER_TURBO_OFF;
+ }
+ /*
+ * Else if the current operating point's N is
+ * not-turbo and the new N is the desired
+ * destination N, then set TURBO_ON
+ */
+ else if ((cur->n == 2) && (new->n == current_n)) {
+ /*
+ * Desired N must be what is currently
+ * set in the CCCR/CCSR
+ */
+ fscaler_flags |= FSCALER_TURBO_ON;
+ }
+ /* Else, fall through to regular FCS */
+ }
+ if (!(fscaler_flags & FSCALER_TURBO)) {
+ /* If this is not a Turbo bit only change, it
+ must be a regular FCS
+ */
+ fscaler_flags |= FSCALER_CPUFREQ;
+ }
+ loops_per_jiffy = new->lpj;
+ }
+
+ if (new->half_turbo != cur->half_turbo) {
+ loops_per_jiffy = new->lpj;
+
+ if (new->half_turbo)
+ fscaler_flags |= FSCALER_HALFTURBO_ON;
+ else
+ fscaler_flags |= FSCALER_HALFTURBO_OFF;
+ }
+
+ if (new->regs.voltage != cur->regs.voltage)
+ fscaler_flags |= FSCALER_VOLTAGE;
+
+}
+
+static int pxa27x_transition(struct oppoint *cur, struct oppoint *new)
+{
+ int rc = 0;
+ unsigned target_v;
+ struct md_opt *md_cur, *md_new;
+
+ pr_debug("%s: %s => %s\n", __FUNCTION__, cur->name, new->name);
+
+ md_cur = (struct md_opt *)cur->md_data;
+ md_new = (struct md_opt *)new->md_data;
+
+ /* fully define the new opt, if necessary, based on values
+ from the current opt
+ */
+ mainstone_fully_define_opt(cur, new);
+ target_v = md_new->v;
+
+ /* In accordance with Yellow Book section 3.7.6.3, "Coupling
+ Voltage Change with Frequency Change", always set the
+ voltage first (setting the FVC bit in the PCFR) and then do
+ the frequency change
+ */
+ rc = mainstone_fscale(cur, new);
+ if (rc == 0)
+ current_state = new;
+
+ udelay(new->latency);
+
+ return rc;
+}
+
+static int mainstone_fscale(struct oppoint *c, struct oppoint *n)
+{
+ struct md_opt *cur = (struct md_opt *)c->md_data;
+ struct md_opt *new = (struct md_opt *)n->md_data;
+
+ compute_fscaler_flags(cur, new);
+
+ mainstone_fscaler(&new->regs);
+ return 0;
+}
+
+void mainstone_fully_define_opt(struct oppoint *c, struct oppoint *n)
+{
+ struct md_opt *cur = (struct md_opt *)c->md_data;
+ struct md_opt *new = (struct md_opt *)n->md_data;
+
+ if (new->v == -1)
+ new->v = cur->v;
+ if (new->l == -1)
+ new->l = cur->l;
+ if (new->n == -1)
+ new->n = cur->n;
+ if (new->b == -1)
+ new->b = cur->b;
+ if (new->half_turbo == -1)
+ new->half_turbo = cur->half_turbo;
+ if (new->cccra == -1)
+ new->cccra = cur->cccra;
+ if (new->cpll_enabled == -1)
+ new->cpll_enabled = cur->cpll_enabled;
+ if (new->ppll_enabled == -1)
+ new->ppll_enabled = cur->ppll_enabled;
+ if (new->sleep_mode == -1)
+ new->sleep_mode = cur->sleep_mode;
+
+#ifdef CONFIG_BULVERDE_B0
+ /* for "B0"-revision PLLs have the same value */
+ new->ppll_enabled = new->cpll_enabled;
+#endif
+ /* PXA27x manual ("Yellow book") 3.5.5 (Table 3-7) states that
+ * CPLL-"On" and PPLL-"Off"
+ * configuration is forbidden (all others seem to be OK for "B0")
+ * for "C0" boards we suppose that this configuration is also enabled.
+ * PXA27x manual ("Yellow book") also states at 3.5.7.1 (page 3-25)
+ * that "CCCR[PPDIS] and CCCR[CPDIS] must always be identical and
+ * changed together". "If PLLs are to be turned off using xPDIS then
+ * set xPDIS before frequency change and clear xPDIS after frequency
+ * change"
+ */
+
+ if (new->n > 2) {
+ new->turbo = 1;
+ /* turbo mode: 13K * L * (N/2)
+ Shift at the end to divide N by 2 for Turbo mode or
+ by 4 for Half-Turbo mode
+ */
+ new->cpu = (13000 * new->l * new->n) >>
+ ((new->half_turbo == 1) ? 2 : 1);
+ } else {
+ new->turbo = 0;
+ /* run mode */
+ new->cpu = 13000 * new->l;
+ }
+ /* lcd freq is derived from L */
+ calculate_lcd_freq(new);
+ calculate_reg_values(new);
+ /* We want to keep a baseline loops_per_jiffy/cpu-freq ratio
+ to work off of for future calculations, especially when
+ emerging from sleep when there is no current cpu frequency
+ to calculate from (because cpu-freq of 0 means sleep).
+ */
+ if (!saved_loops_per_jiffy) {
+ saved_loops_per_jiffy = loops_per_jiffy;
+ saved_cpu_freq = cur->cpu;
+ }
+ if (!saved_cpu_freq) {
+ saved_cpu_freq = c->frequency;
+ }
+ /*
+ * a dedicated method for updating jiffies when frequency is changed
+ */
+ if (new->cpu) {
+ /* Normal change (not sleep), just compute. Always use
+ the "baseline" lpj and freq */
+ new->lpj = oppoint_compute_lpj(saved_loops_per_jiffy,
+ saved_cpu_freq, new->cpu);
+ } else {
+ /* If sleeping, keep the old LPJ */
+ new->lpj = loops_per_jiffy;
+ }
+}
+
+static void xpll_on(struct oppoint_regs *regs, int fscaler_flags)
+{
+ int tmp_cccr, tmp_ccsr;
+ int new_cpllon=0, new_pllon=0, cur_cpllon=0;
+ int cur_pllon=0, start_cpll=0, start_pll=0;
+
+ tmp_ccsr = CCSR;
+
+ if ((regs->cccr & CCCR_CPDIS_BIT_ON) == 0)
+ new_cpllon=1;
+ if ((regs->cccr & CCCR_PPDIS_BIT_ON) == 0)
+ new_pllon=1;
+ if (((tmp_ccsr >> 31) & 0x1) == 0)
+ cur_cpllon=1;
+ if (((tmp_ccsr >> 30) & 0x1) == 0)
+ cur_pllon=1;
+
+ if ((new_cpllon == 1) && (cur_cpllon == 0)) {
+ start_cpll=1;
+ }
+ if ((new_pllon == 1) && (cur_pllon == 0)) {
+ start_pll=1;
+ }
+
+ if (!(fscaler_flags & FSCALER_XPLLON)) {
+ return;
+ }
+ if ((start_cpll == 0) && (start_pll == 0)) {
+ return;
+ }
+ /* NOTE: the Yellow Book says that exiting 13M mode requires a
+ PLL relock, which takes at least 120uS, so the book suggests
+ the OS could use a timer to keep busy until it is time to
+ check the CCSR bits which must happen before changing the
+ frequency back.
+
+ For now, we'll just loop.
+ */
+
+ /* From Yellow Book, page 3-31, section 3.5.7.5 13M Mode
+
+ Exiting 13M Mode:
+
+ 1. Remain in 13M mode, but early enable the PLL via
+ CCCR[CPDIS, PPDIS]=11, and CCCR[PLL_EARLY_EN]=1. Doing
+ so will allow the PLL to be started early.
+
+ 2. Read CCCR and compare to make sure that the data was
+ correctly written.
+
+ 3. Check to see if CCSR[CPLOCK] and CCSR[PPLOCK] bits are
+ both set. Once these bits are both high, the PLLs are
+ locked and you may move on.
+
+ 4. Note that the CPU is still in 13M mode, but the PLLs are
+ started.
+
+ 5. Exit from 13M mode by writing CCCR[CPDIS, PPDIS]=00, but
+ maintain CCCR[PLL_EARLY_EN]=1. This bit will be cleared
+ by the imminent frequency change.
+ */
+
+ /* Step 1 */
+ tmp_cccr = CCCR;
+ if (start_cpll)
+ tmp_cccr |= CCCR_CPDIS_BIT_ON;
+ if (start_pll)
+ tmp_cccr |= CCCR_PPDIS_BIT_ON;
+ tmp_cccr |= CCCR_PLL_EARLY_EN_BIT_ON;
+
+ CCCR = tmp_cccr;
+
+ /* Step 2 */
+ tmp_cccr = CCCR;
+
+ if ((tmp_cccr & CCCR_PLL_EARLY_EN_BIT_ON) != CCCR_PLL_EARLY_EN_BIT_ON) {
+ printk(KERN_WARNING
+ "DPM: Warning: PLL_EARLY_EN is NOT on\n");
+ }
+ if ((start_cpll==1) &&
+ ((tmp_cccr & CCCR_CPDIS_BIT_ON) != CCCR_CPDIS_BIT_ON)) {
+ printk(KERN_WARNING
+ "DPM: Warning: CPDIS is NOT on\n");
+ }
+ if ((start_pll==1) &&
+ (tmp_cccr & CCCR_PPDIS_BIT_ON) != CCCR_PPDIS_BIT_ON) {
+ printk(KERN_WARNING
+ "DPM: Warning: PPDIS is NOT on\n");
+ }
+
+ /* Step 3 */
+ {
+ /* Note: the point of this is to "wait" for the lock
+ bits to be set; the Yellow Book says this may take
+ a while, but observation indicates that it is
+ instantaneous
+ */
+
+ long volatile int i = 0;
+
+ int cpll_complete=1;
+ int pll_complete=1;
+ if (start_cpll==1)
+ cpll_complete=0;
+ if (start_pll==1)
+ pll_complete=0;
+
+ /* loop up to an arbitrarily large bound to prevent looping forever */
+ for (i = 0; i < 999999; i++) {
+ tmp_ccsr = CCSR;
+
+ if ((tmp_ccsr & CCSR_CPLL_LOCKED) == CCSR_CPLL_LOCKED) {
+ /*CPLL locked*/
+ cpll_complete=1;
+ }
+ if ((tmp_ccsr & CCSR_PPLL_LOCKED) == CCSR_PPLL_LOCKED) {
+ /*PPLL locked*/
+ pll_complete=1;
+ }
+ if ((cpll_complete == 1) && (pll_complete == 1)) {
+ break;
+ }
+ }
+ }
+
+ /* Step 4: NOP */
+
+ /* Step 5
+ Clear the PLL disable bits - do NOT do it here.
+ */
+
+ /* But leave EARLY_EN on; it will be cleared by the frequency change */
+ regs->cccr |= CCCR_PLL_EARLY_EN_BIT_ON;
+ /* Do not set it now
+ Step 6: Now continue on with the frequency change.
+ We do this step later because, if the voltage is too low,
+ we must ensure that it has risen before entering a higher
+ freq mode, or that it rises simultaneously.
+ */
+}
+
+static void mainstone_fscaler(struct oppoint_regs *regs)
+{
+ unsigned int cccr, clkcfg = 0;
+ unsigned long s;
+
+ /* If no flags are set, don't waste time here, just return */
+ if (fscaler_flags == FSCALER_NOP)
+ return;
+
+ if (!(fscaler_flags & FSCALER_ANY_SLEEPMODE))
+ local_irq_save(s);
+
+ /* If exiting 13M mode (turn on PLL(s) ), do some extra work
+ before changing the CPU frequency or voltage.
+ We may turn on a combination of PLLs supported by hardware
+ only. Otherwise xpll_on(...) hangs the system.
+ */
+ if (fscaler_flags & FSCALER_XPLLON)
+ xpll_on(regs, fscaler_flags);
+
+ /* if not sleeping, and have a voltage change
+ note that SLEEPMODE will handle voltage itself
+ */
+ if (((fscaler_flags & FSCALER_ANY_SLEEPMODE) == 0) &&
+ (fscaler_flags & FSCALER_VOLTAGE)) {
+ if (fscaler_flags & FSCALER_CPUFREQ) {
+ /* coupled voltage & freq change */
+ mainstone_scale_voltage_coupled(regs);
+ } else {
+ /* Scale CPU voltage un-coupled with freq */
+ mainstone_scale_voltage(regs);
+ }
+ }
+
+ if (fscaler_flags & FSCALER_CPUFREQ) /* Scale CPU freq */
+ mainstone_scale_cpufreq(regs);
+
+ if ((fscaler_flags & FSCALER_VOLTAGE) &&
+ (fscaler_flags & FSCALER_CPUFREQ))
+ PCFR &= ~PCFR_FVC;
+
+ if (fscaler_flags & FSCALER_TURBO) {
+
+ clkcfg = mainstone_read_clkcfg();
+
+ /* Section 3.5.7 of the Yellow Book says that the F
+ bit will be left on after a FCS, so we need to
+ explicitly clear it. But do not change the B bit
+ */
+ clkcfg &= ~(CLKCFG_F_BIT);
+
+ if (fscaler_flags & FSCALER_TURBO_ON) {
+ clkcfg = clkcfg | (CLKCFG_T_BIT);
+ } else {
+ clkcfg = clkcfg & ~(CLKCFG_T_BIT);
+ }
+
+ /* enable */
+ mainstone_set_freq(clkcfg);
+ }
+
+ if ((fscaler_flags & FSCALER_HALFTURBO_ON) ||
+ (fscaler_flags & FSCALER_HALFTURBO_OFF)) {
+ if ((fscaler_flags & FSCALER_CPUFREQ) ||
+ (fscaler_flags & FSCALER_VOLTAGE)) {
+
+ /*
+ From the Yellow Book, p 3-106:
+
+ "Any two writes to CLKCFG or PWRMODE
+ registers must be separated by six 13-MHz
+ cycles. This requirement is achieved by
+ doing the write to the CLKCFG or POWERMODE
+ register, performing a read of CCCR, and
+ then comparing the value in the CLKCFG or
+ POWERMODE register to the written value
+ until it matches."
+
+ Since the setting of half turbo is a
+ separate write to CLKCFG, we need to adhere
+ to this requirement.
+ */
+ cccr = CCCR;
+ clkcfg = mainstone_read_clkcfg();
+ while (clkcfg != regs->clkcfg)
+ clkcfg = mainstone_read_clkcfg();
+ }
+
+ if (clkcfg == 0)
+ clkcfg = regs->clkcfg;
+ /* Turn off f-bit.
+
+ According to the Yellow Book, page 3-23, "If only
+ HT is set, F is clear, and B is not altered, then
+ the core PLL is not stopped."
+ */
+ clkcfg = clkcfg & ~(CLKCFG_F_BIT);
+ /* set half turbo bit */
+ if (fscaler_flags & FSCALER_HALFTURBO_ON) {
+ clkcfg = clkcfg | (CLKCFG_HT_BIT);
+ } else {
+ clkcfg = clkcfg & ~(CLKCFG_HT_BIT);
+ }
+
+ /* enable */
+ mainstone_set_freq(clkcfg);
+ }
+
+ /* Devices only need to scale on a core frequency
+ change. Half-Turbo changes are separate from the regular
+ frequency changes, so Half-Turbo changes do not need to
+ trigger a device recalculation.
+
+ NOTE: turbo-mode-only changes could someday also be
+ optimized like Half-Turbo (to not trigger a device
+ recalc).
+ */
+
+ if (fscaler_flags & FSCALER_ANY_SLEEPMODE) {
+ /* NOTE: voltage needs i2c, so be sure to change
+ voltage *BEFORE* calling device_suspend
+ */
+
+ if (fscaler_flags & FSCALER_VOLTAGE) {
+ /* Scale CPU voltage un-coupled with freq */
+ mainstone_scale_voltage(regs);
+ }
+
+ if (fscaler_flags & FSCALER_SLEEP) {
+ pm_suspend(&sleep);
+ } else if (fscaler_flags & FSCALER_STANDBY) {
+ pm_suspend(&standby);
+ } else if (fscaler_flags & FSCALER_DEEPSLEEP) {
+ pm_suspend(&deepsleep);
+ }
+
+ /* Here when we wake up. */
+
+ } else {
+ local_irq_restore(s);
+ }
+}
+
+/*
+ * Fully determine the current machine-dependent operating point, and fill in a
+ * structure presented by the caller.
+ */
+
+void mainstone_get_current_info(void)
+{
+ unsigned int tmp_cccr;
+ unsigned int cpdis;
+ unsigned int ppdis;
+ struct md_opt *opt = (struct md_opt *)current_state->md_data;
+
+ /* You should read CCSR to see what's up...but there is no A
+ bit in the CCSR, so we'll grab it from the CCCR.
+ */
+ tmp_cccr = CCCR;
+ opt->cccra = (tmp_cccr >> 25) & 0x1; /* cccr[A]: bit 25 */
+
+ /* NOTE: the current voltage is not obtained, but will be left
+ as 0 in the opt which will mean no voltage change at all
+ */
+
+ opt->regs.cccr = CCSR;
+
+ opt->l = opt->regs.cccr & CCCR_L_MASK; /* Get L */
+ opt->n = (opt->regs.cccr & CCCR_N_MASK) >> 7; /* Get 2N */
+
+ /* This should never really be less than 2 */
+ if (opt->n < 2) {
+ opt->n = 2;
+ }
+
+ opt->regs.clkcfg = mainstone_read_clkcfg();
+ opt->b = (opt->regs.clkcfg >> 3) & 0x1; /* Fast Bus (b): bit 3 */
+ opt->turbo = opt->regs.clkcfg & 0x1; /* Turbo (T): bit 0 */
+ opt->half_turbo = (opt->regs.clkcfg >> 2) & 0x1;/* HalfTurbo: bit 2 */
+
+ calculate_lcd_freq(opt);
+
+ /* Are any of the PLLs on? */
+ cpdis = ((opt->regs.cccr >> 31) & 0x1);
+ ppdis = ((opt->regs.cccr >> 30) & 0x1);
+ /*
+ * Newer revisions still require that if CPLL is On
+ * then PPLL must also be On.
+ */
+ if ((cpdis == 0) && (ppdis != 0)) {
+ /*
+ * CPLL=On PPLL=Off is NOT supported with hardware.
+ * NOTE: "B0"-revision has even more restrictive requirements
+ * for PLLs
+ */
+ printk("OpPoint: cpdis and ppdis are not in sync!\n");
+ }
+
+ opt->cpll_enabled = (cpdis == 0);
+ opt->ppll_enabled = (ppdis == 0);
+
+ /* Shift 1 to divide by 2 (because opt->n is really 2*N) */
+ if (opt->turbo) {
+ opt->cpu = (13000 * opt->l * opt->n) >> 1;
+ } else {
+ /*
+ * turbo bit is off, so skip N multiplier (no matter
+ * what N really is) and use Run frequency (13K * L)
+ */
+ opt->cpu = 13000 * opt->l;
+ }
+}
+
+static void oppoint_print_opt(struct oppoint *opt)
+{
+ struct md_opt *md_opt = (struct md_opt *)opt->md_data;
+
+ printk("OpPoint : Table of defined operating points:\n");
+ printk("\t%s freq %d volt %d latency %d\n", opt->name,
+ opt->frequency, opt->voltage, opt->latency);
+
+ printk(" Name Vol CPU L N A B HT CPLL PPLL Sleep LCD\n");
+
+ printk("%12s %5d%5d%5d%5d%5d%5d%5d%5d%5d%5d%5d\n",
+ opt->name, (md_opt->v), (md_opt->cpu / 1000), md_opt->l, md_opt->n,
+ md_opt->cccra, md_opt->b, md_opt->half_turbo, md_opt->cpll_enabled,
+ md_opt->ppll_enabled, md_opt->sleep_mode, (md_opt->lcd / 1000));
+ return ;
+}
+
+/* Crystal clock: 13MHz */
+#define BASE_CLK 13000000
+
+int __init oppoint_mainstone_init(void)
+{
+ unsigned int freq;
+ unsigned long ccsr;
+ unsigned int l;
+
+ printk("Mainstone OpPoint Power Management\n");
+
+ mainstone_clk_init();
+ mainstone_vcs_init();
+ ccsr = CCSR;
+ l = ccsr & 0x1f;
+ freq = l * BASE_CLK;
+
+ /*
+ * supported operating point sets.
+ * 104, 208, 312
+ * 208, 312, 416, 520
+ */
+ switch (freq) {
+ case 104000000: {
+ mainstone_setup_opt(&low, 104, &mhz104);
+ oppoint_print_opt(&low);
+ mainstone_setup_opt(&medium, 208, &mhz208);
+ oppoint_print_opt(&medium);
+ mainstone_setup_opt(&high, 312, &mhz312);
+ oppoint_print_opt(&high);
+ current_state = &medium;
+ break;
+ }
+ case 208000000: {
+ mainstone_setup_opt(&low, 208, &mhz208hi);
+ oppoint_print_opt(&low);
+ mainstone_setup_opt(&medium, 312, &mhz312hi);
+ oppoint_print_opt(&medium);
+ mainstone_setup_opt(&high, 416, &mhz416hi);
+ oppoint_print_opt(&high);
+ mainstone_setup_opt(&highest, 520, &mhz520hi);
+ oppoint_print_opt(&highest);
+ break;
+
+ }
+ default: {
+ printk("OpPoint: unknown frequency set %d\n", freq);
+ break;
+ }
+ }
+
+ if (lowest.frequency)
+ register_operating_point(&lowest);
+ if (low.frequency)
+ register_operating_point(&low);
+ if (mediumlow.frequency)
+ register_operating_point(&mediumlow);
+ if (medium.frequency)
+ register_operating_point(&medium);
+ if (mediumhigh.frequency)
+ register_operating_point(&mediumhigh);
+ if (high.frequency) {
+ register_operating_point(&high);
+ current_state = &high;
+ }
+ if (highest.frequency) {
+ register_operating_point(&highest);
+ current_state = &highest;
+ }
+ /*
+ * add sleep states
+ */
+ mainstone_setup_opt(&sleep, 0, &md_sleep);
+ register_operating_point(&sleep);
+ oppoint_print_opt(&sleep);
+ mainstone_setup_opt(&deepsleep, 0, &md_deepsleep);
+ register_operating_point(&deepsleep);
+ oppoint_print_opt(&deepsleep);
+ mainstone_setup_opt(&standby, 0, &md_standby);
+ register_operating_point(&standby);
+ oppoint_print_opt(&standby);
+
+ return 0;
+}
+
+void __exit oppoint_mainstone_exit(void) {
+ mainstone_freq_cleanup();
+ mainstone_voltage_cleanup();
+}
+
+__initcall(oppoint_mainstone_init);
+__exitcall(oppoint_mainstone_exit);
Index: linux-2.6.17/arch/arm/Kconfig
===================================================================
--- linux-2.6.17.orig/arch/arm/Kconfig
+++ linux-2.6.17/arch/arm/Kconfig
@@ -690,7 +690,7 @@ config XIP_PHYS_ADDR

endmenu

-if (ARCH_SA1100 || ARCH_INTEGRATOR || ARCH_OMAP)
+if (ARCH_SA1100 || ARCH_INTEGRATOR || ARCH_OMAP || ARCH_PXA)

menu "CPU Frequency scaling"



David

2006-09-14 17:31:24

by Kok, Auke

Subject: Re: OpPoint summary

David Singleton wrote:

> +static const struct cpu_id cpu_ids[] = {
> + [CPU_BANIAS] = { 6, 9, 5 },
> + [CPU_DOTHAN_A1] = { 6, 13, 1 },
> + [CPU_DOTHAN_A2] = { 6, 13, 2 },
> + [CPU_DOTHAN_B0] = { 6, 13, 6 },
> + [CPU_MP4HT_D0] = {15, 3, 4 },
> + [CPU_MP4HT_E0] = {15, 4, 1 },
> +};


Any reason why { 6, 13, 8 } is missing? My lenovo T43 identifies itself as such:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 1.86GHz
stepping : 8

I'm not sure a Dothan B1 exists, but some postings suggest even C0 and C1 are
valid steppings. I'm sure OpPoint could work with those as well.
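
To make the gap concrete, here is a minimal sketch of how a family/model/stepping table like the one quoted above might be searched. It is an illustration only: the struct layout, the enum entry CPU_DOTHAN_STEP8 for stepping 8 and the helper find_cpu_index() are assumptions, not code from the patch.

```c
/* Field order mirrors the quoted initializers: family, model, stepping. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

enum {
	CPU_BANIAS,
	CPU_DOTHAN_A1,
	CPU_DOTHAN_A2,
	CPU_DOTHAN_B0,
	CPU_DOTHAN_STEP8,	/* hypothetical entry for { 6, 13, 8 } */
	CPU_MP4HT_D0,
	CPU_MP4HT_E0,
	CPU_ID_COUNT
};

static const struct cpu_id cpu_ids[CPU_ID_COUNT] = {
	[CPU_BANIAS]       = {  6,  9, 5 },
	[CPU_DOTHAN_A1]    = {  6, 13, 1 },
	[CPU_DOTHAN_A2]    = {  6, 13, 2 },
	[CPU_DOTHAN_B0]    = {  6, 13, 6 },
	[CPU_DOTHAN_STEP8] = {  6, 13, 8 },	/* assumed, see lead-in */
	[CPU_MP4HT_D0]     = { 15,  3, 4 },
	[CPU_MP4HT_E0]     = { 15,  4, 1 },
};

/* Return the table index of an exact match, or -1 if the CPU is unknown. */
int find_cpu_index(unsigned int family, unsigned int model,
		   unsigned int stepping)
{
	int i;

	for (i = 0; i < CPU_ID_COUNT; i++) {
		if (cpu_ids[i].family == family &&
		    cpu_ids[i].model == model &&
		    cpu_ids[i].stepping == stepping)
			return i;
	}
	return -1;
}
```

With an exact-match table of this kind, any stepping that is not listed is simply rejected, which is why a T43 reporting stepping 8 would need its own entry.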

Auke

2006-09-14 18:15:58

by Vitaly Wool

Subject: Re: [linux-pm] OpPoint summary

On 9/14/06, Auke Kok <[email protected]> wrote:
> David Singleton wrote:
>
> > +static const struct cpu_id cpu_ids[] = {
> > + [CPU_BANIAS] = { 6, 9, 5 },
> > + [CPU_DOTHAN_A1] = { 6, 13, 1 },
> > + [CPU_DOTHAN_A2] = { 6, 13, 2 },
> > + [CPU_DOTHAN_B0] = { 6, 13, 6 },
> > + [CPU_MP4HT_D0] = {15, 3, 4 },
> > + [CPU_MP4HT_E0] = {15, 4, 1 },
> > +};
>
>
> Any reason why { 6, 13, 8 } is missing? My lenovo T43 identifies itself as such:
>
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 13
> model name : Intel(R) Pentium(R) M processor 1.86GHz
> stepping : 8
>
> I'm not sure a Dothan B1 exists, but some postings suggest even C0 and C1 are
> valid steppings. I'm sure OpPoint could work with those as well.

Heh, that pretty much shows that the approach itself is not good...
And this is only the beginning.

Vitaly

2006-09-14 18:17:42

by David Singleton

Subject: Re: OpPoint summary

On 9/14/06, Auke Kok <[email protected]> wrote:
> David Singleton wrote:
>
> > +static const struct cpu_id cpu_ids[] = {
> > + [CPU_BANIAS] = { 6, 9, 5 },
> > + [CPU_DOTHAN_A1] = { 6, 13, 1 },
> > + [CPU_DOTHAN_A2] = { 6, 13, 2 },
> > + [CPU_DOTHAN_B0] = { 6, 13, 6 },
> > + [CPU_MP4HT_D0] = {15, 3, 4 },
> > + [CPU_MP4HT_E0] = {15, 4, 1 },
> > +};
>
>
> Any reason why { 6, 13, 8 } is missing? My lenovo T43 identifies itself as such:
>
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 13
> model name : Intel(R) Pentium(R) M processor 1.86GHz
> stepping : 8
>
> I'm not sure a Dothan B1 exists, but some postings suggest even C0 and C1 are
> valid steppings. I'm sure OpPoint could work with those as well.

Yes, it could. The Centrino was the first platform I tested on, and I used the
existing speedstep-centrino code from cpufreq. The 1.86GHz part was not in
the cpufreq base. But you can see how easy it is to add new operating points
for a new CPU.

Adding new platform support is quite straightforward. It basically requires
a function to transition to the new operating point and the parameters needed
for the transition.
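
As a rough illustration of that claim, the sketch below models an operating point as a name, its parameters, and three callbacks, in the spirit of struct oppoint from the core patch. It is a userspace sketch under stated assumptions, not the kernel code: struct op_point, change_point() and the demo callbacks are hypothetical names, and this simplified version stops at the first failing phase and reports that error.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for an operating point. */
struct op_point {
	const char *name;
	unsigned int frequency;	/* kHz */
	unsigned int voltage;	/* mV */
	int (*prepare_transition)(struct op_point *cur, struct op_point *next);
	int (*transition)(struct op_point *cur, struct op_point *next);
	int (*finish_transition)(struct op_point *cur, struct op_point *next);
};

/*
 * Move from *current to next in three phases. On any failure the current
 * point is left unchanged and the failing phase's error is returned.
 */
int change_point(struct op_point **current, struct op_point *next)
{
	int error;

	if (*current == next)
		return 0;	/* already there, nothing to do */

	error = next->prepare_transition(*current, next);
	if (error)
		return error;

	error = next->transition(*current, next);
	if (error)
		return error;

	error = next->finish_transition(*current, next);
	if (error)
		return error;

	*current = next;
	return 0;
}

/* Demo callbacks standing in for platform code. */
int demo_ok(struct op_point *cur, struct op_point *next)
{
	(void)cur;
	(void)next;
	return 0;
}

int demo_fail(struct op_point *cur, struct op_point *next)
{
	(void)cur;
	(void)next;
	return -1;
}
```

A platform port would then amount to filling in one such structure per operating point and pointing the callbacks at its own transition code.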

David
>
> Auke
>

2006-09-17 05:07:19

by David Singleton

Subject: Re: OpPoint summary

Pavel and Greg,

I've incorporated Pavel's suggestions and only put suspend states
in the /sys/power/state file. The control file for frequency and
voltage operating point transitions is now in
/sys/power/operating_points/current_point.

The /sys/power/operating_points directory still contains the operating
points themselves, with a frequency, voltage and latency file
for each operating point.
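
For readers who want to picture the layout from userspace, here is a minimal sketch, assuming the files behave as described above: one directory per operating point holding frequency, voltage and latency files, plus a current_point control file that accepts a point's name. The helper names oppoint_attr_path() and oppoint_request() are hypothetical and are not part of oppointd.

```c
#include <stdio.h>

#define OPPOINT_DIR "/sys/power/operating_points"

/*
 * Build "<base>/<point>/<attr>", e.g. the frequency file of one point.
 * Returns 0 on success, -1 if the path does not fit in the buffer.
 */
int oppoint_attr_path(char *out, size_t outlen, const char *base,
		      const char *point, const char *attr)
{
	int n = snprintf(out, outlen, "%s/%s/%s", base, point, attr);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Request a transition by writing the point's name to
 * "<base>/current_point". Returns 0 on success, -1 on any error.
 */
int oppoint_request(const char *base, const char *point)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/current_point", base);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%s\n", point) > 0;
	if (fclose(f) != 0)
		ok = 0;	/* the write may only be reported at close time */

	return ok ? 0 : -1;
}
```

Since the patch creates current_point with mode 0600, a call such as oppoint_request(OPPOINT_DIR, "lowest") would presumably need to run as root.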

The oppointd power manager has been changed to use the
new control file for operating points. It has been tested on
a centrino laptop, the 4 way Xeon server and the arm-pxa27x.

I finally got SMP tested on a 4 way Xeon server. The patch
that supports SMP Xeons is the oppoint-x86-p4.patch in the series.

The only files in the core framework patch now are:

kernel/power/main.c
include/linux/pm.h
kernel/power/power.h

The full patch set is at

http://source.mvista.com/~dsingleton/2.6.18-rc7

The power manager source and patch is at:

http://source.mvista.com/~dsingleton/oppointd-1.2.3

Attached is the oppoint-core.patch.

David

Signed-Off-by: David Singleton <[email protected]>

include/linux/pm.h | 30 +++-
kernel/power/main.c | 361 +++++++++++++++++++++++++++++++++++++++++++++------
kernel/power/power.h | 2
3 files changed, 350 insertions(+), 43 deletions(-)

Index: linux-2.6.17/kernel/power/main.c
===================================================================
--- linux-2.6.17.orig/kernel/power/main.c
+++ linux-2.6.17/kernel/power/main.c
@@ -16,6 +16,7 @@
#include <linux/init.h>
#include <linux/pm.h>
#include <linux/console.h>
+#include <linux/module.h>

#include "power.h"

@@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops)
* the platform can enter the requested state.
*/

-static int suspend_prepare(suspend_state_t state)
+static int suspend_prepare(struct oppoint * state)
{
int error = 0;
unsigned int free_pages;
@@ -82,7 +83,7 @@ static int suspend_prepare(suspend_state
}

if (pm_ops->prepare) {
- if ((error = pm_ops->prepare(state)))
+ if ((error = pm_ops->prepare(state->type)))
goto Thaw;
}

@@ -94,7 +95,7 @@ static int suspend_prepare(suspend_state
return 0;
Finish:
if (pm_ops->finish)
- pm_ops->finish(state);
+ pm_ops->finish(state->type);
Thaw:
thaw_processes();
Enable_cpu:
@@ -104,7 +105,7 @@ static int suspend_prepare(suspend_state
}


-int suspend_enter(suspend_state_t state)
+int suspend_enter(struct oppoint * state)
{
int error = 0;
unsigned long flags;
@@ -115,7 +116,7 @@ int suspend_enter(suspend_state_t state)
printk(KERN_ERR "Some devices failed to power down\n");
goto Done;
}
- error = pm_ops->enter(state);
+ error = pm_ops->enter(state->type);
device_power_up();
Done:
local_irq_restore(flags);
@@ -131,36 +132,82 @@ int suspend_enter(suspend_state_t state)
* console that we've allocated. This is not called for suspend-to-disk.
*/

-static void suspend_finish(suspend_state_t state)
+static void suspend_finish(struct oppoint * state)
{
device_resume();
resume_console();
thaw_processes();
enable_nonboot_cpus();
if (pm_ops && pm_ops->finish)
- pm_ops->finish(state);
+ pm_ops->finish(state->type);
pm_restore_console();
}

+struct list_head pm_list;
+static struct oppoint standby = {
+ .name = "standby",
+ .type = PM_SUSPEND_STANDBY,
+};

+static struct oppoint mem = {
+ .name = "mem",
+ .type = PM_SUSPEND_MEM,
+ .frequency = 0,
+ .voltage = 0,
+ .latency = 150,
+};

-
-static const char * const pm_states[PM_SUSPEND_MAX] = {
- [PM_SUSPEND_STANDBY] = "standby",
- [PM_SUSPEND_MEM] = "mem",
#ifdef CONFIG_SOFTWARE_SUSPEND
- [PM_SUSPEND_DISK] = "disk",
+struct oppoint disk = {
+ .name = "disk",
+ .type = PM_SUSPEND_DISK,
+};
#endif
+
+struct oppoint pm_states = {
+ .name = "default",
+ .type = PM_FREQ_CHANGE,
};
+struct oppoint *current_state;
+
+/*
+ *
+ */
+static int pm_change_state(struct oppoint *state)
+{
+ int error = 0;
+
+ printk("OpPoint: changing from %s to %s\n", current_state->name,
+ state->name);
+ /*
+ * compare to current operating point.
+ * if different change to new operating point.
+ */
+ if (current_state == state)
+ goto out;
+
+ if ((error = state->prepare_transition(current_state, state)))
+ goto out;
+
+ if ((error = state->transition(current_state, state)))
+ state = current_state;
+
+ if ((error = state->finish_transition(current_state, state)) == 0)
+ current_state = state;
+
+out:
+ printk("OpPoint: State change returned %d\n", error);
+ return error;
+}

-static inline int valid_state(suspend_state_t state)
+static inline int valid_state(struct oppoint * state)
{
/* Suspend-to-disk does not really need low-level support.
* It can work with reboot if needed. */
- if (state == PM_SUSPEND_DISK)
+ if (state->type == PM_SUSPEND_DISK)
return 1;

- if (pm_ops && pm_ops->valid && !pm_ops->valid(state))
+ if (pm_ops && pm_ops->valid && !pm_ops->valid(state->type))
return 0;
return 1;
}
@@ -168,7 +215,7 @@ static inline int valid_state(suspend_st

/**
* enter_state - Do common work of entering low-power state.
- * @state: pm_state structure for state we're entering.
+ * @state: oppoint structure for state we're entering.
*
* Make sure we're the only ones trying to enter a sleep state. Fail
* if someone has beat us to it, since we don't want anything weird to
@@ -177,7 +224,7 @@ static inline int valid_state(suspend_st
* we've woken up).
*/

-static int enter_state(suspend_state_t state)
+static int enter_state(struct oppoint *state)
{
int error;

@@ -186,16 +233,21 @@ static int enter_state(suspend_state_t s
if (down_trylock(&pm_sem))
return -EBUSY;

- if (state == PM_SUSPEND_DISK) {
+ if (state->type == PM_SUSPEND_DISK) {
error = pm_suspend_disk();
goto Unlock;
}

- pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
+ if (state->type == PM_FREQ_CHANGE || state->type == PM_VOLT_CHANGE) {
+ error = pm_change_state(state);
+ goto Unlock;
+ }
+
+ pr_debug("PM: Preparing system for %s sleep\n", state->name);
if ((error = suspend_prepare(state)))
goto Unlock;

- pr_debug("PM: Entering %s sleep\n", pm_states[state]);
+ pr_debug("PM: Entering %s sleep\n", state->name);
error = suspend_enter(state);

pr_debug("PM: Finishing wakeup.\n");
@@ -211,7 +263,15 @@ static int enter_state(suspend_state_t s
*/
int software_suspend(void)
{
- return enter_state(PM_SUSPEND_DISK);
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_list;
+ int error = 0;
+
+ list_for_each_entry_safe(this, next, head, list) {
+ if (this->type == PM_SUSPEND_DISK)
+ error= enter_state(this);
+ }
+ return error;
}


@@ -223,9 +283,9 @@ int software_suspend(void)
* structure, and enter (above).
*/

-int pm_suspend(suspend_state_t state)
+int pm_suspend(struct oppoint * state)
{
- if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
+ if (state->type > PM_SUSPEND_ON && state->type <= PM_SUSPEND_MAX)
return enter_state(state);
return -EINVAL;
}
@@ -248,36 +308,35 @@ decl_subsys(power,NULL,NULL);

static ssize_t state_show(struct subsystem * subsys, char * buf)
{
- int i;
- char * s = buf;
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_list;
+ char *s = buf;
+
+ list_for_each_entry_safe(this, next, head, list)
+ s += sprintf(s,"%s ", this->name);

- for (i = 0; i < PM_SUSPEND_MAX; i++) {
- if (pm_states[i] && valid_state(i))
- s += sprintf(s,"%s ", pm_states[i]);
- }
s += sprintf(s,"\n");
+
return (s - buf);
}

static ssize_t state_store(struct subsystem * subsys, const char * buf, size_t n)
{
- suspend_state_t state = PM_SUSPEND_STANDBY;
- const char * const *s;
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_list;
char *p;
- int error;
+ int error = -EINVAL;
int len;

p = memchr(buf, '\n', n);
len = p ? p - buf : n;
-
- for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
- if (*s && !strncmp(buf, *s, len))
+ list_for_each_entry_safe(this, next, head, list) {
+ if ((strlen(this->name) == len) &&
+ (!strncmp(this->name, buf, len))) {
+ error = enter_state(this);
break;
+ }
}
- if (state < PM_SUSPEND_MAX && *s)
- error = enter_state(state);
- else
- error = -EINVAL;
return error ? error : n;
}

@@ -292,12 +351,234 @@ static struct attribute_group attr_group
.attrs = g,
};

+static struct kobject oppoint_kobj = {
+ .kset = &power_subsys.kset,
+};
+
+struct oppoint_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject * kobj, char * buf);
+ ssize_t (*store)(struct kobject * kobj, const char * buf, size_t count);
+};
+
+#define to_oppoint(obj) container_of(obj,struct oppoint,kobj)
+#define to_oppoint_attr(_attr) container_of(_attr,struct oppoint_attribute,attr)
+/*
+ * the frequency, voltage and latency files are readonly
+ */
+
+static ssize_t oppoint_voltage_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->voltage);
+
+ return len;
+}
+
+static ssize_t oppoint_voltage_store(struct kobject * kobj, const char * buf,
+ size_t n)
+{
+ return -EINVAL;
+
+}
+
+static ssize_t oppoint_frequency_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->frequency);
+
+ return len;
+}
+
+static ssize_t oppoint_frequency_store(struct kobject * kobj,
+ const char * buf, size_t n)
+{
+ return -EINVAL;
+
+}
+
+static ssize_t oppoint_point_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+
+ len = sprintf(buf, "%s\n", current_state->name);
+
+ return len;
+}
+
+static ssize_t oppoint_point_store(struct kobject * kobj, const char * buf,
+ size_t n)
+{
+ struct oppoint *this, *next;
+ struct list_head *head = &pm_states.list;
+ char *p;
+ int error = -EINVAL;
+ int len;
+
+ p = memchr(buf, '\n', n);
+ len = p ? p - buf : n;
+ list_for_each_entry_safe(this, next, head, list) {
+ if ((strlen(this->name) == len) &&
+ (!strncmp(this->name, buf, len))) {
+ error = enter_state(this);
+ break;
+ }
+ }
+ return error ? error : n;
+}
+
+static ssize_t oppoint_latency_show(struct kobject * kobj, char * buf)
+{
+ ssize_t len;
+ struct oppoint *opt = to_oppoint(kobj);
+
+ len = sprintf(buf, "%8d\n", opt->latency);
+
+ return len;
+}
+
+static ssize_t oppoint_latency_store(struct kobject * kobj,
+ const char * buf, size_t n)
+{
+ return -EINVAL;
+
+}
+
+static struct oppoint_attribute point_attr = {
+ .attr = {
+ .name = "current_point",
+ .mode = 0600,
+ },
+ .show = oppoint_point_show,
+ .store = oppoint_point_store,
+};
+
+static struct oppoint_attribute frequency_attr = {
+ .attr = {
+ .name = "frequency",
+ .mode = 0400,
+ },
+ .show = oppoint_frequency_show,
+ .store = oppoint_frequency_store,
+};
+
+static struct oppoint_attribute voltage_attr = {
+ .attr = {
+ .name = "voltage",
+ .mode = 0400,
+ },
+ .show = oppoint_voltage_show,
+ .store = oppoint_voltage_store,
+};
+
+static struct oppoint_attribute latency_attr = {
+ .attr = {
+ .name = "latency",
+ .mode = 0400,
+ },
+ .show = oppoint_latency_show,
+ .store = oppoint_latency_store,
+};
+
+static ssize_t
+oppoint_attr_show(struct kobject * kobj, struct attribute * attr, char * buf)
+{
+ struct oppoint_attribute * opt_attr = to_oppoint_attr(attr);
+ ssize_t ret = 0;
+
+ if (opt_attr->show)
+ ret = opt_attr->show(kobj,buf);
+ return ret;
+}
+
+static ssize_t
+oppoint_attr_store(struct kobject * kobj, struct attribute * attr,
+ const char * buf, size_t count)
+{
+ return -EINVAL;
+}
+
+static void oppoint_kobj_release(struct kobject *kobj)
+{
+ return;
+}
+
+static struct sysfs_ops oppoint_sysfs_ops = {
+ .show = oppoint_attr_show,
+ .store = oppoint_attr_store,
+};
+
+static struct attribute * oppoint_default_attrs[] = {
+ &frequency_attr.attr,
+ &voltage_attr.attr,
+ &latency_attr.attr,
+ NULL,
+};
+
+static struct kobj_type ktype_operating_point = {
+ .release = oppoint_kobj_release,
+ .sysfs_ops = &oppoint_sysfs_ops,
+ .default_attrs = oppoint_default_attrs,
+};
+
+int unregister_operating_point(struct oppoint *opt)
+{
+ down(&pm_sem);
+ list_del_init(&opt->list);
+ sysfs_remove_file(&opt->kobj, &frequency_attr.attr);
+ sysfs_remove_file(&opt->kobj, &voltage_attr.attr);
+ sysfs_remove_file(&opt->kobj, &latency_attr.attr);
+ up(&pm_sem);
+
+ return 0;
+}
+EXPORT_SYMBOL(unregister_operating_point);
+
+int register_operating_point(struct oppoint *opt)
+{
+ down(&pm_sem);
+ kobject_set_name(&opt->kobj, opt->name);
+ opt->kobj.kset = &power_subsys.kset;
+ opt->kobj.parent = &oppoint_kobj;
+ opt->kobj.ktype = &ktype_operating_point;
+ kobject_register(&opt->kobj);
+
+ sysfs_create_file(&opt->kobj, &frequency_attr.attr);
+ sysfs_create_file(&opt->kobj, &voltage_attr.attr);
+ sysfs_create_file(&opt->kobj, &latency_attr.attr);
+
+ list_add_tail(&opt->list, &pm_states.list);
+ up(&pm_sem);
+ return 0;
+}
+EXPORT_SYMBOL(register_operating_point);

static int __init pm_init(void)
{
+
int error = subsystem_register(&power_subsys);
- if (!error)
+ if (!error) {
error = sysfs_create_group(&power_subsys.kset.kobj,&attr_group);
+ kobject_set_name(&oppoint_kobj, "operating_points");
+ kobject_register(&oppoint_kobj);
+ sysfs_create_file(&oppoint_kobj, &point_attr.attr);
+ }
+
+
+ INIT_LIST_HEAD(&pm_states.list);
+ INIT_LIST_HEAD(&pm_list);
+
+#ifdef CONFIG_SOFTWARE_SUSPEND
+ list_add(&disk.list, &pm_list);
+#endif
+ list_add(&standby.list, &pm_list);
+ list_add(&mem.list, &pm_list);
+ current_state = &pm_states;
+
return error;
}

Index: linux-2.6.17/include/linux/pm.h
===================================================================
--- linux-2.6.17.orig/include/linux/pm.h
+++ linux-2.6.17/include/linux/pm.h
@@ -24,6 +24,7 @@
#ifdef __KERNEL__

#include <linux/list.h>
+#include <linux/kobject.h>
#include <asm/atomic.h>

/*
@@ -108,7 +109,32 @@ typedef int __bitwise suspend_state_t;
#define PM_SUSPEND_STANDBY ((__force suspend_state_t) 1)
#define PM_SUSPEND_MEM ((__force suspend_state_t) 3)
#define PM_SUSPEND_DISK ((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX ((__force suspend_state_t) 5)
+#define PM_FREQ_CHANGE ((__force suspend_state_t) 5)
+#define PM_VOLT_CHANGE ((__force suspend_state_t) 6)
+#define PM_SUSPEND_MAX ((__force suspend_state_t) 7)
+
+struct oppoint {
+ struct list_head list;
+ suspend_state_t type;
+ char *name;
+ unsigned int flags;
+ unsigned int frequency; /* in KHz */
+ unsigned int voltage; /* mV */
+ unsigned int latency; /* transition latency in us */
+ int (*prepare_transition)(struct oppoint *cur, struct oppoint *new);
+ int (*transition)(struct oppoint *cur, struct oppoint *new);
+ int (*finish_transition)(struct oppoint *cur, struct oppoint *new);
+
+ void *md_data; /* arch dependent data */
+ struct kobject kobj;
+};
+
+
+extern struct oppoint pm_states;
+extern struct oppoint *current_state;
+extern int register_operating_point(struct oppoint *opt);
+extern int unregister_operating_point(struct oppoint *opt);
+struct notifier_block;

typedef int __bitwise suspend_disk_method_t;

@@ -128,7 +154,7 @@ struct pm_ops {

extern void pm_set_ops(struct pm_ops *);
extern struct pm_ops *pm_ops;
-extern int pm_suspend(suspend_state_t state);
+extern int pm_suspend(struct oppoint *state);


/*
Index: linux-2.6.17/kernel/power/power.h
===================================================================
--- linux-2.6.17.orig/kernel/power/power.h
+++ linux-2.6.17/kernel/power/power.h
@@ -113,4 +113,4 @@ extern int swsusp_resume(void);
extern int swsusp_read(void);
extern int swsusp_write(void);
extern void swsusp_close(void);
-extern int suspend_enter(suspend_state_t state);
+extern int suspend_enter(struct oppoint * state);

2006-09-17 12:58:32

by Pavel Machek

Subject: Re: OpPoint summary

On Sat 2006-09-16 22:07:15, David Singleton wrote:
> Pavel and Greg,
>
> I've incorporated Pavel's suggestions and only put suspend states
> in the /sys/power/state file. The control file for frequency and
> voltage operating
> point transitions is now in
> /sys/power/operating_points/current_point.

Ouch, and it still needs a description of your proposed user<->kernel
interface, an explanation of why it is a good thing (I do not think it is),
and probably an explanation of why you are competing with the PowerOP
framework rather than working with them.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-17 12:56:53

by Pavel Machek

Subject: Re: OpPoint summary

Hi!

> I've incorporated Pavel's suggestions and only put suspend states
> in the /sys/power/state file. The control file for frequency and

Ok...

> voltage operating
> point transitions is now in
> /sys/power/operating_points/current_point.

How do you handle SMP? What if I want full speed on CPU0 and 100MHz on
CPU1?
> --- linux-2.6.17.orig/kernel/power/main.c
> +++ linux-2.6.17/kernel/power/main.c
> @@ -16,6 +16,7 @@
> #include <linux/init.h>
> #include <linux/pm.h>
> #include <linux/console.h>
> +#include <linux/module.h>
>
> #include "power.h"
>
> @@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops)
> * the platform can enter the requested state.
> */
>
> -static int suspend_prepare(suspend_state_t state)
> +static int suspend_prepare(struct oppoint * state)

...so why this change?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-17 21:19:19

by Pavel Machek

Subject: Re: OpPoint summary

Hi!

> >Care to resend your patches in the proper format, through email so that
> >we can see them, and possibly get some testing in -mm if they look sane?
>
> Greg,
> here's the patch that implements operating points for different
> frequencies
> for the speedstep-centrino line of processors. Operating points are created
> in much the same manner that cpufreq tables are. This works for both
> simple implementations like the centrino and more complex SoC systems
> like the arm-pxa27x which has several clocks to control, and different clock
> divisors and multipliers.
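
To give a feel for the multiplier arithmetic mentioned above, here is a small sketch of the core-frequency calculation as the Mainstone patch in this thread performs it: a 13 MHz crystal times L in run mode, and times L and N in turbo mode, with N stored as 2*N. The function name pxa27x_core_khz() and the macro are hypothetical; only the arithmetic is taken from the patch.

```c
/* 13 MHz crystal expressed in kHz, as in the Mainstone patch. */
#define PXA27X_BASE_KHZ 13000u

/*
 * l:     run-mode multiplier L
 * two_n: turbo multiplier stored as 2*N (values below 2 are clamped to 2)
 * turbo: non-zero when the turbo bit is set
 *
 * Returns the core frequency in kHz.
 */
unsigned int pxa27x_core_khz(unsigned int l, unsigned int two_n, int turbo)
{
	if (two_n < 2)
		two_n = 2;

	if (turbo)
		return (PXA27X_BASE_KHZ * l * two_n) >> 1;	/* 13000 * L * N */

	/* Turbo off: N is ignored and the run frequency 13000 * L applies. */
	return PXA27X_BASE_KHZ * l;
}
```

For example, L = 16 with a stored 2N of 5 in turbo mode gives 520000 kHz, matching the 520 MHz operating point in the patch's table.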

> +static struct oppoint lowest = {
> + .name = "lowest",
> + .type = PM_FREQ_CHANGE,
> + .frequency = 0,
> + .voltage = 0,
> + .latency = 15,
> + .prepare_transition = cpufreq_prepare_transition,
> + .transition = centrino_transition,
> + .finish_transition = cpufreq_finish_transition,
> +};

We had a nice, descriptive interface... with numbers. Now you want to
introduce English state names... looks like a step back to me.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-17 22:43:15

by Matthew Locke

Subject: Re: [linux-pm] OpPoint summary

Dave,

I am confused as to why you continue to duplicate work already done in
PowerOP. We are still waiting to hear why you need to have a different
operating point interface. As Eugeny and I have pointed out before
(refer to the thread in my PowerOP vs OPpoint email), your interface
doesn't make sense on a multi-CPU system, SMP or otherwise. I
recommend you redo your patches to use PowerOP which already supports
this properly.


On Sep 16, 2006, at 10:07 PM, David Singleton wrote:

> Pavel and Greg,
>
> I've incorporated Pavel's suggestions and only put suspend states
> in the /sys/power/state file. The control file for frequency and
> voltage operating
> point transitions is now in /sys/power/operating_points/current_point.
>
> The /sys/power/operating_points directory still contains the operating
> points themselves, with a frequency, voltage and latency file
> for each operating point.
>
> The oppointd power manager has been changed to use the
> new control file for operating points. It has been tested on
> a centrino laptop, the 4 way Xeon server and the arm-pxa27x.
>
> I finally got SMP tested on a 4 way Xeon server. The patch
> that supports SMP Xeons is the oppoint-x86-p4.patch in the series.
>
> The only files in the core framework patch now are:
>
> kernel/power/main.c
> include/linux/pm.h
> kernel/power/power.h
>
> The full patch set is at
>
> http://source.mvista.com/~dsingleton/2.6.18-rc7
>
> The power manager source and patch is at:
>
> http://source.mvista.com/~dsingleton/oppointd-1.2.3
>
> Attached is the oppoint-core.patch.
>
> David
>
> Signed-Off-by: David Singleton <[email protected]>
>
> include/linux/pm.h | 30 +++-
> kernel/power/main.c | 361
> +++++++++++++++++++++++++++++++++++++++++++++------
> kernel/power/power.h | 2
> 3 files changed, 350 insertions(+), 43 deletions(-)
>
> Index: linux-2.6.17/kernel/power/main.c
> ===================================================================
> --- linux-2.6.17.orig/kernel/power/main.c
> +++ linux-2.6.17/kernel/power/main.c
> @@ -16,6 +16,7 @@
> #include <linux/init.h>
> #include <linux/pm.h>
> #include <linux/console.h>
> +#include <linux/module.h>
>
> #include "power.h"
>
> @@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops)
> * the platform can enter the requested state.
> */
>
> -static int suspend_prepare(suspend_state_t state)
> +static int suspend_prepare(struct oppoint * state)
> {
> int error = 0;
> unsigned int free_pages;
> @@ -82,7 +83,7 @@ static int suspend_prepare(suspend_state
> }
>
> if (pm_ops->prepare) {
> - if ((error = pm_ops->prepare(state)))
> + if ((error = pm_ops->prepare(state->type)))
> goto Thaw;
> }
>
> @@ -94,7 +95,7 @@ static int suspend_prepare(suspend_state
> return 0;
> Finish:
> if (pm_ops->finish)
> - pm_ops->finish(state);
> + pm_ops->finish(state->type);
> Thaw:
> thaw_processes();
> Enable_cpu:
> @@ -104,7 +105,7 @@ static int suspend_prepare(suspend_state
> }
>
>
> -int suspend_enter(suspend_state_t state)
> +int suspend_enter(struct oppoint * state)
> {
> int error = 0;
> unsigned long flags;
> @@ -115,7 +116,7 @@ int suspend_enter(suspend_state_t state)
> printk(KERN_ERR "Some devices failed to power down\n");
> goto Done;
> }
> - error = pm_ops->enter(state);
> + error = pm_ops->enter(state->type);
> device_power_up();
> Done:
> local_irq_restore(flags);
> @@ -131,36 +132,82 @@ int suspend_enter(suspend_state_t state)
> * console that we've allocated. This is not called for
> suspend-to-disk.
> */
>
> -static void suspend_finish(suspend_state_t state)
> +static void suspend_finish(struct oppoint * state)
> {
> device_resume();
> resume_console();
> thaw_processes();
> enable_nonboot_cpus();
> if (pm_ops && pm_ops->finish)
> - pm_ops->finish(state);
> + pm_ops->finish(state->type);
> pm_restore_console();
> }
>
> +struct list_head pm_list;
> +static struct oppoint standby = {
> + .name = "standby",
> + .type = PM_SUSPEND_STANDBY,
> +};
>
> +static struct oppoint mem = {
> + .name = "mem",
> + .type = PM_SUSPEND_MEM,
> + .frequency = 0,
> + .voltage = 0,
> + .latency = 150,
> +};
>
> -
> -static const char * const pm_states[PM_SUSPEND_MAX] = {
> - [PM_SUSPEND_STANDBY] = "standby",
> - [PM_SUSPEND_MEM] = "mem",
> #ifdef CONFIG_SOFTWARE_SUSPEND
> - [PM_SUSPEND_DISK] = "disk",
> +struct oppoint disk = {
> + .name = "disk",
> + .type = PM_SUSPEND_DISK,
> +};
> #endif
> +
> +struct oppoint pm_states = {
> + .name = "default",
> + .type = PM_FREQ_CHANGE,
> };
> +struct oppoint *current_state;
> +
> +/*
> + *
> + */
> +static int pm_change_state(struct oppoint *state)
> +{
> + int error = 0;
> +
> + printk("OpPoint: changing from %s to %s\n",
> current_state->name,
> + state->name);
> + /*
> + * compare to current operating point.
> + * if different change to new operating point.
> + */
> + if (current_state == state)
> + goto out;
> +
> + if ((error = state->prepare_transition(current_state, state)))
> + goto out;
> +
> + if ((error = state->transition(current_state, state)))
> + state = current_state;
> +
> + if ((error = state->finish_transition(current_state, state))
> == 0)
> + current_state = state;
> +
> +out:
> + printk("OpPoint: State change returned %d\n", error);
> + return error;
> +}
>
> -static inline int valid_state(suspend_state_t state)
> +static inline int valid_state(struct oppoint * state)
> {
> /* Suspend-to-disk does not really need low-level support.
> * It can work with reboot if needed. */
> - if (state == PM_SUSPEND_DISK)
> + if (state->type == PM_SUSPEND_DISK)
> return 1;
>
> - if (pm_ops && pm_ops->valid && !pm_ops->valid(state))
> + if (pm_ops && pm_ops->valid && !pm_ops->valid(state->type))
> return 0;
> return 1;
> }
> @@ -168,7 +215,7 @@ static inline int valid_state(suspend_st
>
> /**
> * enter_state - Do common work of entering low-power state.
> - * @state: pm_state structure for state we're entering.
> + * @state: oppoint structure for state we're entering.
> *
> * Make sure we're the only ones trying to enter a sleep state.
> Fail
> * if someone has beat us to it, since we don't want anything
> weird to
> @@ -177,7 +224,7 @@ static inline int valid_state(suspend_st
> * we've woken up).
> */
>
> -static int enter_state(suspend_state_t state)
> +static int enter_state(struct oppoint *state)
> {
> int error;
>
> @@ -186,16 +233,21 @@ static int enter_state(suspend_state_t s
> if (down_trylock(&pm_sem))
> return -EBUSY;
>
> - if (state == PM_SUSPEND_DISK) {
> + if (state->type == PM_SUSPEND_DISK) {
> error = pm_suspend_disk();
> goto Unlock;
> }
>
> - pr_debug("PM: Preparing system for %s sleep\n",
> pm_states[state]);
> + if (state->type == PM_FREQ_CHANGE || state->type ==
> PM_VOLT_CHANGE) {
> + error = pm_change_state(state);
> + goto Unlock;
> + }
> +
> + pr_debug("PM: Preparing system for %s sleep\n", state->name);
> if ((error = suspend_prepare(state)))
> goto Unlock;
>
> - pr_debug("PM: Entering %s sleep\n", pm_states[state]);
> + pr_debug("PM: Entering %s sleep\n", state->name);
> error = suspend_enter(state);
>
> pr_debug("PM: Finishing wakeup.\n");
> @@ -211,7 +263,15 @@ static int enter_state(suspend_state_t s
> */
> int software_suspend(void)
> {
> - return enter_state(PM_SUSPEND_DISK);
> + struct oppoint *this, *next;
> + struct list_head *head = &mem.list;
> + int error = 0;
> +
> + list_for_each_entry_safe(this, next, head, list) {
> + if (this->type == PM_SUSPEND_DISK)
> + error= enter_state(this);
> + }
> + return error;
> }
>
>
> @@ -223,9 +283,9 @@ int software_suspend(void)
> * structure, and enter (above).
> */
>
> -int pm_suspend(suspend_state_t state)
> +int pm_suspend(struct oppoint * state)
> {
> - if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
> + if (state->type > PM_SUSPEND_ON && state->type <=
> PM_SUSPEND_MAX)
> return enter_state(state);
> return -EINVAL;
> }
> @@ -248,36 +308,35 @@ decl_subsys(power,NULL,NULL);
>
> static ssize_t state_show(struct subsystem * subsys, char * buf)
> {
> - int i;
> - char * s = buf;
> + struct oppoint *this, *next;
> + struct list_head *head = &pm_list;
> + char *s = buf;
> +
> + list_for_each_entry_safe(this, next, head, list)
> + s += sprintf(s,"%s ", this->name);
>
> - for (i = 0; i < PM_SUSPEND_MAX; i++) {
> - if (pm_states[i] && valid_state(i))
> - s += sprintf(s,"%s ", pm_states[i]);
> - }
> s += sprintf(s,"\n");
> +
> return (s - buf);
> }
>
> static ssize_t state_store(struct subsystem * subsys, const char *
> buf, size_t n)
> {
> - suspend_state_t state = PM_SUSPEND_STANDBY;
> - const char * const *s;
> + struct oppoint *this, *next;
> + struct list_head *head = &mem.list;
> char *p;
> - int error;
> + int error = -EINVAL;
> int len;
>
> p = memchr(buf, '\n', n);
> len = p ? p - buf : n;
> -
> - for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++,
> state++) {
> - if (*s && !strncmp(buf, *s, len))
> + list_for_each_entry_safe(this, next, head, list) {
> + if ((strlen(this->name) == len) &&
> + (!strncmp(this->name, buf, len))) {
> + error = enter_state(this);
> break;
> + }
> }
> - if (state < PM_SUSPEND_MAX && *s)
> - error = enter_state(state);
> - else
> - error = -EINVAL;
> return error ? error : n;
> }
>
> @@ -292,12 +351,234 @@ static struct attribute_group attr_group
> .attrs = g,
> };
>
> +static struct kobject oppoint_kobj = {
> + .kset = &power_subsys.kset,
> +};
> +
> +struct oppoint_attribute {
> + struct attribute attr;
> + ssize_t (*show)(struct kobject * kobj, char * buf);
> + ssize_t (*store)(struct kobject * kobj, const char * buf,
> size_t count);
> +};
> +
> +#define to_oppoint(obj) container_of(obj,struct oppoint,kobj)
> +#define to_oppoint_attr(_attr) container_of(_attr,struct
> oppoint_attribute,attr)
> +/*
> + * the frequency, voltage and latency files are readonly
> + */
> +
> +static ssize_t oppoint_voltage_show(struct kobject * kobj, char * buf)
> +{
> + ssize_t len;
> + struct oppoint *opt = to_oppoint(kobj);
> +
> + len = sprintf(buf, "%8d\n", opt->voltage);
> +
> + return len;
> +}
> +
> +static ssize_t oppoint_voltage_store(struct kobject * kobj, const
> char * buf,
> + size_t n)
> +{
> + return -EINVAL;
> +
> +}
> +
> +static ssize_t oppoint_frequency_show(struct kobject * kobj, char *
> buf)
> +{
> + ssize_t len;
> + struct oppoint *opt = to_oppoint(kobj);
> +
> + len = sprintf(buf, "%8d\n", opt->frequency);
> +
> + return len;
> +}
> +
> +static ssize_t oppoint_frequency_store(struct kobject * kobj,
> + const char * buf, size_t n)
> +{
> + return -EINVAL;
> +
> +}
> +
> +static ssize_t oppoint_point_show(struct kobject * kobj, char * buf)
> +{
> + ssize_t len;
> +
> + len = sprintf(buf, "%s\n", current_state->name);
> +
> + return len;
> +}
> +
> +static ssize_t oppoint_point_store(struct kobject * kobj, const char
> * buf,
> + size_t n)
> +{
> + struct oppoint *this, *next;
> + struct list_head *head = &pm_states.list;
> + char *p;
> + int error = -EINVAL;
> + int len;
> +
> + p = memchr(buf, '\n', n);
> + len = p ? p - buf : n;
> + list_for_each_entry_safe(this, next, head, list) {
> + if ((strlen(this->name) == len) &&
> + (!strncmp(this->name, buf, len))) {
> + error = enter_state(this);
> + break;
> + }
> + }
> + return error ? error : n;
> +}
> +
> +static ssize_t oppoint_latency_show(struct kobject * kobj, char * buf)
> +{
> + ssize_t len;
> + struct oppoint *opt = to_oppoint(kobj);
> +
> + len = sprintf(buf, "%8d\n", opt->latency);
> +
> + return len;
> +}
> +
> +static ssize_t oppoint_latency_store(struct kobject * kobj,
> + const char * buf, size_t n)
> +{
> + return -EINVAL;
> +
> +}
> +
> +static struct oppoint_attribute point_attr = {
> + .attr = {
> + .name = "current_point",
> + .mode = 0600,
> + },
> + .show = oppoint_point_show,
> + .store = oppoint_point_store,
> +};
> +
> +static struct oppoint_attribute frequency_attr = {
> + .attr = {
> + .name = "frequency",
> + .mode = 0400,
> + },
> + .show = oppoint_frequency_show,
> + .store = oppoint_frequency_store,
> +};
> +
> +static struct oppoint_attribute voltage_attr = {
> + .attr = {
> + .name = "voltage",
> + .mode = 0400,
> + },
> + .show = oppoint_voltage_show,
> + .store = oppoint_voltage_store,
> +};
> +
> +static struct oppoint_attribute latency_attr = {
> + .attr = {
> + .name = "latency",
> + .mode = 0400,
> + },
> + .show = oppoint_latency_show,
> + .store = oppoint_latency_store,
> +};
> +
> +static ssize_t
> +oppoint_attr_show(struct kobject * kobj, struct attribute * attr,
> char * buf)
> +{
> + struct oppoint_attribute * opt_attr = to_oppoint_attr(attr);
> + ssize_t ret = 0;
> +
> + if (opt_attr->show)
> + ret = opt_attr->show(kobj,buf);
> + return ret;
> +}
> +
> +static ssize_t
> +oppoint_attr_store(struct kobject * kobj, struct attribute * attr,
> + const char * buf, size_t count)
> +{
> + return -EINVAL;
> +}
> +
> +static void oppoint_kobj_release(struct kobject *kobj)
> +{
> + return;
> +}
> +
> +static struct sysfs_ops oppoint_sysfs_ops = {
> + .show = oppoint_attr_show,
> + .store = oppoint_attr_store,
> +};
> +
> +static struct attribute * oppoint_default_attrs[] = {
> + &frequency_attr.attr,
> + &voltage_attr.attr,
> + &latency_attr.attr,
> + NULL,
> +};
> +
> +static struct kobj_type ktype_operating_point = {
> + .release = oppoint_kobj_release,
> + .sysfs_ops = &oppoint_sysfs_ops,
> + .default_attrs = oppoint_default_attrs,
> +};
> +
> +int unregister_operating_point(struct oppoint *opt)
> +{
> + down(&pm_sem);
> + list_del_init(&opt->list);
> + sysfs_remove_file(&opt->kobj, &frequency_attr.attr);
> + sysfs_remove_file(&opt->kobj, &voltage_attr.attr);
> + sysfs_remove_file(&opt->kobj, &latency_attr.attr);
> + up(&pm_sem);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(unregister_operating_point);
> +
> +int register_operating_point(struct oppoint *opt)
> +{
> + down(&pm_sem);
> + kobject_set_name(&opt->kobj, opt->name);
> + opt->kobj.kset = &power_subsys.kset;
> + opt->kobj.parent = &oppoint_kobj;
> + opt->kobj.ktype = &ktype_operating_point;
> + kobject_register(&opt->kobj);
> +
> + sysfs_create_file(&opt->kobj, &frequency_attr.attr);
> + sysfs_create_file(&opt->kobj, &voltage_attr.attr);
> + sysfs_create_file(&opt->kobj, &latency_attr.attr);
> +
> + list_add_tail(&opt->list, &pm_states.list);
> + up(&pm_sem);
> + return 0;
> +}
> +EXPORT_SYMBOL(register_operating_point);
>
> static int __init pm_init(void)
> {
> +
> int error = subsystem_register(&power_subsys);
> - if (!error)
> + if (!error) {
> error =
> sysfs_create_group(&power_subsys.kset.kobj,&attr_group);
> + kobject_set_name(&oppoint_kobj, "operating_points");
> + kobject_register(&oppoint_kobj);
> + sysfs_create_file(&oppoint_kobj, &point_attr.attr);
> + }
> +
> +
> + INIT_LIST_HEAD(&pm_states.list);
> + INIT_LIST_HEAD(&pm_list);
> +
> +#ifdef CONFIG_SOFTWARE_SUSPEND
> + list_add(&disk.list, &pm_list);
> +#endif
> + list_add(&standby.list, &pm_list);
> + list_add(&mem.list, &pm_list);
> + current_state = &pm_states;
> +
> return error;
> }
>
> Index: linux-2.6.17/include/linux/pm.h
> ===================================================================
> --- linux-2.6.17.orig/include/linux/pm.h
> +++ linux-2.6.17/include/linux/pm.h
> @@ -24,6 +24,7 @@
> #ifdef __KERNEL__
>
> #include <linux/list.h>
> +#include <linux/kobject.h>
> #include <asm/atomic.h>
>
> /*
> @@ -108,7 +109,32 @@ typedef int __bitwise suspend_state_t;
> #define PM_SUSPEND_STANDBY ((__force suspend_state_t) 1)
> #define PM_SUSPEND_MEM ((__force suspend_state_t) 3)
> #define PM_SUSPEND_DISK ((__force suspend_state_t) 4)
> -#define PM_SUSPEND_MAX ((__force suspend_state_t) 5)
> +#define PM_FREQ_CHANGE ((__force suspend_state_t) 5)
> +#define PM_VOLT_CHANGE ((__force suspend_state_t) 6)
> +#define PM_SUSPEND_MAX ((__force suspend_state_t) 7)
> +
> +struct oppoint {
> + struct list_head list;
> + suspend_state_t type;
> + char *name;
> + unsigned int flags;
> + unsigned int frequency; /* in KHz */
> + unsigned int voltage; /* mV */
> + unsigned int latency; /* transition latency in us */
> + int (*prepare_transition)(struct oppoint *cur, struct
> oppoint *new);
> + int (*transition)(struct oppoint *cur, struct oppoint
> *new);
> + int (*finish_transition)(struct oppoint *cur, struct
> oppoint *new);
> +
> + void *md_data; /* arch dependent data */
> + struct kobject kobj;
> +};
> +
> +
> +extern struct oppoint pm_states;
> +extern struct oppoint *current_state;
> +extern int register_operating_point(struct oppoint *opt);
> +extern int unregister_operating_point(struct oppoint *opt);
> +struct notifier_block;
>
> typedef int __bitwise suspend_disk_method_t;
>
> @@ -128,7 +154,7 @@ struct pm_ops {
>
> extern void pm_set_ops(struct pm_ops *);
> extern struct pm_ops *pm_ops;
> -extern int pm_suspend(suspend_state_t state);
> +extern int pm_suspend(struct oppoint *state);
>
>
> /*
> Index: linux-2.6.17/kernel/power/power.h
> ===================================================================
> --- linux-2.6.17.orig/kernel/power/power.h
> +++ linux-2.6.17/kernel/power/power.h
> @@ -113,4 +113,4 @@ extern int swsusp_resume(void);
> extern int swsusp_read(void);
> extern int swsusp_write(void);
> extern void swsusp_close(void);
> -extern int suspend_enter(suspend_state_t state);
> +extern int suspend_enter(struct oppoint * state);
> _______________________________________________
> linux-pm mailing list
> [email protected]
> https://lists.osdl.org/mailman/listinfo/linux-pm
>

2006-09-18 14:33:47

by Richard Griffiths

Subject: Re: [linux-pm] OpPoint summary

On Sun, 2006-09-17 at 19:48 +0200, Pavel Machek wrote:
> Hi!
>
> > >Care to resend your patches in the proper format, through email so that
> > >we can see them, and possibly get some testing in -mm if they look sane?
> >
> > Greg,
> > here's the patch that implements operating points for different
> > frequencies
> > for the speedstep-centrino line of processors. Operating points are created
> > in much the same manner that cpufreq tables are. This works for both
> > simple implementations like the centrino and more complex SoC systems
> > like the arm-pxa72x which has several clocks to control, and different clock
> > divisors and multipliers.
>
> > +static struct oppoint lowest = {
> > + .name = "lowest",
> > + .type = PM_FREQ_CHANGE,
> > + .frequency = 0,
> > + .voltage = 0,
> > + .latency = 15,
> > + .prepare_transition = cpufreq_prepare_transition,
> > + .transition = centrino_transition,
> > + .finish_transition = cpufreq_finish_transition,
> > +};
>
> We had nice, descriptive interface... with numbers. Now you want to
> introduce english state names... looks like a step back to me.

Maybe a compromise could be reached where a defined set of numbers maps
to string names, a la Unix init states. Many people (me included) still
invoke init 6 to reboot a system. A defined table would satisfy both
the number and string camps.
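
As a rough illustration of such a table, here is a minimal userspace
sketch in C. Only the name "lowest" comes from the quoted patch; the
other names, the numbers, and the helper names are hypothetical and
stand in for whatever a platform would actually define.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical alias table: a fixed number per operating point name,
 * in the spirit of init runlevels. Not part of the posted patch.
 */
struct oppoint_alias {
	int number;        /* stable numeric id */
	const char *name;  /* string name exposed by the platform */
};

static const struct oppoint_alias oppoint_aliases[] = {
	{ 0, "lowest"  },  /* name taken from the quoted patch */
	{ 1, "low"     },  /* assumed */
	{ 2, "high"    },  /* assumed */
	{ 3, "highest" },  /* assumed */
};

#define N_ALIASES (sizeof(oppoint_aliases) / sizeof(oppoint_aliases[0]))

/* Return the name for a number, or NULL if the number is not defined. */
static const char *oppoint_name_from_number(int number)
{
	size_t i;

	for (i = 0; i < N_ALIASES; i++)
		if (oppoint_aliases[i].number == number)
			return oppoint_aliases[i].name;
	return NULL;
}

/* Return the number for a name, or -1 if the name is not defined. */
static int oppoint_number_from_name(const char *name)
{
	size_t i;

	for (i = 0; i < N_ALIASES; i++)
		if (strcmp(oppoint_aliases[i].name, name) == 0)
			return oppoint_aliases[i].number;
	return -1;
}
```

With something like this, a write of either "0" or "lowest" could be
resolved to the same operating point before it is entered.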

Richard

2006-09-18 16:13:25

by Matthew Locke

Subject: Re: [linux-pm] OpPoint summary


On Sep 18, 2006, at 7:33 AM, Richard A. Griffiths wrote:

> On Sun, 2006-09-17 at 19:48 +0200, Pavel Machek wrote:
>> Hi!
>>
>>>> Care to resend your patches in the proper format, through email so
>>>> that
>>>> we can see them, and possibly get some testing in -mm if they look
>>>> sane?
>>>
>>> Greg,
>>> here's the patch that implements operating points for different
>>> frequencies
>>> for the speedstep-centrino line of processors. Operating points are
>>> created
>>> in much the same manner that cpufreq tables are. This works for both
>>> simple implementations like the centrino and more complex SoC systems
>>> like the arm-pxa72x which has several clocks to control, and
>>> different clock
>>> divisors and multipliers.
>>
>>> +static struct oppoint lowest = {
>>> + .name = "lowest",
>>> + .type = PM_FREQ_CHANGE,
>>> + .frequency = 0,
>>> + .voltage = 0,
>>> + .latency = 15,
>>> + .prepare_transition = cpufreq_prepare_transition,
>>> + .transition = centrino_transition,
>>> + .finish_transition = cpufreq_finish_transition,
>>> +};
>>
>> We had nice, descriptive interface... with numbers. Now you want to
>> introduce english state names... looks like a step back to me.
>
> Maybe a compromise could be reached where a defined set of numbers maps
> to string names, a la Unix init states. Many people (me included) still
> invoke init 6 to reboot a system. A defined table would satisfy both
> the number and string camps.

PowerOP allows the platform to define the name. In our cpufreq
integration patches, we reuse the same name that cpufreq centrino used.
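
To make the point about platform-defined names concrete, here is a
small userspace sketch in C. It assumes a platform that simply uses the
frequency in kHz as the name; that choice, the table contents, and the
names oppoint_model, platform_points and find_point are made up for
illustration and are not taken from the PowerOP or cpufreq patches. The
matching rule (trim a trailing newline, then require an exact-length
match) mirrors the state_store() code in the patch quoted earlier in
this thread.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down userspace stand-in for an operating point. */
struct oppoint_model {
	const char *name;        /* chosen by the platform code */
	unsigned int frequency;  /* in kHz */
};

/* Hypothetical platform table; names and frequencies are assumptions. */
static const struct oppoint_model platform_points[] = {
	{ "600000",  600000  },
	{ "1600000", 1600000 },
};

#define N_POINTS (sizeof(platform_points) / sizeof(platform_points[0]))

/*
 * Find the point named by a sysfs-style write: buf holds n bytes and
 * may end in a newline. Returns NULL when no name matches exactly.
 */
static const struct oppoint_model *find_point(const char *buf, size_t n)
{
	const char *p = memchr(buf, '\n', n);
	size_t len = p ? (size_t)(p - buf) : n;
	size_t i;

	for (i = 0; i < N_POINTS; i++) {
		const char *name = platform_points[i].name;

		/* Exact-length check so "60000" does not match "600000". */
		if (strlen(name) == len && strncmp(name, buf, len) == 0)
			return &platform_points[i];
	}
	return NULL;
}
```

Since the name is just a string owned by the platform, numeric names
and English names can go through the same lookup path.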

>
> Richard