2011-02-01 10:59:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH] Power domains for platform bus type

On Tuesday, February 01, 2011, Grant Likely wrote:
> On Mon, Jan 31, 2011 at 4:43 PM, Kevin Hilman <[email protected]> wrote:
> > "Rafael J. Wysocki" <[email protected]> writes:
> >
> >>> Also, what is the use case for having 2 sets of power_domain ops? My
> >>> gut tells me that you'd only want to do post ops on the
> >>> {freeze,suspend,poweroff} path and pre ops on the {resume,thaw,restore}
> >>> path. It seems overly engineered to me, but I may be missing
> >>> something fundamental.
> >>
> >> Well, that's a part of the RFC, actually. :-)
> >>
> >> For the subsystems I've worked with (PCI, ACPI, PNP to some extent) one set
> >> would be sufficient, but I don't know of every possible use case.
> >
> > For the on-chip SoC devices we're managing with OMAP, we're currently
> > only using one set: post ops on [runtime_]suspend and pre ops on
> > [runtime_]resume.
> >
> > However, I could imagine (at least conceptually) using the pre ops on
> > suspend to do some constraints checking and/or possibly some
> > management/notification of dependent devices. Another possiblity
> > (although possibly racy) would be using the pre ops on suspend to
> > initiate some high-latency operations.
> >
> > I guess the main problem with two sets is wasted space. e.g, if I move
> > OMAP to this (already hacking on it) there will be only 2 functions used
> > in post ops: [runtime_]suspend() and 2 used in pre ops [runtime_]_resume().

I agree with Alan that the wasted space is not substantial.

> There's a conceptual load added to the (human) reader too. Every
> additional hook point is an additional piece other engineers need to
> fit into their mental model. I'm resistant to having the two sets of
> ops without a definite use case for it when conceptually I can only
> imagine a need for one set.

There are two possibilities I think. First, we can use one set for now
and possibly add the other one in the future if it proves necessary. Second,
we can use two sets to start with and then drop one of them if no one finds it
useful.

I guess the first option is a cleaner one, so I'm going to move forward along
this line.

Also, I'm going to move the power domain hook to the struct device level, so
that we have:

struct device {
...
struct power_domain *domain;
...
};

struct power_domain {
struct dev_pm_ops ops;
};

This way, if there's a need to add the second set of operations, it will be
relatively easy to do that without major redesign.

Thanks,
Rafael


2011-02-01 16:48:29

by Kevin Hilman

[permalink] [raw]
Subject: Re: [RFC][PATCH] Power domains for platform bus type

"Rafael J. Wysocki" <[email protected]> writes:

[...]

> Also, I'm going to move the power domain hook to the struct device level, so
> that we have:
>
> struct device {
> ...
> struct power_domain *domain;
> ...
> };
>
> struct power_domain {
> struct dev_pm_ops ops;
> };
>

Great, thanks.

Any opinion on my idea to name this 'struct dev_power_domain' to avoid
confusion with existing powerdomain structs in SoC code? For example,
on OMAP we already have a 'struct powerdomain'.

Kevin

2011-02-01 18:40:15

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH] Power domains for platform bus type

On Tuesday, February 01, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> [...]
>
> > Also, I'm going to move the power domain hook to the struct device level, so
> > that we have:
> >
> > struct device {
> > ...
> > struct power_domain *domain;
> > ...
> > };
> >
> > struct power_domain {
> > struct dev_pm_ops ops;
> > };
> >
>
> Great, thanks.
>
> Any opinion on my idea to name this 'struct dev_power_domain' to avoid
> confusion with existing powerdomain structs in SoC code? For example,
> on OMAP we already have a 'struct powerdomain'.

Fine by me.

Thanks,
Rafael

2011-02-12 22:15:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 0/2] PM: Core power management modifications

Hi,

The two patches in this series modify the core power management code in the
following way:

[1/2] - Adds support for device power domains.
[2/2] - Reworks the core PM routines so that subsystems are treated in the
same way by the system-wide transitions code and the runtime PM code.

Comments welcome!

Thanks,
Rafael

2011-02-12 22:15:27

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

From: Rafael J. Wysocki <[email protected]>

The code handling system-wide power transitions (eg. suspend-to-RAM)
can in theory execute callbacks provided by the device's bus type,
device type and class in each phase of the power transition. In
turn, the runtime PM core code only calls one of those callbacks at
a time, preferring bus type callbacks to device type or class
callbacks and device type callbacks to class callbacks.

It seems reasonable to make them both behave in the same way in that
respect. Moreover, even though a device may belong to two subsystems
(eg. bus type and device class) simultaneously, in practice power
management callbacks for system-wide power transitions are always
provided by only one of them (ie. if the bus type callbacks are
defined, the device class ones are not and vice versa). Thus it is
possible to modify the code handling system-wide power transitions
so that it follows the core runtime PM code (ie. treats the
subsystem callbacks as mutually exclusive).

On the other hand, the core runtime PM code will choose to execute,
for example, a runtime suspend callback provided by the device type
even if the bus type's struct dev_pm_ops object exists, but the
runtime_suspend pointer in it happens to be NULL. This is confusing,
because it may lead to the execution of callbacks from different
subsystems during different operations (eg. the bus type suspend
callback may be executed during runtime suspend, while the device
type callback will be executed during runtime resume).

Make all of the power management code treat subsystem callbacks in
a consistent way, such that:
(1) If the device's bus type is defined (eg. dev->bus is not NULL)
and its pm pointer is not NULL, the callbacks from dev->bus->pm
will be used.
(2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
device type is defined (eg. dev->type is not NULL) and its pm
pointer is not NULL, the callbacks from dev->type->pm will be
used.
(3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
will be used provided that both dev->class and dev->class->pm
are not NULL.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Documentation/power/devices.txt | 28 +++----
drivers/base/power/main.c | 141 +++++++++++++++++-----------------------
drivers/base/power/runtime.c | 12 +--
3 files changed, 79 insertions(+), 102 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -433,21 +433,17 @@ static int device_resume_noirq(struct de
error = pm_noirq_op(dev, dev->bus->pm, state);
if (error)
goto End;
- }
-
- if (dev->type && dev->type->pm) {
+ } else if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "EARLY type ");
error = pm_noirq_op(dev, dev->type->pm, state);
if (error)
goto End;
- }
-
- if (dev->class && dev->class->pm) {
+ } else if (dev->class && dev->class->pm) {
pm_dev_dbg(dev, state, "EARLY class ");
error = pm_noirq_op(dev, dev->class->pm, state);
}

-End:
+ End:
TRACE_RESUME(error);
return error;
}
@@ -532,21 +528,18 @@ static int device_resume(struct device *
if (dev->bus->pm) {
pm_dev_dbg(dev, state, "");
error = pm_op(dev, dev->bus->pm, state);
+ goto End;
} else if (dev->bus->resume) {
pm_dev_dbg(dev, state, "legacy ");
error = legacy_resume(dev, dev->bus->resume);
- }
- if (error)
goto End;
+ }
}

- if (dev->type) {
- if (dev->type->pm) {
- pm_dev_dbg(dev, state, "type ");
- error = pm_op(dev, dev->type->pm, state);
- }
- if (error)
- goto End;
+ if (dev->type && dev->type->pm) {
+ pm_dev_dbg(dev, state, "type ");
+ error = pm_op(dev, dev->type->pm, state);
+ goto End;
}

if (dev->class) {
@@ -558,6 +551,7 @@ static int device_resume(struct device *
error = legacy_resume(dev, dev->class->resume);
}
}
+
End:
device_unlock(dev);
complete_all(&dev->power.completion);
@@ -644,19 +638,18 @@ static void device_complete(struct devic
dev->pwr_domain->ops.complete(dev);
}

- if (dev->class && dev->class->pm && dev->class->pm->complete) {
- pm_dev_dbg(dev, state, "completing class ");
- dev->class->pm->complete(dev);
- }
-
- if (dev->type && dev->type->pm && dev->type->pm->complete) {
- pm_dev_dbg(dev, state, "completing type ");
- dev->type->pm->complete(dev);
- }
-
- if (dev->bus && dev->bus->pm && dev->bus->pm->complete) {
+ if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "completing ");
- dev->bus->pm->complete(dev);
+ if (dev->bus->pm->complete)
+ dev->bus->pm->complete(dev);
+ } else if (dev->type && dev->type->pm) {
+ pm_dev_dbg(dev, state, "completing type ");
+ if (dev->type->pm->complete)
+ dev->type->pm->complete(dev);
+ } else if (dev->class && dev->class->pm) {
+ pm_dev_dbg(dev, state, "completing class ");
+ if (dev->class->pm->complete)
+ dev->class->pm->complete(dev);
}

device_unlock(dev);
@@ -741,27 +734,23 @@ static pm_message_t resume_event(pm_mess
*/
static int device_suspend_noirq(struct device *dev, pm_message_t state)
{
- int error = 0;
+ int error;

- if (dev->class && dev->class->pm) {
- pm_dev_dbg(dev, state, "LATE class ");
- error = pm_noirq_op(dev, dev->class->pm, state);
+ if (dev->bus && dev->bus->pm) {
+ pm_dev_dbg(dev, state, "LATE ");
+ error = pm_noirq_op(dev, dev->bus->pm, state);
if (error)
- goto End;
- }
-
- if (dev->type && dev->type->pm) {
+ return error;
+ } else if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "LATE type ");
error = pm_noirq_op(dev, dev->type->pm, state);
if (error)
- goto End;
- }
-
- if (dev->bus && dev->bus->pm) {
- pm_dev_dbg(dev, state, "LATE ");
- error = pm_noirq_op(dev, dev->bus->pm, state);
+ return error;
+ } else if (dev->class && dev->class->pm) {
+ pm_dev_dbg(dev, state, "LATE class ");
+ error = pm_noirq_op(dev, dev->class->pm, state);
if (error)
- goto End;
+ return error;
}

if (dev->pwr_domain) {
@@ -769,8 +758,7 @@ static int device_suspend_noirq(struct d
pm_noirq_op(dev, &dev->pwr_domain->ops, state);
}

-End:
- return error;
+ return 0;
}

/**
@@ -857,40 +845,36 @@ static int __device_suspend(struct devic
goto End;
}

- if (dev->class) {
- if (dev->class->pm) {
- pm_dev_dbg(dev, state, "class ");
- error = pm_op(dev, dev->class->pm, state);
- } else if (dev->class->suspend) {
- pm_dev_dbg(dev, state, "legacy class ");
- error = legacy_suspend(dev, state, dev->class->suspend);
- }
- if (error)
- goto End;
- }
-
- if (dev->type) {
- if (dev->type->pm) {
- pm_dev_dbg(dev, state, "type ");
- error = pm_op(dev, dev->type->pm, state);
- }
- if (error)
- goto End;
- }
-
if (dev->bus) {
if (dev->bus->pm) {
pm_dev_dbg(dev, state, "");
error = pm_op(dev, dev->bus->pm, state);
+ goto Domain;
} else if (dev->bus->suspend) {
pm_dev_dbg(dev, state, "legacy ");
error = legacy_suspend(dev, state, dev->bus->suspend);
+ goto Domain;
}
- if (error)
- goto End;
}

- if (dev->pwr_domain) {
+ if (dev->type && dev->type->pm) {
+ pm_dev_dbg(dev, state, "type ");
+ error = pm_op(dev, dev->type->pm, state);
+ goto Domain;
+ }
+
+ if (dev->class) {
+ if (dev->class->pm) {
+ pm_dev_dbg(dev, state, "class ");
+ error = pm_op(dev, dev->class->pm, state);
+ } else if (dev->class->suspend) {
+ pm_dev_dbg(dev, state, "legacy class ");
+ error = legacy_suspend(dev, state, dev->class->suspend);
+ }
+ }
+
+ Domain:
+ if (!error && dev->pwr_domain) {
pm_dev_dbg(dev, state, "power domain ");
pm_op(dev, &dev->pwr_domain->ops, state);
}
@@ -985,25 +969,24 @@ static int device_prepare(struct device

device_lock(dev);

- if (dev->bus && dev->bus->pm && dev->bus->pm->prepare) {
+ if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "preparing ");
- error = dev->bus->pm->prepare(dev);
+ if (dev->bus->pm->prepare)
+ error = dev->bus->pm->prepare(dev);
suspend_report_result(dev->bus->pm->prepare, error);
if (error)
goto End;
- }
-
- if (dev->type && dev->type->pm && dev->type->pm->prepare) {
+ } else if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "preparing type ");
- error = dev->type->pm->prepare(dev);
+ if (dev->type->pm->prepare)
+ error = dev->type->pm->prepare(dev);
suspend_report_result(dev->type->pm->prepare, error);
if (error)
goto End;
- }
-
- if (dev->class && dev->class->pm && dev->class->pm->prepare) {
+ } else if (dev->class && dev->class->pm) {
pm_dev_dbg(dev, state, "preparing class ");
- error = dev->class->pm->prepare(dev);
+ if (dev->class->pm->prepare)
+ error = dev->class->pm->prepare(dev);
suspend_report_result(dev->class->pm->prepare, error);
if (error)
goto End;
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- linux-2.6.orig/drivers/base/power/runtime.c
+++ linux-2.6/drivers/base/power/runtime.c
@@ -214,9 +214,9 @@ static int rpm_idle(struct device *dev,

dev->power.idle_notification = true;

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
+ if (dev->bus && dev->bus->pm)
callback = dev->bus->pm->runtime_idle;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_idle)
+ else if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_idle;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_idle;
@@ -382,9 +382,9 @@ static int rpm_suspend(struct device *de

__update_runtime_status(dev, RPM_SUSPENDING);

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend)
+ if (dev->bus && dev->bus->pm)
callback = dev->bus->pm->runtime_suspend;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_suspend)
+ else if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_suspend;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_suspend;
@@ -584,9 +584,9 @@ static int rpm_resume(struct device *dev
if (dev->pwr_domain)
rpm_callback(dev->pwr_domain->ops.runtime_resume, dev);

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
+ if (dev->bus && dev->bus->pm)
callback = dev->bus->pm->runtime_resume;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_resume)
+ else if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_resume;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_resume;
Index: linux-2.6/Documentation/power/devices.txt
===================================================================
--- linux-2.6.orig/Documentation/power/devices.txt
+++ linux-2.6/Documentation/power/devices.txt
@@ -249,23 +249,17 @@ various phases always run after tasks ha
unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
been disabled (except for those marked with the IRQ_WAKEUP flag).

-Most phases use bus, type, and class callbacks (that is, methods defined in
-dev->bus->pm, dev->type->pm, and dev->class->pm). The prepare and complete
-phases are exceptions; they use only bus callbacks. When multiple callbacks
-are used in a phase, they are invoked in the order: <class, type, bus> during
-power-down transitions and in the opposite order during power-up transitions.
-For example, during the suspend phase the PM core invokes
-
- dev->class->pm.suspend(dev);
- dev->type->pm.suspend(dev);
- dev->bus->pm.suspend(dev);
-
-before moving on to the next device, whereas during the resume phase the core
-invokes
-
- dev->bus->pm.resume(dev);
- dev->type->pm.resume(dev);
- dev->class->pm.resume(dev);
+All phases use bus, type, or class callbacks (that is, methods defined in
+dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
+exclusive, so if the bus provides a struct dev_pm_ops object pointed to by its
+pm field (i.e. both dev->bus and dev->bus->pm are defined), the callbacks
+included in that object (i.e. dev->bus->pm) will be used. In turn, if the
+device type provides a struct dev_pm_ops object pointed to by its pm field
+(i.e. both dev->type and dev->type->pm are defined), the PM core will used the
+callbacks from that object (i.e. dev->type->pm). Finally, if the pm fields of
+both the bus and device type objects are NULL (or those objects do not exist),
+the callbacks provided by the class (that is, the callbacks from dev->class->pm)
+will be used.

These callbacks may in turn invoke device- or driver-specific methods stored in
dev->driver->pm, but they don't have to.

2011-02-12 22:15:34

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 1/2] PM: Add support for device power domains

From: Rafael J. Wysocki <[email protected]>

The platform bus type is often used to handle Systems-on-a-Chip (SoC)
where all devices are represented by objects of type struct
platform_device. In those cases the same "platform" device driver
may be used with multiple different system configurations, but the
actions needed to put the devices it handles into a low-power state
and back into the full-power state may depend on the design of the
given SoC. The driver, however, cannot possibly include all the
information necessary for the power management of its device on all
the systems it is used with. Moreover, the device hierarchy in its
current form also is not suitable for representing this kind of
information.

The patch below attempts to address this problem by introducing
objects of type struct dev_power_domain that can be used for
representing power domains within a SoC. Every struct
dev_power_domain object provides a sets of device power
management callbacks that can be used to perform what's needed for
device power management in addition to the operations carried out by
the device's driver and subsystem.

Namely, if a struct dev_power_domain object is pointed to by the
pwr_domain field in a struct device, the callbacks provided by its
ops member will be executed in addition to the corresponding
callbacks provided by the device's subsystem and driver during all
power transitions.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Documentation/power/devices.txt | 45 +++++++++++++++++++++++++++++++++++++++-
drivers/base/power/main.c | 37 ++++++++++++++++++++++++++++++++
drivers/base/power/runtime.c | 19 +++++++++++++++-
include/linux/device.h | 1
include/linux/pm.h | 8 +++++++
5 files changed, 107 insertions(+), 3 deletions(-)

Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -463,6 +463,14 @@ struct dev_pm_info {

extern void update_pm_runtime_accounting(struct device *dev);

+/*
+ * Power domains provide callbacks that are executed during system suspend,
+ * hibernation, system resume and during runtime PM transitions along with
+ * subsystem-level and driver-level callbacks.
+ */
+struct dev_power_domain {
+ struct dev_pm_ops ops;
+};

/*
* The PM_EVENT_ messages are also used by drivers implementing the legacy
Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h
+++ linux-2.6/include/linux/device.h
@@ -422,6 +422,7 @@ struct device {
void *platform_data; /* Platform specific data, device
core doesn't touch it */
struct dev_pm_info power;
+ struct dev_power_domain *pwr_domain;

#ifdef CONFIG_NUMA
int numa_node; /* NUMA node this device is close to */
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- linux-2.6.orig/drivers/base/power/runtime.c
+++ linux-2.6/drivers/base/power/runtime.c
@@ -168,6 +168,7 @@ static int rpm_check_suspend_allowed(str
static int rpm_idle(struct device *dev, int rpmflags)
{
int (*callback)(struct device *);
+ int (*domain_callback)(struct device *);
int retval;

retval = rpm_check_suspend_allowed(dev);
@@ -222,10 +223,19 @@ static int rpm_idle(struct device *dev,
else
callback = NULL;

- if (callback) {
+ if (dev->pwr_domain)
+ domain_callback = dev->pwr_domain->ops.runtime_idle;
+ else
+ domain_callback = NULL;
+
+ if (callback || domain_callback) {
spin_unlock_irq(&dev->power.lock);

- callback(dev);
+ if (domain_callback)
+ retval = domain_callback(dev);
+
+ if (!retval && callback)
+ callback(dev);

spin_lock_irq(&dev->power.lock);
}
@@ -390,6 +400,8 @@ static int rpm_suspend(struct device *de
else
pm_runtime_cancel_pending(dev);
} else {
+ if (dev->pwr_domain)
+ rpm_callback(dev->pwr_domain->ops.runtime_suspend, dev);
no_callback:
__update_runtime_status(dev, RPM_SUSPENDED);
pm_runtime_deactivate_timer(dev);
@@ -569,6 +581,9 @@ static int rpm_resume(struct device *dev

__update_runtime_status(dev, RPM_RESUMING);

+ if (dev->pwr_domain)
+ rpm_callback(dev->pwr_domain->ops.runtime_resume, dev);
+
if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
callback = dev->bus->pm->runtime_resume;
else if (dev->type && dev->type->pm && dev->type->pm->runtime_resume)
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -423,6 +423,11 @@ static int device_resume_noirq(struct de
TRACE_DEVICE(dev);
TRACE_RESUME(0);

+ if (dev->pwr_domain) {
+ pm_dev_dbg(dev, state, "EARLY power domain ");
+ pm_noirq_op(dev, &dev->pwr_domain->ops, state);
+ }
+
if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "EARLY ");
error = pm_noirq_op(dev, dev->bus->pm, state);
@@ -518,6 +523,11 @@ static int device_resume(struct device *

dev->power.in_suspend = false;

+ if (dev->pwr_domain) {
+ pm_dev_dbg(dev, state, "power domain ");
+ pm_op(dev, &dev->pwr_domain->ops, state);
+ }
+
if (dev->bus) {
if (dev->bus->pm) {
pm_dev_dbg(dev, state, "");
@@ -629,6 +639,11 @@ static void device_complete(struct devic
{
device_lock(dev);

+ if (dev->pwr_domain && dev->pwr_domain->ops.complete) {
+ pm_dev_dbg(dev, state, "completing power domain ");
+ dev->pwr_domain->ops.complete(dev);
+ }
+
if (dev->class && dev->class->pm && dev->class->pm->complete) {
pm_dev_dbg(dev, state, "completing class ");
dev->class->pm->complete(dev);
@@ -745,6 +760,13 @@ static int device_suspend_noirq(struct d
if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "LATE ");
error = pm_noirq_op(dev, dev->bus->pm, state);
+ if (error)
+ goto End;
+ }
+
+ if (dev->pwr_domain) {
+ pm_dev_dbg(dev, state, "LATE power domain ");
+ pm_noirq_op(dev, &dev->pwr_domain->ops, state);
}

End:
@@ -864,6 +886,13 @@ static int __device_suspend(struct devic
pm_dev_dbg(dev, state, "legacy ");
error = legacy_suspend(dev, state, dev->bus->suspend);
}
+ if (error)
+ goto End;
+ }
+
+ if (dev->pwr_domain) {
+ pm_dev_dbg(dev, state, "power domain ");
+ pm_op(dev, &dev->pwr_domain->ops, state);
}

End:
@@ -976,7 +1005,15 @@ static int device_prepare(struct device
pm_dev_dbg(dev, state, "preparing class ");
error = dev->class->pm->prepare(dev);
suspend_report_result(dev->class->pm->prepare, error);
+ if (error)
+ goto End;
+ }
+
+ if (dev->pwr_domain && dev->pwr_domain->ops.prepare) {
+ pm_dev_dbg(dev, state, "preparing power domain ");
+ dev->pwr_domain->ops.prepare(dev);
}
+
End:
device_unlock(dev);

Index: linux-2.6/Documentation/power/devices.txt
===================================================================
--- linux-2.6.orig/Documentation/power/devices.txt
+++ linux-2.6/Documentation/power/devices.txt
@@ -1,6 +1,6 @@
Device Power Management

-Copyright (c) 2010 Rafael J. Wysocki <[email protected]>, Novell Inc.
+Copyright (c) 2010-2011 Rafael J. Wysocki <[email protected]>, Novell Inc.
Copyright (c) 2010 Alan Stern <[email protected]>


@@ -507,6 +507,49 @@ routines. Nevertheless, different callb
situation where it actually matters.


+Device Power Domains
+--------------------
+Sometimes devices share reference clocks or other power resources. In those
+cases it generally is not possible to put devices into low-power states
+individually. Instead, a set of devices sharing a power resource can be put
+into a low-power state together at the same time by turning off the shared
+power resource. Of course, they also need to be put into the full-power state
+together, by turning the shared power resource on. A set of devices with this
+property is often referred to as a power domain.
+
+Support for power domains is provided through the pwr_domain field of struct
+device. This field is a pointer to an object of type struct dev_power_domain,
+defined in include/linux/pm.h, providing a set of power management callbacks
+analogous to the subsystem-level and device driver callbacks that are executed
+for the given device during all power transitions, in addition to the respective
+subsystem-level callbacks. Specifically, the power domain "suspend" callbacks
+(i.e. ->runtime_suspend(), ->suspend(), ->freeze(), ->poweroff(), etc.) are
+executed after the analogous subsystem-level callbacks, while the power domain
+"resume" callbacks (i.e. ->runtime_resume(), ->resume(), ->thaw(), ->restore,
+etc.) are executed before the analogous subsystem-level callbacks. Error codes
+returned by the "suspend" and "resume" power domain callbacks are ignored.
+
+Power domain ->runtime_idle() callback is executed before the subsystem-level
+->runtime_idle() callback and the result returned by it is not ignored. Namely,
+if it returns error code, the subsystem-level ->runtime_idle() callback will not
+be called and the helper function rpm_idle() executing it will return error
+code. This mechanism is intended to help platforms where saving device state
+is a time consuming operation and should only be carried out if all devices
+in the power domain are idle, before turning off the shared power resource(s).
+Namely, the power domain ->runtime_idle() callback may return error code until
+the pm_runtime_idle() helper (or its asychronous version) has been called for
+all devices in the power domain (it is recommended that the returned error code
+be -EBUSY in those cases), preventing the subsystem-level ->runtime_idle()
+callback from being run prematurely.
+
+The support for device power domains is only relevant to platforms needing to
+use the same subsystem-level (e.g. platform bus type) and device driver power
+management callbacks in many different power domain configurations and wanting
+to avoid incorporating the support for power domains into the subsystem-level
+callbacks. The other platforms need not implement it or take it into account
+in any way.
+
+
System Devices
--------------
System devices (sysdevs) follow a slightly different API, which can be found in

2011-02-14 16:12:52

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <[email protected]>
>
> The platform bus type is often used to handle Systems-on-a-Chip (SoC)
> where all devices are represented by objects of type struct
> platform_device. In those cases the same "platform" device driver
> may be used with multiple different system configurations, but the
> actions needed to put the devices it handles into a low-power state
> and back into the full-power state may depend on the design of the
> given SoC. The driver, however, cannot possibly include all the
> information necessary for the power management of its device on all
> the systems it is used with. Moreover, the device hierarchy in its
> current form also is not suitable for representing this kind of
> information.
>
> The patch below attempts to address this problem by introducing
> objects of type struct dev_power_domain that can be used for
> representing power domains within a SoC. Every struct
> dev_power_domain object provides a sets of device power
> management callbacks that can be used to perform what's needed for
> device power management in addition to the operations carried out by
> the device's driver and subsystem.
>
> Namely, if a struct dev_power_domain object is pointed to by the
> pwr_domain field in a struct device, the callbacks provided by its
> ops member will be executed in addition to the corresponding
> callbacks provided by the device's subsystem and driver during all
> power transitions.

Overall this looks very good.

> Index: linux-2.6/include/linux/pm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pm.h
> +++ linux-2.6/include/linux/pm.h
> @@ -463,6 +463,14 @@ struct dev_pm_info {
>
> extern void update_pm_runtime_accounting(struct device *dev);
>
> +/*
> + * Power domains provide callbacks that are executed during system suspend,
> + * hibernation, system resume and during runtime PM transitions along with
> + * subsystem-level and driver-level callbacks.
> + */
> +struct dev_power_domain {
> + struct dev_pm_ops ops;
> +};

I don't have a clear picture of how people are going to want to use
these dev_power_domain structures. Should there be a

void *priv;

member as well?

Alan Stern

2011-02-14 16:26:01

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <[email protected]>
>
> The code handling system-wide power transitions (eg. suspend-to-RAM)
> can in theory execute callbacks provided by the device's bus type,
> device type and class in each phase of the power transition. In
> turn, the runtime PM core code only calls one of those callbacks at
> a time, preferring bus type callbacks to device type or class
> callbacks and device type callbacks to class callbacks.
>
> It seems reasonable to make them both behave in the same way in that
> respect. Moreover, even though a device may belong to two subsystems
> (eg. bus type and device class) simultaneously, in practice power
> management callbacks for system-wide power transitions are always
> provided by only one of them (ie. if the bus type callbacks are
> defined, the device class ones are not and vice versa). Thus it is
> possible to modify the code handling system-wide power transitions
> so that it follows the core runtime PM code (ie. treats the
> subsystem callbacks as mutually exclusive).
>
> On the other hand, the core runtime PM code will choose to execute,
> for example, a runtime suspend callback provided by the device type
> even if the bus type's struct dev_pm_ops object exists, but the
> runtime_suspend pointer in it happens to be NULL. This is confusing,
> because it may lead to the execution of callbacks from different
> subsystems during different operations (eg. the bus type suspend
> callback may be executed during runtime suspend, while the device
> type callback will be executed during runtime resume).
>
> Make all of the power management code treat subsystem callbacks in
> a consistent way, such that:
> (1) If the device's bus type is defined (eg. dev->bus is not NULL)
> and its pm pointer is not NULL, the callbacks from dev->bus->pm
> will be used.
> (2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
> device type is defined (eg. dev->type is not NULL) and its pm
> pointer is not NULL, the callbacks from dev->type->pm will be
> used.
> (3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
> NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
> will be used provided that both dev->class and dev->class->pm
> are not NULL.

It looks okay, but I haven't tested it. Just one minor change needed
in the documentation:

> +All phases use bus, type, or class callbacks (that is, methods defined in
> +dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
> +exclusive, so if the bus provides a struct dev_pm_ops object pointed to by its
> +pm field (i.e. both dev->bus and dev->bus->pm are defined), the callbacks
> +included in that object (i.e. dev->bus->pm) will be used. In turn, if the

s/In turn/Otherwise/

> +device type provides a struct dev_pm_ops object pointed to by its pm field
> +(i.e. both dev->type and dev->type->pm are defined), the PM core will used the
> +callbacks from that object (i.e. dev->type->pm). Finally, if the pm fields of
> +both the bus and device type objects are NULL (or those objects do not exist),
> +the callbacks provided by the class (that is, the callbacks from dev->class->pm)
> +will be used.
>
> These callbacks may in turn invoke device- or driver-specific methods stored in
> dev->driver->pm, but they don't have to.

Alan Stern

2011-02-14 22:35:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Monday, February 14, 2011, Alan Stern wrote:
> On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:
>
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The platform bus type is often used to handle Systems-on-a-Chip (SoC)
> > where all devices are represented by objects of type struct
> > platform_device. In those cases the same "platform" device driver
> > may be used with multiple different system configurations, but the
> > actions needed to put the devices it handles into a low-power state
> > and back into the full-power state may depend on the design of the
> > given SoC. The driver, however, cannot possibly include all the
> > information necessary for the power management of its device on all
> > the systems it is used with. Moreover, the device hierarchy in its
> > current form also is not suitable for representing this kind of
> > information.
> >
> > The patch below attempts to address this problem by introducing
> > objects of type struct dev_power_domain that can be used for
> > representing power domains within a SoC. Every struct
> > dev_power_domain object provides a sets of device power
> > management callbacks that can be used to perform what's needed for
> > device power management in addition to the operations carried out by
> > the device's driver and subsystem.
> >
> > Namely, if a struct dev_power_domain object is pointed to by the
> > pwr_domain field in a struct device, the callbacks provided by its
> > ops member will be executed in addition to the corresponding
> > callbacks provided by the device's subsystem and driver during all
> > power transitions.
>
> Overall this looks very good.
>
> > Index: linux-2.6/include/linux/pm.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/pm.h
> > +++ linux-2.6/include/linux/pm.h
> > @@ -463,6 +463,14 @@ struct dev_pm_info {
> >
> > extern void update_pm_runtime_accounting(struct device *dev);
> >
> > +/*
> > + * Power domains provide callbacks that are executed during system suspend,
> > + * hibernation, system resume and during runtime PM transitions along with
> > + * subsystem-level and driver-level callbacks.
> > + */
> > +struct dev_power_domain {
> > + struct dev_pm_ops ops;
> > +};
>
> I don't have a clear picture of how people are going to want to use
> these dev_power_domain structures. Should there be a
>
> void *priv;
>
> member as well?

Well, I'm not sure. What would be the purpose of it?

Rafael

2011-02-14 22:35:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Monday, February 14, 2011, Alan Stern wrote:
> On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:
>
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The code handling system-wide power transitions (eg. suspend-to-RAM)
> > can in theory execute callbacks provided by the device's bus type,
> > device type and class in each phase of the power transition. In
> > turn, the runtime PM core code only calls one of those callbacks at
> > a time, preferring bus type callbacks to device type or class
> > callbacks and device type callbacks to class callbacks.
> >
> > It seems reasonable to make them both behave in the same way in that
> > respect. Moreover, even though a device may belong to two subsystems
> > (eg. bus type and device class) simultaneously, in practice power
> > management callbacks for system-wide power transitions are always
> > provided by only one of them (ie. if the bus type callbacks are
> > defined, the device class ones are not and vice versa). Thus it is
> > possible to modify the code handling system-wide power transitions
> > so that it follows the core runtime PM code (ie. treats the
> > subsystem callbacks as mutually exclusive).
> >
> > On the other hand, the core runtime PM code will choose to execute,
> > for example, a runtime suspend callback provided by the device type
> > even if the bus type's struct dev_pm_ops object exists, but the
> > runtime_suspend pointer in it happens to be NULL. This is confusing,
> > because it may lead to the execution of callbacks from different
> > subsystems during different operations (eg. the bus type suspend
> > callback may be executed during runtime suspend, while the device
> > type callback will be executed during runtime resume).
> >
> > Make all of the power management code treat subsystem callbacks in
> > a consistent way, such that:
> > (1) If the device's bus type is defined (eg. dev->bus is not NULL)
> > and its pm pointer is not NULL, the callbacks from dev->bus->pm
> > will be used.
> > (2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
> > device type is defined (eg. dev->type is not NULL) and its pm
> > pointer is not NULL, the callbacks from dev->type->pm will be
> > used.
> > (3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
> > NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
> > will be used provided that both dev->class and dev->class->pm
> > are not NULL.
>
> It looks okay, but I haven't tested it. Just one minor change needed
> in the documentation:
>
> > +All phases use bus, type, or class callbacks (that is, methods defined in
> > +dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
> > +exclusive, so if the bus provides a struct dev_pm_ops object pointed to by its
> > +pm field (i.e. both dev->bus and dev->bus->pm are defined), the callbacks
> > +included in that object (i.e. dev->bus->pm) will be used. In turn, if the
>
> s/In turn/Otherwise/

OK

> > +device type provides a struct dev_pm_ops object pointed to by its pm field
> > +(i.e. both dev->type and dev->type->pm are defined), the PM core will used the
> > +callbacks from that object (i.e. dev->type->pm). Finally, if the pm fields of
> > +both the bus and device type objects are NULL (or those objects do not exist),
> > +the callbacks provided by the class (that is, the callbacks from dev->class->pm)
> > +will be used.
> >
> > These callbacks may in turn invoke device- or driver-specific methods stored in
> > dev->driver->pm, but they don't have to.

Thanks,
Rafael

2011-02-15 03:01:28

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Mon, 14 Feb 2011, Rafael J. Wysocki wrote:

> > I don't have a clear picture of how people are going to want to use
> > these dev_power_domain structures. Should there be a
> >
> > void *priv;
> >
> > member as well?
>
> Well, I'm not sure. What would be the purpose of it?

It's easy to imagine an SoC with multiple power domains and hence
multiple structures. The platform code would need some way to tell
those structures apart. But we should get some advice from the people
who will actually have to use these things...

Alan Stern

2011-02-15 07:28:18

by Magnus Damm

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Tue, Feb 15, 2011 at 7:34 AM, Rafael J. Wysocki <[email protected]> wrote:
> On Monday, February 14, 2011, Alan Stern wrote:
>> On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:
>>
>> > From: Rafael J. Wysocki <[email protected]>
>> >
>> > The platform bus type is often used to handle Systems-on-a-Chip (SoC)
>> > where all devices are represented by objects of type struct
>> > platform_device. ?In those cases the same "platform" device driver
>> > may be used with multiple different system configurations, but the
>> > actions needed to put the devices it handles into a low-power state
>> > and back into the full-power state may depend on the design of the
>> > given SoC. ?The driver, however, cannot possibly include all the
>> > information necessary for the power management of its device on all
>> > the systems it is used with. ?Moreover, the device hierarchy in its
>> > current form also is not suitable for representing this kind of
>> > information.
>> >
>> > The patch below attempts to address this problem by introducing
>> > objects of type struct dev_power_domain that can be used for
>> > representing power domains within a SoC. ?Every struct
>> > dev_power_domain object provides a sets of device power
>> > management callbacks that can be used to perform what's needed for
>> > device power management in addition to the operations carried out by
>> > the device's driver and subsystem.
>> >
>> > Namely, if a struct dev_power_domain object is pointed to by the
>> > pwr_domain field in a struct device, the callbacks provided by its
>> > ops member will be executed in addition to the corresponding
>> > callbacks provided by the device's subsystem and driver during all
>> > power transitions.
>>
>> Overall this looks very good.

I think so too.

>> > Index: linux-2.6/include/linux/pm.h
>> > ===================================================================
>> > --- linux-2.6.orig/include/linux/pm.h
>> > +++ linux-2.6/include/linux/pm.h
>> > @@ -463,6 +463,14 @@ struct dev_pm_info {
>> >
>> > ?extern void update_pm_runtime_accounting(struct device *dev);
>> >
>> > +/*
>> > + * Power domains provide callbacks that are executed during system suspend,
>> > + * hibernation, system resume and during runtime PM transitions along with
>> > + * subsystem-level and driver-level callbacks.
>> > + */
>> > +struct dev_power_domain {
>> > + ? struct dev_pm_ops ? ? ? ops;
>> > +};
>>
>> I don't have a clear picture of how people are going to want to use
>> these dev_power_domain structures. ?Should there be a
>>
>> ? ? ? void ? ? ? ? ? ? ? ? ? ?*priv;
>>
>> member as well?
>
> Well, I'm not sure. ?What would be the purpose of it?

As Alan pointed out, a private pointer may be useful for the
SoC-specific power domain code. I'm yet to tie in this code with
working hardware power domain support, so it's hard for me to say if a
private pointer will help or not at this point. An alternative to
private pointers for the SoC-specific power domain code is to use
devres_alloc().

I believe it is important to have some kind of power domain mapping
for each hardware block on a SoC device.
On SH-Mobile I had this hidden in the SoC-specific code, grep for
HWBLK to find my approach to deal with power domains. The HWBLK enums
provide a unique identifier for each device instance on the SoC.

Having the power domain code framework as generic as possible is of
course a good idea. So I'm very happy to see these patches. Per-struct
device power domain pointers make sense IMO.

I do wonder how this ties into multiple levels of power management.
This will be needed for Runtime PM at some point.

To compare Runtime PM with CPUIdle, I believe the CPUIdle core can ask
the cpu-specific code to enter a certain level of sleep mode, but the
cpu-specific code can chose to not follow this and sleep lightly if
for instance hardware dependencies prevent the deep sleep mode to be
entered. I believe something similar will be needed on Runtime PM, but
controlling power domains instead of sleep modes.

Any ideas?

Thanks,

/ magnus

2011-02-15 18:17:16

by Kevin Hilman

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

"Rafael J. Wysocki" <[email protected]> writes:

> From: Rafael J. Wysocki <[email protected]>
>
> The code handling system-wide power transitions (eg. suspend-to-RAM)
> can in theory execute callbacks provided by the device's bus type,
> device type and class in each phase of the power transition. In
> turn, the runtime PM core code only calls one of those callbacks at
> a time, preferring bus type callbacks to device type or class
> callbacks and device type callbacks to class callbacks.
>
> It seems reasonable to make them both behave in the same way in that
> respect. Moreover, even though a device may belong to two subsystems
> (eg. bus type and device class) simultaneously, in practice power
> management callbacks for system-wide power transitions are always
> provided by only one of them (ie. if the bus type callbacks are
> defined, the device class ones are not and vice versa). Thus it is
> possible to modify the code handling system-wide power transitions
> so that it follows the core runtime PM code (ie. treats the
> subsystem callbacks as mutually exclusive).
>
> On the other hand, the core runtime PM code will choose to execute,
> for example, a runtime suspend callback provided by the device type
> even if the bus type's struct dev_pm_ops object exists, but the
> runtime_suspend pointer in it happens to be NULL. This is confusing,
> because it may lead to the execution of callbacks from different
> subsystems during different operations (eg. the bus type suspend
> callback may be executed during runtime suspend, while the device
> type callback will be executed during runtime resume).
>
> Make all of the power management code treat subsystem callbacks in
> a consistent way, such that:
> (1) If the device's bus type is defined (eg. dev->bus is not NULL)
> and its pm pointer is not NULL, the callbacks from dev->bus->pm
> will be used.
> (2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
> device type is defined (eg. dev->type is not NULL) and its pm
> pointer is not NULL, the callbacks from dev->type->pm will be
> used.
> (3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
> NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
> will be used provided that both dev->class and dev->class->pm
> are not NULL.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

This looks good, consistency between system and runtime PM is a great
help to readability.

Acked-by: Kevin Hilman <[email protected]>


> ---
> Documentation/power/devices.txt | 28 +++----
> drivers/base/power/main.c | 141 +++++++++++++++++-----------------------
> drivers/base/power/runtime.c | 12 +--
> 3 files changed, 79 insertions(+), 102 deletions(-)
>
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -433,21 +433,17 @@ static int device_resume_noirq(struct de
> error = pm_noirq_op(dev, dev->bus->pm, state);
> if (error)
> goto End;
> - }
> -
> - if (dev->type && dev->type->pm) {
> + } else if (dev->type && dev->type->pm) {
> pm_dev_dbg(dev, state, "EARLY type ");
> error = pm_noirq_op(dev, dev->type->pm, state);
> if (error)
> goto End;
> - }
> -
> - if (dev->class && dev->class->pm) {
> + } else if (dev->class && dev->class->pm) {
> pm_dev_dbg(dev, state, "EARLY class ");
> error = pm_noirq_op(dev, dev->class->pm, state);
> }
>
> -End:
> + End:
> TRACE_RESUME(error);
> return error;
> }
> @@ -532,21 +528,18 @@ static int device_resume(struct device *
> if (dev->bus->pm) {
> pm_dev_dbg(dev, state, "");
> error = pm_op(dev, dev->bus->pm, state);
> + goto End;
> } else if (dev->bus->resume) {
> pm_dev_dbg(dev, state, "legacy ");
> error = legacy_resume(dev, dev->bus->resume);
> - }
> - if (error)
> goto End;
> + }
> }
>
> - if (dev->type) {
> - if (dev->type->pm) {
> - pm_dev_dbg(dev, state, "type ");
> - error = pm_op(dev, dev->type->pm, state);
> - }
> - if (error)
> - goto End;
> + if (dev->type && dev->type->pm) {
> + pm_dev_dbg(dev, state, "type ");
> + error = pm_op(dev, dev->type->pm, state);
> + goto End;
> }
>
> if (dev->class) {
> @@ -558,6 +551,7 @@ static int device_resume(struct device *
> error = legacy_resume(dev, dev->class->resume);
> }
> }
> +
> End:
> device_unlock(dev);
> complete_all(&dev->power.completion);
> @@ -644,19 +638,18 @@ static void device_complete(struct devic
> dev->pwr_domain->ops.complete(dev);
> }
>
> - if (dev->class && dev->class->pm && dev->class->pm->complete) {
> - pm_dev_dbg(dev, state, "completing class ");
> - dev->class->pm->complete(dev);
> - }
> -
> - if (dev->type && dev->type->pm && dev->type->pm->complete) {
> - pm_dev_dbg(dev, state, "completing type ");
> - dev->type->pm->complete(dev);
> - }
> -
> - if (dev->bus && dev->bus->pm && dev->bus->pm->complete) {
> + if (dev->bus && dev->bus->pm) {
> pm_dev_dbg(dev, state, "completing ");
> - dev->bus->pm->complete(dev);
> + if (dev->bus->pm->complete)
> + dev->bus->pm->complete(dev);
> + } else if (dev->type && dev->type->pm) {
> + pm_dev_dbg(dev, state, "completing type ");
> + if (dev->type->pm->complete)
> + dev->type->pm->complete(dev);
> + } else if (dev->class && dev->class->pm) {
> + pm_dev_dbg(dev, state, "completing class ");
> + if (dev->class->pm->complete)
> + dev->class->pm->complete(dev);
> }
>
> device_unlock(dev);
> @@ -741,27 +734,23 @@ static pm_message_t resume_event(pm_mess
> */
> static int device_suspend_noirq(struct device *dev, pm_message_t state)
> {
> - int error = 0;
> + int error;
>
> - if (dev->class && dev->class->pm) {
> - pm_dev_dbg(dev, state, "LATE class ");
> - error = pm_noirq_op(dev, dev->class->pm, state);
> + if (dev->bus && dev->bus->pm) {
> + pm_dev_dbg(dev, state, "LATE ");
> + error = pm_noirq_op(dev, dev->bus->pm, state);
> if (error)
> - goto End;
> - }
> -
> - if (dev->type && dev->type->pm) {
> + return error;
> + } else if (dev->type && dev->type->pm) {
> pm_dev_dbg(dev, state, "LATE type ");
> error = pm_noirq_op(dev, dev->type->pm, state);
> if (error)
> - goto End;
> - }
> -
> - if (dev->bus && dev->bus->pm) {
> - pm_dev_dbg(dev, state, "LATE ");
> - error = pm_noirq_op(dev, dev->bus->pm, state);
> + return error;
> + } else if (dev->class && dev->class->pm) {
> + pm_dev_dbg(dev, state, "LATE class ");
> + error = pm_noirq_op(dev, dev->class->pm, state);
> if (error)
> - goto End;
> + return error;
> }
>
> if (dev->pwr_domain) {
> @@ -769,8 +758,7 @@ static int device_suspend_noirq(struct d
> pm_noirq_op(dev, &dev->pwr_domain->ops, state);
> }
>
> -End:
> - return error;
> + return 0;
> }
>
> /**
> @@ -857,40 +845,36 @@ static int __device_suspend(struct devic
> goto End;
> }
>
> - if (dev->class) {
> - if (dev->class->pm) {
> - pm_dev_dbg(dev, state, "class ");
> - error = pm_op(dev, dev->class->pm, state);
> - } else if (dev->class->suspend) {
> - pm_dev_dbg(dev, state, "legacy class ");
> - error = legacy_suspend(dev, state, dev->class->suspend);
> - }
> - if (error)
> - goto End;
> - }
> -
> - if (dev->type) {
> - if (dev->type->pm) {
> - pm_dev_dbg(dev, state, "type ");
> - error = pm_op(dev, dev->type->pm, state);
> - }
> - if (error)
> - goto End;
> - }
> -
> if (dev->bus) {
> if (dev->bus->pm) {
> pm_dev_dbg(dev, state, "");
> error = pm_op(dev, dev->bus->pm, state);
> + goto Domain;
> } else if (dev->bus->suspend) {
> pm_dev_dbg(dev, state, "legacy ");
> error = legacy_suspend(dev, state, dev->bus->suspend);
> + goto Domain;
> }
> - if (error)
> - goto End;
> }
>
> - if (dev->pwr_domain) {
> + if (dev->type && dev->type->pm) {
> + pm_dev_dbg(dev, state, "type ");
> + error = pm_op(dev, dev->type->pm, state);
> + goto Domain;
> + }
> +
> + if (dev->class) {
> + if (dev->class->pm) {
> + pm_dev_dbg(dev, state, "class ");
> + error = pm_op(dev, dev->class->pm, state);
> + } else if (dev->class->suspend) {
> + pm_dev_dbg(dev, state, "legacy class ");
> + error = legacy_suspend(dev, state, dev->class->suspend);
> + }
> + }
> +
> + Domain:
> + if (!error && dev->pwr_domain) {
> pm_dev_dbg(dev, state, "power domain ");
> pm_op(dev, &dev->pwr_domain->ops, state);
> }
> @@ -985,25 +969,24 @@ static int device_prepare(struct device
>
> device_lock(dev);
>
> - if (dev->bus && dev->bus->pm && dev->bus->pm->prepare) {
> + if (dev->bus && dev->bus->pm) {
> pm_dev_dbg(dev, state, "preparing ");
> - error = dev->bus->pm->prepare(dev);
> + if (dev->bus->pm->prepare)
> + error = dev->bus->pm->prepare(dev);
> suspend_report_result(dev->bus->pm->prepare, error);
> if (error)
> goto End;
> - }
> -
> - if (dev->type && dev->type->pm && dev->type->pm->prepare) {
> + } else if (dev->type && dev->type->pm) {
> pm_dev_dbg(dev, state, "preparing type ");
> - error = dev->type->pm->prepare(dev);
> + if (dev->type->pm->prepare)
> + error = dev->type->pm->prepare(dev);
> suspend_report_result(dev->type->pm->prepare, error);
> if (error)
> goto End;
> - }
> -
> - if (dev->class && dev->class->pm && dev->class->pm->prepare) {
> + } else if (dev->class && dev->class->pm) {
> pm_dev_dbg(dev, state, "preparing class ");
> - error = dev->class->pm->prepare(dev);
> + if (dev->class->pm->prepare)
> + error = dev->class->pm->prepare(dev);
> suspend_report_result(dev->class->pm->prepare, error);
> if (error)
> goto End;
> Index: linux-2.6/drivers/base/power/runtime.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/runtime.c
> +++ linux-2.6/drivers/base/power/runtime.c
> @@ -214,9 +214,9 @@ static int rpm_idle(struct device *dev,
>
> dev->power.idle_notification = true;
>
> - if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
> + if (dev->bus && dev->bus->pm)
> callback = dev->bus->pm->runtime_idle;
> - else if (dev->type && dev->type->pm && dev->type->pm->runtime_idle)
> + else if (dev->type && dev->type->pm)
> callback = dev->type->pm->runtime_idle;
> else if (dev->class && dev->class->pm)
> callback = dev->class->pm->runtime_idle;
> @@ -382,9 +382,9 @@ static int rpm_suspend(struct device *de
>
> __update_runtime_status(dev, RPM_SUSPENDING);
>
> - if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend)
> + if (dev->bus && dev->bus->pm)
> callback = dev->bus->pm->runtime_suspend;
> - else if (dev->type && dev->type->pm && dev->type->pm->runtime_suspend)
> + else if (dev->type && dev->type->pm)
> callback = dev->type->pm->runtime_suspend;
> else if (dev->class && dev->class->pm)
> callback = dev->class->pm->runtime_suspend;
> @@ -584,9 +584,9 @@ static int rpm_resume(struct device *dev
> if (dev->pwr_domain)
> rpm_callback(dev->pwr_domain->ops.runtime_resume, dev);
>
> - if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
> + if (dev->bus && dev->bus->pm)
> callback = dev->bus->pm->runtime_resume;
> - else if (dev->type && dev->type->pm && dev->type->pm->runtime_resume)
> + else if (dev->type && dev->type->pm)
> callback = dev->type->pm->runtime_resume;
> else if (dev->class && dev->class->pm)
> callback = dev->class->pm->runtime_resume;
> Index: linux-2.6/Documentation/power/devices.txt
> ===================================================================
> --- linux-2.6.orig/Documentation/power/devices.txt
> +++ linux-2.6/Documentation/power/devices.txt
> @@ -249,23 +249,17 @@ various phases always run after tasks ha
> unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
> been disabled (except for those marked with the IRQ_WAKEUP flag).
>
> -Most phases use bus, type, and class callbacks (that is, methods defined in
> -dev->bus->pm, dev->type->pm, and dev->class->pm). The prepare and complete
> -phases are exceptions; they use only bus callbacks. When multiple callbacks
> -are used in a phase, they are invoked in the order: <class, type, bus> during
> -power-down transitions and in the opposite order during power-up transitions.
> -For example, during the suspend phase the PM core invokes
> -
> - dev->class->pm.suspend(dev);
> - dev->type->pm.suspend(dev);
> - dev->bus->pm.suspend(dev);
> -
> -before moving on to the next device, whereas during the resume phase the core
> -invokes
> -
> - dev->bus->pm.resume(dev);
> - dev->type->pm.resume(dev);
> - dev->class->pm.resume(dev);
> +All phases use bus, type, or class callbacks (that is, methods defined in
> +dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
> +exclusive, so if the bus provides a struct dev_pm_ops object pointed to by its
> +pm field (i.e. both dev->bus and dev->bus->pm are defined), the callbacks
> +included in that object (i.e. dev->bus->pm) will be used. In turn, if the
> +device type provides a struct dev_pm_ops object pointed to by its pm field
> +(i.e. both dev->type and dev->type->pm are defined), the PM core will used the
> +callbacks from that object (i.e. dev->type->pm). Finally, if the pm fields of
> +both the bus and device type objects are NULL (or those objects do not exist),
> +the callbacks provided by the class (that is, the callbacks from dev->class->pm)
> +will be used.
>
> These callbacks may in turn invoke device- or driver-specific methods stored in
> dev->driver->pm, but they don't have to.

2011-02-15 18:23:25

by Kevin Hilman

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

"Rafael J. Wysocki" <[email protected]> writes:

> From: Rafael J. Wysocki <[email protected]>
>
> The platform bus type is often used to handle Systems-on-a-Chip (SoC)
> where all devices are represented by objects of type struct
> platform_device. In those cases the same "platform" device driver
> may be used with multiple different system configurations, but the
> actions needed to put the devices it handles into a low-power state
> and back into the full-power state may depend on the design of the
> given SoC. The driver, however, cannot possibly include all the
> information necessary for the power management of its device on all
> the systems it is used with. Moreover, the device hierarchy in its
> current form also is not suitable for representing this kind of
> information.
>
> The patch below attempts to address this problem by introducing
> objects of type struct dev_power_domain that can be used for
> representing power domains within a SoC. Every struct
> dev_power_domain object provides a sets of device power
> management callbacks that can be used to perform what's needed for
> device power management in addition to the operations carried out by
> the device's driver and subsystem.
>
> Namely, if a struct dev_power_domain object is pointed to by the
> pwr_domain field in a struct device, the callbacks provided by its
> ops member will be executed in addition to the corresponding
> callbacks provided by the device's subsystem and driver during all
> power transitions.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

Acked-by: Kevin Hilman <[email protected]>

Tested by converting OMAP platform_bus dev_pm_ops overrides to use this
method, and works great.

Tested-by: Kevin Hilman <[email protected]>

> ---
> Documentation/power/devices.txt | 45 +++++++++++++++++++++++++++++++++++++++-
> drivers/base/power/main.c | 37 ++++++++++++++++++++++++++++++++
> drivers/base/power/runtime.c | 19 +++++++++++++++-
> include/linux/device.h | 1
> include/linux/pm.h | 8 +++++++
> 5 files changed, 107 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/include/linux/pm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pm.h
> +++ linux-2.6/include/linux/pm.h
> @@ -463,6 +463,14 @@ struct dev_pm_info {
>
> extern void update_pm_runtime_accounting(struct device *dev);
>
> +/*
> + * Power domains provide callbacks that are executed during system suspend,
> + * hibernation, system resume and during runtime PM transitions along with
> + * subsystem-level and driver-level callbacks.
> + */
> +struct dev_power_domain {
> + struct dev_pm_ops ops;
> +};
>
> /*
> * The PM_EVENT_ messages are also used by drivers implementing the legacy
> Index: linux-2.6/include/linux/device.h
> ===================================================================
> --- linux-2.6.orig/include/linux/device.h
> +++ linux-2.6/include/linux/device.h
> @@ -422,6 +422,7 @@ struct device {
> void *platform_data; /* Platform specific data, device
> core doesn't touch it */
> struct dev_pm_info power;
> + struct dev_power_domain *pwr_domain;
>
> #ifdef CONFIG_NUMA
> int numa_node; /* NUMA node this device is close to */
> Index: linux-2.6/drivers/base/power/runtime.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/runtime.c
> +++ linux-2.6/drivers/base/power/runtime.c
> @@ -168,6 +168,7 @@ static int rpm_check_suspend_allowed(str
> static int rpm_idle(struct device *dev, int rpmflags)
> {
> int (*callback)(struct device *);
> + int (*domain_callback)(struct device *);
> int retval;
>
> retval = rpm_check_suspend_allowed(dev);
> @@ -222,10 +223,19 @@ static int rpm_idle(struct device *dev,
> else
> callback = NULL;
>
> - if (callback) {
> + if (dev->pwr_domain)
> + domain_callback = dev->pwr_domain->ops.runtime_idle;
> + else
> + domain_callback = NULL;
> +
> + if (callback || domain_callback) {
> spin_unlock_irq(&dev->power.lock);
>
> - callback(dev);
> + if (domain_callback)
> + retval = domain_callback(dev);
> +
> + if (!retval && callback)
> + callback(dev);
>
> spin_lock_irq(&dev->power.lock);
> }
> @@ -390,6 +400,8 @@ static int rpm_suspend(struct device *de
> else
> pm_runtime_cancel_pending(dev);
> } else {
> + if (dev->pwr_domain)
> + rpm_callback(dev->pwr_domain->ops.runtime_suspend, dev);
> no_callback:
> __update_runtime_status(dev, RPM_SUSPENDED);
> pm_runtime_deactivate_timer(dev);
> @@ -569,6 +581,9 @@ static int rpm_resume(struct device *dev
>
> __update_runtime_status(dev, RPM_RESUMING);
>
> + if (dev->pwr_domain)
> + rpm_callback(dev->pwr_domain->ops.runtime_resume, dev);
> +
> if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
> callback = dev->bus->pm->runtime_resume;
> else if (dev->type && dev->type->pm && dev->type->pm->runtime_resume)
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -423,6 +423,11 @@ static int device_resume_noirq(struct de
> TRACE_DEVICE(dev);
> TRACE_RESUME(0);
>
> + if (dev->pwr_domain) {
> + pm_dev_dbg(dev, state, "EARLY power domain ");
> + pm_noirq_op(dev, &dev->pwr_domain->ops, state);
> + }
> +
> if (dev->bus && dev->bus->pm) {
> pm_dev_dbg(dev, state, "EARLY ");
> error = pm_noirq_op(dev, dev->bus->pm, state);
> @@ -518,6 +523,11 @@ static int device_resume(struct device *
>
> dev->power.in_suspend = false;
>
> + if (dev->pwr_domain) {
> + pm_dev_dbg(dev, state, "power domain ");
> + pm_op(dev, &dev->pwr_domain->ops, state);
> + }
> +
> if (dev->bus) {
> if (dev->bus->pm) {
> pm_dev_dbg(dev, state, "");
> @@ -629,6 +639,11 @@ static void device_complete(struct devic
> {
> device_lock(dev);
>
> + if (dev->pwr_domain && dev->pwr_domain->ops.complete) {
> + pm_dev_dbg(dev, state, "completing power domain ");
> + dev->pwr_domain->ops.complete(dev);
> + }
> +
> if (dev->class && dev->class->pm && dev->class->pm->complete) {
> pm_dev_dbg(dev, state, "completing class ");
> dev->class->pm->complete(dev);
> @@ -745,6 +760,13 @@ static int device_suspend_noirq(struct d
> if (dev->bus && dev->bus->pm) {
> pm_dev_dbg(dev, state, "LATE ");
> error = pm_noirq_op(dev, dev->bus->pm, state);
> + if (error)
> + goto End;
> + }
> +
> + if (dev->pwr_domain) {
> + pm_dev_dbg(dev, state, "LATE power domain ");
> + pm_noirq_op(dev, &dev->pwr_domain->ops, state);
> }
>
> End:
> @@ -864,6 +886,13 @@ static int __device_suspend(struct devic
> pm_dev_dbg(dev, state, "legacy ");
> error = legacy_suspend(dev, state, dev->bus->suspend);
> }
> + if (error)
> + goto End;
> + }
> +
> + if (dev->pwr_domain) {
> + pm_dev_dbg(dev, state, "power domain ");
> + pm_op(dev, &dev->pwr_domain->ops, state);
> }
>
> End:
> @@ -976,7 +1005,15 @@ static int device_prepare(struct device
> pm_dev_dbg(dev, state, "preparing class ");
> error = dev->class->pm->prepare(dev);
> suspend_report_result(dev->class->pm->prepare, error);
> + if (error)
> + goto End;
> + }
> +
> + if (dev->pwr_domain && dev->pwr_domain->ops.prepare) {
> + pm_dev_dbg(dev, state, "preparing power domain ");
> + dev->pwr_domain->ops.prepare(dev);
> }
> +
> End:
> device_unlock(dev);
>
> Index: linux-2.6/Documentation/power/devices.txt
> ===================================================================
> --- linux-2.6.orig/Documentation/power/devices.txt
> +++ linux-2.6/Documentation/power/devices.txt
> @@ -1,6 +1,6 @@
> Device Power Management
>
> -Copyright (c) 2010 Rafael J. Wysocki <[email protected]>, Novell Inc.
> +Copyright (c) 2010-2011 Rafael J. Wysocki <[email protected]>, Novell Inc.
> Copyright (c) 2010 Alan Stern <[email protected]>
>
>
> @@ -507,6 +507,49 @@ routines. Nevertheless, different callb
> situation where it actually matters.
>
>
> +Device Power Domains
> +--------------------
> +Sometimes devices share reference clocks or other power resources. In those
> +cases it generally is not possible to put devices into low-power states
> +individually. Instead, a set of devices sharing a power resource can be put
> +into a low-power state together at the same time by turning off the shared
> +power resource. Of course, they also need to be put into the full-power state
> +together, by turning the shared power resource on. A set of devices with this
> +property is often referred to as a power domain.
> +
> +Support for power domains is provided through the pwr_domain field of struct
> +device. This field is a pointer to an object of type struct dev_power_domain,
> +defined in include/linux/pm.h, providing a set of power management callbacks
> +analogous to the subsystem-level and device driver callbacks that are executed
> +for the given device during all power transitions, in addition to the respective
> +subsystem-level callbacks. Specifically, the power domain "suspend" callbacks
> +(i.e. ->runtime_suspend(), ->suspend(), ->freeze(), ->poweroff(), etc.) are
> +executed after the analogous subsystem-level callbacks, while the power domain
> +"resume" callbacks (i.e. ->runtime_resume(), ->resume(), ->thaw(), ->restore,
> +etc.) are executed before the analogous subsystem-level callbacks. Error codes
> +returned by the "suspend" and "resume" power domain callbacks are ignored.
> +
> +Power domain ->runtime_idle() callback is executed before the subsystem-level
> +->runtime_idle() callback and the result returned by it is not ignored. Namely,
> +if it returns error code, the subsystem-level ->runtime_idle() callback will not
> +be called and the helper function rpm_idle() executing it will return error
> +code. This mechanism is intended to help platforms where saving device state
> +is a time consuming operation and should only be carried out if all devices
> +in the power domain are idle, before turning off the shared power resource(s).
> +Namely, the power domain ->runtime_idle() callback may return error code until
> +the pm_runtime_idle() helper (or its asychronous version) has been called for
> +all devices in the power domain (it is recommended that the returned error code
> +be -EBUSY in those cases), preventing the subsystem-level ->runtime_idle()
> +callback from being run prematurely.
> +
> +The support for device power domains is only relevant to platforms needing to
> +use the same subsystem-level (e.g. platform bus type) and device driver power
> +management callbacks in many different power domain configurations and wanting
> +to avoid incorporating the support for power domains into the subsystem-level
> +callbacks. The other platforms need not implement it or take it into account
> +in any way.
> +
> +
> System Devices
> --------------
> System devices (sysdevs) follow a slightly different API, which can be found in

2011-02-15 19:48:32

by Grant Likely

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Tue, Feb 15, 2011 at 10:10:06AM -0800, Kevin Hilman wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The code handling system-wide power transitions (eg. suspend-to-RAM)
> > can in theory execute callbacks provided by the device's bus type,
> > device type and class in each phase of the power transition. In
> > turn, the runtime PM core code only calls one of those callbacks at
> > a time, preferring bus type callbacks to device type or class
> > callbacks and device type callbacks to class callbacks.
> >
> > It seems reasonable to make them both behave in the same way in that
> > respect. Moreover, even though a device may belong to two subsystems
> > (eg. bus type and device class) simultaneously, in practice power
> > management callbacks for system-wide power transitions are always
> > provided by only one of them (ie. if the bus type callbacks are
> > defined, the device class ones are not and vice versa). Thus it is
> > possible to modify the code handling system-wide power transitions
> > so that it follows the core runtime PM code (ie. treats the
> > subsystem callbacks as mutually exclusive).
> >
> > On the other hand, the core runtime PM code will choose to execute,
> > for example, a runtime suspend callback provided by the device type
> > even if the bus type's struct dev_pm_ops object exists, but the
> > runtime_suspend pointer in it happens to be NULL. This is confusing,
> > because it may lead to the execution of callbacks from different
> > subsystems during different operations (eg. the bus type suspend
> > callback may be executed during runtime suspend, while the device
> > type callback will be executed during runtime resume).
> >
> > Make all of the power management code treat subsystem callbacks in
> > a consistent way, such that:
> > (1) If the device's bus type is defined (eg. dev->bus is not NULL)
> > and its pm pointer is not NULL, the callbacks from dev->bus->pm
> > will be used.
> > (2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
> > device type is defined (eg. dev->type is not NULL) and its pm
> > pointer is not NULL, the callbacks from dev->type->pm will be
> > used.
> > (3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
> > NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
> > will be used provided that both dev->class and dev->class->pm
> > are not NULL.
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
>
> This looks good, consistency between system and runtime PM is a great
> help to readability.
>
> Acked-by: Kevin Hilman <[email protected]>

Reasoning-sounds-sane-to: Grant Likely <[email protected]>

2011-02-15 21:40:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Tuesday, February 15, 2011, Alan Stern wrote:
> On Mon, 14 Feb 2011, Rafael J. Wysocki wrote:
>
> > > I don't have a clear picture of how people are going to want to use
> > > these dev_power_domain structures. Should there be a
> > >
> > > void *priv;
> > >
> > > member as well?
> >
> > Well, I'm not sure. What would be the purpose of it?
>
> It's easy to imagine an SoC with multiple power domains and hence
> multiple structures. The platform code would need some way to tell
> those structures apart. But we should get some advice from the people
> who will actually have to use these things...

Well, anyway I think it should be easy to add a priv pointer at any time later
if there's a use case?

Rafael

2011-02-15 23:13:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/2] PM: Add support for device power domains

On Tuesday, February 15, 2011, Magnus Damm wrote:
> On Tue, Feb 15, 2011 at 7:34 AM, Rafael J. Wysocki <[email protected]> wrote:
> > On Monday, February 14, 2011, Alan Stern wrote:
> >> On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:
> >>
> >> > From: Rafael J. Wysocki <[email protected]>
> >> >
> >> > The platform bus type is often used to handle Systems-on-a-Chip (SoC)
> >> > where all devices are represented by objects of type struct
> >> > platform_device. In those cases the same "platform" device driver
> >> > may be used with multiple different system configurations, but the
> >> > actions needed to put the devices it handles into a low-power state
> >> > and back into the full-power state may depend on the design of the
> >> > given SoC. The driver, however, cannot possibly include all the
> >> > information necessary for the power management of its device on all
> >> > the systems it is used with. Moreover, the device hierarchy in its
> >> > current form also is not suitable for representing this kind of
> >> > information.
> >> >
> >> > The patch below attempts to address this problem by introducing
> >> > objects of type struct dev_power_domain that can be used for
> >> > representing power domains within a SoC. Every struct
> >> > dev_power_domain object provides a sets of device power
> >> > management callbacks that can be used to perform what's needed for
> >> > device power management in addition to the operations carried out by
> >> > the device's driver and subsystem.
> >> >
> >> > Namely, if a struct dev_power_domain object is pointed to by the
> >> > pwr_domain field in a struct device, the callbacks provided by its
> >> > ops member will be executed in addition to the corresponding
> >> > callbacks provided by the device's subsystem and driver during all
> >> > power transitions.
> >>
> >> Overall this looks very good.
>
> I think so too.
>
> >> > Index: linux-2.6/include/linux/pm.h
> >> > ===================================================================
> >> > --- linux-2.6.orig/include/linux/pm.h
> >> > +++ linux-2.6/include/linux/pm.h
> >> > @@ -463,6 +463,14 @@ struct dev_pm_info {
> >> >
> >> > extern void update_pm_runtime_accounting(struct device *dev);
> >> >
> >> > +/*
> >> > + * Power domains provide callbacks that are executed during system suspend,
> >> > + * hibernation, system resume and during runtime PM transitions along with
> >> > + * subsystem-level and driver-level callbacks.
> >> > + */
> >> > +struct dev_power_domain {
> >> > + struct dev_pm_ops ops;
> >> > +};
> >>
> >> I don't have a clear picture of how people are going to want to use
> >> these dev_power_domain structures. Should there be a
> >>
> >> void *priv;
> >>
> >> member as well?
> >
> > Well, I'm not sure. What would be the purpose of it?
>
> As Alan pointed out, a private pointer may be useful for the
> SoC-specific power domain code. I'm yet to tie in this code with
> working hardware power domain support, so it's hard for me to say if a
> private pointer will help or not at this point. An alternative to
> private pointers for the SoC-specific power domain code is to use
> devres_alloc().

I'm not against the priv pointer, but I'd simply prefer to add it once we
have a good use case for it.

> I believe it is important to have some kind of power domain mapping
> for each hardware block on a SoC device.
> On SH-Mobile I had this hidden in the SoC-specific code, grep for
> HWBLK to find my approach to deal with power domains. The HWBLK enums
> provide a unique identifier for each device instance on the SoC.
>
> Having the power domain code framework as generic as possible is of
> course a good idea. So I'm very happy to see these patches. Per-struct
> device power domain pointers make sense IMO.

Good.

> I do wonder how this ties into multiple levels of power management.
> This will be needed for Runtime PM at some point.
>
> To compare Runtime PM with CPUIdle, I believe the CPUIdle core can ask
> the cpu-specific code to enter a certain level of sleep mode, but the
> cpu-specific code can chose to not follow this and sleep lightly if
> for instance hardware dependencies prevent the deep sleep mode to be
> entered. I believe something similar will be needed on Runtime PM, but
> controlling power domains instead of sleep modes.
>
> Any ideas?

One possibility is that the power domain callbacks will decide what power
state to put the domain into (once "idle" has been called for all devices
in the domain). Of course, they will need some input allowing them to make
the decision (like what's the greatest acceptable resume latency for example).

Still, the PCI bus type has a similar problem, but PCI devices are put into
low-power states by the bus type callbacks, so those callbacks would have to
decide what state to choose (currently they simply choose the lowest power
state available and we don't really take the resume latency into
consideration).

So, whatever approach we invent, it should be suitable for both the cases
above, or it will have to stay platform-specific.

Thanks,
Rafael

2011-02-16 12:25:14

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Monday, February 14, 2011, Alan Stern wrote:
> On Sat, 12 Feb 2011, Rafael J. Wysocki wrote:
>
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The code handling system-wide power transitions (eg. suspend-to-RAM)
> > can in theory execute callbacks provided by the device's bus type,
> > device type and class in each phase of the power transition. In
> > turn, the runtime PM core code only calls one of those callbacks at
> > a time, preferring bus type callbacks to device type or class
> > callbacks and device type callbacks to class callbacks.
> >
> > It seems reasonable to make them both behave in the same way in that
> > respect. Moreover, even though a device may belong to two subsystems
> > (eg. bus type and device class) simultaneously, in practice power
> > management callbacks for system-wide power transitions are always
> > provided by only one of them (ie. if the bus type callbacks are
> > defined, the device class ones are not and vice versa). Thus it is
> > possible to modify the code handling system-wide power transitions
> > so that it follows the core runtime PM code (ie. treats the
> > subsystem callbacks as mutually exclusive).
> >
> > On the other hand, the core runtime PM code will choose to execute,
> > for example, a runtime suspend callback provided by the device type
> > even if the bus type's struct dev_pm_ops object exists, but the
> > runtime_suspend pointer in it happens to be NULL. This is confusing,
> > because it may lead to the execution of callbacks from different
> > subsystems during different operations (eg. the bus type suspend
> > callback may be executed during runtime suspend, while the device
> > type callback will be executed during runtime resume).
> >
> > Make all of the power management code treat subsystem callbacks in
> > a consistent way, such that:
> > (1) If the device's bus type is defined (eg. dev->bus is not NULL)
> > and its pm pointer is not NULL, the callbacks from dev->bus->pm
> > will be used.
> > (2) If dev->bus is NULL or dev->bus->pm is NULL, but the device's
> > device type is defined (eg. dev->type is not NULL) and its pm
> > pointer is not NULL, the callbacks from dev->type->pm will be
> > used.
> > (3) If dev->bus is NULL or dev->bus->pm is NULL and dev->type is
> > NULL or dev->type->pm is NULL, the callbacks from dev->class->pm
> > will be used provided that both dev->class and dev->class->pm
> > are not NULL.
>
> It looks okay, but I haven't tested it.

Unfortunately, it doesn't work on my Acer Ferrari One. The problem is that
hcd_pci_suspend() fails for the EHCI controller, apparently because the root
hub is not suspended. Do root hubs need both class suspend and bus type
suspend to work at the same time?

Rafael

2011-02-16 14:57:39

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:

> Unfortunately, it doesn't work on my Acer Ferrari One. The problem is that
> hcd_pci_suspend() fails for the EHCI controller, apparently because the root
> hub is not suspended. Do root hubs need both class suspend and bus type
> suspend to work at the same time?

No, only the bus type suspend method is needed. Can you provide the
dmesg log using a kernel built with CONFIG_USB_DEBUG?

Alan Stern

2011-02-16 21:48:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Wednesday, February 16, 2011, Alan Stern wrote:
> On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:
>
> > Unfortunately, it doesn't work on my Acer Ferrari One. The problem is that
> > hcd_pci_suspend() fails for the EHCI controller, apparently because the root
> > hub is not suspended. Do root hubs need both class suspend and bus type
> > suspend to work at the same time?
>
> No, only the bus type suspend method is needed.

Bus type or device type? It appears to be the latter from reading the code
(although admittedly not too thorough).

> Can you provide the dmesg log using a kernel built with CONFIG_USB_DEBUG?

Well, I know what the problem is.

USB defines usb_device_type pointing to usb_device_pm_ops that provides
system-wide PM callbacks only and usb_bus_type pointing to
usb_bus_pm_ops that provides runtime PM callbacks only. So it looks like
we should move either the system-wide PM callbacks to usb_bus_pm_ops,
or the runtime PM callbacks to usb_device_pm_ops.

FWIW, the appended patch helps on my test machine, but I'm not sure if it
is the right thing to do.

Apart from this I think the order of checks introduced by the $subject patch
should be:
(1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
or otherwise
(2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
or otherwise
(3) use dev->bus (if present).
as that would allow classes and device types to override bus type PM
callbacks if they wish to.

Thanks,
Rafael

---
drivers/usb/core/driver.c | 15 +++------------
drivers/usb/core/usb.c | 5 +++++
drivers/usb/core/usb.h | 18 ++++++++++++++++++
3 files changed, 26 insertions(+), 12 deletions(-)

Index: linux-2.6/drivers/usb/core/driver.c
===================================================================
--- linux-2.6.orig/drivers/usb/core/driver.c
+++ linux-2.6/drivers/usb/core/driver.c
@@ -1646,7 +1646,7 @@ static int autosuspend_check(struct usb_
return 0;
}

-static int usb_runtime_suspend(struct device *dev)
+int usb_runtime_suspend(struct device *dev)
{
struct usb_device *udev = to_usb_device(dev);
int status;
@@ -1662,7 +1662,7 @@ static int usb_runtime_suspend(struct de
return status;
}

-static int usb_runtime_resume(struct device *dev)
+int usb_runtime_resume(struct device *dev)
{
struct usb_device *udev = to_usb_device(dev);
int status;
@@ -1674,7 +1674,7 @@ static int usb_runtime_resume(struct dev
return status;
}

-static int usb_runtime_idle(struct device *dev)
+int usb_runtime_idle(struct device *dev)
{
struct usb_device *udev = to_usb_device(dev);

@@ -1686,19 +1686,10 @@ static int usb_runtime_idle(struct devic
return 0;
}

-static const struct dev_pm_ops usb_bus_pm_ops = {
- .runtime_suspend = usb_runtime_suspend,
- .runtime_resume = usb_runtime_resume,
- .runtime_idle = usb_runtime_idle,
-};
-
#endif /* CONFIG_USB_SUSPEND */

struct bus_type usb_bus_type = {
.name = "usb",
.match = usb_device_match,
.uevent = usb_uevent,
-#ifdef CONFIG_USB_SUSPEND
- .pm = &usb_bus_pm_ops,
-#endif
};
Index: linux-2.6/drivers/usb/core/usb.h
===================================================================
--- linux-2.6.orig/drivers/usb/core/usb.h
+++ linux-2.6/drivers/usb/core/usb.h
@@ -77,6 +77,9 @@ static inline int usb_port_resume(struct
extern void usb_autosuspend_device(struct usb_device *udev);
extern int usb_autoresume_device(struct usb_device *udev);
extern int usb_remote_wakeup(struct usb_device *dev);
+extern int usb_runtime_suspend(struct device *dev);
+extern int usb_runtime_resume(struct device *dev);
+extern int usb_runtime_idle(struct device *dev);

#else

@@ -90,6 +93,21 @@ static inline int usb_remote_wakeup(stru
{
return 0;
}
+
+static inline int usb_runtime_suspend(struct device *dev)
+{
+ return 0;
+}
+
+static inline int usb_runtime_resume(struct device *dev)
+{
+ return 0;
+}
+
+static inline int usb_runtime_idle(struct device *dev)
+{
+ return 0;
+}

#endif

Index: linux-2.6/drivers/usb/core/usb.c
===================================================================
--- linux-2.6.orig/drivers/usb/core/usb.c
+++ linux-2.6/drivers/usb/core/usb.c
@@ -315,6 +315,11 @@ static const struct dev_pm_ops usb_devic
.thaw = usb_dev_thaw,
.poweroff = usb_dev_poweroff,
.restore = usb_dev_restore,
+#ifdef CONFIG_USB_SUSPEND
+ .runtime_suspend = usb_runtime_suspend,
+ .runtime_resume = usb_runtime_resume,
+ .runtime_idle = usb_runtime_idle,
+#endif
};

#endif /* CONFIG_PM */

2011-02-16 22:23:15

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:

> On Wednesday, February 16, 2011, Alan Stern wrote:
> > On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:
> >
> > > Unfortunately, it doesn't work on my Acer Ferrari One. The problem is that
> > > hcd_pci_suspend() fails for the EHCI controller, apparently because the root
> > > hub is not suspended. Do root hubs need both class suspend and bus type
> > > suspend to work at the same time?
> >
> > No, only the bus type suspend method is needed.
>
> Bus type or device type? It appears to be the latter from reading the code
> (although admittedly not too thorough).

You're right. I forgot about how the PM methods were split up. :-(

> > Can you provide the dmesg log using a kernel built with CONFIG_USB_DEBUG?
>
> Well, I know what the problem is.
>
> USB defines usb_device_type pointing to usb_device_pm_ops that provides
> system-wide PM callbacks only and usb_bus_type pointing to
> usb_bus_pm_ops that provides runtime PM callbacks only. So it looks like
> we should move either the system-wide PM callbacks to usb_bus_pm_ops,
> or the runtime PM callbacks to usb_device_pm_ops.

Yes. IIRC, I did it that way so that the runtime PM routines could be
static. Making them non-static isn't a big deal, though.

> FWIW, the appended patch helps on my test machine, but I'm not sure if it
> is the right thing to do.

It is. Except that the inline stubs aren't needed for anything; they
don't have to be added to usb.h.

> Apart from this I think the order of checks introduced by the $subject patch
> should be:
> (1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
> or otherwise
> (2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
> or otherwise
> (3) use dev->bus (if present).
> as that would allow classes and device types to override bus type PM
> callbacks if they wish to.

I haven't heard of any device types being present on more than one kind
of bus, so it makes sense for device types to override bus types. But
I'm not so sure about the priority we should give to classes. On the
other hand, if no classes define a dev_pm_ops then of course it doesn't
matter.

Alan Stern

2011-02-16 23:46:11

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Wednesday, February 16, 2011, Alan Stern wrote:
> On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:
>
> > On Wednesday, February 16, 2011, Alan Stern wrote:
> > > On Wed, 16 Feb 2011, Rafael J. Wysocki wrote:
> > >
> > > > Unfortunately, it doesn't work on my Acer Ferrari One. The problem is that
> > > > hcd_pci_suspend() fails for the EHCI controller, apparently because the root
> > > > hub is not suspended. Do root hubs need both class suspend and bus type
> > > > suspend to work at the same time?
> > >
> > > No, only the bus type suspend method is needed.
> >
> > Bus type or device type? It appears to be the latter from reading the code
> > (although admittedly not too thorough).
>
> You're right. I forgot about how the PM methods were split up. :-(
>
> > > Can you provide the dmesg log using a kernel built with CONFIG_USB_DEBUG?
> >
> > Well, I know what the problem is.
> >
> > USB defines usb_device_type pointing to usb_device_pm_ops that provides
> > system-wide PM callbacks only and usb_bus_type pointing to
> > usb_bus_pm_ops that provides runtime PM callbacks only. So it looks like
> > we should move either the system-wide PM callbacks to usb_bus_pm_ops,
> > or the runtime PM callbacks to usb_device_pm_ops.
>
> Yes. IIRC, I did it that way so that the runtime PM routines could be
> static. Making them non-static isn't a big deal, though.
>
> > FWIW, the appended patch helps on my test machine, but I'm not sure if it
> > is the right thing to do.
>
> It is. Except that the inline stubs aren't needed for anything; they
> don't have to be added to usb.h.

OK, I'll remove them, add a changelog and repost.

> > Apart from this I think the order of checks introduced by the $subject patch
> > should be:
> > (1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
> > or otherwise
> > (2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
> > or otherwise
> > (3) use dev->bus (if present).
> > as that would allow classes and device types to override bus type PM
> > callbacks if they wish to.
>
> I haven't heard of any device types being present on more than one kind
> of bus, so it makes sense for device types to override bus types.

OK

> But I'm not so sure about the priority we should give to classes. On the
> other hand, if no classes define a dev_pm_ops then of course it doesn't
> matter.

The change will also affect classes that provide "legacy" suspend-resume
(if there are any, which I'm totally unsure of).

Anyway, I think we need to choose one ordering. :-)

What about type / bus / class , then?

Rafael

2011-02-17 14:55:51

by Alan Stern

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Thu, 17 Feb 2011, Rafael J. Wysocki wrote:

> > > Apart from this I think the order of checks introduced by the $subject patch
> > > should be:
> > > (1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
> > > or otherwise
> > > (2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
> > > or otherwise
> > > (3) use dev->bus (if present).
> > > as that would allow classes and device types to override bus type PM
> > > callbacks if they wish to.
> >
> > I haven't heard of any device types being present on more than one kind
> > of bus, so it makes sense for device types to override bus types.
>
> OK
>
> > But I'm not so sure about the priority we should give to classes. On the
> > other hand, if no classes define a dev_pm_ops then of course it doesn't
> > matter.
>
> The change will also affect classes that provide "legacy" suspend-resume
> (if there are any, which I'm totally unsure of).
>
> Anyway, I think we need to choose one ordering. :-)
>
> What about type / bus / class , then?

I really don't know. Somebody who has more experience with device
class implementations should answer.

Greg, any ideas?

To recap: The issue is how to handle multiple PM callbacks. Since the
bus type, device type, and device class may all have their own
callbacks, Rafael has decided the best approach is to prioritize them
and invoke only the highest-priority callback. But what priority order
should we use?

Alan Stern

2011-02-17 17:08:06

by Greg KH

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Thu, Feb 17, 2011 at 09:55:46AM -0500, Alan Stern wrote:
> On Thu, 17 Feb 2011, Rafael J. Wysocki wrote:
>
> > > > Apart from this I think the order of checks introduced by the $subject patch
> > > > should be:
> > > > (1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
> > > > or otherwise
> > > > (2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
> > > > or otherwise
> > > > (3) use dev->bus (if present).
> > > > as that would allow classes and device types to override bus type PM
> > > > callbacks if they wish to.
> > >
> > > I haven't heard of any device types being present on more than one kind
> > > of bus, so it makes sense for device types to override bus types.
> >
> > OK
> >
> > > But I'm not so sure about the priority we should give to classes. On the
> > > other hand, if no classes define a dev_pm_ops then of course it doesn't
> > > matter.
> >
> > The change will also affect classes that provide "legacy" suspend-resume
> > (if there are any, which I'm totally unsure of).
> >
> > Anyway, I think we need to choose one ordering. :-)
> >
> > What about type / bus / class , then?
>
> I really don't know. Somebody who has more experience with device
> class implementations should answer.
>
> Greg, any ideas?
>
> To recap: The issue is how to handle multiple PM callbacks. Since the
> bus type, device type, and device class may all have their own
> callbacks, Rafael has decided the best approach is to prioritize them
> and invoke only the highest-priority callback. But what priority order
> should we use?

I think we should do it in the following order:
device type
device class
device bus

for the reasons that a device itself could override the default class
and bus information if it "knows" it is special. After that, the class
of the device holds a lot of information about what is going on with the
logic involved (i.e. network stuff), and lastly, the bus knows some
default hardware information.

Sound reasonable?

I think that follows the default we have today, right?

thanks,

greg k-h

2011-02-17 22:16:41

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/2] PM: Make system-wide PM and runtime PM handle subsystems consistently

On Thursday, February 17, 2011, Greg KH wrote:
> On Thu, Feb 17, 2011 at 09:55:46AM -0500, Alan Stern wrote:
> > On Thu, 17 Feb 2011, Rafael J. Wysocki wrote:
> >
> > > > > Apart from this I think the order of checks introduced by the $subject patch
> > > > > should be:
> > > > > (1) If dev->class != NULL and dev->class->pm != NULL, use dev->class,
> > > > > or otherwise
> > > > > (2) if dev->type != NULL and dev->type->pm != NULL, use dev->type,
> > > > > or otherwise
> > > > > (3) use dev->bus (if present).
> > > > > as that would allow classes and device types to override bus type PM
> > > > > callbacks if they wish to.
> > > >
> > > > I haven't heard of any device types being present on more than one kind
> > > > of bus, so it makes sense for device types to override bus types.
> > >
> > > OK
> > >
> > > > But I'm not so sure about the priority we should give to classes. On the
> > > > other hand, if no classes define a dev_pm_ops then of course it doesn't
> > > > matter.
> > >
> > > The change will also affect classes that provide "legacy" suspend-resume
> > > (if there are any, which I'm totally unsure of).
> > >
> > > Anyway, I think we need to choose one ordering. :-)
> > >
> > > What about type / bus / class , then?
> >
> > I really don't know. Somebody who has more experience with device
> > class implementations should answer.
> >
> > Greg, any ideas?
> >
> > To recap: The issue is how to handle multiple PM callbacks. Since the
> > bus type, device type, and device class may all have their own
> > callbacks, Rafael has decided the best approach is to prioritize them
> > and invoke only the highest-priority callback. But what priority order
> > should we use?
>
> I think we should do it in the following order:
> device type
> device class
> device bus
>
> for the reasons that a device itself could override the default class
> and bus information if it "knows" it is special. After that, the class
> of the device holds a lot of information about what is going on with the
> logic involved (i.e. network stuff), and lastly, the bus knows some
> default hardware information.
>
> Sound reasonable?
>
> I think that follows the default we have today, right?

No, right now the class goes first during suspend and last during resume (and
of course they all can be called during system suspend/resume now).

That said I'm going to follow your suggestion. :-)

Thanks,
Rafael

2011-02-17 23:57:53

by R. J. Wysocki

[permalink] [raw]
Subject: [PATCH] PM: Make system-wide PM and runtime PM treat subsystems consistently

From: Rafael J. Wysocki <[email protected]>

The code handling system-wide power transitions (eg. suspend-to-RAM)
can in theory execute callbacks provided by the device's bus type,
device type and class in each phase of the power transition. In
turn, the runtime PM core code only calls one of those callbacks at
a time, preferring bus type callbacks to device type or class
callbacks and device type callbacks to class callbacks.

It seems reasonable to make them both behave in the same way in that
respect. Moreover, even though a device may belong to two subsystems
(eg. bus type and device class) simultaneously, in practice power
management callbacks for system-wide power transitions are always
provided by only one of them (ie. if the bus type callbacks are
defined, the device class ones are not and vice versa). Thus it is
possible to modify the code handling system-wide power transitions
so that it follows the core runtime PM code (ie. treats the
subsystem callbacks as mutually exclusive).

On the other hand, the core runtime PM code will choose to execute,
for example, a runtime suspend callback provided by the device type
even if the bus type's struct dev_pm_ops object exists, but the
runtime_suspend pointer in it happens to be NULL. This is confusing,
because it may lead to the execution of callbacks from different
subsystems during different operations (eg. the bus type suspend
callback may be executed during runtime suspend of the device, while
the device type callback will be executed during system suspend).

Make all of the power management code treat subsystem callbacks in
a consistent way, such that:
(1) If the device's type is defined (eg. dev->type is not NULL)
and its pm pointer is not NULL, the callbacks from dev->type->pm
will be used.
(2) If dev->type is NULL or dev->type->pm is NULL, but the device's
class is defined (eg. dev->class is not NULL) and its pm pointer
is not NULL, the callbacks from dev->class->pm will be used.
(3) If dev->type is NULL or dev->type->pm is NULL and dev->class is
NULL or dev->class->pm is NULL, the callbacks from dev->bus->pm
will be used provided that both dev->bus and dev->bus->pm are
not NULL.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Kevin Hilman <[email protected]>
Reasoning-sounds-sane-to: Grant Likely <[email protected]>
---

Hi,

This patch is on top of suspend-2.6/linux-next.

I changed the ordering of callbacks following your advice, hopefully the
ACKs above still apply.

Thanks,
Rafael

---
Documentation/power/devices.txt | 28 +++----
Documentation/power/devices.txt | 29 ++-----
Documentation/power/runtime_pm.txt | 13 +--
drivers/base/power/main.c | 150 +++++++++++++++----------------------
drivers/base/power/runtime.c | 18 ++--
4 files changed, 92 insertions(+), 118 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -428,26 +428,17 @@ static int device_resume_noirq(struct de
pm_noirq_op(dev, &dev->pwr_domain->ops, state);
}

- if (dev->bus && dev->bus->pm) {
- pm_dev_dbg(dev, state, "EARLY ");
- error = pm_noirq_op(dev, dev->bus->pm, state);
- if (error)
- goto End;
- }
-
if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "EARLY type ");
error = pm_noirq_op(dev, dev->type->pm, state);
- if (error)
- goto End;
- }
-
- if (dev->class && dev->class->pm) {
+ } else if (dev->class && dev->class->pm) {
pm_dev_dbg(dev, state, "EARLY class ");
error = pm_noirq_op(dev, dev->class->pm, state);
+ } else if (dev->bus && dev->bus->pm) {
+ pm_dev_dbg(dev, state, "EARLY ");
+ error = pm_noirq_op(dev, dev->bus->pm, state);
}

-End:
TRACE_RESUME(error);
return error;
}
@@ -528,36 +519,34 @@ static int device_resume(struct device *
pm_op(dev, &dev->pwr_domain->ops, state);
}

- if (dev->bus) {
- if (dev->bus->pm) {
- pm_dev_dbg(dev, state, "");
- error = pm_op(dev, dev->bus->pm, state);
- } else if (dev->bus->resume) {
- pm_dev_dbg(dev, state, "legacy ");
- error = legacy_resume(dev, dev->bus->resume);
- }
- if (error)
- goto End;
- }
-
- if (dev->type) {
- if (dev->type->pm) {
- pm_dev_dbg(dev, state, "type ");
- error = pm_op(dev, dev->type->pm, state);
- }
- if (error)
- goto End;
+ if (dev->type && dev->type->pm) {
+ pm_dev_dbg(dev, state, "type ");
+ error = pm_op(dev, dev->type->pm, state);
+ goto End;
}

if (dev->class) {
if (dev->class->pm) {
pm_dev_dbg(dev, state, "class ");
error = pm_op(dev, dev->class->pm, state);
+ goto End;
} else if (dev->class->resume) {
pm_dev_dbg(dev, state, "legacy class ");
error = legacy_resume(dev, dev->class->resume);
+ goto End;
}
}
+
+ if (dev->bus) {
+ if (dev->bus->pm) {
+ pm_dev_dbg(dev, state, "");
+ error = pm_op(dev, dev->bus->pm, state);
+ } else if (dev->bus->resume) {
+ pm_dev_dbg(dev, state, "legacy ");
+ error = legacy_resume(dev, dev->bus->resume);
+ }
+ }
+
End:
device_unlock(dev);
complete_all(&dev->power.completion);
@@ -644,19 +633,18 @@ static void device_complete(struct devic
dev->pwr_domain->ops.complete(dev);
}

- if (dev->class && dev->class->pm && dev->class->pm->complete) {
- pm_dev_dbg(dev, state, "completing class ");
- dev->class->pm->complete(dev);
- }
-
- if (dev->type && dev->type->pm && dev->type->pm->complete) {
+ if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "completing type ");
- dev->type->pm->complete(dev);
- }
-
- if (dev->bus && dev->bus->pm && dev->bus->pm->complete) {
+ if (dev->type->pm->complete)
+ dev->type->pm->complete(dev);
+ } else if (dev->class && dev->class->pm) {
+ pm_dev_dbg(dev, state, "completing class ");
+ if (dev->class->pm->complete)
+ dev->class->pm->complete(dev);
+ } else if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "completing ");
- dev->bus->pm->complete(dev);
+ if (dev->bus->pm->complete)
+ dev->bus->pm->complete(dev);
}

device_unlock(dev);
@@ -741,27 +729,23 @@ static pm_message_t resume_event(pm_mess
*/
static int device_suspend_noirq(struct device *dev, pm_message_t state)
{
- int error = 0;
-
- if (dev->class && dev->class->pm) {
- pm_dev_dbg(dev, state, "LATE class ");
- error = pm_noirq_op(dev, dev->class->pm, state);
- if (error)
- goto End;
- }
+ int error;

if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "LATE type ");
error = pm_noirq_op(dev, dev->type->pm, state);
if (error)
- goto End;
- }
-
- if (dev->bus && dev->bus->pm) {
+ return error;
+ } else if (dev->class && dev->class->pm) {
+ pm_dev_dbg(dev, state, "LATE class ");
+ error = pm_noirq_op(dev, dev->class->pm, state);
+ if (error)
+ return error;
+ } else if (dev->bus && dev->bus->pm) {
pm_dev_dbg(dev, state, "LATE ");
error = pm_noirq_op(dev, dev->bus->pm, state);
if (error)
- goto End;
+ return error;
}

if (dev->pwr_domain) {
@@ -769,8 +753,7 @@ static int device_suspend_noirq(struct d
pm_noirq_op(dev, &dev->pwr_domain->ops, state);
}

-End:
- return error;
+ return 0;
}

/**
@@ -857,25 +840,22 @@ static int __device_suspend(struct devic
goto End;
}

+ if (dev->type && dev->type->pm) {
+ pm_dev_dbg(dev, state, "type ");
+ error = pm_op(dev, dev->type->pm, state);
+ goto Domain;
+ }
+
if (dev->class) {
if (dev->class->pm) {
pm_dev_dbg(dev, state, "class ");
error = pm_op(dev, dev->class->pm, state);
+ goto Domain;
} else if (dev->class->suspend) {
pm_dev_dbg(dev, state, "legacy class ");
error = legacy_suspend(dev, state, dev->class->suspend);
+ goto Domain;
}
- if (error)
- goto End;
- }
-
- if (dev->type) {
- if (dev->type->pm) {
- pm_dev_dbg(dev, state, "type ");
- error = pm_op(dev, dev->type->pm, state);
- }
- if (error)
- goto End;
}

if (dev->bus) {
@@ -886,11 +866,10 @@ static int __device_suspend(struct devic
pm_dev_dbg(dev, state, "legacy ");
error = legacy_suspend(dev, state, dev->bus->suspend);
}
- if (error)
- goto End;
}

- if (dev->pwr_domain) {
+ Domain:
+ if (!error && dev->pwr_domain) {
pm_dev_dbg(dev, state, "power domain ");
pm_op(dev, &dev->pwr_domain->ops, state);
}
@@ -985,28 +964,27 @@ static int device_prepare(struct device

device_lock(dev);

- if (dev->bus && dev->bus->pm && dev->bus->pm->prepare) {
- pm_dev_dbg(dev, state, "preparing ");
- error = dev->bus->pm->prepare(dev);
- suspend_report_result(dev->bus->pm->prepare, error);
- if (error)
- goto End;
- }
-
- if (dev->type && dev->type->pm && dev->type->pm->prepare) {
+ if (dev->type && dev->type->pm) {
pm_dev_dbg(dev, state, "preparing type ");
- error = dev->type->pm->prepare(dev);
+ if (dev->type->pm->prepare)
+ error = dev->type->pm->prepare(dev);
suspend_report_result(dev->type->pm->prepare, error);
if (error)
goto End;
- }
-
- if (dev->class && dev->class->pm && dev->class->pm->prepare) {
+ } else if (dev->class && dev->class->pm) {
pm_dev_dbg(dev, state, "preparing class ");
- error = dev->class->pm->prepare(dev);
+ if (dev->class->pm->prepare)
+ error = dev->class->pm->prepare(dev);
suspend_report_result(dev->class->pm->prepare, error);
if (error)
goto End;
+ } else if (dev->bus && dev->bus->pm) {
+ pm_dev_dbg(dev, state, "preparing ");
+ if (dev->bus->pm->prepare)
+ error = dev->bus->pm->prepare(dev);
+ suspend_report_result(dev->bus->pm->prepare, error);
+ if (error)
+ goto End;
}

if (dev->pwr_domain && dev->pwr_domain->ops.prepare) {
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- linux-2.6.orig/drivers/base/power/runtime.c
+++ linux-2.6/drivers/base/power/runtime.c
@@ -214,12 +214,12 @@ static int rpm_idle(struct device *dev,

dev->power.idle_notification = true;

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle)
- callback = dev->bus->pm->runtime_idle;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_idle)
+ if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_idle;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_idle;
+ else if (dev->bus && dev->bus->pm)
+ callback = dev->bus->pm->runtime_idle;
else
callback = NULL;

@@ -382,12 +382,12 @@ static int rpm_suspend(struct device *de

__update_runtime_status(dev, RPM_SUSPENDING);

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend)
- callback = dev->bus->pm->runtime_suspend;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_suspend)
+ if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_suspend;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_suspend;
+ else if (dev->bus && dev->bus->pm)
+ callback = dev->bus->pm->runtime_suspend;
else
callback = NULL;

@@ -584,12 +584,12 @@ static int rpm_resume(struct device *dev
if (dev->pwr_domain)
rpm_callback(dev->pwr_domain->ops.runtime_resume, dev);

- if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
- callback = dev->bus->pm->runtime_resume;
- else if (dev->type && dev->type->pm && dev->type->pm->runtime_resume)
+ if (dev->type && dev->type->pm)
callback = dev->type->pm->runtime_resume;
else if (dev->class && dev->class->pm)
callback = dev->class->pm->runtime_resume;
+ else if (dev->bus && dev->bus->pm)
+ callback = dev->bus->pm->runtime_resume;
else
callback = NULL;

Index: linux-2.6/Documentation/power/devices.txt
===================================================================
--- linux-2.6.orig/Documentation/power/devices.txt
+++ linux-2.6/Documentation/power/devices.txt
@@ -249,23 +249,18 @@ various phases always run after tasks ha
unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
been disabled (except for those marked with the IRQ_WAKEUP flag).

-Most phases use bus, type, and class callbacks (that is, methods defined in
-dev->bus->pm, dev->type->pm, and dev->class->pm). The prepare and complete
-phases are exceptions; they use only bus callbacks. When multiple callbacks
-are used in a phase, they are invoked in the order: <class, type, bus> during
-power-down transitions and in the opposite order during power-up transitions.
-For example, during the suspend phase the PM core invokes
-
- dev->class->pm.suspend(dev);
- dev->type->pm.suspend(dev);
- dev->bus->pm.suspend(dev);
-
-before moving on to the next device, whereas during the resume phase the core
-invokes
-
- dev->bus->pm.resume(dev);
- dev->type->pm.resume(dev);
- dev->class->pm.resume(dev);
+All phases use bus, type, or class callbacks (that is, methods defined in
+dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
+exclusive, so if the device type provides a struct dev_pm_ops object pointed to
+by its pm field (i.e. both dev->type and dev->type->pm are defined), the
+callbacks included in that object (i.e. dev->type->pm) will be used. Otherwise,
+if the class provides a struct dev_pm_ops object pointed to by its pm field
+(i.e. both dev->class and dev->class->pm are defined), the PM core will use the
+callbacks from that object (i.e. dev->class->pm). Finally, if the pm fields of
+both the device type and class objects are NULL (or those objects do not exist),
+the callbacks provided by the bus (that is, the callbacks from dev->bus->pm)
+will be used (this allows device types to override callbacks provided by bus
+types or classes if necessary).

These callbacks may in turn invoke device- or driver-specific methods stored in
dev->driver->pm, but they don't have to.
Index: linux-2.6/Documentation/power/runtime_pm.txt
===================================================================
--- linux-2.6.orig/Documentation/power/runtime_pm.txt
+++ linux-2.6/Documentation/power/runtime_pm.txt
@@ -1,6 +1,6 @@
Run-time Power Management Framework for I/O Devices

-(C) 2009 Rafael J. Wysocki <[email protected]>, Novell Inc.
+(C) 2009-2011 Rafael J. Wysocki <[email protected]>, Novell Inc.
(C) 2010 Alan Stern <[email protected]>

1. Introduction
@@ -44,11 +44,12 @@ struct dev_pm_ops {
};

The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are
-executed by the PM core for either the bus type, or device type (if the bus
-type's callback is not defined), or device class (if the bus type's and device
-type's callbacks are not defined) of given device. The bus type, device type
-and device class callbacks are referred to as subsystem-level callbacks in what
-follows.
+executed by the PM core for either the device type, or the class (if the device
+type's struct dev_pm_ops object does not exist), or the bus type (if the
+device type's and class' struct dev_pm_ops objects do not exist) of the given
+device (this allows device types to override callbacks provided by bus types or
+classes if necessary). The bus type, device type and class callbacks are
+referred to as subsystem-level callbacks in what follows.

By default, the callbacks are always invoked in process context with interrupts
enabled. However, subsystems can use the pm_runtime_irq_safe() helper function

2011-02-18 19:22:59

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] PM: Make system-wide PM and runtime PM treat subsystems consistently

On Fri, Feb 18, 2011 at 12:54:25AM +0100, R. J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The code handling system-wide power transitions (eg. suspend-to-RAM)
> can in theory execute callbacks provided by the device's bus type,
> device type and class in each phase of the power transition. In
> turn, the runtime PM core code only calls one of those callbacks at
> a time, preferring bus type callbacks to device type or class
> callbacks and device type callbacks to class callbacks.
>
> It seems reasonable to make them both behave in the same way in that
> respect. Moreover, even though a device may belong to two subsystems
> (eg. bus type and device class) simultaneously, in practice power
> management callbacks for system-wide power transitions are always
> provided by only one of them (ie. if the bus type callbacks are
> defined, the device class ones are not and vice versa). Thus it is
> possible to modify the code handling system-wide power transitions
> so that it follows the core runtime PM code (ie. treats the
> subsystem callbacks as mutually exclusive).
>
> On the other hand, the core runtime PM code will choose to execute,
> for example, a runtime suspend callback provided by the device type
> even if the bus type's struct dev_pm_ops object exists, but the
> runtime_suspend pointer in it happens to be NULL. This is confusing,
> because it may lead to the execution of callbacks from different
> subsystems during different operations (eg. the bus type suspend
> callback may be executed during runtime suspend of the device, while
> the device type callback will be executed during system suspend).
>
> Make all of the power management code treat subsystem callbacks in
> a consistent way, such that:
> (1) If the device's type is defined (eg. dev->type is not NULL)
> and its pm pointer is not NULL, the callbacks from dev->type->pm
> will be used.
> (2) If dev->type is NULL or dev->type->pm is NULL, but the device's
> class is defined (eg. dev->class is not NULL) and its pm pointer
> is not NULL, the callbacks from dev->class->pm will be used.
> (3) If dev->type is NULL or dev->type->pm is NULL and dev->class is
> NULL or dev->class->pm is NULL, the callbacks from dev->bus->pm
> will be used provided that both dev->bus and dev->bus->pm are
> not NULL.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> Acked-by: Kevin Hilman <[email protected]>
> Reasoning-sounds-sane-to: Grant Likely <[email protected]>

Acked-by: Greg Kroah-Hartman <[email protected]>

You are going to take this through your tree, right?

thanks,

greg k-h

2011-02-18 20:14:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PM: Make system-wide PM and runtime PM treat subsystems consistently

On Friday, February 18, 2011, Greg KH wrote:
> On Fri, Feb 18, 2011 at 12:54:25AM +0100, R. J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The code handling system-wide power transitions (eg. suspend-to-RAM)
> > can in theory execute callbacks provided by the device's bus type,
> > device type and class in each phase of the power transition. In
> > turn, the runtime PM core code only calls one of those callbacks at
> > a time, preferring bus type callbacks to device type or class
> > callbacks and device type callbacks to class callbacks.
> >
> > It seems reasonable to make them both behave in the same way in that
> > respect. Moreover, even though a device may belong to two subsystems
> > (eg. bus type and device class) simultaneously, in practice power
> > management callbacks for system-wide power transitions are always
> > provided by only one of them (ie. if the bus type callbacks are
> > defined, the device class ones are not and vice versa). Thus it is
> > possible to modify the code handling system-wide power transitions
> > so that it follows the core runtime PM code (ie. treats the
> > subsystem callbacks as mutually exclusive).
> >
> > On the other hand, the core runtime PM code will choose to execute,
> > for example, a runtime suspend callback provided by the device type
> > even if the bus type's struct dev_pm_ops object exists, but the
> > runtime_suspend pointer in it happens to be NULL. This is confusing,
> > because it may lead to the execution of callbacks from different
> > subsystems during different operations (eg. the bus type suspend
> > callback may be executed during runtime suspend of the device, while
> > the device type callback will be executed during system suspend).
> >
> > Make all of the power management code treat subsystem callbacks in
> > a consistent way, such that:
> > (1) If the device's type is defined (eg. dev->type is not NULL)
> > and its pm pointer is not NULL, the callbacks from dev->type->pm
> > will be used.
> > (2) If dev->type is NULL or dev->type->pm is NULL, but the device's
> > class is defined (eg. dev->class is not NULL) and its pm pointer
> > is not NULL, the callbacks from dev->class->pm will be used.
> > (3) If dev->type is NULL or dev->type->pm is NULL and dev->class is
> > NULL or dev->class->pm is NULL, the callbacks from dev->bus->pm
> > will be used provided that both dev->bus and dev->bus->pm are
> > not NULL.
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > Acked-by: Kevin Hilman <[email protected]>
> > Reasoning-sounds-sane-to: Grant Likely <[email protected]>
>
> Acked-by: Greg Kroah-Hartman <[email protected]>
>
> You are going to take this through your tree, right?

Yes, I am. Thanks!

Rafael