This patch set tries to address Russell's concerns with platform
pm code calling into the driver for every block in the Cortex A9s
during idle, hotplug, and suspend. The first patch adds cpu pm
notifiers that can be called by platform code, the second uses
the notifier to save and restore the GIC state, and the third
saves the VFP state.

The notifiers are used for two types of events, CPU PM events and
CPU complex PM events. CPU PM events are used to save and restore
per-cpu context when a single CPU is preparing to enter or has
just exited a low power state. For example, the VFP saves the
last thread context, and the GIC saves banked CPU registers.

CPU complex events are used after all the CPUs in a power domain
have been prepared for the low power state. The GIC uses these
events to save global register state.

Platforms that call the cpu_pm APIs must select
CONFIG_ARCH_USES_CPU_PM.

L2 cache is not covered by this patch set, as the determination
of when the L2 is reset and when it is retained is
platform-specific, and most of the APIs necessary are already
present.
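
As a rough illustration of the intended call order (not part of the
patches), a platform idle path might look like the sketch below. The
cpu_pm_* functions here are userspace stubs that only log their name,
and example_enter_lowpower(), platform_do_lowpower() and the last_cpu
flag are hypothetical names invented for the example:

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-ins for the cpu_pm API from patch 1; they only log. */
static char trace[128];

static int log_call(const char *name)
{
	assert(strlen(trace) + strlen(name) + 2 <= sizeof(trace));
	strcat(trace, name);
	strcat(trace, ";");
	return 0;
}

static int cpu_pm_enter(void)         { return log_call("cpu_pm_enter"); }
static int cpu_pm_exit(void)          { return log_call("cpu_pm_exit"); }
static int cpu_complex_pm_enter(void) { return log_call("cpu_complex_pm_enter"); }
static int cpu_complex_pm_exit(void)  { return log_call("cpu_complex_pm_exit"); }

/* Hypothetical platform hook: the real one would power down the CPU. */
static void platform_do_lowpower(void)
{
	log_call("lowpower");
}

/*
 * Hypothetical idle path for one CPU, called with interrupts disabled.
 * last_cpu is nonzero when every other CPU in the power domain has
 * already gone through cpu_pm_enter().
 */
static int example_enter_lowpower(int last_cpu)
{
	int ret;

	ret = cpu_pm_enter();
	if (ret)
		return ret;	/* listeners were already rolled back */

	if (last_cpu) {
		ret = cpu_complex_pm_enter();
		if (ret) {
			cpu_pm_exit();	/* undo the per-cpu save */
			return ret;
		}
	}

	platform_do_lowpower();

	if (last_cpu)
		cpu_complex_pm_exit();
	cpu_pm_exit();
	return 0;
}
```

If cpu_pm_enter() fails, the notifiers that already ran have been sent
CPU_PM_ENTER_FAILED, so the caller only has to bail out. If
cpu_complex_pm_enter() fails, the caller still owes a cpu_pm_exit().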
arch/arm/Kconfig | 7 ++
arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++
arch/arm/include/asm/cpu_pm.h | 54 +++++++++++
arch/arm/kernel/Makefile | 1 +
arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++
arch/arm/vfp/vfpmodule.c | 40 ++++++++
6 files changed, 495 insertions(+), 0 deletions(-)
During some CPU power modes entered during idle, hotplug and
suspend, peripherals located in the CPU power domain, such as
the GIC and VFP, may be powered down. Add a notifier chain
that allows drivers for those peripherals to be notified
before and after they may be reset.
Signed-off-by: Colin Cross <[email protected]>
---
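Not for the commit log: the sketch below is a userspace model of how
the enter path rolls back when one listener refuses the low power
state. It assumes a simplified chain (struct sim_notifier and the sim_*
helpers are made up for the example and are not the kernel's notifier
API); only the nr_calls - 1 logic mirrors cpu_pm_enter() in this patch.

```c
#include <assert.h>
#include <stddef.h>

enum cpu_pm_event { CPU_PM_ENTER, CPU_PM_ENTER_FAILED, CPU_PM_EXIT };

/* Simplified userspace stand-in for the kernel's struct notifier_block. */
struct sim_notifier {
	int (*call)(struct sim_notifier *nb, enum cpu_pm_event event);
	struct sim_notifier *next;
	int saved;	/* nonzero while this listener holds saved context */
	int calls;	/* how many times the callback ran */
};

static struct sim_notifier *sim_chain;

/* Append nb to the end of the chain. */
static void sim_register(struct sim_notifier *nb)
{
	struct sim_notifier **p = &sim_chain;

	assert(nb && nb->call);
	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
}

/*
 * Call at most nr_to_call listeners (-1 means all) and stop at the first
 * error. *nr_calls counts every callback that ran, including the failing one.
 */
static int sim_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
{
	struct sim_notifier *nb;
	int ret = 0;

	if (nr_calls)
		*nr_calls = 0;
	for (nb = sim_chain; nb && nr_to_call; nb = nb->next, nr_to_call--) {
		ret = nb->call(nb, event);
		if (nr_calls)
			(*nr_calls)++;
		if (ret)
			break;
	}
	return ret;
}

/* Same shape as cpu_pm_enter() in this patch, minus the locking. */
static int sim_cpu_pm_enter(void)
{
	int nr_calls;
	int ret;

	ret = sim_notify(CPU_PM_ENTER, -1, &nr_calls);
	if (ret)
		/* undo only the listeners that saved before the failure */
		sim_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
	return ret;
}

/* Listener that saves on ENTER and restores on ENTER_FAILED or EXIT. */
static int ok_listener(struct sim_notifier *nb, enum cpu_pm_event event)
{
	nb->calls++;
	nb->saved = (event == CPU_PM_ENTER);
	return 0;
}

/* Listener that refuses to enter the low power state. */
static int failing_listener(struct sim_notifier *nb, enum cpu_pm_event event)
{
	nb->calls++;
	return event == CPU_PM_ENTER ? -1 : 0;
}
```

With listeners registered in the order ok, failing, ok, the first one is
called twice (save, then undo), the failing one once, and the last one
never sees either event.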
arch/arm/Kconfig | 7 ++
arch/arm/include/asm/cpu_pm.h | 54 ++++++++++++
arch/arm/kernel/Makefile | 1 +
arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++++++++
4 files changed, 243 insertions(+), 0 deletions(-)
create mode 100644 arch/arm/include/asm/cpu_pm.h
create mode 100644 arch/arm/kernel/cpu_pm.c
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9adc278..356f266 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -183,6 +183,13 @@ config FIQ
config ARCH_MTD_XIP
bool
+config ARCH_USES_CPU_PM
+ bool
+
+config CPU_PM
+ def_bool y
+ depends on ARCH_USES_CPU_PM && (PM || CPU_IDLE)
+
config VECTORS_BASE
hex
default 0xffff0000 if MMU || CPU_HIGH_VECTOR
diff --git a/arch/arm/include/asm/cpu_pm.h b/arch/arm/include/asm/cpu_pm.h
new file mode 100644
index 0000000..b4bb715
--- /dev/null
+++ b/arch/arm/include/asm/cpu_pm.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (C) 2011 Google, Inc.
+ *
+ * Author:
+ * Colin Cross <[email protected]>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _ASMARM_CPU_PM_H
+#define _ASMARM_CPU_PM_H
+
+#include <linux/kernel.h>
+#include <linux/notifier.h>
+
+/* Event codes passed as unsigned long val to notifier calls */
+enum cpu_pm_event {
+ /* A single cpu is entering a low power state */
+ CPU_PM_ENTER,
+
+ /* A single cpu failed to enter a low power state */
+ CPU_PM_ENTER_FAILED,
+
+ /* A single cpu is exiting a low power state */
+ CPU_PM_EXIT,
+
+ /* A cpu power domain is entering a low power state */
+ CPU_COMPLEX_PM_ENTER,
+
+ /* A cpu power domain failed to enter a low power state */
+ CPU_COMPLEX_PM_ENTER_FAILED,
+
+ /* A cpu power domain is exiting a low power state */
+ CPU_COMPLEX_PM_EXIT,
+};
+
+int cpu_pm_register_notifier(struct notifier_block *nb);
+int cpu_pm_unregister_notifier(struct notifier_block *nb);
+
+int cpu_pm_enter(void);
+int cpu_pm_exit(void);
+
+int cpu_complex_pm_enter(void);
+int cpu_complex_pm_exit(void);
+
+#endif
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index a5b31af..8b42d58 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -60,6 +60,7 @@ obj-$(CONFIG_CPU_PJ4) += pj4-cp0.o
obj-$(CONFIG_IWMMXT) += iwmmxt.o
obj-$(CONFIG_CPU_HAS_PMU) += pmu.o
obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_CPU_PM) += cpu_pm.o
AFLAGS_iwmmxt.o := -Wa,-mcpu=iwmmxt
ifneq ($(CONFIG_ARCH_EBSA110),y)
diff --git a/arch/arm/kernel/cpu_pm.c b/arch/arm/kernel/cpu_pm.c
new file mode 100644
index 0000000..48a5b53
--- /dev/null
+++ b/arch/arm/kernel/cpu_pm.c
@@ -0,0 +1,181 @@
+/*
+ * Copyright (C) 2011 Google, Inc.
+ *
+ * Author:
+ * Colin Cross <[email protected]>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/spinlock.h>
+
+#include <asm/cpu_pm.h>
+
+/*
+ * When a CPU goes to a low power state that turns off power to the CPU's
+ * power domain, the contents of some blocks (floating point coprocessors,
+ * interrupt controllers, caches, timers) in the same power domain can
+ * be lost. The cpu_pm notifiers provide a method for platform idle, suspend,
+ * and hotplug implementations to notify the drivers for these blocks that
+ * they may be reset.
+ *
+ * All cpu_pm notifications must be called with interrupts disabled.
+ *
+ * The notifications are split into two classes, CPU notifications and CPU
+ * complex notifications.
+ *
+ * CPU notifications apply to a single CPU, and must be called on the affected
+ * CPU. They are used to save per-cpu context for affected blocks.
+ *
+ * CPU complex notifications apply to all CPUs in a single power domain. They
+ * are used to save any global context for affected blocks, and must be called
+ * after all the CPUs in the power domain have been notified of the low power
+ * state.
+ *
+ */
+
+static DEFINE_RWLOCK(cpu_pm_notifier_lock);
+static RAW_NOTIFIER_HEAD(cpu_pm_notifier_chain);
+
+int cpu_pm_register_notifier(struct notifier_block *nb)
+{
+ unsigned long flags;
+ int ret;
+
+ write_lock_irqsave(&cpu_pm_notifier_lock, flags);
+ ret = raw_notifier_chain_register(&cpu_pm_notifier_chain, nb);
+ write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_pm_register_notifier);
+
+int cpu_pm_unregister_notifier(struct notifier_block *nb)
+{
+ unsigned long flags;
+ int ret;
+
+ write_lock_irqsave(&cpu_pm_notifier_lock, flags);
+ ret = raw_notifier_chain_unregister(&cpu_pm_notifier_chain, nb);
+ write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_pm_unregister_notifier);
+
+static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
+{
+ int ret;
+
+ ret = __raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL,
+ nr_to_call, nr_calls);
+
+ return notifier_to_errno(ret);
+}
+
+/**
+ * cpu_pm_enter
+ *
+ * Notifies listeners that a single cpu is entering a low power state that may
+ * cause some blocks in the same power domain as the cpu to reset.
+ *
+ * Must be called on the affected cpu with interrupts disabled. Platform is
+ * responsible for ensuring that cpu_pm_enter is not called twice on the same
+ * cpu before cpu_pm_exit is called.
+ */
+int cpu_pm_enter(void)
+{
+ int nr_calls;
+ int ret = 0;
+
+ read_lock(&cpu_pm_notifier_lock);
+ ret = cpu_pm_notify(CPU_PM_ENTER, -1, &nr_calls);
+ if (ret)
+ cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
+ read_unlock(&cpu_pm_notifier_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_pm_enter);
+
+/**
+ * cpu_pm_exit
+ *
+ * Notifies listeners that a single cpu is exiting a low power state that may
+ * have caused some blocks in the same power domain as the cpu to reset.
+ *
+ * Must be called on the affected cpu with interrupts disabled.
+ */
+int cpu_pm_exit(void)
+{
+ int ret;
+
+ read_lock(&cpu_pm_notifier_lock);
+ ret = cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
+ read_unlock(&cpu_pm_notifier_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_pm_exit);
+
+/**
+ * cpu_complex_pm_enter
+ *
+ * Notifies listeners that all cpus in a power domain are entering a low power
+ * state that may cause some blocks in the same power domain to reset.
+ *
+ * Must be called after cpu_pm_enter has been called on all cpus in the power
+ * domain, and before cpu_pm_exit has been called on any cpu in the power
+ * domain.
+ *
+ * Must be called with interrupts disabled.
+ */
+int cpu_complex_pm_enter(void)
+{
+ int nr_calls;
+ int ret = 0;
+
+ read_lock(&cpu_pm_notifier_lock);
+ ret = cpu_pm_notify(CPU_COMPLEX_PM_ENTER, -1, &nr_calls);
+ if (ret)
+ cpu_pm_notify(CPU_COMPLEX_PM_ENTER_FAILED, nr_calls - 1, NULL);
+ read_unlock(&cpu_pm_notifier_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_complex_pm_enter);
+
+/**
+ * cpu_complex_pm_exit
+ *
+ * Notifies listeners that all cpus in a power domain are exiting a low power
+ * state that may have caused some blocks in the same power domain to reset.
+ *
+ * Must be called after cpu_complex_pm_enter has been called for the power
+ * domain, and before cpu_pm_exit has been called on any cpu in the power
+ * domain.
+ *
+ * Must be called with interrupts disabled.
+ */
+int cpu_complex_pm_exit(void)
+{
+ int ret;
+
+ read_lock(&cpu_pm_notifier_lock);
+ ret = cpu_pm_notify(CPU_COMPLEX_PM_EXIT, -1, NULL);
+ read_unlock(&cpu_pm_notifier_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(cpu_complex_pm_exit);
--
1.7.4.1
When the cpu is powered down in a low power mode, the gic cpu
interface may be reset, and when the cpu complex is powered
down, the gic distributor may also be reset.
This patch uses CPU_PM_ENTER and CPU_PM_EXIT notifiers to save
and restore the gic cpu interface registers, and the
CPU_COMPLEX_PM_ENTER and CPU_COMPLEX_PM_EXIT notifiers to save
and restore the gic distributor registers.
Signed-off-by: Colin Cross <[email protected]>
---
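Not for the commit log: a small userspace sketch of the save-area
arithmetic, assuming the usual GIC register packing (1 enable bit, 2
config bits, 8 priority bits and 8 target bits per interrupt).
gic_words() and the two totals are helper names invented for the
example. With the 1020-interrupt maximum this works out to 606 words
(2424 bytes) per gic_chip_data, plus 11 words (44 bytes) per cpu for the
first 32 interrupt IDs.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* 32-bit registers needed to hold bits_per_irq bits for each of nr_irqs. */
static unsigned int gic_words(unsigned int nr_irqs, unsigned int bits_per_irq)
{
	assert(bits_per_irq && 32 % bits_per_irq == 0);
	return DIV_ROUND_UP(nr_irqs, 32 / bits_per_irq);
}

/* Words saved per distributor: enable, config, priority, target. */
static unsigned int gic_dist_save_words(unsigned int nr_irqs)
{
	return gic_words(nr_irqs, 1) +	/* enable: 1 bit per interrupt */
	       gic_words(nr_irqs, 2) +	/* config: 2 bits per interrupt */
	       gic_words(nr_irqs, 8) +	/* priority: 8 bits per interrupt */
	       gic_words(nr_irqs, 8);	/* target: 8 bits per interrupt */
}

/* Words saved per cpu for the 32 banked interrupt IDs; no target array. */
static unsigned int gic_cpu_save_words(void)
{
	return gic_words(32, 1) + gic_words(32, 2) + gic_words(32, 8);
}
```

The divisors 32, 16 and 4 used in the patch are the same thing written
as interrupts per register: 32 / 1, 32 / 2 and 32 / 8.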
arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 212 insertions(+), 0 deletions(-)
diff --git a/arch/arm/common/gic.c b/arch/arm/common/gic.c
index 4ddd0a6..8d62e07 100644
--- a/arch/arm/common/gic.c
+++ b/arch/arm/common/gic.c
@@ -29,6 +29,7 @@
#include <linux/cpumask.h>
#include <linux/io.h>
+#include <asm/cpu_pm.h>
#include <asm/irq.h>
#include <asm/mach/irq.h>
#include <asm/hardware/gic.h>
@@ -42,6 +43,17 @@ struct gic_chip_data {
unsigned int irq_offset;
void __iomem *dist_base;
void __iomem *cpu_base;
+#ifdef CONFIG_CPU_PM
+ u32 saved_spi_enable[DIV_ROUND_UP(1020, 32)];
+ u32 saved_spi_conf[DIV_ROUND_UP(1020, 16)];
+ u32 saved_spi_pri[DIV_ROUND_UP(1020, 4)];
+ u32 saved_spi_target[DIV_ROUND_UP(1020, 4)];
+ u32 __percpu *saved_ppi_enable;
+ u32 __percpu *saved_ppi_conf;
+ u32 __percpu *saved_ppi_pri;
+#endif
+
+ unsigned int gic_irqs;
};
/*
@@ -283,6 +295,8 @@ static void __init gic_dist_init(struct gic_chip_data *gic,
if (gic_irqs > 1020)
gic_irqs = 1020;
+ gic->gic_irqs = gic_irqs;
+
/*
* Set all global interrupts to be level triggered, active low.
*/
@@ -350,6 +364,203 @@ static void __cpuinit gic_cpu_init(struct gic_chip_data *gic)
writel_relaxed(1, base + GIC_CPU_CTRL);
}
+#ifdef CONFIG_CPU_PM
+/*
+ * Saves the GIC distributor registers during suspend or idle. Must be called
+ * with interrupts disabled but before powering down the GIC. After calling
+ * this function, no interrupts will be delivered by the GIC, and another
+ * platform-specific wakeup source must be enabled.
+ */
+static void gic_dist_save(unsigned int gic_nr)
+{
+ unsigned int gic_irqs;
+ void __iomem *dist_base;
+ int i;
+
+ if (gic_nr >= MAX_GIC_NR)
+ BUG();
+
+ gic_irqs = gic_data[gic_nr].gic_irqs;
+ dist_base = gic_data[gic_nr].dist_base;
+
+ if (!dist_base)
+ return;
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 16); i++)
+ gic_data[gic_nr].saved_spi_conf[i] =
+ readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+ gic_data[gic_nr].saved_spi_pri[i] =
+ readl_relaxed(dist_base + GIC_DIST_PRI + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+ gic_data[gic_nr].saved_spi_target[i] =
+ readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
+ gic_data[gic_nr].saved_spi_enable[i] =
+ readl_relaxed(dist_base + GIC_DIST_ENABLE_SET + i * 4);
+
+ writel_relaxed(0, dist_base + GIC_DIST_CTRL);
+}
+
+/*
+ * Restores the GIC distributor registers during resume or when coming out of
+ * idle. Must be called before enabling interrupts. If a level interrupt
+ * that occurred while the GIC was suspended is still present, it will be
+ * handled normally, but any edge interrupts that occurred will not be seen by
+ * the GIC and need to be handled by the platform-specific wakeup source.
+ */
+static void gic_dist_restore(unsigned int gic_nr)
+{
+ unsigned int gic_irqs;
+ unsigned int i;
+ void __iomem *dist_base;
+
+ if (gic_nr >= MAX_GIC_NR)
+ BUG();
+
+ gic_irqs = gic_data[gic_nr].gic_irqs;
+ dist_base = gic_data[gic_nr].dist_base;
+
+ if (!dist_base)
+ return;
+
+ writel_relaxed(0, dist_base + GIC_DIST_CTRL);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 16); i++)
+ writel_relaxed(gic_data[gic_nr].saved_spi_conf[i],
+ dist_base + GIC_DIST_CONFIG + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+ writel_relaxed(gic_data[gic_nr].saved_spi_pri[i],
+ dist_base + GIC_DIST_PRI + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+ writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
+ dist_base + GIC_DIST_TARGET + i * 4);
+
+ for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
+ writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
+ dist_base + GIC_DIST_ENABLE_SET + i * 4);
+
+ writel_relaxed(1, dist_base + GIC_DIST_CTRL);
+}
+
+static void gic_cpu_save(unsigned int gic_nr)
+{
+ int i;
+ u32 *ptr;
+ void __iomem *dist_base;
+ void __iomem *cpu_base;
+
+ if (gic_nr >= MAX_GIC_NR)
+ BUG();
+
+ dist_base = gic_data[gic_nr].dist_base;
+ cpu_base = gic_data[gic_nr].cpu_base;
+
+ if (!dist_base || !cpu_base)
+ return;
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+ for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
+ ptr[i] = readl_relaxed(dist_base + GIC_DIST_ENABLE_SET + i * 4);
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+ for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
+ ptr[i] = readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_pri);
+ for (i = 0; i < DIV_ROUND_UP(32, 4); i++)
+ ptr[i] = readl_relaxed(dist_base + GIC_DIST_PRI + i * 4);
+}
+
+static void gic_cpu_restore(unsigned int gic_nr)
+{
+ int i;
+ u32 *ptr;
+ void __iomem *dist_base;
+ void __iomem *cpu_base;
+
+ if (gic_nr >= MAX_GIC_NR)
+ BUG();
+
+ dist_base = gic_data[gic_nr].dist_base;
+ cpu_base = gic_data[gic_nr].cpu_base;
+
+ if (!dist_base || !cpu_base)
+ return;
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+ for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
+ writel_relaxed(ptr[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+ for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
+ writel_relaxed(ptr[i], dist_base + GIC_DIST_CONFIG + i * 4);
+
+ ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_pri);
+ for (i = 0; i < DIV_ROUND_UP(32, 4); i++)
+ writel_relaxed(ptr[i], dist_base + GIC_DIST_PRI + i * 4);
+
+ writel_relaxed(0xf0, cpu_base + GIC_CPU_PRIMASK);
+ writel_relaxed(1, cpu_base + GIC_CPU_CTRL);
+}
+
+static int gic_notifier(struct notifier_block *self, unsigned long cmd, void *v)
+{
+ int i;
+
+ for (i = 0; i < MAX_GIC_NR; i++) {
+ switch (cmd) {
+ case CPU_PM_ENTER:
+ gic_cpu_save(i);
+ break;
+ case CPU_PM_ENTER_FAILED:
+ case CPU_PM_EXIT:
+ gic_cpu_restore(i);
+ break;
+ case CPU_COMPLEX_PM_ENTER:
+ gic_dist_save(i);
+ break;
+ case CPU_COMPLEX_PM_ENTER_FAILED:
+ case CPU_COMPLEX_PM_EXIT:
+ gic_dist_restore(i);
+ break;
+ }
+ }
+
+ return NOTIFY_OK;
+}
+
+static struct notifier_block gic_notifier_block = {
+ .notifier_call = gic_notifier,
+};
+
+static void __init gic_cpu_pm_init(struct gic_chip_data *gic)
+{
+ gic->saved_ppi_enable = __alloc_percpu(DIV_ROUND_UP(32, 32) * 4,
+ sizeof(u32));
+ BUG_ON(!gic->saved_ppi_enable);
+
+ gic->saved_ppi_conf = __alloc_percpu(DIV_ROUND_UP(32, 16) * 4,
+ sizeof(u32));
+ BUG_ON(!gic->saved_ppi_conf);
+
+ gic->saved_ppi_pri = __alloc_percpu(DIV_ROUND_UP(32, 4) * 4,
+ sizeof(u32));
+ BUG_ON(!gic->saved_ppi_pri);
+
+ cpu_pm_register_notifier(&gic_notifier_block);
+}
+#else
+static void __init gic_cpu_pm_init(struct gic_chip_data *gic)
+{
+}
+#endif
+
void __init gic_init(unsigned int gic_nr, unsigned int irq_start,
void __iomem *dist_base, void __iomem *cpu_base)
{
@@ -367,6 +578,7 @@ void __init gic_init(unsigned int gic_nr, unsigned int irq_start,
gic_dist_init(gic, irq_start);
gic_cpu_init(gic);
+ gic_cpu_pm_init(gic);
}
void __cpuinit gic_secondary_init(unsigned int gic_nr)
--
1.7.4.1
When the cpu is powered down in a low power mode, the vfp
registers may be reset.
This patch uses CPU_PM_ENTER and CPU_PM_EXIT notifiers to save
and restore the cpu's vfp registers.
Signed-off-by: Colin Cross <[email protected]>
---
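Not for the commit log: a userspace model of why clearing
last_VFP_context and leaving the VFP disabled is enough to get the
state back after power loss. Everything here is simulated (the sim_*
names, hw_regs and the four-register state are assumptions made for the
example); the real trap path in vfpmodule.c does considerably more.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FPEXC_EN (1u << 30)	/* VFP enable bit in FPEXC */

struct sim_vfp_state { unsigned int regs[4]; };

static unsigned int hw_fpexc;			/* simulated FPEXC */
static struct sim_vfp_state hw_regs;		/* simulated register bank */
static struct sim_vfp_state *last_context;	/* owner of the live registers */

/* CPU_PM_ENTER: flush live registers to their owner and drop ownership. */
static void sim_vfp_pm_enter(void)
{
	if (last_context) {
		*last_context = hw_regs;
		last_context = NULL;	/* force a reload on next use */
		hw_fpexc &= ~FPEXC_EN;
	}
}

/* CPU_PM_EXIT or CPU_PM_ENTER_FAILED: leave the VFP disabled. */
static void sim_vfp_pm_exit(void)
{
	hw_fpexc &= ~FPEXC_EN;
}

/* Power domain off: hardware state is lost. */
static void sim_power_cycle(void)
{
	memset(&hw_regs, 0, sizeof(hw_regs));
	hw_fpexc = 0;
}

/* First VFP instruction while disabled: enable and load the thread's state. */
static void sim_vfp_trap(struct sim_vfp_state *thread)
{
	assert(thread != NULL);
	hw_fpexc |= FPEXC_EN;
	if (last_context != thread) {
		if (last_context)
			*last_context = hw_regs;
		hw_regs = *thread;
		last_context = thread;
	}
}
```

Because ownership is dropped on enter, the first VFP use after exit
always takes the reload branch, so nothing has to be restored eagerly in
the exit notifier.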
arch/arm/vfp/vfpmodule.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index f25e7ec..6f08dbe 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -21,6 +21,7 @@
#include <asm/cputype.h>
#include <asm/thread_notify.h>
#include <asm/vfp.h>
+#include <asm/cpu_pm.h>
#include "vfpinstr.h"
#include "vfp.h"
@@ -169,6 +170,44 @@ static struct notifier_block vfp_notifier_block = {
.notifier_call = vfp_notifier,
};
+#ifdef CONFIG_CPU_PM
+static int vfp_cpu_pm_notifier(struct notifier_block *self, unsigned long cmd,
+ void *v)
+{
+ u32 fpexc = fmrx(FPEXC);
+ unsigned int cpu = smp_processor_id();
+
+ switch (cmd) {
+ case CPU_PM_ENTER:
+ if (last_VFP_context[cpu]) {
+ fmxr(FPEXC, fpexc | FPEXC_EN);
+ vfp_save_state(last_VFP_context[cpu], fpexc);
+ /* force a reload when coming back from idle */
+ last_VFP_context[cpu] = NULL;
+ fmxr(FPEXC, fpexc & ~FPEXC_EN);
+ }
+ break;
+ case CPU_PM_ENTER_FAILED:
+ case CPU_PM_EXIT:
+ /* make sure VFP is disabled when leaving idle */
+ fmxr(FPEXC, fpexc & ~FPEXC_EN);
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block vfp_cpu_pm_notifier_block = {
+ .notifier_call = vfp_cpu_pm_notifier,
+};
+
+static void vfp_cpu_pm_init(void)
+{
+ cpu_pm_register_notifier(&vfp_cpu_pm_notifier_block);
+}
+#else
+static inline void vfp_cpu_pm_init(void) { }
+#endif
+
/*
* Raise a SIGFPE for the current process.
* sicode describes the signal being raised.
@@ -563,6 +602,7 @@ static int __init vfp_init(void)
vfp_vector = vfp_support_entry;
thread_register_notifier(&vfp_notifier_block);
+ vfp_cpu_pm_init();
vfp_pm_init();
/*
--
1.7.4.1
On 06/12/2011 07:43 PM, Colin Cross wrote:
> During some CPU power modes entered during idle, hotplug and
> suspend, peripherals located in the CPU power domain, such as
> the GIC and VFP, may be powered down. Add a notifier chain
> that allows drivers for those peripherals to be notified
> before and after they may be reset.
>
> Signed-off-by: Colin Cross <[email protected]>
> ---
> arch/arm/Kconfig | 7 ++
> arch/arm/include/asm/cpu_pm.h | 54 ++++++++++++
> arch/arm/kernel/Makefile | 1 +
> arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 243 insertions(+), 0 deletions(-)
> create mode 100644 arch/arm/include/asm/cpu_pm.h
> create mode 100644 arch/arm/kernel/cpu_pm.c
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 9adc278..356f266 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -183,6 +183,13 @@ config FIQ
> config ARCH_MTD_XIP
> bool
>
> +config ARCH_USES_CPU_PM
> + bool
> +
> +config CPU_PM
> + def_bool y
> + depends on ARCH_USES_CPU_PM && (PM || CPU_IDLE)
> +
> config VECTORS_BASE
> hex
> default 0xffff0000 if MMU || CPU_HIGH_VECTOR
> diff --git a/arch/arm/include/asm/cpu_pm.h b/arch/arm/include/asm/cpu_pm.h
> new file mode 100644
> index 0000000..b4bb715
> --- /dev/null
> +++ b/arch/arm/include/asm/cpu_pm.h
> @@ -0,0 +1,54 @@
> +/*
> + * Copyright (C) 2011 Google, Inc.
> + *
> + * Author:
> + * Colin Cross <[email protected]>
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#ifndef _ASMARM_CPU_PM_H
> +#define _ASMARM_CPU_PM_H
> +
> +#include <linux/kernel.h>
> +#include <linux/notifier.h>
> +
> +/* Event codes passed as unsigned long val to notifier calls */
> +enum cpu_pm_event {
> + /* A single cpu is entering a low power state */
> + CPU_PM_ENTER,
> +
> + /* A single cpu failed to enter a low power state */
> + CPU_PM_ENTER_FAILED,
> +
> + /* A single cpu is exiting a low power state */
> + CPU_PM_EXIT,
> +
> + /* A cpu power domain is entering a low power state */
> + CPU_COMPLEX_PM_ENTER,
> +
> + /* A cpu power domain failed to enter a low power state */
> + CPU_COMPLEX_PM_ENTER_FAILED,
> +
> + /* A cpu power domain is exiting a low power state */
> + CPU_COMPLEX_PM_EXIT,
> +};
> +
> +int cpu_pm_register_notifier(struct notifier_block *nb);
> +int cpu_pm_unregister_notifier(struct notifier_block *nb);
> +
> +int cpu_pm_enter(void);
> +int cpu_pm_exit(void);
> +
> +int cpu_complex_pm_enter(void);
> +int cpu_complex_pm_exit(void);
> +
> +#endif
> diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
> index a5b31af..8b42d58 100644
> --- a/arch/arm/kernel/Makefile
> +++ b/arch/arm/kernel/Makefile
> @@ -60,6 +60,7 @@ obj-$(CONFIG_CPU_PJ4) += pj4-cp0.o
> obj-$(CONFIG_IWMMXT) += iwmmxt.o
> obj-$(CONFIG_CPU_HAS_PMU) += pmu.o
> obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
> +obj-$(CONFIG_CPU_PM) += cpu_pm.o
> AFLAGS_iwmmxt.o := -Wa,-mcpu=iwmmxt
>
> ifneq ($(CONFIG_ARCH_EBSA110),y)
> diff --git a/arch/arm/kernel/cpu_pm.c b/arch/arm/kernel/cpu_pm.c
> new file mode 100644
> index 0000000..48a5b53
> --- /dev/null
> +++ b/arch/arm/kernel/cpu_pm.c
> @@ -0,0 +1,181 @@
> +/*
> + * Copyright (C) 2011 Google, Inc.
> + *
> + * Author:
> + * Colin Cross <[email protected]>
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/notifier.h>
> +#include <linux/spinlock.h>
> +
> +#include <asm/cpu_pm.h>
> +
> +/*
> + * When a CPU goes to a low power state that turns off power to the CPU's
> + * power domain, the contents of some blocks (floating point coprocessors,
> + * interrupt controllers, caches, timers) in the same power domain can
> + * be lost. The cpu_pm notifiers provide a method for platform idle, suspend,
> + * and hotplug implementations to notify the drivers for these blocks that
> + * they may be reset.
> + *
> + * All cpu_pm notifications must be called with interrupts disabled.
> + *
> + * The notifications are split into two classes, CPU notifications and CPU
> + * complex notifications.
> + *
> + * CPU notifications apply to a single CPU, and must be called on the affected
> + * CPU. They are used to save per-cpu context for affected blocks.
> + *
> + * CPU complex notifications apply to all CPUs in a single power domain. They
> + * are used to save any global context for affected blocks, and must be called
> + * after all the CPUs in the power domain have been notified of the low power
> + * state.
> + *
> + */
> +
> +static DEFINE_RWLOCK(cpu_pm_notifier_lock);
> +static RAW_NOTIFIER_HEAD(cpu_pm_notifier_chain);
> +
> +int cpu_pm_register_notifier(struct notifier_block *nb)
> +{
> + unsigned long flags;
> + int ret;
> +
> + write_lock_irqsave(&cpu_pm_notifier_lock, flags);
> + ret = raw_notifier_chain_register(&cpu_pm_notifier_chain, nb);
> + write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_pm_register_notifier);
> +
> +int cpu_pm_unregister_notifier(struct notifier_block *nb)
> +{
> + unsigned long flags;
> + int ret;
> +
> + write_lock_irqsave(&cpu_pm_notifier_lock, flags);
> + ret = raw_notifier_chain_unregister(&cpu_pm_notifier_chain, nb);
> + write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_pm_unregister_notifier);
> +
> +static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
> +{
> + int ret;
> +
> + ret = __raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL,
> + nr_to_call, nr_calls);
> +
> + return notifier_to_errno(ret);
> +}
> +
> +/**
> + * cpu_pm_enter
> + *
> + * Notifies listeners that a single cpu is entering a low power state that may
> + * cause some blocks in the same power domain as the cpu to reset.
> + *
> + * Must be called on the affected cpu with interrupts disabled. Platform is
> + * responsible for ensuring that cpu_pm_enter is not called twice on the same
> + * cpu before cpu_pm_exit is called.
> + */
> +int cpu_pm_enter(void)
> +{
> + int nr_calls;
> + int ret = 0;
> +
> + read_lock(&cpu_pm_notifier_lock);
> + ret = cpu_pm_notify(CPU_PM_ENTER, -1, &nr_calls);
> + if (ret)
> + cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
> + read_unlock(&cpu_pm_notifier_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_pm_enter);
> +
> +/**
> + * cpu_pm_exit
> + *
> + * Notifies listeners that a single cpu is exiting a low power state that may
> + * have caused some blocks in the same power domain as the cpu to reset.
> + *
> + * Must be called on the affected cpu with interrupts disabled.
> + */
> +int cpu_pm_exit(void)
> +{
> + int ret;
> +
> + read_lock(&cpu_pm_notifier_lock);
> + ret = cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
> + read_unlock(&cpu_pm_notifier_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_pm_exit);
> +
> +/**
> + * cpu_complex_pm_enter
> + *
> + * Notifies listeners that all cpus in a power domain are entering a low power
> + * state that may cause some blocks in the same power domain to reset.
> + *
> + * Must be called after cpu_pm_enter has been called on all cpus in the power
> + * domain, and before cpu_pm_exit has been called on any cpu in the power
> + * domain.
> + *
> + * Must be called with interrupts disabled.
> + */
> +int cpu_complex_pm_enter(void)
> +{
> + int nr_calls;
> + int ret = 0;
> +
> + read_lock(&cpu_pm_notifier_lock);
> + ret = cpu_pm_notify(CPU_COMPLEX_PM_ENTER, -1, &nr_calls);
> + if (ret)
> + cpu_pm_notify(CPU_COMPLEX_PM_ENTER_FAILED, nr_calls - 1, NULL);
> + read_unlock(&cpu_pm_notifier_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_complex_pm_enter);
> +
> +/**
> + * cpu_complex_pm_exit
> + *
> + * Notifies listeners that all cpus in a power domain are exiting a low power
> + * state that may have caused some blocks in the same power domain to reset.
> + *
> + * Must be called after cpu_complex_pm_enter has been called for the power
> + * domain, and before cpu_pm_exit has been called on any cpu in the power
> + * domain.
> + *
> + * Must be called with interrupts disabled.
> + */
> +int cpu_complex_pm_exit(void)
> +{
> + int ret;
> +
> + read_lock(&cpu_pm_notifier_lock);
> + ret = cpu_pm_notify(CPU_COMPLEX_PM_EXIT, -1, NULL);
> + read_unlock(&cpu_pm_notifier_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(cpu_complex_pm_exit);
Nothing really ARM specific here, so why not make this generic for all
architectures?
Rob
Hi Colin,
On Mon, Jun 13, 2011 at 01:43:44AM +0100, Colin Cross wrote:
> When the cpu is powered down in a low power mode, the gic cpu
> interface may be reset, and when the cpu complex is powered
> down, the gic distributor may also be reset.
>
> This patch uses CPU_PM_ENTER and CPU_PM_EXIT notifiers to save
> and restore the gic cpu interface registers, and the
> CPU_COMPLEX_PM_ENTER and CPU_COMPLEX_PM_EXIT notifiers to save
> and restore the gic distributor registers.
>
> Signed-off-by: Colin Cross <[email protected]>
> ---
> arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 212 insertions(+), 0 deletions(-)
>
[...]
> +
> +static void __init gic_cpu_pm_init(struct gic_chip_data *gic)
> +{
> + gic->saved_ppi_enable = __alloc_percpu(DIV_ROUND_UP(32, 32) * 4,
> + sizeof(u32));
> + BUG_ON(!gic->saved_ppi_enable);
> +
> + gic->saved_ppi_conf = __alloc_percpu(DIV_ROUND_UP(32, 16) * 4,
> + sizeof(u32));
> + BUG_ON(!gic->saved_ppi_conf);
> +
> + gic->saved_ppi_pri = __alloc_percpu(DIV_ROUND_UP(32, 4) * 4,
> + sizeof(u32));
> + BUG_ON(!gic->saved_ppi_pri);
> +
> + cpu_pm_register_notifier(&gic_notifier_block);
> +}
> +#else
> +static void __init gic_cpu_pm_init(struct gic_chip_data *gic)
> +{
> +}
> +#endif
> +
> void __init gic_init(unsigned int gic_nr, unsigned int irq_start,
> void __iomem *dist_base, void __iomem *cpu_base)
> {
> @@ -367,6 +578,7 @@ void __init gic_init(unsigned int gic_nr, unsigned int irq_start,
>
> gic_dist_init(gic, irq_start);
> gic_cpu_init(gic);
> + gic_cpu_pm_init(gic);
> }
>
>
I have been using this patchset for a while and it works perfectly fine.
I reckon one thing we could improve is letting a subsystem register depending
on a flag set at init time. For now, cpu_pm_enter() is a save "all default
registered subsystems or nothing" solution; maybe we could add this level of
configurability as an improvement (the GIC being a good example, see OMAP).
Lorenzo
On Monday, June 13, 2011, Colin Cross wrote:
> This patch set tries to address Russell's concerns with platform
> pm code calling into the driver for every block in the Cortex A9s
> during idle, hotplug, and suspend. The first patch adds cpu pm
> notifiers that can be called by platform code, the second uses
> the notifier to save and restore the GIC state, and the third
> saves the VFP state.
>
> The notifiers are used for two types of events, CPU PM events and
> CPU complex PM events. CPU PM events are used to save and restore
> per-cpu context when a single CPU is preparing to enter or has
> just exited a low power state. For example, the VFP saves the
> last thread context, and the GIC saves banked CPU registers.
>
> CPU complex events are used after all the CPUs in a power domain
> have been prepared for the low power state. The GIC uses these
> events to save global register state.
>
> Platforms that call the cpu_pm APIs must select
> CONFIG_ARCH_USES_CPU_PM
>
> L2 cache is not covered by this patch set, as the determination
> of when the L2 is reset and when it is retained is
> platform-specific, and most of the APIs necessary are already
> present.
>
> arch/arm/Kconfig | 7 ++
> arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++
> arch/arm/include/asm/cpu_pm.h | 54 +++++++++++
> arch/arm/kernel/Makefile | 1 +
> arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++
Is there any reason why this has to be ARM-specific? There are other
architectures where this kind of feature might make sense (SH and
powerpc at least).
Besides, is there any overlap between this feature and the CPU hotplug
notifiers?
Rafael
On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
> On Monday, June 13, 2011, Colin Cross wrote:
>> [...]
>
> Is there any reason why this has to be ARM-specific? There are other
> architectures where this kind of feature might make sense (SH and
> powerpc at least).
Nothing other than there are currently no adaptations for any drivers
besides ARM, but I can move it somewhere outside ARM. Any suggestions
where?
> Besides, is there any overlap between this feature and the CPU hotplug
> notifiers?
I don't think so - the hotplug notifiers are used when a CPU is being
removed from the system, so no saving and restoring is necessary - the
CPU will be rebooted from scratch. They are used by systems outside
the CPU that need to know that a CPU no longer exists.
CPU PM notifiers are used when a CPU is going through reset in a way
that should be transparent to most of the system. Only drivers within
the CPU itself need a notification. Note that both ARM drivers I
modified did not have a register_cpu_notifier call.
> Rafael
>
Colin Cross <[email protected]> writes:
> During some CPU power modes entered during idle, hotplug and
> suspend, peripherals located in the CPU power domain, such as
> the GIC and VFP, may be powered down. Add a notifier chain
> that allows drivers for those peripherals to be notified
> before and after they may be reset.
>
> Signed-off-by: Colin Cross <[email protected]>
This is great, thanks!
I had hacked up something OMAP-specific a while back to do something
similar, and have been meaning to make it more generic, so this is
perfect. Also, if it is moved outside ARM, note that x86_64 has an
idle_notifier infrastructure that is somewhat similar, and if you're
motivated, it should probably be converted to this as well. See
arch/x86/kernel/process_64.c.
Also, for the sake of the comments/changelog, the usefulness of these
notifiers is not limited to low-power states where power is off and IP
blocks may be reset. It could be considered as simply a generic
notification mechanism for any CPU PM transitions.
On OMAP we have other peripherals (not in the CPU power domain) where we
need to control their PM transitions in relation to the CPU PM
transitions so the notifiers are useful for any low-power state
transition of the CPU(s). The drivers for these peripherals use runtime
PM in their CPU PM notifiers to trigger the device PM transitions. (The
drivers must use the synchronous versions of the runtime PM get/put
calls with device in pm_runtime_irq_safe() mode.)
In this series, I don't see any calls to cpu_[complex_]pm_[enter|exit]().
Based on that, I assume you prefer to leave it up to platform-specific
idle/PM code to place these calls. Or, do you plan to have this
triggered by generic CPUidle, suspend/resume and/or hotplug code? At
least on OMAP, I prefer having the calls in platform-specific code.
I have a minor enhancement request. The notifier callbacks provide for
passing a void * through the notifier chain. Could you add a way for a
void * to be registered at cpu_pm_register_notifier() time and that
would be passed through the notifier call chain? This would allow using
the same struct notifier_block and callback for multiple instances of
the same IP, and the instances could be differentiated in the callback
using the void *.
Also, FWIW I tested this on OMAP3 (Cortex-A8 UP) using
cpu_pm_enter/exit() in the code path shared between idle and suspend. I
successfully triggered PM transitions in non-CPU power-domain
peripherals, and it worked great.
Some other minor comments below...
[...]
> diff --git a/arch/arm/kernel/cpu_pm.c b/arch/arm/kernel/cpu_pm.c
> new file mode 100644
> index 0000000..48a5b53
> --- /dev/null
> +++ b/arch/arm/kernel/cpu_pm.c
> @@ -0,0 +1,181 @@
> +/*
> + * Copyright (C) 2011 Google, Inc.
> + *
> + * Author:
> + * Colin Cross <[email protected]>
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/notifier.h>
> +#include <linux/spinlock.h>
> +
> +#include <asm/cpu_pm.h>
> +
> +/*
> + * When a CPU goes to a low power state that turns off power to the CPU's
> + * power domain, the contents of some blocks (floating point coprocessors,
> + * interrutp controllers, caches, timers) in the same power domain can
typo: interrutp
> + * be lost. The cpm_pm notifiers provide a method for platform idle, suspend,
> + * and hotplug implementations to notify the drivers for these blocks that
> + * they may be reset.
> + *
> + * All cpu_pm notifications must be called with interrupts disabled.
> + *
> + * The notifications are split into two classes, CPU notifications and CPU
> + * complex notifications.
> + *
> + * CPU notifications apply to a single CPU, and must be called on the affected
> + * CPU. They are used to save per-cpu context for affected blocks.
> + *
> + * CPU complex notifications apply to all CPUs in a single power domain. They
> + * are used to save any global context for affected blocks, and must be called
> + * after all the CPUs in the power domain have been notified of the low power
> + * state.
> + *
> + */
Not directly related to this series, but I'm a bit confused on terminology.
I've seen both 'CPU complex' and 'CPU cluster' used in ARM SMP land, but
haven't seen precise definitions of either. Does one imply all CPUs,
and the other imply all CPUs in the same power domain?
Kevin
On Mon, Jun 13, 2011 at 3:17 PM, Kevin Hilman <[email protected]> wrote:
> Colin Cross <[email protected]> writes:
>
>> During some CPU power modes entered during idle, hotplug and
>> suspend, peripherals located in the CPU power domain, such as
>> the GIC and VFP, may be powered down. Add a notifier chain
>> that allows drivers for those peripherals to be notified
>> before and after they may be reset.
>>
>> Signed-off-by: Colin Cross <[email protected]>
>
> This is great, thanks!
>
> I had hacked up something OMAP-specific a while back to do something
> similar, and have been meaning to make it more generic, so this is
> perfect. Also, if it is moved outside ARM, note that x86_64 has an
> idle_notifier infrastructure that is somewhat similar, and if you're
> motivated, it should probably be converted to this as well. See
> arch/x86/kernel/process_64.c.
I'll take a look at x86
> Also, for the sake of the comments/changelog, the usefulness of these
> notifiers is not limited to low-power states where power is off and IP
> blocks may be reset. It could be considered as simply a generic
> notification mechanism for any CPU PM transitions.
>
> On OMAP we have other peripherals (not in the CPU power domain) where we
> need to control their PM transitions in relation to the CPU PM
> transitions so the notifiers are useful for any low-power state
> transition of the CPU(s). The drivers for these peripherals use runtime
> PM in their CPU PM notifiers to trigger the device PM transitions. (The
> drivers must use the synchronous versions of the runtime PM get/put
> calls with device in pm_runtime_irq_safe() mode.)
Can you give a concrete example of this so I can describe it correctly?
> In this series, I don't see any calls to cpu_[complex_]pm_[enter|exit]().
> Based on that, I assume you prefer to leave it up to platform-specific
> idle/PM code to place these calls. Or, do you plan to have this
> triggered by generic CPUidle, suspend/resume and/or hotplug code? At
> least on OMAP, I prefer having the calls in platform-specific code.
I will post the Tegra code that uses this soon. I expect that the
decision on exactly when to call these functions will be unique to
each platform, so I think it should start in the platform-specific
code.
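
To make the intended call ordering concrete, here is a minimal user-space
sketch of how platform-specific idle code might place the calls: each CPU
calls cpu_pm_enter() for its own context, and only the last CPU going down
calls cpu_complex_pm_enter(). The four function names come from this
series, but their bodies and void signatures here are stubs chosen for
illustration; NR_CPUS, platform_cpu_suspend() and platform_cpu_resume()
are hypothetical, and real code would need locking around the shared
state.

```c
#include <stdbool.h>

#define NR_CPUS 2	/* hypothetical cluster size */

/* Call counters standing in for the real notifier chains. */
static int cpu_enter_calls, cpu_exit_calls;
static int cluster_enter_calls, cluster_exit_calls;

/* Stubs named after the functions in this series; bodies are placeholders. */
static void cpu_pm_enter(void)         { cpu_enter_calls++; }
static void cpu_pm_exit(void)          { cpu_exit_calls++; }
static void cpu_complex_pm_enter(void) { cluster_enter_calls++; }
static void cpu_complex_pm_exit(void)  { cluster_exit_calls++; }

/* Which CPUs are currently in the low power state. */
static bool cpu_in_lp[NR_CPUS];

static int cpus_in_lp(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_in_lp[cpu])
			n++;
	return n;
}

/* Hypothetical platform hook: called with interrupts disabled on 'cpu'. */
static void platform_cpu_suspend(int cpu)
{
	cpu_pm_enter();			/* save per-cpu context */
	cpu_in_lp[cpu] = true;
	if (cpus_in_lp() == NR_CPUS)	/* last CPU down saves global state */
		cpu_complex_pm_enter();
}

/* Hypothetical platform hook: called on 'cpu' right after it wakes up. */
static void platform_cpu_resume(int cpu)
{
	if (cpus_in_lp() == NR_CPUS)	/* first CPU up restores global state */
		cpu_complex_pm_exit();
	cpu_in_lp[cpu] = false;
	cpu_pm_exit();			/* restore per-cpu context */
}
```

The sketch only tracks a single cluster; when exactly a platform decides
that the whole power domain is going down is the platform-specific part.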
> I have a minor enhancement request. The notifier callbacks provide for
> passing a void * through the notifier chain. Could you add a way for a
> void * to be registered at cpu_pm_register_notifier() time and that
> would be passed through the notifier call chain? This would allow using
> the same struct notifier_block and callback for multiple instances of
> the same IP, and the instances could be differentiated in the callback
> using the void *.
The void * passed to the notifier comes from the call to
notifier_call_chain(), not from the call to register_notifier(). You
can get the behavior you want by putting the notifier_block inside
your device struct and using container_of on the notifier_block.
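
A minimal user-space sketch of that pattern, for illustration only: the
struct notifier_block and container_of() below are simplified stand-ins
for the kernel definitions, and struct my_ip_dev, my_ip_notify() and
my_ip_dev_init() are hypothetical names. The point is that one callback
serves several instances, and each instance is recovered from the
notifier_block pointer rather than from the void * argument.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical per-instance device state with an embedded notifier_block. */
struct my_ip_dev {
	int id;
	unsigned long last_event;
	struct notifier_block nb;
};

/* One callback shared by all instances; the instance is recovered from nb. */
static int my_ip_notify(struct notifier_block *nb, unsigned long event,
			void *data)
{
	struct my_ip_dev *dev = container_of(nb, struct my_ip_dev, nb);

	(void)data;	/* supplied by the caller of the chain, unused here */
	dev->last_event = event;
	return dev->id;
}

/* Hypothetical init helper: point the embedded block at the shared callback. */
static void my_ip_dev_init(struct my_ip_dev *dev, int id)
{
	dev->id = id;
	dev->last_event = 0;
	dev->nb.notifier_call = my_ip_notify;
}
```

Each instance would then register its own embedded nb, so no per-instance
void * needs to be stored by the notifier core.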
> Also, FWIW I tested this on OMAP3 (Cortex-A8 UP) using
> cpu_pm_enter/exit() in the code path shared between idle and suspend. I
> successfully triggered PM transitions in non-CPU power-domain
> peripherals, and it worked great.
Great! Can I get a tested-by?
> Some other minor comments below...
>
> [...]
>
>> diff --git a/arch/arm/kernel/cpu_pm.c b/arch/arm/kernel/cpu_pm.c
>> new file mode 100644
>> index 0000000..48a5b53
>> --- /dev/null
>> +++ b/arch/arm/kernel/cpu_pm.c
>> @@ -0,0 +1,181 @@
>> [...]
>> +/*
>> + * When a CPU goes to a low power state that turns off power to the CPU's
>> + * power domain, the contents of some blocks (floating point coprocessors,
>> + * interrutp controllers, caches, timers) in the same power domain can
>
> typo: interrutp
Will fix
>> + * be lost. The cpm_pm notifiers provide a method for platform idle, suspend,
>> + * and hotplug implementations to notify the drivers for these blocks that
>> + * they may be reset.
>> + *
>> + * All cpu_pm notifications must be called with interrupts disabled.
>> + *
>> + * The notifications are split into two classes, CPU notifications and CPU
>> + * complex notifications.
>> + *
>> + * CPU notifications apply to a single CPU, and must be called on the affected
>> + * CPU. They are used to save per-cpu context for affected blocks.
>> + *
>> + * CPU complex notifications apply to all CPUs in a single power domain. They
>> + * are used to save any global context for affected blocks, and must be called
>> + * after all the CPUs in the power domain have been notified of the low power
>> + * state.
>> + *
>> + */
>
> Not directly related to this series, but I'm a bit confused on terminology.
>
> I've seen both 'CPU complex' and 'CPU cluster' used in ARM SMP land, but
> haven't seen precise definitions of either. Does one imply all CPUs,
> and the other imply all CPUs in the same power domain?
'CPU complex' is the terminology I cribbed from some of nVidia's Tegra
patches, but I think 'CPU cluster' is a better term for what I mean -
the group of CPUs that share some external state, like the L2 or GIC
distributor. In practice, all existing platforms seem to have a
single cluster (as far as Linux is concerned). The ARM ARM uses
'cluster', but doesn't directly define it, and doesn't use 'CPU
complex' at all.
On Monday, June 13, 2011, Colin Cross wrote:
> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
> > On Monday, June 13, 2011, Colin Cross wrote:
> >> [...]
> >
> > Is there any reason why this has to be ARM-specific? There are other
> > architectures where this kind of feature might make sense (SH and
> > powerpc at least).
>
> Nothing other than there are currently no adaptations for any drivers
> besides ARM, but I can move it somewhere outside ARM. Any suggestions
> where?
Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
code at the moment, so it looks like a good place.
> > Besides, is there any overlap between this feature and the CPU hotplug
> > notifiers?
>
> I don't think so - the hotplug notifiers are used when a CPU is being
> removed from the system, so no saving and restoring is necessary - the
> CPU will be rebooted from scratch. They are used by systems outside
> the CPU that need to know that a CPU no longer exists.
>
> CPU PM notifiers are used when a CPU is going through reset in a way
> that should be transparent to most of the system.
Do I guess correctly that you mean cpuidle?
Rafael
On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Monday, June 13, 2011, Colin Cross wrote:
>> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
>> > On Monday, June 13, 2011, Colin Cross wrote:
>> >> [...]
>> >
>> > Is there any reason why this has to be ARM-specific? There are other
>> > architectures where this kind of feature might make sense (SH and
>> > powerpc at least).
>>
>> Nothing other than there are currently no adaptations for any drivers
>> besides ARM, but I can move it somewhere outside ARM. Any suggestions
>> where?
>
> Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
> code at the moment, so it looks like a good place.
OK, I'll look at moving it there.
>> > Besides, is there any overlap between this feature and the CPU hotplug
>> > notifiers?
>>
>> I don't think so - the hotplug notifiers are used when a CPU is being
>> removed from the system, so no saving and restoring is necessary - the
>> CPU will be rebooted from scratch. They are used by systems outside
>> the CPU that need to know that a CPU no longer exists.
>>
>> CPU PM notifiers are used when a CPU is going through reset in a way
>> that should be transparent to most of the system.
>
> Do I guess correctly that you mean cpuidle?
cpuidle is the major user, but primary CPUs in suspend have to save
and restore the same blocks, and tend to use the same platform sleep
code as idle, so it's logical to use the notifiers for both. On the
other hand, some drivers that would use cpu_pm notifiers already use
syscore ops to handle suspend and resume (like vfp) - maybe these
notifiers should only be used in cpuidle, and syscore ops added to the
gic driver? I could also convert the notifiers to new syscore_ops -
cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
don't know how well that fits in to the intention for syscore.
On Tuesday, June 14, 2011, Colin Cross wrote:
> On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Monday, June 13, 2011, Colin Cross wrote:
> >> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
> >> > On Monday, June 13, 2011, Colin Cross wrote:
> >> >> [...]
> >> >
> >> > Is there any reason why this has to be ARM-specific? There are other
> >> > architectures where this kind of feature might make sense (SH and
> >> > powerpc at least).
> >>
> >> Nothing other than there are currently no adaptations for any drivers
> >> besides ARM, but I can move it somewhere outside ARM. Any suggestions
> >> where?
> >
> > Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
> > code at the moment, so it looks like a good place.
>
> OK, I'll look at moving it there.
>
> >> > Besides, is there any overlap between this feature and the CPU hotplug
> >> > notifiers?
> >>
> >> I don't think so - the hotplug notifiers are used when a CPU is being
> >> removed from the system, so no saving and restoring is necessary - the
> >> CPU will be rebooted from scratch. They are used by systems outside
> >> the CPU that need to know that a CPU no longer exists.
> >>
> >> CPU PM notifiers are used when a CPU is going through reset in a way
> >> that should be transparent to most of the system.
> >
> > Do I guess correctly that you mean cpuidle?
>
> cpuidle is the major user, but primary CPUs in suspend have to save
> and restore the same blocks, and tend to use the same platform sleep
> code as idle, so it's logical to use the notifiers for both. On the
> other hand, some drivers that would use cpu_pm notifiers already use
> syscore ops to handle suspend and resume (like vfp) - maybe these
> notifiers should only be used in cpuidle, and syscore ops added to the
> gic driver? I could also convert the notifiers to new syscore_ops -
> cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
> don't know how well that fits in to the intention for syscore.
Basically, syscore_ops deal with the situation during system suspend
when all CPUs but one have been switched off (through CPU hotplug)
and interrupts are off on the only active CPU. If there's anything
you need to do at this point, syscore_ops is the right thing to use.
And analogously for system resume.
Moreover, for system suspend switching off the "boot" CPU (i.e. the only one
that remains active through the whole sequence) should really be the last
thing done, everything else should have been handled through syscore_ops
before.
Thanks,
Rafael
On Tue, Jun 14, 2011 at 2:34 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Tuesday, June 14, 2011, Colin Cross wrote:
>> On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki <[email protected]> wrote:
>> > On Monday, June 13, 2011, Colin Cross wrote:
>> >> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
>> >> > On Monday, June 13, 2011, Colin Cross wrote:
>> >> >> [...]
>> >> >
>> >> > Is there any reason why this has to be ARM-specific? There are other
>> >> > architectures where this kind of feature might make sense (SH and
>> >> > powerpc at least).
>> >>
>> >> Nothing other than there are currently no adaptations for any drivers
>> >> besides ARM, but I can move it somewhere outside ARM. Any suggestions
>> >> where?
>> >
>> > Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
>> > code at the moment, so it looks like a good place.
>>
>> OK, I'll look at moving it there.
>>
>> >> > Besides, is there any overlap between this feature and the CPU hotplug
>> >> > notifiers?
>> >>
>> >> I don't think so - the hotplug notifiers are used when a CPU is being
>> >> removed from the system, so no saving and restoring is necessary - the
>> >> CPU will be rebooted from scratch. They are used by systems outside
>> >> the CPU that need to know that a CPU no longer exists.
>> >>
>> >> CPU PM notifiers are used when a CPU is going through reset in a way
>> >> that should be transparent to most of the system.
>> >
>> > Do I guess correctly that you mean cpuidle?
>>
>> cpuidle is the major user, but primary CPUs in suspend have to save
>> and restore the same blocks, and tend to use the same platform sleep
>> code as idle, so it's logical to use the notifiers for both. On the
>> other hand, some drivers that would use cpu_pm notifiers already use
>> syscore ops to handle suspend and resume (like vfp) - maybe these
>> notifiers should only be used in cpuidle, and syscore ops added to the
>> gic driver? I could also convert the notifiers to new syscore_ops -
>> cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
>> don't know how well that fits in to the intention for syscore.
>
> Basically, syscore_ops deal with the situation during system suspend
> when all CPUs but one have been switched off (through CPU hotplug)
> and interrupts are off on the only active CPU. If there's anything
> you need to do at this point, syscore_ops is the right thing to use.
> And analogously for system resume.
>
> Moreover, for system suspend switching off the "boot" CPU (i.e. the only one
> that remains active through the whole sequence) should really be the last
> thing done, everything else should have been handled through syscore_ops
> before.
Yes, but what to do with idle, which generally needs to do the exact
same things as handled in some syscore ops? Extend syscore ops, or
add the new notifier, and each driver can implement both syscore and
cpu_pm listeners (and probably call the same helper function to handle
both)?
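
A rough user-space sketch of the second option, where one driver exposes
both kinds of listener and both call the same save/restore helpers. The
suspend/resume signatures mirror the callbacks in the kernel's struct
syscore_ops; everything else (struct my_blk, the MY_CPU_PM_* event codes,
and the single-argument notifier) is hypothetical and simplified for
illustration.

```c
/* Hypothetical event codes; the real names are defined by the series. */
enum { MY_CPU_PM_ENTER, MY_CPU_PM_EXIT };

/* Hypothetical driver state: one live register and its saved copy. */
struct my_blk {
	unsigned int reg;
	unsigned int saved;
};

static struct my_blk blk;

/* Shared helpers that do the actual work for both paths. */
static void my_blk_save(void)    { blk.saved = blk.reg; }
static void my_blk_restore(void) { blk.reg = blk.saved; }

/* syscore-style callbacks, used for system suspend and resume. */
static int my_blk_syscore_suspend(void)
{
	my_blk_save();
	return 0;
}

static void my_blk_syscore_resume(void)
{
	my_blk_restore();
}

/* cpu_pm-style listener, used from the idle path. */
static int my_blk_cpu_pm_notify(unsigned long event)
{
	switch (event) {
	case MY_CPU_PM_ENTER:
		my_blk_save();
		break;
	case MY_CPU_PM_EXIT:
		my_blk_restore();
		break;
	}
	return 0;
}
```

The duplication is limited to the two thin wrappers; the cost is that
each driver has to register with two mechanisms.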
On Tuesday, June 14, 2011, Colin Cross wrote:
> On Tue, Jun 14, 2011 at 2:34 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday, June 14, 2011, Colin Cross wrote:
> >> On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki <[email protected]> wrote:
> >> > On Monday, June 13, 2011, Colin Cross wrote:
> >> >> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
> >> >> > On Monday, June 13, 2011, Colin Cross wrote:
> >> >> >> [...]
> >> >> >
> >> >> > Is there any reason why this has to be ARM-specific? There are other
> >> >> > architectures where this kind of feature might make sense (SH and
> >> >> > powerpc at least).
> >> >>
> >> >> Nothing other than there are currently no adaptations for any drivers
> >> >> besides ARM, but I can move it somewhere outside ARM. Any suggestions
> >> >> where?
> >> >
> >> > Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
> >> > code at the moment, so it looks like a good place.
> >>
> >> OK, I'll look at moving it there.
> >>
> >> >> > Besides, is there any overlap between this feature and the CPU hotplug
> >> >> > notifiers?
> >> >>
> >> >> I don't think so - the hotplug notifiers are used when a CPU is being
> >> >> removed from the system, so no saving and restoring is necessary - the
> >> >> CPU will be rebooted from scratch. They are used by systems outside
> >> >> the CPU that need to know that a CPU no longer exists.
> >> >>
> >> >> CPU PM notifiers are used when a CPU is going through reset in a way
> >> >> that should be transparent to most of the system.
> >> >
> >> > Do I guess correctly that you mean cpuidle?
> >>
> >> cpuidle is the major user, but primary CPUs in suspend have to save
> >> and restore the same blocks, and tend to use the same platform sleep
> >> code as idle, so it's logical to use the notifiers for both. On the
> >> other hand, some drivers that would use cpu_pm notifiers already use
> >> syscore ops to handle suspend and resume (like vfp) - maybe these
> >> notifiers should only be used in cpuidle, and syscore ops added to the
> >> gic driver? I could also convert the notifiers to new syscore_ops -
> >> cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
> >> don't know how well that fits in to the intention for syscore.
> >
> > Basically, syscore_ops deal with the situation during system suspend
> > when all CPUs but one have been switched off (through CPU hotplug)
> > and interrupts are off on the only active CPU. If there's anything
> > you need to do at this point, syscore_ops is the right thing to use.
> > And analogously for system resume.
> >
> > Moreover, for system suspend switching off the "boot" CPU (i.e. the only one
> > that remains active through the whole sequence) should really be the last
> > thing done, everything else should have been handled through syscore_ops
> > before.
>
> Yes, but what to do with idle, which generally needs to do the exact
> same things as handled in some syscore ops? Extend syscore ops, or
> add the new notifier, and each driver can implement both syscore and
> cpu_pm listeners (and probably call the same helper function to handle
> both)?
Good question. I don't think I have a good answer to it at the moment, need
to ponder that a bit more.
Colin,
On 6/15/2011 3:42 AM, Rafael J. Wysocki wrote:
> On Tuesday, June 14, 2011, Colin Cross wrote:
>> On Tue, Jun 14, 2011 at 2:34 PM, Rafael J. Wysocki<[email protected]> wrote:
>>> On Tuesday, June 14, 2011, Colin Cross wrote:
>>>> On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki<[email protected]> wrote:
>>>>> On Monday, June 13, 2011, Colin Cross wrote:
>>>>>> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki<[email protected]> wrote:
>>>>>>> On Monday, June 13, 2011, Colin Cross wrote:
>>>>>>>> This patch set tries to address Russell's concerns with platform
>>>>>>>> pm code calling into the driver for every block in the Cortex A9s
>>>>>>>> during idle, hotplug, and suspend. The first patch adds cpu pm
>>>>>>>> notifiers that can be called by platform code, the second uses
>>>>>>>> the notifier to save and restore the GIC state, and the third
>>>>>>>> saves the VFP state.
>>>>>>>>
>>>>>>>> The notifiers are used for two types of events, CPU PM events and
>>>>>>>> CPU complex PM events. CPU PM events are used to save and restore
>>>>>>>> per-cpu context when a single CPU is preparing to enter or has
>>>>>>>> just exited a low power state. For example, the VFP saves the
>>>>>>>> last thread context, and the GIC saves banked CPU registers.
>>>>>>>>
>>>>>>>> CPU complex events are used after all the CPUs in a power domain
>>>>>>>> have been prepared for the low power state. The GIC uses these
>>>>>>>> events to save global register state.
>>>>>>>>
>>>>>>>> Platforms that call the cpu_pm APIs must select
>>>>>>>> CONFIG_ARCH_USES_CPU_PM
>>>>>>>>
>>>>>>>> L2 cache is not covered by this patch set, as the determination
>>>>>>>> of when the L2 is reset and when it is retained is
>>>>>>>> platform-specific, and most of the APIs necessary are already
>>>>>>>> present.
>>>>>>>>
>>>>>>>> arch/arm/Kconfig | 7 ++
>>>>>>>> arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++
>>>>>>>> arch/arm/include/asm/cpu_pm.h | 54 +++++++++++
>>>>>>>> arch/arm/kernel/Makefile | 1 +
>>>>>>>> arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++
>>>>>>>
>>>>>>> Is there any reason why this has to be ARM-specific? There are other
>>>>>>> architectures where this kind of feature might make sense (SH and
>>>>>>> powerpc at least).
>>>>>>
>>>>>> Nothing other than there are currently no adaptations for any drivers
>>>>>> besides ARM, but I can move it somewhere outside ARM. Any suggestions
>>>>>> where?
>>>>>
>>>>> Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
>>>>> code at the moment, so it looks like a good place.
>>>>
>>>> OK, I'll look at moving it there.
>>>>
>>>>>>> Besides, is there any overlap between this feature and the CPU hotplug
>>>>>>> notifiers?
>>>>>>
>>>>>> I don't think so - the hotplug notifiers are used when a CPU is being
>>>>>> removed from the system, so no saving and restoring is necessary - the
>>>>>> CPU will be rebooted from scratch. They are used by systems outside
>>>>>> the CPU that need to know that a CPU no longer exists.
>>>>>>
>>>>>> CPU PM notifiers are used when a CPU is going through reset in a way
>>>>>> that should be transparent to most of the system.
>>>>>
>>>>> Do I guess correctly that you mean cpuidle?
>>>>
>>>> cpuidle is the major user, but primary CPUs in suspend have to save
>>>> and restore the same blocks, and tend to use the same platform sleep
>>>> code as idle, so it's logical to use the notifiers for both. On the
>>>> other hand, some drivers that would use cpu_pm notifiers already use
>>>> syscore ops to handle suspend and resume (like vfp) - maybe these
>>>> notifiers should only be used in cpuidle, and syscore ops added to the
>>>> gic driver? I could also convert the notifiers to new syscore_ops -
>>>> cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
>>>> don't know how well that fits in to the intention for syscore.
>>>
>>> Basically, syscore_ops deal with the situation during system suspend
>>> when all CPUs but one have been switched off (through CPU hotplug)
>>> and interrupts are off on the only active CPU. If there's anything
>>> you need to do at this point, syscore_ops is the right thing to use.
>>> And analogously for system resume.
>>>
>>> Moreover, for system suspend switching off the "boot" CPU (i.e. the only one
>>> that remains active through the whole sequence) should really be the last
>>> thing done, everything else should have been handled through syscore_ops
>>> before.
>>
IIUC, you mentioned moving the GIC and L2 to syscore_ops. The state of
these can still change until the last CPU goes down, and in that case
it's necessary to handle them as part of the last CPU's sleep code.
Otherwise, the syscore_ops callback would happen before the last CPU's
suspend has started.
>> Yes, but what to do with idle, which generally needs to do the exact
>> same things as handled in some syscore ops? Extend syscore ops, or
>> add the new notifier, and each driver can implement both syscore and
>> cpu_pm listeners (and probably call the same helper function to handle
>> both)?
>
> Good question. I don't think I have a good answer to it at the moment, need
> to ponder that a bit more.
So I am not sure CPU cluster related modules can be handled via
syscore_ops even for suspend. And if that happens to be the case, then
doing all of that through the PM cluster notifier for both idle and
suspend is most logical.
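
To make the ordering argument concrete, here is a minimal userspace sketch
of a shared sleep path that saves cluster state only when the last CPU goes
down. It is a model under stated assumptions, not the patch's code: NR_CPUS,
platform_cpu_sleep_enter(), platform_cpu_sleep_exit() and the counters are
hypothetical names, and the two helpers only stand in for whatever the
cpu_complex_pm enter/exit calls would trigger.

```c
#include <assert.h>

#define NR_CPUS 2		/* assumed size of the power domain */

int cpus_down;			/* CPUs currently in the low power state */
int cluster_saves;		/* times the cluster (global) state was saved */
int cluster_restores;		/* times the cluster state was restored */

/* Stand-ins for the cluster-level save/restore (e.g. GIC global state). */
static void cluster_save(void)    { cluster_saves++; }
static void cluster_restore(void) { cluster_restores++; }

/*
 * Hypothetical path shared by idle and suspend. A real implementation
 * needs locking around the counter; this model is single-threaded.
 */
void platform_cpu_sleep_enter(void)
{
	/* per-CPU context would be saved here first */
	if (++cpus_down == NR_CPUS)
		cluster_save();		/* only the last CPU down gets here */
}

void platform_cpu_sleep_exit(void)
{
	if (cpus_down == NR_CPUS)
		cluster_restore();	/* only the first CPU back up gets here */
	cpus_down--;
	/* per-CPU context would be restored here */
}
```

Because the cluster save is tied to the last CPU entering the low power
state, it runs at the same point whether the caller is idle or suspend.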
Regards
Santosh
On Wednesday, June 15, 2011, Rafael J. Wysocki wrote:
> On Tuesday, June 14, 2011, Colin Cross wrote:
> > On Tue, Jun 14, 2011 at 2:34 PM, Rafael J. Wysocki <[email protected]> wrote:
> > > On Tuesday, June 14, 2011, Colin Cross wrote:
> > >> On Tue, Jun 14, 2011 at 2:00 PM, Rafael J. Wysocki <[email protected]> wrote:
> > >> > On Monday, June 13, 2011, Colin Cross wrote:
> > >> >> On Mon, Jun 13, 2011 at 11:37 AM, Rafael J. Wysocki <[email protected]> wrote:
> > >> >> > On Monday, June 13, 2011, Colin Cross wrote:
> > >> >> >> This patch set tries to address Russell's concerns with platform
> > >> >> >> pm code calling into the driver for every block in the Cortex A9s
> > >> >> >> during idle, hotplug, and suspend. The first patch adds cpu pm
> > >> >> >> notifiers that can be called by platform code, the second uses
> > >> >> >> the notifier to save and restore the GIC state, and the third
> > >> >> >> saves the VFP state.
> > >> >> >>
> > >> >> >> The notifiers are used for two types of events, CPU PM events and
> > >> >> >> CPU complex PM events. CPU PM events are used to save and restore
> > >> >> >> per-cpu context when a single CPU is preparing to enter or has
> > >> >> >> just exited a low power state. For example, the VFP saves the
> > >> >> >> last thread context, and the GIC saves banked CPU registers.
> > >> >> >>
> > >> >> >> CPU complex events are used after all the CPUs in a power domain
> > >> >> >> have been prepared for the low power state. The GIC uses these
> > >> >> >> events to save global register state.
> > >> >> >>
> > >> >> >> Platforms that call the cpu_pm APIs must select
> > >> >> >> CONFIG_ARCH_USES_CPU_PM
> > >> >> >>
> > >> >> >> L2 cache is not covered by this patch set, as the determination
> > >> >> >> of when the L2 is reset and when it is retained is
> > >> >> >> platform-specific, and most of the APIs necessary are already
> > >> >> >> present.
> > >> >> >>
> > >> >> >> arch/arm/Kconfig | 7 ++
> > >> >> >> arch/arm/common/gic.c | 212 +++++++++++++++++++++++++++++++++++++++++
> > >> >> >> arch/arm/include/asm/cpu_pm.h | 54 +++++++++++
> > >> >> >> arch/arm/kernel/Makefile | 1 +
> > >> >> >> arch/arm/kernel/cpu_pm.c | 181 +++++++++++++++++++++++++++++++++++
> > >> >> >
> > >> >> > Is there any reason why this has to be ARM-specific? There are other
> > >> >> > architectures where this kind of feature might make sense (SH and
> > >> >> > powerpc at least).
> > >> >>
> > >> >> Nothing other than there are currently no adaptations for any drivers
> > >> >> besides ARM, but I can move it somewhere outside ARM. Any suggestions
> > >> >> where?
> > >> >
> > >> > Well, there is kernel/cpu.c. It contains mostly CPU hotplug and PM
> > >> > code at the moment, so it looks like a good place.
> > >>
> > >> OK, I'll look at moving it there.
> > >>
> > >> >> > Besides, is there any overlap between this feature and the CPU hotplug
> > >> >> > notifiers?
> > >> >>
> > >> >> I don't think so - the hotplug notifiers are used when a CPU is being
> > >> >> removed from the system, so no saving and restoring is necessary - the
> > >> >> CPU will be rebooted from scratch. They are used by systems outside
> > >> >> the CPU that need to know that a CPU no longer exists.
> > >> >>
> > >> >> CPU PM notifiers are used when a CPU is going through reset in a way
> > >> >> that should be transparent to most of the system.
> > >> >
> > >> > Do I guess correctly that you mean cpuidle?
> > >>
> > >> cpuidle is the major user, but primary CPUs in suspend have to save
> > >> and restore the same blocks, and tend to use the same platform sleep
> > >> code as idle, so it's logical to use the notifiers for both. On the
> > >> other hand, some drivers that would use cpu_pm notifiers already use
> > >> syscore ops to handle suspend and resume (like vfp) - maybe these
> > >> notifiers should only be used in cpuidle, and syscore ops added to the
> > >> gic driver? I could also convert the notifiers to new syscore_ops -
> > >> cpu_idle, cpu_unidle, cpu_cluster_idle, cpu_cluster_unidle, but I
> > >> don't know how well that fits in to the intention for syscore.
> > >
> > > Basically, syscore_ops deal with the situation during system suspend
> > > when all CPUs but one have been switched off (through CPU hotplug)
> > > and interrupts are off on the only active CPU. If there's anything
> > > you need to do at this point, syscore_ops is the right thing to use.
> > > And analogously for system resume.
> > >
> > > Moreover, for system suspend switching off the "boot" CPU (i.e. the only one
> > > that remains active through the whole sequence) should really be the last
> > > thing done, everything else should have been handled through syscore_ops
> > > before.
> >
> > Yes, but what to do with idle, which generally needs to do the exact
> > same things as handled in some syscore ops? Extend syscore ops, or
> > add the new notifier, and each driver can implement both syscore and
> > cpu_pm listeners (and probably call the same helper function to handle
> > both)?
>
> Good question. I don't think I have a good answer to it at the moment, need
> to ponder that a bit more.
So, it looks like system suspend only needs those things because it uses
(a part of) the cpuidle infrastructure to put the CPU into a low-power state.
Thus from the system suspend point of view, they are parts of the "switch the
CPU off" operation, so syscore_ops don't seem to be suitable for doing them.
That said, they seem to belong to cpuidle rather than to "general PM".
Thanks,
Rafael
Colin Cross <[email protected]> writes:
> On Mon, Jun 13, 2011 at 3:17 PM, Kevin Hilman <[email protected]> wrote:
>> Colin Cross <[email protected]> writes:
>>
>>> During some CPU power modes entered during idle, hotplug and
>>> suspend, peripherals located in the CPU power domain, such as
>>> the GIC and VFP, may be powered down. Add a notifier chain
>>> that allows drivers for those peripherals to be notified
>>> before and after they may be reset.
>>>
>>> Signed-off-by: Colin Cross <[email protected]>
>>
>> This is great, thanks!
>>
>> I had hacked up something OMAP-specific a while back to do something
>> similar, and have been meaning to make it more generic, so this is
>> perfect. Also, if it is moved outside ARM, note that x86_64 has an
>> idle_notifier infrastructure that is somewhat similar, and if you're
>> motivated, it should probably be converted to this as well. See
>> arch/x86/kernel/process_64.c.
>
> I'll take a look at x86
>
>> Also, for the sake of the comments/changelog, the usefulness of these
>> notifiers is not limited to low-power states where power is off and IP
>> blocks may be reset. It could be considered as simply a generic
>> notification mechanism for any CPU PM transitions.
>>
>> On OMAP we have other peripherals (not in the CPU power domain) where we
>> need to control their PM transitions in relation to the CPU PM
>> transitions so the notifiers are useful for any low-power state
>> transition of the CPU(s). The drivers for these peripherals use runtime
>> PM in their CPU PM notifiers to trigger the device PM transitions. (The
>> drivers must use the synchronous versions of the runtime PM get/put
>> calls with device in pm_runtime_irq_safe() mode.)
>
> Can you give a concrete example of this so I can describe it correctly?
>
One simple example is a clock controlling a thermal sensor. The sensor
(and thus the functional clock) should be active whenever the CPU is
running, so the idle transition (clock gating) of this IP block should
be coordinated with any idle transitions of the CPU cluster.
Another example is any IP block that is not capable of waking up the CPU
(or $DEITY forbid, where said wakeups are broken.) A CPU_PM_EXIT
notifier would be used to check for pending activity and take action.
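
As a rough userspace sketch of that second case: the callback below ignores
other events and, on CPU_PM_EXIT, checks a pending flag and handles it.
Only the CPU_PM_EXIT name comes from this thread; CPU_PM_ENTER, the numeric
values, struct sensor and sensor_cpu_pm_notify() are assumptions made for
illustration, and a real callback would take a struct notifier_block *.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in event and return codes; the values are assumptions. */
enum cpu_pm_event { CPU_PM_ENTER, CPU_PM_EXIT };
#define NOTIFY_DONE 0
#define NOTIFY_OK   1

/* Hypothetical IP block that cannot wake the CPU on its own. */
struct sensor {
	bool pending;	/* activity arrived while the CPU was asleep */
	int handled;	/* number of pending events processed */
};

int sensor_cpu_pm_notify(struct sensor *s, enum cpu_pm_event event)
{
	switch (event) {
	case CPU_PM_EXIT:
		/* CPU is back: check for missed activity and act on it. */
		if (s->pending) {
			s->pending = false;
			s->handled++;
		}
		return NOTIFY_OK;
	default:
		/* Nothing to do for other transitions in this sketch. */
		return NOTIFY_DONE;
	}
}
```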
>> In this series, I don't see any calls to cpu_[complex_]pm_[enter|exit]().
>> Based on that, I assume you prefer to leave it up to platform-specific
>> idle/PM code to place these calls. Or, do you plan to have this
>> triggered by generic CPUidle, suspend/resume and/or hotplug code? At
>> least on OMAP, I prefer having the calls in platform-specific code.
>
> I will post the Tegra code that uses this soon. I expect that the
> decision on exactly when to call these functions will be unique to
> each platform, so I think it should start in the platform-specific
> code.
Agreed.
>> I have a minor enhancement request. The notifier callbacks provide for
>> passing a void * through the notifier chain. Could you add a way for a
>> void * to be registered at cpu_pm_register_notifier() time and that
>> would be passed through the notifier call chain? This would allow using
>> the same struct notifier_block and callback for multiple instances of
>> the same IP, and the instances could be differentiated in the callback
>> using the void *.
>
> The void * passed to the notifier comes from the call to
> notifier_call_chain(), not from the call to register_notifier().
I see now.
> You can get the behavior you want by putting the notifier_block inside
> your device struct and using container_of on the notifier_block.
Yeah, that's the way I did it, but it requires a separate 'struct
notifier_block' for each device instance. It's only a minor issue, but
being able to pass in the 'void *' at registration time would allow a
single 'struct notifier_block' for all instances of the device.
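
For reference, the per-instance pattern looks roughly like this in a
userspace model. The struct notifier_block and container_of below are
simplified stand-ins for the kernel definitions, and struct my_dev,
my_dev_notify() and my_dev_init() are hypothetical names; registration with
the notifier chain is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
	struct notifier_block *next;
};

/* Simplified container_of: recover the enclosing struct from a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical device: one embedded notifier_block per instance. */
struct my_dev {
	int id;
	int saved;			/* times this instance was notified */
	struct notifier_block nb;
};

/* One callback serves all instances; nb identifies which one was hit. */
static int my_dev_notify(struct notifier_block *nb,
			 unsigned long event, void *data)
{
	struct my_dev *dev = container_of(nb, struct my_dev, nb);

	(void)event;
	(void)data;
	dev->saved++;
	return dev->id;		/* returned only so the sketch can be checked */
}

void my_dev_init(struct my_dev *dev, int id)
{
	dev->id = id;
	dev->saved = 0;
	dev->nb.notifier_call = my_dev_notify;
	dev->nb.next = NULL;
}
```

The callback is shared, but each instance still carries its own
notifier_block, which is the cost mentioned above.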
>> Also, FWIW I tested this on OMAP3 (Cortex-A8 UP) using
>> cpu_pm_enter/exit() in the code path shared between idle and suspend. I
>> successfully triggered PM transitions in non-CPU power-domain
>> peripherals, and it worked great.
>
> Great! Can I get a tested-by?
Sure, meant to add it the first time.
Tested-by: Kevin Hilman <[email protected]>
Kevin