2014-02-18 20:33:15

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 0/6] PM: introduce voltage domain abstraction

Hi,

On many SoCs such as OMAP, hardware blocks are isolated into voltage
domains allowing for individual tweaks for optimal power-performance
tradeoffs. Frameworks such as cpufreq, devfreq provide a top level
control of these hardware blocks. Most SoCs have different knobs to
fine tune the voltages. They may supply multiple supplies to various
voltage planes on the same voltage domains.

Regulator framework provides us a with an appropriate representation
of consumer model, however, we lack the abstraction necessary to
isolate the necessary sequencing for a specific voltage domain.

There has been few attempts previously taken such as [3] [4],
this series is a further development of of Mike's series[1] where
cpufreq-cpu0 driver was first targetted with clk notifiers. While I do
understand the concerns Mike has in [2], this could be a base on which
further work may be done.

In a nutshell, this (for cpufreq-cpu0 as an example):
maintains the legacy definitions used by cpufreq/devfreq such as:
&cpu0 {
cpu0-supply = <&regulator_x>;
};

on other SoCs where this might turn out to be a more detailed
implementation,
&cpu0 {
cpu0-voltdm = <&voltdm_x>;
};

To enure this is split off from clk framework, I chose
drivers/power/voltdm as the location for the new files, I'd love to
hear opinions and suggestions on the approach.

The series is based on 3.14-rc3 and is available here:
https://github.com/nmenon/linux-2.6-playground/commits/push/voltdm-notifier-rfc

A testing branch for Beagle-XM (OMAP3630 as testbed):
https://github.com/nmenon/linux-2.6-playground/commits/testing/beagle-xm-voltdm-notifier-rfc
Log: http://slexy.org/view/s20TgkN7kf

NOTE: TI platforms potentially need further development, however, this
forms the basis of the development.

Mike Turquette (2):
PM / Voltagedomain: Add generic clk notifier handler for regulator
based dynamic voltage scaling
cpufreq: cpufreq-cpu0: use clk rate-change notifiers

Nishanth Menon (4):
PM / Voltagedomain: introduce voltage domain driver support
devicetree: bindings: add documentation for voltagedomain
PM / Voltagedomain: introduce basic voltage domain support for OMAP
devicetree: bindings: voltagedomain: add bindings for OMAP compatible
SoCs

.../bindings/power/voltdm/voltage_domain.txt | 65 +++
.../bindings/power/voltdm/voltdm_omap.txt | 24 +
drivers/cpufreq/Kconfig | 1 +
drivers/cpufreq/cpufreq-cpu0.c | 140 ++----
drivers/power/Kconfig | 1 +
drivers/power/Makefile | 1 +
drivers/power/voltdm/Kconfig | 19 +
drivers/power/voltdm/Makefile | 6 +
drivers/power/voltdm/core.c | 508 ++++++++++++++++++++
drivers/power/voltdm/voltage_domain_private.h | 86 ++++
drivers/power/voltdm/voltdm_omap.c | 287 +++++++++++
include/linux/pm_voltage_domain.h | 47 ++
12 files changed, 1086 insertions(+), 99 deletions(-)
create mode 100644 Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt
create mode 100644 Documentation/devicetree/bindings/power/voltdm/voltdm_omap.txt
create mode 100644 drivers/power/voltdm/Kconfig
create mode 100644 drivers/power/voltdm/Makefile
create mode 100644 drivers/power/voltdm/core.c
create mode 100644 drivers/power/voltdm/voltage_domain_private.h
create mode 100644 drivers/power/voltdm/voltdm_omap.c
create mode 100644 include/linux/pm_voltage_domain.h

[1] https://lkml.org/lkml/2013/7/7/110 (original RFC)
[2] http://marc.info/?l=linux-arm-kernel&m=137947787621572&w=2 (runtime pm should enable regulator)
[3] http://marc.info/?t=136603111700006&r=1&w=2
[4] http://git.omapzoom.org/?p=kernel/omap.git;a=blob;f=arch/arm/mach-omap2/dvfs.c;hb=p-linux-omap-3.4
--
1.7.9.5


2014-02-18 20:33:20

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 2/6] cpufreq: cpufreq-cpu0: use clk rate-change notifiers

From: Mike Turquette <[email protected]>

Removes directly handling of OPP tables and voltage regulators by
calling of_clk_cpufreq_notifier_handler, introduced by commit "clk:
cpufreq helper for voltage scaling".

In the future this can help consolidate code found across similar
CPUfreq drivers.

[[email protected]: updates to keep the OPP logic still in cpufreq-cpu0 -
optimization to generalize that for cpufreq is to be done in a later
series]
Signed-off-by: Nishanth Menon <[email protected]>
Signed-off-by: Mike Turquette <[email protected]>
---
drivers/cpufreq/Kconfig | 1 +
drivers/cpufreq/cpufreq-cpu0.c | 140 ++++++++++++----------------------------
2 files changed, 42 insertions(+), 99 deletions(-)

diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 4b029c0..70b07ab 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -187,6 +187,7 @@ config GENERIC_CPUFREQ_CPU0
tristate "Generic CPU0 cpufreq driver"
depends on HAVE_CLK && REGULATOR && OF && THERMAL && CPU_THERMAL
select PM_OPP
+ select VOLTAGE_DOMAIN
help
This adds a generic cpufreq driver for CPU0 frequency management.
It supports both uniprocessor (UP) and symmetric multiprocessor (SMP)
diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
index 0c12ffc..32719d3 100644
--- a/drivers/cpufreq/cpufreq-cpu0.c
+++ b/drivers/cpufreq/cpufreq-cpu0.c
@@ -21,23 +21,20 @@
#include <linux/of.h>
#include <linux/pm_opp.h>
#include <linux/platform_device.h>
-#include <linux/regulator/consumer.h>
+#include <linux/pm_voltage_domain.h>
#include <linux/slab.h>
#include <linux/thermal.h>

static unsigned int transition_latency;
-static unsigned int voltage_tolerance; /* in percentage */

static struct device *cpu_dev;
static struct clk *cpu_clk;
-static struct regulator *cpu_reg;
static struct cpufreq_frequency_table *freq_table;
static struct thermal_cooling_device *cdev;
+static struct notifier_block *clk_nb;

static int cpu0_set_target(struct cpufreq_policy *policy, unsigned int index)
{
- struct dev_pm_opp *opp;
- unsigned long volt = 0, volt_old = 0, tol = 0;
unsigned int old_freq, new_freq;
long freq_Hz, freq_exact;
int ret;
@@ -50,50 +47,14 @@ static int cpu0_set_target(struct cpufreq_policy *policy, unsigned int index)
new_freq = freq_Hz / 1000;
old_freq = clk_get_rate(cpu_clk) / 1000;

- if (!IS_ERR(cpu_reg)) {
- rcu_read_lock();
- opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_Hz);
- if (IS_ERR(opp)) {
- rcu_read_unlock();
- pr_err("failed to find OPP for %ld\n", freq_Hz);
- return PTR_ERR(opp);
- }
- volt = dev_pm_opp_get_voltage(opp);
- rcu_read_unlock();
- tol = volt * voltage_tolerance / 100;
- volt_old = regulator_get_voltage(cpu_reg);
- }
-
- pr_debug("%u MHz, %ld mV --> %u MHz, %ld mV\n",
- old_freq / 1000, volt_old ? volt_old / 1000 : -1,
- new_freq / 1000, volt ? volt / 1000 : -1);
-
- /* scaling up? scale voltage before frequency */
- if (!IS_ERR(cpu_reg) && new_freq > old_freq) {
- ret = regulator_set_voltage_tol(cpu_reg, volt, tol);
- if (ret) {
- pr_err("failed to scale voltage up: %d\n", ret);
- return ret;
- }
- }
+ pr_debug("%u MHz --> %u MHz\n", old_freq / 1000, new_freq / 1000);

ret = clk_set_rate(cpu_clk, freq_exact);
if (ret) {
pr_err("failed to set clock rate: %d\n", ret);
- if (!IS_ERR(cpu_reg))
- regulator_set_voltage_tol(cpu_reg, volt_old, tol);
return ret;
}

- /* scaling down? scale voltage after frequency */
- if (!IS_ERR(cpu_reg) && new_freq < old_freq) {
- ret = regulator_set_voltage_tol(cpu_reg, volt, tol);
- if (ret) {
- pr_err("failed to scale voltage down: %d\n", ret);
- clk_set_rate(cpu_clk, old_freq * 1000);
- }
- }
-
return ret;
}

@@ -117,33 +78,24 @@ static struct cpufreq_driver cpu0_cpufreq_driver = {
static int cpu0_cpufreq_probe(struct platform_device *pdev)
{
struct device_node *np;
+ unsigned int voltage_latency;
int ret;

- cpu_dev = get_cpu_device(0);
if (!cpu_dev) {
- pr_err("failed to get cpu0 device\n");
- return -ENODEV;
- }
-
- np = of_node_get(cpu_dev->of_node);
- if (!np) {
- pr_err("failed to find cpu0 node\n");
- return -ENOENT;
- }
+ cpu_dev = get_cpu_device(0);
+ if (!cpu_dev) {
+ pr_err("failed to get cpu0 device\n");
+ return -ENODEV;
+ }

- cpu_reg = devm_regulator_get_optional(cpu_dev, "cpu0");
- if (IS_ERR(cpu_reg)) {
- /*
- * If cpu0 regulator supply node is present, but regulator is
- * not yet registered, we should try defering probe.
- */
- if (PTR_ERR(cpu_reg) == -EPROBE_DEFER) {
- dev_err(cpu_dev, "cpu0 regulator not ready, retry\n");
- ret = -EPROBE_DEFER;
- goto out_put_node;
+ np = of_node_get(cpu_dev->of_node);
+ ret = of_init_opp_table(cpu_dev);
+ if (ret) {
+ pr_err("failed to init OPP table: %d\n", ret);
+ return ret;
}
- pr_warn("failed to get cpu0 regulator: %ld\n",
- PTR_ERR(cpu_reg));
+ } else {
+ np = of_node_get(cpu_dev->of_node);
}

cpu_clk = devm_clk_get(cpu_dev, NULL);
@@ -153,11 +105,8 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
goto out_put_node;
}

- ret = of_init_opp_table(cpu_dev);
- if (ret) {
- pr_err("failed to init OPP table: %d\n", ret);
- goto out_put_node;
- }
+ if (of_property_read_u32(np, "clock-latency", &transition_latency))
+ transition_latency = CPUFREQ_ETERNAL;

ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
if (ret) {
@@ -165,40 +114,30 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
goto out_put_node;
}

- of_property_read_u32(np, "voltage-tolerance", &voltage_tolerance);
-
- if (of_property_read_u32(np, "clock-latency", &transition_latency))
- transition_latency = CPUFREQ_ETERNAL;
-
- if (!IS_ERR(cpu_reg)) {
- struct dev_pm_opp *opp;
- unsigned long min_uV, max_uV;
- int i;
-
- /*
- * OPP is maintained in order of increasing frequency, and
- * freq_table initialised from OPP is therefore sorted in the
- * same order.
- */
- for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++)
- ;
- rcu_read_lock();
- opp = dev_pm_opp_find_freq_exact(cpu_dev,
- freq_table[0].frequency * 1000, true);
- min_uV = dev_pm_opp_get_voltage(opp);
- opp = dev_pm_opp_find_freq_exact(cpu_dev,
- freq_table[i-1].frequency * 1000, true);
- max_uV = dev_pm_opp_get_voltage(opp);
- rcu_read_unlock();
- ret = regulator_set_voltage_time(cpu_reg, min_uV, max_uV);
- if (ret > 0)
- transition_latency += ret * 1000;
+ clk_nb = of_pm_voltdm_notifier_register(cpu_dev, np, cpu_clk, "cpu0",
+ &voltage_latency);
+
+ if (IS_ERR(clk_nb)) {
+ ret = PTR_ERR(clk_nb);
+ /* defer probe if regulator is not yet registered */
+ if (ret == -EPROBE_DEFER) {
+ dev_err(cpu_dev,
+ "cpu0 clock notifier not ready, retry\n");
+ } else {
+ dev_err(cpu_dev,
+ "Failed to register cpu0 clock notifier: %d\n",
+ ret);
+ }
+ goto out_freq_table_free;
}

+ if (voltage_latency > 0)
+ transition_latency += voltage_latency;
+
ret = cpufreq_register_driver(&cpu0_cpufreq_driver);
if (ret) {
pr_err("failed register driver: %d\n", ret);
- goto out_free_table;
+ goto out_notifier_unregister;
}

/*
@@ -215,7 +154,9 @@ static int cpu0_cpufreq_probe(struct platform_device *pdev)
of_node_put(np);
return 0;

-out_free_table:
+out_notifier_unregister:
+ of_pm_voltdm_notifier_unregister(clk_nb);
+out_freq_table_free:
dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
out_put_node:
of_node_put(np);
@@ -226,6 +167,7 @@ static int cpu0_cpufreq_remove(struct platform_device *pdev)
{
cpufreq_cooling_unregister(cdev);
cpufreq_unregister_driver(&cpu0_cpufreq_driver);
+ of_pm_voltdm_notifier_unregister(clk_nb);
dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);

return 0;
--
1.7.9.5

2014-02-18 20:33:25

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 5/6] PM / Voltagedomain: introduce basic voltage domain support for OMAP

OMAP platforms have multiple voltage domains, but each with consistent
behavior of controlling the VDD (supply) and VBB(ABB/Bias voltage).

These need to be sequenced in a specific order for functionality.
Further optimization such as AVS(Adaptive Voltage Scaling), optimized
voltages from efuse(Class0) can also be introduced based on this basic
framework.

This now allows reuse by voltage domain devices which may share the same
voltage domain by allowing the regulator framework to maintain per
consumer usage information.

Signed-off-by: Nishanth Menon <[email protected]>
---
drivers/power/voltdm/Kconfig | 7 +
drivers/power/voltdm/Makefile | 1 +
drivers/power/voltdm/voltdm_omap.c | 287 ++++++++++++++++++++++++++++++++++++
3 files changed, 295 insertions(+)
create mode 100644 drivers/power/voltdm/voltdm_omap.c

diff --git a/drivers/power/voltdm/Kconfig b/drivers/power/voltdm/Kconfig
index 270fdab..41bfbc3 100644
--- a/drivers/power/voltdm/Kconfig
+++ b/drivers/power/voltdm/Kconfig
@@ -9,4 +9,11 @@ config VOLTAGE_DOMAIN
menu "Voltage Domain Framework Drivers"
depends on VOLTAGE_DOMAIN

+config VOLTAGE_DOMAIN_OMAP
+ tristate "TI OMAP Voltage domain driver"
+ ---help---
+ Voltage domain support for OMAP platforms which need
+ vdd regulator and Adaptive Body Bias(ABB) regulator
+ support
+
endmenu
diff --git a/drivers/power/voltdm/Makefile b/drivers/power/voltdm/Makefile
index fd32040..e1cb50d 100644
--- a/drivers/power/voltdm/Makefile
+++ b/drivers/power/voltdm/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_VOLTAGE_DOMAIN) += core.o

# Hardware specific voltage domain
# please keep this section sorted lexicographically by file/directory path name
+obj-$(CONFIG_VOLTAGE_DOMAIN_OMAP) += voltdm_omap.o
diff --git a/drivers/power/voltdm/voltdm_omap.c b/drivers/power/voltdm/voltdm_omap.c
new file mode 100644
index 0000000..aacbdde
--- /dev/null
+++ b/drivers/power/voltdm/voltdm_omap.c
@@ -0,0 +1,287 @@
+/*
+ * Copyright (C) 2014 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Helper functions for registering clock rate-change notifier handlers
+ * that scale voltage when a clock changes its output frequency.
+ */
+#include <linux/clk.h>
+#include <linux/cpufreq.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/of_device.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
+#include <linux/regulator/consumer.h>
+#include <linux/slab.h>
+#include "voltage_domain_private.h"
+
+/**
+ * struct omap_voltdm_data - OMAP specific voltage domain data
+ * @vdd_reg: VDD regulator
+ * @vbb_reg: Body Bias regulator
+ */
+struct omap_voltdm_data {
+ struct regulator *vdd_reg;
+ struct regulator *vbb_reg;
+};
+
+/**
+ * omap_voltdm_do_transition() - do the voltage domain transition
+ * @dev: voltage domain device for which we are doing the transition
+ * @voltdm_data: data specific to the device
+ * @clk_notifier_flags: clk notifier flags for direction of transition
+ * @uv: what voltage to transition to
+ * @tol_uv: voltage tolerance to use
+ */
+static int omap_voltdm_do_transition(struct device *dev,
+ void *voltdm_data,
+ unsigned long clk_notifier_flags, int uv,
+ int tol_uv)
+{
+ struct omap_voltdm_data *data = (struct omap_voltdm_data *)voltdm_data;
+ int ret;
+ bool do_abb_first;
+
+ /* We dont expect voltdm layer to make mistakes.. but still */
+ BUG_ON(!data);
+
+ do_abb_first = clk_notifier_flags == ABORT_RATE_CHANGE ||
+ clk_notifier_flags == POST_RATE_CHANGE;
+
+ if (do_abb_first) {
+ dev_dbg(dev, "vbb pre %duV[tol %duV]\n", uv, tol_uv);
+ ret = regulator_set_voltage_tol(data->vbb_reg, uv, tol_uv);
+ if (ret) {
+ dev_err(dev,
+ "vbb failed for voltage %duV[tol %duV]:%d\n",
+ uv, tol_uv, ret);
+ return ret;
+ }
+ }
+
+ dev_dbg(dev, "vdd for voltage %duV[tol %duV]\n", uv, tol_uv);
+ ret = regulator_set_voltage_tol(data->vdd_reg, uv, tol_uv);
+ if (ret) {
+ dev_err(dev,
+ "vdd failed for voltage %duV[tol %duV]:%d\n",
+ uv, tol_uv, ret);
+ return ret;
+ }
+
+ if (!do_abb_first) {
+ dev_dbg(dev, "vbb post %duV[tol %duV]\n", uv, tol_uv);
+ ret = regulator_set_voltage_tol(data->vbb_reg, uv, tol_uv);
+ if (ret) {
+ dev_err(dev,
+ "vbb failed for voltage %duV[tol %duV]:%d\n",
+ uv, tol_uv, ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * omap_voltdm_latency() - provide the transition latency for the voltage domain
+ * @dev: voltage domain device for which we are doing the transition
+ * @voltdm_data: data specific to the device
+ * @min: minimum voltage in uV
+ * @max: maximum voltage in uV
+ *
+ * Return: If successful, the combined transition latency from min to max, else
+ * returns error value
+ */
+static int omap_voltdm_latency(struct device *dev, void *voltdm_data,
+ unsigned long min, unsigned long max)
+{
+ struct omap_voltdm_data *data = (struct omap_voltdm_data *)voltdm_data;
+ int ret, tot_latency = 0;
+
+ /* We dont expect voltdm layer to make mistakes.. but still */
+ BUG_ON(!data);
+
+ ret = regulator_set_voltage_time(data->vdd_reg, min, max);
+ if (ret < 0) {
+ dev_warn(dev, "vdd failed voltage latency: %d\n", ret);
+ return ret;
+ }
+ tot_latency += ret;
+
+ ret = regulator_set_voltage_time(data->vbb_reg, min, max);
+ if (ret < 0) {
+ dev_warn(dev, "vbb failed voltage latency: %d\n", ret);
+ return ret;
+ }
+ tot_latency += ret;
+
+ return tot_latency;
+}
+
+static inline void omap_voltdm_cleanup(struct omap_voltdm_data *data)
+{
+ if (!IS_ERR(data->vbb_reg))
+ regulator_put(data->vbb_reg);
+ if (!IS_ERR(data->vdd_reg))
+ regulator_put(data->vdd_reg);
+ kfree(data);
+}
+
+/**
+ * omap_voltdm_get() - get the voltage domain resources specific to request
+ * @voltdm_dev: voltage domain device
+ * @request_dev: device for which we have been requested to get
+ * @np: unused
+ * @np_args: unused
+ * @supply: unused
+ * @*voltdm_data: returns data for the current request (freed in put)
+ *
+ * Return: 0 if everything went OK, else return appropriate error value.
+ */
+static int omap_voltdm_get(struct device *voltdm_dev,
+ struct device *request_dev,
+ struct device_node *np,
+ struct of_phandle_args *np_args,
+ const char *supply,
+ void **voltdm_data)
+{
+ struct omap_voltdm_data *data;
+ int ret = 0;
+
+ BUG_ON(!voltdm_dev || !request_dev || !voltdm_data);
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+ /* Intialize for use in generic cleanup routine */
+ data->vdd_reg = ERR_PTR(-ENODEV);
+ data->vbb_reg = ERR_PTR(-ENODEV);
+
+ /*
+ * Setup aliases for request device supply to let regulator framework
+ * do multi-consumer scenario
+ */
+ ret = regulator_register_supply_alias(request_dev, "vdd", voltdm_dev, "vdd");
+ if (ret)
+ goto out_free;
+ ret = regulator_register_supply_alias(request_dev, "vbb", voltdm_dev, "vbb");
+ if (ret)
+ goto out_unreg_vdd;
+
+ data->vdd_reg = regulator_get(request_dev, "vdd");
+ if (IS_ERR(data->vdd_reg)) {
+ ret = PTR_ERR(data->vdd_reg);
+ dev_err(voltdm_dev, "Unable to get vdd regulator:%d\n", ret);
+ goto out_unreg;
+ }
+
+ data->vbb_reg = regulator_get(request_dev, "vbb");
+ if (IS_ERR(data->vbb_reg)) {
+ ret = PTR_ERR(data->vbb_reg);
+ dev_err(voltdm_dev, "Unable to get vbb regulator:%d\n", ret);
+ goto out_vdd_reg;
+ }
+
+ *voltdm_data = data;
+ return 0;
+
+out_vdd_reg:
+ regulator_put(data->vdd_reg);
+out_unreg:
+ regulator_unregister_supply_alias(request_dev, "vbb");
+out_unreg_vdd:
+ regulator_unregister_supply_alias(request_dev, "vdd");
+out_free:
+ kfree(data);
+
+ return ret;
+}
+
+/**
+ * omap_voltdm_put() - release the resources reserved by omap_voltdm_get()
+ * @voltdm_dev: voltage domain device for which we reserved resources
+ * @request_dev: device for which we have been requested to put
+ * @voltdm_data: data for the request provided by omap_voltdm_get()
+ */
+static void omap_voltdm_put(struct device *voltdm_dev, struct device *request_dev,
+ void *voltdm_data)
+{
+ struct omap_voltdm_data *data = (struct omap_voltdm_data *)voltdm_data;
+
+ /* We dont expect voltdm layer to make mistakes.. but still */
+ BUG_ON(!data || !voltdm_dev || !request_dev);
+
+ regulator_put(data->vbb_reg);
+ regulator_put(data->vdd_reg);
+ regulator_unregister_supply_alias(request_dev, "vbb");
+ regulator_unregister_supply_alias(request_dev, "vdd");
+ kfree(data);
+
+ return;
+}
+
+static const struct pm_voltdm_ops omap_voltdm_ops = {
+ .voltdm_get = omap_voltdm_get,
+ .voltdm_put = omap_voltdm_put,
+ .voltdm_latency = omap_voltdm_latency,
+ .voltdm_do_transition = omap_voltdm_do_transition,
+};
+
+static const struct pm_voltdm_desc omap_voltdm_desc = {
+ .ops = &omap_voltdm_ops,
+};
+
+static const struct of_device_id omap_voltdm_of_match[] = {
+ {.compatible = "ti,omap-voltdm", .data = &omap_voltdm_desc},
+ {},
+};
+
+static int omap_voltdm_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ const struct of_device_id *match;
+ struct pm_voltdm_dev *vdev;
+ int ret = 0;
+
+ match = of_match_device(omap_voltdm_of_match, dev);
+ if (!match) {
+ /* We do not expect this to happen */
+ dev_err(dev, "%s: Unable to match device\n", __func__);
+ return -ENODEV;
+ }
+ if (!match->data) {
+ /* Again, unlikely.. but mistakes do happen */
+ dev_err(dev, "%s: Bad data in match\n", __func__);
+ return -EINVAL;
+ }
+
+ vdev = devm_voltdm_register(dev, match->data);
+ if (IS_ERR(vdev)) {
+ ret = PTR_ERR(vdev);
+ dev_err(dev, "Failed to register voltage domain %d\n", ret);
+ }
+
+ return ret;
+}
+
+MODULE_ALIAS("platform:omap_voltdm");
+
+static struct platform_driver omap_voltdm_driver = {
+ .probe = omap_voltdm_probe,
+ .driver = {
+ .name = "omap_voltdm",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(omap_voltdm_of_match),
+ },
+};
+module_platform_driver(omap_voltdm_driver);
+
+MODULE_DESCRIPTION("Texas Instruments OMAP Voltage Domain driver");
+MODULE_AUTHOR("Texas Instruments Inc.");
+MODULE_LICENSE("GPL v2");
--
1.7.9.5

2014-02-18 20:33:49

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 3/6] PM / Voltagedomain: introduce voltage domain driver support

Many SoCs have basic concepts of voltage rails supplying a specific
SoC device. These voltage rails may be as simple as a single regulator
or complex to be three or more regulators that are transitioned in
tandem with respect to clock changes. In some cases, they may tend
to use custom frameworks to do the actual voltage transition OR use
hardware modules that takes hints about the required configuration and
does optimized voltage transitions.

The current regulator model provides the basic building blocks for the
transitions, however SoC drivers specific to each of these devices, be
it cpufreq/devfreq have to replicate the logic for functionality.

To simply the logic, we can hence introduce a layer that takes care
of the mundane transition logic, registration mechanisms to provide
the "user drivers" such as cpufreq/devfreq a generic interface, whose
details are abstracted by the device tree description for the SoC on
which the driver operates on.

This allows us opportunity to make generic cpufreq / devfreq drivers
which may deal with a single clock, however the actual voltage change
mechanism can be customized specific to SoC without impact to the
generic driver.

Signed-off-by: Nishanth Menon <[email protected]>
---
drivers/power/voltdm/Kconfig | 5 +
drivers/power/voltdm/Makefile | 3 +
drivers/power/voltdm/core.c | 347 +++++++++++++++++++++++--
drivers/power/voltdm/voltage_domain_private.h | 86 ++++++
4 files changed, 424 insertions(+), 17 deletions(-)
create mode 100644 drivers/power/voltdm/voltage_domain_private.h

diff --git a/drivers/power/voltdm/Kconfig b/drivers/power/voltdm/Kconfig
index c5353bd..270fdab 100644
--- a/drivers/power/voltdm/Kconfig
+++ b/drivers/power/voltdm/Kconfig
@@ -5,3 +5,8 @@ config VOLTAGE_DOMAIN
---help---
Core voltage domain framework using common clock framework's
notifier mechanism.
+
+menu "Voltage Domain Framework Drivers"
+ depends on VOLTAGE_DOMAIN
+
+endmenu
diff --git a/drivers/power/voltdm/Makefile b/drivers/power/voltdm/Makefile
index 3fa4408..fd32040 100644
--- a/drivers/power/voltdm/Makefile
+++ b/drivers/power/voltdm/Makefile
@@ -1,2 +1,5 @@
# Generic voltage domain
obj-$(CONFIG_VOLTAGE_DOMAIN) += core.o
+
+# Hardware specific voltage domain
+# please keep this section sorted lexicographically by file/directory path name
diff --git a/drivers/power/voltdm/core.c b/drivers/power/voltdm/core.c
index d0ed27e..69f6e81 100644
--- a/drivers/power/voltdm/core.c
+++ b/drivers/power/voltdm/core.c
@@ -10,12 +10,32 @@
* that scale voltage when a clock changes its output frequency.
*/
#include <linux/device.h>
+#include <linux/module.h>
#include <linux/of.h>
#include <linux/pm_opp.h>
#include <linux/pm_voltage_domain.h>
#include <linux/regulator/consumer.h>
#include <linux/slab.h>

+#include "voltage_domain_private.h"
+
+/**
+ * struct pm_voltdm_dev - internal representation of voltage domain devices
+ * @desc: voltage domain description
+ * @dev: voltage domain device
+ * @list: list to remaining voltage domain devices
+ * @lock: mutex to control data structure modifications and serialize ops
+ * @notifier_list: list of notifiers registered for this device
+ */
+struct pm_voltdm_dev {
+ const struct pm_voltdm_desc *desc;
+ struct device *dev;
+ struct list_head list;
+ /* list lock */
+ struct mutex lock;
+ struct list_head notifier_list;
+};
+
/**
* struct volt_scale_data - Internal structure to maintain notifier information
* @dev: device on behalf of which we register the notifier
@@ -23,6 +43,9 @@
* @reg: regulator if any which is used for scaling voltage
* @tol: voltage tolerance in %
* @nb: notifier block pointer
+ * @list: list head for the notifier
+ * @vdev: pointer to voltage domain device for this notifier
+ * @voltdm_data: voltdm driver specific data
*/
struct volt_scale_data {
struct device *dev;
@@ -30,23 +53,208 @@ struct volt_scale_data {
struct regulator *reg;
int tol;
struct notifier_block nb;
+ struct list_head list;
+
+ struct pm_voltdm_dev *vdev;
+ void *voltdm_data;
};

#define to_volt_scale_data(_nb) container_of(_nb, \
struct volt_scale_data, nb)

+static DEFINE_MUTEX(pm_voltdm_list_lock);
+static LIST_HEAD(pm_voltdm_list);
+
+static inline bool voltmd_skip_check(struct pm_voltdm_dev *vdev)
+{
+ bool ret = false;
+
+ if (vdev) {
+ const struct pm_voltdm_desc *desc;
+
+ if (IS_ERR(vdev))
+ return false;
+
+ mutex_lock(&vdev->lock);
+ desc = vdev->desc;
+
+ if (desc->flags & VOLTAGE_DOMAIN_FLAG_NOTIFY_ALL)
+ ret = true;
+ mutex_unlock(&vdev->lock);
+ }
+
+ return ret;
+}
+
+static inline int voltmd_scale_voltage(struct volt_scale_data *vsd,
+ unsigned long flags, int volt, int tol)
+{
+ int ret;
+ struct pm_voltdm_dev *vdev = vsd->vdev;
+
+ if (vdev) {
+ const struct pm_voltdm_ops *ops;
+
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);
+
+ mutex_lock(&vdev->lock);
+ ops = vdev->desc->ops;
+
+ ret = ops->voltdm_do_transition(vdev->dev,
+ vsd->voltdm_data,
+ flags, volt, tol);
+ mutex_unlock(&vdev->lock);
+ } else {
+ ret = regulator_set_voltage_tol(vsd->reg, volt, tol);
+ }
+
+ return ret;
+}
+
+static struct pm_voltdm_dev *voltdm_parse_of(struct device_node *np,
+ const char *supply,
+ struct of_phandle_args *args)
+{
+ char prop_name[32]; /* 32 is max size of property name */
+ bool found = false;
+ struct device_node *voltdm_np;
+ struct pm_voltdm_dev *vdev = NULL;
+ int ret;
+
+ snprintf(prop_name, sizeof(prop_name), "%s-voltdm", supply);
+ voltdm_np = of_parse_phandle(np, prop_name, 0);
+ if (voltdm_np) {
+ ret = of_parse_phandle_with_args(np, prop_name, "#voltdm-cells",
+ 0, args);
+ if (ret)
+ return ERR_PTR(ret);
+
+ mutex_lock(&pm_voltdm_list_lock);
+ list_for_each_entry(vdev, &pm_voltdm_list, list)
+ if (vdev->dev->parent && voltdm_np == vdev->dev->of_node) {
+ found = true;
+ break;
+ }
+ mutex_unlock(&pm_voltdm_list_lock);
+
+ /* if node is present and not ready, then defer */
+ if (!found)
+ return ERR_PTR(-EPROBE_DEFER);
+ } else {
+ return NULL;
+ }
+
+ return vdev;
+}
+
+static int voltdm_get(struct volt_scale_data *vsd, struct device_node *np,
+ const char *supply, struct of_phandle_args *args,
+ bool *skip_reg)
+{
+ struct pm_voltdm_dev *vdev = vsd->vdev;
+ struct device *dev = vsd->dev;
+ int ret = 0;
+
+ if (vdev) {
+ const struct pm_voltdm_ops *ops;
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);
+
+ mutex_lock(&vdev->lock);
+ if (!try_module_get(vdev->dev->driver->owner)) {
+ ret = -ENODEV;
+ } else {
+ ops = vdev->desc->ops;
+ if (ops->voltdm_get)
+ ret = ops->voltdm_get(vdev->dev, dev, np,
+ args, supply,
+ &vsd->voltdm_data);
+ if (ret)
+ module_put(vdev->dev->driver->owner);
+ }
+ if (!ret)
+ list_add(&vsd->list, &vdev->notifier_list);
+
+ mutex_unlock(&vdev->lock);
+ } else {
+ vsd->reg = regulator_get_optional(dev, supply);
+ if (IS_ERR(vsd->reg))
+ ret = PTR_ERR(vsd->reg);
+ /* Regulator is not mandatory */
+ if (ret != -EPROBE_DEFER) {
+ ret = 0;
+ *skip_reg = false;
+ dev_dbg(dev, "%s: Failed to get %s regulator:%d\n",
+ __func__, supply, ret);
+ }
+ }
+
+ return ret;
+}
+
+static void voltdm_put(struct volt_scale_data *vsd)
+{
+ struct pm_voltdm_dev *vdev = vsd->vdev;
+
+ if (vdev) {
+ const struct pm_voltdm_ops *ops;
+
+ if (IS_ERR(vdev))
+ return;
+
+ mutex_lock(&vdev->lock);
+ ops = vdev->desc->ops;
+ if (ops->voltdm_put)
+ ops->voltdm_put(vdev->dev, vsd->dev, vsd->voltdm_data);
+ list_del(&vsd->list);
+ module_put(vdev->dev->driver->owner);
+ mutex_unlock(&vdev->lock);
+ } else {
+ if (!IS_ERR(vsd->reg))
+ regulator_put(vsd->reg);
+ }
+}
+
+static int voltdm_get_latency(struct volt_scale_data *vsd, int min, int max)
+{
+ struct pm_voltdm_dev *vdev = vsd->vdev;
+ const struct pm_voltdm_ops *ops;
+ int ret;
+
+ if (!vdev)
+ return regulator_set_voltage_time(vsd->reg, min, max);
+
+ if (IS_ERR(vdev))
+ return PTR_ERR(vdev);
+
+ mutex_lock(&vdev->lock);
+ ops = vdev->desc->ops;
+
+ if (!ops->voltdm_latency)
+ ret = -ENXIO;
+ else
+ ret = ops->voltdm_latency(vdev->dev, vsd->voltdm_data,
+ min, max);
+ mutex_unlock(&vdev->lock);
+
+ return ret;
+}
+
static int clk_volt_notifier_handler(struct notifier_block *nb,
unsigned long flags, void *data)
{
struct clk_notifier_data *cnd = data;
struct volt_scale_data *vsd = to_volt_scale_data(nb);
+ struct pm_voltdm_dev *vdev = vsd->vdev;
int ret, volt, tol;
struct dev_pm_opp *opp;
unsigned long old_rate = cnd->old_rate;
unsigned long new_rate = cnd->new_rate;

- if ((new_rate < old_rate && flags == PRE_RATE_CHANGE) ||
- (new_rate > old_rate && flags == POST_RATE_CHANGE))
+ if (!voltmd_skip_check(vdev) &&
+ ((new_rate < old_rate && flags == PRE_RATE_CHANGE) ||
+ (new_rate > old_rate && flags == POST_RATE_CHANGE)))
return NOTIFY_OK;

rcu_read_lock();
@@ -69,7 +277,7 @@ static int clk_volt_notifier_handler(struct notifier_block *nb,
dev_dbg(vsd->dev, "%s: %lu -> %lu, V=%d, tol=%d, clk_flag=%lu\n",
__func__, old_rate, new_rate, volt, tol, flags);

- ret = regulator_set_voltage_tol(vsd->reg, volt, tol);
+ ret = voltmd_scale_voltage(vsd, flags, volt, tol);
if (ret) {
dev_err(vsd->dev,
"%s: Failed to scale voltage(%u): %d\n", __func__,
@@ -101,6 +309,14 @@ struct notifier_block *of_pm_voltdm_notifier_register(struct device *dev,
struct dev_pm_opp *opp;
unsigned long min, max, freq;
int ret;
+ struct of_phandle_args voltdm_args = {NULL};
+ struct pm_voltdm_dev *vdev = NULL;
+ bool skip_reg = false;
+
+ /* First look for voltdm of node */
+ vdev = voltdm_parse_of(np, supply, &voltdm_args);
+ if (IS_ERR(vdev))
+ return (struct notifier_block *)vdev;

vsd = kzalloc(sizeof(*vsd), GFP_KERNEL);
if (!vsd)
@@ -109,19 +325,16 @@ struct notifier_block *of_pm_voltdm_notifier_register(struct device *dev,
vsd->dev = dev;
vsd->clk = clk;
vsd->nb.notifier_call = clk_volt_notifier_handler;
- vsd->reg = regulator_get_optional(dev, supply);
- ret = 0;
- if (IS_ERR(vsd->reg))
- ret = PTR_ERR(vsd->reg);
- /* regulator is not mandatory */
- if (ret != -EPROBE_DEFER) {
- dev_warn(dev, "%s: Failed to get %s regulator:%d\n",
+ vsd->vdev = vdev;
+ ret = voltdm_get(vsd, np, supply, &voltdm_args, &skip_reg);
+
+ if (ret) {
+ dev_warn(dev, "%s: Failed to get %s regulator/voltdm: %d\n",
__func__, supply, ret);
- ret = 0;
goto err_free_vsd;
}
- /* For devices that are not ready.... */
- if (ret)
+ /* if not mandatory... */
+ if (skip_reg)
goto err_free_vsd;

rcu_read_lock();
@@ -138,7 +351,7 @@ struct notifier_block *of_pm_voltdm_notifier_register(struct device *dev,
max = dev_pm_opp_get_voltage(opp);
rcu_read_unlock();

- *voltage_latency = regulator_set_voltage_time(vsd->reg, min, max);
+ *voltage_latency = voltdm_get_latency(vsd, min, max);
if (*voltage_latency < 0) {
dev_warn(dev,
"%s: Fail calculating voltage latency[%ld<->%ld]:%d\n",
@@ -163,7 +376,7 @@ err_bad_opp:
dev_err(dev, "%s: failed to get OPP, %d\n", __func__, ret);

err_free_reg:
- regulator_put(vsd->reg);
+ voltdm_put(vsd);

err_free_vsd:
kfree(vsd);
@@ -187,9 +400,109 @@ void of_pm_voltdm_notifier_unregister(struct notifier_block *nb)
vsd = to_volt_scale_data(nb);
clk = vsd->clk;
clk_notifier_unregister(clk, nb);
- if (!IS_ERR(vsd->reg))
- regulator_put(vsd->reg);
+ voltdm_put(vsd);

kfree(vsd);
}
EXPORT_SYMBOL_GPL(of_pm_voltdm_notifier_unregister);
+
+static void devm_voltdm_release(struct device *dev, void *res)
+{
+ struct pm_voltdm_dev *vdev = *(struct pm_voltdm_dev **)res;
+ struct volt_scale_data *vsd;
+
+ mutex_lock(&pm_voltdm_list_lock);
+ mutex_lock(&vdev->lock);
+ list_for_each_entry(vsd, &vdev->notifier_list, list) {
+ dev_warn(dev, "%s: pending notifier from device %s!\n",
+ __func__, dev_name(vsd->dev));
+ vsd->vdev = ERR_PTR(-EINVAL);
+ }
+ mutex_unlock(&vdev->lock);
+
+ list_del(&vdev->list);
+ mutex_unlock(&pm_voltdm_list_lock);
+
+ kfree(vdev);
+}
+
+/**
+ * devm_voltdm_register - Resource managed voltage domain registration
+ * @dev: pointer to the device representing the voltage domain
+ * @desc: voltage domain descriptor
+ *
+ * Called by voltage domain drivers to register a voltagedomain. Returns a
+ * valid pointer to struct pm_voltdm_dev on success or an ERR_PTR() on
+ * error. The voltagedomain will automatically be released when the device
+ * is unbound.
+ */
+struct pm_voltdm_dev *devm_voltdm_register(struct device *dev,
+ const struct pm_voltdm_desc *desc)
+{
+ struct pm_voltdm_dev **ptr, *vdev;
+
+ if (!dev || !desc)
+ return ERR_PTR(-EINVAL);
+
+ if (!desc->ops)
+ return ERR_PTR(-EINVAL);
+
+ /* Mandatory to have notify transition */
+ if (!desc->ops->voltdm_do_transition) {
+ dev_err(dev, "%s: Bad desc -do_transition missing\n", __func__);
+ return ERR_PTR(-EINVAL);
+ }
+
+ vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+ if (!vdev)
+ return ERR_PTR(-ENOMEM);
+
+ ptr = devres_alloc(devm_voltdm_release, sizeof(*ptr), GFP_KERNEL);
+ if (!ptr) {
+ kfree(vdev);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ vdev->desc = desc;
+ vdev->dev = dev;
+ mutex_init(&vdev->lock);
+ INIT_LIST_HEAD(&vdev->notifier_list);
+ mutex_lock(&pm_voltdm_list_lock);
+ list_add(&vdev->list, &pm_voltdm_list);
+ mutex_unlock(&pm_voltdm_list_lock);
+
+ *ptr = vdev;
+ devres_add(dev, ptr);
+
+ return vdev;
+}
+EXPORT_SYMBOL_GPL(devm_voltdm_register);
+
+static int devm_vdev_match(struct device *dev, void *res, void *data)
+{
+ struct pm_voltdm_dev **r = res;
+ if (!r || !*r) {
+ WARN_ON(!r || !*r);
+ return 0;
+ }
+ return *r == data;
+}
+
+/**
+ * devm_voltdm_unregister - Resource managed voltagedomain unregister
+ * @vdev: voltage domain device returned by devm_voltdm_register()
+ *
+ * Unregister a voltdm registered with devm_voltdm_register().
+ * Normally this function will not need to be called and the resource
+ * management code will ensure that the resource is freed.
+ */
+void devm_voltdm_unregister(struct pm_voltdm_dev *vdev)
+{
+ int rc;
+ struct device *dev = vdev->dev;
+
+ rc = devres_release(dev, devm_voltdm_release, devm_vdev_match, vdev);
+ if (rc != 0)
+ WARN_ON(rc);
+}
+EXPORT_SYMBOL_GPL(devm_voltdm_unregister);
diff --git a/drivers/power/voltdm/voltage_domain_private.h b/drivers/power/voltdm/voltage_domain_private.h
new file mode 100644
index 0000000..e63c2b1
--- /dev/null
+++ b/drivers/power/voltdm/voltage_domain_private.h
@@ -0,0 +1,86 @@
+/*
+ * Voltage Domain private header for voltage domain drivers
+ *
+ * Copyright (C) 2014 Texas Instruments Incorporated - http://www.ti.com/
+ * Nishanth Menon
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __PM_VOLTAGE_DOMAIN_PRIVATE__
+#define __PM_VOLTAGE_DOMAIN_PRIVATE__
+
+#include <linux/pm_voltage_domain.h>
+
+struct pm_voltdm_dev;
+
+/**
+ * struct pm_voltdm_ops - Operations functions for voltage domain
+ * @voltdm_get: (optional) invoked when notifier is to be registered.
+ * @voltdm_put: (optional) invoked when notifier is unregistered.
+ * @voltdm_latency: (optional) compute and provide voltage domain
+ * transition latency.
+ * @voltdm_do_transition: (mandatory) callback for notification
+ *
+ * np_args - is arguments to the node phandle - useful when specific voltage
+ * domain is refered to using indices.
+ * voltdm_data - is the voltage domain driver specific data corresponding to
+ * driver information per registration (may point to domain driver specific
+ * data including reference to index. This is always provided back to the
+ * driver at various follow on operations.
+ * clk_notifier_flags - follows the standard clk notification flags
+ */
+struct pm_voltdm_ops {
+ int (*voltdm_get)(struct device *voltdm_dev,
+ struct device *request_dev,
+ struct device_node *np,
+ struct of_phandle_args *np_args,
+ const char *supply,
+ void **voltdm_data);
+ int (*voltdm_latency)(struct device *voltdm_dev, void *voltdm_data,
+ unsigned long min_uv, unsigned long max_uv);
+ int (*voltdm_do_transition)(struct device *voltdm_dev,
+ void *voltdm_data,
+ unsigned long clk_notifier_flags,
+ int uv, int tol_uv);
+ void (*voltdm_put)(struct device *voltdm_dev,
+ struct device *request_dev,
+ void *voltdm_data);
+};
+
+#define VOLTAGE_DOMAIN_FLAG_NOTIFY_ALL (0x1 << 0)
+
+/**
+ * struct pm_voltdm_desc - Descriptor for the voltage domain
+ * @ops: operations for the voltage domain
+ * @flags: flags controlling the various operations
+ */
+struct pm_voltdm_desc {
+ const struct pm_voltdm_ops *ops;
+ u16 flags;
+};
+
+#if defined(CONFIG_VOLTAGE_DOMAIN)
+struct pm_voltdm_dev *devm_voltdm_register(struct device *dev,
+ const struct pm_voltdm_desc *desc);
+void devm_voltdm_unregister(struct pm_voltdm_dev *vdev);
+#else
+static inline struct pm_voltdm_dev *devm_voltdm_register(struct device *dev,
+ const struct pm_voltdm_desc *desc)
+{
+ return ERR_PTR(-ENODEV);
+}
+
+static inline void devm_voltdm_unregister(struct pm_voltdm_dev *vdev)
+{
+}
+#endif /* VOLTAGE_DOMAIN */
+
+#endif /* __PM_VOLTAGE_DOMAIN_PRIVATE__ */
--
1.7.9.5

2014-02-18 20:35:08

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

From: Mike Turquette <[email protected]>

This patch provides helper functions for drivers that wish to scale
voltage through the clock rate-change notifiers. The approach taken
is that the user-driver(cpufreq/devfreq) do not care about the
details of the OPP table, nor does it care about handling the voltage
regulator directly.

By using the clk notifier flags, we are able to sequence the operations
in the right order. The current logic is heavily influenced by
implementation done in cpufreq-cpu0.

[[email protected]: Fixes in logic, and broken out from clk to allow building
a generic voltagedomain solution independent of cpufreq]
Signed-off-by: Nishanth Menon <[email protected]>
Signed-off-by: Mike Turquette <[email protected]>
---
drivers/power/Kconfig | 1 +
drivers/power/Makefile | 1 +
drivers/power/voltdm/Kconfig | 7 ++
drivers/power/voltdm/Makefile | 2 +
drivers/power/voltdm/core.c | 195 +++++++++++++++++++++++++++++++++++++
include/linux/pm_voltage_domain.h | 47 +++++++++
6 files changed, 253 insertions(+)
create mode 100644 drivers/power/voltdm/Kconfig
create mode 100644 drivers/power/voltdm/Makefile
create mode 100644 drivers/power/voltdm/core.c
create mode 100644 include/linux/pm_voltage_domain.h

diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
index ba69751..5c4fe16 100644
--- a/drivers/power/Kconfig
+++ b/drivers/power/Kconfig
@@ -394,3 +394,4 @@ source "drivers/power/reset/Kconfig"
endif # POWER_SUPPLY

source "drivers/power/avs/Kconfig"
+source "drivers/power/voltdm/Kconfig"
diff --git a/drivers/power/Makefile b/drivers/power/Makefile
index ee54a3e..3d47072 100644
--- a/drivers/power/Makefile
+++ b/drivers/power/Makefile
@@ -58,3 +58,4 @@ obj-$(CONFIG_POWER_AVS) += avs/
obj-$(CONFIG_CHARGER_SMB347) += smb347-charger.o
obj-$(CONFIG_CHARGER_TPS65090) += tps65090-charger.o
obj-$(CONFIG_POWER_RESET) += reset/
+obj-$(CONFIG_VOLTAGE_DOMAIN) += voltdm/
diff --git a/drivers/power/voltdm/Kconfig b/drivers/power/voltdm/Kconfig
new file mode 100644
index 0000000..c5353bd
--- /dev/null
+++ b/drivers/power/voltdm/Kconfig
@@ -0,0 +1,7 @@
+config VOLTAGE_DOMAIN
+ bool
+ depends on COMMON_CLK && OF && PM_OPP
+ default y if COMMON_CLK
+ ---help---
+ Core voltage domain framework using common clock framework's
+ notifier mechanism.
diff --git a/drivers/power/voltdm/Makefile b/drivers/power/voltdm/Makefile
new file mode 100644
index 0000000..3fa4408
--- /dev/null
+++ b/drivers/power/voltdm/Makefile
@@ -0,0 +1,2 @@
+# Generic voltage domain
+obj-$(CONFIG_VOLTAGE_DOMAIN) += core.o
diff --git a/drivers/power/voltdm/core.c b/drivers/power/voltdm/core.c
new file mode 100644
index 0000000..d0ed27e
--- /dev/null
+++ b/drivers/power/voltdm/core.c
@@ -0,0 +1,195 @@
+/*
+ * Copyright (C) 2013 Linaro Ltd <[email protected]>
+ * Copyright (C) 2014 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Helper functions for registering clock rate-change notifier handlers
+ * that scale voltage when a clock changes its output frequency.
+ */
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/pm_opp.h>
+#include <linux/pm_voltage_domain.h>
+#include <linux/regulator/consumer.h>
+#include <linux/slab.h>
+
+/**
+ * struct volt_scale_data - Internal structure to maintain notifier information
+ * @dev: device on behalf of which we register the notifier
+ * @clk: clk on which we registered the notifier
+ * @reg: regulator if any which is used for scaling voltage
+ * @tol: voltage tolerance in %
+ * @nb: notifier block pointer
+ */
+struct volt_scale_data {
+ struct device *dev;
+ struct clk *clk;
+ struct regulator *reg;
+ int tol;
+ struct notifier_block nb;
+};
+
+#define to_volt_scale_data(_nb) container_of(_nb, \
+ struct volt_scale_data, nb)
+
+static int clk_volt_notifier_handler(struct notifier_block *nb,
+ unsigned long flags, void *data)
+{
+ struct clk_notifier_data *cnd = data;
+ struct volt_scale_data *vsd = to_volt_scale_data(nb);
+ int ret, volt, tol;
+ struct dev_pm_opp *opp;
+ unsigned long old_rate = cnd->old_rate;
+ unsigned long new_rate = cnd->new_rate;
+
+ if ((new_rate < old_rate && flags == PRE_RATE_CHANGE) ||
+ (new_rate > old_rate && flags == POST_RATE_CHANGE))
+ return NOTIFY_OK;
+
+ rcu_read_lock();
+ if (flags != ABORT_RATE_CHANGE)
+ opp = dev_pm_opp_find_freq_ceil(vsd->dev, &new_rate);
+ else
+ opp = dev_pm_opp_find_freq_ceil(vsd->dev, &old_rate);
+ if (IS_ERR(opp)) {
+ rcu_read_unlock();
+ dev_err(vsd->dev, "%s: Failed to find OPP for %lu\n",
+ __func__, new_rate);
+ return notifier_from_errno(PTR_ERR(opp));
+ }
+
+ volt = dev_pm_opp_get_voltage(opp);
+ rcu_read_unlock();
+
+ tol = volt * vsd->tol / 100;
+
+ dev_dbg(vsd->dev, "%s: %lu -> %lu, V=%d, tol=%d, clk_flag=%lu\n",
+ __func__, old_rate, new_rate, volt, tol, flags);
+
+ ret = regulator_set_voltage_tol(vsd->reg, volt, tol);
+ if (ret) {
+ dev_err(vsd->dev,
+ "%s: Failed to scale voltage(%u): %d\n", __func__,
+ volt, ret);
+ return notifier_from_errno(ret);
+ }
+
+ return NOTIFY_OK;
+}
+
+/**
+ * of_pm_voltdm_notifier_register() - register voltage domain notifier
+ * @dev: device for which to register notifier for
+ * @np: node pointer of the device for which we register
+ * @clk: clk pointer around which the notifier is expected to trigger
+ * @supply: default regulator supply name(regulator id string)
+ * @voltage_latency: returns the latency for the voltage domain
+ *
+ * Return: notifier block which is registered with the common clock framework's
+ * notifier for the clk node requested.
+ */
+struct notifier_block *of_pm_voltdm_notifier_register(struct device *dev,
+ struct device_node *np,
+ struct clk *clk,
+ const char *supply,
+ int *voltage_latency)
+{
+ struct volt_scale_data *vsd;
+ struct dev_pm_opp *opp;
+ unsigned long min, max, freq;
+ int ret;
+
+ vsd = kzalloc(sizeof(*vsd), GFP_KERNEL);
+ if (!vsd)
+ return ERR_PTR(-ENOMEM);
+
+ vsd->dev = dev;
+ vsd->clk = clk;
+ vsd->nb.notifier_call = clk_volt_notifier_handler;
+ vsd->reg = regulator_get_optional(dev, supply);
+ ret = 0;
+ if (IS_ERR(vsd->reg))
+ ret = PTR_ERR(vsd->reg);
+ /* regulator is not mandatory */
+ if (ret != -EPROBE_DEFER) {
+ dev_warn(dev, "%s: Failed to get %s regulator:%d\n",
+ __func__, supply, ret);
+ ret = 0;
+ goto err_free_vsd;
+ }
+ /* For devices that are not ready.... */
+ if (ret)
+ goto err_free_vsd;
+
+ rcu_read_lock();
+ freq = 0;
+ opp = dev_pm_opp_find_freq_ceil(dev, &freq);
+ if (IS_ERR(opp))
+ goto err_bad_opp;
+ min = dev_pm_opp_get_voltage(opp);
+
+ freq = ULONG_MAX;
+ opp = dev_pm_opp_find_freq_floor(dev, &freq);
+ if (IS_ERR(opp))
+ goto err_bad_opp;
+ max = dev_pm_opp_get_voltage(opp);
+ rcu_read_unlock();
+
+ *voltage_latency = regulator_set_voltage_time(vsd->reg, min, max);
+ if (*voltage_latency < 0) {
+ dev_warn(dev,
+ "%s: Fail calculating voltage latency[%ld<->%ld]:%d\n",
+ __func__, min, max, *voltage_latency);
+ }
+
+ of_property_read_u32(np, "voltage-tolerance", &vsd->tol);
+
+ ret = clk_notifier_register(clk, &vsd->nb);
+
+ if (ret) {
+ dev_err(dev, "%s: Failed to Register Notifier, %d\n", __func__,
+ ret);
+ goto err_free_reg;
+ }
+
+ return &vsd->nb;
+
+err_bad_opp:
+ rcu_read_unlock();
+ ret = PTR_ERR(opp);
+ dev_err(dev, "%s: failed to get OPP, %d\n", __func__, ret);
+
+err_free_reg:
+ regulator_put(vsd->reg);
+
+err_free_vsd:
+ kfree(vsd);
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(of_pm_voltdm_notifier_register);
+
+/**
+ * of_pm_voltdm_notifier_unregister() - unregister notifier for volt domain
+ * @nb: notifier block returned by of_pm_voltdm_notifier_register
+ */
+void of_pm_voltdm_notifier_unregister(struct notifier_block *nb)
+{
+ struct volt_scale_data *vsd;
+ struct clk *clk;
+
+ /* if caller send us back error value */
+ if (IS_ERR(nb))
+ return;
+
+ vsd = to_volt_scale_data(nb);
+ clk = vsd->clk;
+ clk_notifier_unregister(clk, nb);
+ if (!IS_ERR(vsd->reg))
+ regulator_put(vsd->reg);
+
+ kfree(vsd);
+}
+EXPORT_SYMBOL_GPL(of_pm_voltdm_notifier_unregister);
diff --git a/include/linux/pm_voltage_domain.h b/include/linux/pm_voltage_domain.h
new file mode 100644
index 0000000..1ee7343
--- /dev/null
+++ b/include/linux/pm_voltage_domain.h
@@ -0,0 +1,47 @@
+/*
+ * Voltage Domain header for users of voltage domain interface
+ *
+ * Copyright (C) 2013 Linaro Ltd <[email protected]>
+ * Copyright (C) 2014 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __PM_VOLTAGE_DOMAIN__
+#define __PM_VOLTAGE_DOMAIN__
+
+#include <linux/clk.h>
+
+#if defined(CONFIG_VOLTAGE_DOMAIN)
+struct notifier_block *of_pm_voltdm_notifier_register(struct device *dev,
+ struct device_node *np,
+ struct clk *clk,
+ const char *supply,
+ int *voltage_latency);
+void of_pm_voltdm_notifier_unregister(struct notifier_block *nb);
+
+#else
+static inline struct notifier_block *of_pm_voltdm_notifier_register(
+ struct device *dev,
+ struct device_node *np,
+ struct clk *clk,
+ const char *supply,
+ int *voltage_latency);
+{
+ return ERR_PTR(-ENODEV);
+}
+
+static inline void of_pm_voltdm_notifier_unregister(struct notifier_block *nb)
+{
+}
+
+#endif /* VOLTAGE_DOMAIN */
+
+#endif /* __PM_VOLTAGE_DOMAIN__ */
--
1.7.9.5

2014-02-18 20:33:13

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 6/6] devicetree: bindings: voltagedomain: add bindings for OMAP compatible SoCs

Add documentation for device tree description for basic voltage domain
for Texas Instruments OMAP compatible SoC family of processors which
controls both bias and supply voltage planes.

Signed-off-by: Nishanth Menon <[email protected]>
---
.../bindings/power/voltdm/voltdm_omap.txt | 24 ++++++++++++++++++++
1 file changed, 24 insertions(+)
create mode 100644 Documentation/devicetree/bindings/power/voltdm/voltdm_omap.txt

diff --git a/Documentation/devicetree/bindings/power/voltdm/voltdm_omap.txt b/Documentation/devicetree/bindings/power/voltdm/voltdm_omap.txt
new file mode 100644
index 0000000..30cd3d2
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/voltdm/voltdm_omap.txt
@@ -0,0 +1,24 @@
+Texas Instruments OMAP compatible voltage domain description
+
+This binding uses [1] and describes the voltage domain devices
+typically used on Texas Instruments OMAP compatible SoC family of
+processors.
+
+[1] Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt
+
+Required Properties:
+- compatible: Should be one of:
+ "ti,omap-voltdm" - basic voltage domain controlling VDD and VBB
+- vdd-supply: phandle to regulator controlling VDD supply
+- vbb-supply: phandle to regulator controlling Body Bias supply
+ (Usually Adaptive Body Bias regulator)
+- #voltdm-cells: shall be <0>
+
+Example:
+voltage_domain_mpu: voltdm@1 {
+ compatible = "ti,omap-voltdm";
+ #voltdm-cells = <0>;
+
+ vdd-supply = <&vcc>;
+ vbb-supply = <&abb_mpu>;
+};
--
1.7.9.5

2014-02-18 20:37:04

by Nishanth Menon

[permalink] [raw]
Subject: [RFC PATCH 4/6] devicetree: bindings: add documentation for voltagedomain

Add documentation for voltage domain binding format. Specific
voltage domains will have bindings corresponding to them.

Signed-off-by: Nishanth Menon <[email protected]>
---
.../bindings/power/voltdm/voltage_domain.txt | 65 ++++++++++++++++++++
1 file changed, 65 insertions(+)
create mode 100644 Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt

diff --git a/Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt b/Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt
new file mode 100644
index 0000000..da48a19
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/voltdm/voltage_domain.txt
@@ -0,0 +1,65 @@
+Specifying Voltage Domain information for devices
+=================================================
+
+1. Specifying a voltage domain.
+
+SoCs may have multiple voltage domains under which various clocks operate.
+
+Mandatory Properties:
+- #voltdm-cells - indicates if there are specifiers needed to reference the
+ voltage domain
+
+Optional Properties:
+- voltage-tolerance: Specify the voltage tolerance in percentage
+
+2. Voltage domain consumer
+consumer nodes can be reference using the below binding:
+- <name>-voltdm: phandle to the voltage domain node.
+
+Examples:
+
+A. Voltage Domain controlling multiple regulator
+voltage_domain_mpu: voltdm@1 {
+ compatible = "xyz";
+ #voltdm-cells = <0>;
+ vdd-supply = <&vcc>;
+ vbb-supply = <&abb_mpu>;
+ ...
+};
+
+voltage_domain_gpu: voltdm@2 {
+ compatible = "xyz";
+ #voltdm-cells = <0>;
+ vdd-supply = <&dcdc2>;
+ vbb-supply = <&abb_gpu>;
+ ...
+};
+...
+&cpu0 {
+ cpu0-voltdm = <&voltage_domain_mpu>;
+ voltage-tolerance = <1>;
+};
+
+&gpu {
+ gpu-voltdm = <&voltage_domain_gpu>;
+};
+
+B. Indexed voltage domain device
+
+#define SOC_XYZ_VOLTAGE_DOMAIN_MPU 0
+#define SOC_XYZ_VOLTAGE_DOMAIN_GFX 1
+
+voltage_domain_socx: voltdm@1 {
+ compatible = "abc";
+ #voltdm-cells = <1>;
+ ...
+};
+...
+&cpu0 {
+ cpu0-voltdm = <&voltage_domain_socx SOC_XYZ_VOLTAGE_DOMAIN_MPU>;
+ voltage-tolerance = <1>;
+};
+
+&gpu {
+ gpu-voltdm = <&voltage_domain_socx SOC_XYZ_VOLTAGE_DOMAIN_GFX>;
+};
--
1.7.9.5

2014-02-24 05:34:50

by Mark Brown

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] PM / Voltagedomain: introduce voltage domain driver support

On Tue, Feb 18, 2014 at 02:32:20PM -0600, Nishanth Menon wrote:

> The current regulator model provides the basic building blocks for the
> transitions, however SoC drivers specific to each of these devices, be
> it cpufreq/devfreq have to replicate the logic for functionality.

> To simply the logic, we can hence introduce a layer that takes care
> of the mundane transition logic, registration mechanisms to provide
> the "user drivers" such as cpufreq/devfreq a generic interface, whose
> details are abstracted by the device tree description for the SoC on
> which the driver operates on.

This doesn't really provide a picture of what the generic interface
that's being offered is and...

> drivers/power/voltdm/Kconfig | 5 +
> drivers/power/voltdm/Makefile | 3 +
> drivers/power/voltdm/core.c | 347 +++++++++++++++++++++++--
> drivers/power/voltdm/voltage_domain_private.h | 86 ++++++
> 4 files changed, 424 insertions(+), 17 deletions(-)

...the diffstat doesn't make it obvious what the external interface is
either. It would be much easier to review this with a clearer picture
of what it's aiming to implement.

> + voltdm_np = of_parse_phandle(np, prop_name, 0);
> + if (voltdm_np) {
> + ret = of_parse_phandle_with_args(np, prop_name, "#voltdm-cells",
> + 0, args);
> + if (ret)
> + return ERR_PTR(ret);

There seems to be some DT stuff going on here, is the interface DT only?


Attachments:
(No filename) (1.43 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-02-24 14:38:45

by Nishanth Menon

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] PM / Voltagedomain: introduce voltage domain driver support

On 02/23/2014 07:58 PM, Mark Brown wrote:
> On Tue, Feb 18, 2014 at 02:32:20PM -0600, Nishanth Menon wrote:
>
>> The current regulator model provides the basic building blocks for the
>> transitions, however SoC drivers specific to each of these devices, be
>> it cpufreq/devfreq have to replicate the logic for functionality.
>
>> To simply the logic, we can hence introduce a layer that takes care
>> of the mundane transition logic, registration mechanisms to provide
>> the "user drivers" such as cpufreq/devfreq a generic interface, whose
>> details are abstracted by the device tree description for the SoC on
>> which the driver operates on.
>
> This doesn't really provide a picture of what the generic interface
> that's being offered is and...
I should have probably picked up part of the cover letter[1] and
elaborated further in the commit message here.

Intent here is to allow drivers such as cpufreq-cpu0 to be reused on
platforms such as TI's OMAP derivatives, and other SoCs which differ
only by the sequence involved in voltage scale operations. So, this
patch provides a framework for registering the underlying
implementation of the SoC specific voltage change methodology.

Overall the sequence takes place after this patch is as follows:
a) voltage domain drivers such as those of TI or others register with
voltage domain with devm_voltdm_register.
b) cpufreq-cpu0/devfreq drivers:
of_pm_voltdm_notifier_register(introduced as part of patch #1) to
register notifiers around clk of interest. This request is linked to
the specific voltage domain using phandle in device tree.
c) when cpufreq-cpu0/devfreq does a clk_set_rate, the common clock
framework triggers notifiers in voltage domain core which in turn,
invokes the corresponding handlers for the voltage domain driver
ensuring the right dvfs sequence specific to the SoC is triggered.

With this patch, the logic for a SoC specific voltage domain
intricacies can now be abstracted out to the voltage domain layer.
This allows reuse of the logic by multiple drivers such as
devfreq/cpufreq and allows these to remain consistent with very
minimal SoC(if any at all) specific implementations in them.

Among the other obvious alternatives to this approach are:
a) To duplicate the logic multiple times for various drivers such as
devfreq/cpufreq and try to ensure the logic sequences are proper.
basically, create cpufreq-abb-omap, cpufreq-avs-omap,
devfreq-abb-omap, devfreq-avs-omap etc..
b) introduce a specific API for voltage change (something like
pm_dev_scale_opp_frequency or the equivalent).


>
>> drivers/power/voltdm/Kconfig | 5 +
>> drivers/power/voltdm/Makefile | 3 +
>> drivers/power/voltdm/core.c | 347 +++++++++++++++++++++++--
>> drivers/power/voltdm/voltage_domain_private.h | 86 ++++++
>> 4 files changed, 424 insertions(+), 17 deletions(-)
>
> ...the diffstat doesn't make it obvious what the external interface is
> either. It would be much easier to review this with a clearer picture
> of what it's aiming to implement.

The drivers such as cpufreq/devfreq has the same interface introduced
by patch #1 in this series of_pm_voltdm_notifier_register/unregister
which is exposed by include/linux/pm_voltage_domain.h

Underlying implementation of the SoC specific voltage change
methodology can now be isolated by voltage domain driver using
voltage_domain_private.h and services provided by core.c

>
>> + voltdm_np = of_parse_phandle(np, prop_name, 0);
>> + if (voltdm_np) {
>> + ret = of_parse_phandle_with_args(np, prop_name, "#voltdm-cells",
>> + 0, args);
>> + if (ret)
>> + return ERR_PTR(ret);
>
> There seems to be some DT stuff going on here, is the interface DT only?
>
yes. i will make that clear in the documentation.

[1] http://marc.info/?l=linux-kernel&m=139275560531711&w=2
--
Regards,
Nishanth Menon

2014-02-25 05:51:53

by Mike Turquette

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

Quoting Nishanth Menon (2014-02-18 12:32:18)
> From: Mike Turquette <[email protected]>
>
> This patch provides helper functions for drivers that wish to scale
> voltage through the clock rate-change notifiers. The approach taken
> is that the user-driver(cpufreq/devfreq) do not care about the
> details of the OPP table, nor does it care about handling the voltage
> regulator directly.
>
> By using the clk notifier flags, we are able to sequence the operations
> in the right order. The current logic is heavily influenced by
> implementation done in cpufreq-cpu0.
>
> [[email protected]: Fixes in logic, and broken out from clk to allow building
> a generic voltagedomain solution independent of cpufreq]
> Signed-off-by: Nishanth Menon <[email protected]>
> Signed-off-by: Mike Turquette <[email protected]>

Not-signed-off-by: Mike Turquette <[email protected]>

I haven't reviewed this series and it is a pretty big deviation from my
original RFC. You can have authorship of the patches if you want.

I'm not sure about trying to capture the "voltdm" as a core concept. It
feels a bit unwieldy to me. I have wondered about making an abstract
"performance domain" which is the dvfs analogue to generic power
domains. This a reasonable split since gpd are good for idle power
savings (e.g. clock gate, power gate, sleep state, etc) and "perf
domains" would be good for active power savings (dvfs).

Having a generic container for performance domains might make a good
place to stuff all of this glue logic that we keep running into (e.g.
CPU and GPU max frequencies that are related), and it might make another
nice knob for the thermal folks to use.

For the case of the OMAP voltage domains, it would be a place to stuff
all of the VC/VP -> ABB -> Smart Reflex AVS stuff.

Anyways, I don't have a real proposal. I just don't want my name on
these patches since they are really yours now. I might resurrect them
some day in a different context.

Regards,
Mike

2014-02-25 20:58:29

by Nishanth Menon

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

Hi Mike,
On 02/24/2014 11:51 PM, Mike Turquette wrote:
> Quoting Nishanth Menon (2014-02-18 12:32:18)
>> From: Mike Turquette <[email protected]>
>>
>> This patch provides helper functions for drivers that wish to scale
>> voltage through the clock rate-change notifiers. The approach taken
>> is that the user-driver(cpufreq/devfreq) do not care about the
>> details of the OPP table, nor does it care about handling the voltage
>> regulator directly.
>>
>> By using the clk notifier flags, we are able to sequence the operations
>> in the right order. The current logic is heavily influenced by
>> implementation done in cpufreq-cpu0.
>>
>> [[email protected]: Fixes in logic, and broken out from clk to allow building
>> a generic voltagedomain solution independent of cpufreq]
>> Signed-off-by: Nishanth Menon <[email protected]>
>> Signed-off-by: Mike Turquette <[email protected]>
>
> Not-signed-off-by: Mike Turquette <[email protected]>
>
> I haven't reviewed this series and it is a pretty big deviation from my
> original RFC. You can have authorship of the patches if you want.

Sure, I had send a private note requesting clarification about the
authorship, but I guess I can take this as the response :).

>
> I'm not sure about trying to capture the "voltdm" as a core concept. It
> feels a bit unwieldy to me.

Considering it is a simple collation of regulators and SoC specific
"magic" which have to be operated in tandem to clock operation, Why
does it seem unwieldy? Usage of multiple voltage planes in a single
voltage domain concept does not seem unique to TI processors either:
For example, imx6q-cpufreq.c uses 3 regulators (arm, pu, soc),
s5pv210-cpufreq.c uses two regulators (vddarm, vddint), ideally OMAP
implementation would use two (vdd_mpu, vbb_mpu).

> I have wondered about making an abstract
> "performance domain" which is the dvfs analogue to generic power
> domains. This a reasonable split since gpd are good for idle power
> savings (e.g. clock gate, power gate, sleep state, etc) and "perf
> domains" would be good for active power savings (dvfs).
>
> Having a generic container for performance domains might make a good
> place to stuff all of this glue logic that we keep running into (e.g.
> CPU and GPU max frequencies that are related), and it might make another
> nice knob for the thermal folks to use.

This sounds like one level higher abstraction that we are speaking of
here? I was'nt intending to solve the bigger picture problem here -
just an abstraction level that might allow reusablity for multiple
SoCs. In fact, having an abstraction away for voltage domain(which may
consist of multiple regulators and any SoC specific magic) purely
allows us to move towards a direction you mention here.

>
> For the case of the OMAP voltage domains, it would be a place to stuff
> all of the VC/VP -> ABB -> Smart Reflex AVS stuff.
>

Unfortunately, I dont completely comprehend objection we have to this
approach (other than an higher level abstraction is needed) and if we
do have an objection, what is the alternate approach should be for
representing hardware which this series attempts to present.

--
Regards,
Nishanth Menon

2014-02-27 02:35:34

by Nishanth Menon

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

On 14:56-20140225, Nishanth Menon wrote:
> On 02/24/2014 11:51 PM, Mike Turquette wrote:
> > Quoting Nishanth Menon (2014-02-18 12:32:18)
[...]
> > I'm not sure about trying to capture the "voltdm" as a core concept. It
> > feels a bit unwieldy to me.
>
> Considering it is a simple collation of regulators and SoC specific
> "magic" which have to be operated in tandem to clock operation, Why
> does it seem unwieldy? Usage of multiple voltage planes in a single
> voltage domain concept does not seem unique to TI processors either:
> For example, imx6q-cpufreq.c uses 3 regulators (arm, pu, soc),
> s5pv210-cpufreq.c uses two regulators (vddarm, vddint), ideally OMAP
> implementation would use two (vdd_mpu, vbb_mpu).
>
> > I have wondered about making an abstract
> > "performance domain" which is the dvfs analogue to generic power
> > domains. This a reasonable split since gpd are good for idle power
> > savings (e.g. clock gate, power gate, sleep state, etc) and "perf
> > domains" would be good for active power savings (dvfs).
> >
> > Having a generic container for performance domains might make a good
> > place to stuff all of this glue logic that we keep running into (e.g.
> > CPU and GPU max frequencies that are related), and it might make another
> > nice knob for the thermal folks to use.
>
> This sounds like one level higher abstraction that we are speaking of
> here? I was'nt intending to solve the bigger picture problem here -
> just an abstraction level that might allow reusablity for multiple
> SoCs. In fact, having an abstraction away for voltage domain(which may
> consist of multiple regulators and any SoC specific magic) purely
> allows us to move towards a direction you mention here.
>
> >
> > For the case of the OMAP voltage domains, it would be a place to stuff
> > all of the VC/VP -> ABB -> Smart Reflex AVS stuff.
> >
>
> Unfortunately, I dont completely comprehend objection we have to this
> approach (other than an higher level abstraction is needed) and if we
> do have an objection, what is the alternate approach should be for
> representing hardware which this series attempts to present.

I think the following is around the lines of your thought direction -
if Rafael or others have comments on the following approach, it'd be a
good starting point for me to progress.

-->8--
>From 62e50b9f920495db88e5594aa6bceb52e83a443d Mon Sep 17 00:00:00 2001
From: Nishanth Menon <[email protected]>
Date: Wed, 26 Feb 2014 10:59:59 -0600
Subject: [PATCH] PM / Runtime: introduce active power management callbacks
for pm_domain

dev_pm_domain currently handles just device idle power management
using the generic pm_runtime_get|put and related family of functions.

Logically with appropriate pm_domain hooks this can translate to
hardware specific clock and related operations. Given that pm_domains
may contain this information, this provides an opportunity to extend
current pm_runtime do dynamic power operations as well.

What this means for drivers is as follows:

Today, drivers(with some level of complexity) do:
pm_runtime_get_sync(dev);
clk = clk_get(dev, "name");
old_rate = clk_get_rate(clk);
...
clk_set_rate(clk, new_rate);
...
clk_put(clk);
pm_runtime_get_sync(dev);

Instead, on pm_domains that can handle this as part of
pm_domain->active_ops functions, They can now do the following:
pm_runtime_get_sync(dev);
old_rate = pm_runtime_get_rate(dev);
...
pm_runtime_set_rate(dev, new_rate);
...
pm_runtime_put_sync(dev);

Obviously, this'd work for devices that handle a single main
functional clock, but this could reduce complexity of drivers having
to deal with power management details to have pm_runtime as the main
point of interface.

CAVEAT: For power domains that are capable of handling multiple
clocks (example on OMAP, where there are the concepts of interface,
functional and optional clocks per block), appropriate handling will
be necessary from pm_domain callbacks. So, the question about which
clock rate is being controlled or returned to is entirely upto the
pm_domain implementation.

On the otherhand, we can debate about defining and querying ACPI style
"Performance state" instead of frequencies and wrap P-states inside
or the other way around.. but given that majority of drivers using
pm_runtime would rather be interested in frequencies and my naieve
belief that we can index P-states with frequencies, kind of influenced
my choice here of proposing frequencies as base query parameter..
ofcourse, debate is still open here.

Yes, we can still debate if providing yet another wrapper on top of
clock APIs makes sense at all as well.

Nyet-signed-off-by: Nishanth Menon <[email protected]>
---
drivers/base/power/runtime.c | 101 ++++++++++++++++++++++++++++++++++++++++++
include/linux/pm.h | 25 +++++++++--
include/linux/pm_runtime.h | 21 +++++++++
3 files changed, 143 insertions(+), 4 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 72e00e6..ef230b4 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1401,3 +1401,104 @@ void pm_runtime_remove(struct device *dev)
if (dev->power.irq_safe && dev->parent)
pm_runtime_put(dev->parent);
}
+
+/**
+ * pm_runtime_get_rate() - Returns the device operational frequency
+ * @dev: Device to handle
+ * @rate: Returns rate in Hz.
+ *
+ * Returns appropriate error value in case of error conditions, else
+ * returns 0 and rate is updated. The pm_domain logic does all the necessary
+ * operation (which may consider magic hardware stuff) to provide the rate.
+ *
+ * NOTE: the rate returned is a snapshot and in many cases just a bypass
+ * to clk api to set the rate.
+ */
+int pm_runtime_get_rate(struct device *dev, unsigned long *rate)
+{
+ unsigned long flags;
+ int error = -ENOSYS;
+
+ if (!rate || !dev)
+ return -EINVAL;
+
+ spin_lock_irqsave(&dev->power.lock, flags);
+ if (!pm_runtime_active(dev)) {
+ error = -EINVAL;
+ goto out;
+ }
+
+ if (dev->pm_domain && dev->pm_domain->active_ops.get_rate)
+ error = dev->pm_domain->active_ops.get_rate(dev, rate);
+out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+
+ return error;
+}
+
+/**
+ * pm_runtime_set_rate() - Set a specific rate for the device operation
+ * @dev: Device to handle
+ * @rate: Rate to set in Hz
+ *
+ * Returns appropriate error value in case of error conditions, else
+ * returns 0. The pm_domain logic does all the necessary operation (which
+ * may include voltage scale operations or other magic hardware stuff) to
+ * achieve the operation. It is guarenteed that the requested rate is achieved
+ * on returning from this function if return value is 0.
+ */
+int pm_runtime_set_rate(struct device *dev, unsigned long rate)
+{
+ unsigned long flags;
+ int error = -ENOSYS;
+
+ if (!rate || !dev)
+ return -EINVAL;
+
+ spin_lock_irqsave(&dev->power.lock, flags);
+ if (!pm_runtime_active(dev)) {
+ error = -EINVAL;
+ goto out;
+ }
+
+ if (dev->pm_domain && dev->pm_domain->active_ops.set_rate)
+ error = dev->pm_domain->active_ops.set_rate(dev, rate);
+out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+
+ return error;
+}
+
+/**
+ * pm_runtime_get_transition_latency() - determine transition latency`
+ * @dev: Device to handle
+ * @from_rate: Transition from which rate
+ * @to_rate: Transition to which rate
+ *
+ * Returns appropriate error value in case of error conditions, else
+ * returns the latency in uSecs for transition between two given rates
+ */
+int pm_runtime_get_transition_latency(struct device *dev,
+ unsigned long from_rate,
+ unsigned long to_rate)
+{
+ unsigned long flags;
+ int error = -ENOSYS;
+
+ if (!from_rate || !to_rate || !dev)
+ return -EINVAL;
+
+ spin_lock_irqsave(&dev->power.lock, flags);
+ if (!pm_runtime_active(dev)) {
+ error = -EINVAL;
+ goto out;
+ }
+
+ if (dev->pm_domain && dev->pm_domain->active_ops.get_transition_latency)
+ error = dev->pm_domain->active_ops.get_transition_latency(dev,
+ from_rate, to_rate);
+out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+
+ return error;
+}
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 8c6583a..f907b27 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -299,6 +299,20 @@ struct dev_pm_ops {
int (*runtime_idle)(struct device *dev);
};

+/**
+ * struct dev_pm_active_ops - Active power management operations
+ * @get_rate: get the current operational frequency
+ * @set_rate: set the current operational frequency
+ * @get_transition_latency: get the transition latency in uSeconds
+ */
+struct dev_pm_active_ops {
+ int (*get_rate)(struct device *dev, unsigned long *rate);
+ int (*set_rate)(struct device *dev, unsigned long rate);
+ int (*get_transition_latency)(struct device *dev,
+ unsigned long from_rate,
+ unsigned long to_rate);
+};
+
#ifdef CONFIG_PM_SLEEP
#define SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) \
.suspend = suspend_fn, \
@@ -589,13 +603,16 @@ extern void update_pm_runtime_accounting(struct device *dev);
extern int dev_pm_get_subsys_data(struct device *dev);
extern int dev_pm_put_subsys_data(struct device *dev);

-/*
- * Power domains provide callbacks that are executed during system suspend,
- * hibernation, system resume and during runtime PM transitions along with
- * subsystem-level and driver-level callbacks.
+/**
+ * struct dev_pm_domain - power domain information
+ * @ops: Power domains provide callbacks that are executed during system
+ * suspend, hibernation, system resume and during runtime PM transitions
+ * along with subsystem-level and driver-level callbacks.
+ * @active_ops: Active operational callbacks
*/
struct dev_pm_domain {
struct dev_pm_ops ops;
+ struct dev_pm_active_ops active_ops;
};

/*
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index 16c9a62..731a6e4 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -112,6 +112,11 @@ static inline void pm_runtime_mark_last_busy(struct device *dev)
ACCESS_ONCE(dev->power.last_busy) = jiffies;
}

+extern int pm_runtime_get_rate(struct device *dev, unsigned long *rate);
+extern int pm_runtime_set_rate(struct device *dev, unsigned long rate);
+extern int pm_runtime_get_transition_latency(struct device *dev,
+ unsigned long from_rate,
+ unsigned long to_rate);
#else /* !CONFIG_PM_RUNTIME */

static inline int __pm_runtime_idle(struct device *dev, int rpmflags)
@@ -162,6 +167,22 @@ static inline unsigned long pm_runtime_autosuspend_expiration(
static inline void pm_runtime_set_memalloc_noio(struct device *dev,
bool enable){}

+static inline int pm_runtime_get_rate(struct device *dev, unsigned long *rate)
+{
+ return -ENOSYS;
+}
+
+static inline int pm_runtime_set_rate(struct device *dev, unsigned long rate)
+{
+ return -ENOSYS;
+}
+
+static inline int pm_runtime_get_transition_latency(struct device *dev,
+ unsigned long from_rate,
+ unsigned long to_rate)
+{
+ return -ENOSYS;
+}
#endif /* !CONFIG_PM_RUNTIME */

static inline int pm_runtime_idle(struct device *dev)
--
1.7.9.5

--
Regards,
Nishanth Menon

2014-02-27 05:00:51

by Mike Turquette

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

Quoting Nishanth Menon (2014-02-26 18:34:55)
> +/**
> + * pm_runtime_get_rate() - Returns the device operational frequency
> + * @dev: Device to handle
> + * @rate: Returns rate in Hz.
> + *
> + * Returns appropriate error value in case of error conditions, else
> + * returns 0 and rate is updated. The pm_domain logic does all the necessary
> + * operation (which may consider magic hardware stuff) to provide the rate.
> + *
> + * NOTE: the rate returned is a snapshot and in many cases just a bypass
> + * to clk api to set the rate.
> + */
> +int pm_runtime_get_rate(struct device *dev, unsigned long *rate)

Instead of "rate", how about we use "level" and leave it undefined as to
what that means? It would be equally valid for level to represent a
clock rate, or an opp from a table of opp's, or a p-state, or some value
passed to a PM microcontroller.

Code that is tightly coupled to the hardware would simply know what
value to use with no extra sugar.

Generic code would need to get the various supported "levels" populated
at run time, but a DT binding could do that, or a query to the ACPI
tables, or whatever.

> +{
> + unsigned long flags;
> + int error = -ENOSYS;
> +
> + if (!rate || !dev)
> + return -EINVAL;
> +
> + spin_lock_irqsave(&dev->power.lock, flags);
> + if (!pm_runtime_active(dev)) {
> + error = -EINVAL;
> + goto out;
> + }
> +
> + if (dev->pm_domain && dev->pm_domain->active_ops.get_rate)
> + error = dev->pm_domain->active_ops.get_rate(dev, rate);
> +out:
> + spin_unlock_irqrestore(&dev->power.lock, flags);
> +
> + return error;
> +}
> +
> +/**
> + * pm_runtime_set_rate() - Set a specific rate for the device operation
> + * @dev: Device to handle
> + * @rate: Rate to set in Hz
> + *
> + * Returns appropriate error value in case of error conditions, else
> + * returns 0. The pm_domain logic does all the necessary operation (which
> + * may include voltage scale operations or other magic hardware stuff) to
> + * achieve the operation. It is guarenteed that the requested rate is achieved
> + * on returning from this function if return value is 0.
> + */
> +int pm_runtime_set_rate(struct device *dev, unsigned long rate)

Additionally I wonder if the function signature should include a way to
specify the sub-unit of a device that we are operating on? This is a way
to tackle the issues you raised regarding multiple clocks per device,
etc. Two approaches come to mind:

int pm_runtime_set_rate(struct device *dev, int index,
unsigned long rate);

Where index is a sub-unit of struct device *dev. The second approach is
to create a publicly declared structure representing the sub-unit. Some
variations on that theme:

int pm_runtime_set_rate(struct perf_domain *perfdm, unsigned long rate);

or,

int pm_runtime_set_rate(struct generic_power_domain *gpd,
unsigned long rate);

or whatever that sub-unit looks like. The gpd thing might be a total
layering violation, I don't know. Or perhaps it's a decent idea but it
shouldn't be as a PM runtime call. Again, I dunno.

Regards,
Mike

2014-02-27 14:43:29

by Nishanth Menon

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

On 02/26/2014 11:00 PM, Mike Turquette wrote:
> Quoting Nishanth Menon (2014-02-26 18:34:55)
>> +/**
>> + * pm_runtime_get_rate() - Returns the device operational frequency
>> + * @dev: Device to handle
>> + * @rate: Returns rate in Hz.
>> + *
>> + * Returns appropriate error value in case of error conditions, else
>> + * returns 0 and rate is updated. The pm_domain logic does all the necessary
>> + * operation (which may consider magic hardware stuff) to provide the rate.
>> + *
>> + * NOTE: the rate returned is a snapshot and in many cases just a bypass
>> + * to clk api to set the rate.
>> + */
>> +int pm_runtime_get_rate(struct device *dev, unsigned long *rate)
>
> Instead of "rate", how about we use "level" and leave it undefined as to
> what that means? It would be equally valid for level to represent a
> clock rate, or an opp from a table of opp's, or a p-state, or some value
> passed to a PM microcontroller.

IMHO, from a driver which already exists for multiple SoCs/
architectures, we cannot have variant definitions that a generic
driver will be unable to depend upon. what should such a driver
expect? level == rate OR level == index to p-state or magic PM
controller value?

Today the definitions are very clear to such a driver, pm_runtime APIs
are used for device specific idle management, but for active
management, use clk and regulator code as needed - ofcourse, that
machine specificity triggers the need for the 50 odd cpufreq drivers
we have today - intent was to be able to abstract it enough for a
generic logic to exist.

>
> Code that is tightly coupled to the hardware would simply know what
> value to use with no extra sugar.

agreed on the machine specific implementation, but the logic at driver
level will then get tied down to machine specific implementation as well

>
> Generic code would need to get the various supported "levels" populated
> at run time, but a DT binding could do that, or a query to the ACPI
> tables, or whatever.

then we'd also have to introduce a translator API for drivers that
need frequency -> we go back to the old world of having specific
functions depending on usage in the frequency world, ACPI world, PM
controller world.

That completely breaks the concept of having a higher level function,
right?

>
>> +{
>> + unsigned long flags;
>> + int error = -ENOSYS;
>> +
>> + if (!rate || !dev)
>> + return -EINVAL;
>> +
>> + spin_lock_irqsave(&dev->power.lock, flags);
>> + if (!pm_runtime_active(dev)) {
>> + error = -EINVAL;
>> + goto out;
>> + }
>> +
>> + if (dev->pm_domain && dev->pm_domain->active_ops.get_rate)
>> + error = dev->pm_domain->active_ops.get_rate(dev, rate);
>> +out:
>> + spin_unlock_irqrestore(&dev->power.lock, flags);
>> +
>> + return error;
>> +}
>> +
>> +/**
>> + * pm_runtime_set_rate() - Set a specific rate for the device operation
>> + * @dev: Device to handle
>> + * @rate: Rate to set in Hz
>> + *
>> + * Returns appropriate error value in case of error conditions, else
>> + * returns 0. The pm_domain logic does all the necessary operation (which
>> + * may include voltage scale operations or other magic hardware stuff) to
>> + * achieve the operation. It is guarenteed that the requested rate is achieved
>> + * on returning from this function if return value is 0.
>> + */
>> +int pm_runtime_set_rate(struct device *dev, unsigned long rate)
>
> Additionally I wonder if the function signature should include a way to
> specify the sub-unit of a device that we are operating on? This is a way
> to tackle the issues you raised regarding multiple clocks per device,
> etc. Two approaches come to mind:
>
> int pm_runtime_set_rate(struct device *dev, int index,
> unsigned long rate);
>
> Where index is a sub-unit of struct device *dev.

Here the problem is trying to define what that index is. should it be
clk index? how again would a generic driver be able to consistently
function? lets, for a moment replace that with a string - "clk_name"
to uniquely identify the clk -> but then, it still, in concept makes
it no better than a clk_set_rate since we are uniquely identifying the
clk to operate upon -> and we can definitely add "magic dvfs" stuff on
existing clock framework without a need for additional wrappers (which
what the original $subject series does).


>The second approach is
> to create a publicly declared structure representing the sub-unit. Some
> variations on that theme:
>
> int pm_runtime_set_rate(struct perf_domain *perfdm, unsigned long rate);
>
> or,
>
> int pm_runtime_set_rate(struct generic_power_domain *gpd,
> unsigned long rate);
>
> or whatever that sub-unit looks like. The gpd thing might be a total
> layering violation, I don't know. Or perhaps it's a decent idea but it
> shouldn't be as a PM runtime call. Again, I dunno.

Again, we goes back to the same question, right? which clock in a
power_domain/perf_domain are we intending with the rate? a device
belongs to a perf domain -> Taking drivers/cpufreq/imx6q-cpufreq.c as
an example.
Clocks needed: arm_clk, pll1_sys_clk, pll1_sw_clk, step_clk,
pll2_pfd2_396m_clk.
regulators needed: arm_reg, pu_reg, soc_reg.

The device we want to set a frequency is the ARM processor. by
describing "perf_domain" or "generic power domain" as a single clock
entity, we might as well use clk_set_rate instead to be specific,
instead of writing one wrapper on top of the entire thing.


I apologize, more I think in this angle, less I think such a generic
api seems feasible.

--
Regards,
Nishanth Menon

2014-02-27 19:01:14

by Felipe Balbi

[permalink] [raw]
Subject: Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

Hi,

On Wed, Feb 26, 2014 at 08:34:55PM -0600, Nishanth Menon wrote:
> On 14:56-20140225, Nishanth Menon wrote:
> > On 02/24/2014 11:51 PM, Mike Turquette wrote:
> > > Quoting Nishanth Menon (2014-02-18 12:32:18)
> [...]
> > > I'm not sure about trying to capture the "voltdm" as a core concept. It
> > > feels a bit unwieldy to me.
> >
> > Considering it is a simple collation of regulators and SoC specific
> > "magic" which have to be operated in tandem to clock operation, Why
> > does it seem unwieldy? Usage of multiple voltage planes in a single
> > voltage domain concept does not seem unique to TI processors either:
> > For example, imx6q-cpufreq.c uses 3 regulators (arm, pu, soc),
> > s5pv210-cpufreq.c uses two regulators (vddarm, vddint), ideally OMAP
> > implementation would use two (vdd_mpu, vbb_mpu).
> >
> > > I have wondered about making an abstract
> > > "performance domain" which is the dvfs analogue to generic power
> > > domains. This a reasonable split since gpd are good for idle power
> > > savings (e.g. clock gate, power gate, sleep state, etc) and "perf
> > > domains" would be good for active power savings (dvfs).
> > >
> > > Having a generic container for performance domains might make a good
> > > place to stuff all of this glue logic that we keep running into (e.g.
> > > CPU and GPU max frequencies that are related), and it might make another
> > > nice knob for the thermal folks to use.
> >
> > This sounds like one level higher abstraction that we are speaking of
> > here? I was'nt intending to solve the bigger picture problem here -
> > just an abstraction level that might allow reusablity for multiple
> > SoCs. In fact, having an abstraction away for voltage domain(which may
> > consist of multiple regulators and any SoC specific magic) purely
> > allows us to move towards a direction you mention here.
> >
> > >
> > > For the case of the OMAP voltage domains, it would be a place to stuff
> > > all of the VC/VP -> ABB -> Smart Reflex AVS stuff.
> > >
> >
> > Unfortunately, I dont completely comprehend objection we have to this
> > approach (other than an higher level abstraction is needed) and if we
> > do have an objection, what is the alternate approach should be for
> > representing hardware which this series attempts to present.
>
> I think the following is around the lines of your thought direction -
> if Rafael or others have comments on the following approach, it'd be a
> good starting point for me to progress.
>
> -->8--
> From 62e50b9f920495db88e5594aa6bceb52e83a443d Mon Sep 17 00:00:00 2001
> From: Nishanth Menon <[email protected]>
> Date: Wed, 26 Feb 2014 10:59:59 -0600
> Subject: [PATCH] PM / Runtime: introduce active power management callbacks
> for pm_domain
>
> dev_pm_domain currently handles just device idle power management
> using the generic pm_runtime_get|put and related family of functions.
>
> Logically with appropriate pm_domain hooks this can translate to
> hardware specific clock and related operations. Given that pm_domains
> may contain this information, this provides an opportunity to extend
> current pm_runtime do dynamic power operations as well.
>
> What this means for drivers is as follows:
>
> Today, drivers(with some level of complexity) do:
> pm_runtime_get_sync(dev);
> clk = clk_get(dev, "name");
> old_rate = clk_get_rate(clk);
> ...
> clk_set_rate(clk, new_rate);
> ...
> clk_put(clk);
> pm_runtime_get_sync(dev);
>
> Instead, on pm_domains that can handle this as part of
> pm_domain->active_ops functions, They can now do the following:
> pm_runtime_get_sync(dev);
> old_rate = pm_runtime_get_rate(dev);
> ...
> pm_runtime_set_rate(dev, new_rate);
> ...
> pm_runtime_put_sync(dev);
>
> Obviously, this'd work for devices that handle a single main
> functional clock, but this could reduce complexity of drivers having
> to deal with power management details to have pm_runtime as the main
> point of interface.
>
> CAVEAT: For power domains that are capable of handling multiple
> clocks (example on OMAP, where there are the concepts of interface,
> functional and optional clocks per block), appropriate handling will
> be necessary from pm_domain callbacks. So, the question about which
> clock rate is being controlled or returned to is entirely upto the
> pm_domain implementation.
>
> On the otherhand, we can debate about defining and querying ACPI style
> "Performance state" instead of frequencies and wrap P-states inside
> or the other way around.. but given that majority of drivers using
> pm_runtime would rather be interested in frequencies and my naieve
> belief that we can index P-states with frequencies, kind of influenced
> my choice here of proposing frequencies as base query parameter..
> ofcourse, debate is still open here.
>
> Yes, we can still debate if providing yet another wrapper on top of
> clock APIs makes sense at all as well.
>
> Nyet-signed-off-by: Nishanth Menon <[email protected]>
> ---
> drivers/base/power/runtime.c | 101 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/pm.h | 25 +++++++++--
> include/linux/pm_runtime.h | 21 +++++++++
> 3 files changed, 143 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 72e00e6..ef230b4 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1401,3 +1401,104 @@ void pm_runtime_remove(struct device *dev)
> if (dev->power.irq_safe && dev->parent)
> pm_runtime_put(dev->parent);
> }
> +
> +/**
> + * pm_runtime_get_rate() - Returns the device operational frequency
> + * @dev: Device to handle
> + * @rate: Returns rate in Hz.
> + *
> + * Returns appropriate error value in case of error conditions, else
> + * returns 0 and rate is updated. The pm_domain logic does all the necessary
> + * operation (which may consider magic hardware stuff) to provide the rate.
> + *
> + * NOTE: the rate returned is a snapshot and in many cases just a bypass
> + * to clk api to set the rate.
> + */
> +int pm_runtime_get_rate(struct device *dev, unsigned long *rate)
> +{
> + unsigned long flags;
> + int error = -ENOSYS;
> +
> + if (!rate || !dev)
> + return -EINVAL;
> +
> + spin_lock_irqsave(&dev->power.lock, flags);
> + if (!pm_runtime_active(dev)) {
> + error = -EINVAL;
> + goto out;
> + }
> +
> + if (dev->pm_domain && dev->pm_domain->active_ops.get_rate)
> + error = dev->pm_domain->active_ops.get_rate(dev, rate);
> +out:
> + spin_unlock_irqrestore(&dev->power.lock, flags);
> +
> + return error;
> +}

IMHO coupling device drivers even more with pm_runtime is wrong, and
Kevin Hilman seems to agree [1].

I would much rather go with Nishanth's initial approach of subscribing
to clock notifiers. They are, after all, supposed to tell the kernel
about any clock changes. In just so happens that in the case discussed
in this thread, OMAP needs to change voltages to match clock frequency
and, IMHO, using clock notifiers for that is correct.

The sematics are well defined and it's something which has been in the
tree for quite some time.

[1] https://lkml.org/lkml/2014/1/30/469

--
balbi


Attachments:
(No filename) (6.97 kB)
signature.asc (819.00 B)
Digital signature
Download all attachments