Here is a proposal to extend the OPP bindings with bandwidth based on
a previous discussion [1].
Every functional block on a SoC can contribute to the system power
efficiency by expressing its own bandwidth needs (to memory or other SoC
modules). This will allow the system to save power when high throughput
is not required (and also provide maximum throughput when needed).
There are at least three ways for a device to determine its bandwidth
needs:
1. The device can dynamically calculate the needed bandwidth
based on some known variable. For example: UART (baud rate), I2C (fast
mode, high-speed mode, etc), USB (specification version, data transfer
type), SDHC (SD standard, clock rate, bus-width), Video Encoder/Decoder
(video format, resolution, frame-rate)
2. There is a hardware specific value. For example: hardware
specific constant value (e.g. for PRNG) or use-case specific value that
is hard-coded.
3. Predefined SoC/board specific bandwidth values. For example:
CPU or GPU bandwidth is related to the current core frequency and both
bandwidth and frequency are scaled together.
This patchset is trying to address point 3 above by extending the OPP
bindings to support predefined SoC/board bandwidth values and adds
support in cpufreq-dt to scale the interconnect between the CPU and the
DDR together with frequency and voltage.
[1] https://patchwork.kernel.org/patch/10577315/
Georgi Djakov (4):
dt-bindings: opp: Introduce opp-bw-MBs bindings
OPP: Add support for parsing the interconnect bandwidth
OPP: Update the bandwidth on OPP frequency changes
cpufreq: dt: Add support for interconnect bandwidth scaling
Documentation/devicetree/bindings/opp/opp.txt | 45 ++++++++++++
drivers/cpufreq/cpufreq-dt.c | 27 ++++++-
drivers/opp/core.c | 71 +++++++++++++++++++
drivers/opp/of.c | 44 ++++++++++++
drivers/opp/opp.h | 6 ++
include/linux/pm_opp.h | 14 ++++
6 files changed, 206 insertions(+), 1 deletion(-)
In addition to frequency and voltage, some devices may have bandwidth
requirements for their interconnect throughput - for example a CPU
or GPU may also need to increase or decrease their bandwidth to DDR
memory based on the current operating performance point.
Extend the OPP tables with additional property to describe the bandwidth
needs of a device. The average and peak bandwidth values depend on the
hardware and its properties.
Signed-off-by: Georgi Djakov <[email protected]>
---
Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 76b6c79604a5..fa598264615f 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -129,6 +129,9 @@ Optional properties:
- opp-microamp-<name>: Named opp-microamp property. Similar to
opp-microvolt-<name> property, but for microamp instead.
+- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
+ the two integer values for average and peak bandwidth in megabytes per second.
+
- opp-level: A value representing the performance level of the device,
expressed as a 32-bit integer.
@@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
};
};
};
+
+Example 7: opp-bw-MBs:
+(example: average and peak bandwidth values are defined for each OPP and the
+interconnect between CPU and DDR memory is scaled together with CPU frequency)
+
+/ {
+ cpus {
+ CPU0: cpu@0 {
+ compatible = "arm,cortex-a53", "arm,armv8";
+ ...
+ operating-points-v2 = <&cpu_opp_table>;
+ /* path between the CPU and DDR memory */
+ interconnects = <&rpm_bimc MASTER_AMPSS_M0
+ &rpm_bimc SLAVE_EBI_CH0>;
+ };
+ };
+
+ cpu_opp_table: cpu_opp_table {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp-200000000 {
+ opp-hz = /bits/ 64 <200000000>;
+ /* 457 MB/s average and 1525 MB/s peak bandwidth */
+ opp-bw-MBs = <457 1525>;
+ };
+ opp-400000000 {
+ opp-hz = /bits/ 64 <400000000>;
+ /* 915 MB/s average and 3051 MB/s peak bandwidth */
+ opp-bw-MBs = <915 3051>;
+ };
+ opp-800000000 {
+ opp-hz = /bits/ 64 <800000000>;
+ /* 1830 MB/s average and 6103 MB/s peak bandwidth */
+ opp-bw-MBs = <1830 6103>;
+ };
+ opp-998400000 {
+ opp-hz = /bits/ 64 <998400000>;
+ /* 2282 MB/s average and 7614 MB/s peak bandwidth */
+ opp-bw-MBs = <2284 7614>;
+ };
+ };
The OPP bindings now support bandwidth values, so add support to parse it
from device tree and store it into the new dev_pm_opp_icc_bw struct, which
is part of the dev_pm_opp.
Also add and export the dev_pm_opp_set_path() and dev_pm_opp_put_path()
helpers, to set (and release) an interconnect path to a device. The
bandwidth of this path will be updated when the OPPs are switched.
Signed-off-by: Georgi Djakov <[email protected]>
---
drivers/opp/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++
drivers/opp/of.c | 44 +++++++++++++++++++++++++++
drivers/opp/opp.h | 6 ++++
include/linux/pm_opp.h | 14 +++++++++
4 files changed, 131 insertions(+)
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index e06a0ab05ad6..4b019cecaa07 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -19,6 +19,7 @@
#include <linux/slab.h>
#include <linux/device.h>
#include <linux/export.h>
+#include <linux/interconnect.h>
#include <linux/pm_domain.h>
#include <linux/regulator/consumer.h>
@@ -1645,6 +1646,72 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
}
EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
+/**
+ * dev_pm_opp_set_path() - Set interconnect path for a device
+ * @dev: Device for which interconnect path is being set.
+ * @name: Interconnect path name or NULL.
+ *
+ * This must be called before any OPPs are initialized for the device.
+ */
+struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name)
+{
+ struct opp_table *opp_table;
+ int ret;
+
+ opp_table = dev_pm_opp_get_opp_table(dev);
+ if (!opp_table)
+ return ERR_PTR(-ENOMEM);
+
+ /* This should be called before OPPs are initialized */
+ if (WARN_ON(!list_empty(&opp_table->opp_list))) {
+ ret = -EBUSY;
+ goto err;
+ }
+
+ /* Another CPU that shares the OPP table has set the path */
+ if (opp_table->path)
+ return opp_table;
+
+ /* Find interconnect path for the device */
+ opp_table->path = of_icc_get(dev, name);
+ if (IS_ERR(opp_table->path)) {
+ ret = PTR_ERR(opp_table->clk);
+ if (ret != -EPROBE_DEFER) {
+ dev_err(dev, "%s: Couldn't find path: %d\n", __func__,
+ ret);
+ }
+ goto err;
+ }
+
+ return opp_table;
+
+err:
+ dev_pm_opp_put_opp_table(opp_table);
+
+ return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_set_path);
+
+/**
+ * dev_pm_opp_put_path() - Release interconnect path resources
+ * @opp_table: OPP table returned from dev_pm_opp_set_path().
+ */
+void dev_pm_opp_put_path(struct opp_table *opp_table)
+{
+ if (!opp_table->path)
+ goto put_opp_table;
+
+ /* Make sure there are no concurrent readers while updating opp_table */
+ WARN_ON(!list_empty(&opp_table->opp_list));
+
+ icc_put(opp_table->path);
+ opp_table->path = NULL;
+
+put_opp_table:
+ dev_pm_opp_put_opp_table(opp_table);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_put_path);
+
/**
* dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper
* @dev: Device for which the helper is getting registered.
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 1779f2c93291..96fb7fdda8c7 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -16,6 +16,7 @@
#include <linux/cpu.h>
#include <linux/errno.h>
#include <linux/device.h>
+#include <linux/interconnect.h>
#include <linux/of_device.h>
#include <linux/pm_domain.h>
#include <linux/slab.h>
@@ -526,6 +527,45 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
return ret;
}
+static int opp_parse_icc_bw(struct dev_pm_opp *opp, struct device *dev,
+ struct opp_table *opp_table)
+{
+ struct property *prop = NULL;
+ char name[NAME_MAX];
+ int count;
+ u32 avg = 0;
+ u32 peak = 0;
+
+ /* Search for "opp-bw-MBs" */
+ sprintf(name, "opp-bw-MBs");
+ prop = of_find_property(opp->np, name, NULL);
+
+ /* Missing property is not a problem */
+ if (!prop) {
+ dev_dbg(dev, "%s: Missing opp-bw-MBs\n", __func__);
+ return 0;
+ }
+
+ count = of_property_count_u32_elems(opp->np, name);
+ if (count != 2) {
+ dev_err(dev, "%s: Invalid number of elements in %s property\n",
+ __func__, name);
+ return -EINVAL;
+ }
+
+ opp->bandwidth = kzalloc(sizeof(*opp->bandwidth), GFP_KERNEL);
+ if (!opp->bandwidth)
+ return -ENOMEM;
+
+ of_property_read_u32_index(opp->np, name, 0, &avg);
+ of_property_read_u32_index(opp->np, name, 1, &peak);
+
+ opp->bandwidth->avg = MBps_to_icc(avg);
+ opp->bandwidth->peak = MBps_to_icc(peak);
+
+ return 0;
+}
+
/**
* dev_pm_opp_of_remove_table() - Free OPP table entries created from static DT
* entries
@@ -619,6 +659,10 @@ static struct dev_pm_opp *_opp_add_static_v2(struct opp_table *opp_table,
if (ret)
goto free_required_opps;
+ ret = opp_parse_icc_bw(new_opp, dev, opp_table);
+ if (ret)
+ goto free_required_opps;
+
if (opp_table->is_genpd)
new_opp->pstate = pm_genpd_opp_to_performance_state(dev, new_opp);
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index 4458175aa661..b4287d065c24 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -24,6 +24,7 @@
struct clk;
struct regulator;
+struct icc_path;
/* Lock to allow exclusive modification to the device and opp lists */
extern struct mutex opp_table_lock;
@@ -62,6 +63,7 @@ extern struct list_head opp_tables;
* @rate: Frequency in hertz
* @level: Performance level
* @supplies: Power supplies voltage/current values
+ * @bandwidth: Interconnect bandwidth values
* @clock_latency_ns: Latency (in nanoseconds) of switching to this OPP's
* frequency from any other OPP's frequency.
* @required_opps: List of OPPs that are required by this OPP.
@@ -85,6 +87,8 @@ struct dev_pm_opp {
struct dev_pm_opp_supply *supplies;
+ struct dev_pm_opp_icc_bw *bandwidth;
+
unsigned long clock_latency_ns;
struct dev_pm_opp **required_opps;
@@ -152,6 +156,7 @@ enum opp_table_access {
* property).
* @genpd_performance_state: Device's power domain support performance state.
* @is_genpd: Marks if the OPP table belongs to a genpd.
+ * @path: Interconnect path handle
* @set_opp: Platform specific set_opp callback
* @set_opp_data: Data to be passed to set_opp callback
* @dentry: debugfs dentry pointer of the real device directory (not links).
@@ -196,6 +201,7 @@ struct opp_table {
int regulator_count;
bool genpd_performance_state;
bool is_genpd;
+ struct icc_path *path;
int (*set_opp)(struct dev_pm_set_opp_data *data);
struct dev_pm_set_opp_data *set_opp_data;
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 900359342965..5edce71a15d6 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -43,6 +43,18 @@ struct dev_pm_opp_supply {
unsigned long u_amp;
};
+/**
+ * struct dev_pm_opp_icc_bw - Interconnect bandwidth values
+ * @avg: Average bandwidth corresponding to this OPP (in icc units)
+ * @peak: Peak bandwidth corresponding to this OPP (in icc units)
+ *
+ * This structure stores the bandwidth values for a single interconnect path.
+ */
+struct dev_pm_opp_icc_bw {
+ u32 avg;
+ u32 peak;
+};
+
/**
* struct dev_pm_opp_info - OPP freq/voltage/current values
* @rate: Target clk rate in hz
@@ -127,6 +139,8 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * con
void dev_pm_opp_put_regulators(struct opp_table *opp_table);
struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char * name);
void dev_pm_opp_put_clkname(struct opp_table *opp_table);
+struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name);
+void dev_pm_opp_put_path(struct opp_table *opp_table);
struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
void dev_pm_opp_unregister_set_opp_helper(struct opp_table *opp_table);
struct opp_table *dev_pm_opp_set_genpd_virt_dev(struct device *dev, struct device *virt_dev, int index);
In addition to clocks and regulators, some devices can scale the bandwidth
of their on-chip interconnect - for example between CPU and DDR memory. Add
support for that, so that platforms which support it can make use of it.
Signed-off-by: Georgi Djakov <[email protected]>
---
drivers/cpufreq/cpufreq-dt.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index e58bfcb1169e..30fed0fc266d 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -17,6 +17,7 @@
#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/err.h>
+#include <linux/interconnect.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/pm_opp.h>
@@ -100,6 +101,7 @@ static int resources_available(void)
struct device *cpu_dev;
struct regulator *cpu_reg;
struct clk *cpu_clk;
+ struct icc_path *cpu_path;
int ret = 0;
const char *name;
@@ -126,6 +128,19 @@ static int resources_available(void)
clk_put(cpu_clk);
+ cpu_path = of_icc_get(cpu_dev, NULL);
+ ret = PTR_ERR_OR_ZERO(cpu_path);
+ if (ret) {
+ if (ret == -EPROBE_DEFER)
+ dev_dbg(cpu_dev, "defer icc path: %d\n", ret);
+ else
+ dev_err(cpu_dev, "failed to get icc path: %d\n", ret);
+
+ return ret;
+ }
+
+ icc_put(cpu_path);
+
name = find_supply_name(cpu_dev);
/* Platform doesn't require regulator */
if (!name)
@@ -205,10 +220,18 @@ static int cpufreq_init(struct cpufreq_policy *policy)
}
}
+ opp_table = dev_pm_opp_set_path(cpu_dev, NULL);
+ if (IS_ERR(opp_table)) {
+ ret = PTR_ERR(opp_table);
+ dev_err(cpu_dev, "Failed to set interconnect path for cpu%d: %d\n",
+ policy->cpu, ret);
+ goto out_put_regulator;
+ }
+
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) {
ret = -ENOMEM;
- goto out_put_regulator;
+ goto out_put_path;
}
priv->reg_name = name;
@@ -288,6 +311,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
if (priv->have_static_opps)
dev_pm_opp_of_cpumask_remove_table(policy->cpus);
kfree(priv);
+out_put_path:
+ dev_pm_opp_put_path(opp_table);
out_put_regulator:
if (name)
dev_pm_opp_put_regulators(opp_table);
If the OPP bandwidth values are populated, we want to switch also the
interconnect bandwidth in addition to frequency and voltage.
Signed-off-by: Georgi Djakov <[email protected]>
---
drivers/opp/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 4b019cecaa07..99e7c4cf6c34 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -781,6 +781,10 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
}
+ if (!ret && !IS_ERR_OR_NULL(opp_table->path) && opp->bandwidth)
+ icc_set_bw(opp_table->path, opp->bandwidth->avg,
+ opp->bandwidth->peak);
+
/* Scaling down? Configure required OPPs after frequency */
if (!ret && freq < old_freq) {
ret = _set_required_opps(dev, opp_table, opp);
On 13-03-19, 11:00, Georgi Djakov wrote:
> In addition to frequency and voltage, some devices may have bandwidth
> requirements for their interconnect throughput - for example a CPU
> or GPU may also need to increase or decrease their bandwidth to DDR
> memory based on the current operating performance point.
>
> Extend the OPP tables with additional property to describe the bandwidth
> needs of a device. The average and peak bandwidth values depend on the
> hardware and its properties.
>
> Signed-off-by: Georgi Djakov <[email protected]>
> ---
> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
> index 76b6c79604a5..fa598264615f 100644
> --- a/Documentation/devicetree/bindings/opp/opp.txt
> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> @@ -129,6 +129,9 @@ Optional properties:
> - opp-microamp-<name>: Named opp-microamp property. Similar to
> opp-microvolt-<name> property, but for microamp instead.
>
> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
> + the two integer values for average and peak bandwidth in megabytes per second.
> +
> - opp-level: A value representing the performance level of the device,
> expressed as a 32-bit integer.
>
> @@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
> };
> };
> };
> +
> +Example 7: opp-bw-MBs:
> +(example: average and peak bandwidth values are defined for each OPP and the
> +interconnect between CPU and DDR memory is scaled together with CPU frequency)
> +
> +/ {
> + cpus {
> + CPU0: cpu@0 {
> + compatible = "arm,cortex-a53", "arm,armv8";
> + ...
> + operating-points-v2 = <&cpu_opp_table>;
> + /* path between the CPU and DDR memory */
> + interconnects = <&rpm_bimc MASTER_AMPSS_M0
> + &rpm_bimc SLAVE_EBI_CH0>;
Can we have multiple paths for a device ?
> + };
> + };
> +
> + cpu_opp_table: cpu_opp_table {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp-200000000 {
> + opp-hz = /bits/ 64 <200000000>;
> + /* 457 MB/s average and 1525 MB/s peak bandwidth */
> + opp-bw-MBs = <457 1525>;
In that case fixing this to just 2 entries in the array is incorrect
and we should take care of that in the bindings here.
> + };
> + opp-400000000 {
> + opp-hz = /bits/ 64 <400000000>;
> + /* 915 MB/s average and 3051 MB/s peak bandwidth */
> + opp-bw-MBs = <915 3051>;
> + };
> + opp-800000000 {
> + opp-hz = /bits/ 64 <800000000>;
> + /* 1830 MB/s average and 6103 MB/s peak bandwidth */
> + opp-bw-MBs = <1830 6103>;
> + };
> + opp-998400000 {
> + opp-hz = /bits/ 64 <998400000>;
> + /* 2282 MB/s average and 7614 MB/s peak bandwidth */
> + opp-bw-MBs = <2284 7614>;
> + };
> + };
--
viresh
On 13-03-19, 11:00, Georgi Djakov wrote:
> The OPP bindings now support bandwidth values, so add support to parse it
> from device tree and store it into the new dev_pm_opp_icc_bw struct, which
> is part of the dev_pm_opp.
>
> Also add and export the dev_pm_opp_set_path() and dev_pm_opp_put_path()
> helpers, to set (and release) an interconnect path to a device. The
> bandwidth of this path will be updated when the OPPs are switched.
>
> Signed-off-by: Georgi Djakov <[email protected]>
> ---
> drivers/opp/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++
> drivers/opp/of.c | 44 +++++++++++++++++++++++++++
> drivers/opp/opp.h | 6 ++++
> include/linux/pm_opp.h | 14 +++++++++
> 4 files changed, 131 insertions(+)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index e06a0ab05ad6..4b019cecaa07 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -19,6 +19,7 @@
> #include <linux/slab.h>
> #include <linux/device.h>
> #include <linux/export.h>
> +#include <linux/interconnect.h>
> #include <linux/pm_domain.h>
> #include <linux/regulator/consumer.h>
>
> @@ -1645,6 +1646,72 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
> }
> EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
>
> +/**
> + * dev_pm_opp_set_path() - Set interconnect path for a device
> + * @dev: Device for which interconnect path is being set.
> + * @name: Interconnect path name or NULL.
> + *
> + * This must be called before any OPPs are initialized for the device.
> + */
> +struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name)
Maybe the OPP core can do it itself in a similar way to how we do
clk_get() today ?
> +{
> + struct opp_table *opp_table;
> + int ret;
> +
> + opp_table = dev_pm_opp_get_opp_table(dev);
> + if (!opp_table)
> + return ERR_PTR(-ENOMEM);
> +
> + /* This should be called before OPPs are initialized */
> + if (WARN_ON(!list_empty(&opp_table->opp_list))) {
> + ret = -EBUSY;
> + goto err;
> + }
> +
> + /* Another CPU that shares the OPP table has set the path */
> + if (opp_table->path)
> + return opp_table;
> +
> + /* Find interconnect path for the device */
> + opp_table->path = of_icc_get(dev, name);
> + if (IS_ERR(opp_table->path)) {
> + ret = PTR_ERR(opp_table->clk);
> + if (ret != -EPROBE_DEFER) {
> + dev_err(dev, "%s: Couldn't find path: %d\n", __func__,
> + ret);
> + }
> + goto err;
> + }
> +
> + return opp_table;
> +
> +err:
> + dev_pm_opp_put_opp_table(opp_table);
> +
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_set_path);
> +
> +/**
> + * dev_pm_opp_put_path() - Release interconnect path resources
> + * @opp_table: OPP table returned from dev_pm_opp_set_path().
> + */
> +void dev_pm_opp_put_path(struct opp_table *opp_table)
> +{
> + if (!opp_table->path)
> + goto put_opp_table;
> +
> + /* Make sure there are no concurrent readers while updating opp_table */
> + WARN_ON(!list_empty(&opp_table->opp_list));
> +
> + icc_put(opp_table->path);
> + opp_table->path = NULL;
> +
> +put_opp_table:
> + dev_pm_opp_put_opp_table(opp_table);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_put_path);
> +
> /**
> * dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper
> * @dev: Device for which the helper is getting registered.
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 1779f2c93291..96fb7fdda8c7 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -16,6 +16,7 @@
> #include <linux/cpu.h>
> #include <linux/errno.h>
> #include <linux/device.h>
> +#include <linux/interconnect.h>
> #include <linux/of_device.h>
> #include <linux/pm_domain.h>
> #include <linux/slab.h>
> @@ -526,6 +527,45 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
> return ret;
> }
>
> +static int opp_parse_icc_bw(struct dev_pm_opp *opp, struct device *dev,
> + struct opp_table *opp_table)
> +{
> + struct property *prop = NULL;
> + char name[NAME_MAX];
> + int count;
> + u32 avg = 0;
> + u32 peak = 0;
Why init to 0 ?
> +
> + /* Search for "opp-bw-MBs" */
> + sprintf(name, "opp-bw-MBs");
> + prop = of_find_property(opp->np, name, NULL);
> +
> + /* Missing property is not a problem */
> + if (!prop) {
> + dev_dbg(dev, "%s: Missing opp-bw-MBs\n", __func__);
> + return 0;
> + }
> +
> + count = of_property_count_u32_elems(opp->np, name);
> + if (count != 2) {
> + dev_err(dev, "%s: Invalid number of elements in %s property\n",
> + __func__, name);
> + return -EINVAL;
> + }
> +
> + opp->bandwidth = kzalloc(sizeof(*opp->bandwidth), GFP_KERNEL);
You forgot to free it.
> + if (!opp->bandwidth)
> + return -ENOMEM;
> +
> + of_property_read_u32_index(opp->np, name, 0, &avg);
> + of_property_read_u32_index(opp->np, name, 1, &peak);
> +
> + opp->bandwidth->avg = MBps_to_icc(avg);
> + opp->bandwidth->peak = MBps_to_icc(peak);
> +
> + return 0;
> +}
> +
> /**
> * dev_pm_opp_of_remove_table() - Free OPP table entries created from static DT
> * entries
> @@ -619,6 +659,10 @@ static struct dev_pm_opp *_opp_add_static_v2(struct opp_table *opp_table,
> if (ret)
> goto free_required_opps;
>
> + ret = opp_parse_icc_bw(new_opp, dev, opp_table);
> + if (ret)
> + goto free_required_opps;
> +
> if (opp_table->is_genpd)
> new_opp->pstate = pm_genpd_opp_to_performance_state(dev, new_opp);
>
> diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
> index 4458175aa661..b4287d065c24 100644
> --- a/drivers/opp/opp.h
> +++ b/drivers/opp/opp.h
> @@ -24,6 +24,7 @@
>
> struct clk;
> struct regulator;
> +struct icc_path;
>
> /* Lock to allow exclusive modification to the device and opp lists */
> extern struct mutex opp_table_lock;
> @@ -62,6 +63,7 @@ extern struct list_head opp_tables;
> * @rate: Frequency in hertz
> * @level: Performance level
> * @supplies: Power supplies voltage/current values
> + * @bandwidth: Interconnect bandwidth values
> * @clock_latency_ns: Latency (in nanoseconds) of switching to this OPP's
> * frequency from any other OPP's frequency.
> * @required_opps: List of OPPs that are required by this OPP.
> @@ -85,6 +87,8 @@ struct dev_pm_opp {
>
> struct dev_pm_opp_supply *supplies;
>
> + struct dev_pm_opp_icc_bw *bandwidth;
> +
> unsigned long clock_latency_ns;
>
> struct dev_pm_opp **required_opps;
> @@ -152,6 +156,7 @@ enum opp_table_access {
> * property).
> * @genpd_performance_state: Device's power domain support performance state.
> * @is_genpd: Marks if the OPP table belongs to a genpd.
> + * @path: Interconnect path handle
> * @set_opp: Platform specific set_opp callback
> * @set_opp_data: Data to be passed to set_opp callback
> * @dentry: debugfs dentry pointer of the real device directory (not links).
> @@ -196,6 +201,7 @@ struct opp_table {
> int regulator_count;
> bool genpd_performance_state;
> bool is_genpd;
> + struct icc_path *path;
>
> int (*set_opp)(struct dev_pm_set_opp_data *data);
> struct dev_pm_set_opp_data *set_opp_data;
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 900359342965..5edce71a15d6 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -43,6 +43,18 @@ struct dev_pm_opp_supply {
> unsigned long u_amp;
> };
>
> +/**
> + * struct dev_pm_opp_icc_bw - Interconnect bandwidth values
> + * @avg: Average bandwidth corresponding to this OPP (in icc units)
> + * @peak: Peak bandwidth corresponding to this OPP (in icc units)
> + *
> + * This structure stores the bandwidth values for a single interconnect path.
> + */
> +struct dev_pm_opp_icc_bw {
> + u32 avg;
> + u32 peak;
> +};
There is only one user of this structure, maybe we can directly add
the elements in teh dev_pm_opp structure.
> +
> /**
> * struct dev_pm_opp_info - OPP freq/voltage/current values
> * @rate: Target clk rate in hz
> @@ -127,6 +139,8 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * con
> void dev_pm_opp_put_regulators(struct opp_table *opp_table);
> struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char * name);
> void dev_pm_opp_put_clkname(struct opp_table *opp_table);
> +struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name);
> +void dev_pm_opp_put_path(struct opp_table *opp_table);
> struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
> void dev_pm_opp_unregister_set_opp_helper(struct opp_table *opp_table);
> struct opp_table *dev_pm_opp_set_genpd_virt_dev(struct device *dev, struct device *virt_dev, int index);
--
viresh
On 3/13/19 2:30 PM, Georgi Djakov wrote:
> Here is a proposal to extend the OPP bindings with bandwidth based on
> a previous discussion [1].
>
> Every functional block on a SoC can contribute to the system power
> efficiency by expressing its own bandwidth needs (to memory or other SoC
> modules). This will allow the system to save power when high throughput
> is not required (and also provide maximum throughput when needed).
>
> There are at least three ways for a device to determine its bandwidth
> needs:
> 1. The device can dynamically calculate the needed bandwidth
> based on some known variable. For example: UART (baud rate), I2C (fast
> mode, high-speed mode, etc), USB (specification version, data transfer
> type), SDHC (SD standard, clock rate, bus-width), Video Encoder/Decoder
> (video format, resolution, frame-rate)
>
> 2. There is a hardware specific value. For example: hardware
> specific constant value (e.g. for PRNG) or use-case specific value that
> is hard-coded.
>
> 3. Predefined SoC/board specific bandwidth values. For example:
> CPU or GPU bandwidth is related to the current core frequency and both
> bandwidth and frequency are scaled together.
>
> This patchset is trying to address point 3 above by extending the OPP
> bindings to support predefined SoC/board bandwidth values and adds
> support in cpufreq-dt to scale the interconnect between the CPU and the
> DDR together with frequency and voltage.
Hey Georgi,
Having opp-bw-MBps as a part of cpu opp does greatly simplify the
problem of scaling multiple interconnect devices with change in cpu
frequency. But there is still a need to scale other devices (non
interconnect based) according to cpu frequency. Having a devfreq
governor for the same would help to have the same generic solution
across SoCs (msm8916/8996/qcs405/sdm845). The devfreq maintainer did
like the idea but wanted it incorporated into the passive governor.
*
https://lore.kernel.org/lkml/20180528060014epcms1p87ec68a4d44f9447b06f979a87e545b7d@epcms1p8/
*
https://lore.kernel.org/lkml/20180802095608epcms1p33fb061543efc9ceb3ec12d5567ceffbc@epcms1p3/
I have a RFC series implementing ddr scaling with passive governor for
sdm845 with the following bindings, will post it early next week.
cpus {
...
CPU0: cpu@0 {
...
operating-points-v2 = <&cpu0_opp_table>;
...
};
....
CPU4: cpu@400 {
...
operating-points-v2 = <&cpu4_opp_table>;
...
};
...
};
cpu0_opp_table: cpu0_opp_table {
compatible = "operating-points-v2";
opp-shared;
cpu0_opp1: opp-300000000 {
opp-hz = /bits/ 64 <300000000>;
};
...
cpu0_opp16: opp-1612800000 {
opp-hz = /bits/ 64 <1612800000>;
};
...
};
cpu4_opp_table: cpu4_opp_table {
compatible = "operating-points-v2";
opp-shared;
...
cpu4_opp4: opp-1056000000 {
opp-hz = /bits/ 64 <1056000000>;
};
cpu4_opp5: opp-1209600000 {
opp-hz = /bits/ 64 <1209600000>;
};
...
};
bw_opp_table: bw-opp-table {
compatible = "operating-points-v2";
opp-200 {
opp-hz = /bits/ 64 < 200000000 >; /* 200 MHz */
required-opps = <&cpu0_opp1>;
/* 0 MB/s average and 762 MB/s peak bandwidth */
opp-bw-MBs = <0 762>;
};
opp-300 {
opp-hz = /bits/ 64 < 300000000 >; /* 300 MHz */
/* 0 MB/s average and 1144 MB/s peak bandwidth */
opp-bw-MBs = <0 1144>;
};
...
opp-768 {
opp-hz = /bits/ 64 < 768000000 >; /* 768 MHz */
/* 0 MB/s average and 2929 MB/s peak bandwidth */
opp-bw-MBs = <0 2929>;
required-opps = <&cpu4_opp4>;
};
opp-1017 {
opp-hz = /bits/ 64 < 1017000000 >; /* 1017 MHz */
/* 0 MB/s average and 3879 MB/s peak bandwidth */
opp-bw-MBs = <0 3879>;
required-opps = <&cpu0_opp16>, <&cpu4_opp5>;
};
};
cpubw {
compatible = "devfreq-icbw";
interconnects = <&snoc MASTER_APSS_1 &bimc SLAVE_EBI_CH0>;
operating-points-v2 = <&bw_opp_table>;
};
> > [1] https://patchwork.kernel.org/patch/10577315/
>
> Georgi Djakov (4):
> dt-bindings: opp: Introduce opp-bw-MBs bindings
> OPP: Add support for parsing the interconnect bandwidth
> OPP: Update the bandwidth on OPP frequency changes
> cpufreq: dt: Add support for interconnect bandwidth scaling
>
> Documentation/devicetree/bindings/opp/opp.txt | 45 ++++++++++++
> drivers/cpufreq/cpufreq-dt.c | 27 ++++++-
> drivers/opp/core.c | 71 +++++++++++++++++++
> drivers/opp/of.c | 44 ++++++++++++
> drivers/opp/opp.h | 6 ++
> include/linux/pm_opp.h | 14 ++++
> 6 files changed, 206 insertions(+), 1 deletion(-)
>
--
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc, is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
On Wed, Mar 13, 2019 at 11:00:07AM +0200, Georgi Djakov wrote:
> In addition to frequency and voltage, some devices may have bandwidth
> requirements for their interconnect throughput - for example a CPU
> or GPU may also need to increase or decrease their bandwidth to DDR
> memory based on the current operating performance point.
>
> Extend the OPP tables with additional property to describe the bandwidth
> needs of a device. The average and peak bandwidth values depend on the
> hardware and its properties.
How would this work if you have 1 OPP (for the bus/interconnect/ddr) and
2 devices with variable bandwidth needs? Or 'device' here means the
interconnect?
>
> Signed-off-by: Georgi Djakov <[email protected]>
> ---
> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
> index 76b6c79604a5..fa598264615f 100644
> --- a/Documentation/devicetree/bindings/opp/opp.txt
> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> @@ -129,6 +129,9 @@ Optional properties:
> - opp-microamp-<name>: Named opp-microamp property. Similar to
> opp-microvolt-<name> property, but for microamp instead.
>
> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
> + the two integer values for average and peak bandwidth in megabytes per second.
-MBps would be better IMO. Either way, units should be documented in
property-units.txt.
> +
> - opp-level: A value representing the performance level of the device,
> expressed as a 32-bit integer.
>
> @@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
> };
> };
> };
> +
> +Example 7: opp-bw-MBs:
> +(example: average and peak bandwidth values are defined for each OPP and the
> +interconnect between CPU and DDR memory is scaled together with CPU frequency)
> +
> +/ {
> + cpus {
> + CPU0: cpu@0 {
> + compatible = "arm,cortex-a53", "arm,armv8";
> + ...
> + operating-points-v2 = <&cpu_opp_table>;
> + /* path between the CPU and DDR memory */
> + interconnects = <&rpm_bimc MASTER_AMPSS_M0
> + &rpm_bimc SLAVE_EBI_CH0>;
> + };
> + };
> +
> + cpu_opp_table: cpu_opp_table {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp-200000000 {
> + opp-hz = /bits/ 64 <200000000>;
> + /* 457 MB/s average and 1525 MB/s peak bandwidth */
> + opp-bw-MBs = <457 1525>;
> + };
> + opp-400000000 {
> + opp-hz = /bits/ 64 <400000000>;
> + /* 915 MB/s average and 3051 MB/s peak bandwidth */
> + opp-bw-MBs = <915 3051>;
> + };
> + opp-800000000 {
> + opp-hz = /bits/ 64 <800000000>;
> + /* 1830 MB/s average and 6103 MB/s peak bandwidth */
> + opp-bw-MBs = <1830 6103>;
> + };
> + opp-998400000 {
> + opp-hz = /bits/ 64 <998400000>;
> + /* 2282 MB/s average and 7614 MB/s peak bandwidth */
> + opp-bw-MBs = <2284 7614>;
> + };
> + };
On Sat, Mar 16, 2019 at 12:32:49AM +0530, Sibi Sankar wrote:
>
>
> On 3/13/19 2:30 PM, Georgi Djakov wrote:
> > Here is a proposal to extend the OPP bindings with bandwidth based on
> > a previous discussion [1].
> >
> > Every functional block on a SoC can contribute to the system power
> > efficiency by expressing its own bandwidth needs (to memory or other SoC
> > modules). This will allow the system to save power when high throughput
> > is not required (and also provide maximum throughput when needed).
> >
> > There are at least three ways for a device to determine its bandwidth
> > needs:
> > 1. The device can dynamically calculate the needed bandwidth
> > based on some known variable. For example: UART (baud rate), I2C (fast
> > mode, high-speed mode, etc), USB (specification version, data transfer
> > type), SDHC (SD standard, clock rate, bus-width), Video Encoder/Decoder
> > (video format, resolution, frame-rate)
> >
> > 2. There is a hardware specific value. For example: hardware
> > specific constant value (e.g. for PRNG) or use-case specific value that
> > is hard-coded.
> >
> > 3. Predefined SoC/board specific bandwidth values. For example:
> > CPU or GPU bandwidth is related to the current core frequency and both
> > bandwidth and frequency are scaled together.
> >
> > This patchset is trying to address point 3 above by extending the OPP
> > bindings to support predefined SoC/board bandwidth values and adds
> > support in cpufreq-dt to scale the interconnect between the CPU and the
> > DDR together with frequency and voltage.
>
> Hey Georgi,
> Having opp-bw-MBps as a part of cpu opp does greatly simplify the
> problem of scaling multiple interconnect devices with change in cpu
> frequency. But there is still a need to scale other devices (non
> interconnect based) according to cpu frequency. Having a devfreq
> governor for the same would help to have the same generic solution
> across SoCs (msm8916/8996/qcs405/sdm845). The devfreq maintainer did
> like the idea but wanted it incorporated into the passive governor.
>
> * https://lore.kernel.org/lkml/20180528060014epcms1p87ec68a4d44f9447b06f979a87e545b7d@epcms1p8/
>
> * https://lore.kernel.org/lkml/20180802095608epcms1p33fb061543efc9ceb3ec12d5567ceffbc@epcms1p3/
>
> I have a RFC series implementing ddr scaling with passive governor for
> sdm845 with the following bindings, will post it early next week.
>
> cpus {
> ...
>
> CPU0: cpu@0 {
> ...
> operating-points-v2 = <&cpu0_opp_table>;
> ...
> };
> ....
>
> CPU4: cpu@400 {
> ...
> operating-points-v2 = <&cpu4_opp_table>;
> ...
> };
> ...
> };
>
> cpu0_opp_table: cpu0_opp_table {
> compatible = "operating-points-v2";
> opp-shared;
>
> cpu0_opp1: opp-300000000 {
> opp-hz = /bits/ 64 <300000000>;
> };
>
> ...
>
> cpu0_opp16: opp-1612800000 {
> opp-hz = /bits/ 64 <1612800000>;
> };
>
> ...
> };
>
> cpu4_opp_table: cpu4_opp_table {
> compatible = "operating-points-v2";
> opp-shared;
>
> ...
>
> cpu4_opp4: opp-1056000000 {
> opp-hz = /bits/ 64 <1056000000>;
> };
>
> cpu4_opp5: opp-1209600000 {
> opp-hz = /bits/ 64 <1209600000>;
> };
>
> ...
> };
>
> bw_opp_table: bw-opp-table {
> compatible = "operating-points-v2";
>
> opp-200 {
> opp-hz = /bits/ 64 < 200000000 >; /* 200 MHz */
> required-opps = <&cpu0_opp1>;
> /* 0 MB/s average and 762 MB/s peak bandwidth */
> opp-bw-MBs = <0 762>;
> };
>
> opp-300 {
> opp-hz = /bits/ 64 < 300000000 >; /* 300 MHz */
> /* 0 MB/s average and 1144 MB/s peak bandwidth */
> opp-bw-MBs = <0 1144>;
> };
>
> ...
>
> opp-768 {
> opp-hz = /bits/ 64 < 768000000 >; /* 768 MHz */
> /* 0 MB/s average and 2929 MB/s peak bandwidth */
> opp-bw-MBs = <0 2929>;
> required-opps = <&cpu4_opp4>;
> };
>
> opp-1017 {
> opp-hz = /bits/ 64 < 1017000000 >; /* 1017 MHz */
> /* 0 MB/s average and 3879 MB/s peak bandwidth */
> opp-bw-MBs = <0 3879>;
> required-opps = <&cpu0_opp16>, <&cpu4_opp5>;
> };
> };
>
> cpubw {
> compatible = "devfreq-icbw";
Most certainly not a h/w device, so it doesn't go in DT.
> interconnects = <&snoc MASTER_APSS_1 &bimc SLAVE_EBI_CH0>;
> operating-points-v2 = <&bw_opp_table>;
> };
>
>
> > > [1] https://patchwork.kernel.org/patch/10577315/
> >
> > Georgi Djakov (4):
> > dt-bindings: opp: Introduce opp-bw-MBs bindings
> > OPP: Add support for parsing the interconnect bandwidth
> > OPP: Update the bandwidth on OPP frequency changes
> > cpufreq: dt: Add support for interconnect bandwidth scaling
> >
> > Documentation/devicetree/bindings/opp/opp.txt | 45 ++++++++++++
> > drivers/cpufreq/cpufreq-dt.c | 27 ++++++-
> > drivers/opp/core.c | 71 +++++++++++++++++++
> > drivers/opp/of.c | 44 ++++++++++++
> > drivers/opp/opp.h | 6 ++
> > include/linux/pm_opp.h | 14 ++++
> > 6 files changed, 206 insertions(+), 1 deletion(-)
> >
>
> --
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc, is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
Hi Viresh,
On 3/14/19 08:23, Viresh Kumar wrote:
> On 13-03-19, 11:00, Georgi Djakov wrote:
>> In addition to frequency and voltage, some devices may have bandwidth
>> requirements for their interconnect throughput - for example a CPU
>> or GPU may also need to increase or decrease their bandwidth to DDR
>> memory based on the current operating performance point.
>>
>> Extend the OPP tables with additional property to describe the bandwidth
>> needs of a device. The average and peak bandwidth values depend on the
>> hardware and its properties.
>>
>> Signed-off-by: Georgi Djakov <[email protected]>
>> ---
>> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
>> 1 file changed, 45 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>> index 76b6c79604a5..fa598264615f 100644
>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>> @@ -129,6 +129,9 @@ Optional properties:
>> - opp-microamp-<name>: Named opp-microamp property. Similar to
>> opp-microvolt-<name> property, but for microamp instead.
>>
>> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
>> + the two integer values for average and peak bandwidth in megabytes per second.
>> +
>> - opp-level: A value representing the performance level of the device,
>> expressed as a 32-bit integer.
>>
>> @@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
>> };
>> };
>> };
>> +
>> +Example 7: opp-bw-MBs:
>> +(example: average and peak bandwidth values are defined for each OPP and the
>> +interconnect between CPU and DDR memory is scaled together with CPU frequency)
>> +
>> +/ {
>> + cpus {
>> + CPU0: cpu@0 {
>> + compatible = "arm,cortex-a53", "arm,armv8";
>> + ...
>> + operating-points-v2 = <&cpu_opp_table>;
>> + /* path between the CPU and DDR memory */
>> + interconnects = <&rpm_bimc MASTER_AMPSS_M0
>> + &rpm_bimc SLAVE_EBI_CH0>;
>
> Can we have multiple paths for a device ?
I suppose that this is also a possible scenario. Will propose something
to handle multiple paths too.
>> + };
>> + };
>> +
>> + cpu_opp_table: cpu_opp_table {
>> + compatible = "operating-points-v2";
>> + opp-shared;
>> +
>> + opp-200000000 {
>> + opp-hz = /bits/ 64 <200000000>;
>> + /* 457 MB/s average and 1525 MB/s peak bandwidth */
>> + opp-bw-MBs = <457 1525>;
>
> In that case fixing this to just 2 entries in the array is incorrect
> and we should take care of that in the bindings here.
We can encode the path name into the property (when there are multiple
paths). We already have opp-microamp-<name> and opp-microamp-<name>, so
we can follow the same practice.
For example:
CPU0: cpu@0 {
compatible = "arm,cortex-a53", "arm,armv8";
...
operating-points-v2 = <&cpu_opp_table>;
/* path between the CPU and DDR and path between CPU and L3 */
interconnects = <&bimc MASTER_AMPSS_M0 &bimc SLAVE_EBI_CH0>,
<&bimc MASTER_AMPSS_M0 &bimc SLAVE_L3>;
interconnect-names "cpu-mem", "cpu-l3";
};
cpu_opp_table: cpu_opp_table {
compatible = "operating-points-v2";
opp-shared;
opp-200000000 {
opp-hz = /bits/ 64 <200000000>;
/* 457 MB/s average, 1525 MB/s peak bandwidth to DDR */
opp-bw-MBps-cpu-mem = <457 1525>;
/* 914 MB/s average, 3050 MB/s peak bandwidth to L3 */
opp-bw-MBps-cpu-l3 = <914 3050>;
};
};
Thanks,
Georgi
Hi Viresh,
On 3/14/19 08:30, Viresh Kumar wrote:
> On 13-03-19, 11:00, Georgi Djakov wrote:
>> The OPP bindings now support bandwidth values, so add support to parse it
>> from device tree and store it into the new dev_pm_opp_icc_bw struct, which
>> is part of the dev_pm_opp.
>>
>> Also add and export the dev_pm_opp_set_path() and dev_pm_opp_put_path()
>> helpers, to set (and release) an interconnect path to a device. The
>> bandwidth of this path will be updated when the OPPs are switched.
>>
>> Signed-off-by: Georgi Djakov <[email protected]>
>> ---
>> drivers/opp/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++
>> drivers/opp/of.c | 44 +++++++++++++++++++++++++++
>> drivers/opp/opp.h | 6 ++++
>> include/linux/pm_opp.h | 14 +++++++++
>> 4 files changed, 131 insertions(+)
>>
>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>> index e06a0ab05ad6..4b019cecaa07 100644
>> --- a/drivers/opp/core.c
>> +++ b/drivers/opp/core.c
>> @@ -19,6 +19,7 @@
>> #include <linux/slab.h>
>> #include <linux/device.h>
>> #include <linux/export.h>
>> +#include <linux/interconnect.h>
>> #include <linux/pm_domain.h>
>> #include <linux/regulator/consumer.h>
>>
>> @@ -1645,6 +1646,72 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
>> }
>> EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
>>
>> +/**
>> + * dev_pm_opp_set_path() - Set interconnect path for a device
>> + * @dev: Device for which interconnect path is being set.
>> + * @name: Interconnect path name or NULL.
>> + *
>> + * This must be called before any OPPs are initialized for the device.
>> + */
>> +struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name)
>
> Maybe the OPP core can do it itself in a similar way to how we do
> clk_get() today ?
Do you mean to directly call of_icc_get() in _allocate_opp_table()?
[..]
>> +static int opp_parse_icc_bw(struct dev_pm_opp *opp, struct device *dev,
>> + struct opp_table *opp_table)
>> +{
>> + struct property *prop = NULL;
>> + char name[NAME_MAX];
>> + int count;
>> + u32 avg = 0;
>> + u32 peak = 0;
>
> Why init to 0 ?
Right, seems not necessary.
>> +
>> + /* Search for "opp-bw-MBs" */
>> + sprintf(name, "opp-bw-MBs");
>> + prop = of_find_property(opp->np, name, NULL);
>> +
>> + /* Missing property is not a problem */
>> + if (!prop) {
>> + dev_dbg(dev, "%s: Missing opp-bw-MBs\n", __func__);
>> + return 0;
>> + }
>> +
>> + count = of_property_count_u32_elems(opp->np, name);
>> + if (count != 2) {
>> + dev_err(dev, "%s: Invalid number of elements in %s property\n",
>> + __func__, name);
>> + return -EINVAL;
>> + }
>> +
>> + opp->bandwidth = kzalloc(sizeof(*opp->bandwidth), GFP_KERNEL);
>
> You forgot to free it.
Will fix.
[..]>> +/**
>> + * struct dev_pm_opp_icc_bw - Interconnect bandwidth values
>> + * @avg: Average bandwidth corresponding to this OPP (in icc units)
>> + * @peak: Peak bandwidth corresponding to this OPP (in icc units)
>> + *
>> + * This structure stores the bandwidth values for a single interconnect path.
>> + */
>> +struct dev_pm_opp_icc_bw {
>> + u32 avg;
>> + u32 peak;
>> +};
>
> There is only one user of this structure, maybe we can directly add
> the elements in teh dev_pm_opp structure.
Ok, will do it.
Thanks,
Georgi
Hi Rob,
On 3/28/19 17:12, Rob Herring wrote:
> On Wed, Mar 13, 2019 at 11:00:07AM +0200, Georgi Djakov wrote:
>> In addition to frequency and voltage, some devices may have bandwidth
>> requirements for their interconnect throughput - for example a CPU
>> or GPU may also need to increase or decrease their bandwidth to DDR
>> memory based on the current operating performance point.
>>
>> Extend the OPP tables with additional property to describe the bandwidth
>> needs of a device. The average and peak bandwidth values depend on the
>> hardware and its properties.
>
> How would this work if you have 1 OPP (for the bus/interconnect/ddr) and
> 2 devices with variable bandwidth needs? Or 'device' here means the
> interconnect?
This is a property of the consumer devices and not of the bus. It's fine
for devices to have different bandwidth needs and to be connected to a
shared bus. In this case the framework knows the topology and can
determine which parts of the bus are shared and aggregate the traffic
accordingly.
>>
>> Signed-off-by: Georgi Djakov <[email protected]>
>> ---
>> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
>> 1 file changed, 45 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>> index 76b6c79604a5..fa598264615f 100644
>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>> @@ -129,6 +129,9 @@ Optional properties:
>> - opp-microamp-<name>: Named opp-microamp property. Similar to
>> opp-microvolt-<name> property, but for microamp instead.
>>
>> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
>> + the two integer values for average and peak bandwidth in megabytes per second.
>
> -MBps would be better IMO. Either way, units should be documented in
> property-units.txt.
Agree! Of course!
Thanks,
Georgi
On 09-04-19, 17:37, Georgi Djakov wrote:
> Hi Viresh,
>
> On 3/14/19 08:30, Viresh Kumar wrote:
> > On 13-03-19, 11:00, Georgi Djakov wrote:
> >> The OPP bindings now support bandwidth values, so add support to parse it
> >> from device tree and store it into the new dev_pm_opp_icc_bw struct, which
> >> is part of the dev_pm_opp.
> >>
> >> Also add and export the dev_pm_opp_set_path() and dev_pm_opp_put_path()
> >> helpers, to set (and release) an interconnect path to a device. The
> >> bandwidth of this path will be updated when the OPPs are switched.
> >>
> >> Signed-off-by: Georgi Djakov <[email protected]>
> >> ---
> >> drivers/opp/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++
> >> drivers/opp/of.c | 44 +++++++++++++++++++++++++++
> >> drivers/opp/opp.h | 6 ++++
> >> include/linux/pm_opp.h | 14 +++++++++
> >> 4 files changed, 131 insertions(+)
> >>
> >> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> >> index e06a0ab05ad6..4b019cecaa07 100644
> >> --- a/drivers/opp/core.c
> >> +++ b/drivers/opp/core.c
> >> @@ -19,6 +19,7 @@
> >> #include <linux/slab.h>
> >> #include <linux/device.h>
> >> #include <linux/export.h>
> >> +#include <linux/interconnect.h>
> >> #include <linux/pm_domain.h>
> >> #include <linux/regulator/consumer.h>
> >>
> >> @@ -1645,6 +1646,72 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
> >> }
> >> EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
> >>
> >> +/**
> >> + * dev_pm_opp_set_path() - Set interconnect path for a device
> >> + * @dev: Device for which interconnect path is being set.
> >> + * @name: Interconnect path name or NULL.
> >> + *
> >> + * This must be called before any OPPs are initialized for the device.
> >> + */
> >> +struct opp_table *dev_pm_opp_set_path(struct device *dev, const char *name)
> >
> > Maybe the OPP core can do it itself in a similar way to how we do
> > clk_get() today ?
It took me a decade to understand my own comment ;)
> Do you mean to directly call of_icc_get() in _allocate_opp_table()?
I believe I wanted to say s/clk_get()/clk_set_rate()/ . i.e. if someone calls
set-opp-rate, then the path should get set as well accordingly automagically.
--
viresh
On 09-04-19, 17:36, Georgi Djakov wrote:
> Hi Viresh,
>
> On 3/14/19 08:23, Viresh Kumar wrote:
> > On 13-03-19, 11:00, Georgi Djakov wrote:
> >> In addition to frequency and voltage, some devices may have bandwidth
> >> requirements for their interconnect throughput - for example a CPU
> >> or GPU may also need to increase or decrease their bandwidth to DDR
> >> memory based on the current operating performance point.
> >>
> >> Extend the OPP tables with additional property to describe the bandwidth
> >> needs of a device. The average and peak bandwidth values depend on the
> >> hardware and its properties.
> >>
> >> Signed-off-by: Georgi Djakov <[email protected]>
> >> ---
> >> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
> >> 1 file changed, 45 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
> >> index 76b6c79604a5..fa598264615f 100644
> >> --- a/Documentation/devicetree/bindings/opp/opp.txt
> >> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> >> @@ -129,6 +129,9 @@ Optional properties:
> >> - opp-microamp-<name>: Named opp-microamp property. Similar to
> >> opp-microvolt-<name> property, but for microamp instead.
> >>
> >> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
> >> + the two integer values for average and peak bandwidth in megabytes per second.
> >> +
> >> - opp-level: A value representing the performance level of the device,
> >> expressed as a 32-bit integer.
> >>
> >> @@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
> >> };
> >> };
> >> };
> >> +
> >> +Example 7: opp-bw-MBs:
> >> +(example: average and peak bandwidth values are defined for each OPP and the
> >> +interconnect between CPU and DDR memory is scaled together with CPU frequency)
> >> +
> >> +/ {
> >> + cpus {
> >> + CPU0: cpu@0 {
> >> + compatible = "arm,cortex-a53", "arm,armv8";
> >> + ...
> >> + operating-points-v2 = <&cpu_opp_table>;
> >> + /* path between the CPU and DDR memory */
> >> + interconnects = <&rpm_bimc MASTER_AMPSS_M0
> >> + &rpm_bimc SLAVE_EBI_CH0>;
> >
> > Can we have multiple paths for a device ?
>
> I suppose that this is also a possible scenario. Will propose something
> to handle multiple paths too.
>
> >> + };
> >> + };
> >> +
> >> + cpu_opp_table: cpu_opp_table {
> >> + compatible = "operating-points-v2";
> >> + opp-shared;
> >> +
> >> + opp-200000000 {
> >> + opp-hz = /bits/ 64 <200000000>;
> >> + /* 457 MB/s average and 1525 MB/s peak bandwidth */
> >> + opp-bw-MBs = <457 1525>;
> >
> > In that case fixing this to just 2 entries in the array is incorrect
> > and we should take care of that in the bindings here.
>
> We can encode the path name into the property (when there are multiple
> paths). We already have opp-microamp-<name> and opp-microamp-<name>, so
> we can follow the same practice.
>
> For example:
>
> CPU0: cpu@0 {
> compatible = "arm,cortex-a53", "arm,armv8";
> ...
> operating-points-v2 = <&cpu_opp_table>;
> /* path between the CPU and DDR and path between CPU and L3 */
> interconnects = <&bimc MASTER_AMPSS_M0 &bimc SLAVE_EBI_CH0>,
> <&bimc MASTER_AMPSS_M0 &bimc SLAVE_L3>;
> interconnect-names "cpu-mem", "cpu-l3";
> };
>
> cpu_opp_table: cpu_opp_table {
> compatible = "operating-points-v2";
> opp-shared;
>
> opp-200000000 {
> opp-hz = /bits/ 64 <200000000>;
> /* 457 MB/s average, 1525 MB/s peak bandwidth to DDR */
> opp-bw-MBps-cpu-mem = <457 1525>;
> /* 914 MB/s average, 3050 MB/s peak bandwidth to L3 */
> opp-bw-MBps-cpu-l3 = <914 3050>;
> };
> };
The -<name> property is different as only one of the value is ever used, i.e. we
can have opp-microvolt-speed0/1/2/3 (4 different values/properties) and only
opp-microvolt-speed1 will be used eventually and all others are discarded.
Also I am not sure if this will be actually required. We already have a list of
interconnects above and the order of that can be taken as reference here. i.e.
CPU0: cpu@0 {
compatible = "arm,cortex-a53", "arm,armv8";
...
operating-points-v2 = <&cpu_opp_table>;
/* path between the CPU and DDR and path between CPU and L3 */
interconnects = <&bimc MASTER_AMPSS_M0 &bimc SLAVE_EBI_CH0>,
<&bimc MASTER_AMPSS_M0 &bimc SLAVE_L3>;
};
cpu_opp_table: cpu_opp_table {
compatible = "operating-points-v2";
opp-shared;
opp-200000000 {
opp-hz = /bits/ 64 <200000000>;
/* 457 MB/s average, 1525 MB/s peak bandwidth to DDR */
/* 914 MB/s average, 3050 MB/s peak bandwidth to L3 */
opp-bw-MBps = <457 1525>, <914 3050>;
};
};
I also strongly believe that "opp-bw-MBps" should be renamed in a way to make it
independent of the OPPs. For example, we may have devices which also need to add
their vote for the bandwidth but don't have an OPP table as they don't do DVFS.
And the same property should be used by them directly as what we will have in
the individual OPPs in the above example case.
So maybe something like bw-MBps or something else.
--
viresh
Hi Viresh,
On 4/10/19 07:05, Viresh Kumar wrote:
> On 09-04-19, 17:36, Georgi Djakov wrote:
>> Hi Viresh,
>>
>> On 3/14/19 08:23, Viresh Kumar wrote:
>>> On 13-03-19, 11:00, Georgi Djakov wrote:
>>>> In addition to frequency and voltage, some devices may have bandwidth
>>>> requirements for their interconnect throughput - for example a CPU
>>>> or GPU may also need to increase or decrease their bandwidth to DDR
>>>> memory based on the current operating performance point.
>>>>
>>>> Extend the OPP tables with additional property to describe the bandwidth
>>>> needs of a device. The average and peak bandwidth values depend on the
>>>> hardware and its properties.
>>>>
>>>> Signed-off-by: Georgi Djakov <[email protected]>
>>>> ---
>>>> Documentation/devicetree/bindings/opp/opp.txt | 45 +++++++++++++++++++
>>>> 1 file changed, 45 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>>> index 76b6c79604a5..fa598264615f 100644
>>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>>> @@ -129,6 +129,9 @@ Optional properties:
>>>> - opp-microamp-<name>: Named opp-microamp property. Similar to
>>>> opp-microvolt-<name> property, but for microamp instead.
>>>>
>>>> +- opp-bw-MBs: The interconnect bandwidth is specified with an array containing
>>>> + the two integer values for average and peak bandwidth in megabytes per second.
>>>> +
>>>> - opp-level: A value representing the performance level of the device,
>>>> expressed as a 32-bit integer.
>>>>
>>>> @@ -546,3 +549,45 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
>>>> };
>>>> };
>>>> };
>>>> +
>>>> +Example 7: opp-bw-MBs:
>>>> +(example: average and peak bandwidth values are defined for each OPP and the
>>>> +interconnect between CPU and DDR memory is scaled together with CPU frequency)
>>>> +
>>>> +/ {
>>>> + cpus {
>>>> + CPU0: cpu@0 {
>>>> + compatible = "arm,cortex-a53", "arm,armv8";
>>>> + ...
>>>> + operating-points-v2 = <&cpu_opp_table>;
>>>> + /* path between the CPU and DDR memory */
>>>> + interconnects = <&rpm_bimc MASTER_AMPSS_M0
>>>> + &rpm_bimc SLAVE_EBI_CH0>;
>>>
>>> Can we have multiple paths for a device ?
>>
>> I suppose that this is also a possible scenario. Will propose something
>> to handle multiple paths too.
>>
>>>> + };
>>>> + };
>>>> +
>>>> + cpu_opp_table: cpu_opp_table {
>>>> + compatible = "operating-points-v2";
>>>> + opp-shared;
>>>> +
>>>> + opp-200000000 {
>>>> + opp-hz = /bits/ 64 <200000000>;
>>>> + /* 457 MB/s average and 1525 MB/s peak bandwidth */
>>>> + opp-bw-MBs = <457 1525>;
>>>
>>> In that case fixing this to just 2 entries in the array is incorrect
>>> and we should take care of that in the bindings here.
>>
>> We can encode the path name into the property (when there are multiple
>> paths). We already have opp-microamp-<name> and opp-microamp-<name>, so
>> we can follow the same practice.
>>
>> For example:
>>
>> CPU0: cpu@0 {
>> compatible = "arm,cortex-a53", "arm,armv8";
>> ...
>> operating-points-v2 = <&cpu_opp_table>;
>> /* path between the CPU and DDR and path between CPU and L3 */
>> interconnects = <&bimc MASTER_AMPSS_M0 &bimc SLAVE_EBI_CH0>,
>> <&bimc MASTER_AMPSS_M0 &bimc SLAVE_L3>;
>> interconnect-names "cpu-mem", "cpu-l3";
>> };
>>
>> cpu_opp_table: cpu_opp_table {
>> compatible = "operating-points-v2";
>> opp-shared;
>>
>> opp-200000000 {
>> opp-hz = /bits/ 64 <200000000>;
>> /* 457 MB/s average, 1525 MB/s peak bandwidth to DDR */
>> opp-bw-MBps-cpu-mem = <457 1525>;
>> /* 914 MB/s average, 3050 MB/s peak bandwidth to L3 */
>> opp-bw-MBps-cpu-l3 = <914 3050>;
>> };
>> };
>
> The -<name> property is different as only one of the value is ever used, i.e. we
> can have opp-microvolt-speed0/1/2/3 (4 different values/properties) and only
> opp-microvolt-speed1 will be used eventually and all others are discarded.
>
> Also I am not sure if this will be actually required. We already have a list of
> interconnects above and the order of that can be taken as reference here. i.e.
>
> CPU0: cpu@0 {
> compatible = "arm,cortex-a53", "arm,armv8";
> ...
> operating-points-v2 = <&cpu_opp_table>;
> /* path between the CPU and DDR and path between CPU and L3 */
> interconnects = <&bimc MASTER_AMPSS_M0 &bimc SLAVE_EBI_CH0>,
> <&bimc MASTER_AMPSS_M0 &bimc SLAVE_L3>;
> };
>
> cpu_opp_table: cpu_opp_table {
> compatible = "operating-points-v2";
> opp-shared;
>
> opp-200000000 {
> opp-hz = /bits/ 64 <200000000>;
> /* 457 MB/s average, 1525 MB/s peak bandwidth to DDR */
> /* 914 MB/s average, 3050 MB/s peak bandwidth to L3 */
> opp-bw-MBps = <457 1525>, <914 3050>;
> };
> };
This works too.
>
> I also strongly believe that "opp-bw-MBps" should be renamed in a way to make it
> independent of the OPPs. For example, we may have devices which also need to add
> their vote for the bandwidth but don't have an OPP table as they don't do DVFS.
> And the same property should be used by them directly as what we will have in
> the individual OPPs in the above example case.
>
> So maybe something like bw-MBps or something else.
Ok, will make it bandwidth-MBps.
Thanks,
Georgi