2018-05-25 20:35:41

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 00/11] Add throttler driver for non-thermal throttling

This series adds the throttler driver, for non-thermal throttling of
CPUs and devfreq devices. A use case for non-thermal throttling could
be the detection of a high battery discharge voltage, close to the
over-current protection (OCP) limit of the battery.

To support throttling of devfreq devices the series introduces the
concept of a devfreq policy and the DEVFREQ_ADJUST notifier (similar
to CPUFREQ_ADJUST). Further it includes some related devfreq bugfixes
and improvements that change some of the code that is also touched
by the policy changes.

Matthias Kaehlcke (11):
PM / devfreq: Init user limits from OPP limits, not viceversa
PM / devfreq: Fix handling of min/max_freq == 0
PM / devfreq: Remove check for df->max_freq == 0 from governors
PM / devfreq: Remove redundant frequency adjustment from governors
PM / devfreq: governors: Return device frequency limits instead of
user limits
PM / devfreq: Add struct devfreq_policy
PM / devfreg: Add support policy notifiers
PM / devfreq: Make update_devfreq() public
misc: throttler: Add core support for non-thermal throttling
dt-bindings: misc: add bindings for throttler
misc/throttler: Add Chrome OS EC throttler

.../devicetree/bindings/misc/throttler.txt | 41 ++
drivers/devfreq/devfreq.c | 203 ++++++----
drivers/devfreq/governor.h | 3 -
drivers/devfreq/governor_passive.c | 4 +-
drivers/devfreq/governor_performance.c | 5 +-
drivers/devfreq/governor_powersave.c | 2 +-
drivers/devfreq/governor_simpleondemand.c | 13 +-
drivers/devfreq/governor_userspace.c | 16 +-
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/throttler/Kconfig | 28 ++
drivers/misc/throttler/Makefile | 2 +
drivers/misc/throttler/core.c | 373 ++++++++++++++++++
drivers/misc/throttler/cros_ec_throttler.c | 122 ++++++
include/linux/devfreq.h | 112 +++++-
include/linux/throttler.h | 10 +
16 files changed, 813 insertions(+), 123 deletions(-)
create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt
create mode 100644 drivers/misc/throttler/Kconfig
create mode 100644 drivers/misc/throttler/Makefile
create mode 100644 drivers/misc/throttler/core.c
create mode 100644 drivers/misc/throttler/cros_ec_throttler.c
create mode 100644 include/linux/throttler.h

--
2.17.0.921.gf22659ad46-goog



2018-05-25 20:32:03

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 05/11] PM / devfreq: governors: Return device frequency limits instead of user limits

The performance, powersave and simpleondemand governors can return
df->min/max_freq, which are the user defined frequency limits.
update_devfreq() already takes care of adjusting the target frequency
with the user limits if necessary, therefore we can return
df->scaling_min/max_freq instead, which is the min/max frequency
supported by the device at a given time (depending on the
enabled/disabled OPPs)

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/governor_performance.c | 2 +-
drivers/devfreq/governor_powersave.c | 2 +-
drivers/devfreq/governor_simpleondemand.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
index 1c990cb45098..a8e3478b3c43 100644
--- a/drivers/devfreq/governor_performance.c
+++ b/drivers/devfreq/governor_performance.c
@@ -20,7 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
* target callback should be able to get floor value as
* said in devfreq.h
*/
- *freq = df->max_freq;
+ *freq = df->scaling_max_freq;
return 0;
}

diff --git a/drivers/devfreq/governor_powersave.c b/drivers/devfreq/governor_powersave.c
index 0c42f23249ef..8696efd32e5a 100644
--- a/drivers/devfreq/governor_powersave.c
+++ b/drivers/devfreq/governor_powersave.c
@@ -20,7 +20,7 @@ static int devfreq_powersave_func(struct devfreq *df,
* target callback should be able to get ceiling value as
* said in devfreq.h
*/
- *freq = df->min_freq;
+ *freq = df->scaling_min_freq;
return 0;
}

diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
index 3da7554b4837..805fee09c754 100644
--- a/drivers/devfreq/governor_simpleondemand.c
+++ b/drivers/devfreq/governor_simpleondemand.c
@@ -46,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,

/* Assume MAX if it is going to be divided by zero */
if (stat->total_time == 0) {
- *freq = df->max_freq;
+ *freq = df->scaling_max_freq;
return 0;
}

@@ -59,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
/* Set MAX if it's busy enough */
if (stat->busy_time * 100 >
stat->total_time * dfso_upthreshold) {
- *freq = df->max_freq;
+ *freq = df->scaling_max_freq;
return 0;
}

/* Set MAX if we do not know the initial frequency */
if (stat->current_frequency == 0) {
- *freq = df->max_freq;
+ *freq = df->scaling_max_freq;
return 0;
}

--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:32:06

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 08/11] PM / devfreq: Make update_devfreq() public

Currently update_devfreq() is only visible to devfreq governors outside
of devfreq.c. Make it public to allow drivers that adjust devfreq policies
to cause a re-evaluation of the frequency after a policy change.

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/governor.h | 3 ---
include/linux/devfreq.h | 8 ++++++++
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/devfreq/governor.h b/drivers/devfreq/governor.h
index cfc50a61a90d..608e7549465b 100644
--- a/drivers/devfreq/governor.h
+++ b/drivers/devfreq/governor.h
@@ -54,9 +54,6 @@ struct devfreq_governor {
unsigned int event, void *data);
};

-/* Caution: devfreq->lock must be locked before calling update_devfreq */
-extern int update_devfreq(struct devfreq *devfreq);
-
extern void devfreq_monitor_start(struct devfreq *devfreq);
extern void devfreq_monitor_stop(struct devfreq *devfreq);
extern void devfreq_monitor_suspend(struct devfreq *devfreq);
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 4a6ed61bbe49..561ab1d5e8d6 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -222,6 +222,14 @@ extern void devm_devfreq_remove_device(struct device *dev,
extern int devfreq_suspend_device(struct devfreq *devfreq);
extern int devfreq_resume_device(struct devfreq *devfreq);

+/**
+ * update_devfreq() - Reevaluate the device and configure frequency
+ * @devfreq: the devfreq device
+ *
+ * Note: devfreq->lock must be held
+ */
+extern int update_devfreq(struct devfreq *devfreq);
+
/* Helper functions for devfreq user device driver with OPP. */
extern struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
unsigned long *freq, u32 flags);
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:32:10

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 11/11] misc/throttler: Add Chrome OS EC throttler

The driver subscribes to throttling events from the Chrome OS
embedded controller and enables/disables system throttling based
on these events.

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/misc/throttler/Kconfig | 15 +++
drivers/misc/throttler/Makefile | 1 +
drivers/misc/throttler/cros_ec_throttler.c | 122 +++++++++++++++++++++
3 files changed, 138 insertions(+)
create mode 100644 drivers/misc/throttler/cros_ec_throttler.c

diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
index ef8388f6bc0a..652b6817b75c 100644
--- a/drivers/misc/throttler/Kconfig
+++ b/drivers/misc/throttler/Kconfig
@@ -11,3 +11,18 @@ menuconfig THROTTLER
Note that you also need a event monitor module usually called
*_throttler.

+if THROTTLER
+
+config CROS_EC_THROTTLER
+ tristate "Throttler event monitor for the Chrome OS Embedded Controller"
+ default n
+ depends on MFD_CROS_EC
+ ---help---
+ This driver adds support to throttle the system in reaction to
+ Chrome OS EC events.
+
+ To compile this driver as a module, choose M here: the
+ module will be called cros_ec_throttler.
+
+endif # THROTTLER
+
diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
index c8d920cee315..d9b2a77dabc9 100644
--- a/drivers/misc/throttler/Makefile
+++ b/drivers/misc/throttler/Makefile
@@ -1 +1,2 @@
obj-$(CONFIG_THROTTLER) += core.o
+obj-$(CONFIG_CROS_EC_THROTTLER) += cros_ec_throttler.o
diff --git a/drivers/misc/throttler/cros_ec_throttler.c b/drivers/misc/throttler/cros_ec_throttler.c
new file mode 100644
index 000000000000..ea6bc002d49c
--- /dev/null
+++ b/drivers/misc/throttler/cros_ec_throttler.c
@@ -0,0 +1,122 @@
+/*
+ * Driver for throttling triggered by EC events.
+ *
+ * Copyright (C) 2018 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/mfd/cros_ec.h>
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/throttler.h>
+
+struct cros_ec_throttler {
+ struct cros_ec_device *ec;
+ struct throttler *throttler;
+ struct notifier_block nb;
+};
+
+static int cros_ec_throttler_event(struct notifier_block *nb,
+ unsigned long queued_during_suspend, void *_notify)
+{
+ struct cros_ec_throttler *cte =
+ container_of(nb, struct cros_ec_throttler, nb);
+ u32 host_event;
+
+ host_event = cros_ec_get_host_event(cte->ec);
+ if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_START)) {
+ throttler_set_level(cte->throttler, 1);
+
+ return NOTIFY_OK;
+ } else if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_STOP)) {
+ throttler_set_level(cte->throttler, 0);
+
+ return NOTIFY_OK;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int cros_ec_throttler_probe(struct platform_device *pdev)
+{
+ struct cros_ec_throttler *cte;
+ struct device *dev = &pdev->dev;
+ struct device_node *np = pdev->dev.of_node;
+ int ret;
+
+ if (!np) {
+ /* should never happen */
+ return -EINVAL;
+ }
+
+ cte = devm_kzalloc(dev, sizeof(*cte), GFP_KERNEL);
+ if (!cte)
+ return -ENOMEM;
+
+ cte->ec = dev_get_drvdata(pdev->dev.parent);
+
+ cte->throttler = throttler_setup(dev);
+ if (IS_ERR(cte->throttler))
+ return PTR_ERR(cte->throttler);
+
+ dev_set_drvdata(dev, cte);
+
+ cte->nb.notifier_call = cros_ec_throttler_event;
+ ret = blocking_notifier_chain_register(&cte->ec->event_notifier,
+ &cte->nb);
+ if (ret < 0) {
+ dev_err(dev, "failed to register notifier\n");
+ throttler_teardown(cte->throttler);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int cros_ec_throttler_remove(struct platform_device *pdev)
+{
+ struct cros_ec_throttler *cte = platform_get_drvdata(pdev);
+
+ blocking_notifier_chain_unregister(&cte->ec->event_notifier,
+ &cte->nb);
+
+ throttler_teardown(cte->throttler);
+
+ return 0;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id cros_ec_throttler_of_match[] = {
+ { .compatible = "google,cros-ec-throttler" },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, cros_ec_throttler_of_match);
+#endif /* CONFIG_OF */
+
+static struct platform_driver cros_ec_throttler_driver = {
+ .driver = {
+ .name = "cros-ec-throttler",
+ .of_match_table = of_match_ptr(cros_ec_throttler_of_match),
+ },
+ .probe = cros_ec_throttler_probe,
+ .remove = cros_ec_throttler_remove,
+};
+
+module_platform_driver(cros_ec_throttler_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Matthias Kaehlcke <[email protected]>");
+MODULE_DESCRIPTION("Chrome OS EC Throttler");
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:32:49

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 10/11] dt-bindings: misc: add bindings for throttler

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
.../devicetree/bindings/misc/throttler.txt | 41 +++++++++++++++++++
1 file changed, 41 insertions(+)
create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt

diff --git a/Documentation/devicetree/bindings/misc/throttler.txt b/Documentation/devicetree/bindings/misc/throttler.txt
new file mode 100644
index 000000000000..92f13e94451a
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/throttler.txt
@@ -0,0 +1,41 @@
+Throttler driver
+
+The Throttler is used for non-thermal throttling of system components like
+CPUs or devfreq devices.
+
+
+Optional properties:
+--------------------
+- cpufreq: A sub-node which is container of only cpufreq
+ Type: sub-mode nodes, used to describe the throttling settings
+ for a CPU (or CPU group sharing the same policy).
+
+- devfreq: A sub-node which is container of only devfreq
+ Type: sub-mode nodes, used to describe the throttling settings
+ for a devfreq device.
+
+cpufreq node:
+=============
+
+Required properties:
+--------------------
+- cpu: The id of the CPU to be throttled.
+ Type: unsigned
+ Size: one cell
+
+- throttling-frequencies: Frequencies used for throttling, corresponding
+ Type: unsigned to throttling level 1, 2, ...
+ Size: array
+
+
+devfreq node:
+=============
+
+Required properties:
+--------------------
+- devfreq: A phandle of the devfreq device to be throttled.
+ Type: phandle
+
+- throttling-frequencies: Frequencies used for throttling, corresponding
+ Type: unsigned to throttling level 1, 2, ...
+ Size: array
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:33:10

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 07/11] PM / devfreg: Add support policy notifiers

Policy notifiers are called before a frequency change and may narrow
the min/max frequency range in devfreq_policy, which is used to adjust
the target frequency if it is beyond this range.

Also add a few helpers:
- devfreq_verify_within_[dev_]limits()
- should be used by the notifiers for policy adjustments.
- dev_to_devfreq()
- lookup a devfreq strict from a device pointer

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/devfreq.c | 47 +++++++++++++++++++++-------
include/linux/devfreq.h | 66 +++++++++++++++++++++++++++++++++++++++
2 files changed, 102 insertions(+), 11 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 7fd55b49c8ae..939b91f3a7e3 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -72,6 +72,20 @@ static struct devfreq *find_device_devfreq(struct device *dev)
return ERR_PTR(-ENODEV);
}

+/**
+ * dev_to_devfreq() - find devfreq struct using device pointer
+ * @dev: device pointer used to lookup device devfreq.
+ */
+struct devfreq *dev_to_devfreq(struct device *dev) {
+ struct devfreq *devfreq;
+
+ mutex_lock(&devfreq_list_lock);
+ devfreq = find_device_devfreq(dev);
+ mutex_unlock(&devfreq_list_lock);
+
+ return devfreq;
+}
+
static unsigned long find_available_min_freq(struct devfreq *devfreq)
{
struct dev_pm_opp *opp;
@@ -269,20 +283,21 @@ int update_devfreq(struct devfreq *devfreq)
if (!policy->governor)
return -EINVAL;

+ policy->min = policy->devinfo.min_freq;
+ policy->max = policy->devinfo.max_freq;
+
+ srcu_notifier_call_chain(&devfreq->policy_notifier_list,
+ DEVFREQ_ADJUST, policy);
+
/* Reevaluate the proper frequency */
err = policy->governor->get_target_freq(devfreq, &freq);
if (err)
return err;

- /*
- * Adjust the frequency with user freq, QoS and available freq.
- *
- * List from the highest priority
- * max_freq
- * min_freq
- */
- max_freq = MIN(policy->devinfo.max_freq, policy->user.max_freq);
- min_freq = MAX(policy->devinfo.min_freq, policy->user.min_freq);
+ /* Adjust the frequency */
+
+ max_freq = MIN(policy->max, policy->user.max_freq);
+ min_freq = MAX(policy->min, policy->user.min_freq);

if (freq < min_freq) {
freq = min_freq;
@@ -645,6 +660,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->last_stat_updated = jiffies;

srcu_init_notifier_head(&devfreq->transition_notifier_list);
+ srcu_init_notifier_head(&devfreq->policy_notifier_list);

mutex_unlock(&devfreq->lock);

@@ -1432,7 +1448,7 @@ EXPORT_SYMBOL(devm_devfreq_unregister_opp_notifier);
* devfreq_register_notifier() - Register a driver with devfreq
* @devfreq: The devfreq object.
* @nb: The notifier block to register.
- * @list: DEVFREQ_TRANSITION_NOTIFIER.
+ * @list: DEVFREQ_TRANSITION_NOTIFIER or DEVFREQ_POLICY_NOTIFIER.
*/
int devfreq_register_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
@@ -1448,6 +1464,10 @@ int devfreq_register_notifier(struct devfreq *devfreq,
ret = srcu_notifier_chain_register(
&devfreq->transition_notifier_list, nb);
break;
+ case DEVFREQ_POLICY_NOTIFIER:
+ ret = srcu_notifier_chain_register(
+ &devfreq->policy_notifier_list, nb);
+ break;
default:
ret = -EINVAL;
}
@@ -1460,7 +1480,7 @@ EXPORT_SYMBOL(devfreq_register_notifier);
* devfreq_unregister_notifier() - Unregister a driver with devfreq
* @devfreq: The devfreq object.
* @nb: The notifier block to be unregistered.
- * @list: DEVFREQ_TRANSITION_NOTIFIER.
+ * @list: DEVFREQ_TRANSITION_NOTIFIER or DEVFREQ_POLICY_NOTIFIER.
*/
int devfreq_unregister_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
@@ -1476,6 +1496,11 @@ int devfreq_unregister_notifier(struct devfreq *devfreq,
ret = srcu_notifier_chain_unregister(
&devfreq->transition_notifier_list, nb);
break;
+ case DEVFREQ_POLICY_NOTIFIER:
+ ret = srcu_notifier_chain_unregister(
+ &devfreq->policy_notifier_list, nb);
+ break;
+
default:
ret = -EINVAL;
}
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 9bf23b976f4d..4a6ed61bbe49 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -33,6 +33,10 @@
#define DEVFREQ_PRECHANGE (0)
#define DEVFREQ_POSTCHANGE (1)

+#define DEVFREQ_POLICY_NOTIFIER 1
+
+#define DEVFREQ_ADJUST 0
+
struct devfreq;
struct devfreq_governor;

@@ -121,12 +125,16 @@ struct devfreq_freq_limits {

/**
* struct devfreq_policy - Devfreq policy
+ * @min: minimum frequency (adjustable by policy notifiers)
+ * @min: maximum frequency (adjustable by policy notifiers)
* @user: frequency limits requested by the user
* @devinfo: frequency limits of the device (available OPPs)
* @governor: method how to choose frequency based on the usage.
* @governor_name: devfreq governor name for use with this devfreq
*/
struct devfreq_policy {
+ unsigned long min;
+ unsigned long max;
struct devfreq_freq_limits user;
struct devfreq_freq_limits devinfo;
const struct devfreq_governor *governor;
@@ -155,6 +163,7 @@ struct devfreq_policy {
* @time_in_state: Statistics of devfreq states
* @last_stat_updated: The last time stat updated
* @transition_notifier_list: list head of DEVFREQ_TRANSITION_NOTIFIER notifier
+ * @policy_notifier_list: list head of DEVFREQ_POLICY_NOTIFIER notifier
*
* This structure stores the devfreq information for a give device.
*
@@ -188,6 +197,7 @@ struct devfreq {
unsigned long last_stat_updated;

struct srcu_notifier_head transition_notifier_list;
+ struct srcu_notifier_head policy_notifier_list;
};

struct devfreq_freqs {
@@ -240,6 +250,46 @@ extern void devm_devfreq_unregister_notifier(struct device *dev,
extern struct devfreq *devfreq_get_devfreq_by_phandle(struct device *dev,
int index);

+/**
+ * devfreq_verify_within_limits() - Adjust a devfreq policy if needed to make
+ * sure its min/max values are within a
+ * specified range.
+ * @policy: the policy
+ * @min: the minimum frequency
+ * @max: the maximum frequency
+ */
+static inline void devfreq_verify_within_limits(struct devfreq_policy *policy,
+ unsigned int min, unsigned int max)
+{
+ if (policy->min < min)
+ policy->min = min;
+ if (policy->max < min)
+ policy->max = min;
+ if (policy->min > max)
+ policy->min = max;
+ if (policy->max > max)
+ policy->max = max;
+ if (policy->min > policy->max)
+ policy->min = policy->max;
+ return;
+}
+
+/**
+ * devfreq_verify_within_dev_limits() - Adjust a devfreq policy if needed to
+ * make sure its min/max values are within
+ * the frequency range supported by the
+ * device.
+ * @policy: the policy
+ */
+static inline void
+devfreq_verify_within_dev_limits(struct devfreq_policy *policy)
+{
+ devfreq_verify_within_limits(policy, policy->devinfo.min_freq,
+ policy->devinfo.max_freq);
+}
+
+struct devfreq *dev_to_devfreq(struct device *dev);
+
#if IS_ENABLED(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND)
/**
* struct devfreq_simple_ondemand_data - void *data fed to struct devfreq
@@ -394,10 +444,26 @@ static inline struct devfreq *devfreq_get_devfreq_by_phandle(struct device *dev,
return ERR_PTR(-ENODEV);
}

+static inline void devfreq_verify_within_limits(struct defreq_policy *policy,
+ unsigned int min, unsigned int max)
+{
+}
+
+static inline void
+devfreq_verify_within_dev_limits(struct cpufreq_policy *policy)
+{
+}
+
static inline int devfreq_update_stats(struct devfreq *df)
{
return -EINVAL;
}
+
+static inline struct devfreq *dev_to_devfreq(struct device *dev)
+{
+ return NULL;
+}
+
#endif /* CONFIG_PM_DEVFREQ */

#endif /* __LINUX_DEVFREQ_H__ */
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:33:40

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 06/11] PM / devfreq: Add struct devfreq_policy

Move variables related with devfreq policy changes from struct devfreq
to the new struct devfreq_policy and add a policy field to struct devfreq.

The following variables are moved:

df->min/max_freq => p->user.min/max_freq
df->scaling_min/max_freq => p->devinfo.min/max_freq
df->governor => p->governor
df->governor_name => p->governor_name

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/devfreq.c | 136 ++++++++++++----------
drivers/devfreq/governor_passive.c | 4 +-
drivers/devfreq/governor_performance.c | 2 +-
drivers/devfreq/governor_powersave.c | 2 +-
drivers/devfreq/governor_simpleondemand.c | 7 +-
include/linux/devfreq.h | 38 ++++--
6 files changed, 108 insertions(+), 81 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 67da4e7b486b..7fd55b49c8ae 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -255,6 +255,7 @@ static int devfreq_notify_transition(struct devfreq *devfreq,
*/
int update_devfreq(struct devfreq *devfreq)
{
+ struct devfreq_policy *policy = &devfreq->policy;
struct devfreq_freqs freqs;
unsigned long freq, cur_freq, min_freq, max_freq;
int err = 0;
@@ -265,11 +266,11 @@ int update_devfreq(struct devfreq *devfreq)
return -EINVAL;
}

- if (!devfreq->governor)
+ if (!policy->governor)
return -EINVAL;

/* Reevaluate the proper frequency */
- err = devfreq->governor->get_target_freq(devfreq, &freq);
+ err = policy->governor->get_target_freq(devfreq, &freq);
if (err)
return err;

@@ -280,8 +281,8 @@ int update_devfreq(struct devfreq *devfreq)
* max_freq
* min_freq
*/
- max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
- min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
+ max_freq = MIN(policy->devinfo.max_freq, policy->user.max_freq);
+ min_freq = MAX(policy->devinfo.min_freq, policy->user.min_freq);

if (freq < min_freq) {
freq = min_freq;
@@ -493,18 +494,19 @@ static int devfreq_notifier_call(struct notifier_block *nb, unsigned long type,
void *devp)
{
struct devfreq *devfreq = container_of(nb, struct devfreq, nb);
+ struct devfreq_policy *policy = &devfreq->policy;
int ret;

mutex_lock(&devfreq->lock);

- devfreq->scaling_min_freq = find_available_min_freq(devfreq);
- if (!devfreq->scaling_min_freq) {
+ policy->devinfo.min_freq = find_available_min_freq(devfreq);
+ if (!policy->devinfo.min_freq) {
mutex_unlock(&devfreq->lock);
return -EINVAL;
}

- devfreq->scaling_max_freq = find_available_max_freq(devfreq);
- if (!devfreq->scaling_max_freq) {
+ policy->devinfo.max_freq = find_available_max_freq(devfreq);
+ if (!policy->devinfo.max_freq) {
mutex_unlock(&devfreq->lock);
return -EINVAL;
}
@@ -524,6 +526,7 @@ static int devfreq_notifier_call(struct notifier_block *nb, unsigned long type,
static void devfreq_dev_release(struct device *dev)
{
struct devfreq *devfreq = to_devfreq(dev);
+ struct devfreq_policy *policy = &devfreq->policy;

mutex_lock(&devfreq_list_lock);
if (IS_ERR(find_device_devfreq(devfreq->dev.parent))) {
@@ -534,8 +537,8 @@ static void devfreq_dev_release(struct device *dev)
list_del(&devfreq->node);
mutex_unlock(&devfreq_list_lock);

- if (devfreq->governor)
- devfreq->governor->event_handler(devfreq,
+ if (policy->governor)
+ policy->governor->event_handler(devfreq,
DEVFREQ_GOV_STOP, NULL);

if (devfreq->profile->exit)
@@ -559,6 +562,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
void *data)
{
struct devfreq *devfreq;
+ struct devfreq_policy *policy;
struct devfreq_governor *governor;
static atomic_t devfreq_no = ATOMIC_INIT(-1);
int err = 0;
@@ -584,13 +588,14 @@ struct devfreq *devfreq_add_device(struct device *dev,
goto err_out;
}

+ policy = &devfreq->policy;
mutex_init(&devfreq->lock);
mutex_lock(&devfreq->lock);
devfreq->dev.parent = dev;
devfreq->dev.class = devfreq_class;
devfreq->dev.release = devfreq_dev_release;
devfreq->profile = profile;
- strncpy(devfreq->governor_name, governor_name, DEVFREQ_NAME_LEN);
+ strncpy(policy->governor_name, governor_name, DEVFREQ_NAME_LEN);
devfreq->previous_freq = profile->initial_freq;
devfreq->last_status.current_frequency = profile->initial_freq;
devfreq->data = data;
@@ -604,21 +609,21 @@ struct devfreq *devfreq_add_device(struct device *dev,
mutex_lock(&devfreq->lock);
}

- devfreq->scaling_min_freq = find_available_min_freq(devfreq);
- if (!devfreq->scaling_min_freq) {
+ policy->devinfo.min_freq = find_available_min_freq(devfreq);
+ if (!policy->devinfo.min_freq) {
mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
- devfreq->min_freq = devfreq->scaling_min_freq;
+ policy->user.min_freq = policy->devinfo.min_freq;

- devfreq->scaling_max_freq = find_available_max_freq(devfreq);
- if (!devfreq->scaling_max_freq) {
+ policy->devinfo.max_freq = find_available_max_freq(devfreq);
+ if (!policy->devinfo.max_freq) {
mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
- devfreq->max_freq = devfreq->scaling_max_freq;
+ policy->user.max_freq = policy->devinfo.max_freq;

dev_set_name(&devfreq->dev, "devfreq%d",
atomic_inc_return(&devfreq_no));
@@ -646,7 +651,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
mutex_lock(&devfreq_list_lock);
list_add(&devfreq->node, &devfreq_list);

- governor = find_devfreq_governor(devfreq->governor_name);
+ governor = find_devfreq_governor(policy->governor_name);
if (IS_ERR(governor)) {
dev_err(dev, "%s: Unable to find governor for the device\n",
__func__);
@@ -654,9 +659,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
goto err_init;
}

- devfreq->governor = governor;
- err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START,
- NULL);
+ policy->governor = governor;
+ err = policy->governor->event_handler(devfreq, DEVFREQ_GOV_START,
+ NULL);
if (err) {
dev_err(dev, "%s: Unable to start governor for the device\n",
__func__);
@@ -817,10 +822,10 @@ int devfreq_suspend_device(struct devfreq *devfreq)
if (!devfreq)
return -EINVAL;

- if (!devfreq->governor)
+ if (!devfreq->policy.governor)
return 0;

- return devfreq->governor->event_handler(devfreq,
+ return devfreq->policy.governor->event_handler(devfreq,
DEVFREQ_GOV_SUSPEND, NULL);
}
EXPORT_SYMBOL(devfreq_suspend_device);
@@ -838,10 +843,10 @@ int devfreq_resume_device(struct devfreq *devfreq)
if (!devfreq)
return -EINVAL;

- if (!devfreq->governor)
+ if (!devfreq->policy.governor)
return 0;

- return devfreq->governor->event_handler(devfreq,
+ return devfreq->policy.governor->event_handler(devfreq,
DEVFREQ_GOV_RESUME, NULL);
}
EXPORT_SYMBOL(devfreq_resume_device);
@@ -875,30 +880,31 @@ int devfreq_add_governor(struct devfreq_governor *governor)
list_for_each_entry(devfreq, &devfreq_list, node) {
int ret = 0;
struct device *dev = devfreq->dev.parent;
+ struct devfreq_policy *policy = &devfreq->policy;

- if (!strncmp(devfreq->governor_name, governor->name,
+ if (!strncmp(policy->governor_name, governor->name,
DEVFREQ_NAME_LEN)) {
/* The following should never occur */
- if (devfreq->governor) {
+ if (policy->governor) {
dev_warn(dev,
"%s: Governor %s already present\n",
- __func__, devfreq->governor->name);
- ret = devfreq->governor->event_handler(devfreq,
+ __func__, policy->governor->name);
+ ret = policy->governor->event_handler(devfreq,
DEVFREQ_GOV_STOP, NULL);
if (ret) {
dev_warn(dev,
"%s: Governor %s stop = %d\n",
__func__,
- devfreq->governor->name, ret);
+ policy->governor->name, ret);
}
/* Fall through */
}
- devfreq->governor = governor;
- ret = devfreq->governor->event_handler(devfreq,
+ policy->governor = governor;
+ ret = policy->governor->event_handler(devfreq,
DEVFREQ_GOV_START, NULL);
if (ret) {
dev_warn(dev, "%s: Governor %s start=%d\n",
- __func__, devfreq->governor->name,
+ __func__, policy->governor->name,
ret);
}
}
@@ -937,24 +943,25 @@ int devfreq_remove_governor(struct devfreq_governor *governor)
list_for_each_entry(devfreq, &devfreq_list, node) {
int ret;
struct device *dev = devfreq->dev.parent;
+ struct devfreq_policy *policy = &devfreq->policy;

- if (!strncmp(devfreq->governor_name, governor->name,
+ if (!strncmp(policy->governor_name, governor->name,
DEVFREQ_NAME_LEN)) {
/* we should have a devfreq governor! */
- if (!devfreq->governor) {
+ if (!policy->governor) {
dev_warn(dev, "%s: Governor %s NOT present\n",
__func__, governor->name);
continue;
/* Fall through */
}
- ret = devfreq->governor->event_handler(devfreq,
+ ret = policy->governor->event_handler(devfreq,
DEVFREQ_GOV_STOP, NULL);
if (ret) {
dev_warn(dev, "%s: Governor %s stop=%d\n",
- __func__, devfreq->governor->name,
+ __func__, policy->governor->name,
ret);
}
- devfreq->governor = NULL;
+ policy->governor = NULL;
}
}

@@ -969,16 +976,17 @@ EXPORT_SYMBOL(devfreq_remove_governor);
static ssize_t governor_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- if (!to_devfreq(dev)->governor)
+ if (!to_devfreq(dev)->policy.governor)
return -EINVAL;

- return sprintf(buf, "%s\n", to_devfreq(dev)->governor->name);
+ return sprintf(buf, "%s\n", to_devfreq(dev)->policy.governor->name);
}

static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
struct devfreq *df = to_devfreq(dev);
+ struct devfreq_policy *policy = &df->policy;
int ret;
char str_governor[DEVFREQ_NAME_LEN + 1];
struct devfreq_governor *governor;
@@ -993,29 +1001,29 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
ret = PTR_ERR(governor);
goto out;
}
- if (df->governor == governor) {
+ if (policy->governor == governor) {
ret = 0;
goto out;
- } else if ((df->governor && df->governor->immutable) ||
+ } else if ((policy->governor && policy->governor->immutable) ||
governor->immutable) {
ret = -EINVAL;
goto out;
}

- if (df->governor) {
- ret = df->governor->event_handler(df, DEVFREQ_GOV_STOP, NULL);
+ if (policy->governor) {
+ ret = policy->governor->event_handler(df, DEVFREQ_GOV_STOP, NULL);
if (ret) {
dev_warn(dev, "%s: Governor %s not stopped(%d)\n",
- __func__, df->governor->name, ret);
+ __func__, policy->governor->name, ret);
goto out;
}
}
- df->governor = governor;
- strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
- ret = df->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
+ policy->governor = governor;
+ strncpy(policy->governor_name, governor->name, DEVFREQ_NAME_LEN);
+ ret = policy->governor->event_handler(df, DEVFREQ_GOV_START, NULL);
if (ret)
dev_warn(dev, "%s: Governor %s not started(%d)\n",
- __func__, df->governor->name, ret);
+ __func__, policy->governor->name, ret);
out:
mutex_unlock(&devfreq_list_lock);

@@ -1030,6 +1038,7 @@ static ssize_t available_governors_show(struct device *d,
char *buf)
{
struct devfreq *df = to_devfreq(d);
+ struct devfreq_policy *policy = &df->policy;
ssize_t count = 0;

mutex_lock(&devfreq_list_lock);
@@ -1038,9 +1047,9 @@ static ssize_t available_governors_show(struct device *d,
* The devfreq with immutable governor (e.g., passive) shows
* only own governor.
*/
- if (df->governor->immutable) {
+ if (policy->governor->immutable) {
count = scnprintf(&buf[count], DEVFREQ_NAME_LEN,
- "%s ", df->governor_name);
+ "%s ", policy->governor_name);
/*
* The devfreq device shows the registered governor except for
* immutable governors such as passive governor .
@@ -1100,17 +1109,18 @@ static ssize_t polling_interval_store(struct device *dev,
const char *buf, size_t count)
{
struct devfreq *df = to_devfreq(dev);
+ struct devfreq_policy *policy = &df->policy;
unsigned int value;
int ret;

- if (!df->governor)
+ if (!policy->governor)
return -EINVAL;

ret = sscanf(buf, "%u", &value);
if (ret != 1)
return -EINVAL;

- df->governor->event_handler(df, DEVFREQ_GOV_INTERVAL, &value);
+ policy->governor->event_handler(df, DEVFREQ_GOV_INTERVAL, &value);
ret = count;

return ret;
@@ -1131,7 +1141,7 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
mutex_lock(&df->lock);

if (value) {
- if (value > df->max_freq) {
+ if (value > df->policy.user.max_freq) {
ret = -EINVAL;
goto unlock;
}
@@ -1139,7 +1149,7 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
value = df->profile->freq_table[df->profile->max_state - 1];
}

- df->min_freq = value;
+ df->policy.user.min_freq = value;
update_devfreq(df);
ret = count;
unlock:
@@ -1150,9 +1160,10 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
static ssize_t min_freq_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
- struct devfreq *df = to_devfreq(dev);
+ struct devfreq_policy *policy = &to_devfreq(dev)->policy;

- return sprintf(buf, "%lu\n", MAX(df->scaling_min_freq, df->min_freq));
+ return sprintf(buf, "%lu\n",
+ MAX(policy->devinfo.min_freq, policy->user.min_freq));
}

static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
@@ -1169,15 +1180,15 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
mutex_lock(&df->lock);

if (!value) {
- value = df->profile->freq_table[0];
+ value = df->policy.user.min_freq;
} else {
- if (value < df->min_freq) {
+ if (value < df->policy.user.min_freq) {
ret = -EINVAL;
goto unlock;
}
}

- df->max_freq = value;
+ df->policy.user.max_freq = value;
update_devfreq(df);
ret = count;
unlock:
@@ -1189,9 +1200,10 @@ static DEVICE_ATTR_RW(min_freq);
static ssize_t max_freq_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
- struct devfreq *df = to_devfreq(dev);
+ struct devfreq_policy *policy = &to_devfreq(dev)->policy;

- return sprintf(buf, "%lu\n", MIN(df->scaling_max_freq, df->max_freq));
+ return sprintf(buf, "%lu\n",
+ MIN(policy->devinfo.max_freq, policy->user.max_freq));
}
static DEVICE_ATTR_RW(max_freq);

diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 3bc29acbd54e..e0987c749ec2 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -99,12 +99,12 @@ static int update_devfreq_passive(struct devfreq *devfreq, unsigned long freq)
{
int ret;

- if (!devfreq->governor)
+ if (!devfreq->policy.governor)
return -EINVAL;

mutex_lock_nested(&devfreq->lock, SINGLE_DEPTH_NESTING);

- ret = devfreq->governor->get_target_freq(devfreq, &freq);
+ ret = devfreq->policy.governor->get_target_freq(devfreq, &freq);
if (ret < 0)
goto out;

diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
index a8e3478b3c43..e5767bdbf77d 100644
--- a/drivers/devfreq/governor_performance.c
+++ b/drivers/devfreq/governor_performance.c
@@ -20,7 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
* target callback should be able to get floor value as
* said in devfreq.h
*/
- *freq = df->scaling_max_freq;
+ *freq = df->policy.devinfo.max_freq;
return 0;
}

diff --git a/drivers/devfreq/governor_powersave.c b/drivers/devfreq/governor_powersave.c
index 8696efd32e5a..86184b3a572a 100644
--- a/drivers/devfreq/governor_powersave.c
+++ b/drivers/devfreq/governor_powersave.c
@@ -20,7 +20,7 @@ static int devfreq_powersave_func(struct devfreq *df,
* target callback should be able to get ceiling value as
* said in devfreq.h
*/
- *freq = df->scaling_min_freq;
+ *freq = df->policy.devinfo.min_freq;
return 0;
}

diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
index 805fee09c754..076f9f8e33b2 100644
--- a/drivers/devfreq/governor_simpleondemand.c
+++ b/drivers/devfreq/governor_simpleondemand.c
@@ -23,6 +23,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
{
int err;
struct devfreq_dev_status *stat;
+ struct devfreq_policy *policy = &df->policy;
unsigned long long a, b;
unsigned int dfso_upthreshold = DFSO_UPTHRESHOLD;
unsigned int dfso_downdifferential = DFSO_DOWNDIFFERENCTIAL;
@@ -46,7 +47,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,

/* Assume MAX if it is going to be divided by zero */
if (stat->total_time == 0) {
- *freq = df->scaling_max_freq;
+ *freq = policy->devinfo.max_freq;
return 0;
}

@@ -59,13 +60,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
/* Set MAX if it's busy enough */
if (stat->busy_time * 100 >
stat->total_time * dfso_upthreshold) {
- *freq = df->scaling_max_freq;
+ *freq = policy->devinfo.max_freq;
return 0;
}

/* Set MAX if we do not know the initial frequency */
if (stat->current_frequency == 0) {
- *freq = df->scaling_max_freq;
+ *freq = policy->devinfo.max_freq;
return 0;
}

diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 3aae5b3af87c..9bf23b976f4d 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -109,6 +109,30 @@ struct devfreq_dev_profile {
unsigned int max_state;
};

+/**
+ * struct devfreq_freq_limits - Devfreq frequency limits
+ * @min_freq: minimum frequency
+ * @max_freq: maximum frequency
+ */
+struct devfreq_freq_limits {
+ unsigned long min_freq;
+ unsigned long max_freq;
+};
+
+/**
+ * struct devfreq_policy - Devfreq policy
+ * @user: frequency limits requested by the user
+ * @devinfo: frequency limits of the device (available OPPs)
+ * @governor: method how to choose frequency based on the usage.
+ * @governor_name: devfreq governor name for use with this devfreq
+ */
+struct devfreq_policy {
+ struct devfreq_freq_limits user;
+ struct devfreq_freq_limits devinfo;
+ const struct devfreq_governor *governor;
+ char governor_name[DEVFREQ_NAME_LEN];
+};
+
/**
* struct devfreq - Device devfreq structure
* @node: list node - contains the devices with devfreq that have been
@@ -117,8 +141,6 @@ struct devfreq_dev_profile {
* @dev: device registered by devfreq class. dev.parent is the device
* using devfreq.
* @profile: device-specific devfreq profile
- * @governor: method how to choose frequency based on the usage.
- * @governor_name: devfreq governor name for use with this devfreq
* @nb: notifier block used to notify devfreq object that it should
* reevaluate operable frequencies. Devfreq users may use
* devfreq.nb to the corresponding register notifier call chain.
@@ -126,10 +148,7 @@ struct devfreq_dev_profile {
* @previous_freq: previously configured frequency value.
* @data: Private data of the governor. The devfreq framework does not
* touch this.
- * @min_freq: Limit minimum frequency requested by user (0: none)
- * @max_freq: Limit maximum frequency requested by user (0: none)
- * @scaling_min_freq: Limit minimum frequency requested by OPP interface
- * @scaling_max_freq: Limit maximum frequency requested by OPP interface
+ * @policy: Policy for frequency adjustments
* @stop_polling: devfreq polling status of a device.
* @total_trans: Number of devfreq transitions
* @trans_table: Statistics of devfreq transitions
@@ -151,8 +170,6 @@ struct devfreq {
struct mutex lock;
struct device dev;
struct devfreq_dev_profile *profile;
- const struct devfreq_governor *governor;
- char governor_name[DEVFREQ_NAME_LEN];
struct notifier_block nb;
struct delayed_work work;

@@ -161,10 +178,7 @@ struct devfreq {

void *data; /* private data for governors */

- unsigned long min_freq;
- unsigned long max_freq;
- unsigned long scaling_min_freq;
- unsigned long scaling_max_freq;
+ struct devfreq_policy policy;
bool stop_polling;

/* information for device frequency transition */
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:34:21

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

The purpose of the throttler is to provide support for non-thermal
throttling. Throttling is triggered by external event, e.g. the
detection of a high battery discharge current, close to the OCP limit
of the battery. The throttler is only in charge of the throttling, not
the monitoring, which is done by another (possibly platform specific)
driver.

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/misc/Kconfig | 1 +
drivers/misc/Makefile | 1 +
drivers/misc/throttler/Kconfig | 13 ++
drivers/misc/throttler/Makefile | 1 +
drivers/misc/throttler/core.c | 373 ++++++++++++++++++++++++++++++++
include/linux/throttler.h | 10 +
6 files changed, 399 insertions(+)
create mode 100644 drivers/misc/throttler/Kconfig
create mode 100644 drivers/misc/throttler/Makefile
create mode 100644 drivers/misc/throttler/core.c
create mode 100644 include/linux/throttler.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 5d713008749b..691d9625d83c 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -513,4 +513,5 @@ source "drivers/misc/echo/Kconfig"
source "drivers/misc/cxl/Kconfig"
source "drivers/misc/ocxl/Kconfig"
source "drivers/misc/cardreader/Kconfig"
+source "drivers/misc/throttler/Kconfig"
endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 20be70c3f118..01a1714dd2ad 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
obj-$(CONFIG_OCXL) += ocxl/
obj-$(CONFIG_MISC_RTSX) += cardreader/
+obj-y += throttler/
diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
new file mode 100644
index 000000000000..ef8388f6bc0a
--- /dev/null
+++ b/drivers/misc/throttler/Kconfig
@@ -0,0 +1,13 @@
+menuconfig THROTTLER
+ bool "Throttler support"
+ default n
+ depends on OF
+ select CPU_FREQ
+ select PM_DEVFREQ
+ help
+ This option enables core support for non-thermal throttling of CPUs
+ and devfreq devices.
+
+ Note that you also need a event monitor module usually called
+ *_throttler.
+
diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
new file mode 100644
index 000000000000..c8d920cee315
--- /dev/null
+++ b/drivers/misc/throttler/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_THROTTLER) += core.o
diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
new file mode 100644
index 000000000000..c058d03212b8
--- /dev/null
+++ b/drivers/misc/throttler/core.c
@@ -0,0 +1,373 @@
+/*
+ * Core code for non-thermal throttling
+ *
+ * Copyright (C) 2018 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/cpufreq.h>
+#include <linux/devfreq.h>
+#include <linux/kernel.h>
+#include <linux/notifier.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+
+/*
+ * Non-thermal throttling: throttling of system components in response to
+ * external events (e.g. high battery discharge current).
+ *
+ * The throttler supports throttling through cpufreq and devfreq. Multiple
+ * levels of throttling can be configured. At level 0 no throttling is
+ * active on behalf of the throttler, for values > 0 throttling is typically
+ * configured to be increasingly aggressive with each level.
+ * The number of throttling levels is not limited by the throttler (though
+ * it is likely limited by the throttling devices). It is not necessary to
+ * configure the same number of levels for all throttling devices. If the
+ * requested throttling level for a device is higher than the maximum level
+ * of the device the throttler will sleect the maximum throttling level of
+ * the device.
+ *
+ * Non-thermal throttling is split in two parts:
+ *
+ * - throttler core
+ * - parses the thermal policy
+ * - applies throttling settings for a requested level of throttling
+ *
+ * - event monitor driver
+ * - monitors the events that trigger throttling
+ * - determines the throttling level (often limited to on/off)
+ * - requests throttler core to apply throttling settings
+ *
+ * It is possible for a system to have more than one throttler and the
+ * throttlers may make use of the same throttling devices, in case of
+ * conflicting settings for a device the more aggressive values will be
+ * applied.
+ *
+ */
+
+struct thrcfg {
+ uint32_t *freqs;
+ int num_levels;
+};
+
+struct cpufreq_thrdev {
+ uint32_t cpu;
+ struct thrcfg cfg;
+};
+
+struct devfreq_thrdev {
+ struct devfreq *devfreq;
+ struct thrcfg cfg;
+ struct throttler *thr;
+ struct notifier_block nb;
+};
+
+struct __thr_cpufreq {
+ struct cpufreq_thrdev *devs;
+ int ndevs;
+ struct notifier_block nb;
+};
+
+struct __thr_devfreq {
+ struct devfreq_thrdev *devs;
+ int ndevs;
+};
+
+struct throttler {
+ struct device *dev;
+ int level;
+ struct __thr_cpufreq cpufreq;
+ struct __thr_devfreq devfreq;
+};
+
+static unsigned long thr_get_throttling_freq(struct thrcfg *cfg, int level)
+{
+ if (level == 0 ) {
+ WARN(true, "level == 0");
+ return 0;
+ }
+
+ if (level <= cfg->num_levels)
+ return cfg->freqs[level - 1];
+ else
+ return cfg->freqs[cfg->num_levels - 1];
+}
+
+static int thr_cpufreq_event(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+ struct throttler *thr =
+ container_of(nb, struct throttler, cpufreq.nb);
+ struct cpufreq_policy *policy = data;
+ struct cpufreq_thrdev *ctd;
+ int i;
+
+ if ((event != CPUFREQ_ADJUST) || (thr->level == 0))
+ return NOTIFY_DONE;
+
+ for (i = 0; i < thr->cpufreq.ndevs; i++) {
+ ctd = &thr->cpufreq.devs[i];
+
+ if (ctd->cpu == policy->cpu) {
+ unsigned long clamp_freq =
+ thr_get_throttling_freq(&ctd->cfg, thr->level);
+ if (clamp_freq < policy->max) {
+ cpufreq_verify_within_limits(policy, 0, clamp_freq);
+ }
+ }
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int thr_devfreq_event(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+ struct devfreq_thrdev *dtd =
+ container_of(nb, struct devfreq_thrdev, nb);
+ struct throttler *thr = dtd->thr;
+ struct devfreq_policy *policy = data;
+ unsigned long clamp_freq;
+
+ if ((event != DEVFREQ_ADJUST) || (thr->level == 0))
+ return NOTIFY_DONE;
+
+ clamp_freq = thr_get_throttling_freq(&dtd->cfg, thr->level);
+ if (clamp_freq < policy->max)
+ devfreq_verify_within_limits(policy, 0, clamp_freq);
+
+ return NOTIFY_DONE;
+}
+
+static void thr_cpufreq_update_policy(struct throttler *thr)
+{
+ int i;
+
+ for (i = 0; i < thr->cpufreq.ndevs; i++) {
+ struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
+ struct cpufreq_policy *policy = cpufreq_cpu_get(ctd->cpu);
+
+ if (!policy) {
+ dev_warn(thr->dev, "CPU%d does have no cpufreq policy!\n", ctd->cpu);
+ continue;
+ }
+
+ cpufreq_update_policy(ctd->cpu);
+ cpufreq_cpu_put(policy);
+ }
+}
+
+static int thr_parse_thrcfg(struct throttler *thr,
+ struct device_node *np, struct thrcfg *cfg) {
+ int err;
+
+ cfg->num_levels =
+ of_property_count_u32_elems(np, "throttling-frequencies");
+ if (cfg->num_levels < 0) {
+ pr_err("%s: failed to determine number of throttling frequencies\n",
+ np->full_name);
+ return cfg->num_levels;
+ }
+
+ cfg->freqs = devm_kzalloc(thr->dev,
+ cfg->num_levels * sizeof(u32), GFP_KERNEL);
+ if (!cfg->freqs)
+ return -ENOMEM;
+
+ err = of_property_read_u32_array(np, "throttling-frequencies",
+ cfg->freqs, cfg->num_levels);
+ if (err) {
+ pr_err("%s: failed to read throttling frequencies\n", np->full_name);
+ return err;
+ }
+
+ return 0;
+}
+
+static struct devfreq *thr_find_devfreq_dev(struct throttler *thr,
+ struct device_node *np_df) {
+ struct device_node *node;
+ struct platform_device *pdev;
+
+ node = of_parse_phandle(np_df, "device", 0);
+ if (!node) {
+ pr_err("%s: failed to get devfreq parent device\n",
+ np_df->full_name);
+ return ERR_PTR(-EINVAL);
+ }
+
+ pdev = of_find_device_by_node(node);
+ if (!pdev) {
+ pr_err("%s: could not find devfreq parent device\n",
+ node->full_name);
+ return ERR_PTR(-EINVAL);
+ }
+
+ return dev_to_devfreq(&pdev->dev);
+}
+
+static int thr_parse_dt(struct throttler *thr, struct device_node *np)
+{
+ struct device_node *node, *child;
+ int err, i;
+
+ node = of_get_child_by_name(np, "cpufreq");
+ if (node) {
+ thr->cpufreq.ndevs = of_get_child_count(node);
+ thr->cpufreq.devs = devm_kzalloc(thr->dev,
+ sizeof(*thr->cpufreq.devs) * thr->cpufreq.ndevs,
+ GFP_KERNEL);
+
+ i = 0;
+ for_each_child_of_node(node, child) {
+ struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
+
+ err = of_property_read_u32(child, "cpu", &ctd->cpu);
+ if (err) {
+ pr_err("%s: failed to read CPU id\n", child->full_name);
+ return err;
+ }
+
+ err = thr_parse_thrcfg(thr, child, &ctd->cfg);
+ if (err)
+ return err;
+
+ i++;
+ }
+ }
+
+ node = of_get_child_by_name(np, "devfreq");
+ if (node) {
+ thr->devfreq.ndevs = of_get_child_count(node);
+ thr->devfreq.devs = devm_kzalloc(thr->dev,
+ sizeof(*thr->devfreq.devs) * thr->devfreq.ndevs,
+ GFP_KERNEL);
+
+ i = 0;
+ for_each_child_of_node(node, child) {
+ struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
+
+ dtd->thr = thr;
+
+ dtd->devfreq = thr_find_devfreq_dev(thr, child);
+ if (IS_ERR(dtd->devfreq))
+ return PTR_ERR(dtd->devfreq);
+
+ err = thr_parse_thrcfg(thr, child, &dtd->cfg);
+ if (err)
+ return err;
+
+ i++;
+ }
+ }
+
+ return 0;
+}
+
+static void thr_update_devfreq(struct devfreq *devfreq)
+{
+ mutex_lock(&devfreq->lock);
+ update_devfreq(devfreq);
+ mutex_unlock(&devfreq->lock);
+}
+
+void throttler_set_level(struct throttler *thr, int level)
+{
+ int i;
+
+ if (level == thr->level)
+ return;
+
+ dev_dbg(thr->dev, "throttling level: %d\n", level);
+ thr->level = level;
+
+ if (thr->cpufreq.ndevs > 0)
+ thr_cpufreq_update_policy(thr);
+
+ if (thr->devfreq.ndevs > 0)
+ for (i = 0; i < thr->devfreq.ndevs; i++)
+ thr_update_devfreq(thr->devfreq.devs[i].devfreq);
+}
+EXPORT_SYMBOL_GPL(throttler_set_level);
+
+struct throttler *throttler_setup(struct device *dev)
+{
+ struct throttler *thr;
+ struct device_node *np = dev->of_node;
+ int err, i;
+
+ if (!np)
+ /* should never happen */
+ return ERR_PTR(-EINVAL);
+
+ thr = devm_kzalloc(dev, sizeof(*thr), GFP_KERNEL);
+ if (!thr)
+ return ERR_PTR(-ENOMEM);
+
+ thr->dev = dev;
+
+ err = thr_parse_dt(thr, np);
+ if (err)
+ return ERR_PTR(err);
+
+ if (thr->cpufreq.ndevs > 0) {
+ thr->cpufreq.nb.notifier_call = thr_cpufreq_event;
+ err = cpufreq_register_notifier(&thr->cpufreq.nb,
+ CPUFREQ_POLICY_NOTIFIER);
+ if (err < 0) {
+ dev_err(dev, "failed to register cpufreq notifier\n");
+ return ERR_PTR(err);
+ }
+ }
+
+ for (i = 0; i < thr->devfreq.ndevs; i++) {
+ struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
+
+ dtd->nb.notifier_call = thr_devfreq_event;
+ err = devm_devfreq_register_notifier(dev, dtd->devfreq,
+ &dtd->nb, DEVFREQ_POLICY_NOTIFIER);
+ if (err < 0) {
+ dev_err(dev, "failed to register devfreq notifier\n");
+ goto err_cpufreq_unregister;
+ }
+ }
+
+ return thr;
+
+err_cpufreq_unregister:
+ if (thr->cpufreq.ndevs > 0)
+ cpufreq_unregister_notifier(&thr->cpufreq.nb,
+ CPUFREQ_POLICY_NOTIFIER);
+
+ return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(throttler_setup);
+
+void throttler_teardown(struct throttler *thr)
+{
+ int i;
+
+ thr->level = 0;
+
+ if (thr->cpufreq.ndevs > 0) {
+ thr_cpufreq_update_policy(thr);
+
+ cpufreq_unregister_notifier(&thr->cpufreq.nb,
+ CPUFREQ_POLICY_NOTIFIER);
+ }
+
+ if (thr->devfreq.ndevs > 0)
+ for (i = 0; i < thr->devfreq.ndevs; i++)
+ thr_update_devfreq(thr->devfreq.devs[i].devfreq);
+}
+EXPORT_SYMBOL_GPL(throttler_teardown);
diff --git a/include/linux/throttler.h b/include/linux/throttler.h
new file mode 100644
index 000000000000..cab8c466da4b
--- /dev/null
+++ b/include/linux/throttler.h
@@ -0,0 +1,10 @@
+#ifndef __LINUX_THROTTLER_H__
+#define __LINUX_THROTTLER_H__
+
+struct throttler;
+
+extern struct throttler *throttler_setup(struct device *dev);
+extern void throttler_teardown(struct throttler *thr);
+extern void throttler_set_level(struct throttler *thr, int level);
+
+#endif /* __LINUX_THROTTLER_H__ */
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:34:22

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
devfreq device") initializes df->min/max_freq with the min/max OPP when
the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
available min/max frequency") adds df->scaling_min/max_freq and the
following to the frequency adjustment code:

max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);

With the current handling of min/max_freq this is incorrect:

Even though df->max_freq is now initialized to a value != 0 user space
can still set it to 0, in this case max_freq would be 0 instead of
df->scaling_max_freq as intended. In consequence the frequency adjustment
is not performed:

if (max_freq && freq > max_freq) {
freq = max_freq;

To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
when the user passes a value of 0. This also prevents df->max_freq from
being set below the min OPP when df->min_freq is 0, and similar for
min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
checks for this case can be removed.

Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 0057ef5b0a98..67da4e7b486b 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);

- if (min_freq && freq < min_freq) {
+ if (freq < min_freq) {
freq = min_freq;
flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
}
- if (max_freq && freq > max_freq) {
+ if (freq > max_freq) {
freq = max_freq;
flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
}
@@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
unsigned long value;
int ret;
- unsigned long max;

ret = sscanf(buf, "%lu", &value);
if (ret != 1)
return -EINVAL;

mutex_lock(&df->lock);
- max = df->max_freq;
- if (value && max && value > max) {
- ret = -EINVAL;
- goto unlock;
+
+ if (value) {
+ if (value > df->max_freq) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+ } else {
+ value = df->profile->freq_table[df->profile->max_state - 1];
}

df->min_freq = value;
@@ -1158,17 +1161,20 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
unsigned long value;
int ret;
- unsigned long min;

ret = sscanf(buf, "%lu", &value);
if (ret != 1)
return -EINVAL;

mutex_lock(&df->lock);
- min = df->min_freq;
- if (value && min && value < min) {
- ret = -EINVAL;
- goto unlock;
+
+ if (!value) {
+ value = df->profile->freq_table[0];
+ } else {
+ if (value < df->min_freq) {
+ ret = -EINVAL;
+ goto unlock;
+ }
}

df->max_freq = value;
--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:34:32

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 04/11] PM / devfreq: Remove redundant frequency adjustment from governors

The userspace and simpleondemand governor determine a target frequency and
then adjust it according to the df->min/max_freq limits that might have
been set by user space. This adjustment is redundant, it is done in
update_devfreq() for any governor, right after returning from
governor->get_target_freq().

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/governor_simpleondemand.c | 5 -----
drivers/devfreq/governor_userspace.c | 16 ++++------------
2 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
index 278964783fa6..3da7554b4837 100644
--- a/drivers/devfreq/governor_simpleondemand.c
+++ b/drivers/devfreq/governor_simpleondemand.c
@@ -84,11 +84,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
b = div_u64(b, (dfso_upthreshold - dfso_downdifferential / 2));
*freq = (unsigned long) b;

- if (df->min_freq && *freq < df->min_freq)
- *freq = df->min_freq;
- if (df->max_freq && *freq > df->max_freq)
- *freq = df->max_freq;
-
return 0;
}

diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c
index 080607c3f34d..378d84c011df 100644
--- a/drivers/devfreq/governor_userspace.c
+++ b/drivers/devfreq/governor_userspace.c
@@ -26,19 +26,11 @@ static int devfreq_userspace_func(struct devfreq *df, unsigned long *freq)
{
struct userspace_data *data = df->data;

- if (data->valid) {
- unsigned long adjusted_freq = data->user_frequency;
-
- if (df->max_freq && adjusted_freq > df->max_freq)
- adjusted_freq = df->max_freq;
-
- if (df->min_freq && adjusted_freq < df->min_freq)
- adjusted_freq = df->min_freq;
-
- *freq = adjusted_freq;
- } else {
+ if (data->valid)
+ *freq = data->user_frequency;
+ else
*freq = df->previous_freq; /* No user freq specified yet */
- }
+
return 0;
}

--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:35:36

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 03/11] PM / devfreq: Remove check for df->max_freq == 0 from governors

Commit "PM / devfreq: Fix handling of min/max_freq == 0" ensures that
df->max_freq is not 0, remove unnecessary checks.

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/governor_performance.c | 5 +----
drivers/devfreq/governor_simpleondemand.c | 7 +++----
2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
index 4d23ecfbd948..1c990cb45098 100644
--- a/drivers/devfreq/governor_performance.c
+++ b/drivers/devfreq/governor_performance.c
@@ -20,10 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
* target callback should be able to get floor value as
* said in devfreq.h
*/
- if (!df->max_freq)
- *freq = UINT_MAX;
- else
- *freq = df->max_freq;
+ *freq = df->max_freq;
return 0;
}

diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
index 28e0f2de7100..278964783fa6 100644
--- a/drivers/devfreq/governor_simpleondemand.c
+++ b/drivers/devfreq/governor_simpleondemand.c
@@ -27,7 +27,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
unsigned int dfso_upthreshold = DFSO_UPTHRESHOLD;
unsigned int dfso_downdifferential = DFSO_DOWNDIFFERENCTIAL;
struct devfreq_simple_ondemand_data *data = df->data;
- unsigned long max = (df->max_freq) ? df->max_freq : UINT_MAX;

err = devfreq_update_stats(df);
if (err)
@@ -47,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,

/* Assume MAX if it is going to be divided by zero */
if (stat->total_time == 0) {
- *freq = max;
+ *freq = df->max_freq;
return 0;
}

@@ -60,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
/* Set MAX if it's busy enough */
if (stat->busy_time * 100 >
stat->total_time * dfso_upthreshold) {
- *freq = max;
+ *freq = df->max_freq;
return 0;
}

/* Set MAX if we do not know the initial frequency */
if (stat->current_frequency == 0) {
- *freq = max;
+ *freq = df->max_freq;
return 0;
}

--
2.17.0.921.gf22659ad46-goog


2018-05-25 20:36:38

by Matthias Kaehlcke

[permalink] [raw]
Subject: [PATCH 01/11] PM / devfreq: Init user limits from OPP limits, not viceversa

Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding
the devfreq device") introduced the initialization of the user
limits min/max_freq from the lowest/highest available OPPs. Later
commit f1d981eaecf8 ("PM / devfreq: Use the available min/max
frequency") added scaling_min/max_freq, which actually represent
the frequencies of the lowest/highest available OPP. scaling_min/
max_freq are initialized with the values from min/max_freq, which
is totally correct in the context, but a bit awkward to read.

Swap the initialization and assign scaling_min/max_freq with the
OPP freqs and then the user limts min/max_freq with scaling_min/
max_freq.

Needless to say that this change is a NOP, intended to improve
readability.

Signed-off-by: Matthias Kaehlcke <[email protected]>
---
drivers/devfreq/devfreq.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index fe2af6aa88fc..0057ef5b0a98 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -604,21 +604,21 @@ struct devfreq *devfreq_add_device(struct device *dev,
mutex_lock(&devfreq->lock);
}

- devfreq->min_freq = find_available_min_freq(devfreq);
- if (!devfreq->min_freq) {
+ devfreq->scaling_min_freq = find_available_min_freq(devfreq);
+ if (!devfreq->scaling_min_freq) {
mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
- devfreq->scaling_min_freq = devfreq->min_freq;
+ devfreq->min_freq = devfreq->scaling_min_freq;

- devfreq->max_freq = find_available_max_freq(devfreq);
- if (!devfreq->max_freq) {
+ devfreq->scaling_max_freq = find_available_max_freq(devfreq);
+ if (!devfreq->scaling_max_freq) {
mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
- devfreq->scaling_max_freq = devfreq->max_freq;
+ devfreq->max_freq = devfreq->scaling_max_freq;

dev_set_name(&devfreq->dev, "devfreq%d",
atomic_inc_return(&devfreq_no));
--
2.17.0.921.gf22659ad46-goog


2018-05-28 04:00:32

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
> devfreq device") initializes df->min/max_freq with the min/max OPP when
> the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
> available min/max frequency") adds df->scaling_min/max_freq and the
> following to the frequency adjustment code:
>
> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>
> With the current handling of min/max_freq this is incorrect:
>
> Even though df->max_freq is now initialized to a value != 0 user space
> can still set it to 0, in this case max_freq would be 0 instead of
> df->scaling_max_freq as intended. In consequence the frequency adjustment
> is not performed:
>
> if (max_freq && freq > max_freq) {
> freq = max_freq;
>
> To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
> when the user passes a value of 0. This also prevents df->max_freq from
> being set below the min OPP when df->min_freq is 0, and similar for
> min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
> checks for this case can be removed.
>
> Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
> 1 file changed, 18 insertions(+), 12 deletions(-)

Thanks a lot! Nice Catch.

Acked-by: MyungJoo Ham <[email protected]>

Cheers,
MyungJoo.

2018-05-28 04:52:26

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 03/11] PM / devfreq: Remove check for df->max_freq == 0 from governors

>Commit "PM / devfreq: Fix handling of min/max_freq == 0" ensures that
>df->max_freq is not 0, remove unnecessary checks.
>
>Signed-off-by: Matthias Kaehlcke <[email protected]>
>---
> drivers/devfreq/governor_performance.c | 5 +----
> drivers/devfreq/governor_simpleondemand.c | 7 +++----
> 2 files changed, 4 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
>index 4d23ecfbd948..1c990cb45098 100644
>

Thanks!

Acked-by: MyungJoo Ham <[email protected]>

2018-05-28 04:58:17

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 04/11] PM / devfreq: Remove redundant frequency adjustment from governors

> The userspace and simpleondemand governor determine a target frequency and
> then adjust it according to the df->min/max_freq limits that might have
> been set by user space. This adjustment is redundant, it is done in
> update_devfreq() for any governor, right after returning from
> governor->get_target_freq().
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/governor_simpleondemand.c | 5 -----
> drivers/devfreq/governor_userspace.c | 16 ++++------------
> 2 files changed, 4 insertions(+), 17 deletions(-)
>

Yes, indeed. Governors are no longer required to be aware of min/max freq.

Acked-by: MyungJoo Ham <[email protected]>


2018-05-28 05:05:19

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 05/11] PM / devfreq: governors: Return device frequency limits instead of user limits

>The performance, powersave and simpleondemand governors can return
>df->min/max_freq, which are the user defined frequency limits.
>update_devfreq() already takes care of adjusting the target frequency
>with the user limits if necessary, therefore we can return
>df->scaling_min/max_freq instead, which is the min/max frequency
>supported by the device at a given time (depending on the
>enabled/disabled OPPs)
>
>Signed-off-by: Matthias Kaehlcke <[email protected]>
>---
> drivers/devfreq/governor_performance.c | 2 +-
> drivers/devfreq/governor_powersave.c | 2 +-
> drivers/devfreq/governor_simpleondemand.c | 6 +++---
> 3 files changed, 5 insertions(+), 5 deletions(-)
>

Actually, even scaling_max_freq and scaling_min_freq are
covered centerally at devfreq.c:update_devfreq();

Wouldn't it be sufficient to return UINT_MAX for performance
and return UINT_MIN (0) for powersave, if the purpose is to
remove redundancy?

In the same sense, we may return UINT_MAX for freq-increasing
case for simpleondemand as well, because they are filtered
centrally anyway.

(This commit might be better merged to 4/11 in that case as well.)


Cheers,
MyungJoo


2018-05-28 05:21:06

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 07/11] PM / devfreg: Add support policy notifiers

>Policy notifiers are called before a frequency change and may narrow
>the min/max frequency range in devfreq_policy, which is used to adjust
>the target frequency if it is beyond this range.
>
>Also add a few helpers:
> - devfreq_verify_within_[dev_]limits()
> - should be used by the notifiers for policy adjustments.
> - dev_to_devfreq()
> - lookup a devfreq strict from a device pointer
>
>Signed-off-by: Matthias Kaehlcke <[email protected]>
>---
> drivers/devfreq/devfreq.c | 47 +++++++++++++++++++++-------
> include/linux/devfreq.h | 66 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 102 insertions(+), 11 deletions(-)

Hello Matthias,


Why should we have yet another notifier from an instance of devfreq?
Wouldn't it better to let the current notifier (transition notifier)
handle new events as well by adding possible event states to it?

Anyway, is this the reason why you've separated some data of devfreq
into "policy" struct? (I was wondering why while reading commit 6/11).


Cheers
MyungJoo

2018-05-28 05:26:05

by MyungJoo Ham

[permalink] [raw]
Subject: RE: [PATCH 08/11] PM / devfreq: Make update_devfreq() public

>Currently update_devfreq() is only visible to devfreq governors outside
>of devfreq.c. Make it public to allow drivers that adjust devfreq policies
>to cause a re-evaluation of the frequency after a policy change.
>
>Signed-off-by: Matthias Kaehlcke <[email protected]>
>---
> drivers/devfreq/governor.h | 3 ---
> include/linux/devfreq.h | 8 ++++++++
> 2 files changed, 8 insertions(+), 3 deletions(-)

With the requirement from patch 9/11, this commit is reasonable enough.

Acked-by: MyungJoo Ham <[email protected]>


2018-05-28 05:27:28

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 01/11] PM / devfreq: Init user limits from OPP limits, not viceversa

Hi,

On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding
> the devfreq device") introduced the initialization of the user
> limits min/max_freq from the lowest/highest available OPPs. Later
> commit f1d981eaecf8 ("PM / devfreq: Use the available min/max
> frequency") added scaling_min/max_freq, which actually represent
> the frequencies of the lowest/highest available OPP. scaling_min/
> max_freq are initialized with the values from min/max_freq, which
> is totally correct in the context, but a bit awkward to read.
>
> Swap the initialization and assign scaling_min/max_freq with the
> OPP freqs and then the user limts min/max_freq with scaling_min/
> max_freq.
>
> Needless to say that this change is a NOP, intended to improve
> readability.
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/devfreq.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index fe2af6aa88fc..0057ef5b0a98 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -604,21 +604,21 @@ struct devfreq *devfreq_add_device(struct device *dev,
> mutex_lock(&devfreq->lock);
> }
>
> - devfreq->min_freq = find_available_min_freq(devfreq);
> - if (!devfreq->min_freq) {
> + devfreq->scaling_min_freq = find_available_min_freq(devfreq);
> + if (!devfreq->scaling_min_freq) {
> mutex_unlock(&devfreq->lock);
> err = -EINVAL;
> goto err_dev;
> }
> - devfreq->scaling_min_freq = devfreq->min_freq;
> + devfreq->min_freq = devfreq->scaling_min_freq;
>
> - devfreq->max_freq = find_available_max_freq(devfreq);
> - if (!devfreq->max_freq) {
> + devfreq->scaling_max_freq = find_available_max_freq(devfreq);
> + if (!devfreq->scaling_max_freq) {
> mutex_unlock(&devfreq->lock);
> err = -EINVAL;
> goto err_dev;
> }
> - devfreq->scaling_max_freq = devfreq->max_freq;
> + devfreq->max_freq = devfreq->scaling_max_freq;
>
> dev_set_name(&devfreq->dev, "devfreq%d",
> atomic_inc_return(&devfreq_no));
>

I already replied with my Reviewed-by tag. You are missing my tag.

Again,
Reviewed-by: Chanwoo Choi <[email protected]>

--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-28 05:29:57

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 03/11] PM / devfreq: Remove check for df->max_freq == 0 from governors

Hi,

On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> Commit "PM / devfreq: Fix handling of min/max_freq == 0" ensures that
> df->max_freq is not 0, remove unnecessary checks.
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/governor_performance.c | 5 +----
> drivers/devfreq/governor_simpleondemand.c | 7 +++----
> 2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
> index 4d23ecfbd948..1c990cb45098 100644
> --- a/drivers/devfreq/governor_performance.c
> +++ b/drivers/devfreq/governor_performance.c
> @@ -20,10 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
> * target callback should be able to get floor value as
> * said in devfreq.h
> */
> - if (!df->max_freq)
> - *freq = UINT_MAX;
> - else
> - *freq = df->max_freq;
> + *freq = df->max_freq;
> return 0;
> }
>
> diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
> index 28e0f2de7100..278964783fa6 100644
> --- a/drivers/devfreq/governor_simpleondemand.c
> +++ b/drivers/devfreq/governor_simpleondemand.c
> @@ -27,7 +27,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
> unsigned int dfso_upthreshold = DFSO_UPTHRESHOLD;
> unsigned int dfso_downdifferential = DFSO_DOWNDIFFERENCTIAL;
> struct devfreq_simple_ondemand_data *data = df->data;
> - unsigned long max = (df->max_freq) ? df->max_freq : UINT_MAX;
>
> err = devfreq_update_stats(df);
> if (err)
> @@ -47,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
>
> /* Assume MAX if it is going to be divided by zero */
> if (stat->total_time == 0) {
> - *freq = max;
> + *freq = df->max_freq;
> return 0;
> }
>
> @@ -60,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
> /* Set MAX if it's busy enough */
> if (stat->busy_time * 100 >
> stat->total_time * dfso_upthreshold) {
> - *freq = max;
> + *freq = df->max_freq;
> return 0;
> }
>
> /* Set MAX if we do not know the initial frequency */
> if (stat->current_frequency == 0) {
> - *freq = max;
> + *freq = df->max_freq;
> return 0;
> }
>
>

Reviewed-by: Chanwoo Choi <[email protected]>

--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-28 05:38:13

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 04/11] PM / devfreq: Remove redundant frequency adjustment from governors

Hi,

On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> The userspace and simpleondemand governor determine a target frequency and
> then adjust it according to the df->min/max_freq limits that might have
> been set by user space. This adjustment is redundant, it is done in
> update_devfreq() for any governor, right after returning from
> governor->get_target_freq().
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/governor_simpleondemand.c | 5 -----
> drivers/devfreq/governor_userspace.c | 16 ++++------------
> 2 files changed, 4 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
> index 278964783fa6..3da7554b4837 100644
> --- a/drivers/devfreq/governor_simpleondemand.c
> +++ b/drivers/devfreq/governor_simpleondemand.c
> @@ -84,11 +84,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
> b = div_u64(b, (dfso_upthreshold - dfso_downdifferential / 2));
> *freq = (unsigned long) b;
>
> - if (df->min_freq && *freq < df->min_freq)
> - *freq = df->min_freq;
> - if (df->max_freq && *freq > df->max_freq)
> - *freq = df->max_freq;
> -
> return 0;
> }
>
> diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c
> index 080607c3f34d..378d84c011df 100644
> --- a/drivers/devfreq/governor_userspace.c
> +++ b/drivers/devfreq/governor_userspace.c
> @@ -26,19 +26,11 @@ static int devfreq_userspace_func(struct devfreq *df, unsigned long *freq)
> {
> struct userspace_data *data = df->data;
>
> - if (data->valid) {
> - unsigned long adjusted_freq = data->user_frequency;
> -
> - if (df->max_freq && adjusted_freq > df->max_freq)
> - adjusted_freq = df->max_freq;
> -
> - if (df->min_freq && adjusted_freq < df->min_freq)
> - adjusted_freq = df->min_freq;
> -
> - *freq = adjusted_freq;
> - } else {
> + if (data->valid)
> + *freq = data->user_frequency;
> + else
> *freq = df->previous_freq; /* No user freq specified yet */
> - }
> +
> return 0;
> }
>
>

Reviewed-by: Chanwoo Choi <[email protected]>

--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-28 06:38:24

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

Hi,

On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
> devfreq device") initializes df->min/max_freq with the min/max OPP when
> the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
> available min/max frequency") adds df->scaling_min/max_freq and the
> following to the frequency adjustment code:
>
> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>
> With the current handling of min/max_freq this is incorrect:
>
> Even though df->max_freq is now initialized to a value != 0 user space
> can still set it to 0, in this case max_freq would be 0 instead of
> df->scaling_max_freq as intended. In consequence the frequency adjustment
> is not performed:
>
> if (max_freq && freq > max_freq) {
> freq = max_freq;
>
> To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
> when the user passes a value of 0. This also prevents df->max_freq from
> being set below the min OPP when df->min_freq is 0, and similar for
> min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
> checks for this case can be removed.
>
> Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
> 1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index 0057ef5b0a98..67da4e7b486b 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
> min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
>
> - if (min_freq && freq < min_freq) {
> + if (freq < min_freq) {
> freq = min_freq;
> flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
> }
> - if (max_freq && freq > max_freq) {
> + if (freq > max_freq) {
> freq = max_freq;
> flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
> }
> @@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
> struct devfreq *df = to_devfreq(dev);
> unsigned long value;
> int ret;
> - unsigned long max;
>
> ret = sscanf(buf, "%lu", &value);
> if (ret != 1)
> return -EINVAL;
>
> mutex_lock(&df->lock);
> - max = df->max_freq;
> - if (value && max && value > max) {
> - ret = -EINVAL;
> - goto unlock;
> +
> + if (value) {
> + if (value > df->max_freq) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> + } else {
> + value = df->profile->freq_table[df->profile->max_state - 1];
> }

If you want to prevent that df->min_freq is zero,
you should reinitialize 'value' as following.
Because freq_table must be in ascending order.
value = df->profile->freq_table[0];


>
> df->min_freq = value;
> @@ -1158,17 +1161,20 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
> struct devfreq *df = to_devfreq(dev);
> unsigned long value;
> int ret;
> - unsigned long min;
>
> ret = sscanf(buf, "%lu", &value);
> if (ret != 1)
> return -EINVAL;
>
> mutex_lock(&df->lock);
> - min = df->min_freq;
> - if (value && min && value < min) {
> - ret = -EINVAL;
> - goto unlock;
> +
> + if (!value) {
> + value = df->profile->freq_table[0];

ditto.
value = df->profile->freq_table[df->profile->max_state - 1];

> + } else {
> + if (value < df->min_freq) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> }
>
> df->max_freq = value;
>

Actually, min_freq_store() and max_freq_store() are very similar.
But, this patch changed the order of conditional statement as following:
If there is no special reason, you better to keep the same format
for the readability.


min_freq_store()
if (value) {
...
} else {
value = df->profile->freq_table[df->profile->max_state - 1];
}


max_freq_store()
if (!value) {
value = df->profile->freq_table[0];
} else {
...

--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-28 06:58:17

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 05/11] PM / devfreq: governors: Return device frequency limits instead of user limits

Hi,

On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> The performance, powersave and simpleondemand governors can return
> df->min/max_freq, which are the user defined frequency limits.
> update_devfreq() already takes care of adjusting the target frequency
> with the user limits if necessary, therefore we can return
> df->scaling_min/max_freq instead, which is the min/max frequency
> supported by the device at a given time (depending on the
> enabled/disabled OPPs)

As you mentioned on the description, update_devfreq() adjusts
the final target frequency. So, actually, there are no any
benefits when changing from max_freq/min_freq to scaling_max/min_freq.

I think that it is not necessary.

>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/devfreq/governor_performance.c | 2 +-
> drivers/devfreq/governor_powersave.c | 2 +-
> drivers/devfreq/governor_simpleondemand.c | 6 +++---
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c
> index 1c990cb45098..a8e3478b3c43 100644
> --- a/drivers/devfreq/governor_performance.c
> +++ b/drivers/devfreq/governor_performance.c
> @@ -20,7 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
> * target callback should be able to get floor value as
> * said in devfreq.h
> */
> - *freq = df->max_freq;
> + *freq = df->scaling_max_freq;
> return 0;
> }
>
> diff --git a/drivers/devfreq/governor_powersave.c b/drivers/devfreq/governor_powersave.c
> index 0c42f23249ef..8696efd32e5a 100644
> --- a/drivers/devfreq/governor_powersave.c
> +++ b/drivers/devfreq/governor_powersave.c
> @@ -20,7 +20,7 @@ static int devfreq_powersave_func(struct devfreq *df,
> * target callback should be able to get ceiling value as
> * said in devfreq.h
> */
> - *freq = df->min_freq;
> + *freq = df->scaling_min_freq;
> return 0;
> }
>
> diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c
> index 3da7554b4837..805fee09c754 100644
> --- a/drivers/devfreq/governor_simpleondemand.c
> +++ b/drivers/devfreq/governor_simpleondemand.c
> @@ -46,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
>
> /* Assume MAX if it is going to be divided by zero */
> if (stat->total_time == 0) {
> - *freq = df->max_freq;
> + *freq = df->scaling_max_freq;
> return 0;
> }
>
> @@ -59,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
> /* Set MAX if it's busy enough */
> if (stat->busy_time * 100 >
> stat->total_time * dfso_upthreshold) {
> - *freq = df->max_freq;
> + *freq = df->scaling_max_freq;
> return 0;
> }
>
> /* Set MAX if we do not know the initial frequency */
> if (stat->current_frequency == 0) {
> - *freq = df->max_freq;
> + *freq = df->scaling_max_freq;
> return 0;
> }
>
>


--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-28 07:33:18

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

Hi Matthias,

IMHO, you better to split out the devfreq patches from
'throttler' patch set. Because I'm not sure throttler is either
necessary or not.

After finishing the review of 'throttler' patches without devfreq handling,
it would be better for you to send devfreq patches separately.

Regards,
Chanwoo Choi


On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> The purpose of the throttler is to provide support for non-thermal
> throttling. Throttling is triggered by external event, e.g. the
> detection of a high battery discharge current, close to the OCP limit
> of the battery. The throttler is only in charge of the throttling, not
> the monitoring, which is done by another (possibly platform specific)
> driver.
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/misc/Kconfig | 1 +
> drivers/misc/Makefile | 1 +
> drivers/misc/throttler/Kconfig | 13 ++
> drivers/misc/throttler/Makefile | 1 +
> drivers/misc/throttler/core.c | 373 ++++++++++++++++++++++++++++++++
> include/linux/throttler.h | 10 +
> 6 files changed, 399 insertions(+)
> create mode 100644 drivers/misc/throttler/Kconfig
> create mode 100644 drivers/misc/throttler/Makefile
> create mode 100644 drivers/misc/throttler/core.c
> create mode 100644 include/linux/throttler.h
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 5d713008749b..691d9625d83c 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -513,4 +513,5 @@ source "drivers/misc/echo/Kconfig"
> source "drivers/misc/cxl/Kconfig"
> source "drivers/misc/ocxl/Kconfig"
> source "drivers/misc/cardreader/Kconfig"
> +source "drivers/misc/throttler/Kconfig"
> endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 20be70c3f118..01a1714dd2ad 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
> obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
> obj-$(CONFIG_OCXL) += ocxl/
> obj-$(CONFIG_MISC_RTSX) += cardreader/
> +obj-y += throttler/
> diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> new file mode 100644
> index 000000000000..ef8388f6bc0a
> --- /dev/null
> +++ b/drivers/misc/throttler/Kconfig
> @@ -0,0 +1,13 @@
> +menuconfig THROTTLER
> + bool "Throttler support"
> + default n
> + depends on OF
> + select CPU_FREQ
> + select PM_DEVFREQ
> + help
> + This option enables core support for non-thermal throttling of CPUs
> + and devfreq devices.
> +
> + Note that you also need a event monitor module usually called
> + *_throttler.
> +
> diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> new file mode 100644
> index 000000000000..c8d920cee315
> --- /dev/null
> +++ b/drivers/misc/throttler/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_THROTTLER) += core.o
> diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
> new file mode 100644
> index 000000000000..c058d03212b8
> --- /dev/null
> +++ b/drivers/misc/throttler/core.c
> @@ -0,0 +1,373 @@
> +/*
> + * Core code for non-thermal throttling
> + *
> + * Copyright (C) 2018 Google, Inc.
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <linux/cpufreq.h>
> +#include <linux/devfreq.h>
> +#include <linux/kernel.h>
> +#include <linux/notifier.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +
> +/*
> + * Non-thermal throttling: throttling of system components in response to
> + * external events (e.g. high battery discharge current).
> + *
> + * The throttler supports throttling through cpufreq and devfreq. Multiple
> + * levels of throttling can be configured. At level 0 no throttling is
> + * active on behalf of the throttler, for values > 0 throttling is typically
> + * configured to be increasingly aggressive with each level.
> + * The number of throttling levels is not limited by the throttler (though
> + * it is likely limited by the throttling devices). It is not necessary to
> + * configure the same number of levels for all throttling devices. If the
> + * requested throttling level for a device is higher than the maximum level
> + * of the device the throttler will sleect the maximum throttling level of
> + * the device.
> + *
> + * Non-thermal throttling is split in two parts:
> + *
> + * - throttler core
> + * - parses the thermal policy
> + * - applies throttling settings for a requested level of throttling
> + *
> + * - event monitor driver
> + * - monitors the events that trigger throttling
> + * - determines the throttling level (often limited to on/off)
> + * - requests throttler core to apply throttling settings
> + *
> + * It is possible for a system to have more than one throttler and the
> + * throttlers may make use of the same throttling devices, in case of
> + * conflicting settings for a device the more aggressive values will be
> + * applied.
> + *
> + */
> +
> +struct thrcfg {
> + uint32_t *freqs;
> + int num_levels;
> +};
> +
> +struct cpufreq_thrdev {
> + uint32_t cpu;
> + struct thrcfg cfg;
> +};
> +
> +struct devfreq_thrdev {
> + struct devfreq *devfreq;
> + struct thrcfg cfg;
> + struct throttler *thr;
> + struct notifier_block nb;
> +};
> +
> +struct __thr_cpufreq {
> + struct cpufreq_thrdev *devs;
> + int ndevs;
> + struct notifier_block nb;
> +};
> +
> +struct __thr_devfreq {
> + struct devfreq_thrdev *devs;
> + int ndevs;
> +};
> +
> +struct throttler {
> + struct device *dev;
> + int level;
> + struct __thr_cpufreq cpufreq;
> + struct __thr_devfreq devfreq;
> +};
> +
> +static unsigned long thr_get_throttling_freq(struct thrcfg *cfg, int level)
> +{
> + if (level == 0 ) {
> + WARN(true, "level == 0");
> + return 0;
> + }
> +
> + if (level <= cfg->num_levels)
> + return cfg->freqs[level - 1];
> + else
> + return cfg->freqs[cfg->num_levels - 1];
> +}
> +
> +static int thr_cpufreq_event(struct notifier_block *nb,
> + unsigned long event, void *data)
> +{
> + struct throttler *thr =
> + container_of(nb, struct throttler, cpufreq.nb);
> + struct cpufreq_policy *policy = data;
> + struct cpufreq_thrdev *ctd;
> + int i;
> +
> + if ((event != CPUFREQ_ADJUST) || (thr->level == 0))
> + return NOTIFY_DONE;
> +
> + for (i = 0; i < thr->cpufreq.ndevs; i++) {
> + ctd = &thr->cpufreq.devs[i];
> +
> + if (ctd->cpu == policy->cpu) {
> + unsigned long clamp_freq =
> + thr_get_throttling_freq(&ctd->cfg, thr->level);
> + if (clamp_freq < policy->max) {
> + cpufreq_verify_within_limits(policy, 0, clamp_freq);
> + }
> + }
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int thr_devfreq_event(struct notifier_block *nb,
> + unsigned long event, void *data)
> +{
> + struct devfreq_thrdev *dtd =
> + container_of(nb, struct devfreq_thrdev, nb);
> + struct throttler *thr = dtd->thr;
> + struct devfreq_policy *policy = data;
> + unsigned long clamp_freq;
> +
> + if ((event != DEVFREQ_ADJUST) || (thr->level == 0))
> + return NOTIFY_DONE;
> +
> + clamp_freq = thr_get_throttling_freq(&dtd->cfg, thr->level);
> + if (clamp_freq < policy->max)
> + devfreq_verify_within_limits(policy, 0, clamp_freq);
> +
> + return NOTIFY_DONE;
> +}
> +
> +static void thr_cpufreq_update_policy(struct throttler *thr)
> +{
> + int i;
> +
> + for (i = 0; i < thr->cpufreq.ndevs; i++) {
> + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
> + struct cpufreq_policy *policy = cpufreq_cpu_get(ctd->cpu);
> +
> + if (!policy) {
> + dev_warn(thr->dev, "CPU%d does have no cpufreq policy!\n", ctd->cpu);
> + continue;
> + }
> +
> + cpufreq_update_policy(ctd->cpu);
> + cpufreq_cpu_put(policy);
> + }
> +}
> +
> +static int thr_parse_thrcfg(struct throttler *thr,
> + struct device_node *np, struct thrcfg *cfg) {
> + int err;
> +
> + cfg->num_levels =
> + of_property_count_u32_elems(np, "throttling-frequencies");
> + if (cfg->num_levels < 0) {
> + pr_err("%s: failed to determine number of throttling frequencies\n",
> + np->full_name);
> + return cfg->num_levels;
> + }
> +
> + cfg->freqs = devm_kzalloc(thr->dev,
> + cfg->num_levels * sizeof(u32), GFP_KERNEL);
> + if (!cfg->freqs)
> + return -ENOMEM;
> +
> + err = of_property_read_u32_array(np, "throttling-frequencies",
> + cfg->freqs, cfg->num_levels);
> + if (err) {
> + pr_err("%s: failed to read throttling frequencies\n", np->full_name);
> + return err;
> + }
> +
> + return 0;
> +}
> +
> +static struct devfreq *thr_find_devfreq_dev(struct throttler *thr,
> + struct device_node *np_df) {
> + struct device_node *node;
> + struct platform_device *pdev;
> +
> + node = of_parse_phandle(np_df, "device", 0);
> + if (!node) {
> + pr_err("%s: failed to get devfreq parent device\n",
> + np_df->full_name);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + pdev = of_find_device_by_node(node);
> + if (!pdev) {
> + pr_err("%s: could not find devfreq parent device\n",
> + node->full_name);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + return dev_to_devfreq(&pdev->dev);
> +}
> +
> +static int thr_parse_dt(struct throttler *thr, struct device_node *np)
> +{
> + struct device_node *node, *child;
> + int err, i;
> +
> + node = of_get_child_by_name(np, "cpufreq");
> + if (node) {
> + thr->cpufreq.ndevs = of_get_child_count(node);
> + thr->cpufreq.devs = devm_kzalloc(thr->dev,
> + sizeof(*thr->cpufreq.devs) * thr->cpufreq.ndevs,
> + GFP_KERNEL);
> +
> + i = 0;
> + for_each_child_of_node(node, child) {
> + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
> +
> + err = of_property_read_u32(child, "cpu", &ctd->cpu);
> + if (err) {
> + pr_err("%s: failed to read CPU id\n", child->full_name);
> + return err;
> + }
> +
> + err = thr_parse_thrcfg(thr, child, &ctd->cfg);
> + if (err)
> + return err;
> +
> + i++;
> + }
> + }
> +
> + node = of_get_child_by_name(np, "devfreq");
> + if (node) {
> + thr->devfreq.ndevs = of_get_child_count(node);
> + thr->devfreq.devs = devm_kzalloc(thr->dev,
> + sizeof(*thr->devfreq.devs) * thr->devfreq.ndevs,
> + GFP_KERNEL);
> +
> + i = 0;
> + for_each_child_of_node(node, child) {
> + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
> +
> + dtd->thr = thr;
> +
> + dtd->devfreq = thr_find_devfreq_dev(thr, child);
> + if (IS_ERR(dtd->devfreq))
> + return PTR_ERR(dtd->devfreq);
> +
> + err = thr_parse_thrcfg(thr, child, &dtd->cfg);
> + if (err)
> + return err;
> +
> + i++;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static void thr_update_devfreq(struct devfreq *devfreq)
> +{
> + mutex_lock(&devfreq->lock);
> + update_devfreq(devfreq);
> + mutex_unlock(&devfreq->lock);
> +}
> +
> +void throttler_set_level(struct throttler *thr, int level)
> +{
> + int i;
> +
> + if (level == thr->level)
> + return;
> +
> + dev_dbg(thr->dev, "throttling level: %d\n", level);
> + thr->level = level;
> +
> + if (thr->cpufreq.ndevs > 0)
> + thr_cpufreq_update_policy(thr);
> +
> + if (thr->devfreq.ndevs > 0)
> + for (i = 0; i < thr->devfreq.ndevs; i++)
> + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
> +}
> +EXPORT_SYMBOL_GPL(throttler_set_level);
> +
> +struct throttler *throttler_setup(struct device *dev)
> +{
> + struct throttler *thr;
> + struct device_node *np = dev->of_node;
> + int err, i;
> +
> + if (!np)
> + /* should never happen */
> + return ERR_PTR(-EINVAL);
> +
> + thr = devm_kzalloc(dev, sizeof(*thr), GFP_KERNEL);
> + if (!thr)
> + return ERR_PTR(-ENOMEM);
> +
> + thr->dev = dev;
> +
> + err = thr_parse_dt(thr, np);
> + if (err)
> + return ERR_PTR(err);
> +
> + if (thr->cpufreq.ndevs > 0) {
> + thr->cpufreq.nb.notifier_call = thr_cpufreq_event;
> + err = cpufreq_register_notifier(&thr->cpufreq.nb,
> + CPUFREQ_POLICY_NOTIFIER);
> + if (err < 0) {
> + dev_err(dev, "failed to register cpufreq notifier\n");
> + return ERR_PTR(err);
> + }
> + }
> +
> + for (i = 0; i < thr->devfreq.ndevs; i++) {
> + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
> +
> + dtd->nb.notifier_call = thr_devfreq_event;
> + err = devm_devfreq_register_notifier(dev, dtd->devfreq,
> + &dtd->nb, DEVFREQ_POLICY_NOTIFIER);
> + if (err < 0) {
> + dev_err(dev, "failed to register devfreq notifier\n");
> + goto err_cpufreq_unregister;
> + }
> + }
> +
> + return thr;
> +
> +err_cpufreq_unregister:
> + if (thr->cpufreq.ndevs > 0)
> + cpufreq_unregister_notifier(&thr->cpufreq.nb,
> + CPUFREQ_POLICY_NOTIFIER);
> +
> + return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(throttler_setup);
> +
> +void throttler_teardown(struct throttler *thr)
> +{
> + int i;
> +
> + thr->level = 0;
> +
> + if (thr->cpufreq.ndevs > 0) {
> + thr_cpufreq_update_policy(thr);
> +
> + cpufreq_unregister_notifier(&thr->cpufreq.nb,
> + CPUFREQ_POLICY_NOTIFIER);
> + }
> +
> + if (thr->devfreq.ndevs > 0)
> + for (i = 0; i < thr->devfreq.ndevs; i++)
> + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
> +}
> +EXPORT_SYMBOL_GPL(throttler_teardown);
> diff --git a/include/linux/throttler.h b/include/linux/throttler.h
> new file mode 100644
> index 000000000000..cab8c466da4b
> --- /dev/null
> +++ b/include/linux/throttler.h
> @@ -0,0 +1,10 @@
> +#ifndef __LINUX_THROTTLER_H__
> +#define __LINUX_THROTTLER_H__
> +
> +struct throttler;
> +
> +extern struct throttler *throttler_setup(struct device *dev);
> +extern void throttler_teardown(struct throttler *thr);
> +extern void throttler_set_level(struct throttler *thr, int level);
> +
> +#endif /* __LINUX_THROTTLER_H__ */
>

2018-05-28 08:09:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

On Fri, May 25, 2018 at 01:30:41PM -0700, Matthias Kaehlcke wrote:
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 20be70c3f118..01a1714dd2ad 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
> obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
> obj-$(CONFIG_OCXL) += ocxl/
> obj-$(CONFIG_MISC_RTSX) += cardreader/
> +obj-y += throttler/

Shouldn't you depend on a Kconfig option to traverse into this
directory?

> diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> new file mode 100644
> index 000000000000..ef8388f6bc0a
> --- /dev/null
> +++ b/drivers/misc/throttler/Kconfig
> @@ -0,0 +1,13 @@
> +menuconfig THROTTLER
> + bool "Throttler support"
> + default n

the default is always 'n' no need to say it again here :)

> + depends on OF
> + select CPU_FREQ
> + select PM_DEVFREQ
> + help
> + This option enables core support for non-thermal throttling of CPUs
> + and devfreq devices.
> +
> + Note that you also need a event monitor module usually called
> + *_throttler.
> +
> diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> new file mode 100644
> index 000000000000..c8d920cee315
> --- /dev/null
> +++ b/drivers/misc/throttler/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_THROTTLER) += core.o
> diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
> new file mode 100644
> index 000000000000..c058d03212b8
> --- /dev/null
> +++ b/drivers/misc/throttler/core.c
> @@ -0,0 +1,373 @@
> +/*
> + * Core code for non-thermal throttling
> + *
> + * Copyright (C) 2018 Google, Inc.
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */

Please just use a single SPDX line, like checkpatch.pl would have warned
you about. No need for the full verbous license boiler-plate text here
at all.

thanks,

greg k-h

2018-05-29 18:07:19

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 01/11] PM / devfreq: Init user limits from OPP limits, not viceversa

On Mon, May 28, 2018 at 02:26:49PM +0900, Chanwoo Choi wrote:
> Hi,
>
> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> > Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding
> > the devfreq device") introduced the initialization of the user
> > limits min/max_freq from the lowest/highest available OPPs. Later
> > commit f1d981eaecf8 ("PM / devfreq: Use the available min/max
> > frequency") added scaling_min/max_freq, which actually represent
> > the frequencies of the lowest/highest available OPP. scaling_min/
> > max_freq are initialized with the values from min/max_freq, which
> > is totally correct in the context, but a bit awkward to read.
> >
> > Swap the initialization and assign scaling_min/max_freq with the
> > OPP freqs and then the user limts min/max_freq with scaling_min/
> > max_freq.
> >
> > Needless to say that this change is a NOP, intended to improve
> > readability.
> >
> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> > ---
> > drivers/devfreq/devfreq.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index fe2af6aa88fc..0057ef5b0a98 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -604,21 +604,21 @@ struct devfreq *devfreq_add_device(struct device *dev,
> > mutex_lock(&devfreq->lock);
> > }
> >
> > - devfreq->min_freq = find_available_min_freq(devfreq);
> > - if (!devfreq->min_freq) {
> > + devfreq->scaling_min_freq = find_available_min_freq(devfreq);
> > + if (!devfreq->scaling_min_freq) {
> > mutex_unlock(&devfreq->lock);
> > err = -EINVAL;
> > goto err_dev;
> > }
> > - devfreq->scaling_min_freq = devfreq->min_freq;
> > + devfreq->min_freq = devfreq->scaling_min_freq;
> >
> > - devfreq->max_freq = find_available_max_freq(devfreq);
> > - if (!devfreq->max_freq) {
> > + devfreq->scaling_max_freq = find_available_max_freq(devfreq);
> > + if (!devfreq->scaling_max_freq) {
> > mutex_unlock(&devfreq->lock);
> > err = -EINVAL;
> > goto err_dev;
> > }
> > - devfreq->scaling_max_freq = devfreq->max_freq;
> > + devfreq->max_freq = devfreq->scaling_max_freq;
> >
> > dev_set_name(&devfreq->dev, "devfreq%d",
> > atomic_inc_return(&devfreq_no));
> >
>
> I already replied with my Reviewed-by tag. You are missing my tag.
>
> Again,
> Reviewed-by: Chanwoo Choi <[email protected]>

Sorry, forgot to add the tag after creating the series. I now added it
to my local tree.

2018-05-29 18:59:00

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

On Mon, May 28, 2018 at 03:37:47PM +0900, Chanwoo Choi wrote:
> Hi,
>
> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> > Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
> > devfreq device") initializes df->min/max_freq with the min/max OPP when
> > the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
> > available min/max frequency") adds df->scaling_min/max_freq and the
> > following to the frequency adjustment code:
> >
> > max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
> >
> > With the current handling of min/max_freq this is incorrect:
> >
> > Even though df->max_freq is now initialized to a value != 0 user space
> > can still set it to 0, in this case max_freq would be 0 instead of
> > df->scaling_max_freq as intended. In consequence the frequency adjustment
> > is not performed:
> >
> > if (max_freq && freq > max_freq) {
> > freq = max_freq;
> >
> > To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
> > when the user passes a value of 0. This also prevents df->max_freq from
> > being set below the min OPP when df->min_freq is 0, and similar for
> > min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
> > checks for this case can be removed.
> >
> > Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> > ---
> > drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
> > 1 file changed, 18 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index 0057ef5b0a98..67da4e7b486b 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
> > max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
> > min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
> >
> > - if (min_freq && freq < min_freq) {
> > + if (freq < min_freq) {
> > freq = min_freq;
> > flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
> > }
> > - if (max_freq && freq > max_freq) {
> > + if (freq > max_freq) {
> > freq = max_freq;
> > flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
> > }
> > @@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
> > struct devfreq *df = to_devfreq(dev);
> > unsigned long value;
> > int ret;
> > - unsigned long max;
> >
> > ret = sscanf(buf, "%lu", &value);
> > if (ret != 1)
> > return -EINVAL;
> >
> > mutex_lock(&df->lock);
> > - max = df->max_freq;
> > - if (value && max && value > max) {
> > - ret = -EINVAL;
> > - goto unlock;
> > +
> > + if (value) {
> > + if (value > df->max_freq) {
> > + ret = -EINVAL;
> > + goto unlock;
> > + }
> > + } else {
> > + value = df->profile->freq_table[df->profile->max_state - 1];
> > }
>
> If you want to prevent that df->min_freq is zero,
> you should reinitialize 'value' as following.
> Because freq_table must be in ascending order.
> value = df->profile->freq_table[0];

Thanks for pointing this out!

The devfreq device I tested with (a Mali GPU) uses descending order
for some reason, and I assumed that's the usual order.

https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c#208

It seems the ordering doesn't have any impact beyond this patch. If
the order isn't mandatory for drivers that set up their own freq_table
we should probably support both cases to be safe.

> > @@ -1158,17 +1161,20 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
> > struct devfreq *df = to_devfreq(dev);
> > unsigned long value;
> > int ret;
> > - unsigned long min;
> >
> > ret = sscanf(buf, "%lu", &value);
> > if (ret != 1)
> > return -EINVAL;
> >
> > mutex_lock(&df->lock);
> > - min = df->min_freq;
> > - if (value && min && value < min) {
> > - ret = -EINVAL;
> > - goto unlock;
> > +
> > + if (!value) {
> > + value = df->profile->freq_table[0];
>
> ditto.
> value = df->profile->freq_table[df->profile->max_state - 1];
>
> > + } else {
> > + if (value < df->min_freq) {
> > + ret = -EINVAL;
> > + goto unlock;
> > + }
> > }
> >
> > df->max_freq = value;
> >
>
> Actually, min_freq_store() and max_freq_store() are very similar.
> But, this patch changed the order of conditional statement as following:
> If there is no special reason, you better to keep the same format
> for the readability.
>
>
> min_freq_store()
> if (value) {
> ...
> } else {
> value = df->profile->freq_table[df->profile->max_state - 1];
> }
>
>
> max_freq_store()
> if (!value) {
> value = df->profile->freq_table[0];
> } else {
> ...
>

Agreed, better use the same format, I'll update it in the next revision.

2018-05-29 19:33:43

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 05/11] PM / devfreq: governors: Return device frequency limits instead of user limits

On Mon, May 28, 2018 at 02:04:44PM +0900, MyungJoo Ham wrote:
> >The performance, powersave and simpleondemand governors can return
> >df->min/max_freq, which are the user defined frequency limits.
> >update_devfreq() already takes care of adjusting the target frequency
> >with the user limits if necessary, therefore we can return
> >df->scaling_min/max_freq instead, which is the min/max frequency
> >supported by the device at a given time (depending on the
> >enabled/disabled OPPs)
> >
> >Signed-off-by: Matthias Kaehlcke <[email protected]>
> >---
> > drivers/devfreq/governor_performance.c | 2 +-
> > drivers/devfreq/governor_powersave.c | 2 +-
> > drivers/devfreq/governor_simpleondemand.c | 6 +++---
> > 3 files changed, 5 insertions(+), 5 deletions(-)
> >
>
> Actually, even scaling_max_freq and scaling_min_freq are
> covered centerally at devfreq.c:update_devfreq();
>
> Wouldn't it be sufficient to return UINT_MAX for performance
> and return UINT_MIN (0) for powersave, if the purpose is to
> remove redundancy?
>
> In the same sense, we may return UINT_MAX for freq-increasing
> case for simpleondemand as well, because they are filtered
> centrally anyway.
>
> (This commit might be better merged to 4/11 in that case as well.)

I did this in the first variant of the patch (before sending it in a
series), but Chanwoo Choi objected:

https://patchwork.kernel.org/patch/10404893/

I also still think that returning a constant would be the cleanest
solution if we can agree on this. What do you think about
DEVFREQ_MIN/MAX_FREQ (0/UINT_MAX) to make things slightly clearer?

2018-05-29 20:03:29

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 07/11] PM / devfreg: Add support policy notifiers

On Mon, May 28, 2018 at 02:19:49PM +0900, MyungJoo Ham wrote:
> >Policy notifiers are called before a frequency change and may narrow
> >the min/max frequency range in devfreq_policy, which is used to adjust
> >the target frequency if it is beyond this range.
> >
> >Also add a few helpers:
> > - devfreq_verify_within_[dev_]limits()
> > - should be used by the notifiers for policy adjustments.
> > - dev_to_devfreq()
> > - lookup a devfreq strict from a device pointer
> >
> >Signed-off-by: Matthias Kaehlcke <[email protected]>
> >---
> > drivers/devfreq/devfreq.c | 47 +++++++++++++++++++++-------
> > include/linux/devfreq.h | 66 +++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 102 insertions(+), 11 deletions(-)
>
> Hello Matthias,
>
>
> Why should we have yet another notifier from an instance of devfreq?
> Wouldn't it better to let the current notifier (transition notifier)
> handle new events as well by adding possible event states to it?

Honestly the main reason is that I sought inspiration from cpufreq,
which uses a dedicated policy notifier. Unfortunately this change
predates the git history so I don't know what was the exact rationale
to do it this way.

Some minor advantages that I see are:

- transition notifiers aren't bothered about adjustments and viceversa
- different data types are passed for transitions and adjustments,
which makes code of notifiers that handle both a bit more messy.

> Anyway, is this the reason why you've separated some data of devfreq
> into "policy" struct? (I was wondering why while reading commit 6/11).

The DEVFREQ_ADJUST is the reason for the "policy struct". With this
change we are dealing with 3 types of frequency pairs: user
(df->min/max_freq), devinfo (df->scaling_min/max_freq) and the
policy/adjustable ones. I think it is cleaner to group them in a
struct (and sub-structs), rather than having 6 individual
<type>_min/max_freq variables. Also it allows to only pass the policy
object to the notifiers, instead of the entire devfreq device.

I opted to do the introduction of the struct policy in a separate NOP
patch, because I think it is easier to review the 'reorg' churn
and the functional change separately.

Please let me know if you'd prefer to have certain things done
differently.

Thanks

Matthias

2018-05-29 20:58:34

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

On Mon, May 28, 2018 at 04:32:37PM +0900, Chanwoo Choi wrote:

> IMHO, you better to split out the devfreq patches from
> 'throttler' patch set. Because I'm not sure throttler is either
> necessary or not.
>
> After finishing the review of 'throttler' patches without devfreq handling,
> it would be better for you to send devfreq patches separately.

I could certainly try to get 'throttler' with only cpufreq support
merged, but that would kind of defeat the purpose.

I first sent a RFC patch for the devfreq policy notifiers
(https://patchwork.kernel.org/patch/10401999/) to get an idea if this
is a reasonable path to pursue. In response you asked about "real code
and patches" and here it is :)

For my use case throttler is not really useful without devfreq
support. In this sense I prefer to know 'early' if there are any
blocking issues, rather then making the effort to get a limited
version of the driver merged, and then learn that I wasted my own and
the reviewers time because it is a dead end.

> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> > The purpose of the throttler is to provide support for non-thermal
> > throttling. Throttling is triggered by external event, e.g. the
> > detection of a high battery discharge current, close to the OCP limit
> > of the battery. The throttler is only in charge of the throttling, not
> > the monitoring, which is done by another (possibly platform specific)
> > driver.
> >
> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> > ---
> > drivers/misc/Kconfig | 1 +
> > drivers/misc/Makefile | 1 +
> > drivers/misc/throttler/Kconfig | 13 ++
> > drivers/misc/throttler/Makefile | 1 +
> > drivers/misc/throttler/core.c | 373 ++++++++++++++++++++++++++++++++
> > include/linux/throttler.h | 10 +
> > 6 files changed, 399 insertions(+)
> > create mode 100644 drivers/misc/throttler/Kconfig
> > create mode 100644 drivers/misc/throttler/Makefile
> > create mode 100644 drivers/misc/throttler/core.c
> > create mode 100644 include/linux/throttler.h
> >
> > diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> > index 5d713008749b..691d9625d83c 100644
> > --- a/drivers/misc/Kconfig
> > +++ b/drivers/misc/Kconfig
> > @@ -513,4 +513,5 @@ source "drivers/misc/echo/Kconfig"
> > source "drivers/misc/cxl/Kconfig"
> > source "drivers/misc/ocxl/Kconfig"
> > source "drivers/misc/cardreader/Kconfig"
> > +source "drivers/misc/throttler/Kconfig"
> > endmenu
> > diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> > index 20be70c3f118..01a1714dd2ad 100644
> > --- a/drivers/misc/Makefile
> > +++ b/drivers/misc/Makefile
> > @@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
> > obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
> > obj-$(CONFIG_OCXL) += ocxl/
> > obj-$(CONFIG_MISC_RTSX) += cardreader/
> > +obj-y += throttler/
> > diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> > new file mode 100644
> > index 000000000000..ef8388f6bc0a
> > --- /dev/null
> > +++ b/drivers/misc/throttler/Kconfig
> > @@ -0,0 +1,13 @@
> > +menuconfig THROTTLER
> > + bool "Throttler support"
> > + default n
> > + depends on OF
> > + select CPU_FREQ
> > + select PM_DEVFREQ
> > + help
> > + This option enables core support for non-thermal throttling of CPUs
> > + and devfreq devices.
> > +
> > + Note that you also need a event monitor module usually called
> > + *_throttler.
> > +
> > diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> > new file mode 100644
> > index 000000000000..c8d920cee315
> > --- /dev/null
> > +++ b/drivers/misc/throttler/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_THROTTLER) += core.o
> > diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
> > new file mode 100644
> > index 000000000000..c058d03212b8
> > --- /dev/null
> > +++ b/drivers/misc/throttler/core.c
> > @@ -0,0 +1,373 @@
> > +/*
> > + * Core code for non-thermal throttling
> > + *
> > + * Copyright (C) 2018 Google, Inc.
> > + *
> > + * This software is licensed under the terms of the GNU General Public
> > + * License version 2, as published by the Free Software Foundation, and
> > + * may be copied, distributed, and modified under those terms.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + */
> > +
> > +#include <linux/cpufreq.h>
> > +#include <linux/devfreq.h>
> > +#include <linux/kernel.h>
> > +#include <linux/notifier.h>
> > +#include <linux/of.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/platform_device.h>
> > +
> > +/*
> > + * Non-thermal throttling: throttling of system components in response to
> > + * external events (e.g. high battery discharge current).
> > + *
> > + * The throttler supports throttling through cpufreq and devfreq. Multiple
> > + * levels of throttling can be configured. At level 0 no throttling is
> > + * active on behalf of the throttler, for values > 0 throttling is typically
> > + * configured to be increasingly aggressive with each level.
> > + * The number of throttling levels is not limited by the throttler (though
> > + * it is likely limited by the throttling devices). It is not necessary to
> > + * configure the same number of levels for all throttling devices. If the
> > + * requested throttling level for a device is higher than the maximum level
> > + * of the device the throttler will sleect the maximum throttling level of
> > + * the device.
> > + *
> > + * Non-thermal throttling is split in two parts:
> > + *
> > + * - throttler core
> > + * - parses the thermal policy
> > + * - applies throttling settings for a requested level of throttling
> > + *
> > + * - event monitor driver
> > + * - monitors the events that trigger throttling
> > + * - determines the throttling level (often limited to on/off)
> > + * - requests throttler core to apply throttling settings
> > + *
> > + * It is possible for a system to have more than one throttler and the
> > + * throttlers may make use of the same throttling devices, in case of
> > + * conflicting settings for a device the more aggressive values will be
> > + * applied.
> > + *
> > + */
> > +
> > +struct thrcfg {
> > + uint32_t *freqs;
> > + int num_levels;
> > +};
> > +
> > +struct cpufreq_thrdev {
> > + uint32_t cpu;
> > + struct thrcfg cfg;
> > +};
> > +
> > +struct devfreq_thrdev {
> > + struct devfreq *devfreq;
> > + struct thrcfg cfg;
> > + struct throttler *thr;
> > + struct notifier_block nb;
> > +};
> > +
> > +struct __thr_cpufreq {
> > + struct cpufreq_thrdev *devs;
> > + int ndevs;
> > + struct notifier_block nb;
> > +};
> > +
> > +struct __thr_devfreq {
> > + struct devfreq_thrdev *devs;
> > + int ndevs;
> > +};
> > +
> > +struct throttler {
> > + struct device *dev;
> > + int level;
> > + struct __thr_cpufreq cpufreq;
> > + struct __thr_devfreq devfreq;
> > +};
> > +
> > +static unsigned long thr_get_throttling_freq(struct thrcfg *cfg, int level)
> > +{
> > + if (level == 0 ) {
> > + WARN(true, "level == 0");
> > + return 0;
> > + }
> > +
> > + if (level <= cfg->num_levels)
> > + return cfg->freqs[level - 1];
> > + else
> > + return cfg->freqs[cfg->num_levels - 1];
> > +}
> > +
> > +static int thr_cpufreq_event(struct notifier_block *nb,
> > + unsigned long event, void *data)
> > +{
> > + struct throttler *thr =
> > + container_of(nb, struct throttler, cpufreq.nb);
> > + struct cpufreq_policy *policy = data;
> > + struct cpufreq_thrdev *ctd;
> > + int i;
> > +
> > + if ((event != CPUFREQ_ADJUST) || (thr->level == 0))
> > + return NOTIFY_DONE;
> > +
> > + for (i = 0; i < thr->cpufreq.ndevs; i++) {
> > + ctd = &thr->cpufreq.devs[i];
> > +
> > + if (ctd->cpu == policy->cpu) {
> > + unsigned long clamp_freq =
> > + thr_get_throttling_freq(&ctd->cfg, thr->level);
> > + if (clamp_freq < policy->max) {
> > + cpufreq_verify_within_limits(policy, 0, clamp_freq);
> > + }
> > + }
> > + }
> > +
> > + return NOTIFY_DONE;
> > +}
> > +
> > +static int thr_devfreq_event(struct notifier_block *nb,
> > + unsigned long event, void *data)
> > +{
> > + struct devfreq_thrdev *dtd =
> > + container_of(nb, struct devfreq_thrdev, nb);
> > + struct throttler *thr = dtd->thr;
> > + struct devfreq_policy *policy = data;
> > + unsigned long clamp_freq;
> > +
> > + if ((event != DEVFREQ_ADJUST) || (thr->level == 0))
> > + return NOTIFY_DONE;
> > +
> > + clamp_freq = thr_get_throttling_freq(&dtd->cfg, thr->level);
> > + if (clamp_freq < policy->max)
> > + devfreq_verify_within_limits(policy, 0, clamp_freq);
> > +
> > + return NOTIFY_DONE;
> > +}
> > +
> > +static void thr_cpufreq_update_policy(struct throttler *thr)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < thr->cpufreq.ndevs; i++) {
> > + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
> > + struct cpufreq_policy *policy = cpufreq_cpu_get(ctd->cpu);
> > +
> > + if (!policy) {
> > + dev_warn(thr->dev, "CPU%d does have no cpufreq policy!\n", ctd->cpu);
> > + continue;
> > + }
> > +
> > + cpufreq_update_policy(ctd->cpu);
> > + cpufreq_cpu_put(policy);
> > + }
> > +}
> > +
> > +static int thr_parse_thrcfg(struct throttler *thr,
> > + struct device_node *np, struct thrcfg *cfg) {
> > + int err;
> > +
> > + cfg->num_levels =
> > + of_property_count_u32_elems(np, "throttling-frequencies");
> > + if (cfg->num_levels < 0) {
> > + pr_err("%s: failed to determine number of throttling frequencies\n",
> > + np->full_name);
> > + return cfg->num_levels;
> > + }
> > +
> > + cfg->freqs = devm_kzalloc(thr->dev,
> > + cfg->num_levels * sizeof(u32), GFP_KERNEL);
> > + if (!cfg->freqs)
> > + return -ENOMEM;
> > +
> > + err = of_property_read_u32_array(np, "throttling-frequencies",
> > + cfg->freqs, cfg->num_levels);
> > + if (err) {
> > + pr_err("%s: failed to read throttling frequencies\n", np->full_name);
> > + return err;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static struct devfreq *thr_find_devfreq_dev(struct throttler *thr,
> > + struct device_node *np_df) {
> > + struct device_node *node;
> > + struct platform_device *pdev;
> > +
> > + node = of_parse_phandle(np_df, "device", 0);
> > + if (!node) {
> > + pr_err("%s: failed to get devfreq parent device\n",
> > + np_df->full_name);
> > + return ERR_PTR(-EINVAL);
> > + }
> > +
> > + pdev = of_find_device_by_node(node);
> > + if (!pdev) {
> > + pr_err("%s: could not find devfreq parent device\n",
> > + node->full_name);
> > + return ERR_PTR(-EINVAL);
> > + }
> > +
> > + return dev_to_devfreq(&pdev->dev);
> > +}
> > +
> > +static int thr_parse_dt(struct throttler *thr, struct device_node *np)
> > +{
> > + struct device_node *node, *child;
> > + int err, i;
> > +
> > + node = of_get_child_by_name(np, "cpufreq");
> > + if (node) {
> > + thr->cpufreq.ndevs = of_get_child_count(node);
> > + thr->cpufreq.devs = devm_kzalloc(thr->dev,
> > + sizeof(*thr->cpufreq.devs) * thr->cpufreq.ndevs,
> > + GFP_KERNEL);
> > +
> > + i = 0;
> > + for_each_child_of_node(node, child) {
> > + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
> > +
> > + err = of_property_read_u32(child, "cpu", &ctd->cpu);
> > + if (err) {
> > + pr_err("%s: failed to read CPU id\n", child->full_name);
> > + return err;
> > + }
> > +
> > + err = thr_parse_thrcfg(thr, child, &ctd->cfg);
> > + if (err)
> > + return err;
> > +
> > + i++;
> > + }
> > + }
> > +
> > + node = of_get_child_by_name(np, "devfreq");
> > + if (node) {
> > + thr->devfreq.ndevs = of_get_child_count(node);
> > + thr->devfreq.devs = devm_kzalloc(thr->dev,
> > + sizeof(*thr->devfreq.devs) * thr->devfreq.ndevs,
> > + GFP_KERNEL);
> > +
> > + i = 0;
> > + for_each_child_of_node(node, child) {
> > + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
> > +
> > + dtd->thr = thr;
> > +
> > + dtd->devfreq = thr_find_devfreq_dev(thr, child);
> > + if (IS_ERR(dtd->devfreq))
> > + return PTR_ERR(dtd->devfreq);
> > +
> > + err = thr_parse_thrcfg(thr, child, &dtd->cfg);
> > + if (err)
> > + return err;
> > +
> > + i++;
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void thr_update_devfreq(struct devfreq *devfreq)
> > +{
> > + mutex_lock(&devfreq->lock);
> > + update_devfreq(devfreq);
> > + mutex_unlock(&devfreq->lock);
> > +}
> > +
> > +void throttler_set_level(struct throttler *thr, int level)
> > +{
> > + int i;
> > +
> > + if (level == thr->level)
> > + return;
> > +
> > + dev_dbg(thr->dev, "throttling level: %d\n", level);
> > + thr->level = level;
> > +
> > + if (thr->cpufreq.ndevs > 0)
> > + thr_cpufreq_update_policy(thr);
> > +
> > + if (thr->devfreq.ndevs > 0)
> > + for (i = 0; i < thr->devfreq.ndevs; i++)
> > + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
> > +}
> > +EXPORT_SYMBOL_GPL(throttler_set_level);
> > +
> > +struct throttler *throttler_setup(struct device *dev)
> > +{
> > + struct throttler *thr;
> > + struct device_node *np = dev->of_node;
> > + int err, i;
> > +
> > + if (!np)
> > + /* should never happen */
> > + return ERR_PTR(-EINVAL);
> > +
> > + thr = devm_kzalloc(dev, sizeof(*thr), GFP_KERNEL);
> > + if (!thr)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + thr->dev = dev;
> > +
> > + err = thr_parse_dt(thr, np);
> > + if (err)
> > + return ERR_PTR(err);
> > +
> > + if (thr->cpufreq.ndevs > 0) {
> > + thr->cpufreq.nb.notifier_call = thr_cpufreq_event;
> > + err = cpufreq_register_notifier(&thr->cpufreq.nb,
> > + CPUFREQ_POLICY_NOTIFIER);
> > + if (err < 0) {
> > + dev_err(dev, "failed to register cpufreq notifier\n");
> > + return ERR_PTR(err);
> > + }
> > + }
> > +
> > + for (i = 0; i < thr->devfreq.ndevs; i++) {
> > + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
> > +
> > + dtd->nb.notifier_call = thr_devfreq_event;
> > + err = devm_devfreq_register_notifier(dev, dtd->devfreq,
> > + &dtd->nb, DEVFREQ_POLICY_NOTIFIER);
> > + if (err < 0) {
> > + dev_err(dev, "failed to register devfreq notifier\n");
> > + goto err_cpufreq_unregister;
> > + }
> > + }
> > +
> > + return thr;
> > +
> > +err_cpufreq_unregister:
> > + if (thr->cpufreq.ndevs > 0)
> > + cpufreq_unregister_notifier(&thr->cpufreq.nb,
> > + CPUFREQ_POLICY_NOTIFIER);
> > +
> > + return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL_GPL(throttler_setup);
> > +
> > +void throttler_teardown(struct throttler *thr)
> > +{
> > + int i;
> > +
> > + thr->level = 0;
> > +
> > + if (thr->cpufreq.ndevs > 0) {
> > + thr_cpufreq_update_policy(thr);
> > +
> > + cpufreq_unregister_notifier(&thr->cpufreq.nb,
> > + CPUFREQ_POLICY_NOTIFIER);
> > + }
> > +
> > + if (thr->devfreq.ndevs > 0)
> > + for (i = 0; i < thr->devfreq.ndevs; i++)
> > + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
> > +}
> > +EXPORT_SYMBOL_GPL(throttler_teardown);
> > diff --git a/include/linux/throttler.h b/include/linux/throttler.h
> > new file mode 100644
> > index 000000000000..cab8c466da4b
> > --- /dev/null
> > +++ b/include/linux/throttler.h
> > @@ -0,0 +1,10 @@
> > +#ifndef __LINUX_THROTTLER_H__
> > +#define __LINUX_THROTTLER_H__
> > +
> > +struct throttler;
> > +
> > +extern struct throttler *throttler_setup(struct device *dev);
> > +extern void throttler_teardown(struct throttler *thr);
> > +extern void throttler_set_level(struct throttler *thr, int level);
> > +
> > +#endif /* __LINUX_THROTTLER_H__ */
> >

2018-05-29 21:33:03

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

On Mon, May 28, 2018 at 10:08:26AM +0200, Greg Kroah-Hartman wrote:
> On Fri, May 25, 2018 at 01:30:41PM -0700, Matthias Kaehlcke wrote:
> > diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> > index 20be70c3f118..01a1714dd2ad 100644
> > --- a/drivers/misc/Makefile
> > +++ b/drivers/misc/Makefile
> > @@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
> > obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
> > obj-$(CONFIG_OCXL) += ocxl/
> > obj-$(CONFIG_MISC_RTSX) += cardreader/
> > +obj-y += throttler/
>
> Shouldn't you depend on a Kconfig option to traverse into this
> directory?

Ack

> > diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> > new file mode 100644
> > index 000000000000..ef8388f6bc0a
> > --- /dev/null
> > +++ b/drivers/misc/throttler/Kconfig
> > @@ -0,0 +1,13 @@
> > +menuconfig THROTTLER
> > + bool "Throttler support"
> > + default n
>
> the default is always 'n' no need to say it again here :)

Will remove

> > + depends on OF
> > + select CPU_FREQ
> > + select PM_DEVFREQ
> > + help
> > + This option enables core support for non-thermal throttling of CPUs
> > + and devfreq devices.
> > +
> > + Note that you also need a event monitor module usually called
> > + *_throttler.
> > +
> > diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> > new file mode 100644
> > index 000000000000..c8d920cee315
> > --- /dev/null
> > +++ b/drivers/misc/throttler/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_THROTTLER) += core.o
> > diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
> > new file mode 100644
> > index 000000000000..c058d03212b8
> > --- /dev/null
> > +++ b/drivers/misc/throttler/core.c
> > @@ -0,0 +1,373 @@
> > +/*
> > + * Core code for non-thermal throttling
> > + *
> > + * Copyright (C) 2018 Google, Inc.
> > + *
> > + * This software is licensed under the terms of the GNU General Public
> > + * License version 2, as published by the Free Software Foundation, and
> > + * may be copied, distributed, and modified under those terms.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + */
>
> Please just use a single SPDX line, like checkpatch.pl would have warned
> you about. No need for the full verbous license boiler-plate text here
> at all.

Ok

2018-05-30 08:07:08

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

Hi,

On 2018년 05월 30일 03:57, Matthias Kaehlcke wrote:
> On Mon, May 28, 2018 at 03:37:47PM +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
>>> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
>>> devfreq device") initializes df->min/max_freq with the min/max OPP when
>>> the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
>>> available min/max frequency") adds df->scaling_min/max_freq and the
>>> following to the frequency adjustment code:
>>>
>>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>>>
>>> With the current handling of min/max_freq this is incorrect:
>>>
>>> Even though df->max_freq is now initialized to a value != 0 user space
>>> can still set it to 0, in this case max_freq would be 0 instead of
>>> df->scaling_max_freq as intended. In consequence the frequency adjustment
>>> is not performed:
>>>
>>> if (max_freq && freq > max_freq) {
>>> freq = max_freq;
>>>
>>> To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
>>> when the user passes a value of 0. This also prevents df->max_freq from
>>> being set below the min OPP when df->min_freq is 0, and similar for
>>> min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
>>> checks for this case can be removed.
>>>
>>> Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
>>> Signed-off-by: Matthias Kaehlcke <[email protected]>
>>> ---
>>> drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
>>> 1 file changed, 18 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>>> index 0057ef5b0a98..67da4e7b486b 100644
>>> --- a/drivers/devfreq/devfreq.c
>>> +++ b/drivers/devfreq/devfreq.c
>>> @@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
>>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>>> min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
>>>
>>> - if (min_freq && freq < min_freq) {
>>> + if (freq < min_freq) {
>>> freq = min_freq;
>>> flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
>>> }
>>> - if (max_freq && freq > max_freq) {
>>> + if (freq > max_freq) {
>>> freq = max_freq;
>>> flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
>>> }
>>> @@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
>>> struct devfreq *df = to_devfreq(dev);
>>> unsigned long value;
>>> int ret;
>>> - unsigned long max;
>>>
>>> ret = sscanf(buf, "%lu", &value);
>>> if (ret != 1)
>>> return -EINVAL;
>>>
>>> mutex_lock(&df->lock);
>>> - max = df->max_freq;
>>> - if (value && max && value > max) {
>>> - ret = -EINVAL;
>>> - goto unlock;
>>> +
>>> + if (value) {
>>> + if (value > df->max_freq) {
>>> + ret = -EINVAL;
>>> + goto unlock;
>>> + }
>>> + } else {
>>> + value = df->profile->freq_table[df->profile->max_state - 1];
>>> }
>>
>> If you want to prevent that df->min_freq is zero,
>> you should reinitialize 'value' as following.
>> Because freq_table must be in ascending order.
>> value = df->profile->freq_table[0];
>
> Thanks for pointing this out!
>
> The devfreq device I tested with (a Mali GPU) uses descending order
> for some reason, and I assumed that's the usual order.
>
> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c#208
>
> It seems the ordering doesn't have any impact beyond this patch. If
> the order isn't mandatory for drivers that set up their own freq_table
> we should probably support both cases to be safe.

Prior to that 'freq_table' is optional. So, patch[1] initialize the 'freq_table'
by using OPP interface if 'freq_table' is NULL.
[1] commit 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")

Current devfreq recommend the ascending order for 'freq_table'.
But, as you know, it might be not enough to support them.

I agree that we should support the both cases (ascending or descending order).

Maybe, it might be not proper to access the freq_table[] directly
because we don't know the ordering style of 'freq_table'
if 'freq_table' is made by devfreq user instead of devfreq core.


>
>>> @@ -1158,17 +1161,20 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
>>> struct devfreq *df = to_devfreq(dev);
>>> unsigned long value;
>>> int ret;
>>> - unsigned long min;
>>>
>>> ret = sscanf(buf, "%lu", &value);
>>> if (ret != 1)
>>> return -EINVAL;
>>>
>>> mutex_lock(&df->lock);
>>> - min = df->min_freq;
>>> - if (value && min && value < min) {
>>> - ret = -EINVAL;
>>> - goto unlock;
>>> +
>>> + if (!value) {
>>> + value = df->profile->freq_table[0];
>>
>> ditto.
>> value = df->profile->freq_table[df->profile->max_state - 1];
>>
>>> + } else {
>>> + if (value < df->min_freq) {
>>> + ret = -EINVAL;
>>> + goto unlock;
>>> + }
>>> }
>>>
>>> df->max_freq = value;
>>>
>>
>> Actually, min_freq_store() and max_freq_store() are very similar.
>> But, this patch changed the order of conditional statement as following:
>> If there is no special reason, you better to keep the same format
>> for the readability.
>>
>>
>> min_freq_store()
>> if (value) {
>> ...
>> } else {
>> value = df->profile->freq_table[df->profile->max_state - 1];
>> }
>>
>>
>> max_freq_store()
>> if (!value) {
>> value = df->profile->freq_table[0];
>> } else {
>> ...
>>
>
> Agreed, better use the same format, I'll update it in the next revision.
>
>
>


--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-30 08:10:48

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 09/11] misc: throttler: Add core support for non-thermal throttling

Hi,

On 2018년 05월 30일 05:57, Matthias Kaehlcke wrote:
> On Mon, May 28, 2018 at 04:32:37PM +0900, Chanwoo Choi wrote:
>
>> IMHO, you better to split out the devfreq patches from
>> 'throttler' patch set. Because I'm not sure throttler is either
>> necessary or not.
>>
>> After finishing the review of 'throttler' patches without devfreq handling,
>> it would be better for you to send devfreq patches separately.
>
> I could certainly try to get 'throttler' with only cpufreq support
> merged, but that would kind of defeat the purpose.
>
> I first sent a RFC patch for the devfreq policy notifiers
> (https://patchwork.kernel.org/patch/10401999/) to get an idea if this
> is a reasonable path to pursue. In response you asked about "real code
> and patches" and here it is :)
>
> For my use case throttler is not really useful without devfreq
> support. In this sense I prefer to know 'early' if there are any
> blocking issues, rather then making the effort to get a limited
> version of the driver merged, and then learn that I wasted my own and
> the reviewers time because it is a dead end.

I'm never force to you. Just my opinion is how to make the patches
including the new concept. Thanks for your explanation why you send
the patch set with devfreq.

>
>> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
>>> The purpose of the throttler is to provide support for non-thermal
>>> throttling. Throttling is triggered by external event, e.g. the
>>> detection of a high battery discharge current, close to the OCP limit
>>> of the battery. The throttler is only in charge of the throttling, not
>>> the monitoring, which is done by another (possibly platform specific)
>>> driver.
>>>
>>> Signed-off-by: Matthias Kaehlcke <[email protected]>
>>> ---
>>> drivers/misc/Kconfig | 1 +
>>> drivers/misc/Makefile | 1 +
>>> drivers/misc/throttler/Kconfig | 13 ++
>>> drivers/misc/throttler/Makefile | 1 +
>>> drivers/misc/throttler/core.c | 373 ++++++++++++++++++++++++++++++++
>>> include/linux/throttler.h | 10 +
>>> 6 files changed, 399 insertions(+)
>>> create mode 100644 drivers/misc/throttler/Kconfig
>>> create mode 100644 drivers/misc/throttler/Makefile
>>> create mode 100644 drivers/misc/throttler/core.c
>>> create mode 100644 include/linux/throttler.h
>>>
>>> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
>>> index 5d713008749b..691d9625d83c 100644
>>> --- a/drivers/misc/Kconfig
>>> +++ b/drivers/misc/Kconfig
>>> @@ -513,4 +513,5 @@ source "drivers/misc/echo/Kconfig"
>>> source "drivers/misc/cxl/Kconfig"
>>> source "drivers/misc/ocxl/Kconfig"
>>> source "drivers/misc/cardreader/Kconfig"
>>> +source "drivers/misc/throttler/Kconfig"
>>> endmenu
>>> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
>>> index 20be70c3f118..01a1714dd2ad 100644
>>> --- a/drivers/misc/Makefile
>>> +++ b/drivers/misc/Makefile
>>> @@ -57,3 +57,4 @@ obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
>>> obj-$(CONFIG_PCI_ENDPOINT_TEST) += pci_endpoint_test.o
>>> obj-$(CONFIG_OCXL) += ocxl/
>>> obj-$(CONFIG_MISC_RTSX) += cardreader/
>>> +obj-y += throttler/
>>> diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
>>> new file mode 100644
>>> index 000000000000..ef8388f6bc0a
>>> --- /dev/null
>>> +++ b/drivers/misc/throttler/Kconfig
>>> @@ -0,0 +1,13 @@
>>> +menuconfig THROTTLER
>>> + bool "Throttler support"
>>> + default n
>>> + depends on OF
>>> + select CPU_FREQ
>>> + select PM_DEVFREQ
>>> + help
>>> + This option enables core support for non-thermal throttling of CPUs
>>> + and devfreq devices.
>>> +
>>> + Note that you also need a event monitor module usually called
>>> + *_throttler.
>>> +
>>> diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
>>> new file mode 100644
>>> index 000000000000..c8d920cee315
>>> --- /dev/null
>>> +++ b/drivers/misc/throttler/Makefile
>>> @@ -0,0 +1 @@
>>> +obj-$(CONFIG_THROTTLER) += core.o
>>> diff --git a/drivers/misc/throttler/core.c b/drivers/misc/throttler/core.c
>>> new file mode 100644
>>> index 000000000000..c058d03212b8
>>> --- /dev/null
>>> +++ b/drivers/misc/throttler/core.c
>>> @@ -0,0 +1,373 @@
>>> +/*
>>> + * Core code for non-thermal throttling
>>> + *
>>> + * Copyright (C) 2018 Google, Inc.
>>> + *
>>> + * This software is licensed under the terms of the GNU General Public
>>> + * License version 2, as published by the Free Software Foundation, and
>>> + * may be copied, distributed, and modified under those terms.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + */
>>> +
>>> +#include <linux/cpufreq.h>
>>> +#include <linux/devfreq.h>
>>> +#include <linux/kernel.h>
>>> +#include <linux/notifier.h>
>>> +#include <linux/of.h>
>>> +#include <linux/of_platform.h>
>>> +#include <linux/platform_device.h>
>>> +
>>> +/*
>>> + * Non-thermal throttling: throttling of system components in response to
>>> + * external events (e.g. high battery discharge current).
>>> + *
>>> + * The throttler supports throttling through cpufreq and devfreq. Multiple
>>> + * levels of throttling can be configured. At level 0 no throttling is
>>> + * active on behalf of the throttler, for values > 0 throttling is typically
>>> + * configured to be increasingly aggressive with each level.
>>> + * The number of throttling levels is not limited by the throttler (though
>>> + * it is likely limited by the throttling devices). It is not necessary to
>>> + * configure the same number of levels for all throttling devices. If the
>>> + * requested throttling level for a device is higher than the maximum level
>>> + * of the device the throttler will sleect the maximum throttling level of
>>> + * the device.
>>> + *
>>> + * Non-thermal throttling is split in two parts:
>>> + *
>>> + * - throttler core
>>> + * - parses the thermal policy
>>> + * - applies throttling settings for a requested level of throttling
>>> + *
>>> + * - event monitor driver
>>> + * - monitors the events that trigger throttling
>>> + * - determines the throttling level (often limited to on/off)
>>> + * - requests throttler core to apply throttling settings
>>> + *
>>> + * It is possible for a system to have more than one throttler and the
>>> + * throttlers may make use of the same throttling devices, in case of
>>> + * conflicting settings for a device the more aggressive values will be
>>> + * applied.
>>> + *
>>> + */
>>> +
>>> +struct thrcfg {
>>> + uint32_t *freqs;
>>> + int num_levels;
>>> +};
>>> +
>>> +struct cpufreq_thrdev {
>>> + uint32_t cpu;
>>> + struct thrcfg cfg;
>>> +};
>>> +
>>> +struct devfreq_thrdev {
>>> + struct devfreq *devfreq;
>>> + struct thrcfg cfg;
>>> + struct throttler *thr;
>>> + struct notifier_block nb;
>>> +};
>>> +
>>> +struct __thr_cpufreq {
>>> + struct cpufreq_thrdev *devs;
>>> + int ndevs;
>>> + struct notifier_block nb;
>>> +};
>>> +
>>> +struct __thr_devfreq {
>>> + struct devfreq_thrdev *devs;
>>> + int ndevs;
>>> +};
>>> +
>>> +struct throttler {
>>> + struct device *dev;
>>> + int level;
>>> + struct __thr_cpufreq cpufreq;
>>> + struct __thr_devfreq devfreq;
>>> +};
>>> +
>>> +static unsigned long thr_get_throttling_freq(struct thrcfg *cfg, int level)
>>> +{
>>> + if (level == 0 ) {
>>> + WARN(true, "level == 0");
>>> + return 0;
>>> + }
>>> +
>>> + if (level <= cfg->num_levels)
>>> + return cfg->freqs[level - 1];
>>> + else
>>> + return cfg->freqs[cfg->num_levels - 1];
>>> +}
>>> +
>>> +static int thr_cpufreq_event(struct notifier_block *nb,
>>> + unsigned long event, void *data)
>>> +{
>>> + struct throttler *thr =
>>> + container_of(nb, struct throttler, cpufreq.nb);
>>> + struct cpufreq_policy *policy = data;
>>> + struct cpufreq_thrdev *ctd;
>>> + int i;
>>> +
>>> + if ((event != CPUFREQ_ADJUST) || (thr->level == 0))
>>> + return NOTIFY_DONE;
>>> +
>>> + for (i = 0; i < thr->cpufreq.ndevs; i++) {
>>> + ctd = &thr->cpufreq.devs[i];
>>> +
>>> + if (ctd->cpu == policy->cpu) {
>>> + unsigned long clamp_freq =
>>> + thr_get_throttling_freq(&ctd->cfg, thr->level);
>>> + if (clamp_freq < policy->max) {
>>> + cpufreq_verify_within_limits(policy, 0, clamp_freq);
>>> + }
>>> + }
>>> + }
>>> +
>>> + return NOTIFY_DONE;
>>> +}
>>> +
>>> +static int thr_devfreq_event(struct notifier_block *nb,
>>> + unsigned long event, void *data)
>>> +{
>>> + struct devfreq_thrdev *dtd =
>>> + container_of(nb, struct devfreq_thrdev, nb);
>>> + struct throttler *thr = dtd->thr;
>>> + struct devfreq_policy *policy = data;
>>> + unsigned long clamp_freq;
>>> +
>>> + if ((event != DEVFREQ_ADJUST) || (thr->level == 0))
>>> + return NOTIFY_DONE;
>>> +
>>> + clamp_freq = thr_get_throttling_freq(&dtd->cfg, thr->level);
>>> + if (clamp_freq < policy->max)
>>> + devfreq_verify_within_limits(policy, 0, clamp_freq);
>>> +
>>> + return NOTIFY_DONE;
>>> +}
>>> +
>>> +static void thr_cpufreq_update_policy(struct throttler *thr)
>>> +{
>>> + int i;
>>> +
>>> + for (i = 0; i < thr->cpufreq.ndevs; i++) {
>>> + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
>>> + struct cpufreq_policy *policy = cpufreq_cpu_get(ctd->cpu);
>>> +
>>> + if (!policy) {
>>> + dev_warn(thr->dev, "CPU%d does have no cpufreq policy!\n", ctd->cpu);
>>> + continue;
>>> + }
>>> +
>>> + cpufreq_update_policy(ctd->cpu);
>>> + cpufreq_cpu_put(policy);
>>> + }
>>> +}
>>> +
>>> +static int thr_parse_thrcfg(struct throttler *thr,
>>> + struct device_node *np, struct thrcfg *cfg) {
>>> + int err;
>>> +
>>> + cfg->num_levels =
>>> + of_property_count_u32_elems(np, "throttling-frequencies");
>>> + if (cfg->num_levels < 0) {
>>> + pr_err("%s: failed to determine number of throttling frequencies\n",
>>> + np->full_name);
>>> + return cfg->num_levels;
>>> + }
>>> +
>>> + cfg->freqs = devm_kzalloc(thr->dev,
>>> + cfg->num_levels * sizeof(u32), GFP_KERNEL);
>>> + if (!cfg->freqs)
>>> + return -ENOMEM;
>>> +
>>> + err = of_property_read_u32_array(np, "throttling-frequencies",
>>> + cfg->freqs, cfg->num_levels);
>>> + if (err) {
>>> + pr_err("%s: failed to read throttling frequencies\n", np->full_name);
>>> + return err;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static struct devfreq *thr_find_devfreq_dev(struct throttler *thr,
>>> + struct device_node *np_df) {
>>> + struct device_node *node;
>>> + struct platform_device *pdev;
>>> +
>>> + node = of_parse_phandle(np_df, "device", 0);
>>> + if (!node) {
>>> + pr_err("%s: failed to get devfreq parent device\n",
>>> + np_df->full_name);
>>> + return ERR_PTR(-EINVAL);
>>> + }
>>> +
>>> + pdev = of_find_device_by_node(node);
>>> + if (!pdev) {
>>> + pr_err("%s: could not find devfreq parent device\n",
>>> + node->full_name);
>>> + return ERR_PTR(-EINVAL);
>>> + }
>>> +
>>> + return dev_to_devfreq(&pdev->dev);
>>> +}
>>> +
>>> +static int thr_parse_dt(struct throttler *thr, struct device_node *np)
>>> +{
>>> + struct device_node *node, *child;
>>> + int err, i;
>>> +
>>> + node = of_get_child_by_name(np, "cpufreq");
>>> + if (node) {
>>> + thr->cpufreq.ndevs = of_get_child_count(node);
>>> + thr->cpufreq.devs = devm_kzalloc(thr->dev,
>>> + sizeof(*thr->cpufreq.devs) * thr->cpufreq.ndevs,
>>> + GFP_KERNEL);
>>> +
>>> + i = 0;
>>> + for_each_child_of_node(node, child) {
>>> + struct cpufreq_thrdev *ctd = &thr->cpufreq.devs[i];
>>> +
>>> + err = of_property_read_u32(child, "cpu", &ctd->cpu);
>>> + if (err) {
>>> + pr_err("%s: failed to read CPU id\n", child->full_name);
>>> + return err;
>>> + }
>>> +
>>> + err = thr_parse_thrcfg(thr, child, &ctd->cfg);
>>> + if (err)
>>> + return err;
>>> +
>>> + i++;
>>> + }
>>> + }
>>> +
>>> + node = of_get_child_by_name(np, "devfreq");
>>> + if (node) {
>>> + thr->devfreq.ndevs = of_get_child_count(node);
>>> + thr->devfreq.devs = devm_kzalloc(thr->dev,
>>> + sizeof(*thr->devfreq.devs) * thr->devfreq.ndevs,
>>> + GFP_KERNEL);
>>> +
>>> + i = 0;
>>> + for_each_child_of_node(node, child) {
>>> + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
>>> +
>>> + dtd->thr = thr;
>>> +
>>> + dtd->devfreq = thr_find_devfreq_dev(thr, child);
>>> + if (IS_ERR(dtd->devfreq))
>>> + return PTR_ERR(dtd->devfreq);
>>> +
>>> + err = thr_parse_thrcfg(thr, child, &dtd->cfg);
>>> + if (err)
>>> + return err;
>>> +
>>> + i++;
>>> + }
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void thr_update_devfreq(struct devfreq *devfreq)
>>> +{
>>> + mutex_lock(&devfreq->lock);
>>> + update_devfreq(devfreq);
>>> + mutex_unlock(&devfreq->lock);
>>> +}
>>> +
>>> +void throttler_set_level(struct throttler *thr, int level)
>>> +{
>>> + int i;
>>> +
>>> + if (level == thr->level)
>>> + return;
>>> +
>>> + dev_dbg(thr->dev, "throttling level: %d\n", level);
>>> + thr->level = level;
>>> +
>>> + if (thr->cpufreq.ndevs > 0)
>>> + thr_cpufreq_update_policy(thr);
>>> +
>>> + if (thr->devfreq.ndevs > 0)
>>> + for (i = 0; i < thr->devfreq.ndevs; i++)
>>> + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
>>> +}
>>> +EXPORT_SYMBOL_GPL(throttler_set_level);
>>> +
>>> +struct throttler *throttler_setup(struct device *dev)
>>> +{
>>> + struct throttler *thr;
>>> + struct device_node *np = dev->of_node;
>>> + int err, i;
>>> +
>>> + if (!np)
>>> + /* should never happen */
>>> + return ERR_PTR(-EINVAL);
>>> +
>>> + thr = devm_kzalloc(dev, sizeof(*thr), GFP_KERNEL);
>>> + if (!thr)
>>> + return ERR_PTR(-ENOMEM);
>>> +
>>> + thr->dev = dev;
>>> +
>>> + err = thr_parse_dt(thr, np);
>>> + if (err)
>>> + return ERR_PTR(err);
>>> +
>>> + if (thr->cpufreq.ndevs > 0) {
>>> + thr->cpufreq.nb.notifier_call = thr_cpufreq_event;
>>> + err = cpufreq_register_notifier(&thr->cpufreq.nb,
>>> + CPUFREQ_POLICY_NOTIFIER);
>>> + if (err < 0) {
>>> + dev_err(dev, "failed to register cpufreq notifier\n");
>>> + return ERR_PTR(err);
>>> + }
>>> + }
>>> +
>>> + for (i = 0; i < thr->devfreq.ndevs; i++) {
>>> + struct devfreq_thrdev *dtd = &thr->devfreq.devs[i];
>>> +
>>> + dtd->nb.notifier_call = thr_devfreq_event;
>>> + err = devm_devfreq_register_notifier(dev, dtd->devfreq,
>>> + &dtd->nb, DEVFREQ_POLICY_NOTIFIER);
>>> + if (err < 0) {
>>> + dev_err(dev, "failed to register devfreq notifier\n");
>>> + goto err_cpufreq_unregister;
>>> + }
>>> + }
>>> +
>>> + return thr;
>>> +
>>> +err_cpufreq_unregister:
>>> + if (thr->cpufreq.ndevs > 0)
>>> + cpufreq_unregister_notifier(&thr->cpufreq.nb,
>>> + CPUFREQ_POLICY_NOTIFIER);
>>> +
>>> + return ERR_PTR(err);
>>> +}
>>> +EXPORT_SYMBOL_GPL(throttler_setup);
>>> +
>>> +void throttler_teardown(struct throttler *thr)
>>> +{
>>> + int i;
>>> +
>>> + thr->level = 0;
>>> +
>>> + if (thr->cpufreq.ndevs > 0) {
>>> + thr_cpufreq_update_policy(thr);
>>> +
>>> + cpufreq_unregister_notifier(&thr->cpufreq.nb,
>>> + CPUFREQ_POLICY_NOTIFIER);
>>> + }
>>> +
>>> + if (thr->devfreq.ndevs > 0)
>>> + for (i = 0; i < thr->devfreq.ndevs; i++)
>>> + thr_update_devfreq(thr->devfreq.devs[i].devfreq);
>>> +}
>>> +EXPORT_SYMBOL_GPL(throttler_teardown);
>>> diff --git a/include/linux/throttler.h b/include/linux/throttler.h
>>> new file mode 100644
>>> index 000000000000..cab8c466da4b
>>> --- /dev/null
>>> +++ b/include/linux/throttler.h
>>> @@ -0,0 +1,10 @@
>>> +#ifndef __LINUX_THROTTLER_H__
>>> +#define __LINUX_THROTTLER_H__
>>> +
>>> +struct throttler;
>>> +
>>> +extern struct throttler *throttler_setup(struct device *dev);
>>> +extern void throttler_teardown(struct throttler *thr);
>>> +extern void throttler_set_level(struct throttler *thr, int level);
>>> +
>>> +#endif /* __LINUX_THROTTLER_H__ */
>>>
>
>
>


--
Best Regards,
Chanwoo Choi
Samsung Electronics

2018-05-30 21:13:59

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

On Wed, May 30, 2018 at 05:04:14PM +0900, Chanwoo Choi wrote:
> Hi,
>
> On 2018년 05월 30일 03:57, Matthias Kaehlcke wrote:
> > On Mon, May 28, 2018 at 03:37:47PM +0900, Chanwoo Choi wrote:
> >> Hi,
> >>
> >> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
> >>> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
> >>> devfreq device") initializes df->min/max_freq with the min/max OPP when
> >>> the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
> >>> available min/max frequency") adds df->scaling_min/max_freq and the
> >>> following to the frequency adjustment code:
> >>>
> >>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
> >>>
> >>> With the current handling of min/max_freq this is incorrect:
> >>>
> >>> Even though df->max_freq is now initialized to a value != 0 user space
> >>> can still set it to 0, in this case max_freq would be 0 instead of
> >>> df->scaling_max_freq as intended. In consequence the frequency adjustment
> >>> is not performed:
> >>>
> >>> if (max_freq && freq > max_freq) {
> >>> freq = max_freq;
> >>>
> >>> To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
> >>> when the user passes a value of 0. This also prevents df->max_freq from
> >>> being set below the min OPP when df->min_freq is 0, and similar for
> >>> min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
> >>> checks for this case can be removed.
> >>>
> >>> Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
> >>> Signed-off-by: Matthias Kaehlcke <[email protected]>
> >>> ---
> >>> drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
> >>> 1 file changed, 18 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> >>> index 0057ef5b0a98..67da4e7b486b 100644
> >>> --- a/drivers/devfreq/devfreq.c
> >>> +++ b/drivers/devfreq/devfreq.c
> >>> @@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
> >>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
> >>> min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
> >>>
> >>> - if (min_freq && freq < min_freq) {
> >>> + if (freq < min_freq) {
> >>> freq = min_freq;
> >>> flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
> >>> }
> >>> - if (max_freq && freq > max_freq) {
> >>> + if (freq > max_freq) {
> >>> freq = max_freq;
> >>> flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
> >>> }
> >>> @@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
> >>> struct devfreq *df = to_devfreq(dev);
> >>> unsigned long value;
> >>> int ret;
> >>> - unsigned long max;
> >>>
> >>> ret = sscanf(buf, "%lu", &value);
> >>> if (ret != 1)
> >>> return -EINVAL;
> >>>
> >>> mutex_lock(&df->lock);
> >>> - max = df->max_freq;
> >>> - if (value && max && value > max) {
> >>> - ret = -EINVAL;
> >>> - goto unlock;
> >>> +
> >>> + if (value) {
> >>> + if (value > df->max_freq) {
> >>> + ret = -EINVAL;
> >>> + goto unlock;
> >>> + }
> >>> + } else {
> >>> + value = df->profile->freq_table[df->profile->max_state - 1];
> >>> }
> >>
> >> If you want to prevent that df->min_freq is zero,
> >> you should reinitialize 'value' as following.
> >> Because freq_table must be in ascending order.
> >> value = df->profile->freq_table[0];
> >
> > Thanks for pointing this out!
> >
> > The devfreq device I tested with (a Mali GPU) uses descending order
> > for some reason, and I assumed that's the usual order.
> >
> > https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c#208
> >
> > It seems the ordering doesn't have any impact beyond this patch. If
> > the order isn't mandatory for drivers that set up their own freq_table
> > we should probably support both cases to be safe.
>
> Prior to that 'freq_table' is optional. So, patch[1] initialize the 'freq_table'
> by using OPP interface if 'freq_table' is NULL.
> [1] commit 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
>
> Current devfreq recommend the ascending order for 'freq_table'.
> But, as you know, it might be not enough to support them.
>
> I agree that we should support the both cases (ascending or descending order).
>
> Maybe, it might be not proper to access the freq_table[] directly
> because we don't know the ordering style of 'freq_table'
> if 'freq_table' is made by devfreq user instead of devfreq core.

If we can assume that it is either ascending or descending, but not
random order than a simple check if freq_table[0] <
freq_table[max_state - 1] would be sufficient.

Otherwise we could also determine the min/max after initialization and
save the result, though that would leave us with yet another frequency
pair, which might be confusing, especially if we don't come up with
good names to distinguish between them.

2018-05-31 09:07:13

by Enric Balletbo Serra

[permalink] [raw]
Subject: Re: [PATCH 11/11] misc/throttler: Add Chrome OS EC throttler

Hi Matthias,

The patch looks good to me, just three minor comments to be more
coherent with other cros-ec drivers.

2018-05-25 22:30 GMT+02:00 Matthias Kaehlcke <[email protected]>:
> The driver subscribes to throttling events from the Chrome OS
> embedded controller and enables/disables system throttling based
> on these events.
>
> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> drivers/misc/throttler/Kconfig | 15 +++
> drivers/misc/throttler/Makefile | 1 +
> drivers/misc/throttler/cros_ec_throttler.c | 122 +++++++++++++++++++++
> 3 files changed, 138 insertions(+)
> create mode 100644 drivers/misc/throttler/cros_ec_throttler.c
>
> diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> index ef8388f6bc0a..652b6817b75c 100644
> --- a/drivers/misc/throttler/Kconfig
> +++ b/drivers/misc/throttler/Kconfig
> @@ -11,3 +11,18 @@ menuconfig THROTTLER
> Note that you also need a event monitor module usually called
> *_throttler.
>
> +if THROTTLER
> +
> +config CROS_EC_THROTTLER
> + tristate "Throttler event monitor for the Chrome OS Embedded Controller"

s/Chrome OS/ChromeOS/ This is how other cros-ec drivers refer to the
embedded controller, so will be good if you could maintain this
denomination.

> + default n
> + depends on MFD_CROS_EC
> + ---help---
> + This driver adds support to throttle the system in reaction to
> + Chrome OS EC events.
> +

ditto

> + To compile this driver as a module, choose M here: the
> + module will be called cros_ec_throttler.
> +
> +endif # THROTTLER
> +
> diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> index c8d920cee315..d9b2a77dabc9 100644
> --- a/drivers/misc/throttler/Makefile
> +++ b/drivers/misc/throttler/Makefile
> @@ -1 +1,2 @@
> obj-$(CONFIG_THROTTLER) += core.o
> +obj-$(CONFIG_CROS_EC_THROTTLER) += cros_ec_throttler.o
> diff --git a/drivers/misc/throttler/cros_ec_throttler.c b/drivers/misc/throttler/cros_ec_throttler.c
> new file mode 100644
> index 000000000000..ea6bc002d49c
> --- /dev/null
> +++ b/drivers/misc/throttler/cros_ec_throttler.c
> @@ -0,0 +1,122 @@
> +/*
> + * Driver for throttling triggered by EC events.
> + *
> + * Copyright (C) 2018 Google, Inc.
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +

Replace this for the SPDX header format

// SPDX-License-Identifier: GPL-2.0+
// Driver for throttling triggered by EC events.
//
// Copyright (C) 2018 Google, Inc.
// Author: Matthias Kaehlcke <[email protected]>

> +#include <linux/kernel.h>
> +#include <linux/mfd/cros_ec.h>
> +#include <linux/module.h>
> +#include <linux/notifier.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +#include <linux/throttler.h>
> +
> +struct cros_ec_throttler {
> + struct cros_ec_device *ec;
> + struct throttler *throttler;
> + struct notifier_block nb;
> +};
> +
> +static int cros_ec_throttler_event(struct notifier_block *nb,
> + unsigned long queued_during_suspend, void *_notify)
> +{
> + struct cros_ec_throttler *cte =
> + container_of(nb, struct cros_ec_throttler, nb);

nit: Better add a define here instead of split the line like this ?

> + u32 host_event;
> +
> + host_event = cros_ec_get_host_event(cte->ec);
> + if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_START)) {
> + throttler_set_level(cte->throttler, 1);
> +
> + return NOTIFY_OK;
> + } else if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_STOP)) {
> + throttler_set_level(cte->throttler, 0);
> +
> + return NOTIFY_OK;
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int cros_ec_throttler_probe(struct platform_device *pdev)
> +{
> + struct cros_ec_throttler *cte;
> + struct device *dev = &pdev->dev;
> + struct device_node *np = pdev->dev.of_node;
> + int ret;
> +
> + if (!np) {
> + /* should never happen */
> + return -EINVAL;
> + }
> +
> + cte = devm_kzalloc(dev, sizeof(*cte), GFP_KERNEL);
> + if (!cte)
> + return -ENOMEM;
> +
> + cte->ec = dev_get_drvdata(pdev->dev.parent);
> +
> + cte->throttler = throttler_setup(dev);
> + if (IS_ERR(cte->throttler))
> + return PTR_ERR(cte->throttler);
> +
> + dev_set_drvdata(dev, cte);
> +
> + cte->nb.notifier_call = cros_ec_throttler_event;
> + ret = blocking_notifier_chain_register(&cte->ec->event_notifier,
> + &cte->nb);
> + if (ret < 0) {
> + dev_err(dev, "failed to register notifier\n");
> + throttler_teardown(cte->throttler);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int cros_ec_throttler_remove(struct platform_device *pdev)
> +{
> + struct cros_ec_throttler *cte = platform_get_drvdata(pdev);
> +
> + blocking_notifier_chain_unregister(&cte->ec->event_notifier,
> + &cte->nb);
> +
> + throttler_teardown(cte->throttler);
> +
> + return 0;
> +}
> +
> +#ifdef CONFIG_OF
> +static const struct of_device_id cros_ec_throttler_of_match[] = {
> + { .compatible = "google,cros-ec-throttler" },
> + { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, cros_ec_throttler_of_match);
> +#endif /* CONFIG_OF */
> +
> +static struct platform_driver cros_ec_throttler_driver = {
> + .driver = {
> + .name = "cros-ec-throttler",
> + .of_match_table = of_match_ptr(cros_ec_throttler_of_match),
> + },
> + .probe = cros_ec_throttler_probe,
> + .remove = cros_ec_throttler_remove,
> +};
> +
> +module_platform_driver(cros_ec_throttler_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Matthias Kaehlcke <[email protected]>");
> +MODULE_DESCRIPTION("Chrome OS EC Throttler");

s/Chrome OS/ChromeOS/

> --
> 2.17.0.921.gf22659ad46-goog
>

Reviewed-by: Enric Balletbo i Serra <[email protected]>

Thanks,
Enric

2018-05-31 16:32:53

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH 10/11] dt-bindings: misc: add bindings for throttler

On Fri, May 25, 2018 at 01:30:42PM -0700, Matthias Kaehlcke wrote:

Commit msg?

> Signed-off-by: Matthias Kaehlcke <[email protected]>
> ---
> .../devicetree/bindings/misc/throttler.txt | 41 +++++++++++++++++++
> 1 file changed, 41 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt
>
> diff --git a/Documentation/devicetree/bindings/misc/throttler.txt b/Documentation/devicetree/bindings/misc/throttler.txt
> new file mode 100644
> index 000000000000..92f13e94451a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/misc/throttler.txt
> @@ -0,0 +1,41 @@
> +Throttler driver
> +
> +The Throttler is used for non-thermal throttling of system components like
> +CPUs or devfreq devices.

This all looks very Linux specific and not a h/w device. Perhaps you can
add hint properties to OPP tables as to what entries can be used for
throttling, but otherwise this doesn't belong in DT.

2018-05-31 17:34:17

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 11/11] misc/throttler: Add Chrome OS EC throttler

Hi Enric,

thanks for the review!

I'll address your comments in the next revision.

On Thu, May 31, 2018 at 11:05:09AM +0200, Enric Balletbo Serra wrote:
> Hi Matthias,
>
> The patch looks good to me, just three minor comments to be more
> coherent with other cros-ec drivers.
>
> 2018-05-25 22:30 GMT+02:00 Matthias Kaehlcke <[email protected]>:
> > The driver subscribes to throttling events from the Chrome OS
> > embedded controller and enables/disables system throttling based
> > on these events.
> >
> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> > ---
> > drivers/misc/throttler/Kconfig | 15 +++
> > drivers/misc/throttler/Makefile | 1 +
> > drivers/misc/throttler/cros_ec_throttler.c | 122 +++++++++++++++++++++
> > 3 files changed, 138 insertions(+)
> > create mode 100644 drivers/misc/throttler/cros_ec_throttler.c
> >
> > diff --git a/drivers/misc/throttler/Kconfig b/drivers/misc/throttler/Kconfig
> > index ef8388f6bc0a..652b6817b75c 100644
> > --- a/drivers/misc/throttler/Kconfig
> > +++ b/drivers/misc/throttler/Kconfig
> > @@ -11,3 +11,18 @@ menuconfig THROTTLER
> > Note that you also need a event monitor module usually called
> > *_throttler.
> >
> > +if THROTTLER
> > +
> > +config CROS_EC_THROTTLER
> > + tristate "Throttler event monitor for the Chrome OS Embedded Controller"
>
> s/Chrome OS/ChromeOS/ This is how other cros-ec drivers refer to the
> embedded controller, so will be good if you could maintain this
> denomination.
>
> > + default n
> > + depends on MFD_CROS_EC
> > + ---help---
> > + This driver adds support to throttle the system in reaction to
> > + Chrome OS EC events.
> > +
>
> ditto
>
> > + To compile this driver as a module, choose M here: the
> > + module will be called cros_ec_throttler.
> > +
> > +endif # THROTTLER
> > +
> > diff --git a/drivers/misc/throttler/Makefile b/drivers/misc/throttler/Makefile
> > index c8d920cee315..d9b2a77dabc9 100644
> > --- a/drivers/misc/throttler/Makefile
> > +++ b/drivers/misc/throttler/Makefile
> > @@ -1 +1,2 @@
> > obj-$(CONFIG_THROTTLER) += core.o
> > +obj-$(CONFIG_CROS_EC_THROTTLER) += cros_ec_throttler.o
> > diff --git a/drivers/misc/throttler/cros_ec_throttler.c b/drivers/misc/throttler/cros_ec_throttler.c
> > new file mode 100644
> > index 000000000000..ea6bc002d49c
> > --- /dev/null
> > +++ b/drivers/misc/throttler/cros_ec_throttler.c
> > @@ -0,0 +1,122 @@
> > +/*
> > + * Driver for throttling triggered by EC events.
> > + *
> > + * Copyright (C) 2018 Google, Inc.
> > + *
> > + * This software is licensed under the terms of the GNU General Public
> > + * License version 2, as published by the Free Software Foundation, and
> > + * may be copied, distributed, and modified under those terms.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + * GNU General Public License for more details.
> > + *
> > + */
> > +
>
> Replace this for the SPDX header format
>
> // SPDX-License-Identifier: GPL-2.0+
> // Driver for throttling triggered by EC events.
> //
> // Copyright (C) 2018 Google, Inc.
> // Author: Matthias Kaehlcke <[email protected]>
>
> > +#include <linux/kernel.h>
> > +#include <linux/mfd/cros_ec.h>
> > +#include <linux/module.h>
> > +#include <linux/notifier.h>
> > +#include <linux/of.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/throttler.h>
> > +
> > +struct cros_ec_throttler {
> > + struct cros_ec_device *ec;
> > + struct throttler *throttler;
> > + struct notifier_block nb;
> > +};
> > +
> > +static int cros_ec_throttler_event(struct notifier_block *nb,
> > + unsigned long queued_during_suspend, void *_notify)
> > +{
> > + struct cros_ec_throttler *cte =
> > + container_of(nb, struct cros_ec_throttler, nb);
>
> nit: Better add a define here instead of split the line like this ?
>
> > + u32 host_event;
> > +
> > + host_event = cros_ec_get_host_event(cte->ec);
> > + if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_START)) {
> > + throttler_set_level(cte->throttler, 1);
> > +
> > + return NOTIFY_OK;
> > + } else if (host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_THROTTLE_STOP)) {
> > + throttler_set_level(cte->throttler, 0);
> > +
> > + return NOTIFY_OK;
> > + }
> > +
> > + return NOTIFY_DONE;
> > +}
> > +
> > +static int cros_ec_throttler_probe(struct platform_device *pdev)
> > +{
> > + struct cros_ec_throttler *cte;
> > + struct device *dev = &pdev->dev;
> > + struct device_node *np = pdev->dev.of_node;
> > + int ret;
> > +
> > + if (!np) {
> > + /* should never happen */
> > + return -EINVAL;
> > + }
> > +
> > + cte = devm_kzalloc(dev, sizeof(*cte), GFP_KERNEL);
> > + if (!cte)
> > + return -ENOMEM;
> > +
> > + cte->ec = dev_get_drvdata(pdev->dev.parent);
> > +
> > + cte->throttler = throttler_setup(dev);
> > + if (IS_ERR(cte->throttler))
> > + return PTR_ERR(cte->throttler);
> > +
> > + dev_set_drvdata(dev, cte);
> > +
> > + cte->nb.notifier_call = cros_ec_throttler_event;
> > + ret = blocking_notifier_chain_register(&cte->ec->event_notifier,
> > + &cte->nb);
> > + if (ret < 0) {
> > + dev_err(dev, "failed to register notifier\n");
> > + throttler_teardown(cte->throttler);
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int cros_ec_throttler_remove(struct platform_device *pdev)
> > +{
> > + struct cros_ec_throttler *cte = platform_get_drvdata(pdev);
> > +
> > + blocking_notifier_chain_unregister(&cte->ec->event_notifier,
> > + &cte->nb);
> > +
> > + throttler_teardown(cte->throttler);
> > +
> > + return 0;
> > +}
> > +
> > +#ifdef CONFIG_OF
> > +static const struct of_device_id cros_ec_throttler_of_match[] = {
> > + { .compatible = "google,cros-ec-throttler" },
> > + { /* sentinel */ }
> > +};
> > +MODULE_DEVICE_TABLE(of, cros_ec_throttler_of_match);
> > +#endif /* CONFIG_OF */
> > +
> > +static struct platform_driver cros_ec_throttler_driver = {
> > + .driver = {
> > + .name = "cros-ec-throttler",
> > + .of_match_table = of_match_ptr(cros_ec_throttler_of_match),
> > + },
> > + .probe = cros_ec_throttler_probe,
> > + .remove = cros_ec_throttler_remove,
> > +};
> > +
> > +module_platform_driver(cros_ec_throttler_driver);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Matthias Kaehlcke <[email protected]>");
> > +MODULE_DESCRIPTION("Chrome OS EC Throttler");
>
> s/Chrome OS/ChromeOS/
>
> >
>
> Reviewed-by: Enric Balletbo i Serra <[email protected]>
>
> Thanks,
> Enric

2018-05-31 18:36:29

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 10/11] dt-bindings: misc: add bindings for throttler

Hi Rob,

On Thu, May 31, 2018 at 11:31:59AM -0500, Rob Herring wrote:
> On Fri, May 25, 2018 at 01:30:42PM -0700, Matthias Kaehlcke wrote:
>
> Commit msg?

Will add some more info in the next revision.

> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> > ---
> > .../devicetree/bindings/misc/throttler.txt | 41 +++++++++++++++++++
> > 1 file changed, 41 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt
> >
> > diff --git a/Documentation/devicetree/bindings/misc/throttler.txt b/Documentation/devicetree/bindings/misc/throttler.txt
> > new file mode 100644
> > index 000000000000..92f13e94451a
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/misc/throttler.txt
> > @@ -0,0 +1,41 @@
> > +Throttler driver
> > +
> > +The Throttler is used for non-thermal throttling of system components like
> > +CPUs or devfreq devices.
>
> This all looks very Linux specific and not a h/w device. Perhaps you can
> add hint properties to OPP tables as to what entries can be used for
> throttling, but otherwise this doesn't belong in DT.

My idea is to allow multiple throttlers, which might operate on
different throttling devices or use different OPPs for the same
device. To support this a simple boolean hint that the OPP can be used
for throttling would not be sufficient.

What should work is a property with an array of phandles of the
throttlers that use a given OPP.

AFAIK it is currently not possible to enumerate the devfreq devices in
the system, so besides the info in the OPPs the throttler itself would
still need a phandle to the devfreq device(s) it uses.

I envision something like this:

gpu_opp_table: opp-table2 {
compatible = "operating-points-v2";

opp00 {
opp-hz = /bits/ 64 <200000000>;
opp-microvolt = <800000>;
};
opp01 {
opp-hz = /bits/ 64 <297000000>;
opp-microvolt = <800000>;
opp-throttlers = <&cros_ec_throttler>;
};
...
};

cros_ec_throttler: cros-ec-throttler {
compatible = "google,cros-ec-throttler";

devfreq-devices = <&gpu>;
};

Would this be acceptable?

Thanks

Matthias

2018-05-31 20:05:25

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH 10/11] dt-bindings: misc: add bindings for throttler

On Thu, May 31, 2018 at 1:34 PM, Matthias Kaehlcke <[email protected]> wrote:
> Hi Rob,
>
> On Thu, May 31, 2018 at 11:31:59AM -0500, Rob Herring wrote:
>> On Fri, May 25, 2018 at 01:30:42PM -0700, Matthias Kaehlcke wrote:
>>
>> Commit msg?
>
> Will add some more info in the next revision.
>
>> > Signed-off-by: Matthias Kaehlcke <[email protected]>
>> > ---
>> > .../devicetree/bindings/misc/throttler.txt | 41 +++++++++++++++++++
>> > 1 file changed, 41 insertions(+)
>> > create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt
>> >
>> > diff --git a/Documentation/devicetree/bindings/misc/throttler.txt b/Documentation/devicetree/bindings/misc/throttler.txt
>> > new file mode 100644
>> > index 000000000000..92f13e94451a
>> > --- /dev/null
>> > +++ b/Documentation/devicetree/bindings/misc/throttler.txt
>> > @@ -0,0 +1,41 @@
>> > +Throttler driver
>> > +
>> > +The Throttler is used for non-thermal throttling of system components like
>> > +CPUs or devfreq devices.
>>
>> This all looks very Linux specific and not a h/w device. Perhaps you can
>> add hint properties to OPP tables as to what entries can be used for
>> throttling, but otherwise this doesn't belong in DT.
>
> My idea is to allow multiple throttlers, which might operate on
> different throttling devices or use different OPPs for the same
> device. To support this a simple boolean hint that the OPP can be used
> for throttling would not be sufficient.
>
> What should work is a property with an array of phandles of the
> throttlers that use a given OPP.
>
> AFAIK it is currently not possible to enumerate the devfreq devices in
> the system, so besides the info in the OPPs the throttler itself would
> still need a phandle to the devfreq device(s) it uses.

Why don't you fix that OS problem instead of working around it in DT?

>
> I envision something like this:
>
> gpu_opp_table: opp-table2 {
> compatible = "operating-points-v2";
>
> opp00 {
> opp-hz = /bits/ 64 <200000000>;
> opp-microvolt = <800000>;
> };
> opp01 {
> opp-hz = /bits/ 64 <297000000>;
> opp-microvolt = <800000>;
> opp-throttlers = <&cros_ec_throttler>;
> };
> ...
> };
>
> cros_ec_throttler: cros-ec-throttler {
> compatible = "google,cros-ec-throttler";

Is this an actual h/w device?

>
> devfreq-devices = <&gpu>;
> };
>
> Would this be acceptable?
>
> Thanks
>
> Matthias

2018-05-31 21:11:35

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH 10/11] dt-bindings: misc: add bindings for throttler

On Thu, May 31, 2018 at 03:04:18PM -0500, Rob Herring wrote:
> On Thu, May 31, 2018 at 1:34 PM, Matthias Kaehlcke <[email protected]> wrote:
> > Hi Rob,
> >
> > On Thu, May 31, 2018 at 11:31:59AM -0500, Rob Herring wrote:
> >> On Fri, May 25, 2018 at 01:30:42PM -0700, Matthias Kaehlcke wrote:
> >>
> >> Commit msg?
> >
> > Will add some more info in the next revision.
> >
> >> > Signed-off-by: Matthias Kaehlcke <[email protected]>
> >> > ---
> >> > .../devicetree/bindings/misc/throttler.txt | 41 +++++++++++++++++++
> >> > 1 file changed, 41 insertions(+)
> >> > create mode 100644 Documentation/devicetree/bindings/misc/throttler.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/misc/throttler.txt b/Documentation/devicetree/bindings/misc/throttler.txt
> >> > new file mode 100644
> >> > index 000000000000..92f13e94451a
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/misc/throttler.txt
> >> > @@ -0,0 +1,41 @@
> >> > +Throttler driver
> >> > +
> >> > +The Throttler is used for non-thermal throttling of system components like
> >> > +CPUs or devfreq devices.
> >>
> >> This all looks very Linux specific and not a h/w device. Perhaps you can
> >> add hint properties to OPP tables as to what entries can be used for
> >> throttling, but otherwise this doesn't belong in DT.
> >
> > My idea is to allow multiple throttlers, which might operate on
> > different throttling devices or use different OPPs for the same
> > device. To support this a simple boolean hint that the OPP can be used
> > for throttling would not be sufficient.
> >
> > What should work is a property with an array of phandles of the
> > throttlers that use a given OPP.
> >
> > AFAIK it is currently not possible to enumerate the devfreq devices in
> > the system, so besides the info in the OPPs the throttler itself would
> > still need a phandle to the devfreq device(s) it uses.
>
> Why don't you fix that OS problem instead of working around it in DT?

I can try, though it's not exclusively in my hands, depends on what
the devfreq maintainers think about it.

> > I envision something like this:
> >
> > gpu_opp_table: opp-table2 {
> > compatible = "operating-points-v2";
> >
> > opp00 {
> > opp-hz = /bits/ 64 <200000000>;
> > opp-microvolt = <800000>;
> > };
> > opp01 {
> > opp-hz = /bits/ 64 <297000000>;
> > opp-microvolt = <800000>;
> > opp-throttlers = <&cros_ec_throttler>;
> > };
> > ...
> > };
> >
> > cros_ec_throttler: cros-ec-throttler {
> > compatible = "google,cros-ec-throttler";
>
> Is this an actual h/w device?

The Chrome OS Embedded Controller is a MCU that communicates with
Linux over SPI or I2C. The low-level communication with the EC is
handled by drivers/mfd/cros_ec* and then there are multiple drivers
representing 'remote devices' on top of this (rtc-cros-ec.c,
extcon-usbc-cros-ec.c, pwm-cros-ec.c, ...). cros_ec_throttler is a
similar 'device' that responds to signals from the EC with frequency
adjustments.

2018-06-05 09:42:26

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 02/11] PM / devfreq: Fix handling of min/max_freq == 0

Hi,

On 2018년 05월 31일 06:13, Matthias Kaehlcke wrote:
> On Wed, May 30, 2018 at 05:04:14PM +0900, Chanwoo Choi wrote:
>> Hi,
>>
>> On 2018년 05월 30일 03:57, Matthias Kaehlcke wrote:
>>> On Mon, May 28, 2018 at 03:37:47PM +0900, Chanwoo Choi wrote:
>>>> Hi,
>>>>
>>>> On 2018년 05월 26일 05:30, Matthias Kaehlcke wrote:
>>>>> Commit ab8f58ad72c4 ("PM / devfreq: Set min/max_freq when adding the
>>>>> devfreq device") initializes df->min/max_freq with the min/max OPP when
>>>>> the device is added. Later commit f1d981eaecf8 ("PM / devfreq: Use the
>>>>> available min/max frequency") adds df->scaling_min/max_freq and the
>>>>> following to the frequency adjustment code:
>>>>>
>>>>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>>>>>
>>>>> With the current handling of min/max_freq this is incorrect:
>>>>>
>>>>> Even though df->max_freq is now initialized to a value != 0 user space
>>>>> can still set it to 0, in this case max_freq would be 0 instead of
>>>>> df->scaling_max_freq as intended. In consequence the frequency adjustment
>>>>> is not performed:
>>>>>
>>>>> if (max_freq && freq > max_freq) {
>>>>> freq = max_freq;
>>>>>
>>>>> To fix this set df->min/max freq to the min/max OPP in max/max_freq_store,
>>>>> when the user passes a value of 0. This also prevents df->max_freq from
>>>>> being set below the min OPP when df->min_freq is 0, and similar for
>>>>> min_freq. Since it is now guaranteed that df->min/max_freq can't be 0 the
>>>>> checks for this case can be removed.
>>>>>
>>>>> Fixes: f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency")
>>>>> Signed-off-by: Matthias Kaehlcke <[email protected]>
>>>>> ---
>>>>> drivers/devfreq/devfreq.c | 30 ++++++++++++++++++------------
>>>>> 1 file changed, 18 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>>>>> index 0057ef5b0a98..67da4e7b486b 100644
>>>>> --- a/drivers/devfreq/devfreq.c
>>>>> +++ b/drivers/devfreq/devfreq.c
>>>>> @@ -283,11 +283,11 @@ int update_devfreq(struct devfreq *devfreq)
>>>>> max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
>>>>> min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
>>>>>
>>>>> - if (min_freq && freq < min_freq) {
>>>>> + if (freq < min_freq) {
>>>>> freq = min_freq;
>>>>> flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
>>>>> }
>>>>> - if (max_freq && freq > max_freq) {
>>>>> + if (freq > max_freq) {
>>>>> freq = max_freq;
>>>>> flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
>>>>> }
>>>>> @@ -1123,17 +1123,20 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
>>>>> struct devfreq *df = to_devfreq(dev);
>>>>> unsigned long value;
>>>>> int ret;
>>>>> - unsigned long max;
>>>>>
>>>>> ret = sscanf(buf, "%lu", &value);
>>>>> if (ret != 1)
>>>>> return -EINVAL;
>>>>>
>>>>> mutex_lock(&df->lock);
>>>>> - max = df->max_freq;
>>>>> - if (value && max && value > max) {
>>>>> - ret = -EINVAL;
>>>>> - goto unlock;
>>>>> +
>>>>> + if (value) {
>>>>> + if (value > df->max_freq) {
>>>>> + ret = -EINVAL;
>>>>> + goto unlock;
>>>>> + }
>>>>> + } else {
>>>>> + value = df->profile->freq_table[df->profile->max_state - 1];
>>>>> }
>>>>
>>>> If you want to prevent that df->min_freq is zero,
>>>> you should reinitialize 'value' as following.
>>>> Because freq_table must be in ascending order.
>>>> value = df->profile->freq_table[0];
>>>
>>> Thanks for pointing this out!
>>>
>>> The devfreq device I tested with (a Mali GPU) uses descending order
>>> for some reason, and I assumed that's the usual order.
>>>
>>> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c#208
>>>
>>> It seems the ordering doesn't have any impact beyond this patch. If
>>> the order isn't mandatory for drivers that set up their own freq_table
>>> we should probably support both cases to be safe.
>>
>> Prior to that 'freq_table' is optional. So, patch[1] initialize the 'freq_table'
>> by using OPP interface if 'freq_table' is NULL.
>> [1] commit 0ec09ac2cebe ("PM / devfreq: Set the freq_table of devfreq device")
>>
>> Current devfreq recommend the ascending order for 'freq_table'.
>> But, as you know, it might be not enough to support them.
>>
>> I agree that we should support the both cases (ascending or descending order).
>>
>> Maybe, it might be not proper to access the freq_table[] directly
>> because we don't know the ordering style of 'freq_table'
>> if 'freq_table' is made by devfreq user instead of devfreq core.
>
> If we can assume that it is either ascending or descending, but not
> random order than a simple check if freq_table[0] <
> freq_table[max_state - 1] would be sufficient.

Also, we should consider the order way of freq_table on available_frequency
because available_frequency have to show the frequency as the ascending order
even if freq_table uses the descending order.

>
> Otherwise we could also determine the min/max after initialization and
> save the result, though that would leave us with yet another frequency
> pair, which might be confusing, especially if we don't come up with
> good names to distinguish between them.

IMO, it might make the confusion if devfreq device has the two frequency
table.

--
Best Regards,
Chanwoo Choi
Samsung Electronics