This was submitted back in December and not picked up after review.
P.
-----8<----
The Intel ASDM documents a maximum time window that can be specified when
setting a time window in the RAPL driver. While the ASDM doesn't explicitly
provide a minimum time window value, it does provide a minimum time window
unit that can also be used as a minimum value.
This patchset implements bounds checking for the time windows, and adds
reporting of a known bug in which the maximum time window value may be
erroneously set to 0, as well as a module parameter to skip the maximum
window check on broken BIOSes.
Prarit Bhargava (3):
  powercap, intel_rapl, implement get_max_time_window
  powercap, intel_rapl, implement check for minimum time window
  powercap, intel_rapl, Add ignore_max_time_window_check module
    parameter for broken BIOSes
drivers/powercap/intel_rapl.c | 50 +++++++++++++++++++++++++++++++++++++++
drivers/powercap/powercap_sys.c | 6 +++--
2 files changed, 54 insertions(+), 2 deletions(-)
--
1.7.9.3
The MSR_PKG_POWER_INFO register (Intel ASDM, section 14.9.3
"Package RAPL Domain") provides a maximum time window which the
system can support. This window is read-only and is currently
not examined when setting the time windows for the package.
This patch implements the get_max_time_window_us() constraint callback and
checks the requested window against it when a user attempts to set a time
window for the package.
Before the patch it was possible to set the window to, for example, 10000
microseconds:
[root@intel-chiefriver-03 rhel7]# echo 10000 >
/sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:9765
but from 'turbostat -d', the package is limited to 976us:
cpu0: MSR_PKG_POWER_INFO: 0x01200168 (45 W TDP, RAPL 36 - 0 W, 0.000977 sec.)
(Note, there appears to be a rounding issue in turbostat which also needs
to be fixed. Looking at the values in the register it is clear the
value is 1/1024 s = 976us.)
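The numbers above can be reproduced with plain integer arithmetic. The
following is only a user-space sketch, not driver code: the helper names are
made up for illustration, and it assumes the time unit is 1/2^10 seconds as
stated in the note. It shows where 976 (truncation) and 0.000977 (printf
rounding) come from, and that the 9765 read back earlier is consistent with
10 whole time units.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: one time unit of 1/2^exponent seconds, in
 * microseconds, truncated by integer division. */
static uint64_t time_unit_us(unsigned int exponent)
{
	return 1000000ULL / (1ULL << exponent);
}

/* Hypothetical helper: number of whole time units in a window. */
static uint64_t window_to_units(uint64_t window_us, unsigned int exponent)
{
	return window_us * (1ULL << exponent) / 1000000ULL;
}

/* Hypothetical helper: convert whole time units back to microseconds. */
static uint64_t units_to_window_us(uint64_t units, unsigned int exponent)
{
	return units * 1000000ULL / (1ULL << exponent);
}

/* Format the unit in seconds with "%f", which rounds to six decimals. */
static void format_unit_seconds(unsigned int exponent, char *buf, size_t len)
{
	snprintf(buf, len, "%f", 1.0 / (double)(1ULL << exponent));
}
```

With an exponent of 10, time_unit_us() gives 976, format_unit_seconds()
gives "0.000977", and a 10000 us request maps to 10 units, which convert
back to 9765 us.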
After the patch we are limited by the maximum time window:
[root@intel-chiefriver-03 rhel7]# echo 10000 >
/sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
-bash: echo: write error: Invalid argument
/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:976
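The "Invalid argument" seen by the shell depends on the powercap_sys.c hunk
in this patch, which returns the callback's error instead of falling through
to -ENODATA. A simplified user-space model of that change, with hypothetical
names (set_fn, store_before, store_after, reject_above_976) standing in for
the macro-generated store function and the driver callback:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a constraint "set" callback:
 * returns 0 on success or a negative errno on failure. */
typedef int (*set_fn)(unsigned long long value);

/* Before the patch: a failing callback falls through to -ENODATA. */
static long store_before(set_fn set, unsigned long long value, size_t count)
{
	if (set) {
		if (!set(value))
			return (long)count;
	}
	return -ENODATA;
}

/* After the patch: the callback's own error code is returned. */
static long store_after(set_fn set, unsigned long long value, size_t count)
{
	int ret;

	if (set) {
		ret = set(value);
		if (!ret)
			return (long)count;
		return ret;
	}
	return -ENODATA;
}

/* Example callback: rejects windows above 976 us, the maximum
 * reported on the test system above. */
static int reject_above_976(unsigned long long value)
{
	return value > 976 ? -EINVAL : 0;
}
```

In this model a 10000 us request yields -ENODATA from store_before() and
-EINVAL from store_after(), while a missing callback still yields -ENODATA
in both.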
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Prarit Bhargava <[email protected]>
Cc: Radivoje Jovanovic <[email protected]>
Cc: Seiichi Ikarashi <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Ajay Thomas <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
---
drivers/powercap/intel_rapl.c | 31 +++++++++++++++++++++++++++++++
drivers/powercap/powercap_sys.c | 6 ++++--
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 6c592dc..feb063d 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -493,13 +493,42 @@ static int get_current_power_limit(struct powercap_zone *power_zone, int id,
return ret;
}
+static int get_max_time_window(struct powercap_zone *power_zone, int id,
+ u64 *data)
+{
+ struct rapl_domain *rd;
+ int ret = 0;
+ u64 val;
+
+ get_online_cpus();
+ rd = power_zone_to_rapl_domain(power_zone);
+
+ if (rapl_read_data_raw(rd, MAX_TIME_WINDOW, true, &val))
+ ret = -EIO;
+ else
+ *data = val;
+
+ put_online_cpus();
+ return ret;
+}
+
static int set_time_window(struct powercap_zone *power_zone, int id,
u64 window)
{
struct rapl_domain *rd;
int ret = 0;
+ u64 max_window;
get_online_cpus();
+ ret = get_max_time_window(power_zone, id, &max_window);
+ if (ret < 0)
+ goto out;
+
+ if (window > max_window) {
+ ret = -EINVAL;
+ goto out;
+ }
+
rd = power_zone_to_rapl_domain(power_zone);
switch (rd->rpl[id].prim_id) {
case PL1_ENABLE:
@@ -511,6 +540,7 @@ static int set_time_window(struct powercap_zone *power_zone, int id,
default:
ret = -EINVAL;
}
+out:
put_online_cpus();
return ret;
}
@@ -590,6 +620,7 @@ static const struct powercap_zone_constraint_ops constraint_ops = {
.set_time_window_us = set_time_window,
.get_time_window_us = get_time_window,
.get_max_power_uw = get_max_power,
+ .get_max_time_window_us = get_max_time_window,
.get_name = get_constraint_name,
};
diff --git a/drivers/powercap/powercap_sys.c b/drivers/powercap/powercap_sys.c
index 14bde0d..53fad0f 100644
--- a/drivers/powercap/powercap_sys.c
+++ b/drivers/powercap/powercap_sys.c
@@ -101,7 +101,7 @@ static ssize_t store_constraint_##_attr(struct device *dev,\
int err; \
u64 value; \
struct powercap_zone *power_zone = to_powercap_zone(dev); \
- int id; \
+ int id, ret; \
struct powercap_zone_constraint *pconst;\
\
if (!sscanf(dev_attr->attr.name, "constraint_%d_", &id)) \
@@ -113,8 +113,10 @@ static ssize_t store_constraint_##_attr(struct device *dev,\
if (err) \
return -EINVAL; \
if (pconst && pconst->ops && pconst->ops->set_##_attr) { \
- if (!pconst->ops->set_##_attr(power_zone, id, value)) \
+ ret = pconst->ops->set_##_attr(power_zone, id, value); \
+ if (!ret) \
return count; \
+ return ret; \
} \
\
return -ENODATA; \
--
1.7.9.3
Some systems erroneously set the maximum time window field of
MSR_PKG_POWER_INFO register to 0. This results in a user not being able
to set the time windows for the package. In some cases, however, RAPL
will still continue to work with a small window (albeit through some
trial and error). This patch adds an ignore_max_time_window_check module
parameter to avoid the maximum time window check in set_time_window().
[v2]: change name to max_time_window_check, fix (val == 0) check
[v3]: fix typo in debug message
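To illustrate why a zero maximum blocks every request, and what the
parameter changes, here is a small user-space model of the upper-bound half
of the check only. The function name check_max_window is hypothetical and
the lower-bound test against the time unit is deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the upper-bound test in set_time_window():
 * the comparison against the maximum is skipped when ignore_max is set. */
static int check_max_window(uint64_t window_us, uint64_t max_window_us,
			    bool ignore_max)
{
	if (!ignore_max && window_us > max_window_us)
		return -EINVAL;
	return 0;
}
```

With max_window_us == 0, any non-zero window fails this test unless
ignore_max is set, which matches the symptom described above.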
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Prarit Bhargava <[email protected]>
Cc: Radivoje Jovanovic <[email protected]>
Cc: Seiichi Ikarashi <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Ajay Thomas <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
---
drivers/powercap/intel_rapl.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index cf89b3d..87dac13 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -505,13 +505,24 @@ static int get_max_time_window(struct powercap_zone *power_zone, int id,
if (rapl_read_data_raw(rd, MAX_TIME_WINDOW, true, &val))
ret = -EIO;
- else
+ else {
*data = val;
-
+ if (val == 0)
+ pr_warn_once(FW_BUG "intel_rapl: Maximum Time Window is zero. This is a BIOS bug that should be reported to your hardware or BIOS vendor. The value of zero may prevent Intel RAPL from functioning properly. Most bugs can be avoided by setting the ignore_max_time_window_check module parameter.\n");
+ }
put_online_cpus();
return ret;
}
+/* Some BIOSes incorrectly program the maximum time window in the
+ * MSR_PKG_POWER_INFO register. Some of these systems still have functional
+ * RAPL registers, etc., so give the user the option of disabling the maximum
+ * time window check.
+ */
+static int ignore_max_time_window_check;
+module_param(ignore_max_time_window_check, int, 0444);
+MODULE_PARM_DESC(ignore_max_time_window_check, "Ignore maximum time window check. A bug should be reported to your hardware or BIOS vendor if this parameter is used.");
+
static int set_time_window(struct powercap_zone *power_zone, int id,
u64 window)
{
@@ -532,7 +543,8 @@ static int set_time_window(struct powercap_zone *power_zone, int id,
* The MSR_RAPL_POWER_UNIT register, read during initialization,
* does contain the smallest unit of time that can be measured.
*/
- if ((window > max_window) || (window < rp->time_unit)) {
+ if ((!ignore_max_time_window_check && (window > max_window)) ||
+ (window < rp->time_unit)) {
ret = -EINVAL;
goto out;
}
--
1.7.9.3
Using a small value for the time window results in a
bogus value being reported for the time window. For example,
[root@intel-chiefriver-03 linux]# echo 950 >
/sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
-bash: echo: write error: Invalid argument
/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:4501502475370496
The Intel ASDM doesn't explicitly define a minimum time window.
The MSR_RAPL_POWER_UNIT register, read during initialization, does
specify a minimum time window unit, which can be used as a lower
bound for error checking.
After this change a window below the minimum is rejected and the reported
time window stays sane:
[root@intel-chiefriver-03 linux]# echo 950 >
/sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
-bash: echo: write error: Invalid argument
/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:976
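The combined check added here can be modelled in user space as below. This
is a sketch under assumptions, not the driver code: check_window_bounds is a
made-up name, and both the time unit and the maximum are taken to be 976 us
as on the test system above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the bounds test: reject a window larger than
 * the maximum or smaller than one time unit. */
static int check_window_bounds(uint64_t window_us, uint64_t time_unit_us,
			       uint64_t max_window_us)
{
	if (window_us > max_window_us || window_us < time_unit_us)
		return -EINVAL;
	return 0;
}
```

With those values, 950 is rejected because it is below one time unit, 10000
is rejected because it exceeds the maximum, and 976 is accepted.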
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Prarit Bhargava <[email protected]>
Cc: Radivoje Jovanovic <[email protected]>
Cc: Seiichi Ikarashi <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Ajay Thomas <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
---
drivers/powercap/intel_rapl.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index feb063d..cf89b3d 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -516,6 +516,7 @@ static int set_time_window(struct powercap_zone *power_zone, int id,
u64 window)
{
struct rapl_domain *rd;
+ struct rapl_package *rp;
int ret = 0;
u64 max_window;
@@ -524,12 +525,18 @@ static int set_time_window(struct powercap_zone *power_zone, int id,
if (ret < 0)
goto out;
- if (window > max_window) {
+ rd = power_zone_to_rapl_domain(power_zone);
+ rp = find_package_by_id(rd->package_id);
+ /*
+ * The Intel ASDM doesn't explicitly define a minimum time window.
+ * The MSR_RAPL_POWER_UNIT register, read during initialization,
+ * does contain the smallest unit of time that can be measured.
+ */
+ if ((window > max_window) || (window < rp->time_unit)) {
ret = -EINVAL;
goto out;
}
- rd = power_zone_to_rapl_domain(power_zone);
switch (rd->rpl[id].prim_id) {
case PL1_ENABLE:
rapl_write_data_raw(rd, TIME_WINDOW1, window);
--
1.7.9.3
On 3/16/2016 1:00 PM, Prarit Bhargava wrote:
> The MSR_PKG_POWER_INFO register (Intel ASDM, section 14.9.3
> "Package RAPL Domain") provides a maximum time window which the
> system can support. This window is read-only and is currently
> not examined when setting the time windows for the package.
>
> This patch implements get_max_time_window_us() and checks the window when
> a user attempts to set power capping for the package.
>
> Before the patch it was possible to set the window to, for example, 10000
> micro seconds:
>
> [root@intel-chiefriver-03 rhel7]# echo 10000 >
> /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
> egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
>
> /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:9765
>
> but from 'turbostat -d', the package is limited to 976us:
>
> cpu0: MSR_PKG_POWER_INFO: 0x01200168 (45 W TDP, RAPL 36 - 0 W, 0.000977 sec.)
>
> (Note, there appears to be a rounding issue in turbostat which needs to
> also be fixed. Looking at the values in the register it is clear the
> value is 1/1024 = 976us.)
>
> After the patch we are limited by the maximum time window:
>
> [root@intel-chiefriver-03 rhel7]# echo 10000 >
> /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us;
> egrep ^ /sys/devices/virtual/powercap/intel-rapl/intel-rapl\:0/constraint_0_time_window_us
>
> -bash: echo: write error: Invalid argument
> /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us:1:976
>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Prarit Bhargava <[email protected]>
> Cc: Radivoje Jovanovic <[email protected]>
> Cc: Seiichi Ikarashi <[email protected]>
> Cc: Mathias Krause <[email protected]>
> Cc: Ajay Thomas <[email protected]>
> Signed-off-by: Prarit Bhargava <[email protected]>
Can you please resend the patches with CCs to [email protected]?
They are much easier for me to handle that way.
Thanks,
Rafael