From: "Gautham R. Shenoy" <[email protected]>
Hi,
This is the fifth version of the patches to track and expose idle PURR
and SPURR ticks. These patches are required by tools such as lparstat
to compute system utilization for capacity planning purposes.
The previous versions can be found here:
v4: https://lkml.org/lkml/2020/3/27/323
v3: https://lkml.org/lkml/2020/3/11/331
v2: https://lkml.org/lkml/2020/2/21/21
v1: https://lore.kernel.org/patchwork/cover/1159341/
The changes from v4 are:
- As suggested by Naveen, moved the functions read_this_idle_purr()
and read_this_idle_spurr() from Patch 2 and Patch 3 respectively
to Patch 4 where they are invoked.
- Dropped Patch 6 which cached the values of purr, spurr,
idle_purr, idle_spurr in order to minimize the number of IPIs
sent.
- Updated the dates for the idle_purr, idle_spurr in the
Documentation Patch 5.
Motivation:
===========
On PSeries LPARs, data center planners want a more accurate view of
system utilization per resource, such as CPU, to plan system capacity
requirements better. Such accuracy can be obtained by reading the
PURR/SPURR registers for CPU resource utilization.
Tools such as lparstat which are used to compute the utilization need
to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR
counters are already exposed through sysfs. We already account for
PURR ticks when we go to idle so that we can update the VPA area. This
patchset extends support to account for SPURR ticks when idle, and
expose both via per-cpu sysfs files.
These patches are required for an enhancement to the lparstat utility
that computes CPU utilization based on PURR and SPURR, which can be
found here:
https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4
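To make the use case concrete, here is a minimal userspace sketch of how a tool could turn two samples of the "purr" and "idle_purr" counters into a busy percentage. The struct and function names are hypothetical, and this is not claimed to be the formula lparstat uses; it only illustrates why both the total and the idle tick counts are needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sample of one CPU's counters, e.g. as parsed from the
 * per-cpu "purr" and "idle_purr" sysfs files at one point in time. */
struct purr_sample {
	uint64_t purr;       /* total PURR ticks since boot */
	uint64_t idle_purr;  /* PURR ticks accumulated while idle */
};

/* Percentage of the PURR ticks elapsed between two samples that were
 * not idle. Returns 0.0 if no ticks elapsed. Unsigned subtraction
 * keeps the deltas correct across a counter wraparound. */
static double purr_busy_percent(const struct purr_sample *prev,
				const struct purr_sample *cur)
{
	uint64_t total, idle;

	assert(prev && cur);
	total = cur->purr - prev->purr;
	idle = cur->idle_purr - prev->idle_purr;

	if (total == 0)
		return 0.0;
	/* The two files are not read atomically, so clamp the idle delta. */
	if (idle > total)
		idle = total;
	return 100.0 * (double)(total - idle) / (double)total;
}
```

The same arithmetic would apply to "spurr" and "idle_spurr" for a frequency-scaled view.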
With the patches, when lparstat is run on an LPAR running CPU hogs:
=========================================================================
sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
1 99.99 0.00 3.35GHz[111%] 110.99 0.00
2 100.00 0.00 3.35GHz[111%] 111.01 0.00
3 100.00 0.00 3.35GHz[111%] 111.00 0.00
With the patches, when lparstat is run on an idle LPAR:
=========================================================================
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
1 0.15 99.84 2.17GHz[ 72%] 0.11 71.89
2 0.24 99.76 2.11GHz[ 70%] 0.18 69.82
3 0.24 99.75 2.11GHz[ 70%] 0.18 69.81
Gautham R. Shenoy (5):
powerpc: Move idle_loop_prolog()/epilog() functions to header file
powerpc/idle: Store PURR snapshot in a per-cpu global variable
powerpc/pseries: Account for SPURR ticks on idle CPUs
powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
Documentation: Document sysfs interfaces purr, spurr, idle_purr,
idle_spurr
Documentation/ABI/testing/sysfs-devices-system-cpu | 39 +++++++++
arch/powerpc/include/asm/idle.h | 93 ++++++++++++++++++++++
arch/powerpc/kernel/sysfs.c | 82 ++++++++++++++++++-
arch/powerpc/platforms/pseries/setup.c | 8 +-
drivers/cpuidle/cpuidle-pseries.c | 39 ++-------
5 files changed, 224 insertions(+), 37 deletions(-)
create mode 100644 arch/powerpc/include/asm/idle.h
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.
Via pseries_idle_prolog() and pseries_idle_epilog(), we track the idle
PURR ticks in the VPA variable "wait_state_cycles". This patch extends
the support to account for the idle SPURR ticks.
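The accounting pattern is a snapshot at idle entry and an accumulate at idle exit. A userspace model of it, with the counter value passed in instead of read via mfspr(SPRN_SPURR), might look like the sketch below. The names idle_acct, idle_enter() and idle_exit() are made up for illustration and stand in for the per-cpu variables and helpers added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for one CPU's idle accounting state. */
struct idle_acct {
	uint64_t entry_snap;   /* counter value latched at idle entry */
	uint64_t idle_cycles;  /* running total of ticks spent idle */
};

/* Idle entry: latch the current counter value
 * (models snapshot_spurr_idle_entry()). */
static void idle_enter(struct idle_acct *a, uint64_t counter_now)
{
	assert(a);
	a->entry_snap = counter_now;
}

/* Idle exit: add the ticks elapsed since entry to the running total
 * (models update_idle_spurr_accounting()). */
static void idle_exit(struct idle_acct *a, uint64_t counter_now)
{
	assert(a);
	a->idle_cycles += counter_now - a->entry_snap;
}
```

Two idle periods of 60 and 30 ticks leave idle_cycles at 90, regardless of how many ticks passed while the CPU was busy in between.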
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/idle.h | 17 +++++++++++++++++
arch/powerpc/platforms/pseries/setup.c | 2 ++
2 files changed, 19 insertions(+)
diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index b90d75a..0efb250 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -5,13 +5,20 @@
#include <asm/paca.h>
#ifdef CONFIG_PPC_PSERIES
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+DECLARE_PER_CPU(u64, idle_entry_spurr_snap);
static inline void snapshot_purr_idle_entry(void)
{
*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
}
+static inline void snapshot_spurr_idle_entry(void)
+{
+ *this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
+}
+
static inline void update_idle_purr_accounting(void)
{
u64 wait_cycles;
@@ -22,10 +29,19 @@ static inline void update_idle_purr_accounting(void)
get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
}
+static inline void update_idle_spurr_accounting(void)
+{
+ u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
+ u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
+
+ *idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
+}
+
static inline void pseries_idle_prolog(void)
{
ppc64_runlatch_off();
snapshot_purr_idle_entry();
+ snapshot_spurr_idle_entry();
/*
* Indicate to the HV that we are idle. Now would be
* a good time to find other work to dispatch.
@@ -36,6 +52,7 @@ static inline void pseries_idle_prolog(void)
static inline void pseries_idle_epilog(void)
{
update_idle_purr_accounting();
+ update_idle_spurr_accounting();
get_lppaca()->idle = 0;
ppc64_runlatch_on();
}
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 4905c96..1b55e80 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,7 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
}
machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
+DEFINE_PER_CPU(u64, idle_spurr_cycles);
DEFINE_PER_CPU(u64, idle_entry_purr_snap);
+DEFINE_PER_CPU(u64, idle_entry_spurr_snap);
static void pseries_lpar_idle(void)
{
/*
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.
The total PURR and SPURR ticks are already exposed via the per-cpu
sysfs files "purr" and "spurr". This patch adds support for exposing
the idle PURR and SPURR ticks via new per-cpu sysfs files named
"idle_purr" and "idle_spurr".
This patch also adds helper functions to accurately read the values of
idle_purr and idle_spurr, especially from an interrupt context when the
interrupt has occurred between pseries_idle_prolog() and
pseries_idle_epilog(). This ensures that the idle PURR/SPURR ticks
corresponding to the latest idle period are accounted for before these
values are read.
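The idea behind the read helpers can be modelled in userspace as follows. This is only a sketch: idle_state and read_idle_cycles() are hypothetical names, and the counter is passed in as an argument where the kernel code reads the register directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for one CPU's state. */
struct idle_state {
	bool     idle;         /* models get_lppaca()->idle */
	uint64_t entry_snap;   /* counter value at idle entry */
	uint64_t idle_cycles;  /* idle ticks accounted so far */
};

/* Return the idle ticks including any in-progress idle period.
 * If the CPU is still idle, fold the ticks elapsed so far into the
 * total and move the snapshot forward, so that the eventual idle-exit
 * accounting does not count the same ticks twice. */
static uint64_t read_idle_cycles(struct idle_state *s, uint64_t counter_now)
{
	assert(s);
	if (s->idle) {
		s->idle_cycles += counter_now - s->entry_snap;
		s->entry_snap = counter_now;
	}
	return s->idle_cycles;
}
```

Re-taking the snapshot is what keeps the sum correct: whatever the epilog later adds is measured from the point of the last read, not from the original idle entry.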
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/idle.h | 32 ++++++++++++++++
arch/powerpc/kernel/sysfs.c | 82 +++++++++++++++++++++++++++++++++++++++--
2 files changed, 111 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index 0efb250..accd1f5 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -57,5 +57,37 @@ static inline void pseries_idle_epilog(void)
ppc64_runlatch_on();
}
+static inline u64 read_this_idle_purr(void)
+{
+ /*
+ * If we are reading from an idle context, update the
+ * idle-purr cycles corresponding to the last idle period.
+ * Since the idle context is not yet over, take a fresh
+ * snapshot of the idle-purr.
+ */
+ if (unlikely(get_lppaca()->idle == 1)) {
+ update_idle_purr_accounting();
+ snapshot_purr_idle_entry();
+ }
+
+ return be64_to_cpu(get_lppaca()->wait_state_cycles);
+}
+
+static inline u64 read_this_idle_spurr(void)
+{
+ /*
+ * If we are reading from an idle context, update the
+ * idle-spurr cycles corresponding to the last idle period.
+ * Since the idle context is not yet over, take a fresh
+ * snapshot of the idle-spurr.
+ */
+ if (get_lppaca()->idle == 1) {
+ update_idle_spurr_accounting();
+ snapshot_spurr_idle_entry();
+ }
+
+ return *this_cpu_ptr(&idle_spurr_cycles);
+}
+
#endif /* CONFIG_PPC_PSERIES */
#endif
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 479c706..571b325 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -19,6 +19,7 @@
#include <asm/smp.h>
#include <asm/pmc.h>
#include <asm/firmware.h>
+#include <asm/idle.h>
#include <asm/svm.h>
#include "cacheinfo.h"
@@ -760,6 +761,74 @@ static void create_svm_file(void)
}
#endif /* CONFIG_PPC_SVM */
+#ifdef CONFIG_PPC_PSERIES
+static void read_idle_purr(void *val)
+{
+ u64 *ret = val;
+
+ *ret = read_this_idle_purr();
+}
+
+static ssize_t idle_purr_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct cpu *cpu = container_of(dev, struct cpu, dev);
+ u64 val;
+
+ smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
+ return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
+
+static void create_idle_purr_file(struct device *s)
+{
+ if (firmware_has_feature(FW_FEATURE_LPAR))
+ device_create_file(s, &dev_attr_idle_purr);
+}
+
+static void remove_idle_purr_file(struct device *s)
+{
+ if (firmware_has_feature(FW_FEATURE_LPAR))
+ device_remove_file(s, &dev_attr_idle_purr);
+}
+
+static void read_idle_spurr(void *val)
+{
+ u64 *ret = val;
+
+ *ret = read_this_idle_spurr();
+}
+
+static ssize_t idle_spurr_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct cpu *cpu = container_of(dev, struct cpu, dev);
+ u64 val;
+
+ smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
+ return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
+
+static void create_idle_spurr_file(struct device *s)
+{
+ if (firmware_has_feature(FW_FEATURE_LPAR))
+ device_create_file(s, &dev_attr_idle_spurr);
+}
+
+static void remove_idle_spurr_file(struct device *s)
+{
+ if (firmware_has_feature(FW_FEATURE_LPAR))
+ device_remove_file(s, &dev_attr_idle_spurr);
+}
+
+#else /* CONFIG_PPC_PSERIES */
+#define create_idle_purr_file(s)
+#define remove_idle_purr_file(s)
+#define create_idle_spurr_file(s)
+#define remove_idle_spurr_file(s)
+#endif /* CONFIG_PPC_PSERIES */
+
static int register_cpu_online(unsigned int cpu)
{
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -823,10 +892,13 @@ static int register_cpu_online(unsigned int cpu)
if (!firmware_has_feature(FW_FEATURE_LPAR))
add_write_permission_dev_attr(&dev_attr_purr);
device_create_file(s, &dev_attr_purr);
+ create_idle_purr_file(s);
}
- if (cpu_has_feature(CPU_FTR_SPURR))
+ if (cpu_has_feature(CPU_FTR_SPURR)) {
device_create_file(s, &dev_attr_spurr);
+ create_idle_spurr_file(s);
+ }
if (cpu_has_feature(CPU_FTR_DSCR))
device_create_file(s, &dev_attr_dscr);
@@ -910,11 +982,15 @@ static int unregister_cpu_online(unsigned int cpu)
device_remove_file(s, &dev_attr_mmcra);
#endif /* CONFIG_PMU_SYSFS */
- if (cpu_has_feature(CPU_FTR_PURR))
+ if (cpu_has_feature(CPU_FTR_PURR)) {
device_remove_file(s, &dev_attr_purr);
+ remove_idle_purr_file(s);
+ }
- if (cpu_has_feature(CPU_FTR_SPURR))
+ if (cpu_has_feature(CPU_FTR_SPURR)) {
device_remove_file(s, &dev_attr_spurr);
+ remove_idle_spurr_file(s);
+ }
if (cpu_has_feature(CPU_FTR_DSCR))
device_remove_file(s, &dev_attr_dscr);
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
Currently, when a CPU goes idle, we take a snapshot of PURR in
pseries_idle_prolog(), which pseries_idle_epilog() uses at idle exit to
compute the idle PURR cycles. Thus, the value of idle PURR cycles read
before pseries_idle_prolog() or after pseries_idle_epilog() is always
correct.
However, if we were to read the idle PURR cycles from an interrupt
context between pseries_idle_prolog() and pseries_idle_epilog() (this
will be done in a future patch), then, the value of the idle PURR thus
read will not include the cycles spent in the most recent idle period.
Thus, in that interrupt context, we will need access to the snapshot
of the PURR before going idle, in order to compute the idle PURR
cycles for the latest idle duration.
In this patch, we save the snapshot of PURR in pseries_idle_prolog()
in a per-cpu variable, instead of on the stack, so that it can be
accessed from an interrupt context.
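One detail worth spelling out: the accumulated idle PURR lives in the VPA field wait_state_cycles, which is kept big-endian, hence the be64_to_cpu()/cpu_to_be64() pair in update_idle_purr_accounting() in the diff below. A portable userspace sketch of that read-modify-write on an 8-byte big-endian field is shown here; the helper names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Decode 8 bytes stored most-significant byte first. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Encode a 64-bit value into 8 bytes, most-significant byte first. */
static void store_be64(unsigned char *p, uint64_t v)
{
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/* Add delta to a big-endian 64-bit field in place: convert to host
 * order, add, convert back. */
static void be64_field_add(unsigned char *field, uint64_t delta)
{
	assert(field);
	store_be64(field, load_be64(field) + delta);
}
```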
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/idle.h | 31 ++++++++++++++++++++++---------
arch/powerpc/platforms/pseries/setup.c | 7 +++----
drivers/cpuidle/cpuidle-pseries.c | 15 ++++++---------
3 files changed, 31 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index 32064a4c..b90d75a 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -5,10 +5,27 @@
#include <asm/paca.h>
#ifdef CONFIG_PPC_PSERIES
-static inline void pseries_idle_prolog(unsigned long *in_purr)
+DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+
+static inline void snapshot_purr_idle_entry(void)
+{
+ *this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
+}
+
+static inline void update_idle_purr_accounting(void)
+{
+ u64 wait_cycles;
+ u64 in_purr = *this_cpu_ptr(&idle_entry_purr_snap);
+
+ wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+ wait_cycles += mfspr(SPRN_PURR) - in_purr;
+ get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+}
+
+static inline void pseries_idle_prolog(void)
{
ppc64_runlatch_off();
- *in_purr = mfspr(SPRN_PURR);
+ snapshot_purr_idle_entry();
/*
* Indicate to the HV that we are idle. Now would be
* a good time to find other work to dispatch.
@@ -16,16 +33,12 @@ static inline void pseries_idle_prolog(unsigned long *in_purr)
get_lppaca()->idle = 1;
}
-static inline void pseries_idle_epilog(unsigned long in_purr)
+static inline void pseries_idle_epilog(void)
{
- u64 wait_cycles;
-
- wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
- wait_cycles += mfspr(SPRN_PURR) - in_purr;
- get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+ update_idle_purr_accounting();
get_lppaca()->idle = 0;
-
ppc64_runlatch_on();
}
+
#endif /* CONFIG_PPC_PSERIES */
#endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 2f53e6b..4905c96 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,10 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
}
machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
+DEFINE_PER_CPU(u64, idle_entry_purr_snap);
static void pseries_lpar_idle(void)
{
- unsigned long in_purr;
-
/*
* Default handler to go into low thread priority and possibly
* low power mode by ceding processor to hypervisor
@@ -331,7 +330,7 @@ static void pseries_lpar_idle(void)
return;
/* Indicate to hypervisor that we are idle. */
- pseries_idle_prolog(&in_purr);
+ pseries_idle_prolog();
/*
* Yield the processor to the hypervisor. We return if
@@ -342,7 +341,7 @@ static void pseries_lpar_idle(void)
*/
cede_processor();
- pseries_idle_epilog(in_purr);
+ pseries_idle_epilog();
}
/*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 46d5e05..6513ef2 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -36,12 +36,11 @@ static int snooze_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- unsigned long in_purr;
u64 snooze_exit_time;
set_thread_flag(TIF_POLLING_NRFLAG);
- pseries_idle_prolog(&in_purr);
+ pseries_idle_prolog();
local_irq_enable();
snooze_exit_time = get_tb() + snooze_timeout;
@@ -65,7 +64,7 @@ static int snooze_loop(struct cpuidle_device *dev,
local_irq_disable();
- pseries_idle_epilog(in_purr);
+ pseries_idle_epilog();
return index;
}
@@ -91,9 +90,8 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- unsigned long in_purr;
- pseries_idle_prolog(&in_purr);
+ pseries_idle_prolog();
get_lppaca()->donate_dedicated_cpu = 1;
HMT_medium();
@@ -102,7 +100,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
local_irq_disable();
get_lppaca()->donate_dedicated_cpu = 0;
- pseries_idle_epilog(in_purr);
+ pseries_idle_epilog();
return index;
}
@@ -111,9 +109,8 @@ static int shared_cede_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- unsigned long in_purr;
- pseries_idle_prolog(&in_purr);
+ pseries_idle_prolog();
/*
* Yield the processor to the hypervisor. We return if
@@ -125,7 +122,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
check_and_cede_processor();
local_irq_disable();
- pseries_idle_epilog(in_purr);
+ pseries_idle_epilog();
return index;
}
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
Add documentation for the following sysfs interfaces:
/sys/devices/system/cpu/cpuX/purr
/sys/devices/system/cpu/cpuX/spurr
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr
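For userspace consumers: the idle_purr_show() and idle_spurr_show() handlers in this series print the value with "%llx\n", i.e. hexadecimal without a "0x" prefix, followed by a newline, and the files are created with mode 0400. A minimal sketch of parsing such a string is given below; parse_hex_counter() is a hypothetical helper name, and reading the file into a buffer is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse the text read from one of these files: hex digits optionally
 * followed by a single trailing newline. Returns 0 on success and
 * stores the value in *out, or -1 if the text is malformed. */
static int parse_hex_counter(const char *text, uint64_t *out)
{
	char *end;
	unsigned long long v;

	assert(text && out);
	errno = 0;
	v = strtoull(text, &end, 16);
	if (errno != 0 || end == text)
		return -1;	/* overflow, or no digits at all */
	if (*end != '\n' && *end != '\0')
		return -1;	/* unexpected trailing characters */
	*out = (uint64_t)v;
	return 0;
}
```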
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b4..b73b8b5 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -580,3 +580,42 @@ Description: Secure Virtual Machine
If 1, it means the system is using the Protected Execution
Facility in POWER9 and newer processors. i.e., it is a Secure
Virtual Machine.
+
+What: /sys/devices/system/cpu/cpuX/purr
+Date: Apr 2005
+Contact: Linux for PowerPC mailing list <[email protected]>
+Description: PURR ticks for this CPU since the system boot.
+
+ The Processor Utilization Resources Register (PURR) is
+ a 64-bit counter which provides an estimate of the
+ resources used by the CPU thread. The contents of this
+ register increases monotonically. This sysfs interface
+ exposes the number of PURR ticks for cpuX.
+
+What: /sys/devices/system/cpu/cpuX/spurr
+Date: Dec 2006
+Contact: Linux for PowerPC mailing list <[email protected]>
+Description: SPURR ticks for this CPU since the system boot.
+
+ The Scaled Processor Utilization Resources Register
+ (SPURR) is a 64-bit counter that provides a frequency
+ invariant estimate of the resources used by the CPU
+ thread. The contents of this register increases
+ monotonically. This sysfs interface exposes the number
+ of SPURR ticks for cpuX.
+
+What: /sys/devices/system/cpu/cpuX/idle_purr
+Date: Apr 2020
+Contact: Linux for PowerPC mailing list <[email protected]>
+Description: PURR ticks for cpuX when it was idle.
+
+ This sysfs interface exposes the number of PURR ticks
+ for cpuX when it was idle.
+
+What: /sys/devices/system/cpu/cpuX/idle_spurr
+Date: Apr 2020
+Contact: Linux for PowerPC mailing list <[email protected]>
+Description: SPURR ticks for cpuX when it was idle.
+
+ This sysfs interface exposes the number of SPURR ticks
+ for cpuX when it was idle.
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
Currently, the pseries cpuidle driver implements idle_loop_prolog() and
idle_loop_epilog() functions, invoked around each idle state on a Linux
guest, which ensure that idle_purr is correctly computed and that the
hypervisor is informed that the CPU cycles have been donated.
These prolog and epilog functions are also required in the default
idle call, i.e. pseries_lpar_idle(). Hence move these accessor
functions to a common header file and call them from
pseries_lpar_idle(). Since the existing header files such as
asm/processor.h have enough clutter, create a new header file
asm/idle.h. Finally, rename idle_loop_prolog() and idle_loop_epilog()
to pseries_idle_prolog() and pseries_idle_epilog(), as they are only
relevant on pseries guests.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/idle.h | 31 +++++++++++++++++++++++++++++
arch/powerpc/platforms/pseries/setup.c | 7 +++++--
drivers/cpuidle/cpuidle-pseries.c | 36 +++++++---------------------------
3 files changed, 43 insertions(+), 31 deletions(-)
create mode 100644 arch/powerpc/include/asm/idle.h
diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
new file mode 100644
index 0000000..32064a4c
--- /dev/null
+++ b/arch/powerpc/include/asm/idle.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_POWERPC_IDLE_H
+#define _ASM_POWERPC_IDLE_H
+#include <asm/runlatch.h>
+#include <asm/paca.h>
+
+#ifdef CONFIG_PPC_PSERIES
+static inline void pseries_idle_prolog(unsigned long *in_purr)
+{
+ ppc64_runlatch_off();
+ *in_purr = mfspr(SPRN_PURR);
+ /*
+ * Indicate to the HV that we are idle. Now would be
+ * a good time to find other work to dispatch.
+ */
+ get_lppaca()->idle = 1;
+}
+
+static inline void pseries_idle_epilog(unsigned long in_purr)
+{
+ u64 wait_cycles;
+
+ wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+ wait_cycles += mfspr(SPRN_PURR) - in_purr;
+ get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+ get_lppaca()->idle = 0;
+
+ ppc64_runlatch_on();
+}
+#endif /* CONFIG_PPC_PSERIES */
+#endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0c8421d..2f53e6b 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -68,6 +68,7 @@
#include <asm/isa-bridge.h>
#include <asm/security_features.h>
#include <asm/asm-const.h>
+#include <asm/idle.h>
#include <asm/swiotlb.h>
#include <asm/svm.h>
@@ -319,6 +320,8 @@ static int alloc_dispatch_log_kmem_cache(void)
static void pseries_lpar_idle(void)
{
+ unsigned long in_purr;
+
/*
* Default handler to go into low thread priority and possibly
* low power mode by ceding processor to hypervisor
@@ -328,7 +331,7 @@ static void pseries_lpar_idle(void)
return;
/* Indicate to hypervisor that we are idle. */
- get_lppaca()->idle = 1;
+ pseries_idle_prolog(&in_purr);
/*
* Yield the processor to the hypervisor. We return if
@@ -339,7 +342,7 @@ static void pseries_lpar_idle(void)
*/
cede_processor();
- get_lppaca()->idle = 0;
+ pseries_idle_epilog(in_purr);
}
/*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 74c2479..46d5e05 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -19,6 +19,7 @@
#include <asm/machdep.h>
#include <asm/firmware.h>
#include <asm/runlatch.h>
+#include <asm/idle.h>
#include <asm/plpar_wrappers.h>
struct cpuidle_driver pseries_idle_driver = {
@@ -31,29 +32,6 @@ struct cpuidle_driver pseries_idle_driver = {
static u64 snooze_timeout __read_mostly;
static bool snooze_timeout_en __read_mostly;
-static inline void idle_loop_prolog(unsigned long *in_purr)
-{
- ppc64_runlatch_off();
- *in_purr = mfspr(SPRN_PURR);
- /*
- * Indicate to the HV that we are idle. Now would be
- * a good time to find other work to dispatch.
- */
- get_lppaca()->idle = 1;
-}
-
-static inline void idle_loop_epilog(unsigned long in_purr)
-{
- u64 wait_cycles;
-
- wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
- wait_cycles += mfspr(SPRN_PURR) - in_purr;
- get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
- get_lppaca()->idle = 0;
-
- ppc64_runlatch_on();
-}
-
static int snooze_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
@@ -63,7 +41,7 @@ static int snooze_loop(struct cpuidle_device *dev,
set_thread_flag(TIF_POLLING_NRFLAG);
- idle_loop_prolog(&in_purr);
+ pseries_idle_prolog(&in_purr);
local_irq_enable();
snooze_exit_time = get_tb() + snooze_timeout;
@@ -87,7 +65,7 @@ static int snooze_loop(struct cpuidle_device *dev,
local_irq_disable();
- idle_loop_epilog(in_purr);
+ pseries_idle_epilog(in_purr);
return index;
}
@@ -115,7 +93,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
{
unsigned long in_purr;
- idle_loop_prolog(&in_purr);
+ pseries_idle_prolog(&in_purr);
get_lppaca()->donate_dedicated_cpu = 1;
HMT_medium();
@@ -124,7 +102,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
local_irq_disable();
get_lppaca()->donate_dedicated_cpu = 0;
- idle_loop_epilog(in_purr);
+ pseries_idle_epilog(in_purr);
return index;
}
@@ -135,7 +113,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
{
unsigned long in_purr;
- idle_loop_prolog(&in_purr);
+ pseries_idle_prolog(&in_purr);
/*
* Yield the processor to the hypervisor. We return if
@@ -147,7 +125,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
check_and_cede_processor();
local_irq_disable();
- idle_loop_epilog(in_purr);
+ pseries_idle_epilog(in_purr);
return index;
}
--
1.9.4
Gautham R. Shenoy wrote:
Thanks, LGTM. For the series:
Acked-by: Naveen N. Rao <[email protected]>
- Naveen
On 4/7/20 2:17 PM, Gautham R. Shenoy wrote:
Hi Gautham,
Thanks for working on it, I tested it using the lparstat patches posted at:
https://groups.google.com/forum/#!topic/powerpc-utils-devel/_imHP1Guw3c
On idle system:
===============
sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4324928 kB cpus=0 ent=2.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
0.27 99.74 2.11GHz[ 70%] 0.24 69.76
0.57 99.43 2.17GHz[ 72%] 0.43 71.57
0.52 99.47 2.11GHz[ 70%] 0.38 69.62
On system running N while(1) (N == online cpus)
===============================================
sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4324928 kB cpus=0 ent=2.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
99.99 0.00 3.35GHz[111%] 110.99 0.00
100.00 0.00 3.35GHz[111%] 111.00 0.00
100.00 0.00 3.35GHz[111%] 111.00 0.00
For the series:
Reviewed-and-Tested-by: Kamalesh Babulal <[email protected]>
--
Kamalesh
On 4/7/20 1:47 AM, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> Hi,
>
> This is the fifth version of the patches to track and expose idle PURR
> and SPURR ticks. These patches are required by tools such as lparstat
> to compute system utilization for capacity planning purposes.
>
> The previous versions can be found here:
> v4: https://lkml.org/lkml/2020/3/27/323
> v3: https://lkml.org/lkml/2020/3/11/331
> v2: https://lkml.org/lkml/2020/2/21/21
> v1: https://lore.kernel.org/patchwork/cover/1159341/
>
> The changes from v4 are:
>
> - As suggested by Naveen, moved the functions read_this_idle_purr()
> and read_this_idle_spurr() from Patch 2 and Patch 3 respectively
> to Patch 4 where it is invoked.
>
> - Dropped Patch 6 which cached the values of purr, spurr,
> idle_purr, idle_spurr in order to minimize the number of IPIs
> sent.
>
> - Updated the dates for the idle_purr, idle_spurr in the
> Documentation Patch 5.
>
> Motivation:
> ===========
> On PSeries LPARs, the data center planners desire a more accurate
> view of system utilization per resource such as CPU to plan the system
> capacity requirements better. Such accuracy can be obtained by reading
> PURR/SPURR registers for CPU resource utilization.
>
> Tools such as lparstat which are used to compute the utilization need
> to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR
> counters are already exposed through sysfs. We already account for
> PURR ticks when we go to idle so that we can update the VPA area. This
> patchset extends support to account for SPURR ticks when idle, and
> expose both via per-cpu sysfs files.
>
> These patches are required for enhancement to the lparstat utility
> that computes the CPU utilization based on PURR and SPURR which can be
> found here :
> https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4
>
>
> With the patches, when lparstat is run on a LPAR running CPU-Hogs,
> =========================================================================
> sudo ./src/lparstat -E 1 3
>
> System Configuration
> type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
>
> ---Actual--- -Normalized-
> %busy %idle Frequency %busy %idle
> ------ ------ ------------- ------ ------
> 1 99.99 0.00 3.35GHz[111%] 110.99 0.00
> 2 100.00 0.00 3.35GHz[111%] 111.01 0.00
> 3 100.00 0.00 3.35GHz[111%] 111.00 0.00
>
> With patches, when lparstat is run on an idle LPAR
> =========================================================================
> System Configuration
> type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
> ---Actual--- -Normalized-
> %busy %idle Frequency %busy %idle
> ------ ------ ------------- ------ ------
> 1 0.15 99.84 2.17GHz[ 72%] 0.11 71.89
> 2 0.24 99.76 2.11GHz[ 70%] 0.18 69.82
> 3 0.24 99.75 2.11GHz[ 70%] 0.18 69.81
>
> Gautham R. Shenoy (5):
> powerpc: Move idle_loop_prolog()/epilog() functions to header file
> powerpc/idle: Store PURR snapshot in a per-cpu global variable
> powerpc/pseries: Account for SPURR ticks on idle CPUs
> powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
> Documentation: Document sysfs interfaces purr, spurr, idle_purr,
> idle_spurr
>
> Documentation/ABI/testing/sysfs-devices-system-cpu | 39 +++++++++
> arch/powerpc/include/asm/idle.h | 93 ++++++++++++++++++++++
> arch/powerpc/kernel/sysfs.c | 82 ++++++++++++++++++-
> arch/powerpc/platforms/pseries/setup.c | 8 +-
> drivers/cpuidle/cpuidle-pseries.c | 39 ++-------
> 5 files changed, 224 insertions(+), 37 deletions(-)
> create mode 100644 arch/powerpc/include/asm/idle.h
>
Reviewed-by: Tyrel Datwyler <[email protected]>
Any chance this is going to be merged in the near future? There is a patchset to
update lparstat in the powerpc-utils package to calculate PURR/SPURR cpu
utilization that I would like to merge, but have been holding off to make sure
we are synced with this proposed patchset.
"Gautham R. Shenoy" <[email protected]> writes:
> This is the fifth version of the patches to track and expose idle PURR
> and SPURR ticks. These patches are required by tools such as lparstat
> to compute system utilization for capacity planning purposes.
>
> The previous versions can be found here:
> v4: https://lkml.org/lkml/2020/3/27/323
> v3: https://lkml.org/lkml/2020/3/11/331
> v2: https://lkml.org/lkml/2020/2/21/21
> v1: https://lore.kernel.org/patchwork/cover/1159341/
>
> The changes from v4 are:
>
> - As suggested by Naveen, moved the functions read_this_idle_purr()
> and read_this_idle_spurr() from Patch 2 and Patch 3 respectively
> to Patch 4 where it is invoked.
>
> - Dropped Patch 6 which cached the values of purr, spurr,
> idle_purr, idle_spurr in order to minimize the number of IPIs
> sent.
>
> - Updated the dates for the idle_purr, idle_spurr in the
> Documentation Patch 5.
LGTM
Acked-by: Nathan Lynch <[email protected]>
Thanks.
On Mon, Apr 20, 2020 at 03:46:35PM -0700, Tyrel Datwyler wrote:
> On 4/7/20 1:47 AM, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <[email protected]>
> >
> > Hi,
> >
> > This is the fifth version of the patches to track and expose idle PURR
> > and SPURR ticks. These patches are required by tools such as lparstat
> > to compute system utilization for capacity planning purposes.
> >
> > The previous versions can be found here:
> > v4: https://lkml.org/lkml/2020/3/27/323
> > v3: https://lkml.org/lkml/2020/3/11/331
> > v2: https://lkml.org/lkml/2020/2/21/21
> > v1: https://lore.kernel.org/patchwork/cover/1159341/
> >
> > The changes from v4 are:
> >
> > - As suggested by Naveen, moved the functions read_this_idle_purr()
> > and read_this_idle_spurr() from Patch 2 and Patch 3 respectively
> > to Patch 4 where it is invoked.
> >
> > - Dropped Patch 6 which cached the values of purr, spurr,
> > idle_purr, idle_spurr in order to minimize the number of IPIs
> > sent.
> >
> > - Updated the dates for the idle_purr, idle_spurr in the
> > Documentation Patch 5.
> >
> > Motivation:
> > ===========
> > On PSeries LPARs, the data center planners desire a more accurate
> > view of system utilization per resource such as CPU to plan the system
> > capacity requirements better. Such accuracy can be obtained by reading
> > PURR/SPURR registers for CPU resource utilization.
> >
> > Tools such as lparstat which are used to compute the utilization need
> > to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR
> > counters are already exposed through sysfs. We already account for
> > PURR ticks when we go to idle so that we can update the VPA area. This
> > patchset extends support to account for SPURR ticks when idle, and
> > expose both via per-cpu sysfs files.
> >
> > These patches are required for enhancement to the lparstat utility
> > that computes the CPU utilization based on PURR and SPURR which can be
> > found here :
> > https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4
> >
> >
> > With the patches, when lparstat is run on a LPAR running CPU-Hogs,
> > =========================================================================
> > sudo ./src/lparstat -E 1 3
> >
> > System Configuration
> > type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
> >
> > ---Actual--- -Normalized-
> > %busy %idle Frequency %busy %idle
> > ------ ------ ------------- ------ ------
> > 1 99.99 0.00 3.35GHz[111%] 110.99 0.00
> > 2 100.00 0.00 3.35GHz[111%] 111.01 0.00
> > 3 100.00 0.00 3.35GHz[111%] 111.00 0.00
> >
> > With patches, when lparstat is run on an idle LPAR
> > =========================================================================
> > System Configuration
> > type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834112 kB cpus=0 ent=2.00
> > ---Actual--- -Normalized-
> > %busy %idle Frequency %busy %idle
> > ------ ------ ------------- ------ ------
> > 1 0.15 99.84 2.17GHz[ 72%] 0.11 71.89
> > 2 0.24 99.76 2.11GHz[ 70%] 0.18 69.82
> > 3 0.24 99.75 2.11GHz[ 70%] 0.18 69.81
> >
> > Gautham R. Shenoy (5):
> > powerpc: Move idle_loop_prolog()/epilog() functions to header file
> > powerpc/idle: Store PURR snapshot in a per-cpu global variable
> > powerpc/pseries: Account for SPURR ticks on idle CPUs
> > powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
> > Documentation: Document sysfs interfaces purr, spurr, idle_purr,
> > idle_spurr
> >
> > Documentation/ABI/testing/sysfs-devices-system-cpu | 39 +++++++++
> > arch/powerpc/include/asm/idle.h | 93 ++++++++++++++++++++++
> > arch/powerpc/kernel/sysfs.c | 82 ++++++++++++++++++-
> > arch/powerpc/platforms/pseries/setup.c | 8 +-
> > drivers/cpuidle/cpuidle-pseries.c | 39 ++-------
> > 5 files changed, 224 insertions(+), 37 deletions(-)
> > create mode 100644 arch/powerpc/include/asm/idle.h
> >
>
> Reviewed-by: Tyrel Datwyler <[email protected]>
Thanks for reviewing the patches.
>
> Any chance this is going to be merged in the near future? There is a patchset to
> update lparstat in the powerpc-utils package to calculate PURR/SPURR cpu
> utilization that I would like to merge, but have been holding off to make sure
> we are synced with this proposed patchset.
Michael, could you please consider this for 5.8 ?
--
Thanks and Regards
gautham.
Gautham R Shenoy <[email protected]> writes:
> On Mon, Apr 20, 2020 at 03:46:35PM -0700, Tyrel Datwyler wrote:
>> On 4/7/20 1:47 AM, Gautham R. Shenoy wrote:
>> > From: "Gautham R. Shenoy" <[email protected]>
>> >
>> > Hi,
>> >
>> > This is the fifth version of the patches to track and expose idle PURR
>> > and SPURR ticks. These patches are required by tools such as lparstat
>> > to compute system utilization for capacity planning purposes.
...
>> >
>> > Gautham R. Shenoy (5):
>> > powerpc: Move idle_loop_prolog()/epilog() functions to header file
>> > powerpc/idle: Store PURR snapshot in a per-cpu global variable
>> > powerpc/pseries: Account for SPURR ticks on idle CPUs
>> > powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
>> > Documentation: Document sysfs interfaces purr, spurr, idle_purr,
>> > idle_spurr
>> >
>> > Documentation/ABI/testing/sysfs-devices-system-cpu | 39 +++++++++
>> > arch/powerpc/include/asm/idle.h | 93 ++++++++++++++++++++++
>> > arch/powerpc/kernel/sysfs.c | 82 ++++++++++++++++++-
>> > arch/powerpc/platforms/pseries/setup.c | 8 +-
>> > drivers/cpuidle/cpuidle-pseries.c | 39 ++-------
>> > 5 files changed, 224 insertions(+), 37 deletions(-)
>> > create mode 100644 arch/powerpc/include/asm/idle.h
>> >
>>
>> Reviewed-by: Tyrel Datwyler <[email protected]>
>
> Thanks for reviewing the patches.
>
>>
>> Any chance this is going to be merged in the near future? There is a patchset to
>> update lparstat in the powerpc-utils package to calculate PURR/SPURR cpu
>> utilization that I would like to merge, but have been holding off to make sure
>> we are synced with this proposed patchset.
>
> Michael, could you please consider this for 5.8 ?
Yes. Has it been tested on KVM at all?
cheers
Hello Michael,
On Thu, Apr 30, 2020 at 12:34:52PM +1000, Michael Ellerman wrote:
> Gautham R Shenoy <[email protected]> writes:
> > On Mon, Apr 20, 2020 at 03:46:35PM -0700, Tyrel Datwyler wrote:
> >> On 4/7/20 1:47 AM, Gautham R. Shenoy wrote:
> >> > From: "Gautham R. Shenoy" <[email protected]>
> >> >
> >> > Hi,
> >> >
> >> > This is the fifth version of the patches to track and expose idle PURR
> >> > and SPURR ticks. These patches are required by tools such as lparstat
> >> > to compute system utilization for capacity planning purposes.
> ...
> >> >
> >> > Gautham R. Shenoy (5):
> >> > powerpc: Move idle_loop_prolog()/epilog() functions to header file
> >> > powerpc/idle: Store PURR snapshot in a per-cpu global variable
> >> > powerpc/pseries: Account for SPURR ticks on idle CPUs
> >> > powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
> >> > Documentation: Document sysfs interfaces purr, spurr, idle_purr,
> >> > idle_spurr
> >> >
> >> > Documentation/ABI/testing/sysfs-devices-system-cpu | 39 +++++++++
> >> > arch/powerpc/include/asm/idle.h | 93 ++++++++++++++++++++++
> >> > arch/powerpc/kernel/sysfs.c | 82 ++++++++++++++++++-
> >> > arch/powerpc/platforms/pseries/setup.c | 8 +-
> >> > drivers/cpuidle/cpuidle-pseries.c | 39 ++-------
> >> > 5 files changed, 224 insertions(+), 37 deletions(-)
> >> > create mode 100644 arch/powerpc/include/asm/idle.h
> >> >
> >>
> >> Reviewed-by: Tyrel Datwyler <[email protected]>
> >
> > Thanks for reviewing the patches.
> >
> >>
> >> Any chance this is going to be merged in the near future? There is a patchset to
> >> update lparstat in the powerpc-utils package to calculate PURR/SPURR cpu
> >> utilization that I would like to merge, but have been holding off to make sure
> >> we are synced with this proposed patchset.
> >
> > Michael, could you please consider this for 5.8 ?
>
> Yes. Has it been tested on KVM at all?
No. I haven't tested this on KVM. Will do that today.
>
> cheers
--
Thanks and Regards
gautham.
On Thu, Apr 30, 2020 at 09:46:13AM +0530, Gautham R Shenoy wrote:
> Hello Michael,
> > >
> > > Michael, could you please consider this for 5.8 ?
> >
> > Yes. Has it been tested on KVM at all?
>
> No. I haven't tested this on KVM. Will do that today.
The results on Shared LPAR and KVM are as follows:
---------------------------------------------------
The lparstat results on a Shared LPAR are consistent with those
observed on a dedicated LPAR when at least one of the threads of the
core is active. When all the threads are idle, lparstat shows an
incorrect idle percentage. This is perhaps because the Hypervisor
puts a completely idle core into some power-saving state with the
runlatch turned off, due to which the PURR counts on the threads of
the core do not add up to the elapsed timebase ticks. The results are
in section A) below.
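As a rough illustration of the anomaly described above, here is a small
Python sketch (not part of lparstat or of this series) that flags
lparstat -E rows whose Actual %busy and %idle do not sum to roughly 100.
The function name, the tolerance and the row labels are assumptions made
for the example; the sample values are the first rows of runs 1, 4 and 5
in section A) below.

```python
def actual_sums_to_100(busy, idle, tolerance=0.5):
    """Return True when Actual %busy + %idle is within `tolerance` of 100."""
    return abs((busy + idle) - 100.0) <= tolerance

# (label, Actual %busy, Actual %idle): first row of each run quoted in section A.
rows = [
    ("all threads busy", 100.00, 0.00),
    ("one thread busy", 31.80, 68.20),
    ("LPAR idle", 0.04, 0.14),
]

for label, busy, idle in rows:
    status = "ok" if actual_sums_to_100(busy, idle) else "does not sum to 100"
    print(f"{label}: {busy + idle:.2f} -> {status}")
```

Only the fully idle Shared LPAR row fails this check, which matches the
observation that the idle percentage is off only when every thread of the
core is idle.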
lparstat is not supported on KVM. However, I performed some basic
sanity checks on purr, spurr, idle_purr, and idle_spurr sysfs files
that show up after this patch series. When CPUs are offlined, the
idle_purr and idle_spurr sysfs files no longer show up, just like purr
and spurr sysfs files. The values of the counters monotonically
increase, except when the CPU is busy, in which case the idle_purr and
idle_spurr counts are stagnant as expected.
However, I don't think even the values of PURR or SPURR make much
sense on a KVM guest, since the Linux Hypervisor doesn't set additional
registers such as RWMR, except on POWER8, where KVM sets RWMR
corresponding to the number of online threads in a vCORE before
dispatching the vcore. I haven't been able to test it on a POWER8
guest yet. The results on POWER9 are in section B) below.
A ) Shared LPAR
======================
1. When all the threads of the core are running a CPU-Hog
# ./lparstat -E 1 5
System Configuration
type=Shared mode=Capped smt=8 lcpu=6 mem=10362752 kB cpus=10 ent=6.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
100.00 0.00 2.90GHz[126%] 126.00 0.00
100.00 0.00 2.90GHz[126%] 126.00 0.00
100.00 0.00 2.90GHz[126%] 126.00 0.00
100.00 0.00 2.90GHz[126%] 126.00 0.00
100.01 0.00 2.90GHz[126%] 126.01 0.00
2. When 4 threads of a core are running CPU Hogs, with the remaining 4
threads idle.
# ./lparstat -E 1 5
System Configuration
type=Shared mode=Capped smt=8 lcpu=6 mem=10362752 kB cpus=10 ent=6.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
81.06 18.94 2.97GHz[129%] 104.56 24.44
81.05 18.95 2.97GHz[129%] 104.56 24.44
81.06 18.95 2.97GHz[129%] 104.56 24.44
81.06 18.95 2.97GHz[129%] 104.56 24.44
81.05 18.95 2.97GHz[129%] 104.56 24.45
3. When 2 threads of a core are running CPU Hogs, with the other 6
threads idle.
# ./lparstat -E 1 5
System Configuration
type=Shared mode=Capped smt=8 lcpu=6 mem=10362752 kB cpus=10 ent=6.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
65.21 34.79 3.13GHz[136%] 88.69 47.31
65.20 34.81 3.13GHz[136%] 88.67 47.33
64.25 35.76 3.13GHz[136%] 87.38 48.63
63.68 36.31 3.13GHz[136%] 86.60 49.39
63.55 36.45 3.13GHz[136%] 86.42 49.58
4. When a single thread of the core is running a CPU Hog, with the
remaining 7 threads idle.
# ./lparstat -E 1 5
System Configuration
type=Shared mode=Capped smt=8 lcpu=6 mem=10362752 kB cpus=10 ent=6.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
31.80 68.20 3.20GHz[139%] 44.20 94.80
31.80 68.20 3.20GHz[139%] 44.20 94.81
31.80 68.20 3.20GHz[139%] 44.20 94.80
31.80 68.21 3.20GHz[139%] 44.20 94.81
31.79 68.21 3.20GHz[139%] 44.19 94.81
5. When the LPAR is idle:
# ./lparstat -E 1 5
System Configuration
type=Shared mode=Capped smt=8 lcpu=6 mem=10362752 kB cpus=10 ent=6.00
---Actual--- -Normalized-
%busy %idle Frequency %busy %idle
------ ------ ------------- ------ ------
0.04 0.14 2.41GHz[105%] 0.04 0.15
0.04 0.15 2.36GHz[102%] 0.04 0.15
0.03 0.13 2.35GHz[102%] 0.03 0.14
0.03 0.13 2.31GHz[100%] 0.03 0.13
0.03 0.13 2.32GHz[101%] 0.03 0.14
In this case, the sum of the PURR values does not add up to the elapsed
TB. This is probably due to the Hypervisor putting the core into some
power-saving state with the runlatch turned off.
# ./purr_tb -t 8
Got threads_per_core = 8
CORE 0:
CPU 0 : Delta PURR : 85744
CPU 1 : Delta PURR : 113632
CPU 2 : Delta PURR : 78224
CPU 3 : Delta PURR : 68856
CPU 4 : Delta PURR : 78064
CPU 5 : Delta PURR : 60488
CPU 6 : Delta PURR : 77776
CPU 7 : Delta PURR : 59464
Total Delta PURR : 622248 (Expected ~513156096)
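The shortfall in the output above can be quantified with a short Python
sketch. It only post-processes the numbers printed above; the function
name purr_coverage is hypothetical and is not part of purr_tb.

```python
def purr_coverage(deltas, expected_tb):
    """Return (sum of per-thread PURR deltas, fraction of the expected timebase delta)."""
    total = sum(deltas)
    return total, total / expected_tb

# Per-thread Delta PURR values for CORE 0 on the idle Shared LPAR, as printed above.
idle_deltas = [85744, 113632, 78224, 68856, 78064, 60488, 77776, 59464]
expected_tb = 513156096

total, fraction = purr_coverage(idle_deltas, expected_tb)
print(f"total={total} fraction={fraction:.4%}")
```

The computed total matches the 622248 reported above and is well under
1% of the expected timebase ticks.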
B) KVM guest
==============================
vCPU idle:
-------------
Sampled every second when the KVM guest (1 socket, 2 cores, 4 threads,
vCPUs pinned) was idle. The values monotonically increase over time as
expected.
idle_purr:33128550
idle_spurr:3e3c775c
purr:d48181820
spurr:10295e8f28
idle_purr:331319f0
idle_spurr:3e3d56a4
purr:d481c4600
spurr:102964d3f0
idle_purr:331378c0
idle_spurr:3e3de58c
purr:d481faea0
spurr:102969f118
idle_purr:3313daa0
idle_spurr:3e3e77a4
purr:d4822c750
spurr:10296e9538
idle_purr:33143ab0
idle_spurr:3e3f093c
purr:d482608c0
spurr:1029737808
vCPU busy
---------------
Sampled every second on the same KVM guest, when the vCPU was running
a cpu-hog. The values of purr and spurr monotonically increase. And
the values of idle_purr and idle_spurr are stagnant as expected.
idle_purr:3335fca0
idle_spurr:3e71a774
purr:d5dd6bca0
spurr:1049fca1f0
idle_purr:3335fca0
idle_spurr:3e71a774
purr:d7c6f1c50
spurr:1077e12d40
idle_purr:3335fca0
idle_spurr:3e71a774
purr:d9b078720
spurr:10a5c5cc08
idle_purr:3335fca0
idle_spurr:3e71a774
purr:db99ef1d0
spurr:10d3a8eac0
idle_purr:3335fca0
idle_spurr:3e71a774
purr:dd8365c20
spurr:11018c0908
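The check described above can be reproduced with a minimal Python
sketch. It assumes the sysfs values are hexadecimal, as they appear in
the samples, and simply re-checks the five busy samples listed above;
the helper names are hypothetical.

```python
def parse_samples(text):
    """Parse whitespace-separated 'name:hexvalue' tokens into name -> list of ints."""
    series = {}
    for token in text.split():
        name, value = token.split(":")
        series.setdefault(name, []).append(int(value, 16))
    return series

def is_stagnant(values):
    """True when every sample equals the first one."""
    return all(v == values[0] for v in values)

def is_strictly_increasing(values):
    """True when each sample is larger than the previous one."""
    return all(a < b for a, b in zip(values, values[1:]))

# The five one-second samples taken while the vCPU was running a cpu-hog.
busy = """
idle_purr:3335fca0 idle_spurr:3e71a774 purr:d5dd6bca0 spurr:1049fca1f0
idle_purr:3335fca0 idle_spurr:3e71a774 purr:d7c6f1c50 spurr:1077e12d40
idle_purr:3335fca0 idle_spurr:3e71a774 purr:d9b078720 spurr:10a5c5cc08
idle_purr:3335fca0 idle_spurr:3e71a774 purr:db99ef1d0 spurr:10d3a8eac0
idle_purr:3335fca0 idle_spurr:3e71a774 purr:dd8365c20 spurr:11018c0908
"""

series = parse_samples(busy)
for name in ("idle_purr", "idle_spurr"):
    print(name, "stagnant:", is_stagnant(series[name]))
for name in ("purr", "spurr"):
    print(name, "increasing:", is_strictly_increasing(series[name]))
```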
However, I don't think even the values of PURR or SPURR make any
sense on a KVM guest, since the Linux Hypervisor doesn't set additional
registers such as RWMR, except on POWER8, where KVM sets RWMR
corresponding to the number of online threads in a vCORE before
dispatching the vcore.
On a POWER9 KVM guest:
When it is idle:
# ./purr_tb -t 4
Got threads_per_core = 4
CORE 0:
CPU 0 : Delta PURR : 2371632
CPU 1 : Delta PURR : 5056
CPU 2 : Delta PURR : 8016
CPU 3 : Delta PURR : 12688
Total Delta PURR : 2397392 (Expected ~514567680)
When all the threads are running CPU Hogs:
# ./purr_tb -t 4
Got threads_per_core = 4
CORE 0:
CPU 0 : Delta PURR : 510742304
CPU 1 : Delta PURR : 510747696
CPU 2 : Delta PURR : 510740208
CPU 3 : Delta PURR : 510735200
Total Delta PURR : 2042965408 (Expected ~512289792)
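To make the busy case easier to read, the sketch below (Python, using
only the numbers printed above; the function name is hypothetical)
expresses each thread's Delta PURR as a fraction of the expected
timebase delta. If PURR were apportioned among the threads of the core,
these fractions would be expected to sum to about 1.

```python
def per_thread_fraction(deltas, expected_tb):
    """Return each thread's PURR delta as a fraction of the expected timebase delta."""
    return [d / expected_tb for d in deltas]

# POWER9 KVM guest, all 4 threads running CPU Hogs, values as printed above.
busy_deltas = [510742304, 510747696, 510740208, 510735200]
busy_expected_tb = 512289792

fractions = per_thread_fraction(busy_deltas, busy_expected_tb)
for cpu, frac in enumerate(fractions):
    print(f"CPU {cpu}: {frac:.4f} of expected timebase delta")
print(f"sum of fractions: {sum(fractions):.4f}")
```

Here every thread individually accounts for nearly the whole timebase
delta, so the sum comes out close to 4 rather than 1.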
>
>
> >
> > cheers
>
--
Thanks and Regards
gautham.
On Tue, 2020-04-07 at 08:47:39 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> Currently, prior to entering an idle state on a Linux Guest, the
> pseries cpuidle driver implements idle_loop_prolog() and
> idle_loop_epilog() functions which ensure that idle_purr is correctly
> computed, and the hypervisor is informed that the CPU cycles have been
> donated.
>
> These prolog and epilog functions are also required in the default
> idle call, i.e. pseries_lpar_idle(). Hence move these accessor
> functions to a common header file and call them from
> pseries_lpar_idle(). Since the existing header files such as
> asm/processor.h have enough clutter, create a new header file
> asm/idle.h. Finally rename idle_loop_prolog() and idle_loop_epilog()
> to pseries_idle_prolog() and pseries_idle_epilog() as they are only
> relevant on pseries guests.
>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e4a884cc28fa3f5d8b81de46998ffe29b4ad169e
cheers