This is v2 patch set for enabling per-core RAPL counters through
energy-cores event on RAPL PMU. Added RFC tag because a new patch is
added and this additional patch may need clarification from the
community.
While enabling this support with v1 patch [1], Stephane reported an
issue where for the first reading of energy-cores event for every core was
invalid [2]. This issue is addressed with patch 1 of this series. There
is no change in patch 2.
[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/lkml/CABPqkBQ_bSTC-OEe_LrgUrpj2VsseX1ThvO-yLcEtF8vb4+AAw@mail.gmail.com/
Change log:
v1 -> v2:
- Added new patch to fix the issue reported by Stephane
- Added RFC tag
Wyes Karny (2):
perf/x86/rapl: Fix energy-cores event
perf/x86/rapl: Enable Core RAPL for AMD
arch/x86/events/rapl.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
--
2.34.1
For quite some time, energy-cores event is broken, because RAPL PMU
assumes all the events on this PMU are uncore and sets rapl_cpu_mask
with the first available CPU on the die. Therefore, for energy-cores
event if we read MSR form pmu->cpu, it's wrong. But the following two
changes helped to hide this issue.
- commit 704e2f5b700d ("perf stat: Use affinity for enabling/disabling
events")
- commit e64cd6f73ff5 ("perf/x86: Use PMUEF_READ_CPU_PKG in uncore
events")
These two changes together acted as a workaround for energy-cores event.
First change affined perf events to respective CPUs whereas the second
change helped to pick the local CPU to read the MSR. In this way, MSRs
were read from the correct CPU. This works unless it's the first
reading. For the first reading the second patch doesn't apply and we
get wrong readings. Stephane reported this issue when a patch to enable
AMD energy-cores RAPL event was posted [1].
The right way to fix the issue is to get rid of RAPL being considered an
uncore event. That is a larger change. To enable current RAPL usage,
work around the issue by conditionally remove the
`PERF_EV_CAP_READ_ACTIVE_PKG` flag for energy-cores event. Also, use the
event's CPU instead for PMU's CPU to read the MSR.
[1]: https://lore.kernel.org/lkml/CABPqkBQ_bSTC-OEe_LrgUrpj2VsseX1ThvO-yLcEtF8vb4+AAw@mail.gmail.com/#t
Fixes: e64cd6f73ff5 ("perf/x86: Use PMUEF_READ_CPU_PKG in uncore events")
Signed-off-by: Wyes Karny <[email protected]>
---
arch/x86/events/rapl.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 52e6e7ed4f78..e6a0c077daf5 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -343,14 +343,15 @@ static int rapl_pmu_event_init(struct perf_event *event)
if (event->cpu < 0)
return -EINVAL;
- event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
-
if (!cfg || cfg >= NR_RAPL_DOMAINS + 1)
return -EINVAL;
cfg = array_index_nospec((long)cfg, NR_RAPL_DOMAINS + 1);
bit = cfg - 1;
+ if (bit != PERF_RAPL_PP0)
+ event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
+
/* check event supported */
if (!(rapl_cntr_mask & (1 << bit)))
return -EINVAL;
@@ -363,7 +364,15 @@ static int rapl_pmu_event_init(struct perf_event *event)
pmu = cpu_to_rapl_pmu(event->cpu);
if (!pmu)
return -EINVAL;
- event->cpu = pmu->cpu;
+
+ /*
+ * FIXME: RAPL PMU considers events are uncore and MSRs can be read from
+ * the first available CPU of the die. But this is not true for energy-cores
+ * event. Therefore as a workaround don't consider pmu->cpu here for PERF_RAPL_PP0.
+ */
+ if (event->event_caps & PERF_EV_CAP_READ_ACTIVE_PKG)
+ event->cpu = pmu->cpu;
+
event->pmu_private = pmu;
event->hw.event_base = rapl_msrs[bit].msr;
event->hw.config = cfg;
--
2.34.1
AMD processors support per-package and per-core energy monitoring
through RAPL counters which can be accessed by users running in
supervisor mode.
Core RAPL counters gives power consumption information per core. For
AMD processors the package level RAPL counter are already exposed to
perf. Expose the core level RAPL counters also.
sudo perf stat -a --per-core -C 0-127 -e power/energy-cores/
Output:
S0-D0-C0 2 8.73 Joules power/energy-cores/
S0-D0-C1 2 8.73 Joules power/energy-cores/
S0-D0-C2 2 8.73 Joules power/energy-cores/
S0-D0-C3 2 8.73 Joules power/energy-cores/
S0-D0-C4 2 8.73 Joules power/energy-cores/
Signed-off-by: Wyes Karny <[email protected]>
---
arch/x86/events/rapl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index e6a0c077daf5..9cce8e590cc4 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -546,7 +546,7 @@ static struct perf_msr intel_rapl_spr_msrs[] = {
* - want to use same event codes across both architectures
*/
static struct perf_msr amd_rapl_msrs[] = {
- [PERF_RAPL_PP0] = { 0, &rapl_events_cores_group, 0, false, 0 },
+ [PERF_RAPL_PP0] = { MSR_AMD_CORE_ENERGY_STATUS, &rapl_events_cores_group, test_msr, false, RAPL_MSR_MASK },
[PERF_RAPL_PKG] = { MSR_AMD_PKG_ENERGY_STATUS, &rapl_events_pkg_group, test_msr, false, RAPL_MSR_MASK },
[PERF_RAPL_RAM] = { 0, &rapl_events_ram_group, 0, false, 0 },
[PERF_RAPL_PP1] = { 0, &rapl_events_gpu_group, 0, false, 0 },
@@ -773,7 +773,8 @@ static struct rapl_model model_spr = {
};
static struct rapl_model model_amd_hygon = {
- .events = BIT(PERF_RAPL_PKG),
+ .events = BIT(PERF_RAPL_PP0) |
+ BIT(PERF_RAPL_PKG),
.msr_power_unit = MSR_AMD_RAPL_POWER_UNIT,
.rapl_msrs = amd_rapl_msrs,
};
--
2.34.1
On 2023-03-01 1:14 p.m., Wyes Karny wrote:
> For quite some time, energy-cores event is broken, because RAPL PMU
> assumes all the events on this PMU are uncore and sets rapl_cpu_mask
> with the first available CPU on the die. Therefore, for energy-cores
> event if we read MSR form pmu->cpu, it's wrong. But the following two
> changes helped to hide this issue.
>
> - commit 704e2f5b700d ("perf stat: Use affinity for enabling/disabling
> events")
> - commit e64cd6f73ff5 ("perf/x86: Use PMUEF_READ_CPU_PKG in uncore
> events")
>
> These two changes together acted as a workaround for energy-cores event.
> First change affined perf events to respective CPUs whereas the second
> change helped to pick the local CPU to read the MSR. In this way, MSRs
> were read from the correct CPU. This works unless it's the first
> reading. For the first reading the second patch doesn't apply and we
> get wrong readings. Stephane reported this issue when a patch to enable
> AMD energy-cores RAPL event was posted [1].
>
> The right way to fix the issue is to get rid of RAPL being considered an
> uncore event. That is a larger change. To enable current RAPL usage,
> work around the issue by conditionally remove the
> `PERF_EV_CAP_READ_ACTIVE_PKG` flag for energy-cores event. Also, use the
> event's CPU instead for PMU's CPU to read the MSR.
The current RAPL PMU aka 'power' should be die/socket scope.
The energy-cores event is also defined as a die/socket scope.
* pp0 counter: consumption of all physical cores (power plane 0)
* event: rapl_energy_cores
* perf code: 0x1
I don't think we want to change the scope of the energy-cores event.
Otherwise Intel's energy-cores event probably be broken.
It looks like you are looking for a new per-core RAPL event. I think
it's better to create a new core scope RAPL PMU.
Thanks,
Kan
>
> [1]: https://lore.kernel.org/lkml/CABPqkBQ_bSTC-OEe_LrgUrpj2VsseX1ThvO-yLcEtF8vb4+AAw@mail.gmail.com/#t
>
> Fixes: e64cd6f73ff5 ("perf/x86: Use PMUEF_READ_CPU_PKG in uncore events")
> Signed-off-by: Wyes Karny <[email protected]>
> ---
> arch/x86/events/rapl.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
> index 52e6e7ed4f78..e6a0c077daf5 100644
> --- a/arch/x86/events/rapl.c
> +++ b/arch/x86/events/rapl.c
> @@ -343,14 +343,15 @@ static int rapl_pmu_event_init(struct perf_event *event)
> if (event->cpu < 0)
> return -EINVAL;
>
> - event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
> -
> if (!cfg || cfg >= NR_RAPL_DOMAINS + 1)
> return -EINVAL;
>
> cfg = array_index_nospec((long)cfg, NR_RAPL_DOMAINS + 1);
> bit = cfg - 1;
>
> + if (bit != PERF_RAPL_PP0)
> + event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
> +
> /* check event supported */
> if (!(rapl_cntr_mask & (1 << bit)))
> return -EINVAL;
> @@ -363,7 +364,15 @@ static int rapl_pmu_event_init(struct perf_event *event)
> pmu = cpu_to_rapl_pmu(event->cpu);
> if (!pmu)
> return -EINVAL;
> - event->cpu = pmu->cpu;
> +
> + /*
> + * FIXME: RAPL PMU considers events are uncore and MSRs can be read from
> + * the first available CPU of the die. But this is not true for energy-cores
> + * event. Therefore as a workaround don't consider pmu->cpu here for PERF_RAPL_PP0.
> + */
> + if (event->event_caps & PERF_EV_CAP_READ_ACTIVE_PKG)
> + event->cpu = pmu->cpu;
> +
> event->pmu_private = pmu;
> event->hw.event_base = rapl_msrs[bit].msr;
> event->hw.config = cfg;
Hello.
On středa 1. března 2023 19:14:47 CET Wyes Karny wrote:
> This is v2 patch set for enabling per-core RAPL counters through
> energy-cores event on RAPL PMU. Added RFC tag because a new patch is
> added and this additional patch may need clarification from the
> community.
>
> While enabling this support with v1 patch [1], Stephane reported an
> issue where for the first reading of energy-cores event for every core was
> invalid [2]. This issue is addressed with patch 1 of this series. There
> is no change in patch 2.
>
> [1]: https://lore.kernel.org/lkml/[email protected]/
> [2]: https://lore.kernel.org/lkml/CABPqkBQ_bSTC-OEe_LrgUrpj2VsseX1ThvO-yLcEtF8vb4+AAw@mail.gmail.com/
>
> Change log:
> v1 -> v2:
> - Added new patch to fix the issue reported by Stephane
> - Added RFC tag
>
> Wyes Karny (2):
> perf/x86/rapl: Fix energy-cores event
> perf/x86/rapl: Enable Core RAPL for AMD
>
> arch/x86/events/rapl.c | 20 +++++++++++++++-----
> 1 file changed, 15 insertions(+), 5 deletions(-)
Since I was asked to re-test v2:
```
$ sudo perf stat -a --per-core -C 0-15 -e power/energy-cores/ -- dd if=/dev/zero of=/dev/null bs=1M count=1000000
1000000+0 records in
1000000+0 records out
1048576000000 bytes (1,0 TB, 977 GiB) copied, 17,0321 s, 61,6 GB/s
Performance counter stats for 'system wide':
S0-D0-C0 1 6,67 Joules power/energy-cores/
S0-D0-C1 1 197,93 Joules power/energy-cores/
S0-D0-C2 1 11,32 Joules power/energy-cores/
S0-D0-C3 1 13,88 Joules power/energy-cores/
S0-D0-C4 1 12,40 Joules power/energy-cores/
S0-D0-C5 1 5,10 Joules power/energy-cores/
S0-D0-C6 1 3,92 Joules power/energy-cores/
S0-D0-C7 1 5,65 Joules power/energy-cores/
S0-D0-C8 1 4,91 Joules power/energy-cores/
S0-D0-C9 1 57,18 Joules power/energy-cores/
S0-D0-C10 1 174,95 Joules power/energy-cores/
S0-D0-C11 1 3,58 Joules power/energy-cores/
S0-D0-C12 1 5,41 Joules power/energy-cores/
S0-D0-C13 1 5,33 Joules power/energy-cores/
S0-D0-C14 1 2,89 Joules power/energy-cores/
S0-D0-C15 1 5,61 Joules power/energy-cores/
17,034519241 seconds time elapsed
```
Tested-by: Oleksandr Natalenko <[email protected]>
Thanks.
--
Oleksandr Natalenko (post-factum)