2023-07-06 07:01:12

by Sandipan Das

Subject: [PATCH v2] perf vendor events amd: Fix large metrics

There are cases where a metric requires more events than the number of
available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
data fabric counters but the "nps1_die_to_dram" metric has eight events.
By default, the constituent events are placed in a group and since the
events cannot be scheduled at the same time, the metric is not computed.
The "all metrics" test also fails because of this.

Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
the user to run perf with "--metric-no-group".
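
For reference, a minimal invocation of the kind these metrics have relied
on so far (shown only as an illustration; the "sleep 1" workload is
arbitrary and "-a" is required because the data fabric events can only be
counted system-wide):

$ sudo perf stat -M nps1_die_to_dram --metric-no-group -a -- sleep 1

With the NO_GROUP_EVENTS constraint in place, the same metric is expected
to work without passing "--metric-no-group" explicitly.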

E.g.

$ sudo perf test -v 101

Before:

101: perf all metrics test :
--- start ---
test child forked, pid 37131
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Metric 'nps1_die_to_dram' not printed in:
Error:
Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with -1
---- end ----
perf all metrics test: FAILED!

After:

101: perf all metrics test :
--- start ---
test child forked, pid 43766
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok

Reported-by: Ayush Jain <[email protected]>
Suggested-by: Ian Rogers <[email protected]>
Signed-off-by: Sandipan Das <[email protected]>
---

Previous versions can be found at:
v1: https://lore.kernel.org/all/[email protected]/

Changes in v2:
- As suggested by Ian, use the NO_GROUP_EVENTS constraint instead of
retrying the test scenario with --metric-no-group.
- Change the commit message accordingly.

tools/perf/pmu-events/arch/x86/amdzen1/recommended.json | 3 ++-
tools/perf/pmu-events/arch/x86/amdzen2/recommended.json | 3 ++-
tools/perf/pmu-events/arch/x86/amdzen3/recommended.json | 3 ++-
3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
index bf5083c1c260..4d28177325a0 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
@@ -169,8 +169,9 @@
},
{
"MetricName": "nps1_die_to_dram",
- "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
"MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+ "MetricConstraint": "NO_GROUP_EVENTS",
"MetricGroup": "data_fabric",
"PerPkg": "1",
"ScaleUnit": "6.1e-5MiB"
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
index a71694a043ba..60e19456d4c8 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
@@ -169,8 +169,9 @@
},
{
"MetricName": "nps1_die_to_dram",
- "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
"MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+ "MetricConstraint": "NO_GROUP_EVENTS",
"MetricGroup": "data_fabric",
"PerPkg": "1",
"ScaleUnit": "6.1e-5MiB"
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
index 988cf68ae825..3e9e1781812e 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
@@ -205,10 +205,11 @@
},
{
"MetricName": "nps1_die_to_dram",
- "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
"MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
"MetricGroup": "data_fabric",
"PerPkg": "1",
+ "MetricConstraint": "NO_GROUP_EVENTS",
"ScaleUnit": "6.1e-5MiB"
}
]
--
2.34.1



2023-07-06 14:39:50

by Ian Rogers

Subject: Re: [PATCH v2] perf vendor events amd: Fix large metrics

On Wed, Jul 5, 2023 at 11:34 PM Sandipan Das <[email protected]> wrote:
>
> There are cases where a metric requires more events than the number of
> available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
> data fabric counters but the "nps1_die_to_dram" metric has eight events.
> By default, the constituent events are placed in a group and since the
> events cannot be scheduled at the same time, the metric is not computed.
> The "all metrics" test also fails because of this.
>
> Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
> the user to run perf with "--metric-no-group".
>
> E.g.
>
> $ sudo perf test -v 101
>
> Before:
>
> 101: perf all metrics test :
> --- start ---
> test child forked, pid 37131
> Testing branch_misprediction_ratio
> Testing all_remote_links_outbound
> Testing nps1_die_to_dram
> Metric 'nps1_die_to_dram' not printed in:
> Error:
> Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
> Testing macro_ops_dispatched
> Testing all_l2_cache_accesses
> Testing all_l2_cache_hits
> Testing all_l2_cache_misses
> Testing ic_fetch_miss_ratio
> Testing l2_cache_accesses_from_l2_hwpf
> Testing l2_cache_misses_from_l2_hwpf
> Testing op_cache_fetch_miss_ratio
> Testing l3_read_miss_latency
> Testing l1_itlb_misses
> test child finished with -1
> ---- end ----
> perf all metrics test: FAILED!
>
> After:
>
> 101: perf all metrics test :
> --- start ---
> test child forked, pid 43766
> Testing branch_misprediction_ratio
> Testing all_remote_links_outbound
> Testing nps1_die_to_dram
> Testing macro_ops_dispatched
> Testing all_l2_cache_accesses
> Testing all_l2_cache_hits
> Testing all_l2_cache_misses
> Testing ic_fetch_miss_ratio
> Testing l2_cache_accesses_from_l2_hwpf
> Testing l2_cache_misses_from_l2_hwpf
> Testing op_cache_fetch_miss_ratio
> Testing l3_read_miss_latency
> Testing l1_itlb_misses
> test child finished with 0
> ---- end ----
> perf all metrics test: Ok
>
> Reported-by: Ayush Jain <[email protected]>
> Suggested-by: Ian Rogers <[email protected]>
> Signed-off-by: Sandipan Das <[email protected]>

Acked-by: Ian Rogers <[email protected]>

Will there be a PMU driver fix so that the perf_event_open fails for
the group? That way the weak group would work.
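
For context, the weak group form referred to above looks roughly like the
following (event list shortened for illustration, workload arbitrary; the
":W" modifier marks the group as weak):

$ perf stat -e '{dram_channel_data_controller_0,dram_channel_data_controller_1}:W' -a -- sleep 1

If opening the group fails, perf falls back to counting the events
without the group.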

Thanks,
Ian


2023-07-06 15:12:36

by Sandipan Das

Subject: Re: [PATCH v2] perf vendor events amd: Fix large metrics

Hi Ian,

On 7/6/2023 7:19 PM, Ian Rogers wrote:
> On Wed, Jul 5, 2023 at 11:34 PM Sandipan Das <[email protected]> wrote:
>>
>> Reported-by: Ayush Jain <[email protected]>
>> Suggested-by: Ian Rogers <[email protected]>
>> Signed-off-by: Sandipan Das <[email protected]>
>
> Acked-by: Ian Rogers <[email protected]>
>
> Will there be a PMU driver fix so that the perf_event_open fails for
> the group? That way the weak group would work.
>

Yes, that's in our plan. Ravi (in CC) and I have discussed adding
group validation in the event_init() path.

- Sandipan

2023-07-11 15:11:43

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2] perf vendor events amd: Fix large metrics

On Thu, Jul 06, 2023 at 06:49:29AM -0700, Ian Rogers wrote:
> On Wed, Jul 5, 2023 at 11:34 PM Sandipan Das <[email protected]> wrote:
> >
> > Reported-by: Ayush Jain <[email protected]>
> > Suggested-by: Ian Rogers <[email protected]>
> > Signed-off-by: Sandipan Das <[email protected]>
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied.

- Arnaldo



--

- Arnaldo

2023-07-11 17:58:23

by Namhyung Kim

Subject: Re: [PATCH v2] perf vendor events amd: Fix large metrics

On Tue, Jul 11, 2023 at 7:51 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> On Thu, Jul 06, 2023 at 06:49:29AM -0700, Ian Rogers wrote:
> > On Wed, Jul 5, 2023 at 11:34 PM Sandipan Das <[email protected]> wrote:
> > >
> > > Reported-by: Ayush Jain <[email protected]>
> > > Suggested-by: Ian Rogers <[email protected]>
> > > Signed-off-by: Sandipan Das <[email protected]>
> >
> > Acked-by: Ian Rogers <[email protected]>
>
> Thanks, applied.

If I'm not too late...

Tested-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung