2022-09-28 10:18:50

by Ravi Bangoria

Subject: [PATCH v3 04/15] perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}

IbsDcMissLat indicates the number of clock cycles from when a miss is
detected in the data cache to when the data was delivered to the core.
Similarly, IbsTagToRetCtr provides the number of cycles from when the op
was tagged to when the op was retired. Consider these fields for
sample->weight.

Signed-off-by: Ravi Bangoria <[email protected]>
---
Note:
While opening a new event, the perf tool starts with a set of attributes
and reverts some of them in a predefined order until the open succeeds
or it runs out of attempts. Here, the first attempt includes both
WEIGHT_STRUCT and exclude_guest, which always fails because IBS does
not support guest filtering. The problem, however, is that perf reverts
WEIGHT_STRUCT but keeps trying with exclude_guest. Thus, although this
patch enables WEIGHT_STRUCT support in the kernel, using it from the
perf tool needs more changes (not included in this series).

arch/x86/events/amd/ibs.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index e20caa5cf02f..d883694e0fd4 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -955,6 +955,7 @@ static void perf_ibs_parse_ld_st_data(__u64 sample_type,
 {
 	union ibs_op_data3 op_data3;
 	union ibs_op_data2 op_data2;
+	union ibs_op_data op_data;
 
 	data->data_src.val = PERF_MEM_NA;
 	op_data3.val = ibs_data->regs[ibs_op_msr_idx(MSR_AMD64_IBSOPDATA3)];
@@ -970,6 +971,19 @@ static void perf_ibs_parse_ld_st_data(__u64 sample_type,
 		perf_ibs_get_data_src(ibs_data, data, &op_data2, &op_data3);
 		data->sample_flags |= PERF_SAMPLE_DATA_SRC;
 	}
+
+	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE && op_data3.dc_miss &&
+	    data->data_src.mem_op == PERF_MEM_OP_LOAD) {
+		op_data.val = ibs_data->regs[ibs_op_msr_idx(MSR_AMD64_IBSOPDATA)];
+
+		if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
+			data->weight.var1_dw = op_data3.dc_miss_lat;
+			data->weight.var2_w = op_data.tag_to_ret_ctr;
+		} else if (sample_type & PERF_SAMPLE_WEIGHT) {
+			data->weight.full = op_data3.dc_miss_lat;
+		}
+		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+	}
 }
 
 static int perf_ibs_get_offset_max(struct perf_ibs *perf_ibs, u64 sample_type,
@@ -977,7 +991,8 @@ static int perf_ibs_get_offset_max(struct perf_ibs *perf_ibs, u64 sample_type,
 {
 	if (sample_type & PERF_SAMPLE_RAW ||
 	    (perf_ibs == &perf_ibs_op &&
-	     sample_type & PERF_SAMPLE_DATA_SRC))
+	     (sample_type & PERF_SAMPLE_DATA_SRC ||
+	      sample_type & PERF_SAMPLE_WEIGHT_TYPE)))
 		return perf_ibs->offset_max;
 	else if (check_rip)
 		return 3;
--
2.31.1


2022-09-30 05:50:36

by Namhyung Kim

Subject: Re: [PATCH v3 04/15] perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}

On Wed, Sep 28, 2022 at 2:59 AM Ravi Bangoria <[email protected]> wrote:
>
> IbsDcMissLat indicates the number of clock cycles from when a miss is
> detected in the data cache to when the data was delivered to the core.
> Similarly, IbsTagToRetCtr provides the number of cycles from when the op
> was tagged to when the op was retired. Consider these fields for
> sample->weight.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> Note:
> While opening a new event, the perf tool starts with a set of attributes
> and reverts some of them in a predefined order until the open succeeds
> or it runs out of attempts. Here, the first attempt includes both
> WEIGHT_STRUCT and exclude_guest, which always fails because IBS does
> not support guest filtering. The problem, however, is that perf reverts
> WEIGHT_STRUCT but keeps trying with exclude_guest. Thus, although this
> patch enables WEIGHT_STRUCT support in the kernel, using it from the
> perf tool needs more changes (not included in this series).

Yeah, it'd be nice if the kernel could expose more PMU capabilities like
no-exclude, so that tools can skip setting it for them.

Thanks,
Namhyung

2022-09-30 10:11:27

by tip-bot2 for Ravi Bangoria

Subject: [tip: perf/core] perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 6b2ae4952ef8ac23b467bc10776404092b581143
Gitweb: https://git.kernel.org/tip/6b2ae4952ef8ac23b467bc10776404092b581143
Author: Ravi Bangoria <[email protected]>
AuthorDate: Wed, 28 Sep 2022 15:27:54 +05:30
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 29 Sep 2022 12:20:55 +02:00

perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}

IbsDcMissLat indicates the number of clock cycles from when a miss is
detected in the data cache to when the data was delivered to the core.
Similarly, IbsTagToRetCtr provides the number of cycles from when the op
was tagged to when the op was retired. Consider these fields for
sample->weight.

Signed-off-by: Ravi Bangoria <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/amd/ibs.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index e20caa5..d883694 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -955,6 +955,7 @@ static void perf_ibs_parse_ld_st_data(__u64 sample_type,
 {
 	union ibs_op_data3 op_data3;
 	union ibs_op_data2 op_data2;
+	union ibs_op_data op_data;
 
 	data->data_src.val = PERF_MEM_NA;
 	op_data3.val = ibs_data->regs[ibs_op_msr_idx(MSR_AMD64_IBSOPDATA3)];
@@ -970,6 +971,19 @@ static void perf_ibs_parse_ld_st_data(__u64 sample_type,
 		perf_ibs_get_data_src(ibs_data, data, &op_data2, &op_data3);
 		data->sample_flags |= PERF_SAMPLE_DATA_SRC;
 	}
+
+	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE && op_data3.dc_miss &&
+	    data->data_src.mem_op == PERF_MEM_OP_LOAD) {
+		op_data.val = ibs_data->regs[ibs_op_msr_idx(MSR_AMD64_IBSOPDATA)];
+
+		if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
+			data->weight.var1_dw = op_data3.dc_miss_lat;
+			data->weight.var2_w = op_data.tag_to_ret_ctr;
+		} else if (sample_type & PERF_SAMPLE_WEIGHT) {
+			data->weight.full = op_data3.dc_miss_lat;
+		}
+		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+	}
 }
 
 static int perf_ibs_get_offset_max(struct perf_ibs *perf_ibs, u64 sample_type,
@@ -977,7 +991,8 @@ static int perf_ibs_get_offset_max(struct perf_ibs *perf_ibs, u64 sample_type,
 {
 	if (sample_type & PERF_SAMPLE_RAW ||
 	    (perf_ibs == &perf_ibs_op &&
-	     sample_type & PERF_SAMPLE_DATA_SRC))
+	     (sample_type & PERF_SAMPLE_DATA_SRC ||
+	      sample_type & PERF_SAMPLE_WEIGHT_TYPE)))
 		return perf_ibs->offset_max;
 	else if (check_rip)
 		return 3;