2022-03-28 21:26:20

by Liang, Kan

Subject: [PATCH] perf/x86/intel: Don't extend the pseudo-encoding to GP counters

From: Kan Liang <[email protected]>

The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

Performance counter stats for 'CPU(s) 0':

607,246 cpu/event=0xc0,umask=0x0/
0 cpu/event=0x0,umask=0x1/

The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
doesn't work on the generic counters. However, current perf extends its
mask to the generic counters.

The pseudo event-code for a fixed counter must be 0x00. Check for it and
avoid extending the mask for fixed counter events which use the
pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.

With the patch,
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

Performance counter stats for 'CPU(s) 0':

583,184 cpu/event=0xc0,umask=0x0/
583,048 cpu/event=0x0,umask=0x1/

Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
Signed-off-by: Kan Liang <[email protected]>
Cc: [email protected]
---
arch/x86/events/intel/core.c | 6 +++++-
arch/x86/include/asm/perf_event.h | 5 +++++
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index db32ef6..1d2e49d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
 			/* Disabled fixed counters which are not in CPUID */
 			c->idxmsk64 &= intel_ctrl;

-			if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+			/*
+			 * Don't extend the pseudo-encoding to the
+			 * generic counters
+			 */
+			if (!use_fixed_pseudo_encoding(c->code))
 				c->idxmsk64 |= (1ULL << num_counters) - 1;
 		}
 		c->idxmsk64 &=
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 48e6ef56..cd85f03 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -242,6 +242,11 @@ struct x86_pmu_capability {
 #define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)
 #define INTEL_PMC_MSK_FIXED_SLOTS	(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)

+static inline bool use_fixed_pseudo_encoding(u64 code)
+{
+	return !(code & 0xff);
+}
+
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *
--
2.7.4


2022-03-28 21:49:20

by Liang, Kan

Subject: Re: [PATCH] perf/x86/intel: Don't extend the pseudo-encoding to GP counters



On 3/28/2022 1:11 PM, Stephane Eranian wrote:
> On Mon, Mar 28, 2022 at 8:50 AM <[email protected]> wrote:
>>
>> From: Kan Liang <[email protected]>
>>
>> The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
>> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>>
>> Performance counter stats for 'CPU(s) 0':
>>
>> 607,246 cpu/event=0xc0,umask=0x0/
>> 0 cpu/event=0x0,umask=0x1/
>>
>> The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
>> doesn't work on the generic counters. However, current perf extends its
>> mask to the generic counters.
>>
>> The pseudo event-code for a fixed counter must be 0x00. Check for it and
>> avoid extending the mask for fixed counter events which use the
>> pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.
>>
>> With the patch,
>> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>>
>> Performance counter stats for 'CPU(s) 0':
>>
>> 583,184 cpu/event=0xc0,umask=0x0/
>> 583,048 cpu/event=0x0,umask=0x1/
>>
>> Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
>> Signed-off-by: Kan Liang <[email protected]>
>> Cc: [email protected]
>> ---
>> arch/x86/events/intel/core.c | 6 +++++-
>> arch/x86/include/asm/perf_event.h | 5 +++++
>> 2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index db32ef6..1d2e49d 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
>> /* Disabled fixed counters which are not in CPUID */
>> c->idxmsk64 &= intel_ctrl;
>>
>> - if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
>> + /*
>> + * Don't extend the pseudo-encoding to the
>> + * generic counters
>> + */
>> + if (!use_fixed_pseudo_encoding(c->code))
>> c->idxmsk64 |= (1ULL << num_counters) - 1;
>> }
>> c->idxmsk64 &=
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index 48e6ef56..cd85f03 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -242,6 +242,11 @@ struct x86_pmu_capability {
>> #define INTEL_PMC_IDX_FIXED_SLOTS (INTEL_PMC_IDX_FIXED + 3)
>> #define INTEL_PMC_MSK_FIXED_SLOTS (1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
>>
>> +static inline bool use_fixed_pseudo_encoding(u64 code)
>> +{
>> + return !(code & 0xff);
>> +}
>> +
> I ack the problem.
>
> That does not take into account the old encoding for PREC_DIST 0x01c0
> which is also forced to
> fixed counter0 on ICL and should not be extended.

The old encoding is not documented in the ICL event list now. The only
PREC_DIST event for ICL is using the pseudo encoding.

{
    "EventCode": "0x00",
    "UMask": "0x01",
    "EventName": "INST_RETIRED.PREC_DIST",
    "BriefDescription": "Precise instruction retired event with a reduced effect of PEBS shadow in IP distribution",
    "PublicDescription": "A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled. Use on Fixed Counter 0.",
    "Counter": "Fixed counter 0",

Ideally, I think we should remove the old encoding 0x01c0 from the
constraints table rather than force it to fixed counter 0 only.
If so, that should be a separate patch.

>
> That also limits the options for the SLOTS events which can be
> measured by a GP. Yet to work
> with PERF_METRICS, it has to be programmed into fixed counter 3.

For the SLOTS event which can only work with PERF_METRICS, current
perf already limits it to fixed counter 3, as below.
    FIXED_EVENT_CONSTRAINT(0x0400, 3), /* SLOTS */
No behavior is changed by this patch.

The GP version of SLOTS is 0x01a4. According to the event list, it can
be scheduled on all GP counters, so it's not added to the constraints
table.

    "EventCode": "0xa4",
    "UMask": "0x01",
    "EventName": "TOPDOWN.SLOTS_P",
    "BriefDescription": "TMA slots available for an unhalted logical processor. General counter - architectural event",
    "PublicDescription": "Counts the number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core.",
    "Counter": "0,1,2,3,4,5,6,7",
    "PEBScounters": "0,1,2,3,4,5,6,7",

Even if we eventually decide to extend 0x01a4 to fixed counter 3 and
add an entry FIXED_EVENT_CONSTRAINT(0x01a4, 3) to the constraints table,
this patch doesn't limit it.

Thanks,
Kan

>
>> /*
>> * We model BTS tracing as another fixed-mode PMC.
>> *
>> --
>> 2.7.4
>>

2022-03-28 22:27:06

by Stephane Eranian

Subject: Re: [PATCH] perf/x86/intel: Don't extend the pseudo-encoding to GP counters

On Mon, Mar 28, 2022 at 8:50 AM <[email protected]> wrote:
>
> From: Kan Liang <[email protected]>
>
> The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>
> Performance counter stats for 'CPU(s) 0':
>
> 607,246 cpu/event=0xc0,umask=0x0/
> 0 cpu/event=0x0,umask=0x1/
>
> The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
> doesn't work on the generic counters. However, current perf extends its
> mask to the generic counters.
>
> The pseudo event-code for a fixed counter must be 0x00. Check for it and
> avoid extending the mask for fixed counter events which use the
> pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.
>
> With the patch,
> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>
> Performance counter stats for 'CPU(s) 0':
>
> 583,184 cpu/event=0xc0,umask=0x0/
> 583,048 cpu/event=0x0,umask=0x1/
>
> Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
> Signed-off-by: Kan Liang <[email protected]>
> Cc: [email protected]
> ---
> arch/x86/events/intel/core.c | 6 +++++-
> arch/x86/include/asm/perf_event.h | 5 +++++
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index db32ef6..1d2e49d 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
> /* Disabled fixed counters which are not in CPUID */
> c->idxmsk64 &= intel_ctrl;
>
> - if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
> + /*
> + * Don't extend the pseudo-encoding to the
> + * generic counters
> + */
> + if (!use_fixed_pseudo_encoding(c->code))
> c->idxmsk64 |= (1ULL << num_counters) - 1;
> }
> c->idxmsk64 &=
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 48e6ef56..cd85f03 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -242,6 +242,11 @@ struct x86_pmu_capability {
> #define INTEL_PMC_IDX_FIXED_SLOTS (INTEL_PMC_IDX_FIXED + 3)
> #define INTEL_PMC_MSK_FIXED_SLOTS (1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
>
> +static inline bool use_fixed_pseudo_encoding(u64 code)
> +{
> + return !(code & 0xff);
> +}
> +
I ack the problem.

That does not take into account the old encoding for PREC_DIST 0x01c0
which is also forced to
fixed counter0 on ICL and should not be extended.

That also limits the options for the SLOTS events which can be
measured by a GP. Yet to work
with PERF_METRICS, it has to be programmed into fixed counter 3.

> /*
> * We model BTS tracing as another fixed-mode PMC.
> *
> --
> 2.7.4
>

Subject: [tip: perf/urgent] perf/x86/intel: Don't extend the pseudo-encoding to GP counters

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: 4a263bf331c512849062805ef1b4ac40301a9829
Gitweb: https://git.kernel.org/tip/4a263bf331c512849062805ef1b4ac40301a9829
Author: Kan Liang <[email protected]>
AuthorDate: Mon, 28 Mar 2022 08:49:02 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Tue, 05 Apr 2022 09:59:44 +02:00

perf/x86/intel: Don't extend the pseudo-encoding to GP counters

The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

Performance counter stats for 'CPU(s) 0':

607,246 cpu/event=0xc0,umask=0x0/
0 cpu/event=0x0,umask=0x1/

The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
doesn't work on the generic counters. However, current perf extends its
mask to the generic counters.

The pseudo event-code for a fixed counter must be 0x00. Check for it and
avoid extending the mask for fixed counter events which use the
pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.

With the patch,
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

Performance counter stats for 'CPU(s) 0':

583,184 cpu/event=0xc0,umask=0x0/
583,048 cpu/event=0x0,umask=0x1/

Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 6 +++++-
arch/x86/include/asm/perf_event.h | 5 +++++
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 28f075e..eb17b96 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5536,7 +5536,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
 			/* Disabled fixed counters which are not in CPUID */
 			c->idxmsk64 &= intel_ctrl;

-			if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+			/*
+			 * Don't extend the pseudo-encoding to the
+			 * generic counters
+			 */
+			if (!use_fixed_pseudo_encoding(c->code))
 				c->idxmsk64 |= (1ULL << num_counters) - 1;
 		}
 		c->idxmsk64 &=
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 58d9e4b..b06e4c5 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -241,6 +241,11 @@ struct x86_pmu_capability {
 #define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)
 #define INTEL_PMC_MSK_FIXED_SLOTS	(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)

+static inline bool use_fixed_pseudo_encoding(u64 code)
+{
+	return !(code & 0xff);
+}
+
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *