2022-03-23 07:49:44

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH v7 05/13] perf/x86/amd: enable branch sampling priv level filtering

The AMD Branch Sampling features does not provide hardware filtering by
privilege level. The associated PMU counter does but not the branch sampling
by itself. Given how BRS operates there is a possibility that BRS captures
kernel level branches even though the event is programmed to count only at
the user level.

Implement a workaround in software by removing the branches which belong to
the wrong privilege level. The privilege level is evaluated on the target of
the branch and not the source so as to be compatible with other architectures.
As a consequence of this patch, the number of entries in the
PERF_RECORD_BRANCH_STACK buffer may be less than the maximum (16). It could
even be zero. Another consequence is that consecutive entries in the branch
stack may not reflect actual code path and may have discontinuities, in case
kernel branches were suppressed. But this is no different than what happens
on other architectures.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/events/amd/brs.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/amd/brs.c b/arch/x86/events/amd/brs.c
index 3c13c484c637..40461c3ce714 100644
--- a/arch/x86/events/amd/brs.c
+++ b/arch/x86/events/amd/brs.c
@@ -92,10 +92,6 @@ int amd_brs_setup_filter(struct perf_event *event)
if ((type & ~PERF_SAMPLE_BRANCH_PLM_ALL) != PERF_SAMPLE_BRANCH_ANY)
return -EINVAL;

- /* can only capture at all priv levels due to the way BRS works */
- if ((type & PERF_SAMPLE_BRANCH_PLM_ALL) != PERF_SAMPLE_BRANCH_PLM_ALL)
- return -EINVAL;
-
return 0;
}

@@ -195,6 +191,21 @@ void amd_brs_disable_all(void)
amd_brs_disable();
}

+static bool amd_brs_match_plm(struct perf_event *event, u64 to)
+{
+ int type = event->attr.branch_sample_type;
+ int plm_k = PERF_SAMPLE_BRANCH_KERNEL | PERF_SAMPLE_BRANCH_HV;
+ int plm_u = PERF_SAMPLE_BRANCH_USER;
+
+ if (!(type & plm_k) && kernel_ip(to))
+ return 0;
+
+ if (!(type & plm_u) && !kernel_ip(to))
+ return 0;
+
+ return 1;
+}
+
/*
* Caller must ensure amd_brs_inuse() is true before calling
* return:
@@ -252,8 +263,6 @@ void amd_brs_drain(void)
if (to == BRS_POISON)
break;

- rdmsrl(brs_from(brs_idx), from);
-
/*
* Sign-extend SAMP_BR_TO to 64 bits, bits 61-63 are reserved.
* Necessary to generate proper virtual addresses suitable for
@@ -261,6 +270,11 @@ void amd_brs_drain(void)
*/
to = (u64)(((s64)to << shift) >> shift);

+ if (!amd_brs_match_plm(event, to))
+ continue;
+
+ rdmsrl(brs_from(brs_idx), from);
+
perf_clear_branch_entry_bitfields(br+nr);

br[nr].from = from;
--
2.35.1.894.gb6a874cedc-goog


2022-04-06 02:12:10

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: perf/core] perf/x86/amd: Enable branch sampling priv level filtering

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 8910075d61a37e5b0d82e6c83ed9a0a31fe9ea08
Gitweb: https://git.kernel.org/tip/8910075d61a37e5b0d82e6c83ed9a0a31fe9ea08
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 22 Mar 2022 15:15:09 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Tue, 05 Apr 2022 10:24:37 +02:00

perf/x86/amd: Enable branch sampling priv level filtering

The AMD Branch Sampling features does not provide hardware filtering by
privilege level. The associated PMU counter does but not the branch sampling
by itself. Given how BRS operates there is a possibility that BRS captures
kernel level branches even though the event is programmed to count only at
the user level.

Implement a workaround in software by removing the branches which belong to
the wrong privilege level. The privilege level is evaluated on the target of
the branch and not the source so as to be compatible with other architectures.
As a consequence of this patch, the number of entries in the
PERF_RECORD_BRANCH_STACK buffer may be less than the maximum (16). It could
even be zero. Another consequence is that consecutive entries in the branch
stack may not reflect actual code path and may have discontinuities, in case
kernel branches were suppressed. But this is no different than what happens
on other architectures.

Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/events/amd/brs.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/amd/brs.c b/arch/x86/events/amd/brs.c
index 3c13c48..40461c3 100644
--- a/arch/x86/events/amd/brs.c
+++ b/arch/x86/events/amd/brs.c
@@ -92,10 +92,6 @@ int amd_brs_setup_filter(struct perf_event *event)
if ((type & ~PERF_SAMPLE_BRANCH_PLM_ALL) != PERF_SAMPLE_BRANCH_ANY)
return -EINVAL;

- /* can only capture at all priv levels due to the way BRS works */
- if ((type & PERF_SAMPLE_BRANCH_PLM_ALL) != PERF_SAMPLE_BRANCH_PLM_ALL)
- return -EINVAL;
-
return 0;
}

@@ -195,6 +191,21 @@ void amd_brs_disable_all(void)
amd_brs_disable();
}

+static bool amd_brs_match_plm(struct perf_event *event, u64 to)
+{
+ int type = event->attr.branch_sample_type;
+ int plm_k = PERF_SAMPLE_BRANCH_KERNEL | PERF_SAMPLE_BRANCH_HV;
+ int plm_u = PERF_SAMPLE_BRANCH_USER;
+
+ if (!(type & plm_k) && kernel_ip(to))
+ return 0;
+
+ if (!(type & plm_u) && !kernel_ip(to))
+ return 0;
+
+ return 1;
+}
+
/*
* Caller must ensure amd_brs_inuse() is true before calling
* return:
@@ -252,8 +263,6 @@ void amd_brs_drain(void)
if (to == BRS_POISON)
break;

- rdmsrl(brs_from(brs_idx), from);
-
/*
* Sign-extend SAMP_BR_TO to 64 bits, bits 61-63 are reserved.
* Necessary to generate proper virtual addresses suitable for
@@ -261,6 +270,11 @@ void amd_brs_drain(void)
*/
to = (u64)(((s64)to << shift) >> shift);

+ if (!amd_brs_match_plm(event, to))
+ continue;
+
+ rdmsrl(brs_from(brs_idx), from);
+
perf_clear_branch_entry_bitfields(br+nr);

br[nr].from = from;