2022-09-28 18:49:11

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 0/2] perf/x86/amd/lbr: fix LBR filtering support

Short patch series to address some kernel issues with the AMD LBrv2
enablement which were introduced in Linux 6.0.

Stephane Eranian (2):
perf/x86/utils: fix uninitialized var in get_branch_type()
perf/x86/amd/lbr: adjust LBR regardless of filtering

arch/x86/events/amd/lbr.c | 8 ++++++--
arch/x86/events/utils.c | 4 ++++
2 files changed, 10 insertions(+), 2 deletions(-)

--
2.37.3.998.g577e59143f-goog


2022-09-28 18:49:18

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 1/2] perf/x86/utils: fix uninitialized var in get_branch_type()

offset is passed as a pointer and on certain call path is not set by the
function. If the caller does not re-initialize offset between calls, value
could be inherited between calls. Prevent this by initializing offset on each
call.
This impacts the code in amd_pmu_lbr_filter() which does
for(i=0; ...) {
ret = get_branch_type_fused(..., &offset);
if (offset)
lbr_entries[i].from += offset;
}

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/events/utils.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/events/utils.c b/arch/x86/events/utils.c
index 5f5617afde79..76b1f8bb0fd5 100644
--- a/arch/x86/events/utils.c
+++ b/arch/x86/events/utils.c
@@ -94,6 +94,10 @@ static int get_branch_type(unsigned long from, unsigned long to, int abort,
u8 buf[MAX_INSN_SIZE];
int is64 = 0;

+ /* make sure we initialize offset */
+ if (offset)
+ *offset = 0;
+
to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;

--
2.37.3.998.g577e59143f-goog

2022-09-28 18:49:22

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 2/2] perf/x86/amd/lbr: adjust LBR regardless of filtering

In case of fused compare and taken branch instructions, the AMD LBR points to
the compare instruction instead of the branch. Users of LBR usually expects
the from address to point to a branch instruction. The kernel has code to
adjust the from address via get_branch_type_fused(). However this correction
is only applied when a branch filter is applied. That means that if no
filter is present, the quality of the data is lower.

Fix the problem by applying the adjustment regardless of the filter setting,
bringing the AMD LBR to the same level as other LBR implementations.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/events/amd/lbr.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
index 2e1c1573efe7..38a75216c12c 100644
--- a/arch/x86/events/amd/lbr.c
+++ b/arch/x86/events/amd/lbr.c
@@ -99,12 +99,13 @@ static void amd_pmu_lbr_filter(void)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int br_sel = cpuc->br_sel, offset, type, i, j;
bool compress = false;
+ bool fused_only = false;
u64 from, to;

/* If sampling all branches, there is nothing to filter */
if (((br_sel & X86_BR_ALL) == X86_BR_ALL) &&
((br_sel & X86_BR_TYPE_SAVE) != X86_BR_TYPE_SAVE))
- return;
+ fused_only = true;

for (i = 0; i < cpuc->lbr_stack.nr; i++) {
from = cpuc->lbr_entries[i].from;
@@ -116,8 +117,11 @@ static void amd_pmu_lbr_filter(void)
* fusion where it points to an instruction preceding the
* actual branch
*/
- if (offset)
+ if (offset) {
cpuc->lbr_entries[i].from += offset;
+ if (fused_only)
+ continue;
+ }

/* If type does not correspond, then discard */
if (type == X86_BR_NONE || (br_sel & type) != type) {
--
2.37.3.998.g577e59143f-goog

2022-09-29 07:47:35

by Sandipan Das

[permalink] [raw]
Subject: Re: [PATCH 1/2] perf/x86/utils: fix uninitialized var in get_branch_type()

On 9/29/2022 12:10 AM, Stephane Eranian wrote:
> offset is passed as a pointer and on certain call path is not set by the
> function. If the caller does not re-initialize offset between calls, value
> could be inherited between calls. Prevent this by initializing offset on each
> call.
> This impacts the code in amd_pmu_lbr_filter() which does
> for(i=0; ...) {
> ret = get_branch_type_fused(..., &offset);
> if (offset)
> lbr_entries[i].from += offset;
> }
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
> arch/x86/events/utils.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/events/utils.c b/arch/x86/events/utils.c
> index 5f5617afde79..76b1f8bb0fd5 100644
> --- a/arch/x86/events/utils.c
> +++ b/arch/x86/events/utils.c
> @@ -94,6 +94,10 @@ static int get_branch_type(unsigned long from, unsigned long to, int abort,
> u8 buf[MAX_INSN_SIZE];
> int is64 = 0;
>
> + /* make sure we initialize offset */
> + if (offset)
> + *offset = 0;
> +
> to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
> from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;
>

Reviewed-by: Sandipan Das <[email protected]>

2022-09-29 07:54:05

by Sandipan Das

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf/x86/amd/lbr: adjust LBR regardless of filtering

On 9/29/2022 12:10 AM, Stephane Eranian wrote:
> In case of fused compare and taken branch instructions, the AMD LBR points to
> the compare instruction instead of the branch. Users of LBR usually expects
> the from address to point to a branch instruction. The kernel has code to
> adjust the from address via get_branch_type_fused(). However this correction
> is only applied when a branch filter is applied. That means that if no
> filter is present, the quality of the data is lower.
>
> Fix the problem by applying the adjustment regardless of the filter setting,
> bringing the AMD LBR to the same level as other LBR implementations.
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
> arch/x86/events/amd/lbr.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
> index 2e1c1573efe7..38a75216c12c 100644
> --- a/arch/x86/events/amd/lbr.c
> +++ b/arch/x86/events/amd/lbr.c
> @@ -99,12 +99,13 @@ static void amd_pmu_lbr_filter(void)
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> int br_sel = cpuc->br_sel, offset, type, i, j;
> bool compress = false;
> + bool fused_only = false;
> u64 from, to;
>
> /* If sampling all branches, there is nothing to filter */
> if (((br_sel & X86_BR_ALL) == X86_BR_ALL) &&
> ((br_sel & X86_BR_TYPE_SAVE) != X86_BR_TYPE_SAVE))
> - return;
> + fused_only = true;
>
> for (i = 0; i < cpuc->lbr_stack.nr; i++) {
> from = cpuc->lbr_entries[i].from;
> @@ -116,8 +117,11 @@ static void amd_pmu_lbr_filter(void)
> * fusion where it points to an instruction preceding the
> * actual branch
> */
> - if (offset)
> + if (offset) {
> cpuc->lbr_entries[i].from += offset;
> + if (fused_only)
> + continue;
> + }
>
> /* If type does not correspond, then discard */
> if (type == X86_BR_NONE || (br_sel & type) != type) {

Reviewed-by: Sandipan Das <[email protected]>

2022-09-29 08:27:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/2] perf/x86/amd/lbr: fix LBR filtering support

On Wed, Sep 28, 2022 at 11:40:41AM -0700, Stephane Eranian wrote:
> Short patch series to address some kernel issues with the AMD LBrv2
> enablement which were introduced in Linux 6.0.
>
> Stephane Eranian (2):
> perf/x86/utils: fix uninitialized var in get_branch_type()
> perf/x86/amd/lbr: adjust LBR regardless of filtering
>
> arch/x86/events/amd/lbr.c | 8 ++++++--
> arch/x86/events/utils.c | 4 ++++
> 2 files changed, 10 insertions(+), 2 deletions(-)

If you want this in perf/urgent you're missing Fixes tags.

2022-09-29 08:51:37

by Sandipan Das

[permalink] [raw]
Subject: Re: [PATCH 0/2] perf/x86/amd/lbr: fix LBR filtering support

On 9/29/2022 1:44 PM, Peter Zijlstra wrote:
> On Wed, Sep 28, 2022 at 11:40:41AM -0700, Stephane Eranian wrote:
>> Short patch series to address some kernel issues with the AMD LBrv2
>> enablement which were introduced in Linux 6.0.
>>
>> Stephane Eranian (2):
>> perf/x86/utils: fix uninitialized var in get_branch_type()
>> perf/x86/amd/lbr: adjust LBR regardless of filtering
>>
>> arch/x86/events/amd/lbr.c | 8 ++++++--
>> arch/x86/events/utils.c | 4 ++++
>> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> If you want this in perf/urgent you're missing Fixes tags.

That would be:

Fixes: df3e9612f758 ("perf/x86: Make branch classifier fusion-aware")

and

Fixes: 245268c19f70 ("perf/x86/amd/lbr: Use fusion-aware branch classifier")

for the 1st and 2nd patch respectively.


- Sandipan

2022-09-29 10:29:29

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/2] perf/x86/amd/lbr: fix LBR filtering support

On Thu, Sep 29, 2022 at 02:00:07PM +0530, Sandipan Das wrote:
> On 9/29/2022 1:44 PM, Peter Zijlstra wrote:
> > On Wed, Sep 28, 2022 at 11:40:41AM -0700, Stephane Eranian wrote:
> >> Short patch series to address some kernel issues with the AMD LBrv2
> >> enablement which were introduced in Linux 6.0.
> >>
> >> Stephane Eranian (2):
> >> perf/x86/utils: fix uninitialized var in get_branch_type()
> >> perf/x86/amd/lbr: adjust LBR regardless of filtering
> >>
> >> arch/x86/events/amd/lbr.c | 8 ++++++--
> >> arch/x86/events/utils.c | 4 ++++
> >> 2 files changed, 10 insertions(+), 2 deletions(-)
> >
> > If you want this in perf/urgent you're missing Fixes tags.
>
> That would be:
>
> Fixes: df3e9612f758 ("perf/x86: Make branch classifier fusion-aware")
>
> and
>
> Fixes: 245268c19f70 ("perf/x86/amd/lbr: Use fusion-aware branch classifier")
>
> for the 1st and 2nd patch respectively.

Thanks; but trying to queue then in /urgent resulted in me finding out
they're not at all slated for 6.0. The AMD LBRv2 stuff is in perf/core,
waiting for 6.1 to start. So I'll just go queue them there.

Still, good to have Fixes tags on them.

Subject: [tip: perf/core] perf/x86/amd/lbr: Adjust LBR regardless of filtering

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 3f9a1b3591003b122a6ea2d69f89a0fd96ec58b9
Gitweb: https://git.kernel.org/tip/3f9a1b3591003b122a6ea2d69f89a0fd96ec58b9
Author: Stephane Eranian <[email protected]>
AuthorDate: Wed, 28 Sep 2022 11:40:43 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 29 Sep 2022 12:20:57 +02:00

perf/x86/amd/lbr: Adjust LBR regardless of filtering

In case of fused compare and taken branch instructions, the AMD LBR points to
the compare instruction instead of the branch. Users of LBR usually expects
the from address to point to a branch instruction. The kernel has code to
adjust the from address via get_branch_type_fused(). However this correction
is only applied when a branch filter is applied. That means that if no
filter is present, the quality of the data is lower.

Fix the problem by applying the adjustment regardless of the filter setting,
bringing the AMD LBR to the same level as other LBR implementations.

Fixes: 245268c19f70 ("perf/x86/amd/lbr: Use fusion-aware branch classifier")
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sandipan Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/events/amd/lbr.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
index 2e1c157..38a7521 100644
--- a/arch/x86/events/amd/lbr.c
+++ b/arch/x86/events/amd/lbr.c
@@ -99,12 +99,13 @@ static void amd_pmu_lbr_filter(void)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int br_sel = cpuc->br_sel, offset, type, i, j;
bool compress = false;
+ bool fused_only = false;
u64 from, to;

/* If sampling all branches, there is nothing to filter */
if (((br_sel & X86_BR_ALL) == X86_BR_ALL) &&
((br_sel & X86_BR_TYPE_SAVE) != X86_BR_TYPE_SAVE))
- return;
+ fused_only = true;

for (i = 0; i < cpuc->lbr_stack.nr; i++) {
from = cpuc->lbr_entries[i].from;
@@ -116,8 +117,11 @@ static void amd_pmu_lbr_filter(void)
* fusion where it points to an instruction preceding the
* actual branch
*/
- if (offset)
+ if (offset) {
cpuc->lbr_entries[i].from += offset;
+ if (fused_only)
+ continue;
+ }

/* If type does not correspond, then discard */
if (type == X86_BR_NONE || (br_sel & type) != type) {

Subject: [tip: perf/core] perf/x86/utils: Fix uninitialized var in get_branch_type()

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 117ceeb1f4f87331e45a77e71f18303d15ec882e
Gitweb: https://git.kernel.org/tip/117ceeb1f4f87331e45a77e71f18303d15ec882e
Author: Stephane Eranian <[email protected]>
AuthorDate: Wed, 28 Sep 2022 11:40:42 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 29 Sep 2022 12:20:56 +02:00

perf/x86/utils: Fix uninitialized var in get_branch_type()

offset is passed as a pointer and on certain call path is not set by the
function. If the caller does not re-initialize offset between calls, value
could be inherited between calls. Prevent this by initializing offset on each
call.

This impacts the code in amd_pmu_lbr_filter() which does:

for(i=0; ...) {
ret = get_branch_type_fused(..., &offset);
if (offset)
lbr_entries[i].from += offset;
}

Fixes: df3e9612f758 ("perf/x86: Make branch classifier fusion-aware")
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sandipan Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/events/utils.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/events/utils.c b/arch/x86/events/utils.c
index 5f5617a..76b1f8b 100644
--- a/arch/x86/events/utils.c
+++ b/arch/x86/events/utils.c
@@ -94,6 +94,10 @@ static int get_branch_type(unsigned long from, unsigned long to, int abort,
u8 buf[MAX_INSN_SIZE];
int is64 = 0;

+ /* make sure we initialize offset */
+ if (offset)
+ *offset = 0;
+
to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;