2023-02-02 12:26:40

by Sandipan Das

Subject: [PATCH v4 0/4] tools perf: Add branch speculation info

AMD Last Branch Record Extension Version 2 (LbrExtV2) provides branch
speculation information and the perf UAPI is extended to provide this in
a generic way. Make perf tool show this additional information.

The UAPI changes can be found in commit 93315e46b000 ("perf/core: Add
speculation info to branch entries").

Requesting help from folks having access to big-endian systems to test
changes in the sample parsing test as I was only able to test these in
a ppc64 simulator.

Previous versions can be found at:
v3: https://lore.kernel.org/all/[email protected]/
v2: https://lore.kernel.org/all/[email protected]/
v1: https://lore.kernel.org/all/[email protected]/

Changes in v4:
- Update tests that were failing due to changes in perf output and
  sample parsing (thanks to Arnaldo for reporting).

Changes in v3:
- Drop tools-side UAPI changes as they have already been added by other
  commits.
- Rebase on top of latest perf/core.

Changes in v2:
- Drop msr-index.h related changes for now.
- Rebase on top of latest perf/core.
- Fix UAPI breakage introduced by the ARM64 BRBE changes to perf branch
  entry.

Sandipan Das (4):
perf script: Show branch speculation info
perf session: Show branch speculation info in raw dump
perf test sample-parsing: Update expected branch flags
perf test brstack: Update regex to include spec field

tools/perf/builtin-script.c | 5 +++--
tools/perf/tests/sample-parsing.c | 2 +-
tools/perf/tests/shell/test_brstack.sh | 18 +++++++++---------
tools/perf/util/branch.c | 15 +++++++++++++++
tools/perf/util/branch.h | 2 ++
tools/perf/util/evsel.c | 15 ++++++++++++---
tools/perf/util/session.c | 5 +++--
7 files changed, 45 insertions(+), 17 deletions(-)

--
2.34.1



2023-02-02 12:27:05

by Sandipan Das

Subject: [PATCH v4 1/4] perf script: Show branch speculation info

Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for optimizing code further.

The speculation info is appended to the end of the list of fields so that
existing tools that split on "/" and access fields by index remain
unaffected. Also show "-" instead of "N/A" when speculation info is
unavailable, because "N/A" contains the "/" field separator.

E.g.

$ perf record -j any,u,save_type ./test_branch
$ perf script --fields brstacksym

Before:

[...]
check_match+0x60/strcmp+0x0/P/-/-/0/CALL
do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL
[...]

After:

[...]
check_match+0x60/strcmp+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
[...]

Signed-off-by: Sandipan Das <[email protected]>
---
tools/perf/builtin-script.c | 5 +++--
tools/perf/util/branch.c | 15 +++++++++++++++
tools/perf/util/branch.h | 2 ++
tools/perf/util/evsel.c | 15 ++++++++++++---
4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 69394ac0a20d..782319e8fe6a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -880,12 +880,13 @@ mispred_str(struct branch_entry *br)

static int print_bstack_flags(FILE *fp, struct branch_entry *br)
{
- return fprintf(fp, "/%c/%c/%c/%d/%s ",
+ return fprintf(fp, "/%c/%c/%c/%d/%s/%s ",
mispred_str(br),
br->flags.in_tx ? 'X' : '-',
br->flags.abort ? 'A' : '-',
br->flags.cycles,
- get_branch_type(br));
+ get_branch_type(br),
+ br->flags.spec ? branch_spec_desc(br->flags.spec) : "-");
}

static int perf_sample__fprintf_brstack(struct perf_sample *sample,
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index 6d38238481d3..378f16a24751 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -212,3 +212,18 @@ int branch_type_str(struct branch_type_stat *st, char *bf, int size)

return printed;
}
+
+const char *branch_spec_desc(int spec)
+{
+ const char *branch_spec_outcomes[PERF_BR_SPEC_MAX] = {
+ "N/A",
+ "SPEC_WRONG_PATH",
+ "NON_SPEC_CORRECT_PATH",
+ "SPEC_CORRECT_PATH",
+ };
+
+ if (spec >= 0 && spec < PERF_BR_SPEC_MAX)
+ return branch_spec_outcomes[spec];
+
+ return NULL;
+}
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index 3ed792db1125..e41bfffe2217 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -89,4 +89,6 @@ const char *get_branch_type(struct branch_entry *e);
void branch_type_stat_display(FILE *fp, struct branch_type_stat *st);
int branch_type_str(struct branch_type_stat *st, char *bf, int bfsize);

+const char *branch_spec_desc(int spec);
+
#endif /* _PERF_BRANCH_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8550638587e5..019e53db03b3 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2319,7 +2319,10 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
* abort:1 //transaction abort
* cycles:16 //cycle count to last branch
* type:4 //branch type
- * reserved:40
+ * spec:2 //branch speculation info
+ * new_type:4 //additional branch type
+ * priv:3 //privilege level
+ * reserved:31
* }
* }
*
@@ -2335,7 +2338,10 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
new_val |= bitfield_swap(value, 3, 1);
new_val |= bitfield_swap(value, 4, 16);
new_val |= bitfield_swap(value, 20, 4);
- new_val |= bitfield_swap(value, 24, 40);
+ new_val |= bitfield_swap(value, 24, 2);
+ new_val |= bitfield_swap(value, 26, 4);
+ new_val |= bitfield_swap(value, 30, 3);
+ new_val |= bitfield_swap(value, 33, 31);
} else {
new_val = bitfield_swap(value, 63, 1);
new_val |= bitfield_swap(value, 62, 1);
@@ -2343,7 +2349,10 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
new_val |= bitfield_swap(value, 60, 1);
new_val |= bitfield_swap(value, 44, 16);
new_val |= bitfield_swap(value, 40, 4);
- new_val |= bitfield_swap(value, 0, 40);
+ new_val |= bitfield_swap(value, 38, 2);
+ new_val |= bitfield_swap(value, 34, 4);
+ new_val |= bitfield_swap(value, 31, 3);
+ new_val |= bitfield_swap(value, 0, 31);
}

return new_val;
--
2.34.1


2023-02-02 12:27:23

by Sandipan Das

Subject: [PATCH v4 2/4] perf session: Show branch speculation info in raw dump

Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for purposes of code optimization.

E.g.

$ perf record -j any,u ./test_branch
$ perf report --dump-raw-trace

Before:

[...]
8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
... branch stack: nr:16
..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0
..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0
..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0
..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0
..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0
..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0
..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0
..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0
..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0
..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0
..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0
..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0
..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0
..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0
..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0
..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0
... thread: test_branch:7952
...... dso: /data/sandipan/test_branch
[...]

After:

[...]
8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
... branch stack: nr:16
..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0 SPEC_CORRECT_PATH
..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0 NON_SPEC_CORRECT_PATH
... thread: test_branch:7952
...... dso: /data/sandipan/test_branch
[...]

Signed-off-by: Sandipan Das <[email protected]>
---
tools/perf/util/session.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7c021c6cedb9..a42f051dab9d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1180,7 +1180,7 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
struct branch_entry *e = &entries[i];

if (!callstack) {
- printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 " %hu cycles %s%s%s%s %x %s\n",
+ printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 " %hu cycles %s%s%s%s %x %s %s\n",
i, e->from, e->to,
(unsigned short)e->flags.cycles,
e->flags.mispred ? "M" : " ",
@@ -1188,7 +1188,8 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
e->flags.abort ? "A" : " ",
e->flags.in_tx ? "T" : " ",
(unsigned)e->flags.reserved,
- get_branch_type(e));
+ get_branch_type(e),
+ e->flags.spec ? branch_spec_desc(e->flags.spec) : "");
} else {
if (i == 0) {
printf("..... %2"PRIu64": %016" PRIx64 "\n"
--
2.34.1


2023-02-02 12:27:40

by Sandipan Das

Subject: [PATCH v4 3/4] perf test sample-parsing: Update expected branch flags

The bitfield swapping scheme used during sample parsing has changed
because of the addition of new branch flags, namely "spec", "new_type"
and "priv". Earlier, these were all part of the "reserved" field but
now each of these fields gets swapped separately. Change the expected
flag values accordingly for the test to pass.

E.g.

$ perf test -v 27

Before:

27: Sample parsing :
--- start ---
test child forked, pid 61979
parsing failed for sample_type 0x800
test child finished with -1
---- end ----
Sample parsing: FAILED!

After:

27: Sample parsing :
--- start ---
test child forked, pid 63293
test child finished with 0
---- end ----
Sample parsing: Ok

Signed-off-by: Sandipan Das <[email protected]>
---
tools/perf/tests/sample-parsing.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index 927c7f0cc4cc..25a3f6cece50 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -37,7 +37,7 @@
* in branch_stack variable.
*/
#define BS_EXPECTED_BE 0xa000d00000000000
-#define BS_EXPECTED_LE 0xd5000000
+#define BS_EXPECTED_LE 0x1aa00000000
#define FLAG(s) s->branch_stack->entries[i].flags

static bool samples_same(const struct perf_sample *s1,
--
2.34.1


2023-02-02 12:28:00

by Sandipan Das

Subject: [PATCH v4 4/4] perf test brstack: Update regex to include spec field

With the addition of new branch flags, the "brstacksym" field in perf
script output now shows speculation information after the branch type.
Change the regular expressions accordingly for the test to pass. Since
branch speculation information may vary across platforms, the test does
not look for specific values.

E.g.

$ perf test -v 110

Before:

110: Check branch stack sampling :
--- start ---
test child forked, pid 54154
Testing user branch stack sampling
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL$ /tmp/__perf_test.program.AfhUI/perf.script
+ cleanup
+ rm -rf /tmp/__perf_test.program.AfhUI
test child finished with -1
---- end ----
Check branch stack sampling: FAILED!

After:

110: Check branch stack sampling :
--- start ---
test child forked, pid 43716
Testing user branch stack sampling
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x66/brstack_foo+0x0/P/-/-/0/IND_CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_foo+0x1b/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x58/brstack_foo+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x5d/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bar\+[^ ]*/brstack_foo\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bar+0x31/brstack_foo+0x20/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_foo+0x36/brstack_bench+0x5d/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x76/brstack_bench+0x7d/P/-/-/0/COND/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack+0x5a/brstack+0x41/P/-/-/0/UNCOND/NON_SPEC_CORRECT_PATH
+ set +x
Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ)
Testing branch stack filtering permutation (call,CALL|SYSCALL)
Testing branch stack filtering permutation (cond,COND)
Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET)
Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND)
Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND)
Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET)
test child finished with 0
---- end ----
Check branch stack sampling: Ok

Signed-off-by: Sandipan Das <[email protected]>
---
tools/perf/tests/shell/test_brstack.sh | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/perf/tests/shell/test_brstack.sh b/tools/perf/tests/shell/test_brstack.sh
index 59195eb80052..1c49d8293003 100755
--- a/tools/perf/tests/shell/test_brstack.sh
+++ b/tools/perf/tests/shell/test_brstack.sh
@@ -30,14 +30,14 @@ test_user_branches() {
# brstack_foo+0x14/brstack_bar+0x40/P/-/-/0/CALL

set -x
- grep -E -m1 "^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_foo\+[^ ]*/brstack_bar\+[^ ]*/CALL$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/CALL$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_bench\+[^ ]*/brstack_bar\+[^ ]*/CALL$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_bar\+[^ ]*/brstack_foo\+[^ ]*/RET$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET$" $TMPDIR/perf.script
- grep -E -m1 "^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND$" $TMPDIR/perf.script
- grep -E -m1 "^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_foo\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/CALL/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_bench\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_bar\+[^ ]*/brstack_foo\+[^ ]*/RET/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND/.*$" $TMPDIR/perf.script
+ grep -E -m1 "^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND/.*$" $TMPDIR/perf.script
set +x

# some branch types are still not being tested:
@@ -57,7 +57,7 @@ test_filter() {

# fail if we find any branch type that doesn't match any of the expected ones
# also consider UNKNOWN branch types (-)
- if grep -E -vm1 "^[^ ]*/($expect|-|( *))$" $TMPDIR/perf.script; then
+ if grep -E -vm1 "^[^ ]*/($expect|-|( *))/.*$" $TMPDIR/perf.script; then
return 1
fi
}
--
2.34.1


2023-02-02 13:04:26

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v4 0/4] tools perf: Add branch speculation info

On Thu, Feb 02, 2023 at 05:56:13PM +0530, Sandipan Das wrote:
> AMD Last Branch Record Extension Version 2 (LbrExtV2) provides branch
> speculation information and the perf UAPI is extended to provide this in
> a generic way. Make perf tool show this additional information.
>
> The UAPI changes can be found in commit 93315e46b000 ("perf/core: Add
> speculation info to branch entries").
>
> Requesting help from folks having access to big-endian systems to test
> changes in the sample parsing test as I was only able to test these in
> a ppc64 simulator.

I'll try folding some of these patches as 'perf test' must pass after
each of them, so that we keep the codebase bisectable.

Right now, after applying the first patch of this v4 series:

⬢[acme@toolbox perf]$ perf test 27
27: Sample parsing : FAILED!
⬢[acme@toolbox perf]$

- Arnaldo

> Previous versions can be found at:
> v3: https://lore.kernel.org/all/[email protected]/
> v2: https://lore.kernel.org/all/[email protected]/
> v1: https://lore.kernel.org/all/[email protected]/
>
> Changes in v4:
> - Update tests that were failing due to changes in perf output and
> sample parsing (thanks to Arnaldo for reporting).
>
> Changes in v3:
> - Drop tools-side UAPI changes as they have already been added by other
> commits.
> - Rebase on top of latest perf/core.
>
> Changes in v2:
> - Drop msr-index.h related changes for now.
> - Rebase on top of latest perf/core.
> - Fix UAPI breakage introduced by the ARM64 BRBE changes to perf branch
> entry.
>
> Sandipan Das (4):
> perf script: Show branch speculation info
> perf session: Show branch speculation info in raw dump
> perf test sample-parsing: Update expected branch flags
> perf test brstack: Update regex to include spec field
>
> tools/perf/builtin-script.c | 5 +++--
> tools/perf/tests/sample-parsing.c | 2 +-
> tools/perf/tests/shell/test_brstack.sh | 18 +++++++++---------
> tools/perf/util/branch.c | 15 +++++++++++++++
> tools/perf/util/branch.h | 2 ++
> tools/perf/util/evsel.c | 15 ++++++++++++---
> tools/perf/util/session.c | 5 +++--
> 7 files changed, 45 insertions(+), 17 deletions(-)
>
> --
> 2.34.1
>

--

- Arnaldo

2023-02-02 13:10:42

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v4 0/4] tools perf: Add branch speculation info

On Thu, Feb 02, 2023 at 10:04:12AM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Feb 02, 2023 at 05:56:13PM +0530, Sandipan Das wrote:
> > AMD Last Branch Record Extension Version 2 (LbrExtV2) provides branch
> > speculation information and the perf UAPI is extended to provide this in
> > a generic way. Make perf tool show this additional information.
> >
> > The UAPI changes can be found in commit 93315e46b000 ("perf/core: Add
> > speculation info to branch entries").
> >
> > Requesting help from folks having access to big-endian systems to test
> > changes in the sample parsing test as I was only able to test these in
> > a ppc64 simulator.
>
> I'll try folding some of these patches as 'perf test' must pass after
> each of them, so that we keep the codebase bisectable.
>
> Right now, after applying the first patch of this v4 series:
>
> ⬢[acme@toolbox perf]$ perf test 27
> 27: Sample parsing : FAILED!
> ⬢[acme@toolbox perf]$

So this is what I did:

$ git rebase -i HEAD~4
pick 266d6702711d299c perf script: Show branch speculation info
squash d2fa279aba8d2863 perf test sample-parsing: Update expected branch flags
pick b335ad966cadcbfa perf session: Show branch speculation info in raw dump
squash 272ce62f64e60fc7 perf test brstack: Update regex to include spec field

And then combined the commit messages. Please keep bisectability in
mind: run 'perf test' and, if it fails, add the fix for 'perf test' to
the patch that introduced the problem.

Thanks,

- Arnaldo

2023-02-02 13:14:54

by Sandipan Das

Subject: Re: [PATCH v4 0/4] tools perf: Add branch speculation info

On 2/2/2023 6:40 PM, Arnaldo Carvalho de Melo wrote:
> On Thu, Feb 02, 2023 at 10:04:12AM -0300, Arnaldo Carvalho de Melo wrote:
>> On Thu, Feb 02, 2023 at 05:56:13PM +0530, Sandipan Das wrote:
>>> AMD Last Branch Record Extension Version 2 (LbrExtV2) provides branch
>>> speculation information and the perf UAPI is extended to provide this in
>>> a generic way. Make perf tool show this additional information.
>>>
>>> The UAPI changes can be found in commit 93315e46b000 ("perf/core: Add
>>> speculation info to branch entries").
>>>
>>> Requesting help from folks having access to big-endian systems to test
>>> changes in the sample parsing test as I was only able to test these in
>>> a ppc64 simulator.
>>
>> I'll try folding some of these patches as 'perf test' must pass after
>> each of them, so that we keep the codebase bisectable.
>>
>> Right now, after applying the first patch of this v4 series:
>>
>> ⬢[acme@toolbox perf]$ perf test 27
>> 27: Sample parsing : FAILED!
>> ⬢[acme@toolbox perf]$
>
> So this is what I did:
>
> $ git rebase -i HEAD~4
> pick 266d6702711d299c perf script: Show branch speculation info
> squash d2fa279aba8d2863 perf test sample-parsing: Update expected branch flags
> pick b335ad966cadcbfa perf session: Show branch speculation info in raw dump
> squash 272ce62f64e60fc7 perf test brstack: Update regex to include spec field
>
> And then combined the commit messages. Please keep bisectability in
> mind: run 'perf test' and, if it fails, add the fix for 'perf test' to
> the patch that introduced the problem.
>

Sure. Thanks for the cleanup.

- Sandipan