Sometimes a small change in a hot function reduces the cycles of
that function, but the overall workload doesn't get faster. It is
interesting to see where the cycles have moved to.
What we would like is to diff the before/after streams. A stream is the
branch history aggregated from the branch records in perf samples; for
example, the callchains aggregated from the branch records.
By browsing the hot streams, we can understand the hot code path.
By comparing the cycles variation of the same streams between the old
perf data and the new perf data, we can understand whether the cycles
have moved to other code.
The before stream is the stream in perf.data.old. The after stream
is the stream in perf.data.
Diffing before/after streams compares the top N hottest streams between
the two perf data files.
If all entries of one stream in perf.data.old fully match all entries
of another stream in perf.data, we consider the two streams matched;
otherwise the streams are not matched.
For example,
cycles: 1, hits: 26.80% cycles: 1, hits: 27.30%
-------------------------- --------------------------
main div.c:39 main div.c:39
main div.c:44 main div.c:44
The above streams are matched, and we can see that for the same stream
the cycles (1) are equal and the callchain hit percentages are slightly
changed (26.80% vs. 27.30%). That's expected.
Now let's see examples.
perf record -b ... Generate perf.data.old with branch data
perf record -b ... Generate perf.data with branch data
perf diff --stream
[ Matched hot streams ]
hot chain pair 1:
cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
--------------------------- --------------------------
main div.c:39 main div.c:39
main div.c:44 main div.c:44
hot chain pair 2:
cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
--------------------------- --------------------------
__random_r random_r.c:360 __random_r random_r.c:360
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:380 __random_r random_r.c:380
__random_r random_r.c:357 __random_r random_r.c:357
__random random.c:293 __random random.c:293
__random random.c:293 __random random.c:293
__random random.c:291 __random random.c:291
__random random.c:291 __random random.c:291
__random random.c:291 __random random.c:291
__random random.c:288 __random random.c:288
rand rand.c:27 rand rand.c:27
rand rand.c:26 rand rand.c:26
rand@plt rand@plt
rand@plt rand@plt
compute_flag div.c:25 compute_flag div.c:25
compute_flag div.c:22 compute_flag div.c:22
main div.c:40 main div.c:40
main div.c:40 main div.c:40
main div.c:39 main div.c:39
hot chain pair 3:
cycles: 9, hits: 4.48% cycles: 6, hits: 4.51%
--------------------------- --------------------------
__random_r random_r.c:360 __random_r random_r.c:360
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:380 __random_r random_r.c:380
[ Hot streams in old perf data only ]
hot chain 1:
cycles: 18, hits: 6.75%
--------------------------
__random_r random_r.c:360
__random_r random_r.c:388
__random_r random_r.c:388
__random_r random_r.c:380
__random_r random_r.c:357
__random random.c:293
__random random.c:293
__random random.c:291
__random random.c:291
__random random.c:291
__random random.c:288
rand rand.c:27
rand rand.c:26
rand@plt
rand@plt
compute_flag div.c:25
compute_flag div.c:22
main div.c:40
hot chain 2:
cycles: 29, hits: 2.78%
--------------------------
compute_flag div.c:22
main div.c:40
main div.c:40
main div.c:39
[ Hot streams in new perf data only ]
hot chain 1:
cycles: 4, hits: 4.54%
--------------------------
main div.c:42
compute_flag div.c:28
hot chain 2:
cycles: 5, hits: 3.51%
--------------------------
main div.c:39
main div.c:44
main div.c:42
compute_flag div.c:28
v6:
---
Rebase to perf/core
v5:
---
1. Remove enum stream_type
2. Rebase to perf/core
v4:
---
The previous version was too big and very hard to review.
1. v4 removes the code that supports the source line mapping
table and removes the source line based comparison. Now we
only support the basic functionality of stream comparison.
2. Refactor the code in a generic way.
v3:
---
v2 has 14 patches, which is hard to review.
v3 has only 7 patches, for basic stream comparison.
Jin Yao (7):
perf util: Create streams
perf util: Get the evsel_streams by evsel_idx
perf util: Compare two streams
perf util: Link stream pair
perf util: Calculate the sum of total streams hits
perf util: Report hot streams
perf diff: Support hot streams comparison
tools/perf/Documentation/perf-diff.txt | 4 +
tools/perf/builtin-diff.c | 133 +++++++++-
tools/perf/util/Build | 1 +
tools/perf/util/callchain.c | 99 ++++++++
tools/perf/util/callchain.h | 9 +
tools/perf/util/stream.c | 324 +++++++++++++++++++++++++
tools/perf/util/stream.h | 34 +++
7 files changed, 591 insertions(+), 13 deletions(-)
create mode 100644 tools/perf/util/stream.c
create mode 100644 tools/perf/util/stream.h
--
2.17.1
In the previous patch, we created the evsel_streams array.
This patch returns the evsel_streams that matches the given
evsel_idx.
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Rebase to perf/core
v4:
- Rename the patch from 'perf util: Return per-event callchain
streams' to 'perf util: Get the evsel_streams by evsel_idx'
tools/perf/util/stream.c | 11 +++++++++++
tools/perf/util/stream.h | 3 +++
2 files changed, 14 insertions(+)
diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
index 015c1d07ce3a..7882a7f05d97 100644
--- a/tools/perf/util/stream.c
+++ b/tools/perf/util/stream.c
@@ -146,3 +146,14 @@ struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
return es;
}
+
+struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
+ int nr_evsel, int evsel_idx)
+{
+ for (int i = 0; i < nr_evsel; i++) {
+ if (es[i].evsel_idx == evsel_idx)
+ return &es[i];
+ }
+
+ return NULL;
+}
diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
index c6844c5787cb..66f61d954eef 100644
--- a/tools/perf/util/stream.h
+++ b/tools/perf/util/stream.h
@@ -20,4 +20,7 @@ struct evlist;
struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
int nr_streams_max);
+struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
+ int nr_evsel, int evsel_idx);
+
#endif /* __PERF_STREAM_H */
--
2.17.1
We define a stream as the branch history aggregated from the branch
records in perf samples. For example, the callchains aggregated from
the branch records are considered streams.
By browsing the hot streams, we can understand the hot code path.
For now we only support the callchain as a stream. To measure how hot
a stream is, we use callchain_node->hit; higher is hotter.
There may be many callchains sampled, so we only focus on the top
N hottest callchains. N is a user-defined parameter or a predefined
default value (nr_streams_max).
This patch creates an evsel_streams array with one element per event,
and saves the top N hottest streams in a stream array.
So now we can get the per-event top N hottest streams.
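The top-N selection described above can be sketched standalone as
follows. This is only an illustration under stated assumptions, not the
patch code: struct node and struct top_streams are hypothetical
stand-ins for callchain_node and evsel_streams, and the slot array is
supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct callchain_node: only the hit count. */
struct node {
	uint64_t hit;
};

/* Hypothetical stand-in for struct evsel_streams. */
struct top_streams {
	const struct node **slots;	/* caller-provided array of 'max' slots */
	int nr;				/* slots currently in use */
	int max;			/* N, the capacity */
};

/* Offer one candidate; keep it only if it is among the N hottest so far. */
static void top_streams_offer(struct top_streams *top, const struct node *cand)
{
	int i, coldest = 0;

	if (top->max <= 0)
		return;

	/* Free slot available: just take it. */
	if (top->nr < top->max) {
		top->slots[top->nr++] = cand;
		return;
	}

	/* N is small, so a linear scan for the coldest slot is enough. */
	for (i = 1; i < top->nr; i++) {
		if (top->slots[i]->hit < top->slots[coldest]->hit)
			coldest = i;
	}

	/* Replace the coldest slot only if the candidate is hotter. */
	if (cand->hit > top->slots[coldest]->hit)
		top->slots[coldest] = cand;
}
```

Each offer costs O(N). For a small N that is likely cheaper and simpler
than maintaining a heap.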
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Remove enum stream_type
- Rebase to perf/core
v4:
- Refactor the code
- Rename patch name from 'perf util: Create streams for managing
top N hottest callchains' to 'perf util: Create streams'
v2:
- Use zfree in free_evsel_streams().
tools/perf/util/Build | 1 +
tools/perf/util/stream.c | 148 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/stream.h | 23 ++++++
3 files changed, 172 insertions(+)
create mode 100644 tools/perf/util/stream.c
create mode 100644 tools/perf/util/stream.h
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index cd5e41960e64..6ffdf833cd51 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -101,6 +101,7 @@ perf-y += call-path.o
perf-y += rwsem.o
perf-y += thread-stack.o
perf-y += spark.o
+perf-y += stream.o
perf-$(CONFIG_AUXTRACE) += auxtrace.o
perf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
perf-$(CONFIG_AUXTRACE) += intel-pt.o
diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
new file mode 100644
index 000000000000..015c1d07ce3a
--- /dev/null
+++ b/tools/perf/util/stream.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Compare and figure out the top N hottest streams
+ * Copyright (c) 2020, Intel Corporation.
+ * Author: Jin Yao
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+#include <linux/zalloc.h>
+#include "debug.h"
+#include "hist.h"
+#include "sort.h"
+#include "stream.h"
+#include "evlist.h"
+
+static void free_evsel_streams(struct evsel_streams *es, int nr_evsel)
+{
+ for (int i = 0; i < nr_evsel; i++)
+ zfree(&es[i].streams);
+
+ free(es);
+}
+
+static struct evsel_streams *create_evsel_streams(int nr_evsel,
+ int nr_streams_max)
+{
+ struct evsel_streams *es;
+
+ es = calloc(nr_evsel, sizeof(struct evsel_streams));
+ if (!es)
+ return NULL;
+
+ for (int i = 0; i < nr_evsel; i++) {
+ struct evsel_streams *s = &es[i];
+
+ s->streams = calloc(nr_streams_max, sizeof(struct stream));
+ if (!s->streams)
+ goto err;
+
+ s->nr_streams_max = nr_streams_max;
+ s->evsel_idx = -1;
+ }
+
+ return es;
+
+err:
+ free_evsel_streams(es, nr_evsel);
+ return NULL;
+}
+
+/*
+ * The cnodes with high hit number are hot callchains.
+ */
+static void set_hot_cnode(struct evsel_streams *es,
+ struct callchain_node *cnode)
+{
+ int i, idx = 0;
+ u64 hit;
+
+ if (es->nr_streams < es->nr_streams_max) {
+ i = es->nr_streams;
+ es->streams[i].cnode = cnode;
+ es->nr_streams++;
+ return;
+ }
+
+ /*
+ * Considering a few number of hot streams, only use simple
+ * way to find the cnode with smallest hit number and replace.
+ */
+ hit = (es->streams[0].cnode)->hit;
+ for (i = 1; i < es->nr_streams; i++) {
+ if ((es->streams[i].cnode)->hit < hit) {
+ hit = (es->streams[i].cnode)->hit;
+ idx = i;
+ }
+ }
+
+ if (cnode->hit > hit)
+ es->streams[idx].cnode = cnode;
+}
+
+static void update_hot_callchain(struct hist_entry *he,
+ struct evsel_streams *es)
+{
+ struct rb_root *root = &he->sorted_chain;
+ struct rb_node *rb_node = rb_first(root);
+ struct callchain_node *cnode;
+
+ while (rb_node) {
+ cnode = rb_entry(rb_node, struct callchain_node, rb_node);
+ set_hot_cnode(es, cnode);
+ rb_node = rb_next(rb_node);
+ }
+}
+
+static void init_hot_callchain(struct hists *hists, struct evsel_streams *es)
+{
+ struct rb_node *next = rb_first_cached(&hists->entries);
+
+ while (next) {
+ struct hist_entry *he;
+
+ he = rb_entry(next, struct hist_entry, rb_node);
+ update_hot_callchain(he, es);
+ next = rb_next(&he->rb_node);
+ }
+}
+
+static int evlist_init_callchain_streams(struct evlist *evlist,
+ struct evsel_streams *es, int nr_es)
+{
+ struct evsel *pos;
+ int i = 0;
+
+ BUG_ON(nr_es < evlist->core.nr_entries);
+
+ evlist__for_each_entry(evlist, pos) {
+ struct hists *hists = evsel__hists(pos);
+
+ hists__output_resort(hists, NULL);
+ init_hot_callchain(hists, &es[i]);
+ es[i].evsel_idx = pos->idx;
+ i++;
+ }
+
+ return 0;
+}
+
+struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
+ int nr_streams_max)
+{
+ struct evsel_streams *es;
+ int nr_evsel = evlist->core.nr_entries, ret = -1;
+
+ es = create_evsel_streams(nr_evsel, nr_streams_max);
+ if (!es)
+ return NULL;
+
+ ret = evlist_init_callchain_streams(evlist, es, nr_evsel);
+ if (ret) {
+ free_evsel_streams(es, nr_evsel);
+ return NULL;
+ }
+
+ return es;
+}
diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
new file mode 100644
index 000000000000..c6844c5787cb
--- /dev/null
+++ b/tools/perf/util/stream.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_STREAM_H
+#define __PERF_STREAM_H
+
+#include "callchain.h"
+
+struct stream {
+ struct callchain_node *cnode;
+};
+
+struct evsel_streams {
+ struct stream *streams;
+ int nr_streams_max;
+ int nr_streams;
+ int evsel_idx;
+};
+
+struct evlist;
+
+struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
+ int nr_streams_max);
+
+#endif /* __PERF_STREAM_H */
--
2.17.1
In the previous patch, we created an evsel_streams for each event,
with the top N hottest streams saved in a stream array in
evsel_streams.
This patch compares all streams between two evsel_streams.
Once two streams are fully matched, they are linked as
a pair. From the pair, we can tell which streams are matched.
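As a rough standalone illustration of the linking step (not the patch
code), assume each stream carries its whole chain flattened into one
string key. That key, the struct layout and the function names below
are hypothetical, and strcmp() stands in for the real callchain
comparison.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stream: the chain is flattened into a single string key. */
struct stream {
	const char *key;
	const struct stream *pair;	/* NULL until linked */
};

/* Return the first stream in 'pairs' whose key equals base->key, or NULL. */
static struct stream *find_match(const struct stream *base,
				 struct stream *pairs, int nr_pairs)
{
	for (int i = 0; i < nr_pairs; i++) {
		if (strcmp(base->key, pairs[i].key) == 0)
			return &pairs[i];
	}

	return NULL;
}

/* Link every base stream to its match; return the number of pairs linked. */
static int link_streams(struct stream *base, int nr_base,
			struct stream *pairs, int nr_pairs)
{
	int linked = 0;

	for (int i = 0; i < nr_base; i++) {
		struct stream *p = find_match(&base[i], pairs, nr_pairs);

		if (!p)
			continue;

		/* Link both directions so either side can tell it is paired. */
		base[i].pair = p;
		p->pair = &base[i];
		linked++;
	}

	return linked;
}
```

Streams whose pair pointer stays NULL after this pass are the ones that
exist on one side only.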
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Remove enum stream_type
v4:
- New patch in v4.
tools/perf/util/stream.c | 40 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/stream.h | 4 ++++
2 files changed, 44 insertions(+)
diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
index 7882a7f05d97..e96e21d6e07b 100644
--- a/tools/perf/util/stream.c
+++ b/tools/perf/util/stream.c
@@ -157,3 +157,43 @@ struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
return NULL;
}
+
+static struct stream *stream_callchain_match(struct stream *base_stream,
+ struct evsel_streams *es_pair)
+{
+ for (int i = 0; i < es_pair->nr_streams; i++) {
+ struct stream *pair_stream = &es_pair->streams[i];
+
+ if (callchain_cnode_matched(base_stream->cnode,
+ pair_stream->cnode)) {
+ return pair_stream;
+ }
+ }
+
+ return NULL;
+}
+
+static struct stream *stream_match(struct stream *base_stream,
+ struct evsel_streams *es_pair)
+{
+ return stream_callchain_match(base_stream, es_pair);
+}
+
+static void stream_link(struct stream *base_stream, struct stream *pair_stream)
+{
+ base_stream->pair_cnode = pair_stream->cnode;
+ pair_stream->pair_cnode = base_stream->cnode;
+}
+
+void match_evsel_streams(struct evsel_streams *es_base,
+ struct evsel_streams *es_pair)
+{
+ for (int i = 0; i < es_base->nr_streams; i++) {
+ struct stream *base_stream = &es_base->streams[i];
+ struct stream *pair_stream;
+
+ pair_stream = stream_match(base_stream, es_pair);
+ if (pair_stream)
+ stream_link(base_stream, pair_stream);
+ }
+}
diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
index 66f61d954eef..2eb6f17a834e 100644
--- a/tools/perf/util/stream.h
+++ b/tools/perf/util/stream.h
@@ -6,6 +6,7 @@
struct stream {
struct callchain_node *cnode;
+ struct callchain_node *pair_cnode;
};
struct evsel_streams {
@@ -23,4 +24,7 @@ struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
int nr_evsel, int evsel_idx);
+void match_evsel_streams(struct evsel_streams *es_base,
+ struct evsel_streams *es_pair);
+
#endif /* __PERF_STREAM_H */
--
2.17.1
We report the streams in three separate sections:
1. "Matched hot streams"
2. "Hot streams in old perf data only"
3. "Hot streams in new perf data only"
For each stream, we report the cycles and the hot percent (hits%).
For example,
cycles: 2, hits: 4.08%
--------------------------
main div.c:42
compute_flag div.c:28
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Rebase to perf/core
v4:
- Remove "Hot chains in old perf data but source line changed
in new perf data"
tools/perf/util/callchain.c | 13 ++++
tools/perf/util/callchain.h | 2 +
tools/perf/util/stream.c | 123 ++++++++++++++++++++++++++++++++++++
tools/perf/util/stream.h | 3 +
4 files changed, 141 insertions(+)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 4f824bfcc072..1b60985690bb 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1699,3 +1699,16 @@ u64 callchain_total_hits(struct hists *hists)
return chain_hits;
}
+
+s64 callchain_avg_cycles(struct callchain_node *cnode)
+{
+ struct callchain_list *chain;
+ s64 cycles = 0;
+
+ list_for_each_entry(chain, &cnode->val, list) {
+ if (chain->srcline && chain->branch_count)
+ cycles += chain->cycles_count / chain->branch_count;
+ }
+
+ return cycles;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index ac5bea9c1eb7..5824134f983b 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -305,4 +305,6 @@ bool callchain_cnode_matched(struct callchain_node *base_cnode,
u64 callchain_total_hits(struct hists *hists);
+s64 callchain_avg_cycles(struct callchain_node *cnode);
+
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
index 642316078e40..2d3dc7361ef1 100644
--- a/tools/perf/util/stream.c
+++ b/tools/perf/util/stream.c
@@ -199,3 +199,126 @@ void match_evsel_streams(struct evsel_streams *es_base,
stream_link(base_stream, pair_stream);
}
}
+
+static void print_callchain_pair(struct stream *base_stream, int idx,
+ struct evsel_streams *es_base,
+ struct evsel_streams *es_pair)
+{
+ struct callchain_node *base_cnode = base_stream->cnode;
+ struct callchain_node *pair_cnode = base_stream->pair_cnode;
+ struct callchain_list *base_chain, *pair_chain;
+ char buf1[512], buf2[512], cbuf1[256], cbuf2[256];
+ char *s1, *s2;
+ double pct;
+
+ printf("\nhot chain pair %d:\n", idx);
+
+ pct = (double)base_cnode->hit / (double)es_base->streams_hits;
+ scnprintf(buf1, sizeof(buf1), "cycles: %" PRId64 ", hits: %.2f%%",
+ callchain_avg_cycles(base_cnode), pct * 100.0);
+
+ pct = (double)pair_cnode->hit / (double)es_pair->streams_hits;
+ scnprintf(buf2, sizeof(buf2), "cycles: %" PRId64 ", hits: %.2f%%",
+ callchain_avg_cycles(pair_cnode), pct * 100.0);
+
+ printf("%35s\t%35s\n", buf1, buf2);
+
+ printf("%35s\t%35s\n",
+ "---------------------------",
+ "--------------------------");
+
+ pair_chain = list_first_entry(&pair_cnode->val,
+ struct callchain_list,
+ list);
+
+ list_for_each_entry(base_chain, &base_cnode->val, list) {
+ if (&pair_chain->list == &pair_cnode->val)
+ return;
+
+ s1 = callchain_list__sym_name(base_chain, cbuf1, sizeof(cbuf1),
+ false);
+ s2 = callchain_list__sym_name(pair_chain, cbuf2, sizeof(cbuf2),
+ false);
+
+ scnprintf(buf1, sizeof(buf1), "%35s\t%35s", s1, s2);
+ printf("%s\n", buf1);
+ pair_chain = list_next_entry(pair_chain, list);
+ }
+}
+
+static void print_stream_callchain(struct stream *stream, int idx,
+ struct evsel_streams *es, bool pair)
+{
+ struct callchain_node *cnode = stream->cnode;
+ struct callchain_list *chain;
+ char buf[512], cbuf[256], *s;
+ double pct;
+
+ printf("\nhot chain %d:\n", idx);
+
+ pct = (double)cnode->hit / (double)es->streams_hits;
+ scnprintf(buf, sizeof(buf), "cycles: %" PRId64 ", hits: %.2f%%",
+ callchain_avg_cycles(cnode), pct * 100.0);
+
+ if (pair) {
+ printf("%35s\t%35s\n", "", buf);
+ printf("%35s\t%35s\n",
+ "", "--------------------------");
+ } else {
+ printf("%35s\n", buf);
+ printf("%35s\n", "--------------------------");
+ }
+
+ list_for_each_entry(chain, &cnode->val, list) {
+ s = callchain_list__sym_name(chain, cbuf, sizeof(cbuf), false);
+
+ if (pair)
+ scnprintf(buf, sizeof(buf), "%35s\t%35s", "", s);
+ else
+ scnprintf(buf, sizeof(buf), "%35s", s);
+
+ printf("%s\n", buf);
+ }
+}
+
+static void callchain_streams_report(struct evsel_streams *es_base,
+ struct evsel_streams *es_pair)
+{
+ struct stream *base_stream;
+ int i, idx = 0;
+
+ printf("[ Matched hot streams ]\n");
+ for (i = 0; i < es_base->nr_streams; i++) {
+ base_stream = &es_base->streams[i];
+ if (base_stream->pair_cnode) {
+ print_callchain_pair(base_stream, ++idx,
+ es_base, es_pair);
+ }
+ }
+
+ idx = 0;
+ printf("\n[ Hot streams in old perf data only ]\n");
+ for (i = 0; i < es_base->nr_streams; i++) {
+ base_stream = &es_base->streams[i];
+ if (!base_stream->pair_cnode) {
+ print_stream_callchain(base_stream, ++idx,
+ es_base, false);
+ }
+ }
+
+ idx = 0;
+ printf("\n[ Hot streams in new perf data only ]\n");
+ for (i = 0; i < es_pair->nr_streams; i++) {
+ base_stream = &es_pair->streams[i];
+ if (!base_stream->pair_cnode) {
+ print_stream_callchain(base_stream, ++idx,
+ es_pair, true);
+ }
+ }
+}
+
+void evsel_streams_report(struct evsel_streams *es_base,
+ struct evsel_streams *es_pair)
+{
+ return callchain_streams_report(es_base, es_pair);
+}
diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
index 56dfa90c810d..33b02c059d23 100644
--- a/tools/perf/util/stream.h
+++ b/tools/perf/util/stream.h
@@ -28,4 +28,7 @@ struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
void match_evsel_streams(struct evsel_streams *es_base,
struct evsel_streams *es_pair);
+void evsel_streams_report(struct evsel_streams *es_base,
+ struct evsel_streams *es_pair);
+
#endif /* __PERF_STREAM_H */
--
2.17.1
A stream is the branch history aggregated from the branch
records in perf samples. For now we support the callchain as a
stream.
If the callchain entries of one stream fully match the callchain
entries of another stream, we consider the two streams
matched.
For example,
cycles: 1, hits: 26.80% cycles: 1, hits: 27.30%
----------------------- -----------------------
main div.c:39 main div.c:39
main div.c:44 main div.c:44
The above two streams are matched (we don't consider the case where
the source code has changed).
The matching logic is: compare the chain strings first. If the strings
cannot be compared (match error), fall back to dso address comparison.
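A simplified standalone sketch of full matching with a string-first,
address-fallback comparison is shown below. It is an illustration, not
the patch code: struct entry and its addr field are hypothetical, and
the sketch assumes the address fallback is used whenever either source
line string is missing, which is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one callchain entry. */
struct entry {
	const char *srcline;	/* e.g. "div.c:39", or NULL when unresolved */
	uint64_t addr;		/* hypothetical dso-relative address */
};

/* Compare by source line string when possible, else by address. */
static bool entry_match(const struct entry *a, const struct entry *b)
{
	if (a->srcline && b->srcline)
		return strcmp(a->srcline, b->srcline) == 0;

	return a->addr == b->addr;
}

/* Chains match only if they have equal length and every entry matches. */
static bool chain_full_match(const struct entry *a, size_t na,
			     const struct entry *b, size_t nb)
{
	/* ABC vs. ABCD is not a full match; empty chains never match. */
	if (na != nb || na == 0)
		return false;

	for (size_t i = 0; i < na; i++) {
		if (!entry_match(&a[i], &b[i]))
			return false;
	}

	return true;
}
```

Comparing by source line first lets two chains match even when the
binaries were laid out at different addresses.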
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Remove enum stream_type
- Rebase to perf/core
v4:
- Remove original source line comparison code.
tools/perf/util/callchain.c | 54 +++++++++++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 4 +++
2 files changed, 58 insertions(+)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 2775b752f2fa..d356e73c5622 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1613,3 +1613,57 @@ void callchain_param_setup(u64 sample_type)
callchain_param.record_mode = CALLCHAIN_FP;
}
}
+
+static bool chain_match(struct callchain_list *base_chain,
+ struct callchain_list *pair_chain)
+{
+ enum match_result match;
+
+ match = match_chain_strings(base_chain->srcline,
+ pair_chain->srcline);
+ if (match != MATCH_ERROR)
+ return match == MATCH_EQ;
+
+ match = match_chain_dso_addresses(base_chain->ms.map,
+ base_chain->ip,
+ pair_chain->ms.map,
+ pair_chain->ip);
+
+ return match == MATCH_EQ;
+}
+
+bool callchain_cnode_matched(struct callchain_node *base_cnode,
+ struct callchain_node *pair_cnode)
+{
+ struct callchain_list *base_chain, *pair_chain;
+ bool match = false;
+
+ pair_chain = list_first_entry(&pair_cnode->val,
+ struct callchain_list,
+ list);
+
+ list_for_each_entry(base_chain, &base_cnode->val, list) {
+ if (&pair_chain->list == &pair_cnode->val)
+ return false;
+
+ if (!base_chain->srcline || !pair_chain->srcline) {
+ pair_chain = list_next_entry(pair_chain, list);
+ continue;
+ }
+
+ match = chain_match(base_chain, pair_chain);
+ if (!match)
+ return false;
+
+ pair_chain = list_next_entry(pair_chain, list);
+ }
+
+ /*
+ * Say chain1 is ABC, chain2 is ABCD, we consider they are
+ * not fully matched.
+ */
+ if (pair_chain && (&pair_chain->list != &pair_cnode->val))
+ return false;
+
+ return match;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index fe36a9e5ccd1..ad27fc8c7948 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -298,4 +298,8 @@ int callchain_branch_counts(struct callchain_root *root,
u64 *abort_count, u64 *cycles_count);
void callchain_param_setup(u64 sample_type);
+
+bool callchain_cnode_matched(struct callchain_node *base_cnode,
+ struct callchain_node *pair_cnode);
+
#endif /* __PERF_CALLCHAIN_H */
--
2.17.1
This patch adds the "--stream" option to perf-diff.
"--stream": Enable hot streams comparison
Now let's see examples.
perf record -b ... Generate perf.data.old with branch data
perf record -b ... Generate perf.data with branch data
perf diff --stream
[ Matched hot streams ]
hot chain pair 1:
cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
--------------------------- --------------------------
main div.c:39 main div.c:39
main div.c:44 main div.c:44
hot chain pair 2:
cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
--------------------------- --------------------------
__random_r random_r.c:360 __random_r random_r.c:360
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:380 __random_r random_r.c:380
__random_r random_r.c:357 __random_r random_r.c:357
__random random.c:293 __random random.c:293
__random random.c:293 __random random.c:293
__random random.c:291 __random random.c:291
__random random.c:291 __random random.c:291
__random random.c:291 __random random.c:291
__random random.c:288 __random random.c:288
rand rand.c:27 rand rand.c:27
rand rand.c:26 rand rand.c:26
rand@plt rand@plt
rand@plt rand@plt
compute_flag div.c:25 compute_flag div.c:25
compute_flag div.c:22 compute_flag div.c:22
main div.c:40 main div.c:40
main div.c:40 main div.c:40
main div.c:39 main div.c:39
hot chain pair 3:
cycles: 9, hits: 4.48% cycles: 6, hits: 4.51%
--------------------------- --------------------------
__random_r random_r.c:360 __random_r random_r.c:360
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:388 __random_r random_r.c:388
__random_r random_r.c:380 __random_r random_r.c:380
[ Hot streams in old perf data only ]
hot chain 1:
cycles: 18, hits: 6.75%
--------------------------
__random_r random_r.c:360
__random_r random_r.c:388
__random_r random_r.c:388
__random_r random_r.c:380
__random_r random_r.c:357
__random random.c:293
__random random.c:293
__random random.c:291
__random random.c:291
__random random.c:291
__random random.c:288
rand rand.c:27
rand rand.c:26
rand@plt
rand@plt
compute_flag div.c:25
compute_flag div.c:22
main div.c:40
hot chain 2:
cycles: 29, hits: 2.78%
--------------------------
compute_flag div.c:22
main div.c:40
main div.c:40
main div.c:39
[ Hot streams in new perf data only ]
hot chain 1:
cycles: 4, hits: 4.54%
--------------------------
main div.c:42
compute_flag div.c:28
hot chain 2:
cycles: 5, hits: 3.51%
--------------------------
main div.c:39
main div.c:44
main div.c:42
compute_flag div.c:28
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Remove enum stream_type
- Rebase to perf/core
v4:
- Remove the "--before" and "--after" options since they are for
source line based comparison. In this patchset, we will not
support source line based comparison.
tools/perf/Documentation/perf-diff.txt | 4 +
tools/perf/builtin-diff.c | 133 ++++++++++++++++++++++---
2 files changed, 124 insertions(+), 13 deletions(-)
diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index f50ca0fef0a4..be65bd55ab2a 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -182,6 +182,10 @@ OPTIONS
--tid=::
Only diff samples for given thread ID (comma separated list).
+--stream::
+ Enable hot streams comparison. Stream can be a callchain which is
+ aggregated by the branch records from samples.
+
COMPARISON
----------
The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index f8c9bdd8269a..d6db473cd010 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -25,6 +25,7 @@
#include "util/map.h"
#include "util/spark.h"
#include "util/block-info.h"
+#include "util/stream.h"
#include <linux/err.h>
#include <linux/zalloc.h>
#include <subcmd/pager.h>
@@ -42,6 +43,7 @@ struct perf_diff {
int range_size;
int range_num;
bool has_br_stack;
+ bool stream;
};
/* Diff command specific HPP columns. */
@@ -72,6 +74,8 @@ struct data__file {
struct perf_data data;
int idx;
struct hists *hists;
+ struct evsel_streams *evsel_streams;
+ int nr_evsel_streams;
struct diff_hpp_fmt fmt[PERF_HPP_DIFF__MAX_INDEX];
};
@@ -106,6 +110,7 @@ enum {
COMPUTE_DELTA_ABS,
COMPUTE_CYCLES,
COMPUTE_MAX,
+ COMPUTE_STREAM, /* After COMPUTE_MAX to avoid using the current compute arrays */
};
const char *compute_names[COMPUTE_MAX] = {
@@ -393,6 +398,11 @@ static int diff__process_sample_event(struct perf_tool *tool,
struct perf_diff *pdiff = container_of(tool, struct perf_diff, tool);
struct addr_location al;
struct hists *hists = evsel__hists(evsel);
+ struct hist_entry_iter iter = {
+ .evsel = evsel,
+ .sample = sample,
+ .ops = &hist_iter_normal,
+ };
int ret = -1;
if (perf_time__ranges_skip_sample(pdiff->ptime_range, pdiff->range_num,
@@ -411,14 +421,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
goto out_put;
}
- if (compute != COMPUTE_CYCLES) {
- if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
- true)) {
- pr_warning("problem incrementing symbol period, "
- "skipping event\n");
- goto out_put;
- }
- } else {
+ switch (compute) {
+ case COMPUTE_CYCLES:
if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
NULL, NULL, sample, true)) {
pr_warning("problem incrementing symbol period, "
@@ -428,6 +432,23 @@ static int diff__process_sample_event(struct perf_tool *tool,
hist__account_cycles(sample->branch_stack, &al, sample, false,
NULL);
+ break;
+
+ case COMPUTE_STREAM:
+ if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
+ NULL)) {
+ pr_debug("problem adding hist entry, skipping event\n");
+ goto out_put;
+ }
+ break;
+
+ default:
+ if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
+ true)) {
+ pr_warning("problem incrementing symbol period, "
+ "skipping event\n");
+ goto out_put;
+ }
}
/*
@@ -996,6 +1017,50 @@ static void data_process(void)
}
}
+static int process_base_stream(struct data__file *data_base,
+ struct data__file *data_pair,
+ const char *title __maybe_unused)
+{
+ struct evlist *evlist_base = data_base->session->evlist;
+ struct evlist *evlist_pair = data_pair->session->evlist;
+ struct evsel *evsel_base, *evsel_pair;
+ struct evsel_streams *es_base, *es_pair;
+
+ evlist__for_each_entry(evlist_base, evsel_base) {
+ evsel_pair = evsel_match(evsel_base, evlist_pair);
+ if (!evsel_pair)
+ continue;
+
+ es_base = evsel_streams_get(data_base->evsel_streams,
+ data_base->nr_evsel_streams,
+ evsel_base->idx);
+ if (!es_base)
+ return -1;
+
+ es_pair = evsel_streams_get(data_pair->evsel_streams,
+ data_pair->nr_evsel_streams,
+ evsel_pair->idx);
+ if (!es_pair)
+ return -1;
+
+ match_evsel_streams(es_base, es_pair);
+ evsel_streams_report(es_base, es_pair);
+ }
+
+ return 0;
+}
+
+static void stream_process(void)
+{
+ /*
+ * Stream comparison only supports two data files.
+ * perf.data.old and perf.data. data__files[0] is perf.data.old,
+ * data__files[1] is perf.data.
+ */
+ process_base_stream(&data__files[0], &data__files[1],
+ "# Output based on old perf data:\n#\n");
+}
+
static void data__free(struct data__file *d)
{
int col;
@@ -1109,6 +1174,18 @@ static int check_file_brstack(void)
return 0;
}
+static struct evsel_streams *create_evsel_streams(struct evlist *evlist,
+ int nr_streams_max,
+ int *nr_evsel_streams)
+{
+ struct evsel_streams *es;
+
+ es = perf_evlist__create_streams(evlist, nr_streams_max);
+ *nr_evsel_streams = evlist->core.nr_entries;
+
+ return es;
+}
+
static int __cmd_diff(void)
{
struct data__file *d;
@@ -1153,9 +1230,21 @@ static int __cmd_diff(void)
if (pdiff.ptime_range)
zfree(&pdiff.ptime_range);
+
+ if (compute == COMPUTE_STREAM) {
+ d->evsel_streams = create_evsel_streams(
+ d->session->evlist,
+ 5,
+ &d->nr_evsel_streams);
+ if (!d->evsel_streams)
+ goto out_delete;
+ }
}
- data_process();
+ if (compute == COMPUTE_STREAM)
+ stream_process();
+ else
+ data_process();
out_delete:
data__for_each_file(i, d) {
@@ -1228,6 +1317,8 @@ static const struct option options[] = {
"only consider symbols in these pids"),
OPT_STRING(0, "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
"only consider symbols in these tids"),
+ OPT_BOOLEAN(0, "stream", &pdiff.stream,
+ "Enable hot streams comparison."),
OPT_END()
};
@@ -1887,6 +1978,9 @@ int cmd_diff(int argc, const char **argv)
if (cycles_hist && (compute != COMPUTE_CYCLES))
usage_with_options(diff_usage, options);
+ if (pdiff.stream)
+ compute = COMPUTE_STREAM;
+
symbol__annotation_init();
if (symbol__init(NULL) < 0)
@@ -1898,13 +1992,26 @@ int cmd_diff(int argc, const char **argv)
if (check_file_brstack() < 0)
return -1;
- if (compute == COMPUTE_CYCLES && !pdiff.has_br_stack)
+ if ((compute == COMPUTE_CYCLES || compute == COMPUTE_STREAM)
+ && !pdiff.has_br_stack) {
return -1;
+ }
- if (ui_init() < 0)
- return -1;
+ if (compute == COMPUTE_STREAM) {
+ symbol_conf.show_branchflag_count = true;
+ symbol_conf.disable_add2line_warn = true;
+ callchain_param.mode = CHAIN_FLAT;
+ callchain_param.key = CCKEY_SRCLINE;
+ callchain_param.branch_callstack = 1;
+ symbol_conf.use_callchain = true;
+ callchain_register_param(&callchain_param);
+ sort_order = "srcline,symbol,dso";
+ } else {
+ if (ui_init() < 0)
+ return -1;
- sort__mode = SORT_MODE__DIFF;
+ sort__mode = SORT_MODE__DIFF;
+ }
if (setup_sorting(NULL) < 0)
usage_with_options(diff_usage, options);
--
2.17.1
We already use callchain_node->hit to measure how hot one stream is.
This patch calculates the sum of hits over all streams, so that the
next patch can report the hot percentage of one stream with the
following formula:
  hot percent = callchain_node->hit / sum of total hits
Signed-off-by: Jin Yao <[email protected]>
---
v6:
- Rebase to perf/core
v5:
- Rebase to perf/core
v4:
- No functional change.
v2:
- Combine the variable decl line with its initial assignment
in total_callchain_hits().
tools/perf/util/callchain.c | 32 ++++++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 3 +++
tools/perf/util/stream.c | 2 ++
tools/perf/util/stream.h | 1 +
4 files changed, 38 insertions(+)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index d356e73c5622..4f824bfcc072 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1667,3 +1667,35 @@ bool callchain_cnode_matched(struct callchain_node *base_cnode,
return match;
}
+
+static u64 count_callchain_hits(struct hist_entry *he)
+{
+ struct rb_root *root = &he->sorted_chain;
+ struct rb_node *rb_node = rb_first(root);
+ struct callchain_node *node;
+ u64 chain_hits = 0;
+
+ while (rb_node) {
+ node = rb_entry(rb_node, struct callchain_node, rb_node);
+ chain_hits += node->hit;
+ rb_node = rb_next(rb_node);
+ }
+
+ return chain_hits;
+}
+
+u64 callchain_total_hits(struct hists *hists)
+{
+ struct rb_node *next = rb_first_cached(&hists->entries);
+ u64 chain_hits = 0;
+
+ while (next) {
+ struct hist_entry *he = rb_entry(next, struct hist_entry,
+ rb_node);
+
+ chain_hits += count_callchain_hits(he);
+ next = rb_next(&he->rb_node);
+ }
+
+ return chain_hits;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index ad27fc8c7948..ac5bea9c1eb7 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -13,6 +13,7 @@ struct ip_callchain;
struct map;
struct perf_sample;
struct thread;
+struct hists;
#define HELP_PAD "\t\t\t\t"
@@ -302,4 +303,6 @@ void callchain_param_setup(u64 sample_type);
bool callchain_cnode_matched(struct callchain_node *base_cnode,
struct callchain_node *pair_cnode);
+u64 callchain_total_hits(struct hists *hists);
+
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
index e96e21d6e07b..642316078e40 100644
--- a/tools/perf/util/stream.c
+++ b/tools/perf/util/stream.c
@@ -106,6 +106,8 @@ static void init_hot_callchain(struct hists *hists, struct evsel_streams *es)
update_hot_callchain(he, es);
next = rb_next(&he->rb_node);
}
+
+ es->streams_hits = callchain_total_hits(hists);
}
static int evlist_init_callchain_streams(struct evlist *evlist,
diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
index 2eb6f17a834e..56dfa90c810d 100644
--- a/tools/perf/util/stream.h
+++ b/tools/perf/util/stream.h
@@ -14,6 +14,7 @@ struct evsel_streams {
int nr_streams_max;
int nr_streams;
int evsel_idx;
+ u64 streams_hits;
};
struct evlist;
--
2.17.1
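The hot-percent formula from the commit message above can be sketched in isolation. This is only an illustration under stated assumptions: `demo_cnode`, `demo_total_hits()` and `demo_hot_percent()` are hypothetical stand-ins, not perf APIs, and a flat array replaces the rb-tree walk done by callchain_total_hits().

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the hit counter carried by callchain_node. */
struct demo_cnode {
	uint64_t hit;
};

/* Sum the hits of every chain (a flat array here, an rb-tree walk in perf). */
static uint64_t demo_total_hits(const struct demo_cnode *nodes, size_t nr)
{
	uint64_t total = 0;

	for (size_t i = 0; i < nr; i++)
		total += nodes[i].hit;

	return total;
}

/* hot percent = hit / sum of total hits, guarded against a zero total. */
static double demo_hot_percent(uint64_t hit, uint64_t total)
{
	return total ? 100.0 * (double)hit / (double)total : 0.0;
}
```

The zero-total guard is an addition of this sketch; the patch itself only stores the sum in es->streams_hits and leaves the division to a later patch.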
On Fri, Sep 11, 2020 at 04:03:46PM +0800, Jin Yao wrote:
SNIP
> main div.c:40
> main div.c:40
> main div.c:39
>
> [ Hot streams in new perf data only ]
>
> hot chain 1:
> cycles: 4, hits: 4.54%
> --------------------------
> main div.c:42
> compute_flag div.c:28
>
> hot chain 2:
> cycles: 5, hits: 3.51%
> --------------------------
> main div.c:39
> main div.c:44
> main div.c:42
> compute_flag div.c:28
>
> v6:
> ---
> Rebase to perf/core
it looks good to me
Acked-by: Jiri Olsa <[email protected]>
thanks,
jirka
Em Thu, Sep 17, 2020 at 03:05:56PM +0200, Jiri Olsa escreveu:
> On Fri, Sep 11, 2020 at 04:03:46PM +0800, Jin Yao wrote:
>
> SNIP
>
> > main div.c:40
> > main div.c:40
> > main div.c:39
> >
> > [ Hot streams in new perf data only ]
> >
> > hot chain 1:
> > cycles: 4, hits: 4.54%
> > --------------------------
> > main div.c:42
> > compute_flag div.c:28
> >
> > hot chain 2:
> > cycles: 5, hits: 3.51%
> > --------------------------
> > main div.c:39
> > main div.c:44
> > main div.c:42
> > compute_flag div.c:28
> >
> > v6:
> > ---
> > Rebase to perf/core
>
> it looks good to me
>
> Acked-by: Jiri Olsa <[email protected]>
Jin,
I'm sorry I only got to look at this now. There are some issues;
I'll try to point them out patch by patch.
Thanks,
- Arnaldo
Em Fri, Sep 11, 2020 at 04:03:48PM +0800, Jin Yao escreveu:
> In the previous patch, we created the evsel_streams array.
>
> This patch returns the specified evsel_streams according to the
> evsel_idx.
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v6:
> - Rebase to perf/core
>
> v5:
> - Rebase to perf/core
>
> v4:
> - Rename the patch from 'perf util: Return per-event callchain
> streams' to 'perf util: Get the evsel_streams by evsel_idx'
>
> tools/perf/util/stream.c | 11 +++++++++++
> tools/perf/util/stream.h | 3 +++
> 2 files changed, 14 insertions(+)
>
> diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
> index 015c1d07ce3a..7882a7f05d97 100644
> --- a/tools/perf/util/stream.c
> +++ b/tools/perf/util/stream.c
> @@ -146,3 +146,14 @@ struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
>
> return es;
> }
> +
> +struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
> + int nr_evsel, int evsel_idx)
foo__get() is the idiom for the method that bumps the refcount_t of
'struct foo', so please rename it to:
struct evsel_streams *evsel_streams__entry(struct evsel_streams *es, int nr_evsel, int evsel_idx)
Also please consider having the array and the number of entries in
'struct evsel_streams', so that you don't have to always pass the
number of entries around.
> +{
> + for (int i = 0; i < nr_evsel; i++) {
> + if (es[i].evsel_idx == evsel_idx)
> + return &es[i];
> + }
> +
> + return NULL;
> +}
> diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
> index c6844c5787cb..66f61d954eef 100644
> --- a/tools/perf/util/stream.h
> +++ b/tools/perf/util/stream.h
> @@ -20,4 +20,7 @@ struct evlist;
> struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
> int nr_streams_max);
>
> +struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
> + int nr_evsel, int evsel_idx);
> +
> #endif /* __PERF_STREAM_H */
> --
> 2.17.1
>
--
- Arnaldo
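The suggestion above, keeping the array and its entry count in one struct so callers need not pass the count around, might look roughly like the sketch below. All names (`demo_ev_streams`, `demo_evlist_streams`, `demo_evlist_streams__entry()`) are hypothetical and chosen only to follow the struct__method naming asked for in the review; this is not the code that was eventually merged.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down per-event record. */
struct demo_ev_streams {
	int evsel_idx;
	int nr_streams;
};

/* Hypothetical container: the array travels together with its length. */
struct demo_evlist_streams {
	struct demo_ev_streams *ev_streams;
	int nr_evsel;
};

/* Look up the per-event record by evsel index; returns NULL when absent. */
static struct demo_ev_streams *
demo_evlist_streams__entry(struct demo_evlist_streams *els, int evsel_idx)
{
	for (int i = 0; i < els->nr_evsel; i++) {
		if (els->ev_streams[i].evsel_idx == evsel_idx)
			return &els->ev_streams[i];
	}

	return NULL;
}
```

With the count stored next to the array, the lookup takes two arguments instead of three, and the name no longer collides with the foo__get() refcount idiom.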
Em Fri, Sep 11, 2020 at 04:03:47PM +0800, Jin Yao escreveu:
> We define a stream as the branch history aggregated from the branch
> records in perf samples. For example, the callchains aggregated from
> the branch records are considered streams. By browsing the hot
> streams, we can understand the hot code path.
>
> For now we only support callchains as streams. To measure how hot a
> stream is, we use callchain_node->hit; higher is hotter.
>
> There may be many callchains sampled, so we only focus on the top
> N hottest callchains. N is a user-defined parameter or a predefined
> default value (nr_streams_max).
>
> This patch creates an evsel_streams array per event, and saves
> the top N hottest streams in a stream array.
>
> So now we can get the per-event top N hottest streams.
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v6:
> - Rebase to perf/core
>
> v5:
> - Remove enum stream_type
> - Rebase to perf/core
>
> v4:
> - Refactor the code
> - Rename patch name from 'perf util: Create streams for managing
> top N hottest callchains' to 'perf util: Create streams'
>
> v2:
> - Use zfree in free_evsel_streams().
>
> tools/perf/util/Build | 1 +
> tools/perf/util/stream.c | 148 +++++++++++++++++++++++++++++++++++++++
> tools/perf/util/stream.h | 23 ++++++
> 3 files changed, 172 insertions(+)
> create mode 100644 tools/perf/util/stream.c
> create mode 100644 tools/perf/util/stream.h
>
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index cd5e41960e64..6ffdf833cd51 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -101,6 +101,7 @@ perf-y += call-path.o
> perf-y += rwsem.o
> perf-y += thread-stack.o
> perf-y += spark.o
> +perf-y += stream.o
> perf-$(CONFIG_AUXTRACE) += auxtrace.o
> perf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
> perf-$(CONFIG_AUXTRACE) += intel-pt.o
> diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
> new file mode 100644
> index 000000000000..015c1d07ce3a
> --- /dev/null
> +++ b/tools/perf/util/stream.c
> @@ -0,0 +1,148 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Compare and figure out the top N hottest streams
> + * Copyright (c) 2020, Intel Corporation.
> + * Author: Jin Yao
> + */
> +
> +#include <inttypes.h>
> +#include <stdlib.h>
> +#include <linux/zalloc.h>
> +#include "debug.h"
> +#include "hist.h"
> +#include "sort.h"
> +#include "stream.h"
> +#include "evlist.h"
> +
> +static void free_evsel_streams(struct evsel_streams *es, int nr_evsel)
Please name the functions with the struct name first, followed by __ and
then the method name, i.e. for this one:
static void evsel_streams__delete(struct evsel_streams *es, int nr_evsels)
One thing that made me curious: why isn't nr_evsels (plural please) in
'struct evsel_streams'?
> +{
> + for (int i = 0; i < nr_evsel; i++)
> + zfree(&es[i].streams);
> +
> + free(es);
> +}
> +
> +static struct evsel_streams *create_evsel_streams(int nr_evsel,
> + int nr_streams_max)
Oh, I see, you create just an array :-\
Please rename it to:
evsel_streams__new(int nr_evsels, int nr_streams_max)
> +{
> + struct evsel_streams *es;
> +
> + es = calloc(nr_evsel, sizeof(struct evsel_streams));
> + if (!es)
> + return NULL;
> +
> + for (int i = 0; i < nr_evsel; i++) {
> + struct evsel_streams *s = &es[i];
> +
> + s->streams = calloc(nr_streams_max, sizeof(struct stream));
> + if (!s->streams)
> + goto err;
> +
> + s->nr_streams_max = nr_streams_max;
> + s->evsel_idx = -1;
> + }
> +
> + return es;
> +
> +err:
> + free_evsel_streams(es, nr_evsel);
> + return NULL;
> +}
> +
> +/*
> + * The cnodes with high hit number are hot callchains.
> + */
> +static void set_hot_cnode(struct evsel_streams *es,
> + struct callchain_node *cnode)
Since it is static it's kind of OK not to have the prefix, but I'd do it
as:
static void evsel_streams__set_hot_cnode(...)
For ctags' sake, for instance, i.e. so that one can navigate in vim using ctags.
> +{
> + int i, idx = 0;
> + u64 hit;
> +
> + if (es->nr_streams < es->nr_streams_max) {
> + i = es->nr_streams;
> + es->streams[i].cnode = cnode;
> + es->nr_streams++;
> + return;
> + }
> +
> + /*
> + * Considering a few number of hot streams, only use simple
> + * way to find the cnode with smallest hit number and replace.
> + */
> + hit = (es->streams[0].cnode)->hit;
> + for (i = 1; i < es->nr_streams; i++) {
> + if ((es->streams[i].cnode)->hit < hit) {
> + hit = (es->streams[i].cnode)->hit;
> + idx = i;
> + }
> + }
> +
> + if (cnode->hit > hit)
> + es->streams[idx].cnode = cnode;
> +}
> +
> +static void update_hot_callchain(struct hist_entry *he,
> + struct evsel_streams *es)
> +{
> + struct rb_root *root = &he->sorted_chain;
> + struct rb_node *rb_node = rb_first(root);
> + struct callchain_node *cnode;
> +
> + while (rb_node) {
> + cnode = rb_entry(rb_node, struct callchain_node, rb_node);
> + set_hot_cnode(es, cnode);
> + rb_node = rb_next(rb_node);
> + }
> +}
> +
> +static void init_hot_callchain(struct hists *hists, struct evsel_streams *es)
> +{
> + struct rb_node *next = rb_first_cached(&hists->entries);
> +
> + while (next) {
> + struct hist_entry *he;
> +
> + he = rb_entry(next, struct hist_entry, rb_node);
> + update_hot_callchain(he, es);
> + next = rb_next(&he->rb_node);
> + }
> +}
> +
> +static int evlist_init_callchain_streams(struct evlist *evlist,
> + struct evsel_streams *es, int nr_es)
So here we miss just one extra _, i.e.:
+static int evlist__init_callchain_streams(struct evlist *evlist,
+ struct evsel_streams *es, int nr_es)
> +{
> + struct evsel *pos;
> + int i = 0;
> +
> + BUG_ON(nr_es < evlist->core.nr_entries);
> +
> + evlist__for_each_entry(evlist, pos) {
> + struct hists *hists = evsel__hists(pos);
> +
> + hists__output_resort(hists, NULL);
> + init_hot_callchain(hists, &es[i]);
> + es[i].evsel_idx = pos->idx;
> + i++;
> + }
> +
> + return 0;
> +}
> +
> +struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
> + int nr_streams_max)
And here it should be:
+struct evsel_streams *evlist__create_streams(struct evlist *evlist,
+ int nr_streams_max)
I.e. without the perf_ prefix, since that is for 'struct perf_evlist'
methods, that are in tools/lib/perf/, aka libperf
> +{
> + struct evsel_streams *es;
> + int nr_evsel = evlist->core.nr_entries, ret = -1;
> +
> + es = create_evsel_streams(nr_evsel, nr_streams_max);
Minor nitpick: we usually combine the declaration with the first
assignment, if possible, i.e.:
+ struct evsel_streams *es = create_evsel_streams(nr_evsel, nr_streams_max);
> + if (!es)
> + return NULL;
> +
> + ret = evlist_init_callchain_streams(evlist, es, nr_evsel);
> + if (ret) {
> + free_evsel_streams(es, nr_evsel);
> + return NULL;
> + }
> +
> + return es;
> +}
> diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
> new file mode 100644
> index 000000000000..c6844c5787cb
> --- /dev/null
> +++ b/tools/perf/util/stream.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_STREAM_H
> +#define __PERF_STREAM_H
> +
> +#include "callchain.h"
> +
> +struct stream {
> + struct callchain_node *cnode;
> +};
> +
> +struct evsel_streams {
> + struct stream *streams;
> + int nr_streams_max;
> + int nr_streams;
> + int evsel_idx;
> +};
> +
> +struct evlist;
> +
> +struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
> + int nr_streams_max);
> +
evlist__create_streams()
> +#endif /* __PERF_STREAM_H */
> --
> 2.17.1
>
--
- Arnaldo
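The top-N selection done by set_hot_cnode() in the patch quoted above (fill free slots first, then replace the coldest slot when a hotter chain arrives) can be sketched on plain hit counts. `DEMO_MAX`, `demo_top` and `demo_top__add()` are hypothetical names; the fixed capacity stands in for nr_streams_max, and the sketch stores hit counts rather than callchain_node pointers.

```c
#include <stdint.h>

#define DEMO_MAX 3	/* hypothetical capacity, stands in for nr_streams_max */

struct demo_top {
	uint64_t hits[DEMO_MAX];
	int nr;
};

/* Keep the DEMO_MAX largest hit counts seen so far. */
static void demo_top__add(struct demo_top *t, uint64_t hit)
{
	int idx = 0;

	/* A free slot is available: just append. */
	if (t->nr < DEMO_MAX) {
		t->hits[t->nr++] = hit;
		return;
	}

	/* Otherwise find the coldest slot with a linear scan... */
	for (int i = 1; i < t->nr; i++) {
		if (t->hits[i] < t->hits[idx])
			idx = i;
	}

	/* ...and replace it only if the newcomer is hotter. */
	if (hit > t->hits[idx])
		t->hits[idx] = hit;
}
```

The linear scan costs O(N) per insertion, which matches the patch's own comment that a simple approach is enough for a small number of hot streams; a min-heap would only pay off for a much larger N.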
Em Fri, Sep 11, 2020 at 04:03:50PM +0800, Jin Yao escreveu:
> In the previous patch, we created an evsel_streams for one event;
> the top N hottest streams are saved in a stream array in
> evsel_streams.
>
> This patch compares all streams between two evsel_streams.
>
> Once two streams are fully matched, they are linked as a pair.
> From the pair, we can tell which streams are matched.
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v6:
> - Rebase to perf/core
>
> v5:
> - Remove enum stream_type
>
> v4:
> - New patch in v4.
>
> tools/perf/util/stream.c | 40 ++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/stream.h | 4 ++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/tools/perf/util/stream.c b/tools/perf/util/stream.c
> index 7882a7f05d97..e96e21d6e07b 100644
> --- a/tools/perf/util/stream.c
> +++ b/tools/perf/util/stream.c
> @@ -157,3 +157,43 @@ struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
>
> return NULL;
> }
> +
> +static struct stream *stream_callchain_match(struct stream *base_stream,
> + struct evsel_streams *es_pair)
Please use stream__
> +{
> + for (int i = 0; i < es_pair->nr_streams; i++) {
> + struct stream *pair_stream = &es_pair->streams[i];
> +
> + if (callchain_cnode_matched(base_stream->cnode,
> + pair_stream->cnode)) {
> + return pair_stream;
> + }
> + }
> +
> + return NULL;
> +}
> +
> +static struct stream *stream_match(struct stream *base_stream,
> + struct evsel_streams *es_pair)
> +{
> + return stream_callchain_match(base_stream, es_pair);
> +}
> +
> +static void stream_link(struct stream *base_stream, struct stream *pair_stream)
> +{
> + base_stream->pair_cnode = pair_stream->cnode;
> + pair_stream->pair_cnode = base_stream->cnode;
> +}
> +
> +void match_evsel_streams(struct evsel_streams *es_base,
> + struct evsel_streams *es_pair)
> +{
> + for (int i = 0; i < es_base->nr_streams; i++) {
> + struct stream *base_stream = &es_base->streams[i];
> + struct stream *pair_stream;
> +
> + pair_stream = stream_match(base_stream, es_pair);
> + if (pair_stream)
> + stream_link(base_stream, pair_stream);
> + }
> +}
> diff --git a/tools/perf/util/stream.h b/tools/perf/util/stream.h
> index 66f61d954eef..2eb6f17a834e 100644
> --- a/tools/perf/util/stream.h
> +++ b/tools/perf/util/stream.h
> @@ -6,6 +6,7 @@
>
> struct stream {
> struct callchain_node *cnode;
> + struct callchain_node *pair_cnode;
> };
>
> struct evsel_streams {
> @@ -23,4 +24,7 @@ struct evsel_streams *perf_evlist__create_streams(struct evlist *evlist,
> struct evsel_streams *evsel_streams_get(struct evsel_streams *es,
> int nr_evsel, int evsel_idx);
>
> +void match_evsel_streams(struct evsel_streams *es_base,
> + struct evsel_streams *es_pair);
> +
> #endif /* __PERF_STREAM_H */
> --
> 2.17.1
>
--
- Arnaldo
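The match-and-link step in the patch quoted above can be illustrated with a simplified model. This is a sketch under assumptions: a stream is reduced to an array of "symbol srcline" strings, "fully matched" is taken to mean same length and equal entries in order, and the link points at the paired stream rather than at its callchain_node. `demo_stream`, `demo_stream__matched()` and `demo_streams__match()` are hypothetical names; the real comparison lives in callchain_cnode_matched(), which is not shown in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DEPTH 4	/* hypothetical maximum chain depth */

/* Hypothetical stream: a chain of "symbol srcline" strings. */
struct demo_stream {
	const char *entries[DEMO_DEPTH];
	int nr_entries;
	struct demo_stream *pair;	/* set once a full match is found */
};

/* Streams match only if they have equal length and equal entries, in order. */
static bool demo_stream__matched(const struct demo_stream *a,
				 const struct demo_stream *b)
{
	if (a->nr_entries != b->nr_entries)
		return false;

	for (int i = 0; i < a->nr_entries; i++) {
		if (strcmp(a->entries[i], b->entries[i]))
			return false;
	}

	return true;
}

/* Link every base stream to the first matching pair stream, both ways. */
static void demo_streams__match(struct demo_stream *base, int nr_base,
				struct demo_stream *pair, int nr_pair)
{
	for (int i = 0; i < nr_base; i++) {
		for (int j = 0; j < nr_pair; j++) {
			if (demo_stream__matched(&base[i], &pair[j])) {
				base[i].pair = &pair[j];
				pair[j].pair = &base[i];
				break;
			}
		}
	}
}
```

Streams whose pair pointer stays NULL after the pass are the ones a report would list under "old perf data only" or "new perf data only".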
Em Fri, Sep 11, 2020 at 04:03:53PM +0800, Jin Yao escreveu:
> This patch enables perf-diff with "--stream" option.
>
> "--stream": Enable hot streams comparison
>
> Now let's see examples.
>
> perf record -b ... Generate perf.data.old with branch data
> perf record -b ... Generate perf.data with branch data
> perf diff --stream
>
> [ Matched hot streams ]
>
> hot chain pair 1:
> cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
> --------------------------- --------------------------
> main div.c:39 main div.c:39
> main div.c:44 main div.c:44
>
> hot chain pair 2:
> cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
> --------------------------- --------------------------
> __random_r random_r.c:360 __random_r random_r.c:360
Would it be interesting to get the associated source code and show right below
these file:number lines?
- Arnaldo
> __random_r random_r.c:388 __random_r random_r.c:388
> __random_r random_r.c:388 __random_r random_r.c:388
> __random_r random_r.c:380 __random_r random_r.c:380
> __random_r random_r.c:357 __random_r random_r.c:357
> __random random.c:293 __random random.c:293
> __random random.c:293 __random random.c:293
> __random random.c:291 __random random.c:291
> __random random.c:291 __random random.c:291
> __random random.c:291 __random random.c:291
> __random random.c:288 __random random.c:288
> rand rand.c:27 rand rand.c:27
> rand rand.c:26 rand rand.c:26
> rand@plt rand@plt
> rand@plt rand@plt
> compute_flag div.c:25 compute_flag div.c:25
> compute_flag div.c:22 compute_flag div.c:22
> main div.c:40 main div.c:40
> main div.c:40 main div.c:40
> main div.c:39 main div.c:39
>
> hot chain pair 3:
> cycles: 9, hits: 4.48% cycles: 6, hits: 4.51%
> --------------------------- --------------------------
> __random_r random_r.c:360 __random_r random_r.c:360
> __random_r random_r.c:388 __random_r random_r.c:388
> __random_r random_r.c:388 __random_r random_r.c:388
> __random_r random_r.c:380 __random_r random_r.c:380
>
> [ Hot streams in old perf data only ]
>
> hot chain 1:
> cycles: 18, hits: 6.75%
> --------------------------
> __random_r random_r.c:360
> __random_r random_r.c:388
> __random_r random_r.c:388
> __random_r random_r.c:380
> __random_r random_r.c:357
> __random random.c:293
> __random random.c:293
> __random random.c:291
> __random random.c:291
> __random random.c:291
> __random random.c:288
> rand rand.c:27
> rand rand.c:26
> rand@plt
> rand@plt
> compute_flag div.c:25
> compute_flag div.c:22
> main div.c:40
>
> hot chain 2:
> cycles: 29, hits: 2.78%
> --------------------------
> compute_flag div.c:22
> main div.c:40
> main div.c:40
> main div.c:39
>
> [ Hot streams in new perf data only ]
>
> hot chain 1:
> cycles: 4, hits: 4.54%
> --------------------------
> main div.c:42
> compute_flag div.c:28
>
> hot chain 2:
> cycles: 5, hits: 3.51%
> --------------------------
> main div.c:39
> main div.c:44
> main div.c:42
> compute_flag div.c:28
>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> v6:
> - Rebase to perf/core
>
> v5:
> - Remove enum stream_type
> - Rebase to perf/core
>
> v4:
> - Remove the "--before" and "--after" options since they are for
> source line based comparison. In this patchset, we will not
> support source line based comparison.
>
> tools/perf/Documentation/perf-diff.txt | 4 +
> tools/perf/builtin-diff.c | 133 ++++++++++++++++++++++---
> 2 files changed, 124 insertions(+), 13 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
> index f50ca0fef0a4..be65bd55ab2a 100644
> --- a/tools/perf/Documentation/perf-diff.txt
> +++ b/tools/perf/Documentation/perf-diff.txt
> @@ -182,6 +182,10 @@ OPTIONS
> --tid=::
> Only diff samples for given thread ID (comma separated list).
>
> +--stream::
> + Enable hot streams comparison. Stream can be a callchain which is
> + aggregated by the branch records from samples.
> +
> COMPARISON
> ----------
> The comparison is governed by the baseline file. The baseline perf.data
> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> index f8c9bdd8269a..d6db473cd010 100644
> --- a/tools/perf/builtin-diff.c
> +++ b/tools/perf/builtin-diff.c
> @@ -25,6 +25,7 @@
> #include "util/map.h"
> #include "util/spark.h"
> #include "util/block-info.h"
> +#include "util/stream.h"
> #include <linux/err.h>
> #include <linux/zalloc.h>
> #include <subcmd/pager.h>
> @@ -42,6 +43,7 @@ struct perf_diff {
> int range_size;
> int range_num;
> bool has_br_stack;
> + bool stream;
> };
>
> /* Diff command specific HPP columns. */
> @@ -72,6 +74,8 @@ struct data__file {
> struct perf_data data;
> int idx;
> struct hists *hists;
> + struct evsel_streams *evsel_streams;
> + int nr_evsel_streams;
> struct diff_hpp_fmt fmt[PERF_HPP_DIFF__MAX_INDEX];
> };
>
> @@ -106,6 +110,7 @@ enum {
> COMPUTE_DELTA_ABS,
> COMPUTE_CYCLES,
> COMPUTE_MAX,
> + COMPUTE_STREAM, /* After COMPUTE_MAX to avoid use current compute arrays */
> };
>
> const char *compute_names[COMPUTE_MAX] = {
> @@ -393,6 +398,11 @@ static int diff__process_sample_event(struct perf_tool *tool,
> struct perf_diff *pdiff = container_of(tool, struct perf_diff, tool);
> struct addr_location al;
> struct hists *hists = evsel__hists(evsel);
> + struct hist_entry_iter iter = {
> + .evsel = evsel,
> + .sample = sample,
> + .ops = &hist_iter_normal,
> + };
> int ret = -1;
>
> if (perf_time__ranges_skip_sample(pdiff->ptime_range, pdiff->range_num,
> @@ -411,14 +421,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
> goto out_put;
> }
>
> - if (compute != COMPUTE_CYCLES) {
> - if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
> - true)) {
> - pr_warning("problem incrementing symbol period, "
> - "skipping event\n");
> - goto out_put;
> - }
> - } else {
> + switch (compute) {
> + case COMPUTE_CYCLES:
> if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
> NULL, NULL, sample, true)) {
> pr_warning("problem incrementing symbol period, "
> @@ -428,6 +432,23 @@ static int diff__process_sample_event(struct perf_tool *tool,
>
> hist__account_cycles(sample->branch_stack, &al, sample, false,
> NULL);
> + break;
> +
> + case COMPUTE_STREAM:
> + if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
> + NULL)) {
> + pr_debug("problem adding hist entry, skipping event\n");
> + goto out_put;
> + }
> + break;
> +
> + default:
> + if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
> + true)) {
> + pr_warning("problem incrementing symbol period, "
> + "skipping event\n");
> + goto out_put;
> + }
> }
>
> /*
> @@ -996,6 +1017,50 @@ static void data_process(void)
> }
> }
>
> +static int process_base_stream(struct data__file *data_base,
> + struct data__file *data_pair,
> + const char *title __maybe_unused)
> +{
> + struct evlist *evlist_base = data_base->session->evlist;
> + struct evlist *evlist_pair = data_pair->session->evlist;
> + struct evsel *evsel_base, *evsel_pair;
> + struct evsel_streams *es_base, *es_pair;
> +
> + evlist__for_each_entry(evlist_base, evsel_base) {
> + evsel_pair = evsel_match(evsel_base, evlist_pair);
> + if (!evsel_pair)
> + continue;
> +
> + es_base = evsel_streams_get(data_base->evsel_streams,
> + data_base->nr_evsel_streams,
> + evsel_base->idx);
> + if (!es_base)
> + return -1;
> +
> + es_pair = evsel_streams_get(data_pair->evsel_streams,
> + data_pair->nr_evsel_streams,
> + evsel_pair->idx);
> + if (!es_pair)
> + return -1;
> +
> + match_evsel_streams(es_base, es_pair);
> + evsel_streams_report(es_base, es_pair);
> + }
> +
> + return 0;
> +}
> +
> +static void stream_process(void)
> +{
> + /*
> + * Stream comparison only supports two data files.
> + * perf.data.old and perf.data. data__files[0] is perf.data.old,
> + * data__files[1] is perf.data.
> + */
> + process_base_stream(&data__files[0], &data__files[1],
> + "# Output based on old perf data:\n#\n");
> +}
> +
> static void data__free(struct data__file *d)
> {
> int col;
> @@ -1109,6 +1174,18 @@ static int check_file_brstack(void)
> return 0;
> }
>
> +static struct evsel_streams *create_evsel_streams(struct evlist *evlist,
> + int nr_streams_max,
> + int *nr_evsel_streams)
> +{
> + struct evsel_streams *es;
> +
> + es = perf_evlist__create_streams(evlist, nr_streams_max);
> + *nr_evsel_streams = evlist->core.nr_entries;
> +
> + return es;
> +}
> +
> static int __cmd_diff(void)
> {
> struct data__file *d;
> @@ -1153,9 +1230,21 @@ static int __cmd_diff(void)
>
> if (pdiff.ptime_range)
> zfree(&pdiff.ptime_range);
> +
> + if (compute == COMPUTE_STREAM) {
> + d->evsel_streams = create_evsel_streams(
> + d->session->evlist,
> + 5,
> + &d->nr_evsel_streams);
> + if (!d->evsel_streams)
> + goto out_delete;
> + }
> }
>
> - data_process();
> + if (compute == COMPUTE_STREAM)
> + stream_process();
> + else
> + data_process();
>
> out_delete:
> data__for_each_file(i, d) {
> @@ -1228,6 +1317,8 @@ static const struct option options[] = {
> "only consider symbols in these pids"),
> OPT_STRING(0, "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
> "only consider symbols in these tids"),
> + OPT_BOOLEAN(0, "stream", &pdiff.stream,
> + "Enable hot streams comparison."),
> OPT_END()
> };
>
> @@ -1887,6 +1978,9 @@ int cmd_diff(int argc, const char **argv)
> if (cycles_hist && (compute != COMPUTE_CYCLES))
> usage_with_options(diff_usage, options);
>
> + if (pdiff.stream)
> + compute = COMPUTE_STREAM;
> +
> symbol__annotation_init();
>
> if (symbol__init(NULL) < 0)
> @@ -1898,13 +1992,26 @@ int cmd_diff(int argc, const char **argv)
> if (check_file_brstack() < 0)
> return -1;
>
> - if (compute == COMPUTE_CYCLES && !pdiff.has_br_stack)
> + if ((compute == COMPUTE_CYCLES || compute == COMPUTE_STREAM)
> + && !pdiff.has_br_stack) {
> return -1;
> + }
>
> - if (ui_init() < 0)
> - return -1;
> + if (compute == COMPUTE_STREAM) {
> + symbol_conf.show_branchflag_count = true;
> + symbol_conf.disable_add2line_warn = true;
> + callchain_param.mode = CHAIN_FLAT;
> + callchain_param.key = CCKEY_SRCLINE;
> + callchain_param.branch_callstack = 1;
> + symbol_conf.use_callchain = true;
> + callchain_register_param(&callchain_param);
> + sort_order = "srcline,symbol,dso";
> + } else {
> + if (ui_init() < 0)
> + return -1;
>
> - sort__mode = SORT_MODE__DIFF;
> + sort__mode = SORT_MODE__DIFF;
> + }
>
> if (setup_sorting(NULL) < 0)
> usage_with_options(diff_usage, options);
> --
> 2.17.1
>
--
- Arnaldo
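One detail of the patch quoted above is that COMPUTE_STREAM is placed after COMPUTE_MAX so that the existing per-mode arrays, which are sized by COMPUTE_MAX, need no new entry. A minimal sketch of that pattern follows; the `DEMO_*` enumerators, `demo_names` and `demo_name()` are hypothetical and only mirror the shape of the real enum and compute_names[].

```c
#include <stddef.h>

enum {
	DEMO_DELTA,
	DEMO_CYCLES,
	DEMO_MAX,
	DEMO_STREAM,	/* deliberately after DEMO_MAX: it has no table entry */
};

/* Table sized by DEMO_MAX, so DEMO_STREAM cannot index into it. */
static const char *demo_names[DEMO_MAX] = {
	[DEMO_DELTA]  = "delta",
	[DEMO_CYCLES] = "cycles",
};

/* Return the table name, or NULL for modes that live outside the table. */
static const char *demo_name(int mode)
{
	if (mode < 0 || mode >= DEMO_MAX)
		return NULL;

	return demo_names[mode];
}
```

The trade-off is that any code indexing such a table must check the bound first, as demo_name() does here, because a mode beyond the MAX marker is a valid value but not a valid index.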
Hi Arnaldo,
On 9/18/2020 4:26 AM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Sep 11, 2020 at 04:03:53PM +0800, Jin Yao escreveu:
>> This patch enables perf-diff with "--stream" option.
>>
>> "--stream": Enable hot streams comparison
>>
>> Now let's see examples.
>>
>> perf record -b ... Generate perf.data.old with branch data
>> perf record -b ... Generate perf.data with branch data
>> perf diff --stream
>>
>> [ Matched hot streams ]
>>
>> hot chain pair 1:
>> cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
>> --------------------------- --------------------------
>> main div.c:39 main div.c:39
>> main div.c:44 main div.c:44
>>
>> hot chain pair 2:
>> cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
>> --------------------------- --------------------------
>> __random_r random_r.c:360 __random_r random_r.c:360
>
> Would it be interesting to get the associated source code and show right below
> these file:number lines?
>
> - Arnaldo
>
Yes, that would be better. Let me think about the implementation.
Thanks
Jin Yao
Hi Arnaldo,
On 9/18/2020 4:13 AM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 17, 2020 at 03:05:56PM +0200, Jiri Olsa escreveu:
>> On Fri, Sep 11, 2020 at 04:03:46PM +0800, Jin Yao wrote:
>>
>> SNIP
>>
>>> main div.c:40
>>> main div.c:40
>>> main div.c:39
>>>
>>> [ Hot streams in new perf data only ]
>>>
>>> hot chain 1:
>>> cycles: 4, hits: 4.54%
>>> --------------------------
>>> main div.c:42
>>> compute_flag div.c:28
>>>
>>> hot chain 2:
>>> cycles: 5, hits: 3.51%
>>> --------------------------
>>> main div.c:39
>>> main div.c:44
>>> main div.c:42
>>> compute_flag div.c:28
>>>
>>> v6:
>>> ---
>>> Rebase to perf/core
>>
>> it looks good to me
>>
>> Acked-by: Jiri Olsa <[email protected]>
>
> Jin,
>
> I'm sorry I only got to look at this now, there are some issues,
> I'll try to point them out patch by patch,
>
> Thanks,
>
> - Arnaldo
>
Thanks so much for looking at this patchset! :)
I will fix the issues which you point out in other mail threads. Once the fixes are done, I will
post v7.
Thanks
Jin Yao
Hi Arnaldo,
On 9/18/2020 4:26 AM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Sep 11, 2020 at 04:03:53PM +0800, Jin Yao escreveu:
>> This patch enables perf-diff with "--stream" option.
>>
>> "--stream": Enable hot streams comparison
>>
>> Now let's see examples.
>>
>> perf record -b ... Generate perf.data.old with branch data
>> perf record -b ... Generate perf.data with branch data
>> perf diff --stream
>>
>> [ Matched hot streams ]
>>
>> hot chain pair 1:
>> cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
>> --------------------------- --------------------------
>> main div.c:39 main div.c:39
>> main div.c:44 main div.c:44
>>
>> hot chain pair 2:
>> cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
>> --------------------------- --------------------------
>> __random_r random_r.c:360 __random_r random_r.c:360
>
> Would it be interesting to get the associated source code and show right below
> these file:number lines?
>
> - Arnaldo
>
I'm thinking we can implement this in callchain_list__sym_name(), so that all callchain
functionality benefits from it. That looks like a separate patchset, though. :)
Thanks
Jin Yao
Em Sat, Sep 19, 2020 at 12:41:35PM +0800, Jin, Yao escreveu:
> Hi Arnaldo,
>
> On 9/18/2020 4:26 AM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Sep 11, 2020 at 04:03:53PM +0800, Jin Yao escreveu:
> > > This patch enables perf-diff with "--stream" option.
> > >
> > > "--stream": Enable hot streams comparison
> > >
> > > Now let's see examples.
> > >
> > > perf record -b ... Generate perf.data.old with branch data
> > > perf record -b ... Generate perf.data with branch data
> > > perf diff --stream
> > >
> > > [ Matched hot streams ]
> > >
> > > hot chain pair 1:
> > > cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
> > > --------------------------- --------------------------
> > > main div.c:39 main div.c:39
> > > main div.c:44 main div.c:44
> > >
> > > hot chain pair 2:
> > > cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
> > > --------------------------- --------------------------
> > > __random_r random_r.c:360 __random_r random_r.c:360
> >
> > Would it be interesting to get the associated source code and show right below
> > these file:number lines?
> >
> > - Arnaldo
> >
>
> I'm thinking we can implement this function in callchain_list__sym_name(),
> and then all callchain functionality will benefit from it. While that looks
> to be another patchset. :)
Sure, after we go thru the process of merging the current one.
> Thanks
> Jin Yao
--
- Arnaldo