2020-10-14 09:21:03

by Leo Yan

Subject: [PATCH v1 0/8] perf c2c: Refine the organization of metrics

This patch set is to refine metrics output organization.

The current memory metrics in the perf c2c tool are not organized in a
way that guides the reader, so users have to take time to dig into
every statistics item. With a "summary and breakdown" approach the
output is easier to review: it first gives the summary values, and the
later columns break them down into more detailed statistics.

For this reason, this patch set reorganizes the metrics, and it only
changes the "Shared Data Cache Line Table": the table first displays the
summary values for total records, total loads and total stores; then it
breaks these summary values down into smaller values, ordered from the
nearest memory node ("Core Load Hit") to farther nodes
("LLC Load Hit", "RMT Load Hit", "Load Dram").

  "LLC Load Hit" = "LclHit" + "LclHitm"

  "RMT Load Hit" = "RmtHit" + "RmtHitm"  \
                                          -> LLC Load Miss
  "Load Dram"    = "Lcl" + "Rmt"         /

Another main reason for this patch set is to extend "perf c2c" to
support Arm SPE memory events. Arm SPE doesn't contain a 'HITM' tag in
its default trace data, so to analyze cache false sharing in that case
we need to rely on LLC metrics plus multi-threading info. This patch
set therefore shows the LLC related metrics clearly in the "Shared Data
Cache Line Table"; sorting cache lines by LLC metrics will be sent out
in a separate patch set.

Before:

=================================================
Shared Data Cache Line Table
=================================================
#
# ----------- Cacheline ---------- Total Tot ----- LLC Load Hitm ----- ---- Store Reference ---- --- Load Dram ---- LLC Total ----- Core Load Hit ----- -- LLC Load Hit --
# Index Address Node PA cnt records Hitm Total Lcl Rmt Total L1Hit L1Miss Lcl Rmt Ld Miss Loads FB L1 L2 Llc Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ....... ........ ........
#
0 0x55acdcc92100 0 8197 40716 52.18% 3170 3170 0 24466 24437 29 0 0 0 16250 3349 5909 0 3822 0
1 0x55acdcc920c0 0 1 4621 31.01% 1884 1884 0 0 0 0 0 0 0 4621 739 0 0 1998 0
2 0x55acdcc92080 0 1 4475 16.69% 1014 1014 0 0 0 0 0 0 0 4475 2405 0 0 1056 0


After:

=================================================
Shared Data Cache Line Table
=================================================
#
# ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
# Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
#
0 0x55acdcc92100 0 8197 52.18% 3170 3170 0 40716 16250 24466 24437 29 3349 5909 0 3822 3170 0 0 0 0
1 0x55acdcc920c0 0 1 31.01% 1884 1884 0 4621 4621 0 0 0 739 0 0 1998 1884 0 0 0 0
2 0x55acdcc92080 0 1 16.69% 1014 1014 0 4475 4475 0 0 0 2405 0 0 1056 1014 0 0 0 0
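
As a rough cross-check of the breakdown described above, the following C
sketch sums the per-level load columns of one table row and compares the
result with the "Total Loads" column. The struct row_sketch and all helper
names are hypothetical and only serve as illustration (they are not part of
builtin-c2c.c); the numbers are copied from row 0 of the "After" table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the load columns of one table row. */
struct row_sketch {
	uint64_t fb, l1, l2;		/* "Core Load Hit": FB, L1, L2 */
	uint64_t lcl_hit, lcl_hitm;	/* "LLC Load Hit": LclHit, LclHitm */
	uint64_t rmt_hit, rmt_hitm;	/* "RMT Load Hit": RmtHit, RmtHitm */
	uint64_t dram_lcl, dram_rmt;	/* "Load Dram": Lcl, Rmt */
};

uint64_t core_load_hit(const struct row_sketch *r)
{
	return r->fb + r->l1 + r->l2;
}

uint64_t llc_load_hit(const struct row_sketch *r)
{
	return r->lcl_hit + r->lcl_hitm;
}

uint64_t rmt_load_hit(const struct row_sketch *r)
{
	return r->rmt_hit + r->rmt_hitm;
}

uint64_t load_dram(const struct row_sketch *r)
{
	return r->dram_lcl + r->dram_rmt;
}

/* "LLC Load Miss" in the cover letter: remote cache hits plus DRAM loads. */
uint64_t llc_load_miss(const struct row_sketch *r)
{
	return rmt_load_hit(r) + load_dram(r);
}

/* Sum of all breakdown groups, from the nearest level to the farthest. */
uint64_t loads_from_breakdown(const struct row_sketch *r)
{
	return core_load_hit(r) + llc_load_hit(r) + llc_load_miss(r);
}

/* Row 0 of the "After" table; columns not listed here are 0 in that row. */
const struct row_sketch row0_sketch = {
	.fb = 3349, .l1 = 5909, .l2 = 0,
	.lcl_hit = 3822, .lcl_hitm = 3170,
};

/* Abort via assert() if the breakdown does not add up to the given total. */
void check_row(const struct row_sketch *r, uint64_t total_loads)
{
	assert(loads_from_breakdown(r) == total_loads);
}
```

Calling check_row(&row0_sketch, 16250) passes for row 0, since
3349 + 5909 + 0 + 3822 + 3170 = 16250. Whether the same identity holds for
every row depends on how perf accounts loads, which this sketch does not
verify.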


Leo Yan (8):
perf c2c: Display the total numbers continuously
perf c2c: Display "Total Stores" as a standalone metrics
perf c2c: Organize metrics based on memory hierarchy
perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
perf c2c: Use more explicit headers for HITM
perf c2c: Change header for LLC local hit
perf c2c: Correct LLC load hit metrics
perf c2c: Add metrics "RMT Load Hit"

tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
1 file changed, 18 insertions(+), 65 deletions(-)

--
2.17.1


2020-10-14 09:21:38

by Leo Yan

Subject: [PATCH v1 8/8] perf c2c: Add metrics "RMT Load Hit"

The metrics "LLC Ld Miss" and "Load Dram" overlap in the items they
account for:

  "LLC Ld Miss" = "lcl_dram" + "rmt_dram" + "rmt_hit" + "rmt_hitm"
  "Load Dram"   = "lcl_dram" + "rmt_dram"

Furthermore, the metric "LLC Ld Miss" is not very informative, since it
is only a summary value and gives no breakdown details.

For this reason, add a new metric "RMT Load Hit", which presents the
remote cache hits; it contains two items:

  "RMT Load Hit" = remote hit ("rmt_hit") + remote hitm ("rmt_hitm")

As a result, the metric "LLC Ld Miss" is exactly divided into the two
metrics "RMT Load Hit" and "Load Dram". It's no longer necessary to keep
the metric "LLC Ld Miss", so remove it.
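
The split described above can be sketched in C as follows. This is only an
illustration under stated assumptions: struct c2c_stats_sketch is a
hypothetical subset of the real stats structure, and every function name
carrying the _sketch suffix is made up for this example. The first helper
repeats the sum computed by the llc_miss() function that this patch removes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the stats structure: only the fields used here. */
struct c2c_stats_sketch {
	uint64_t lcl_dram;
	uint64_t rmt_dram;
	uint64_t rmt_hit;
	uint64_t rmt_hitm;
};

/* Same sum as the llc_miss() helper removed by this patch. */
uint64_t llc_miss_sketch(const struct c2c_stats_sketch *s)
{
	return s->lcl_dram + s->rmt_dram + s->rmt_hitm + s->rmt_hit;
}

/* New metric "RMT Load Hit": remote hit + remote hitm. */
uint64_t rmt_load_hit_sketch(const struct c2c_stats_sketch *s)
{
	return s->rmt_hit + s->rmt_hitm;
}

/* Existing metric "Load Dram": local DRAM + remote DRAM. */
uint64_t load_dram_sketch(const struct c2c_stats_sketch *s)
{
	return s->lcl_dram + s->rmt_dram;
}

/* The old summary value is exactly the sum of the two remaining metrics. */
void check_llc_miss_split(const struct c2c_stats_sketch *s)
{
	assert(llc_miss_sketch(s) ==
	       rmt_load_hit_sketch(s) + load_dram_sketch(s));
}
```

Because the two remaining metrics share no item, dropping "LLC Ld Miss"
loses no information: the old value can still be obtained by adding the
"RMT Load Hit" and "Load Dram" columns.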

Before:

# ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- LLC --- Load Dram ----
# Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm Ld Miss Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 548 2615 66 169 481 0 0 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 187 361 27 11 78 0 0 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 131 0 10 263 1 0 0 0

After:

# ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
# Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 548 2615 66 169 481 0 0 0 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 187 361 27 11 78 0 0 0 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 131 0 10 263 1 0 0 0 0

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 52 ++--------------------------------------
1 file changed, 2 insertions(+), 50 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 61fb939a4e70..9c2183957c50 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -652,45 +652,6 @@ STAT_FN(ld_l2hit)
STAT_FN(ld_llchit)
STAT_FN(rmt_hit)

-static uint64_t llc_miss(struct c2c_stats *stats)
-{
- uint64_t llcmiss;
-
- llcmiss = stats->lcl_dram +
- stats->rmt_dram +
- stats->rmt_hitm +
- stats->rmt_hit;
-
- return llcmiss;
-}
-
-static int
-ld_llcmiss_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
- struct hist_entry *he)
-{
- struct c2c_hist_entry *c2c_he;
- int width = c2c_width(fmt, hpp, he->hists);
-
- c2c_he = container_of(he, struct c2c_hist_entry, he);
-
- return scnprintf(hpp->buf, hpp->size, "%*lu", width,
- llc_miss(&c2c_he->stats));
-}
-
-static int64_t
-ld_llcmiss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
- struct hist_entry *left, struct hist_entry *right)
-{
- struct c2c_hist_entry *c2c_left;
- struct c2c_hist_entry *c2c_right;
-
- c2c_left = container_of(left, struct c2c_hist_entry, he);
- c2c_right = container_of(right, struct c2c_hist_entry, he);
-
- return (uint64_t) llc_miss(&c2c_left->stats) -
- (uint64_t) llc_miss(&c2c_right->stats);
-}
-
static uint64_t total_records(struct c2c_stats *stats)
{
uint64_t lclmiss, ldcnt, total;
@@ -1440,21 +1401,13 @@ static struct c2c_dimension dim_ld_llchit = {
};

static struct c2c_dimension dim_ld_rmthit = {
- .header = HEADER_SPAN_LOW("Rmt"),
+ .header = HEADER_SPAN("- RMT Load Hit --", "RmtHit", 1),
.name = "ld_rmthit",
.cmp = rmt_hit_cmp,
.entry = rmt_hit_entry,
.width = 8,
};

-static struct c2c_dimension dim_ld_llcmiss = {
- .header = HEADER_BOTH("LLC", "Ld Miss"),
- .name = "ld_llcmiss",
- .cmp = ld_llcmiss_cmp,
- .entry = ld_llcmiss_entry,
- .width = 7,
-};
-
static struct c2c_dimension dim_tot_recs = {
.header = HEADER_BOTH("Total", "records"),
.name = "tot_recs",
@@ -1658,7 +1611,6 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_l2hit,
&dim_ld_llchit,
&dim_ld_rmthit,
- &dim_ld_llcmiss,
&dim_tot_recs,
&dim_tot_loads,
&dim_percent_hitm,
@@ -2854,7 +2806,7 @@ static int perf_c2c__report(int argc, const char **argv)
"stores_l1hit,stores_l1miss,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
"ld_lclhit,lcl_hitm,"
- "ld_llcmiss,"
+ "ld_rmthit,rmt_hitm,"
"dram_lcl,dram_rmt",
c2c.display == DISPLAY_TOT ? "tot_hitm" :
c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
--
2.17.1

2020-10-14 09:22:09

by Leo Yan

Subject: [PATCH v1 4/8] perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"

The metric "LLC Load Hitm" contains two items: "local Hitm" and
"remote Hitm".

"local Hitm" means an L3 hit that was serviced by another processor core
with a cross-core snoop where a modified copy was found; there is no
doubt that "local Hitm" is an LLC access.

"remote Hitm", however, based on the code in util/mem-events, is the
event for a remote cache hit that was serviced by another processor core
holding a modified copy. A remote Hitm is thus a hit in a remote cache,
which is actually an LLC load miss.

The current display format gives users the impression that "local Hitm"
and "remote Hitm" both belong to LLC loads, which is not the case, as
described above.

This patch changes the header from "LLC Load Hitm" to "Load Hitm", to
avoid giving the wrong impression that all Hitm events belong to the LLC.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 404d4739b8c1..fa7a1c55b989 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1328,7 +1328,7 @@ static struct c2c_dimension dim_iaddr = {
};

static struct c2c_dimension dim_tot_hitm = {
- .header = HEADER_SPAN("----- LLC Load Hitm -----", "Total", 2),
+ .header = HEADER_SPAN("------- Load Hitm -------", "Total", 2),
.name = "tot_hitm",
.cmp = tot_hitm_cmp,
.entry = tot_hitm_entry,
--
2.17.1

2020-10-14 09:22:23

by Leo Yan

Subject: [PATCH v1 2/8] perf c2c: Display "Total Stores" as a standalone metrics

The total stores number is displayed under the metric "Store Reference".
To give it the same format as total records and total loads, extract
the total stores number into a standalone metric "Total Stores".

After this patch, the tool shows the summary numbers ("Total records",
"Total Loads", "Total Stores") in a unified form.

Before:

# ----------- Cacheline ---------- Tot ----- LLC Load Hitm ----- Total Total ---- Store Reference ---- --- Load Dram ---- LLC ----- Core Load Hit ----- -- LLC Load Hit --
# Index Address Node PA cnt Hitm Total Lcl Rmt records Loads Total L1Hit L1Miss Lcl Rmt Ld Miss FB L1 L2 Llc Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 0 0 0 548 2615 66 169 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 0 0 0 187 361 27 11 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 0 0 0 131 0 10 263 0

After:

# ----------- Cacheline ---------- Tot ----- LLC Load Hitm ----- Total Total Total ---- Stores ---- --- Load Dram ---- LLC ----- Core Load Hit ----- -- LLC Load Hit --
# Index Address Node PA cnt Hitm Total Lcl Rmt records Loads Stores L1Hit L1Miss Lcl Rmt Ld Miss FB L1 L2 Llc Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 0 0 0 548 2615 66 169 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 0 0 0 187 361 27 11 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 0 0 0 131 0 10 263 0

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e602b7891ce9..a2ad24799aea 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1367,16 +1367,16 @@ static struct c2c_dimension dim_cl_lcl_hitm = {
.width = 7,
};

-static struct c2c_dimension dim_stores = {
- .header = HEADER_SPAN("---- Store Reference ----", "Total", 2),
- .name = "stores",
+static struct c2c_dimension dim_tot_stores = {
+ .header = HEADER_BOTH("Total", "Stores"),
+ .name = "tot_stores",
.cmp = store_cmp,
.entry = store_entry,
.width = 7,
};

static struct c2c_dimension dim_stores_l1hit = {
- .header = HEADER_SPAN_LOW("L1Hit"),
+ .header = HEADER_SPAN("---- Stores ----", "L1Hit", 1),
.name = "stores_l1hit",
.cmp = st_l1hit_cmp,
.entry = st_l1hit_entry,
@@ -1648,7 +1648,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_rmt_hitm,
&dim_cl_lcl_hitm,
&dim_cl_rmt_hitm,
- &dim_stores,
+ &dim_tot_stores,
&dim_stores_l1hit,
&dim_stores_l1miss,
&dim_cl_stores_l1hit,
@@ -2850,7 +2850,8 @@ static int perf_c2c__report(int argc, const char **argv)
"tot_hitm,lcl_hitm,rmt_hitm,"
"tot_recs,"
"tot_loads,"
- "stores,stores_l1hit,stores_l1miss,"
+ "tot_stores,"
+ "stores_l1hit,stores_l1miss,"
"dram_lcl,dram_rmt,"
"ld_llcmiss,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
--
2.17.1

2020-10-14 09:23:31

by Leo Yan

Subject: [PATCH v1 6/8] perf c2c: Change header for LLC local hit

Replace the header string "Llc" with "LclHit", which expresses more
explicitly that the event type is an LLC local hit.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3d5aa21020f2..2292261b40a2 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1432,7 +1432,7 @@ static struct c2c_dimension dim_ld_l2hit = {
};

static struct c2c_dimension dim_ld_llchit = {
- .header = HEADER_SPAN("-- LLC Load Hit --", "Llc", 1),
+ .header = HEADER_SPAN("-- LLC Load Hit --", "LclHit", 1),
.name = "ld_lclhit",
.cmp = ld_llchit_cmp,
.entry = ld_llchit_entry,
--
2.17.1

2020-10-14 14:41:28

by Leo Yan

Subject: [PATCH v1 1/8] perf c2c: Display the total numbers continuously

To view the statistics in "breakdown" mode, it's good to show the
summary numbers for the total records, all loads and all stores first;
the subsequent columns can then break them down into more detailed
items.

To achieve this, this patch displays the summary numbers for
records/loads/stores contiguously and places them before the breakdown
items, which allows users to easily read the summarized statistics.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5938b100eaf4..e602b7891ce9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2846,13 +2846,13 @@ static int perf_c2c__report(int argc, const char **argv)
"dcacheline,"
"dcacheline_node,"
"dcacheline_count,"
- "tot_recs,"
"percent_hitm,"
"tot_hitm,lcl_hitm,rmt_hitm,"
+ "tot_recs,"
+ "tot_loads,"
"stores,stores_l1hit,stores_l1miss,"
"dram_lcl,dram_rmt,"
"ld_llcmiss,"
- "tot_loads,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
"ld_lclhit,ld_rmthit",
c2c.display == DISPLAY_TOT ? "tot_hitm" :
--
2.17.1

2020-10-14 14:56:25

by Leo Yan

Subject: [PATCH v1 3/8] perf c2c: Organize metrics based on memory hierarchy

The metrics are not organized based on the memory hierarchy, i.e. the
tool doesn't order the metrics from the closest memory nodes (e.g.
L1/L2 cache) to the farthest ones (e.g. L3 cache and DRAM).

To output the metrics in a friendlier form, this patch reorders the
metrics based on the memory hierarchy:

"Core Load Hit" => "LLC Load Hit" => "LLC Ld Miss" => "Load Dram"

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a2ad24799aea..404d4739b8c1 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2852,10 +2852,10 @@ static int perf_c2c__report(int argc, const char **argv)
"tot_loads,"
"tot_stores,"
"stores_l1hit,stores_l1miss,"
- "dram_lcl,dram_rmt,"
- "ld_llcmiss,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
- "ld_lclhit,ld_rmthit",
+ "ld_lclhit,ld_rmthit,"
+ "ld_llcmiss,"
+ "dram_lcl,dram_rmt",
c2c.display == DISPLAY_TOT ? "tot_hitm" :
c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
);
--
2.17.1
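
The ordering introduced by the hunk above can be checked mechanically. The
C sketch below is an illustration only: load_fields_sketch copies just the
load-related tail of the field list as it appears in the diff (the full
list in perf_c2c__report() is longer), and field_pos() and
check_closer_first() are hypothetical helpers, not perf functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Load-related tail of the field list after this patch, per the hunk. */
const char load_fields_sketch[] =
	"ld_fbhit,ld_l1hit,ld_l2hit,"
	"ld_lclhit,ld_rmthit,"
	"ld_llcmiss,"
	"dram_lcl,dram_rmt";

/* Return the 0-based position of 'name' in a comma-separated list, or -1. */
int field_pos(const char *list, const char *name)
{
	size_t len = strlen(name);
	int pos = 0;

	while (*list) {
		size_t tok = strcspn(list, ",");	/* length of this field */

		if (tok == len && strncmp(list, name, len) == 0)
			return pos;
		list += tok;
		if (*list == ',')
			list++;				/* skip the separator */
		pos++;
	}
	return -1;
}

/* Abort via assert() unless 'closer' exists and precedes 'farther'. */
void check_closer_first(const char *list, const char *closer,
			const char *farther)
{
	assert(field_pos(list, closer) >= 0);
	assert(field_pos(list, closer) < field_pos(list, farther));
}
```

For example, check_closer_first(load_fields_sketch, "ld_l2hit", "ld_lclhit")
and check_closer_first(load_fields_sketch, "ld_llcmiss", "dram_lcl") both
pass with the list above, matching the order "Core Load Hit", "LLC Load
Hit", "LLC Ld Miss", "Load Dram".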

2020-10-14 14:56:25

by Leo Yan

Subject: [PATCH v1 5/8] perf c2c: Use more explicit headers for HITM

Local and remote HITM use the headers 'Lcl' and 'Rmt' respectively. If
we want to extend the tool to display these two dimensions under any
other metric, users cannot understand the semantics from the header
string 'Lcl' or 'Rmt' alone.

To express the meaning of the HITM items explicitly, this patch changes
the header strings to "LclHitm" and "RmtHitm"; the strings are more
readable and allow the HITM items to be reused under other metrics.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index fa7a1c55b989..3d5aa21020f2 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1336,7 +1336,7 @@ static struct c2c_dimension dim_tot_hitm = {
};

static struct c2c_dimension dim_lcl_hitm = {
- .header = HEADER_SPAN_LOW("Lcl"),
+ .header = HEADER_SPAN_LOW("LclHitm"),
.name = "lcl_hitm",
.cmp = lcl_hitm_cmp,
.entry = lcl_hitm_entry,
@@ -1344,7 +1344,7 @@ static struct c2c_dimension dim_lcl_hitm = {
};

static struct c2c_dimension dim_rmt_hitm = {
- .header = HEADER_SPAN_LOW("Rmt"),
+ .header = HEADER_SPAN_LOW("RmtHitm"),
.name = "rmt_hitm",
.cmp = rmt_hitm_cmp,
.entry = rmt_hitm_entry,
@@ -1486,7 +1486,7 @@ static struct c2c_dimension dim_percent_hitm = {
};

static struct c2c_dimension dim_percent_rmt_hitm = {
- .header = HEADER_SPAN("----- HITM -----", "Rmt", 1),
+ .header = HEADER_SPAN("----- HITM -----", "RmtHitm", 1),
.name = "percent_rmt_hitm",
.cmp = percent_rmt_hitm_cmp,
.entry = percent_rmt_hitm_entry,
@@ -1495,7 +1495,7 @@ static struct c2c_dimension dim_percent_rmt_hitm = {
};

static struct c2c_dimension dim_percent_lcl_hitm = {
- .header = HEADER_SPAN_LOW("Lcl"),
+ .header = HEADER_SPAN_LOW("LclHitm"),
.name = "percent_lcl_hitm",
.cmp = percent_lcl_hitm_cmp,
.entry = percent_lcl_hitm_entry,
--
2.17.1

2020-10-14 15:07:35

by Leo Yan

Subject: [PATCH v1 7/8] perf c2c: Correct LLC load hit metrics

"rmt_hit" is accounted into two metrics: one is accounted into the
metric "LLC Ld Miss" (see the function llc_miss(), which calculates
"llcmiss"), and the other into the metric "LLC Load Hit". Taken
literally, it is contradictory for "rmt_hit" to be accounted into both
"LLC Ld Miss" (LLC miss) and "LLC Load Hit" (LLC hit).

This easily causes confusion: "LLC Load Hit" gives the impression that
all items belonging to it are LLC hits, while in fact "rmt_hit" is an
LLC miss and a remote cache hit.

To give the metric "LLC Load Hit" clear semantics, "rmt_hit" is moved
out of it, and "LLC Load Hit" now contains two items:

  LLC Load Hit = LLC's hit ("ld_llchit") + LLC's hitm ("lcl_hitm")

For output alignment, also adjust the header of "LLC Load Hit".
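
The double accounting can be made concrete with a small C sketch. It rests
on assumptions: struct llc_stats_sketch is a hypothetical subset of the
stats fields named in this series, and all function names are invented for
the example. It shows that, with the old grouping, adding "LLC Load Hit"
and "LLC Ld Miss" counts "rmt_hit" twice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the stats fields discussed in this patch. */
struct llc_stats_sketch {
	uint64_t ld_llchit;	/* LLC local hit */
	uint64_t lcl_hitm;	/* LLC local hitm */
	uint64_t rmt_hit;	/* remote cache hit */
	uint64_t rmt_hitm;	/* remote cache hitm */
	uint64_t lcl_dram;
	uint64_t rmt_dram;
};

/* "LLC Load Hit" before this patch: columns ld_lclhit + ld_rmthit. */
uint64_t llc_load_hit_old(const struct llc_stats_sketch *s)
{
	return s->ld_llchit + s->rmt_hit;
}

/* "LLC Load Hit" after this patch: columns ld_lclhit + lcl_hitm. */
uint64_t llc_load_hit_new(const struct llc_stats_sketch *s)
{
	return s->ld_llchit + s->lcl_hitm;
}

/* "LLC Ld Miss", the same sum as llc_miss() computes. */
uint64_t llc_ld_miss(const struct llc_stats_sketch *s)
{
	return s->lcl_dram + s->rmt_dram + s->rmt_hitm + s->rmt_hit;
}

/* How much the old "hit" and "miss" groups overlap when added together. */
uint64_t old_overlap(const struct llc_stats_sketch *s)
{
	/* Every item of the two old groups, counted exactly once. */
	uint64_t distinct = s->ld_llchit + s->rmt_hit + s->rmt_hitm +
			    s->lcl_dram + s->rmt_dram;

	return llc_load_hit_old(s) + llc_ld_miss(s) - distinct;
}

/* The overlap is exactly "rmt_hit", the item counted in both groups. */
void check_old_overlap(const struct llc_stats_sketch *s)
{
	assert(old_overlap(s) == s->rmt_hit);
}
```

With the new grouping, "LLC Load Hit" no longer contains "rmt_hit", so the
hit and miss groups share no item.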

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-c2c.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 2292261b40a2..61fb939a4e70 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1432,7 +1432,7 @@ static struct c2c_dimension dim_ld_l2hit = {
};

static struct c2c_dimension dim_ld_llchit = {
- .header = HEADER_SPAN("-- LLC Load Hit --", "LclHit", 1),
+ .header = HEADER_SPAN("- LLC Load Hit --", "LclHit", 1),
.name = "ld_lclhit",
.cmp = ld_llchit_cmp,
.entry = ld_llchit_entry,
@@ -2853,7 +2853,7 @@ static int perf_c2c__report(int argc, const char **argv)
"tot_stores,"
"stores_l1hit,stores_l1miss,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
- "ld_lclhit,ld_rmthit,"
+ "ld_lclhit,lcl_hitm,"
"ld_llcmiss,"
"dram_lcl,dram_rmt",
c2c.display == DISPLAY_TOT ? "tot_hitm" :
--
2.17.1

2020-10-14 16:59:51

by Jiri Olsa

Subject: Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics

On Wed, Oct 14, 2020 at 06:09:13AM +0100, Leo Yan wrote:
> This patch set is to refine metrics output organization.
>
> If we reivew the current memory metrics in Perf c2c tool, it doesn't
> orgnize the metrics with directive approach; thus user needs to take
> time to dig into every statistics item. On the other hand, if use the
> "summary and breakdown" approach, the output result will be easier for
> reviewing by users, e.g. the output result can firstly give out the
> summary values, and then the later items will breakdown into more
> detailed statistics.
>
> For this reason, this patch is to reorgnize the metrics and it only
> changes for the "Shared Data Cache Line Table": it firstly displays the
> summary values for total records, total loads, total stores; then it
> breaks these summary values into small values, with the order from the
> most near memory node ("CPU Load Hit") to more far nodes
> ("LLC Load Hit", "RMT Load Hit", "Load Dram").
>
> "LLC Load Hit" = "LclHit" + "LclHitm"
>
> "RMT Load Hit" = "RmtHit" + "RmtHitm" \
> -> LLC Load Miss
> "Load Dram" = "Lcl" + "Rmt" /
>
> Another main reason for this patch set is wanting to extend "perf c2c"
> to support Arm SPE memory event, but Arm SPE doesn't contain 'HTIM' tag
> in its default trace data, for this case if want to analyze cache false
> sharing issue, we need to rely on LLC metrics + multi-threading info.
> So this patch set can be friendly to show LLC related metrics in the
> "Shared Data Cache Line Table"; for sorting cache lines with LLC metrics
> which will be sent out with another separate patch set.
>
> Before:
>
> =================================================
> Shared Data Cache Line Table
> =================================================
> #
> # ----------- Cacheline ---------- Total Tot ----- LLC Load Hitm ----- ---- Store Reference ---- --- Load Dram ---- LLC Total ----- Core Load Hit ----- -- LLC Load Hit --
> # Index Address Node PA cnt records Hitm Total Lcl Rmt Total L1Hit L1Miss Lcl Rmt Ld Miss Loads FB L1 L2 Llc Rmt
> # ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ....... ........ ........
> #
> 0 0x55acdcc92100 0 8197 40716 52.18% 3170 3170 0 24466 24437 29 0 0 0 16250 3349 5909 0 3822 0
> 1 0x55acdcc920c0 0 1 4621 31.01% 1884 1884 0 0 0 0 0 0 0 4621 739 0 0 1998 0
> 2 0x55acdcc92080 0 1 4475 16.69% 1014 1014 0 0 0 0 0 0 0 4475 2405 0 0 1056 0
>
>
> After:
>
> =================================================
> Shared Data Cache Line Table
> =================================================
> #
> # ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
> # Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
> # ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
> #
> 0 0x55acdcc92100 0 8197 52.18% 3170 3170 0 40716 16250 24466 24437 29 3349 5909 0 3822 3170 0 0 0 0
> 1 0x55acdcc920c0 0 1 31.01% 1884 1884 0 4621 4621 0 0 0 739 0 0 1998 1884 0 0 0 0
> 2 0x55acdcc92080 0 1 16.69% 1014 1014 0 4475 4475 0 0 0 2405 0 0 1056 1014 0 0 0 0

I haven't used the tool for some time, so it's fine with me,
but there might be some people already used to seeing certain
columns in place and I don't want to make them angry unless
there's a really good reason for that ;-)

Joe, could you please check on these changes?

thanks,
jirka

>
>
> Leo Yan (8):
> perf c2c: Display the total numbers continuously
> perf c2c: Display "Total Stores" as a standalone metrics
> perf c2c: Organize metrics based on memory hierarchy
> perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
> perf c2c: Use more explicit headers for HITM
> perf c2c: Change header for LLC local hit
> perf c2c: Correct LLC load hit metrics
> perf c2c: Add metrics "RMT Load Hit"
>
> tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
> 1 file changed, 18 insertions(+), 65 deletions(-)
>
> --
> 2.17.1
>

2020-10-14 18:40:48

by Joe Mario

Subject: Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics



On 10/14/20 1:09 AM, Leo Yan wrote:
> This patch set is to refine metrics output organization.
>
> If we reivew the current memory metrics in Perf c2c tool, it doesn't
> orgnize the metrics with directive approach; thus user needs to take
> time to dig into every statistics item. On the other hand, if use the
> "summary and breakdown" approach, the output result will be easier for
> reviewing by users, e.g. the output result can firstly give out the
> summary values, and then the later items will breakdown into more
> detailed statistics.
>
> For this reason, this patch is to reorgnize the metrics and it only
> changes for the "Shared Data Cache Line Table": it firstly displays the
> summary values for total records, total loads, total stores; then it
> breaks these summary values into small values, with the order from the
> most near memory node ("CPU Load Hit") to more far nodes
> ("LLC Load Hit", "RMT Load Hit", "Load Dram").
>
> "LLC Load Hit" = "LclHit" + "LclHitm"
>
> "RMT Load Hit" = "RmtHit" + "RmtHitm" \
> -> LLC Load Miss
> "Load Dram" = "Lcl" + "Rmt" /
>
> Another main reason for this patch set is wanting to extend "perf c2c"
> to support Arm SPE memory event, but Arm SPE doesn't contain 'HTIM' tag
> in its default trace data, for this case if want to analyze cache false
> sharing issue, we need to rely on LLC metrics + multi-threading info.
> So this patch set can be friendly to show LLC related metrics in the
> "Shared Data Cache Line Table"; for sorting cache lines with LLC metrics
> which will be sent out with another separate patch set.
>
> <SNIP>
>
> Leo Yan (8):
> perf c2c: Display the total numbers continuously
> perf c2c: Display "Total Stores" as a standalone metrics
> perf c2c: Organize metrics based on memory hierarchy
> perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
> perf c2c: Use more explicit headers for HITM
> perf c2c: Change header for LLC local hit
> perf c2c: Correct LLC load hit metrics
> perf c2c: Add metrics "RMT Load Hit"
>
> tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
> 1 file changed, 18 insertions(+), 65 deletions(-)

Hi Leo:
I ran your patches through some perf c2c tests and it all looks good.
I agree the new format of the "Shared Data Cache Line Table" makes more sense now. And it still holds together nicely when sorted on local HitMs (-d lcl).

Thank you for doing this.
Joe

Tested-by: Joe Mario <[email protected]>

2020-10-15 15:08:18

by Leo Yan

Subject: Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics

Hi Joe,

On Wed, Oct 14, 2020 at 02:38:19PM -0400, Joe Mario wrote:

[...]

> > This patch set is to refine metrics output organization.

[...]

> Hi Leo:
> I ran your patches through some perf c2c tests and it all looks good.
> I agree the new format of the "Shared Data Cache Line Table" makes more sense now. And it still holds together nicely when sorted on local HitMs (-d lcl).
>
> Thank you for doing this.
> Joe
>
> Tested-by: Joe Mario <[email protected]>

Thank you for the quick response and the testing.

I share Jiri's view that we should respect the existing usages and
habits around the tool, and I was also a bit concerned that my changes
might be inconvenient for others. So it's great to receive your
agreement on the changes!

I have respun the patch set as v2 [1], adding your Tested-by tag and
updating the documentation; furthermore, I sent out another patch set
that enhances perf c2c with sorting on LLC load hit. You are welcome to
review and comment on it [2].

Thanks,
Leo

[1] https://lore.kernel.org/patchwork/cover/1321499/
[2] https://lore.kernel.org/patchwork/cover/1321514/