2021-01-05 22:47:01

by Liang, Kan

Subject: [PATCH V4 0/6] Add the page size in the perf record (user tools)

From: Kan Liang <[email protected]>

Changes since V3:
- Rebase on top of acme's perf/core branch
commit c07b45a355ee ("perf record: Tweak "Lowering..." warning in record_opts__config_freq")

Changes since V2:
- Rebase on top of acme's perf/core branch
commit eec7b53d5916 ("perf test: Make sample-parsing test aware of PERF_SAMPLE_{CODE,DATA}_PAGE_SIZE")
- Use unit_number__scnprintf() in get_page_size_name()
- Emit warning about kernel not supporting the code page size sample_type bit

Changes since V1:
- Fix the compile warning with GCC 10
- Add Acked-by from Namhyung Kim

perf can currently report both the virtual and the physical address of
a sample, but not the page size. Without knowing the size of the page
that backs a sampled address, users cannot decide whether to promote or
demote large pages to optimize memory usage.

The kernel patches have been merged into the tip perf/core branch:
commit 8d97e71811aa ("perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE")
commit 76a5433f95f3 ("perf/x86/intel: Support PERF_SAMPLE_DATA_PAGE_SIZE")
commit 4cb6a42e4c4b ("powerpc/perf: Support PERF_SAMPLE_DATA_PAGE_SIZE")
commit 995f088efebe ("perf/core: Add support for PERF_SAMPLE_CODE_PAGE_SIZE")
commit 51b646b2d9f8 ("perf,mm: Handle non-page-table-aligned hugetlbfs")

and into Peter's perf/core branch:
commit 524680ce47a1 ("mm/gup: Provide gup_get_pte() more generic")
commit 44a35d6937d2 ("mm: Introduce pXX_leaf_size()")
commit 2f1e2f091ad0 ("perf/core: Fix arch_perf_get_page_size()")
commit 7649e44aacdd ("arm64/mm: Implement pXX_leaf_size() support")
commit 1df1ae7e262c ("sparc64/mm: Implement pXX_leaf_size() support")

This patch set enables page size support in the user tools.

Kan Liang (3):
perf mem: Clean up output format
perf mem: Support data page size
perf tools: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

Stephane Eranian (3):
perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE
perf report: Add support for PERF_SAMPLE_CODE_PAGE_SIZE
perf test: Add test case for PERF_SAMPLE_CODE_PAGE_SIZE

tools/perf/Documentation/perf-mem.txt | 3 +
tools/perf/Documentation/perf-record.txt | 3 +
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/Documentation/perf-script.txt | 2 +-
tools/perf/builtin-mem.c | 111 +++++++++++-----------
tools/perf/builtin-record.c | 2 +
tools/perf/builtin-script.c | 13 ++-
tools/perf/tests/sample-parsing.c | 4 +
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 18 +++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/hist.c | 2 +
tools/perf/util/hist.h | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 2 +-
tools/perf/util/record.h | 1 +
tools/perf/util/session.c | 3 +
tools/perf/util/sort.c | 26 +++++
tools/perf/util/sort.h | 2 +
tools/perf/util/synthetic-events.c | 8 ++
19 files changed, 144 insertions(+), 60 deletions(-)

--
2.25.1


2021-01-05 22:49:18

by Liang, Kan

Subject: [PATCH V4 4/6] perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

From: Stephane Eranian <[email protected]>

Display the sampled code page size when PERF_SAMPLE_CODE_PAGE_SIZE is set.

For example,
perf script --fields comm,event,ip,code_page_size
dtlb mem-loads:uP: 445777 4K
dtlb mem-loads:uP: 40f724 4K
dtlb mem-loads:uP: 474926 4K
dtlb mem-loads:uP: 401075 4K
dtlb mem-loads:uP: 401095 4K
dtlb mem-loads:uP: 401095 4K
dtlb mem-loads:uP: 4010cc 4K
dtlb mem-loads:uP: 440b6f 4K
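
The last column above is the page size in bytes rendered as a short
name. The following is a minimal sketch of such a conversion, not the
code used by perf: page_size_name() and SIZE_NAME_LEN are hypothetical
names chosen for illustration, and a size of 0 is rendered as "N/A",
matching what is reported later in this thread for samples whose page
size could not be determined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical buffer length for this sketch; perf's own constant may differ. */
#define SIZE_NAME_LEN 24

/*
 * Format a page size given in bytes as a short name such as "4K" or "2M".
 * A size of 0 means the page size is unknown and is rendered as "N/A".
 */
static char *page_size_name(uint64_t size, char *buf)
{
	static const char units[] = { 'B', 'K', 'M', 'G' };
	size_t i = 0;

	assert(buf != NULL);

	if (size == 0) {
		snprintf(buf, SIZE_NAME_LEN, "N/A");
		return buf;
	}

	/* Scale down by 1024 while the value stays a whole number of units. */
	while (i + 1 < sizeof(units) && size % 1024 == 0) {
		size /= 1024;
		i++;
	}

	snprintf(buf, SIZE_NAME_LEN, "%llu%c", (unsigned long long)size, units[i]);
	return buf;
}
```

With this sketch, 4096 is formatted as "4K" and 2097152 as "2M".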

Acked-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/Documentation/perf-script.txt | 2 +-
tools/perf/builtin-script.c | 13 +++++++++++--
tools/perf/util/session.c | 3 +++
3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 44d37210fc8f..60dae302db27 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -118,7 +118,7 @@ OPTIONS
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output,
brstackinsn, brstackoff, callindent, insn, insnlen, synth, phys_addr,
- metric, misc, srccode, ipc, data_page_size.
+ metric, misc, srccode, ipc, data_page_size, code_page_size.
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index edacfa98d073..9e995311a9b8 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -117,6 +117,7 @@ enum perf_output_field {
PERF_OUTPUT_IPC = 1ULL << 31,
PERF_OUTPUT_TOD = 1ULL << 32,
PERF_OUTPUT_DATA_PAGE_SIZE = 1ULL << 33,
+ PERF_OUTPUT_CODE_PAGE_SIZE = 1ULL << 34,
};

struct perf_script {
@@ -182,6 +183,7 @@ struct output_option {
{.str = "ipc", .field = PERF_OUTPUT_IPC},
{.str = "tod", .field = PERF_OUTPUT_TOD},
{.str = "data_page_size", .field = PERF_OUTPUT_DATA_PAGE_SIZE},
+ {.str = "code_page_size", .field = PERF_OUTPUT_CODE_PAGE_SIZE},
};

enum {
@@ -255,7 +257,7 @@ static struct {
PERF_OUTPUT_DSO | PERF_OUTPUT_PERIOD |
PERF_OUTPUT_ADDR | PERF_OUTPUT_DATA_SRC |
PERF_OUTPUT_WEIGHT | PERF_OUTPUT_PHYS_ADDR |
- PERF_OUTPUT_DATA_PAGE_SIZE,
+ PERF_OUTPUT_DATA_PAGE_SIZE | PERF_OUTPUT_CODE_PAGE_SIZE,

.invalid_fields = PERF_OUTPUT_TRACE | PERF_OUTPUT_BPF_OUTPUT,
},
@@ -507,6 +509,10 @@ static int evsel__check_attr(struct evsel *evsel, struct perf_session *session)
evsel__check_stype(evsel, PERF_SAMPLE_DATA_PAGE_SIZE, "DATA_PAGE_SIZE", PERF_OUTPUT_DATA_PAGE_SIZE))
return -EINVAL;

+ if (PRINT_FIELD(CODE_PAGE_SIZE) &&
+ evsel__check_stype(evsel, PERF_SAMPLE_CODE_PAGE_SIZE, "CODE_PAGE_SIZE", PERF_OUTPUT_CODE_PAGE_SIZE))
+ return -EINVAL;
+
return 0;
}

@@ -2020,6 +2026,9 @@ static void process_event(struct perf_script *script,
if (PRINT_FIELD(DATA_PAGE_SIZE))
fprintf(fp, " %s", get_page_size_name(sample->data_page_size, str));

+ if (PRINT_FIELD(CODE_PAGE_SIZE))
+ fprintf(fp, " %s", get_page_size_name(sample->code_page_size, str));
+
perf_sample__fprintf_ipc(sample, attr, fp);

fprintf(fp, "\n");
@@ -3519,7 +3528,7 @@ int cmd_script(int argc, const char **argv)
"addr,symoff,srcline,period,iregs,uregs,brstack,"
"brstacksym,flags,bpf-output,brstackinsn,brstackoff,"
"callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod,"
- "data_page_size",
+ "data_page_size,code_page_size",
parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 357d6b972b9d..492c994c948a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1312,6 +1312,9 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
printf(" .. data page size: %s\n", get_page_size_name(sample->data_page_size, str));

+ if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)
+ printf(" .. code page size: %s\n", get_page_size_name(sample->code_page_size, str));
+
if (sample_type & PERF_SAMPLE_TRANSACTION)
printf("... transaction: %" PRIx64 "\n", sample->transaction);

--
2.25.1

2021-01-05 22:49:18

by Liang, Kan

Subject: [PATCH V4 3/6] perf tools: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

From: Kan Liang <[email protected]>

Add the infrastructure to sample the page size of the code address.

Introduce a new --code-page-size option for perf record.
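
The core of this infrastructure is that the optional u64 fields of a
sample are written and read back in a fixed order, each one present
only if its sample_type bit is set; in the diff below the code page
size follows the data page size and precedes the AUX data. A simplified
sketch of that pattern, assuming only these two fields: all sk_* names
are hypothetical, and the bit values are illustrative stand-ins for the
definitions in <linux/perf_event.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values; real code takes them from <linux/perf_event.h>. */
#define SK_SAMPLE_DATA_PAGE_SIZE (1ULL << 22)
#define SK_SAMPLE_CODE_PAGE_SIZE (1ULL << 23)

struct sk_sample {
	uint64_t data_page_size;
	uint64_t code_page_size;
};

/* Number of u64 words the enabled page size fields occupy. */
static size_t sk_sample_words(uint64_t type)
{
	size_t n = 0;

	if (type & SK_SAMPLE_DATA_PAGE_SIZE)
		n++;
	if (type & SK_SAMPLE_CODE_PAGE_SIZE)
		n++;
	return n;
}

/* Write the enabled fields in order; returns the number of words written. */
static size_t sk_synthesize(uint64_t type, const struct sk_sample *s,
			    uint64_t *array)
{
	uint64_t *start = array;

	assert(s != NULL && array != NULL);
	if (type & SK_SAMPLE_DATA_PAGE_SIZE)
		*array++ = s->data_page_size;
	if (type & SK_SAMPLE_CODE_PAGE_SIZE)
		*array++ = s->code_page_size;
	return (size_t)(array - start);
}

/* Read the fields back in the same order; absent fields are left as 0. */
static size_t sk_parse(uint64_t type, const uint64_t *array,
		       struct sk_sample *s)
{
	const uint64_t *start = array;

	assert(s != NULL && array != NULL);
	s->data_page_size = 0;
	s->code_page_size = 0;
	if (type & SK_SAMPLE_DATA_PAGE_SIZE)
		s->data_page_size = *array++;
	if (type & SK_SAMPLE_CODE_PAGE_SIZE)
		s->code_page_size = *array++;
	return (size_t)(array - start);
}
```

Because a field occupies no space when its bit is clear, writer and
reader must test the bits in the same order, otherwise every later
field is read from the wrong offset.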

Acked-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Originally-by: Stephane Eranian <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 2 ++
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 18 +++++++++++++++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 2 +-
tools/perf/util/record.h | 1 +
tools/perf/util/synthetic-events.c | 8 ++++++++
8 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 0042ff7f6f33..9087b223e324 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -296,6 +296,9 @@ OPTIONS
--data-page-size::
Record the sampled data address data page size.

+--code-page-size::
+ Record the sampled code address (ip) page size
+
-T::
--timestamp::
Record the sample timestamps. Use it with 'perf report -D' to see the
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7bb10e9863bd..7704c33bfe31 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2477,6 +2477,8 @@ static struct option __record_options[] = {
"Record the sample physical addresses"),
OPT_BOOLEAN(0, "data-page-size", &record.opts.sample_data_page_size,
"Record the sampled data address data page size"),
+ OPT_BOOLEAN(0, "code-page-size", &record.opts.sample_code_page_size,
+ "Record the sampled code address (ip) page size"),
OPT_BOOLEAN(0, "sample-cpu", &record.opts.sample_cpu, "Record the sample cpu"),
OPT_BOOLEAN_SET('T', "timestamp", &record.opts.sample_time,
&record.opts.sample_time_set,
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index ff403ea578e1..2afea7247dd3 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -136,6 +136,7 @@ struct perf_sample {
u64 data_src;
u64 phys_addr;
u64 data_page_size;
+ u64 code_page_size;
u64 cgroup;
u32 flags;
u16 insn_len;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dc0cfa5f2610..d1463d6c9336 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1193,6 +1193,9 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (opts->sample_data_page_size)
evsel__set_sample_bit(evsel, DATA_PAGE_SIZE);

+ if (opts->sample_code_page_size)
+ evsel__set_sample_bit(evsel, CODE_PAGE_SIZE);
+
if (opts->record_switch_events)
attr->context_switch = track;

@@ -1875,7 +1878,12 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
* Must probe features in the order they were added to the
* perf_event_attr interface.
*/
- if (!perf_missing_features.data_page_size &&
+ if (!perf_missing_features.code_page_size &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
+ perf_missing_features.code_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
+ goto out_close;
+ } else if (!perf_missing_features.data_page_size &&
(evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
perf_missing_features.data_page_size = true;
pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
@@ -2371,6 +2379,12 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
array++;
}

+ data->code_page_size = 0;
+ if (type & PERF_SAMPLE_CODE_PAGE_SIZE) {
+ data->code_page_size = *array;
+ array++;
+ }
+
if (type & PERF_SAMPLE_AUX) {
OVERFLOW_CHECK_u64(array);
sz = *array++;
@@ -2680,6 +2694,8 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
"We found oprofile daemon running, please stop it and try again.");
break;
case EINVAL:
+ if (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE && perf_missing_features.code_page_size)
+ return scnprintf(msg, size, "Asking for the code page size isn't supported by this kernel.");
if (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE && perf_missing_features.data_page_size)
return scnprintf(msg, size, "Asking for the data page size isn't supported by this kernel.");
if (evsel->core.attr.write_backward && perf_missing_features.write_backward)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cd1d8dd43199..157d7c27d6e3 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -145,6 +145,7 @@ struct perf_missing_features {
bool branch_hw_idx;
bool cgroup;
bool data_page_size;
+ bool code_page_size;
};

extern struct perf_missing_features perf_missing_features;
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 22b417f43470..1bd6cfd74257 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -35,7 +35,7 @@ static void __p_sample_type(char *buf, size_t size, u64 value)
bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC),
bit_name(WEIGHT), bit_name(PHYS_ADDR), bit_name(AUX),
- bit_name(CGROUP), bit_name(DATA_PAGE_SIZE),
+ bit_name(CGROUP), bit_name(DATA_PAGE_SIZE), bit_name(CODE_PAGE_SIZE),
{ .name = NULL, }
};
#undef bit_name
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index b996ce61fadd..68f471d9a88b 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -23,6 +23,7 @@ struct record_opts {
bool sample_address;
bool sample_phys_addr;
bool sample_data_page_size;
+ bool sample_code_page_size;
bool sample_weight;
bool sample_time;
bool sample_time_set;
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 69688f20db11..3a898520f05c 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -1473,6 +1473,9 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
if (type & PERF_SAMPLE_DATA_PAGE_SIZE)
result += sizeof(u64);

+ if (type & PERF_SAMPLE_CODE_PAGE_SIZE)
+ result += sizeof(u64);
+
if (type & PERF_SAMPLE_AUX) {
result += sizeof(u64);
result += sample->aux_sample.size;
@@ -1657,6 +1660,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
array++;
}

+ if (type & PERF_SAMPLE_CODE_PAGE_SIZE) {
+ *array = sample->code_page_size;
+ array++;
+ }
+
if (type & PERF_SAMPLE_AUX) {
sz = sample->aux_sample.size;
*array++ = sz;
--
2.25.1

2021-01-05 22:49:52

by Liang, Kan

Subject: [PATCH V4 5/6] perf report: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

From: Stephane Eranian <[email protected]>

Add a new sort dimension, "code_page_size", to the common sort
dimensions. With this sort key, perf report can sort and report samples
by their code page size.

For example,
perf report --stdio --sort=comm,symbol,code_page_size
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 3K of event 'mem-loads:uP'
# Event count (approx.): 1470769
#
# Overhead  Command  Symbol                        Code Page Size  IPC   [IPC Coverage]
# ........  .......  ............................  ..............  ....................
#
    69.56%  dtlb     [.] GetTickCount              4K              -      -
    17.93%  dtlb     [.] Calibrate                 4K              -      -
    11.40%  dtlb     [.] __gettimeofday            4K              -      -
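
The comparator added in the diff below returns (int64_t)(r - l). As a
standalone illustration of ordering entries by code page size, here is
a sketch built on qsort(); struct sk_entry and both functions are
hypothetical and not part of perf. It returns the same sign as r - l
for realistic page sizes, but uses explicit comparisons so that the
64-bit difference is never narrowed to an int. How perf's hist code
consumes its comparator is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-in for a histogram entry. */
struct sk_entry {
	const char *symbol;
	uint64_t code_page_size;
};

/* qsort comparator: positive when the right entry has the larger page. */
static int sk_cmp_code_page_size(const void *a, const void *b)
{
	const struct sk_entry *l = a;
	const struct sk_entry *r = b;

	return (r->code_page_size > l->code_page_size) -
	       (r->code_page_size < l->code_page_size);
}

/* Sort entries in place; in qsort terms, larger pages come first. */
static void sk_sort_by_code_page_size(struct sk_entry *entries, size_t n)
{
	if (n == 0)
		return;
	assert(entries != NULL);
	qsort(entries, n, sizeof(*entries), sk_cmp_code_page_size);
}
```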

Acked-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/util/hist.c | 2 ++
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 26 ++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
5 files changed, 32 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8f7f4e9605d8..e44045842c5c 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -108,6 +108,7 @@ OPTIONS
- period: Raw number of event count of sample
- time: Separate the samples by time stamp with the resolution specified by
--time-quantum (default 100ms). Specify with overhead and before it.
+ - code_page_size: the code page size of sampled code address (ip)

By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index a08fb9ea411b..6d50379af90e 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -212,6 +212,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_TIME, 16);
else
hists__new_col_len(hists, HISTC_TIME, 12);
+ hists__new_col_len(hists, HISTC_CODE_PAGE_SIZE, 6);

if (h->srcline) {
len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
@@ -718,6 +719,7 @@ __hists__add_entry(struct hists *hists,
.cpumode = al->cpumode,
.ip = al->addr,
.level = al->level,
+ .code_page_size = sample->code_page_size,
.stat = {
.nr_events = 1,
.period = sample->period,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 14f66330923d..361108533a56 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -53,6 +53,7 @@ enum hist_column {
HISTC_DSO_TO,
HISTC_LOCAL_WEIGHT,
HISTC_GLOBAL_WEIGHT,
+ HISTC_CODE_PAGE_SIZE,
HISTC_MEM_DADDR_SYMBOL,
HISTC_MEM_DADDR_DSO,
HISTC_MEM_PHYS_DADDR,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 80907bc32683..c00934c91b58 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1491,6 +1491,31 @@ struct sort_entry sort_mem_data_page_size = {
.se_width_idx = HISTC_MEM_DATA_PAGE_SIZE,
};

+static int64_t
+sort__code_page_size_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ uint64_t l = left->code_page_size;
+ uint64_t r = right->code_page_size;
+
+ return (int64_t)(r - l);
+}
+
+static int hist_entry__code_page_size_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ char str[PAGE_SIZE_NAME_LEN];
+
+ return repsep_snprintf(bf, size, "%-*s", width,
+ get_page_size_name(he->code_page_size, str));
+}
+
+struct sort_entry sort_code_page_size = {
+ .se_header = "Code Page Size",
+ .se_cmp = sort__code_page_size_cmp,
+ .se_snprintf = hist_entry__code_page_size_snprintf,
+ .se_width_idx = HISTC_CODE_PAGE_SIZE,
+};
+
static int64_t
sort__abort_cmp(struct hist_entry *left, struct hist_entry *right)
{
@@ -1735,6 +1760,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null),
DIM(SORT_TIME, "time", sort_time),
+ DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size),
};

#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index e50f2b695bc4..cab4172a6ec3 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -106,6 +106,7 @@ struct hist_entry {
u64 transaction;
s32 socket;
s32 cpu;
+ u64 code_page_size;
u8 cpumode;
u8 depth;

@@ -229,6 +230,7 @@ enum sort_type {
SORT_CGROUP_ID,
SORT_SYM_IPC_NULL,
SORT_TIME,
+ SORT_CODE_PAGE_SIZE,

/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.25.1

2021-01-05 22:52:51

by Liang, Kan

Subject: [PATCH V4 6/6] perf test: Add test case for PERF_SAMPLE_CODE_PAGE_SIZE

From: Stephane Eranian <[email protected]>

Extend the sample-parsing test cases to cover the new sample type
PERF_SAMPLE_CODE_PAGE_SIZE.

Acked-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/tests/sample-parsing.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index 2393916f6128..e93d0689a27b 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -157,6 +157,9 @@ static bool samples_same(const struct perf_sample *s1,
if (type & PERF_SAMPLE_DATA_PAGE_SIZE)
COMP(data_page_size);

+ if (type & PERF_SAMPLE_CODE_PAGE_SIZE)
+ COMP(code_page_size);
+
if (type & PERF_SAMPLE_AUX) {
COMP(aux_sample.size);
if (memcmp(s1->aux_sample.data, s2->aux_sample.data,
@@ -238,6 +241,7 @@ static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
.phys_addr = 113,
.cgroup = 114,
.data_page_size = 115,
+ .code_page_size = 116,
.aux_sample = {
.size = sizeof(aux_data),
.data = (void *)aux_data,
--
2.25.1

2021-01-06 00:15:19

by Liang, Kan

Subject: [PATCH V4 1/6] perf mem: Clean up output format

From: Kan Liang <[email protected]>

Currently, "--phys-data" is the only option that affects the output
format, so a simple "if else" is enough to handle it. However, more
options that also affect the output format, e.g. "--data-page-size",
are about to be added, and the code would become too complex to
maintain.

Divide the big printf into several small pieces, and output a piece
only if the related option is applied.

No functional change.
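
As an illustration of the piecewise approach, here is a sketch that
assembles the header line from a fixed prefix, optional columns and a
fixed suffix. It is not the code from this patch: sk_append() and
sk_mem_header() are hypothetical helpers, and the "DATA PAGE SIZE, "
label is an assumed name for the column that a later option would add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Append piece to buf at offset len, truncating if needed so the result
 * stays NUL-terminated within size bytes. Requires len < size.
 * Returns the new length.
 */
static size_t sk_append(char *buf, size_t size, size_t len, const char *piece)
{
	size_t room = size - len - 1;	/* bytes left, excluding the NUL */
	size_t n = strlen(piece);

	if (n > room)
		n = room;
	memcpy(buf + len, piece, n);
	buf[len + n] = '\0';
	return len + n;
}

/* Build the header: each optional column is emitted only when enabled. */
static size_t sk_mem_header(char *buf, size_t size, bool phys_addr,
			    bool data_page_size)
{
	size_t len = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	len = sk_append(buf, size, len, "# PID, TID, IP, ADDR, ");
	if (phys_addr)
		len = sk_append(buf, size, len, "PHYS ADDR, ");
	if (data_page_size)	/* assumed label, see the note above */
		len = sk_append(buf, size, len, "DATA PAGE SIZE, ");
	len = sk_append(buf, size, len, "LOCAL WEIGHT, DSRC, SYMBOL\n");
	return len;
}
```

Adding another optional column then takes one more conditional append
instead of doubling the number of complete format strings.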

Acked-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-mem.c | 93 ++++++++++++++++------------------------
1 file changed, 38 insertions(+), 55 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 823742036ddb..7d6ee2208709 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -172,7 +172,7 @@ dump_raw_samples(struct perf_tool *tool,
{
struct perf_mem *mem = container_of(tool, struct perf_mem, tool);
struct addr_location al;
- const char *fmt;
+ const char *fmt, *field_sep;

if (machine__resolve(machine, &al, sample) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
@@ -186,60 +186,41 @@ dump_raw_samples(struct perf_tool *tool,
if (al.map != NULL)
al.map->dso->hit = 1;

- if (mem->phys_addr) {
- if (symbol_conf.field_sep) {
- fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s0x%016"PRIx64
- "%s%"PRIu64"%s0x%"PRIx64"%s%s:%s\n";
- } else {
- fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64
- "%s0x%016"PRIx64"%s%5"PRIu64"%s0x%06"PRIx64
- "%s%s:%s\n";
- symbol_conf.field_sep = " ";
- }
-
- printf(fmt,
- sample->pid,
- symbol_conf.field_sep,
- sample->tid,
- symbol_conf.field_sep,
- sample->ip,
- symbol_conf.field_sep,
- sample->addr,
- symbol_conf.field_sep,
- sample->phys_addr,
- symbol_conf.field_sep,
- sample->weight,
- symbol_conf.field_sep,
- sample->data_src,
- symbol_conf.field_sep,
- al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???",
- al.sym ? al.sym->name : "???");
+ field_sep = symbol_conf.field_sep;
+ if (field_sep) {
+ fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s";
} else {
- if (symbol_conf.field_sep) {
- fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s%"PRIu64
- "%s0x%"PRIx64"%s%s:%s\n";
- } else {
- fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64
- "%s%5"PRIu64"%s0x%06"PRIx64"%s%s:%s\n";
- symbol_conf.field_sep = " ";
- }
+ fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64"%s";
+ symbol_conf.field_sep = " ";
+ }
+ printf(fmt,
+ sample->pid,
+ symbol_conf.field_sep,
+ sample->tid,
+ symbol_conf.field_sep,
+ sample->ip,
+ symbol_conf.field_sep,
+ sample->addr,
+ symbol_conf.field_sep);

- printf(fmt,
- sample->pid,
- symbol_conf.field_sep,
- sample->tid,
- symbol_conf.field_sep,
- sample->ip,
- symbol_conf.field_sep,
- sample->addr,
- symbol_conf.field_sep,
- sample->weight,
- symbol_conf.field_sep,
- sample->data_src,
- symbol_conf.field_sep,
- al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???",
- al.sym ? al.sym->name : "???");
+ if (mem->phys_addr) {
+ printf("0x%016"PRIx64"%s",
+ sample->phys_addr,
+ symbol_conf.field_sep);
}
+
+ if (field_sep)
+ fmt = "%"PRIu64"%s0x%"PRIx64"%s%s:%s\n";
+ else
+ fmt = "%5"PRIu64"%s0x%06"PRIx64"%s%s:%s\n";
+
+ printf(fmt,
+ sample->weight,
+ symbol_conf.field_sep,
+ sample->data_src,
+ symbol_conf.field_sep,
+ al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???",
+ al.sym ? al.sym->name : "???");
out_put:
addr_location__put(&al);
return 0;
@@ -287,10 +268,12 @@ static int report_raw_events(struct perf_mem *mem)
if (ret < 0)
goto out_delete;

+ printf("# PID, TID, IP, ADDR, ");
+
if (mem->phys_addr)
- printf("# PID, TID, IP, ADDR, PHYS ADDR, LOCAL WEIGHT, DSRC, SYMBOL\n");
- else
- printf("# PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL\n");
+ printf("PHYS ADDR, ");
+
+ printf("LOCAL WEIGHT, DSRC, SYMBOL\n");

ret = perf_session__process_events(session);

--
2.25.1

2021-01-12 11:19:37

by Athira Rajeev

Subject: Re: [PATCH V4 0/6] Add the page size in the perf record (user tools)



> On 06-Jan-2021, at 1:27 AM, [email protected] wrote:
>
> This patch set is to enable the page size support in user tools.

Hi Kan Liang,

I am trying to check this series on powerpc.

# perf mem --phys-data --data-page-size record <workload>

From what I observe, some of the samples return a page size of zero, which shows up as 'N/A' in the perf report:

# perf mem --phys-data --data-page-size report

For fetching the page size, a weak function (arch_perf_get_page_size) was initially added here:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=51b646b2d9f84d6ff6300e3c1d09f2be4329a424

but I see that it was later removed here:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=8af26be062721e52eba1550caf50b712f774c5fd

I picked up the kernel changes from git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git. Am I missing something?

Thanks
Athira


2021-01-13 02:27:31

by Liang, Kan

Subject: Re: [PATCH V4 0/6] Add the page size in the perf record (user tools)



On 1/12/2021 12:24 AM, Athira Rajeev wrote:
>
> Hi Kan Liang,
>
> I am trying to check this series on powerpc.
>
> # perf mem --phys-data --data-page-size record <workload>
>
> To my observation, some of the samples returned zero size and comes as ’N/A’ in the perf report
>
> # perf mem --phys-data --data-page-size report
>
> For fetching the page size, though initially there was a weak function added ( as arch_perf_get_page_size ) here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=51b646b2d9f84d6ff6300e3c1d09f2be4329a424
>
> later I see it got removed here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=8af26be062721e52eba1550caf50b712f774c5fd
>
> I picked kernel changes from git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git , or I am missing something ?

I believe all the kernel changes have been merged.

According to the commit message of the recent changes, only Power/8xxx
is supported on powerpc for now. I guess that may be the reason for
some of the 'N/A's.
https://lore.kernel.org/patchwork/cover/1345521/

Thanks,
Kan



2021-01-15 19:27:31

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH V4 4/6] perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

On Tue, Jan 05, 2021 at 11:57:50AM -0800, [email protected] wrote:
> From: Stephane Eranian <[email protected]>
>
> Display sampled code page sizes when PERF_SAMPLE_CODE_PAGE_SIZE was set.
>
> For example,
> perf script --fields comm,event,ip,code_page_size
> dtlb mem-loads:uP: 445777 4K
> dtlb mem-loads:uP: 40f724 4K
> dtlb mem-loads:uP: 474926 4K
> dtlb mem-loads:uP: 401075 4K
> dtlb mem-loads:uP: 401095 4K
> dtlb mem-loads:uP: 401095 4K
> dtlb mem-loads:uP: 4010cc 4K
> dtlb mem-loads:uP: 440b6f 4K
>
> Acked-by: Namhyung Kim <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>
> Signed-off-by: Stephane Eranian <[email protected]>

You missed your Signed-off-by, I'm adding it, please ack this change.

- Arnaldo

> ---
> tools/perf/Documentation/perf-script.txt | 2 +-
> tools/perf/builtin-script.c | 13 +++++++++++--
> tools/perf/util/session.c | 3 +++
> 3 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
> index 44d37210fc8f..60dae302db27 100644
> --- a/tools/perf/Documentation/perf-script.txt
> +++ b/tools/perf/Documentation/perf-script.txt
> @@ -118,7 +118,7 @@ OPTIONS
> comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
> srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output,
> brstackinsn, brstackoff, callindent, insn, insnlen, synth, phys_addr,
> - metric, misc, srccode, ipc, data_page_size.
> + metric, misc, srccode, ipc, data_page_size, code_page_size.
> Field list can be prepended with the type, trace, sw or hw,
> to indicate to which event type the field list applies.
> e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index edacfa98d073..9e995311a9b8 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -117,6 +117,7 @@ enum perf_output_field {
> PERF_OUTPUT_IPC = 1ULL << 31,
> PERF_OUTPUT_TOD = 1ULL << 32,
> PERF_OUTPUT_DATA_PAGE_SIZE = 1ULL << 33,
> + PERF_OUTPUT_CODE_PAGE_SIZE = 1ULL << 34,
> };
>
> struct perf_script {
> @@ -182,6 +183,7 @@ struct output_option {
> {.str = "ipc", .field = PERF_OUTPUT_IPC},
> {.str = "tod", .field = PERF_OUTPUT_TOD},
> {.str = "data_page_size", .field = PERF_OUTPUT_DATA_PAGE_SIZE},
> + {.str = "code_page_size", .field = PERF_OUTPUT_CODE_PAGE_SIZE},
> };
>
> enum {
> @@ -255,7 +257,7 @@ static struct {
> PERF_OUTPUT_DSO | PERF_OUTPUT_PERIOD |
> PERF_OUTPUT_ADDR | PERF_OUTPUT_DATA_SRC |
> PERF_OUTPUT_WEIGHT | PERF_OUTPUT_PHYS_ADDR |
> - PERF_OUTPUT_DATA_PAGE_SIZE,
> + PERF_OUTPUT_DATA_PAGE_SIZE | PERF_OUTPUT_CODE_PAGE_SIZE,
>
> .invalid_fields = PERF_OUTPUT_TRACE | PERF_OUTPUT_BPF_OUTPUT,
> },
> @@ -507,6 +509,10 @@ static int evsel__check_attr(struct evsel *evsel, struct perf_session *session)
> evsel__check_stype(evsel, PERF_SAMPLE_DATA_PAGE_SIZE, "DATA_PAGE_SIZE", PERF_OUTPUT_DATA_PAGE_SIZE))
> return -EINVAL;
>
> + if (PRINT_FIELD(CODE_PAGE_SIZE) &&
> + evsel__check_stype(evsel, PERF_SAMPLE_CODE_PAGE_SIZE, "CODE_PAGE_SIZE", PERF_OUTPUT_CODE_PAGE_SIZE))
> + return -EINVAL;
> +
> return 0;
> }
>
> @@ -2020,6 +2026,9 @@ static void process_event(struct perf_script *script,
> if (PRINT_FIELD(DATA_PAGE_SIZE))
> fprintf(fp, " %s", get_page_size_name(sample->data_page_size, str));
>
> + if (PRINT_FIELD(CODE_PAGE_SIZE))
> + fprintf(fp, " %s", get_page_size_name(sample->code_page_size, str));
> +
> perf_sample__fprintf_ipc(sample, attr, fp);
>
> fprintf(fp, "\n");
> @@ -3519,7 +3528,7 @@ int cmd_script(int argc, const char **argv)
> "addr,symoff,srcline,period,iregs,uregs,brstack,"
> "brstacksym,flags,bpf-output,brstackinsn,brstackoff,"
> "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod,"
> - "data_page_size",
> + "data_page_size,code_page_size",
> parse_output_fields),
> OPT_BOOLEAN('a', "all-cpus", &system_wide,
> "system-wide collection from all CPUs"),
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 357d6b972b9d..492c994c948a 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1312,6 +1312,9 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
> if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
> printf(" .. data page size: %s\n", get_page_size_name(sample->data_page_size, str));
>
> + if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)
> + printf(" .. code page size: %s\n", get_page_size_name(sample->code_page_size, str));
> +
> if (sample_type & PERF_SAMPLE_TRANSACTION)
> printf("... transaction: %" PRIx64 "\n", sample->transaction);
>
> --
> 2.25.1
>

--

- Arnaldo
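
The patch above adds code_page_size as bit 34 of the output-field mask and registers its name in the option table. A minimal sketch of that name-to-bit lookup follows, assuming a simplified table and a parser written only for this note; the OUT_* names, the bit positions of comm and ip, and parse_fields() are hypothetical, and only bits 33 and 34 mirror the patch.

```c
#include <stdint.h>
#include <string.h>

/* Bits 33 and 34 mirror the patch; the comm/ip positions are placeholders. */
#define OUT_COMM           (1ULL << 0)
#define OUT_IP             (1ULL << 1)
#define OUT_DATA_PAGE_SIZE (1ULL << 33)
#define OUT_CODE_PAGE_SIZE (1ULL << 34)

struct field_option {
	const char *str;
	uint64_t field;
};

static const struct field_option field_options[] = {
	{ .str = "comm",           .field = OUT_COMM },
	{ .str = "ip",             .field = OUT_IP },
	{ .str = "data_page_size", .field = OUT_DATA_PAGE_SIZE },
	{ .str = "code_page_size", .field = OUT_CODE_PAGE_SIZE },
};

/* Hypothetical parser: turn "a,b,c" into a bitmask, -1 on an unknown name. */
static int parse_fields(const char *list, uint64_t *mask)
{
	*mask = 0;
	while (*list) {
		size_t n = strcspn(list, ",");
		size_t i;
		int found = 0;

		for (i = 0; i < sizeof(field_options) / sizeof(field_options[0]); i++) {
			const char *s = field_options[i].str;

			if (strlen(s) == n && strncmp(s, list, n) == 0) {
				*mask |= field_options[i].field;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;

		list += n;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

Because the new bits sit above bit 31, the mask has to be a 64-bit type; a plain int mask would silently lose both page-size fields.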

2021-01-15 19:34:53

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH V4 5/6] perf report: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

On Tue, Jan 05, 2021 at 11:57:51AM -0800, [email protected] wrote:
> From: Stephane Eranian <[email protected]>
>
> Add a new sort dimension "code_page_size" for common sort.
> With this option applied, perf can sort and report by sample's code page
> size.

Ditto, adding your:

Signed-off-by: Kan Liang <[email protected]>

> For example,
> perf report --stdio --sort=comm,symbol,code_page_size
> # To display the perf.data header info, please use
> # --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 3K of event 'mem-loads:uP'
> # Event count (approx.): 1470769
> #
> # Overhead  Command  Symbol                        Code Page Size  IPC   [IPC Coverage]
> # ........  .......  ............................  ..............  ....................
> #
>     69.56%  dtlb     [.] GetTickCount              4K              -      -
>     17.93%  dtlb     [.] Calibrate                 4K              -      -
>     11.40%  dtlb     [.] __gettimeofday            4K              -      -
>
> Acked-by: Namhyung Kim <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
> tools/perf/Documentation/perf-report.txt | 1 +
> tools/perf/util/hist.c | 2 ++
> tools/perf/util/hist.h | 1 +
> tools/perf/util/sort.c | 26 ++++++++++++++++++++++++
> tools/perf/util/sort.h | 2 ++
> 5 files changed, 32 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 8f7f4e9605d8..e44045842c5c 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -108,6 +108,7 @@ OPTIONS
> - period: Raw number of event count of sample
> - time: Separate the samples by time stamp with the resolution specified by
> --time-quantum (default 100ms). Specify with overhead and before it.
> + - code_page_size: the code page size of sampled code address (ip)
>
> By default, comm, dso and symbol keys are used.
> (i.e. --sort comm,dso,symbol)
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index a08fb9ea411b..6d50379af90e 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -212,6 +212,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> hists__new_col_len(hists, HISTC_TIME, 16);
> else
> hists__new_col_len(hists, HISTC_TIME, 12);
> + hists__new_col_len(hists, HISTC_CODE_PAGE_SIZE, 6);
>
> if (h->srcline) {
> len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
> @@ -718,6 +719,7 @@ __hists__add_entry(struct hists *hists,
> .cpumode = al->cpumode,
> .ip = al->addr,
> .level = al->level,
> + .code_page_size = sample->code_page_size,
> .stat = {
> .nr_events = 1,
> .period = sample->period,
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 14f66330923d..361108533a56 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -53,6 +53,7 @@ enum hist_column {
> HISTC_DSO_TO,
> HISTC_LOCAL_WEIGHT,
> HISTC_GLOBAL_WEIGHT,
> + HISTC_CODE_PAGE_SIZE,
> HISTC_MEM_DADDR_SYMBOL,
> HISTC_MEM_DADDR_DSO,
> HISTC_MEM_PHYS_DADDR,
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 80907bc32683..c00934c91b58 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1491,6 +1491,31 @@ struct sort_entry sort_mem_data_page_size = {
> .se_width_idx = HISTC_MEM_DATA_PAGE_SIZE,
> };
>
> +static int64_t
> +sort__code_page_size_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + uint64_t l = left->code_page_size;
> + uint64_t r = right->code_page_size;
> +
> + return (int64_t)(r - l);
> +}
> +
> +static int hist_entry__code_page_size_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + char str[PAGE_SIZE_NAME_LEN];
> +
> + return repsep_snprintf(bf, size, "%-*s", width,
> + get_page_size_name(he->code_page_size, str));
> +}
> +
> +struct sort_entry sort_code_page_size = {
> + .se_header = "Code Page Size",
> + .se_cmp = sort__code_page_size_cmp,
> + .se_snprintf = hist_entry__code_page_size_snprintf,
> + .se_width_idx = HISTC_CODE_PAGE_SIZE,
> +};
> +
> static int64_t
> sort__abort_cmp(struct hist_entry *left, struct hist_entry *right)
> {
> @@ -1735,6 +1760,7 @@ static struct sort_dimension common_sort_dimensions[] = {
> DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
> DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null),
> DIM(SORT_TIME, "time", sort_time),
> + DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size),
> };
>
> #undef DIM
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index e50f2b695bc4..cab4172a6ec3 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -106,6 +106,7 @@ struct hist_entry {
> u64 transaction;
> s32 socket;
> s32 cpu;
> + u64 code_page_size;
> u8 cpumode;
> u8 depth;
>
> @@ -229,6 +230,7 @@ enum sort_type {
> SORT_CGROUP_ID,
> SORT_SYM_IPC_NULL,
> SORT_TIME,
> + SORT_CODE_PAGE_SIZE,
>
> /* branch stack specific sort keys */
> __SORT_BRANCH_STACK,
> --
> 2.25.1
>

--

- Arnaldo
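
The comparator in the patch above returns (int64_t)(r - l), which has the right sign as long as the difference between the two sizes fits in 63 bits. As a sketch of the same sign convention written as an explicit three-way comparison, consider the following; struct entry, code_page_size_cmp() and entry_qsort_cmp() are hypothetical names for this note, and the qsort() wrapper only demonstrates the resulting order, not how perf's hist code invokes se_cmp.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reduced entry; perf's struct hist_entry has many more fields. */
struct entry {
	uint64_t code_page_size;
};

/* Positive when the right entry has the larger page size, like r - l. */
static int64_t code_page_size_cmp(const struct entry *left,
				  const struct entry *right)
{
	uint64_t l = left->code_page_size;
	uint64_t r = right->code_page_size;

	return (int64_t)(r > l) - (int64_t)(r < l);
}

/* Adapter so the comparator can be exercised with qsort(). */
static int entry_qsort_cmp(const void *a, const void *b)
{
	return (int)code_page_size_cmp(a, b);
}
```

Passed to qsort(), this comparator orders entries from the largest page size to the smallest, and it never depends on how an unsigned difference is reinterpreted as a signed value.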

2021-01-18 13:46:14

by Liang, Kan

Subject: Re: [PATCH V4 4/6] perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE



On 1/15/2021 2:25 PM, Arnaldo Carvalho de Melo wrote:
> On Tue, Jan 05, 2021 at 11:57:50AM -0800, [email protected] wrote:
>> From: Stephane Eranian <[email protected]>
>>
>> Display sampled code page sizes when PERF_SAMPLE_CODE_PAGE_SIZE was set.
>>
>> For example,
>> perf script --fields comm,event,ip,code_page_size
>> dtlb mem-loads:uP: 445777 4K
>> dtlb mem-loads:uP: 40f724 4K
>> dtlb mem-loads:uP: 474926 4K
>> dtlb mem-loads:uP: 401075 4K
>> dtlb mem-loads:uP: 401095 4K
>> dtlb mem-loads:uP: 401095 4K
>> dtlb mem-loads:uP: 4010cc 4K
>> dtlb mem-loads:uP: 440b6f 4K
>>
>> Acked-by: Namhyung Kim <[email protected]>
>> Acked-by: Jiri Olsa <[email protected]>
>> Signed-off-by: Stephane Eranian <[email protected]>
> You missed your Signed-off-by, I'm adding it, please ack this change.

Patches 4 and 5 are from Stephane. I only made minor changes so that
the code can be rebased onto the latest perf/core branch (c07b45a355ee).

You may add a tag like the one below.

[[email protected]: Rebase on top of acme's perf/core branch
commit c07b45a355ee]
Signed-off-by: Kan Liang <[email protected]>


Thanks,
Kan

2021-01-19 12:46:34

by Athira Rajeev

Subject: Re: [PATCH V4 0/6] Add the page size in the perf record (user tools)



> On 13-Jan-2021, at 12:43 AM, Liang, Kan <[email protected]> wrote:
>
>
>
> On 1/12/2021 12:24 AM, Athira Rajeev wrote:
>>> On 06-Jan-2021, at 1:27 AM, [email protected] wrote:
>>>
>>> From: Kan Liang <[email protected]>
>>>
>>> Changes since V3:
>>> - Rebase on top of acme's perf/core branch
>>> commit c07b45a355ee ("perf record: Tweak "Lowering..." warning in record_opts__config_freq")
>>>
>>> Changes since V2:
>>> - Rebase on top of acme perf/core branch
>>> commit eec7b53d5916 ("perf test: Make sample-parsing test aware of PERF_SAMPLE_{CODE,DATA}_PAGE_SIZE")
>>> - Use unit_number__scnprintf() in get_page_size_name()
>>> - Emit warning about kernel not supporting the code page size sample_type bit
>>>
>>> Changes since V1:
>>> - Fix the compile warning with GCC 10
>>> - Add Acked-by from Namhyung Kim
>>>
>>> Current perf can report both virtual addresses and physical addresses,
>>> but not the page size. Without the page size information of the utilized
>>> page, users cannot decide whether to promote/demote large pages to
>>> optimize memory usage.
>>>
>>> The kernel patches have been merged into tip perf/core branch,
>>> commit 8d97e71811aa ("perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE")
>>> commit 76a5433f95f3 ("perf/x86/intel: Support PERF_SAMPLE_DATA_PAGE_SIZE")
>>> commit 4cb6a42e4c4b ("powerpc/perf: Support PERF_SAMPLE_DATA_PAGE_SIZE")
>>> commit 995f088efebe ("perf/core: Add support for PERF_SAMPLE_CODE_PAGE_SIZE")
>>> commit 51b646b2d9f8 ("perf,mm: Handle non-page-table-aligned hugetlbfs")
>>>
>>> and Peter's perf/core branch
>>> commit 524680ce47a1 ("mm/gup: Provide gup_get_pte() more generic")
>>> commit 44a35d6937d2 ("mm: Introduce pXX_leaf_size()")
>>> commit 2f1e2f091ad0 ("perf/core: Fix arch_perf_get_page_size()")
>>> commit 7649e44aacdd ("arm64/mm: Implement pXX_leaf_size() support")
>>> commit 1df1ae7e262c ("sparc64/mm: Implement pXX_leaf_size() support")
>>>
>>> This patch set is to enable the page size support in user tools.
>> Hi Kan Liang,
>> I am trying to check this series on powerpc.
>> # perf mem --phys-data --data-page-size record <workload>
>> From what I observe, some of the samples return a zero size, which shows up as 'N/A' in the perf report
>> # perf mem --phys-data --data-page-size report
>> To fetch the page size, a weak function (arch_perf_get_page_size) was initially added here:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=51b646b2d9f84d6ff6300e3c1d09f2be4329a424
>> but I see it was later removed here:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=8af26be062721e52eba1550caf50b712f774c5fd
>> I picked the kernel changes from git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git. Am I missing something?
>
> I believe all the kernel changes have been merged.
>
> According to the commit message of the recent changes, only Power/8xxx is supported on powerpc for now. I guess that may be the reason for some of the 'N/A's.
> https://lore.kernel.org/patchwork/cover/1345521/

Thanks for clarifying.
On the tools side, apart from the 'N/A' entries I saw in the perf report, I verified the --data-page-size option for perf mem record and perf mem report.

For the tools-side changes,
Tested-by: Athira Rajeev <[email protected]>

Thanks
Athira