2015-04-21 05:00:22

by Namhyung Kim

Subject: [PATCHSET 0/6] perf kmem: Implement page allocation analysis (v8)

Hello,

Currently the perf kmem command only analyzes SLAB memory allocation.
I'd like to introduce page allocation analysis as well. Users can use
the --slab and/or --page options to select the analysis mode. If neither
option is given, it does slab allocation analysis for backward compatibility.

* changes in v8)
- rename 'stat' to 'pstat' due to build error
- add Acked-by from Pekka

* changes in v7)
- drop already merged patches
- check return value of map__load() (Arnaldo)
- rename to page_stat__findnew_*() functions (Arnaldo)
- show warning when trying to run stat before record

* changes in v6)
- add -i option fix (Jiri)
- libtraceevent operator priority fix

* changes in v5)
- print migration type and gfp flags in more compact form (Arnaldo)
- add kmem.default config option

* changes in v4)
- use pfn instead of struct page * in tracepoints (Joonsoo, Ingo)
- print gfp flags in human readable string (Joonsoo, Minchan)

* changes in v3)
- add live page statistics

* changes in v2)
- Use thousand grouping for big numbers - i.e. 12345 -> 12,345 (Ingo)
- Improve output stat readability (Ingo)
- Remove alloc size column as it can be calculated from hits and order

In this patchset, I used two kmem events for analysis: kmem:mm_page_alloc
and kmem:mm_page_free, as they can track almost all of the memory
allocation/free paths AFAIK. However, unlike slab tracepoint events,
those page allocation events don't provide callsite info directly. So
I recorded callchains and extracted callsites from them as described below.

Normal page allocation callchains look like this:

360a7e __alloc_pages_nodemask
3a711c alloc_pages_current
357bc7 __page_cache_alloc <-- callsite
357cf6 pagecache_get_page
48b0a prepare_pages
494d3 __btrfs_buffered_write
49cdf btrfs_file_write_iter
3ceb6e new_sync_write
3cf447 vfs_write
3cff99 sys_write
7556e9 system_call
f880 __write_nocancel
33eb9 cmd_record
4b38e cmd_kmem
7aa23 run_builtin
27a9a main
20800 __libc_start_main

But the first two are internal page allocation functions, so they should be
skipped. To identify such allocation functions, I used the following regex:

^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions (you can see this with -v):

alloc func: __get_free_pages
alloc func: get_zeroed_page
alloc func: alloc_pages_exact
alloc func: __alloc_pages_direct_compact
alloc func: __alloc_pages_nodemask
alloc func: alloc_page_interleave
alloc func: alloc_pages_current
alloc func: alloc_pages_vma
alloc func: alloc_page_buffers
alloc func: alloc_pages_exact_nid

After skipping those functions, the callsite becomes '__page_cache_alloc'.
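The skipping step can be sketched in standalone C with POSIX regex. This is a simplified illustration, not the patch's code: it works on function names (innermost frame first), whereas the patch matches callchain addresses against symbol ranges. The helper name first_non_alloc_func() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern quoted in the cover letter above. */
static const char alloc_pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";

/*
 * Hypothetical helper: return the first name in 'chain' (innermost frame
 * first) that does not match the allocator pattern.  Returns NULL if every
 * entry matches or if the pattern fails to compile.
 */
static const char *first_non_alloc_func(const char *const *chain, size_t n)
{
	regex_t re;
	const char *found = NULL;
	size_t i;

	if (regcomp(&re, alloc_pattern, REG_EXTENDED | REG_NOSUB))
		return NULL;

	for (i = 0; i < n; i++) {
		/* REG_NOMATCH means this frame is not an allocator function. */
		if (regexec(&re, chain[i], 0, NULL, 0) == REG_NOMATCH) {
			found = chain[i];
			break;
		}
	}

	regfree(&re);
	return found;
}
```

Given the first four frames of the callchain shown earlier, the helper skips __alloc_pages_nodemask and alloc_pages_current and stops at __page_cache_alloc.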

Other information such as allocation order, migration type and gfp
flags is provided by the tracepoint events.

By default the output is sorted by total allocated bytes, but you
can change it with the -s/--sort option. The following sort keys are
added to support page analysis: page, order, migtype, gfp. The existing
'callsite', 'bytes' and 'hit' sort keys can also be used.

An example follows:

# perf kmem record --page sleep 5
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 1.065 MB perf.data (2949 samples) ]

# perf kmem stat --page --caller -s order,hit -l 10
#
# GFP flags
# ---------
# 00000010: NI: GFP_NOIO
# 000000d0: K: GFP_KERNEL
# 00000200: NWR: GFP_NOWARN
# 000052d0: K|NWR|NR|C: GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP
# 000084d0: K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
# 000200d0: U: GFP_USER
# 000200d2: HU: GFP_HIGHUSER
# 000200da: HUM: GFP_HIGHUSER_MOVABLE
# 000280da: HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
# 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
# 0102005a: NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

---------------------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
16 | 1 | 2 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
24 | 3 | 1 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
3,876 | 969 | 0 | MOVABLE | HUM | shmem_alloc_page
972 | 243 | 0 | UNMOVABL | K | __pollwait
624 | 156 | 0 | MOVABLE | NF|HW|M | __page_cache_alloc
304 | 76 | 0 | UNMOVABL | U | dma_generic_alloc_coherent
108 | 27 | 0 | MOVABLE | HUM|Z | handle_mm_fault
56 | 14 | 0 | UNMOVABL | K|R|Z|NT | pte_alloc_one
24 | 6 | 0 | MOVABLE | HUM | do_wp_page
16 | 4 | 0 | UNMOVABL | NWR | __tlb_remove_page
... | ... | ... | ... | ... | ...
---------------------------------------------------------------------------------------------------------

SUMMARY (page allocator)
========================
Total allocation requests : 1,518 [ 6,096 KB ]
Total free requests : 1,431 [ 5,748 KB ]

Total alloc+freed requests : 1,330 [ 5,344 KB ]
Total alloc-only requests : 188 [ 752 KB ]
Total free-only requests : 101 [ 404 KB ]

Total allocation failures : 0 [ 0 KB ]

Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
-----  ------------  ------------  ------------  ------------  ------------
    0           351             .         1,163             .             .
    1             3             .             .             .             .
    2             1             .             .             .             .
    3             .             .             .             .             .
    4             .             .             .             .             .
    5             .             .             .             .             .
    6             .             .             .             .             .
    7             .             .             .             .             .
    8             .             .             .             .             .
    9             .             .             .             .             .
   10             .             .             .             .             .
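
The v2 changelog notes that the alloc size column was dropped because it can be calculated from hits and order. A minimal sketch of that arithmetic follows, assuming a 4 KB page size (the real tool reads the page size from the trace data; 4096 is only what the example output above is consistent with). The function name total_alloc_kb() is hypothetical.

```c
#include <stdint.h>

/* Assumed page size; 4 KB matches the numbers in the example output. */
#define EXAMPLE_PAGE_SIZE 4096ULL

/*
 * Each allocation of a given order covers (page size << order) bytes,
 * so the total for a row is hits times that size, shown here in KB.
 */
static uint64_t total_alloc_kb(uint64_t hits, unsigned int order)
{
	return hits * (EXAMPLE_PAGE_SIZE << order) / 1024;
}
```

For instance, the alloc_skb_with_frags rows (1 hit at order 2, 3 hits at order 1) give 16 KB and 24 KB, and the shmem_alloc_page row (969 hits at order 0) gives 3,876 KB.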

I have some ideas on how to improve it, but I'd also like to hear other
ideas, suggestions and feedback.

This is available at perf/kmem-page-v8 branch on my tree:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (6):
perf kmem: Implement stat --page --caller
perf kmem: Support sort keys on page analysis
perf kmem: Add --live option for current allocation stat
perf kmem: Print gfp flags in human readable string
perf kmem: Add kmem.default config option
perf kmem: Show warning when trying to run stat without record

tools/perf/Documentation/perf-kmem.txt | 11 +-
tools/perf/builtin-kmem.c | 995 +++++++++++++++++++++++++++++----
2 files changed, 898 insertions(+), 108 deletions(-)

--
2.3.4


2015-04-21 05:00:26

by Namhyung Kim

Subject: [PATCH 1/6] perf kmem: Implement stat --page --caller

Make perf kmem support caller statistics for pages. Unlike the slab case,
the tracepoints in the page allocator don't provide callsite info. So
it records with callchains and extracts callsite info from them.

Note that the callchain contains several memory allocation functions
which have no meaning for users. So skip those functions to get proper
callsites. I used the following regex pattern to skip the allocator
functions:

^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions:

# perf kmem record --page sleep 3
# perf kmem stat --page -v
...
alloc func: __get_free_pages
alloc func: get_zeroed_page
alloc func: alloc_pages_exact
alloc func: __alloc_pages_direct_compact
alloc func: __alloc_pages_nodemask
alloc func: alloc_page_interleave
alloc func: alloc_pages_current
alloc func: alloc_pages_vma
alloc func: alloc_page_buffers
alloc func: alloc_pages_exact_nid
...
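
Once the allocator functions are known, each callchain address has to be checked against their address ranges. The sketch below shows one way to do that with qsort() and bsearch() in standalone C. It is a variant of the approach in the diff, not a copy: the key here is a plain address and the ranges are treated as half-open, and the names func_range, range_cmp, addr_in_range_cmp and find_func are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

struct func_range {
	uint64_t start;		/* first address of the function */
	uint64_t end;		/* one past the last address */
	const char *name;
};

/* qsort() comparator: order ranges by start address. */
static int range_cmp(const void *a, const void *b)
{
	const struct func_range *fa = a;
	const struct func_range *fb = b;

	if (fa->start < fb->start)
		return -1;
	if (fa->start > fb->start)
		return 1;
	return 0;
}

/* bsearch() comparator: the key is an address, the element is a range. */
static int addr_in_range_cmp(const void *key, const void *elem)
{
	uint64_t addr = *(const uint64_t *)key;
	const struct func_range *f = elem;

	if (addr < f->start)
		return -1;
	if (addr >= f->end)
		return 1;
	return 0;	/* start <= addr < end */
}

/* Return the range containing 'addr', or NULL if it is in none of them. */
static const struct func_range *
find_func(const struct func_range *list, size_t n, uint64_t addr)
{
	return bsearch(&addr, list, n, sizeof(*list), addr_in_range_cmp);
}
```

The list must be sorted with range_cmp and the ranges must not overlap. A NULL result means the address is outside every allocator function, so that frame is the callsite.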

The output looks mostly the same as --alloc (I also added a callsite column
to that) but groups entries by callsite. Currently, the order,
migrate type and GFP flag info is for the last allocation and is not
guaranteed to be the same for all allocations from the callsite.

---------------------------------------------------------------------------------------------
Total_alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------
1,064 | 266 | 0 | UNMOVABL | 000000d0 | __pollwait
52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
44 | 11 | 0 | MOVABLE | 000280da | handle_mm_fault
20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
12 | 3 | 0 | UNMOVABL | 000084d0 | __pud_alloc
8 | 2 | 0 | UNMOVABL | 00000010 | bio_copy_user_iov
4 | 1 | 0 | UNMOVABL | 000200d2 | pipe_write
4 | 1 | 0 | MOVABLE | 000280da | do_wp_page
4 | 1 | 0 | UNMOVABL | 002084d0 | pgd_alloc
---------------------------------------------------------------------------------------------

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-kmem.c | 327 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 306 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 4f0f38462d97..3649eec6807f 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -10,6 +10,7 @@
#include "util/header.h"
#include "util/session.h"
#include "util/tool.h"
+#include "util/callchain.h"

#include "util/parse-options.h"
#include "util/trace-event.h"
@@ -21,6 +22,7 @@
#include <linux/rbtree.h>
#include <linux/string.h>
#include <locale.h>
+#include <regex.h>

static int kmem_slab;
static int kmem_page;
@@ -241,6 +243,7 @@ static unsigned long nr_page_fails;
static unsigned long nr_page_nomatch;

static bool use_pfn;
+static struct perf_session *kmem_session;

#define MAX_MIGRATE_TYPES 6
#define MAX_PAGE_ORDER 11
@@ -250,6 +253,7 @@ static int order_stats[MAX_PAGE_ORDER][MAX_MIGRATE_TYPES];
struct page_stat {
struct rb_node node;
u64 page;
+ u64 callsite;
int order;
unsigned gfp_flags;
unsigned migrate_type;
@@ -262,8 +266,144 @@ struct page_stat {
static struct rb_root page_tree;
static struct rb_root page_alloc_tree;
static struct rb_root page_alloc_sorted;
+static struct rb_root page_caller_tree;
+static struct rb_root page_caller_sorted;

-static struct page_stat *search_page(unsigned long page, bool create)
+struct alloc_func {
+ u64 start;
+ u64 end;
+ char *name;
+};
+
+static int nr_alloc_funcs;
+static struct alloc_func *alloc_func_list;
+
+static int funcmp(const void *a, const void *b)
+{
+ const struct alloc_func *fa = a;
+ const struct alloc_func *fb = b;
+
+ if (fa->start > fb->start)
+ return 1;
+ else
+ return -1;
+}
+
+static int callcmp(const void *a, const void *b)
+{
+ const struct alloc_func *fa = a;
+ const struct alloc_func *fb = b;
+
+ if (fb->start <= fa->start && fa->end < fb->end)
+ return 0;
+
+ if (fa->start > fb->start)
+ return 1;
+ else
+ return -1;
+}
+
+static int build_alloc_func_list(void)
+{
+ int ret;
+ struct map *kernel_map;
+ struct symbol *sym;
+ struct rb_node *node;
+ struct alloc_func *func;
+ struct machine *machine = &kmem_session->machines.host;
+ regex_t alloc_func_regex;
+ const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
+
+ ret = regcomp(&alloc_func_regex, pattern, REG_EXTENDED);
+ if (ret) {
+ char err[BUFSIZ];
+
+ regerror(ret, &alloc_func_regex, err, sizeof(err));
+ pr_err("Invalid regex: %s\n%s", pattern, err);
+ return -EINVAL;
+ }
+
+ kernel_map = machine->vmlinux_maps[MAP__FUNCTION];
+ if (map__load(kernel_map, NULL) < 0) {
+ pr_err("cannot load kernel map\n");
+ return -ENOENT;
+ }
+
+ map__for_each_symbol(kernel_map, sym, node) {
+ if (regexec(&alloc_func_regex, sym->name, 0, NULL, 0))
+ continue;
+
+ func = realloc(alloc_func_list,
+ (nr_alloc_funcs + 1) * sizeof(*func));
+ if (func == NULL)
+ return -ENOMEM;
+
+ pr_debug("alloc func: %s\n", sym->name);
+ func[nr_alloc_funcs].start = sym->start;
+ func[nr_alloc_funcs].end = sym->end;
+ func[nr_alloc_funcs].name = sym->name;
+
+ alloc_func_list = func;
+ nr_alloc_funcs++;
+ }
+
+ qsort(alloc_func_list, nr_alloc_funcs, sizeof(*func), funcmp);
+
+ regfree(&alloc_func_regex);
+ return 0;
+}
+
+/*
+ * Find first non-memory allocation function from callchain.
+ * The allocation functions are in the 'alloc_func_list'.
+ */
+static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+ struct addr_location al;
+ struct machine *machine = &kmem_session->machines.host;
+ struct callchain_cursor_node *node;
+
+ if (alloc_func_list == NULL) {
+ if (build_alloc_func_list() < 0)
+ goto out;
+ }
+
+ al.thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+ sample__resolve_callchain(sample, NULL, evsel, &al, 16);
+
+ callchain_cursor_commit(&callchain_cursor);
+ while (true) {
+ struct alloc_func key, *caller;
+ u64 addr;
+
+ node = callchain_cursor_current(&callchain_cursor);
+ if (node == NULL)
+ break;
+
+ key.start = key.end = node->ip;
+ caller = bsearch(&key, alloc_func_list, nr_alloc_funcs,
+ sizeof(key), callcmp);
+ if (!caller) {
+ /* found */
+ if (node->map)
+ addr = map__unmap_ip(node->map, node->ip);
+ else
+ addr = node->ip;
+
+ return addr;
+ } else
+ pr_debug3("skipping alloc function: %s\n", caller->name);
+
+ callchain_cursor_advance(&callchain_cursor);
+ }
+
+out:
+ pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
+ return sample->ip;
+}
+
+static struct page_stat *
+__page_stat__findnew_page(u64 page, bool create)
{
struct rb_node **node = &page_tree.rb_node;
struct rb_node *parent = NULL;
@@ -298,6 +438,16 @@ static struct page_stat *search_page(unsigned long page, bool create)
return data;
}

+static struct page_stat *page_stat__find_page(u64 page)
+{
+ return __page_stat__findnew_page(page, false);
+}
+
+static struct page_stat *page_stat__findnew_page(u64 page)
+{
+ return __page_stat__findnew_page(page, true);
+}
+
static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
{
if (a->page > b->page)
@@ -319,7 +469,8 @@ static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
return 0;
}

-static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool create)
+static struct page_stat *
+__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
{
struct rb_node **node = &page_alloc_tree.rb_node;
struct rb_node *parent = NULL;
@@ -357,6 +508,62 @@ static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool cr
return data;
}

+static struct page_stat *page_stat__find_alloc(struct page_stat *pstat)
+{
+ return __page_stat__findnew_alloc(pstat, false);
+}
+
+static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
+{
+ return __page_stat__findnew_alloc(pstat, true);
+}
+
+static struct page_stat *
+__page_stat__findnew_caller(u64 callsite, bool create)
+{
+ struct rb_node **node = &page_caller_tree.rb_node;
+ struct rb_node *parent = NULL;
+ struct page_stat *data;
+
+ while (*node) {
+ s64 cmp;
+
+ parent = *node;
+ data = rb_entry(*node, struct page_stat, node);
+
+ cmp = data->callsite - callsite;
+ if (cmp < 0)
+ node = &parent->rb_left;
+ else if (cmp > 0)
+ node = &parent->rb_right;
+ else
+ return data;
+ }
+
+ if (!create)
+ return NULL;
+
+ data = zalloc(sizeof(*data));
+ if (data != NULL) {
+ data->callsite = callsite;
+
+ rb_link_node(&data->node, parent, node);
+ rb_insert_color(&data->node, &page_caller_tree);
+ }
+
+ return data;
+}
+
+static struct page_stat *page_stat__find_caller(u64 callsite)
+{
+ return __page_stat__findnew_caller(callsite, false);
+}
+
+static struct page_stat *page_stat__findnew_caller(u64 callsite)
+{
+ return __page_stat__findnew_caller(callsite, true);
+}
+
static bool valid_page(u64 pfn_or_page)
{
if (use_pfn && pfn_or_page == -1UL)
@@ -375,6 +582,7 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
unsigned int migrate_type = perf_evsel__intval(evsel, sample,
"migratetype");
u64 bytes = kmem_page_size << order;
+ u64 callsite;
struct page_stat *pstat;
struct page_stat this = {
.order = order,
@@ -397,25 +605,40 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
return 0;
}

+ callsite = find_callsite(evsel, sample);
+
/*
* This is to find the current page (with correct gfp flags and
* migrate type) at free event.
*/
- pstat = search_page(page, true);
+ pstat = page_stat__findnew_page(page);
if (pstat == NULL)
return -ENOMEM;

pstat->order = order;
pstat->gfp_flags = gfp_flags;
pstat->migrate_type = migrate_type;
+ pstat->callsite = callsite;

this.page = page;
- pstat = search_page_alloc_stat(&this, true);
+ pstat = page_stat__findnew_alloc(&this);
if (pstat == NULL)
return -ENOMEM;

pstat->nr_alloc++;
pstat->alloc_bytes += bytes;
+ pstat->callsite = callsite;
+
+ pstat = page_stat__findnew_caller(callsite);
+ if (pstat == NULL)
+ return -ENOMEM;
+
+ pstat->order = order;
+ pstat->gfp_flags = gfp_flags;
+ pstat->migrate_type = migrate_type;
+
+ pstat->nr_alloc++;
+ pstat->alloc_bytes += bytes;

order_stats[order][migrate_type]++;

@@ -441,7 +664,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
nr_page_frees++;
total_page_free_bytes += bytes;

- pstat = search_page(page, false);
+ pstat = page_stat__find_page(page);
if (pstat == NULL) {
pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
page, order);
@@ -455,11 +678,19 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
this.page = page;
this.gfp_flags = pstat->gfp_flags;
this.migrate_type = pstat->migrate_type;
+ this.callsite = pstat->callsite;

rb_erase(&pstat->node, &page_tree);
free(pstat);

- pstat = search_page_alloc_stat(&this, false);
+ pstat = page_stat__find_alloc(&this);
+ if (pstat == NULL)
+ return -ENOENT;
+
+ pstat->nr_free++;
+ pstat->free_bytes += bytes;
+
+ pstat = page_stat__find_caller(this.callsite);
if (pstat == NULL)
return -ENOENT;

@@ -576,41 +807,89 @@ static const char * const migrate_type_str[] = {
"UNKNOWN",
};

-static void __print_page_result(struct rb_root *root,
- struct perf_session *session __maybe_unused,
- int n_lines)
+static void __print_page_alloc_result(struct perf_session *session, int n_lines)
{
- struct rb_node *next = rb_first(root);
+ struct rb_node *next = rb_first(&page_alloc_sorted);
+ struct machine *machine = &session->machines.host;
const char *format;

- printf("\n%.80s\n", graph_dotted_line);
- printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags\n",
+ printf("\n%.105s\n", graph_dotted_line);
+ printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
use_pfn ? "PFN" : "Page");
- printf("%.80s\n", graph_dotted_line);
+ printf("%.105s\n", graph_dotted_line);

if (use_pfn)
- format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx\n";
+ format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
else
- format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx\n";
+ format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";

while (next && n_lines--) {
struct page_stat *data;
+ struct symbol *sym;
+ struct map *map;
+ char buf[32];
+ char *caller = buf;

data = rb_entry(next, struct page_stat, node);
+ sym = machine__find_kernel_function(machine, data->callsite,
+ &map, NULL);
+ if (sym && sym->name)
+ caller = sym->name;
+ else
+ scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);

printf(format, (unsigned long long)data->page,
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags);
+ (unsigned long)data->gfp_flags, caller);

next = rb_next(next);
}

if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ... \n");
+ printf(" ... | ... | ... | ... | ... | ... | ...\n");

- printf("%.80s\n", graph_dotted_line);
+ printf("%.105s\n", graph_dotted_line);
+}
+
+static void __print_page_caller_result(struct perf_session *session, int n_lines)
+{
+ struct rb_node *next = rb_first(&page_caller_sorted);
+ struct machine *machine = &session->machines.host;
+
+ printf("\n%.105s\n", graph_dotted_line);
+ printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
+ printf("%.105s\n", graph_dotted_line);
+
+ while (next && n_lines--) {
+ struct page_stat *data;
+ struct symbol *sym;
+ struct map *map;
+ char buf[32];
+ char *caller = buf;
+
+ data = rb_entry(next, struct page_stat, node);
+ sym = machine__find_kernel_function(machine, data->callsite,
+ &map, NULL);
+ if (sym && sym->name)
+ caller = sym->name;
+ else
+ scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
+
+ printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
+ (unsigned long long)data->alloc_bytes / 1024,
+ data->nr_alloc, data->order,
+ migrate_type_str[data->migrate_type],
+ (unsigned long)data->gfp_flags, caller);
+
+ next = rb_next(next);
+ }
+
+ if (n_lines == -1)
+ printf(" ... | ... | ... | ... | ... | ...\n");
+
+ printf("%.105s\n", graph_dotted_line);
}

static void print_slab_summary(void)
@@ -682,8 +961,10 @@ static void print_slab_result(struct perf_session *session)

static void print_page_result(struct perf_session *session)
{
+ if (caller_flag)
+ __print_page_caller_result(session, caller_lines);
if (alloc_flag)
- __print_page_result(&page_alloc_sorted, session, alloc_lines);
+ __print_page_alloc_result(session, alloc_lines);
print_page_summary();
}

@@ -802,6 +1083,7 @@ static void sort_result(void)
}
if (kmem_page) {
__sort_page_result(&page_alloc_tree, &page_alloc_sorted);
+ __sort_page_result(&page_caller_tree, &page_caller_sorted);
}
}

@@ -1084,7 +1366,7 @@ static int __cmd_record(int argc, const char **argv)
if (kmem_slab)
rec_argc += ARRAY_SIZE(slab_events);
if (kmem_page)
- rec_argc += ARRAY_SIZE(page_events);
+ rec_argc += ARRAY_SIZE(page_events) + 1; /* for -g */

rec_argv = calloc(rec_argc + 1, sizeof(char *));

@@ -1099,6 +1381,8 @@ static int __cmd_record(int argc, const char **argv)
rec_argv[i] = strdup(slab_events[j]);
}
if (kmem_page) {
+ rec_argv[i++] = strdup("-g");
+
for (j = 0; j < ARRAY_SIZE(page_events); j++, i++)
rec_argv[i] = strdup(page_events[j]);
}
@@ -1159,7 +1443,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)

file.path = input_name;

- session = perf_session__new(&file, false, &perf_kmem);
+ kmem_session = session = perf_session__new(&file, false, &perf_kmem);
if (session == NULL)
return -1;

@@ -1172,6 +1456,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
}

kmem_page_size = pevent_get_page_size(evsel->tp_format->pevent);
+ symbol_conf.use_callchain = true;
}

symbol__init(&session->header.env);
--
2.3.4

2015-04-21 05:01:18

by Namhyung Kim

Subject: [PATCH 2/6] perf kmem: Support sort keys on page analysis

Add new sort keys for page: page, order, migtype, gfp. The existing
'bytes', 'hit' and 'callsite' sort keys also work for page. Note that
the -s/--sort option should be preceded by either the --slab or --page
option to determine where the sort keys apply.

Now it properly groups and sorts allocation stats, so the same
page/caller with a different order/migtype/gfp will be printed on a
different line.

# perf kmem stat --page --caller -l 10 -s order,hit

--------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
--------------------------------------------------------------------------------------------
64 | 4 | 2 | RECLAIM | 00285250 | new_slab
50,144 | 12,536 | 0 | MOVABLE | 0102005a | __page_cache_alloc
52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
40 | 10 | 0 | MOVABLE | 000280da | handle_mm_fault
28 | 7 | 0 | UNMOVABL | 000000d0 | __pollwait
20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
8 | 2 | 0 | UNMOVABL | 000084d0 | __pud_alloc
... | ... | ... | ... | ... | ...
--------------------------------------------------------------------------------------------
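
The way several sort keys combine can be sketched as a chain of comparators, where the first key that reports a difference decides the order. This is a simplified standalone illustration: it uses a plain array of function pointers instead of the linked list of sort dimensions in the diff, and the names stat_entry, cmp_fn and multi_key_cmp are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct stat_entry {
	int order;
	uint64_t hits;
};

typedef int (*cmp_fn)(const struct stat_entry *, const struct stat_entry *);

/* Each key comparator returns <0, 0 or >0, like strcmp(). */
static int order_cmp(const struct stat_entry *a, const struct stat_entry *b)
{
	return (a->order > b->order) - (a->order < b->order);
}

static int hit_cmp(const struct stat_entry *a, const struct stat_entry *b)
{
	return (a->hits > b->hits) - (a->hits < b->hits);
}

/* Apply the keys in order; the first non-zero result decides. */
static int multi_key_cmp(const cmp_fn *keys, size_t nr_keys,
			 const struct stat_entry *a,
			 const struct stat_entry *b)
{
	size_t i;

	for (i = 0; i < nr_keys; i++) {
		int cmp = keys[i](a, b);

		if (cmp)
			return cmp;
	}
	return 0;
}
```

With the keys given as order,hit, an order-2 entry with 4 hits compares greater than an order-0 entry with 12,536 hits, which matches new_slab being listed ahead of __page_cache_alloc in the output above.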

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-kmem.txt | 6 +-
tools/perf/builtin-kmem.c | 393 ++++++++++++++++++++++++++-------
2 files changed, 313 insertions(+), 86 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 23219c65c16f..69e181272c51 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -37,7 +37,11 @@ OPTIONS

-s <key[,key2...]>::
--sort=<key[,key2...]>::
- Sort the output (default: frag,hit,bytes)
+ Sort the output (default: 'frag,hit,bytes' for slab and 'bytes,hit'
+ for page). Available sort keys are 'ptr, callsite, bytes, hit,
+ pingpong, frag' for slab and 'page, callsite, bytes, hit, order,
+ migtype, gfp' for page. This option should be preceded by one of the
+ mode selection options - i.e. --slab, --page, --alloc and/or --caller.

-l <num>::
--line=<num>::
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 3649eec6807f..0393a7f3fa35 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -30,7 +30,7 @@ static int kmem_page;
static long kmem_page_size;

struct alloc_stat;
-typedef int (*sort_fn_t)(struct alloc_stat *, struct alloc_stat *);
+typedef int (*sort_fn_t)(void *, void *);

static int alloc_flag;
static int caller_flag;
@@ -181,8 +181,8 @@ static int perf_evsel__process_alloc_node_event(struct perf_evsel *evsel,
return ret;
}

-static int ptr_cmp(struct alloc_stat *, struct alloc_stat *);
-static int callsite_cmp(struct alloc_stat *, struct alloc_stat *);
+static int ptr_cmp(void *, void *);
+static int slab_callsite_cmp(void *, void *);

static struct alloc_stat *search_alloc_stat(unsigned long ptr,
unsigned long call_site,
@@ -223,7 +223,8 @@ static int perf_evsel__process_free_event(struct perf_evsel *evsel,
s_alloc->pingpong++;

s_caller = search_alloc_stat(0, s_alloc->call_site,
- &root_caller_stat, callsite_cmp);
+ &root_caller_stat,
+ slab_callsite_cmp);
if (!s_caller)
return -1;
s_caller->pingpong++;
@@ -448,26 +449,14 @@ static struct page_stat *page_stat__findnew_page(u64 page)
return __page_stat__findnew_page(page, true);
}

-static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
-{
- if (a->page > b->page)
- return -1;
- if (a->page < b->page)
- return 1;
- if (a->order > b->order)
- return -1;
- if (a->order < b->order)
- return 1;
- if (a->migrate_type > b->migrate_type)
- return -1;
- if (a->migrate_type < b->migrate_type)
- return 1;
- if (a->gfp_flags > b->gfp_flags)
- return -1;
- if (a->gfp_flags < b->gfp_flags)
- return 1;
- return 0;
-}
+struct sort_dimension {
+ const char name[20];
+ sort_fn_t cmp;
+ struct list_head list;
+};
+
+static LIST_HEAD(page_alloc_sort_input);
+static LIST_HEAD(page_caller_sort_input);

static struct page_stat *
__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
@@ -475,14 +464,20 @@ __page_stat__findnew_alloc(struct page_stat *pstat, bool create)
struct rb_node **node = &page_alloc_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;
+ struct sort_dimension *sort;

while (*node) {
- s64 cmp;
+ int cmp = 0;

parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = page_stat_cmp(data, pstat);
+ list_for_each_entry(sort, &page_alloc_sort_input, list) {
+ cmp = sort->cmp(pstat, data);
+ if (cmp)
+ break;
+ }
+
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -519,19 +514,25 @@ static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
}

static struct page_stat *
-__page_stat__findnew_caller(u64 callsite, bool create)
+__page_stat__findnew_caller(struct page_stat *pstat, bool create)
{
struct rb_node **node = &page_caller_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;
+ struct sort_dimension *sort;

while (*node) {
- s64 cmp;
+ int cmp = 0;

parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = data->callsite - callsite;
+ list_for_each_entry(sort, &page_caller_sort_input, list) {
+ cmp = sort->cmp(pstat, data);
+ if (cmp)
+ break;
+ }
+
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -545,7 +546,10 @@ __page_stat__findnew_caller(u64 callsite, bool create)

data = zalloc(sizeof(*data));
if (data != NULL) {
- data->callsite = callsite;
+ data->callsite = pstat->callsite;
+ data->order = pstat->order;
+ data->gfp_flags = pstat->gfp_flags;
+ data->migrate_type = pstat->migrate_type;

rb_link_node(&data->node, parent, node);
rb_insert_color(&data->node, &page_caller_tree);
@@ -554,14 +558,14 @@ __page_stat__findnew_caller(u64 callsite, bool create)
return data;
}

-static struct page_stat *page_stat__find_caller(u64 callsite)
+static struct page_stat *page_stat__find_caller(struct page_stat *pstat)
{
- return __page_stat__findnew_caller(callsite, false);
+ return __page_stat__findnew_caller(pstat, false);
}

-static struct page_stat *page_stat__findnew_caller(u64 callsite)
+static struct page_stat *page_stat__findnew_caller(struct page_stat *pstat)
{
- return __page_stat__findnew_caller(callsite, true);
+ return __page_stat__findnew_caller(pstat, true);
}

static bool valid_page(u64 pfn_or_page)
@@ -629,14 +633,11 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
pstat->alloc_bytes += bytes;
pstat->callsite = callsite;

- pstat = page_stat__findnew_caller(callsite);
+ this.callsite = callsite;
+ pstat = page_stat__findnew_caller(&this);
if (pstat == NULL)
return -ENOMEM;

- pstat->order = order;
- pstat->gfp_flags = gfp_flags;
- pstat->migrate_type = migrate_type;
-
pstat->nr_alloc++;
pstat->alloc_bytes += bytes;

@@ -690,7 +691,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
pstat->nr_free++;
pstat->free_bytes += bytes;

- pstat = page_stat__find_caller(this.callsite);
+ pstat = page_stat__find_caller(&this);
if (pstat == NULL)
return -ENOENT;

@@ -976,14 +977,10 @@ static void print_result(struct perf_session *session)
print_page_result(session);
}

-struct sort_dimension {
- const char name[20];
- sort_fn_t cmp;
- struct list_head list;
-};
-
-static LIST_HEAD(caller_sort);
-static LIST_HEAD(alloc_sort);
+static LIST_HEAD(slab_caller_sort);
+static LIST_HEAD(slab_alloc_sort);
+static LIST_HEAD(page_caller_sort);
+static LIST_HEAD(page_alloc_sort);

static void sort_slab_insert(struct rb_root *root, struct alloc_stat *data,
struct list_head *sort_list)
@@ -1032,10 +1029,12 @@ static void __sort_slab_result(struct rb_root *root, struct rb_root *root_sorted
}
}

-static void sort_page_insert(struct rb_root *root, struct page_stat *data)
+static void sort_page_insert(struct rb_root *root, struct page_stat *data,
+ struct list_head *sort_list)
{
struct rb_node **new = &root->rb_node;
struct rb_node *parent = NULL;
+ struct sort_dimension *sort;

while (*new) {
struct page_stat *this;
@@ -1044,8 +1043,11 @@ static void sort_page_insert(struct rb_root *root, struct page_stat *data)
this = rb_entry(*new, struct page_stat, node);
parent = *new;

- /* TODO: support more sort key */
- cmp = data->alloc_bytes - this->alloc_bytes;
+ list_for_each_entry(sort, sort_list, list) {
+ cmp = sort->cmp(data, this);
+ if (cmp)
+ break;
+ }

if (cmp > 0)
new = &parent->rb_left;
@@ -1057,7 +1059,8 @@ static void sort_page_insert(struct rb_root *root, struct page_stat *data)
rb_insert_color(&data->node, root);
}

-static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted)
+static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted,
+ struct list_head *sort_list)
{
struct rb_node *node;
struct page_stat *data;
@@ -1069,7 +1072,7 @@ static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted

rb_erase(node, root);
data = rb_entry(node, struct page_stat, node);
- sort_page_insert(root_sorted, data);
+ sort_page_insert(root_sorted, data, sort_list);
}
}

@@ -1077,13 +1080,15 @@ static void sort_result(void)
{
if (kmem_slab) {
__sort_slab_result(&root_alloc_stat, &root_alloc_sorted,
- &alloc_sort);
+ &slab_alloc_sort);
__sort_slab_result(&root_caller_stat, &root_caller_sorted,
- &caller_sort);
+ &slab_caller_sort);
}
if (kmem_page) {
- __sort_page_result(&page_alloc_tree, &page_alloc_sorted);
- __sort_page_result(&page_caller_tree, &page_caller_sorted);
+ __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+ __sort_page_result(&page_caller_tree, &page_caller_sorted,
+ &page_caller_sort);
}
}

@@ -1132,8 +1137,12 @@ static int __cmd_kmem(struct perf_session *session)
return err;
}

-static int ptr_cmp(struct alloc_stat *l, struct alloc_stat *r)
+/* slab sort keys */
+static int ptr_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->ptr < r->ptr)
return -1;
else if (l->ptr > r->ptr)
@@ -1146,8 +1155,11 @@ static struct sort_dimension ptr_sort_dimension = {
.cmp = ptr_cmp,
};

-static int callsite_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int slab_callsite_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->call_site < r->call_site)
return -1;
else if (l->call_site > r->call_site)
@@ -1157,11 +1169,14 @@ static int callsite_cmp(struct alloc_stat *l, struct alloc_stat *r)

static struct sort_dimension callsite_sort_dimension = {
.name = "callsite",
- .cmp = callsite_cmp,
+ .cmp = slab_callsite_cmp,
};

-static int hit_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int hit_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->hit < r->hit)
return -1;
else if (l->hit > r->hit)
@@ -1174,8 +1189,11 @@ static struct sort_dimension hit_sort_dimension = {
.cmp = hit_cmp,
};

-static int bytes_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int bytes_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->bytes_alloc < r->bytes_alloc)
return -1;
else if (l->bytes_alloc > r->bytes_alloc)
@@ -1188,9 +1206,11 @@ static struct sort_dimension bytes_sort_dimension = {
.cmp = bytes_cmp,
};

-static int frag_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int frag_cmp(void *a, void *b)
{
double x, y;
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;

x = fragmentation(l->bytes_req, l->bytes_alloc);
y = fragmentation(r->bytes_req, r->bytes_alloc);
@@ -1207,8 +1227,11 @@ static struct sort_dimension frag_sort_dimension = {
.cmp = frag_cmp,
};

-static int pingpong_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int pingpong_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->pingpong < r->pingpong)
return -1;
else if (l->pingpong > r->pingpong)
@@ -1221,7 +1244,135 @@ static struct sort_dimension pingpong_sort_dimension = {
.cmp = pingpong_cmp,
};

-static struct sort_dimension *avail_sorts[] = {
+/* page sort keys */
+static int page_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->page < r->page)
+ return -1;
+ else if (l->page > r->page)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_sort_dimension = {
+ .name = "page",
+ .cmp = page_cmp,
+};
+
+static int page_callsite_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->callsite < r->callsite)
+ return -1;
+ else if (l->callsite > r->callsite)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_callsite_sort_dimension = {
+ .name = "callsite",
+ .cmp = page_callsite_cmp,
+};
+
+static int page_hit_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->nr_alloc < r->nr_alloc)
+ return -1;
+ else if (l->nr_alloc > r->nr_alloc)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_hit_sort_dimension = {
+ .name = "hit",
+ .cmp = page_hit_cmp,
+};
+
+static int page_bytes_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->alloc_bytes < r->alloc_bytes)
+ return -1;
+ else if (l->alloc_bytes > r->alloc_bytes)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_bytes_sort_dimension = {
+ .name = "bytes",
+ .cmp = page_bytes_cmp,
+};
+
+static int page_order_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->order < r->order)
+ return -1;
+ else if (l->order > r->order)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_order_sort_dimension = {
+ .name = "order",
+ .cmp = page_order_cmp,
+};
+
+static int migrate_type_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ /* for internal use to find free'd page */
+ if (l->migrate_type == -1U)
+ return 0;
+
+ if (l->migrate_type < r->migrate_type)
+ return -1;
+ else if (l->migrate_type > r->migrate_type)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension migrate_type_sort_dimension = {
+ .name = "migtype",
+ .cmp = migrate_type_cmp,
+};
+
+static int gfp_flags_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ /* for internal use to find free'd page */
+ if (l->gfp_flags == -1U)
+ return 0;
+
+ if (l->gfp_flags < r->gfp_flags)
+ return -1;
+ else if (l->gfp_flags > r->gfp_flags)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension gfp_flags_sort_dimension = {
+ .name = "gfp",
+ .cmp = gfp_flags_cmp,
+};
+
+static struct sort_dimension *slab_sorts[] = {
&ptr_sort_dimension,
&callsite_sort_dimension,
&hit_sort_dimension,
@@ -1230,16 +1381,24 @@ static struct sort_dimension *avail_sorts[] = {
&pingpong_sort_dimension,
};

-#define NUM_AVAIL_SORTS ((int)ARRAY_SIZE(avail_sorts))
+static struct sort_dimension *page_sorts[] = {
+ &page_sort_dimension,
+ &page_callsite_sort_dimension,
+ &page_hit_sort_dimension,
+ &page_bytes_sort_dimension,
+ &page_order_sort_dimension,
+ &migrate_type_sort_dimension,
+ &gfp_flags_sort_dimension,
+};

-static int sort_dimension__add(const char *tok, struct list_head *list)
+static int slab_sort_dimension__add(const char *tok, struct list_head *list)
{
struct sort_dimension *sort;
int i;

- for (i = 0; i < NUM_AVAIL_SORTS; i++) {
- if (!strcmp(avail_sorts[i]->name, tok)) {
- sort = memdup(avail_sorts[i], sizeof(*avail_sorts[i]));
+ for (i = 0; i < (int)ARRAY_SIZE(slab_sorts); i++) {
+ if (!strcmp(slab_sorts[i]->name, tok)) {
+ sort = memdup(slab_sorts[i], sizeof(*slab_sorts[i]));
if (!sort) {
pr_err("%s: memdup failed\n", __func__);
return -1;
@@ -1252,7 +1411,27 @@ static int sort_dimension__add(const char *tok, struct list_head *list)
return -1;
}

-static int setup_sorting(struct list_head *sort_list, const char *arg)
+static int page_sort_dimension__add(const char *tok, struct list_head *list)
+{
+ struct sort_dimension *sort;
+ int i;
+
+ for (i = 0; i < (int)ARRAY_SIZE(page_sorts); i++) {
+ if (!strcmp(page_sorts[i]->name, tok)) {
+ sort = memdup(page_sorts[i], sizeof(*page_sorts[i]));
+ if (!sort) {
+ pr_err("%s: memdup failed\n", __func__);
+ return -1;
+ }
+ list_add_tail(&sort->list, list);
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int setup_slab_sorting(struct list_head *sort_list, const char *arg)
{
char *tok;
char *str = strdup(arg);
@@ -1267,8 +1446,34 @@ static int setup_sorting(struct list_head *sort_list, const char *arg)
tok = strsep(&pos, ",");
if (!tok)
break;
- if (sort_dimension__add(tok, sort_list) < 0) {
- error("Unknown --sort key: '%s'", tok);
+ if (slab_sort_dimension__add(tok, sort_list) < 0) {
+ error("Unknown slab --sort key: '%s'", tok);
+ free(str);
+ return -1;
+ }
+ }
+
+ free(str);
+ return 0;
+}
+
+static int setup_page_sorting(struct list_head *sort_list, const char *arg)
+{
+ char *tok;
+ char *str = strdup(arg);
+ char *pos = str;
+
+ if (!str) {
+ pr_err("%s: strdup failed\n", __func__);
+ return -1;
+ }
+
+ while (true) {
+ tok = strsep(&pos, ",");
+ if (!tok)
+ break;
+ if (page_sort_dimension__add(tok, sort_list) < 0) {
+ error("Unknown page --sort key: '%s'", tok);
free(str);
return -1;
}
@@ -1284,10 +1489,17 @@ static int parse_sort_opt(const struct option *opt __maybe_unused,
if (!arg)
return -1;

- if (caller_flag > alloc_flag)
- return setup_sorting(&caller_sort, arg);
- else
- return setup_sorting(&alloc_sort, arg);
+ if (kmem_page > kmem_slab) {
+ if (caller_flag > alloc_flag)
+ return setup_page_sorting(&page_caller_sort, arg);
+ else
+ return setup_page_sorting(&page_alloc_sort, arg);
+ } else {
+ if (caller_flag > alloc_flag)
+ return setup_slab_sorting(&slab_caller_sort, arg);
+ else
+ return setup_slab_sorting(&slab_alloc_sort, arg);
+ }

return 0;
}
@@ -1395,7 +1607,8 @@ static int __cmd_record(int argc, const char **argv)

int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
{
- const char * const default_sort_order = "frag,hit,bytes";
+ const char * const default_slab_sort = "frag,hit,bytes";
+ const char * const default_page_sort = "bytes,hit";
struct perf_data_file file = {
.mode = PERF_DATA_MODE_READ,
};
@@ -1408,8 +1621,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK_NOOPT(0, "alloc", NULL, NULL,
"show per-allocation statistics", parse_alloc_opt),
OPT_CALLBACK('s', "sort", NULL, "key[,key2...]",
- "sort by keys: ptr, call_site, bytes, hit, pingpong, frag",
- parse_sort_opt),
+ "sort by keys: ptr, callsite, bytes, hit, pingpong, frag, "
+ "page, order, migtype, gfp", parse_sort_opt),
OPT_CALLBACK('l', "line", NULL, "num", "show n lines", parse_line_opt),
OPT_BOOLEAN(0, "raw-ip", &raw_ip, "show raw ip instead of symbol"),
OPT_BOOLEAN('f', "force", &file.force, "don't complain, do it"),
@@ -1467,11 +1680,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (cpu__setup_cpunode_map())
goto out_delete;

- if (list_empty(&caller_sort))
- setup_sorting(&caller_sort, default_sort_order);
- if (list_empty(&alloc_sort))
- setup_sorting(&alloc_sort, default_sort_order);
-
+ if (list_empty(&slab_caller_sort))
+ setup_slab_sorting(&slab_caller_sort, default_slab_sort);
+ if (list_empty(&slab_alloc_sort))
+ setup_slab_sorting(&slab_alloc_sort, default_slab_sort);
+ if (list_empty(&page_caller_sort))
+ setup_page_sorting(&page_caller_sort, default_page_sort);
+ if (list_empty(&page_alloc_sort))
+ setup_page_sorting(&page_alloc_sort, default_page_sort);
+
+ if (kmem_page) {
+ setup_page_sorting(&page_alloc_sort_input,
+ "page,order,migtype,gfp");
+ setup_page_sorting(&page_caller_sort_input,
+ "callsite,order,migtype,gfp");
+ }
ret = __cmd_kmem(session);
} else
usage_with_options(kmem_usage, kmem_options);
--
2.3.4

2015-04-21 05:01:16

by Namhyung Kim

Subject: [PATCH 3/6] perf kmem: Add --live option for current allocation stat

Currently perf kmem shows the total (page) allocation stat by default,
but sometimes one might want to see only live requests/pages, i.e.
those that were allocated and not yet freed (alloc-only). The new
--live option does this by subtracting freed allocations from the stat.
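
As a rough illustration of the accounting described above, here is a
minimal sketch. It is not the code in this patch: struct live_stat and
both helpers are hypothetical names, loosely modelled on the
nr_alloc/alloc_bytes fields of struct page_stat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-callsite counters (assumed, not from builtin-kmem.c). */
struct live_stat {
	int      nr_alloc;
	uint64_t alloc_bytes;
};

/* An allocation event always adds to the counters. */
static void live_stat_alloc(struct live_stat *st, uint64_t bytes)
{
	st->nr_alloc++;
	st->alloc_bytes += bytes;
}

/*
 * A free event subtracts the matching allocation, so only live pages
 * remain in the stat. Returns 1 when nothing is live any more, which
 * is the point where a caller could drop the entry from its tree.
 */
static int live_stat_free(struct live_stat *st, uint64_t bytes)
{
	assert(st->nr_alloc > 0 && st->alloc_bytes >= bytes);
	st->nr_alloc--;
	st->alloc_bytes -= bytes;
	return st->nr_alloc == 0;
}
```

In the sketch, two allocations followed by one free leave a single live
entry; a second free empties it and the helper reports that the entry
can go away.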

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-kmem.txt | 5 ++
tools/perf/builtin-kmem.c | 110 ++++++++++++++++++++-------------
2 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 69e181272c51..ff0f433b3fce 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -56,6 +56,11 @@ OPTIONS
--page::
Analyze page allocator events

+--live::
+ Show live page stat. The perf kmem shows total allocation stat by
+ default, but this option shows live (currently allocated) pages
+ instead. (This option works with --page option only)
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 0393a7f3fa35..7ead9423fd7a 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -244,6 +244,7 @@ static unsigned long nr_page_fails;
static unsigned long nr_page_nomatch;

static bool use_pfn;
+static bool live_page;
static struct perf_session *kmem_session;

#define MAX_MIGRATE_TYPES 6
@@ -264,7 +265,7 @@ struct page_stat {
int nr_free;
};

-static struct rb_root page_tree;
+static struct rb_root page_live_tree;
static struct rb_root page_alloc_tree;
static struct rb_root page_alloc_sorted;
static struct rb_root page_caller_tree;
@@ -403,10 +404,19 @@ static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
return sample->ip;
}

+struct sort_dimension {
+ const char name[20];
+ sort_fn_t cmp;
+ struct list_head list;
+};
+
+static LIST_HEAD(page_alloc_sort_input);
+static LIST_HEAD(page_caller_sort_input);
+
static struct page_stat *
-__page_stat__findnew_page(u64 page, bool create)
+__page_stat__findnew_page(struct page_stat *pstat, bool create)
{
- struct rb_node **node = &page_tree.rb_node;
+ struct rb_node **node = &page_live_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;

@@ -416,7 +426,7 @@ __page_stat__findnew_page(u64 page, bool create)
parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = data->page - page;
+ cmp = data->page - pstat->page;
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -430,34 +440,28 @@ __page_stat__findnew_page(u64 page, bool create)

data = zalloc(sizeof(*data));
if (data != NULL) {
- data->page = page;
+ data->page = pstat->page;
+ data->order = pstat->order;
+ data->gfp_flags = pstat->gfp_flags;
+ data->migrate_type = pstat->migrate_type;

rb_link_node(&data->node, parent, node);
- rb_insert_color(&data->node, &page_tree);
+ rb_insert_color(&data->node, &page_live_tree);
}

return data;
}

-static struct page_stat *page_stat__find_page(u64 page)
+static struct page_stat *page_stat__find_page(struct page_stat *pstat)
{
- return __page_stat__findnew_page(page, false);
+ return __page_stat__findnew_page(pstat, false);
}

-static struct page_stat *page_stat__findnew_page(u64 page)
+static struct page_stat *page_stat__findnew_page(struct page_stat *pstat)
{
- return __page_stat__findnew_page(page, true);
+ return __page_stat__findnew_page(pstat, true);
}

-struct sort_dimension {
- const char name[20];
- sort_fn_t cmp;
- struct list_head list;
-};
-
-static LIST_HEAD(page_alloc_sort_input);
-static LIST_HEAD(page_caller_sort_input);
-
static struct page_stat *
__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
{
@@ -615,17 +619,8 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
* This is to find the current page (with correct gfp flags and
* migrate type) at free event.
*/
- pstat = page_stat__findnew_page(page);
- if (pstat == NULL)
- return -ENOMEM;
-
- pstat->order = order;
- pstat->gfp_flags = gfp_flags;
- pstat->migrate_type = migrate_type;
- pstat->callsite = callsite;
-
this.page = page;
- pstat = page_stat__findnew_alloc(&this);
+ pstat = page_stat__findnew_page(&this);
if (pstat == NULL)
return -ENOMEM;

@@ -633,6 +628,16 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
pstat->alloc_bytes += bytes;
pstat->callsite = callsite;

+ if (!live_page) {
+ pstat = page_stat__findnew_alloc(&this);
+ if (pstat == NULL)
+ return -ENOMEM;
+
+ pstat->nr_alloc++;
+ pstat->alloc_bytes += bytes;
+ pstat->callsite = callsite;
+ }
+
this.callsite = callsite;
pstat = page_stat__findnew_caller(&this);
if (pstat == NULL)
@@ -665,7 +670,8 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
nr_page_frees++;
total_page_free_bytes += bytes;

- pstat = page_stat__find_page(page);
+ this.page = page;
+ pstat = page_stat__find_page(&this);
if (pstat == NULL) {
pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
page, order);
@@ -676,20 +682,23 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
return 0;
}

- this.page = page;
this.gfp_flags = pstat->gfp_flags;
this.migrate_type = pstat->migrate_type;
this.callsite = pstat->callsite;

- rb_erase(&pstat->node, &page_tree);
+ rb_erase(&pstat->node, &page_live_tree);
free(pstat);

- pstat = page_stat__find_alloc(&this);
- if (pstat == NULL)
- return -ENOENT;
+ if (live_page) {
+ order_stats[this.order][this.migrate_type]--;
+ } else {
+ pstat = page_stat__find_alloc(&this);
+ if (pstat == NULL)
+ return -ENOMEM;

- pstat->nr_free++;
- pstat->free_bytes += bytes;
+ pstat->nr_free++;
+ pstat->free_bytes += bytes;
+ }

pstat = page_stat__find_caller(&this);
if (pstat == NULL)
@@ -698,6 +707,16 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
pstat->nr_free++;
pstat->free_bytes += bytes;

+ if (live_page) {
+ pstat->nr_alloc--;
+ pstat->alloc_bytes -= bytes;
+
+ if (pstat->nr_alloc == 0) {
+ rb_erase(&pstat->node, &page_caller_tree);
+ free(pstat);
+ }
+ }
+
return 0;
}

@@ -815,8 +834,8 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
const char *format;

printf("\n%.105s\n", graph_dotted_line);
- printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- use_pfn ? "PFN" : "Page");
+ printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
+ use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total");
printf("%.105s\n", graph_dotted_line);

if (use_pfn)
@@ -860,7 +879,8 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
struct machine *machine = &session->machines.host;

printf("\n%.105s\n", graph_dotted_line);
- printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
+ printf(" %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
+ live_page ? "Live" : "Total");
printf("%.105s\n", graph_dotted_line);

while (next && n_lines--) {
@@ -1085,8 +1105,13 @@ static void sort_result(void)
&slab_caller_sort);
}
if (kmem_page) {
- __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
- &page_alloc_sort);
+ if (live_page)
+ __sort_page_result(&page_live_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+ else
+ __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+
__sort_page_result(&page_caller_tree, &page_caller_sorted,
&page_caller_sort);
}
@@ -1630,6 +1655,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
parse_slab_opt),
OPT_CALLBACK_NOOPT(0, "page", NULL, NULL, "Analyze page allocator",
parse_page_opt),
+ OPT_BOOLEAN(0, "live", &live_page, "Show live page stat"),
OPT_END()
};
const char *const kmem_subcommands[] = { "record", "stat", NULL };
--
2.3.4

2015-04-21 05:01:14

by Namhyung Kim

Subject: [PATCH 4/6] perf kmem: Print gfp flags in human readable string

Save libtraceevent output and print it in the header.

# perf kmem stat --page --caller
#
# GFP flags
# ---------
# 00000010: NI: GFP_NOIO
# 000000d0: K: GFP_KERNEL
# 00000200: NWR: GFP_NOWARN
# 000084d0: K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
# 000200d2: HU: GFP_HIGHUSER
# 000200da: HUM: GFP_HIGHUSER_MOVABLE
# 000280da: HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
# 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
# 0102005a: NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

---------------------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
60 | 15 | 0 | UNMOVABL | K|R|Z|NT | pte_alloc_one
40 | 10 | 0 | MOVABLE | HUM|Z | handle_mm_fault
24 | 6 | 0 | MOVABLE | HUM | do_wp_page
24 | 6 | 0 | UNMOVABL | K | __pollwait
...
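
The compact column above can be thought of as a table lookup per
'|'-separated flag name. The following is only a sketch under that
assumption: compact_table holds an illustrative subset of names, and
compact_flags() is a hypothetical helper that writes into a
caller-supplied buffer instead of allocating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative subset of a name -> abbreviation table (not the full list). */
static const struct {
	const char *original;
	const char *compact;
} compact_table[] = {
	{ "GFP_KERNEL", "K" },
	{ "GFP_REPEAT", "R" },
	{ "GFP_ZERO",   "Z" },
	{ "GFP_NOIO",   "NI" },
};

/*
 * Rewrite e.g. "GFP_KERNEL|GFP_REPEAT" as "K|R" into out (size outsz).
 * Unknown names are skipped. Returns 0 on success, -1 if out is too small.
 */
static int compact_flags(const char *flags, char *out, size_t outsz)
{
	size_t len = 0;
	const char *p = flags;

	assert(outsz > 0);
	out[0] = '\0';

	while (*p) {
		size_t n = strcspn(p, "|");	/* length of the current token */
		size_t i;

		for (i = 0; i < sizeof(compact_table) / sizeof(compact_table[0]); i++) {
			const char *name = compact_table[i].original;
			int w;

			if (strlen(name) != n || strncmp(name, p, n))
				continue;

			/* prepend '|' for every abbreviation after the first */
			w = snprintf(out + len, outsz - len, "%s%s",
				     len ? "|" : "", compact_table[i].compact);
			if (w < 0 || (size_t)w >= outsz - len)
				return -1;
			len += (size_t)w;
			break;
		}

		p += n;
		if (*p == '|')
			p++;
	}
	return 0;
}
```

Comparing whole token lengths before strncmp() keeps a name such as
GFP_HIGH from matching a longer one such as GFP_HIGHUSER.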

Requested-by: Joonsoo Kim <[email protected]>
Suggested-by: Minchan Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-kmem.c | 222 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 209 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 7ead9423fd7a..1c668953c7ec 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -581,6 +581,176 @@ static bool valid_page(u64 pfn_or_page)
return true;
}

+struct gfp_flag {
+ unsigned int flags;
+ char *compact_str;
+ char *human_readable;
+};
+
+static struct gfp_flag *gfps;
+static int nr_gfps;
+
+static int gfpcmp(const void *a, const void *b)
+{
+ const struct gfp_flag *fa = a;
+ const struct gfp_flag *fb = b;
+
+ return fa->flags - fb->flags;
+}
+
+/* see include/trace/events/gfpflags.h */
+static const struct {
+ const char *original;
+ const char *compact;
+} gfp_compact_table[] = {
+ { "GFP_TRANSHUGE", "THP" },
+ { "GFP_HIGHUSER_MOVABLE", "HUM" },
+ { "GFP_HIGHUSER", "HU" },
+ { "GFP_USER", "U" },
+ { "GFP_TEMPORARY", "TMP" },
+ { "GFP_KERNEL", "K" },
+ { "GFP_NOFS", "NF" },
+ { "GFP_ATOMIC", "A" },
+ { "GFP_NOIO", "NI" },
+ { "GFP_HIGH", "H" },
+ { "GFP_WAIT", "W" },
+ { "GFP_IO", "I" },
+ { "GFP_COLD", "CO" },
+ { "GFP_NOWARN", "NWR" },
+ { "GFP_REPEAT", "R" },
+ { "GFP_NOFAIL", "NF" },
+ { "GFP_NORETRY", "NR" },
+ { "GFP_COMP", "C" },
+ { "GFP_ZERO", "Z" },
+ { "GFP_NOMEMALLOC", "NMA" },
+ { "GFP_MEMALLOC", "MA" },
+ { "GFP_HARDWALL", "HW" },
+ { "GFP_THISNODE", "TN" },
+ { "GFP_RECLAIMABLE", "RC" },
+ { "GFP_MOVABLE", "M" },
+ { "GFP_NOTRACK", "NT" },
+ { "GFP_NO_KSWAPD", "NK" },
+ { "GFP_OTHER_NODE", "ON" },
+ { "GFP_NOWAIT", "NW" },
+};
+
+static size_t max_gfp_len;
+
+static char *compact_gfp_flags(char *gfp_flags)
+{
+ char *orig_flags = strdup(gfp_flags);
+ char *new_flags = NULL;
+ char *str, *pos;
+ size_t len = 0;
+
+ if (orig_flags == NULL)
+ return NULL;
+
+ str = strtok_r(orig_flags, "|", &pos);
+ while (str) {
+ size_t i;
+ char *new;
+ const char *cpt;
+
+ for (i = 0; i < ARRAY_SIZE(gfp_compact_table); i++) {
+ if (strcmp(gfp_compact_table[i].original, str))
+ continue;
+
+ cpt = gfp_compact_table[i].compact;
+ new = realloc(new_flags, len + strlen(cpt) + 2);
+ if (new == NULL) {
+ free(new_flags);
+ return NULL;
+ }
+
+ new_flags = new;
+
+ if (!len) {
+ strcpy(new_flags, cpt);
+ } else {
+ strcat(new_flags, "|");
+ strcat(new_flags, cpt);
+ len++;
+ }
+
+ len += strlen(cpt);
+ }
+
+ str = strtok_r(NULL, "|", &pos);
+ }
+
+ if (max_gfp_len < len)
+ max_gfp_len = len;
+
+ free(orig_flags);
+ return new_flags;
+}
+
+static char *compact_gfp_string(unsigned long gfp_flags)
+{
+ struct gfp_flag key = {
+ .flags = gfp_flags,
+ };
+ struct gfp_flag *gfp;
+
+ gfp = bsearch(&key, gfps, nr_gfps, sizeof(*gfps), gfpcmp);
+ if (gfp)
+ return gfp->compact_str;
+
+ return NULL;
+}
+
+static int parse_gfp_flags(struct perf_evsel *evsel, struct perf_sample *sample,
+ unsigned int gfp_flags)
+{
+ struct pevent_record record = {
+ .cpu = sample->cpu,
+ .data = sample->raw_data,
+ .size = sample->raw_size,
+ };
+ struct trace_seq seq;
+ char *str, *pos;
+
+ if (nr_gfps) {
+ struct gfp_flag key = {
+ .flags = gfp_flags,
+ };
+
+ if (bsearch(&key, gfps, nr_gfps, sizeof(*gfps), gfpcmp))
+ return 0;
+ }
+
+ trace_seq_init(&seq);
+ pevent_event_info(&seq, evsel->tp_format, &record);
+
+ str = strtok_r(seq.buffer, " ", &pos);
+ while (str) {
+ if (!strncmp(str, "gfp_flags=", 10)) {
+ struct gfp_flag *new;
+
+ new = realloc(gfps, (nr_gfps + 1) * sizeof(*gfps));
+ if (new == NULL)
+ return -ENOMEM;
+
+ gfps = new;
+ new += nr_gfps++;
+
+ new->flags = gfp_flags;
+ new->human_readable = strdup(str + 10);
+ new->compact_str = compact_gfp_flags(str + 10);
+ if (!new->human_readable || !new->compact_str)
+ return -ENOMEM;
+
+ qsort(gfps, nr_gfps, sizeof(*gfps), gfpcmp);
+ }
+
+ str = strtok_r(NULL, " ", &pos);
+ }
+
+ trace_seq_destroy(&seq);
+ return 0;
+}
+
static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
struct perf_sample *sample)
{
@@ -613,6 +783,9 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
return 0;
}

+ if (parse_gfp_flags(evsel, sample, gfp_flags) < 0)
+ return -1;
+
callsite = find_callsite(evsel, sample);

/*
@@ -832,16 +1005,18 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
struct rb_node *next = rb_first(&page_alloc_sorted);
struct machine *machine = &session->machines.host;
const char *format;
+ int gfp_len = max(strlen("GFP flags"), max_gfp_len);

printf("\n%.105s\n", graph_dotted_line);
- printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total");
+ printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | %-*s | Callsite\n",
+ use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total",
+ gfp_len, "GFP flags");
printf("%.105s\n", graph_dotted_line);

if (use_pfn)
- format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
+ format = " %16llu | %'16llu | %'9d | %5d | %8s | %-*s | %s\n";
else
- format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
+ format = " %016llx | %'16llu | %'9d | %5d | %8s | %-*s | %s\n";

while (next && n_lines--) {
struct page_stat *data;
@@ -862,13 +1037,15 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags, caller);
+ gfp_len, compact_gfp_string(data->gfp_flags), caller);

next = rb_next(next);
}

- if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ... | ...\n");
+ if (n_lines == -1) {
+ printf(" ... | ... | ... | ... | ... | %-*s | ...\n",
+ gfp_len, "...");
+ }

printf("%.105s\n", graph_dotted_line);
}
@@ -877,10 +1054,11 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
{
struct rb_node *next = rb_first(&page_caller_sorted);
struct machine *machine = &session->machines.host;
+ int gfp_len = max(strlen("GFP flags"), max_gfp_len);

printf("\n%.105s\n", graph_dotted_line);
- printf(" %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- live_page ? "Live" : "Total");
+ printf(" %5s alloc (KB) | Hits | Order | Mig.type | %-*s | Callsite\n",
+ live_page ? "Live" : "Total", gfp_len, "GFP flags");
printf("%.105s\n", graph_dotted_line);

while (next && n_lines--) {
@@ -898,21 +1076,37 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
else
scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);

- printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
+ printf(" %'16llu | %'9d | %5d | %8s | %-*s | %s\n",
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags, caller);
+ gfp_len, compact_gfp_string(data->gfp_flags), caller);

next = rb_next(next);
}

- if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ...\n");
+ if (n_lines == -1) {
+ printf(" ... | ... | ... | ... | %-*s | ...\n",
+ gfp_len, "...");
+ }

printf("%.105s\n", graph_dotted_line);
}

+static void print_gfp_flags(void)
+{
+ int i;
+
+ printf("#\n");
+ printf("# GFP flags\n");
+ printf("# ---------\n");
+ for (i = 0; i < nr_gfps; i++) {
+ printf("# %08x: %*s: %s\n", gfps[i].flags,
+ (int) max_gfp_len, gfps[i].compact_str,
+ gfps[i].human_readable);
+ }
+}
+
static void print_slab_summary(void)
{
printf("\nSUMMARY (SLAB allocator)");
@@ -982,6 +1176,8 @@ static void print_slab_result(struct perf_session *session)

static void print_page_result(struct perf_session *session)
{
+ if (caller_flag || alloc_flag)
+ print_gfp_flags();
if (caller_flag)
__print_page_caller_result(session, caller_lines);
if (alloc_flag)
--
2.3.4

2015-04-21 05:00:59

by Namhyung Kim

Subject: [PATCH 5/6] perf kmem: Add kmem.default config option

Currently, for backward compatibility, the perf kmem command selects
--slab if neither --slab nor --page is given. Add a kmem.default config
option to select the default value ('page' or 'slab').

# cat ~/.perfconfig
[kmem]
default = page

# perf kmem stat

SUMMARY (page allocator)
========================
Total allocation requests : 1,518 [ 6,096 KB ]
Total free requests : 1,431 [ 5,748 KB ]

Total alloc+freed requests : 1,330 [ 5,344 KB ]
Total alloc-only requests : 188 [ 752 KB ]
Total free-only requests : 101 [ 404 KB ]

Total allocation failures : 0 [ 0 KB ]
...
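
The precedence described above (explicit options first, then the
configured default) can be sketched as follows. This is an assumption
about the intended behaviour, not the patch itself; enum kmem_mode,
parse_default() and apply_default() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

enum kmem_mode { MODE_SLAB, MODE_PAGE };

/*
 * Map a config value to a mode. Returns 0 on success, -1 for an
 * unknown value (in which case *mode is left unchanged).
 */
static int parse_default(const char *value, enum kmem_mode *mode)
{
	assert(value && mode);
	if (!strcmp(value, "slab"))
		*mode = MODE_SLAB;
	else if (!strcmp(value, "page"))
		*mode = MODE_PAGE;
	else
		return -1;
	return 0;
}

/* Explicit options win; the default applies only when neither is set. */
static void apply_default(int *slab, int *page, enum kmem_mode def)
{
	if (*slab || *page)
		return;
	if (def == MODE_PAGE)
		*page = 1;
	else
		*slab = 1;
}
```

With "default = page" configured and no option on the command line, the
sketch ends up with page analysis only; passing --slab explicitly still
selects slab analysis.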

Acked-by: Pekka Enberg <[email protected]>
Cc: Taeung Song <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-kmem.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 1c668953c7ec..828b7284e547 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -28,6 +28,10 @@ static int kmem_slab;
static int kmem_page;

static long kmem_page_size;
+static enum {
+ KMEM_SLAB,
+ KMEM_PAGE,
+} kmem_default = KMEM_SLAB; /* for backward compatibility */

struct alloc_stat;
typedef int (*sort_fn_t)(void *, void *);
@@ -1710,7 +1714,8 @@ static int parse_sort_opt(const struct option *opt __maybe_unused,
if (!arg)
return -1;

- if (kmem_page > kmem_slab) {
+ if (kmem_page > kmem_slab ||
+ (kmem_page == 0 && kmem_slab == 0 && kmem_default == KMEM_PAGE)) {
if (caller_flag > alloc_flag)
return setup_page_sorting(&page_caller_sort, arg);
else
@@ -1826,6 +1831,22 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
}

+static int kmem_config(const char *var, const char *value, void *cb)
+{
+ if (!strcmp(var, "kmem.default")) {
+ if (!strcmp(value, "slab"))
+ kmem_default = KMEM_SLAB;
+ else if (!strcmp(value, "page"))
+ kmem_default = KMEM_PAGE;
+ else
+ pr_err("invalid default value ('slab' or 'page' required): %s\n",
+ value);
+ return 0;
+ }
+
+ return perf_default_config(var, value, cb);
+}
+
int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const default_slab_sort = "frag,hit,bytes";
@@ -1862,14 +1883,19 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
struct perf_session *session;
int ret = -1;

+ perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);

if (!argc)
usage_with_options(kmem_usage, kmem_options);

- if (kmem_slab == 0 && kmem_page == 0)
- kmem_slab = 1; /* for backward compatibility */
+ if (kmem_slab == 0 && kmem_page == 0) {
+ if (kmem_default == KMEM_SLAB)
+ kmem_slab = 1;
+ else
+ kmem_page = 1;
+ }

if (!strncmp(argv[0], "rec", 3)) {
symbol__init(NULL);
--
2.3.4

2015-04-21 05:00:30

by Namhyung Kim

Subject: [PATCH 6/6] perf kmem: Show warning when trying to run stat without record

Sometimes one can mistakenly run perf kmem stat without running perf
kmem record first, or with a different configuration, such as recording
with --slab and then running stat with --page. Show a warning message
like below to inform the user:

# perf kmem stat --page --caller
Not found page events. Have you run 'perf kmem record --page' before?
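
The check behind this warning amounts to scanning the recorded event
names for the one the requested analysis needs. A minimal sketch,
assuming the names are available as a plain string array (has_event()
is a hypothetical helper, not a perf API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `wanted` appears among the n recorded event names, else 0. */
static int has_event(const char *const *names, size_t n, const char *wanted)
{
	size_t i;

	assert(wanted != NULL);
	for (i = 0; i < n; i++) {
		if (!strcmp(names[i], wanted))
			return 1;
	}
	return 0;
}
```

A data file recorded with --slab would contain kmem:kmalloc but not
kmem:mm_page_alloc, so a lookup for the latter fails and the warning
is shown.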

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-kmem.c | 31 ++++++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 828b7284e547..f29a766f18f8 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
};
struct perf_session *session;
int ret = -1;
+ const char errmsg[] = "Not found %s events. Have you run 'perf kmem record --%s' before?\n";

perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
@@ -1908,11 +1909,35 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ if (kmem_slab) {
+ struct perf_evsel *evsel;
+ bool found = false;
+
+ evlist__for_each(session->evlist, evsel) {
+ if (!strcmp(perf_evsel__name(evsel), "kmem:kmalloc")) {
+ found = true;
+ break;
+ }
+ }
+ if (!found) {
+ pr_err(errmsg, "slab", "slab");
+ return -1;
+ }
+ }
+
if (kmem_page) {
- struct perf_evsel *evsel = perf_evlist__first(session->evlist);
+ struct perf_evsel *evsel;
+ bool found = false;

- if (evsel == NULL || evsel->tp_format == NULL) {
- pr_err("invalid event found.. aborting\n");
+ evlist__for_each(session->evlist, evsel) {
+ if (!strcmp(perf_evsel__name(evsel),
+ "kmem:mm_page_alloc")) {
+ found = true;
+ break;
+ }
+ }
+ if (!found) {
+ pr_err(errmsg, "page", "page");
return -1;
}

--
2.3.4

2015-05-02 14:55:42

by Namhyung Kim

Subject: Re: [PATCHSET 0/6] perf kmem: Implement page allocation analysis (v8)

Ping!

On Tue, Apr 21, 2015 at 1:55 PM, Namhyung Kim <[email protected]> wrote:
> Hello,
>
> Currently perf kmem command only analyzes SLAB memory allocation. And
> I'd like to introduce page allocation analysis also. Users can use
> --slab and/or --page option to select it. If none of these options
> are used, it does slab allocation analysis for backward compatibility.
>
> * changes in v8)
> - rename 'stat' to 'pstat' due to build error
> - add Acked-by from Pekka
>
> * changes in v7)
> - drop already merged patches
> - check return value of map__load() (Arnaldo)
> - rename to page_stat__findnew_*() functions (Arnaldo)
> - show warning when try to run stat before record
>
> * changes in v6)
> - add -i option fix (Jiri)
> - libtraceevent operator priority fix
>
> * changes in v5)
> - print migration type and gfp flags in more compact form (Arnaldo)
> - add kmem.default config option
>
> * changes in v4)
> - use pfn instead of struct page * in tracepoints (Joonsoo, Ingo)
> - print gfp flags in human readable string (Joonsoo, Minchan)
>
> * changes in v3)
> - add live page statistics
>
> * changes in v2)
> - Use thousand grouping for big numbers - i.e. 12345 -> 12,345 (Ingo)
> - Improve output stat readability (Ingo)
> - Remove alloc size column as it can be calculated from hits and order
>
> In this patchset, I used two kmem events: kmem:mm_page_alloc and
> kmem:mm_page_free for analysis as they can track almost all of memory
> allocation/free path AFAIK. However, unlike slab tracepoint events,
> those page allocation events don't provide callsite info directly. So
> I recorded callchains and extracted callsites like below:
>
> Normal page allocation callchains look like this:
>
> 360a7e __alloc_pages_nodemask
> 3a711c alloc_pages_current
> 357bc7 __page_cache_alloc <-- callsite
> 357cf6 pagecache_get_page
> 48b0a prepare_pages
> 494d3 __btrfs_buffered_write
> 49cdf btrfs_file_write_iter
> 3ceb6e new_sync_write
> 3cf447 vfs_write
> 3cff99 sys_write
> 7556e9 system_call
> f880 __write_nocancel
> 33eb9 cmd_record
> 4b38e cmd_kmem
> 7aa23 run_builtin
> 27a9a main
> 20800 __libc_start_main
>
> But the first two are internal page allocation functions, so they should
> be skipped. To determine such allocation functions, I used the following regex:
>
> ^_?_?(alloc|get_free|get_zeroed)_pages?
>
> This gave me the following list of functions (you can see this with -v):
>
> alloc func: __get_free_pages
> alloc func: get_zeroed_page
> alloc func: alloc_pages_exact
> alloc func: __alloc_pages_direct_compact
> alloc func: __alloc_pages_nodemask
> alloc func: alloc_page_interleave
> alloc func: alloc_pages_current
> alloc func: alloc_pages_vma
> alloc func: alloc_page_buffers
> alloc func: alloc_pages_exact_nid
>
> After skipping those functions, it got '__page_cache_alloc'.
>
> Other information such as allocation order, migration type and gfp
> flags are provided by tracepoint events.
>
> Basically the output will be sorted by total allocation bytes, but you
> can change it by using -s/--sort option. The following sort keys are
> added to support page analysis: page, order, migtype, gfp. Existing
> 'callsite', 'bytes' and 'hit' sort keys also can be used.
>
> An example follows:
>
> # perf kmem record --page sleep 5
> [ perf record: Woken up 2 times to write data ]
> [ perf record: Captured and wrote 1.065 MB perf.data (2949 samples) ]
>
> # perf kmem stat --page --caller -s order,hit -l 10
> #
> # GFP flags
> # ---------
> # 00000010: NI: GFP_NOIO
> # 000000d0: K: GFP_KERNEL
> # 00000200: NWR: GFP_NOWARN
> # 000052d0: K|NWR|NR|C: GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP
> # 000084d0: K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
> # 000200d0: U: GFP_USER
> # 000200d2: HU: GFP_HIGHUSER
> # 000200da: HUM: GFP_HIGHUSER_MOVABLE
> # 000280da: HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
> # 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
> # 0102005a: NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE
>
> ---------------------------------------------------------------------------------------------------------
> Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
> ---------------------------------------------------------------------------------------------------------
> 16 | 1 | 2 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
> 24 | 3 | 1 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
> 3,876 | 969 | 0 | MOVABLE | HUM | shmem_alloc_page
> 972 | 243 | 0 | UNMOVABL | K | __pollwait
> 624 | 156 | 0 | MOVABLE | NF|HW|M | __page_cache_alloc
> 304 | 76 | 0 | UNMOVABL | U | dma_generic_alloc_coherent
> 108 | 27 | 0 | MOVABLE | HUM|Z | handle_mm_fault
> 56 | 14 | 0 | UNMOVABL | K|R|Z|NT | pte_alloc_one
> 24 | 6 | 0 | MOVABLE | HUM | do_wp_page
> 16 | 4 | 0 | UNMOVABL | NWR | __tlb_remove_page
> ... | ... | ... | ... | ... | ...
> ---------------------------------------------------------------------------------------------------------
>
> SUMMARY (page allocator)
> ========================
> Total allocation requests : 1,518 [ 6,096 KB ]
> Total free requests : 1,431 [ 5,748 KB ]
>
> Total alloc+freed requests : 1,330 [ 5,344 KB ]
> Total alloc-only requests : 188 [ 752 KB ]
> Total free-only requests : 101 [ 404 KB ]
>
> Total allocation failures : 0 [ 0 KB ]
>
> Order     Unmovable   Reclaimable       Movable      Reserved  CMA/Isolated
> -----  ------------  ------------  ------------  ------------  ------------
>     0           351             .         1,163             .             .
>     1             3             .             .             .             .
>     2             1             .             .             .             .
>     3             .             .             .             .             .
>     4             .             .             .             .             .
>     5             .             .             .             .             .
>     6             .             .             .             .             .
>     7             .             .             .             .             .
>     8             .             .             .             .             .
>     9             .             .             .             .             .
>    10             .             .             .             .             .
>
> I have some ideas for how to improve it. But I'd also like to hear other
> ideas, suggestions, feedback and so on.
>
> This is available at perf/kmem-page-v8 branch on my tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (6):
> perf kmem: Implement stat --page --caller
> perf kmem: Support sort keys on page analysis
> perf kmem: Add --live option for current allocation stat
> perf kmem: Print gfp flags in human readable string
> perf kmem: Add kmem.default config option
> perf kmem: Show warning when trying to run stat without record
>
> tools/perf/Documentation/perf-kmem.txt | 11 +-
> tools/perf/builtin-kmem.c | 995 +++++++++++++++++++++++++++++----
> 2 files changed, 898 insertions(+), 108 deletions(-)
>
> --
> 2.3.4
>
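
To see which symbol names the regex from the cover letter accepts, here is a small sketch using POSIX regcomp()/regexec(). The pattern is copied from the message above; is_alloc_func() is a hypothetical helper written for illustration and recompiles the pattern on every call for simplicity, whereas the patch compiles it once while building its list.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern quoted from the cover letter. */
static const char alloc_pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";

/* Returns true when 'name' looks like an internal page allocation function. */
static bool is_alloc_func(const char *name)
{
	regex_t re;
	bool match;

	assert(name != NULL);
	if (regcomp(&re, alloc_pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;	/* treat a compile failure as "no match" */

	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

Because the pattern is anchored at the start of the name, '__alloc_pages_nodemask' and 'alloc_pages_current' match, while '__page_cache_alloc' does not, so it is the first frame left over as the callsite. Names such as 'new_slab' or 'alloc_skb_with_frags' are not matched either.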

2015-05-04 20:55:51

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 1/6] perf kmem: Implement stat --page --caller

On Tue, Apr 21, 2015 at 01:55:02PM +0900, Namhyung Kim wrote:
> This makes perf kmem support caller statistics for pages. Unlike the slab
> case, the tracepoints in the page allocator don't provide callsite info.
> So it records with callchain and extracts callsite info.
>
> Note that the callchain contains several memory allocation functions
> which have no meaning for users. So skip those functions to get proper
> callsites. I used the following regex pattern to skip the allocator
> functions:
>
> ^_?_?(alloc|get_free|get_zeroed)_pages?
>
> This gave me the following list of functions:
>
> # perf kmem record --page sleep 3
> # perf kmem stat --page -v
> ...
> alloc func: __get_free_pages
> alloc func: get_zeroed_page
> alloc func: alloc_pages_exact
> alloc func: __alloc_pages_direct_compact
> alloc func: __alloc_pages_nodemask
> alloc func: alloc_page_interleave
> alloc func: alloc_pages_current
> alloc func: alloc_pages_vma
> alloc func: alloc_page_buffers
> alloc func: alloc_pages_exact_nid
> ...
>
> The output looks mostly same as --alloc (I also added callsite column
> to that) but groups entries by callsite. Currently, the order,
> migrate type and GFP flag info is for the last allocation and not
> guaranteed to be same for all allocations from the callsite.

In my testing:

[root@ssdandy ~]# perf kmem stat --page --caller

------------------------------------------------------------------------------
TotalAlloc(KB)|Hits|Ord| Mig.type |GFP flags| Callsite
------------------------------------------------------------------------------
492 | 21 | 3 | UNMOVABL | 0235200 | new_slab
456 |114 | 0 | UNMOVABL | 00202d0 | iwl_pcie_rx_replenish
60 | 15 | 0 | UNMOVABL | 02284d0 | pte_alloc_one
44 | 11 | 0 | MOVABLE | 00280da | handle_mm_fault
28 | 7 | 0 | MOVABLE | 00200da | do_wp_page
16 | 4 | 0 | UNMOVABL | 00284d0 | __pmd_alloc
16 | 4 | 0 | UNMOVABL | 0020200 | __tlb_remove_page
12 | 3 | 0 | MOVABLE | 00200da | handle_mm_fault
8 | 2 | 0 | UNMOVABL | 00284d0 | __pud_alloc
4 | 1 | 0 | UNMOVABL | 0020250 | ftrace_define_fields_xfs_ag_class
4 | 1 | 0 | UNMOVABL | 0020010 | bio_copy_kern
4 | 1 | 0 | UNMOVABL | 00200d0 | __pollwait
4 | 1 | 0 | UNMOVABL | 00200d2 | pipe_write
4 | 1 | 0 | MOVABLE | 00280da | do_wp_page
4 | 1 | 0 | UNMOVABL | 02284d0 | pgd_alloc
---------------------------------------------------------------------------------

Probably that new_slab() one should go into the regexp?

[acme@ssdandy linux]$ uname -a
Linux ssdandy 4.0.0-rc6+ #3 SMP Mon Apr 13 16:45:57 BRT 2015 x86_64 x86_64 x86_64 GNU/Linux

[acme@ssdandy linux]$ grep SL.B /lib/modules/`uname -r`/build/.config
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
CONFIG_SLABINFO=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
[acme@ssdandy linux]$

- Arnaldo

> ---------------------------------------------------------------------------------------------
> Total_alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
> ---------------------------------------------------------------------------------------------
> 1,064 | 266 | 0 | UNMOVABL | 000000d0 | __pollwait
> 52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
> 44 | 11 | 0 | MOVABLE | 000280da | handle_mm_fault
> 20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
> 20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
> 16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
> 16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
> 12 | 3 | 0 | UNMOVABL | 000084d0 | __pud_alloc
> 8 | 2 | 0 | UNMOVABL | 00000010 | bio_copy_user_iov
> 4 | 1 | 0 | UNMOVABL | 000200d2 | pipe_write
> 4 | 1 | 0 | MOVABLE | 000280da | do_wp_page
> 4 | 1 | 0 | UNMOVABL | 002084d0 | pgd_alloc
> ---------------------------------------------------------------------------------------------
>
> Acked-by: Pekka Enberg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/builtin-kmem.c | 327 +++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 306 insertions(+), 21 deletions(-)
>
> diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
> index 4f0f38462d97..3649eec6807f 100644
> --- a/tools/perf/builtin-kmem.c
> +++ b/tools/perf/builtin-kmem.c
> @@ -10,6 +10,7 @@
> #include "util/header.h"
> #include "util/session.h"
> #include "util/tool.h"
> +#include "util/callchain.h"
>
> #include "util/parse-options.h"
> #include "util/trace-event.h"
> @@ -21,6 +22,7 @@
> #include <linux/rbtree.h>
> #include <linux/string.h>
> #include <locale.h>
> +#include <regex.h>
>
> static int kmem_slab;
> static int kmem_page;
> @@ -241,6 +243,7 @@ static unsigned long nr_page_fails;
> static unsigned long nr_page_nomatch;
>
> static bool use_pfn;
> +static struct perf_session *kmem_session;
>
> #define MAX_MIGRATE_TYPES 6
> #define MAX_PAGE_ORDER 11
> @@ -250,6 +253,7 @@ static int order_stats[MAX_PAGE_ORDER][MAX_MIGRATE_TYPES];
> struct page_stat {
> struct rb_node node;
> u64 page;
> + u64 callsite;
> int order;
> unsigned gfp_flags;
> unsigned migrate_type;
> @@ -262,8 +266,144 @@ struct page_stat {
> static struct rb_root page_tree;
> static struct rb_root page_alloc_tree;
> static struct rb_root page_alloc_sorted;
> +static struct rb_root page_caller_tree;
> +static struct rb_root page_caller_sorted;
>
> -static struct page_stat *search_page(unsigned long page, bool create)
> +struct alloc_func {
> + u64 start;
> + u64 end;
> + char *name;
> +};
> +
> +static int nr_alloc_funcs;
> +static struct alloc_func *alloc_func_list;
> +
> +static int funcmp(const void *a, const void *b)
> +{
> + const struct alloc_func *fa = a;
> + const struct alloc_func *fb = b;
> +
> + if (fa->start > fb->start)
> + return 1;
> + else
> + return -1;
> +}
> +
> +static int callcmp(const void *a, const void *b)
> +{
> + const struct alloc_func *fa = a;
> + const struct alloc_func *fb = b;
> +
> + if (fb->start <= fa->start && fa->end < fb->end)
> + return 0;
> +
> + if (fa->start > fb->start)
> + return 1;
> + else
> + return -1;
> +}
> +
> +static int build_alloc_func_list(void)
> +{
> + int ret;
> + struct map *kernel_map;
> + struct symbol *sym;
> + struct rb_node *node;
> + struct alloc_func *func;
> + struct machine *machine = &kmem_session->machines.host;
> + regex_t alloc_func_regex;
> + const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
> +
> + ret = regcomp(&alloc_func_regex, pattern, REG_EXTENDED);
> + if (ret) {
> + char err[BUFSIZ];
> +
> + regerror(ret, &alloc_func_regex, err, sizeof(err));
> + pr_err("Invalid regex: %s\n%s", pattern, err);
> + return -EINVAL;
> + }
> +
> + kernel_map = machine->vmlinux_maps[MAP__FUNCTION];
> + if (map__load(kernel_map, NULL) < 0) {
> + pr_err("cannot load kernel map\n");
> + return -ENOENT;
> + }
> +
> + map__for_each_symbol(kernel_map, sym, node) {
> + if (regexec(&alloc_func_regex, sym->name, 0, NULL, 0))
> + continue;
> +
> + func = realloc(alloc_func_list,
> + (nr_alloc_funcs + 1) * sizeof(*func));
> + if (func == NULL)
> + return -ENOMEM;
> +
> + pr_debug("alloc func: %s\n", sym->name);
> + func[nr_alloc_funcs].start = sym->start;
> + func[nr_alloc_funcs].end = sym->end;
> + func[nr_alloc_funcs].name = sym->name;
> +
> + alloc_func_list = func;
> + nr_alloc_funcs++;
> + }
> +
> + qsort(alloc_func_list, nr_alloc_funcs, sizeof(*func), funcmp);
> +
> + regfree(&alloc_func_regex);
> + return 0;
> +}
> +
> +/*
> + * Find first non-memory allocation function from callchain.
> + * The allocation functions are in the 'alloc_func_list'.
> + */
> +static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
> +{
> + struct addr_location al;
> + struct machine *machine = &kmem_session->machines.host;
> + struct callchain_cursor_node *node;
> +
> + if (alloc_func_list == NULL) {
> + if (build_alloc_func_list() < 0)
> + goto out;
> + }
> +
> + al.thread = machine__findnew_thread(machine, sample->pid, sample->tid);
> + sample__resolve_callchain(sample, NULL, evsel, &al, 16);
> +
> + callchain_cursor_commit(&callchain_cursor);
> + while (true) {
> + struct alloc_func key, *caller;
> + u64 addr;
> +
> + node = callchain_cursor_current(&callchain_cursor);
> + if (node == NULL)
> + break;
> +
> + key.start = key.end = node->ip;
> + caller = bsearch(&key, alloc_func_list, nr_alloc_funcs,
> + sizeof(key), callcmp);
> + if (!caller) {
> + /* found */
> + if (node->map)
> + addr = map__unmap_ip(node->map, node->ip);
> + else
> + addr = node->ip;
> +
> + return addr;
> + } else
> + pr_debug3("skipping alloc function: %s\n", caller->name);
> +
> + callchain_cursor_advance(&callchain_cursor);
> + }
> +
> +out:
> + pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
> + return sample->ip;
> +}
> +
> +static struct page_stat *
> +__page_stat__findnew_page(u64 page, bool create)
> {
> struct rb_node **node = &page_tree.rb_node;
> struct rb_node *parent = NULL;
> @@ -298,6 +438,16 @@ static struct page_stat *search_page(unsigned long page, bool create)
> return data;
> }
>
> +static struct page_stat *page_stat__find_page(u64 page)
> +{
> + return __page_stat__findnew_page(page, false);
> +}
> +
> +static struct page_stat *page_stat__findnew_page(u64 page)
> +{
> + return __page_stat__findnew_page(page, true);
> +}
> +
> static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
> {
> if (a->page > b->page)
> @@ -319,7 +469,8 @@ static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
> return 0;
> }
>
> -static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool create)
> +static struct page_stat *
> +__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
> {
> struct rb_node **node = &page_alloc_tree.rb_node;
> struct rb_node *parent = NULL;
> @@ -357,6 +508,62 @@ static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool cr
> return data;
> }
>
> +static struct page_stat *page_stat__find_alloc(struct page_stat *pstat)
> +{
> + return __page_stat__findnew_alloc(pstat, false);
> +}
> +
> +static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
> +{
> + return __page_stat__findnew_alloc(pstat, true);
> +}
> +
> +static struct page_stat *
> +__page_stat__findnew_caller(u64 callsite, bool create)
> +{
> + struct rb_node **node = &page_caller_tree.rb_node;
> + struct rb_node *parent = NULL;
> + struct page_stat *data;
> +
> + while (*node) {
> + s64 cmp;
> +
> + parent = *node;
> + data = rb_entry(*node, struct page_stat, node);
> +
> + cmp = data->callsite - callsite;
> + if (cmp < 0)
> + node = &parent->rb_left;
> + else if (cmp > 0)
> + node = &parent->rb_right;
> + else
> + return data;
> + }
> +
> + if (!create)
> + return NULL;
> +
> + data = zalloc(sizeof(*data));
> + if (data != NULL) {
> + data->callsite = callsite;
> +
> + rb_link_node(&data->node, parent, node);
> + rb_insert_color(&data->node, &page_caller_tree);
> + }
> +
> + return data;
> +}
> +
> +static struct page_stat *page_stat__find_caller(u64 callsite)
> +{
> + return __page_stat__findnew_caller(callsite, false);
> +}
> +
> +static struct page_stat *page_stat__findnew_caller(u64 callsite)
> +{
> + return __page_stat__findnew_caller(callsite, true);
> +}
> +
> static bool valid_page(u64 pfn_or_page)
> {
> if (use_pfn && pfn_or_page == -1UL)
> @@ -375,6 +582,7 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
> unsigned int migrate_type = perf_evsel__intval(evsel, sample,
> "migratetype");
> u64 bytes = kmem_page_size << order;
> + u64 callsite;
> struct page_stat *pstat;
> struct page_stat this = {
> .order = order,
> @@ -397,25 +605,40 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
> return 0;
> }
>
> + callsite = find_callsite(evsel, sample);
> +
> /*
> * This is to find the current page (with correct gfp flags and
> * migrate type) at free event.
> */
> - pstat = search_page(page, true);
> + pstat = page_stat__findnew_page(page);
> if (pstat == NULL)
> return -ENOMEM;
>
> pstat->order = order;
> pstat->gfp_flags = gfp_flags;
> pstat->migrate_type = migrate_type;
> + pstat->callsite = callsite;
>
> this.page = page;
> - pstat = search_page_alloc_stat(&this, true);
> + pstat = page_stat__findnew_alloc(&this);
> if (pstat == NULL)
> return -ENOMEM;
>
> pstat->nr_alloc++;
> pstat->alloc_bytes += bytes;
> + pstat->callsite = callsite;
> +
> + pstat = page_stat__findnew_caller(callsite);
> + if (pstat == NULL)
> + return -ENOMEM;
> +
> + pstat->order = order;
> + pstat->gfp_flags = gfp_flags;
> + pstat->migrate_type = migrate_type;
> +
> + pstat->nr_alloc++;
> + pstat->alloc_bytes += bytes;
>
> order_stats[order][migrate_type]++;
>
> @@ -441,7 +664,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
> nr_page_frees++;
> total_page_free_bytes += bytes;
>
> - pstat = search_page(page, false);
> + pstat = page_stat__find_page(page);
> if (pstat == NULL) {
> pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
> page, order);
> @@ -455,11 +678,19 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
> this.page = page;
> this.gfp_flags = pstat->gfp_flags;
> this.migrate_type = pstat->migrate_type;
> + this.callsite = pstat->callsite;
>
> rb_erase(&pstat->node, &page_tree);
> free(pstat);
>
> - pstat = search_page_alloc_stat(&this, false);
> + pstat = page_stat__find_alloc(&this);
> + if (pstat == NULL)
> + return -ENOENT;
> +
> + pstat->nr_free++;
> + pstat->free_bytes += bytes;
> +
> + pstat = page_stat__find_caller(this.callsite);
> if (pstat == NULL)
> return -ENOENT;
>
> @@ -576,41 +807,89 @@ static const char * const migrate_type_str[] = {
> "UNKNOWN",
> };
>
> -static void __print_page_result(struct rb_root *root,
> - struct perf_session *session __maybe_unused,
> - int n_lines)
> +static void __print_page_alloc_result(struct perf_session *session, int n_lines)
> {
> - struct rb_node *next = rb_first(root);
> + struct rb_node *next = rb_first(&page_alloc_sorted);
> + struct machine *machine = &session->machines.host;
> const char *format;
>
> - printf("\n%.80s\n", graph_dotted_line);
> - printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags\n",
> + printf("\n%.105s\n", graph_dotted_line);
> + printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
> use_pfn ? "PFN" : "Page");
> - printf("%.80s\n", graph_dotted_line);
> + printf("%.105s\n", graph_dotted_line);
>
> if (use_pfn)
> - format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx\n";
> + format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
> else
> - format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx\n";
> + format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
>
> while (next && n_lines--) {
> struct page_stat *data;
> + struct symbol *sym;
> + struct map *map;
> + char buf[32];
> + char *caller = buf;
>
> data = rb_entry(next, struct page_stat, node);
> + sym = machine__find_kernel_function(machine, data->callsite,
> + &map, NULL);
> + if (sym && sym->name)
> + caller = sym->name;
> + else
> + scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
>
> printf(format, (unsigned long long)data->page,
> (unsigned long long)data->alloc_bytes / 1024,
> data->nr_alloc, data->order,
> migrate_type_str[data->migrate_type],
> - (unsigned long)data->gfp_flags);
> + (unsigned long)data->gfp_flags, caller);
>
> next = rb_next(next);
> }
>
> if (n_lines == -1)
> - printf(" ... | ... | ... | ... | ... | ... \n");
> + printf(" ... | ... | ... | ... | ... | ... | ...\n");
>
> - printf("%.80s\n", graph_dotted_line);
> + printf("%.105s\n", graph_dotted_line);
> +}
> +
> +static void __print_page_caller_result(struct perf_session *session, int n_lines)
> +{
> + struct rb_node *next = rb_first(&page_caller_sorted);
> + struct machine *machine = &session->machines.host;
> +
> + printf("\n%.105s\n", graph_dotted_line);
> + printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
> + printf("%.105s\n", graph_dotted_line);
> +
> + while (next && n_lines--) {
> + struct page_stat *data;
> + struct symbol *sym;
> + struct map *map;
> + char buf[32];
> + char *caller = buf;
> +
> + data = rb_entry(next, struct page_stat, node);
> + sym = machine__find_kernel_function(machine, data->callsite,
> + &map, NULL);
> + if (sym && sym->name)
> + caller = sym->name;
> + else
> + scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
> +
> + printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
> + (unsigned long long)data->alloc_bytes / 1024,
> + data->nr_alloc, data->order,
> + migrate_type_str[data->migrate_type],
> + (unsigned long)data->gfp_flags, caller);
> +
> + next = rb_next(next);
> + }
> +
> + if (n_lines == -1)
> + printf(" ... | ... | ... | ... | ... | ...\n");
> +
> + printf("%.105s\n", graph_dotted_line);
> }
>
> static void print_slab_summary(void)
> @@ -682,8 +961,10 @@ static void print_slab_result(struct perf_session *session)
>
> static void print_page_result(struct perf_session *session)
> {
> + if (caller_flag)
> + __print_page_caller_result(session, caller_lines);
> if (alloc_flag)
> - __print_page_result(&page_alloc_sorted, session, alloc_lines);
> + __print_page_alloc_result(session, alloc_lines);
> print_page_summary();
> }
>
> @@ -802,6 +1083,7 @@ static void sort_result(void)
> }
> if (kmem_page) {
> __sort_page_result(&page_alloc_tree, &page_alloc_sorted);
> + __sort_page_result(&page_caller_tree, &page_caller_sorted);
> }
> }
>
> @@ -1084,7 +1366,7 @@ static int __cmd_record(int argc, const char **argv)
> if (kmem_slab)
> rec_argc += ARRAY_SIZE(slab_events);
> if (kmem_page)
> - rec_argc += ARRAY_SIZE(page_events);
> + rec_argc += ARRAY_SIZE(page_events) + 1; /* for -g */
>
> rec_argv = calloc(rec_argc + 1, sizeof(char *));
>
> @@ -1099,6 +1381,8 @@ static int __cmd_record(int argc, const char **argv)
> rec_argv[i] = strdup(slab_events[j]);
> }
> if (kmem_page) {
> + rec_argv[i++] = strdup("-g");
> +
> for (j = 0; j < ARRAY_SIZE(page_events); j++, i++)
> rec_argv[i] = strdup(page_events[j]);
> }
> @@ -1159,7 +1443,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
>
> file.path = input_name;
>
> - session = perf_session__new(&file, false, &perf_kmem);
> + kmem_session = session = perf_session__new(&file, false, &perf_kmem);
> if (session == NULL)
> return -1;
>
> @@ -1172,6 +1456,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> }
>
> kmem_page_size = pevent_get_page_size(evsel->tp_format->pevent);
> + symbol_conf.use_callchain = true;
> }
>
> symbol__init(&session->header.env);
> --
> 2.3.4
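
The callsite search in find_callsite() can be pictured as a walk over the callchain, innermost frame first, that skips every address falling inside a known allocator function. Below is a simplified sketch of that idea under stated assumptions: struct func_range, range_cmp() and first_non_alloc() are hypothetical names, the callchain is a plain array of addresses rather than a callchain cursor, and the comparator is a plain containment test instead of the patch's callcmp().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Address range [start, end) of one allocator function. */
struct func_range {
	uint64_t start;
	uint64_t end;
};

/* bsearch() comparator: 'key' carries one address in .start; 0 means it lies inside 'elem'. */
static int range_cmp(const void *key, const void *elem)
{
	const struct func_range *k = key;
	const struct func_range *r = elem;

	if (k->start < r->start)
		return -1;
	if (k->start >= r->end)
		return 1;
	return 0;
}

/*
 * Return the first address in 'chain' (innermost frame first) that is not
 * inside any range. 'ranges' must be sorted by start and non-overlapping.
 * Falls back to 'fallback' when every frame is an allocator function.
 */
static uint64_t first_non_alloc(const uint64_t *chain, size_t nr_chain,
				const struct func_range *ranges, size_t nr_ranges,
				uint64_t fallback)
{
	assert(chain != NULL || nr_chain == 0);
	for (size_t i = 0; i < nr_chain; i++) {
		struct func_range key = { .start = chain[i], .end = chain[i] };

		if (bsearch(&key, ranges, nr_ranges, sizeof(*ranges), range_cmp) == NULL)
			return chain[i];
	}
	return fallback;
}
```

Taking the sample callchain from the cover letter (360a7e, 3a711c, 357bc7, ...) and made-up ranges that cover the first two addresses, the function returns 357bc7, the frame the cover letter marks as the callsite. The fallback argument plays the role of sample->ip in the patch.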

2015-05-04 20:55:44

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 1/6] perf kmem: Implement stat --page --caller

On Mon, May 04, 2015 at 12:38:53PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Apr 21, 2015 at 01:55:02PM +0900, Namhyung Kim wrote:
> Probably that new_slab() one should go into the regexp?

Ah, nevermind about this question ;-)

- Arnaldo

> [acme@ssdandy linux]$ uname -a
> Linux ssdandy 4.0.0-rc6+ #3 SMP Mon Apr 13 16:45:57 BRT 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> [acme@ssdandy linux]$ grep SL.B /lib/modules/`uname -r`/build/.config
> CONFIG_SLUB_DEBUG=y
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y
> CONFIG_SLUB_CPU_PARTIAL=y
> CONFIG_SLABINFO=y
> # CONFIG_SLUB_DEBUG_ON is not set
> # CONFIG_SLUB_STATS is not set
> [acme@ssdandy linux]$
>
> - Arnaldo
>
> > ---------------------------------------------------------------------------------------------
> > Total_alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
> > ---------------------------------------------------------------------------------------------
> > 1,064 | 266 | 0 | UNMOVABL | 000000d0 | __pollwait
> > 52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
> > 44 | 11 | 0 | MOVABLE | 000280da | handle_mm_fault
> > 20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
> > 20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
> > 16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
> > 16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
> > 12 | 3 | 0 | UNMOVABL | 000084d0 | __pud_alloc
> > 8 | 2 | 0 | UNMOVABL | 00000010 | bio_copy_user_iov
> > 4 | 1 | 0 | UNMOVABL | 000200d2 | pipe_write
> > 4 | 1 | 0 | MOVABLE | 000280da | do_wp_page
> > 4 | 1 | 0 | UNMOVABL | 002084d0 | pgd_alloc
> > ---------------------------------------------------------------------------------------------
> >
> > Acked-by: Pekka Enberg <[email protected]>
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/builtin-kmem.c | 327 +++++++++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 306 insertions(+), 21 deletions(-)
> >
> > diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
> > index 4f0f38462d97..3649eec6807f 100644
> > --- a/tools/perf/builtin-kmem.c
> > +++ b/tools/perf/builtin-kmem.c
> > @@ -10,6 +10,7 @@
> > #include "util/header.h"
> > #include "util/session.h"
> > #include "util/tool.h"
> > +#include "util/callchain.h"
> >
> > #include "util/parse-options.h"
> > #include "util/trace-event.h"
> > @@ -21,6 +22,7 @@
> > #include <linux/rbtree.h>
> > #include <linux/string.h>
> > #include <locale.h>
> > +#include <regex.h>
> >
> > static int kmem_slab;
> > static int kmem_page;
> > @@ -241,6 +243,7 @@ static unsigned long nr_page_fails;
> > static unsigned long nr_page_nomatch;
> >
> > static bool use_pfn;
> > +static struct perf_session *kmem_session;
> >
> > #define MAX_MIGRATE_TYPES 6
> > #define MAX_PAGE_ORDER 11
> > @@ -250,6 +253,7 @@ static int order_stats[MAX_PAGE_ORDER][MAX_MIGRATE_TYPES];
> > struct page_stat {
> > struct rb_node node;
> > u64 page;
> > + u64 callsite;
> > int order;
> > unsigned gfp_flags;
> > unsigned migrate_type;
> > @@ -262,8 +266,144 @@ struct page_stat {
> > static struct rb_root page_tree;
> > static struct rb_root page_alloc_tree;
> > static struct rb_root page_alloc_sorted;
> > +static struct rb_root page_caller_tree;
> > +static struct rb_root page_caller_sorted;
> >
> > -static struct page_stat *search_page(unsigned long page, bool create)
> > +struct alloc_func {
> > + u64 start;
> > + u64 end;
> > + char *name;
> > +};
> > +
> > +static int nr_alloc_funcs;
> > +static struct alloc_func *alloc_func_list;
> > +
> > +static int funcmp(const void *a, const void *b)
> > +{
> > + const struct alloc_func *fa = a;
> > + const struct alloc_func *fb = b;
> > +
> > + if (fa->start > fb->start)
> > + return 1;
> > + else
> > + return -1;
> > +}
> > +
> > +static int callcmp(const void *a, const void *b)
> > +{
> > + const struct alloc_func *fa = a;
> > + const struct alloc_func *fb = b;
> > +
> > + if (fb->start <= fa->start && fa->end < fb->end)
> > + return 0;
> > +
> > + if (fa->start > fb->start)
> > + return 1;
> > + else
> > + return -1;
> > +}
> > +
> > +static int build_alloc_func_list(void)
> > +{
> > + int ret;
> > + struct map *kernel_map;
> > + struct symbol *sym;
> > + struct rb_node *node;
> > + struct alloc_func *func;
> > + struct machine *machine = &kmem_session->machines.host;
> > + regex_t alloc_func_regex;
> > + const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
> > +
> > + ret = regcomp(&alloc_func_regex, pattern, REG_EXTENDED);
> > + if (ret) {
> > + char err[BUFSIZ];
> > +
> > + regerror(ret, &alloc_func_regex, err, sizeof(err));
> > + pr_err("Invalid regex: %s\n%s", pattern, err);
> > + return -EINVAL;
> > + }
> > +
> > + kernel_map = machine->vmlinux_maps[MAP__FUNCTION];
> > + if (map__load(kernel_map, NULL) < 0) {
> > + pr_err("cannot load kernel map\n");
> > + return -ENOENT;
> > + }
> > +
> > + map__for_each_symbol(kernel_map, sym, node) {
> > + if (regexec(&alloc_func_regex, sym->name, 0, NULL, 0))
> > + continue;
> > +
> > + func = realloc(alloc_func_list,
> > + (nr_alloc_funcs + 1) * sizeof(*func));
> > + if (func == NULL)
> > + return -ENOMEM;
> > +
> > + pr_debug("alloc func: %s\n", sym->name);
> > + func[nr_alloc_funcs].start = sym->start;
> > + func[nr_alloc_funcs].end = sym->end;
> > + func[nr_alloc_funcs].name = sym->name;
> > +
> > + alloc_func_list = func;
> > + nr_alloc_funcs++;
> > + }
> > +
> > + qsort(alloc_func_list, nr_alloc_funcs, sizeof(*func), funcmp);
> > +
> > + regfree(&alloc_func_regex);
> > + return 0;
> > +}
> > +
> > +/*
> > + * Find first non-memory allocation function from callchain.
> > + * The allocation functions are in the 'alloc_func_list'.
> > + */
> > +static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
> > +{
> > + struct addr_location al;
> > + struct machine *machine = &kmem_session->machines.host;
> > + struct callchain_cursor_node *node;
> > +
> > + if (alloc_func_list == NULL) {
> > + if (build_alloc_func_list() < 0)
> > + goto out;
> > + }
> > +
> > + al.thread = machine__findnew_thread(machine, sample->pid, sample->tid);
> > + sample__resolve_callchain(sample, NULL, evsel, &al, 16);
> > +
> > + callchain_cursor_commit(&callchain_cursor);
> > + while (true) {
> > + struct alloc_func key, *caller;
> > + u64 addr;
> > +
> > + node = callchain_cursor_current(&callchain_cursor);
> > + if (node == NULL)
> > + break;
> > +
> > + key.start = key.end = node->ip;
> > + caller = bsearch(&key, alloc_func_list, nr_alloc_funcs,
> > + sizeof(key), callcmp);
> > + if (!caller) {
> > + /* found */
> > + if (node->map)
> > + addr = map__unmap_ip(node->map, node->ip);
> > + else
> > + addr = node->ip;
> > +
> > + return addr;
> > + } else
> > + pr_debug3("skipping alloc function: %s\n", caller->name);
> > +
> > + callchain_cursor_advance(&callchain_cursor);
> > + }
> > +
> > +out:
> > + pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
> > + return sample->ip;
> > +}
> > +
> > +static struct page_stat *
> > +__page_stat__findnew_page(u64 page, bool create)
> > {
> > struct rb_node **node = &page_tree.rb_node;
> > struct rb_node *parent = NULL;
> > @@ -298,6 +438,16 @@ static struct page_stat *search_page(unsigned long page, bool create)
> > return data;
> > }
> >
> > +static struct page_stat *page_stat__find_page(u64 page)
> > +{
> > + return __page_stat__findnew_page(page, false);
> > +}
> > +
> > +static struct page_stat *page_stat__findnew_page(u64 page)
> > +{
> > + return __page_stat__findnew_page(page, true);
> > +}
> > +
> > static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
> > {
> > if (a->page > b->page)
> > @@ -319,7 +469,8 @@ static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
> > return 0;
> > }
> >
> > -static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool create)
> > +static struct page_stat *
> > +__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
> > {
> > struct rb_node **node = &page_alloc_tree.rb_node;
> > struct rb_node *parent = NULL;
> > @@ -357,6 +508,62 @@ static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool cr
> > return data;
> > }
> >
> > +static struct page_stat *page_stat__find_alloc(struct page_stat *pstat)
> > +{
> > + return __page_stat__findnew_alloc(pstat, false);
> > +}
> > +
> > +static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
> > +{
> > + return __page_stat__findnew_alloc(pstat, true);
> > +}
> > +
> > +static struct page_stat *
> > +__page_stat__findnew_caller(u64 callsite, bool create)
> > +{
> > + struct rb_node **node = &page_caller_tree.rb_node;
> > + struct rb_node *parent = NULL;
> > + struct page_stat *data;
> > +
> > + while (*node) {
> > + s64 cmp;
> > +
> > + parent = *node;
> > + data = rb_entry(*node, struct page_stat, node);
> > +
> > + cmp = data->callsite - callsite;
> > + if (cmp < 0)
> > + node = &parent->rb_left;
> > + else if (cmp > 0)
> > + node = &parent->rb_right;
> > + else
> > + return data;
> > + }
> > +
> > + if (!create)
> > + return NULL;
> > +
> > + data = zalloc(sizeof(*data));
> > + if (data != NULL) {
> > + data->callsite = callsite;
> > +
> > + rb_link_node(&data->node, parent, node);
> > + rb_insert_color(&data->node, &page_caller_tree);
> > + }
> > +
> > + return data;
> > +}
> > +
> > +static struct page_stat *page_stat__find_caller(u64 callsite)
> > +{
> > + return __page_stat__findnew_caller(callsite, false);
> > +}
> > +
> > +static struct page_stat *page_stat__findnew_caller(u64 callsite)
> > +{
> > + return __page_stat__findnew_caller(callsite, true);
> > +}
> > +
> > static bool valid_page(u64 pfn_or_page)
> > {
> > if (use_pfn && pfn_or_page == -1UL)
> > @@ -375,6 +582,7 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
> > unsigned int migrate_type = perf_evsel__intval(evsel, sample,
> > "migratetype");
> > u64 bytes = kmem_page_size << order;
> > + u64 callsite;
> > struct page_stat *pstat;
> > struct page_stat this = {
> > .order = order,
> > @@ -397,25 +605,40 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
> > return 0;
> > }
> >
> > + callsite = find_callsite(evsel, sample);
> > +
> > /*
> > * This is to find the current page (with correct gfp flags and
> > * migrate type) at free event.
> > */
> > - pstat = search_page(page, true);
> > + pstat = page_stat__findnew_page(page);
> > if (pstat == NULL)
> > return -ENOMEM;
> >
> > pstat->order = order;
> > pstat->gfp_flags = gfp_flags;
> > pstat->migrate_type = migrate_type;
> > + pstat->callsite = callsite;
> >
> > this.page = page;
> > - pstat = search_page_alloc_stat(&this, true);
> > + pstat = page_stat__findnew_alloc(&this);
> > if (pstat == NULL)
> > return -ENOMEM;
> >
> > pstat->nr_alloc++;
> > pstat->alloc_bytes += bytes;
> > + pstat->callsite = callsite;
> > +
> > + pstat = page_stat__findnew_caller(callsite);
> > + if (pstat == NULL)
> > + return -ENOMEM;
> > +
> > + pstat->order = order;
> > + pstat->gfp_flags = gfp_flags;
> > + pstat->migrate_type = migrate_type;
> > +
> > + pstat->nr_alloc++;
> > + pstat->alloc_bytes += bytes;
> >
> > order_stats[order][migrate_type]++;
> >
> > @@ -441,7 +664,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
> > nr_page_frees++;
> > total_page_free_bytes += bytes;
> >
> > - pstat = search_page(page, false);
> > + pstat = page_stat__find_page(page);
> > if (pstat == NULL) {
> > pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
> > page, order);
> > @@ -455,11 +678,19 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
> > this.page = page;
> > this.gfp_flags = pstat->gfp_flags;
> > this.migrate_type = pstat->migrate_type;
> > + this.callsite = pstat->callsite;
> >
> > rb_erase(&pstat->node, &page_tree);
> > free(pstat);
> >
> > - pstat = search_page_alloc_stat(&this, false);
> > + pstat = page_stat__find_alloc(&this);
> > + if (pstat == NULL)
> > + return -ENOENT;
> > +
> > + pstat->nr_free++;
> > + pstat->free_bytes += bytes;
> > +
> > + pstat = page_stat__find_caller(this.callsite);
> > if (pstat == NULL)
> > return -ENOENT;
> >
> > @@ -576,41 +807,89 @@ static const char * const migrate_type_str[] = {
> > "UNKNOWN",
> > };
> >
> > -static void __print_page_result(struct rb_root *root,
> > - struct perf_session *session __maybe_unused,
> > - int n_lines)
> > +static void __print_page_alloc_result(struct perf_session *session, int n_lines)
> > {
> > - struct rb_node *next = rb_first(root);
> > + struct rb_node *next = rb_first(&page_alloc_sorted);
> > + struct machine *machine = &session->machines.host;
> > const char *format;
> >
> > - printf("\n%.80s\n", graph_dotted_line);
> > - printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags\n",
> > + printf("\n%.105s\n", graph_dotted_line);
> > + printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
> > use_pfn ? "PFN" : "Page");
> > - printf("%.80s\n", graph_dotted_line);
> > + printf("%.105s\n", graph_dotted_line);
> >
> > if (use_pfn)
> > - format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx\n";
> > + format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
> > else
> > - format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx\n";
> > + format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
> >
> > while (next && n_lines--) {
> > struct page_stat *data;
> > + struct symbol *sym;
> > + struct map *map;
> > + char buf[32];
> > + char *caller = buf;
> >
> > data = rb_entry(next, struct page_stat, node);
> > + sym = machine__find_kernel_function(machine, data->callsite,
> > + &map, NULL);
> > + if (sym && sym->name)
> > + caller = sym->name;
> > + else
> > + scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
> >
> > printf(format, (unsigned long long)data->page,
> > (unsigned long long)data->alloc_bytes / 1024,
> > data->nr_alloc, data->order,
> > migrate_type_str[data->migrate_type],
> > - (unsigned long)data->gfp_flags);
> > + (unsigned long)data->gfp_flags, caller);
> >
> > next = rb_next(next);
> > }
> >
> > if (n_lines == -1)
> > - printf(" ... | ... | ... | ... | ... | ... \n");
> > + printf(" ... | ... | ... | ... | ... | ... | ...\n");
> >
> > - printf("%.80s\n", graph_dotted_line);
> > + printf("%.105s\n", graph_dotted_line);
> > +}
> > +
> > +static void __print_page_caller_result(struct perf_session *session, int n_lines)
> > +{
> > + struct rb_node *next = rb_first(&page_caller_sorted);
> > + struct machine *machine = &session->machines.host;
> > +
> > + printf("\n%.105s\n", graph_dotted_line);
> > + printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
> > + printf("%.105s\n", graph_dotted_line);
> > +
> > + while (next && n_lines--) {
> > + struct page_stat *data;
> > + struct symbol *sym;
> > + struct map *map;
> > + char buf[32];
> > + char *caller = buf;
> > +
> > + data = rb_entry(next, struct page_stat, node);
> > + sym = machine__find_kernel_function(machine, data->callsite,
> > + &map, NULL);
> > + if (sym && sym->name)
> > + caller = sym->name;
> > + else
> > + scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
> > +
> > + printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
> > + (unsigned long long)data->alloc_bytes / 1024,
> > + data->nr_alloc, data->order,
> > + migrate_type_str[data->migrate_type],
> > + (unsigned long)data->gfp_flags, caller);
> > +
> > + next = rb_next(next);
> > + }
> > +
> > + if (n_lines == -1)
> > + printf(" ... | ... | ... | ... | ... | ...\n");
> > +
> > + printf("%.105s\n", graph_dotted_line);
> > }
> >
> > static void print_slab_summary(void)
> > @@ -682,8 +961,10 @@ static void print_slab_result(struct perf_session *session)
> >
> > static void print_page_result(struct perf_session *session)
> > {
> > + if (caller_flag)
> > + __print_page_caller_result(session, caller_lines);
> > if (alloc_flag)
> > - __print_page_result(&page_alloc_sorted, session, alloc_lines);
> > + __print_page_alloc_result(session, alloc_lines);
> > print_page_summary();
> > }
> >
> > @@ -802,6 +1083,7 @@ static void sort_result(void)
> > }
> > if (kmem_page) {
> > __sort_page_result(&page_alloc_tree, &page_alloc_sorted);
> > + __sort_page_result(&page_caller_tree, &page_caller_sorted);
> > }
> > }
> >
> > @@ -1084,7 +1366,7 @@ static int __cmd_record(int argc, const char **argv)
> > if (kmem_slab)
> > rec_argc += ARRAY_SIZE(slab_events);
> > if (kmem_page)
> > - rec_argc += ARRAY_SIZE(page_events);
> > + rec_argc += ARRAY_SIZE(page_events) + 1; /* for -g */
> >
> > rec_argv = calloc(rec_argc + 1, sizeof(char *));
> >
> > @@ -1099,6 +1381,8 @@ static int __cmd_record(int argc, const char **argv)
> > rec_argv[i] = strdup(slab_events[j]);
> > }
> > if (kmem_page) {
> > + rec_argv[i++] = strdup("-g");
> > +
> > for (j = 0; j < ARRAY_SIZE(page_events); j++, i++)
> > rec_argv[i] = strdup(page_events[j]);
> > }
> > @@ -1159,7 +1443,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> >
> > file.path = input_name;
> >
> > - session = perf_session__new(&file, false, &perf_kmem);
> > + kmem_session = session = perf_session__new(&file, false, &perf_kmem);
> > if (session == NULL)
> > return -1;
> >
> > @@ -1172,6 +1456,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> > }
> >
> > kmem_page_size = pevent_get_page_size(evsel->tp_format->pevent);
> > + symbol_conf.use_callchain = true;
> > }
> >
> > symbol__init(&session->header.env);
> > --
> > 2.3.4

2015-05-04 20:57:17

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 6/6] perf kmem: Show warning when trying to run stat without record

Em Tue, Apr 21, 2015 at 01:55:07PM +0900, Namhyung Kim escreveu:
> Sometimes one can mistakenly run perf kmem stat without running perf
> kmem record first, or with a different configuration, such as recording
> with --slab and then running stat with --page. Show a warning message
> like the one below to inform the user:
>
> # perf kmem stat --page --caller
> Not found page events. Have you run 'perf kmem record --page' before?
>
> Acked-by: Pekka Enberg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/builtin-kmem.c | 31 ++++++++++++++++++++++++++++---
> 1 file changed, 28 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
> index 828b7284e547..f29a766f18f8 100644
> --- a/tools/perf/builtin-kmem.c
> +++ b/tools/perf/builtin-kmem.c
> @@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> };
> struct perf_session *session;
> int ret = -1;
> + const char errmsg[] = "Not found %s events. Have you run 'perf kmem record --%s' before?\n";
>
> perf_config(kmem_config, NULL);
> argc = parse_options_subcommand(argc, argv, kmem_options,
> @@ -1908,11 +1909,35 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> if (session == NULL)
> return -1;
>
> + if (kmem_slab) {
> + struct perf_evsel *evsel;
> + bool found = false;
> +
> + evlist__for_each(session->evlist, evsel) {
> + if (!strcmp(perf_evsel__name(evsel), "kmem:kmalloc")) {
> + found = true;
> + break;
> + }
> + }

We have:

    struct perf_evsel *
    perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,
                                         const char *name);

Example of it being used in 'perf trace':

    evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
                                                 "raw_syscalls:sys_enter");
    /* older kernels have syscalls tp versus raw_syscalls */
    if (evsel == NULL)
        evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
                                                     "syscalls:sys_enter");


Applied 1-5, can you please resubmit this one with this change?

- Arnaldo

> + if (!found) {
> + pr_err(errmsg, "slab", "slab");
> + return -1;
> + }
> + }
> +
> if (kmem_page) {
> - struct perf_evsel *evsel = perf_evlist__first(session->evlist);
> + struct perf_evsel *evsel;
> + bool found = false;
>
> - if (evsel == NULL || evsel->tp_format == NULL) {
> - pr_err("invalid event found.. aborting\n");
> + evlist__for_each(session->evlist, evsel) {
> + if (!strcmp(perf_evsel__name(evsel),
> + "kmem:mm_page_alloc")) {
> + found = true;
> + break;
> + }
> + }
> + if (!found) {
> + pr_err(errmsg, "page", "page");
> return -1;
> }
>
> --
> 2.3.4

2015-05-05 00:59:34

by Namhyung Kim

Subject: [PATCH v2] perf kmem: Show warning when trying to run stat without record

Sometimes one can mistakenly run perf kmem stat without running perf
kmem record first, or with a different configuration, such as recording
with --slab and then running stat with --page. Show a warning message
like the one below to inform the user:

# perf kmem stat --page --caller
Not found page events. Have you run 'perf kmem record --page' before?

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
Use perf_evlist__find_tracepoint_by_name().

tools/perf/builtin-kmem.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 828b7284e547..5868b4347925 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
};
struct perf_session *session;
int ret = -1;
+ const char errmsg[] = "Not found %s events. Have you run 'perf kmem record --%s' before?\n";

perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
@@ -1908,11 +1909,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ if (kmem_slab) {
+ if (!perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:kmalloc")) {
+ pr_err(errmsg, "slab", "slab");
+ return -1;
+ }
+ }
+
if (kmem_page) {
- struct perf_evsel *evsel = perf_evlist__first(session->evlist);
+ struct perf_evsel *evsel;

- if (evsel == NULL || evsel->tp_format == NULL) {
- pr_err("invalid event found.. aborting\n");
+ evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:mm_page_alloc");
+ if (evsel == NULL) {
+ pr_err(errmsg, "page", "page");
return -1;
}

--
2.3.7

2015-05-05 14:07:16

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2] perf kmem: Show warning when trying to run stat without record

Em Tue, May 05, 2015 at 09:58:12AM +0900, Namhyung Kim escreveu:
> Sometimes one can mistakenly run perf kmem stat without running perf
> kmem record first, or with a different configuration, such as recording
> with --slab and then running stat with --page. Show a warning message
> like the one below to inform the user:
>
> # perf kmem stat --page --caller
> Not found page events. Have you run 'perf kmem record --page' before?
>
> Acked-by: Pekka Enberg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>

Thanks, applied.

I just found the messages a bit odd sounding, perhaps:

# perf kmem stat --page --caller
No page allocation events found. Have you run 'perf kmem record --page'?

Pekka?

- Arnaldo

> ---
> Use perf_evlist__find_tracepoint_by_name().
>
> tools/perf/builtin-kmem.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
> index 828b7284e547..5868b4347925 100644
> --- a/tools/perf/builtin-kmem.c
> +++ b/tools/perf/builtin-kmem.c
> @@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> };
> struct perf_session *session;
> int ret = -1;
> + const char errmsg[] = "Not found %s events. Have you run 'perf kmem record --%s' before?\n";
>
> perf_config(kmem_config, NULL);
> argc = parse_options_subcommand(argc, argv, kmem_options,
> @@ -1908,11 +1909,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> if (session == NULL)
> return -1;
>
> + if (kmem_slab) {
> + if (!perf_evlist__find_tracepoint_by_name(session->evlist,
> + "kmem:kmalloc")) {
> + pr_err(errmsg, "slab", "slab");
> + return -1;
> + }
> + }
> +
> if (kmem_page) {
> - struct perf_evsel *evsel = perf_evlist__first(session->evlist);
> + struct perf_evsel *evsel;
>
> - if (evsel == NULL || evsel->tp_format == NULL) {
> - pr_err("invalid event found.. aborting\n");
> + evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
> + "kmem:mm_page_alloc");
> + if (evsel == NULL) {
> + pr_err(errmsg, "page", "page");
> return -1;
> }
>
> --
> 2.3.7

2015-05-05 16:05:59

by Pekka Enberg

Subject: Re: [PATCH v2] perf kmem: Show warning when trying to run stat without record

On 05/05/2015 05:07 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, May 05, 2015 at 09:58:12AM +0900, Namhyung Kim escreveu:
>> Sometimes one can mistakenly run perf kmem stat without running perf
>> kmem record first, or with a different configuration, such as recording
>> with --slab and then running stat with --page. Show a warning message
>> like the one below to inform the user:
>>
>> # perf kmem stat --page --caller
>> Not found page events. Have you run 'perf kmem record --page' before?
>>
>> Acked-by: Pekka Enberg <[email protected]>
>> Signed-off-by: Namhyung Kim <[email protected]>
> Thanks, applied.
>
> I just found the messages a bit odd sounding, perhaps:
>
> # perf kmem stat --page --caller
> No page allocation events found. Have you run 'perf kmem record --page'?
>
> Pekka?

Sure, that sounds less confusing.

- Pekka

2015-05-05 16:08:19

by Namhyung Kim

Subject: [PATCH v3 6/6] perf kmem: Show warning when trying to run stat without record

Sometimes one can mistakenly run perf kmem stat without running perf
kmem record first, or with a different configuration, such as recording
with --slab and then running stat with --page. Show a warning message
like the one below to inform the user:

# perf kmem stat --page --caller
No page allocation events found. Have you run 'perf kmem record --page'?

Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
Update the warning message.

tools/perf/builtin-kmem.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 828b7284e547..e628bf1a0c24 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
};
struct perf_session *session;
int ret = -1;
+ const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";

perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
@@ -1908,11 +1909,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ if (kmem_slab) {
+ if (!perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:kmalloc")) {
+ pr_err(errmsg, "slab", "slab");
+ return -1;
+ }
+ }
+
if (kmem_page) {
- struct perf_evsel *evsel = perf_evlist__first(session->evlist);
+ struct perf_evsel *evsel;

- if (evsel == NULL || evsel->tp_format == NULL) {
- pr_err("invalid event found.. aborting\n");
+ evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:mm_page_alloc");
+ if (evsel == NULL) {
+ pr_err(errmsg, "page", "page");
return -1;
}

--
2.3.7

2015-05-05 16:08:36

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v3 6/6] perf kmem: Show warning when trying to run stat without record

Em Tue, May 05, 2015 at 11:52:52PM +0900, Namhyung Kim escreveu:
> Sometimes one can mistakenly run perf kmem stat without running perf
> kmem record first, or with a different configuration, such as recording
> with --slab and then running stat with --page. Show a warning message
> like the one below to inform the user:
>
> # perf kmem stat --page --caller
> No page allocation events found. Have you run 'perf kmem record --page'?
>
> Acked-by: Pekka Enberg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>

Thanks, replacing that patch with this one.

- Arnaldo


> ---
> Update the warning message.
>
> tools/perf/builtin-kmem.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
> index 828b7284e547..e628bf1a0c24 100644
> --- a/tools/perf/builtin-kmem.c
> +++ b/tools/perf/builtin-kmem.c
> @@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> };
> struct perf_session *session;
> int ret = -1;
> + const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";
>
> perf_config(kmem_config, NULL);
> argc = parse_options_subcommand(argc, argv, kmem_options,
> @@ -1908,11 +1909,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
> if (session == NULL)
> return -1;
>
> + if (kmem_slab) {
> + if (!perf_evlist__find_tracepoint_by_name(session->evlist,
> + "kmem:kmalloc")) {
> + pr_err(errmsg, "slab", "slab");
> + return -1;
> + }
> + }
> +
> if (kmem_page) {
> - struct perf_evsel *evsel = perf_evlist__first(session->evlist);
> + struct perf_evsel *evsel;
>
> - if (evsel == NULL || evsel->tp_format == NULL) {
> - pr_err("invalid event found.. aborting\n");
> + evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
> + "kmem:mm_page_alloc");
> + if (evsel == NULL) {
> + pr_err(errmsg, "page", "page");
> return -1;
> }
>
> --
> 2.3.7

Subject: [tip:perf/core] perf kmem: Implement stat --page --caller

Commit-ID: c9758cc4569955c6d8ad519adf539848e8824c72
Gitweb: http://git.kernel.org/tip/c9758cc4569955c6d8ad519adf539848e8824c72
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 21 Apr 2015 13:55:02 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 4 May 2015 12:43:57 -0300

perf kmem: Implement stat --page --caller

Add caller statistics support for page allocation to 'perf kmem'.
Unlike the slab case, the tracepoints in the page allocator don't
provide callsite info, so it records with callchains and extracts the
callsite from them.

Note that the callchain contains several memory allocation functions
which have no meaning for users. So skip those functions to get proper
callsites. I used the following regex pattern to skip the allocator
functions:

^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions:

# perf kmem record --page sleep 3
# perf kmem stat --page -v
...
alloc func: __get_free_pages
alloc func: get_zeroed_page
alloc func: alloc_pages_exact
alloc func: __alloc_pages_direct_compact
alloc func: __alloc_pages_nodemask
alloc func: alloc_page_interleave
alloc func: alloc_pages_current
alloc func: alloc_pages_vma
alloc func: alloc_page_buffers
alloc func: alloc_pages_exact_nid
...
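
As a rough illustration of the skipping step described above, here is a
minimal standalone sketch using POSIX regex. It is not the perf code:
is_alloc_func() and first_non_alloc() are hypothetical helpers that work
on symbol names, whereas the patch itself matches callchain addresses
against a sorted list of symbol ranges.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern quoted from the commit message. */
static const char alloc_pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";

/* Hypothetical helper: true when 'name' looks like a page allocator function. */
static bool is_alloc_func(const char *name)
{
	regex_t re;
	int ret;
	bool match;

	/* The pattern is a constant, so a compile failure is a programming error. */
	ret = regcomp(&re, alloc_pattern, REG_EXTENDED | REG_NOSUB);
	assert(ret == 0);
	(void)ret;

	/* regexec() returns 0 on a match. */
	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}

/*
 * Hypothetical helper: walk a callchain given as symbol names (innermost
 * frame first) and return the index of the first frame that is not an
 * allocator function, or -1 if every frame matched.
 */
static int first_non_alloc(const char *const chain[], int nr)
{
	for (int i = 0; i < nr; i++) {
		if (!is_alloc_func(chain[i]))
			return i;
	}
	return -1;
}
```

With the chain __alloc_pages_nodemask, alloc_pages_current,
__page_cache_alloc, the first two names match the pattern and are
skipped, so __page_cache_alloc would be reported as the callsite.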

The output looks mostly the same as --alloc (I also added a callsite
column to that) but groups entries by callsite. Currently, the order,
migrate type and GFP flag info is for the last allocation and is not
guaranteed to be the same for all allocations from the callsite.

---------------------------------------------------------------------------------------------
 Total_alloc (KB) |      Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------
            1,064 |       266 |     0 | UNMOVABL |  000000d0 | __pollwait
               52 |        13 |     0 | UNMOVABL |  002084d0 | pte_alloc_one
               44 |        11 |     0 |  MOVABLE |  000280da | handle_mm_fault
               20 |         5 |     0 |  MOVABLE |  000200da | do_cow_fault
               20 |         5 |     0 |  MOVABLE |  000200da | do_wp_page
               16 |         4 |     0 | UNMOVABL |  000084d0 | __pmd_alloc
               16 |         4 |     0 | UNMOVABL |  00000200 | __tlb_remove_page
               12 |         3 |     0 | UNMOVABL |  000084d0 | __pud_alloc
                8 |         2 |     0 | UNMOVABL |  00000010 | bio_copy_user_iov
                4 |         1 |     0 | UNMOVABL |  000200d2 | pipe_write
                4 |         1 |     0 |  MOVABLE |  000280da | do_wp_page
                4 |         1 |     0 | UNMOVABL |  002084d0 | pgd_alloc
---------------------------------------------------------------------------------------------
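
The "last allocation wins" behaviour mentioned above can be sketched
roughly as follows. This is a simplified stand-in, not the patch: it
uses a small fixed array instead of an rbtree, and struct caller_stat,
caller_findnew() and account_alloc() are hypothetical names chosen for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-callsite record, loosely modelled on struct page_stat. */
struct caller_stat {
	uint64_t callsite;
	uint64_t alloc_bytes;
	int nr_alloc;
	int order;		/* from the most recent allocation only */
	unsigned gfp_flags;	/* likewise */
};

#define MAX_CALLERS 64

static struct caller_stat callers[MAX_CALLERS];
static size_t nr_callers;

/* Find the entry for 'callsite', creating it if needed; NULL when full. */
static struct caller_stat *caller_findnew(uint64_t callsite)
{
	for (size_t i = 0; i < nr_callers; i++) {
		if (callers[i].callsite == callsite)
			return &callers[i];
	}
	if (nr_callers == MAX_CALLERS)
		return NULL;

	callers[nr_callers] = (struct caller_stat){ .callsite = callsite };
	return &callers[nr_callers++];
}

/* Account one allocation event of 2^order pages to its callsite. */
static int account_alloc(uint64_t callsite, int order, unsigned gfp_flags,
			 uint64_t page_size)
{
	struct caller_stat *st;

	assert(order >= 0 && order < 64);	/* keep the shift well defined */

	st = caller_findnew(callsite);
	if (st == NULL)
		return -1;

	/* Counters accumulate across all events from this callsite... */
	st->nr_alloc++;
	st->alloc_bytes += page_size << order;

	/* ...but these fields are simply overwritten by the latest event. */
	st->order = order;
	st->gfp_flags = gfp_flags;
	return 0;
}
```

Two events from the same callsite with different orders end up in one
entry whose hit count and byte total cover both, while the order and
GFP flags shown are those of the second event.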

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-kmem.c | 327 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 306 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 4f0f384..3649eec 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -10,6 +10,7 @@
#include "util/header.h"
#include "util/session.h"
#include "util/tool.h"
+#include "util/callchain.h"

#include "util/parse-options.h"
#include "util/trace-event.h"
@@ -21,6 +22,7 @@
#include <linux/rbtree.h>
#include <linux/string.h>
#include <locale.h>
+#include <regex.h>

static int kmem_slab;
static int kmem_page;
@@ -241,6 +243,7 @@ static unsigned long nr_page_fails;
static unsigned long nr_page_nomatch;

static bool use_pfn;
+static struct perf_session *kmem_session;

#define MAX_MIGRATE_TYPES 6
#define MAX_PAGE_ORDER 11
@@ -250,6 +253,7 @@ static int order_stats[MAX_PAGE_ORDER][MAX_MIGRATE_TYPES];
struct page_stat {
struct rb_node node;
u64 page;
+ u64 callsite;
int order;
unsigned gfp_flags;
unsigned migrate_type;
@@ -262,8 +266,144 @@ struct page_stat {
static struct rb_root page_tree;
static struct rb_root page_alloc_tree;
static struct rb_root page_alloc_sorted;
+static struct rb_root page_caller_tree;
+static struct rb_root page_caller_sorted;

-static struct page_stat *search_page(unsigned long page, bool create)
+struct alloc_func {
+ u64 start;
+ u64 end;
+ char *name;
+};
+
+static int nr_alloc_funcs;
+static struct alloc_func *alloc_func_list;
+
+static int funcmp(const void *a, const void *b)
+{
+ const struct alloc_func *fa = a;
+ const struct alloc_func *fb = b;
+
+ if (fa->start > fb->start)
+ return 1;
+ else
+ return -1;
+}
+
+static int callcmp(const void *a, const void *b)
+{
+ const struct alloc_func *fa = a;
+ const struct alloc_func *fb = b;
+
+ if (fb->start <= fa->start && fa->end < fb->end)
+ return 0;
+
+ if (fa->start > fb->start)
+ return 1;
+ else
+ return -1;
+}
+
+static int build_alloc_func_list(void)
+{
+ int ret;
+ struct map *kernel_map;
+ struct symbol *sym;
+ struct rb_node *node;
+ struct alloc_func *func;
+ struct machine *machine = &kmem_session->machines.host;
+ regex_t alloc_func_regex;
+ const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
+
+ ret = regcomp(&alloc_func_regex, pattern, REG_EXTENDED);
+ if (ret) {
+ char err[BUFSIZ];
+
+ regerror(ret, &alloc_func_regex, err, sizeof(err));
+ pr_err("Invalid regex: %s\n%s", pattern, err);
+ return -EINVAL;
+ }
+
+ kernel_map = machine->vmlinux_maps[MAP__FUNCTION];
+ if (map__load(kernel_map, NULL) < 0) {
+ pr_err("cannot load kernel map\n");
+ return -ENOENT;
+ }
+
+ map__for_each_symbol(kernel_map, sym, node) {
+ if (regexec(&alloc_func_regex, sym->name, 0, NULL, 0))
+ continue;
+
+ func = realloc(alloc_func_list,
+ (nr_alloc_funcs + 1) * sizeof(*func));
+ if (func == NULL)
+ return -ENOMEM;
+
+ pr_debug("alloc func: %s\n", sym->name);
+ func[nr_alloc_funcs].start = sym->start;
+ func[nr_alloc_funcs].end = sym->end;
+ func[nr_alloc_funcs].name = sym->name;
+
+ alloc_func_list = func;
+ nr_alloc_funcs++;
+ }
+
+ qsort(alloc_func_list, nr_alloc_funcs, sizeof(*func), funcmp);
+
+ regfree(&alloc_func_regex);
+ return 0;
+}
+
+/*
+ * Find first non-memory allocation function from callchain.
+ * The allocation functions are in the 'alloc_func_list'.
+ */
+static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+ struct addr_location al;
+ struct machine *machine = &kmem_session->machines.host;
+ struct callchain_cursor_node *node;
+
+ if (alloc_func_list == NULL) {
+ if (build_alloc_func_list() < 0)
+ goto out;
+ }
+
+ al.thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+ sample__resolve_callchain(sample, NULL, evsel, &al, 16);
+
+ callchain_cursor_commit(&callchain_cursor);
+ while (true) {
+ struct alloc_func key, *caller;
+ u64 addr;
+
+ node = callchain_cursor_current(&callchain_cursor);
+ if (node == NULL)
+ break;
+
+ key.start = key.end = node->ip;
+ caller = bsearch(&key, alloc_func_list, nr_alloc_funcs,
+ sizeof(key), callcmp);
+ if (!caller) {
+ /* found */
+ if (node->map)
+ addr = map__unmap_ip(node->map, node->ip);
+ else
+ addr = node->ip;
+
+ return addr;
+ } else
+ pr_debug3("skipping alloc function: %s\n", caller->name);
+
+ callchain_cursor_advance(&callchain_cursor);
+ }
+
+out:
+ pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
+ return sample->ip;
+}
+
+static struct page_stat *
+__page_stat__findnew_page(u64 page, bool create)
{
struct rb_node **node = &page_tree.rb_node;
struct rb_node *parent = NULL;
@@ -298,6 +438,16 @@ static struct page_stat *search_page(unsigned long page, bool create)
return data;
}

+static struct page_stat *page_stat__find_page(u64 page)
+{
+ return __page_stat__findnew_page(page, false);
+}
+
+static struct page_stat *page_stat__findnew_page(u64 page)
+{
+ return __page_stat__findnew_page(page, true);
+}
+
static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
{
if (a->page > b->page)
@@ -319,7 +469,8 @@ static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
return 0;
}

-static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool create)
+static struct page_stat *
+__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
{
struct rb_node **node = &page_alloc_tree.rb_node;
struct rb_node *parent = NULL;
@@ -357,6 +508,62 @@ static struct page_stat *search_page_alloc_stat(struct page_stat *pstat, bool cr
return data;
}

+static struct page_stat *page_stat__find_alloc(struct page_stat *pstat)
+{
+ return __page_stat__findnew_alloc(pstat, false);
+}
+
+static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
+{
+ return __page_stat__findnew_alloc(pstat, true);
+}
+
+static struct page_stat *
+__page_stat__findnew_caller(u64 callsite, bool create)
+{
+ struct rb_node **node = &page_caller_tree.rb_node;
+ struct rb_node *parent = NULL;
+ struct page_stat *data;
+
+ while (*node) {
+ s64 cmp;
+
+ parent = *node;
+ data = rb_entry(*node, struct page_stat, node);
+
+ cmp = data->callsite - callsite;
+ if (cmp < 0)
+ node = &parent->rb_left;
+ else if (cmp > 0)
+ node = &parent->rb_right;
+ else
+ return data;
+ }
+
+ if (!create)
+ return NULL;
+
+ data = zalloc(sizeof(*data));
+ if (data != NULL) {
+ data->callsite = callsite;
+
+ rb_link_node(&data->node, parent, node);
+ rb_insert_color(&data->node, &page_caller_tree);
+ }
+
+ return data;
+}
+
+static struct page_stat *page_stat__find_caller(u64 callsite)
+{
+ return __page_stat__findnew_caller(callsite, false);
+}
+
+static struct page_stat *page_stat__findnew_caller(u64 callsite)
+{
+ return __page_stat__findnew_caller(callsite, true);
+}
+
static bool valid_page(u64 pfn_or_page)
{
if (use_pfn && pfn_or_page == -1UL)
@@ -375,6 +582,7 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
unsigned int migrate_type = perf_evsel__intval(evsel, sample,
"migratetype");
u64 bytes = kmem_page_size << order;
+ u64 callsite;
struct page_stat *pstat;
struct page_stat this = {
.order = order,
@@ -397,25 +605,40 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
return 0;
}

+ callsite = find_callsite(evsel, sample);
+
/*
* This is to find the current page (with correct gfp flags and
* migrate type) at free event.
*/
- pstat = search_page(page, true);
+ pstat = page_stat__findnew_page(page);
if (pstat == NULL)
return -ENOMEM;

pstat->order = order;
pstat->gfp_flags = gfp_flags;
pstat->migrate_type = migrate_type;
+ pstat->callsite = callsite;

this.page = page;
- pstat = search_page_alloc_stat(&this, true);
+ pstat = page_stat__findnew_alloc(&this);
if (pstat == NULL)
return -ENOMEM;

pstat->nr_alloc++;
pstat->alloc_bytes += bytes;
+ pstat->callsite = callsite;
+
+ pstat = page_stat__findnew_caller(callsite);
+ if (pstat == NULL)
+ return -ENOMEM;
+
+ pstat->order = order;
+ pstat->gfp_flags = gfp_flags;
+ pstat->migrate_type = migrate_type;
+
+ pstat->nr_alloc++;
+ pstat->alloc_bytes += bytes;

order_stats[order][migrate_type]++;

@@ -441,7 +664,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
nr_page_frees++;
total_page_free_bytes += bytes;

- pstat = search_page(page, false);
+ pstat = page_stat__find_page(page);
if (pstat == NULL) {
pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
page, order);
@@ -455,11 +678,19 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
this.page = page;
this.gfp_flags = pstat->gfp_flags;
this.migrate_type = pstat->migrate_type;
+ this.callsite = pstat->callsite;

rb_erase(&pstat->node, &page_tree);
free(pstat);

- pstat = search_page_alloc_stat(&this, false);
+ pstat = page_stat__find_alloc(&this);
+ if (pstat == NULL)
+ return -ENOENT;
+
+ pstat->nr_free++;
+ pstat->free_bytes += bytes;
+
+ pstat = page_stat__find_caller(this.callsite);
if (pstat == NULL)
return -ENOENT;

@@ -576,41 +807,89 @@ static const char * const migrate_type_str[] = {
"UNKNOWN",
};

-static void __print_page_result(struct rb_root *root,
- struct perf_session *session __maybe_unused,
- int n_lines)
+static void __print_page_alloc_result(struct perf_session *session, int n_lines)
{
- struct rb_node *next = rb_first(root);
+ struct rb_node *next = rb_first(&page_alloc_sorted);
+ struct machine *machine = &session->machines.host;
const char *format;

- printf("\n%.80s\n", graph_dotted_line);
- printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags\n",
+ printf("\n%.105s\n", graph_dotted_line);
+ printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
use_pfn ? "PFN" : "Page");
- printf("%.80s\n", graph_dotted_line);
+ printf("%.105s\n", graph_dotted_line);

if (use_pfn)
- format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx\n";
+ format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
else
- format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx\n";
+ format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";

while (next && n_lines--) {
struct page_stat *data;
+ struct symbol *sym;
+ struct map *map;
+ char buf[32];
+ char *caller = buf;

data = rb_entry(next, struct page_stat, node);
+ sym = machine__find_kernel_function(machine, data->callsite,
+ &map, NULL);
+ if (sym && sym->name)
+ caller = sym->name;
+ else
+ scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);

printf(format, (unsigned long long)data->page,
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags);
+ (unsigned long)data->gfp_flags, caller);

next = rb_next(next);
}

if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ... \n");
+ printf(" ... | ... | ... | ... | ... | ... | ...\n");

- printf("%.80s\n", graph_dotted_line);
+ printf("%.105s\n", graph_dotted_line);
+}
+
+static void __print_page_caller_result(struct perf_session *session, int n_lines)
+{
+ struct rb_node *next = rb_first(&page_caller_sorted);
+ struct machine *machine = &session->machines.host;
+
+ printf("\n%.105s\n", graph_dotted_line);
+ printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
+ printf("%.105s\n", graph_dotted_line);
+
+ while (next && n_lines--) {
+ struct page_stat *data;
+ struct symbol *sym;
+ struct map *map;
+ char buf[32];
+ char *caller = buf;
+
+ data = rb_entry(next, struct page_stat, node);
+ sym = machine__find_kernel_function(machine, data->callsite,
+ &map, NULL);
+ if (sym && sym->name)
+ caller = sym->name;
+ else
+ scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);
+
+ printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
+ (unsigned long long)data->alloc_bytes / 1024,
+ data->nr_alloc, data->order,
+ migrate_type_str[data->migrate_type],
+ (unsigned long)data->gfp_flags, caller);
+
+ next = rb_next(next);
+ }
+
+ if (n_lines == -1)
+ printf(" ... | ... | ... | ... | ... | ...\n");
+
+ printf("%.105s\n", graph_dotted_line);
}

static void print_slab_summary(void)
@@ -682,8 +961,10 @@ static void print_slab_result(struct perf_session *session)

static void print_page_result(struct perf_session *session)
{
+ if (caller_flag)
+ __print_page_caller_result(session, caller_lines);
if (alloc_flag)
- __print_page_result(&page_alloc_sorted, session, alloc_lines);
+ __print_page_alloc_result(session, alloc_lines);
print_page_summary();
}

@@ -802,6 +1083,7 @@ static void sort_result(void)
}
if (kmem_page) {
__sort_page_result(&page_alloc_tree, &page_alloc_sorted);
+ __sort_page_result(&page_caller_tree, &page_caller_sorted);
}
}

@@ -1084,7 +1366,7 @@ static int __cmd_record(int argc, const char **argv)
if (kmem_slab)
rec_argc += ARRAY_SIZE(slab_events);
if (kmem_page)
- rec_argc += ARRAY_SIZE(page_events);
+ rec_argc += ARRAY_SIZE(page_events) + 1; /* for -g */

rec_argv = calloc(rec_argc + 1, sizeof(char *));

@@ -1099,6 +1381,8 @@ static int __cmd_record(int argc, const char **argv)
rec_argv[i] = strdup(slab_events[j]);
}
if (kmem_page) {
+ rec_argv[i++] = strdup("-g");
+
for (j = 0; j < ARRAY_SIZE(page_events); j++, i++)
rec_argv[i] = strdup(page_events[j]);
}
@@ -1159,7 +1443,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)

file.path = input_name;

- session = perf_session__new(&file, false, &perf_kmem);
+ kmem_session = session = perf_session__new(&file, false, &perf_kmem);
if (session == NULL)
return -1;

@@ -1172,6 +1456,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
}

kmem_page_size = pevent_get_page_size(evsel->tp_format->pevent);
+ symbol_conf.use_callchain = true;
}

symbol__init(&session->header.env);

Subject: [tip:perf/core] perf kmem: Support sort keys on page analysis

Commit-ID: fb4f313d304b0a5120e870a6cd9ecf90c1023037
Gitweb: http://git.kernel.org/tip/fb4f313d304b0a5120e870a6cd9ecf90c1023037
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 21 Apr 2015 13:55:03 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 4 May 2015 13:34:47 -0300

perf kmem: Support sort keys on page analysis

Add new sort keys for page analysis: page, order, migtype and gfp. The
existing 'bytes', 'hit' and 'callsite' sort keys also work for page.
Note that the -s/--sort option should be preceded by either the --slab
or the --page option to determine which analysis the sort keys apply to.

Allocation stats are now grouped and sorted properly, so the same
page/caller with a different order/migtype/gfp is printed on a
separate line.

# perf kmem stat --page --caller -l 10 -s order,hit

-----------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
-----------------------------------------------------------------------------
64 | 4 | 2 | RECLAIM | 00285250 | new_slab
50,144 | 12,536 | 0 | MOVABLE | 0102005a | __page_cache_alloc
52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
40 | 10 | 0 | MOVABLE | 000280da | handle_mm_fault
28 | 7 | 0 | UNMOVABL | 000000d0 | __pollwait
20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
8 | 2 | 0 | UNMOVABL | 000084d0 | __pud_alloc
... | ... | ... | ... | ... | ...
-----------------------------------------------------------------------------
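
The key chaining behind '-s order,hit' can be illustrated with a minimal
sketch. It is not the code from this patch: struct demo_stat, the
demo_*_cmp() helpers and demo_chain_cmp() are hypothetical names, and a
plain array of comparators stands in for the sort_dimension list. Each
key is tried in order and the first non-zero result decides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's struct page_stat. */
struct demo_stat {
	int nr_alloc;		/* "hit" key */
	unsigned int order;	/* "order" key */
};

typedef int (*demo_cmp_fn)(const void *, const void *);

/* Compare two entries by allocation order. */
static int demo_order_cmp(const void *a, const void *b)
{
	const struct demo_stat *l = a;
	const struct demo_stat *r = b;

	if (l->order < r->order)
		return -1;
	if (l->order > r->order)
		return 1;
	return 0;
}

/* Compare two entries by number of allocations (hits). */
static int demo_hit_cmp(const void *a, const void *b)
{
	const struct demo_stat *l = a;
	const struct demo_stat *r = b;

	if (l->nr_alloc < r->nr_alloc)
		return -1;
	if (l->nr_alloc > r->nr_alloc)
		return 1;
	return 0;
}

/* Try each key in turn; the first key that tells the entries apart wins. */
static int demo_chain_cmp(const demo_cmp_fn *keys, size_t nr_keys,
			  const void *a, const void *b)
{
	size_t i;

	assert(keys != NULL || nr_keys == 0);

	for (i = 0; i < nr_keys; i++) {
		int cmp = keys[i](a, b);

		if (cmp)
			return cmp;
	}
	return 0;
}
```

With the keys { demo_order_cmp, demo_hit_cmp }, an order-2 entry with 4
hits compares greater than an order-0 entry with 12,536 hits, which
matches new_slab being listed first in the output above. The hit key
only matters between entries of the same order.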

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-kmem.txt | 6 +-
tools/perf/builtin-kmem.c | 393 ++++++++++++++++++++++++++-------
2 files changed, 313 insertions(+), 86 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 23219c6..69e1812 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -37,7 +37,11 @@ OPTIONS

-s <key[,key2...]>::
--sort=<key[,key2...]>::
- Sort the output (default: frag,hit,bytes)
+ Sort the output (default: 'frag,hit,bytes' for slab and 'bytes,hit'
+ for page). Available sort keys are 'ptr, callsite, bytes, hit,
+ pingpong, frag' for slab and 'page, callsite, bytes, hit, order,
+ migtype, gfp' for page. This option should be preceded by one of the
+ mode selection options - i.e. --slab, --page, --alloc and/or --caller.

-l <num>::
--line=<num>::
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 3649eec..0393a7f 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -30,7 +30,7 @@ static int kmem_page;
static long kmem_page_size;

struct alloc_stat;
-typedef int (*sort_fn_t)(struct alloc_stat *, struct alloc_stat *);
+typedef int (*sort_fn_t)(void *, void *);

static int alloc_flag;
static int caller_flag;
@@ -181,8 +181,8 @@ static int perf_evsel__process_alloc_node_event(struct perf_evsel *evsel,
return ret;
}

-static int ptr_cmp(struct alloc_stat *, struct alloc_stat *);
-static int callsite_cmp(struct alloc_stat *, struct alloc_stat *);
+static int ptr_cmp(void *, void *);
+static int slab_callsite_cmp(void *, void *);

static struct alloc_stat *search_alloc_stat(unsigned long ptr,
unsigned long call_site,
@@ -223,7 +223,8 @@ static int perf_evsel__process_free_event(struct perf_evsel *evsel,
s_alloc->pingpong++;

s_caller = search_alloc_stat(0, s_alloc->call_site,
- &root_caller_stat, callsite_cmp);
+ &root_caller_stat,
+ slab_callsite_cmp);
if (!s_caller)
return -1;
s_caller->pingpong++;
@@ -448,26 +449,14 @@ static struct page_stat *page_stat__findnew_page(u64 page)
return __page_stat__findnew_page(page, true);
}

-static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
-{
- if (a->page > b->page)
- return -1;
- if (a->page < b->page)
- return 1;
- if (a->order > b->order)
- return -1;
- if (a->order < b->order)
- return 1;
- if (a->migrate_type > b->migrate_type)
- return -1;
- if (a->migrate_type < b->migrate_type)
- return 1;
- if (a->gfp_flags > b->gfp_flags)
- return -1;
- if (a->gfp_flags < b->gfp_flags)
- return 1;
- return 0;
-}
+struct sort_dimension {
+ const char name[20];
+ sort_fn_t cmp;
+ struct list_head list;
+};
+
+static LIST_HEAD(page_alloc_sort_input);
+static LIST_HEAD(page_caller_sort_input);

static struct page_stat *
__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
@@ -475,14 +464,20 @@ __page_stat__findnew_alloc(struct page_stat *pstat, bool create)
struct rb_node **node = &page_alloc_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;
+ struct sort_dimension *sort;

while (*node) {
- s64 cmp;
+ int cmp = 0;

parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = page_stat_cmp(data, pstat);
+ list_for_each_entry(sort, &page_alloc_sort_input, list) {
+ cmp = sort->cmp(pstat, data);
+ if (cmp)
+ break;
+ }
+
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -519,19 +514,25 @@ static struct page_stat *page_stat__findnew_alloc(struct page_stat *pstat)
}

static struct page_stat *
-__page_stat__findnew_caller(u64 callsite, bool create)
+__page_stat__findnew_caller(struct page_stat *pstat, bool create)
{
struct rb_node **node = &page_caller_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;
+ struct sort_dimension *sort;

while (*node) {
- s64 cmp;
+ int cmp = 0;

parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = data->callsite - callsite;
+ list_for_each_entry(sort, &page_caller_sort_input, list) {
+ cmp = sort->cmp(pstat, data);
+ if (cmp)
+ break;
+ }
+
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -545,7 +546,10 @@ __page_stat__findnew_caller(u64 callsite, bool create)

data = zalloc(sizeof(*data));
if (data != NULL) {
- data->callsite = callsite;
+ data->callsite = pstat->callsite;
+ data->order = pstat->order;
+ data->gfp_flags = pstat->gfp_flags;
+ data->migrate_type = pstat->migrate_type;

rb_link_node(&data->node, parent, node);
rb_insert_color(&data->node, &page_caller_tree);
@@ -554,14 +558,14 @@ __page_stat__findnew_caller(u64 callsite, bool create)
return data;
}

-static struct page_stat *page_stat__find_caller(u64 callsite)
+static struct page_stat *page_stat__find_caller(struct page_stat *pstat)
{
- return __page_stat__findnew_caller(callsite, false);
+ return __page_stat__findnew_caller(pstat, false);
}

-static struct page_stat *page_stat__findnew_caller(u64 callsite)
+static struct page_stat *page_stat__findnew_caller(struct page_stat *pstat)
{
- return __page_stat__findnew_caller(callsite, true);
+ return __page_stat__findnew_caller(pstat, true);
}

static bool valid_page(u64 pfn_or_page)
@@ -629,14 +633,11 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
pstat->alloc_bytes += bytes;
pstat->callsite = callsite;

- pstat = page_stat__findnew_caller(callsite);
+ this.callsite = callsite;
+ pstat = page_stat__findnew_caller(&this);
if (pstat == NULL)
return -ENOMEM;

- pstat->order = order;
- pstat->gfp_flags = gfp_flags;
- pstat->migrate_type = migrate_type;
-
pstat->nr_alloc++;
pstat->alloc_bytes += bytes;

@@ -690,7 +691,7 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
pstat->nr_free++;
pstat->free_bytes += bytes;

- pstat = page_stat__find_caller(this.callsite);
+ pstat = page_stat__find_caller(&this);
if (pstat == NULL)
return -ENOENT;

@@ -976,14 +977,10 @@ static void print_result(struct perf_session *session)
print_page_result(session);
}

-struct sort_dimension {
- const char name[20];
- sort_fn_t cmp;
- struct list_head list;
-};
-
-static LIST_HEAD(caller_sort);
-static LIST_HEAD(alloc_sort);
+static LIST_HEAD(slab_caller_sort);
+static LIST_HEAD(slab_alloc_sort);
+static LIST_HEAD(page_caller_sort);
+static LIST_HEAD(page_alloc_sort);

static void sort_slab_insert(struct rb_root *root, struct alloc_stat *data,
struct list_head *sort_list)
@@ -1032,10 +1029,12 @@ static void __sort_slab_result(struct rb_root *root, struct rb_root *root_sorted
}
}

-static void sort_page_insert(struct rb_root *root, struct page_stat *data)
+static void sort_page_insert(struct rb_root *root, struct page_stat *data,
+ struct list_head *sort_list)
{
struct rb_node **new = &root->rb_node;
struct rb_node *parent = NULL;
+ struct sort_dimension *sort;

while (*new) {
struct page_stat *this;
@@ -1044,8 +1043,11 @@ static void sort_page_insert(struct rb_root *root, struct page_stat *data)
this = rb_entry(*new, struct page_stat, node);
parent = *new;

- /* TODO: support more sort key */
- cmp = data->alloc_bytes - this->alloc_bytes;
+ list_for_each_entry(sort, sort_list, list) {
+ cmp = sort->cmp(data, this);
+ if (cmp)
+ break;
+ }

if (cmp > 0)
new = &parent->rb_left;
@@ -1057,7 +1059,8 @@ static void sort_page_insert(struct rb_root *root, struct page_stat *data)
rb_insert_color(&data->node, root);
}

-static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted)
+static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted,
+ struct list_head *sort_list)
{
struct rb_node *node;
struct page_stat *data;
@@ -1069,7 +1072,7 @@ static void __sort_page_result(struct rb_root *root, struct rb_root *root_sorted

rb_erase(node, root);
data = rb_entry(node, struct page_stat, node);
- sort_page_insert(root_sorted, data);
+ sort_page_insert(root_sorted, data, sort_list);
}
}

@@ -1077,13 +1080,15 @@ static void sort_result(void)
{
if (kmem_slab) {
__sort_slab_result(&root_alloc_stat, &root_alloc_sorted,
- &alloc_sort);
+ &slab_alloc_sort);
__sort_slab_result(&root_caller_stat, &root_caller_sorted,
- &caller_sort);
+ &slab_caller_sort);
}
if (kmem_page) {
- __sort_page_result(&page_alloc_tree, &page_alloc_sorted);
- __sort_page_result(&page_caller_tree, &page_caller_sorted);
+ __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+ __sort_page_result(&page_caller_tree, &page_caller_sorted,
+ &page_caller_sort);
}
}

@@ -1132,8 +1137,12 @@ out:
return err;
}

-static int ptr_cmp(struct alloc_stat *l, struct alloc_stat *r)
+/* slab sort keys */
+static int ptr_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->ptr < r->ptr)
return -1;
else if (l->ptr > r->ptr)
@@ -1146,8 +1155,11 @@ static struct sort_dimension ptr_sort_dimension = {
.cmp = ptr_cmp,
};

-static int callsite_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int slab_callsite_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->call_site < r->call_site)
return -1;
else if (l->call_site > r->call_site)
@@ -1157,11 +1169,14 @@ static int callsite_cmp(struct alloc_stat *l, struct alloc_stat *r)

static struct sort_dimension callsite_sort_dimension = {
.name = "callsite",
- .cmp = callsite_cmp,
+ .cmp = slab_callsite_cmp,
};

-static int hit_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int hit_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->hit < r->hit)
return -1;
else if (l->hit > r->hit)
@@ -1174,8 +1189,11 @@ static struct sort_dimension hit_sort_dimension = {
.cmp = hit_cmp,
};

-static int bytes_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int bytes_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->bytes_alloc < r->bytes_alloc)
return -1;
else if (l->bytes_alloc > r->bytes_alloc)
@@ -1188,9 +1206,11 @@ static struct sort_dimension bytes_sort_dimension = {
.cmp = bytes_cmp,
};

-static int frag_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int frag_cmp(void *a, void *b)
{
double x, y;
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;

x = fragmentation(l->bytes_req, l->bytes_alloc);
y = fragmentation(r->bytes_req, r->bytes_alloc);
@@ -1207,8 +1227,11 @@ static struct sort_dimension frag_sort_dimension = {
.cmp = frag_cmp,
};

-static int pingpong_cmp(struct alloc_stat *l, struct alloc_stat *r)
+static int pingpong_cmp(void *a, void *b)
{
+ struct alloc_stat *l = a;
+ struct alloc_stat *r = b;
+
if (l->pingpong < r->pingpong)
return -1;
else if (l->pingpong > r->pingpong)
@@ -1221,7 +1244,135 @@ static struct sort_dimension pingpong_sort_dimension = {
.cmp = pingpong_cmp,
};

-static struct sort_dimension *avail_sorts[] = {
+/* page sort keys */
+static int page_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->page < r->page)
+ return -1;
+ else if (l->page > r->page)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_sort_dimension = {
+ .name = "page",
+ .cmp = page_cmp,
+};
+
+static int page_callsite_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->callsite < r->callsite)
+ return -1;
+ else if (l->callsite > r->callsite)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_callsite_sort_dimension = {
+ .name = "callsite",
+ .cmp = page_callsite_cmp,
+};
+
+static int page_hit_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->nr_alloc < r->nr_alloc)
+ return -1;
+ else if (l->nr_alloc > r->nr_alloc)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_hit_sort_dimension = {
+ .name = "hit",
+ .cmp = page_hit_cmp,
+};
+
+static int page_bytes_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->alloc_bytes < r->alloc_bytes)
+ return -1;
+ else if (l->alloc_bytes > r->alloc_bytes)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_bytes_sort_dimension = {
+ .name = "bytes",
+ .cmp = page_bytes_cmp,
+};
+
+static int page_order_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ if (l->order < r->order)
+ return -1;
+ else if (l->order > r->order)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension page_order_sort_dimension = {
+ .name = "order",
+ .cmp = page_order_cmp,
+};
+
+static int migrate_type_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ /* for internal use to find free'd page */
+ if (l->migrate_type == -1U)
+ return 0;
+
+ if (l->migrate_type < r->migrate_type)
+ return -1;
+ else if (l->migrate_type > r->migrate_type)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension migrate_type_sort_dimension = {
+ .name = "migtype",
+ .cmp = migrate_type_cmp,
+};
+
+static int gfp_flags_cmp(void *a, void *b)
+{
+ struct page_stat *l = a;
+ struct page_stat *r = b;
+
+ /* for internal use to find free'd page */
+ if (l->gfp_flags == -1U)
+ return 0;
+
+ if (l->gfp_flags < r->gfp_flags)
+ return -1;
+ else if (l->gfp_flags > r->gfp_flags)
+ return 1;
+ return 0;
+}
+
+static struct sort_dimension gfp_flags_sort_dimension = {
+ .name = "gfp",
+ .cmp = gfp_flags_cmp,
+};
+
+static struct sort_dimension *slab_sorts[] = {
&ptr_sort_dimension,
&callsite_sort_dimension,
&hit_sort_dimension,
@@ -1230,16 +1381,24 @@ static struct sort_dimension *avail_sorts[] = {
&pingpong_sort_dimension,
};

-#define NUM_AVAIL_SORTS ((int)ARRAY_SIZE(avail_sorts))
+static struct sort_dimension *page_sorts[] = {
+ &page_sort_dimension,
+ &page_callsite_sort_dimension,
+ &page_hit_sort_dimension,
+ &page_bytes_sort_dimension,
+ &page_order_sort_dimension,
+ &migrate_type_sort_dimension,
+ &gfp_flags_sort_dimension,
+};

-static int sort_dimension__add(const char *tok, struct list_head *list)
+static int slab_sort_dimension__add(const char *tok, struct list_head *list)
{
struct sort_dimension *sort;
int i;

- for (i = 0; i < NUM_AVAIL_SORTS; i++) {
- if (!strcmp(avail_sorts[i]->name, tok)) {
- sort = memdup(avail_sorts[i], sizeof(*avail_sorts[i]));
+ for (i = 0; i < (int)ARRAY_SIZE(slab_sorts); i++) {
+ if (!strcmp(slab_sorts[i]->name, tok)) {
+ sort = memdup(slab_sorts[i], sizeof(*slab_sorts[i]));
if (!sort) {
pr_err("%s: memdup failed\n", __func__);
return -1;
@@ -1252,7 +1411,27 @@ static int sort_dimension__add(const char *tok, struct list_head *list)
return -1;
}

-static int setup_sorting(struct list_head *sort_list, const char *arg)
+static int page_sort_dimension__add(const char *tok, struct list_head *list)
+{
+ struct sort_dimension *sort;
+ int i;
+
+ for (i = 0; i < (int)ARRAY_SIZE(page_sorts); i++) {
+ if (!strcmp(page_sorts[i]->name, tok)) {
+ sort = memdup(page_sorts[i], sizeof(*page_sorts[i]));
+ if (!sort) {
+ pr_err("%s: memdup failed\n", __func__);
+ return -1;
+ }
+ list_add_tail(&sort->list, list);
+ return 0;
+ }
+ }
+
+ return -1;
+}
+
+static int setup_slab_sorting(struct list_head *sort_list, const char *arg)
{
char *tok;
char *str = strdup(arg);
@@ -1267,8 +1446,34 @@ static int setup_sorting(struct list_head *sort_list, const char *arg)
tok = strsep(&pos, ",");
if (!tok)
break;
- if (sort_dimension__add(tok, sort_list) < 0) {
- error("Unknown --sort key: '%s'", tok);
+ if (slab_sort_dimension__add(tok, sort_list) < 0) {
+ error("Unknown slab --sort key: '%s'", tok);
+ free(str);
+ return -1;
+ }
+ }
+
+ free(str);
+ return 0;
+}
+
+static int setup_page_sorting(struct list_head *sort_list, const char *arg)
+{
+ char *tok;
+ char *str = strdup(arg);
+ char *pos = str;
+
+ if (!str) {
+ pr_err("%s: strdup failed\n", __func__);
+ return -1;
+ }
+
+ while (true) {
+ tok = strsep(&pos, ",");
+ if (!tok)
+ break;
+ if (page_sort_dimension__add(tok, sort_list) < 0) {
+ error("Unknown page --sort key: '%s'", tok);
free(str);
return -1;
}
@@ -1284,10 +1489,17 @@ static int parse_sort_opt(const struct option *opt __maybe_unused,
if (!arg)
return -1;

- if (caller_flag > alloc_flag)
- return setup_sorting(&caller_sort, arg);
- else
- return setup_sorting(&alloc_sort, arg);
+ if (kmem_page > kmem_slab) {
+ if (caller_flag > alloc_flag)
+ return setup_page_sorting(&page_caller_sort, arg);
+ else
+ return setup_page_sorting(&page_alloc_sort, arg);
+ } else {
+ if (caller_flag > alloc_flag)
+ return setup_slab_sorting(&slab_caller_sort, arg);
+ else
+ return setup_slab_sorting(&slab_alloc_sort, arg);
+ }

return 0;
}
@@ -1395,7 +1607,8 @@ static int __cmd_record(int argc, const char **argv)

int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
{
- const char * const default_sort_order = "frag,hit,bytes";
+ const char * const default_slab_sort = "frag,hit,bytes";
+ const char * const default_page_sort = "bytes,hit";
struct perf_data_file file = {
.mode = PERF_DATA_MODE_READ,
};
@@ -1408,8 +1621,8 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK_NOOPT(0, "alloc", NULL, NULL,
"show per-allocation statistics", parse_alloc_opt),
OPT_CALLBACK('s', "sort", NULL, "key[,key2...]",
- "sort by keys: ptr, call_site, bytes, hit, pingpong, frag",
- parse_sort_opt),
+ "sort by keys: ptr, callsite, bytes, hit, pingpong, frag, "
+ "page, order, migtype, gfp", parse_sort_opt),
OPT_CALLBACK('l', "line", NULL, "num", "show n lines", parse_line_opt),
OPT_BOOLEAN(0, "raw-ip", &raw_ip, "show raw ip instead of symbol"),
OPT_BOOLEAN('f', "force", &file.force, "don't complain, do it"),
@@ -1467,11 +1680,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (cpu__setup_cpunode_map())
goto out_delete;

- if (list_empty(&caller_sort))
- setup_sorting(&caller_sort, default_sort_order);
- if (list_empty(&alloc_sort))
- setup_sorting(&alloc_sort, default_sort_order);
-
+ if (list_empty(&slab_caller_sort))
+ setup_slab_sorting(&slab_caller_sort, default_slab_sort);
+ if (list_empty(&slab_alloc_sort))
+ setup_slab_sorting(&slab_alloc_sort, default_slab_sort);
+ if (list_empty(&page_caller_sort))
+ setup_page_sorting(&page_caller_sort, default_page_sort);
+ if (list_empty(&page_alloc_sort))
+ setup_page_sorting(&page_alloc_sort, default_page_sort);
+
+ if (kmem_page) {
+ setup_page_sorting(&page_alloc_sort_input,
+ "page,order,migtype,gfp");
+ setup_page_sorting(&page_caller_sort_input,
+ "callsite,order,migtype,gfp");
+ }
ret = __cmd_kmem(session);
} else
usage_with_options(kmem_usage, kmem_options);

Subject: [tip:perf/core] perf kmem: Add --live option for current allocation stat

Commit-ID: 2a7ef02c9ca0172cd48945407893f38c2438e754
Gitweb: http://git.kernel.org/tip/2a7ef02c9ca0172cd48945407893f38c2438e754
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 21 Apr 2015 13:55:04 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 4 May 2015 13:34:47 -0300

perf kmem: Add --live option for current allocation stat

Currently 'perf kmem stat --page' shows the total (page) allocation stat
by default, but sometimes one might want to see only live (total
alloc-only) requests/pages. The new --live option does this by
subtracting freed allocations from the stat.

E.g.:

# perf kmem stat --page

SUMMARY (page allocator)
========================
Total allocation requests : 988,858 [ 4,045,368 KB ]
Total free requests : 886,484 [ 3,624,996 KB ]

Total alloc+freed requests : 885,969 [ 3,622,628 KB ]
Total alloc-only requests : 102,889 [ 422,740 KB ]
Total free-only requests : 515 [ 2,368 KB ]

Total allocation failures : 0 [ 0 KB ]

Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 172,173 3,083 806,686 . .
1 284 . . . .
2 6,124 58 . . .
3 114 335 . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . 1 . .
10 . . . . .

# perf kmem stat --page --live

SUMMARY (page allocator)
========================
Total allocation requests : 988,858 [ 4,045,368 KB ]
Total free requests : 886,484 [ 3,624,996 KB ]

Total alloc+freed requests : 885,969 [ 3,622,628 KB ]
Total alloc-only requests : 102,889 [ 422,740 KB ]
Total free-only requests : 515 [ 2,368 KB ]

Total allocation failures : 0 [ 0 KB ]

Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 2,214 3,025 97,156 . .
1 59 . . . .
2 19 58 . . .
3 23 335 . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . . . .
10 . . . . .
#
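
The difference between the total and live views can be sketched as
follows. This is a simplified illustration under stated assumptions, not
the implementation in this patch: struct demo_counts, demo_on_alloc()
and demo_on_free() are hypothetical names, the counters are plain
per-order arrays, and migrate types are ignored. Only frees whose
matching allocation was recorded are assumed to reach demo_on_free().

```c
#include <assert.h>

#define DEMO_NR_ORDERS 11	/* orders 0..10, as in the tables above */

/* Hypothetical per-order counters for the two views. */
struct demo_counts {
	long total[DEMO_NR_ORDERS];	/* every allocation ever seen */
	long live[DEMO_NR_ORDERS];	/* allocations not yet freed */
};

/* An allocation counts towards both the total and the live view. */
static void demo_on_alloc(struct demo_counts *c, unsigned int order)
{
	assert(order < DEMO_NR_ORDERS);

	c->total[order]++;
	c->live[order]++;
}

/* A matched free is subtracted from the live view only. */
static void demo_on_free(struct demo_counts *c, unsigned int order)
{
	assert(order < DEMO_NR_ORDERS);

	c->live[order]--;
}
```

After three order-0 allocations and two matching frees, total[0] is 3
while live[0] is 1, mirroring how the per-order rows shrink between the
two outputs above.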

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Added examples to the changeset log ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-kmem.txt | 5 ++
tools/perf/builtin-kmem.c | 110 ++++++++++++++++++++-------------
2 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 69e1812..ff0f433 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -56,6 +56,11 @@ OPTIONS
--page::
Analyze page allocator events

+--live::
+ Show live page stat. The perf kmem shows total allocation stat by
+ default, but this option shows live (currently allocated) pages
+ instead. (This option works with --page option only)
+
SEE ALSO
--------
linkperf:perf-record[1]
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 0393a7f..7ead942 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -244,6 +244,7 @@ static unsigned long nr_page_fails;
static unsigned long nr_page_nomatch;

static bool use_pfn;
+static bool live_page;
static struct perf_session *kmem_session;

#define MAX_MIGRATE_TYPES 6
@@ -264,7 +265,7 @@ struct page_stat {
int nr_free;
};

-static struct rb_root page_tree;
+static struct rb_root page_live_tree;
static struct rb_root page_alloc_tree;
static struct rb_root page_alloc_sorted;
static struct rb_root page_caller_tree;
@@ -403,10 +404,19 @@ out:
return sample->ip;
}

+struct sort_dimension {
+ const char name[20];
+ sort_fn_t cmp;
+ struct list_head list;
+};
+
+static LIST_HEAD(page_alloc_sort_input);
+static LIST_HEAD(page_caller_sort_input);
+
static struct page_stat *
-__page_stat__findnew_page(u64 page, bool create)
+__page_stat__findnew_page(struct page_stat *pstat, bool create)
{
- struct rb_node **node = &page_tree.rb_node;
+ struct rb_node **node = &page_live_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;

@@ -416,7 +426,7 @@ __page_stat__findnew_page(u64 page, bool create)
parent = *node;
data = rb_entry(*node, struct page_stat, node);

- cmp = data->page - page;
+ cmp = data->page - pstat->page;
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -430,34 +440,28 @@ __page_stat__findnew_page(u64 page, bool create)

data = zalloc(sizeof(*data));
if (data != NULL) {
- data->page = page;
+ data->page = pstat->page;
+ data->order = pstat->order;
+ data->gfp_flags = pstat->gfp_flags;
+ data->migrate_type = pstat->migrate_type;

rb_link_node(&data->node, parent, node);
- rb_insert_color(&data->node, &page_tree);
+ rb_insert_color(&data->node, &page_live_tree);
}

return data;
}

-static struct page_stat *page_stat__find_page(u64 page)
+static struct page_stat *page_stat__find_page(struct page_stat *pstat)
{
- return __page_stat__findnew_page(page, false);
+ return __page_stat__findnew_page(pstat, false);
}

-static struct page_stat *page_stat__findnew_page(u64 page)
+static struct page_stat *page_stat__findnew_page(struct page_stat *pstat)
{
- return __page_stat__findnew_page(page, true);
+ return __page_stat__findnew_page(pstat, true);
}

-struct sort_dimension {
- const char name[20];
- sort_fn_t cmp;
- struct list_head list;
-};
-
-static LIST_HEAD(page_alloc_sort_input);
-static LIST_HEAD(page_caller_sort_input);
-
static struct page_stat *
__page_stat__findnew_alloc(struct page_stat *pstat, bool create)
{
@@ -615,17 +619,8 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
* This is to find the current page (with correct gfp flags and
* migrate type) at free event.
*/
- pstat = page_stat__findnew_page(page);
- if (pstat == NULL)
- return -ENOMEM;
-
- pstat->order = order;
- pstat->gfp_flags = gfp_flags;
- pstat->migrate_type = migrate_type;
- pstat->callsite = callsite;
-
this.page = page;
- pstat = page_stat__findnew_alloc(&this);
+ pstat = page_stat__findnew_page(&this);
if (pstat == NULL)
return -ENOMEM;

@@ -633,6 +628,16 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
pstat->alloc_bytes += bytes;
pstat->callsite = callsite;

+ if (!live_page) {
+ pstat = page_stat__findnew_alloc(&this);
+ if (pstat == NULL)
+ return -ENOMEM;
+
+ pstat->nr_alloc++;
+ pstat->alloc_bytes += bytes;
+ pstat->callsite = callsite;
+ }
+
this.callsite = callsite;
pstat = page_stat__findnew_caller(&this);
if (pstat == NULL)
@@ -665,7 +670,8 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
nr_page_frees++;
total_page_free_bytes += bytes;

- pstat = page_stat__find_page(page);
+ this.page = page;
+ pstat = page_stat__find_page(&this);
if (pstat == NULL) {
pr_debug2("missing free at page %"PRIx64" (order: %d)\n",
page, order);
@@ -676,20 +682,23 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
return 0;
}

- this.page = page;
this.gfp_flags = pstat->gfp_flags;
this.migrate_type = pstat->migrate_type;
this.callsite = pstat->callsite;

- rb_erase(&pstat->node, &page_tree);
+ rb_erase(&pstat->node, &page_live_tree);
free(pstat);

- pstat = page_stat__find_alloc(&this);
- if (pstat == NULL)
- return -ENOENT;
+ if (live_page) {
+ order_stats[this.order][this.migrate_type]--;
+ } else {
+ pstat = page_stat__find_alloc(&this);
+ if (pstat == NULL)
+ return -ENOENT;

- pstat->nr_free++;
- pstat->free_bytes += bytes;
+ pstat->nr_free++;
+ pstat->free_bytes += bytes;
+ }

pstat = page_stat__find_caller(&this);
if (pstat == NULL)
@@ -698,6 +707,16 @@ static int perf_evsel__process_page_free_event(struct perf_evsel *evsel,
pstat->nr_free++;
pstat->free_bytes += bytes;

+ if (live_page) {
+ pstat->nr_alloc--;
+ pstat->alloc_bytes -= bytes;
+
+ if (pstat->nr_alloc == 0) {
+ rb_erase(&pstat->node, &page_caller_tree);
+ free(pstat);
+ }
+ }
+
return 0;
}

@@ -815,8 +834,8 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
const char *format;

printf("\n%.105s\n", graph_dotted_line);
- printf(" %-16s | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- use_pfn ? "PFN" : "Page");
+ printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
+ use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total");
printf("%.105s\n", graph_dotted_line);

if (use_pfn)
@@ -860,7 +879,8 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
struct machine *machine = &session->machines.host;

printf("\n%.105s\n", graph_dotted_line);
- printf(" Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n");
+ printf(" %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
+ live_page ? "Live" : "Total");
printf("%.105s\n", graph_dotted_line);

while (next && n_lines--) {
@@ -1085,8 +1105,13 @@ static void sort_result(void)
&slab_caller_sort);
}
if (kmem_page) {
- __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
- &page_alloc_sort);
+ if (live_page)
+ __sort_page_result(&page_live_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+ else
+ __sort_page_result(&page_alloc_tree, &page_alloc_sorted,
+ &page_alloc_sort);
+
__sort_page_result(&page_caller_tree, &page_caller_sorted,
&page_caller_sort);
}
@@ -1630,6 +1655,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
parse_slab_opt),
OPT_CALLBACK_NOOPT(0, "page", NULL, NULL, "Analyze page allocator",
parse_page_opt),
+ OPT_BOOLEAN(0, "live", &live_page, "Show live page stat"),
OPT_END()
};
const char *const kmem_subcommands[] = { "record", "stat", NULL };
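
For readers following the accounting change, here is a minimal sketch of how the live and total modes differ for a per-callsite entry. It is an illustration under assumptions, not the code in builtin-kmem.c: struct caller_stat, account_alloc() and account_free() are hypothetical names, and only the four counters touched by the patch are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the counters kept in struct page_stat. */
struct caller_stat {
	int nr_alloc;
	int nr_free;
	uint64_t alloc_bytes;
	uint64_t free_bytes;
};

/* An allocation is counted the same way in both modes. */
static void account_alloc(struct caller_stat *st, uint64_t bytes)
{
	st->nr_alloc++;
	st->alloc_bytes += bytes;
}

/*
 * A free always bumps the free counters. In live mode it also takes
 * the page back out of the alloc counters. Returns true when, in live
 * mode, the entry no longer tracks any allocated page and could be
 * dropped from its tree.
 */
static bool account_free(struct caller_stat *st, uint64_t bytes, bool live)
{
	st->nr_free++;
	st->free_bytes += bytes;

	if (!live)
		return false;

	st->nr_alloc--;
	st->alloc_bytes -= bytes;
	return st->nr_alloc == 0;
}
```

With two 4 KB allocations and one free, the live view ends up with one hit and 4 KB, while the total view still reports two hits and 8 KB.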

Subject: [tip:perf/core] perf kmem: Print gfp flags in human readable string

Commit-ID: 0e11115644b39ff9e986eb308b6c44ca75cd475f
Gitweb: http://git.kernel.org/tip/0e11115644b39ff9e986eb308b6c44ca75cd475f
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 21 Apr 2015 13:55:05 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 4 May 2015 13:34:48 -0300

perf kmem: Print gfp flags in human readable string

Save libtraceevent output and print it in the header.

# perf kmem stat --page --caller
#
# GFP flags
# ---------
# 00000010: NI: GFP_NOIO
# 000000d0: K: GFP_KERNEL
# 00000200: NWR: GFP_NOWARN
# 000084d0: K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
# 000200d2: HU: GFP_HIGHUSER
# 000200da: HUM: GFP_HIGHUSER_MOVABLE
# 000280da: HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
# 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
# 0102005a: NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

---------------------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
60 | 15 | 0 | UNMOVABL | K|R|Z|NT | pte_alloc_one
40 | 10 | 0 | MOVABLE | HUM|Z | handle_mm_fault
24 | 6 | 0 | MOVABLE | HUM | do_wp_page
24 | 6 | 0 | UNMOVABL | K | __pollwait
...
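
The mapping from the long names to the compact column can be sketched standalone as below. This is only an illustration: compact_flags() and compact_table are hypothetical names, the table is a small subset of the one in the patch, and a caller-supplied buffer is assumed instead of the realloc-based string building the patch uses. As in the patch, names that are not in the table are skipped.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Subset of the name -> abbreviation table, for illustration only. */
static const struct {
	const char *original;
	const char *compact;
} compact_table[] = {
	{ "GFP_HIGHUSER_MOVABLE", "HUM" },
	{ "GFP_KERNEL", "K" },
	{ "GFP_REPEAT", "R" },
	{ "GFP_ZERO", "Z" },
	{ "GFP_NOTRACK", "NT" },
};

/*
 * Write the compact form of a '|'-separated flag string into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 */
static int compact_flags(const char *flags, char *out, size_t outlen)
{
	size_t used = 0;
	const char *p = flags;

	if (outlen == 0)
		return -1;
	out[0] = '\0';

	while (*p) {
		size_t toklen = strcspn(p, "|");	/* length of this name */
		size_t i;

		for (i = 0; i < sizeof(compact_table) / sizeof(compact_table[0]); i++) {
			const char *name = compact_table[i].original;
			int n;

			if (strlen(name) != toklen || strncmp(name, p, toklen))
				continue;

			/* Prefix every abbreviation but the first with '|'. */
			n = snprintf(out + used, outlen - used, "%s%s",
				     used ? "|" : "", compact_table[i].compact);
			if (n < 0 || (size_t)n >= outlen - used)
				return -1;
			used += n;
			break;
		}

		p += toklen;
		if (*p == '|')
			p++;
	}

	return 0;
}
```

Given "GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK" this produces "K|R|Z|NT", the same pairing shown in the header above.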

Requested-by: Joonsoo Kim <[email protected]>
Suggested-by: Minchan Kim <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-kmem.c | 222 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 209 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 7ead942..1c66895 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -581,6 +581,176 @@ static bool valid_page(u64 pfn_or_page)
return true;
}

+struct gfp_flag {
+ unsigned int flags;
+ char *compact_str;
+ char *human_readable;
+};
+
+static struct gfp_flag *gfps;
+static int nr_gfps;
+
+static int gfpcmp(const void *a, const void *b)
+{
+ const struct gfp_flag *fa = a;
+ const struct gfp_flag *fb = b;
+
+ return fa->flags - fb->flags;
+}
+
+/* see include/trace/events/gfpflags.h */
+static const struct {
+ const char *original;
+ const char *compact;
+} gfp_compact_table[] = {
+ { "GFP_TRANSHUGE", "THP" },
+ { "GFP_HIGHUSER_MOVABLE", "HUM" },
+ { "GFP_HIGHUSER", "HU" },
+ { "GFP_USER", "U" },
+ { "GFP_TEMPORARY", "TMP" },
+ { "GFP_KERNEL", "K" },
+ { "GFP_NOFS", "NF" },
+ { "GFP_ATOMIC", "A" },
+ { "GFP_NOIO", "NI" },
+ { "GFP_HIGH", "H" },
+ { "GFP_WAIT", "W" },
+ { "GFP_IO", "I" },
+ { "GFP_COLD", "CO" },
+ { "GFP_NOWARN", "NWR" },
+ { "GFP_REPEAT", "R" },
+ { "GFP_NOFAIL", "NF" },
+ { "GFP_NORETRY", "NR" },
+ { "GFP_COMP", "C" },
+ { "GFP_ZERO", "Z" },
+ { "GFP_NOMEMALLOC", "NMA" },
+ { "GFP_MEMALLOC", "MA" },
+ { "GFP_HARDWALL", "HW" },
+ { "GFP_THISNODE", "TN" },
+ { "GFP_RECLAIMABLE", "RC" },
+ { "GFP_MOVABLE", "M" },
+ { "GFP_NOTRACK", "NT" },
+ { "GFP_NO_KSWAPD", "NK" },
+ { "GFP_OTHER_NODE", "ON" },
+ { "GFP_NOWAIT", "NW" },
+};
+
+static size_t max_gfp_len;
+
+static char *compact_gfp_flags(char *gfp_flags)
+{
+ char *orig_flags = strdup(gfp_flags);
+ char *new_flags = NULL;
+ char *str, *pos;
+ size_t len = 0;
+
+ if (orig_flags == NULL)
+ return NULL;
+
+ str = strtok_r(orig_flags, "|", &pos);
+ while (str) {
+ size_t i;
+ char *new;
+ const char *cpt;
+
+ for (i = 0; i < ARRAY_SIZE(gfp_compact_table); i++) {
+ if (strcmp(gfp_compact_table[i].original, str))
+ continue;
+
+ cpt = gfp_compact_table[i].compact;
+ new = realloc(new_flags, len + strlen(cpt) + 2);
+ if (new == NULL) {
+ free(new_flags);
+ return NULL;
+ }
+
+ new_flags = new;
+
+ if (!len) {
+ strcpy(new_flags, cpt);
+ } else {
+ strcat(new_flags, "|");
+ strcat(new_flags, cpt);
+ len++;
+ }
+
+ len += strlen(cpt);
+ }
+
+ str = strtok_r(NULL, "|", &pos);
+ }
+
+ if (max_gfp_len < len)
+ max_gfp_len = len;
+
+ free(orig_flags);
+ return new_flags;
+}
+
+static char *compact_gfp_string(unsigned long gfp_flags)
+{
+ struct gfp_flag key = {
+ .flags = gfp_flags,
+ };
+ struct gfp_flag *gfp;
+
+ gfp = bsearch(&key, gfps, nr_gfps, sizeof(*gfps), gfpcmp);
+ if (gfp)
+ return gfp->compact_str;
+
+ return NULL;
+}
+
+static int parse_gfp_flags(struct perf_evsel *evsel, struct perf_sample *sample,
+ unsigned int gfp_flags)
+{
+ struct pevent_record record = {
+ .cpu = sample->cpu,
+ .data = sample->raw_data,
+ .size = sample->raw_size,
+ };
+ struct trace_seq seq;
+ char *str, *pos;
+
+ if (nr_gfps) {
+ struct gfp_flag key = {
+ .flags = gfp_flags,
+ };
+
+ if (bsearch(&key, gfps, nr_gfps, sizeof(*gfps), gfpcmp))
+ return 0;
+ }
+
+ trace_seq_init(&seq);
+ pevent_event_info(&seq, evsel->tp_format, &record);
+
+ str = strtok_r(seq.buffer, " ", &pos);
+ while (str) {
+ if (!strncmp(str, "gfp_flags=", 10)) {
+ struct gfp_flag *new;
+
+ new = realloc(gfps, (nr_gfps + 1) * sizeof(*gfps));
+ if (new == NULL)
+ return -ENOMEM;
+
+ gfps = new;
+ new += nr_gfps++;
+
+ new->flags = gfp_flags;
+ new->human_readable = strdup(str + 10);
+ new->compact_str = compact_gfp_flags(str + 10);
+ if (!new->human_readable || !new->compact_str)
+ return -ENOMEM;
+
+ qsort(gfps, nr_gfps, sizeof(*gfps), gfpcmp);
+ }
+
+ str = strtok_r(NULL, " ", &pos);
+ }
+
+ trace_seq_destroy(&seq);
+ return 0;
+}
+
static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
struct perf_sample *sample)
{
@@ -613,6 +783,9 @@ static int perf_evsel__process_page_alloc_event(struct perf_evsel *evsel,
return 0;
}

+ if (parse_gfp_flags(evsel, sample, gfp_flags) < 0)
+ return -1;
+
callsite = find_callsite(evsel, sample);

/*
@@ -832,16 +1005,18 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
struct rb_node *next = rb_first(&page_alloc_sorted);
struct machine *machine = &session->machines.host;
const char *format;
+ int gfp_len = max(strlen("GFP flags"), max_gfp_len);

printf("\n%.105s\n", graph_dotted_line);
- printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total");
+ printf(" %-16s | %5s alloc (KB) | Hits | Order | Mig.type | %-*s | Callsite\n",
+ use_pfn ? "PFN" : "Page", live_page ? "Live" : "Total",
+ gfp_len, "GFP flags");
printf("%.105s\n", graph_dotted_line);

if (use_pfn)
- format = " %16llu | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
+ format = " %16llu | %'16llu | %'9d | %5d | %8s | %-*s | %s\n";
else
- format = " %016llx | %'16llu | %'9d | %5d | %8s | %08lx | %s\n";
+ format = " %016llx | %'16llu | %'9d | %5d | %8s | %-*s | %s\n";

while (next && n_lines--) {
struct page_stat *data;
@@ -862,13 +1037,15 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags, caller);
+ gfp_len, compact_gfp_string(data->gfp_flags), caller);

next = rb_next(next);
}

- if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ... | ...\n");
+ if (n_lines == -1) {
+ printf(" ... | ... | ... | ... | ... | %-*s | ...\n",
+ gfp_len, "...");
+ }

printf("%.105s\n", graph_dotted_line);
}
@@ -877,10 +1054,11 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
{
struct rb_node *next = rb_first(&page_caller_sorted);
struct machine *machine = &session->machines.host;
+ int gfp_len = max(strlen("GFP flags"), max_gfp_len);

printf("\n%.105s\n", graph_dotted_line);
- printf(" %5s alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite\n",
- live_page ? "Live" : "Total");
+ printf(" %5s alloc (KB) | Hits | Order | Mig.type | %-*s | Callsite\n",
+ live_page ? "Live" : "Total", gfp_len, "GFP flags");
printf("%.105s\n", graph_dotted_line);

while (next && n_lines--) {
@@ -898,21 +1076,37 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
else
scnprintf(buf, sizeof(buf), "%"PRIx64, data->callsite);

- printf(" %'16llu | %'9d | %5d | %8s | %08lx | %s\n",
+ printf(" %'16llu | %'9d | %5d | %8s | %-*s | %s\n",
(unsigned long long)data->alloc_bytes / 1024,
data->nr_alloc, data->order,
migrate_type_str[data->migrate_type],
- (unsigned long)data->gfp_flags, caller);
+ gfp_len, compact_gfp_string(data->gfp_flags), caller);

next = rb_next(next);
}

- if (n_lines == -1)
- printf(" ... | ... | ... | ... | ... | ...\n");
+ if (n_lines == -1) {
+ printf(" ... | ... | ... | ... | %-*s | ...\n",
+ gfp_len, "...");
+ }

printf("%.105s\n", graph_dotted_line);
}

+static void print_gfp_flags(void)
+{
+ int i;
+
+ printf("#\n");
+ printf("# GFP flags\n");
+ printf("# ---------\n");
+ for (i = 0; i < nr_gfps; i++) {
+ printf("# %08x: %*s: %s\n", gfps[i].flags,
+ (int) max_gfp_len, gfps[i].compact_str,
+ gfps[i].human_readable);
+ }
+}
+
static void print_slab_summary(void)
{
printf("\nSUMMARY (SLAB allocator)");
@@ -982,6 +1176,8 @@ static void print_slab_result(struct perf_session *session)

static void print_page_result(struct perf_session *session)
{
+ if (caller_flag || alloc_flag)
+ print_gfp_flags();
if (caller_flag)
__print_page_caller_result(session, caller_lines);
if (alloc_flag)

Subject: [tip:perf/core] perf kmem: Add kmem.default config option

Commit-ID: 0c160d495b5616e071bb4f873812e8f473128149
Gitweb: http://git.kernel.org/tip/0c160d495b5616e071bb4f873812e8f473128149
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 21 Apr 2015 13:55:06 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Mon, 4 May 2015 13:34:48 -0300

perf kmem: Add kmem.default config option

Currently perf kmem command will select --slab if neither --slab nor
--page is given for backward compatibility. Add kmem.default config
option to select the default value ('page' or 'slab').

# cat ~/.perfconfig
[kmem]
default = page

# perf kmem stat

SUMMARY (page allocator)
========================
Total allocation requests : 1,518 [ 6,096 KB ]
Total free requests : 1,431 [ 5,748 KB ]

Total alloc+freed requests : 1,330 [ 5,344 KB ]
Total alloc-only requests : 188 [ 752 KB ]
Total free-only requests : 101 [ 404 KB ]

Total allocation failures : 0 [ 0 KB ]
...
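
The selection rule can be sketched in isolation as follows. This is a simplified illustration, not the patch itself: parse_kmem_default(), apply_kmem_default() and the enum tag kmem_mode are hypothetical names, and the parser here returns an error for an unknown value, whereas the patch prints a message and keeps the previous default.

```c
#include <string.h>

enum kmem_mode { KMEM_SLAB, KMEM_PAGE };

/*
 * Hypothetical helper: map a kmem.default value to a mode.
 * Returns 0 on success, -1 for anything other than "slab" or "page".
 */
static int parse_kmem_default(const char *value, enum kmem_mode *mode)
{
	if (!strcmp(value, "slab"))
		*mode = KMEM_SLAB;
	else if (!strcmp(value, "page"))
		*mode = KMEM_PAGE;
	else
		return -1;

	return 0;
}

/*
 * Hypothetical helper: the configured default applies only when
 * neither --slab nor --page was given on the command line.
 */
static void apply_kmem_default(int *slab, int *page, enum kmem_mode def)
{
	if (*slab || *page)
		return;

	if (def == KMEM_SLAB)
		*slab = 1;
	else
		*page = 1;
}
```

An explicit --slab or --page therefore always wins over the config file, which keeps existing command lines behaving as before.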

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-kmem.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 1c66895..828b728 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -28,6 +28,10 @@ static int kmem_slab;
static int kmem_page;

static long kmem_page_size;
+static enum {
+ KMEM_SLAB,
+ KMEM_PAGE,
+} kmem_default = KMEM_SLAB; /* for backward compatibility */

struct alloc_stat;
typedef int (*sort_fn_t)(void *, void *);
@@ -1710,7 +1714,8 @@ static int parse_sort_opt(const struct option *opt __maybe_unused,
if (!arg)
return -1;

- if (kmem_page > kmem_slab) {
+ if (kmem_page > kmem_slab ||
+ (kmem_page == 0 && kmem_slab == 0 && kmem_default == KMEM_PAGE)) {
if (caller_flag > alloc_flag)
return setup_page_sorting(&page_caller_sort, arg);
else
@@ -1826,6 +1831,22 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
}

+static int kmem_config(const char *var, const char *value, void *cb)
+{
+ if (!strcmp(var, "kmem.default")) {
+ if (!strcmp(value, "slab"))
+ kmem_default = KMEM_SLAB;
+ else if (!strcmp(value, "page"))
+ kmem_default = KMEM_PAGE;
+ else
+ pr_err("invalid default value ('slab' or 'page' required): %s\n",
+ value);
+ return 0;
+ }
+
+ return perf_default_config(var, value, cb);
+}
+
int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char * const default_slab_sort = "frag,hit,bytes";
@@ -1862,14 +1883,19 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
struct perf_session *session;
int ret = -1;

+ perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);

if (!argc)
usage_with_options(kmem_usage, kmem_options);

- if (kmem_slab == 0 && kmem_page == 0)
- kmem_slab = 1; /* for backward compatibility */
+ if (kmem_slab == 0 && kmem_page == 0) {
+ if (kmem_default == KMEM_SLAB)
+ kmem_slab = 1;
+ else
+ kmem_page = 1;
+ }

if (!strncmp(argv[0], "rec", 3)) {
symbol__init(NULL);

Subject: [tip:perf/core] perf kmem: Show warning when trying to run stat without record

Commit-ID: a923e2c4b14f99f70692f82ee7bd63717604b738
Gitweb: http://git.kernel.org/tip/a923e2c4b14f99f70692f82ee7bd63717604b738
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 5 May 2015 23:52:52 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Tue, 5 May 2015 18:13:08 -0300

perf kmem: Show warning when trying to run stat without record

It is easy to run 'perf kmem stat' by mistake without first running
'perf kmem record', or with a mismatched configuration such as recording
with --slab and then running stat with --page. Show a warning message
like the one below to inform the user:

# perf kmem stat --page --caller
No page allocation events found. Have you run 'perf kmem record --page'?

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-kmem.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 828b728..e628bf1 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
};
struct perf_session *session;
int ret = -1;
+ const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";

perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
@@ -1908,11 +1909,21 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;

+ if (kmem_slab) {
+ if (!perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:kmalloc")) {
+ pr_err(errmsg, "slab", "slab");
+ return -1;
+ }
+ }
+
if (kmem_page) {
- struct perf_evsel *evsel = perf_evlist__first(session->evlist);
+ struct perf_evsel *evsel;

- if (evsel == NULL || evsel->tp_format == NULL) {
- pr_err("invalid event found.. aborting\n");
+ evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
+ "kmem:mm_page_alloc");
+ if (evsel == NULL) {
+ pr_err(errmsg, "page", "page");
return -1;
}

2015-05-11 14:35:43

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 4/6] perf kmem: Print gfp flags in human readable string

On Tue, Apr 21, 2015 at 01:55:05PM +0900, Namhyung Kim wrote:
> Save libtraceevent output and print it in the header.

<SNIP>

> +static int parse_gfp_flags(struct perf_evsel *evsel, struct perf_sample *sample,
> + unsigned int gfp_flags)
> +{
> + struct pevent_record record = {
> + .cpu = sample->cpu,
> + .data = sample->raw_data,
> + .size = sample->raw_size,
> + };
> + struct trace_seq seq;
> + char *str, *pos;
> +
> + if (nr_gfps) {
> + struct gfp_flag key = {
> + .flags = gfp_flags,
> + };
> +
> + if (bsearch(&key, gfps, nr_gfps, sizeof(*gfps), gfpcmp))
> + return 0;
> + }
> +
> + trace_seq_init(&seq);
> + pevent_event_info(&seq, evsel->tp_format, &record);
> +
> + str = strtok_r(seq.buffer, " ", &pos);


This introduced a problem I only now noticed, possibly because my
compiler was upgraded:

[acme@zoo linux]$ git bisect good
0e11115644b39ff9e986eb308b6c44ca75cd475f is the first bad commit
commit 0e11115644b39ff9e986eb308b6c44ca75cd475f
Author: Namhyung Kim <[email protected]>
Date: Tue Apr 21 13:55:05 2015 +0900

perf kmem: Print gfp flags in human readable string

Save libtraceevent output and print it in the header.

-------------------------------------------------


GEN /tmp/build/perf/common-cmds.h
PERF_VERSION = 4.1.rc2.ga20d87
CC /tmp/build/perf/builtin-kmem.o
builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’:
builtin-kmem.c:743:427: error: ‘pos’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]
new->human_readable = strdup(str + 10);
^
builtin-kmem.c:716:14: note: ‘pos’ was declared here
char *str, *pos;
^
cc1: all warnings being treated as errors
/home/git/linux/tools/build/Makefile.build:68: recipe for target
'/tmp/build/perf/builtin-kmem.o' failed
make[2]: *** [/tmp/build/perf/builtin-kmem.o] Error 1
Makefile.perf:330: recipe for target '/tmp/build/perf/builtin-kmem.o'
failed
make[1]: *** [/tmp/build/perf/builtin-kmem.o] Error 2
Makefile:87: recipe for target 'builtin-kmem.o' failed
make: *** [builtin-kmem.o] Error 2
make: Leaving directory '/home/git/linux/tools/perf'

------

Trying to fix it by initializing it to NULL.

- Arnaldo

2015-05-11 14:41:18

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 4/6] perf kmem: Print gfp flags in human readable string

On Mon, May 11, 2015 at 11:35:36AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Apr 21, 2015 at 01:55:05PM +0900, Namhyung Kim wrote:
> > Save libtraceevent output and print it in the header.
>
> <SNIP>
>
> > +static int parse_gfp_flags(struct perf_evsel *evsel, struct perf_sample *sample,
> > + unsigned int gfp_flags)
> > +{
> > + char *str, *pos;

> > + str = strtok_r(seq.buffer, " ", &pos);
>
> builtin-kmem.c:743:427: error: ‘pos’ may be used uninitialized in this
> function [-Werror=maybe-uninitialized]
> new->human_readable = strdup(str + 10);
> ^
> builtin-kmem.c:716:14: note: ‘pos’ was declared here
> char *str, *pos;
> ^

Emphasis on the "may": according to the strtok_r man page your code is
OK, it's just that the compiler needs to be told that no, it is not
being accessed uninitialized:

<quote man strtok>
On the first call to strtok_r(), str should point to the string
to be parsed, and the value of saveptr is ignored. In subsequent calls,
str should be NULL, and saveptr should be unchanged since the previous
call.
</>

So just setting it to NULL is enough.
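
As a standalone illustration of that pattern (a sketch only: extract_gfp_field() is a hypothetical helper, not a function in builtin-kmem.c, and the event text format is assumed to be space-separated key=value fields):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed copy of the value following
 * "gfp_flags=" in a space-separated event string, or NULL if the field
 * is absent or an allocation fails. The caller frees the result.
 */
static char *extract_gfp_field(const char *event_text)
{
	char *copy = strdup(event_text);	/* strtok_r() modifies its input */
	char *pos = NULL;	/* ignored on the first call; set only to quiet the warning */
	char *str, *result = NULL;

	if (copy == NULL)
		return NULL;

	for (str = strtok_r(copy, " ", &pos); str;
	     str = strtok_r(NULL, " ", &pos)) {
		if (!strncmp(str, "gfp_flags=", 10)) {
			result = strdup(str + 10);
			break;
		}
	}

	free(copy);
	return result;
}
```

The initializer does not change behaviour, since strtok_r() overwrites the save pointer on the first call.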

- Arnaldo

2015-05-11 15:28:09

by Namhyung Kim

Subject: Re: [PATCH 4/6] perf kmem: Print gfp flags in human readable string

On Mon, May 11, 2015 at 11:41:10AM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, May 11, 2015 at 11:35:36AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Apr 21, 2015 at 01:55:05PM +0900, Namhyung Kim wrote:
> > > Save libtraceevent output and print it in the header.
> >
> > <SNIP>
> >
> > > +static int parse_gfp_flags(struct perf_evsel *evsel, struct perf_sample *sample,
> > > + unsigned int gfp_flags)
> > > +{
> > > + char *str, *pos;
>
> > > + str = strtok_r(seq.buffer, " ", &pos);
> >
> > builtin-kmem.c:743:427: error: ‘pos’ may be used uninitialized in this
> > function [-Werror=maybe-uninitialized]
> > new->human_readable = strdup(str + 10);
> > ^
> > builtin-kmem.c:716:14: note: ‘pos’ was declared here
> > char *str, *pos;
> > ^
>
> Emphasis on the "may": according to the strtok_r man page your code is
> OK, it's just that the compiler needs to be told that no, it is not
> being accessed uninitialized:
>
> <quote man strtok>
> On the first call to strtok_r(), str should point to the string
> to be parsed, and the value of saveptr is ignored. In subsequent calls,
> str should be NULL, and saveptr should be unchanged since the previous
> call.
> </>
>
> So just setting it to NULL is enough.

Agreed.

Thanks for fixing this,
Namhyung