From: Kan Liang <[email protected]>
perf_top__mmap_read() has a severe performance issue on
Knights Landing/Mill when monitoring a heavily loaded system. It can
take several minutes to finish, which is unacceptable.
Currently, perf top uses non-overwrite mode: it tries to read
everything in the ring buffer and doesn't pause it. When lots of
samples are delivered persistently, the processing time can become
very long, and the latest samples can be lost when the ring buffer is
full.
It's better to change it to overwrite mode, which takes a snapshot of
the system by pausing the ring buffer. This significantly reduces the
processing time (from several minutes to several seconds), and
overwrite mode always keeps the latest samples.
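With the new interfaces introduced by this series, an overwrite-mode
read pass looks roughly like this (a sketch only; error handling
omitted):

	u64 start, end;
	union perf_event *event;

	/* pause the ring buffer before taking the snapshot */
	perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_DATA_PENDING);

	perf_mmap__read_init(map, true, &start, &end);
	while ((event = perf_mmap__read_event(map, true, &start, end)) != NULL) {
		/* process the event */
		perf_mmap__consume(map, true);
	}
	perf_mmap__read_done(map);

	/* resume the ring buffer */
	perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
	perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);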
Patches 1-8: Introduce new interfaces for generic code to support
overwrite mode with one-by-one event reads.
Discard the stale interfaces.
These patches can be merged separately.
Patches 9-15: Add overwrite support to perf top.
perf top should support only one mode at a time, either
overwrite or non-overwrite.
Switch the default mode to overwrite mode.
If the kernel doesn't support overwrite mode, fall back to
non-overwrite mode.
Changes since V3:
- Separate the new interface perf_mmap__read_init() and its
  application to perf_mmap__push() into their own patches
- Correct the comments of perf_mmap__read_done()
- Name the pointer parameters with a 'p' suffix
- Document the new rules for checking the per-event overwrite term in
  comments. Do the check before perf_evlist__config()
- Add a new patch to disable/enable the lost-event warning in the
  hists browser.
Changes since V2:
- Move the duplicated 'map->prev' out of perf_mmap__read(). Modify
  perf_mmap__read_event() accordingly.
- Introduce a new interface, perf_mmap__read_init(), to calculate the
  ring buffer position
- Check perf_missing_features.write_backward
- Discard the stale interfaces perf_mmap__read_backward() and
  perf_mmap__read_catchup()
Changes since V1:
- New patches 4-6
- Support both overwrite mode and non-overwrite mode.
  If the kernel doesn't support the default overwrite mode, fall back
  to non-overwrite mode.
Kan Liang (15):
perf evlist: remove stale mmap read for backward
perf mmap: introduce perf_mmap__read_init()
perf mmap: apply perf_mmap__read_init() to perf_mmap__push()
perf mmap: discard 'prev' in perf_mmap__read
perf mmap: introduce perf_mmap__read_done
perf mmap: introduce perf_mmap__read_event()
perf test: update mmap read functions for backward-ring-buffer test
perf mmap: discard legacy interface for mmap read
perf top: check per-event overwrite term
perf evsel: expose perf_missing_features.write_backward
perf top: add overwrite fall back
perf hists browser: add parameter to disable lost event warning
perf top: remove lost events checking
perf top: switch default mode to overwrite mode
perf top: check the latency of perf_top__mmap_read
tools/perf/builtin-c2c.c | 4 +-
tools/perf/builtin-report.c | 3 +-
tools/perf/builtin-top.c | 150 ++++++++++++++++++++++++++++--
tools/perf/tests/backward-ring-buffer.c | 7 +-
tools/perf/ui/browsers/hists.c | 38 +++++---
tools/perf/ui/browsers/hists.h | 3 +-
tools/perf/util/evlist.c | 17 ----
tools/perf/util/evlist.h | 4 -
tools/perf/util/evsel.c | 5 +
tools/perf/util/evsel.h | 2 +
tools/perf/util/hist.h | 6 +-
tools/perf/util/mmap.c | 157 ++++++++++++++++++--------------
tools/perf/util/mmap.h | 10 +-
13 files changed, 287 insertions(+), 119 deletions(-)
--
2.5.5
From: Kan Liang <[email protected]>
perf_evlist__mmap_read_catchup() and perf_evlist__mmap_read_backward()
are only for overwrite mode, but they read the evlist->mmap buffer,
which is for non-overwrite mode.
This hasn't caused any serious problem yet, because nothing uses them.
Remove the unused interfaces.
Acked-by: Wang Nan <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/evlist.c | 17 -----------------
tools/perf/util/evlist.h | 4 ----
2 files changed, 21 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09..cf0fd35 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -714,28 +714,11 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
return perf_mmap__read_forward(md);
}
-union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
-{
- struct perf_mmap *md = &evlist->mmap[idx];
-
- /*
- * No need to check messup for backward ring buffer:
- * We can always read arbitrary long data from a backward
- * ring buffer unless we forget to pause it before reading.
- */
- return perf_mmap__read_backward(md);
-}
-
union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
{
return perf_evlist__mmap_read_forward(evlist, idx);
}
-void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
-{
- perf_mmap__read_catchup(&evlist->mmap[idx]);
-}
-
void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
{
perf_mmap__consume(&evlist->mmap[idx], false);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index e7fbca6..bbf53b1 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -134,10 +134,6 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist,
int idx);
-union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist,
- int idx);
-void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
-
void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
--
2.5.5
From: Kan Liang <[email protected]>
Per-event overwrite term is not forbidden in perf top, which can bring
problems. Because perf top only support non-overwrite mode now.
Add new rules and check regarding to overwrite term for perf top.
- All events either have same per-event term or don't have per-event
mode setting. Otherwise, it will error out.
- Per-event overwrite term should be consistent as opts->overwrite.
If not, updating the opts->overwrite according to per-event term.
Make it possible to support either non-overwrite or overwrite mode.
The overwrite mode is forbidden now, which will be removed when the
overwrite mode is supported later.
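For example (hypothetical command lines, using the same event syntax
as the comment in the code below):

	# consistent per-event terms: accepted, opts->overwrite follows them
	perf top -e cpu/cpu-cycles,no-overwrite/ -e cpu/instructions,no-overwrite/

	# mixed terms: perf top errors out
	perf top -e cpu/cpu-cycles,overwrite/ -e cpu/instructions,no-overwrite/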
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-top.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index c6ccda5..c86eee4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -881,6 +881,68 @@ static void perf_top__mmap_read(struct perf_top *top)
perf_top__mmap_read_idx(top, i);
}
+/*
+ * Check the per-event overwrite term.
+ * perf top requires a consistent term for all events.
+ * - No event has a per-event term
+ * E.g. "cpu/cpu-cycles/,cpu/instructions/"
+ * Nothing changes; return 0.
+ * - All events have the same per-event term
+ * E.g. "cpu/cpu-cycles,no-overwrite/,cpu/instructions,no-overwrite/"
+ * Use the per-event setting to replace opts->overwrite if they
+ * differ, then return 0.
+ * - Events have different per-event terms
+ * E.g. "cpu/cpu-cycles,overwrite/,cpu/instructions,no-overwrite/"
+ * Return -1.
+ * - Some events set a per-event term, but others don't.
+ * E.g. "cpu/cpu-cycles/,cpu/instructions,no-overwrite/"
+ * Return -1.
+ */
+static int perf_top_overwrite_check(struct perf_top *top)
+{
+ struct record_opts *opts = &top->record_opts;
+ struct perf_evlist *evlist = top->evlist;
+ struct perf_evsel_config_term *term;
+ struct list_head *config_terms;
+ struct perf_evsel *evsel;
+ int set, overwrite = -1;
+
+ evlist__for_each_entry(evlist, evsel) {
+ set = -1;
+ config_terms = &evsel->config_terms;
+ list_for_each_entry(term, config_terms, list) {
+ if (term->type == PERF_EVSEL__CONFIG_TERM_OVERWRITE)
+ set = term->val.overwrite ? 1 : 0;
+ }
+
+ /* no term for current and previous event (likely) */
+ if ((overwrite < 0) && (set < 0))
+ continue;
+
+ /* has term for both current and previous event, compare */
+ if ((overwrite >= 0) && (set >= 0) && (overwrite != set))
+ return -1;
+
+ /* no term for current event but has term for previous one */
+ if ((overwrite >= 0) && (set < 0))
+ return -1;
+
+ /* has term for current event */
+ if ((overwrite < 0) && (set >= 0)) {
+ /* if it's first event, set overwrite */
+ if (evsel == perf_evlist__first(evlist))
+ overwrite = set;
+ else
+ return -1;
+ }
+ }
+
+ if ((overwrite >= 0) && (opts->overwrite != overwrite))
+ opts->overwrite = overwrite;
+
+ return 0;
+}
+
static int perf_top__start_counters(struct perf_top *top)
{
char msg[BUFSIZ];
@@ -888,6 +950,17 @@ static int perf_top__start_counters(struct perf_top *top)
struct perf_evlist *evlist = top->evlist;
struct record_opts *opts = &top->record_opts;
+ if (perf_top_overwrite_check(top)) {
+ ui__error("perf top only support consistent per-event "
+ "overwrite setting for all events\n");
+ goto out_err;
+ }
+
+ if (opts->overwrite) {
+ ui__error("not support overwrite mode yet\n");
+ goto out_err;
+ }
+
perf_evlist__config(evlist, opts, &callchain_param);
evlist__for_each_entry(evlist, counter) {
--
2.5.5
From: Kan Liang <[email protected]>
Use the new perf_mmap__read_*() interfaces for the overwrite ring
buffer test.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/tests/backward-ring-buffer.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 4035d43..e0b1b41 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -31,10 +31,12 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
int i;
for (i = 0; i < evlist->nr_mmaps; i++) {
+ struct perf_mmap *map = &evlist->overwrite_mmap[i];
union perf_event *event;
+ u64 start, end;
- perf_mmap__read_catchup(&evlist->overwrite_mmap[i]);
- while ((event = perf_mmap__read_backward(&evlist->overwrite_mmap[i])) != NULL) {
+ perf_mmap__read_init(map, true, &start, &end);
+ while ((event = perf_mmap__read_event(map, true, &start, end)) != NULL) {
const u32 type = event->header.type;
switch (type) {
@@ -49,6 +51,7 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
return TEST_FAIL;
}
}
+ perf_mmap__read_done(map);
}
return TEST_OK;
}
--
2.5.5
From: Kan Liang <[email protected]>
In overwrite mode, the ring buffer is paused, so lost events are
expected. A way is needed to tell the browser not to print the
lost-event warning.
It will be used later by perf top to disable the lost-event warning in
overwrite mode. There is no behavior change for now.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-c2c.c | 4 ++--
tools/perf/builtin-report.c | 3 ++-
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/browsers/hists.c | 38 +++++++++++++++++++++++++-------------
tools/perf/ui/browsers/hists.h | 3 ++-
tools/perf/util/hist.h | 6 ++++--
6 files changed, 36 insertions(+), 20 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c0debc3..5b509df 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2245,7 +2245,7 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
c2c_browser__update_nr_entries(browser);
while (1) {
- key = hist_browser__run(browser, "? - help");
+ key = hist_browser__run(browser, "? - help", false);
switch (key) {
case 's':
@@ -2314,7 +2314,7 @@ static int perf_c2c__hists_browse(struct hists *hists)
c2c_browser__update_nr_entries(browser);
while (1) {
- key = hist_browser__run(browser, "? - help");
+ key = hist_browser__run(browser, "? - help", false);
switch (key) {
case 'q':
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..28a9030 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -527,7 +527,8 @@ static int report__browse_hists(struct report *rep)
case 1:
ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
rep->min_percent,
- &session->header.env);
+ &session->header.env,
+ false);
/*
* Usually "ret" is the last pressed key, and we only
* care if the key notifies us to switch data file.
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ecba2cd..fc3c91b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -611,7 +611,7 @@ static void *display_thread_tui(void *arg)
perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
top->min_percent,
- &top->session->header.env);
+ &top->session->header.env, false);
done = 1;
return NULL;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 68146f4..e458920 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -608,7 +608,8 @@ static int hist_browser__title(struct hist_browser *browser, char *bf, size_t si
return browser->title ? browser->title(browser, bf, size) : 0;
}
-int hist_browser__run(struct hist_browser *browser, const char *help)
+int hist_browser__run(struct hist_browser *browser, const char *help,
+ bool no_lost_event_warning)
{
int key;
char title[160];
@@ -638,8 +639,9 @@ int hist_browser__run(struct hist_browser *browser, const char *help)
nr_entries = hist_browser__nr_entries(browser);
ui_browser__update_nr_entries(&browser->b, nr_entries);
- if (browser->hists->stats.nr_lost_warned !=
- browser->hists->stats.nr_events[PERF_RECORD_LOST]) {
+ if (!no_lost_event_warning &&
+ (browser->hists->stats.nr_lost_warned !=
+ browser->hists->stats.nr_events[PERF_RECORD_LOST])) {
browser->hists->stats.nr_lost_warned =
browser->hists->stats.nr_events[PERF_RECORD_LOST];
ui_browser__warn_lost_events(&browser->b);
@@ -2763,7 +2765,8 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
bool left_exits,
struct hist_browser_timer *hbt,
float min_pcnt,
- struct perf_env *env)
+ struct perf_env *env,
+ bool no_lost_event_warning)
{
struct hists *hists = evsel__hists(evsel);
struct hist_browser *browser = perf_evsel_browser__new(evsel, hbt, env);
@@ -2844,7 +2847,8 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
nr_options = 0;
- key = hist_browser__run(browser, helpline);
+ key = hist_browser__run(browser, helpline,
+ no_lost_event_warning);
if (browser->he_selection != NULL) {
thread = hist_browser__selected_thread(browser);
@@ -3184,7 +3188,8 @@ static void perf_evsel_menu__write(struct ui_browser *browser,
static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
int nr_events, const char *help,
- struct hist_browser_timer *hbt)
+ struct hist_browser_timer *hbt,
+ bool no_lost_event_warning)
{
struct perf_evlist *evlist = menu->b.priv;
struct perf_evsel *pos;
@@ -3203,7 +3208,9 @@ static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
case K_TIMER:
hbt->timer(hbt->arg);
- if (!menu->lost_events_warned && menu->lost_events) {
+ if (!menu->lost_events_warned &&
+ menu->lost_events &&
+ !no_lost_event_warning) {
ui_browser__warn_lost_events(&menu->b);
menu->lost_events_warned = true;
}
@@ -3224,7 +3231,8 @@ static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
key = perf_evsel__hists_browse(pos, nr_events, help,
true, hbt,
menu->min_pcnt,
- menu->env);
+ menu->env,
+ no_lost_event_warning);
ui_browser__show_title(&menu->b, title);
switch (key) {
case K_TAB:
@@ -3282,7 +3290,8 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
int nr_entries, const char *help,
struct hist_browser_timer *hbt,
float min_pcnt,
- struct perf_env *env)
+ struct perf_env *env,
+ bool no_lost_event_warning)
{
struct perf_evsel *pos;
struct perf_evsel_menu menu = {
@@ -3309,13 +3318,15 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
menu.b.width = line_len;
}
- return perf_evsel_menu__run(&menu, nr_entries, help, hbt);
+ return perf_evsel_menu__run(&menu, nr_entries, help,
+ hbt, no_lost_event_warning);
}
int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct hist_browser_timer *hbt,
float min_pcnt,
- struct perf_env *env)
+ struct perf_env *env,
+ bool no_lost_event_warning)
{
int nr_entries = evlist->nr_entries;
@@ -3325,7 +3336,7 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
return perf_evsel__hists_browse(first, nr_entries, help,
false, hbt, min_pcnt,
- env);
+ env, no_lost_event_warning);
}
if (symbol_conf.event_group) {
@@ -3342,5 +3353,6 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
}
return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
- hbt, min_pcnt, env);
+ hbt, min_pcnt, env,
+ no_lost_event_warning);
}
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index ba43177..a903ce1 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -28,7 +28,8 @@ struct hist_browser {
struct hist_browser *hist_browser__new(struct hists *hists);
void hist_browser__delete(struct hist_browser *browser);
-int hist_browser__run(struct hist_browser *browser, const char *help);
+int hist_browser__run(struct hist_browser *browser, const char *help,
+ bool no_lost_event_warning);
void hist_browser__init(struct hist_browser *browser,
struct hists *hists);
#endif /* _PERF_UI_BROWSER_HISTS_H_ */
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index f6630cb..b5d8ab2 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -430,7 +430,8 @@ int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct hist_browser_timer *hbt,
float min_pcnt,
- struct perf_env *env);
+ struct perf_env *env,
+ bool no_lost_event_warning);
int script_browse(const char *script_opt);
#else
static inline
@@ -438,7 +439,8 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist __maybe_unused,
const char *help __maybe_unused,
struct hist_browser_timer *hbt __maybe_unused,
float min_pcnt __maybe_unused,
- struct perf_env *env __maybe_unused)
+ struct perf_env *env __maybe_unused,
+ bool no_lost_event_warning __maybe_unused)
{
return 0;
}
--
2.5.5
From: Kan Liang <[email protected]>
Discard perf_mmap__read_backward() and perf_mmap__read_catchup(). No
tools use them.
Some tools still use perf_mmap__read_forward(). Keep it, but add a
comment pointing to the new interface for future use.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 50 ++++----------------------------------------------
tools/perf/util/mmap.h | 3 ---
2 files changed, 4 insertions(+), 49 deletions(-)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index de50c71..c2a56bb 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -63,6 +63,10 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map,
return event;
}
+/*
+ * legacy interface for mmap read.
+ * Don't use it. Use perf_mmap__read_event().
+ */
union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
{
u64 head;
@@ -78,41 +82,6 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
return perf_mmap__read(map, &map->prev, head);
}
-union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
-{
- u64 head, end;
-
- /*
- * Check if event was unmapped due to a POLLHUP/POLLERR.
- */
- if (!refcount_read(&map->refcnt))
- return NULL;
-
- head = perf_mmap__read_head(map);
- if (!head)
- return NULL;
-
- /*
- * 'head' pointer starts from 0. Kernel minus sizeof(record) form
- * it each time when kernel writes to it, so in fact 'head' is
- * negative. 'end' pointer is made manually by adding the size of
- * the ring buffer to 'head' pointer, means the validate data can
- * read is the whole ring buffer. If 'end' is positive, the ring
- * buffer has not fully filled, so we must adjust 'end' to 0.
- *
- * However, since both 'head' and 'end' is unsigned, we can't
- * simply compare 'end' against 0. Here we compare '-head' and
- * the size of the ring buffer, where -head is the number of bytes
- * kernel write to the ring buffer.
- */
- if (-head < (u64)(map->mask + 1))
- end = 0;
- else
- end = head + map->mask + 1;
-
- return perf_mmap__read(map, &map->prev, end);
-}
-
/*
* Read event from ring buffer one by one.
* Return one event for each call.
@@ -152,17 +121,6 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map,
return event;
}
-void perf_mmap__read_catchup(struct perf_mmap *map)
-{
- u64 head;
-
- if (!refcount_read(&map->refcnt))
- return;
-
- head = perf_mmap__read_head(map);
- map->prev = head;
-}
-
static bool perf_mmap__empty(struct perf_mmap *map)
{
return perf_mmap__read_head(map) == map->prev && !map->auxtrace_mmap.base;
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 29d7c5d..303e5d2 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -65,8 +65,6 @@ void perf_mmap__put(struct perf_mmap *map);
void perf_mmap__consume(struct perf_mmap *map, bool overwrite);
-void perf_mmap__read_catchup(struct perf_mmap *md);
-
static inline u64 perf_mmap__read_head(struct perf_mmap *mm)
{
struct perf_event_mmap_page *pc = mm->base;
@@ -87,7 +85,6 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md, u64 tail)
}
union perf_event *perf_mmap__read_forward(struct perf_mmap *map);
-union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
union perf_event *perf_mmap__read_event(struct perf_mmap *map,
bool overwrite,
--
2.5.5
From: Kan Liang <[email protected]>
Switch to non-overwrite mode if the kernel doesn't support the
overwrite ring buffer.
This only takes effect once overwrite mode is supported. There is no
change to the current behavior.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-top.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index c86eee4..ecba2cd 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -943,6 +943,27 @@ static int perf_top_overwrite_check(struct perf_top *top)
return 0;
}
+static int perf_top_overwrite_fallback(struct perf_top *top,
+ struct perf_evsel *evsel)
+{
+ struct record_opts *opts = &top->record_opts;
+ struct perf_evlist *evlist = top->evlist;
+ struct perf_evsel *counter;
+
+ if (!opts->overwrite)
+ return 0;
+
+ /* only fall back when first event fails */
+ if (evsel != perf_evlist__first(evlist))
+ return 0;
+
+ evlist__for_each_entry(evlist, counter)
+ counter->attr.write_backward = false;
+ opts->overwrite = false;
+ ui__warning("fall back to non-overwrite mode\n");
+ return 1;
+}
+
static int perf_top__start_counters(struct perf_top *top)
{
char msg[BUFSIZ];
@@ -967,6 +988,21 @@ static int perf_top__start_counters(struct perf_top *top)
try_again:
if (perf_evsel__open(counter, top->evlist->cpus,
top->evlist->threads) < 0) {
+
+ /*
+ * Handle the overwrite fallback specially here,
+ * because perf top is the only tool which has
+ * overwrite mode by default, supports both
+ * overwrite and non-overwrite mode, and
+ * requires a consistent mode for all events.
+ *
+ * This may move to generic code once more
+ * tools have a similar attribute.
+ */
+ if (is_write_backward_fail() &&
+ perf_top_overwrite_fallback(top, counter))
+ goto try_again;
+
if (perf_evsel__fallback(counter, errno, msg, sizeof(msg))) {
if (verbose > 0)
ui__warning("%s\n", msg);
--
2.5.5
From: Kan Liang <[email protected]>
perf top needs it to handle the overwrite fallback later.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/evsel.c | 5 +++++
tools/perf/util/evsel.h | 2 ++
2 files changed, 7 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4eea3b4..1dd3700 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1670,6 +1670,11 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
return true;
}
+bool is_write_backward_fail(void)
+{
+ return perf_missing_features.write_backward;
+}
+
int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 846e416..1f46728 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -448,4 +448,6 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
struct perf_env *perf_evsel__env(struct perf_evsel *evsel);
+bool is_write_backward_fail(void);
+
#endif /* __PERF_EVSEL_H */
--
2.5.5
From: Kan Liang <[email protected]>
The latency of perf_top__mmap_read() should be lower than the refresh
time. If it is not, print some hints for reducing the latency.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-top.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5ab7598..8a3151b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -889,8 +889,10 @@ static void perf_top__mmap_read(struct perf_top *top)
{
bool overwrite = top->record_opts.overwrite;
struct perf_evlist *evlist = top->evlist;
+ unsigned long long start, end;
int i;
+ start = rdclock();
if (overwrite)
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_DATA_PENDING);
@@ -901,6 +903,13 @@ static void perf_top__mmap_read(struct perf_top *top)
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
}
+ end = rdclock();
+
+ if ((end - start) > (unsigned long long)top->delay_secs * NSEC_PER_SEC)
+ ui__warning("Too slow to read ring buffer.\n"
+ "Please try increasing the period (-c) or\n"
+ "decreasing the freq (-F) or\n"
+ "limiting the number of CPUs (-C)\n");
}
/*
--
2.5.5
From: Kan Liang <[email protected]>
Some records will be lost in overwrite mode because the ring buffer is
paused. This has little impact on the accuracy of the snapshot and can
be tolerated by perf top.
Remove the lost-events check.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-top.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fc3c91b..3f10304 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -283,8 +283,9 @@ static void perf_top__print_sym_table(struct perf_top *top)
printf("%-*.*s\n", win_width, win_width, graph_dotted_line);
- if (hists->stats.nr_lost_warned !=
- hists->stats.nr_events[PERF_RECORD_LOST]) {
+ if (!top->record_opts.overwrite &&
+ (hists->stats.nr_lost_warned !=
+ hists->stats.nr_events[PERF_RECORD_LOST])) {
hists->stats.nr_lost_warned =
hists->stats.nr_events[PERF_RECORD_LOST];
color_fprintf(stdout, PERF_COLOR_RED,
@@ -611,7 +612,8 @@ static void *display_thread_tui(void *arg)
perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
top->min_percent,
- &top->session->header.env, false);
+ &top->session->header.env,
+ top->record_opts.overwrite);
done = 1;
return NULL;
--
2.5.5
From: Kan Liang <[email protected]>
perf_top__mmap_read() has a severe performance issue on
Knights Landing/Mill when monitoring a heavily loaded system. It can
take several minutes to finish, which is unacceptable.
Currently, perf top uses non-overwrite mode: it tries to read
everything in the ring buffer and doesn't pause it. When lots of
samples are delivered persistently, the processing time can become
very long, and the latest samples can be lost when the ring buffer is
full.
Overwrite mode takes a snapshot of the system by pausing the ring
buffer, which significantly reduces the processing time, and it always
keeps the latest samples. Considering the real-time requirement of
perf top, overwrite mode is the better fit.
perf top actually used to be overwrite mode; it was changed to
non-overwrite mode by commit 93fc64f14472 ("perf top: Switch to non
overwrite mode"). It's better to change it back to overwrite mode by
default.
On kernels which don't support overwrite mode, it falls back to
non-overwrite mode.
Some records will be lost in overwrite mode because the ring buffer is
paused. This has little impact on the accuracy of the snapshot and can
be tolerated.
In overwrite mode, unconditionally wait 100 ms before each snapshot.
This also reduces the overhead caused by pausing the ring buffer,
especially on a lightly loaded system.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-top.c | 34 +++++++++++++++++++++++++---------
1 file changed, 25 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3f10304..5ab7598 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -809,15 +809,23 @@ static void perf_event__process_sample(struct perf_tool *tool,
static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
{
+ struct record_opts *opts = &top->record_opts;
+ struct perf_evlist *evlist = top->evlist;
struct perf_sample sample;
struct perf_evsel *evsel;
+ struct perf_mmap *md;
struct perf_session *session = top->session;
union perf_event *event;
struct machine *machine;
+ u64 end, start;
int ret;
- while ((event = perf_evlist__mmap_read(top->evlist, idx)) != NULL) {
- ret = perf_evlist__parse_sample(top->evlist, event, &sample);
+ md = opts->overwrite ? &evlist->overwrite_mmap[idx] : &evlist->mmap[idx];
+ if (perf_mmap__read_init(md, opts->overwrite, &start, &end) < 0)
+ return;
+
+ while ((event = perf_mmap__read_event(md, opts->overwrite, &start, end)) != NULL) {
+ ret = perf_evlist__parse_sample(evlist, event, &sample);
if (ret) {
pr_err("Can't parse sample, err = %d\n", ret);
goto next_event;
@@ -871,16 +879,28 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
} else
++session->evlist->stats.nr_unknown_events;
next_event:
- perf_evlist__mmap_consume(top->evlist, idx);
+ perf_mmap__consume(md, opts->overwrite);
}
+
+ perf_mmap__read_done(md);
}
static void perf_top__mmap_read(struct perf_top *top)
{
+ bool overwrite = top->record_opts.overwrite;
+ struct perf_evlist *evlist = top->evlist;
int i;
+ if (overwrite)
+ perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_DATA_PENDING);
+
for (i = 0; i < top->evlist->nr_mmaps; i++)
perf_top__mmap_read_idx(top, i);
+
+ if (overwrite) {
+ perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
+ perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
+ }
}
/*
@@ -979,11 +999,6 @@ static int perf_top__start_counters(struct perf_top *top)
goto out_err;
}
- if (opts->overwrite) {
- ui__error("not support overwrite mode yet\n");
- goto out_err;
- }
-
perf_evlist__config(evlist, opts, &callchain_param);
evlist__for_each_entry(evlist, counter) {
@@ -1144,7 +1159,7 @@ static int __cmd_top(struct perf_top *top)
perf_top__mmap_read(top);
- if (hits == top->samples)
+ if (opts->overwrite || (hits == top->samples))
ret = perf_evlist__poll(top->evlist, 100);
if (resize) {
@@ -1238,6 +1253,7 @@ int cmd_top(int argc, const char **argv)
.uses_mmap = true,
},
.proc_map_timeout = 500,
+ .overwrite = 1,
},
.max_stack = sysctl_perf_event_max_stack,
.sym_pcnt_filter = 5,
--
2.5.5
From: Kan Liang <[email protected]>
Except for perf record, the perf tools read events one by one from the
ring buffer using perf_mmap__read_forward(), but it only supports
non-overwrite mode.
Introduce perf_mmap__read_event() to support both non-overwrite and
overwrite mode.
Usage:
	perf_mmap__read_init(map, overwrite, &start, &end);
	while ((event = perf_mmap__read_event(map, overwrite, &start, end)) != NULL) {
		/* process the event */
		perf_mmap__consume(map, overwrite);
	}
	perf_mmap__read_done(map);
perf_mmap__read_backward() cannot be used here, because it always
reads the stale buffer, which has already been processed. Furthermore,
the forward and backward concepts have been removed;
perf_mmap__read_backward() will be replaced and discarded later.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 39 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/mmap.h | 4 ++++
2 files changed, 43 insertions(+)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 2291160..de50c71 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -113,6 +113,45 @@ union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
return perf_mmap__read(map, &map->prev, end);
}
+/*
+ * Read event from ring buffer one by one.
+ * Return one event for each call.
+ *
+ * Usage:
+ * perf_mmap__read_init()
+ * while(event = perf_mmap__read_event()) {
+ * //process the event
+ * perf_mmap__consume()
+ * }
+ * perf_mmap__read_done()
+ */
+union perf_event *perf_mmap__read_event(struct perf_mmap *map,
+ bool overwrite,
+ u64 *startp, u64 end)
+{
+ union perf_event *event;
+
+ /*
+ * Check if event was unmapped due to a POLLHUP/POLLERR.
+ */
+ if (!refcount_read(&map->refcnt))
+ return NULL;
+
+ if (startp == NULL)
+ return NULL;
+
+ /* non-overwrite doesn't pause the ring buffer */
+ if (!overwrite)
+ end = perf_mmap__read_head(map);
+
+ event = perf_mmap__read(map, startp, end);
+
+ if (!overwrite)
+ map->prev = *startp;
+
+ return event;
+}
+
void perf_mmap__read_catchup(struct perf_mmap *map)
{
u64 head;
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 4434cc5..29d7c5d 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -89,6 +89,10 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md, u64 tail)
union perf_event *perf_mmap__read_forward(struct perf_mmap *map);
union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
+union perf_event *perf_mmap__read_event(struct perf_mmap *map,
+ bool overwrite,
+ u64 *startp, u64 end);
+
int perf_mmap__push(struct perf_mmap *md, bool backward,
void *to, int push(void *to, void *buf, size_t size));
--
2.5.5
From: Kan Liang <[email protected]>
'start' and 'prev' are duplicates in perf_mmap__read().
Use 'map->prev' to replace 'start' in perf_mmap__read_*().
Suggested-by: Wang Nan <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 28 ++++++++++------------------
1 file changed, 10 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 7c5cbdc..2f2cc3a 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -22,29 +22,27 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map)
/* When check_messup is true, 'end' must points to a good entry */
static union perf_event *perf_mmap__read(struct perf_mmap *map,
- u64 start, u64 end, u64 *prev)
+ u64 *startp, u64 end)
{
unsigned char *data = map->base + page_size;
union perf_event *event = NULL;
- int diff = end - start;
+ int diff = end - *startp;
if (diff >= (int)sizeof(event->header)) {
size_t size;
- event = (union perf_event *)&data[start & map->mask];
+ event = (union perf_event *)&data[*startp & map->mask];
size = event->header.size;
- if (size < sizeof(event->header) || diff < (int)size) {
- event = NULL;
- goto broken_event;
- }
+ if (size < sizeof(event->header) || diff < (int)size)
+ return NULL;
/*
* Event straddles the mmap boundary -- header should always
* be inside due to u64 alignment of output.
*/
- if ((start & map->mask) + size != ((start + size) & map->mask)) {
- unsigned int offset = start;
+ if ((*startp & map->mask) + size != ((*startp + size) & map->mask)) {
+ unsigned int offset = *startp;
unsigned int len = min(sizeof(*event), size), cpy;
void *dst = map->event_copy;
@@ -59,20 +57,15 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map,
event = (union perf_event *)map->event_copy;
}
- start += size;
+ *startp += size;
}
-broken_event:
- if (prev)
- *prev = start;
-
return event;
}
union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
{
u64 head;
- u64 old = map->prev;
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
@@ -82,13 +75,12 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
head = perf_mmap__read_head(map);
- return perf_mmap__read(map, old, head, &map->prev);
+ return perf_mmap__read(map, &map->prev, head);
}
union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
{
u64 head, end;
- u64 start = map->prev;
/*
* Check if event was unmapped due to a POLLHUP/POLLERR.
@@ -118,7 +110,7 @@ union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
else
end = head + map->mask + 1;
- return perf_mmap__read(map, start, end, &map->prev);
+ return perf_mmap__read(map, &map->prev, end);
}
void perf_mmap__read_catchup(struct perf_mmap *map)
--
2.5.5
From: Kan Liang <[email protected]>
The direction of overwrite mode is backward, so the last
perf_mmap__read() sets the tail to map->prev. map->prev needs to be
corrected to the head, which is the end of the next read.
It will be used later.
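A sketch of where perf_mmap__read_done() sits in an overwrite-mode
read pass (using the perf_mmap__read_event() interface from this
series):

	perf_mmap__read_init(map, true, &start, &end);
	while ((event = perf_mmap__read_event(map, true, &start, end)) != NULL)
		perf_mmap__consume(map, true);
	perf_mmap__read_done(map);	/* reset map->prev to the current head */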
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 11 +++++++++++
tools/perf/util/mmap.h | 1 +
2 files changed, 12 insertions(+)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 2f2cc3a..2291160 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -342,3 +342,14 @@ int perf_mmap__push(struct perf_mmap *md, bool overwrite,
out:
return rc;
}
+
+/*
+ * Mandatory for overwrite mode
+ * The direction of overwrite mode is backward.
+ * The last perf_mmap__read() will set tail to map->prev.
+ * Need to correct the map->prev to head which is the end of next read.
+ */
+void perf_mmap__read_done(struct perf_mmap *map)
+{
+ map->prev = perf_mmap__read_head(map);
+}
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 0633308..4434cc5 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -96,4 +96,5 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map);
int perf_mmap__read_init(struct perf_mmap *map, bool overwrite,
u64 *startp, u64 *endp);
+void perf_mmap__read_done(struct perf_mmap *map);
#endif /*__PERF_MMAP_H */
--
2.5.5
From: Kan Liang <[email protected]>
Apply the generic perf_mmap__read_init() to perf_mmap__push().
'size' is needed by both perf_mmap__read_init() and perf_mmap__push(),
so it has to be calculated in each function.
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 32 ++++++--------------------------
1 file changed, 6 insertions(+), 26 deletions(-)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 414089f..7c5cbdc 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -313,38 +313,18 @@ int perf_mmap__read_init(struct perf_mmap *map, bool overwrite,
int perf_mmap__push(struct perf_mmap *md, bool overwrite,
void *to, int push(void *to, void *buf, size_t size))
{
- u64 head = perf_mmap__read_head(md);
- u64 old = md->prev;
- u64 end = head, start = old;
unsigned char *data = md->base + page_size;
+ u64 head = perf_mmap__read_head(md);
unsigned long size;
+ u64 end, start;
void *buf;
- int rc = 0;
+ int rc;
- start = overwrite ? head : old;
- end = overwrite ? old : head;
-
- if (start == end)
- return 0;
+ rc = perf_mmap__read_init(md, overwrite, &start, &end);
+ if (rc < 0)
+ return (rc == -EAGAIN) ? 0 : -1;
size = end - start;
- if (size > (unsigned long)(md->mask) + 1) {
- if (!overwrite) {
- WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
-
- md->prev = head;
- perf_mmap__consume(md, overwrite);
- return 0;
- }
-
- /*
- * Backward ring buffer is full. We still have a chance to read
- * most of data from it.
- */
- if (overwrite_rb_find_range(data, md->mask, head, &start, &end))
- return -1;
- }
-
if ((start & md->mask) + size != (end & md->mask)) {
buf = &data[start & md->mask];
size = md->mask + 1 - (start & md->mask);
--
2.5.5
From: Kan Liang <[email protected]>
perf record has specific code to calculate the ring buffer position
for both overwrite and non-overwrite mode. Currently, only perf record
supports both modes; perf top will support both modes later, so it is
useful to make this specific code generic.
Introduce a new interface, perf_mmap__read_init(), to find the ring
buffer position. It is factored out from perf_mmap__push(), with
slight differences:
- Add a check for map->refcnt
- Add new return values, -EAGAIN and -EINVAL
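A caller is expected to handle the return values along these lines
(sketch; this mirrors how perf_mmap__push() uses it in the next
patch):

	rc = perf_mmap__read_init(map, overwrite, &start, &end);
	if (rc < 0)
		return (rc == -EAGAIN) ? 0 : -1; /* -EAGAIN: no data available */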
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/mmap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/mmap.h | 2 ++
2 files changed, 45 insertions(+)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 05076e6..414089f 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -267,6 +267,49 @@ static int overwrite_rb_find_range(void *buf, int mask, u64 head, u64 *start, u6
return -1;
}
+/*
+ * Report the start and end of the available data in the ring buffer
+ */
+int perf_mmap__read_init(struct perf_mmap *map, bool overwrite,
+ u64 *startp, u64 *endp)
+{
+ unsigned char *data = map->base + page_size;
+ u64 head = perf_mmap__read_head(map);
+ u64 old = map->prev;
+ unsigned long size;
+
+ /*
+ * Check if event was unmapped due to a POLLHUP/POLLERR.
+ */
+ if (!refcount_read(&map->refcnt))
+ return -EINVAL;
+
+ *startp = overwrite ? head : old;
+ *endp = overwrite ? old : head;
+
+ if (*startp == *endp)
+ return -EAGAIN;
+
+ size = *endp - *startp;
+ if (size > (unsigned long)(map->mask) + 1) {
+ if (!overwrite) {
+ WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+ map->prev = head;
+ perf_mmap__consume(map, overwrite);
+ return -EAGAIN;
+ }
+
+ /*
+ * Backward ring buffer is full. We still have a chance to read
+ * most of data from it.
+ */
+ if (overwrite_rb_find_range(data, map->mask, head, startp, endp))
+ return -EINVAL;
+ }
+ return 0;
+}
+
int perf_mmap__push(struct perf_mmap *md, bool overwrite,
void *to, int push(void *to, void *buf, size_t size))
{
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index e43d7b5..0633308 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -94,4 +94,6 @@ int perf_mmap__push(struct perf_mmap *md, bool backward,
size_t perf_mmap__mmap_len(struct perf_mmap *map);
+int perf_mmap__read_init(struct perf_mmap *map, bool overwrite,
+ u64 *startp, u64 *endp);
#endif /*__PERF_MMAP_H */
--
2.5.5
On Mon, Jan 15, 2018 at 12:20:38PM -0800, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> perf record has specific code to calculate the ring buffer position
> for both overwrite and non-overwrite mode. Currently, only perf record
> supports both modes; perf top will support both modes later, so it is
> useful to make this specific code generic.
>
> Introduce a new interface, perf_mmap__read_init(), to find the ring
> buffer position. It is factored out from perf_mmap__push(), with
> slight differences:
> - Add a check for map->refcnt
> - Add new return values, -EAGAIN and -EINVAL
not helpful.. I asked to separate those changes,
so we can clearly see what the refcnt check is for
and what's behind that EAGAIN return
please add separate:
1) a patch that adds perf_mmap__read_init into perf_mmap__push
   with no functional change
2) a patch that adds and explains the refcnt check
3) a patch that adds and explains the EAGAIN return
thanks,
jirka
> On Mon, Jan 15, 2018 at 12:20:38PM -0800, [email protected] wrote:
> > From: Kan Liang <[email protected]>
> >
> > perf record has specific code to calculate the ring buffer position
> > for both overwrite and non-overwrite mode. Currently, only perf record
> > supports both modes; perf top will support both modes later, so it is
> > useful to make this specific code generic.
> >
> > Introduce a new interface, perf_mmap__read_init(), to find the ring
> > buffer position. It is factored out from perf_mmap__push(), with
> > slight differences:
> > - Add a check for map->refcnt
> > - Add new return values, -EAGAIN and -EINVAL
>
> not helpful.. I asked to separate those changes,
> so we can clearly see what the refcnt check is for
> and what's behind that EAGAIN return
>
> please add separate:
> 1) a patch that adds perf_mmap__read_init into perf_mmap__push
>    with no functional change
> 2) a patch that adds and explains the refcnt check
> 3) a patch that adds and explains the EAGAIN return
>
Oh, I see. I had misunderstood before.
I will change it in V5.
Thanks,
Kan
> thanks,
> jirka
>
Hi,
On Mon, Jan 15, 2018 at 12:20:48PM -0800, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> In overwrite mode, the ring buffer is paused, so lost events are
> expected. A way is needed to tell the browser not to print the
> lost-event warning.
>
> It will be used later by perf top to disable the lost-event warning
> in overwrite mode. There is no behavior change for now.
>
> Signed-off-by: Kan Liang <[email protected]>
> ---
[SNIP]
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index 68146f4..e458920 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -608,7 +608,8 @@ static int hist_browser__title(struct hist_browser *browser, char *bf, size_t si
> return browser->title ? browser->title(browser, bf, size) : 0;
> }
>
> -int hist_browser__run(struct hist_browser *browser, const char *help)
> +int hist_browser__run(struct hist_browser *browser, const char *help,
> + bool no_lost_event_warning)
> {
> int key;
> char title[160];
> @@ -638,8 +639,9 @@ int hist_browser__run(struct hist_browser *browser, const char *help)
> nr_entries = hist_browser__nr_entries(browser);
> ui_browser__update_nr_entries(&browser->b, nr_entries);
>
> - if (browser->hists->stats.nr_lost_warned !=
> - browser->hists->stats.nr_events[PERF_RECORD_LOST]) {
> + if (!no_lost_event_warning &&
Double negation is always confusing (at least for me), why not make
it "warn_lost_event"?
Thanks,
Namhyung
> + (browser->hists->stats.nr_lost_warned !=
> + browser->hists->stats.nr_events[PERF_RECORD_LOST])) {
> browser->hists->stats.nr_lost_warned =
> browser->hists->stats.nr_events[PERF_RECORD_LOST];
> ui_browser__warn_lost_events(&browser->b);
> Hi,
>
> On Mon, Jan 15, 2018 at 12:20:48PM -0800, [email protected] wrote:
> > From: Kan Liang <[email protected]>
> >
> > In overwrite mode, the ring buffer is paused, so lost events are
> > expected. A way is needed to tell the browser not to print the
> > lost-event warning.
>
> [SNIP]
> > +	if (!no_lost_event_warning &&
>
> Double negation is always confusing (at least for me), why not make
> it "warn_lost_event"?
>
OK. I will change it in V5.
Thanks,
Kan
> Thanks,
> Namhyung
>