Subject: Re: [PATCH 4/5] perf top: switch to overwrite mode
From: "Wangnan (F)"
Date: Wed, 6 Dec 2017 09:58:51 +0800
In-Reply-To: <1512513555-118798-5-git-send-email-kan.liang@intel.com>
References: <1512513555-118798-1-git-send-email-kan.liang@intel.com> <1512513555-118798-5-git-send-email-kan.liang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/12/6 6:39, kan.liang@intel.com wrote:
> From: Kan Liang
>
> perf_top__mmap_read has a severe performance issue on
> Knights Landing/Mill when monitoring a heavily loaded system. It can
> take several minutes to finish, which is unacceptable.
>
> Currently, perf top uses non-overwrite mode. In non-overwrite mode, it
> tries to read everything in the ring buffer and doesn't pause the
> ring buffer. When lots of samples are delivered persistently, the
> processing time can become very long. Also, the latest samples can be
> lost when the ring buffer is full.
>
> Overwrite mode takes a snapshot of the system by pausing the ring
> buffer, which significantly reduces the processing time. Overwrite
> mode also always keeps the latest samples. Given the real-time
> requirement of perf top, overwrite mode is the better fit.
>
> Actually, perf top originally used overwrite mode. It was changed to
> non-overwrite mode in commit 93fc64f14472 ("perf top: Switch to non
> overwrite mode"). It's better to change it back to overwrite mode.
>
> Some records are lost in overwrite mode because of pausing the ring
> buffer, but this has little impact on the accuracy of the snapshot and
> can be tolerated, so the lost-events check is removed.
>
> Unconditionally wait 100 ms before each snapshot. This also reduces
> the overhead of pausing the ring buffer, especially on a lightly
> loaded system.
>
> Signed-off-by: Kan Liang
> ---
>  tools/perf/builtin-top.c       | 30 ++++++++++++++----------------
>  tools/perf/ui/browsers/hists.c | 12 +++++++++---
>  2 files changed, 23 insertions(+), 19 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 540461f..721d786 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -283,16 +283,6 @@ static void perf_top__print_sym_table(struct perf_top *top)
>
>  	printf("%-*.*s\n", win_width, win_width, graph_dotted_line);
>
> -	if (hists->stats.nr_lost_warned !=
> -	    hists->stats.nr_events[PERF_RECORD_LOST]) {
> -		hists->stats.nr_lost_warned =
> -			hists->stats.nr_events[PERF_RECORD_LOST];
> -		color_fprintf(stdout, PERF_COLOR_RED,
> -			      "WARNING: LOST %d chunks, Check IO/CPU overload",
> -			      hists->stats.nr_lost_warned);
> -		++printed;
> -	}
> -
>  	if (top->sym_filter_entry) {
>  		perf_top__show_details(top);
>  		return;
> @@ -807,14 +797,19 @@ static void perf_event__process_sample(struct perf_tool *tool,
>
>  static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
>  {
> +	struct perf_mmap *md = &top->evlist->overwrite_mmap[idx];
>  	struct perf_sample sample;
>  	struct perf_evsel *evsel;
>  	struct perf_session *session = top->session;
>  	union perf_event *event;
>  	struct machine *machine;
> +	unsigned long size;
> +	u64 end, start;
>  	int ret;
>
> -	while ((event = perf_evlist__mmap_read(top->evlist, idx)) != NULL) {
> +	perf_mmap__read_catchup(md, true, &start, &end, &size);
> +
> +	while ((event = perf_mmap__read_backward(md, &start, end)) != NULL) {
>  		ret = perf_evlist__parse_sample(top->evlist, event, &sample);
>  		if (ret) {
>  			pr_err("Can't parse sample, err = %d\n", ret);
> @@ -869,16 +864,21 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
>  		} else
>  			++session->evlist->stats.nr_unknown_events;
> next_event:
> -		perf_evlist__mmap_consume(top->evlist, idx);
> +		perf_mmap__consume(md, true);
>  	}
> +
> +	perf_mmap__read_done(md);
>  }
>
>  static void perf_top__mmap_read(struct perf_top *top)
>  {
>  	int i;
>
> +	perf_evlist__toggle_bkw_mmap(top->evlist, BKW_MMAP_DATA_PENDING);
>  	for (i = 0; i < top->evlist->nr_mmaps; i++)
>  		perf_top__mmap_read_idx(top, i);
> +	perf_evlist__toggle_bkw_mmap(top->evlist, BKW_MMAP_EMPTY);
> +	perf_evlist__toggle_bkw_mmap(top->evlist, BKW_MMAP_RUNNING);
>  }
>
>  static int perf_top__start_counters(struct perf_top *top)
> @@ -1029,12 +1029,9 @@ static int __cmd_top(struct perf_top *top)
>  	}
>
>  	while (!done) {
> -		u64 hits = top->samples;
> -
>  		perf_top__mmap_read(top);
>
> -		if (hits == top->samples)
> -			ret = perf_evlist__poll(top->evlist, 100);
> +		ret = perf_evlist__poll(top->evlist, 100);
>
>  		if (resize) {
>  			perf_top__resize(top);
> @@ -1127,6 +1124,7 @@ int cmd_top(int argc, const char **argv)
>  			.uses_mmap = true,
>  		},
>  		.proc_map_timeout = 500,
> +		.overwrite = 1,

This breaks old kernels:

 # ./perf top
 Error:
 Reading from overwrite event is not supported by this kernel.
 # uname -r
 3.0.13-0.27-default

We need a way to fall back to the normal ring buffer when the kernel
doesn't support the backward ring buffer.

Thank you.