Each call to tui_progress__update() would forcibly refresh the entire
screen. This is somewhat inefficient and causes noticable flickering
during the startup of perf-report, especially on large/slow terminals.
It looks like the force-refresh in tui_progress__update() serves no
purpose other than to clear the screen so that the progress bar of a
previous operation does not subsume with that of a subsequent operation.
But we can do just that in a much more efficient manner by clearing only
the region that a previous progress bar may have occupied before
repainting the new progress bar. Then the force-refresh could be
removed with no change in visuals.
This patch disables the slow force-refresh in tui_progress__update() and
instead calls SLsmg_fill_region() on the entire area that the progress
bar may occupy before repainting it. This change makes the startup of
perf-report much faster and appear much "smoother".
It turns out that this was a big bottleneck in the startup speed of
perf-report -- with this patch, perf-report starts up ~1.6x faster (0.8s
vs 0.5s) on my machines. (These numbers were measured by running "time
perf report" on an 8MB perf.data and pressing 'q' immediately.)
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Patrick Palka <[email protected]>
---
tools/perf/ui/tui/progress.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
index 6c2184d..641049a 100644
--- a/tools/perf/ui/tui/progress.c
+++ b/tools/perf/ui/tui/progress.c
@@ -17,13 +17,14 @@ static void tui_progress__update(u64 curr, u64 total, const char *title)
if (total == 0)
return;
- ui__refresh_dimensions(true);
+ ui__refresh_dimensions(false);
pthread_mutex_lock(&ui__lock);
y = SLtt_Screen_Rows / 2 - 2;
SLsmg_set_color(0);
SLsmg_draw_box(y, 0, 3, SLtt_Screen_Cols);
SLsmg_gotorc(y++, 1);
SLsmg_write_string((char *)title);
+ SLsmg_fill_region(y, 1, 1, SLtt_Screen_Cols - 2, ' ');
SLsmg_set_color(HE_COLORSET_SELECTED);
bar = ((SLtt_Screen_Cols - 2) * curr) / total;
SLsmg_fill_region(y, 1, 1, bar, ' ');
--
1.8.4.rc3
* Patrick Palka <[email protected]> wrote:
> Each call to tui_progress__update() would forcibly refresh the entire
> screen. This is somewhat inefficient and causes noticable flickering
> during the startup of perf-report, especially on large/slow terminals.
>
> It looks like the force-refresh in tui_progress__update() serves no
> purpose other than to clear the screen so that the progress bar of a
> previous operation does not subsume with that of a subsequent operation.
> But we can do just that in a much more efficient manner by clearing only
> the region that a previous progress bar may have occupied before
> repainting the new progress bar. Then the force-refresh could be
> removed with no change in visuals.
>
> This patch disables the slow force-refresh in tui_progress__update() and
> instead calls SLsmg_fill_region() on the entire area that the progress
> bar may occupy before repainting it. This change makes the startup of
> perf-report much faster and appear much "smoother".
>
> It turns out that this was a big bottleneck in the startup speed of
> perf-report -- with this patch, perf-report starts up ~1.6x faster (0.8s
> vs 0.5s) on my machines. (These numbers were measured by running "time
> perf report" on an 8MB perf.data and pressing 'q' immediately.)
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Signed-off-by: Patrick Palka <[email protected]>
> ---
> tools/perf/ui/tui/progress.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
> index 6c2184d..641049a 100644
> --- a/tools/perf/ui/tui/progress.c
> +++ b/tools/perf/ui/tui/progress.c
> @@ -17,13 +17,14 @@ static void tui_progress__update(u64 curr, u64 total, const char *title)
> if (total == 0)
> return;
>
> - ui__refresh_dimensions(true);
> + ui__refresh_dimensions(false);
> pthread_mutex_lock(&ui__lock);
> y = SLtt_Screen_Rows / 2 - 2;
> SLsmg_set_color(0);
> SLsmg_draw_box(y, 0, 3, SLtt_Screen_Cols);
> SLsmg_gotorc(y++, 1);
> SLsmg_write_string((char *)title);
> + SLsmg_fill_region(y, 1, 1, SLtt_Screen_Cols - 2, ' ');
> SLsmg_set_color(HE_COLORSET_SELECTED);
> bar = ((SLtt_Screen_Cols - 2) * curr) / total;
> SLsmg_fill_region(y, 1, 1, bar, ' ');
Nice! This is something I noticed as well, never figured out the
root cause of it.
Acked-by: Ingo Molnar <[email protected]>
Thanks,
Ingo
Each call to tui_progress__update() would forcibly refresh the entire
screen. This is somewhat inefficient and causes noticable flickering
during the startup of perf-report, especially on large/slow terminals.
It looks like the force-refresh in tui_progress__update() serves no
purpose other than to clear the screen so that the progress bar of a
previous operation does not subsume that of a subsequent operation.
But we can do just that in a much more efficient manner by clearing only
the region that a previous progress bar may have occupied before
repainting the new progress bar. Then the force-refresh could be
removed with no change in visuals.
This patch disables the slow force-refresh in tui_progress__update() and
instead calls SLsmg_fill_region() on the entire area that the progress
bar may occupy before repainting it. This change makes the startup of
perf-report much faster and appear much "smoother".
It turns out that this was a big bottleneck in the startup speed of
perf-report -- with this patch, perf-report starts up ~2x faster (1.1s
vs 0.55s) on my machines. (These numbers were measured by running
"time perf report" on an 8MB perf.data and pressing 'q' immediately.)
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Ingo Molnar <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Patrick Palka <[email protected]>
---
Rebased against
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
And also CC'd two more relevant persons.
tools/perf/ui/tui/progress.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
index 3e2d936..c61d14b 100644
--- a/tools/perf/ui/tui/progress.c
+++ b/tools/perf/ui/tui/progress.c
@@ -18,13 +18,14 @@ static void tui_progress__update(struct ui_progress *p)
if (p->total == 0)
return;
- ui__refresh_dimensions(true);
+ ui__refresh_dimensions(false);
pthread_mutex_lock(&ui__lock);
y = SLtt_Screen_Rows / 2 - 2;
SLsmg_set_color(0);
SLsmg_draw_box(y, 0, 3, SLtt_Screen_Cols);
SLsmg_gotorc(y++, 1);
SLsmg_write_string((char *)p->title);
+ SLsmg_fill_region(y, 1, 1, SLtt_Screen_Cols - 2, ' ');
SLsmg_set_color(HE_COLORSET_SELECTED);
bar = ((SLtt_Screen_Cols - 2) * p->curr) / p->total;
SLsmg_fill_region(y, 1, 1, bar, ' ');
--
1.8.4.rc3
Hi Patrick,
On Fri, 25 Oct 2013 20:25:49 -0400, Patrick Palka wrote:
> Each call to tui_progress__update() would forcibly refresh the entire
> screen. This is somewhat inefficient and causes noticable flickering
> during the startup of perf-report, especially on large/slow terminals.
>
> It looks like the force-refresh in tui_progress__update() serves no
> purpose other than to clear the screen so that the progress bar of a
> previous operation does not subsume that of a subsequent operation.
> But we can do just that in a much more efficient manner by clearing only
> the region that a previous progress bar may have occupied before
> repainting the new progress bar. Then the force-refresh could be
> removed with no change in visuals.
>
> This patch disables the slow force-refresh in tui_progress__update() and
> instead calls SLsmg_fill_region() on the entire area that the progress
> bar may occupy before repainting it. This change makes the startup of
> perf-report much faster and appear much "smoother".
>
> It turns out that this was a big bottleneck in the startup speed of
> perf-report -- with this patch, perf-report starts up ~2x faster (1.1s
> vs 0.55s) on my machines. (These numbers were measured by running
> "time perf report" on an 8MB perf.data and pressing 'q' immediately.)
Acked-by: Namhyung Kim <[email protected]>
Thanks,
Namhyung