Greetings Arnaldo,
------------------------------------------------------------------------------------------------------------------------------
PerfTop: 670 irqs/sec kernel:14.6% [1000Hz cycles], (all, cpu: 3)
------------------------------------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________ ________________________________________________________________
12535.00 82.1% main /root/bin/pert
302.00 2.0% clear_page_c [kernel.kallsyms]
169.00 1.1% _int_malloc /lib64/libc-2.9.so
168.00 1.1% page_fault [kernel.kallsyms]
138.00 0.9% yylex /usr/local/src/kernel/linux-2.6.33.git/scripts/genksyms/genksyms
121.00 0.8% __GI_memset /lib64/libc-2.9.so
117.00 0.8% copy_page_c [kernel.kallsyms]
96.00 0.6% _raw_spin_lock [kernel.kallsyms]
79.00 0.5% use_config /usr/local/src/kernel/linux-2.6.33.git/scripts/basic/fixdep
If only 100% userland hog (pert) is running, I see the expected 1KHz
irqs/sec. As soon as I start a kbuild with it, irqs/sec drops, though
NMIs are indeed happening at 1KHz.
Bisection points to 5b2bb75 perf top: Support userspace symbols too.
-Mike
Yup, definitely a booboo in there somewhere.
Step 1.
oinker running, perf top -C 3 -z
------------------------------------------------------------------------------------------------------------------------------
PerfTop: 1001 irqs/sec kernel: 0.0% [1000Hz cycles], (all, cpu: 3)
------------------------------------------------------------------------------------------------------------------------------
samples pcnt func DSO
_______ _____ ____ ______________
2003.00 100.0% main /root/bin/pert
perfect (off by one is beauty mark, not wart;)
Step 2.
taskset -c 3 ./massive_intr 3 999
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
7133 root 20 0 7928 232 128 R 25 0.0 0:09.68 3 massive_intr
7134 root 20 0 7928 232 128 R 25 0.0 0:09.68 3 massive_intr
25932 root 20 0 7920 584 476 R 25 0.0 42:03.07 3 pert
7132 root 20 0 7928 232 128 R 25 0.0 0:09.68 3 massive_intr
------------------------------------------------------------------------------------------------------------------------------
PerfTop: 252 irqs/sec kernel: 0.8% [1000Hz cycles], (all, cpu: 3)
------------------------------------------------------------------------------------------------------------------------------
samples pcnt func DSO
_______ _____ ____ ______________
500.00 99.2% main /root/bin/pert
massive_intr appeared briefly, then poof, gone (likely when it forked
off worker-bees), leaving this.
-Mike
Em Thu, Jan 21, 2010 at 06:42:08AM +0100, Mike Galbraith escreveu:
> Greetings Arnaldo,
>
> ------------------------------------------------------------------------------------------------------------------------------
> PerfTop: 670 irqs/sec kernel:14.6% [1000Hz cycles], (all, cpu: 3)
> ------------------------------------------------------------------------------------------------------------------------------
>
> samples pcnt function DSO
> _______ _____ ___________________ ________________________________________________________________
>
> 12535.00 82.1% main /root/bin/pert
> 302.00 2.0% clear_page_c [kernel.kallsyms]
> 169.00 1.1% _int_malloc /lib64/libc-2.9.so
> 168.00 1.1% page_fault [kernel.kallsyms]
> 138.00 0.9% yylex /usr/local/src/kernel/linux-2.6.33.git/scripts/genksyms/genksyms
> 121.00 0.8% __GI_memset /lib64/libc-2.9.so
> 117.00 0.8% copy_page_c [kernel.kallsyms]
> 96.00 0.6% _raw_spin_lock [kernel.kallsyms]
> 79.00 0.5% use_config /usr/local/src/kernel/linux-2.6.33.git/scripts/basic/fixdep
>
> If only 100% userland hog (pert) is running, I see the expected 1KHz
> irqs/sec. As soon as I start a kbuild with it, irqs/sec drops, though
> NMIs are indeed happening at 1KHz.
>
> Bisection points to 5b2bb75 perf top: Support userspace symbols too.
Can you check if this one helps?
- Arnaldo
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 7a8a77e..8b04988 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -934,8 +934,11 @@ static void event__process_sample(const event_t *self,
struct addr_location al;
u8 origin = self->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+ ++samples;
+
switch (origin) {
case PERF_RECORD_MISC_USER:
+ ++userspace_samples;
if (hide_user_symbols)
return;
break;
@@ -960,9 +963,6 @@ static void event__process_sample(const event_t *self,
if (list_empty(&syme->node) || !syme->node.next)
__list_insert_active_sym(syme);
pthread_mutex_unlock(&active_symbols_lock);
- if (origin == PERF_RECORD_MISC_USER)
- ++userspace_samples;
- ++samples;
}
}
On Thu, 2010-01-21 at 10:22 -0200, Arnaldo Carvalho de Melo wrote:
> Can you check if this one helps?
Made interrupt count correct, but symbols aren't mapped after the task
forks it seems. At least, in event__preprocess_sample(), after
thread__find_addr_location(), al->map is NULL a LOT.
-Mike
Em Thu, Jan 21, 2010 at 03:28:32PM +0100, Mike Galbraith escreveu:
> On Thu, 2010-01-21 at 10:22 -0200, Arnaldo Carvalho de Melo wrote:
>
> > Can you check if this one helps?
>
> Made interrupt count correct, but symbols aren't mapped after the task
> forks it seems. At least, in event__preprocess_sample(), after
> thread__find_addr_location(), al->map is NULL a LOT.
Try this one on top of the previous one:
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 8b04988..2227b84 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -975,6 +975,10 @@ static int event__process(event_t *event, struct perf_session *session)
case PERF_RECORD_MMAP:
event__process_mmap(event, session);
break;
+ case PERF_RECORD_FORK:
+ case PERF_RECORD_EXIT:
+ event__process_task(event, session);
+ break;
default:
break;
}
On Thu, 2010-01-21 at 12:49 -0200, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jan 21, 2010 at 03:28:32PM +0100, Mike Galbraith escreveu:
> > On Thu, 2010-01-21 at 10:22 -0200, Arnaldo Carvalho de Melo wrote:
> >
> > > Can you check if this one helps?
> >
> > Made interrupt count correct, but symbols aren't mapped after the task
> > forks it seems. At least, in event__preprocess_sample(), after
> > thread__find_addr_location(), al->map is NULL a LOT.
>
> Try this one on top of the previous one:
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 8b04988..2227b84 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -975,6 +975,10 @@ static int event__process(event_t *event, struct perf_session *session)
> case PERF_RECORD_MMAP:
> event__process_mmap(event, session);
> break;
> + case PERF_RECORD_FORK:
> + case PERF_RECORD_EXIT:
> + event__process_task(event, session);
> + break;
> default:
> break;
> }
Yup. All better.
I figured that had to to be tied in from rummaging around earlier.
-Mike
Em Thu, Jan 21, 2010 at 03:56:12PM +0100, Mike Galbraith escreveu:
> On Thu, 2010-01-21 at 12:49 -0200, Arnaldo Carvalho de Melo wrote:
> > + case PERF_RECORD_FORK:
> > + case PERF_RECORD_EXIT:
> > + event__process_task(event, session);
> > + break;
> > default:
> > break;
> > }
>
> Yup. All better.
>
> I figured that had to to be tied in from rummaging around earlier.
Thanks for testing!
I'm pushing these two fixes with your Tested-by tags.
- Arnaldo