2006-05-22 15:41:10

by Serge Noiraud

[permalink] [raw]
Subject: RT patch + LTTng

Hi,

I have added the LTTng patch to the 2.6.16-rt23.
In the LTT trace, I see some odd time stamps :

First case :

...
process.schedchange: 12.800681016 (/cpu_1), 7667, lttctl, 0, 0x0, MODE_UNKNOWN { 17, 7667, 2 }
socket.call: 12.800685053 (/cpu_1), 7667, lttctl, 0, 0x0, MODE_UNKNOWN { 11, 3 }
socket.sendmsg: 12.800687153 (/cpu_1), 7667, lttctl, 0, 0x0, MODE_UNKNOWN { 0xf3b78b80, 16, 3, 31, 288 }
>The pb is effectively on the next line : we skip from 12.800687153 to 13.397691687
>0,597 seconde ???
core.time_heartbeat: 13.397691687 (/control/facilities_1), 7667, lttctl, 0, 0x0, MODE_UNKNOWN
core.time_heartbeat: 13.397691979 (/control/facilities_0), 7662, lttd, 7660, 0x0, USER_MODE
core.time_heartbeat: 13.397694936 (/control/interrupts_0), 7662, lttd, 7660, 0x0, USER_MODE
...

Second case :

...
process.schedchange: 4.062674969 (/cpu_1), 0, UNNAMED, 0, 0x0, SYSCALL { 17, 0, 2 }
process.wakeup: 4.062733369 (/cpu_0), 7538, lttd, 1, 0x0, USER_MODE { 5, 2 }
process.wakeup: 4.062738209 (/cpu_0), 7538, lttd, 1, 0x0, USER_MODE { 3, 2 }
process.schedchange: 4.062746308 (/cpu_0), 3, posix_cpu_timer, 1, 0x0, SYSCALL { 7538, 3, 0 }
process.schedchange: 4.062751787 (/cpu_0), 5, softirq-timer/0, 1, 0x0, SYSCALL { 3, 5, 2 }
timer.softirq: 4.062753184 (/cpu_0), 5, softirq-timer/0, 1, 0x0, SYSCALL
process.schedchange: 4.062759716 (/cpu_0), 7538, lttd, 1, 0x0, USER_MODE { 5, 7538, 2 }
>====================== Problem On the following lines
>573us between 4.062759716 and 4.063493275
kernel.trap_entry: 4.063493275 (/cpu_1), 0, UNNAMED, 0, 0x0, TRAP { 2, 0xc0101329 }
kernel.trap_exit: 4.063494252 (/cpu_1), 0, UNNAMED, 0, 0x0, SYSCALL
kernel.trap_entry: 4.063494624 (/cpu_0), 7538, lttd, 1, 0x0, TRAP { 2, 0xc032e8b9 }
kernel.trap_exit: 4.063495799 (/cpu_0), 7538, lttd, 1, 0x0, USER_MODE
kernel.irq_entry: 4.063497194 (/cpu_0), 7538, lttd, 1, 0x0, IRQ { 0, kernel }
kernel.irq_exit: 4.063500254 (/cpu_0), 7538, lttd, 1, 0x0, USER_MODE
process.wakeup: 4.063644762 (/cpu_1), 0, UNNAMED, 0, 0x0, SYSCALL { 17, 2 }
...

My questions are :
What could explains these gaps :
NMI ?
ACPI ?
clock synchronization between CPU ?
other cause ?

There is no big latencies :
-sh-2.05b# cat /proc/latency_trace
preemption latency trace v1.1.5 on 2.6.16-DAV07_ltt
--------------------------------------------------------------------
latency: 93 us, #240/240, CPU#0 | (M:rt VP:0, KP:0, SP:1 HP:1 #P:2)
-----------------
| task: posix_cpu_timer-3 (uid:0 nice:0 policy:1 rt_prio:99)
-----------------

_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup (try_to_wake_up)
ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup <<...>-3> (0 0)
ltt_stat-7809 0Dnh3 0us : _raw_spin_unlock (__trace_start_sched_wakeup)
ltt_stat-7809 0Dnh2 1us : preempt_schedule (_raw_spin_unlock)
ltt_stat-7809 0Dnh2 1us : resched_task (try_to_wake_up)
ltt_stat-7809 0Dnh2 1us : try_to_wake_up <<...>-3> (199 -8)
ltt_stat-7809 0Dnh2 1us : _raw_spin_unlock_irqrestore (try_to_wake_up)
ltt_stat-7809 0Dnh1 1us : preempt_schedule (_raw_spin_unlock_irqrestore)
ltt_stat-7809 0Dnh1 1us : wake_up_process (run_posix_cpu_timers)
ltt_stat-7809 0Dnh1 2us : irq_exit (smp_apic_timer_interrupt)
ltt_stat-7809 0.n.1 2us : ltt_heartbeat_read_synthetic_tsc (ltt_statedump_thread)
...
ltt_stat-7809 0.n.1 81us : ltt_heartbeat_read_synthetic_tsc (ltt_statedump_thread)
ltt_stat-7809 0.n.1 81us : preempt_schedule (ltt_heartbeat_read_synthetic_tsc)
ltt_stat-7809 0.n.1 82us : ltt_heartbeat_read_synthetic_tsc (ltt_statedump_thread)
ltt_stat-7809 0.n.1 82us : preempt_schedule (ltt_heartbeat_read_synthetic_tsc)
ltt_stat-7809 0.n.1 82us : ltt_heartbeat_read_synthetic_tsc (ltt_statedump_thread)
ltt_stat-7809 0.n.1 83us : preempt_schedule (ltt_heartbeat_read_synthetic_tsc)
ltt_stat-7809 0.n.1 83us : ltt_heartbeat_read_synthetic_tsc (ltt_statedump_thread)
ltt_stat-7809 0.n.1 83us : preempt_schedule (ltt_heartbeat_read_synthetic_tsc)
ltt_stat-7809 0.n.. 84us : rt_up_read (ltt_statedump_thread)
ltt_stat-7809 0.n.. 84us : _raw_spin_lock_irqsave (rt_up_read)
ltt_stat-7809 0Dn.1 84us : _raw_spin_unlock_irqrestore (rt_up_read)
ltt_stat-7809 0.n.. 85us : preempt_schedule (_raw_spin_unlock_irqrestore)
ltt_stat-7809 0Dn.. 85us : __schedule (preempt_schedule)
ltt_stat-7809 0Dn.. 85us : profile_hit (__schedule)
ltt_stat-7809 0Dn.1 85us : sched_clock (__schedule)
ltt_stat-7809 0Dn.1 86us : _raw_spin_lock_irq (__schedule)
ltt_stat-7809 0Dn.2 86us : balance_rt_tasks (__schedule)
ltt_stat-7809 0Dn.2 86us : find_next_bit (balance_rt_tasks)
ltt_stat-7809 0Dn.2 86us : find_next_bit (balance_rt_tasks)
ltt_stat-7809 0D..2 87us : trace_array (__schedule)
ltt_stat-7809 0D..2 87us : trace_array <<...>-3> (0 199)
ltt_stat-7809 0D..2 88us : trace_array <<...>-5> (98 101)
ltt_stat-7809 0D..2 89us : trace_array <ltt_stat-7809> (112 -8)
ltt_stat-7809 0D..2 89us : trace_array <<...>-7806> (118 -2)
ltt_stat-7809 0D..2 89us : trace_array (__schedule)
ltt_stat-7809 0D..2 90us : sched_info_arrive (__schedule)
ltt_stat-7809 0D..3 91us : ltt_heartbeat_read_synthetic_tsc (__schedule)
<...>-3 0D..2 91us : __switch_to (__schedule)
<...>-3 0D..2 92us : __schedule <ltt_stat-7809> (-8 199)
<...>-3 0D..2 92us : _raw_spin_unlock_irq (__schedule)
<...>-3 0...1 92us : trace_stop_sched_switched (__schedule)
<...>-3 0D..1 93us : _raw_spin_lock (trace_stop_sched_switched)
<...>-3 0D..2 93us : trace_stop_sched_switched <<...>-3> (0 0)
<...>-3 0D..2 93us : trace_stop_sched_switched (__schedule)


I don't use CPU FREQ :
-sh-2.05b# zgrep CPU_FREQ /proc/config.gz
# CONFIG_CPU_FREQ is not set
-sh-2.05b#


--
Serge Noiraud


2006-05-22 23:39:16

by Lee Revell

[permalink] [raw]
Subject: Re: RT patch + LTTng

On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:
> Hi,
>
> I have added the LTTng patch to the 2.6.16-rt23.
> In the LTT trace, I see some odd time stamps :

Is your test machine a dual core AMD64?

Lee

2006-05-23 07:00:50

by Serge Noiraud

[permalink] [raw]
Subject: Re: RT patch + LTTng

mardi 23 Mai 2006 01:39, Lee Revell wrote/a ?crit?:
> On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:
> > Hi,
> >
> > I have added the LTTng patch to the 2.6.16-rt23.
> > In the LTT trace, I see some odd time stamps :
>
> Is your test machine a dual core AMD64?
NO.
-sh-2.05b# more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.60GHz
stepping : 1
cpu MHz : 3600.494
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm constant_tsc pni mo
nitor ds_cpl est tm2 cid cx16 xtpr
bogomips : 7203.68

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.60GHz
stepping : 1
cpu MHz : 3600.494
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm constant_tsc pni mo
nitor ds_cpl est tm2 cid cx16 xtpr
bogomips : 7199.36

-sh-2.05b#

>
> Lee
>

--
Serge Noiraud

2006-05-23 14:06:40

by Daniel Walker

[permalink] [raw]
Subject: Re: RT patch + LTTng

On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:

> ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup (try_to_wake_up)
> ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup <<...>-3> (0 0)


Do you also have preempt and interrupt latency turned on ? In addition
to wakeup latency ..

Daniel

2006-05-23 14:56:23

by Serge Noiraud

[permalink] [raw]
Subject: Re: RT patch + LTTng

mardi 23 Mai 2006 16:06, Daniel Walker wrote/a ?crit?:
> On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:
>
> > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup (try_to_wake_up)
> > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup <<...>-3> (0 0)
>
>
> Do you also have preempt and interrupt latency turned on ? In addition
> to wakeup latency ..
Here is all I configured :
...
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_BKL=y
# CONFIG_CLASSIC_RCU is not set
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_STATS=y
...
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_RT_MUTEX_TESTER=y
CONFIG_WAKEUP_TIMING=y
CONFIG_WAKEUP_LATENCY_HIST=y
CONFIG_PREEMPT_TRACE=y
CONFIG_CRITICAL_PREEMPT_TIMING=y
CONFIG_PREEMPT_OFF_HIST=y
CONFIG_CRITICAL_IRQSOFF_TIMING=y
CONFIG_INTERRUPT_OFF_HIST=y
CONFIG_CRITICAL_TIMING=y
CONFIG_DEBUG_TRACE_IRQFLAGS=y
CONFIG_LATENCY_TIMING=y
CONFIG_CRITICAL_LATENCY_HIST=y
CONFIG_LATENCY_HIST=y
CONFIG_LATENCY_TRACE=y
...

If you want to known what I have in :
/proc/latency_hist/preempt_off_latency and in /proc/latency_hist/interrupt_off_latency
I suppressed all lines with a zero at the end.
There are no infos in these files.

-sh-2.05b# more /proc/latency_hist/preempt_off_latency/CPU0 | grep -v " 0$"
#Minimum latency: 4294967295 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 0 microseconds.
#Total samples: 0
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
-sh-2.05b# more /proc/latency_hist/preempt_off_latency/CPU1 | grep -v " 0$"
#Minimum latency: 4294967295 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 0 microseconds.
#Total samples: 0
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
-sh-2.05b# more /proc/latency_hist/interrupt_off_latency/CPU0 | grep -v " 0$"
#Minimum latency: 4294967295 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 0 microseconds.
#Total samples: 0
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
-sh-2.05b# more /proc/latency_hist/interrupt_off_latency/CPU1 | grep -v " 0$"
#Minimum latency: 4294967295 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 0 microseconds.
#Total samples: 0
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
-sh-2.05b#

>
> Daniel

--
Serge Noiraud

2006-05-23 15:26:34

by Daniel Walker

[permalink] [raw]
Subject: Re: RT patch + LTTng

On Tue, 2006-05-23 at 16:57 +0200, Serge Noiraud wrote:
> mardi 23 Mai 2006 16:06, Daniel Walker wrote/a écrit :
> > On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:
> >
> > > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup (try_to_wake_up)
> > > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup <<...>-3> (0 0)
> >
> >
> > Do you also have preempt and interrupt latency turned on ? In addition
> > to wakeup latency ..
> Here is all I configured :


Do you get the same effect with all those options turned off?

Daniel

2006-05-23 16:22:08

by Serge Noiraud

[permalink] [raw]
Subject: Re: RT patch + LTTng

mardi 23 Mai 2006 17:26, Daniel Walker wrote/a écrit :
> On Tue, 2006-05-23 at 16:57 +0200, Serge Noiraud wrote:
> > mardi 23 Mai 2006 16:06, Daniel Walker wrote/a écrit :
> > > On Mon, 2006-05-22 at 17:42 +0200, Serge Noiraud wrote:
> > >
> > > > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup (try_to_wake_up)
> > > > ltt_stat-7809 0Dnh3 0us : __trace_start_sched_wakeup <<...>-3> (0 0)
> > >
> > >
> > > Do you also have preempt and interrupt latency turned on ? In addition
> > > to wakeup latency ..
> > Here is all I configured :
>
>
> Do you get the same effect with all those options turned off?
I'll try.

The last pb was with nmi_watchdog=1.
I get the same things with nmi_watchdog=0.

>
> Daniel
>
>

--
Serge Noiraud