Hi,
I am still getting a few xruns even after raising Jack's priority
level to 80. I am wondering whether it's fair to report these when I
have CONFIG_DEBUG_PREEMPT set?
Should I unset this option and test for a while without it, or
shall we work on the root cause of my xruns with it set?
The most recent xrun was caused when I:
1) Switched to desktop #4 in Gnome
2) Opened Firefox and went to Gentoo's bugzilla to report a fix had
worked. When the browser was updaing the page I heard a click in my
audio.
3) Returned to Desktop #1 wherre QJC reported an xrun:
configuring for 44100Hz, period = 128 frames, buffer = 2 periods
nperiods = 2 for capture
nperiods = 2 for playback
11:30:45.312 Server configuration saved to "/home/mark/.jackdrc".
11:30:45.312 Statistics reset.
11:30:45.454 Client activated.
11:30:45.456 Audio connection change.
11:30:45.459 Audio connection graph change.
11:31:00.256 Audio connection graph change.
11:31:00.335 Audio connection change.
11:31:02.616 Audio connection graph change.
11:33:06.199 XRUN callback (1).
**** alsa_pcm: xrun of at least 2.545 msecs
mark@lightning ~ $ ps -Leo pid,pri,rtprio,cmd | grep jack
7911 24 - qjackctl
7911 119 79 qjackctl
7913 20 - /usr/bin/jackd -R -P80 -dalsa -dhw:1 -r44100 -p128 -n2
7913 23 - /usr/bin/jackd -R -P80 -dalsa -dhw:1 -r44100 -p128 -n2
7913 23 - /usr/bin/jackd -R -P80 -dalsa -dhw:1 -r44100 -p128 -n2
7913 130 90 /usr/bin/jackd -R -P80 -dalsa -dhw:1 -r44100 -p128 -n2
7913 120 80 /usr/bin/jackd -R -P80 -dalsa -dhw:1 -r44100 -p128 -n2
7922 24 - aqualung -o jack --auto=alsa_pcm:playback_17
alsa_pcm:playback_18
7922 24 - aqualung -o jack --auto=alsa_pcm:playback_17
alsa_pcm:playback_18
7922 119 79 aqualung -o jack --auto=alsa_pcm:playback_17
alsa_pcm:playback_18
7993 21 - grep jack
mark@lightning ~ $ ps -Leo pid,pri,rtprio,cmd | grep IRQ
15 89 49 [IRQ 9]
798 88 48 [IRQ 8]
801 87 47 [IRQ 7]
805 86 46 [IRQ 12]
815 85 45 [IRQ 6]
850 84 44 [IRQ 14]
867 83 43 [IRQ 225]
872 82 42 [IRQ 233]
887 81 41 [IRQ 50]
893 80 40 [IRQ 217]
908 79 39 [IRQ 1]
4559 78 38 [IRQ 58]
4663 77 37 [IRQ 66]
7371 76 36 [IRQ 4]
7995 21 - grep IRQ
mark@lightning ~ $ cat /proc/interrupts
CPU0
0: 338305 IO-APIC-edge timer
1: 1993 IO-APIC-edge i8042
7: 2 IO-APIC-edge lpptest
8: 0 IO-APIC-edge rtc
9: 0 IO-APIC-level acpi
12: 62930 IO-APIC-edge i8042
14: 48 IO-APIC-edge ide0
50: 2 IO-APIC-level ehci_hcd:usb1
58: 257570 IO-APIC-level hdsp
66: 3951 IO-APIC-level ohci1394
217: 134711 IO-APIC-level ohci_hcd:usb2, eth0
225: 0 IO-APIC-level libata, NVidia CK804
233: 7751 IO-APIC-level libata
NMI: 169
LOC: 338277
ERR: 1
MIS: 0
mark@lightning ~ $
Since my NIC is getting a higher priority than both my sound card and
my 1394 audio drives (IRQ217 vs. IRQ58/IRQ66) I assume that network
activity might possibly sometimes cause a problem? Or is this not
true?
0000:00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
Subsystem: ASUSTeK Computer Inc. K8N4-E Mainboard
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 217
0000:05:06.0 Multimedia audio controller: Xilinx Corporation RME
Hammerfall DSP (rev 68)
Flags: bus master, medium devsel, latency 255, IRQ 58
Memory at da000000 (32-bit, non-prefetchable) [size=64K]
0000:05:08.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2
IEEE-1394b Link Layer Controller (rev 01) (prog-if 10 [OHCI])
Subsystem: Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller
Flags: bus master, medium devsel, latency 32, IRQ 66
Ideally I think I'd like the NIC to be lower priority than the sound
card or the 1394 drives, and possibly lower then the system's SATA
drive also.
Thanks in advance,
Mark
* Mark Knecht <[email protected]> wrote:
> I am still getting a few xruns even after raising Jack's priority
> level to 80. I am wondering whether it's fair to report these when I
> have CONFIG_DEBUG_PREEMPT set?
> 4559 78 38 [IRQ 58]
> 58: 257570 IO-APIC-level hdsp
IRQ 58 is your audio interrupt, right? You should raise that one to prio
80 too. (via chrt)
> Since my NIC is getting a higher priority than both my sound card and
> my 1394 audio drives (IRQ217 vs. IRQ58/IRQ66) I assume that network
> activity might possibly sometimes cause a problem? Or is this not
> true?
yeah, that could be the case.
Ingo
On 10/6/05, Ingo Molnar <[email protected]> wrote:
>
> * Mark Knecht <[email protected]> wrote:
>
> > I am still getting a few xruns even after raising Jack's priority
> > level to 80. I am wondering whether it's fair to report these when I
> > have CONFIG_DEBUG_PREEMPT set?
>
> > 4559 78 38 [IRQ 58]
>
> > 58: 257570 IO-APIC-level hdsp
>
> IRQ 58 is your audio interrupt, right? You should raise that one to prio
> 80 too. (via chrt)
>
> > Since my NIC is getting a higher priority than both my sound card and
> > my 1394 audio drives (IRQ217 vs. IRQ58/IRQ66) I assume that network
> > activity might possibly sometimes cause a problem? Or is this not
> > true?
>
> yeah, that could be the case.
>
> Ingo
>
Thanks Ingo,
I've set the HDSP priority to 80 using chrt. I'll continue to test away.
lightning ~ # ps -Leo pid,pri,rtprio,cmd | grep 58
4559 78 38 [IRQ 58]
8723 22 - grep 58
lightning ~ # chrt -p 4559
pid 4559's current scheduling policy: SCHED_FIFO
pid 4559's current scheduling priority: 38
lightning ~ # chrt -p 80 4559
lightning ~ # chrt -p 4559
pid 4559's current scheduling policy: SCHED_RR
pid 4559's current scheduling priority: 80
lightning ~ #
Can you suggest how I might be able to do this at boot time? Is
seems that the info is there but it requires figuring out the process
ID in a script and then running chrt in that script? Or is there a
simple place to configure how much priority I want to give a specific
IRQ?
Anyway, thanks for all the help. I'll continue on. Probably I won't
see too many xruns with this set like this.
cheers,
Mark
On 10/6/05, Mark Knecht <[email protected]> wrote:
> On 10/6/05, Ingo Molnar <[email protected]> wrote:
> >
> > * Mark Knecht <[email protected]> wrote:
> >
> > > I am still getting a few xruns even after raising Jack's priority
> > > level to 80. I am wondering whether it's fair to report these when I
> > > have CONFIG_DEBUG_PREEMPT set?
> >
> > > 4559 78 38 [IRQ 58]
> >
> > > 58: 257570 IO-APIC-level hdsp
> >
> > IRQ 58 is your audio interrupt, right? You should raise that one to prio
> > 80 too. (via chrt)
> >
> > > Since my NIC is getting a higher priority than both my sound card and
> > > my 1394 audio drives (IRQ217 vs. IRQ58/IRQ66) I assume that network
> > > activity might possibly sometimes cause a problem? Or is this not
> > > true?
> >
> > yeah, that could be the case.
> >
>
> Can you suggest how I might be able to do this at boot time?
Apologies for answering myself...
I found an old conversation that answered this, at least when I'm
logged in as root. Hopefully it will work at boot time also:
chrt -f -p 80 `pidof "IRQ 58"`
I noted that the default chrt command changed the IRQ process from
SCHED_FIFO to SCHED_RR so I assume I really should leave it at
SCHED_FIFO?
Thanks,
Mark
On 10/6/05, Mark Knecht <[email protected]> wrote:
>
> I found an old conversation that answered this, at least when I'm
> logged in as root. Hopefully it will work at boot time also:
>
> chrt -f -p 80 `pidof "IRQ 58"`
>
> I noted that the default chrt command changed the IRQ process from
> SCHED_FIFO to SCHED_RR so I assume I really should leave it at
> SCHED_FIFO?
>
Evening all. A small report...
After getting rt priorities semi-setup using this commands:
/usr/bin/chrt -f -p 80 `/bin/pidof "IRQ 58"`
/usr/bin/chrt -f -p 61 `/bin/pidof "IRQ 66"`
/usr/bin/chrt -f -p 60 `/bin/pidof "IRQ 233"`
where 58 is my sound card, 66 is my 1394 audio drive, and 233 is
(apparently) my internal SATA system drive, I am now running a very
fast 64/2 setting on Jack. This corresponds to <3mS latency and is the
best that I can do with RME sound cards since the Alsa driver enforces
nothing lower than buffer sizes of 64. If things run at this level
reliably then no one is going to hear the latency and recorded audio
can go full circle and sound live. This is great. Thanks!
However, at odd times I still get xruns. For instance one set of xruns
came while browsing the web. I was on this page:
http://steveisaacs.typepad.com/news/2005/09/a_little_more_e.html
and went to this page:
http://steveisaacs.typepad.com/news/2005/09/adventureland.html
and got 2 xruns:
18:20:06.541 XRUN callback (8).
**** alsa_pcm: xrun of at least 3.172 msecs
**** alsa_pcm: xrun of at least 0.967 msecs
18:20:07.908 XRUN callback (1 skipped).
So, while things are far, far better for me than they were earlier
this wekk, there are still some problems I'd like to get to the bottom
of if possible.
I see no xruns created by using Gnome. I can change desktops, start
and stop applications, wiggle my mouse excessively, and do lots of
other things. No problems. I am still somewhat concerned however about
my networking setup. This motherboard uses the NVidia driver which is
marked as 'reverse engineered' and experiemental, I believe.
Would it make sense to put a different NIC in here or is that not
likely to produce valuable results? I'm not immediately certain if I
even have a PCI slot, and if I do it will likely share an interrupt
with some other device.
I will also mention, although I'm hoping it doesn't matter, that I am
seeing an occasional xrun problem when using an audio app that's
attached to the Jack server. For instance, if a certian app is
transferring audio and I tell the app to load a new file into a list
of files to be played then Jack reports a 'subgraph timeout'
associated with the app even though the new files won't get played
immediately.
I am assuming for now that this is just an internal programming issue
in the application itself, but maybe not. I don't know yet.
I have tried rebuilding the kernel with the preempt debug option
turned off. This did not seem to improve this remain souce of xruns.
As always I'm interested in whatever ideas folks have to help me zero
in on the source of the problems.
Thanks,
Mark
* Mark Knecht <[email protected]> wrote:
> However, at odd times I still get xruns. For instance one set of xruns
> came while browsing the web. I was on this page:
> and got 2 xruns:
>
> 18:20:06.541 XRUN callback (8).
> **** alsa_pcm: xrun of at least 3.172 msecs
> **** alsa_pcm: xrun of at least 0.967 msecs
> 18:20:07.908 XRUN callback (1 skipped).
>
> So, while things are far, far better for me than they were earlier
> this week, there are still some problems I'd like to get to the bottom
> of if possible.
one thing i noticed: you have CONFIG_SMP set. Is it a true SMP x64
system? In any case, could you try without CONFIG_SMP, just to test
whether the latencies are related to SMP.
Ingo
On 10/7/05, Ingo Molnar <[email protected]> wrote:
>
> * Mark Knecht <[email protected]> wrote:
>
> > However, at odd times I still get xruns. For instance one set of xruns
> > came while browsing the web. I was on this page:
>
> > and got 2 xruns:
> >
> > 18:20:06.541 XRUN callback (8).
> > **** alsa_pcm: xrun of at least 3.172 msecs
> > **** alsa_pcm: xrun of at least 0.967 msecs
> > 18:20:07.908 XRUN callback (1 skipped).
> >
> > So, while things are far, far better for me than they were earlier
> > this week, there are still some problems I'd like to get to the bottom
> > of if possible.
>
> one thing i noticed: you have CONFIG_SMP set. Is it a true SMP x64
> system? In any case, could you try without CONFIG_SMP, just to test
> whether the latencies are related to SMP.
>
> Ingo
>
Hi Ingo,
OK, I've been running -rt10 for the last couple of hours on a new
kernel without SMP. No xruns so far at 64/2. I'm doing all the normal
stuff. emerge sync, building some code outside of portage, playing
music. Very good so far, but it will likely take 4-6 hours for me to
be more sure saying it was just SMP latencies.
I see you're putting out -rt updates faster than I can build & test
them so I decided to stick with -rt10 for now to ensure a fair
comparison. I'll catch up with -rt12 or later sometime this evening to
tomorrow.
Until then, one question. Yesterday I set priorities as such:
Jack - 80
Sound card - 80
1394 drive - 61
SATA drive 60
What exactly do these numbers mean? If the 1394 drive was a 79
instead of a 61 would it change anything? Or does the system just
prioitize them numerically but operate on the highest numbered request
in which case a 79 vs. 61 wouldn't matter ?
Thanks,
Mark
On Fri, 2005-10-07 at 09:44 -0700, Mark Knecht wrote:
> Hi Ingo,
> OK, I've been running -rt10 for the last couple of hours on a new
> kernel without SMP. No xruns so far at 64/2. I'm doing all the normal
> stuff. emerge sync, building some code outside of portage, playing
> music. Very good so far, but it will likely take 4-6 hours for me to
> be more sure saying it was just SMP latencies.
IIRC you posted some traces that implied the migration thread was
involved.
Lee
On 10/7/05, Lee Revell <[email protected]> wrote:
> On Fri, 2005-10-07 at 09:44 -0700, Mark Knecht wrote:
> > Hi Ingo,
> > OK, I've been running -rt10 for the last couple of hours on a new
> > kernel without SMP. No xruns so far at 64/2. I'm doing all the normal
> > stuff. emerge sync, building some code outside of portage, playing
> > music. Very good so far, but it will likely take 4-6 hours for me to
> > be more sure saying it was just SMP latencies.
>
> IIRC you posted some traces that implied the migration thread was
> involved.
>
> Lee
No Lee, I don't think I've posted any traces myself. Possibly someone else?
I need to learn how to debug where time is spent by the kernel when an
xrun occurs. I haven't yet tried some brief instructions you gave me a
couple of weeks back.
Anyway, I'm now about 3 hours down the road with no xruns now that
I've turned of SMP support. Is the migration thread part of SMP
support?
Thanks,
Mark
On Fri, 2005-10-07 at 10:47 -0700, Mark Knecht wrote:
> On 10/7/05, Lee Revell <[email protected]> wrote:
> > On Fri, 2005-10-07 at 09:44 -0700, Mark Knecht wrote:
> > > Hi Ingo,
> > > OK, I've been running -rt10 for the last couple of hours on a new
> > > kernel without SMP. No xruns so far at 64/2. I'm doing all the normal
> > > stuff. emerge sync, building some code outside of portage, playing
> > > music. Very good so far, but it will likely take 4-6 hours for me to
> > > be more sure saying it was just SMP latencies.
> >
> > IIRC you posted some traces that implied the migration thread was
> > involved.
> >
> > Lee
>
> No Lee, I don't think I've posted any traces myself. Possibly someone else?
>
I was thinking of this trace that Fernando posted on 9/26 in the LKML
thread "jack, PREEMPT_DESKTOP, delayed interrupts?". Note move_tasks
and load_balance_newidle near the end. I never saw an explanation of
this:
# cat /proc/latency_trace
preemption latency trace v1.1.5 on 2.6.13-0.3.rdtd.rhfc4.ccrmasmp
--------------------------------------------------------------------
latency: 10852 us, #70/70, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1
#P:2)
-----------------
| task: qjackctl-4797 (uid:743 nice:0 policy:1 rt_prio:61)
-----------------
_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
<...>-4593 1Dnh3 0us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 0us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 0us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 0us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 0us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 1us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 3us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 3us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 3us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 3us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 3us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 5us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 6us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 6us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 6us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 6us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 6us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 8us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 9us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 9us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 9us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 9us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 11us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 11us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 12us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 12us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 12us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 12us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 12us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 12us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 13us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 13us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 13us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 13us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 13us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 13us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 13us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 13us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 13us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 14us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 14us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 14us : MacPrivateStat (SkPnmiGetStruct)
<...>-4593 1Dnh3 14us : MacUpdate (MacPrivateStat)
<...>-4593 1Dnh3 14us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 14us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 14us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 17us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 18us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 18us : SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 20us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 20us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 20us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 22us : GetStatVal (MacPrivateStat)
<...>-4593 1Dnh3 22us : GetPhysStatVal (GetStatVal)
<...>-4593 1Dnh3 22us+: SkGmMacStatistic (GetPhysStatVal)
<...>-4593 1Dnh3 24us!: MacPrivateStat (SkPnmiGetStruct)
softirq--8 0Dnh4 9901us : trace_change_sched_cpu (1 0 0)
softirq--8 0Dnh4 9901us : _raw_spin_unlock (trace_change_sched_cpu)
softirq--8 0Dnh3 9902us : enqueue_task (move_tasks)
softirq--8 0Dnh3 9902us : resched_task (move_tasks)
softirq--8 0Dnh3 9902us : _raw_spin_unlock (load_balance_newidle)
softirq--8 0Dnh2 9902us : preempt_schedule (_raw_spin_unlock)
<...>-4797 0Dnh2 9903us : __switch_to (__schedule)
<...>-4797 0Dnh2 9904us : __schedule <softirq--8> (62 26)
<...>-4797 0Dnh2 9904us : _raw_spin_unlock_irq (__schedule)
<...>-4797 0...1 9904us : trace_stop_sched_switched (__schedule)
<...>-4797 0Dnh1 9905us : _raw_spin_lock (trace_stop_sched_switched)
<...>-4797 0Dnh2 9905us : trace_stop_sched_switched <<...>-4797> (26
0)
<...>-4797 0Dnh2 9905us : trace_stop_sched_switched (__schedule)