2006-03-07 22:03:35

by Fernando Lopez-Lezcano

[permalink] [raw]
Subject: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs

Hi all, I don't know where to go from here...

I'm trying to get a 2.6.15-rt18 kernel going and I'm finding a big
showstopper in the alsa sequencer, presumably at the kernel module
level. This is -rt18 with PREEMPT_RT and high resolution timers enabled,
plus either alsa kernel drivers from 1.0.10 (latest stable release) or
1.0.11rc3. Same problem for both. I initially thought that the
snd-rtctimer kernel module was a problem so the latest tests were done
with that module disabled so that the alsa sequencer is using the system
timer for timing (HZ == 1000 for this kernel).

The symptoms are as follows:
- start jack using qjackctl
- start qsynth (gui front end for fluidsynth, a synth)
- start rosegarden (midi sequencer and audio recorder)
- load a midi file into rosegarden
- midi file plays successfully
- close rosegarden
at this point one of the threads of rosegarden fails to exit and stays
forever in the process list, in a ps axuw it shows as:

nando 5484 0.0 0.0 0 0 pts/1 D 13:32 0:00 [rosegardenseque]

Anything else that I try to stop that touches the alsa sequencer never
dies (qjackctl, vkeybd, qsynth, etc). Anything I try to start that tries
to use it does not start. This happened with two widely different
versions of rosegarden (1.0 and 1.2.3) - I have not yet tried with other
alsa sequencer clients - I tried different kernel builds and alsa builds
before giving up, the cycle is long when I include the frequent reboots.
I can browse around in /proc/asound/seq but a:
cat /proc/asound/clients
hangs forever (unkillable). Obviously I can't unload the alsa subsystem
to try again. At this point my only recourse to recover is a reboot
(which hangs when trying to stop the alsa subsystem - so the reset key
is the only option).

There are no kernel oops or messages either in dmesg
or /var/log/messages that I can see.

Booting into 2.6.14 with the latest -rt patch for that branch does not
show the problem (using alsa kernel modules from 1.0.10 stable).

The problem happens with at least three completely different soundcards
and two different kernel builds (one kind of vanilla, the other derived
from the latest fc4 kernel configuration and patches).

So it would look like an interaction between the newest -rt patches and
the alsa sequencer timing. Weird because while rosegarden is up and
running it plays midi sequences with no apparent problems, it is only
when it exits that things get completely messed up.

Suggestions appreciated...
-- Fernando

# cat /proc/asound/seq/timer
Timer for queue 0 : system timer
Period time : 0.001000000
Skew : 65536 / 65536

[rosegarden is still there.....]
# cat /proc/asound/seq/queues
queue 0: [Rosegarden queue]
owned by client : 130
lock status : Locked
queued time events : 0
queued tick events : 0
timer state : Running
timer PPQ : 96
current tempo : 500000
current BPM : 120
current time : 1617.680000000 s
current tick : 318130

# cat /proc/asound/seq/drivers
snd-seq-oss,loaded,0
snd-seq-midi,loaded,1



2006-03-07 22:07:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs


* Fernando Lopez-Lezcano <[email protected]> wrote:

> The symptoms are as follows:
> - start jack using qjackctl
> - start qsynth (gui front end for fluidsynth, a synth)
> - start rosegarden (midi sequencer and audio recorder)
> - load a midi file into rosegarden
> - midi file plays successfully
> - close rosegarden
> at this point one of the threads of rosegarden fails to exit and stays
> forever in the process list, in a ps axuw it shows as:
>
> nando 5484 0.0 0.0 0 0 pts/1 D 13:32 0:00 [rosegardenseque]
>
> Anything else that I try to stop that touches the alsa sequencer never
> dies (qjackctl, vkeybd, qsynth, etc). Anything I try to start that tries
> to use it does not start. This happened with two widely different

could you get a tasklist-dump? It's either SysRq-T, or:

echo t > /proc/sysrq-trigger

that should dump all tasks and their backtraces - including the hung
rosegardensequencer task.

Ingo

2006-03-08 01:29:19

by Karsten Wiese

[permalink] [raw]
Subject: RE: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs


--- Ingo Molnar <[email protected]> schrieb:

>
> * Fernando Lopez-Lezcano <[email protected]>
> wrote:
>
> > The symptoms are as follows:
> > - start jack using qjackctl
> > - start qsynth (gui front end for fluidsynth, a
> synth)
> > - start rosegarden (midi sequencer and audio
> recorder)
> > - load a midi file into rosegarden
> > - midi file plays successfully
> > - close rosegarden
> > at this point one of the threads of rosegarden fails to
> exit and stays
> > forever in the process list, in a ps axuw it shows as:
> >
> > nando 5484 0.0 0.0 0 0 pts/1 D 13:32 0:00
> [rosegardenseque]
> >
> > Anything else that I try to stop that touches the alsa
> sequencer never
> > dies (qjackctl, vkeybd, qsynth, etc). Anything I try to
> start that tries
> > to use it does not start. This happened with two widely
> different
>
> could you get a tasklist-dump? It's either SysRq-T, or:
>
> echo t > /proc/sysrq-trigger
>
> that should dump all tasks and their backtraces -
> including the hung
> rosegardensequencer task.
>
I had similar symptoms here (FC4) and cured them with
attached patch against linux/2.6.15/rt18/kernel/softirq.c.
Its in rt19 and ++ i think.

Karsten






___________________________________________________________
Telefonate ohne weitere Kosten vom PC zum PC: http://messenger.yahoo.de


Attachments:
patch_rt-18___tasklet_common_schedule.diff (1.08 kB)
2951293989-patch_rt-18___tasklet_common_schedule.diff

2006-03-08 06:54:11

by Fernando Lopez-Lezcano

[permalink] [raw]
Subject: Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs

On Tue, 2006-03-07 at 23:06 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[email protected]> wrote:
>
> > The symptoms are as follows:
> > - start jack using qjackctl
> > - start qsynth (gui front end for fluidsynth, a synth)
> > - start rosegarden (midi sequencer and audio recorder)
> > - load a midi file into rosegarden
> > - midi file plays successfully
> > - close rosegarden
> > at this point one of the threads of rosegarden fails to exit and stays
> > forever in the process list, in a ps axuw it shows as:
> >
> > nando 5484 0.0 0.0 0 0 pts/1 D 13:32 0:00 [rosegardenseque]
> >
> > Anything else that I try to stop that touches the alsa sequencer never
> > dies (qjackctl, vkeybd, qsynth, etc). Anything I try to start that tries
> > to use it does not start. This happened with two widely different
>
> could you get a tasklist-dump? It's either SysRq-T, or:
>
> echo t > /proc/sysrq-trigger
>
> that should dump all tasks and their backtraces - including the hung
> rosegardensequencer task.

I'll try to do that tomorrow.

In the meanwhile I think this might be the same problem that was solved
in -rt20? (building it as I type). I have been trying to catch up to the
subset of lkml that I manage to read and found some reports that
apparently point to similar problems with the alsa sequencer (Rui, in
particular, and then Karsten's post in this thread).

-- Fernando


2006-03-08 11:46:30

by Takashi Iwai

[permalink] [raw]
Subject: Re: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs

At Tue, 07 Mar 2006 22:53:55 -0800,
Fernando Lopez-Lezcano wrote:
>
> On Tue, 2006-03-07 at 23:06 +0100, Ingo Molnar wrote:
> > * Fernando Lopez-Lezcano <[email protected]> wrote:
> >
> > > The symptoms are as follows:
> > > - start jack using qjackctl
> > > - start qsynth (gui front end for fluidsynth, a synth)
> > > - start rosegarden (midi sequencer and audio recorder)
> > > - load a midi file into rosegarden
> > > - midi file plays successfully
> > > - close rosegarden
> > > at this point one of the threads of rosegarden fails to exit and stays
> > > forever in the process list, in a ps axuw it shows as:
> > >
> > > nando 5484 0.0 0.0 0 0 pts/1 D 13:32 0:00 [rosegardenseque]
> > >
> > > Anything else that I try to stop that touches the alsa sequencer never
> > > dies (qjackctl, vkeybd, qsynth, etc). Anything I try to start that tries
> > > to use it does not start. This happened with two widely different
> >
> > could you get a tasklist-dump? It's either SysRq-T, or:
> >
> > echo t > /proc/sysrq-trigger
> >
> > that should dump all tasks and their backtraces - including the hung
> > rosegardensequencer task.
>
> I'll try to do that tomorrow.

That'll be helpful. A deadlocked mutex could be easily found by a
backtrace.


Takashi

2006-03-08 11:49:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs


* Takashi Iwai <[email protected]> wrote:

> > > could you get a tasklist-dump? It's either SysRq-T, or:
> > >
> > > echo t > /proc/sysrq-trigger
> > >
> > > that should dump all tasks and their backtraces - including the hung
> > > rosegardensequencer task.
> >
> > I'll try to do that tomorrow.
>
> That'll be helpful. A deadlocked mutex could be easily found by a
> backtrace.

furthermore, in the -rt kernel (if PREEMPT_RT and DEBUG_DEADLOCKS is
enabled), all locks will also be printed, and their owners, and the site
where they got acquired, etc. A deadlocked mutex/semaphore should in
fact be detected by the kernel, if those options are enabled.

Ingo

2006-03-08 13:36:37

by Karsten Wiese

[permalink] [raw]
Subject: RE: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs

[ 998.876000] aconnect [f5867010]R C013CF4C [on rq #0] 0 2708 2272 (NOTLB)
[ 998.876000] f3ca1d1c f5867010 c0452ee0 c013cf4c f3ca1000 00000000 f5867148 c0494740
[ 998.876000] f3ca1d08 f5867010 8ec9c300 000000e8 003d0900 00000000 f3ca1000 0002aa75
[ 998.876000] 00000000 f3ca1d44 c034a6e5 c01279b9 c0494740 f3ca1d58 0002aa73 60000000
[ 998.876000] Call Trace:
[ 998.876000] [<c034a6e5>] schedule+0xa5/0x140 (40)
[ 998.876000] [<c034b0cc>] schedule_timeout+0x4c/0xc0 (52)
[ 998.876000] [<c034b17a>] schedule_timeout_uninterruptible+0x1a/0x20 (8)
[ 998.876000] [<c0128bd8>] msleep+0x28/0x40 (12)
[ 998.876000] [<c0123ddc>] tasklet_kill+0x3c/0x70 (16)
[ 998.876000] [<f88372a3>] snd_rawmidi_drop_output+0x43/0x90 [snd_rawmidi] (32)
[ 998.876000] [<f8837372>] snd_rawmidi_drain_output+0x82/0x180 [snd_rawmidi] (68)
[ 998.876000] [<f8a4c5a7>] midisynth_unuse+0x37/0x80 [snd_seq_midi] (32)
[ 998.876000] [<f8aafb3e>] unsubscribe_port+0x4e/0x100 [snd_seq] (36)
[ 998.876000] [<f8ab0004>] snd_seq_port_disconnect+0x134/0x180 [snd_seq] (60)
[ 998.876000] [<f8aa9eb6>] snd_seq_ioctl_unsubscribe_port+0xe6/0x140 [snd_seq] (132)
[ 998.876000] [<f8aaaf9a>] snd_seq_do_ioctl+0x5a/0xd0 [snd_seq] (36)
[ 998.876000] [<f8aab039>] snd_seq_ioctl+0x29/0x60 [snd_seq] (24)
[ 998.876000] [<c01838a8>] do_ioctl+0x38/0xa0 (36)
[ 998.876000] [<c0183ad0>] vfs_ioctl+0x60/0x200 (40)
[ 998.876000] [<c0183cea>] sys_ioctl+0x7a/0x90 (40)
[ 998.876000] [<c01032db>] sysenter_past_esp+0x54/0x75 (-4020)
[ 998.876000] ---------------------------
[ 998.876000] | preempt count: 00000002 ]
[ 998.876000] | 2-level deep critical section nesting:
[ 998.876000] ----------------------------------------
[ 998.876000] .. [<c013cf4c>] .... add_preempt_count+0x1c/0x20
[ 998.876000] .....[<c0349f7b>] .. ( <= __schedule+0x4b/0x710)
[ 998.876000] .. [<c013cf4c>] .... add_preempt_count+0x1c/0x20
[ 998.876000] .....[<c034a021>] .. ( <= __schedule+0xf1/0x710)
[ 998.876000]
[ 998.876000] ------------------------------
[ 998.876000] | showing all locks held by: | (aconnect/2708 [f5867010, 116]):
[ 998.876000] ------------------------------
[ 998.876000]
[ 998.876000] #001: [f0ee1264] {(struct rw_semaphore *)(&grp->list_mutex)}
[ 998.876000] ... acquired at: snd_seq_port_disconnect+0x27/0x180 [snd_seq]
[ 998.876000]
[ 998.876000] #002: [f0ee16f8] {(struct rw_semaphore *)(&grp->list_mutex)}
[ 998.876000] ... acquired at: snd_seq_port_disconnect+0x3a/0x180 [snd_seq]
[ 998.876000]
[ 998.876000]


Attachments:
dmlog_sysrq_t_exc (2.56 kB)
3110844057-dmlog_sysrq_t_exc

2006-03-08 14:07:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs


* karsten wiese <[email protected]> wrote:

> ALSA Midi sequencer uses tasklets. there are 2 kinds of
> them: lo and hi.
> In rt-18 PREEMPT-RT, tasklet_hi_schedule() didn't work,
> 'cause it woke up tasklet_lo's thread.
> Thats what my patch fixed.

ah - i thought that -rt18 had your fix already - but indeed it's -rt19
that added it. So it's probably the same bug.

Ingo

2006-03-08 17:55:29

by Fernando Lopez-Lezcano

[permalink] [raw]
Subject: Re: [Alsa-devel] Re: 2.6.15-rt18, alsa sequencer, rosegarden -> alsa hangs

On Wed, 2006-03-08 at 15:06 +0100, Ingo Molnar wrote:
> * karsten wiese <[email protected]> wrote:
>
> > ALSA Midi sequencer uses tasklets. there are 2 kinds of
> > them: lo and hi.
> > In rt-18 PREEMPT-RT, tasklet_hi_schedule() didn't work,
> > 'cause it woke up tasklet_lo's thread.
> > Thats what my patch fixed.
>
> ah - i thought that -rt18 had your fix already - but indeed it's -rt19
> that added it. So it's probably the same bug.

I agree, I just booted into -rt20 and the problem seems to be gone. Let
me know if you still want the trace, I can always boot again into the
buggy kernel.

Thanks for the help!
-- Fernando