While it is not yet clear what form the final soft real-time
implementation will take, we should complete the partial implementation
of SCHED_ISO that is in 2.6.11-rc2-mm1.
Thanks to Alex Nyberg and Zwane Mwaikambo for debugging help.
Cheers,
Con
Con Kolivas <[email protected]> writes:
> While it is not yet clear what form the final soft real-time
> implementation will take, we should complete the partial
> implementation of SCHED_ISO that is in 2.6.11-rc2-mm1.
I finally had a chance to try this today. I applied a slightly
different patch (2.6.11-rc2-iso3.diff) on top of patch-2.6.11-rc2. I
tried to use 2.6.11-rc2-mm2, but could not due to conflicts with other
scheduler updates.
It is not clear whether the realtime threads are running in the new
scheduler class. Checking with schedtool yields odd results.
(Before, my old schedtool always said "POLICY I: SCHED_ISO".)
[joq@sulphur] jack_test/ $ pst jackd
2173 2173 TS - 0 19 0 0.0 SLs rt_sigsuspend jackd
2174 2174 ? 21 0 60 0 0.0 SL - jackd
2175 2175 TS - 0 23 0 0.0 SL rt_sigsuspend jackd
2176 2176 TS - 0 23 0 0.0 SL - jackd
2177 2177 ? 20 0 59 0 0.0 SL syscall_call jackd
2178 2178 ? 10 0 49 0 1.7 SL - jackd
[joq@sulphur] jack_test/ $ schedtool 2174 2176 2177 2178
PID 2174: PRIO 21, POLICY (null) , NICE 0
PID 2176: PRIO 0, POLICY N: SCHED_NORMAL, NICE 0
PID 2177: PRIO 20, POLICY (null) , NICE 0
PID 2178: PRIO 10, POLICY (null) , NICE 0
The results of the first run indicate something is badly wrong. It is
quite possible that I got confused and messed up the build somehow.
http://www.joq.us/jack/benchmarks/sched-iso3/jack_test3-2.6.11-rc2-q1-200501311225.log
http://www.joq.us/jack/benchmarks/sched-iso3/jack_test3-2.6.11-rc2-q1-200501311225.png
Loading the realtime-lsm and then running with SCHED_FIFO *does* work
as expected on this kernel. I should retry the test with *exactly*
the expected patch sequence. What would that be?
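(For reference, the SCHED_FIFO path that does work is roughly this; a
simplified sketch with an illustrative priority, not jackd's actual
code. Without the realtime-lsm granting the capability, the call fails
with EPERM:)

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Simplified sketch of an audio thread requesting SCHED_FIFO; the
 * priority value is illustrative. Unprivileged users need something
 * like the realtime-lsm, or the call returns EPERM and the thread
 * stays in the default class. */
static int make_realtime(pthread_t thread, int priority)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    return pthread_setschedparam(thread, SCHED_FIFO, &sp);
}

int main(void)
{
    int err = make_realtime(pthread_self(), 10);

    if (err)
        fprintf(stderr, "SCHED_FIFO refused: %s\n", strerror(err));
    else
        printf("running SCHED_FIFO at priority 10\n");
    return 0;
}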
--
joq
Jack O'Quin wrote:
> Con Kolivas <[email protected]> writes:
>
>
>>While it is not yet clear what form the final soft real-time
>>implementation will take, we should complete the partial
>>implementation of SCHED_ISO that is in 2.6.11-rc2-mm1.
>
>
> I finally had a chance to try this today. I applied a slightly
> different patch (2.6.11-rc2-iso3.diff) on top of patch-2.6.11-rc2. I
> tried to use 2.6.11-rc2-mm2, but could not due to conflicts with other
> scheduler updates.
>
> It is not clear whether the realtime threads are running in the new
> scheduler class. Checking with schedtool yields odd results.
> (Before, my old schedtool always said "POLICY I: SCHED_ISO".)
>
> [joq@sulphur] jack_test/ $ pst jackd
> 2173 2173 TS - 0 19 0 0.0 SLs rt_sigsuspend jackd
> 2174 2174 ? 21 0 60 0 0.0 SL - jackd
> 2175 2175 TS - 0 23 0 0.0 SL rt_sigsuspend jackd
> 2176 2176 TS - 0 23 0 0.0 SL - jackd
> 2177 2177 ? 20 0 59 0 0.0 SL syscall_call jackd
> 2178 2178 ? 10 0 49 0 1.7 SL - jackd
> [joq@sulphur] jack_test/ $ schedtool 2174 2176 2177 2178
> PID 2174: PRIO 21, POLICY (null) , NICE 0
> PID 2176: PRIO 0, POLICY N: SCHED_NORMAL, NICE 0
> PID 2177: PRIO 20, POLICY (null) , NICE 0
> PID 2178: PRIO 10, POLICY (null) , NICE 0
They're SCHED_ISO_FIFO, which schedtool doesn't know about.
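All schedtool does is ask the kernel for the policy number and look the
name up in a table compiled into the tool, roughly like this (a minimal
sketch, not the real schedtool):

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Minimal sketch of schedtool's policy query. A policy number the
 * table has no name for, such as the new SCHED_ISO classes, falls
 * through to the unknown case, hence output like "(null)". */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "N: SCHED_NORMAL";
    case SCHED_FIFO:  return "F: SCHED_FIFO";
    case SCHED_RR:    return "R: SCHED_RR";
    default:          return NULL;  /* unknown to this tool */
    }
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        pid_t pid = atoi(argv[i]);
        const char *name = policy_name(sched_getscheduler(pid));

        printf("PID %d: POLICY %s\n", pid, name ? name : "(null)");
    }
    return 0;
}

So the "(null)" entries are expected until schedtool learns the new
policy numbers.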
> The results of the first run indicate something is badly wrong. It is
> quite possible that I got confused and messed up the build somehow.
>
> http://www.joq.us/jack/benchmarks/sched-iso3/jack_test3-2.6.11-rc2-q1-200501311225.log
> http://www.joq.us/jack/benchmarks/sched-iso3/jack_test3-2.6.11-rc2-q1-200501311225.png
>
> Loading the realtime-lsm and then running with SCHED_FIFO *does* work
> as expected on this kernel. I should retry the test with *exactly*
> the expected patch sequence. What would that be?
Shouldn't matter. There must still be something wrong with my code...
sigh. I'll look into it at some stage, but there doesn't seem to be
much point.
Cheers,
Con
Jack O'Quin wrote:
> Loading the realtime-lsm and then running with SCHED_FIFO *does* work
> as expected on this kernel. I should retry the test with *exactly*
> the expected patch sequence. What would that be?
Sure enough I found the bug in less than 5 mins, and it would definitely
cause this terrible behaviour.
A silly bracket transposition error on my part :P
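For illustration, the general shape of that class of bug, as a made-up,
compilable fragment with invented predicates (not the actual patch):

#include <stdio.h>

/* Hypothetical illustration of a transposed bracket; the predicates
 * are invented and this is not the actual SCHED_ISO code. */
static int iso_task(int iso) { return iso; }
static int rt_task(int rt)   { return rt; }

int main(void)
{
    int iso = 1, rt = 0, over_limit = 0;  /* an ISO task under quota */

    /* Intended: throttle an ISO/RT task only when over the cpu limit. */
    int intended = (iso_task(iso) || rt_task(rt)) && over_limit;

    /* One bracket transposed: now every ISO task is throttled,
     * over quota or not. */
    int buggy = iso_task(iso) || (rt_task(rt) && over_limit);

    printf("intended throttle=%d, buggy throttle=%d\n", intended, buggy);
    return 0;
}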
Cheers,
Con
Con Kolivas <[email protected]> writes:
> Jack O'Quin wrote:
>> Loading the realtime-lsm and then running with SCHED_FIFO *does* work
>> as expected on this kernel. I should retry the test with *exactly*
>> the expected patch sequence. What would that be?
>
> Sure enough I found the bug in less than 5 mins, and it would
> definitely cause this terrible behaviour.
>
> A silly bracket transposition error on my part :P
Cool. I'll try that right away.
--
joq
Con Kolivas <[email protected]> writes:
> Sure enough I found the bug in less than 5 mins, and it would
> definitely cause this terrible behaviour.
>
> A silly bracket transposition error on my part :P
The corrected version works noticeably better, but still nowhere near
as well as SCHED_FIFO. The first run had a cluster of really bad
xruns. The second and third were much better, but still with numerous
small xruns.
http://www.joq.us/jack/benchmarks/sched-iso-fix/
With a compile running in the background it was a complete failure.
Some kind of big xrun storm triggered a collapse on every attempt.
http://www.joq.us/jack/benchmarks/sched-iso-fix+compile/
The summary statistics are mixed. The delay_max is noticeably better
than before, but still much worse than SCHED_FIFO. But, the xruns are
really bad news...
http://www.joq.us/jack/benchmarks/.SUMMARY
# sched-iso-fix
Delay Maximum . . . . . . . . : 33894 usecs
Delay Maximum . . . . . . . . : 745 usecs
Delay Maximum . . . . . . . . : 341 usecs
# sched-iso
Delay Maximum . . . . . . . . : 21410 usecs
Delay Maximum . . . . . . . . : 36830 usecs
Delay Maximum . . . . . . . . : 4062 usecs
# sched-fifo
Delay Maximum . . . . . . . . : 347 usecs
Delay Maximum . . . . . . . . : 277 usecs
Delay Maximum . . . . . . . . : 246 usecs
Delay Maximum . . . . . . . . : 199 usecs
Delay Maximum . . . . . . . . : 261 usecs
Delay Maximum . . . . . . . . : 305 usecs
# sched-iso-fix+compile
Delay Maximum . . . . . . . . : 14549 usecs
Delay Maximum . . . . . . . . : 38961 usecs
Delay Maximum . . . . . . . . : 26904 usecs
# sched-iso+compile
Delay Maximum . . . . . . . . : 98909 usecs
Delay Maximum . . . . . . . . : 39414 usecs
Delay Maximum . . . . . . . . : 40294 usecs
Delay Maximum . . . . . . . . : 217192 usecs
Delay Maximum . . . . . . . . : 156989 usecs
# sched-fifo+compile
Delay Maximum . . . . . . . . : 285 usecs
Delay Maximum . . . . . . . . : 269 usecs
Delay Maximum . . . . . . . . : 277 usecs
Delay Maximum . . . . . . . . : 569 usecs
Delay Maximum . . . . . . . . : 461 usecs
Delay Maximum . . . . . . . . : 405 usecs
Delay Maximum . . . . . . . . : 286 usecs
Delay Maximum . . . . . . . . : 579 usecs
# sched-iso-fix
XRUN Count . . . . . . . . . : 26
XRUN Count . . . . . . . . . : 24
XRUN Count . . . . . . . . . : 17
# sched-iso
XRUN Count . . . . . . . . . : 15
XRUN Count . . . . . . . . . : 17
XRUN Count . . . . . . . . . : 5
# sched-fifo
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
# sched-iso-fix+compile
XRUN Count . . . . . . . . . : 22
XRUN Count . . . . . . . . . : 44
XRUN Count . . . . . . . . . : 39
# sched-iso+compile
XRUN Count . . . . . . . . . : 44
XRUN Count . . . . . . . . . : 46
XRUN Count . . . . . . . . . : 45
XRUN Count . . . . . . . . . : 27
XRUN Count . . . . . . . . . : 101
# sched-fifo+compile
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 4
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
XRUN Count . . . . . . . . . : 0
--
joq
Jack O'Quin wrote:
> Con Kolivas <[email protected]> writes:
>
>
>>Sure enough I found the bug in less than 5 mins, and it would
>>definitely cause this terrible behaviour.
>>
>>A silly bracket transposition error on my part :P
>
>
> The corrected version works noticeably better, but still nowhere near
> as well as SCHED_FIFO. The first run had a cluster of really bad
> xruns. The second and third were much better, but still with numerous
> small xruns.
>
> http://www.joq.us/jack/benchmarks/sched-iso-fix/
>
> With a compile running in the background it was a complete failure.
> Some kind of big xrun storm triggered a collapse on every attempt.
>
> http://www.joq.us/jack/benchmarks/sched-iso-fix+compile/
>
> The summary statistics are mixed. The delay_max is noticeably better
> than before, but still much worse than SCHED_FIFO. But, the xruns are
> really bad news...
Excellent.
Believe it or not, these look like good results to me. Your XRUNS are
happening when the DSP load is >70%, which is the iso_cpu cutoff. Try
setting iso_cpu to 90%:
echo 90 > /proc/sys/kernel/iso_cpu
Cheers,
Con
> Jack O'Quin wrote:
>> The corrected version works noticeably better, but still nowhere near
>> as well as SCHED_FIFO. The first run had a cluster of really bad
>> xruns. The second and third were much better, but still with numerous
>> small xruns.
>>
>> http://www.joq.us/jack/benchmarks/sched-iso-fix/
>>
>> With a compile running in the background it was a complete failure.
>> Some kind of big xrun storm triggered a collapse on every attempt.
>>
>> http://www.joq.us/jack/benchmarks/sched-iso-fix+compile/
>>
>> The summary statistics are mixed. The delay_max is noticeably better
>> than before, but still much worse than SCHED_FIFO. But, the xruns are
>> really bad news...
Con Kolivas <[email protected]> writes:
> Believe it or not, these look like good results to me. Your XRUNS are
> happening when the DSP load is >70%, which is the iso_cpu cutoff. Try
> setting iso_cpu to 90%:
>
> echo 90 > /proc/sys/kernel/iso_cpu
I ran them again with that setting. But, don't forget the large
number of xruns before, even running without the compiles in the
background. There are still way too many of those, although the third
run was clean. If you can get them all to work like that, we'll
really have something.
http://www.joq.us/jack/benchmarks/sched-iso-fix.90
With a compile running in the background, the entire run did not fail
completely as it had at 70%. But there are still way too many xruns.
http://www.joq.us/jack/benchmarks/sched-iso-fix.90+compile
I moved a bunch of directories testing older prototypes to a .old
subdirectory, so they no longer clutter up the summary statistics.
http://www.joq.us/jack/benchmarks/.SUMMARY
The fact that the results did improve with the 90% setting suggests
that there may be a bug in your throttling or time accounting. The
DSP load for this test should hover around 50% when things are working
properly. It should never hit a 70% limit, not even momentarily. The
background compile should not affect that, either.
Something seems to be causing scheduling delays when the sound card
interrupt causes jackd to become runnable. Ingo's nice(-20) patches
seem to have the same problem, but his RLIMIT_RT_CPU version does not.
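(A crude userspace probe for this kind of delay, in the spirit of the
test but not the jack_test harness itself: sleep on a 1 ms timer and
record how late each wakeup arrives. Under SCHED_FIFO the lateness
stays small here; a throttled or mis-scheduled thread shows spikes:)

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec req = { 0, 1000000 };  /* 1 ms */
    struct timespec before, after;
    long worst_ns = 0;

    for (int i = 0; i < 1000; i++) {
        clock_gettime(CLOCK_MONOTONIC, &before);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &after);

        /* how much later than the requested 1 ms did we run? */
        long late = (after.tv_sec - before.tv_sec) * 1000000000L
                  + (after.tv_nsec - before.tv_nsec) - req.tv_nsec;
        if (late > worst_ns)
            worst_ns = late;
    }
    printf("worst wakeup lateness: %ld us\n", worst_ns / 1000);
    return 0;
}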
This is still not working well enough for JACK users. Long and
variable trigger latencies after hardware interrupts are deadly to any
serious realtime application.
--
joq
Jack O'Quin wrote:
> The fact that the results did improve with the 90% setting suggests
> that there may be a bug in your throttling or time accounting. The
> DSP load for this test should hover around 50% when things are working
> properly. It should never hit a 70% limit, not even momentarily. The
> background compile should not affect that, either.
>
> Something seems to be causing scheduling delays when the sound card
> interrupt causes jackd to become runnable. Ingo's nice(-20) patches
> seem to have the same problem, but his RLIMIT_RT_CPU version does not.
Good work. It looks like you're probably right about the accounting. It
may be as simple as the fact that we get rescheduled on the timer tick,
and that time ends up being over-accounted, since the accounting happens
only at the scheduler tick. A test run setting iso_cpu to 100% should
tell you whether it's accounting related; however, the RLIMIT_RT_CPU
patch is accounted in a similar way, so I'm not sure there isn't another
bug hanging around. I'm afraid that on my hardware it has been behaving
just like SCHED_FIFO for some time, which is why I've been hanging on
your results. You're not obliged to do anything (obviously), but the
100% run should help discriminate where the problem is.
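To make the tick-accounting suspicion concrete, here is a toy model
(invented numbers, nothing from the actual patch) of how tick sampling
can charge a briefly-running task for whole ticks:

#include <stdio.h>

/* Toy model: the audio thread is woken by the sound card IRQ and runs
 * for 100us just before each timer tick, so it is always the current
 * task when the tick fires and gets charged the whole tick. */
#define TICK_US 1000
#define RUN_US   100
#define TICKS   1000

int main(void)
{
    long true_us = 0, charged_us = 0;

    for (int i = 0; i < TICKS; i++) {
        true_us    += RUN_US;   /* what the task actually used */
        charged_us += TICK_US;  /* what tick sampling records  */
    }
    printf("true load: %.0f%%, tick-accounted load: %.0f%%\n",
           100.0 * true_us / (TICKS * (double)TICK_US),
           100.0 * charged_us / (TICKS * (double)TICK_US));
    return 0;
}

In this worst case a 10% real load reads as 100%, which would sail past
any iso_cpu cutoff.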
Since I've come this far with the code, I'll have another look for any
other obvious bugs.
Cheers,
Con
Con Kolivas <[email protected]> writes:
> Good work. It looks like you're probably right about the accounting.
> It may be as simple as the fact that we get rescheduled on the timer
> tick, and that time ends up being over-accounted, since the accounting
> happens only at the scheduler tick. A test run setting iso_cpu to 100%
> should tell you whether it's accounting related; however, the
> RLIMIT_RT_CPU patch is accounted in a similar way, so I'm not sure
> there isn't another bug hanging around.
> I'm afraid that on my hardware it has been behaving just like
> SCHED_FIFO for some time, which is why I've been hanging on your
> results.
My guess is that most of this test fits inside that huge cache of
yours, making it run much faster than on my system. You probably need
to increase the number of clients to get comparable results.
When you say just like SCHED_FIFO, do you mean completely clean? Or
are you still getting unexplained xruns? If that's the case, we need
to figure out why and eliminate them.
The reason I can measure an effect here is that the test is heavy
enough to stress my system and the system is RT-clean enough for
SCHED_FIFO to work properly. (That's no surprise, I've been running
it that way for years.)
> You're not obliged to do anything (obviously), but the 100% run
> should help discriminate where the problem is.
I don't mind. It's the main way I can help. I just get busy some of
the time.
It did work better. On the first run, there were a couple of real bad
xruns starting up. But, the other two runs look fairly clean.
http://www.joq.us/jack/benchmarks/sched-iso-fix.100
With a compile running, bad xruns and really long delays become a
serious problem again.
http://www.joq.us/jack/benchmarks/sched-iso-fix.100+compile
Comparing the summary statistics with the 90% run suggests that the
same problems occur in both cases, but not as often at 100%.
http://www.joq.us/jack/benchmarks/.SUMMARY
With these latency demands, the system can't ever pick the wrong
thread on exit from even a single interrupt, or we're screwed. I am
pretty well convinced this is not happening reliably (except with
SCHED_FIFO).
--
joq
Jack O'Quin wrote:
> Con Kolivas <[email protected]> writes:
>
>
>>Good work. It looks like you're probably right about the accounting.
>>It may be as simple as the fact that we get rescheduled on the timer
>>tick, and that time ends up being over-accounted, since the accounting
>>happens only at the scheduler tick. A test run setting iso_cpu to 100%
>>should tell you whether it's accounting related; however, the
>>RLIMIT_RT_CPU patch is accounted in a similar way, so I'm not sure
>>there isn't another bug hanging around.
>
>
>>I'm afraid that on my hardware it has been behaving just like
>>SCHED_FIFO for some time, which is why I've been hanging on your
>>results.
>
>
> My guess is that most of this test fits inside that huge cache of
> yours, making it run much faster than on my system. You probably need
> to increase the number of clients to get comparable results.
Bah. Increasing the clients from 14 to 20, the script just fails in
some meaningless way:
Killed
[con@laptop jack_test4.1]$ [1/1] jack_test4_client (17/20) stopped.
[1/1] jack_test4_client (18/20) stopped.
./run.sh: line 153: 7504 Broken pipe ${CMD} >>${LOG} 2>&1
[1/1] jack_test4_client ( 2/20) stopped.
./run.sh: line 153: 7507 Broken pipe ${CMD} >>${LOG} 2>&1
even before it starts :(
>
> When you say just like SCHED_FIFO, do you mean completely clean? Or
> are you still getting unexplained xruns? If that's the case, we need
> to figure out why and eliminate them.
On my P4, with the results I posted, I am getting no xruns whatsoever
with either SCHED_FIFO or ISO. As for the Pentium M, I've given up
trying to use it for latency runs, since even with everything shut down
and the filesystem journal off, running SCHED_FIFO I get 8 ms peaks
every 20 seconds. I'll keep blaming reiserfs for that one. Only dropping
to single-user mode and unmounting filesystems gets rid of them.
> The reason I can measure an effect here is that the test is heavy
> enough to stress my system and the system is RT-clean enough for
> SCHED_FIFO to work properly. (That's no surprise, I've been running
> it that way for years.)
Yeah, I understand. I'm kinda stuck with hardware that either doesn't
have a problem or has an installation too flawed to use.
> It did work better. On the first run, there were a couple of real bad
> xruns starting up. But, the other two runs look fairly clean.
>
> http://www.joq.us/jack/benchmarks/sched-iso-fix.100
>
> With a compile running, bad xruns and really long delays become a
> serious problem again.
>
> http://www.joq.us/jack/benchmarks/sched-iso-fix.100+compile
>
> Comparing the summary statistics with the 90% run suggests that the
> same problems occur in both cases, but not as often at 100%.
>
> http://www.joq.us/jack/benchmarks/.SUMMARY
>
> With these latency demands, the system can't ever pick the wrong
> thread on exit from even a single interrupt, or we're screwed. I am
> pretty well convinced this is not happening reliably (except with
> SCHED_FIFO).
Looking at the code, I see some bias towards keeping the CPU count too
high (it decays too slowly), but your results confirm that a bigger
problem definitely exists. At 100% it should behave the same as
SCHED_FIFO without mlock, and in your test it does not. I simply need
to look at my code harder.
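As a toy illustration of what "decays too slowly" can mean (a made-up
constant, not the patch's actual numbers): with a 127/128 per-tick
decay, a burst takes about 89 ticks to fall by half, so the count can
sit above the cutoff long after the load has stopped.

#include <stdio.h>

/* Toy decay model with an invented constant: the per-task cpu count
 * is multiplied by 127/128 each scheduler tick. Prints how many ticks
 * a burst takes to decay to half its value. */
int main(void)
{
    double count = 100.0;  /* percent, right after a cpu burst */
    int ticks = 0;

    while (count > 50.0) {
        count *= 127.0 / 128.0;
        ticks++;
    }
    printf("half-life: %d ticks\n", ticks);  /* 89 */
    return 0;
}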
Cheers,
Con