2006-11-29 08:57:13

by tike64

Subject: realtime-preempt and arm

Hi all,

I'm trying the realtime-preempt patch-2.6.18-rt6 on an
lh7a400 ARM system with little success. In a test
program I request a 5 ms timeout with select() but get
20 ms on average and 26 ms at most. When the framebuffer
scrolls, the maximum delay goes up to 59 ms. With a
vanilla kernel the corresponding figures are 10 ms
(because of tick resolution?), 11 ms and 39 ms.

My question is: is the realtime-preempt patch supposed
to work on the ARM architecture and/or without a
high-resolution timer (which the lh7a40x seems to lack)
at all, or should I just try to be more clever?

Relevant code:

====
prio.sched_priority = 99;
if (sched_setscheduler(0, SCHED_RR, &prio) < 0) ...
if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) ...
while (1) {
t = raw_timer();
tv.tv_usec = 5000;
tv.tv_sec = 0;
select(0, 0, 0, 0, &tv);
t = raw_timer() - t;
if (max_t < t) max_t = t;
if (min_t > t) min_t = t;
avg_t += t;
++n;
if (n < 100) continue;
printf("%i revs; min: %i max: %i avg: %i\n",
n,
min_t,
max_t,
(avg_t + n / 2) / n);
====

Relevant config: PREEMPT_RT, PREEMPT_SOFTIRQS,
PREEMPT_HARDIRQS

I didn't enable HIGH_RES_TIMERS because the lh7a40x seems
not to support it.

--

tike






2006-11-30 15:57:14

by junjie cai

Subject: Re: realtime-preempt and arm

Hi,

Without High Resolution Timer support,
the timer resolution won't change.
With high-resolution timers supported,
our arm926-based board gets a resolution of about 40~50us.
There is code you can use as a reference; maybe you should just try to implement it.

JFI, Thanks.

From: tike64 <[email protected]>
Subject: realtime-preempt and arm
Date: Wed, 29 Nov 2006 00:57:05 -0800 (PST)

> Hi all,
>
> I'm trying the realtime-preempt patch-2.6.18-rt6 on
> lh7a400 arm system with little success. In a test
> program I try 5 ms timeout with select() but get 20 ms
> avg or 26 ms max. When the framebuffer scrolls, the
> max delay goes up to 59 ms. With a vanilla kernel I
> get 10 ms (because of tick resolution?), 11 ms and 39
> ms.
>
> My question is, is the realtime-preempt patch supposed
> to work on arm architecture and/or without high
> resolution timer (which lh7a40x seems to lack) at all
> or should I just try to be more clever.
>
> Relevant code:
>
> ====
> prio.sched_priority = 99;
> if (sched_setscheduler(0, SCHED_RR, &prio) < 0) ...
> if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) ...
> while (1) {
> t = raw_timer();
> tv.tv_usec = 5000;
> tv.tv_sec = 0;
> select(0, 0, 0, 0, &tv);
> t = raw_timer() - t;
> if (max_t < t) max_t = t;
> if (min_t > t) min_t = t;
> avg_t += t;
> ++n;
> if (n < 100) continue;
> printf("%i revs; min: %i max: %i avg: %i\n",
> n,
> min_t,
> max_t,
> (avg_t + n / 2) / n);
> ====
>
> Relevant config: PREEMPT_RT, PREEMPT_SOFTIRQS,
> PREEMPT_HARDIRQS
>
> I didnt' enable HIGH_RES_TIMERS because lh7a40x seems
> not to support it.
>
> --
>
> tike
>

2006-12-01 09:07:07

by tike64

Subject: Re: realtime-preempt and arm

> Hi,
>
> Without the support of High Resolution Timer
> supported, the timer resolution wouldn't change.

Ok, I understand that. I was not expecting more
resolution. I expected only that I would get more
precise 10ms delays. What confuses me is that the
delays roughly doubled.

> With high-resolution-timer supported, our
> arm926-based board could get resolution like
> 40~50us.
> There are codes you can reference ,may be you should
> just try to implement it.

It is good to know that the problem is not the arm
architecture itself. Thanks to you for that.

The problem must be in the lh7a40x specific code or my
configuration. I am not yet convinced enough that high
resolution timer implementation would solve the
problem. I don't need timing resolution finer than
10ms, provided that the FB doesn't blow it up to 60ms.

Could you or someone please give a hint where to look
next, or explain why the lack of a high-resolution
timer would cause this behaviour?

--

tike





2006-12-13 18:36:23

by Steven Rostedt

Subject: Re: realtime-preempt and arm

For -rt issues, please CC Ingo Molnar, and for High Res issues, please
CC Thomas Gleixner.


On Fri, 2006-12-01 at 01:07 -0800, tike64 wrote:
> > Hi,
> >
> > Without the support of High Resolution Timer
> > supported, the timer resolution wouldn't change.
>
> Ok, I understand that. I was not expecting more
> resolution. I expected only that I would get more
> precise 10ms delays. What confuses me is that the
> delays roughly doubled.
>
> > With high-resolution-timer supported, our
> > arm926-based board could get resolution like
> > 40~50us.
> > There are codes you can reference ,may be you should
> > just try to implement it.
>
> It is good to know that the problem is not the arm
> architecture itself. Thanks to you for that.
>
> The problem must be in the lh7a40x specific code or my
> configuration. I am not yet convinced enough that high
> resolution timer implementation would solve the
> problem. I don't need timing resolution finer than
> 10ms providing that FB doesn't blow it up to 60ms.
>
> Could you or someone please give a hint where to look
> next or give an explanation why the lack of high
> resolution timer would behave like that.

Also, have you tried this with nanosleep instead of select?
Select's timeout is just that, a timeout. It's not supposed to be
accurate, as long as it doesn't expire early. The reason I state this
is that select uses a different mechanism than nanosleep, and that can
indeed affect the jitter.

Although without high res enabled you can't get better than jiffy
resolution, you shouldn't get a large jitter either. BTW, using high
res won't help the select anyway. The select uses a normal
schedule_timeout, which means that it's not really expected to time out;
something should wake it up beforehand. This means that the good
old timer wheel (non-hrtimer) is going to do the waking of the process,
so you need to wait for the timer softirq to be scheduled
before your process wakes up. If there's a process with a higher
priority than the timer softirq running, then you need to wait.

Using nanosleep uses the hrtimer code (available even without high
resolution enabled). The hrtimer uses its own timer softirq (softirq-hrtimer),
and it is special. It inherits the priority of the task that created
the timer when the timer goes off. Also, something like nanosleep
won't even use the softirq; it bypasses the softirq altogether
and wakes your process up from the interrupt.

So basically, don't use select for timing.
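
For a fixed-rate loop, one way to act on this is to sleep until an absolute deadline with clock_nanosleep() instead of sleeping for a relative interval. The sketch below is only an illustration and is not code from this thread: timespec_add_ns() and run_periodic() are hypothetical helper names, and it assumes the C library in use provides clock_gettime() and clock_nanosleep() with CLOCK_MONOTONIC (older glibc versions need -lrt at link time). It does not remove the jiffy granularity on a kernel without high-res timers; it only keeps one late wakeup from shifting every later cycle.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: advance *ts by ns, keeping tv_nsec in [0, 1e9).
 * Assumes 0 <= ns < NSEC_PER_SEC so the sum cannot overflow a 32-bit long. */
static void timespec_add_ns(struct timespec *ts, long ns)
{
	ts->tv_nsec += ns;
	while (ts->tv_nsec >= NSEC_PER_SEC) {
		ts->tv_nsec -= NSEC_PER_SEC;
		ts->tv_sec++;
	}
}

/*
 * Hypothetical helper: run `iterations` cycles of `period_ns` each and
 * call step() (if not NULL) once per cycle.
 * Returns 0 on success, otherwise an error number.
 */
static int run_periodic(long period_ns, int iterations, void (*step)(void))
{
	struct timespec next;
	int i, err;

	if (clock_gettime(CLOCK_MONOTONIC, &next) < 0)
		return errno;

	for (i = 0; i < iterations; i++) {
		timespec_add_ns(&next, period_ns);
		/* Absolute deadline: restart the same sleep if a signal interrupts it. */
		do {
			err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					      &next, NULL);
		} while (err == EINTR);
		if (err)
			return err;
		if (step)
			step();
	}
	return 0;
}
```

Calling run_periodic(10000000L, 100, do_work) would request one hundred 10 ms cycles, where do_work is whatever function the application wants to run each period.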

-- Steve


2006-12-14 07:28:10

by tike64

Subject: Re: realtime-preempt and arm

Steven Rostedt <[email protected]> wrote:
> Also, have you tried this with nanosleep instead of select?
> Select's timeout is just that, a timeout. It's not supposed to be
> accurate, as long as it doesn't expire early. The reason I state
> this is that select uses a different mechanism than nanosleep, and
> that can indeed affect the jitter.

Ok, understood; I tried this:

t = raw_timer();
ts.tv_nsec = 5000000;
ts.tv_sec = 0;
nanosleep(&ts, 0);
t = raw_timer() - t;

It is better but I still see 8ms occasional delays when listing
nfs-mounted directories onto FB. And, what is funny, also this version
makes the average delay 20ms as if it made the jiffy 20ms.

> Although without the high res enabled, you can't get better than
> jiffy resolution, you shouldn't get a large jitter either. BTW,
> using high res won't help the select anyway. The select uses a
> normal schedule_timeout, which means that it's not really expected
> to timeout, but something should wake it up beforehand. Which means
> that the good old timer wheel (non-hrtimer) is going to do the
> waking of the process. This means that you need to wait for the
> timer softirq to be scheduled before your process wakes up. If
> there's a process with a higher priority than the timer softirq
> running, then you need to wait.
>
> Using nanosleep uses the hrtimer code (available without the high
> resolutions). The hrtimer uses its own timer softirq
> (softirq-hrtimer), and it is special. It inherits the priority of
> the task that created the timer when the timer goes off. Also,
> something like nanosleep won't even use the softirq, and will
> bypass the softirq altogether, and wake your process up from the
> interrupt.
>
> So basically, don't use select for timing.

Thanks a lot for a thorough explanation. While we are at it, is using
threads then the only option for waiting on IO while keeping ms-accurate
timing? Formerly I have used select with timeouts for this task, but
the timing requirements were not this strict back then, of course.

--

tike





2006-12-14 10:02:45

by Ingo Molnar

Subject: Re: realtime-preempt and arm


* tike64 <[email protected]> wrote:

> Steven Rostedt <[email protected]> wrote:
> > Also, have you tried this with nanosleep instead of select?
> > Select's timeout is just that, a timeout. It's not supposed to be
> > accurate, as long as it doesn't expire early. The reason I state
> > this is that select uses a different mechanism than nanosleep, and
> > that can indeed affect the jitter.
>
> Ok, understood; I tried this:
>
> t = raw_timer();
> ts.tv_nsec = 5000000;
> ts.tv_sec = 0;
> nanosleep(&ts, 0);
> t = raw_timer() - t;
>
> It is better but I still see 8ms occasional delays when listing
> nfs-mounted directories onto FB. And, what is funny, also this version
> makes the average delay 20ms as if it made the jiffy 20ms.

ARM has no high resolution timers support yet in the -rt tree.

Ingo

2006-12-14 10:26:59

by tike64

Subject: Re: realtime-preempt and arm

Ingo Molnar <[email protected]> wrote:
> tike64 <[email protected]> wrote:
> > Ok, understood; I tried this:
> >
> > t = raw_timer();
> > ts.tv_nsec = 5000000;
> > ts.tv_sec = 0;
> > nanosleep(&ts, 0);
> > t = raw_timer() - t;
> >
> > It is better but I still see 8ms occasional delays when listing
> > nfs-mounted directories onto FB. And, what is funny, also this
> > version makes the average delay 20ms as if it made the jiffy 20ms.
>
> ARM has no high resolution timers support yet in the -rt tree.

Yes, but is there a reason why the -rt patch seems to make the 10ms
jiffy 20ms, and why the jitter is so high? I don't need high resolution,
just reasonable jitter of a couple of milliseconds.

--

tike





2006-12-14 12:52:13

by Steven Rostedt

Subject: Re: realtime-preempt and arm



On Thu, 14 Dec 2006, tike64 wrote:

> Ingo Molnar <[email protected]> wrote:
> > tike64 <[email protected]> wrote:
> > > Ok, understood; I tried this:
> > >
> > > t = raw_timer();
> > > ts.tv_nsec = 5000000;
> > > ts.tv_sec = 0;
> > > nanosleep(&ts, 0);
> > > t = raw_timer() - t;
> > >
> > > It is better but I still see 8ms occasional delays when listing
> > > nfs-mounted directories onto FB. And, what is funny, also this
> > > version makes the average delay 20ms as if it made the jiffy 20ms.
> >
> > ARM has no high resolution timers support yet in the -rt tree.
>
> Yes, but is there a reason why the -rt patch seems to make the 10ms
> jiffy 20ms and why the jitter is so high. I don't need high resolution
> but reasonable, a couple of milliseconds, jitter.
>

OK, let me see if I get this right. You have jiffies at 100 HZ, right? So
that means the timer goes off at 10ms intervals. So you are always
seeing a jiffy+1 delay? Well, this unfortunately has to happen, since it's
OK for the timer to be a little over, but it must never be a little under.
Let's add some ASCII graphics to this :)


10ms      20ms                    (n+10)ms  (n+11)ms
|---------+---------+---- .... ---+---------+--->
        ^                         ^
      Start                      End

OK, here we have a timer that should go off in (n)ms. We start between
10ms and 20ms (remember, our resolution is only 10ms). If we just make
the timer go off at 10+n ms in the future, you get the above. But notice,
that the Start was really closer to 20 than to 10, so the End really
didn't go off in (n)ms. It went off in less. So to solve this, we must add
one resolution time to the counter. So we make sure the timer goes off in
(n+1) ms, and not just (n).

Is this what you're seeing?

Note, even with high resolution timers, you still get a resolution+1 time.
But with high resolution timers, that resolution number is much smaller
than 10ms :)
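
As a worked example of that rule, the arithmetic for a 5 ms request at HZ=100 can be sketched as below. This is a simplified model written for illustration and not the kernel's actual conversion code; timeout_to_jiffies(), min_sleep_us() and max_sleep_us() are hypothetical names. The request rounds up to one jiffy, one more jiffy is added as the guard, and so the sleep ends between 10 ms and 20 ms after the call. A loop that calls the sleep again right after waking up is aligned to a tick, so it would see close to the full 20 ms every time, which is consistent with a 20 ms average.

```c
/* Hypothetical helper: jiffies needed to cover timeout_us, rounded up. */
static unsigned long timeout_to_jiffies(unsigned long timeout_us,
					unsigned long hz)
{
	unsigned long tick_us = 1000000UL / hz;

	return (timeout_us + tick_us - 1) / tick_us;
}

/* Shortest sleep in microseconds: the call was made just before a tick,
 * so the guard jiffy is used up almost immediately. */
static unsigned long min_sleep_us(unsigned long timeout_us, unsigned long hz)
{
	return timeout_to_jiffies(timeout_us, hz) * (1000000UL / hz);
}

/* Longest sleep in microseconds: the call was made just after a tick,
 * so the rounded-up jiffies plus the whole guard jiffy must pass. */
static unsigned long max_sleep_us(unsigned long timeout_us, unsigned long hz)
{
	return (timeout_to_jiffies(timeout_us, hz) + 1) * (1000000UL / hz);
}
```

With these definitions, min_sleep_us(5000, 100) is 10000 and max_sleep_us(5000, 100) is 20000, while at HZ=1000 the same request gives 5000 and 6000.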

-- Steve

2006-12-14 14:23:52

by tike64

Subject: Re: realtime-preempt and arm

Steven Rostedt <[email protected]> wrote:
> ...
> it's ok for the timer to be a little over, but it must never be a
> little under.
> ...
> So we make sure the timer goes off in (n+1) ms, and not just (n).

Ok, this makes sense - thanks.

What confuses / confused me is that I have 4 combinations:
without-rt/with-rt X select/nanosleep; I first tried the
without-rt/select combination and right after that with-rt combinations
skipping the without-rt/nanosleep case. The first one was the one (the
only one) which gives me the 10ms average delay. And after your
explanations that fact bugs me even more.

But that is a side issue. The real problem is now: how do I get rid of
the multi-ms jitter?

--

tike





2006-12-14 15:20:25

by Steven Rostedt

Subject: Re: realtime-preempt and arm


On Thu, 14 Dec 2006, tike64 wrote:

> Steven Rostedt <[email protected]> wrote:
> > ...
> > it's ok for the timer to be a little over, but it must never be a
> > little under.
> > ...
> > So we make sure the timer goes off in (n+1) ms, and not just (n).

Oops, that should have read (n+1) 10ms, or +1 res. But you got the point
anyway ;)

>
> Ok, this makes sense - thanks.
>
> What confuses / confused me is that I have 4 combinations:
> without-rt/with-rt X select/nanosleep; I first tried the
> without-rt/select combination and right after that with-rt combinations
> skipping the without-rt/nanosleep case. The first one was the one (the
> only one) which gives me the 10ms average delay. And after your
> explanations that fact bugs me even more.

Actually, I just ran your prog on an ia32 -rt kernel, with highres, and
using select, I get return times of less than 5ms. So this looks like a
bug. On 2.6.17 vanilla, I also got under 5ms. But it might be OK for
select to return early. I'm not sure on this one. But using nanosleep
never returned early on either system.

>
> But that is a side issue. The real problem is now: how do I get rid of
> the multi-ms jitter?
>

So you got a big jitter using nanosleep??? If that's the case, could you
post the times you got. I'll also boot a kernel with the latest -rt patch,
without highres compiled, and see if I can reproduce the same on x86.

-- Steve

2006-12-15 07:15:43

by tike64

Subject: Re: realtime-preempt and arm

Steven Rostedt <[email protected]> wrote:
> So you got a big jitter using nanosleep??? If that's the case, could
> you post the times you got. I'll also boot a kernel with the latest
> -rt patch, without highres compiled, and see if I can reproduce the
> same on x86.

You're very kind! Here you go:

This is from "Linux uclibc 2.6.14.2 #12 PREEMPT" without -rt:

100 revs; min: 19888 max: 20386 avg: 20013
100 revs; min: 19724 max: 20296 avg: 20013
100 revs; min: 19920 max: 20322 avg: 20013
100 revs; min: 19840 max: 20323 avg: 20016
100 revs; min: 10276 max: 42789 avg: 21294
100 revs; min: 10466 max: 34080 avg: 21687
100 revs; min: 10249 max: 30594 avg: 21161
100 revs; min: 10962 max: 34421 avg: 21415
100 revs; min: 10437 max: 31338 avg: 20562
100 revs; min: 11660 max: 29751 avg: 21066
100 revs; min: 10457 max: 30612 avg: 21417
100 revs; min: 10270 max: 37828 avg: 21513

First four lines are with the system otherwise idle. Then I fired 'ls
-Rl /mnt/some/nfs/share' on a framebuffer console.

And the same on a "Linux uclibc 2.6.18-rt6 #19 PREEMPT":

100 revs; min: 19847 max: 20242 avg: 20014
100 revs; min: 19685 max: 20332 avg: 20014
100 revs; min: 19652 max: 20374 avg: 20014
100 revs; min: 19622 max: 20399 avg: 20012
100 revs; min: 19736 max: 26612 avg: 20074
100 revs; min: 19478 max: 21199 avg: 20021
100 revs; min: 19569 max: 21093 avg: 20022
100 revs; min: 19582 max: 20460 avg: 20017
100 revs; min: 19723 max: 20410 avg: 20016
100 revs; min: 19459 max: 24565 avg: 20056
100 revs; min: 19610 max: 24257 avg: 20053
100 revs; min: 19376 max: 26848 avg: 20079
100 revs; min: 19445 max: 26522 avg: 20077
100 revs; min: 19510 max: 22349 avg: 20034
100 revs; min: 19562 max: 20334 avg: 20017

The one to be blamed the most seems to be FB. 'ls ... > /dev/null'
leads to less than 2ms slips.

I'm supposed to make a 10ms control loop, so I could live with a couple
of ms jitter. 7ms is rather high and I think it tells about some
problem which makes one wonder if even higher occasional slips are
possible.

I made my test code visible if you want to take a look: www dot
riihineva dot no-ip dot org uphill public uphill test-rt.c

--

tike





2006-12-15 10:00:52

by Ingo Molnar

Subject: Re: realtime-preempt and arm


* tike64 <[email protected]> wrote:

> I made my test code visible if you want to take a look: www dot
> riihineva dot no-ip dot org uphill public uphill test-rt.c

on x86, with nanosleep i get:

# ./test-rt
100 revs; min: 5026 max: 5099 avg: 5062
100 revs; min: 5031 max: 5105 avg: 5065
100 revs; min: 5021 max: 5096 avg: 5048
100 revs; min: 5014 max: 5080 avg: 5041
100 revs; min: 5015 max: 5072 avg: 5040
100 revs; min: 5018 max: 5075 avg: 5041
100 revs; min: 5021 max: 5091 avg: 5042

with select i get:

# ./test-rt
100 revs; min: 4276 max: 6048 avg: 5181
100 revs; min: 4371 max: 6060 avg: 5438
100 revs; min: 4409 max: 6056 avg: 5338
100 revs; min: 4940 max: 6056 avg: 5468
100 revs; min: 4938 max: 6049 avg: 5398
100 revs; min: 4373 max: 6056 avg: 5279
100 revs; min: 4943 max: 6040 avg: 5068

(HZ=250 on this kernel)

so these results look pretty normal to me. Modified code attached below.
(Change the '#if 1' to '#if 0' to get the select() measurement.)

Ingo

----------------->
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <limits.h>
#include <sched.h>
#include <time.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/mman.h>

/* Current time in microseconds; the value wraps, but differences stay valid. */
static unsigned raw_timer(void)
{
	struct timeval tv;

	gettimeofday(&tv, 0);

	return (unsigned)tv.tv_sec * 1000000u + (unsigned)tv.tv_usec;
}

int main(int argc, char *argv[])
{
	int t, min_t = INT_MAX, max_t = INT_MIN, avg_t = 0, n = 0;
	struct timespec ts;
	struct timeval tv;
	struct sched_param prio;

	/* Highest realtime priority, and lock all pages to avoid page faults. */
	prio.sched_priority = 99;
	if (sched_setscheduler(0, SCHED_RR, &prio) < 0)
		perror("setscheduler failed");
	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0)
		perror("mlockall failed");

	while (1) {
		t = raw_timer();

		ts.tv_nsec = 5000000;
		ts.tv_sec = 0;
#if 1
		nanosleep(&ts, 0);
#else
		tv.tv_usec = 5000;
		tv.tv_sec = 0;
		select(0, 0, 0, 0, &tv);
#endif
		t = raw_timer() - t;
		if (max_t < t) max_t = t;
		if (min_t > t) min_t = t;
		avg_t += t;
		++n;
		if (n < 100)
			continue;

		/* Report and reset the statistics every 100 sleeps. */
		printf("%i revs; min: %i max: %i avg: %i\n",
		       n, min_t, max_t, (avg_t + n / 2) / n);
		fflush(stdout);
		min_t = INT_MAX;
		max_t = INT_MIN;
		avg_t = 0;
		n = 0;
	}
}
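
A possible refinement of the measurement itself: gettimeofday() follows the wall clock, so a clock adjustment during the run could distort a sample. The sketch below is an assumption-labelled alternative and not part of the posted program; raw_timer_us() is a hypothetical name, and it assumes clock_gettime() with CLOCK_MONOTONIC is available (older glibc versions need -lrt).

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <time.h>

/* Hypothetical replacement for raw_timer(): microseconds from the
 * monotonic clock, as a 64-bit value. Returns -1 on failure. */
static int64_t raw_timer_us(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) < 0)
		return -1;

	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}
```

If this were swapped in, the variables holding the timestamps would need to be int64_t as well, with the difference narrowed to int only after the subtraction.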

2006-12-15 13:02:05

by Steven Rostedt

Subject: Re: realtime-preempt and arm


On Fri, 15 Dec 2006, Ingo Molnar wrote:

>
> so these results look pretty normal to me.

Ingo, did you run this with high res turned off? That would simulate his
scenario more closely.

> Modified code attached below.
> (Change the '#if 1' to '#if 0' to get the select() measurement.)

Your code is almost exactly what I did to test! :)

-- Steve

2006-12-16 00:14:35

by Robert Crocombe

Subject: Re: realtime-preempt and arm

root@spanky:~$ uname -r
2.6.19.1-rt15_00

And I'm totally thrilled since this is the first -rt kernel that I've
tried and been able to boot since .16-rt29. Yay!

root@spanky:~$ zcat /proc/config.gz | egrep "HZ.*=y"
CONFIG_HZ_1000=y

100 revs; min: 5008 max: 5034 avg: 5015
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5014 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5019 avg: 5009
100 revs; min: 5008 max: 5013 avg: 5009

quad Opteron running x86_64 Fedora Core 5.

--
Robert Crocombe
[email protected]