2007-08-01 16:42:42

by John Sigler

Subject: Old -rt patches

Hello,

I wrote a Linux app where I need high-resolution timers. I went all the
way and installed the -rt patch, which includes the -hrt patches, as far
as I understand.

Since I could not afford to change kernels with every new release, I
decided to track the 2.6.20 branch (arbitrarily).

At this point I'm using 2.6.20.7-rt8 (-rt8 was the last patch to the
2.6.20 branch). (I see 2.6.20 is already up to .15)

I see a lot of patches going into the -rt patch, but development moved
on to 2.6.21, then 2.6.22, and now 2.6.23, following mainline.

I'm not a kernel hacker, so I don't claim to understand the patches, but
the comments sound like a few bugs are fixed here and there. I can't
tell whether these bugs also exist in previous kernels or only appeared
in newer kernels. Even if I knew, I probably wouldn't have the expertise
to back-port the patch.

I'm seeing weird behavior in my app, when it has been running 2-4 days.
The timers start to act out, and all hell breaks loose. Basically, I
have no idea what's going on...

My question is: is it possible that there is a bug in 2.6.20-rt8 that
has been fixed in subsequent -rt patches (there's a 2.6.21.6-rt21 and a
2.6.22.1-rt9)? In other words, if I give 2.6.21.6-rt21 or 2.6.22.1-rt9 a
spin, is it possible that my weird behavior disappears?

I'm not saying that it's impossible for the bug to be in my app, but the
app is small enough that I'm fairly confident there's no problem there.

Thanks for reading this far,

John


2007-08-06 08:55:56

by John Sigler

Subject: Re: Old -rt patches

John Sigler wrote:

> I wrote a Linux app where I need high-resolution timers. I went all the
> way and installed the -rt patch, which includes the -hrt patches, as far
> as I understand.
>
> Since I could not afford to change kernels with every new release, I
> decided to track the 2.6.20 branch (arbitrarily).
>
> At this point I'm using 2.6.20.7-rt8 (-rt8 was the last patch to the
> 2.6.20 branch). (I see 2.6.20 is already up to .15)
>
> I see a lot of patches going into the -rt patch, but development moved
> on to 2.6.21, then 2.6.22, and now 2.6.23, following mainline.
>
> I'm not a kernel hacker, so I don't claim to understand the patches, but
> the comments sound like a few bugs are fixed here and there. I can't
> tell whether these bugs also exist in previous kernels or only appeared
> in newer kernels. Even if I knew, I probably wouldn't have the expertise
> to back-port the patch.
>
> I'm seeing weird behavior in my app, when it has been running 2-4 days.
> The timers start to act out, and all hell breaks loose. Basically, I
> have no idea what's going on...
>
> My question is: is it possible that there is a bug in 2.6.20-rt8 that
> has been fixed in subsequent -rt patches (there's a 2.6.21.6-rt21 and a
> 2.6.22.1-rt9)? In other words, if I give 2.6.21.6-rt21 or 2.6.22.1-rt9 a
> spin, is it possible that my weird behavior disappears?
>
> I'm not saying that it's impossible for the bug to be in my app, but the
> app is small enough that I'm fairly confident there's no problem there.

Would anyone care to comment?

Perhaps I could also test a different strategy, such as xenomai?
http://www.xenomai.org/

Regards.

2007-08-06 14:42:17

by Paul E. McKenney

Subject: Re: Old -rt patches

On Mon, Aug 06, 2007 at 10:55:44AM +0200, John Sigler wrote:
> John Sigler wrote:
>
> >I wrote a Linux app where I need high-resolution timers. I went all the
> >way and installed the -rt patch, which includes the -hrt patches, as far
> >as I understand.
> >
> >Since I could not afford to change kernels with every new release, I
> >decided to track the 2.6.20 branch (arbitrarily).
> >
> >At this point I'm using 2.6.20.7-rt8 (-rt8 was the last patch to the
> >2.6.20 branch). (I see 2.6.20 is already up to .15)
> >
> >I see a lot of patches going into the -rt patch, but development moved
> >on to 2.6.21, then 2.6.22, and now 2.6.23, following mainline.
> >
> >I'm not a kernel hacker, so I don't claim to understand the patches, but
> >the comments sound like a few bugs are fixed here and there. I can't
> >tell whether these bugs also exist in previous kernels or only appeared
> >in newer kernels. Even if I knew, I probably wouldn't have the expertise
> >to back-port the patch.
> >
> >I'm seeing weird behavior in my app, when it has been running 2-4 days.
> >The timers start to act out, and all hell breaks loose. Basically, I
> >have no idea what's going on...
> >
> >My question is: is it possible that there is a bug in 2.6.20-rt8 that
> >has been fixed in subsequent -rt patches (there's a 2.6.21.6-rt21 and a
> >2.6.22.1-rt9)? In other words, if I give 2.6.21.6-rt21 or 2.6.22.1-rt9 a
> >spin, is it possible that my weird behavior disappears?
> >
> >I'm not saying that it's impossible for the bug to be in my app, but the
> >app is small enough that I'm fairly confident there's no problem there.
>
> Would anyone care to comment?

Would you be willing to post your app so people could try to reproduce
your bug?

Thanx, Paul

> Perhaps I could also test a different strategy, such as xenomai?
> http://www.xenomai.org/
>
> Regards.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2007-08-07 06:06:34

by Daniel Walker

Subject: Re: Old -rt patches

On Mon, 2007-08-06 at 10:55 +0200, John Sigler wrote:
> John Sigler wrote:
>

> Would anyone care to comment?

I'm not sure if this is the answer that you're looking for, but yes, you
will certainly find bugs in older versions of the tree that have since
been fixed.

> Perhaps I could also test a different strategy, such as xenomai?
> http://www.xenomai.org/

If it's a kernel bug, it's not going to matter whether you use a Xenomai
skin or not. If you use some other real-time layer, that might fix it.

You really need to test your app on a current version of the kernel.
We as developers generally don't support outdated trees.

Daniel

2007-08-07 08:27:36

by John Sigler

Subject: Re: Old -rt patches

Daniel Walker wrote:

> John Sigler wrote:
>
>> Would anyone care to comment?
>
> I'm not sure if this is the answer that you're looking for, but yes, you
> will certainly find bugs in older versions of the tree that have since
> been fixed.

I am not a kernel hacker, therefore I can only imagine how complex it is
to bring real-time to Linux. For some reason, I had come to believe that
the case of a single CPU without dynticks had been "solved" in the more
recent kernels (say 2.6.20), and that development had moved on, and was
now active in more "complex" areas like SMP/multi-core, dynticks, etc.

>> Perhaps I could also test a different strategy, such as xenomai?
>> http://www.xenomai.org/
>
> If it's a kernel bug, it's not going to matter whether you use a Xenomai
> skin or not. If you use some other real-time layer, that might fix it.

*If* it is a bug in the -rt patch, then Adeos/Xenomai coupled with a
vanilla Linux kernel should not be affected by the same bug. Unless my
logic is broken somewhere.

> You really need to test your app on a current version of the kernel.
> We as developers generally don't support outdated trees.

This is the part I don't understand. I work for a tiny company with
limited resources. It's infeasible for me to track every new kernel
release and upgrade every time. I need to pick a kernel version that has
the functionality we need, test it thoroughly with my app, and then
never touch that kernel again.

Unless I am mistaken, some people (like Thomas) have been deploying
systems based on PREEMPT_RT (or just -hrt) in industrial settings for a
long time. (As far back as 2.6.15?) Obviously many bugs have been fixed
since then, which means that these versions contained many bugs.

How does one react when an important bug is found in a system that is
already in the field? Do they provide a way to upgrade the kernel (like
consumer-grade network routers)? Do they replace the complete system?

Regards.

2007-08-07 08:52:59

by John Sigler

Subject: Re: Old -rt patches

Paul E. McKenney wrote:

> John Sigler wrote:
>
>> I wrote a Linux app where I need high-resolution timers. I went all the
>> way and installed the -rt patch, which includes the -hrt patches, as far
>> as I understand.
>>
>> Since I could not afford to change kernels with every new release, I
>> decided to track the 2.6.20 branch (arbitrarily).
>>
>> At this point I'm using 2.6.20.7-rt8 (-rt8 was the last patch to the
>> 2.6.20 branch). (I see 2.6.20 is already up to .15)
>>
>> I see a lot of patches going into the -rt patch, but development moved
>> on to 2.6.21, then 2.6.22, and now 2.6.23, following mainline.
>>
>> I'm not a kernel hacker, so I don't claim to understand the patches, but
>> the comments sound like a few bugs are fixed here and there. I can't
>> tell whether these bugs also exist in previous kernels or only appeared
>> in newer kernels. Even if I knew, I probably wouldn't have the expertise
>> to back-port the patch.
>>
>> I'm seeing weird behavior in my app, when it has been running 2-4 days.
>> The timers start to act out, and all hell breaks loose. Basically, I
>> have no idea what's going on...
>>
>> My question is: is it possible that there is a bug in 2.6.20-rt8 that
>> has been fixed in subsequent -rt patches (there's a 2.6.21.6-rt21 and a
>> 2.6.22.1-rt9)? In other words, if I give 2.6.21.6-rt21 or 2.6.22.1-rt9 a
>> spin, is it possible that my weird behavior disappears?
>>
>> I'm not saying that it's impossible for the bug to be in my app, but the
>> app is small enough that I'm fairly confident there's no problem there.
>
> Would you be willing to post your app so people could try to reproduce
> your bug?

Hello Paul,

I cannot provide the app "as is" (I fear the paranoid PHB would
spontaneously combust). If time permits, I will try to make a small
testcase to exhibit the problem.

Regards.

2007-08-07 15:15:45

by Daniel Walker

Subject: Re: Old -rt patches

On Tue, 2007-08-07 at 10:27 +0200, John Sigler wrote:
> Daniel Walker wrote:
>
> > John Sigler wrote:
> >
> >> Would anyone care to comment?
> >
> > I'm not sure if this is the answer that you're looking for, but yes, you
> > will certainly find bugs in older versions of the tree that have since
> > been fixed.
>
> I am not a kernel hacker, therefore I can only imagine how complex it is
> to bring real-time to Linux. For some reason, I had come to believe that
> the case of a single CPU without dynticks had been "solved" in the more
> recent kernels (say 2.6.20), and that development had moved on, and was
> now active in more "complex" areas like SMP/multi-core, dynticks, etc.

There is always a chance of something not working, or of a regression
being introduced in any of the code. -rt is still a development tree
too, so it may not always be stable.

> >> Perhaps I could also test a different strategy, such as xenomai?
> >> http://www.xenomai.org/
> >
> > If it's a kernel bug, it's not going to matter whether you use a Xenomai
> > skin or not. If you use some other real-time layer, that might fix it.
>
> *If* it is a bug in the -rt patch, then Adeos/Xenomai coupled with a
> vanilla Linux kernel should not be affected by the same bug. Unless my
> logic is broken somewhere.

I wouldn't think so, but then what if Adeos/Xenomai has a new bug?

> > You really need to test your app on a current version of the kernel.
> > We as developers generally don't support outdated trees.
>
> This is the part I don't understand. I work for a tiny company with
> limited resources. It's infeasible for me to track every new kernel
> release and upgrade every time. I need to pick a kernel version that has
> the functionality we need, test it thoroughly with my app, and then
> never touch that kernel again.

Ok ..

> Unless I am mistaken, some people (like Thomas) have been deploying
> systems based on PREEMPT_RT (or just -hrt) in industrial settings for a
> long time. (As far back as 2.6.15?) Obviously many bugs have been fixed
> since then, which means that these versions contained many bugs.

It's not a stretch to imagine. These real-time changes have already been
included in commercial distros, and are being built into various
products right now, just like what you're doing.

> How does one react when an important bug is found in a system that is
> already in the field? Do they provide a way to upgrade the kernel (like
> consumer-grade network routers)? Do they replace the complete system?

I don't really know the details of this, but I would imagine there is
some sort of upgrade path for, say, routers. I've heard people talk
about cell phones upgrading their software over the air. It would be a
difficult proposition to, say, keep some specific kernel static forever.
Even the mainline stable kernel has bugs that get fixed.

Daniel