2005-09-13 10:00:18

by Ingo Molnar

Subject: 2.6.13-rt6, ktimer subsystem


i have released the 2.6.13-rt6 tree, which can be downloaded from the
usual place:

http://redhat.com/~mingo/realtime-preempt/

there are lots of small updates all across and there's a big feature as
well in this release: a complete rework of the high-resolution timers
framework, from Thomas Gleixner, called 'ktimers'.

under the ktimer framework the HR (and posix) timers live in a separate
domain, have their own (per-CPU) rbtree to stay scalable and
deterministic even with a high number of timers. Another positive effect
of the introduction of separate ktimers is that kernel/timer.c is now
using preemptible locks again, removing the cascade() worst-case
latency. The cleanup factor is high as well: the ktimer framework
slashes 1300+ lines off the HRT code. See kernel/ktimer.c for details.

the end-effect of ktimers is a much more deterministic HRT engine. The
original merging of HR timers into the stock timer wheel was a Bad Idea
(tm). We intend to push the ktimer subsystem upstream as well.

Changes since 2.6.13-rt1:

- new ktimer subsystem to cleanly implement High Resolution Timers
(Thomas Gleixner)

- BKL fix for the new pi-lock code (Steven Rostedt)

- SMP fix for the new pi-lock code (Steven Rostedt)

- ALL_TASKS_PI updates (Daniel Walker, Steven Rostedt)

- trace-irqs fix (Daniel Walker)

- add raw_irqs_disabled() into the might_sleep() check (Daniel Walker)

- turned off ALL_TASKS_PI

- ide_lock updates

- ioapic build fix

- HRT build fixes

to build a 2.6.13-rt6 tree, the following patches should be applied:

http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.13.tar.bz2
http://redhat.com/~mingo/realtime-preempt/patch-2.6.13-rt6

Ingo


2005-09-13 19:59:54

by Lee Revell

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Tue, 2005-09-13 at 12:00 +0200, Ingo Molnar wrote:
> i have released the 2.6.13-rt6 tree, which can be downloaded from the
> usual place:
>
> http://redhat.com/~mingo/realtime-preempt/


Ingo,

Is this supposed to work on amd64? Lots of people on linux-audio-user
report that it just reboots immediately when booting the kernel. I have
the .configs if you want them.

Lee

2005-09-13 20:06:10

by Lee Revell

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Tue, 2005-09-13 at 15:59 -0400, Lee Revell wrote:
> On Tue, 2005-09-13 at 12:00 +0200, Ingo Molnar wrote:
> > i have released the 2.6.13-rt6 tree, which can be downloaded from the
> > usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
>
> Ingo,
>
> Is this supposed to work on amd64? Lots of people on linux-audio-user
> report that it just reboots immediately when booting the kernel. I have
> the .configs if you want them.

Sorry the problem is specific to x86-64 not amd64.

Lee

2005-09-13 20:09:35

by Ingo Molnar

Subject: Re: 2.6.13-rt6, ktimer subsystem


* Lee Revell <[email protected]> wrote:

> On Tue, 2005-09-13 at 12:00 +0200, Ingo Molnar wrote:
> > i have released the 2.6.13-rt6 tree, which can be downloaded from the
> > usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
>
> Ingo,
>
> Is this supposed to work on amd64? Lots of people on linux-audio-user
> report that it just reboots immediately when booting the kernel. I
> have the .configs if you want them.

it won't even build right now, due to the ktimer changes. I'll fix x64 up
once things have settled down a bit. (but if someone does patches i'll
sure apply them)

Ingo

2005-09-13 20:36:52

by Lee Revell

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Tue, 2005-09-13 at 22:10 +0200, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > On Tue, 2005-09-13 at 12:00 +0200, Ingo Molnar wrote:
> > > i have released the 2.6.13-rt6 tree, which can be downloaded from the
> > > usual place:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
> >
> >
> > Ingo,
> >
> > Is this supposed to work on amd64? Lots of people on linux-audio-user
> > report that it just reboots immediately when booting the kernel. I
> > have the .configs if you want them.
>
> it wont even build right now, due to the ktimer changes. I'll fix x64 up
> once things have settled down a bit. (but if someone does patches i'll
> sure apply them)

The problem apparently affected 2.6.13-rt4 too. The users reported that
it built OK (as long as realtime preemption is enabled) but then reboots
as soon as grub loads the kernel.

Can this be worked around by disabling CONFIG_HIGH_RES_TIMERS?

Lee


2005-09-14 15:56:23

by Darren Hart

Subject: Re: 2.6.13-rt6, ktimer subsystem

Ingo Molnar wrote:
> i have released the 2.6.13-rt6 tree, which can be downloaded from the
> usual place:
>
> http://redhat.com/~mingo/realtime-preempt/

I haven't been able to get 2.6.13-rt6 to compile on a 16 way x440 (SUMMIT) with
gcc-2.95. Is there a known minimum compiler version? The same config builds
fine on a P3 8 way with gcc-3.3.5.

Make output:

CHK include/linux/version.h
SYMLINK include/asm -> include/asm-i386
SPLIT include/linux/autoconf.h -> include/config/*
UPD include/linux/version.h
HOSTCC scripts/kallsyms
HOSTCC scripts/conmakehash
HOSTCC scripts/testlpp
scripts/testlpp.c: In function `cleanup':
scripts/testlpp.c:90: warning: use of `l' length character with `f' type character
scripts/testlpp.c:92: warning: use of `l' length character with `f' type character
scripts/testlpp.c:94: warning: use of `l' length character with `f' type character
scripts/testlpp.c: In function `main':
scripts/testlpp.c:142: warning: use of `l' length character with `f' type character
CC arch/i386/kernel/asm-offsets.s
In file included from include/linux/time.h:7,
from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h: In function `__write_seqlock':
include/linux/seqlock.h:75: warning: implicit declaration of function `__builtin_types_compatible_p'
include/linux/seqlock.h:75: parse error before `typeof'
include/linux/seqlock.h: In function `__write_sequnlock':
include/linux/seqlock.h:84: parse error before `typeof'
include/linux/seqlock.h: In function `__write_tryseqlock':
include/linux/seqlock.h:89: parse error before `typeof'
include/linux/seqlock.h: In function `__read_seqretry':
include/linux/seqlock.h:126: parse error before `typeof'
include/linux/seqlock.h:127: parse error before `typeof'
include/linux/seqlock.h: In function `__write_seqlock_raw':
include/linux/seqlock.h:134: parse error before `typeof'
include/linux/seqlock.h: In function `__write_sequnlock_raw':
include/linux/seqlock.h:143: parse error before `typeof'
include/linux/seqlock.h: In function `__write_tryseqlock_raw':
include/linux/seqlock.h:148: parse error before `typeof'
In file included from include/asm/semaphore.h:40,
from include/linux/sched.h:20,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/wait.h: In function `init_waitqueue_head':
include/linux/wait.h:84: parse error before `typeof'
In file included from include/linux/ktimer.h:207,
from include/linux/sched.h:254,
from arch/i386/kernel/asm-offsets.c:7:

Thanks,

--
Darren Hart
IBM Linux Technology Center
Linux Kernel Team
Phone: 503 578 3185
T/L: 775 3185

2005-09-14 19:40:35

by George Anzinger

Subject: Re: 2.6.13-rt6, ktimer subsystem

Ingo Molnar wrote:
> i have released the 2.6.13-rt6 tree, which can be downloaded from the
> usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> there are lots of small updates all across and there's a big feature as
> well in this release: a complete rework of the high-resolution timers
> framework, from Thomas Gleixner, called 'ktimers'.
>
> under the ktimer framework the HR (and posix) timers live in a separate
> domain, have their own (per-CPU) rbtree to stay scalable and
> deterministic even with a high number of timers. Another positive effect
> of the introduction of separate ktimers is that kernel/timer.c is now
> using preemptible locks again, removing the cascade() worst-case
> latency. The cleanup factor is high as well: the ktimer framework
> slashes 1300+ lines off the HRT code. See kernel/ktimer.c for details.
>
> the end-effect of ktimers is a much more deterministic HRT engine. The
> original merging of HR timers into the stock timer wheel was a Bad Idea
> (tm). We intend to push the ktimer subsystem upstream as well.

Well, having spent a bit of time looking at the code it appears that a
lot of the ideas we looked at and discarded (see
[email protected]) are in this. Shame it
was all done without reference or comment to that list, anyone on it or
even the lkml.

I DO agree that it _looks_ nicer, cleaner and so on. But there are a
lot of things we rejected in here and they really do need, at least, a
hard look.

A few of the top issues:

time in nanoseconds 64-bits, requires a divide to do much of anything
with it. Divides are slow and should be avoided if possible. This is
especially true in the embedded market.


The rbtree is a high overhead tree. I suspect performance problems
here. If it is the right answer here, then why not use it for normal
timers? A list of timers is a rather unique thing and, I think,
deserves a management structure that accounts for the fact that the
elements in the tree are perishable.

It appears that the "monotonic_clock" is being used to drive ktimers.
The "monotonic_clock" was NEVER meant to poke outside of the kernel. It
is a raw kernel clock that is only required to be monotonic with nothing
said about accuracy. It should NOT be confused with CLOCK_MONOTONIC
which is directly tied to xtime and therefore is ntp corrected.

These are only the concerns I have from having a rather quick look at
the code. I am sure that there are other issues...



--
George Anzinger [email protected]
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/

2005-09-14 22:09:21

by Darren Hart

Subject: Re: 2.6.13-rt6, ktimer subsystem

Darren Hart wrote:

> Ingo Molnar wrote:
> > i have released the 2.6.13-rt6 tree, which can be downloaded from the
> > usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> I haven't been able to get 2.6.13-rt6 to compile on a 16 way x440
> (SUMMIT) with gcc-2.95. Is there a known minimum compiler version? The
> same config builds fine on a P3 8 way with gcc-3.3.5.


Update: I was able to build the same config on the SUMMIT box with
gcc-3.3.5, so the compiler version does seem to be the issue.
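
For context, the parse errors in the earlier build log all follow the
warning about `__builtin_types_compatible_p`, a GCC builtin that
gcc-2.95 does not provide (it arrived in the GCC 3.x series). A minimal
sketch of the kind of compile-time type dispatch such a builtin allows
is below. The names (demo_raw_lock, demo_rt_lock, demo_lock_kind) are
made up for illustration and are not the ones used in
include/linux/seqlock.h.

```c
/* Hypothetical lock types standing in for a raw and a preemptible variant. */
typedef struct { int raw; } demo_raw_lock;
typedef struct { int rt; } demo_rt_lock;

/* Compile-time constant: 1 if *lock has exactly the given type, else 0. */
#define DEMO_TYPE_EQUAL(lock, type) \
	__builtin_types_compatible_p(__typeof__(*(lock)), type)

/* Yields 0 for a raw lock, 1 for an rt lock, -1 for anything else. */
#define demo_lock_kind(lock) \
	(DEMO_TYPE_EQUAL(lock, demo_raw_lock) ? 0 : \
	 DEMO_TYPE_EQUAL(lock, demo_rt_lock) ? 1 : -1)
```

A compiler without the builtin treats it as an undeclared function and
then fails to parse the type arguments, which matches the "parse error
before `typeof'" lines in the log.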


--
Darren Hart
IBM Linux Technology Center
Linux Kernel Team
Phone: 503 578 3185
T/L: 775 3185

2005-09-15 02:25:30

by Thomas Gleixner

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Wed, 2005-09-14 at 12:38 -0700, George Anzinger wrote:

> Well, having spent a bit of time looking at the code it appears that a
> lot of the ideas we looked at and discarded (see
> [email protected]) are in this. Shame it
> was all done without reference or comment to that list, anyone on it or
> even the lkml.

Well, I'm considering wearing sackcloth and ashes. But this seems like
the pot calling the kettle black as I don't remember a single relevant
reference/comment from you on the UTIME/KURT mailing list after your
UTIME->HRT fork.

> A few of the top issues:
>
> time in nanoseconds 64-bits, requires a divide to do much of anything
> with it. Divides are slow and should be avoided if possible.

The divides are rare and definitely not in the hot paths. I'm sure that
they can be replaced by some intelligent scaled math algorithm if it
turns out to be necessary. The hot path instructions are simple
add/sub/cmp, which on a 32bit machine are no more expensive than an
operation on struct timespec or a jiffies/arch_cycles pair. The non
nsec based implementation puts a burden on 64bit machines and is
provably wrong with respect to the summing of rounding errors of
interval timers.
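
To illustrate the point about the hot path, here is a small userspace
sketch comparing the two representations. It is not taken from
kernel/ktimer.c; the demo_* names are hypothetical, and both helpers
assume normalized, non-negative inputs.

```c
#include <stdint.h>
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Flat 64-bit nanosecond scale: advancing an expiry is a single add. */
static int64_t demo_ns_add(int64_t expiry_ns, int64_t interval_ns)
{
	return expiry_ns + interval_ns;
}

/* struct timespec: one add per field plus a normalization step. */
static struct timespec demo_ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= DEMO_NSEC_PER_SEC) {
		r.tv_nsec -= DEMO_NSEC_PER_SEC;
		r.tv_sec++;
	}
	return r;
}

/* The 64-bit divide only appears when converting back to sec/nsec. */
static struct timespec demo_ns_to_ts(int64_t ns)
{
	struct timespec r;

	r.tv_sec = (time_t)(ns / DEMO_NSEC_PER_SEC);
	r.tv_nsec = (long)(ns % DEMO_NSEC_PER_SEC);
	return r;
}
```

In this sketch the divide sits only in demo_ns_to_ts(), i.e. at the
boundary where a value is handed back in timespec form, not in the
add/compare operations used while timers are queued.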

> This is especially true in the embedded market.

I'm well aware of the embedded market constraints.


> The rbtree is a high overhead tree. I suspect performance problems
> here.

1. rbtree is available out of the box

2. rbtree is proven to be efficient - at least there are a couple of
performance relevant users relying on it e.g. mm, ext3

3. I did insertion/removal tests with 10k entries (<2us on a 1GHz box in
userspace). This is way below the experienced and reproducible trouble
of recascading. The penalty is completely on the thread which owns the
timer for non signal related functions. For signal related functions
only the removal on expiry is a penalty for the complete system (in the
softirq)

The cascading is a penalty for the complete system all the time.

Performance is a strawman argument here. You know very well that > 90%
of the timers are inaccurate "timeout" timers related to I/O,
networking, devices. Most of those never expire (the positive feedback
removes the timer before expiry) and those timers have no constraint to
be accurate, except for the fact that they have to detect a
device/network problem at some time. In this case it is completely
irrelevant whether the timeout occurs n msecs earlier or later.

> If it is the right answer here, then why not use it for normal
> timers? A list of timers is a rather unique thing and, I think,
> deserves a management structure that accounts for the fact that the
> elements in the tree are perishable.

The first goal was to separate out the timers from the timeout API and I
believe that this separation is necessary.

The implementation of ktimers is not at all restricted to the timers we
addressed for now, it can also be utilized for the timeout API without
much effort.

The performance improvement is enormous despite the alleged 64bit math
overhead.

Testcase on a 600MHz CeleronM, 512MB RAM:

16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)
base interval = 1000 us, offset per task = 50 us
tracing each latency value to disk

16 cyclic SCHED_OTHER tasks using clock_nanosleep(ABSTIME)
base interval = 1000 us, offset per task = 500 us
tracing each latency value to disk

while true; do hackbench 50; done
cat /dev/hda | nc atlas 4711

This injects an (insane) load avg of >600 permanently
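
For readers who want to reproduce something similar, a cyclic task of
the kind described above can be sketched as follows. This is not the
test program used for the numbers in this mail. It assumes
CLOCK_MONOTONIC, an interval below one second, and leaves out the
SCHED_FIFO setup and the tracing to disk; demo_cyclic_max_latency_us
is a hypothetical name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Run `loops` periods of `interval_ns` and return the worst observed
 * wakeup latency in microseconds, or -1 on error.
 */
long demo_cyclic_max_latency_us(int loops, long interval_ns)
{
	struct timespec next, now;
	long max_us = 0;

	if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
		return -1;

	for (int i = 0; i < loops; i++) {
		int err;
		int64_t late_ns;

		/* Advance the absolute deadline, so errors do not accumulate. */
		next.tv_nsec += interval_ns;
		while (next.tv_nsec >= DEMO_NSEC_PER_SEC) {
			next.tv_nsec -= DEMO_NSEC_PER_SEC;
			next.tv_sec++;
		}

		/* Sleep until the deadline, retrying if a signal interrupts us. */
		do {
			err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					      &next, NULL);
		} while (err == EINTR);
		if (err != 0)
			return -1;

		if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
			return -1;

		/* Latency = how long after the deadline we actually woke up. */
		late_ns = (int64_t)(now.tv_sec - next.tv_sec) * DEMO_NSEC_PER_SEC
			+ (now.tv_nsec - next.tv_nsec);
		if (late_ns / 1000 > max_us)
			max_us = (long)(late_ns / 1000);
	}
	return max_us;
}
```

Calling demo_cyclic_max_latency_us(1000000, 1000000L) would correspond
to one task with a 1000 us base interval; older glibc versions may need
-lrt at link time.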

2.6.13-rt4 (cascade based implementation)
16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)

task   loops  min   max  average  sigma  prio
   0  999999    5   869       19     20    80
   1  999999    9   883       18     22    79
   2  999999    9   927       19     23    78
   3  999999    5   908       21     28    77
   4  999999    0  1056       22     33    76
   5  999999    0   973       23     33    75
   6  999999    0   926       23     33    74
   7  999999    1   893       24     33    73
   8  999999    2   942       23     34    72
   9  999999    1   868       24     34    71
  10  999999    0   912       23     34    70
  11  999999    0   911       28     46    69
  12  999999    0   912       28     46    68
  13  999999    9   967       28     46    67
  14  999999    9   954       28     46    66
  15  999999    9   946       28     46    65


2.6.13-rt11 (ktimers based implementation)
16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)

task   loops  min   max  average  sigma  prio
   0  999999    9    76       20      4    80
   1  999999    8    84       20      4    79
   2  999999    8    98       21      4    78
   3  999999    9   121       20      4    77
   4  999999    8   124       20      4    76
   5  999999    9   140       20      4    75
   6  999999   10   103       22      5    74
   7  999999    9    99       21      5    73
   8  999999    8    95       21      5    72
   9  999999    9   148       21      5    71
  10  999999    9   141       22      6    70
  11  999999    9   143       22      5    69
  12  999999    8   129       20      4    68
  13  999999    9   149       21      5    67
  14  999999    9   135       21      5    66
  15  999999    8   230       22      5    65


> It appears that the "monotonic_clock" is being used to drive ktimers.
> The "monotonic_clock" was NEVER meant to poke outside of the kernel. It
> is a raw kernel clock that is only required to be monotonic with nothing
> said about accuracy. It should NOT be confused with CLOCK_MONOTONIC
> which is directly tied to xtime and therefore is ntp corrected.

I'm well aware of that and this is a completely different playground.

The ktimers implementation is _independent_ of the clock sources. The
current clock source implementation may suck, but that's not a problem of
ktimers at all. It's a problem of arch/XXX/timeYY and kernel/time.c. I
did not spend too much time on that as John Stultz is working on the
timesource and timeofday stuff and I really hope that this gets
accepted.

The relationship of clock sources and their accuracy is also a
worthwhile field of discussion.

> These are only the concerns I have from having a rather quick look at
> the code. I am sure that there are other issues...

It would be hubristic to deny that :)

OTOH,

- The posix timer tests all run successfully, except the broken 2timertest,
which fails on any other HRT kernel too, and the sleep-too-long for real
timers when the clock is set backwards, which is easily solvable
(working on that).

- The performance improvements in combination with simpler code are an
argument in themselves


tglx


2005-09-15 07:55:36

by Ingo Molnar

Subject: Re: 2.6.13-rt6, ktimer subsystem


* Lee Revell <[email protected]> wrote:

> > it wont even build right now, due to the ktimer changes. I'll fix x64 up
> > once things have settled down a bit. (but if someone does patches i'll
> > sure apply them)
>
> The problem apparently affected 2.6.13-rt4 too. The users reported
> that it built OK (as long as realtime preemption is enabled) but then
> reboots as soon as grub loads the kernel.
>
> Can this be worked around by disabling CONFIG_HIGH_RES_TIMERS?

on x64 HIGH_RES_TIMERS is disabled by default (no lowlevel support code
integrated yet). Where did the problems start, rt4? (i.e. rt3 works
fine?)

Ingo

2005-09-15 09:20:04

by Ingo Molnar

Subject: Re: 2.6.13-rt6, ktimer subsystem


* George Anzinger <[email protected]> wrote:

> > the end-effect of ktimers is a much more deterministic HRT engine.
> > The original merging of HR timers into the stock timer wheel was a
> > Bad Idea (tm). We intend to push the ktimer subsystem upstream as
> > well.
>
> Well, having spent a bit of time looking at the code it appears that a
> lot of the ideas we looked at and discarded (see
> [email protected]) are in this. Shame
> it was all done without reference or comment to that list, anyone on
> it or even the lkml.

this was done in the timeframe of 2 days, and was posted ASAP - with you
Cc:-ed for the specific purpose of getting feedback from you.

given the very good performance results of ktimers, and the
simplification effect on the original HRT code:

36 files changed, 2016 insertions(+), 3094 deletions(-)

it was a no-brainer to put it into the -rt tree.

> I DO agree that it _looks_ nicer, cleaner and so on. But there are a
> lot of things we rejected in here and they really do need, at least, a
> hard look.
>
> A few of the top issues:
>
> time in nanoseconds 64-bits, requires a divide to do much of anything
> with it. Divides are slow and should be avoided if possible. This is
> especially true in the embedded market.

Wrong. Divides are slow _on the micro micro level_. They make zero, nil,
nada difference in reality. The cleanliness difference between having a
flat nanosec scale and having some artificial 2x 32-bit structure is
significant.

_By far_ the biggest problem of the HRT code is cleanliness (or the lack
of it), and the resulting maintenance overhead, and the resulting gut
reaction from upstream: "oh, yuck, bleh!". [Similar problems are true
for the timer code in general - by far the biggest issues are
organization and cleanliness, not micro-issues.]

Micro-level optimizations like 64-bit vs. 32-bit variables are the very
very last issue to consider - and this statement comes from me, an
admitted performance extremist. If the HRT code ever wants to go
upstream then it _must get much much cleaner_. Thomas has been doing
great work in this area.

> The rbtree is a high overhead tree. I suspect performance problems
> here. [...]

Wrong. rbtrees are used for some of the most performance-critical areas
of the kernel: the VMA tree, the new ext3 reservations code [a
performance feature], access keys.

> [...] If it is the right answer here, then why not use it for normal
> timers? [...]

i'd like to remind you that the code is less than a week old.

But, i don't think we want to make the majority of normal timeouts
tree-based. There are in essence two fundamental time related objects in
the kernel: timeouts and timers. Timeouts never expire in 99% of the
cases - so they must be optimized for the 'fast insert+remove' codepath.
Timers on the other hand expire in 99% of the cases, so they must be
optimized for the 'fast insert+expire' codepath.

Also, for timers, since they are often used by time-deterministic code,
it does not hurt to have a fundamentally deterministic design. The
current upstream timer(timeout) wheel is fundamentally non-deterministic
with an increasing number of timers, due to the cascading design.

hence the separation of timers and timeouts. I think that this
distinction might as well stay for the long run.

and we've been through a number of design variants during the past
couple of weeks in the -rt tree: we tried the original HRT patch, a
combo method with partly split HR timers, and now a completely separated
design. From what i've seen ktimers are the best solution so far.

Ingo

2005-09-15 09:43:23

by Roman Zippel

Subject: Re: 2.6.13-rt6, ktimer subsystem

Hi,

On Tue, 13 Sep 2005, Ingo Molnar wrote:

> there are lots of small updates all across and there's a big feature as
> well in this release: a complete rework of the high-resolution timers
> framework, from Thomas Gleixner, called 'ktimers'.

Is that available somewhere as a separate patch?

bye, Roman

2005-09-15 11:37:26

by Ingo Molnar

Subject: Re: 2.6.13-rt6, ktimer subsystem


* Ingo Molnar <[email protected]> wrote:

>
> * Lee Revell <[email protected]> wrote:
>
> > > it wont even build right now, due to the ktimer changes. I'll fix x64 up
> > > once things have settled down a bit. (but if someone does patches i'll
> > > sure apply them)
> >
> > The problem apparently affected 2.6.13-rt4 too. The users reported
> > that it built OK (as long as realtime preemption is enabled) but then
> > reboots as soon as grub loads the kernel.
> >
> > Can this be worked around by disabling CONFIG_HIGH_RES_TIMERS?
>
> on x64 HIGH_RES_TIMERS is disabled by default (no lowlevel support
> code integrated yet). Where did the problems start, rt4? (i.e. rt3
> works fine?)

ok, fixed a couple of bugs on x64, it boots on my box now. The main
breakage was the elimination of the preemption-disabling soft irq flag
in -53-05, that unearthed an irq-flags 32-bitness bug in the spinlock
macros, which was there for ages. This resulted in the early reboot
during bootup.

I've uploaded 2.6.13-rt12 with the fixes.

Ingo

2005-09-15 22:36:31

by George Anzinger

Subject: Re: 2.6.13-rt6, ktimer subsystem

Thomas Gleixner wrote:
> On Wed, 2005-09-14 at 12:38 -0700, George Anzinger wrote:
>
>
>>Well, having spent a bit of time looking at the code it appears that a
>>lot of the ideas we looked at and discarded (see
>>[email protected]) are in this. Shame it
>>was all done without reference or comment to that list, anyone on it or
>>even the lkml.
>
>
> Well, I'm considering wearing sackcloth and ashes. But this seems like
> the pot calling the kettle black as I don't remember a single relevant
> reference/comment from you on the UTIME/KURT mailing list after your
> UTIME->HRT fork.

Hm...All the work has been on HRT mailing list. In the early days the
UTIME/KURT folks were on the list, still may be for all I know.

Still, as much as sackcloth and ashes might become you or me, I would
rather try to learn how to be more open about what we are doing so that
what results is the best possible for the linux community.
>
>
>>A few of the top issues:
>>
>>time in nanoseconds 64-bits, requires a divide to do much of anything
>>with it. Divides are slow and should be avoided if possible.
>
>
> The divides are rare and definitely not in the hot paths. I'm sure that
> they can be replaced by some intelligent scaled math algorithm if it
> turns out to be necessary. The hot path instructions are simple
> add/sub/cmp, which on a 32bit machine are no more expensive than an
> operation on struct timespec or a jiffies/arch_cycles pair. The non
> nsec based implementation puts a burden on 64bit machines and is
> provably wrong with respect to the summing of rounding errors of
> interval timers.

Without arguing with your "provable wrong", I will admit that the
64-bit LOOKS good and easy. At the same time I think we can abstract
all the required actions on the "nstime" thing and set it up so we can
use either a timespec or a s64 depending on the machine (or some such).
I have such a patch nearly ready...
>
>
~
>
>>The rbtree is a high overhead tree. I suspect performance problems
>>here.
>
>
> 1. rbtree is available out of the box

True.
>
> 2. rbtree is proven to be efficient - at least there are a couple of
> performance relevant users relying on it e.g. mm, ext3
>
> 3. I did insertion/removal tests with 10k entries (<2us on a 1GHz box in
> user space). This is way below the experienced and reproducible trouble
> of recascading. The penalty is completely on the thread which owns the
> timer for non signal related functions. For signal related functions
> only the removal on expiry is a penalty for the complete system (in the
> softirq)

Well, I never was a cascade fan.

In the early HRT patches the whole timer list was replaced with a hashed
list. It was O(N/M) on insertion where we could easily choose M (for a
while it was even a config option). Removal was just an unlink, same as
the cascade list.

To be clear on my take on this, as I understand it the rblist is
something like O(log N). What is left out here is the fixed
overhead F, which is there even if N = 1. So we have something like
F + O(f(N)) for a list. For the most part we don't look at F, but as
list complexity grows, it gets larger, thus pushing the crossover
point out to a higher "N" when comparing two lists. I considered the
rbtree when doing the secondary list for HRT and concluded that "N" was
small enough that a simple O(N/2) list would do just fine, as it would
only contain timers set to expire in the next jiffy.

I don't think an O(N/2) list is right for all timers (as in the
ktimers), but I still wonder if there isn't something better than an
rblist. Note, "wonder". I don't have such a list structure in hand.
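
The trade-off described above can be made concrete with a small
sketch. It counts the nodes walked by a sorted-list insert and compares
that with the standard worst-case height bound of a red-black tree,
2 * log2(N + 1), rounded up here. The demo_* names are hypothetical,
and the fixed per-operation overhead F (rebalancing, pointer and colour
updates) is deliberately not modelled.

```c
#include <stddef.h>

/* Hypothetical timer node for a singly linked list sorted by expiry. */
struct demo_timer {
	unsigned long expires;
	struct demo_timer *next;
};

/* Insert sorted by expiry; returns the number of nodes walked past. */
static unsigned int demo_list_insert(struct demo_timer **head,
				     struct demo_timer *t)
{
	unsigned int steps = 0;

	while (*head && (*head)->expires <= t->expires) {
		head = &(*head)->next;
		steps++;
	}
	t->next = *head;
	*head = t;
	return steps;
}

/* Upper bound on the height of a red-black tree holding n nodes. */
static unsigned int demo_rbtree_height_bound(unsigned int n)
{
	unsigned int k = 0;

	/* k = ceil(log2(n + 1)) */
	while ((1ULL << k) < (unsigned long long)n + 1)
		k++;
	return 2 * k;
}
```

With 10000 queued timers the list insert walks about 5000 nodes on
average for uniformly distributed expiries, while the tree descent is
bounded by 28 levels; for a handful of timers the list walk is shorter
than the tree's fixed overhead is likely to be, which is the crossover
argument made above.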
>
> The cascading is a penalty for the complete system all the time.
>
> Performance is a straw man argument here. You know very well that > 90%
> of the timers are inaccurate "timeout" timers related to I/O,
> networking, devices. Most of those never expire (the positive feedback
> removes the timer before expiry) and those timers have no constraint to
> be accurate, except for the fact that they have to detect a
> device/network problem at some time. In this case it is completely
> irrelevant whether the timeout occurs n msecs earlier or later.

I agree, but it is not accuracy that I am arguing, but cpu cycles. Those
we use in the kernel are not available for the user.
>
>
>>If it is the right answer here, then why not use it for normal
>>timers? A list of timers is a rather unique thing and, I think,
>>deserves a management structure that accounts for the fact that the
>>elements in the tree are perishable.
>
>
> The first goal was to separate out the timers from the timeout API and I
> believe that this separation is necessary.
>
> The implementation of ktimers is not at all restricted to the timers we
> addressed for now, it can also be utilized for the timeout API without
> much effort.
>
> The performance improvement is enormous despite the alleged 64bit math
> overhead.
>
> Testcase on a 600MHz CeleronM, 512MB RAM:
>
> 16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)
> base interval = 1000 us, offset per task = 50 us
> tracing each latency value to disk
>
> 16 cyclic SCHED_OTHER tasks using clock_nanosleep(ABSTIME)
> base interval = 1000 us, offset per task = 500 us
> tracing each latency value to disk
>
> while true; do hackbench 50; done
> cat /dev/hda | nc atlas 4711
>
> This injects an (insane) load avg of >600 permanently
>
> 2.6.13-rt4 (cascade based implementation)
> 16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)
>
> task   loops  min   max  average  sigma  prio
>    0  999999    5   869       19     20    80
>    1  999999    9   883       18     22    79
>    2  999999    9   927       19     23    78
>    3  999999    5   908       21     28    77
>    4  999999    0  1056       22     33    76
>    5  999999    0   973       23     33    75
>    6  999999    0   926       23     33    74
>    7  999999    1   893       24     33    73
>    8  999999    2   942       23     34    72
>    9  999999    1   868       24     34    71
>   10  999999    0   912       23     34    70
>   11  999999    0   911       28     46    69
>   12  999999    0   912       28     46    68
>   13  999999    9   967       28     46    67
>   14  999999    9   954       28     46    66
>   15  999999    9   946       28     46    65
>
>
> 2.6.13-rt11 (ktimers based implementation)
> 16 cyclic SCHED_FIFO tasks using clock_nanosleep(ABSTIME)
>
> task   loops  min   max  average  sigma  prio
>    0  999999    9    76       20      4    80
>    1  999999    8    84       20      4    79
>    2  999999    8    98       21      4    78
>    3  999999    9   121       20      4    77
>    4  999999    8   124       20      4    76
>    5  999999    9   140       20      4    75
>    6  999999   10   103       22      5    74
>    7  999999    9    99       21      5    73
>    8  999999    8    95       21      5    72
>    9  999999    9   148       21      5    71
>   10  999999    9   141       22      6    70
>   11  999999    9   143       22      5    69
>   12  999999    8   129       20      4    68
>   13  999999    9   149       21      5    67
>   14  999999    9   135       21      5    66
>   15  999999    8   230       22      5    65

I confess I don't understand the above numbers. What are min and max
and in what units? Are you saying the large max numbers are caused by
the cascade?
>
>
>
>>It appears that the "monotonic_clock" is being used to drive ktimers.
>>The "monotonic_clock" was NEVER meant to poke outside of the kernel. It
>>is a raw kernel clock that is only required to be monotonic with nothing
>>said about accuracy. It should NOT be confused with CLOCK_MONOTONIC
>>which is directly tied to xtime and therefore is ntp corrected.
>
>
> I'm well aware of that and this is a completely different playground.
>
> The ktimers implementation is _independent_ of the clock sources. The
> current clock source implementation may suck, but that's not a problem of
> ktimers at all. It's a problem of arch/XXX/timeYY and kernel/time.c. I
> did not spend too much time on that as John Stultz is working on the
> timesource and timeofday stuff and I really hope that this gets
> accepted.
>
> The relationship of clock sources and their accuracy is also a
> worthwhile field of discussion.
>
>
>>These are only the concerns I have from having a rather quick look at
>>the code. I am sure that there are other issues...
>
>
> It would be hubristic to deny that :)
>
> OTOH,
>
> - The posix timer tests all run successfully, except the broken 2timertest,
> which fails on any other HRT kernel too, and the sleep-too-long for real
> timers when the clock is set backwards, which is easily solvable
> (working on that).

Your mileage seems to differ from mine. Here is what I get from ./do_test:
The following tests failed:
clock_nanosleeptest
abs_timer_test
4-1
clock_settimetest
clock_gettimetest2
2timer_test



Then, on the second run, it crashed in an attempt to get the monotonic
clock (a divide error). System is a dual PIII, 800Mhz. This from the
rt11 patch.
>
> - The performance improvements in combination with simpler code are an
> argument in themselves

--
George Anzinger [email protected]
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/

2005-09-15 22:53:34

by Thomas Gleixner

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Thu, 2005-09-15 at 15:35 -0700, George Anzinger wrote:

> > Performance is a straw man argument here. You know very well that > 90%
> > of the timers are inaccurate "timeout" timers related to I/O,
> > networking, devices. Most of those never expire (the positive feedback
> > removes the timer before expiry) and those timers have no constraint to
> > be accurate, except for the fact that they have to detect a
> > device/network problem at some time. In this case it is completely
> > irrelevant whether the timeout occurs n msecs earlier or later.
>
> I agree, but it is not accuracy that I am arguing, but cpu cycles. Those
> we use in the kernel are not available for the user.

The time used for recascading is not available either :). Seriously, I'm
quite sure that the rbtree for the sorting of "timers" - not "timeouts"
- will not have any relevant performance impact. If there is a faster
sorted tree around, I have no problem using that.

> I confess I don't understand the above numbers. What are min and max
> and in what units? Are you saying the large max numbers are caused by
> the cascade?

Sorry, all units usec.

Yes. The problem is the combined base lock, which holds off interrupts
for quite a long time. Daniel was experiencing this too.

> > - The posix timer tests run all successful, except the broken 2timertest
> > which fails on any other HRT kernel too and the sleep to long for real
> > timers when the clock is set backwards, which is easily solvable
> > (working on that).
>
> Your mileage seems to differ from mine. Here is what I get from ./do_test:
> The following tests failed:
> clock_nanosleeptest
> abs_timer_test
> 4-1
> clock_settimetest
> clock_gettimetest2
> 2timer_test

Hmm. Except for the 2timer_test, where my source seems to be broken it
works here.

> Then, on the second run, it crashed in an attempt to get the monotonic
> clock (a divide error). System is a dual PIII, 800Mhz. This from the
> rt11 patch.

Hmm, divide error. I had one of those in the early phase due to some
strange 64/32 truncation problem, which was caused by nested
inline/macros. After unmingling the problem went away.

tglx


2005-09-15 23:05:27

by George Anzinger

Subject: Re: 2.6.13-rt6, ktimer subsystem

Ingo Molnar wrote:
> * George Anzinger <[email protected]> wrote:
>
>
>>>the end-effect of ktimers is a much more deterministic HRT engine.
>>>The original merging of HR timers into the stock timer wheel was a
>>>Bad Idea (tm). We intend to push the ktimer subsystem upstream as
>>>well.
>>
>>Well, having spent a bit of time looking at the code it appears that a
>>lot of the ideas we looked at and discarded (see
>>[email protected]) are in this. Shame
>>it was all done with out reference or comment to that list, anyone on
>>it or even the lkml.
>
>
> this was done in the time frame of 2 days, and was posted ASAP - with you
> Cc:-ed for the specific purpose of getting feedback from you.

Possibly your involvement was that short. I rather doubt the code was
written that quickly... I first found out about it on the 12th from the
rt5 patch you posted about 2pm. Was there an earlier email?
>
> given the very good performance results of ktimers, and the
> simplification effect on the original HRT code:
>
> 36 files changed, 2016 insertions(+), 3094 deletions(-)
>
> it was a no-brainer to put it into the -rt tree.
>
>
>>I DO agree that it _looks_ nicer, cleaner and so on. But there are a
>>lot of things we rejected in here and they really do need, at least, a
>>hard look.
>>
>>A few of the top issues:
>>
>>time in nanoseconds 64-bits, requires a divide to do much of anything
>>with it. Divides are slow and should be avoided if possible. This is
>>especially true in the embedded market.
>
>
> Wrong. Divides are slow _on the micro micro level_. They make zero, nil,
> nada difference in reality. The cleanliness difference between having a
> flat nanosec scale and having some artificial 2x 32-bit structure are
> significant.

Cleanliness can easily be obtained with a CPP macro. I am not sure what
the right answer is here. I have heard complaints on the 64-bit thing
on lkml WRT the work John Stultz is doing. Also, it is real easy to
abstract all the operations into CPP macros and choose 64 bit or
timespec at compile time.
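
As a rough illustration of the macro approach described above (a sketch
only: the names xtime_t, XTIME_USE_SCALAR64 and the xtime_* helpers are
hypothetical and appear in neither the HRT nor the ktimer code), callers
could be written against one set of operations while the representation
is picked at compile time:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical compile-time switch, not a real kernel config option.
 * Both variants assume normalized, non-negative values.
 */
#ifdef XTIME_USE_SCALAR64

/* Flat 64-bit nanosecond scale: add/compare are trivial, split needs a divide. */
typedef int64_t xtime_t;
#define xtime_set(sec, nsec)  ((int64_t)(sec) * NSEC_PER_SEC + (nsec))
#define xtime_add(a, b)       ((a) + (b))
#define xtime_before(a, b)    ((a) < (b))
#define xtime_sec(t)          ((long)((t) / NSEC_PER_SEC))
#define xtime_nsec(t)         ((long)((t) % NSEC_PER_SEC))

#else

/* timespec-like pair: no divide to split, but add/compare need extra steps. */
typedef struct { long sec; long nsec; } xtime_t;

static inline xtime_t xtime_set(long sec, long nsec)
{
    xtime_t t = { sec, nsec };
    return t;
}

static inline xtime_t xtime_add(xtime_t a, xtime_t b)
{
    xtime_t r = { a.sec + b.sec, a.nsec + b.nsec };
    if (r.nsec >= NSEC_PER_SEC) {   /* carry into the seconds field */
        r.nsec -= NSEC_PER_SEC;
        r.sec++;
    }
    return r;
}

static inline int xtime_before(xtime_t a, xtime_t b)
{
    return a.sec < b.sec || (a.sec == b.sec && a.nsec < b.nsec);
}

#define xtime_sec(t)   ((t).sec)
#define xtime_nsec(t)  ((t).nsec)

#endif
```

Building with -DXTIME_USE_SCALAR64 selects the flat 64-bit variant; the
calling code stays the same either way, which is the point being argued.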
>
> _By far_ the biggest problem of the HRT code is cleanliness (or the lack
> of it), and the resulting maintenance overhead, and the resulting gut
> reaction from upstream: "oh, yuck, bleh!". [Similar problems are true
> for the timer code in general - by far the biggest issues are
> organization and cleanliness, not micro-issues.]

I agree, the ktimers code looks clean. I am concerned that all the
issues be addressed, but time will shake that out.
>
> Micro-level optimizations like 64-bit vs. 32-bit variables is the very
> very last issue to consider - and this statement comes from me, an
> admitted performance extremist. If the HRT code ever wants to go
> upstream then it _must get much much cleaner_. Thomas has been doing
> great work in this area.

Agreed, I just wish we could see it a bit earlier...
>
>
>>The rbtree is a high overhead tree. I suspect performance problems
>>here. [...]
>
>
> Wrong. rbtrees are used for some of the most performance-critical areas
> of the kernel: the VMA tree, the new ext3 reservations code [a
> performance feature], access keys.
>
>
>>[...] If it is the right answer here, then why not use it for normal
>>timers? [...]
>
>
> I'd like to remind you that the code is less than a week old.
>
> But, i don't think we want to make the majority of normal timeouts
> tree-based. There are in essence two fundamental time related objects in
> the kernel: timeouts and timers. Timeouts never expire in 99% of the
> cases - so they must be optimized for the 'fast insert+remove' code path.
> Timers on the other hand expire in 99% of the cases, so they must be
> optimized for the 'fast insert+expire' code path.

Timers do, but I am not so sure of sleeps. Seems a lot of them are
interrupted. Still, as I said in the note to Thomas, I don't have a
real answer here. The hashed tree I used in HRT instead of the cascade,
did eliminate the cascade. The hashed tree is faster for small N and,
by changing the hash bucket size, we can easily make N<4k or so small.
For larger N, the rbtree or some such seems right. As I said in the
note to Thomas, I am NO fan of the cascade.
>
> Also, for timers, since they are often used by time-deterministic code,
> it does not hurt to have a fundamentally deterministic design. The
> current upstream timer(timeout) wheel is fundamentally non-deterministic
> with an increasing number of timers, due to the cascading design.

Interested in a hashed list? It is rather easy to do and does NOT
cascade. It is an O(N/M) list for inserts, the same as what is there now
for deletes, and nearly the same for the run-timer code. It takes away
all the cascade stuff, so it will eliminate all of that code. Oh, "M" is
the number of buckets and can easily be configured at compile time.
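
A minimal sketch of such a hashed list, under stated assumptions: the
names htimer, htimer_base and NBUCKETS are hypothetical, this is not the
code from the HRT patches, and jiffies wraparound and locking are
ignored. Each timer hashes to bucket (expires % M) and is kept sorted
inside that bucket, so an insert walks about N/M entries, a delete is a
plain unlink, and nothing ever cascades:

```c
#include <stddef.h>

#define NBUCKETS 8   /* "M"; hypothetical value, would be a compile-time choice */

struct htimer {
    unsigned long expires;          /* expiry time in ticks */
    struct htimer *next, **pprev;   /* bucket chain; pprev points at the link to us */
};

struct htimer_base {
    struct htimer *bucket[NBUCKETS];
};

/* Insert sorted by expiry within the bucket: O(N/M) on average. */
static void htimer_add(struct htimer_base *base, struct htimer *t)
{
    struct htimer **pp = &base->bucket[t->expires % NBUCKETS];

    while (*pp && (*pp)->expires <= t->expires)
        pp = &(*pp)->next;
    t->next = *pp;
    t->pprev = pp;
    if (*pp)
        (*pp)->pprev = &t->next;
    *pp = t;
}

/* Removal is just an unlink: O(1), no search. */
static void htimer_del(struct htimer *t)
{
    *t->pprev = t->next;
    if (t->next)
        t->next->pprev = t->pprev;
    t->next = NULL;
    t->pprev = NULL;
}

/*
 * Called once per tick: unlink and return one timer due at `now`,
 * or NULL if the head of this tick's bucket is not due yet.
 */
static struct htimer *htimer_pop_expired(struct htimer_base *base,
                                         unsigned long now)
{
    struct htimer *t = base->bucket[now % NBUCKETS];

    if (!t || t->expires > now)
        return NULL;
    htimer_del(t);
    return t;
}
```

The per-tick work only ever looks at the head of one bucket; timers that
expire a multiple of M ticks later simply sit further down the same
chain until their turn comes.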
>
> hence the separation of timers and timeouts. I think that this
> distinction might as well stay for the long run.
>
> and we'be been through a number of design variants during the past
> couple of weeks in the -rt tree: we tried the original HRT patch, a
> combo method with partly split HR timers, and now a completely separated
> design. From what I've seen ktimers are the best solution so far.
>

--
George Anzinger [email protected]
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/

2005-09-15 23:11:10

by Daniel Walker

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Thu, 2005-09-15 at 15:35 -0700, George Anzinger wrote:
>
> In the early HRT patches the whole timer list was replaced with a hashed
> list. It was O(N/M) on insertion where we could easily choose M (for a
> while it was even a config option). Removal was just an unlink, same as
> the cascade list.
>
> To be clear on my take on this, as I understand it the rblist is
> something like O(N*somelog 2). What is left out here is the fixed
> overhead of F which is there even if N = 1. So we have something like
> (F+O(f(N)) for a list. For the most part we don't look at F, but as
> list complexity grows, it gets larger thus pushing out the cross over
> point to a higher "N" when comparing two lists. I considered the rbtree
> when doing the secondary list for HRT and concluded that "N" was small
> enough that a simple O(N/2) list would do just fine as it would only
> contain timers set to expire in the next jiffie.

The fact that we know in advance that a system is only going to have a
very small number of these timers should be noted. You could just use a
regular list, and limit the total number of timers. I would hesitate
to stick a big data structure in when you're only dealing with one or two
timers on average.

George, what's the largest number of highres timers that someone might
need/want?

Daniel

2005-09-15 23:11:08

by George Anzinger

Subject: Re: 2.6.13-rt6, ktimer subsystem

Thomas Gleixner wrote:
~
>>>- The posix timer tests run all successful, except the broken 2timertest
>>>which fails on any other HRT kernel too and the sleep to long for real
>>>timers when the clock is set backwards, which is easily solvable
>>>(working on that).
>>
>>Your mileage seems to differ from mine. Here is what I get from ./do_test:
>>The following tests failed:
>>clock_nanosleeptest
>>abs_timer_test
>>4-1
>>clock_settimetest
>>clock_gettimetest2
>>2timer_test
>
>
> Hmm. Except for the 2timer_test, where my source seems to be broken it
> works here.

The latest support source is in the CVS tree on the HRT site. It is no
longer a patch...
>
>
>>Then, on the second run, it crashed in an attempt to get the monotonic
>>clock (a divide error). System is a dual PIII, 800Mhz. This from the
>>rt11 patch.
>
>
> Hmm, divide error. I had one of those in the early phase due to some
> strange 64/32 truncation problem, which was caused by nested
> inline/macros. After unmingling the problem went away.

I suspect that the 64/32 div resulted in a >32-bit result, which is a
fault. This was deep in monotonic_clock. I would rather change to
clock_monotonic (i.e. xtime+offset+walltomonotonic) and work that code
path. monotonic_clock is just the wrong thing for this work.
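
For reference, the failure mode suspected above can be sketched in
portable C. On i386 the divl instruction takes a 64-bit dividend and a
32-bit divisor and raises a divide error when the quotient does not fit
in 32 bits. The helper names below are hypothetical and this is not the
monotonic_clock code, only an illustration of the check and of the usual
two-step workaround:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a single 64/32 -> 32 divide of n by d is safe, i.e. the
 * quotient fits in 32 bits. That holds exactly when the high word of
 * the dividend is smaller than the divisor.
 */
static int div64_32_fits(uint64_t n, uint32_t d)
{
    return d != 0 && (uint32_t)(n >> 32) < d;
}

/*
 * Full 64-bit quotient built from two steps. The second step divides
 * (hi % d) : lo, whose high word is below d, so its quotient always
 * fits in 32 bits.
 */
static uint64_t div64_32_safe(uint64_t n, uint32_t d, uint32_t *rem)
{
    uint32_t hi, lo, q_hi, q_lo;
    uint64_t mid;

    assert(d != 0);
    hi = (uint32_t)(n >> 32);
    lo = (uint32_t)n;
    q_hi = hi / d;
    mid = ((uint64_t)(hi % d) << 32) | lo;
    q_lo = (uint32_t)(mid / d);
    *rem = (uint32_t)(mid % d);
    return ((uint64_t)q_hi << 32) | q_lo;
}
```

A nanosecond count of 5000000000000 divided by 1000, for example, has a
quotient of 5000000000, which is above the 32-bit limit, so the single
divide would trap while the two-step version returns the right value.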
>
> tglx
>
>

--
George Anzinger [email protected]
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/

2005-09-15 23:20:14

by Thomas Gleixner

Subject: Re: 2.6.13-rt6, ktimer subsystem

On Thu, 2005-09-15 at 16:04 -0700, George Anzinger wrote:

> Possibly your involvement was that short. I rather doubt the code was
> written that quickly... I first found out about it on the 12th from the
> rt5 patch you posted about 2pm. Was there an earlier email?

The first sketch of ktimers was written in less than a day. Some
cleanups and the integration into -RT took no longer than 3-4 days.

tglx


2005-09-16 00:09:33

by George Anzinger

Subject: Re: 2.6.13-rt6, ktimer subsystem

Daniel Walker wrote:
> On Thu, 2005-09-15 at 15:35 -0700, George Anzinger wrote:
>
>>In the early HRT patches the whole timer list was replaced with a hashed
>>list. It was O(N/M) on insertion where we could easily choose M (for a
>>while it was even a config option). Removal was just an unlink, same as
>>the cascade list.
>>
>>To be clear on my take on this, as I understand it the rblist is
>>something like O(N*somelog 2). What is left out here is the fixed
>>overhead of F which is there even if N = 1. So we have something like
>>(F+O(f(N)) for a list. For the most part we don't look at F, but as
>>list complexity grows, it gets larger thus pushing out the cross over
>>point to a higher "N" when comparing two lists. I considered the rbtree
>>when doing the secondary list for HRT and concluded that "N" was small
>>enough that a simple O(N/2) list would do just fine as it would only
>>contain timers set to expire in the next jiffie.
>
>
> The fact that we know in advance that a system is only going to have a
> very small number of these timers should be noted. You could just use a
> regular list, and limit the total number of timers. I would hesitate
> to stick a big data structure in when you're only dealing with one or two
> timers on average.
>
> George, what's the largest number of highres timers that someone might
> need/want?
>
Well, the interesting thing is that, unless you change something, the
system has a current limit of 1000 posix timers. This can be changed,
but I suspect it is not changed very often. And this handles all posix
timers, low and high res. Sleep is another thing, with a max of one
sleep timer per task. The ktimer list is also doing itimers, which are
also limited by the number of tasks.

As for data structures, a hashed list requires a "list head" for each
bucket while, I think, the rbtree has only one head but requires an
additional list head (or is it two?) in the entry data structure, so
this is a pay-as-you-go list.
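
The trade-off can be put into rough numbers. The two structs below are
illustrative stand-ins, assumed rather than copied from the kernel
headers: a two-pointer list link and a tree node with three pointers
plus a color field. All names are hypothetical:

```c
#include <stddef.h>

/* Stand-in for a doubly linked list link (two pointers). */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

/* Stand-in for a red-black tree node (three pointers plus a color). */
struct demo_tree_node {
    struct demo_tree_node *parent, *left, *right;
    int color;
};

/* Hashed list: M bucket heads paid up front, plus one link per queued timer. */
static size_t hashed_list_bytes(size_t m_buckets, size_t n_timers)
{
    return m_buckets * sizeof(struct demo_list_head)
         + n_timers * sizeof(struct demo_list_head);
}

/* Tree: a single root pointer, plus one (larger) node per queued timer. */
static size_t tree_bytes(size_t n_timers)
{
    return sizeof(struct demo_tree_node *)
         + n_timers * sizeof(struct demo_tree_node);
}
```

With these stand-ins the hashed list carries its fixed cost in the
bucket array even when no timer is queued, while the tree starts at one
pointer and grows by a larger node per timer.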

--
George Anzinger [email protected]
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/

2005-09-26 07:01:33

by Ingo Molnar

Subject: 2.6.14-rc2-rt2


i have released the 2.6.14-rc2-rt2 tree, which can be downloaded from
the usual place:

http://redhat.com/~mingo/realtime-preempt/

the biggest change is the merge to the 2.6.14 tree, but there are also
updates all across the board: lots of ktimer updates and fixes from
Thomas Gleixner, an important PI fix from Steven Rostedt, and lots of
other details.

Changes since 2.6.13-rt6:

- tons of ktimer updates: build fixes, SMP fixes and more (Thomas
Gleixner)

- PI fix (Steven Rostedt)

- ntfs fix for bit-spin-locks (Eran Mann)

- updates/fixes in preparation of the ARM merge (Daniel Walker)

- latency histogram cleanups (Yi Yang)

- merge to 2.6.14-rc2

- sysfs/scsi interaction workaround to get aic7xxx to boot

to build a 2.6.14-rc2-rt2 tree, the following patches should be applied:

http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.13.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.14-rc2.bz2
http://redhat.com/~mingo/realtime-preempt/patch-2.6.14-rc2-rt2

Ingo

2005-09-27 06:08:13

by Eran Mann

Subject: Re: 2.6.14-rc2-rt2

The attached 2 patches (against 2.6.14-rc2-rt3) seem to be required to
compile dccp and nfnetlink (only compile-tested).

Ingo Molnar wrote:
> i have released the 2.6.14-rc2-rt2 tree, which can be downloaded from
> the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
...


Attachments:
rt-netfilter.patch (895.00 B)
rt-dccp.patch (1.10 kB)

2005-09-27 10:32:54

by Ingo Molnar

Subject: Re: 2.6.14-rc2-rt2


* Eran Mann <[email protected]> wrote:

> The attached 2 patches (against 2.6.14-rc2-rt3) seem to be required to
> compile dccp and nfnetlink (only compile-tested).

thanks, applied.

Ingo

2005-09-27 17:00:55

by Fernando Lopez-Lezcano

Subject: Re: 2.6.14-rc2-rt2

On Mon, 2005-09-26 at 09:02 +0200, Ingo Molnar wrote:
> i have released the 2.6.14-rc2-rt2 tree, which can be downloaded from
> the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> the biggest change is the merge to the 2.6.14 tree, but there are also
> updates all across the board: lots of ktimer updates and fixes from
> Thomas Gleixner, an important PI fix from Steven Rostedt, and lots of
> other details.

can't build 2.6.14-rc2-rt4 (also happened on rt1, I think):

CHK include/linux/version.h
SYMLINK include/asm -> include/asm-i386
SPLIT include/linux/autoconf.h -> include/config/*
CHK usr/initramfs_list
UPD usr/initramfs_list
CHK include/linux/compile.h
UPD include/linux/compile.h
{standard input}: Assembler messages:
{standard input}:164: Error: can't resolve `.sched.text' {.sched.text
section} - `.Ltext0' {.text section}
{standard input}:165: Error: can't resolve `.sched.text' {.sched.text
section} - `.Ltext0' {.text section}
make[1]: *** [arch/i386/kernel/semaphore.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [arch/i386/kernel] Error 2
make: *** Waiting for unfinished jobs....

Failing .config attached.
-- Fernando


Attachments:
kernel-2.6.13-i686.ccrma.config (59.03 kB)

2005-09-27 22:15:06

by Thomas Gleixner

Subject: Re: 2.6.14-rc2-rt2

On Tue, 2005-09-27 at 09:59 -0700, Fernando Lopez-Lezcano wrote:
> {standard input}:165: Error: can't resolve `.sched.text' {.sched.text
> section} - `.Ltext0' {.text section}
> make[1]: *** [arch/i386/kernel/semaphore.o] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [arch/i386/kernel] Error 2
> make: *** Waiting for unfinished jobs....

That's a gcc problem. Which gcc version are you using?

tglx


2005-09-27 23:10:30

by Daniel Walker

Subject: Re: 2.6.14-rc2-rt2

On Tue, 2005-09-27 at 09:59 -0700, Fernando Lopez-Lezcano wrote:
> UPD include/linux/compile.h
> {standard input}: Assembler messages:
> {standard input}:164: Error: can't resolve `.sched.text' {.sched.text
> section} - `.Ltext0' {.text section}
> {standard input}:165: Error: can't resolve `.sched.text' {.sched.text
> section} - `.Ltext0' {.text section}
> make[1]: *** [arch/i386/kernel/semaphore.o] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [arch/i386/kernel] Error 2
> make: *** Waiting for unfinished jobs....
>
> Failing .config attached.
> -- Fernando
>

Here's the fix.

Index: linux-2.6.13/lib/semaphore-sleepers.c
===================================================================
--- linux-2.6.13.orig/lib/semaphore-sleepers.c
+++ linux-2.6.13/lib/semaphore-sleepers.c
@@ -176,3 +176,10 @@ fastcall int __compat_down_trylock(struc
spin_unlock_irqrestore(&sem->wait.lock, flags);
return 1;
}
+
+int fastcall compat_sem_is_locked(struct compat_semaphore *sem)
+{
+ return (int) atomic_read(&sem->count) < 0;
+}
+
+EXPORT_SYMBOL(compat_sem_is_locked);
Index: linux-2.6.13/arch/i386/kernel/semaphore.c
===================================================================
--- linux-2.6.13.orig/arch/i386/kernel/semaphore.c
+++ linux-2.6.13/arch/i386/kernel/semaphore.c
@@ -102,10 +102,3 @@ asm(
"ret"
);

-int fastcall compat_sem_is_locked(struct compat_semaphore *sem)
-{
- return (int) atomic_read(&sem->count) < 0;
-}
-
-EXPORT_SYMBOL(compat_sem_is_locked);
-


2005-09-27 23:11:41

by Fernando Lopez-Lezcano

Subject: Re: 2.6.14-rc2-rt2

On Wed, 2005-09-28 at 00:15 +0200, Thomas Gleixner wrote:
> On Tue, 2005-09-27 at 09:59 -0700, Fernando Lopez-Lezcano wrote:
> > {standard input}:165: Error: can't resolve `.sched.text' {.sched.text
> > section} - `.Ltext0' {.text section}
> > make[1]: *** [arch/i386/kernel/semaphore.o] Error 1
> > make[1]: *** Waiting for unfinished jobs....
> > make: *** [arch/i386/kernel] Error 2
> > make: *** Waiting for unfinished jobs....
>
> That's a gcc problem. Which gcc version are you using?

# gcc --version
gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)
(on FC4)

-- Fernando


2005-09-28 03:05:56

by Fernando Lopez-Lezcano

Subject: Re: 2.6.14-rc2-rt2

On Tue, 2005-09-27 at 16:10 -0700, Daniel Walker wrote:
> On Tue, 2005-09-27 at 09:59 -0700, Fernando Lopez-Lezcano wrote:
> > UPD include/linux/compile.h
> > {standard input}: Assembler messages:
> > {standard input}:164: Error: can't resolve `.sched.text' {.sched.text
> > section} - `.Ltext0' {.text section}
> > {standard input}:165: Error: can't resolve `.sched.text' {.sched.text
> > section} - `.Ltext0' {.text section}
> > make[1]: *** [arch/i386/kernel/semaphore.o] Error 1
> > make[1]: *** Waiting for unfinished jobs....
> > make: *** [arch/i386/kernel] Error 2
> > make: *** Waiting for unfinished jobs....
> >
> > Failing .config attached.
>
> Here's the fix.

Hey thanks! That fixes that, but the compile fails further along:

CHK include/linux/compile.h
UPD include/linux/compile.h
arch/i386/kernel/built-in.o(.text+0xf086): In function `do_powersaver':
longhaul.c: undefined reference to `safe_halt'
arch/i386/kernel/built-in.o(.text+0xf271): In function
`longhaul_setstate':
longhaul.c: undefined reference to `safe_halt'
make: *** [.tmp_vmlinux1] Error 1

-- Fernando


2005-09-28 09:10:04

by Peter Zijlstra

Subject: Re: 2.6.14-rc2-rt2

Hi all,

-rt[45] do not compile with CONFIG_HOTPLUG_CPU. ktimers seem to mess up.
Not a biggie, don't need it anyway.

Peter

--
Peter Zijlstra <[email protected]>

2005-09-28 09:47:44

by Ingo Molnar

Subject: Re: 2.6.14-rc2-rt2


* Fernando Lopez-Lezcano <[email protected]> wrote:

> > Here's the fix.
>
> Hey thanks! That fixes that, but the compile fails further along:
>
> CHK include/linux/compile.h
> UPD include/linux/compile.h
> arch/i386/kernel/built-in.o(.text+0xf086): In function `do_powersaver':
> longhaul.c: undefined reference to `safe_halt'
> arch/i386/kernel/built-in.o(.text+0xf271): In function
> `longhaul_setstate':
> longhaul.c: undefined reference to `safe_halt'
> make: *** [.tmp_vmlinux1] Error 1

could you try 2.6.14-rc2-rt6, does it build?

Ingo

2005-09-28 16:37:06

by Fernando Lopez-Lezcano

Subject: Re: 2.6.14-rc2-rt2

On Wed, 2005-09-28 at 11:48 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[email protected]> wrote:
>
> > > Here's the fix.
> >
> > Hey thanks! That fixes that, but the compile fails further along:
> >
> > CHK include/linux/compile.h
> > UPD include/linux/compile.h
> > arch/i386/kernel/built-in.o(.text+0xf086): In function `do_powersaver':
> > longhaul.c: undefined reference to `safe_halt'
> > arch/i386/kernel/built-in.o(.text+0xf271): In function
> > `longhaul_setstate':
> > longhaul.c: undefined reference to `safe_halt'
> > make: *** [.tmp_vmlinux1] Error 1
>
> could you try 2.6.14-rc2-rt6, does it build?

No, sorry...

fs/ntfs/aops.c: In function 'ntfs_end_buffer_async_read':
fs/ntfs/aops.c:108: error: 'BH_Uptodate_Lock' undeclared (first use in
this function)
fs/ntfs/aops.c:108: error: (Each undeclared identifier is reported only
once
fs/ntfs/aops.c:108: error: for each function it appears in.)
make[2]: *** [fs/ntfs/aops.o] Error 1

and (probably unrelated to rt):

drivers/isdn/hisax/config.c: In function 'HiSax_readstatus':
drivers/isdn/hisax/config.c:636: warning: ignoring return value of
'copy_to_user', declared with attribute warn_unused_result
drivers/isdn/hisax/config.c:647: warning: ignoring return value of
'copy_to_user', declared with attribute warn_unused_result
drivers/isdn/hisax/callc.c: In function 'HiSax_writebuf_skb':
drivers/isdn/hisax/callc.c:1781: warning: large integer implicitly
truncated to unsigned type
drivers/isdn/hisax/st5481_usb.c: In function 'st5481_in_mode':
drivers/isdn/hisax/st5481_usb.c:648: error: 'URB_ASYNC_UNLINK'
undeclared (first use in this function)
drivers/isdn/hisax/st5481_usb.c:648: error: (Each undeclared identifier
is reported only once
drivers/isdn/hisax/st5481_usb.c:648: error: for each function it appears
in.)
make[3]: *** [drivers/isdn/hisax/st5481_usb.o] Error 1

-- Fernando


2005-09-29 09:02:14

by Eran Mann

Subject: Re: 2.6.14-rc2-rt2


Fernando Lopez-Lezcano wrote:
>>could you try 2.6.14-rc2-rt6, does it build?
>
>
> No, sorry...
>
> fs/ntfs/aops.c: In function 'ntfs_end_buffer_async_read':
> fs/ntfs/aops.c:108: error: 'BH_Uptodate_Lock' undeclared (first use in
> this function)
> fs/ntfs/aops.c:108: error: (Each undeclared identifier is reported only
> once
> fs/ntfs/aops.c:108: error: for each function it appears in.)
> make[2]: *** [fs/ntfs/aops.o] Error 1
>
> and (probably unrelated to rt):
>
> drivers/isdn/hisax/config.c: In function 'HiSax_readstatus':
> drivers/isdn/hisax/config.c:636: warning: ignoring return value of
> 'copy_to_user', declared with attribute warn_unused_result
> drivers/isdn/hisax/config.c:647: warning: ignoring return value of
> 'copy_to_user', declared with attribute warn_unused_result
> drivers/isdn/hisax/callc.c: In function 'HiSax_writebuf_skb':
> drivers/isdn/hisax/callc.c:1781: warning: large integer implicitly
> truncated to unsigned type
> drivers/isdn/hisax/st5481_usb.c: In function 'st5481_in_mode':
> drivers/isdn/hisax/st5481_usb.c:648: error: 'URB_ASYNC_UNLINK'
> undeclared (first use in this function)
> drivers/isdn/hisax/st5481_usb.c:648: error: (Each undeclared identifier
> is reported only once
> drivers/isdn/hisax/st5481_usb.c:648: error: for each function it appears
> in.)
> make[3]: *** [drivers/isdn/hisax/st5481_usb.o] Error 1
>
> -- Fernando
>
>
Regarding NTFS - try with the attached patch. It seems to be still
missing from 2.6.14-rc2-rt7.


Attachments:
linux-rt-ntfs.patch (994.00 B)

2005-09-29 16:46:31

by Badari Pulavarty

Subject: Re: 2.6.14-rc2-rt2

On Mon, 2005-09-26 at 09:02 +0200, Ingo Molnar wrote:
> i have released the 2.6.14-rc2-rt2 tree, which can be downloaded from
> the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>

Hi Ingo,

I noticed that you moved to "-rt7" already.
"-rt7" fails to compile with CONFIG_NUMA.

mm/slab.c:2404: error: conflicting types for `kmem_cache_alloc_node'
include/linux/slab.h:122: error: previous declaration of
`kmem_cache_alloc_node'

Here is the simple fix.

Thanks,
Badari



Attachments:
rt7-fix.patch (495.00 B)

2005-09-30 10:57:43

by Ingo Molnar

Subject: Re: 2.6.14-rc2-rt2


* Badari Pulavarty <[email protected]> wrote:

> On Mon, 2005-09-26 at 09:02 +0200, Ingo Molnar wrote:
> > i have released the 2.6.14-rc2-rt2 tree, which can be downloaded from
> > the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
>
> Hi Ingo,
>
> I noticed that you moved to "-rt7" already.
> "-rt7" fails to compile with CONFIG_NUMA.
>
> mm/slab.c:2404: error: conflicting types for `kmem_cache_alloc_node'
> include/linux/slab.h:122: error: previous declaration of
> `kmem_cache_alloc_node'
>
> Here is the simple fix.

thanks, applied.

Ingo

2005-10-02 15:18:15

by Ingo Molnar

Subject: 2.6.14-rc3-rt1


i have released the 2.6.14-rc3-rt1 tree, which can be downloaded from
the usual place:

http://redhat.com/~mingo/realtime-preempt/

the biggest change is the merge of the generic ARM-irq patches into the
-rt tree, and a port of -rt to the ARM platform, by Thomas Gleixner and
John Cooper. There are also lots of updates and cleanups in the ktimer
code. Also, x64 should work again. Plus smaller changes all around.

Changes since 2.6.14-rc2-rt2:

- ARM-genirq code (Thomas Gleixner, me - testing by lots of people)

- latency tracing on ARM (John Cooper)

- port of -rt to ARM (Thomas Gleixner)

- lots of ktimer updates/cleanups (Thomas Gleixner)

- NTFS bit-spinlock fix (Eran Mann)

- gcc4 build fix (Daniel Walker)

- fix "No Forced Preemption (Server)" build problems
(reported by Mark Knecht)

- convert epca_lock to the new syntax (Daniel Walker)

- typo fix in latency-hist prototype (Clark Williams)

- netlink build fix (Eran Mann)

- dccp build fix (Eran Mann)

- x64 build fixes

- fix audit.c compilation error

- merge to 2.6.14-rc3

- cpufreq build fix

- pcmcia build fix

- XFS build fix

to build a 2.6.14-rc3-rt1 tree, the following patches should be applied:

http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.13.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.14-rc3.bz2
http://redhat.com/~mingo/realtime-preempt/patch-2.6.14-rc3-rt1

Ingo

2005-10-02 15:42:24

by Mark Knecht

Subject: Re: 2.6.14-rc3-rt1

Thanks Ingo. 2.6.14-rc2-rt7 on AMD64 has been working well for me the
last few days. (After finally getting it to build!) I expect that I'll
build 2.6.14-rc3-rt1 today.

Cheers,
Mark


2005-10-02 19:25:27

by Mark Knecht

Subject: Re: 2.6.14-rc3-rt1

2.6.14-rc3-rt1 is up and running for me. No problems at all building it.
I've got JACK running at <5 ms and doing light streaming from 1394
drives. I'm using realtime-lsm.

The only problem I had over the last few days happened with
2.6.14-rc2-rt7. One time, when attempting to shutdown, the machine
hung after the 'Unloading Alsa modules...[OK]' step.

I'm going to be doing some composition/recording in Ardour this week.
I'll use one of these two RC kernels and see how it goes.

Cheers,
Mark


2005-10-02 20:52:25

by Felix Oxley

Subject: Re: 2.6.14-rc3-rt1


I have a compile error in drivers/net/hamradio/mkiss.c

CC [M] drivers/net/hamradio/mkiss.o
drivers/net/hamradio/mkiss.c:625: error:
‘RW_LOCK_UNLOCKED’ undeclared here (not in a function)

Due to the fact that

RW_LOCK_UNLOCKED

has not been converted to the form

RW_LOCK_UNLOCKED(name.lock)

by the RT patch.

But I don't actually need this module anyway. :-)
regards,
Felix

2005-10-02 21:55:14

by Felix Oxley

Subject: Re: 2.6.14-rc3-rt1

On Sunday 02 October 2005 21:51, Felix Oxley wrote:
> I have a compile error in drivers/net/hamradio/mkiss.c

I should have said - I have added Ralf Baechle to this post as I think he is
the maintainer for this module.

Ralf, I hope that is OK? :-)

regards,
Felix

2005-10-03 06:33:13

by Ingo Molnar

Subject: Re: 2.6.14-rc3-rt1


* Felix Oxley <[email protected]> wrote:

> I have a compile error in drivers/net/hamradio/mkiss.c
>
> CC [M] drivers/net/hamradio/mkiss.o
> drivers/net/hamradio/mkiss.c:625: error:
> ‘RW_LOCK_UNLOCKED’ undeclared here (not in a function)
>
> Due to the fact that
>
> RW_LOCK_UNLOCKED
>
> has not been converted to the form
>
> RW_LOCK_UNLOCKED(name.lock)
>
> by the RT patch.

i've applied the cleanup below to my tree - it might as well go upstream
too, it's slightly more compact than the explicit initializer.

Ingo

Signed-off-by: Ingo Molnar <[email protected]>

Index: linux/drivers/net/hamradio/mkiss.c
===================================================================
--- linux.orig/drivers/net/hamradio/mkiss.c
+++ linux/drivers/net/hamradio/mkiss.c
@@ -622,7 +622,7 @@ static void ax_setup(struct net_device *
* best way to fix this is to use a rwlock in the tty struct, but for now we
* use a single global rwlock for all ttys in ppp line discipline.
*/
-static rwlock_t disc_data_lock = RW_LOCK_UNLOCKED;
+static DEFINE_RWLOCK(disc_data_lock);

static struct mkiss *mkiss_get(struct tty_struct *tty)
{

2005-10-06 17:14:28

by Steven Rostedt

Subject: Re: 2.6.14-rc3-rt1



On Sun, 2 Oct 2005, Mark Knecht wrote:

>
> The only problem I had over the last few days happened with
> 2.6.14-rc2-rt7. One time, when attempting to shutdown, the machine
> hung after the 'Unloading Alsa modules...[OK]' step.

Actually it may be the next step. Do you have pcmcia configured? I've
been noticing that my system has been locking up on shutdown of the
pcmcia.

Ingo, here's the patch. This should probably go upstream too since it can
happen there too. The pccardd thread has a race in it that it can
shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.

-- Steve

PS. Thanks for the info on quilt ;-)

Index: linux-rt-quilt/drivers/pcmcia/cs.c
===================================================================
--- linux-rt-quilt.orig/drivers/pcmcia/cs.c 2005-10-06 08:03:56.000000000 -0400
+++ linux-rt-quilt/drivers/pcmcia/cs.c 2005-10-06 12:48:02.000000000 -0400
@@ -689,6 +689,9 @@
schedule();
try_to_freeze();
}
+ /* make sure we are running before we exit */
+ set_current_state(TASK_RUNNING);
+
remove_wait_queue(&skt->thread_wait, &wait);

/* remove from the device core */

2005-10-07 11:08:51

by Ingo Molnar

Subject: [patch] pcmcia-shutdown-fix.patch


* Steven Rostedt <[email protected]> wrote:

> Ingo, here's the patch. This should probably go upstream too since it
> can happen there too. The pccardd thread has a race in it that it can
> shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.

ah, certainly makes sense. Dominik, does it look good to you too? Patch
below is for upstream.

Ingo

----

The pccardd thread has a race in which it can leave its main loop and
shut down while still in the TASK_INTERRUPTIBLE state. Found on the -rt
kernel.

From: Steven Rostedt <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

--

drivers/pcmcia/cs.c | 3 +++
1 files changed, 3 insertions(+)

Index: linux/drivers/pcmcia/cs.c
===================================================================
--- linux.orig/drivers/pcmcia/cs.c
+++ linux/drivers/pcmcia/cs.c
@@ -689,6 +689,9 @@ static int pccardd(void *__skt)
schedule();
try_to_freeze();
}
+ /* make sure we are running before we exit */
+ set_current_state(TASK_RUNNING);
+
remove_wait_queue(&skt->thread_wait, &wait);

/* remove from the device core */

2005-10-07 19:17:26

by Russell King

Subject: Re: [patch] pcmcia-shutdown-fix.patch

On Fri, Oct 07, 2005 at 01:09:14PM +0200, Ingo Molnar wrote:
>
> * Steven Rostedt <[email protected]> wrote:
>
> > Ingo, here's the patch. This should probably go upstream too since it
> > can happen there too. The pccardd thread has a race in it that it can
> > shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.
>
> ah, certainly makes sense. Dominik, does it look good to you too? Patch
> below is for upstream.

Looks correct to me (I'm the author of this code.) Since it's
a bug fix, please send it upstream ASAP.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2005-10-07 19:47:09

by Steven Rostedt

Subject: Re: [patch] pcmcia-shutdown-fix.patch



On Fri, 7 Oct 2005, Russell King wrote:

> On Fri, Oct 07, 2005 at 01:09:14PM +0200, Ingo Molnar wrote:
> >
> > * Steven Rostedt <[email protected]> wrote:
> >
> > > Ingo, here's the patch. This should probably go upstream too since it
> > > can happen there too. The pccardd thread has a race in it that it can
> > > shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.
> >
> > ah, certainly makes sense. Dominik, does it look good to you too? Patch
> > below is for upstream.
>
> Looks correct to me (I'm the author of this code.) Since it's
> a bug fix, please send it upstream ASAP.
>

Russell,

I believe that the email that Ingo was sending was an upstream patch. You
cut out the patch part in this reply. I'm sure if you just Ack that patch
and CC it to the powers that be, it will go in.

-- Steve

2005-10-10 15:14:15

by Steven Rostedt

Subject: Re: [patch] pcmcia-shutdown-fix.patch


On Fri, 7 Oct 2005, Russell King wrote:

> On Fri, Oct 07, 2005 at 01:09:14PM +0200, Ingo Molnar wrote:
> >
> > * Steven Rostedt <[email protected]> wrote:
> >
> > > Ingo, here's the patch. This should probably go upstream too since it
> > > can happen there too. The pccardd thread has a race in it that it can
> > > shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.
> >
> > ah, certainly makes sense. Dominik, does it look good to you too? Patch
> > below is for upstream.
>
> Looks correct to me (I'm the author of this code.) Since it's
> a bug fix, please send it upstream ASAP.
>

Just in case this was missed and hasn't been incorporated, here's the
patch once again:

-- Steve

Signed-off-by: Steven Rostedt <[email protected]>

Index: linux-2.6.14-rc3/drivers/pcmcia/cs.c
===================================================================
--- linux-2.6.14-rc3/drivers/pcmcia/cs.c.orig 2005-10-06 06:56:17.000000000 -0400
+++ linux-2.6.14-rc3/drivers/pcmcia/cs.c 2005-10-10 11:05:09.000000000 -0400
@@ -689,6 +689,9 @@
schedule();
try_to_freeze();
}
+ /* make sure we are running before we exit */
+ set_current_state(TASK_RUNNING);
+
remove_wait_queue(&skt->thread_wait, &wait);

/* remove from the device core */

2005-10-10 15:38:26

by Steven Rostedt

Subject: Re: [patch] pcmcia-shutdown-fix.patch



On Mon, 10 Oct 2005, Steven Rostedt wrote:

>
> On Fri, 7 Oct 2005, Russell King wrote:
>
> > On Fri, Oct 07, 2005 at 01:09:14PM +0200, Ingo Molnar wrote:
> > >
> > > * Steven Rostedt <[email protected]> wrote:
> > >
> > > > Ingo, here's the patch. This should probably go upstream too since it
> > > > can happen there too. The pccardd thread has a race in it that it can
> > > > shutdown in the TASK_INTERRUPTIBLE state. Here's the fix.
> > >
> > > ah, certainly makes sense. Dominik, does it look good to you too? Patch
> > > below is for upstream.
> >
> > Looks correct to me (I'm the author of this code.) Since it's
> > a bug fix, please send it upstream ASAP.
> >
>
> Just in case this was missed and hasn't been incorporated. Here's the
> patch once again:
>

Oh, and I forgot to add the write-up that Ingo did to explain the patch.

----

The pccardd thread has a race in which it can leave its main loop and
shut down while still in the TASK_INTERRUPTIBLE state. Found on the -rt
kernel.

Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


> -- Steve
>
> Signed-off-by: Steven Rostedt <[email protected]>
>
> Index: linux-2.6.14-rc3/drivers/pcmcia/cs.c
> ===================================================================
> --- linux-2.6.14-rc3/drivers/pcmcia/cs.c.orig 2005-10-06 06:56:17.000000000 -0400
> +++ linux-2.6.14-rc3/drivers/pcmcia/cs.c 2005-10-10 11:05:09.000000000 -0400
> @@ -689,6 +689,9 @@
> schedule();
> try_to_freeze();
> }
> + /* make sure we are running before we exit */
> + set_current_state(TASK_RUNNING);
> +
> remove_wait_queue(&skt->thread_wait, &wait);
>
> /* remove from the device core */