2002-04-04 11:59:42

by Adam J. Richter

[permalink] [raw]
Subject: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

When I attempted to boot linux-2.5.8-pre1, I got a kernel
BUG() for exit.c line 519. The was a small change to to kernel/exit.c
in 2.5.8-pre1 which deleted a kernel_lock() call. Restoring that line
resulted in a kernel that booted fine. I am sending this email from
the machine running that kernel (so I guess a matching release of
the kernel lock is already in the code).

Here is the patch. Of course, it would be helpful if someone
who actually understands this could take a look at it.

--
Adam J. Richter __ ______________ 4880 Stevens Creek Blvd, Suite 104
[email protected] \ / San Jose, California 95129-1034
+1 408 261-6630 | g g d r a s i l United States of America
fax +1 408 261-6631 "Free Software For The Rest Of Us."


Attachments:
(No filename) (786.00 B)
exit.diff (263.00 B)
Download all attachments

2002-04-04 12:56:57

by Stelian Pop

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, Apr 04, 2002 at 03:59:10AM -0800, Adam J. Richter wrote:

> When I attempted to boot linux-2.5.8-pre1, I got a kernel
> BUG() for exit.c line 519. The was a small change to to kernel/exit.c
> in 2.5.8-pre1 which deleted a kernel_lock() call. Restoring that line
> resulted in a kernel that booted fine. I am sending this email from
> the machine running that kernel (so I guess a matching release of
> the kernel lock is already in the code).

It should be added that the bug is hit only if CONFIG_PREEMPT is on.

Stelian.
--
Stelian Pop <[email protected]>
Alcove - http://www.alcove.com

2002-04-04 13:43:27

by Alessandro Suardi

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

Stelian Pop wrote:
> On Thu, Apr 04, 2002 at 03:59:10AM -0800, Adam J. Richter wrote:
>
>
>> When I attempted to boot linux-2.5.8-pre1, I got a kernel
>>BUG() for exit.c line 519. The was a small change to to kernel/exit.c
>>in 2.5.8-pre1 which deleted a kernel_lock() call. Restoring that line
>>resulted in a kernel that booted fine. I am sending this email from
>>the machine running that kernel (so I guess a matching release of
>>the kernel lock is already in the code).
>
>
> It should be added that the bug is hit only if CONFIG_PREEMPT is on.

Just to say that

* I hit the bug
* I have CONFIG_PREEMPT
* Adam's fix "works for me (TM)"

--alessandro

"time is never time at all / you can never ever leave
without leaving a piece of youth"
(Smashing Pumpkins, "Tonight, tonight")

2002-04-04 16:27:02

by Dave Hansen

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

Adam J. Richter wrote:

> When I attempted to boot linux-2.5.8-pre1, I got a kernel
>BUG() for exit.c line 519.
>
That bug is hit when the schedule() returns. In the do_exit() case,
schedule is not supposed to return. After the current task is scheduled
out, it is destroyed. I guess that most of the freeing of the task's
resources is done in do_exit(), but I don't see where its kernel stack
is freed.

> The was a small change to to kernel/exit.c
>in 2.5.8-pre1 which deleted a kernel_lock() call. Restoring that line
>resulted in a kernel that booted fine.
>
I take it you don't have a copy of the BUG(). I was going to ask if
preemption was enabled, but I see that it was from another message. I
was guessing that preemption contributed to this, but now I know. The
lock_kernel() has 2 different effects here. It locks the kernel_flag,
AND it disables preemption. The correct fix here will probably be to
disable preemption, rather than readd the lock_kernel().

For the preemption gurus:
Is a preempt_disable() in do_exit() going to hurt anything? Shold we
selectively skip the preempt_disable() in schedule() if it was
schedule() called from do_exit()?
--
Dave Hansen
[email protected]

2002-04-04 18:28:33

by Dave Hansen

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

--- linux-2.5.8-pre1-clean/kernel/exit.c Thu Apr 4 08:58:31 2002
+++ linux/kernel/exit.c Thu Apr 4 10:19:37 2002
@@ -499,6 +499,11 @@
acct_process(code);
__exit_mm(tsk);

+ /* I removed the lock_kernel() from here. It caused preempt kernels
+ to oops. This fixes it for now, but the real cause needs to
+ be found.
+ - Dave Hansen <[email protected]> 04-04-2002 */
+ preempt_disable();
sem_exit();
__exit_files(tsk);
__exit_fs(tsk);
@@ -515,6 +520,7 @@

tsk->exit_code = code;
exit_notify();
+ preempt_enable_no_resched(); /* partner of above preempt_disable(); */
schedule();
BUG();
/*


Attachments:
do_exit-add_preempt_disable.2.5.8-pre1.patch (633.00 B)

2002-04-04 18:52:16

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 13:28, Dave Hansen wrote:

> I've replicated the problem too.
> I've diabled preemption in the area where it used to be disabled because
> of the old lock_kernel(). I'm sending this message from a machine with
> that patch applied, so the patch does fix it. As the comment says, this
> is something that the preempt experts need to take a look at.
> Linus, this is a hack, and there is probably still a window where
> preemption can happen. But, it is a band-aid until we find the real
> problem.

Thanks for the CC. I've been looking into this problem. I am not too
sure why we require protection from concurrency via preemption and not
via SMP. In other words, why are we SMP-safe but not preempt-safe here.

I don't really have an answer.

The problem I saw on boot was the BUG being triggered on the last line.
If we are able to preempt here and cause some problem, what proof is
there that we don't have a race wrt SMP?

Robert Love

2002-04-04 19:14:35

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On Thu, 4 Apr 2002, Dave Hansen wrote:
>
> I've diabled preemption in the area where it used to be disabled because
> of the old lock_kernel(). I'm sending this message from a machine with
> that patch applied, so the patch does fix it.

I think the real fix is to make sure that preemption never hits while
current->state == TASK_ZOMBIE.

In other words, I think the _correct_ fix is to just make
preempt_schedule() check for something like

if (current->state != TASK_RUNNING)
return;

and just getting rid of the current "unconditionally set state to
running".

(Side note: if the state isn't running, we're most likely going to
reschedule in a controlled manner soon anyway. Sure, there are some loops
which set state to non-runnable early for race avoidance, but is it a
_problem_?)

Linus

2002-04-04 19:15:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On 4 Apr 2002, Robert Love wrote:
>
> Thanks for the CC. I've been looking into this problem. I am not too
> sure why we require protection from concurrency via preemption and not
> via SMP. In other words, why are we SMP-safe but not preempt-safe here.
>
> I don't really have an answer.

The answer is that preempt_schedule() illegally sets

current->state = TASK_RUNNING;

without asking the process whether that's ok. The SMP code never does
anything like that.

Linus

2002-04-04 19:17:35

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 14:13, Linus Torvalds wrote:

> I think the real fix is to make sure that preemption never hits while
> current->state == TASK_ZOMBIE.
>
> In other words, I think the _correct_ fix is to just make
> preempt_schedule() check for something like
>
> if (current->state != TASK_RUNNING)
> return;
>
> and just getting rid of the current "unconditionally set state to
> running".
>
> (Side note: if the state isn't running, we're most likely going to
> reschedule in a controlled manner soon anyway. Sure, there are some loops
> which set state to non-runnable early for race avoidance, but is it a
> _problem_?)

Eh, maybe - what about all the code that sets non-running before putting
itself on a wait queue?

Robert Love

2002-04-04 19:28:07

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 14:14, Linus Torvalds wrote:

> The answer is that preempt_schedule() illegally sets
>
> current->state = TASK_RUNNING;
>
> without asking the process whether that's ok. The SMP code never does
> anything like that.

Well Ingo added that ;)

We used to just set a flag in the preempt_count that marked the task as
preempted and made sure on its next trip into schedule it ran again.

Do you think it is better to deny preemption if state==TASK_ZOMBIE (note
this requires code in preempt_schedule and the interrupt return path,
since Ingo decoupled the two) or just disable preemption around critical
regions caused by setting state to TASK_ZOMBIE ?

I suspect this is the first occurrence of a problem of this kind ... and
the attached patch handles it.

Robert Love

diff -urN linux-2.5.8-pre1/kernel/exit.c linux/kernel/exit.c
--- linux-2.5.8-pre1/kernel/exit.c Wed Apr 3 20:57:37 2002
+++ linux/kernel/exit.c Thu Apr 4 14:22:30 2002
@@ -499,6 +499,8 @@
acct_process(code);
__exit_mm(tsk);

+ preempt_disable();
+
sem_exit();
__exit_files(tsk);
__exit_fs(tsk);
@@ -515,6 +517,7 @@

tsk->exit_code = code;
exit_notify();
+ preempt_enable_no_resched();
schedule();
BUG();
/*


2002-04-04 19:42:15

by Dave Hansen

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

Robert Love wrote:
> tsk->exit_code = code;
> exit_notify();
> + preempt_enable_no_resched();
* PREEMPT HERE *
> schedule();
> BUG();

Isn't there still a race here? A preempt CAN happen after you reenable,
right? Admittedly, its a small window, but it _is_ a window.


--
Dave Hansen
[email protected]

2002-04-04 19:45:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On 4 Apr 2002, Robert Love wrote:
>
> Eh, maybe - what about all the code that sets non-running before putting
> itself on a wait queue?

In most cases that code will call a schedule itself.

Of course, we might make just ZOMBIE a special case, but in general I
think it's simply absolutely wrong for the preemption to change task
internal data structures on its own.

Linus

2002-04-04 19:51:05

by George Anzinger

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Linus Torvalds wrote:
>
> On Thu, 4 Apr 2002, Dave Hansen wrote:
> >
> > I've diabled preemption in the area where it used to be disabled because
> > of the old lock_kernel(). I'm sending this message from a machine with
> > that patch applied, so the patch does fix it.
>
> I think the real fix is to make sure that preemption never hits while
> current->state == TASK_ZOMBIE.
>
> In other words, I think the _correct_ fix is to just make
> preempt_schedule() check for something like
>
> if (current->state != TASK_RUNNING)
> return;
>
I agree that the set state to running code should go, but there are
places in the kernel where a great deal of time is spent with the state
other than running. AND most of them make sense. Adding a PREEMPTING
super state, as was in the original patch, addressed this quite well
IMHO. This super state was treated by the scheduler as RUNNING
regardless of the actual state. As long as it is not in the actual
"state" member, the condition waited for can still set the true state to
RUNNING and all is well.

What is the objection to this?

> and just getting rid of the current "unconditionally set state to
> running".
>
> (Side note: if the state isn't running, we're most likely going to
> reschedule in a controlled manner soon anyway. Sure, there are some loops
> which set state to non-runnable early for race avoidance, but is it a
> _problem_?)
>
All I can say is that we thought it was when we did the original
preemption work. Some of those loops are long.

I tend to agree with not doing the random "set state to running". It
breeds paranoia, and rightly so.

And yes the ZOMBIE code must not be preempted. This is one of the
reasons the preempt code was designed to allow schedule() to be called
with preemption disabled. I gives a nice clean schedule() call. It
might be a better solution than the interrupt off schedule() calls made
in the sleep calls in sched.c.

> Linus
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2002-04-04 20:04:36

by George Anzinger

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Dave Hansen wrote:
>
> Robert Love wrote:
> > tsk->exit_code = code;
> > exit_notify();
> > + preempt_enable_no_resched();
> * PREEMPT HERE *
> > schedule();
> > BUG();
>
> Isn't there still a race here? A preempt CAN happen after you reenable,
> right? Admittedly, its a small window, but it _is_ a window.

Right, eliminate this line. Since the task is going away it doesn't
matter. Actually it doesn't matter even if schedule() returns (i.e. for
other than ZOMBIE) as long as preempt is enabled after the schedule()
call, somewhere. It is designed to work this way for a reason, this
being one of them.

-g
>
> --
> Dave Hansen
> [email protected]
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2002-04-04 20:13:40

by George Anzinger

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Linus Torvalds wrote:
>
> On 4 Apr 2002, Robert Love wrote:
> >
> > Eh, maybe - what about all the code that sets non-running before putting
> > itself on a wait queue?
>
> In most cases that code will call a schedule itself.
>
> Of course, we might make just ZOMBIE a special case, but in general I
> think it's simply absolutely wrong for the preemption to change task
> internal data structures on its own.
>
Amen!

> Linus
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2002-04-04 20:55:23

by Andrew Morton

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Robert Love wrote:
>
> ...
> Do you think it is better to deny preemption if state==TASK_ZOMBIE (note
> this requires code in preempt_schedule and the interrupt return path,
> since Ingo decoupled the two) or just disable preemption around critical
> regions caused by setting state to TASK_ZOMBIE ?
>
> I suspect this is the first occurrence of a problem of this kind ... and
> the attached patch handles it.
>

No, the problem goes deeper than this.

I have code which does, effectively:

sleeper()
{
spin_lock(&some_lock);
set_current_state(TASK_UNINTERRUPTIBLE);
some_flag = 0;
spin_unlock(&lock);
schedule();
if (some_flag == 0)
i_am_horribly_confused();
}

waker()
{
spin_lock(&some_lock);
some_flag = 1;
wake_up_process(sleeper);
spin_unlock(&some_lock);
}

or something like that. See __pdflush() and
pdflush_operation() in http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8-pre1/delalloc/dallocbase-60-pdflush.patch

The above code work fine, is nice and I want to keep
it that way. But it fails on preempt.

The spin_unlock() in sleeper() can sometimes set
task->state to TASK_RUNNING(), so my schedule() call
just falls straight through.

Probably nobody has noticed this in other places because
most sleep/wakeup stuff tends to be done inside a loop;
the bogus "wakeup" is ignored.

Although it can be worked around at the call site, I
think this needs fixing. Otherwise we have the rule
"spin_unlock will flip you into TASK_RUNNING 0.0001%
of the time if CONFIG_PREEMPT=y". ug.

I have thought deeply about this, and I then promptly
forgot everything I thought about, but I ended up
concluding that the sanest way of resolving this is
inside __set_current_state(). If the new state is
TASK_RUNNING and the old state is not TASK_RUNNING
then enable preemption, call schedule() if necessary, etc.

It is not acceptable to just say "don't preempt a task
which is not in state TASK_RUNNING", because if an
interrupt happens against a CPU which is running a task
which is in state TASK_INTERRUPTIBLE (say), then that
wakeup won't be serviced until the task exits the kernel.

-

2002-04-04 21:33:43

by Roger Larsson

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On torsdagen den 4 april 2002 21.26, Robert Love wrote:
> On Thu, 2002-04-04 at 14:14, Linus Torvalds wrote:
> > The answer is that preempt_schedule() illegally sets
> >
> > current->state = TASK_RUNNING;
> >
> > without asking the process whether that's OK. The SMP code never does
> > anything like that.
>
> Well Ingo added that ;)
>
> We used to just set a flag in the preempt_count that marked the task as
> preempted and made sure on its next trip into schedule it ran again.
>

How about doing:

asmlinkage void preempt_schedule(void)
{
unsigned long saved_state;

if (unlikely(preempt_get_count()))
return;

preempt_disable(); /* or use an atomic operation */
saved_state = current->state;
current->state = TASK_RUNNING;
preempt_enable_no_resched(); /* we are scheduling anyway... */
schedule();
current->state = saved_state;
}

It is unlikely to get preemption between schedule() and the
setting since schedule it self checks - the window is small.
And when it hits if will correctly restore the correct value.

Note this code does not need to solve the FLAG problem.

current->state |= FLAG
* PREEMPT *
current->state |= FLAG
schedule()
current->state &= ~FLAG
schedule() with flag disabled


/RogerL

--
Roger Larsson
Skellefte?
Sweden

2002-04-04 22:40:44

by Andrew Morton

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

Roger Larsson wrote:
>
> ...
> How about doing:
>
> asmlinkage void preempt_schedule(void)
> {
> unsigned long saved_state;
>
> if (unlikely(preempt_get_count()))
> return;
>
> preempt_disable(); /* or use an atomic operation */
> saved_state = current->state;
> current->state = TASK_RUNNING;
> preempt_enable_no_resched(); /* we are scheduling anyway... */
> schedule();

Interrupt occurs, puts this task in state TASK_RUNNING.

> current->state = saved_state;

whoops. We went back into TASK_UNINTERRUPTIBLE.

> }
>

We could fix this with changes to schedule(), but that's
not nice. Another approach would be:

preempt_schedule()
{
current->state2 = current->state;
current->state = TASK_RUNNING;
schedule();
current->state = current->state2;
}

and wake_up() would do:

tsk->state = TASK_RUNNING;
tsk->state2 = TASK_RUNNING;

With the appropriate locking, memory barriers and other
relevant goo I think this would work...

-

2002-04-04 22:43:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On Thu, 4 Apr 2002, Andrew Morton wrote:
> Another approach would be:
>
> preempt_schedule()
> {
> current->state2 = current->state;
> current->state = TASK_RUNNING;
> schedule();
> current->state = current->state2;
> }

Yes, but please no.

My current tree says

asmlinkage void preempt_schedule(void)
{
if (unlikely(preempt_get_count()))
return;
if (current->state != TASK_RUNNING)
return;
schedule();
}

and if people start getting latency problems due to loops with state !=
TASK_RUNNING, then I suspect we might just make "set_current_state()"
check that case explicitly and do a conditional reschedule (ie make it the
same as if we released a lock). That would be a hell of a lot cleaner, in
my opinion.

Linus

2002-04-04 22:56:25

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 17:38, Andrew Morton wrote:

> With the appropriate locking, memory barriers and other
> relevant goo I think this would work...

Yah, I guess, but that isn't pretty at all ;)

Andrew, remember how we used to do it (and still do it in the 2.4
patch)? Wouldn't that work? Specifically, when we enter
preempt_schedule we set a flag value in preempt_count. This flag value
is checked at the top of schedule and, if set, we skip the first chunk
of code that handles sleeping tasks. The task->state never changes.
Upon leaving schedule and returning to preempt_schedule, we unset the
flag.

This allows us to preempt tasks in any state, without problems or
special cases. It also wasn't too much overhead - compared to now,
basically just:

if (unlikely(current->preempt_count() & PREEMPT_ACTIVE))
goto pick_next_task;

at the top of schedule().

Robert Love

2002-04-04 22:57:05

by Andrew Morton

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Linus Torvalds wrote:
>
> On Thu, 4 Apr 2002, Andrew Morton wrote:
> > Another approach would be:
> >
> > preempt_schedule()
> > {
> > current->state2 = current->state;
> > current->state = TASK_RUNNING;
> > schedule();
> > current->state = current->state2;
> > }
>
> Yes, but please no.
>
> My current tree says
>
> asmlinkage void preempt_schedule(void)
> {
> if (unlikely(preempt_get_count()))
> return;
> if (current->state != TASK_RUNNING)
> return;
> schedule();
> }
>
> and if people start getting latency problems due to loops with state !=
> TASK_RUNNING, then I suspect we might just make "set_current_state()"
> check that case explicitly and do a conditional reschedule (ie make it the
> same as if we released a lock). That would be a hell of a lot cleaner, in
> my opinion.
>

That would work. And would also fix my "spin_unlock sometimes
stomps on TASK_INTERRUPTIBLE" problem.

It does mean that we'll need to convert many open-coded

current->state = whatever;

instances to use [__]set_current_state(). But that's not
a bad thing. Janitorial patches for this are already
floating about.

Robert, do you have time to do the code-and-test thing?

-

2002-04-04 23:12:57

by Andrew Morton

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

Robert Love wrote:
>
> ...
> This allows us to preempt tasks in any state, without problems or
> special cases. It also wasn't too much overhead - compared to now,
> basically just:
>
> if (unlikely(current->preempt_count() & PREEMPT_ACTIVE))
> goto pick_next_task;
>
> at the top of schedule().
>

That's changing schedule(). Seems that I'd ruled out that
option prematurely. As current->preempt_count() and PREEMPT_ACTIVE
can both evaluate to constant zero if CONFIG_PREEMPT=n, it can
be done ifdeflessly.

Everything happens inside rq->lock. Looks solid to me.

-

2002-04-04 23:09:47

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 17:42, Linus Torvalds wrote:

> My current tree says
>
> asmlinkage void preempt_schedule(void)
> {
> if (unlikely(preempt_get_count()))
> return;
> if (current->state != TASK_RUNNING)
> return;
> schedule();
> }

You need to do the same thing in entry.S ... Ingo decoupled the two
entry paths from each other (entry.S does preempt_schedule's work on its
own, and then directly calls schedule).

So, right now the ret_from_intr path sets the task state to
TASK_RUNNING. That needs to be replaced with a jump out if it is NOT
task running.

Curiously, what is wrong with the way we did it in the original patch?
I.e. set a bit in preempt_count and use that to skip to pick_next_task
in schedule. This allows us to preempt any task of any state ...

Robert Love

2002-04-04 23:16:27

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() atboot time

On Thu, 2002-04-04 at 18:10, Andrew Morton wrote:

> That's changing schedule(). Seems that I'd ruled out that
> option prematurely. As current->preempt_count() and PREEMPT_ACTIVE
> can both evaluate to constant zero if CONFIG_PREEMPT=n, it can
> be done ifdeflessly.
>
> Everything happens inside rq->lock. Looks solid to me.

It is solid - I did not just invent that approach, it was how we have
always done it until around 2.5.6-pre (Ingo sent a patch to change it).
The 2.4 patches do it this way (take a look) and 2.5 before the change
obviously worked like this.

It hasn't shown any problems in ~6 months of use.

I'll cook up a patch to do it, but I'd like to hear Linus opinion ...

Robert Love

2002-04-04 23:43:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On 4 Apr 2002, Robert Love wrote:
>
> Curiously, what is wrong with the way we did it in the original patch?
> I.e. set a bit in preempt_count and use that to skip to pick_next_task
> in schedule. This allows us to preempt any task of any state ...

Because that requires that every user of "set_task_state()" needs to know
about preemption.

Btw, I think entry.S should just call preempt_schedule() instead, instead
of knowing about these details.

Linus

2002-04-04 23:48:21

by Robert Love

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time

On Thu, 2002-04-04 at 18:42, Linus Torvalds wrote:

> Because that requires that every user of "set_task_state()" needs to know
> about preemption.

Hm, how so? I contend not to rudely set the task state but instead mark
the task as "preempted" in preempt_schedule and handle this case in
schedule.

It requires zero change to anything else; this is the behavior of the
original patch I sent you.

> Btw, I think entry.S should just call preempt_schedule() instead, instead
> of knowing about these details.

Agreed.

Robert Love


2002-04-04 23:56:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: Patch: linux-2.5.8-pre1/kernel/exit.c change caused BUG() at boot time


On 4 Apr 2002, Robert Love wrote:
>
> Hm, how so? I contend not to rudely set the task state but instead mark
> the task as "preempted" in preempt_schedule and handle this case in
> schedule.

Ahh, I misunderstood - I thought you meant setting the bit when setting
current->state.

Fair enough. Send me a patch to look at.

Linus

2002-04-05 00:04:12

by Robert Love

[permalink] [raw]
Subject: [PATCH] preemptive kernel behavior change: don't be rude

On Thu, 2002-04-04 at 18:55, Linus Torvalds wrote:
> Fair enough. Send me a patch to look at.

Linus et all,

Here we go:

- do not manually set task->state
- instead, in preempt_schedule, set a flag in preempt_count that
denotes that this task is entering schedule off a kernel preemption.
- use this flag in schedule to jump to pick_next_task
- in preempt_schedule, upon return from schedule, unset the flag
- have entry.S just call preempt_schedule and not duplicate this work,
as Linus suggested. I agree. Note this makes debugging easier as
we keep a single point of entry for kernel preemptions.

The result: we can safely preempt non-TASK_RUNNING tasks. If one is
preempted, we can safely survive schedule because we won't handle the
special casing of non-TASK_RUNNING at the top of schedule. Thus other
tasks can run as desired and our non-TASK_RUNNING task will eventually
be rescheduled, in its original state, and complete happily.

This is the behavior we have in the 2.4 patches and 2.5 until
~2.5.6-pre. This works. It requires no other changes elsewhere (it
actually removes some special-casing Ingo did in the signal code).

This patch works in theory and compiles, but received minimal testing.

Robert Love

diff -urN linux-2.5.8-pre1/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
--- linux-2.5.8-pre1/arch/i386/kernel/entry.S Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/entry.S Thu Apr 4 18:32:03 2002
@@ -240,9 +240,7 @@
jnz restore_all
incl TI_PRE_COUNT(%ebx)
sti
- movl TI_TASK(%ebx), %ecx # ti->task
- movl $0,(%ecx) # current->state = TASK_RUNNING
- call SYMBOL_NAME(schedule)
+ call SYMBOL_NAME(preempt_schedule)
jmp ret_from_intr
#endif

diff -urN linux-2.5.8-pre1/arch/i386/kernel/ptrace.c linux/arch/i386/kernel/ptrace.c
--- linux-2.5.8-pre1/arch/i386/kernel/ptrace.c Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/ptrace.c Thu Apr 4 18:31:17 2002
@@ -455,11 +455,9 @@
between a syscall stop and SIGTRAP delivery */
current->exit_code = SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
? 0x80 : 0);
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
/*
* this isn't the same as continuing with a signal, but it will do
* for normal use. strace only continues with a signal if the
diff -urN linux-2.5.8-pre1/arch/i386/kernel/signal.c linux/arch/i386/kernel/signal.c
--- linux-2.5.8-pre1/arch/i386/kernel/signal.c Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/signal.c Thu Apr 4 18:31:00 2002
@@ -610,11 +610,9 @@
if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
/* Let the debugger run. */
current->exit_code = signr;
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();

/* We're back. Did the debugger cancel the sig? */
if (!(signr = current->exit_code))
@@ -669,14 +667,12 @@

case SIGSTOP: {
struct signal_struct *sig;
+ current->state = TASK_STOPPED;
current->exit_code = signr;
sig = current->parent->sig;
- preempt_disable();
- current->state = TASK_STOPPED;
if (sig && !(sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
continue;
}

diff -urN linux-2.5.8-pre1/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.5.8-pre1/include/linux/sched.h Wed Apr 3 20:57:27 2002
+++ linux/include/linux/sched.h Thu Apr 4 18:29:53 2002
@@ -91,6 +91,7 @@
#define TASK_UNINTERRUPTIBLE 2
#define TASK_ZOMBIE 4
#define TASK_STOPPED 8
+#define PREEMPT_ACTIVE 0x4000000

#define __set_task_state(tsk, state_value) \
do { (tsk)->state = (state_value); } while (0)
diff -urN linux-2.5.8-pre1/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.8-pre1/kernel/sched.c Wed Apr 3 20:57:37 2002
+++ linux/kernel/sched.c Thu Apr 4 18:29:24 2002
@@ -764,6 +764,13 @@
prev->sleep_timestamp = jiffies;
spin_lock_irq(&rq->lock);

+ /*
+ * if entering from preempt_schedule, off a kernel preemption,
+ * go straight to picking the next task.
+ */
+ if (unlikely(preempt_get_count() & PREEMPT_ACTIVE))
+ goto pick_next_task;
+
switch (prev->state) {
case TASK_INTERRUPTIBLE:
if (unlikely(signal_pending(prev))) {
@@ -775,7 +782,7 @@
case TASK_RUNNING:
;
}
-#if CONFIG_SMP
+#if CONFIG_SMP || CONFIG_PREEMPT
pick_next_task:
#endif
if (unlikely(!rq->nr_running)) {
@@ -843,8 +850,11 @@
{
if (unlikely(preempt_get_count()))
return;
- current->state = TASK_RUNNING;
+
+ current_thread_info()->preempt_count += PREEMPT_ACTIVE;
schedule();
+ current_thread_info()->preempt_count -= PREEMPT_ACTIVE;
+ barrier();
}
#endif /* CONFIG_PREEMPT */


2002-04-05 01:55:39

by George Anzinger

[permalink] [raw]
Subject: Re: [PATCH] preemptive kernel behavior change: don't be rude

Robert Love wrote:
>
> On Thu, 2002-04-04 at 18:55, Linus Torvalds wrote:
> > Fair enough. Send me a patch to look at.
>
> Linus et all,
>
> Here we go:
>
> - do not manually set task->state
> - instead, in preempt_schedule, set a flag in preempt_count that
> denotes that this task is entering schedule off a kernel preemption.
> - use this flag in schedule to jump to pick_next_task
> - in preempt_schedule, upon return from schedule, unset the flag
> - have entry.S just call preempt_schedule and not duplicate this work,
> as Linus suggested. I agree. Note this makes debugging easier as
> we keep a single point of entry for kernel preemptions.
>
> The result: we can safely preempt non-TASK_RUNNING tasks. If one is
> preempted, we can safely survive schedule because we won't handle the
> special casing of non-TASK_RUNNING at the top of schedule. Thus other
> tasks can run as desired and our non-TASK_RUNNING task will eventually
> be rescheduled, in its original state, and complete happily.
>
> This is the behavior we have in the 2.4 patches and 2.5 until
> ~2.5.6-pre. This works. It requires no other changes elsewhere (it
> actually removes some special-casing Ingo did in the signal code).
>
> This patch works in theory and compiles, but received minimal testing.
>
> Robert Love
>
> diff -urN linux-2.5.8-pre1/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
> --- linux-2.5.8-pre1/arch/i386/kernel/entry.S Wed Apr 3 20:57:24 2002
> +++ linux/arch/i386/kernel/entry.S Thu Apr 4 18:32:03 2002
> @@ -240,9 +240,7 @@
> jnz restore_all
> incl TI_PRE_COUNT(%ebx)
> sti
> - movl TI_TASK(%ebx), %ecx # ti->task
> - movl $0,(%ecx) # current->state = TASK_RUNNING
> - call SYMBOL_NAME(schedule)
> + call SYMBOL_NAME(preempt_schedule)
> jmp ret_from_intr
> #endif
>
> diff -urN linux-2.5.8-pre1/arch/i386/kernel/ptrace.c linux/arch/i386/kernel/ptrace.c
> --- linux-2.5.8-pre1/arch/i386/kernel/ptrace.c Wed Apr 3 20:57:24 2002
> +++ linux/arch/i386/kernel/ptrace.c Thu Apr 4 18:31:17 2002
> @@ -455,11 +455,9 @@
> between a syscall stop and SIGTRAP delivery */
> current->exit_code = SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
> ? 0x80 : 0);
> - preempt_disable();
> current->state = TASK_STOPPED;
> notify_parent(current, SIGCHLD);
> schedule();
> - preempt_enable();
> /*
> * this isn't the same as continuing with a signal, but it will do
> * for normal use. strace only continues with a signal if the
> diff -urN linux-2.5.8-pre1/arch/i386/kernel/signal.c linux/arch/i386/kernel/signal.c
> --- linux-2.5.8-pre1/arch/i386/kernel/signal.c Wed Apr 3 20:57:24 2002
> +++ linux/arch/i386/kernel/signal.c Thu Apr 4 18:31:00 2002
> @@ -610,11 +610,9 @@
> if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
> /* Let the debugger run. */
> current->exit_code = signr;
> - preempt_disable();
> current->state = TASK_STOPPED;
> notify_parent(current, SIGCHLD);
> schedule();
> - preempt_enable();
>
> /* We're back. Did the debugger cancel the sig? */
> if (!(signr = current->exit_code))
> @@ -669,14 +667,12 @@
>
> case SIGSTOP: {
> struct signal_struct *sig;
> + current->state = TASK_STOPPED;
> current->exit_code = signr;
> sig = current->parent->sig;
> - preempt_disable();
> - current->state = TASK_STOPPED;
> if (sig && !(sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
> notify_parent(current, SIGCHLD);
> schedule();
> - preempt_enable();
> continue;
> }
>
> diff -urN linux-2.5.8-pre1/include/linux/sched.h linux/include/linux/sched.h
> --- linux-2.5.8-pre1/include/linux/sched.h Wed Apr 3 20:57:27 2002
> +++ linux/include/linux/sched.h Thu Apr 4 18:29:53 2002
> @@ -91,6 +91,7 @@
> #define TASK_UNINTERRUPTIBLE 2
> #define TASK_ZOMBIE 4
> #define TASK_STOPPED 8
> +#define PREEMPT_ACTIVE 0x4000000
>
> #define __set_task_state(tsk, state_value) \
> do { (tsk)->state = (state_value); } while (0)
> diff -urN linux-2.5.8-pre1/kernel/sched.c linux/kernel/sched.c
> --- linux-2.5.8-pre1/kernel/sched.c Wed Apr 3 20:57:37 2002
> +++ linux/kernel/sched.c Thu Apr 4 18:29:24 2002
> @@ -764,6 +764,13 @@
> prev->sleep_timestamp = jiffies;
> spin_lock_irq(&rq->lock);
>
> + /*
> + * if entering from preempt_schedule, off a kernel preemption,
> + * go straight to picking the next task.
> + */
> + if (unlikely(preempt_get_count() & PREEMPT_ACTIVE))
> + goto pick_next_task;
> +
> switch (prev->state) {
> case TASK_INTERRUPTIBLE:
> if (unlikely(signal_pending(prev))) {
> @@ -775,7 +782,7 @@
> case TASK_RUNNING:
> ;
> }
> -#if CONFIG_SMP
> +#if CONFIG_SMP || CONFIG_PREEMPT
> pick_next_task:
> #endif
> if (unlikely(!rq->nr_running)) {
> @@ -843,8 +850,11 @@
> {
> if (unlikely(preempt_get_count()))
> return;

This line implies that entry.S calls with preempt count of 0. It use to
call with 1 or was that WAY back there?

If this is the way it is and it works, then the += and -= below can be
changed to = and the second reference to PREEMPT_ACTIVE becomes 0.

I think I would rather have entry.S set preempt_count to PREEMPT_ACTIVE
with the interrupt system off, turn it on, make the call directly to
schedule(), and then set to zero on return. I really am concerned with
taking an interrupt during the call, i.e. between the interrupt on and
the store below. This can lead to stack overflow rather easily.

> - current->state = TASK_RUNNING;
> +
> + current_thread_info()->preempt_count += PREEMPT_ACTIVE;
> schedule();
> + current_thread_info()->preempt_count -= PREEMPT_ACTIVE;
> + barrier();
> }
> #endif /* CONFIG_PREEMPT */
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2002-04-05 02:07:31

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] preemptive kernel behavior change: don't be rude

On Thu, 2002-04-04 at 20:51, george anzinger wrote:

> This line implies that entry.S calls with preempt count of 0. It use to
> call with 1 or was that WAY back there?

No you are right, this line needs to be pushed back out to
preempt_enable OR we need to duplicate preempt_schedule in the entry.S
code. For now, I will just push this back into preempt_enable.

> If this is the way it is and it works, then the += and -= below can be
> changed to = and the second reference to PREEMPT_ACTIVE becomes 0.
>
> I think I would rather have entry.S set preempt_count to PREEMPT_ACTIVE
> with the interrupt system off, turn it on, make the call directly to
> schedule(), and then set to zero on return. I really am concerned with
> taking an interrupt during the call, i.e. between the interrupt on and
> the store below. This can lead to stack overflow rather easily.

Agreed. This is probably the best way to do it ... for now I'll do it
like above.

Robert Love

2002-04-05 03:09:31

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] preemptive kernel behavior change: don't be rude

Linus and company,

Attached find a final version of the new behavior patch. It meets
everyone's concerns and it compiles, boots, and survived a basic stress
test here.

The next step would be to push the preempt_count check back into
preempt_schedule and duplicate preempt_schedule in entry.S and then have
it call schedule directly. This would be both a space and time
optimization. I will do this, but I'd like to give this as it is some
testing first.

Linus, if this is how you want it, please apply. Patch is against
2.5.8-pre1.

Robert Love

diff -urN linux-2.5.8-pre1/arch/i386/kernel/entry.S linux/arch/i386/kernel/entry.S
--- linux-2.5.8-pre1/arch/i386/kernel/entry.S Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/entry.S Thu Apr 4 20:44:40 2002
@@ -240,9 +240,7 @@
jnz restore_all
incl TI_PRE_COUNT(%ebx)
sti
- movl TI_TASK(%ebx), %ecx # ti->task
- movl $0,(%ecx) # current->state = TASK_RUNNING
- call SYMBOL_NAME(schedule)
+ call SYMBOL_NAME(preempt_schedule)
jmp ret_from_intr
#endif

diff -urN linux-2.5.8-pre1/arch/i386/kernel/ptrace.c linux/arch/i386/kernel/ptrace.c
--- linux-2.5.8-pre1/arch/i386/kernel/ptrace.c Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/ptrace.c Thu Apr 4 20:44:40 2002
@@ -455,11 +455,9 @@
between a syscall stop and SIGTRAP delivery */
current->exit_code = SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
? 0x80 : 0);
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
/*
* this isn't the same as continuing with a signal, but it will do
* for normal use. strace only continues with a signal if the
diff -urN linux-2.5.8-pre1/arch/i386/kernel/signal.c linux/arch/i386/kernel/signal.c
--- linux-2.5.8-pre1/arch/i386/kernel/signal.c Wed Apr 3 20:57:24 2002
+++ linux/arch/i386/kernel/signal.c Thu Apr 4 20:44:40 2002
@@ -610,11 +610,9 @@
if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
/* Let the debugger run. */
current->exit_code = signr;
- preempt_disable();
current->state = TASK_STOPPED;
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();

/* We're back. Did the debugger cancel the sig? */
if (!(signr = current->exit_code))
@@ -669,14 +667,12 @@

case SIGSTOP: {
struct signal_struct *sig;
+ current->state = TASK_STOPPED;
current->exit_code = signr;
sig = current->parent->sig;
- preempt_disable();
- current->state = TASK_STOPPED;
if (sig && !(sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
notify_parent(current, SIGCHLD);
schedule();
- preempt_enable();
continue;
}

diff -urN linux-2.5.8-pre1/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.5.8-pre1/include/linux/sched.h Wed Apr 3 20:57:27 2002
+++ linux/include/linux/sched.h Thu Apr 4 21:56:11 2002
@@ -91,6 +91,7 @@
#define TASK_UNINTERRUPTIBLE 2
#define TASK_ZOMBIE 4
#define TASK_STOPPED 8
+#define PREEMPT_ACTIVE 0x4000000

#define __set_task_state(tsk, state_value) \
do { (tsk)->state = (state_value); } while (0)
diff -urN linux-2.5.8-pre1/include/linux/spinlock.h linux/include/linux/spinlock.h
--- linux-2.5.8-pre1/include/linux/spinlock.h Wed Apr 3 20:57:34 2002
+++ linux/include/linux/spinlock.h Thu Apr 4 21:56:11 2002
@@ -177,8 +177,9 @@
do { \
--current_thread_info()->preempt_count; \
barrier(); \
- if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
- preempt_schedule(); \
+ if (unlikely(!(current_thread_info()->preempt_count) && \
+ test_thread_flag(TIF_NEED_RESCHED))) \
+ preempt_schedule(); \
} while (0)

#define spin_lock(lock) \
diff -urN linux-2.5.8-pre1/kernel/exit.c linux/kernel/exit.c
--- linux-2.5.8-pre1/kernel/exit.c Wed Apr 3 20:57:37 2002
+++ linux/kernel/exit.c Thu Apr 4 21:55:29 2002
@@ -499,6 +495,7 @@
acct_process(code);
__exit_mm(tsk);

+ preempt_disable();
sem_exit();
__exit_files(tsk);
__exit_fs(tsk);
diff -urN linux-2.5.8-pre1/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.8-pre1/kernel/sched.c Wed Apr 3 20:57:37 2002
+++ linux/kernel/sched.c Thu Apr 4 20:54:17 2002
@@ -764,6 +764,13 @@
prev->sleep_timestamp = jiffies;
spin_lock_irq(&rq->lock);

+ /*
+ * if entering from preempt_schedule, off a kernel preemption,
+ * go straight to picking the next task.
+ */
+ if (unlikely(preempt_get_count() & PREEMPT_ACTIVE))
+ goto pick_next_task;
+
switch (prev->state) {
case TASK_INTERRUPTIBLE:
if (unlikely(signal_pending(prev))) {
@@ -775,9 +782,7 @@
case TASK_RUNNING:
;
}
-#if CONFIG_SMP
pick_next_task:
-#endif
if (unlikely(!rq->nr_running)) {
#if CONFIG_SMP
load_balance(rq, 1);
@@ -841,10 +846,12 @@
*/
asmlinkage void preempt_schedule(void)
{
- if (unlikely(preempt_get_count()))
- return;
- current->state = TASK_RUNNING;
- schedule();
+ do {
+ current_thread_info()->preempt_count += PREEMPT_ACTIVE;
+ schedule();
+ current_thread_info()->preempt_count -= PREEMPT_ACTIVE;
+ barrier();
+ } while(test_thread_flag(TIF_NEED_RESCHED));
}
#endif /* CONFIG_PREEMPT */


2002-04-05 20:02:34

by Roger Larsson

[permalink] [raw]
Subject: Re: [PATCH] preemptive kernel behavior change: don't be rude

Hi Robert,

This does not look symmetrical, add and sub and bitops...

> +#define PREEMPT_ACTIVE 0x4000000


> + /*
> + * if entering from preempt_schedule, off a kernel preemption,
> + * go straight to picking the next task.
> + */
> + if (unlikely(preempt_get_count() & PREEMPT_ACTIVE))
> + goto pick_next_task;
> +


> + do {
> + current_thread_info()->preempt_count += PREEMPT_ACTIVE;
> + schedule();
> + current_thread_info()->preempt_count -= PREEMPT_ACTIVE;
> + barrier();

And since it has to be zero to end up in the routine anyway...

asmlinkage void preempt_schedule(void)
{
BUG_ON(current_thread_info()->preempt_count != 0);

do {
/* Problem: suppose a new interrupt happens before we got to set
* preempt_count... then we will call this routine recursively.
* innermost will select the correct process, but wont it be scheduled
* away in enclosing preempt_schedule() - schedule() will be called
* with PREEMPT_ACTIVE but not TIF_NEED_RESCHED...
* (goto pick_next_task) */

current_thread_info()->preempt_count = PREEMPT_ACTIVE;

/* interrupts here might cause calls to preempt_schedule() but those will
bounce off above since preempt_count != 0 */

schedule();
current_thread_info()->preempt_count = 0;

/* interrupt here will possibly preempt the right process - no problem */

/* need to check if any interrupt happened during the preemption off time,
* this case is in fact unlikely */
while (unlikely(test_thread_flag(TIF_NEED_RESCHED)));
}

/RogerL

--
Roger Larsson
Skellefte?
Sweden