2002-01-14 02:40:15

by Jeff Dike

Subject: The O(1) scheduler breaks UML

The new scheduler holds IRQs off across the call to context_switch. UML's
_switch_to expects them to be enabled when it is called, and things go
badly wrong when they are not.

Because UML has a host process for each UML thread, SIGIO needs to be
forwarded from one process to the next during a context switch. A SIGIO
arriving during the window between the disabling of IRQs and forwarding of
IRQs to the next process will be trapped on the process going out of
context. This happens fairly regularly and causes hangs because some process
is waiting for disk IO which never arrives because the process that was notified
of the completion is switched out.

So, is it possible to enable IRQs across the call to _switch_to?

Jeff


2002-01-14 02:50:05

by Davide Libenzi

Subject: Re: The O(1) scheduler breaks UML

On Sun, 13 Jan 2002, Jeff Dike wrote:

> The new scheduler holds IRQs off across the call to context_switch. UML's
> _switch_to expects them to be enabled when it is called, and things go
> badly wrong when they are not.
>
> Because UML has a host process for each UML thread, SIGIO needs to be
> forwarded from one process to the next during a context switch. A SIGIO
> arriving during the window between the disabling of IRQs and forwarding of
> IRQs to the next process will be trapped on the process going out of
> context. This happens fairly regularly and causes hangs because some process
> is waiting for disk IO which never arrives because the process that was notified
> of the completion is switched out.
>
> So, is it possible to enable IRQs across the call to _switch_to?

Yes, this should work :


	if (likely(prev != next)) {
		rq->nr_switches++;
		rq->curr = next;
		next->cpu = prev->cpu;
		spin_unlock_irq(&rq->lock);
		context_switch(prev, next);
	} else
		spin_unlock_irq(&rq->lock);

and there's no need for barrier() and rq reload in this way.




- Davide


2002-01-14 04:48:23

by Jeff Dike

Subject: Re: The O(1) scheduler breaks UML

Davide Libenzi said:
> Yes, this should work :
> 	if (likely(prev != next)) {
> 		rq->nr_switches++;
> 		rq->curr = next;
> 		next->cpu = prev->cpu;
> 		spin_unlock_irq(&rq->lock);
> 		context_switch(prev, next);
> 	} else
> 		spin_unlock_irq(&rq->lock);
> and there's no need for barrier() and rq reload in this way.

Yup, UML works much better with that.

Jeff

2002-01-14 07:43:30

by Ingo Molnar

Subject: Re: The O(1) scheduler breaks UML


On Sun, 13 Jan 2002, Jeff Dike wrote:

> The new scheduler holds IRQs off across the call to context_switch.
> UML's _switch_to expects them to be enabled when it is called, and
> things go badly wrong when they are not.

unfortunately this cannot be done, due to exit(), ptrace() and other SMP
races. On SMP, the 'previous' task is protected by the runqueue lock. If
we do the context switch outside the runqueue lock then a task might be
freed on another CPU while it's in fact still in use.

there are other heavy implications as well:

- current->processor is no longer valid from IRQ handlers.

- a CPU might execute the 'previous' task before we have switched away
from it. (nothing but the runqueue lock keeps the load balancer from
taking the task from the runqueue.)

in 2.4 i've implemented irq-enabled context switches, and it was a major
PITA. To do it correctly one has to reintroduce __schedule_tail() and
do a task_lock/task_unlock to get context-switch atomicity via other means
than the local runqueue lock. On 2.4 i did this because global runqueue
contention was such an issue for certain workloads that even the
task-unlocking overhead was worth it. With the O(1) scheduler this is
pretty much out of the question.

we could enable interrupts on UP - because UP is special, disabling
interrupts there is in essence a cheap 'global interrupt lock'. But that
doesn't help the SMP/UML situation much.

i'd suggest to find some other solution for UML, besides signals.
__switch_to is a very internal function that can very well be called with
spinlocks disabled, we just cannot guarantee that it will be called with
irqs enabled. Signals are something that is often 'heavy', it cannot be
done atomically in the generic case.

Ingo

2002-01-14 07:58:25

by Ingo Molnar

Subject: Re: The O(1) scheduler breaks UML


On Sun, 13 Jan 2002, Davide Libenzi wrote:

> > So, is it possible to enable IRQs across the call to _switch_to?
>
> Yes, this should work :
>
> 	if (likely(prev != next)) {
> 		rq->nr_switches++;
> 		rq->curr = next;
> 		next->cpu = prev->cpu;
> 		spin_unlock_irq(&rq->lock);
> 		context_switch(prev, next);
> 	} else
> 		spin_unlock_irq(&rq->lock);

this change is incredibly broken on SMP - eg. what protects 'prev' from
being executed on another CPU prematurely? It's even broken on UP:
interrupt context that changes current->need_resched needs to be aware of
nonatomic context switches. See my previous mail.

> and there's no need for barrier() and rq reload in this way.

we can remove the barrier(), but for a different reason: the asm volatile
definition of the switch_to macro is a compilation barrier in itself
already. I've removed the barrier() from my tree, the change will be in
the -H8 patch. The rq = this_rq() reload is still necessary.

Ingo

2002-01-14 15:32:55

by Davide Libenzi

Subject: Re: The O(1) scheduler breaks UML

On Mon, 14 Jan 2002, Ingo Molnar wrote:

>
> On Sun, 13 Jan 2002, Davide Libenzi wrote:
>
> > > So, is it possible to enable IRQs across the call to _switch_to?
> >
> > Yes, this should work :
> >
> > 	if (likely(prev != next)) {
> > 		rq->nr_switches++;
> > 		rq->curr = next;
> > 		next->cpu = prev->cpu;
> > 		spin_unlock_irq(&rq->lock);
> > 		context_switch(prev, next);
> > 	} else
> > 		spin_unlock_irq(&rq->lock);
>
> this change is incredibly broken on SMP - eg. what protects 'prev' from
> being executed on another CPU prematurely. It's even broken on UP:
> interrupt context that changes current->need_resched needs to be aware of
> nonatomic context switches. See my previous mail.

yup, true. no more schedule_tail()



- Davide


2002-01-14 19:15:24

by Jeff Dike

Subject: Re: The O(1) scheduler breaks UML

Ingo Molnar said:
> i'd suggest to find some other solution for UML, besides signals.

You suggest implementing interrupts with something other than signals? What
else is there?

In any case, I stuck a little kludge in _switch_to which checks for pending
SIGIO and, if there is one, hits the incoming process with a SIGIO. This
seems to do the trick.

Jeff
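
For readers following along, the workaround described in the last message (check for a pending SIGIO at switch time and, if there is one, signal the incoming process) might look roughly like the userspace sketch below. This is not UML's actual code. The helper name forward_pending_sigio() and its return convention are hypothetical, and the sketch assumes that SIGIO is blocked in the outgoing host process while UML has IRQs off, so that a SIGIO arriving in the window shows up in sigpending().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the UML sources.
 *
 * If a SIGIO is pending (i.e. arrived while blocked) on the calling
 * process, send a fresh SIGIO to next_pid so the incoming process
 * sees the notification.
 *
 * Returns 1 if a SIGIO was forwarded, 0 if none was pending,
 * and -1 if sigpending() or kill() failed.
 */
static int forward_pending_sigio(pid_t next_pid)
{
	sigset_t pending;

	if (sigpending(&pending) < 0)
		return -1;

	/* Nothing arrived during the window: nothing to forward. */
	if (sigismember(&pending, SIGIO) != 1)
		return 0;

	/* Hit the incoming process with a SIGIO of its own. */
	if (kill(next_pid, SIGIO) < 0)
		return -1;

	return 1;
}
```

In this sketch the outgoing process would call forward_pending_sigio() with the host pid of the incoming thread just before switching to it. The original SIGIO stays pending on the outgoing process, so that process will still see it when it is next scheduled and unblocks the signal.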