2003-06-19 01:30:49

by Perez-Gonzalez, Inaky

Subject: RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks

> From: Andrew Morton
>
> Various things like character drivers do rely upon keventd services. So it
> is possible that bash is stuck waiting on keyboard input, but there is no
> keyboard input because keventd is locked out.
>
> I'll take a closer look at this, see if there is a specific case which can
> be fixed.
>
> Arguably, keventd should be running max-prio RT because it is a kernel
> service, providing "process context interrupt service".

Now that we are at that, it might be wise to add a higher-than-anything
priority that the kernel code can use (what would be 100 for user space,
but off-limits), so even FIFO 99 code in user space cannot block out
the migration thread, keventd and friends.

> IIRC, Andrea's kernel runs keventd as SCHED_FIFO. I've tried to avoid
> making this change for ideological reasons ;) Userspace is more important
> than the kernel and the kernel has no damn right to be saying "oh my stuff
> is so important that it should run before latency-critical user code".

I agree with that, but the consequence is kind of ugly; not that a true
real-time embedded process is going to be printing to the console, but
it might be outputting to a serial line, so now they rely on the keventd.

BTW, I have seen similar problems wrt the migration thread, where a
FIFO 20 process would get stuck in CPU1, that is taken by a FIFO 40
while CPU0 was running a FIFO 10 -- however, I am not that positive
that it is a migration thread problem; I blame it more on the scheduler
not taking into account priorities for firing the load balancer. It is
a tricky thingie, though. Affinity helps, in this case.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)
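
To make the FIFO 20 / FIFO 40 / FIFO 10 scenario above concrete, here is a
minimal user-space sketch (not kernel code) of the kind of check a
priority-aware balancer would need. struct cpu_state and find_pull_target()
are hypothetical names invented for this illustration, and priorities follow
the user-space convention where a larger number wins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot, for illustration only. */
struct cpu_state {
    int running_prio;  /* RT priority of the task currently on the CPU */
    int waiting_prio;  /* best RT priority queued behind it, -1 if none */
};

/*
 * Return the index of a CPU that could run the task waiting on `src`
 * immediately (its running task has a lower priority), preferring the
 * CPU whose running task has the lowest priority. Return -1 if the
 * waiting task would not preempt anything elsewhere.
 */
static int find_pull_target(const struct cpu_state *cpus, size_t ncpus,
                            size_t src)
{
    assert(src < ncpus);

    int waiting = cpus[src].waiting_prio;
    if (waiting < 0)
        return -1;  /* nothing is stuck on this CPU */

    int best = -1;
    for (size_t i = 0; i < ncpus; i++) {
        if (i == src)
            continue;
        if (cpus[i].running_prio < waiting &&
            (best < 0 || cpus[i].running_prio < cpus[best].running_prio))
            best = (int)i;
    }
    return best;
}
```

With CPU0 running FIFO 10 and CPU1 running FIFO 40 with a FIFO 20 task
queued, find_pull_target() picks CPU0 for the queued task. A balancer that
only compares runqueue lengths has no reason to make that move.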


2003-06-19 01:44:45

by Robert Love

Subject: RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks

On Wed, 2003-06-18 at 18:44, Perez-Gonzalez, Inaky wrote:

> Now that we are at that, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.

I did this about a year ago, and it is merged into the kernel.

See MAX_USER_RT_PRIO and MAX_RT_PRIO in <linux/sched.h>.

We just need to change MAX_RT_PRIO to, say, (MAX_USER_RT_PRIO + 10).

The one kicker is that if we end up changing BITMAP_SIZE, the default
sched_find_first_bit() will break and we will need to implement a new
one. I did a generic one, as well as code to detect at compile-time
which to use, but the optimized one is a lot nicer. On 32-bit machines,
BITMAP_SIZE ends up being 160 bits (5 unsigned longs, i.e.
5*sizeof(unsigned long) bytes), so there are about 20 extra priority
levels one can add "for free."

Robert Love
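
The arithmetic behind the "for free" levels can be sketched as below. It
assumes the bitmap needs one bit per priority level plus one delimiter bit,
rounded up to whole words, and that 40 non-RT (nice) levels sit on top of
MAX_RT_PRIO. Both are assumptions about the scheduler of that era, not
something stated in this thread, and bitmap_words() and spare_levels() are
illustrative helpers, not kernel functions.

```c
#include <assert.h>

#define NICE_LEVELS 40  /* assumed number of non-RT priority levels */

/* Words needed for max_prio bits plus one delimiter bit. */
static unsigned bitmap_words(unsigned max_prio, unsigned word_bytes)
{
    assert(word_bytes > 0);

    unsigned bytes = (max_prio + 1 + 7) / 8;       /* round up to bytes */
    return (bytes + word_bytes - 1) / word_bytes;  /* round up to words */
}

/* Unused bits left in the bitmap for a given MAX_RT_PRIO. */
static unsigned spare_levels(unsigned max_rt_prio, unsigned word_bytes)
{
    unsigned max_prio = max_rt_prio + NICE_LEVELS;
    unsigned bits = bitmap_words(max_prio, word_bytes) * word_bytes * 8;

    return bits - (max_prio + 1);
}
```

Under these assumptions, spare_levels(100, 4) is 19, which matches the
"about 20" figure, and bitmap_words(150, 4) is still 5, so raising
MAX_RT_PRIO to MAX_USER_RT_PRIO + 10 would not grow the bitmap on a 32-bit
machine.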

2003-06-19 01:49:36

by George Anzinger

Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks

Perez-Gonzalez, Inaky wrote:
>>From: Andrew Morton
>>
>>Various things like character drivers do rely upon keventd services. So it
>>is possible that bash is stuck waiting on keyboard input, but there is no
>>keyboard input because keventd is locked out.
>>
>>I'll take a closer look at this, see if there is a specific case which can
>>be fixed.
>>
>>Arguably, keventd should be running max-prio RT because it is a kernel
>>service, providing "process context interrupt service".
>
>
> Now that we are at that, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.

Wait a bit (or even a byte) here. I think the proper thing to do, IF
we want to go down this road, is to separate out the various
subsystems and give them each their own kernel task or workqueue.
Then those who need to could adjust, for example, network code to run
after real time process control and prior to print jobs, priority
wise, that is. Likewise, you could adjust the console access to be
higher priority than the network so that we can always talk to the
system. If you give any kernel thread an untouchable priority, you
might just as well move the work back to a bottom half or even the
interrupt code.

-g
>
>
>>IIRC, Andrea's kernel runs keventd as SCHED_FIFO. I've tried to avoid
>>making this change for ideological reasons ;) Userspace is more important
>>than the kernel and the kernel has no damn right to be saying "oh my stuff
>>is so important that it should run before latency-critical user code".
>
>
> I agree with that, but the consequence is kind of ugly; not that a true
> real-time embedded process is going to be printing to the console, but
> it might be outputting to a serial line, so now they rely on the keventd.
>
> BTW, I have seen similar problems wrt the migration thread, where a
> FIFO 20 process would get stuck in CPU1, that is taken by a FIFO 40
> while CPU0 was running a FIFO 10 -- however, I am not that positive
> that it is a migration thread problem; I blame it more on the scheduler
> not taking into account priorities for firing the load balancer. It is
> a tricky thingie, though. Affinity helps, in this case.
>
> Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
> (and my fault)
>
>

--
George Anzinger
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-06-19 04:21:39

by Joe Korty

Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks

On Wed, Jun 18, 2003 at 06:44:42PM -0700, Perez-Gonzalez, Inaky wrote:
>
> Now that we are at that, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.


I would prefer users have the ability to put one or two truly critical RT
tasks above keventd & family. Such tasks would have to follow certain rules
.. run & sleep quick .. limited or no device IO .. most communication to
other tasks through shared memory .. possibly others.

There are those willing to follow whatever rules necessary & split up their
application into tasks any which way in order to get high responsiveness to a
critical but small part of their application. If you follow the rules, you
should be allowed to put a carefully crafted task above the system daemons
(with the possible exception of the migration daemon).

Joe
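
For reference, a user-space task of the kind described above would
typically request its priority through the POSIX scheduling calls. The
following is a minimal sketch, assuming a Linux/POSIX system;
request_fifo() is a hypothetical helper name, and the actual call normally
needs root or an equivalent privilege to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: ask for SCHED_FIFO at `prio` for the calling
 * process. Returns 0 on success, EINVAL if `prio` is outside the range
 * the system reports for SCHED_FIFO, or the errno left by the failing
 * call (typically EPERM without sufficient privilege).
 */
static int request_fifo(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return errno;
    assert(lo <= hi);

    if (prio < lo || prio > hi)
        return EINVAL;  /* reject before touching the scheduler */

    struct sched_param sp = { .sched_priority = prio };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        return errno;

    return 0;
}
```

Whether such a task ends up above or below keventd and the other system
daemons then depends only on the number it passes in relative to theirs,
which is the flexibility being argued for here.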