2004-11-08 18:33:37

by Stephen Warren

Subject: SCHED_RR and kernel threads

Hello.

We have an application that is running on kernel 2.6.9. This application
makes use of real-time threads, namely using the SCHED_RR policy.

It appears that during times of high application CPU usage, some
*kernel* threads don't get to run. As an example, this means that local
keyboard presses aren't processed (or are processed very slowly) by the
kernel, so our application never sees them. This has the effect of
hanging the system, since the way to get out of the higher CPU usage
portion of the application is to press the ESC key, and our application
never sees that keypress.

This appears to be due to the fact that the kernel threads are all
SCHED_OTHER, so our SCHED_RR user-space application trumps them!

So, we made a little patch to make the kernel threads SCHED_RR too, so
that they will be guaranteed to get some CPU time, even when user-space
threads suck a lot of CPU (depending on their priority - at present,
most of our app threads are the same priority as the kernel threads,
with just a few being elevated).

The patch is below. Can anyone comment on whether this is a safe and/or
sensible thing to do? Any comments on alternative "right ways" to do
this would be great as well.

Thanks for any help!

Notes: The first part of the patch is where we initialize things to get
SCHED_RR for all tasks/threads in the system. The second is to work
around the fact that some of the init_task data is trashed later, so we
restore it back...

diff -urN linux-kernel.org/include/linux/init_task.h linux-kernel.org-2/include/linux/init_task.h
--- linux-kernel.org/include/linux/init_task.h 2004-08-13 23:36:16.000000000 -0600
+++ linux-kernel.org-2/include/linux/init_task.h 2004-11-08 10:31:51.851216704 -0700
@@ -71,9 +71,10 @@
     .usage = ATOMIC_INIT(2), \
     .flags = 0, \
     .lock_depth = -1, \
-    .prio = MAX_PRIO-20, \
+    .prio = (MAX_RT_PRIO / 2) - 1, \
     .static_prio = MAX_PRIO-20, \
-    .policy = SCHED_NORMAL, \
+    .policy = SCHED_RR, \
+    .rt_priority = MAX_RT_PRIO / 2, \
     .cpus_allowed = CPU_MASK_ALL, \
     .mm = NULL, \
     .active_mm = &init_mm, \
diff -urN linux-kernel.org/init/main.c linux-kernel.org-2/init/main.c
--- linux-kernel.org/init/main.c 2004-08-16 06:58:48.000000000 -0600
+++ linux-kernel.org-2/init/main.c 2004-11-08 10:32:12.001153448 -0700
@@ -660,6 +660,11 @@
      */
     child_reaper = current;

+    /* Reset the prio to allow SCHED_RR tasks */
+    if (current->policy == SCHED_RR) {
+        current->prio = current->rt_priority - 1;
+    }
+
     /* Sets up cpus_possible() */
     smp_prepare_cpus(max_cpus);

--
Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO
[email protected] http://www.nvidia.com/
[email protected] http://www.wwwdotorg.org/pgp.html
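
For readers following the priority arithmetic in the patch, here is a small sketch comparing the two expressions involved. It assumes, from memory of the 2.6-era scheduler and not from anything stated in this thread, that MAX_RT_PRIO is 100 and that the kernel derives the effective prio of a real-time task as MAX_RT_PRIO - 1 - rt_priority. The function names are made up for illustration.

```c
#include <stdio.h>

/* Assumption: value from the 2.6-era include/linux/sched.h, not from the patch. */
#define MAX_RT_PRIO 100

/* Assumed kernel mapping: a higher rt_priority gives a numerically lower prio. */
int kernel_prio(int rt_priority)
{
    return MAX_RT_PRIO - 1 - rt_priority;
}

/* Expression used in the init/main.c hunk of the patch above. */
int patch_prio(int rt_priority)
{
    return rt_priority - 1;
}

/* Print both values side by side for one rt_priority. */
void compare_prio(int rt_priority)
{
    printf("rt_priority=%d kernel_prio=%d patch_prio=%d\n",
           rt_priority, kernel_prio(rt_priority), patch_prio(rt_priority));
}
```

Under those assumptions the two expressions agree only at the midpoint (rt_priority = MAX_RT_PRIO / 2), which is the value the patch uses. If the default priority were ever moved away from 50, the init/main.c hunk would probably need MAX_RT_PRIO - 1 - current->rt_priority instead.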


2004-11-08 20:29:57

by Con Kolivas

Subject: Re: SCHED_RR and kernel threads

Stephen Warren wrote:
> Hello.
>
> We have an application that is running on kernel 2.6.9. This application
> makes use of real-time threads, namely using the SCHED_RR policy.
>
> It appears that during times of high application CPU usage, some
> *kernel* threads don't get to run. As an example, this means that local
> keyboard presses aren't processed (or are processed very slowly) by the
> kernel, so our application never sees them. This has the effect of
> hanging the system, since the way to get out of the higher CPU usage
> portion of the application is to press the ESC key, and our application
> never sees that keypress.
>
> This appears to be due to the fact that the kernel threads are all
> SCHED_OTHER, so our SCHED_RR user-space application trumps them!

Don't run your userspace at SCHED_RR? The kernel threads are
SCHED_NORMAL precisely for the reason that you won't get real time
performance if the kernel threads rear their ugly heads, albeit rarely.

Cheers,
Con



2004-11-08 20:52:32

by Stephen Warren

Subject: RE: SCHED_RR and kernel threads

> From: Con Kolivas [mailto:[email protected]]
> Stephen Warren wrote:
> > It appears that during times of high application CPU usage, some
> > *kernel* threads don't get to run.
> > ...
> > This appears to be due to the fact that the kernel threads are all
> > SCHED_OTHER, so our SCHED_RR user-space application trumps them!
>
> Don't run your userspace at SCHED_RR? The kernel threads are
> SCHED_NORMAL precisely for the reason that you wont get real time
> performance if the kernel threads rear their ugly heads,
> albeit rarely.

We have actually set the kernel threads to priority SCHED_RR 50, and
most user-space threads to SCHED_RR priority 50. Some critical
user-space threads are above priority 50.

Won't this allow the kernel and user space threads to co-operate nicely
all the time?

What is it specifically that will make kernel SCHED_RR threads cause
non-real-time operation? If it's just a bunch of corner cases or odd
conditions, we may be in an environment we can control so that doesn't
happen...

I guess we could have most threads stay at SCHED_NORMAL, and just make
the few critical threads SCHED_RR, but I'm getting a lot of push-back on
this, since it makes our thread API a lot more complex.

--
Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO
[email protected] http://www.nvidia.com/
[email protected] http://www.wwwdotorg.org/pgp.html
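
The alternative mentioned above (most threads SCHED_NORMAL, a few critical ones SCHED_RR) can be sketched with POSIX thread attributes. This is only an illustration: init_rr_attr is a hypothetical helper, not part of the application's thread API, and the priority value is whatever the caller picks.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper: prepare attr so that a thread created with it
 * runs SCHED_RR at rt_priority. Returns 0 on success or an error number.
 */
int init_rr_attr(pthread_attr_t *attr, int rt_priority)
{
    struct sched_param param = { .sched_priority = rt_priority };
    int err;

    err = pthread_attr_init(attr);
    if (err != 0)
        return err;

    /* Without PTHREAD_EXPLICIT_SCHED the new thread inherits the
     * creator's policy and the values set below are ignored. */
    err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    /* Set the policy before the priority so the priority is checked
     * against the SCHED_RR range. */
    if (err == 0)
        err = pthread_attr_setschedpolicy(attr, SCHED_RR);
    if (err == 0)
        err = pthread_attr_setschedparam(attr, &param);

    if (err != 0)
        pthread_attr_destroy(attr);
    return err;
}
```

A critical thread would then be started with pthread_create(&tid, &attr, fn, arg), which normally needs root or equivalent privilege for a real-time policy, while ordinary threads keep passing NULL for the attribute and stay SCHED_NORMAL.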

2004-11-09 01:24:35

by Con Kolivas

Subject: Re: SCHED_RR and kernel threads

Stephen Warren writes:

>> From: Con Kolivas [mailto:[email protected]]
>> Stephen Warren wrote:
>> > It appears that during times of high application CPU usage, some
>> > *kernel* threads don't get to run.
>> > ...
>> > This appears to be due to the fact that the kernel threads are all
>> > SCHED_OTHER, so our SCHED_RR user-space application trumps them!
>>
>> Don't run your userspace at SCHED_RR? The kernel threads are
>> SCHED_NORMAL precisely for the reason that you wont get real time
>> performance if the kernel threads rear their ugly heads,
>> albeit rarely.
>
> We have actually set the kernel threads to priority SCHED_RR 50, and
> most user-space threads to SCHED_RR priority 50. Some critical
> user-space threads are above priority 50.
>
> Won't this allow the kernel and user space threads to co-operate nicely
> all the time?
>
> What is it specifically that will make kernel SCHED_RR threads cause
> non-real-time operation? If it's just a bunch of corner cases or odd
> conditions, we may be in an environment we can control so that doesn't
> happen...
>
> I guess we could have most threads stay at SCHED_NORMAL, and just make
> the few critical threads SCHED_RR, but I'm getting a lot of push-back on
> this, since it makes our thread API a lot more complex.

Your workaround is not suitable for the kernel at large. If you are using
SCHED_RR threads, it is up to your userspace apps to keep them from starving
the rest of the system. SCHED_RR is _not_ designed to use 100% of the cpu
all the time, but to provide minimum latency by preempting everything of
lower priority than itself when scheduled. The kernel threads do not need
that sort of control and, if made real time, can potentially starve critical
userspace threads during heavy system stress.

Cheers,
Con
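
One way a userspace app could avoid monopolising the CPU at SCHED_RR is to do its heavy computation in slices and sleep briefly between them. The sketch below is an assumption about how that might look, not something proposed in this thread; run_throttled and count_unit are hypothetical names, and the slice size and pause length would have to be tuned for the real workload.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Example unit of work: bump a counter passed through arg. */
void count_unit(void *arg)
{
    ++*(long *)arg;
}

/*
 * Hypothetical helper: call work(arg) total_units times, pausing for
 * pause_ns nanoseconds after every units_per_slice calls.
 * Returns the number of units completed.
 */
long run_throttled(void (*work)(void *), void *arg, long total_units,
                   long units_per_slice, long pause_ns)
{
    struct timespec pause = {
        .tv_sec = pause_ns / 1000000000L,
        .tv_nsec = pause_ns % 1000000000L,
    };
    long done = 0;

    if (units_per_slice <= 0)
        return 0;

    while (done < total_units) {
        long end = done + units_per_slice;

        if (end > total_units)
            end = total_units;
        for (; done < end; done++)
            work(arg);

        /* Sleeping, unlike sched_yield(), takes this thread off the
         * run queue, so lower-priority tasks get a chance to run. */
        if (done < total_units)
            nanosleep(&pause, NULL);
    }
    return done;
}
```

The sleep is the important part: sched_yield() from a SCHED_RR thread only gives way to runnable threads of the same priority, so it would not help SCHED_NORMAL kernel threads.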

2004-11-09 02:01:52

by Stephen Warren

Subject: RE: SCHED_RR and kernel threads

> From: Con Kolivas [mailto:[email protected]]
> Stephen Warren writes:
>> I guess we could have most threads stay at SCHED_NORMAL, and just make
>> the few critical threads SCHED_RR, but I'm getting a lot of push-back on
>> this, since it makes our thread API a lot more complex.
>
> Your workaround is not suitable for the kernel at large.

You mean the official kernel.org kernel? I wasn't implying that the
patch should be part of that!

In our system we have literally EVERY single thread (kernel, user-space
daemons, and user-space applications) all set up as SCHED_RR with
identical priority at present, except a couple of higher-priority threads.
We did this initially for user-space by replacing /sbin/init with a
wrapper that set the scheduler policy and default priority, and verified
that this was inherited by all daemons & application threads. Then, we
found that the kernel threads could get starved in some situations,
hence the kernel change.

Our threading model dictates that every thread have a priority (so that
the thread model is portable between Linux, embedded RTOSs etc.), and in
Linux AFAIK, the only way to implement priorities is to use a real-time
scheduling policy. Some threads do a lot of calculation. We want to make
them equal (or probably, lower) priority to the kernel threads, so
therefore the kernel threads must then be SCHED_RR.

Can you elaborate on specific conditions that would cause the kernel
threads to suck up unusual amounts of CPU time?

In our application, keyboard processing is a real-time requirement, so
if that is performed in a kernel thread, that kernel thread should be
real-time. We basically want the control to insert e.g. the keyboard
processing kernel thread into the middle of our priority hierarchy,
rather than having it forced as the lowest possible priority.

I get the impression you're implying that scheduling doesn't work
correctly in this situation - that if kernel threads are set to
SCHED_RR, they can still lock out user-space threads of the same or
higher priority? Is this what you're saying, or do you mean that the
kernel threads can lock out user-space threads of *lower* priority,
which is to be expected? In all the RTOSes I've seen, all threads are
SCHED_RR, thus mimicking the situation we've created by patching our
kernel...

--
Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO
[email protected] http://www.nvidia.com/
[email protected] http://www.wwwdotorg.org/pgp.html

2004-11-09 02:22:27

by Con Kolivas

Subject: Re: SCHED_RR and kernel threads

Stephen Warren writes:

>> From: Con Kolivas [mailto:[email protected]]
>> Stephen Warren writes:
>>> I guess we could have most threads stay at SCHED_NORMAL, and just make
>>> the few critical threads SCHED_RR, but I'm getting a lot of push-back on
>>> this, since it makes our thread API a lot more complex.
>>
>> Your workaround is not suitable for the kernel at large.
>
> You mean the official kernel.org kernel? I wasn't implying that the
> patch should be part of that!
>
> In our system we have literally EVERY single thread (kernel, user-space
> daemons, and user-space applications) all setup as SCHED_RR with
> identical priority at present, except a couple higher priority threads.
> We did this initially for user-space by replacing /sbin/init with a
> wrapper that set the scheduler policy and default priority, and verified
> that this was inherited by all daemons & application threads. Then, we
> found that the kernel threads could get starved in some situations,
> hence the kernel change.
>
> Our threading model dictates that every thread have a priority (so that
> the thread model is portable between Linux, embedded RTOSs etc.), and in
> Linux AFAIK, the only way to implement priorities is to use a real-time
> scheduling policy. Some threads do a lot of calculation. We want to make
> them equal (or probably, lower) priority to the kernel threads, so
> therefore the kernel threads must then be SCHED_RR.
>
> Can you elaborate on specific conditions that would cause the kernel
> threads to suck up unusual amounts of CPU time?
>
> In our application, keyboard processing is a real-time requirement, so
> if that is performed in a kernel thread, that kernel thread should be
> real-time. We basically want the control to insert e.g. the keyboard
> processing kernel thread into the middle of our priority hierarchy,
> rather than having it forced as the lowest possible priority.
>
> I get the impression you're implying that scheduling doesn't work
> correctly in this situation - that if kernel threads are set to
> SCHED_RR, they can still lock out user-space threads of the same or
> higher priority? Is this what you're saying, or do you mean that the
> kernel threads can lock out user-space threads of *lower* priority,
> which is to be expected. In all the RTOS's I've seen, all threads are
> SCHED_RR, thus mimicking the situation we've creating by patching our
> kernel...

If everything is the same priority then you've created a simple round robin
scheduler out of the kernel and that's fine for your setting. If you're
looking for another alternative to this, check out the email I posted in the
last week for implementing a sched bound policy. We will be looking at
implementing that in the near future.

Cheers,
Con

2004-11-10 20:24:24

by Bill Davidsen

Subject: Re: SCHED_RR and kernel threads

Stephen Warren wrote:
>>From: Con Kolivas [mailto:[email protected]]
>>Stephen Warren writes:
>>
>>>I guess we could have most threads stay at SCHED_NORMAL, and just make
>>>the few critical threads SCHED_RR, but I'm getting a lot of push-back on
>>>this, since it makes our thread API a lot more complex.
>>
>>Your workaround is not suitable for the kernel at large.
>
> You mean the official kernel.org kernel? I wasn't implying that the
> patch should be part of that!
>
> In our system we have literally EVERY single thread (kernel, user-space
> daemons, and user-space applications) all setup as SCHED_RR with
> identical priority at present, except a couple higher priority threads.
> We did this initially for user-space by replacing /sbin/init with a
> wrapper that set the scheduler policy and default priority, and verified
> that this was inherited by all daemons & application threads. Then, we
> found that the kernel threads could get starved in some situations,
> hence the kernel change.
>
> Our threading model dictates that every thread have a priority (so that
> the thread model is portable between Linux, embedded RTOSs etc.), and in
> Linux AFAIK, the only way to implement priorities is to use a real-time
> scheduling policy. Some threads do a lot of calculation. We want to make
> them equal (or probably, lower) priority to the kernel threads, so
> therefore the kernel threads must then be SCHED_RR.
>
> Can you elaborate on specific conditions that would cause the kernel
> threads to suck up unusual amounts of CPU time?
>
> In our application, keyboard processing is a real-time requirement, so
> if that is performed in a kernel thread, that kernel thread should be
> real-time. We basically want the control to insert e.g. the keyboard
> processing kernel thread into the middle of our priority hierarchy,
> rather than having it forced as the lowest possible priority.

Perhaps someone could comment on why the keyboard thread is NOT higher
priority? The whole functionality of SysRq key combinations would seem
to depend on actually seeing the strokes. I would cautiously suggest
that a priority control in /proc/sys might be a useful interface,
certainly compared to patching the kernel and rebuilding.

Yes, I mean an option in the mainline kernel, so when debugging hangs
the keyboard could be used.

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2004-11-10 20:41:44

by Con Kolivas

Subject: Re: SCHED_RR and kernel threads

Bill Davidsen wrote:
> Stephen Warren wrote:
>
>>> From: Con Kolivas [mailto:[email protected]]
>>> Stephen Warren writes:
>>>
>>>> I guess we could have most threads stay at SCHED_NORMAL, and just make
>>>> the few critical threads SCHED_RR, but I'm getting a lot of push-back on
>>>> this, since it makes our thread API a lot more complex.
>>>
>>> Your workaround is not suitable for the kernel at large.
>>
>> You mean the official kernel.org kernel? I wasn't implying that the
>> patch should be part of that!
>>
>> In our system we have literally EVERY single thread (kernel, user-space
>> daemons, and user-space applications) all setup as SCHED_RR with
>> identical priority at present, except a couple higher priority threads.
>> We did this initially for user-space by replacing /sbin/init with a
>> wrapper that set the scheduler policy and default priority, and verified
>> that this was inherited by all daemons & application threads. Then, we
>> found that the kernel threads could get starved in some situations,
>> hence the kernel change.
>>
>> Our threading model dictates that every thread have a priority (so that
>> the thread model is portable between Linux, embedded RTOSs etc.), and in
>> Linux AFAIK, the only way to implement priorities is to use a real-time
>> scheduling policy. Some threads do a lot of calculation. We want to make
>> them equal (or probably, lower) priority to the kernel threads, so
>> therefore the kernel threads must then be SCHED_RR.
>>
>> Can you elaborate on specific conditions that would cause the kernel
>> threads to suck up unusual amounts of CPU time?
>>
>> In our application, keyboard processing is a real-time requirement, so
>> if that is performed in a kernel thread, that kernel thread should be
>> real-time. We basically want the control to insert e.g. the keyboard
>> processing kernel thread into the middle of our priority hierarchy,
>> rather than having it forced as the lowest possible priority.
>
>
> Perhaps someone could comment on why the keyboard thread is NOT higher
> priority? The whole functionality of SysReq key combinations would seem
> to depend on actually seeing the strokes. I would cautiously suggest
> that a priority control in /proc/sys might be a useful interface,
> certainly compared to patching the kernel and rebuilding.
>
> Yes, I mean an option in the mainline kernel, so when debugging hangs
> the keyboard could be used.
>

There is nothing stopping you from setting the priority and the
scheduling policy from userspace in mainline.

Cheers,
Con
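
As a rough illustration of doing this from userspace, the sketch below wraps sched_setscheduler(), which accepts any PID, including that of a kernel thread found via ps. The helper name set_rt_policy is hypothetical, and whether keyboard handling on a given kernel actually runs in a kernel thread that can be retuned this way is not something this thread establishes.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: set the scheduling policy and real-time priority
 * of the task with the given pid (0 means the caller).
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_rt_policy(pid_t pid, int policy, int rt_priority)
{
    struct sched_param param = { .sched_priority = rt_priority };
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;              /* unknown policy, errno already set */

    /* Reject out-of-range priorities before asking the kernel. */
    if (rt_priority < lo || rt_priority > hi) {
        errno = EINVAL;
        return -1;
    }

    /* Needs root (or equivalent privilege) for real-time policies. */
    return sched_setscheduler(pid, policy, &param);
}
```

A call such as set_rt_policy(pid, SCHED_RR, 50) run as root would move an existing task into the middle of the real-time range without rebuilding the kernel. The chrt utility offers the same operation from the command line.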

