2006-11-18 13:13:15

by Christian

Subject: Sluggish system responsiveness on I/O

Hello lkml!

I'm currently testing 2.6.19-rc5-mm1. Everything works really well except for
one wart: multimedia interactivity is bad while a kernel compile runs in the
background. So I tried to narrow it down as much as possible.

I ran several finds, dds and cats in parallel while watching four instances of
glxgears and playing a little enemy-territory. The interactivity was very
good, in fact no loss of interactivity at all. This was contrary to what I
believed the whole time. The loss of interactivity has nothing to do with
heavy I/O. In fact it happens only when I run a task which is I/O and CPU
heavy at the same time. That means a single kernel compile (with -j1) is able
to harm interactivity with glxgears and enemy-territory, but fully loading my
three disks does no harm at all.

So I tried to nice the make and see what happens:

nice 5 make -j4: Seems to make no difference. Heavy stuttering in glxgears and
et
nice 10 make -j4: Somewhat better but still unusable with et

everything above nice 15 is usable. nice 19 has full interactivity, that means
you can't make out a difference between no load and kernel compile while
playing enemy-territory.

I suspect that it has something to do with the priority boost for I/O hogs.
But if this is a "general" scheduler problem, then why aren't more people
complaining about this?

-Christian


2006-11-18 13:24:50

by Prakash Punnoor

Subject: Re: Sluggish system responsiveness on I/O

On Saturday 18 November 2006 14:12, Christian wrote:
> So I tried to nice the make and see what happens:
>
> nice 5 make -j4: Seems to make no difference. Heavy stuttering in glxgears
> and et
> nice 10 make -j4: Somewhat better but still unusable with et
>
> everything above nice 15 is usable. nice 19 has full interactivity, that
> means you can't make out a difference between no load and kernel compile
> while playing enemy-territory.
>
> I suspect that it has something to do with the priority boost for I/O hogs.
> But if this is a "general" scheduler problem, then why aren't more people
> complaining about this?

I complained about this a year ago, but not much has changed. :-( It gets
especially bad if you copy GB-sized files (the writes are the troublemakers,
less so the reads), no matter which I/O scheduler I use, though using deadline
seems to lessen the impact a little. And I don't find it acceptable to have
to play around with nice to get a responsive desktop, especially when one is
using a GUI.

Cheers,
--
(?= =?)
//\ Prakash Punnoor /\\
V_/ \_V



2006-11-18 14:41:15

by Christian

Subject: Re: Sluggish system responsiveness on I/O

On Saturday, 18 November 2006 14:25, Prakash Punnoor wrote:
> On Saturday 18 November 2006 14:12, Christian wrote:
> > So I tried to nice the make and see what happens:
> >
> > nice 5 make -j4: Seems to make no difference. Heavy stuttering in
> > glxgears and et
> > nice 10 make -j4: Somewhat better but still unusable with et
> >
> > everything above nice 15 is usable. nice 19 has full interactivity, that
> > means you can't make out a difference between no load and kernel compile
> > while playing enemy-territory.
> >
> > I suspect that it has something to do with the priority boost for I/O
> > hogs. But if this is a "general" scheduler problem, then why aren't more
> > people complaining about this?
>
> I complained about this a year ago, but not much has changed. :-( It gets
> especially bad if you copy GB-sized files (the writes are the troublemakers,
> less so the reads), no matter which I/O scheduler I use, though using
> deadline seems to lessen the impact a little. And I don't find it
> acceptable to have to play around with nice to get a responsive desktop,
> especially when one is using a GUI.
>
> Cheers,

Ah yes, you put me on the right track! So we can say that we are actually
talking about two different classes of problems here.

The first class is process scheduler related. An I/O intensive process which
is CPU intensive at the same time gets such a high priority boost that it
harms multimedia interactivity. This leads to short interruptions
("stuttering") in multimedia apps, e.g. glxgears.

The second problem is (CFQ) I/O scheduler related. Multiple readers get a
fairly nice sharing of I/O bandwidth but as soon as you introduce a single
writer, this writer harms the readers very much.

The first problem can be mitigated by using nice; that is what nice is for.
You can also use another scheduling class like SCHED_BATCH.
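
As a rough illustration of that mitigation, a process can demote itself before
starting a CPU-heavy job. This is only a sketch: demote_self() is a
hypothetical helper name, and the fallback definition of SCHED_BATCH assumes
the value 3 from include/linux/sched.h in case the libc headers don't expose
it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3   /* assumed: value from include/linux/sched.h */
#endif

/*
 * Hypothetical helper: lower the calling process's priority before it
 * starts a CPU-heavy job such as a kernel compile.
 * Returns 0 on success, -1 with errno set on failure.
 */
int demote_self(int nice_level, int use_batch)
{
    /* Raising one's own nice value needs no privileges. */
    if (setpriority(PRIO_PROCESS, 0, nice_level) == -1)
        return -1;

    if (use_batch) {
        /* Non-realtime policies require sched_priority == 0. */
        struct sched_param sp = { .sched_priority = 0 };

        if (sched_setscheduler(0, SCHED_BATCH, &sp) == -1)
            return -1;
    }
    return 0;
}
```

A wrapper would call demote_self(19, 1) and then exec the build command, since
both the nice value and the policy are kept across exec and inherited by child
processes.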

The second problem is much more pressing for the average desktop user.
While multiple readers are running, you click on the kmenu and it loads more
slowly than normal. This is what you expect when sharing bandwidth with
processes at the same I/O nice level: the I/O bandwidth is shared equally. If
you want a fast desktop, renice the streaming readers with ionice. Distros
could also ionice desktop processes like kicker to a low nice level. The real
problem for desktop "interactivity" is when you are running streaming writers
and then trigger short reads, e.g. with the kmenu. The read request can then
be starved for about a minute(!!) or even more.
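
What ionice does can be sketched with the ioprio_set system call, which has
no libc wrapper. The constants below are local copies of the kernel's I/O
priority encoding under made-up names (IOCLASS_*, IOWHO_PROCESS), and
ioprio_value() and set_io_priority() are hypothetical helpers, so treat this
as an illustration rather than the ionice source. At the time of writing the
priority is only honoured by the CFQ I/O scheduler.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local copies of the kernel's ioprio encoding (names are illustrative). */
#define IOCLASS_SHIFT 13
enum { IOCLASS_NONE, IOCLASS_RT, IOCLASS_BE, IOCLASS_IDLE };
enum { IOWHO_PROCESS = 1 };

/* Pack a scheduling class and a level (0 = highest, 7 = lowest). */
int ioprio_value(int ioclass, int level)
{
    return (ioclass << IOCLASS_SHIFT) | level;
}

/* Set the I/O priority of one process; pid 0 means the caller. */
int set_io_priority(pid_t pid, int ioclass, int level)
{
    return (int)syscall(SYS_ioprio_set, IOWHO_PROCESS, pid,
                        ioprio_value(ioclass, level));
}
```

A streaming reader could be pushed to the back with
set_io_priority(pid, IOCLASS_BE, 7), or run only when the disk is otherwise
idle with set_io_priority(pid, IOCLASS_IDLE, 0).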

Some use-cases:
glxgears with heavy read I/O: no problems
glxgears with heavy write I/O: no problems

glxgears with a "read,compute" load: stuttering due to priority boost

kmenu with several readers: slightly slower, equally shared bandwidth. Ability
to use ionice
kmenu with several writers: unusable

So I think the major problem here is the starvation of short reads while
running multiple streaming writers. That definitely needs to be addressed.
This would be the last real problem that I see with a fully maxed out Linux
machine. Linux now has one of the best process and I/O schedulers I have ever
seen. Thanks to the great work of Jens Axboe and all the other nice people.
If this last wart were fixed, then I would consider Linux ready for total
World domination ;-)

-Christian

2006-11-19 07:49:53

by Mike Galbraith

Subject: Re: Sluggish system responsiveness on I/O

On Sat, 2006-11-18 at 14:12 +0100, Christian wrote:
> Hello lkml!
>
> I'm currently testing 2.6.19-rc5-mm1. Everything works really well except for
> one wart: multimedia interactivity is bad while a kernel compile runs in the
> background. So I tried to narrow it down as much as possible.
>
> I ran several finds, dds and cats in parallel while watching four instances of
> glxgears and playing a little enemy-territory. The interactivity was very
> good, in fact no loss of interactivity at all. This was contrary to what I
> believed the whole time. The loss of interactivity has nothing to do with
> heavy I/O. In fact it happens only when I run a task which is I/O and CPU
> heavy at the same time. That means a single kernel compile (with -j1) is able
> to harm interactivity with glxgears and enemy-territory, but fully loading my
> three disks does no harm at all.

That makes sense, I/O tasks don't generally hold the cpu for extended
periods, whereas a cpu bound task does. I suspect you'll get the same
result by running a shell doing while true; do i=i+1; done while your
glxgears test is running. Anything which uses lots of cpu continuously
will eventually lose its "interactive" status, and will therefore
round-robin with any other cpu hogs in the system with no ability to
preempt, other than when their competition runs out of timeslice.

> So I tried to nice the make and see what happens:
>
> nice 5 make -j4: Seems to make no difference. Heavy stuttering in glxgears and
> et
> nice 10 make -j4: Somewhat better but still unusable with et
>
> everything above nice 15 is usable. nice 19 has full interactivity, that means
> you can't make out a difference between no load and kernel compile while
> playing enemy-territory.

That makes sense too if enemy-territory sleeps ever so briefly very
frequently. At nice 19, there is no possibility that gcc is at the same
priority or above enemy-territory or any other nice 0 cpu hog regardless
of any dynamic priority adjustment. At every wake-up, it will be able
to preempt gcc. If enemy-territory doesn't sleep frequently and very
briefly, I'd expect the anti-starvation logic combined with 100ms
timeslices to give you noticeable hiccups.
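
The nice 19 argument can be checked with a small model of the dynamic
priority calculation. This is a simplified sketch, assuming the stock 2.6
constants as I understand them (user priorities 100..139, a sleep bonus of at
most +/-5); sleep_credit is an illustrative stand-in for the scaled sleep_avg,
and can_compete_with_nice0() is a hypothetical helper. It ignores timeslice
length and anything -mm changes, so it won't reproduce the nice 15 threshold
reported above.

```c
#define MAX_RT_PRIO 100   /* priorities below this are realtime */
#define MAX_PRIO    140
#define MAX_BONUS   10    /* assumed: 25% of the 40 user priority levels */

/* Static priority for a nice level in [-20, 19]: 100..139. */
int nice_to_prio(int nice_level)
{
    return MAX_RT_PRIO + nice_level + 20;
}

/*
 * Dynamic priority; a lower number means a higher priority.
 * sleep_credit runs from 0 (never sleeps) to MAX_BONUS (sleeps a lot).
 */
int effective_prio(int nice_level, int sleep_credit)
{
    int bonus = sleep_credit - MAX_BONUS / 2;      /* -5 .. +5 */
    int prio = nice_to_prio(nice_level) - bonus;

    if (prio < MAX_RT_PRIO)
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)
        prio = MAX_PRIO - 1;
    return prio;
}

/*
 * Can a task at this nice level, with the best possible sleep bonus, reach
 * the priority of a nice 0 task with the worst possible bonus?
 */
int can_compete_with_nice0(int nice_level)
{
    return effective_prio(nice_level, MAX_BONUS) <= effective_prio(0, 0);
}
```

In this model a nice 19 task never gets better than priority 134, while a
nice 0 cpu hog never drops below 125, so the nice 0 task wins at every
wake-up; at nice 5 or nice 10 the two ranges still overlap.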

> I suspect that it has something to do with the priority boost for I/O hogs.
> But if this is a "general" scheduler problem, then why aren't more people
> complaining about this?

I suspect it's because most of the time, even heavy cpu using
interactive tasks sleep enough to not lose their interactive status with
the scheduler. IOW, the heuristics work well, but are not perfect. The
scheduler simply cannot determine that any task is truly interactive,
so can't automatically give it as much cpu as it wants when the system
is over-loaded.

What if: the scheduler did always give glxgears super high priority,
and you start a kernel compile and glxgears... and leave both running
while you go shopping. While you're gone, glxgears has nobody to
interact with, but the scheduler can't possibly know that you left.
When you come back, you expect your compile to have finished, but it
just sat there while glxgears used 100% cpu. Kobayashi Maru.

-Mike

2006-11-19 17:45:04

by Lee Revell

Subject: Re: Sluggish system responsiveness on I/O

On Sun, 2006-11-19 at 08:51 +0100, Mike Galbraith wrote:
> That makes sense, I/O tasks don't generally hold the cpu for extended
> periods, whereas a cpu bound task does.

So what can we do about I/O intensive tasks that also want a lot of CPU,
for example, the bloatier Gnome/KDE apps? Evolution is the worst for
me.

Lee

2006-11-19 18:34:52

by Mike Galbraith

Subject: Re: Sluggish system responsiveness on I/O

On Sun, 2006-11-19 at 12:44 -0500, Lee Revell wrote:
> On Sun, 2006-11-19 at 08:51 +0100, Mike Galbraith wrote:
> > That makes sense, I/O tasks don't generally hold the cpu for extended
> > periods, whereas a cpu bound task does.
>
> So what can we do about I/O intensive tasks that also want a lot of CPU,
> for example, the bloatier Gnome/KDE apps? Evolution is the worst for
> me.


Evolution has big trouble with the ext3 (and maybe others) journal.
I've _never_ seen evolution having scheduler priority problems, only
journal problems (absolutely every damn time hefty I/O is going on).

What should we do about I/O tasks that decide to use massive cpu?

IMHO, absolutely nothing beyond whatever we decide to do with any other
cpu intensive task. There is nothing special about scheduling I/O
heavy tasks. If it uses massive cpu for sustained periods, it must pay
the price. In the meantime, an I/O intensive task that decides to use
heavy cpu will round-robin at relatively high frequency with every other
"interactive" task, which may also be doing a burst of cpu heavy work.
The reason for doing that cpu intensive burst just doesn't matter.

Currently, we special case I/O tasks to limit the dynamic priority boost
they can get via I/O. I think that is wrong.

-Mike

2006-11-22 10:55:47

by Mike Galbraith

Subject: [rfc patch] Re: Sluggish system responsiveness on I/O

Greetings,

Problem: If X or one of its clients gets into a position where it
can't get its work done and go to sleep, no sleep means no priority
boost. The consequence is terrible interactivity. Our sleep based
interactivity heuristics are very good, but not perfect.

Solution: The simple patch below acknowledges this shortcoming in
scheduler interactivity heuristics by making a(nother) concession to the
real world - it adds the complement of class SCHED_BATCH to the
scheduler. While SCHED_BATCH tasks are never interactive, tasks which
are class SCHED_INTERACTIVE will always have interactive status, as will
any tasks they awaken (excluding in_interrupt()). The awakened task
will only have its sleep_avg adjusted, it will not change class.

Setting X to SCHED_INTERACTIVE obviously cures the situation where X
can't get to sleep often enough. It also cures a scenario which
demonstrates the client problem very well here: start xmms, enable its
G-FORCE visualization, and stretch it out large enough that it eats
massive cpu and then start a modest parallel kernel build. The very
hungry, but nonetheless definitely interactive (while I'm watching;)
G-FORCE visualization has no chance of producing decent output. Set X
to SCHED_INTERACTIVE, and presto, G-FORCE becomes a happy camper.

Setting X won't help if a threaded interactive application has its cpu
hog component awakened by one of its threads. The application would
either have to be started as SCHED_INTERACTIVE by the user, or modified
to set interactive threads to SCHED_INTERACTIVE during startup.

It also won't eliminate hiccups that can happen when the anti-starvation
logic kicks in on an overloaded box.

I've attached a modified userland tool (which was posted here a few
years ago, I didn't write it) to allow setting SCHED_INTERACTIVE if
anyone wants to try this out on their favorite interactivity problem.
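
For readers without the attachment, the core of such a tool might look like
the sketch below. It is not the attached schedctl.c: policy_from_name() and
set_policy() are hypothetical helpers, POLICY_INTERACTIVE is simply the value
4 that the patch below assigns to SCHED_INTERACTIVE, and on a kernel without
the patch the call is expected to fail.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

#define POLICY_NORMAL      0
#define POLICY_BATCH       3
#define POLICY_INTERACTIVE 4   /* SCHED_INTERACTIVE; only with the patch */

/* Map a policy name to its number; -1 for an unknown name. */
int policy_from_name(const char *name)
{
    if (strcmp(name, "normal") == 0)
        return POLICY_NORMAL;
    if (strcmp(name, "batch") == 0)
        return POLICY_BATCH;
    if (strcmp(name, "interactive") == 0)
        return POLICY_INTERACTIVE;
    return -1;
}

/* Switch pid (0 = the caller) to a non-realtime policy by name. */
int set_policy(pid_t pid, const char *name)
{
    /* Non-realtime policies require sched_priority == 0. */
    struct sched_param sp = { .sched_priority = 0 };
    int policy = policy_from_name(name);

    if (policy < 0) {
        errno = EINVAL;
        return -1;
    }
    return sched_setscheduler(pid, policy, &sp);
}
```

With the patch applied, set_policy(pid_of_X, "interactive") would be the
programmatic equivalent of setting X to SCHED_INTERACTIVE as described above.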

(Hi Christian;)

Suggestions for a solution that doesn't include adding yet another
scheduling class would be most welcome.

--- linux-2.6.19-rc6/include/linux/sched.h.org 2006-11-21 09:08:31.000000000 +0100
+++ linux-2.6.19-rc6/include/linux/sched.h 2006-11-21 11:34:15.000000000 +0100
@@ -34,6 +34,7 @@
 #define SCHED_FIFO 1
 #define SCHED_RR 2
 #define SCHED_BATCH 3
+#define SCHED_INTERACTIVE 4

 #ifdef __KERNEL__

@@ -505,7 +506,7 @@ struct signal_struct {
 #define rt_prio(prio) unlikely((prio) < MAX_RT_PRIO)
 #define rt_task(p) rt_prio((p)->prio)
 #define batch_task(p) (unlikely((p)->policy == SCHED_BATCH))
-#define is_rt_policy(p) ((p) != SCHED_NORMAL && (p) != SCHED_BATCH)
+#define is_rt_policy(p) ((p) == SCHED_RR || (p) == SCHED_FIFO)
 #define has_rt_policy(p) unlikely(is_rt_policy((p)->policy))

 /*
--- linux-2.6.19-rc6/kernel/sched.c.org 2006-11-16 10:02:26.000000000 +0100
+++ linux-2.6.19-rc6/kernel/sched.c 2006-11-22 09:01:35.000000000 +0100
@@ -921,6 +921,14 @@ static int recalc_task_prio(struct task_
 			p->sleep_avg += sleep_time;

 		}
+		/*
+		 * If a task of class SCHED_INTERACTIVE awakens another,
+		 * that task should also be considered interactive despite
+		 * heavy cpu usage.
+		 */
+		if (!in_interrupt() && current->policy == SCHED_INTERACTIVE &&
+				p->sleep_avg < ceiling)
+			p->sleep_avg = ceiling;
 		if (p->sleep_avg > NS_MAX_SLEEP_AVG)
 			p->sleep_avg = NS_MAX_SLEEP_AVG;
 	}
@@ -3091,6 +3099,13 @@ void scheduler_tick(void)
 		goto out_unlock;
 	}
 	if (!--p->time_slice) {
+		if (p->policy == SCHED_INTERACTIVE) {
+			unsigned long floor = INTERACTIVE_SLEEP(p);
+			if (floor > NS_MAX_SLEEP_AVG)
+				floor = NS_MAX_SLEEP_AVG;
+			if (p->sleep_avg < floor)
+				p->sleep_avg = floor;
+		}
 		dequeue_task(p, rq->active);
 		set_tsk_need_resched(p);
 		p->prio = effective_prio(p);
@@ -4117,7 +4132,8 @@ recheck:
 	if (policy < 0)
 		policy = oldpolicy = p->policy;
 	else if (policy != SCHED_FIFO && policy != SCHED_RR &&
-			policy != SCHED_NORMAL && policy != SCHED_BATCH)
+			policy != SCHED_NORMAL && policy != SCHED_BATCH &&
+			policy != SCHED_INTERACTIVE)
 		return -EINVAL;
 	/*
 	 * Valid priorities for SCHED_FIFO and SCHED_RR are
@@ -4663,6 +4679,7 @@ asmlinkage long sys_sched_get_priority_m
 		break;
 	case SCHED_NORMAL:
 	case SCHED_BATCH:
+	case SCHED_INTERACTIVE:
 		ret = 0;
 		break;
 	}
@@ -4687,6 +4704,7 @@ asmlinkage long sys_sched_get_priority_m
 		break;
 	case SCHED_NORMAL:
 	case SCHED_BATCH:
+	case SCHED_INTERACTIVE:
 		ret = 0;
 	}
 	return ret;


Attachments:
schedctl.c (10.00 kB)