2003-07-22 01:19:43

by Simon Kirby

Subject: Scheduler starvation (2.5.x, 2.6.0-test1)

I keep seeing cases where browsing in mozilla / galeon will suck away all
CPU from X updating the mouse, xmms playing, etc., for about a second as
Mozilla renders a page (which should take 50 ms to render, but anyway..).

This is only a Celeron 466 MHz, but there never used to be such a problem
in 2.2 and 2.4 kernels. All processes (X, galeon, xmms) are running with
the default nice of 0. It seems there is something with the scheduler
heuristics which is giving galeon a way-too-large preemptive timeslice
and not even allowing enough CPU to other processes to, for example, let
X update the mouse cursor. This seems wrong -- shouldn't the device
event always wake up and switch to the X task if the process has usable
timeslices left?

This only seems to happen after letting the system settle for some time.
If I refresh a page once, the problem happens. Again, less. Again, the
scheduler seems to allow mouse updates normally. I have to wait about 30
seconds for the problem to occur again. This is much easier to see with
X reniced to +1. It occurs with X reniced to +0, but not as often. With
the old scheduler, +20 wouldn't even make a noticeable difference because
mouse events would still wake up and run the process as expected.
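
A rough way to put numbers on stalls like these is a small probe that
sleeps for a short fixed interval in a loop and records how late each
wake-up is. The sketch below is only an illustration in Python; the
function name, the 10 ms interval and the sample count are arbitrary
choices for the example, not anything taken from the kernel or from this
report.

```python
import time


def measure_wakeup_latency(interval=0.010, samples=200):
    """Sleep `interval` seconds `samples` times; return each wake-up delay.

    The delay is how much longer than `interval` the sleep actually took,
    in seconds, clamped at zero.
    """
    delays = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        delays.append(max(0.0, elapsed - interval))
    return delays


if __name__ == "__main__":
    delays = measure_wakeup_latency()
    print("worst wake-up delay: %.1f ms" % (max(delays) * 1000.0))
    print("mean wake-up delay:  %.1f ms" % (sum(delays) / len(delays) * 1000.0))
```

Run at nice 0 while reloading a page, a starved probe would be expected
to show a worst-case delay far above the mean; comparing the figures
between kernels would make reports like this one easier to compare.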

The problem is really easy to notice on this box, not just because of the
audible and visible skips, but because of a bug in the ALSA
Gravis Ultrasound Classic driver which causes it to sometimes play
incorrect portions of RAM when the sound restarts. :)

Is anybody else seeing this or is it something to do with my setup here?

Simon-


2003-07-22 14:59:39

by Tom Felker

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, 22 Jul 2003 03:40:09 +0200, Simon Kirby wrote:
>
> Is anybody else seeing this or is it something to do with my setup here?
>
> Simon-

I see this too, on a much faster system. My browsing habit is to open a
link in a new tab, and immediately click on that tab. The mouse gets
noticeably jumpy right after clicking the link. I may be jaded by my
2.4.20-gentoo kernel which never did this, but still...

My system is a 2.4 GHz P4 running Gentoo, 2.6.0-test1 (tainted with
nVidia), and Mozilla Firebird, with a USB mouse.

--
Tom Felker

Alchemists became chemists when they stopped keeping secrets.

2003-07-22 15:49:42

by Martin Zwickel

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, 22 Jul 2003 10:14:46 -0500
Tom Felker bubbled:

> On Tue, 22 Jul 2003 03:40:09 +0200, Simon Kirby wrote:
> >
> > Is anybody else seeing this or is it something to do with my setup here?
> >
> > Simon-
>
> I see this too, on a much faster system. My browsing habit is to open a
> link in a new tab, and immediately click on that tab. The mouse gets
> noticeably jumpy right after clicking the link. I may be jaded by my
> 2.4.20-gentoo kernel which never did this, but still...
>
> My system is a 2.4 GHz P4 running Gentoo, 2.6.0-test1 (tainted with
> nVidia), and Mozilla Firebird, with a USB mouse.

I have the same problem with 2.6.0-t1-ac2. If I minimize a
window, xmms hangs for a few ms. The same happens if I browse some pages
and go back/forward.
Screens and windows also fill/draw a little slowly (well, that
could be an X/nvidia problem).

I changed back to 2.4.22-p7 and everything works fine.

Regards,
Martin

ps.:
my machine: P4 2.4 GHz, 512 MB RAM, Gentoo, nvidia driver, Mozilla 1.4

--
MyExcuse:
Power company testing new voltage spike (creation) equipment

Martin Zwickel
Research & Development

TechnoTrend AG <http://www.technotrend.de>



2003-07-22 17:14:26

by Apurva Mehta

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

* Simon Kirby [2003-07-22 11:50]:
> I keep seeing cases where browsing in mozilla / galeon will suck away all
> CPU from X updating the mouse, xmms playing, etc., for about a second as
> Mozilla renders a page (which should take 50 ms to render, but anyway..).

I do not have any problems with mouse response, but xmms sure does
skip a whole lot more on my 2.6.0-test1 running on a PIII 500 MHz, 192
MB RAM.

I usually run Opera and the skipping occurs often while switching
between tabs with the mouse (not when it is done with the keyboard).

Also, severe xmms skipping occurs while scrolling through PDF files
(in Acrobat) while the first few seconds of a song are playing. The
song virtually stops while I scroll. After the song plays for a bit,
scrolling through a PDF makes no difference.

Sometimes, xmms pops up in between songs saying that it could not
detect the audio device! This occurs mainly during heavy disk i/o or
cpu usage.

> .. but there never used to be such a problem in 2.2 and 2.4 kernels.
> All processes (X, galeon, xmms) are running with the default nice of
> 0.

Same here..

> Is anybody else seeing this or is it something to do with my setup here?

Apparently it is not just you :)

- Apurva

2003-07-22 18:26:42

by Felipe Alfaro Solana

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, 2003-07-22 at 18:04, Martin Zwickel wrote:
> I have the same problem with 2.6.0-t1-ac2. If I minimize a
> window, xmms hangs for a few ms. The same happens if I browse some pages
> and go back/forward.
> Screens and windows also fill/draw a little slowly (well, that
> could be an X/nvidia problem).
>
> I changed back to 2.4.22-p7 and everything works fine.

Could you please test 2.6.0-test1-mm2? It includes some scheduler fixes
from Con Kolivas that will help in reducing or eliminating your
starvation issues.

Thank you very much!

2003-07-22 18:33:33

by Felipe Alfaro Solana

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, 2003-07-22 at 19:28, Apurva Mehta wrote:
> * Simon Kirby [2003-07-22 11:50]:
> > I keep seeing cases where browsing in mozilla / galeon will suck away all
> > CPU from X updating the mouse, xmms playing, etc., for about a second as
> > Mozilla renders a page (which should take 50 ms to render, but anyway..).
>
> I do not have any problems with mouse response, but xmms sure does
> skip a whole lot more on my 2.6.0-test1 running on a PIII 500 MHz, 192
> MB RAM.
>
> I usually run Opera and the skipping occurs often while switching
> between tabs with the mouse (not when it is done with the keyboard).
>
> Also, severe xmms skipping occurs while scrolling through PDF files
> (in Acrobat) while the first few seconds of a song are playing. The
> song virtually stops while I scroll. After the song plays for a bit,
> scrolling through a PDF makes no difference.
>
> Sometimes, xmms pops up in between songs saying that it could not
> detect the audio device! This occurs mainly during heavy disk i/o or
> cpu usage.

Could you please test 2.6.0-test1-mm2 instead? It has additional
scheduler fixes which should improve your overall experience.

Thanks!

2003-07-22 19:24:16

by Jose Luis Domingo Lopez

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tuesday, 22 July 2003, at 20:41:42 +0200,
Felipe Alfaro Solana wrote:

> Could you please test 2.6.0-test1-mm2? It includes some scheduler fixes
> from Con Kolivas that will help in reducing or eliminating your
> starvation issues.
>
I was having the same jumpy mouse behaviour with 2.6.0-test1, and on an
otherwise idle 600 MHz Pentium III box, scrolling a very simple HTML
page in Mozilla makes xmms skip audio.

Then I tried 2.6.0-test1-mm2, and several things happened: scrolling
an HTML page in Mozilla no longer seems to affect MP3 playback in XMMS,
but that is the only positive effect. Focusing windows raises them much
more slowly than in 2.6.0-test1, scheduler starvation is constant (just
try switching to another virtual desktop), and after several minutes
only XMMS got CPU time, while the rest of the applications
(at least those running under X) stalled.

Hope this helps.

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.0-test1)

2003-07-22 19:38:04

by Mike Fedyk

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, Jul 22, 2003 at 09:39:17PM +0200, Jose Luis Domingo Lopez wrote:
> On Tuesday, 22 July 2003, at 20:41:42 +0200,
> Felipe Alfaro Solana wrote:
>
> > Could you please test 2.6.0-test1-mm2? It includes some scheduler fixes
> > from Con Kolivas that will help in reducing or eliminating your
> > starvation issues.
> >
> I was having the same jumpy mouse behaviour with 2.6.0-test1, and on an
> otherwise idle 600 MHz Pentium III box, scrolling a very simple HTML
> page in Mozilla makes xmms skip audio.
>
> Then I tried 2.6.0-test1-mm2, and several things happened: scrolling
> an HTML page in Mozilla no longer seems to affect MP3 playback in XMMS,
> but that is the only positive effect. Focusing windows raises them much
> more slowly than in 2.6.0-test1, scheduler starvation is constant (just
> try switching to another virtual desktop), and after several minutes
> only XMMS got CPU time, while the rest of the applications
> (at least those running under X) stalled.

Yes, I get the same thing with KDE 3.1. Are you using KDE or Gnome? And if
so, what version?

2003-07-22 20:05:25

by Felipe Alfaro Solana

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tue, 2003-07-22 at 21:39, Jose Luis Domingo Lopez wrote:
> > Could you please test 2.6.0-test1-mm2? It includes some scheduler fixes
> > from Con Kolivas that will help in reducing or eliminating your
> > starvation issues.
> >
> I was having the same jumpy mouse behaviour with 2.6.0-test1, and on an
> otherwise idle 600 MHz Pentium III box, scrolling a very simple HTML
> page in Mozilla makes xmms skip audio.
>
> Then I tried 2.6.0-test1-mm2, and several things happened: scrolling
> an HTML page in Mozilla no longer seems to affect MP3 playback in XMMS,
> but that is the only positive effect. Focusing windows raises them much
> more slowly than in 2.6.0-test1, scheduler starvation is constant (just
> try switching to another virtual desktop), and after several minutes
> only XMMS got CPU time, while the rest of the applications
> (at least those running under X) stalled.

We know there are still some issues and we need help from people like
you. It's testing by a big enough user population that allows us to
fix things that don't work as expected :-)

If you can, please keep trying/testing/using the -mm series of the
kernel. Con Kolivas is putting a lot of effort into the scheduler
issues, and I wouldn't like that effort to go to waste in the end.

Thanks!

2003-07-22 20:22:43

by Apurva Mehta

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

* Felipe Alfaro Solana [2003-07-23 00:30]:
> On Tue, 2003-07-22 at 19:28, Apurva Mehta wrote:
> > I usually run Opera and the skipping occurs often while switching
> > between tabs with the mouse (not when it is done with the keyboard).
> >
> > Also, severe xmms skipping occurs while scrolling through PDF files
> > (in Acrobat) while the first few seconds of a song are playing. The
> > song virtually stops while I scroll. After the song plays for a bit,
> > scrolling through a PDF makes no difference.
> >
> > Sometimes, xmms pops up in between songs saying that it could not
> > detect the audio device! This occurs mainly during heavy disk i/o or
> > cpu usage.
>
> Could you please test 2.6.0-test1-mm2 instead? It has additional
> scheduler fixes which should improve your overall experience.

I just patched my tree and compiled it. It does not work. It freezes
the system when I try to start X. It gives a huge error message and
the last line is something like:
<6>note: X[1306] exited with preempt_count 1

I checked all the logs and I cannot find the complete error message
anywhere. Any suggestions where to look?

I also had to set root=0307 instead of /dev/hda7 to make it boot. But
I guess that is unrelated to this issue..
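
For reference, the numeric form is the old-style device number written
in hexadecimal: the high byte is the major number and the low byte is
the minor. Assuming that encoding, 0307 is major 3, minor 7, which
matches /dev/hda7 (the first IDE disk uses major 3). A small Python
sketch of the arithmetic; the helper name is made up for illustration:

```python
def decode_root(devnum):
    """Split an old-style 16-bit hex device number into (major, minor)."""
    value = int(devnum, 16)
    return (value >> 8) & 0xFF, value & 0xFF


print(decode_root("0307"))  # (3, 7)
```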


- Apurva

2003-07-22 20:40:46

by Diego Calleja

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On 22 Jul 2003 20:41:42 +0200, Felipe Alfaro Solana wrote:

> Could you please test 2.6.0-test1-mm2? It includes some scheduler fixes
> from Con Kolivas that will help in reducing or eliminating your
> starvation issues.

BTW, if/when you have some free time, you could try 2.5.63.
It has the best Linux scheduler I've ever tried! (for my workload, of
course)

It would be interesting to find out what has changed since then.

2003-07-22 21:18:23

by Mike Fedyk

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Wed, Jul 23, 2003 at 02:07:16AM +0530, Apurva Mehta wrote:
> * Felipe Alfaro Solana [2003-07-23 00:30]:
> > On Tue, 2003-07-22 at 19:28, Apurva Mehta wrote:
> > > I usually run Opera and the skipping occurs often while switching
> > > between tabs with the mouse (not when it is done with the keyboard).
> > >
> > > Also, severe xmms skipping occurs while scrolling through PDF files
> > > (in Acrobat) while the first few seconds of a song are playing. The
> > > song virtually stops while I scroll. After the song plays for a bit,
> > > scrolling through a PDF makes no difference.
> > >
> > > Sometimes, xmms pops up in between songs saying that it could not
> > > detect the audio device! This occurs mainly during heavy disk i/o or
> > > cpu usage.
> >
> > Could you please test 2.6.0-test1-mm2 instead? It has additional
> > scheduler fixes which should improve your overall experience.
>
> I just patched my tree and compiled it. It does not work. It freezes
> the system when I try to start X. It gives a huge error message and
> the last line is something like:
> <6>note: X[1306] exited with preempt_count 1
>
> I checked all the logs and I cannot find the complete error message
> anywhere. Any suggestions where to look?

How about your XFree86 log? (probably somewhere in /var/log)

2003-07-22 22:13:22

by Jose Luis Domingo Lopez

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Tuesday, 22 July 2003, at 12:53:06 -0700,
Mike Fedyk wrote:

> Yes, I get the same thing with KDE 3.1. Are you using KDE or Gnome? And if
> so, what version?
>
Debian Sid KDE version 3.1.2 here, and xmms 1.2.7 and Mozilla 1.4 if
that matters. I'll try to do some additional test, time permitting.

Regards,

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.0-test1)

2003-07-23 08:03:38

by Apurva Mehta

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

* Mike Fedyk [2003-07-23 12:00]:

[snip]
> > I just patched my tree and compiled it. It does not work. It freezes
> > the system when I try to start X. It gives a huge error message and
> > the last line is something like:
> > <6>note: X[1306] exited with preempt_count 1
> >
> > I checked all the logs and I cannot find the complete error message
> > anywhere. Any suggestions where to look?
>
> How about your XFree86 log? (probably somewhere in /var/log)

I stumbled across something very interesting this morning while
trying to start X on 2.6.0-test1-mm2. If I log in on the console,
execute a simple command like 'ls' and then do a `startx`, it
works. But if I log in and immediately do a `startx` (without any
preceding command), I get that error. Very strange.

The problems do not end there. Once I start X, I experience
random freezes of the system. In one session I could play some music
and write some email. It froze just after dialing into the
internet. In another session it froze while trying to start xmms.

After the system freezes in X and when I reboot into mm2, I get an error
message like I described in my previous mail just after all the init
scripts have started. The system then freezes again. On the subsequent
reboot, I can login.

The XFree86 logs (XFree86.0.log and XFree86.0.log.old) do not contain
anything relevant. I checked /var/log/messages and it has nothing either.

- Apurva

2003-07-23 16:26:35

by Mike Fedyk

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

On Wed, Jul 23, 2003 at 01:48:11PM +0530, Apurva Mehta wrote:
> The problems do not end there. Once I start X, I experience
> random freezes of the system. In one session I could play some music
> and write some email. It froze just after dialing into the
> internet. In another session it froze while trying to start xmms.

Please post the output of lspci here. Also, what version of XFree86 are you
running? Then read the documentation for the NMI oopser in the kernel
documentation tree, and turn it on for your system.

Then we might be able to get a useful message out of your system.

Mike

2003-07-23 18:49:11

by Apurva Mehta

Subject: Re: Scheduler starvation (2.5.x, 2.6.0-test1)

* Mike Fedyk [2003-07-23 22:41]:
> On Wed, Jul 23, 2003 at 01:48:11PM +0530, Apurva Mehta wrote:
> > The problems do not end there. Once I start X, I experience
> > random freezes of the system. In one session I could play some music
> > and write some email. It froze just after dialing into the
> > internet. In another session it froze while trying to start xmms.
>
> Please post the output of lspci here. Also, what version of XFree86 are you
> running? Then read the documentation for the NMI oopser in the kernel
> documentation tree, and turn it on for your system.
>
> Then we might be able to get a useful message out of your system.

Here is the output of lspci:

------

00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP]
00:04.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Mobile South] (rev 23)
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus Master IDE (rev 10)
00:04.2 USB Controller: VIA Technologies, Inc. USB (rev 11)
00:04.3 Host bridge: VIA Technologies, Inc. VT82C596 Power Management (rev 30)
00:0b.0 Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV5 [RIVA TNT2/TNT2 Pro] (rev 15)

------

I enabled the nmi watchdog by appending nmi_watchdog=1 to my kernel
boot parameters. I verified that it was enabled by doing a `cat
/proc/interrupts`. The NMI value was 30.
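
A single reading only shows that some NMIs have arrived; checking that
the count rises between two reads taken a few seconds apart is a more
direct test that the watchdog is ticking. A minimal Python sketch,
assuming the NMI row in /proc/interrupts starts with "NMI:" followed by
one counter per CPU; the function name is hypothetical:

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Counters come first; stop at any trailing description text.
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


print(nmi_counts("NMI:         30 "))  # [30]
```

Feeding it the contents of /proc/interrupts twice and comparing the two
results shows whether the counter is still increasing.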

Still, there are no special messages in /var/log/messages or in any
XFree86 log nor in any boot log..

Maybe the nmi watchdog is not working because it is not a normal
'freeze' per se. What happens is that my mouse still moves around the
screen, and I can also type (*only* type) commands into an xterm. But
those commands cannot be executed. When I try to execute a command,
nothing happens; I do not even get the prompt back. If I try to log in
on a virtual console, I can type my login name, but then the password
prompt does not come up. When I try to switch back to X, I cannot do
so and I get stuck at that blank screen. Then I am forced to do a cold
reboot.

Sometime during the subsequent boot process, I get that error
message while trying to load a random startup program. Last time, it
was while trying to load xfs.

Also, I have found that there is no pattern to the sequence of
operations that lead up to these freezes. I mentioned that if I
executed some programs on a virtual console before starting X, then X
would load fine, otherwise it would not. This is not true all the
time.

- Apurva