2003-08-16 08:56:34

by Con Kolivas

Subject: [PATCH] O16.2int

Much simpler

Con


Attachments:
(No filename) (18.00 B)
patch-O16.1-O16.2int (2.73 kB)

2003-08-16 11:04:56

by Voluspa

Subject: Re: [PATCH] O16.2int


On 2003-08-16 8:59:48 Con Kolivas wrote:

> Much simpler

For a coder, perhaps. This user however is facing what feels like a
fundamental flaw. The doubling of boot time was fixed, but game-test is
pretty much as it was in pure O16 (impossible) and Blender is equally
bad - now even the mouse pointer vanishes, becomes invisible, during the
10+ second pauses. And it is with Blender as only app running. Didn't
dare to start xmms...

I'll keep running the 2.6.0-test3 ---> O16.2int though. Might pick up
other cases of regression in lighter usage.

Mvh
Mats Johannesson

2003-08-16 13:28:11

by Wiktor Wodecki

Subject: Re: [PATCH] O16.2int

Hello Con,

On Sat, Aug 16, 2003 at 07:02:52PM +1000, Con Kolivas wrote:
> Much simpler
>
> Con
[ snip ]

I just had a failure on one of my raid1 devices (not related, I think); the
system goes on in degraded mode. At the moment I'm generating a nice
IO load with badblocks on the failed drive and a backup from the
degraded raid1 to another raid1 (on a separate bus). During this I'm
playing xmms without much hassle and the system still feels
interactive. However, I have xterm on alt-f1 and if I press it now it
takes about 7 secs to open up. Could be better, but under these
conditions I consider this fair.

Here's a bit of vmstat:

kakerlak:/home/wiktor# vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  2      0   4752  18376 148028    0    0   895  2655 1162  1952 10  4 79  7
 1  2      0   4096  18604 148472    0    0  1988 29344 2360  7481 38 25  0 38
 0  2      0   4176  18700 148284    0    0  3296 27363 2317  7298 36 27  0 37
 0  2      0   5456  18804 146824    0    0  2392 30567 2161  7177 31 26  0 44
 0  3      0   5392  18664 146340    0    0  3544 46644 2172  6894 29 28  0 42
 2  3      0   5520  17888 147656    0    0  5568 21068 2173  5776 38 27  0 34
 0  2      0   3856  17436 149700    0    0  6656 26464 2122  6654 35 29  0 36
 0  2      0   4784  17392 148748    0    0  2324 30532 2175  7277 33 25  0 43
 0  2      0   4016  17052 149644    0    0  2620 30412 2175  7214 37 22  0 40
 0  3      0   5584  16568 148052    0    0  2632 42131 2163  7040 34 25  0 42
kakerlak:/home/wiktor#

--
Regards,

Wiktor Wodecki


Attachments:
(No filename) (1.62 kB)
(No filename) (189.00 B)

2003-08-16 14:02:45

by Con Kolivas

Subject: Re: [PATCH] O16.2int

On Sat, 16 Aug 2003 21:07, Voluspa wrote:
> On 2003-08-16 8:59:48 Con Kolivas wrote:
> > Much simpler
>
> For a coder, perhaps. This user however is facing what feels like a
> fundamental flaw. The doubling of boot time was fixed, but game-test is
> pretty much as it was in pure O16 (impossible) and Blender is equally
> bad - now even the mouse pointer vanishes, becomes invisible, during the
> 10+ second pauses. And it is with Blender as only app running. Didn't
> dare to start xmms...

Nice.

Funny you should mention xmms in the same sentence since that's an app that
works fine. I was under no illusion that O16s were going to make these apps
perform well. Now you have to clarify what you mean by game test as being
impossible. I assume you mean wine based games? The only game I own is
neverwinternights and that performs beautifully. You have to tell me about
this blender app? I'm unable to fully reproduce these problems here as the
only thing exhibiting starvation is a mozilla plugin that is busy on wait
using the libgdk(something) library, and it is usable without starvation
albeit at sucky performance. If you can profile blender sucking it would be
helpful.

> I'll keep running the 2.6.0-test3 ---> O16.2int though. Might pick up
> other cases of regression in lighter usage.

It's not a matter of light versus heavy, it's only select applications. Try
running your machine at absurd loads (without hitting swap) with apps that
don't exhibit it.

Since these are the last thing in my gunsights for the interactivity
development I'd appreciate as much info as anybody has on what happens when
they suck, and if profiling can find a common link. Since I own none of the
apps that do this I can only try to fix them with your help. If someone is
available for frequent small patch testing that would also be helpful as that
helped the other interactivity development, so please email me directly.

Con

2003-08-16 14:41:36

by Nick Piggin

Subject: Re: [PATCH] O16.2int



Con Kolivas wrote:

>On Sat, 16 Aug 2003 21:07, Voluspa wrote:
>
>>On 2003-08-16 8:59:48 Con Kolivas wrote:
>>
>>>Much simpler
>>>
>>For a coder, perhaps. This user however is facing what feels like a
>>fundamental flaw. The doubling of boot time was fixed, but game-test is
>>pretty much as it was in pure O16 (impossible) and Blender is equally
>>bad - now even the mouse pointer vanishes, becomes invisible, during the
>>10+ second pauses. And it is with Blender as only app running. Didn't
>>dare to start xmms...
>>
>
>Nice.
>

Hi Con,
Up late fixing the scheduler? ;)
Quick question: have you been Contest-ing your patches much? Would
you like me to if you haven't the time?

Nick


2003-08-16 14:41:55

by Voluspa

Subject: Re: [PATCH] O16.2int

On Sun, 17 Aug 2003 00:09:06 +1000 Con Kolivas wrote:

> On Sat, 16 Aug 2003 21:07, Voluspa wrote:
> > On 2003-08-16 8:59:48 Con Kolivas wrote:
[...]
> Funny you should mention xmms in the same sentence since that's an app
> that works fine.
[...]
> Now you have to clarify what you mean by game test as being
> impossible. I assume you mean wine based games?
[...]
> If you can profile blender sucking it would be helpful.

I can still starve xmms while running Blender ;-) Game-test
(yes, wine) impossible means it acts almost exactly as I wrote about in:

http://marc.theaimsgroup.com/?l=linux-kernel&m=106098091012383&w=2

We're talking many minutes of total starvation at each stage, start
- menu selection single player - load saved game menu - load game
(haven't had the patience to wait for an actual game to begin...)

Profile Blender? Ok, know nothing about the technique but will compile a
kernel with profiling and read up on what has been mentioned on this
list (Mr Erwin calling for proper "instrumentation")

Will take a lot of time. I'll return any results directly to your
address, won't copy lkml.

Mvh
Mats Johannesson

2003-08-16 14:54:13

by Con Kolivas

Subject: Re: [PATCH] O16.2int

On Sun, 17 Aug 2003 00:41, Nick Piggin wrote:
> Hi Con,
> Up late fixing the scheduler? ;)

Sigh.. yeah it was killing me slowly before and now it's killing me more
rapidly. Seems I age a year with each patch release. Very occasionally I wish
I was doing this in daylight hours instead of starting all my hacking at 10pm
when the family has gone to bed.

> Quick question: have you been Contest-ing your patches much? Would
> you like me to if you haven't the time?

Not as frequently as I'd like no. If you can do nice repeatable runs it would
be very nice of you thanks. I like to spend as much time as I can testing
them locally on real world usage, and each time I planned to do some
benchmarking there was always something else I needed to try after others
responded. 2.6.0-test3-mm2 #85 doesn't even begin to tell you how many
kernels I've booted. I've almost rebooted enough to install windows.

If you're interested, this is the script I ran for maximum repeatability in
minimum timespace of the tests that varied a lot. I had a separate ext3
partition on another hard disk called /ext3 (explains the command)

oops just noticed you cc'ed lkml. ah what the heck I'll cc it as well
Con

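#!/bin/sh
# $1 is handed to contest's -k option; /ext3 is the separate ext3
# partition mentioned above. Each successive invocation repeats a smaller
# subset of the loads, presumably so that the loads whose results varied
# the most are run more often.
contest -k "$1" -o /ext3/dump -n 1 no_load cacherun process_load ctar_load \
	xtar_load io_load io_other read_load list_load mem_load dbench_load
contest -k "$1" -o /ext3/dump -n 1 process_load ctar_load \
	xtar_load io_load io_other read_load list_load mem_load dbench_load
contest -k "$1" -o /ext3/dump -n 1 ctar_load \
	xtar_load io_load dbench_load
contest -k "$1" -o /ext3/dump -n 1 io_load dbench_load
contest -r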


2003-08-17 07:39:09

by Andrew Morton

Subject: Re: [PATCH] O16.2int

Con Kolivas wrote:
>
> Much simpler

But broken.

The machine runs about 100x slower than normal. The screensaver cut in
halfway through the initscripts ;) That's on 2-way. The same kernel works
OK on uniprocessor.

2003-08-17 09:24:45

by Con Kolivas

Subject: Re: [PATCH] O16.2int

On Sun, 17 Aug 2003 17:40, Andrew Morton wrote:
> Con Kolivas wrote:
> > Much simpler
>
> But broken.
>
> The machine runs about 100x slower than normal. The screensaver cut in
> halfway through the initscripts ;) That's on 2-way. The same kernel works
> OK on uniprocessor.

Good enough. Don't worry about any of the 16s; this drastic change only helped
the mild case anyway. I'll save only the cleanups and post an incremental to
16.2 at a later stage.

Con

2003-08-17 14:52:36

by Con Kolivas

Subject: Re: [PATCH] O16.2int

On Sun, 17 Aug 2003 17:40, Andrew Morton wrote:
> Con Kolivas wrote:
> > Much simpler
>
> But broken.
>
> The machine runs about 100x slower than normal. The screensaver cut in
> halfway through the initscripts ;) That's on 2-way. The same kernel works
> OK on uniprocessor.

Reverting the !in_interrupt nonsense should be enough to avoid the dreaded
screensaver at boottime I hope. Does this fix it?

Starvation will be approached differently.

Change:
Make preemption occur as vanilla again except for now not preempting same
priority tasks with less timeslice.
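
As a rough illustration of the change described above, the wakeup
preemption test might look like the following C sketch. It is not taken
from the attached patch: the struct, its fields and both function names
are hypothetical, and the text leaves open whose timeslice is compared.
The sketch assumes that a newly woken task of equal priority preempts
only when it has more timeslice left than the running task, and that a
lower prio value means higher priority.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for per-task scheduler state. */
struct task {
	int prio;        /* assumption: lower value means higher priority */
	int time_slice;  /* remaining timeslice, in ticks */
};

/* Baseline rule: preempt only if the woken task has strictly better priority. */
static bool preempts_vanilla(const struct task *woken, const struct task *curr)
{
	return woken->prio < curr->prio;
}

/*
 * Assumed reading of the rule in the text: behave like the baseline for
 * differing priorities; for equal priorities, preempt only when the woken
 * task has more timeslice remaining than the running one.
 */
static bool preempts_with_timeslice_rule(const struct task *woken,
					 const struct task *curr)
{
	if (woken->prio != curr->prio)
		return woken->prio < curr->prio;
	return woken->time_slice > curr->time_slice;
}
```

Under this reading the two rules differ only for tasks of equal
priority, which the baseline never preempts.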

Con


Attachments:
(No filename) (594.00 B)
patch-O16.2-O16.3int (468.00 B)

2003-08-17 20:38:26

by Benjamin Weber

Subject: Re: [PATCH] O16.2int

Hi Con

First of all, thanks a lot for all the hard work on the scheduler. A lot
of people are looking forward to 2.6 and your work is a major part of
the desktop user elements of this next kernel generation.

I am not one of those who can judge much difference from 15 to 16 e.g.
but I do know that not using any of your patches is a major step back in
interactivity and general feeling of responsiveness of my system. Been
using your patches since the 6th iteration. Back then juk (the kde
jukebox thingy) was a major skipping candidate under heavy load, which
nowadays is almost smooth as silk.

I am pretty sure that by the time 2.6.0 is released on the world your
scheduler will be so fine-tuned and mature that it makes the new kernel
a must-have for everyone. And you will get there even if you do not work
late at night each day to get the latest improvements to us as fast as
possible hehe. Just to keep you from too many sleepless nights ;)

Until then I will continue applying your patches as soon as they get
out and look out for anything suspicious that might catch my attention
=)

Keep up the great work!

--
Ben

2003-08-17 21:06:10

by Andrew Morton

Subject: Re: [PATCH] O16.2int

Con Kolivas wrote:
>
> Reverting the !in_interrupt nonsense should be enough to avoid the dreaded
> screensaver at boottime I hope. Does this fix it?

yes, that's a fix.

2003-08-17 21:47:20

by Tom Sightler

Subject: Re: [PATCH] O16.2int

Hi Con,

I've been unable to test many of your patches lately, but I recently
upgraded to 2.6.0-test3-mm2 and included the O16.2 patches and I'm
having some very bad behaviour. I'll try to describe it the best way I
can.

The worst case scenario involves wine applications (a program that seems
to be causing a significant amount of trouble) but I think I've seen the
issue in several other applications as well.

1. -- Wine running Windows Media Player 6.4 works great when running in
a window, but if you make the program open to full screen it gets so much
attention that it almost completely starves the entire system; you can't
even manage to get switched back to the standard screen. If you get
very lucky you can switch the system to a VT and manage, painfully
slowly, to kill the wine process off, then the system will return to
normal. This issue actually seems to affect almost any wine
application that's doing a lot of screen updates; for example, small
flash animations work OK, but complex animations can cause many seconds
of starvation to the rest of the system. Renicing the X server and wine
server to much lower numbers seems to make this issue go away.

2. -- Adobe Acrobat 5.07 for Linux seems to have a very similar issue, a
large complex document seems to starve out the whole system making the
system feel locked up for several seconds.

3. -- It seems I can trigger the same kind of starvation by simply doing
large selects in a Konsole window. Selecting a large section of text
which causes the window to scroll can sometimes make the system almost
totally non-responsive in every other window.

The wine issue is the easiest to trigger; the others seem to require
that some other things are also using CPU. In other words, if the
system is otherwise totally idle I have been unable to trigger the
Konsole issue; however, if another application is using a reasonable
amount of CPU, say a video playing in Wine using ~50% CPU, then it seems
easy to make this happen. Once it happens it's very hard to recover:
the system almost completely fails to respond to mouse button clicks
and keyboard functions (although the mouse usually continues to move
smoothly). If I get lucky I can get to a VT and eventually log in and kill
something off, but this sometimes takes minutes (it took 90 seconds
earlier today just to get the password prompt after typing in my username
at the VT prompt and another minute to get to a shell).

Later,
Tom



2003-08-18 01:21:11

by Con Kolivas

Subject: Re: [PATCH] O16.2int

Quoting Tom Sightler:

> Hi Con,

Hi Tom

> 1. -- Wine running Windows Media Player 6.4 works great when running in

Yes a well known problem now (see other threads about wine). Wine doing
something very cpu intensive exhibits this due to a combination of wine
breakage brought out by my scheduler tweaks causing priority inversion.
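
The kind of starvation described here can be illustrated with a toy
model in C. This is a deliberately simplified simulation, not the kernel
scheduler and not wine's actual code: the two tasks, the tick loop and
all names are assumptions made for the sketch. A higher-priority
"waiter" needs a flag that only a lower-priority "worker" can set; under
strict priority scheduling, a waiter that polls in a busy loop never
lets the worker run, while a waiter that sleeps does.

```c
#include <stdbool.h>

enum wait_style { WAIT_BUSY, WAIT_BLOCKING };

/*
 * Toy strict-priority scheduler with two tasks:
 *   waiter: higher priority, needs the flag `ready`
 *   worker: lower priority, sets `ready` after work_ticks ticks of CPU
 * Returns the tick at which the waiter first runs with `ready` set,
 * or -1 if that does not happen within max_ticks.
 */
static int ticks_until_ready(enum wait_style style, int work_ticks,
			     int max_ticks)
{
	int work_done = 0;
	bool ready = false;

	for (int tick = 0; tick < max_ticks; tick++) {
		/* A busy-waiter is always runnable; a blocking one sleeps until ready. */
		bool waiter_runnable = (style == WAIT_BUSY) || ready;

		if (waiter_runnable) {
			/* The higher-priority waiter always wins the CPU. */
			if (ready)
				return tick;
			/* Otherwise it burns this tick polling the flag. */
		} else {
			/* Waiter is asleep, so the worker gets the CPU. */
			if (++work_done >= work_ticks)
				ready = true;
		}
	}
	return -1;
}
```

With a worker that needs 3 ticks, the blocking waiter proceeds at tick 3,
whereas the busy-waiting one never does within any tick budget, because
the worker is never scheduled.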

> 2. -- Adobe Acrobat 5.07 for Linux seems to have a very similar issue, a
> large complex document seems to starve out the whole system making the
> system feel locked up for several seconds.

Actually I've profiled acroread and it seems to be more a memory issue than a
scheduler one per se. Something is very inefficient about its design, and it
behaves much worse as a mozilla plugin than standalone. Give it lots of cpu
time and it just keeps doing more and more vm work.

> 3. -- It seems I can trigger the same kind of starvation by simply doing
> large selects in a Konsole window. Selecting a large section of text

> Konsole issue, however, if another application is using a reasonable
> amount of CPU, say a video playing in Wine using ~50% CPU, then it seems
> easy to make this happen. Once it happens it's very hard to recover,

This is because of wine, not the Konsole which behaves fine, but tips the
combination of wine and something else over the edge.

I'm working on it. Thanks very much for your report.

Cheers,
Con

2003-08-18 16:47:45

by Antonio Vargas

Subject: Re: [PATCH] O16.2int

On Mon, Aug 18, 2003 at 11:21:04AM +1000, Con Kolivas wrote:
> Quoting Tom Sightler:
>
> > Hi Con,
>
> Hi Tom
>
> > 1. -- Wine running Windows Media Player 6.4 works great when running in
>
> Yes a well known problem now (see other threads about wine). Wine doing
> something very cpu intensive exhibits this due to a combination of wine
> breakage brought out by my scheduler tweaks causing priority inversion.
>
> > 2. -- Adobe Acrobat 5.07 for Linux seems to have a very similar issue, a
> > large complex document seems to starve out the whole system making the
> > system feel locked up for several seconds.
>
> Actually I've profiled acroread and it seems to be more a memory issue than a
> scheduler one per se. Something is very inefficient about its design, and it
> behaves much worse as a mozilla plugin than standalone. Give it lots of cpu
> time and it just keeps doing more and more vm work.

Acrobat has a switch so that it keeps a cache of rendered pages, and
obviously it defaults to ON, so just reading a big PDF file page by
page will trash the whole system with lots of useless data. BUT, for simple
PDF usage on a non-multitasking single-user machine it's faster,
so there you have a possible reason for its strange behaviour.


> > 3. -- It seems I can trigger the same kind of starvation by simply doing
> > large selects in a Konsole window. Selecting a large section of text
>
> > Konsole issue, however, if another application is using a reasonable
> > amount of CPU, say a video playing in Wine using ~50% CPU, then it seems
> > easy to make this happen. Once it happens it's very hard to recover,
>
> This is because of wine, not the Konsole which behaves fine, but tips the
> combination of wine and something else over the edge.
>
> I'm working on it. Thanks very much for your report.
>
> Cheers,
> Con

--
winden/network

1. Given a program, it always has at least one bug.
2. Given several lines of code, they can always be shortened to fewer lines.
3. By induction, all programs can be
reduced to one line that doesn't work.

2003-08-18 23:40:51

by Con Kolivas

Subject: Re: [PATCH] O16.2int

On Tue, 19 Aug 2003 02:51, Antonio Vargas wrote:
> On Mon, Aug 18, 2003 at 11:21:04AM +1000, Con Kolivas wrote:
> > Quoting Tom Sightler:
> > > 2. -- Adobe Acrobat 5.07 for Linux seems to have a very similar issue,
> > > a large complex document seems to starve out the whole system making
> > > the system feel locked up for several seconds.
> >
> > Actually I've profiled acroread and it seems to be more a memory issue
> > than a scheduler one per se. Something is very inefficient about its
> > design, and it behaves much worse as a mozilla plugin than standalone.
> > Give it lots of cpu time and it just keeps doing more and more vm work.
>
> Acrobat has a switch so that it keeps a cache of rendered pages, and
> obviously it defaults to ON, so just reading a big PDF file page by
> page will trash the whole system with lots of useless data. BUT, for simple
> PDF usage on a non-multitasking single-user machine it's faster,
> so there you have a possible reason for its strange behaviour.

Yes. As well as this though, there is a specific problem with it as a mozilla
plugin. Profiling shows some libgdk is doing all the work and it really
behaves badly. Put the same plugin into a different browser (eg opera) and it
behaves well, working pretty much like standalone acroread.

Con