2002-10-07 08:02:26

by Helge Hafting

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

"Martin J. Bligh" wrote:
>
> > Then there's the issue of application startup. There's not enough
> > read ahead. This is especially sad, as the order of page faults is
> > at least partially predictable.
>
> Is the problem really, fundamentally a lack of readahead in the
> kernel? Or is it that your application is huge bloated pig?

Often the latter. People getting interested in linux
seem to believe that openoffice is the msoffice replacement,
and that _is_ a huge bloated pig. It needs 50M to start
the text editor - and lots of _cpu_. It takes a long time
to start on a 266MHz machine even when the disk io
is avoided by the pagecache.

A snappy desktop is trivial with 2.5, even with a slow machine.
Just stay away from gnome and kde, use an ugly fast
window manager like icewm or twm (and possibly lots
of others I haven't even heard about.)
X itself is snappy enough, particularly with increased
priority.
Take some care when selecting apps (yes - there is choice!)
and the desktop is just fine. Openoffice is a nice
package of programs, but there are replacements for most
of them if speed is an issue. If the machine is powerful
enough to run ms software snappy then speed probably
isn't such a big issue though.

> With admittedly no evidence whatsoever, I suspect the latter is
> really the root cause of this type of problem.
>
> Ditto for the "takes me years to switch between desktops" ...
> maybe it's just that RAM is full of utter garbage due to mindless
> feature-bloat, so everything gets swapped out. If you're running
> something like Netscape / Mozilla ... ;-)

My guess is a bloated window manager. Switching desktops
is fast for me, even with netscape running and swap in use.
Or are you talking 64M machines?

> I still think userspace is 90% of the problem here ...

Yes.

Helge Hafting


2002-10-07 10:03:56

by Oliver Neukum

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

On Monday 07 October 2002 10:08, Helge Hafting wrote:
> "Martin J. Bligh" wrote:
> > > Then there's the issue of application startup. There's not enough
> > > read ahead. This is especially sad, as the order of page faults is
> > > at least partially predictable.
> >
> > Is the problem really, fundamentally a lack of readahead in the
> > kernel? Or is it that your application is huge bloated pig?
>
> Often the latter. People getting interested in linux
> seem to believe that openoffice is the msoffice replacement,
> and that _is_ a huge bloated pig. It needs 50M to start
> the text editor - and lots of _cpu_. It takes a long time
> to start on a 266MHz machine even when the disk io
> is avoided by the pagecache.

OpenOffice _is_ an important application, whether we like it or not.

How does one measure and profile application startup other than with
a stopwatch? I'd like to gather some objective data on this.

> A snappy desktop is trivial with 2.5, even with a slow machine.
> Just stay away from gnome and kde, use an ugly fast

A desktop machine needs to run a desktop environment. Only a window manager is
not enough.

> window manager like icewm or twm (and possibly lots
> of others I haven't even heard about.)
> X itself is snappy enough, particularly with increased
> priority.
> Take some care when selecting apps (yes - there is choice!)
> and the desktop is just fine. Openoffice is a nice
> package of programs, but there are replacements for most
> of them if speed is an issue. If the machine is powerful
> enough to run ms software snappy then speed probably
> isn't such a big issue though.

KDE and friends _are_ not quite optimised for speed. That however doesn't
mean that the kernel should not make an effort to allow them to run as fast
as they can.

Regards
Oliver

2002-10-07 14:06:17

by Jan Hudec

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

On Mon, Oct 07, 2002 at 11:18:44AM +0200, Oliver Neukum wrote:
> On Monday 07 October 2002 10:08, Helge Hafting wrote:
> > "Martin J. Bligh" wrote:
> > > > Then there's the issue of application startup. There's not enough
> > > > read ahead. This is especially sad, as the order of page faults is
> > > > at least partially predictable.
> > >
> > > Is the problem really, fundamentally a lack of readahead in the
> > > kernel? Or is it that your application is huge bloated pig?
> >
> > Often the latter. People getting interested in linux
> > seem to believe that openoffice is the msoffice replacement,
> > and that _is_ a huge bloated pig. It needs 50M to start
> > the text editor - and lots of _cpu_. It takes a long time
> > to start on a 266MHz machine even when the disk io
> > is avoided by the pagecache.
>
> OpenOffice _is_ an important application, whether we like it or not.
>
> How does one measure and profile application startup other than with
> a stopwatch? I'd like to gather some objective data on this.

Add some debugging output to the program (mainly at the very beginning of
main) and then launch it with a simple program that notes the time right
before it forks, waits for the application to output something
(which should be the debugging write at the start of main), and notes the
time at which select returned.
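
A minimal sketch of such a launcher, in Python for brevity. It assumes the
application writes its debugging line to stdout; the function name
time_to_first_output is made up for illustration and is not an existing tool.
It notes the time before fork(), waits in select() on a pipe connected to the
child's stdout, and reports the elapsed time once the first byte arrives.

```python
import os
import select
import time


def time_to_first_output(argv, timeout=30.0):
    """Fork and exec argv; return (seconds until its first stdout byte, child pid).

    seconds is None if the child wrote nothing within `timeout` seconds
    (for example because exec failed and the pipe was simply closed).
    """
    read_fd, write_fd = os.pipe()
    start = time.monotonic()            # note the time right before forking
    pid = os.fork()
    if pid == 0:                        # child
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)        # stdout now feeds the pipe
            os.close(write_fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)               # reached only if exec failed
    os.close(write_fd)                  # parent keeps only the read end
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        stamp = time.monotonic()        # time at which select returned
        got_data = bool(ready) and os.read(read_fd, 1) != b""
    finally:
        # Closing our end means later writes by the child fail with EPIPE,
        # which is acceptable for a throwaway benchmark run.
        os.close(read_fd)
    return (stamp - start if got_data else None), pid
```

For example, time_to_first_output(["my_application"]) returns a pair of
(seconds, pid). The child keeps running, so the caller decides when to reap
it with os.waitpid().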

> > A snappy desktop is trivial with 2.5, even with a slow machine.
> > Just stay away from gnome and kde, use an ugly fast
>
> A desktop machine needs to run a desktop environment. Only a window manager is
> not enough.

Please, could someone explain to me what a desktop environment is, in
addition to a window manager and a horde of libraries for UI and IPC?

(No, the panel is not an important thing, and even if it were, it's a simple,
fast application, provided it's implemented sanely (I mean, gnome panel
is currently buggy).)

> > window manager like icewm or twm (and possibly lots
> > of others I haven't even heard about.)
> > X itself is snappy enough, particularly with increased
> > priority.
> > Take some care when selecting apps (yes - there is choice!)
> > and the desktop is just fine. Openoffice is a nice
> > package of programs, but there are replacements for most
> > of them if speed is an issue. If the machine is powerful
> > enough to run ms software snappy then speed probably
> > isn't such a big issue though.
>
> KDE and friends _are_ not quite optimised for speed. That however doesn't
> mean that the kernel should not make an effort to allow them to run as fast
> as they can.

No, it does not.

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <[email protected]>

2002-10-07 14:57:57

by Jesse Pollard

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

On Monday 07 October 2002 09:11 am, Jan Hudec wrote:
> On Mon, Oct 07, 2002 at 11:18:44AM +0200, Oliver Neukum wrote:
> > On Monday 07 October 2002 10:08, Helge Hafting wrote:
[snip]
> >
> > How does one measure and profile application startup other than with
> > a stopwatch? I'd like to gather some objective data on this.
>
> Add some debugging output to the program (mainly at the very beginning of
> main) and then launch it with a simple program that notes the time right
> before it forks, waits for the application to output something
> (which should be the debugging write at the start of main), and notes the
> time at which select returned.

nope... It has to be after input parameters have been evaluated, after
X window initialization has been done, and possibly after the application
windows are created. For a benchmark, it would likely be good to have
them at ALL such locations. Even on exit (how long does it take to
cleanup?).

> > > A snappy desktop is trivial with 2.5, even with a slow machine.
> > > Just stay away from gnome and kde, use an ugly fast
> >
> > A desktop machine needs to run a desktop environment. Only a window
> > manager is not enough.
>
> Please, could someone explain to me what a desktop environment is, in
> addition to a window manager and a horde of libraries for UI and IPC?
>
> (No, the panel is not an important thing, and even if it were, it's a simple,
> fast application, provided it's implemented sanely (I mean, gnome panel
> is currently buggy).)

The applications that USE that horde of libraries that must be running.
Otherwise, a blank screen would have been considered sufficient. Some
of these applications are: tool chest (sometimes part of a WM), multiple
desktop support (usually part of the WM, but not necessarily), WP or
other applications activated - depending on what the user wants.

What you end up having to do is define what the base desktop is
required to have to be considered "functional", and the amount of
time available for the desktop to be ready for use. I've even seen
M$ windows with 50-75 icons already present. Until they are initialized
the user didn't consider the system "usable". And that took several minutes
on an 800 MHz system. During some of that setup the mouse was just
unusable (frozen) or it would jump around trying to catch up with the
user's activity.

The other part of "usable" is how long it takes for an application to
"start". A simple fork/exec is quite fast. But that isn't a "started"
application. A responsive system means that the time between
the selection of the application to the time the user can enter data
(i.e. make a menu selection/start typing) is as short as possible. The
user's desire is about 1/4th of a second. With a large number of applications,
this activity requires a LOT of code to be swapped in. Not something done fast.

One way some systems used to do this is to guarantee a MINIMUM of
50-100K of the application to be loaded BEFORE a context switch
to the application is done. Of course, this assumes that all of the
initialization code can actually FIT in the first 100K. Usually it doesn't
because a lot of that initialization is for general runtime support and X
library initialization. Hopefully, this is already loaded and resident by a
pre-existing application (the window manager). Unfortunately, the WM
initialization may have already been swapped out, and some of the X
libraries too.

The only solution for this is to not swap out at all, and have enough
memory for everything. Which is also the first recommendation to
improve M$ Windows performance. (got that one when a laptop
was already maxed out "... not enough resources, why don't you
get some more memory...")

--
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]

Any opinions expressed are solely my own.

2002-10-07 15:12:51

by Martin J. Bligh

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

> OpenOffice _is_ an important application, whether we like it or not.
>
> How does one measure and profile application startup other than with
> a stopwatch? I'd like to gather some objective data on this.

I suggest a slightly (not a lot) more sophisticated stopwatch ...

Use -mm kernels, that's where the latest vm stuff is
http://www.zipworld.com.au/~akpm/linux/patches/2.5/2.5.40/2.5.40-mm2/
and Andrew is normally wonderfully responsive to clear data from
profiles (see oprofile below)

Then either use strace with the time option on it (-t?), or:

1. use oprofile (grab from akpm's site:
http://www.zipworld.com.au/~akpm/linux/patches/2.5/2.5.40/2.5.40-mm2/experimental/), and boot with idle=poll

2. in one window type the command to stop the oprofile stuff, but
don't press return (something like "op_stop > /dev/null")

3. In another window do:

rm -rf /var/lib/oprofile

op_start --vmlinux=/boot/vmlinux --map-file=/boot/System.map --ctr0-event=CPU_CLK_UNHALTED --ctr0-count=300000 > /dev/null

my_application

4. When your app finishes starting, hit return in that first window.

5. oprofpp -dl -i /boot/vmlinux > data_dumpy_place.

Examine output.
Or something along those lines. Not very sophisticated, but that's
what I'd do I guess (what does that say? ;-))

M.

PS. Actually the combination of an strace and profile might be most
meaningful (though you might want to do them separately ... make
sure the cache is either cold or warm both times, not one of each).
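
A rough sketch of post-processing the strace side, in Python. It assumes the
trace was captured with strace's -T option, which appends the time spent in
each system call as <seconds> at the end of the line. The helper name
slowest_syscalls is made up for illustration; it simply ranks individual
calls by that duration so the expensive ones during startup stand out.

```python
import re

# Matches "name(args...) = result <0.000123>", with any pid/timestamp prefix.
_CALL = re.compile(r"(?P<name>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$")


def slowest_syscalls(lines, n=10):
    """Return the n slowest calls as (seconds, syscall_name, original_line)."""
    found = []
    for line in lines:
        line = line.rstrip("\n")
        if "resumed>" in line:
            continue                    # second half of an interrupted call; skipped here
        match = _CALL.search(line)
        if match is None:
            continue                    # signals, exit notices, unfinished calls
        found.append((float(match.group("secs")), match.group("name"), line))
    found.sort(key=lambda item: item[0], reverse=True)
    return found[:n]
```

Feeding it the lines of a saved trace file (for example one written with
strace's -o option) gives a short list to compare against the profile.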

2002-10-07 15:30:15

by Jan Hudec

Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

On Mon, Oct 07, 2002 at 10:01:22AM -0500, Jesse Pollard wrote:
> On Monday 07 October 2002 09:11 am, Jan Hudec wrote:
> > On Mon, Oct 07, 2002 at 11:18:44AM +0200, Oliver Neukum wrote:
> > > On Monday 07 October 2002 10:08, Helge Hafting wrote:
> [snip]
> > >
> > > How does one measure and profile application startup other than with
> > > a stopwatch? I'd like to gather some objective data on this.
> >
> > Add some debugging output to the program (mainly at the very beginning of
> > main) and then launch it with a simple program that notes the time right
> > before it forks, waits for the application to output something
> > (which should be the debugging write at the start of main), and notes the
> > time at which select returned.
>
> nope... It has to be after input parameters have been evaluated, after
> X window initialization has been done, and possibly after the application
> windows are created. For a benchmark, it would likely be good to have
> them at ALL such locations. Even on exit (how long does it take to
> cleanup?).

Well, depends on what we want to measure. If it's at the beginning of
main, it measures library loading time. Then argument parsing, library
initialization, X initialization etc. can be measured. All those parts
should be timed so we can see where most time is spent and which can be
sped up.
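
One way to collect such per-phase numbers from inside the program, sketched
in Python. StartupTimer and the phase names in the usage note are
hypothetical. The time before the first mark (library loading) still has to
be measured from outside, since none of the program's own code runs before
main.

```python
import time


class StartupTimer:
    """Record named checkpoints and report the time spent between them."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._marks = [("timer created", clock())]

    def mark(self, name):
        """Note that the phase called `name` has just finished."""
        self._marks.append((name, self._clock()))

    def phases(self):
        """Return [(phase_name, seconds since the previous checkpoint)]."""
        return [
            (name, stamp - previous)
            for (_, previous), (name, stamp) in zip(self._marks, self._marks[1:])
        ]
```

An application would create the timer first thing in main, call
mark("arguments parsed"), mark("X initialized") and so on at each location,
and print phases() once its windows are up (and again on exit, if cleanup
time matters).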

> > > > A snappy desktop is trivial with 2.5, even with a slow machine.
> > > > Just stay away from gnome and kde, use an ugly fast
> > >
> > > A desktop machine needs to run a desktop environment. Only a window
> > > manager is not enough.
> >
> > Please, could someone explain to me what a desktop environment is, in
> > addition to a window manager and a horde of libraries for UI and IPC?
> >
> > (No, the panel is not an important thing, and even if it were, it's a simple,
> > fast application, provided it's implemented sanely (I mean, gnome panel
> > is currently buggy).)
>
> The applications that USE that horde of libraries that must be running.
> Otherwise, a blank screen would have been considered sufficient. Some
> of these applications are: tool chest (sometimes part of a WM), multiple
> desktop support (usually part of the WM, but not necessarily), WP or
> other applications activated - depending on what the user wants.

The tool chest definitely does not need most of the horde of libraries. And
it's part of most window managers (except sawfish and icewm(?)).
Multiple desktop support is _the_ window manager. I asked what is needed in
addition to the window manager.
An application is something that uses the desktop environment, not part of it.

Thus we come back to the point that a desktop environment is only a window
manager (which either provides the tool chest or uses a separate process to do
it, but that process does not have to be very complicated) and a horde of
libraries that let applications cooperate well. Some basic
applications must of course be there, like a file manager.

> What you end up having to do is define what the base desktop is
> required to have to be considered "functional", and the amount of
> time available for the desktop to be ready for use. I've even seen
> M$ windows with 50-75 icons already present. Until they are initialized
> the user didn't consider the system "usable". And that took several minutes
> on an 800 MHz system. During some of that setup the mouse was just
> unusable (frozen) or it would jump around trying to catch up with the
> user's activity.

And each of them was redrawn three times during the setup...
unfortunately gnome is not far from there too.

> The other part of "usable" is how long it takes for an application to
> "start". A simple fork/exec is quite fast. But that isn't a "started"
> application. A responsive system means that the time between
> the selection of the application to the time the user can enter data
> (i.e. make a menu selection/start typing) is as short as possible. The
> user's desire is about 1/4th of a second. With a large number of applications,
> this activity requires a LOT of code to be swapped in. Not something done fast.

Here the larger the horde of libraries used is and the larger
individual libraries in it are, the worse.

> One way some systems used to do this is to guarantee a MINIMUM of
> 50-100K of the application to be loaded BEFORE a context switch
> to the application is done. Of course, this assumes that all of the
> initialization code can actually FIT in the first 100K. Usually it doesn't
> because a lot of that initialization is for general runtime support and X
> library initialization. Hopefully, this is already loaded and resident by a
> pre-existing application (the window manager). Unfortunately, the WM
> initialization may have already been swapped out, and some of the X
> libraries too.
>
> The only solution for this is to not swap out at all, and have enough
> memory for everything. Which is also the first recommendation to
> improve M$ Windows performance. (got that one when a laptop
> was already maxed out "... not enough resources, why don't you
> get some more memory...")

Well, one of the worst parts is loading that horde of libraries into memory.
When you take a typical gnome application, the dynamic linker has quite
a hard time there, because it must at least locate all of them and mmap
them. And it must do that recursively for all the dependencies (fortunately
it can use the ld.so cache, where library locations are listed). With
many gnome applications, many of these libraries will never be used or
will be used for just one or two functions, only once ... but they are
all mmaped, which means opened, which means looked up.

So what could help quite a lot would be to try hard to make as many
things as possible lazy (both in dynamic linker and in initialization of
all those libraries).
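
To get a feel for how many shared objects a running process has mapped, one
can read /proc/<pid>/maps on Linux. A small sketch in Python;
mapped_libraries is a hypothetical helper, and treating any file name that
ends in ".so" or contains ".so." as a library is a simplifying assumption.

```python
import os


def mapped_libraries(maps_text):
    """Return the sorted, de-duplicated shared-object paths found in maps text.

    Each line of /proc/<pid>/maps has the form
    "address perms offset dev inode pathname"; anonymous mappings have no
    pathname and are skipped.
    """
    libs = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue                    # anonymous mapping, no backing file
        path = fields[5].strip()
        name = os.path.basename(path)
        if name.endswith(".so") or ".so." in name:
            libs.add(path)
    return sorted(libs)


# Usage on a live system (Linux only):
#     with open("/proc/self/maps") as maps:
#         print(len(mapped_libraries(maps.read())))
```

Running this against a started desktop application shows how long the list
the dynamic linker had to locate, open and mmap really is.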

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <[email protected]>

2002-10-08 03:06:19

by Scott McDermott

Subject: [OT] Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

Jan Hudec on Mon 7/10 17:34 +0200:
> Well, depends on what we want to measure. If it's at the beginning of
> main, it measures library loading time. Then argument parsing, library
> initialization, X initialization etc. can be measured. All those parts
> should be timed so we can see where most time is spent and which can
> be sped up.

newer glibc prelinking support should help here a lot, according to
published time trials I have seen with and without the feature.

2002-10-10 23:43:22

by Mike Fedyk

Subject: Re: [OT] Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))

On Mon, Oct 07, 2002 at 11:12:04PM -0400, Scott Mcdermott wrote:
> Jan Hudec on Mon 7/10 17:34 +0200:
> > Well, depends on what we want to measure. If it's at the beginning of
> > main, it measures library loading time. Then argument parsing, library
> > initialization, X initialization etc. can be measured. All those parts
> > should be timed so we can see where most time is spent and which can
> > be sped up.
>
> newer glibc prelinking support should help here a lot, according to
> published time trials I have seen with and without the feature.

Define newer.

Latest 2.2, or upcoming 3.0?