2002-01-15 03:15:01

by Klaus Meyer

Subject: highmem=system killer, 2.2.17=performance killer ?

Sorry, I'm a little confused after reading all the comments on this list
since the release of kernel 2.4.17 (without patch) concerning problems in
vm + swap.

Perhaps somebody with more kernel experience can give a hint to clarify
the situation.

I've got serious problems using 2.4.x kernels with highmem support.
It seems that I'm not the only one, but unlike most others I can't use
highmem at all, because system performance is terribly slow.

the testbed:
1) Asus CUR-DLS (Server Set LE III) with two 1 GHz Pentiums, 2 GB of RAM,
   Suse 7.3 and Redhat 7.2; kernels 2.4.7, 2.4.16 and official 2.4.17
   (without patch) in different versions
   (smp, no_smp, 4GB, 64GB highmem support)
2) Asus A7V, 900 MHz Athlon, 256 MB of RAM, Suse 7.0, kernel 2.2.16
2x DGE-500SX (Gigabit Ethernet)

my test: copy 2.6 GB of data from machine 2) to machine 1) via rcp or
nfs using the gigabit ethernet network

a) using redhat 7.2 with kernel 2.4.7 on machine 1)
   the copy process runs fine until 1.4 GB are cached (30 MB/s);
   after that the performance degrades until the system is unusable
   (very high load, network throughput < 100k/s)
b) using Suse 7.3 with kernel 2.4.16 on machine 1)
   booting into runlevel 3 is a pain, loading modules becomes very slow;
   after starting the copy process the load of machine 1) will
   rise and rise and rise ... here is a snapshot of top, taken while it
   was still possible to get one:

-----------------------------------------------------------------------------
9:17pm up 28 min, 2 users, load average: 1.42, 0.36, 0.12
36 processes: 31 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 1.2% user, 100.3% system, 0.0% nice, -1.-5% idle
CPU1 states: 5.2% user, 145.0% system, 0.0% nice, -50.-3% idle
Mem: 2061536K av, 410072K used, 1651464K free, 0K shrd, 20964K buff
Swap: 136512K av, 0K used, 136512K free 359084K cached
-----------------------------------------------------------------------------

I didn't know that it is possible to get a negative idle value ???

c) using Suse 7.3 with kernel 2.4.17 in all configurations (smp, no_smp,
   4 GB, 64 GB highmem support) on machine 1)
   the same as in b)


I just want to state that there were no oopses, so the system was always
running, but unacceptably slow; e.g. starting xosview leads to a
relatively high load:
-----------------------------------------------------------------------------
4:34am up 17 min, 1 user, load average: 0.16, 0.23, 0.26
35 processes: 33 sleeping, 1 running, 1 zombie, 0 stopped
CPU0 states: 23.0% user, 5.0% system, 0.0% nice, 70.1% idle
CPU1 states: 13.0% user, 6.0% system, 0.0% nice, 79.1% idle
Mem: 2061656K av, 37676K used, 2023980K free, 0K shrd, 7628K buff
Swap: 136512K av, 0K used, 136512K free 11876K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
755 root 11 0 1484 1484 1176 S 21.8 0.0 0:21 xosview.bin
-----------------------------------------------------------------------------




Then I reduced the available memory to 1024M using the mem= boot
parameter.
The result was amazing:

With that limit, all configurations a) b) c) were running flawlessly
without any performance loss.

The network throughput of configurations a) and b) was constantly above
30 MB/s, but using kernel 2.4.17 (configuration c) degrades the network
throughput to 15 MB/s.
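
To put these rates in perspective, here is a rough sketch of how long the
2.6 GB copy would take at each throughput quoted in this message. It
assumes decimal units (1 MB = 10^6 bytes) and reads "< 100k/s" as an upper
bound of 100 kB/s; the function and constant names are purely illustrative.

```python
MB = 10**6  # assumption: decimal megabytes
KB = 10**3  # assumption: "100k/s" is read as 100 * 10**3 bytes per second


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time needed to move size_bytes at a constant rate, in seconds."""
    return size_bytes / rate_bytes_per_s


size = 2600 * MB  # the 2.6 GB test data set

rates = [("30 MB/s", 30 * MB), ("15 MB/s", 15 * MB), ("100 kB/s", 100 * KB)]
for label, rate in rates:
    secs = transfer_seconds(size, rate)
    print(f"{label:>9}: {secs:8.0f} s ({secs / 3600:.2f} h)")
```

Under these assumptions the copy needs roughly a minute and a half at
30 MB/s, about twice that at 15 MB/s, and more than seven hours once the
throughput has collapsed to 100 kB/s.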

all in all there are two questions left:
a) What can I do to use highmem support (which patches do I have to
apply ?)
b) Have other users experienced network performance degradation using
2.2.17 instead of 2.2.16 ?


Any hints ?

thanx in advance

Klaus


2002-01-15 14:42:54

by Randy Hron

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

> i've got serious problems using 2.4.x kernels using highmem support.
> It seems to me that i'm not the only one, but the difference to most
> other ones is,
> that i can't use highmem because the system performance is terrible
> slow.

Klaus,

Have you tried 2.4.18pre2aa2 from:
ftp://ftp.de.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre2aa2.bz2

With this patch applied:
http://marc.theaimsgroup.com/?l=linux-kernel&m=101110373911359&w=2

I get better performance with it, but I've only used it on
a machine with 1GB ram.

--
Randy Hron

2002-01-15 15:00:44

by Stephan von Krawczynski

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

On Tue, 15 Jan 2002 04:13:49 +0100
Klaus Meyer <[email protected]> wrote:

> i've got serious problems using 2.4.x kernels using highmem support.
> It seems to me that i'm not the only one, but the difference to most
> other ones is,
> that i can't use highmem because the system performance is terrible
> slow.
>
> the testbed:
> 1) Asus CUR-DLS (Server Set LE III) with two 1Ghz Pentiums, 2GB of ram

Interestingly I have about the same setup and use, only I transfer about 25 GB
a day via nfs to an Asus CUV4XD with 2 GB under 2.4.18-pre3 and do not
experience any problem so far. I haven't had any with 2.4.17 either. Cache is
pretty heavily used, but I experience no slowdown or other weird things. Can this
be somehow chipset related? Maybe something about the DGE cards? I am using TP
100MBit tulip-based.

Regards,
Stephan


2002-01-15 16:43:44

by Klaus Meyer

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

Stephan von Krawczynski wrote:
>
> On Tue, 15 Jan 2002 04:13:49 +0100
> Klaus Meyer <[email protected]> wrote:
>
> > i've got serious problems using 2.4.x kernels using highmem support.
> > It seems to me that i'm not the only one, but the difference to most
> > other ones is,
> > that i can't use highmem because the system performance is terrible
> > slow.
> >
> > the testbed:
> > 1) Asus CUR-DLS (Server Set LE III) with two 1Ghz Pentiums, 2GB of ram
>
> Interestingly I have about the same setup and use, only I transfer about 25 GB
> a day via nfs to an Asus CUV4XD with 2 GB under 2.4.18-pre3 and do not
> experience any problem so far. I haven't had any with 2.4.17, too. Cache is
> pretty heavy used, but I experience no slowdown or other weird things. Can this
> be somehow chipset related? Maybe something about the DGE cards? I am using TP
> 100MBit tulip-based.
>

I don't think the network driver is causing the problems, because the
throughput is very nice if I limit the memory to 1GB by hand.
If files are in the cache I even get a throughput of nearly 60 MB/s
(using udp) !
(but sorry, not with kernel 2.2.17 => network throughput decreases
significantly)
The whole system runs quite stable and pretty fast using only 1GB of mem.
Perhaps somebody can explain what happens if I have a kernel with highmem
support (4GB or 64GB) compiled in, but use only 1GB of the physical 2GB?
Does the kernel know how to use highmem in this case ?
It seems that only a small amount of highmem is used in this case:

cat /proc/meminfo reads:
HighTotal: 131072 kB
HighFree: 115628 kB
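
A minimal sketch of how the highmem figures above could be pulled out of
/proc/meminfo-style text, assuming the "Name: value kB" line layout shown
here. The helper names are hypothetical, and the sample string simply
repeats the two values quoted above rather than reading the real file.

```python
def parse_meminfo(text):
    """Map field name to size in kB for every 'Name: <number> kB' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        parts = rest.split()
        # Only accept lines of the form '<digits> kB'; anything else is skipped.
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def highmem_in_use_kb(fields):
    """High memory currently in use, in kB (HighTotal minus HighFree)."""
    return fields["HighTotal"] - fields["HighFree"]


sample = "HighTotal: 131072 kB\nHighFree: 115628 kB\n"
info = parse_meminfo(sample)
print(f"highmem in use: {highmem_in_use_kb(info)} kB of {info['HighTotal']} kB")
```

On a live system the same functions could be fed the contents of
/proc/meminfo instead of the sample string.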

As I just took a look at the output of cat /proc/meminfo I got the idea
to increase the physical swap space (136M before, that means > highmem).
Astonishing (using Suse kernel 2.4.16): after an increase to 2GB swap and
using 1.5GB of mem the system runs for quite a bit longer with good
performance, but starting the copy process still leads to a slowdown of
the machine.
Finally I could see that kupdated is suffering.

-----------------------------------------------------------------------------
5:56pm up 2 min, 1 user, load average: 2.97, 1.02, 0.36
34 processes: 29 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 2.5% user, 96.0% system, 0.0% nice, 1.0% idle
CPU1 states: 11.4% user, 95.0% system, 0.0% nice, -6.-5% idle
Mem: 1545456K av, 452480K used, 1092976K free, 0K shrd, 19708K buff
Swap: 2097136K av, 0K used, 2097136K free 400732K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
7 root 15 0 0 0 0 RW 81.2 0.0 0:44 kupdated
------------------------------------------------------------------------------

Once kupdated had finished its work, the system was quite usable again
and returned to much better performance.

Using the available 2 GB of RAM led to the same effect.

So what's the relation between physical swap space, highmem and physical
memory (and the chipset) ?

Testing this configuration with the official kernel 2.4.17 falls back to
the known slowdown.

It seems Suse has applied some patches or backports ?!?

regards, Klaus

2002-01-15 16:49:54

by Stephan von Krawczynski

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

On Tue, 15 Jan 2002 17:42:36 +0100
Klaus Meyer <[email protected]> wrote:

> As I just took a look on the output of cat /proc/meminfo i got the idea
> that i'll increase the pysical swap space. (136M before that means >
> highmem).
> astonishing (using Suse kernel 2.4.16): after an increase to 2GB swap
> and
> using 1,5GB of mem the system runs quit a longer time with a good
> performance,
> but starting the copy process leads also to a slow down of the machine.
> Finally i could see that kupdated is suffering.

I was already tempted to suggest you turn off swap completely, as 136 MB in a 2
GB box are somehow useless anyways. I know, I have the same setup (256MB swap).
As this can be done without a reboot, are you willing to give it a try? Anyway I
would very much suggest using -pre3.

Regards,
Stephan



2002-01-18 02:05:43

by Klaus Meyer

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

Hi all,

thank you very much for your help and effort.
I've now located the real reason for all my problems.

It was just a bad memory module. Believe me, I'd tested them carefully
before.
But I had to learn that even ECC modules installed in brand-name
motherboards don't tell you that they are not working correctly.

After trying all new kernel versions and patches I was really desperate.
All the discussions on the LKML concerning the new vm + swap frustrated
me, since I was really thinking that the stable kernel tree is not really
stable.
I've been a user of linux since 0.99pl2, so this would have been a new
experience for me.

All in all I'm still amazed that the system was working at all.
So finding the real error was just inspiration and luck.

Thanx for all hints

Klaus

Stephan von Krawczynski wrote:
>
> On Tue, 15 Jan 2002 17:42:36 +0100
> Klaus Meyer <[email protected]> wrote:
>
> > As I just took a look on the output of cat /proc/meminfo i got the idea
> > that i'll increase the pysical swap space. (136M before that means >
> > highmem).
> > astonishing (using Suse kernel 2.4.16): after an increase to 2GB swap
> > and
> > using 1,5GB of mem the system runs quit a longer time with a good
> > performance,
> > but starting the copy process leads also to a slow down of the machine.
> > Finally i could see that kupdated is suffering.
>
> I was already tempted to suggest you turn off swap completely, as 136 MB in a 2
> GB box are somehow useless anyways. I know, I have the same setup (256MB swap).
> As this could work without boot, willing to give it a try? Anyway I would very
> much suggest to use -pre3.
>
> Regards,
> Stephan

2002-01-18 04:56:52

by Bill Davidsen

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

On Fri, 18 Jan 2002, Klaus Meyer wrote:

> It was just a bad memory modul. Believe me, i'd tested them before
> carefully.
> But i had to learn that even ECC-modules installed in brand motherboards
> dont tell you that they are not working correctly.

I wonder if your BIOS is doing the right thing setting the ECC config?
That should have been reported.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-01-18 14:38:36

by Klaus Meyer

Subject: Re: highmem=system killer, 2.2.17=performance killer ?

Bill Davidsen wrote:
>
> On Fri, 18 Jan 2002, Klaus Meyer wrote:
>
> > It was just a bad memory modul. Believe me, i'd tested them before
> > carefully.
> > But i had to learn that even ECC-modules installed in brand motherboards
> > dont tell you that they are not working correctly.
>
> I wonder if your BIOS is doing the right thing setting the ECC config?
> That should have been reported.
>

Yes you are right. That _should_ have been reported.
I'm using the ASUS CUR-DLS with the ServerSet LE III chipset.
I was honestly convinced that such a motherboard is constructed for
server usage and stability. Perhaps there is something wrong with my
board.
Since you have to use ECC modules in this motherboard there is
no way to configure a special ECC config mode in the bios.

The bad module (1 GB) was installed in bank 1 beside another GB module
in bank 0.
The board reported 2GB of installed ram.
I put it then in bank 0 as the only module installed.
The bios then reported 16M as mem, which is wrong of course, but: no
error.
It's really funny since all memory tests with 16M were successful (using
memtest).
After that the board always reported 16M, regardless of the bank
position, _without any error_.

I really don't know whether I should laugh or cry about that ;)
All in all the system is now working stably with other modules (2x1GB)
installed.

regards
Klaus