Date: Fri, 1 Feb 2002 12:52:27 +0100 (CET)
From: Roy Sigurd Karlsbakk <roy@karlsbakk.net>
To: James Washer <washer@us.ibm.com>
cc: <linux-kernel@vger.kernel.org>
Subject: Re: Errors in the VM - detailed
In-Reply-To: <OF675D993F.933C6CB9-ON88256B52.00595CCC@boulder.ibm.com>
Message-ID: <Pine.LNX.4.30.0201311730240.14577-100000@mustard.heime.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org

> I'm looking into this a bit.. One question, you seem to have 200 processes
> waiting on IO, That's interesting since it is exactly double your 100
> readers.. Any idea what those other 100 are? btw, would you mind repeating
> this with a tiny little c program that just reads the data, and doesn't
> write the data out anywhere??

sorry... I've been doing testing with both 200 and 100 processes. The
numbers on the vmstat was from another run

>
> let me know
>
>  - jim
>
> Roy Sigurd Karlsbakk <roy@karlsbakk.net>@vger.kernel.org on 01/31/2002
> 07:05:12 AM
>
> Sent by:    linux-kernel-owner@vger.kernel.org
>
>
> To:    <linux-kernel@vger.kernel.org>
> cc:
> Subject:    Errors in the VM - detailed
>
>
>
> hi all
>
> The last month or so, I've been trying to make a particular configuration
> work
> with Linux-2.4.17 and other 2.4.x kernels. Two major bugs have been
> blocking
> my way into the light. Below follows a detailed description on both bugs.
> One
> of them seems to be solved in the latests -rmap patches. The other is still
> unsolved.
>
> CONFIGURATION INTRO
>
> The test has been performed on two equally configured computers, giving the
> same results, telling the chance of hardware failure is rather small.
>
> Config:
>
> 1xAthlon 1133
> 2x512MB (1GB) SRAM
> Asus A7S-VM motherboard with
>  Realtek 10/100Mbps card
>  ATA100
>  Sound+VGA+USB+other crap
> 1xPromise ATA133 controller
> 2xWDC 120GB drives (with ATA100 cabeling connected to Promise controller)
> 1xWDC 20GB drive (connected to motherboard - configured as the boot device)
> 1xIntel desktop gigE card (e1000 driver - modular)
>
> Server is configured with console on serial port
> Highmem is disabled
> The two 120GB drives is configured in RAID-0 with chunk size [256|512|1024]
> I have tried several different file systems - same error
>
> Versions tested:
>
> Linux-2.4.1([3-7]|8-pre.) tested. All buggy. Bug #1 was fixed in -rmap11c
>
> TEST SETUP
>
> Reading 100 500MB files with dd, tux, apache, cat, something, and
> redirecting
> the output to /dev/null. With tux/apache, I used another computer using
> wget
> to retrieve the same amount of data.
>
> The test scripts look something like this
>
> #!/bin/bash
> dd if=file0000 of=/dev/null &
> dd if=file0001 of=/dev/null &
> dd if=file0002 of=/dev/null &
> dd if=file0003 of=/dev/null &
> ...
> dd if=file0099 of=/dev/null &
>
> or similar - just with wget -O /dev/null ... &
>
> BUGS
>
> Bug #1:
>
> When (RAMx2) bytes has been read from disk, I/O as reported from vmstat
> drops
> to a mere 1MB/s
>
> When reading starts, the speed is initially high. Then, slowly, the speed
> decreases until it goes to something close to a complete halt (see output
> from
> vmstat below).
>
> # vmstat 2
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy
>  id
>  0 200  1   1676   3200   3012 786004   0 292 42034   298  791   745   4
>  29 67
>  0 200  1   1676   3308   3136 785760   0   0 44304     0  748   758   3
>  15 82
>  0 200  1   1676   3296   3232 785676   0   0 44236     0  756   710   2
>  23 75
>  0 200  1   1676   3304   3356 785548   0   0 38662    70  778   791   3
>  19 78
>  0 200  1   1676   3200   3456 785552   0   0 33536     0  693   594   3
>  13 84
>  1 200  0   1676   3224   3528 785192   0   0 35330    24  794   712   3
>  16 81
>  0 200  0   1676   3304   3736 784324   0   0 30524    74  725   793  12
>  14 74
>  0 200  0   1676   3256   3796 783664   0   0 29984     0  718   826   4
>  10 86
>  0 200  0   1676   3288   3868 783592   0   0 25540   152  763   812   3
>  17 80
>  0 200  0   1676   3276   3908 783472   0   0 22820     0  693   731   0
>  7 92
>  0 200  0   1676   3200   3964 783540   0   0 23312     6  759   827   4
>  11 85
>  0 200  0   1676   3308   3984 783452   0   0 17506     0  687   697   0
>  11 89
>  0 200  0   1676   3388   4012 783888   0   0 14512     0  671   638   1
>  5 93
>  0 200  0   2188   3208   4048 784156   0 512 16104   548  707   833   2
>  10 88
>  0 200  0   3468   3204   4048 784788   0  66  8220    66  628   662   0
>  3 96
>  0 200  0   3468   3296   4060 784680   0   0  1036     6  687   714   1
>  6 93
>  0 200  0   3468   3316   4060 784668   0   0  1018     0  613   631   1
>  2 97
>  0 200  0   3468   3292   4060 784688   0   0  1034     0  617   638   0
>  3 97
>  0 200  0   3468   3200   4068 784772   0   0  1066     6  694   727   2
>  4 94
>
> Bug #2:
>
> Doing the same test on Rik's -rmap(.*) somehow fixes Bug #1, and makes room
> for another bug to come out.
>
> Doing the same test, I can, with -rmap, get some 33-35MB/s sustained from
> /dev/md0 to memory. This is all good, but when doing this test, only 40 of
> the
> original processes ever finish. The same error occurs both locally (dd) and
> remotely (tux). If new i/o requests is issued to the same device, they
> don't
> hang. If tux is restarted, it works fine afterwards.
>
> Please - anyone - help me with this. I've been trying to setup this system
> for
> almost two months now, fighting various bugs.
>
> Best regards
>
> roy
>
> --
> Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA
>
> Computers are like air conditioners.
> They stop working when you open Windows.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
>
>

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/