2002-01-31 15:05:33

by Roy Sigurd Karlsbakk

Subject: Errors in the VM - detailed

hi all

The last month or so, I've been trying to make a particular configuration work
with Linux-2.4.17 and other 2.4.x kernels. Two major bugs have been blocking
my way into the light. Below follows a detailed description of both bugs. One
of them seems to be solved in the latest -rmap patches. The other is still
unsolved.

CONFIGURATION INTRO

The test has been performed on two equally configured computers, giving the
same results, suggesting that the chance of hardware failure is rather small.

Config:

1xAthlon 1133
2x512MB (1GB) SRAM
Asus A7S-VM motherboard with
Realtek 10/100Mbps card
ATA100
Sound+VGA+USB+other crap
1xPromise ATA133 controller
2xWDC 120GB drives (with ATA100 cabling, connected to Promise controller)
1xWDC 20GB drive (connected to motherboard - configured as the boot device)
1xIntel desktop gigE card (e1000 driver - modular)

Server is configured with console on serial port
Highmem is disabled
The two 120GB drives are configured in RAID-0 with chunk size [256|512|1024]
I have tried several different file systems - same error

Versions tested:

Linux-2.4.1([3-7]|8-pre.) tested. All buggy. Bug #1 was fixed in -rmap11c

TEST SETUP

Reading 100 500MB files with dd, tux, apache, cat, something, and redirecting
the output to /dev/null. With tux/apache, I used another computer using wget
to retrieve the same amount of data.

The test scripts look something like this

#!/bin/bash
dd if=file0000 of=/dev/null &
dd if=file0001 of=/dev/null &
dd if=file0002 of=/dev/null &
dd if=file0003 of=/dev/null &
...
dd if=file0099 of=/dev/null &

or similar - just with wget -O /dev/null ... &

BUGS

Bug #1:

When (RAMx2) bytes have been read from disk, I/O as reported by vmstat drops
to a mere 1MB/s

When reading starts, the speed is initially high. Then, slowly, the speed
decreases until it goes to something close to a complete halt (see output from
vmstat below).

# vmstat 2
r b w swpd free buff cache si so bi bo in cs us sy id
0 200 1 1676 3200 3012 786004 0 292 42034 298 791 745 4 29 67
0 200 1 1676 3308 3136 785760 0 0 44304 0 748 758 3 15 82
0 200 1 1676 3296 3232 785676 0 0 44236 0 756 710 2 23 75
0 200 1 1676 3304 3356 785548 0 0 38662 70 778 791 3 19 78
0 200 1 1676 3200 3456 785552 0 0 33536 0 693 594 3 13 84
1 200 0 1676 3224 3528 785192 0 0 35330 24 794 712 3 16 81
0 200 0 1676 3304 3736 784324 0 0 30524 74 725 793 12 14 74
0 200 0 1676 3256 3796 783664 0 0 29984 0 718 826 4 10 86
0 200 0 1676 3288 3868 783592 0 0 25540 152 763 812 3 17 80
0 200 0 1676 3276 3908 783472 0 0 22820 0 693 731 0 7 92
0 200 0 1676 3200 3964 783540 0 0 23312 6 759 827 4 11 85
0 200 0 1676 3308 3984 783452 0 0 17506 0 687 697 0 11 89
0 200 0 1676 3388 4012 783888 0 0 14512 0 671 638 1 5 93
0 200 0 2188 3208 4048 784156 0 512 16104 548 707 833 2 10 88
0 200 0 3468 3204 4048 784788 0 66 8220 66 628 662 0 3 96
0 200 0 3468 3296 4060 784680 0 0 1036 6 687 714 1 6 93
0 200 0 3468 3316 4060 784668 0 0 1018 0 613 631 1 2 97
0 200 0 3468 3292 4060 784688 0 0 1034 0 617 638 0 3 97
0 200 0 3468 3200 4068 784772 0 0 1066 6 694 727 2 4 94

Bug #2:

Doing the same test on Rik's -rmap(.*) somehow fixes Bug #1, and makes room
for another bug to come out.

Doing the same test, I can, with -rmap, get some 33-35MB/s sustained from
/dev/md0 to memory. This is all good, but when doing this test, only 40 of the
original processes ever finish. The same error occurs both locally (dd) and
remotely (tux). If new i/o requests are issued to the same device, they don't
hang. If tux is restarted, it works fine afterwards.

Please - anyone - help me with this. I've been trying to set up this system for
almost two months now, fighting various bugs.

Best regards

roy

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.


2002-01-31 15:44:57

by David Mansfield

Subject: Re: Errors in the VM - detailed

On Thu, 31 Jan 2002, Roy Sigurd Karlsbakk wrote:

> hi all
>
> The last month or so, I've been trying to make a particular configuration work
> with Linux-2.4.17 and other 2.4.x kernels. Two major bugs have been blocking
> my way into the light. Below follows a detailed description of both bugs. One
> of them seems to be solved in the latest -rmap patches. The other is still
> unsolved.
>
> CONFIGURATION INTRO
>

My config:

Athlon 1400MHz, 512MB RAM, single Seagate ST360020A 60GB ATA100 drive. I am
running the 2.4.17rc2aa2 kernel, which many on the list (and I'll second
this) consider an excellent kernel. I noticed you haven't
tried the -aa kernels. You should.

I'm *not* running sw raid however, and this may be the significant factor.
Have you tested your drives singly (without the raid)?


I created 100 100mb files (I don't have enough free space to do anything
else) using dd if=/dev/zero of=file???. I did this sequentially. Then I
wrote a second script to use dd if=file??? of=/dev/null & and started 100
readers in parallel. There were no stalls in the read from beginning to
end; my system maintained about a 6-8MB/s xfer rate throughout the test.
That's about what I would expect for 100 simultaneous readers.


In your tests, are you sure that you are synchronising the starting of
your reader processes? Maybe you are seeing the first readers getting
started first (and you have fewer seeks ruining your I/O bandwidth) and
then as they get going, the additional seeks ruin everything. I honestly
think this is unlikely, since your I/O level does drop to a disgustingly
low level.

Hope this helps.


David


--
/==============================\
| David Mansfield |
| [email protected] |
\==============================/

2002-01-31 20:25:02

by Roger Larsson

Subject: Re: Errors in the VM - detailed

On Thursday 31 January 2002 16.05, Roy Sigurd Karlsbakk wrote:
> hi all
> - - -
> Versions tested:
>
> Linux-2.4.1([3-7]|8-pre.) tested. All buggy. Bug #1 was fixed in -rmap11c
>
> TEST SETUP
>
> Reading 100 500MB files with dd, tux, apache, cat, something, and
> redirecting the output to /dev/null. With tux/apache, I used another
> computer using wget to retrieve the same amount of data.
>
> The test scripts look something like this
>
> #!/bin/bash
> dd if=file0000 of=/dev/null &
> dd if=file0001 of=/dev/null &
> dd if=file0002 of=/dev/null &
> dd if=file0003 of=/dev/null &
> ...
> dd if=file0099 of=/dev/null &
>
> or similar - just with wget -O /dev/null ... &
>
> BUGS
>
> Bug #1:
>
> When (RAMx2) bytes have been read from disk, I/O as reported by vmstat
> drops to a mere 1MB/s
>
> When reading starts, the speed is initially high. Then, slowly, the speed
> decreases until it goes to something close to a complete halt (see output
> from vmstat below).
>

Wait a minute - it might be readahead that gets killed.
If I remember correctly, READA requests are dropped when failing to allocate
space for it - yes I did...

/usr/src/develop/linux/drivers/block/ll_rw_block.c:746 (earlier kernel)

	/*
	 * Grab a free request from the freelist - if that is empty, check
	 * if we are doing read ahead and abort instead of blocking for
	 * a free slot.
	 */
get_rq:
	if (freereq) {
		req = freereq;
		freereq = NULL;
	} else if ((req = get_request(q, rw)) == NULL) {
		spin_unlock_irq(&io_request_lock);
		if (rw_ahead)
			goto end_io;

		freereq = __get_request_wait(q, rw);
		goto again;
	}

Suppose get_request fails and the request is a rw_ahead: it quits...
=> no read ahead.

Try to add a printk there...
	if (rw_ahead) {
		printk("Skipping readahead...\n");
		goto end_io;
	}

Can it be the problem???

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2002-01-31 20:30:11

by Jens Axboe

Subject: Re: Errors in the VM - detailed

On Thu, Jan 31 2002, Roger Larsson wrote:
> Wait a minute - it might be readahead that gets killed.
> If I remember correctly, READA requests are dropped when failing to allocate
> space for it - yes I did...

s/allocate/retrieve

No allocation takes place.

> /usr/src/develop/linux/drivers/block/ll_rw_block.c:746 (earlier kernel)
>
> 	/*
> 	 * Grab a free request from the freelist - if that is empty, check
> 	 * if we are doing read ahead and abort instead of blocking for
> 	 * a free slot.
> 	 */
> get_rq:
> 	if (freereq) {
> 		req = freereq;
> 		freereq = NULL;
> 	} else if ((req = get_request(q, rw)) == NULL) {
> 		spin_unlock_irq(&io_request_lock);
> 		if (rw_ahead)
> 			goto end_io;
>
> 		freereq = __get_request_wait(q, rw);
> 		goto again;
> 	}
>
> Suppose get_request fails and the request is a rw_ahead: it quits...
> => no read ahead.
>
> Try to add a printk there...
> 	if (rw_ahead) {
> 		printk("Skipping readahead...\n");
> 		goto end_io;
> 	}

That will trigger _all the time_ even on a moderately busy machine.
Checking if tossing away read-ahead is the issue is probably better
tested with just increasing the request slots. Roy, please try and change
the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
something like 2048.
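
For reference, the stock assignment (as quoted later in this thread) and the
suggested bump would look roughly like this - a sketch, with 2048 substituted
for the stock value:

	queue_nr_requests = 64;
	if (total_ram > MB(32))
		queue_nr_requests = 2048;	/* stock 2.4 has 128 here */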

--
Jens Axboe

2002-01-31 20:50:52

by Andrew Morton

Subject: Re: Errors in the VM - detailed

Jens Axboe wrote:
>
> That will trigger _all the time_ even on a moderately busy machine.
> Checking if tossing away read-ahead is the issue is probably better
> tested with just increasing the request slots. Roy, please try and change
> the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
> something like 2048.
>

heh. Yep, Roger finally nailed it, I think.

Roy says the bug was fixed in rmap11c. Changelog says:


rmap 11c:
...
- elevator improvement (Andrew Morton)

Which includes:

- queue_nr_requests = 64;
- if (total_ram > MB(32))
- 	queue_nr_requests = 128;
+ queue_nr_requests = (total_ram >> 9) & ~15;	/* One per half-megabyte */
+ if (queue_nr_requests < 32)
+ 	queue_nr_requests = 32;
+ if (queue_nr_requests > 1024)
+ 	queue_nr_requests = 1024;


So Roy is running with 1024 requests.

The question is (sorry, Roy): does this need fixing?

The only thing which can trigger it is when we have
zillions of threads doing reads (or zillions of outstanding
aio read requests) or when there are a large number of
unmerged write requests in the elevator. It's a rare
case.

If we _do_ need a fix, then perhaps we should just stop
using READA in the readahead code? readahead is absolutely
vital to throughput, and best-effort request allocation
just isn't good enough.
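
For illustration only, dropping READA would amount to submitting readahead
I/O as ordinary blocking reads at the call sites, something like this
hypothetical two-liner (bh being an already-prepared buffer head; not a
patch from this thread):

	- ll_rw_block(READA, 1, &bh);	/* best-effort: silently dropped when no request is free */
	+ ll_rw_block(READ, 1, &bh);	/* waits for a free request instead */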

Thoughts?

-

2002-01-31 21:38:43

by Jens Axboe

Subject: Re: Errors in the VM - detailed

On Thu, Jan 31 2002, Andrew Morton wrote:
> rmap 11c:
> ...
> - elevator improvement (Andrew Morton)
>
> Which includes:
>
> - queue_nr_requests = 64;
> - if (total_ram > MB(32))
> - 	queue_nr_requests = 128;
> + queue_nr_requests = (total_ram >> 9) & ~15;	/* One per half-megabyte */
> + if (queue_nr_requests < 32)
> + 	queue_nr_requests = 32;
> + if (queue_nr_requests > 1024)
> + 	queue_nr_requests = 1024;
>
>
> So Roy is running with 1024 requests.

Ah yes, of course.

> The question is (sorry, Roy): does this need fixing?
>
> The only thing which can trigger it is when we have
> zillions of threads doing reads (or zillions of outstanding
> aio read requests) or when there are a large number of
> unmerged write requests in the elevator. It's a rare
> case.

Indeed.

> If we _do_ need a fix, then perhaps we should just stop
> using READA in the readahead code? readahead is absolutely
> vital to throughput, and best-effort request allocation
> just isn't good enough.

Hmm well. Maybe just a small pool of requests set aside for READA would
be a better idea. That way "normal" reads are not able to starve READA
completely.

Something ala this, completely untested. Will try and boot it now :-)
Roy, could you please test? It's against 2.4.18-pre7, I'll boot it now
as well...

--- /opt/kernel/linux-2.4.18-pre7/include/linux/blkdev.h	Mon Nov 26 14:29:17 2001
+++ linux/include/linux/blkdev.h	Thu Jan 31 22:29:01 2002
@@ -74,9 +74,9 @@
 struct request_queue
 {
 	/*
-	 * the queue request freelist, one for reads and one for writes
+	 * the queue request freelist, one for READ, WRITE, and READA
 	 */
-	struct request_list rq[2];
+	struct request_list rq[3];

 	/*
 	 * Together with queue_head for cacheline sharing
--- /opt/kernel/linux-2.4.18-pre7/drivers/block/ll_rw_blk.c	Sun Jan 27 16:06:31 2002
+++ linux/drivers/block/ll_rw_blk.c	Thu Jan 31 22:36:24 2002
@@ -333,8 +333,10 @@

 	INIT_LIST_HEAD(&q->rq[READ].free);
 	INIT_LIST_HEAD(&q->rq[WRITE].free);
+	INIT_LIST_HEAD(&q->rq[READA].free);
 	q->rq[READ].count = 0;
 	q->rq[WRITE].count = 0;
+	q->rq[READA].count = 0;

 	/*
 	 * Divide requests in half between read and write
@@ -352,6 +354,20 @@
 		q->rq[i&1].count++;
 	}

+	for (i = 0; i < queue_nr_requests / 4; i++) {
+		rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
+		/*
+		 * hey well, this needs better checking (as well as the above)
+		 */
+		if (!rq)
+			break;
+
+		memset(rq, 0, sizeof(struct request));
+		rq->rq_status = RQ_INACTIVE;
+		list_add(&rq->queue, &q->rq[READA].free);
+		q->rq[READA].count++;
+	}
+
 	init_waitqueue_head(&q->wait_for_request);
 	spin_lock_init(&q->queue_lock);
 }
@@ -752,12 +768,18 @@
 		req = freereq;
 		freereq = NULL;
 	} else if ((req = get_request(q, rw)) == NULL) {
-		spin_unlock_irq(&io_request_lock);
+
 		if (rw_ahead)
-			goto end_io;
+			req = get_request(q, READA);

-		freereq = __get_request_wait(q, rw);
-		goto again;
+		spin_unlock_irq(&io_request_lock);
+
+		if (!req && rw_ahead)
+			goto end_io;
+		else if (!req) {
+			freereq = __get_request_wait(q, rw);
+			goto again;
+		}
 	}

 	/* fill up the request-info, and add it to the queue */
@@ -1119,7 +1141,7 @@
 	 */
 	queue_nr_requests = 64;
 	if (total_ram > MB(32))
-		queue_nr_requests = 128;
+		queue_nr_requests = 256;

 	/*
 	 * Batch frees according to queue length

--
Jens Axboe

2002-02-01 11:52:55

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> I'm looking into this a bit.. One question: you seem to have 200 processes
> waiting on IO. That's interesting, since it is exactly double your 100
> readers. Any idea what those other 100 are? btw, would you mind repeating
> this with a tiny little C program that just reads the data, and doesn't
> write the data out anywhere?

sorry... I've been doing testing with both 200 and 100 processes. The
numbers in that vmstat output were from another run.
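
For what it's worth, a minimal reader along the lines jim asks for could look
like the sketch below (not a program from this thread; it reads sequentially
and simply discards the data):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char buf[65536];
	ssize_t n;
	int fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* read sequentially; the data is discarded, nothing is written out */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;
	close(fd);
	return 0;
}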

>
> let me know
>
> - jim
>
> Roy Sigurd Karlsbakk <[email protected]>@vger.kernel.org on 01/31/2002
> 07:05:12 AM
>
> Sent by: [email protected]
>
>
> To: <[email protected]>
> cc:
> Subject: Errors in the VM - detailed
>
>
>
> [snip: original message quoted in full]

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-01 13:42:51

by Denis Vlasenko

Subject: Re: Errors in the VM - detailed

On 31 January 2002 13:05, Roy Sigurd Karlsbakk wrote:
> The last month or so, I've been trying to make a particular configuration
> work with Linux-2.4.17 and other 2.4.x kernels. Two major bugs have been
> blocking my way into the light. Below follows a detailed description of
> both bugs. One of them seems to be solved in the latest -rmap patches. The
> other is still unsolved.

I've seen your posts. Can't help you directly, but:

> The two 120GB drives are configured in RAID-0 with chunk size [256|512|1024]

Do the bugs bite you with plain partitions (no RAID)? Maybe it's a RAID bug?

> When (RAMx2) bytes have been read from disk, I/O as reported by vmstat
> drops to a mere 1MB/s
> When reading starts, the speed is initially high. Then, slowly, the speed
> decreases until it goes to something close to a complete halt (see output
> from vmstat below).

Can you run oom_trigger at this point and watch what happens?
It will force most (if not all) of the page cache to be flushed; speed might
increase. This is not a solution, just a way to get additional info on the
bug's behavior. I've got a little patch which improves (read: fixes) cache
flush behavior, attached below. BTW, did you try the -aa kernels?

> Bug #2:
>
> Doing the same test on Rik's -rmap(.*) somehow fixes Bug #1, and makes room
> for another bug to come out.
>
> Doing the same test, I can, with -rmap, get some 33-35MB/s sustained from
> /dev/md0 to memory. This is all good, but when doing this test, only 40 of
> the original processes ever finish. The same error occurs both locally (dd)
> and remotely (tux). If new i/o requests are issued to the same device, they
> don't hang. If tux is restarted, it works fine afterwards.

After they hang, take an Alt-SysRq-T trace, ksymoops it and send it to Rik and
LKML. CC'ing Andrea won't hurt, I think.
--
vda

oom_trigger.c
=============
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
	void *p;
	unsigned size = 1 << 20;
	unsigned long total = 0;
	while (size) {
		p = malloc(size);
		if (!p)
			size >>= 1;	/* allocation failed: retry with half the size */
		else {
			memset(p, 0x77, size);	/* touch the memory so it is really committed */
			total += size;
			printf("Allocated %9u bytes, %12lu total\n", size, total);
		}
	}
	return 0;
}
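
(To use it: compile with gcc, run it while the transfer is stalled, and watch
whether the 'bi' column in vmstat recovers once the page cache has been
forced out.)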


vmscan.patch.2.4.17.d (author: "M.H.VanLeeuwen" <[email protected]>)
====================================================================
--- linux.virgin/mm/vmscan.c	Mon Dec 31 12:46:25 2001
+++ linux/mm/vmscan.c	Fri Jan 11 18:03:05 2002
@@ -394,9 +394,9 @@
 	if (PageDirty(page) && is_page_cache_freeable(page) && page->mapping) {
 		/*
 		 * It is not critical here to write it only if
-		 * the page is unmapped beause any direct writer
+		 * the page is unmapped because any direct writer
 		 * like O_DIRECT would set the PG_dirty bitflag
-		 * on the phisical page after having successfully
+		 * on the physical page after having successfully
 		 * pinned it and after the I/O to the page is finished,
 		 * so the direct writes to the page cannot get lost.
 		 */
@@ -480,11 +480,14 @@

 		/*
 		 * Alert! We've found too many mapped pages on the
-		 * inactive list, so we start swapping out now!
+		 * inactive list.
+		 * Move referenced pages to the active list.
 		 */
-		spin_unlock(&pagemap_lru_lock);
-		swap_out(priority, gfp_mask, classzone);
-		return nr_pages;
+		if (PageReferenced(page) && !PageLocked(page)) {
+			del_page_from_inactive_list(page);
+			add_page_to_active_list(page);
+		}
+		continue;
 	}

 	/*
@@ -521,6 +524,9 @@
 	}
 	spin_unlock(&pagemap_lru_lock);

+	if (max_mapped <= 0 && (nr_pages > 0 || priority < DEF_PRIORITY))
+		swap_out(priority, gfp_mask, classzone);
+
 	return nr_pages;
 }


2002-02-01 16:05:53

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> Something ala this, completely untested. Will try and boot it now :-)
> Roy, could you please test? It's against 2.4.18-pre7, I'll boot it now
> as well...

Still problems after installing the patch. No change at all. The patch was
applied against 2.4.17-rmap12a+ide+tux.

Testing with Apache2 now (apache2 uses mmap instead of sendfile(), as
far as I can see) ... these tests take some time

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.




2002-02-01 16:12:13

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

It does not seem to be possible to reproduce the error with apache2. But
this may be because Apache2's i/o handling doesn't impress much. With Tux,
I keep getting up to 40 megs per sec, but with Apache the average is
~15MB/s.

Btw ... It looks like your patch (against rmap12a) gave me an extra
performance kick. 12c gave me a max of ~32MB/s, whereas your patch
raised this to ~41.

thanks

roy

On Thu, 31 Jan 2002, Jens Axboe wrote:

> [snip: Jens's patch mail quoted in full]

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-01 18:48:31

by Roger Larsson

Subject: Re: Errors in the VM - detailed

On Friday 1 February 2002 17.11, Roy Sigurd Karlsbakk wrote:
> It does not seem to be possible to reproduce the error with apache2. But
> this may be because Apache2's i/o handling doesn't impress much. With Tux,
> I keep getting up to 40 megs per sec, but with Apache the average is
> ~15MB/s.
>
> Btw ... It looks like your patch (against rmap12a) gave me an extra
> performance kick. 12c gave me a max of ~32MB/s, whereas your patch
> raised this to ~41.
>

Hmm.. suppose this is the problem anyway and that Jens patch was not enough.
How do the disk drives sound during the test?

Does it start to sound more when performance goes down?

About Jens patch:

My feeling is that there should be (a lot) more READA than READ,
since sequential READ really only NEEDS one at a time.

Number of READ limits the number of concurrent streams.
And READA limits the maximum total read ahead.
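
To put rough numbers on that: with Jens's patch above and queue_nr_requests =
256, a queue ends up with 128 READ slots, 128 WRITE slots and a READA pool of
256 / 4 = 64 - so Roy's 100 concurrent streams already outnumber the
readahead pool. (Back-of-the-envelope arithmetic read off the patch, not
figures from this thread.)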

Jens said earlier "Roy, please try and change
the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
something like 2048." - Roy have you tested this too?

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2002-02-01 18:56:11

by Roger Larsson

Subject: Re: Errors in the VM - detailed

One more thing, that I think is important.

On Friday 1 February 2002 19.44, Roger Larsson wrote:
> On Friday 1 February 2002 17.11, Roy Sigurd Karlsbakk wrote:
> - - -
> About Jens patch:
>
> My feeling is that there should be (a lot) more READA than READ,
> since sequential READ really only NEEDS one at a time.
>
> Number of READ limits the number of concurrent streams.
> And READA limits the maximum total read ahead.

With RAID, as Roy uses, this gets even worse!
READs have to be > concurrent streams * raid disks [IMHO]
since each stream is split across all the disks...
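
By that rule of thumb, 100 streams over a two-disk RAID-0 would want on the
order of 100 * 2 = 200 reads in flight, comfortably more than the 128 READ
slots a queue gets with queue_nr_requests = 256. (Again, illustrative
arithmetic only.)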

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2002-02-01 18:58:31

by Jens Axboe

Subject: Re: Errors in the VM - detailed

On Fri, Feb 01 2002, Roger Larsson wrote:
> On Friday 1 February 2002 17.11, Roy Sigurd Karlsbakk wrote:
> > It does not seem to be possible to reproduce the error with apache2. But
> > this may be because Apache2's i/o handling doesn't impress much. With Tux,
> > I keep getting up to 40 megs per sec, but with Apache the average is
> > ~15MB/s.
> >
> > Btw ... It looks like your patch (against rmap12a) gave me an extra
> > performance kick. 12c gave me a max of ~32MB/s, whereas your patch
> > raised this to ~41.
> >
>
> Hmm.. suppose this is the problem anyway and that Jens patch was not enough.
> How do the disk drives sound during the test?
>
> Does it start to sound more when performance goes down?

Yes that would be interesting to know, if the disk becomes seek bound.

> About Jens patch:
>
> My feeling is that there should be (a lot) more READA than READ.
> since sequential READ really only NEEDS one at a time.

Probably, my patch was really just a quick try to see if it changed
anything.

> Number of READ limits the number of concurrent streams.
> And READA limits the maximum total read ahead.

Correct, Roy you could try and change the READA balance by allocating
lots more READA requests. Simply play around with the
queue_nr_requests / 4 setting. Try something "absurd" like
queue_nr_requests << 2 or even bigger.
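
For illustration, that would be a one-line tweak to the READA pool loop in
the earlier patch, along these lines (sketch):

	-	for (i = 0; i < queue_nr_requests / 4; i++) {
	+	for (i = 0; i < queue_nr_requests << 2; i++) {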

--
Jens Axboe

2002-02-02 14:44:20

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> Jens said earlier "Roy, please try and change
> the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
> something like 2048." - Roy have you tested this too?

No ... Where do I change it?

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 14:43:49

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> Hmm.. suppose this is the problem anyway and that Jens patch was not enough.
> How do the disk drives sound during the test?

The disk is SILENT! I can hardly hear anything.

> Does it start to sound more when performance goes down?

I don't believe it's a seek problem, as the readahead (RAID chunk size) is
1MB

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 14:46:20

by Jens Axboe

Subject: Re: Errors in the VM - detailed

On Sat, Feb 02 2002, Roy Sigurd Karlsbakk wrote:
> > Jens said earlier "Roy, please try and change
> > the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
> > something like 2048." - Roy have you tested this too?
>
> No ... Where do I change it?

drivers/block/ll_rw_blk.c:blk_dev_init()
{
	queue_nr_requests = 64;
	if (total_ram > MB(32))
		queue_nr_requests = 256;

Change the 256 to 2048.

--
Jens Axboe

2002-02-02 14:53:20

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> > Hmm.. suppose this is the problem anyway and that Jens patch was not enough.
> > How do the disk drives sound during the test?
> >
> > Does it start to sound more when performance goes down?
>
> Yes that would be interesting to know, if the disk becomes seek bound.

The performance never goes down. It's stable @ ~40-43 MB/s. It DID go
down, but that was before -rmap11c. Then the problem was in the VM

> Probably, my patch was really just a quick try to see if it changed
> anything.
>
> > Number of READ limits the number of concurrent streams.
> > And READA limits the maximum total read ahead.
>
> Correct, Roy you could try and change the READA balance by allocating
> lots more READA requests. Simply play around with the
> queue_nr_requests / 4 setting. Try something "absurd" like
> queue_nr_requests << 2 or even bigger.

sure.

where do I change this???

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 15:03:33

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

On Sat, 2 Feb 2002, Jens Axboe wrote:

> On Sat, Feb 02 2002, Roy Sigurd Karlsbakk wrote:
> > > Jens said earlier "Roy, please try and change
> > > the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
> > > something like 2048." - Roy have you tested this too?
> >
> > No ... Where do I change it?
>
> drivers/block/ll_rw_blk.c:blk_dev_init()
> {
> 	queue_nr_requests = 64;
> 	if (total_ram > MB(32))
> 		queue_nr_requests = 256;
>
> Change the 256 to 2048.

Is this an attempt to fix problem #2 (as described in the initial
email), or to further improve throughput?

Problem #2 is _the_ worst problem, as it makes the server more-or-less
unusable

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 15:07:13

by Jens Axboe

Subject: Re: Errors in the VM - detailed

On Sat, Feb 02 2002, Roy Sigurd Karlsbakk wrote:
> On Sat, 2 Feb 2002, Jens Axboe wrote:
>
> > On Sat, Feb 02 2002, Roy Sigurd Karlsbakk wrote:
> > > > Jens said earlier "Roy, please try and change
> > > > the queue_nr_requests assignment in ll_rw_blk:blk_dev_init() to
> > > > something like 2048." - Roy have you tested this too?
> > >
> > > No ... Where do I change it?
> >
> > drivers/block/ll_rw_blk.c:blk_dev_init()
> > {
> > 	queue_nr_requests = 64;
> > 	if (total_ram > MB(32))
> > 		queue_nr_requests = 256;
> >
> > Change the 256 to 2048.
>
> Is this an attempt to fix problem #2 (as described in the initial
> email), or to further improve throughput?

Further "improvement", the question is will it make a difference.
Bumping READA count would interesting too, as outlined.

> Problem #2 is _the_ worst problem, as it makes the server more-or-less
> unusable

Please send sysrq-t traces for such stuck processes. It's _impossible_
to guess what's going on from here; the crystal ball just isn't good
enough :-)

--
Jens Axboe

2002-02-02 15:22:48

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed (or is it Tux?)

> > Problem #2 is _the_ worst problem, as it makes the server more-or-less
> > unusable
>
> Please send sysrq-t traces for such stuck processes. It's _impossible_
> to guess what's going on from here; the crystal ball just isn't good
> enough :-)

Decoded sysrq+t is attached.

I've found that only the first 60 wget processes started from the remote
machine are being serviced. After they are done, Tux hangs, using 100%
system time, still open on port ## (80), but doesn't do anything.

I don't understand anything...

Thanks, guys. You're of great help!

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.


Attachments:
altsysrqt.decoded (12.03 kB)

2002-02-02 15:39:25

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

> I have reread the first mail in this series - I would say that Bug#2 is much
> worse than Bug#1. This since Bug#1 is "only" a performance problem,
> but Bug#2 is about correctness...
>
> Are you 100% sure that tux works with rmap?

Of course not. How can I be sure???

> I would suggest testing the simplest possible case.
> * Standard kernel
> * concurrent dd:s

Won't work. Then all I get is (ref prob #1) good throughput until RAMx2
bytes are read from disk. Then it all falls down to ~1MB/s. See
http://karlsbakk.net/dev/kernel/vm-fsckup.txt for more details.

> What can your problem be:
> * something to do with the VM - but the problem is in several different VMs...
> * something to do with read ahead? you got some patch suggestions -
> please use them on a standard kernel, not rmap (for now...)

Then fix the problem rmap11c fixed. I first need that fixed before being
able to do any further testing!

roy

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 15:35:35

by Roger Larsson

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

On Saturday 2 February 2002 16.22, Roy Sigurd Karlsbakk wrote:
> > > Problem #2 is _the_ worst problem, as it makes the server more-or-less
> > > unusable
> >
> > Please send sysrq-t traces for such stuck processes. It's _impossible_
> > to guess what's going on from here; the crystal ball just isn't good
> > enough :-)
>
> Decoded sysrq+t is attached.
>
> I've found that only the first 60 wget processes started from the remote
> machine are being serviced. After they are done, Tux hangs, using 100%
> system time, still open on port ## (80), but doesn't do anything.
>
> I don't understand anything...

I have reread the first mail in this series - I would say that Bug#2 is much
worse than Bug#1, since Bug#1 is "only" a performance problem,
but Bug#2 is about correctness...

Are you 100% sure that tux works with rmap?

I would suggest testing the simplest possible case.
* Standard kernel
* concurrent dd:s

What can your problem be:
* something to do with the VM - but the problem is in several different VMs...
* something to do with read ahead? you got some patch suggestions -
please use them on a standard kernel, not rmap (for now...)

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2002-02-02 16:28:14

by Roger Larsson

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

On Saturday 2 February 2002 16.38, Roy Sigurd Karlsbakk wrote:
> > I have reread the first mail in this series - I would say that Bug#2 is
> > much worse than Bug#1, since Bug#1 is "only" a performance problem,
> > but Bug#2 is about correctness...
> >
> > Are you 100% sure that tux works with rmap?
>
> Of course not. How can I be sure???
>
> > I would suggest testing the simplest possible case.
> > * Standard kernel
> > * concurrent dd:s
>
> Won't work. Then all I get is (ref prob #1) good throughput until RAMx2
> bytes are read from disk. Then it all falls down to ~1MB/s. See
> http://karlsbakk.net/dev/kernel/vm-fsckup.txt for more details.

How do you know that it gets into this at RAMx2? Have you added up the 'bi'
figures from vmstat?

One interesting thing to notice from vmstat is...

r b w swpd free buff cache si so bi bo in cs us sy id
When performing nicely:
0 200 1 1676 3200 3012 786004 0 292 42034 298 791 745 4 29 67
0 200 1 1676 3308 3136 785760 0 0 44304 0 748 758 3 15 82
0 200 1 1676 3296 3232 785676 0 0 44236 0 756 710 2 23 75
Later when being slow:
0 200 0 3468 3316 4060 784668 0 0 1018 0 613 631 1 2 97
0 200 0 3468 3292 4060 784688 0 0 1034 0 617 638 0 3 97
0 200 0 3468 3200 4068 784772 0 0 1066 6 694 727 2 4 94

No swap activity (si + so == 0), mostly idle (id > 90).
So it is waiting - on what??? timer? disk?

>
> > What can your problem be:
> > * something to do with the VM - but the problem is in several different
> > VMs... * something to do with read ahead? you got some patch suggestions
> > - please use them on a standard kernel, not rmap (for now...)
>
> Then fix the problem rmap11c fixed. I first need that fixed before being
> able to do any further testing!

Roy, did you notice the mail from Andrew Morton:
> heh. Yep, Roger finally nailed it, I think.
>
> Roy says the bug was fixed in rmap11c. Changelog says:
>
>
> rmap 11c:
> ...
> - elevator improvement (Andrew Morton)
>
> Which includes:
>
> - queue_nr_requests = 64;
> - if (total_ram > MB(32))
> - 	queue_nr_requests = 128;
> + queue_nr_requests = (total_ram >> 9) & ~15;	/* One per half-megabyte */
> + if (queue_nr_requests < 32)
> + 	queue_nr_requests = 32;
> + if (queue_nr_requests > 1024)
> + 	queue_nr_requests = 1024;

rmap11c changed queue_nr_requests, and that problem went away.
But another one showed its ugly head...

Could you please try just that part of rmap11c? Or the very simple route of
setting queue_nr_requests to 2048 for a test drive...

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2002-02-02 16:39:55

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

> How do you know that it gets into this at RAMx2? Have you added 'bi' from
> vmstat?

yes

> One interesting thing to notice from vmstat is...
>
> r b w swpd free buff cache si so bi bo in cs us sy id
> When performing nicely:
> 0 200 1 1676 3200 3012 786004 0 292 42034 298 791 745 4 29 67
> 0 200 1 1676 3308 3136 785760 0 0 44304 0 748 758 3 15 82
> 0 200 1 1676 3296 3232 785676 0 0 44236 0 756 710 2 23 75
> Later when being slow:
> 0 200 0 3468 3316 4060 784668 0 0 1018 0 613 631 1 2 97
> 0 200 0 3468 3292 4060 784688 0 0 1034 0 617 638 0 3 97
> 0 200 0 3468 3200 4068 784772 0 0 1066 6 694 727 2 4 94
>
> No swap activity (si + so == 0), mostly idle (id > 90).
> So it is waiting - on what??? timer? disk?

I don't know. All I know is that with rmap-11c, it works

> Roy, did you notice the mail from Andrew Morton:
> > [snip: Andrew's changelog quote]
>
> rmap11c changed queue_nr_requests, and that problem went away.
> But another one showed its ugly head...
>
> Could you please try just that part of rmap11c? Or the very simple route of
> setting queue_nr_requests to 2048 for a test drive...

u mean - on a 2.4.1[18](-pre.)? kernel?

I'll try

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 16:53:10

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

> > Roy, did you notice the mail from Andrew Morton:
> > > [snip: Andrew's changelog quote]
> >
> > rmap11c changed queue_nr_requests, and that problem went away.
> > But another one showed its ugly head...
> >
> > Could you please try just that part of rmap11c? Or the very simple route of
> > setting queue_nr_requests to 2048 for a test drive...
>
> u mean - on a 2.4.1[18](-pre.)? kernel?
>
> I'll try

er..

# grep queue_nr_requests /usr/src/packed/k/2.4.17-rmap-11c
#


--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.

2002-02-02 17:32:57

by Roger Larsson

Subject: Re: Errors in the VM - detailed (or is it Tux? or rmap? or those together...)

Hi again Roy,

> er..
>
> # grep queue_nr_requests /usr/src/packed/k/2.4.17-rmap-11c
> #
Andrew did supply a patch for Riel but he did not accept all of it?

Let's see again. Do I understand you correctly:
rmap 11c fixes problem #1 but 11b does not? Are all later
rmaps good?

rmap 11c:
- oom_kill race locking fix (Andres Salomon)
- elevator improvement (Andrew Morton)
- dirty buffer writeout speedup (hopefully ;)) (me)
- small documentation updates (me)
- page_launder() never does synchronous IO, kswapd
and the processes calling it sleep on higher level (me)
- deadlock fix in touch_page() (me)
rmap 11b:

Let's see: not an oom condition, no dirty buffers (read "only"),
not documentation, not page_launder (no dirty...), not the deadlock.
Remaining is the elevator... And that can really be it!
(read-ahead related too...)

and 2.4.18-pre2 (or later) does not fix it?

2.4.18-pre2:
- ...
- Fix elevator insertion point on failed
request merge (Jens Axboe)
- ...
pre1:

--
Roger Larsson
Skellefteå
Sweden

2002-02-02 17:46:17

by Roy Sigurd Karlsbakk

Subject: Re: Errors in the VM - detailed

> Andrew did supply a patch for Riel but he did not accept all of it?
>
> Let's see again. Do I understand you correctly:
> rmap 11c fixes problem #1 but 11b does not? Are all later
> rmaps good?

I've just tried 11c and 12a. Both are good. The change was made between
11b and 11c.

>
> rmap 11c:
> - oom_kill race locking fix (Andres Salomon)
> - elevator improvement (Andrew Morton)
> - dirty buffer writeout speedup (hopefully ;)) (me)
> - small documentation updates (me)
> - page_launder() never does synchronous IO, kswapd
> and the processes calling it sleep on higher level (me)
> - deadlock fix in touch_page() (me)
> rmap 11b:
>
> Let's see: not an oom condition, no dirty buffers (read "only"),
> not documentation, not page_launder (no dirty...), not the deadlock.
> Remaining is the elevator... And that can really be it!
> (read-ahead related too...)
>
> and 2.4.18-pre2 (or later) does not fix it?

I'll try.

>
> 2.4.18-pre2:
> - ...
> - Fix elevator insertion point on failed
> request merge (Jens Axboe)
> - ...
> pre1:

btw... I believe error #2 is Tux specific. I'm debugging it now. Sorry
for that

roy

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.