2001-10-06 06:55:06

by Christian Borntraeger

Subject: OOM-Killer in 2.4.11pre4

I reported "__alloc_pages: 0-order allocation failed" errors in 2.4.10 with a
memory-eating program.

These errors are gone with 2.4.11pre4. The OOM-Killer works __correctly__. It
seems that Marcelo's patch works correctly for me.



2001-10-06 14:26:02

by Andrea Arcangeli

Subject: Re: OOM-Killer in 2.4.11pre4

On Sat, Oct 06, 2001 at 08:53:30AM +0200, Christian Bornträger wrote:
> I reported "__alloc_pages: 0-order allocation failed" errors in 2.4.10 with a
> memory-eating program.
>
> These errors are gone with 2.4.11pre4. The OOM-Killer works __correctly__. It
> seems that Marcelo's patch works correctly for me.

To test the OOM killer you should try to run out of memory sometime.
It's not the OOM killer that cured the OOM failures, it is the
deadlock-prone infinite loop in the allocator that did.

I have now identified various issues that can explain the OOM failures on
the highmem boxes (I don't have any highmem box, so it wasn't possible to
trigger them here; the highest-memory ia32 machine that I own is my
UP desktop with 512 MB of RAM), and I will be able to verify my fixes
as soon as I can get a login on a 4/8G box. I created this project for
this purpose:

http://www.osdlab.org/cgi-bin/eidetic.cgi?command=display&modulename=projects&on=60

After I get the login and verify my fixes, I'll release a new
-aa that will be meant primarily to fix the allocation failures.

Andrea

2001-10-06 15:08:30

by Christian Borntraeger

Subject: Re: OOM-Killer in 2.4.11pre4

> To test the OOM killer you should try to run out of memory sometime.

I used a test program with an endless dummy=new char[1024] loop.

This program triggered the allocation failure with 2.4.10, but with 2.4.11pre4
it is killed by the OOM-Killer and not by the VM.
By the way, I had this problem without highmem (only 512 MB), and my problem is
gone with 2.4.11pre4.
At least this bug is solved. There is nothing more I wanted to report.
Good work :-)

greetings

Christian Bornträger

2001-10-06 15:28:13

by Andrea Arcangeli

Subject: Re: OOM-Killer in 2.4.11pre4

On Sat, Oct 06, 2001 at 05:06:53PM +0200, Christian Bornträger wrote:
> > To test the OOM killer you should try to run out of memory sometime.
>
> I used a test program with an endless dummy=new char[1024] loop.

This loop doesn't generate any page fault, it just allocates virtual
space.
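
To illustrate the distinction between reserving virtual address space and
actually faulting pages in, here is a minimal sketch. It is not the test
program from this thread: the helper name allocate_blocks, the 4096-byte
page size and the block sizes are assumptions made for the example. Note
also that a real allocator may write its own bookkeeping into small blocks,
so how many pages a bare new[] loop touches depends on the C library.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Assumed page size for illustration; real code would query the system.
constexpr std::size_t kAssumedPageSize = 4096;

// Hypothetical helper (not from the thread). Allocates `count` blocks of
// `block_size` bytes each. When `touch` is true, one byte per assumed page
// is written through a volatile pointer, which forces the kernel to back
// that page with physical memory. When `touch` is false, the blocks are
// only allocated and never written by this function.
// Returns the number of pages this function wrote to.
std::size_t allocate_blocks(std::size_t count, std::size_t block_size, bool touch) {
    std::vector<char*> blocks;
    std::size_t pages_touched = 0;

    for (std::size_t i = 0; i < count; ++i) {
        char* p = new (std::nothrow) char[block_size];
        if (p == nullptr) {
            break;  // allocation refused: stop instead of throwing
        }
        blocks.push_back(p);

        if (touch) {
            volatile char* v = p;
            for (std::size_t off = 0; off < block_size; off += kAssumedPageSize) {
                v[off] = 1;  // the write is what triggers the page fault
                ++pages_touched;
            }
        }
    }

    // Bounded sketch: release everything again. A real memory eater would
    // keep the blocks and keep looping until the kernel intervenes.
    for (char* p : blocks) {
        delete[] p;
    }
    return pages_touched;
}
```

Calling allocate_blocks(n, size, false) only asks the allocator for address
space, while allocate_blocks(n, size, true) also writes to every page; only
the second variant is expected to put real pressure on physical memory as n
grows.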

> By the way, I had this problem without highmem (only 512 MB), and my problem is

I cannot reproduce anything like that here with 512M on 2.4.11pre3aa1.
The reports I had were all with 4G of RAM, in particular with the 3.5G
of virtual memory per process on x86, which increases the pressure on the
normal zone and in turn showed me the problem.

Anyway, now that I think I have seen the issues behind the normal-zone
failures, I will try to address them soon without having to introduce
deadlock-prone code into -aa. Probably not today, but I hope tomorrow or
on Monday.

Andrea

2001-10-08 17:34:59

by Marcelo Tosatti

Subject: Re: OOM-Killer in 2.4.11pre4



On Sat, 6 Oct 2001, Andrea Arcangeli wrote:

> On Sat, Oct 06, 2001 at 08:53:30AM +0200, Christian Bornträger wrote:
> > I reported "__alloc_pages: 0-order allocation failed" errors in 2.4.10 with a
> > memory-eating program.
> >
> > These errors are gone with 2.4.11pre4. The OOM-Killer works __correctly__. It
> > seems that Marcelo's patch works correctly for me.
>
> To test the OOM killer you should try to run out of memory sometime.
> It's not the OOM killer that cured the OOM failures, it is the
> deadlock-prone infinite loop in the allocator that did.
>
> I have now identified various issues that can explain the OOM failures on
> the highmem boxes (I don't have any highmem box, so it wasn't possible to
> trigger them here; the highest-memory ia32 machine that I own is my
> UP desktop with 512 MB of RAM), and I will be able to verify my fixes
> as soon as I can get a login on a 4/8G box. I created this project for
> this purpose:
>
> http://www.osdlab.org/cgi-bin/eidetic.cgi?command=display&modulename=projects&on=60
>
> After I get the login and verify my fixes, I'll release a new
> -aa that will be meant primarily to fix the allocation failures.

Andrea,

I already have access to a 16GB box at OSDLabs precisely to test
highmem-related issues -- there is no need to create another similar project IMO.

I can give you access to the machine if you want.