2003-03-04 12:47:47

by Alastair Stevens

Subject: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

Hi Guys - I was surprised to discover that the very latest 2.4.20
kernels running the latest -ck patches still have major VM problems,
even with the -aa VM.

Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not particularly
heavily loaded, lasted 81 days with 2.4.20-ck1 under RH8.0, and then
succumbed with these errors:

VM error: killing process wineserver
_alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

This time, it only lasted _3 days_ with -ck4 before the same thing
happened.

I presume this is the OOM killer? Swap is indeed full, but I've no idea
why, on a machine that's only running a couple of instances of a small
Windoze app under WINE.

Is there a problem here? Should I just give up and run 2.5? ;-)

Cheers
Alastair .-=-.
__________________________________,' `.
\ http://www.mrc-bsu.cam.ac.uk
Alastair Stevens, Systems Management Team \ 01223 330383
MRC Biostatistics Unit, Cambridge UK `=.......................


2003-03-04 15:18:53

by Mike Galbraith

Subject: Re: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

At 12:58 PM 3/4/2003 +0000, Alastair Stevens wrote:
>Hi Guys - I was surprised to discover that the very latest 2.4.20
>kernels running the latest -ck patches still have major VM problems,
>even with the -aa VM.
>
>Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not particularly
>heavily loaded, lasted 81 days with 2.4.20-ck1 under RH8.0, and then
>succumbed with these errors:
>
> VM error: killing process wineserver
> _alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
>
>This time, it only lasted _3 days_ with -ck4 before the same thing
>happened.
>
>I presume this is the OOM killer? Swap is indeed full, but I've no idea
>why, on a machine that's only running a couple of instances of a small
>Windoze app under WINE.

You sure your userland proggies aren't leaking like a sieve?

-Mike
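
One way to check for a userland leak is to sample a process's virtual size over time and see whether it only ever grows. The following is a minimal sketch, not a definitive tool: it assumes a Linux /proc/<pid>/status file with "VmSize: N kB" style lines, and the helper names (parse_status_kb, looks_like_leak, sample_process) and the 1024 kB growth threshold are made up for illustration.

```python
import re
import time


def parse_status_kb(status_text):
    """Extract the Vm* fields (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(Vm\w+):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def looks_like_leak(samples_kb, min_growth_kb=1024):
    """Heuristic only: True if the samples never shrink and the total
    growth is at least min_growth_kb (an arbitrary, assumed threshold)."""
    if len(samples_kb) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples_kb, samples_kb[1:]))
    return never_shrinks and samples_kb[-1] - samples_kb[0] >= min_growth_kb


def sample_process(pid, count=10, interval_s=60.0):
    """Read VmSize for `pid` `count` times, sleeping between reads."""
    samples = []
    for _ in range(count):
        with open(f"/proc/{pid}/status") as f:
            samples.append(parse_status_kb(f.read()).get("VmSize", 0))
        time.sleep(interval_s)
    return samples
```

Running sample_process() against the wineserver PID for a few hours and passing the result to looks_like_leak() would give a rough answer; steady growth with no plateau points at userspace rather than the kernel.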

2003-03-05 00:17:09

by Con Kolivas

Subject: Re: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

On Tue, 4 Mar 2003 11:58 pm, Alastair Stevens wrote:
> Hi Guys - I was surprised to discover that the very latest 2.4.20
> kernels running the latest -ck patches still have major VM problems,
> even with the -aa VM.
>
> Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not particularly
> heavily loaded, lasted 81 days with 2.4.20-ck1 under RH8.0, and then
> succumbed with these errors:
>
> VM error: killing process wineserver
> _alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
>
> This time, it only lasted _3 days_ with -ck4 before the same thing
> happened.
>
> I presume this is the OOM killer? Swap is indeed full, but I've no idea
> why, on a machine that's only running a couple of instances of a small
> Windoze app under WINE.
>
> Is there a problem here? Should I just give up and run 2.5? ;-)

My first guess would be wine. There are all sorts of leaks in that.

I'm not aware of any memory leak / vm problems with -ck although that may be
possible. However ck4 does not have the OOM killer enabled so it's not that
in action; you simply have run out of memory and it can't allocate any more.
Have you tried without the aa vm addons in ck? Does this happen with vanilla
2.4.20? -ck is a very different branch.

Con
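
For anyone reading the failure line itself: a 0-order allocation is a request for a single page (2^0), and the gfp value is a bitmask describing how the allocation was allowed to behave. The sketch below decodes it. The flag table is an assumption based on the 2.4-era include/linux/mm.h and should be checked against the actual tree in use; the function names are illustrative only.

```python
import re

# Assumed 2.4-era GFP flag bits; verify against include/linux/mm.h
# in the kernel source you are actually running.
GFP_FLAGS_2_4 = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def parse_alloc_failure(line):
    """Return (order, gfp_mask) from an 'N-order allocation failed' line,
    or None if the line does not match."""
    m = re.search(r"(\d+)-order allocation failed \(gfp=(0x[0-9a-fA-F]+)/", line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2), 16)


def decode_gfp(mask):
    """List the names of the known flag bits set in mask, lowest bit first.
    Any bits not in the table are appended as one hex string."""
    names = [name for bit, name in sorted(GFP_FLAGS_2_4.items()) if mask & bit]
    unknown = mask & ~sum(GFP_FLAGS_2_4)
    if unknown:
        names.append(hex(unknown))
    return names


order, mask = parse_alloc_failure(
    "_alloc_pages: 0-order allocation failed (gfp=0x1d2/0)"
)
print(order, decode_gfp(mask))
```

Under that assumed table, 0x1d2 is HIGHMEM, WAIT, IO, HIGHIO and FS together, which looks like an ordinary user-page request that was allowed to sleep and do I/O. That would fit a machine that has genuinely run out of memory and swap rather than a special-case allocation failing.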

2003-03-05 10:10:31

by Alastair Stevens

Subject: Re: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

> > Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not particularly
> > heavily loaded, lasted 81 days with 2.4.20-ck1 under RH8.0, and then
> > succumbed with these errors:
> >
> > VM error: killing process wineserver
> > _alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> >
> > This time, it only lasted _3 days_ with -ck4 before the same thing
> > happened.
>
> My first guess would be wine. There are all sorts of leaks in that.

It's actually CodeWeavers Wine if it makes any difference!

> I'm not aware of any memory leak / vm problems with -ck although that may be
> possible. However ck4 does not have the OOM killer enabled so it's not that
> in action; you simply have run out of memory and it can't allocate any more.
> Have you tried without the aa vm addons in ck? Does this happen with vanilla
> 2.4.20? -ck is a very different branch.

Thanks - I'm being a bit thick here, because everyone's right, it's
obviously a userspace problem. The machine is now running plain
2.4.21-pre5, so we'll see what happens....

PS - thanks for your work on -ck, Con! It runs beautifully on my home
machine, which isn't tortured by Wine ;-)

Cheers
Alastair .-=-.
__________________________________,' `.
\ http://www.mrc-bsu.cam.ac.uk
Alastair Stevens, Systems Management Team \ 01223 330383
MRC Biostatistics Unit, Cambridge UK `=.......................

2003-03-11 09:00:03

by Alastair Stevens

Subject: Re: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

> > Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not particularly
> > heavily loaded, lasted 81 days with 2.4.20-ck1 under RH8.0, and then
> > succumbed with these errors:
> >
> > VM error: killing process wineserver
> > _alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
>
> I'm not aware of any memory leak / vm problems with -ck although that may be
> possible. However ck4 does not have the OOM killer enabled so it's not that
> in action; you simply have run out of memory and it can't allocate any more.
> Have you tried without the aa vm addons in ck? Does this happen with vanilla
> 2.4.20? -ck is a very different branch.

FWIW - these problems don't appear to be happening with stock
2.4.21-pre5. Of course I can't say for sure, since the circumstances
will never be the same twice, but the machine survived the same sort of
hammering as the other day (with memory and swap almost full), but is
now happily relaxing again:

          total:      used:      free:    shared:   buffers:    cached:
Mem:   528072704  410812416  117260288          0  113184768   66420736
Swap: 1258426368   23404544 1235021824
MemTotal: 515696 kB
MemFree: 114512 kB
MemShared: 0 kB
Buffers: 110532 kB
Cached: 56848 kB
SwapCached: 8016 kB
Active: 131172 kB
Inactive: 86036 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 515696 kB
LowFree: 114512 kB
SwapTotal: 1228932 kB
SwapFree: 1206076 kB

So this time, WINE does _not_ appear to have been leaking like a sieve!
Stranger and stranger....

Regards
Alastair
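
The figures above can be turned into used-memory and used-swap numbers with a few lines of Python. This is only a sketch: it assumes the "Name: N kB" line format shown in the dump, and the function names are invented for the example. The sample text is a subset of the lines posted above.

```python
SAMPLE = """\
MemTotal: 515696 kB
MemFree: 114512 kB
SwapTotal: 1228932 kB
SwapFree: 1206076 kB
"""


def parse_meminfo_kb(text):
    """Collect 'Name: N kB' lines into a dict of integers (kB).
    Leading '> ' quote markers are tolerated so quoted mail can be pasted in."""
    values = {}
    for line in text.splitlines():
        parts = line.lstrip("> ").split()
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            values[parts[0][:-1]] = int(parts[1])
    return values


def used_kb(values, total_key, free_key):
    """Used amount in kB, computed as total minus free."""
    return values[total_key] - values[free_key]


info = parse_meminfo_kb(SAMPLE)
swap_used = used_kb(info, "SwapTotal", "SwapFree")
mem_used = used_kb(info, "MemTotal", "MemFree")
print(f"swap used: {swap_used} kB "
      f"({100.0 * swap_used / info['SwapTotal']:.1f}% of swap)")
print(f"mem used:  {mem_used} kB")
```

The results agree with the byte counts on the Mem: and Swap: lines (23404544 bytes is 22856 kB of swap in use), so under 2% of swap is occupied at this point, against a completely full swap when the -ck kernels failed.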

2003-03-11 09:38:11

by Con Kolivas

Subject: Re: VM / OOM troubles in 2.4.20-ck4 (-aa VM)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 11 Mar 2003 20:10, Alastair Stevens wrote:
> > > Our dual Athlon server with 512Mb RAM / 1.2Gb swap, and not
> > > particularly heavily loaded, lasted 81 days with 2.4.20-ck1 under
> > > RH8.0, and then succumbed with these errors:
> > >
> > > VM error: killing process wineserver
> > > _alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> >
> > I'm not aware of any memory leak / vm problems with -ck although that may
> > be possible. However ck4 does not have the OOM killer enabled so it's not
> > that in action; you simply have run out of memory and it can't allocate
> > any more. Have you tried without the aa vm addons in ck? Does this happen
> > with vanilla 2.4.20? -ck is a very different branch.
>
> FWIW - these problems don't appear to be happening with stock
> 2.4.21-pre5. Of course I can't say for sure, since the circumstances
> will never be the same twice, but the machine survived the same sort of
> hammering as the other day (with memory and swap almost full), but is
> now happily relaxing again:
>
>           total:      used:      free:    shared:   buffers:    cached:
> Mem:   528072704  410812416  117260288          0  113184768   66420736
> Swap: 1258426368   23404544 1235021824
> MemTotal: 515696 kB
> MemFree: 114512 kB
> MemShared: 0 kB
> Buffers: 110532 kB
> Cached: 56848 kB
> SwapCached: 8016 kB
> Active: 131172 kB
> Inactive: 86036 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 515696 kB
> LowFree: 114512 kB
> SwapTotal: 1228932 kB
> SwapFree: 1206076 kB
>
> So this time, WINE does _not_ appear to have been leaking like a sieve!
> Stranger and stranger....

Cannot pass judgement on that alone. It must be fully reproducible in some way
and not happen with another kernel.

If it really is my kernel then I do want to know what the cause is. Ideally
testing -ck4 without the aa vm addons (just reverse patch it) and see if that
makes the problem go away. I'm no VM hacker though, and if it turns out to be
the vm addons then you should test an -aa kernel with the vm addons and see
if that exhibits the same problem. _Then_ you can tell AA about it. All of
this depends on whether you can find some way of reproducing the problem and
are interested in helping debug it :-P

Con
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+bbD7F6dfvkL3i1gRAjHfAJ4iv5/FMglzMeY2PGcf+MGJGeVnHACeI/Sz
GPUNSgrWvcBgNE+9TBf1e1s=
=R4hY
-----END PGP SIGNATURE-----