2004-03-26 01:28:08

by Robin Holt

Subject: Large memory application exhausts buffers during write.


We have a large memory application which is being killed by the OOM
killer.

This is a 2.4 based kernel with many of the redhat patches applied.
Before the application is started, there is approx 350GB of memory
free according to top. When the app starts, it mallocs a 300GB
buffer, initializes it, does computations into it, and then starts
to write it to a disk file.

What we see happen is the first approx 30GBs gets written and then
swap starts getting utilized. Once swap has been heavily utilized,
the OOM killer kicks in and kills the job.

The application is a vendor provided app and probably cannot be
modified. Does anybody have any suggestion on possible changes
to make to the kernel to eliminate or significantly reduce the
likelihood that the job gets terminated?

Thanks,
Robin Holt


2004-03-26 10:47:57

by Denis Vlasenko

Subject: Re: Large memory application exhausts buffers during write.

On Friday 26 March 2004 03:20, Robin Holt wrote:
> We have a large memory application which is being killed by the OOM
> killer.
>
> This is a 2.4 based kernel with many of the redhat patches applied.
> Before the application is started, there is approx 350GB of memory
> free according to top. When the app starts, it mallocs a 300GB
> buffer, initializes it, does computations into it, and then starts
> to write it to a disk file.
>
> What we see happen is the first approx 30GBs gets written and then
> swap starts getting utilized. Once swap has been heavily utilized,
> the OOM killer kicks in and kills the job.

How much swap do you have? What do you see in top?

> The application is a vendor provided app and probably cannot be
> modified. Does anybody have any suggestion on possible changes
> to make to the kernel to eliminate or significantly reduce the
> likelihood that the job gets terminated?
--
vda

2004-03-26 10:54:08

by Robin Holt

Subject: Re: Large memory application exhausts buffers during write.

On Fri, Mar 26, 2004 at 12:47:10PM +0200, Denis Vlasenko wrote:
> On Friday 26 March 2004 03:20, Robin Holt wrote:
> > We have a large memory application which is being killed by the OOM
> > killer.
> >
> > This is a 2.4 based kernel with many of the redhat patches applied.
> > Before the application is started, there is approx 350GB of memory
> > free according to top. When the app starts, it mallocs a 300GB
> > buffer, initializes it, does computations into it, and then starts
> > to write it to a disk file.
> >
> > What we see happen is the first approx 30GBs gets written and then
> > swap starts getting utilized. Once swap has been heavily utilized,
> > the OOM killer kicks in and kills the job.
>
> How much swap do you have? What do you see in top?

I am not sure how much swap is configured or available, but I
highly doubt that there is 350GB of swap.

Robin

2004-03-26 10:57:39

by Christoph Hellwig

Subject: Re: Large memory application exhausts buffers during write.

On Thu, Mar 25, 2004 at 07:20:56PM -0600, Robin Holt wrote:
> This is a 2.4 based kernel with many of the redhat patches applied.
> Before the application is started, there is approx 350GB of memory
> free according to top. When the app starts, it mallocs a 300GB
> buffer, initializes it, does computations into it, and then starts
> to write it to a disk file.
>
> What we see happen is the first approx 30GBs gets written and then
> swap starts getting utilized. Once swap has been heavily utilized,
> the OOM killer kicks in and kills the job.

Buffered writes or O_DIRECT? I guess you're doing the former and
actually want the latter. Try preloading a tiny library stub that
adds O_DIRECT to open for the interesting fds.
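
A minimal sketch of such a preload stub follows. It is only an
illustration: the environment variable ODIRECT_PREFIX, the file name
odirect_shim.c and the helper names are hypothetical, and it assumes
the application calls open() directly. An application that uses
open64() or fopen() would need those entry points wrapped as well.
O_DIRECT also generally requires the user buffer, file offset and
transfer size to be suitably aligned, so an unmodified application's
writes may fail with EINVAL.

```c
/* odirect_shim.c: hypothetical LD_PRELOAD stub, a sketch only.
 * Assumed build: gcc -shared -fPIC -o odirect_shim.so odirect_shim.c -ldl
 */
#define _GNU_SOURCE             /* for O_DIRECT and RTLD_NEXT */
#include <assert.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Return 1 if path starts with the prefix named by ODIRECT_PREFIX
 * (a made-up variable used here to pick the "interesting" files). */
static int wants_direct(const char *path)
{
    const char *prefix = getenv("ODIRECT_PREFIX");

    if (path == NULL || prefix == NULL || *prefix == '\0')
        return 0;
    return strncmp(path, prefix, strlen(prefix)) == 0;
}

/* Add O_DIRECT only to opens that can write to a selected file. */
static int adjust_flags(const char *path, int flags)
{
    int acc = flags & O_ACCMODE;

    if ((acc == O_WRONLY || acc == O_RDWR) && wants_direct(path))
        flags |= O_DIRECT;
    return flags;
}

/* Interposed open(): look up the next open() in link order (libc's),
 * then forward the call with the adjusted flags. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    mode_t mode = 0;

    if (flags & O_CREAT) {      /* the mode argument is only present with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    if (real_open == NULL)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
    assert(real_open != NULL);  /* sketch-level error handling: abort if not found */

    return real_open(path, adjust_flags(path, flags), mode);
}
```

Assuming the output file lives under /scratch/out and the binary is
called vendor_app (both made up), it would be run roughly as:
ODIRECT_PREFIX=/scratch/out LD_PRELOAD=./odirect_shim.so ./vendor_app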

2004-03-26 11:16:27

by Christoph Hellwig

Subject: Re: Large memory application exhausts buffers during write.

On Fri, Mar 26, 2004 at 04:43:03PM +0530, Pravin Nanaware , Gurgaon wrote:
> [translated from Hindi] Hey, no courier has arrived yet, man. I don't
> know what he is doing.

If you want to contact me please use English or German in emails :)

2004-03-26 14:10:39

by John Stoffel

Subject: Re: Large memory application exhausts buffers during write.


Robin> What we see happen is the first approx 30GBs gets written and then
Robin> swap starts getting utilized. Once swap has been heavily utilized,
Robin> the OOM killer kicks in and kills the job.

Can you post the output of 'vmstat 1' starting just before your app is
fired up, and until OOM kicks in and kills it off? And more details
on exactly which version of Linux kernel you're running and how the
system is configured as well.

John