2006-12-04 18:47:53

by Horst H. von Brand

Subject: Re: la la la la ... swappiness

Aucoin <[email protected]> wrote:
> From: Horst H. von Brand [mailto:[email protected]]
> > That means that there isn't a need for that memory at all (and so they

> In the current isolated non-production, not actually bearing a load test
> case yes. But if I can't get it to not swap on an idle system I have no hope
> of avoiding OOM on a loaded system.

How do you /know/ it won't just be recycled in the production case?

> > In any case, how do you know it is the tar data that stays around, and not
> > just that the number of pages "in use" stays roughly constant?
>
> I'm not dumping the contents of memory so I don't.

OK.

> > - What you are doing, step by step
>
> Trying to deliver a high availability, linearly scalable, clustered iSCSI
> storage solution that can be upgraded with minimum downtime.

That is your ultimate goal, not what you are doing, step by step.

> > - What are your exact requirements

> OOM not to kill anything.

Can't ever guarantee that (unless you have the exact memory requirements
beforehand, and enough RAM for the worst case).

> > - In what exact way is it misbehaving. Please tell /in detail/ how you

> OOM kills important stuff.

What "important stuff"? How come OOM kills it, when there is plenty of
free(able) memory around? Is this in the production setting, or are you
just afraid it could happen by what you see in the "current isolated
non-production, not actually bearing a load test" case?
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria +56 32 2654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513




2006-12-04 21:43:50

by Aucoin

Subject: RE: la la la la ... swappiness

> From: Horst H. von Brand [mailto:[email protected]]
> How do you /know/ it won't just be recycled in the production case?

The production case is where OOM fires and kills things. I can only assume
memory is not being freed fast enough; otherwise OOM wouldn't get so upset.

> That is your ultimate goal, not what you are doing, step by step.

It's half a million-plus lines of code, so there are a lot of steps. Beyond
saying that we create a 1.6GB shared memory segment up front, then load the
high availability iSCSI application, start I/O with some number of clients and
then launch an update, I'm not sure what detail you're looking for. Linus
seems to have the best summary of the problem so far saying that we have a
2GB system with 1.6GB dedicated and we want the OS to pretend there's only
400MB of memory.
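
To make the arithmetic in that summary concrete, here is a minimal sketch in
Python. It is an illustration only, not part of the application discussed in
this thread. The helper names (parse_meminfo, memory_budget_kb) and the sample
figures are hypothetical; the sample uses round numbers (2000MB total, 1600MB
segment) to match the 2GB / 1.6GB / 400MB figures quoted above. The field
names MemTotal, MemFree, Buffers and Cached are the ones found in
/proc/meminfo.

```python
# Hypothetical sample in /proc/meminfo format; the figures are made up
# for illustration and are not measurements from the system in this thread.
SAMPLE_MEMINFO = """\
MemTotal:      2048000 kB
MemFree:         51200 kB
Buffers:         20480 kB
Cached:        1843200 kB
"""

# Assumed size of the up-front shared memory segment (1600MB, in kB).
SHM_SEGMENT_KB = 1600 * 1024


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict mapping key -> value in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Key: value'
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_budget_kb(info, shm_segment_kb):
    """Return (kB left once the segment is set aside, optimistic freeable kB).

    The second value is only a rough upper bound: on many kernels
    shared-memory pages are themselves counted under Cached, and those
    cannot simply be dropped the way clean page cache can.
    """
    left_over = info["MemTotal"] - shm_segment_kb
    freeable = info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
    return left_over, freeable


if __name__ == "__main__":
    # On a live Linux system one could pass open("/proc/meminfo").read() instead.
    info = parse_meminfo(SAMPLE_MEMINFO)
    left_over, freeable = memory_budget_kb(info, SHM_SEGMENT_KB)
    print("left for everything else: %d MB" % (left_over // 1024))
    print("free + buffers + cached (upper bound): %d MB" % (freeable // 1024))
```

Comparing the two numbers over time, rather than dumping memory contents, is
one way to tell whether the page count "in use" merely stays constant or
whether reclaimable memory is actually running out before OOM fires.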