2000-11-07 18:21:44

by Jeff Merkey

[permalink] [raw]
Subject: sendmail has low memory problems on 2.4.0 - Working Sets Required?


I might be missing something, but with final load testing of NWFS on
2.4, I occasionally see sendmail fail to accept connections on a live
internet server when low memory conditions are present. I reviewed the
buffer cache, and it does appear that periodically buffer cache pages
get freed and the buffer cache shrinks when memory gets low, however,
the code that does this gets kicked by calling sync() functions and
seems to happen in bdflush.

I've plummed NWFS to mimic the buffer cache in-kernel with the same
memory calls. The size the cache could grow to in 2.2.X was based on
some global variables that looked at high and low water marks before
freeing pages out of the buffer cache. In 2.4, the bdflush daemon
appears to do this and comically, has a CHECK_EMERGENCY_SYNC macro it
calls, then frantically attempts to flush and free pages.

I have found that even with this in place, I am still seeing sendmail
and other internet daemons start dropping connections when heavy I/O
loading is occurring, like an NFS copy over 100 mbit ethernet while a
large numberof ftp downloads are occuring simultaneously.

I am wondering if anyone has or considered a registration funtion to
provide working set-like functinality, so Rik's VM could signal (by
calling an API) into the buffer cache and my LRU to brute-force release
pages and ask the buffer cache or some other registered agent to return
pages when memory becomes low. UnixWare uses working sets and allows
Cache agents (like the buffer cache and the NWFS LRU) to register a
working set callback to request free pages when memory gets low. The VM
performance has been getting better in 2.4, but this low memory problem
has gotten worse for some reason. I am seeing sendmail drop and refuse
connections more often than seems to happen on 2.2.X.

The way the code in structured, it appears that bdflush has to get woken
up before the buffer cache can free memory rather than allowing an agent
to come in via an API and just take it. I have found that when my AMD
650Mhz server gets realy busy, the method being used in bdflush isn't
working as intended because it does not seem to be getting cycles.

The console lockup problem is also apparently here, because bdflush is
masking off signals then trying to flush all the dirty pages to free up
memory -- causing the server to process stuff in "waves" (king of like a
car with an engine that has a bad spark plug or two lurching along the
road as it stalls and lurches, stalls and lurches -- rather than
smoothly cycling memory through the box.

I can work around for now, but perhaps I am missing something. I would
like to know if there's a better way to get signals from the OS about
returning pages. Having something as simple as a global event registry
for page *return_a_page_if_you_can() function for consumers of a lot of
pages would be very helpful and allow me to return memory more smoothly
than guessing and grepping around with global high and low water marks
in the OS.