2008-06-19 18:46:30

by Bruce Fields

Subject: Re: CONFIG_DEBUG_SLAB_LEAK omits size-4096 and larger?

On Thu, Jun 19, 2008 at 10:53:28AM -0500, Weathers, Norman R. wrote:
> The kernel that we were really seeing the problem with was, but
> I think we may have figured out the 4096 problem, and it was probably a
> mistake on my part, but it is important for the NFS users to see it so
> they don't make the same mistake. I had found some performance tuning
> guides, and in trying some of the suggestions, found that the setting
> changes did seem to help on some things, but of course I never got to
> run a check under full load (800+ clients). A suggestion was to change
> the tcp_reordering tunable under /proc/sys/net/ipv4 from the default 3
> to 127. We think that this was actually causing the issue. I was able
> to trace back through all of the changes, and I changed this setting
> back to the default 3, and it immediately fixed the size-4096 hell. It
> appears that the reordering just eats into the memory, especially in
> high demand situations, and I guess that should make perfect sense if we
> are actually buffering up packets for reorder, and we are slamming the
> box with thousands of requests per minute.

OK, sounds plausible, though I won't pretend to understand exactly how
that reordering code is using memory.
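
For anyone who wants to check whether their own servers carry the same change, here is a minimal sketch in Python. The path and the default of 3 are taken from the report above; the helper names (read_tunable, check_reordering) are made up for illustration and are not part of any existing tool.

```python
from pathlib import Path

# Path and default value as quoted in the report above.
TCP_REORDERING = Path("/proc/sys/net/ipv4/tcp_reordering")
DEFAULT_REORDERING = 3


def read_tunable(path=TCP_REORDERING):
    """Return the integer stored in a single-value /proc/sys file."""
    return int(Path(path).read_text().strip())


def check_reordering(path=TCP_REORDERING, default=DEFAULT_REORDERING):
    """Return (value, is_default) for the tunable at `path`."""
    value = read_tunable(path)
    return value, value == default


if __name__ == "__main__":
    if TCP_REORDERING.exists():
        value, is_default = check_reordering()
        note = "default" if is_default else f"changed from default {DEFAULT_REORDERING}"
        print(f"tcp_reordering = {value} ({note})")
    else:
        print(f"{TCP_REORDERING} not found (probably not a Linux host)")
```

This only reports the current value; it does not change anything, and it says nothing about whether a non-default value is actually the cause of memory growth on a given box.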

> We still have other performance issues now, but it appears to be more of
> a bottleneck: the nodes do not appear to be backing off when the servers
> are becoming congested.
>
> > So with that many clients all making requests to the server at once,
> > we'd start hitting that (serv->sv_nrthreads+3)*20 limit when
> > the number
> > of threads was set to less than 30-50. That doesn't seem to be the
> > point where you're seeing a change in behavior, though.
> >
> We were estimating that between 40 and 50 threads was the cutoff for being
> able to service all of the (current) requests at once. I haven't ramped
> back up to that level yet. I wasn't comfortable yet with letting it all
> hang back out just in case we get into that hellish mode again; it can
> be a pain to try and get into those systems once they are overloaded
> (even over serial, sometimes the login can just time out). We had to
> actually bring online a second option to help alleviate some of the back
> congestion because the servers couldn't handle the workload.
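
To make the arithmetic behind the 30-50 thread estimate concrete, here is a rough sketch of the (serv->sv_nrthreads+3)*20 expression quoted above. It assumes every client counts once against the limit and that the limit is hit only when the client count exceeds that value; the function names are hypothetical and this is not the kernel code itself.

```python
def conn_limit(nrthreads):
    """Value of (sv_nrthreads + 3) * 20 for a given thread count."""
    return (nrthreads + 3) * 20


def min_threads(clients):
    """Smallest thread count whose limit is at least `clients`."""
    needed = (clients + 19) // 20 - 3  # ceil(clients / 20) - 3
    return max(0, needed)


if __name__ == "__main__":
    for clients in (800, 1000):
        n = min_threads(clients)
        print(f"{clients} clients: {n} threads, limit {conn_limit(n)}")
```

Under those assumptions, 800 clients need 37 threads (limit 800) and 1000 clients need 47 (limit 1000), which is in line with both the 30-50 range and the 40-50 estimate.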

Thanks for the update, and let us know if you figure out anything more.