2001-07-31 16:05:51

by Florian Weimer

Subject: [2.2.13] memory leak in NFS if a server goes away?

We've received a hard disk for post-mortem analysis because of a
strange hole of several months in the system logs, and it seems the
system's syslogd was killed by the VM subsystem during an OOM
situation.

The problems seem to begin at the following syslog event:

kernel: nfs: server sun not responding, still trying

Periodically (every quarter of an hour), the following messages appear:

kernel: nfs: task 111 can't get a request slot

(The task number is monotonically increasing.)

This goes on for about two days, after which the VM subsystem starts
killing processes (kdm first, then several times the X server, and
finally syslogd itself).
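
The message pattern above can be checked mechanically against whatever
logs survive. What follows is only a minimal sketch in Python, not
something taken from the affected system: the function names are made
up for illustration, and the regular expression assumes the message
text is exactly as quoted above.

```python
import re

# Matches the kernel message quoted above and captures the task number.
# Assumption: the wording in the log is exactly "nfs: task N can't get a request slot".
SLOT_RE = re.compile(r"nfs: task (\d+) can't get a request slot")


def request_slot_tasks(lines):
    """Return the task numbers from the request-slot messages, in log order."""
    tasks = []
    for line in lines:
        match = SLOT_RE.search(line)
        if match:
            tasks.append(int(match.group(1)))
    return tasks


def strictly_increasing(values):
    """Return True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))
```

Feeding the lines of the saved log file to request_slot_tasks() and
passing the result to strictly_increasing() would confirm (or refute)
that the task number only ever goes up.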

Are there some known issues with 2.2.13, for example, a memory leak in
the NFS code which is triggered in this specific situation?

(The kernel seems to have some SuSE-specific patches, for example, the
X server is sent the TERM signal, not the KILL signal on OOM. Perhaps
I should ask the SuSE folks if there were any NFS peculiarities in
their 2.2.13 version. :-/)

--
Florian Weimer [email protected]
University of Stuttgart http://cert.uni-stuttgart.de/
RUS-CERT +49-711-685-5973/fax +49-711-685-5898


2001-07-31 21:25:13

by Matthias Andree

Subject: Re: [2.2.13] memory leak in NFS if a server goes away?

Florian Weimer wrote on Tuesday, 31 July 2001:

> We've received a hard disk for post-mortem analysis because of a
> strange hole of several months in the system logs, and it seems the
> system's syslogd was killed by the VM subsystem during an OOM
> situation.

If syslogd is crucial, list it in inittab so that init restarts it
when it is killed.

> Are there some known issues with 2.2.13, for example, a memory leak in
> the NFS code which is triggered in this specific situation?

There are known security threats in 2.2.x which have been fixed in
2.2.16, and there have been VM fixes in a later version, and further
minor but numerous security fixes in 2.2.19.

Update to 2.2.19. Don't bother analyzing 2.2.13; no one fixes that
nowadays unless the problem persists in 2.2.19.