2010-08-10 06:05:56

by Mikael Abrahamsson

Subject: 2.6.32 swapper allocation failure with plenty of memory available


Hi.

Yesterday my Ubuntu 10.04 machine with their 2.6.32 (amd64) kernel, under
a lot of disk IO and network stress, stopped responding. I thought it had
frozen completely, but ~2 hours later it came back to life.

When I logged in I saw a lot of "swapper allocation failure" and r8169
timeouts in dmesg (first time I've seen this cause network instability
like this, but it's also the first motherboard I've tested with that has an
r8169 NIC).

I've had this problem before with older kernels on other hardware
<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/296275>, and it
seems related to having a lot of TCP sessions up moving data, in
conjunction with pretty aggressive TCP tuning for a large bandwidth-delay
product (TCP memory limits of 4-8 megs set with sysctl).

The machine has 8 gigs of RAM (Core i5 + P7H57D-V EVO motherboard) and was
running programs which were using ~2 gigs of memory, so most of the memory
was used for buffers and disk cache.

Unless this has been fixed since 2.6.32, I suspect it's still a problem
even in newer kernels because the behaviour seems to have been present
since at least 2.6.24. Generally, tuning down the TCP wmem and rmem etc to
~1 megabyte makes the problem go away.
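
As a rough illustration of that workaround, here is a minimal Python sketch
that caps the "max" field of a tcp_rmem/tcp_wmem triplet at 1 MiB and prints
lines in sysctl.conf style. The starting values in the example are
hypothetical (picked to match the 4-8 meg range mentioned above), not the
settings from this machine, and the helper names are made up for the sketch.

```python
# Sketch only: cap TCP buffer limits at ~1 MiB. The input values below are
# hypothetical examples, not the settings used on the affected machine.
ONE_MIB = 1024 * 1024


def cap_tcp_mem(triplet, limit=ONE_MIB):
    """Take a 'min default max' string and cap all fields so max <= limit."""
    minimum, default, maximum = (int(field) for field in triplet.split())
    maximum = min(maximum, limit)
    default = min(default, maximum)   # default must not exceed max
    minimum = min(minimum, default)   # min must not exceed default
    return f"{minimum} {default} {maximum}"


def sysctl_lines(current, limit=ONE_MIB):
    """Build 'key = value' lines from a dict of sysctl key -> triplet."""
    return [f"{key} = {cap_tcp_mem(value, limit)}"
            for key, value in current.items()]


if __name__ == "__main__":
    example = {
        "net.ipv4.tcp_rmem": "4096 87380 8388608",  # hypothetical 8 MiB max
        "net.ipv4.tcp_wmem": "4096 65536 8388608",  # hypothetical 8 MiB max
    }
    for line in sysctl_lines(example):
        print(line)
```

The printed lines could be placed in /etc/sysctl.conf or applied with
"sysctl -w"; whether 1 MiB is the right ceiling depends on the link and
workload.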

Please see attached dmesg file for more information.

--
Mikael Abrahamsson email: [email protected]


Attachments:
dmesg.100809-2.txt.gz (26.26 kB)

2010-08-10 09:37:12

by Benjamin Herrenschmidt

Subject: Re: 2.6.32 swapper allocation failure with plenty of memory available

On Tue, 2010-08-10 at 08:05 +0200, Mikael Abrahamsson wrote:
> Hi.
>
> Yesterday my Ubuntu 10.04 machine with their 2.6.32 (amd64) kernel, under
> a lot of disk IO and network stress stopped responding. I thought it had
> frozen completely, but ~2 hours later it came back to life.
>
> When I logged in I saw a lot of "swapper allocation failure" and r8169
> timeouts in dmesg (first time I've seen this cause network instability
> like this, but it's also the first motherboard I've tested with that has a
> r8169 NIC).
.../...

I noticed that on a completely different setup as well... 2.6.32 tends to
have a hard time servicing the skb allocations for demanding network
drivers.

Probably some threshold in the VM that might want tuning...

Cheers,
Ben.