2007-12-13 15:17:55

by Krzysztof Oledzki

Subject: Re: [Bug 9182] Critical memory leak (dirty pages)



On Mon, 3 Dec 2007, Thomas Osterried wrote:

> On the machine which has troubles, the bug occurred within about 10 days.
> During these days, the amount of dirty pages increased, up to 400MB.
> I have tested kernel 2.6.19, 2.6.20, 2.6.22.1 and 2.6.22.10 (with our config),
> and even linux-2.6.20 from ubuntu-server. They have all shown that behaviour.

<CUT>

> 10 days ago, I installed kernel 2.6.18.5 on this machine (with backported
> 3ware controller code). I'm quite sure that this kernel now fixes our
> severe stability problems on this production machine (currently:
> Dirty: 472 kB, nr_dirty 118).
> If so, it's the "latest" kernel I found usable, after half a year of pain.

Strange, my tests show that both 2.6.18(.8) and 2.6.19(.7) are OK and the
first bad kernel is 2.6.20.

BTW: Could someone please look at this problem? I feel a little ignored, and
in my situation this is a critical regression.

Best regards,

Krzysztof Olędzki


2007-12-13 15:44:43

by Peter Zijlstra

Subject: Re: [Bug 9182] Critical memory leak (dirty pages)


On Thu, 2007-12-13 at 16:17 +0100, Krzysztof Oledzki wrote:
> BTW: Could someone please look at this problem? I feel a little ignored, and
> in my situation this is a critical regression.

I was hoping to get around to it today, but I guess tomorrow will have
to do :-/

So, it's ext3: dirty some pages, sync, and Dirty doesn't fall to 0,
right?

Does it happen with other filesystems as well?

What are your ext3 mount options?


2007-12-13 16:17:57

by Krzysztof Oledzki

Subject: Re: [Bug 9182] Critical memory leak (dirty pages)



On Thu, 13 Dec 2007, Peter Zijlstra wrote:

> On Thu, 2007-12-13 at 16:17 +0100, Krzysztof Oledzki wrote:
>> BTW: Could someone please look at this problem? I feel a little ignored, and
>> in my situation this is a critical regression.
>
> I was hoping to get around to it today, but I guess tomorrow will have
> to do :-/

Thanks.

> So, it's ext3: dirty some pages, sync, and Dirty doesn't fall to 0,
> right?

Not only does it not fall, it grows continuously.

> Does it happen with other filesystems as well?

I don't know. I generally use only ext3, and I'm afraid I can't
switch this system to another filesystem.

> What are your ext3 mount options?
/dev/root / ext3 rw,data=journal 0 0
/dev/VolGrp0/usr /usr ext3 rw,nodev,data=journal 0 0
/dev/VolGrp0/var /var ext3 rw,nodev,data=journal 0 0
/dev/VolGrp0/squid_spool /var/cache/squid/cd0 ext3 rw,nosuid,nodev,noatime,data=writeback 0 0
/dev/VolGrp0/squid_spool2 /var/cache/squid/cd1 ext3 rw,nosuid,nodev,noatime,data=writeback 0 0
/dev/VolGrp0/news_spool /var/spool/news ext3 rw,nosuid,nodev,noatime,data=ordered 0 0
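
Since the leak may depend on the journaling mode, it can help to list the mode for each ext3 mount. A minimal sketch, assuming the mount table is in /proc/mounts format as shown above; `ext3_data_modes` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# ext3_data_modes [FILE]: print "mountpoint mode" for every ext3 entry.
# FILE defaults to /proc/mounts; the function name is hypothetical.
ext3_data_modes() {
    awk '$3 == "ext3" {
        mode = "unspecified"              # no data= option listed
        n = split($4, opts, ",")          # field 4 holds the mount options
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^data=/)
                mode = substr(opts[i], 6) # strip the leading "data="
        print $2, mode
    }' "${1:-/proc/mounts}"
}
```

Calling `ext3_data_modes` with no argument reads the live mount table; passing a saved copy makes it easy to compare machines.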

Best regards,

Krzysztof Olędzki

2007-12-15 12:34:50

by Krzysztof Oledzki

Subject: Re: [Bug 9182] Critical memory leak (dirty pages)



BTW: this regression also exists in 2.6.24-rc5. I'll try to find when it
was introduced, but that is hard to do on a highly critical production
system, especially since it takes ~2h after a reboot to be sure.

Still, 2h is a fairly short wait: on other systems I have to wait ~2 months
to accumulate ~20MB of leaked memory:

# uptime
13:29:34 up 58 days, 13:04, 9 users, load average: 0.38, 0.27, 0.31

# sync;sync;sleep 1;sync;grep Dirt /proc/meminfo
Dirty: 23820 kB
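
The check above can be wrapped in two small functions so that the residual Dirty value is easy to log over time. A minimal sketch, assuming a Linux /proc/meminfo; `dirty_kb` and `dirty_after_sync` are hypothetical helper names:

```shell
#!/bin/sh
# dirty_kb [FILE]: print the Dirty value in kB from a meminfo-style file.
# FILE defaults to /proc/meminfo.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' "${1:-/proc/meminfo}"
}

# dirty_after_sync: flush twice with a short pause, then report what is
# still counted as dirty. On a healthy idle system this should be near 0.
dirty_after_sync() {
    sync; sleep 1; sync
    dirty_kb
}
```

Running `echo "$(date +%s) $(dirty_after_sync)"` from cron every few minutes would give a time series showing whether the residue only grows.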

Best regards,

Krzysztof Olędzki

2007-12-15 21:54:53

by Krzysztof Oledzki

Subject: Re: [Bug 9182] Critical memory leak (dirty pages)


http://bugzilla.kernel.org/show_bug.cgi?id=9182


On Sat, 15 Dec 2007, Krzysztof Oledzki wrote:

> BTW: this regression also exists in 2.6.24-rc5. I'll try to find when it was
> introduced, but that is hard to do on a highly critical production system,
> especially since it takes ~2h after a reboot to be sure.
>
> Still, 2h is a fairly short wait: on other systems I have to wait ~2 months to
> accumulate ~20MB of leaked memory:
>
> # uptime
> 13:29:34 up 58 days, 13:04, 9 users, load average: 0.38, 0.27, 0.31
>
> # sync;sync;sleep 1;sync;grep Dirt /proc/meminfo
> Dirty: 23820 kB

More news. I hope my problem gets more attention from developers this time,
since I now have much more information.

So far I found that:
- 2.6.20-rc4 - bad: http://bugzilla.kernel.org/attachment.cgi?id=14057
- 2.6.20-rc2 - bad: http://bugzilla.kernel.org/attachment.cgi?id=14058
- 2.6.20-rc1 - OK (probably; I need to wait a little longer to be 100% sure).

2.6.20-rc1 with 33m uptime:
~$ grep Dirt /proc/meminfo ;sync ; sleep 1 ; sync ; grep Dirt /proc/meminfo
Dirty: 10504 kB
Dirty: 0 kB

2.6.20-rc2 was released Dec 23/24 2006 (BAD)
2.6.20-rc1 was released Dec 13/14 2006 (GOOD?)

It seems that this bug was introduced exactly one year ago. Surprisingly,
dirty memory in 2.6.20-rc2/2.6.20-rc4 leaks _much_ faster than in
2.6.20-final and later kernels: it took only about 6h to reach 172MB.
So the bug may have been partially mitigated afterward, but only slightly.
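
To put rough numbers on "much faster", the figures quoted in this thread can be turned into leak rates. A minimal sketch, assuming 1 MB = 1024 kB and rounding the 58 days, 13:04 uptime to 1405 hours; `leak_rate_kb_per_hour` is a hypothetical helper. The two figures come from different machines and kernels, so the comparison is only indicative:

```shell
#!/bin/sh
# leak_rate_kb_per_hour KB HOURS: print the average growth in kB per hour,
# rounded to the nearest integer.
leak_rate_kb_per_hour() {
    awk -v kb="$1" -v h="$2" 'BEGIN { printf "%.0f\n", kb / h }'
}

# 172 MB in about 6 h on 2.6.20-rc2/rc4 (172 * 1024 = 176128 kB):
#   leak_rate_kb_per_hour 176128 6      # prints 29355
# 23820 kB in about 1405 h on the slow-leaking box:
#   leak_rate_kb_per_hour 23820 1405    # prints 17
```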

There are three commits that may be somehow related:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=3e67c0987d7567ad666641164a153dca9a43b11d
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=5f2a105d5e33a038a717995d2738434f9c25aed2

I'm going to check the 2.6.20-rc1-git... releases, but it would be *very* nice
if someone could finally give me a hand and offer some hints to help
debug this problem.
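
One way to narrow the rc1..rc2 window is `git bisect`, using the residual Dirty value as the good/bad test. A minimal sketch, assuming a mainline git tree with the `v2.6.20-rc1` and `v2.6.20-rc2` tags and that rc1 really is good; `bisect_verdict` and its threshold are hypothetical, and each step still needs a build, a reboot, and a few hours of workload before the measurement means anything:

```shell
#!/bin/sh
# bisect_verdict KB [THRESHOLD_KB]: print "good" if the Dirty value left
# after sync is at or below the threshold (default 0 kB), else "bad".
bisect_verdict() {
    if [ "$1" -le "${2:-0}" ]; then
        echo good
    else
        echo bad
    fi
}

# Workflow (run by hand, inside the kernel source tree):
#   git bisect start
#   git bisect bad  v2.6.20-rc2
#   git bisect good v2.6.20-rc1
# Then for each kernel git checks out: build, boot, run the workload, and
#   kb=$(sync; sleep 1; sync; awk '$1 == "Dirty:" { print $2 }' /proc/meminfo)
#   git bisect "$(bisect_verdict "$kb")"
```

A nonzero threshold may be needed if something on the system is always writing; 0 matches the "Dirty: 0 kB" result seen on 2.6.20-rc1.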

Please note that none of my systems with kernels newer than 2.6.20-rc1 is able to
reach 0 kB of dirty memory, even after many syncs, even when idle.

Best regards,

Krzysztof Olędzki