2003-08-26 12:18:58

by Måns Rullgård

Subject: Strange memory usage reporting


I was a little surprised to see top tell me this:

  PID USER  PR NI  VIRT RES  SHR S %CPU %MEM   TIME+ COMMAND
10642 mru   11  0 23200 81m 2740 S  0.0 37.0 0:00.07 tcvp

It didn't make sense that RES > VIRT, so I checked /proc/pid/*. Their
contents are below. Am I missing something? Note that they are not
consistent with the 'top' line above, since they were copied at a
different time. The effect is easily reproducible: it happens every
time I run my music player using ALSA.

The memory usage summary from top doesn't agree either:

Mem: 224140k total, 219236k used, 4904k free, 7184k buffers
Swap: 524280k total, 224k used, 524056k free, 125976k cached

Subtracting the cached memory and some heavy processes doesn't leave
room for another 80 MB.

/proc/pid/stat:
10642 (tcvp) S 926 10642 926 34816 10642 0 68 0 2556 0 4 3 0 0 11 0 0 0 1162589 25665536 18902 4294967295 134512640 134523016 3221216832 3221215328 1074519393 0 0 0 2 3222307553 0 0 17 0 0 0

/proc/pid/statm:
6266 18902 685 4 0 6262 0

/proc/pid/status:
Name: tcvp
State: S (sleeping)
Tgid: 10642
Pid: 10642
PPid: 926
TracerPid: 0
Uid: 51770 51770 51770 51770
Gid: 100 100 100 100
FDSize: 256
Groups: 100 102
VmSize: 25064 kB
VmLck: 0 kB
VmRSS: 75608 kB
VmData: 22308 kB
VmStk: 16 kB
VmExe: 12 kB
VmLib: 2492 kB
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 8000000000000000
SigCgt: 0000000380000002
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
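
The figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes 4 kB pages (usual on i386, but not stated in the post) and that the first two statm fields are total size and resident pages. The helper names are made up for this example.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as is usual on i386


def statm_to_kb(statm_line, page_kb=PAGE_KB):
    """Convert the first two statm fields (size, resident) from pages to kB."""
    fields = statm_line.split()
    size_pages = int(fields[0])
    resident_pages = int(fields[1])
    return size_pages * page_kb, resident_pages * page_kb


def rss_exceeds_vsize(statm_line, page_kb=PAGE_KB):
    """True when the resident set is reported as larger than the address space."""
    size_kb, rss_kb = statm_to_kb(statm_line, page_kb)
    return rss_kb > size_kb


# The statm line quoted in the post.
sample = "6266 18902 685 4 0 6262 0"
size_kb, rss_kb = statm_to_kb(sample)
print(size_kb, rss_kb, rss_exceeds_vsize(sample))  # 25064 75608 True
```

Under those assumptions the statm values reproduce VmSize (25064 kB) and VmRSS (75608 kB) from /proc/pid/status exactly, so the three files agree with each other and the odd number is the resident count itself.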



--
Måns Rullgård
[email protected]


2003-08-26 12:36:20

by Måns Rullgård

Subject: Re: Strange memory usage reporting

William Lee Irwin III <[email protected]> writes:

> On Tue, Aug 26, 2003 at 02:18:54PM +0200, Måns Rullgård wrote:
>> I was a little surprised to see top tell me this:
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 10642 mru 11 0 23200 81m 2740 S 0.0 37.0 0:00.07 tcvp
>> It didn't make sense that RES > VIRT, so I checked /proc/pid/*. Their
>> contents are below. Am I missing something? Note that they are not
>> consistent with the 'top' line above, since they were copied at a
>> different time. The effect is easily reproducible: it happens every
>> time I run my music player using ALSA.
>> The memory usage summary from top doesn't agree either:
>
> What kernel version?

Sorry, I forgot that. It's 2.6.0-test4 with Nick Piggin's v7
scheduler patch. The machine I'm running on is a Pentium 4 based
laptop.

--
Måns Rullgård
[email protected]

2003-08-26 12:26:17

by William Lee Irwin III

Subject: Re: Strange memory usage reporting

On Tue, Aug 26, 2003 at 02:18:54PM +0200, Måns Rullgård wrote:
> I was a little surprised to see top tell me this:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 10642 mru 11 0 23200 81m 2740 S 0.0 37.0 0:00.07 tcvp
> It didn't make sense that RES > VIRT, so I checked /proc/pid/*. Their
> contents are below. Am I missing something? Note that they are not
> consistent with the 'top' line above, since they were copied at a
> different time. The effect is easily reproducible: it happens every
> time I run my music player using ALSA.
> The memory usage summary from top doesn't agree either:

What kernel version?


-- wli

2003-08-26 13:05:26

by Jaroslav Kysela

Subject: Re: Strange memory usage reporting

On Tue, 26 Aug 2003, Måns Rullgård wrote:

> I was a little surprised to see top tell me this:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 10642 mru 11 0 23200 81m 2740 S 0.0 37.0 0:00.07 tcvp
>
> It didn't make sense that RES > VIRT, so I checked /proc/pid/*. Their
> contents are below. Am I missing something? Note that they are not
> consistent with the 'top' line above, since they were copied at a
> different time. The effect is easily reproducible: it happens every
> time I run my music player using ALSA.

I see exactly the same behaviour with the 2.4.21 kernel. It seems that VmRSS
grows with the mmap2 syscalls even though the matching munmap is called. I'm
investigating a possible problem with the memory accounting.

Jaroslav

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs

2003-08-26 14:01:01

by Jaroslav Kysela

Subject: Re: Strange memory usage reporting

On Tue, 26 Aug 2003, Jaroslav Kysela wrote:

> On Tue, 26 Aug 2003, Måns Rullgård wrote:
>
> > I was a little surprised to see top tell me this:
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 10642 mru 11 0 23200 81m 2740 S 0.0 37.0 0:00.07 tcvp
> >
> > It didn't make sense that RES > VIRT, so I checked /proc/pid/*. Their
> > contents are below. Am I missing something? Note that they are not
> > consistent with the 'top' line above, since they were copied at a
> > different time. The effect is easily reproducible: it happens every
> > time I run my music player using ALSA.
>
> I see exactly the same behaviour with the 2.4.21 kernel. It seems that VmRSS
> grows with the mmap2 syscalls even though the matching munmap is called. I'm
> investigating a possible problem with the memory accounting.

Yes, it seems so. The do_no_page() function in mm/memory.c counts
reserved pages (++mm->rss), but zap_pte_range() has a check that keeps
reserved pages out of the count of freed pages, so rss is never decremented
for them.

Here is a patch for VM gurus to review (for 2.4 kernel, but it should
apply to 2.6 as well):

===== mm/memory.c 1.57 vs edited =====
--- 1.57/mm/memory.c	Fri Jun 13 18:26:23 2003
+++ edited/mm/memory.c	Tue Aug 26 15:33:28 2003
@@ -1306,7 +1306,8 @@
 	 */
 	/* Only go through if we didn't race with anybody else... */
 	if (pte_none(*page_table)) {
-		++mm->rss;
+		if (!PageReserved(new_page))
+			++mm->rss;
 		flush_page_to_ram(new_page);
 		flush_icache_page(vma, new_page);
 		entry = mk_pte(new_page, vma->vm_page_prot);
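
To make the imbalance concrete, here is a toy user-space model of the two paths described above. It is not kernel code: the class and function names are invented for illustration, and the only behaviour it assumes is what this message states (the fault path always increments rss, the unmap path skips reserved pages).

```python
class Mm:
    """Stand-in for the per-process memory descriptor; only tracks rss."""

    def __init__(self):
        self.rss = 0


def fault_in(mm, page_reserved, patched):
    """Models the accounting done on a page fault (do_no_page)."""
    if patched and page_reserved:
        return  # patched behaviour: reserved pages are not counted
    mm.rss += 1


def zap(mm, page_reserved):
    """Models the accounting done on unmap (zap_pte_range)."""
    if not page_reserved:
        mm.rss -= 1  # reserved pages are never counted as freed


def leaked_rss(cycles, pages, reserved, patched):
    """Map, touch and unmap `pages` pages `cycles` times; return final rss."""
    mm = Mm()
    for _ in range(cycles):
        for _ in range(pages):
            fault_in(mm, reserved, patched)
        for _ in range(pages):
            zap(mm, reserved)
    return mm.rss


print(leaked_rss(cycles=3, pages=4, reserved=True, patched=False))  # 12
print(leaked_rss(cycles=3, pages=4, reserved=True, patched=True))   # 0
print(leaked_rss(cycles=3, pages=4, reserved=False, patched=False)) # 0
```

In this model every mmap/munmap cycle over reserved pages leaves rss higher by the number of pages touched, which matches the observation that VmRSS keeps growing while VmSize does not.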


Jaroslav

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs

2003-08-26 17:01:35

by Hugh Dickins

Subject: Re: Strange memory usage reporting

On Tue, 26 Aug 2003, Jaroslav Kysela wrote:
>
> Yes, it seems so. The do_no_page() function in mm/memory.c counts
> reserved pages (++mm->rss), but zap_pte_range() has a check that keeps
> reserved pages out of the count of freed pages, so rss is never decremented
> for them.
>
> Here is a patch for VM gurus to review (for 2.4 kernel, but it should
> apply to 2.6 as well):
>
> ===== mm/memory.c 1.57 vs edited =====
> --- 1.57/mm/memory.c Fri Jun 13 18:26:23 2003
> +++ edited/mm/memory.c Tue Aug 26 15:33:28 2003
> @@ -1306,7 +1306,8 @@
> */
> /* Only go through if we didn't race with anybody else... */
> if (pte_none(*page_table)) {
> - ++mm->rss;
> + if (!PageReserved(new_page))
> + ++mm->rss;
> flush_page_to_ram(new_page);
> flush_icache_page(vma, new_page);
> entry = mk_pte(new_page, vma->vm_page_prot);

You're right (but please rediff against 2.4.22 when you send Marcelo).

You may wonder how this has taken so long to show up: because usually
drivers which mmap Reserved pages use remap_page_range on them,
and so never fault to do_no_page.

Which is the driver involved? Though it's not wrong to give do_no_page
a Reserved page, beware of the page->count accounting: while it's
Reserved, get_page or page_cache_get raises the count, but put_page
or page_cache_release does not decrement it - very easy to end up
with the page never freed.
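
A toy model of that asymmetry, for illustration only. The Python names below merely echo the kernel helpers and are not the real implementations; the sequence of calls in driver_lifecycle() is a hypothetical driver, assumed here to show how the count can get stuck.

```python
class Page:
    """Stand-in for struct page: a reference count plus the Reserved flag."""

    def __init__(self, reserved=False):
        self.reserved = reserved
        self.count = 1       # reference held by whoever allocated the page
        self.freed = False


def get_page(page):
    """Taking a reference always raises the count, Reserved or not."""
    page.count += 1


def put_page(page):
    """Dropping a reference is a no-op while the page is Reserved."""
    if page.reserved:
        return
    page.count -= 1
    if page.count == 0:
        page.freed = True


def driver_lifecycle(reserved):
    """Hypothetical driver: map the page once, unmap it, then free it."""
    page = Page(reserved=reserved)
    get_page(page)           # fault handler takes a reference
    put_page(page)           # unmap drops it (ignored if Reserved)
    page.reserved = False    # driver clears Reserved before freeing
    put_page(page)           # driver drops its own reference
    return page


leaky = driver_lifecycle(reserved=True)
normal = driver_lifecycle(reserved=False)
print(leaky.count, leaky.freed)    # 1 False
print(normal.count, normal.freed)  # 0 True
```

In this sketch the Reserved page ends with one stray reference and is never freed, while the same sequence on an ordinary page balances out.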

Hugh

2003-08-27 13:09:20

by Ingo Oeser

Subject: Re: Strange memory usage reporting

Hi,

On Tue, Aug 26, 2003 at 06:03:14PM +0100, Hugh Dickins wrote:
> Which is the driver involved? Though it's not wrong to give do_no_page
> a Reserved page, beware of the page->count accounting: while it's
> Reserved, get_page or page_cache_get raises the count, but put_page
> or page_cache_release does not decrement it - very easy to end up
> with the page never freed.

Why is this so asymmetric? I would understand ignoring these pages
in the freeing logic, but why exclude them from refcounting as well?

Regards

Ingo Oeser

2003-08-27 14:44:14

by Hugh Dickins

Subject: Re: Strange memory usage reporting

On Wed, 27 Aug 2003, Ingo Oeser wrote:
> On Tue, Aug 26, 2003 at 06:03:14PM +0100, Hugh Dickins wrote:
> > Which is the driver involved? Though it's not wrong to give do_no_page
> > a Reserved page, beware of the page->count accounting: while it's
> > Reserved, get_page or page_cache_get raises the count, but put_page
> > or page_cache_release does not decrement it - very easy to end up
> > with the page never freed.
>
> Why is this so asymmetric? I would understand ignoring these pages
> in the freeing logic, but why exclude them from refcounting as well?

I don't think there's a _good_ reason, it just evolved that way.

The real answer is to get rid of PageReserved completely, which
I'll embark on again in 2.7 (I did start a couple of times in 2.5,
but each time it was too late).

There was a halfway-house suggestion in 2.5 about three months ago,
inspired (as usual) by Reserved page problems in AIO's get_user_pages,
to do as you suggest: subject them to normal refcounting. I don't
know what became of that; I didn't have much time to get involved.

Hugh