Hi,
I was a bit frustrated by the poor quality of the memory usage info
from top and ps, and decided to write my own utility.
One problem I don't know how to solve is how to avoid counting
twice (or more) memory used by processes which share a VM
(by use of the CLONE_VM flag to sys_clone).
I know how to detect and correctly account for threads
(processes created with CLONE_THREAD), but how do I detect non-threads
with a shared VM?
If this question is not clear enough, maybe the notes below and the attached
program for reading /proc/PID/* memory stats will help
explain it.
=========================
Shared VM detection
In Linux, processes can share a VM. Typically they are threads,
but that is not guaranteed.
In Linux, "threads" are processes which were created with the CLONE_THREAD
flag to clone(). They share a PID, a common parent and most signal handling.
The parent is signaled only when the last thread exits, not on every exit.
Each thread, though, has its own thread ID (TID).
Threads do not show up as /proc/PID, except for the "thread group leader"
(that is, the process which did the first cloning with CLONE_THREAD).
They are accessible through /proc/PID/task/TID.
Now, some peculiarities you may need to know.
Threads actually *are* accessible as /proc/TID too, they just aren't
visible in ls (the readdir/getdents syscalls don't return them)!
(Peculiar, but not very useful for mem accounting.)
Threads are always spawned with CLONE_VM too. You cannot do CLONE_THREAD
without CLONE_VM. This is enforced by the Linux kernel.
It means that they share the same VM. No COWing. And therefore you
don't need to go to /proc/PID/task/TID/* and scan info there to figure out
how much memory they use, and how. /proc/PID/* is enough.
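As an illustration of the /proc/PID/task layout described above, here is a
minimal sketch in C that counts a process's threads by listing that
directory. The function name count_task_dirs is made up for this example;
it is not taken from the attached program.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* Count the numeric entries (TIDs) under /proc/<pid>/task.
 * `pid` is a string such as "1234" or "self".
 * Returns the number of entries found, or -1 if the directory
 * cannot be opened. (Hypothetical helper, for illustration only.) */
int count_task_dirs(const char *pid)
{
    char path[64];
    DIR *dir;
    struct dirent *de;
    int count = 0;

    snprintf(path, sizeof(path), "/proc/%s/task", pid);
    dir = opendir(path);
    if (!dir)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        /* Skip "." and ".."; TID entries start with a digit. */
        if (isdigit((unsigned char)de->d_name[0]))
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling count_task_dirs("self") from a single-threaded program should
report one entry, the thread group leader itself.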
The inverse is not true! You can clone a process with CLONE_VM but
without CLONE_THREAD, and it will get a new PID and its own,
visible /proc/PID entry. This creates a problem: there is no way to
figure out that /proc/PID1 and /proc/PID2 correspond to two
processes which share a VM, and if you sum memory usage
over the whole of /proc/*, you will count their usage twice.
It would be nice to know how many such CLONE_VM'ed processes
share a VM with a given /proc/PID. We could then do accurate accounting
of memory by dividing all memory numbers of this process
by this number.
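To make the division concrete, here is a small sketch in C. The function
name per_sharer_kb is hypothetical, and the sharer count is exactly the
input that /proc does not currently provide, so the caller is assumed to
get it from somewhere else.

```c
/* Divide a memory figure (in kB) evenly among the processes that
 * share one VM. A count below 1 is treated as 1 so that a process
 * with unknown sharing is charged its full usage.
 * (Hypothetical helper, for illustration only.) */
long per_sharer_kb(long kb, int sharers)
{
    if (sharers < 1)
        sharers = 1;
    return kb / sharers;
}
```

With illustrative numbers: a 1240 kB mapping in a VM shared by two
processes would be charged as 620 kB to each, so summing over /proc/*
gives 1240 kB rather than 2480 kB.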
But this info seems to be unavailable. /proc/PID/status
has a "Threads: N" line, but it shows the number of threads,
i.e. the number we are NOT interested in, because we can
automatically account for them by not scanning
/proc/PID/task/TID (and thus counting all threads' mem
usage only once, in the thread group leader).
"Threads: N" does not include processes created with
CLONE_VM but without CLONE_THREAD.
(NB: CLONE_SIGHAND does not seem to affect it either.)
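For reference, a minimal sketch in C of reading the "Threads: N" line
mentioned above. The function name read_status_threads is made up for
this example, and it assumes the line has the form "Threads:" followed
by whitespace and a decimal number.

```c
#include <stdio.h>

/* Return the value of the "Threads:" line in /proc/<pid>/status.
 * `pid` is a string such as "1234" or "self".
 * Returns -1 if the file cannot be opened or the line is missing.
 * (Hypothetical helper, for illustration only.) */
int read_status_threads(const char *pid)
{
    char path[64];
    char line[256];
    FILE *fp;
    int threads = -1;

    snprintf(path, sizeof(path), "/proc/%s/status", pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        /* Whitespace in the format also matches the tab after the colon. */
        if (sscanf(line, "Threads: %d", &threads) == 1)
            break;
    }
    fclose(fp);
    return threads;
}
```

This only reports CLONE_THREAD siblings; it says nothing about other
processes created with plain CLONE_VM.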
===========================
--
vda
Hi Denys,
On Mon, Aug 27, 2007 at 12:56:31PM +0100, Denys Vlasenko wrote:
> Hi,
>
> I was a bit frustrated by bad quality of memory usage info
> from top and ps, and decided to write my own utility.
>
> One problem I don't know how to solve is how to avoid counting
> twice (or more) memory used by processes which share VM
> (by use of the CLONE_VM flag to sys_clone).
>
> I know how to detect and correctly account for threads
> (processes created with CLONE_THREAD), but how to detect non-threads
> with shared VM?
There is a nice LWN article on this issue:
ELC: How much memory are applications really using?
http://lwn.net/Articles/230975/
Another helpful patch could be:
maps: PSS(proportional set size) accounting in smaps
http://lkml.org/lkml/2007/8/19/23
Fengguang
On Monday 27 August 2007 13:13, Fengguang Wu wrote:
> Hi Denys,
>
> On Mon, Aug 27, 2007 at 12:56:31PM +0100, Denys Vlasenko wrote:
> > Hi,
> >
> > I was a bit frustrated by bad quality of memory usage info
> > from top and ps, and decided to write my own utility.
> >
> > One problem I don't know how to solve is how to avoid counting
> > twice (or more) memory used by processes which share VM
> > (by use of the CLONE_VM flag to sys_clone).
> >
> > I know how to detect and correctly account for threads
> > (processes created with CLONE_THREAD), but how to detect non-threads
> > with shared VM?
>
> There is a nice LWN article on this issue:
> ELC: How much memory are applications really using?
> http://lwn.net/Articles/230975/
>
> Another helpful patch could be:
> maps: PSS(proportional set size) accounting in smaps
> http://lkml.org/lkml/2007/8/19/23
Thanks a lot, very useful pages indeed.
However they still don't explain how I can avoid counting memory
twice for /proc/PID1 and /proc/PID2 when PID2 is a child of PID1,
created with CLONE_VM.
The example: I allocate 1234k, dirty it, then clone with CLONE_VM.
I will seemingly have two processes, each using 1234k, _privately_
(i.e., pages are not shown as shared in smaps) -
which is technically correct, pages are not shared with other VMs,
but they ARE shared by means of these two processes having the same VM!
How can userspace tools figure out that these processes share a VM?
IOW: do we need "VMsharecount: N" in addition to "Threads: N"
in /proc/PID/status?
$ gcc clonetest.c
$ ./a.out
parent 21143 (21143)
clone returned 21144
child 21144 (21144)
<sleeps 1000 seconds>
On another console:
$ cp /proc/21143/smaps /tmp/1
$ cp /proc/21144/smaps /tmp/2
$ diff -u /tmp/1 /tmp/2 <============ smaps are the same!
$ ls -l /tmp/1 /tmp/2
-r--r--r-- 1 vda eng 2869 Aug 27 14:17 /tmp/1
-r--r--r-- 1 vda eng 2869 Aug 27 14:17 /tmp/2
This is the 1234k of memset'ed malloc in /proc/*/smaps:
f7eae000-f7fe4000 rw-p f7eae000 00:00 0
Size: 1240 kB
Rss: 1240 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 1240 kB
See? Any memory tool will conclude that 21143 is using 1240k here and 21144
uses another 1240k. But it's the same 1240k!
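The cp/diff comparison above can also be done from a program. The sketch
below, in C, compares two files byte by byte; the function name
same_contents is made up for this example. Note the assumption: identical
smaps output is only a hint that two PIDs might share a VM, not proof,
and the files can change between the two reads.

```c
#include <stdio.h>

/* Compare two files byte by byte.
 * Returns 1 if the contents are identical, 0 if they differ,
 * and -1 if either file cannot be opened.
 * (Hypothetical helper, for illustration only.) */
int same_contents(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "r");
    FILE *b = fopen(path_b, "r");
    int ca, cb, result;

    if (!a || !b) {
        if (a)
            fclose(a);
        if (b)
            fclose(b);
        return -1;
    }
    /* Read in lockstep until the bytes differ or both files end. */
    do {
        ca = fgetc(a);
        cb = fgetc(b);
    } while (ca == cb && ca != EOF);
    result = (ca == cb);
    fclose(a);
    fclose(b);
    return result;
}
```

For the run shown above, one would call
same_contents("/proc/21143/smaps", "/proc/21144/smaps").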
clonetest.c
===========
#define _GNU_SOURCE             /* needed for clone() */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
// Run this proggie, cd into /proc and explore there
// while it runs, erm, sleeps.
/* Defeat glibc "pid caching" */
#define GETPID() ((int)syscall(SYS_getpid))
#define GETTID() ((int)syscall(SYS_gettid))
char stack[8*1024];
int f(void *arg)
{
    (void)arg;
    printf("child %d (%d)\n", GETPID(), GETTID());
    sleep(1000);
    _exit(0);
}
int main(void)
{
    int n;
    void *p = malloc(1234*1024);
    if (!p)
        _exit(1);
    // Dirty the pages so that they show up in Rss
    memset(p, 1, 1234*1024);
    printf("parent %d (%d)\n", GETPID(), GETTID());
    // Create a process with shared VM, but not a thread.
    // The stack grows down, so pass an address inside the buffer.
    n = clone(f, stack + sizeof(stack)/2, CLONE_VM, 0);
    printf("clone returned %d\n", n);
    sleep(1000);
    _exit(0);
}
--
vda
On Mon, Aug 27, 2007 at 02:26:50PM +0100, Denys Vlasenko wrote:
> On Monday 27 August 2007 13:13, Fengguang Wu wrote:
> > Hi Denys,
> >
> > On Mon, Aug 27, 2007 at 12:56:31PM +0100, Denys Vlasenko wrote:
> > > Hi,
> > >
> > > I was a bit frustrated by bad quality of memory usage info
> > > from top and ps, and decided to write my own utility.
> > >
> > > One problem I don't know how to solve is how to avoid counting
> > > twice (or more) memory used by processes which share VM
> > > (by use of the CLONE_VM flag to sys_clone).
> > >
> > > I know how to detect and correctly account for threads
> > > (processes created with CLONE_THREAD), but how to detect non-threads
> > > with shared VM?
> >
> > There is a nice LWN article on this issue:
> > ELC: How much memory are applications really using?
> > http://lwn.net/Articles/230975/
> >
> > Another helpful patch could be:
> > maps: PSS(proportional set size) accounting in smaps
> > http://lkml.org/lkml/2007/8/19/23
>
> Thanks a lot, very useful pages indeed.
>
> However they still don't explain how I can avoid counting memory
> twice for /proc/PID1 and /proc/PID2 when PID2 is a child of PID1,
> created with CLONE_VM.
>
> The example: I allocate 1234k, dirty it, then clone with CLONE_VM.
> I will seemingly have two processes, each using 1234k, _privately_
> (i.e., pages are not shown as shared in smaps) -
> which is technically correct, pages are not shared with other VMs,
> but they ARE shared by means of these two processes having the same VM!
>
> How can userspace tools figure out that these processes share a VM?
>
> IOW: do we need "VMsharecount: N" in addition to "Threads: N"
> in /proc/PID/status?
A full solution would require two parameters, i.e. VmUsers/VmMagic.
But please make sure the new lines won't break important tools like
ps/top/pmap/...
A quick test shows that only ps will parse /proc/<pid>/status:
strace -e open ps
strace -e open top
strace -e open pmap $$
On Tuesday 28 August 2007 01:10, Fengguang Wu wrote:
> > > There is a nice LWN article on this issue:
> > > ELC: How much memory are applications really using?
> > > http://lwn.net/Articles/230975/
> > >
> > > Another helpful patch could be:
> > > maps: PSS(proportional set size) accounting in smaps
> > > http://lkml.org/lkml/2007/8/19/23
> >
> > Thanks a lot, very useful pages indeed.
> >
> > However they still don't explain how I can avoid counting memory
> > twice for /proc/PID1 and /proc/PID2 when PID2 is a child of PID1,
> > created with CLONE_VM.
> >
> > The example: I allocate 1234k, dirty it, then clone with CLONE_VM.
> > I will seemingly have two processes, each using 1234k, _privately_
> > (i.e., pages are not shown as shared in smaps) -
> > which is technically correct, pages are not shared with other VMs,
> > but they ARE shared by means of these two processes having the same VM!
> >
> > How can userspace tools figure out that these processes share a VM?
> >
> > IOW: do we need "VMsharecount: N" in addition to "Threads: N"
> > in /proc/PID/status?
>
> A full solution would require two parameters, i.e. VmUsers/VmMagic.
>
> But please make sure the new lines won't break important tools like
> ps/top/pmap/...
Should be safe - tools skip lines they do not recognize.
Ok, we have "Threads: N".
I can cook up a patch which adds a count of processes
which share a VM with us - it's just atomic_read(&current->mm->mm_users).
What name do you like?
SharedVmCount: N
VmUsers: N
other?
--
vda
On Tue, Aug 28, 2007 at 09:00:41PM +0100, Denys Vlasenko wrote:
> On Tuesday 28 August 2007 01:10, Fengguang Wu wrote:
> > > > There is a nice LWN article on this issue:
> > > > ELC: How much memory are applications really using?
> > > > http://lwn.net/Articles/230975/
> > > >
> > > > Another helpful patch could be:
> > > > maps: PSS(proportional set size) accounting in smaps
> > > > http://lkml.org/lkml/2007/8/19/23
> > >
> > > Thanks a lot, very useful pages indeed.
> > >
> > > However they still don't explain how I can avoid counting memory
> > > twice for /proc/PID1 and /proc/PID2 when PID2 is a child of PID1,
> > > created with CLONE_VM.
> > >
> > > The example: I allocate 1234k, dirty it, then clone with CLONE_VM.
> > > I will seemingly have two processes, each using 1234k, _privately_
> > > (i.e., pages are not shown as shared in smaps) -
> > > which is technically correct, pages are not shared with other VMs,
> > > but they ARE shared by means of these two processes having the same VM!
> > >
> > > How can userspace tools figure out that these processes share a VM?
> > >
> > > IOW: do we need "VMsharecount: N" in addition to "Threads: N"
> > > in /proc/PID/status?
> >
> > A full solution would require two parameters, i.e. VmUsers/VmMagic.
> >
> > But please make sure the new lines won't break important tools like
> > ps/top/pmap/...
>
> Should be safe - tools skip lines they do not recognize.
Except yours ;-)
FYI, I found another tool that depends on status: atop.
> Ok, we have "Threads: N".
>
> I can cook up a patch which adds count of processes
> which share VM with us - it's just atomic_read(&current->mm->mm_users).
Yeah, the code itself would be simple.
> What name do you like?
>
> SharedVmCount: N
> VmUsers: N
> other?
I'd prefer VmUsers: it matches the name used in the source code (mm_users).
Fengguang