2008-06-12 11:15:26

by Mark Seger

Subject: Latest release of collectl can now show top processes sorted by I/O

This felt significant enough to announce to a wider audience. If you
haven't heard of collectl before, see
http://collectl.sourceforge.net/index.html. Monitoring processes is
just a small part of what it can do. Besides the 'usual suspects'
such as cpu, network, disk and memory, it can also monitor less
common types of data such as nfs, slabs, lustre, quadrics, infiniband
and even interrupts by cpu! To read more about its capabilities, see
http://collectl.sourceforge.net/Features.html

The short story on I/O is that you can now run "collectl --top io" and
see a dynamically sorted list of the top I/O users, refreshed once a
second, assuming of course that you're using a kernel that supports
per-process I/O statistics. If you include --procopts t, you can see
the top threads as well.
Unfortunately there's a bug in the way I/O stats are currently
reported: if you look at just a process and not its threads, the
threads' I/O is not aggregated into the process totals, so you can see
I/O rates of 0 while the threads are working their little hearts out.
Andrea Righi has published a patch that corrects this. Both his patch
and my original bugzilla report can be found at the bottom of the page
that describes process monitoring in more detail,
http://collectl.sourceforge.net/Process.html.

One caveat, which I tried to describe more fully on the webpage: looking
for new processes/threads can be pretty labor intensive. By applying
appropriate filters or taking less frequent samples you can
significantly reduce the system load.
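
For anyone who wants to see the effect of the missing aggregate outside
of collectl, here is a rough Python sketch. It is not how collectl
itself is implemented. It assumes a kernel built with task I/O
accounting, so that counters such as read_bytes and write_bytes appear
in /proc/<pid>/io and /proc/<pid>/task/<tid>/io. The function names
(parse_io, process_io_with_threads, top_io) are made up for this
illustration. It sums the per-thread counters by hand and ranks
processes by the bytes moved between two snapshots.

```python
import os

def parse_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def read_io(path):
    """Return the counters in one io file, or {} if it cannot be read."""
    try:
        with open(path) as f:
            return parse_io(f.read())
    except OSError:
        # The task may have exited, or we may lack permission to read it.
        return {}

def process_io_with_threads(pid, proc="/proc"):
    """Sum the I/O counters of every thread of pid (assumed workaround
    for process totals that leave out the threads)."""
    total = {}
    task_dir = os.path.join(proc, str(pid), "task")
    try:
        tids = os.listdir(task_dir)
    except OSError:
        return total
    for tid in tids:
        for key, value in read_io(os.path.join(task_dir, tid, "io")).items():
            total[key] = total.get(key, 0) + value
    return total

def top_io(before, after, count=10):
    """Rank pids by read_bytes + write_bytes moved between two snapshots.

    before and after map pid -> counter dict; pids missing from the
    first snapshot are skipped because no delta can be computed.
    """
    deltas = []
    for pid, now in after.items():
        prev = before.get(pid)
        if prev is None:
            continue
        moved = sum(now.get(k, 0) - prev.get(k, 0)
                    for k in ("read_bytes", "write_bytes"))
        deltas.append((moved, pid))
    deltas.sort(reverse=True)
    return [(pid, moved) for moved, pid in deltas[:count]]
```

Taking two snapshots with process_io_with_threads a second apart and
passing them to top_io gives a crude version of a top-by-I/O list. Note
that walking every task directory on each sample is exactly the kind of
work that gets expensive on a busy system, so filtering the pid list or
sampling less often helps here too.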

-mark