2006-10-03 08:14:53

by Irfan Habib

[permalink] [raw]
Subject: Fwd: Any way to find the network usage by a process?

Hi,

Is there any method, at either the kernel or user level, that tells me
how much traffic each process on a machine is generating? For example, if
some process is flooding the network, I would like to know which
process (ideally by PID) is generating the most traffic.

Some people told me to monitor ports and use nmap to check which
process is using a given port, but that's not a good approach: nothing
restricts a process to the same port for its lifetime; it can close it
and open another one, and so on.

I don't need to do this remotely; the tool can run on the same
machine that originates the traffic.
Any help will be highly appreciated.

Regards
Irfan


2006-10-03 15:33:52

by Arnd Hannemann

[permalink] [raw]
Subject: Re: Fwd: Any way to find the network usage by a process?

Irfan Habib schrieb:
> Hi,
>
> Is there any method either kernel or user level which tells me which
> process is generating how much traffic from a machine. For example if
> some process is flooding the network, then I would like to know which
> process (PID ideally), is generating the most traffic.

If you want to monitor just specific PIDs, you can easily do this with
the "Owner match support" from netfilter. However, if you also want to
monitor traffic for processes that may be created in the future, this
is probably a bit tricky. Of course, you could try to add a rule for
every possible PID ;-)
Seriously, you could write a daemon that continuously scans for new or
exited processes and adds/deletes the appropriate netfilter rules...

Best regards
Arnd Hannemann

2006-10-03 15:49:44

by Jose R. Santos

[permalink] [raw]
Subject: Re: Fwd: Any way to find the network usage by a process?

Irfan Habib wrote:
> Hi,
>
> Is there any method either kernel or user level which tells me which
> process is generating how much traffic from a machine. For example if
> some process is flooding the network, then I would like to know which
> process (PID ideally), is generating the most traffic.
>
>

A while ago I wrote a SystemTap script to solve a similar problem.
It's been sitting on my system collecting dust, and the embedded C code
is no longer necessary since the networking.stp tapset now provides
everything this script needs (and more), but it should point you in the
right direction.

This worked a couple of months ago but is currently untested. Hope
it helps.

-JRS


global ifstats, ifdevs, execname

%{
#include <linux/skbuff.h>
#include <linux/netdevice.h>
%}

probe kernel.function("dev_queue_xmit")
{
    execname[pid()] = execname()
    name = skb_to_name($skb)
    ifdevs[name] = name
    ifstats[pid(), name] <<< 1
}

function skb_to_name:string (skbuff:long)
%{
    struct sk_buff *skbuff = (struct sk_buff *)((long)THIS->skbuff);
    struct net_device *netdev = skbuff->dev;
    sprintf(THIS->__retvalue, "%s", netdev->name);
%}

probe timer.ms(5000)
{
    exit()
}

probe end {
    foreach (pid in execname) {
        if (pid == 0) continue
        printf("%15s[%5d] ->\t", execname[pid], pid)
        foreach (ifname in ifdevs) {
            printf("[%s:%7d] \t", ifname, @count(ifstats[pid, ifname]))
        }
        print("\n")
    }
    print("\n")
}
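
Since the script embeds C, it needs to run in guru mode, e.g. (the
filename is hypothetical):

stap -g ifstats.stp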

2006-10-04 20:54:36

by Mike Mason

[permalink] [raw]
Subject: Re: Fwd: Any way to find the network usage by a process?

Here's a variation of Jose's script that uses the networking tapset and
prints top-like output for transmits and receives. Much of the activity
shows up under PID 0, which Jose's script filtered out; that obviously
doesn't reflect the actual process generating the traffic.

The networking tapset currently probes netif_receive_skb() for receives and
dev_queue_xmit() for transmits. Can anyone suggest better probe points to
get transmits and receives by pid?

- Mike

Sample output:

PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
0 0 eth0 493 880 32 1238 swapper
3078 0 eth0 26 2 28 2 Xvnc
13957 0 eth0 1 2 0 2 lspci
3148 0 eth0 1 2 0 2 nautilus
5058 0 eth0 1 1 0 1 firefox-bin
12277 0 eth0 1 0 0 0 sshd

nettop.stp script:

global ifxmit_p, ifrecv_p, ifxmit_b, ifrecv_b, ifdevs, ifpid, execname, user

probe netdev.transmit
{
    execname[pid()] = execname()
    user[pid()] = uid()
    ifdevs[pid(), dev_name] = dev_name
    ifxmit_p[pid(), dev_name]++
    ifxmit_b[pid(), dev_name] += length
    ifpid[pid(), dev_name]++
}

probe netdev.receive
{
    execname[pid()] = execname()
    user[pid()] = uid()
    ifdevs[pid(), dev_name] = dev_name
    ifrecv_p[pid(), dev_name]++
    ifrecv_b[pid(), dev_name] += length
    ifpid[pid(), dev_name]++
}

function print_activity()
{
    printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
           "PID", "UID", "DEV", "XMIT_PK", "RECV_PK", "XMIT_KB",
           "RECV_KB", "COMMAND")

    foreach ([pid, dev] in ifpid-) {
        printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
               pid, user[pid], dev,
               ifxmit_p[pid, dev], ifrecv_p[pid, dev],
               ifxmit_b[pid, dev] / 1024,
               ifrecv_b[pid, dev] / 1024,
               execname[pid])
    }

    print("\n")

    delete execname
    delete user
    delete ifdevs
    delete ifxmit_p
    delete ifrecv_p
    delete ifxmit_b
    delete ifrecv_b
    delete ifpid
}

probe timer.ms(5000)
{
    print_activity()
}


Jose R. Santos wrote:
> Irfan Habib wrote:
>> Hi,
>>
>> Is there any method either kernel or user level which tells me which
>> process is generating how much traffic from a machine. For example if
>> some process is flooding the network, then I would like to know which
>> process (PID ideally), is generating the most traffic.
>>
>>
>
> A while ago I did a SystemTap script to solve a problem similar to
> this. It's been siting in my system for a while collecting dust and you
> currently don't need the embedded C code since the networking.stp tapset
> has all this script needs(and more), but I should point you in the right
> direction.
>
> This worked a couple of months ago but it is currently untested. Hope
> it helps.
>
> -JRS
>
> [...]

2006-10-05 21:22:35

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: Fwd: Any way to find the network usage by a process?


Mike Mason <[email protected]> writes:

> Here's a variation of Jose's script that uses the networking tapset
> and prints top-like output for transmits and receives. [...]

Thanks for posting it to the systemtap wiki.

Some minor style suggestions follow:

> [...]
> ifxmit_p[pid(), dev_name] ++
> ifxmit_b[pid(), dev_name] += length

These could be collapsed into a single statistics-aggregate array:
# ifxmit[pid(), dev_name] <<< length
Then the printing routine would use @count(ifxmit[...]) and @sum(ifxmit[...])
to extract the two values. Same of course for ifrecv.
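
On the transmit side, the collapsed form might look like this (a
sketch; ifrecv would mirror it):

global ifxmit

probe netdev.transmit {
    ifxmit[pid(), dev_name] <<< length
}

probe timer.ms(5000) {
    foreach ([p, dev] in ifxmit)
        printf("%5d %-7s %7d pk %7d KB\n", p, dev,
               @count(ifxmit[p, dev]), @sum(ifxmit[p, dev]) / 1024)
    delete ifxmit
}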

> execname[pid()] = execname()
> user[pid()] = uid()
> ifdevs[pid(), dev_name] = dev_name

Calling pid() so many times is worse than calling it once and caching
the result in a local variable ("p = pid()").

The way that the script tracks pid-to-uid and pid-to-execname mappings
is not bad, though if that part were moved to new probes on fork or
exec, it would allow the network-related probes to run concurrently on
an SMP without fighting over locks.


- FChE

2006-10-05 23:28:26

by Mike Mason

[permalink] [raw]
Subject: Re: Fwd: Any way to find the network usage by a process?

Frank Ch. Eigler wrote:
> Mike Mason <[email protected]> writes:
>
>> Here's a variation of Jose's script that uses the networking tapset
>> and prints top-like output for transmits and receives. [...]
>
> Thanks for posting it to the systemtap wiki.
>
> Some minor style suggestions follow:
>
>> [...]
>> ifxmit_p[pid(), dev_name] ++
>> ifxmit_b[pid(), dev_name] += length
>
> These could be collapsed into a single statistics-aggregate array:
> # ifxmit[pid(), dev_name] <<< length
> Then the printing routine would use @count(ifxmit[...]) and @sum(ifxmit[...])
> to extract the two values. Same of course for ifrecv.

I tried that and got the following output:

PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
0 0 eth0 9 10 486 672 swapper
ERROR: empty aggregate near identifier 'execname' at nettop.stp:35:4
WARNING: Number of errors: 1, skipped probes: 0

Apparently using @sum on an empty aggregate isn't allowed; I expected 0
to be returned. The only way to avoid the error is to use @sum only when
@count > 0, which makes the printf too complex in my opinion.
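
For reference, the workaround would look something like this (a sketch
using the conditional expression), which is more clutter than I'd like:

foreach ([pid, dev] in ifpid-) {
    # @sum errors out on an empty aggregate, so gate it on @count
    xmit_kb = @count(ifxmit[pid, dev]) ? @sum(ifxmit[pid, dev]) / 1024 : 0
    recv_kb = @count(ifrecv[pid, dev]) ? @sum(ifrecv[pid, dev]) / 1024 : 0
    printf("%5d %7d %7d\n", pid, xmit_kb, recv_kb)
}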

>
>> execname[pid()] = execname()
>> user[pid()] = uid()
>> ifdevs[pid(), dev_name] = dev_name
>
> Calling pid() so many times is worse than calling it once and caching
> the result in a local variable ("p = pid()").

Agreed. I'll change that.

>
> The way that the script tracks pid-to-uid and pid-to-execname mappings
> is not bad, though if that part were moved to new probes on fork or
> exec, it would allow the network-related probes to run concurrently on
> an SMP without fighting over locks.

But that would only catch processes created after the script starts, correct?
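
A hybrid might work, though (a sketch; the kprocess.exec probe name is
an assumption about the tapset):

global pid2name

# processes that exec after the script starts get named here
probe kprocess.exec {
    pid2name[pid()] = execname()
}

# pre-existing processes fall back to execname() on their first packet
probe netdev.transmit {
    p = pid()
    if (!(p in pid2name))
        pid2name[p] = execname()
}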

- Mike

>
>
> - FChE