2008-01-18 10:38:00

by Fredrik Lindgren

[permalink] [raw]
Subject: Understanding iostat-ms output

Hello,

I have been messing around with the excellent "iostat-ms" tool in
an effort to understand what is going on with our machines that are
showing lots of IOwait.

Below are two 15 second samples, the first one is taken during peak
and the second one during the night. The obvious difference is in the
"exe" field. Please excuse my ignorance, but what does this field mean?
Where is this additional time spent? On the filer or in the NFS stack
on the host?

stbnas1:/foo mounted on /misc/foo:

op/s rpc bklog
2364.60 0.96
read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
43.267 659.624 15.246 0 (0.0%) 17.066 293.880
write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
16.133 60.314 3.738 0 (0.0%) 3.868 255.074

stbnas1:/foo mounted on /misc/foo:

op/s rpc bklog
1446.07 0.26
read: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
105.200 189.155 1.798 0 (0.0%) 8.264 8.702
write: ops/s Kb/s Kb/op retrans avg RTT (ms) avg exe (ms)
105.067 297.774 2.834 0 (0.0%) 1.048 1.310


Regards,
Fredrik Lindgren






2008-01-18 16:36:20

by Chuck Lever III

[permalink] [raw]
Subject: Re: Understanding iostat-ms output

On Jan 18, 2008, at 4:37 AM, Fredrik Lindgren wrote:
> I have been messing around with the excellent "iostat-ms" tool in
> an effort to understand what is going on with our machines that are
> showing lots of IOwait.
>
> Below are two 15 second samples, the first one is taken during peak
> and the second one during the night. The obvious difference is in the
> "exe" field. Please excuse my ignorance, but what does this field
> mean?
> Where is this additional time spent? On the filer or in the NFS stack
> on the host?

RTT is the time from when the kernel RPC client sends the RPC request
to the time it receives the reply; RTT includes network transit time
and server execution time.

"exe" is the time from when the NFS client dispatches the RPC request
to the kernel RPC client to the time the RPC request is complete;
"exe" includes RTT, and RPC client queuing and processing time.

So, even though the server turns your requests around in under
20msec, the RPC client is taking almost a quarter second, on average,
to process an RPC request.

The RPC client, by default, can process up to 16 requests
concurrently. Once this limit is exceeded, the RPC client queues
requests coming from the NFS client in a backlog queue. Average
execution time "exe" goes up if the requests have to wait in the
backlog queue.

Since your retransmit rate is zero, I'll assume you are using TCP.
Try increasing the default slot table size of the RPC client:

sudo sysctl -w sunrpc.tcp_slot_table_entries=64

then remounting your NFS file systems.

> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 2364.60 0.96
> read: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 43.267 659.624 15.246 0
> (0.0%) 17.066 293.880
> write: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 16.133 60.314 3.738 0
> (0.0%) 3.868 255.074
>
> stbnas1:/foo mounted on /misc/foo:
>
> op/s rpc bklog
> 1446.07 0.26
> read: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 105.200 189.155 1.798 0
> (0.0%) 8.264 8.702
> write: ops/s Kb/s Kb/op
> retrans avg RTT (ms) avg exe (ms)
> 105.067 297.774 2.834 0
> (0.0%) 1.048 1.310
>
>
> Regards,
> Fredrik Lindgren
>
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-
> nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com