2004-01-18 21:17:47

by Jean-Luc Fontaine

Subject: kernel 2.6.1: erroneous statistics in /proc/diskstats?

I would like to know whether /proc/diskstats is reliable, or whether I am
totally misunderstanding the significance of the "milliseconds spent
reading/writing" figures.
Copying a huge file from hdb to hdc, I get:

$ date; egrep 'hdb|hdc' /proc/diskstats
Sun Jan 18 17:52:51 CET 2004
3 64 hdb 12949 88 1335982 352530 8506 8009 132344 257702 1 102525 610631
3 65 hdb1 13034 1335950 16543 132344
22 0 hdc 260 0 2072 15180 3026 87831 727760 1232708 113 25415 1299749
22 1 hdc1 256 2040 90970 727760
$ date; egrep 'hdb|hdc' /proc/diskstats
Sun Jan 18 17:52:57 CET 2004
3 64 hdb 13162 88 1384702 356272 8506 8009 132344 257702 0 106045 614311
3 65 hdb1 13246 1384670 16543 132344
22 0 hdc 260 0 2072 15180 3725 109431 906136 1931270 111 31685 2000878
22 1 hdc1 256 2040 113267 906136

According to the above, hdc has spent writing (field #8):
~ 1931270 - 1232708 = 698562 ms = 699 s
and doing I/O (field #10):
~ 31685 - 25415 = 6270 ms = 6.27 s
in about a 6 s interval (from the date commands).
It seems field #8 is incorrect.

As for hdb, it has spent reading (field #4):
~ 356272 - 352530 = 3742 ms = 3.74 s
and doing I/O (field #10):
~ 106045 - 102525 = 3520 ms = 3.52 s
That looks correct.

Would somebody please enlighten me?

PS: from Documentation/iostats.txt:
Field 8 -- # of milliseconds spent writing
This is the total number of milliseconds spent by all writes (as
measured from __make_request() to end_that_request_last()).


2004-01-19 11:00:51

by Rick Lindsley

Subject: Re: kernel 2.6.1: erroneous statistics in /proc/diskstats?

    I would like to know whether /proc/diskstats is reliable, or whether I am
    totally misunderstanding the significance of the "milliseconds spent
    reading/writing" figures.

So far as I know, they are reliable, but there could always be a bug.
Note that it is legitimate for field 8 to be greater than the elapsed
time, since it is the *total* time spent in *all* writes. If two write
requests were issued simultaneously and each completed in exactly 10ms,
you'd account for 20ms of "write I/O" even though only 10ms of
wall-clock time has passed.
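
To make that concrete, here is a minimal sketch of the accounting (the
request timings are made up, nothing measured from the kernel):

# Field 8 accumulates the duration of every write request, so overlapping
# writes can sum to more than the wall-clock time that passed.
requests_ms = [(0, 10), (0, 10)]  # two writes issued together, each taking 10ms

total_write_ms = sum(end - start for start, end in requests_ms)
wall_clock_ms = max(end for _, end in requests_ms) - min(start for start, _ in requests_ms)

print(total_write_ms)  # 20 -- what field 8 would accumulate
print(wall_clock_ms)   # 10 -- what actually elapsed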

So, a few more details ...

How big was it?
How did you copy it (what command)?
Were you going through a file system or direct to disk?

One thing I notice is that your hdc shows 100+ "IOs in progress" both
at the start and at the end. It may be as simple as that counter being off.
If there are IOs in progress when you start and they haven't finished
by the time you end, then they will certainly add about 6 seconds of
"write time" for each outstanding, apparently unfulfilled request.
This alone could account for your extra time. Any theories on why that
number is so large for hdc but quite small for hdb?
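
As a rough back-of-the-envelope check of that theory, using the hdc
numbers from your two samples (my own arithmetic, assuming every
in-flight request spans the whole interval):

# field 9 for hdc in the first sample, ~6 s between the two date commands
in_flight_start = 113
interval_ms = 6 * 1000
observed_wuse_delta_ms = 1931270 - 1232708  # field 8 delta for hdc

print(in_flight_start * interval_ms)  # 678000 -- ms the outstanding requests could add
print(observed_wuse_delta_ms)         # 698562 -- ms actually observed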

Rick

2004-01-19 13:15:12

by Jean-Luc Fontaine

Subject: Re: kernel 2.6.1: erroneous statistics in /proc/diskstats?

Rick Lindsley wrote:

| I would like to know whether /proc/diskstats is reliable, or whether I am
| totally misunderstanding the significance of the "milliseconds spent
| reading/writing" figures.

| So far as I know, they are reliable, but there could always be a bug.
| Note that it is legitimate for field 8 to be greater than the elapsed
| time, since it is the *total* time spent in *all* writes. If two write
| requests were issued simultaneously and each completed in exactly 10ms,
| you'd account for 20ms of "write I/O" even though only 10ms of
| wall-clock time has passed.

| So, a few more details ...
| How big was it?
~ 700 Mbytes
| How did you copy it (what command)?
~ a simple cp
| Were you going through a file system or direct to disk?
~ from and to reiserfs

| One thing I notice is that your hdc shows 100+ "IOs in progress" both
| at the start and at the end. It may be as simple as that counter being off.
| If there are IOs in progress when you start and they haven't finished
| by the time you end, then they will certainly add about 6 seconds of
| "write time" for each outstanding, apparently unfulfilled request.
| This alone could account for your extra time. Any theories on why that
| number is so large for hdc but quite small for hdb?

I did some more measurements, this time waiting until the copy operation
had ended; in my previous experiment, I was taking the measurements while
the copy was still running:

(fields: major minor name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq)

# date; egrep 'hdb|hdc' /proc/diskstats
# cp 700MBFile /mnt/hdc1/tmp/
Mon Jan 19 13:53:03 CET 2004
3 64 hdb 44653 197 685454 413847 17585 24405 336784 1631535 0 176727 2045781
3 65 hdb1 44846 685422 42098 336784
22 0 hdc 499 0 3960 60944 78 52 1040 84 0 2819 61028
22 1 hdc1 495 3928 130 1040
# date; egrep 'hdb|hdc' /proc/diskstats
Mon Jan 19 13:54:50 CET 2004
3 64 hdb 50822 197 2068046 489489 17786 24516 339280 1632095 0 246281 2121983
3 65 hdb1 51015 2068014 42410 339280
22 0 hdc 503 0 3992 61016 5111 140110 1161768 2494462 0 44692 2555478
22 1 hdc1 499 3960 145221 1161768

Clearly, wuse on hdc is still very large:
~ 2494462 - 84 = 2494378 ms = 2494 s
while use on hdc is only:
~ 44692 - 2819 = 41873 ms = 41.9 s
for a copy lasting 107 s. Meanwhile, ruse on hdb looks correct:
~ 489489 - 413847 = 75642 ms = 75.6 s
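
For what it is worth, here is a small sketch of how that delta computation
could be automated (field positions taken from the list above and from
Documentation/iostats.txt; the disk names and sleep interval are just
placeholders):

#!/usr/bin/env python3
# Sketch: sample /proc/diskstats twice and print the ruse, wuse and use
# deltas for the whole-disk lines. Field order, as listed above:
# major minor name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq
import time

DISKS = ("hdb", "hdc")  # placeholder disk names
INTERVAL = 6            # placeholder: seconds between the two samples

def sample():
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 14 and fields[2] in DISKS:
                stats[fields[2]] = {
                    "ruse": int(fields[6]),   # field 4: ms spent reading
                    "wuse": int(fields[10]),  # field 8: ms spent writing
                    "use": int(fields[12]),   # field 10: ms spent doing I/O
                }
    return stats

before = sample()
time.sleep(INTERVAL)
after = sample()

for disk in DISKS:
    for name in ("ruse", "wuse", "use"):
        delta = after[disk][name] - before[disk][name]
        print("%s %s: %d ms = %.1f s" % (disk, name, delta, delta / 1000.0))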

Jean-Luc