2013-05-26 06:16:14

by Fredrik Tolf

Subject: Weird disk idling

Dear list,

In order to debug I/O performance, I recently wrote a tiny program for
inspecting /sys/block/$DISK/stat. It works by dumping deltas of the values
every 100 ms, quite simply (except the queue-length value, for which
deltas are clearly useless).
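For anyone who wants to reproduce this, a minimal sketch of such a sampler might look like the following. It is not the actual program; the function names are made up for illustration, and it assumes the 11-field stat layout where the ninth value is the in-flight request count.

```python
import time

# Index of the in_flight field in /sys/block/<disk>/stat (0-based).
# It is a gauge, not a counter, so it is printed as-is instead of as a delta.
IN_FLIGHT = 8


def read_stat(disk):
    """Return the counters from /sys/block/<disk>/stat as a list of ints."""
    with open(f"/sys/block/{disk}/stat") as f:
        return [int(x) for x in f.read().split()]


def deltas(prev, cur):
    """Per-field difference between two samples, keeping in_flight raw."""
    return [
        c if i == IN_FLIGHT else c - p
        for i, (p, c) in enumerate(zip(prev, cur))
    ]


def monitor(disk, interval=0.1):
    """Print a timestamp followed by the deltas, once per interval, forever."""
    prev = read_stat(disk)
    while True:
        time.sleep(interval)
        cur = read_stat(disk)
        print(f"{time.time():.1f}", *deltas(prev, cur))
        prev = cur
```

Calling monitor("sda") would print one line per 100 ms in the same shape as the output shown further down. The sleep-based loop drifts slightly, which is acceptable for eyeballing stalls but not for precise timing.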

Using this, I often see periods during constant I/O loads where there are
lots of requests in the queue, but the disk is completely idle. They
usually last somewhere from 0.1 to 2 seconds. Using the aforementioned
program, they might look, for instance, like this:

1369545418.8 0 0 0 0 2 0 16 4888 134 100 13400
1369545418.9 0 0 0 0 0 0 0 0 134 100 13400
1369545419.0 0 0 0 0 0 0 0 0 135 104 13984
1369545419.1 0 0 0 0 0 0 0 0 135 100 13500
1369545419.2 0 0 0 0 0 0 0 0 135 100 13500
1369545419.3 0 0 0 0 0 0 0 0 135 100 13500
1369545419.4 0 0 0 0 0 0 0 0 135 100 13500
1369545419.5 0 0 0 0 0 0 0 0 135 100 13500
1369545419.6 0 0 0 0 0 0 0 0 135 104 14040
1369545419.7 2 0 64 672 58 0 1185 152512 78 100 11296

I'm sure you all know what the various fields are (except the first, which
is just a timestamp), so as you can see, there are 135 requests in the
queue, but no reads or writes happen, in this case, for at least 800 ms.
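To make the reading of those rows explicit, here is a small sketch that labels the columns and counts the longest run of samples with requests outstanding but no completed reads or writes. The column names are my own shorthand for the fields listed in Documentation/block/stat.txt (as I read that file, in_flight counts requests issued to the driver but not yet completed); the helper names are hypothetical.

```python
# Shorthand names for the 11 stat fields, in file order.
FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
)

# The sample output quoted above, copied verbatim.
SAMPLE = """\
1369545418.8 0 0 0 0 2 0 16 4888 134 100 13400
1369545418.9 0 0 0 0 0 0 0 0 134 100 13400
1369545419.0 0 0 0 0 0 0 0 0 135 104 13984
1369545419.1 0 0 0 0 0 0 0 0 135 100 13500
1369545419.2 0 0 0 0 0 0 0 0 135 100 13500
1369545419.3 0 0 0 0 0 0 0 0 135 100 13500
1369545419.4 0 0 0 0 0 0 0 0 135 100 13500
1369545419.5 0 0 0 0 0 0 0 0 135 100 13500
1369545419.6 0 0 0 0 0 0 0 0 135 104 14040
1369545419.7 2 0 64 672 58 0 1185 152512 78 100 11296
"""


def parse_row(line):
    """Turn one 'timestamp v1 ... v11' line into a dict keyed by field name."""
    ts, *values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    row = dict(zip(FIELDS, map(int, values)))
    row["timestamp"] = float(ts)
    return row


def is_stalled(row):
    """True when requests are outstanding but nothing completed this interval."""
    return row["in_flight"] > 0 and row["read_ios"] == 0 and row["write_ios"] == 0


def longest_stall(rows):
    """Length, in samples, of the longest consecutive run of stalled rows."""
    best = run = 0
    for row in rows:
        run = run + 1 if is_stalled(row) else 0
        best = max(best, run)
    return best


rows = [parse_row(line) for line in SAMPLE.splitlines()]
print(longest_stall(rows), "consecutive stalled samples")
```

At one sample per 100 ms, the length of that run is where the "at least 800 ms" figure comes from.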

Is this behavior normal and expected, or is there something wrong here? In
the latter case, is it my hardware that is failing somehow, or can there
be some software weirdness that can be tweaked away or bugfixed?

The disk in question is a 640 GB WDC Caviar Green, and it's attached via
an old Silicon Image 3114 PCI card. Clearly, the hardware is less than
optimal, but can that explain this behavior? (For the record, the disk
does not report any SMART errors, at least, and there are no related
errors in dmesg.)

The kernel version is 3.7.1, and the scheduler is CFQ.

Thanks for reading!

--

Fredrik Tolf


2013-05-26 08:59:13

by ethan zhao

Subject: Re: Weird disk idling

Fred,
How do you know the disk is completely idle? How much cache memory do your controller and disk have? Why do you think the requests should trigger disk activity immediately when you don't know what kind of requests they are? Are they reading the same offset of the same block?

...

Before you get an answer, you should ask yourself many questions.


Ethan


2013-05-27 01:55:13

by Alexander Holler

Subject: Re: Weird disk idling

Am 26.05.2013 07:29, schrieb Fredrik Tolf:

> I'm sure you all know what the various fields are (except the first,

Hmm, I had to look at Documentation/block/stat.txt.

> which is just a timestamp), so as you can see, there are 135 requests in
> the queue, but no reads or writes happen, in this case, for at least 800
> ms.
>
> Is this behavior normal and expected, or is there something wrong here?
> In the latter case, is it my hardware that is failing somehow, or can
> there be some software weirdness that can be tweaked away or bugfixed?

I would say that's how caches work. You might have a look at

sysctl vm

which shows a lot of the knobs you can turn to change some of the
aspects you see.
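As a rough sketch of what looking at those knobs could mean in practice, the snippet below reads a few writeback-related values straight from /proc/sys/vm. Which knobs matter here is an assumption on my part; the four names are standard vm sysctls, but nothing in this thread establishes that they explain the stalls. The helper name and its parameters are hypothetical.

```python
import os

# Writeback-related vm sysctls; picking these four is an assumption.
KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_knobs(names=KNOBS, root="/proc/sys/vm"):
    """Return {name: int value} for each knob that exists and parses."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(root, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Knob missing on this kernel, unreadable, or not a plain integer.
            continue
    return values


for name, value in read_vm_knobs().items():
    print(f"vm.{name} = {value}")
```

This reports the same numbers as the corresponding entries in the sysctl vm output; the root parameter exists only so the function can be pointed at a test directory.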

Regards,

Alexander Holler

2013-05-29 00:15:39

by Valdis Klētnieks

Subject: Re: Weird disk idling

On Sun, 26 May 2013 16:58:59 +0800, ethan said:
> Fred,
> How do you know the disk is completely idle ?

Actually, my first question was "How do you know the disk is *spinning*?"
A delay of a second or two sounds suspiciously like a spun-down disk in powersave
mode....
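One way to check that, sketched under the assumption that hdparm is installed and the script runs as root: ask the drive for its power state with hdparm -C and pull the state out of the output. The function names are made up, and the regular expression assumes the usual "drive state is:" line that hdparm prints.

```python
import re
import subprocess


def parse_drive_state(output):
    """Extract the state (e.g. 'active/idle', 'standby') from hdparm -C output."""
    match = re.search(r"drive state is:\s*(\S+)", output)
    return match.group(1) if match else None


def drive_state(device):
    """Run 'hdparm -C <device>' and return the reported power state, or None."""
    result = subprocess.run(
        ["hdparm", "-C", device],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if hdparm fails
    )
    return parse_drive_state(result.stdout)
```

Calling drive_state("/dev/sda") in the same loop that samples the stat file would show whether the stalls coincide with a standby state.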

