2009-01-17 00:44:52

by Mathieu Desnoyers

Subject: [Regression] High latency when doing large I/O

Hi,

A long-standing I/O regression (since 2.6.18, still there today) has hit
Slashdot recently:
http://bugzilla.kernel.org/show_bug.cgi?id=12309
http://it.slashdot.org/article.pl?sid=09/01/15/049201

I've taken a trace reproducing the wrong behavior on my machine and I
think it's getting us somewhere.

LTTng 0.83, kernel 2.6.28
Machine: Intel Xeon E5405 dual quad-core, 16GB RAM
(I just created a new block-trace.c LTTng probe which is not released yet.
It basically replaces blktrace.)


echo 3 > /proc/sys/vm/drop_caches

lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace

dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
cp -ax music /tmp (copying 1.1GB of mp3)

ls (takes 15 seconds to get the directory listing!)

lttctl -D trace

I looked at the trace (especially at the ls surroundings), and bash is
waiting for a few seconds for I/O in the exec system call (to exec ls).

While this happens, we have dd doing lots and lots of bio_queue. There
is a bio_backmerge after each bio_queue event. This is reasonable,
because dd is writing to a contiguous file.

However, I wonder if this is in fact the actual problem. We have dd, which
owns the head request in the elevator request queue. It is progressing
steadily by plugging/unplugging the device periodically and gets its
work done. However, because requests are being dequeued at the same
rate as others are being merged, I suspect it stays at the top of the
queue and does not let the other, unrelated requests run.

There is a test in blk-merge.c which makes sure that merged requests
do not get bigger than a certain size. However, if the request is
steadily dequeued, I don't think this test ever kicks in.

If you are interested in looking at the trace I've taken, I could
provide it.

Does that make sense?

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2009-01-17 16:27:18

by Mathieu Desnoyers

Subject: [RFC PATCH] block: Fix bio merge induced high I/O latency

A long-standing I/O regression (since 2.6.18, still there today) has hit
Slashdot recently:
http://bugzilla.kernel.org/show_bug.cgi?id=12309
http://it.slashdot.org/article.pl?sid=09/01/15/049201

I've taken a trace reproducing the wrong behavior on my machine and I
think it's getting us somewhere.

LTTng 0.83, kernel 2.6.28
Machine: Intel Xeon E5405 dual quad-core, 16GB RAM
(I just created a new block-trace.c LTTng probe which is not released yet.
It basically replaces blktrace.)


echo 3 > /proc/sys/vm/drop_caches

lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace

dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
cp -ax music /tmp (copying 1.1GB of mp3)

ls (takes 15 seconds to get the directory listing!)

lttctl -D trace

I looked at the trace (especially at the ls surroundings), and bash is
waiting for a few seconds for I/O in the exec system call (to exec ls).

While this happens, we have dd doing lots and lots of bio_queue. There
is a bio_backmerge after each bio_queue event. This is reasonable,
because dd is writing to a contiguous file.

However, I wonder if this is in fact the actual problem. We have dd, which
owns the head request in the elevator request queue. It is progressing
steadily by plugging/unplugging the device periodically and gets its
work done. However, because requests are being dequeued at the same
rate as others are being merged, I suspect it stays at the top of the
queue and does not let the other, unrelated requests run.

There is a test in blk-merge.c which makes sure that merged requests
do not get bigger than a certain size. However, if the request is
steadily dequeued, I don't think this test ever kicks in.


This patch implements a basic test to make sure we never merge more than 128
requests into the same request if it is the "last_merge" request. I have not
been able to trigger the problem again with the fix applied. It might not be in
a perfect state: there may be better solutions to the problem, but I think it
helps point out where the culprit lies.

Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Jens Axboe <[email protected]>
CC: Andrea Arcangeli <[email protected]>
CC: [email protected]
CC: Ingo Molnar <[email protected]>
CC: Linus Torvalds <[email protected]>
---
block/blk-merge.c | 12 +++++++++---
block/elevator.c | 31 ++++++++++++++++++++++++++++---
include/linux/blkdev.h | 1 +
3 files changed, 38 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/include/linux/blkdev.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/blkdev.h 2009-01-17 09:49:54.000000000 -0500
+++ linux-2.6-lttng/include/linux/blkdev.h 2009-01-17 09:50:29.000000000 -0500
@@ -313,6 +313,7 @@ struct request_queue
*/
struct list_head queue_head;
struct request *last_merge;
+ int nr_cached_merge;
elevator_t *elevator;

/*
Index: linux-2.6-lttng/block/elevator.c
===================================================================
--- linux-2.6-lttng.orig/block/elevator.c 2009-01-17 09:49:54.000000000 -0500
+++ linux-2.6-lttng/block/elevator.c 2009-01-17 11:07:12.000000000 -0500
@@ -255,6 +255,7 @@ int elevator_init(struct request_queue *

INIT_LIST_HEAD(&q->queue_head);
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
q->end_sector = 0;
q->boundary_rq = NULL;

@@ -438,8 +439,10 @@ void elv_dispatch_sort(struct request_qu
struct list_head *entry;
int stop_flags;

- if (q->last_merge == rq)
+ if (q->last_merge == rq) {
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ }

elv_rqhash_del(q, rq);

@@ -478,8 +481,10 @@ EXPORT_SYMBOL(elv_dispatch_sort);
*/
void elv_dispatch_add_tail(struct request_queue *q, struct request *rq)
{
- if (q->last_merge == rq)
+ if (q->last_merge == rq) {
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ }

elv_rqhash_del(q, rq);

@@ -498,6 +503,16 @@ int elv_merge(struct request_queue *q, s
int ret;

/*
+ * Make sure we don't starve other requests by merging too many cached
+ * requests together.
+ */
+ if (q->nr_cached_merge >= BLKDEV_MAX_RQ) {
+ q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ return ELEVATOR_NO_MERGE;
+ }
+
+ /*
* First try one-hit cache.
*/
if (q->last_merge) {
@@ -536,6 +551,10 @@ void elv_merged_request(struct request_q
if (type == ELEVATOR_BACK_MERGE)
elv_rqhash_reposition(q, rq);

+ if (q->last_merge != rq)
+ q->nr_cached_merge = 0;
+ else
+ q->nr_cached_merge++;
q->last_merge = rq;
}

@@ -551,6 +570,10 @@ void elv_merge_requests(struct request_q
elv_rqhash_del(q, next);

q->nr_sorted--;
+ if (q->last_merge != rq)
+ q->nr_cached_merge = 0;
+ else
+ q->nr_cached_merge++;
q->last_merge = rq;
}

@@ -626,8 +649,10 @@ void elv_insert(struct request_queue *q,
q->nr_sorted++;
if (rq_mergeable(rq)) {
elv_rqhash_add(q, rq);
- if (!q->last_merge)
+ if (!q->last_merge) {
+ q->nr_cached_merge = 1;
q->last_merge = rq;
+ }
}

/*
Index: linux-2.6-lttng/block/blk-merge.c
===================================================================
--- linux-2.6-lttng.orig/block/blk-merge.c 2009-01-17 09:49:54.000000000 -0500
+++ linux-2.6-lttng/block/blk-merge.c 2009-01-17 09:50:29.000000000 -0500
@@ -231,8 +231,10 @@ static inline int ll_new_hw_segment(stru
if (req->nr_phys_segments + nr_phys_segs > q->max_hw_segments
|| req->nr_phys_segments + nr_phys_segs > q->max_phys_segments) {
req->cmd_flags |= REQ_NOMERGE;
- if (req == q->last_merge)
+ if (req == q->last_merge) {
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ }
return 0;
}

@@ -256,8 +258,10 @@ int ll_back_merge_fn(struct request_queu

if (req->nr_sectors + bio_sectors(bio) > max_sectors) {
req->cmd_flags |= REQ_NOMERGE;
- if (req == q->last_merge)
+ if (req == q->last_merge) {
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ }
return 0;
}
if (!bio_flagged(req->biotail, BIO_SEG_VALID))
@@ -281,8 +285,10 @@ int ll_front_merge_fn(struct request_que

if (req->nr_sectors + bio_sectors(bio) > max_sectors) {
req->cmd_flags |= REQ_NOMERGE;
- if (req == q->last_merge)
+ if (req == q->last_merge) {
q->last_merge = NULL;
+ q->nr_cached_merge = 0;
+ }
return 0;
}
if (!bio_flagged(bio, BIO_SEG_VALID))

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-17 16:51:20

by Leon Woestenberg

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

Hello Mathieu et al,

On Sat, Jan 17, 2009 at 5:26 PM, Mathieu Desnoyers
<[email protected]> wrote:
> A long standing I/O regression (since 2.6.18, still there today) has hit
> Slashdot recently :
> http://bugzilla.kernel.org/show_bug.cgi?id=12309

Are you sure you are solving the *actual* problem?

The bugzilla entry shows a bisect attempt that leads to a patch
involving negative clock jumps.
http://bugzilla.kernel.org/show_bug.cgi?id=12309#c29

with a corrected link to the bisect patch:
http://bugzilla.kernel.org/show_bug.cgi?id=12309#c30

Wouldn't a negative clock jump strongly affect the (time-driven)
I/O schedulers and be a more probable cause?

Regards,
--
Leon

p.s. Added Thomas to the CC list as his name is on the patch Signed-off-by list.

2009-01-17 17:16:22

by Mathieu Desnoyers

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Leon Woestenberg ([email protected]) wrote:
> Hello Mathieu et al,
>
> On Sat, Jan 17, 2009 at 5:26 PM, Mathieu Desnoyers
> <[email protected]> wrote:
> > A long standing I/O regression (since 2.6.18, still there today) has hit
> > Slashdot recently :
> > http://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> Are you sure you are solving the *actual* problem?
>
> The bugzilla entry shows a bisect attempt that leads to a patch
> involving negative clock jumps.
> http://bugzilla.kernel.org/show_bug.cgi?id=12309#c29
>
> with a corrected link to the bisect patch:
> http://bugzilla.kernel.org/show_bug.cgi?id=12309#c30
>
> Wouldn't a negative clock jump be very influential to the
> (time-driven) I/O schedulers and be a more probable cause?
>

When a merge is done, the lowest timestamp between the existing request
and the new request to merge is kept as the start_time value of the
merged request we end up with. In this case, that would probably make
that request stay on top of the queue even when unrelated interactive
I/O requests come in.

I suspect that this negative clock jump could have hidden the problem by
making the start time of the interactive request lower than the start
time of the merged request.
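
To illustrate the effect, here is a tiny userspace sketch (illustrative
only; the names and structure are made up and this is not the actual
kernel code): when two requests are merged, the survivor keeps the
oldest start_time, so a request that keeps absorbing new I/O never
appears to age from a time-based scheduler's point of view.

#include <stdio.h>

struct request {
	unsigned long start_time;	/* "jiffies" when the request was queued */
	unsigned long nr_sectors;
};

/* On a merge, the surviving request keeps the lowest (oldest) timestamp. */
static void merge_requests(struct request *req, struct request *next)
{
	req->nr_sectors += next->nr_sectors;
	if (next->start_time < req->start_time)
		req->start_time = next->start_time;
}

int main(void)
{
	struct request dd_req   = { .start_time = 100, .nr_sectors = 8 };
	struct request incoming = { .start_time = 500, .nr_sectors = 8 };

	merge_requests(&dd_req, &incoming);
	/* start_time stays at 100: the merged request looks as old as ever,
	 * so it keeps winning over newer, unrelated requests. */
	printf("start_time=%lu nr_sectors=%lu\n",
	       dd_req.start_time, dd_req.nr_sectors);
	return 0;
}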

Mathieu

> Regards,
> --
> Leon
>
> p.s. Added Thomas to the CC list as his name is on the patch Signed-off-by list.

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-17 19:06:20

by Jens Axboe

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Sat, Jan 17 2009, Mathieu Desnoyers wrote:
> A long standing I/O regression (since 2.6.18, still there today) has hit
> Slashdot recently :
> http://bugzilla.kernel.org/show_bug.cgi?id=12309
> http://it.slashdot.org/article.pl?sid=09/01/15/049201
>
> I've taken a trace reproducing the wrong behavior on my machine and I
> think it's getting us somewhere.
>
> LTTng 0.83, kernel 2.6.28
> Machine : Intel Xeon E5405 dual quad-core, 16GB ram
> (just created a new block-trace.c LTTng probe which is not released yet.
> It basically replaces blktrace)
>
>
> echo 3 > /proc/sys/vm/drop_caches
>
> lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace
>
> dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
> cp -ax music /tmp (copying 1.1GB of mp3)
>
> ls (takes 15 seconds to get the directory listing !)
>
> lttctl -D trace
>
> I looked at the trace (especially at the ls surroundings), and bash is
> waiting for a few seconds for I/O in the exec system call (to exec ls).
>
> While this happens, we have dd doing lots and lots of bio_queue. There
> is a bio_backmerge after each bio_queue event. This is reasonable,
> because dd is writing to a contiguous file.
>
> However, I wonder if this is not the actual problem. We have dd which
> has the head request in the elevator request queue. It is progressing
> steadily by plugging/unplugging the device periodically and gets its
> work done. However, because requests are being dequeued at the same
> rate others are being merged, I suspect it stays at the top of the queue
> and does not let the other unrelated requests run.
>
> There is a test in the blk-merge.c which makes sure that merged requests
> do not get bigger than a certain size. However, if the request is
> steadily dequeued, I think this test is not doing anything.
>
>
> This patch implements a basic test to make sure we never merge more
> than 128 requests into the same request if it is the "last_merge"
> request. I have not been able to trigger the problem again with the
> fix applied. It might not be in a perfect state : there may be better
> solutions to the problem, but I think it helps pointing out where the
> culprit lays.

To be painfully honest, I have no idea what you are attempting to solve
with this patch. First of all, Linux has always merged any request
possible. The one-hit cache is just that, a one-hit cache front end for
merging. We'll be hitting the merge hash and doing the same merge if it
fails. Since we even cap the size of the request, the merging is also
bounded.

Furthermore, the request being merged is not considered for IO yet. It
has not been dispatched by the io scheduler. IOW, I'm surprised your
patch makes any difference at all. Especially with your 128 limit, since
4KB x 128 is 512KB, which is the default max merge size anyway. These
sorts of test cases tend to be very sensitive and exhibit different
behaviour for many runs, so call me a bit skeptical and consider that an
encouragement to do more directed testing. You could use fio for
instance. Have two jobs in your job file: one is a dd-type process that
just writes a huge file, the other job starts eg 10 seconds later and
does a 4KB read of a file.
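
Something along these lines would do (just a sketch, the sizes and the
10 second delay are arbitrary):

[writer]
rw=write
size=4096m
blocksize=1024k
direct=0

[reader]
rw=read
size=4k
filesize=4k
direct=0
startdelay=10

Then compare the reader's completion latency across schedulers.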

As a quick test, could you try and increase the slice_idle to eg 20ms?
Sometimes I've seen timing being slightly off, which makes us miss the
sync window for the ls (in your case) process. Then you get a mix of
async and sync IO all the time, which very much slows down the sync
process.

--
Jens Axboe

2009-01-17 20:04:01

by Ben Gamari

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Sat, 2009-01-17 at 11:26 -0500, Mathieu Desnoyers wrote:
> This patch implements a basic test to make sure we never merge more than 128
> requests into the same request if it is the "last_merge" request. I have not
> been able to trigger the problem again with the fix applied. It might not be in
> a perfect state : there may be better solutions to the problem, but I think it
> helps pointing out where the culprit lays.

Unfortunately, it seems like the patch hasn't really fixed much. After
porting it forward to Linus' master, I haven't seen any difference
in real-world use cases (e.g. desktop use while building a
kernel).

Given Jens' remarks, I suppose this isn't too surprising. Does anyone
else with greater familiarity with the block I/O subsystem have any more
ideas about the source of the slowdown? It seems like the recent patches
incorporating blktrace support into ftrace could be helpful for further
data collection, correct?

- Ben


2009-01-18 21:13:01

by Mathieu Desnoyers

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Sat, Jan 17 2009, Mathieu Desnoyers wrote:
> > A long standing I/O regression (since 2.6.18, still there today) has hit
> > Slashdot recently :
> > http://bugzilla.kernel.org/show_bug.cgi?id=12309
> > http://it.slashdot.org/article.pl?sid=09/01/15/049201
> >
> > I've taken a trace reproducing the wrong behavior on my machine and I
> > think it's getting us somewhere.
> >
> > LTTng 0.83, kernel 2.6.28
> > Machine : Intel Xeon E5405 dual quad-core, 16GB ram
> > (just created a new block-trace.c LTTng probe which is not released yet.
> > It basically replaces blktrace)
> >
> >
> > echo 3 > /proc/sys/vm/drop_caches
> >
> > lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace
> >
> > dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
> > cp -ax music /tmp (copying 1.1GB of mp3)
> >
> > ls (takes 15 seconds to get the directory listing !)
> >
> > lttctl -D trace
> >
> > I looked at the trace (especially at the ls surroundings), and bash is
> > waiting for a few seconds for I/O in the exec system call (to exec ls).
> >
> > While this happens, we have dd doing lots and lots of bio_queue. There
> > is a bio_backmerge after each bio_queue event. This is reasonable,
> > because dd is writing to a contiguous file.
> >
> > However, I wonder if this is not the actual problem. We have dd which
> > has the head request in the elevator request queue. It is progressing
> > steadily by plugging/unplugging the device periodically and gets its
> > work done. However, because requests are being dequeued at the same
> > rate others are being merged, I suspect it stays at the top of the queue
> > and does not let the other unrelated requests run.
> >
> > There is a test in the blk-merge.c which makes sure that merged requests
> > do not get bigger than a certain size. However, if the request is
> > steadily dequeued, I think this test is not doing anything.
> >
> >
> > This patch implements a basic test to make sure we never merge more
> > than 128 requests into the same request if it is the "last_merge"
> > request. I have not been able to trigger the problem again with the
> > fix applied. It might not be in a perfect state : there may be better
> > solutions to the problem, but I think it helps pointing out where the
> > culprit lays.
>
> To be painfully honest, I have no idea what you are attempting to solve
> with this patch. First of all, Linux has always merged any request
> possible. The one-hit cache is just that, a one hit cache frontend for
> merging. We'll be hitting the merge hash and doing the same merge if it
> fails. Since we even cap the size of the request, the merging is also
> bounded.
>

Hi Jens,

I was mostly trying to poke around and figure out what was going
on in the I/O elevator. Sorry if my first attempts did not make much
sense. Following your advice, I've looked more deeply into the test
cases.

> Furthermore, the request being merged is not considered for IO yet. It
> has not been dispatched by the io scheduler. IOW, I'm surprised your
> patch makes any difference at all. Especially with your 128 limit, since
> 4kbx128kb is 512kb which is the default max merge size anyway. These
> sort of test cases tend to be very sensitive and exhibit different
> behaviour for many runs, so call me a bit skeptical and consider that an
> enouragement to do more directed testing. You could use fio for
> instance. Have two jobs in your job file. One is a dd type process that
> just writes a huge file, the other job starts eg 10 seconds later and
> does a 4kb read of a file.
>

I looked at the "ls" behavior (while doing a dd) within my LTTng trace
to create a fio job file. That behavior is appended below as "Part
1 - ls I/O behavior". Note that the original "ls" test case was done
with the anticipatory I/O scheduler, which was active by default on my
Debian system with a custom vanilla 2.6.28 kernel. Also note that I am
running this on a raid-1, but have experienced the same problem on a
standard partition I created on the same machine.

I created the fio job file appended as "Part 2 - dd+ls fio job file". It
consists of one dd-like job and many small jobs reading as much data as
ls did. I used the small test script below to batch-run this ("Part 3 -
batch test").

The results for the ls-like jobs are interesting:

I/O scheduler       runt-min (msec)     runt-max (msec)
noop                             41               10563
anticipatory                     63                8185
deadline                         52               33387
cfq                              43                1420


> As a quick test, could you try and increase the slice_idle to eg 20ms?
> Sometimes I've seen timing being slightly off, which makes us miss the
> sync window for the ls (in your case) process. Then you get a mix of
> async and sync IO all the time, which very much slows down the sync
> process.
>

Just to confirm, the quick test you are talking about would be:

---
block/cfq-iosched.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/block/cfq-iosched.c
===================================================================
--- linux-2.6-lttng.orig/block/cfq-iosched.c 2009-01-18 15:17:32.000000000 -0500
+++ linux-2.6-lttng/block/cfq-iosched.c 2009-01-18 15:46:38.000000000 -0500
@@ -26,7 +26,7 @@ static const int cfq_back_penalty = 2;
static const int cfq_slice_sync = HZ / 10;
static int cfq_slice_async = HZ / 25;
static const int cfq_slice_async_rq = 2;
-static int cfq_slice_idle = HZ / 125;
+static int cfq_slice_idle = 20;

/*
* offset from end of service tree


It does not make much difference compared to the standard cfq test:

I/O scheduler            runt-min (msec)     runt-max (msec)
cfq (standard)                        43                1420
cfq (20ms slice_idle)                 31                1573


So, I guess a 1.5s delay to run ls on a directory when the cache is cold
with the cfq I/O scheduler is somewhat acceptable, but I doubt the 8, 10
and 33s response times for the anticipatory, noop and deadline I/O
schedulers are. I wonder why on earth the anticipatory I/O scheduler is
activated by default with my kernel, given that it results in such poor
interactive behavior when doing large I/O?

Thanks for the advice,

Mathieu



* Part 1 - ls I/O behavior

lttv -m textDump -t /traces/block-backmerge \
-e "state.pid=4145&event.subname=bio_queue"

block.bio_queue: 662.707321959 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 327680048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 662.707331445 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 349175018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.968214766 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 327696968, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.968222110 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 349191938, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971662800 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697032, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971670417 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192002, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971684184 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697040, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971689854 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192010, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971695762 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971701135 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971706301 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697056, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971711698 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192026, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971723359 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697064, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.971729035 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192034, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.999391873 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697072, size = 53248, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 662.999397864 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192042, size = 53248, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 670.809328737 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, TRAP { sector = 327697000, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 670.809337500 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, TRAP { sector = 349191970, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 671.161036834 (/traces/block-backmerge/block_5), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 360714880, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 671.161047247 (/traces/block-backmerge/block_5), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 382209850, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 671.653601399 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 360712184, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 671.653611077 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 382207154, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }


* Part 2 - dd+ls fio job file (test.job5)

[job1]
rw=write
size=10240m
direct=0
blocksize=1024k

[global]
rw=randread
size=96k
filesize=30m
direct=0
bsrange=4k-52k

[file1]
startdelay=0

[file2]
startdelay=4

[file3]
startdelay=8

[file4]
startdelay=12

[file5]
startdelay=16

[file6]
startdelay=20

[file7]
startdelay=24

[file8]
startdelay=28

[file9]
startdelay=32

[file10]
startdelay=36

[file11]
startdelay=40

[file12]
startdelay=44

[file13]
startdelay=48

[file14]
startdelay=52

[file15]
startdelay=56

[file16]
startdelay=60

[file17]
startdelay=64

[file18]
startdelay=68

[file19]
startdelay=72

[file20]
startdelay=76

[file21]
startdelay=80

[file22]
startdelay=84

[file23]
startdelay=88

[file24]
startdelay=92

[file25]
startdelay=96

[file26]
startdelay=100

[file27]
startdelay=104

[file28]
startdelay=108

[file29]
startdelay=112

[file30]
startdelay=116

[file31]
startdelay=120

[file32]
startdelay=124

[file33]
startdelay=128

[file34]
startdelay=132

[file35]
startdelay=134

[file36]
startdelay=138

[file37]
startdelay=142

[file38]
startdelay=146

[file39]
startdelay=150

[file40]
startdelay=200

[file41]
startdelay=260


* Part 3 - batch test (do-tests.sh)

#!/bin/sh

TESTS="anticipatory noop deadline cfq"

for TEST in ${TESTS}; do
echo "Running ${TEST}"

rm -f file*.0 job*.0

echo ${TEST} > /sys/block/sda/queue/scheduler
echo ${TEST} > /sys/block/sdb/queue/scheduler
sync
echo 3 > /proc/sys/vm/drop_caches
sleep 5

./fio test.job5 --output test.result.${TEST}
done


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-18 21:27:30

by Mathieu Desnoyers

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Mathieu Desnoyers ([email protected]) wrote:
> * Jens Axboe ([email protected]) wrote:
> > On Sat, Jan 17 2009, Mathieu Desnoyers wrote:
> > > A long standing I/O regression (since 2.6.18, still there today) has hit
> > > Slashdot recently :
> > > http://bugzilla.kernel.org/show_bug.cgi?id=12309
> > > http://it.slashdot.org/article.pl?sid=09/01/15/049201
> > >
> > > I've taken a trace reproducing the wrong behavior on my machine and I
> > > think it's getting us somewhere.
> > >
> > > LTTng 0.83, kernel 2.6.28
> > > Machine : Intel Xeon E5405 dual quad-core, 16GB ram
> > > (just created a new block-trace.c LTTng probe which is not released yet.
> > > It basically replaces blktrace)
> > >
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > >
> > > lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace
> > >
> > > dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
> > > cp -ax music /tmp (copying 1.1GB of mp3)
> > >
> > > ls (takes 15 seconds to get the directory listing !)
> > >
> > > lttctl -D trace
> > >
> > > I looked at the trace (especially at the ls surroundings), and bash is
> > > waiting for a few seconds for I/O in the exec system call (to exec ls).
> > >
> > > While this happens, we have dd doing lots and lots of bio_queue. There
> > > is a bio_backmerge after each bio_queue event. This is reasonable,
> > > because dd is writing to a contiguous file.
> > >
> > > However, I wonder if this is not the actual problem. We have dd which
> > > has the head request in the elevator request queue. It is progressing
> > > steadily by plugging/unplugging the device periodically and gets its
> > > work done. However, because requests are being dequeued at the same
> > > rate others are being merged, I suspect it stays at the top of the queue
> > > and does not let the other unrelated requests run.
> > >
> > > There is a test in the blk-merge.c which makes sure that merged requests
> > > do not get bigger than a certain size. However, if the request is
> > > steadily dequeued, I think this test is not doing anything.
> > >
> > >
> > > This patch implements a basic test to make sure we never merge more
> > > than 128 requests into the same request if it is the "last_merge"
> > > request. I have not been able to trigger the problem again with the
> > > fix applied. It might not be in a perfect state : there may be better
> > > solutions to the problem, but I think it helps pointing out where the
> > > culprit lays.
> >
> > To be painfully honest, I have no idea what you are attempting to solve
> > with this patch. First of all, Linux has always merged any request
> > possible. The one-hit cache is just that, a one hit cache frontend for
> > merging. We'll be hitting the merge hash and doing the same merge if it
> > fails. Since we even cap the size of the request, the merging is also
> > bounded.
> >
>
> Hi Jens,
>
> I was mostly trying to poke around and try to figure out what was going
> on in the I/O elevator. Sorry if my first attempts did not make much
> sense. Following your advice, I've looked more deeply into the test
> cases.
>
> > Furthermore, the request being merged is not considered for IO yet. It
> > has not been dispatched by the io scheduler. IOW, I'm surprised your
> > patch makes any difference at all. Especially with your 128 limit, since
> > 4kbx128kb is 512kb which is the default max merge size anyway. These
> > sort of test cases tend to be very sensitive and exhibit different
> > behaviour for many runs, so call me a bit skeptical and consider that an
> > enouragement to do more directed testing. You could use fio for
> > instance. Have two jobs in your job file. One is a dd type process that
> > just writes a huge file, the other job starts eg 10 seconds later and
> > does a 4kb read of a file.
> >
>
> I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> to create a fio job file. The said behavior is appended below as "Part
> 1 - ls I/O behavior". Note that the original "ls" test case was done
> with the anticipatory I/O scheduler, which was active by default on my
> debian system with custom vanilla 2.6.28 kernel. Also note that I am
> running this on a raid-1, but have experienced the same problem on a
> standard partition I created on the same machine.
>
> I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> consists of one dd-like job and many small jobs reading as many data as
> ls did. I used the small test script to batch run this ("Part 3 - batch
> test").
>
> The results for the ls-like jobs are interesting :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> noop 41 10563
> anticipatory 63 8185
> deadline 52 33387
> cfq 43 1420
>
>
> > As a quick test, could you try and increase the slice_idle to eg 20ms?
> > Sometimes I've seen timing being slightly off, which makes us miss the
> > sync window for the ls (in your case) process. Then you get a mix of
> > async and sync IO all the time, which very much slows down the sync
> > process.
> >
>
> Just to confirm, the quick test you are taking about would be :
>
> ---
> block/cfq-iosched.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6-lttng/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6-lttng.orig/block/cfq-iosched.c 2009-01-18 15:17:32.000000000 -0500
> +++ linux-2.6-lttng/block/cfq-iosched.c 2009-01-18 15:46:38.000000000 -0500
> @@ -26,7 +26,7 @@ static const int cfq_back_penalty = 2;
> static const int cfq_slice_sync = HZ / 10;
> static int cfq_slice_async = HZ / 25;
> static const int cfq_slice_async_rq = 2;
> -static int cfq_slice_idle = HZ / 125;
> +static int cfq_slice_idle = 20;
>
> /*
> * offset from end of service tree
>
>
> It does not make much difference with the standard cfq test :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (standard) 43 1420
> cfq (20ms slice_idle) 31 1573
>
>
> So, I guess 1.5s delay to run ls on a directory when the cache is cold
> with a cfq I/O scheduler is somewhat acceptable, but I doubt the 8, 10
> and 33s response times for the anticipatory, noop and deadline I/O
> schedulers are. I wonder why on earth is the anticipatory I/O scheduler
> activated by default with my kernel given it results in so poor
> interactive behavior when doing large I/O ?
>

I found out why: I had an old pre-2.6.18 .config hanging around in
/boot on _many_ of my systems and upgraded to a newer vanilla kernel
using these defaults. make oldconfig left
CONFIG_DEFAULT_IOSCHED="anticipatory".

Changing to CONFIG_DEFAULT_IOSCHED="cfq" makes everything run better
under heavy I/O. I bet I'm not the only one in this situation.
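
For anyone else in the same situation, the active scheduler can also be
checked and switched at runtime without a rebuild (the one in brackets
is active), or forced at boot time with the elevator=cfq parameter:

cat /sys/block/sda/queue/scheduler
echo cfq > /sys/block/sda/queue/scheduler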

Mathieu


> Thanks for the advices,
>
> Mathieu
>
>
>
> * Part 1 - ls I/O behavior
>
> lttv -m textDump -t /traces/block-backmerge \
> -e "state.pid=4145&event.subname=bio_queue"
>
> block.bio_queue: 662.707321959 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 327680048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
> block.bio_queue: 662.707331445 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 349175018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.968214766 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 327696968, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.968222110 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, SYSCALL { sector = 349191938, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971662800 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697032, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971670417 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192002, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971684184 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697040, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971689854 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192010, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971695762 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971701135 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971706301 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697056, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971711698 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192026, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971723359 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697064, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.971729035 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192034, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.999391873 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 327697072, size = 53248, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 662.999397864 (/traces/block-backmerge/block_2), 4145, 4145, bash, , 4063, 0x0, TRAP { sector = 349192042, size = 53248, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 670.809328737 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, TRAP { sector = 327697000, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 670.809337500 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, TRAP { sector = 349191970, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 671.161036834 (/traces/block-backmerge/block_5), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 360714880, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
> block.bio_queue: 671.161047247 (/traces/block-backmerge/block_5), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 382209850, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
> block.bio_queue: 671.653601399 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 360712184, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
> block.bio_queue: 671.653611077 (/traces/block-backmerge/block_7), 4145, 4145, /bin/ls, , 4063, 0x0, SYSCALL { sector = 382207154, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
>
>
> * Part 2 - dd+ls fio job file (test.job5)
>
> [job1]
> rw=write
> size=10240m
> direct=0
> blocksize=1024k
>
> [global]
> rw=randread
> size=96k
> filesize=30m
> direct=0
> bsrange=4k-52k
>
> [file1]
> startdelay=0
>
> [file2]
> startdelay=4
>
> [file3]
> startdelay=8
>
> [file4]
> startdelay=12
>
> [file5]
> startdelay=16
>
> [file6]
> startdelay=20
>
> [file7]
> startdelay=24
>
> [file8]
> startdelay=28
>
> [file9]
> startdelay=32
>
> [file10]
> startdelay=36
>
> [file11]
> startdelay=40
>
> [file12]
> startdelay=44
>
> [file13]
> startdelay=48
>
> [file14]
> startdelay=52
>
> [file15]
> startdelay=56
>
> [file16]
> startdelay=60
>
> [file17]
> startdelay=64
>
> [file18]
> startdelay=68
>
> [file19]
> startdelay=72
>
> [file20]
> startdelay=76
>
> [file21]
> startdelay=80
>
> [file22]
> startdelay=84
>
> [file23]
> startdelay=88
>
> [file24]
> startdelay=92
>
> [file25]
> startdelay=96
>
> [file26]
> startdelay=100
>
> [file27]
> startdelay=104
>
> [file28]
> startdelay=108
>
> [file29]
> startdelay=112
>
> [file30]
> startdelay=116
>
> [file31]
> startdelay=120
>
> [file32]
> startdelay=124
>
> [file33]
> startdelay=128
>
> [file34]
> startdelay=132
>
> [file35]
> startdelay=134
>
> [file36]
> startdelay=138
>
> [file37]
> startdelay=142
>
> [file38]
> startdelay=146
>
> [file39]
> startdelay=150
>
> [file40]
> startdelay=200
>
> [file41]
> startdelay=260
>
>
> * Part 3 - batch test (do-tests.sh)
>
> #!/bin/sh
>
> TESTS="anticipatory noop deadline cfq"
>
> for TEST in ${TESTS}; do
> echo "Running ${TEST}"
>
> rm -f file*.0 job*.0
>
> echo ${TEST} > /sys/block/sda/queue/scheduler
> echo ${TEST} > /sys/block/sdb/queue/scheduler
> sync
> echo 3 > /proc/sys/vm/drop_caches
> sleep 5
>
> ./fio test.job5 --output test.result.${TEST}
> done
>
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-19 15:46:16

by Nikanth K

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Sun, Jan 18, 2009 at 12:34 AM, Jens Axboe <[email protected]> wrote:

>
> As a quick test, could you try and increase the slice_idle to eg 20ms?
> Sometimes I've seen timing being slightly off, which makes us miss the
> sync window for the ls (in your case) process. Then you get a mix of
> async and sync IO all the time, which very much slows down the sync
> process.
>

Do you mean to say that 'ls' could not submit another request until
the previous sync request completes, but its idle window gets disabled
as it takes way too long to complete during heavy load? But when there
are requests in the driver, won't the idling be disabled anyway? Or did
you mean to increase slice_sync?

Thanks
Nikanth

2009-01-19 18:25:26

by Jens Axboe

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Mon, Jan 19 2009, Nikanth K wrote:
> On Sun, Jan 18, 2009 at 12:34 AM, Jens Axboe <[email protected]> wrote:
>
> >
> > As a quick test, could you try and increase the slice_idle to eg 20ms?
> > Sometimes I've seen timing being slightly off, which makes us miss the
> > sync window for the ls (in your case) process. Then you get a mix of
> > async and sync IO all the time, which very much slows down the sync
> > process.
> >
>
> Do you mean to say that 'ls' could not submit another request until
> the previous sync request completes, but its idle window gets disabled
> as it takes way too long to complete during heavy load? But when there

'ls' would never submit a new request before the previous one completes;
such is the nature of sync processes. That's the whole reason we have
the idle window.

> are requests in the driver, wont the idling be disabled anyway? Or did
> you mean to increase slice_sync?

No, idling is on a per-cfqq (process) basis. I did not mean to increase
slice_sync; that won't help at all. It's the window between submissions
of requests that I wanted to test being larger, but apparently that
wasn't the case here.

--
Jens Axboe

2009-01-19 18:28:36

by Jens Axboe

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> to create a fio job file. The said behavior is appended below as "Part
> 1 - ls I/O behavior". Note that the original "ls" test case was done
> with the anticipatory I/O scheduler, which was active by default on my
> debian system with custom vanilla 2.6.28 kernel. Also note that I am
> running this on a raid-1, but have experienced the same problem on a
> standard partition I created on the same machine.
>
> I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> consists of one dd-like job and many small jobs reading as many data as
> ls did. I used the small test script to batch run this ("Part 3 - batch
> test").
>
> The results for the ls-like jobs are interesting :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> noop 41 10563
> anticipatory 63 8185
> deadline 52 33387
> cfq 43 1420

Do you have queuing enabled on your drives? You can check that in
/sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
schedulers; it would be good for comparison.
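
Something like this, substituting your actual drives for sda/sdb:

for d in sda sdb; do
	cat /sys/block/$d/device/queue_depth
	echo 1 > /sys/block/$d/device/queue_depth
done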

raid personalities or dm complicate matters, since they introduce a
disconnect between 'ls' and the io scheduler at the bottom...

> > As a quick test, could you try and increase the slice_idle to eg 20ms?
> > Sometimes I've seen timing being slightly off, which makes us miss the
> > sync window for the ls (in your case) process. Then you get a mix of
> > async and sync IO all the time, which very much slows down the sync
> > process.
> >
>
> Just to confirm, the quick test you are taking about would be :
>
> ---
> block/cfq-iosched.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6-lttng/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6-lttng.orig/block/cfq-iosched.c 2009-01-18 15:17:32.000000000 -0500
> +++ linux-2.6-lttng/block/cfq-iosched.c 2009-01-18 15:46:38.000000000 -0500
> @@ -26,7 +26,7 @@ static const int cfq_back_penalty = 2;
> static const int cfq_slice_sync = HZ / 10;
> static int cfq_slice_async = HZ / 25;
> static const int cfq_slice_async_rq = 2;
> -static int cfq_slice_idle = HZ / 125;
> +static int cfq_slice_idle = 20;
>
> /*
> * offset from end of service tree
>
>
> It does not make much difference with the standard cfq test :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (standard) 43 1420
> cfq (20ms slice_idle) 31 1573

OK, that's good at least!

> So, I guess 1.5s delay to run ls on a directory when the cache is cold
> with a cfq I/O scheduler is somewhat acceptable, but I doubt the 8, 10
> and 33s response times for the anticipatory, noop and deadline I/O
> schedulers are. I wonder why on earth is the anticipatory I/O scheduler
> activated by default with my kernel given it results in so poor
> interactive behavior when doing large I/O ?

I see you already found out why :-)

--
Jens Axboe

2009-01-20 02:16:22

by Mathieu Desnoyers

Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > to create a fio job file. The said behavior is appended below as "Part
> > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > with the anticipatory I/O scheduler, which was active by default on my
> > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > running this on a raid-1, but have experienced the same problem on a
> > standard partition I created on the same machine.
> >
> > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > consists of one dd-like job and many small jobs reading as many data as
> > ls did. I used the small test script to batch run this ("Part 3 - batch
> > test").
> >
> > The results for the ls-like jobs are interesting :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > noop 41 10563
> > anticipatory 63 8185
> > deadline 52 33387
> > cfq 43 1420
>

Extra note: I have HZ=250 on my system. Changing to 100 or 1000 did
not make much difference (I also tried with NO_HZ enabled).

> Do you have queuing enabled on your drives? You can check that in
> /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> schedulers, would be good for comparison.
>

Here are the tests with a queue_depth of 1:

I/O scheduler       runt-min (msec)     runt-max (msec)
noop                             43               38235
anticipatory                     44                8728
deadline                         51               19751
cfq                              48                 427


Overall, I wouldn't say it makes much difference.


> raid personalities or dm complicates matters, since it introduces a
> disconnect between 'ls' and the io scheduler at the bottom...
>

Yes, ideally I should re-run those directly on the disk partitions.

I am also tempted to create a fio job file which acts like an ssh server
receiving a connection after it has been pruned from the cache while the
system is doing heavy I/O. "ssh", in this case, seems to be doing much
more I/O than a simple "ls", and I think we might want to see if cfq
behaves correctly in such a case. Most of this I/O is coming from page
faults (identified as traps in the trace), probably because the ssh
executable has been thrown out of the cache by

echo 3 > /proc/sys/vm/drop_caches

The behavior of an incoming ssh connection after clearing the cache is
appended below (Part 1 - LTTng trace for incoming ssh connection). The
job file created (Part 2) reads, for each job, a 2MB file with random
reads each between 4k and 44k. The results are very interesting for cfq:

I/O scheduler       runt-min (msec)     runt-max (msec)
noop                            586              110242
anticipatory                    531               26942
deadline                        561              108772
cfq                             523               28216

So, basically, ssh being out of the cache can take 28s to answer an
incoming ssh connection even with the cfq scheduler. This is not exactly
what I would call an acceptable latency.

Mathieu


* Part 1 - LTTng trace for incoming ssh connection


block.bio_queue: 14270.987362011 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 12312, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14270.987370577 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 21507282, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14271.002701211 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 376717312, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14271.002708852 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 398212282, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14271.994249134 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 376762504, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14271.994258500 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, sshd, , 4159, 0x0, SYSCALL { sector = 398257474, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.005047300 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 186581088, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.005054182 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 208076058, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.197046688 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 186581680, size = 45056, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.197056120 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 208076650, size = 45056, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.214463959 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 376983192, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.214469777 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 398478162, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.358980449 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 376983312, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.358986893 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, sshd, , 4159, 0x0, TRAP { sector = 398478282, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366179882 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504036296, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366188841 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525531266, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366228133 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504037392, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366233770 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525532362, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366245471 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504070144, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366250460 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525565114, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366258431 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172624, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366263414 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667594, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366271329 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172640, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366275709 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667610, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366305707 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172664, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366311569 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667634, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366320581 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172680, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366327005 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667650, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366334928 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172688, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366339671 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667658, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366351578 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172696, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.366356064 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667666, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.394371136 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172704, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.394378840 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667674, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.394396826 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 504172744, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.394402397 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525667714, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.504393076 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 376762496, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14272.504399733 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 398257466, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.651642743 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376819168, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.651650198 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398314138, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.651668568 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376819192, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.651673473 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398314162, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.813095173 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376930384, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.813103780 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398425354, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.818773204 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376983360, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.818779958 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398478330, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.867827280 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376871792, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.867834786 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398366762, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.867857878 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376871816, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14272.867863845 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398366786, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000933599 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376871832, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000941927 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398366802, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000962547 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376871856, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000967971 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398366826, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000988999 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376871896, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.000994441 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398366866, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.016781818 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557798168, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.016787698 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579293138, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.027449494 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557798264, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.027455846 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579293234, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.079950572 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557801192, size = 69632, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.079957430 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579296162, size = 69632, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.087728033 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557800984, size = 106496, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.087734033 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579295954, size = 106496, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.205730103 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376977904, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.205735312 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472874, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.213716615 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557596672, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.213725447 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579091642, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.376105867 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557632888, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.376113769 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579127858, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390329162 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744176, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390338057 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239146, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390366345 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744184, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390371136 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239154, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390384775 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744192, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390389617 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239162, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390402469 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744200, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390407113 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239170, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390420125 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744208, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390424982 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239178, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390432638 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744216, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390436805 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239186, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390462732 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744224, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.390467689 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239194, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.548801789 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744232, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.548812506 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239202, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.548844346 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557744256, size = 32768, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.548850571 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579239226, size = 32768, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555483129 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978008, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555489558 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472978, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555502566 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978016, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555507462 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472986, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555513691 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978024, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555518362 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472994, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555522790 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978032, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555527365 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473002, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555531940 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978040, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555536359 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473010, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555540953 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555545306 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555549707 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978056, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555554228 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473026, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555565226 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978064, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.555583185 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473034, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.556111195 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978072, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.556116436 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473042, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.556132550 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978104, size = 24576, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.556137395 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473074, size = 24576, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.557633755 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979192, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.557639746 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474162, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.557651417 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979240, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.557655782 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474210, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558790122 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978680, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558797670 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473650, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558810157 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978688, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558815023 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473658, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558826051 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978736, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.558830869 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473706, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.559618325 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978744, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.559624455 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473714, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.559648476 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978760, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.559653673 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473730, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.560470401 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557632776, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.560475954 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579127746, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.564633093 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557647824, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.564639949 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579142794, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.570412202 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557647944, size = 36864, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.570417494 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579142914, size = 36864, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.570432050 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557648024, size = 28672, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.570436544 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579142994, size = 28672, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.573250317 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557648112, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.573255825 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579143082, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.573813668 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557648208, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.573819380 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579143178, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.574357597 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557649240, size = 69632, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.574363720 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579144210, size = 69632, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.579745509 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557632816, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.579750936 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579127786, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.580137575 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557649536, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.580143137 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579144506, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.581782686 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557649648, size = 28672, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.581787972 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579144618, size = 28672, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.581798890 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557649712, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.581803213 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579144682, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.583373838 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376980416, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.583379589 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398475386, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.592597554 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376982864, size = 77824, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.592603461 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398477834, size = 77824, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.605484632 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 557649424, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.605490392 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 579144394, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.606285537 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376766472, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.606292749 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398261442, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.618255248 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503841136, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.618262031 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525336106, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.766848612 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957088, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.766854819 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452058, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.779173851 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503857536, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.779179020 (/traces/trace-slow-ssh-pid-5555/block_3), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525352506, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.956064108 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 383516688, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.956073127 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 405011658, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14273.963661833 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504172672, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14273.963667482 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525667642, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.105890774 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503857200, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.105897887 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525352170, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.114466614 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 639844352, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14274.114471721 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 661339322, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.194546003 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503857392, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.194551112 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525352362, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.195244833 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376978584, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.195250131 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398473554, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.342679172 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376977824, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.342686069 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472794, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.342702066 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376977864, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.342706689 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398472834, size = 12288, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514308041 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979128, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514316219 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474098, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514332549 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979144, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514337418 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474114, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514354278 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979160, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514358806 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474130, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514371841 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376979176, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.514376353 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398474146, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.671607720 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 110366736, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.671614533 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 131861706, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.688855653 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503841144, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14274.688861789 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525336114, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.710775517 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957224, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14274.710783249 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452194, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.711178453 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504036272, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14274.711185887 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525531242, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14275.753947620 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 557727992, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14275.753956191 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579222962, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14275.891101527 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 558242792, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14275.891109390 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 579737762, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.054306664 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566165504, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14276.054312781 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587660474, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.202061219 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169560, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.202067900 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664530, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.343169743 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169656, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.343177097 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664626, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.435036005 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566171584, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.435042329 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587666554, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.587967625 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170576, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.587975446 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665546, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.714877542 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566171080, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.714885441 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587666050, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.885331923 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170824, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14276.885338400 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665794, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.041004774 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170696, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.041011242 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665666, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.090024321 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170760, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.090030807 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665730, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.139160617 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170792, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.139166503 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665762, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.146527238 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170808, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.146532806 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665778, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.147041642 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170816, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.147046664 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665786, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.147056378 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566170832, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.147060909 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587665802, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.149654636 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504086544, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.149661995 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525581514, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.299441568 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566165512, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.299449098 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587660482, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.316058849 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566165608, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.316064702 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587660578, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.316655231 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566167536, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.316661231 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587662506, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.319198772 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566168544, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.319204644 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587663514, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.325427594 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.325432190 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.327980237 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169296, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.327985268 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664266, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.329234978 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169168, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.329239811 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664138, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.330769742 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169104, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.330775631 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664074, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.331300113 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169136, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.331305777 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664106, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.331634685 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169120, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.331640664 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664090, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.332191280 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169112, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.332198036 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664082, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.332857870 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 641990688, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14277.332863016 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 663485658, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.339925356 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504086552, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.339930549 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525581522, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.350000251 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503840960, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14277.350007112 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525335930, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.360440736 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503844888, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.360446037 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525339858, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.417649469 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503841152, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14277.417655383 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525336122, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418058555 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957240, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418063403 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452210, size = 16384, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418555076 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957272, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418560377 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452242, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418570217 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957280, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418574897 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452250, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418581063 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957288, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418585764 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452258, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418590078 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957296, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418594614 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452266, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418598451 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957304, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418602756 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452274, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418606908 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957312, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418611238 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452282, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418615216 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957320, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418619527 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452290, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418623322 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957328, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418627663 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452298, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418836246 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957336, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.418841193 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452306, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.419381341 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957344, size = 65536, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.419386225 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452314, size = 65536, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.419849133 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503957472, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.419853747 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525452442, size = 20480, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.576690908 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 110510128, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.576698949 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 132005098, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.588845789 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503988328, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.588852656 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525483298, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.601952879 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503873536, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14277.601959539 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525368506, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.060232543 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376983048, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.060241912 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398478018, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.064129159 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503857272, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.064138655 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525352242, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.071310370 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504037776, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.071330264 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525532746, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.080891196 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503939072, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.080897109 (/traces/trace-slow-ssh-pid-5555/block_1), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525434042, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084320641 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376947512, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084328574 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398442482, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084343616 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376947552, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084348755 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398442522, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084358266 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376947568, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084363390 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398442538, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084378252 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376947576, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.084383308 (/traces/trace-slow-ssh-pid-5555/block_0), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398442546, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.096592889 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376947584, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.096599909 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398442554, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.096953622 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 376946984, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.096958890 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 398441954, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.101879473 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503955464, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.101885305 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525450434, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.118154240 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503971864, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.118162137 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525466834, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.126133387 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503988608, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.126139687 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525483578, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.136351623 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503857280, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.136357399 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525352250, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.138499766 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169080, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.138506375 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664050, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139160026 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169064, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139165315 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664034, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139782848 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169072, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139788161 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664042, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139799535 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 566169088, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.139804017 (/traces/trace-slow-ssh-pid-5555/block_5), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 587664058, size = 8192, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.141005857 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503841632, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.141012172 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525336602, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.149367501 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 503956240, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.149373775 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525451210, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.155173707 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 315408384, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.155179359 (/traces/trace-slow-ssh-pid-5555/block_6), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 336903354, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.169842985 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 483393984, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.169849091 (/traces/trace-slow-ssh-pid-5555/block_7), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504888954, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.180896269 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 483400808, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.180903577 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 504895778, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.184431117 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 483795656, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.184437162 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 505290626, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.209624125 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 503923064, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.209631628 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, TRAP { sector = 525418034, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.221083451 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 503873552, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.221090019 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 525368522, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.318767351 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 640040968, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 10, not_uptodate = 0 }
block.bio_queue: 14278.318773435 (/traces/trace-slow-ssh-pid-5555/block_4), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 661535938, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.325009226 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 641367208, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.325014566 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 662862178, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.330573352 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 641367216, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }
block.bio_queue: 14278.330579649 (/traces/trace-slow-ssh-pid-5555/block_2), 5555, 5555, /usr/sbin/sshd, , 4159, 0x0, SYSCALL { sector = 662862186, size = 4096, rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT, = 0, not_uptodate = 0 }


* Part 2 - ssh connexion job file (test.job.ssh)

[job1]
rw=write
size=10240m
direct=0
blocksize=1024k

[global]
rw=randread
size=2048k
filesize=30m
direct=0
bsrange=4k-44k

[file1]
startdelay=0

[file2]
startdelay=4

[file3]
startdelay=8

[file4]
startdelay=12

[file5]
startdelay=16

[file6]
startdelay=20

[file7]
startdelay=24

[file8]
startdelay=28

[file9]
startdelay=32

[file10]
startdelay=36

[file11]
startdelay=40

[file12]
startdelay=44

[file13]
startdelay=48

[file14]
startdelay=52

[file15]
startdelay=56

[file16]
startdelay=60

[file17]
startdelay=64

[file18]
startdelay=68

[file19]
startdelay=72

[file20]
startdelay=76

[file21]
startdelay=80

[file22]
startdelay=84

[file23]
startdelay=88

[file24]
startdelay=92

[file25]
startdelay=96

[file26]
startdelay=100

[file27]
startdelay=104

[file28]
startdelay=108

[file29]
startdelay=112

[file30]
startdelay=116

[file31]
startdelay=120

[file32]
startdelay=124

[file33]
startdelay=128

[file34]
startdelay=132

[file35]
startdelay=134

[file36]
startdelay=138

[file37]
startdelay=142

[file38]
startdelay=146

[file39]
startdelay=150

[file40]
startdelay=200

[file41]
startdelay=260
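
A batch run over the four schedulers can be sketched roughly as follows
(a sketch only: it assumes the job file above is saved as test.job.ssh
and that the disk under test is sda; the actual batch script used for
the runs is not reproduced here) :

#!/bin/sh
# Rough sketch: for each I/O scheduler, drop the page cache and run the
# job file above, keeping the per-job runt= lines for comparison (the
# runt-min/runt-max columns in the tables are the min/max of these).
for sched in noop anticipatory deadline cfq; do
        echo $sched > /sys/block/sda/queue/scheduler
        echo 3 > /proc/sys/vm/drop_caches
        fio test.job.ssh > result-$sched.txt
        grep "runt=" result-$sched.txt
done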

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 07:38:51

by Jens Axboe

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> * Jens Axboe ([email protected]) wrote:
> > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > to create a fio job file. The said behavior is appended below as "Part
> > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > with the anticipatory I/O scheduler, which was active by default on my
> > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > running this on a raid-1, but have experienced the same problem on a
> > > standard partition I created on the same machine.
> > >
> > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > consists of one dd-like job and many small jobs reading as many data as
> > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > test").
> > >
> > > The results for the ls-like jobs are interesting :
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > noop 41 10563
> > > anticipatory 63 8185
> > > deadline 52 33387
> > > cfq 43 1420
> >
>
> Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> not make much difference (also tried with NO_HZ enabled).
>
> > Do you have queuing enabled on your drives? You can check that in
> > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > schedulers, would be good for comparison.
> >
>
> Here are the tests with a queue_depth of 1 :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> noop 43 38235
> anticipatory 44 8728
> deadline 51 19751
> cfq 48 427
>
>
> Overall, I wouldn't say it makes much difference.

0,5 seconds vs 1,5 seconds isn't much of a difference?

> > raid personalities or dm complicates matters, since it introduces a
> > disconnect between 'ls' and the io scheduler at the bottom...
> >
>
> Yes, ideally I should re-run those directly on the disk partitions.

At least for comparison.

> I am also tempted to create a fio job file which acts like a ssh server
> receiving a connexion after it has been pruned from the cache while the
> system if doing heavy I/O. "ssh", in this case, seems to be doing much
> more I/O than a simple "ls", and I think we might want to see if cfq
> behaves correctly in such case. Most of this I/O is coming from page
> faults (identified as traps in the trace) probably because the ssh
> executable has been thrown out of the cache by
>
> echo 3 > /proc/sys/vm/drop_caches
>
> The behavior of an incoming ssh connexion after clearing the cache is
> appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> job file created (Part 2) reads, for each job, a 2MB file with random
> reads each between 4k-44k. The results are very interesting for cfq :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> noop 586 110242
> anticipatory 531 26942
> deadline 561 108772
> cfq 523 28216
>
> So, basically, ssh being out of the cache can take 28s to answer an
> incoming ssh connexion even with the cfq scheduler. This is not exactly
> what I would call an acceptable latency.

At some point, you have to stop and consider what is acceptable
performance for a given IO pattern. Your ssh test case is purely random
IO, and neither CFQ nor AS would do any idling for that. We can make
this test case faster for sure, the hard part is making sure that we
don't regress on async throughput at the same time.

Also remember that with your raid1, it's not entirely reasonable to
blame all performance issues on the IO scheduler as per my previous
mail. It would be a lot more fair to view the disk numbers individually.

Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
to 1 as well?
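
(For reference, a minimal sketch of how those two CFQ tunables would be
set through sysfs, assuming the device under test is sda; the defaults
are quantum=4 and slice_async_rq=2:)

# Sketch, assuming the device is sda: lower the CFQ dispatch quantum and
# the async slice allowance from their defaults (4 and 2) to 1.
echo 1 > /sys/block/sda/queue/iosched/quantum
echo 1 > /sys/block/sda/queue/iosched/slice_async_rq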

However, I think we should be doing somewhat better at this test case.

--
Jens Axboe

2009-01-20 12:30:38

by Jens Axboe

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Tue, Jan 20 2009, Jens Axboe wrote:
> On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > * Jens Axboe ([email protected]) wrote:
> > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > to create a fio job file. The said behavior is appended below as "Part
> > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > running this on a raid-1, but have experienced the same problem on a
> > > > standard partition I created on the same machine.
> > > >
> > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > consists of one dd-like job and many small jobs reading as many data as
> > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > test").
> > > >
> > > > The results for the ls-like jobs are interesting :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 41 10563
> > > > anticipatory 63 8185
> > > > deadline 52 33387
> > > > cfq 43 1420
> > >
> >
> > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > not make much difference (also tried with NO_HZ enabled).
> >
> > > Do you have queuing enabled on your drives? You can check that in
> > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > schedulers, would be good for comparison.
> > >
> >
> > Here are the tests with a queue_depth of 1 :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > noop 43 38235
> > anticipatory 44 8728
> > deadline 51 19751
> > cfq 48 427
> >
> >
> > Overall, I wouldn't say it makes much difference.
>
> 0,5 seconds vs 1,5 seconds isn't much of a difference?
>
> > > raid personalities or dm complicates matters, since it introduces a
> > > disconnect between 'ls' and the io scheduler at the bottom...
> > >
> >
> > Yes, ideally I should re-run those directly on the disk partitions.
>
> At least for comparison.
>
> > I am also tempted to create a fio job file which acts like a ssh server
> > receiving a connexion after it has been pruned from the cache while the
> > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > more I/O than a simple "ls", and I think we might want to see if cfq
> > behaves correctly in such case. Most of this I/O is coming from page
> > faults (identified as traps in the trace) probably because the ssh
> > executable has been thrown out of the cache by
> >
> > echo 3 > /proc/sys/vm/drop_caches
> >
> > The behavior of an incoming ssh connexion after clearing the cache is
> > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > job file created (Part 2) reads, for each job, a 2MB file with random
> > reads each between 4k-44k. The results are very interesting for cfq :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > noop 586 110242
> > anticipatory 531 26942
> > deadline 561 108772
> > cfq 523 28216
> >
> > So, basically, ssh being out of the cache can take 28s to answer an
> > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > what I would call an acceptable latency.
>
> At some point, you have to stop and consider what is acceptable
> performance for a given IO pattern. Your ssh test case is purely random
> IO, and neither CFQ nor AS would do any idling for that. We can make
> this test case faster for sure, the hard part is making sure that we
> don't regress on async throughput at the same time.
>
> Also remember that with your raid1, it's not entirely reasonable to
> blaim all performance issues on the IO scheduler as per my previous
> mail. It would be a lot more fair to view the disk numbers individually.
>
> Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> to 1 as well?
>
> However, I think we should be doing somewhat better at this test case.

Mathieu, does this improve anything for you?

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e8525fa..a556512 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1765,6 +1765,32 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
}

/*
+ * Pull dispatched requests from 'cfqq' back into the scheduler
+ */
+static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
+ struct cfq_queue *cfqq)
+{
+ struct request_queue *q = cfqd->queue;
+ struct request *rq, *tmp;
+
+ list_for_each_entry_safe(rq, tmp, &q->queue_head, queuelist) {
+ if ((rq->cmd_flags & REQ_STARTED) || RQ_CFQQ(rq) != cfqq)
+ continue;
+
+ /*
+ * Pull off the dispatch list and put it back into the cfqq
+ */
+ list_del(&rq->queuelist);
+ cfqq->dispatched--;
+ if (cfq_cfqq_sync(cfqq))
+ cfqd->sync_flight--;
+
+ list_add_tail(&rq->queuelist, &cfqq->fifo);
+ cfq_add_rq_rb(rq);
+ }
+}
+
+/*
* Check if new_cfqq should preempt the currently active queue. Return 0 for
* no or if we aren't sure, a 1 will cause a preempt.
*/
@@ -1820,8 +1846,14 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
*/
static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
+ struct cfq_queue *old_cfqq = cfqd->active_queue;
+
cfq_log_cfqq(cfqd, cfqq, "preempt");
- cfq_slice_expired(cfqd, 1);
+
+ if (old_cfqq) {
+ __cfq_slice_expired(cfqd, old_cfqq, 1);
+ cfq_pull_dispatched_requests(cfqd, old_cfqq);
+ }

/*
* Put the new queue at the front of the of the current list,

--
Jens Axboe

2009-01-20 13:46:05

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > * Jens Axboe ([email protected]) wrote:
> > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > to create a fio job file. The said behavior is appended below as "Part
> > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > running this on a raid-1, but have experienced the same problem on a
> > > > standard partition I created on the same machine.
> > > >
> > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > consists of one dd-like job and many small jobs reading as many data as
> > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > test").
> > > >
> > > > The results for the ls-like jobs are interesting :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 41 10563
> > > > anticipatory 63 8185
> > > > deadline 52 33387
> > > > cfq 43 1420
> > >
> >
> > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > not make much difference (also tried with NO_HZ enabled).
> >
> > > Do you have queuing enabled on your drives? You can check that in
> > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > schedulers, would be good for comparison.
> > >
> >
> > Here are the tests with a queue_depth of 1 :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > noop 43 38235
> > anticipatory 44 8728
> > deadline 51 19751
> > cfq 48 427
> >
> >
> > Overall, I wouldn't say it makes much difference.
>
> 0,5 seconds vs 1,5 seconds isn't much of a difference?
>

Threefold, yes, that's significant, but not in terms of usability in
that specific case.

> > > raid personalities or dm complicates matters, since it introduces a
> > > disconnect between 'ls' and the io scheduler at the bottom...
> > >
> >
> > Yes, ideally I should re-run those directly on the disk partitions.
>
> At least for comparison.
>

Here it is. ssh test done on /dev/sda directly

queue_depth=31 (default)
/sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sda/queue/iosched/quantum = 4 (default)

I/O scheduler runt-min (msec) runt-max (msec)
noop 612 205684
anticipatory 562 5555
deadline 505 113153
cfq 523 6637

> > I am also tempted to create a fio job file which acts like a ssh server
> > receiving a connexion after it has been pruned from the cache while the
> > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > more I/O than a simple "ls", and I think we might want to see if cfq
> > behaves correctly in such case. Most of this I/O is coming from page
> > faults (identified as traps in the trace) probably because the ssh
> > executable has been thrown out of the cache by
> >
> > echo 3 > /proc/sys/vm/drop_caches
> >
> > The behavior of an incoming ssh connexion after clearing the cache is
> > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > job file created (Part 2) reads, for each job, a 2MB file with random
> > reads each between 4k-44k. The results are very interesting for cfq :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > noop 586 110242
> > anticipatory 531 26942
> > deadline 561 108772
> > cfq 523 28216
> >
> > So, basically, ssh being out of the cache can take 28s to answer an
> > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > what I would call an acceptable latency.
>
> At some point, you have to stop and consider what is acceptable
> performance for a given IO pattern. Your ssh test case is purely random
> IO, and neither CFQ nor AS would do any idling for that. We can make
> this test case faster for sure, the hard part is making sure that we
> don't regress on async throughput at the same time.
>
> Also remember that with your raid1, it's not entirely reasonable to
> blaim all performance issues on the IO scheduler as per my previous
> mail. It would be a lot more fair to view the disk numbers individually.
>
> Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> to 1 as well?
>

Sure, ssh test done on /dev/sda

queue_depth=31 (default)
/sys/block/sda/queue/iosched/slice_async_rq = 1
/sys/block/sda/queue/iosched/quantum = 1

I/O scheduler runt-min (msec) runt-max (msec)
cfq (default) 523 6637
cfq (s_rq=1,q=1) 503 6743

It did not make much of a difference.

Mathieu


> However, I think we should be doing somewhat better at this test case.
>
> --
> Jens Axboe
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 14:22:42

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Tue, Jan 20 2009, Jens Axboe wrote:
> > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > * Jens Axboe ([email protected]) wrote:
> > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > standard partition I created on the same machine.
> > > > >
> > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > test").
> > > > >
> > > > > The results for the ls-like jobs are interesting :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 41 10563
> > > > > anticipatory 63 8185
> > > > > deadline 52 33387
> > > > > cfq 43 1420
> > > >
> > >
> > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > not make much difference (also tried with NO_HZ enabled).
> > >
> > > > Do you have queuing enabled on your drives? You can check that in
> > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > schedulers, would be good for comparison.
> > > >
> > >
> > > Here are the tests with a queue_depth of 1 :
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > noop 43 38235
> > > anticipatory 44 8728
> > > deadline 51 19751
> > > cfq 48 427
> > >
> > >
> > > Overall, I wouldn't say it makes much difference.
> >
> > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> >
> > > > raid personalities or dm complicates matters, since it introduces a
> > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > >
> > >
> > > Yes, ideally I should re-run those directly on the disk partitions.
> >
> > At least for comparison.
> >
> > > I am also tempted to create a fio job file which acts like a ssh server
> > > receiving a connexion after it has been pruned from the cache while the
> > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > behaves correctly in such case. Most of this I/O is coming from page
> > > faults (identified as traps in the trace) probably because the ssh
> > > executable has been thrown out of the cache by
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > >
> > > The behavior of an incoming ssh connexion after clearing the cache is
> > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > reads each between 4k-44k. The results are very interesting for cfq :
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > noop 586 110242
> > > anticipatory 531 26942
> > > deadline 561 108772
> > > cfq 523 28216
> > >
> > > So, basically, ssh being out of the cache can take 28s to answer an
> > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > what I would call an acceptable latency.
> >
> > At some point, you have to stop and consider what is acceptable
> > performance for a given IO pattern. Your ssh test case is purely random
> > IO, and neither CFQ nor AS would do any idling for that. We can make
> > this test case faster for sure, the hard part is making sure that we
> > don't regress on async throughput at the same time.
> >
> > Also remember that with your raid1, it's not entirely reasonable to
> > blaim all performance issues on the IO scheduler as per my previous
> > mail. It would be a lot more fair to view the disk numbers individually.
> >
> > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > to 1 as well?
> >
> > However, I think we should be doing somewhat better at this test case.
>
> Mathieu, does this improve anything for you?
>

I got this message when running with your patch applied :
cfq: forced dispatching is broken (nr_sorted=4294967275), please report this
(message appeared 10 times in a job run)

Here is the result :

ssh test done on /dev/sda directly

queue_depth=31 (default)
/sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sda/queue/iosched/quantum = 4 (default)

I/O scheduler runt-min (msec) runt-max (msec)
cfq (default) 523 6637
cfq (patched) 564 7195

Pretty much the same.

Here is the test done on raid1 :
queue_depth=31 (default)
/sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)

I/O scheduler runt-min (msec) runt-max (msec)
cfq (default, raid1) 523 28216
cfq (patched, raid1) 540 16454

With nearly the same order of magnitude in the worst case.

Mathieu


> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index e8525fa..a556512 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1765,6 +1765,32 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> }
>
> /*
> + * Pull dispatched requests from 'cfqq' back into the scheduler
> + */
> +static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
> + struct cfq_queue *cfqq)
> +{
> + struct request_queue *q = cfqd->queue;
> + struct request *rq, *tmp;
> +
> + list_for_each_entry_safe(rq, tmp, &q->queue_head, queuelist) {
> + if ((rq->cmd_flags & REQ_STARTED) || RQ_CFQQ(rq) != cfqq)
> + continue;
> +
> + /*
> + * Pull off the dispatch list and put it back into the cfqq
> + */
> + list_del(&rq->queuelist);
> + cfqq->dispatched--;
> + if (cfq_cfqq_sync(cfqq))
> + cfqd->sync_flight--;
> +
> + list_add_tail(&rq->queuelist, &cfqq->fifo);
> + cfq_add_rq_rb(rq);
> + }
> +}
> +
> +/*
> * Check if new_cfqq should preempt the currently active queue. Return 0 for
> * no or if we aren't sure, a 1 will cause a preempt.
> */
> @@ -1820,8 +1846,14 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
> */
> static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> + struct cfq_queue *old_cfqq = cfqd->active_queue;
> +
> cfq_log_cfqq(cfqd, cfqq, "preempt");
> - cfq_slice_expired(cfqd, 1);
> +
> + if (old_cfqq) {
> + __cfq_slice_expired(cfqd, old_cfqq, 1);
> + cfq_pull_dispatched_requests(cfqd, old_cfqq);
> + }
>
> /*
> * Put the new queue at the front of the of the current list,
>
> --
> Jens Axboe
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 14:26:40

by Jens Axboe

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

On Tue, Jan 20 2009, Mathieu Desnoyers wrote:
> * Jens Axboe ([email protected]) wrote:
> > On Tue, Jan 20 2009, Jens Axboe wrote:
> > > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > > * Jens Axboe ([email protected]) wrote:
> > > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > > standard partition I created on the same machine.
> > > > > >
> > > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > > test").
> > > > > >
> > > > > > The results for the ls-like jobs are interesting :
> > > > > >
> > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > noop 41 10563
> > > > > > anticipatory 63 8185
> > > > > > deadline 52 33387
> > > > > > cfq 43 1420
> > > > >
> > > >
> > > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > > not make much difference (also tried with NO_HZ enabled).
> > > >
> > > > > Do you have queuing enabled on your drives? You can check that in
> > > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > > schedulers, would be good for comparison.
> > > > >
> > > >
> > > > Here are the tests with a queue_depth of 1 :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 43 38235
> > > > anticipatory 44 8728
> > > > deadline 51 19751
> > > > cfq 48 427
> > > >
> > > >
> > > > Overall, I wouldn't say it makes much difference.
> > >
> > > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> > >
> > > > > raid personalities or dm complicates matters, since it introduces a
> > > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > > >
> > > >
> > > > Yes, ideally I should re-run those directly on the disk partitions.
> > >
> > > At least for comparison.
> > >
> > > > I am also tempted to create a fio job file which acts like a ssh server
> > > > receiving a connexion after it has been pruned from the cache while the
> > > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > > behaves correctly in such case. Most of this I/O is coming from page
> > > > faults (identified as traps in the trace) probably because the ssh
> > > > executable has been thrown out of the cache by
> > > >
> > > > echo 3 > /proc/sys/vm/drop_caches
> > > >
> > > > The behavior of an incoming ssh connexion after clearing the cache is
> > > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > > reads each between 4k-44k. The results are very interesting for cfq :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 586 110242
> > > > anticipatory 531 26942
> > > > deadline 561 108772
> > > > cfq 523 28216
> > > >
> > > > So, basically, ssh being out of the cache can take 28s to answer an
> > > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > > what I would call an acceptable latency.
> > >
> > > At some point, you have to stop and consider what is acceptable
> > > performance for a given IO pattern. Your ssh test case is purely random
> > > IO, and neither CFQ nor AS would do any idling for that. We can make
> > > this test case faster for sure, the hard part is making sure that we
> > > don't regress on async throughput at the same time.
> > >
> > > Also remember that with your raid1, it's not entirely reasonable to
> > > blaim all performance issues on the IO scheduler as per my previous
> > > mail. It would be a lot more fair to view the disk numbers individually.
> > >
> > > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > > to 1 as well?
> > >
> > > However, I think we should be doing somewhat better at this test case.
> >
> > Mathieu, does this improve anything for you?
> >
>
> I got this message when running with your patch applied :
> cfq: forced dispatching is broken (nr_sorted=4294967275), please report this
> (message appeared 10 times in a job run)

Whoops, missed a sort increment (nr_sorted). Updated version below, or
just ignore the warning.

> Here is the result :
>
> ssh test done on /dev/sda directly
>
> queue_depth=31 (default)
> /sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
> /sys/block/sda/queue/iosched/quantum = 4 (default)
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (default) 523 6637
> cfq (patched) 564 7195
>
> Pretty much the same.

Can you retry with depth=1 as well? There's not much to rip back out, if
everything is immediately sent to the device.

>
> Here is the test done on raid1 :
> queue_depth=31 (default)
> /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (default, raid1) 523 28216
> cfq (patched, raid1) 540 16454
>
> With nearly same order of magnitude worse-case.

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e8525fa..30714de 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1765,6 +1765,36 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
}

/*
+ * Pull dispatched requests from 'cfqq' back into the scheduler
+ */
+static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
+ struct cfq_queue *cfqq)
+{
+ struct request_queue *q = cfqd->queue;
+ struct request *rq;
+
+ list_for_each_entry_reverse(rq, &q->queue_head, queuelist) {
+ if (rq->cmd_flags & REQ_STARTED)
+ break;
+
+ if (RQ_CFQQ(rq) != cfqq)
+ continue;
+
+ /*
+ * Pull off the dispatch list and put it back into the cfqq
+ */
+ list_del(&rq->queuelist);
+ cfqq->dispatched--;
+ if (cfq_cfqq_sync(cfqq))
+ cfqd->sync_flight--;
+
+ cfq_add_rq_rb(rq);
+ q->nr_sorted++;
+ list_add_tail(&rq->queuelist, &cfqq->fifo);
+ }
+}
+
+/*
* Check if new_cfqq should preempt the currently active queue. Return 0 for
* no or if we aren't sure, a 1 will cause a preempt.
*/
@@ -1820,8 +1850,14 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
*/
static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
+ struct cfq_queue *old_cfqq = cfqd->active_queue;
+
cfq_log_cfqq(cfqd, cfqq, "preempt");
- cfq_slice_expired(cfqd, 1);
+
+ if (old_cfqq) {
+ __cfq_slice_expired(cfqd, old_cfqq, 1);
+ cfq_pull_dispatched_requests(cfqd, old_cfqq);
+ }

/*
* Put the new queue at the front of the of the current list,

--
Jens Axboe

2009-01-20 15:42:44

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Tue, Jan 20 2009, Mathieu Desnoyers wrote:
> > * Jens Axboe ([email protected]) wrote:
> > > On Tue, Jan 20 2009, Jens Axboe wrote:
> > > > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > > > * Jens Axboe ([email protected]) wrote:
> > > > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > > > standard partition I created on the same machine.
> > > > > > >
> > > > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > > > test").
> > > > > > >
> > > > > > > The results for the ls-like jobs are interesting :
> > > > > > >
> > > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > > noop 41 10563
> > > > > > > anticipatory 63 8185
> > > > > > > deadline 52 33387
> > > > > > > cfq 43 1420
> > > > > >
> > > > >
> > > > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > > > not make much difference (also tried with NO_HZ enabled).
> > > > >
> > > > > > Do you have queuing enabled on your drives? You can check that in
> > > > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > > > schedulers, would be good for comparison.
> > > > > >
> > > > >
> > > > > Here are the tests with a queue_depth of 1 :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 43 38235
> > > > > anticipatory 44 8728
> > > > > deadline 51 19751
> > > > > cfq 48 427
> > > > >
> > > > >
> > > > > Overall, I wouldn't say it makes much difference.
> > > >
> > > > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> > > >
> > > > > > raid personalities or dm complicates matters, since it introduces a
> > > > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > > > >
> > > > >
> > > > > Yes, ideally I should re-run those directly on the disk partitions.
> > > >
> > > > At least for comparison.
> > > >
> > > > > I am also tempted to create a fio job file which acts like a ssh server
> > > > > receiving a connexion after it has been pruned from the cache while the
> > > > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > > > behaves correctly in such case. Most of this I/O is coming from page
> > > > > faults (identified as traps in the trace) probably because the ssh
> > > > > executable has been thrown out of the cache by
> > > > >
> > > > > echo 3 > /proc/sys/vm/drop_caches
> > > > >
> > > > > The behavior of an incoming ssh connexion after clearing the cache is
> > > > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > > > reads each between 4k-44k. The results are very interesting for cfq :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 586 110242
> > > > > anticipatory 531 26942
> > > > > deadline 561 108772
> > > > > cfq 523 28216
> > > > >
> > > > > So, basically, ssh being out of the cache can take 28s to answer an
> > > > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > > > what I would call an acceptable latency.
> > > >
> > > > At some point, you have to stop and consider what is acceptable
> > > > performance for a given IO pattern. Your ssh test case is purely random
> > > > IO, and neither CFQ nor AS would do any idling for that. We can make
> > > > this test case faster for sure, the hard part is making sure that we
> > > > don't regress on async throughput at the same time.
> > > >
> > > > Also remember that with your raid1, it's not entirely reasonable to
> > > > blaim all performance issues on the IO scheduler as per my previous
> > > > mail. It would be a lot more fair to view the disk numbers individually.
> > > >
> > > > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > > > to 1 as well?
> > > >
> > > > However, I think we should be doing somewhat better at this test case.
> > >
> > > Mathieu, does this improve anything for you?
> > >
> >
> > I got this message when running with your patch applied :
> > cfq: forced dispatching is broken (nr_sorted=4294967275), please report this
> > (message appeared 10 times in a job run)
>
> Woops, missed a sort inc. Updated version below, or just ignore the
> warning.
>
> > Here is the result :
> >
> > ssh test done on /dev/sda directly
> >
> > queue_depth=31 (default)
> > /sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
> > /sys/block/sda/queue/iosched/quantum = 4 (default)
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (default) 523 6637
> > cfq (patched) 564 7195
> >
> > Pretty much the same.
>
> Can you retry with depth=1 as well? There's not much to rip back out, if
> everything is immediately sent to the device.
>

echo 1 > /sys/block/sda/queue/iosched/quantum
echo 1 > /sys/block/sda/queue/iosched/slice_async_rq
echo 1 > /sys/block/sda/device/queue_depth

ssh test done on /dev/sda directly

Oops, something wrong in the new patch?


[ 302.077063] BUG: unable to handle kernel paging request at 00000008
[ 302.078732] IP: [<ffffffff8040a1e5>] cfq_remove_request+0x35/0x1d0
[ 302.078732] PGD 43ac76067 PUD 43b1f3067 PMD 0
[ 302.078732] Oops: 0002 [#1] PREEMPT SMP
[ 302.078732] LTT NESTING LEVEL : 0
[ 302.078732] last sysfs file: /sys/block/sda/stat
[ 302.078732] Dumping ftrace buffer:
[ 302.078732] (ftrace buffer empty)
[ 302.078732] CPU 0
[ 302.078732] Modules linked in: e1000e loop ltt_tracer ltt_trace_control ltt_e
[ 302.078732] Pid: 3748, comm: cron Not tainted 2.6.28 #53
[ 302.078732] RIP: 0010:[<ffffffff8040a1e5>] [<ffffffff8040a1e5>] cfq_remove_0
[ 302.078732] RSP: 0018:ffff8804388a38a8 EFLAGS: 00010087
[ 302.078732] RAX: 0000000000200200 RBX: ffff880437d92000 RCX: 000000002bcde392
[ 302.078732] RDX: 0000000000100100 RSI: ffff880437d92fd0 RDI: ffff880437d92fd0
[ 302.078732] RBP: ffff8804388a38d8 R08: ffff88043e8ce608 R09: 000000002bcdb78a
[ 302.078732] R10: 000000002bcdbb8a R11: 0000000000000808 R12: ffff88043e8ce5d8
[ 302.078732] R13: ffff880437d92fd0 R14: ffff88043e433800 R15: ffff88043e8ce5d8
[ 302.078732] FS: 00007fd9637ea780(0000) GS:ffffffff808de7c0(0000) knlGS:00000
[ 302.078732] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 302.078732] CR2: 0000000000100108 CR3: 000000043ad52000 CR4: 00000000000006e0
[ 302.078732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 302.078732] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 302.078732] Process cron (pid: 3748, threadinfo ffff8804388a2000, task ffff8)
[ 302.078732] Stack:
[ 302.078732] ffff88043e8ce5e8 ffff880437d92fd0 ffff88043e8ce5d8 ffff88043d550
[ 302.078732] ffff88043e433800 ffff88043e433800 ffff8804388a3908 ffffffff8040d
[ 302.078732] ffff88043e8ce5d8 ffff88043e433800 ffff880437d92fd0 ffff88043e8c8
[ 302.078732] Call Trace:
[ 302.078732] [<ffffffff8040a3bd>] cfq_dispatch_insert+0x3d/0x70
[ 302.078732] [<ffffffff8040a43c>] cfq_add_rq_rb+0x4c/0xb0
[ 302.078732] [<ffffffff8040ab6f>] cfq_insert_request+0x24f/0x420
[ 302.078732] [<ffffffff803fac30>] elv_insert+0x160/0x2f0
[ 302.078732] [<ffffffff803fae3b>] __elv_add_request+0x7b/0xd0
[ 302.078732] [<ffffffff803fe02d>] __make_request+0xfd/0x4f0
[ 302.078732] [<ffffffff803fc39c>] generic_make_request+0x40c/0x550
[ 302.078732] [<ffffffff8029ccab>] ? mempool_alloc+0x5b/0x150
[ 302.078732] [<ffffffff802f54c8>] ? __find_get_block+0xc8/0x210
[ 302.078732] [<ffffffff803fc582>] submit_bio+0xa2/0x150
[ 302.078732] [<ffffffff802fa75e>] ? bio_alloc_bioset+0x5e/0x100
[ 302.078732] [<ffffffff802f4d26>] submit_bh+0xf6/0x130
[ 302.078732] [<ffffffff8032fbc4>] __ext3_get_inode_loc+0x224/0x340
[ 302.078732] [<ffffffff8032fd40>] ext3_iget+0x60/0x420
[ 302.078732] [<ffffffff80336e68>] ext3_lookup+0xa8/0x100
[ 302.078732] [<ffffffff802e3d46>] ? d_alloc+0x186/0x1f0
[ 302.078732] [<ffffffff802d92a6>] do_lookup+0x206/0x260
[ 302.078732] [<ffffffff802db4f6>] __link_path_walk+0x756/0xfe0
[ 302.078732] [<ffffffff80262cd4>] ? get_lock_stats+0x34/0x70
[ 302.078732] [<ffffffff802dc16b>] ? do_path_lookup+0x9b/0x200
[ 302.078732] [<ffffffff802dbf9e>] path_walk+0x6e/0xe0
[ 302.078732] [<ffffffff802dc176>] do_path_lookup+0xa6/0x200
[ 302.078732] [<ffffffff802dad36>] ? getname+0x1c6/0x230
[ 302.078732] [<ffffffff802dd02b>] user_path_at+0x7b/0xb0
[ 302.078732] [<ffffffff8067d3a7>] ? _spin_unlock_irqrestore+0x47/0x80
[ 302.078732] [<ffffffff80259ad3>] ? hrtimer_try_to_cancel+0x53/0xb0
[ 302.078732] [<ffffffff80259b52>] ? hrtimer_cancel+0x22/0x30
[ 302.078732] [<ffffffff802d414d>] vfs_stat_fd+0x2d/0x60
[ 302.078732] [<ffffffff802d422c>] sys_newstat+0x2c/0x50
[ 302.078732] [<ffffffff80265901>] ? trace_hardirqs_on_caller+0x1b1/0x210
[ 302.078732] [<ffffffff8067cd0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 302.078732] [<ffffffff8020c5db>] system_call_fastpath+0x16/0x1b
[ 302.078732] Code: 41 54 53 48 83 ec 08 0f 1f 44 00 00 4c 8b bf c0 00 00 00 4
[ 302.078732] RIP [<ffffffff8040a1e5>] cfq_remove_request+0x35/0x1d0
[ 302.078732] RSP <ffff8804388a38a8>
[ 302.078732] CR2: 0000000000100108
[ 302.078732] ---[ end trace 925e67a354a83fdc ]---
[ 302.078732] note: cron[3748] exited with preempt_count 1



> >
> > Here is the test done on raid1 :
> > queue_depth=31 (default)
> > /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> > /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (default, raid1) 523 28216
> > cfq (patched, raid1) 540 16454
> >
> > With nearly same order of magnitude worse-case.
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index e8525fa..30714de 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1765,6 +1765,36 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> }
>
> /*
> + * Pull dispatched requests from 'cfqq' back into the scheduler
> + */
> +static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
> + struct cfq_queue *cfqq)
> +{
> + struct request_queue *q = cfqd->queue;
> + struct request *rq;
> +
> + list_for_each_entry_reverse(rq, &q->queue_head, queuelist) {
> + if (rq->cmd_flags & REQ_STARTED)
> + break;
> +
> + if (RQ_CFQQ(rq) != cfqq)
> + continue;
> +
> + /*
> + * Pull off the dispatch list and put it back into the cfqq
> + */
> + list_del(&rq->queuelist);
> + cfqq->dispatched--;
> + if (cfq_cfqq_sync(cfqq))
> + cfqd->sync_flight--;
> +
> + cfq_add_rq_rb(rq);
> + q->nr_sorted++;
> + list_add_tail(&rq->queuelist, &cfqq->fifo);
> + }
> +}
> +
> +/*
> * Check if new_cfqq should preempt the currently active queue. Return 0 for
> * no or if we aren't sure, a 1 will cause a preempt.
> */
> @@ -1820,8 +1850,14 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
> */
> static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> + struct cfq_queue *old_cfqq = cfqd->active_queue;
> +
> cfq_log_cfqq(cfqd, cfqq, "preempt");
> - cfq_slice_expired(cfqd, 1);
> +
> + if (old_cfqq) {
> + __cfq_slice_expired(cfqd, old_cfqq, 1);
> + cfq_pull_dispatched_requests(cfqd, old_cfqq);
> + }
>
> /*
> * Put the new queue at the front of the of the current list,
>
> --
> Jens Axboe
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 20:22:24

by Ben Gamari

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Tue, Jan 20, 2009 at 2:37 AM, Jens Axboe <[email protected]> wrote:
> On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
>> * Jens Axboe ([email protected]) wrote:
>> Yes, ideally I should re-run those directly on the disk partitions.
>
> At least for comparison.
>

I just completed my own set of benchmarks using the fio job file
Mathieu provided. This was on a 2.5 inch 7200 RPM SATA partition
formatted as ext3. As you can see, I tested all of the available
schedulers with both queuing enabled and disabled. I'll test Jens'
patch soon. Would a blktrace of the fio run help? Let me know if
there's any other benchmarking or profiling that could be done.
Thanks,

- Ben


I/O scheduler                   mint            maxt
==========================================================
queue_depth=31:
anticipatory 35 msec 11036 msec
cfq 37 msec 3350 msec
deadline 36 msec 18144 msec
noop 39 msec 41512 msec

==========================================================
queue_depth=1:
anticipatory 45 msec 9561 msec
cfq 28 msec 3974 msec
deadline 47 msec 16802 msec
noop 35 msec 38173 msec
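
A blktrace of the run could be captured along these lines (a rough
sketch, assuming the disk under test is sda and the job file is
test.job.ssh):

# Rough sketch: capture block-layer events on sda for the duration of a
# fio run, then post-process them with blkparse.
blktrace -d /dev/sda -o fio-run &
BTPID=$!
fio test.job.ssh
kill -INT $BTPID
wait $BTPID
blkparse -i fio-run > fio-run.parsed.txt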

2009-01-20 22:23:21

by Ben Gamari

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

The kernel build finally finished. Unfortunately, it crashes quickly
after booting with moderate disk IO, bringing down the entire machine.
For this reason, I haven't been able to complete a fio benchmark.
Jens, what do you think about this backtrace?

- Ben


BUG: unable to handle kernel paging request at 0000000008
IP: [<ffffffff811c4b2d>] cfq_remove_request+0xb0/0x1da
PGD b2902067 PUD b292e067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0t
CPU 0
Modules linked in: aes_x86_64 aes_generic i915 drm i2c_algo_bit rfcomm bridge s]
Pid: 3903, comm: evolution Not tainted 2.6.29-rc2ben #16
RIP: 0010:[<ffffffff811c4b2d>] [<ffffffff811c4b2d>] cfq_remove_request+0xb0/0xa
RSP: 0018:ffff8800bb853758 EFLAGS: 00010006
RAX: 0000000000200200 RBX: ffff8800b28f3420 RCX: 0000000009deabeb
RDX: 0000000000100100 RSI: ffff8800b010afd0 RDI: ffff8800b010afd0
RBP: ffff8800bb853788 R08: ffff88011fc08250 R09: 000000000cf8b20b
R10: 0000000009e15923 R11: ffff8800b28f3420 R12: ffff8800b010afd0
R13: ffff8800b010afd0 R14: ffff88011d4e8000 R15: ffff88011fc08220
FS: 00007f4b1ef407e0(0000) GS:ffffffff817e7000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000100108 CR3: 00000000b284b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process evolution (pid: 3903, threadinfo ffff8800bb852000, task ffff8800da0c2de)
Stack:
ffffffff811ccc19 ffff88011fc08220 ffff8800b010afd0 ffff88011d572000
ffff88011d4e8000 ffff88011d572000 ffff8800bb8537b8 ffffffff811c4ca8
ffff88011fc08220 ffff88011d572000 ffff8800b010afd0 ffff88011fc08250
Call Trace:
[<ffffffff811ccc19>] ? rb_insert_color+0xbd/0xe6
[<ffffffff811c4ca8>] cfq_dispatch_insert+0x51/0x72
[<ffffffff811c4d0d>] cfq_add_rq_rb+0x44/0xcf
[<ffffffff811c5519>] cfq_insert_request+0x34d/0x3d1
[<ffffffff811b6d81>] elv_insert+0x1a9/0x250
[<ffffffff811b6ec3>] __elv_add_request+0x9b/0xa4
[<ffffffff811b9769>] __make_request+0x3c4/0x446
[<ffffffff811b7f53>] generic_make_request+0x2bf/0x309
[<ffffffff811b8068>] submit_bio+0xcb/0xd4
[<ffffffff810f170b>] submit_bh+0x115/0x138
[<ffffffff810f31f7>] ll_rw_block+0xa5/0xf4
[<ffffffff810f3886>] __block_prepare_write+0x277/0x306
[<ffffffff8112c759>] ? ext3_get_block+0x0/0x101
[<ffffffff810f3a7e>] block_write_begin+0x8b/0xdd
[<ffffffff8112bd66>] ext3_write_begin+0xee/0x1c0
[<ffffffff8112c759>] ? ext3_get_block+0x0/0x101
[<ffffffff8109f3be>] generic_file_buffered_write+0x12e/0x2e4
[<ffffffff8109f973>] __generic_file_aio_write_nolock+0x263/0x297
[<ffffffff810e4470>] ? touch_atime+0xdf/0x101
[<ffffffff8109feaa>] ? generic_file_aio_read+0x503/0x59c
[<ffffffff810a01ed>] generic_file_aio_write+0x6c/0xc8
[<ffffffff81128c72>] ext3_file_write+0x23/0xa5
[<ffffffff810d2d77>] do_sync_write+0xec/0x132
[<ffffffff8105da1c>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff8119c880>] ? selinux_file_permission+0x40/0xcb
[<ffffffff8119c902>] ? selinux_file_permission+0xc2/0xcb
[<ffffffff81194cc4>] ? security_file_permission+0x16/0x18
[<ffffffff810d3693>] vfs_write+0xb0/0x10a
[<ffffffff810d37bb>] sys_write+0x4c/0x74
[<ffffffff810114aa>] system_call_fastpath+0x16/0x1b
Code: 48 85 c0 74 0c 4c 39 e0 48 8d b0 60 ff ff ff 75 02 31 f6 48 8b 7d d0 48 8
RIP [<ffffffff811c4b2d>] cfq_remove_request+0xb0/0x1da
RSP <ffff8800bb853758>
CR2: 0000000000100108
---[ end trace 6c5ef63f7957c4cf ]---




On Tue, Jan 20, 2009 at 3:22 PM, Ben Gamari <[email protected]> wrote:
> On Tue, Jan 20, 2009 at 2:37 AM, Jens Axboe <[email protected]> wrote:
>> On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
>>> * Jens Axboe ([email protected]) wrote:
>>> Yes, ideally I should re-run those directly on the disk partitions.
>>
>> At least for comparison.
>>
>
> I just completed my own set of benchmarks using the fio job file
> Mathieu provided. This was on a 2.5 inch 7200 RPM SATA partition
> formatted as ext3. As you can see, I tested all of the available
> schedulers with both queuing enabled and disabled. I'll test Jens'
> patch soon. Would a blktrace of the fio run help? Let me know if
> there's any other benchmarking or profiling that could be done.
> Thanks,
>
> - Ben
>
>
> mint maxt
> ==========================================================
> queue_depth=31:
> anticipatory 35 msec 11036 msec
> cfq 37 msec 3350 msec
> deadline 36 msec 18144 msec
> noop 39 msec 41512 msec
>
> ==========================================================
> queue_depth=1:
> anticipatory 45 msec 9561 msec
> cfq 28 msec 3974 msec
> deadline 47 msec 16802 msec
> noop 35 msec 38173 msec
>

2009-01-20 23:06:46

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* Mathieu Desnoyers ([email protected]) wrote:
> * Jens Axboe ([email protected]) wrote:
> > On Tue, Jan 20 2009, Mathieu Desnoyers wrote:
> > > * Jens Axboe ([email protected]) wrote:
> > > > On Tue, Jan 20 2009, Jens Axboe wrote:
> > > > > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > > > > * Jens Axboe ([email protected]) wrote:
> > > > > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > > > > standard partition I created on the same machine.
> > > > > > > >
> > > > > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > > > > test").
> > > > > > > >
> > > > > > > > The results for the ls-like jobs are interesting :
> > > > > > > >
> > > > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > > > noop 41 10563
> > > > > > > > anticipatory 63 8185
> > > > > > > > deadline 52 33387
> > > > > > > > cfq 43 1420
> > > > > > >
> > > > > >
> > > > > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > > > > not make much difference (also tried with NO_HZ enabled).
> > > > > >
> > > > > > > Do you have queuing enabled on your drives? You can check that in
> > > > > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > > > > schedulers, would be good for comparison.
> > > > > > >
> > > > > >
> > > > > > Here are the tests with a queue_depth of 1 :
> > > > > >
> > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > noop 43 38235
> > > > > > anticipatory 44 8728
> > > > > > deadline 51 19751
> > > > > > cfq 48 427
> > > > > >
> > > > > >
> > > > > > Overall, I wouldn't say it makes much difference.
> > > > >
> > > > > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> > > > >
> > > > > > > raid personalities or dm complicates matters, since it introduces a
> > > > > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > > > > >
> > > > > >
> > > > > > Yes, ideally I should re-run those directly on the disk partitions.
> > > > >
> > > > > At least for comparison.
> > > > >
> > > > > > I am also tempted to create a fio job file which acts like a ssh server
> > > > > > receiving a connexion after it has been pruned from the cache while the
> > > > > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > > > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > > > > behaves correctly in such case. Most of this I/O is coming from page
> > > > > > faults (identified as traps in the trace) probably because the ssh
> > > > > > executable has been thrown out of the cache by
> > > > > >
> > > > > > echo 3 > /proc/sys/vm/drop_caches
> > > > > >
> > > > > > The behavior of an incoming ssh connexion after clearing the cache is
> > > > > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > > > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > > > > reads each between 4k-44k. The results are very interesting for cfq :
> > > > > >
> > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > noop 586 110242
> > > > > > anticipatory 531 26942
> > > > > > deadline 561 108772
> > > > > > cfq 523 28216
> > > > > >
> > > > > > So, basically, ssh being out of the cache can take 28s to answer an
> > > > > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > > > > what I would call an acceptable latency.
> > > > >
> > > > > At some point, you have to stop and consider what is acceptable
> > > > > performance for a given IO pattern. Your ssh test case is purely random
> > > > > IO, and neither CFQ nor AS would do any idling for that. We can make
> > > > > this test case faster for sure, the hard part is making sure that we
> > > > > don't regress on async throughput at the same time.
> > > > >
> > > > > Also remember that with your raid1, it's not entirely reasonable to
> > > > > blaim all performance issues on the IO scheduler as per my previous
> > > > > mail. It would be a lot more fair to view the disk numbers individually.
> > > > >
> > > > > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > > > > to 1 as well?
> > > > >
> > > > > However, I think we should be doing somewhat better at this test case.
> > > >
> > > > Mathieu, does this improve anything for you?
> > > >
> > >
> > > I got this message when running with your patch applied :
> > > cfq: forced dispatching is broken (nr_sorted=4294967275), please report this
> > > (message appeared 10 times in a job run)
> >
> > Woops, missed a sort inc. Updated version below, or just ignore the
> > warning.
> >
> > > Here is the result :
> > >
> > > ssh test done on /dev/sda directly
> > >
> > > queue_depth=31 (default)
> > > /sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
> > > /sys/block/sda/queue/iosched/quantum = 4 (default)
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > cfq (default) 523 6637
> > > cfq (patched) 564 7195
> > >
> > > Pretty much the same.
> >
> > Can you retry with depth=1 as well? There's not much to rip back out, if
> > everything is immediately sent to the device.
> >
>
> echo 1 > /sys/block/sda/queue/iosched/quantum
> echo 1 > /sys/block/sda/queue/iosched/slice_async_rq
> echo 1 > /sys/block/sda/device/queue_depth
>
> ssh test done on /dev/sda directly
>
> oops, something wrong in the new patch ?
>

[...]

Don't waste time looking into this, here is the fixed version (the bug
was a list_del inside a list iteration that was not using the _safe
variant).

Mathieu


Signed-off-by: Mathieu Desnoyers <[email protected]>
---
block/cfq-iosched.c | 38 +++++++++++++++++++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/block/cfq-iosched.c
===================================================================
--- linux-2.6-lttng.orig/block/cfq-iosched.c 2009-01-20 10:31:46.000000000 -0500
+++ linux-2.6-lttng/block/cfq-iosched.c 2009-01-20 17:41:06.000000000 -0500
@@ -1761,6 +1761,36 @@ cfq_update_idle_window(struct cfq_data *
}

/*
+ * Pull dispatched requests from 'cfqq' back into the scheduler
+ */
+static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
+ struct cfq_queue *cfqq)
+{
+ struct request_queue *q = cfqd->queue;
+ struct request *rq, *tmp;
+
+ list_for_each_entry_safe_reverse(rq, tmp, &q->queue_head, queuelist) {
+ if (rq->cmd_flags & REQ_STARTED)
+ break;
+
+ if (RQ_CFQQ(rq) != cfqq)
+ continue;
+
+ /*
+ * Pull off the dispatch list and put it back into the cfqq
+ */
+ list_del(&rq->queuelist);
+ cfqq->dispatched--;
+ if (cfq_cfqq_sync(cfqq))
+ cfqd->sync_flight--;
+
+ cfq_add_rq_rb(rq);
+ q->nr_sorted++;
+ list_add_tail(&rq->queuelist, &cfqq->fifo);
+ }
+}
+
+/*
* Check if new_cfqq should preempt the currently active queue. Return 0 for
* no or if we aren't sure, a 1 will cause a preempt.
*/
@@ -1816,8 +1846,14 @@ cfq_should_preempt(struct cfq_data *cfqd
*/
static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
+ struct cfq_queue *old_cfqq = cfqd->active_queue;
+
cfq_log_cfqq(cfqd, cfqq, "preempt");
- cfq_slice_expired(cfqd, 1);
+
+ if (old_cfqq) {
+ __cfq_slice_expired(cfqd, old_cfqq, 1);
+ cfq_pull_dispatched_requests(cfqd, old_cfqq);
+ }

/*
* Put the new queue at the front of the of the current list,

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 23:10:24

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Ben Gamari ([email protected]) wrote:
> The kernel build finally finished. Unfortunately, it crashes quickly
> after booting with moderate disk IO, bringing down the entire machine.
> For this reason, I haven't been able to complete a fio benchmark.
> Jens, what do you think about this backtrace?
>


Hi Ben,

Try this new patch I just made. It solves the problem for me. Jens
seems to have done a list_del inside a non-safe list iteration.
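
To illustrate the class of bug (a standalone, hypothetical sketch using a
simplified singly-linked list, not the kernel code itself): a plain
iterator reads cur->next only *after* the loop body has run, so unlinking
and freeing cur inside the body makes the next step read freed memory. The
_safe variant caches the next element before the body runs, which is the
point of list_for_each_entry_safe().

#include <stdio.h>
#include <stdlib.h>

struct node {
	struct node *next;
	int val;
};

/* Unsafe walk: advances by reading cur->next after the body, so the
 * body must not unlink or free cur. */
#define for_each(cur, head) \
	for ((cur) = (head); (cur); (cur) = (cur)->next)

/* Safe walk: the next element is cached in tmp before the body runs,
 * so the body may unlink and free cur. */
#define for_each_safe(cur, tmp, head) \
	for ((cur) = (head), (tmp) = (cur) ? (cur)->next : NULL; \
	     (cur); \
	     (cur) = (tmp), (tmp) = (cur) ? (cur)->next : NULL)

int main(void)
{
	struct node *head = NULL, *cur, *tmp, **link;
	int i;

	for (i = 0; i < 5; i++) {	/* builds the list 4 3 2 1 0 */
		struct node *n = malloc(sizeof(*n));
		n->val = i;
		n->next = head;
		head = n;
	}

	/* Remove even values while walking.  Doing the same free() inside
	 * a plain for_each() would dereference freed memory on the next
	 * iteration. */
	link = &head;
	for_each_safe(cur, tmp, head) {
		if (cur->val % 2 == 0) {
			*link = cur->next;	/* unlink */
			free(cur);
		} else {
			link = &cur->next;
		}
	}

	for_each(cur, head)
		printf("%d ", cur->val);	/* prints: 3 1 */
	printf("\n");
	return 0;
}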

Mathieu

Fixes cfq iosched test patch

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
block/cfq-iosched.c | 38 +++++++++++++++++++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/block/cfq-iosched.c
===================================================================
--- linux-2.6-lttng.orig/block/cfq-iosched.c 2009-01-20 10:31:46.000000000 -0500
+++ linux-2.6-lttng/block/cfq-iosched.c 2009-01-20 17:41:06.000000000 -0500
@@ -1761,6 +1761,36 @@ cfq_update_idle_window(struct cfq_data *
}

/*
+ * Pull dispatched requests from 'cfqq' back into the scheduler
+ */
+static void cfq_pull_dispatched_requests(struct cfq_data *cfqd,
+ struct cfq_queue *cfqq)
+{
+ struct request_queue *q = cfqd->queue;
+ struct request *rq, *tmp;
+
+ list_for_each_entry_safe_reverse(rq, tmp, &q->queue_head, queuelist) {
+ if (rq->cmd_flags & REQ_STARTED)
+ break;
+
+ if (RQ_CFQQ(rq) != cfqq)
+ continue;
+
+ /*
+ * Pull off the dispatch list and put it back into the cfqq
+ */
+ list_del(&rq->queuelist);
+ cfqq->dispatched--;
+ if (cfq_cfqq_sync(cfqq))
+ cfqd->sync_flight--;
+
+ cfq_add_rq_rb(rq);
+ q->nr_sorted++;
+ list_add_tail(&rq->queuelist, &cfqq->fifo);
+ }
+}
+
+/*
* Check if new_cfqq should preempt the currently active queue. Return 0 for
* no or if we aren't sure, a 1 will cause a preempt.
*/
@@ -1816,8 +1846,14 @@ cfq_should_preempt(struct cfq_data *cfqd
*/
static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
+ struct cfq_queue *old_cfqq = cfqd->active_queue;
+
cfq_log_cfqq(cfqd, cfqq, "preempt");
- cfq_slice_expired(cfqd, 1);
+
+ if (old_cfqq) {
+ __cfq_slice_expired(cfqd, old_cfqq, 1);
+ cfq_pull_dispatched_requests(cfqd, old_cfqq);
+ }

/*
* Put the new queue at the front of the of the current list,

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-20 23:28:01

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Jens Axboe ([email protected]) wrote:
> On Tue, Jan 20 2009, Jens Axboe wrote:
> > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > * Jens Axboe ([email protected]) wrote:
> > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > standard partition I created on the same machine.
> > > > >
> > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > test").
> > > > >
> > > > > The results for the ls-like jobs are interesting :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 41 10563
> > > > > anticipatory 63 8185
> > > > > deadline 52 33387
> > > > > cfq 43 1420
> > > >
> > >
> > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > not make much difference (also tried with NO_HZ enabled).
> > >
> > > > Do you have queuing enabled on your drives? You can check that in
> > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > schedulers, would be good for comparison.
> > > >
> > >
> > > Here are the tests with a queue_depth of 1 :
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > noop 43 38235
> > > anticipatory 44 8728
> > > deadline 51 19751
> > > cfq 48 427
> > >
> > >
> > > Overall, I wouldn't say it makes much difference.
> >
> > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> >
> > > > raid personalities or dm complicates matters, since it introduces a
> > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > >
> > >
> > > Yes, ideally I should re-run those directly on the disk partitions.
> >
> > At least for comparison.
> >
> > > I am also tempted to create a fio job file which acts like a ssh server
> > > receiving a connexion after it has been pruned from the cache while the
> > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > behaves correctly in such case. Most of this I/O is coming from page
> > > faults (identified as traps in the trace) probably because the ssh
> > > executable has been thrown out of the cache by
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > >
> > > The behavior of an incoming ssh connexion after clearing the cache is
> > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > reads each between 4k-44k. The results are very interesting for cfq :
> > >
> > > I/O scheduler runt-min (msec) runt-max (msec)
> > > noop 586 110242
> > > anticipatory 531 26942
> > > deadline 561 108772
> > > cfq 523 28216
> > >
> > > So, basically, ssh being out of the cache can take 28s to answer an
> > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > what I would call an acceptable latency.
> >
> > At some point, you have to stop and consider what is acceptable
> > performance for a given IO pattern. Your ssh test case is purely random
> > IO, and neither CFQ nor AS would do any idling for that. We can make
> > this test case faster for sure, the hard part is making sure that we
> > don't regress on async throughput at the same time.
> >
> > Also remember that with your raid1, it's not entirely reasonable to
> > blaim all performance issues on the IO scheduler as per my previous
> > mail. It would be a lot more fair to view the disk numbers individually.
> >
> > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > to 1 as well?
> >
> > However, I think we should be doing somewhat better at this test case.
>
> Mathieu, does this improve anything for you?
>

So, I ran the tests with my corrected patch, and the results are very
good !

"incoming ssh connexion" test

"config 2.6.28 cfq"
Linux 2.6.28
/sys/block/sd{a,b}/device/queue_depth = 31 (default)
/sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)

"config 2.6.28.1-patch1"
Linux 2.6.28.1
Corrected cfq patch applied
echo 1 > /sys/block/sd{a,b}/device/queue_depth
echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum

On /dev/sda :

I/O scheduler runt-min (msec) runt-max (msec)
cfq (2.6.28 cfq) 523 6637
cfq (2.6.28.1-patch1) 579 2082

On raid1 :

I/O scheduler runt-min (msec) runt-max (msec)
cfq (2.6.28 cfq) 523 28216
cfq (2.6.28.1-patch1) 517 3086

It looks like we are getting somewhere :) Are there any specific
queue_depth, slice_async_rq, quantum variations you would like to be
tested ?

For reference, I attach my ssh-like job file (again) to this mail.

Mathieu


[job1]
rw=write
size=10240m
direct=0
blocksize=1024k

[global]
rw=randread
size=2048k
filesize=30m
direct=0
bsrange=4k-44k

[file1]
startdelay=0

[file2]
startdelay=4

[file3]
startdelay=8

[file4]
startdelay=12

[file5]
startdelay=16

[file6]
startdelay=20

[file7]
startdelay=24

[file8]
startdelay=28

[file9]
startdelay=32

[file10]
startdelay=36

[file11]
startdelay=40

[file12]
startdelay=44

[file13]
startdelay=48

[file14]
startdelay=52

[file15]
startdelay=56

[file16]
startdelay=60

[file17]
startdelay=64

[file18]
startdelay=68

[file19]
startdelay=72

[file20]
startdelay=76

[file21]
startdelay=80

[file22]
startdelay=84

[file23]
startdelay=88

[file24]
startdelay=92

[file25]
startdelay=96

[file26]
startdelay=100

[file27]
startdelay=104

[file28]
startdelay=108

[file29]
startdelay=112

[file30]
startdelay=116

[file31]
startdelay=120

[file32]
startdelay=124

[file33]
startdelay=128

[file34]
startdelay=132

[file35]
startdelay=134

[file36]
startdelay=138

[file37]
startdelay=142

[file38]
startdelay=146

[file39]
startdelay=150

[file40]
startdelay=200

[file41]
startdelay=260

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-21 00:25:29

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Mathieu Desnoyers ([email protected]) wrote:
> * Jens Axboe ([email protected]) wrote:
> > On Tue, Jan 20 2009, Jens Axboe wrote:
> > > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > > * Jens Axboe ([email protected]) wrote:
> > > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > > standard partition I created on the same machine.
> > > > > >
> > > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > > test").
> > > > > >
> > > > > > The results for the ls-like jobs are interesting :
> > > > > >
> > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > noop 41 10563
> > > > > > anticipatory 63 8185
> > > > > > deadline 52 33387
> > > > > > cfq 43 1420
> > > > >
> > > >
> > > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > > not make much difference (also tried with NO_HZ enabled).
> > > >
> > > > > Do you have queuing enabled on your drives? You can check that in
> > > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > > schedulers, would be good for comparison.
> > > > >
> > > >
> > > > Here are the tests with a queue_depth of 1 :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 43 38235
> > > > anticipatory 44 8728
> > > > deadline 51 19751
> > > > cfq 48 427
> > > >
> > > >
> > > > Overall, I wouldn't say it makes much difference.
> > >
> > > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> > >
> > > > > raid personalities or dm complicates matters, since it introduces a
> > > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > > >
> > > >
> > > > Yes, ideally I should re-run those directly on the disk partitions.
> > >
> > > At least for comparison.
> > >
> > > > I am also tempted to create a fio job file which acts like a ssh server
> > > > receiving a connexion after it has been pruned from the cache while the
> > > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > > behaves correctly in such case. Most of this I/O is coming from page
> > > > faults (identified as traps in the trace) probably because the ssh
> > > > executable has been thrown out of the cache by
> > > >
> > > > echo 3 > /proc/sys/vm/drop_caches
> > > >
> > > > The behavior of an incoming ssh connexion after clearing the cache is
> > > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > > reads each between 4k-44k. The results are very interesting for cfq :
> > > >
> > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > noop 586 110242
> > > > anticipatory 531 26942
> > > > deadline 561 108772
> > > > cfq 523 28216
> > > >
> > > > So, basically, ssh being out of the cache can take 28s to answer an
> > > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > > what I would call an acceptable latency.
> > >
> > > At some point, you have to stop and consider what is acceptable
> > > performance for a given IO pattern. Your ssh test case is purely random
> > > IO, and neither CFQ nor AS would do any idling for that. We can make
> > > this test case faster for sure, the hard part is making sure that we
> > > don't regress on async throughput at the same time.
> > >
> > > Also remember that with your raid1, it's not entirely reasonable to
> > > blaim all performance issues on the IO scheduler as per my previous
> > > mail. It would be a lot more fair to view the disk numbers individually.
> > >
> > > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > > to 1 as well?
> > >
> > > However, I think we should be doing somewhat better at this test case.
> >
> > Mathieu, does this improve anything for you?
> >
>
> So, I ran the tests with my corrected patch, and the results are very
> good !
>
> "incoming ssh connexion" test
>
> "config 2.6.28 cfq"
> Linux 2.6.28
> /sys/block/sd{a,b}/device/queue_depth = 31 (default)
> /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
>
> "config 2.6.28.1-patch1"
> Linux 2.6.28.1
> Corrected cfq patch applied
> echo 1 > /sys/block/sd{a,b}/device/queue_depth
> echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
> echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum
>
> On /dev/sda :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (2.6.28 cfq) 523 6637
> cfq (2.6.28.1-patch1) 579 2082
>
> On raid1 :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (2.6.28 cfq) 523 28216

As a side-note : I'd like to have my results confirmed by others. I just
found out that my 2 Seagate drives (ST3500320AS) are in the "defect" list
of drives that can stall for about 30s when doing "video streaming".
(http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9126280&taxonomyId=19&intsrc=kc_top)
(http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931)

Therefore, I would not take any decision based on such known bad
firmware. But the last results we've got are definitely interesting.

I'll upgrade my firmware as soon as Seagate puts it back online so I can
re-run more tests.

Mathieu

> cfq (2.6.28.1-patch1) 517 3086
>
> It looks like we are getting somewhere :) Are there any specific
> queue_depth, slice_async_rq, quantum variations you would like to be
> tested ?
>
> For reference, I attach my ssh-like job file (again) to this mail.
>
> Mathieu
>
>
> [job1]
> rw=write
> size=10240m
> direct=0
> blocksize=1024k
>
> [global]
> rw=randread
> size=2048k
> filesize=30m
> direct=0
> bsrange=4k-44k
>
> [file1]
> startdelay=0
>
> [file2]
> startdelay=4
>
> [file3]
> startdelay=8
>
> [file4]
> startdelay=12
>
> [file5]
> startdelay=16
>
> [file6]
> startdelay=20
>
> [file7]
> startdelay=24
>
> [file8]
> startdelay=28
>
> [file9]
> startdelay=32
>
> [file10]
> startdelay=36
>
> [file11]
> startdelay=40
>
> [file12]
> startdelay=44
>
> [file13]
> startdelay=48
>
> [file14]
> startdelay=52
>
> [file15]
> startdelay=56
>
> [file16]
> startdelay=60
>
> [file17]
> startdelay=64
>
> [file18]
> startdelay=68
>
> [file19]
> startdelay=72
>
> [file20]
> startdelay=76
>
> [file21]
> startdelay=80
>
> [file22]
> startdelay=84
>
> [file23]
> startdelay=88
>
> [file24]
> startdelay=92
>
> [file25]
> startdelay=96
>
> [file26]
> startdelay=100
>
> [file27]
> startdelay=104
>
> [file28]
> startdelay=108
>
> [file29]
> startdelay=112
>
> [file30]
> startdelay=116
>
> [file31]
> startdelay=120
>
> [file32]
> startdelay=124
>
> [file33]
> startdelay=128
>
> [file34]
> startdelay=132
>
> [file35]
> startdelay=134
>
> [file36]
> startdelay=138
>
> [file37]
> startdelay=142
>
> [file38]
> startdelay=146
>
> [file39]
> startdelay=150
>
> [file40]
> startdelay=200
>
> [file41]
> startdelay=260
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-21 04:38:31

by Ben Gamari

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

On Tue, Jan 20, 2009 at 7:25 PM, Mathieu Desnoyers
<[email protected]> wrote:
> * Mathieu Desnoyers ([email protected]) wrote:
>
> As a side-note : I'd like to have my results confirmed by others.

Well, I think the (fixed) patch did help to some degree (I haven't
done fio benchmarks to compare against yet). Unfortunately, the I/O
wait time problem still remains. I have been waiting 3 minutes now for
evolution to start with 88% I/O wait time yet no visible signs of
progress. I've confirmed I'm using the CFQ scheduler, so that's not
the problem.

Also, Jens, I'd just like to point out that the problem is
reproducible across all schedulers. Does your patch seek to tackle a
problem specific to the CFQ scheduler, leaving the I/O wait issue for
later? Just wondering.

I'll post some benchmarks numbers once I have them. Thanks,

- Ben

2009-01-21 05:00:08

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* Ben Gamari ([email protected]) wrote:
> On Tue, Jan 20, 2009 at 7:25 PM, Mathieu Desnoyers
> <[email protected]> wrote:
> > * Mathieu Desnoyers ([email protected]) wrote:
> >
> > As a side-note : I'd like to have my results confirmed by others.
>
> Well, I think the (fixed) patch did help to some degree (I haven't
> done fio benchmarks to compare against yet). Unfortunately, the I/O
> wait time problem still remains. I have been waiting 3 minutes now for
> evolution to start with 88% I/O wait time yet no visible signs of
> progress. I've confirmed I'm using the CFQ scheduler, so that's not
> the problem.
>

Did you also

echo 1 > /sys/block/sd{a,b}/device/queue_depth
echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum

(replacing sd{a,b} with your actual drives) ?

Those settings seem to have been part of the factors that helped (along
with the patch).

And hopefully you don't have a recent Seagate hard drive like me ? :-)

So your test case is :
- start a large dd with 1M block size
- time evolution

?
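
A minimal sketch of that kind of test (placeholder paths, not necessarily
Ben's exact procedure) :

echo 3 > /proc/sys/vm/drop_caches                    # start from a cold cache
dd if=/dev/zero of=/tmp/bigfile bs=1M count=10240 &  # streaming write in the background
time evolution   # rough timing; interactively, note how long the window takes to appear
wait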

Mathieu

> Also, Jens, I'd just like to point out that the problem is
> reproducible across all schedulers. Does your patch seek to tackle a
> problem specific to the CFQ scheduler, leaving the I/O wait issue for
> later? Just wondering.
>
> I'll post some benchmarks numbers once I have them. Thanks,
>
> - Ben
>
> _______________________________________________
> ltt-dev mailing list
> [email protected]
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-21 06:18:17

by Ben Gamari

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

On Tue, 2009-01-20 at 23:54 -0500, Mathieu Desnoyers wrote:
> * Ben Gamari ([email protected]) wrote:
> > On Tue, Jan 20, 2009 at 7:25 PM, Mathieu Desnoyers
> > <[email protected]> wrote:
> > > * Mathieu Desnoyers ([email protected]) wrote:
> > >
> > > As a side-note : I'd like to have my results confirmed by others.
> >
> > Well, I think the (fixed) patch did help to some degree (I haven't
> > done fio benchmarks to compare against yet). Unfortunately, the I/O
> > wait time problem still remains. I have been waiting 3 minutes now for
> > evolution to start with 88% I/O wait time yet no visible signs of
> > progress. I've confirmed I'm using the CFQ scheduler, so that's not
> > the problem.
> >
>
> Did you also
>
> echo 1 > /sys/block/sd{a,b}/device/queue_depth
I have been using this in some of my measurements (this is recorded, of
course).

> echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
> echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum
I haven't been doing this, although I will collect a data set with these
parameters set so I can compare their effect against the default
configuration.

>
> (replacing sd{a,b} with your actual drives) ?
>
> It seems to have been part of the factors that helped (along with the
> patch).
>
> And hopefully you don't have a recent Seagate hard drive like me ? :-)
Thankfully, no.

>
> So you test case is :
> - start a large dd with 1M block size
> - time evolution
>
I've been using evolution to get a rough idea of the performance of the
configurations but not as a benchmark per se. I have some pretty
good-sized maildirs, so launching evolution for the first time can be
quite a task, IO-wise. Also, switching between folders used to be quite
time consuming. It seems like the patch did help a bit on this front
though.

For a quantitative benchmark I've been using the fio job that you posted
earlier. I've been collecting results and should have a pretty good data
set soon.

I'll send out a compilation of all the data I've collected as soon as
I've finished.

- Ben

2009-01-22 02:35:55

by Ben Gamari

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

I'm not sure if this will help anyone, but I just completed another set
of benchmarks using Jens' patch and a variety of device parameters. I
figured it might at least help quantify the differences between the
device parameters. Let me know if
there's any other benchmarking or testing that I can do. Thanks,

- Ben


mint maxt
==========================================================
queue_depth=1, slice_async_rq=1, quantum=1, patched
anticipatory 25 msec 4410 msec
cfq 27 msec 1466 msec
deadline 36 msec 10735 msec
noop 48 msec 37439 msec
==========================================================
queue_depth=1, slice_async_rq=1, quantum=4, patched
anticipatory 38 msec 3579 msec
cfq 35 msec 822 msec
deadline 37 msec 10072 msec
noop 32 msec 45535 msec
==========================================================
queue_depth=1, slice_async_rq=2, quantum=1, patched
anticipatory 33 msec 4480 msec
cfq 28 msec 353 msec
deadline 30 msec 6738 msec
noop 36 msec 39691 msec
==========================================================
queue_depth=1, slice_async_rq=2, quantum=4, patched
anticipatory 40 msec 4498 msec
cfq 35 msec 1395 msec
deadline 41 msec 6877 msec
noop 38 msec 46410 msec
==========================================================
queue_depth=31, slice_async_rq=1, quantum=1, patched
anticipatory 31 msec 6011 msec
cfq 36 msec 4575 msec
deadline 41 msec 18599 msec
noop 38 msec 46347 msec
==========================================================
queue_depth=31, slice_async_rq=2, quantum=1, patched
anticipatory 30 msec 9985 msec
cfq 33 msec 4200 msec
deadline 38 msec 22285 msec
noop 25 msec 40245 msec
==========================================================
queue_depth=31, slice_async_rq=2, quantum=4, patched
anticipatory 30 msec 12197 msec
cfq 30 msec 3457 msec
deadline 35 msec 18969 msec
noop 34 msec 42803 msec



On Tue, 2009-01-20 at 15:22 -0500, Ben Gamari wrote:
> On Tue, Jan 20, 2009 at 2:37 AM, Jens Axboe <[email protected]> wrote:
> > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> >> * Jens Axboe ([email protected]) wrote:
> >> Yes, ideally I should re-run those directly on the disk partitions.
> >
> > At least for comparison.
> >
>
> I just completed my own set of benchmarks using the fio job file
> Mathieu provided. This was on a 2.5 inch 7200 RPM SATA partition
> formatted as ext3. As you can see, I tested all of the available
> schedulers with both queuing enabled and disabled. I'll test Jens'
> patch soon. Would a blktrace of the fio run help? Let me know if
> there's any other benchmarking or profiling that could be done.
> Thanks,
>
> - Ben
>
>
> mint maxt
> ==========================================================
> queue_depth=31:
> anticipatory 35 msec 11036 msec
> cfq 37 msec 3350 msec
> deadline 36 msec 18144 msec
> noop 39 msec 41512 msec
>
> ==========================================================
> queue_depth=1:
> anticipatory 45 msec 9561 msec
> cfq 28 msec 3974 msec
> deadline 47 msec 16802 msec
> noop 35 msec 38173 msec

2009-01-22 23:00:20

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

* Mathieu Desnoyers ([email protected]) wrote:
> * Mathieu Desnoyers ([email protected]) wrote:
> > * Jens Axboe ([email protected]) wrote:
> > > On Tue, Jan 20 2009, Jens Axboe wrote:
> > > > On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > > > > * Jens Axboe ([email protected]) wrote:
> > > > > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > > > > to create a fio job file. The said behavior is appended below as "Part
> > > > > > > 1 - ls I/O behavior". Note that the original "ls" test case was done
> > > > > > > with the anticipatory I/O scheduler, which was active by default on my
> > > > > > > debian system with custom vanilla 2.6.28 kernel. Also note that I am
> > > > > > > running this on a raid-1, but have experienced the same problem on a
> > > > > > > standard partition I created on the same machine.
> > > > > > >
> > > > > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > > > > consists of one dd-like job and many small jobs reading as many data as
> > > > > > > ls did. I used the small test script to batch run this ("Part 3 - batch
> > > > > > > test").
> > > > > > >
> > > > > > > The results for the ls-like jobs are interesting :
> > > > > > >
> > > > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > > > noop 41 10563
> > > > > > > anticipatory 63 8185
> > > > > > > deadline 52 33387
> > > > > > > cfq 43 1420
> > > > > >
> > > > >
> > > > > Extra note : I have a HZ=250 on my system. Changing to 100 or 1000 did
> > > > > not make much difference (also tried with NO_HZ enabled).
> > > > >
> > > > > > Do you have queuing enabled on your drives? You can check that in
> > > > > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > > > > schedulers, would be good for comparison.
> > > > > >
> > > > >
> > > > > Here are the tests with a queue_depth of 1 :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 43 38235
> > > > > anticipatory 44 8728
> > > > > deadline 51 19751
> > > > > cfq 48 427
> > > > >
> > > > >
> > > > > Overall, I wouldn't say it makes much difference.
> > > >
> > > > 0,5 seconds vs 1,5 seconds isn't much of a difference?
> > > >
> > > > > > raid personalities or dm complicates matters, since it introduces a
> > > > > > disconnect between 'ls' and the io scheduler at the bottom...
> > > > > >
> > > > >
> > > > > Yes, ideally I should re-run those directly on the disk partitions.
> > > >
> > > > At least for comparison.
> > > >
> > > > > I am also tempted to create a fio job file which acts like a ssh server
> > > > > receiving a connexion after it has been pruned from the cache while the
> > > > > system if doing heavy I/O. "ssh", in this case, seems to be doing much
> > > > > more I/O than a simple "ls", and I think we might want to see if cfq
> > > > > behaves correctly in such case. Most of this I/O is coming from page
> > > > > faults (identified as traps in the trace) probably because the ssh
> > > > > executable has been thrown out of the cache by
> > > > >
> > > > > echo 3 > /proc/sys/vm/drop_caches
> > > > >
> > > > > The behavior of an incoming ssh connexion after clearing the cache is
> > > > > appended below (Part 1 - LTTng trace for incoming ssh connexion). The
> > > > > job file created (Part 2) reads, for each job, a 2MB file with random
> > > > > reads each between 4k-44k. The results are very interesting for cfq :
> > > > >
> > > > > I/O scheduler runt-min (msec) runt-max (msec)
> > > > > noop 586 110242
> > > > > anticipatory 531 26942
> > > > > deadline 561 108772
> > > > > cfq 523 28216
> > > > >
> > > > > So, basically, ssh being out of the cache can take 28s to answer an
> > > > > incoming ssh connexion even with the cfq scheduler. This is not exactly
> > > > > what I would call an acceptable latency.
> > > >
> > > > At some point, you have to stop and consider what is acceptable
> > > > performance for a given IO pattern. Your ssh test case is purely random
> > > > IO, and neither CFQ nor AS would do any idling for that. We can make
> > > > this test case faster for sure, the hard part is making sure that we
> > > > don't regress on async throughput at the same time.
> > > >
> > > > Also remember that with your raid1, it's not entirely reasonable to
> > > > blaim all performance issues on the IO scheduler as per my previous
> > > > mail. It would be a lot more fair to view the disk numbers individually.
> > > >
> > > > Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> > > > to 1 as well?
> > > >
> > > > However, I think we should be doing somewhat better at this test case.
> > >
> > > Mathieu, does this improve anything for you?
> > >
> >
> > So, I ran the tests with my corrected patch, and the results are very
> > good !
> >
> > "incoming ssh connexion" test
> >
> > "config 2.6.28 cfq"
> > Linux 2.6.28
> > /sys/block/sd{a,b}/device/queue_depth = 31 (default)
> > /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> > /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
> >
> > "config 2.6.28.1-patch1"
> > Linux 2.6.28.1
> > Corrected cfq patch applied
> > echo 1 > /sys/block/sd{a,b}/device/queue_depth
> > echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
> > echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum
> >
> > On /dev/sda :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (2.6.28 cfq) 523 6637
> > cfq (2.6.28.1-patch1) 579 2082
> >
> > On raid1 :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (2.6.28 cfq) 523 28216
>
> As a side-note : I'd like to have my results confirmed by others. I just
> found out that my 2 Seagate drives (ST3500320AS) are in the "defect" list
> of drives that can stall for about 30s when doing "video streaming".
> (http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9126280&taxonomyId=19&intsrc=kc_top)
> (http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931)
>
> Therefore, I would not take any decision based on such known bad
> firmware. But the last results we've got are definitely interesting.
>
> I'll upgrade my firmware as soon as Seagate puts it back online so I can
> re-run more tests.
>

After firmware upgrade :

"incoming ssh connexion" test
(ran the job file 2-3 times to make sure the runt-max results are representative)

"config 2.6.28.1 dfl"
Linux 2.6.28.1
/sys/block/sd{a,b}/device/queue_depth = 31 (default)
/sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)

"config 2.6.28.1 1"
Linux 2.6.28.1
echo 1 > /sys/block/sd{a,b}/device/queue_depth
echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum

"config 2.6.28.1-patch dfl"
Linux 2.6.28.1
Corrected cfq patch applied
/sys/block/sd{a,b}/device/queue_depth = 31 (default)
/sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)

"config 2.6.28.1-patch 1"
Linux 2.6.28.1
Corrected cfq patch applied
echo 1 > /sys/block/sd{a,b}/device/queue_depth
echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum

On /dev/sda :

I/O scheduler runt-min (msec) runt-avg (msec) runt-max (msec)
cfq (2.6.28.1 dfl) 560 4134.04 12125
cfq (2.6.28.1-patch dfl) 508 4329.75 9625
cfq (2.6.28.1 1) 535 1068.46 2622
cfq (2.6.28.1-patch 1) 511 2239.87 4117

On /dev/md1 (raid1) :

I/O scheduler runt-min (msec) runt-avg (msec) runt-max (msec)
cfq (2.6.28.1 dfl) 507 4053.19 26265
cfq (2.6.28.1-patch dfl) 532 3991.75 18567
cfq (2.6.28.1 1) 510 1900.14 27410
cfq (2.6.28.1-patch 1) 539 2112.60 22859


A fio output taken from the raid1 cfq (2.6.28.1-patch 1) run looks like
the following. It's a bit strange that readers started earlier only seem
to complete _after_ more recently started readers have.

Excerpt (full output appended after email) :

Jobs: 2 (f=1): [W________________rPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrrrPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 560/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1512/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrr_rPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 144/
Jobs: 5 (f=1): [W________________rrrr__PPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1932/
Jobs: 5 (f=1): [W________________rrrr__PPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr__IPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrr__rPPPPPPPPPPPPPPPPPP] [0.0% done] [ 608/
Jobs: 6 (f=2): [W________________rrrr__rPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1052/
Jobs: 5 (f=1): [W________________rrrr___PPPPPPPPPPPPPPPPPP] [0.0% done] [ 388/
Jobs: 5 (f=1): [W________________rrrr___IPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr____PPPPPPPPPPPPPPPPP] [0.0% done] [ 2076/
Jobs: 5 (f=5): [W________________rrrr____PPPPPPPPPPPPPPPPP] [49.0% done] [ 2936
Jobs: 2 (f=2): [W_________________r______PPPPPPPPPPPPPPPPP] [50.8% done] [ 5192
Jobs: 2 (f=2): [W________________________rPPPPPPPPPPPPPPPP] [16.0% done] [ 104

Given the numbers I get, I see that the runt-max numbers do not reach
their worst value on every job file run, which makes them difficult to
compare (since you never know whether you've hit the worst case yet).
This could be related to raid1, because I've seen this both with and
without your patch applied, and it only seems to appear on raid1 runs.

However, the patch you sent does not seem to improve the behavior. It
actually makes the average and max latency worse in almost every case.
Changing the queue_depth, slice_async_rq and quantum parameters clearly
helps reduce both the average and the max latency.

Mathieu


Full output :

Running cfq
Starting 42 processes

Jobs: 1 (f=1): [W_PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [5.7% done] [ 0/
Jobs: 1 (f=1): [W_PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [7.7% done] [ 0/
Jobs: 1 (f=1): [W_IPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [9.6% done] [ 0/
Jobs: 2 (f=2): [W_rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [11.1% done] [ 979
Jobs: 1 (f=1): [W__PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [10.9% done] [ 1098
Jobs: 1 (f=1): [W__PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [12.9% done] [ 0
Jobs: 2 (f=2): [W__rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [112.5% done] [
Jobs: 2 (f=2): [W__rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [16.1% done] [ 1160
Jobs: 2 (f=1): [W__rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [15.9% done] [ 888
Jobs: 2 (f=1): [W__rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [16.0% done] [ 0
Jobs: 3 (f=2): [W__rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=2): [W__rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=2): [W__rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=2): [W__rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=3): [W___rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [16.7% done] [ 660
Jobs: 2 (f=2): [W____rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [18.0% done] [ 2064
Jobs: 1 (f=1): [W_____PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [19.4% done] [ 1392
Jobs: 1 (f=1): [W_____PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [20.6% done] [ 0
Jobs: 2 (f=2): [W_____rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [105.0% done] [
Jobs: 2 (f=2): [W_____rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [110.0% done] [
Jobs: 2 (f=2): [W_____rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [115.0% done] [
Jobs: 2 (f=2): [W_____rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [120.0% done] [
Jobs: 3 (f=3): [W_____rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [104.2% done] [
Jobs: 3 (f=3): [W_____rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [108.3% done] [
Jobs: 3 (f=3): [W_____rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [112.5% done] [
Jobs: 3 (f=3): [W_____rrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [116.7% done] [
Jobs: 4 (f=4): [W_____rrrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [103.6% done] [
Jobs: 4 (f=4): [W_____rrrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [107.1% done] [
Jobs: 4 (f=4): [W_____rrrPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [9.8% done] [ 280/
Jobs: 3 (f=3): [W_____r_rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.0% done] [ 3624
Jobs: 2 (f=2): [W________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.0% done] [ 2744
Jobs: 1 (f=1): [W_________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.0% done] [ 1620
Jobs: 1 (f=1): [W_________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.0% done] [ 0
Jobs: 1 (f=1): [W_________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.3% done] [ 0
Jobs: 2 (f=2): [W_________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.9% done] [ 116
Jobs: 1 (f=1): [W__________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [34.9% done] [ 1944
Jobs: 1 (f=1): [W__________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [35.8% done] [ 0
Jobs: 1 (f=1): [W__________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [36.4% done] [ 0
Jobs: 2 (f=2): [W__________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [36.9% done] [ 228
Jobs: 2 (f=2): [W__________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [37.2% done] [ 1420
Jobs: 1 (f=1): [W___________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [37.7% done] [ 400
Jobs: 1 (f=1): [W___________PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [39.1% done] [ 0
Jobs: 2 (f=2): [W___________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [39.3% done] [ 268
Jobs: 2 (f=2): [W___________rPPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [39.5% done] [ 944
Jobs: 1 (f=1): [W____________PPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [40.3% done] [ 848
Jobs: 1 (f=1): [W____________IPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [40.5% done] [ 0
Jobs: 2 (f=2): [W____________rPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [41.0% done] [ 400
Jobs: 2 (f=2): [W____________rPPPPPPPPPPPPPPPPPPPPPPPPPPPP] [41.1% done] [ 1208
Jobs: 1 (f=1): [W_____________PPPPPPPPPPPPPPPPPPPPPPPPPPPP] [41.9% done] [ 456
Jobs: 2 (f=2): [W_____________rPPPPPPPPPPPPPPPPPPPPPPPPPPP] [101.9% done] [
Jobs: 2 (f=2): [W_____________rPPPPPPPPPPPPPPPPPPPPPPPPPPP] [42.9% done] [ 380
Jobs: 2 (f=2): [W_____________rPPPPPPPPPPPPPPPPPPPPPPPPPPP] [43.3% done] [ 760
Jobs: 1 (f=1): [W______________PPPPPPPPPPPPPPPPPPPPPPPPPPP] [43.8% done] [ 912
Jobs: 2 (f=2): [W______________rPPPPPPPPPPPPPPPPPPPPPPPPPP] [44.2% done] [ 44
Jobs: 2 (f=2): [W______________rPPPPPPPPPPPPPPPPPPPPPPPPPP] [44.6% done] [ 1020
Jobs: 1 (f=1): [W_______________PPPPPPPPPPPPPPPPPPPPPPPPPP] [45.4% done] [ 1008
Jobs: 1 (f=1): [W_______________PPPPPPPPPPPPPPPPPPPPPPPPPP] [46.2% done] [ 0
Jobs: 2 (f=2): [W_______________rPPPPPPPPPPPPPPPPPPPPPPPPP] [46.6% done] [ 52
Jobs: 2 (f=2): [W_______________rPPPPPPPPPPPPPPPPPPPPPPPPP] [47.0% done] [ 1248
Jobs: 1 (f=1): [W________________PPPPPPPPPPPPPPPPPPPPPPPPP] [47.4% done] [ 760
Jobs: 1 (f=1): [W________________PPPPPPPPPPPPPPPPPPPPPPPPP] [48.1% done] [ 0
Jobs: 2 (f=1): [W________________rPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W________________rPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W________________rPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W________________rPPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=1): [W________________rrPPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 4 (f=1): [W________________rrrPPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrrPPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrrrPPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 560/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1512/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr_PPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrr_rPPPPPPPPPPPPPPPPPPP] [0.0% done] [ 144/
Jobs: 5 (f=1): [W________________rrrr__PPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1932/
Jobs: 5 (f=1): [W________________rrrr__PPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr__IPPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 6 (f=2): [W________________rrrr__rPPPPPPPPPPPPPPPPPP] [0.0% done] [ 608/
Jobs: 6 (f=2): [W________________rrrr__rPPPPPPPPPPPPPPPPPP] [0.0% done] [ 1052/
Jobs: 5 (f=1): [W________________rrrr___PPPPPPPPPPPPPPPPPP] [0.0% done] [ 388/
Jobs: 5 (f=1): [W________________rrrr___IPPPPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 5 (f=1): [W________________rrrr____PPPPPPPPPPPPPPPPP] [0.0% done] [ 2076/
Jobs: 5 (f=5): [W________________rrrr____PPPPPPPPPPPPPPPPP] [49.0% done] [ 2936
Jobs: 2 (f=2): [W_________________r______PPPPPPPPPPPPPPPPP] [50.8% done] [ 5192
Jobs: 2 (f=2): [W________________________rPPPPPPPPPPPPPPPP] [16.0% done] [ 104
Jobs: 2 (f=2): [W________________________rPPPPPPPPPPPPPPPP] [54.7% done] [ 1052
Jobs: 1 (f=1): [W_________________________PPPPPPPPPPPPPPPP] [56.6% done] [ 1016
Jobs: 1 (f=1): [W_________________________PPPPPPPPPPPPPPPP] [58.1% done] [ 0
Jobs: 2 (f=2): [W_________________________rPPPPPPPPPPPPPPP] [59.8% done] [ 52
Jobs: 2 (f=2): [W_________________________rPPPPPPPPPPPPPPP] [61.1% done] [ 1372
Jobs: 1 (f=1): [W__________________________PPPPPPPPPPPPPPP] [63.2% done] [ 652
Jobs: 1 (f=1): [W__________________________PPPPPPPPPPPPPPP] [65.0% done] [ 0
Jobs: 2 (f=1): [W__________________________rPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W__________________________rPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W__________________________rPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 2 (f=1): [W__________________________rPPPPPPPPPPPPPP] [0.0% done] [ 0/
Jobs: 3 (f=3): [W__________________________rrPPPPPPPPPPPPP] [67.3% done] [ 1224
Jobs: 2 (f=2): [W___________________________rPPPPPPPPPPPPP] [68.8% done] [ 2124
Jobs: 1 (f=1): [W____________________________PPPPPPPPPPPPP] [69.8% done] [ 780
Jobs: 1 (f=1): [W____________________________PPPPPPPPPPPPP] [71.3% done] [ 0
Jobs: 2 (f=2): [W____________________________rPPPPPPPPPPPP] [72.9% done] [ 84
Jobs: 2 (f=2): [W____________________________rPPPPPPPPPPPP] [73.1% done] [ 1312
Jobs: 1 (f=1): [W_____________________________PPPPPPPPPPPP] [73.2% done] [ 688
Jobs: 1 (f=1): [W_____________________________PPPPPPPPPPPP] [73.9% done] [ 0
Jobs: 2 (f=2): [W_____________________________rPPPPPPPPPPP] [73.6% done] [ 476
Jobs: 1 (f=1): [W_____________________________EPPPPPPPPPPP] [73.8% done] [ 1608
Jobs: 1 (f=1): [W______________________________PPPPPPPPPPP] [73.9% done] [ 0
Jobs: 1 (f=1): [W______________________________PPPPPPPPPPP] [74.1% done] [ 0
Jobs: 2 (f=2): [W______________________________rPPPPPPPPPP] [74.7% done] [ 228
Jobs: 2 (f=2): [W______________________________rPPPPPPPPPP] [74.8% done] [ 1564
Jobs: 1 (f=1): [W_______________________________PPPPPPPPPP] [75.5% done] [ 264
Jobs: 1 (f=1): [W_______________________________PPPPPPPPPP] [76.1% done] [ 0
Jobs: 2 (f=2): [W_______________________________rPPPPPPPPP] [76.2% done] [ 516
Jobs: 1 (f=1): [W________________________________PPPPPPPPP] [75.9% done] [ 1532
Jobs: 1 (f=1): [W________________________________PPPPPPPPP] [76.0% done] [ 0
Jobs: 1 (f=1): [W________________________________PPPPPPPPP] [76.2% done] [ 0
Jobs: 2 (f=2): [W________________________________rPPPPPPPP] [76.5% done] [ 768
Jobs: 1 (f=1): [W_________________________________PPPPPPPP] [76.6% done] [ 1316
Jobs: 1 (f=1): [W_________________________________PPPPPPPP] [76.7% done] [ 0
Jobs: 1 (f=1): [W_________________________________IPPPPPPP] [77.8% done] [ 0
Jobs: 2 (f=2): [W_________________________________rPPPPPPP] [77.9% done] [ 604
Jobs: 1 (f=1): [W__________________________________IPPPPPP] [78.0% done] [ 1444
Jobs: 2 (f=2): [W__________________________________rPPPPPP] [78.2% done] [ 1145
Jobs: 1 (f=1): [W___________________________________PPPPPP] [78.3% done] [ 932
Jobs: 1 (f=1): [W___________________________________PPPPPP] [79.3% done] [ 0
Jobs: 2 (f=2): [W___________________________________rPPPPP] [100.7% done] [
Jobs: 2 (f=2): [W___________________________________rPPPPP] [80.0% done] [ 1012
Jobs: 1 (f=1): [W____________________________________PPPPP] [80.6% done] [ 1072
Jobs: 1 (f=1): [W____________________________________PPPPP] [81.6% done] [ 0
Jobs: 2 (f=2): [W____________________________________rPPPP] [72.2% done] [ 36
Jobs: 2 (f=2): [W____________________________________rPPPP] [82.3% done] [ 956
Jobs: 1 (f=1): [W_____________________________________PPPP] [82.9% done] [ 1076
Jobs: 1 (f=1): [W_____________________________________PPPP] [83.4% done] [ 0
Jobs: 2 (f=2): [W_____________________________________rPPP] [78.2% done] [ 48
Jobs: 2 (f=2): [W_____________________________________rPPP] [84.6% done] [ 1060
Jobs: 1 (f=1): [W______________________________________PPP] [85.1% done] [ 956
Jobs: 1 (f=1): [W______________________________________PPP] [85.7% done] [ 0
Jobs: 2 (f=2): [W______________________________________rPP] [86.3% done] [ 96
Jobs: 2 (f=2): [W______________________________________rPP] [86.4% done] [ 756
Jobs: 1 (f=1): [W_______________________________________PP] [86.9% done] [ 1212
Jobs: 1 (f=1): [W_______________________________________PP] [87.5% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [88.6% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [89.1% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [90.2% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [90.8% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [91.4% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [92.5% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [93.1% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [93.6% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [94.2% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [95.3% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [95.9% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [97.1% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [97.7% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [98.2% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [98.8% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [98.8% done] [ 0
Jobs: 1 (f=1): [W_______________________________________PP] [98.9% done] [ 0
[... the lone remaining write job repeats the line above until ~99.0% done ...]
Jobs: 0 (f=0) [eta 00m:02s]



Mathieu

> Mathieu
>
> > cfq (2.6.28.1-patch1) 517 3086
> >
> > It looks like we are getting somewhere :) Are there any specific
> > queue_depth, slice_async_rq, quantum variations you would like to be
> > tested ?
> >
> > For reference, I attach my ssh-like job file (again) to this mail.
> >
> > Mathieu
> >
> >
> > [job1]
> > rw=write
> > size=10240m
> > direct=0
> > blocksize=1024k
> >
> > [global]
> > rw=randread
> > size=2048k
> > filesize=30m
> > direct=0
> > bsrange=4k-44k
> >
> > [file1]
> > startdelay=0
> >
> > [file2]
> > startdelay=4
> >
> > [file3]
> > startdelay=8
> >
> > [file4]
> > startdelay=12
> >
> > [file5]
> > startdelay=16
> >
> > [file6]
> > startdelay=20
> >
> > [file7]
> > startdelay=24
> >
> > [file8]
> > startdelay=28
> >
> > [file9]
> > startdelay=32
> >
> > [file10]
> > startdelay=36
> >
> > [file11]
> > startdelay=40
> >
> > [file12]
> > startdelay=44
> >
> > [file13]
> > startdelay=48
> >
> > [file14]
> > startdelay=52
> >
> > [file15]
> > startdelay=56
> >
> > [file16]
> > startdelay=60
> >
> > [file17]
> > startdelay=64
> >
> > [file18]
> > startdelay=68
> >
> > [file19]
> > startdelay=72
> >
> > [file20]
> > startdelay=76
> >
> > [file21]
> > startdelay=80
> >
> > [file22]
> > startdelay=84
> >
> > [file23]
> > startdelay=88
> >
> > [file24]
> > startdelay=92
> >
> > [file25]
> > startdelay=96
> >
> > [file26]
> > startdelay=100
> >
> > [file27]
> > startdelay=104
> >
> > [file28]
> > startdelay=108
> >
> > [file29]
> > startdelay=112
> >
> > [file30]
> > startdelay=116
> >
> > [file31]
> > startdelay=120
> >
> > [file32]
> > startdelay=124
> >
> > [file33]
> > startdelay=128
> >
> > [file34]
> > startdelay=132
> >
> > [file35]
> > startdelay=134
> >
> > [file36]
> > startdelay=138
> >
> > [file37]
> > startdelay=142
> >
> > [file38]
> > startdelay=146
> >
> > [file39]
> > startdelay=150
> >
> > [file40]
> > startdelay=200
> >
> > [file41]
> > startdelay=260
> >
> > --
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2009-01-23 03:22:53

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

> So, I ran the tests with my corrected patch, and the results are very
> good !
>
> "incoming ssh connexion" test
>
> "config 2.6.28 cfq"
> Linux 2.6.28
> /sys/block/sd{a,b}/device/queue_depth = 31 (default)
> /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
>
> "config 2.6.28.1-patch1"
> Linux 2.6.28.1
> Corrected cfq patch applied
> echo 1 > /sys/block/sd{a,b}/device/queue_depth
> echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
> echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum
>
> On /dev/sda :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (2.6.28 cfq) 523 6637
> cfq (2.6.28.1-patch1) 579 2082
>
> On raid1 :
>
> I/O scheduler runt-min (msec) runt-max (msec)
> cfq (2.6.28 cfq) 523 28216
> cfq (2.6.28.1-patch1) 517 3086

Congratulations.
In university machine rooms (at least at universities in Japan),
parallel ssh workloads happen frequently.

I like this patch :)



>
> It looks like we are getting somewhere :) Are there any specific
> queue_depth, slice_async_rq, quantum variations you would like to be
> tested ?
>
> For reference, I attach my ssh-like job file (again) to this mail.
>
> Mathieu
>
>
> [... fio job file snipped; identical to the one quoted earlier in the thread ...]
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
>
> _______________________________________________
> ltt-dev mailing list
> [email protected]
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev


2009-01-23 04:03:47

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency

* KOSAKI Motohiro ([email protected]) wrote:
> > So, I ran the tests with my corrected patch, and the results are very
> > good !
> >
> > "incoming ssh connexion" test
> >
> > "config 2.6.28 cfq"
> > Linux 2.6.28
> > /sys/block/sd{a,b}/device/queue_depth = 31 (default)
> > /sys/block/sd{a,b}/queue/iosched/slice_async_rq = 2 (default)
> > /sys/block/sd{a,b}/queue/iosched/quantum = 4 (default)
> >
> > "config 2.6.28.1-patch1"
> > Linux 2.6.28.1
> > Corrected cfq patch applied
> > echo 1 > /sys/block/sd{a,b}/device/queue_depth
> > echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
> > echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum
> >
> > On /dev/sda :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (2.6.28 cfq) 523 6637
> > cfq (2.6.28.1-patch1) 579 2082
> >
> > On raid1 :
> >
> > I/O scheduler runt-min (msec) runt-max (msec)
> > cfq (2.6.28 cfq) 523 28216
> > cfq (2.6.28.1-patch1) 517 3086
>
> Congratulations.
> In university machine rooms (at least at universities in Japan),
> parallel ssh workloads happen frequently.
>
> I like this patch :)
>

Please see my posts from today, with numbers taken after my Seagate
firmware upgrade. The runt-max case is pretty hard to trigger reliably,
and I had to do a few runs before hitting the problem. The latest tests
look better; e.g. the 3086 msec figure is only that low because the
problem was not hit in that run.

But the
echo 1 > /sys/block/sd{a,b}/device/queue_depth
echo 1 > /sys/block/sd{a,b}/queue/iosched/slice_async_rq
echo 1 > /sys/block/sd{a,b}/queue/iosched/quantum

are definitely helping a lot, as my latest numbers also show. The
patch, OTOH, degraded performance rather than improving it.
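
(For reference, a minimal sketch of applying the three tunings above in
one step. It assumes SCSI disks sda and sdb running the cfq scheduler,
with the sysfs paths used earlier in this thread; adjust the device
list for your setup.)

#!/bin/sh
# Apply the low-latency tuning discussed in this thread to sda and sdb.
for dev in sda sdb; do
	echo 1 > /sys/block/$dev/device/queue_depth            # limit TCQ/NCQ depth
	echo 1 > /sys/block/$dev/queue/iosched/slice_async_rq  # fewer async rqs per slice
	echo 1 > /sys/block/$dev/queue/iosched/quantum          # dispatch fewer rqs per round
done

Note that sysfs settings do not persist across reboots, so such a
script would have to be rerun from an init script.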

Mathieu

>
>
> >
> > It looks like we are getting somewhere :) Are there any specific
> > queue_depth, slice_async_rq, quantum variations you would like to be
> > tested ?
> >
> > For reference, I attach my ssh-like job file (again) to this mail.
> >
> > Mathieu
> >
> >
> > [... fio job file snipped; identical to the one quoted earlier in the thread ...]
> >
> > --
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
> >
> > _______________________________________________
> > ltt-dev mailing list
> > [email protected]
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>
>
>
>
> _______________________________________________
> ltt-dev mailing list
> [email protected]
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68