Thomas has done a very thorough bisection job and has made some very
interesting findings. In case everyone is ignoring this now-huge
bugzilla thread, I'm forwarding his email to LKML so people can have a
closer look at the offending commits.
I'm glad you are so persistent in figuring out the source of these
long-lasting regressions, Thomas. Thanks a lot!
Mathieu
----- Forwarded message from [email protected] -----
Date: Fri, 5 Jun 2009 21:55:16 GMT
To: [email protected]
From: [email protected]
Subject: [Bug 12309] Large I/O operations result in slow performance and high
iowait times
http://bugzilla.kernel.org/show_bug.cgi?id=12309
--- Comment #360 from Thomas Pilarski <[email protected]> 2009-06-05 21:55:05 ---
Created an attachment (id=21774)
--> (http://bugzilla.kernel.org/attachment.cgi?id=21774)
Test patch against heavy io bug
I have done a bisection and arrived at these two patches. Reverting them
improves the desktop responsiveness on my notebook enormously. I tested this
on a 2.6.28 non-SMP kernel (my heavy I/O testing installation) during four
concurrent read and write operations, while working with two VMs. It's only a
Core2 @2.4GHz system. I can even start new applications during heavy I/O.
I have attached the patch that I applied to my test installation. Use it
with care, as I am not a kernel developer and do not know the dependencies in
the cfq scheduler.
I have reverted these two patches:
07db59bd6b0f279c31044cba6787344f63be87ea is first bad commit
commit 07db59bd6b0f279c31044cba6787344f63be87ea
Author: Linus Torvalds <[email protected]>
Date: Fri Apr 27 09:10:47 2007 -0700
Change default dirty-writeback limits
Do this really early in the 2.6.22-rc series, so that we'll get
feedback. And don't change by half measures. Just cut the default
dirty limit to a quarter of what it was, and see if anybody even
notices.
Signed-off-by: Linus Torvalds <[email protected]>
:040000 040000 b63eb9faf5b9a42a1cdad901a5f18d6cceb7fdf6
2b8b4117ca34077cb0b817c77595aa6c9e34253a M mm
a993800655ee516b6f6a6fc4c2ee13fedfb0590b is first bad commit
commit a993800655ee516b6f6a6fc4c2ee13fedfb0590b
Author: Jens Axboe <[email protected]>
Date: Fri Apr 20 08:55:52 2007 +0200
cfq-iosched: fix sequential write regression
We have a 10-15% performance regression for sequential writes on TCQ/NCQ
enabled drives in 2.6.21-rcX after the CFQ update went in. It has been
reported by Valerie Clement <[email protected]> and the Intel
testing folks. The regression is because of CFQ's now more aggressive
queue control, limiting the depth available to the device.
This patches fixes that regression by allowing a greater depth when only
one queue is busy. It has been tested to not impact sync-vs-async
workloads too much - we still do a lot better than 2.6.20.
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
:040000 040000 07c48a6930ce62d36540b6650e3ea0563bd7ec59
95fc11105fe3339c90c4e7bebb66a820f7084601 M block
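The first commit above changes the default dirty-writeback limits. To see which limits a running system actually uses, the values can be read from procfs. This is a minimal sketch, assuming the usual /proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio files; the helper names read_vm_tunable and dirty_limits are made up for illustration and are not part of the attached patch.

```python
from pathlib import Path


def read_vm_tunable(name, base="/proc/sys/vm"):
    """Return the integer value of a VM sysctl file such as dirty_ratio."""
    return int(Path(base, name).read_text().strip())


def dirty_limits(base="/proc/sys/vm"):
    """Return the two dirty-writeback percentages as a dict."""
    names = ("dirty_background_ratio", "dirty_ratio")
    return {name: read_vm_tunable(name, base) for name in names}
```

Calling dirty_limits() on the test machine before and after applying the revert would show whether the defaults really differ between the two kernels.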
Here are the fsync results on my machine:
**************************************************************************
Without patch
Linux balrog 2.6.28 #2 Mon Mar 23 11:19:13 CET 2009 x86_64 GNU/Linux
fsync time: 7.8282
fsync time: 17.3598
fsync time: 24.0352
fsync time: 19.7307
fsync time: 21.9559
fsync time: 21.0571
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 129.286 s, 40.6 MB/s
fsync time: 21.8491
fsync time: 0.0430
fsync time: 0.0448
fsync time: 0.0451
fsync time: 0.0451
fsync time: 0.0451
fsync time: 0.0452
**************************************************************************
With patch
Linux balrog 2.6.28 #5 Fri Jun 5 22:23:54 CEST 2009 x86_64 GNU/Linux
fsync time: 2.8409
fsync time: 2.3345
fsync time: 2.8423
fsync time: 0.0851
fsync time: 1.2497
fsync time: 0.9981
fsync time: 0.9494
fsync time: 2.7094
fsync time: 2.9753
fsync time: 2.8886
fsync time: 2.9894
fsync time: 1.2673
fsync time: 2.6728
fsync time: 1.3408
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 117.388 s, 44.7 MB/s
fsync time: 85.1461
fsync time: 23.5310
fsync time: 0.0317
fsync time: 0.0337
fsync time: 0.0338
fsync time: 0.0338
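For anyone who wants to reproduce numbers of this kind: the tool that produced the "fsync time" lines is not shown in this mail, so the following is only a rough sketch of such a measurement, not the original tester. It appends a small payload to a file and times the fsync() call alone. The function name time_fsync and the 4 KiB payload are assumptions.

```python
import os
import time


def time_fsync(path, payload=b"x" * 4096):
    """Append payload to path and return how long fsync() took, in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        start = time.monotonic()
        os.fsync(fd)  # blocks until the data has reached the disk
        return time.monotonic() - start
    finally:
        os.close(fd)
```

Calling time_fsync() in a loop about once per second while a large dd runs on the same disk, and printing each result with four decimals, gives output comparable in form to the listings above.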
--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
----- End forwarded message -----
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
(Cc:-ed Linus and Jens too)
Hello,
I have run some tests, and the improvements in kernel 30 are
gigantic.
Applications start faster during heavy I/O on kernel 30 than on kernel
20, which used to be the fastest kernel for me. The first
part of the patch from comment #366 in kernel bug 12309 improves the
desktop responsiveness in kernels 29 and 30, but kernel 29 was still bad,
while kernel 30 was fine during my tests. It must be only one part
of the bad commit.
+	if (cfqd->rq_in_driver && cfq_cfqq_idle_window(cfqq))
+		return 0;
The fsync problem still exists. Firefox is unusable during heavy I/O.
This problem exists in every kernel I have tested (17, 18, 20, 22, 24,
26, 27, 28, 29 and 30). I could not test kernel 15, as I was not
able to start the X server.
High I/O wait times occurred in every one of these kernels too, even in
kernel 15.
NCQ should be disabled on my test drive, because I have no access to
read or write the queue_depth file. If I chmod +w queue_depth, I can
read a value of one. Write access fails with an I/O error. It's an Ultrabay
SATA drive in my ThinkPad. The main drive shows 31 as queue_depth and the
file is readable and writable.
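The queue_depth check described above can be scripted. This is a minimal sketch, assuming the file lives at /sys/block/<device>/device/queue_depth as it does for SCSI/libata disks; the function names and the "depth greater than 1" rule for guessing whether NCQ is active are illustrative assumptions.

```python
from pathlib import Path


def read_queue_depth(device, sysfs_root="/sys/block"):
    """Return the queue depth of a block device, or None if it cannot be read."""
    path = Path(sysfs_root, device, "device", "queue_depth")
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected content.
        return None


def ncq_looks_enabled(device, sysfs_root="/sys/block"):
    """Guess whether NCQ is in use: a readable depth greater than 1."""
    depth = read_queue_depth(device, sysfs_root)
    return depth is not None and depth > 1
```

With the values reported here, the main drive (31) would count as NCQ-enabled and the Ultrabay drive (1, or unreadable) would not.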
For testing I used 16 concurrent writing dd processes. My tests
were starting gimp and eclipse, compiling the kernel, and switching windows
and desktops. The desktop performance (starting applications / working)
during heavy I/O on kernel 30 is really great, even better than on
kernel 20, which was the best kernel for me. But the CPU usage of
kernel 30 is higher, and I get some short stalls (mouse freezes)
of less than 1s while updating the screen at 1920x1200 in VESA mode with
the CPU at 800MHz. These stalls exist with kernel 20 too, but are shorter.
The freezes disappear when I enable CPU frequency scaling or set the CPU to
max frequency.
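A load similar to the 16 concurrent writers can be approximated without dd. The sketch below is an assumption-laden stand-in, not the test setup used here: it uses threads inside one Python process instead of separate dd processes, and the names write_file and run_writers, the file names and the default sizes are all invented for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def write_file(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of zeros to path in blocks, then fsync the file."""
    block = b"\0" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())
    return written


def run_writers(directory, writers=16, total_bytes=64 * 1024 * 1024):
    """Run several writers concurrently; return the bytes written by each."""
    paths = [os.path.join(directory, f"writer-{i}.dat") for i in range(writers)]
    with ThreadPoolExecutor(max_workers=writers) as pool:
        return list(pool.map(lambda p: write_file(p, total_bytes), paths))
```

Pointing run_writers() at a directory on the disk under test, with total_bytes large enough to exceed the page cache, should produce sustained write pressure while the desktop tests are run by hand.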
The patch improves the startup time of e.g. eclipse during heavy I/O from
~2min to ~1:30min in kernel 30. The overall throughput is nearly the
same, ~70% of disk capacity with 16 writing processes. Every app
started quickly from the same disk, even at a loadavg of ~20. There
were no mouse freezes with the patch. I could even use the input
assistance of eclipse, although it takes a while (~5s the first time).
Gimp started even faster than in kernel 20. Everything was quick and
without any stalls in spite of such a high load (up to 25). I had no
typing delays in the console. It's really great. The only thing I could not
test was virtual machines, as the vmware kernel driver does not work with
kernel 30, but I have done some quick tests with virtualbox and it looks
fine.
I ran all these tests on patched and unpatched kernels 20,
29 and 30 without SMP support, and then a final test on the patched kernel
30 with SMP and multicore support to have a direct comparison.
The final tests include a test with 16 concurrent reading and writing
processes on the same partition, a test with one reading and one writing dd
process on the same partition, and a test with one writing dd process.
All partitions were ext3, mounted with relatime and data=ordered. The
partition for the writing processes was formatted before every test.
My real installation does not show such a clear improvement, or even shows
a regression compared to kernel 29, but I have only just started to use it
and it's an installation on a fully encrypted LVM drive.
I am not sure whether this is really the source of the problem or just a
lucky state that lets the problem disappear for my machine on my test
installation. I doubt the second case, but I don't know how to prove it
reliably. I have also tried the AS (anticipatory) scheduler with kernel 30
with SMP support, and there seems to be an improvement too. The startup
times are in some cases better and in some cases worse, but I didn't have
any desktop freezes at all.
Thank you all for your work, the results are impressive.
Best regards,
Thomas Pilarski