2007-08-22 06:47:45

by Jeffrey W. Baker

Subject: huge improvement with per-device dirty throttling

I tested 2.6.23-rc2-mm + Peter's per-BDI v9 patches, versus 2.6.20 as
shipped in Ubuntu 7.04. I realize there is a large delta between these
two kernels.

I load the system with du -hs /bigtree, where bigtree has millions of
files, and dd if=/dev/zero of=bigfile bs=1048576. I test how long it
takes to ls /*, how long it takes to launch gnome-terminal, and how long
it takes to launch firefox.
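
Roughly, the whole harness is just shell; a sketch (the exact sequencing
is approximate, and I start the timed commands once the load has
settled):

  # background load: a metadata-heavy reader plus a sustained writer
  du -hs /bigtree > /dev/null &
  dd if=/dev/zero of=bigfile bs=1048576 &

  # measurements, each taken while the load is running
  sleep 60 && time ls -l /var /home /usr /lib /etc /boot /root /tmp
  sleep 60 && time gnome-terminal
  sleep 60 && time firefox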

2.6.23-rc2-mm-bdi is better than 2.6.20 by a factor between 50x and
100x.

1.

sleep 60 && time ls -l /var /home /usr /lib /etc /boot /root /tmp

2.6.20: 53s, 57s
2.6.23: 0.652s, 0.870s, 0.819s

improvement: ~70x

2.

sleep 60 && time gnome-terminal

2.6.20: 1m50s, 1m50s
2.6.23: 3s, 2s, 2s

improvement: ~40x

3.

sleep 60 && time firefox

2.6.20: >30m
2.6.23: 30s, 32s, 37s

improvement: +inf

Yes, you read that correctly. In the presence of a sustained writer and
a competing reader, it takes more than 30 minutes to start firefox.

4.

du -hs /bigtree

Under 2.6.20, lstat64 has a mean latency of 75ms in the presence of a
sustained writer. Under 2.6.23-rc2-mm+bdi, the mean latency of lstat64
is only 5ms (15x improvement). The worst case latency I observed was
more than 2.9 seconds for a single lstat64 call.
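
(These per-call timings are the sort of thing strace -T emits; one way
to collect them, sketched rather than the exact command:

  strace -T -e trace=lstat64 du -hs /bigtree 2>&1 \
      | sed -n 's/.*<\([0-9.]*\)>$/\1/p' > lstat64-times.txt

Each output line is then one lstat64 latency sample, in seconds.)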

Here's the stem plot of lstat64 latency under 2.6.20

The decimal point is 1 digit(s) to the left of the |

0 | 00000000000000000000000000000000000000000000000000000000000000000000+1737
2 | 177891223344556788899999
4 | 00000111122333333444444555555556666666666667777777777777788888888888+69
6 | 00000111222334557778999344677788899
8 | 0123484589
10 | 020045
12 | 1448239
14 | 1
16 | 5
18 | 399
20 | 32
22 | 80
24 |
26 | 2
28 | 1

Here's the same plot for 2.6.23-rc2-mm+bdi. Note the scale

The decimal point is 1 digit(s) to the left of the |

0 | 00000000000000000000000000000000000000000000000000000000000000000000+2243
1 | 1222255677788999999
2 | 0011122257
3 | 237
4 | 3
5 |
6 |
7 | 3
8 | 45
9 |
10 |
11 |
12 |
13 | 9

In other words, under 2.6.20, only writing processes make progress.
Readers never make progress.

5.

dd writeout speed

2.6.20: 36.3MB/s, 35.3MB/s, 33.9MB/s
2.6.23: 20.9MB/s, 22.2MB/s

2.6.23 is slower when writing out, because other processes are making
progress.

My system is a Core 2 Duo, 2GB, single SATA disk.

-jwb


2007-08-22 10:11:29

by Andi Kleen

Subject: Re: huge improvement with per-device dirty throttling

"Jeffrey W. Baker" <[email protected]> writes:
>
> My system is a Core 2 Duo, 2GB, single SATA disk.

Hmm, I thought the patch was only supposed to make a real difference
if you have multiple devices? But you've only got a single disk.

At least that was the case it was supposed to fix: starvation of fast
devices by slow devices.

Ok, perhaps the new adaptive dirty limits help your single disk
a lot too. But your improvements seem to be more "collateral damage" :)

But if that were true, it might be enough to just change the dirty
limits to get the same effect on your system. You might want to play
with /proc/sys/vm/dirty_*.
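
For example (illustrative values only, a sketch; the right numbers
depend on RAM size and disk speed):

  # throttle writers earlier by shrinking the global dirty limits
  sysctl -w vm.dirty_background_ratio=1
  sysctl -w vm.dirty_ratio=5

  # the same knobs via /proc
  echo 1 > /proc/sys/vm/dirty_background_ratio
  echo 5 > /proc/sys/vm/dirty_ratio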

-Andi

2007-08-22 11:09:45

by Paolo Ornati

Subject: Re: huge improvement with per-device dirty throttling

On 22 Aug 2007 13:05:13 +0200
Andi Kleen <[email protected]> wrote:

> "Jeffrey W. Baker" <[email protected]> writes:
> >
> > My system is a Core 2 Duo, 2GB, single SATA disk.
>
> Hmm, I thought the patch was only supposed to make a real difference
> if you have multiple devices? But you've only got a single disk.

No, there's also:
[PATCH 22/23] mm: dirty balancing for tasks

:)

--
Paolo Ornati
Linux 2.6.23-rc3-g2a677896-dirty on x86_64

2007-08-22 11:31:42

by Al Boldi

Subject: Re: huge improvement with per-device dirty throttling

Jeffrey W. Baker wrote:
> I tested 2.6.23-rc2-mm + Peter's per-BDI v9 patches, versus 2.6.20 as
> shipped in Ubuntu 7.04. I realize there is a large delta between these
> two kernels.
>
> I load the system with du -hs /bigtree, where bigtree has millions of
> files, and dd if=/dev/zero of=bigfile bs=1048576. I test how long it
> takes to ls /*, how long it takes to launch gnome-terminal, and how long
> it takes to launch firefox.
>
> 2.6.23-rc2-mm-bdi is better than 2.6.20 by a factor between 50x and
> 100x.
:
:
> Yes, you read that correctly. In the presence of a sustained writer and
> a competing reader, it takes more than 30 minutes to start firefox.
>
> 4.
>
> du -hs /bigtree
>
> Under 2.6.20, lstat64 has a mean latency of 75ms in the presence of a
> sustained writer. Under 2.6.23-rc2-mm+bdi, the mean latency of lstat64
> is only 5ms (15x improvement). The worst case latency I observed was
> more than 2.9 seconds for a single lstat64 call.
:
:
> In other words, under 2.6.20, only writing processes make progress.
> Readers never make progress.
>
> 5.
>
> dd writeout speed
>
> 2.6.20: 36.3MB/s, 35.3MB/s, 33.9MB/s
> 2.6.23: 20.9MB/s, 22.2MB/s
>
> 2.6.23 is slower when writing out, because other processes are making
> progress.
>
> My system is a Core 2 Duo, 2GB, single SATA disk.

Many thanks for your detailed analysis.

Which io-scheduler did you use, and what numbers do you get with other
io-schedulers?


Thanks!

--
Al

2007-08-22 12:47:48

by Andrea Arcangeli

Subject: Re: huge improvement with per-device dirty throttling

On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
> Ok, perhaps the new adaptive dirty limits help your single disk
> a lot too. But your improvements seem to be more "collateral damage" :)
>
> But if that were true, it might be enough to just change the dirty
> limits to get the same effect on your system. You might want to play
> with /proc/sys/vm/dirty_*.

The adaptive dirty limit is per task, so it can't be reproduced with a
global sysctl. It made quite some difference when I researched it as a
function of time. This version isn't a function of time, but it
certainly makes a lot of difference too; actually it's the most
important part of the patchset for most people. The rest is for the
corner cases that aren't handled right currently (writing to a slow
device with a writeback cache has always hung the whole thing).

2007-09-04 09:37:55

by Leroy van Logchem

Subject: Re: huge improvement with per-device dirty throttling

Andrea Arcangeli wrote:
> On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
>> Ok, perhaps the new adaptive dirty limits help your single disk
>> a lot too. But your improvements seem to be more "collateral damage" :)
>>
>> But if that were true, it might be enough to just change the dirty
>> limits to get the same effect on your system. You might want to play
>> with /proc/sys/vm/dirty_*.
>
> The adaptive dirty limit is per task, so it can't be reproduced with a
> global sysctl. It made quite some difference when I researched it as a
> function of time. This version isn't a function of time, but it
> certainly makes a lot of difference too; actually it's the most
> important part of the patchset for most people. The rest is for the
> corner cases that aren't handled right currently (writing to a slow
> device with a writeback cache has always hung the whole thing).


Self-tuning > static sysctls. For the last few years we have had to use
very small values for dirty_ratio and dirty_background_ratio to soften
the latency problems we see during sustained writes. IMO these patches
really help in many cases; please merge them into mainline.
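
Concretely, "very small" means something like this in /etc/sysctl.conf
(the values are illustrative and varied per machine; if memory serves,
kernels of this vintage clamp the effective dirty_ratio to a minimum
of 5 anyway):

  vm.dirty_ratio = 5
  vm.dirty_background_ratio = 1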

--
Leroy

2007-09-04 19:23:18

by Martin Knoblauch

Subject: Re: huge improvement with per-device dirty throttling


--- Leroy van Logchem <[email protected]> wrote:

> Andrea Arcangeli wrote:
> > On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
> > >> Ok, perhaps the new adaptive dirty limits help your single disk
> > >> a lot too. But your improvements seem to be more "collateral
> > >> damage" :)
> > >>
> > >> But if that were true, it might be enough to just change the
> > >> dirty limits to get the same effect on your system. You might
> > >> want to play with /proc/sys/vm/dirty_*.
> > >
> > > The adaptive dirty limit is per task, so it can't be reproduced
> > > with a global sysctl. It made quite some difference when I
> > > researched it as a function of time. This version isn't a function
> > > of time, but it certainly makes a lot of difference too; actually
> > > it's the most important part of the patchset for most people. The
> > > rest is for the corner cases that aren't handled right currently
> > > (writing to a slow device with a writeback cache has always hung
> > > the whole thing).
> >
> > Self-tuning > static sysctls. For the last few years we have had to
> > use very small values for dirty_ratio and dirty_background_ratio to
> > soften the latency problems we see during sustained writes. IMO
> > these patches really help in many cases; please merge them into
> > mainline.
> >
> > --
> > Leroy
>

While it helps in some situations, I did some tests today with
2.6.22.6+bdi-v9 (Peter was so kind as to provide it) which seem to
indicate that it hurts NFS writes. Has anyone seen similar effects?

Otherwise I would just second your request. It definitely helps the
problematic performance of my CCISS-based RAID5 volume.

Martin

------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

2007-09-05 08:54:37

by Martin Knoblauch

Subject: Re: huge improvement with per-device dirty throttling


--- Andrea Arcangeli <[email protected]> wrote:

> On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
> > Ok, perhaps the new adaptive dirty limits help your single disk
> > a lot too. But your improvements seem to be more "collateral
> > damage" :)
> >
> > But if that were true, it might be enough to just change the dirty
> > limits to get the same effect on your system. You might want to
> > play with /proc/sys/vm/dirty_*.
>
> The adaptive dirty limit is per task, so it can't be reproduced with
> a global sysctl. It made quite some difference when I researched it
> as a function of time. This version isn't a function of time, but it
> certainly makes a lot of difference too; actually it's the most
> important part of the patchset for most people. The rest is for the
> corner cases that aren't handled right currently (writing to a slow
> device with a writeback cache has always hung the whole thing).

I didn't see that remark before. I just realized that "slow device with
a writeback cache" pretty well describes the CCISS controller in the
DL380g4. Could you elaborate on why that is a problematic case?

Cheers
Martin

------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

2007-09-06 09:50:27

by Martin Knoblauch

Subject: Re: huge improvement with per-device dirty throttling


--- Martin Knoblauch <[email protected]> wrote:

> --- Leroy van Logchem <[email protected]> wrote:
>
> > Andrea Arcangeli wrote:
> > > On Wed, Aug 22, 2007 at 01:05:13PM +0200, Andi Kleen wrote:
> > > >> Ok, perhaps the new adaptive dirty limits help your single disk
> > > >> a lot too. But your improvements seem to be more "collateral
> > > >> damage" :)
> > > >>
> > > >> But if that were true, it might be enough to just change the
> > > >> dirty limits to get the same effect on your system. You might
> > > >> want to play with /proc/sys/vm/dirty_*.
> > > >
> > > > The adaptive dirty limit is per task, so it can't be reproduced
> > > > with a global sysctl. It made quite some difference when I
> > > > researched it as a function of time. This version isn't a
> > > > function of time, but it certainly makes a lot of difference
> > > > too; actually it's the most important part of the patchset for
> > > > most people. The rest is for the corner cases that aren't
> > > > handled right currently (writing to a slow device with a
> > > > writeback cache has always hung the whole thing).
> > >
> > > Self-tuning > static sysctls. For the last few years we have had
> > > to use very small values for dirty_ratio and
> > > dirty_background_ratio to soften the latency problems we see
> > > during sustained writes. IMO these patches really help in many
> > > cases; please merge them into mainline.
> > >
> > > --
> > > Leroy
> >
> > While it helps in some situations, I did some tests today with
> > 2.6.22.6+bdi-v9 (Peter was so kind as to provide it) which seem to
> > indicate that it hurts NFS writes. Has anyone seen similar effects?
> >
> > Otherwise I would just second your request. It definitely helps the
> > problematic performance of my CCISS-based RAID5 volume.
>

Please disregard my comment about NFS write performance. What I saw was
caused by some other stuff I am toying with.

So, I second your request to push this forward.

Martin

------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de