CFQ v2 is much better in a lot of cases, but certain situations trigger
a cpu load so high it starves the rest of the system thus completely
ruining the interactive experience. While casually looking at the
problem, I stumbled upon a patch by Arjan van de Ven sent to lkml on
sept. 1 (Subject: block fixes). Part of it is already included in the
CFQ v2 patches and after applying the rest[1] I'm no longer able to
trigger the problem.
[1] Patch attached against 2.6.9-rc4-ck1 but applies to rc4-mm1 with
some minor fuzz.
--
Ronny V. Vindenes <[email protected]>
Ronny V. Vindenes wrote:
> CFQ v2 is much better in a lot of cases, but certain situations trigger
> a cpu load so high it starves the rest of the system thus completely
> ruining the interactive experience. While casually looking at the
> problem, I stumbled upon a patch by Arjan van de Ven sent to lkml on
> sept. 1 (Subject: block fixes). Part of it is already included in the
> CFQ v2 patches and after applying the rest[1] I'm no longer able to
> trigger the problem.
>
> [1] Patch attached against 2.6.9-rc4-ck1 but applies to rc4-mm1 with
> some minor fuzz.
>
>
>
> ------------------------------------------------------------------------
>
> --- patches/linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12 12:25:09.798003278 +0200
> +++ linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12 12:25:42.959479479 +0200
> @@ -100,7 +100,7 @@
> nr = q->nr_requests;
> q->nr_congestion_on = nr;
>
> - nr = q->nr_requests - (q->nr_requests / 8) - 1;
> + nr = q->nr_requests - (q->nr_requests / 8) - (q->nr_requests/16)- 1;
> if (nr < 1)
> nr = 1;
> q->nr_congestion_off = nr;
I thought this first hunk looked like a good idea when Arjan sent the
patch. Can you check if it alone helps your problem?
The second hunk should be basically a noop.
On Tue, Oct 12 2004, Nick Piggin wrote:
> Ronny V. Vindenes wrote:
> >CFQ v2 is much better in a lot of cases, but certain situations trigger
> >a cpu load so high it starves the rest of the system thus completely
> >ruining the interactive experience. While casually looking at the
> >problem, I stumbled upon a patch by Arjan van de Ven sent to lkml on
> >sept. 1 (Subject: block fixes). Part of it is already included in the
> >CFQ v2 patches and after applying the rest[1] I'm no longer able to
> >trigger the problem.
> >
> >[1] Patch attached against 2.6.9-rc4-ck1 but applies to rc4-mm1 with
> >some minor fuzz.
> >
> >
> >
> >------------------------------------------------------------------------
> >
> >--- patches/linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
> >12:25:09.798003278 +0200
> >+++ linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
> >12:25:42.959479479 +0200
> >@@ -100,7 +100,7 @@
> > nr = q->nr_requests;
> > q->nr_congestion_on = nr;
> >
> >- nr = q->nr_requests - (q->nr_requests / 8) - 1;
> >+ nr = q->nr_requests - (q->nr_requests / 8) - (q->nr_requests/16)- 1;
> > if (nr < 1)
> > nr = 1;
> > q->nr_congestion_off = nr;
>
>
> I thought this first hunk looked like a good idea when Arjan sent the
> patch. Can you check if it alone helps your problem?
Yeah agree, it's a good idea to leave a bit of air between congestion on
and off. Fully explains the cfq v2 excessive sys time for some
workloads, which is extra nice.
> The second hunk should be basically a noop.
I don't see what it is trying to achieve, I like the current code
better.
--
Jens Axboe
Jens Axboe wrote:
> On Tue, Oct 12 2004, Nick Piggin wrote:
>
>>Ronny V. Vindenes wrote:
>>
>>>CFQ v2 is much better in a lot of cases, but certain situations trigger
>>>a cpu load so high it starves the rest of the system thus completely
>>>ruining the interactive experience. While casually looking at the
>>>problem, I stumbled upon a patch by Arjan van de Ven sent to lkml on
>>>sept. 1 (Subject: block fixes). Part of it is already included in the
>>>CFQ v2 patches and after applying the rest[1] I'm no longer able to
>>>trigger the problem.
>>>
>>>[1] Patch attached against 2.6.9-rc4-ck1 but applies to rc4-mm1 with
>>>some minor fuzz.
>>>
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>--- patches/linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
>>>12:25:09.798003278 +0200
>>>+++ linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
>>>12:25:42.959479479 +0200
>>>@@ -100,7 +100,7 @@
>>> nr = q->nr_requests;
>>> q->nr_congestion_on = nr;
>>>
>>>- nr = q->nr_requests - (q->nr_requests / 8) - 1;
>>>+ nr = q->nr_requests - (q->nr_requests / 8) - (q->nr_requests/16)- 1;
>>> if (nr < 1)
>>> nr = 1;
>>> q->nr_congestion_off = nr;
>>
>>
>>I thought this first hunk looked like a good idea when Arjan sent the
>>patch. Can you check if it alone helps your problem?
>
>
> Yeah agree, it's a good idea to leave a bit of air between congestion on
> and off. Fully explains the cfq v2 excessive sys time for some
> workloads, which is extra nice.
>
Cool. Can you queue up a patch for when -mm opens again, or shall I?
I can't imagine it should cause any problems but a bit of testing
would be wise.
>
>>The second hunk should be basically a noop.
>
>
> I don't see what it is trying to achieve, I like the current code
> better.
>
I think Arjan may have just misread the code a little bit.
On Tue, Oct 12 2004, Nick Piggin wrote:
> Jens Axboe wrote:
> >On Tue, Oct 12 2004, Nick Piggin wrote:
> >
> >>Ronny V. Vindenes wrote:
> >>
> >>>CFQ v2 is much better in a lot of cases, but certain situations trigger
> >>>a cpu load so high it starves the rest of the system thus completely
> >>>ruining the interactive experience. While casually looking at the
> >>>problem, I stumbled upon a patch by Arjan van de Ven sent to lkml on
> >>>sept. 1 (Subject: block fixes). Part of it is already included in the
> >>>CFQ v2 patches and after applying the rest[1] I'm no longer able to
> >>>trigger the problem.
> >>>
> >>>[1] Patch attached against 2.6.9-rc4-ck1 but applies to rc4-mm1 with
> >>>some minor fuzz.
> >>>
> >>>
> >>>
> >>>------------------------------------------------------------------------
> >>>
> >>>--- patches/linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
> >>>12:25:09.798003278 +0200
> >>>+++ linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
> >>>12:25:42.959479479 +0200
> >>>@@ -100,7 +100,7 @@
> >>> nr = q->nr_requests;
> >>> q->nr_congestion_on = nr;
> >>>
> >>>- nr = q->nr_requests - (q->nr_requests / 8) - 1;
> >>>+ nr = q->nr_requests - (q->nr_requests / 8) - (q->nr_requests/16)- 1;
> >>> if (nr < 1)
> >>> nr = 1;
> >>> q->nr_congestion_off = nr;
> >>
> >>
> >>I thought this first hunk looked like a good idea when Arjan sent the
> >>patch. Can you check if it alone helps your problem?
> >
> >
> >Yeah agree, it's a good idea to leave a bit of air between congestion on
> >and off. Fully explains the cfq v2 excessive sys time for some
> >workloads, which is extra nice.
> >
>
> Cool. Can you queue up a patch for when -mm opens again, or shall I?
> I can't imagine it should cause any problems but a bit of testing
> would be wise.
Testing is always good, but maybe Andrew can take it now but just not
feed to Linus until after 2.6.9?
Jens
Jens Axboe wrote:
> On Tue, Oct 12 2004, Nick Piggin wrote:
>>Cool. Can you queue up a patch for when -mm opens again, or shall I?
>>I can't imagine it should cause any problems but a bit of testing
>>would be wise.
>
>
> Testing is always good, but maybe Andrew can take it now but just not
> feed to Linus until after 2.6.9?
>
If Andrew doesn't mind, then yeah that would be nice.
Nick Piggin schrieb:
> Jens Axboe wrote:
>
>> On Tue, Oct 12 2004, Nick Piggin wrote:
>>
>>> Ronny V. Vindenes wrote:
>>>
>>>>
>>>> --- patches/linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c
>>>> 2004-10-12 12:25:09.798003278 +0200
>>>> +++ linux-2.6.9-rc4-ck1/drivers/block/ll_rw_blk.c 2004-10-12
>>>> 12:25:42.959479479 +0200
>>>> @@ -100,7 +100,7 @@
>>>> nr = q->nr_requests;
>>>> q->nr_congestion_on = nr;
>>>>
>>>> - nr = q->nr_requests - (q->nr_requests / 8) - 1;
>>>> + nr = q->nr_requests - (q->nr_requests / 8) -
>>>> (q->nr_requests/16)- 1;
>>>> if (nr < 1)
>>>> nr = 1;
>>>> q->nr_congestion_off = nr;
>>>
>>>
>>>
>>> I thought this first hunk looked like a good idea when Arjan sent the
>>> patch. Can you check if it alone helps your problem?
>>
>>
>>
>> Yeah agree, it's a good idea to leave a bit of air between congestion on
>> and off. Fully explains the cfq v2 excessive sys time for some
>> workloads, which is extra nice.
>>
>
> Cool. Can you queue up a patch for when -mm opens again, or shall I?
> I can't imagine it should cause any problems but a bit of testing
> would be wise.
Just as a feedback: I applied the first hunk of that patch to
2.6.9-rc4-ck1 and it really makes a difference. At first I thought the
staircase scheduler was responsible for io starvation sometimes, but now
this behaviour is gone. Very well!
Prakash