Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
the last queue of the group to become busy (have at least one request),
so that the group does not lose out for not being continuously
backlogged. The commit did not check for the condition that the last
queue already has some requests. As a result, if the queue already has
requests, wait_busy is set. Later on, cfq_select_queue() checks the
flag, and decides that since the queue has a request now and wait_busy
is set, the queue is expired. This results in early expiration of the
queue.
This patch fixes the problem by adding a check to see if queue already
has requests. If it does, wait_busy is not set. As a result, time slices
do not expire early.
The queues with more than one request are usually buffered writers.
Testing shows improvement in isolation between buffered writers.
Signed-off-by: Justin TerAvest <[email protected]>
---
block/cfq-iosched.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 501ffdf..5dcc353 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3432,6 +3432,10 @@ static bool cfq_should_wait_busy(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
struct cfq_io_context *cic = cfqd->active_cic;
+ /* If the queue already has requests, don't wait */
+ if (!RB_EMPTY_ROOT(&cfqq->sort_list))
+ return false;
+
/* If there are other queues in the group, don't wait */
if (cfqq->cfqg->nr_cfqq > 1)
return false;
--
1.7.3.1
On Tue, Feb 08, 2011 at 11:18:17AM -0800, Justin TerAvest wrote:
> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
> the last queue of the group to become busy (have at least one request),
> so that the group does not lose out for not being continuously
> backlogged. The commit did not check for the condition that the last
> queue already has some requests. As a result, if the queue already has
> requests, wait_busy is set. Later on, cfq_select_queue() checks the
> flag, and decides that since the queue has a request now and wait_busy
> is set, the queue is expired. This results in early expiration of the
> queue.
Hi Justin,
wait_busy will be set only if slice has expired or about to be expired. So
even if we are setting wait_busy flag, it is not a huge deal even if
select_queue() expires it? Anyway queue has consumed or almost consumed
its allocated slice?
Having said that, it does not make sense to set wait_busy flag if
cfqq has requests. So I would be fine with the patch. I am just
curious that how did you see a difference in practice.
>
> This patch fixes the problem by adding a check to see if queue already
> has requests. If it does, wait_busy is not set. As a result, time slices
> do not expire early.
>
> The queues with more than one request are usually buffered writers.
> Testing shows improvement in isolation between buffered writers.
Upstream code puts all the buffered WRITES in root cgroup. So there
is no isolation between buffered WRITES?
Thanks
Vivek
>
> Signed-off-by: Justin TerAvest <[email protected]>
> ---
> block/cfq-iosched.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 501ffdf..5dcc353 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3432,6 +3432,10 @@ static bool cfq_should_wait_busy(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> struct cfq_io_context *cic = cfqd->active_cic;
>
> + /* If the queue already has requests, don't wait */
> + if (!RB_EMPTY_ROOT(&cfqq->sort_list))
> + return false;
> +
> /* If there are other queues in the group, don't wait */
> if (cfqq->cfqg->nr_cfqq > 1)
> return false;
> --
> 1.7.3.1
On Tue, Feb 8, 2011 at 11:48 AM, Vivek Goyal <[email protected]> wrote:
> On Tue, Feb 08, 2011 at 11:18:17AM -0800, Justin TerAvest wrote:
>> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
>> the last queue of the group to become busy (have at least one request),
>> so that the group does not lose out for not being continuously
>> backlogged. The commit did not check for the condition that the last
>> queue already has some requests. As a result, if the queue already has
>> requests, wait_busy is set. Later on, cfq_select_queue() checks the
>> flag, and decides that since the queue has a request now and wait_busy
>> is set, the queue is expired. ?This results in early expiration of the
>> queue.
>
> Hi Justin,
>
> wait_busy will be set only if slice has expired or about to be expired. So
> even if we are setting wait_busy flag, it is not a huge deal even if
> select_queue() expires it? Anyway queue has consumed or almost consumed
> its allocated slice?
Correct, the queue will have consumed or almost consumed its allocated
slice. However, this must be happening often enough because there is a
measurable impact in our testing.
>
> Having said that, it does not make sense to set wait_busy flag if
> cfqq has requests. So I would be fine with the patch. I am just
> curious that how did you see a difference in practice.
In practice,
- We see some better isolation between tasks with this patch alone
(Possibly those writes are getting marked SYNC somehow, or it's
another timing change)
- With other patches that introduce isolation between buffered
writes, we see that isolation works better, especially for cgroups
with small weights.
>
>>
>> This patch fixes the problem by adding a check to see if queue already
>> has requests. If it does, wait_busy is not set. As a result, time slices
>> do not expire early.
>>
>> The queues with more than one request are usually buffered writers.
>> Testing shows improvement in isolation between buffered writers.
>
> Upstream code puts all the buffered WRITES in root cgroup. So there
> is no isolation between buffered WRITES?
This was a mistake on my part, we have other patches that add
isolation between buffered writes; I just took this one out of order.
If you'd like, we can hold on this patch for now and I can resend this
later, but I think the actual patch itself is still good.
My apologies for the confusion.
Thanks,
Justin
>
> Thanks
> Vivek
>
>>
>> Signed-off-by: Justin TerAvest <[email protected]>
>> ---
>> ?block/cfq-iosched.c | ? ?4 ++++
>> ?1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index 501ffdf..5dcc353 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -3432,6 +3432,10 @@ static bool cfq_should_wait_busy(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>> ?{
>> ? ? ? struct cfq_io_context *cic = cfqd->active_cic;
>>
>> + ? ? /* If the queue already has requests, don't wait */
>> + ? ? if (!RB_EMPTY_ROOT(&cfqq->sort_list))
>> + ? ? ? ? ? ? return false;
>> +
>> ? ? ? /* If there are other queues in the group, don't wait */
>> ? ? ? if (cfqq->cfqg->nr_cfqq > 1)
>> ? ? ? ? ? ? ? return false;
>> --
>> 1.7.3.1
>
On Tue, Feb 08, 2011 at 02:21:18PM -0800, Justin TerAvest wrote:
> On Tue, Feb 8, 2011 at 11:48 AM, Vivek Goyal <[email protected]> wrote:
> > On Tue, Feb 08, 2011 at 11:18:17AM -0800, Justin TerAvest wrote:
> >> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
> >> the last queue of the group to become busy (have at least one request),
> >> so that the group does not lose out for not being continuously
> >> backlogged. The commit did not check for the condition that the last
> >> queue already has some requests. As a result, if the queue already has
> >> requests, wait_busy is set. Later on, cfq_select_queue() checks the
> >> flag, and decides that since the queue has a request now and wait_busy
> >> is set, the queue is expired. ?This results in early expiration of the
> >> queue.
> >
> > Hi Justin,
> >
> > wait_busy will be set only if slice has expired or about to be expired. So
> > even if we are setting wait_busy flag, it is not a huge deal even if
> > select_queue() expires it? Anyway queue has consumed or almost consumed
> > its allocated slice?
>
> Correct, the queue will have consumed or almost consumed its allocated
> slice. However, this must be happening often enough because there is a
> measurable impact in our testing.
>
> >
> > Having said that, it does not make sense to set wait_busy flag if
> > cfqq has requests. So I would be fine with the patch. I am just
> > curious that how did you see a difference in practice.
>
> In practice,
> - We see some better isolation between tasks with this patch alone
> (Possibly those writes are getting marked SYNC somehow, or it's
> another timing change)
> - With other patches that introduce isolation between buffered
> writes, we see that isolation works better, especially for cgroups
> with small weights.
>
> >
> >>
> >> This patch fixes the problem by adding a check to see if queue already
> >> has requests. If it does, wait_busy is not set. As a result, time slices
> >> do not expire early.
> >>
> >> The queues with more than one request are usually buffered writers.
> >> Testing shows improvement in isolation between buffered writers.
> >
> > Upstream code puts all the buffered WRITES in root cgroup. So there
> > is no isolation between buffered WRITES?
>
> This was a mistake on my part, we have other patches that add
> isolation between buffered writes; I just took this one out of order.
> If you'd like, we can hold on this patch for now and I can resend this
> later, but I think the actual patch itself is still good.
I think this patch as it is good as there is no point in marking a
queue wait_busy if it already has requests.
Acked-by: Vivek Goyal <[email protected]>
Thanks
Vivek
>
> My apologies for the confusion.
>
> Thanks,
> Justin
>
> >
> > Thanks
> > Vivek
> >
> >>
> >> Signed-off-by: Justin TerAvest <[email protected]>
> >> ---
> >> ?block/cfq-iosched.c | ? ?4 ++++
> >> ?1 files changed, 4 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> >> index 501ffdf..5dcc353 100644
> >> --- a/block/cfq-iosched.c
> >> +++ b/block/cfq-iosched.c
> >> @@ -3432,6 +3432,10 @@ static bool cfq_should_wait_busy(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> >> ?{
> >> ? ? ? struct cfq_io_context *cic = cfqd->active_cic;
> >>
> >> + ? ? /* If the queue already has requests, don't wait */
> >> + ? ? if (!RB_EMPTY_ROOT(&cfqq->sort_list))
> >> + ? ? ? ? ? ? return false;
> >> +
> >> ? ? ? /* If there are other queues in the group, don't wait */
> >> ? ? ? if (cfqq->cfqg->nr_cfqq > 1)
> >> ? ? ? ? ? ? ? return false;
> >> --
> >> 1.7.3.1
> >
Justin TerAvest wrote:
> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
> the last queue of the group to become busy (have at least one request),
> so that the group does not lose out for not being continuously
> backlogged. The commit did not check for the condition that the last
> queue already has some requests. As a result, if the queue already has
> requests, wait_busy is set. Later on, cfq_select_queue() checks the
> flag, and decides that since the queue has a request now and wait_busy
> is set, the queue is expired. This results in early expiration of the
> queue.
>
> This patch fixes the problem by adding a check to see if queue already
> has requests. If it does, wait_busy is not set. As a result, time slices
> do not expire early.
>
> The queues with more than one request are usually buffered writers.
> Testing shows improvement in isolation between buffered writers.
>
> Signed-off-by: Justin TerAvest <[email protected]>
This patch looks good to me.
Reviewed-by: Gui Jianfeng <[email protected]>
Thanks,
Gui
On 2011-02-08 20:18, Justin TerAvest wrote:
> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
> the last queue of the group to become busy (have at least one request),
> so that the group does not lose out for not being continuously
> backlogged. The commit did not check for the condition that the last
> queue already has some requests. As a result, if the queue already has
> requests, wait_busy is set. Later on, cfq_select_queue() checks the
> flag, and decides that since the queue has a request now and wait_busy
> is set, the queue is expired. This results in early expiration of the
> queue.
>
> This patch fixes the problem by adding a check to see if queue already
> has requests. If it does, wait_busy is not set. As a result, time slices
> do not expire early.
>
> The queues with more than one request are usually buffered writers.
> Testing shows improvement in isolation between buffered writers.
Thanks for catching this in due time, I'll queue it up for 2.6.38 and
mark it for stable backport.
--
Jens Axboe
On 2011-02-09 14:15, Jens Axboe wrote:
> On 2011-02-08 20:18, Justin TerAvest wrote:
>> Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
>> the last queue of the group to become busy (have at least one request),
>> so that the group does not lose out for not being continuously
>> backlogged. The commit did not check for the condition that the last
>> queue already has some requests. As a result, if the queue already has
>> requests, wait_busy is set. Later on, cfq_select_queue() checks the
>> flag, and decides that since the queue has a request now and wait_busy
>> is set, the queue is expired. This results in early expiration of the
>> queue.
>>
>> This patch fixes the problem by adding a check to see if queue already
>> has requests. If it does, wait_busy is not set. As a result, time slices
>> do not expire early.
>>
>> The queues with more than one request are usually buffered writers.
>> Testing shows improvement in isolation between buffered writers.
>
> Thanks for catching this in due time, I'll queue it up for 2.6.38 and
BTW, this was not meant as a snide remark, I initially thought the bad
commit was from dec 2010 when it is from 2009 :-)
--
Jens Axboe