2010-07-07 15:23:20

by Corrado Zoccolo

[permalink] [raw]
Subject: [PATCH 1/2] cfq-iosched: fix tree-wide handling of rq_noidle

Patch: 8e55063 cfq-iosched: fix corner cases in idling logic
introduced the possibility of iding even on no-idle requests
in the no_idle tree, if any previous request in the current slice
could idle. The implementation had a problem, though:
- if a queue sent several possibly idling requests and a noidle
request as last one in the same time slice, the tree was still
marked as idle.
This patch fixes this misbehaviour, by using a 31-bucket hash to
keep idling/non-idling status for queues in the noidle tree.

Signed-off-by: Corrado Zoccolo <[email protected]>
---
block/cfq-iosched.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index d1ad066..5ef9a5d 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -214,7 +214,7 @@ struct cfq_data {
enum wl_type_t serving_type;
unsigned long workload_expires;
struct cfq_group *serving_group;
- bool noidle_tree_requires_idle;
+ u32 noidle_tree_requires_idle;

/*
* Each priority tree is sorted by next_request position. These
@@ -2066,7 +2066,7 @@ static void choose_service_tree(struct cfq_data *cfqd, struct cfq_group *cfqg)
slice = max_t(unsigned, slice, CFQ_MIN_TT);
cfq_log(cfqd, "workload slice:%d", slice);
cfqd->workload_expires = jiffies + slice;
- cfqd->noidle_tree_requires_idle = false;
+ cfqd->noidle_tree_requires_idle = 0U;
}

static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd)
@@ -3349,7 +3349,12 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
cfq_slice_expired(cfqd, 1);
else if (sync && cfqq_empty &&
!cfq_close_cooperator(cfqd, cfqq)) {
- cfqd->noidle_tree_requires_idle |= !rq_noidle(rq);
+ u32 bitmask = 1U << (((int)cfqq) % 31);
+ if (rq_noidle(rq))
+ cfqd->noidle_tree_requires_idle &= ~bitmask;
+ else
+ cfqd->noidle_tree_requires_idle |= bitmask;
+
/*
* Idling is enabled for SYNC_WORKLOAD.
* SYNC_NOIDLE_WORKLOAD idles at the end of the tree
--
1.6.4.4


2010-07-07 15:23:28

by Corrado Zoccolo

[permalink] [raw]
Subject: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
without further checks.
RQ_NOIDLE can be used to mark the last request of a sequence for which
- we want to idle between the requests of the sequence, to keep locality
- we don't want to idle after the sequence, because we know that no new
nearby requests will follow, so we should switch servicing other
queues.
This patch fixes this behaviour, making it similar to how it behaved
before 8e55063, but still fixing the corner cases that were the
motivation for it.

Signed-off-by: Corrado Zoccolo <[email protected]>
---
block/cfq-iosched.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 5ef9a5d..cac3afb 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
cfqd->noidle_tree_requires_idle |= bitmask;

/*
- * Idling is enabled for SYNC_WORKLOAD.
- * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
- * only if we processed at least one !rq_noidle request
+ * Idling is enabled for:
+ * - the last sync queue of a group
+ * - SYNC_WORKLOAD queues, for !rq_noidle requests
+ * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
+ * if at least one queue sent !rq_noidle requests
+ * not followed by at least one rq_noidle request.
*/
- if (cfqd->serving_type == SYNC_WORKLOAD
- || cfqd->noidle_tree_requires_idle
+ if ((cfqd->serving_type == SYNC_WORKLOAD
+ && !rq_noidle(rq))
+ || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
+ && cfqd->noidle_tree_requires_idle)
|| cfqq->cfqg->nr_cfqq == 1)
cfq_arm_slice_timer(cfqd);
}
--
1.6.4.4

2010-07-07 15:46:40

by Vivek Goyal

[permalink] [raw]
Subject: Re: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
> without further checks.
> RQ_NOIDLE can be used to mark the last request of a sequence for which
> - we want to idle between the requests of the sequence, to keep locality
> - we don't want to idle after the sequence, because we know that no new
> nearby requests will follow, so we should switch servicing other
> queues.

Corrado, in higher layers any WRITE_SYNC request currently is marked
as RQ_NOIDLE. At that point it is just not known whether there will be
another request after this or not. So I would not think of RQ_NOIDLE
as being conclusively telling us that this is last request in the
sequence.

I think requst being WRITE_SYNC, we just don't know if the application
is going to write more or not immediately. fsync, O_SYNC etc fall in
this category.

But in general I like the idea of getting rid of idling on as many cases
as possiblle. Jeff's recent posting to fix fsync issue depends on idling
even on WRITE_SYNC queues so your patch and his patchsets are
fundamentally incompatible.

Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
don't know the answer to that question. :-)). But in general I want to
get rid of idling as much as possible otherwise it becomes a serious
bottleneck in any kind of performance testing on higher end storage.

At the same time not idling runs the risk of process doing WRITE_SYNC
not getting fair share in presence of sequential readers if writer does
not keep the queue busy.

I will do some testing with this patchset little later.

Thanks
Vivek

> This patch fixes this behaviour, making it similar to how it behaved
> before 8e55063, but still fixing the corner cases that were the
> motivation for it.
>
> Signed-off-by: Corrado Zoccolo <[email protected]>
> ---
> block/cfq-iosched.c | 15 ++++++++++-----
> 1 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 5ef9a5d..cac3afb 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
> cfqd->noidle_tree_requires_idle |= bitmask;
>
> /*
> - * Idling is enabled for SYNC_WORKLOAD.
> - * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
> - * only if we processed at least one !rq_noidle request
> + * Idling is enabled for:
> + * - the last sync queue of a group
> + * - SYNC_WORKLOAD queues, for !rq_noidle requests
> + * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
> + * if at least one queue sent !rq_noidle requests
> + * not followed by at least one rq_noidle request.
> */
> - if (cfqd->serving_type == SYNC_WORKLOAD
> - || cfqd->noidle_tree_requires_idle
> + if ((cfqd->serving_type == SYNC_WORKLOAD
> + && !rq_noidle(rq))
> + || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
> + && cfqd->noidle_tree_requires_idle)
> || cfqq->cfqg->nr_cfqq == 1)
> cfq_arm_slice_timer(cfqd);
> }
> --
> 1.6.4.4

2010-07-07 15:52:10

by Vivek Goyal

[permalink] [raw]
Subject: Re: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
> > RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
> > without further checks.
> > RQ_NOIDLE can be used to mark the last request of a sequence for which
> > - we want to idle between the requests of the sequence, to keep locality
> > - we don't want to idle after the sequence, because we know that no new
> > nearby requests will follow, so we should switch servicing other
> > queues.
>
> Corrado, in higher layers any WRITE_SYNC request currently is marked
> as RQ_NOIDLE. At that point it is just not known whether there will be
> another request after this or not. So I would not think of RQ_NOIDLE
> as being conclusively telling us that this is last request in the
> sequence.
>
> I think requst being WRITE_SYNC, we just don't know if the application
> is going to write more or not immediately. fsync, O_SYNC etc fall in
> this category.
>
> But in general I like the idea of getting rid of idling on as many cases
> as possiblle. Jeff's recent posting to fix fsync issue depends on idling
> even on WRITE_SYNC queues so your patch and his patchsets are
> fundamentally incompatible.
>
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.
>
> At the same time not idling runs the risk of process doing WRITE_SYNC
> not getting fair share in presence of sequential readers if writer does
> not keep the queue busy.
>
> I will do some testing with this patchset little later.

Hmm..., noticed that you are still using Jens's old mail id. Fixing it.

Thanks
Vivek
>
> > This patch fixes this behaviour, making it similar to how it behaved
> > before 8e55063, but still fixing the corner cases that were the
> > motivation for it.
> >
> > Signed-off-by: Corrado Zoccolo <[email protected]>
> > ---
> > block/cfq-iosched.c | 15 ++++++++++-----
> > 1 files changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index 5ef9a5d..cac3afb 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
> > cfqd->noidle_tree_requires_idle |= bitmask;
> >
> > /*
> > - * Idling is enabled for SYNC_WORKLOAD.
> > - * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
> > - * only if we processed at least one !rq_noidle request
> > + * Idling is enabled for:
> > + * - the last sync queue of a group
> > + * - SYNC_WORKLOAD queues, for !rq_noidle requests
> > + * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
> > + * if at least one queue sent !rq_noidle requests
> > + * not followed by at least one rq_noidle request.
> > */
> > - if (cfqd->serving_type == SYNC_WORKLOAD
> > - || cfqd->noidle_tree_requires_idle
> > + if ((cfqd->serving_type == SYNC_WORKLOAD
> > + && !rq_noidle(rq))
> > + || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
> > + && cfqd->noidle_tree_requires_idle)
> > || cfqq->cfqg->nr_cfqq == 1)
> > cfq_arm_slice_timer(cfqd);
> > }
> > --
> > 1.6.4.4

2010-07-07 16:04:20

by Corrado Zoccolo

[permalink] [raw]
Subject: Re: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

On Wed, Jul 7, 2010 at 5:46 PM, Vivek Goyal <[email protected]> wrote:
> On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
>> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
>> without further checks.
>> RQ_NOIDLE can be used to mark the last request of a sequence for which
>> - we want to idle between the requests of the sequence, to keep locality
>> - we don't want to idle after the sequence, because we know that no new
>>   nearby requests will follow, so we should switch servicing other
>>   queues.
>
> Corrado, in higher layers any WRITE_SYNC request currently is marked
> as RQ_NOIDLE. At that point it is just not known whether there will be
> another request after this or not. So I would not think of RQ_NOIDLE
> as being conclusively telling us that this is last request in the
> sequence.

Probably WRITE_SYNC are marked as RQ_NOIDLE because the application
can always send several write requests together (while for reads, you
usually need the result of one read to send the other). This means
that cfq will actually care only of the RQ_NOIDLE on the last request
in the queue (since, until the queue is empty, we don't even consider
idling).

>
> I think requst being WRITE_SYNC, we just don't know if the application
> is going to write more or not immediately. fsync, O_SYNC etc fall in
> this category.
>
> But in general I like the idea of getting rid of idling on as many cases
> as possiblle. Jeff's recent posting to fix fsync issue depends on idling
> even on WRITE_SYNC queues so your patch and his patchsets are
> fundamentally incompatible.

I think this can be easily fixed by removing RQ_NOIDLE from those
requests on which Jeff wants to idle. Once no more requests can ever
be marked RQ_NOIDLE, then we can remove this code completely.

>
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.
>
> At the same time not idling runs the risk of process doing WRITE_SYNC
> not getting fair share in presence of sequential readers if writer does
> not keep the queue busy.
>
> I will do some testing with this patchset little later.
Thanks, I've resent the patches for 2.6.36 (this version were based on 2.6.34).
Corrado
>
> Thanks
> Vivek
>
>> This patch fixes this behaviour, making it similar to how it behaved
>> before 8e55063, but still fixing the corner cases that were the
>> motivation for it.
>>
>> Signed-off-by: Corrado Zoccolo <[email protected]>
>> ---
>>  block/cfq-iosched.c |   15 ++++++++++-----
>>  1 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index 5ef9a5d..cac3afb 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>>                               cfqd->noidle_tree_requires_idle |= bitmask;
>>
>>                       /*
>> -                      * Idling is enabled for SYNC_WORKLOAD.
>> -                      * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
>> -                      * only if we processed at least one !rq_noidle request
>> +                      * Idling is enabled for:
>> +                      * - the last sync queue of a group
>> +                      * - SYNC_WORKLOAD queues, for !rq_noidle requests
>> +                      * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
>> +                      *   if at least one queue sent !rq_noidle requests
>> +                      *   not followed by at least one rq_noidle request.
>>                        */
>> -                     if (cfqd->serving_type == SYNC_WORKLOAD
>> -                         || cfqd->noidle_tree_requires_idle
>> +                     if ((cfqd->serving_type == SYNC_WORKLOAD
>> +                          && !rq_noidle(rq))
>> +                         || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
>> +                             && cfqd->noidle_tree_requires_idle)
>>                           || cfqq->cfqg->nr_cfqq == 1)
>>                               cfq_arm_slice_timer(cfqd);
>>               }
>> --
>> 1.6.4.4
>



--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:[email protected]
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda

2010-07-07 16:13:16

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.

After I've been thinking about this for a while I think the major
problems is that we use WRITE_SYNC for two very different I/O patterns.

One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
high-level synchronous workload in the sense that someone waits for the
I/O to finish, the I/O can still be batched as we're doing relatively
large amounts of bios.

The other one is synchronous writeout of metadata or the journal. Here
we typically wait on that single I/O we're just submitting (or at most a
handfull), and there is absolutely no point in idling.

We already have the REQ_NOIDLE flag to distinguish between the two, so
instead of second guessing we should actually make use of it.

2010-07-07 17:12:31

by Vivek Goyal

[permalink] [raw]
Subject: Re: [PATCH 2/2] cfq-iosched: RQ_NOIDLE enabled for SYNC_WORKLOAD

On Wed, Jul 07, 2010 at 12:13:08PM -0400, Christoph Hellwig wrote:
> On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> > Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> > don't know the answer to that question. :-)). But in general I want to
> > get rid of idling as much as possible otherwise it becomes a serious
> > bottleneck in any kind of performance testing on higher end storage.
>
> After I've been thinking about this for a while I think the major
> problems is that we use WRITE_SYNC for two very different I/O patterns.
>
> One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
> high-level synchronous workload in the sense that someone waits for the
> I/O to finish, the I/O can still be batched as we're doing relatively
> large amounts of bios.
>
> The other one is synchronous writeout of metadata or the journal.

Jeff Moyer had mentioned that in his testing journal writes from jbd
threads were appearing as asynchronous (WRITES) in CFQ and we don't do any
kind of idling in CFQ on asynchronous WRITES. So this is probably already
a non issue.

Thanks
Vivek

> Here
> we typically wait on that single I/O we're just submitting (or at most a
> handfull), and there is absolutely no point in idling.
>
> We already have the REQ_NOIDLE flag to distinguish between the two, so
> instead of second guessing we should actually make use of it.