LinuxLists.cc - [PATCH 3/4] cfq-iosched: idling on deep seeky sync queues

2009-11-24 13:49:28

Subject: [PATCH 3/4] cfq-iosched: idling on deep seeky sync queues

Seeky sync queues with large depth can gain an unfairly big share of disk
time, at the expense of other seeky queues. This patch ensures that
idling is enabled for queues with an I/O depth of at least 4 and a small
think time. The decision to enable idling is sticky until an idle
window times out without seeing a new request.

The reasoning behind the decision is that an application using a
large I/O depth is already optimized to make full use of the
hardware, and therefore we reserve a slice of exclusive use for it.

Reported-by: Vivek Goyal <[email protected]>
Signed-off-by: Corrado Zoccolo <[email protected]>
---
block/cfq-iosched.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 2a304f4..373e80f 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -260,6 +260,7 @@ enum cfqq_state_flags {
 	CFQ_CFQQ_FLAG_slice_new, /* no requests dispatched in slice */
 	CFQ_CFQQ_FLAG_sync, /* synchronous queue */
 	CFQ_CFQQ_FLAG_coop, /* cfqq is shared */
+	CFQ_CFQQ_FLAG_deep, /* sync cfqq experienced large depth */
 };

 #define CFQ_CFQQ_FNS(name) \
@@ -286,6 +287,7 @@ CFQ_CFQQ_FNS(prio_changed);
 CFQ_CFQQ_FNS(slice_new);
 CFQ_CFQQ_FNS(sync);
 CFQ_CFQQ_FNS(coop);
+CFQ_CFQQ_FNS(deep);
 #undef CFQ_CFQQ_FNS

 #define cfq_log_cfqq(cfqd, cfqq, fmt, args...) \
@@ -2359,8 +2361,12 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,

 	enable_idle = old_idle = cfq_cfqq_idle_window(cfqq);

+	if (cfqq->queued[0] + cfqq->queued[1] >= 4)
+		cfq_mark_cfqq_deep(cfqq);
+
 	if (!atomic_read(&cic->ioc->nr_tasks) || !cfqd->cfq_slice_idle ||
-	    (sample_valid(cfqq->seek_samples) && CFQQ_SEEKY(cfqq)))
+	    (!cfq_cfqq_deep(cfqq) && sample_valid(cfqq->seek_samples)
+	     && CFQQ_SEEKY(cfqq)))
 		enable_idle = 0;
 	else if (sample_valid(cic->ttime_samples)) {
 		if (cic->ttime_mean > cfqd->cfq_slice_idle)
@@ -2858,6 +2864,11 @@ static void cfq_idle_slice_timer(unsigned long data)
 		 */
 		if (!RB_EMPTY_ROOT(&cfqq->sort_list))
 			goto out_kick;
+
+		/*
+		 * Queue depth flag is reset only when the idle didn't succeed
+		 */
+		cfq_clear_cfqq_deep(cfqq);
 	}
 expire:
 	cfq_slice_expired(cfqd, timed_out);
--
1.6.2.5
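
The behaviour introduced by this patch can be approximated in user space as the sketch below. It is only a model, not kernel code: struct model_queue, model_update_idle_window() and model_idle_timer_expired() are hypothetical names, the nr_tasks check is left out, and the think-time branch that sets enable_idle to 1 is assumed from the surrounding function rather than shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define DEEP_THRESHOLD 4	/* depth threshold used by the patch */

/* Hypothetical stand-in for the parts of struct cfq_queue the patch reads. */
struct model_queue {
	int queued[2];		/* pending requests, as in cfqq->queued[] */
	bool deep;		/* models CFQ_CFQQ_FLAG_deep */
	bool seeky;		/* models CFQQ_SEEKY() with valid seek samples */
	bool idle_window;	/* models the idle_window flag */
};

/*
 * Models the patched decision in cfq_update_idle_window().
 * slice_idle, ttime_valid and ttime_mean stand in for cfqd->cfq_slice_idle,
 * sample_valid(cic->ttime_samples) and cic->ttime_mean.
 */
void model_update_idle_window(struct model_queue *q, unsigned int slice_idle,
			      bool ttime_valid, unsigned int ttime_mean)
{
	bool enable_idle = q->idle_window;

	assert(q->queued[0] >= 0 && q->queued[1] >= 0);

	/* The flag is only ever set here; it is never cleared on low depth. */
	if (q->queued[0] + q->queued[1] >= DEEP_THRESHOLD)
		q->deep = true;

	/* A seeky queue loses idling unless it has been marked deep. */
	if (!slice_idle || (!q->deep && q->seeky))
		enable_idle = false;
	else if (ttime_valid)
		enable_idle = ttime_mean <= slice_idle;	/* assumed branch */

	q->idle_window = enable_idle;
}

/* Models the timer path: the idle window ran out with nothing queued. */
void model_idle_timer_expired(struct model_queue *q)
{
	if (q->queued[0] + q->queued[1] > 0)
		return;		/* requests pending: the queue is kicked instead */
	q->deep = false;	/* the idle didn't succeed, so drop the flag */
}
```

In this model a seeky queue that once reached depth 4 keeps its idle window at depth 1, and only loses the flag after an idle window expires with an empty queue.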

2009-11-24 14:33:40

by Vivek Goyal

Subject: Re: [PATCH 3/4] cfq-iosched: idling on deep seeky sync queues

On Tue, Nov 24, 2009 at 02:49:20PM +0100, Corrado Zoccolo wrote:
> Seeky sync queues with large depth can gain an unfairly big share of disk
> time, at the expense of other seeky queues. This patch ensures that
> idling is enabled for queues with an I/O depth of at least 4 and a small
> think time. The decision to enable idling is sticky until an idle
> window times out without seeing a new request.
>
> The reasoning behind the decision is that an application using a
> large I/O depth is already optimized to make full use of the
> hardware, and therefore we reserve a slice of exclusive use for it.
>
> Reported-by: Vivek Goyal <[email protected]>
> Signed-off-by: Corrado Zoccolo <[email protected]>
> ---
> block/cfq-iosched.c | 13 ++++++++++++-
> 1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 2a304f4..373e80f 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -260,6 +260,7 @@ enum cfqq_state_flags {
>  	CFQ_CFQQ_FLAG_slice_new, /* no requests dispatched in slice */
>  	CFQ_CFQQ_FLAG_sync, /* synchronous queue */
>  	CFQ_CFQQ_FLAG_coop, /* cfqq is shared */
> +	CFQ_CFQQ_FLAG_deep, /* sync cfqq experienced large depth */
>  };
>
>  #define CFQ_CFQQ_FNS(name) \
> @@ -286,6 +287,7 @@ CFQ_CFQQ_FNS(prio_changed);
>  CFQ_CFQQ_FNS(slice_new);
>  CFQ_CFQQ_FNS(sync);
>  CFQ_CFQQ_FNS(coop);
> +CFQ_CFQQ_FNS(deep);
>  #undef CFQ_CFQQ_FNS
>
>  #define cfq_log_cfqq(cfqd, cfqq, fmt, args...) \
> @@ -2359,8 +2361,12 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
>
>  	enable_idle = old_idle = cfq_cfqq_idle_window(cfqq);
>
> +	if (cfqq->queued[0] + cfqq->queued[1] >= 4)
> +		cfq_mark_cfqq_deep(cfqq);
> +
>  	if (!atomic_read(&cic->ioc->nr_tasks) || !cfqd->cfq_slice_idle ||
> -	    (sample_valid(cfqq->seek_samples) && CFQQ_SEEKY(cfqq)))
> +	    (!cfq_cfqq_deep(cfqq) && sample_valid(cfqq->seek_samples)
> +	     && CFQQ_SEEKY(cfqq)))
>  		enable_idle = 0;
>  	else if (sample_valid(cic->ttime_samples)) {
>  		if (cic->ttime_mean > cfqd->cfq_slice_idle)
> @@ -2858,6 +2864,11 @@ static void cfq_idle_slice_timer(unsigned long data)
>  		 */
>  		if (!RB_EMPTY_ROOT(&cfqq->sort_list))
>  			goto out_kick;
> +
> +		/*
> +		 * Queue depth flag is reset only when the idle didn't succeed
> +		 */
> +		cfq_clear_cfqq_deep(cfqq);
>  	}

Hi Corrado,

Thinking more about it: clearing the flag only when the idle timer expires
might create issues with queues that send down an initial burst of
requests, which sets the "deep" flag, and then fall back to low depth. In
that case, enable_idle will continue to be 1 and we will be driving a queue
depth of 1.

This is a theoretical concern from reading the patch. I don't know whether
real-life workloads do this frequently. At least in my testing, this patch
did make sure we don't switch a queue's workload type very frequently.

Maybe keeping track of the average queue depth of a seeky process, as we do
for think time, might help here. If the average queue depth is low over a
period of time, we move the queue to the sync-noidle group to achieve
better overall throughput, and if the average queue depth is high, we make
it sync-idle.

Currently we seem to take queue depth into account only when setting the
flag. We don't want to switch the "deep" flag too frequently, so some kind
of slow-moving average might help.

Thanks
Vivek
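
One possible shape for the slow-moving average suggested above is sketched here as a fixed-point decayed mean. Everything in it is hypothetical: struct depth_stats and the three functions are not part of the patch, and the 7/8 decay, the 256 scaling and the validity cutoff of 80 are illustrative choices in the spirit of the think-time statistics.

```c
#include <assert.h>

/* Hypothetical per-queue depth statistics; not part of the patch. */
struct depth_stats {
	unsigned long samples;	/* decayed sample weight, scaled by 256 */
	unsigned long total;	/* decayed sum of depths, scaled by 256 */
	unsigned long mean;	/* rounded mean depth */
};

/* Fold one depth observation into the average; old data decays by 7/8. */
void depth_stats_update(struct depth_stats *s, unsigned long depth)
{
	s->samples = (7 * s->samples + 256) / 8;
	s->total = (7 * s->total + 256 * depth) / 8;
	assert(s->samples > 0);	/* holds after the first update */
	s->mean = (s->total + 128) / s->samples;
}

/* Nonzero once enough weight has accumulated to trust the mean. */
int depth_stats_valid(const struct depth_stats *s)
{
	return s->samples > 80;
}

/* A queue counts as deep only when a trusted mean reaches the threshold. */
int depth_is_deep(const struct depth_stats *s, unsigned long threshold)
{
	return depth_stats_valid(s) && s->mean >= threshold;
}
```

With this kind of average a single shallow sample does not flip a queue that has been deep for a while, but a sustained run of shallow samples eventually does.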

2009-11-24 15:24:20

by Corrado Zoccolo

Subject: Re: [PATCH 3/4] cfq-iosched: idling on deep seeky sync queues

Hi Vivek,
On Tue, Nov 24, 2009 at 3:33 PM, Vivek Goyal <[email protected]> wrote:
> Hi Corrado,
>
> Thinking more about it: clearing the flag only when the idle timer expires
> might create issues with queues that send down an initial burst of
> requests, which sets the "deep" flag, and then fall back to low depth. In
> that case, enable_idle will continue to be 1 and we will be driving a queue
> depth of 1.
>
> This is a theoretical concern from reading the patch. I don't know whether
> real-life workloads do this frequently. At least in my testing, this patch
> did make sure we don't switch a queue's workload type very frequently.
>
I thought about this scenario when developing the patch, but considered
it too infrequent (and not so costly) to justify the added complexity
of a moving average.

For me, wasting an idle window is something to be punished, while
driving the queue at a lower depth is not, as long as the requests
arrive in a timely manner.

> Maybe keeping track of the average queue depth of a seeky process, as we do
> for think time, might help here. If the average queue depth is low over a
> period of time, we move the queue to the sync-noidle group to achieve
> better overall throughput, and if the average queue depth is high, we make
> it sync-idle.
>
> Currently we seem to take queue depth into account only when setting the
> flag. We don't want to switch the "deep" flag too frequently, so some kind
> of slow-moving average might help.
>
Averages can still change in the middle of a slice.
A simpler way could be to reset the deep flag after a full slice, if
the depth never reached the threshold during that slice.
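
A rough sketch of that simpler alternative follows. It is only an illustration: struct slice_model, slice_max_depth, slice_note_depth() and slice_finished() are hypothetical names, and where exactly the end of a "full slice" would be detected in CFQ is left open.

```c
#include <assert.h>
#include <stdbool.h>

#define DEEP_THRESHOLD 4	/* same threshold as the patch */

/* Hypothetical per-queue state; slice_max_depth is not in the patch. */
struct slice_model {
	bool deep;		/* models CFQ_CFQQ_FLAG_deep */
	int slice_max_depth;	/* deepest queue seen during the current slice */
};

/* Call wherever the depth is sampled, e.g. when a request is queued. */
void slice_note_depth(struct slice_model *q, int depth)
{
	assert(depth >= 0);
	if (depth > q->slice_max_depth)
		q->slice_max_depth = depth;
	if (depth >= DEEP_THRESHOLD)
		q->deep = true;
}

/* Call when a slice ends; used_full_slice is false if it was cut short. */
void slice_finished(struct slice_model *q, bool used_full_slice)
{
	/* Only a complete slice that stayed shallow clears the flag. */
	if (used_full_slice && q->slice_max_depth < DEEP_THRESHOLD)
		q->deep = false;
	q->slice_max_depth = 0;	/* start tracking afresh for the next slice */
}
```

Compared with a moving average this keeps a single integer per queue, and the flag can only change at slice boundaries rather than in the middle of a slice.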

> Thanks
> Vivek
>

--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:[email protected]
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------

2009-11-24 15:39:01

by Vivek Goyal

Subject: Re: [PATCH 3/4] cfq-iosched: idling on deep seeky sync queues

On Tue, Nov 24, 2009 at 04:24:23PM +0100, Corrado Zoccolo wrote:
> Hi Vivek,
> On Tue, Nov 24, 2009 at 3:33 PM, Vivek Goyal <[email protected]> wrote:
> > Hi Corrado,
> >
> > Thinking more about it: clearing the flag only when the idle timer expires
> > might create issues with queues that send down an initial burst of
> > requests, which sets the "deep" flag, and then fall back to low depth. In
> > that case, enable_idle will continue to be 1 and we will be driving a queue
> > depth of 1.
> >
> > This is a theoretical concern from reading the patch. I don't know whether
> > real-life workloads do this frequently. At least in my testing, this patch
> > did make sure we don't switch a queue's workload type very frequently.
> >
> I thought about this scenario when developing the patch, but considered
> it too infrequent (and not so costly) to justify the added complexity
> of a moving average.
>
> For me, wasting an idle window is something to be punished, while
> driving the queue at a lower depth is not, as long as the requests
> arrive in a timely manner.

Agreed that the penalty of idling without dispatching anything to the
disk/array is higher than the penalty of driving a smaller queue depth.
But in general we don't want to drive shallow queue depths unless
required.

>
> > Maybe keeping track of the average queue depth of a seeky process, as we do
> > for think time, might help here. If the average queue depth is low over a
> > period of time, we move the queue to the sync-noidle group to achieve
> > better overall throughput, and if the average queue depth is high, we make
> > it sync-idle.
> >
> > Currently we seem to take queue depth into account only when setting the
> > flag. We don't want to switch the "deep" flag too frequently, so some kind
> > of slow-moving average might help.
> >
> Averages can still change in the middle of a slice.
> A simpler way could be to reset the deep flag after a full slice, if
> the depth never reached the threshold during that slice.

That's fine. For the time being we can stick to this patch and observe
whether a significant number of cases hit this condition. If so, we can
go for your second suggestion of resetting the flag if the queue never
reached the deep threshold again during the slice.

Thanks
Vivek