2007-11-06 19:00:08

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH] cfq: fix IOPRIO_CLASS_IDLE delays

After the fresh boot:

ionice -c3 -p $$
echo cfq >> /sys/block/XXX/queue/scheduler
dd if=/dev/XXX of=/dev/null bs=512 count=1

Now dd hangs in D state and the queue is completely stalled for approximately
INITIAL_JIFFIES + CFQ_IDLE_GRACE jiffies. This is because cfq_init_queue()
forgets to initialize cfq_data->last_end_request.

(I guess this patch is not complete, overflow is still possible)

Signed-off-by: Oleg Nesterov <[email protected]>

--- cfq/block/cfq-iosched.c~ 2007-08-09 19:59:26.000000000 +0400
+++ cfq/block/cfq-iosched.c 2007-11-06 22:39:08.000000000 +0300
@@ -2122,6 +2122,7 @@ static void *cfq_init_queue(struct reque

INIT_WORK(&cfqd->unplug_work, cfq_kick_queue);

+ cfqd->last_end_request = jiffies;
cfqd->cfq_quantum = cfq_quantum;
cfqd->cfq_fifo_expire[0] = cfq_fifo_expire[0];
cfqd->cfq_fifo_expire[1] = cfq_fifo_expire[1];


2007-11-06 19:26:24

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] cfq: fix IOPRIO_CLASS_IDLE delays

On Tue, Nov 06 2007, Oleg Nesterov wrote:
> After the fresh boot:
>
> ionice -c3 -p $$
> echo cfq >> /sys/block/XXX/queue/scheduler
> dd if=/dev/XXX of=/dev/null bs=512 count=1
>
> Now dd hangs in D state and the queue is completely stalled for approximately
> INITIAL_JIFFIES + CFQ_IDLE_GRACE jiffies. This is because cfq_init_queue()
> forgets to initialize cfq_data->last_end_request.

Another good catch, thanks Oleg!

> (I guess this patch is not complete, overflow is still possible)

Hmm, where would it overflow?

--
Jens Axboe

2007-11-07 12:43:27

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] cfq: fix IOPRIO_CLASS_IDLE delays

On 11/06, Jens Axboe wrote:
>
> On Tue, Nov 06 2007, Oleg Nesterov wrote:
> >
> > (I guess this patch is not complete, overflow is still possible)
>
> Hmm, where would it overflow?

What if the queue was idle long enough (no !CLASS_IDLE requests) so that
jiffies wraps into the past wrt ->last_end_request?

IOW, perhaps something like the patch below makes some sense. Of course,
this is only a theoretical problem, I'm not sure we really need a fix.

Oleg.


[PATCH] cfq_idle_class_timer: add paranoid checks for jiffies overflow

In theory, if the queue was idle long enough, cfq_idle_class_timer may have
a false (and very long) timeout because jiffies can wrap into the past wrt
->last_end_request.

Signed-off-by: Oleg Nesterov <[email protected]>

--- cfq/block/cfq-iosched.c~ 2007-11-07 15:04:44.000000000 +0300
+++ cfq/block/cfq-iosched.c 2007-11-07 16:04:08.000000000 +0300
@@ -789,6 +789,20 @@ static inline void cfq_slice_expired(str
__cfq_slice_expired(cfqd, cfqq, timed_out);
}

+static int start_idle_class_timer(struct cfq_data *cfqd)
+{
+ unsigned long end = cfqd->last_end_request + CFQ_IDLE_GRACE;
+ unsigned long now = jiffies;
+
+ if (time_before(now, end) &&
+ time_after_eq(now, cfqd->last_end_request)) {
+ mod_timer(&cfqd->idle_class_timer, end);
+ return 1;
+ }
+
+ return 0;
+}
+
/*
* Get next queue for service. Unless we have a queue preemption,
* we'll simply select the first cfqq in the service tree.
@@ -805,19 +819,14 @@ static struct cfq_queue *cfq_get_next_qu
cfqq = rb_entry(n, struct cfq_queue, rb_node);

if (cfq_class_idle(cfqq)) {
- unsigned long end;
-
/*
* if we have idle queues and no rt or be queues had
* pending requests, either allow immediate service if
* the grace period has passed or arm the idle grace
* timer
*/
- end = cfqd->last_end_request + CFQ_IDLE_GRACE;
- if (time_before(jiffies, end)) {
- mod_timer(&cfqd->idle_class_timer, end);
+ if (start_idle_class_timer(cfqd))
cfqq = NULL;
- }
}

return cfqq;
@@ -2033,17 +2042,14 @@ out_cont:
static void cfq_idle_class_timer(unsigned long data)
{
struct cfq_data *cfqd = (struct cfq_data *) data;
- unsigned long flags, end;
+ unsigned long flags;

spin_lock_irqsave(cfqd->queue->queue_lock, flags);

/*
* race with a non-idle queue, reset timer
*/
- end = cfqd->last_end_request + CFQ_IDLE_GRACE;
- if (!time_after_eq(jiffies, end))
- mod_timer(&cfqd->idle_class_timer, end);
- else
+ if (!start_idle_class_timer(cfqd))
cfq_schedule_dispatch(cfqd);

spin_unlock_irqrestore(cfqd->queue->queue_lock, flags);

2007-11-07 12:51:55

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] cfq: fix IOPRIO_CLASS_IDLE delays

On Wed, Nov 07 2007, Oleg Nesterov wrote:
> On 11/06, Jens Axboe wrote:
> >
> > On Tue, Nov 06 2007, Oleg Nesterov wrote:
> > >
> > > (I guess this patch is not complete, overflow is still possible)
> >
> > Hmm, where would it overflow?
>
> What if the queue was idle long enough (no !CLASS_IDLE requests) so that
> jiffies wraps into the past wrt ->last_end_request?
>
> IOW, perhaps something like the patch below makes some sense. Of course,
> this is only a theoretical problem, I'm not sure we really need a fix.

OK I see what you mean, probably pretty much a theoretical problem only.
But that doesn't mean it should be fixed, I've applied your patch. It
also collects the two paths enabling the idle class timer, so I consider
it a cleanup as well as it improves readability.

--
Jens Axboe