2007-12-03 15:34:11

by Miklos Szeredi

[permalink] [raw]
Subject: [2.6.24 BUG] 100% iowait on host while UML is running

On 2.6.24, top started showing 100% iowait on one CPU when a UML
instance was running (but completely idle). I've traced it to this
commit. Reverting it cures the problem.

Miklos



commit 41d10da3717409de33d5441f2f6d8f072ab3fbb6
Author: Jeff Moyer <[email protected]>
Date: Tue Oct 16 23:27:20 2007 -0700

aio: account I/O wait time properly

Some months back I proposed changing the schedule() call in
read_events to an io_schedule():
http://osdir.com/ml/linux.kernel.aio.general/2006-10/msg00024.html
This was rejected as there are AIO operations that do not initiate
disk I/O. I've had another look at the problem, and the only AIO
operation that will not initiate disk I/O is IOCB_CMD_NOOP. However,
this command isn't even wired up!

Given that it doesn't work, and hasn't for *years*, I'm going to
suggest again that we do proper I/O accounting when using AIO.

Signed-off-by: Jeff Moyer <[email protected]>
Acked-by: Zach Brown <[email protected]>
Cc: Benjamin LaHaise <[email protected]>
Cc: Suparna Bhattacharya <[email protected]>
Cc: Badari Pulavarty <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

diff --git a/fs/aio.c b/fs/aio.c
index ea2e198..d02f43b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -303,7 +303,7 @@ static void wait_for_all_aios(struct kioctx *ctx)
set_task_state(tsk, TASK_UNINTERRUPTIBLE);
while (ctx->reqs_active) {
spin_unlock_irq(&ctx->ctx_lock);
- schedule();
+ io_schedule();
set_task_state(tsk, TASK_UNINTERRUPTIBLE);
spin_lock_irq(&ctx->ctx_lock);
}
@@ -323,7 +323,7 @@ ssize_t fastcall wait_on_sync_kiocb(struct kiocb *iocb)
set_current_state(TASK_UNINTERRUPTIBLE);
if (!iocb->ki_users)
break;
- schedule();
+ io_schedule();
}
__set_current_state(TASK_RUNNING);
return iocb->ki_user_data;
@@ -1170,7 +1170,7 @@ retry:
ret = 0;
if (to.timed_out) /* Only check after read evt */
break;
- schedule();
+ io_schedule();
if (signal_pending(tsk)) {
ret = -EINTR;
break;


2007-12-03 16:56:33

by Jeff Moyer

[permalink] [raw]
Subject: Re: [2.6.24 BUG] 100% iowait on host while UML is running

Miklos Szeredi <[email protected]> writes:

> On 2.6.24, top started showing 100% iowait on one CPU when a UML
> instance was running (but completely idle). I've traced it to this
> commit [1]. Reverting it cures the problem.

Hi,

The UML code sits in io_getevents waiting for an event to be submitted
and completed. This is a case I completely overlooked when putting
together the referenced patch.

We could check ctx->reqs_active before scheduling to determine whether
or not we are waiting for I/O, but this would require taking the
context lock in order to be accurate. Given that the test would be
only for the sake of book keeping, it might be okay to do it outside
of the lock.

Zach, what are your thoughts on this?

-Jeff


[1] commit 41d10da3717409de33d5441f2f6d8f072ab3fbb6

2007-12-03 18:53:43

by Zach Brown

[permalink] [raw]
Subject: Re: [2.6.24 BUG] 100% iowait on host while UML is running


> We could check ctx->reqs_active before scheduling to determine whether
> or not we are waiting for I/O, but this would require taking the
> context lock in order to be accurate. Given that the test would be
> only for the sake of book keeping, it might be okay to do it outside
> of the lock.
>
> Zach, what are your thoughts on this?

I agree that it'd be OK to test it outside the lock, though we'll want
some commentary:

/* Try to only show up in io wait if there are ops in flight */
if (ctx->reqs_active)
io_schedule();
else
schedule();

It's cheap, safe, and accurate the overwhelming majority of the time :).

We only need it in read_events(). The other two io_schedule() calls are
only reached to wait on pending reqs specifically.

It still won't make sense for iocbs which aren't performing IO, but I
guess that's one more bridge to cross when we come to it.

Do you want to throw this tiny patch together and submit it?

- z

2007-12-03 19:34:27

by Jeff Moyer

[permalink] [raw]
Subject: Re: [2.6.24 BUG] 100% iowait on host while UML is running

Zach Brown <[email protected]> writes:

>> We could check ctx->reqs_active before scheduling to determine whether
>> or not we are waiting for I/O, but this would require taking the
>> context lock in order to be accurate. Given that the test would be
>> only for the sake of book keeping, it might be okay to do it outside
>> of the lock.
>>
>> Zach, what are your thoughts on this?
>
> I agree that it'd be OK to test it outside the lock, though we'll want
> some commentary:
>
> /* Try to only show up in io wait if there are ops in flight */
> if (ctx->reqs_active)
> io_schedule();
> else
> schedule();
>
> It's cheap, safe, and accurate the overwhelming majority of the time :).
>
> We only need it in read_events(). The other two io_schedule() calls are
> only reached to wait on pending reqs specifically.
>
> It still won't make sense for iocbs which aren't performing IO, but I
> guess that's one more bridge to cross when we come to it.
>
> Do you want to throw this tiny patch together and submit it?

Sure. I tested this on a system that I used to reproduce the problem,
and it shows I/O Wait back at normal levels on an idle system with 1
uml guest running.

Andrew, do you need a separate email with a [patch] heading or will
this do?

Cheers,

Jeff

Only account I/O wait time in read_events if there are active
requests.

Signed-off-by: Jeff Moyer <[email protected]>

diff --git a/fs/aio.c b/fs/aio.c
index f12db41..9dec7d2 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1161,7 +1161,12 @@ retry:
ret = 0;
if (to.timed_out) /* Only check after read evt */
break;
- io_schedule();
+ /* Try to only show up in io wait if there are ops
+ * in flight */
+ if (ctx->reqs_active)
+ io_schedule();
+ else
+ schedule();
if (signal_pending(tsk)) {
ret = -EINTR;
break;