2001-04-19 12:04:05

by Takanori Kawano

Subject: Kernel panics on raw I/O stress test


When I ran a raw I/O SCSI read/write test with the 2.4.1 kernel
on our IA64 8-way SMP box, the kernel panicked and the following
message was displayed.

Aiee, killing interrupt handler!

No stack trace or register dump was displayed.

I then analyzed FSB traces around the panic and found that
the following functions were called before panic().


CPU0:                      CPU1:

  :                          :
  :                          :
  :                        rw_raw_dev()
  :                          :
  :                          :
  :                        brw_kiovec()
  :                          :
  :                          :
  :                        free_kiovec()
  :                          :
  :                          :
end_kio_request()
__wake_up()
ia64_do_page_fault()
do_exit()
panic()

I suppose that free_kiovec() was called on CPU1 before
end_kio_request() was called on CPU0 for the same kiobuf,
which resulted in the panic.
In the 2.4.1 source code, I think there is no guarantee
that free_kiovec() in rw_raw_dev() is called only after
end_kio_request() has finished.

I tried the following two workarounds.

(1) Wait in rw_raw_dev() while io_count is positive.

--- drivers/char/raw.c Mon Oct 2 12:35:15 2000
+++ drivers/char/raw.c.workaround Thu Apr 19 16:54:26 2001
@@ -333,6 +333,11 @@
 			break;
 	}
 
+	while(atomic_read(&iobuf->io_count)) {
+		set_task_state(current, TASK_UNINTERRUPTIBLE);
+		schedule();
+	}
+
 	free_kiovec(1, &iobuf);
 
 	if (transferred) {
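
The idea behind workaround (1) can be sketched in user space. The following is only a single-threaded model, not kernel code: `struct model_kiobuf`, `model_complete_one()` and `model_wait_then_free()` are hypothetical names, and each loop pass stands in for one schedule() call during which one completion arrives. It does not model wait queues or wakeups; it only shows that the free happens after io_count has dropped to zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct kiobuf; only the
 * in-flight I/O counter and a "freed" marker are modelled. */
struct model_kiobuf {
    atomic_int io_count;   /* buffers still in flight */
    bool freed;            /* set instead of really freeing memory */
};

/* Stand-in for one I/O completion arriving while the caller waits. */
static void model_complete_one(struct model_kiobuf *kio)
{
    atomic_fetch_sub(&kio->io_count, 1);
}

/* Same shape as workaround (1): do not free while io_count is
 * positive. Each loop pass models one schedule() call during which a
 * single completion runs. Returns the number of passes taken. */
static int model_wait_then_free(struct model_kiobuf *kio)
{
    int passes = 0;

    while (atomic_load(&kio->io_count) > 0) {
        model_complete_one(kio);   /* "schedule()": a completion runs */
        passes++;
    }
    assert(atomic_load(&kio->io_count) == 0);
    kio->freed = true;             /* free_kiovec() would go here */
    return passes;
}
```

With three buffers in flight, the model takes three passes before it marks the kiobuf as freed.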



(2) Keep buffer lock until end_kio_request() is done.

--- fs/buffer.c Tue Jan 16 05:42:32 2001
+++ fs/buffer.c.workaround Thu Apr 19 17:22:19 2001
@@ -1990,8 +1990,8 @@
 	mark_buffer_uptodate(bh, uptodate);
 
 	kiobuf = bh->b_private;
-	unlock_buffer(bh);
 	end_kio_request(kiobuf, uptodate);
+	unlock_buffer(bh);
 }



Both of them worked well for our raw I/O testing,
but I'm not sure they are right.

Does anybody have comments?

regards,

---
Takanori Kawano
Hitachi Ltd,
Internet Systems Platform Division
[email protected]


2001-04-19 12:49:42

by Alan

Subject: Re: Kernel panics on raw I/O stress test

> When I ran a raw I/O SCSI read/write test with the 2.4.1 kernel
> on our IA64 8-way SMP box, the kernel panicked and the following
> message was displayed.
>
> Aiee, killing interrupt handler!
>
> (1) Wait in rw_raw_dev() while io_count is positive.

Stephen submitted a chunk of raw I/O fixes which are in recent -ac kernels.
I don't know offhand if Linus has merged them, but 2.4.1 raw is definitely
not watertight.


2001-04-19 17:11:45

by Andrea Arcangeli

Subject: Re: Kernel panics on raw I/O stress test

On Thu, Apr 19, 2001 at 09:01:53PM +0900, Takanori Kawano wrote:
>
> When I ran a raw I/O SCSI read/write test with the 2.4.1 kernel
> on our IA64 8-way SMP box, the kernel panicked and the following
> message was displayed.

Could you try again with 2.4.4pre4 plus the below patch?

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre2/rawio-3

You should also see a quite noticeable improvement in both CPU usage and
disk I/O (how much also depends on the maximum size of an I/O request for
your hardware disk controller).

Andrea

2001-04-20 11:46:51

by Takanori Kawano

Subject: Re: Kernel panics on raw I/O stress test


> Could you try again with 2.4.4pre4 plus the below patch?
>
> ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre2/rawio-3

I suppose that 2.4.4-pre4 + rawio-3 patch still has SMP-unsafe
raw i/o code and can cause the same panic I reported.

I think the following scenario is possible if there are 3 or more CPUs.

(1) CPU0 enters rw_raw_dev()
(2) CPU0 executes alloc_kiovec(1, &iobuf) // drivers/char/raw.c line 309
(3) CPU0 enters brw_kiovec(rw, 1, &iobuf,..) // drivers/char/raw.c line 362
(4) CPU0 enters __wait_on_buffer()
(5) CPU0 executes run_task_queue() and waits
    while buffer_locked(bh) is true. // fs/buffer.c line 152-158
(6) CPU1 enters end_buffer_io_kiobuf() with
    the iobuf allocated at (2)
(7) CPU1 executes unlock_buffer() // fs/buffer.c line 1994
(8) CPU0 exits __wait_on_buffer()
(9) CPU0 exits brw_kiovec(rw, 1, &iobuf,..)
(10) CPU0 executes free_kiovec(1, &iobuf) // drivers/char/raw.c line 388
(11) The task on CPU2 reuses the area freed
     at (10).
(12) CPU1 enters end_kio_request() and touches
     the corrupted iobuf, then panics.
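
The interleaving above can be replayed deterministically in a small user-space model. This is only a sketch under stated assumptions: `struct model_kiobuf`, `struct model_bh` and the `cpu0_`/`cpu1_` helpers are hypothetical stand-ins, not kernel structures or functions. The free is recorded in a flag instead of really releasing memory, and the reuse by CPU2 in step (11) is not modelled. The `unlock_first` parameter selects the order of the two calls in the completion handler, so the same replay also shows why calling end_kio_request() before unlock_buffer() closes this particular window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_kiobuf {
    bool freed;                  /* true once "free_kiovec()" has run */
};

struct model_bh {
    bool locked;                 /* buffer lock bit */
    struct model_kiobuf *kio;    /* plays the role of bh->b_private */
};

/* CPU0: leaves the wait as soon as the buffer is unlocked, then frees
 * the kiobuf (steps 8-10). Returns true if it freed. */
static bool cpu0_try_finish(struct model_bh *bh)
{
    if (bh->locked)
        return false;            /* still waiting on the buffer */
    bh->kio->freed = true;
    return true;
}

/* CPU1: "end_kio_request()" touches the kiobuf. Returns true if the
 * kiobuf had already been freed, i.e. a use-after-free. */
static bool cpu1_end_kio_request(const struct model_bh *bh)
{
    return bh->kio->freed;
}

/* Replays the completion path, giving CPU0 a chance to run between
 * the CPU1 steps. Returns true if a use-after-free occurred. */
static bool replay(bool unlock_first)
{
    struct model_kiobuf kio = { .freed = false };
    struct model_bh bh = { .locked = true, .kio = &kio };
    bool use_after_free;

    if (unlock_first) {
        bh.locked = false;                           /* step 7 */
        cpu0_try_finish(&bh);                        /* steps 8-10 */
        use_after_free = cpu1_end_kio_request(&bh);  /* step 12 */
    } else {
        cpu0_try_finish(&bh);                        /* CPU0 still blocked */
        use_after_free = cpu1_end_kio_request(&bh);  /* kiobuf still valid */
        bh.locked = false;
        cpu0_try_finish(&bh);                        /* now safe to free */
    }
    return use_after_free;
}
```

In this model, `replay(true)` reports the use-after-free and `replay(false)` does not.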

---
Takanori Kawano
Hitachi Ltd,
Internet Systems Platform Division
[email protected]



2001-04-20 13:50:25

by Andrea Arcangeli

Subject: Re: Kernel panics on raw I/O stress test

On Fri, Apr 20, 2001 at 08:44:35PM +0900, Takanori Kawano wrote:
>
> > Could you try again with 2.4.4pre4 plus the below patch?
> >
> > ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre2/rawio-3
>
> I suppose that 2.4.4-pre4 + rawio-3 patch still has SMP-unsafe
> raw i/o code and can cause the same panic I reported.

I fixed that as well last week, so 2.4.4-pre4 + rawio-3 should be SMP safe
and faster, and my patch is recommended for integration.

> I think the following scenario is possible if there are 3 or more CPUs.
>
> (1) CPU0 enters rw_raw_dev()
> (2) CPU0 executes alloc_kiovec(1, &iobuf) // drivers/char/raw.c line 309
> (3) CPU0 enters brw_kiovec(rw, 1, &iobuf,..) // drivers/char/raw.c line 362
> (4) CPU0 enters __wait_on_buffer()

With my patch applied, the kernel doesn't execute wait_on_buffer from wait_kio
here; it first executes kiobuf_wait_for_io. That is also a performance
optimization, because kiobuf_wait_for_io sleeps only once and gets only
one wakeup, once the whole kiobuf I/O has completed.
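
The "one wakeup for the whole kiobuf" behaviour can be illustrated with a counting model. This is a hedged user-space sketch, not the rawio-3 code: `struct model_kiobuf`, `model_end_kio_request()` and `model_wakeups_for()` are hypothetical names, and the wakeup is just a counter. It assumes the waiter is woken only by the completion that drops the in-flight count to zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of a kiobuf covering several buffers. */
struct model_kiobuf {
    atomic_int io_count;   /* buffers still in flight */
    int wakeups;           /* wakeups delivered to the sleeping task */
};

/* Completion of one buffer: only the completion that drops the
 * counter to zero wakes the waiter. */
static void model_end_kio_request(struct model_kiobuf *kio)
{
    /* atomic_fetch_sub returns the value held before the subtraction */
    if (atomic_fetch_sub(&kio->io_count, 1) == 1)
        kio->wakeups++;
}

/* Submits nr_buffers, completes them all, and reports how many times
 * a task sleeping on the whole kiobuf would have been woken. */
static int model_wakeups_for(int nr_buffers)
{
    struct model_kiobuf kio;
    int i;

    atomic_init(&kio.io_count, nr_buffers);
    kio.wakeups = 0;
    for (i = 0; i < nr_buffers; i++)
        model_end_kio_request(&kio);
    return kio.wakeups;
}
```

For eight buffers the model delivers a single wakeup, whereas waiting on each buffer head in turn could sleep and wake once per buffer.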

> (5) CPU0 executes run_task_queue() and waits
>     while buffer_locked(bh) is true. // fs/buffer.c line 152-158
> (6) CPU1 enters end_buffer_io_kiobuf() with
>     the iobuf allocated at (2)
> (7) CPU1 executes unlock_buffer() // fs/buffer.c line 1994
> (8) CPU0 exits __wait_on_buffer()
> (9) CPU0 exits brw_kiovec(rw, 1, &iobuf,..)
> (10) CPU0 executes free_kiovec(1, &iobuf) // drivers/char/raw.c line 388
> (11) The task on CPU2 reuses the area freed
>      at (10).
> (12) CPU1 enters end_kio_request() and touches
>      the corrupted iobuf, then panics.

With my patch applied, end_kio_request on CPU1 is executed before CPU0 can
execute wait_kio, so it cannot race in the above way.

Thanks for your comments (and yes, you are right that the above race can happen
in all 2.4 kernels out there except the latest -aa ones).

Andrea