2003-02-27 21:00:25

by Stephan von Krawczynski

Subject: OOPS in 2.4.21-pre5, ide-scsi

Hello all,

I just installed pre5 and ran my current favourite test: mounting a CD-ROM with ide-scsi. Maybe you remember my problem: I issue the mount, the CD-ROM spins up for around 20-30 seconds, then the machine freezes.

But this time it oopsed, and here it is:
(I had to write it down by hand and filled the values into another oops "form", so ignore the date/time. All the values should be correct; I checked twice.)

# ksymoops < oops
ksymoops 2.4.5 on i686 2.4.21-pre5. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.21-pre5/ (default)
-m /boot/System.map-2.4.21-pre5 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Nov 5 19:48:49 linux kernel: Oops: 0002
Nov 5 19:48:49 linux kernel: CPU: 0
Nov 5 19:48:49 linux kernel: EIP: 0010:[<c0213ab3>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Nov 5 19:48:49 linux kernel: EFLAGS: 00010202
Nov 5 19:48:49 linux kernel: eax: 00000000 ebx: 00000001 ecx: c1613d84 edx: 3e076fe3
Nov 5 19:48:49 linux kernel: esi: d93ca000 edi: c0363d80 ebp: c165cd80 esp: d98d3f2c
Nov 5 19:48:49 linux kernel: ds: 0018 es: 0018 ss: 0018
Nov 5 19:48:49 linux kernel: Process setiathome (pid: 1371, stackpage=d98d3000)
Nov 5 19:48:49 linux kernel: Stack: 00000177 51eb851f d98d2000 00100000 c01299e5 bffffa60 d98d3f50 3e076fe3
Nov 5 19:48:49 linux kernel: c1613d60 c0363d80 00000286 c0363cd0 c01dcbd6 c0363d80 00000000 c0213a50
Nov 5 19:48:49 linux kernel: c1634c80 04000001 00000000 d98d3fc4 c0109129 0000000f c1613d60 d98d3fc4
Nov 5 19:48:49 linux kernel: Call Trace: [<c01299e5>] [<c01dcbd6>] [<c0213a50>] [<c0109129>] [<c0109348>] [<c010bec8>]
Nov 5 19:48:49 linux kernel: Code: ff 42 18 89 3c 24 c7 44 24 04 01 00 00 00 e8 ae fc ff ff 31


>>EIP; c0213ab3 <idescsi_pc_intr+63/360> <=====

>>ecx; c1613d84 <_end+12a072c/20557a08>
>>edx; 3e076fe3 Before first symbol
>>esi; d93ca000 <_end+190569a8/20557a08>
>>edi; c0363d80 <ide_hwifs+520/2c60>
>>ebp; c165cd80 <_end+12e9728/20557a08>
>>esp; d98d3f2c <_end+195608d4/20557a08>

Trace; c01299e5 <getrusage+d5/230>
Trace; c01dcbd6 <ide_intr+e6/180>
Trace; c0213a50 <idescsi_pc_intr+0/360>
Trace; c0109129 <handle_IRQ_event+69/a0>
Trace; c0109348 <do_IRQ+98/f0>
Trace; c010bec8 <call_do_IRQ+5/d>

Code; c0213ab3 <idescsi_pc_intr+63/360>
00000000 <_EIP>:
Code; c0213ab3 <idescsi_pc_intr+63/360> <=====
0: ff 42 18 incl 0x18(%edx) <=====
Code; c0213ab6 <idescsi_pc_intr+66/360>
3: 89 3c 24 mov %edi,(%esp,1)
Code; c0213ab9 <idescsi_pc_intr+69/360>
6: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp,1)
Code; c0213ac0 <idescsi_pc_intr+70/360>
d: 00
Code; c0213ac1 <idescsi_pc_intr+71/360>
e: e8 ae fc ff ff call fffffcc1 <_EIP+0xfffffcc1> c0213774 <idescsi_do_end_request+a4/e0>
Code; c0213ac6 <idescsi_pc_intr+76/360>
13: 31 00 xor %eax,(%eax)

Nov 5 19:48:49 linux kernel: <0>Kernel panic: Aiee, killing interrupt handler!

1 warning issued. Results may not be reliable.


Hope this helps

Regards,
Stephan


2003-02-28 15:19:48

by Stephan von Krawczynski

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 27 Feb 2003 22:10:17 +0100
Stephan von Krawczynski <[email protected]> wrote:

> >>EIP; c0213ab3 <idescsi_pc_intr+63/360> <=====

Additional comment:

This oops is reproducible on my system. I tried again today, and again it
happened at the same EIP. If I got that right, it is this code:


if (status.b.check)
rq->errors++;
idescsi_end_request(drive, 1);
return ide_stopped;
}

Obviously rq is somehow damaged. I tried to retrieve further info by adding:

/* $$$ */
local_irq_enable();
printk("scsi: %p, pc: %p, rq: %p\n", scsi, pc, rq);
if (status.b.check)
rq->errors++;
idescsi_end_request(drive, 1);
return ide_stopped;
}

Interestingly, about 10 lines with this output appear in syslog, then it
stops for around 10-20 seconds, and _then_ it oopses. I have the feeling that
this "late" rq is already long gone when the code is entered.
I tried to patch around this problem a bit, but without success so far...
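
One detail in the register dump fits this reading: the faulting instruction is "incl 0x18(%edx)" and edx holds 3e076fe3. On i386 with the usual 3G/1G split, kernel virtual addresses start at 0xc0000000, so that value cannot point at a kernel object such as a request. A minimal user-space sketch of that check follows; the PAGE_OFFSET value assumes the default split, and the helper name is made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default i386 3G/1G split, kernel mappings start here. */
#define PAGE_OFFSET_I386 0xc0000000u

/* Hypothetical helper: true if addr lies in the kernel part of the
 * 32-bit address space, false for user-space or NULL-range values. */
static bool looks_like_kernel_pointer(uint32_t addr)
{
    return addr >= PAGE_OFFSET_I386;
}
```

Fed with the dump above, the helper rejects edx (3e076fe3) and accepts ecx (c1613d84), which suggests that rq no longer held a valid kernel address when the handler ran.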

--
Regards,
Stephan

2003-03-11 17:14:11

by Stephan von Krawczynski

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

Hello all,

again, news on this topic. Today I plugged in an additional:

01:03.0 Unknown mass storage controller: Promise Technology, Inc. 20268 (rev 01)

and connected my ATAPI CD writer to it. And _now_ everything works! ide-scsi is
just fine, and I can mount CDs again. So I conclude that the IDE driver for

00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)

is in fact broken, at least when used with ide-scsi. If anyone has patches or
the like, please send them; I can test.
I tested all available -ac kernels and none of them work. 2.4.20 does not work
either.

--
Regards,
Stephan

2003-03-13 10:42:42

by Herbert Xu

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

Stephan von Krawczynski <[email protected]> wrote:
>
> Code; c0213ab3 <idescsi_pc_intr+63/360>
> 00000000 <_EIP>:
> Code; c0213ab3 <idescsi_pc_intr+63/360> <=====
> 0: ff 42 18 incl 0x18(%edx) <=====
> Code; c0213ab6 <idescsi_pc_intr+66/360>
> 3: 89 3c 24 mov %edi,(%esp,1)
> Code; c0213ab9 <idescsi_pc_intr+69/360>
> 6: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp,1)
> Code; c0213ac0 <idescsi_pc_intr+70/360>
> d: 00
> Code; c0213ac1 <idescsi_pc_intr+71/360>
> e: e8 ae fc ff ff call fffffcc1 <_EIP+0xfffffcc1> c0213774 <idescsi_do_end_request+a4/e0>
> Code; c0213ac6 <idescsi_pc_intr+76/360>
> 13: 31 00 xor %eax,(%eax)

Does this patch fix the problem?
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
Index: drivers/scsi/ide-scsi.c
===================================================================
RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.4/drivers/scsi/ide-scsi.c,v
retrieving revision 1.1.1.8
retrieving revision 1.2
diff -u -r1.1.1.8 -r1.2
--- drivers/scsi/ide-scsi.c 28 Nov 2002 23:53:14 -0000 1.1.1.8
+++ drivers/scsi/ide-scsi.c 25 Feb 2003 08:55:10 -0000 1.2
@@ -261,7 +261,7 @@
ide_drive_t *drive = hwgroup->drive;
idescsi_scsi_t *scsi = drive->driver_data;
struct request *rq = hwgroup->rq;
- idescsi_pc_t *pc = (idescsi_pc_t *) rq->buffer;
+ idescsi_pc_t *pc = rq->special;
int log = test_bit(IDESCSI_LOG_CMD, &scsi->log);
u8 *scsi_buf;
unsigned long flags;
@@ -462,7 +462,7 @@
#endif /* IDESCSI_DEBUG_LOG */

if (rq->cmd == IDESCSI_PC_RQ) {
- return idescsi_issue_pc (drive, (idescsi_pc_t *) rq->buffer);
+ return idescsi_issue_pc (drive, rq->special);
}
printk (KERN_ERR "ide-scsi: %s: unsupported command in request queue (%x)\n", drive->name, rq->cmd);
idescsi_end_request (0,HWGROUP (drive));
@@ -836,7 +836,7 @@
}

ide_init_drive_cmd (rq);
- rq->buffer = (char *) pc;
+ rq->special = pc;
rq->bh = idescsi_dma_bh (drive, pc);
rq->cmd = IDESCSI_PC_RQ;
spin_unlock_irq(&io_request_lock);

2003-03-13 11:05:18

by Willy Gardiol

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


I am getting this same oops with 2.4.20!

I will try your patch...

At 10:47 on Thursday 13 March 2003, Herbert Xu wrote:
> Stephan von Krawczynski <[email protected]> wrote:
> > Code; c0213ab3 <idescsi_pc_intr+63/360>
> > 00000000 <_EIP>:
> > Code; c0213ab3 <idescsi_pc_intr+63/360> <=====
> > 0: ff 42 18 incl 0x18(%edx) <=====
> > Code; c0213ab6 <idescsi_pc_intr+66/360>
> > 3: 89 3c 24 mov %edi,(%esp,1)
> > Code; c0213ab9 <idescsi_pc_intr+69/360>
> > 6: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp,1)
> > Code; c0213ac0 <idescsi_pc_intr+70/360>
> > d: 00
> > Code; c0213ac1 <idescsi_pc_intr+71/360>
> > e: e8 ae fc ff ff call fffffcc1 <_EIP+0xfffffcc1> c0213774 <idescsi_do_end_request+a4/e0>
> > Code; c0213ac6 <idescsi_pc_intr+76/360>
> > 13: 31 00 xor %eax,(%eax)
>
> Does this patch fix the problem?

- --

!
Willy Gardiol - [email protected]
goemon.polito.it/~gardiol
Use linux for your freedom.

"War will never put an end to
any war; at best it will have
been one more war."

Gino Strada, Buskashì

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+EXwLQ9qolN/zUk4RAuBXAKCZMpMDJ4O/NUGMSEkg8/gjKIsyEACgsZ0+
ps7JTUlVNud4O8F00jDp4b8=
=pImM
-----END PGP SIGNATURE-----

2003-03-13 15:13:30

by Stephan von Krawczynski

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 13 Mar 2003 20:47:37 +1100
Herbert Xu <[email protected]> wrote:

> Stephan von Krawczynski <[email protected]> wrote:
> >
> > Code; c0213ab3 <idescsi_pc_intr+63/360>
> > 00000000 <_EIP>:
> > Code; c0213ab3 <idescsi_pc_intr+63/360> <=====
> > 0: ff 42 18 incl 0x18(%edx) <=====
> > Code; c0213ab6 <idescsi_pc_intr+66/360>
> > 3: 89 3c 24 mov %edi,(%esp,1)
> > Code; c0213ab9 <idescsi_pc_intr+69/360>
> > 6: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp,1)
> > Code; c0213ac0 <idescsi_pc_intr+70/360>
> > d: 00
> > Code; c0213ac1 <idescsi_pc_intr+71/360>
> > e: e8 ae fc ff ff call fffffcc1 <_EIP+0xfffffcc1> c0213774 <idescsi_do_end_request+a4/e0>
> > Code; c0213ac6 <idescsi_pc_intr+76/360>
> > 13: 31 00 xor %eax,(%eax)
>
> Does this patch fix the problem?
> [attached patch]

Hello Herbert, hello all,

First of all: your patch does not apply to -pre5 at all. Anyway, I got your idea
and redid it accordingly.
Interestingly, the machine does not crash any more! And I have some useful
output from mounting:

From "modprobe ide-scsi":

Mar 13 16:11:30 admin kernel: Vendor: AOPEN Model: CD-RW CRW2440 Rev: 2.02
Mar 13 16:11:30 admin kernel: Type: CD-ROM ANSI SCSI revision: 02
Mar 13 16:11:30 admin kernel: Attached scsi CD-ROM sr0 at scsi2, channel 0, id 0, lun 0
Mar 13 16:11:30 admin kernel: sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray

From "mount /dev/sr0 /mnt":

Mar 13 16:12:12 admin kernel: scsi : aborting command due to timeout : pid 114491, scsi2, channel 0, id 0, lun 0 0x28 00 00 00 00 00 00 00 02 00
Mar 13 16:12:12 admin kernel: hdc: timeout waiting for DMA
Mar 13 16:12:12 admin kernel: hdc: timeout waiting for DMA
Mar 13 16:12:12 admin kernel: hdc: (__ide_dma_test_irq) called while not waiting
Mar 13 16:12:12 admin kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Mar 13 16:12:12 admin kernel: hdc: drive not ready for command
Mar 13 16:12:12 admin kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Mar 13 16:12:12 admin kernel: hdc: drive not ready for command
Mar 13 16:12:12 admin kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Mar 13 16:12:12 admin kernel: hdc: drive not ready for command
Mar 13 16:12:12 admin kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Mar 13 16:12:12 admin kernel: hdc: drive not ready for command
Mar 13 16:12:12 admin kernel: hdc: ATAPI reset complete
Mar 13 16:12:12 admin kernel: I/O error: dev 0b:00, sector 0
It looks like the ServerWorks IDE driver produces some error, and that error is
handled very poorly inside ide-scsi. After this output the mounted CD is
accessible, btw.
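
For anyone reading the log, the status value 0x58 can be decoded by hand. The sketch below uses the ATA status bit names as I recall them from the 2.4 <linux/hdreg.h>; the helper itself and its output strings are made up for illustration, modeled on the names the kernel prints above:

```c
#include <stdint.h>
#include <string.h>

/* ATA status register bits (names as in the 2.4 <linux/hdreg.h>). */
#define ERR_STAT   0x01
#define INDEX_STAT 0x02
#define ECC_STAT   0x04
#define DRQ_STAT   0x08
#define SEEK_STAT  0x10
#define WRERR_STAT 0x20
#define READY_STAT 0x40
#define BUSY_STAT  0x80

/* Hypothetical helper: write the names of all set bits into out,
 * separated by spaces. len must be at least 1. */
static void decode_ata_status(uint8_t status, char *out, size_t len)
{
    static const struct { uint8_t bit; const char *name; } bits[] = {
        { BUSY_STAT,  "Busy" },           { READY_STAT, "DriveReady" },
        { WRERR_STAT, "DeviceFault" },    { SEEK_STAT,  "SeekComplete" },
        { DRQ_STAT,   "DataRequest" },    { ECC_STAT,   "CorrectedError" },
        { INDEX_STAT, "Index" },          { ERR_STAT,   "Error" },
    };
    size_t i;

    out[0] = '\0';
    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (status & bits[i].bit) {
            if (out[0] != '\0')
                strncat(out, " ", len - strlen(out) - 1);
            strncat(out, bits[i].name, len - strlen(out) - 1);
        }
    }
}
```

0x58 is 0x40 + 0x10 + 0x08, so the drive reports ready and seek complete with a data request still pending, while the error bit (0x01) is clear. That is consistent with "drive not ready for command": the drive still appears to be waiting for a data transfer from the timed-out command.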

And another thing: this error only occurs the _first_ time a mount is performed
(the above CD is completely ok). From the second mount on everything looks
normal.

Regards,
Stephan


My patch:

diff -Nur linux/drivers/scsi/ide-scsi.c linux-patch/drivers/scsi/ide-scsi.c
--- linux/drivers/scsi/ide-scsi.c 2003-03-13 15:37:06.000000000 +0100
+++ linux-patch/drivers/scsi/ide-scsi.c 2003-03-13 16:19:41.000000000 +0100
@@ -321,7 +321,7 @@
{
idescsi_scsi_t *scsi = drive->driver_data;
struct request *rq = HWGROUP(drive)->rq;
- idescsi_pc_t *pc = (idescsi_pc_t *) rq->buffer;
+ idescsi_pc_t *pc = rq->special;
int log = test_bit(IDESCSI_LOG_CMD, &scsi->log);
u8 *scsi_buf;
unsigned long flags;
@@ -587,7 +587,7 @@
#endif /* IDESCSI_DEBUG_LOG */

if (rq->cmd == IDESCSI_PC_RQ) {
- return idescsi_issue_pc(drive, (idescsi_pc_t *) rq->buffer);
+ return idescsi_issue_pc (drive, rq->special);
}
printk(KERN_ERR "ide-scsi: %s: unsupported command in request "
"queue (%x)\n", drive->name, rq->cmd);
@@ -1083,7 +1083,7 @@
}

ide_init_drive_cmd(rq);
- rq->buffer = (char *) pc;
+ rq->special = pc;
rq->bh = idescsi_dma_bh(drive, pc);
rq->cmd = IDESCSI_PC_RQ;
spin_unlock_irq(&io_request_lock);
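
The thread does not say why moving the packet command from rq->buffer to rq->special helps. One plausible reading, and it is only an assumption, is that generic IDE/block code treats buffer as a data pointer and may modify it on some paths, while special is an opaque slot left to the driver. The toy model below illustrates that idea in user space; every type and function in it is a stand-in, not the kernel's definition:

```c
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's structures. */
struct toy_pc { int id; };

struct toy_request {
    char *buffer;   /* data pointer that generic code may advance */
    void *special;  /* opaque slot reserved for the driver */
};

/* Hypothetical generic-layer step that consumes part of the data buffer. */
static void toy_generic_advance(struct toy_request *rq, size_t bytes)
{
    if (rq->buffer)
        rq->buffer += bytes;
}

/* Old scheme: recover the packet command from rq->buffer. */
static struct toy_pc *toy_pc_from_buffer(struct toy_request *rq)
{
    return (struct toy_pc *) rq->buffer;
}

/* Patched scheme: recover the packet command from rq->special. */
static struct toy_pc *toy_pc_from_special(struct toy_request *rq)
{
    return rq->special;
}
```

In this model, once the generic step has touched buffer, toy_pc_from_buffer() no longer returns the command the driver stored, whereas toy_pc_from_special() still does.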


2003-03-13 15:34:27

by James Stevenson

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

Hi

Strange, this looks a lot like the ones I have seen throughout the whole 2.4.x tree.

This was discussed before. Somebody said they would send a patch, and somebody
else and I were going to test it, but the patch never appeared.
From what I can work out, an error occurs on the CD drive, the request
queue is then empty, and the ide-scsi driver then attempts to access the
request queue that doesn't exist. I never did manage to find out
where the request gets removed from the queue, though.


*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01e5783>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000000 ebx: c7a71000 ecx: c0327104 edx: 00000000
esi: 00000001 edi: c13a4fc0 ebp: cb23df58 esp: cb23df44
ds: 0018 es: 0018 ss: 0018
Process klogd (pid: 381, stackpage=cb23d000)
Stack: 00000000 c0327294 c13de260 c0327294 00000202 cb23df78 c01cdd11 c0327294
c01e5700 c0327104 c121db00 04000001 0000000f cb23df98 c010a0bd 0000000f
c13de260 cb23dfc4 cb23dfc4 0000000f c02f8ae0 cb23dfbc c010a24d 0000000f
Call Trace: [<c01cdd11>] [<c01e5700>] [<c010a0bd>] [<c010a24d>] [<c010c358>]
Code: 8b 72 18 46 89 72 18 8b 55 f0 8b 82 f0 00 00 00 8b 58 04 53

>>EIP; c01e5783 <idescsi_pc_intr+83/290> <=====
Trace; c01cdd11 <ide_intr+c1/120>
Trace; c01e5700 <idescsi_pc_intr+0/290>
Trace; c010a0bd <handle_IRQ_event+3d/70>
Trace; c010a24d <do_IRQ+7d/c0>
Trace; c010c358 <call_do_IRQ+5/d>
Code; c01e5783 <idescsi_pc_intr+83/290>
00000000 <_EIP>:
Code; c01e5783 <idescsi_pc_intr+83/290> <=====
0: 8b 72 18 mov 0x18(%edx),%esi <=====
Code; c01e5786 <idescsi_pc_intr+86/290>
3: 46 inc %esi
Code; c01e5787 <idescsi_pc_intr+87/290>
4: 89 72 18 mov %esi,0x18(%edx)
Code; c01e578a <idescsi_pc_intr+8a/290>
7: 8b 55 f0 mov 0xfffffff0(%ebp),%edx
Code; c01e578d <idescsi_pc_intr+8d/290>
a: 8b 82 f0 00 00 00 mov 0xf0(%edx),%eax
Code; c01e5793 <idescsi_pc_intr+93/290>
10: 8b 58 04 mov 0x4(%eax),%ebx
Code; c01e5796 <idescsi_pc_intr+96/290>
13: 53 push %ebx

<0>Kernel panic: Aiee, killing interrupt handler!

1 warning issued. Results may not be reliable.
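
The number after "Oops:" is the i386 page-fault error code, and it differs between the two dumps in this thread (0002 in the first report, 0000 here). A small sketch of how its low three bits are read; the struct and function names are made up for illustration:

```c
#include <stdbool.h>

/* Low three bits of the i386 page-fault error code printed after "Oops:". */
struct oops_code {
    bool protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;       /* bit 1: 1 = write access, 0 = read access */
    bool user;        /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

/* Hypothetical helper: split the error code into its three flags. */
static struct oops_code decode_oops_code(unsigned int code)
{
    struct oops_code c;

    c.protection = (code & 0x1) != 0;
    c.write      = (code & 0x2) != 0;
    c.user       = (code & 0x4) != 0;
    return c;
}
```

Both codes describe a kernel-mode access to a page that is not present. 0002 is a write, which matches the read-modify-write "incl 0x18(%edx)" in the first dump, and 0000 is a read, which matches the "mov 0x18(%edx),%esi" load here. With edx at 00000000 in this dump, the access lands at address 0x18.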





2003-03-13 16:26:41

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, James Stevenson wrote:
> Hi
>
> strange looks alot like the ones i have seen though the whole 2.4.x tree.
>
> this was discussed before somebody said they would send a patch myself
> and sombody else were going to test it but the patch never happens.
> From what i can work out an error occurs on the cd drive and the request
> queue is then empty and the ide-scsi driver then attempts to access the
> reuest queue that doesnt exist i never did manage to find out
> where the request get removed from the queue though.

Your explanation doesn't quite make sense, but I can take a look at the
problem :-)

What kernel is the below oops from? What compiler?

> *pde = 00000000
> Oops: 0000
> CPU: 0
> EIP: 0010:[<c01e5783>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010202
> eax: 00000000 ebx: c7a71000 ecx: c0327104 edx: 00000000
> esi: 00000001 edi: c13a4fc0 ebp: cb23df58 esp: cb23df44
> ds: 0018 es: 0018 ss: 0018
> Process klogd (pid: 381, stackpage=cb23d000)
> Stack: 00000000 c0327294 c13de260 c0327294 00000202 cb23df78 c01cdd11
> c0327294
> c01e5700 c0327104 c121db00 04000001 0000000f cb23df98 c010a0bd
> 0000000f
> c13de260 cb23dfc4 cb23dfc4 0000000f c02f8ae0 cb23dfbc c010a24d
> 0000000f
> Call Trace: [<c01cdd11>] [<c01e5700>] [<c010a0bd>] [<c010a24d>] [<c010c358>]
> Code: 8b 72 18 46 89 72 18 8b 55 f0 8b 82 f0 00 00 00 8b 58 04 53
>
> >>EIP; c01e5783 <idescsi_pc_intr+83/290> <=====
> Trace; c01cdd11 <ide_intr+c1/120>
> Trace; c01e5700 <idescsi_pc_intr+0/290>
> Trace; c010a0bd <handle_IRQ_event+3d/70>
> Trace; c010a24d <do_IRQ+7d/c0>
> Trace; c010c358 <call_do_IRQ+5/d>
> Code; c01e5783 <idescsi_pc_intr+83/290>
> 00000000 <_EIP>:
> Code; c01e5783 <idescsi_pc_intr+83/290> <=====
> 0: 8b 72 18 mov 0x18(%edx),%esi <=====
> Code; c01e5786 <idescsi_pc_intr+86/290>
> 3: 46 inc %esi
> Code; c01e5787 <idescsi_pc_intr+87/290>
> 4: 89 72 18 mov %esi,0x18(%edx)
> Code; c01e578a <idescsi_pc_intr+8a/290>
> 7: 8b 55 f0 mov 0xfffffff0(%ebp),%edx
> Code; c01e578d <idescsi_pc_intr+8d/290>
> a: 8b 82 f0 00 00 00 mov 0xf0(%edx),%eax
> Code; c01e5793 <idescsi_pc_intr+93/290>
> 10: 8b 58 04 mov 0x4(%eax),%ebx
> Code; c01e5796 <idescsi_pc_intr+96/290>
> 13: 53 push %ebx
>
> <0>Kernel panic: Aiee, killing interrupt handler!
>
> 1 warning issued. Results may not be reliable.

--
Jens Axboe

2003-03-13 16:31:41

by Willy Gardiol

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Do you think this could be related to the oops I am getting with
ide-scsi, a CD burner, DMA, and a PCI IDE controller?
I attach the decoded oops.

There is a thread about this on linux-ide; the problem seems to be shared by
at least one other person.


At 17:37 on Thursday 13 March 2003, Jens Axboe wrote:
> On Thu, Mar 13 2003, James Stevenson wrote:
> > Hi
> >
> > strange looks alot like the ones i have seen though the whole 2.4.x tree.
> >
> > this was discussed before somebody said they would send a patch myself
> > and sombody else were going to test it but the patch never happens.
> > From what i can work out an error occurs on the cd drive and the request
> > queue is then empty and the ide-scsi driver then attempts to access the
> > reuest queue that doesnt exist i never did manage to find out
> > where the request get removed from the queue though.
>
> Your explanation doesn't quite make sense, but I can take a look at the
> problem :-)
>
> What kernel is the below oops from? What compiler?
>
> > *pde = 00000000
> > Oops: 0000
> > CPU: 0
> > EIP: 0010:[<c01e5783>] Not tainted
> > Using defaults from ksymoops -t elf32-i386 -a i386
> > EFLAGS: 00010202
> > eax: 00000000 ebx: c7a71000 ecx: c0327104 edx: 00000000
> > esi: 00000001 edi: c13a4fc0 ebp: cb23df58 esp: cb23df44
> > ds: 0018 es: 0018 ss: 0018
> > Process klogd (pid: 381, stackpage=cb23d000)
> > Stack: 00000000 c0327294 c13de260 c0327294 00000202 cb23df78 c01cdd11
> > c0327294
> > c01e5700 c0327104 c121db00 04000001 0000000f cb23df98 c010a0bd
> > 0000000f
> > c13de260 cb23dfc4 cb23dfc4 0000000f c02f8ae0 cb23dfbc c010a24d
> > 0000000f
> > Call Trace: [<c01cdd11>] [<c01e5700>] [<c010a0bd>] [<c010a24d>] [<c010c358>]
> > Code: 8b 72 18 46 89 72 18 8b 55 f0 8b 82 f0 00 00 00 8b 58 04 53
> >
> > >>EIP; c01e5783 <idescsi_pc_intr+83/290> <=====
> >
> > Trace; c01cdd11 <ide_intr+c1/120>
> > Trace; c01e5700 <idescsi_pc_intr+0/290>
> > Trace; c010a0bd <handle_IRQ_event+3d/70>
> > Trace; c010a24d <do_IRQ+7d/c0>
> > Trace; c010c358 <call_do_IRQ+5/d>
> > Code; c01e5783 <idescsi_pc_intr+83/290>
> > 00000000 <_EIP>:
> > Code; c01e5783 <idescsi_pc_intr+83/290> <=====
> > 0: 8b 72 18 mov 0x18(%edx),%esi <=====
> > Code; c01e5786 <idescsi_pc_intr+86/290>
> > 3: 46 inc %esi
> > Code; c01e5787 <idescsi_pc_intr+87/290>
> > 4: 89 72 18 mov %esi,0x18(%edx)
> > Code; c01e578a <idescsi_pc_intr+8a/290>
> > 7: 8b 55 f0 mov 0xfffffff0(%ebp),%edx
> > Code; c01e578d <idescsi_pc_intr+8d/290>
> > a: 8b 82 f0 00 00 00 mov 0xf0(%edx),%eax
> > Code; c01e5793 <idescsi_pc_intr+93/290>
> > 10: 8b 58 04 mov 0x4(%eax),%ebx
> > Code; c01e5796 <idescsi_pc_intr+96/290>
> > 13: 53 push %ebx
> >
> > <0>Kernel panic: Aiee, killing interrupt handler!
> >
> > 1 warning issued. Results may not be reliable.

- --

!
Willy Gardiol - [email protected]
goemon.polito.it/~gardiol
Use linux for your freedom.

"The GPL and the open source model allow
the creation of the best technology.
That's all."

Linus Torvalds

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+cLRLQ9qolN/zUk4RAhf0AJ0QrCS37i1zp1HuKFurga1SS1q4IQCfUqS+
bLL8Q7X5t2a967ANOs0i8iM=
=AH5R
-----END PGP SIGNATURE-----


Attachments:
(No filename) (3.41 kB)
clearsigned data
ide-ops2.log (2.68 kB)

2003-03-13 16:33:31

by James Stevenson

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

> > strange looks alot like the ones i have seen though the whole 2.4.x tree.
> >
> > this was discussed before somebody said they would send a patch myself
> > and sombody else were going to test it but the patch never happens.
> > From what i can work out an error occurs on the cd drive and the request
> > queue is then empty and the ide-scsi driver then attempts to access the
> > reuest queue that doesnt exist i never did manage to find out
> > where the request get removed from the queue though.
>
> Your explanation doesn't quite make sense, but I can take a look at the
> problem :-)
>
> What kernel is the below oops from? What compiler?

I can trigger this on any 2.4.x series kernel.
-> Insert a damaged / lightly scratched CD into the drive
dd if=/dev/scd0 bs=8192k of=file
Wait for the oops.
Also, the drive tries to re-read several times,
then there is a short hang, then the following output.

gcc versions:
Whatever ships with Red Hat 7.1 + 7.2 + 7.3 and the
updates between them in each of the Red Hat versions,
but normally
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)

or
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)

> > *pde = 00000000
> > Oops: 0000
> > CPU: 0
> > EIP: 0010:[<c01e5783>] Not tainted
> > Using defaults from ksymoops -t elf32-i386 -a i386
> > EFLAGS: 00010202
> > eax: 00000000 ebx: c7a71000 ecx: c0327104 edx: 00000000
> > esi: 00000001 edi: c13a4fc0 ebp: cb23df58 esp: cb23df44
> > ds: 0018 es: 0018 ss: 0018
> > Process klogd (pid: 381, stackpage=cb23d000)
> > Stack: 00000000 c0327294 c13de260 c0327294 00000202 cb23df78 c01cdd11
> > c0327294
> > c01e5700 c0327104 c121db00 04000001 0000000f cb23df98 c010a0bd
> > 0000000f
> > c13de260 cb23dfc4 cb23dfc4 0000000f c02f8ae0 cb23dfbc c010a24d
> > 0000000f
> > Call Trace: [<c01cdd11>] [<c01e5700>] [<c010a0bd>] [<c010a24d>] [<c010c358>]
> > Code: 8b 72 18 46 89 72 18 8b 55 f0 8b 82 f0 00 00 00 8b 58 04 53
> >
> > >>EIP; c01e5783 <idescsi_pc_intr+83/290> <=====
> > Trace; c01cdd11 <ide_intr+c1/120>
> > Trace; c01e5700 <idescsi_pc_intr+0/290>
> > Trace; c010a0bd <handle_IRQ_event+3d/70>
> > Trace; c010a24d <do_IRQ+7d/c0>
> > Trace; c010c358 <call_do_IRQ+5/d>
> > Code; c01e5783 <idescsi_pc_intr+83/290>
> > 00000000 <_EIP>:
> > Code; c01e5783 <idescsi_pc_intr+83/290> <=====
> > 0: 8b 72 18 mov 0x18(%edx),%esi <=====
> > Code; c01e5786 <idescsi_pc_intr+86/290>
> > 3: 46 inc %esi
> > Code; c01e5787 <idescsi_pc_intr+87/290>
> > 4: 89 72 18 mov %esi,0x18(%edx)
> > Code; c01e578a <idescsi_pc_intr+8a/290>
> > 7: 8b 55 f0 mov 0xfffffff0(%ebp),%edx
> > Code; c01e578d <idescsi_pc_intr+8d/290>
> > a: 8b 82 f0 00 00 00 mov 0xf0(%edx),%eax
> > Code; c01e5793 <idescsi_pc_intr+93/290>
> > 10: 8b 58 04 mov 0x4(%eax),%ebx
> > Code; c01e5796 <idescsi_pc_intr+96/290>
> > 13: 53 push %ebx
> >
> > <0>Kernel panic: Aiee, killing interrupt handler!
> >
> > 1 warning issued. Results may not be reliable.
>
> --
> Jens Axboe


2003-03-13 16:35:50

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, James Stevenson wrote:
> > > strange looks alot like the ones i have seen though the whole 2.4.x tree.
> > >
> > > this was discussed before somebody said they would send a patch myself
> > > and sombody else were going to test it but the patch never happens.
> > > From what i can work out an error occurs on the cd drive and the request
> > > queue is then empty and the ide-scsi driver then attempts to access the
> > > reuest queue that doesnt exist i never did manage to find out
> > > where the request get removed from the queue though.
> >
> > Your explanation doesn't quite make sense, but I can take a look at the
> > problem :-)
> >
> > What kernel is the below oops from? What compiler?
>
> i can trigger this on any 2.4.x series kernel.
> -> Insert dmaged / lightly scratched cd into drive
> dd /dev/scd0 bs=8192k of=file
> wait for opps.
> opps also cd tries to re read several times
> short hang then the following output
>
> gcc versions.
> Whatever shits with redhat 7.1 + 7.2 + 7.3 and the
> updates between them in each of the redhat versions.
> but normally
> Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)
>
> or
> Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)

weee ok not my choice for compilers, but probably alright. do me a favor
then:

# cd /to/kernel/source
# rm drivers/scsi/ide-scsi.o
# EXTRA_CFLAGS=-g make drivers/scsi/ide-scsi.o
# objdump -S drivers/scsi/ide-scsi.o > /tmp/some_file

and then mail me some_file (privately), thanks.

--
Jens Axboe

2003-03-13 16:36:38

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, Willy Gardiol wrote:
> Do you thnik this can have somthing to share with the oops i am
> getting with ide-scsi, a cd burner, DMA, and a PCI IDE controller? I
> attach the oops decoded.
>
> There is a thread about this on linux-ide, it seems a problem common
> at least to another guy

Looks very similar, indeed.

--
Jens Axboe

2003-03-13 16:53:40

by James Stevenson

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

> > > Your explanation doesn't quite make sense, but I can take a look at the
> > > problem :-)
> > >
> > > What kernel is the below oops from? What compiler?
> >
> > i can trigger this on any 2.4.x series kernel.
> > -> Insert dmaged / lightly scratched cd into drive
> > dd /dev/scd0 bs=8192k of=file
> > wait for opps.
> > opps also cd tries to re read several times
> > short hang then the following output
> >
> > gcc versions.
> > Whatever shits with redhat 7.1 + 7.2 + 7.3 and the
> > updates between them in each of the redhat versions.
> > but normally
> > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> > gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)
> >
> > or
> > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> > gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)
>
> weee ok not my choice for compilers, but probably alright. do me a favor
> then:
>
> # cd /to/kernel/source
> # rm drivers/scsi/ide-scsi.o
> # EXTRA_CFLAGS=-g make drivers/scsi/ide-scsi.o
> # objdump -S drivers/scsi/ide-scsi.o > /tmp/some_file

I no longer have the source tree that the oops was generated from.
The oops I posted was actually from 2.4.19, but the kernel is now
2.4.20; it was the same trace and the same place in the file. I tracked it to
this point.

From 2.4.20 tree line 333 ide-scsi.c

333: if ((status & DRQ_STAT) == 0) { /* No more interrupts */
         if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
             printk (KERN_INFO "Packet command completed, %d bytes transferred\n", pc->actually_transferred);
         ide__sti();
         if (status & ERR_STAT)
338:         rq->errors++;
         idescsi_end_request (1, HWGROUP(drive));
         return ide_stopped;
     }


The oops occurs on line 338.

I know it is only error counting on that line, but rq->errors is used
in idescsi_end_request as well. From what I can work out, if it
never hits the limit it will keep retrying on the drive forever. Then
I start to get lost / confused ....
I can retrigger it at the weekend if you wish and provide all up-to-date
information on it.
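
Both disassemblies in this thread touch offset 0x18 from the base register, which fits rq->errors if the 2.4 struct request starts with the fields below. The layout is reproduced from memory of include/linux/blkdev.h and should be treated as an assumption; pointers are modeled as 32-bit integers so the offsets match i386 regardless of the build host, and the helper name is made up:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed leading fields of the 2.4 struct request, modeled for i386. */
struct model_list_head { uint32_t next, prev; };  /* two 32-bit pointers */

struct model_request {
    struct model_list_head queue;   /* offset 0x00 */
    int32_t  elevator_sequence;     /* offset 0x08 */
    int32_t  rq_status;             /* offset 0x0c */
    uint16_t rq_dev;                /* offset 0x10, kdev_t assumed 16 bits */
    int32_t  cmd;                   /* offset 0x14 after padding */
    int32_t  errors;                /* offset 0x18 */
};

/* Hypothetical helper: address the CPU touches for rq->errors. */
static uint32_t errors_address(uint32_t rq)
{
    return rq + (uint32_t) offsetof(struct model_request, errors);
}
```

Under that assumption, a NULL rq makes rq->errors++ access address 0x18, which matches the 2.4.19 dump where edx is 00000000, and a stale rq such as 3e076fe3 gives 3e076ffb.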




2003-03-13 17:03:52

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, James Stevenson wrote:
> > > > Your explanation doesn't quite make sense, but I can take a look at the
> > > > problem :-)
> > > >
> > > > What kernel is the below oops from? What compiler?
> > >
> > > i can trigger this on any 2.4.x series kernel.
> > > -> Insert damaged / lightly scratched cd into drive
> > > dd if=/dev/scd0 bs=8192k of=file
> > > wait for oops.
> > > oops; also the cd tries to re-read several times,
> > > short hang, then the following output
> > >
> > > gcc versions.
> > > Whatever ships with redhat 7.1 + 7.2 + 7.3 and the
> > > updates between them in each of the redhat versions.
> > > but normally
> > > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> > > gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)
> > >
> > > or
> > > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> > > gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)
> >
> > weee ok not my choice for compilers, but probably alright. do me a favor
> > then:
> >
> > # cd /to/kernel/source
> > # rm drivers/scsi/ide-scsi.o
> > # EXTRA_CFLAGS=-g make drivers/scsi/ide-scsi.o
> > # objdump -S drivers/scsi/ide-scsi.o > /tmp/some_file
>
> I no longer have the source tree that the oops was generated from.
> The oops I posted was actually from 2.4.19, but the kernel is now
> 2.4.20. It was the same trace and the same place in the file; I tracked
> it to this point.
>
> From the 2.4.20 tree, line 333 of ide-scsi.c:
>
> 333: if ((status & DRQ_STAT) == 0) { /* No more interrupts */
>          if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
>              printk (KERN_INFO "Packet command completed, %d bytes transferred\n", pc->actually_transferred);
>          ide__sti();
>          if (status & ERR_STAT)
> 338:         rq->errors++;
>          idescsi_end_request (1, HWGROUP(drive));
>          return ide_stopped;
>      }
>
>
> the oops occurs on line 338
>
> I know it's only error counting on that line, but rq->errors is used
> in idescsi_end_request as well. From what I can work out, if it
> never hits the limit it will keep retrying on the drive forever. Then
> I start to get lost / confused ....
> I can retrigger it at the weekend if you wish and provide all up-to-date
> information on it.

Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
different. If you just want a one-liner, I'd suggest trying something
ala this on 2.4.20 and see if it makes any difference:

--- drivers/scsi/ide-scsi.c~ 2003-03-13 18:13:00.876624632 +0100
+++ drivers/scsi/ide-scsi.c 2003-03-13 18:13:14.167604096 +0100
@@ -313,7 +313,7 @@
 	byte status, ireason;
 	int bcount;
 	idescsi_pc_t *pc=scsi->pc;
-	struct request *rq = pc->rq;
+	struct request *rq = HWGROUP(drive)->rq;
 	unsigned int temp;
 
 #if IDESCSI_DEBUG_LOG


But really, 2.4.21-pre is much more interesting to reproduce on.

--
Jens Axboe
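
The one-liner above changes where the interrupt handler takes its request pointer from: the hardware group instead of the pointer cached in the packet command. A userspace model of that difference might look like the sketch below. All struct and function names are hypothetical, and the sketch makes no claim about which pointer was actually invalid in the real driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct request. */
struct fake_request {
    int errors;
};

/* Hypothetical packet command that caches a request pointer. */
struct fake_pc {
    struct fake_request *rq;
};

/* Hypothetical hardware group that owns the request in flight. */
struct fake_hwgroup {
    struct fake_request *rq;
};

/* Before the patch: trust the pointer cached in the packet command. */
struct fake_request *rq_from_pc(const struct fake_pc *pc)
{
    return pc->rq;
}

/* After the patch: ask the hardware group for its current request. */
struct fake_request *rq_from_hwgroup(const struct fake_hwgroup *hwgroup)
{
    return hwgroup->rq;
}
```

If the cached pointer and the hardware group's pointer ever disagree, the two lookups return different requests, which is the situation the suggested test on 2.4.20 would expose.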

2003-03-13 17:09:53

by Alan

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 2003-03-13 at 17:14, Jens Axboe wrote:
> Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
> different. If you just want a one-liner, I'd suggest trying something
> ala this on 2.4.20 and see if it makes any difference:

The do_reset code is also racy in some cases on 2.4.21 kernels before pre5-ac2.

2003-03-13 17:21:31

by Stephan von Krawczynski

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 13 Mar 2003 18:14:26 +0100
Jens Axboe wrote:

> Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
> different. If you just want a one-liner, I'd suggest trying something
> ala this on 2.4.20 and see if it makes any difference:

Please read my subject ;-)

Regards, Stephan

2003-03-13 17:26:43

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, Stephan von Krawczynski wrote:
> On Thu, 13 Mar 2003 18:14:26 +0100
> Jens Axboe wrote:
>
> > Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
> > different. If you just want a one-liner, I'd suggest trying something
> > ala this on 2.4.20 and see if it makes any difference:
>
> Please read my subject ;-)

Sorry, was there a clean oops from 2.4.21-pre5 posted? If so, please
follow the suggestions I sent to James and provide me with the
objdump output.

--
Jens Axboe

2003-03-13 17:31:32

by Stephan von Krawczynski

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 13 Mar 2003 18:37:17 +0100
Jens Axboe wrote:

> On Thu, Mar 13 2003, Stephan von Krawczynski wrote:
> > On Thu, 13 Mar 2003 18:14:26 +0100
> > Jens Axboe wrote:
> >
> > > Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
> > > different. If you just want a one-liner, I'd suggest trying something
> > > ala this on 2.4.20 and see if it makes any difference:
> >
> > Please read my subject ;-)
>
> Sorry, was there a clean oops from 2.4.21-pre5 posted? If so, please
> follow the suggestions I sent to James and provide me with the
> objdump output.

Sorry, yes there was (first posting of this thread). And there already is a
working patch on -pre5 in another branch of this thread. Please have a look ...

--
Regards,
Stephan

2003-03-13 17:42:05

by Jens Axboe

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, Mar 13 2003, Stephan von Krawczynski wrote:
> On Thu, 13 Mar 2003 18:37:17 +0100
> Jens Axboe wrote:
>
> > On Thu, Mar 13 2003, Stephan von Krawczynski wrote:
> > > On Thu, 13 Mar 2003 18:14:26 +0100
> > > Jens Axboe wrote:
> > >
> > > > Ok, please reproduce in 2.4.21-pre5, its end_request handling is a lot
> > > > different. If you just want a one-liner, I'd suggest trying something
> > > > ala this on 2.4.20 and see if it makes any difference:
> > >
> > > Please read my subject ;-)
> >
> > Sorry, was there a clean oops from 2.4.21-pre5 posted? If so, please
> > follow the suggestions I sent to James and provide me with the
> > objdump output.
>
> Sorry, yes there was (first posting of this thread). And there already is a
> working patch on -pre5 in another branch of this thread. Please have a
> look ...

Oh, sorry missed most parts of the thread. Since this was the nth time
the subject was mentioned, I thought I'd take a look.

All's well, then.

--
Jens Axboe

2003-03-13 18:40:59

by Andre Hedrick

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi


Alan,

Did we not fix this problem when HP addressed it with the ia64 stuff?

Additionally I have finally found a long outstanding bug in the
buildsgtable in ide-dma.c. I just need to reverify the nature. It has to
do with the execution of the EOT bit in the last segment. This would also
explain why we are seeing expiry dma timeouts.

Cheers,

On Thu, 13 Mar 2003, Willy Gardiol wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Do you think this could be related to the oops I am getting with
> ide-scsi, a CD burner, DMA, and a PCI IDE controller?
> I attach the decoded oops.
>
> There is a thread about this on linux-ide; it seems to be a problem shared
> by at least one other person.
>
>
> On Thursday 13 March 2003 at 17:37, Jens Axboe wrote:
> > On Thu, Mar 13 2003, James Stevenson wrote:
> > > Hi
> > >
> > > strange, looks a lot like the ones I have seen through the whole 2.4.x tree.
> > >
> > > this was discussed before: somebody said they would send a patch, and myself
> > > and somebody else were going to test it, but the patch never happened.
> > >
> > > From what I can work out, an error occurs on the cd drive and the request
> > > queue is then empty, and the ide-scsi driver then attempts to access the
> > > request queue that doesn't exist. I never did manage to find out
> > > where the request gets removed from the queue though.
> >
> > Your explanation doesn't quite make sense, but I can take a look at the
> > problem :-)
> >
> > What kernel is the below oops from? What compiler?
> >
> > > *pde = 00000000
> > > Oops: 0000
> > > CPU: 0
> > > EIP: 0010:[<c01e5783>] Not tainted
> > > Using defaults from ksymoops -t elf32-i386 -a i386
> > > EFLAGS: 00010202
> > > eax: 00000000 ebx: c7a71000 ecx: c0327104 edx: 00000000
> > > esi: 00000001 edi: c13a4fc0 ebp: cb23df58 esp: cb23df44
> > > ds: 0018 es: 0018 ss: 0018
> > > Process klogd (pid: 381, stackpage=cb23d000)
> > > Stack: 00000000 c0327294 c13de260 c0327294 00000202 cb23df78 c01cdd11 c0327294
> > >        c01e5700 c0327104 c121db00 04000001 0000000f cb23df98 c010a0bd 0000000f
> > >        c13de260 cb23dfc4 cb23dfc4 0000000f c02f8ae0 cb23dfbc c010a24d 0000000f
> > > Call Trace: [<c01cdd11>] [<c01e5700>] [<c010a0bd>] [<c010a24d>] [<c010c358>]
> > > Code: 8b 72 18 46 89 72 18 8b 55 f0 8b 82 f0 00 00 00 8b 58 04 53
> > >
> > > >>EIP; c01e5783 <idescsi_pc_intr+83/290> <=====
> > >
> > > Trace; c01cdd11 <ide_intr+c1/120>
> > > Trace; c01e5700 <idescsi_pc_intr+0/290>
> > > Trace; c010a0bd <handle_IRQ_event+3d/70>
> > > Trace; c010a24d <do_IRQ+7d/c0>
> > > Trace; c010c358 <call_do_IRQ+5/d>
> > > Code; c01e5783 <idescsi_pc_intr+83/290>
> > > 00000000 <_EIP>:
> > > Code; c01e5783 <idescsi_pc_intr+83/290> <=====
> > > 0: 8b 72 18 mov 0x18(%edx),%esi <=====
> > > Code; c01e5786 <idescsi_pc_intr+86/290>
> > > 3: 46 inc %esi
> > > Code; c01e5787 <idescsi_pc_intr+87/290>
> > > 4: 89 72 18 mov %esi,0x18(%edx)
> > > Code; c01e578a <idescsi_pc_intr+8a/290>
> > > 7: 8b 55 f0 mov 0xfffffff0(%ebp),%edx
> > > Code; c01e578d <idescsi_pc_intr+8d/290>
> > > a: 8b 82 f0 00 00 00 mov 0xf0(%edx),%eax
> > > Code; c01e5793 <idescsi_pc_intr+93/290>
> > > 10: 8b 58 04 mov 0x4(%eax),%ebx
> > > Code; c01e5796 <idescsi_pc_intr+96/290>
> > > 13: 53 push %ebx
> > >
> > > <0>Kernel panic: Aiee, killing interrupt handler!
> > >
> > > 1 warning issued. Results may not be reliable.
>
> - --
>
> !
> Willy Gardiol
> goemon.polito.it/~gardiol
> Use linux for your freedom.
>
> "The GPL and the open source model allow
> the creation of the best technology.
> That's all."
>
> Linus Torvalds
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
>
> iD8DBQE+cLRLQ9qolN/zUk4RAhf0AJ0QrCS37i1zp1HuKFurga1SS1q4IQCfUqS+
> bLL8Q7X5t2a967ANOs0i8iM=
> =AH5R
> -----END PGP SIGNATURE-----
>

Andre Hedrick
LAD Storage Consulting Group
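
In the decoded oops quoted above, the faulting instruction is mov 0x18(%edx),%esi and the register dump shows edx: 00000000, so the CPU tried to read address 0x18. The inc %esi and the store back to 0x18(%edx) that follow match an increment of a structure member through a NULL pointer, such as the rq->errors++ identified earlier in the thread. The sketch below reproduces that arithmetic in userspace. struct layout_demo is a hypothetical layout padded so that the field lands at 0x18; the real struct request layout is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 0x18 bytes of padding so that `errors` sits at
 * the displacement seen in the faulting instruction. */
struct layout_demo {
    uint8_t pad[0x18];
    int32_t errors;
};

/* Address the CPU accesses for an operand of the form disp(%reg):
 * the register value plus the displacement. */
uintptr_t fault_address(uintptr_t base, size_t displacement)
{
    return base + displacement;
}
```

With a base of 0 (the value of edx) and offsetof(struct layout_demo, errors) as the displacement, fault_address() yields 0x18, an address in the unmapped first page, which is consistent with the "*pde = 00000000" line at the top of the oops.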

2003-03-13 19:09:19

by Alan

Subject: Re: OOPS in 2.4.21-pre5, ide-scsi

On Thu, 2003-03-13 at 18:50, Andre Hedrick wrote:
> Alan,
>
> Did we not fix this problem when HP addressed it with the ia64 stuff?

I've been working through a set of DMA problems. The PIO/DMA switching
timing out is fixed and has been for a while. I've very recently fixed a
race where we could get a command issued while resetting an interface.

With 2.4.x the reports I have make me think there are more races in
ide-scsi left. With 2.5.x it's completely broken. Someone rewrote the
abort/reset handling, some other people rewrote the scsi core and the
result needs significant work yet.

> Additionally I have finally found a long outstanding bug in the
> buildsgtable in ide-dma.c. I just need to reverify the nature. It has to
> do with the execution of the EOT bit in the last segment. This would also
> explain why we are seeing expiry dma timeouts.

Cool
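
For readers unfamiliar with the EOT bit mentioned above: in the Bus Master IDE descriptor table format, each entry holds a 32-bit buffer address and a 16-bit byte count, and bit 31 of the second dword marks the last entry so the controller knows where the table ends. The sketch below shows that convention only. It says nothing about what the actual bug in ide-dma.c is, and struct segment, build_prd_table() and the omission of the 64 KiB boundary and alignment checks are simplifying assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define PRD_EOT 0x80000000u  /* bit 31 of the second dword: end of table */

/* One physical region descriptor as the controller reads it. */
struct prd_entry {
    uint32_t addr;   /* physical address of the buffer */
    uint32_t count;  /* bits 0-15: byte count (0 means 64 KiB); bit 31: EOT */
};

/* Hypothetical input segment: address and length in bytes. */
struct segment {
    uint32_t addr;
    uint32_t len;
};

/* Fill `table` from `segs` and mark the last entry with EOT.
 * Returns the number of entries written, or 0 on invalid input. */
size_t build_prd_table(struct prd_entry *table, size_t max,
                       const struct segment *segs, size_t nsegs)
{
    size_t i;

    if (nsegs == 0 || nsegs > max)
        return 0;
    for (i = 0; i < nsegs; i++) {
        if (segs[i].len == 0 || segs[i].len > 0x10000)
            return 0;
        table[i].addr = segs[i].addr;
        table[i].count = segs[i].len & 0xffff;  /* 0x10000 encodes as 0 */
    }
    table[nsegs - 1].count |= PRD_EOT;  /* only the final entry carries EOT */
    return nsegs;
}
```

If EOT is missing from the final entry, or set on an earlier one, the controller either runs past the table or stops before the transfer is complete, and either outcome could plausibly surface as a DMA timeout.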