2002-09-10 18:15:25

by martin.knoblauch

Subject: Oops + Aiee when mounting CDROM via ide-scsi under 2.4.20-pre5-ac4

Hi,

I am getting a reproducible Oops+Aiee when trying to mount an ATAPI
CDROM via the ide-scsi interface under 2.4.20-pre5-ac4. Works OK
without ide-scsi.

First the Oops from mount:

knobi:/tmp # ksymoops -m /System.map < warn
ksymoops 2.4.3 on i686 2.4.20-pre5-ac4. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.20-pre5-ac4/ (default)
-m /System.map (specified)

Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_register not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_restore not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_set not found
in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_setmax not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_unregister not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_idle_cpu not found in
System.map. Ignoring ksyms_base entry
kernel BUG at
/scratch/linux-kernel/linux-2.4.20-pre5-ac4/include/linux/blkdev.h:153!
invalid operand: 0000
CPU: 0
EIP: 0010:[ide_build_sglist+77/396] Not tainted
EFLAGS: 00010206
eax: 0000005a ebx: c16cf000 ecx: c03636f4 edx: db657f60
esi: 00000000 edi: db657f60 ebp: d7d31d18 esp: d7d31cf8
ds: 0018 es: 0018 ss: 0018
Process mount (pid: 1307, stackpage=d7d31000)
Stack: c16cf000 c03637a4 db657f60 db657f60 00000003 00000297 dbc85e34
c16cc000
d7d31d44 c02009da c03636f4 db657f60 c03636f4 c03637a4 db657f60
df48a06c
00000000 00000000 c03636f4 d7d31d64 c0200e92 c03637a4 db657f60
c03637a4
Call Trace: [ide_build_dmatable+86/396] [__ide_dma_read+42/284]
[yenta_socket:__insmod_yenta_socket_O/lib/modules/2.4.20-pre5-ac4/kernel/+-325121/96]
[yenta_socket:__insmod_yenta_socket_O/lib/modules/2.4.20-pre5-ac4/kernel/+-324740/96]
[start_request+370/460]
Code: 0f 0b 99 00 80 64 2b c0 8b 45 08 c7 80 24 04 00 00 01 00 00
Using defaults from ksymoops -t elf32-i386 -a i386

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 0f 0b ud2a
Code; 00000002 Before first symbol
2: 99 cltd
Code; 00000002 Before first symbol
3: 00 80 64 2b c0 8b add %al,0x8bc02b64(%eax)
Code; 00000008 Before first symbol
9: 45 inc %ebp
Code; 0000000a Before first symbol
a: 08 c7 or %al,%bh
Code; 0000000c Before first symbol
c: 80 24 04 00 andb $0x0,(%esp,%eax,1)
Code; 00000010 Before first symbol
10: 00 01 add %al,(%ecx)


6 warnings issued. Results may not be reliable.

After a few tens of seconds I get SCSI reset messages and then the Aiee:

knobi:/tmp # ksymoops -m /System.map < aiee
ksymoops 2.4.3 on i686 2.4.20-pre5-ac4. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.20-pre5-ac4/ (default)
-m /System.map (specified)

Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_register not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_restore not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_set not found
in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_setmax not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_cpufreq_unregister not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol GPLONLY_idle_cpu not found in
System.map. Ignoring ksyms_base entry
Oops: 0000
CPU: 0
EIP: 0010:[<c011554a>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010002
eax: c167efa0 ebx: d7d31f2c ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00000286 ebp: c031ddec esp: c031ddd8
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:0, stackpage=c031d000)
Stack: c1515eec c167efa0 c02e037c 00000003 c167efa0 c031de00 c0128337
d7d35200
00000202 c1515eec c031de14 c013807e d7d35200 df48a0b8 00000004
c031de34
c020bb45 d7d35200 00000001 df48a000 00000000 df48a000 df667a14
c031de7c
Call Trace: [<c0128337>] [<c013807e>] [<c020bb45>] [<c020bea3>]
[<c02116b4>]
[<c020b6e4>] [<c020b16c>] [<e19b0571>] [<e19b0660>]
[<c01f7976>]
[<e19b05bc>] [<c0109d26>] [<c0109ec2>] [<c0105000>]
[<c010c0f8>]
[<c0107010>] [<c0105000>] [<c0107036>] [<c010708b>]
[<c0105016>]
Code: 8b 02 85 45 f8 74 ef 6a 00 52 e8 97 f8 ff ff 83 c4 08 85 c0

>>EIP; c011554a <__wake_up+2a/5c> <=====
Trace; c0128336 <unlock_page+7a/84>
Trace; c013807e <end_buffer_io_async+76/84>
Trace; c020bb44 <__scsi_end_request+78/134>
Trace; c020bea2 <scsi_io_completion+1ba/3d0>
Trace; c02116b4 <rw_intr+17c/188>
Trace; c020b6e4 <update_timeout+28/40>
Trace; c020b16c <scsi_old_done+5e0/5f0>
Trace; e19b0570 <[ide-scsi]idescsi_end_request+208/254>
Trace; e19b0660 <[ide-scsi]idescsi_pc_intr+a4/2e8>
Trace; c01f7976 <ide_intr+c2/118>
Trace; e19b05bc <[ide-scsi]idescsi_pc_intr+0/2e8>
Trace; c0109d26 <handle_IRQ_event+2e/5c>
Trace; c0109ec2 <do_IRQ+96/d4>
Trace; c0105000 <_stext+0/0>
Trace; c010c0f8 <call_do_IRQ+6/e>
Trace; c0107010 <default_idle+0/30>
Trace; c0105000 <_stext+0/0>
Trace; c0107036 <default_idle+26/30>
Trace; c010708a <cpu_idle+22/30>
Trace; c0105016 <rest_init+16/20>
Code; c011554a <__wake_up+2a/5c>
00000000 <_EIP>:
Code; c011554a <__wake_up+2a/5c> <=====
0: 8b 02 mov (%edx),%eax <=====
Code; c011554c <__wake_up+2c/5c>
2: 85 45 f8 test %eax,0xfffffff8(%ebp)
Code; c011554e <__wake_up+2e/5c>
5: 74 ef je fffffff6 <_EIP+0xfffffff6>
c0115540 <__wake_up+20/5c>
Code; c0115550 <__wake_up+30/5c>
7: 6a 00 push $0x0
Code; c0115552 <__wake_up+32/5c>
9: 52 push %edx
Code; c0115554 <__wake_up+34/5c>
a: e8 97 f8 ff ff call fffff8a6 <_EIP+0xfffff8a6>
c0114df0 <try_to_wake_up+0/118>
Code; c0115558 <__wake_up+38/5c>
f: 83 c4 08 add $0x8,%esp
Code; c011555c <__wake_up+3c/5c>
12: 85 c0 test %eax,%eax


6 warnings issued. Results may not be reliable.


Also new since I moved to 2.4.20-pre5-ac4 (from 2.4.19-ac4) are the
following messages in dmesg:

yenta 02:05.0: no resource of type 100 available, trying to continue...
yenta 02:05.0: no resource of type 100 available, trying to continue...
yenta 02:05.1: no resource of type 100 available, trying to continue...
yenta 02:05.1: no resource of type 100 available, trying to continue...

My .config is included.

Martin
--
Martin Knoblauch
Senior System Architect
MSC.software GmbH
Am Moosfeld 13
D-81829 Muenchen, Germany

e-mail: [email protected]
http://www.mscsoftware.com
Phone/Fax: +49-89-431987-189 / -7189
Mobile: +49-174-3069245


Attachments:
aiee (956.00 B)
oops (1.41 kB)
.config (32.60 kB)

2002-09-10 19:09:03

by Jens Axboe

Subject: Re: Oops + Aiee when mounting CDROM via ide-scsi under 2.4.20-pre5-ac4

On Tue, Sep 10 2002, Jens Axboe wrote:
> On Tue, Sep 10 2002, Martin Knoblauch wrote:
> > Hi,
> >
> > I am getting a reproducible Oops+Aiee when trying to mount an ATAPI
> > CDROM via the ide-scsi interface under 2.4.20-pre5-ac4. Works OK
> > without ide-scsi.
>
> Ok, the problem is that ide-scsi builds a request which eventually ends
> up going through the ide code dma mapping. ide_build_sglist() does a
> rq_data_dir() on the request, which BUG()'s if the command isn't an fs
> read or write. This actually went undetected before, because the ide
> code did:
>
>     if (rq->cmd == READ)
>         direction is dma from device
>     else
>         direction is to device
>
> and rq->cmd is IDESCSI_PC_RQ in this case. So we always mapped for dma
> to the device, even if that wasn't the case.
>
> Hmm, maybe just adding a single direction bit to struct request is the
> easy way out for 2.4. Or... I'll cook something up.

OK, try this patch against 2.4.20-pre5-ac4 (well, not a clean tree, but I
think it should apply).

Alan, this should probably go into ac5, provided that Martin tests it as
ok.

diff -ur -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/arm/icside.c linux-2.4.20-pre5-ac4/drivers/ide/arm/icside.c
--- /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/arm/icside.c 2002-09-10 12:56:51.000000000 +0200
+++ linux-2.4.20-pre5-ac4/drivers/ide/arm/icside.c 2002-09-10 21:04:58.000000000 +0200
@@ -264,9 +264,9 @@
}

static int
-icside_build_dmatable(ide_drive_t *drive, int reading)
+icside_build_dmatable(ide_drive_t *drive, int ddir)
{
- return HWIF(drive)->sg_nents = ide_build_sglist(HWIF(drive), HWGROUP(drive)->rq);
+ return HWIF(drive)->sg_nents = ide_build_sglist(HWIF(drive), HWGROUP(drive)->rq, ddir);
}

/* Teardown mappings after DMA has completed. */
@@ -556,7 +556,7 @@
u8 lba48 = (drive->addressing == 1) ? 1 : 0;
task_ioreg_t command = WIN_NOP;

- count = icside_build_dmatable(drive, 1);
+ count = icside_build_dmatable(drive, PCI_DMA_FROMDEVICE);
if (!count)
return 1;
disable_dma(hwif->hw.dma);
@@ -610,7 +610,7 @@
u8 lba48 = (drive->addressing == 1) ? 1 : 0;
task_ioreg_t command = WIN_NOP;

- count = icside_build_dmatable(drive, 0);
+ count = icside_build_dmatable(drive, PCI_DMA_TODEVICE);
if (!count)
return 1;
disable_dma(hwif->hw.dma);
diff -ur -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/ide-dma.c linux-2.4.20-pre5-ac4/drivers/ide/ide-dma.c
--- /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/ide-dma.c 2002-09-10 12:56:51.000000000 +0200
+++ linux-2.4.20-pre5-ac4/drivers/ide/ide-dma.c 2002-09-10 21:03:11.000000000 +0200
@@ -215,7 +214,7 @@
return DRIVER(drive)->error(drive, "dma_intr", stat);
}

-static int ide_build_sglist (ide_hwif_t *hwif, struct request *rq)
+static int ide_build_sglist (ide_hwif_t *hwif, struct request *rq, int ddir)
{
struct buffer_head *bh;
struct scatterlist *sg = hwif->sg_table;
@@ -223,11 +222,7 @@

if (hwif->sg_dma_active)
BUG();
-
- if (rq_data_dir(rq) == READ)
- hwif->sg_dma_direction = PCI_DMA_FROMDEVICE;
- else
- hwif->sg_dma_direction = PCI_DMA_TODEVICE;
+
bh = rq->bh;
do {
unsigned char *virt_addr = bh->b_data;
@@ -252,7 +247,8 @@
if(nents == 0)
BUG();

- return pci_map_sg(hwif->pci_dev, sg, nents, hwif->sg_dma_direction);
+ hwif->sg_dma_direction = ddir;
+ return pci_map_sg(hwif->pci_dev, sg, nents, ddir);
}

static int ide_raw_build_sglist (ide_hwif_t *hwif, struct request *rq)
@@ -302,7 +298,7 @@
* Returns 0 if all went okay, returns 1 otherwise.
* May also be invoked from trm290.c
*/
-int ide_build_dmatable (ide_drive_t *drive, struct request *rq)
+int ide_build_dmatable (ide_drive_t *drive, struct request *rq, int ddir)
{
ide_hwif_t *hwif = HWIF(drive);
unsigned int *table = hwif->dmatable_cpu;
@@ -314,7 +310,7 @@
if (rq->cmd == IDE_DRIVE_TASKFILE)
hwif->sg_nents = i = ide_raw_build_sglist(hwif, rq);
else
- hwif->sg_nents = i = ide_build_sglist(hwif, rq);
+ hwif->sg_nents = i = ide_build_sglist(hwif, rq, ddir);

if (!i)
return 0;
@@ -543,7 +539,7 @@
u8 dma_stat = 0, lba48 = (drive->addressing == 1) ? 1 : 0;
task_ioreg_t command = WIN_NOP;

- if (!(count = ide_build_dmatable(drive, rq)))
+ if (!(count = ide_build_dmatable(drive, rq, PCI_DMA_FROMDEVICE)))
/* try PIO instead of DMA */
return 1;
/* PRD table */
@@ -595,7 +591,7 @@
u8 dma_stat = 0, lba48 = (drive->addressing == 1) ? 1 : 0;
task_ioreg_t command = WIN_NOP;

- if (!(count = ide_build_dmatable(drive, rq)))
+ if (!(count = ide_build_dmatable(drive, rq, PCI_DMA_TODEVICE)))
/* try PIO instead of DMA */
return 1;
/* PRD table */
diff -ur -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/pci/trm290.c linux-2.4.20-pre5-ac4/drivers/ide/pci/trm290.c
--- /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/pci/trm290.c 2002-09-10 12:56:52.000000000 +0200
+++ linux-2.4.20-pre5-ac4/drivers/ide/pci/trm290.c 2002-09-10 21:08:15.000000000 +0200
@@ -191,7 +191,7 @@
trm290_prepare_drive(drive, 0); /* select PIO xfer */
return 1;
#endif
- if (!(count = ide_build_dmatable(drive, rq))) {
+ if (!(count = ide_build_dmatable(drive, rq, PCI_DMA_TODEVICE))) {
/* try PIO instead of DMA */
trm290_prepare_drive(drive, 0); /* select PIO xfer */
return 1;
@@ -235,7 +235,7 @@
task_ioreg_t command = WIN_NOP;
unsigned int count, reading = 2, writing = 0;

- if (!(count = ide_build_dmatable(drive, rq))) {
+ if (!(count = ide_build_dmatable(drive, rq, PCI_DMA_FROMDEVICE))) {
/* try PIO instead of DMA */
trm290_prepare_drive(drive, 0); /* select PIO xfer */
return 1;
diff -ur -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/ppc/pmac.c linux-2.4.20-pre5-ac4/drivers/ide/ppc/pmac.c
--- /opt/kernel/linux-2.4.20-pre5-ac4/drivers/ide/ppc/pmac.c 2002-09-10 12:56:52.000000000 +0200
+++ linux-2.4.20-pre5-ac4/drivers/ide/ppc/pmac.c 2002-09-10 21:07:42.000000000 +0200
@@ -932,7 +932,7 @@
#ifdef CONFIG_BLK_DEV_IDEDMA_PMAC

static int
-pmac_ide_build_sglist(ide_hwif_t *hwif, struct request *rq)
+pmac_ide_build_sglist(ide_hwif_t *hwif, struct request *rq, int data_dir)
{
pmac_ide_hwif_t *pmif = (pmac_ide_hwif_t *)hwif->hwif_data;
struct buffer_head *bh;
@@ -942,10 +942,6 @@
if (hwif->sg_dma_active)
BUG();

- if (rq->cmd == READ)
- pmif->sg_dma_direction = PCI_DMA_FROMDEVICE;
- else
- pmif->sg_dma_direction = PCI_DMA_TODEVICE;
bh = rq->bh;
do {
unsigned char *virt_addr = bh->b_data;
@@ -965,7 +961,8 @@
nents++;
} while (bh != NULL);

- return pci_map_sg(hwif->pci_dev, sg, nents, pmif->sg_dma_direction);
+ pmif->sg_dma_direction = data_dir;
+ return pci_map_sg(hwif->pci_dev, sg, nents, data_dir);
}

static int
@@ -1013,6 +1010,7 @@
pmac_ide_hwif_t* pmif = (pmac_ide_hwif_t *)hwif->hwif_data;
volatile struct dbdma_regs *dma = pmif->dma_regs;
struct scatterlist *sg;
+ int data_dir;

/* DMA table is already aligned */
table = (struct dbdma_cmd *) pmif->dma_table_cpu;
@@ -1022,11 +1020,16 @@
while (readl(&dma->status) & RUN)
udelay(1);

+ if (wr)
+ data_dir = PCI_DMA_TODEVICE;
+ else
+ data_dir = PCI_DMA_FROMDEVICE;
+
/* Build sglist */
if (rq->cmd == IDE_DRIVE_TASKFILE)
pmif->sg_nents = i = pmac_ide_raw_build_sglist(hwif, rq);
else
- pmif->sg_nents = i = pmac_ide_build_sglist(hwif, rq);
+ pmif->sg_nents = i = pmac_ide_build_sglist(hwif, rq, data_dir);
if (!i)
return 0;

diff -ur -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.20-pre5-ac4/include/linux/ide.h linux-2.4.20-pre5-ac4/include/linux/ide.h
--- /opt/kernel/linux-2.4.20-pre5-ac4/include/linux/ide.h 2002-09-10 12:56:54.000000000 +0200
+++ linux-2.4.20-pre5-ac4/include/linux/ide.h 2002-09-10 21:04:00.000000000 +0200
@@ -1732,7 +1732,7 @@
#ifdef CONFIG_BLK_DEV_IDEDMA
#define BAD_DMA_DRIVE 0
#define GOOD_DMA_DRIVE 1
-extern int ide_build_dmatable(ide_drive_t *, struct request *);
+extern int ide_build_dmatable(ide_drive_t *, struct request *, int);
extern void ide_destroy_dmatable(ide_drive_t *);
extern ide_startstop_t ide_dma_intr(ide_drive_t *);
extern int ide_release_dma(ide_hwif_t *);
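
The idea behind the patch, that the read and write entry points tell the
sglist builder which way the data flows instead of the builder inferring it
from the request, can be modelled in a few lines of user-space C. This is
only a sketch: struct fake_hwif, build_sglist(), dma_read(), dma_write() and
the enum values are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>

/* Illustrative stand-ins only; not the kernel's PCI_DMA_* constants. */
enum dma_dir { DIR_TO_DEVICE = 1, DIR_FROM_DEVICE = 2 };

/* Hypothetical, cut-down model of the per-interface state. */
struct fake_hwif {
    enum dma_dir sg_dma_direction;  /* remembered for the later unmap */
    int sg_nents;                   /* number of mapped entries */
};

/* Model of the patched builder: the direction is supplied by the caller. */
static int build_sglist(struct fake_hwif *hwif, int nents, enum dma_dir ddir)
{
    assert(ddir == DIR_TO_DEVICE || ddir == DIR_FROM_DEVICE);
    hwif->sg_dma_direction = ddir;
    hwif->sg_nents = nents;
    return nents;
}

/* The read path always maps for device-to-memory transfers. */
static int dma_read(struct fake_hwif *hwif, int nents)
{
    return build_sglist(hwif, nents, DIR_FROM_DEVICE);
}

/* The write path always maps for memory-to-device transfers. */
static int dma_write(struct fake_hwif *hwif, int nents)
{
    return build_sglist(hwif, nents, DIR_TO_DEVICE);
}
```

With this shape, the request's command value never enters the decision, so
a packet command request gets the direction of whichever entry point was
called.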

--
Jens Axboe

2002-09-10 19:04:40

by Jens Axboe

Subject: Re: Oops + Aiee when mounting CDROM via ide-scsi under 2.4.20-pre5-ac4

On Tue, Sep 10 2002, Martin Knoblauch wrote:
> Hi,
>
> I am getting a reproducible Oops+Aiee when trying to mount an ATAPI
> CDROM via the ide-scsi interface under 2.4.20-pre5-ac4. Works OK
> without ide-scsi.

Ok, the problem is that ide-scsi builds a request which eventually ends
up going through the ide code dma mapping. ide_build_sglist() does a
rq_data_dir() on the request, which BUG()'s if the command isn't an fs
read or write. This actually went undetected before, because the ide
code did:

    if (rq->cmd == READ)
        direction is dma from device
    else
        direction is to device

and rq->cmd is IDESCSI_PC_RQ in this case. So we always mapped for dma
to the device, even if that wasn't the case.

Hmm, maybe just adding a single direction bit to struct request is the
easy way out for 2.4. Or... I'll cook something up.
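
To make the difference concrete, here is a minimal user-space sketch of the
two behaviours described above. All names and values (CMD_READ, CMD_WRITE,
CMD_PACKET, old_guess_dir(), checked_dir()) are made up for illustration and
are not the kernel's definitions; where the real rq_data_dir() would BUG(),
this model just returns DIR_INVALID.

```c
#include <assert.h>

/* Hypothetical command codes; CMD_PACKET stands for any non-fs request. */
enum { CMD_READ = 0, CMD_WRITE = 1, CMD_PACKET = 90 };

/* Illustrative direction values, plus a marker for the "would BUG()" case. */
enum dma_dir { DIR_INVALID = -1, DIR_TO_DEVICE = 1, DIR_FROM_DEVICE = 2 };

/* Old behaviour: anything that is not a READ is silently mapped to-device. */
static enum dma_dir old_guess_dir(int cmd)
{
    return cmd == CMD_READ ? DIR_FROM_DEVICE : DIR_TO_DEVICE;
}

/* rq_data_dir()-style behaviour: only fs reads and writes are accepted. */
static enum dma_dir checked_dir(int cmd)
{
    if (cmd != CMD_READ && cmd != CMD_WRITE)
        return DIR_INVALID;     /* the kernel would BUG() here */
    return cmd == CMD_READ ? DIR_FROM_DEVICE : DIR_TO_DEVICE;
}

/* Small self-check of the model; call it from main() to exercise it. */
static void check_model(void)
{
    /* A packet command was always treated as a write by the old test... */
    assert(old_guess_dir(CMD_PACKET) == DIR_TO_DEVICE);
    /* ...and is rejected outright by the stricter check. */
    assert(checked_dir(CMD_PACKET) == DIR_INVALID);
}
```

Neither function can give the right answer for a packet command that reads
from the device, which is why the direction has to come from somewhere other
than the command value.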

--
Jens Axboe

2002-09-11 10:29:27

by Jens Axboe

Subject: Re: Oops + Aiee when mounting CDROM via ide-scsi under 2.4.20-pre5-ac4

On Wed, Sep 11 2002, Martin Knoblauch wrote:
> > ok try this patch, against 2.4.20-pre5-ac4 (well not a clean one, but
> > I think it should apply).
> >
> >
> > alan, this should probably go into ac5 provided that Martin tests it
> > as ok.
> Jens, Alan.
>
> BBW (Boots, Builds, Works :-). Thank you very much. After the patch,
> mounting CDs through ide-scsi works like a charm.
>
> So, from my point of view it should go into pre5-ac5 or pre6-ac1.

Thanks for testing, glad to hear it.

--
Jens Axboe

2002-09-11 10:25:47

by martin.knoblauch

Subject: Re: Oops + Aiee when mounting CDROM via ide-scsi under 2.4.20-pre5-ac4

> ok try this patch, against 2.4.20-pre5-ac4 (well not a clean one, but
> I think it should apply).
>
>
> alan, this should probably go into ac5 provided that Martin tests it
> as ok.
Jens, Alan.

BBW (Boots, Builds, Works :-). Thank you very much. After the patch,
mounting CDs through ide-scsi works like a charm.

So, from my point of view it should go into pre5-ac5 or pre6-ac1.

Martin
--
Martin Knoblauch
Senior System Architect
MSC.software GmbH
Am Moosfeld 13
D-81829 Muenchen, Germany

e-mail: [email protected]
http://www.mscsoftware.com
Phone/Fax: +49-89-431987-189 / -7189
Mobile: +49-174-3069245