2002-06-04 18:49:52

by Zlatko Calusic

[permalink] [raw]
Subject: IDE{,-SCSI} trouble [2.5.20]

2.5.20 on a typical UP setup (PIII, BX chipset) produces quite a lot of

Jun 4 19:34:44 magla kernel: hda: lost interrupt
Jun 4 19:47:09 magla kernel: hda: lost interrupt
Jun 4 19:53:59 magla kernel: hda: lost interrupt
Jun 4 19:56:54 magla kernel: hda: lost interrupt
Jun 4 20:05:49 magla kernel: hda: lost interrupt
Jun 4 20:06:09 magla kernel: hda: lost interrupt
Jun 4 20:11:59 magla kernel: hda: lost interrupt
Jun 4 20:12:34 magla kernel: hda: lost interrupt
Jun 4 20:33:29 magla kernel: hda: lost interrupt
Jun 4 20:42:24 magla kernel: hdc: lost interrupt

I haven't run lots of 2.5.x on that machine, so I don't know if it's
specific/new to 2.5.20.


Second thing, ide-scsi has some kind of trouble that hasn't been
solved for some time now. This one is on SMP machine, 2.5.20, but I've
seen it on older 2.5.x:

Jun 4 20:08:41 atlas kernel: VFS: Disk change detected on device sr(11,1)
Jun 4 20:08:41 atlas kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000098
Jun 4 20:08:41 atlas kernel: printing eip:
Jun 4 20:08:41 atlas kernel: c01af537
Jun 4 20:08:41 atlas kernel: *pde = 00000000
Jun 4 20:08:41 atlas kernel: Oops: 0000
Jun 4 20:08:41 atlas kernel: CPU: 1
Jun 4 20:08:41 atlas kernel: EIP: 0010:[blk_rq_map_sg+23/356] Not tainted
Jun 4 20:08:41 atlas kernel: EFLAGS: 00210296
Jun 4 20:08:41 atlas kernel: eax: da17df40 ebx: efdfb000 ecx: da17df40 edx: 00000000
Jun 4 20:08:41 atlas kernel: esi: 0000c008 edi: da17df40 ebp: c0371b08 esp: dabefc74
Jun 4 20:08:41 atlas kernel: ds: 0018 es: 0018 ss: 0018
Jun 4 20:08:41 atlas kernel: Process nautilus (pid: 457, threadinfo=dabee000 task=ec808120)
Jun 4 20:08:41 atlas kernel: Stack: efdfb000 0000c008 da17df40 c0371b08 da17dd40 00000000 da17dd40 c0371b1c
Jun 4 20:08:41 atlas kernel: 00000000 c03717a8 c01aebba c01ca5cb 00000000 da17df40 efdfb000 efdfc000
Jun 4 20:08:41 atlas kernel: 0000c008 00000008 c0371b08 c0371b08 00000000 c01caa38 c03717a8 da17df40
Jun 4 20:08:41 atlas kernel: Call Trace: [__elv_next_request+10/16] [build_sglist+271/448] [udma_new_table+36/332] [ata_start_dma+44/100] [udma_pci_init+20/212]
Jun 4 20:08:41 atlas kernel: [idescsi_issue_pc+100/324] [idescsi_do_request+27/60] [start_request+288/396] [__elv_next_request+10/16] [queue_commands+294/372] [do_request+64/104]
Jun 4 20:08:41 atlas kernel: [do_ide_request+16/20] [ide_do_drive_cmd+228/292] [idescsi_queue+1332/1420] [scsi_dispatch_cmd+265/428] [scsi_done+0/164] [scsi_request_fn+1128/1168]
Jun 4 20:08:41 atlas kernel: [generic_unplug_device+59/84] [blk_run_queues+174/268] [do_page_cache_readahead+227/264] [page_cache_readahead+235/244] [do_generic_file_read+146/808] [generic_file_read+126/304]
Jun 4 20:08:41 atlas kernel: [file_read_actor+0/140] [vfs_read+154/264] [sys_read+42/60] [syscall_call+7/11]
Jun 4 20:08:41 atlas kernel:
Jun 4 20:08:41 atlas kernel: Code: 8b 8a 98 00 00 00 83 e1 02 89 4c 24 18 c7 44 24 28 00 00 00
Jun 4 20:09:21 atlas kernel: <6>scsi: device set offline - command error recover failed: host 1 channel 0 id 0 lun 0
Jun 4 20:09:21 atlas kernel: SCSI cdrom error : host 1 channel 0 id 0 lun 0 return code = 6000000
Jun 4 20:09:21 atlas kernel: end_request: I/O error, dev 0b:01, sector 64
Jun 4 20:09:21 atlas kernel: Buffer I/O error on device sr(11,1), logical block 8
Jun 4 20:09:21 atlas kernel: end_request: I/O error, dev 0b:01, sector 72
Jun 4 20:09:21 atlas kernel: Buffer I/O error on device sr(11,1), logical block 9
Jun 4 20:09:21 atlas kernel: Buffer I/O error on device sr(11,1), logical block 10
Jun 4 20:09:21 atlas kernel: Buffer I/O error on device sr(11,1), logical block 11
Jun 4 20:09:21 atlas kernel: SCSI error: host 1 id 0 lun 0 return code = 6000000
Jun 4 20:09:21 atlas kernel: ^ISense class 0, sense error 0, extended sense 0
Jun 4 20:09:21 atlas kernel: VFS: Disk change detected on device sr(11,1)
Jun 4 20:09:21 atlas kernel: VFS: Disk change detected on device sr(11,1)
Jun 4 20:09:41 atlas kernel: scsi: device set offline - command error recover failed: host 0 channel 0 id 0 lun 0
Jun 4 20:09:41 atlas kernel: SCSI error: host 0 id 0 lun 0 return code = 6000000
Jun 4 20:09:41 atlas kernel: ^ISense class 0, sense error 0, extended sense 0
Jun 4 20:09:41 atlas kernel: VFS: Disk change detected on device sr(11,0)
Jun 4 20:09:41 atlas kernel: VFS: Disk change detected on device sr(11,0)

More info available on request.

Regards,
--
Zlatko


2002-06-04 21:38:45

by Adam J. Richter

[permalink] [raw]
Subject: Re: IDE{,-SCSI} trouble [2.5.20]

I sent Martin Dalecki a patch for a null pointer dereference
problem that the ide-scsi code tripped in the IDE code. I imagine
Martin will submit it to Linus for 2.5.21.

The IDE code was receiving some requests where
request->q == NULL and then passing request->q to routine
that dereferenced it. My patch determined the queue another way,
and now I can read from my IDE DVD-ROM drive.

Here is a copy of the patch, in case you are would like to
try it in the meantime.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

--- linux/include/linux/ide.h 2002-06-03 00:45:25.000000000 -0700
+++ linux-2.5.20/include/linux/ide.h 2002-06-02 18:44:43.000000000 -0700
@@ -799,7 +799,7 @@
extern void udma_pci_irq_lost(struct ata_device *);
extern int udma_pci_setup(struct ata_device *);

-extern int udma_new_table(struct ata_device *, struct request *);
+extern int udma_new_table(struct ata_channel *, struct request *);
extern void udma_destroy_table(struct ata_channel *);
extern void udma_print(struct ata_device *);

--- linux/drivers/ide/pcidma.c 2002-06-03 00:47:52.000000000 -0700
+++ linux-2.5.20/drivers/ide/pcidma.c 2002-06-02 18:44:43.000000000 -0700
@@ -58,9 +58,8 @@
* FIXME: taskfiles should be a map of pages, not a long virt address... /jens
* FIXME: I agree with Jens --mdcki!
*/
-static int build_sglist(struct ata_device *drive, struct request *rq)
+static int build_sglist(struct ata_channel *ch, struct request *rq)
{
- struct ata_channel *ch = drive->channel;
struct scatterlist *sg = ch->sg_table;
int nents = 0;

@@ -70,7 +69,7 @@
unsigned char *virt_addr = rq->buffer;
int sector_count = rq->nr_sectors;
#else
- nents = blk_rq_map_sg(&drive->queue, rq, ch->sg_table);
+ nents = blk_rq_map_sg(rq->q, rq, ch->sg_table);

if (nents > rq->nr_segments)
printk("ide-dma: received %d segments, build %d\n", rq->nr_segments, nents);
@@ -100,7 +99,7 @@
sg[nents].length = sector_count * SECTOR_SIZE;
++nents;
} else {
- nents = blk_rq_map_sg(&drive->queue, rq, ch->sg_table);
+ nents = blk_rq_map_sg(rq->q, rq, ch->sg_table);

if (rq->q && nents > rq->nr_phys_segments)
printk("ide-dma: received %d phys segments, build %d\n", rq->nr_phys_segments, nents);
@@ -151,7 +150,7 @@
reading = 1 << 3;

/* try PIO instead of DMA */
- if (!udma_new_table(drive, rq))
+ if (!udma_new_table(ch, rq))
return 1;

outl(ch->dmatable_dma, dma_base + 4); /* PRD table */
@@ -307,9 +306,8 @@
* This prepares a dma request. Returns 0 if all went okay, returns 1
* otherwise. May also be invoked from trm290.c
*/
-int udma_new_table(struct ata_device *drive, struct request *rq)
+int udma_new_table(struct ata_channel *ch, struct request *rq)
{
- struct ata_channel *ch = drive->channel;
unsigned int *table = ch->dmatable_cpu;
#ifdef CONFIG_BLK_DEV_TRM290
unsigned int is_trm290_chipset = (ch->chipset == ide_trm290);
@@ -320,7 +318,7 @@
int i;
struct scatterlist *sg;

- ch->sg_nents = i = build_sglist(drive, rq);
+ ch->sg_nents = i = build_sglist(ch, rq);
if (!i)
return 0;

--- linux/drivers/ide/hpt34x.c 2002-06-03 00:44:58.000000000 -0700
+++ linux-2.5.20/drivers/ide/hpt34x.c 2002-06-02 18:44:41.000000000 -0700
@@ -253,7 +253,7 @@
unsigned int count;
u8 cmd;

- if (!(count = udma_new_table(drive, rq)))
+ if (!(count = udma_new_table(ch, rq)))
return 1; /* try PIO instead of DMA */

if (rq_data_dir(rq) == READ)
--- linux/drivers/ide/trm290.c 2002-06-03 00:44:37.000000000 -0700
+++ linux-2.5.20/drivers/ide/trm290.c 2002-06-02 18:44:43.000000000 -0700
@@ -217,7 +217,7 @@
writing = 0;
}

- if (!(count = udma_new_table(drive, rq))) {
+ if (!(count = udma_new_table(ch, rq))) {
trm290_prepare_drive(drive, 0); /* select PIO xfer */
return 1; /* try PIO instead of DMA */
}
--- linux/drivers/ide/icside.c 2002-06-03 00:46:21.000000000 -0700
+++ linux-2.5.20/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
@@ -275,9 +275,8 @@
#define NR_ENTRIES 256
#define TABLE_SIZE (NR_ENTRIES * 8)

-static int ide_build_sglist(struct ata_device *drive, struct request *rq)
+static int ide_build_sglist(struct ata_channel *ch, struct request *rq)
{
- struct ata_channel *ch = drive->channel;
struct scatterlist *sg = ch->sg_table;
int nents;

@@ -295,7 +294,7 @@
sg->length = rq->nr_sectors * SECTOR_SIZE;
nents = 1;
} else {
- nents = blk_rq_map_sg(&drive->queue, rq, sg);
+ nents = blk_rq_map_sg(rq->q, rq, sg);

if (rq->q && nents > rq->nr_phys_segments)
printk("icside: received %d segments, build %d\n",

2002-06-04 22:19:15

by Russell King

[permalink] [raw]
Subject: Re: IDE{,-SCSI} trouble [2.5.20]

On Tue, Jun 04, 2002 at 02:37:55PM -0700, Adam J. Richter wrote:
> --- linux/drivers/ide/icside.c 2002-06-03 00:46:21.000000000 -0700
> +++ linux-2.5.20/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
> @@ -275,9 +275,8 @@
> #define NR_ENTRIES 256
> #define TABLE_SIZE (NR_ENTRIES * 8)
>
> -static int ide_build_sglist(struct ata_device *drive, struct request *rq)
> +static int ide_build_sglist(struct ata_channel *ch, struct request *rq)
> {
> - struct ata_channel *ch = drive->channel;
> struct scatterlist *sg = ch->sg_table;
> int nents;
>

Umm, you sure this is right? ide_build_sglist takes an ata_channel
argument in my 2.5.20.

If this is reversed, you also forgot to change where it is used in
icside.c. Besides that, one of the recent changes from Martin broke
the driver rather disgustingly again. The following patch fixes this
breakage such that:

1. we won't issue *WRITE* commands to the drive when trying to
*READ* data from said drive.
2. allows icside.c to build again.

Luckily, (2) prevents its use in 2.5.20, which is a god-send given the
effects that (1) could have. Martin, could you please be more careful
with your editing in future? Maybe using '#error' if you're not able
to fix stuff up properly?

--- orig/drivers/ide/icside.c Mon Jun 3 10:24:34 2002
+++ linux/drivers/ide/icside.c Mon Jun 3 11:54:42 2002
@@ -517,33 +517,6 @@
return 0;
}

-static int icside_dma_read(struct ata_device *drive, struct request *rq)
-{
- struct ata_channel *ch = drive->channel;
- unsigned int cmd;
-
- if (icside_dma_common(drive, rq, DMA_MODE_READ))
- return 1;
-
- if (drive->type != ATA_DISK)
- return 0;
-
- ide_set_handler(drive, icside_dmaintr, WAIT_CMD, NULL);
-
- if ((rq->flags & REQ_DRIVE_ACB) && drive->addressing == 1) {
- struct ata_taskfile *args = rq->special;
- cmd = args->taskfile.command;
- } else if (drive->addressing) {
- cmd = WIN_READDMA_EXT;
- } else {
- cmd = WIN_READDMA;
- }
-
- OUT_BYTE(cmd, IDE_COMMAND_REG);
- enable_dma(ch->hw.dma);
- return 0;
-}
-
static int icside_dma_init(struct ata_device *drive, struct request *rq)
{
struct ata_channel *ch = drive->channel;
@@ -559,11 +532,11 @@

if ((rq->flags & REQ_DRIVE_ACB) && drive->addressing == 1) {
struct ata_taskfile *args = rq->special;
- cmd = args->taskfile.command;
+ cmd = args->cmd;
} else if (drive->addressing) {
- cmd = WIN_WRITEDMA_EXT;
+ cmd = rq_data_dir(rq) == WRITE ? WIN_WRITEDMA_EXT : WIN_READDMA_EXT;
} else {
- cmd = WIN_WRITEDMA;
+ cmd = rq_data_dir(rq) == WRITE ? WIN_WRITEDMA : WIN_READDMA;
}
OUT_BYTE(cmd, IDE_COMMAND_REG);



--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-06-04 22:55:44

by Adam J. Richter

[permalink] [raw]
Subject: Re: IDE{,-SCSI} trouble [2.5.20]

Russell King wrote:
>On Tue, Jun 04, 2002 at 02:37:55PM -0700, Adam J. Richter wrote:
>> --- linux/drivers/ide/icside.c 2002-06-03 00:46:21.000000000 -0700
>> +++ linux-2.5.20/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
>> @@ -275,9 +275,8 @@
>> #define NR_ENTRIES 256
>> #define TABLE_SIZE (NR_ENTRIES * 8)
>>
>> -static int ide_build_sglist(struct ata_device *drive, struct request *rq)
>> +static int ide_build_sglist(struct ata_channel *ch, struct request *rq)
>> {
>> - struct ata_channel *ch = drive->channel;
>> struct scatterlist *sg = ch->sg_table;
>> int nents;
>>

>Umm, you sure this is right? ide_build_sglist takes an ata_channel
>argument in my 2.5.20.


Right. As the order of the file names in the diff confirms,
I accidentally submitted a diff in reverse order. You are also
correct about:

>If this is reversed, you also forgot to change where it is used in
>icside.c.

Russell: sorry for not cc'ing you in my original patch
submission to Martin. I infer that since you are adding another patch
of your own and adding Martin to the recipient list that it is OK with
you to volunteer Martin to combine your patch and mine in this case
and submit them to Linus.

Martin: unless you, Russell, or anyone else sees a problem
with this, could you please also apply the attached patch to icside.c,
which I should have included in my original submission. I missed my
error when I checked for compiler warnings, because icside is not
built on x86.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."


--- before/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
+++ linux/drivers/ide/icside.c 2002-06-04 15:31:21.000000000 -0700
@@ -491,7 +492,7 @@
*/
BUG_ON(dma_channel_active(ch->hw.dma));

- count = ch->sg_nents = ide_build_sglist(ch, rq);
+ count = ch->sg_nents = ide_build_sglist(drive, rq);
if (!count)
return 1;

2002-06-05 09:40:24

by Martin Dalecki

[permalink] [raw]
Subject: Re: IDE{,-SCSI} trouble [2.5.20]

Adam J. Richter wrote:
> Russell King wrote:
>
>>On Tue, Jun 04, 2002 at 02:37:55PM -0700, Adam J. Richter wrote:
>>
>>>--- linux/drivers/ide/icside.c 2002-06-03 00:46:21.000000000 -0700
>>>+++ linux-2.5.20/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
>>>@@ -275,9 +275,8 @@
>>> #define NR_ENTRIES 256
>>> #define TABLE_SIZE (NR_ENTRIES * 8)
>>>
>>>-static int ide_build_sglist(struct ata_device *drive, struct request *rq)
>>>+static int ide_build_sglist(struct ata_channel *ch, struct request *rq)
>>> {
>>>- struct ata_channel *ch = drive->channel;
>>> struct scatterlist *sg = ch->sg_table;
>>> int nents;
>>>
>>
>
>>Umm, you sure this is right? ide_build_sglist takes an ata_channel
>>argument in my 2.5.20.
>
>
>
> Right. As the order of the file names in the diff confirms,
> I accidentally submitted a diff in reverse order. You are also
> correct about:
>
>
>>If this is reversed, you also forgot to change where it is used in
>>icside.c.
>
>
> Russell: sorry for not cc'ing you in my original patch
> submission to Martin. I infer that since you are adding another patch
> of your own and adding Martin to the recipient list that it is OK with
> you to volunteer Martin to combine your patch and mine in this case
> and submit them to Linus.
>
> Martin: unless you, Russell, or anyone else sees a problem
> with this, could you please also apply the attached patch to icside.c,
> which I should have included in my original submission. I missed my
> error when I checked for compiler warnings, because icside is not
> built on x86.
>
> Adam J. Richter __ ______________ 575 Oroville Road
> [email protected] \ / Milpitas, California 95035
> +1 408 309-6081 | g g d r a s i l United States of America
> "Free Software For The Rest Of Us."
>
>
> --- before/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
> +++ linux/drivers/ide/icside.c 2002-06-04 15:31:21.000000000 -0700
> @@ -491,7 +492,7 @@
> */
> BUG_ON(dma_channel_active(ch->hw.dma));
>
> - count = ch->sg_nents = ide_build_sglist(ch, rq);
> + count = ch->sg_nents = ide_build_sglist(drive, rq);
> if (!count)
> return 1;

Yeep. Seems to be pretty "obviously correct".


2002-06-05 09:46:55

by Martin Dalecki

[permalink] [raw]
Subject: Re: IDE{,-SCSI} trouble [2.5.20]

Russell King wrote:
> On Tue, Jun 04, 2002 at 02:37:55PM -0700, Adam J. Richter wrote:
>
>>--- linux/drivers/ide/icside.c 2002-06-03 00:46:21.000000000 -0700
>>+++ linux-2.5.20/drivers/ide/icside.c 2002-06-02 18:44:41.000000000 -0700
>>@@ -275,9 +275,8 @@
>> #define NR_ENTRIES 256
>> #define TABLE_SIZE (NR_ENTRIES * 8)
>>
>>-static int ide_build_sglist(struct ata_device *drive, struct request *rq)
>>+static int ide_build_sglist(struct ata_channel *ch, struct request *rq)
>> {
>>- struct ata_channel *ch = drive->channel;
>> struct scatterlist *sg = ch->sg_table;
>> int nents;
>>
>
>
> Umm, you sure this is right? ide_build_sglist takes an ata_channel
> argument in my 2.5.20.
>
> If this is reversed, you also forgot to change where it is used in
> icside.c. Besides that, one of the recent changes from Martin broke
> the driver rather disgustingly again. The following patch fixes this
> breakage such that:
>
> 1. we won't issue *WRITE* commands to the drive when trying to
> *READ* data from said drive.
> 2. allows icside.c to build again.
>
> Luckily, (2) prevents its use in 2.5.20, which is a god-send given the
> effects that (1) could have. Martin, could you please be more careful
> with your editing in future? Maybe using '#error' if you're not able
> to fix stuff up properly?

Sorry this change had to be done. As you can see from the
rest of the associated diff it's actually simplifying stuff.
The goal which got me at this was to finally make it possible
to unify udma_init and udma_tcq_taksfile.

Unfortuantely I have apparently missed the remaining usage of
taskfile.command in icside.c during find -exec grep...
However my changelog documented the change so I hope at least that
it was "pretty obvious" how to accomodate. BTW.> Disaster
1. could not occur becouse I deleted the usage of the read
variant altogether.

Anyway as always: Thank you for cheerishing this out.


>
> --- orig/drivers/ide/icside.c Mon Jun 3 10:24:34 2002
> +++ linux/drivers/ide/icside.c Mon Jun 3 11:54:42 2002
> @@ -517,33 +517,6 @@
> return 0;
> }
>
> -static int icside_dma_read(struct ata_device *drive, struct request *rq)
> -{
> - struct ata_channel *ch = drive->channel;
> - unsigned int cmd;
> -
> - if (icside_dma_common(drive, rq, DMA_MODE_READ))
> - return 1;
> -
> - if (drive->type != ATA_DISK)
> - return 0;
> -
> - ide_set_handler(drive, icside_dmaintr, WAIT_CMD, NULL);
> -
> - if ((rq->flags & REQ_DRIVE_ACB) && drive->addressing == 1) {
> - struct ata_taskfile *args = rq->special;
> - cmd = args->taskfile.command;
> - } else if (drive->addressing) {
> - cmd = WIN_READDMA_EXT;
> - } else {
> - cmd = WIN_READDMA;
> - }
> -
> - OUT_BYTE(cmd, IDE_COMMAND_REG);
> - enable_dma(ch->hw.dma);
> - return 0;
> -}
> -
> static int icside_dma_init(struct ata_device *drive, struct request *rq)
> {
> struct ata_channel *ch = drive->channel;
> @@ -559,11 +532,11 @@
>
> if ((rq->flags & REQ_DRIVE_ACB) && drive->addressing == 1) {
> struct ata_taskfile *args = rq->special;
> - cmd = args->taskfile.command;
> + cmd = args->cmd;
> } else if (drive->addressing) {
> - cmd = WIN_WRITEDMA_EXT;
> + cmd = rq_data_dir(rq) == WRITE ? WIN_WRITEDMA_EXT : WIN_READDMA_EXT;
> } else {
> - cmd = WIN_WRITEDMA;
> + cmd = rq_data_dir(rq) == WRITE ? WIN_WRITEDMA : WIN_READDMA;
> }
> OUT_BYTE(cmd, IDE_COMMAND_REG);
>