LinuxLists.cc - Serverworks OSB4 in impossible state

2002-06-10 15:52:30

Subject: Serverworks OSB4 in impossible state

Hello,

I know a similar problem was discussed here a short while ago.
However we have here a situation where we can reproduce the problem
reliably. This is a RedHat 2.4.18-4 kernel.

We have a CD with a corrupt last block. If we try to read this block in
PIO mode (hdparm -d 0 /dev/hdc) , we get errors like in the first
attachment.

The machine has only a CDROM (Mitsumi FX 4830T) attached to the IDE bus
as /dev/hdc. We used no IDE-related boot parameters.

If we read the block in DMA mode (with dd), the machine stalls with the
"impossible state" message.

A PCI bus scan reveals that the IO register (dma_base+2) contains indeed
0xa5 (bit 0 set), which leads to the panic. Normally the read on that
register returns 0xa0.

We see in our PCI bus scan that a successful DMA of 4096 bytes was
carried out ~23ms before the stall condition. Another 4096 byte request
was scheduled but never seen. Between the successful DMA and the stall
condition we see nothing but a few timer interrupts.
Then an IDE interrupt occurs, which leads immediately to the panic.

The CD-ROM drive certainly reports some sort of error like in the PIO
case when trying to access the last block. This seems to be the
(indirect) reason why the Bus master bit in (dma_base+2) remains set
long after the DMA is finished.

Any ideas/comments?

Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

Attachments:

hdc-msgs (3.45 kB): Kernel error messages in PIO-mode
dmesg (14.98 kB)
settings (1.15 kB): /proc/ide/ide1/hdc/settings
svwks (771.00 B): /proc/ide/svwks

2002-06-10 16:42:06

by Daniela Engert

Subject: Re: Serverworks OSB4 in impossible state

Hello Martin,

On 10 Jun 2002 17:52:58 +0200, Martin Wilck wrote:

>We have a CD with a corrupt last block. If we try to read this block in
>PIO mode (hdparm -d 0 /dev/hdc) , we get errors like in the first
>attachment.

The error code returned is "check condition" with a sense key of 3
"medium error". The most appropriate driver action would have been to
issue a "request sense" command to learn the precise error and retry
only in case of a good chance of a recoverable problem - but that's a
different story.

>If we read the block in DMA mode (with dd), the machine stalls with the
>"impossible state" message.
>
>A PCI bus scan reveals that the IO register (dma_base+2) contains indeed
>0xa5 (bit 0 set), which leads to the panic. Normally the read on that
>register returns 0xa0.

The interesting bits of the DMA status register are bits 0 through 2. A
value of 5 indicates the condition "interrupt from unit, DMA state
machine active". This is a valid status! It basically means the unit
issued an interrupt before the PRD table was exhausted. This makes sense
because the CD-ROM unit fails to transfer the amount of data described
by the PRD table because of the non-recoverable read error.
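
As an illustration, the status byte read from (dma_base+2) could be decoded
along these lines. This is only a sketch: the macro and function names are
made up for the example, and bit 1 is assumed to be the host-side error bit
as in the usual bus-master IDE register layout. The upper bits (0xa0 in both
readings quoted above) are not part of the transfer state and are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the low three bits matter for the transfer state. */
#define BM_STATUS_ACTIVE    0x01u  /* bit 0: DMA state machine active */
#define BM_STATUS_ERROR     0x02u  /* bit 1: host-side transfer error (assumed layout) */
#define BM_STATUS_INTERRUPT 0x04u  /* bit 2: interrupt from unit */

struct bm_status {
    bool active;
    bool error;
    bool interrupt;
};

/* Split the raw status byte into its three transfer-state flags. */
struct bm_status bm_decode_status(uint8_t raw)
{
    struct bm_status s;
    s.active    = (raw & BM_STATUS_ACTIVE) != 0;
    s.error     = (raw & BM_STATUS_ERROR) != 0;
    s.interrupt = (raw & BM_STATUS_INTERRUPT) != 0;
    return s;
}

/* True for the "value of 5" case: the unit interrupted while the DMA
 * engine was still active, i.e. before the PRD table was exhausted,
 * and the host itself reports no error. */
bool bm_transfer_cut_short(uint8_t raw)
{
    uint8_t state = raw & (BM_STATUS_ACTIVE | BM_STATUS_ERROR | BM_STATUS_INTERRUPT);
    return state == (BM_STATUS_ACTIVE | BM_STATUS_INTERRUPT);
}
```

With the two readings from this thread, bm_transfer_cut_short(0xa5) is true
and bm_transfer_cut_short(0xa0) is false.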

>We see in our PCI bus scan that a successful DMA of 4096 bytes was
>carried out ~23ms before the stall condition. Another 4096 byte request
>was scheduled but never seen. Between the successful DMA and the stall
>condition we see nothing but a few timer interrupts.
>Then an IDE interrupt occurs, which leads immediately to the panic.

What you see makes sense (the next DMA transfer is scheduled but never
carried out by the CD-ROM unit) except for the panic, of course. The
correct driver action in this case would be to stop the DMA engine and
issue a reset of the state machines involved (both on the host and
the unit side).

>Any ideas/comments?

I hope this clears up things a little ...

Ciao,
Dani

2002-06-11 07:21:37

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Mon, 2002-06-10 um 18.41 schrieb Daniela Engert:

> The interesting bits of the DMA status register are bits 0 through 2. A
> value of 5 indicates the condition "interrupt from unit, DMA state
> machine active". This is a valid status! It basically means the unit
> issued an interrupt before the PRD table was exhausted. This makes sense
> because the CD-ROM unit fails to transfer the amount of data described
> by the PRD table because of the non-recoverable read error.

Shouldn't the error bit be set too? (But that wouldn't make any
difference with the current driver ...)

> What you see makes sense (the next DMA transfer is scheduled but never
> carried out by the CD-ROM unit) except for the panic, of course. The
> correct driver action in this case would be to stop the DMA engine and
> issue a reset of the state machines involved (both on the host and
> the unit side).

The message, the comments in the code, and what Alan wrote here:
http://groups.google.com/groups?hl=de&lr=&threadm=linux.kernel.Pine.LNX.4.31.0206031234370.12103-100000%40boxer.fnal.gov&rnum=2&prev=/groups%3Fq%3Dosb4-bug%2540ide.cabal.tm%26hl%3Dde%26lr%3D%26selm%3Dlinux.kernel.Pine.LNX.4.31.0206031234370.12103-100000%2540boxer.fnal.gov%26rnum%3D2
suggest that trying to recover from this condition is extremely
dangerous (note that the kernel doesn't even panic(), because
a sync() may kill a disk, the comments say).

Anyway, thanks a lot for your insightful comments.
Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-11 07:45:23

by Daniela Engert

Subject: Re: Serverworks OSB4 in impossible state

On 11 Jun 2002 09:22:24 +0200, Martin Wilck wrote:

>Am Mon, 2002-06-10 um 18.41 schrieb Daniela Engert:

>> The interesting bits of the DMA status register are bits 0 through 2. A
>> value of 5 indicates the condition "interrupt from unit, DMA state
>> machine active". This is a valid status! It basically means the unit
>> issued an interrupt before the PRD table was exhausted. This makes sense
>> because the CD-ROM unit fails to transfer the amount of data described
>> by the PRD table because of the non-recoverable read error.
>
>Shouldn't the error bit be set too? (But that wouldn't make any
>difference with the current driver ...)

No it shouldn't. The error is happening on the unit side and not on the
host side of the bus. Thus it is correct that the host is *not*
reporting an error (which is true) but only the CD-ROM unit.

>> What you see makes sense (the next DMA transfer is scheduled but never
>> carried out by the CD-ROM unit) except for the panic, of course. The
>> correct driver action in this case would be to stop the DMA engine and
>> issue a reset of the state machines involved (both on the host and
>> the unit side).
>
>The message, the comments in the code, and what Alan wrote here:
>http://groups.google.com/groups?hl=de&lr=&threadm=linux.kernel.Pine.LNX.4.31.0206031234370.12103-100000%40boxer.fnal.gov&rnum=2&prev=/groups%3Fq%3Dosb4-bug%2540ide.cabal.tm%26hl%3Dde%26lr%3D%26selm%3Dlinux.kernel.Pine.LNX.4.31.0206031234370.12103-100000%2540boxer.fnal.gov%26rnum%3D2
>suggest that trying to recover from this condition is extremely
>dangerous (note that the kernel doesn't even panic(), because
>a sync() may kill a disk, the comments say).

I'm aware of all of that. By pure chance I have a machine with an OSB4
sitting on my desk for a couple of days. Maybe I can find a defective
CD-ROM to test it with my driver and see if it manages to recover from
errors like these. Hopefully, the PCI tracer gives some more insight.

Ciao,
Dani

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniela Engert, systems engineer at MEDAV GmbH
Gräfenberger Str. 34, 91080 Uttenreuth, Germany
Phone ++49-9131-583-348, Fax ++49-9131-583-11

2002-06-11 08:36:48

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Die, 2002-06-11 um 09.45 schrieb Daniela Engert:

> I'm aware of all of that. By pure chance I have a machine with an OSB4
> sitting on my desk for a couple of days. Maybe I can find a defective
> CD-ROM to test it with my driver and see if it manages to recover from
> errors like these. Hopefully, the PCI tracer gives some more insight.

Do you have a custom version of the driver (because you write "my
driver")? If yes, can you send it, so that I can test it, too?

Can you point me to any reference material on the web?

Martin
--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-11 11:24:40

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

[Alan, I am cc'ing you on this because I read elsewhere that you want
[email protected] to be forwarded to you, and that address still
bounces].

I have tried the following:

- comment out the code that stalls the machine when the condition in
question is encountered.
- run dd over a couple of good blocks on the CD.
- run dd over the corrupted blocks. This leads now to very similar
errors as in the PIO case.
- reenable DMA with hdparm, because it is automatically disabled by the
ide-cd driver if an error occurs (why that? the error has nothing to
do with DMA here).
- repeat the first dd command on the good blocks and compare the
results.

The results are identical, thus I cannot verify the "4 byte shift" Alan
has been talking about. Of course this is a CD-ROM only scenario, thus
I can't tell anything about hard disks.
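
The comparison step could be automated with something like the sketch below.
It assumes the "4 byte shift" would show up as the same data displaced by a
fixed number of bytes between two reads of the same blocks; that reading of
the bug, and all names in the code, are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum read_compare { READS_IDENTICAL, READS_SHIFTED, READS_DIFFERENT };

/* Compare a reference read (ref) with a second read (again) of the same
 * blocks. READS_SHIFTED means the overlapping part of the two buffers
 * matches once one of them is displaced by `shift` bytes in either
 * direction (shift would be 4 for the problem discussed here). */
enum read_compare compare_reads(const unsigned char *ref,
                                const unsigned char *again,
                                size_t len, size_t shift)
{
    if (memcmp(ref, again, len) == 0)
        return READS_IDENTICAL;

    if (shift > 0 && shift < len) {
        size_t overlap = len - shift;
        if (memcmp(ref, again + shift, overlap) == 0 ||
            memcmp(ref + shift, again, overlap) == 0)
            return READS_SHIFTED;
    }
    return READS_DIFFERENT;
}
```

Note that buffers filled with constant data (for example all zeros) cannot
reveal a shift, so the test blocks should contain varied data.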

Is it possible that the 4-byte shift occurs only with some particular
(older?) version of the chipset?

In any case, the condition that usually causes Linux to stall is
indeed a perfectly valid condition for DMA when the device transfers
less data than it's supposed to. I doubt that hanging the system
without more detailed checks is the right measure to take there.

Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-11 21:27:49

by Chris Wedgwood

Subject: Re: Serverworks OSB4 in impossible state

On Tue, Jun 11, 2002 at 01:25:25PM +0200, Martin Wilck wrote:

> Is it possible that the 4-byte shift occurs only with some
> particular (older?) version of the chipset?

Maybe.

I have an oldish OSB4 here and beating on it only with the CDROM
(disks are all SCSI) I don't ever seem to see this problem:

00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
Flags: bus master, medium devsel, latency 48

00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
Flags: bus master, medium devsel, latency 48

I think what is really required is input from ServerWorks/Broadcom
about this.

--cw

2002-06-12 07:23:50

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Die, 2002-06-11 um 23.27 schrieb Chris Wedgwood:

> I have an oldish OSB4 here and beating on it only with the CDROM
> (disks are all SCSI) I don't ever seem to see this problem:

UDMA33 mode? You need to have a broken CD (we happen to have a CD burner
that generates broken CDs)

> I think what is really required is input from ServerWorks/Broadcom
> about this.

Yeah, we are in contact with them.
Thanks,
Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-12 08:36:53

by Alan

Subject: Re: Serverworks OSB4 in impossible state

Triggering the check on csb5/csb6 would be a bug - maybe an extra
test is needed there as CSB5/6 are fine

2002-06-12 08:47:11

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Mit, 2002-06-12 um 10.58 schrieb Alan Cox:
> Triggering the check on csb5/csb6 would be a bug - maybe an extra
> test is needed there as CSB5/6 are fine

Currently the stall is triggered if the DMA engine active bit is set, no
further conditions.

Would you concur that it would be reasonable to trigger only if

- the chipset version is < CSB5,
- the drive is a hard disk,
- and the drive did not report an error?

(I am not certain about the last condition, but from the descriptions
of the 4-byte-shift problem I have seen I infer that there was no drive
error condition involved).
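
Written out as a stand-alone predicate, the proposed rule might look like the
following sketch. The types and names are hypothetical and are not the
kernel's own structures; they only model the three conditions listed above
plus the existing check of the DMA engine active bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types, ordered so that older chipsets compare lower. */
enum chipset { CHIPSET_OSB4, CHIPSET_CSB5, CHIPSET_CSB6 };
enum media   { MEDIA_DISK, MEDIA_CDROM };

struct dma_end_state {
    enum chipset chipset;
    enum media   media;
    uint8_t      bm_status;    /* value read from dma_base+2 */
    bool         drive_error;  /* the drive itself reported an error */
};

/* Stall only if the engine is still active on a pre-CSB5 chipset,
 * the device is a hard disk, and the drive reported no error. */
bool should_stall(const struct dma_end_state *st)
{
    return st->chipset < CHIPSET_CSB5 &&
           st->media == MEDIA_DISK &&
           (st->bm_status & 0x01u) != 0 &&
           !st->drive_error;
}
```

Under this rule the CD-ROM case from the start of the thread (status 0xa5,
drive reporting a medium error) would no longer stall the machine.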

Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-12 08:53:02

by Alan

Subject: Re: Serverworks OSB4 in impossible state

> Would you concur that it would be reasonable to trigger only if
>
> - the chipset version is < CSB5,
> - the drive is a hard disk,
> - and the drive did not report an error?
>
> (I am not certain about the last condition, but from the descriptions
> of the 4-byte-shift problem I have seen I infer that there was no drive
> error condition involved).

Entirely agreed

2002-06-12 10:30:25

by Martin Wilck

Subject: OSB4 PATCH (was: Re: Serverworks OSB4 in impossible state)

Am Mit, 2002-06-12 um 11.14 schrieb Alan Cox:
> Entirely agreed

I propose this patch to remedy the problem.

I don't know how to test if the drive is a seagate drive, and
I think we don't want to do that, because it would end up in yet another
blacklist.

I cannot test if this behaves correctly on machines that do expose the
4-byte shift bug - it would be great if somebody could test that.

Martin

--- drivers/ide/serverworks.c.orig Tue Jun 11 11:24:59 2002
+++ drivers/ide/serverworks.c Wed Jun 12 12:00:36 2002
@@ -547,7 +547,13 @@
         ide_hwif_t *hwif = HWIF(drive);
         unsigned long dma_base = hwif->dma_base;
 
-        if(inb(dma_base+0x02)&1)
+        /* If it's a disk on the OSB4, the DMA engine is still on,
+           and the device reports no error status, we are probably
+           facing the "4 byte shift" problem */
+        if(drive->media == ide_disk &&
+           hwif->pci_dev->device == PCI_DEVICE_ID_SERVERWORKS_OSB4IDE &&
+           inb(dma_base+0x02)&1 &&
+           OK_STAT (GET_STAT(), DRIVE_READY, BAD_STAT))
         {
 #if 0
             int i;

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-12 20:27:50

by Christian Zoffoli

Subject: Re: OSB4 PATCH (was: Re: Serverworks OSB4 in impossible state)

Martin Wilck wrote:
> Am Mit, 2002-06-12 um 11.14 schrieb Alan Cox:
> > Entirely agreed
>
> I propose this patch to remedy the problem.
>
> I don't know how to test if the drive is a seagate drive, and
> I think we don't want to do that, because it would end up in yet another
> blacklist.
>
> I cannot test if this behaves correctly on machines that do expose the
> 4-byte shift bug - it would be great if somebody could test that.
>
> Martin
>
> --- drivers/ide/serverworks.c.orig Tue Jun 11 11:24:59 2002
> +++ drivers/ide/serverworks.c Wed Jun 12 12:00:36 2002
> @@ -547,7 +547,13 @@
>          ide_hwif_t *hwif = HWIF(drive);
>          unsigned long dma_base = hwif->dma_base;
>
> -        if(inb(dma_base+0x02)&1)
> +        /* If it's a disk on the OSB4, the DMA engine is still on,
> +           and the device reports no error status, we are probably
> +           facing the "4 byte shift" problem */
> +        if(drive->media == ide_disk &&
> +           hwif->pci_dev->device == PCI_DEVICE_ID_SERVERWORKS_OSB4IDE &&
> +           inb(dma_base+0x02)&1 &&
> +           OK_STAT (GET_STAT(), DRIVE_READY, BAD_STAT))
>          {
>  #if 0
>              int i;
>
>

It works for me... I have a Supermicro 370DE6 (ServerWorks HE-SL) and a
Maxtor HD (5T030H3).

Christian

2002-06-13 11:50:20

by Daniela Engert

Subject: Re: Serverworks OSB4 in impossible state

Hi,

as promised I've conducted a test similar to Martin's to check the
behaviour of a Serverworks ROSB4 IDE controller in case of an aborted
ATAPI DMA transfer (probably due to a media error). In fact, I've done
this by comparing it with a known-to-be-good system, a dual processor
Intel BX based board with a PIIX4 IDE controller chip.

The following trace shows how it should be:

- lines 158-172: setup DMA transfer, send command packet
- lines 173-174: the DMA engine loads the first (of multiple)
PRD entry
- the actual DMA induced memory writes are not shown here
- line 175: IRQ14 is acknowledged
- lines 176-181: gather unit and DMA status
- lines 182-210: issue "request sense" and get sense status

CD-ROM read error on Intel PIIX4:

______Time_______Burst_BE#__Wait___Command_Address__Data____
158 20.089ms . 1011 . I/OWri 000001F6 ..B0....
159 1.656us . 1011 . I/ORd 000003F6 ..50....
160 10.08us . 0000 . I/OWri 0000F004 00EAC800
161 812.7ns . 1110 . I/OWri 0000F000 ......08
162 752.5ns . 1011 . I/OWri 0000F002 ..46....
163 3.100us . 1110 . I/OWri 000001F4 ......FF
164 4.214us . 1101 . I/OWri 000001F5 ....FF..
165 4.244us . 1101 . I/OWri 000001F1 ....01..
166 4.214us . 0111 . I/OWri 000001F7 A0......
167 5.027us . 1011 . I/ORd 000003F6 ..58....
168 5.448us . 1011 . I/OWri 0000F002 ..46....
169 752.5ns . 1110 . I/OWri 0000F000 ......09
170 1.204us . 0000 . I/OWri 000001F0 00000028
171 903.0ns . 0000 . I/OWri 000001F0 0000440D
172 903.0ns . 0000 . I/OWri 000001F0 0000001F
173 3.673s Start 0000 . MemRd 00EAC800 006E3000
174 30.1ns B 0000 . MemRd 00EAC800 0000D000
175 648.99ms . 1110 . IntAck ........ ......76
176 5.779us . 0111 . I/ORd 000001F7 51......
177 11.47us . 0111 . I/ORd 000001F7 51......
178 1.324us . 1011 . I/ORd 000001F2 ..03....
179 1.957us . 1110 . I/OWri 0000F000 ......08
180 812.7ns . 1011 . I/ORd 0000F002 ..44....
181 1.355us . 1101 . I/ORd 000001F1 ....30..
182 9.361us . 1011 . I/OWri 000001F6 ..B0....
183 1.806us . 1011 . I/ORd 000003F6 ..51....
184 9.301us . 1110 . I/OWri 000001F4 ......12
185 4.274us . 1101 . I/OWri 000001F5 ....00..
186 4.244us . 1101 . I/OWri 000001F1 ....00..
187 4.214us . 0111 . I/OWri 000001F7 A0......
188 4.906us . 1011 . I/ORd 000003F6 ..58....
189 6.020us . 0000 . I/OWri 000001F0 00000003
190 903.0ns . 0000 . I/OWri 000001F0 00000012
191 903.0ns . 0000 . I/OWri 000001F0 00000000
192 258.17us . 1110 . IntAck ........ ......76
193 3.431us . 0111 . I/ORd 000001F7 58......
194 10.08us . 0111 . I/ORd 000001F7 58......
195 1.204us . 1011 . I/ORd 000001F2 ..02....
196 1.535us . 1101 . I/ORd 000001F5 ....00..
197 1.174us . 1110 . I/ORd 000001F4 ......12
198 10.20us . 1100 . I/ORd 000001F0 ....0070
199 1.475us . 1100 . I/ORd 000001F0 ....0003
200 632.1ns . 1100 . I/ORd 000001F0 ....0000
201 632.1ns . 1100 . I/ORd 000001F0 ....0A00
202 602.0ns . 1100 . I/ORd 000001F0 ....0000
203 632.1ns . 1100 . I/ORd 000001F0 ....0000
204 602.0ns . 1100 . I/ORd 000001F0 ....0611
205 632.1ns . 1100 . I/ORd 000001F0 ....0000
206 602.0ns . 1100 . I/ORd 000001F0 ....0000
207 12.79us . 1110 . IntAck ........ ......76
208 3.401us . 0111 . I/ORd 000001F7 50......
209 9.361us . 0111 . I/ORd 000001F7 50......
210 1.234us . 1011 . I/ORd 000001F2 ..03....
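
The sense data returned in lines 198-206 of this trace can be decoded by
hand or with a small helper like the sketch below. It assumes the 16-bit
words read from the data port (1F0) carry the low byte first and that the
drive returns fixed-format sense data (sense key in byte 2, ASC in byte 12,
ASCQ in byte 13); the function and struct names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct sense_summary {
    uint8_t response_code;
    uint8_t sense_key;
    uint8_t asc;
    uint8_t ascq;
};

/* Unpack 16-bit data-port words into a byte stream, low byte first.
 * `out` must have room for 2 * nwords bytes. */
void words_to_bytes(const uint16_t *words, size_t nwords, uint8_t *out)
{
    for (size_t i = 0; i < nwords; i++) {
        out[2 * i]     = (uint8_t)(words[i] & 0xffu);
        out[2 * i + 1] = (uint8_t)(words[i] >> 8);
    }
}

/* Pick the interesting fields out of fixed-format sense data.
 * `b` must hold at least 14 bytes. */
struct sense_summary decode_sense(const uint8_t *b)
{
    struct sense_summary s;
    s.response_code = b[0] & 0x7fu;
    s.sense_key     = b[2] & 0x0fu;
    s.asc           = b[12];
    s.ascq          = b[13];
    return s;
}
```

Feeding in the nine words from the trace (0070, 0003, 0000, 0A00, 0000, 0000,
0611, 0000, 0000) gives response code 70h, sense key 3 (medium error), ASC
11h and ASCQ 06h. ASC 11h belongs to the unrecovered read error family, which
fits the corrupt block on the disc.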

And here is the same with the ROSB4. This time, some of the
DMA writes are shown. After loading the second PRD entry
which describes a memory region of 7800h bytes, 3000h bytes
are transferred before IRQ14 is asserted. The IRQ14 INTACK
cycle is the last transaction on the PCI bus ever, the
machine is completely frozen!

CD-ROM read error on ServerWorks ROSB4 revision 0:

______Time_______Burst_BE#__Wait___Command_Address__Data____
51316 297.63us . 1011 . I/OWri 000001F6 ..B0....
51317 1.530us . 1011 . I/ORd 000003F6 ..50....
51318 6.300us . 0000 . I/OWri 00005404 00EF2800
51319 450ns . 1110 . I/OWri 00005400 ......08
51320 450ns . 1011 . I/OWri 00005402 ..66....
51321 1.440us . 1110 . I/OWri 000001F4 ......FF
51322 3.480us . 1101 . I/OWri 000001F5 ....FF..
51323 3.480us . 1101 . I/OWri 000001F1 ....01..
51324 3.510us . 0111 . I/OWri 000001F7 A0......
51325 4.470us . 1011 . I/ORd 000003F6 ..58....
51326 4.620us . 1011 . I/OWri 00005402 ..66....
51327 660ns . 0000 . I/OWri 000001F0 00000028
51328 420ns . 0000 . I/OWri 000001F0 0000F80D
51329 420ns . 0000 . I/OWri 000001F0 0000001F
51330 1.290us . 1011 . I/ORd 000003F6 ..D0....
51331 3.660us . 1110 . I/OWri 00005400 ......09
51332 1.290us . 0000 . MemRd 00EF2800 00B08000
51333 630ns . 0000 . MemRd 00EF2804 00008000
51334 166.11us Start 0000 . MemWri 00B08000 7BC0728C
51335 30ns B 0000 . MemWri 00B08000 285DA7D0
51336 30ns B 0000 . MemWri 00B08000 9FAE557A
51337 30ns B 0000 . MemWri 00B08000 B3F88165
51338 30ns B 0000 . MemWri 00B08000 BDFD7823
51339 30ns B 0000 . MemWri 00B08000 42ED22D0
51340 30ns B 0000 . MemWri 00B08000 7BA5743F
51341 30ns B 0000 . MemWri 00B08000 6B5897BA
51342 780ns Start 0000 . MemWri 00B08020 ACF1D36B
..
..
59518 930ns Start 0000 . MemWri 00B0FFE0 845971B8
59519 30ns B 0000 . MemWri 00B0FFE0 7E325F95
59520 30ns B 0000 . MemWri 00B0FFE0 7ADA36D0
59521 30ns B 0000 . MemWri 00B0FFE0 96BD435C
59522 30ns B 0000 . MemWri 00B0FFE0 4ED88CB0
59523 30ns B 0000 . MemWri 00B0FFE0 2E1CCAF7
59524 30ns B 0000 . MemWri 00B0FFE0 FC8782B3
59525 30ns B 0000 . MemWri 00B0FFE0 9C0A2335
59526 780ns . 0000 . MemRd 00EF2808 00B10000
59527 630ns . 0000 . MemRd 00EF280C 80007800
59528 1.2518ms Start 0000 . MemWri 00B10000 E85C33CD
59529 30ns B 0000 . MemWri 00B10000 AD2F9613
59530 30ns B 0000 . MemWri 00B10000 D8BEC924
59531 30ns B 0000 . MemWri 00B10000 E273C0BD
59532 30ns B 0000 . MemWri 00B10000 DC655F5E
59533 30ns B 0000 . MemWri 00B10000 69B3087B
59534 30ns B 0000 . MemWri 00B10000 369B26D1
59535 30ns B 0000 . MemWri 00B10000 9A8C47DF
59536 780ns Start 0000 . MemWri 00B10020 3F026EA5
..
..
62592 750ns Start 0000 . MemWri 00B12FE0 367016E1
62593 30ns B 0000 . MemWri 00B12FE0 35654905
62594 30ns B 0000 . MemWri 00B12FE0 9968FF02
62595 30ns B 0000 . MemWri 00B12FE0 9ABB5CAE
62596 30ns B 0000 . MemWri 00B12FE0 D32DF135
62597 30ns B 0000 . MemWri 00B12FE0 7A03326A
62598 30ns B 0000 . MemWri 00B12FE0 86CCE8BF
62599 30ns B 0000 . MemWri 00B12FE0 D4E66D21
62600 1.176s . 1110 . IntAck ........ ......76
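
The PRD entries fetched in this trace (lines 51332-51333 and 59526-59527) can
be read with a helper like the one sketched here. It assumes the common
bus-master IDE PRD layout: the first dword is the memory base address, the low
16 bits of the second dword are the byte count (0 meaning 64 KiB), and bit 31
marks the end of the table. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct prd_entry {
    uint32_t base;          /* start address of the memory region */
    uint32_t byte_count;    /* size of the region in bytes */
    bool     end_of_table;  /* set on the last entry */
};

/* Decode one PRD entry from the two dwords the controller fetches. */
struct prd_entry prd_decode(uint32_t dword0, uint32_t dword1)
{
    struct prd_entry e;
    uint32_t count = dword1 & 0xffffu;
    e.base         = dword0;
    e.byte_count   = count ? count : 0x10000u;  /* 0 encodes 64 KiB */
    e.end_of_table = (dword1 & 0x80000000u) != 0;
    return e;
}

/* Bytes written into the region so far, given the address just past
 * the last DMA write seen on the bus. */
uint32_t prd_bytes_done(const struct prd_entry *e, uint32_t next_addr)
{
    return next_addr - e->base;
}
```

For the second entry (00B10000, 80007800) this yields a 7800h-byte region
with the end-of-table flag set. The last burst starts at 00B12FE0 and writes
eight dwords, so the next address would be 00B13000 and prd_bytes_done()
returns 3000h, leaving 4800h bytes of the region untransferred.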

My conclusion: don't do ATAPI DMA on a serverworks ROSB4 revision 0 IDE
controller.

Ciao,
Dani

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniela Engert, systems engineer at MEDAV GmbH
Gräfenberger Str. 34, 91080 Uttenreuth, Germany
Phone ++49-9131-583-348, Fax ++49-9131-583-11

2002-06-13 11:58:34

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Don, 2002-06-13 um 13.50 schrieb Daniela Engert:

> And here is the same with the ROSB4. This time, some of the
> DMA writes are shown. After loading the second PRD entry
> which describes a memory region of 7800h bytes, 3000h bytes
> are transferred before IRQ14 is asserted. The IRQ14 INTACK
> cycle is the last transaction on the PCI bus ever, the
> machine is completely frozen!

You say (dma_base+2) is never read?
Was that a Linux system? If yes, I assume you never saw "OSB4 in
impossible state ..." ?

Martin

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-13 12:04:50

by Daniela Engert

Subject: Re: Serverworks OSB4 in impossible state

On 13 Jun 2002 13:59:06 +0200, Martin Wilck wrote:

>Am Don, 2002-06-13 um 13.50 schrieb Daniela Engert:

>> are transferred before IRQ14 is asserted. The IRQ14 INTACK
>> cycle is the last transaction on the PCI bus ever, the
>> machine is completely frozen!
>
>You say (dma_base+2) is never read?

Exactly. I've checked this twice; the PCI tracer was configured to gather
*all* PCI bus events.

>Was that a Linux system?

No, I think this doesn't matter here at all, because the hardware
stalls completely - full stop.

Ciao,
Dani

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniela Engert, systems engineer at MEDAV GmbH
Gräfenberger Str. 34, 91080 Uttenreuth, Germany
Phone ++49-9131-583-348, Fax ++49-9131-583-11

2002-06-13 12:51:46

by Martin Wilck

Subject: Re: Serverworks OSB4 in impossible state

Am Don, 2002-06-13 um 14.32 schrieb Daniela Engert:

> I have no idea if the same is happening in case of an aborted ATA DMA
> transfer (I have no bad disk around), but at least I will disable ATAPI
> DMA transfers in my driver in case of early revision (whatever this is)
> OSB4 systems - possibly on all OSB4 systems. According to your
> experiences, the CSB5 and later seem to be fine.

Sorry, bad wording. I meant "OSB4" as opposed to "CSB5/6".

--
Martin Wilck Phone: +49 5251 8 15113
Fujitsu Siemens Computers Fax: +49 5251 8 20409
Heinz-Nixdorf-Ring 1 mailto:[email protected]
D-33106 Paderborn http://www.fujitsu-siemens.com/primergy

2002-06-13 23:50:02

by Nerijus Baliūnas

Subject: Re[2]: Serverworks OSB4 in impossible state

On Thu, 13 Jun 2002 13:50:25 +0200 (CDT) Daniela Engert <[email protected]> wrote:

> My conclusion: don't do ATAPI DMA on a serverworks ROSB4 revision 0 IDE
> controller.

How can I find the revision? I have a problem with (Seagate) hdds, but lspci -v
only shows:

00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 51)
Subsystem: ServerWorks OSB4 South Bridge
Flags: bus master, medium devsel, latency 0

00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller (prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 64
I/O ports at 2000 [size=16]

Regards,
Nerijus

2002-06-13 18:27:52

by rico-linux-kernel

Subject: Re: Serverworks OSB4 in impossible state

Thanks for investing time on the logic analyser, Dani. My experience
is slightly different.

I have several mainboards (Tyan S1867) with older chipsets from
ServerWorks (f.k.a. Reliance). The IDE controller (OSB4 rev 0) is used
daily with ATAPI CDRW drives in UDMA(33) Mode. System handles read/write
errors without problem.

The system will lock solid when both IDE channels are accessed,
and either one is using DMA. Since I want DMA, I simply abandon the
secondary channel.

I have spare machines available for quack medical experiments.

Select boot-time info...

Linux version 2.4.17 (rico@pc2) (gcc version 2.95.3 20010315 (release)) #1 SMP Mon Dec 31 11:51:33 CST 2001
ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
ServerWorks OSB4: chipset revision 0
ServerWorks OSB4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xfcb0-0xfcb7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xfcb8-0xfcbf, BIOS settings: hdc:pio, hdd:pio
hda: PLEXTOR CD-R PX-W2410A, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 40X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12