2002-04-11 12:40:20

by Denis Vlasenko

Subject: New IDE code and DMA failures

Hi Jens, Martin, Vojtech,

I have a flaky IDE subsystem in one box. Reads work fine,
writes sometimes don't work and hang either IDE/block device
subsystem or the entire box. For example, I dumped a ~40 MB file to
the disk and now I have an additional power LED (i.e. the hdd activity
LED is constantly on) and a bunch of "D" state processes
(kupdated, mount, umount).

This is happening since I decided to try 2.5.7.
2.4.18 reported DMA failures and reverted to PIO.

I did send a detailed report of a similar event, with
ksymoopsed stack traces of the hung processes, to lkml.

Since you are working on IDE subsystem, I will be glad to
*retain* my flaky IDE setup and test future kernels
for correct operation in this failure mode.

Please inform me whenever you want me to test your patches.
--
vda


2002-04-11 13:05:49

by Ted Deppner

Subject: Re: New IDE code and DMA failures

On Thu, Apr 11, 2002 at 03:39:33PM -0200, Denis Vlasenko wrote:
> I have a flaky IDE subsystem in one box. Reads work fine,
> writes sometimes don't work and hang either IDE/block device
>
> Please inform me whenever you want me to test your patches.

I've been testing 2.4.17 and 2.4.19-pre6 and see some similar issues. I
have an Asus A7V w/ 1gig Athlon processor. Using the onboard Promise
UDMA100 controller, I can read and write all day long to /dev/hde all by
itself... However, after a few minutes of any type of access to /dev/hdh,
/dev/hde suddenly starts having DMA errors and switches to PIO. I'm on my
third DMA66 cable (and it fits tightly), and am still seeing the exact
same issues. I don't believe my IDE subsystem to be flaky. hde is a WD
drive, and hdh is a Maxtor.

In one of my tests the contents of /dev/hdh were additionally corrupted (a
write test to /dev/hdh1) so badly that the partition information changed
from type 83 to type 3 (Xenix), and the contents of a reiser partition were
so badly damaged that a --rebuild-tree and later a --rebuild-sb to reiserfsck
didn't restore it to a usable state. (I put those options in at the request of
reiserfsck, and I haven't wiped the drive yet if someone would like
further tests against the reiserfs partition).

--
Ted Deppner
http://www.psyber.com/~ted/

2002-04-11 13:08:14

by Martin Dalecki

Subject: Re: New IDE code and DMA failures

Denis Vlasenko wrote:
> Hi Jens, Martin, Vojtech,

Zdrastwujtie.

> I have a flaky IDE subsystem in one box. Reads work fine,
> writes sometimes don't work and hang either IDE/block device
> subsystem or the entire box. For example, I dumped a ~40 MB file to
> the disk and now I have an additional power LED (i.e. the hdd activity
> LED is constantly on) and a bunch of "D" state processes
> (kupdated, mount, umount).
>
> This is happening since I decided to try 2.5.7.
> 2.4.18 reported DMA failures and reverted to PIO.
>
> I did send a detailed report of a similar event, with
> ksymoopsed stack traces of the hung processes, to lkml.
>
> Since you are working on IDE subsystem, I will be glad to
> *retain* my flaky IDE setup and test future kernels
> for correct operation in this failure mode.
>
> Please inform me whenever you want me to test your patches.

Guessing from the symptoms I would rather suggest that:

1. Are you sure you have the support for your chipset properly
enabled? It's almost a must for DMA.

2. Could you please report on the hardware you have? There are
chipsets out there which use their own transport layer
implementations. Host chip (aka south bridge), disk types and so on.

3. Some timeout values got increased to more generally used values (in
particular, per the IBM microdrive advice about timeout values). Could you
check whether the data doesn't eventually go to the disk after a generous
amount of time?

4. Could you try setting the DMA mode lower than the default
by using hdparm and see whether it helps?
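
A note on suggestion 4: `hdparm -X` takes a numeric transfer mode. The
sketch below only builds the command line and does not run it. It assumes
the numbering given in the hdparm man page (PIO mode n is 8+n, multiword
DMA mode n is 32+n, UltraDMA mode n is 64+n); the helper names
`xfer_value` and `hdparm_command` are hypothetical and not part of hdparm.

```python
# Base offsets for `hdparm -X`, as documented in the hdparm man page.
XFER_BASE = {"pio": 8, "mdma": 32, "udma": 64}


def xfer_value(kind, mode):
    """Return the numeric -X argument for a transfer mode, e.g. mdma 2 -> 34."""
    if kind not in XFER_BASE:
        raise ValueError("unknown transfer kind: %r" % (kind,))
    if not 0 <= mode <= 7:
        raise ValueError("mode out of range: %r" % (mode,))
    return XFER_BASE[kind] + mode


def hdparm_command(device, kind, mode):
    """Build (but do not execute) an hdparm invocation that sets the mode."""
    return ["hdparm", "-X%d" % xfer_value(kind, mode), device]


if __name__ == "__main__":
    # One step below multiword DMA mode 2 on the secondary master.
    print(" ".join(hdparm_command("/dev/hdc", "mdma", 1)))
```

Changing the transfer mode on a live disk can be risky, so the resulting
command is best tried on a drive whose contents do not matter.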

2002-04-11 13:44:11

by Denis Vlasenko

Subject: Re: New IDE code and DMA failures

On 11 April 2002 10:05, Martin Dalecki wrote:
> > Since you are working on IDE subsystem, I will be glad to
> > *retain* my flaky IDE setup and test future kernels
> > for correct operation in this failure mode.
> >
> > Please inform me whenever you want me to test your patches.
>
> Guessing from the symptoms I would rather suggest that:
>
> 1. Are you sure you have the support for your chipset properly
> enabled? It's almost a must for DMA.

I am dead sure. lspci:
00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)

/boot/2.4.7/config:
CONFIG_BLK_DEV_PIIX=y

> 2. Could you please report on the hardware you have? There are
> chipsets out there which use their own transport layer
> implementations. Host chip (aka south bridge), disk types and so on.

# hdparm -i /dev/hda
Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
Config={ Fixed }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
CurCHS=17475/15/63, CurSects=16513875, LBA=yes
LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 *mode2

# hdparm -i /dev/hdc
Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
CurCHS=2482/16/63, CurSects=2501856, LBA=yes
LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4

I have problems with hdc. hda is mostly unused, so it may be prone to
DMA errors too, but I have not seen that yet.

> 3. Some timeout values got increased to more generally used values (in
> particular, per the IBM microdrive advice about timeout values). Could you
> check whether the data doesn't eventually go to the disk after a generous
> amount of time?

Erm.. my English comprehension fails here... are you saying my disk
does not like bigger timeouts?

> 4. Could you try setting the DMA mode lower than the default
> by using hdparm and see whether it helps?

Current params:

# hdparm /dev/hda /dev/hdc
/dev/hda:
multcount = 16 (on)
I/O support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
BLKRAGET failed: Invalid argument
geometry = 1754/240/63, sectors = 26520480, start = 0

/dev/hdc:
multcount = 16 (on)
I/O support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
BLKRAGET failed: Invalid argument
geometry = 620/64/63, sectors = 2501856, start = 0

I can't quite figure out which MW/UDMA mode is active.
--
vda

2002-04-11 13:55:17

by Martin Dalecki

Subject: Re: New IDE code and DMA failures

Denis Vlasenko wrote:

> # hdparm -i /dev/hda
> Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
> Config={ Fixed }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
> BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
> CurCHS=17475/15/63, CurSects=16513875, LBA=yes
> LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
> UDMA modes: mode0 mode1 *mode2

To answer a later question:
the asterisk here denotes the active UDMA mode!

>
> # hdparm -i /dev/hdc
> Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
> RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
> CurCHS=2482/16/63, CurSects=2501856, LBA=yes
> LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
> IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
>
> I have problems with hdc. hda is mostly unused, so it may be prone to
> DMA errors too, but I have not seen that yet.
>
>
>>3. Some timeout values got increased to more generally used values (in
>> particular, per the IBM microdrive advice about timeout values). Could you
>> check whether the data doesn't eventually go to the disk after a generous
>> amount of time?
>
>
> Erm.. my English comprehension fails here... are you saying my disk
> does not like bigger timeouts?

Please just wait and see whether the driver actually recovers (it can
take minutes...)

>
>>4. Could you try setting the DMA mode lower than the default
>> by using hdparm and see whether it helps?
>
>
> Current params:
>
> # hdparm /dev/hda /dev/hdc
> /dev/hda:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Could you try to disable this please? This can cause trouble
as well.

> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 1754/240/63, sectors = 26520480, start = 0
>
> /dev/hdc:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 620/64/63, sectors = 2501856, start = 0
>
> I can't quite figure out which MW/UDMA mode is active.

See above.
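
The asterisk convention can also be read off mechanically. The following is
a minimal sketch, assuming `hdparm -i` output laid out like the excerpts
quoted in this thread; the function name `active_modes` is hypothetical and
the sample strings are copied from the reports above.

```python
import re


def active_modes(hdparm_i_output):
    """Return (category, mode) pairs for every mode marked with '*'."""
    found = []
    # Each mode list looks like "UDMA modes: mode0 mode1 *mode2" and ends
    # at a comma or at the end of the line.
    for category, modes in re.findall(r"(\w+) modes:([^,\n]*)", hdparm_i_output):
        for token in modes.split():
            if token.startswith("*"):
                found.append((category, token[1:]))
    return found


# Relevant lines from the hda and hdc reports in this thread.
HDA = """\
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 *mode2
"""
HDC = """\
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
"""

if __name__ == "__main__":
    print("hda:", active_modes(HDA))
    print("hdc:", active_modes(HDC))
```

For the two drives here this reports UDMA mode2 on hda and multiword DMA
mword2 on hdc, which agrees with the /proc/ide/piix table posted later.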

2002-04-11 14:10:33

by Oleg Drokin

Subject: Re: New IDE code and DMA failures

Hello!

On Thu, Apr 11, 2002 at 06:05:44AM -0700, Ted Deppner wrote:

> In one of my tests the contents of /dev/hdh were additionally corrupted (a
> write test to /dev/hdh1) so badly that the partition information changed
> from type 83 to type 3 (Xenix), and the contents of a reiser partition were
> so badly damaged that a --rebuild-tree and later a --rebuild-sb to reiserfsck
> didn't restore it to a usable state. (I put those options in at the request of
> reiserfsck, and I haven't wiped the drive yet if someone would like
> further tests against the reiserfs partition).

We are interested in damaged partitions that make the current reiserfsck
segfault or repair the FS incorrectly (incorrectly in the sense that a
subsequent reiserfsck run finds more errors).
Is this the case with you?

Bye,
Oleg

2002-04-11 14:15:54

by Denis Vlasenko

Subject: Re: New IDE code and DMA failures

On 11 April 2002 10:52, Martin Dalecki wrote:
> >>3. Some timeout values got increased to more generally used values (in
> >> particular, per the IBM microdrive advice about timeout values). Could you
> >> check whether the data doesn't eventually go to the disk after a generous
> >> amount of time?
> >
> > Erm.. my English comprehension fails here... are you saying my disk
> > does not like bigger timeouts?
>
> Please just wait and see whether the driver actually recovers (it can
> take minutes...)

I tried that just today. The box continued to work despite kupdated being
hung in "D" state. After a long while the box froze. SysRq-B worked though.

In my first report to lkml I said that a live disconnect of hdc
cured the "D" state processes (yes, I know I risk burning my southbridge...).
Do you want me to mail it again (it includes a ksymoopsed SysRq-T)?

> > unmaskirq = 1 (on)
> Could you try to disable this please? This can cause trouble
> as well.

Will try this, but I don't specifically seek to eliminate freezes,
I want to help debug new IDE code so that it will be no worse
than 2.4 in this failure mode. I don't want to eliminate DMA failures,
I _want to have them_ to see what IDE code will do.
--
vda

2002-04-11 15:28:32

by Petr Vandrovec

Subject: Re: New IDE code and DMA failures

On 11 Apr 02 at 6:05, Ted Deppner wrote:
> On Thu, Apr 11, 2002 at 03:39:33PM -0200, Denis Vlasenko wrote:
> > I have a flaky IDE subsystem in one box. Reads work fine,
> > writes sometimes don't work and hang either IDE/block device
> >
> > Please inform me whenever you want me to test your patches.
>
> I've been testing 2.4.17 and 2.4.19-pre6 and see some similar issues. I
> have an Asus A7V w/ 1gig Athlon processor. Using the onboard Promise
> UDMA100 controller, I can read and write all day long to /dev/hde all by
> itself... However, after a few minutes of any type of access to /dev/hdh,
> /dev/hde suddenly starts having DMA errors and switches to PIO. I'm on my
> third DMA66 cable (and it fits tightly), and am still seeing the exact
> same issues. I don't believe my IDE subsystem to be flaky. hde is a WD
> drive, and hdh is a Maxtor.

What is your /dev/hdg? Using slave-alone on the A7V's Promise (and maybe
on other motherboards too) will corrupt your disk badly. Under Linux,
and also under Windows98. I have not tried other OSes...
Petr Vandrovec

2002-04-11 15:49:07

by Vojtech Pavlik

Subject: Re: New IDE code and DMA failures

On Thu, Apr 11, 2002 at 04:44:29PM -0200, Denis Vlasenko wrote:
> On 11 April 2002 10:05, Martin Dalecki wrote:
> > > Since you are working on IDE subsystem, I will be glad to
> > > *retain* my flaky IDE setup and test future kernels
> > > for correct operation in this failure mode.
> > >
> > > Please inform me whenever you want me to test your patches.
> >
> > Guessing from the symptoms I would rather suggest that:
> >
> > 1. Are you sure you have the support for your chipset properly
> > enabled? It's almost a must for DMA.
>
> I am dead sure. lspci:
> 00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
> 00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
> 00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
> 00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
> 00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
> 00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
> 00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
> 00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
> 00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
>
> /boot/2.4.7/config:
> CONFIG_BLK_DEV_PIIX=y

There's new PIIX code by me in the 2.5 kernels. Can you provide
/proc/ide/piix data (and lspci -vvxxx) as well?

>
> > 2. Could you please report on the hardware you have? There are
> > chipsets out there which use their own transport layer
> > implementations. Host chip (aka south bridge), disk types and so on.
>
> # hdparm -i /dev/hda
> Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
> Config={ Fixed }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
> BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
> CurCHS=17475/15/63, CurSects=16513875, LBA=yes
> LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
> UDMA modes: mode0 mode1 *mode2
>
> # hdparm -i /dev/hdc
> Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
> RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
> CurCHS=2482/16/63, CurSects=2501856, LBA=yes
> LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
> IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
>
> I have problems with hdc. hda is mostly unused, so it may be prone to
> DMA errors too, but I have not seen that yet.
>
> > 3. Some timeout values got increased to more generally used values (in
> > particular, per the IBM microdrive advice about timeout values). Could you
> > check whether the data doesn't eventually go to the disk after a generous
> > amount of time?
>
> Erm.. my English comprehension fails here... are you saying my disk
> does not like bigger timeouts?
>
> > 4. Could you try setting the DMA mode lower than the default
> > by using hdparm and see whether it helps?
>
> Current params:
>
> # hdparm /dev/hda /dev/hdc
> /dev/hda:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 1754/240/63, sectors = 26520480, start = 0
>
> /dev/hdc:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 620/64/63, sectors = 2501856, start = 0
>
> I can't quite figure out which MW/UDMA mode is active.
> --
> vda

--
Vojtech Pavlik
SuSE Labs

2002-04-12 09:47:17

by Denis Vlasenko

Subject: Re: New IDE code and DMA failures

On 11 April 2002 13:48, Vojtech Pavlik wrote:
> > lspci:
> > 00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
> > 00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
> > 00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
> > 00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
> > 00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
> > 00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
> > 00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
> > 00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
> > 00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
> >
> > /boot/2.4.7/config:
> > CONFIG_BLK_DEV_PIIX=y
>
> There's new PIIX code by me in the 2.5 kernels. Can you provide
> /proc/ide/piix data (and lspci -vvxxx) as well?

lspci -vvxxx:
00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64
Region 4: I/O ports at fcb0 [size=16]
00: 86 80 11 71 05 00 80 02 01 80 01 01 00 40 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: b1 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 77 e3 47 e3 0b 00 00 00 01 00 02 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 28 0f 00 00 00 00 00 00

/proc/ide/piix:
----------PIIX BusMastering IDE Configuration---------------
Driver Version: 1.2
South Bridge: PCI device 8086:7111
Revision: IDE 0x1
Highest DMA rate: UDMA33
BM-DMA base: 0xfcb0
PCI clock: 33.3MHz
-----------------------Primary IDE-------Secondary IDE------
Enabled: yes yes
Simplex only: no no
Cable Type: 40w 40w
-------------------drive0----drive1----drive2----drive3-----
Prefetch+Post: yes yes yes yes
Transfer Mode: UDMA DMA DMA PIO
Address Setup: 90ns 90ns 90ns 90ns
Cmd Active: 360ns 360ns 360ns 360ns
Cmd Recovery: 540ns 540ns 540ns 540ns
Data Active: 90ns 90ns 90ns 360ns
Data Recovery: 30ns 30ns 30ns 540ns
Cycle Time: 60ns 120ns 120ns 900ns
Transfer Rate: 33.3MB/s 16.6MB/s 16.6MB/s 2.2MB/s

--
vda
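
The "Cycle Time" and "Transfer Rate" rows of the /proc/ide/piix table above
are consistent with one 16-bit word (2 bytes) moved per reported cycle. A
minimal sketch of that arithmetic follows; the function name
`transfer_rate_mb_s` is hypothetical, and the 2-bytes-per-cycle figure is an
assumption inferred from the table rather than taken from the driver source.

```python
def transfer_rate_mb_s(cycle_ns, bytes_per_cycle=2):
    """Transfer rate in MB/s (decimal) for a given cycle time in nanoseconds."""
    if cycle_ns <= 0:
        raise ValueError("cycle time must be positive")
    # bytes per nanosecond equals GB/s; multiply by 1000 to get MB/s.
    return bytes_per_cycle * 1000.0 / cycle_ns


if __name__ == "__main__":
    # Cycle times reported for drive0, drive1/drive2 and drive3 above.
    for cycle in (60, 120, 900):
        print("%4d ns -> %.2f MB/s" % (cycle, transfer_rate_mb_s(cycle)))
```

The computed values (about 33.33, 16.67 and 2.22 MB/s) line up with the
table's 33.3MB/s, 16.6MB/s and 2.2MB/s once the table's shorter, apparently
truncated figures are taken into account.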

2002-04-13 00:58:09

by Ted Deppner

Subject: Re: New IDE code and DMA failures

On Thu, Apr 11, 2002 at 06:10:27PM +0400, Oleg Drokin wrote:
> We are interested in damaged partitions that make the current reiserfsck
> segfault or repair the FS incorrectly (incorrectly in the sense that a
> subsequent reiserfsck run finds more errors).
> Is this the case with you?

Subsequent runs of reiserfsck are no longer finding new errors. There
were several cases where --rebuild-tree segfaulted reiserfsck -- HOWEVER
this was before I got the DMA errors ironed out.

Now that the DMA errors are taken care of, I've not been able to get
reiserfsck to behave oddly.

--
Ted Deppner
http://www.psyber.com/~ted/

2002-04-13 01:02:34

by Ted Deppner

Subject: Re: New IDE code and DMA failures

On Thu, Apr 11, 2002 at 05:27:29PM +0100, Petr Vandrovec wrote:
> What is your /dev/hdg? Using slave-alone on the A7V's Promise (and maybe
> on other motherboards too) will corrupt your disk badly. Under Linux,
> and also under Windows98. I have not tried other OSes...

I did not have a /dev/hdg. I searched and found your emails to
linux-kernel regarding your findings on quirks with the PDC20265
controller and moved /dev/hdh to /dev/hdg.

I've not had a single DMA error since, regardless of how much I've tried
to break things. Previously I was able to fail things within a few
minutes.

Thank you!

At this point I am chalking these issues up to hardware quirks of my
A7V's onboard controller... I cannot say that there is anything amiss in
the kernel (or with reiserfs) in light of these findings.

--
Ted Deppner
http://www.psyber.com/~ted/

2002-04-13 07:07:30

by Oleg Drokin

Subject: Re: New IDE code and DMA failures

Hello!

On Fri, Apr 12, 2002 at 05:58:05PM -0700, Ted Deppner wrote:
> > We are interested in damaged partitions that make the current reiserfsck
> > segfault or repair the FS incorrectly (incorrectly in the sense that a
> > subsequent reiserfsck run finds more errors).
> > Is this the case with you?
> Subsequent runs of reiserfsck are no longer finding new errors. There
> were several cases where --rebuild-tree segfaulted reiserfsck -- HOWEVER
> this was before I got the DMA errors ironed out.

Still, reiserfsck should not segfault.
What version of reiserfsprogs do you have?
Did you save the core files?

Bye,
Oleg