2000-12-21 11:06:25

by xOr

Subject: lockups from heavy IDE/CD-ROM usage

Hi,

I've been experiencing this problem for a long time now (it occurred even
in 2.2, and now in 2.4), but it's really starting to bother me so I
thought I'd better post it here and see what you people have to say.

Problem: When I am using my harddrive and cdrom, my computer will freeze.
It freezes in two different ways. Sometimes just the harddrive access
will freeze (I can still do things in X as long as they don't require the
harddrive), and then everything freezes within a few seconds. Or else
everything just locks instantly. The problem is reproducible: all I need to
do is use the harddrive extensively for a couple of separate tasks
(like compiling the kernel and copying a large file) while ripping CD audio
(cdparanoia), and I can lock the system in as little as seconds, or a few
minutes sometimes. This happens more reliably, and much more quickly and
easily, when I have DMA enabled, but it still occurs when it is not enabled. I
have used hdparm to turn off any features on the drives, but to no avail.
I do not receive any log messages before the computer locks. I'm sure
the problem is not solely hardware, because in m$ windows I have no
problems with DMA or lockups like this.

My Hardware:
CPU: Athlon K7 750
Motherboard: Abit KA7
Chipset: VIA VT8371 (KX133) / VIA 686A
Harddrive: FUJITSU MPE3273AT
  /dev/hda:
    multcount    = 16 (on)
    I/O support  =  0 (default 16-bit)
    unmaskirq    =  0 (off)
    using_dma    =  0 (off)
    keepsettings =  0 (off)
    nowerr       =  0 (off)
    readonly     =  0 (off)
    readahead    =  8 (on)
    geometry     = 3322/255/63, sectors = 53377152, start = 0
CDROM: TDK CDRW8432
  /dev/hdc:
    HDIO_GET_MULTCOUNT failed: Input/output error
    I/O support  =  0 (default 16-bit)
    unmaskirq    =  0 (off)
    using_dma    =  0 (off)
    keepsettings =  0 (off)
    HDIO_GET_NOWERR failed: Input/output error
    readonly     =  0 (off)
    BLKRAGET failed: Input/output error
    HDIO_GETGEO failed: Invalid argument
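
For anyone wanting to compare settings between the two drives, the hdparm
dumps above can be turned into a lookup table. This is only a minimal sketch
in Python: the name parse_hdparm is made up for illustration, and it assumes
the simple "name = value (note)" and "NAME failed: error" line shapes seen in
this post. The geometry line does not fit either shape and is skipped.

```python
import re

# Matches lines such as "using_dma = 0 (off)" or "I/O support = 0 (default 16-bit)".
SETTING_RE = re.compile(r"^(.+?)\s*=\s*(\d+)\s*\((.*)\)$")


def parse_hdparm(text):
    """Return {device: {setting: (value, note)}} from hdparm-style output.

    Failed ioctls are stored as (None, error_text). This is a hypothetical
    helper for illustration, not part of hdparm itself.
    """
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line like "/dev/hda:" starts a new device section.
        if line.startswith("/dev/") and line.endswith(":"):
            current = line[:-1]
            devices[current] = {}
            continue
        if current is None:
            continue  # ignore anything before the first device header
        match = SETTING_RE.match(line)
        if match:
            name, value, note = match.groups()
            devices[current][name.strip()] = (int(value), note)
        elif " failed:" in line:
            # e.g. "HDIO_GETGEO failed: Invalid argument"
            name, _, err = line.partition(" failed:")
            devices[current][name] = (None, err.strip())
    return devices
```

With the output above, parse_hdparm(text)["/dev/hda"]["using_dma"] would be
(0, "off"), which makes it easy to confirm that DMA is off on both drives.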

I am NOT overclocking my computer either :-) I'm not sure what other info I
can give you about this problem. Any pointers would be greatly
appreciated.

Oh, one thing: I do remember hearing somewhere that the VIA KX133 chipset
had flaw(s) in it, which were fixed in the KT133 chipset. Maybe we could
get a workaround for those flaws (if they exist). Does anyone have a KX133
or KT133 chipset and have DMA working?

Hope we can get this ironed out :)

thanks in advance
xOr


2000-12-21 13:35:21

by Zdenek Kabelac

Subject: Re: lockups from heavy IDE/CD-ROM usage

> Problem: When I am using my harddrive and cdrom, my computer will freeze.
> It freezes in two different ways. Sometimes just the harddrive access
> will freeze (I can still do things in X as long as they don't require the
> harddrive), and then everything freezes within a few seconds. Or else
> everything just locks instantly. The problem is reproducible: all I need to
> do is use the harddrive extensively for a couple of separate tasks
> (like compiling the kernel and copying a large file) while ripping CD audio
> (cdparanoia), and I can lock the system in as little as seconds, or a few
> minutes sometimes. This happens more reliably, and much more quickly and

This is very similar to the problem with my BP6 that I have been reporting
for a long, long time.
But everyone says it's a faulty board.

For the BP6 it somehow helps to set UDMA to mode 2.
(I'm not getting these locks when I'm just using the ATA33 controller.)
(hdparm -X66 /dev/hdX)
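
A side note on the -X argument: hdparm takes the ATA transfer-mode number,
which packs the mode class and the mode index into one value (8+n for PIO,
32+n for multiword DMA, 64+n for UltraDMA), so -X66 selects UDMA mode 2,
i.e. ATA33. A rough Python sketch of that decoding follows.
decode_xfer_mode is a hypothetical helper name, not part of hdparm, and the
16+n singleword DMA range is included on the assumption that it follows the
older ATA encoding.

```python
def decode_xfer_mode(value):
    """Decode an hdparm -X transfer-mode number into (mode class, index).

    Hypothetical helper for illustration. Values below 8 (the default PIO
    settings) are not handled by this sketch.
    """
    if value >= 64:
        return ("UDMA", value - 64)   # UltraDMA modes
    if value >= 32:
        return ("MDMA", value - 32)   # multiword DMA modes
    if value >= 16:
        return ("SDMA", value - 16)   # singleword DMA (assumed, older ATA)
    if value >= 8:
        return ("PIO", value - 8)     # PIO modes
    raise ValueError("transfer-mode values below 8 are not decoded here")


print(decode_xfer_mode(66))
```

Here decode_xfer_mode(66) gives ("UDMA", 2), matching the "UDMA mode 2"
setting mentioned above.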

Also, could you look at what is being written to the console?
(Run those intensive programs and stay on the console.) The BP6 locks with
this message displayed:

hdf: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout: func only 14

At this point it looks like the timers are dead... :(
And the situation is the same with SMP and non-SMP kernels, with apic and
noapic.

--
There are three types of people in the world:
those who can count, and those who can't.
Zdenek Kabelac http://i.am/kabi/ [email protected] {debian.org; fi.muni.cz}

2000-12-21 20:31:05

by safemode

Subject: Re: lockups from heavy IDE/CD-ROM usage

Zdenek Kabelac wrote:

> > Problem: When I am using my harddrive and cdrom, my computer will freeze.
> > It freezes in two different ways. Sometimes just the harddrive access
> > will freeze (I can still do things in X as long as they don't require the
> > harddrive), and then everything freezes within a few seconds. Or else
> > everything just locks instantly. The problem is reproducible: all I need to
> > do is use the harddrive extensively for a couple of separate tasks
> > (like compiling the kernel and copying a large file) while ripping CD audio
> > (cdparanoia), and I can lock the system in as little as seconds, or a few
> > minutes sometimes. This happens more reliably, and much more quickly and
>
> This is very similar to the problem with my BP6 that I have been reporting
> for a long, long time.
> But everyone says it's a faulty board.
>
> For the BP6 it somehow helps to set UDMA to mode 2.
> (I'm not getting these locks when I'm just using the ATA33 controller.)
> (hdparm -X66 /dev/hdX)
>
> Also, could you look at what is being written to the console?
> (Run those intensive programs and stay on the console.) The BP6 locks with
> this message displayed:
>
> hdf: timeout waiting for DMA
> ide_dmaproc: chipset supported ide_dma_timeout: func only 14
>
> At this point it looks like the timers are dead... :(
> And the situation is the same with SMP and non-SMP kernels, with apic and
> noapic.
>

I get this on the 440LX with the same DMA timeout message. Everyone says it's
the board's fault as well. Funny. Anyway, this happens across just about
any dev kernel, but more so in the -test12 and up versions. Test10 works
fine without locking. Blaming the hardware reminds me of the help given by
some other company whose name I can't seem to remember.

2000-12-21 20:51:01

by Andre Hedrick

Subject: Blow Torch (Re: lockups from heavy IDE/CD-ROM usage)

On Thu, 21 Dec 2000, safemode wrote:

> I get this on the 440LX with the same DMA timeout message. Everyone says it's
> the board's fault as well. Funny. Anyway, this happens across just about
> any dev kernel, but more so in the -test12 and up versions. Test10 works
> fine without locking. Blaming the hardware reminds me of the help given by
> some other company whose name I can't seem to remember.

29063507.pdf Page 22 sections 9,10
What is the Intel solution to this system hang?

29063507.pdf Page 25 section 16
Is this erratum valid to include all PIIX4-AB/EB, PIIX3, and PIIX a/b?

It is the DAMN hardware and quit BITCHING.
I told everyone once that I was working on this issue.
If you think you can fix it before me, be my guest.

I have given you the INTEL doc numbers and the page and the section.
Go read.

Regards

Andre Hedrick
Linux ATA Development


2000-12-21 21:14:25

by Richard B. Johnson

Subject: Re: Blow Torch (Re: lockups from heavy IDE/CD-ROM usage)

On Thu, 21 Dec 2000, Andre Hedrick wrote:

> On Thu, 21 Dec 2000, safemode wrote:
>
> > I get this on the 440LX with the same DMA timeout message. Everyone says it's
> > the board's fault as well. Funny. Anyway, this happens across just about
> > any dev kernel, but more so in the -test12 and up versions. Test10 works
> > fine without locking. Blaming the hardware reminds me of the help given by
> > some other company whose name I can't seem to remember.
>
> 29063507.pdf Page 22 sections 9,10
> What is the Intel solution to this system hang?
>
> 29063507.pdf Page 25 section 16
> Is this erratum valid to include all PIIX4-AB/EB, PIIX3, and PIIX a/b?
>
> It is the DAMN hardware and quit BITCHING.
> I told everyone once that I was working on this issue.
> If you think you can fix it before me, be my guest.
>
> I have given you the INTEL doc numbers and the page and the section.
> Go read.
>
> Regards
>

FYI, I haven't found a decent motherboard (chipset) amongst the new
boards released during the past year. Both ASUS and TYAN boards,
including the expensive "Thunderbolt", have the infamous "Bit 17"
memory errors, regardless of the amount/kind/speed/cost of SDRAM
installed.

If you get an Oops trace, see if the faulting address would be
correct if bit 17 was changed. There is something wrong with
the timing on the SDRAM controller so that all the timing skews
pile up, occasionally corrupting bit 17. This can't be by chance,
since I have now tested over 10 different systems, most with
different motherboards, and of course different sets of RAM from
single 32 MB sticks to 8 256 MB sticks, P-100, to 133 MHz, etc.
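
The check described above amounts to toggling one bit in the faulting
address and comparing the result with the address you expected. A minimal
Python sketch, assuming 0-based bit numbering (so bit 17 is the 0x20000
mask) and using made-up addresses purely for illustration:

```python
BIT = 17  # assumed 0-based, i.e. mask 0x20000


def flip_bit(addr, bit=BIT):
    """Return addr with the given bit toggled."""
    return addr ^ (1 << bit)


def explained_by_single_bit(faulting, expected, bit=BIT):
    """True if faulting and expected differ in exactly that one bit."""
    return (faulting ^ expected) == (1 << bit)


# Hypothetical addresses, not taken from any real Oops trace.
faulting = 0xC0120000
expected = 0xC0100000
print(hex(flip_bit(faulting)))
print(explained_by_single_bit(faulting, expected))
```

If explained_by_single_bit() returns True for the faulting address and the
address the code should have used, a single flipped bit 17 would account for
the fault.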

Every system has an occasional error with bit 17! I even wrote
a memory-test program which shows this.

So, either the SDRAM controller used on these boards is bad,
or all the RAM produced by a half dozen vendors over the past
year is bad. Take your pick.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.0-test12 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2000-12-21 22:12:47

by Daniel Stone

Subject: Re: lockups from heavy IDE/CD-ROM usage

> I get this on the 440LX with the same DMA timeout message. Everyone says it's
> the board's fault as well. Funny. Anyway, this happens across just about
> any dev kernel, but more so in the -test12 and up versions. Test10 works
> fine without locking. Blaming the hardware reminds me of the help given by
> some other company whose name I can't seem to remember.

Well, think about it - if there are DMA/IRQ timeouts, the hardware IS rooted.
Otherwise, why would it be timing out? I've been seeing these messages
shortly before a hardlock (except for the fact numlock still works, but
nothing else) when doing long, intensive hard drive activity. Because my
hard drives are right next to each other, they sometimes overheat and shut
straight down when they do. But I'm gonna take a wild guess it's not Linux's
fault, unless they've done some whacky stuff with the elevator ;)

--
Daniel Stone
Linux Kernel Developer
[email protected]

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
G!>CS d s++:- a---- C++ ULS++++$>B P---- L+++>++++ E+(joe)>+++ W++ N->++ !o
K? w++(--) O---- M- V-- PS+++ PE- Y PGP>++ t--- 5-- X- R- tv-(!) b+++ DI+++
D+ G e->++ h!(+) r+(%) y? UF++
------END GEEK CODE BLOCK------