Hello,
I tried posting this question last week on comp.os.linux.hardware and
didn't get much of a response. So I thought I'd try it here and see if
anyone can give me a hand.
I'm trying to track down an issue I'm having with corrupted files on
some of my hard drives and I'm hoping someone can offer some help or
ideas of things to try.
I'm running a Gentoo system with a 2.6.10-r6 kernel. This is the
gentoo-dev-sources ebuild. My motherboard is an Asus A7N8X with an
nForce2 nVidia chipset. The system has five 40 gig hard drives, all set
as masters. Each drive is connected to its own IDE channel. Two
drives are connected to the motherboard IDE connectors. The remaining
three drives are connected to two SIIG UltraATA 133 PCI controllers.
These controllers use the Silicon Image 0680 chipset.
The system is set up to use software RAID and LVM. The / partition is a
RAID1 using drive partitions run from the motherboard IDE controller.
The /tmp partition is a RAID0 using partitions from the SIIG
controllers. The LVM is on top of a RAID5 partition using the rest of
the partitions, some controlled by the motherboard and some by the
SIIGs. All file systems are formatted using JFS.
There have been no unusual messages in /var/log/messages.
I first noticed problems when Quake3 Arena could not load certain
maps. I checked the md5sum of pak0.pk3 and it didn't match the one on
the CD. I recopied it from the CD, and the new copy had a different
md5sum than the other two. Then I tried copying the file to the root
partition. That copy succeeded with the correct md5sum; however, when
I copied it to the Quake3 install, it was corrupted.
My Quake3 install is in /opt, which resides on the LVM. So my first
thought was that there might be a bug in running LVM on top of RAID, or
possibly in JFS. To test that theory I stopped the /tmp RAID,
reformatted the 3 partitions as ext2 and mounted them. Then I started
copying pak0.pk3 to my new filesystems and md5summing the copies. Most
of the copies succeeded, but a few were corrupted, about 5 out of 300.
The failed copies were not limited to a particular drive; all three
drives had at least one bad copy.
That ruled out LVM, JFS, and RAID as the source of the problem.
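For anyone who wants to repeat the copy-and-checksum test described above, here is a minimal sketch in Python. It is not the exact procedure used above; the function names, the copy count and the mount points in the usage note are made up for illustration. It also takes the reference checksum from the source file itself, which assumes the source (for example the copy on the CD) reads back correctly. Note that reading a file back right after writing it may be served from the page cache rather than from the disk, so use a file larger than RAM or remount the filesystem between copying and checking.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, target_dirs, copies_per_dir=100):
    """Copy source into each target directory repeatedly.

    Returns (expected_md5, bad), where bad is a list of
    (path, actual_md5) tuples for every copy whose checksum differs.
    Good copies are deleted; bad copies are kept for inspection.
    """
    expected = md5_of(source)
    bad = []
    for target in target_dirs:
        for i in range(copies_per_dir):
            dest = Path(target) / f"copy_{i:04d}.bin"
            shutil.copyfile(source, dest)
            actual = md5_of(dest)
            if actual != expected:
                bad.append((str(dest), actual))
            else:
                dest.unlink()
    return expected, bad
```

A call might look like `copy_and_verify("/mnt/cdrom/pak0.pk3", ["/mnt/test1", "/mnt/test2", "/mnt/test3"])`, with all of those paths being placeholders. Keeping the bad copies around makes it possible to compare them against a good copy later and see where the corruption sits.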
I haven't run the same test on my root partition. So I can't state yet
that the root partition never corrupts. But I can state that I've never
noticed a corrupt file on the root partition.
So then I started to wonder about the driver for the Silicon Image
chipset, since the problem so far has only occurred with drives running
on those add-in controller cards.
In the source for the driver (siimage.c), I noticed the following comment:
If you have strange problems with nVidia chipset systems
please see the SI support documentation and update your
system BIOS if neccessary.
Does anyone know what that comment means or where this 'SI support
documentation' can be found?
The BIOSes on both my system and the SIIG controller cards are up to date.
The system has run memtest86 for about 24 hours without an error.
I'm willing to put some time and effort into finding this problem (as
opposed to simply chucking the controllers and buying ones based on a
different chipset) if someone can help me along the way.
Thanks for any advice anyone can offer!
Jeff Cooper
Jeff Cooper wrote:
> I'm running a Gentoo system with a 2.6.10-r6 kernel. This is the
> gentoo-dev-sources ebuild. My motherboard is an Asus A7N8X with an
> nForce2 nVidia chipset. The system has five 40 gig hard drives, all set
>
>
> Thanks for any advice anyone can offer!
Search the BIOS for the ext-p2p discard time setting and set it to a
higher value, at least 1 ms. I already reported this to lkml, but there
is apparently no interest in implementing a kernel quirk.
--
Prakash Punnoor
formerly known as Prakash K. Cheemplavam
Prakash Punnoor wrote:
> Jeff Cooper wrote:
>
>> I'm running a Gentoo system with a 2.6.10-r6 kernel. This is the
>> gentoo-dev-sources ebuild. My motherboard is an Asus A7N8X with an
>> nForce2 nVidia chipset. The system has five 40 gig hard drives, all set
>>
>>
>> Thanks for any advice anyone can offer!
>
>
> Search the BIOS for the ext-p2p discard time setting and set it to a
> higher value, at least 1 ms. I already reported this to lkml, but there
> is apparently no interest in implementing a kernel quirk.
>
Well, I checked my BIOS and it doesn't offer a way to adjust that
option. There are no new updates for the BIOS of my motherboard or the
SIIG ATA controllers.
Is this a SATA option? I only have PATA drives in my system.
Does anyone else have any other suggestions?
Thanks,
Jeff