2005-09-14 13:17:54

by David Sanchez

Subject: Corrupted file on a copy

Hi,

I'm using Linux kernel 2.6.10 and BusyBox 1.0 on an AMD AU1550 board.

When I copy a big file (around 300M) within an ext2 filesystem (or even
an ext3 filesystem), the output file is sometimes "corrupted" (I mean
that the source and the destination files are different and thus
produce different SHA1 sums).
Has anybody seen the same behaviour?

Thanks,
David


2005-09-14 14:36:55

by linux-os (Dick Johnson)

Subject: Re: Corrupted file on a copy


On Wed, 14 Sep 2005, David Sanchez wrote:

> Hi,
>
> I'm using Linux kernel 2.6.10 and BusyBox 1.0 on an AMD AU1550 board.
>
> When I copy a big file (around 300M) within an ext2 filesystem (or even
> an ext3 filesystem), the output file is sometimes "corrupted" (I mean
> that the source and the destination files are different and thus
> produce different SHA1 sums).
> Has anybody seen the same behaviour?
>
> Thanks,
> David
>

Use `cmp` to compare the two files. You could have discovered
a bug in your checksum utility, so you need to isolate the problem to
the file system. FYI, I have never seen a copy of a file, including
the image of an entire DVD (saved to clone another), that was not
exactly identical.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.53 BogoMips).
Warning : 98.36% of all statistics are fiction.
.
I apologize for the following. I tried to kill it with the above dot :

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

2005-09-14 14:59:35

by David Sanchez

Subject: RE: Corrupted file on a copy

Unfortunately, the cmp command shows a difference between the two files!
I know that this is strange behaviour and that it probably comes from my code, but I haven't found the cause yet :(
What's more, I tried the latest Linux kernel, 2.6.13 from linux-mips.org, with the latest BusyBox version, and the problem persists!


David SANCHEZ
LexBox :: The Digital Evidence
3, avenue Didier Daurat
31400 TOULOUSE / FRANCE
[email protected]
Tel : +33 (0)5 62 47 15 81
Fax : +33 (0)5 62 47 15 84

2005-09-14 15:17:04

by linux-os (Dick Johnson)

Subject: RE: Corrupted file on a copy


On Wed, 14 Sep 2005, David Sanchez wrote:

> Unfortunately, the cmp command shows a difference between the two files!
> I know that this is strange behaviour and that it probably comes from my code, but I haven't found the cause yet :(
> What's more, I tried the latest Linux kernel, 2.6.13 from linux-mips.org, with the latest BusyBox version, and the problem persists!
>

Yes, but 'cmp' can tell you the offset where the bytes differ.
This should let you know if you have "off-by-one" errors, etc., in
your code. For instance, if you have a 64k sized buffer and your
first bad byte offset is at 0xffff, then that should tell you
something (like the code logic has interchanged length and offset).
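
The offset check described above can be sketched in Python. This is only an illustration, not code from the thread: the helper names `first_difference` and `locate` are hypothetical, and the 64k buffer size is taken from the example rather than from any real copy routine. Note that `cmp` itself numbers bytes from 1, while this sketch uses 0-based offsets.

```python
import os
import tempfile

BUF_SIZE = 64 * 1024  # assumption: the 64k buffer from the example above


def first_difference(path_a, path_b, chunk_size=BUF_SIZE):
    """Return the 0-based offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended with no difference found
            offset += len(a)


def locate(offset, buf_size=BUF_SIZE):
    """Split an offset into (buffer index, position inside that buffer)."""
    return divmod(offset, buf_size)


if __name__ == "__main__":
    # Demo: build two files that differ in a single byte at offset 0xffff.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        data = bytearray(os.urandom(3 * BUF_SIZE))
        with open(src, "wb") as f:
            f.write(data)
        data[0xFFFF] ^= 0xFF  # flip the last byte of the first 64k buffer
        with open(dst, "wb") as f:
            f.write(data)
        off = first_difference(src, dst)
        print(hex(off), locate(off))  # prints: 0xffff (0, 65535)
```

A first bad offset that lands on the last byte of a buffer, or exactly on a buffer boundary, points at the copy logic; offsets scattered at random positions are more suggestive of a hardware problem.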



Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.53 BogoMips).
Warning : 98.36% of all statistics are fiction.


2005-09-14 16:22:18

by Bill Davidsen

Subject: Re: Corrupted file on a copy

David Sanchez wrote:
> Hi,
>
> I'm using Linux kernel 2.6.10 and BusyBox 1.0 on an AMD AU1550 board.
>
> When I copy a big file (around 300M) within an ext2 filesystem (or even
> an ext3 filesystem), the output file is sometimes "corrupted" (I mean
> that the source and the destination files are different and thus
> produce different SHA1 sums).

That's likely to be hardware. Have you tried memtest86 or similar? Are
you overclocked, or running aggressive memory timing?

Similar kernel+bbox installs seem stable on other hardware.


--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2005-09-14 16:37:20

by Roger Heflin

Subject: RE: Corrupted file on a copy



> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of linux-os (Dick Johnson)
> Sent: Wednesday, September 14, 2005 9:37 AM
> To: David Sanchez
> Cc: [email protected]
> Subject: Re: Corrupted file on a copy
>
>
> On Wed, 14 Sep 2005, David Sanchez wrote:
>
> > Hi,
> >
> > I'm using Linux kernel 2.6.10 and BusyBox 1.0 on an AMD AU1550 board.
> >
> > When I copy a big file (around 300M) within an ext2 filesystem (or even
> > an ext3 filesystem), the output file is sometimes "corrupted" (I mean
> > that the source and the destination files are different and thus
> > produce different SHA1 sums).
> > Has anybody seen the same behaviour?
> >
> > Thanks,
> > David
> >
>
> Use `cmp` to compare the two files. You could have discovered
> a bug in your checksum utility, so you need to isolate the problem to
> the file system. FYI, I have never seen a copy of a file, including
> the image of an entire DVD (saved to clone another), that was not
> exactly identical.
>

I have seen 2 similar issues. Both were bad hardware, in completely
different configurations (nothing at all in common, and completely
different machines).

Both would corrupt data on a read (we never found a corrupted
write). One had an MTBF of 3 GB of data, and the other about 5 GB. If
you, say, make 200 files of 50 MB each (or however many you need to bust your
disk cache) and checksum all of them, the
checksum will be wrong on 1 or 2 of them each pass, and on each pass
different files will be wrong (once you get all of the original
ones right on disk).
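
The multi-pass checksum test described above can be sketched in Python. This is a rough illustration under stated assumptions, not the procedure actually used: the helper names (`make_files`, `run_passes`, `unstable_files`) are hypothetical, SHA-1 is chosen only because the original report used it, and the script does not flush the disk cache itself, so the total size of the files has to exceed available RAM for the reads to really hit the disk. It only detects checksums that change between read passes; it does not verify that the first write was correct.

```python
import hashlib
import os
import tempfile


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def make_files(directory, count, size):
    """Write `count` files of `size` random bytes each; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"test_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def run_passes(paths, passes):
    """Checksum every file `passes` times; return {path: set of digests seen}."""
    seen = {p: set() for p in paths}
    for _ in range(passes):
        for p in paths:
            seen[p].add(sha1_of(p))
    return seen


def unstable_files(seen):
    """Files whose checksum was not the same on every pass."""
    return sorted(p for p, digests in seen.items() if len(digests) > 1)


if __name__ == "__main__":
    # Tiny demo values. For a real test use something like the 200 files of
    # 50 MB mentioned above, so that the set no longer fits in the disk cache.
    with tempfile.TemporaryDirectory() as tmp:
        files = make_files(tmp, count=4, size=1 << 16)
        bad = unstable_files(run_passes(files, passes=3))
        print("files with unstable checksums:", bad)
```

On healthy hardware the list of unstable files should stay empty no matter how many passes are run.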

Both were fixed either by swapping the faulty piece of hardware for a
replacement card, or by slowing the PCI bus down one step to a speed that did
not produce corruption. In the second case both the card and the motherboard
were rated for the speed that was getting corruption, and the problem
was reproduced with 2 different motherboards of the same kind and 3 different
PCI cards, one of them a completely different company's PCI card that also
did not like the motherboard, but locked up Linux rather than corrupting
data.

Roger

2005-09-14 21:05:28

by Chris Wedgwood

Subject: Re: Corrupted file on a copy

On Wed, Sep 14, 2005 at 03:14:58PM +0200, David Sanchez wrote:

> I'm using Linux kernel 2.6.10 and BusyBox 1.0 on an AMD AU1550
> board.

The 2.6.10 MIPS kernel has a prefetch bug; make sure that isn't biting
you (it might still be there, as I don't recall seeing a fix for it
posted).

2005-09-15 10:04:29

by David Sanchez

Subject: RE: Corrupted file on a copy

Hi,

My investigation leads me to suspect the read operation.
I divided the CPU frequency by two and the problem no longer occurred!

I am continuing my investigation...

Thanks,


David SANCHEZ
LexBox :: The Digital Evidence
3, avenue Didier Daurat
31400 TOULOUSE / FRANCE
[email protected]
Tel : +33 (0)5 62 47 15 81
Fax : +33 (0)5 62 47 15 84
