2004-10-31 05:47:11

by Leo Przybylski

Subject: Blast and data miscompare

Hello,

I have tried searching for this issue, but found nothing. I heard from a
kernel developer at work that a memory error was discovered recently in
the Linux 2.6 kernel that causes data miscompare errors in the generic
SCSI driver when executing blast tests.

Does anyone know more about this?

Leo


2004-11-09 15:35:16

by Jake Moilanen

Subject: Re: Blast and data miscompare

> I have tried searching for this issue, but found nothing. I heard from a
> kernel developer at work that a memory error was discovered recently in
> the Linux 2.6 kernel that causes data miscompare errors in the generic
> SCSI driver when executing blast tests.
>
> Does anyone know more about this?

Not sure if it's the same problem, but we were seeing a miscompare on
2.4 due to an incorrect COW happening, followed by a hardware hash hole
on PPC64.

To fix it we had to make sure that the PTE was cleared and the TLBs
were flushed before the new PTE was established.
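
The ordering above can be illustrated with a small user-space toy
model. This is only a sketch under stated assumptions, not the actual
2.4 patch: ToyMMU, cow_unsafe and cow_safe are hypothetical names, the
"TLB" is just a dictionary that caches page-table entries, and the
PPC64 hardware hash table is not modelled at all. It only shows why a
stale cached translation can survive if the new PTE is written without
clearing and flushing first.

```python
class ToyMMU:
    """Hypothetical toy model: one page table plus a TLB-like cache."""

    def __init__(self):
        self.page_table = {}  # virtual page -> physical frame
        self.tlb = {}         # cached copies of page-table entries

    def translate(self, vpage):
        # Like hardware, consult the cache first and only walk the
        # page table on a miss.
        if vpage not in self.tlb:
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]

    def flush_tlb(self, vpage):
        # Drop any cached translation for this virtual page.
        self.tlb.pop(vpage, None)


def cow_unsafe(mmu, vpage, new_frame):
    # Overwrite the PTE in place. A cached translation for the old
    # frame stays in the TLB, so later accesses can hit the old page.
    mmu.page_table[vpage] = new_frame


def cow_safe(mmu, vpage, new_frame):
    # 1. Clear the old PTE.
    mmu.page_table.pop(vpage, None)
    # 2. Flush the cached translation.
    mmu.flush_tlb(vpage)
    # 3. Only then establish the new PTE.
    mmu.page_table[vpage] = new_frame
```

In this model, after a page has been touched once, cow_unsafe leaves
translate() returning the old frame, while cow_safe makes it return the
new one.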

Martin, was this fixed on 2.6?

Thanks,
Jake

2004-11-09 15:44:18

by Martin J. Bligh

Subject: Re: Blast and data miscompare

--Jake Moilanen wrote (on Tuesday, November 09, 2004 09:34:00 -0600):

>> I have tried searching for this issue, but found nothing. I heard from a
>> kernel developer at work that a memory error was discovered recently in
>> the Linux 2.6 kernel that causes data miscompare errors in the generic
>> SCSI driver when executing blast tests.
>>
>> Does anyone know more about this?
>
> Not sure if it's the same problem, but we were seeing a miscompare on
> 2.4 due to an incorrect COW happening, followed by a hardware hash hole
> on PPC64.
>
> To fix it we had to make sure that the PTE was cleared and the TLBs
> were flushed before the new PTE was established.
>
> Martin, was this fixed on 2.6?

Yup, it was already fixed in 2.6, and it is PPC64-only. Most of those
errors tend to be caused by I/O problems ...

M.

2004-11-11 05:05:27

by Paul Mackerras

Subject: Re: Blast and data miscompare

Jake Moilanen writes:

> Not sure if it's the same problem, but we were seeing a miscompare on
> 2.4 due to an incorrect COW happening, followed by a hardware hash hole
> on PPC64.
>
> To fix it we had to make sure that the PTE was cleared and the TLBs
> were flushed before the new PTE was established.
>
> Martin, was this fixed on 2.6?

It can't happen on 2.6; when BenH rewrote the PTE handling in 2.6
earlier this year, this was one of the things we made sure couldn't be
a problem.

Paul.