2013-03-28 22:23:17

by Aaron Williams

Subject: MTD NAND BCH support for 24 bits/1K of ECC correction?

Hi all,

I am trying to clean up our OCTEON NAND flash driver in the Linux kernel
and enable support for multi-bit ECC using BCH and am having some
issues. I am able to successfully work with NAND flash that requires 4
bits ECC per 512 bytes but I am having issues with one of our boards
that has a NAND device that requires 24 bits of ECC per 1024 bytes.

I was wondering if ECC of this magnitude has been successfully tested in
the past. By my calculations I should have 42 bytes of ECC per 1K block
(m=14, t=24 for 336 bits of ECC data). My problem is that when decoding
an encoded block I am seeing that nroots != err in decode_bch() after
find_poly_roots(). I am seeing this for all of the blocks I attempt to
read. As far as I can tell the data being sent to BCH is good, though it
might have a few bad bits but nowhere near 24.
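
The arithmetic above can be checked with a small standalone sketch. It assumes the usual bound for a binary BCH code over GF(2^m): at most m*t parity bits, and a codeword (data plus parity) of at most 2^m - 1 bits. The helper names are hypothetical and are not part of the kernel's BCH library.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only; not taken from lib/bch.c. */

/* Upper bound on parity bits for a t-error-correcting BCH code over GF(2^m). */
unsigned int bch_ecc_bits(unsigned int m, unsigned int t)
{
	return m * t;
}

/* Parity size rounded up to whole bytes. */
unsigned int bch_ecc_bytes(unsigned int m, unsigned int t)
{
	return (bch_ecc_bits(m, t) + 7) / 8;
}

/* Nonzero if data plus parity fit in a codeword of 2^m - 1 bits. */
int bch_fits(unsigned int m, unsigned int t, unsigned int data_bytes)
{
	unsigned int max_codeword_bits = (1u << m) - 1;

	return data_bytes * 8 + bch_ecc_bits(m, t) <= max_codeword_bits;
}

/* Print the layout for one (m, t, data size) combination. */
void print_bch_layout(unsigned int m, unsigned int t, unsigned int data_bytes)
{
	printf("m=%u t=%u data=%u bytes: %u ECC bits, %u ECC bytes, %s\n",
	       m, t, data_bytes, bch_ecc_bits(m, t), bch_ecc_bytes(m, t),
	       bch_fits(m, t, data_bytes) ? "fits" : "does not fit");
}
```

Calling print_bch_layout(14, 24, 1024) reports 336 ECC bits and 42 ECC bytes. With m=13 the codeword is limited to 8191 bits, which is too small for 8192 data bits, so m=14 is the smallest field that works for a 1K block.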

I am also seeing this same behavior in my U-Boot code which uses the
identical bch and nand_bch code.

Cheers,

Aaron Williams

--
Aaron Williams
Software Engineer
Cavium, Inc.
(408) 943-7198 (510) 789-8988 (cell)


2013-03-29 09:02:23

by Ivan Djelic

Subject: Re: MTD NAND BCH support for 24 bits/1K of ECC correction?

On Thu, Mar 28, 2013 at 10:23:01PM +0000, Aaron Williams wrote:
> Hi all,
>
> I am trying to clean up our OCTEON NAND flash driver in the Linux kernel
> and enable support for multi-bit ECC using BCH and am having some
> issues. I am able to successfully work with NAND flash that requires 4
> bits ECC per 512 bytes but I am having issues with one of our boards
> that has a NAND device that requires 24 bits of ECC per 1024 bytes.
>
> I was wondering if ECC of this magnitude has been successfully tested in
> the past. By my calculations I should have 42 bytes of ECC per 1K block
> (m=14, t=24 for 336 bits of ECC data). My problem is that when decoding
> an encoded block I am seeing that nroots != err in decode_bch() after
> find_poly_roots(). I am seeing this for all of the blocks I attempt to
> read. As far as I can tell the data being sent to BCH is good, though it
> might have a few bad bits but nowhere near 24.
>
> I am also seeing this same behavior in my U-Boot code which uses the
> identical bch and nand_bch code.
>

Hi Aaron,

CC-ing your message to linux-mtd which is the place to go for such questions :-)

Your configuration (m=14 t=24 with 1024 bytes of data) has been tested, and should work
with the BCH library. Could you give some details about your ECC setup:

1. Are you trying to locate and correct errors from hardware-computed syndromes?

2. If yes, did you provide the BCH lib with the specific primitive polynomial used by
your hardware? What is this polynomial?

3. Could you provide the ECC bytes generated for the following block patterns:
- a 0xff-filled 1024-byte block
- a 0xff-filled 1024-byte block, except for the first byte set to 0xfe
This would help me find out how to set up the library to match your hardware.
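
The two test patterns requested above could be built with something like the following userspace sketch. The function names and TEST_BLOCK_SIZE are hypothetical. The encoding step itself is left out: in the kernel the buffers would be handed to encode_bch() from lib/bch.c, which is not available to a standalone program, so only the pattern setup and a hex dump helper for the resulting ECC bytes are shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TEST_BLOCK_SIZE 1024	/* hypothetical name; one 1K ECC block */

/* Pattern 1: every byte set to 0xff (the erased-NAND state). */
void fill_pattern_ff(uint8_t *buf, size_t len)
{
	memset(buf, 0xff, len);
}

/* Pattern 2: all 0xff except the first byte, which is 0xfe. */
void fill_pattern_fe_first(uint8_t *buf, size_t len)
{
	memset(buf, 0xff, len);
	if (len > 0)
		buf[0] = 0xfe;
}

/* Print a labelled hex dump, e.g. of the ECC bytes produced by the encoder. */
void dump_hex(const char *label, const uint8_t *buf, size_t len)
{
	size_t i;

	printf("%s:", label);
	for (i = 0; i < len; i++)
		printf(" %02x", buf[i]);
	printf("\n");
}
```

The two patterns differ in a single bit, so comparing the two ECC outputs shows how one data bit maps into the parity bytes, which is useful when matching bit and byte ordering between two implementations.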

Regards,
--
Ivan

2013-03-30 05:40:49

by Aaron Williams

Subject: Re: MTD NAND BCH support for 24 bits/1K of ECC correction?

On 03/29/2013 02:02 AM, Ivan Djelic wrote:
> On Thu, Mar 28, 2013 at 10:23:01PM +0000, Aaron Williams wrote:
>> Hi all,
>>
>> I am trying to clean up our OCTEON NAND flash driver in the Linux kernel
>> and enable support for multi-bit ECC using BCH and am having some
>> issues. I am able to successfully work with NAND flash that requires 4
>> bits ECC per 512 bytes but I am having issues with one of our boards
>> that has a NAND device that requires 24 bits of ECC per 1024 bytes.
>>
>> I was wondering if ECC of this magnitude has been successfully tested in
>> the past. By my calculations I should have 42 bytes of ECC per 1K block
>> (m=14, t=24 for 336 bits of ECC data). My problem is that when decoding
>> an encoded block I am seeing that nroots != err in decode_bch() after
>> find_poly_roots(). I am seeing this for all of the blocks I attempt to
>> read. As far as I can tell the data being sent to BCH is good, though it
>> might have a few bad bits but nowhere near 24.
>>
>> I am also seeing this same behavior in my U-Boot code which uses the
>> identical bch and nand_bch code.
>>
> Hi Aaron,
>
> CC-ing your message to linux-mtd which is the place to go for such questions :-)
>
> Your configuration (m=14 t=24 with 1024 bytes of data) has been tested, and should work
> with the BCH library. Could you give some details about your ECC setup:
>
> 1. Are you trying to locate and correct errors from hardware-computed syndromes?
>
> 2. If yes, did you provide the BCH lib with the specific primitive polynomial used by
> your hardware? What is this polynomial?
>
> 3. Could you provide the ECC bytes generated for the following block patterns:
> - a 0xff-filled 1024-byte block
> - a 0xff-filled 1024-byte block, except for the first byte set to 0xfe
> This would help me find out how to set up the library to match your hardware.
>
> Regards,

We are doing the BCH support entirely in software because our hardware
is limited to 1-bit ECC.

I may have to get back to you. I may have found the problem and will try
and look at it some more next week. It looks like I now have it working
in U-Boot and the same bug exists in our Linux driver. It wouldn't be
the first one... our NAND drivers haven't been updated in ages.

-Aaron
