2021-08-03 14:35:16

by Daniel Kestrel

Subject: [PATCH] mtd: rawnand: xway: No hardcoded ECC engine for Micron Chips

Some Lantiq xway devices use Micron NAND chips, which use on-die ECC.
The hardcoded setting of NAND_ECC_ENGINE_TYPE_SOFT makes them unusable,
because running software ECC on top of the hardware ECC produces errors
on every read and write access. Booting does not work either, because
the boot loader uses the correct ECC when trying to load the kernel and
stops loading on severe ECC errors.
Removing the hardcoded setting would break a number of devices that
work with it.
Adding a DTB property was considered, but did not work, because devices
of the same type but from different manufacturing dates have different
NAND chips, so the NAND chip cannot be determined in advance or per
device model.

Signed-off-by: Daniel Kestrel <[email protected]>
---
drivers/mtd/nand/raw/xway_nand.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/raw/xway_nand.c b/drivers/mtd/nand/raw/xway_nand.c
index 26751976e502..20cb5ce2f3b0 100644
--- a/drivers/mtd/nand/raw/xway_nand.c
+++ b/drivers/mtd/nand/raw/xway_nand.c
@@ -10,6 +10,7 @@
#include <linux/of_platform.h>

#include <lantiq_soc.h>
+#include "internals.h"

/* nand registers */
#define EBU_ADDSEL1 0x24
@@ -148,7 +149,8 @@ static void xway_write_buf(struct nand_chip *chip, const u_char *buf, int len)

 static int xway_attach_chip(struct nand_chip *chip)
 {
-	chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;
+	if (chip->manufacturer.desc->id != NAND_MFR_MICRON)
+		chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;

 	if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN)
 		chip->ecc.algo = NAND_ECC_ALGO_HAMMING;
--
2.17.1



2021-08-06 23:55:26

by Miquel Raynal

Subject: Re: [PATCH] mtd: rawnand: xway: No hardcoded ECC engine for Micron Chips

Hi Daniel,

Daniel Kestrel <[email protected]> wrote on Tue, 3 Aug 2021
16:32:56 +0200:

> Some Lantiq xway devices use Micron NAND chips, which use on-die ECC.
> The hardcoded setting of NAND_ECC_ENGINE_TYPE_SOFT makes them unusable,
> because running software ECC on top of the hardware ECC produces errors
> on every read and write access. Booting does not work either, because
> the boot loader uses the correct ECC when trying to load the kernel and
> stops loading on severe ECC errors.
> Removing the hardcoded setting would break a number of devices that
> work with it.
> Adding a DTB property was considered, but did not work, because devices
> of the same type but from different manufacturing dates have different
> NAND chips, so the NAND chip cannot be determined in advance or per
> device model.

I understand the problem and it is a very crappy situation.

>
> Signed-off-by: Daniel Kestrel <[email protected]>
> ---
> drivers/mtd/nand/raw/xway_nand.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/nand/raw/xway_nand.c b/drivers/mtd/nand/raw/xway_nand.c
> index 26751976e502..20cb5ce2f3b0 100644
> --- a/drivers/mtd/nand/raw/xway_nand.c
> +++ b/drivers/mtd/nand/raw/xway_nand.c
> @@ -10,6 +10,7 @@
> #include <linux/of_platform.h>
>
> #include <lantiq_soc.h>
> +#include "internals.h"
>
> /* nand registers */
> #define EBU_ADDSEL1 0x24
> @@ -148,7 +149,8 @@ static void xway_write_buf(struct nand_chip *chip, const u_char *buf, int len)
>
> static int xway_attach_chip(struct nand_chip *chip)
> {
> - chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;
> + if (chip->manufacturer.desc->id != NAND_MFR_MICRON)
> + chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;

Could we make this a little bit smarter with something like this:
https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/mtd/nand/raw/nand_micron.c#L434

This is far from ideal; there should definitely be a change in the DT.
But given your initial comments I guess that is not possible.

Anyway, I can't find a better way because, during the attach() call, we
have not yet run the manufacturer code, hence we don't know whether
on-die ECC is actually available or not.

Thanks,
Miquèl

2021-08-08 06:50:02

by Daniel Kestrel

Subject: Re: [PATCH] mtd: rawnand: xway: No hardcoded ECC engine for Micron Chips

On Fri, 6 Aug 2021 at 18:57, Miquel Raynal
<[email protected]> wrote:
>
> Hi Daniel,
>
> Daniel Kestrel <[email protected]> wrote on Tue, 3 Aug 2021
> 16:32:56 +0200:
>
> > Some Lantiq xway devices use Micron NAND chips, which use on-die ECC.
> > The hardcoded setting of NAND_ECC_ENGINE_TYPE_SOFT makes them unusable,
> > because running software ECC on top of the hardware ECC produces errors
> > on every read and write access. Booting does not work either, because
> > the boot loader uses the correct ECC when trying to load the kernel and
> > stops loading on severe ECC errors.
> > Removing the hardcoded setting would break a number of devices that
> > work with it.
> > Adding a DTB property was considered, but did not work, because devices
> > of the same type but from different manufacturing dates have different
> > NAND chips, so the NAND chip cannot be determined in advance or per
> > device model.
>
> I understand the problem and it is a very crappy situation.
>
> >
> > Signed-off-by: Daniel Kestrel <[email protected]>
> > ---
> > drivers/mtd/nand/raw/xway_nand.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mtd/nand/raw/xway_nand.c b/drivers/mtd/nand/raw/xway_nand.c
> > index 26751976e502..20cb5ce2f3b0 100644
> > --- a/drivers/mtd/nand/raw/xway_nand.c
> > +++ b/drivers/mtd/nand/raw/xway_nand.c
> > @@ -10,6 +10,7 @@
> > #include <linux/of_platform.h>
> >
> > #include <lantiq_soc.h>
> > +#include "internals.h"
> >
> > /* nand registers */
> > #define EBU_ADDSEL1 0x24
> > @@ -148,7 +149,8 @@ static void xway_write_buf(struct nand_chip *chip, const u_char *buf, int len)
> >
> > static int xway_attach_chip(struct nand_chip *chip)
> > {
> > - chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;
> > + if (chip->manufacturer.desc->id != NAND_MFR_MICRON)
> > + chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;
>
> Could we make this a little bit smarter with something like this:
> https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/mtd/nand/raw/nand_micron.c#L434
>
> This is far from ideal; there should definitely be a change in the DT.
> But given your initial comments I guess that is not possible.
>
> Anyway, I can't find a better way because, during the attach() call, we
> have not yet run the manufacturer code, hence we don't know whether
> on-die ECC is actually available or not.
>
> Thanks,
> Miquèl

Hi Miquèl,

I have worked with another owner of Lantiq hardware, and he has successfully
tested adding nand-ecc-mode = "soft" to the device tree and removing the one
line of code that sets the engine type to soft.
I will send another patch version that just removes this line of code.
This change is required for kernel 5.4 as well, but there the setting is in
the xway_nand_probe function at line 237.

Thanks, Daniel.
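
For readers comparing the variants discussed in this thread, the three
behaviours can be modelled in plain C outside the kernel. This is only a
sketch under stated assumptions: the enums and function names below
(ecc_engine, nand_mfr, attach_original, attach_v1, attach_dt_only) are
hypothetical stand-ins and not kernel APIs, and the "configured" argument
stands for whatever engine type had been selected before attach_chip() runs,
for example from a device tree property such as nand-ecc-mode.

```c
/* Hypothetical, simplified stand-ins for the kernel's ECC engine types. */
enum ecc_engine { ECC_UNSET, ECC_SOFT, ECC_ON_DIE };

/* Hypothetical manufacturer IDs; only "Micron or not" matters here. */
enum nand_mfr { MFR_OTHER, MFR_MICRON };

/* Behaviour before the patch: software ECC is always forced. */
static enum ecc_engine attach_original(enum ecc_engine configured,
				       enum nand_mfr mfr)
{
	(void)configured;
	(void)mfr;
	return ECC_SOFT;
}

/* Posted patch: force software ECC unless the chip is from Micron. */
static enum ecc_engine attach_v1(enum ecc_engine configured,
				 enum nand_mfr mfr)
{
	if (mfr != MFR_MICRON)
		return ECC_SOFT;
	return configured;
}

/* Announced follow-up: the driver no longer overrides the setting. */
static enum ecc_engine attach_dt_only(enum ecc_engine configured,
				      enum nand_mfr mfr)
{
	(void)mfr;
	return configured;
}
```

In this model, only attach_original() discards the configured engine type
for Micron chips, which corresponds to the failure described in the commit
message. With attach_dt_only(), boards that rely on software ECC would have
to request it explicitly in their device tree.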