2021-06-17 13:22:19

by Daniel Palmer

[permalink] [raw]
Subject: [RFC PATCH] mtd: spinand: core: Properly fill the OOB area.

The comment in spinand_write_to_cache_op() says that
spinand_ondie_ecc_prepare_io_req() should 0xff fill the OOB
area but it doesn't.

This causes the OOB area to get filled with zeros
and anytime the first page in a block the bad block marker
is cleared and it becomes a bad block on the next boot.

This was observed on Longsys FORSEE branded parts and
might be specific to these parts.

Signed-off-by: Daniel Palmer <[email protected]>
---
drivers/mtd/nand/spi/core.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index 6f2d39f9bb06..f1c76fa0e220 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -281,6 +281,9 @@ static int spinand_ondie_ecc_prepare_io_req(struct nand_device *nand,
struct spinand_device *spinand = nand_to_spinand(nand);
bool enable = (req->mode != MTD_OPS_RAW);

+ memset(spinand->databuf + nanddev_page_size(nand),
+ 0xff, nanddev_per_page_oobsize(nand));
+
/* Only enable or disable the engine */
return spinand_ecc_enable(spinand, enable);
}
--
2.32.0.rc0


2021-08-07 00:10:11

by Miquel Raynal

[permalink] [raw]
Subject: Re: [RFC PATCH] mtd: spinand: core: Properly fill the OOB area.

Hi Daniel,

Daniel Palmer <[email protected]> wrote on Thu, 17 Jun 2021 20:08:42
+0900:

> The comment in spinand_write_to_cache_op() says that
> spinand_ondie_ecc_prepare_io_req() should 0xff fill the OOB
> area but it doesn't.
>
> This causes the OOB area to get filled with zeros
> and anytime the first page in a block the bad block marker
> is cleared and it becomes a bad block on the next boot.
>
> This was observed on Longsys FORSEE branded parts and
> might be specific to these parts.
>
> Signed-off-by: Daniel Palmer <[email protected]>

This change looks fine, I'll use spinand->oobbuf instead of databuf +
offset (will update when applying).

> ---
> drivers/mtd/nand/spi/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> index 6f2d39f9bb06..f1c76fa0e220 100644
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -281,6 +281,9 @@ static int spinand_ondie_ecc_prepare_io_req(struct nand_device *nand,
> struct spinand_device *spinand = nand_to_spinand(nand);
> bool enable = (req->mode != MTD_OPS_RAW);
>
> + memset(spinand->databuf + nanddev_page_size(nand),
> + 0xff, nanddev_per_page_oobsize(nand));
> +
> /* Only enable or disable the engine */
> return spinand_ecc_enable(spinand, enable);
> }

Thanks,
Miquèl

2021-08-07 00:10:47

by Miquel Raynal

[permalink] [raw]
Subject: Re: [RFC PATCH] mtd: spinand: core: Properly fill the OOB area.

On Thu, 2021-06-17 at 11:08:42 UTC, Daniel Palmer wrote:
> The comment in spinand_write_to_cache_op() says that
> spinand_ondie_ecc_prepare_io_req() should 0xff fill the OOB
> area but it doesn't.
>
> This causes the OOB area to get filled with zeros
> and anytime the first page in a block the bad block marker
> is cleared and it becomes a bad block on the next boot.
>
> This was observed on Longsys FORSEE branded parts and
> might be specific to these parts.
>
> Signed-off-by: Daniel Palmer <[email protected]>

Applied to https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git nand/next, thanks.

Miquel

2021-08-11 07:39:50

by Daniel Palmer

[permalink] [raw]
Subject: Re: [RFC PATCH] mtd: spinand: core: Properly fill the OOB area.

Hi Miquel,

On Sat, 7 Aug 2021 at 05:02, Miquel Raynal <[email protected]> wrote:
> This change looks fine, I'll use spinand->oobbuf instead of databuf +
> offset (will update when applying).

Thanks for looking at this for me. One thing I was worried about is
why the SPI NAND subsystem worked before this change with winbond etc
parts.

You probably don't remember now but I sent a patch to include support
for the longsys foresee parts that have the weird quirk of having no
ECC
data in the OOB so it's all usable by the user except for the bad
block marker and the ECC status bits being next to useless.
I found this issue while trying to validate ubi + my ECC status
decoding worked. [0]

The SPI NAND subsystem in u-boot worked fine as it could create the
ubi formatting on the flash and that would survive reboots but any
blocks written by Linux would be bad on reboot.
When Linux created the ubi format it would work until a reboot as the
correct data was cached in memory then u-boot would complain because
all of the blocks were marked bad.

But winbond parts mounted on the same board, same code etc worked just fine.
I guess the OOB is getting fixed somewhere else for other parts. Maybe
it only happens on the longsys parts because there is no ECC in OOB?

Anyhow I was worried my fix hid some other issue/broke other parts.

Cheers,

Daniel

0 - https://lore.kernel.org/lkml/20210408174922.55c1149f@xps13/

2021-08-16 07:46:09

by Miquel Raynal

[permalink] [raw]
Subject: Re: [RFC PATCH] mtd: spinand: core: Properly fill the OOB area.

Hi Daniel,

Daniel Palmer <[email protected]> wrote on Wed, 11 Aug 2021 16:37:55
+0900:

> Hi Miquel,
>
> On Sat, 7 Aug 2021 at 05:02, Miquel Raynal <[email protected]> wrote:
> > This change looks fine, I'll use spinand->oobbuf instead of databuf +
> > offset (will update when applying).
>
> Thanks for looking at this for me. One thing I was worried about is
> why the SPI NAND subsystem worked before this change with winbond etc
> parts.

I didn't look closely to the history but it is possible that during the
ECC engine framework introduction and the split of the on-die ECC code
the behavior changed (which, in this case, is a regression).

> You probably don't remember now but I sent a patch to include support
> for the longsys foresee parts that have the weird quirk of having no
> ECC
> data in the OOB so it's all usable by the user except for the bad
> block marker and the ECC status bits being next to useless.
> I found this issue while trying to validate ubi + my ECC status
> decoding worked. [0]

Yes I remember now!

> The SPI NAND subsystem in u-boot worked fine as it could create the
> ubi formatting on the flash and that would survive reboots but any
> blocks written by Linux would be bad on reboot.
> When Linux created the ubi format it would work until a reboot as the
> correct data was cached in memory then u-boot would complain because
> all of the blocks were marked bad.
>
> But winbond parts mounted on the same board, same code etc worked just fine.
> I guess the OOB is getting fixed somewhere else for other parts.

I think the reason is that most ECC codes do not cover the OOB part,
hence writing garbage there is not an issue (while with your part, the
OOB area was actually fully protected). But it is certainly best to keep
this area tidy anyway.

> Maybe
> it only happens on the longsys parts because there is no ECC in OOB?
>
> Anyhow I was worried my fix hid some other issue/broke other parts.

I think this patch is legit, I will keep it.

>
> Cheers,
>
> Daniel
>
> 0 - https://lore.kernel.org/lkml/20210408174922.55c1149f@xps13/

Thanks,
Miquèl